The XPathScript API

Along with the code delimiters, XPathScript provides stylesheet developers with a full API for accessing and transforming the source XML file. This API can be used in conjunction with the delimiters above to provide a stylesheet language that is as powerful as XSLT, and yet provides all the features of a full programming language (in this case, Perl, but I'm certain that other implementations such as Python or Java would be possible).

Extracting Values

A simple example to get us started is to use the API to bring in the
title from a DocBook article. A DocBook article title looks like this:
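As a minimal sketch, suppose the source document carries its title as in the first fragment. The artheader wrapper and the sample title text are assumptions made for illustration.

```xml
<article>
  <artheader>
    <title>An Example Article</title>
  </artheader>
</article>
```

The second fragment pulls that title into an HTML page. It assumes that the `<%= ... %>` delimiter prints the value of a Perl expression and that the global findvalue() function returns the string value of the node matched by the expression; neither is spelled out in this section.

```perl
<html>
<head>
  <!-- findvalue() is assumed to evaluate the expression against the source document -->
  <title><%= findvalue("/article/artheader/title/text()") %></title>
</head>
<body>
  <h1><%= findvalue("/article/artheader/title/text()") %></h1>
</body>
</html>
```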
The expression syntax we used to find that "node" has many features, and this syntax is called XPath. It is a W3C standard for finding and matching XML document nodes. The standard is fairly readable and is at http://www.w3.org/TR/xpath. Alternatively, I can recommend Norm Walsh's XPath introduction, which covers a slightly older version of the specification, though I didn't notice anything in the article that is missing or different from the current recommendation.

Extracting Nodes

The above example showed us how to extract single values, but what if we
have a list of things we wish to extract values from? Here's how we
might get a table of contents from DocBook article sections:
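A hedged sketch of such a loop follows. It assumes sections are marked up as sect1 and sect2 elements, each with a title child, and that nodes offer a findvalue() method alongside findnodes(); the `<br>` layout is purely illustrative.

```perl
<% # Global findnodes(): evaluated against the whole source document.
   for my $sect1 (findnodes("/article/sect1")) { %>
  <%= $sect1->findvalue("title/text()") %><br>
  <% # Node method findnodes(): "sect2" is resolved relative to $sect1.
     for my $sect2 ($sect1->findnodes("sect2")) { %>
    + <%= $sect2->findvalue("title/text()") %><br>
  <% } %>
<% } %>
```

The relative expressions "title/text()" and "sect2" only make sense because each call is made on a node, which becomes the context for the expression.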
Note that in the above we don't use the global function findnodes() after finding the sect1 nodes. Instead we call the node method findnodes(), which does exactly the same thing but makes the node you are calling it on the context of the XPath expression.

Declarative Templates

The examples up to now have all used a single global template with search-and-replace style access to the source XML document. This is a powerful concept in itself, especially when combined with loops and the ability to change the context of searches. But that style of template is mostly useful for well-structured data, rather than for processing large documents.

To ease the processing of documents, XPathScript includes a declarative template processing model too, so that you can simply specify the format for a particular element and let XPathScript do the work for you. To support this method, XPathScript introduces one more API function: apply_templates(). The name is intended to appeal to people already familiar with XSLT. The apply_templates() function takes either a list of start nodes or an XPath expression (which must result in a node set), plus an optional context. Beginning at the start nodes, it traverses the document tree, applying the templates defined by the $t hash reference.

First, a simple example to introduce this feature. Let's assume for a
moment that our source XML file is valid XHTML, and we want to change
all anchor links to italics. Here is the very simple XPathScript
template that will do that:
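A minimal sketch of such a template is shown here. It assumes that apply_templates() called with no arguments starts from the document root and returns the transformed text, so that `<%= ... %>` can print it.

```perl
<%
# Wrap every <a> element in <i>...</i> while keeping the <a> tag itself.
$t->{'a'}{pre}     = '<i>';    # emitted before the tag
$t->{'a'}{post}    = '</i>';   # emitted after the tag
$t->{'a'}{showtag} = 1;        # still output the <a> tag
%>
<%= apply_templates() %>
```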
The first thing this example does is set up a hash reference $t that XPathScript knows about (let's call it magical). The keys of $t are element names (including the namespace prefix if we are using namespaces). Each entry in the hash can have sub-keys, among them pre, post, showtag, and testcode.
Unlike XSLT's declarative transformation syntax, the keys of $t do not specify XPath match expressions. Instead they are simple element names. This trades flexibility for speed of execution: Perl hash lookups are extremely quick compared to XPath matching. Luckily, because of the testcode option, more complex matches are still quite possible with XPathScript.

The simple explanation for now is that pre specifies output to appear before the tag, post specifies output to appear after the tag, and showtag specifies that the tag itself should be output as well as the pre and post values.
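To contrast the three keys, this sketch switches showtag off so that the original tag is replaced rather than wrapped. It assumes an XHTML source containing b elements and assumes that child nodes are still processed when the tag itself is suppressed; the choice of b and strong is only an illustration.

```perl
<%
# Replace <b>...</b> with <strong>...</strong>: the original tag is
# suppressed, and pre/post supply the replacement markup.
$t->{'b'}{pre}     = '<strong>';
$t->{'b'}{post}    = '</strong>';
$t->{'b'}{showtag} = 0;
%>
<%= apply_templates() %>
```

Under those assumptions, a source fragment such as `<b>bold</b>` would be expected to come out as `<strong>bold</strong>`. With showtag set to 1, the original b tag would be kept inside the added markup.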