Entity Tricks for Your XPaths (Part III)

Written by Chriztian Steinmeier.
Got comments? I’m @greystate on Twitter.

This concludes my trilogy of entity tricks (everyone should have a trilogy, right?) with some nasty trickery to finally bring order to the galaxy… sort of :-)

Parts I and II should have prepared you for the final chapter - you should definitely brush up on them if you’re not entirely sure about the difference between a “General Entity” and a “Parameter Entity”.

Welcome To My World

Now, I do most of my development locally, including XSLT transforms—I usually grab the source XML and save it as a static XML file and start hacking away in TextMate, where I’ve got a set of fine-tuned (to my workflow) snippets and commands to help the code flow…

Working on Mac OS X, I use the xsltproc command-line processor for transforms (yes, of course it’s wired up to a keyboard shortcut), which obviously presents a couple of blockers when dealing with something as .NET-based as Umbraco. I’ll list them here, and describe how I mock (work around) them:

  1. The $currentPage parameter is not supplied to my XSLT
  2. I don’t have access to the extension functions in the various namespaces (umbraco.library, EXSLT & MSXML)

How to mock $currentPage

The first one is actually quite easy to mock: Using the select attribute on the param instruction, you can specify a default value that’ll be discarded if a value is actually passed to the parameter. So I just decide on a node that will be a good fit for the current development session, and point to it like this:

<xsl:param name="currentPage" select="/root/Website[1]/Textpage[3]" />

—which would select the 3rd top nav item of the 1st website in my “standard” Umbraco setup.

Since it’s just a default value, I can leave it in the file when deploying: in the production environment, Umbraco always supplies a value for this parameter.

So that one’s actually really easy—one down, more to go…
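Here’s a minimal sketch of how that default sits in a complete stylesheet. The output markup is purely illustrative, and I’m assuming the source XML has the /root/Website/Textpage structure used above, with a nodeName attribute on each page:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Default used locally; a value passed in by the caller takes precedence -->
    <xsl:param name="currentPage" select="/root/Website[1]/Textpage[3]" />

    <xsl:template match="/">
        <!-- Illustrative output only: print the name of the (mocked) current page -->
        <h1><xsl:value-of select="$currentPage/@nodeName" /></h1>
    </xsl:template>

</xsl:stylesheet>
```

Run locally, the heading shows the name of the hard-coded node; on the server the same file should pick up whatever page Umbraco passes in.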

Mocking extension functions

The functions I use frequently enough to want to mock, so I don’t get a bunch of errors when transforming, include NiceUrl() and node-set(), which are the two covered below.

So to be able to use these, I tried a bunch of different things - but I wanted to find something that wouldn’t require remembering to comment/un-comment lines between development and production code. And I found one, which is what this is all about… entities to the rescue.

First Things First

Mocking NiceUrl()

But we’ll take baby steps and first decide what to replace NiceUrl() with locally - I wound up using the string() function because it also takes a single argument, so I declared this entity:

<!ENTITY NiceUrl "string">

and replaced all occurrences of the NiceUrl() library call with that, e.g. this:

<a href="{umb:NiceUrl(@id)}">The Dharma Initiative</a>

became this:

<a href="{&NiceUrl;(@id)}">The Dharma Initiative</a>

which, as we’ve learned in Part I, is seen by the XSLT processor as:

<a href="{string(@id)}">The Dharma Initiative</a>

The links won’t work, but that’s never a concern for the XSLT code, since the responsibility of the link being correct lies entirely within the internal Umbraco code for that method. Only thing I should care about is whether the right id is processed - and I’ll see that in the final frontend code after the transformation has finished:

<a href="4815">The Dharma Initiative</a>

Very easy to check the correctness of that, if necessary.

(And the reason for using an entity for this? Well, you should know that by now, but it’ll make it really easy to swap the faked version with the real one later.)
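To see the pieces in one place, here’s a small sketch of a stylesheet that declares the entity in its own DOCTYPE and uses it in a link list. The list markup and the Textpage/@nodeName names are assumptions borrowed from the examples in this series, not a fixed recipe:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xsl:stylesheet [
    <!-- Local stand-in: string() accepts the same single argument -->
    <!ENTITY NiceUrl "string">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/">
        <ul>
            <!-- Textpage is the document type alias assumed in the examples above -->
            <xsl:for-each select="/root/Website[1]/Textpage">
                <!-- The parser expands this to href="{string(@id)}" before XSLT sees it -->
                <li><a href="{&NiceUrl;(@id)}"><xsl:value-of select="@nodeName" /></a></li>
            </xsl:for-each>
        </ul>
    </xsl:template>

</xsl:stylesheet>
```

Note that the local version doesn’t need the extension namespace declared at all, since string() is a core XPath function; the production version will, because the entity then expands to a prefixed function name.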

Mocking node-set()

This one’s a little different, right? I didn’t need NiceUrl() to function correctly to be able to develop my transforms, whereas I can’t do without the specific functionality that node-set() provides.

Luckily, it’s a very common need in XSLT 1.0 to be able to convert a so-called “Result Tree Fragment” to a node set, so every XSLT processor I know of has an implementation of that function. In xsltproc it’s implemented as one of the EXSLT functions.

Now, in a perfect world, it would be possible for me to use the exslt:node-set() function in the xsltproc processor on my Mac, and because Umbraco implements a .NET EXSLT library it would automatically work in the Umbraco application too; but sadly that’s not the case. If you want to know a little more about this, check out this question I put on our.umbraco.org just about a year ago: “Question regarding Umbraco’s EXSLT implementation”.

So to solve this, I set up a small snippet like this to visualize the differences:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    xmlns:exslt="http://exslt.org/common"
>
    <xsl:variable name="data-values">
        <val>2</val>
        <val>4</val>
        <val>8</val>
    </xsl:variable>
    <!-- Create node-set using EXSLT function -->
    <xsl:variable name="exsltdata" select="exslt:node-set($data-values)" />

    <!-- Create node-set using MSXML function -->
    <xsl:variable name="msxsldata" select="msxsl:node-set($data-values)" />

    <xsl:template match="/">
        ...
    </xsl:template>

</xsl:stylesheet>

So it’s obvious that in both versions we’re calling a function called node-set(), but each in its own namespace, hence the different namespace-prefixes. Since the only thing that matters when dealing with namespaces is the actual namespace-URI, I figured I could use the same prefix for both versions and just set the namespace-URI as I saw fit, so I came up with this:

<!ENTITY nodeset-ns-uri "http://exslt.org/common">

—and then using that when defining the prefix:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:make="&nodeset-ns-uri;"
>

This gave me the side-effect of using the prefix to better “name” the function when used:

<xsl:variable name="data-values">
    <val>2</val>
    <val>4</val>
    <val>8</val>
</xsl:variable>
<xsl:variable name="data" select="make:node-set($data-values)" />

“Hey, make a node set out of those data values!”
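Put together, a self-contained sketch of the idea could look like the following. The sum element is just something I made up to prove that the converted variable can be queried like any other node set:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xsl:stylesheet [
    <!-- Development value; swap for the MSXML URI on the server -->
    <!ENTITY nodeset-ns-uri "http://exslt.org/common">
]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:make="&nodeset-ns-uri;"
    exclude-result-prefixes="make"
>
    <xsl:variable name="data-values">
        <val>2</val>
        <val>4</val>
        <val>8</val>
    </xsl:variable>

    <!-- Same prefix in both environments; only the URI behind it changes -->
    <xsl:variable name="data" select="make:node-set($data-values)" />

    <xsl:template match="/">
        <!-- Hypothetical output element: adds up the three val elements -->
        <sum><xsl:value-of select="sum($data/val)" /></sum>
    </xsl:template>

</xsl:stylesheet>
```

The entity reference in the xmlns:make attribute is expanded by the XML parser, so by the time the XSLT processor resolves the make prefix it only ever sees a plain namespace URI.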

Externalizing For Development & Production

All is well, I’ve made it possible to run the XSLT transforms locally on my trusty old MacBook Pro, so I can concentrate on generating the right markup with fast previews and all the benefits of not having a full solution running.

It should come as no surprise to you that the next step is to figure out how to handle the entity-swapping that must happen before putting the XSLT file back onto the server. To recap:

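  1. $currentPage: the default from the select attribute locally; supplied by Umbraco in production
  2. NiceUrl(): string() locally; umbraco.library:NiceUrl() in production
  3. node-set(): exslt:node-set() locally; msxsl:node-set() in production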

I could create all of the entities in a single file, and comment out the ones that shouldn’t be used:

<!-- Entities for development -->
<!--ENTITY nodeset-ns-uri "http://exslt.org/common"-->
<!--ENTITY NiceUrl "string"-->

<!-- Entities for production -->
<!ENTITY nodeset-ns-uri "urn:schemas-microsoft-com:xslt">
<!ENTITY NiceUrl "umbraco.library:NiceUrl">

But as it turns out, there’s an even neater way to mark a section to be ignored or included:

<![IGNORE[
    <!ENTITY nodeset-ns-uri "http://exslt.org/common">
    <!ENTITY NiceUrl "string">
]]>
<![INCLUDE[
    <!ENTITY nodeset-ns-uri "urn:schemas-microsoft-com:xslt">
    <!ENTITY NiceUrl "umbraco.library:NiceUrl">
]]>

Conditional sections like these are only allowed in an external DTD (or an external Parameter Entity), not in the internal subset inside the XSLT file itself. So we extract them to a separate file, and we’re left with a single Parameter Entity in the XSLT file:

<!DOCTYPE xsl:stylesheet [
    <!ENTITY % entities SYSTEM "entities.ent">
    %entities;
]>

and in the “entities.ent” file, we’ll put the entities, with a little switching magic at the top, so we (again) get labels for the sections (production/development) and configuration at the top:

<!ENTITY % production "IGNORE">   <!-- Swap these for correct environment -->
<!ENTITY % development "INCLUDE">

<![%production;[
    <!ENTITY nodeset-ns-uri "urn:schemas-microsoft-com:xslt">
    <!ENTITY NiceUrl "umbraco.library:NiceUrl">
]]>
<![%development;[
    <!ENTITY nodeset-ns-uri "http://exslt.org/common">
    <!ENTITY NiceUrl "string">
]]>

Needless to say, there will be two different versions of this file - one on the server, and one on the local dev machine - only differing in the top two lines. You probably shouldn’t put this file in version control, and definitely don’t include it in any deployment script you’re running. You know you’ll end up overwriting the one with the other, and all hell breaks loose :-)
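For completeness, here’s a sketch of a stylesheet that leans on “entities.ent” for both tricks at once. I’m assuming the umbraco.library prefix is bound to urn:umbraco.library (the prefix has to match whatever the production NiceUrl entity expands to), and the paragraph it outputs is only there to show both entities in use:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xsl:stylesheet [
    <!-- entities.ent is resolved relative to this stylesheet -->
    <!ENTITY % entities SYSTEM "entities.ent">
    %entities;
]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:umbraco.library="urn:umbraco.library"
    xmlns:make="&nodeset-ns-uri;"
    exclude-result-prefixes="umbraco.library make"
>
    <!-- Local default; Umbraco passes the real page in production -->
    <xsl:param name="currentPage" select="/root/Website[1]/Textpage[3]" />

    <xsl:variable name="data-values">
        <val>2</val>
        <val>4</val>
        <val>8</val>
    </xsl:variable>
    <xsl:variable name="data" select="make:node-set($data-values)" />

    <xsl:template match="/">
        <p>
            <!-- string(...) locally, the real NiceUrl(...) on the server -->
            <a href="{&NiceUrl;($currentPage/@id)}">
                <xsl:value-of select="$currentPage/@nodeName" />
            </a>
            has <xsl:value-of select="count($data/val)" /> values
        </p>
    </xsl:template>

</xsl:stylesheet>
```

This file is identical in both environments; the only thing that changes between my Mac and the server is which section of “entities.ent” is switched on.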

Still here?

You should know by now, that I really think you rock. Big-time.

Bonus Entities to think about:

<!ENTITY all-siblings "../*">
<!ENTITY empty "not(normalize-space())">

<!ENTITY dates "urn:Exslt.ExsltDatesAndTimes">
<!ENTITY day-in-week "dayinweek">
<!-- vs. -->
<!ENTITY dates "http://exslt.org/dates-and-times">
<!ENTITY day-in-week "day-in-week">

<!ENTITY displayTitle "(pageTitle | @nodeName[not(normalize-space(../pageTitle))])[1]">

<!ENTITY getCookie "string">
<!ENTITY setCookie "concat">
<!-- vs. -->
<!ENTITY getCookie "umb:RequestCookies">
<!ENTITY setCookie "umb:setCookie">
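As a rough sketch of how a few of these might be used, here’s a stylesheet that lists the current page and its siblings. The bodyText property is a hypothetical name I picked for the example, and the markup is illustrative only:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xsl:stylesheet [
    <!ENTITY all-siblings "../*">
    <!ENTITY empty "not(normalize-space())">
    <!ENTITY displayTitle "(pageTitle | @nodeName[not(normalize-space(../pageTitle))])[1]">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:param name="currentPage" select="/root/Website[1]/Textpage[3]" />

    <xsl:template match="/">
        <ul>
            <!-- Expands to $currentPage/../* (the page itself plus its siblings) -->
            <xsl:for-each select="$currentPage/&all-siblings;">
                <li>
                    <!-- pageTitle when it has content, otherwise the nodeName attribute -->
                    <xsl:value-of select="&displayTitle;" />
                    <!-- Flags pages whose (hypothetical) bodyText property exists but is empty -->
                    <xsl:if test="bodyText[&empty;]"> (no body text yet)</xsl:if>
                </li>
            </xsl:for-each>
        </ul>
    </xsl:template>

</xsl:stylesheet>
```

The “vs.” pairs above work differently: only one declaration of each pair should be active at a time, which is exactly what the development/production sections in “entities.ent” are for.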

I’m very much aware that what I’ve described in this series is by no means an average scenario, but I sincerely hope that if you’ve read this far, you’d want to dig a little deeper in some concept or coding pattern you’ve only scratched the surface of until now; and if you do — please consider writing a series of blogposts or articles for us to read!