Name
cx:pretty-print — Reformat whitespace in a document.
Synopsis
<p:declare-step type="cx:pretty-print"
                xmlns:cx="http://xmlcalabash.com/ns/extensions">
  <p:input port="source"/>
  <p:output port="result"/>
</p:declare-step>
Description
The cx:pretty-print step reformats an XML document by passing it through the following XSLT stylesheet, serializing the result, and then reparsing it[1].
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
                xmlns:saxon='http://icl.com/saxon'
                exclude-result-prefixes='saxon'
                version='2.0'>

  <xsl:output method='xml' indent='yes' saxon:indent-spaces='2'/>
  <xsl:strip-space elements='*'/>

  <xsl:template match='*'>
    <xsl:copy>
      <xsl:copy-of select='@*'/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match='comment()'>
    <xsl:choose>
      <xsl:when test="preceding-sibling::node()[1]/self::text()
                      and contains(preceding-sibling::text()[1], ' ')">
      </xsl:when>
      <xsl:otherwise>
        <xsl:text> </xsl:text>
      </xsl:otherwise>
    </xsl:choose>
    <xsl:copy/>
    <xsl:choose>
      <xsl:when test="following-sibling::node()[1]/self::text()
                      and contains(following-sibling::text()[1], ' ')">
      </xsl:when>
      <xsl:when test="following-sibling::node()[1]/self::comment()
                      or following-sibling::node()[1]/self::processing-instruction()">
      </xsl:when>
      <xsl:otherwise>
        <xsl:text> </xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match='processing-instruction()'>
    <xsl:choose>
      <xsl:when test="preceding-sibling::node()[1]/self::text()
                      and contains(preceding-sibling::text()[1], ' ')">
      </xsl:when>
      <xsl:otherwise>
        <xsl:text> </xsl:text>
      </xsl:otherwise>
    </xsl:choose>
    <xsl:copy/>
    <xsl:choose>
      <xsl:when test="following-sibling::node()[1]/self::text()
                      and contains(following-sibling::text()[1], ' ')">
      </xsl:when>
      <xsl:when test="following-sibling::node()[1]/self::comment()
                      or following-sibling::node()[1]/self::processing-instruction()">
      </xsl:when>
      <xsl:otherwise>
        <xsl:text> </xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
Serializing the pretty-printed output and reparsing it should normalize the whitespace so that the document prints with reasonable line breaks and indentation. However:
- Nothing in this process breaks very long runs of text into lines of reasonable length.
- If the parser performs validation on the input, it may have the effect of removing insignificant whitespace.
Your mileage may vary.
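As a usage illustration, the step might be called from a pipeline along the following lines. This is a minimal sketch, not taken from the XML Calabash distribution: it assumes an XProc 1.0 pipeline run with XML Calabash, and it assumes the extension step declarations are imported from the library URI http://xmlcalabash.com/extension/steps/library-1.0.xpl, which may differ between releases.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="1.0">
  <!-- The document to reformat arrives on the primary input port. -->
  <p:input port="source"/>
  <!-- The reformatted document leaves on the primary output port. -->
  <p:output port="result"/>

  <!-- Assumption: this library URI makes the cx:* step declarations
       visible to the pipeline; check it against your release. -->
  <p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>

  <!-- The primary ports connect implicitly: "source" feeds the step,
       and its "result" becomes the pipeline result. -->
  <cx:pretty-print/>
</p:declare-step>
```

Because cx:pretty-print declares a single source port and a single result port, it can be dropped between any two steps that pass one XML document along, for example just before a p:store.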
[1] Technically, the stylesheet used is /etc/prettyprint.xsl in the XML Calabash jar file.