I have a resource in my sitemap which makes a web page available as XHTML:

<map:match pattern="fetch/**">
  <map:generate src="http://{1}" type="html"/>
  <map:transform src="xsl/as-is.xsl"/>
  <map:serialize type="xhtml"/>
</map:match>

I call this from within another XSLT file so that I can screen-scrape the page for a specific element type, having first made sure it has been Tidy'd to XHTML. The as-is.xsl is a plain identity transform matching "*". Ugly, but useful (there must be a more elegant way, but I haven't found it). In the second XSLT file I have a template matching an element type which holds the desired URI in an attribute:

<xsl:apply-templates select="document(concat('http://myserver/fetch/',@site))//
             descendant::html:d...@class='foo']"/>
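
For reference, a plain identity transform of the kind described for as-is.xsl would look roughly like the following. This is a sketch of the standard idiom, not a copy of the actual file: it assumes XSLT 1.0 and matches "@*|node()" rather than just "*", so that attributes, text and comments are copied through as well as elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy each attribute and node unchanged,
       then recurse into its attributes and children. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```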

Constructing the URI and issuing it by hand from the terminal with curl, wget, dog, etc. works fine, and the resulting XHTML file is usable (tested with lxgrep to ensure that the XPath extracts the right element), so I know that bit works.

When accessed from within the second stylesheet, the cocoon.log shows Tidy successfully converting the remote page to XHTML, the same as when tested from the terminal, but the data never makes it through to the template for html:div (the namespace *is* specified in the stylesheet :-). In cocoon.log there's a warning:

WARN  (2009-10-23) 11:34.02:162 [sitemap.transformer.xslt] (/doc/test) 
TP-Processor9/TraxErrorListener: file:///xsl/tools.xsl:7:138

but it doesn't say what it found wrong (not very helpful). Line 7 of tools.xsl is the apply-templates shown above, and char 138 is the end of that line.

Testing it from the command line with Saxon, I get this:

Recoverable error on line 7 of file:/xsl/tools.xsl:
  FODC0005: java.io.IOException: Server returned HTTP response code: 503 for URL:
  http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd

503 indicates a temporary overload, but that URI is retrievable with curl immediately before and after running Saxon. And in any case, when going via Cocoon, the DTD would be cached (wouldn't it, to avoid overloading the W3C with a gazillion requests for the DTD URI?)

I'm missing a trick here, but I can't see what.

///Peter
