Joseph,
This is very helpful. As we'll be dealing mostly with database records, they
can indeed be formatted as small chunks (or even individual records by
type). I'm thinking now of using a flow like this:
1) Output the rows of a query incrementally as an XML stream
2) Parse the stream using a SAX2 Parser (the one in Xerces)
3) Use the SAX2 events to determine each record's type and which XSLT
stylesheet applies
4) Transform that XML element
5) Output a stream of transformed segments
6) Write the results to an output file (typically HTML)
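To make the flow above concrete, here is a rough sketch of steps 2 through 5
using only the standard JAXP and SAX2 interfaces. The class name RecordRouter,
the transformAll helper, and the element names in main are made up for
illustration. It also assumes every record is a direct child of the root
element and that the input uses no namespaces (prefix mappings are not
forwarded), so treat it as a starting point rather than a finished design.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: routes each child of the root element to the
// stylesheet registered for its element name and appends the result to out.
class RecordRouter extends DefaultHandler {
    private final SAXTransformerFactory factory =
        (SAXTransformerFactory) TransformerFactory.newInstance();
    private final Map<String, Templates> templates = new HashMap<>();
    private final Writer out;
    private TransformerHandler current; // non-null while inside a routed record
    private int depth;

    RecordRouter(Map<String, String> xsltByElement, Writer out) throws Exception {
        this.out = out;
        for (Map.Entry<String, String> e : xsltByElement.entrySet()) {
            // Compile each stylesheet once; the Templates object is reused per record.
            templates.put(e.getKey(), factory.newTemplates(
                new StreamSource(new StringReader(e.getValue()))));
        }
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        depth++;
        if (depth == 2 && templates.containsKey(local)) {
            try {
                current = factory.newTransformerHandler(templates.get(local));
            } catch (Exception e) {
                throw new SAXException(e);
            }
            current.setResult(new StreamResult(out));
            current.startDocument(); // each record becomes its own small document
        }
        if (current != null) current.startElement(uri, local, qName, atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (current != null) current.characters(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (current != null) {
            current.endElement(uri, local, qName);
            if (depth == 2) {
                current.endDocument(); // completes the transform for this record
                current = null;
            }
        }
        depth--;
    }

    // Parses xml and returns the concatenated output of all routed records.
    static String transformAll(String xml, Map<String, String> xsltByElement)
            throws Exception {
        StringWriter out = new StringWriter();
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // so local names are reported to the handler
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(new RecordRouter(xsltByElement, out));
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String head = "<xsl:stylesheet version='1.0' "
            + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='html'/>";
        Map<String, String> xslt = new HashMap<>();
        xslt.put("customer", head + "<xsl:template match='/customer'>"
            + "<p><xsl:value-of select='name'/></p></xsl:template></xsl:stylesheet>");
        String xml = "<rows><customer><name>Ann</name></customer></rows>";
        System.out.println(transformAll(xml, xslt));
    }
}
```

Records whose element name has no registered stylesheet are simply skipped
here; a real version would probably want to log or reject them. Only one
record's worth of tree is held at a time, which is the point of splitting.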
We may also be able to batch groups of similar records using a preset
buffer size. Let me know if you see any major gotchas with this. So far,
Transformers in Xalan seem very fast, so I expect that calling them
repeatedly, once per record or batch, should perform reasonably well.
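The batching idea could look roughly like the sketch below. RecordBatcher and
its callback are hypothetical names, and records are represented as plain
strings only to keep the example short. Consecutive records of the same type
are collected and handed off when the type changes or the preset buffer size
is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical helper: groups consecutive records of the same type and
// passes each group to a callback once it is full or the type changes.
class RecordBatcher {
    private final int maxSize;
    private final BiConsumer<String, List<String>> sink;
    private final List<String> buffer = new ArrayList<>();
    private String currentType;

    RecordBatcher(int maxSize, BiConsumer<String, List<String>> sink) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        this.maxSize = maxSize;
        this.sink = sink;
    }

    void add(String type, String record) {
        if (!type.equals(currentType)) {
            flush();            // a new record type ends the previous batch
            currentType = type;
        }
        buffer.add(record);
        if (buffer.size() >= maxSize) {
            flush();            // preset buffer size reached
        }
    }

    // Emits whatever is buffered; call once more after the last record.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(currentType, new ArrayList<>(buffer));
        buffer.clear();
    }

    public static void main(String[] args) {
        RecordBatcher batcher = new RecordBatcher(2,
            (type, records) -> System.out.println(type + ": " + records));
        batcher.add("customer", "a");
        batcher.add("customer", "b");
        batcher.add("customer", "c");
        batcher.add("order", "x");
        batcher.flush();
    }
}
```

The callback is where a batch would be wrapped in a small document and passed
to the matching Transformer, so the buffer size directly bounds how large any
one input tree can get.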
Thanks in advance for the help.
Cory
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 12, 2001 2:23 PM
To: [EMAIL PROTECTED]
Subject: Re: Incremental Transformation Question
With Xalan switched into incremental mode (not currently the default), it
should generate output as sufficient input arrives to start running the
stylesheet, and should only read as much as this stylesheet actually needs.
However, Xalan is still building a tree internally, so as your stream
continues it will consume more memory.
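For reference, switching on incremental mode goes through the factory's
attribute mechanism. The sketch below uses the attribute name as I remember
it from the Xalan-J 2 documentation, so please check it against the release
you are running; IncrementalMode and tryEnable are just illustrative names.
Other XSLT processors will not recognize the attribute, so the call is
guarded.

```java
import javax.xml.transform.TransformerFactory;

class IncrementalMode {
    // Attribute name as documented for Xalan-J 2 (assumption: verify for your version).
    static final String FEATURE_INCREMENTAL =
        "http://xml.apache.org/xalan/features/incremental";

    // Returns true if the factory accepted the attribute, false otherwise.
    static boolean tryEnable(TransformerFactory factory) {
        try {
            factory.setAttribute(FEATURE_INCREMENTAL, Boolean.TRUE);
            return true;
        } catch (IllegalArgumentException e) {
            // JAXP factories reject attributes they do not recognize.
            return false;
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println("incremental mode accepted: " + tryEnable(factory));
    }
}
```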
If you have the option of running a wrapper around Xalan that divides your
input stream into manageable sub-documents and processes each of those in
turn, that would reduce the maximum load on the system. Of course, you'd
have to rewrite your stylesheets to work on one chunk at a time, and there
are some kinds of stylesheet that you can't do that with. Even then, you
might be able to reorganize the task to make this approach work, e.g. by
generating the table of contents as a separate document rather than
scanning all the chapters beforehand.
In the long term, we really want to automatically recognize when this sort
of reorganization of the problem would be useful and do it for you. See the
recent discussions of "pruning" for some comments on what we want to do
about that and the challenges involved.