With Xalan switched into incremental mode (not currently the default), it
should start generating output as soon as enough input has arrived to begin
running the stylesheet, and should read only as much input as the stylesheet
actually needs. However, Xalan still builds a tree internally, so its memory
use will keep growing as your stream continues.

If you have the option of running a wrapper around Xalan that divides your
input stream into manageable sub-documents and processes each of those in
turn, that would reduce the maximum load on the system. Of course, you'd
have to rewrite your stylesheets to work on one chunk at a time, and some
kinds of stylesheet can't be handled that way. Even then, you might be able
to reorganize the task to make this approach work, e.g. by generating the
table of contents as a separate document rather than scanning all the
chapters beforehand.
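
To make the wrapper idea concrete, here is a rough sketch in Python of the
splitting half only. It is not part of Xalan. The names process_in_chunks,
chunk_tag and transform_chunk are made up for illustration, and it assumes
the chunks you care about are direct children of the root element.
transform_chunk stands for whatever you use to run one sub-document through
your stylesheet (for instance, something that hands the text to Xalan).

```python
import io
import xml.etree.ElementTree as ET


def process_in_chunks(source, chunk_tag, transform_chunk):
    """Yield transform_chunk(xml_text) for each direct child of the root
    element whose tag is chunk_tag, discarding each child once handled.

    source is a file name or a file object; transform_chunk is a
    hypothetical callable that processes one serialized sub-document.
    """
    events = ET.iterparse(source, events=("start", "end"))
    _, root = next(events)      # the first event is the root's start tag
    depth = 0                   # nesting depth below the root element
    for event, elem in events:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth != 0:
            continue            # not a direct child of the root
        if elem.tag == chunk_tag:
            elem.tail = None    # drop whitespace that follows the chunk
            yield transform_chunk(ET.tostring(elem, encoding="unicode"))
        root.clear()            # release everything parsed so far


if __name__ == "__main__":
    doc = b"<book><chapter>one</chapter><chapter>two</chapter></book>"
    # The identity function stands in for the real per-chunk transform.
    for result in process_in_chunks(io.BytesIO(doc), "chapter", lambda s: s):
        print(result)
```

Because the root is cleared after every top-level child, memory use is
bounded by the largest single chunk rather than by the whole stream. The
cost is the one described above: each call sees only its own chunk, so
anything that needs a view of the whole document has to be produced some
other way.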

In the long term, we really want to automatically recognize when this sort
of reorganization of the problem would be useful and do it for you. See the
recent discussions of "pruning" for some comments on what we want to do
about that and the challenges involved.

