The archives have been moved to http://marc.theaimsgroup.com/?l=xalan-dev. You may want to check there for past discussion of "streaming" and "pruning". (Does the website really still point to Covalent? We should get that fixed.)
Brief answer: We recently made some changes that should let us accommodate much larger documents than in the past, so you're fairly unlikely to run into a hard limit in Xalan's storage. But you may find that swapping takes a significant bite out of your performance.

Longer answer: The XSLT data model assumes the entire document will be in memory at once. We have the ability to defer loading until the data is required, which can be very helpful if you need lower latency, and which can reduce the problem size if your stylesheet does not access the whole document. But if you're going to process all the way to the end of a huge document, the whole thing will end up in memory.

Being able to recognize portions of the source document which will never again be referenced and discard them from memory -- which we refer to as "tree pruning" -- is an area of ongoing research, complicated somewhat by our use of the DTM data model. I had a prototype of user-invoked pruning almost ready for testing, but the changes we made to allow larger documents will require some redesign.
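For anyone who wants to try the deferred loading mentioned above, here is a minimal sketch through the standard JAXP API. It assumes the Xalan-J 2.x interpretive processor is the factory JAXP picks up, and that the "incremental" attribute URI below matches your release; please check it against the Xalan features documentation. The class name, stylesheet, and sample input are made up for illustration. Other processors (including the XSLTC-based one bundled with the JDK) are expected to reject the attribute, so the sketch falls back to an ordinary transform in that case. It does not by itself reduce memory use if the stylesheet reads to the end of the input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class IncrementalSketch {
    // Assumption: attribute URI used by Xalan-J 2.x to request incremental
    // (deferred) building of the source tree. Verify against your release.
    static final String INCREMENTAL =
        "http://xml.apache.org/xalan/features/incremental";

    static String transform(String xml, String xslt) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            // Ask for the source tree to be built on demand rather than up front.
            factory.setAttribute(INCREMENTAL, Boolean.TRUE);
        } catch (IllegalArgumentException e) {
            // The factory in use does not recognize the attribute;
            // continue with a normal, fully loaded transform.
        }
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Hypothetical stylesheet that only touches the first <item>,
        // which is the case where deferred loading can help.
        String xslt =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "<xsl:value-of select='/doc/item[1]'/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        String xml = "<doc><item>first</item><item>second</item></doc>";
        System.out.println(transform(xml, xslt));
    }
}
```

With a real workload you would pass a StreamSource over the large file instead of a string. How much of the input actually gets read before output starts depends on the processor and on what the stylesheet selects.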
