Since XSLT can refer to any element in the document at any time, we do have
to build an in-memory model even when our input and output are SAX. We can
build it incrementally if so requested (see the DTM description page in our
documentation), so if your stylesheet only references the first half of the
document you wouldn't have to load the second half. We also use a fairly
compact representation of the loaded document, but that's currently the best
we can do.
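
For anyone who wants to try the incremental build, here is a minimal sketch of
requesting it through the standard JAXP TransformerFactory. The attribute URI
is the one I recall from the Xalan-J features documentation, so please confirm
it against the DTM page before relying on it. The class and method names are
made up for this example, and other processors are not expected to recognise
the attribute, hence the catch.

```java
import javax.xml.transform.TransformerFactory;

/** Hypothetical helper; not part of Xalan. */
public final class IncrementalOption {
    // Assumed Xalan-J feature URI; check it against the Xalan documentation.
    public static final String INCREMENTAL =
        "http://xml.apache.org/xalan/features/incremental";

    private IncrementalOption() {}

    /**
     * Asks the factory to build its document model incrementally.
     * Returns false if the factory rejects the attribute.
     */
    public static boolean tryEnable(TransformerFactory factory) {
        try {
            factory.setAttribute(INCREMENTAL, Boolean.TRUE);
            return true;
        } catch (IllegalArgumentException e) {
            // JAXP specifies this exception for unrecognised attributes.
            return false;
        }
    }
}
```

Call tryEnable() on the factory before creating the Transformer. A false return
just means the processor in use did not accept the attribute.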

Search the archives of this discussion for the term "pruning" to see what
we're hoping to do in the long term to reduce memory use
-- basically, finding ways to discard data which we're sure will never
again be referenced. I'm currently working on prototype code which lays
some of the groundwork for that, but it isn't yet ready to be checked in
and I'm very concerned about its possible performance impact.

Note too that the DTM model can be made still more compact, via the sort of
data-overlay tricks used in DTM's previous incarnation. However, doing so
will definitely have performance costs due to having to pack and unpack the
information. It still might be worth considering for larger documents. We
have a sketch of this partly drafted, but have not had the time to bring it
fully up to speed.
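
To make the pack/unpack tradeoff concrete, here is a small sketch of the
general idea: several per-node fields share one 32-bit int, so every read pays
for a shift and a mask. The field names and bit widths below are assumptions
chosen for illustration. This is not the layout DTM uses, nor the one in the
draft mentioned above.

```java
/** Illustrative only: three hypothetical node fields packed into one int. */
public final class PackedNode {
    private static final int TYPE_BITS = 4;   // node type, 0..15
    private static final int LEVEL_BITS = 8;  // tree depth, 0..255
    private static final int NAME_BITS = 20;  // name-table index, 0..1048575

    private static final int TYPE_MASK = (1 << TYPE_BITS) - 1;
    private static final int LEVEL_MASK = (1 << LEVEL_BITS) - 1;
    private static final int NAME_MASK = (1 << NAME_BITS) - 1;

    private static final int LEVEL_SHIFT = TYPE_BITS;
    private static final int NAME_SHIFT = TYPE_BITS + LEVEL_BITS;

    private PackedNode() {}

    /** Packs the three fields; rejects values that do not fit their width. */
    public static int pack(int type, int level, int nameIndex) {
        if ((type & ~TYPE_MASK) != 0
                || (level & ~LEVEL_MASK) != 0
                || (nameIndex & ~NAME_MASK) != 0) {
            throw new IllegalArgumentException("field out of range");
        }
        return type | (level << LEVEL_SHIFT) | (nameIndex << NAME_SHIFT);
    }

    // Each accessor costs a shift and a mask; that is the unpacking overhead.
    public static int type(int packed) {
        return packed & TYPE_MASK;
    }

    public static int level(int packed) {
        return (packed >>> LEVEL_SHIFT) & LEVEL_MASK;
    }

    public static int nameIndex(int packed) {
        return (packed >>> NAME_SHIFT) & NAME_MASK;
    }
}
```

In this sketch a node costs 4 bytes instead of the 12 that three separate int
fields would take, in exchange for the extra arithmetic on every access.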