--- On Sat, 11/22/08, Mikael Jansson <[EMAIL PROTECTED]> wrote:
...
> it's an app for converting XML files with tax data from the Swedish tax
> authority (and some other stuff, like EU customs data).
> I'm doing this for a company whose business is to hold a database with this
> data and distribute it to their customers directly from database to database,
> so they won't have to create this app for themselves. My app downloads,
> PGP-validates, unzips and converts the XML to SQL, and then loads it into a
> database.
>
> They distribute a number of difference XML files every night and a number of
> total XML files every month or so. It's these totals that are so big.
This is actually quite a common use case for XML, although not necessarily for
XSLT or other approaches that require a full in-memory tree representation.
Your best bet really is to process the data one sub-tree at a time (build
in-memory sub-trees, process them separately, re-combine the output if
necessary), as suggested by others, or to use a fully streaming approach (such
as the StaxMate library).
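
To make the "one sub-tree at a time" idea concrete, here is a rough sketch. It uses the plain Stax API that ships with the JDK (javax.xml.stream) rather than StaxMate, and it assumes each record is an element whose children contain only text. The class name RecordStreamer and the element names in the usage note are made up for illustration; they are not taken from the actual tax files.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams a large document and hands one record at a time
// to a callback, so only a single record is held in memory.
final class RecordStreamer {

    // Calls handler once per element named recordName; returns the record count.
    static int streamRecords(Reader in, String recordName,
                             Consumer<Map<String, String>> handler) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // DTD processing is not needed for this kind of data file.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            int count = 0;
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && recordName.equals(reader.getLocalName())) {
                        handler.accept(readRecord(reader));
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            // Wrapped as unchecked only to keep the sketch short.
            throw new IllegalStateException("Failed to parse XML", e);
        }
    }

    // Reads the children of the current record element into a name -> text map.
    // Assumes every child is a text-only element (no nested structure).
    private static Map<String, String> readRecord(XMLStreamReader reader)
            throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                // getElementText() consumes the child up to its end tag.
                fields.put(name, reader.getElementText());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                return fields; // end tag of the record element itself
            }
        }
        throw new XMLStreamException("Document ended inside a record");
    }
}
```

With a document such as `<items><item><code>A1</code><rate>25</rate></item>...</items>`, calling `RecordStreamer.streamRecords(reader, "item", handler)` invokes the handler once per `item`. The handler could, for example, turn each map into one row of SQL, so memory use stays roughly constant no matter how large the monthly total files get.
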
Also, given that this sounds more like a data-oriented task (as opposed to a
document-oriented one), perhaps data mapping (XML-to-object) tools such as JAXB
might be a better fit for processing the data. JAXB, for example, can also bind
just sub-trees (when using a Stax parser as the source), which avoids the size
problems.
-+ Tatu +-