Do you mean streaming a single large XML file, a series of XML files, or
streaming a file through a series of XQuery|XSLT|XPath transforms?
Possibly poor wording on my part: I meant reading a large XML file and
producing, e.g., a CSV file.
I don’t believe BaseX uses a streaming XML parser, so it probably can’t handle
streaming a single large XML file and producing output before it has parsed the
complete file.
Do you know of a streaming XML library other than StAX (no Java here :<)?
But from the link in your Stack Overflow post, it looks like the data is already
sharded into a collection of separate XML files that each contain multiple
<page> elements.
That is the alternative: instead of processing the monolithic
multistream file, I could crawl over the ~150MB bz2-compressed chunks.
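
A minimal sketch of that chunk-by-chunk approach, assuming Python is an
option and that each chunk decompresses to a well-formed XML document whose
<page> elements have direct <id> and <title> children. The function names,
the two CSV columns and the file names in the trailing comment are
hypothetical, chosen only for illustration; the streaming itself relies on
the standard library's xml.etree.ElementTree.iterparse and bz2.open.

```python
import bz2
import csv
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def pages_to_csv(xml_stream, csv_file):
    """Write one CSV row (id, title) per <page> element and return the row count.

    xml_stream is a binary file-like object; csv_file is a text file-like object.
    """
    writer = csv.writer(csv_file)
    writer.writerow(["id", "title"])

    context = ET.iterparse(xml_stream, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root element

    count = 0
    for event, elem in context:
        if event != "end" or local_name(elem.tag) != "page":
            continue
        # Only direct children are read, so a nested <revision><id> is ignored.
        fields = {local_name(child.tag): (child.text or "") for child in elem}
        writer.writerow([fields.get("id", ""), fields.get("title", "")])
        count += 1
        root.clear()  # drop pages already written so memory use stays bounded
    return count


def convert_chunk(src_path, dst_path):
    """Decompress one bz2 chunk on the fly and convert it to CSV."""
    with bz2.open(src_path, "rb") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        return pages_to_csv(src, dst)


# Hypothetical usage, one call per chunk:
#   convert_chunk("chunk-0001.xml.bz2", "chunk-0001.csv")
```

Rows are written as soon as each <page> closes, so output starts before the
whole chunk has been read, and neither the decompressed XML nor the full
tree is held in memory.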
Regards, Maxime