Do you mean stream a single large XML file? A series of XML files? Or stream a 
file through a series of XQuery|XSLT|XPath transforms?

Possibly poor wording on my part: I meant reading a large XML file and producing, e.g., a CSV file.
I don’t believe BaseX uses a streaming XML parser, so it probably can’t handle 
streaming a single large XML file and producing output before it has parsed the 
complete file.
Do you know of a streaming XML library other than StAX (no Java here :<)?
But it looks like, from the link in your Stack Overflow post, that the data is already 
sharded into a collection of separate XML files that each contain multiple 
<page> elements.

That is the alternative: instead of processing the monolithic multistream file, I could crawl over the ~150MB bz2-compressed chunks.
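
For illustration, a rough sketch of what that crawl could look like outside BaseX, using Python's standard library (bz2, csv and ElementTree's iterparse). It assumes each chunk decompresses to a well-formed XML document and that every <page> has <title> and <text> descendants; those element names, the CSV columns and the function names are my assumptions, not something confirmed in this thread.

```python
import bz2
import csv
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def pages_to_csv(xml_stream, csv_stream):
    """Read <page> elements incrementally and write one CSV row per page.

    Returns the number of pages written. The 'title' and 'text' element
    names are assumptions about the layout of each page.
    """
    writer = csv.writer(csv_stream)
    writer.writerow(["title", "text_length"])
    count = 0
    # iterparse yields each element as soon as its end tag has been read,
    # so rows are written before the whole document has been parsed.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = ""
        text = ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                text = child.text or ""
        writer.writerow([title, len(text)])
        count += 1
        # Drop the page's children and text; the emptied element itself
        # stays attached to the root.
        elem.clear()
    return count


def convert_chunk(bz2_path, csv_path):
    """Decompress one bz2 chunk on the fly and convert it to a CSV file."""
    with bz2.open(bz2_path, "rb") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        return pages_to_csv(src, dst)
```

Calling convert_chunk() once per chunk (both paths are up to the caller) keeps memory use roughly proportional to the largest single page rather than to the whole dump, although the emptied page elements do accumulate on the root.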

Regards, Maxime

