Hi,
I am handling GML documents with Cocoon which are very verbose (about
11 MB in size).
I am receiving the compressed data from an HTTP POST and then using a
custom generator to decompress this data and pass it on down the
pipeline. Is there a way to improve this bottleneck?
At the generator it has to hold the entire 11 MB while it is decompressing, and then in places the XSLT has to hold the whole XML document in memory to do a sort.
What does everyone else do with very large XML documents and Cocoon? I thought the fact that it was SAX-based would make it OK, but it is slow, and occasionally falls over.
Why does the generator need to hold the entire file? If the content is gzipped, you can just wrap a GZIPInputStream around the input stream you get from the servlet container, and hand that GZIPInputStream straight to the parser.
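
A minimal sketch of what I mean, assuming the POST body is gzip-compressed XML and that the handler is whatever SAX handler feeds the rest of your pipeline (the names here are mine, not Cocoon's):

    import java.util.zip.GZIPInputStream;
    import javax.servlet.http.HttpServletRequest;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.InputSource;
    import org.xml.sax.helpers.DefaultHandler;

    public class GzipSaxExample {
        // Decompress on the fly: the gzipped POST body streams through
        // the decompressor into the SAX parser, so only a small buffer
        // is ever held in memory, never the whole 11 MB document.
        static void parseGzippedPost(HttpServletRequest request,
                                     DefaultHandler handler) throws Exception {
            GZIPInputStream gzip = new GZIPInputStream(request.getInputStream());
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(gzip), handler);
        }
    }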
If you want to dynamically process this, you're going to have fun on any system. However, what you want to do is avoid holding stuff in memory, and use streaming technologies wherever possible. One to look at is the STX block. I don't know if it can give you everything you want, but it might help as a replacement for XSLT that is much more streaming-focused, and can therefore handle files of arbitrary size. I wouldn't have thought it will handle the sorting, though.
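
For what it's worth, here is a rough sketch of driving an STX transform through standard TrAX using Joost, which I believe is the STX engine behind the STX block; the stylesheet and file names are hypothetical:

    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class StxExample {
        public static void main(String[] args) throws Exception {
            // Ask JAXP for Joost's TrAX factory so that STX stylesheets
            // are accepted in place of XSLT ones.
            System.setProperty("javax.xml.transform.TransformerFactory",
                               "net.sf.joost.trax.TransformerFactoryImpl");
            Transformer stx = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource("extract.stx"));
            // STX walks the input as a stream of SAX events, so the
            // document is never built up in memory as a whole.
            stx.transform(new StreamSource("large.xml"),
                          new StreamResult(System.out));
        }
    }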
Hope that helps.
Regards, Upayavira