Hi all,

This might be a bit of an unusual question. I think one of the first things people normally ask when starting to work with an XML parser is, "how can I make it stop chunking my characters() callbacks?", and the usual answer is, "it's allowed to do that; just aggregate the chunks yourself." In my case, I'd actually like to *force* Xerces to chunk. I have some truly horrible, degenerate XML I need to parse, which basically consists of a 70 megabyte block of binary data (Base64-encoded so as not to wreak havoc on XML parsing) enclosed in a single element.
The trouble I'm running into is that, during parsing, Xerces maintains an in-memory buffer for the characters in this tremendous block of data, growing it as necessary via XMLBuffer::insureCapacity. The buffer eventually gets so large that the allocation in insureCapacity fails, and parsing can't continue.

What I'd like is a way to tell Xerces to buffer only up to some maximum amount of character data before calling sendCharData (in IGXMLScanner::scanCharData), rather than waiting until it has everything. As far as I can tell, there's no way to do this currently. I'd welcome feedback on how easily people think this could be implemented, whether it's reasonable to do, and (as a newbie to the Xerces codebase) hopefully some assistance in implementing it.

Any help would be much appreciated. I look forward to your answers,

Dan Rosen
