Hi all,

This might be a bit of an unusual question... I think normally one of the
first things people ask when starting to work with an XML parser is, "how
can I make it stop chunking my characters() callbacks?" and the usual
answer is, "well, it's allowed to do that; just aggregate them yourself."
In my case, I'd actually like to *force* Xerces to chunk. I have some truly
horrible, degenerate XML I need to parse that basically consists of a 70
megabyte block of binary data (Base64-encoded so as not to wreak havoc on
the XML parsing) enclosed in a single element.
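(For context on why aggregating isn't attractive here: with chunked
callbacks, the Base64 payload could be decoded incrementally instead of
ever living in memory whole. A minimal standalone sketch of that idea,
assuming my own StreamingBase64Decoder class, nothing from Xerces: feed()
accepts arbitrary chunks, decodes complete 4-character groups immediately,
and carries at most a partial group across calls.)

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical streaming decoder: feed() takes Base64 text in arbitrary
// chunks (as a characters() callback might deliver them) and decodes each
// complete 4-character group as soon as it is seen, so only a partial
// group (at most 3 characters) is ever carried between calls.
class StreamingBase64Decoder {
public:
    void feed(const std::string& chunk) {
        for (char c : chunk) {
            if (c == '\n' || c == '\r' || c == ' ' || c == '\t')
                continue;                       // ignore XML whitespace
            if (c == '=')
                ++padding_;                     // trailing pad character
            else
                pending_[nPending_++] = value(c);
            if (nPending_ + padding_ == 4)      // a full group is ready
                flushGroup();
        }
    }
    const std::vector<uint8_t>& bytes() const { return out_; }
private:
    static int value(char c) {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        return c == '+' ? 62 : 63;              // '+' and '/'
    }
    void flushGroup() {
        uint32_t v = 0;
        for (int i = 0; i < 4; ++i)
            v = (v << 6) | uint32_t(i < nPending_ ? pending_[i] : 0);
        int nBytes = 3 - padding_;              // pad chars shrink output
        for (int i = 0; i < nBytes; ++i)
            out_.push_back(uint8_t((v >> (16 - 8 * i)) & 0xFF));
        nPending_ = 0;
        padding_ = 0;
    }
    int pending_[4] = {0};
    int nPending_ = 0;
    int padding_ = 0;
    std::vector<uint8_t> out_;
};
```

With callbacks capped at some reasonable size, a handler built around
something like this would keep memory use flat no matter how big the
element's content is.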

The trouble I'm running into is that during parsing, an in-memory buffer
holding the characters of this enormous block is maintained and grown as
necessary by XMLBuffer::insureCapacity. The buffer eventually gets so large
that the allocation in insureCapacity fails and parsing can't continue.
What I'd like is a way to tell Xerces to buffer only up to some maximum
amount of character data at a time before calling sendCharData (in
IGXMLScanner::scanCharData), rather than waiting until it has everything.
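Conceptually, the behavior I'm after looks something like the following
standalone sketch. The names (BoundedCharBuffer, maxChunkSize, onFlush)
are mine, not Xerces'; the onFlush callback stands in for sendCharData:

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Toy model of the proposed behavior: a character buffer that, instead of
// growing without bound, hands its contents to a sink (think sendCharData)
// whenever a configurable cap is reached, then starts over.
class BoundedCharBuffer {
public:
    BoundedCharBuffer(std::size_t maxChunkSize,
                      std::function<void(const std::string&)> onFlush)
        : cap_(maxChunkSize), onFlush_(std::move(onFlush)) {}

    void append(const std::string& chars) {
        for (char c : chars) {
            buf_.push_back(c);
            if (buf_.size() >= cap_)
                flush();                 // emit a chunk early, keep going
        }
    }
    // Emit whatever is left (e.g. when the end tag is scanned).
    void flush() {
        if (!buf_.empty()) {
            onFlush_(buf_);
            buf_.clear();
        }
    }
private:
    std::size_t cap_;
    std::function<void(const std::string&)> onFlush_;
    std::string buf_;
};
```

With a cap of, say, 64 KB, the scanner would emit a stream of bounded
sendCharData calls instead of one giant allocation.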

As far as I can tell, there's currently no way to do this. I'd welcome
feedback on how easily people think this could be implemented, whether
it's reasonable to do so, etc., and (as a newbie to the Xerces codebase)
I'd hope to get some assistance in implementing it.

Any help would be much appreciated. I look forward to your answers,
Dan Rosen
