On Sat, 08 Nov 2003 18:22:08 +0100
Torsten Curdt <[EMAIL PROTECTED]> wrote:

> >> ...I was wondering - is this a bug of the component that produces the
> >> SAX events or the XMLByteStreamCompiler? I mean: now it's ok - but
> >> should we silently ignore the problem?
> > 
> > 
> > Torsten, I don't understand your concerns. Isn't the fix simply about 
> > handling text nodes longer than 32 k? Ok, they shouldn't occur that 
> > often (it's half a novel :-) ), but it's possible.
> 
> ...we duplicate events here and thereby modify the SAX stream.
> Should be no problem.... but who knows ;)
> 
> with the patch:
> 
>   characters(36k)
> ->
>   event
>   string 32k
>   event
>   string 4k
> 
> I guess it would be better to have it like this:
> 
>   characters(36k)
> ->
>   event
>   string 32k
>   string 4k
> 
> So what goes in comes out the same way.
> 
> We could also increase the max length of a stored
> character event in general. ...but that would waste
> 2 bytes per event. Hm...
> 
> What do you think?
> --
> Torsten
> 
Hi,

why should we handle the UTFDataFormatException at all? The last solution ignores
this exception, doesn't it?
What is the difference between

event 
string 32k
string 4k

and
event
string 36k 

in the bytestream?

The question is whether we need the UTFDataFormatException or not. If not, a patch can
simply remove the statement if(string>32k){} and then we get the result:

event
string xxk (the limit is then the Java integer range)
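
To make the "integer range" idea concrete, here is a minimal sketch of a string record
with a four-byte length prefix. It is not the actual CXML layout and not code from
XMLByteStreamCompiler; the class name IntPrefixedString and its methods are hypothetical.
For comparison, java.io.DataOutputStream.writeUTF uses a two-byte length prefix and
throws UTFDataFormatException when the encoded string exceeds 65535 bytes. Note that
changing the prefix width would presumably break any existing decoder of the bytestream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: a string record with a 4-byte length prefix.
// This is an illustration only, not the CXML format used by Cocoon.
final class IntPrefixedString {

    /** Encodes s as: int byte count (4 bytes, big-endian) followed by UTF-8 bytes. */
    static byte[] write(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(utf8.length); // 4 bytes instead of 2: the extra cost per event
            out.write(utf8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Decodes a record produced by write(). */
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte[] utf8 = new byte[in.readInt()];
            in.readFully(utf8);
            return new String(utf8, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout a 36k text node is stored as a single record, at the cost of two
extra bytes for every string, which is the trade-off Torsten mentions above.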

Maybe I'm totally wrong, but I think the 32k string limitation comes from the
CXML format by Stefano Mazzocchi:
http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=97194999124269&w=2

As I understand it, the CXML format is independent of Cocoon and Java,
so anyone who writes a decoder in C can use that bytestream, too.

The SAX events should not be the problem; every SAX handler has to process the
following correctly:

<node>
  text here 
  <!-- comment here -->
  text here
</node>

This gives a character event, a comment event, and a character event for one node, or do I
misunderstand SAX processing totally?

If that's correct, consecutive character events (character event, character event, ...)
should not be a problem.
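
A small sketch of why split character events are harmless to a well-behaved handler:
the handler below simply appends every characters() call to a buffer, so it does not
matter how many calls the text arrives in. The class name TextCollector is hypothetical.
Two side notes: the ContentHandler documentation explicitly allows a parser to deliver
contiguous character data in several chunks, and comments are reported through
org.xml.sax.ext.LexicalHandler, which this sketch does not register.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: accumulates all character data, however it is chunked.
final class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private int characterEvents = 0;

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append only the announced range; the array may contain other data.
        text.append(ch, start, length);
        characterEvents++;
    }

    String text() {
        return text.toString();
    }

    int characterEvents() {
        return characterEvents;
    }

    /** Parses the XML string with the JDK's default SAX parser. */
    static TextCollector parse(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler;
    }
}
```

For example, TextCollector.parse("<node>a<!-- c -->b</node>").text() yields "ab"
regardless of how many characters() calls the parser chose to make.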

Of course, the patch does not handle the splitting efficiently, and it may be better to
write it as a while loop in a separate splitBigStrings method.
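
Roughly what I have in mind, as a sketch only: the class name ChunkSplitter, the
MAX_CHUNK constant (taken from the "32k" figure in this thread) and the method signature
are assumptions, not the existing patch. It counts chars, not encoded bytes, so a real
implementation would have to pick the limit with the UTF-8 expansion in mind.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the actual XMLByteStreamCompiler code.
final class ChunkSplitter {
    // Assumed limit: the "32k" mentioned in this thread, counted in chars.
    static final int MAX_CHUNK = 32 * 1024;

    /** Splits ch[start .. start+length) into strings of at most maxChunk chars. */
    static List<String> splitBigStrings(char[] ch, int start, int length, int maxChunk) {
        if (maxChunk < 1) {
            throw new IllegalArgumentException("maxChunk must be >= 1");
        }
        List<String> chunks = new ArrayList<>();
        int pos = start;
        int end = start + length;
        while (pos < end) {
            int n = Math.min(maxChunk, end - pos);
            // Avoid cutting a surrogate pair in half when more data follows.
            if (pos + n < end && n > 1 && Character.isHighSurrogate(ch[pos + n - 1])) {
                n--;
            }
            chunks.add(new String(ch, pos, n));
            pos += n;
        }
        return chunks;
    }

    public static void main(String[] args) {
        char[] data = new char[36 * 1024];
        java.util.Arrays.fill(data, 'x');
        for (String chunk : splitBigStrings(data, 0, data.length, MAX_CHUNK)) {
            System.out.println(chunk.length()); // prints 32768, then 4096
        }
    }
}
```

Each chunk could then be written as its own string, whether or not a separate event
is emitted in front of it.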



Sorry if I'm wrong,

regards Simon


