I'm not very surprised by these numbers: XMLC does a fair amount of work to serialize Strings to bytes.

Furthermore, I just looked at XMLByteStreamCompiler.write(), and it appears to spend most of its time resizing the byte buffer, since each resize grows the buffer only by the number of bytes needed for the current write rather than by a larger growth increment.

It would be interesting to redo the test by introducing this growth increment. BTW, I don't understand the "this.buf.length << 1" in the write() method.

Well, that's not exactly true:


buf.length << 1 is a left shift by one bit, which is the same
as buf.length * 2. Math.max() then chooses the bigger of that
and the size actually needed, so the buffer at least doubles.

So that method is fine ;)
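To illustrate the growth policy described above, here is a minimal sketch of a doubling byte buffer. It is not the actual XMLByteStreamCompiler code: the class name GrowableBuffer, the initial capacity of 4 and the resizes counter are assumptions made up for the example.

```java
import java.util.Arrays;

// Hypothetical stand-in for a growable byte buffer; not the Cocoon class.
class GrowableBuffer {
    private byte[] buf = new byte[4]; // small initial capacity, chosen for the demo
    private int count = 0;            // number of valid bytes in buf
    int resizes = 0;                  // how often the backing array was reallocated

    void write(byte[] b, int off, int len) {
        int newcount = count + len;
        if (newcount > buf.length) {
            // Grow to at least double the current capacity, or to the
            // required size if doubling is still not enough.
            buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
            resizes++;
        }
        System.arraycopy(b, off, buf, count, len);
        count = newcount;
    }

    int size() { return count; }

    int capacity() { return buf.length; }
}
```

With this policy, a long run of small writes triggers only a logarithmic number of reallocations, whereas growing to exactly newcount would reallocate on almost every write.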

But the huge difference between the SaxBuffer and the XMLC is that the XMLC serializes the SAX events on the fly. The SaxBuffer does not support serialization but keeps the events as objects.

IMO spending time on the serialization only makes sense if

 a) the memory consumption is too high otherwise
 b) the SAX stream is being saved to disk

Maybe we can extend the testcases to compare the memory consumption. For the question of the destination we could let the store decide.

Anyway, both classes make sense. But maybe they would make even more sense if they shared the same interface and became interchangeable.

SAX stream buffering is a vital component of Cocoon. Looking at the numbers, the impact on performance could be tremendous.

What do you think?



Can't we merge both: use SAXBuffer for in-memory storage, and use XMLC/XMLI to serialize it? This could even be done transparently by having SAXBuffer implement Serializable and use XMLC/XMLI inside readObject() and writeObject().
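A rough sketch of what that could look like, using only the standard java.io serialization hooks. EventBuffer and its String events are hypothetical placeholders for SAXBuffer and its event objects, and the writeUTF()/readUTF() loop merely stands in for the compact XMLC/XMLI encoding; none of this is taken from the Cocoon sources.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical placeholder for an in-memory event buffer.
class EventBuffer implements Serializable {
    private static final long serialVersionUID = 1L;

    // Events live as plain objects in memory. The field is transient so
    // that default serialization skips it and the custom hooks take over.
    private transient List<String> events = new ArrayList<>();

    void record(String event) { events.add(event); }

    List<String> events() { return events; }

    // Called by ObjectOutputStream; a real version would delegate to the
    // compact stream encoder here instead of writeUTF().
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(events.size());
        for (String e : events) {
            out.writeUTF(e);
        }
    }

    // Called by ObjectInputStream; rebuilds the in-memory event list.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int n = in.readInt();
        events = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            events.add(in.readUTF());
        }
    }
}
```

Callers would then only ever see the in-memory buffer; the encoded form exists only while the object is being written to or read from a stream.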

Hm... I don't know if I like that. Although it also came to my mind.


That way we *always* have the memory consumption. It sounds reasonable
from an OOP POV but it might not be a good choice in terms of
scalability ...I assume :-/
--
Torsten


