Hello people,
I'm currently at Giacomo's place and we spent a rainy afternoon
profiling the latest Cocoon to see if there is something we could
fix/improve/blah-blah.
WARNING: this is *by no means* a scientific report. But we have tried to
be as informative as possible for developers.
We were running Tomcat 4.1.10 + Cocoon HEAD on Sun JDK 1.4.1-b21 on
linux, instrumented with Borland OptimizeIt 4.2.
Here is what we discovered:
1) Regarding memory leaks, Cocoon seems absolutely clean (for cocoon, we
mean org.apache.cocoon.* classes). Avalon seems to be clean as well.
Good job everyone.
2) we noticed an incredible use of
org.apache.avalon.excalibur.collections.BucketMap$Node. It is *by far*
the most used class in the heap. More than Strings, byte[], char[] and
int[]. Some 140000 instances of that class.
The number of bucketmap nodes grows linearly with the amount of
different pages accessed (as they are fed into the cache), but even a
cached resource creates some 44 new nodes, which are later garbage
collected.
44 is nothing compared to 140000, but still something to investigate.
So, discovery #1:
BucketMaps are used *a lot*. Be aware of this.
3) Catalina seems to be spending 10% of the pipeline time. Having
extensively profiled and carefully optimized a servlet engine (JServ) I
can tell you that this is *WAY* too much. Catalina doesn't seem like the
best choice to run a loaded servlet-based site (contact [EMAIL PROTECTED]
if you want to do something about it: he's working on Jerry, a
super-light servlet engine based on native APR and targeted especially
for Apache 2.0).
4) Java I/O takes somewhere from 20% to 35% of the entire request time
(reading and writing from the socket). This could well be a problem with
the instrumented JVM since I don't think JDK 1.4 is that slow on I/O
(especially using the new NIO facilities internally).
5) most of the time is spent on:
a) XSLT processing (and we knew that)
b) DTD parsing (and that was a surprise for me!)
Yeah, DTD parsing. No, not for validation, but for entity resolution. It
seems that even if the parser is non-validating, the DTD is fully parsed
anyway just to do entity evaluation.
So, discovery #2:
Be careful about DTDs even if the parser is not validating.
Of course, when the cache kicks in and the cached document is read
directly from the compiled SAX events, we have an incredible speed
improvement (also because entities are already resolved and hardwired).
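For anyone who wants to see the DTD effect outside Cocoon, here is a
minimal sketch. It uses plain JAXP/SAX rather than Cocoon's own parser
component, so treat it as an illustration only. DtdProbe and the
example.invalid URL are made-up names. The handler records every entity
the parser asks for and hands back an empty DTD, so nothing is fetched
over the network (which also means entities declared in a real DTD would
be undefined in this setup).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Cocoon or Avalon.
class DtdProbe {

    /** Returns the system IDs the parser asked to resolve while parsing xml. */
    static List<String> requestedEntities(String xml) {
        final List<String> requested = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                requested.add(systemId);
                // Assumption for the demo: substitute an empty DTD.
                return new InputSource(new StringReader(""));
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // explicitly non-validating
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return requested;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE page SYSTEM \"http://example.invalid/probe.dtd\">"
                   + "<page>hello</page>";
        // Lists whatever the non-validating parser tried to load.
        System.out.println(requestedEntities(xml));
    }
}
```

If the list is not empty, the parser went after the DTD even though
validation was off, which is the cost described above.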
6) Xalan incremental seems to be a little slower than regular Xalan, but
on multiprocessing machines this might not be the case [Xalan uses two
threads for incremental processing]
NOTE: Xalan doesn't pool threads when it does that!
So, while perceived performance is better for Xalan in incremental mode,
the overall load of the machine is reduced if Xalan is used normally.
7) XSLTC *IS* blazingly fast compared to Xalan and is much less resource
intensive.
Discovery #3:
use XSLTC as much as possible!
NOTE: our current root sitemap.xmap states that XSLTC is the default
XSLT engine for Cocoon 2.1, but in fact the XSLTC factory is commented
out, so we end up running Xalan. We should either remove that statement
from the sitemap or uncomment the XSLTC factory.
I vote for making XSLTC default even if this generates a few bug reports.
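To compare engines outside Cocoon, a small JAXP harness is enough. The
sketch below compiles a stylesheet once into a Templates object and
reuses it for each transformation. CompiledStylesheet is a made-up name
and this is not how Cocoon's XSLT transformer is wired. Which engine you
get depends on the factory JAXP finds: setting the
javax.xml.transform.TransformerFactory system property to
org.apache.xalan.xsltc.trax.TransformerFactoryImpl should select XSLTC,
assuming the Xalan/XSLTC jars are on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for experiments, not Cocoon code.
class CompiledStylesheet {
    private final Templates templates;

    /** Compiles the stylesheet once; the result is reusable across requests. */
    CompiledStylesheet(String xslt) {
        try {
            // Uses whichever factory JAXP resolves (see the note above).
            templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Runs the precompiled stylesheet against one input document. */
    String apply(String xml) {
        try {
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                    new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xslt =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/page\">"
          + "<xsl:value-of select=\"title\"/>"
          + "</xsl:template></xsl:stylesheet>";
        CompiledStylesheet sheet = new CompiledStylesheet(xslt);
        System.out.println(sheet.apply("<page><title>Hello</title></page>"));
    }
}
```

Timing apply() in a loop under each factory gives a rough comparison;
it says nothing about how the engines behave inside a full pipeline.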
8) Cocoon's hotspot is.... drum roll.... URI matching.
TreeProcessor is complex and adds lots of complexity to the call stacks,
but it seems to be very lightweight. It's URI matching that is the thing
that needs more work performance-wise.
Don't get me wrong, my numbers indicate that URI matching takes from 3%
to 8% of response time. Compared to the rest that is nothing, but since
this is the only thing we are in total control of, this is where we
should concentrate profiling efforts.
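One direction worth measuring is compiling each wildcard pattern once and
caching the result, so that a request only pays for the match itself.
The sketch below is NOT Cocoon's matcher: CachedWildcardMatcher is a
made-up class, it leans on java.util.regex (new in JDK 1.4), and it
assumes the usual sitemap convention that '*' stays within one path
segment while '**' may cross '/'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not the sitemap's actual matching code.
class CachedWildcardMatcher {
    // One compiled pattern per wildcard expression, shared by all requests.
    private final Map<String, Pattern> compiled = new ConcurrentHashMap<>();

    /** Translates a wildcard expression into an equivalent regex. */
    static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c != '*') {
                literal.append(c);
                continue;
            }
            if (literal.length() > 0) {
                // Quote literal text so '.', '?' etc. match themselves.
                regex.append(Pattern.quote(literal.toString()));
                literal.setLength(0);
            }
            if (i + 1 < wildcard.length() && wildcard.charAt(i + 1) == '*') {
                regex.append("(.*)");     // '**' may span '/'
                i++;
            } else {
                regex.append("([^/]*)");  // '*' stops at '/'
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the wildcard captures, or null when the URI does not match. */
    List<String> match(String wildcard, String uri) {
        Matcher m = compiled
                .computeIfAbsent(wildcard, CachedWildcardMatcher::compile)
                .matcher(uri);
        if (!m.matches()) {
            return null;
        }
        List<String> groups = new ArrayList<>();
        for (int g = 1; g <= m.groupCount(); g++) {
            groups.add(m.group(g));
        }
        return groups;
    }

    public static void main(String[] args) {
        CachedWildcardMatcher matcher = new CachedWildcardMatcher();
        System.out.println(matcher.match("docs/*.html", "docs/index.html"));
        System.out.println(matcher.match("**/*.xml", "a/b/c.xml"));
    }
}
```

Whether this beats the current matcher is something only the profiler
can answer; the sketch just shows where the per-request work could be
cut.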
Ok, that's it. Enough for a rainy swiss afternoon.
Anyway, Cocoon is pretty optimized for what we could see. So let's be
happy about it.
--
Stefano Mazzocchi <[EMAIL PROTECTED]>