Chris, Chuck and others,
many thanks for taking the time to educate me (on both "Tomcat threads"
threads).
I got lots of information and tips, which will be useful now or later.
I'll now go sift through them again. At least now I have an idea where
to start.
About the fact that my hardware sucks for Java: I know.
On the other hand, that machine is a good filter against programs' and
programmers' hubris, particularly Java ones. It's an "if it runs there, it
will run anywhere" kind of thing, and I don't need 50 fake clients to stress
it out (obviously).
On the other hand again, on the same machine I have a text search and
retrieval application that can sift through a full-text index of 100,000
documents (1 GB of text) and retrieve the ones I want in a couple of
seconds. It has a 10 MB memory footprint. That's why the 500 MB
footprint of Tomcat (with the app) and the 5-minute delay in starting
the app over 25 MB of XML so struck me.
I have also learned (separately, and confirmed here several times) that
XML parsing is a hog, and not only in Java. The DOM style of parsing in
particular can scale far worse than linearly with document size, since
the whole document is built as a tree in memory. Large text fields are
absolute killers, and making them CDATA only partly alleviates that.
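For comparison with DOM, here is a minimal sketch of the streaming (StAX)
style of parsing, which looks at one event at a time instead of building a
tree. It is only an illustration: the class name StreamingCount, the
method countElements and the element name "doc" are made up for the
example, and the application's real XML will certainly look different.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical example: counts elements by name without building a DOM tree. */
final class StreamingCount {

    /** Returns the number of start tags named elementName in the given XML. */
    static int countElements(String xml, String elementName) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            int count = 0;
            try {
                // Pull one event at a time; no tree of the document is kept.
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(reader.getLocalName())) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; a real loader would read from a file or stream.
        String xml = "<docs><doc><body>some long text</body></doc><doc/></docs>";
        System.out.println(countElements(xml, "doc"));
    }
}
```

Whether this helps in practice depends on whether the application really
needs the whole tree at once; if it only walks the documents one by one, a
streaming reader avoids holding all 25 MB as objects.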
One can always throw more hardware at things, and sometimes it may be
cheaper than trying to over-optimise. But some applications out there
will kill any hardware. It is sometimes surprisingly easy to gain a
factor of 2 with little investment though, and if that means halving the
number of servers and their attendant care and paraphernalia, it's still
worth it for us. Even when they are virtual.
Our main business is processing text-intensive documents; that's why I
am interested. Gaining 5 seconds in processing a document counts if
you're processing thousands per day. For a user with his finger on the
mouse button too, there is a lot of difference between 1 and 3 seconds.
A note here from one of you regarding substrings in Java particularly
caught my interest. I'll also check whether the XML parser in that
application could be replaced by a newer version.
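As a first step for that check, a small sketch that reports which JAXP
parser implementation is actually picked up and where it was loaded from.
The class name WhichXmlParser and the helper describe are made up for the
example. Note the assumption: run standalone, this shows the JVM's default
parser; to see what the webapp itself gets, the same calls would have to
run inside the application (for instance from a test servlet), since its
classloader may find a different parser jar.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

/** Hypothetical helper: reports which XML parser factories are in use. */
final class WhichXmlParser {

    /** Describes a class by name plus the jar or directory it was loaded from. */
    static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes that ship with the JRE usually report no code source.
        String location = (source == null || source.getLocation() == null)
                ? "bundled with the JRE"
                : source.getLocation().toString();
        return type.getName() + " (" + location + ")";
    }

    public static void main(String[] args) {
        System.out.println("DOM: " + describe(DocumentBuilderFactory.newInstance().getClass()));
        System.out.println("SAX: " + describe(SAXParserFactory.newInstance().getClass()));
    }
}
```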
One alternative to XML in feeding that application with data is CSV
files (the text version of a spreadsheet). I had discarded it until now
as old-fashioned, "passé", limited, etc. XML is so much more "in". But I
am having second thoughts now, and I will give it a try.
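To give an idea of how little is involved on the reading side, here is a
rough sketch of a line-by-line CSV reader. It is deliberately naive: the
class name SimpleCsv, the method readRows and the sample columns are
invented for the example, and it assumes no field contains a comma, a
quote or a line break. Large free-text fields would break that assumption,
so real data would need proper quoting rules or a CSV library.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical example: naive CSV reading, one line per record. */
final class SimpleCsv {

    /** Splits each non-empty line on commas; quoted fields are NOT supported. */
    static List<List<String>> readRows(BufferedReader reader) {
        List<List<String>> rows = new ArrayList<>();
        // lines() reads lazily and wraps I/O failures in UncheckedIOException.
        reader.lines().forEach(line -> {
            if (!line.isEmpty()) {
                // The -1 limit keeps trailing empty fields such as "a,b,".
                rows.add(Arrays.asList(line.split(",", -1)));
            }
        });
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for a real file.
        String data = "id,title,body\n1,Invoice,first text\n2,Memo,\n";
        List<List<String>> rows = readRows(new BufferedReader(new StringReader(data)));
        System.out.println(rows.size() + " rows, last row has " + rows.get(2).size() + " fields");
    }
}
```

The sketch collects all rows in a list to keep it short; a loader that
cares about memory would rather handle each row as it is read.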
When I started in this business, 64 KB was a nice quantity of memory to
program in, and quite expensive too. I created and ran a payroll
application for a 1,000-person company in there. This Java app looks a
lot cuter than the payroll did, but 500 megabytes of memory for one
single Tomcat app, mmm. Some reflexes remain for a lifetime.
Thanks.
---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org