Unfortunately I think it is way more complicated than this.

I think that Mladen Turk's article has a lot of very useful information about configuring Tomcat and I congratulate him on putting it together. However, I've spent some time recently working on some performance issues and I think that for any given installation you must consider a number of factors.

These include:

1. The time it takes to prepare the data for delivery.

2. The way available resources are combined in the delivery of this data.

1. The database

Where a database or some other persistence mechanism is involved, the performance of the database can easily overwhelm almost all other considerations.

Is the database local or remote? If local, the time taken transferring results from the database to the business layer may not be that significant, but the CPU time and disk I/O spent running the query will immediately reduce the resources available to Tomcat and thus increase the response time of web requests.

If remote, it is easy for the database to become a bottleneck for requests. The time spent transferring the data over the network also becomes more significant, and it is added directly to the AART unless the data can be 'preread' and cached.
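As a rough illustration of that addition, the sketch below applies the per-CPU formula quoted further down in this message (500 / AART, capped at 200) before and after a remote database round trip is added. The 2.5 ms and 5 ms figures are made-up numbers, and the function name is mine, not something from the article.

```python
def max_concurrent_requests(aart_ms, cpus=1, target_ms=500, per_cpu_cap=200):
    """CPU-side formula from the quoted article: min(500 / AART, 200) * CPUs."""
    return min(target_ms / aart_ms, per_cpu_cap) * cpus

local_aart_ms = 2.5        # hypothetical AART with a local database
remote_transfer_ms = 5.0   # hypothetical extra network time per request

print(max_concurrent_requests(local_aart_ms))                       # 200.0
print(max_concurrent_requests(local_aart_ms + remote_transfer_ms))  # roughly 66.7
```

With these assumed numbers, a few milliseconds of transfer time per request cuts the estimate to a third, which is why caching the data can matter so much.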

How well designed are the queries? Are they subject to issues such as badly designed indexes, which mean that query processing time grows much faster than the data set does? Has the object design been tested for performance - an ORM solution can save a lot of time in development but add significant 'hidden' costs. [Please note, I think ORMs are a great idea and always use them where possible, but like all technology they have their drawbacks!]


2. The mix of resources needed by the application.

Issues to consider here include:

Are sessions required, and if so, how large is the session? Session size can also significantly impact clustered performance. It may be better to keep only the keys for information in the session and re-read the data when needed, rather than read all the data once at the beginning of the session and keep it there.

What processing is required in the business/web layer to structure the information? Is there significant XML processing (which can be very time intensive)? Is there significant EL processing on a page (which can be much slower than 'raw' Java)?

My experience is that the only way to really assess the overall performance of a system is to instrument the system extensively and then monitor it in action. Doing this will sometimes reveal bottlenecks which you just didn't consider. Of course even then you must be careful of 'the Heisenberg effect': the instrumentation itself changes the timings you are trying to measure.
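To make "instrument the system" a little more concrete, here is a minimal sketch of per-stage timing. It is written in Python purely for brevity and is not tied to Tomcat; in a real web application the same idea would typically live in a servlet filter or similar. The stage names and the `timed` and `summary` helpers are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # stage name -> list of elapsed seconds

@contextmanager
def timed(stage):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage].append(time.perf_counter() - start)

def summary():
    """Return {stage: (count, mean seconds, max seconds)}."""
    return {s: (len(v), sum(v) / len(v), max(v)) for s, v in timings.items()}

with timed("db_query"):
    time.sleep(0.01)       # stand-in for a real query
with timed("render"):
    sum(range(10000))      # stand-in for page rendering

for stage, (count, mean, worst) in summary().items():
    print(f"{stage}: n={count} mean={mean * 1000:.2f}ms max={worst * 1000:.2f}ms")
```

Keeping the per-call overhead this small is one way to limit how much the measurement distorts the thing being measured.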

One final point. In Mladen Turk's example he assumes that the full bitrate of an Ethernet connection is available for useful data. In my experience, the overhead of the protocol and framing information reduces the 100 Mbps to more like 70 Mbps, which would change his maximum of 625 concurrent requests to about 440.
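The arithmetic behind those two figures can be checked with a short sketch of the network-side formula from the quoted article. The function name is mine, and the 70 Mbps value is just my estimate from above, not a measured constant.

```python
def network_bound_requests(link_mbps, aars_kbytes):
    """Network-side formula from the quoted article: throughput / AARS."""
    kbytes_per_second = link_mbps * 1000 / 8   # 100 Mbps -> 12500 KBytes/s
    return kbytes_per_second / aars_kbytes

print(network_bound_requests(100, 20))  # 625.0 (full nominal bitrate)
print(network_bound_requests(70, 20))   # 437.5 (assumed usable bitrate)
```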

Another network related issue is that most client 'last hop' connections are considerably slower than 100 Mbps. Where TCP/IP is involved, the thread which is writing the response can block once the socket send buffer is full, and it then stays blocked until the *client* has received and acknowledged enough of the data. A slow client can therefore tie up a thread for far longer than the server-side processing time.
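For a rough feel of the difference the last hop makes, the sketch below compares the raw transfer time of a 20 KByte response on a 100 Mbps link and on a hypothetical 2 Mbps client connection. It ignores latency, TCP slow start and socket buffering, so treat the results as lower bounds rather than predictions.

```python
def transfer_time_ms(size_kbytes, link_mbps):
    """Time to push size_kbytes through a link, using 1 KByte = 1000 bytes."""
    bits = size_kbytes * 1000 * 8
    return bits / (link_mbps * 1_000_000) * 1000

print(transfer_time_ms(20, 100))  # roughly 1.6 ms on the server LAN
print(transfer_time_ms(20, 2))    # roughly 80 ms on the assumed 2 Mbps last hop
```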

In conclusion I think that if you apply two simple formulae to the design of a Tomcat based web application you may be shocked and surprised at the actual results unless you have very carefully analysed your design and investigated the factors which affect performance.

Regards

Alan



Andrew Hole wrote:
Hello

I read an interesting document from Mladen Turk (whom I would like to
contact directly, but I don't have his contact details) that gives a
formula to calculate the number of concurrent requests:
http://people.apache.org/~mturk/docs/article/ftwai.html

Calculating Load

When determining the number of Tomcat servers that you will need to satisfy
the client load, the first and major task is determining the Average
Application Response Time (hereafter AART). As said before, to satisfy the
user experience the application has to respond within half of second. The
content received by the client browser usually triggers couple of physical
requests to the Web server (e.g. images). The web page usually consists of
html and image data, so client issues a series of requests, and the time
that all this gets processed and delivered is called AART. To get most out
of Tomcat you should limit the number of concurrent requests to 200 per CPU.

So we can come with the simple formula to calculate the maximum number of
concurrent connections a physical box can handle:

                              500
    Concurrent requests = ( ---------- max 200 ) * Number of CPU's
                            AART (ms)

The other thing that you must care is the Network throughput between the Web
server and Tomcat instances. This introduces a new variable called Average
Application Response Size (hereafter AARS), that is the number of bytes of
all context on a web page presented to the user. On a standard 100Mbps
network card with 8 Bits per Byte, the maximum theoretical throughput is
12.5 MBytes.

                               12500
    Concurrent requests = ---------------
                            AARS (KBytes)

For a 20KB AARS this will give a theoretical maximum of 625 concurrent
requests. You can add more cards or use faster 1Gbps hardware if need to
handle more load.

The formulas above will give you rudimentary estimation of the number of
Tomcat boxes and CPU's that you will need to handle the desired number of
concurrent client requests. If you have to deploy the configuration without
having actual hardware, the closest you can get is to measure the AART on a
test platform and then compare the hardware vendor Specmarks.

I would like to launch a discussion on the validity of these formulas and,
if they turn out to be inappropriate, to try to arrive at more accurate ones.

Thanks a lot

