Hi all,
sorry for the noise, but it seems it is time to come back to the
discussion about load balancing for JVMs (Tomcat).

Background:
Recently we ran benchmark and smoke tests of our product at the Sun High
Tech Centre in Langen (Germany).
 
The web server was Apache 2.2.4, the container was 10 x Tomcat 5.5.25,
and the load balancer was the JK connector 1.2.23 with the busyness
algorithm.

Under high load we observed strange behaviour: some Tomcat workers
temporarily received a disproportionate load, often 10 times higher than
the others, for relatively long periods. As a result, response times that
usually stay under 500 ms went up to 20+ seconds, which in turn made the
overall test results almost two times worse than estimated.

At the beginning we were quite confused, because we were sure that it was
not a problem of the JVM configuration and supposed that the cause was in
the LB logic of mod_jk. Both assumptions turned out to be right.

What actually happened was the following: the LB sends requests and the
sessions become sticky, so all subsequent requests of a session go to the
same cluster node. At some point the JVM on that node started a major
garbage collection (full GC) and spent the above-mentioned 20 seconds in
it. Meanwhile jk kept sending both new requests and the sticky requests to
that node, which led to a situation where a single node broke the SLA on
response times.

I've been searching the web for a while for a load balancer
implementation that takes GC activity into account and reduces the load
accordingly when a JVM is close to a major collection, but found nothing.

Once again, load balancing of JVMs under load is a real issue in
production: with an optimally distributed load you can not only lower
costs but also prevent a bad customer experience, not to mention broken
SLAs.

Feature request:

All LB algorithms should be extended with a bidirectional connection to
the JVM:
    JVM -> LB: old gen size and current occupancy
    LB -> JVM: prevent node overload and advise a GC, depending on a
               configurable threshold of free old gen space in %.
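
To make the request more concrete, here is a minimal sketch in Java of
what the two directions could look like. It is not part of mod_jk or
Tomcat. The class name OldGenReport, the methods freePercent and
effectiveWeight, the linear scaling rule and the pool-name matching are
all assumptions made for illustration; only the java.lang.management
calls are standard JDK API. How the value travels from the JVM to the LB
is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not an existing mod_jk or Tomcat class.
final class OldGenReport {

    // JVM -> LB: free old gen space in percent of its capacity.
    // Returns -1 when the capacity is unknown (zero or negative).
    public static double freePercent(long usedBytes, long capacityBytes) {
        if (capacityBytes <= 0) {
            return -1.0;
        }
        long free = Math.max(0L, capacityBytes - usedBytes);
        return 100.0 * free / capacityBytes;
    }

    // LB side (assumed rule): keep the configured weight while free space
    // is at or above the threshold, otherwise scale it down linearly.
    // The weight never drops below 1, so sticky sessions can still reach
    // the node.
    public static int effectiveWeight(int configuredWeight,
                                      double freePercent,
                                      double thresholdPercent) {
        if (freePercent < 0 || freePercent >= thresholdPercent) {
            return configuredWeight;
        }
        long scaled = Math.round(configuredWeight * freePercent / thresholdPercent);
        return Math.max(1, (int) scaled);
    }

    // Looks up the old generation pool by name. Pool names depend on the
    // collector in use, so this matching is a heuristic, not a guarantee.
    public static MemoryUsage oldGenUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            boolean looksLikeOldGen = name.contains("Old Gen") || name.contains("Tenured");
            if (pool.getType() == MemoryType.HEAP && looksLikeOldGen) {
                return pool.getUsage();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MemoryUsage usage = oldGenUsage();
        if (usage == null) {
            System.out.println("old gen pool not found for this collector");
            return;
        }
        // getMax() is -1 when no maximum is defined; fall back to committed.
        long capacity = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
        double free = freePercent(usage.getUsed(), capacity);
        System.out.printf("old gen free: %.1f%%, weight: %d%n",
                free, effectiveWeight(10, free, 30.0));
    }
}
```

With a configured weight of 10 and a threshold of 30%, a node with plenty
of free old gen keeps weight 10, while a node close to a full GC is pushed
towards weight 1 until the collection has run.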
 

All ideas and comments are appreciated.

Regards,
Yefym.
