RE: JK Loadbalancer not balancing fairly
Ben,

So I assume you have two web servers fronting two app servers, or are there two servers, each running both a web server and an app server? For the restart you talk about: did you restart both web servers? Do you have a good load balancer (local director, or a content director like an F5) in front of the two web servers?

If I am reading your JKStatus text correctly, I noticed the following:

  Load balancer value on web server 2
  ----------------------------------- = ~0.56
  Load balancer value on web server 1

but

  Number of requests on web server 2
  ---------------------------------- = ~0.91
  Number of requests on web server 1

Now, if I am interpreting the meaning of "load balancer value" and "number of requests" correctly, that would imply that the number of sessions stuck to each app server from web server 1 is very roughly twice as high as from web server 2, but the total number of requests sent to each app server from both web servers is very roughly the same. (Can someone confirm I'm interpreting those #s correctly?)

According to the docs, each connector by default tries to keep the number of requests sent to each worker the same, which looks to be happening reasonably well. (I'm playing with trying to keep the number of sessions balanced, since our apps tend to be more of a memory issue than a CPU issue. There is a setting on the connector for this.)

With some more info on your setup we can try to figure out the load imbalance.

As a note, I am playing with the jk1.2.x connector, but our production systems use the old jk2.x connector. With that, I've seen a load imbalance on the app servers when one of the app servers has gone down for a while and then has come back up. If the connectors are not reset, they will try to "catch up" the restarted app server in terms of the number of requests it has handled, thus loading it more heavily than servers that have been up the whole time.

Brian

-->-----Original Message-----
-->From: ben short [mailto:[EMAIL PROTECTED]
-->Sent: Thursday, August 23, 2007 4:51 AM
-->To: Tomcat Users List
-->Subject: JK Loadbalancer not balancing fairly
-->
-->Hi All,
-->
-->We are doing some load testing on our setup and find that
-->the CPU usage of Tomcat reported by top on the two systems
-->is not equal.
-->Typically we see figures like ~400% to 800% CPU on one
-->machine and ~50% on the other machine for the java process.
-->We would expect the two CPU values to be equal.
-->
-->The jkstatus page on box one shows the following after a restart,
-->although before a restart the Max column was showing 250 for
-->jcpres1 and 32 for jcpres2.
-->
-->Name     Type   Host              Addr              Act  State  D  F  M  V    Acc   Err  CE  RE  Wr    Rd    Busy  Max  Route    RR  Cd  Rs
-->jcpres1  ajp13  172.16.4.11:8009  172.16.4.11:8009  ACT  OK     0  1  1  869  4276  2    4   0   939K  286M  1     11   jcpres1          0/0
-->jcpres2  ajp13  172.16.4.12:8009  172.16.4.12:8009  ACT  OK     0  1  1  869  4277  2    1   0   943K  280M  2     9    jcpres2          0/0
-->
-->and box 2
-->
-->Name     Type   Host              Addr              Act  State  D  F  M  V    Acc   Err  CE  RE  Wr    Rd    Busy  Max  Route    RR  Cd  Rs
-->jcpres1  ajp13  172.16.4.11:8009  172.16.4.11:8009  ACT  OK     0  1  1  484  3872  0    4   0   850K  256M  3     10   jcpres1          0/0
-->jcpres2  ajp13  172.16.4.12:8009  172.16.4.12:8009  ACT  OK     0  1  1  483  3871  0    4   0   850K  260M  1     10   jcpres2          0/0
-->
-->
-->Our system setup.
-->
-->Both machines are running the following software on RedHat 4ES
-->
-->Httpd 2.2.4
-->Mod JK 1.2.25
-->Tomcat 6.0.12
-->Java 1.6.0_01
-->
-->Box 1.
-->
-->workers.properties
-->
--># JK Status worker config
-->
-->worker.list=jkstatus
-->worker.jkstatus.type=status
-->
--># Presentation Load Balancer Config
-->
-->worker.list=preslb
-->
-->worker.preslb.type=lb
-->worker.preslb.balance_workers=jcpres1,jcpres2
-->worker.preslb.sticky_session=1
-->
-->worker.jcpres1.port=8009
-->worker.jcpres1.host=172.16.4.11
-->worker.jcpres1.type=ajp13
-->worker.jcpres1.lbfactor=1
-->worker.jcpres1.fail_on_status=503,400,500,909
-->
-->worker.jcpres2.port=8009
-->worker.jcpres2.host=172.16.4.12
-->worker.jcpres2.type=ajp13
-->worker.jcpres2.lbfactor=1
-->worker.jcpres2.fail_on_status=503,400,500,909
-->
-->
-->Box 2.
-->
-->workers.properties
-->
--># JK Status worker config
-->
-->worker.list=jkstatus
-->worker.jkstatus.type=status
-->
--># Presentation Load Balancer Config
-->
-->worker.list=preslb
-->
-->worker.preslb.type=lb
-->worker.preslb.balance_workers=jcpres1,jcpres2
-->worker.preslb.sticky_session=1
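
For readers who want to check the two ratios quoted at the top of this message, here is a minimal sketch that recomputes them from the jcpres1 figures on the two jkstatus pages. The dictionary names and the helper function are illustrative only. The Acc figures assume that the Acc and Err digits ran together in the archived table; reading them as 42762 and 38720 instead gives the same rounded ratio.

```python
# Figures for worker jcpres1 as read from the two quoted jkstatus pages.
# Assumption: Acc and Err ran together in the archive (4276 + 2, 3872 + 0).
box1 = {"V": 869, "Acc": 4276}
box2 = {"V": 484, "Acc": 3872}


def ratio(column):
    """Return the box 2 / box 1 ratio for one jkstatus column."""
    return box2[column] / box1[column]


v_ratio = round(ratio("V"), 2)
acc_ratio = round(ratio("Acc"), 2)
print(f"V ratio: {v_ratio}, Acc ratio: {acc_ratio}")
```

The V ratio is far from 1 while the Acc ratio is close to 1, which is the discrepancy the message asks about.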
Re: JK Loadbalancer not balancing fairly
[EMAIL PROTECTED] wrote:
> Ben,
>
> So I assume you have two web servers fronting two app servers - or there
> are two servers both of which have a web server and an app server? For
> the restart you talk about - did you restart both web servers? Do you
> have a good load balancer (local director, content director like an F5)
> in front of the two web servers?
>
> If I am reading your JKStatus text correctly I noticed the following:
>
>   Load balancer value on web server 2
>   ----------------------------------- = ~0.56
>   Load balancer value on web server 1
>
> but
>
>   Number of requests on web server 2
>   ---------------------------------- = ~0.91
>   Number of requests on web server 1
>
>
> Now, if I am interpreting the meaning of "load balancer value" and
> "number of requests" correctly, that would imply that the number of
> sessions stuck to each app server from web server 1 is very roughly
> twice as high as from 2, but the total number of requests sent to each
> app server from both web servers is very roughly the same. (Can someone
> confirm I'm interpreting those #s correctly?)

The number of requests is the total since the last JK/Apache restart. So if the last restart was shortly before the test, the numbers will not help. If they were not reset after the tests, we would know that Apache 1 had slightly more requests than Apache 2, but that each of them sent almost exactly the same number of requests to the two Tomcat nodes (delta = 1 request).

The V column is the balancing value used to decide where the next request goes. It is the number of requests sent to that Tomcat, divided by two once a minute, so in effect the request count is weighted by a decay curve. The big difference between the V values of Apache 1 and Apache 2 does not matter. It could simply mean that the one with the smaller V value did its division more recently. The V values for the two Tomcats are again very similar on the same Apache, another indication of good balancing.

All this is true for the default balancing method "Requests".
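
To illustrate why two Apache instances under the same load can show very different V values, here is a toy model of the decay described above. It is only a sketch under stated assumptions, not mod_jk's actual code: it assumes V grows by one per request and is halved with integer division once a minute, and the function name and parameters are made up for the example.

```python
def simulate_v(requests_per_minute, minutes, sample_after_halving):
    """Toy model of the balancing value V for a single worker.

    Assumption: V increases by 1 per routed request and is halved
    (integer division) once per minute.
    """
    v = 0
    before_last_halving = 0
    for _ in range(minutes):
        v += requests_per_minute   # requests routed during this minute
        before_last_halving = v    # value seen just before the decay step
        v //= 2                    # once-a-minute decay
    return v if sample_after_halving else before_last_halving


# Same load on both web servers; only the moment of reading jkstatus differs.
just_before = simulate_v(480, 10, sample_after_halving=False)
just_after = simulate_v(480, 10, sample_after_halving=True)
print(just_before, just_after)
```

In this model the two readings differ by a factor of about two even though both servers handled identical traffic, so only the comparison between workers on the same Apache is meaningful.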

I would suggest first tracking CPU per Tomcat process over the test period (not per system and not as a single number, but as a graph over time).

> According to the docs, each connector by default tries to keep the number
> of requests sent to each worker the same, which looks to be happening
> reasonably well. (I'm playing with trying to keep the number of
> sessions balanced since our apps tend to be more of a memory issue than
> a CPU issue. There is a setting on the connector for this.)
>
> With some more info on your setup we can try to figure out the load
> imbalance.
>
> As a note, I am playing with the jk1.2.x connector, but our production
> systems use the old jk2.x connector. With that, I've seen a load
> imbalance on the app servers when one of the app servers has gone down
> for a while, and then has come back up. If the connectors are not reset,
> they will try to "catch up" the restarted app server in terms of the
> number of requests it has handled, thus loading it more heavily than
> servers that have been up the whole time.

The catch-up problem should be fixed. A recovered or reactivated worker is given the biggest "work done" value of all the other workers, so it should start with a normal load or even slightly less.

>
> Brian

Regards,

Rainer

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
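
The catch-up fix described in this reply can be sketched as follows. This is a simplified illustration, not the mod_jk implementation: it assumes the balancer always picks the worker with the lowest "work done" value and adds one per request, and the function names are hypothetical.

```python
def recover(lb_values, name):
    """Give a recovered worker the highest value among the other workers."""
    others = [value for worker, value in lb_values.items() if worker != name]
    lb_values[name] = max(others) if others else 0


def route(lb_values, request_count):
    """Send each request to the worker with the lowest value; count the picks."""
    picks = {worker: 0 for worker in lb_values}
    for _ in range(request_count):
        target = min(lb_values, key=lb_values.get)
        lb_values[target] += 1
        picks[target] += 1
    return picks


# Without the fix: jcpres2 comes back with a value of 0 and takes every request.
naive = route({"jcpres1": 869, "jcpres2": 0}, 100)

# With the fix: jcpres2 starts level with jcpres1 and the load is shared.
values = {"jcpres1": 869, "jcpres2": 0}
recover(values, "jcpres2")
fixed = route(values, 100)
print(naive, fixed)
```

In this model the recovered worker never receives more than its share, which matches the "normal or even a little less loaded" behaviour described above.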
RE: JK Loadbalancer not balancing fairly
Rainer,

Thanks very much for the clarification! Since I have been playing with the load balancing strategy set to session ("worker.router.method=S" on my load balancer), is there a way to tell roughly how many sessions have been pinned to each worker/Tomcat? In this case, would the load balancer value be (something like) the number of new sessions sent to a particular worker, divided by two some number of times? If this were true, you still would not know the number of sessions pinned to a worker, because of the factors of two having been divided out. I just got an HTTP JMX adapter wired up in Tomcat, so I'll see if I can get session info that way...

Thanks again,

Brian

-->-----Original Message-----
-->From: Rainer Jung [mailto:[EMAIL PROTECTED]
-->Sent: Thursday, August 23, 2007 11:22 AM
-->To: Tomcat Users List
-->Subject: Re: JK Loadbalancer not balancing fairly
Re: JK Loadbalancer not balancing fairly
[EMAIL PROTECTED] wrote:
> Rainer,
>
> Thanks very much for the clarification! Since I have been playing with the
> load balancing strategy set to session ("worker.router.method=S" on my
> load balancer), is there a way to tell roughly how many sessions have
> been pinned to each worker/tomcat? In this case would the load balancer

No, unfortunately not. You can log the cookies (if used) with Apache, together with the name of the target worker, in the access log. It may be easier to log the session ID in Tomcat's access log (I think %S, check the Valve docs) and then count the distinct IDs (not nice, but it will work).

> value be (something like) the number of new sessions sent to a
> particular worker divided by two some number of times? If this were true
> you still would not know the number of sessions pinned to a worker
> because of the factors of two having been divided out. I just got a HTTP

It is true.

> JMX adapter wired up in Tomcat so I'll see if I can get session info
> that way...

Yes, the manager MBean of the context contains session info.

> Thanks again,
>
> Brian

Regards,

Rainer
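
The "count the distinct IDs" suggestion could look roughly like this. It is a sketch under assumptions that are not stated in the thread: the access log pattern is assumed to put the session ID (%S) in the last whitespace-separated field, with "-" when there is no session, and the session IDs are assumed to carry a ".route" suffix such as ".jcpres1". The sample log lines are invented for the example.

```python
from collections import defaultdict


def sessions_per_route(log_lines):
    """Count distinct session IDs per route suffix.

    Assumptions: the session ID is the last field of each line, "-" means
    no session, and IDs look like "<id>.<route>".
    """
    seen = defaultdict(set)
    for line in log_lines:
        fields = line.split()
        if not fields or fields[-1] == "-":
            continue  # blank line or request without a session
        session_id = fields[-1]
        _, separator, route = session_id.rpartition(".")
        seen[route if separator else "(no route)"].add(session_id)
    return {route: len(ids) for route, ids in seen.items()}


# Hypothetical log lines, for illustration only.
sample = [
    '172.16.4.11 "GET /a HTTP/1.1" 200 AAA111.jcpres1',
    '172.16.4.11 "GET /b HTTP/1.1" 200 AAA111.jcpres1',
    '172.16.4.12 "GET /a HTTP/1.1" 200 BBB222.jcpres2',
    '172.16.4.11 "GET /c HTTP/1.1" 200 CCC333.jcpres1',
    '172.16.4.12 "GET /favicon.ico HTTP/1.1" 404 -',
]
counts = sessions_per_route(sample)
print(counts)
```

Because this counts every session seen in the log, it includes sessions that have since expired; the manager MBean mentioned above is the better source for currently active sessions.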