[squid-users] Uneven load distribution between SMP Workers
Good afternoon everyone,

I'm running Squid 3.3.5 on 3 multicore systems, using SMP with 6 workers per server, each worker dedicated to its own core. Each server runs RHEL6 U4 with a 2.6.32 kernel.

I'm noticing that as time goes on, some workers seem to be favoured and end up doing the majority of the work. I've read the article on SMP scaling here:

http://wiki.squid-cache.org/Features/SmpScale

However, I'm finding that our workers' CPU time differs quite substantially:

Server 1:
TIME+      COMMAND
287:01.16  (squid-3) -f /etc/squid/squid.conf
248:36.07  (squid-2) -f /etc/squid/squid.conf
146:04.90  (squid-5) -f /etc/squid/squid.conf
140:59.06  (squid-1) -f /etc/squid/squid.conf
111:24.22  (squid-6) -f /etc/squid/squid.conf
120:41.21  (squid-4) -f /etc/squid/squid.conf

Server 2:
TIME+      COMMAND
618:05.08  (squid-1) -f /etc/squid/squid.conf
405:59.84  (squid-5) -f /etc/squid/squid.conf
362:29.37  (squid-3) -f /etc/squid/squid.conf
318:56.54  (squid-2) -f /etc/squid/squid.conf
211:11.80  (squid-6) -f /etc/squid/squid.conf
204:48.51  (squid-4) -f /etc/squid/squid.conf

Server 3:
TIME+      COMMAND
497:21.70  (squid-5) -f /etc/squid/squid.conf
389:32.63  (squid-1) -f /etc/squid/squid.conf
171:31.28  (squid-6) -f /etc/squid/squid.conf
177:15.38  (squid-4) -f /etc/squid/squid.conf
346:28.21  (squid-3) -f /etc/squid/squid.conf
174:05.69  (squid-2) -f /etc/squid/squid.conf

I can also see that the connection counts differ massively between the workers:

Server 1 (client- and server-side connections):
squid-1   145 ESTABLISHED
squid-2   547 ESTABLISHED
squid-3   929 ESTABLISHED
squid-4   118 ESTABLISHED
squid-5   298 ESTABLISHED
squid-6   276 ESTABLISHED

Server 2 (client- and server-side connections):
squid-1   899 ESTABLISHED
squid-2   215 ESTABLISHED
squid-3   311 ESTABLISHED
squid-4    96 ESTABLISHED
squid-5   516 ESTABLISHED
squid-6    70 ESTABLISHED

Server 3 (client- and server-side connections):
squid-1   517 ESTABLISHED
squid-2    96 ESTABLISHED
squid-3   366 ESTABLISHED
squid-4    83 ESTABLISHED
squid-5  1030 ESTABLISHED
squid-6   189 ESTABLISHED

I'm a little concerned that the more people I migrate to this solution, the more the first 1 or 2 workers will become saturated. Do the workers have some form of source or destination persistence for (SSL?) connections, or something else that might be causing this? Is there anything I can do to improve the distribution between workers? Or have I missed something along the line?

Cheers
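To put a number on the imbalance, the connection counts above can be turned into per-worker shares. The following is only a rough sketch: the counts are copied from the post, while the helper names `worker_shares` and `busiest` are made up for illustration and are not part of Squid or any tool.

```python
# ESTABLISHED connection counts per worker, copied from the post above.
connections = {
    "Server 1": {"squid-1": 145, "squid-2": 547, "squid-3": 929,
                 "squid-4": 118, "squid-5": 298, "squid-6": 276},
    "Server 2": {"squid-1": 899, "squid-2": 215, "squid-3": 311,
                 "squid-4": 96, "squid-5": 516, "squid-6": 70},
    "Server 3": {"squid-1": 517, "squid-2": 96, "squid-3": 366,
                 "squid-4": 83, "squid-5": 1030, "squid-6": 189},
}


def worker_shares(counts):
    """Return each worker's fraction of the server's total connections."""
    total = sum(counts.values())
    return {worker: n / total for worker, n in counts.items()}


def busiest(counts):
    """Return (worker, share) for the worker holding the most connections."""
    shares = worker_shares(counts)
    worker = max(shares, key=shares.get)
    return worker, shares[worker]


if __name__ == "__main__":
    for server, counts in connections.items():
        worker, share = busiest(counts)
        even = 1 / len(counts)  # share each worker would hold if balanced
        print(f"{server}: {worker} holds {share:.1%} of connections "
              f"(an even split would be {even:.1%})")
```

On these figures the busiest worker on each server holds roughly 40 to 45 percent of the connections, against about 17 percent for an even six-way split.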
Re: [squid-users] Uneven load distribution between SMP Workers
On 07/30/2013 03:44 PM, Tim Murray wrote:

> I'm a little concerned that the more people I migrate to this solution the more the first 1 or 2 workers will become saturated. Do the workers happen to have some form of source or destination persistence for (SSL?) connections or something that might be causing this to occur? And is there anything I can do to improve the distribution between workers? Or have I missed something along the line? Cheers

Hey,

It's the OS that spreads the load across the processes, using the source and destination routing path. If you use separate processes instead of SMP, it should be checked and tested to make sure that IPTABLES does fair load balancing. With simple separate processes and IPTABLES load balancing across the ports, it shows you the load from a specific address:port to a specific address:port.

If the process works and only the load on one specific worker rises, is it affecting performance?

Eliezer
Re: [squid-users] Uneven load distribution between SMP Workers
On 07/30/2013 06:44 AM, Tim Murray wrote:

> I'm running Squid 3.3.5 on 3 multicore systems here, using SMP and 6 workers per server dedicated to their own core. Each one running OS RHEL6 U4 with 2.6.32 kernel. I'm noticing as time goes on, some workers seem to be favoured and doing the majority of the work. I've read the article regarding SMP Scaling here: http://wiki.squid-cache.org/Features/SmpScale However I'm finding our workers' CPU time is differing quite substantially;

As discussed on the above wiki page, this is expected. We see it all the time on many boxes, especially if Squid is not very loaded. IIRC, the patch working around that problem has not been submitted for the official review yet -- no free cycles to finish its polishing at the moment.

> I can also see the connections differ massively between the workers:

Same thing.

> I'm a little concerned that the more people I migrate to this solution the more the first 1 or 2 workers will become saturated. Do the workers happen to have some form of source or destination persistence for (SSL?) connections or something that might be causing this to occur?

The wiki page provides the best explanation of the phenomenon I know about. In short, some kernels (including their TCP stacks) are not very good at balancing this kind of server load.

> And is there anything I can do to improve the distribution between workers?

I am not aware of any specific fix, except for the workaround patch mentioned on the wiki.

Alex.
Re: [squid-users] Uneven load distribution between SMP Workers
On Wed, Jul 31, 2013 at 1:44 AM, Alex Rousskov rouss...@measurement-factory.com wrote:

> I am not aware of any specific fix, except for the workaround patch mentioned on the wiki.

Thank you very much for that, Alex. To be honest, when I read the wiki page I had assumed this patch had already been implemented.

In the meantime, I might see whether using a separate http_port for each worker, and using the load balancer to even out the spread of traffic, will work.
Re: [squid-users] Uneven load distribution between SMP Workers
On 07/30/2013 07:13 PM, Tim Murray wrote:

> In the meantime, I might see if using separate http_ports for each worker and using the load balancer to even up the spread of traffic will work.

It should work as well as the load balancer can balance the load. You might be introducing an additional single point of failure (the load balancer), though.

Alex.
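For anyone trying the separate http_port idea, here is a minimal sketch that prints one possible squid.conf fragment giving each worker its own listening port. It assumes the `${process_number}` macro and the `if`/`endif` conditionals described in the Squid SMP documentation behave this way on your version, so please check against the squid.conf.documented file shipped with your release. The base port 3129 and the function name `per_worker_ports` are hypothetical, not taken from this thread.

```python
def per_worker_ports(workers, base_port):
    """Return squid.conf lines assigning worker N the port base_port + N - 1.

    Assumption: squid.conf supports ${process_number} and if/endif blocks
    as described in the Squid SMP documentation.
    """
    lines = [f"workers {workers}"]
    for n in range(1, workers + 1):
        # "${{" and "}}" in the f-string emit a literal "${" and "}".
        lines.append(f"if ${{process_number}} = {n}")
        lines.append(f"http_port {base_port + n - 1}")
        lines.append("endif")
    return lines


if __name__ == "__main__":
    # 6 workers as in the thread; 3129 is a made-up starting port.
    print("\n".join(per_worker_ports(6, 3129)))
```

The external load balancer would then need one backend entry per port, which is where the extra single point of failure mentioned above comes in.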