[squid-users] Odd SMP and DNS issue

2013-09-13 Thread Tim Murray
Hi Everybody

I've got a bit of an odd issue, and I was hoping I could possibly get
some advice.

I'm running 3.3.5 with 9 SMP workers.

On certain SMP workers, certain DNS records will not resolve. Users are
getting a "No DNS Records" error from Squid.

Each one of my workers runs its own http_port. If I do a GET request
for the reported failing domain against each worker, I can clearly see
that error being returned; however, it only occurs on one or two
workers at most at any given time.
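For context, that kind of per-worker port layout can be sketched in squid.conf roughly as below. This is a hypothetical sketch, not the actual config from this setup: the port numbers are invented, and it assumes the ${process_number} macro (which Squid expands to the worker's ID in SMP mode).

```
# Hypothetical sketch only -- ports and layout are illustrative.
workers 9

# ${process_number} expands to the worker ID (1..9), so worker 1
# would listen on 3121, worker 2 on 3122, and so on.
http_port 312${process_number}
```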

If I wait a couple of minutes, the GET request takes much longer to
return the No DNS Record error. I'm assuming this is the
negative_dns_ttl setting expiring and Squid querying the DNS server
again.
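The relevant directives can be sketched as below; the values shown are Squid's documented defaults, not necessarily what this setup uses:

```
# Illustrative only -- these are the stock defaults, not our config.
negative_dns_ttl 1 minute    # how long Squid caches failed lookups
positive_dns_ttl 6 hours     # upper bound on caching successful lookups
```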

A restart (or reconfigure) of Squid resolves the issue for a little
while, and the workers it affects seem to be different each time.
I've confirmed that the DNS server is definitely resolving the failing
domains successfully.

And this issue has actually only started happening (that we know of,
at least) in the last couple of days.

Has anyone seen this behaviour before by any chance?

Cheers
Tim


[squid-users] Uneven load distribution between SMP Workers

2013-07-30 Thread Tim Murray
Good Afternoon Everyone

I'm running Squid 3.3.5 on 3 multicore systems here, using SMP with 6
workers per server, each dedicated to its own core. Each system runs
RHEL6 U4 with the 2.6.32 kernel.

I'm noticing as time goes on, some workers seem to be favoured and
doing the majority of the work. I've read the article regarding SMP
Scaling here:

http://wiki.squid-cache.org/Features/SmpScale

However, I'm finding our workers' CPU time differs quite substantially:

Server 1:

TIME+  COMMAND
287:01.16 (squid-3) -f /etc/squid/squid.conf
248:36.07 (squid-2) -f /etc/squid/squid.conf
146:04.90 (squid-5) -f /etc/squid/squid.conf
140:59.06 (squid-1) -f /etc/squid/squid.conf
111:24.22 (squid-6) -f /etc/squid/squid.conf
120:41.21 (squid-4) -f /etc/squid/squid.conf

Server 2:

TIME+  COMMAND
618:05.08 (squid-1) -f /etc/squid/squid.conf
405:59.84 (squid-5) -f /etc/squid/squid.conf
362:29.37 (squid-3) -f /etc/squid/squid.conf
318:56.54 (squid-2) -f /etc/squid/squid.conf
211:11.80 (squid-6) -f /etc/squid/squid.conf
204:48.51 (squid-4) -f /etc/squid/squid.conf

Server 3:

TIME+  COMMAND
497:21.70 (squid-5) -f /etc/squid/squid.conf
389:32.63 (squid-1) -f /etc/squid/squid.conf
171:31.28 (squid-6) -f /etc/squid/squid.conf
177:15.38 (squid-4) -f /etc/squid/squid.conf
346:28.21 (squid-3) -f /etc/squid/squid.conf
174:05.69 (squid-2) -f /etc/squid/squid.conf
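To put a number on the imbalance, top's TIME+ values can be converted to seconds and compared. A small sketch, using the Server 1 figures copied from the table above:

```python
# Convert top's TIME+ format (minutes:seconds.hundredths) to seconds
# and compare the busiest worker against the idlest one.

def cpu_seconds(timeplus: str) -> float:
    minutes, seconds = timeplus.split(":")
    return int(minutes) * 60 + float(seconds)

# Server 1 figures from the table above.
workers = {
    "squid-1": "140:59.06", "squid-2": "248:36.07", "squid-3": "287:01.16",
    "squid-4": "120:41.21", "squid-5": "146:04.90", "squid-6": "111:24.22",
}

secs = {w: cpu_seconds(t) for w, t in workers.items()}
busiest = max(secs, key=secs.get)
idlest = min(secs, key=secs.get)
ratio = secs[busiest] / secs[idlest]
print(f"{busiest} did {ratio:.1f}x the CPU work of {idlest}")
```

On these figures the busiest worker (squid-3) has done roughly 2.6 times the CPU work of the idlest (squid-6).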


I can also see the connections differ massively between the workers:

Server 1:

(Client and Server side connections)

squid-1 145 ESTABLISHED
squid-2 547 ESTABLISHED
squid-3 929 ESTABLISHED
squid-4 118 ESTABLISHED
squid-5 298 ESTABLISHED
squid-6 276 ESTABLISHED

Server 2:

(Client and Server side connections)

squid-1 899 ESTABLISHED
squid-2 215 ESTABLISHED
squid-3 311 ESTABLISHED
squid-4 96 ESTABLISHED
squid-5 516 ESTABLISHED
squid-6 70 ESTABLISHED

Server 3:

(Client and Server side connections)
squid-1 517 ESTABLISHED
squid-2 96 ESTABLISHED
squid-3 366 ESTABLISHED
squid-4 83 ESTABLISHED
squid-5 1030 ESTABLISHED
squid-6 189 ESTABLISHED
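The connection spread can be expressed the same way, as each worker's share of the total versus an even split. A sketch using the Server 3 counts from the listing above:

```python
# Express each worker's ESTABLISHED-connection count as a share of the
# total, compared against a perfectly even distribution.

counts = {  # Server 3 figures from the listing above
    "squid-1": 517, "squid-2": 96, "squid-3": 366,
    "squid-4": 83, "squid-5": 1030, "squid-6": 189,
}

total = sum(counts.values())
even_share = 1 / len(counts)
for worker, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{worker}: {n:4d} ({n / total:5.1%} vs. even share {even_share:.1%})")
```

Here squid-5 alone holds about 45% of the connections, where an even split would give each worker roughly 17%.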

I'm a little concerned that the more people I migrate to this solution,
the more the first one or two workers will become saturated. Do the
workers happen to have some form of source or destination persistence
for (SSL?) connections, or something else that might be causing this to
occur?

And is there anything I can do to improve the distribution between
workers? Or have I missed something along the line?


Cheers


Re: [squid-users] Uneven load distribution between SMP Workers

2013-07-30 Thread Tim Murray
On Wed, Jul 31, 2013 at 1:44 AM, Alex Rousskov
rouss...@measurement-factory.com wrote:
 On 07/30/2013 06:44 AM, Tim Murray wrote:

 I'm running Squid 3.3.5 on 3 multicore systems here, using SMP with 6
 workers per server, each dedicated to its own core. Each system runs
 RHEL6 U4 with the 2.6.32 kernel.

 I'm noticing as time goes on, some workers seem to be favoured and
 doing the majority of the work. I've read the article regarding SMP
 Scaling here:

 http://wiki.squid-cache.org/Features/SmpScale

 However, I'm finding our workers' CPU time differs quite substantially:

 As discussed on the above wiki page, this is expected. We see it all the
 time on many boxes, especially if Squid is not very loaded. IIRC, the
 patch working around that problem has not been submitted for the
 official review yet -- no free cycles to finish its polishing at the moment.


 I can also see the connections differ massively between the workers:

 Same thing.


 I'm a little concerned that the more people I migrate to this solution
 the more the first 1 or 2 workers will become saturated. Do the
 workers happen to have some form of source or destination persistance
 for (SSL?) connections or something that might be causing this to
 occur?

 The wiki page provides the best explanation of the phenomena I know
 about. In short, some kernels (including their TCP stacks) are not very
 good at balancing this kind of server load.


 And is there anything I can do to improve the distribution between
 workers?

 I am not aware of any specific fix, except for the workaround patch
 mentioned on the wiki.


 Alex.


Thank you very much for that, Alex. To be honest, when I read the wiki
page I had assumed this patch had already been implemented.

In the meantime, I might see whether giving each worker its own
http_port and using the load balancer to even out the spread of
traffic will work.
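A rough sketch of that idea in squid.conf, assuming invented port numbers and Squid's per-process "if ${process_number}" conditionals (the load-balancer side is not shown):

```
# Hypothetical sketch -- ports are invented for illustration.
# Each worker gets its own listening port, so an upstream load
# balancer can spread traffic across them explicitly.
workers 6
if ${process_number} = 1
http_port 3129
endif
if ${process_number} = 2
http_port 3130
endif
# ... and so on for workers 3-6, one port each
```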