Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2013-03-11 Thread paulm
Excuse me, David,

What are the ab parameters that you use to test against squid?

Thanks, Paul





Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2013-03-11 Thread Amos Jeffries

On 12/03/2013 8:11 a.m., paulm wrote:

Excuse me, David,

What are the ab parameters that you use to test against squid?


-n for request count
-c for concurrency level

SMP in Squid shares a listening port so -c 1 will still test both 
workers. But the results are more interesting as you vary client count 
versus request count.


For a worst-case traffic scenario, test with a guaranteed MISS response; for 
the best case, test with a small HIT response.


Other than that, whatever you like. Using an FQDN you host yourself is polite.
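
For example, something along these lines (the hostname, counts, and proxy 
address are placeholders, not the exact commands from the original tests):

  # low vs. high concurrency against the same URL, via the proxy
  ab -n 50000 -c 1   -X proxyhost:3128 http://test.example.com/small.html
  ab -n 50000 -c 100 -X proxyhost:3128 http://test.example.com/small.html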

Amos


RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread Jenny Lee

On Sat, Jun 11, 2011 at 9:40 PM, Jenny Lee bodycar...@live.com wrote:

I'd like to know how you are able to do 13000 requests/sec.
tcp_fin_timeout defaults to 60 seconds on all *NIXes, and the available ephemeral 
port range is 64K.
I can't do more than 1K requests/sec even with tcp_tw_reuse/tcp_tw_recycle with 
ab. I get commBind errors due to connections in TIME_WAIT.
Any tuning options suggested for RHEL6 x64?
Jenny

I would have a concern using both of those at the same time: reuse and recycle. 
Reuse a socket, but also recycle it? I've seen issues when testing my own Linux 
distros with both of these settings. Right or wrong, that was my experience.
fin_timeout: if you have a good connection, there should be no reason a 
system takes 60 seconds to send out a FIN. Cut that in half, if not by two thirds.
And what is your limitation at 1K requests/sec: load (if so, look at I/O) or 
network saturation? Maybe I missed an earlier thread, and I too would tilt my 
head at 13K requests/sec!
Tory
---
 
 
As I mentioned, my limitation is the ephemeral ports tied up in TIME_WAIT.  
The TIME_WAIT issue is a known factor when you are doing testing.
 
When you are tuning, you apply options one at a time. tw_reuse/tw_recycle were 
not used together, and I had a 10 sec fin_timeout, which made no difference.
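 
(For reference, these are the knobs in question; the values are illustrative, 
not a recommendation:)
 
  # RHEL6-era sysctls, applied one at a time while testing
  sysctl -w net.ipv4.tcp_tw_reuse=1       # reuse TIME_WAIT sockets for new outbound connections
  sysctl -w net.ipv4.tcp_tw_recycle=1     # aggressive recycling; known to misbehave behind NAT
  sysctl -w net.ipv4.tcp_fin_timeout=10   # shortens FIN-WAIT-2, not the TIME_WAIT period itself
  sysctl -w net.ipv4.ip_local_port_range="1024 65535"   # widen the ephemeral port pool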
 
Jenny

 
NB: I still don't know how to do indenting/quoting with this Hotmail... after 10 
years.
  

Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread Amos Jeffries

On 12/06/11 18:46, Jenny Lee wrote:


[snip - full quote of Jenny's message above]



A couple of things to note.
Firstly, this was an ab (Apache Bench) reported figure. It calculates 
the software limitation based on the speed of transactions done, not 
necessarily accounting for things like TIME_WAIT; particularly if it 
was extrapolated from, say, 50K requests, which would not hit that OS limit.


He also mentioned using a local IP address. If that was on the lo 
interface, it would not be subject to things like TIME_WAIT or RTT lag.



The test was also specific to the very long lists of non-matching regex 
ACLs he apparently used. Once those were eliminated the test showed much 
faster numbers, but a similar worker pattern.


Overall, useful info for us regarding worker load sharing, and a bit of 
a warning for people writing long lists of regex ACLs. But the ACL issue 
was not really surprising.


HTH

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.12
  Beta testers wanted for 3.2.0.8 and 3.1.12.2


RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread Jenny Lee




 Date: Sun, 12 Jun 2011 19:54:10 +1200
 From: squ...@treenet.co.nz
 To: squid-users@squid-cache.org
 Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

 [snip - quoted thread trimmed]

 A couple of things to note.
 Firstly, this was an ab (Apache Bench) reported figure. It calculates the
 software limitation based on the speed of transactions done, not necessarily
 accounting for things like TIME_WAIT; particularly if it was extrapolated
 from, say, 50K requests, which would not hit that OS limit.
 
ab counts 200-OK responses, and TIME_WAITs cause squid to issue 500s. Of 
course if you send in 50K it would not be subject to this, but I usually send 
a couple of 10+ million request runs to simulate load, at least for a while.
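 
(The relevant lines of ab's summary; the numbers here are only illustrative, 
and the Non-2xx line is printed only when the count is non-zero:)
 
  Complete requests:      10000000
  Failed requests:        0
  Non-2xx responses:      1234          <- any 500s from squid show up here
  Requests per second:    980.12 [#/sec] (mean)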

 
 He also mentioned using a local IP address. If that was on the lo
 interface, it would not be subject to things like TIME_WAIT or RTT lag.
 
When I was running my benches on loopback, I had tons of TIME_WAITs for 
127.0.0.1 and squid would bail out with: commBind: Cannot bind socket...
 
Of course, I might be doing things wrong.
 
I am interested in what to optimize at the RHEL6 OS level to achieve higher 
requests per second.
 
Jenny

RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread david

On Sun, 12 Jun 2011, Jenny Lee wrote:


[snip - quoted thread trimmed]

I am interested in what to optimize at the RHEL6 OS level to achieve higher 
requests per second.

Jenny


I'll post my configs when I get back to the office, but one thing is that 
if you send requests faster than they can be serviced, the pending requests 
build up until you start getting timeouts. So I have to tinker with the 
number of requests that can be sent in parallel to keep the request rate 
below this point.
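
(A sketch of that kind of sweep; the hosts, counts, and URL are placeholders:)

  # walk the concurrency up until throughput stops improving or timeouts appear
  for c in 10 50 100 200 400; do
    ab -n 100000 -c $c -X squidbox:8000 http://webserver/test.html | grep 'Requests per second'
  done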


Note that when I removed the long list of ACLs I was able to get this 13K 
requests/sec rate going from machine A, to squid on machine B, to apache on 
machine C, so it's not a localhost thing.


Getting up to the 13K rate on apache does require some tuning and 
tweaking of apache; stock configs that include dozens of dynamically 
loaded modules just can't achieve these speeds. These are also fairly 
beefy boxes: dual quad-core Opterons with 64G RAM and 1G Ethernet 
(multiple cards, but I haven't tried trunking them yet).


David Lang


RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread Jenny Lee

 Date: Sun, 12 Jun 2011 03:02:23 -0700
 From: da...@lang.hm
 To: bodycar...@live.com
 CC: squ...@treenet.co.nz; squid-users@squid-cache.org
 Subject: RE: [squid-users] squid 3.2.0.5 smp scaling issues
 
 [snip - full quote of David's message above]


OK, I am assuming that persistent connections are on. This doesn't simulate 
any real-life scenario.

I would like to know if anyone can do more than 500 reqs/sec with persistent 
connections off.
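
(For what it's worth: ab only uses persistent connections when -k is given, 
so both modes are easy to compare; host and counts are placeholders:)

  ab -n 100000 -c 100    http://webserver/test.html   # new TCP connection per request
  ab -n 100000 -c 100 -k http://webserver/test.html   # HTTP keep-alive (persistent)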

Jenny 

RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread david

On Sun, 12 Jun 2011, Jenny Lee wrote:


[snip - quoted thread trimmed]



OK, I am assuming that persistent connections are on. This doesn't simulate 
any real-life scenario.

I would like to know if anyone can do more than 500 reqs/sec with persistent 
connections off.


I'm not using persistent connections. I do this same sort of testing to 
validate various proxies that don't support persistent connections.


I'm remembering the theoretical max of the TCP stack (from one source IP 
to one destination IP) as being ~16K requests/sec, but I don't have 
references to point to at the moment.
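
(Back-of-the-envelope, treating the ~64K ephemeral ports recycled once per 
TIME_WAIT period as the only limit between one source and one destination:)

  echo $(( 64000 / 60 ))   # ~1066 conns/sec at the default 60s TIME_WAIT
  echo $(( 64000 / 12 ))   # ~5333 conns/sec if the wait could be cut to 12s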


David Lang


Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread Amos Jeffries

On 12/06/11 22:20, Jenny Lee wrote:



[snip - quoted thread trimmed]



OK, I am assuming that persistent connections are on. This doesn't simulate 
any real-life scenario.


What do you mean by that? It is the basic requirement for access to the 
major HTTP/1.1 performance features. ON is the default.





I would like to know if anyone can do more than 500 reqs/sec with persistent 
connections off.

Jenny   


Good question. Anyone?

These are our collected reports:
  http://wiki.squid-cache.org/KnowledgeBase/Benchmarks

They are all actual production network traffic rates. The actual 
benchmark tests like David's have been kept out, since we have no standard 
test set to make them comparable.


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.12
  Beta testers wanted for 3.2.0.8 and 3.1.12.2


Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread david

On Sun, 12 Jun 2011, Amos Jeffries wrote:


[snip - quoted thread trimmed]



OK, I am assuming that persistent connections are on. This doesn't simulate 
any real-life scenario.


What do you mean by that? It is the basic requirement for access to the major 
HTTP/1.1 performance features. ON is the default.


Some of the proxies that I've been testing don't support this (and don't 
support HTTP/1.1), so I am sure that my tests are not using persistent 
connections.


Using the old firewall toolkit's http-gw (which forks a new process for every 
incoming connection and doesn't even support all HTTP/1.0 features), I've 
seen 4000 requests/sec.


I've got systems in production that routinely top 1000 connections/sec 
between one source and one destination.


David Lang



RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread Jenny Lee


 Date: Sun, 12 Jun 2011 22:47:25 +1200
 From: squ...@treenet.co.nz
 To: squid-users@squid-cache.org
 Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

 [snip - quoted thread trimmed]
  OK, I am assuming that persistent connections are on. This doesn't simulate 
  any real-life scenario.
 
 What do you mean by that? It is the basic requirement for access to the
 major HTTP/1.1 performance features. ON is the default.
 
 
 
First of all, this breaks tcp_outgoing_address in squid, so it is definitely 
off for me.
 
The above issue also makes persistent connections unusable for peers.
 
Second, when you have many users going to many destinations, persistent 
connections are of no use. Even though I have persistent connections on for 
the client side, I am still bitten by the ephemeral ports.
 
These are my scenarios.
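 
(The sort of setup I mean; the addresses and ACL name are made up for 
illustration, not taken from a real config:)
 
  # squid.conf fragment: per-ACL outgoing address. Server-side persistent
  # connections are disabled so a pooled connection can't carry the wrong
  # source address for a request matching a different ACL.
  acl group1 src 10.0.1.0/24
  tcp_outgoing_address 192.0.2.10 group1
  server_persistent_connections off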
 
 

  I would like to know if anyone can do more than 500 reqs/sec with 
  persistent connections off.
 
  Jenny

 Good question. Anyone?
 
I can do 450 reqs/sec under constant load. But no more. And I have tried all 
available TCP tuning options.
 
Jenny 

RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread Jenny Lee




 Date: Sun, 12 Jun 2011 03:35:28 -0700
 From: da...@lang.hm
 To: bodycar...@live.com
 CC: squid-users@squid-cache.org
 Subject: RE: [squid-users] squid 3.2.0.5 smp scaling issues

 [snip - quoted thread trimmed]

 I'm not using persistent connections. I do this same sort of testing to
 validate various proxies that don't support persistent connections.
 
 I'm remembering the theoretical max of the TCP stack (from one source IP
 to one destination IP) as being ~16K requests/sec, but I don't have
 references to point to at the moment.

 David Lang
 
 
With tcp_fin_timeout set at the theoretical minimum of 12 secs, we can do 5K 
req/s with 64K ports.
 
Setting tcp_fin_timeout had no effect for me. Apparently there is 
conflicting/outdated information everywhere, and I could not lower TIME_WAIT 
from its default of 60 secs, which is hardcoded into include/net/tcp.h. But I 
doubt this would have any effect when you are constantly loading the machine.
 
Making localhost-to-localhost connections didn't help either.
 
I am not a network guru, so of course I am probably doing things wrong. But no 
matter how wrong you do stuff, they cannot escape brute-forcing :) And I have 
tried everything!
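 
(The hardcoded constant is easy to confirm against a kernel tree of that era; 
the exact comment text may vary by version:)
 
  grep TCP_TIMEWAIT_LEN include/net/tcp.h
  # -> #define TCP_TIMEWAIT_LEN (60*HZ)  /* how long to wait to destroy TIME-WAIT state */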

RE: [squid-users] squid 3.2.0.5 smp scaling issues

2011-06-12 Thread david

On Sun, 12 Jun 2011, Jenny Lee wrote:


[snip - repeated from Jenny's message above]

I can't do more than 450-470 reqs/sec even with 200K in /proc/sys/net/netfilter/nf_conntrack_max 
and /sys/module/nf_conntrack/parameters/hashsize. That lets me bypass conntrack 
table full issues, but my ports run out.

Could you be kind enough to specify which OS you are using, and whether you 
run the benches for extended periods of time?
 
Any TCP tuning options you apply would also be very useful; of course, when 
you are back in the office.
 
As I mentioned, we find your work on ACLs and workers valuable.


I'm running Debian with custom-built kernels.

In the testing that I have done over the years, I have had tests at 6000+ 
connections/sec through forking proxies (which only log when they get a new 
connection, with connection rates calculated from the proxy's logs, so I 
know that they aren't using persistent or keep-alive connections).


Unfortunately the machine in my lab with squid on it is unplugged right 
now. I can get at the machines running ab and apache remotely, so I can 
hopefully get logged in and give you the kernel settings in the next 
couple of days (things are _extremely_ hectic through most of Monday, so 
it'll probably be Monday night or Tuesday before I get a chance).


David Lang


[squid-users] squid 3.2.0.5 smp scaling issues

2011-06-11 Thread Jenny Lee

I'd like to know how you are able to do 13000 requests/sec.
 
tcp_fin_timeout defaults to 60 seconds on all *NIXes, and the available ephemeral 
port range is 64K.
 
I can't do more than 1K requests/sec even with tcp_tw_reuse/tcp_tw_recycle with 
ab. I get commBind errors due to connections in TIME_WAIT.
 
Any tuning options suggested for RHEL6 x64?
 
Jenny
 
 
 
 
---
Test setup:
box A running apache and ab; against its local IP address it does 13000 requests/sec
box B running squid: 8 x 2.3 GHz Opteron cores with 16G RAM
The non-ACL/cache-peer related lines in the config are (retyped by hand, so 
typos are mine):
http_port 8000
icp_port 0
visible_hostname gromit1
cache_effective_user proxy
cache_effective_group proxy
append_domain .invalid.server.name
pid_filename /var/run/squid.pid
cache_dir null /tmp
client_db off
cache_access_log syslog squid
cache_log /var/log/squid/cache.log
cache_store_log none
coredump_dir none
no_cache deny all
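
(The exact ab invocation wasn't recorded in the thread; something of this 
shape, with placeholder names and counts, matches the setup described:)

  # from box A, through squid on box B (http_port 8000), to a small page
  ab -n 500000 -c 200 -X boxB:8000 http://boxA/short.html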

Results when requesting a short html page:
  squid 3.0.STABLE12: 4200 requests/sec
  squid 3.1.11: 2100 requests/sec
  squid 3.2.0.5, 1 worker: 1400 requests/sec
  squid 3.2.0.5, 2 workers: 2100 requests/sec
  squid 3.2.0.5, 3 workers: 2500 requests/sec
  squid 3.2.0.5, 4 workers: 2900 requests/sec
  squid 3.2.0.5, 5 workers: 2900 requests/sec
  squid 3.2.0.5, 6 workers: 2500 requests/sec
  squid 3.2.0.5, 7 workers: 2000 requests/sec
  squid 3.2.0.5, 8 workers: 1900 requests/sec
In all these tests the squid process was using 100% of the CPU.

I also tried pulling a larger file (100K instead of 50 bytes) on the thought 
that the test may be bottlenecked on accepting connections, and that with 
something that took more time to service it could do better. However, what I 
found is that with 8 workers, all 8 were using 50% of the CPU at 1000 
requests/sec:
  local machine to itself: 7000 requests/sec
  1 worker: 500 requests/sec
  2 workers: 957 requests/sec
From there it remained at about 1000 requests/sec, with the CPU utilization 
slowly dropping off (but not dropping as fast as it should given the number 
of cores available).

So it looks like there is some significant bottleneck in version 3.2 that 
makes the SMP support fairly ineffective.

In reading the wiki page at wiki.squid-cache.org/Features/SmpScale I see 
you worrying about fairness between workers. If you have put in code to 
try to ensure fairness, you may want to remove it and see what happens to 
performance. What you are describing on that page in terms of fairness is 
what I would expect from a 'first-come-first-served' approach to multiple 
processes grabbing new connections. The worker that last ran is hot in the 
cache and so has an 'unfair' advantage in noticing and processing the new 
request; but as that worker gets busier, it will spend more time servicing 
requests and the other processes will get more of a chance to grab the new 
connection. So it will appear unfair under light load, but become more fair 
under heavy load.
David Lang

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-05-04 Thread david

ping,

anything new on this issue? (including any patches for me to test?)

David Lang

On Mon, 25 Apr 2011, da...@lang.hm wrote:


Date: Mon, 25 Apr 2011 17:14:52 -0700 (PDT)
From: da...@lang.hm
To: Alex Rousskov rouss...@measurement-factory.com
Cc: Marcos mczue...@yahoo.com.br, squid-users@squid-cache.org,
squid-...@squid-cache.org
Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

On Mon, 25 Apr 2011, Alex Rousskov wrote:


On 04/25/2011 05:31 PM, da...@lang.hm wrote:

On Mon, 25 Apr 2011, da...@lang.hm wrote:

On Mon, 25 Apr 2011, Alex Rousskov wrote:

On 04/14/2011 09:06 PM, da...@lang.hm wrote:


In addition, there seems to be some sort of locking between the multiple
worker processes in 3.2 when checking the ACLs


There are pretty much no locks in the current official SMP code. This
will change as we start adding shared caches in a week or so, but even
then the ACLs will remain lock-free. There could be some internal
locking in the 3rd-party libraries used by ACLs (regex and such), but I
do not know much about them.


what are the 3rd party libraries that I would be using?


See "ldd squid". Here is a sample based on a randomly picked Squid:

   libnsl, libresolv, libstdc++, libgcc_s, libm, libc, libz, libepol

Please note that I am not saying that any of these have problems in SMP
environment. I am only saying that Squid itself does not lock anything
runtime so if our suspect is SMP-related locks, they would have to
reside elsewhere. The other possibility is that we should suspect
something else, of course. IMHO, it is more likely to be something else:
after all, Squid does not use threads, where such problems are expected.




BTW, do you see more-or-less even load across CPU cores? If not, you may
need a patch that we find useful on older Linux kernels. It is discussed
in the "Will similar workers receive similar amount of work?" section of
http://wiki.squid-cache.org/Features/SmpScale


the load is pretty even across all workers.

With the problems described on that page, I would expect uneven utilization 
at low loads, but at high loads (with the workers busy servicing requests 
rather than waiting for new connections), I would expect the work to even out 
(and the types of hacks described in that section to end up costing 
performance, but not in a way that would scale with the ACL processing load).



One thought I had is that this could be locking on name lookups. How
hard would it be to create a quick patch that bypasses the name
lookups entirely and only does the lookups by IP?


I did not realize your ACLs use DNS lookups. Squid internal DNS code
does not have any runtime SMP locks. However, the presence of DNS
lookups increases the number of suspects.


They don't; everything in my test environment is by IP. But I've seen other 
software that still runs everything through name lookups, even if what's 
presented to the software (both in what's requested and in the ACLs) is all 
done by IPs. It's an easy way to bullet-proof the input (if it's a name it 
gets resolved; if it's an IP, the IP comes back as-is; and it works for IPv4 
and IPv6, with no need for logic that looks at the value and tries to figure 
out whether the user intended to type a name or an IP). I don't know how 
squid works internally (it's a pretty large codebase, and I haven't tried to 
really dive into it), so I don't know if squid does this or not.



A patch you propose does not sound difficult to me, but since I cannot
contribute such a patch soon, it is probably better to test with ACLs
that do not require any DNS lookups instead.



If that regains the speed and/or scalability, it would point fingers
fairly conclusively at the DNS components.

This is the only thing that I can think of that should be shared between
multiple workers processing ACLs.


but it is _not_ currently shared from Squid point of view.


OK, I was assuming from the description of things that there would be one DNS 
process that all the workers would be accessing. From the way it's described 
in the documentation it sounds as if it's already a separate process, so I 
was thinking it was possible that, if each ACL IP address is being put 
through a single DNS process, I could be running into contention on that 
process (and having to do name lookups for both IPv6 and then falling back to 
IPv4 would explain the severe performance hit far more than the difference 
between IPs being 128-bit values instead of 32-bit values).


David Lang




Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-05-04 Thread Alex Rousskov
On 05/04/2011 11:41 AM, da...@lang.hm wrote:

 anything new on this issue? (including any patches for me to test?)

If you mean the "ACLs do not scale well" issue, then I do not have any
free cycles to work on it right now. I was happy to clarify the new SMP
architecture and suggest ways to triage the issue further. Let's hope
somebody else can volunteer to do the required legwork.

Alex.


 On Mon, 25 Apr 2011, da...@lang.hm wrote:
 
 Date: Mon, 25 Apr 2011 17:14:52 -0700 (PDT)
 From: da...@lang.hm
 To: Alex Rousskov rouss...@measurement-factory.com
 Cc: Marcos mczue...@yahoo.com.br, squid-users@squid-cache.org,
 squid-...@squid-cache.org
 Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

 On Mon, 25 Apr 2011, Alex Rousskov wrote:

 On 04/25/2011 05:31 PM, da...@lang.hm wrote:
 On Mon, 25 Apr 2011, da...@lang.hm wrote:
 On Mon, 25 Apr 2011, Alex Rousskov wrote:
 On 04/14/2011 09:06 PM, da...@lang.hm wrote:

 In addition, there seems to be some sort of locking betwen the
 multiple
 worker processes in 3.2 when checking the ACLs

 There are pretty much no locks in the current official SMP code. This
 will change as we start adding shared caches in a week or so, but
 even
 then the ACLs will remain lock-free. There could be some internal
 locking in the 3rd-party libraries used by ACLs (regex and such),
 but I
 do not know much about them.

 what are the 3rd party libraries that I would be using?

 See ldd squid. Here is a sample based on a randomly picked Squid:

libnsl, libresolv, libstdc++, libgcc_s, libm, libc, libz, libepol

 Please note that I am not saying that any of these have problems in SMP
 environment. I am only saying that Squid itself does not lock anything
 runtime so if our suspect is SMP-related locks, they would have to
 reside elsewhere. The other possibility is that we should suspect
 something else, of course. IMHO, it is more likely to be something else:
 after all, Squid does not use threads, where such problems are expected.


 BTW, do you see more-or-less even load across CPU cores? If not, you may
 need a patch that we find useful on older Linux kernels. It is discussed
 in the Will similar workers receive similar amount of work? section of
 http://wiki.squid-cache.org/Features/SmpScale

 the load is pretty even across all workers.

 with the problems descripted on that page, I would expect uneven
 utilization at low loads, but at high loads (with the workers busy
 serviceing requests rather than waiting for new connections), I would
 expect the work to even out (and the types of hacks described in that
 section to end up costing performance, but not in a way that would
 scale with the ACL processing load)

 one thought I had is that this could be locking on name lookups. how
 hard would it be to create a quick patch that would bypass the name
 lookups entirely and only do the lookups by IP.

 I did not realize your ACLs use DNS lookups. Squid internal DNS code
 does not have any runtime SMP locks. However, the presence of DNS
 lookups increases the number of suspects.

 they don't, everything in my test environment is by IP. But I've seen
 other software that still runs everything through name lookups, even
 if what's presented to the software (both in what's requested and in
 the ACLs) is all done by IPs. It's a easy way to bullet-proof the
 input (if it's a name it gets resolved, if it's an IP, the IP comes
 back as-is, and it works for IPv4 and IPv6, no need to have logic that
 looks at the value and tries to figure out if the user intended to
 type a name or an IP). I don't know how squid is working internally
 (it's a pretty large codebase, and I haven't tried to really dive into
 it) so I don't know if squid does this or not.

 A patch you propose does not sound difficult to me, but since I cannot
 contribute such a patch soon, it is probably better to test with ACLs
 that do not require any DNS lookups instead.


 if that regains the speed and/or scalability it would point fingers
 fairly conclusively at the DNS components.

 this is the only think that I can think of that should be shared
 between
 multiple workers processing ACLs

 but it is _not_ currently shared from Squid point of view.

 Ok, I was assuming from the description of things that there would be
 one DNS process that all the workers would be accessing. from the way
 it's described in the documentation it sounds as if it's already a
 separate process, so I was thinking that it was possible that if each
 ACL IP address is being put through a single DNS process, I could be
 running into contention on that process (and having to do name lookups
 for both IPv6 and then falling back to IPv4 would explain the severe
 performance hit far more than the difference between IPs being 128 bit
 values instead of 32 bit values)

 David Lang





Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-05-04 Thread david
I don't know how many developers are working on squid, so I don't know if 
you are the only person who can do this sort of work or not.


Do you think that I should join the squid-dev list?

David Lang

On Wed, 4 May 2011, Alex Rousskov wrote:


On 05/04/2011 11:41 AM, da...@lang.hm wrote:


anything new on this issue? (including any patches for me to test?)


If you mean the "ACLs do not scale well" issue, then I do not have any
free cycles to work on it right now. I was happy to clarify the new SMP
architecture and suggest ways to triage the issue further. Let's hope
somebody else can volunteer to do the required legwork.

Alex.



[snip - quoted thread trimmed]

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-05-04 Thread Alex Rousskov
On 05/04/2011 12:49 PM, da...@lang.hm wrote:

 I don't know how many developers are working on squid, so I don't know
 if you are the only person who can do this sort of work or not.

I am sure there are others who can do this. The question is whether you
can quickly find somebody interested enough to spend their time on your
problem. In general, folks work on issues that are important to them or
to their customers. Most active developers donate a lot of free time,
but it still tends to revolve around issues they care about for one
reason or another. We all have to prioritize.


 do you think that I should join the squid-dev list?

I believe your messages are posted to squid-dev so you are not going to
reach a wider audience if you do. If you want to write Squid code,
joining is a good idea!


IMHO, you can maximize your chances of getting free help by isolating
the problem better. For example, perhaps you can try to reproduce it
with different kinds of fast ACLs (the simpler the better!). This will
help clarify whether the problem is specific to IPv6, IP, or ACLs in
general. Test different numbers of ACLs: does the problem happen only
when the number of simple ACLs is huge? Make the problem easier to
reproduce by posting configuration files (including Polygraph workloads
or options for some other benchmarking tool you use).
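
For example (a made-up minimal test config; addresses are placeholders,
not anybody's real rules), growing the number of never-matching entries
before the one that matches isolates the per-rule cost:

    acl filler1 src 10.9.0.1
    acl filler2 src 10.9.0.2
    acl filler3 src 10.9.0.3
    # ... repeat up to fillerN ...
    acl tester src 10.1.1.1
    http_access allow filler1
    http_access allow filler2
    http_access allow filler3
    http_access allow tester
    http_access deny all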

This is not a guarantee that somebody will jump and help you, but fixing
a well-triaged issue is often much easier.


HTH,

Alex.


 On Wed, 4 May 2011, Alex Rousskov wrote:
 
 On 05/04/2011 11:41 AM, da...@lang.hm wrote:

 anything new on this issue? (including any patches for me to test?)

 If you mean the "ACLs do not scale well" issue, then I do not have any
 free cycles to work on it right now.  I was happy to clarify the new SMP
 architecture and suggest ways to triage the issue further. Let's hope
 somebody else can volunteer to do the required legwork.

 Alex.


 On Mon, 25 Apr 2011, da...@lang.hm wrote:

 Date: Mon, 25 Apr 2011 17:14:52 -0700 (PDT)
 From: da...@lang.hm
 To: Alex Rousskov rouss...@measurement-factory.com
 Cc: Marcos mczue...@yahoo.com.br, squid-users@squid-cache.org,
 squid-...@squid-cache.org
 Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

 On Mon, 25 Apr 2011, Alex Rousskov wrote:

 On 04/25/2011 05:31 PM, da...@lang.hm wrote:
 On Mon, 25 Apr 2011, da...@lang.hm wrote:
 On Mon, 25 Apr 2011, Alex Rousskov wrote:
 On 04/14/2011 09:06 PM, da...@lang.hm wrote:

 In addition, there seems to be some sort of locking between the
 multiple
 worker processes in 3.2 when checking the ACLs

 There are pretty much no locks in the current official SMP code. This
 will change as we start adding shared caches in a week or so, but even
 then the ACLs will remain lock-free. There could be some internal
 locking in the 3rd-party libraries used by ACLs (regex and such), but I
 do not know much about them.

 what are the 3rd party libraries that I would be using?

 See "ldd squid". Here is a sample based on a randomly picked Squid:

libnsl, libresolv, libstdc++, libgcc_s, libm, libc, libz, libepol

 Please note that I am not saying that any of these have problems in an
 SMP environment. I am only saying that Squid itself does not lock
 anything at runtime, so if our suspect is SMP-related locks, they would
 have to reside elsewhere. The other possibility is that we should
 suspect something else, of course. IMHO, it is more likely to be
 something else: after all, Squid does not use threads, where such
 problems are expected.


 BTW, do you see more-or-less even load across CPU cores? If not,
 you may
 need a patch that we find useful on older Linux kernels. It is
 discussed
 in the "Will similar workers receive similar amount of work?"
 section of
 http://wiki.squid-cache.org/Features/SmpScale

 the load is pretty even across all workers.

 with the problems described on that page, I would expect uneven
 utilization at low loads, but at high loads (with the workers busy
 servicing requests rather than waiting for new connections), I would
 expect the work to even out (and the types of hacks described in that
 section to end up costing performance, but not in a way that would
 scale with the ACL processing load)

 one thought I had is that this could be locking on name lookups. how
 hard would it be to create a quick patch that would bypass the name
 lookups entirely and only do the lookups by IP.

 I did not realize your ACLs use DNS lookups. Squid internal DNS code
 does not have any runtime SMP locks. However, the presence of DNS
 lookups increases the number of suspects.

 they don't, everything in my test environment is by IP. But I've seen
 other software that still runs everything through name lookups, even
 if what's presented to the software (both in what's requested and in
 the ACLs) is all done by IPs. It's an easy way to bullet-proof the
 input (if it's a name it gets resolved, if it's an IP, the IP comes
 back as-is, and it works for IPv4

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-05-04 Thread Amos Jeffries

On Wed, 4 May 2011 11:49:01 -0700 (PDT), da...@lang.hm wrote:

I don't know how many developers are working on squid, so I don't
know if you are the only person who can do this sort of work or not.


4 part-timers and a few others focused on specific areas.



do you think that I should join the squid-dev list?


I thought you had; if you are intending to follow this for long, it 
could be a good idea anyway.
If you have any time to spare on tinkering with optimizations even 
better.


Amos



Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-05-04 Thread david

On Wed, 4 May 2011, Alex Rousskov wrote:


On 05/04/2011 12:49 PM, da...@lang.hm wrote:


I don't know how many developers are working on squid, so I don't know
if you are the only person who can do this sort of work or not.


I am sure there are others who can do this. The question is whether you
can quickly find somebody interested enough to spend their time on your
problem. In general, folks work on issues that are important to them or
to their customers. Most active developers donate a lot of free time,
but it still tends to revolve around issues they care about for one
reason or another. We all have to prioritize.


I do understand this.


do you think that I should join the squid-dev list?


I believe your messages are posted to squid-dev so you are not going to
reach a wider audience if you do. If you want to write Squid code,
joining is a good idea!


I don't really have the time to do coding on this project


IMHO, you can maximize your chances of getting free help by isolating
the problem better. For example, perhaps you can try to reproduce it
with different kinds of fast ACLs (the simpler the better!). This will
help clarify whether the problem is specific to IPv6, IP, or ACLs in
general. Test different numbers of ACLs: does the problem happen only
when the number of simple ACLs is huge? Make the problem easier to
reproduce by posting configuration files (including Polygraph workloads
or options for some other benchmarking tool you use).

This is not a guarantee that somebody will jump and help you, but fixing
a well-triaged issue is often much easier.


that's why I'm speaking up. I just have not known what to test.

are there other types of ACLs that I should be testing?

I'll set up some tests with different numbers of ACLs. Since I've already 
verified that the number of ACLs defined isn't the significant factor, 
only the number tested before one succeeds (by moving the ACL that allows 
my access from the end of the file to the beginning of the file, keeping 
everything else the same), I'll see if the slowdown seems proportional to 
the number of rules, or if there is something else going on.


any other types of testing I should do?

David Lang



HTH,

Alex.



On Wed, 4 May 2011, Alex Rousskov wrote:


On 05/04/2011 11:41 AM, da...@lang.hm wrote:


anything new on this issue? (including any patches for me to test?)


If you mean the "ACLs do not scale well" issue, then I do not have any
free cycles to work on it right now.  I was happy to clarify the new SMP
architecture and suggest ways to triage the issue further. Let's hope
somebody else can volunteer to do the required legwork.

Alex.



On Mon, 25 Apr 2011, da...@lang.hm wrote:


Date: Mon, 25 Apr 2011 17:14:52 -0700 (PDT)
From: da...@lang.hm
To: Alex Rousskov rouss...@measurement-factory.com
Cc: Marcos mczue...@yahoo.com.br, squid-users@squid-cache.org,
squid-...@squid-cache.org
Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

On Mon, 25 Apr 2011, Alex Rousskov wrote:


On 04/25/2011 05:31 PM, da...@lang.hm wrote:

On Mon, 25 Apr 2011, da...@lang.hm wrote:

On Mon, 25 Apr 2011, Alex Rousskov wrote:

On 04/14/2011 09:06 PM, da...@lang.hm wrote:


In addition, there seems to be some sort of locking between the
multiple
worker processes in 3.2 when checking the ACLs


There are pretty much no locks in the current official SMP code. This
will change as we start adding shared caches in a week or so, but even
then the ACLs will remain lock-free. There could be some internal
locking in the 3rd-party libraries used by ACLs (regex and such), but I
do not know much about them.


what are the 3rd party libraries that I would be using?


See "ldd squid". Here is a sample based on a randomly picked Squid:

   libnsl, libresolv, libstdc++, libgcc_s, libm, libc, libz, libepol

Please note that I am not saying that any of these have problems in an
SMP environment. I am only saying that Squid itself does not lock
anything at runtime, so if our suspect is SMP-related locks, they would
have to reside elsewhere. The other possibility is that we should
suspect something else, of course. IMHO, it is more likely to be
something else: after all, Squid does not use threads, where such
problems are expected.




BTW, do you see more-or-less even load across CPU cores? If not,
you may
need a patch that we find useful on older Linux kernels. It is
discussed
in the "Will similar workers receive similar amount of work?"
section of
http://wiki.squid-cache.org/Features/SmpScale


the load is pretty even across all workers.

with the problems described on that page, I would expect uneven
utilization at low loads, but at high loads (with the workers busy
servicing requests rather than waiting for new connections), I would
expect the work to even out (and the types of hacks described in that
section to end up costing performance, but not in a way that would
scale with the ACL processing load)


one thought I had is that this could be locking on name

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-05-04 Thread Amos Jeffries

On Wed, 4 May 2011 16:36:08 -0700 (PDT), da...@lang.hm wrote:

On Wed, 4 May 2011, Alex Rousskov wrote:


On 05/04/2011 12:49 PM, da...@lang.hm wrote:


snip
IMHO, you can maximize your chances of getting free help by isolating
the problem better. For example, perhaps you can try to reproduce it
with different kinds of fast ACLs (the simpler the better!). This will
help clarify whether the problem is specific to IPv6, IP, or ACLs in
general. Test different numbers of ACLs: does the problem happen only
when the number of simple ACLs is huge? Make the problem easier to
reproduce by posting configuration files (including Polygraph workloads
or options for some other benchmarking tool you use).

This is not a guarantee that somebody will jump and help you, but
fixing a well-triaged issue is often much easier.


that's why I'm speaking up. I just have not known what to test.

are there other types of ACLs that I should be testing?


We can't answer that without having seen your config file and knowing 
which ACLs are in use now.


The list of all available ACL are at 
http://wiki.squid-cache.org/SquidFaq/SquidAcl and 
http://www.squid-cache.org/Doc/config/acl/




I'll set up some tests with different numbers of ACLs. Since I've
already verified that the number of ACLs defined isn't the significant
factor, only the number tested before one succeeds (by moving the ACL
that allows my access from the end of the file to the beginning of the
file, keeping everything else the same), I'll see if the slowdown
seems proportional to the number of rules, or if there is something
else going on.

any other types of testing I should do?


The above looks like a good benchmark *provided* all the ACLs have the 
same type with consistent content counts. Mixing types makes the result 
non-comparable with other tests.


If you have time (and want to), we kind of need that type of 
benchmarking done for each ACL type. Prioritising by popularity: src/dst 
by IP, port, domain and regex variants. Then proxy_auth, external (the 
fake helpers can help here). Then the others; i.e. browser, proto, 
method, header matching.
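
Something like one rule per type, with made-up values, keeps the runs 
comparable (a sketch only, not a recommended config):

    acl by_src  src       10.1.2.3
    acl by_dst  dst       192.0.2.9
    acl by_port port      8080
    acl by_dom  dstdomain .example.com
    acl by_rex  url_regex -i /secret
    http_access allow by_src by_port
    http_access deny all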


We know general fuzzy details like, for example, a port test is faster 
than a domain test. One with details presented up front by the client is 
also faster than one where a lookup is needed. But we have no deeper info 
to say if a dstdomain test is faster or slower than a src (IP) test.


Way down my TODO list is the dream of micro-benchmarking the ACLs in 
their unit-tests.



Amos


Res: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread Marcos
thanks for your answer David.

I'm seeing too many features being included in squid 3.x, and it's getting 
slower as new features are added.
I think squid 3.2 with 1 worker should be as fast as 2.7, but it's getting 
slower and hungrier.


Marcos


- Original Message -
From: da...@lang.hm da...@lang.hm
To: Marcos mczue...@yahoo.com.br
Cc: Amos Jeffries squ...@treenet.co.nz; squid-users@squid-cache.org; 
squid-...@squid-cache.org
Sent: Friday, 22 April 2011 15:10:44
Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

ping, I haven't seen a response to this additional information that I sent out 
last week.

squid 3.1 and 3.2 are a significant regression in performance from squid 2.7 or 
3.0

David Lang

On Thu, 14 Apr 2011, da...@lang.hm wrote:

 Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues
 
 Ok, I finally got a chance to test 2.7STABLE9
 
 it performs about the same as squid 3.0, possibly a little better.
 
 with my somewhat stripped down config (smaller regex patterns, replacing CIDR 
blocks and names that would need to be looked up in /etc/hosts with individual 
IP addresses)
 
 2.7 gives ~4800 requests/sec
 3.0 gives ~4600 requests/sec
 3.2.0.6 with 1 worker gives ~1300 requests/sec
 3.2.0.6 with 5 workers gives ~2800 requests/sec
 
 the numbers for 3.0 are slightly better than what I was getting with the full 
ruleset, but the numbers for 3.2.0.6 are pretty much exactly what I got from 
the last round of tests (with either the full or simplified ruleset)
 
 so 3.1 and 3.2 are a very significant regression from 2.7 or 3.0, and the 
ability to use multiple worker processes in 3.2 doesn't make up for this.
 
 the time taken seems to almost all be in the ACL evaluation as eliminating all 
the ACLs takes 1 worker with 3.2 up to 4200 requests/sec.
 
 one theory is that even though I have IPv6 disabled on this build, the added 
space and more expensive checks needed to compare IPv6 addresses instead of IPv4 
addresses accounts for the single worker drop of ~66%. that seems rather 
expensive, even though there are 293 http_access lines (and one of them uses 
external file contents in its ACLs, so it's a total of ~2400 source/destination 
pairs, however due to the ability to shortcut the comparison the number of tests 
that need to be done should be 400)
 
 
 
 In addition, there seems to be some sort of locking between the multiple worker 
processes in 3.2 when checking the ACLs as the test with almost no ACLs scales 
close to 100% per worker while with the ACLs it scales much more slowly, and 
above 4-5 workers actually drops off dramatically (to the point where with 8 
workers the throughput is down to about what you get with 1-2 workers). I don't 
see any conceptual reason why the ACL checks of the different worker threads 
should impact each other in any way, let alone in a way that limits scalability 
to ~4 workers before adding more workers is a net loss.
 
 David Lang
 
 
 On Wed, 13 Apr 2011, Marcos wrote:
 
 Hi David,
 
 could you run and publish your benchmark with squid 2.7 ???
 I'd like to know if there is any regression between the 2.7 and 3.x series.
 
 thanks.
 
 Marcos
 
 
 - Original Message -
 From: da...@lang.hm da...@lang.hm
 To: Amos Jeffries squ...@treenet.co.nz
 Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
 Sent: Saturday, 9 April 2011 12:56:12
 Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues
 
 On Sat, 9 Apr 2011, Amos Jeffries wrote:
 
 On 09/04/11 14:27, da...@lang.hm wrote:
 A couple more things about the ACLs used in my test
 
 all of them are allow ACLs (no deny rules to worry about precedence of)
 except for a deny-all at the bottom
 
 the ACL line that permits the test source to the test destination has
 zero overlap with the rest of the rules
 
 every rule has an IP based restriction (even the ones with url_regex are
 source - URL regex)
 
 I moved the ACL that allows my test from the bottom of the ruleset to
 the top and the resulting performance numbers were up as if the other
 ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
 rule.
 
 I changed one of the url_regex rules to just match one line rather than
 a file containing 307 lines to see if that made a difference, and it
 made no significant difference. So this indicates to me that it's not
 having to fully evaluate every rule (it's able to skip doing the regex
 if the IP match doesn't work)
 
 I then changed all the acl lines that used hostnames to have IP
 addresses in them, and this also made no significant difference
 
 I then changed all subnet matches to single IP address (just nuked /##
 throughout the config file) and this also made no significant difference.
 
 
 Squid has always worked this way. It will *test* every rule from the top down 
to the one that matches. Also testing each line left-to-right until one fails or 
the whole line matches.
 
 
 so why are the address matches so

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread Alex Rousskov
On 04/14/2011 09:06 PM, da...@lang.hm wrote:
 Ok, I finally got a chance to test 2.7STABLE9
 
 it performs about the same as squid 3.0, possibly a little better.
 
 with my somewhat stripped down config (smaller regex patterns, replacing
 CIDR blocks and names that would need to be looked up in /etc/hosts with
 individual IP addresses)
 
 2.7 gives ~4800 requests/sec
 3.0 gives ~4600 requests/sec
 3.2.0.6 with 1 worker gives ~1300 requests/sec
 3.2.0.6 with 5 workers gives ~2800 requests/sec

Glad you did not see a significant regression between v2.7 and v3.0. We
have heard rather different stories. Every environment is different, and
many lab tests are misguided, of course, but it is still good to hear
positive reports.

The difference between v3.2 and v3.0 is known and has been discussed on
squid-dev. A few specific culprits are also known, but more need to be
identified. We are working on identifying these performance bugs and
reducing that difference.

As for 1 versus 5 worker difference, it seems to be specific to your
environment (as discussed below).


 the numbers for 3.0 are slightly better than what I was getting with the
 full ruleset, but the numbers for 3.2.0.6 are pretty much exactly what I
 got from the last round of tests (with either the full or simplified
 ruleset)
 
 so 3.1 and 3.2 are a very significant regression from 2.7 or 3.0, and
 the ability to use multiple worker processes in 3.2 doesn't make up for
 this.
 
 the time taken seems to almost all be in the ACL evaluation as
 eliminating all the ACLs takes 1 worker with 3.2 up to 4200 requests/sec.

If ACLs are the major culprit in your environment, then this is most
likely not a problem in Squid source code. AFAIK, there are no locks or
other synchronization primitives/overheads when it comes to Squid ACLs.
The solution may lie in optimizing some 3rd-party libraries (used by
ACLs) or in optimizing how they are used by Squid, depending on what
ACLs you use. As far as Squid-specific code is concerned, you should see
nearly linear ACL scale with the number of workers.


 one theory is that even though I have IPv6 disabled on this build, the
 added space and more expensive checks needed to compare IPv6 addresses
 instead of IPv4 addresses accounts for the single worker drop of ~66%.
 that seems rather expensive, even though there are 293 http_access lines
 (and one of them uses external file contents in its ACLs, so it's a
 total of ~2400 source/destination pairs, however due to the ability to
 shortcut the comparison the number of tests that need to be done should
 be 400)

Yes, IPv6 is one of the known major performance regression culprits, but
IPv6 ACLs should still scale linearly with the number of workers, AFAICT.

Please note that I am not an ACL expert. I am just talking from the
overall Squid SMP design point of view and from our testing/deployment
experience point of view.


 In addition, there seems to be some sort of locking between the multiple
 worker processes in 3.2 when checking the ACLs

There are pretty much no locks in the current official SMP code. This
will change as we start adding shared caches in a week or so, but even
then the ACLs will remain lock-free. There could be some internal
locking in the 3rd-party libraries used by ACLs (regex and such), but I
do not know much about them.


HTH,

Alex.


 On Wed, 13 Apr 2011, Marcos wrote:

 Hi David,

 could you run and publish your benchmark with squid 2.7 ???
 I'd like to know if there is any regression between the 2.7 and 3.x series.

 thanks.

 Marcos


 - Original Message -
 From: da...@lang.hm da...@lang.hm
 To: Amos Jeffries squ...@treenet.co.nz
 Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
 Sent: Saturday, 9 April 2011 12:56:12
 Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

 On Sat, 9 Apr 2011, Amos Jeffries wrote:

 On 09/04/11 14:27, da...@lang.hm wrote:
 A couple more things about the ACLs used in my test

 all of them are allow ACLs (no deny rules to worry about precedence of)
 except for a deny-all at the bottom

 the ACL line that permits the test source to the test destination has
 zero overlap with the rest of the rules

 every rule has an IP based restriction (even the ones with url_regex are
 source - URL regex)

 I moved the ACL that allows my test from the bottom of the ruleset to
 the top and the resulting performance numbers were up as if the other
 ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
 rule.

 I changed one of the url_regex rules to just match one line rather than
 a file containing 307 lines to see if that made a difference, and it
 made no significant difference. So this indicates to me that it's not
 having to fully evaluate every rule (it's able to skip doing the regex
 if the IP match doesn't work)

 I then changed all the acl lines that used hostnames to have IP
 addresses in them, and this also made no significant difference

 I then changed all subnet

Re: Res: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread david

On Mon, 25 Apr 2011, Marcos wrote:


thanks for your answer David.

I'm seeing too many features being included in squid 3.x, and it's getting 
slower as new features are added.


that's unfortunately fairly normal.

I think squid 3.2 with 1 worker should be as fast as 2.7, but it's getting 
slower and hungrier.


that's one major problem, but the fact that the ACL matching isn't scaling 
with more workers I think is what's killing us.


1 3.2 worker is ~1/3 the speed of 2.7, but with the easy availability of 8+ 
real cores (not hyperthreaded 'fake' cores), you should still be able to 
get ~3x the performance of 2.7 by using 3.2.


unfortunately that's not what's happening, and we end up topping out 
around 1/2-2/3 the performance of 2.7


David Lang



Marcos


- Original Message -
From: da...@lang.hm da...@lang.hm
To: Marcos mczue...@yahoo.com.br
Cc: Amos Jeffries squ...@treenet.co.nz; squid-users@squid-cache.org; 
squid-...@squid-cache.org

Sent: Friday, 22 April 2011 15:10:44
Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

ping, I haven't seen a response to this additional information that I sent out 
last week.


squid 3.1 and 3.2 are a significant regression in performance from squid 2.7 or 
3.0


David Lang

On Thu, 14 Apr 2011, da...@lang.hm wrote:


Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

Ok, I finally got a chance to test 2.7STABLE9

it performs about the same as squid 3.0, possibly a little better.

with my somewhat stripped down config (smaller regex patterns, replacing CIDR 
blocks and names that would need to be looked up in /etc/hosts with individual 
IP addresses)


2.7 gives ~4800 requests/sec
3.0 gives ~4600 requests/sec
3.2.0.6 with 1 worker gives ~1300 requests/sec
3.2.0.6 with 5 workers gives ~2800 requests/sec

the numbers for 3.0 are slightly better than what I was getting with the full 
ruleset, but the numbers for 3.2.0.6 are pretty much exactly what I got from the 
last round of tests (with either the full or simplified ruleset)


so 3.1 and 3.2 are a very significant regression from 2.7 or 3.0, and the 
ability to use multiple worker processes in 3.2 doesn't make up for this.


the time taken seems to almost all be in the ACL evaluation as eliminating all 
the ACLs takes 1 worker with 3.2 up to 4200 requests/sec.


one theory is that even though I have IPv6 disabled on this build, the added 
space and more expensive checks needed to compare IPv6 addresses instead of IPv4 
addresses accounts for the single worker drop of ~66%. that seems rather 
expensive, even though there are 293 http_access lines (and one of them uses 
external file contents in its ACLs, so it's a total of ~2400 source/destination 
pairs, however due to the ability to shortcut the comparison the number of tests 
that need to be done should be 400)




In addition, there seems to be some sort of locking between the multiple worker 
processes in 3.2 when checking the ACLs as the test with almost no ACLs scales 
close to 100% per worker while with the ACLs it scales much more slowly, and 
above 4-5 workers actually drops off dramatically (to the point where with 8 
workers the throughput is down to about what you get with 1-2 workers) I don't 
see any conceptual reason why the ACL checks of the different worker threads 
should impact each other in any way, let alone in a way that limits scalability 
to ~4 workers before adding more workers is a net loss.


David Lang



On Wed, 13 Apr 2011, Marcos wrote:


Hi David,

could you run and publish your benchmark with squid 2.7 ???
I'd like to know if there is any regression between the 2.7 and 3.x series.

thanks.

Marcos


- Original Message -
From: da...@lang.hm da...@lang.hm
To: Amos Jeffries squ...@treenet.co.nz
Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
Sent: Saturday, 9 April 2011 12:56:12
Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

On Sat, 9 Apr 2011, Amos Jeffries wrote:


On 09/04/11 14:27, da...@lang.hm wrote:

A couple more things about the ACLs used in my test

all of them are allow ACLs (no deny rules to worry about precedence of)
except for a deny-all at the bottom

the ACL line that permits the test source to the test destination has
zero overlap with the rest of the rules

every rule has an IP based restriction (even the ones with url_regex are
source - URL regex)

I moved the ACL that allows my test from the bottom of the ruleset to
the top and the resulting performance numbers were up as if the other
ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
rule.

I changed one of the url_regex rules to just match one line rather than
a file containing 307 lines to see if that made a difference, and it
made no significant difference. So this indicates to me that it's not
having to fully evaluate every rule (it's able to skip doing the regex
if the IP match doesn't work)

I then changed all the acl lines that used hostnames

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread david

On Mon, 25 Apr 2011, Alex Rousskov wrote:


On 04/14/2011 09:06 PM, da...@lang.hm wrote:

Ok, I finally got a chance to test 2.7STABLE9

it performs about the same as squid 3.0, possibly a little better.

with my somewhat stripped down config (smaller regex patterns, replacing
CIDR blocks and names that would need to be looked up in /etc/hosts with
individual IP addresses)

2.7 gives ~4800 requests/sec
3.0 gives ~4600 requests/sec
3.2.0.6 with 1 worker gives ~1300 requests/sec
3.2.0.6 with 5 workers gives ~2800 requests/sec


Glad you did not see a significant regression between v2.7 and v3.0. We
have heard rather different stories. Every environment is different, and
many lab tests are misguided, of course, but it is still good to hear
positive reports.

The difference between v3.2 and v3.0 is known and has been discussed on
squid-dev. A few specific culprits are also known, but more need to be
identified. We are working on identifying these performance bugs and
reducing that difference.


let me know if there are any tests that I can run that will help you.


As for 1 versus 5 worker difference, it seems to be specific to your
environment (as discussed below).



the numbers for 3.0 are slightly better than what I was getting with the
full ruleset, but the numbers for 3.2.0.6 are pretty much exactly what I
got from the last round of tests (with either the full or simplified
ruleset)

so 3.1 and 3.2 are a very significant regression from 2.7 or 3.0, and
the ability to use multiple worker processes in 3.2 doesn't make up for
this.

the time taken seems to almost all be in the ACL evaluation as
eliminating all the ACLs takes 1 worker with 3.2 up to 4200 requests/sec.


If ACLs are the major culprit in your environment, then this is most
likely not a problem in Squid source code. AFAIK, there are no locks or
other synchronization primitives/overheads when it comes to Squid ACLs.
The solution may lie in optimizing some 3rd-party libraries (used by
ACLs) or in optimizing how they are used by Squid, depending on what
ACLs you use. As far as Squid-specific code is concerned, you should see
nearly linear ACL scale with the number of workers.


given that my ACLs are IP/port matches or regex matches (and I've tested 
replacing the regex matches with IP matches with no significant change in 
performance), what components would be used?





one theory is that even though I have IPv6 disabled on this build, the
added space and more expensive checks needed to compare IPv6 addresses
instead of IPv4 addresses accounts for the single worker drop of ~66%.
that seems rather expensive, even though there are 293 http_access lines
(and one of them uses external file contents in its ACLs, so it's a
total of ~2400 source/destination pairs, however due to the ability to
shortcut the comparison the number of tests that need to be done should
be 400)


Yes, IPv6 is one of the known major performance regression culprits, but
IPv6 ACLs should still scale linearly with the number of workers, AFAICT.

Please note that I am not an ACL expert. I am just talking from the
overall Squid SMP design point of view and from our testing/deployment
experience point of view.


that makes sense and is what I would have expected, but in my case (lots 
of ACLs) I am seeing a definite problem with more workers not completing 
more work, and beyond about 5 workers I am seeing the total work being 
completed drop. I can't think of any reason besides locking that this may 
be the case.



In addition, there seems to be some sort of locking between the multiple
worker processes in 3.2 when checking the ACLs


There are pretty much no locks in the current official SMP code. This
will change as we start adding shared caches in a week or so, but even
then the ACLs will remain lock-free. There could be some internal
locking in the 3rd-party libraries used by ACLs (regex and such), but I
do not know much about them.


what are the 3rd party libraries that I would be using?

David Lang



HTH,

Alex.



On Wed, 13 Apr 2011, Marcos wrote:


Hi David,

could you run and publish your benchmark with squid 2.7 ???
I'd like to know if there is any regression between the 2.7 and 3.x series.

thanks.

Marcos


- Original Message -
From: da...@lang.hm da...@lang.hm
To: Amos Jeffries squ...@treenet.co.nz
Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
Sent: Saturday, 9 April 2011 12:56:12
Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

On Sat, 9 Apr 2011, Amos Jeffries wrote:


On 09/04/11 14:27, da...@lang.hm wrote:

A couple more things about the ACLs used in my test

all of them are allow ACLs (no deny rules to worry about precedence of)
except for a deny-all at the bottom

the ACL line that permits the test source to the test destination has
zero overlap with the rest of the rules

every rule has an IP based restriction (even the ones with url_regex are
source - URL regex)

I moved the ACL that allows my test from the bottom

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread david

On Mon, 25 Apr 2011, da...@lang.hm wrote:


On Mon, 25 Apr 2011, Alex Rousskov wrote:


On 04/14/2011 09:06 PM, da...@lang.hm wrote:


In addition, there seems to be some sort of locking between the multiple
worker processes in 3.2 when checking the ACLs


There are pretty much no locks in the current official SMP code. This
will change as we start adding shared caches in a week or so, but even
then the ACLs will remain lock-free. There could be some internal
locking in the 3rd-party libraries used by ACLs (regex and such), but I
do not know much about them.


what are the 3rd party libraries that I would be using?


one thought I had is that this could be locking on name lookups. how hard 
would it be to create a quick patch that would bypass the name lookups 
entirely and only do the lookups by IP.


if that regains the speed and/or scalability it would point fingers fairly 
conclusively at the DNS components.


this is the only thing that I can think of that should be shared between 
multiple workers processing ACLs


David Lang


Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread Alex Rousskov
On 04/25/2011 05:31 PM, da...@lang.hm wrote:
 On Mon, 25 Apr 2011, da...@lang.hm wrote: 
 On Mon, 25 Apr 2011, Alex Rousskov wrote:
 On 04/14/2011 09:06 PM, da...@lang.hm wrote:

 In addition, there seems to be some sort of locking between the multiple
 worker processes in 3.2 when checking the ACLs

 There are pretty much no locks in the current official SMP code. This
 will change as we start adding shared caches in a week or so, but even
 then the ACLs will remain lock-free. There could be some internal
 locking in the 3rd-party libraries used by ACLs (regex and such), but I
 do not know much about them.

 what are the 3rd party libraries that I would be using?

See "ldd squid". Here is a sample based on a randomly picked Squid:

libnsl, libresolv, libstdc++, libgcc_s, libm, libc, libz, libepol

Please note that I am not saying that any of these have problems in an SMP
environment. I am only saying that Squid itself does not lock anything at
runtime, so if our suspect is SMP-related locks, they would have to
reside elsewhere. The other possibility is that we should suspect
something else, of course. IMHO, it is more likely to be something else:
after all, Squid does not use threads, where such problems are expected.

BTW, do you see more-or-less even load across CPU cores? If not, you may
need a patch that we find useful on older Linux kernels. It is discussed
in the "Will similar workers receive similar amount of work?" section of
http://wiki.squid-cache.org/Features/SmpScale


 one thought I had is that this could be locking on name lookups. how
 hard would it be to create a quick patch that would bypass the name
 lookups entirely and only do the lookups by IP.

I did not realize your ACLs use DNS lookups. Squid internal DNS code
does not have any runtime SMP locks. However, the presence of DNS
lookups increases the number of suspects.

A patch you propose does not sound difficult to me, but since I cannot
contribute such a patch soon, it is probably better to test with ACLs
that do not require any DNS lookups instead.


 if that regains the speed and/or scalability it would point fingers
 fairly conclusively at the DNS components.
 
 this is the only thing that I can think of that should be shared between
 multiple workers processing ACLs

but it is _not_ currently shared from Squid point of view.


Cheers,

Alex.


Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread david

On Mon, 25 Apr 2011, Alex Rousskov wrote:


On 04/25/2011 05:31 PM, da...@lang.hm wrote:

On Mon, 25 Apr 2011, da...@lang.hm wrote:

On Mon, 25 Apr 2011, Alex Rousskov wrote:

On 04/14/2011 09:06 PM, da...@lang.hm wrote:


In addition, there seems to be some sort of locking between the multiple
worker processes in 3.2 when checking the ACLs


There are pretty much no locks in the current official SMP code. This
will change as we start adding shared caches in a week or so, but even
then the ACLs will remain lock-free. There could be some internal
locking in the 3rd-party libraries used by ACLs (regex and such), but I
do not know much about them.


what are the 3rd party libraries that I would be using?


See "ldd squid". Here is a sample based on a randomly picked Squid:

   libnsl, libresolv, libstdc++, libgcc_s, libm, libc, libz, libepol

Please note that I am not saying that any of these have problems in an SMP
environment. I am only saying that Squid itself does not lock anything at
runtime, so if our suspect is SMP-related locks, they would have to
reside elsewhere. The other possibility is that we should suspect
something else, of course. IMHO, it is more likely to be something else:
after all, Squid does not use threads, where such problems are expected.




BTW, do you see more-or-less even load across CPU cores? If not, you may
need a patch that we find useful on older Linux kernels. It is discussed
in the "Will similar workers receive similar amount of work?" section of
http://wiki.squid-cache.org/Features/SmpScale


the load is pretty even across all workers.

with the problems described on that page, I would expect uneven 
utilization at low loads, but at high loads (with the workers busy 
servicing requests rather than waiting for new connections), I would 
expect the work to even out (and the types of hacks described in that 
section to end up costing performance, but not in a way that would scale 
with the ACL processing load)



one thought I had is that this could be locking on name lookups. how
hard would it be to create a quick patch that would bypass the name
lookups entirely and only do the lookups by IP.


I did not realize your ACLs use DNS lookups. Squid internal DNS code
does not have any runtime SMP locks. However, the presence of DNS
lookups increases the number of suspects.


they don't, everything in my test environment is by IP. But I've seen 
other software that still runs everything through name lookups, even if 
what's presented to the software (both in what's requested and in the 
ACLs) is all done by IPs. It's an easy way to bullet-proof the input (if 
it's a name it gets resolved, if it's an IP, the IP comes back as-is, and 
it works for IPv4 and IPv6, no need to have logic that looks at the value 
and tries to figure out if the user intended to type a name or an IP). I 
don't know how squid is working internally (it's a pretty large codebase, 
and I haven't tried to really dive into it) so I don't know if squid does 
this or not.



A patch you propose does not sound difficult to me, but since I cannot
contribute such a patch soon, it is probably better to test with ACLs
that do not require any DNS lookups instead.



if that regains the speed and/or scalability it would point fingers
fairly conclusively at the DNS components.

this is the only thing that I can think of that should be shared between
multiple workers processing ACLs


but it is _not_ currently shared from Squid point of view.


Ok, I was assuming from the description of things that there would be one 
DNS process that all the workers would be accessing. from the way it's 
described in the documentation it sounds as if it's already a separate 
process, so I was thinking that it was possible that if each ACL IP 
address is being put through a single DNS process, I could be running into 
contention on that process (and having to do name lookups for both IPv6 
and then falling back to IPv4 would explain the severe performance hit far 
more than the difference between IPs being 128 bit values instead of 32 
bit values)


David Lang



Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread Alex Rousskov
On 04/25/2011 06:14 PM, da...@lang.hm wrote:
 if that regains the speed and/or scalability it would point fingers
 fairly conclusively at the DNS components.

 this is the only thing that I can think of that should be shared between
 multiple workers processing ACLs

 but it is _not_ currently shared from Squid point of view.
 
 Ok, I was assuming from the description of things that there would be
 one DNS process that all the workers would be accessing. from the way
 it's described in the documentation it sounds as if it's already a
 separate process

I would like to fix that documentation, but I cannot find what phrase
led you to the above conclusion. The SmpScale wiki page says:

 Currently, Squid workers do not share and do not synchronize other
 resources or services, including:
 
 * DNS caches (ipcache and fqdncache);

So that seems to be correct and clear. Which documentation are you
referring to?


Thank you,

Alex.


Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-25 Thread david

On Mon, 25 Apr 2011, Alex Rousskov wrote:


On 04/25/2011 06:14 PM, da...@lang.hm wrote:

if that regains the speed and/or scalability it would point fingers
fairly conclusively at the DNS components.

this is the only thing that I can think of that should be shared between
multiple workers processing ACLs


but it is _not_ currently shared from Squid point of view.


Ok, I was assuming from the description of things that there would be
one DNS process that all the workers would be accessing. from the way
it's described in the documentation it sounds as if it's already a
separate process


I would like to fix that documentation, but I cannot find what phrase
led you to the above conclusion. The SmpScale wiki page says:


Currently, Squid workers do not share and do not synchronize other
resources or services, including:

* DNS caches (ipcache and fqdncache);


So that seems to be correct and clear. Which documentation are you
referring to?


ahh, I missed that, I was going by the description of the config options 
that configure and disable the DNS cache (they don't say anything about 
the SMP mode, but I read them to imply that the squid-internal DNS cache 
was a separate thread/process)
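
(the options I mean are presumably the per-process cache knobs, e.g. 
something like this in squid.conf -- values here are arbitrary, just 
for illustration:

    ipcache_size 4096
    fqdncache_size 4096

nothing in their descriptions says anything about how they behave 
across workers in SMP mode)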


David Lang


Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-22 Thread david
ping, I haven't seen a response to this additional information that I sent 
out last week.


squid 3.1 and 3.2 are a significant regression in performance from squid 
2.7 or 3.0


David Lang

On Thu, 14 Apr 2011, da...@lang.hm wrote:


Subject: Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

Ok, I finally got a chance to test 2.7STABLE9

it performs about the same as squid 3.0, possibly a little better.

with my somewhat stripped down config (smaller regex patterns, replacing CIDR 
blocks and names that would need to be looked up in /etc/hosts with 
individual IP addresses)


2.7 gives ~4800 requests/sec
3.0 gives ~4600 requests/sec
3.2.0.6 with 1 worker gives ~1300 requests/sec
3.2.0.6 with 5 workers gives ~2800 requests/sec

the numbers for 3.0 are slightly better than what I was getting with the full 
ruleset, but the numbers for 3.2.0.6 are pretty much exactly what I got from 
the last round of tests (with either the full or simplified ruleset)


so 3.1 and 3.2 are a very significant regression from 2.7 or 3.0, and the 
ability to use multiple worker processes in 3.2 doesn't make up for this.


the time taken seems to almost all be in the ACL evaluation as eliminating 
all the ACLs takes 1 worker with 3.2 up to 4200 requests/sec.


one theory is that even though I have IPv6 disabled on this build, the added 
space and more expensive checks needed to compare IPv6 addresses instead of 
IPv4 addresses accounts for the single worker drop of ~66%. that seems rather 
expensive, even though there are 293 http_access lines (and one of them uses 
external file contents in its ACLs, so it's a total of ~2400 
source/destination pairs, however due to the ability to shortcut the 
comparison the number of tests that need to be done should be 400)




In addition, there seems to be some sort of locking between the multiple 
worker processes in 3.2 when checking the ACLs as the test with almost no 
ACLs scales close to 100% per worker while with the ACLs it scales much more 
slowly, and above 4-5 workers actually drops off dramatically (to the point 
where with 8 workers the throughput is down to about what you get with 1-2 
workers) I don't see any conceptual reason why the ACL checks of the 
different worker threads should impact each other in any way, let alone in a 
way that limits scalability to ~4 workers before adding more workers is a net 
loss.


David Lang



On Wed, 13 Apr 2011, Marcos wrote:


Hi David,

could you run and publish your benchmark with squid 2.7 ???
I'd like to know if there is any regression between the 2.7 and 3.x series.

thanks.

Marcos


- Original Message -
From: da...@lang.hm da...@lang.hm
To: Amos Jeffries squ...@treenet.co.nz
Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
Sent: Saturday, 9 April 2011 12:56:12
Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

On Sat, 9 Apr 2011, Amos Jeffries wrote:


On 09/04/11 14:27, da...@lang.hm wrote:

A couple more things about the ACLs used in my test

all of them are allow ACLs (no deny rules to worry about precedence of)
except for a deny-all at the bottom

the ACL line that permits the test source to the test destination has
zero overlap with the rest of the rules

every rule has an IP based restriction (even the ones with url_regex are
source - URL regex)

I moved the ACL that allows my test from the bottom of the ruleset to
the top and the resulting performance numbers were up as if the other
ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
rule.

I changed one of the url_regex rules to just match one line rather than
a file containing 307 lines to see if that made a difference, and it
made no significant difference. So this indicates to me that it's not
having to fully evaluate every rule (it's able to skip doing the regex
if the IP match doesn't work)

I then changed all the acl lines that used hostnames to have IP
addresses in them, and this also made no significant difference

I then changed all subnet matches to single IP address (just nuked /##
throughout the config file) and this also made no significant 
difference.




Squid has always worked this way. It will *test* every rule from the top 
down to the one that matches. Also testing each line left-to-right until 
one fails or the whole line matches.
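
For example, with two hypothetical rules:

    # left-to-right: if "lan" fails to match, "safe_ports" is never tested
    http_access allow lan safe_ports
    # top-down: reached only when the whole line above failed to match
    http_access deny all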




so why are the address matches so expensive



3.0 and older IP address is a 32-bit comparison.
3.1 and newer IP address is a 128-bit comparison with memcmp().

If something like a word-wise comparison can be implemented faster than 
memcmp() we would welcome it.


I wonder if there should be a different version that's used when IPv6 is 
disabled. this is a pretty large hit.


if the data is aligned properly, on a 64 bit system this should still only 
be 2 compares. do you do any alignment on the data now?
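
something like this sketch is what I have in mind (hypothetical code, 
making no assumptions about how squid actually stores addresses -- 
memcpy keeps it safe even for unaligned storage, and compilers turn 
these into plain 64-bit loads):

    #include <stdint.h>
    #include <string.h>

    /* sketch: compare two 128-bit addresses as two 64-bit words
     * instead of calling memcmp(); illustrative only */
    static inline int addr128_equal(const unsigned char a[16],
                                    const unsigned char b[16])
    {
        uint64_t a0, a1, b0, b1;
        memcpy(&a0, a,     8);
        memcpy(&a1, a + 8, 8);
        memcpy(&b0, b,     8);
        memcpy(&b1, b + 8, 8);
        return a0 == b0 && a1 == b1;
    }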



and as noted in the e-mail below, why do these checks not scale nicely
with the number of worker processes? If they did, the fact that one 3.2
process

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-14 Thread david

Ok, I finally got a chance to test 2.7STABLE9

it performs about the same as squid 3.0, possibly a little better.

with my somewhat stripped down config (smaller regex patterns, replacing 
CIDR blocks and names that would need to be looked up in /etc/hosts with 
individual IP addresses)


2.7 gives ~4800 requests/sec
3.0 gives ~4600 requests/sec
3.2.0.6 with 1 worker gives ~1300 requests/sec
3.2.0.6 with 5 workers gives ~2800 requests/sec

the numbers for 3.0 are slightly better than what I was getting with the 
full ruleset, but the numbers for 3.2.0.6 are pretty much exactly what I 
got from the last round of tests (with either the full or simplified 
ruleset)


so 3.1 and 3.2 are a very significant regression from 2.7 or 3.0, and the 
ability to use multiple worker processes in 3.2 doesn't make up for this.


the time taken seems to almost all be in the ACL evaluation as eliminating 
all the ACLs takes 1 worker with 3.2 up to 4200 requests/sec.


one theory is that even though I have IPv6 disabled on this build, the 
added space and more expensive checks needed to compare IPv6 addresses 
instead of IPv4 addresses accounts for the single worker drop of ~66%. 
that seems rather expensive, even though there are 293 http_access lines 
(and one of them uses external file contents in its ACLs, so it's a total 
of ~2400 source/destination pairs, however due to the ability to shortcut 
the comparison the number of tests that need to be done should be 400)




In addition, there seems to be some sort of locking between the multiple 
worker processes in 3.2 when checking the ACLs as the test with almost no 
ACLs scales close to 100% per worker while with the ACLs it scales much 
more slowly, and above 4-5 workers actually drops off dramatically (to the 
point where with 8 workers the throughput is down to about what you get 
with 1-2 workers) I don't see any conceptual reason why the ACL checks of 
the different worker threads should impact each other in any way, let 
alone in a way that limits scalability to ~4 workers before adding more 
workers is a net loss.


David Lang



On Wed, 13 Apr 2011, Marcos wrote:


Hi David,

could you run and publish your benchmark with squid 2.7 ???
I'd like to know if there is any regression between the 2.7 and 3.x series.

thanks.

Marcos


- Original Message -
From: da...@lang.hm da...@lang.hm
To: Amos Jeffries squ...@treenet.co.nz
Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
Sent: Saturday, 9 April 2011 12:56:12
Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

On Sat, 9 Apr 2011, Amos Jeffries wrote:


On 09/04/11 14:27, da...@lang.hm wrote:

A couple more things about the ACLs used in my test

all of them are allow ACLs (no deny rules to worry about precedence of)
except for a deny-all at the bottom

the ACL line that permits the test source to the test destination has
zero overlap with the rest of the rules

every rule has an IP based restriction (even the ones with url_regex are
source - URL regex)

I moved the ACL that allows my test from the bottom of the ruleset to
the top and the resulting performance numbers were up as if the other
ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
rule.

I changed one of the url_regex rules to just match one line rather than
a file containing 307 lines to see if that made a difference, and it
made no significant difference. So this indicates to me that it's not
having to fully evaluate every rule (it's able to skip doing the regex
if the IP match doesn't work)

I then changed all the acl lines that used hostnames to have IP
addresses in them, and this also made no significant difference

I then changed all subnet matches to single IP address (just nuked /##
throughout the config file) and this also made no significant difference.



Squid has always worked this way. It will *test* every rule from the top 
down to the one that matches. Also testing each line left-to-right until 
one fails or the whole line matches.




so why are the address matches so expensive



3.0 and older IP address is a 32-bit comparison.
3.1 and newer IP address is a 128-bit comparison with memcmp().

If something like a word-wise comparison can be implemented faster than 
memcmp() we would welcome it.


I wonder if there should be a different version that's used when IPv6 is 
disabled. this is a pretty large hit.


if the data is aligned properly, on a 64 bit system this should still only 
be 2 compares. do you do any alignment on the data now?



and as noted in the e-mail below, why do these checks not scale nicely
with the number of worker processes? If they did, the fact that one 3.2
process is about 1/3 the speed of a 3.0 process in checking the acls
wouldn't matter nearly as much when it's so easy to get an 8+ core 
system.




There you have the unknown.


I think this is a fairly critical thing to figure out.


Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-13 Thread Marcos
Hi David,

could you run and publish your benchmark with squid 2.7 ???
I'd like to know if there is any regression between the 2.7 and 3.x series.

thanks.

Marcos


- Original Message -
From: da...@lang.hm da...@lang.hm
To: Amos Jeffries squ...@treenet.co.nz
Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
Sent: Saturday, 9 April 2011 12:56:12
Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

On Sat, 9 Apr 2011, Amos Jeffries wrote:

 On 09/04/11 14:27, da...@lang.hm wrote:
 A couple more things about the ACLs used in my test
 
 all of them are allow ACLs (no deny rules to worry about precedence of)
 except for a deny-all at the bottom
 
 the ACL line that permits the test source to the test destination has
 zero overlap with the rest of the rules
 
 every rule has an IP based restriction (even the ones with url_regex are
 source - URL regex)
 
 I moved the ACL that allows my test from the bottom of the ruleset to
 the top and the resulting performance numbers were up as if the other
 ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
 rule.
 
 I changed one of the url_regex rules to just match one line rather than
 a file containing 307 lines to see if that made a difference, and it
 made no significant difference. So this indicates to me that it's not
 having to fully evaluate every rule (it's able to skip doing the regex
 if the IP match doesn't work)
 
 I then changed all the acl lines that used hostnames to have IP
 addresses in them, and this also made no significant difference
 
 I then changed all subnet matches to single IP address (just nuked /##
 throughout the config file) and this also made no significant difference.
 
 
 Squid has always worked this way. It will *test* every rule from the top down 
to the one that matches. Also testing each line left-to-right until one fails or 
the whole line matches.
 
 
 so why are the address matches so expensive
 
 
 3.0 and older IP address is a 32-bit comparison.
 3.1 and newer IP address is a 128-bit comparison with memcmp().
 
 If something like a word-wise comparison can be implemented faster than 
memcmp() we would welcome it.

I wonder if there should be a different version that's used when IPv6 is 
disabled. this is a pretty large hit.

if the data is aligned properly, on a 64 bit system this should still only be 2 
compares. do you do any alignment on the data now?

 and as noted in the e-mail below, why do these checks not scale nicely
 with the number of worker processes? If they did, the fact that one 3.2
 process is about 1/3 the speed of a 3.0 process in checking the acls
 wouldn't matter nearly as much when it's so easy to get an 8+ core system.
 
 
 There you have the unknown.

I think this is a fairly critical thing to figure out.

 
 it seems to me that all accept/deny rules in a set should be able to be
 combined into a tree to make searching them very fast.
 
 so for example if you have
 
 accept 1
 accept 2
 deny 3
 deny 4
 accept 5
 
 you need to create three trees (one with accept 1 and accept 2, one with
 deny3 and deny4, and one with accept 5) and then check each tree to see
 if you have a match.
 
 the types of match could be done in order of increasing cost, so if you
 
 The config file is a specific structure configured by the admin under 
guaranteed rules of operation for access lines (top-down, left-to-right, 
first-match-wins) to perform boolean-logic calculations using ACL sets.
 Sorting access line rules is not an option.
 Sorting ACL values and tree-forming them is already done (regex being the one 
exception AFAIK).
 Sorting position-wise on a single access line is also ruled out by 
interactions with deny_info, auth and external ACL types.

It would seem that as long as you don't cross boundaries between the different 
types, you should be able to optimize within a group.

using my example above, you couldn't combine the 'accept 5' with any of the 
other accepts, but you could combine accept 1 and 2 and combine deny 3 and 4 
together.

now, I know that I don't fully understand all the possible ACL types, so this 
may not work for some of them, but I believe that a fairly common use case is 
to have either a lot of allow rules, or a lot of deny rules as a block (either 
a list of sites you are allowed to access, or a list of sites that are 
blocked), so an ability to optimize these use cases may be well worth it.
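
As a rough sketch of this grouping idea, assuming a run of rules that differ 
only in the source address they match (the types and names here are 
illustrative, not Squid's):

#include <cstdint>
#include <set>
#include <utility>
#include <vector>

enum class Action { Allow, Deny };

struct RuleGroup {
    Action action;
    std::set<std::uint32_t> srcAddrs;   // sorted: membership test is O(log n)
};

// Merge consecutive same-action rules into one group; checking groups in
// build order preserves the first-match-wins semantics of the rule list.
std::vector<RuleGroup> buildGroups(
    const std::vector<std::pair<Action, std::uint32_t>> &rules)
{
    std::vector<RuleGroup> groups;
    for (const auto &r : rules) {
        if (groups.empty() || groups.back().action != r.first)
            groups.push_back(RuleGroup{r.first, {}});
        groups.back().srcAddrs.insert(r.second);
    }
    return groups;
}

Action check(const std::vector<RuleGroup> &groups, std::uint32_t src)
{
    for (const auto &g : groups)
        if (g.srcAddrs.count(src))
            return g.action;
    return Action::Deny;                // the deny-all at the bottom
}

With the accept 1 / accept 2 / deny 3 / deny 4 / accept 5 example this yields 
exactly the three groups described, checked in order.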

 have acl entries of type port, src, dst, and url regex, organize the
 tree so that you check ports first, then src, then dst, then only if all
 that matches do you need to do the regex. This would be very similar to
 the shortcut logic that you use today with a single rule where you bail
 out when you don't find a match.
 
 you could go with a complex tree structure, but since this only needs to
 be changed at boot time,
 
 Um, boot/startup time and arbitrary -k reconfigure times.
 With a reverse-configuration display dump on any cache manager request

Re: Res: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-13 Thread david
sorry, haven't had time to do that yet. I will try and get this done 
today.


David Lang

On Wed, 13 Apr 2011, Marcos wrote:


Date: Wed, 13 Apr 2011 04:11:09 -0700 (PDT)
From: Marcos mczue...@yahoo.com.br
To: da...@lang.hm, Amos Jeffries squ...@treenet.co.nz
Cc: squid-users@squid-cache.org, squid-...@squid-cache.org
Subject: Res: [squid-users] squid 3.2.0.5 smp scaling issues

Hi David,

could you run and publish your benchmark with squid 2.7?
I'd like to know if there is any regression between the 2.7 and 3.x series.

thanks.

Marcos


- Original Message 
From: da...@lang.hm da...@lang.hm
To: Amos Jeffries squ...@treenet.co.nz
Cc: squid-users@squid-cache.org; squid-...@squid-cache.org
Sent: Saturday, 9 April 2011 12:56:12
Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

On Sat, 9 Apr 2011, Amos Jeffries wrote:


On 09/04/11 14:27, da...@lang.hm wrote:

A couple more things about the ACLs used in my test

all of them are allow ACLs (no deny rules to worry about precedence of)
except for a deny-all at the bottom

the ACL line that permits the test source to the test destination has
zero overlap with the rest of the rules

every rule has an IP based restriction (even the ones with url_regex are
source - URL regex)

I moved the ACL that allows my test from the bottom of the ruleset to
the top and the resulting performance numbers were up as if the other
ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
rule.

I changed one of the url_regex rules to just match one line rather than
a file containing 307 lines to see if that made a difference, and it
made no significant difference. So this indicates to me that it's not
having to fully evaluate every rule (it's able to skip doing the regex
if the IP match doesn't work)

I then changed all the acl lines that used hostnames to have IP
addresses in them, and this also made no significant difference

I then changed all subnet matches to single IP address (just nuked /##
throughout the config file) and this also made no significant difference.



Squid has always worked this way. It will *test* every rule from the top down 
to the one that matches. Also testing each line left-to-right until one fails or 
the whole line matches.




so why are the address matches so expensive



3.0 and older IP address is a 32-bit comparison.
3.1 and newer IP address is a 128-bit comparison with memcmp().

If something like a word-wise comparison can be implemented faster than 
memcmp() we would welcome it.


I wonder if there should be a different version that's used when IPv6 is 
disabled. This is a pretty large hit.


if the data is aligned properly, on a 64 bit system this should still only be 2 
compares. do you do any alignment on the data now?



and as noted in the e-mail below, why do these checks not scale nicely
with the number of worker processes? If they did, the fact that one 3.2
process is about 1/3 the speed of a 3.0 process in checking the acls
wouldn't matter nearly as much when it's so easy to get an 8+ core system.



There you have the unknown.


I think this is a fairly critical thing to figure out.



it seems to me that all accept/deny rules in a set should be able to be
combined into a tree to make searching them very fast.

so for example if you have

accept 1
accept 2
deny 3
deny 4
accept 5

you need to create three trees (one with accept 1 and accept 2, one with
deny3 and deny4, and one with accept 5) and then check each tree to see
if you have a match.

the types of match could be done in order of increasing cost, so if you


The config file is a specific structure configured by the admin under guaranteed 
rules of operation for access lines (top-down, left-to-right, first-match-wins) 
to perform boolean-logic calculations using ACL sets.

Sorting access line rules is not an option.
Sorting ACL values and tree-forming them is already done (regex being the one 
exception AFAIK).
Sorting position-wise on a single access line is also ruled out by interactions 
with deny_info, auth and external ACL types.


It would seem that as long as you don't cross boundaries between the different 
types, you should be able to optimize within a group.


using my example above, you couldn't combine the 'accept 5' with any of the 
other accepts, but you could combine accept 1 and 2 and combine deny 3 and 4 
together.


now, I know that I don't fully understand all the possible ACL types, so this 
may not work for some of them, but I believe that a fairly common use case is to 
have either a lot of allow rules, or a lot of deny rules as a block (either a 
list of sites you are allowed to access, or a list of sites that are blocked), 
so an ability to optimize these use cases may be well worth it.



have acl entries of type port, src, dst, and url regex, organize the
tree so that you check ports first, then src, then dst, then only if all
that matches do you need to do the regex. This would be very similar to
the shortcut logic

Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-09 Thread Amos Jeffries

On 09/04/11 14:27, da...@lang.hm wrote:

A couple more things about the ACLs used in my test

all of them are allow ACLs (no deny rules to worry about precedence of)
except for a deny-all at the bottom

the ACL line that permits the test source to the test destination has
zero overlap with the rest of the rules

every rule has an IP based restriction (even the ones with url_regex are
source - URL regex)

I moved the ACL that allows my test from the bottom of the ruleset to
the top and the resulting performance numbers were up as if the other
ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
rule.

I changed one of the url_regex rules to just match one line rather than
a file containing 307 lines to see if that made a difference, and it
made no significant difference. So this indicates to me that it's not
having to fully evaluate every rule (it's able to skip doing the regex
if the IP match doesn't work)

I then changed all the acl lines that used hostnames to have IP
addresses in them, and this also made no significant difference

I then changed all subnet matches to single IP address (just nuked /##
throughout the config file) and this also made no significant difference.



Squid has always worked this way. It will *test* every rule from the top 
down to the one that matches. Also testing each line left-to-right until 
one fails or the whole line matches.




so why are the address matches so expensive



3.0 and older IP address is a 32-bit comparison.
3.1 and newer IP address is a 128-bit comparison with memcmp().

If something like a word-wise comparison can be implemented faster than 
memcmp() we would welcome it.




and as noted in the e-mail below, why do these checks not scale nicely
with the number of worker processes? If they did, the fact that one 3.2
process is about 1/3 the speed of a 3.0 process in checking the acls
wouldn't matter nearly as much when it's so easy to get an 8+ core system.



There you have the unknown.



it seems to me that all accept/deny rules in a set should be able to be
combined into a tree to make searching them very fast.

so for example if you have

accept 1
accept 2
deny 3
deny 4
accept 5

you need to create three trees (one with accept 1 and accept 2, one with
deny3 and deny4, and one with accept 5) and then check each tree to see
if you have a match.

the types of match could be done in order of increasing cost, so if you


The config file is a specific structure configured by the admin under 
guaranteed rules of operation for access lines (top-down, left-to-right, 
first-match-wins) to perform boolean-logic calculations using ACL sets.

 Sorting access line rules is not an option.
 Sorting ACL values and tree-forming them is already done (regex being 
the one exception AFAIK).
 Sorting position-wise on a single access line is also ruled out by 
interactions with deny_info, auth and external ACL types.




have acl entries of type port, src, dst, and url regex, organize the
tree so that you check ports first, then src, then dst, then only if all
that matches do you need to do the regex. This would be very similar to
the shortcut logic that you use today with a single rule where you bail
out when you don't find a match.

you could go with a complex tree structure, but since this only needs to
be changed at boot time,


Um, boot/startup time and arbitrary -k reconfigure times.
With a reverse-configuration display dump on any cache manager request.


it seems to me that a simple array that you can
do a binary search on will work for the port, src, and dst trees. The
url regex is probably easiest to initially create by just doing a list
of regex strings to match and working down that list, but eventually it


This is already how we do these. But with a splay tree instead of binary.


may be best to create a parse tree so that you only have to walk down
the string once to see if you have a match.


That would be nice. Care to implement?
 You just have to get the regex library to adjust its pre-compiled 
patterns with OR into (existing|new) whenever a new pattern string is 
added to an ACL.
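
As a minimal sketch of that (existing|new) folding, using std::regex purely 
for illustration (Squid's actual regex handling is not shown here):

#include <regex>
#include <string>
#include <vector>

// Fold many url_regex patterns into one alternation, so each URL is
// scanned once by a single pre-compiled pattern instead of once per
// pattern in a list.
std::regex combine(const std::vector<std::string> &patterns)
{
    std::string joined;
    for (const auto &p : patterns) {
        if (!joined.empty())
            joined += '|';
        joined += '(' + p + ')';        // parenthesize before OR-ing
    }
    return std::regex(joined, std::regex::extended);
}

// e.g. combine({"\\.exe$", "ads\\.", "tracker"}) compiles
// (\.exe$)|(ads\.)|(tracker), and std::regex_search() then walks each
// URL only once.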




you wouldn't quite be able to get this fast as you would have to
actually do two checks, one if you have a match on that level and one
for the rules that don't specify something in the current tree (one
check for if the http_access line specifies a port number and one for if
it doesn't for example)


We get around this problem by using C++ types. ACLChecklist walks the 
tree holding the current location, expected result, and all details 
available about the transaction. Each node in the tree has a match() 
function which gets called at most once per walk. Each ACL data type 
provides its own match() algorithm.


That is why the following config is invalid:
 acl foo src 1.2.3.4
 acl foo port 80
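
A bare-bones sketch of that shape, with hypothetical node types (Squid's real 
ACLChecklist and ACL classes carry far more state than this):

#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Transaction {            // the details available about the transaction
    std::string srcIp;
    int port;
};

struct AclNode {                // each ACL data type supplies its own match()
    virtual ~AclNode() = default;
    virtual bool match(const Transaction &) const = 0;
};

struct SrcAcl : AclNode {
    std::string addr;
    explicit SrcAcl(std::string a) : addr(std::move(a)) {}
    bool match(const Transaction &t) const override { return t.srcIp == addr; }
};

struct PortAcl : AclNode {
    int port;
    explicit PortAcl(int p) : port(p) {}
    bool match(const Transaction &t) const override { return t.port == port; }
};

// One access line: walked left-to-right, each node consulted at most once,
// bailing out at the first node that fails to match.
bool lineMatches(const std::vector<std::unique_ptr<AclNode>> &line,
                 const Transaction &t)
{
    for (const auto &node : line)
        if (!node->match(t))
            return false;
    return true;
}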



this sort of acl structure would reduce a complex ruleset down to ~O(log
n) instead of the current O(n) (a really complex ruleset would be log n
of each tree added 

Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-09 Thread david

On Sat, 9 Apr 2011, Amos Jeffries wrote:


On 09/04/11 14:27, da...@lang.hm wrote:

A couple more things about the ACLs used in my test

all of them are allow ACLs (no deny rules to worry about precedence of)
except for a deny-all at the bottom

the ACL line that permits the test source to the test destination has
zero overlap with the rest of the rules

every rule has an IP based restriction (even the ones with url_regex are
source - URL regex)

I moved the ACL that allows my test from the bottom of the ruleset to
the top and the resulting performance numbers were up as if the other
ACLs didn't exist. As such it is very clear that 3.2 is evaluating every
rule.

I changed one of the url_regex rules to just match one line rather than
a file containing 307 lines to see if that made a difference, and it
made no significant difference. So this indicates to me that it's not
having to fully evaluate every rule (it's able to skip doing the regex
if the IP match doesn't work)

I then changed all the acl lines that used hostnames to have IP
addresses in them, and this also made no significant difference

I then changed all subnet matches to single IP address (just nuked /##
throughout the config file) and this also made no significant difference.



Squid has always worked this way. It will *test* every rule from the top down 
to the one that matches. Also testing each line left-to-right until one fails 
or the whole line matches.




so why are the address matches so expensive



3.0 and older IP address is a 32-bit comparison.
3.1 and newer IP address is a 128-bit comparison with memcmp().

If something like a word-wise comparison can be implemented faster than 
memcmp() we would welcome it.


I wonder if there should be a different version that's used when IPv6 is 
disabled. This is a pretty large hit.


if the data is aligned properly, on a 64 bit system this should still only 
be 2 compares. do you do any alignment on the data now?



and as noted in the e-mail below, why do these checks not scale nicely
with the number of worker processes? If they did, the fact that one 3.2
process is about 1/3 the speed of a 3.0 process in checking the acls
wouldn't matter nearly as much when it's so easy to get an 8+ core system.



There you have the unknown.


I think this is a fairly critical thing to figure out.



it seems to me that all accept/deny rules in a set should be able to be
combined into a tree to make searching them very fast.

so for example if you have

accept 1
accept 2
deny 3
deny 4
accept 5

you need to create three trees (one with accept 1 and accept 2, one with
deny3 and deny4, and one with accept 5) and then check each tree to see
if you have a match.

the types of match could be done in order of increasing cost, so if you


The config file is a specific structure configured by the admin under guaranteed 
rules of operation for access lines (top-down, left-to-right, 
first-match-wins) to perform boolean-logic calculations using ACL sets.

Sorting access line rules is not an option.
Sorting ACL values and tree-forming them is already done (regex being the 
one exception AFAIK).
Sorting position-wise on a single access line is also ruled out by 
interactions with deny_info, auth and external ACL types.


It would seem that as long as you don't cross boundaries between the 
different types, you should be able to optimize within a group.


using my example above, you couldn't combine the 'accept 5' with any of 
the other accepts, but you could combine accept 1 and 2 and combine deny 3 
and 4 together.


now, I know that I don't fully understand all the possible ACL types, so 
this may not work for some of them, but I believe that a fairly common use 
case is to have either a lot of allow rules, or a lot of deny rules as a 
block (either a list of sites you are allowed to access, or a list of 
sites that are blocked), so an ability to optimize these use cases may be 
well worth it.



have acl entries of type port, src, dst, and url regex, organize the
tree so that you check ports first, then src, then dst, then only if all
that matches do you need to do the regex. This would be very similar to
the shortcut logic that you use today with a single rule where you bail
out when you don't find a match.

you could go with a complex tree structure, but since this only needs to
be changed at boot time,


Um, boot/startup time and arbitrary -k reconfigure times.
With a reverse-configuration display dump on any cache manager request.


still a pretty rare case, and one where you can build a completely new 
ruleset and swap it out. My point was that this isn't something that you 
have to be able to update dynamically.



it seems to me that a simple array that you can
do a binary search on will work for the port, src, and dst trees. The
url regex is probably easiest to initially create by just doing a list
of regex strings to match and working down that list, but eventually it


This is already how we do these. But 

Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-08 Thread david
/sec
3.2.0.6 with 5 workers gets 15,800 requests/sec
3.2.0.6 with 6 workers gets 16,400 requests/sec

David Lang



On Fri, 8 Apr 2011, Amos Jeffries wrote:


Date: Fri, 08 Apr 2011 15:37:24 +1200
From: Amos Jeffries squ...@treenet.co.nz
To: squid-users@squid-cache.org
Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

On 08/04/11 14:32, da...@lang.hm wrote:

sorry for the delay. I got a chance to do some more testing (slightly
different environment on the apache server, so these numbers are a
little lower for the same versions than the last ones I posted)

results when requesting short html page


squid 3.0.STABLE12 4000 requests/sec
squid 3.1.11 1500 requests/sec
squid 3.1.12 1530 requests/sec
squid 3.2.0.5 1 worker 1300 requests/sec
squid 3.2.0.5 2 workers 2050 requests/sec
squid 3.2.0.5 3 workers 2700 requests/sec
squid 3.2.0.5 4 workers 2950 requests/sec
squid 3.2.0.5 5 workers 2900 requests/sec
squid 3.2.0.5 6 workers 2530 requests/sec
squid 3.2.0.6 1 worker 1400 requests/sec
squid 3.2.0.6 2 workers 2050 requests/sec
squid 3.2.0.6 3 workers 2730 requests/sec
squid 3.2.0.6 4 workers 2950 requests/sec
squid 3.2.0.6 5 workers 2830 requests/sec
squid 3.2.0.6 6 workers 2530 requests/sec
squid 3.2.0.6 7 workers 2160 requests/sec; instead of all processes being
at 100%, several were at 99%
squid 3.2.0.6 8 workers 1950 requests/sec; instead of all processes being
at 100%, some were as low as 92%

so the new versions are really about the same

moving to large requests cut these numbers by about 1/3, but the squid
processes were not maxing out the CPU

one issue I saw, I had to reduce the number of concurrent connections or
I would have requests time out (3.2 vs earlier versions), on 3.2 I had
to have -c on ab at ~100-150 where I could go significantly higher on
3.1 and 3.0

David Lang



Thank you.
So with small files, a 2% gain on 3.1 and ~7% on 3.2 with a single worker. But 
under 1% on multiple 3.2 workers.

And overloading/flooding the I/O bandwidth on large files.

NP: when overloading I/O one cannot compare to runs with different sizes. 
Only with runs of the same traffic. Also only the CPU max load is reliable 
there, since requests/sec bottlenecks behind the I/O.

So... your measure that CPU dropped is a good sign for large files.

Amos





Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-07 Thread david
sorry for the delay. I got a chance to do some more testing (slightly 
different environment on the apache server, so these numbers are a 
little lower for the same versions than the last ones I posted)


results when requesting short html page


squid 3.0.STABLE12 4000 requests/sec
squid 3.1.11 1500 requests/sec
squid 3.1.12 1530 requests/sec
squid 3.2.0.5 1 worker 1300 requests/sec
squid 3.2.0.5 2 workers 2050 requests/sec
squid 3.2.0.5 3 workers 2700 requests/sec
squid 3.2.0.5 4 workers 2950 requests/sec
squid 3.2.0.5 5 workers 2900 requests/sec
squid 3.2.0.5 6 workers 2530 requests/sec
squid 3.2.0.6 1 worker 1400 requests/sec
squid 3.2.0.6 2 workers 2050 requests/sec
squid 3.2.0.6 3 workers 2730 requests/sec
squid 3.2.0.6 4 workers 2950 requests/sec
squid 3.2.0.6 5 workers 2830 requests/sec
squid 3.2.0.6 6 workers 2530 requests/sec
squid 3.2.0.6 7 workers 2160 requests/sec; instead of all processes being at 
100%, several were at 99%
squid 3.2.0.6 8 workers 1950 requests/sec; instead of all processes being at 
100%, some were as low as 92%

so the new versions are really about the same

moving to large requests cut these numbers by about 1/3, but the squid 
processes were not maxing out the CPU


one issue I saw, I had to reduce the number of concurrent connections or I 
would have requests time out (3.2 vs earlier versions), on 3.2 I had to 
have -c on ab at ~100-150 where I could go significantly higher on 3.1 and 
3.0


David Lang


On Mon, 4 Apr 2011, da...@lang.hm wrote:


On Mon, 4 Apr 2011, Amos Jeffries wrote:


On 03/04/11 12:52, da...@lang.hm wrote:

still no response from anyone.

Is there any interest in investigating this issue? Or should I just
write off squid for future use due to its performance degrading?


It is a very ambiguous issue..
* We have your report with some nice rate benchmarks indicating regression
* We have two others saying me-too with less details
* We have an independent report indicating that 3.1 is faster than 2.7. 
With benchmarks to prove it.
* We have several independent reports indicating that 3.2 is faster than 
3.1. One like yours with benchmark proof.
* We have someone responding to your report saying the CPU type affects 
things in a large way (likely due to SMP using CPU-level features)
* We have our own internal testing which also shows a mix of results with 
the variance being dependent on which component of Squid is tested.


Your test in particular is testing both the large object pass-thru (proxy 
only) capacity and the parser CPU ceiling.


Could you try your test on 3.2.0.6 and 3.1.12 please? They both now have a 
server-facing buffer change which should directly affect your test results 
in a good way.


thanks for the response, part of my frustration was just not hearing anything 
back.


I'll do the tests on the new version shortly (hopefully on monday)

if there are other tests that people would like me to perform on the hardware 
I have available, please let me know.


right now I am just testing proxy/ACL with no caching, but I am testing four 
traffic types


1. small static files
2. large static files
3. small dynamic files (returning the exact same data as 1, but only after a 
fixed delay)

4. large dynamic files.

while I see a dramatic difference in the performance on the different tests, 
so far the ratios between the different versions have been consistent across 
all four scenarios.


David Lang



Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-07 Thread Amos Jeffries

On 08/04/11 14:32, da...@lang.hm wrote:

sorry for the delay. I got a chance to do some more testing (slightly
different environment on the apache server, so these numbers are a
little lower for the same versions than the last ones I posted)

results when requesting short html page


squid 3.0.STABLE12 4000 requests/sec
squid 3.1.11 1500 requests/sec
squid 3.1.12 1530 requests/sec
squid 3.2.0.5 1 worker 1300 requests/sec
squid 3.2.0.5 2 workers 2050 requests/sec
squid 3.2.0.5 3 workers 2700 requests/sec
squid 3.2.0.5 4 workers 2950 requests/sec
squid 3.2.0.5 5 workers 2900 requests/sec
squid 3.2.0.5 6 workers 2530 requests/sec
squid 3.2.0.6 1 worker 1400 requests/sec
squid 3.2.0.6 2 workers 2050 requests/sec
squid 3.2.0.6 3 workers 2730 requests/sec
squid 3.2.0.6 4 workers 2950 requests/sec
squid 3.2.0.6 5 workers 2830 requests/sec
squid 3.2.0.6 6 workers 2530 requests/sec
squid 3.2.0.6 7 workers 2160 requests/sec; instead of all processes being
at 100%, several were at 99%
squid 3.2.0.6 8 workers 1950 requests/sec; instead of all processes being
at 100%, some were as low as 92%

so the new versions are really about the same

moving to large requests cut these numbers by about 1/3, but the squid
processes were not maxing out the CPU

one issue I saw, I had to reduce the number of concurrent connections or
I would have requests time out (3.2 vs earlier versions), on 3.2 I had
to have -c on ab at ~100-150 where I could go significantly higher on
3.1 and 3.0

David Lang



Thank you.
 So with small files, a 2% gain on 3.1 and ~7% on 3.2 with a single worker. But 
under 1% on multiple 3.2 workers.

 And overloading/flooding the I/O bandwidth on large files.

NP: when overloading I/O one cannot compare to runs with different 
sizes. Only with runs of the same traffic. Also only the CPU max load is 
reliable there, since requests/sec bottlenecks behind the I/O.

 So... your measure that CPU dropped is a good sign for large files.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.12
  Beta testers wanted for 3.2.0.6


Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-04 Thread david

On Mon, 4 Apr 2011, Amos Jeffries wrote:


On 03/04/11 12:52, da...@lang.hm wrote:

still no response from anyone.

Is there any interest in investigating this issue? Or should I just
write off squid for future use due to its performance degrading?


It is a very ambiguous issue..
* We have your report with some nice rate benchmarks indicating regression
* We have two others saying me-too with less details
* We have an independent report indicating that 3.1 is faster than 2.7. With 
benchmarks to prove it.
* We have several independent reports indicating that 3.2 is faster than 
3.1. One like yours with benchmark proof.
* We have someone responding to your report saying the CPU type affects 
things in a large way (likely due to SMP using CPU-level features)
* We have our own internal testing which also shows a mix of results with 
the variance being dependent on which component of Squid is tested.


Your test in particular is testing both the large object pass-thru (proxy 
only) capacity and the parser CPU ceiling.


Could you try your test on 3.2.0.6 and 3.1.12 please? They both now have a 
server-facing buffer change which should directly affect your test results in 
a good way.


thanks for the response, part of my frustration was just not hearing 
anything back.


I'll do the tests on the new version shortly (hopefully on monday)

if there are other tests that people would like me to perform on the 
hardware I have available, please let me know.


right now I am just testing proxy/ACL with no caching, but I am testing 
four traffic types


1. small static files
2. large static files
3. small dynamic files (returning the exact same data as 1, but only after 
a fixed delay)

4. large dynamic files.

while I see a dramatic difference in the performance on the different 
tests, so far the ratios between the different versions have been 
consistent across all four scenarios.


David Lang


Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-03 Thread Amos Jeffries

On 03/04/11 12:52, da...@lang.hm wrote:

still no response from anyone.

Is there any interest in investigating this issue? Or should I just
write off squid for future use due to its performance degrading?


It is a very ambiguous issue..
 * We have your report with some nice rate benchmarks indicating regression
 * We have two others saying me-too with less details
 * We have an independent report indicating that 3.1 is faster than 
2.7. With benchmarks to prove it.
 * We have several independent reports indicating that 3.2 is faster 
than 3.1. One like yours with benchmark proof.
 * We have someone responding to your report saying the CPU type 
affects things in a large way (likely due to SMP using CPU-level features)
 * We have our own internal testing which also shows a mix of results 
with the variance being dependent on which component of Squid is tested.


Your test in particular is testing both the large object pass-thru 
(proxy only) capacity and the parser CPU ceiling.


Could you try your test on 3.2.0.6 and 3.1.12 please? They both now have 
a server-facing buffer change which should directly affect your test 
results in a good way.


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.12
  Beta testers wanted for 3.2.0.6


Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-04-02 Thread david

still no response from anyone.

Is there any interest in investigating this issue? Or should I just write 
off squid for future use due to its performance degrading?


David Lang

On Sat, 26 Mar 2011, da...@lang.hm wrote:


Subject: Re: [squid-users] squid 3.2.0.5 smp scaling issues

re-sending and adding -dev list

performance drops going from 3.0 - 3.1 - 3.2 and in addition squid 3.2 
scales poorly (only goes up to 2x single-threaded performance going up to 4 
cores and drops off again after that)


this makes it so that I actually get better performance on 3.0 than on 3.2, 
even with multiple workers


David Lang

On Mon, 21 Mar 2011, da...@lang.hm wrote:


Date: Mon, 21 Mar 2011 19:26:38 -0700 (PDT)
From: da...@lang.hm
To: squid-users@squid-cache.org
Subject: [squid-users] squid 3.2.0.5 smp scaling issues

test setup

box A running apache and ab

test against local IP address: 13000 requests/sec

box B running squid, 8 2.3 GHz Opteron cores with 16G ram

non acl/cache-peer related lines in the config are (including typos from me 
manually entering this)


http_port 8000
icp_port 0
visible_hostname gromit1
cache_effective_user proxy
cache_effective_group proxy
append_domain .invalid.server.name
pid_filename /var/run/squid.pid
cache_dir null /tmp
client_db off
cache_access_log syslog squid
cache_log /var/log/squid/cache.log
cache_store_log none
coredump_dir none
no_cache deny all
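
(For context: a test like this is typically driven from box A with an 
ApacheBench command of this general shape, where -n is the total request 
count and -c the concurrency level; the URL and counts here are illustrative, 
not taken from the thread:

ab -n 100000 -c 100 http://box-b:8000/test.html

with box-b:8000 standing in for the squid host and the http_port above.)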


results when requesting short html page
squid 3.0.STABLE12 4200 requests/sec

squid 3.1.11 2100 requests/sec
squid 3.2.0.5 1 worker 1400 requests/sec
squid 3.2.0.5 2 workers 2100 requests/sec
squid 3.2.0.5 3 workers 2500 requests/sec
squid 3.2.0.5 4 workers 2900 requests/sec
squid 3.2.0.5 5 workers 2900 requests/sec
squid 3.2.0.5 6 workers 2500 requests/sec
squid 3.2.0.5 7 workers 2000 requests/sec
squid 3.2.0.5 8 workers 1900 requests/sec

in all these tests the squid process was using 100% of the cpu

I tried it pulling a large file (100K instead of 50 bytes) on the thought 
that this may be bottlenecking on accepting the connections, and that with 
something that took more time to service the connections it could do better; 
however, what I found is that with 8 workers all 8 were using 50% of the 
CPU at 1000 requests/sec


local machine would do 7000 requests/sec to itself

1 worker 500 requests/sec
2 workers 957 requests/sec

from there it remained about 1000 requests/sec with the cpu utilization 
slowly dropping off (but not dropping as fast as it should with the number 
of cores available)


so it looks like there is some significant bottleneck in version 3.2 that 
makes the SMP support fairly ineffective.



in reading the wiki page at wiki.squid-cache.org/Features/SmpScale I see 
you worrying about fairness between workers. If you have put in code to try 
and ensure fairness, you may want to remove it and see what happens to 
performance. what you are describing on that page in terms of fairness is 
what I would expect from a 'first-come-first-served' approach to multiple 
processes grabbing new connections. The worker that last ran is hot in the 
cache and so has an 'unfair' advantage in noticing and processing the new 
request, but as that worker gets busier, it will be spending more time 
servicing the request and the other processes will get more of a chance to 
grab the new connection, so it will appear unfair under light load, but 
become more fair under heavy load.
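
That first-come-first-served behaviour is what the classic pre-fork accept 
loop gives you; here is a minimal POSIX sketch (error handling omitted, four 
workers and port 8000 assumed for illustration) where the kernel hands each 
new connection to whichever blocked worker it wakes first:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in sa{};
    sa.sin_family = AF_INET;
    sa.sin_port = htons(8000);
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(fd, reinterpret_cast<sockaddr *>(&sa), sizeof(sa));
    listen(fd, 128);

    for (int i = 0; i < 4; ++i)         // four workers, one shared socket
        if (fork() == 0)
            for (;;) {
                // whichever worker the kernel wakes first wins the race
                int c = accept(fd, nullptr, nullptr);
                if (c >= 0)
                    close(c);           // ... service the request here ...
            }
    pause();                            // parent idles
}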


David Lang





Re: [squid-users] squid 3.2.0.5 smp scaling issues

2011-03-27 Thread david

re-sending and adding -dev list

performance drops going from 3.0 - 3.1 - 3.2 and in addition squid 3.2 
scales poorly (only goes up to 2x single-threaded performance going up to 
4 cores and drops off again after that)


this makes it so that I actually get better performance on 3.0 than on 
3.2, even with multiple workers


David Lang

On Mon, 21 Mar 2011, da...@lang.hm wrote:


Date: Mon, 21 Mar 2011 19:26:38 -0700 (PDT)
From: da...@lang.hm
To: squid-users@squid-cache.org
Subject: [squid-users] squid 3.2.0.5 smp scaling issues

test setup

box A running apache and ab

test against local IP address: 13000 requests/sec

box B running squid, 8 2.3 GHz Opteron cores with 16G ram

non acl/cache-peer related lines in the config are (including typos from me 
manually entering this)


http_port 8000
icp_port 0
visible_hostname gromit1
cache_effective_user proxy
cache_effective_group proxy
append_domain .invalid.server.name
pid_filename /var/run/squid.pid
cache_dir null /tmp
client_db off
cache_access_log syslog squid
cache_log /var/log/squid/cache.log
cache_store_log none
coredump_dir none
no_cache deny all


results when requesting short html page
squid 3.0.STABLE12 4200 requests/sec
squid 3.1.11 2100 requests/sec
squid 3.2.0.5 1 worker 1400 requests/sec
squid 3.2.0.5 2 workers 2100 requests/sec
squid 3.2.0.5 3 workers 2500 requests/sec
squid 3.2.0.5 4 workers 2900 requests/sec
squid 3.2.0.5 5 workers 2900 requests/sec
squid 3.2.0.5 6 workers 2500 requests/sec
squid 3.2.0.5 7 workers 2000 requests/sec
squid 3.2.0.5 8 workers 1900 requests/sec

in all these tests the squid process was using 100% of the cpu

I tried it pulling a large file (100K instead of 50 bytes) on the thought 
that this may be bottlenecking on accepting the connections, and that with 
something that took more time to service the connections it could do better; 
however, what I found is that with 8 workers all 8 were using 50% of the CPU 
at 1000 requests/sec


local machine would do 7000 requests/sec to itself

1 worker 500 requests/sec
2 workers 957 requests/sec

from there it remained about 1000 requests/sec with the cpu utilization 
slowly dropping off (but not dropping as fast as it should with the number of 
cores available)


so it looks like there is some significant bottleneck in version 3.2 that 
makes the SMP support fairly ineffective.



in reading the wiki page at wiki.squid-cache.org/Features/SmpScale I see you 
worrying about fairness between workers. If you have put in code to try and 
ensure fairness, you may want to remove it and see what happens to 
performance. what you are describing on that page in terms of fairness is 
what I would expect from a 'first-come-first-served' approach to multiple 
processes grabbing new connections. The worker that last ran is hot in the 
cache and so has an 'unfair' advantage in noticing and processing the new 
request, but as that worker gets busier, it will be spending more time 
servicing the request and the other processes will get more of a chance to 
grab the new connection, so it will appear unfair under light load, but 
become more fair under heavy load.


David Lang



[squid-users] squid 3.2.0.5 smp scaling issues

2011-03-21 Thread david

test setup

box A running apache and ab

test against local IP address: 13000 requests/sec

box B running squid, 8 2.3 GHz Opteron cores with 16G ram

non acl/cache-peer related lines in the config are (including typos from 
me manually entering this)


http_port 8000
icp_port 0
visible_hostname gromit1
cache_effective_user proxy
cache_effective_group proxy
append_domain .invalid.server.name
pid_filename /var/run/squid.pid
cache_dir null /tmp
client_db off
cache_access_log syslog squid
cache_log /var/log/squid/cache.log
cache_store_log none
coredump_dir none
no_cache deny all


results when requesting short html page 
squid 3.0.STABLE12 4200 requests/sec

squid 3.1.11 2100 requests/sec
squid 3.2.0.5 1 worker 1400 requests/sec
squid 3.2.0.5 2 workers 2100 requests/sec
squid 3.2.0.5 3 workers 2500 requests/sec
squid 3.2.0.5 4 workers 2900 requests/sec
squid 3.2.0.5 5 workers 2900 requests/sec
squid 3.2.0.5 6 workers 2500 requests/sec
squid 3.2.0.5 7 workers 2000 requests/sec
squid 3.2.0.5 8 workers 1900 requests/sec

in all these tests the squid process was using 100% of the cpu

I tried it pulling a large file (100K instead of 50 bytes) on the thought 
that this may be bottlenecking on accepting the connections, and that with 
something that took more time to service the connections it could do 
better; however, what I found is that with 8 workers all 8 were using 50% 
of the CPU at 1000 requests/sec


local machine would do 7000 requests/sec to itself

1 worker 500 requests/sec
2 workers 957 requests/sec

from there it remained about 1000 requests/sec with the cpu 
utilization slowly dropping off (but not dropping as fast as it should 
with the number of cores available)


so it looks like there is some significant bottleneck in version 3.2 that 
makes the SMP support fairly ineffective.



in reading the wiki page at wiki.squid-cache.org/Features/SmpScale I see 
you worrying about fairness between workers. If you have put in code to 
try and ensure fairness, you may want to remove it and see what happens to 
performance. what you are describing on that page in terms of fairness is 
what I would expect from a 'first-come-first-served' approach to multiple 
processes grabbing new connections. The worker that last ran is hot in the 
cache and so has an 'unfair' advantage in noticing and processing the new 
request, but as that worker gets busier, it will be spending more time 
servicing the request and the other processes will get more of a chance to 
grab the new connection, so it will appear unfair under light load, but 
become more fair under heavy load.


David Lang