Re: [squid-users] Re: [patch] Re: [squid-users] X-Forwarded-For and cache_peer_access -- Fixed!

2013-08-24 Thread Amos Jeffries

On 24/08/2013 5:50 p.m., David Isaacs wrote:

Amos,

I've also come across what Michael identified. This is actually a bug,
right? The checklist() constructor initialises checklist.src_addr correctly
based on acl_uses_indirect_client, but it is then overridden with the
request's true client_addr by the calling function.

I filed it as #3895
http://bugs.squid-cache.org/show_bug.cgi?id=3895


And applied. It should be in the next releases at the end of this month.

Amos


[squid-users] Re: icp_query_timeout directive is not working in 3.3.8 for some reason

2013-08-24 Thread x-man
Hi Guys,

It turned out the whole problem was caused by the ICP communication
between Squid and our cache peer, which also supports ICP. When we run
more than one worker, the ICP replies were not returning to the proper
worker, which is why the timeouts were happening; probably because ICP
is UDP-based.

After adding the following simple config 

# different icp port per worker in order to make it work
icp_port 313${process_number}



Now everything is working fine in the multi-worker environment with our
cache_peer.
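
For context, this is roughly how the relevant part of squid.conf looks now
(the worker count and the peer line below are illustrative placeholders,
not our exact production values):

# hypothetical SMP setup with 3 workers
workers 3

# different icp port per worker: 3131, 3132, 3133
icp_port 313${process_number}

# hypothetical ICP-capable parent peer (HTTP port 3128, ICP port 3130)
cache_peer peer.example.net parent 3128 3130 proxy-only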

 





[squid-users] Re: squid 3.2.0.14 with TPROXY => commBind: Cannot bind socket FD 773 to xxx.xxx.xxx.xx: (98) Address

2013-08-24 Thread x-man
Hi Amos,

I have exactly the same issue as described above.

Running Squid 3.3.8 in TPROXY mode.

In my setup Squid is serving around 1 online subscribers, and the problem
happens when I redirect the whole HTTP traffic through it. If I redirect
only half of the users, it works fine.

I guess it's something related to limits imposed by the OS or by Squid
itself. Please help me identify the exact bottleneck of this issue, because
it is a scalability issue.

squidclient mgr:info |grep HTTP
HTTP/1.1 200 OK
Number of HTTP requests received:   1454792
Average HTTP requests per minute since start:   116719.5

squidclient mgr:info |grep file
Maximum number of file descriptors:   524288
Largest file desc currently in use:   132904
Number of file desc currently in use: 80893
Available number of file descriptors: 443395
Reserved number of file descriptors:   800
Store Disk files open:   0

ulimit -a from the OS

core file size  (blocks, -c) unlimited
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 386229
max locked memory   (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files  (-n) 100
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 8192
cpu time   (seconds, -t) unlimited
max user processes  (-u) 386229
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited

Some tunings were applied as well, but they are not helping much:

echo applying specific tunings
echo 500 65535 > /proc/sys/net/ipv4/ip_local_port_range
echo 65000 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 5 > /proc/sys/net/core/netdev_max_backlog

echo 15 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 5 > /proc/sys/net/ipv4/tcp_keepalive_probes
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse            # it's ok
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle          # it's ok
echo 200 > /proc/sys/net/ipv4/tcp_max_tw_buckets    # default 262144 on Ubuntu


Let me know what other info might be useful for you.






Re: [squid-users] Re: squid 3.2.0.14 with TPROXY => commBind: Cannot bind socket FD 773 to xxx.xxx.xxx.xx: (98) Address

2013-08-24 Thread Amos Jeffries

On 24/08/2013 9:45 p.m., x-man wrote:

Hi Amos,

I have exactly the same issue as described above.

Running Squid 3.3.8 in TPROXY mode.

In my setup Squid is serving around 1 online subscribers, and the problem
happens when I redirect the whole HTTP traffic through it. If I redirect
only half of the users, it works fine.

I guess it's something related to limits imposed by the OS or by Squid
itself. Please help me identify the exact bottleneck of this issue, because
it is a scalability issue.

squidclient mgr:info |grep HTTP
HTTP/1.1 200 OK
Number of HTTP requests received:   1454792
Average HTTP requests per minute since start:   116719.5


Nice. With stats like these would you mind supplying the data necessary 
for an entry in this page?

 http://wiki.squid-cache.org/KnowledgeBase/Benchmarks
(see section 2 for how to calculate the datum).



squidclient mgr:info |grep file
Maximum number of file descriptors:   524288
Largest file desc currently in use:   132904
Number of file desc currently in use: 80893
Available number of file descriptors: 443395
Reserved number of file descriptors:   800
Store Disk files open:   0

ulimit -a from the OS

core file size  (blocks, -c) unlimited
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 386229
max locked memory   (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files  (-n) 100
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 8192
cpu time   (seconds, -t) unlimited
max user processes  (-u) 386229
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited

Some tunings were applied as well, but they are not helping much:

echo applying specific tunings
echo 500 65535 > /proc/sys/net/ipv4/ip_local_port_range
echo 65000 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 5 > /proc/sys/net/core/netdev_max_backlog

echo 15 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 5 > /proc/sys/net/ipv4/tcp_keepalive_probes
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse            # it's ok
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle          # it's ok
echo 200 > /proc/sys/net/ipv4/tcp_max_tw_buckets    # default 262144 on Ubuntu


Let me know what other info might be useful for you.



Unfortunately all I can do is point you at the known reasons for the 
message.
The things to figure out are whether there is some limit in the TPROXY
kernel code itself (the socket match module is the critical point, I think)
on how many sockets it can manage, or whether some of the traffic is coming
in excessive amounts from particular IPs, reducing the number of outgoing
connections available for them.


Amos


Re: [squid-users] Re: squid 3.2.0.14 with TPROXY => commBind: Cannot bind socket FD 773 to xxx.xxx.xxx.xx: (98) Address

2013-08-24 Thread Niki Gorchilov
Hi, Amos,

I'm working on the same project with Plamen.

 squidclient mgr:info |grep HTTP
 HTTP/1.1 200 OK
 Number of HTTP requests received:   1454792
 Average HTTP requests per minute since start:   116719.5


 Nice. With stats like these would you mind supplying the data necessary for
 an entry in this page?
  http://wiki.squid-cache.org/KnowledgeBase/Benchmarks
 (see section 2 for how to calculate the datum).

The moment we manage to fix this issue and are able to run Squid for
more than a few minutes without the commBind problem, I promise to submit
benchmarks for a setup twice this size. We just have to iron out this
issue first. :-)

 Unfortunately all I can do is point you at the known reasons for the
 message.
 The things to figure out are whether there is some limit in the TPROXY
 kernel code itself (the socket match module is the critical point, I think)
 on how many sockets it can manage, or whether some of the traffic is coming
 in excessive amounts from particular IPs, reducing the number of outgoing
 connections available for them.

Before digging deeper into the TPROXY kernel code, I'd like to clarify
one aspect of Squid's behaviour. Do you pass a port number (anything > 0)
in inaddr.ai_addr during the bind call? Sorry, I couldn't trace it
myself, as I haven't done much C/C++ programming since the early 90s. :-)

Is it Squid or the kernel that decides which port is used?

I believe the kernel would return EADDRNOTAVAIL if the ports for a specific
IP were exhausted, and in that case the commBind errors would cite one and
the same IP, which is not what we see: random IPs appear in the log. Very
few IPs have more (100-200) error log lines; most IPs are mentioned just
one, two or three times.

An EADDRINUSE error is a clear indication that the same IP:port pair is
already in use, or that someone else is listening on 0.0.0.0:same_port.

It would be of great help if we managed to log the port number together
with the address, in order to look for possible collisions with other
processes running on the machine (including all other Squid workers).

Thank you in advance for your support!

Best,
Niki


[squid-users] Re: cache_dir size v.s. available RAM

2013-08-24 Thread HillTopsGM


The rule-of-thumb is 15MB *per GB of cache size*. 


Thanks Amos - that's the rule of thumb I was looking for.

Somewhere along the way I thought I saw 10MB per GB of cache, but it was
vague.

I was also given a piece of advice to create multiple cache directories of
no more than 100 GB each.
I am guessing that this was to address the '16 256' question I had in
Question 2.

*IF* that is the case, would increasing it to 32 256 and having one cache
directory of 200GB not do the same thing as having two 100GB cache
directories, each set at 16 256?

I ask only to try to better understand what is taking place in the setup.
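
In squid.conf terms, the two layouts I'm comparing would be roughly as
follows (the paths and the aufs type are just placeholders for
illustration):

# option A: one 200GB cache_dir with 32 first-level directories
cache_dir aufs /cache1 204800 32 256

# option B: two 100GB cache_dirs, each with the default 16 first-level directories
cache_dir aufs /cache1 102400 16 256
cache_dir aufs /cache2 102400 16 256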







[squid-users] Re: cache_dir size v.s. available RAM

2013-08-24 Thread HillTopsGM
In connection with my last post, I also had this question:

Let's say that with my 4GB of RAM I decided to create a total cache storage
area that was 650GB; obviously the index would be much larger than could be
stored in RAM.
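(By the 15MB-per-GB rule of thumb from the earlier reply, that index would
be roughly 650 x 15MB, i.e. close to 10GB.)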

If my primary purpose was to 'archive' my windows updates, I'd expect that
it would take the system only a couple of seconds to review the index that
would spill over to the drive, and then we'd be back in business for the
updates - no?

I simply want the Proxy to help serve updates of all programs - Windows,
Browser updates like Firefox, Thunderbird, Adobe Reader, Skype, nVidia
Driver updates (100's of MB at a crack), etc, etc.

I was thinking of creating a rule (maybe someone could help me write it so
it makes sense) that all sites would be accessed directly and told NOT to
be cached.

For Example:

*STEP 1/4*
acl noproxy dstdomain .com .net .org   etc, etc. Would that work?

always_direct allow noproxy
cache deny noproxy

*STEP 2/4*:
Then, for each site *in particular* that I want cached (like the Windows
update sites), create rules like this:

never_direct deny windowsupdate
never_direct allow all
cache allow windowsupdate

NOTE: I chose 'windowsupdate' as that is what was used for the acl rules on
the FAQ page here: http://wiki.squid-cache.org/SquidFaq/WindowsUpdate

*STEP 3/4*: Next, I was thinking that I'd have to add ACLs for:
acl windowsupdate dstdomain microsoft.com
acl windowsupdate dstdomain windowsupdate.com
acl windowsupdate dstdomain my.windowsupdate.website.com

. . . as I see that those domains are part of the refresh rules for the
Windows updates but not of the ACLs, and I was thinking that if I didn't
add them, they would be allowed through as per the
never_direct allow all
rule.

Frankly, I was wondering why they WERE NOT included in the group of ACLs
listed on that page.
Comments on that?

*STEP 4/4*: Lastly, all I'd have to do is add ACLs for the other sites that
I want cached in addition to the Windows updates, like so:

acl windowsupdate dstdomain .mozilla.org
acl windowsupdate dstdomain .adobe.com
acl windowsupdate dstdomain .java.com
acl windowsupdate dstdomain .nvidia.com
etc, etc, etc,

. . . and I should be good to go.
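
To make the caching part concrete, this is how I'd gather it in one place
(a sketch only: the 'updates' ACL name is mine and the domain list simply
reuses the examples above):

# cache only the update-related domains, store nothing else
acl updates dstdomain .microsoft.com .windowsupdate.com
acl updates dstdomain .mozilla.org .adobe.com .java.com .nvidia.com

cache allow updates
cache deny all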

Does that make sense?





Re: [squid-users] Re: squid 3.2.0.14 with TPROXY => commBind: Cannot bind socket FD 773 to xxx.xxx.xxx.xx: (98) Address

2013-08-24 Thread Amos Jeffries

On 25/08/2013 3:12 a.m., Niki Gorchilov wrote:

Hi, Amos,

I'm working on the same project with Plamen.


squidclient mgr:info |grep HTTP
HTTP/1.1 200 OK
 Number of HTTP requests received:   1454792
 Average HTTP requests per minute since start:   116719.5


Nice. With stats like these would you mind supplying the data necessary for
an entry in this page?
  http://wiki.squid-cache.org/KnowledgeBase/Benchmarks
(see section 2 for how to calculate the datum).

The moment we manage to fix this issue and are able to run Squid for
more than a few minutes without the commBind problem, I promise to submit
benchmarks for a setup twice this size. We just have to iron out this
issue first. :-)


Unfortunately all I can do is point you at the known reasons for the
message.
The things to figure out are whether there is some limit in the TPROXY
kernel code itself (the socket match module is the critical point, I think)
on how many sockets it can manage, or whether some of the traffic is coming
in excessive amounts from particular IPs, reducing the number of outgoing
connections available for them.

Before digging deeper into the TPROXY kernel code, I'd like to clarify
one aspect of Squid's behaviour. Do you pass a port number (anything > 0)
in inaddr.ai_addr during the bind call? Sorry, I couldn't trace it
myself, as I haven't done much C/C++ programming since the early 90s. :-)

Is it Squid or the kernel that decides which port is used?


We pass the destination port:IP to connect() and then try to bind() to 
the client IP on port 0 for source. The kernel decides which port is 
available, then we retrieve its decision with getsockname().



I believe the kernel would return EADDRNOTAVAIL if the ports for a specific
IP were exhausted, and in that case the commBind errors would cite one and
the same IP, which is not what we see: random IPs appear in the log. Very
few IPs have more (100-200) error log lines; most IPs are mentioned just
one, two or three times.

An EADDRINUSE error is a clear indication that the same IP:port pair is
already in use, or that someone else is listening on 0.0.0.0:same_port.
It would be of great help if we managed to log the port number together
with the address, in order to look for possible collisions with other
processes running on the machine (including all other Squid workers).

Thank you in advance for your support!


You can add a debugs() line in src/comm.cc where commBind() is called from
comm_apply_flags(), to display the addr variable. The if-statements
above it have some examples.


However, the existing log line should be dumping out the full IP:port
details when a port > 0 is used, so you should be seeing a port there if
a port was sent to bind(). Your obfuscation indicates only an IPv4 address was used.


Amos


Re: [squid-users] Re: cache_dir size v.s. available RAM

2013-08-24 Thread Amos Jeffries

On 25/08/2013 11:20 a.m., HillTopsGM wrote:

In connection with my last post, I also had this question:

Let's say that with my 4GB of RAM I decided to create a total cache storage
area that was 650GB; obviously the index would be much larger than could be
stored in RAM.

If my primary purpose was to 'archive' my windows updates, I'd expect that
it would take the system only a couple of seconds to review the index that
would spill over to the drive, and then we'd be back in business for the
updates - no?


Sort of. This couple of seconds delay would happen on *every* HTTP 
request to the proxy.



I simply want the Proxy to help serve updates of all programs - Windows,
Browser updates like Firefox, Thunderbird, Adobe Reader, Skype, nVidia
Driver updates (100's of MB at a crack), etc, etc.

I was thinking of creating a rule (maybe someone could help me write it so
it makes sense) that all sites would be accessed directly and told NOT to
be cached.


You seem to have the common misunderstanding about what DIRECT is. HTTP 
permits an arbitrarily long chaining of proxies:


client -> A -> B -> C -> D -> E -> F -> ... -> origin server

always_direct causes Squid to ignore any cache_peer which you have
configured and to use a DNS lookup to fetch the object DIRECT-ly from the
origin, giving an error if the DNS produces no results or is not working.


never_direct does the opposite: it forces Squid to ignore DNS for the
domain being requested and just send the request to a cache_peer, giving
an error if the cache_peers are unavailable.


So:
* Squid always services the request received.

* 'cache deny xxx' prevents the response matching xxx from being stored, nothing more.

* refresh_pattern operates on already-stored content when determining
whether it can be a HIT or needs REFRESH-ing.
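
For example, a minimal sketch of the difference between the two directives
(the peer and the domain below are hypothetical placeholders):

# a hypothetical upstream peer
cache_peer upstream.example.net parent 3128 0 no-query

acl updates dstdomain .windowsupdate.com

# these requests ignore the peer and go DIRECT to the origin via DNS
always_direct allow updates

# everything else skips DNS and must go through the peer
never_direct allow all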


Amos