[squid-users] CLOSE_WAIT state in Squid leads to bandwidth drop
Hi All,

I am doing a small test for bandwidth measurement of my test setup while
Squid is running. I am running a script to pump traffic from a client
browser to a web server via the Squid box. The script creates around 50
user sessions and does wget fetches of randomly selected dynamic URLs.

After some time I observe a drop in bandwidth on the link connecting the
web server, even though there are no HITs in the Squid cache. Analyzing the
netstat output during the problem scenario, I can see the Recv-Q piling up
on Squid sockets in the CLOSE_WAIT TCP state, and Squid staying in
CLOSE_WAIT for more than a minute. The number of Squid sessions to the web
server drops from 70 to 5, while the TCP sessions from client to Squid
remain at around 80. Without Squid there is no drop in bandwidth under the
same load.

Why is bandwidth dropping when Squid is running? Please provide your
suggestions on this.

Logs

Squid version : 2.6.STABLE14

2013-11-25 10:17:53 Collecting netstat statistics...
tcp  248352  0  172.19.134.2:51439  194.50.177.163:80  CLOSE_WAIT  5477/(squid)
tcp   77229  0  172.19.134.2:41998  64.15.157.134:80   CLOSE_WAIT  5477/(squid)
tcp   15853  0  172.19.134.2:55344  64.136.20.39:80    CLOSE_WAIT  5477/(squid)
tcp   30022  0  172.19.134.2:47485  50.56.161.66:80    CLOSE_WAIT  5477/(squid)
tcp   30202  0  172.19.134.2:59213  198.90.22.194:80   CLOSE_WAIT  5477/(squid)
tcp    9787  0  172.19.134.2:52761  184.26.136.73:80   CLOSE_WAIT  5477/(squid)
tcp  106892  0  172.19.134.2:55109  184.26.136.115:80  CLOSE_WAIT  5477/(squid)

2013-11-25 10:18:42 Collecting netstat statistics...
tcp  248352  0  172.19.134.2:51439  194.50.177.163:80  CLOSE_WAIT  5477/(squid)
tcp   95558  0  172.19.134.2:42559  67.192.29.225:80   CLOSE_WAIT  5477/(squid)
tcp   77229  0  172.19.134.2:41998  64.15.157.134:80   CLOSE_WAIT  5477/(squid)
tcp   15853  0  172.19.134.2:55344  64.136.20.39:80    CLOSE_WAIT  5477/(squid)
tcp   30022  0  172.19.134.2:47485  50.56.161.66:80    CLOSE_WAIT  5477/(squid)
tcp   30202  0  172.19.134.2:59213  198.90.22.194:80   CLOSE_WAIT  5477/(squid)
tcp    9787  0  172.19.134.2:52761  184.26.136.73:80   CLOSE_WAIT  5477/(squid)
tcp  106892  0  172.19.134.2:55109  184.26.136.115:80  CLOSE_WAIT  5477/(squid)

Squid info :
---
Connection information for squid:
    Number of clients accessing cache:    3
    Number of HTTP requests received:     257549
    Number of ICP messages received:      0
    Number of ICP messages sent:          0
    Number of queued ICP replies:         0
    Request failure ratio:                0.00
    Average HTTP requests per minute since start:  1443.2
    Average ICP messages per minute since start:   0.0
    Select loop called: 4924570 times, 2.174 ms avg
Cache information for squid:
    Request Hit Ratios:        5min: 0.0%, 60min: 0.0%
    Byte Hit Ratios:           5min: -0.0%, 60min: 3.2%
    Request Memory Hit Ratios: 5min: 0.0%, 60min: 0.0%
    Request Disk Hit Ratios:   5min: 0.0%, 60min: 0.0%
    Storage Swap size:         107524 KB
    Storage Mem size:          8408 KB
    Mean Object Size:          20.69 KB
    Requests given to unlinkd: 0

Regards,
Saravanan N
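[Editorial aside: a quick way to quantify what is reported above is to aggregate Recv-Q bytes per TCP state from `netstat -tnp`-style output. This is a minimal sketch, not part of the original post; the sample lines are taken from the logs above, and feeding it live netstat output is left to the reader.]

```python
from collections import defaultdict

def recvq_by_state(netstat_lines):
    """Sum Recv-Q bytes and count sockets per TCP state.

    Expects lines in `netstat -tnp` format:
    proto Recv-Q Send-Q local-addr foreign-addr state pid/program
    """
    totals = defaultdict(lambda: [0, 0])  # state -> [socket count, Recv-Q bytes]
    for line in netstat_lines:
        fields = line.split()
        if len(fields) < 6 or fields[0] not in ("tcp", "tcp6"):
            continue
        state = fields[5]
        totals[state][0] += 1
        totals[state][1] += int(fields[1])
    return dict(totals)

# Sample lines taken from the netstat output in the post above.
sample = [
    "tcp 248352 0 172.19.134.2:51439 194.50.177.163:80 CLOSE_WAIT 5477/(squid)",
    "tcp  77229 0 172.19.134.2:41998 64.15.157.134:80  CLOSE_WAIT 5477/(squid)",
    "tcp      0 0 172.19.134.2:3128  10.0.0.5:51000    ESTABLISHED 5477/(squid)",
]
print(recvq_by_state(sample))
```

Run periodically, this makes the "Recv-Q piling up in CLOSE_WAIT" trend visible as a single number per snapshot.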
Re: [squid-users] CLOSE_WAIT state in Squid leads to bandwidth drop
On Tuesday 26 November 2013 at 11:37, SaRaVanAn wrote:

> Hi All,
> I am doing a small test for bandwidth measurement of my test setup
> while squid is running. I am running a script to pump the traffic from
> client browser to Web-server via Squid box.

Er, do you really mean you are sending data from the browser to the server?

> The script creates around 50 user sessions and tries to do wget of
> randomly selected dynamic URL's.

That sounds more standard - wget will fetch data from the server to the
browser.

What do you mean by "dynamic URLs"? Where / how is the content actually
being generated?

> After some time,

Please define.

> I'm observing a drop in bandwidth of the link,

Please define - what network setup are you using? What bandwidth are you
getting at the start? What level does it drop to? Does it return to the
previous level?

> Squid version : 2.6.STABLE14

That is rather old (the last release of the 2.6 branch was STABLE23, in
September 2009). Is there any reason you have not upgraded to a current
version?

Regards,

Antony.

--
Behind the counter a boy with a shaven head stared vacantly into space,
a dozen spikes of microsoft protruding from the socket behind his ear.

 - William Gibson, Neuromancer (1984)

http://www.Open.Source.IT - The Open Source IT forum
Please reply to the list; please don't CC me.
Re: [squid-users] CLOSE_WAIT state in Squid leads to bandwidth drop
On Tue, Nov 26, 2013 at 5:16 PM, Antony Stone wrote:
> On Tuesday 26 November 2013 at 11:37, SaRaVanAn wrote:
>> Hi All,
>> I am doing a small test for bandwidth measurement of my test setup
>> while squid is running. I am running a script to pump the traffic from
>> client browser to Web-server via Squid box.
>
> Er, do you really mean you are sending data from the browser to the server?
>
>> The script creates around 50 user sessions and tries to do wget of
>> randomly selected dynamic URL's.
>
> That sounds more standard - wget will fetch data from the server to the
> browser.

= The script randomly picks a URL from the list of URLs defined in a file
and tries to fetch that URL.

> What do you mean by "dynamic URLs"? Where / how is the content actually
> being generated?

== It's a standard list of URLs with a question mark at the end to avoid
Squid caching. For example: www.espncricinfo.com?

>> After some time,
>
> Please define.

== After 15-20 minutes from the time of execution of the script.

>> I'm observing a drop in bandwidth of the link,
>
> Please define - what network setup are you using - what bandwidth are you
> getting at the start. what level does it drop to, does it return to the
> previous level?

                   eth0                             eth1
Windows Laptop <--------> Linux machine (Squid) <--------> Internet

We are measuring the outgoing traffic on the link (eth1) which leads to the
internet, in order to calculate the bandwidth usage. The eth1 link capacity
is around 10 Mbps; we are able to utilize a maximum of 7-8 Mbps when Squid
is running. After 15 minutes there is a sudden drop in bandwidth from
8 Mbps to 6.5 Mbps, and it comes back to 8 Mbps after 2-3 minutes.

>> Squid version : 2.6.STABLE14
>
> That is rather old (the last release of the 2.6 branch was STABLE23
> September 2009). Is there any reason you have not upgraded to a current
> version?

= There are some practical difficulties (our side) in upgrading to a newer
version.

> Regards,
>
> Antony.
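[Editorial aside: for reproducibility, the load script described in this reply (pick a random URL from a file, rely on a trailing "?" to bust the cache, fetch via wget) can be sketched roughly as below. This is not the poster's actual script; the session count and dry-run form are assumptions.]

```python
import random

def build_fetch_commands(urls, sessions=50, rng=None):
    """Build one wget command per simulated user session.

    Each session fetches a randomly chosen URL with a trailing '?'
    so Squid treats it as a dynamic (uncacheable) request.
    """
    rng = rng or random.Random()
    cmds = []
    for _ in range(sessions):
        url = rng.choice(urls)
        if not url.endswith("?"):
            url += "?"   # cache-busting, as described in the thread
        cmds.append(["wget", "-q", "-O", "/dev/null", url])
    return cmds

# Example: three URLs, five sessions. In the real test the commands
# would be run concurrently (e.g. via subprocess.Popen).
urls = ["http://www.espncricinfo.com", "http://example.com/a", "http://example.com/b"]
for cmd in build_fetch_commands(urls, sessions=5):
    print(" ".join(cmd))
```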
Re: [squid-users] CLOSE_WAIT state in Squid leads to bandwidth drop
Hi All,

I need help on this issue. Under heavy network traffic with Squid running,
the link bandwidth is not utilized properly. If I bypass Squid, the link
bandwidth is utilized properly.

Updated topology:
                     (10 Mbps Link)
client <---> Squid Box <---> Proxy client <--> Proxy server <---> webserver

During the problem scenario I can see more TCP sessions in the FIN_WAIT_1
state on the Proxy server. I also observed that the Recv-Q of sockets in
the CLOSE_WAIT state keeps growing on the Squid box. The number of TCP
sessions from Squid to the webserver also drops drastically.

Squid.conf

http_port 3128 tproxy transparent
http_port 80 accel defaultsite=xyz.abc.com
hierarchy_stoplist cgi-bin
acl VIDEO url_regex ^http://fa\.video\.abc\.com
cache allow VIDEO
acl QUERY urlpath_regex cgi-bin \?
cache deny QUERY
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
cache_mem 100 MB
cache_swap_low 70
cache_swap_high 80
maximum_object_size 51200 KB
maximum_object_size_in_memory 10 KB
ipcache_size 8192
fqdncache_size 8192
cache_replacement_policy heap LFUDA
memory_replacement_policy heap LFUDA
cache_dir aufs //var/logs/cache 6144 16 256
access_log //var/logs/access.log squid
cache_log //var/logs/cache.log
cache_store_log none
mime_table //var/opt/abs/config/acpu/mime.conf
pid_filename //var/run/squid.pid
refresh_pattern -i fa.video.abc.com/* 600 0% 600 override-expire override-lastmod reload-into-ims ignore-reload
refresh_pattern -i video.abc.com/* 600 0% 600 override-expire override-lastmod reload-into-ims ignore-reload
refresh_pattern -i media.abc.com/* 600 0% 600 override-expire override-lastmod reload-into-ims ignore-reload
refresh_pattern -i xyz.abc.com/.*\.js 600 200% 600 override-expire override-lastmod reload-into-ims
refresh_pattern -i xyz.abc.com/.*\.gif 600 200% 600 override-expire override-lastmod reload-into-ims
refresh_pattern -i xyz.abc.com/.*\.jpg 600 200% 600 override-expire override-lastmod reload-into-ims
refresh_pattern -i xyz.abc.com/.*\.jpg 600 200% 600 override-expire override-lastmod reload-into-ims
refresh_pattern -i xyz.abc.com/.*\.png 600 200% 600 override-expire override-lastmod reload-into-ims
refresh_pattern -i xyz.abc.com/.*\.css 600 200% 600 override-expire override-lastmod reload-into-ims
refresh_pattern -i ^http://.wsj./.* 10 200% 10 override-expire override-lastmod reload-into-ims ignore-reload
refresh_pattern -i \.(gif|png|jpg|jpeg|ico)$ 480 100% 480 override-expire override-lastmod reload-into-ims
refresh_pattern -i \.(htm|html|js|css)$ 480 100% 480 override-expire override-lastmod reload-into-ims
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320
quick_abort_min 0 KB
quick_abort_max 0 KB
negative_ttl 1 minutes
positive_dns_ttl 1800 seconds
forward_timeout 2 minutes
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl to_localhost dst 127.0.0.0/8
acl SSL_ports port 443
acl Safe_ports port 80
acl Safe_ports port 21
acl Safe_ports port 443
acl Safe_ports port 70
acl Safe_ports port 210
acl Safe_ports port 1025-65535
acl Safe_ports port 280
acl Safe_ports port 488
acl Safe_ports port 591
acl Safe_ports port 777
acl CONNECT method CONNECT
acl video_server dstdomain cs.video.abc.com
always_direct allow video_server
acl PURGE method PURGE
http_access allow PURGE localhost
http_access deny PURGE
http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT all
http_access allow all
icp_access allow all
tcp_outgoing_address 172.19.134.2
visible_hostname 172.19.134.2
server_persistent_connections off
logfile_rotate 1
error_map http://localhost:1000/abp/squidError.do 404
memory_pools off
store_objects_per_bucket 100
strip_query_terms off
coredump_dir //var/cache
store_dir_select_algorithm round-robin
cache_peer 172.19.134.2 parent 1000 0 no-query no-digest originserver name=aportal
cache_peer www.abc.com parent 80 0 no-query no-digest originserver name=dotcom
cache_peer guides.abc.com parent 80 0 no-query no-digest originserver name=travelguide
cache_peer selfcare.abc.com parent 80 0 no-query no-digest originserver name=selfcare
cache_peer abcd.mediaroom.com parent 80 0 no-query no-digest originserver name=mediaroom
acl webtrends url_regex ^http://statse\.webtrendslive\.com
acl the_host dstdom_regex xyz\.abc\.com
acl abp_regex url_regex ^http://xyz\.abc\.com/abp
acl gbp_regex url_regex ^http://xyz\.abc\.com/gbp
acl abcdstatic_regex url_regex ^http://xyz\.goginflight\.com/static
acl dotcom_regex url_regex ^www\.abc\.com
acl dotcomstatic_regex url_regex ^www\.abc\.com/static
acl travelguide_regex url_regex ^http://guides\.abc\.com
acl selfcare_regex url_regex ^http://selfcare\.abc\.com
acl mediaroom_regex url_regex ^http://abcd\.mediaroom\.com
never_direct allow abp_regex
cache_peer
Re: [squid-users] CLOSE_WAIT state in Squid leads to bandwidth drop
Hey Saravanan,

The main issue is that we can try to support you in a very basic way, but
note that if it's a BUG it cannot be fixed other than by porting a patch
manually or by trying newer versions of Squid. Sometimes it's a bit
difficult to upgrade, but you can compile Squid without installing it, and
also install it alongside an older version (with proper configuration).

Your problem is a bit difficult to understand, since if you use a proxy
server with a 100 Hz clock I assume this is what you will get from it.
There are a couple of levels to the connections which need to be analyzed
first, before jumping in and throwing everything at the Linux machine. The
availability of example bug reports is nice to analyze, but I am not sure
this is the case. A 10 Mbps link or a 15 Mbps link is almost the same, but
some things in the network are out of your hands.

First, the diagram is a bit weird to me.. what is the network topology and
what hardware are we talking about? There is a reason for the *drop* from
8 Mbps to 6.5 Mbps. Either the bandwidth is being consumed in some way, or
it is being throttled in some way. Both can be Squid, or any other level of
the link, even the physical one - a cat4 cable with a loose contact will
lead to something like that in some cases. So I am saying "from the ground
up": What is the IP of the client? Is this server properly firewalled? What
are the basic TCP settings for the CLOSE_WAIT timeout?

Do you have iptraf installed on this server? You can look at the "general
interface statistics" or "detailed interface statistics" to identify a
couple of things. The iptraf tool can give you another angle on your
network traffic (note that using it over ssh can be confusing due to the
ssh overhead on the link). It can happen that a Squid server "slows down"
the connection, but not in most cases.

So we need: a basic network diagram or "picture", like "a cable goes from
this computer to this switch, from this switch to this router, and from
this router to this switch". If you can add IP addresses it will help me to
understand the big picture. I am not sure yet what the client IP is, what
the speed of each connection is, and whether there is full-duplex,
half-duplex or no duplex support at all. Are we talking about LAN traffic
only? What about DNS and WAN traffic?

Thanks,
Eliezer

On 04/12/13 18:02, SaRaVanAn wrote:
> Hi All,
> I need a help on this issue. On heavy network traffic with squid
> running, link bandwidth is not utilized properly. If I bypass squid,
> my link bandwidth is utilized properly.
> [rest of quoted message and squid.conf snipped]
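[Editorial aside on the "basic TCP settings" question above: CLOSE_WAIT itself has no kernel timeout - a socket stays in CLOSE_WAIT until the owning application closes it - but the related FIN/TIME_WAIT tunables can be read from /proc on Linux. A minimal sketch; the particular tunables listed are my own choice, not from the thread.]

```python
import os

TCP_TUNABLES = [
    "tcp_fin_timeout",   # seconds a socket spends in FIN_WAIT_2
    "tcp_tw_reuse",      # allow re-use of TIME_WAIT sockets for new connections
    "tcp_max_orphans",   # max sockets not attached to any process
]

def read_tcp_settings(base="/proc/sys/net/ipv4"):
    """Read selected TCP tunables, returning 'n/a' where the file is
    missing (e.g. on non-Linux systems)."""
    settings = {}
    for name in TCP_TUNABLES:
        try:
            with open(os.path.join(base, name)) as f:
                settings[name] = f.read().strip()
        except OSError:
            settings[name] = "n/a"
    return settings

for name, value in read_tcp_settings().items():
    print(f"{name} = {value}")
```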
Re: [squid-users] CLOSE_WAIT state in Squid leads to bandwidth drop
On 5/12/2013 1:45 p.m., Eliezer Croitoru wrote:
> Hey Saravanan,
>
> The main issue is that we can try to support you in a very basic way but
> note that if it's a BUG it cannot be fixed later rather then porting a
> patch manually or to try newer versions of squid.

Going by the description here, and back in Aug when it was last posted,
this is not exactly a bug, but normal behaviour of TCP combined with TPROXY
limitations and large traffic flows. When sockets run out, traffic gets
throttled until more become available. When the level of constant TCP
connection churn grows higher than the rate at which sockets become
available, a backlog grows of client connections holding sockets open and
waiting for service.

Part of the problem is TPROXY. Each outgoing connection requires an
identical src-IP:dst-IP:dst-port triplet to the incoming one, thus sharing
the 64K src-port range between both inbound and outbound connections.
Normally a proxy can use a different src-IP, with a full 64K ports
available on each side of the proxy. So you *start* with that handicap;
then on top of it this proxy is churning through 42 sockets per second. For
every port released for re-use after its TIME_WAIT period, 40K other
sockets have been needed (yes, 40K needed out of ~32K available). So TCP is
constantly running a backlog of available sockets, most of which are
consumed by new client connections.

Imagine the machine only had 16 sockets in total, and each socket needed to
wait 10 seconds before re-use. Note that with a proxy each connection
requires 2 sockets (client connection and server connection, i.e. in/out of
the proxy). (For the sake of simplicity this description assumes each
socket is done with in almost zero time.)

1) When traffic arrives at a rate of 1 connection every 2 seconds,
everything looks perfectly fine.
 * 1 client socket gets used, and 1 server socket. Then both are released,
   and for the next 10 seconds there are 14 sockets available.
 * During that 10-second period, 5 more connections arrive and 10 sockets
   get used.
 * That leaves the machine with 4 free sockets at the same time as the
   first 2 are re-added to the available pool, making 4-6 sockets
   constantly free.

2) Compare that to a traffic rate of 1 connection every second. To begin
with, everything seems perfectly fine.
 * The first 8 connections happen perfectly. However, they take 8 seconds
   and completely empty the available pool of sockets.
 ** What is the proxy to do? It must wait 2 more seconds for the next
    sockets to become available.
 * During those 2 seconds, another 2 connections are attempted.
 * When the first 2 sockets become available, both get used by accept().
 * The socket pool is now empty again and the proxy must wait another
   second for more sockets to become available.
 - The proxy now has 2 inbound connections waiting to be served, 7 inbound
   sockets in TIME_WAIT and 7 outbound sockets in TIME_WAIT.
 * When the second 2 sockets become available, one is used to accept the
   new waiting connection and one to service an existing connection.
 * Things continue until we reach the 16-second mark.
 - This is a repeat of the point where no new sockets were finishing
   TIME_WAIT.
 * At the 20-second mark, sockets are becoming available again.
 - The proxy now has 4 inbound connections waiting to be served, 6 inbound
   sockets in TIME_WAIT and 6 outbound sockets in TIME_WAIT.
... and the cycle continues, with the gap between inbound and outbound
growing by 2 sockets every 8 seconds.

If the clients were all extremely patient, the machine would end up with
all sockets used by inbound connections and none left for outbound.
However, Squid contains a reserved-FD feature to prevent that situation,
and in practice clients get impatient and disconnect when the wait is too
long.
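[Editorial aside: the two arrival-rate scenarios in the 16-socket example above can be checked with a small discrete-time simulation. This is my own sketch, not anything from Squid: each connection consumes 2 sockets from a pool of 16, each socket re-enters the pool 10 seconds after use (the TIME_WAIT analogue), and connections that find no free socket pair queue up as backlog.]

```python
def simulate(pool_size=16, timewait=10, interarrival=1.0, duration=60):
    """Discrete-time sketch of the 16-socket example.

    Each served connection takes 2 sockets (client + server side); a
    used socket re-enters the pool `timewait` seconds later. Returns
    the backlog of connections still waiting at the end.
    """
    free = pool_size
    releases = {}          # time -> number of sockets freed at that time
    backlog = 0
    next_arrival = 0.0
    for t in range(duration):
        free += releases.pop(t, 0)           # TIME_WAIT expiries
        while next_arrival < t + 1:          # arrivals during this second
            backlog += 1
            next_arrival += interarrival
        while backlog and free >= 2:         # serve waiting connections
            backlog -= 1
            free -= 2
            releases[t + timewait] = releases.get(t + timewait, 0) + 2
    return backlog

# One connection every 2 seconds: the pool keeps up, no backlog forms.
print(simulate(interarrival=2.0))
# One connection every second: demand (2 sockets/s) exceeds the
# recycling rate (16 sockets per 10 s), so a backlog accumulates.
print(simulate(interarrival=1.0))
```

The growing return value in the second case mirrors the "gap growing by 2 sockets" behaviour described above.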
So you will always see traffic flowing, but it will flow at a much reduced
rate, with ever longer delays visible to clients, and somewhat "bursty"
flow rates as clients give up in bunches.

Notice how in (2) there is all the *extra* waiting time, above and beyond
what the traffic would normally take going through the proxy. In fact, the
slower the traffic moves through the proxy, the worse the problem becomes:
without connection persistence, the transaction time is added on top of
each TIME_WAIT before socket re-use.

The important thing to be aware of is that this is normal behaviour, nasty
as it is. You will hit it on any proxy or relay software if you throw large
numbers of new TCP connections at it fast enough.

There are two ways to avoid this:

1) Reduce the number of sockets the proxy allows to be closed. In other
words, enable persistent connections in HTTP - both server and client
connections. It is not perfect (especially in old Squid like 2.6), but it
avoids a lot of TIME_WAIT delays between HTTP requests.
 * Then reduce the request processing time spent holding those sockets, so
   that more traffic can flow through faster overall.

2) Reduce the traffic load
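[Editorial aside: the squid.conf posted earlier in this thread explicitly sets `server_persistent_connections off`, which works directly against remedy (1). A minimal sketch of the change, assuming Squid 2.6 directive names - verify against your local documentation before use:]

```
# Re-enable HTTP keep-alive on both sides of the proxy so sockets are
# re-used across requests instead of cycling through TIME_WAIT.
server_persistent_connections on
client_persistent_connections on

# Optionally keep idle persistent connections open a while longer.
persistent_request_timeout 1 minute
```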