Re: [squid-users] Ignoring query string from url
We use the query string in each URL for busting the cache at the client
end (browser), hence it is not important for us and it won't produce any
incorrect results. We already use a similar configuration at the CDN
level. We are trying to add a Squid layer between the origin and the CDN
to reduce the load on our origin servers.

The setup works fine for a few requests, but as traffic grows to 100
req/sec, Squid's responses become slow. Each machine that Squid runs on
has 20 GB of RAM and a dual-core processor. I used squirm to strip the
query string, but Squid responds slowly once a url_rewrite_program is
introduced in between. Henrik suggested a clever idea, making changes to
the url_rewrite_program to process requests in parallel, but
unfortunately I am not sure how to incorporate it.

Here are my rewrite program settings:

  url_rewrite_program /home/zdn/squirm/bin/squirm
  url_rewrite_children 1500
  url_rewrite_concurrency 0
  url_rewrite_host_header off
  url_rewrite_bypass off

Regards
Nitesh

On Sun, Oct 26, 2008 at 4:51 PM, Matus UHLAR - fantomas [EMAIL PROTECTED] wrote:
> On 25.10.08 12:40, Nitesh Naik wrote:
> > Squid should give out the same object for different query strings.
> > Basically it should strip the query string and cache the object, so
> > that the same object is delivered to the client browser for different
> > query strings.
>
> Did you understand what I've said - that such misconfiguration can
> provide incorrect results? Your users will hate you for that.
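For reference, the concurrent helper protocol touched on above works as
follows: with url_rewrite_concurrency set above 0, Squid prefixes every
request line with a channel ID and multiplexes many outstanding requests
over a handful of helpers, and each helper reply must echo that ID back.
A minimal sketch, with illustrative numbers and a hypothetical helper
path:

  # squid.conf sketch - concurrent helper protocol
  # (illustrative values; strip_query.pl is a hypothetical helper)
  url_rewrite_program /usr/local/bin/strip_query.pl
  url_rewrite_children 10
  url_rewrite_concurrency 500

Squid then sends each request as "<ID> <URL> ..." and expects
"<ID> <rewritten-URL>" (or the ID alone for no change) in reply, so a
few children can replace hundreds of processes.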
Re: [squid-users] Delivering ident to url_rewrite_program
Stefan Adams wrote:
> On Sun, Oct 26, 2008 at 8:35 PM, Amos Jeffries [EMAIL PROTECTED] wrote:
> > Stefan Adams wrote:
> > > In squid 3.0STABLE9, following the redirector interface from
> > > http://wiki.squid-cache.org/SquidFaq/SquidRedirectors, the IDENT
> > > values are broken. I can see ident requests appearing in access.log,
> > > but with a url_rewrite_program of /usr/bin/tee, the ident field is
> > > ALWAYS '-'. I have never been able to get the ident field in the
> > > output of /usr/bin/tee to show the ident value that appears in the
> > > access.log when using squid 3.0.
> > >
> > > In squid 2.5STABLE10, using the exact same ident instructions from
> > > the FAQ, the ident column is correct both in the access.log AND in
> > > the output of /usr/bin/tee as the rewrite program.
> > >
> > > So... Is this a bug in 3.0, or is there a directive I am missing
> > > that is not in the FAQ?
> >
> > Your 2.5 configuration has "ident REQUIRED", which is missing from
> > the 3.0 config. This will result in squid 3.0 not waiting for the
> > ident response to arrive before passing the request to the
> > redirector.
>
> I've tried both "acl reqident ident cogent" AND "acl reqident ident
> REQUIRED" with squid 3.0. Neither of these caused squid to wait for
> the ident response. I don't quite follow your response; were you
> saying that it would work if my config shown below had "acl reqident
> ident REQUIRED"? Unfortunately, that also did not work. I got the same
> results, where squid did not wait for an ident response.

It should wait, yes. You seem to have uncovered a bug. I do not have
time right now to track it down; please check bugzilla to see whether it
has already been reported. If not, please add a new bug report with what
traces you can produce.

Thanks
Amos

> Thanks!
> Stefan
>
> # cat /tmp/squid-3.0.conf
> debug_options 29,3 30,3 28,3 33,3
> acl termserv src 192.168.0.112
> acl reqident ident cogent
> ident_lookup_access allow termserv
> ident_lookup_access deny all
> http_access allow reqident termserv
> http_access deny reqident
> url_rewrite_access allow all
> url_rewrite_children 1
> http_port 3128
> access_log /var/log/squid/access.log squid
> url_rewrite_program /usr/bin/tee -a /tmp/redirector.log
> cache_effective_user squid
> cache_effective_group squid
>
> # cat /tmp/squid-2.5.conf
> debug_options 29,3 30,3 28,3 33,3
> acl all src 0.0.0.0/0.0.0.0
> acl termserv src 192.168.0.112
> acl reqident ident REQUIRED
> http_access allow reqident termserv
> http_access deny reqident
> redirect_children 1
> redirect_program /usr/bin/tee -a /tmp/redirector.log
>
> # cat /tmp/redirector.log
> http://checkip.cogent.com/favicon.ico 192.168.0.112/- - GET myip=192.168.0.1 myport=3128
> http://checkip.cogent.com/favicon.ico 192.168.0.112/- HP_Administrator GET

--
Please use Squid 2.7.STABLE4 or 3.0.STABLE9
Re: [squid-users] Ignoring query string from url
On mån, 2008-10-27 at 12:30 +0530, nitesh naik wrote:
> We use the query string in each URL for busting the cache at the
> client end (browser), hence it is not important for us and it won't
> produce any incorrect results. We already use a similar configuration
> at the CDN level.

Why do you do this?

> Henrik suggested some clever idea to make changes to the
> url_rewrite_program to process requests in parallel, but unfortunately
> I am not sure how to incorporate it.

Write your own url rewriter helper. It's no more than a couple of lines
of perl..

Regards
Henrik
Re: [squid-users] Ignoring query string from url
On 27.10.08 10:09, Henrik Nordstrom wrote:
> On mån, 2008-10-27 at 12:30 +0530, nitesh naik wrote:
> > We use the query string in each URL for busting the cache at the
> > client end (browser), hence it is not important for us and it won't
> > produce any incorrect results. We already use a similar
> > configuration at the CDN level.
>
> Why do you do this?
>
> > Henrik suggested some clever idea to make changes to the
> > url_rewrite_program to process requests in parallel, but
> > unfortunately I am not sure how to incorporate it.
>
> Write your own url rewriter helper. It's no more than a couple of
> lines of perl..

Shouldn't that be a storeurl rewriter?
Re: [squid-users] Ignoring query string from url
Henrik / Matus,

For certain requests we don't want the client browser to look for the
object in its cache; everything should be served fresh. The CDN will
determine the expiry time for the object, and some of these objects
don't send a Last-Modified header. In our case it is not important to
pass the query string to the origin, as the query string is a random
number used for busting the client-side cache.

Is there any sample code available for a url rewriter helper which will
process requests in parallel? I am still not clear on how to write a
helper program which processes requests in parallel using perl. Do you
think squirm with 1500 child processes works differently compared to the
solution you are talking about?

Regards
Nitesh

On Mon, Oct 27, 2008 at 2:39 PM, Henrik Nordstrom [EMAIL PROTECTED] wrote:
> On mån, 2008-10-27 at 12:30 +0530, nitesh naik wrote:
> > We use the query string in each URL for busting the cache at the
> > client end (browser) [...]
>
> Why do you do this?
>
> > Henrik suggested some clever idea to make changes to the
> > url_rewrite_program to process requests in parallel, but
> > unfortunately I am not sure how to incorporate it.
>
> Write your own url rewriter helper. It's no more than a couple of
> lines of perl..
Re: [squid-users] Ignoring query string from url
On mån, 2008-10-27 at 10:11 +0100, Matus UHLAR - fantomas wrote:
> > Write your own url rewriter helper. It's no more than a couple of
> > lines of perl..
>
> Shouldn't that be a storeurl rewriter?

No. Since the backend server is not interested in this dummy query
string, a url rewriter is better.

Regards
Henrik
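For contrast, a storeurl rewriter (available in Squid 2.7) rewrites only
the key the object is cached under, while the original URL is still
forwarded to the backend. A minimal sketch, assuming Squid 2.7 and a
hypothetical helper path:

  # squid.conf (Squid 2.7) - rewrite the cache key only,
  # not the URL forwarded to the origin
  storeurl_rewrite_program /usr/local/bin/normalize.pl
  storeurl_rewrite_children 5
  acl dynamic urlpath_regex \?
  storeurl_access allow dynamic
  storeurl_access deny all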
Re: [squid-users] Ignoring query string from url
Henrik,

Is this code capable of handling requests in parallel?

  #!/usr/bin/perl
  $|=1;
  while (<>) {
      s|(.*)\?(.*$)|$1|;
      print;
  }

Regards
Nitesh

On Mon, Oct 27, 2008 at 4:04 PM, Henrik Nordstrom [EMAIL PROTECTED] wrote:
> On mån, 2008-10-27 at 10:11 +0100, Matus UHLAR - fantomas wrote:
> > > Write your own url rewriter helper. It's no more than a couple of
> > > lines of perl..
> >
> > Shouldn't that be a storeurl rewriter?
>
> No. Since the backend server is not interested in this dummy query
> string, a url rewriter is better.
Re: [squid-users] Ignoring query string from url
On mån, 2008-10-27 at 16:12 +0530, nitesh naik wrote:
> Henrik,
>
> Is this code capable of handling requests in parallel?

It's capable of handling the concurrent helper mode, yes. It doesn't
process requests in parallel, but you don't need to.

Regards
Henrik
Re: [squid-users] Ignoring query string from url
Sorry, forgot the following important line in both:

  BEGIN { $|=1; }

It should be inserted as the second line in each script (just after the
#! line).

On mån, 2008-10-27 at 11:48 +0100, Henrik Nordstrom wrote:
> Example script removing query strings from any URL ending in .ext:
>
>   #!/usr/bin/perl -an
>   $id = $F[0];
>   $url = $F[1];
>   if ($url =~ m#\.ext\?#) {
>       $url =~ s/\?.*//;
>       print "$id $url\n";
>       next;
>   }
>   print "$id\n";
>   next;
>
> Or if you want to keep it real simple:
>
>   #!/usr/bin/perl -p
>   s%\.ext\?.*%.ext%;
>
> but that doesn't illustrate the principle that well, and causes a bit
> more work for Squid.. (but not much)
>
> > I am still not clear on how to write a helper program which will
> > process requests in parallel using perl. Do you think squirm with
> > 1500 child processes works differently compared to the solution you
> > are talking about?
>
> Yes.
>
> Regards
> Henrik
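Putting Henrik's pieces together as he describes (the BEGIN line
inserted just after the #! line) gives the following complete helper;
this is just the thread's own fragments assembled, with explanatory
comments added here:

  #!/usr/bin/perl -an
  BEGIN { $|=1; }           # unbuffered output: Squid must see each reply at once
  # -an: read one request per line, autosplit into @F
  # concurrent input format: "<channel-ID> <URL> ..."
  $id  = $F[0];
  $url = $F[1];
  if ($url =~ m#\.ext\?#) { # only URLs containing ".ext?"
      $url =~ s/\?.*//;     # strip the query string
      print "$id $url\n";   # reply: ID plus rewritten URL
      next;
  }
  print "$id\n";            # ID alone means "no change"
  next;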
Re: [squid-users] Ignoring query string from url
Henrik,

What if I use the following code? The logic is the same as in your
program:

  #!/usr/bin/perl
  $|=1;
  while (<>) {
      s|(.*)\?(.*$)|$1|;
      print;
      next;
  }

Regards
Nitesh

On Mon, Oct 27, 2008 at 4:25 PM, Henrik Nordstrom [EMAIL PROTECTED] wrote:
> Sorry, forgot the following important line in both:
>
>   BEGIN { $|=1; }
>
> It should be inserted as the second line in each script (just after
> the #! line).
>
> [...]
Re: [squid-users] Ignoring query string from url
See earlier response.

On mån, 2008-10-27 at 16:59 +0530, nitesh naik wrote:
> Henrik,
>
> What if I use the following code? The logic is the same as in your
> program:
>
>   #!/usr/bin/perl
>   $|=1;
>   while (<>) {
>       s|(.*)\?(.*$)|$1|;
>       print;
>       next;
>   }
>
> [...]
[squid-users] 2.7 reverse proxy -- compression problems
I set up a reverse proxy in front of http://www.charite.de (typo3) since
it's fucking slow. Now it's fast, but SOME (!) users are reporting these
sites as broken:

  http://www.charite.de/neurologie/
  http://www.charite.de/stoffwechsel-centrum/
  http://www.charite.de/ch/anaest/ards/
  http://www.charite.de/akademie/
  http://www.charite.de/biometrie/de/

The screenshots they sent me look like compressed data instead of a
page. I distinctly remember a similar problem with HTTP/1.1 and
compression and heise.de.
Re: [squid-users] 2.7 reverse proxy -- compression problems
* Ralf Hildebrandt [EMAIL PROTECTED]:
> I set up a reverse proxy in front of http://www.charite.de (typo3)
> since it's fucking slow. Now it's fast, but SOME (!) users are
> reporting these sites as broken:
> [...]
> The screenshots they sent me look like compressed data instead of a
> page. I distinctly remember a similar problem with HTTP/1.1 and
> compression and heise.de.

I think it might be due to mod_deflate on the typo3 box. I disabled it
for now and asked the users for a retest.
[squid-users] another config question
Previously, I asked about allowing authorized users to get access to the
Internet without a need to authenticate. I had hoped to use NTLM_AUTH,
but it appears hopeless; try as I might, I cannot get NTLM to work. I
CAN, however, get LDAP to authenticate against the AD, and when I figure
out GROUP access, I'll be golden - except that I STILL don't have a
process that uses the network credentials already in place to authorize
Internet access.

The question is: is it possible to do that using LDAP, or must I
continue to beat this NTLM horse to death?

Lou Lohman
Really Puzzled Squid Admin
Re: [squid-users] 2.7 reverse proxy -- compression problems
On Monday 27 October 2008 15:10:35 Ralf Hildebrandt wrote:
> I think it might be due to mod_deflate on the typo3 box. I disabled it
> for now and asked the users for a retest.

Was it configured to compress images? I doubt that is especially useful,
since most image formats get bigger with regular compression. You can
probably safely apply it to the text/html type.

Have to say the sites load impressively quickly here, despite our
usually congested bandwidth into the office, although one of the sites
had a load of 404 errors on CSS (and other items) linked by IP address,
which was presumably slowing it down slightly.

Is this all academic bandwidth used for charity work, or could we host
stuff nearby if we wanted to?

Simon
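As a sketch of what that restriction looks like with standard Apache 2.x
directives (illustrative only, not taken from the actual charite.de
configuration):

  # httpd.conf: compress only textual types, leave images alone
  <IfModule mod_deflate.c>
      AddOutputFilterByType DEFLATE text/html text/plain
  </IfModule>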
Re: [squid-users] 2.7 reverse proxy -- compression problems
* Simon Waters [EMAIL PROTECTED]:
> On Monday 27 October 2008 15:10:35 Ralf Hildebrandt wrote:
> > I think it might be due to mod_deflate on the typo3 box. I disabled
> > it for now and asked the users for a retest.
>
> Was it configured to compress images?

It rather looked like an HTML page...
Re: [squid-users] headers say HIT, logs say MISS, payload is truncated...
On Sat, Oct 25, 2008 at 8:54 AM, Henrik Nordstrom [EMAIL PROTECTED] wrote:
> On fre, 2008-10-24 at 15:44 -0700, Neil Harkins wrote:
> > We are using collapsed_forwarding here. I haven't tried disabling it
> > yet. Unfortunately, since the problem appears to be load-related,
> > I've been unable to reproduce it for a tcpdump or to run squid in
> > debug thus far.
>
> The mismatch in HIT/MISS is most likely related to collapsed
> forwarding. Collapsed requests get somewhere in between a hit and a
> miss, and may well be reported a little inconsistently.
>
> Have no idea on the timeout issue unless there is a communication
> issue between Squid and your web server.

The timeout is because the Content-Length header is bigger than the
payload that was sent; every http client/server will hang in that
situation. This isn't simply a misreported HIT/MISS in the log; this is
absolutely a significant bug where collapsed forwarding is mixing up the
metadata from the two branches of our Vary: Accept-Encoding (gzip and
not), i.e. giving the headers and content as non-gzip while the amount
of payload it reads from the cache and sends is based on the gzip size.
Disabling collapsed_forwarding fixed it.

-neil
Re: [squid-users] headers say HIT, logs say MISS, payload is truncated...
On mån, 2008-10-27 at 12:23 -0700, Neil Harkins wrote:
> The timeout is because the Content-Length header is bigger than the
> payload that was sent; every http client/server will hang in that
> situation. This isn't simply a misreported HIT/MISS in the log; this
> is absolutely a significant bug where collapsed forwarding is mixing
> up the metadata from the two branches of our Vary: Accept-Encoding
> (gzip and not), i.e. giving the headers and content as non-gzip while
> the amount of payload it reads from the cache and sends is based on
> the gzip size. Disabling collapsed_forwarding fixed it.

Please file a bug report on this, preferably including "squid -k debug"
cache.log output and "tcpdump -s 0 -w traffic.pcap" traces.

http://bugs.squid-cache.org/

Regards
Henrik
Re: [squid-users] 2.7 reverse proxy -- compression problems
On mån, 2008-10-27 at 14:49 +0100, Ralf Hildebrandt wrote:
> I set up a reverse proxy in front of http://www.charite.de (typo3)
> since it's fucking slow. Now it's fast, but SOME (!) users are
> reporting these sites as broken:
> [...]
> The screenshots they sent me look like compressed data instead of a
> page. I distinctly remember a similar problem with HTTP/1.1 and
> compression and heise.de.

Apache mod_deflate is broken in many versions, hence the
broken_vary_encoding directive in squid.conf...

It could also be the case that the site doesn't announce Vary at all on
the compressed objects. This is another mod_deflate bug, but it can be
worked around easily by Apache configuration forcing the
Vary: Accept-Encoding header to be added on responses processed by
mod_deflate.

Regards
Henrik
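A sketch of the two workarounds Henrik mentions, using the stock
broken_vary_encoding example shipped in the 2.6/2.7 squid.conf and a
mod_headers directive on the Apache side (illustrative, not Ralf's
actual setup):

  # squid.conf (2.6/2.7): treat encoded variants from Apache as suspect
  acl apache rep_header Server ^Apache
  broken_vary_encoding allow apache

  # Apache (mod_headers), scoped to wherever the DEFLATE filter is
  # enabled: make sure deflated responses carry Vary
  Header append Vary Accept-Encoding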
Re: [squid-users] another config question
On mån, 2008-10-27 at 11:58 -0500, Lou Lohman wrote:
> ... I STILL don't have a process that uses the network credentials
> already in place to authorize Internet access. The question is: is it
> possible to do that using LDAP, or must I continue to beat this NTLM
> horse to death?

You need NTLM or Negotiate for that. Note: MSIE6 only supports NTLM.

How far have you managed to beat the NTLM horse?

- Has Samba joined the domain successfully?
- Does a manual ntlm_auth test work when running as your
  cache_effective_user (as defined in squid.conf)?

Regards
Henrik
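A sketch of how that manual test is commonly run with the standard Samba
tooling (paths, user name, and domain below are illustrative and vary
per installation):

  # 1. verify the domain join / winbind connection
  wbinfo -t

  # 2. run the helper by hand as the squid user
  su -s /bin/sh squid -c '/usr/bin/ntlm_auth --helper-protocol=squid-2.5-basic'
  # then type e.g.: DOMAIN\username password
  # expected reply on success: OK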
Re: [squid-users] headers say HIT, logs say MISS, payload is truncated...
On Mon, Oct 27, 2008 at 2:56 PM, Henrik Nordstrom [EMAIL PROTECTED] wrote:
> On mån, 2008-10-27 at 12:23 -0700, Neil Harkins wrote:
> > [...] Disabling collapsed_forwarding fixed it.
>
> Please file a bug report on this, preferably including "squid -k
> debug" cache.log output and "tcpdump -s 0 -w traffic.pcap" traces.

I'd like to help and see this get fixed, but as I said earlier, it
happens on about 16% of our test requests, only when 750-1050 reqs/second
are going through the box, and it pretty much disappears under 500 reqs/s
(off-peak). I managed to catch only one instance of the timeout in a
17 GB cache.log with debugging enabled. Is this excerpt significant?

  2008/10/24 17:09:23| storeLocateVaryRead: accept-encoding=gzip 0x3b28978 seen_offset=204 buf_offset=0 size=85
  2008/10/24 17:09:23| storeLocateVaryRead: Key: 968D51EAA0C2BCF5688EAB92E8F56EE4
  2008/10/24 17:09:23| storeLocateVaryRead: 0x3b28978 seen_offset=289 buf_offset=0
  2008/10/24 17:09:23| storeClientCopy: 4F3F9F8F4461796C3C469E610B6E2D5C, seen 289, want 289, size 4096, cb 0x46169d, cbdata 0x3b28978
  [snip]
  2008/10/24 17:09:23| storeLocateVaryRead: accept-encoding= 0x3b28978 seen_offset=204 buf_offset=0 size=178
  2008/10/24 17:09:23| storeLocateVaryRead: Key: 968D51EAA0C2BCF5688EAB92E8F56EE4
  2008/10/24 17:09:23| storeLocateVaryRead: MATCH! 968D51EAA0C2BCF5688EAB92E8F56EE4 (null)
  2008/10/24 17:09:23| storeClientCopy: 4F3F9F8F4461796C3C469E610B6E2D5C, seen 382, want 382, size 4096, cb 0x46169d, cbdata 0x3b28978

Note that I've since changed our load balancer to rewrite
"Accept-Encoding:" to "Accept-Encoding: identity" in case squid didn't
like an empty header (although the example in the RFC implies that a
bare "Accept-Encoding:" is valid:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3), but the
timeouts still happened, although I didn't grab debugging then.

-neil
Re: [squid-users] headers say HIT, logs say MISS, payload is truncated...
On mån, 2008-10-27 at 15:56 -0700, Neil Harkins wrote:
> I'd like to help and see this get fixed, but as I said earlier, it
> happens on about 16% of our test requests, only when 750-1050
> reqs/second are going through the box, and it pretty much disappears
> under 500 reqs/s (off-peak).

Ouch..

> Is this excerpt significant?

Hard to say, but probably not. It's just the reading of the Vary/ETag
index, finding that the request matches the object with key
968D51EAA0C2BCF5688EAB92E8F56EE4.

Does your server support ETag on these objects? And does it properly
report different ETag values for the different variants? Or are you
using broken_vary_encoding to work around server brokenness?

> Note that I've since changed our load balancer to rewrite
> "Accept-Encoding:" to "Accept-Encoding: identity" in case squid didn't
> like an empty header (although the example in the RFC implies that a
> bare "Accept-Encoding:" is valid:
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3), but
> the timeouts still happened, although I didn't grab debugging then.

An Accept-Encoding header without a value is not really a valid HTTP
header (what you probably want is no Accept-Encoding header at all), but
Squid should work fine in both cases, as they are just two different
Vary request fingerprints.

The blank example is an error in the specification's descriptive text
and has been corrected in the errata. If you look closely at the above
reference you'll notice the BNF says 1#(...), which means at least one.
The BNF is the authoritative definition of this header's syntax; the
rest of the text just tries to explain it.

Regards
Henrik
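For reference, the grammar being discussed, as given in RFC 2616 section
14.3 (quoted from the RFC, not from the thread):

  Accept-Encoding  = "Accept-Encoding" ":"
                     1#( codings [ ";" "q" "=" qvalue ] )
  codings          = ( content-coding | "*" )

Per RFC 2616 section 2.1, the 1# prefix denotes a comma-separated list
with at least one element, which is why a completely empty value is not
valid.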