[squid-users] Re: Unable to match empty user-agent strings?
After some further testing and looking closely at the request headers, it turns out that this is failing because the User-Agent header field isn't present at all (rather than it being present but empty).

Here's my workaround/solution, which seems to work nicely:

  acl image_leechers browser ^$
  acl image_leechers browser Wget
  acl has_user_agent browser ^.+$

  http_access deny !has_user_agent
  http_access deny image_leechers

I promise not to make a habit of just conversing with myself on this list...

2008/10/20 James Cohen [EMAIL PROTECTED]:
> Hi,
>
> I think I've found a bug but first wanted to double-check I wasn't
> doing anything dumb.
>
> [snip]
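For completeness, here is the same squid.conf fragment again with comments. One assumption not shown in this thread is that these deny lines sit before any broader http_access allow rules, since Squid applies the first http_access rule that matches:

  # Deny any request that arrives without a User-Agent header.
  # 'browser' is a regex ACL against the User-Agent value, and a
  # missing header never matches ^.+$ (which is why 'browser ^$'
  # alone was not enough).
  acl has_user_agent browser ^.+$
  http_access deny !has_user_agent

  # Deny known leecher agents
  acl image_leechers browser Wget
  http_access deny image_leechers

  # ...broader allow rules for normal traffic follow here...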
Re: [squid-users] Unable to match empty user-agent strings?
2008/10/20 Amos Jeffries [EMAIL PROTECTED]:
> It's not so much an empty string as a completely missing header. Squid
> can only test what it has against what it checks, if you get my meaning.
>
> I haven't tested it, but you might have better luck if you invert the
> test: allow access to known-good agents and deny the rest. As it stands,
> all anyone has to do is send -U fu and they get past the wget blocker.
> Not to mention that the real browser UAs are commonly known, and script
> kiddies are often advised to spoof the IE agent to get past site
> barriers and brokenness in one action.
>
> Amos

Thanks Amos,

I figured that out just after I'd posted my original mail.

I appreciate that the blocking is pretty weak, but it seems that the majority of the unwanted traffic is some kind of automated client not supplying any User-Agent at all.

I guess we're going for the low-hanging fruit: anyone who really wants the content will be able to fetch it (by spoofing a real user agent), but this should be a way to block a bunch of it.

James
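Untested, but Amos's inverted approach might look something like the following sketch. The browser patterns here are illustrative guesses, not a vetted allow-list:

  # Allow only user agents that look like real browsers
  # (patterns are examples only; tune them for your own traffic)
  acl good_agents browser Mozilla
  acl good_agents browser Opera

  # Deny everything else before the normal allow rules
  http_access deny !good_agents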
[squid-users] Unable to match empty user-agent strings?
Hi,

I think I've found a bug but first wanted to double-check I wasn't doing anything dumb.

In our reverse proxy setup we want to block people from leeching our images using Wget or similar applications. To do this we want to block user agents that match Wget and, because lots of people use cURL or their own home-brew clients, anything with an empty user agent string.

I added the following acl rules:

  # Block automated processes from requesting our images
  acl image_leechers browser ^$
  acl image_leechers browser Wget

and later on...

  http_access deny image_leechers

Requests that contain Wget are being blocked exactly as expected by the proxy. Empty requests are still going through to the parent server.

Request with Wget in the user agent request headers (correct behaviour):

  $ wget -S http://images.xxx.com/preview/1134/35121981.jpg
  --11:29:45--  http://images.xxx.com/preview/1134/35121981.jpg
             => `35121981.jpg'
  Resolving images.xxx.com... 62.216.237.30
  Connecting to images.xxx.com|62.216.237.30|:80... connected.
  HTTP request sent, awaiting response...
    HTTP/1.0 403 Forbidden
    Server: squid/3.0.STABLE9
    Mime-Version: 1.0
    Date: Mon, 20 Oct 2008 10:29:45 GMT
    Content-Type: text/html
    Content-Length: 1653
    Expires: Mon, 20 Oct 2008 10:29:45 GMT
    X-Squid-Error: ERR_ACCESS_DENIED 0
    X-Cache: MISS from ws2
    Via: 1.0 ws2 (squid/3.0.STABLE9)
    Connection: close
  11:29:45 ERROR 403: Forbidden.

And a similar request with an empty user agent string (incorrect - the request is being passed back to the parent, where it returns a 403):

  $ wget -U "" -S http://images.xxx.com/preview/1134/james.jpg
  --11:30:09--  http://images.xxx.com/preview/1134/james.jpg
             => `james.jpg'
  Resolving images.xxx.com... 62.216.237.30
  Connecting to images.xxx.com|62.216.237.30|:80... connected.
  HTTP request sent, awaiting response...
    HTTP/1.0 403 Forbidden
    Content-Type: text/html
    Content-Length: 345
    Date: Mon, 20 Oct 2008 10:30:09 GMT
    Server: lighttpd/1.4.20
    X-Cache: MISS from ws2
    Via: 1.0 ws2 (squid/3.0.STABLE9)
    Connection: close
  11:30:09 ERROR 403: Forbidden.

Thanks,

James
Re: [squid-users] Re-distributing the cache between multiple servers
Henrik/Amos,

Thanks for the replies. You're 100% correct in suggesting that we are using proxy-only.

Thinking a little bit more now about the resilience we want to put in place, and the impact of one of the cache servers going down, I can see that running without proxy-only could be of great benefit to us.

Thanks again for your help.

James

2008/10/17 Amos Jeffries [EMAIL PROTECTED]:
>> Hi, I have two reverse proxy servers using each other as neighbours.
>> [snip]
>
> Sounds like one of the expected side effects of the sibling 'proxy-only'
> setting. If squid were allowed to cache data received from their
> siblings in one of these setups, the hits would balance out naturally.
>
> Amos
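For reference, the option under discussion is proxy-only on the sibling cache_peer line. A sketch, with the hostname and ports made up for illustration:

  # With proxy-only, hits fetched from the sibling are never stored
  # locally, so objects stay pinned to whichever cache fetched them first:
  cache_peer cache-b.example.com sibling 3128 3130 proxy-only

  # Without proxy-only, each squid also caches what it serves from the
  # sibling, so popular objects end up on both servers and the hit
  # rates balance out over time:
  cache_peer cache-b.example.com sibling 3128 3130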
[squid-users] Re-distributing the cache between multiple servers
Hi,

I have two reverse proxy servers using each other as neighbours. The proxy servers are load balanced (using a least-connections algorithm) by a Netscaler upstream of them.

A small number of URLs accounts for around 50% or so of the requests. At the moment there's some imbalance in the hit rates on the two caches because I brought up server A before server B, and A is holding the majority of the objects which make up that 50% of request traffic.

I can see that clearing/expiring both caches should result in an equal hit rate between the two servers. Is this the only way of achieving this? I'm concerned that if I were to add a third server C into the cache pool it'd have an even lower hit rate than A or B.

I spent some time searching but wasn't able to find "Squid administration for dummies" ;)

Thanks,

James