RE: [squid-users] Squid dstdomain ACL
On Fri, 12 Dec 2003, Mike McCall wrote:

> Thanks Duane. Unfortunately, my domains list is HUGE (~600,000 domains) and
> the cache already runs at 50-95% CPU during the day, most of which I assume
> is due to the huge domains list. If I were to lose the dstdomain ACL and
> only use url_regex, would performance stay where it is? Sadly, I can't use
> the second option you mention because google's cache is useful for other
> non-offensive websites.

Ouch.. such a large regex list will give a significant performance hit.

You could extend Squid with a special acl type that applies dstdomain
matching to the hostname embedded in a google cache lookup. This should
allow you to keep the speed the same as using dstdomain.

Regards
Henrik
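The acl type Henrik proposes does not exist in Squid; a rough sketch of the idea in Python (all names, the `cache:` query format, and the tiny blocklist are illustrative assumptions, not Squid internals):

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical sketch: pull the target hostname out of a google cache
# query URL, then run an ordinary dstdomain-style suffix match against
# the blocklist, so the per-request cost stays that of dstdomain.
BLOCKED = {"playboy.com"}  # stands in for the ~600,000-entry list

CACHE_RE = re.compile(r"^cache:([A-Za-z0-9.-]+)")

def cached_host(url):
    """Return the hostname embedded in a google cache query, or None."""
    parts = urlparse(url)
    if not parts.hostname or not parts.hostname.endswith("google.com"):
        return None
    for q in parse_qs(parts.query).get("q", []):
        m = CACHE_RE.match(q)
        if m:
            return m.group(1).lower()
    return None

def dstdomain_match(host, blocklist):
    """dstdomain semantics: match the host itself or any parent domain."""
    labels = host.split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))

def blocked(url):
    host = cached_host(url) or urlparse(url).hostname or ""
    return dstdomain_match(host, BLOCKED)
```

With this, both the direct request and the cache lookup from the original question are denied by the same domain list, without any regex scan.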
RE: [squid-users] Squid dstdomain ACL
> Thanks Duane. Unfortunately, my domains list is HUGE (~600,000 domains) and
> the cache already runs at 50-95% CPU during the day, most of which I assume
> is due to the huge domains list. If I were to lose the dstdomain ACL and
> only use url_regex, would performance stay where it is? Sadly, I can't use
> the second option you mention because google's cache is useful for other
> non-offensive websites.

Switching from dstdomain to url_regex will likely be much less efficient.
dstdomain searching is probably O(log N), while url_regex searching is O(N).

There are some redirectors (like Squirm, Jesred, and squidGuard) that claim
to be very fast and efficient. You might be able to do regex searching with
them faster than with Squid's internal implementation. A nice thing about
redirectors, too, is that you can test them separately before you configure
Squid to use them.

Duane W.
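The complexity difference Duane describes can be sketched in a few lines of Python (illustrative only; Squid's actual dstdomain lookup uses a tree structure, which is where the O(log N) comes from, while a url_regex list is tried pattern by pattern):

```python
import re

# A tiny stand-in for the domain list.
domains = {"playboy.com", "badsite.example"}

# dstdomain-style: one suffix probe per hostname label, essentially
# independent of how many domains are in the list.
def match_dstdomain(host):
    labels = host.split(".")
    return any(".".join(labels[i:]) in domains for i in range(len(labels)))

# url_regex-style: every compiled pattern in the list is tried against
# the full URL, so cost grows linearly with the list size (O(N)).
patterns = [re.compile(re.escape(d)) for d in domains]

def match_url_regex(url):
    return any(p.search(url) for p in patterns)
```

Both return the same verdicts on plain requests; the point is that with 600,000 entries the second function does up to 600,000 regex scans per URL, while the first does at most a handful of lookups.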
RE: [squid-users] Squid dstdomain ACL
> On Fri, 12 Dec 2003, Mike McCall wrote:
>
> > All,
> >
> > I have a fairly busy cache using native squid ACLs to block access to
> > certain sites using the dstdomain ACL type. This is fine for denying
> > access to sites like www.playboy.com, but doesn't work when people use
> > google's cache of pages and google images, since the domain becomes
> > www.google.com.
> >
> > My question; is there an ACL that will deny both
> > http://www.playboy.com and
> > http://www.google.com/search?q=cache:www.playboy.com/?
> >
> > I know regexes might be able to do this, but will there be a
> > performance hit?
>
> You have (at least) two options:
>
> 1) use the 'url_regex' type to block hostnames that appear
> anywhere in the URL, like:
>
>     acl foo url_regex www.playboy.com
>
> The "performance hit" depends on the size of your regex list and the
> load on Squid. If Squid is not currently running at, say, more than
> 50% of CPU usage, you'll probably be fine.
>
> 2) Use a similar ACL to block all google cache queries:
>
>     acl foo url_regex google.com.*cache:
>
> Duane W.

Thanks Duane. Unfortunately, my domains list is HUGE (~600,000 domains) and
the cache already runs at 50-95% CPU during the day, most of which I assume
is due to the huge domains list. If I were to lose the dstdomain ACL and
only use url_regex, would performance stay where it is? Sadly, I can't use
the second option you mention because google's cache is useful for other
non-offensive websites.

Mike
Re: [squid-users] Squid dstdomain ACL
On Fri, 12 Dec 2003, Mike McCall wrote:

> All,
>
> I have a fairly busy cache using native squid ACLs to block access to
> certain sites using the dstdomain ACL type. This is fine for denying
> access to sites like www.playboy.com, but doesn't work when people use
> google's cache of pages and google images, since the domain becomes
> www.google.com.
>
> My question; is there an ACL that will deny both
> http://www.playboy.com and
> http://www.google.com/search?q=cache:www.playboy.com/?
>
> I know regexes might be able to do this, but will there be a performance
> hit?

You have (at least) two options:

1) Use the 'url_regex' type to block hostnames that appear anywhere in
the URL, like:

    acl foo url_regex www.playboy.com

The "performance hit" depends on the size of your regex list and the load
on Squid. If Squid is not currently running at, say, more than 50% of CPU
usage, you'll probably be fine.

2) Use a similar ACL to block all google cache queries:

    acl foo url_regex google.com.*cache:

Duane W.
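An acl line on its own only defines a match; it takes effect through an http_access rule. A minimal squid.conf sketch of Duane's two options together (the acl names and file path are illustrative):

```
# Option 1: regex list loaded from a file, one pattern per line
acl blocked_sites url_regex "/etc/squid/blocked_regex.txt"

# Option 2: deny every google cache query outright
acl google_cache url_regex google\.com.*cache:

http_access deny blocked_sites
http_access deny google_cache
```

These deny rules must appear before any general `http_access allow` rule, since Squid applies http_access lines in order and stops at the first match.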
[squid-users] Squid dstdomain ACL
All,

I have a fairly busy cache using native squid ACLs to block access to
certain sites using the dstdomain ACL type. This is fine for denying access
to sites like www.playboy.com, but doesn't work when people use google's
cache of pages and google images, since the domain becomes www.google.com.

My question; is there an ACL that will deny both
http://www.playboy.com and
http://www.google.com/search?q=cache:www.playboy.com/?

I know regexes might be able to do this, but will there be a performance
hit?

Thanks.

Mike
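The setup described is typically configured along these lines (acl name and file path are illustrative):

```
# dstdomain matches only the host part of the request URL; a leading
# dot also matches subdomains (.playboy.com covers www.playboy.com)
acl blocked_domains dstdomain "/etc/squid/blocked_domains.txt"
http_access deny blocked_domains
```

Because dstdomain looks only at the hostname, a request whose host is www.google.com sails through even when the query string names a blocked site, which is exactly the gap described above.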