[squid-users] cpu load boom when rotate the access.log(coss filesystem)
hi all, I have been using the aufs file system for a few days and it works very well, but I hit a problem when I changed from aufs to coss: the CPU load goes very high (nearly 100%) when I rotate the squid access log with the command 'squid -k rotate', and it never comes back down. I googled and found an article about this: http://www.freeproxies.org/blog/2007/12/29/advanced-squid-issues-upkeep-scripts-and-disk-management/ --- "If you have a script rotate your squid logs (as you should have), and the squid cache is rebuilding when you are rotating your logs, squid will not accept any more connections until it has finished rebuilding the storage."

Is this a squid bug? How can I fix it? Thank you.

-- 
Best regards
Felix New
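A workaround implied by the quoted article is to delay rotation until the cache rebuild has finished. A minimal shell sketch, assuming the default log location and the usual "Finished rebuilding storage" message in cache.log (both the path and the exact message text are assumptions; check your own cache.log for the wording your release uses):

#!/bin/sh
# Rotate squid logs only after the cache rebuild has completed.
CACHE_LOG=/usr/local/squid/var/logs/cache.log   # assumed path

# Poll for up to ~10 minutes for the rebuild-complete message.
i=0
while [ $i -lt 60 ]; do
    if grep -q "Finished rebuilding storage" "$CACHE_LOG"; then
        squid -k rotate
        exit 0
    fi
    sleep 10
    i=$((i + 1))
done
echo "cache rebuild still in progress; skipping log rotation" >&2
exit 1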
Re: [squid-users] Confusing redirection behaviour
Thanks, Chris, Henrik. (Apologies to Henrik; I thought I was replying to the list, and forgot that the default is to reply off-list.) I think it was my firewall that was causing a lot of the odd behavior; I hope I have that sorted now... Chris, regarding the 302 redirection and the use of %s, where can I find information on this? I've tried:

deny_info 302:http://192.168.60.254/login.html; lan

but the access-denied page that is served is just /usr/local/squid/share/errors/English/ERR_ACCESS_DENIED.
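For what it's worth, deny_info takes the URL and the ACL name separated only by whitespace (the trailing semicolon above would be treated as part of the URL), and %s in the URL is replaced with the denied request. A minimal sketch using the address and ACL from the post (the ?url=%s parameter name is illustrative, and the 302: status prefix may not be supported on older Squid releases, which redirect with a 302 automatically when deny_info is given a URL):

# Send denied 'lan' requests to a login page, carrying the original URL
deny_info 302:http://192.168.60.254/login.html?url=%s lan
http_access deny lan

deny_info is looked up against the last ACL named on the http_access deny line that rejected the request, so 'lan' must be that ACL.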
Re: [squid-users] cpu load boom when rotate the access.log(coss filesystem)
Felix New wrote:
> hi all, I have been using the aufs file system for a few days and it works very well, but I hit a problem when I changed from aufs to coss: the CPU load goes very high (nearly 100%) when I rotate the squid access log with the command 'squid -k rotate', and it never comes back down. I googled and found an article about this: http://www.freeproxies.org/blog/2007/12/29/advanced-squid-issues-upkeep-scripts-and-disk-management/ --- "If you have a script rotate your squid logs (as you should have), and the squid cache is rebuilding when you are rotating your logs, squid will not accept any more connections until it has finished rebuilding the storage." Is this a squid bug? How can I fix it? Thank you.

FAQ #1: Which squid release are you using?
FAQ #2: What exact configuration are you using (minus default comments)?
Also: What system setup do you have underneath squid? Disks and cache_dirs, etc.

Amos
-- 
Please use Squid 2.6STABLE17+ or 3.0STABLE1+
There are serious security advisories out on all earlier releases.
[squid-users] HTTPS upstream cache problem
Hi, I have a strange problem with my squid configuration. We use an internal squid server to authenticate user requests; it sends all requests not in its local cache to its upstream squid, which then retrieves the content from the internet. This solution works almost perfectly, but in some combinations it does not work as expected. Sometimes, when you click on a link on an http page pointing to an https page, you only get an error generated by the second proxy, telling you it can't connect to "http:443". For example, when I go to the ATI driver download page, the link is:

https://a248.e.akamai.net/f/674/9206/0/www2.ati.com/drivers/6-11-pre-r300_xp-2k_dd_ccc_wdm_38185.exe

In the access.log of the first proxy it appears correct:

TCP_MISS/000 1557 CONNECT a248.e.akamai.net:443 - FIRST_UP_PARENT/192.168.100.11

but in the log of the upstream proxy it looks like:

TCP_MISS/404 0 CONNECT http:443 - DIRECT/-

I have absolutely no idea why, or under which specific conditions, this error occurs. Thanks for your help in advance! Regards, Daniel Becker
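To narrow this down, it can help to reproduce the CONNECT by hand through the first proxy while watching both access logs. A minimal sketch, assuming curl is available and the first proxy listens on 192.168.100.10:3128 (both assumptions; substitute your real proxy address and port):

# Replay the same HTTPS fetch through the first proxy
curl -v -o /dev/null \
  -x http://192.168.100.10:3128 \
  "https://a248.e.akamai.net/f/674/9206/0/www2.ati.com/drivers/6-11-pre-r300_xp-2k_dd_ccc_wdm_38185.exe"

If the mangled "CONNECT http:443" shows up in the upstream log again, the request is being corrupted between the two proxies rather than by a particular browser.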
[squid-users] X-Forwarded-For in Squid3 STABLE1
Hello, is there a new X-Forwarded-For patch to be used with squid3? I've been searching a lot, but without success. In my opinion such a good feature should be added to the squid base code. It's really helpful, especially if you're using a content filter such as DansGuardian. TIA, c0re

-- 
http://www.webcrunchers.com/crunch/
http://www.myspace.com/whippersnappermusic
http://www.purevolume.com/whippersnapper
Re: [squid-users] RAID is good
The point of why I started the discussion is that the statement in the wiki, "Do not use RAID under any circumstances", is at least outdated. Most companies will trade performance for reliability because they depend on internet access for their business and cannot afford 2-48 hours of unavailability. Everybody knows that EMC and HP systems are much more expensive than a JBOD, but this is not a valid reason to say "Never use RAID". "Never use RAID" implies that RAID is *BAD*, which is simply not true. From my point of view, the wiki should say something like:

- If you want the cheapest setup, with modest performance and no availability guarantees, use JBOD.
- If you want cheap, modest performance and availability, use RAID1/RAID5 without a sophisticated disk array (preferably with a RAID card that has 128+ MB of battery-backed write cache).
- If you want the cheapest availability, use RAID5 without a sophisticated disk array.
- If you want expensive, extreme performance and availability, use a sophisticated disk array.

-Marcus

Adrian Chadd wrote:

And I'd completely agree with you, because you're comparing $EXPENSIVE attached storage (that generally is run as RAID) to $NOT_SO_EXPENSIVE local storage which doesn't have .. well, all the fruit. The EMC disk arrays, when treated as JBODs, won't be faster. They're faster because you're rolling massive caches on top of RAID5+striping, or RAID1+0, etc. The trouble is this - none of us have access to high-end storage kit, so developing solutions that'll work there is just not going to happen. I've just acquired a 14-disk Compaq StorageWorks array, so at least I have $MANY disks to benchmark against, but it's still effectively direct-attach JBOD rather than hardware RAID. Want this fixed? Partner with someone who can, or do the benchmarks yourself and publish some results. My experience with hardware RAID5 cards attached to disk arrays (ie, plain shelves, -not- intelligent disk shelves like EMC, etc) is that RAID5 is somewhat slower for the Squid IO patterns. I'd repeat that, but I don't have a U320-enabled RAID5 card here to talk to this shelf.

Adrian

On Tue, Mar 25, 2008, Ben Hollingsworth wrote:

One should also consider the difference between simple RAID and extremely advanced RAID disk systems (i.e. EMC and other arrays). The external disk arrays like EMC with internal RAID5 are simply faster than a JBOD of internal disks.

How many write cycles does EMC use to back up data after one system-used write cycle? How many CPU cycles does EMC spend figuring out which disk the file slice is located on, _after_ squid has already hashed the file location to figure out which disk the file is located on? Regardless of speed, unless you can provide a RAID system which has less than one hardware disk-io read/write per system disk-io read/write, you hit these theoretical limits.

I can't quote disk cycle numbers, but I know that our fiber-connected HP EVA8000's (with ginormous caches and LUNs spread over 72 spindles, even at RAID5) are one hell of a lot faster than the local disks. The 2 Gbps fiber connection is the limiting factor for most of our high-bandwidth apps. In our shop, squid is pretty low bandwidth by comparison. We normally hover around 100 req/sec with occasional peaks at 200 req/sec.

But it's not so much a problem of human-noticeable absolute time; the problem of underlying duplicated disk-io cycles, processor-io cycles, and processor delays remains. For now the CPU half of the problem gets masked by the single-threadedness of squid (never thought you'd see that being a major benefit, eh?).
> If squid begins using all the CPU threads, the OS will lose out on its spare CPU cycles on dual-core machines, and RAID may become a noticeable problem there.

Your arguments are valid for software RAID, but not for hardware RAID. Most nicer systems have a dedicated disk controller with its own processor that handles nothing but the onboard RAID. A fiber-connected disk array is conceptually similar, but with more horsepower. The CPU never has to worry about overhead in this case. Perhaps for these scenarios, squid could use a config flag that tells it to put everything on one disk (as it sees it) and not bother imposing any of its own overhead for operations that will already be done by the array controller.

Ben Hollingsworth
Systems Programmer, BryanLGH Health System
RE: [squid-users] How do I allow access to a specific URL:port_number
If you are using this as an accelerator, try this in squid.conf:

acl SAFE_ports port 10020   # random port
http_port 10020 accel vhost vport defaultsite=www.myisp.com
cache_peer [IP of website] parent 10020 0 no-query originserver name=RPServ
acl ranport myport 10020
cache_peer_access httpsWeb deny ranport
cache_peer_access RPServ allow ranport ourWebSite

Ed Flecko wrote:
> Hi folks, Our ISP has a SPAM server with a web page that you have to be able to reach in order to manage your SPAM settings. I can't figure out how to tell Squid to allow this page. The web page is: myisp.com:10020 I've tried using the always_direct method

Unless you are using a parent cache, this will have no effect.

> and adding the 10020 port number to my Safe_ports,

Unless you have modified the line...

acl Safe_ports port 1025-65535  # unregistered ports

...this is redundant.

> but neither method worked. I always get: The following error was encountered: * Access Denied. Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect. Suggestions???

Supply more details (your squid.conf without comments, the real URL used to access the SPAM page), or hit up the FAQ (http://wiki.squid-cache.org/SquidFaq/SquidAcl).

> Thank you, Ed

Chris
Re: [squid-users] RAID is good
On Wed, Mar 26, 2008 at 3:30 PM, Marcus Kool [EMAIL PROTECTED] wrote:
> The point of why I started the discussion is that the statement in the wiki "Do not use RAID under any circumstances" is at least outdated.

Well, it says: "Don't." Agreed, it's a bit radical. You're welcome to edit the wiki if you wish; just let me know your wiki username so that I may give you write access.

> Most companies will trade performance for reliability because they depend on internet access for their business and cannot afford 2-48 hours of unavailability.

I'm not going to argue with that. The point is that usually there are more cost-effective ways to get the same level of reliability, if not more. For instance, going JBOC (Just a Bunch Of Caches) with load-balancing/high-availability mechanisms (Proxy PAC/WPAD, or Linux Virtual Server with or without VRRP, or any other Layer 2-4 load-balancing solution) is a very effective way to get very high reliability.

> Everybody knows that EMC and HP systems are much more expensive than a JBOD, but this is not a valid reason to say "Never use RAID". "Never use RAID" implies that RAID is *BAD*, which is simply not true. From my point of view, the wiki should say something like:
> - If you want the cheapest setup, with modest performance and no availability guarantees, use JBOD.
> - If you want cheap, modest performance and availability, use RAID1/RAID5 without a sophisticated disk array (preferably with a RAID card that has 128+ MB of battery-backed write cache).
> - If you want the cheapest availability, use RAID5 without a sophisticated disk array.
> - If you want expensive, extreme performance and availability, use a sophisticated disk array.

Agreed, it can be improved. The point that should be driven across is that rather than spending 1kEur for a HW RAID SCSI controller + 5kEur for the disks to go with it, it's much more cost-effective to spend 2kEur for a second server and use VRRP.

-- 
/kinkie
RE: [squid-users] TCP_HIT and TCP_MISS
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Chris Robertson
Sent: March 25, 2008 15:41
To: squid-users@squid-cache.org
Subject: Re: [squid-users] TCP_HIT and TCP_MISS

Guillaume Chartrand wrote:

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Chris Robertson
Sent: March 20, 2008 21:21
To: squid-users@squid-cache.org
Subject: Re: [squid-users] TCP_HIT and TCP_MISS

Guillaume Chartrand wrote:
I tried the solution from the other post to increase file descriptors; I have 4096 when I do a 'ulimit -n'. I recompiled squid. Here is what I have when I run squid -v:

Squid Cache: Version 2.6.STABLE12
configure options:

No configure options at all? The default (last I checked) did not allow for support of aufs cache dirs.

I've just run ./configure with no other options. Now I am looking for how to enable aufs cache dirs, but I didn't find which option I need to enable. The only option that comes close is --with-aufs-threads=N_THREADS. If it's that option, how many threads should I put?

Personally I use --enable-storeio=aufs,null,ufs, with a few other options (with-large-files, enable-snmp, enable-removal-policies=heap,lru). There is a message in the archives (somewhere, I can't find it right now) where Henrik explains that ufs is actually better for a lightly loaded cache (less overhead), but the blocking becomes a factor as the requests per second rise. For me it has become reflexive to suggest using aufs.

2008/03/18 09:11:01| NOTICE: no explicit transparent proxy support enabled. Assuming getsockname() works on intercepted connections
2008/03/18 09:11:01| WARNING: Forwarding loop detected for:
Client: 172.20.20.18 http_port: 172.20.20.18:3128
GET http://172.20.20.18:3128/design/motherbd/software/ias/updates.htm HTTP/1.0
Via: 1.0 squid.collanaud.qc.ca:3128 (squid/2.6.STABLE12)
X-Forwarded-For: 172.21.132.93
Host: 172.20.20.18:3128
Cache-Control: max-age=259200
Connection: keep-alive

So your router is intercepting Squid's traffic and redirecting it back to Squid. That's not so good. In a big way. This was not addressed in your reply. Hopefully it was addressed on your network. I feel it's the real issue.

And here is some of my squid.conf:

# Squid normally listens to port 3128
#http_port 3128
http_port 3128 transparent
#Default:
# cache_mem 8 MB
cache_mem 512 MB
#Default:
# maximum_object_size 4096 KB
maximum_object_size 25600 KB
#Default:
cache_dir ufs /usr/local/squid/var/cache 2500 16 256   # this one is a symlink to another disk

A symlink, or is the other disk mounted here? No matter, but you should probably be using aufs, which you will have to compile support for.

It's a symlink only.

Interesting. Any reason for using a symlink instead of a mount point?

I deleted the symlink and made a mount point.

cache_dir ufs /usr/local/squid/var/cache2 2500 16 256

So how much memory does this box have? You've dedicated about a GB of RAM to Squid alone (512 cache_mem + (5GB of cache_dir * 0.1)).

The machine is a VMware virtual machine with 2 disks of 40GB each and 1GB of RAM.

Ugh. My calculation was off by an order of magnitude. 5GB of cache will only take (on average) 50MB of memory. See http://wiki.squid-cache.org/SquidFaq/SquidMemory for more details. But be aware, squid does not put objects from the disk cache BACK into memory. Only objects fetched from the net are put in the memory cache.
I've downgraded my cache_mem to 100 MB:
cache_mem 100 MB

I've modified the cache_dir lines:
cache_dir aufs /usr/local/squid/var/cache 2500 16 256
cache_dir aufs /usr/local/squid/var/cache2 2500 16 256

Then I ran 'make clean' in my src dir, deleted all files and directories in my cache directory, made my second disk a mount point at /usr/local/squid/var/cache2 (and deleted the symlink to have direct access), and recompiled with this line:

./configure --enable-storeio=aufs,null,ufs --with-large-files
make
make install

I reran squid with the -z option to recreate my cache directories, then ran squid normally. Squid has now been running for about 12 hours and I don't get any TCP_HIT. I tried to access a gif with 2 different browsers on the same machine, and here is what I got in my access.log:

1206545448.392   6413 172.20.51.11 TCP_MISS/200 596 GET http://www.nu.nl/img/balkje.gif - DIRECT/62.69.179.208 image/gif
1206545708.067    252 172.20.51.11 TCP_MISS/200 596 GET http://www.nu.nl/img/balkje.gif - DIRECT/62.69.184.229 image/gif

So I don't know again what I misconfigured. Thanks
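For reference, the fuller configure invocation Chris described earlier (aufs plus his other options) would look something like this; the flag spellings follow the standard Squid 2.6 ./configure options:

./configure --enable-storeio=aufs,null,ufs \
            --with-large-files \
            --enable-snmp \
            --enable-removal-policies=heap,lru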
Re: [squid-users] RAID is good
Kinkie wrote:
> On Wed, Mar 26, 2008 at 3:30 PM, Marcus Kool [EMAIL PROTECTED] wrote:
>> The point of why I started the discussion is that the statement in the wiki "Do not use RAID under any circumstances" is at least outdated.
>
> Well, it says: "Don't." Agreed, it's a bit radical. You're welcome to edit the wiki if you wish; just let me know your wiki username so that I may give you write access.

OK, I am willing to do the edit and also include the VRRP/JBOC approach as an alternative. My wiki name is Marcus Kool.

>> Most companies will trade performance for reliability because they depend on internet access for their business and cannot afford 2-48 hours of unavailability.
>
> I'm not going to argue with that. The point is that usually there are more cost-effective ways to get the same level of reliability, if not more. For instance, going JBOC (Just a Bunch Of Caches) with load-balancing/high-availability mechanisms (Proxy PAC/WPAD, or Linux Virtual Server with or without VRRP, or any other Layer 2-4 load-balancing solution) is a very effective way to get very high reliability.
>
>> Everybody knows that EMC and HP systems are much more expensive than a JBOD, but this is not a valid reason to say "Never use RAID". "Never use RAID" implies that RAID is *BAD*, which is simply not true. From my point of view, the wiki should say something like:
>> - If you want the cheapest setup, with modest performance and no availability guarantees, use JBOD.
>> - If you want cheap, modest performance and availability, use RAID1/RAID5 without a sophisticated disk array (preferably with a RAID card that has 128+ MB of battery-backed write cache).
>> - If you want the cheapest availability, use RAID5 without a sophisticated disk array.
>> - If you want expensive, extreme performance and availability, use a sophisticated disk array.
>
> Agreed, it can be improved. The point that should be driven across is that rather than spending 1kEur for a HW RAID SCSI controller + 5kEur for the disks to go with it, it's much more cost-effective to spend 2kEur for a second server and use VRRP.
Re: [squid-users] RAID is good
On Wed, Mar 26, 2008 at 5:04 PM, Marcus Kool [EMAIL PROTECTED] wrote:
> Kinkie wrote:
>> [...] You're welcome to edit the wiki if you wish; just let me know your wiki username so that I may give you write access.
>
> OK, I am willing to do the edit and also include the VRRP/JBOC approach as an alternative. My wiki name is Marcus Kool.

Write access granted. Welcome on board, and thanks for your contribution.

-- 
/kinkie
[squid-users] 3.0.2 ncsa_auth broken?
Hello all, I'm still looking at it, but it looks like 3.0.1 had no problem with this. I have squid 3.0.2 with squidGuard from FreeBSD ports running on 7.0, 6.2 and 6.3, and it acted the same way on all of them; going back to 3.0.1 made the problem go away. After I upgraded to 3.0.2 and restarted squid, only people that already had auth still worked; new users did not get a prompt for auth and therefore received the cache access denied page. Anyone else have this issue? Thanks in advance,
Re: [squid-users] 3.0.2 ncsa_auth broken?
B. Cook wrote:
> Hello all, I'm still looking at it, but it looks like 3.0.1 had no problem with this. I have squid 3.0.2 with squidGuard from FreeBSD ports running on 7.0, 6.2 and 6.3, and it acted the same way on all of them; going back to 3.0.1 made the problem go away. After I upgraded to 3.0.2 and restarted squid, only people that already had auth still worked; new users did not get a prompt for auth and therefore received the cache access denied page. Anyone else have this issue?

I think so... http://www.squid-cache.org/bugs/show_bug.cgi?id=2206

> Thanks in advance,

Chris
Re: [squid-users] TCP_HIT and TCP_MISS
Guillaume Chartrand wrote:
> I've downgraded my cache_mem to 100 MB, modified my cache_dir lines to aufs, run 'make clean' in my src dir, deleted all files and directories in my cache directory, made my second disk a mount point at /usr/local/squid/var/cache2 (deleting the symlink), recompiled with './configure --enable-storeio=aufs,null,ufs --with-large-files', ran make and make install, reran squid with -z to recreate the cache directories, then ran squid normally. Squid has now been running for about 12 hours and I don't get any TCP_HIT. I tried to access a gif with 2 different browsers on the same machine, and here is what I got in my access.log:
>
> 1206545448.392   6413 172.20.51.11 TCP_MISS/200 596 GET http://www.nu.nl/img/balkje.gif - DIRECT/62.69.179.208 image/gif
> 1206545708.067    252 172.20.51.11 TCP_MISS/200 596 GET http://www.nu.nl/img/balkje.gif - DIRECT/62.69.184.229 image/gif
>
> So I don't know again what I misconfigured. Thanks

Have you solved the forwarding loop? What is the output of:

/usr/local/squid/bin/squidclient -s http://www.nu.nl/img/balkje.gif
/usr/local/squid/bin/squidclient http://www.nu.nl/img/balkje.gif

(run on the squid box)?

Chris
[squid-users] Using a parent cache for content filtering only
Hello all, I run a small squid cache for a high school that, due to location and budget, has a very limited internet connection and therefore must try to conserve bandwidth. The school wishes to have content filtering in place to prevent students from accessing inappropriate content while on school grounds. Currently the school uses a filtering service via a parent proxy; however, this server is located in England and the school is in Ontario, Canada. The way it is set up now, squid will only connect through the parent proxy to retrieve pages, and it is very inefficient to route all the school's http requests to Europe and back. This is causing 2 problems:

1) Often when trying to load a webpage it appears to get stuck at "connecting to the server", but hitting stop and then refresh will load the page very quickly.

2) Downloads over HTTP start at only a few kb/s. Sometimes they stay that way; other times, after 4 or 5 minutes, they speed up to most of the school's available speed of around 180 kb/s. This problem used to extend to FTP as well, until I used always_direct for FTP transfers. However, this means there is no filtering of ftp traffic.

I disabled the parent cache and tested the speed, and it was a remarkable difference: no stutters when loading pages and no problems with HTTP downloads, but it also means no filtering. What I would like to do is have squid check with the parent proxy for permission to access the site, but then connect directly to the hosting server to actually make the transfer. I am not sure if it is even possible, but if anyone has some ideas I'd love to hear them. I've included most of squid.conf for completeness' sake:

http_port 192.168.3.1:3128
cache_peer parent-proxy.co.uk parent 2326 0

## Direct connections to FTP sites. FTP transfers suffered from terrible performance until I did this
acl FTP proto FTP
acl HTTP proto HTTP
acl HTTPS proto HTTPS
cache_peer_access parent-proxy.co.uk allow HTTP
cache_peer_access parent-proxy.co.uk allow HTTPS
cache_peer_access parent-proxy.co.uk deny FTP
never_direct allow HTTP
never_direct allow HTTPS
always_direct allow FTP

#NTLM
auth_param basic program /usr/lib/squid/smb_auth -W domain -U domain_controller
auth_param basic children 10
auth_param basic realm Squid Proxy Server
auth_param basic credentialsttl 1 hour
acl password proxy_auth REQUIRED
acl all src 0.0.0.0/0.0.0.0
http_access allow purge localhost
http_access deny purge
http_access deny !Safe_ports
http_access allow password
http_access deny all
http_reply_access allow all

Thanks in advance
Luke Taylor
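One way to approximate the check-then-go-direct behaviour asked about here is an external ACL helper that consults a local copy of the filter policy while squid fetches content directly. To be clear, this is a swapped-in technique rather than something the hosted parent-proxy service necessarily supports, and the helper name, path, and blocklist file below are all hypothetical. A minimal sketch (external_acl_type exists in Squid 2.5+):

# squid.conf: ask a local helper whether each URL is allowed,
# then remove the cache_peer/never_direct lines so squid goes direct
external_acl_type urlcheck ttl=300 %URI /usr/local/bin/check_url.sh
acl url_allowed external urlcheck
http_access deny !url_allowed

and a trivial helper, which must read one URI per line on stdin and answer OK (allowed) or ERR (blocked):

#!/bin/sh
# Hypothetical helper: deny any URL containing a blocklist entry.
BLOCKLIST=/etc/squid/blocklist.txt
while read uri rest; do
    if echo "$uri" | grep -qFf "$BLOCKLIST"; then
        echo ERR
    else
        echo OK
    fi
done

Whether a local blocklist can stand in for the hosted filtering service is of course a policy question, not a technical one.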
[squid-users] Forwarding client ip address
I am using squid 2.6 Stable18 as a Reverse Proxy / Accelerator to my internal website. I would like to have it forward the requesting client's IP address to the web server and I am unsure how to do this. Thanks, Keith
Re: [squid-users] bug? (was cache deny and the 'public' token)
On Tue, 2008-03-25 at 18:13 -0700, Ric wrote:
>> Even then you have the same problem. A public response is a cache hit even if the request carries authentication.
>
> Umm... only if it contains a public cache control token. That's the point of the public token. That's why your backend should only add this token to items that contain no personalization.

Not what I meant. What I meant was:

1. An unauthenticated request for the resource, causing a public version to get cached.
2. An authenticated request. This will see a cache hit on the previously cached public response.

> But if we're authenticating via cookie instead, then the public token is sort of pointless and the private token may be more useful.

Yes.

> I believe this is incorrect. We don't care whether the *request* contains a personalized cookie; we only care if the subsequent *response* is personalized.

The cache cares. It needs to be able to identify what to use for this request.

> For example, even a personalized response to an authenticated request may assemble several non-personalized inline graphics, css, and javascript.

Yes, but each of those is an individual URL and handled separately. If non-personal, then make them public.

> Maybe I'm missing something here, but I don't see a need for Vary: Cookie for non-personalized content (unless one is doing some sort of cookie-based content negotiation... shudder), and of course we don't want to cache personalized content.

No, but you want to provide a split view on the same URL, providing both a public copy for anonymous guests and a personalized copy for authenticated users.

Regards
Henrik
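To illustrate the split view Henrik describes, the origin can vary its caching headers on the cookie. A minimal sketch of the two responses for the same URL (the header values are illustrative, not taken from the thread):

For an anonymous visitor (cacheable, shared by all cookie-less requests):

HTTP/1.1 200 OK
Cache-Control: public, max-age=300
Vary: Cookie

For an authenticated user (never served from a shared cache):

HTTP/1.1 200 OK
Cache-Control: private
Vary: Cookie

With Vary: Cookie on both, the cache keys the stored variant on the Cookie request header, so the public copy is only reused for requests carrying the same (here, absent) cookie.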
Re: [squid-users] TCP_HIT and TCP_MISS
Guillaume Chartrand wrote:
>> Have you solved the forwarding loop?
>
> Nope. If I understand correctly, the loop means that when my squid box itself tries to go to the web, the router redirects the traffic back to squid. Is that what it means?

That's how I interpret it.

> If so, I will try to modify the ACL on my router to not redirect packets coming from my squid box.

Which might fix the problem... It's a good idea in any case.

>> What is the output of:
>> /usr/local/squid/bin/squidclient -s http://www.nu.nl/img/balkje.gif
>> /usr/local/squid/bin/squidclient http://www.nu.nl/img/balkje.gif
>> (run on the squid box)?
>
> Nothing happens on the command line, but here is the result in my access.log:
>
> 1206562805.600    570 127.0.0.1 TCP_DENIED/403 1514 GET http://www.nu.nl/img/balkje.gif - NONE/- text/html
> 1206562805.611      0 127.0.0.1 TCP_DENIED/403 1514 GET http://www.nu.nl/img/balkje.gif - NONE/- text/html

Fair enough. You aren't allowing localhost to surf. For this test to work, either copy squidclient to another machine on your network (and use the -h flag with squidclient to specify your squid host) or allow localhost http_access ("http_access allow localhost" just above "http_access deny all").

> Thanks again
> Guillaume

Chris
[squid-users] Ignore If-Modified-Since
Hi, is there any way I can simply ignore the If-Modified-Since header that comes in the request, and always return 200 OK with the content attached? Regards, Pablo
Re: [squid-users] Forwarding client ip address
On Wed, Mar 26, 2008 at 10:33 PM, Keith M. Richard [EMAIL PROTECTED] wrote:
> I am using squid 2.6 Stable18 as a Reverse Proxy / Accelerator to my internal website. I would like to have it forward the requesting client's IP address to the web server, and I am unsure how to do this.

You can either go transparent, or just have your servers use the content of the X-Forwarded-For HTTP header instead of the client's source IP.

-- 
/kinkie
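As a concrete illustration of the second option: squid adds X-Forwarded-For to requests it forwards by default (the forwarded_for directive defaults to on), so the origin server only needs to read it. A minimal sketch for an Apache backend (the backend being Apache is an assumption; the post doesn't say what the web server is):

# httpd.conf: log the proxied client address from X-Forwarded-For
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" xff_combined
CustomLog logs/access_log xff_combined

Application code can likewise read the X-Forwarded-For request header wherever it would otherwise use the socket peer address.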
Re: [squid-users] RAID is good (was: Re: [squid-users] Hardware setup ?)
On Tue, Mar 25, 2008 at 1:23 PM, Marcus Kool [EMAIL PROTECTED] wrote:
> I wish that the wiki section on RAID were rewritten. Companies depend on internet access and a working Squid proxy, and therefore the advocated "no problem if a single disk fails" is not from today's reality. One should also consider the difference between simple RAID and extremely advanced RAID disk systems.

Recently I've spent a fair bit of time benchmarking a Squid system whose COSS and AUFS storage (10GB total) + access logging are on a RAID0 array of two consumer-grade SATA disks. For various reasons I'm stuck with RAID0 for now, but I thought you might be interested to hear that the box performs pretty well. The box can handle a 600-700 req/sec Polygraph polymix-4 benchmark with a ~40% document hit ratio. Doubling the total storage to 20GB increased the doc hit ratio to 55%, but hit response times began to increase noticeably during the top phases. CPU was about 5% idle during the top phases. Logs were being rotated and compressed every five minutes.

Some initial experiments suggest that removing RAID doesn't particularly improve performance, but I intend to do a more thorough set of benchmarks soon. I'm not sure how relevant this is to your discussion. I don't know how RAID0 performance is expected to compare to RAID5. I'll post here if and when I do more benchmarking without RAID.

-RichardW.

== Spec ==
CPU: Intel(R) Celeron(R) CPU 2.53GHz
RAM: 3GB
Disks: 2 x Seagate Barracuda 160GB
Squid: 2.6.STABLE17
Linux Kernel: 2.6.23.8
FS: reiserfs

== Squid Conf (extract) ==
# NETWORK OPTIONS
http_port 800 transparent
# MEMORY CACHE OPTIONS
cache_mem 152 MB
maximum_object_size_in_memory 50 KB
# DISK CACHE OPTIONS
cache_replacement_policy lru
# TOTAL AVAILABLE STORAGE: 272445 MB
# MEMORY STORAGE LIMIT: 46694 MB
# CONFIGURED STORAGE LIMIT: 1 MB
cache_dir coss /squid_data/squid/coss0 2000 max-size=16000
cache_swap_log /squid_data/squid/%s
cache_dir coss /squid_data/squid/coss1 2000 max-size=16000
cache_swap_log /squid_data/squid/%s
cache_dir coss /squid_data/squid/coss2 2000 max-size=16000
cache_swap_log /squid_data/squid/%s
cache_dir aufs /squid_data/squid 4000 16 256
max_open_disk_fds 0
maximum_object_size 2 KB
# LOGFILE OPTIONS
debug_options ALL,1
buffered_logs on
logfile_rotate 10
# MISCELLANEOUS
memory_pools_limit 10 MB
memory_pools off
cachemgr_passwd none all
client_db off
RE: [squid-users] RAID is good (was: Re: [squid-users] Hardware setup ?)
> Recently I've spent a fair bit of time benchmarking a Squid system whose COSS and AUFS storage (10GB total) + access logging are on a RAID0 array of two consumer-grade SATA disks. For various reasons I'm stuck with RAID0 for now, but I thought you might be interested to hear that the box performs pretty well.

I don't think anyone will be interested in RAID0, as Squid's simultaneous access of each cache_dir on different disks is loosely analogous to RAID0. RAID1, on the other hand, is very interesting.

> Some initial experiments suggest that removing RAID doesn't particularly improve performance, but I intend to do a more thorough set of benchmarks soon.

Following on from my comment above, a single 20-gig RAID0 cache_dir is probably not that much different from two 10-gig cache_dirs on single disks. If using aufs, then the RAID0 would only run as a single thread, so that may adversely affect performance. I guess that RAID0 would offer a worse seek time from squid's perspective, as each request from squid is serialised, but the data transfer rate will be higher for a particular object. I imagine squid is more sensitive to seek than throughput. I'm just speculating on all this...

Also, are you using the noatime mount option with reiserfs? Do you know what your 600-700 req/sec Polygraph polymix-4 benchmark is in Mbps?
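For reference on the noatime question: it is set as a mount option. A minimal fstab sketch, assuming the reiserfs cache partition is /dev/sdb1 mounted at /squid_data (the device and mount point are assumptions):

# /etc/fstab: skip access-time updates (and reiserfs tail packing) on the cache disk
/dev/sdb1  /squid_data  reiserfs  noatime,notail  0 0

noatime avoids a metadata write on every read, which matters for a cache that constantly reads small objects; notail is reiserfs-specific and is often recommended for Squid workloads.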
RE: [squid-users] Using a parent cache for content filtering only
> I disabled the parent cache and tested the speed, and it was a remarkable difference.

Performance problems on the parent? Using a parent in another country would affect latency, but shouldn't affect throughput.
Re: [squid-users] Using a parent cache for content filtering only
I'm quite certain there are performance issues on it, but it's a black box from where I stand, and I'm somewhat stuck with it, which is why I'm hoping to poll it for permission and then make direct connections.

On Wed, Mar 26, 2008 at 9:28 PM, Adam Carter [EMAIL PROTECTED] wrote:
>> I disabled the parent cache and tested the speed, and it was a remarkable difference.
>
> Performance problems on the parent? Using a parent in another country would affect latency, but shouldn't affect throughput.
Re: [squid-users] RAID is good
Richard,

RAID0 is considered to have worse performance than a JBOD of 2 disks with one cache directory per disk. Since you mentioned that you have to stick with RAID0, all you can do is optimize the RAID0 usage. Only one cache directory per disk is recommended, while you have 4 cache directories on one file system. Consider dropping 2 COSS cache directories so that you have 1 COSS and 1 AUFS.

Kinkie and I rewrote the "RAID for Squid" section of the FAQ, and it includes more details about price, performance and reliability trade-offs. You will find that software RAID5 is the slowest option.

-Marcus

Richard Wall wrote:
> Recently I've spent a fair bit of time benchmarking a Squid system whose COSS and AUFS storage (10GB total) + access logging are on a RAID0 array of two consumer-grade SATA disks. For various reasons I'm stuck with RAID0 for now, but I thought you might be interested to hear that the box performs pretty well. The box can handle a 600-700 req/sec Polygraph polymix-4 benchmark with a ~40% document hit ratio. Doubling the total storage to 20GB increased the doc hit ratio to 55%, but hit response times began to increase noticeably during the top phases. CPU was about 5% idle during the top phases. Logs were being rotated and compressed every five minutes.
>
> Some initial experiments suggest that removing RAID doesn't particularly improve performance, but I intend to do a more thorough set of benchmarks soon. I'm not sure how relevant this is to your discussion. I don't know how RAID0 performance is expected to compare to RAID5. I'll post here if and when I do more benchmarking without RAID.
>
> -RichardW.
>
> [... spec and squid.conf extract snipped; see the original post above ...]
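A minimal sketch of the layout Marcus suggests, one COSS directory plus one AUFS directory on the single filesystem (the sizes and paths below just redistribute Richard's original 10GB for illustration, and the min-size pairing is an assumption, not something from the thread):

# squid.conf: small objects in COSS, everything larger in AUFS
cache_dir coss /squid_data/squid/coss0 6000 max-size=16000
cache_swap_log /squid_data/squid/%s
cache_dir aufs /squid_data/squid/aufs0 4000 16 256 min-size=16000

The min-size on the AUFS line steers objects under 16000 bytes into the COSS directory, so the two stores don't compete for the same objects.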
Re: [squid-users] RAID is good (was: Re: [squid-users] Hardware setup ?)
Richard Wall wrote:
> On Tue, Mar 25, 2008 at 1:23 PM, Marcus Kool [EMAIL PROTECTED] wrote:
>> I wish that the wiki section on RAID were rewritten. Companies depend on internet access and a working Squid proxy, and therefore the advocated "no problem if a single disk fails" is not from today's reality. One should also consider the difference between simple RAID and extremely advanced RAID disk systems.
>
> Recently I've spent a fair bit of time benchmarking a Squid system whose COSS and AUFS storage (10GB total) + access logging are on a RAID0 array of two consumer-grade SATA disks. For various reasons I'm stuck with RAID0 for now, but I thought you might be interested to hear that the box performs pretty well. The box can handle a 600-700 req/sec Polygraph polymix-4 benchmark with a ~40% document hit ratio.

vs the 850 req/sec Adrian has demonstrated for those Squid releases using Polygraph on slower servers. The difference is rather interesting. Thank you very much. If we can get more submissions like this, we can update the wiki with actual req/sec performance ratings rather than the vague low/good. (Hint, hint.)

> Doubling the total storage to 20GB increased the doc hit ratio to 55%, but hit response times began to increase noticeably during the top phases. CPU was about 5% idle during the top phases. Logs were being rotated and compressed every five minutes.
>
> Some initial experiments suggest that removing RAID doesn't particularly improve performance, but I intend to do a more thorough set of benchmarks soon. I'm not sure how relevant this is to your discussion. I don't know how RAID0 performance is expected to compare to RAID5. I'll post here if and when I do more benchmarking without RAID.
>
> -RichardW.
>
> [... spec and squid.conf extract snipped; see the original post above ...]