Re: [squid-users] question about filesystems and directories for cache.
Tony Dodd wrote:
> Chris Robertson wrote:
>> First of all, thanks for sharing the write-up. There are a number of high-load squid installations (Wikipedia and Flickr are two of the largest I know of), but not much information on what tweaks to make in the interest of performance.
>
> No problem. =] I encountered the same problem when trying to figure out how to get more performance, so I figured once I'd cracked it, the least I could do was document it for the other people having the same issue (and to give myself a reference for later).
>
>> After perusing your posting, I'm wondering if you would run a squidclient -p 80 mgr:info |grep method. I'm making the assumption that your squid is listening on port 80, so please change the argument to -p if needed. Your configuration options included --enable-poll, but with a 2.6 kernel and 2.6 sources, I would be surprised if you are not actually using epoll. It might be a superfluous compile option.
>
> [EMAIL PROTECTED] ~]# squidclient -p 8081 mgr:info |grep method
> IO loop method: poll

Hmm, as Adrian said, try adding --enable-epoll to your options; that should theoretically make a similar difference over poll as aufs makes over ufs.

Also, since you are building from source, try the absolute latest 2.6 around. There is ongoing optimisation work by Adrian that is showing some noticeable speed improvements across the 2.6-teens.

>> Cache digests are not the only method of sharing between peers. ICP is an alternative and I have read that multicast works well for scaling beyond a handful of peers. I can't seem to find the posting now that I want to reference it. I'd trust your experience over my memory of someone else's posting, but I thought I would raise the possibility.
>
> I was under the impression that when utilizing cache peering, it worked better if the squids had a digest of what was on X squid server before asking for it. I could be wrong on that though - Adrian, care to comment on this one? It's now redundant in my situation though, as every peering mechanism is slower than going back to the parent in our use case.

Theoretically yes. Practically ... there are incremental and cyclic digest methods. The former is not much better than multicast ICP. The latter suffers from periodic minor update delays. But none have been adequately benchmarked in squid AFAIK. ... Side project, anyone?

>> I'm surprised you had to specify your hosts file in your squid.conf. /etc/hosts is the default.
>
> There are a couple of bugs in squid that seem to cause issues if you don't actually specify the hosts file within the squid conf... worst case, it's an extra line of config to parse on startup.

Are these bugs in bugzilla? Please add them asap if not.

>> Lastly, I'd be wary of specifying dns_nameservers as a squid.conf option. Squid will use the servers specified in /etc/resolv.conf if this option is not specified. Now you have to maintain name servers in two locations.
>
> Same goes here; DNS lookups were taking 200-1000ms without specifying dns_nameservers in the config (the nameservers specified there are the same ones within /etc/resolv.conf), now they're sub 1ms. There isn't much chance of us re-ip-ing internally, so it's a pretty safe config option for us. I definitely agree that it could cause problems for people using public DNS resolution though.

Hmm, glad it works for you. I think it might have something to do with other settings in resolv.conf, namely 'search' and 'domain', which can produce NXDOMAIN results and so lead to several lookups per name. The default may also include a legacy host name lookup, which dns_nameservers might bypass (although I don't have time to check the code and confirm that).

Worth a note, though. This is going into my todo pile for a later check.

Amos
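
For anyone wanting to check Amos's guess on their own box, here is a minimal sketch that lists the 'nameserver', 'search' and 'domain' entries found in resolv.conf text, so they can be compared by hand against dns_nameservers in squid.conf. The function name and the sample addresses are made up for illustration; it does not talk to Squid or to a resolver, and it does not confirm that 'search' or 'domain' is actually the cause of the slow lookups.

```python
def summarize_resolv_conf(text):
    """Collect nameserver, search and domain entries from resolv.conf text."""
    summary = {"nameserver": [], "search": [], "domain": []}
    for raw_line in text.splitlines():
        # Drop anything after '#' or ';' (comment markers) and surrounding blanks.
        line = raw_line.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        keyword, *values = line.split()
        # Other keywords (options, sortlist, ...) are ignored in this sketch.
        if keyword in summary:
            summary[keyword].extend(values)
    return summary


def has_search_or_domain(summary):
    """True if the file sets a search list or a default domain."""
    return bool(summary["search"] or summary["domain"])
```

To use it, read the file with open("/etc/resolv.conf").read() and pass the text in. If has_search_or_domain() returns True, the settings Amos mentions are present and may be worth testing without.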
Re: [squid-users] question about filesystems and directories for cache.
Tony Dodd wrote:
> Matias Lopez Bergero wrote:
>> Hello,
>> snip
>> I've been reading the wiki and the mailing list to find out which is the best filesystem to use. For now I have chosen ext3 based on comments on the list; also, I have passed the nodev,nosuid,noexec,noatime flags to fstab in order to get better security and faster performance.
>> snip
>
> Hi Matias,
>
> I'd personally recommend against ext3, and point you towards reiserfs. ext3 is horribly slow for many small files being read/written at the same time. I'd also recommend maximizing your disk throughput by splitting the raid and having a cache-dir on each disk; though of course, you'll lose redundancy in the event of a disk failure.
>
> I wrote a howto that revolves around maximizing squid performance; take a look at it, you may find it helpful: http://blog.last.fm/2007/08/30/squid-optimization-guide

Thank you, I'll try that!

Regards,
Matías.
Re: [squid-users] question about filesystems and directories for cache.
Tony Dodd wrote:
> Matias Lopez Bergero wrote:
>> Hello,
>> snip
>> I've been reading the wiki and the mailing list to find out which is the best filesystem to use. For now I have chosen ext3 based on comments on the list; also, I have passed the nodev,nosuid,noexec,noatime flags to fstab in order to get better security and faster performance.
>> snip
>
> Hi Matias,
>
> I'd personally recommend against ext3, and point you towards reiserfs. ext3 is horribly slow for many small files being read/written at the same time. I'd also recommend maximizing your disk throughput by splitting the raid and having a cache-dir on each disk; though of course, you'll lose redundancy in the event of a disk failure.
>
> I wrote a howto that revolves around maximizing squid performance; take a look at it, you may find it helpful: http://blog.last.fm/2007/08/30/squid-optimization-guide

Hi Tony,

First of all, thanks for sharing the write-up. There are a number of high-load squid installations (Wikipedia and Flickr are two of the largest I know of), but not much information on what tweaks to make in the interest of performance.

After perusing your posting, I'm wondering if you would run a squidclient -p 80 mgr:info |grep method. I'm making the assumption that your squid is listening on port 80, so please change the argument to -p if needed. Your configuration options included --enable-poll, but with a 2.6 kernel and 2.6 sources, I would be surprised if you are not actually using epoll. It might be a superfluous compile option.

Cache digests are not the only method of sharing between peers. ICP is an alternative and I have read that multicast works well for scaling beyond a handful of peers. I can't seem to find the posting now that I want to reference it. I'd trust your experience over my memory of someone else's posting, but I thought I would raise the possibility.

I'm surprised you had to specify your hosts file in your squid.conf. /etc/hosts is the default.

Lastly, I'd be wary of specifying dns_nameservers as a squid.conf option. Squid will use the servers specified in /etc/resolv.conf if this option is not specified. Now you have to maintain name servers in two locations.

Chris
Re: [squid-users] question about filesystems and directories for cache.
Chris Robertson wrote:
> First of all, thanks for sharing the write-up. There are a number of high-load squid installations (Wikipedia and Flickr are two of the largest I know of), but not much information on what tweaks to make in the interest of performance.

No problem. =] I encountered the same problem when trying to figure out how to get more performance, so I figured once I'd cracked it, the least I could do was document it for the other people having the same issue (and to give myself a reference for later).

> After perusing your posting, I'm wondering if you would run a squidclient -p 80 mgr:info |grep method. I'm making the assumption that your squid is listening on port 80, so please change the argument to -p if needed. Your configuration options included --enable-poll, but with a 2.6 kernel and 2.6 sources, I would be surprised if you are not actually using epoll. It might be a superfluous compile option.

[EMAIL PROTECTED] ~]# squidclient -p 8081 mgr:info |grep method
IO loop method: poll

> Cache digests are not the only method of sharing between peers. ICP is an alternative and I have read that multicast works well for scaling beyond a handful of peers. I can't seem to find the posting now that I want to reference it. I'd trust your experience over my memory of someone else's posting, but I thought I would raise the possibility.

I was under the impression that when utilizing cache peering, it worked better if the squids had a digest of what was on X squid server before asking for it. I could be wrong on that though - Adrian, care to comment on this one? It's now redundant in my situation though, as every peering mechanism is slower than going back to the parent in our use case.

> I'm surprised you had to specify your hosts file in your squid.conf. /etc/hosts is the default.

There are a couple of bugs in squid that seem to cause issues if you don't actually specify the hosts file within the squid conf... worst case, it's an extra line of config to parse on startup.

> Lastly, I'd be wary of specifying dns_nameservers as a squid.conf option. Squid will use the servers specified in /etc/resolv.conf if this option is not specified. Now you have to maintain name servers in two locations.

Same goes here; DNS lookups were taking 200-1000ms without specifying dns_nameservers in the config (the nameservers specified there are the same ones within /etc/resolv.conf), now they're sub 1ms. There isn't much chance of us re-ip-ing internally, so it's a pretty safe config option for us. I definitely agree that it could cause problems for people using public DNS resolution though.

--
Tony Dodd, Systems Administrator
Last.fm | http://www.last.fm
Karen House 1-11 Baches Street
London N1 6DL

check out my music taste at: http://www.last.fm/user/hawkeviper
Re: [squid-users] question about filesystems and directories for cache.
>> not actually using epoll. It might be a superfluous compile option.
>
> [EMAIL PROTECTED] ~]# squidclient -p 8081 mgr:info |grep method
> IO loop method: poll

Try --enable-epoll and see if your caches are faster?

Adrian
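
When comparing builds before and after adding --enable-epoll, it can help to pull the event loop name out of the mgr:info report in a script. The following is a rough sketch only: it relies on the "IO loop method:" line looking like the one quoted in this thread, and fetch_mgr_info() assumes squidclient is on the PATH and that the cache manager answers on the given port (8081 here, as in Tony's command). Both function names are invented for this example.

```python
import re
import subprocess


def io_loop_method(mgr_info_text):
    """Return the value of the 'IO loop method' line, or None if it is absent."""
    match = re.search(r"^\s*IO loop method:\s*(\S+)", mgr_info_text, re.MULTILINE)
    return match.group(1) if match else None


def fetch_mgr_info(port=8081):
    """Run 'squidclient -p <port> mgr:info' and return its text output.

    Assumes squidclient is installed and the port matches your http_port.
    """
    result = subprocess.run(
        ["squidclient", "-p", str(port), "mgr:info"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling io_loop_method(fetch_mgr_info(8081)) on Tony's box would be expected to give "poll" before the rebuild. What it reports afterwards depends on what the build actually picked up.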
Re: [squid-users] question about filesystems and directories for cache.
Matias Lopez Bergero wrote:
> Hello,
> snip
> I've been reading the wiki and the mailing list to find out which is the best filesystem to use. For now I have chosen ext3 based on comments on the list; also, I have passed the nodev,nosuid,noexec,noatime flags to fstab in order to get better security and faster performance.
> snip

Hi Matias,

I'd personally recommend against ext3, and point you towards reiserfs. ext3 is horribly slow for many small files being read/written at the same time. I'd also recommend maximizing your disk throughput by splitting the raid and having a cache-dir on each disk; though of course, you'll lose redundancy in the event of a disk failure.

I wrote a howto that revolves around maximizing squid performance; take a look at it, you may find it helpful: http://blog.last.fm/2007/08/30/squid-optimization-guide

--
Tony Dodd, Systems Administrator
Last.fm | http://www.last.fm
Karen House 1-11 Baches Street
London N1 6DL

check out my music taste at: http://www.last.fm/user/hawkeviper
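
To make the "one cache-dir per disk" suggestion concrete, here is a small sketch that prints one squid.conf cache_dir line per mount point. The mount points, the size and the choice of aufs are assumptions for illustration and are not taken from Tony's guide. The 16 and 256 are the usual first-level and second-level directory counts. Adjust all of them to your own disks.

```python
def cache_dir_lines(mount_points, size_mb, scheme="aufs", l1=16, l2=256):
    """Build one squid.conf cache_dir line per mount point.

    Line layout: cache_dir <scheme> <directory> <size in MB> <L1 dirs> <L2 dirs>
    """
    return [
        f"cache_dir {scheme} {path} {size_mb} {l1} {l2}"
        for path in mount_points
    ]


def render_cache_dirs(mount_points, size_mb, scheme="aufs"):
    """Join the generated lines into a block ready to paste into squid.conf."""
    return "\n".join(cache_dir_lines(mount_points, size_mb, scheme))
```

For two hypothetical disks mounted at /cache1 and /cache2, render_cache_dirs(["/cache1", "/cache2"], 20000) yields two cache_dir lines, one per disk, so each spindle serves its own share of the objects.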
Re: [squid-users] question about filesystems and directories for cache.
reiserfs 4 is much better than ext3 ...

On Nov 24, 2007 9:55 PM, Tony Dodd [EMAIL PROTECTED] wrote:
> Matias Lopez Bergero wrote:
>> Hello,
>> snip
>> I've been reading the wiki and the mailing list to find out which is the best filesystem to use. For now I have chosen ext3 based on comments on the list; also, I have passed the nodev,nosuid,noexec,noatime flags to fstab in order to get better security and faster performance.
>> snip
>
> Hi Matias,
>
> I'd personally recommend against ext3, and point you towards reiserfs. ext3 is horribly slow for many small files being read/written at the same time. I'd also recommend maximizing your disk throughput by splitting the raid and having a cache-dir on each disk; though of course, you'll lose redundancy in the event of a disk failure.
>
> I wrote a howto that revolves around maximizing squid performance; take a look at it, you may find it helpful: http://blog.last.fm/2007/08/30/squid-optimization-guide
>
> --
> Tony Dodd, Systems Administrator
> Last.fm | http://www.last.fm
> Karen House 1-11 Baches Street
> London N1 6DL
>
> check out my music taste at: http://www.last.fm/user/hawkeviper

--
Sds.
Alexandre J. Correa
Onda Internet / OPinguim.net
http://www.ondainternet.com.br
http://www.opinguim.net
Re: [squid-users] question about filesystems and directories for cache.
On Sat, Nov 24, 2007, Alexandre Correa wrote:
> reiserfs 4 is much better than ext3 ...

[citation needed]

I know reiserfs vs ext2|3 benchmarks in the past showed reiserfs did a little better, but both codebases have advanced over the last few years. I'd love to see an actual up-to-date comparison.

Adrian

--
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
Re: [squid-users] question about filesystems and directories for cache.
Quoting Adrian Chadd [EMAIL PROTECTED]:
> On Sat, Nov 24, 2007, Alexandre Correa wrote:
>> reiserfs 4 is much better than ext3 ...
>
> [citation needed]
>
> I know reiserfs vs ext2|3 benchmarks in the past showed reiserfs did a little better, but both codebases have advanced over the last few years. I'd love to see an actual up-to-date comparison.

All the benchmarking I performed while testing ext3 vs xfs vs reiserfs for squid showed that reiserfs gave the best bang per buck for IO-intensive small file operations... That said, I too would like some definitive numbers/graphs for comparison in different settings. Perhaps next time I rebuild one of my squid boxes, I'll run some benchmarks and document them.

--
Tony Dodd, Systems Administrator
Last.fm | http://www.last.fm
Karen House 1-11 Baches Street
London N1 6DL

check out my music taste at: http://www.last.fm/user/hawkeviper
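
As a starting point for the kind of comparison Adrian and Tony are asking for, here is a very rough small-file timing sketch. It is not the benchmark Tony ran and it does not model a real squid workload: it writes and re-reads a batch of small files in a scratch directory on the filesystem under test, without fsync, so the read pass is likely served from the page cache. The function name, file count and file size are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def small_file_benchmark(directory, file_count=1000, file_size=4096):
    """Time writing, then reading, file_count files of file_size bytes each.

    The files live in a temporary subdirectory of 'directory' and are
    removed when the function returns.
    """
    payload = os.urandom(file_size)
    with tempfile.TemporaryDirectory(dir=directory) as workdir:
        names = [os.path.join(workdir, f"obj{index:06d}") for index in range(file_count)]

        start = time.perf_counter()
        for name in names:
            with open(name, "wb") as handle:
                handle.write(payload)
        write_seconds = time.perf_counter() - start

        start = time.perf_counter()
        bytes_read = 0
        for name in names:
            with open(name, "rb") as handle:
                bytes_read += len(handle.read())
        read_seconds = time.perf_counter() - start

    return {
        "files": file_count,
        "bytes_read": bytes_read,
        "write_seconds": write_seconds,
        "read_seconds": read_seconds,
    }
```

Run it once per candidate filesystem, pointing 'directory' at a mount on that filesystem, and compare the timings across several runs. Numbers from something this simple are only a hint. A proper test would replay real cache traffic.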
[squid-users] question about filesystems and directories for cache.
Hello,

I'm installing a new squid server (I have a couple running already), but this one is going to serve as a gateway for about 450 clients. I have a good piece of hardware for it, but I have just two hard discs, mirrored in RAID 1.

I'd like to get the best performance out of this server, and I think that iowait will be the bottleneck of this setup. So, I'm looking to configure the system in the most optimal way...

I've been reading the wiki and the mailing list to find out which is the best filesystem to use. For now I have chosen ext3 based on comments on the list; also, I have passed the nodev,nosuid,noexec,noatime flags to fstab in order to get better security and faster performance.

I am not sure how to set up the caching directories: whether it would be better to have one directory to store the cache or more than one, and whether to use ufs, aufs or diskd. For now, based on comments on the wiki, I have chosen to have four directories using diskd.

I would like to know what you guys think about this, or if you have any comments or experience with these little tweaks to improve performance.

Any comments are welcome,
BR,
Matías