Re: performance of AAC-RAID (ICP9087MA) - NFS actually
Hello again!

Andrew Sharp wrote:
> On Tue, Sep 12, 2006 at 02:17:48PM +0200, Erik Mouw wrote:
>> On Tue, Sep 12, 2006 at 11:50:46AM +0200, Raimund Jacob wrote:
>>> My largish CVS module checks out (cvs up -dP actually) in about 1s
>>> when I do it locally on the server machine. It also takes about 1s
>>> when I check it out on a remote machine but on a local disk. On the
>>> same remote machine via NFS it takes about 30s. So NFS is actually
>>> the problem here, not the ICP.
>>
>> One of the main problems with remote CVS is that it uses /tmp on the
>> server. Make sure that is a fast and large disk as well, or tell CVS
>> to use another (fast) directory as scratch space.

ok, /tmp is fast enough. it's only NFS performance i have problems with.
raid - cvs pserver - local disk is pretty fast.

>>> Furthermore I observed this: I ran 'vmstat 1'. Checking out locally
>>> shows a 'bo' of about 1MB during the second it takes. During the
>>> checkout via NFS there is a sustained 2 to 3 MB 'bo' on the server.
>>> So my assumption is that lots of fs metadata gets updated during
>>> that 30s (the files don't actually change) and due to the sync
>>> nature of the mount everything is committed to disk pretty hard
>>> (ext3) - and that is what I'm waiting for.
>>
>> Mounting filesystems with -o noatime,nodiratime makes quite a
>> difference.

yeah, i guess so. but on this fs i want to keep my atimes.

>> If you're using ext3 with lots of files in a single directory, make
>> sure you're using htree directory indexing. To see if it is enabled:
>>
>>   dumpe2fs /dev/whatever
>>
>> Look for the features line; if it has dir_index, it is enabled. If
>> not, enable it with (can be done on a mounted filesystem):
>>
>>   tune2fs -O dir_index /dev/whatever
>>
>> Now all new directories will be created with a directory index. If
>> you want to enable it on all directories, unmount the filesystem and
>> run e2fsck on it:
>>
>>   e2fsck -f -y -D /dev/whatever

that's a nice hint. i'll do that next time i reboot (not anytime soon :).
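The dir_index check Erik describes can be sketched as a small shell test. The features line below is a made-up sample so the snippet is self-contained; on a real system you would obtain it with `dumpe2fs -h /dev/whatever` (the device name is an assumption):

```shell
# Decide whether dir_index (htree directory indexing) is enabled on an
# ext3 filesystem. Sample dumpe2fs output line; on a real system:
#   dumpe2fs -h /dev/whatever 2>/dev/null | grep '^Filesystem features'
features="Filesystem features:  has_journal ext_attr resize_inode dir_index filetype needs_recovery"

if printf '%s\n' "$features" | grep -qw dir_index; then
    echo "dir_index enabled"
else
    # Enable it for newly created directories (works while mounted):
    #   tune2fs -O dir_index /dev/whatever
    # Then rebuild existing directories offline:
    #   umount /dev/whatever && e2fsck -f -y -D /dev/whatever
    echo "dir_index missing"
fi
```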
>> Increasing the journal size can also make a difference, or try
>> putting the journal on a separate device (quite invasive, make sure
>> you have a backup). See tune2fs(8).

ack.

> These are all good suggestions for speedups, especially this last one,
> but I would think that none of this should really be necessary unless
> your load is remarkably high, not just one user doing a cvs checkout.
> I would strace the cvs checkout with timestamps and see where it is
> waiting. It seems to me like this has more to do with some
> configuration snafu than any of this stuff.

as i described, it's all NFS's fault.

> Why are you trying to configure it this way anyway? Just use the
> standard client/server configuration. You'll probably be glad you did.
> And it seems to work a lot faster that way anyway ~:^)

well, a shared /home among multiple unix workstations is not that
uncommon.

>>> Here is what I will try next (when people leave the office):
>>>
>>> - Mount the exported fs as data=journal - the NFS-HOWTO says this
>>>   might improve things. I hope this works with remount since reboot
>>>   is not an option.
>
> I personally would NOT do this. There is a good reason why none of the
> top-performing journaling file systems journal data by default.

>> I don't think it makes a difference. I'd rather say it makes things
>> worse because it forces all *data* (and not only the metadata)
>> through the journal.

ok, i see. i found that in the NFS-HOWTO and probably got it wrong. the
manpage wasn't really enlightening either. your comments make sense, so
i won't even try.

so, after having another look at our UPS and at the other folks, i
decided to just async-export the fs and - of course - that solved all
problems. the very same cvs checkout takes 2 to 4 seconds now (not 30).
i twiddled the rsize/wsize, but the effect is below noise; it seems our
LAN is as good as it gets.

bottom line: the ICP behaves as it should, as far as one can notice. the
performance problems were due to NFS and were solved by exporting async.
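The fix that finally worked - switching the export to async - would look roughly like this; the path and options follow the setup described earlier in the thread. Note async acknowledges writes before they reach disk, so a server crash can silently lose data, UPS or not:

```shell
# /etc/exports -- async trades crash safety for latency:
#   /raid/home  *(rw,async,no_root_squash)
#
# Re-export without restarting the NFS server:
#   exportfs -ra
#
# Verify the active export options:
#   exportfs -v | grep /raid/home
```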
local fs optimizations on the server and NFS client options are still to
be tuned, but everything is at acceptable speed now. thanks for all the
suggestions, i learned something in this thread.

Raimund

--
The solution for efficient customer relationship management. Find out
more: http://www.universal-messenger.de

Pinuts media+science GmbH       http://www.pinuts.de
Dipl.-Inform. Raimund Jacob     [EMAIL PROTECTED]
Krausenstr. 9-10                voice: +49 30 59 00 90 322
10117 Berlin                    fax:   +49 30 59 00 90 390
Germany

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of
"unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Re: performance of AAC-RAID (ICP9087MA)
Hello! And thanks for your suggestions.

Erik Mouw wrote:
> On Fri, Sep 08, 2006 at 05:08:52PM +0200, Raimund Jacob wrote:
>> Checking out a largish CVS module is no fun. The data is retrieved
>> via cvs pserver from the file server and written back via NFS into my
>> home directory. This process is sometimes pretty quick and sometimes
>> blocks in between, as if the RAID controller has to think about the
>> requests. I know this phenomenon only from a megaraid controller,
>> which we eventually canned for a pure linux software raid (2-disk
>> mirror). Also, compiling in the nfs-mounted home directory is too
>> slow - even on a 1000Mbit link.
>
> Try a different IO scheduler. You probably have the anticipatory
> scheduler; you want to give the cfq scheduler a try:
>
>   echo cfq > /sys/block/[device]/queue/scheduler
>
> For NFS, you also want to increase the number of daemons. Put the line
> RPCNFSDCOUNT=32 in /etc/default/nfs-kernel-server.

Thanks for these hints. In the meantime I was also reading up on the
NFS-HOWTO's performance section. Playing around with the rsize/wsize did
not turn up much - it seems they don't really matter in my case.

My largish CVS module checks out (cvs up -dP actually) in about 1s when
I do it locally on the server machine. It also takes about 1s when I
check it out on a remote machine but on a local disk. On the same remote
machine via NFS it takes about 30s. So NFS is actually the problem here,
not the ICP.

Furthermore I observed this: I ran 'vmstat 1'. Checking out locally
shows a 'bo' of about 1MB during the second it takes. During the
checkout via NFS there is a sustained 2 to 3 MB 'bo' on the server. So
my assumption is that lots of fs metadata gets updated during that 30s
(the files don't actually change) and due to the sync nature of the
mount everything is committed to disk pretty hard (ext3) - and that is
what I'm waiting for.

Here is what I will try next (when people leave the office):

- Mount the exported fs as data=journal - the NFS-HOWTO says this might
  improve things.
  I hope this works with remount, since a reboot is not an option.

- Try an async NFS export - there is a UPS on the server anyway.

- Try the cfq scheduler and an even higher RPCNFSDCOUNT (I have 12
  already on a UP machine). Due to my observations I don't expect much
  here, but it's worth a try.

Anyone think one of those is a bad idea? :)

Raimund
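The scheduler and nfsd-count items on the list above can be sketched like this. The scheduler line is a made-up sample (on a real system, read `/sys/block/<dev>/queue/scheduler`), and the device name is an assumption:

```shell
# The active elevator is shown in brackets in the scheduler file, e.g.
#   cat /sys/block/sda/queue/scheduler
schedulers="noop anticipatory deadline [cfq]"   # sample file contents
active=$(printf '%s\n' "$schedulers" | sed 's/.*\[\(.*\)\].*/\1/')
echo "active scheduler: $active"

# Switch at runtime (root required):
#   echo cfq > /sys/block/sda/queue/scheduler
#
# More NFS server threads on Debian: set RPCNFSDCOUNT=32 in
# /etc/default/nfs-kernel-server, then restart:
#   /etc/init.d/nfs-kernel-server restart
```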
Re: performance of AAC-RAID (ICP9087MA)
On Tue, Sep 12, 2006 at 11:50:46AM +0200, Raimund Jacob wrote:
> Hello! And thanks for your suggestions.
>
> Erik Mouw wrote:
>> On Fri, Sep 08, 2006 at 05:08:52PM +0200, Raimund Jacob wrote:
>>> Checking out a largish CVS module is no fun. The data is retrieved
>>> via cvs pserver from the file server and written back via NFS into
>>> my home directory. This process is sometimes pretty quick and
>>> sometimes blocks in between, as if the RAID controller has to think
>>> about the requests. I know this phenomenon only from a megaraid
>>> controller, which we eventually canned for a pure linux software
>>> raid (2-disk mirror). Also, compiling in the nfs-mounted home
>>> directory is too slow - even on a 1000Mbit link.
>>
>> Try a different IO scheduler. You probably have the anticipatory
>> scheduler; you want to give the cfq scheduler a try:
>>
>>   echo cfq > /sys/block/[device]/queue/scheduler
>>
>> For NFS, you also want to increase the number of daemons. Put the
>> line RPCNFSDCOUNT=32 in /etc/default/nfs-kernel-server.
>
> Thanks for these hints. In the meantime I was also reading up on the
> NFS-HOWTO's performance section. Playing around with the rsize/wsize
> did not turn up much - it seems they don't really matter in my case.

In my case it did matter: setting them to 4k (ie: the CPU pagesize)
increased throughput.

> My largish CVS module checks out (cvs up -dP actually) in about 1s
> when I do it locally on the server machine. It also takes about 1s
> when I check it out on a remote machine but on a local disk. On the
> same remote machine via NFS it takes about 30s. So NFS is actually the
> problem here, not the ICP.

One of the main problems with remote CVS is that it uses /tmp on the
server. Make sure that is a fast and large disk as well, or tell CVS to
use another (fast) directory as scratch space.

> Furthermore I observed this: I ran 'vmstat 1'. Checking out locally
> shows a 'bo' of about 1MB during the second it takes. During the
> checkout via NFS there is a sustained 2 to 3 MB 'bo' on the server.
> So my assumption is that lots of fs metadata gets updated during that
> 30s (the files don't actually change) and due to the sync nature of
> the mount everything is committed to disk pretty hard (ext3) - and
> that is what I'm waiting for.

Mounting filesystems with -o noatime,nodiratime makes quite a
difference.

If you're using ext3 with lots of files in a single directory, make sure
you're using htree directory indexing. To see if it is enabled:

  dumpe2fs /dev/whatever

Look for the features line; if it has dir_index, it is enabled. If not,
enable it with (can be done on a mounted filesystem):

  tune2fs -O dir_index /dev/whatever

Now all new directories will be created with a directory index. If you
want to enable it on all directories, unmount the filesystem and run
e2fsck on it:

  e2fsck -f -y -D /dev/whatever

Increasing the journal size can also make a difference, or try putting
the journal on a separate device (quite invasive, make sure you have a
backup). See tune2fs(8).

> Here is what I will try next (when people leave the office):
>
> - Mount the exported fs as data=journal - the NFS-HOWTO says this
>   might improve things. I hope this works with remount since reboot is
>   not an option.

I don't think it makes a difference. I'd rather say it makes things
worse because it forces all *data* (and not only the metadata) through
the journal.

> - Try an async NFS export - there is a UPS on the server anyway.

async indeed makes it faster.

> - Try the cfq scheduler and an even higher RPCNFSDCOUNT (I have 12
>   already on a UP machine). Due to my observations I don't expect much
>   here, but it's worth a try.

It did make a difference over here; that's why I increased it to 32.

> Anyone think one of those is a bad idea? :)

I only think data=journal is a bad idea.


Erik

--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands
Re: performance of AAC-RAID (ICP9087MA)
On Tue, Sep 12, 2006 at 02:17:48PM +0200, Erik Mouw wrote:
> On Tue, Sep 12, 2006 at 11:50:46AM +0200, Raimund Jacob wrote:
>> Hello! And thanks for your suggestions.
>>
>> Erik Mouw wrote:
>>> On Fri, Sep 08, 2006 at 05:08:52PM +0200, Raimund Jacob wrote:
>>>> Checking out a largish CVS module is no fun. The data is retrieved
>>>> via cvs pserver from the file server and written back via NFS into
>>>> my home directory. This process is sometimes pretty quick and
>>>> sometimes blocks in between, as if the RAID controller has to think
>>>> about the requests. I know this phenomenon only from a megaraid
>>>> controller, which we eventually canned for a pure linux software
>>>> raid (2-disk mirror). Also, compiling in the nfs-mounted home
>>>> directory is too slow - even on a 1000Mbit link.
>>>
>>> Try a different IO scheduler. You probably have the anticipatory
>>> scheduler; you want to give the cfq scheduler a try:
>>>
>>>   echo cfq > /sys/block/[device]/queue/scheduler
>>>
>>> For NFS, you also want to increase the number of daemons. Put the
>>> line RPCNFSDCOUNT=32 in /etc/default/nfs-kernel-server.
>>
>> Thanks for these hints. In the meantime I was also reading up on the
>> NFS-HOWTO's performance section. Playing around with the rsize/wsize
>> did not turn up much - it seems they don't really matter in my case.
>
> In my case it did matter: setting them to 4k (ie: the CPU pagesize)
> increased throughput.
>
>> My largish CVS module checks out (cvs up -dP actually) in about 1s
>> when I do it locally on the server machine. It also takes about 1s
>> when I check it out on a remote machine but on a local disk. On the
>> same remote machine via NFS it takes about 30s. So NFS is actually
>> the problem here, not the ICP.
>
> One of the main problems with remote CVS is that it uses /tmp on the
> server. Make sure that is a fast and large disk as well, or tell CVS
> to use another (fast) directory as scratch space.
>
>> Furthermore I observed this: I ran 'vmstat 1'. Checking out locally
>> shows a 'bo' of about 1MB during the second it takes.
>> During the checkout via NFS there is a sustained 2 to 3 MB 'bo' on
>> the server. So my assumption is that lots of fs metadata gets updated
>> during that 30s (the files don't actually change) and due to the sync
>> nature of the mount everything is committed to disk pretty hard
>> (ext3) - and that is what I'm waiting for.
>
> Mounting filesystems with -o noatime,nodiratime makes quite a
> difference.
>
> If you're using ext3 with lots of files in a single directory, make
> sure you're using htree directory indexing. To see if it is enabled:
>
>   dumpe2fs /dev/whatever
>
> Look for the features line; if it has dir_index, it is enabled. If
> not, enable it with (can be done on a mounted filesystem):
>
>   tune2fs -O dir_index /dev/whatever
>
> Now all new directories will be created with a directory index. If
> you want to enable it on all directories, unmount the filesystem and
> run e2fsck on it:
>
>   e2fsck -f -y -D /dev/whatever
>
> Increasing the journal size can also make a difference, or try putting
> the journal on a separate device (quite invasive, make sure you have a
> backup). See tune2fs(8).

These are all good suggestions for speedups, especially this last one,
but I would think that none of this should really be necessary unless
your load is remarkably high, not just one user doing a cvs checkout. I
would strace the cvs checkout with timestamps and see where it is
waiting. It seems to me like this has more to do with some configuration
snafu than any of this stuff.

Why are you trying to configure it this way anyway? Just use the
standard client/server configuration. You'll probably be glad you did.
And it seems to work a lot faster that way anyway ~:^)

>> Here is what I will try next (when people leave the office):
>>
>> - Mount the exported fs as data=journal - the NFS-HOWTO says this
>>   might improve things. I hope this works with remount since reboot
>>   is not an option.

I personally would NOT do this. There is a good reason why none of the
top-performing journaling file systems journal data by default.
> I don't think it makes a difference. I'd rather say it makes things
> worse because it forces all *data* (and not only the metadata) through
> the journal.

Eggxacly.

a
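Andrew's suggestion to strace the checkout with timestamps might look like the following. The actual strace run is shown as a comment (it needs a real cvs working copy); the trace line below is a fabricated sample so the duration filter can be demonstrated self-contained:

```shell
# Real run: -tt adds wall-clock timestamps, -T appends the time spent in
# each syscall as <seconds>, -f follows forked children:
#   strace -f -tt -T -o /tmp/cvs.trace cvs up -dP
#
# Sample trace line, to show how to pull out the per-syscall duration:
line='14:02:31.123456 fsync(4) = 0 <0.950000>'
dur=$(printf '%s\n' "$line" | sed 's/.*<\([0-9.]*\)>.*/\1/')
echo "syscall took ${dur}s"
#
# Sorting the whole trace by that field points at where the checkout
# spends its time:
#   sed 's/.*<\([0-9.]*\)>$/\1 &/' /tmp/cvs.trace | sort -rn | head
```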
Re: performance of AAC-RAID (ICP9087MA)
On Saturday, 09.09.2006 at 13:33 +0200, Raimund Jacob wrote:
>>> Checking out a largish CVS module is no fun. The data is retrieved
>>> via cvs pserver from the file server and written back via NFS into
>>> my home directory.
>>
>> Have you checked your NFS settings? I'd expect that to be the main
>> bottleneck in this setup.
>
> hm, that's a nice idea. I worked locally on the server machine for a
> while (but in the same setup, which only eliminates NFS access) and it
> does indeed feel better. on the other hand, we have the weekend.
>
> anyway, here is my NFS setup: the server exports basically with:
>
>   /raid/home *(rw,sync,no_root_squash)
>
> (and no, using async is not an option, this time)

Just to throw my hat into the ring on this one... I've noticed the
'blocking' effect on our Adaptec controller too: it seems to occur when
the process that is running requires/demands a full disk sync. Given the
nature of our setup, with lots of systems running on UPSen, I'm happy to
use 'async' for NFS, which obviously helps hugely.

Dave.

--
Dave Ewart [EMAIL PROTECTED]
Computing Manager, Cancer Epidemiology Unit
Cancer Research UK / Oxford University
PGP: CC70 1883 BD92 E665 B840 118B 6E94 2CFD 694D E370
Get key from http://www.ceu.ox.ac.uk/~davee/davee-ceu-ox-ac-uk.asc
N 51.7518, W 1.2016
Re: performance of AAC-RAID (ICP9087MA)
On Fri, Sep 08, 2006 at 05:08:52PM +0200, Raimund Jacob wrote:
> Checking out a largish CVS module is no fun. The data is retrieved via
> cvs pserver from the file server and written back via NFS into my home
> directory. This process is sometimes pretty quick and sometimes blocks
> in between, as if the RAID controller has to think about the requests.
> I know this phenomenon only from a megaraid controller, which we
> eventually canned for a pure linux software raid (2-disk mirror).
> Also, compiling in the nfs-mounted home directory is too slow - even
> on a 1000Mbit link. The new fileserver therefore feels worse than the
> old one, which served the same purpose with FreeBSD/Symbios Logic SCSI
> RAID5/AMD K7. I am trying to find someone with some experience with
> this kind of problem. Perhaps the other 3 users of an ICP controller.
> I wonder if this is related to the aacraid driver. Or perhaps it's
> because of the 64-bit thing. Does anyone have any guesses?

Try a different IO scheduler. You probably have the anticipatory
scheduler; you want to give the cfq scheduler a try:

  echo cfq > /sys/block/[device]/queue/scheduler

For NFS, you also want to increase the number of daemons. Put the line
RPCNFSDCOUNT=32 in /etc/default/nfs-kernel-server.


Erik
Re: performance of AAC-RAID (ICP9087MA)
Hello!

Paul Brook wrote:
>> Checking out a largish CVS module is no fun. The data is retrieved
>> via cvs pserver from the file server and written back via NFS into my
>> home directory.
>
> Have you checked your NFS settings? I'd expect that to be the main
> bottleneck in this setup.

hm, that's a nice idea. I worked locally on the server machine for a
while (but in the same setup, which only eliminates NFS access) and it
does indeed feel better. on the other hand, we have the weekend.

anyway, here is my NFS setup: the server exports basically with:

  /raid/home *(rw,sync,no_root_squash)

(and no, using async is not an option, this time)

clients import with:

  server:/raid/home /home nfs rw,rsize=32768,wsize=32768 0 0

hm. perhaps that 32k is a little too much due to IP fragmentation. any
idea what would be a good size to match this with my MTU? or what do you
use for good performance? other options? something to change the NFS
version being used? would NFSv4 yield anything?

also, which would be a good benchmark to measure results wrt access
times? i don't worry about raw throughput so much. bonnie++ doesn't seem
to be that well suited (or i'm too stupid to read its results).

thanks for any hint,
Raimund
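The client-side options under discussion would sit in /etc/fstab roughly as below. The 32k values are the ones from this message; the 4k variant reflects the suggestion elsewhere in the thread that one CPU page per request can perform better. Which wins depends on the NIC, MTU, and kernel, so measure both:

```shell
# /etc/fstab on an NFS client -- rsize/wsize are negotiated down by the
# server if it cannot honour them:
#   server:/raid/home  /home  nfs  rw,rsize=32768,wsize=32768  0 0
#
# 4k (one CPU page) per request, as reported faster by one poster:
#   server:/raid/home  /home  nfs  rw,rsize=4096,wsize=4096  0 0
#
# Apply changed options without a reboot:
#   mount -o remount /home
```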
performance of AAC-RAID (ICP9087MA)
Hi *,

We are happily running our fileserver on AMD64 (from /proc/cpuinfo):

  model name : AMD Opteron(tm) Processor 246
  cpu MHz    : 1995.066
  cache size : 1024 KB

we thought it might be a good idea to run one of those expensive,
over-engineered ICP controllers (on PCI-X):

  02:03.0 RAID bus controller: Adaptec AAC-RAID (Rocket) (rev 02)
          Subsystem: Adaptec ICP ICP9087MA
          Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 16
          Memory at ff20 (64-bit, non-prefetchable) [size=2M]
          Memory at ff1ff000 (64-bit, non-prefetchable) [size=4K]
          Expansion ROM at ff1e [disabled] [size=32K]
          Capabilities: [40] Power Management version 2
          Capabilities: [48] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-
          Capabilities: [58] PCI-X non-bridge device
          Capabilities: [60] Vital Product Data

...which gives us a nice 1.5TB RAID5 made up from 6 SATA disks (the
controller has 256MB of on-board cache RAM). Clients connect with samba
(win) and nfs (linux), both with 100Mbit and 1000Mbit NICs; the server
itself is hooked up to a 1000Mbit switch, of course.

When we freshly installed the machine we did some bonnie++ and dd(1)
testing, which showed good (but not overly impressive) results for raw
I/O performance. IIRC something about 60MB/s for sustained writes and
perhaps 80MB/s for reads.

Checking out a largish CVS module is no fun. The data is retrieved via
cvs pserver from the file server and written back via NFS into my home
directory. This process is sometimes pretty quick and sometimes blocks
in between, as if the RAID controller has to think about the requests.
I know this phenomenon only from a megaraid controller, which we
eventually canned for a pure linux software raid (2-disk mirror). Also,
compiling in the nfs-mounted home directory is too slow - even on a
1000Mbit link. The new fileserver therefore feels worse than the old
one, which served the same purpose with FreeBSD/Symbios Logic SCSI
RAID5/AMD K7.

I am trying to find someone with some experience with this kind of
problem.
Perhaps the other 3 users of an ICP controller. I wonder if this is
related to the aacraid driver. Or perhaps it's because of the 64-bit
thing. Does anyone have any guesses?

Thanks for any hint,
Raimund

PS: This is etch with stock kernel 2.6.15-1-amd64-generic #2
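A quick sequential-write check in the spirit of the dd(1) tests mentioned above. This sketch writes only 64 MB to a temp file; for a meaningful number on a real server you would write a file larger than RAM and point it at the RAID mount:

```shell
# Write 64 MB of zeros and force them to disk (conv=fsync). dd reports
# its throughput on stderr; 2>/dev/null silences that here, drop it to
# see the MB/s figure.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=64 conv=fsync 2>/dev/null
size=$(wc -c < "$f")
echo "wrote $size bytes"
rm -f "$f"
```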
Re: performance of AAC-RAID (ICP9087MA)
> Checking out a largish CVS module is no fun. The data is retrieved via
> cvs pserver from the file server and written back via NFS into my home
> directory.

Have you checked your NFS settings? I'd expect that to be the main
bottleneck in this setup.

Paul