Re: RAID0 performance question
On Tuesday November 22, [EMAIL PROTECTED] wrote:
> I have already tried all the available options, including readahead at
> every layer (results in earlier mails) and the chunk size. But with these
> settings I cannot work around this, and the result is incomprehensible to
> me! The RAID0 performance is not equal to one component, not equal to the
> sum of all components, and not even equal to the slowest component!

This is quite perplexing.

My next step would probably be to watch the network traffic with tcpdump
or ethereal. I would look for any differences between when it is going
quickly (without raid0) and when slowly (with raid0).

Rather than tcpdump, it might be easier to instrument the nbd server to
print out requests and timestamps.

Sorry I cannot be more helpful, and do have a Merry Christmas anyway :-)

NeilBrown
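A rough sketch of that kind of capture, assuming the nbd servers listen on
TCP port 2000 and the traffic leaves on eth0 (both the port and the
interface name are assumptions here):

    # in one terminal: capture NBD traffic with per-packet timestamps
    tcpdump -i eth0 -tt -n 'tcp port 2000' -w nbd-read.pcap

    # in another terminal: the slow case, then a fast one, for comparison
    cat /dev/md31 > /dev/null    # slow: through raid0
    cat /dev/nb0 > /dev/null     # fast: one component directly

Comparing the request sizes and the gaps between requests in the two
captures should show where the raid0 path is losing time.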
Re: RAID0 performance question
On Sunday December 18, [EMAIL PROTECTED] wrote:
> Why doesn't the raid (md) device have a scheduler in sysfs? And if it
> does have a scheduler, where can I tune it?

raid0 doesn't do any scheduling. All it does is take requests from the
filesystem, decide which device they should go to (possibly splitting them
if needed), and forward them on to that device. That is all.

> Can raid0 handle multiple requests at one time?

Yes. But raid0 doesn't exactly 'handle' requests. It 'directs' requests
for other devices to 'handle'.

> For me, the performance bottleneck is clearly in the RAID0 layer, which
> is used simply to join the 4x2TB into one 8TB device. But it is only
> software, and I can't believe it is unfixable or untunable.

There is really nothing to tune apart from chunksize.

You can tune the way the filesystem/vm accesses the device by setting
readahead (readahead on component devices of a raid0 has exactly 0
effect). You can tune the underlying devices by choosing a scheduler (for
a disk drive) or a packet size (for over-the-network devices) or whatever.
But there is nothing to tune in raid0.

Also, rather than doing measurements on the block device (/dev/mdX), do
measurements on a filesystem created on that device. I have often found
that the filesystem goes faster than the block device.

NeilBrown
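A minimal illustration of that readahead point, using the device names
from this thread (the 8192-sector value is only an example, not a
recommendation):

    # readahead is counted in 512-byte sectors, so 8192 = 4MB
    blockdev --getra /dev/md31          # what the raid0 device uses now
    blockdev --setra 8192 /dev/md31     # set it on the raid0 device itself
    # setting --setra on the components (/dev/md1../md4) has no effect on
    # reads that go through /dev/md31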
Re: RAID0 performance question
----- Original Message -----
From: Neil Brown [EMAIL PROTECTED]
To: JaniD++ [EMAIL PROTECTED]
Cc: Al Boldi [EMAIL PROTECTED]; linux-raid@vger.kernel.org
Sent: Wednesday, December 21, 2005 2:40 AM
Subject: Re: RAID0 performance question

> On Sunday December 18, [EMAIL PROTECTED] wrote:
> > Why doesn't the raid (md) device have a scheduler in sysfs? And if it
> > does have a scheduler, where can I tune it?
>
> raid0 doesn't do any scheduling. All it does is take requests from the
> filesystem, decide which device they should go to (possibly splitting
> them if needed), and forward them on to that device. That is all.
>
> > Can raid0 handle multiple requests at one time?
>
> Yes. But raid0 doesn't exactly 'handle' requests. It 'directs' requests
> for other devices to 'handle'.
>
> > For me, the performance bottleneck is clearly in the RAID0 layer, which
> > is used simply to join the 4x2TB into one 8TB device. But it is only
> > software, and I can't believe it is unfixable or untunable.
>
> There is really nothing to tune apart from chunksize.
>
> You can tune the way the filesystem/vm accesses the device by setting
> readahead (readahead on component devices of a raid0 has exactly 0
> effect).

First, I want to apologise for the remark in my previous mail about Neil
not being interested... :-(

I have already tried all the available options, including readahead at
every layer (results in earlier mails) and the chunk size. But with these
settings I cannot work around this, and the result is incomprehensible to
me! The RAID0 performance is not equal to one component, not equal to the
sum of all components, and not even equal to the slowest component!

> You can tune the underlying devices by choosing a scheduler (for a disk
> drive) or a packet size (for over-the-network devices) or whatever.

The NBD devices have a scheduler, and it is already tuned for really top
performance; for the components it works really well! :-)
(I had planned to set NBD to 4KB packets, but this is hard, because my
NICs do not support jumbo packets...)

> But there is nothing to tune in raid0.
>
> Also, rather than doing measurements on the block device (/dev/mdX), do
> measurements on a filesystem created on that device. I have often found
> that the filesystem goes faster than the block device.
>
> NeilBrown

I use XFS, and the two performances are almost equal, depending on the
kind of load. But in the most common case they are almost equal.

Thanks,
Janos
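For what it is worth, a sketch of the comparison Neil suggests, assuming
the XFS filesystem on /dev/md31 is mounted at /mnt/raid and holds a large
test file (both names are made up here):

    # raw read from the raid0 block device
    dd if=/dev/md31 of=/dev/null bs=1M count=4096

    # read of a big file through XFS on top of the same device
    dd if=/mnt/raid/bigfile of=/dev/null bs=1M count=4096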
Re: RAID0 performance question
JaniD++ wrote:
> For me, the performance bottleneck is clearly in the RAID0 layer, which
> is used simply to join the 4x2TB into one 8TB device.

Did you try running RAID0 over nbd directly, and did you find it to be
faster? IIRC, stacking raid modules does need a considerable amount of
tuning, and even then it does not scale linearly.

Maybe NeilBrown can help?

--
Al
Re: RAID0 performance question
----- Original Message -----
From: Al Boldi [EMAIL PROTECTED]
To: JaniD++ [EMAIL PROTECTED]
Cc: linux-raid@vger.kernel.org
Sent: Friday, December 02, 2005 8:53 PM
Subject: Re: RAID0 performance question

> JaniD++ wrote:
> > But cat /dev/md31 > /dev/null (RAID0, the sum of 4 nodes) only makes
> > ~450-490 Mbit/s, and I don't know why. Does somebody have an idea? :-)
> >
> > > Try increasing the read-ahead setting on /dev/md31 using 'blockdev'.
> > > Network block devices are likely to have latency issues and would
> > > benefit from large read-ahead. Also try a larger chunk-size, ~4MB.
> >
> > But I don't know exactly what to try: increase or decrease the chunk
> > size? In the top-layer raid (md31, raid0), in the middle-layer raids
> > (md1-4, raid1), or both?
>
> What I found is that raid over nbd is highly max-chunksize dependent, due
> to nbd running over TCP. But increasing the chunksize does not
> necessarily mean better system utilization. Much depends on your
> application request size. Tuning performance to maximize cat/dd /dev/md#
> throughput may only be suitable as a synthetic indication of overall
> performance in system comparisons.

Yes, you are right! I already knew that. ;-) But the bottleneck effect is
visible with dd/cat too. (And I am a little bit lazy. :-)

Now I have tried the system with my spare drives and the bigger chunk size
(=4096K on the RAID0 and on all the RAID1s), and the slowness is still
here. :( The problem is _exactly_ the same as before. I think it is
unnecessary to try a smaller chunk size, because 32k is already small for
2, 5 or 8 MB of readahead. The problem is somewhere else... :-/

I have got one (or more) question for the raid list!

Why doesn't the raid (md) device have a scheduler in sysfs? And if it does
have a scheduler, where can I tune it?

Can raid0 handle multiple requests at one time?

For me, the performance bottleneck is clearly in the RAID0 layer, which is
used simply to join the 4x2TB into one 8TB device. But it is only
software, and I can't believe it is unfixable or untunable. ;-)

Cheers,
Janos

> If your aim is to increase system utilization, then look for a good
> benchmark specific to your application requirements that would mimic a
> realistic load.
>
> --
> Al
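A quick way to double-check what is actually in effect at each layer
before and after such a change (device names from the raidtab in this
thread):

    # chunk size in use for each md array
    cat /proc/mdstat

    # readahead (in 512-byte sectors) at each layer
    blockdev --getra /dev/md31     # top-layer raid0
    blockdev --getra /dev/md1      # middle-layer raid1
    blockdev --getra /dev/nb0      # underlying nbd device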
Re: RAID0 performance question
Hello,

> > > > But cat /dev/md31 > /dev/null (RAID0, the sum of 4 nodes) only
> > > > makes ~450-490 Mbit/s, and I don't know why. Does somebody have an
> > > > idea? :-)
> > >
> > > Try increasing the read-ahead setting on /dev/md31 using 'blockdev'.
> > > Network block devices are likely to have latency issues and would
> > > benefit from large read-ahead. Also try a larger chunk-size, ~4MB.
> >
> > Ahh. This is what I can't do. :-( I don't know how to back up 8TB! ;-)
>
> Maybe you could use your mirror!?

I have one idea! :-) I can use the spare drives in the disknodes! :-)

But I don't know exactly what to try: increase or decrease the chunk size?
In the top-layer raid (md31, raid0), in the middle-layer raids (md1-4,
raid1), or both?

Can somebody help me find the source of the performance problem?

Thanks,
Janos

> --
> Al
Re: RAID0 performance question
Look at the cpu consumption.

On 11/26/05, JaniD++ [EMAIL PROTECTED] wrote:
> Hello list,
>
> I have been searching for the bottleneck of my system, and I have found
> something that I can't clearly understand.
>
> I use NBD with 4 disk nodes. (The raidtab is at the bottom of this mail.)
>
> cat /dev/nb# > /dev/null makes ~350 Mbit/s on each node.
> cat /dev/nb0 + nb1 + nb2 + nb3 at the same time, in parallel, makes
> ~780-800 Mbit/s - I think this is my network bottleneck.
>
> But cat /dev/md31 > /dev/null (RAID0, the sum of the 4 nodes) only makes
> ~450-490 Mbit/s, and I don't know why. Does somebody have an idea? :-)
>
> (nb31, 30, 29 and 28 are only possible mirrors)
>
> Thanks,
> Janos
>
> raiddev /dev/md1
>         raid-level              1
>         nr-raid-disks           2
>         chunk-size              32
>         persistent-superblock   1
>         device                  /dev/nb0
>         raid-disk               0
>         device                  /dev/nb31
>         raid-disk               1
>         failed-disk             /dev/nb31
>
> raiddev /dev/md2
>         raid-level              1
>         nr-raid-disks           2
>         chunk-size              32
>         persistent-superblock   1
>         device                  /dev/nb1
>         raid-disk               0
>         device                  /dev/nb30
>         raid-disk               1
>         failed-disk             /dev/nb30
>
> raiddev /dev/md3
>         raid-level              1
>         nr-raid-disks           2
>         chunk-size              32
>         persistent-superblock   1
>         device                  /dev/nb2
>         raid-disk               0
>         device                  /dev/nb29
>         raid-disk               1
>         failed-disk             /dev/nb29
>
> raiddev /dev/md4
>         raid-level              1
>         nr-raid-disks           2
>         chunk-size              32
>         persistent-superblock   1
>         device                  /dev/nb3
>         raid-disk               0
>         device                  /dev/nb28
>         raid-disk               1
>         failed-disk             /dev/nb28
>
> raiddev /dev/md31
>         raid-level              0
>         nr-raid-disks           4
>         chunk-size              32
>         persistent-superblock   1
>         device                  /dev/md1
>         raid-disk               0
>         device                  /dev/md2
>         raid-disk               1
>         device                  /dev/md3
>         raid-disk               2
>         device                  /dev/md4
>         raid-disk               3

--
Raz
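For reference, a sketch of the three measurements being compared, using
the device names from the raidtab above:

    # each node on its own (~350 Mbit/s per node)
    for d in nb0 nb1 nb2 nb3; do cat /dev/$d > /dev/null; done

    # all four nodes read in parallel (~780-800 Mbit/s in total here)
    for d in nb0 nb1 nb2 nb3; do cat /dev/$d > /dev/null & done; wait

    # the raid0 built on top of them (~450-490 Mbit/s)
    cat /dev/md31 > /dev/null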
Re: RAID0 performance question
Hello, Raz,

I think this is not a cpu usage problem. :-)
The system is divided into 4 cpusets, and each cpuset uses only one
disknode. (CPU0-nb0, CPU1-nb1, ...)

This top output was taken under cat /dev/md31 (raid0):

 17:16:01 up 14:19, 4 users, load average: 7.74, 5.03, 4.20
305 processes: 301 sleeping, 4 running, 0 zombie, 0 stopped
CPU0 states: 33.1% user 47.0% system 0.0% nice 0.0% iowait 18.0% idle
CPU1 states: 21.0% user 52.0% system 0.0% nice 6.0% iowait 19.0% idle
CPU2 states:  2.0% user 74.0% system 0.0% nice 3.0% iowait 18.0% idle
CPU3 states: 10.0% user 57.0% system 0.0% nice 5.0% iowait 26.0% idle
Mem: 4149412k av, 3961084k used, 188328k free, 0k shrd, 557032k buff
     911068k active, 2881680k inactive
Swap: 0k av, 0k used, 0k free, 2779388k cached

  PID USER  PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM  TIME CPU COMMAND
 2410 root    0 -19  1584 10836       S    48.3  0.0 21:57  3  nbd-client
16191 root   25   0  4832  820   664  R    48.3  0.0  3:04  0  grep
 2408 root    0 -19  1588 11236       S    47.3  0.0 24:05  2  nbd-client
 2406 root    0 -19  1584 10836       S    40.8  0.0 22:56  1  nbd-client
18126 root   18   0  5780 1604   508  D    38.0  0.0  0:12  1  dd
 2404 root    0 -19  1588 11236       S    36.2  0.0 22:56  0  nbd-client
  294 root   15   0     0    0     0  SW    7.4  0.0  3:22  1  kswapd0
 2284 root   16   0 13500 5376  3040  S     7.4  0.1  8:53  2  httpd
18307 root   16   0  6320 2232  1432  S     4.6  0.0  0:00  2  sendmail
16789 root   16   0  5472 1552   952  R     3.7  0.0  0:03  3  top
 2431 root   10  -5     0    0     0  SW    2.7  0.0  7:32  2  md2_raid1
29076 root   17   0  4776  772   680  S     2.7  0.0  1:09  3  xfs_fsr
 6955 root   15   0  1588 10836       S     2.7  0.0  0:56  2  nbd-client

Thanks,
Janos

----- Original Message -----
From: Raz Ben-Jehuda(caro) [EMAIL PROTECTED]
To: JaniD++ [EMAIL PROTECTED]
Cc: linux-raid@vger.kernel.org
Sent: Saturday, November 26, 2005 4:56 PM
Subject: Re: RAID0 performance question

> Look at the cpu consumption.
>
> --
> Raz
Re: RAID0 performance question
On Sat, 26 Nov 2005, JaniD++ wrote:
> Hello, Raz,
>
> I think this is not a cpu usage problem. :-)
> The system is divided into 4 cpusets, and each cpuset uses only one
> disknode. (CPU0-nb0, CPU1-nb1, ...)

It seems to be a CPU problem. What kind of NIC do you have?

> CPU2 states:  2.0% user 74.0% system 0.0% nice 3.0% iowait 18.0% idle
> CPU3 states: 10.0% user 57.0% system 0.0% nice 5.0% iowait 26.0% idle

Do you have 4 CPUs, or 2 HT CPUs?

Bye,
-=Lajbi=
LAJBER Zoltan, Szent Istvan Egyetem, Informatika Hivatal
Most of the time, if you think you are in trouble, crank that throttle!
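A quick way to answer those two questions from the shell (no special tools
assumed):

    # how many logical CPUs the kernel sees, and whether HT is advertised
    grep -c ^processor /proc/cpuinfo
    grep -m1 ^flags /proc/cpuinfo | grep -ow ht

    # number of distinct physical packages; 2 here would suggest 2 HT CPUs
    grep "^physical id" /proc/cpuinfo | sort -u | wc -l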
Re: RAID0 performance question
Hi,

If you don't speak Hungarian, forget this sentence: do you speak
Hungarian? If so, we can continue that way as well.

On Sat, 26 Nov 2005, JaniD++ wrote:
> Intel Xeon motherboard, Intel e1000 x2. (64bit)
>
> But as I already wrote, if I cut out the raid and start the 4 cats at the
> same time, the traffic rises to 780-800 Mbit! :-)
>
> This is not a hardware-related problem, only a tuning or
> misconfiguration problem - I think...

What is in /proc/interrupts? Are the interrupts distributed over the CPUs,
or do all the IRQs go to one CPU?

What about switching off HT?

Bye,
-=Lajbi=
LAJBER Zoltan, Szent Istvan Egyetem, Informatika Hivatal
Most of the time, if you think you are in trouble, crank that throttle!
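A sketch of that check, plus an example of pinning an IRQ by hand (the IRQ
number 48 below is made up; use whatever /proc/interrupts reports for the
e1000 ports):

    # per-CPU interrupt counts for the NICs
    grep -i eth /proc/interrupts

    # pin a given IRQ to CPU1 (the mask is hexadecimal: 1=CPU0, 2=CPU1, ...)
    echo 2 > /proc/irq/48/smp_affinity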