Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
Justin Piszcz wrote:
> On Sat, 13 Jan 2007, Al Boldi wrote:
> > Justin Piszcz wrote:
> > > Btw, max sectors did improve my performance a little bit but
> > > stripe_cache+read_ahead were the main optimizations that made
> > > everything go faster by about ~1.5x. I have individual bonnie++
> > > benchmarks of [only] the max_sector_kb tests as well, it improved the
> > > times from 8min/bonnie run -> 7min 11 seconds or so, see below and
> > > then after that is what you requested.
> >
> > Can you repeat with /dev/sda only?
>
> For sda-- (is a 74GB raptor only)-- but ok.

Do you get the same results for the 150GB-raptor on sd{e,g,i,k}?

> # uptime
>  16:25:38 up 1 min, 3 users, load average: 0.23, 0.14, 0.05
> # cat /sys/block/sda/queue/max_sectors_kb
> 512
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/dev/sda of=/dev/null bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 150.891 seconds, 71.2 MB/s
> # echo 192 > /sys/block/sda/queue/max_sectors_kb
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/dev/sda of=/dev/null bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 150.192 seconds, 71.5 MB/s
> # echo 128 > /sys/block/sda/queue/max_sectors_kb
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/dev/sda of=/dev/null bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 150.15 seconds, 71.5 MB/s
>
> Does this show anything useful?

Probably a latency issue. md is highly latency sensitive. What CPU type/speed do you have? Bootlog/dmesg?

Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
On Sat, 13 Jan 2007, Al Boldi wrote:
> Justin Piszcz wrote:
> > Btw, max sectors did improve my performance a little bit but
> > stripe_cache+read_ahead were the main optimizations that made everything
> > go faster by about ~1.5x. I have individual bonnie++ benchmarks of
> > [only] the max_sector_kb tests as well, it improved the times from
> > 8min/bonnie run -> 7min 11 seconds or so, see below and then after that is
> > what you requested.
> >
> > # echo 3 > /proc/sys/vm/drop_caches
> > # dd if=/dev/md3 of=/dev/null bs=1M count=10240
> > 10240+0 records in
> > 10240+0 records out
> > 10737418240 bytes (11 GB) copied, 399.352 seconds, 26.9 MB/s
> > # for i in sde sdg sdi sdk; do echo 192 > /sys/block/"$i"/queue/max_sectors_kb; echo "Set /sys/block/"$i"/queue/max_sectors_kb to 192kb"; done
> > Set /sys/block/sde/queue/max_sectors_kb to 192kb
> > Set /sys/block/sdg/queue/max_sectors_kb to 192kb
> > Set /sys/block/sdi/queue/max_sectors_kb to 192kb
> > Set /sys/block/sdk/queue/max_sectors_kb to 192kb
> > # echo 3 > /proc/sys/vm/drop_caches
> > # dd if=/dev/md3 of=/dev/null bs=1M count=10240
> > 10240+0 records in
> > 10240+0 records out
> > 10737418240 bytes (11 GB) copied, 398.069 seconds, 27.0 MB/s
> >
> > Awful performance with your numbers/drop_caches settings.. !
>
> Can you repeat with /dev/sda only?
>
> With fresh reboot to shell, then:
> $ cat /sys/block/sda/queue/max_sectors_kb
> $ echo 3 > /proc/sys/vm/drop_caches
> $ dd if=/dev/sda of=/dev/null bs=1M count=10240
>
> $ echo 192 > /sys/block/sda/queue/max_sectors_kb
> $ echo 3 > /proc/sys/vm/drop_caches
> $ dd if=/dev/sda of=/dev/null bs=1M count=10240
>
> $ echo 128 > /sys/block/sda/queue/max_sectors_kb
> $ echo 3 > /proc/sys/vm/drop_caches
> $ dd if=/dev/sda of=/dev/null bs=1M count=10240
>
> > What were your tests designed to show?
>
> A problem with the block-io.
>
> Thanks!
> --
> Al

Here you go:

For sda-- (is a 74GB raptor only)-- but ok.

# uptime
 16:25:38 up 1 min, 3 users, load average: 0.23, 0.14, 0.05
# cat /sys/block/sda/queue/max_sectors_kb
512
# echo 3 > /proc/sys/vm/drop_caches
# dd if=/dev/sda of=/dev/null bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 150.891 seconds, 71.2 MB/s
# echo 192 > /sys/block/sda/queue/max_sectors_kb
# echo 3 > /proc/sys/vm/drop_caches
# dd if=/dev/sda of=/dev/null bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 150.192 seconds, 71.5 MB/s
# echo 128 > /sys/block/sda/queue/max_sectors_kb
# echo 3 > /proc/sys/vm/drop_caches
# dd if=/dev/sda of=/dev/null bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 150.15 seconds, 71.5 MB/s

Does this show anything useful?

Justin.
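The dd-plus-drop_caches measurement used throughout this thread can also be tried without root or a raw device by pointing it at a scratch file; a minimal sketch (16 MiB instead of 10 GiB, and no drop_caches, so the read is likely cache-warm and the number only illustrative):

```shell
#!/bin/sh
# Same measurement style as the thread's tests, but against a small
# scratch file so it runs unprivileged. For a cache-cold run on a real
# device you would 'echo 3 > /proc/sys/vm/drop_caches' first (as root).
F=$(mktemp)
dd if=/dev/zero of="$F" bs=1M count=16 2>/dev/null   # create 16 MiB of data
dd if="$F" of=/dev/null bs=1M 2>&1                   # sequential read, stats on stderr
rm -f "$F"
```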
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
Justin Piszcz wrote:
> Btw, max sectors did improve my performance a little bit but
> stripe_cache+read_ahead were the main optimizations that made everything
> go faster by about ~1.5x. I have individual bonnie++ benchmarks of
> [only] the max_sector_kb tests as well, it improved the times from
> 8min/bonnie run -> 7min 11 seconds or so, see below and then after that is
> what you requested.
>
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/dev/md3 of=/dev/null bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 399.352 seconds, 26.9 MB/s
> # for i in sde sdg sdi sdk; do echo 192 > /sys/block/"$i"/queue/max_sectors_kb; echo "Set /sys/block/"$i"/queue/max_sectors_kb to 192kb"; done
> Set /sys/block/sde/queue/max_sectors_kb to 192kb
> Set /sys/block/sdg/queue/max_sectors_kb to 192kb
> Set /sys/block/sdi/queue/max_sectors_kb to 192kb
> Set /sys/block/sdk/queue/max_sectors_kb to 192kb
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/dev/md3 of=/dev/null bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 398.069 seconds, 27.0 MB/s
>
> Awful performance with your numbers/drop_caches settings.. !

Can you repeat with /dev/sda only?

With fresh reboot to shell, then:
$ cat /sys/block/sda/queue/max_sectors_kb
$ echo 3 > /proc/sys/vm/drop_caches
$ dd if=/dev/sda of=/dev/null bs=1M count=10240

$ echo 192 > /sys/block/sda/queue/max_sectors_kb
$ echo 3 > /proc/sys/vm/drop_caches
$ dd if=/dev/sda of=/dev/null bs=1M count=10240

$ echo 128 > /sys/block/sda/queue/max_sectors_kb
$ echo 3 > /proc/sys/vm/drop_caches
$ dd if=/dev/sda of=/dev/null bs=1M count=10240

> What were your tests designed to show?

A problem with the block-io.

Thanks!

--
Al
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
Justin Piszcz wrote:
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/dev/md3 of=/dev/null bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 399.352 seconds, 26.9 MB/s
> # for i in sde sdg sdi sdk; do echo 192 > /sys/block/"$i"/queue/max_sectors_kb; echo "Set /sys/block/"$i"/queue/max_sectors_kb to 192kb"; done
> Set /sys/block/sde/queue/max_sectors_kb to 192kb
> Set /sys/block/sdg/queue/max_sectors_kb to 192kb
> Set /sys/block/sdi/queue/max_sectors_kb to 192kb
> Set /sys/block/sdk/queue/max_sectors_kb to 192kb
> # echo 3 > /proc/sys/vm/drop_caches
> # dd if=/dev/md3 of=/dev/null bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 398.069 seconds, 27.0 MB/s
>
> Awful performance with your numbers/drop_caches settings.. !
> What were your tests designed to show?

To start, I expect them to show a change in write, not read... and IIRC (I didn't look it up) drop_caches just flushes the caches so you start with known memory contents: none.

> Justin.
>
> On Fri, 12 Jan 2007, Justin Piszcz wrote:
> > On Fri, 12 Jan 2007, Al Boldi wrote:
> > > Justin Piszcz wrote:
> > > > RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU
> > > >
> > > > This should be 1:14 not 1:06 (was with a similarly sized file but not
> > > > the same); the 1:14 is the same file as used with the other benchmarks,
> > > > and to get that I used 256mb read-ahead and 16384 stripe size ++ 128
> > > > max_sectors_kb (same size as my sw raid5 chunk size)
> > >
> > > max_sectors_kb is probably your key. On my system I get twice the read
> > > performance by just reducing max_sectors_kb from default 512 to 192.
> > >
> > > Can you do a fresh reboot to shell and then:
> > > $ cat /sys/block/hda/queue/*
> > > $ cat /proc/meminfo
> > > $ echo 3 > /proc/sys/vm/drop_caches
> > > $ dd if=/dev/hda of=/dev/null bs=1M count=10240
> > > $ echo 192 > /sys/block/hda/queue/max_sectors_kb
> > > $ echo 3 > /proc/sys/vm/drop_caches
> > > $ dd if=/dev/hda of=/dev/null bs=1M count=10240

--
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
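For reference, drop_caches (documented in the kernel's Documentation/sysctl/vm.txt) only discards clean cached data; the three values select what to drop, and a sync beforehand maximizes what can be freed:

```shell
# Root required; each write discards clean cache only, never dirty data.
sync                                  # write back dirty pages so more cache is droppable
echo 1 > /proc/sys/vm/drop_caches     # free the page cache
echo 2 > /proc/sys/vm/drop_caches     # free dentries and inodes
echo 3 > /proc/sys/vm/drop_caches     # free both (what this thread uses)
```

This is a settings fragment, not a benchmark by itself; the point is only that the dd runs in this thread start from a known-empty cache.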
Re: raid5 software vs hardware: parity calculations?
On 2007-01-12 at 09:39-08, dean gaudet <[EMAIL PROTECTED]> wrote:
> On Thu, 11 Jan 2007, James Ralston wrote:
> > I'm having a discussion with a coworker concerning the cost of
> > md's raid5 implementation versus hardware raid5 implementations.
> >
> > Specifically, he states:
> >
> > > The performance [of raid5 in hardware] is so much better with
> > > the write-back caching on the card and the offload of the
> > > parity, it seems to me that the minor increase in work of having
> > > to upgrade the firmware if there's a buggy one is a highly
> > > acceptable trade-off to the increased performance. The md
> > > driver still commits you to longer run queues since IO calls to
> > > disk, parity calculator and the subsequent kflushd operations
> > > are non-interruptible in the CPU. A RAID card with write-back
> > > cache releases the IO operation virtually instantaneously.
> >
> > It would seem that his comments have merit, as there appears to be
> > work underway to move stripe operations outside of the spinlock:
> >
> > http://lwn.net/Articles/184102/
> >
> > What I'm curious about is this: for real-world situations, how
> > much does this matter? In other words, how hard do you have to
> > push md raid5 before doing dedicated hardware raid5 becomes a real
> > win?
>
> hardware with battery backed write cache is going to beat the
> software at small write traffic latency essentially all the time but
> it's got nothing to do with the parity computation.

I'm not convinced that's true. What my coworker is arguing is that the md raid5 code spinlocks while it is performing this sequence of operations:

1. executing the write
2. reading the blocks necessary for recalculating the parity
3. recalculating the parity
4. updating the parity block

My [admittedly cursory] read of the code, coupled with the link above, leads me to believe that my coworker is correct, which is why I was trolling for [informed] opinions about how much of a performance hit the spinlock causes.
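For context on step 3: raid5 parity is just a bytewise XOR across the data blocks of a stripe, which is why the computation itself is cheap relative to the locking and I/O around it. A toy sketch with made-up byte values (real md does this over whole pages, not single bytes):

```shell
# Toy demo: parity = XOR of the data blocks, so any single lost block
# is recoverable by XORing the survivors with the parity.
a=1 b=5 c=9                    # one byte from each of three data disks (made up)
p=$(( a ^ b ^ c ))             # parity byte stored on the fourth disk
b_rebuilt=$(( a ^ c ^ p ))     # reconstruct disk b's byte after a "failure"
echo "parity=$p rebuilt=$b_rebuilt"   # prints: parity=13 rebuilt=5
```

XOR is its own inverse, which is also why a small write costs a read-modify-write cycle: the old data and old parity must be read back before the new parity can be computed.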
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
Btw, max sectors did improve my performance a little bit but stripe_cache+read_ahead were the main optimizations that made everything go faster by about ~1.5x. I have individual bonnie++ benchmarks of [only] the max_sector_kb tests as well, it improved the times from 8min/bonnie run -> 7min 11 seconds or so, see below and then after that is what you requested.

# Options used:
# blockdev --setra 1536 /dev/md3 (back to default)
# cat /sys/block/sd{e,g,i,k}/queue/max_sectors_kb
# value: 512
# value: 512
# value: 512
# value: 512

# Test with chunksize of raid array (128)
# echo 128 > /sys/block/sde/queue/max_sectors_kb
# echo 128 > /sys/block/sdg/queue/max_sectors_kb
# echo 128 > /sys/block/sdi/queue/max_sectors_kb
# echo 128 > /sys/block/sdk/queue/max_sectors_kb

max_sectors_kb128_run1:max_sectors_kb128_run1,4000M,46522,98,109829,19,42776,12,46527,97,86206,14,647.7,1,16:10:16/64,874,9,29123,97,2778,16,852,9,25399,86,1396,10
max_sectors_kb128_run2:max_sectors_kb128_run2,4000M,44037,99,107971,19,42420,12,46385,97,85773,14,628.8,1,16:10:16/64,981,10,23006,77,3185,19,848,9,27891,94,1737,13
max_sectors_kb128_run3:max_sectors_kb128_run3,4000M,46501,98,108313,19,42558,12,46314,97,87697,15,617.0,1,16:10:16/64,864,9,29795,99,2744,16,897,9,29021,98,1439,10
max_sectors_kb128_run4:max_sectors_kb128_run4,4000M,40750,98,108959,19,42519,12,45027,97,86484,14,637.0,1,16:10:16/64,929,10,29641,98,2476,14,883,9,29529,99,1867,13
max_sectors_kb128_run5:max_sectors_kb128_run5,4000M,46664,98,108387,19,42801,12,46423,97,87379,14,642.5,0,16:10:16/64,925,10,29756,99,2759,16,915,10,28694,97,1215,8

162.54user 43.96system 7:12.02elapsed 47%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (5major+1104minor)pagefaults 0swaps
168.75user 43.51system 7:14.49elapsed 48%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1092minor)pagefaults 0swaps
162.76user 44.18system 7:12.26elapsed 47%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1096minor)pagefaults 0swaps
178.91user 43.39system 7:24.39elapsed 50%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1094minor)pagefaults 0swaps
162.45user 43.86system 7:11.26elapsed 47%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1092minor)pagefaults 0swaps

---

# cat /sys/block/sd[abcdefghijk]/queue/*
cat: /sys/block/sda/queue/iosched: Is a directory
32767
512
128
128
noop [anticipatory]
cat: /sys/block/sdb/queue/iosched: Is a directory
32767
512
128
128
noop [anticipatory]
cat: /sys/block/sdc/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sdd/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sde/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sdf/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sdg/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sdh/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sdi/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sdj/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]
cat: /sys/block/sdk/queue/iosched: Is a directory
32767
128
128
128
noop [anticipatory]

# (note I am only using four of these (which are raptors, in raid5 for md3))

# cat /proc/meminfo
MemTotal:      2048904 kB
MemFree:       1299980 kB
Buffers:          1408 kB
Cached:          58032 kB
SwapCached:          0 kB
Active:          65012 kB
Inactive:        33796 kB
HighTotal:     1153312 kB
HighFree:      1061792 kB
LowTotal:       895592 kB
LowFree:        238188 kB
SwapTotal:     2200760 kB
SwapFree:      2200760 kB
Dirty:               8 kB
Writeback:           0 kB
AnonPages:       39332 kB
Mapped:          20248 kB
Slab:            37116 kB
SReclaimable:    10580 kB
SUnreclaim:      26536 kB
PageTables:       1284 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   3225212 kB
Committed_AS:   111056 kB
VmallocTotal:   114680 kB
VmallocUsed:      3828 kB
VmallocChunk:   110644 kB

# echo 3 > /proc/sys/vm/drop_caches
# dd if=/dev/md3 of=/dev/null bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 399.352 seconds, 26.9 MB/s
# for i in sde sdg sdi sdk; do echo 192 > /sys/block/"$i"/queue/max_sectors_kb; echo "Set /sys/block/"$i"/queue/max_sectors_kb to 192kb"; done
Set /sys/block/sde/queue/max_sectors_kb to 192kb
Set /sys/block/sdg/queue/max_sectors_kb to 192kb
Set /sys/block/sdi/queue/max_sectors_kb to 192kb
Set /sys/block/sdk/queue/max_sectors_kb to 192kb
# echo 3 > /proc/sys/vm/drop_caches
# dd if=/dev/md3 of=/dev/null bs=1M count=10240
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 398.069 seconds, 27.0 MB/s

Awful performance with your numbers/drop_caches settings.. !
What were your tests designed to show?

Justin.

On Fri, 1
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
On Fri, 12 Jan 2007, Al Boldi wrote:
> Justin Piszcz wrote:
> > RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU
> >
> > This should be 1:14 not 1:06 (was with a similarly sized file but not the
> > same); the 1:14 is the same file as used with the other benchmarks, and to
> > get that I used 256mb read-ahead and 16384 stripe size ++ 128
> > max_sectors_kb (same size as my sw raid5 chunk size)
>
> max_sectors_kb is probably your key. On my system I get twice the read
> performance by just reducing max_sectors_kb from default 512 to 192.
>
> Can you do a fresh reboot to shell and then:
> $ cat /sys/block/hda/queue/*
> $ cat /proc/meminfo
> $ echo 3 > /proc/sys/vm/drop_caches
> $ dd if=/dev/hda of=/dev/null bs=1M count=10240
> $ echo 192 > /sys/block/hda/queue/max_sectors_kb
> $ echo 3 > /proc/sys/vm/drop_caches
> $ dd if=/dev/hda of=/dev/null bs=1M count=10240
>
> Thanks!
>
> --
> Al

Ok. sec
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
Justin Piszcz wrote:
> RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU
>
> This should be 1:14 not 1:06 (was with a similarly sized file but not the
> same); the 1:14 is the same file as used with the other benchmarks, and to
> get that I used 256mb read-ahead and 16384 stripe size ++ 128
> max_sectors_kb (same size as my sw raid5 chunk size)

max_sectors_kb is probably your key. On my system I get twice the read performance by just reducing max_sectors_kb from default 512 to 192.

Can you do a fresh reboot to shell and then:
$ cat /sys/block/hda/queue/*
$ cat /proc/meminfo
$ echo 3 > /proc/sys/vm/drop_caches
$ dd if=/dev/hda of=/dev/null bs=1M count=10240
$ echo 192 > /sys/block/hda/queue/max_sectors_kb
$ echo 3 > /proc/sys/vm/drop_caches
$ dd if=/dev/hda of=/dev/null bs=1M count=10240

Thanks!

--
Al
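The recipe above can be collected into one script; a sketch, assuming DEV is adjusted to the disk under test and it is run as root. Writing max_sectors_kb takes effect immediately, so the script records the default and restores it afterwards.

```shell
#!/bin/sh
# Sketch of the max_sectors_kb sweep from this thread. DEV is a
# placeholder device name; must run as root.
DEV=hda
Q=/sys/block/$DEV/queue
default=$(cat "$Q/max_sectors_kb")            # remember the default (e.g. 512)
for kb in "$default" 192 128; do
    echo "$kb" > "$Q/max_sectors_kb"
    echo 3 > /proc/sys/vm/drop_caches         # start each run cache-cold
    printf 'max_sectors_kb=%s: ' "$kb"
    dd if="/dev/$DEV" of=/dev/null bs=1M count=10240 2>&1 | tail -n 1
done
echo "$default" > "$Q/max_sectors_kb"         # restore the default
```

Since it needs root and real hardware, treat it as a template rather than something to run blindly.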
Re: raid5 software vs hardware: parity calculations?
On Thu, 11 Jan 2007, James Ralston wrote:
> I'm having a discussion with a coworker concerning the cost of md's
> raid5 implementation versus hardware raid5 implementations.
>
> Specifically, he states:
>
> > The performance [of raid5 in hardware] is so much better with the
> > write-back caching on the card and the offload of the parity, it
> > seems to me that the minor increase in work of having to upgrade the
> > firmware if there's a buggy one is a highly acceptable trade-off to
> > the increased performance. The md driver still commits you to
> > longer run queues since IO calls to disk, parity calculator and the
> > subsequent kflushd operations are non-interruptible in the CPU. A
> > RAID card with write-back cache releases the IO operation virtually
> > instantaneously.
>
> It would seem that his comments have merit, as there appears to be
> work underway to move stripe operations outside of the spinlock:
>
> http://lwn.net/Articles/184102/
>
> What I'm curious about is this: for real-world situations, how much
> does this matter? In other words, how hard do you have to push md
> raid5 before doing dedicated hardware raid5 becomes a real win?

hardware with battery backed write cache is going to beat the software at small write traffic latency essentially all the time but it's got nothing to do with the parity computation.

-dean
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU

This should be 1:14 not 1:06 (was with a similarly sized file but not the same); the 1:14 is the same file as used with the other benchmarks, and to get that I used 256mb read-ahead and 16384 stripe size ++ 128 max_sectors_kb (same size as my sw raid5 chunk size)

On Fri, 12 Jan 2007, Justin Piszcz wrote:
> On Fri, 12 Jan 2007, Michael Tokarev wrote:
> > Justin Piszcz wrote:
> > > Using 4 raptor 150s:
> > >
> > > Without the tweaks, I get 111MB/s write and 87MB/s read.
> > > With the tweaks, 195MB/s write and 211MB/s read.
> > >
> > > Using kernel 2.6.19.1.
> > >
> > > Without the tweaks and with the tweaks:
> > >
> > > # Stripe tests:
> > > echo 8192 > /sys/block/md3/md/stripe_cache_size
> > >
> > > # DD TESTS [WRITE]
> > >
> > > DEFAULT: (512K)
> > > $ dd if=/dev/zero of=10gb.no.optimizations.out bs=1M count=10240
> > > 10240+0 records in
> > > 10240+0 records out
> > > 10737418240 bytes (11 GB) copied, 96.6988 seconds, 111 MB/s
> > []
> > > 8192K READ AHEAD
> > > $ dd if=10gb.16384k.stripe.out of=/dev/null bs=1M
> > > 10240+0 records in
> > > 10240+0 records out
> > > 10737418240 bytes (11 GB) copied, 64.9454 seconds, 165 MB/s
> >
> > What exactly are you measuring? Linear read/write, like copying one
> > device to another (or to /dev/null), in large chunks?
>
> Check bonnie benchmarks below.
>
> > I don't think it's an interesting test. Hint: how many times a day
> > you plan to perform such a copy?
>
> It is a measurement of raw performance.
>
> > (By the way, for a copy of one block device to another, try using
> > O_DIRECT, with two dd processes doing the copy - one reading, and
> > another writing - this way, you'll get best results without huge
> > affect on other things running on the system. Like this:
> >
> > dd if=/dev/onedev bs=1M iflag=direct |
> > dd of=/dev/twodev bs=1M oflag=direct
> > )
>
> Interesting, I will take this into consideration-- however, an untar test
> shows a 2:1 improvement, see below.
>
> > /mjt
>
> Decompress/unrar a DVD-sized file:
>
> On the following RAID volumes with the same set of [4] 150GB raptors:
>
> RAID 0]  1:13.16 elapsed @ 49% CPU
> RAID 4]  2:05.85 elapsed @ 30% CPU
> RAID 5]  2:01.94 elapsed @ 32% CPU
> RAID 6]  2:39.34 elapsed @ 24% CPU
> RAID 10] 1:52.37 elapsed @ 32% CPU
>
> RAID 5 Tweaked (8192 stripe_cache & 16384 setra/blockdev):
>
> RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU
>
> I did not tweak raid 0, but seeing how RAID5 tweaked is faster than RAID0
> is good enough for me :)
>
> RAID0 did 278MB/s read and 317MB/s write (by the way)
>
> Here are the bonnie results, the times alone speak for themselves, from 8
> minutes to 5 min and 48-59 seconds.
>
> # No optimizations:
> # Run Benchmarks
> Default Bonnie:
> [nr_requests=128,max_sectors_kb=512,stripe_cache_size=256,read_ahead=1536]
> default_run1,4000M,42879,98,105436,19,41081,11,46277,96,87845,15,639.2,1,16:10:16/64,380,4,29642,99,2990,18,469,5,11784,40,1712,12
> default_run2,4000M,47145,99,108664,19,40931,11,46466,97,94158,16,634.8,0,16:10:16/64,377,4,16990,56,2850,17,431,4,21066,71,1800,13
> default_run3,4000M,43653,98,109063,19,40898,11,46447,97,97141,16,645.8,1,16:10:16/64,373,4,22302,75,2793,16,420,4,16708,56,1794,13
> default_run4,4000M,46485,98,110664,20,41102,11,46443,97,93616,16,631.3,1,16:10:16/64,363,3,14484,49,2802,17,388,4,25532,86,1604,12
> default_run5,4000M,43813,98,109800,19,41214,11,46457,97,92563,15,635.1,1,16:10:16/64,376,4,28990,95,2827,17,388,4,22874,76,1817,13
>
> 169.88user 44.01system 8:02.98elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (6major+1102minor)pagefaults 0swaps
> 161.60user 44.33system 7:53.14elapsed 43%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+1095minor)pagefaults 0swaps
> 166.64user 45.24system 8:00.07elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+1096minor)pagefaults 0swaps
> 161.90user 44.66system 8:00.85elapsed 42%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+1094minor)pagefaults 0swaps
> 167.61user 44.12system 8:03.26elapsed 43%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+1092minor)pagefaults 0swaps
>
> All optimizations [bonnie++]
>
> 168.08user 46.05system 5:55.13elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (16major+1092minor)pagefaults 0swaps
> 162.65user 46.21system 5:48.47elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (7major+1101minor)pagefaults 0swaps
> 168.06user 45.74system 5:59.84elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (7major+1102minor)pagefaults 0swaps
> 168.00user 46.18system 5:58.77elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (13major+1095minor)pagefaults 0swaps
> 167.98user 45.53system 5:56.49elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (5major+1101minor)pagefaults 0s
Re: FailSpare event?
On Thursday 11 January 2007 23:23, Neil Brown wrote:
> On Thursday January 11, [EMAIL PROTECTED] wrote:
> > Can someone tell me what this means please? I just received this in
> > an email from one of my servers:

Same problem here, on different machines. But only with mdadm 2.6; with mdadm 2.5.5, no problems.

First machine sends this directly after starting mdadm in monitor mode (kernel 2.6.20-rc3):

event=DeviceDisappeared mddev=/dev/md1 device=Wrong-Level

Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid0 sdb2[1] sda2[0]
      3904704 blocks 16k chunks
md2 : active raid0 sdb3[1] sda3[0]
      153930112 blocks 16k chunks
md3 : active raid5 sdf1[3] sde1[2] sdd1[1] sdc1[0]
      732587712 blocks level 5, 16k chunk, algorithm 2 [4/4] []
md0 : active raid1 sdb1[1] sda1[0]
      192640 blocks [2/2] [UU]
unused devices:

and a second time for md2. Then, about every 60 sec, 4 times:

event=SpareActive mddev=/dev/md3

Second machine sends about every 60 sec 8 messages with (kernel 2.6.19.2):

event=SpareActive mddev=/dev/md0 device=

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sdb1[1] sda1[0]
      979840 blocks [2/2] [UU]
md3 : active raid5 sdh1[5] sdg1[4] sdf1[3] sde1[2] sdd1[1] sdc1[0]
      4899200 blocks level 5, 8k chunk, algorithm 2 [6/6] [UU]
md2 : active raid5 sdh2[7] sdg2[6] sdf2[5] sde2[4] sdd2[3] sdc2[2] sdb2[1] sda2[0]
      6858880 blocks level 5, 4k chunk, algorithm 2 [8/8] []
md0 : active raid5 sdh3[7] sdg3[6] sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
      235086656 blocks level 5, 16k chunk, algorithm 2 [8/8] []
unused devices:

Both machines had never seen any spare device, and there are no failing devices; everything works as expected.
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
On Fri, 12 Jan 2007, Michael Tokarev wrote:
> Justin Piszcz wrote:
> > Using 4 raptor 150s:
> >
> > Without the tweaks, I get 111MB/s write and 87MB/s read.
> > With the tweaks, 195MB/s write and 211MB/s read.
> >
> > Using kernel 2.6.19.1.
> >
> > Without the tweaks and with the tweaks:
> >
> > # Stripe tests:
> > echo 8192 > /sys/block/md3/md/stripe_cache_size
> >
> > # DD TESTS [WRITE]
> >
> > DEFAULT: (512K)
> > $ dd if=/dev/zero of=10gb.no.optimizations.out bs=1M count=10240
> > 10240+0 records in
> > 10240+0 records out
> > 10737418240 bytes (11 GB) copied, 96.6988 seconds, 111 MB/s
> []
> > 8192K READ AHEAD
> > $ dd if=10gb.16384k.stripe.out of=/dev/null bs=1M
> > 10240+0 records in
> > 10240+0 records out
> > 10737418240 bytes (11 GB) copied, 64.9454 seconds, 165 MB/s
>
> What exactly are you measuring? Linear read/write, like copying one
> device to another (or to /dev/null), in large chunks?

Check bonnie benchmarks below.

> I don't think it's an interesting test. Hint: how many times a day
> you plan to perform such a copy?

It is a measurement of raw performance.

> (By the way, for a copy of one block device to another, try using
> O_DIRECT, with two dd processes doing the copy - one reading, and
> another writing - this way, you'll get best results without huge
> affect on other things running on the system. Like this:
>
> dd if=/dev/onedev bs=1M iflag=direct |
> dd of=/dev/twodev bs=1M oflag=direct
> )

Interesting, I will take this into consideration-- however, an untar test shows a 2:1 improvement, see below.

> /mjt

Decompress/unrar a DVD-sized file:

On the following RAID volumes with the same set of [4] 150GB raptors:

RAID 0]  1:13.16 elapsed @ 49% CPU
RAID 4]  2:05.85 elapsed @ 30% CPU
RAID 5]  2:01.94 elapsed @ 32% CPU
RAID 6]  2:39.34 elapsed @ 24% CPU
RAID 10] 1:52.37 elapsed @ 32% CPU

RAID 5 Tweaked (8192 stripe_cache & 16384 setra/blockdev):

RAID 5 TWEAKED: 1:06.41 elapsed @ 60% CPU

I did not tweak raid 0, but seeing how RAID5 tweaked is faster than RAID0 is good enough for me :)

RAID0 did 278MB/s read and 317MB/s write (by the way)

Here are the bonnie results; the times alone speak for themselves, from 8 minutes to 5 min and 48-59 seconds.

# No optimizations:
# Run Benchmarks
Default Bonnie:
[nr_requests=128,max_sectors_kb=512,stripe_cache_size=256,read_ahead=1536]
default_run1,4000M,42879,98,105436,19,41081,11,46277,96,87845,15,639.2,1,16:10:16/64,380,4,29642,99,2990,18,469,5,11784,40,1712,12
default_run2,4000M,47145,99,108664,19,40931,11,46466,97,94158,16,634.8,0,16:10:16/64,377,4,16990,56,2850,17,431,4,21066,71,1800,13
default_run3,4000M,43653,98,109063,19,40898,11,46447,97,97141,16,645.8,1,16:10:16/64,373,4,22302,75,2793,16,420,4,16708,56,1794,13
default_run4,4000M,46485,98,110664,20,41102,11,46443,97,93616,16,631.3,1,16:10:16/64,363,3,14484,49,2802,17,388,4,25532,86,1604,12
default_run5,4000M,43813,98,109800,19,41214,11,46457,97,92563,15,635.1,1,16:10:16/64,376,4,28990,95,2827,17,388,4,22874,76,1817,13

169.88user 44.01system 8:02.98elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (6major+1102minor)pagefaults 0swaps
161.60user 44.33system 7:53.14elapsed 43%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1095minor)pagefaults 0swaps
166.64user 45.24system 8:00.07elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1096minor)pagefaults 0swaps
161.90user 44.66system 8:00.85elapsed 42%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1094minor)pagefaults 0swaps
167.61user 44.12system 8:03.26elapsed 43%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1092minor)pagefaults 0swaps

All optimizations [bonnie++]

168.08user 46.05system 5:55.13elapsed 60%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (16major+1092minor)pagefaults 0swaps
162.65user 46.21system 5:48.47elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (7major+1101minor)pagefaults 0swaps
168.06user 45.74system 5:59.84elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (7major+1102minor)pagefaults 0swaps
168.00user 46.18system 5:58.77elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (13major+1095minor)pagefaults 0swaps
167.98user 45.53system 5:56.49elapsed 59%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (5major+1101minor)pagefaults 0swaps

c6300-optimized:4000M,43976,99,167209,29,73109,22,43471,91,208572,40,511.4,1,16:10:16/64,1109,12,26948,89,2469,14,1051,11,29037,97,2167,16
c6300-optimized:4000M,47455,99,190212,35,70402,21,43167,92,206290,40,503.3,1,16:10:16/64,1071,11,29893,99,2804,16,1059,12,24887,84,2090,16
c6300-optimized:4000M,43979,99,172543,29,71811,21,41760,87,201870,39,498.9,1,16:10:16/64,1042,11,30276,99,2800,16,1063,12,29491,99,2257,17
c6300-optimized:4000M,43824,98,164585,29,73470,22,43098,90,207003,40,489.1,1,16:10:16/64,1045,11,30288,98,2512,15,1018,11,27365,92,2097,16
c6300-optimiz
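Collected in one place, the tweaks behind these "All optimizations" numbers appear to be the following (device names are from this particular box: a 4-disk raid5 md3 with a 128 KiB chunk built from sde/sdg/sdi/sdk; run as root and adjust for your own setup):

```shell
#!/bin/sh
# The raid5 tweaks as described in this thread; a sketch, not a
# recommendation -- optimal values are hardware-dependent.
echo 8192 > /sys/block/md3/md/stripe_cache_size    # default is 256
blockdev --setra 16384 /dev/md3                    # read-ahead, in 512-byte sectors
for d in sde sdg sdi sdk; do                       # the raid5 member disks
    echo 128 > /sys/block/$d/queue/max_sectors_kb  # match the 128 KiB chunk size
done
```

Note that stripe_cache_size and the sysfs queue settings do not persist across reboot, so something like this would need to run from an init script.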
Re: Linux Software RAID 5 Performance Optimizations: 2.6.19.1: (211MB/s read & 195MB/s write)
Justin Piszcz wrote:
> Using 4 raptor 150s:
>
> Without the tweaks, I get 111MB/s write and 87MB/s read.
> With the tweaks, 195MB/s write and 211MB/s read.
>
> Using kernel 2.6.19.1.
>
> Without the tweaks and with the tweaks:
>
> # Stripe tests:
> echo 8192 > /sys/block/md3/md/stripe_cache_size
>
> # DD TESTS [WRITE]
>
> DEFAULT: (512K)
> $ dd if=/dev/zero of=10gb.no.optimizations.out bs=1M count=10240
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 96.6988 seconds, 111 MB/s
[]
> 8192K READ AHEAD
> $ dd if=10gb.16384k.stripe.out of=/dev/null bs=1M
> 10240+0 records in
> 10240+0 records out
> 10737418240 bytes (11 GB) copied, 64.9454 seconds, 165 MB/s

What exactly are you measuring? Linear read/write, like copying one device to another (or to /dev/null), in large chunks?

I don't think it's an interesting test. Hint: how many times a day you plan to perform such a copy?

(By the way, for a copy of one block device to another, try using O_DIRECT, with two dd processes doing the copy - one reading, and another writing - this way, you'll get best results without huge affect on other things running on the system. Like this:

dd if=/dev/onedev bs=1M iflag=direct |
dd of=/dev/twodev bs=1M oflag=direct
)

/mjt