Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?
Question: I currently have a 965 chipset-based motherboard and use the 4 onboard ports plus several PCI-e x1 controller cards for a RAID 5 of 10 Raptor drives. I get pretty decent speeds:

[EMAIL PROTECTED] time dd if=/dev/zero of=100gb bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 247.134 seconds, 434 MB/s

real    4m7.164s
user    0m0.223s
sys     3m3.505s

[EMAIL PROTECTED] time dd if=100gb of=/dev/null bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 172.588 seconds, 622 MB/s

real    2m52.631s
user    0m0.212s
sys     1m50.905s

Also, when I run simultaneous dd's from all of the drives, I see 850-860 MB/s. I am curious whether there is some kind of limitation in software RAID that keeps me from getting better than 500 MB/s for sequential write speed. With 7 disks I got about the same speed; adding 3 more for a total of 10 did not seem to help writes. However, read speed improved from about 420-430 MB/s to 622 MB/s.

If I want to upgrade to more than 12 disks, I am out of PCI-e slots, so I was wondering: does anyone on this list run a 16-port Areca or 3ware card and use it for JBOD? What kind of performance do you see when using mdadm with such a card? And if anyone uses mdadm with a smaller card, I'd like to hear what kind of experiences you have had with that type of configuration.

Justin.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
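As an aside, dd's reported throughput is just bytes divided by elapsed seconds, in decimal megabytes. A minimal awk sketch sanity-checking the 434 MB/s write figure from the dd output above (all numbers taken from that output, nothing else assumed):

```shell
# Reproduce dd's reported throughput from its own output: dd uses
# decimal units, so MB/s = bytes / seconds / 1,000,000.
bytes=107374182400   # bytes copied, from the write test above
secs=247.134         # elapsed seconds, from the same line
awk -v b="$bytes" -v s="$secs" 'BEGIN { printf "%.0f MB/s\n", b / s / 1000000 }'
```

The same arithmetic on the read test (172.588 seconds) yields the 622 MB/s figure.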
Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?
Justin Piszcz wrote:
> Question: I currently have a 965 chipset-based motherboard and use the 4
> onboard ports plus several PCI-e x1 controller cards for a RAID 5 of 10
> Raptor drives. I get pretty decent speeds: [snip dd results]
>
> I am curious whether there is some kind of limitation in software RAID
> that keeps me from getting better than 500 MB/s for sequential write
> speed. With 7 disks I got about the same speed; adding 3 more for a
> total of 10 did not seem to help writes.

RAID 5 is not the fastest at write; there are patches being tested to improve that.

-- 
bill davidsen [EMAIL PROTECTED]
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?
Justin Piszcz wrote:
> If I want to upgrade to more than 12 disks, I am out of PCI-e slots, so I
> was wondering: does anyone on this list run a 16-port Areca or 3ware card
> and use it for JBOD? What kind of performance do you see when using mdadm
> with such a card?

I just recently got an ARC-1260 16-port card and tested md RAID 5 against Areca RAID 5 with 5x500GB WD RE2 drives. I didn't do anything exhaustive, just ran bonnie++ on it, and I didn't play around with any parameters either, but straight out of the box the 1260 beat md in most of the performance numbers by 10-50%. I think md did beat the 1260 slightly in one characteristic, but I don't have the numbers in front of me right now. If you are interested, I can probably dig them out.

I was disappointed to discover that the 1260 doesn't support SMART passthrough. Even if you make the disks JBOD, you can't use smartctl to run self-tests or read attributes. You can query the temperature and SMART health status (I think) through the Areca CLI interface, but that's it.

In the end, I decided to save the CPU for numerical work and let the Areca handle this array (I also did an online conversion from RAID 5 to RAID 6), so now I have a 10x500GB Areca RAID 6 and a 10x250GB md RAID 5, all LVM'd together.

Another thing about the 1260 to watch out for: it has compatibility problems with some consumer motherboards when you put the RAID card in the x16 graphics slot. If that is your plan, check Areca's compatibility list first.

cheers,
/Patrik
Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?
On Sat, May 05, 2007 at 12:33:49PM -0400, Justin Piszcz wrote:
> Also, when I run simultaneous dd's from all of the drives, I see
> 850-860 MB/s. I am curious whether there is some kind of limitation in
> software RAID that keeps me from getting better than 500 MB/s for
> sequential write speed.

What does "vmstat 1" output look like in both cases? My guess is that for large writes it's NOT CPU bound, but it can't hurt to check.

> With 7 disks I got about the same speed; adding 3 more for a total of 10
> did not seem to help writes. However, read speed improved from about
> 420-430 MB/s to 622 MB/s.

RAID is quirky. It's worth fiddling with the stripe size, as that can make a big difference in performance --- it's far from clear why some values work well on some setups while other setups want very different values. It would be good to know if anyone has ever studied stripe size, and also controller interleave/layout issues, well enough to understand why certain values are good, others are very poor, and why it varies so much from one setup to another.

Also, dd performance varies between the start of a disk and the end. Typically you get better performance at the start of the disk, so dd might not be a very good benchmark here.

> If I want to upgrade to more than 12 disks, I am out of PCI-e slots, so I
> was wondering, does anyone on this list run a 16-port Areca or 3ware card
> and use it for JBOD? What kind of performance do you see when using mdadm
> with such a card?

I've used some 2-, 4- and 8-port 3ware cards. As JBODs they worked fine; as RAID cards I had no end of problems. I'm happy to test larger cards if someone wants to donate them :-)
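The start-vs-end effect is easy to measure yourself. A sketch under stated assumptions: the device name /dev/sdX and the total sector count are placeholders (use `blockdev --getsz` on a real disk); the dd commands are printed rather than executed, since no real device is assumed here.

```shell
# Compare sequential read speed at the start vs the last GiB of a disk.
# DEV and TOTAL are illustrative assumptions; on a real system get the
# sector count with: blockdev --getsz "$DEV"
DEV=/dev/sdX
GIB_SECTORS=2097152                 # 1 GiB in 512-byte sectors
TOTAL=976773168                     # example total sectors (~500 GB disk)
SKIP=$(( TOTAL - GIB_SECTORS ))     # sector offset of the final GiB
echo "dd if=$DEV of=/dev/null bs=1M count=1024"                       # start of disk
echo "dd if=$DEV of=/dev/null bs=512 count=$GIB_SECTORS skip=$SKIP"   # end of disk
```

Both reads are harmless (the disk is only read), and the MB/s figures dd prints for the two regions show the decay directly.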
Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?
On Sat, 5 May 2007, Emmanuel Florac wrote:

On Sat, 5 May 2007 12:33:49 -0400 (EDT), you wrote:
> If I want to upgrade to more than 12 disks, I am out of PCI-e slots, so I
> was wondering, does anyone on this list run a 16-port Areca or 3ware card
> and use it for JBOD?

I don't use this setup in production, but I tried it with 8-port 3ware cards. I didn't try the latest 9650, though.

> What kind of performance do you see when using mdadm with such a card?

Test setup: 3 GHz Supermicro P4D, 1 GB RAM, 3ware 9550SX with 8x250GB 7200 RPM Seagate drives (8 MB cache), RAID 0. Tested XFS and reiserfs, with 64K and 256K stripes, under Linux 2.6.15.1, with bonnie++ in fast mode (the -f option). Use bon_csv2html to translate, or see the bonnie++ documentation. Roughly: 2G is the file size tested; the numbers on each line are write speed (KB/s), CPU usage (%), rewrite (overwrite) speed, CPU usage, read speed, CPU usage. Then follow sequential and random seeks, reads, writes and deletes with their CPU usage. "+" means no significant value.
# XFS, stripe 256k
storiq,2G,,,353088,69,76437,17,,,197376,16,410.8,0,16,11517,57,+,+++,10699,51,11502,59,+,+++,12158,61
storiq,2G,,,349166,71,75397,17,,,196057,16,433.3,0,16,12744,64,+,+++,12700,58,13008,67,+,+++,9890,51
storiq,2G,,,336683,68,72581,16,,,191254,18,419.9,0,16,12377,62,+,+++,10991,52,12947,67,+,+++,10580,52
storiq,2G,,,335646,65,77938,17,,,195350,17,397.4,0,16,14578,74,+,+++,11085,53,14377,74,+,+++,10852,54
storiq,2G,,,330022,67,73004,17,,,197846,18,412.3,0,16,12534,65,+,+++,10983,52,12161,63,+,+++,11752,61
storiq,2G,,,279454,55,75256,17,,,196065,18,412.7,0,16,13022,67,+,+++,10802,52,13759,72,+,+++,9800,47
storiq,2G,,,314606,61,74883,16,,,194131,16,401.2,0,16,11665,58,+,+++,10723,52,11880,61,+,+++,6659,33
storiq,2G,,,264382,53,72011,15,,,196690,18,411.5,0,16,10194,52,+,+++,12202,57,10367,52,+,+++,9175,45
storiq,2G,,,360252,72,75845,17,,,199721,18,432.7,0,16,12067,61,+,+++,11047,54,12156,62,+,+++,12372,60
storiq,2G,,,280746,57,74541,17,,,193562,19,414.0,0,16,12418,61,+,+++,11090,52,11135,57,+,+++,11309,55
storiq,2G,,,309464,61,79153,18,,,191533,17,419.5,0,16,12705,62,+,+++,11889,57,12027,61,+,+++,10960,54
storiq,2G,,,342122,67,68113,15,,,195572,16,413.5,0,16,13667,69,+,+++,10596,55,12731,66,+,+++,10766,54
storiq,2G,,,329945,63,72183,15,,,193082,18,421.8,0,16,12627,62,+,+++,9270,43,12455,63,+,+++,8878,44
storiq,2G,,,309570,63,69628,16,,,192415,19,413.1,0,16,13568,69,+,+++,10104,48,13512,70,+,+++,9261,45
storiq,2G,,,298528,58,70029,15,,,193531,17,399.5,0,16,13028,64,+,+++,9990,47,10098,52,+,+++,7544,38
storiq,2G,,,260341,52,66979,15,,,197199,18,393.1,0,16,10633,53,+,+++,9189,43,11159,56,+,+++,11696,58

# XFS, stripe 64k
storiq,2G,,,351241,70,90868,22,,,305222,29,408.7,0,16,8593,43,+,+++,6639,31,7555,39,+,+++,6639,33
storiq,2G,,,340145,67,83790,19,,,297148,28,401.4,0,16,9132,46,+,+++,6790,34,8881,45,+,+++,6305,31
storiq,2G,,,325791,65,81314,19,,,282439,26,395.5,0,16,9095,44,+,+++,6255,29,8173,42,+,+++,6194,31
storiq,2G,,,266009,53,83362,20,,,308438,26,407.7,0,16,8362,43,+,+++,6443,30,9264,47,+,+++,6339,33
storiq,2G,,,322776,65,76466,17,,,288001,26,399.7,0,16,8038,41,+,+++,5387,26,6389,34,+,+++,6545,31
storiq,2G,,,309007,60,77846,18,,,290613,29,392.8,0,16,7183,37,+,+++,6492,30,8270,41,+,+++,6813,35
storiq,2G,,,287662,58,72920,17,,,287911,26,398.4,0,16,8893,44,+,+++,,36,8150,41,+,+++,7717,39
storiq,2G,,,288149,56,75743,17,,,300949,29,386.2,0,16,9545,47,+,+++,7572,35,9115,46,+,+++,7211,36

# reiser, stripe 256k
storiq,2G,,,289179,98,102775,26,,,188307,22,444.0,0,16,27326,100,+,+++,21887,99,26726,99,+,+++,20633,98
storiq,2G,,,275847,93,101970,25,,,190551,21,450.2,0,16,27397,100,+,+++,21926,100,26609,100,+,+++,20895,99
storiq,2G,,,289414,99,105080,26,,,189022,22,423.9,0,16,27212,100,+,+++,21757,100,26651,99,+,+++,20863,100
storiq,2G,,,292746,99,103681,25,,,186303,21,431.5,0,16,27375,100,+,+++,21989,99,26251,99,+,+++,20924,99
storiq,2G,,,290222,99,104135,26,,,189656,22,449.7,0,16,27453,99,+,+++,21849,100,26757,99,+,+++,20845,99
storiq,2G,,,291716,99,103872,26,,,187410,23,437.0,0,16,27419,99,+,+++,22119,99,26516,100,+,+++,20934,100
storiq,2G,,,285545,99,101637,25,,,189788,21,422.1,0,16,27224,99,+,+++,21742,99,26500,99,+,+++,20922,100
storiq,2G,,,293042,98,100272,24,,,185631,22,453.8,0,16,27268,99,+,+++,21944,100,26777,100,+,+++,21042,99

# reiser, stripe 64k
storiq,2G,,,295569,99,112563,29,,,282178,32,434.5,0,16,27631,99,+,+++,22015,99,27021,100,+,+++,21028,99
storiq,2G,,,287830,98,112449,29,,,271047,33,425.1,0,16,27447,99,+,+++,21973,99,26810,99,+,+++,21008,100
storiq,2G,,,271668,95,114410,30,,,282419,33,438.7,0,16,27495,100,+,+++,22158,100,26707,100,+,+++,21106,100
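For anyone squinting at the CSV above, the interesting fields can be pulled out with awk. A small sketch following Emmanuel's legend for this "fast mode" output (field 5 is block write KB/s, field 7 rewrite, field 11 block read), run against the first XFS 256K-stripe line:

```shell
# Decode one bonnie++ fast-mode CSV line from the results above.
# Empty fields 3-4 and 9-10 are the per-character tests skipped by -f.
line='storiq,2G,,,353088,69,76437,17,,,197376,16,410.8,0,16,11517,57,+,+++,10699,51,11502,59,+,+++,12158,61'
echo "$line" | awk -F, '{ printf "write %d KB/s, rewrite %d KB/s, read %d KB/s\n", $5, $7, $11 }'
```

This prints the same three figures bon_csv2html would render for that run.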
Speed variation depending on disk position (was: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?)
Chris Wedgwood wrote:
> [snip]
> Also, dd performance varies between the start of a disk and the end.
> Typically you get better performance at the start of the disk, so dd
> might not be a very good benchmark here.

Hi,

Sorry for hijacking this thread, but I was actually planning to ask this very same question. Is the behavior you are describing above manufacturer-dependent, or is it pretty much dictated by the general design of modern drives? I have an array of 4 Maxtor SATA drives, and raw read performance at the end of the disk is 38 MB/s compared to 62 MB/s at the beginning.

Thanks
what does md do if it finds an inconsistency?
Neil,

With the check feature in recent md, the question popped up of what happens when an inconsistency is found. Does md fix it? If so, which disk does it assume to be wrong?

Cheers,
-- 
martin; (greetings from the heart of the sun.)
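For context, the check machinery in question is driven through sysfs. A sketch, assuming an array at /dev/md0 (the commands are only printed here, since no array is assumed to exist):

```shell
# Sketch of md's consistency-check interface via sysfs. The array name
# md0 is an assumption; substitute your own. Commands are echoed rather
# than executed.
MD=md0
SYS=/sys/block/$MD/md
echo "echo check  > $SYS/sync_action    # read-only scan; mismatches are counted"
echo "cat $SYS/mismatch_cnt             # nonzero => inconsistencies were found"
echo "echo repair > $SYS/sync_action    # rewrite redundancy to restore consistency"
```

As I understand it, "check" only counts mismatches into mismatch_cnt, while "repair" rewrites the redundancy (parity for RAID 5/6, the other copies for RAID 1) from the data blocks, i.e. md does not try to guess which disk is wrong.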
Re: Speed variation depending on disk position
Peter Rabbitson wrote:
> Is the behavior you are describing above [decaying STR]
> manufacturer-dependent, or is it pretty much dictated by the general
> design of modern drives?

It's an artifact of the physical layout of the disk. Disks are divided into tracks: concentric circles laid out across the surface of the platter. Clearly, the outer tracks are longer than the inner tracks, and for a very long time drives have therefore stored more information on the outer tracks. Since the disk's spindle speed is constant, reading an outer track means more data passes under the active read head in a given second. That's why you see sequential transfer rates decay from the start of the disk (outer tracks) to the end (inner tracks). This is the opposite of the behavior seen on CDs, because the start of a CD is the inside track.

-Ben
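This geometry argument can be put to numbers: if each track holds data in proportion to its circumference, the outer/inner STR ratio is simply the ratio of the usable radii. A back-of-envelope awk sketch; the radii are illustrative guesses for a 3.5-inch platter, not measurements from any drive in this thread:

```shell
# Ideal zoned-recording model: STR scales with track radius, so the
# outer/inner throughput ratio approximates the usable-radius ratio.
awk 'BEGIN {
    outer = 46; inner = 20;   # mm, hypothetical usable data radii
    printf "expected outer/inner STR ratio: %.1f\n", outer / inner
}'
```

Real drives use a modest number of discrete recording zones rather than a continuum, and the innermost data radius is often larger than this guess, so observed ratios (62/38 ≈ 1.6 in Peter's case) come out smaller than the ideal.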