Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?

2007-05-05 Thread Justin Piszcz

Question,

I currently have a 965 chipset-based motherboard and use the 4-port onboard 
controller plus several PCI-e x1 controller cards for a RAID 5 of 10 Raptor 
drives.  I get pretty decent speeds:


[EMAIL PROTECTED] time dd if=/dev/zero of=100gb bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 247.134 seconds, 434 MB/s

real 4m7.164s
user 0m0.223s
sys 3m3.505s
[EMAIL PROTECTED] time dd if=100gb of=/dev/null bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 172.588 seconds, 622 MB/s

real 2m52.631s
user 0m0.212s
sys 1m50.905s
[EMAIL PROTECTED]

Also, when I run simultaneous dd's from all of the drives, I see 
850-860 MB/s, so I am curious whether there is some kind of limitation 
in software RAID that keeps me from getting better than 500 MB/s for 
sequential writes.  With 7 disks I got about the same speed; adding 
3 more for a total of 10 did not seem to help writes at all. 
However, reads improved to 622 MB/s from about 420-430 MB/s.
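
For reference, the simultaneous-read test above boils down to something
like the following sketch (here /dev/sd[a-j] stands in for the ten member
drives; it only reads from them, so it is safe against a live array):

# read 4 GB from every member at once; each dd reports its own rate
for d in /dev/sd[a-j]; do
    dd if=$d of=/dev/null bs=1M count=4096 &
done
wait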


However, if I want to upgrade to more than 12 disks I am out of PCI-e 
slots, so I was wondering: does anyone on this list run a 16-port Areca or 
3ware card and use it for JBOD?  What kind of performance do you see when 
using mdadm with such a card?  Or, if anyone uses mdadm with a smaller 
card, I'd like to hear what kind of experience you have had with that 
type of configuration.


Justin.


Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?

2007-05-05 Thread Bill Davidsen

Justin Piszcz wrote:

Question,

I currently have a 965 chipset-based motherboard and use the 4-port 
onboard controller plus several PCI-e x1 controller cards for a RAID 5 
of 10 Raptor drives.  I get pretty decent speeds:


[EMAIL PROTECTED] time dd if=/dev/zero of=100gb bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 247.134 seconds, 434 MB/s

real 4m7.164s
user 0m0.223s
sys 3m3.505s
[EMAIL PROTECTED] time dd if=100gb of=/dev/null bs=1M count=102400
102400+0 records in
102400+0 records out
107374182400 bytes (107 GB) copied, 172.588 seconds, 622 MB/s

real 2m52.631s
user 0m0.212s
sys 1m50.905s
[EMAIL PROTECTED]

Also, when I run simultaneous dd's from all of the drives, I see 
850-860 MB/s, so I am curious whether there is some kind of limitation 
in software RAID that keeps me from getting better than 500 MB/s for 
sequential writes.  With 7 disks I got about the same speed; adding 
3 more for a total of 10 did not seem to help writes at all. 
However, reads improved to 622 MB/s from about 420-430 MB/s.


However, if I want to upgrade to more than 12 disks I am out of PCI-e 
slots, so I was wondering: does anyone on this list run a 16-port 
Areca or 3ware card and use it for JBOD?  What kind of performance do 
you see when using mdadm with such a card?  Or, if anyone uses mdadm 
with a smaller card, I'd like to hear what kind of experience you have 
had with that type of configuration.


RAID5 is not the fastest at writes; there are patches being tested to 
improve that.
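
Independent of those patches, one md tunable that often affects RAID5/6
write throughput is the stripe cache (a sketch, assuming the array is
/dev/md0 and a kernel recent enough to expose the sysfs knob):

# number of cache entries; each entry holds one 4 KB page per member disk,
# so memory use is roughly stripe_cache_size * 4 KB * number of members
cat /sys/block/md0/md/stripe_cache_size           # default is a small 256
echo 8192 > /sys/block/md0/md/stripe_cache_size   # then re-run the write test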


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?

2007-05-05 Thread Patrik Jonsson
Justin Piszcz wrote:
 However, if I want to upgrade to more than 12 disks I am out of PCI-e
 slots, so I was wondering: does anyone on this list run a 16-port
 Areca or 3ware card and use it for JBOD?  What kind of performance do
 you see when using mdadm with such a card?  Or, if anyone uses mdadm
 with a smaller card, I'd like to hear what kind of experience you
 have had with that type of configuration.
I just recently got an ARC-1260 16-port card and tested md raid5 vs
Areca raid5 with 5x500GB WD RE2 drives. I didn't do anything exhaustive,
just ran bonnie++ on it, and I didn't play around with any parameters
either, but straight out of the box, the 1260 beat md in most of the
performance numbers by 10-50%. I think md did beat the 1260 slightly in
one characteristic, but I don't have the numbers in front of me right
now. If you are interested, I can probably dig them out.

I was disappointed to discover that the 1260 doesn't support SMART
passthrough. Even if you make the disks JBOD, you can't use smartctl to
run self-tests or read attributes. You can query the temperature and
SMART health status (I think) through the Areca CLI interface, but
that's it.

In the end, I decided to save the CPU for numerical stuff and let the
Areca handle this array (I also did an online conversion from raid5 to
raid6), so now I have a 10x500GB Areca raid6 and a 10x250GB md raid5,
all LVM'd together.
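
(For anyone curious, gluing the two arrays together goes roughly like
this sketch; /dev/sda for the Areca volume and /dev/md0 for the md array
are placeholders, and the LV size is illustrative:)

pvcreate /dev/sda /dev/md0           # turn both block devices into PVs
vgcreate bigvg /dev/sda /dev/md0     # one volume group spanning them
lvcreate -n storage -L 6000G bigvg   # size it from vgdisplay's free space
mkfs.xfs /dev/bigvg/storage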

Another thing about the 1260 that you might want to watch out for: it
has compatibility problems with some consumer motherboards when you use
the x16 graphics slot for the RAID card. If this is your plan, check
Areca's compatibility list first.

cheers,

/Patrik






Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?

2007-05-05 Thread Chris Wedgwood
On Sat, May 05, 2007 at 12:33:49PM -0400, Justin Piszcz wrote:

 Also, when I run simultaneous dd's from all of the drives, I see
 850-860 MB/s, so I am curious whether there is some kind of limitation
 in software RAID that keeps me from getting better than 500 MB/s for
 sequential writes.

What does 'vmstat 1' output look like in both cases?  My guess is that
for large writes it's NOT CPU-bound, but it can't hurt to check.
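
For example, something like this sketch (the file name matches the test
above; run it on the array's filesystem):

vmstat 1 > vmstat.log &              # sample CPU and I/O once a second
VMSTAT_PID=$!
dd if=/dev/zero of=100gb bs=1M count=102400
kill $VMSTAT_PID                     # then look at the us/sy/wa/id columns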

 With 7 disks I got about the same speed; adding 3 more for a total
 of 10 did not seem to help writes at all.  However, reads improved
 to 622 MB/s from about 420-430 MB/s.

RAID is quirky.

It's worth fiddling with the stripe size, as that can make a big
difference in performance; it's far from clear why some values work
well on some setups while other setups want very different values.

It would be good to know whether anyone has ever studied stripe size
and controller interleave/layout issues closely enough to understand
why certain values are good, others are very poor, and why it varies
so much from one setup to another.
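
Chunk size is fixed when an array is created, so comparing values means
rebuilding a scratch array each time, roughly like this (a sketch;
/dev/md1 and /dev/sd[b-e] are placeholders and their contents will be
destroyed):

# 4-disk RAID5 with a 256 KB chunk; repeat with --chunk=64, 128, 512, ...
mdadm --create /dev/md1 --level=5 --raid-devices=4 --chunk=256 /dev/sd[b-e]
mkfs.xfs /dev/md1
mount /dev/md1 /mnt/test
# ... then run the same dd or bonnie++ test against /mnt/test ...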

Also, 'dd performance' varies between the start of a disk and the end.
Typically you get better performance at the start of the disk, so dd
might not be a very good benchmark here.

 However, if I want to upgrade to more than 12 disks I am out of
 PCI-e slots, so I was wondering: does anyone on this list run a
 16-port Areca or 3ware card and use it for JBOD?  What kind of
 performance do you see when using mdadm with such a card?  Or, if
 anyone uses mdadm with a smaller card, I'd like to hear what kind
 of experience you have had with that type of configuration.

I've used some 2-, 4- and 8-port 3ware cards.  As JBODs they worked
fine; as RAID cards I had no end of problems.  I'm happy to test
larger cards if someone wants to donate them :-)


Re: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?

2007-05-05 Thread Justin Piszcz



On Sat, 5 May 2007, Emmanuel Florac wrote:


On Sat, 5 May 2007 12:33:49 -0400 (EDT), you wrote:


However, if I want to upgrade to more than 12 disks I am out of
PCI-e slots, so I was wondering: does anyone on this list run a
16-port Areca or 3ware card and use it for JBOD?


I don't use this setup in production, but I tried it with 8-port 3Ware
cards.  I didn't try the latest 9650, though.


 What kind of
performance do you see when using mdadm with such a card?


3 GHz Supermicro P4D, 1 GB RAM, 3Ware 9550SX with 8x250GB (8 MB cache,
7200 RPM) Seagate drives, RAID 0.

Tested XFS and ReiserFS, with 64K and 256K stripes.

Tested under Linux 2.6.15.1, with bonnie++ in fast mode (-f option).
Use bon_csv2html to translate, or see the bonnie++ documentation.
Roughly: 2G is the file size tested; the numbers that follow on each
line are write speed (KB/s), CPU usage (%), rewrite (overwrite) speed,
CPU usage, read speed, CPU usage, then random seeks with CPU usage.
After that come the sequential and random file creates, reads and
deletes, each with their CPU usage.  Fields of '+' signs mean the
operation finished too quickly to give a significant value.
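
(For reference, each result line below came from a run along these lines;
the mount point and output file names are placeholders:)

# -f skips the slow per-character tests, -q writes the CSV line to stdout
bonnie++ -d /mnt/test -s 2g -f -q -u root >> results.csv
bon_csv2html < results.csv > results.html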

# XFS, stripe 256k
storiq,2G,,,353088,69,76437,17,,,197376,16,410.8,0,16,11517,57,+,+++,10699,51,11502,59,+,+++,12158,61
storiq,2G,,,349166,71,75397,17,,,196057,16,433.3,0,16,12744,64,+,+++,12700,58,13008,67,+,+++,9890,51
storiq,2G,,,336683,68,72581,16,,,191254,18,419.9,0,16,12377,62,+,+++,10991,52,12947,67,+,+++,10580,52
storiq,2G,,,335646,65,77938,17,,,195350,17,397.4,0,16,14578,74,+,+++,11085,53,14377,74,+,+++,10852,54
storiq,2G,,,330022,67,73004,17,,,197846,18,412.3,0,16,12534,65,+,+++,10983,52,12161,63,+,+++,11752,61
storiq,2G,,,279454,55,75256,17,,,196065,18,412.7,0,16,13022,67,+,+++,10802,52,13759,72,+,+++,9800,47
storiq,2G,,,314606,61,74883,16,,,194131,16,401.2,0,16,11665,58,+,+++,10723,52,11880,61,+,+++,6659,33
storiq,2G,,,264382,53,72011,15,,,196690,18,411.5,0,16,10194,52,+,+++,12202,57,10367,52,+,+++,9175,45
storiq,2G,,,360252,72,75845,17,,,199721,18,432.7,0,16,12067,61,+,+++,11047,54,12156,62,+,+++,12372,60
storiq,2G,,,280746,57,74541,17,,,193562,19,414.0,0,16,12418,61,+,+++,11090,52,11135,57,+,+++,11309,55
storiq,2G,,,309464,61,79153,18,,,191533,17,419.5,0,16,12705,62,+,+++,11889,57,12027,61,+,+++,10960,54
storiq,2G,,,342122,67,68113,15,,,195572,16,413.5,0,16,13667,69,+,+++,10596,55,12731,66,+,+++,10766,54
storiq,2G,,,329945,63,72183,15,,,193082,18,421.8,0,16,12627,62,+,+++,9270,43,12455,63,+,+++,8878,44
storiq,2G,,,309570,63,69628,16,,,192415,19,413.1,0,16,13568,69,+,+++,10104,48,13512,70,+,+++,9261,45
storiq,2G,,,298528,58,70029,15,,,193531,17,399.5,0,16,13028,64,+,+++,9990,47,10098,52,+,+++,7544,38
storiq,2G,,,260341,52,66979,15,,,197199,18,393.1,0,16,10633,53,+,+++,9189,43,11159,56,+,+++,11696,58
# XFS, stripe 64k
storiq,2G,,,351241,70,90868,22,,,305222,29,408.7,0,16,8593,43,+,+++,6639,31,7555,39,+,+++,6639,33
storiq,2G,,,340145,67,83790,19,,,297148,28,401.4,0,16,9132,46,+,+++,6790,34,8881,45,+,+++,6305,31
storiq,2G,,,325791,65,81314,19,,,282439,26,395.5,0,16,9095,44,+,+++,6255,29,8173,42,+,+++,6194,31
storiq,2G,,,266009,53,83362,20,,,308438,26,407.7,0,16,8362,43,+,+++,6443,30,9264,47,+,+++,6339,33
storiq,2G,,,322776,65,76466,17,,,288001,26,399.7,0,16,8038,41,+,+++,5387,26,6389,34,+,+++,6545,31
storiq,2G,,,309007,60,77846,18,,,290613,29,392.8,0,16,7183,37,+,+++,6492,30,8270,41,+,+++,6813,35
storiq,2G,,,287662,58,72920,17,,,287911,26,398.4,0,16,8893,44,+,+++,,36,8150,41,+,+++,7717,39
storiq,2G,,,288149,56,75743,17,,,300949,29,386.2,0,16,9545,47,+,+++,7572,35,9115,46,+,+++,7211,36
# reiser, stripe 256k
storiq,2G,,,289179,98,102775,26,,,188307,22,444.0,0,16,27326,100,+,+++,21887,99,26726,99,+,+++,20633,98
storiq,2G,,,275847,93,101970,25,,,190551,21,450.2,0,16,27397,100,+,+++,21926,100,26609,100,+,+++,20895,99
storiq,2G,,,289414,99,105080,26,,,189022,22,423.9,0,16,27212,100,+,+++,21757,100,26651,99,+,+++,20863,100
storiq,2G,,,292746,99,103681,25,,,186303,21,431.5,0,16,27375,100,+,+++,21989,99,26251,99,+,+++,20924,99
storiq,2G,,,290222,99,104135,26,,,189656,22,449.7,0,16,27453,99,+,+++,21849,100,26757,99,+,+++,20845,99
storiq,2G,,,291716,99,103872,26,,,187410,23,437.0,0,16,27419,99,+,+++,22119,99,26516,100,+,+++,20934,100
storiq,2G,,,285545,99,101637,25,,,189788,21,422.1,0,16,27224,99,+,+++,21742,99,26500,99,+,+++,20922,100
storiq,2G,,,293042,98,100272,24,,,185631,22,453.8,0,16,27268,99,+,+++,21944,100,26777,100,+,+++,21042,99
# reiser stripe 64k
storiq,2G,,,295569,99,112563,29,,,282178,32,434.5,0,16,27631,99,+,+++,22015,99,27021,100,+,+++,21028,99
storiq,2G,,,287830,98,112449,29,,,271047,33,425.1,0,16,27447,99,+,+++,21973,99,26810,99,+,+++,21008,100
storiq,2G,,,271668,95,114410,30,,,282419,33,438.7,0,16,27495,100,+,+++,22158,100,26707,100,+,+++,21106,100

Speed variation depending on disk position (was: Linux SW RAID: HW Raid Controller/JBOD vs. Multiple PCI-e Cards?)

2007-05-05 Thread Peter Rabbitson
Chris Wedgwood wrote:
 snip
 
 Also, 'dd performance' varies between the start of a disk and the end.
 Typically you get better performance at the start of the disk so dd
 might not be a very good benchmark here.
 

Hi,
Sorry for hijacking this thread, but I was actually planning to ask this
very same question.  Is the behavior you are describing above
manufacturer-dependent, or is it pretty much dictated by the general
design of modern drives?  I have an array of 4 Maxtor SATA drives, and
raw read performance at the end of the disk is 38 MB/s compared to
62 MB/s at the beginning.

Thanks


what does md do if it finds an inconsistency?

2007-05-05 Thread martin f krafft
Neil,

With the check feature in recent md versions, the question came up of
what happens when an inconsistency is found.  Does md fix it?  If so,
which disk does it assume to be wrong?
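
(For context, the check in question is driven through sysfs, roughly as
in this sketch, assuming the array is md0:)

# kick off a consistency scan of the whole array (progress in /proc/mdstat)
echo check > /sys/block/md0/md/sync_action
# afterwards, the mismatch counter shows how much was found inconsistent
cat /sys/block/md0/md/mismatch_cnt
# 'repair' asks md to rewrite whichever copy it decides is wrong
echo repair > /sys/block/md0/md/sync_action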

Cheers,

-- 
martin;  (greetings from the heart of the sun.)
  \ echo mailto: !#^.*|tr * mailto:; [EMAIL PROTECTED]
 
spamtraps: [EMAIL PROTECTED]
 
frank harris has been received
 in all the great houses -- once!
-- oscar wilde




Re: Speed variation depending on disk position

2007-05-05 Thread Benjamin Davenport
Peter Rabbitson wrote:
 Is the behavior you are describing above [decaying STR]
 manufacturer-dependent or is it pretty much dictated by the general
 design of modern drives?

It's an artifact of the physical layout of the disk.  Disks are divided into
tracks (concentric circles laid out across the surface of the drive).  Clearly,
the outer tracks are longer than the inner tracks.  For a very long time, drives
have therefore stored more information on these outer tracks.  Since the disk's
spindle speed is constant, reading these outer tracks means more data
passes under the read head in a given second.  That's why you see
sequential transfer rates decay from the start (outer tracks) of the disk to the
end (inner tracks).  This is the opposite of the behavior seen on CDs, because
the start of a CD is the inside track.
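
The effect is easy to see by reading a chunk at the start of a raw drive
and another near its end (a sketch; /dev/sdb and the offset are
placeholders for a roughly 250 GB drive, and it only reads):

# outer tracks, start of the disk
dd if=/dev/sdb of=/dev/null bs=1M count=1024
# inner tracks, about 230 GB into the same disk
dd if=/dev/sdb of=/dev/null bs=1M count=1024 skip=230000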

-Ben