Re: Linux: Why software RAID?

2006-08-24 Thread Adam Kropelin
Jeff Garzik [EMAIL PROTECTED] wrote:
> But anyway, to help answer the question of hardware vs. software RAID, I
> wrote up a page:
>
>   http://linux.yyz.us/why-software-raid.html
>
> Generally, you want software RAID unless your PCI bus (or more rarely,
> your CPU) is getting saturated.  With RAID-0, there is no duplication of
> data, and so, PCI bus and CPU usage should be about the same for
> hardware and software RAID.

Hardware RAID can be (!= is) more tolerant of serious drive failures
where a single drive locks up the bus. A high-end hardware RAID card 
may be designed with independent controllers so a single drive failure
cannot take other spindles down with it. The same can be accomplished 
with sw RAID of course if the builder is careful to use multiple PCI 
cards, etc. Sw RAID over your motherboard's onboard controllers leaves
you vulnerable.
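
To make that concrete (device names here are hypothetical, not from this
thread), a mirror whose halves sit behind two different controllers could
be created along these lines:

  # Sketch only: /dev/sda1 on the onboard controller, /dev/sdc1 on a
  # separate PCI HBA, so a wedged bus on one card should not take both
  # mirror halves down with it.
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdc1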

--Adam



Re: Linux: Why software RAID?

2006-08-24 Thread Adam Kropelin
On Thu, Aug 24, 2006 at 02:20:50PM +0100, Alan Cox wrote:
> On Thu, 2006-08-24 at 09:07 -0400, Adam Kropelin wrote:
>> Jeff Garzik [EMAIL PROTECTED] wrote:
>> with sw RAID of course if the builder is careful to use multiple PCI
>> cards, etc. Sw RAID over your motherboard's onboard controllers leaves
>> you vulnerable.
>
> Generally speaking the channels on onboard ATA are independent with any
> vaguely modern card.

Ahh, I did not know that. Does this apply to master/slave connections on
the same PATA cable as well? I know zero about PATA, but I assumed from
the terminology that master and slave needed to cooperate rather closely.

> And for newer systems well the motherboard tends to
> be festooned with random SATA controllers, all separate!

And how. You can't swing a dead cat without hitting a half-dozen ATA
ports these days. And most of them are those infuriatingly insecure SATA
connectors that pop off when you look at them cross-eyed...

--Adam



Re: [PATCH 000 of 5] md: Introduction

2006-01-23 Thread Adam Kropelin

Neil Brown wrote:

> On Saturday January 21, [EMAIL PROTECTED] wrote:
>> On the first try I neglected to read the directions and increased the
>> number of devices first (which worked) and then attempted to add the
>> physical device (which didn't work; at least not the way I intended).
>
> Thanks, this is exactly the sort of feedback I was hoping for - people
> testing things that I didn't think to...
>
>>   mdadm --create -l5 -n3 /dev/md0 /dev/sda /dev/sdb /dev/sdc
>>
>> md0 : active raid5 sda[0] sdc[2] sdb[1]
>>   2097024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
>>
>>   mdadm --grow -n4 /dev/md0
>>
>> md0 : active raid5 sda[0] sdc[2] sdb[1]
>>   3145536 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
>
> I assume that no resync started at this point?  It should have done.


Actually, it did start a resync. Sorry, I should have mentioned that. I 
waited until the resync completed before I issued the 'mdadm --add' 
command.



>> md0 : active raid5 sdd[3] sdc[2] sdb[1] sda[0]
>>   2097024 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
>>       ...should this be... [4/3] [UUU_] perhaps?
>
> Well, part of the array is 4/4 UUUU and part is 3/3 UUU.  How do
> you represent that?  I think 4/4 UUUU is best.


I see your point. I was expecting some indication that my array was
vulnerable and that the new disk was not fully utilized yet. I guess the
resync-in-progress indicator is sufficient.



>> My final test was a repeat of #2, but with data actively being written
>> to the array during the reshape (the previous tests were on an idle,
>> unmounted array). This one failed pretty hard, with several processes
>> ending up in the D state.
>
> Hmmm... I tried similar things but didn't get this deadlock.  Somehow
> the fact that mdadm is holding the reconfig_sem semaphore means that
> some IO cannot proceed and so mdadm cannot grab and resize all the
> stripe heads... I'll have to look more deeply into this.


For what it's worth, I'm using the Buslogic SCSI driver for the disks in 
the array.



>> I'm happy to do more tests. It's easy to conjure up virtual disks and
>> load them with irrelevant data (like kernel trees ;)
>
> Great.  I'll probably be putting out a new patch set late this week
> or early next.  Hopefully it will fix the issues you have found and you
> can try it again.


Looking forward to it...

--Adam



Re: [PATCH 000 of 5] md: Introduction

2006-01-21 Thread Adam Kropelin
NeilBrown [EMAIL PROTECTED] wrote:
> In line with the principle of release early, following are 5 patches
> against md in 2.6.latest which implement reshaping of a raid5 array.
> By this I mean adding 1 or more drives to the array and then re-laying
> out all of the data.

I've been looking forward to a feature like this, so I took the
opportunity to set up a vmware session and give the patches a try. I
encountered both success and failure, and here are the details of both.

On the first try I neglected to read the directions and increased the
number of devices first (which worked) and then attempted to add the
physical device (which didn't work; at least not the way I intended).
The result was an array of size 4, operating in degraded mode, with 
three active drives and one spare. I was unable to find a way to coax
mdadm into adding the 4th drive as an active device instead of a 
spare. I'm not an mdadm guru, so there may be a method I overlooked.
Here's what I did, interspersed with trimmed /proc/mdstat output:

  mdadm --create -l5 -n3 /dev/md0 /dev/sda /dev/sdb /dev/sdc

md0 : active raid5 sda[0] sdc[2] sdb[1]
  2097024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

  mdadm --grow -n4 /dev/md0

md0 : active raid5 sda[0] sdc[2] sdb[1]
  3145536 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

  mdadm --manage --add /dev/md0 /dev/sdd

md0 : active raid5 sdd[3](S) sda[0] sdc[2] sdb[1]
  3145536 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

  mdadm --misc --stop /dev/md0
  mdadm --assemble /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd

md0 : active raid5 sdd[3](S) sda[0] sdc[2] sdb[1]
  3145536 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
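
For completeness, the new disk's role (active vs. spare) can also be
checked with mdadm's inspection modes; output is omitted here since it
was not captured at the time:

  mdadm --detail /dev/md0      # per-device state, including "spare"
  mdadm --examine /dev/sdd     # superblock of the newly added member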

For my second try I actually read the directions and things went much
better, aside from a possible /proc/mdstat glitch shown below.

  mdadm --create -l5 -n3 /dev/md0 /dev/sda /dev/sdb /dev/sdc

md0 : active raid5 sda[0] sdc[2] sdb[1]
  2097024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

  mdadm --manage --add /dev/md0 /dev/sdd

md0 : active raid5 sdd[3](S) sdc[2] sdb[1] sda[0]
  2097024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

  mdadm --grow -n4 /dev/md0

md0 : active raid5 sdd[3] sdc[2] sdb[1] sda[0]
  2097024 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      ...should this be... [4/3] [UUU_] perhaps?
  [>....................]  recovery =  0.4% (5636/1048512) finish=9.1min speed=1878K/sec

[...time passes...]

md0 : active raid5 sdd[3] sdc[2] sdb[1] sda[0]
  3145536 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

My final test was a repeat of #2, but with data actively being written
to the array during the reshape (the previous tests were on an idle,
unmounted array). This one failed pretty hard, with several processes
ending up in the D state. I repeated it twice and sysrq-t dumps can be
found at http://www.kroptech.com/~adk0212/md-raid5-reshape-wedge.txt.
The writeout load was a kernel tree untar started shortly before the
'mdadm --grow' command was given. mdadm hung, as did tar. Any process
which subsequently attempted to access the array hung as well. A second
attempt at the same thing hung similarly, although only pdflush shows up
hung in that trace. mdadm and tar are missing for some reason.
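
A rough sketch of the sequence, with the filesystem type and mount point
being assumptions (they are not recorded above):

  # Reshape under write load; fs type, mount point, and tarball path are
  # placeholders, not taken from the report.
  mkfs.ext3 /dev/md0
  mount /dev/md0 /mnt
  tar xf /path/to/linux-kernel.tar -C /mnt &    # writeout load
  mdadm --grow -n4 /dev/md0                     # reshape while writes are in flight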

I'm happy to do more tests. It's easy to conjure up virtual disks and
load them with irrelevant data (like kernel trees ;)
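
(The disks above are vmware virtual SCSI disks; as an alternative sketch,
sparse files behind loop devices also make disposable test members. Paths
and sizes below are arbitrary.)

  # Four ~1GB sparse files exposed as loop devices for md experiments.
  for i in 0 1 2 3; do
      dd if=/dev/zero of=/tmp/disk$i bs=1M count=0 seek=1024
      losetup /dev/loop$i /tmp/disk$i
  done
  mdadm --create /dev/md1 -l5 -n3 /dev/loop0 /dev/loop1 /dev/loop2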

--Adam
