Re: RAID5 recovery trouble, bd_claim failed?

2006-04-16 Thread Neil Brown
On Sunday April 16, [EMAIL PROTECTED] wrote: > Hi Neil, > Thanks for your reply. I tried that, but here is there error I > received: > > [EMAIL PROTECTED]:/etc# mdadm --assemble /dev/md0 > --uuid=38081921:59a998f9:64c1a001:ec53 4ef2 /dev/hd[efgh] > mdadm: failed to add /dev/hdf to /dev/md0:

Re: RAID5 recovery trouble, bd_claim failed?

2006-04-16 Thread Nathanial Byrnes
Hi Neil, Thanks for your reply. I tried that, but here is there error I received: [EMAIL PROTECTED]:/etc# mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec53 4ef2 /dev/hd[efgh] mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy mdadm: /dev/md0 assembled from 2

[bug?] MD doesn't stop failed array

2006-04-16 Thread Molle Bestefich
May I offer the point of view that this is a bug: MD apparently tries to keep a raid5 array up by using 4 out of 6 disks. Here's the event chain, from start to now: == 1.) Array assembled automatically with 6/6 devices. 2.) Read error, MD kicks sdb1. 3.)

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Molle Bestefich
Neil Brown wrote: > > How do I force MD to raise the event counter on sdb1 and accept it > > into the array as-is, so I can avoid bad-block induced data > > corruption? > > For that, you have to recreate the array. Scary. And hairy. How much do I have to bribe you to make this work: # mdadm --as

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread CaT
On Sun, Apr 16, 2006 at 09:42:34PM -0300, Carlos Carvalho wrote: > CaT ([EMAIL PROTECTED]) wrote on 17 April 2006 10:25: > >Not necessarily. You probably have something like (say) 200GB of data > >stripes across that disk. That one read error may affect just one or a > >few which means there's a wh

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Neil Brown
On Monday April 17, [EMAIL PROTECTED] wrote: > Neil Brown wrote: > > use --assemble --force > > # mdadm --assemble --force /dev/md1 > mdadm: forcing event count in /dev/sda1(0) from 163362 upto 163368 > mdadm: /dev/md1 has been started with 5 drives (out of 6). > > Oops, only 5 drives, but I know

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Molle Bestefich
Carlos Carvalho wrote: > You want the array to stay on and jump here and there getting the > stripes from wherever it can, each time from a different set of disks. > That's surely nice but I think it's too much to ask... With the bad block rate on modern disks as high as ever, I dare say it's an a

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Carlos Carvalho
CaT ([EMAIL PROTECTED]) wrote on 17 April 2006 10:25: >On Sun, Apr 16, 2006 at 08:46:52PM -0300, Carlos Carvalho wrote: >> Neil Brown ([EMAIL PROTECTED]) wrote on 17 April 2006 09:30: >> >The easiest thing to do when you get an error on a drive is to kick >> >the drive from the array, so that

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Carlos Carvalho
Molle Bestefich ([EMAIL PROTECTED]) wrote on 17 April 2006 02:21: >Neil Brown wrote: >> use --assemble --force > ># mdadm --assemble --force /dev/md1 >mdadm: forcing event count in /dev/sda1(0) from 163362 upto 163368 >mdadm: /dev/md1 has been started with 5 drives (out of 6). > >Oops, only

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread CaT
On Sun, Apr 16, 2006 at 08:46:52PM -0300, Carlos Carvalho wrote: > Neil Brown ([EMAIL PROTECTED]) wrote on 17 April 2006 09:30: > >The easiest thing to do when you get an error on a drive is to kick > >the drive from the array, so that is what the code always did, and > >still does in many cases

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Molle Bestefich
Neil Brown wrote: > use --assemble --force # mdadm --assemble --force /dev/md1 mdadm: forcing event count in /dev/sda1(0) from 163362 upto 163368 mdadm: /dev/md1 has been started with 5 drives (out of 6). Oops, only 5 drives, but I know data is OK on all 6 drives. I also know that there are bad

Re: Raid 4 idea!

2006-04-16 Thread Neil Brown
On Saturday April 8, [EMAIL PROTECTED] wrote: > Hello, list, > > I have one idea! > > I using raid4, and found to be really good, but i think this can be more > better! :-) > > If the raid4 fails one disk (except the patiry disk), the md currently only > drop the disk from array, but i think thi

Re: Questions about: Where to find algorithms for RAID5 / RAID6

2006-04-16 Thread Neil Brown
On Tuesday April 11, [EMAIL PROTECTED] wrote: > Good day. > > I am looking for some information, and hope the readers of this list > might be able to point me in the right direction: > > Here is the scenario: > In RAID5 ( or RAID6) when a file is written, some parity data is > created, (by some f
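
[Illustrative note, not part of the archived message] For the RAID5 half of the question above, the parity block of a stripe is the byte-wise XOR of its data blocks, so any single missing block can be rebuilt by XOR-ing the survivors with the parity. A minimal Python sketch, assuming tiny two-byte blocks and a hypothetical helper name `xor_blocks`; RAID6's second syndrome uses different arithmetic and is not shown:

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three made-up data blocks belonging to one stripe.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Pretend the second block is lost: XOR the survivors with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```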

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Carlos Carvalho
Neil Brown ([EMAIL PROTECTED]) wrote on 17 April 2006 09:30: >The easiest thing to do when you get an error on a drive is to kick >the drive from the array, so that is what the code always did, and >still does in many cases. >It is arguable that for a read error on a degraded raid5, that may no

RE: Problem in creating RAID5 MD array with kernel 2.6.15

2006-04-16 Thread Neil Brown
On Tuesday April 11, [EMAIL PROTECTED] wrote: > Hi Neil, > > Can you provide me details of your setup? Just a ordinary p4 server with a bunch of SCSI drives. Running 2.6.17-rc1-mm1, but I doubt that would make a difference. > Is there any kernel configuration that I will have to change and buil

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Molle Bestefich
Neil Brown wrote: > It is arguable that for a read error on a degraded raid5, that may not > be the best thing to do, but I'm not completely convinced. A read > error will mean that a write to the same stripe will have to fail, so > at the very least we would want to switch the array read-only. T

Re: mdadm + raid1 of 2 disks and now need to add more

2006-04-16 Thread Neil Brown
On Wednesday April 12, [EMAIL PROTECTED] wrote: > yea! it is a testing server I use that is the same kernel as our main > server... just for tests like these so we don't screw up I never > run the real thing before I'm sure I know I've got it working on our > test server. > > Well... what I d

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Neil Brown
On Monday April 17, [EMAIL PROTECTED] wrote: > Neil Brown wrote: > > You shouldn't need to upgrade kernel > > Ok. > I had a crazy idea that 2 devices down in a RAID5 was an MD bug. > > I didn't expect MD to kick that last disk - I would have thought that > it would just pass on the read error in

Re: linear writes to raid5

2006-04-16 Thread Neil Brown
On Wednesday April 12, [EMAIL PROTECTED] wrote: > > Neil Brown (NB) writes: > > NB> There are a number of aspects to this. > > NB> - When a write arrives we 'plug' the queue so the stripe goes onto a > NB>'delayed' list which doesn't get processed until an unplug happens, > NB>o

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Molle Bestefich
Neil Brown wrote: > You shouldn't need to upgrade kernel Ok. I had a crazy idea that 2 devices down in a RAID5 was an MD bug. I didn't expect MD to kick that last disk - I would have thought that it would just pass on the read error in that situation. If you've got the time to explain I'd like t

Re: How to verify RAID parity/mirroring?

2006-04-16 Thread Neil Brown
On April 13, [EMAIL PROTECTED] wrote: > I'm looking forward to testing the new SATA NCQ support that the linux > IDE developers have working, but of course that opens me to the risk of > disk corruption. > > So I'd like to be able to do clever things with the existing RAID arrays > to mitigate t

Re: Softraid wont be restarting

2006-04-16 Thread Neil Brown
On Thursday April 13, [EMAIL PROTECTED] wrote: > Hi! > > Since two weeks im trying to remount a Sofraid (Level1) which I created > with > > mdadm -Cv /dev/md3 -l1 -2 /dev/sdc1 /dev/sdd1 > > -- > > > I'm having other softraids, wich are created by the installationsroutine > of sa

Re: accessing mirrired lvm on shared storage

2006-04-16 Thread Neil Brown
On Thursday April 13, [EMAIL PROTECTED] wrote: > On 4/12/06, Neil Brown <[EMAIL PROTECTED]> wrote: > > > One thing that is on my todo list is supporting shared raid1, so that > > several nodes in the cluster can assemble the same raid1 and access it > > - providing that the clients all do proper m

Re: RAID5 recovery trouble, bd_claim failed?

2006-04-16 Thread Neil Brown
On Saturday April 15, [EMAIL PROTECTED] wrote: > Hi All, > Recently I lost a disk in my raid5 SW array. It seems that it took a > second disk with it. The other disk appears to still be funtional (from > an fdisk perspective...). I am trying to get the array to work in > degraded mode via fai

Re: help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Neil Brown
On Sunday April 16, [EMAIL PROTECTED] wrote: > A system with 6 disks, it was UUUUUU a moment ago, after read errors > on a file now looks like: > > /proc/mdstat: > md1 : active raid5 sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[6](F) sda1[7](F) > level 5, 64k chunk, algorithm 2 [6/4] [__UUUU] > >

Re: Raid 4/5 small writes

2006-04-16 Thread Neil Brown
On Sunday April 16, [EMAIL PROTECTED] wrote: > Neil Brown wrote: > > > >If you are writing exactly half the data in a stripe, I think it takes > >the first option (Read the old unchanged data) as that is fewer reads > >than reading the old changed data and the parity block. > > > >Does that make se
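
[Illustrative note, not part of the archived message] The read-count comparison quoted above can be checked with simple arithmetic. The sketch below only counts reads per stripe under two assumed strategies and is not the md driver's actual logic; the function name `reads_needed` and the disk counts are hypothetical.

```python
def reads_needed(data_disks, blocks_written):
    """Return (reconstruct_reads, rmw_reads) for one stripe update."""
    # Option 1: read the old unchanged data blocks, then compute parity
    # from the new data plus those blocks.
    reconstruct_reads = data_disks - blocks_written
    # Option 2: read the old copies of the changed blocks plus the old parity.
    rmw_reads = blocks_written + 1
    return reconstruct_reads, rmw_reads

# A stripe with 4 data blocks (for example, a 5-disk RAID5).
for written in range(1, 5):
    print(written, reads_needed(4, written))
```

When exactly half the data blocks are written (2 of 4 here), the first option needs 2 reads and the second needs 3, which agrees with the quoted reasoning.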

help wanted - 6-disk raid5 borked: _ _ U U U U

2006-04-16 Thread Molle Bestefich
A system with 6 disks, it was UUUUUU a moment ago, after read errors on a file now looks like: /proc/mdstat: md1 : active raid5 sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[6](F) sda1[7](F) level 5, 64k chunk, algorithm 2 [6/4] [__UUUU] uname: linux 2.6.11-gentoo-r4 What's the recommended appr

Re: RHEL3 kernel panic with md

2006-04-16 Thread Bill Davidsen
Colin McDonald wrote: I appear to have a corrupt file system and now it is mirrored. LOL. I am running Redhat Enterprise 3 and using mdtools. I booted from the install media iso and went into rescue mode. RH was unable to find the partitions automatically but after exiting into bash i can run

Re: Softraid controllers and Linux

2006-04-16 Thread Bill Davidsen
Jim Klimov wrote: Hello linux-raid, I have tried several cheap RAID controllers recently (namely, VIA VT6421, Intel 6300ESB and Adaptec/Marvell 885X6081). VIA one is a PCI card, the second two are built in a Supermicro motherboard (E7520/X6DHT-G). The intent was to let the BIOS of the