Re: Fwd: RAID5 Recovery

2007-11-14 Thread David Greaves
Neil Cavan wrote:
> Thanks for taking a look, David.
No problem.

> Kernel:
> 2.6.15-27-k7, stock for Ubuntu 6.06 LTS
> 
> mdadm:
> mdadm - v1.12.0 - 14 June 2005
OK - fairly old then. Not really worth trying to figure out why hdc got re-added
when things had gone wrong.

> You're right, earlier in /var/log/messages there's a notice that hdg
> dropped, I missed it before. I use mdadm --monitor, but I recently
> changed the target email address - I guess it didn't take properly.
> 
> As for replacing hdc, thanks for the diagnosis but it won't help: the
> drive is actually fine, as is hdg. I've replaced hdc before, only to
> have the brand new hdc show the same behaviour, and SMART says the
> drive is A-OK. There's something flaky about these PCI IDE
> controllers. I think it's time for a new system.
Any excuse eh? :)


> Reiserfs recovery-wise: any suggestions? A simple fsck doesn't find a
> file system superblock. Is --rebuild-sb the way to go here?
No idea, sorry. I only ever tried Reiser once and it failed. It was very hard
to recover, so I swapped back to XFS.

Good luck on the fscking

David
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: RAID5 Recovery

2007-11-14 Thread Neil Cavan
Thanks for taking a look, David.

Kernel:
2.6.15-27-k7, stock for Ubuntu 6.06 LTS

mdadm:
mdadm - v1.12.0 - 14 June 2005

You're right, earlier in /var/log/messages there's a notice that hdg
dropped, I missed it before. I use mdadm --monitor, but I recently
changed the target email address - I guess it didn't take properly.
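I'll re-check the monitor once the dust settles. Something along these
lines should confirm mail actually gets through (untested here, and the
config file path is a guess for my setup):

  # check MAILADDR in /etc/mdadm/mdadm.conf, then have the monitor send
  # a test alert for each array and exit after one pass
  mdadm --monitor --scan --test --oneshot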

As for replacing hdc, thanks for the diagnosis but it won't help: the
drive is actually fine, as is hdg. I've replaced hdc before, only to
have the brand new hdc show the same behaviour, and SMART says the
drive is A-OK. There's something flaky about these PCI IDE
controllers. I think it's time for a new system.

Reiserfs recovery-wise: any suggestions? A simple fsck doesn't find a
file system superblock. Is --rebuild-sb the way to go here?
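In case it matters, what I had in mind was roughly this, read-only check
first (just a sketch - I haven't run anything that writes yet):

  # read-only pass to see what reiserfsck makes of the filesystem
  reiserfsck --check /dev/md0
  # only if it reports a bad superblock, try rebuilding it
  reiserfsck --rebuild-sb /dev/md0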

Thanks,
Neil


On Nov 14, 2007 5:58 AM, David Greaves <[EMAIL PROTECTED]> wrote:
> Neil Cavan wrote:
> > Hello,
> Hi Neil
>
> What kernel version?
> What mdadm version?
>
> > This morning, I woke up to find the array had kicked two disks. This
> > time, though, /proc/mdstat showed one of the failed disks (U_U_U, one
> > of the "_"s) had been marked as a spare - weird, since there are no
> > spare drives in this array. I rebooted, and the array came back in the
> > same state: one failed, one spare. I hot-removed and hot-added the
> > spare drive, which put the array back to where I thought it should be
> > ( still U_U_U, but with both "_"s marked as failed). Then I rebooted,
> > and the array began rebuilding on its own. Usually I have to hot-add
> > manually, so that struck me as a little odd, but I gave it no mind and
> > went to work. Without checking the contents of the filesystem. Which
> > turned out not to have been mounted on reboot.
> OK
>
> > Because apparently things went horribly wrong.
> Yep :(
>
> > Do I have any hope of recovering this data? Could rebuilding the
> > reiserfs superblock help if the rebuild managed to corrupt the
> > superblock but not the data?
> See below
>
>
>
> > Nov 13 02:01:03 localhost kernel: [17805772.424000] hdc: dma_intr:
> > status=0x51 { DriveReady SeekComplete Error }
> 
> > Nov 13 02:01:06 localhost kernel: [17805775.156000] lost page write
> > due to I/O error on md0
> hdc1 fails
>
>
> > Nov 13 02:01:06 localhost kernel: [17805775.196000] RAID5 conf printout:
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  --- rd:5 wd:3 fd:2
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 0, o:1, dev:hda1
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 1, o:0, dev:hdc1
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 2, o:1, dev:hde1
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 4, o:1, dev:hdi1
>
> hdg1 is already missing?
>
> > Nov 13 02:01:06 localhost kernel: [17805775.212000] RAID5 conf printout:
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  --- rd:5 wd:3 fd:2
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 0, o:1, dev:hda1
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 2, o:1, dev:hde1
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 4, o:1, dev:hdi1
>
> so now the array is bad.
>
> a reboot happens and:
> > Nov 13 07:21:07 localhost kernel: [17179584.712000] md: md0 stopped.
> > Nov 13 07:21:07 localhost kernel: [17179584.876000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: kicking
> > non-fresh hdg1 from array!
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: unbind
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: export_rdev(hdg1)
> > Nov 13 07:21:07 localhost kernel: [17179584.896000] raid5: allocated
> > 5245kB for md0
> ... apparently hdc1 is OK? Hmmm.
>
> > Nov 13 07:21:07 localhost kernel: [17179665.524000] ReiserFS: md0:
> > found reiserfs format "3.6" with standard journal
> > Nov 13 07:21:07 localhost kernel: [17179676.136000] ReiserFS: md0:
> > using ordered data mode
> > Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> > journal params: device md0, size 8192, journal first block 18, max
> > trans len 1024, max batch 900, max commit age 30, max trans age 30
> > Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> > checking transaction log (md0)
> > Nov 13 07:21:07 localhost kernel: [17179676.828000] ReiserFS: md0:
> > replayed 7 transactions in 1 seconds
> > Nov 13 07:21:07 localhost kernel: [17179677.012000] ReiserFS: md0:
> > Using r5 hash to sort names
> > Nov 13 07:21:09 localhost kernel: [17179682.064000] lost page write
> > due to I/O error on md0
> Reiser tries to mount/replay itself relying on hdc1 (which is partly bad)
>
> > Nov 13 07:25:39 localhost kernel: [17179584.828000] md: raid5
> > personality registered as nr 4
> > Nov 13 07:25:39 localh

Re: RAID5 Recovery

2007-11-14 Thread David Greaves
Neil Cavan wrote:
> Hello,
Hi Neil

What kernel version?
What mdadm version?

> This morning, I woke up to find the array had kicked two disks. This
> time, though, /proc/mdstat showed one of the failed disks (U_U_U, one
> of the "_"s) had been marked as a spare - weird, since there are no
> spare drives in this array. I rebooted, and the array came back in the
> same state: one failed, one spare. I hot-removed and hot-added the
> spare drive, which put the array back to where I thought it should be
> ( still U_U_U, but with both "_"s marked as failed). Then I rebooted,
> and the array began rebuilding on its own. Usually I have to hot-add
> manually, so that struck me as a little odd, but I gave it no mind and
> went to work. Without checking the contents of the filesystem. Which
> turned out not to have been mounted on reboot.
OK

> Because apparently things went horribly wrong.
Yep :(

> Do I have any hope of recovering this data? Could rebuilding the
> reiserfs superblock help if the rebuild managed to corrupt the
> superblock but not the data?
See below



> Nov 13 02:01:03 localhost kernel: [17805772.424000] hdc: dma_intr:
> status=0x51 { DriveReady SeekComplete Error }

> Nov 13 02:01:06 localhost kernel: [17805775.156000] lost page write
> due to I/O error on md0
hdc1 fails


> Nov 13 02:01:06 localhost kernel: [17805775.196000] RAID5 conf printout:
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  --- rd:5 wd:3 fd:2
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 0, o:1, dev:hda1
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 1, o:0, dev:hdc1
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 2, o:1, dev:hde1
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 4, o:1, dev:hdi1

hdg1 is already missing?

> Nov 13 02:01:06 localhost kernel: [17805775.212000] RAID5 conf printout:
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  --- rd:5 wd:3 fd:2
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 0, o:1, dev:hda1
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 2, o:1, dev:hde1
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 4, o:1, dev:hdi1

so now the array is bad.

a reboot happens and:
> Nov 13 07:21:07 localhost kernel: [17179584.712000] md: md0 stopped.
> Nov 13 07:21:07 localhost kernel: [17179584.876000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: kicking
> non-fresh hdg1 from array!
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: unbind
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: export_rdev(hdg1)
> Nov 13 07:21:07 localhost kernel: [17179584.896000] raid5: allocated
> 5245kB for md0
... apparently hdc1 is OK? Hmmm.

> Nov 13 07:21:07 localhost kernel: [17179665.524000] ReiserFS: md0:
> found reiserfs format "3.6" with standard journal
> Nov 13 07:21:07 localhost kernel: [17179676.136000] ReiserFS: md0:
> using ordered data mode
> Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> journal params: device md0, size 8192, journal first block 18, max
> trans len 1024, max batch 900, max commit age 30, max trans age 30
> Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> checking transaction log (md0)
> Nov 13 07:21:07 localhost kernel: [17179676.828000] ReiserFS: md0:
> replayed 7 transactions in 1 seconds
> Nov 13 07:21:07 localhost kernel: [17179677.012000] ReiserFS: md0:
> Using r5 hash to sort names
> Nov 13 07:21:09 localhost kernel: [17179682.064000] lost page write
> due to I/O error on md0
Reiser tries to mount/replay itself relying on hdc1 (which is partly bad)

> Nov 13 07:25:39 localhost kernel: [17179584.828000] md: raid5
> personality registered as nr 4
> Nov 13 07:25:39 localhost kernel: [17179585.708000] md: kicking
> non-fresh hdg1 from array!
Another reboot...

> Nov 13 07:25:40 localhost kernel: [17179666.064000] ReiserFS: md0:
> found reiserfs format "3.6" with standard journal
> Nov 13 07:25:40 localhost kernel: [17179676.904000] ReiserFS: md0:
> using ordered data mode
> Nov 13 07:25:40 localhost kernel: [17179676.928000] ReiserFS: md0:
> journal params: device md0, size 8192, journal first block 18, max
> trans len 1024, max batch 900, max commit age 30, max trans age 30
> Nov 13 07:25:40 localhost kernel: [17179676.932000] ReiserFS: md0:
> checking transaction log (md0)
> Nov 13 07:25:40 localhost kernel: [17179677.08] ReiserFS: md0:
> Using r5 hash to sort names
> Nov 13 07:25:42 localhost kernel: [17179683.128000] lost page write
> due to I/O error on md0
Reiser tries again...

> Nov 13 07:26:57 localhost kernel: [17179757.524000] md: unbind
> Nov 13 07:26:57 localhost kernel: [17179757.524000] md: export_rdev(hdc1)
> Nov 13 07:27:03 localhost kernel: [17

RAID5 Recovery

2007-11-13 Thread Neil Cavan
Hello,

I have a 5-disk RAID5 array that has gone belly-up. It consists of 2x
2 disks on Promise PCI controllers, and one on the mobo controller.

This array has been running for a couple years, and every so often
(randomly, sometimes every couple weeks sometimes no problem for
months) it will drop a drive. It's not a drive failure per se, it's
something controller-related since the failures tend to happen in
pairs and SMART gives the drives a clean bill of health. If it's only
one drive, I can hot-add with no problem. If it's 2 drives my heart
leaps into my mouth but I reboot, only one of the drives comes up as
failed, and I can hot-add with no problem. The 2-drive case has
happened a dozen times and my array is never any worse for the wear.

This morning, I woke up to find the array had kicked two disks. This
time, though, /proc/mdstat showed one of the failed disks (U_U_U, one
of the "_"s) had been marked as a spare - weird, since there are no
spare drives in this array. I rebooted, and the array came back in the
same state: one failed, one spare. I hot-removed and hot-added the
spare drive, which put the array back to where I thought it should be
( still U_U_U, but with both "_"s marked as failed). Then I rebooted,
and the array began rebuilding on its own. Usually I have to hot-add
manually, so that struck me as a little odd, but I gave it no mind and
went to work. Without checking the contents of the filesystem. Which
turned out not to have been mounted on reboot. Because apparently
things went horribly wrong.
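For reference, the manual hot-remove/hot-add I mentioned above is roughly
the following, with hdc1 standing in for whichever member got kicked:

  mdadm /dev/md0 --remove /dev/hdc1
  mdadm /dev/md0 --add /dev/hdc1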

The rebuild process ran its course. I now have an array that mdadm
insists is peachy:
---
md0 : active raid5 hda1[0] hdc1[1] hdi1[4] hdg1[3] hde1[2]
  468872704 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]

unused devices: <none>
---

But there is no filesystem on /dev/md0:

---
sudo mount -t reiserfs /dev/md0 /storage/
mount: wrong fs type, bad option, bad superblock on /dev/md0,
   missing codepage or other error
---

Do I have any hope of recovering this data? Could rebuilding the
reiserfs superblock help if the rebuild managed to corrupt the
superblock but not the data?

Any help is appreciated, below is the failure event in
/var/log/messages, followed by the output of cat /var/log/messages |
grep md.

Thanks,
Neil Cavan

Nov 13 02:01:03 localhost kernel: [17805772.424000] hdc: dma_intr:
status=0x51 { DriveReady SeekComplete Error }
Nov 13 02:01:03 localhost kernel: [17805772.424000] hdc: dma_intr:
error=0x40 { UncorrectableError }, LBAsect=11736, sector=11719
Nov 13 02:01:03 localhost kernel: [17805772.424000] ide: failed opcode
was: unknown
Nov 13 02:01:03 localhost kernel: [17805772.424000] end_request: I/O
error, dev hdc, sector 11719
Nov 13 02:01:03 localhost kernel: [17805772.424000] R5: read error not
correctable.
Nov 13 02:01:03 localhost kernel: [17805772.464000] lost page write
due to I/O error on md0
Nov 13 02:01:05 localhost kernel: [17805773.776000] hdc: dma_intr:
status=0x51 { DriveReady SeekComplete Error }
Nov 13 02:01:05 localhost kernel: [17805773.776000] hdc: dma_intr:
error=0x40 { UncorrectableError }, LBAsect=11736, sector=11727
Nov 13 02:01:05 localhost kernel: [17805773.776000] ide: failed opcode
was: unknown
Nov 13 02:01:05 localhost kernel: [17805773.776000] end_request: I/O
error, dev hdc, sector 11727
Nov 13 02:01:05 localhost kernel: [17805773.776000] R5: read error not
correctable.
Nov 13 02:01:05 localhost kernel: [17805773.776000] lost page write
due to I/O error on md0
Nov 13 02:01:06 localhost kernel: [17805775.156000] hdc: dma_intr:
status=0x51 { DriveReady SeekComplete Error }
Nov 13 02:01:06 localhost kernel: [17805775.156000] hdc: dma_intr:
error=0x40 { UncorrectableError }, LBAsect=11736, sector=11735
Nov 13 02:01:06 localhost kernel: [17805775.156000] ide: failed opcode
was: unknown
Nov 13 02:01:06 localhost kernel: [17805775.156000] end_request: I/O
error, dev hdc, sector 11735
Nov 13 02:01:06 localhost kernel: [17805775.156000] R5: read error not
correctable.
Nov 13 02:01:06 localhost kernel: [17805775.156000] lost page write
due to I/O error on md0
Nov 13 02:01:06 localhost kernel: [17805775.196000] RAID5 conf printout:
Nov 13 02:01:06 localhost kernel: [17805775.196000]  --- rd:5 wd:3 fd:2
Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 0, o:1, dev:hda1
Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 1, o:0, dev:hdc1
Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 2, o:1, dev:hde1
Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 4, o:1, dev:hdi1
Nov 13 02:01:06 localhost kernel: [17805775.212000] RAID5 c

Re: RAID5 Recovery

2006-10-22 Thread Neil Brown
On Sunday October 22, [EMAIL PROTECTED] wrote:
> The drives have not been repartitioned.
> 
> I think what happened is that I created a new raid5 array over the old
> one, but never synced or initialized it.

If you created an array - whether it synced or not - the superblock
would be written and --examine would have found them.  So there must
be something else that happened.  Hard to know what.

> 
> I'm leery of re-creating the array as you suggest, because I think
> re-creating an array "over top" of my existing array is what got me into
> trouble in the first place.
> 
> Also, from mdadm man page (using v 1.12.0):
> 
> --assume-clean
> Tell mdadm that the array pre-existed and is known to be clean.
> This is only really useful for Building RAID1  array.   Only
> use this if you really know what you are doing.  This is currently
> only supported for --build.
> 
> This suggests to me that I can only use this to build a legacy array
> without superblocks - which I don't want - and that since my array was
> RAID5, that it's not "really useful", whatever that means. Oh, and also,
> I don't really know what I'm doing. ;)

--assume-clean was extended to --create in mdadm-2.2.

> 
> If I do re-create the array to regenerate the superblocks, isn't it
> important that I know the exact parameters of the pre-existing array, to
> get the data to match up? chunk size, parity method, etc?

Yes, but I would assume you just used the defaults.  If not, you
presumably know why you changed the defaults and can do it again???

In any case, creating the array with --assume-clean does not modify
any data.  It only overwrites the superblocks.  As you currently don't
have any superblock, you have nothing to lose.
After you create the array you can try 'fsck' or other tools to see if
the data is intact.  If it is - good.  If not, stop the array and try
creating it with different parameters.
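Something like this, as a sketch only - it assumes the default chunk size
and layout, and that the devices go in the obvious hda1,hdc1,hde1,hdg1,hdi1
order; adjust if either assumption is wrong:

  mdadm --create /dev/md0 --level=5 --raid-devices=5 --assume-clean \
        /dev/hda1 /dev/hdc1 /dev/hde1 /dev/hdg1 /dev/hdi1
  fsck -n /dev/md0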

> 
> I just don't want to rush in and mess things up. Did that once
> already. ;)

Very sensible.  
Assuming the partitions really are the same as they were before (can't
hurt to triple-check) then I really think '--create --assume-clean' is
your best bet.  Maybe download and compile the latest mdadm
 http://www.kernel.org/pub/linux/utils/raid/mdadm/
to make sure you have a working --assume-clean.
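Roughly (the version number is just a placeholder - take whatever is
current on that page):

  tar xzf mdadm-X.Y.tgz
  cd mdadm-X.Y
  make
  ./mdadm --version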

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 Recovery

2006-10-22 Thread Neil Cavan
The drives have not been repartitioned.

I think what happened is that I created a new raid5 array over the old
one, but never synced or initialized it.

I'm leery of re-creating the array as you suggest, because I think
re-creating an array "over top" of my existing array is what got me into
trouble in the first place.

Also, from mdadm man page (using v 1.12.0):

--assume-clean
Tell mdadm that the array pre-existed and is known to be clean.
This is only really useful for Building RAID1  array.   Only
use this if you really know what you are doing.  This is currently
only supported for --build.

This suggests to me that I can only use this to build a legacy array
without superblocks - which I don't want - and that since my array was
RAID5, that it's not "really useful", whatever that means. Oh, and also,
I don't really know what I'm doing. ;)

If I do re-create the array to regenerate the superblocks, isn't it
important that I know the exact parameters of the pre-existing array, to
get the data to match up? chunk size, parity method, etc?

I just don't want to rush in and mess things up. Did that once
already. ;)

Thanks,
Neil

On Mon, 2006-23-10 at 11:29 +1000, Neil Brown wrote:
> On Saturday October 21, [EMAIL PROTECTED] wrote:
> > Hi,
> > 
> > I had a run-in with the Ubuntu Server installer, and in trying to get
> > the new system to recognize the clean 5-disk raid5 array left behind by
> > the previous Ubuntu system, I think I inadvertently instructed it to
> > create a new raid array using those same partitions.
> > 
> > What I know for sure is that now, I get this:
> > 
> > [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hda1
> > mdadm: No super block found on /dev/hda1 (Expected magic a92b4efc, got
> > )
> > [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdc1
> > mdadm: No super block found on /dev/hdc1 (Expected magic a92b4efc, got
> > )
> > [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hde1
> > mdadm: No super block found on /dev/hde1 (Expected magic a92b4efc, got
> > )
> > [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdg1
> > mdadm: No super block found on /dev/hdg1 (Expected magic a92b4efc, got
> > )
> > [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdi1
> > mdadm: No super block found on /dev/hdi1 (Expected magic a92b4efc, got
> > )
> > 
> > I didn't format the partitions or write any data to the disk, so I think
> > the array's data should be intact. Is there a way to recreate the
> > superblocks, or am I hosed?
> 
> Weird. Could the drives have been repartitioned in the process,
> with the partitions being slightly different sizes or at slightly
> different offsets?  That might explain the disappearing superblocks,
> and remaking the partitions might fix it.
> 
> Or you can just re-create the array.  Doing so won't destroy any data
> that happens to be there.
> To be on the safe side, create it with --assume-clean.  This will avoid
> a resync so you can be sure that no data blocks will be written at
> all.
> Then 'fsck -n' or mount readonly and see if your data is safe.
> Once you are happy that you have the data safe you can trigger the
> resync with
>mdadm --assemble --update=resync .
> or 
>echo resync > /sys/block/md0/md/sync_action
> 
> (assuming it is 'md0').
> 
> Good luck.
> 
> NeilBrown
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 Recovery

2006-10-22 Thread Neil Brown
On Saturday October 21, [EMAIL PROTECTED] wrote:
> Hi,
> 
> I had a run-in with the Ubuntu Server installer, and in trying to get
> the new system to recognize the clean 5-disk raid5 array left behind by
> the previous Ubuntu system, I think I inadvertently instructed it to
> create a new raid array using those same partitions.
> 
> What I know for sure is that now, I get this:
> 
> [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hda1
> mdadm: No super block found on /dev/hda1 (Expected magic a92b4efc, got
> )
> [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdc1
> mdadm: No super block found on /dev/hdc1 (Expected magic a92b4efc, got
> )
> [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hde1
> mdadm: No super block found on /dev/hde1 (Expected magic a92b4efc, got
> )
> [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdg1
> mdadm: No super block found on /dev/hdg1 (Expected magic a92b4efc, got
> )
> [EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdi1
> mdadm: No super block found on /dev/hdi1 (Expected magic a92b4efc, got
> )
> 
> I didn't format the partitions or write any data to the disk, so I think
> the array's data should be intact. Is there a way to recreate the
> superblocks, or am I hosed?

Weird. Could the drives have been repartitioned in the process,
with the partitions being slightly different sizes or at slightly
different offsets?  That might explain the disappearing superblocks,
and remaking the partitions might fix it.

Or you can just re-create the array.  Doing so won't destroy any data
that happens to be there.
To be on the safe side, create it with --assume-clean.  This will avoid
a resync so you can be sure that no data blocks will be written at
all.
Then 'fsck -n' or mount readonly and see if your data is safe.
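For example (the mount point is arbitrary; a read-only mount is just a
quick sanity check):

  mount -o ro /dev/md0 /mnt
  ls /mnt
  umount /mnt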
Once you are happy that you have the data safe you can trigger the
resync with
   mdadm --assemble --update=resync .
or 
   echo resync > /sys/block/md0/md/sync_action

(assuming it is 'md0').

Good luck.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RAID5 Recovery

2006-10-21 Thread Neil Cavan
Hi,

I had a run-in with the Ubuntu Server installer, and in trying to get
the new system to recognize the clean 5-disk raid5 array left behind by
the previous Ubuntu system, I think I inadvertently instructed it to
create a new raid array using those same partitions.

What I know for sure is that now, I get this:

[EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hda1
mdadm: No super block found on /dev/hda1 (Expected magic a92b4efc, got
)
[EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdc1
mdadm: No super block found on /dev/hdc1 (Expected magic a92b4efc, got
)
[EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hde1
mdadm: No super block found on /dev/hde1 (Expected magic a92b4efc, got
)
[EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdg1
mdadm: No super block found on /dev/hdg1 (Expected magic a92b4efc, got
)
[EMAIL PROTECTED]:~$ sudo mdadm --examine /dev/hdi1
mdadm: No super block found on /dev/hdi1 (Expected magic a92b4efc, got
)

I didn't format the partitions or write any data to the disk, so I think
the array's data should be intact. Is there a way to recreate the
superblocks, or am I hosed?

Thanks,
Neil

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-19 Thread Nate Byrnes

Hello,
   I replaced the failed disk. The configuration is /dev/hde, /dev/hdf 
(replaced), on IDE channel 0, /dev/hdg, /dev/hdh on IDE channel 1, on a 
single PCI controller card. The issue here is that hde is now also not
accessible after the failure of hdf.  I cannot see the jumper configs as 
the server is at home, and I am at work. The general thinking was that 
the hde superblock got hosed with the loss of hdf.


My initial post only discussed the disk ordering and device names. As
I had replaced the disk which had failed (in a previously fully 
functioning array), with a new disk with exactly the same configuration 
(jumpers, cable locations, etc), and each of the disks could be 
accessed, my thinking was that there would not be a hardware problem to 
sort through. Is this logic flawed?

   Thanks again,
   Nate

Maurice Hilarius wrote:

Nate Byrnes wrote:
  

Hi All,
   I'm not sure that is entirely the case. From a hardware
perspective, I can access all the disks from the OS, via fdisk and dd.
It is really just mdadm that is failing.  Would I still need to work
the jumper issue?
   Thanks,
   Nate



IF the disks are as we suspect (master and slave relationships) and IF
you now have either a failed or a removed drive, then you  MUST correct
the jumpering.
Sure, you can often see a disk that is misconfigured.
It is almost certain, however, that when you write to it you will simply
cause corruption on it.

Of course, so far this is all speculation, as you have not actually said
what the disks, controller interfaces, and jumpering and so forth are at.
I was merely speculating, based on what you have said.

No amount of software magic will "cure" a hardware problem..


  

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-19 Thread Maurice Hilarius
Nate Byrnes wrote:
> Hi All,
>I'm not sure that is entirely the case. From a hardware
> perspective, I can access all the disks from the OS, via fdisk and dd.
> It is really just mdadm that is failing.  Would I still need to work
> the jumper issue?
>Thanks,
>Nate
>
IF the disks are as we suspect (master and slave relationships) and IF
you now have either a failed or a removed drive, then you  MUST correct
the jumpering.
Sure, you can often see a disk that is misconfigured.
It is almost certain, however, that when you write to it you will simply
cause corruption on it.

Of course, so far this is all speculation, as you have not actually said
what the disks, controller interfaces, and jumpering and so forth are at.
I was merely speculating, based on what you have said.

No amount of software magic will "cure" a hardware problem..


-- 

With our best regards,


Maurice W. HilariusTelephone: 01-780-456-9771
Hard Data Ltd.  FAX:   01-780-456-9772
11060 - 166 Avenue email:[EMAIL PROTECTED]
Edmonton, AB, Canada   http://www.harddata.com/
   T5X 1Y3

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-19 Thread Nate Byrnes

Hi All,
   I'm not sure that is entirely the case. From a hardware perspective, 
I can access all the disks from the OS, via fdisk and dd. It is really 
just mdadm that is failing.  Would I still need to work the jumper issue?

   Thanks,
   Nate

Maurice Hilarius wrote:

Nathanial Byrnes wrote:
  

Yes, I did not have the funding nor approval to purchase more hardware
when I set it up (read wife). Once it was working... the rest is
history.

  



OK, so if you have a pair of IDE disks, jumpered as Master and slave,
and if one fails:

If Master failed, re-jumper remaining disk on pair on same cable as
Master, no slave present

If Slave failed, re-jumper remaining disk on pair on same cable as
Master, no slave present.

Then you will have the remaining disk working normally, at least.

When you can afford it I suggest buying a controller with enough ports
to support the number of drives you have, with no Master/Slave pairing.

Good luck !

And to the  software guys trying to help: We need to start with the
(obvious) hardware problem, before we advise on how to recover data from
a borked system..
Once he has the jumpering on the drives sorted out, the drive that went
missing will be back again..


  

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-19 Thread Maurice Hilarius
Nathanial Byrnes wrote:
> Yes, I did not have the funding nor approval to purchase more hardware
> when I set it up (read wife). Once it was working... the rest is
> history.
>
>   

OK, so if you have a pair of IDE disks, jumpered as Master and slave,
and if one fails:

If Master failed, re-jumper remaining disk on pair on same cable as
Master, no slave present

If Slave failed, re-jumper remaining disk on pair on same cable as
Master, no slave present.

Then you will have the remaining disk working normally, at least.

When you can afford it I suggest buying a controller with enough ports
to support the number of drives you have, with no Master/Slave pairing.

Good luck !

And to the  software guys trying to help: We need to start with the
(obvious) hardware problem, before we advise on how to recover data from
a borked system..
Once he has the jumpering on the drives sorted out, the drive that went
missing will be back again..


-- 

Regards,
Maurice

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-18 Thread Nathanial Byrnes
Yes, I did not have the funding nor approval to purchase more hardware
when I set it up (read wife). Once it was working... the rest is
history.

On Tue, 2006-04-18 at 16:13 -0600, Maurice Hilarius wrote:
> Nathanial Byrnes wrote:
> > Hi All,
> > Recently I lost a disk in my raid5 SW array. It seems that it took a
> > second disk with it. The other disk appears to still be functional (from
> > an fdisk perspective...). I am trying to get the array to work in
> > degraded mode via failed-disk in raidtab, but am always getting the
> > following error:
> >
> >   
> Let me guess:
> IDE disks, in pairs.
> Jumpered as Master and Slave.
> 
> Right?
> 
> 
> 
> 
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-18 Thread Maurice Hilarius
Nathanial Byrnes wrote:
> Hi All,
>   Recently I lost a disk in my raid5 SW array. It seems that it took a
> second disk with it. The other disk appears to still be functional (from
> an fdisk perspective...). I am trying to get the array to work in
> degraded mode via failed-disk in raidtab, but am always getting the
> following error:
>
>   
Let me guess:
IDE disks, in pairs.
Jumpered as Master and Slave.

Right?





-- 

With our best regards,


Maurice W. HilariusTelephone: 01-780-456-9771
Hard Data Ltd.  FAX:   01-780-456-9772
11060 - 166 Avenue email:[EMAIL PROTECTED]
Edmonton, AB, Canada   http://www.harddata.com/
   T5X 1Y3

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-18 Thread Nathanial Byrnes
2.4.1 behaves just like 2.1. So far, nothing in the syslog or messages.

On Tue, 2006-04-18 at 10:24 +1000, Neil Brown wrote:
> On Monday April 17, [EMAIL PROTECTED] wrote:
> > Unfortunately nothing changed. 
> 
> Weird... so hdf still reports as 'busy'?
> Is it mentioned anywhere in /var/log/messages since reboot?
> 
> What version of mdadm are you using?  Try 2.4.1 and see if that works
> differently.
> 
> NeilBrown
> 
> > 
> > 
> > On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> > > On Monday April 17, [EMAIL PROTECTED] wrote:
> > > > Hi Neil, List,
> > > > Am I just out of luck? Perhaps a full reboot? Something else?
> > > > Thanks,
> > > > Nate
> > > 
> > > Reboot and try again seems like the best bet at this stage.
> > > 
> > > NeilBrown
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > > the body of a message to [EMAIL PROTECTED]
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > 
> > > 
> > > 
> > 
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to [EMAIL PROTECTED]
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-17 Thread Neil Brown
On Monday April 17, [EMAIL PROTECTED] wrote:
> Unfortunately nothing changed. 

Weird... so hdf still reports as 'busy'?
Is it mentioned anywhere in /var/log/messages since reboot?

What version of mdadm are you using?  Try 2.4.1 and see if that works
differently.
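Something like this would show both (the log path may differ on your
distro):

  mdadm --version
  grep -i hdf /var/log/messages | tail -20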

NeilBrown

> 
> 
> On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> > On Monday April 17, [EMAIL PROTECTED] wrote:
> > > Hi Neil, List,
> > > Am I just out of luck? Perhaps a full reboot? Something else?
> > > Thanks,
> > > Nate
> > 
> > Reboot and try again seems like the best bet at this stage.
> > 
> > NeilBrown
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to [EMAIL PROTECTED]
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-17 Thread Nathanial Byrnes
Unfortunately nothing changed. 


On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> On Monday April 17, [EMAIL PROTECTED] wrote:
> > Hi Neil, List,
> > Am I just out of luck? Perhaps a full reboot? Something else?
> > Thanks,
> > Nate
> 
> Reboot and try again seems like the best bet at this stage.
> 
> NeilBrown
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-17 Thread Neil Brown
On Monday April 17, [EMAIL PROTECTED] wrote:
> Hi Neil, List,
> Am I just out of luck? Perhaps a full reboot? Something else?
> Thanks,
> Nate

Reboot and try again seems like the best bet at this stage.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-17 Thread Nate Byrnes

Hi Neil, List,
   Am I just out of luck? Perhaps a full reboot? Something else?
   Thanks,
   Nate

Nate Byrnes wrote:

Hi Neil,
   Nothing references hdf as you can see below.  I have also rmmod'ed 
md and raid5 modules and modprobed them back in. Thoughts?


   Thanks again,
   Nate

[EMAIL PROTECTED]:~# cat /proc/swaps
Filename                Type            Size    Used    Priority
/dev/sdb2               partition       1050616 1028    -1


[EMAIL PROTECTED]:~# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
proc /proc proc rw,nodiratime 0 0
sysfs /sys sysfs rw 0 0
none /dev ramfs rw 0 0
/dev/sdb1 /usr ext3 rw 0 0
devpts /dev/pts devpts rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0
usbfs /proc/bus/usb usbfs rw 0 0

[EMAIL PROTECTED]:~# cat /proc/mdstat
Personalities : [raid5]
md0 : inactive hdh[2] hdg[3] hde[1]
 234451968 blocks

unused devices: <none>


Neil Brown wrote:

On Monday April 17, [EMAIL PROTECTED] wrote:
 

Why is /dev/hdf busy? Is it in use? mounted? something?

  

Not that I am aware of. Here is the mount output:

[EMAIL PROTECTED]:/etc# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
/dev/sdb1 on /usr type ext3 (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
nfsd on /proc/fs/nfsd type nfsd (rw)
usbfs on /proc/bus/usb type usbfs (rw)

lsof | grep hdf does not return any results.

is there some other way to find out?



 cat /proc/swaps
 cat /proc/mounts
 cat /proc/mdstat

as well as 'lsof' should find it.

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html



  



-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-17 Thread Nate Byrnes

Hi Neil,
   Nothing references hdf as you can see below.  I have also rmmod'ed 
md and raid5 modules and modprobed them back in. Thoughts?


   Thanks again,
   Nate

[EMAIL PROTECTED]:~# cat /proc/swaps
Filename                Type            Size    Used    Priority
/dev/sdb2               partition       1050616 1028    -1

[EMAIL PROTECTED]:~# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
proc /proc proc rw,nodiratime 0 0
sysfs /sys sysfs rw 0 0
none /dev ramfs rw 0 0
/dev/sdb1 /usr ext3 rw 0 0
devpts /dev/pts devpts rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0
usbfs /proc/bus/usb usbfs rw 0 0

[EMAIL PROTECTED]:~# cat /proc/mdstat
Personalities : [raid5]
md0 : inactive hdh[2] hdg[3] hde[1]
 234451968 blocks

unused devices: <none>


Neil Brown wrote:

On Monday April 17, [EMAIL PROTECTED] wrote:
  

Why is /dev/hdf busy? Is it in use? mounted? something?

  

Not that I am aware of. Here is the mount output:

[EMAIL PROTECTED]:/etc# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
/dev/sdb1 on /usr type ext3 (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
nfsd on /proc/fs/nfsd type nfsd (rw)
usbfs on /proc/bus/usb type usbfs (rw)

lsof | grep hdf does not return any results.

is there some other way to find out?



 cat /proc/swaps
 cat /proc/mounts
 cat /proc/mdstat

as well as 'lsof' should find it.

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-17 Thread Neil Brown
On Monday April 17, [EMAIL PROTECTED] wrote:
> > 
> > Why is /dev/hdf busy? Is it in use? mounted? something?
> > 
> Not that I am aware of. Here is the mount output:
> 
> [EMAIL PROTECTED]:/etc# mount
> /dev/sda1 on / type ext3 (rw)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> /dev/sdb1 on /usr type ext3 (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> nfsd on /proc/fs/nfsd type nfsd (rw)
> usbfs on /proc/bus/usb type usbfs (rw)
> 
> lsof | grep hdf does not return any results.
> 
> is there some other way to find out?

 cat /proc/swaps
 cat /proc/mounts
 cat /proc/mdstat

as well as 'lsof' should find it.
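If it is installed, fuser can also show whether any process has the
device open:

  fuser -v /dev/hdf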

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-17 Thread Nathanial Byrnes
Please see below.

On Mon, 2006-04-17 at 13:04 +1000, Neil Brown wrote:
> On Sunday April 16, [EMAIL PROTECTED] wrote:
> > Hi Neil,
> > Thanks for your reply. I tried that, but here is the error I
> > received:
> > 
> > [EMAIL PROTECTED]:/etc# mdadm --assemble /dev/md0
> > --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
> > mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
> > mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
> > start the array.
> 
> Why is /dev/hdf busy? Is it in use? mounted? something?
> 
Not that I am aware of. Here is the mount output:

[EMAIL PROTECTED]:/etc# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
/dev/sdb1 on /usr type ext3 (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
nfsd on /proc/fs/nfsd type nfsd (rw)
usbfs on /proc/bus/usb type usbfs (rw)

lsof | grep hdf does not return any results.

is there some other way to find out?
> > 
> > The output from lsraid against each device is as follows (I think that I
> > messed up my superblocks pretty well...): 
> 
> Sorry, but I don't use lsraid and cannot tell anything useful from its
> output.
ok
> 
> NeilBrown
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-16 Thread Neil Brown
On Sunday April 16, [EMAIL PROTECTED] wrote:
> Hi Neil,
>   Thanks for your reply. I tried that, but here is the error I
> received:
> 
> [EMAIL PROTECTED]:/etc# mdadm --assemble /dev/md0
> --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
> mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
> mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
> start the array.

Why is /dev/hdf busy? Is it in use? mounted? something?

> 
> The output from lsraid against each device is as follows (I think that I
> messed up my superblocks pretty well...): 

Sorry, but I don't use lsraid and cannot tell anything useful from its
output.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-16 Thread Nathanial Byrnes
Hi Neil,
Thanks for your reply. I tried that, but here is the error I
received:

[EMAIL PROTECTED]:/etc# mdadm --assemble /dev/md0
--uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
start the array.

The output from lsraid against each device is as follows (I think that I
messed up my superblocks pretty well...): 

[EMAIL PROTECTED]:/etc# lsraid -d /dev/hde
[dev   9,   0] /dev/md/0    38081921.59A998F9.64C1A001.EC534EF2  offline
[dev   ?,   ?] (unknown)    ...                                  missing
[dev   ?,   ?] (unknown)    ...                                  missing
[dev  34,  64] /dev/hdh     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  34,   0] /dev/hdg     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  33,  64] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown
[dev  33,   0] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown

[dev  33,   0] /dev/hde     38081921.59A998F9.64C1A001.EC534EF2  unbound
[EMAIL PROTECTED]:/etc# lsraid -d /dev/hdf
[dev   9,   0] /dev/md/0    38081921.59A998F9.64C1A001.EC534EF2  offline
[dev   ?,   ?] (unknown)    ...                                  missing
[dev   ?,   ?] (unknown)    ...                                  missing
[dev  34,  64] /dev/hdh     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  34,   0] /dev/hdg     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  33,  64] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown
[dev  33,   0] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown

[dev  33,  64] /dev/hdf     38081921.59A998F9.64C1A001.EC534EF2  unbound
[EMAIL PROTECTED]:/etc# lsraid -d /dev/hdg
[dev   9,   0] /dev/md/0    38081921.59A998F9.64C1A001.EC534EF2  offline
[dev   ?,   ?] (unknown)    ...                                  missing
[dev   ?,   ?] (unknown)    ...                                  missing
[dev  34,  64] /dev/hdh     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  34,   0] /dev/hdg     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  33,  64] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown
[dev  33,   0] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown

[EMAIL PROTECTED]:/etc# lsraid -d /dev/hdh
[dev   9,   0] /dev/md/0    38081921.59A998F9.64C1A001.EC534EF2  offline
[dev   ?,   ?] (unknown)    ...                                  missing
[dev   ?,   ?] (unknown)    ...                                  missing
[dev  34,  64] /dev/hdh     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  34,   0] /dev/hdg     38081921.59A998F9.64C1A001.EC534EF2  good
[dev  33,  64] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown
[dev  33,   0] (unknown)    38081921.59A998F9.64C1A001.EC534EF2  unknown


Thanks again,
Nate

On Mon, 2006-04-17 at 08:46 +1000, Neil Brown wrote:
> On Saturday April 15, [EMAIL PROTECTED] wrote:
> > Hi All,
> > Recently I lost a disk in my raid5 SW array. It seems that it took a
> > second disk with it. The other disk appears to still be functional (from
> > an fdisk perspective...). I am trying to get the array to work in
> > degraded mode via failed-disk in raidtab, but am always getting the
> > following error:
> > 
> > md: could not bd_claim hde.
> > md: autostart failed!
> > 
> > When I try to raidstart the array. Is it the case that I had been running
> > in degraded mode before the disk failure, and then lost the other disk?
> > if so, how can I tell. 
> 
> raidstart is deprecated.  It doesn't work reliably.  Don't use it.
> 
> > 
> > I have been messing about with mkraid -R and I have tried to
> > add /dev/hdf (a new disk) back to the array. However, I am fairly
> > confident that I have not kicked off the recovery process, so I am
> > imagining that once I get the superblocks in order, I should be able to
> > recover to the new disk?
> > 
> > My system and raid config are:
> > Kernel 2.6.13.1
> > Slack 10.2
> > RAID 5 which originally looked like:
> > /dev/hde
> > /dev/hdg
> > /dev/hdi
> > /dev/hdk
> > 
> > but when I moved the disks to another box with fewer IDE controllers
> > /dev/hde
> > /dev/hdf
> > /dev/hdg
> > /dev/hdh
> > 
> > How should I approach this?
> 
> mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd*
> 
> If that doesn't work, add "--force" but be cautious of the data - do
> an fsck at least.
> 
> NeilBrown
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RAID5 recovery trouble, bd_claim failed?

2006-04-16 Thread Neil Brown
On Saturday April 15, [EMAIL PROTECTED] wrote:
> Hi All,
>   Recently I lost a disk in my raid5 SW array. It seems that it took a
> second disk with it. The other disk appears to still be functional (from
> an fdisk perspective...). I am trying to get the array to work in
> degraded mode via failed-disk in raidtab, but am always getting the
> following error:
> 
> md: could not bd_claim hde.
> md: autostart failed!
> 
> When I try to raidstart the array. Is it the case that I had been running
> in degraded mode before the disk failure, and then lost the other disk?
> if so, how can I tell. 

raidstart is deprecated.  It doesn't work reliably.  Don't use it.

> 
> I have been messing about with mkraid -R and I have tried to
> add /dev/hdf (a new disk) back to the array. However, I am fairly
> confident that I have not kicked off the recovery process, so I am
> imagining that once I get the superblocks in order, I should be able to
> recover to the new disk?
> 
> My system and raid config are:
> Kernel 2.6.13.1
> Slack 10.2
> RAID 5 which originally looked like:
> /dev/hde
> /dev/hdg
> /dev/hdi
> /dev/hdk
> 
> but when I moved the disks to another box with fewer IDE controllers
> /dev/hde
> /dev/hdf
> /dev/hdg
> /dev/hdh
> 
> How should I approach this?

mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd*

If that doesn't work, add "--force" but be cautious of the data - do
an fsck at least.
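That is, something like this, followed by a read-only check before you
trust anything on it:

  mdadm --assemble /dev/md0 --force \
        --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd*
  fsck -n /dev/md0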

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RAID5 recovery trouble, bd_claim failed?

2006-04-15 Thread Nathanial Byrnes
Hi All,
Recently I lost a disk in my raid5 SW array. It seems that it took a
second disk with it. The other disk appears to still be functional (from
an fdisk perspective...). I am trying to get the array to work in
degraded mode via failed-disk in raidtab, but am always getting the
following error:

md: could not bd_claim hde.
md: autostart failed!

When I try to raidstart the array. Is it the case that I had been running
in degraded mode before the disk failure, and then lost the other disk?
if so, how can I tell. 

I have been messing about with mkraid -R and I have tried to
add /dev/hdf (a new disk) back to the array. However, I am fairly
confident that I have not kicked off the recovery process, so I am
imagining that once I get the superblocks in order, I should be able to
recover to the new disk?

My system and raid config are:
Kernel 2.6.13.1
Slack 10.2
RAID 5 which originally looked like:
/dev/hde
/dev/hdg
/dev/hdi
/dev/hdk

but when I moved the disks to another box with fewer IDE controllers
/dev/hde
/dev/hdf
/dev/hdg
/dev/hdh

How should I approach this?

Below is the output of mdadm --examine /dev/hd*

Thanks in advance,
Nate

/dev/hde:
  Magic : a92b4efc
Version : 00.90.00
   UUID : 38081921:59a998f9:64c1a001:ec534ef2
  Creation Time : Fri Aug 22 16:34:37 2003
 Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

Update Time : Wed Apr 12 02:26:37 2006
  State : active
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
   Checksum : 165c1b4c - correct
 Events : 0.37523832

 Layout : left-symmetric
 Chunk Size : 128K

      Number   Major   Minor   RaidDevice State
this     1       33        0        1      active sync   /dev/hde

   0     0        0        0        0      removed
   1     1       33        0        1      active sync   /dev/hde
   2     2       34       64        2      active sync   /dev/hdh
   3     3       34        0        3      active sync   /dev/hdg

/dev/hdf:
  Magic : a92b4efc
Version : 00.90.00
   UUID : 38081921:59a998f9:64c1a001:ec534ef2
  Creation Time : Fri Aug 22 16:34:37 2003
 Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

Update Time : Wed Apr 12 02:26:37 2006
  State : active
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
   Checksum : 165c1bc5 - correct
 Events : 0.37523832

 Layout : left-symmetric
 Chunk Size : 128K

      Number   Major   Minor   RaidDevice State
this     3       33       64       -1      sync   /dev/hdf

   0     0        0        0        0      removed
   1     1       33        0        1      active sync   /dev/hde
   2     2       34       64        2      active sync   /dev/hdh
   3     3       33       64       -1      sync   /dev/hdf
/dev/hdg:
  Magic : a92b4efc
Version : 00.90.00
   UUID : 38081921:59a998f9:64c1a001:ec534ef2
  Creation Time : Fri Aug 22 16:34:37 2003
 Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

Update Time : Wed Apr 12 06:12:58 2006
  State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 3
  Spare Devices : 0
   Checksum : 1898e1fd - correct
 Events : 0.37523844

 Layout : left-symmetric
 Chunk Size : 128K

      Number   Major   Minor   RaidDevice State
this     3       34        0        3      active sync   /dev/hdg

   0     0        0        0        0      removed
   1     1        0        0        1      faulty removed
   2     2       34       64        2      active sync   /dev/hdh
   3     3       34        0        3      active sync   /dev/hdg
/dev/hdh:
  Magic : a92b4efc
Version : 00.90.00
   UUID : 38081921:59a998f9:64c1a001:ec534ef2
  Creation Time : Fri Aug 22 16:34:37 2003
 Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

Update Time : Wed Apr 12 06:12:58 2006
  State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 3
  Spare Devices : 0
   Checksum : 1898e23b - correct
 Events : 0.37523844

 Layout : left-symmetric
 Chunk Size : 128K

      Number   Major   Minor   RaidDevice State
this     2       34       64        2      active sync   /dev/hdh

   0     0        0        0        0      removed
   1     1        0        0        1      faulty removed
   2     2       34       64        2      active sync   /dev/hdh
   3     3       34        0        3      active sync   /dev/hdg


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Help needed - RAID5 recovery from Power-fail - SOLVED

2006-04-05 Thread Nigel J. Terry
Thanks for all the help. I am now up and running again and have been
stable for over a day. I will now install my new drive and add it to
give me an array of three drives.

I'll also learn more about Raid, mdadm and smartd so that I am better
prepared next time.

Thanks again

Nigel
Neil Brown wrote:
> On Monday April 3, [EMAIL PROTECTED] wrote:
>   
>> I wonder if you could help a Raid Newbie with a problem
>>
>> I had a power fail, and now I can't access my RAID array. It has been
>> working fine for months until I lost power... Being a fool, I don't have
>> a full backup, so I really need to get this data back.
>>
>> I run FC4 (64bit).
>> I have an array of two disks /dev/sda1 and /dev/sdb1 as a raid5 array
>> /dev/md0 on top of which I run lvm and mount the whole lot as /home. My
>> intention was always to add another disk to this array, and I purchased
>> one yesterday.
>> 
>
> 2 devices in a raid5??  Doesn't seem a lot of point it being raid5
> rather than raid1.
>
>   
>> When I boot, I get:
>>
>> md0 is not clean
>> Cannot start dirty degraded array
>> failed to run raid set md0
>> 
>
> This tells use that the array is degraded.  A dirty degraded array can
> have undetectable data corruption.  That is why it won't start it for
> you.
> However with only two devices, data corruption from this cause isn't
> actually possible. 
>
> The kernel parameter
>    md_mod.start_dirty_degraded=1
> will bypass this message and start the array anyway.
>
> Alternately:
>   mdadm -A --force /dev/md0 /dev/sd[ab]1
>
>   
>> # mdadm --examine /dev/sda1
>> /dev/sda1:
>>   Magic : a92b4efc
>> Version : 00.90.02
>>UUID : c57d50aa:1b3bcabd:ab04d342:6049b3f1
>>   Creation Time : Thu Dec 15 15:29:36 2005
>>  Raid Level : raid5
>>Raid Devices : 2
>>   Total Devices : 2
>> Preferred Minor : 0
>>
>> Update Time : Tue Mar 21 06:25:52 2006
>>   State : active
>>  Active Devices : 1
>> 
>
> So at 06:25:52, there was only one working device, while...
>
>
>   
>> #mdadm --examine /dev/sdb1
>> /dev/sdb1:
>>   Magic : a92b4efc
>> Version : 00.90.02
>>UUID : c57d50aa:1b3bcabd:ab04d342:6049b3f1
>>   Creation Time : Thu Dec 15 15:29:36 2005
>>  Raid Level : raid5
>>Raid Devices : 2
>>   Total Devices : 2
>> Preferred Minor : 0
>>
>> Update Time : Tue Mar 21 06:23:57 2006
>>   State : active
>>  Active Devices : 2
>> 
>
> at 06:23:57 there were two.
>
> It looks like you lost a drive a while ago. Did you notice?
>
> Anyway, the 'mdadm' command I gave above should get the array working
> again for you.  Then you might want to
>    mdadm /dev/md0 -a /dev/sdb1
> if you trust /dev/sdb
>
> NeilBrown
>
>
>   


Re: Help needed - RAID5 recovery from Power-fail

2006-04-04 Thread Al Boldi
Neil Brown wrote:
> 2 devices in a raid5??  Doesn't seem a lot of point in it being raid5
> rather than raid1.

Wouldn't a 2-dev raid5 imply a striped block mirror (i.e. faster) rather than
a raid1 duplicate block mirror (i.e. slower)?

Thanks!

--
Al



Re: Help needed - RAID5 recovery from Power-fail

2006-04-03 Thread David Greaves
Neil Brown wrote:

>On Monday April 3, [EMAIL PROTECTED] wrote:
>  
>
>>I wonder if you could help a Raid Newbie with a problem
>>
>>


>It looks like you lost a drive a while ago. Did you notice?
>
This is not unusual - raid just keeps on going if a disk fails.
When things are working again you really should read up on "mdadm -F" -
it runs as a daemon and sends you mail if any raid events occur.

See if FC4 has a script that automatically runs it - you may need to
tweak some config parameters somewhere (I use Debian so I'm not much help).
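
If you end up starting it by hand, a minimal sketch (the address and delay
are only examples; --scan picks the arrays and mail address up from
/etc/mdadm.conf if it has them):

   mdadm --monitor --scan --daemonise --delay=300 --mail=[EMAIL PROTECTED]

Adding --test makes it send one message per array straight away, which is a
handy way to check the mail actually gets through.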

David



Re: Help needed - RAID5 recovery from Power-fail

2006-04-03 Thread Neil Brown
On Monday April 3, [EMAIL PROTECTED] wrote:
> I wonder if you could help a Raid Newbie with a problem
> 
> I had a power fail, and now I can't access my RAID array. It has been
> working fine for months until I lost power... Being a fool, I don't have
> a full backup, so I really need to get this data back.
> 
> I run FC4 (64bit).
> I have an array of two disks /dev/sda1 and /dev/sdb1 as a raid5 array
> /dev/md0 on top of which I run lvm and mount the whole lot as /home. My
> intention was always to add another disk to this array, and I purchased
> one yesterday.

2 devices in a raid5??  Doesn't seem a lot of point in it being raid5
rather than raid1.

> 
> When I boot, I get:
> 
> md0 is not clean
> Cannot start dirty degraded array
> failed to run raid set md0

This tells us that the array is degraded.  A dirty degraded array can
have undetectable data corruption.  That is why it won't start it for
you.
However with only two devices, data corruption from this cause isn't
actually possible. 

The kernel parameter
   md_mod.start_dirty_degraded=1
will bypass this message and start the array anyway.

Alternately:
  mdadm -A --force /dev/md0 /dev/sd[ab]1
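
A minimal sketch of what that looks like end to end (same device names as in
your report; worth checking the result before mounting anything):

   mdadm --stop /dev/md0                      # clear out any half-assembled array (harmless if nothing is running)
   mdadm -A --force /dev/md0 /dev/sd[ab]1     # assemble despite the dirty/degraded state
   cat /proc/mdstat                           # should show md0 running, degraded
   mdadm --detail /dev/md0                    # sanity-check the device states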

> 
> # mdadm --examine /dev/sda1
> /dev/sda1:
>   Magic : a92b4efc
> Version : 00.90.02
>UUID : c57d50aa:1b3bcabd:ab04d342:6049b3f1
>   Creation Time : Thu Dec 15 15:29:36 2005
>  Raid Level : raid5
>Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
> 
> Update Time : Tue Mar 21 06:25:52 2006
>   State : active
>  Active Devices : 1

So at 06:25:52, there was only one working device, while...


> 
> #mdadm --examine /dev/sdb1
> /dev/sdb1:
>   Magic : a92b4efc
> Version : 00.90.02
>UUID : c57d50aa:1b3bcabd:ab04d342:6049b3f1
>   Creation Time : Thu Dec 15 15:29:36 2005
>  Raid Level : raid5
>Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
> 
> Update Time : Tue Mar 21 06:23:57 2006
>   State : active
>  Active Devices : 2

at 06:23:57 there were two.

It looks like you lost a drive a while ago. Did you notice?

Anyway, the 'mdadm' command I gave above should get the array working
again for you.  Then you might want to
   mdadm /dev/md0 -a /dev/sdb1
if you trust /dev/sdb
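
Roughly, once the array is up again (the resync will take a while and
/proc/mdstat shows the progress):

   mdadm /dev/md0 -a /dev/sdb1     # re-add sdb1; it gets rebuilt from sda1
   watch cat /proc/mdstat          # recovery percentage and rough ETA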

NeilBrown


Help needed - RAID5 recovery from Power-fail

2006-04-03 Thread Nigel J. Terry
I wonder if you could help a Raid Newbie with a problem

I had a power fail, and now I can't access my RAID array. It has been
working fine for months until I lost power... Being a fool, I don't have
a full backup, so I really need to get this data back.

I run FC4 (64bit).
I have an array of two disks /dev/sda1 and /dev/sdb1 as a raid5 array
/dev/md0 on top of which I run lvm and mount the whole lot as /home. My
intention was always to add another disk to this array, and I purchased
one yesterday.

When I boot, I get:

md0 is not clean
Cannot start dirty degraded array
failed to run raid set md0


I can provide the following extra information:

# cat /proc/mdstat
Personalities : [raid5]
unused devices: <none>

# mdadm --query /dev/md0
/dev/md0: is an md device which is not active

# mdadm --query /dev/md0
/dev/md0: is an md device which is not active
/dev/md0: is too small to be an md component.

# mdadm --query /dev/sda1
/dev/sda1: is not an md array
/dev/sda1: device 0 in 2 device undetected raid5 md0.  Use mdadm
--examine for more detail.

#mdadm --query /dev/sdb1
/dev/sdb1: is not an md array
/dev/sdb1: device 1 in 2 device undetected raid5 md0.  Use mdadm
--examine for more detail.

# mdadm --examine /dev/md0
mdadm: /dev/md0 is too small for md

# mdadm --examine /dev/sda1
/dev/sda1:
  Magic : a92b4efc
Version : 00.90.02
   UUID : c57d50aa:1b3bcabd:ab04d342:6049b3f1
  Creation Time : Thu Dec 15 15:29:36 2005
 Raid Level : raid5
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

Update Time : Tue Mar 21 06:25:52 2006
  State : active
 Active Devices : 1
Working Devices : 1
 Failed Devices : 2
  Spare Devices : 0
   Checksum : 2ba99f09 - correct
 Events : 0.1498318

 Layout : left-symmetric
 Chunk Size : 128K

  Number   Major   Minor   RaidDevice State
this 0   810  active sync   /dev/sda1

   0 0   810  active sync   /dev/sda1
   1 1   001  faulty removed

#mdadm --examine /dev/sdb1
/dev/sdb1:
  Magic : a92b4efc
Version : 00.90.02
   UUID : c57d50aa:1b3bcabd:ab04d342:6049b3f1
  Creation Time : Thu Dec 15 15:29:36 2005
 Raid Level : raid5
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

Update Time : Tue Mar 21 06:23:57 2006
  State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
   Checksum : 2ba99e95 - correct
 Events : 0.1498307

 Layout : left-symmetric
 Chunk Size : 128K

  Number   Major   Minor   RaidDevice State
this 1   8   171  active sync   /dev/sdb1

   0 0   810  active sync   /dev/sda1
   1 1   8   171  active sync   /dev/sdb1

It looks to me like there is no hardware problem, but maybe I am wrong.
I cannot find any file /etc/mdadm.conf nor /etc/raidtab.

How would you suggest I proceed? I'm wary of doing anything (assemble,
build, create) until I am sure it won't reset everything.

Many Thanks

Nigel





Re: raid5 recovery fails

2005-11-14 Thread Neil Brown
On Tuesday November 15, [EMAIL PROTECTED] wrote:
> > mdadm --add /dev/md0 /dev/sda2..
> Yes, I am using raidstart for this.  It should be the same.

No, it shouldn't.
raidstart is broken by design and cannot work reliably.  It is one of
the main reasons that I wrote mdadm.

raidstart trusts the device numbers (major and minor) that are stored
in the superblock.  If you pull a drive out these numbers change and
raidstart fails miserably.

# rm -f /usr/sbin/raidstart

is probably a good idea.
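
A minimal sketch of the mdadm way instead, which identifies members by the
UUID stored in the superblock rather than by device numbers:

   mdadm --examine --scan                    # prints ARRAY lines with the real UUIDs
   echo 'DEVICE partitions' > /etc/mdadm.conf
   mdadm --examine --scan >> /etc/mdadm.conf
   mdadm --assemble --scan                   # assemble everything listed in the config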

> I am handling a big cluster with supermicro machines; each machine has its
> own 4 SATA disks. I am using a 2.6.6 kernel.
> 
> Have you ever pulled a disk out of a raid5 while the machine was
> running?

Yes, several times.


NeilBrown


Re: raid5 recovery fails

2005-11-14 Thread Raz Ben-Jehuda(caro)
> mdadm --add /dev/md0 /dev/sda2..
Yes, I am using raidstart for this.  It should be the same.
I am handling a big cluster with supermicro machines; each machine has its
own 4 SATA disks. I am using a 2.6.6 kernel.

Have you ever pulled a disk out of a raid5 while the machine was running?
I just want to know, before I dive into the raid code, whether it is really a bug.

On 11/14/05, Ross Vandegrift <[EMAIL PROTECTED]> wrote:
> On Mon, Nov 14, 2005 at 09:27:25PM +0200, Raz Ben-Jehuda(caro) wrote:
> > I have made the following test with my raid5:
> > 1. created raid5 with 4 sata disks.
> > 2. waited until the raid was fully initialized.
> > 3. pulled a disk from the panel.
> > 4. shut the system.
> > 5. put back the disk.
> > 6. turned on the system.
> >
> > The raid failed to recover. I got a message from the md layer
> > saying that it rejects the dirty disk.
> > Anyone?
>
> Did you re-add the disk to the array?
>
> # mdadm --add /dev/md0 /dev/sda2
>
> Of course, substitute your appropriate devices for the ones that I
> randomly chose :-)
>
>
> --
> Ross Vandegrift
> [EMAIL PROTECTED]
>
> "The good Christian should beware of mathematicians, and all those who
> make empty prophecies. The danger already exists that the mathematicians
> have made a covenant with the devil to darken the spirit and to confine
> man in the bonds of Hell."
> --St. Augustine, De Genesi ad Litteram, Book II, xviii, 37
>


--
Raz


Re: raid5 recovery fails

2005-11-14 Thread Ross Vandegrift
On Mon, Nov 14, 2005 at 09:27:25PM +0200, Raz Ben-Jehuda(caro) wrote:
> I have made the following test with my raid5:
> 1. created raid5 with 4 sata disks.
> 2. waited until the raid was fully initialized.
> 3. pulled a disk from the panel.
> 4. shut the system.
> 5. put back the disk.
> 6. turned on the system.
> 
> The raid failed to recover. I got a message from the md layer
> saying that it rejects the dirty disk.
> Anyone?

Did you re-add the disk to the array?

# mdadm --add /dev/md0 /dev/sda2

Of course, substitute your appropriate devices for the ones that I
randomly chose :-)
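
Roughly, for a pulled-and-reinserted disk (check the superblock first so you
know which device it came back as and how stale it is):

   mdadm --examine /dev/sda2       # event count / update time will lag the others
   mdadm --add /dev/md0 /dev/sda2  # triggers a full resync onto the stale disk
   cat /proc/mdstat                # watch the recovery run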


-- 
Ross Vandegrift
[EMAIL PROTECTED]

"The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell."
--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37


raid5 recovery fails

2005-11-14 Thread Raz Ben-Jehuda(caro)
I have made the following test with my raid5:
1. created raid5 with 4 sata disks.
2. waited until the raid was fully initialized.
3. pulled a disk from the panel.
4. shut the system.
5. put back the disk.
6. turned on the system.

The raid failed to recover. I got a message from the md layer
saying that it rejects the dirty disk.
Anyone?
--
Raz