Re: Raid5 with two failed disks?

2000-04-09 Thread Michael T. Babcock

It's a nice, complicated case of semaphores in threaded (multi-process?) systems ...

... one system needs to be aware that the other system isn't ready yet, without
causing incompatibilities.  With RAID, would it be possible for the MD driver to
accept the mount request but block the calling process until the driver is ready
to actually deliver data?

Jakob Østergaard wrote:

> >   I think my situation is the same as this "two failed disks" one but I
> > haven't been following the thread carefully and I just want to double check.
> >
> >   I have a mirrored RAID-1 setup between 2 disks with no spare disks.
> > Inadvertently the machine got powered down without a proper shutdown
> > apparently causing the RAID to become unhappy. It would boot to the point
> > where it needed to mount root and then would fail saying that it couldn't
> > access /dev/md1 because the two RAID disks were out of sync.
> >   Anyway, given this situation, how can I rebuild my array? Is all it takes
> > is doing another mkraid (given the raidtab is identical to the real setup,
> > etc)? If so, since I'm also booting off of raid, how do I do this for the
> > boot partition? I can boot up using one of the individual disks (e.g.
> > /dev/sda1) instead of the raid disk (/dev/md1), but if I do that will I be
> > able to do a mkraid on an in-use partition? If not, how do I resolve this
> > (boot from floppy?).
> >   Finally, is there any way to automate this recovery process. That is, if
> > the machine is improperly powered down again, can I have it automatically
> > rebuild itself the next time it comes up?
>
> As others already pointed out, this doesn't make sense.  The boot sequence uses
> the mount command to mount your fs, and mount doesn't know that your md device
> is in any way different from other block devices.
>
> Only if the md device doesn't start, the mount program will be unable to
> request the kernel to mount the device.
>
> We definitely need log output in order to tell what happened and why.
>
> --
> 
> : [EMAIL PROTECTED]  : And I see the elder races, :
> :.: putrid forms of man:
> :   Jakob Østergaard  : See him rise and claim the earth,  :
> :OZ9ABN   : his downfall is at hand.   :
> :.:{Konkhra}...:

--
   _/~-=##=-~\_
   -=+0+=-< Michael T. Babcock >-=+0+=-
   ~\_-=##=-_/~
http://www.linuxsupportline.com/~pgp/ ICQ: 4835018






Re: Raid5 with two failed disks?

2000-04-04 Thread Jakob Østergaard

On Mon, 03 Apr 2000, Rainer Mager wrote:

> Hi all,
> 
>   I think my situation is the same as this "two failed disks" one but I
> haven't been following the thread carefully and I just want to double check.
> 
>   I have a mirrored RAID-1 setup between 2 disks with no spare disks.
> Inadvertently the machine got powered down without a proper shutdown
> apparently causing the RAID to become unhappy. It would boot to the point
> where it needed to mount root and then would fail saying that it couldn't
> access /dev/md1 because the two RAID disks were out of sync.
>   Anyway, given this situation, how can I rebuild my array? Is all it takes
> is doing another mkraid (given the raidtab is identical to the real setup,
> etc)? If so, since I'm also booting off of raid, how do I do this for the
> boot partition? I can boot up using one of the individual disks (e.g.
> /dev/sda1) instead of the raid disk (/dev/md1), but if I do that will I be
> able to do a mkraid on an in-use partition? If not, how do I resolve this
> (boot from floppy?).
>   Finally, is there any way to automate this recovery process. That is, if
> the machine is improperly powered down again, can I have it automatically
> rebuild itself the next time it comes up?

As others already pointed out, this doesn't make sense.  The boot sequence uses
the mount command to mount your fs, and mount doesn't know that your md device
is in any way different from other block devices.

Only if the md device fails to start will the mount program be unable to
ask the kernel to mount the device.

We definitely need log output in order to tell what happened and why.

-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...:



RE: Raid5 with two failed disks?

2000-04-02 Thread Michael Robinton

> Hmm, well, I'm certainly not positive why it wouldn't boot and I don't have
> the logs in front of me, but I do remember it saying that it couldn't mount
> /dev/md1 and therefore had a panic during boot. My solution was to specify
> the root device as /dev/sda1 instead of the configured /dev/md1 from the
> lilo prompt.

Hmm the only time I've seen this message has been when using initrd 
with an out of sync /dev/md or when the raidtab in the initrd was bad or 
missing. This was without autostart.

Michael


> 
> The disk is marked to auto raid start and marked as fd. And, it booted just
> fine until the "dumb" shutdown.
> 
> As for a rescue disk I'll put one together. Thanks for the advice.
> 
> --Rainer
> 
> 
> > -----Original Message-----
> > From: Michael Robinton [mailto:[EMAIL PROTECTED]]
> > Sent: Monday, April 03, 2000 8:50 AM
> > To: Rainer Mager
> > Cc: Jakob Ostergaard; [EMAIL PROTECTED]
> > Subject: RE: Raid5 with two failed disks?
> >
> > Whether or not the array is in sync should not make a difference to the
> > boot process. I have both raid1 and raid 5 systems that run root raid and
> > will boot quite nicely and resync automatically after a "dumb" shutdown
> > that leaves them out of sync.
> >
> > Do you have your kernel built for auto raid start?? and partitions marked
> > "fd" ?
> >
> > You can reconstruct your existing array by booting with a kernel that
> > supports raid and with the raid tools on the rescue system. Do it all the
> > time.
> >
> > Michael
> >
> 



RE: Raid5 with two failed disks?

2000-04-02 Thread Rainer Mager

Hmm, well, I'm certainly not positive why it wouldn't boot and I don't have
the logs in front of me, but I do remember it saying that it couldn't mount
/dev/md1 and therefore had a panic during boot. My solution was to specify
the root device as /dev/sda1 instead of the configured /dev/md1 from the
lilo prompt.

The disk is marked to auto raid start and marked as fd. And, it booted just
fine until the "dumb" shutdown.

As for a rescue disk I'll put one together. Thanks for the advice.

--Rainer


> -----Original Message-----
> From: Michael Robinton [mailto:[EMAIL PROTECTED]]
> Sent: Monday, April 03, 2000 8:50 AM
> To: Rainer Mager
> Cc: Jakob Ostergaard; [EMAIL PROTECTED]
> Subject: RE: Raid5 with two failed disks?
>
> Whether or not the array is in sync should not make a difference to the
> boot process. I have both raid1 and raid 5 systems that run root raid and
> will boot quite nicely and resync automatically after a "dumb" shutdown
> that leaves them out of sync.
>
> Do you have your kernel built for auto raid start?? and partitions marked
> "fd" ?
>
> You can reconstruct your existing array by booting with a kernel that
> supports raid and with the raid tools on the rescue system. Do it all the
> time.
>
> Michael
>




RE: Raid5 with two failed disks?

2000-04-02 Thread Michael Robinton

On Mon, 3 Apr 2000, Rainer Mager wrote:

>   I think my situation is the same as this "two failed disks" one but I
> haven't been following the thread carefully and I just want to double check.
> 
>   I have a mirrored RAID-1 setup between 2 disks with no spare disks.
> Inadvertently the machine got powered down without a proper shutdown
> apparently causing the RAID to become unhappy. It would boot to the point
> where it needed to mount root and then would fail saying that it couldn't
> access /dev/md1 because the two RAID disks were out of sync.
>   Anyway, given this situation, how can I rebuild my array? Is all it takes
> is doing another mkraid (given the raidtab is identical to the real setup,
> etc)? If so, since I'm also booting off of raid, how do I do this for the
> boot partition? I can boot up using one of the individual disks (e.g.
> /dev/sda1) instead of the raid disk (/dev/md1), but if I do that will I be
> able to do a mkraid on an in-use partition? If not, how do I resolve this
> (boot from floppy?).
>   Finally, is there any way to automate this recovery process. That is, if
> the machine is improperly powered down again, can I have it automatically
> rebuild itself the next time it comes up?
> 
Whether or not the array is in sync should not make a difference to the 
boot process. I have both raid1 and raid 5 systems that run root raid and 
will boot quite nicely and resync automatically after a "dumb" shutdown
that leaves them out of sync.

Do you have your kernel built for auto raid start?? and partitions marked 
"fd" ?

You can reconstruct your existing array by booting with a kernel that
supports raid and with the raid tools on the rescue system. Do it all the 
time.

Michael



RE: Raid5 with two failed disks?

2000-04-02 Thread Rainer Mager

Hi all,

I think my situation is the same as this "two failed disks" one but I
haven't been following the thread carefully and I just want to double check.

I have a mirrored RAID-1 setup between 2 disks with no spare disks.
Inadvertently the machine got powered down without a proper shutdown
apparently causing the RAID to become unhappy. It would boot to the point
where it needed to mount root and then would fail saying that it couldn't
access /dev/md1 because the two RAID disks were out of sync.
Anyway, given this situation, how can I rebuild my array? Does it just take
another mkraid (given the raidtab is identical to the real setup,
etc)? If so, since I'm also booting off of raid, how do I do this for the
boot partition? I can boot up using one of the individual disks (e.g.
/dev/sda1) instead of the raid disk (/dev/md1), but if I do that will I be
able to do a mkraid on an in-use partition? If not, how do I resolve this
(boot from floppy?).
Finally, is there any way to automate this recovery process? That is, if
the machine is improperly powered down again, can I have it automatically
rebuild itself the next time it comes up?

Thanks in advance,

--Rainer




Re: Raid5 with two failed disks?

2000-04-02 Thread Jakob Østergaard

On Sun, 02 Apr 2000, Marc Haber wrote:

[snip]
> Yes, I did. However, I'd add a sentence to the HOWTO mentioning that in this
> case mkraid probably won't be destructive. After the mkraid
> warning, I aborted the procedure and started asking. I think this
> should be avoided in the future.

I have added this to my FIX file for the next revision of the HOWTO.

Thanks,
-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...:



Re: Raid5 with two failed disks?

2000-04-02 Thread Marc Haber

On Sun, 2 Apr 2000 15:28:28 +0200, you wrote:
>On Sun, 02 Apr 2000, Marc Haber wrote:
>> On Sat, 1 Apr 2000 12:44:49 +0200, you wrote:
>> >It _is_ in the docs.
>> 
>> Which docs do you refer to? I must have missed this.
>
>Section 6.1 in http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/
>
>Didn't you actually mention it yourself ?   :)

Yes, I did. However, I'd add a sentence to the HOWTO mentioning that in this
case mkraid probably won't be destructive. After the mkraid
warning, I aborted the procedure and started asking. I think this
should be avoided in the future.

Greetings
Marc

-- 
-- !! No courtesy copies, please !! -
Marc Haber  |   " Questions are the | Mailadresse im Header
Karlsruhe, Germany  | Beginning of Wisdom " | Fon: *49 721 966 32 15
Nordisch by Nature  | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29



Re: Raid5 with two failed disks?

2000-04-02 Thread Jakob Østergaard

On Sun, 02 Apr 2000, Marc Haber wrote:

> On Sat, 1 Apr 2000 12:44:49 +0200, you wrote:
> >It _is_ in the docs.
> 
> Which docs do you refer to? I must have missed this.

Section 6.1 in http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/

Didn't you actually mention it yourself ?   :)
(don't remember - someone mentioned it at least...)

-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...:



Re: Raid5 with two failed disks?

2000-04-02 Thread Marc Haber

On Sat, 1 Apr 2000 12:44:49 +0200, you wrote:
>It _is_ in the docs.

Which docs do you refer to? I must have missed this.

Greetings
Marc

-- 
-- !! No courtesy copies, please !! -
Marc Haber  |   " Questions are the | Mailadresse im Header
Karlsruhe, Germany  | Beginning of Wisdom " | Fon: *49 721 966 32 15
Nordisch by Nature  | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29



Re: Raid5 with two failed disks?

2000-04-01 Thread Jakob Østergaard

On Fri, 31 Mar 2000, Marc Haber wrote:

> On Thu, 30 Mar 2000 09:20:57 +0200, you wrote:
> >At 02:16 30.03.00, you wrote:
> >>Hi... I have a Raid5 Array, using 4 IDE HDs. A few days ago, the system
> >>hung, no reaction, except ping from the host, nothing to see on the
> >>monitor. I rebooted the system and it told me, 2 out of 4 disks were out
> >>of sync. 2 Disks have an event counter of 0062, the two others
> >>0064. I hope, that there is a way to fix this. I searched through the
> >>mailing-list and found one thread, but it did not help me.
> >
> >Yes, I do. Check Jakob's RAID HOWTO, section "recovering from multiple failures".
> >
> >You can recreate the superblocks of the raid disks using mkraid;
> 
> I had that problem a week ago and chickened out after mkraid told me
> it would destroy my array. If, in this situation, destruction doesn't
> happen, this should be mentioned in the docs.

It _is_ in the docs.  But the message from the mkraid tool is still sane,
because it actually _will_ destroy your data *if you do not know what you are
doing*.   So, for the average Joe-user just playing with his tools as root
(*ouch!*), this message is a life saver.  For people who actually need to
re-write the superblocks for good reasons, well they have read the docs so they
know the message doesn't apply to them - if they don't make mistakes.

mkraid'ing an existing array is inherently dangerous if you're not careful or
don't know what you're doing.  It's perfectly safe otherwise.  Having the tool tell
the user that ``here be dragons'' is perfectly sensible IMHO.

-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...:



Re: Raid5 with two failed disks?

2000-03-31 Thread Marc Haber

On Thu, 30 Mar 2000 09:20:57 +0200, you wrote:
>At 02:16 30.03.00, you wrote:
>>Hi... I have a Raid5 Array, using 4 IDE HDs. A few days ago, the system
>>hung, no reaction, except ping from the host, nothing to see on the
>>monitor. I rebooted the system and it told me, 2 out of 4 disks were out
>>of sync. 2 Disks have an event counter of 0062, the two others
>>0064. I hope, that there is a way to fix this. I searched through the
>>mailing-list and found one thread, but it did not help me.
>
>Yes, I do. Check Jakob's RAID HOWTO, section "recovering from multiple failures".
>
>You can recreate the superblocks of the raid disks using mkraid;

I had that problem a week ago and chickened out after mkraid told me
it would destroy my array. If, in this situation, destruction doesn't
happen, this should be mentioned in the docs.

Greetings
Marc

-- 
-- !! No courtesy copies, please !! -
Marc Haber  |   " Questions are the | Mailadresse im Header
Karlsruhe, Germany  | Beginning of Wisdom " | Fon: *49 721 966 32 15
Nordisch by Nature  | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29



Re: Raid5 with two failed disks?

2000-03-31 Thread Marc Haber

On Thu, 30 Mar 2000 10:17:06 -0500, you wrote:
>On Thu, Mar 30, 2000 at 08:36:52AM -0600, Bill Carlson wrote:
>> I've been thinking about this for a different project, how bad would it be
>> to setup RAID 5 to allow for 2 (or more) failures in an array? Or is this
>> handled under a different class of RAID (ignoring things like RAID 5 over
>> mirrored disks and such).
>
>You just can't do that with RAID5.  I seem to remember that there's a RAID 6
>or 7 that handles 2 disk failures (multiple parity devices or something like
>that.)
>
>You can optionally do RAID 5+1 where you mirror partitions and then stripe
>across them ala RAID 0+1.  You'd have to lose 4 disks minimally before the
>array goes offline.

How about a RAID 5 with a single spare disk? You are dead if two disks
fail within the time it takes to resync, though. If you have n spare
disks, you can survive n+1 disks failing, provided they don't fail at
once.
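
A minimal sketch of that argument in Python. It assumes a single RAID 5 array
and that every rebuild onto a spare finishes before the next batch of failures
arrives. The function name and the "batch" model are purely illustrative and
are not part of the md driver or the raidtools.

```python
def raid5_survives(n_spares, failure_batches):
    """Return True if a RAID 5 array with n_spares hot spares stays available.

    failure_batches lists how many disks fail at (roughly) the same time.
    Assumption: a rebuild onto a spare completes between two batches.
    """
    spares_left = n_spares
    degraded = False  # True while the array runs with one member missing
    for batch in failure_batches:
        missing = batch + (1 if degraded else 0)
        if missing > 1:
            return False  # RAID 5 cannot run with two members missing
        if missing == 1:
            if spares_left > 0:
                spares_left -= 1  # spare takes over and the resync completes
                degraded = False
            else:
                degraded = True  # no spare left, keep running degraded
    return True


print(raid5_survives(1, [1, 1]))     # two failures, one after the other
print(raid5_survives(1, [2]))        # two failures at once
print(raid5_survives(1, [1, 1, 1]))  # a third failure with no spare left
```

With one spare the array gets through two failures in sequence, but not two
at once and not a third failure.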

Greetings
Marc

-- 
-- !! No courtesy copies, please !! -
Marc Haber  |   " Questions are the | Mailadresse im Header
Karlsruhe, Germany  | Beginning of Wisdom " | Fon: *49 721 966 32 15
Nordisch by Nature  | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29



Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Theo Van Dinter wrote:

> On Thu, Mar 30, 2000 at 02:21:45PM -0600, Bill Carlson wrote:
> > 1+5 would still fail on 2 drives if those 2 drives were both from the
> > same RAID 1 set. The wasted space becomes more than N/2, but it might be
> > worth it for the HA aspect. RAID 6 looks cleaner, but that would require
> > someone to write an implementation, whereas you could do RAID 15 (51?)
> > now. 
> 
> 2 drives failing in either RAID 1+5 or 5+1 results in a still available
> array:

Doh, you're right. Thanks for drawing me a picture...:)

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/  |  not my employer's.
University of Iowa Hospitals and Clinics  |





Re: Raid5 with two failed disks?

2000-03-30 Thread Theo Van Dinter

On Thu, Mar 30, 2000 at 02:21:45PM -0600, Bill Carlson wrote:
> 1+5 would still fail on 2 drives if those 2 drives were both from the
> same RAID 1 set. The wasted space becomes more than N/2, but it might be
> worth it for the HA aspect. RAID 6 looks cleaner, but that would require
> someone to write an implementation, whereas you could do RAID 15 (51?)
> now. 

2 drives failing in either RAID 1+5 or 5+1 results in a still available
array:

minimal RAID 1+5 (ie: mirroring stripes)

stripe 1    mirrored    stripe 2
sda1                    sdd1
sdb1                    sde1
sdc1                    sdf1

If 2 in the same stripe die, then that stripe dies, but the array is still
there since the other stripe is fine.  If 1 in each stripe dies then both
stripes are still available (although degraded) and the whole array is still
up.

You can lose a third disk as well without any problems (either all 3 on one
side, or a 1/2 split which leaves the array available but degraded.)

minimal RAID 5+1 (ie: striped mirrors)

sda1    mirrored    sdd1
sdb1    mirrored    sde1
sdc1    mirrored    sdf1

and then striped vertically.  2 disks failing in the same mirror means
the array goes into degraded mode, but it's still available.  2 disks
failing in different mirrors means that the array is still 100% up and
available (not degraded).

You can still lose a third disk, either both sides of a mirror and another,
or 1 from each mirror -- same result, the array is still available (and in
the latter case, non degraded).

In either case, losing a 4th disk could potentially bring the array down.
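
For anyone who wants to check the counting, here is a small brute-force sketch
in Python. It assumes the six-disk layouts drawn above and the usual rules: a
RAID 5 set survives one missing member, a mirror survives one missing half.
The function names are made up for this illustration.

```python
from itertools import combinations

# Disk names follow the six-disk layouts drawn above.
LEFT = ("sda1", "sdb1", "sdc1")
RIGHT = ("sdd1", "sde1", "sdf1")
DISKS = LEFT + RIGHT


def mirrored_raid5_sets_alive(failed):
    """Two 3-disk RAID 5 sets mirrored against each other.

    The array is up as long as at least one set has lost at most one disk.
    """
    return any(sum(disk in failed for disk in side) <= 1 for side in (LEFT, RIGHT))


def raid5_over_mirrors_alive(failed):
    """RAID 5 striped across three mirrored pairs (sda1/sdd1, sdb1/sde1, ...).

    The array is up as long as at most one pair has lost both halves.
    """
    dead_pairs = sum(a in failed and b in failed for a, b in zip(LEFT, RIGHT))
    return dead_pairs <= 1


def count_fatal(alive, n_failed):
    """Count the combinations of n_failed disks that take the array down."""
    return sum(not alive(set(combo)) for combo in combinations(DISKS, n_failed))


for n_failed in (2, 3, 4):
    print(n_failed,
          count_fatal(mirrored_raid5_sets_alive, n_failed),
          count_fatal(raid5_over_mirrors_alive, n_failed))
```

No combination of two or three failed disks is fatal in either layout. With
four failed disks, 9 of the 15 combinations take down the mirrored RAID 5 sets
and 3 of the 15 take down the RAID 5 over mirrors.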

I'll agree, BTW, that this is a large amount of "wasted" space, but it
depends what your goals are.  x2 disks may be worth it if you need a large
amount of reliability.  I haven't looked at them too much, but since RAID
6/7(?) handle double disk failures, they're worth looking into.

> My thought here is leading to a distributed file system that is server
> independent, it seems something like that would solve a lot of problems

It would be pretty nifty.

-- 
Randomly Generated Tagline:
"If you're ordering from us, your miserable enough to do without spam
 from strangers." - Despair.com's privacy statement



Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Theo Van Dinter wrote:

> On Thu, Mar 30, 2000 at 08:36:52AM -0600, Bill Carlson wrote:
> > I've been thinking about this for a different project, how bad would it be
> > to setup RAID 5 to allow for 2 (or more) failures in an array? Or is this
> > handled under a different class of RAID (ignoring things like RAID 5 over
> > mirrored disks and such).
> 
> You just can't do that with RAID5.  I seem to remember that there's a RAID 6
> or 7 that handles 2 disk failures (multiple parity devices or something like
> that.)
> 
> You can optionally do RAID 5+1 where you mirror partitions and then stripe
> across them ala RAID 0+1.  You'd have to lose 4 disks minimally before the
> array goes offline.

1+5 would still fail on 2 drives if those 2 drives were both from the
same RAID 1 set. The wasted space becomes more than N/2, but it might be
worth it for the HA aspect. RAID 6 looks cleaner, but that would require
someone to write an implementation, whereas you could do RAID 15 (51?)
now. 
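
To put rough numbers on the wasted space, a back-of-the-envelope sketch in
Python. It assumes N equal-sized disks and counts capacity in whole disks; the
layout names are labels invented for this example, and "raid5-over-mirrors"
means N/2 mirrored pairs with RAID 5 striped across the pairs.

```python
def usable_disks(layout, n):
    """Return how many disks' worth of data n equal-sized disks can hold."""
    if layout == "raid5":
        return n - 1  # one disk's worth of parity
    if layout == "raid6":
        return n - 2  # two disks' worth of parity
    if layout == "raid5-over-mirrors":
        if n % 2:
            raise ValueError("mirrored pairs need an even number of disks")
        return n // 2 - 1  # n/2 pairs, minus one pair's worth of parity
    raise ValueError("unknown layout: " + layout)


N = 6
for layout in ("raid5", "raid6", "raid5-over-mirrors"):
    usable = usable_disks(layout, N)
    print(layout, "usable:", usable, "wasted:", N - usable)
```

With six disks the RAID 5 over mirrors keeps two disks' worth of data and
spends four on redundancy, which is indeed more than N/2.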

My thought here is leading to a distributed file system that is server
independent, it seems something like that would solve a lot of problems
that things like NFS and Coda don't handle. From what I've read, GFS is
supposed to do this, but it never hurts to attack a thing from a couple of
directions.

Use the net block device, RAID 15 and go. Very tempting...:)

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/  |  not my employer's.
University of Iowa Hospitals and Clinics  |





Re: Raid5 with two failed disks?

2000-03-30 Thread Sven Kirmess

 Hi Bill,

Thursday, March 30, 2000, 4:36:52 PM, you wrote:

> I've been thinking about this for a different project, how bad would
> it be to setup RAID 5 to allow for 2 (or more) failures in an array?
> Or is this handled under a different class of RAID (ignoring things
> like RAID 5 over mirrored disks and such).

Raid 6 is exactly what you are looking for: Raid 5 with double parity
info. You give up the capacity of 2 disks out of N.

http://www.raid5.com/raid6.html


Or you may just take Raid 7 http://www.raid5.com/raid7.html ... Sounds
great. :-)



 Sven





Re: Raid5 with two failed disks?

2000-03-30 Thread Tmm

Thanks to all, it worked!





Re: Raid5 with two failed disks?

2000-03-30 Thread Theo Van Dinter

On Thu, Mar 30, 2000 at 08:36:52AM -0600, Bill Carlson wrote:
> I've been thinking about this for a different project, how bad would it be
> to setup RAID 5 to allow for 2 (or more) failures in an array? Or is this
> handled under a different class of RAID (ignoring things like RAID 5 over
> mirrored disks and such).

You just can't do that with RAID5.  I seem to remember that there's a RAID 6
or 7 that handles 2 disk failures (multiple parity devices or something like
that.)

You can optionally do RAID 5+1 where you mirror partitions and then stripe
across them ala RAID 0+1.  You'd have to lose 4 disks minimally before the
array goes offline.

-- 
Randomly Generated Tagline:
"There are more ways to reduce friction in metals then there were
 release dates for Windows 95."- Quantum on TLC



Re: Raid5 with two failed disks?

2000-03-30 Thread Bill Carlson

On Thu, 30 Mar 2000, Martin Bene wrote:

> At 02:16 30.03.00, you wrote:
> >Hi... I have a Raid5 Array, using 4 IDE HDs. A few days ago, the system
> >hung, no reaction, except ping from the host, nothing to see on the
> >monitor. I rebooted the system and it told me, 2 out of 4 disks were out
> >of sync. 2 Disks have an event counter of 0062, the two others
> >0064. I hope, that there is a way to fix this. I searched through the
> >mailing-list and found one thread, but it did not help me.
> 
> Yes, I do. Check Jakob's RAID HOWTO, section "recovering from multiple failures".
> 
> You can recreate the superblocks of the raid disks using mkraid; if you 
> explicitly mark one disk as failed in the raidtab, no automatic resync is 
> started, so you get to check if all works and perhaps change something and 
> retry.
>

Hey all,

I've been thinking about this for a different project, how bad would it be
to setup RAID 5 to allow for 2 (or more) failures in an array? Or is this
handled under a different class of RAID (ignoring things like RAID 5 over
mirrored disks and such).

Three words: Net block device 

Bill Carlson

Systems Programmer    [EMAIL PROTECTED]    |  Opinions are mine,
Virtual Hospital      http://www.vh.org/  |  not my employer's.
University of Iowa Hospitals and Clinics  |





Re: Raid5 with two failed disks?

2000-03-29 Thread Martin Bene

At 02:16 30.03.00, you wrote:
>Hi... I have a Raid5 Array, using 4 IDE HDs. A few days ago, the system
>hung, no reaction, except ping from the host, nothing to see on the
>monitor. I rebooted the system and it told me, 2 out of 4 disks were out
>of sync. 2 Disks have an event counter of 0062, the two others
>0064. I hope, that there is a way to fix this. I searched through the
>mailing-list and found one thread, but it did not help me.

Yes, I do. Check Jakob's RAID HOWTO, section "recovering from multiple failures".

You can recreate the superblocks of the raid disks using mkraid; if you 
explicitly mark one disk as failed in the raidtab, no automatic resync is 
started, so you get to check if all works and perhaps change something and 
retry.

* Check your raidtab against the info you get in the logs from the failed 
startup (correct sequence of partitions).
* mark one of the disks with the lowest event count as a "failed-disk" 
instead of "raid-disk" in the raidtab
* recreate the raid superblocks using mkraid
* try to mount readonly, check if all is OK
* if it doesn't work, recheck raidtab, perhaps mark a different drive as 
failed, go back to the mkraid step.
* mount normally, add the last disk using raidhotadd
* remove the failed-disk stuff from your raidtab.

The mkraid command will NOT change any data on your drives except for the
raid superblocks. Normally, after mkraid the resynchronisation gets started
automatically, which you definitely don't want while trying to recover
from a failure: should you have gotten anything wrong, the resync would
definitely kill your data. So, you mark one disk as failed and create the
array in degraded mode.
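
A rough Python sketch of the "find the lowest event count, mark it failed"
step, for illustration only. It parses the "(read) ... [events: ...]" lines as
they appear in the posted dmesg output and assumes the counter is printed in
hexadecimal. The helper names are invented. The slot order passed to
device_stanzas() is also an assumption: the log only confirms hdh1 as raid
disk 0 and hdg1 as raid disk 1, so check the rest against your real raidtab.

```python
import re

# Matches md autodetect lines such as:
#   (read) hde1's sb offset: 24990720 [events: 0062]
EVENT_LINE = re.compile(r"\(read\) (\w+)'s sb offset: \d+ \[events: ([0-9a-fA-F]+)\]")


def event_counters(dmesg_text):
    """Return {device: event_count} parsed from md autodetect output."""
    # Assumption: the event counter is printed in hexadecimal.
    return {dev: int(count, 16) for dev, count in EVENT_LINE.findall(dmesg_text)}


def stale_candidates(counters):
    """Return the devices with the lowest event count, sorted by name."""
    lowest = min(counters.values())
    return sorted(dev for dev, count in counters.items() if count == lowest)


def device_stanzas(slots, failed):
    """Render raidtab device entries; slots lists devices in raid-disk order."""
    lines = []
    for index, dev in enumerate(slots):
        keyword = "failed-disk" if dev == failed else "raid-disk"
        lines.append(f"    device          /dev/{dev}")
        lines.append(f"    {keyword:<15} {index}")
    return "\n".join(lines)


SAMPLE = """\
(read) hde1's sb offset: 24990720 [events: 0062]
(read) hdf1's sb offset: 24990720 [events: 0062]
(read) hdg1's sb offset: 24990720 [events: 0064]
(read) hdh1's sb offset: 24990720 [events: 0064]
"""

counters = event_counters(SAMPLE)
candidates = stale_candidates(counters)
print("lowest event count:", candidates)

# Hypothetical slot order: only slots 0 and 1 are confirmed by the log.
slots = ["hdh1", "hdg1", "hdf1", "hde1"]
stanzas = device_stanzas(slots, failed=candidates[0])
print(stanzas)
```

The output is only the device part of a raidtab entry. The rest of the file
(raid level, chunk size and so on) has to match the original array exactly.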

Bye, Martin
"you have moved your mouse, please reboot to make this change take effect"
--
  Martin Bene   vox: +43-316-813824
  simon media   fax: +43-316-813824-6
  Andreas-Hofer-Platz 9 e-mail: [EMAIL PROTECTED]
  8010 Graz, Austria
--
finger [EMAIL PROTECTED] for PGP public key




Raid5 with two failed disks?

2000-03-29 Thread Tmm

Hi... I have a Raid5 Array, using 4 IDE HDs. A few days ago, the system
hung, no reaction, except ping from the host, nothing to see on the
monitor. I rebooted the system and it told me, 2 out of 4 disks were out
of sync. Two disks have an event counter of 0062, the other two
0064. I hope that there is a way to fix this. I searched through the
mailing-list and found one thread, but it did not help me.

Does anyone have some ideas?

Greetings, 
   Stefan


ps: Here is some part of my dmesg output:
hde: IBM-DJNA-352500, 24405MB w/1966kB Cache, CHS=49585/16/63, UDMA
hdf: IBM-DJNA-352500, 24405MB w/1966kB Cache, CHS=49585/16/63, UDMA
hdg: IBM-DJNA-352500, 24405MB w/1966kB Cache, CHS=49585/16/63, UDMA
hdh: IBM-DJNA-352500, 24405MB w/1966kB Cache, CHS=49585/16/63, UDMA
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
translucent personality registered
linear personality registered
raid5 personality registered
raid5: measuring checksumming speed
   8regs :   115.443 MB/sec
   32regs:    87.630 MB/sec
using fastest function: 8regs (115.443 MB/sec)
md.c: sizeof(mdp_super_t) = 4096
 hde: hde1
 hdf: hdf1
 hdg: hdg1
 hdh: hdh1
autodetecting RAID arrays
(read) hde1's sb offset: 24990720 [events: 0062]
(read) hdf1's sb offset: 24990720 [events: 0062]
(read) hdg1's sb offset: 24990720 [events: 0064]
(read) hdh1's sb offset: 24990720 [events: 0064]
autorun ...
considering hdh1 ...
  adding hdh1 ...
  adding hdg1 ...
  adding hdf1 ...
  adding hde1 ...
created md0
bind
bind
bind
bind
running: 
now!
hdh1's event counter: 0064
hdg1's event counter: 0064
hdf1's event counter: 0062
hde1's event counter: 0062
md: superblock update time inconsistency -- using the most recent one
freshest: hdh1
md: kicking non-fresh hdf1 from array!
unbind
export_rdev(hdf1)
md: kicking non-fresh hde1 from array!
unbind
export_rdev(hde1)
md0: removing former faulty hdf1!
md0: removing former faulty hde1!
md: md0: raid array is not clean -- starting background reconstruction
md0: max total readahead window set to 384k
md0: 3 data-disks, max readahead per data-disk: 128k
raid5: device hdh1 operational as raid disk 0
raid5: device hdg1 operational as raid disk 1
raid5: not enough operational devices for md0 (2/4 failed)
RAID5 conf printout:
 --- rd:4 wd:2 fd:2
 disk 0, s:0, o:1, n:0 rd:0 us:1 dev:hdh1
 disk 1, s:0, o:1, n:1 rd:1 us:1 dev:hdg1
 disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
 disk 3, s:0, o:0, n:3 rd:3 us:1 dev:[dev 00:00]
 disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
 disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
raid5: failed to run raid set md0
pers->run() failed ...
do_md_run() returned -22
unbind
export_rdev(hdh1)
unbind
export_rdev(hdg1)
md0 stopped.
... autorun DONE.