Re: [opensuse] Re-adding RAID drives

2007-06-25 Thread Roger Oberholtzer
On Mon, 2007-06-25 at 16:26 -0400, Greg Freemyer wrote:
> On 6/25/07, Roger Oberholtzer <[EMAIL PROTECTED]> wrote:
> > 4. insert the new disks.
> >
> 
> >
> > One more thing: I have a udev rule so my SATA disks have the same kernel
> > name when they are put in the same physical slot (I have 4 removable
> > SATS disks bays). If you do not have this, sdd can become sde (and so
> > on) after a swap. /var/log/messages tells what the inserted disk is
> > known as. Perhaps it is as simple as that.
> 
> I am not a mdraid user, but I do follow the libata mailing list.
> Hotswap should work from an ATA level.  As Roger says the default
> behavior is to assign the drive a new /dev/sdx name that will not
> correspond with the original name.
> 
> I don't know what the fix is, but that is almost definitely the
> problem.  FYI: If your just testing and you get the rebuild to work,
> make sure you test a reboot, because that will change the /dev/sdx
> value back to its original value.

My udev rules for 4 removable disks are:

SUBSYSTEM=="block", BUS=="scsi", KERNEL=="sd*[0-9]", ID=="0:0:0:0", \
SYMLINK="cameraA_p%n"

SUBSYSTEM=="block", BUS=="scsi", KERNEL=="sd*[0-9]", ID=="1:0:0:0", \
SYMLINK="cameraB_p%n"

SUBSYSTEM=="block", BUS=="scsi", KERNEL=="sd*[0-9]", ID=="0:0:1:0", \
SYMLINK="cameraC_p%n"

SUBSYSTEM=="block", BUS=="scsi", KERNEL=="sd*[0-9]", ID=="1:0:1:0", \
SYMLINK="cameraD_p%n"

Each removable bay has an unchanging ID, which you can see
in /var/log/messages when a disk is found. In my case, I make a symlink
with a consistent name I like that points to the /dev/sdXN device. I
decided not to rename the kernel device itself, so that I would not break
anything else. All my mount commands and such use the symlink name, not
the sdXN kernel name. I would imagine the same thing could be done when
the disks are to be part of a RAID. I guess the removable aspect of the
RAID is that you could replace a bad disk and rebuild it without a
reboot? My RAID is not built on hot-swappable hardware. Too bad...
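
For RAID use, the same idea might look like the sketch below: look up
which kernel device a slot symlink currently points to before handing it
to mdadm. I have not run this against an array. The function name
resolve_slot is made up for illustration, it assumes GNU readlink (for
the -f option), and cameraD_p2 simply follows the SYMLINK pattern in the
rules above.

```shell
# resolve_slot LINK
# Print the kernel device that a per-slot udev symlink currently points to.
# "resolve_slot" is a hypothetical helper name, not part of udev or mdadm.
resolve_slot() {
    link=$1
    if [ ! -e "$link" ]; then
        echo "resolve_slot: $link does not exist (no disk in that bay?)" >&2
        return 1
    fi
    # readlink -f (GNU coreutils) follows the symlink to its final target.
    readlink -f "$link"
}
```

Something like mdadm --manage /dev/md0 --add "$(resolve_slot /dev/cameraD_p2)"
would then add whatever disk sits in bay D, whichever sdX name it was given.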

-- 
Roger Oberholtzer

OPQ Systems / Ramböll RST
Ramböll Sverige AB
Kapellgränd 7
P.O. Box 4205
SE-102 65 Stockholm, Sweden

Tel: Int +46 8-615 60 20
Fax: Int +46 8-31 42 23

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [opensuse] Re-adding RAID drives

2007-06-25 Thread Greg Freemyer

On 6/25/07, Roger Oberholtzer <[EMAIL PROTECTED]> wrote:
> 4. insert the new disks.
>
> One more thing: I have a udev rule so my SATA disks have the same kernel
> name when they are put in the same physical slot (I have 4 removable
> SATS disks bays). If you do not have this, sdd can become sde (and so
> on) after a swap. /var/log/messages tells what the inserted disk is
> known as. Perhaps it is as simple as that.

I am not a mdraid user, but I do follow the libata mailing list.
Hotswap should work from an ATA level.  As Roger says the default
behavior is to assign the drive a new /dev/sdx name that will not
correspond with the original name.

I don't know what the fix is, but that is almost definitely the
problem.  FYI: If you're just testing and you get the rebuild to work,
make sure you test a reboot, because that will change the /dev/sdx
value back to its original value.
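
One way to check this might be to record which kernel name each
persistent symlink points to, once after the hot swap and once after the
reboot, and compare the two listings. This is only a sketch: list_links
is a made-up helper, it assumes GNU readlink, and /dev/disk/by-id is
assumed to be populated by udev on your system (it is not mentioned
elsewhere in this thread).

```shell
# list_links DIR
# Print "linkname -> resolved target" for every symlink in DIR, sorted,
# so two snapshots can be compared with diff(1).
# "list_links" is a hypothetical helper name.
list_links() {
    dir=$1
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done | sort
}
```

Run list_links /dev/disk/by-id > before.txt after the swap, the same
into after.txt once the machine has rebooted, and diff the two files.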

Greg
--
Greg Freemyer
The Norcross Group
Forensics for the 21st Century



Re: [opensuse] Re-adding RAID drives

2007-06-25 Thread James Knott
Roger Oberholtzer wrote:
> On Mon, 2007-06-25 at 11:49 +0200, Carlos E. R. wrote:
>> The Sunday 2007-06-24 at 23:06 -0400, James Knott wrote:
>>
>>> netfinity:~ # mdadm --add /dev/md0 /dev/sdd2
>>> mdadm: add new device failed for /dev/sdd2 as 4: Invalid argument
>>>
>>> If I now reboot, I'll be able to add the drive, using the same command,
>>> like so.
>>
>> I would say that the kernel does not support hot swapping of disks. In
>> fact, I read time ago it didn't and was work in progress. I don't know
>> what is the current state.
>
> The kernel does indeed support hot swapping. However, the quality of
> this depends on the underlying hardware. For example, the intel ICHx
> chips require that there be a disk in the machine when it is powered on.
> After that you can hotswap.  Don't forget the hotplug option
> in /etc/fstab (or wherever you put the mount options). Also, the ICHx
> chipset seems to require time to sense that a swap has indeed happened.
> So I always:
>
> 1. umount the disks to be removed.
Since the disks are part of a RAID array, they're not listed in fstab,
except for the small slice of one used for /boot.

-- 
Use OpenOffice.org 




Re: [opensuse] Re-adding RAID drives

2007-06-25 Thread James Knott
Carlos E. R. wrote:
>
> The Sunday 2007-06-24 at 23:06 -0400, James Knott wrote:
>
>
> > netfinity:~ # mdadm --add /dev/md0 /dev/sdd2
> > mdadm: add new device failed for /dev/sdd2 as 4: Invalid argument
>
> > If I now reboot, I'll be able to add the drive, using the same command,
> > like so.
>
> I would say that the kernel does not support hot swapping of disks. In
> fact, I read time ago it didn't and was work in progress. I don't know
> what is the current state.
>
It apparently still doesn't work.  :-(

Oh well, this system is just for learning about such things.  As it sits
right now, I've got LVM running on top of RAID, containing everything
but /boot.


-- 
Use OpenOffice.org 




Re: [opensuse] Re-adding RAID drives

2007-06-25 Thread Roger Oberholtzer
On Mon, 2007-06-25 at 11:49 +0200, Carlos E. R. wrote:
> 
> The Sunday 2007-06-24 at 23:06 -0400, James Knott wrote:
> 
> 
> > netfinity:~ # mdadm --add /dev/md0 /dev/sdd2
> > mdadm: add new device failed for /dev/sdd2 as 4: Invalid argument
> > 
> > If I now reboot, I'll be able to add the drive, using the same command,
> > like so.
> 
> I would say that the kernel does not support hot swapping of disks. In 
> fact, I read time ago it didn't and was work in progress. I don't know 
> what is the current state.

The kernel does indeed support hot swapping. However, the quality of
this depends on the underlying hardware. For example, the intel ICHx
chips require that there be a disk in the machine when it is powered on.
After that you can hotswap.  Don't forget the hotplug option
in /etc/fstab (or wherever you put the mount options). Also, the ICHx
chipset seems to require time to sense that a swap has indeed happened.
So I always:

1. umount the disks to be removed.

2. remove the disks

3. wait, say, 30 seconds. During this time you should see messages
in /var/log/messages that the disks have been removed. If you are using
Intel ICHx chips, you may even see messages about disks being removed
and waiting for the hardware to calm down.

4. insert the new disks.
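
The waiting in steps 3 and 4 can be scripted instead of timed by hand.
The sketch below polls until a device node disappears or appears.
wait_gone and wait_present are made-up names, the 30-second default just
mirrors step 3, and it assumes udev removes and creates the /dev node
when the kernel notices the swap.

```shell
# wait_present PATH [TIMEOUT]  - return 0 once PATH exists, 1 on timeout
# wait_gone    PATH [TIMEOUT]  - return 0 once PATH is gone, 1 on timeout
# TIMEOUT is in seconds (default 30). Both helper names are hypothetical.
wait_present() {
    path=$1
    left=${2:-30}
    while [ ! -e "$path" ]; do
        [ "$left" -gt 0 ] || return 1
        sleep 1
        left=$((left - 1))
    done
    return 0
}

wait_gone() {
    path=$1
    left=${2:-30}
    while [ -e "$path" ]; do
        [ "$left" -gt 0 ] || return 1
        sleep 1
        left=$((left - 1))
    done
    return 0
}
```

After pulling the disk, wait_gone /dev/sdd 60 tells you whether the
kernel has dropped it yet; after inserting the new one, wait_present
/dev/sdd 60 does the reverse. /var/log/messages is still the place to
see what actually happened.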

I do this in a system with 4 swappable SATA disks. I think it requires a
kernel newer than 2.6.17. Otherwise you need to update the libata in
your kernel to a newer version. Hotswapping (Intel ICHx at least) did
not really work in kernels older than 2.6.17.

I do not know about hot swapping on other hardware. My understanding is
that on most hardware it works OK. Only the Intel ICHx stuff is
problematic, due to some odd hardware quirks.

One more thing: I have a udev rule so my SATA disks have the same kernel
name when they are put in the same physical slot (I have 4 removable
SATA disk bays). If you do not have this, sdd can become sde (and so
on) after a swap. /var/log/messages tells what the inserted disk is
known as. Perhaps it is as simple as that.

-- 
Roger Oberholtzer

OPQ Systems / Ramböll RST

Ramböll Sverige AB
Kapellgränd 7
P.O. Box 4205
SE-102 65 Stockholm, Sweden

Tel: Int +46 8-615 60 20
Fax: Int +46 8-31 42 23




Re: [opensuse] Re-adding RAID drives

2007-06-25 Thread Carlos E. R.

The Sunday 2007-06-24 at 23:06 -0400, James Knott wrote:


> netfinity:~ # mdadm --add /dev/md0 /dev/sdd2
> mdadm: add new device failed for /dev/sdd2 as 4: Invalid argument
> 
> If I now reboot, I'll be able to add the drive, using the same command,
> like so.

I would say that the kernel does not support hot swapping of disks. In
fact, I read some time ago that it didn't and that it was a work in
progress. I don't know what the current state is.

-- 
Cheers,
   Carlos E. R.





Re: [opensuse] Re-adding RAID drives

2007-06-24 Thread James Knott
Rui Santos wrote:
> James Knott wrote:
>   
>> Rui Santos wrote:
>> 
>>> Rui Santos wrote:
>>>> James Knott wrote:
>>>>> I've now got SUSE 10.2 set up with RAID and LVM on my server.  However,
>>>>> I don't seem to be able to re-add a "failed" drive, without rebooting. 
>>>>> The drives are hot-swapable.  When I use the command mdadm  /dev/md0/
>>>>> -add /dev/sdxx I get a "device busy" error message.  Even removing the
>>>>> drive with the --remove option, before the add command doesn't help.  I
>>>>> still have to reboot to add the drive.  Is there something I'm missing?
>>>>>
>>>> If a drive/partition is marked as "failed", you need to remove it from
>>>> the RAID first with:
>>>>
>>> Sorry - forgot this:
>>>
>>> mdadm --manage --set-faulty /dev/mdx /dev/sdxx
>>>
>>>> mdadm --manage --remove /dev/mdx /dev/sdxx
>>>>
>>>> Then you can add it again with:
>>>>
>>>> mdadm --manage --add /dev/mdx /dev/sdxx
>>>>
>>>> Hope it helps.
>>>>
>>>>> tnx jk
>>>>>
>> If I go through the --set-faulty, --fail, --add sequence from the
>> command line, I have no problem adding the drive back.  However, if I
>> simulate a drive failure by pulling the drive, that sequence fails with
>> the error message "mdadm: add new device failed for /dev/sdd2 as 4:
>> Invalid argument".  If I then reboot the computer, I can then use --add
>> to add the drive again.  So, there appears to be some difference between
>> using commands to remove a drive and an actual hardware failure.
>> 
>
> The --set-faulty and --fail options are the same... if you say you can
> execute a --set-faulty, then re-add the device, that is new for me.
>
> About you pulling out a hot-swap device, the device should be considered
> failed at that time. Before you add the drive back into the slot, do you
>  use the --remove option on the already removed drive...at that time it
> should still be a part of the RAID but, in faulty mode. You have to firt
> remove it by issuing 'mdadm --manage --remove /dev/mdx /dev/sdxx'. Have
> you done this?
>
> Only then you're able to plug the device back in and re-add the device.
>
> At least that's how I use it... never tryed on hot-swap though, but the
> --set-faulty is supposed to do just that.
>
> There's one other issue: The kernel driver of the device you use should
> be able to disconnect and re-connect the device cleanly. Check 'dmesg'
> to see if it happens as it should...
>
>
>   

After I unplugged the drive, dmesg shows this.

raid5: Disk failure on sdd2, disabling device. Operation continuing on 3
devices
RAID5 conf printout:
 --- rd:4 wd:3 fd:1
 disk 0, o:1, dev:sda2
 disk 1, o:1, dev:sdb2
 disk 2, o:1, dev:sdc2
 disk 3, o:0, dev:sdd2
RAID5 conf printout:
 --- rd:4 wd:3 fd:1
 disk 0, o:1, dev:sda2
 disk 1, o:1, dev:sdb2
 disk 2, o:1, dev:sdc2
md: unbind
md: export_rdev(sdd2)

At this point, I run the remove option.

Then, after reinserting the drive, dmesg shows this, even though the
drives are on channel B.

scsi1: Someone reset channel A
[the same line appears 28 times in total]

Then when adding I get this

netfinity:~ # mdadm --add /dev/md0 /dev/sdd2
mdadm: add new device failed for /dev/sdd2 as 4: Invalid argument

If I now reboot, I'll be able to add the drive, using the same command,
like so.

netfinity:~ # mdadm --add /dev/md0 /dev/sdd2
mdadm: re-added /dev/sdd2
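
For what it's worth, the failed member can be picked out of a printout
like the RAID5 one above mechanically. A small sketch, assuming the
"disk N, o:F, dev:NAME" line format shown in this dmesg output (other
kernel versions may print it differently) and assuming that o:0 marks a
member as not operational; failed_members is a made-up name.

```shell
# failed_members
# Read a "RAID5 conf printout" on stdin and print the device names of
# members listed with o:0. Hypothetical helper for illustration only.
failed_members() {
    # Keep only "disk N, o:0, dev:NAME" lines and print NAME once each.
    sed -n 's/^ *disk [0-9][0-9]*, o:0, dev:\([^ ]*\).*$/\1/p' | sort -u
}
```

With the output above, dmesg | failed_members should print sdd2.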


-- 
Use OpenOffice.org 



Re: [opensuse] Re-adding RAID drives

2007-06-24 Thread Rui Santos

James Knott wrote:
> Rui Santos wrote:
>> Rui Santos wrote:
>>> James Knott wrote:
>>>> I've now got SUSE 10.2 set up with RAID and LVM on my server.  However,
>>>> I don't seem to be able to re-add a "failed" drive, without rebooting. 
>>>> The drives are hot-swapable.  When I use the command mdadm  /dev/md0/
>>>> -add /dev/sdxx I get a "device busy" error message.  Even removing the
>>>> drive with the --remove option, before the add command doesn't help.  I
>>>> still have to reboot to add the drive.  Is there something I'm missing?
>>>>
>>> If a drive/partition is marked as "failed", you need to remove it from
>>> the RAID first with:
>>>
>> Sorry - forgot this:
>>
>> mdadm --manage --set-faulty /dev/mdx /dev/sdxx
>>
>>> mdadm --manage --remove /dev/mdx /dev/sdxx
>>>
>>> Then you can add it again with:
>>>
>>> mdadm --manage --add /dev/mdx /dev/sdxx
>>>
>>> Hope it helps.
>>>
>>>> tnx jk
>>>>
> If I go through the --set-faulty, --fail, --add sequence from the
> command line, I have no problem adding the drive back.  However, if I
> simulate a drive failure by pulling the drive, that sequence fails with
> the error message "mdadm: add new device failed for /dev/sdd2 as 4:
> Invalid argument".  If I then reboot the computer, I can then use --add
> to add the drive again.  So, there appears to be some difference between
> using commands to remove a drive and an actual hardware failure.

The --set-faulty and --fail options are the same... if you say you can
execute a --set-faulty, then re-add the device, that is new for me.

About pulling out a hot-swap device: the device should be considered
failed at that point. Before you put the drive back into the slot, do you
use the --remove option on the already removed drive? At that time it
should still be part of the RAID, but in faulty mode. You have to first
remove it by issuing 'mdadm --manage --remove /dev/mdx /dev/sdxx'. Have
you done this?

Only then are you able to plug the device back in and re-add it.

At least that's how I use it... I have never tried it on hot-swap, though,
but --set-faulty is supposed to do just that.

There's one other issue: The kernel driver of the device you use should
be able to disconnect and re-connect the device cleanly. Check 'dmesg'
to see if it happens as it should...


> 
> 
> 

-- 
Rui Santos
http://www.ruisantos.com/

Veni, vidi, Linux!



Re: [opensuse] Re-adding RAID drives

2007-06-24 Thread James Knott
Rui Santos wrote:
> Rui Santos wrote:
>   
>> James Knott wrote:
>> 
>>> I've now got SUSE 10.2 set up with RAID and LVM on my server.  However,
>>> I don't seem to be able to re-add a "failed" drive, without rebooting. 
>>> The drives are hot-swapable.  When I use the command mdadm  /dev/md0/
>>> -add /dev/sdxx I get a "device busy" error message.  Even removing the
>>> drive with the --remove option, before the add command doesn't help.  I
>>> still have to reboot to add the drive.  Is there something I'm missing?
>>>   
>> If a drive/partition is marked as "failed", you need to remove it from
>> the RAID first with:
>> 
>
> Sorry - forgot this:
>
> mdadm --manage --set-faulty /dev/mdx /dev/sdxx
>
>   
>> mdadm --manage --remove /dev/mdx /dev/sdxx
>>
>> Then you can add it again with:
>>
>> mdadm --manage --add /dev/mdx /dev/sdxx
>>
>> Hope it helps.
>>
>> 
>>> tnx jk
>>>
>>>   
>
>   
If I go through the --set-faulty, --fail, --add sequence from the
command line, I have no problem adding the drive back.  However, if I
simulate a drive failure by pulling the drive, that sequence fails with
the error message "mdadm: add new device failed for /dev/sdd2 as 4:
Invalid argument".  If I then reboot the computer, I can then use --add
to add the drive again.  So, there appears to be some difference between
using commands to remove a drive and an actual hardware failure.



-- 
Use OpenOffice.org 



Re: [opensuse] Re-adding RAID drives

2007-06-24 Thread Rui Santos


Rui Santos wrote:
> James Knott wrote:
>> I've now got SUSE 10.2 set up with RAID and LVM on my server.  However,
>> I don't seem to be able to re-add a "failed" drive, without rebooting. 
>> The drives are hot-swapable.  When I use the command mdadm  /dev/md0/
>> -add /dev/sdxx I get a "device busy" error message.  Even removing the
>> drive with the --remove option, before the add command doesn't help.  I
>> still have to reboot to add the drive.  Is there something I'm missing?
> 
> If a drive/partition is marked as "failed", you need to remove it from
> the RAID first with:

Sorry - forgot this:

mdadm --manage --set-faulty /dev/mdx /dev/sdxx

> 
> mdadm --manage --remove /dev/mdx /dev/sdxx
> 
> Then you can add it again with:
> 
> mdadm --manage --add /dev/mdx /dev/sdxx
> 
> Hope it helps.
> 
>> tnx jk
>>
> 

-- 
Rui Santos
http://www.ruisantos.com/

Veni, vidi, Linux!



Re: [opensuse] Re-adding RAID drives

2007-06-24 Thread Rui Santos
James Knott wrote:
> I've now got SUSE 10.2 set up with RAID and LVM on my server.  However,
> I don't seem to be able to re-add a "failed" drive, without rebooting. 
> The drives are hot-swapable.  When I use the command mdadm  /dev/md0/
> -add /dev/sdxx I get a "device busy" error message.  Even removing the
> drive with the --remove option, before the add command doesn't help.  I
> still have to reboot to add the drive.  Is there something I'm missing?

If a drive/partition is marked as "failed", you need to remove it from
the RAID first with:

mdadm --manage --remove /dev/mdx /dev/sdxx

Then you can add it again with:

mdadm --manage --add /dev/mdx /dev/sdxx

Hope it helps.
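
Together with marking the device faulty first (see the follow-up in this
thread), the sequence could be wrapped roughly like this. It is a
sketch, not a tested recipe: readd_member, _run and DRY_RUN are made-up
names, and by default the function only prints the mdadm commands
instead of running them.

```shell
# _run CMD...
# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
_run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# readd_member MD_DEVICE MEMBER
# Mark MEMBER faulty, remove it from MD_DEVICE, then add it back.
# Stops at the first step that fails. Hypothetical helper name.
readd_member() {
    md=$1
    member=$2
    # --fail is the synonym of --set-faulty.
    _run mdadm --manage "$md" --fail "$member" &&
    _run mdadm --manage "$md" --remove "$member" &&
    _run mdadm --manage "$md" --add "$member"
}
```

readd_member /dev/md0 /dev/sdd2 prints the three commands; setting
DRY_RUN=0 first runs them for real (as root). As reported elsewhere in
this thread, the sequence worked for a failure set from the command line
but not after the drive was physically pulled.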

> 
> tnx jk
> 

-- 
Rui Santos
http://www.ruisantos.com/

Veni, vidi, Linux!



[opensuse] Re-adding RAID drives

2007-06-24 Thread James Knott
I've now got SUSE 10.2 set up with RAID and LVM on my server.  However,
I don't seem to be able to re-add a "failed" drive without rebooting.
The drives are hot-swappable.  When I use the command
mdadm /dev/md0 --add /dev/sdxx, I get a "device busy" error message.  Even
removing the drive with the --remove option before the add command
doesn't help.  I still have to reboot to add the drive.  Is there
something I'm missing?

tnx jk

-- 
Use OpenOffice.org 