Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow
I'm sorry- is this an inappropriate list to ask for help?  There seemed 
to be a fair amount of that when I searched the archives, but I don't 
want to bug developers with my problems!


Please let me know if I should find another place to ask for help (and 
please let me know where that might be!).


Thanks!

Jed Davidow wrote:
I have a RAID5 (5+1 spare) setup that works perfectly well until I 
reboot.  I have 6 drives (two different models) partitioned to give me 
2 arrays, md0 and md1, that I use for /home and /var respectively.


When I reboot, the system assembles each array, but swaps out what was 
the spare with one of the member drives.  It then immediately detects 
a degraded array and rebuilds.  After that, all is fine and testing 
has shown things to be working like they should.  Until I reboot.


Example:
Built two arrays:  /dev/md0 - /dev/sd[abcef]1 and /dev/md1 - 
/dev/sd[abcef]2

Added /dev/sdg1 and /dev/sdg2 as spares, and this works.

One scenario when I reboot:
   md0 is assembled from sd[abceg]1; it's degraded and reports a 
spares missing event.

   md1 assembles correctly, spare is not missing

Any ideas?  I have asked about this on various boards (some said UDEV 
rules would help, some thought the issue had to do with the /dev/sdX 
names changing, etc).  I don't think those are applicable since dmesg 
reports the arrays assemble as soon as the disks are detected.



Thanks in advance!


[INFO: fdisk -l and mdadm --detail output snipped - identical to the listing in the original post below]

Re: md rotates RAID5 spare at boot

2008-01-10 Thread Bill Davidsen

Jed Davidow wrote:
[problem description snipped - quoted in full above]

One scenario when I reboot:
   md0 is assembled from sd[abceg]1; it's degraded and reports a 
spares missing event.

   md1 assembles correctly, spare is not missing

I'm looking at the dmesg which follows and seeing md1 reconstructing. 
This seems to be at variance with "assembles correctly" here. That's the 
only thing which has struck me as worth mentioning so far.
[remainder of quoted message snipped]

Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow

Hi Bill,

Maybe I'm using the wrong words...
In this instance, on the previous boot, md1 was assembled from 
sd[efbac]2 and sdg2 was the spare.  When I rebooted, it assembled from 
sd[efbgc]2 and had no spare (it appears that sdg was swapped in for sda).  
Since sdg2 had been the spare, the array is degraded and it rebuilds.  I 
suppose this would make sense if sda2 had been compromised during the 
shutdown (although nothing reports sda2 as faulty - I can manually 
re-add it immediately).  But this happens just about every time I 
reboot: sometimes to only one of the two arrays, sometimes with the 
corresponding partitions on both arrays, and sometimes with different 
partitions on each array.


If something was physically wrong with one of the drives, I would expect 
it to swap in the spare for that drive each time.  But it seems to swap 
in the spare randomly.


Note - last night I shut down completely, restarted after 30 seconds, and 
for the first time in a while did not have an issue.  This time the 
drives were recognized and assigned device nodes in the 'correct' order 
(MB controller first, PCI controller next).  Would device node 
assignments have any effect on how the array was being assembled?
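
md identifies array members by the UUID recorded in each superblock rather 
than by the /dev/sdX node name, so node reordering by itself shouldn't change 
which partition is treated as the spare.  For reference, a minimal 
/etc/mdadm/mdadm.conf that pins the arrays regardless of scan order looks 
roughly like this (the UUIDs here are placeholders, not the real ones):

   # /etc/mdadm/mdadm.conf (sketch; substitute the UUIDs printed by
   # "sudo mdadm --detail /dev/md0 | grep UUID", and likewise for md1)
   DEVICE partitions
   ARRAY /dev/md0 UUID=00000000:00000000:00000000:00000000
   ARRAY /dev/md1 UUID=00000000:00000000:00000000:00000000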


It looks to me like md inspects and attempts to assemble after each 
drive controller is scanned (from dmesg, there appears to be a failed 
bind on the first three devices after they are scanned, and then again 
when the second controller is scanned).  Would the scan order cause a 
spare to be swapped in?
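
One way to see which role each superblock actually recorded at the last 
shutdown (and so whether the spare designation itself is moving around) is to 
dump the superblocks; a sketch, with example device names:

   # --examine prints the superblock of a member partition, including
   # the Events counter and the device/role table for the array.
   $ sudo mdadm --examine /dev/sda2
   $ sudo mdadm --examine /dev/sdg2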



Bill Davidsen wrote:

[quoted message snipped]

this goes for my megaraid probs too, was: Re: md rotates RAID5 spare at boot

2008-01-10 Thread Eric S. Johansson

Jed Davidow wrote:
I'm sorry- is this an inappropriate list to ask for help?  There seemed 
to be a fair amount of that when I searched the archives, but I don't 
want to bug developers with my problems!


Please let me know if I should find another place to ask for help (and 
please let me know where that might be!).


I could also use help with my MegaRAID 150 question.  Don't know if I asked it 
wrong or it was the color shirt I was wearing.  I am unfortunately running with 
such a dearth of knowledge on the topic that I don't really know the right 
questions to ask when diagnosing a performance problem.  All I know is that 
there's very little documentation on this card, and even less documentation on 
the command-line tool to access/control the card.  If it makes the most sense, 
I am perfectly willing to deep-six the card on eBay and pick up a couple of 
reasonable-speed serial ATA controller cards in its stead.


The only reason I want to try to learn more about the hardware RAID is that 
the problems I'm experiencing with my virtual machines on this platform mimic 
problems a customer of mine is experiencing, and if I can fix them just by 
changing how the RAID controller uses the disks, that is a huge win. 
Personally, I think it's something a little deeper, because VMware Server seems 
to go out to lunch whenever there is a backlog in the disk I/O queue.  I'm 
seriously thinking about picking up ESX as soon as the budget allows.


I just need some good solid advice on what path I should take.

---eric


--
Speech-recognition in use.  It makes mistakes, I correct some.


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 
 It looks to me like md inspects and attempts to assemble after each 
 drive controller is scanned (from dmesg, there appears to be a failed 
 bind on the first three devices after they are scanned, and then again 
 when the second controller is scanned).  Would the scan order cause a 
 spare to be swapped in?
 

This suggests that mdadm --incremental is being used to assemble the
arrays.  Every time udev finds a new device, it gets added to
whichever array it should be in.
If it is called as mdadm --incremental --run, then it will get
started as soon as possible, even if it is degraded.  Without the
--run, it will wait until all devices are available.

Even with mdadm --incremental --run, you shouldn't get a resync if
the last device is added before the array is written to.
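
For reference, udev-driven incremental assembly boils down to one mdadm call 
per newly discovered device, something like this (the device name is only an 
example):

   # udev hands each new block device to mdadm; --incremental (-I) adds it
   # to whichever array its superblock belongs to, and --run lets that
   # array start even while it is still degraded.
   /sbin/mdadm --incremental --run /dev/sdb1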

What distro are you running?
What does
   grep -R mdadm /etc/udev

show?

NeilBrown


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow

distro: Ubuntu 7.10

Two files show up...

85-mdadm.rules:
# This file causes block devices with Linux RAID (mdadm) signatures to
# automatically cause mdadm to be run.
# See udev(8) for syntax

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
   RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"



65-mdadm.vol_id.rules:
# This file causes Linux RAID (mdadm) block devices to be checked for
# further filesystems if the array is active.
# See udev(8) for syntax

SUBSYSTEM!="block", GOTO="mdadm_end"
KERNEL!="md[0-9]*", GOTO="mdadm_end"
ACTION!="add|change", GOTO="mdadm_end"

# Check array status
ATTR{md/array_state}=="|clear|inactive", GOTO="mdadm_end"

# Obtain array information
IMPORT{program}="/sbin/mdadm --detail --export $tempnode"
ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}"
ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}"

# by-uuid and by-label symlinks
IMPORT{program}="vol_id --export $tempnode"
OPTIONS="link_priority=-100"
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", \
   SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", \
   SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"


I see.  So udev is invoking the assemble command as soon as it detects 
the devices.  So is it possible that the spare is not the last drive to 
be detected and mdadm assembles too soon?
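
A quick way to check how an array actually came up on a given boot, before the 
rebuild masks it, is something like this (a sketch):

   # A missing member shows up as an underscore in the [UUUU_] status,
   # and --detail reports the array state and spare count directly.
   $ cat /proc/mdstat
   $ sudo mdadm --detail /dev/md0 | egrep 'State|Devices'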



Neil Brown wrote:

[quoted message snipped]


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow
One quick question about those rules.  The 65-mdadm rule looks like it 
checks ACTIVE arrays for filesystems, and the 85 rule assembles arrays.  
Shouldn't they run in the other order?





[earlier messages quoted in full - snipped]


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 distro: Ubuntu 7.10
 
 Two files show up...
 
 85-mdadm.rules:
 # This file causes block devices with Linux RAID (mdadm) signatures to
 # automatically cause mdadm to be run.
 # See udev(8) for syntax
 
 SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
 RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"

 
 I see.  So udev is invoking the assemble command as soon as it detects 
 the devices.  So is it possible that the spare is not the last drive to 
 be detected and mdadm assembles too soon?

The '--no-degraded' should stop it from assembling until all expected
devices have been found.  It could assemble before the spare is found,
but should not assemble before all the data devices have been found.

The dmesg trace you included in your first mail doesn't actually
show anything wrong - it never starts an incomplete array.
Can you try again and get a trace where there definitely is a rebuild
happening?
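
Something along these lines, run right after a boot on which a rebuild starts, 
should capture a usable trace (a sketch):

   # Collect the md/raid kernel messages plus the array status while
   # the rebuild is still running.
   $ dmesg | egrep 'md:|raid' > md-trace.txt
   $ cat /proc/mdstat >> md-trace.txt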

And please don't drop linux-raid from the 'cc' list.

NeilBrown


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 One quick question about those rules.  The 65-mdadm rule looks like it 
 checks ACTIVE arrays for filesystems, and the 85 rule assembles arrays.  
 Shouldn't they run in the other order?
 

They are fine.  The '65' rule applies to arrays.  I.e. it fires on an
array device once it has been started.
The '85' rule applies to component devices.

They are quite independent.
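
In other words, the two rules key on different devices: the 85 rule matches 
component partitions such as /dev/sdb1 (via ID_FS_TYPE=linux_raid*), while the 
65 rule matches the md device itself and only acts once it is running.  The 
attribute it checks can be read directly, for example:

   # Possible values include clear, inactive, readonly, clean and active;
   # the 65 rule bails out on "" / clear / inactive.
   $ cat /sys/block/md0/md/array_state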

NeilBrown


 
 
 
[earlier message quoted in full - snipped]


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow

(Sorry- yes it looks like I posted an incorrect dmesg extract)

$ egrep "sd|md|raid|scsi" /var/log/dmesg.0
[   36.112449] md: linear personality registered for level -1
[   36.117197] md: multipath personality registered for level -4
[   36.121795] md: raid0 personality registered for level 0
[   36.126950] md: raid1 personality registered for level 1
[   36.131424] raid5: automatically using best checksumming function: 
pIII_sse

[   36.150020] raid5: using function: pIII_sse (4564.000 MB/sec)
[   36.218015] raid6: int32x1    780 MB/s
[   36.285943] raid6: int32x2    902 MB/s
[   36.353961] raid6: int32x4    667 MB/s
[   36.421869] raid6: int32x8    528 MB/s
[   36.489811] raid6: mmxx1     1813 MB/s
[   36.557775] raid6: mmxx2     2123 MB/s
[   36.625763] raid6: sse1x1    1101 MB/s
[   36.693717] raid6: sse1x2    1898 MB/s
[   36.761688] raid6: sse2x1    2227 MB/s
[   36.829647] raid6: sse2x2    3178 MB/s
[   36.829695] raid6: using algorithm sse2x2 (3178 MB/s)
[   36.829744] md: raid6 personality registered for level 6
[   36.829793] md: raid5 personality registered for level 5
[   36.829842] md: raid4 personality registered for level 4
[   36.853475] md: raid10 personality registered for level 10
[   37.781513] scsi0 : sata_sil
[   37.781628] scsi1 : sata_sil
[   37.781724] scsi2 : sata_sil
[   37.781820] scsi3 : sata_sil
[   37.781922] ata1: SATA max UDMA/100 cmd 0xf88c0080 ctl 0xf88c008a 
bmdma 0xf88c irq 20
[   37.781997] ata2: SATA max UDMA/100 cmd 0xf88c00c0 ctl 0xf88c00ca 
bmdma 0xf88c0008 irq 20
[   37.782069] ata3: SATA max UDMA/100 cmd 0xf88c0280 ctl 0xf88c028a 
bmdma 0xf88c0200 irq 20
[   37.782142] ata4: SATA max UDMA/100 cmd 0xf88c02c0 ctl 0xf88c02ca 
bmdma 0xf88c0208 irq 20
[   39.577812] scsi 0:0:0:0: Direct-Access ATA  WDC WD2500JD-00H 
08.0 PQ: 0 ANSI: 5
[   39.578027] scsi 1:0:0:0: Direct-Access ATA  Maxtor 7L250S0   
BACE PQ: 0 ANSI: 5
[   39.578234] scsi 3:0:0:0: Direct-Access ATA  Maxtor 7L250S0   
BACE PQ: 0 ANSI: 5

[   39.632483] scsi4 : ata_piix
[   39.632591] scsi5 : ata_piix
[   39.632812] ata5: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 
bmdma 0x0001f000 irq 14
[   39.634522] ata6: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 
bmdma 0x0001f008 irq 15
[   39.634924] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors 
(250059 MB)

[   39.634995] sd 0:0:0:0: [sda] Write Protect is off
[   39.635048] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   39.635076] sd 0:0:0:0: [sda] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA
[   39.635218] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors 
(250059 MB)

[   39.635292] sd 0:0:0:0: [sda] Write Protect is off
[   39.635350] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   39.635380] sd 0:0:0:0: [sda] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

[   39.635462]  sda: sda1 sda2
[   39.650092] sd 0:0:0:0: [sda] Attached SCSI disk
[   39.650226] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.650296] sd 1:0:0:0: [sdb] Write Protect is off
[   39.650348] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   39.650379] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: 
enabled, doesn't support DPO or FUA
[   39.650505] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.650573] sd 1:0:0:0: [sdb] Write Protect is off
[   39.650625] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   39.650657] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: 
enabled, doesn't support DPO or FUA

[   39.650727]  sdb: sdb1 sdb2
[   39.667599] sd 1:0:0:0: [sdb] Attached SCSI disk
[   39.667719] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.667788] sd 3:0:0:0: [sdc] Write Protect is off
[   39.667840] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[   39.667871] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA
[   39.667997] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.668064] sd 3:0:0:0: [sdc] Write Protect is off
[   39.668116] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[   39.668146] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

[   39.668213]  sdc: sdc1 sdc2
[   39.692703] sd 3:0:0:0: [sdc] Attached SCSI disk
[   39.699348] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   39.699570] sd 1:0:0:0: Attached scsi generic sg1 type 0
[   39.699786] sd 3:0:0:0: Attached scsi generic sg2 type 0
[   39.834560] md: md0 stopped.
[   39.870361] md: bind<sdc1>
[   39.870527] md: md1 stopped.
[   39.910999] md: md0 stopped.
[   39.911064] md: unbind<sdc1>
[   39.911120] md: export_rdev(sdc1)
[   39.929760] md: bind<sda1>
[   39.929953] md: bind<sdc1>
[   39.930139] md: bind<sdb1>
[   39.930231] md: md1 stopped.
[   39.932468] md: bind<sdc2>
[   39.932674] md: bind<sda2>
[   39.932860] md: bind<sdb2>
[   40.411001] scsi 4:0:1:0: CD-ROMLITE-ON  DVDRW SOHW-1213S 
TS09 PQ: 0 ANSI: 5

[   40.411152] scsi 4:0:1:0: Attached 

Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 (Sorry- yes it looks like I posted an incorrect dmesg extract)

This still doesn't seem to match your description.
I see:

 [   41.247389] md: bind<sdf1>
 [   41.247584] md: bind<sdb1>
 [   41.247787] md: bind<sda1>
 [   41.247971] md: bind<sdc1>
 [   41.248151] md: bind<sdg1>
 [   41.248325] md: bind<sde1>
 [   41.256718] raid5: device sde1 operational as raid disk 0
 [   41.256771] raid5: device sdc1 operational as raid disk 4
 [   41.256821] raid5: device sda1 operational as raid disk 3
 [   41.256870] raid5: device sdb1 operational as raid disk 2
 [   41.256919] raid5: device sdf1 operational as raid disk 1
 [   41.257426] raid5: allocated 5245kB for md0
 [   41.257476] raid5: raid level 5 set md0 active with 5 out of 5 
 devices, algorithm 2

which looks like 'md0' started with 5 of 5 drives, plus g1 is there as
a spare.  And

 [   41.312250] md: bind<sdf2>
 [   41.312476] md: bind<sdb2>
 [   41.312711] md: bind<sdg2>
 [   41.312922] md: bind<sdc2>
 [   41.313138] md: bind<sda2>
 [   41.313343] md: bind<sde2>
 [   41.313452] md: md1: raid array is not clean -- starting background 
 reconstruction
 [   41.322189] raid5: device sde2 operational as raid disk 0
 [   41.322243] raid5: device sdc2 operational as raid disk 4
 [   41.322292] raid5: device sdg2 operational as raid disk 3
 [   41.322342] raid5: device sdb2 operational as raid disk 2
 [   41.322391] raid5: device sdf2 operational as raid disk 1
 [   41.322823] raid5: allocated 5245kB for md1
 [   41.322872] raid5: raid level 5 set md1 active with 5 out of 5 
 devices, algorithm 2

md1 also assembled with 5/5 drives and sda2 as a spare.
This one was not shut down cleanly, so it started a resync.  But there
is no evidence of anything starting degraded.
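
If it happens again, comparing what each superblock recorded at shutdown 
should show whether one member really went stale; roughly (device names as in 
this setup):

   # Members of a cleanly stopped array share the same Events count and
   # show "State : clean"; a member with a lower count is typically the
   # one that gets kicked out and re-synced at the next assembly.
   $ for d in /dev/sd[abcefg]2; do sudo mdadm --examine $d | egrep 'Events|State'; done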



NeilBrown


md rotates RAID5 spare at boot

2008-01-09 Thread Jed Davidow
I have a RAID5 (5+1 spare) setup that works perfectly well until I 
reboot.  I have 6 drives (two different models) partitioned to give me 2 
arrays, md0 and md1, that I use for /home and /var respectively.


When I reboot, the system assembles each array, but swaps out what was 
the spare with one of the member drives.  It then immediately detects a 
degraded array and rebuilds.  After that, all is fine and testing has 
shown things to be working like they should.  Until I reboot.


Example:
Built two arrays:  /dev/md0 - /dev/sd[abcef]1 and /dev/md1 - 
/dev/sd[abcef]2

Added /dev/sdg1 and /dev/sdg2 as spares, and this works.

One scenario when I reboot:
   md0 is assembled from sd[abceg]1; it's degraded and reports a 
spares missing event.

   md1 assembles correctly, spare is not missing

Any ideas?  I have asked about this on various boards (some said UDEV 
rules would help, some thought the issue had to do with the /dev/sdX 
names changing, etc).  I don't think those are applicable since dmesg 
reports the arrays assemble as soon as the disks are detected.



Thanks in advance!


INFO:
(currently the boot drive (non raid) is sdd, otherwise all sd devices 
are part of the raid)


fdisk:

   $ sudo fdisk -l

   Disk /dev/sda: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

  Device Boot  Start End  Blocks   Id  System
   /dev/sda1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sda2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdb: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

  Device Boot  Start End  Blocks   Id  System
   /dev/sdb1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdb2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdc: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

  Device Boot  Start End  Blocks   Id  System
   /dev/sdc1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdc2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/md0: 50.0 GB, 50041978880 bytes
   2 heads, 4 sectors/track, 12217280 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md0 doesn't contain a valid partition table

   Disk /dev/md1: 950.1 GB, 950183919616 bytes
   2 heads, 4 sectors/track, 231978496 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md1 doesn't contain a valid partition table

   Disk /dev/sdd: 120.0 GB, 120034123776 bytes
   255 heads, 63 sectors/track, 14593 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x535bfd7a

  Device Boot  Start End  Blocks   Id  System
   /dev/sdd1   *           1       14219   114214086   83  Linux
   /dev/sdd2           14220       14593     3004155    5  Extended
   /dev/sdd5           14220       14593     3004123+  82  Linux swap / Solaris

   Disk /dev/sde: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

  Device Boot  Start End  Blocks   Id  System
   /dev/sde1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sde2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdf: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

  Device Boot  Start End  Blocks   Id  System
   /dev/sdf1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdf2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdg: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

  Device Boot  Start End  Blocks   Id  System
   /dev/sdg1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdg2            1522       30401   231978600   fd  Linux raid autodetect


$ sudo mdadm --detail /dev/md0 (md1 shows similar info)

   /dev/md0:
   Version : 00.90.03
 Creation Time : Sat Apr  7 23:32:58 2007
Raid Level : raid5
Array Size : 48869120 (46.61 GiB 50.04 GB)
 Used Dev Size : 12217280 (11.65 GiB 12.51 GB)
  Raid Devices : 5
 Total Devices : 6
   Preferred Minor : 0
   Persistence