Re: md rotates RAID5 spare at boot
I'm sorry- is this an inappropriate list to ask for help? There seemed to be a fair amount of that when I searched the archives, but I don't want to bug developers with my problems! Please let me know if I should find another place to ask for help (and please let me know where that might be!). Thanks!

Jed Davidow wrote:
I have a RAID5 (5+1 spare) setup that works perfectly well until I reboot. I have 6 drives (two different models) partitioned to give me 2 arrays, md0 and md1, that I use for /home and /var respectively. When I reboot, the system assembles each array, but swaps out what was the spare with one of the member drives. It then immediately detects a degraded array and rebuilds. After that, all is fine and testing has shown things to be working like they should. Until I reboot.

Example: Built two arrays: /dev/md0 - /dev/sd[abcef]1 and /dev/md1 - /dev/sd[abcef]2. Added /dev/sdg1 and /dev/sdg2 as spares, and this works. One scenario when I reboot: md0 is assembled from sd[abceg]1; it's degraded and reports a spares-missing event. md1 assembles correctly, spare is not missing.

Any ideas? I have asked about this on various boards (some said UDEV rules would help, some thought the issue had to do with the /dev/sdX names changing, etc). I don't think those are applicable since dmesg reports the arrays assemble as soon as the disks are detected. Thanks in advance!
INFO: (currently the boot drive (non-raid) is sdd; otherwise all sd devices are part of the raid)

fdisk:

$ sudo fdisk -l

Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        1521    12217401   fd  Linux raid autodetect
/dev/sda2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdb: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdb2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdc: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdc2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/md0: 50.0 GB, 50041978880 bytes
2 heads, 4 sectors/track, 12217280 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md1: 950.1 GB, 950183919616 bytes
2 heads, 4 sectors/track, 231978496 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x

Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/sdd: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x535bfd7a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1   *           1       14219   114214086   83  Linux
/dev/sdd2           14220       14593     3004155    5  Extended
/dev/sdd5           14220       14593     3004123+  82  Linux swap / Solaris

Disk /dev/sde: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1        1521    12217401   fd  Linux raid autodetect
/dev/sde2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdf: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdf2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdg: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdg2            1522       30401   231978600   fd  Linux raid autodetect

$ sudo mdadm --detail /dev/md0
Re: md rotates RAID5 spare at boot
Jed Davidow wrote:
I have a RAID5 (5+1 spare) setup that works perfectly well until I reboot. I have 6 drives (two different models) partitioned to give me 2 arrays, md0 and md1, that I use for /home and /var respectively. When I reboot, the system assembles each array, but swaps out what was the spare with one of the member drives. It then immediately detects a degraded array and rebuilds. After that, all is fine and testing has shown things to be working like they should. Until I reboot.

Example: Built two arrays: /dev/md0 - /dev/sd[abcef]1 and /dev/md1 - /dev/sd[abcef]2. Added /dev/sdg1 and /dev/sdg2 as spares, and this works. One scenario when I reboot: md0 is assembled from sd[abceg]1; it's degraded and reports a spares-missing event. md1 assembles correctly, spare is not missing.

I'm looking at the dmesg which follows and seeing md1 reconstructing. This seems to be at variance with "assembles correctly" here. That's the only thing which has struck me as worth mentioning so far.

Any ideas? I have asked about this on various boards (some said UDEV rules would help, some thought the issue had to do with the /dev/sdX names changing, etc). I don't think those are applicable since dmesg reports the arrays assemble as soon as the disks are detected. Thanks in advance!
INFO: [quoted fdisk -l output identical to the listing in the original message; trimmed]

$ sudo mdadm --detail /dev/md0
(md1 shows similar info)
/dev/md0:
        Version : 00.90.03
  Creation Time : Sat Apr 7 23:32:58
Re: md rotates RAID5 spare at boot
Hi Bill, Maybe I'm using the wrong words...

In this instance, on the previous boot, md1 was assembled from sd[efbac]2 and sdg2 was the spare. When I rebooted, it assembled from sd[efbgc]2 and had no spare (it appears that sdg was swapped in for sda). Since sdg2 had been the spare, the array is degraded and it rebuilds.

I suppose this would be the case if sda2 had been compromised during the shutdown (although I see nothing reporting sda2 as faulty- I can manually re-add it immediately). But this happens just about every time I reboot: sometimes to only one of the two arrays, sometimes with the corresponding partitions on both arrays, and sometimes with different partitions on each array. If something were physically wrong with one of the drives, I would expect it to swap in the spare for that drive each time. But it seems to swap in the spare randomly.

Note- last night I shut down completely, restarted after 30 sec, and for the first time in a while did not have an issue. This time the drives were recognized and assigned device nodes in the 'correct' order (MB controller first, PCI controller next). Would device node assignments have any effect on how the array was being assembled? It looks to me like md inspects and attempts to assemble after each drive controller is scanned (from dmesg, there appears to be a failed bind on the first three devices after they are scanned, and then again when the second controller is scanned). Would the scan order cause a spare to be swapped in?

Bill Davidsen wrote:
[earlier messages and fdisk output quoted in full; trimmed]
This goes for my megaraid probs too (was: Re: md rotates RAID5 spare at boot)
Jed Davidow wrote:
I'm sorry- is this an inappropriate list to ask for help? There seemed to be a fair amount of that when I searched the archives, but I don't want to bug developers with my problems! Please let me know if I should find another place to ask for help (and please let me know where that might be!).

I could also use help with my MegaRAID 150 question. Don't know if I asked it wrong or it was the color shirt I was wearing. I am unfortunately running with such a dearth of knowledge on the topic that I don't really know the right questions to ask when diagnosing a performance problem. All I know is that there's very little documentation on this card, there is even less documentation on the command-line tool to access/control the card, and if it makes the most sense, I am perfectly willing to deep-six the card on eBay and pick up a couple of reasonable-speed Serial ATA controller cards in its stead.

The only reason I want to try to learn more about the hardware RAID is that the problems I'm experiencing with my virtual machines on this platform mimic problems a customer of mine is experiencing, and if I can fix them just by changing how the RAID controller uses the discs, then that is a huge win. Personally, I think it's something a little deeper, because VMware Server seems to go out to lunch whenever there is a backup in the disk I/O queue. I'm seriously thinking about picking up ESX as soon as the budget allows. I just need some good solid advice on what path I should take.

---eric
-- Speech-recognition in use. It makes mistakes, I correct some.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: md rotates RAID5 spare at boot
On Thursday January 10, [EMAIL PROTECTED] wrote:
It looks to me like md inspects and attempts to assemble after each drive controller is scanned (from dmesg, there appears to be a failed bind on the first three devices after they are scanned, and then again when the second controller is scanned). Would the scan order cause a spare to be swapped in?

This suggests that mdadm --incremental is being used to assemble the arrays. Every time udev finds a new device, it gets added to whichever array it should be in. If it is called as mdadm --incremental --run, then it will get started as soon as possible, even if it is degraded. Without the --run, it will wait until all devices are available.

Even with mdadm --incremental --run, you shouldn't get a resync if the last device is added before the array is written to.

What distro are you running? What does grep -R mdadm /etc/udev show?

NeilBrown
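Neil's distinction between incremental assembly with and without --run can be sketched as a toy model. This is an illustration only, not mdadm's actual logic; the device names follow the arrays in this thread, and the timing of "start as soon as runnable" is simplified to one check per device arrival:

```python
# Toy model of udev-driven incremental assembly (NOT mdadm itself).
# Without run=True the array starts only once every expected data
# member has appeared; with run=True it may start degraded (>= n-1 of
# n members for RAID5) as soon as it is runnable, pulling in a spare
# that has already been detected -- which reproduces the symptom of a
# spare getting swapped in when the last data disk is detected late.

def incremental_assemble(arrival_order, members, spares, run=False):
    """Return (devices the array started with, degraded?)."""
    present = set()
    needed = len(members)
    for dev in arrival_order:
        present.add(dev)
        data_present = present & members
        if run and len(data_present) >= needed - 1:
            degraded = len(data_present) < needed
            started = set(data_present)
            if degraded and (present & spares):
                # a detected spare is swapped in for the missing member
                started.add(sorted(present & spares)[0])
            return sorted(started), degraded
        if not run and data_present == members:
            return sorted(data_present), False
    return sorted(present & members), len(present & members) < needed

members = {"sda1", "sdb1", "sdc1", "sde1", "sdf1"}
spares = {"sdg1"}
# Spare detected before the last data disk (second controller scanned late):
order = ["sde1", "sdf1", "sdb1", "sdg1", "sda1", "sdc1"]
print(incremental_assemble(order, members, spares, run=False))
print(incremental_assemble(order, members, spares, run=True))
```

In this toy run, run=False waits for sdc1 and starts clean, while run=True starts degraded with sdg1 in place of the not-yet-detected sdc1.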
Re: md rotates RAID5 spare at boot
distro: Ubuntu 7.10

Two files show up...

85-mdadm.rules:

# This file causes block devices with Linux RAID (mdadm) signatures to
# automatically cause mdadm to be run.
# See udev(8) for syntax
SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"

65-mdadm.vol_id.rules:

# This file causes Linux RAID (mdadm) block devices to be checked for
# further filesystems if the array is active.
# See udev(8) for syntax
SUBSYSTEM!="block", GOTO="mdadm_end"
KERNEL!="md[0-9]*", GOTO="mdadm_end"
ACTION!="add|change", GOTO="mdadm_end"
# Check array status
ATTR{md/array_state}=="|clear|inactive", GOTO="mdadm_end"
# Obtain array information
IMPORT{program}="/sbin/mdadm --detail --export $tempnode"
ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}"
ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}"
# by-uuid and by-label symlinks
IMPORT{program}="vol_id --export $tempnode"
OPTIONS="link_priority=-100"
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", \
        SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", \
        SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"

I see. So udev is invoking the assemble command as soon as it detects the devices. So is it possible that the spare is not the last drive to be detected and mdadm assembles too soon?

Neil Brown wrote:
[previous reply quoted in full; trimmed]
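The three match conditions on that 85-mdadm.rules assemble line can be modeled in a few lines of Python. This is a toy matcher, not udev; it assumes udev's '|' separates literal alternatives and that ID_FS_TYPE matching uses shell-style globs, with fnmatch standing in for udev's glob engine:

```python
# Toy model (NOT udev) of when the 85-mdadm.rules assemble command fires:
#   SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*"
import fnmatch

def rule_fires(subsystem, action, env):
    """True if all three match conditions on the rule line hold."""
    if subsystem != "block":
        return False
    if action not in ("add", "change"):   # "add|change" alternatives
        return False
    # "linux_raid*" is a glob, so e.g. "linux_raid_member" matches
    return fnmatch.fnmatch(env.get("ID_FS_TYPE", ""), "linux_raid*")

print(rule_fires("block", "add", {"ID_FS_TYPE": "linux_raid_member"}))  # fires
print(rule_fires("block", "add", {"ID_FS_TYPE": "ext3"}))               # does not
```

So the assemble command runs once per RAID-signature partition event, which is why mdadm gets invoked repeatedly as each controller's disks are detected.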
Re: md rotates RAID5 spare at boot
One quick question about those rules. The 65-mdadm rule looks like it checks ACTIVE arrays for filesystems, and the 85 rule assembles arrays. Shouldn't they run in the other order?

[previous message, udev rules, and Neil's reply quoted in full; trimmed]
Re: md rotates RAID5 spare at boot
On Thursday January 10, [EMAIL PROTECTED] wrote:
distro: Ubuntu 7.10

Two files show up...

85-mdadm.rules:
# This file causes block devices with Linux RAID (mdadm) signatures to
# automatically cause mdadm to be run.
# See udev(8) for syntax
SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"

I see. So udev is invoking the assemble command as soon as it detects the devices. So is it possible that the spare is not the last drive to be detected and mdadm assembles too soon?

The '--no-degraded' should stop it from assembling until all expected devices have been found. It could assemble before the spare is found, but should not assemble before all the data devices have been found.

The dmesg trace you included in your first mail doesn't actually show anything wrong - it never starts an incomplete array. Can you try again and get a trace where there definitely is a rebuild happening?

And please don't drop linux-raid from the 'cc' list.

NeilBrown
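A related way to see which member fell behind before a reboot is to compare the per-device event counters that mdadm --examine prints. The sketch below only relies on the "Events :" line appearing in that output; the sample superblock text is illustrative, not taken from this thread:

```python
# Sketch: extract the event counter from `mdadm --examine` output for
# each member, so stale devices (lower counts) can be spotted.  The
# sample strings below are illustrative placeholders.
import re

def events(examine_output):
    """Return the 'Events :' counter from mdadm --examine text, or None."""
    m = re.search(r"Events\s*:\s*([\d.]+)", examine_output)
    return m.group(1) if m else None

sample_a = "/dev/sda2:\n          Magic : a92b4efc\n         Events : 0.42\n"
sample_g = "/dev/sdg2:\n          Magic : a92b4efc\n         Events : 0.40\n"
print(events(sample_a), events(sample_g))  # lower count => stale superblock
```

Running this over the real --examine output of each partition would show whether sda2's superblock genuinely lagged the others at shutdown.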
Re: md rotates RAID5 spare at boot
On Thursday January 10, [EMAIL PROTECTED] wrote:
One quick question about those rules. The 65-mdadm rule looks like it checks ACTIVE arrays for filesystems, and the 85 rule assembles arrays. Shouldn't they run in the other order?

They are fine. The '65' rule applies to arrays, i.e. it fires on an array device once it has been started. The '85' rule applies to component devices. They are quite independent.

NeilBrown

[udev rules and earlier messages quoted in full; trimmed]
Re: md rotates RAID5 spare at boot
(Sorry- yes it looks like I posted an incorrect dmesg extract)

$ egrep 'sd|md|raid|scsi' /var/log/dmesg.0
[ 36.112449] md: linear personality registered for level -1
[ 36.117197] md: multipath personality registered for level -4
[ 36.121795] md: raid0 personality registered for level 0
[ 36.126950] md: raid1 personality registered for level 1
[ 36.131424] raid5: automatically using best checksumming function: pIII_sse
[ 36.150020] raid5: using function: pIII_sse (4564.000 MB/sec)
[ 36.218015] raid6: int32x1    780 MB/s
[ 36.285943] raid6: int32x2    902 MB/s
[ 36.353961] raid6: int32x4    667 MB/s
[ 36.421869] raid6: int32x8    528 MB/s
[ 36.489811] raid6: mmxx1     1813 MB/s
[ 36.557775] raid6: mmxx2     2123 MB/s
[ 36.625763] raid6: sse1x1    1101 MB/s
[ 36.693717] raid6: sse1x2    1898 MB/s
[ 36.761688] raid6: sse2x1    2227 MB/s
[ 36.829647] raid6: sse2x2    3178 MB/s
[ 36.829695] raid6: using algorithm sse2x2 (3178 MB/s)
[ 36.829744] md: raid6 personality registered for level 6
[ 36.829793] md: raid5 personality registered for level 5
[ 36.829842] md: raid4 personality registered for level 4
[ 36.853475] md: raid10 personality registered for level 10
[ 37.781513] scsi0 : sata_sil
[ 37.781628] scsi1 : sata_sil
[ 37.781724] scsi2 : sata_sil
[ 37.781820] scsi3 : sata_sil
[ 37.781922] ata1: SATA max UDMA/100 cmd 0xf88c0080 ctl 0xf88c008a bmdma 0xf88c irq 20
[ 37.781997] ata2: SATA max UDMA/100 cmd 0xf88c00c0 ctl 0xf88c00ca bmdma 0xf88c0008 irq 20
[ 37.782069] ata3: SATA max UDMA/100 cmd 0xf88c0280 ctl 0xf88c028a bmdma 0xf88c0200 irq 20
[ 37.782142] ata4: SATA max UDMA/100 cmd 0xf88c02c0 ctl 0xf88c02ca bmdma 0xf88c0208 irq 20
[ 39.577812] scsi 0:0:0:0: Direct-Access     ATA      WDC WD2500JD-00H 08.0 PQ: 0 ANSI: 5
[ 39.578027] scsi 1:0:0:0: Direct-Access     ATA      Maxtor 7L250S0   BACE PQ: 0 ANSI: 5
[ 39.578234] scsi 3:0:0:0: Direct-Access     ATA      Maxtor 7L250S0   BACE PQ: 0 ANSI: 5
[ 39.632483] scsi4 : ata_piix
[ 39.632591] scsi5 : ata_piix
[ 39.632812] ata5: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x0001f000 irq 14
[ 39.634522] ata6: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 0x0001f008 irq 15
[ 39.634924] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[ 39.634995] sd 0:0:0:0: [sda] Write Protect is off
[ 39.635048] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 39.635076] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 39.635218] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[ 39.635292] sd 0:0:0:0: [sda] Write Protect is off
[ 39.635350] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 39.635380] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 39.635462] sda: sda1 sda2
[ 39.650092] sd 0:0:0:0: [sda] Attached SCSI disk
[ 39.650226] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
[ 39.650296] sd 1:0:0:0: [sdb] Write Protect is off
[ 39.650348] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 39.650379] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 39.650505] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors (251000 MB)
[ 39.650573] sd 1:0:0:0: [sdb] Write Protect is off
[ 39.650625] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 39.650657] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 39.650727] sdb: sdb1 sdb2
[ 39.667599] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 39.667719] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
[ 39.667788] sd 3:0:0:0: [sdc] Write Protect is off
[ 39.667840] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 39.667871] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 39.667997] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors (251000 MB)
[ 39.668064] sd 3:0:0:0: [sdc] Write Protect is off
[ 39.668116] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 39.668146] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 39.668213] sdc: sdc1 sdc2
[ 39.692703] sd 3:0:0:0: [sdc] Attached SCSI disk
[ 39.699348] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 39.699570] sd 1:0:0:0: Attached scsi generic sg1 type 0
[ 39.699786] sd 3:0:0:0: Attached scsi generic sg2 type 0
[ 39.834560] md: md0 stopped.
[ 39.870361] md: bind<sdc1>
[ 39.870527] md: md1 stopped.
[ 39.910999] md: md0 stopped.
[ 39.911064] md: unbind<sdc1>
[ 39.911120] md: export_rdev(sdc1)
[ 39.929760] md: bind<sda1>
[ 39.929953] md: bind<sdc1>
[ 39.930139] md: bind<sdb1>
[ 39.930231] md: md1 stopped.
[ 39.932468] md: bind<sdc2>
[ 39.932674] md: bind<sda2>
[ 39.932860] md: bind<sdb2>
[ 40.411001] scsi 4:0:1:0: CD-ROM            LITE-ON DVDRW SOHW-1213S  TS09 PQ: 0 ANSI: 5
[ 40.411152] scsi 4:0:1:0: Attached
Re: md rotates RAID5 spare at boot
On Thursday January 10, [EMAIL PROTECTED] wrote:
(Sorry- yes it looks like I posted an incorrect dmesg extract)

This still doesn't seem to match your description. I see:

[ 41.247389] md: bind<sdf1>
[ 41.247584] md: bind<sdb1>
[ 41.247787] md: bind<sda1>
[ 41.247971] md: bind<sdc1>
[ 41.248151] md: bind<sdg1>
[ 41.248325] md: bind<sde1>
[ 41.256718] raid5: device sde1 operational as raid disk 0
[ 41.256771] raid5: device sdc1 operational as raid disk 4
[ 41.256821] raid5: device sda1 operational as raid disk 3
[ 41.256870] raid5: device sdb1 operational as raid disk 2
[ 41.256919] raid5: device sdf1 operational as raid disk 1
[ 41.257426] raid5: allocated 5245kB for md0
[ 41.257476] raid5: raid level 5 set md0 active with 5 out of 5 devices, algorithm 2

which looks like md0 started with 5 of 5 drives, plus g1 is there as a spare. And

[ 41.312250] md: bind<sdf2>
[ 41.312476] md: bind<sdb2>
[ 41.312711] md: bind<sdg2>
[ 41.312922] md: bind<sdc2>
[ 41.313138] md: bind<sda2>
[ 41.313343] md: bind<sde2>
[ 41.313452] md: md1: raid array is not clean -- starting background reconstruction
[ 41.322189] raid5: device sde2 operational as raid disk 0
[ 41.322243] raid5: device sdc2 operational as raid disk 4
[ 41.322292] raid5: device sdg2 operational as raid disk 3
[ 41.322342] raid5: device sdb2 operational as raid disk 2
[ 41.322391] raid5: device sdf2 operational as raid disk 1
[ 41.322823] raid5: allocated 5245kB for md1
[ 41.322872] raid5: raid level 5 set md1 active with 5 out of 5 devices, algorithm 2

md1 also assembled with 5/5 drives and sda2 as a spare. This one was not shut down cleanly, so it started a resync. But there is no evidence of anything starting degraded.

NeilBrown
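The assembly order and raid-disk roles can be pulled out of a dmesg extract like this mechanically. A short sketch against the md1 lines quoted in this message (angle brackets restored around the bind targets, since the archive stripped them from bind<...>); a device that is bound but never reported "operational" is the spare:

```python
# Sketch: recover bind order and active-member roles for md1 from the
# dmesg lines quoted in this thread.
import re

dmesg = """\
[ 41.312250] md: bind<sdf2>
[ 41.312476] md: bind<sdb2>
[ 41.312711] md: bind<sdg2>
[ 41.312922] md: bind<sdc2>
[ 41.313138] md: bind<sda2>
[ 41.313343] md: bind<sde2>
[ 41.322189] raid5: device sde2 operational as raid disk 0
[ 41.322243] raid5: device sdc2 operational as raid disk 4
[ 41.322292] raid5: device sdg2 operational as raid disk 3
[ 41.322342] raid5: device sdb2 operational as raid disk 2
[ 41.322391] raid5: device sdf2 operational as raid disk 1
"""

bound = re.findall(r"md: bind<(\w+)>", dmesg)
roles = dict(re.findall(r"raid5: device (\w+) operational as raid disk (\d+)", dmesg))
print(bound)   # detection/bind order
print(roles)   # active members; sda2 is bound but has no role, i.e. the spare
```

This confirms Neil's reading: sdg2 came up as raid disk 3 and sda2, though bound, holds no active slot.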
md rotates RAID5 spare at boot
I have a RAID5 (5 + 1 spare) setup that works perfectly well until I reboot. I have 6 drives (two different models) partitioned to give me 2 arrays, md0 and md1, which I use for /home and /var respectively. When I reboot, the system assembles each array but swaps what was the spare with one of the member drives. It then immediately detects a degraded array and rebuilds. After that, all is fine, and testing has shown things to be working as they should. Until I reboot.

Example: I built two arrays, /dev/md0 from /dev/sd[abcef]1 and /dev/md1 from /dev/sd[abcef]2, then added /dev/sdg1 and /dev/sdg2 as spares, and this works. One scenario when I reboot:

- md0 is assembled from sd[abceg]1; it's degraded and reports a spares-missing event.
- md1 assembles correctly; the spare is not missing.

Any ideas? I have asked about this on various boards (some said udev rules would help, some thought the issue had to do with the /dev/sdX names changing, etc.). I don't think those are applicable, since dmesg reports that the arrays assemble as soon as the disks are detected. Thanks in advance!
INFO: (currently the boot drive (non-raid) is sdd; otherwise all sd devices are part of the raid)

fdisk:

$ sudo fdisk -l

Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        1521    12217401   fd  Linux raid autodetect
/dev/sda2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdb: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdb2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdc: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdc2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/md0: 50.0 GB, 50041978880 bytes
2 heads, 4 sectors/track, 12217280 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md1: 950.1 GB, 950183919616 bytes
2 heads, 4 sectors/track, 231978496 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x

Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/sdd: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x535bfd7a

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1   *           1       14219   114214086   83  Linux
/dev/sdd2           14220       14593     3004155    5  Extended
/dev/sdd5           14220       14593     3004123+  82  Linux swap / Solaris

Disk /dev/sde: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1        1521    12217401   fd  Linux raid autodetect
/dev/sde2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdf: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdf2            1522       30401   231978600   fd  Linux raid autodetect

Disk /dev/sdg: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1               1        1521    12217401   fd  Linux raid autodetect
/dev/sdg2            1522       30401   231978600   fd  Linux raid autodetect

$ sudo mdadm --detail /dev/md0   (md1 shows similar info)
/dev/md0:
        Version : 00.90.03
  Creation Time : Sat Apr  7 23:32:58 2007
     Raid Level : raid5
     Array Size : 48869120 (46.61 GiB 50.04 GB)
  Used Dev Size : 12217280 (11.65 GiB 12.51 GB)
   Raid Devices : 5
  Total Devices : 6
Preferred Minor : 0
    Persistence
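One thing worth trying, since the device-naming theories keep coming up: pin each array to its UUID in /etc/mdadm.conf, so assembly no longer depends on /dev/sdX enumeration order. This is only a sketch, not a fix confirmed in this thread; the UUIDs below are placeholders, not this system's values, and the real ARRAY lines should be generated with "mdadm --detail --scan":

```
# /etc/mdadm.conf (sketch -- UUIDs are placeholders, not this system's values)
# Generate the real ARRAY lines with:  mdadm --detail --scan
DEVICE partitions
ARRAY /dev/md0 UUID=00000000:00000000:00000000:00000000 spares=1
ARRAY /dev/md1 UUID=00000000:00000000:00000000:00000000 spares=1
```

With spares=1 recorded, mdadm --monitor can also raise a SparesMissing event whenever an array comes up without its spare, which matches the symptom reported above.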