Switched to dracut, which does not use kernel raid autodetect.  Works great, 
though subject to BZ 513267.
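For the record, regenerating the initramfs with dracut was just the
following (a minimal sketch; the image path assumes the running kernel):

  # rebuild, overwriting any existing image for the running kernel
  dracut --force /boot/initramfs-$(uname -r).img $(uname -r)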

----- Original Message ----
From: S Murthy Kambhampaty <smk...@yahoo.com>
To: fedora-list@redhat.com
Sent: Thursday, September 3, 2009 3:52:43 AM
Subject: Fedora 11 boot fails on md with two HBAs

[Fixed typo in header line, etc.; added info at end.]


With Fedora 11 on an IBM x3650 M2 (x86_64), I am having problems booting 
after adding a second (add-in) SAS controller with 12 SAS disks in an 
external enclosure.  Without the add-in SAS controller, the system boots 
fine and provides file sharing over Samba, etc.

The main problem with this machine is that it has an EFI BIOS which appears 
to bind the external/add-on SAS HBA before the internal one on which the 
system drives are hosted (both are LSI SAS 3801e HBAs - the internal one has 
an IBM part number, while the external one is LSI).  Changing the BIOS order 
in the option ROMs and disabling boot services on the external/add-on HBA 
does not seem to affect the adapter binding order, and the EFI BIOS provides 
no other mechanism for controlling this, which brings us to the boot-up 
issues.

Note that after booting into rescue mode from the installer image, mdadm 
works as expected, with the raid arrays detected and started without any 
problems on the disks on both controllers.  The md configuration is:

/boot is on /dev/md0 with devices /dev/sd[mn]3, raid1
/root is on /dev/md1 with devices /dev/sd[op]1, raid1
/usr, /var and swap are on an LVM vg on /dev/md2 with devices /dev/sd[m-p]2, 
raid10
/boot/efi is on /dev/md64 with devices /dev/sd[mn]2, raid1

A separate vg for the data volume is on /dev/md127 with devices /dev/sd[a-l], 
raid6 (whole disks)
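Since the arrays assemble cleanly in rescue mode, the ARRAY lines can be
captured from there - a sketch, with the UUID shown as a placeholder rather
than my actual array:

  # append ARRAY lines for the currently running arrays
  mdadm --detail --scan >> /etc/mdadm.conf
  # this emits lines of roughly this shape:
  #   ARRAY /dev/md1 level=raid1 num-devices=2 UUID=<array-uuid>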

/dev/sd[m-p] are hosted on the internal HBA, and appear as /dev/sd[a-d] when 
the add-in HBA is disabled.  /dev/sd[a-l] hang off the add-in HBA.  Note that 
in rescue mode the disks start at /dev/sdc, as I'm booting from a virtual 
console, and /dev/sda and /dev/sdb are assigned to virtual USB disks.
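One way to confirm which HBA a given /dev/sdX currently hangs off,
independent of the letter shuffling, is the by-path symlinks, which encode
the controller's PCI address:

  ls -l /dev/disk/by-path/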

The problem seems to be that during bootup the raid arrays are autodetected 
rather than assembled by mdadm.  If the raid456 module is not included in 
the initrd image, booting fails with "raid6 personality not detected" when 
the boot process tries to start /dev/md1 incrementally with /dev/sda as its 
first member.  (This appears not to reference /etc/mdadm.conf in the initrd 
at all.)

If the raid456 module is included in the initrd image (using --with=raid456), 
booting fails with /dev/sda added incrementally to /dev/md1 and /dev/sdb 
added incrementally to /dev/md2; it appears autodetection fails because the 
raid devices were built from rescue mode, so the components are listed under 
different letters in the superblock.
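The roles recorded in the superblocks can be checked directly, e.g.:

  # prints the array UUID and this device's slot as recorded on disk
  mdadm --examine /dev/sda
  mdadm --examine /dev/sdo1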

If I put a line in /etc/mdadm.conf in the initrd image to scan only 
partitioned disks (DEVICE /dev/sd*[1234]), boot hangs after loading the raid 
modules.  (Potentially on the call to mkblkdevs after scsi_wait_scan is 
rmmod-ed.)
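For reference, the DEVICE line went in alongside the usual ARRAY lines,
roughly like this (the ARRAY line is illustrative - the real ones carry the
arrays' UUIDs):

  DEVICE /dev/sd*[1234]
  ARRAY /dev/md1 UUID=<uuid-of-md1>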

Partitioning /dev/sd[a-l] and setting the partition type to something other 
than 'raid' does not seem to make any difference; during bootup the kernel 
still tries to assemble the root raid device (/dev/md1) from /dev/sda 
(though it is on /dev/sd[op]1).
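For the record, the type change was along these lines (old-style sfdisk
syntax; the device and partition number are illustrative):

  # set partition 1's id to 83 (plain Linux) instead of fd (raid autodetect)
  sfdisk --change-id /dev/sda 1 83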

This seems to suggest that the md devices are being started by kernel raid 
autodetection rather than mdadm.  Simply switching to mdadm would likely 
solve the problem, given that it works fine in rescue mode.  Alternative 
suggestions are welcome, of course.
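If the in-kernel autodetect path really is the culprit, it can also be shut
off outright from the kernel command line; a sketch of the grub.conf kernel
line, with the version as a placeholder:

  kernel /vmlinuz-<version> ro root=/dev/md1 raid=noautodetect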

Thanks for the help,
   Murthy
