Re: [gentoo-user] Re: initramfs & RAID at boot time
On Mon, 19 Apr 2010 14:27:15 +0300, Ciprian Dorin, Craciun wrote:

> Well I've tried exactly that: I've aggregated two partitions in
> RAID1, made the file system, then tried to install Grub on them (as in,
> run grub-setup or grub-install or grub and then, from the shell, setup)...
> And I didn't succeed. I've tried the following:
> * tried to install Grub on the MD as if it were a partition
> -- failed (as expected, as the MD device is not on a hard-drive);
> * stopped the MD, and then tried to install Grub on each partition
> individually -- it worked to install, but for a reason I don't
> remember right now it failed to boot;
>
> So what intrigues me is how you've initialized the MBR, how you've
> run grub-setup?

I don't use grub-setup, just run grub and do

root (hd0,4)
setup (hd0)
quit

and repeat for each disk.

--
Neil Bothwick

My friends went to alt.california, and all they brought me was this lousy sig.
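Spelled out for two disks, the whole grub-shell session would look something like this (a sketch; the second disk being /dev/sdb, and partition 5 holding /boot, are assumptions matching Neil's root (hd0,4) above):

root (hd0,4)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,4)
setup (hd0)
quit

The device line remaps (hd0) onto the second disk, so each disk ends up with an MBR that can boot standalone if the other one fails.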
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sun, Apr 18, 2010 at 11:16, Neil Bothwick wrote:
> On Sun, 18 Apr 2010 09:57:38 +0300, Ciprian Dorin, Craciun wrote:
>
>> Also a question about /boot on RAID1... I didn't manage to
>> make it work... Could you Neil please tell me exactly how you did
>> this? I'm most interested in how you've convinced Grub to work...
>
> You just don't tell GRUB that it's working with one half of a RAID1
> array. Unlike all other RAID levels, with 1 you can also access the
> individual disks.
>
> --
> Neil Bothwick

Well I've tried exactly that: I've aggregated two partitions in
RAID1, made the file system, then tried to install Grub on them (as in,
run grub-setup or grub-install or grub and then, from the shell, setup)...
And I didn't succeed. I've tried the following:
* tried to install Grub on the MD as if it were a partition
-- failed (as expected, as the MD device is not on a hard-drive);
* stopped the MD, and then tried to install Grub on each partition
individually -- it worked to install, but for a reason I don't
remember right now it failed to boot;

So what intrigues me is how you've initialized the MBR, how you've
run grub-setup?

(In the end I am more pleased with two boot partitions, as if I
misconfigure one, I'll have the other one to boot from. I've also
cross-referenced the grub menus to chain-load the other disk.)

Thanks,
Ciprian.
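A cross-referencing entry of the kind Ciprian describes might look something like this in GRUB legacy's grub.conf (a sketch; (hd1,0) as the location of the other disk's /boot is an assumption):

title Boot from the other disk
root (hd1,0)
chainloader +1

With one such entry in each disk's menu, either bootloader can hand control over to the other.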
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sun, Apr 18, 2010 at 10:42, Jarry wrote:
> On 18. 4. 2010 8:57, Ciprian Dorin, Craciun wrote:
>
>> * there is an option for the kernel that must be enabled at
>> compile time that enables automatic RAID detection and assembly by the
>> kernel before mounting /, but it works only for MD metadata 0.90 (see
>> [1]);
>> * the default metadata for `mdadm` is 1.2 (see `man mdadm`, and
>> search for `--metadata`), so when creating the RAID you must
>> explicitly select the metadata you want;
>
>> [1]
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/md.txt;h=188f4768f1d58c013d962f993ae36483195fd288;hb=HEAD
>
> Which version of mdadm are you using? I have 3.0, and the default metadata
> is 0.90:
>
> -e, --metadata=
>     Declare the style of RAID metadata (superblock) to be used. The
>     default is 0.90 for --create, and to guess for other operations.
>     The default can be overridden by setting the metadata value for
>     the CREATE keyword in mdadm.conf.
>
> BTW, [1] talks about kernel 2.6.9; things might have changed since then...
>
> Jarry

On my laptop, on which I've made the experiments, I have ArchLinux
(which always has the bleeding-edge packages); I have mdadm 3.1.2. So maybe
between 3.0 and 3.1 there was this switch from 0.90 to 1.2 default metadata.

About the autodetection stuff, I'm absolutely positive that it only
handles the 0.90 format, as I've tried it and it didn't work with the 1.x
versions of the superblock format.

Ciprian.
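The practical consequence: if the kernel is to auto-assemble the array, the superblock format has to be pinned when the array is created. A minimal sketch, with device names assumed:

mdadm --create /dev/md0 --level=1 --raid-devices=2 \
    --metadata=0.90 /dev/sda2 /dev/sdb2
mdadm --detail /dev/md0 | grep Version

The second command just confirms which superblock version was actually written, regardless of what the installed mdadm defaults to.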
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sun, Apr 18, 2010 at 11:01 AM, Neil Bothwick wrote:
> On Sun, 18 Apr 2010 08:13:08 -0700, Mark Knecht wrote:
>
>> I'm not sure whether that was good advice or not for RAIDs that could be
>> assembled later, but that's what I did, and it leads to the kernel
>> trying to do everything before the system is fully up and mdadm is
>> really running.
>
> I only have one RAID1 of 400MB for / and one RAID5 carrying an LVM volume
> group for everything else. Using multiple RAID partitions without LVM is
> far too complicated for my brain to handle.
>
> --
> Neil Bothwick

Nahh... I don't believe that for a moment, but this is a rather more
complicated task than a basic desktop PC. This is about number crunching
using multiple instances of Windows running under VMware.

First, the basic system:

/dev/md3 - 50GB 3-drive RAID1 => The ~amd64 install we discussed over the
last week. This is the whole Gentoo install.

/dev/md5 - 50GB 3-drive RAID1 => A standard stable install - same as md3
but stable, and again the whole Gentoo install.

Obviously I don't use the two above at the same time. I'm mostly on
stable and testing out ~amd64 right now. I use one or the other.

/dev/md11 => 100GB RAID0 - This partition is the main data storage for
the 5 Windows VMs I want to run at the same time. I went RAID0 because my
Windows apps appear to need an aggregate disk bandwidth of about
150-200MB/sec and I couldn't get that with RAID1. I'll see how well this
works out over time.

/dev/md6 => 250GB RAID1 used purely as backup for the RAID0, which is
backed up daily, although right now not automatically.

The RAID0 and the backup RAID1 need to be available whether I'm booting
stable (md5) or ~amd64 (md3).

I found some BIOS options, one of which was set by default to 'Fast
Boot'. I disabled that, slowing down boot and hopefully allowing far more
time to get the drives online more reliably. So far I've powered off and
rebooted 5 or 6 times. Each time the system has come up clean. That's a
first.

I could maybe post a photo of what I'm seeing at boot, but essentially
the boot process complains with red exclamation marks about md6 & md11,
while the only thing I find in dmesg is the one-liner at the end of this:

md: created md3
md: bind<sda3>
md: bind<sdb3>
md: bind<sdc3>
md: running: <sdc3><sdb3><sda3>
raid1: raid set md3 active with 3 out of 3 mirrors
md3: detected capacity change from 0 to 53694562304
md: ... autorun DONE.
md5: unknown partition table

and after that no other messages.

BTW - I did sort of take a gamble and changed the partitions for md6 and
md11 to type 83 instead of 0xfd. It doesn't appear to have caused any
problems and I have only the above 'unknown partition table' message.
Strange, as md5 is mounted and the system seems completely happy:

m...@c2stable ~ $ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md5              51612920   7552836  41438284  16% /
udev                     10240       296      9944   3% /dev
/dev/md11            103224600      1740  80558784  18% /virdata
/dev/md6             243534244  24664820 206498580  11% /backups
shm                    6151580         0   6151580   0% /dev/shm
m...@c2stable ~ $

Cheers,
Mark
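Since type-83 partitions are invisible to the kernel's autodetect, md6 and md11 then have to be assembled from userspace. A minimal sketch, assuming the arrays get recorded in /etc/mdadm.conf by UUID and assembled by an init script:

mdadm --examine --scan >> /etc/mdadm.conf
mdadm --assemble --scan

The first command is run once to capture ARRAY lines for all existing arrays; the second can run at every boot and brings up everything listed in the file that the kernel didn't already start.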
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sun, 18 Apr 2010 08:13:08 -0700, Mark Knecht wrote:

> I'm not sure whether that was good advice or not for RAIDs that could be
> assembled later, but that's what I did, and it leads to the kernel
> trying to do everything before the system is fully up and mdadm is
> really running.

I only have one RAID1 of 400MB for / and one RAID5 carrying an LVM volume
group for everything else. Using multiple RAID partitions without LVM is
far too complicated for my brain to handle.

--
Neil Bothwick

Top Oxymorons Number 32: Living dead
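A sketch of how a layout like Neil's might be created from scratch (all device names, the member count, and the volume-group name here are assumptions):

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda2 /dev/sdb2 /dev/sdc2
pvcreate /dev/md1
vgcreate vg0 /dev/md1
lvcreate -L 10G -n home vg0

The point of the single RAID5-plus-LVM combination is that everything beyond / becomes a resizable logical volume instead of yet another md device to manage.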
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sat, Apr 17, 2010 at 3:01 PM, Neil Bothwick wrote:
> On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:
>
>> Empirically anyway there doesn't seem to be a problem. I built the
>> new kernel and it booted normally, so I think I'm misinterpreting what
>> was written in the Wiki, or the Wiki is wrong.
>
> As long as /boot is not on RAID, or is on RAID1, you don't need an
> initrd. I've been booting this system for years with / on RAID1 and
> everything else on RAID5.
>
> --
> Neil Bothwick

Neil,

Completely agreed, and in fact it's the way I built my new system. /boot
is just a partition; / is RAID1 across three partitions marked with the
0xfd partition type, using metadata=0.90 and assembled by the kernel. I'm
using WD RAID Edition drives and an Asus Rampage II Extreme motherboard.

It works; however, I'm running into the sort of thing I ran into this
morning when booting - both md5 and md6 had problems. Random partitions
get dropped out. It's never the same ones, and it's sometimes only one
partition out of three on the same drive - sdc5 and sdc6 aren't found
until I reboot, but sda3, sdb3 & sdc3 were. Flaky hardware? What? The
motherboard? The drives? I've noticed that entering the BIOS setup
screens before allowing grub to take over seems to eliminate the problem.
Timing?

m...@c2stable ~ $ cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sda6[0] sdb6[1]
      247416933 blocks super 1.1 [3/2] [UU_]

md11 : active raid0 sdd1[0] sde1[1]
      104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
      52436096 blocks [3/3] [UUU]

md5 : active raid1 sdb5[1] sda5[0]
      52436032 blocks [3/2] [UU_]

unused devices: <none>
m...@c2stable ~ $

For clarity, md3 is the only one needed to boot the system. The other
three RAIDs aren't required until I start running apps. However, they are
all being assembled by the kernel at boot time, and I would prefer not to
do that, or at least learn how not to do it.

Now, as to why they are being assembled: I suspect it's because I marked
them all with partition type 0xfd, when possibly that's not the best
thing to have done. The kernel won't bother with non-0xfd partitions, and
then mdadm could have done it later:

c2stable ~ # fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x8b45be24

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1           7       56196   83  Linux
/dev/sda2               8         530     4200997+  82  Linux swap / Solaris
/dev/sda3             536        7063    52436160   fd  Linux raid autodetect
/dev/sda4            7064       60801   431650485    5  Extended
/dev/sda5            7064       13591    52436128+  fd  Linux raid autodetect
/dev/sda6           30000       60801   247417065   fd  Linux raid autodetect
c2stable ~ #

However, the Gentoo Wiki says we are supposed to mark everything 0xfd:

http://en.gentoo-wiki.com/wiki/RAID/Software#Setup_Partitions

I'm not sure whether that was good advice or not for RAIDs that could be
assembled later, but that's what I did, and it leads to the kernel trying
to do everything before the system is fully up and mdadm is really
running.

Anyway, the failures happen, so I can step through and fail, remove and
add the partition back to the array. (In this case fail and remove aren't
necessary.)

c2stable ~ # mdadm /dev/md5 -f /dev/sdc5
mdadm: set device faulty failed for /dev/sdc5: No such device
c2stable ~ # mdadm /dev/md5 -r /dev/sdc5
mdadm: hot remove failed for /dev/sdc5: No such device or address
c2stable ~ # mdadm /dev/md5 -a /dev/sdc5
mdadm: re-added /dev/sdc5
c2stable ~ # mdadm /dev/md6 -a /dev/sdc6
mdadm: re-added /dev/sdc6
c2stable ~ #

At this point md5 is repaired and I'm waiting for md6:

c2stable ~ # cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sdc6[2] sda6[0] sdb6[1]
      247416933 blocks super 1.1 [3/2] [UU_]
      [====>................]  recovery = 22.0% (54525440/247416933) finish=38.1min speed=84230K/sec

md11 : active raid0 sdd1[0] sde1[1]
      104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
      52436096 blocks [3/3] [UUU]

md5 : active raid1 sdc5[2] sdb5[1] sda5[0]
      52436032 blocks [3/3] [UUU]

unused devices: <none>
c2stable ~ #

How do I get past this? It's happening 2-3 times a week! I'm figuring
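Condensed, the repair amounts to re-adding the dropped members and watching the resync (a sketch using the same device names as Mark's output above):

mdadm /dev/md5 -a /dev/sdc5     # mdadm recognizes a recent member and re-adds it
mdadm /dev/md6 -a /dev/sdc6
watch -n 5 cat /proc/mdstat     # wait until both arrays show [UUU] again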
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sun, 18 Apr 2010 09:57:38 +0300, Ciprian Dorin, Craciun wrote:

> Also a question about /boot on RAID1... I didn't manage to
> make it work... Could you Neil please tell me exactly how you did
> this? I'm most interested in how you've convinced Grub to work...

You just don't tell GRUB that it's working with one half of a RAID1
array. Unlike all other RAID levels, with 1 you can also access the
individual disks.

--
Neil Bothwick

I am sitting on the toilet with your article before me. Soon it will be
behind me.
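What makes this work is that a RAID1 member holds a byte-for-byte copy of the filesystem, and with 0.90 (or 1.0) metadata the superblock sits at the end of the partition, so the start of the device looks like a plain filesystem to GRUB. A quick way to see it (device name assumed; read-only to be safe):

mount -o ro /dev/sda1 /mnt    # one half of the /boot mirror, mounted directly
ls /mnt
umount /mnt

This is also exactly why 1.1/1.2 metadata breaks the trick: those formats put the superblock at the front of the device, ahead of the filesystem.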
Re: [gentoo-user] Re: initramfs & RAID at boot time
On 18. 4. 2010 8:57, Ciprian Dorin, Craciun wrote:

> * there is an option for the kernel that must be enabled at
> compile time that enables automatic RAID detection and assembly by the
> kernel before mounting /, but it works only for MD metadata 0.90 (see
> [1]);
> * the default metadata for `mdadm` is 1.2 (see `man mdadm`, and
> search for `--metadata`), so when creating the RAID you must
> explicitly select the metadata you want;
>
> [1]
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/md.txt;h=188f4768f1d58c013d962f993ae36483195fd288;hb=HEAD

Which version of mdadm are you using? I have 3.0, and the default metadata
is 0.90:

-e, --metadata=
    Declare the style of RAID metadata (superblock) to be used. The
    default is 0.90 for --create, and to guess for other operations.
    The default can be overridden by setting the metadata value for
    the CREATE keyword in mdadm.conf.

BTW, [1] talks about kernel 2.6.9; things might have changed since then...

Jarry
--
_______________________________________________________________
This mailbox accepts e-mails only from selected mailing-lists!
Everything else is considered to be spam and therefore deleted.
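Per the man page excerpt above, the default can also be pinned system-wide so every --create writes the same superblock format; a sketch of the relevant mdadm.conf line:

CREATE metadata=0.90

With that in place, arrays created without an explicit --metadata flag stay kernel-autodetectable.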
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sun, Apr 18, 2010 at 1:01 AM, Neil Bothwick wrote:
> On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:
>
>> Empirically anyway there doesn't seem to be a problem. I built the
>> new kernel and it booted normally, so I think I'm misinterpreting what
>> was written in the Wiki, or the Wiki is wrong.
>
> As long as /boot is not on RAID, or is on RAID1, you don't need an
> initrd. I've been booting this system for years with / on RAID1 and
> everything else on RAID5.

From my research on the topic (I also wanted to have both /boot and /
on RAID1) there are the following traps:

* there is an option for the kernel that must be enabled at compile time
that enables automatic RAID detection and assembly by the kernel before
mounting /, but it works only for MD metadata 0.90 (see [1]);
* the default metadata for `mdadm` is 1.2 (see `man mdadm`, and search
for `--metadata`), so when creating the RAID you must explicitly select
the metadata you want;
* indeed the preferred way to do it is using an initramfs; (I've posted
below some shell snippets that do exactly this: assemble my RAID); (the
code snippets are between {{{...}}}; they're from a MoinMoin wiki page;)

Also a question about /boot on RAID1... I didn't manage to make it
work... Could you Neil please tell me exactly how you did this? I'm most
interested in how you've convinced Grub to work...

Best,
Ciprian.

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/md.txt;h=188f4768f1d58c013d962f993ae36483195fd288;hb=HEAD

Init-ramfs preparation:

{{{
mkdir -p /usr/src/initramfs
cd /usr/src/initramfs

mkdir /usr/src/initramfs/bin
mkdir /usr/src/initramfs/dev
mkdir /usr/src/initramfs/proc
mkdir /usr/src/initramfs/rootfs
mkdir /usr/src/initramfs/sys

cp -a /bin/busybox /usr/src/initramfs/bin/busybox
cp -a /sbin/mdadm /usr/src/initramfs/bin/mdadm
cp -a /sbin/jfs_fsck /usr/src/initramfs/bin/jfs_fsck

cp -a /dev/console /usr/src/initramfs/dev/console
cp -a /dev/null /usr/src/initramfs/dev/null
cp -a /dev/sda2 /usr/src/initramfs/dev/sda2
cp -a /dev/sdc2 /usr/src/initramfs/dev/sdc2
cp -a /dev/md127 /usr/src/initramfs/dev/md127
}}}

{{{
cat >/usr/src/initramfs/init <<'EOS'
#!/bin/busybox ash

exec </dev/null 2>/dev/console
exec 1>&2

/bin/busybox mount -n -t proc none /proc || exit 1
/bin/busybox mount -n -t sysfs none /sys || exit 1

/bin/mdadm -A /dev/md127 -R -a md /dev/sda2 /dev/sdc2 || exit 1
/bin/jfs_fsck -p /dev/md127 || true

/bin/busybox mount -n -t jfs /dev/md127 /rootfs -o ro,exec,suid,dev,relatime,errors=remount-ro || exit 1

/bin/busybox umount -n /sys || exit 1
/bin/busybox umount -n /proc || exit 1

# /bin/busybox ash </dev/console >/dev/console 2>/dev/console || exit 1

exec /bin/busybox switch_root /rootfs /sbin/init || exit 1
exit 1
EOS

chmod +x /usr/src/initramfs/init
}}}

{{{
( cd /usr/src/initramfs ; find . | cpio --quiet -o -H newc | gzip -9 > /boot/initramfs )
}}}
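To boot with the generated /boot/initramfs, the bootloader has to load it alongside the kernel; a sketch of a GRUB legacy grub.conf entry (the kernel file name and the (hd0,0) /boot location are assumptions):

title Gentoo (root on RAID1 via initramfs)
root (hd0,0)
kernel /vmlinuz root=/dev/md127 ro
initrd /initramfs

Here root= is mostly informational, since the init script above hard-codes the assembly and mounting of /dev/md127 itself.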
Re: [gentoo-user] Re: initramfs & RAID at boot time
On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:

> Empirically anyway there doesn't seem to be a problem. I built the
> new kernel and it booted normally, so I think I'm misinterpreting what
> was written in the Wiki, or the Wiki is wrong.

As long as /boot is not on RAID, or is on RAID1, you don't need an
initrd. I've been booting this system for years with / on RAID1 and
everything else on RAID5.

--
Neil Bothwick

Scientists decode the first confirmed alien transmission from outer
space ... "This really works! Just send 5*10^50 H atoms to each of the
five star systems listed below. Then, add your own system to the top of
the list, delete the system at the bottom, and send out copies of this
message to 100 other solar systems. If you follow these instructions,
within 0.25 of a galactic rotation you are guaranteed to receive enough
hydrogen in return to power your civilization until entropy reaches its
maximum!"
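For the no-initrd case Neil describes, the kernel side needs MD, the RAID1 personality, and autodetection compiled in rather than built as modules; a sketch of the relevant .config lines (option names as in recent 2.6 kernels):

CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_RAID1=y

Combined with 0xfd partition types and 0.90 metadata, root=/dev/mdX on the kernel command line is then enough to bring up / without any initramfs.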