Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-19 Thread Neil Bothwick
On Mon, 19 Apr 2010 14:27:15 +0300, Ciprian Dorin, Craciun wrote:

> Well I've tried exactly that: I've aggregated two partitions into
> RAID1, made the file system, then tried to install Grub on them (as in
> run grub-setup or grub-install, or grub and then the setup command
> from its shell)... And I didn't succeed. I've tried the following:
> * try to install Grub on the MD as if it were a partition
> -- failed (as expected, since the MD device is not on a hard-drive);
> * stopped the MD, and then tried to install grub on each partition
> individually -- the install worked, but for a reason I don't
> remember right now it failed to boot;
> 
> So what intrigues me is how you've initialized the MBR, how you've
> run grub-setup?

I don't use grub-setup; I just run grub and do
root (hd0,4)
setup (hd0)
quit

and repeat for each disk.
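
For two disks the whole session looks something like this -- a sketch
only, /dev/sda, /dev/sdb and the (hd0,4) boot partition are just
example names, and remapping each disk to (hd0) in turn is one way to
make each MBR boot from its own disk:

# example transcript; disk names and partition numbers are assumptions
grub> device (hd0) /dev/sda
grub> root (hd0,4)
grub> setup (hd0)
grub> device (hd0) /dev/sdb
grub> root (hd0,4)
grub> setup (hd0)
grub> quit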


-- 
Neil Bothwick

My friends went to alt.california, and all they brought
me was this lousy sig.




Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-19 Thread Ciprian Dorin, Craciun
On Sun, Apr 18, 2010 at 11:16, Neil Bothwick  wrote:
> On Sun, 18 Apr 2010 09:57:38 +0300, Ciprian Dorin, Craciun wrote:
>
>>     Also a question about /boot on RAID1... I didn't manage to
>> make it work... Could you, Neil, please tell me exactly how you did
>> this? I'm most interested in how you've convinced Grub to work...
>
> You just don't tell GRUB that it's working with one half of a RAID1
> array. Unlike all other RAID levels, with RAID1 you can also access the
> individual disks.
>
> --
> Neil Bothwick


Well I've tried exactly that: I've aggregated two partitions into
RAID1, made the file system, then tried to install Grub on them (as in
run grub-setup or grub-install, or grub and then the setup command
from its shell)... And I didn't succeed. I've tried the following:
* try to install Grub on the MD as if it were a partition
-- failed (as expected, since the MD device is not on a hard-drive);
* stopped the MD, and then tried to install grub on each partition
individually -- the install worked, but for a reason I don't
remember right now it failed to boot;

So what intrigues me is how you've initialized the MBR, how you've
run grub-setup?

(In the end I am happier with two separate boot partitions: if I
misconfigure one, I still have the other one to boot from. I've also
cross-referenced the grub menus so each can chain-load the other disk.)
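
For reference, the chain-load entries can be as simple as the
following sketch; (hd1,0) is only an assumed location for the other
disk's /boot, not my actual layout:

# hypothetical menu.lst entry on the first disk;
# (hd1,0) is an assumption for the other disk's boot partition
title Boot the other disk
rootnoverify (hd1,0)
chainloader +1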

Thanks,
Ciprian.



Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-19 Thread Ciprian Dorin, Craciun
On Sun, Apr 18, 2010 at 10:42, Jarry  wrote:
> On 18. 4. 2010 8:57, Ciprian Dorin, Craciun wrote:
>
>>     * there is a kernel option that must be enabled at compile time
>> to get automatic RAID detection and assembly by the kernel before
>> mounting /, but it works only with the MD 0.90 metadata (see
>> [1]);
>>     * the default metadata for `mdadm` is 1.2 (see `man mdadm`, and
>> search for `--metadata`), so when creating the RAID you must
>> explicitly select the metadata you want;
>
>> [1]
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/md.txt;h=188f4768f1d58c013d962f993ae36483195fd288;hb=HEAD
>
>
> Which version of mdadm are you using? I have 3.0, and the default
> metadata is 0.90:
>
>  -e ,  --metadata=
>        Declare the style of RAID metadata (superblock) to be used.  The
>        default is 0.90 for --create, and to guess for other operations.
>        The  default can be overridden by setting the metadata value for
>        the CREATE keyword in mdadm.conf.
>
> BTW, [1] talks about kernel 2.6.9; things might have changed since then...
>
> Jarry


On the laptop where I made these experiments I'm running ArchLinux
(which always has bleeding-edge packages), and its mdadm is 3.1.2. So
maybe the default metadata switched from 0.90 to 1.2 somewhere between
3.0 and 3.1.

About the autodetection I'm absolutely positive that it only
handles the 0.90 format, as I've tried it and it didn't work with the
1.x superblock formats.
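
So if you want the in-kernel autodetection, the superblock format has
to be forced when creating the array; something along these lines
should do it (just a sketch -- the device and array names are
placeholders):

# hypothetical example; device and array names are placeholders
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
    --metadata=0.90 /dev/sda2 /dev/sdb2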

Ciprian.



Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-18 Thread Mark Knecht
On Sun, Apr 18, 2010 at 11:01 AM, Neil Bothwick  wrote:
> On Sun, 18 Apr 2010 08:13:08 -0700, Mark Knecht wrote:
>
>> I'm not sure whether that is good advice or not for RAIDs that could
>> be assembled later, but that's what I did, and it leads to the kernel
>> trying to do everything before the system is fully up and mdadm is
>> really running.
>
> I only have one RAID1 of 400MB for / and one RAID5 carrying an LVM volume
> group for everything else. Using multiple RAID partitions without LVM is
> far too complicated for my brain to handle.
>
>
> --
> Neil Bothwick

Nahh... I don't believe that for a moment, but this is a rather more
complicated task than a basic desktop PC. This is about number
crunching using multiple instances of Windows running under VMware.

First, the basic system:

/dev/md3 - 50GB 3-drive RAID1 => The ~amd64 install we discussed over
the last week. This is the whole Gentoo install.
/dev/md5 - 50GB 3-drive RAID1 => A standard stable install - same as
md3 but stable, and again the whole Gentoo install.

Obviously I don't use the two above at the same time - I'm mostly on
stable and just testing out ~amd64 right now, so it's one or the other.

/dev/md11 => 100GB RAID0 - This partition is the main data storage for
the 5 Windows VMs I want to run at the same time. I went with RAID0
because my Windows apps appear to need an aggregate disk bandwidth of
about 150-200 MB/s and I couldn't get that with RAID1. I'll see how
well this works out over time.

/dev/md6 => 250GB RAID1 used purely as a backup for the RAID0, which is
backed up daily, although right now not automatically.

The RAID0 and backup RAID1 need to be available whether I'm booting
stable (md5) or ~amd64 (md3).

I found some BIOS options, one of which was set by default to 'Fast
Boot'. I disabled that, slowing down the boot and hopefully giving the
drives far more time to come online reliably. So far I've powered off
and rebooted 5 or 6 times, and each time the system has come up clean.
That's a first.

I could maybe post a photo of what I'm seeing at boot, but essentially
the boot process complains with red exclamation marks about md6 & md11,
yet in dmesg the only thing I find is:

md: created md3
md: bind
md: bind
md: bind
md: running: 
raid1: raid set md3 active with 3 out of 3 mirrors
md3: detected capacity change from 0 to 53694562304
md: ... autorun DONE.
md5: unknown partition table

and after that no other messages.

BTW - I did sort of take a gamble and change the partitions for md6
and md11 to type 83 instead of 0xfd. It doesn't appear to have caused
any problems, and I get only the above 'unknown partition table'
message. Strange, as md5 is mounted and the system seems completely
happy:

m...@c2stable ~ $ df
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/md5  51612920   7552836  41438284  16% /
udev 10240   296  9944   3% /dev
/dev/md11    103224600  1740  80558784  18% /virdata
/dev/md6 243534244  24664820 206498580  11% /backups
shm6151580 0   6151580   0% /dev/shm
m...@c2stable ~ $

Cheers,
Mark



Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-18 Thread Neil Bothwick
On Sun, 18 Apr 2010 08:13:08 -0700, Mark Knecht wrote:

> I'm not sure whether that is good advice or not for RAIDs that could
> be assembled later, but that's what I did, and it leads to the kernel
> trying to do everything before the system is fully up and mdadm is
> really running.

I only have one RAID1 of 400MB for / and one RAID5 carrying an LVM volume
group for everything else. Using multiple RAID partitions without LVM is
far too complicated for my brain to handle.


-- 
Neil Bothwick

Top Oxymorons Number 32: Living dead




Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-18 Thread Mark Knecht
On Sat, Apr 17, 2010 at 3:01 PM, Neil Bothwick  wrote:
> On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:
>
>> Empirically, anyway, there doesn't seem to be a problem. I built the
>> new kernel and it booted normally, so I think I'm either misinterpreting
>> what was written in the Wiki or the Wiki is wrong.
>
> As long as /boot is not on RAID, or is on RAID1, you don't need an
> initrd. I've been booting this system for years with / on RAID1 and
> everything else on RAID5.
>
>
> --
> Neil Bothwick

Neil,
   Completely agreed, and in fact that's the way I built my new system.
/boot is just a plain partition, and / is a RAID1 of three partitions
marked with the 0xfd partition type, using metadata=0.90 and assembled
by the kernel. I'm using WD RAID Edition drives and an Asus Rampage II
Extreme motherboard.

   It works; however, I keep running into the sort of thing I hit this
morning when booting - both md5 and md6 had problems. Random partitions
get dropped out. It's never the same ones, and it's sometimes only one
partition out of three on the same drive - this time sdc5 and sdc6
weren't found until I rebooted, but sda3, sdb3 & sdc3 were. Flaky
hardware? The motherboard? The drives?

   I've noticed that entering the BIOS setup screens before allowing
grub to take over seems to eliminate the problem. Timing?

m...@c2stable ~ $ cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sda6[0] sdb6[1]
  247416933 blocks super 1.1 [3/2] [UU_]

md11 : active raid0 sdd1[0] sde1[1]
  104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
  52436096 blocks [3/3] [UUU]

md5 : active raid1 sdb5[1] sda5[0]
  52436032 blocks [3/2] [UU_]

unused devices: 
m...@c2stable ~ $

   For clarity, md3 is the only one needed to boot the system. The
other three RAIDs aren't required until I start running apps. However,
they are all being assembled by the kernel at boot time, and I would
prefer not to do that, or at least learn how not to.

   Now, as to why they are all being assembled, I suspect it's because
I marked them all with partition type 0xfd, which possibly wasn't the
best thing to have done. The kernel won't bother with non-0xfd
partitions, and mdadm could then have assembled them later:

c2stable ~ # fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x8b45be24

   Device Boot  Start End  Blocks   Id  System
/dev/sda1   *   1   7   56196   83  Linux
/dev/sda2   8 530 4200997+  82  Linux swap / Solaris
/dev/sda3   536   7063   52436160   fd  Linux raid autodetect
/dev/sda4   7064   60801   431650485   5  Extended
/dev/sda5   7064   13591   52436128+  fd  Linux raid autodetect
/dev/sda6   3   60801   247417065   fd  Linux raid autodetect
c2stable ~ #

However, the Gentoo Wiki says we are supposed to mark everything 0xfd:

http://en.gentoo-wiki.com/wiki/RAID/Software#Setup_Partitions

I'm not sure whether that is good advice or not for RAIDs that could
be assembled later, but that's what I did, and it leads to the kernel
trying to do everything before the system is fully up and mdadm is
really running.
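
If I do switch them to type 83, the plan would presumably be to list
the non-root arrays in /etc/mdadm.conf and let mdadm (its init script,
or a manual `mdadm --assemble --scan`) bring them up after boot.
Roughly -- the UUIDs below are placeholders, the real lines come from
`mdadm --detail --scan`:

# hypothetical /etc/mdadm.conf entries; UUIDs are placeholders
DEVICE /dev/sd*[0-9]
ARRAY /dev/md6  UUID=aaaaaaaa:bbbbbbbb:cccccccc:dddddddd
ARRAY /dev/md11 UUID=eeeeeeee:ffffffff:00000000:11111111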

   Anyway, when the failures happen, I can step through and fail, remove,
and re-add the partition to the array. (In this case fail and remove
aren't necessary.)

c2stable ~ # mdadm /dev/md5 -f /dev/sdc5
mdadm: set device faulty failed for /dev/sdc5:  No such device
c2stable ~ # mdadm /dev/md5 -r /dev/sdc5
mdadm: hot remove failed for /dev/sdc5: No such device or address
c2stable ~ # mdadm /dev/md5 -a /dev/sdc5
mdadm: re-added /dev/sdc5
c2stable ~ # mdadm /dev/md6 -a /dev/sdc6
mdadm: re-added /dev/sdc6
c2stable ~ #

At this point md5 is repaired and I'm waiting for md6:

c2stable ~ # cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sdc6[2] sda6[0] sdb6[1]
  247416933 blocks super 1.1 [3/2] [UU_]
  [>]  recovery = 22.0% (54525440/247416933)
finish=38.1min speed=84230K/sec

md11 : active raid0 sdd1[0] sde1[1]
  104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
  52436096 blocks [3/3] [UUU]

md5 : active raid1 sdc5[2] sdb5[1] sda5[0]
  52436032 blocks [3/3] [UUU]

unused devices: 
c2stable ~ #

   How do I get past this? It's happening 2-3 times a week! I'm
figuring 

Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-18 Thread Neil Bothwick
On Sun, 18 Apr 2010 09:57:38 +0300, Ciprian Dorin, Craciun wrote:

> Also a question about /boot on RAID1... I didn't manage to
> make it work... Could you, Neil, please tell me exactly how you did
> this? I'm most interested in how you've convinced Grub to work...

You just don't tell GRUB that it's working with one half of a RAID1
array. Unlike all other RAID levels, with RAID1 you can also access the
individual disks.


-- 
Neil Bothwick

I am sitting on the toilet with your article before me. Soon it will be
behind me.




Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-18 Thread Jarry

On 18. 4. 2010 8:57, Ciprian Dorin, Craciun wrote:


 * there is a kernel option that must be enabled at compile time
to get automatic RAID detection and assembly by the kernel before
mounting /, but it works only with the MD 0.90 metadata (see
[1]);
 * the default metadata for `mdadm` is 1.2 (see `man mdadm`, and
search for `--metadata`), so when creating the RAID you must
explicitly select the metadata you want;



[1] 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/md.txt;h=188f4768f1d58c013d962f993ae36483195fd288;hb=HEAD



Which version of mdadm are you using? I have 3.0, and the default
metadata is 0.90:

 -e ,  --metadata=
Declare the style of RAID metadata (superblock) to be used.  The
default is 0.90 for --create, and to guess for other operations.
The  default can be overridden by setting the metadata value for
the CREATE keyword in mdadm.conf.

BTW, [1] talks about kernel 2.6.9; things might have changed since then...

Jarry


--
___
This mailbox accepts e-mails only from selected mailing-lists!
Everything else is considered to be spam and therefore deleted.



Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-17 Thread Ciprian Dorin, Craciun
On Sun, Apr 18, 2010 at 1:01 AM, Neil Bothwick  wrote:
> On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:
>
>> Empirically, anyway, there doesn't seem to be a problem. I built the
>> new kernel and it booted normally, so I think I'm either misinterpreting
>> what was written in the Wiki or the Wiki is wrong.
>
> As long as /boot is not on RAID, or is on RAID1, you don't need an
> initrd. I've been booting this system for years with / on RAID1 and
> everything else on RAID5.


From my research on the topic (I also wanted to have both /boot
and / on RAID1), there are the following traps:
* there is a kernel option that must be enabled at compile time
to get automatic RAID detection and assembly by the kernel before
mounting /, but it works only with the MD 0.90 metadata (see [1];
the relevant .config options are sketched right after this list);
* the default metadata for `mdadm` is 1.2 (see `man mdadm`, and
search for `--metadata`), so when creating the RAID you must
explicitly select the metadata you want;
* indeed the preferred way to do it is using an initramfs; (I've
posted below some shell snippets that do exactly this: assemble
my RAID); (the code snippets are between {{{...}}}; they come from a
MoinMoin wiki page;)
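
For the first trap, the kernel side boils down to a few options that
have to be built in (=y, not as modules) so the kernel can assemble
the array before mounting /; a rough sketch of the relevant .config
fragment (add the personalities matching your RAID levels):

# relevant kernel .config fragment (sketch)
CONFIG_BLK_DEV_MD=y
CONFIG_MD_AUTODETECT=y
CONFIG_MD_RAID1=y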

Also a question about /boot on RAID1... I didn't manage to
make it work... Could you, Neil, please tell me exactly how you did
this? I'm most interested in how you've convinced Grub to work...

Best,
Ciprian.

[1] 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/md.txt;h=188f4768f1d58c013d962f993ae36483195fd288;hb=HEAD


 Init-ramfs preparation 

{{{
mkdir -p /usr/src/initramfs

cd /usr/src/initramfs

mkdir /usr/src/initramfs/bin
mkdir /usr/src/initramfs/dev
mkdir /usr/src/initramfs/proc
mkdir /usr/src/initramfs/rootfs
mkdir /usr/src/initramfs/sys

cp -a /bin/busybox /usr/src/initramfs/bin/busybox
cp -a /sbin/mdadm /usr/src/initramfs/bin/mdadm
cp -a /sbin/jfs_fsck /usr/src/initramfs/bin/jfs_fsck

cp -a /dev/console /usr/src/initramfs/dev/console
cp -a /dev/null /usr/src/initramfs/dev/null

cp -a /dev/sda2 /usr/src/initramfs/dev/sda2
cp -a /dev/sdc2 /usr/src/initramfs/dev/sdc2
cp -a /dev/md127 /usr/src/initramfs/dev/md127
}}}

{{{
cat >/usr/src/initramfs/init <<'EOS'
#!/bin/busybox ash

exec </dev/null 2>/dev/console
exec 1>&2

/bin/busybox mount -n -t proc none /proc || exit 1
/bin/busybox mount -n -t sysfs none /sys || exit 1

/bin/mdadm -A /dev/md127 -R -a md /dev/sda2 /dev/sdc2 || exit 1

/bin/jfs_fsck -p /dev/md127 || true

/bin/busybox mount -n -t jfs /dev/md127 /rootfs \
    -o ro,exec,suid,dev,relatime,errors=remount-ro || exit 1

/bin/busybox umount -n /sys || exit 1
/bin/busybox umount -n /proc || exit 1

# /bin/busybox ash </dev/console 2>/dev/console || exit 1

exec /bin/busybox switch_root /rootfs /sbin/init || exit 1

exit 1

EOS

chmod +x /usr/src/initramfs/init
}}}

{{{
( cd /usr/src/initramfs ; find . | cpio --quiet -o -H newc | gzip -9 > /boot/initramfs )
}}}
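
And the boot loader entry that loads this initramfs would be along
these lines -- a sketch only, the kernel image name and the /boot
layout are assumptions, adapt them to your setup:

{{{
# hypothetical grub legacy entry for /boot/grub/menu.lst
title  Linux (/ on RAID1, via initramfs)
root   (hd0,0)
kernel /vmlinuz
initrd /initramfs
}}}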



Re: [gentoo-user] Re: initramfs & RAID at boot time

2010-04-17 Thread Neil Bothwick
On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:

> Empirically, anyway, there doesn't seem to be a problem. I built the
> new kernel and it booted normally, so I think I'm either misinterpreting
> what was written in the Wiki or the Wiki is wrong.

As long as /boot is not on RAID, or is on RAID1, you don't need an
initrd. I've been booting this system for years with / on RAID1 and
everything else on RAID5.


-- 
Neil Bothwick

Scientists decode the first confirmed alien transmission from outer space
...
"This really works! Just send 5*10^50 H atoms to each of the five star
systems listed below. Then, add your own system to the top of the list,
delete the system at the bottom, and send out copies of this message to
100 other solar systems. If you follow these instructions, within 0.25 of
a galactic rotation you are guaranteed to receive enough hydrogen in
return to power your civilization until entropy reaches its maximum!"

