RAID 1 error question - boot problem.

2009-07-22 Thread Robin Laing

Hello,

I am trying to help someone get a system to reboot after a system issue.
He is not the builder of the system, and the person who knows the
system is away for a few weeks.  Great timing.  :)


It is Fedora 10.

Two drives, two partitions on each drive, one of which is a mirror.

/dev/sda is partitioned as /boot and /, which is the RAID 1 mirror.
/dev/sdb is partitioned as swap and /.

The system wouldn't come back up on reboot.  After creating the RAID
arrays, it stopped with an error saying that it cannot find
/dev/md0.  I don't have the exact error message right now.


Using an Ubuntu disk (the person's personal preference), the system was
booted into a live session, and gparted showed the partitions to be as
above with, as I see it, one error.


/dev/sda1 ext3 boot
/dev/sda2 ext3 /   raid

/dev/sdb1 Swap
/dev/sdb2 Unknown  / raid

Running mdadm /dev/sdb2 --examine shows that the partition's superblock
reports RAID 1 and that it is clean.


This is a critical system, used as a virtual server, so getting it
running again is a priority.


With only the second drive installed, we tried to run fsck.ext3 on
/dev/sda2 (normally sdb2) with no success.  We also tried /dev/md0, as
Ubuntu had created /dev/md0 from the single drive.


The user has not tried to boot with only the one drive in yet.  He is 
making a copy of the drive on a different system.


Now, the question.  When booting from a RAID 1 (mirror) array, if there
is a problem with the RAID system, how does the boot process read the
mdadm.conf file when it is on the RAID array that needs to be created?
Is there some data stored in /boot or someplace else that has the
necessary info to tell the system how to build the array?


Is it part of the /boot/grub/device.map or /boot/System.map* ?

Any suggestions on where to start?

Thank you.

--
Robin Laing

--
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
Guidelines: http://fedoraproject.org/wiki/Communicate/MailingListGuidelines


Re: RAID 1 error question - boot problem.

2009-08-18 Thread Robin Laing

Bill Davidsen wrote:

> The linux-raid group would have been a better choice, but this is a
> simple question. The mdadm.conf file should get put in the initrd file,
> which is in the /boot partition, which you didn't mirror for some
> reason. I'm guessing that sda2 is a better place to start, since that's
> recognizable as an ext3 partition. Having a partition identify as
> "Unknown" is usually not a good thing. I would mount that partition and
> copy the contents to a secure backup if this is critical.
>
>
> "I don't have the exact error message right now" doesn't help, I suggest
> backing up sda2, and sda1 if you can, noting the error message, and post
> back. Without more information I am guessing that the sdb2 partition is
> in some way hosed, do NOT run fsck on sda2 before backing up, and run
> with the "-n" option to see what condition the f/s is in. I doubt you've
> lost your data yet, don't do anything which would change that.




Thank you for the information.

This was complicated, as I was helping an admin (an Ubuntu user) who had
never worked with md and was filling in for the admin who created the
system.  I have had my share of md issues between various versions of
Fedora over the years.  He had already started trying to fix the system
before I even heard about the problem.  That is part of the reason I
didn't have the answers.


As it turned out, there were two problems, one on each drive.  We got 
the system up and running before there was any response to my query.


We wanted to understand more things about the md operation, especially 
on booting with md arrays and using the /boot partition.


You confirmed my suspicion that the superblock was the key.

--
Robin Laing



Re: RAID 1 error question - boot problem.

2009-07-24 Thread Roberto Ragusa

Robin Laing wrote:

> /dev/sda1 ext3 boot
> /dev/sda2 ext3 /   raid
> 
> /dev/sdb1 Swap
> /dev/sdb2 Unknown  / raid
> 
> Running mdadm /dev/sdb2 --examine shows that the partition superblock is
> showing RAID 1 and that it is clean.

If sda2 and sdb2 are part of a RAID-1 they should be identical.
The fsck should give the same result on both, but sdb2
is not even recognized as ext3.
I do not know exactly the meaning of what you have pasted, but
I suppose that ext3 and unknown is from content autodetection
and raid is from partition type (0xfd).

So you should have an md device (md0?) composed of sda2 and sdb2,
which is not happening.

> As this is a critical system, it is a priority and is being used as a
> virtual server.
> 
> With only the second drive installed, we tried to run fsck.ext3 on the
> /dev/sda2 (normally b2) with no success.  We also tried /dev/md0 as
> Ubuntu has created the /dev/md0 from the single drive.

So something is wrong on sdb2.

If sda2 is good, you should just get the RAID working on one drive
(degraded mode) and then re-add sdb2 to the mirror.
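
A rough sketch of those two steps, using the device names from this
thread (/dev/sda2 as the good member, /dev/sdb2 as the suspect one,
/dev/md0 as the array). The function name and the dry-run wrapper are
made up for illustration and nobody in the thread ran this; take a
backup before trying anything like it for real. By default it only
prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: device names come from the thread, adjust before use.
# With DRY_RUN=1 (the default) each command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# recover_mirror GOOD BAD MD (hypothetical helper)
recover_mirror() {
    good="$1"; bad="$2"; md="$3"
    # Start the array from the good member alone; --run lets mdadm
    # start it even though it is degraded.
    run mdadm --assemble --run "$md" "$good"
    # Add the other partition back; md then resyncs it from the good one.
    run mdadm "$md" --add "$bad"
    # Show array state and resync progress.
    run cat /proc/mdstat
}

recover_mirror /dev/sda2 /dev/sdb2 /dev/md0
```

Running it as root with DRY_RUN=0 would execute the commands instead of
printing them. mdadm also has --re-add for a member that was recently
part of the array; plain --add is used here on the assumption that sdb2
needs a full resync.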

> The user has not tried to boot with only the one drive in yet.  He is
> making a copy of the drive on a different system.
> 
> Now, the question.  On booting from a mirror 1 array, if there is a
> problem with the raid system, how does the boot process read the
> mdadm.conf file when it is on the RAID array that needs to be created?
> Is there some data that is stored in the /boot or someplace else that
> has the necessary info to tell the system how to build the array?

It will not read mdadm.conf.
The info is in the superblocks of the partitions.

> Is it part of the /boot/grub/device.map or /boot/System.map* ?

No.
Just superblocks, and possibly the initrd image you boot from.

> Any suggestions to where to start?

Boot from a live cd and do:

  fdisk -l /dev/sda
  fdisk -l /dev/sdb
  blkid /dev/sda1
  blkid /dev/sda2
  blkid /dev/sdb1
  blkid /dev/sdb2
  cat /proc/mdstat
  dmesg | less
  fsck -n /dev/sda2   # -n: check only, do not let it modify anything
  fsck -n /dev/sdb2   # -n: check only, do not let it modify anything
  fsck -n /dev/md0    # -n: check only, do not let it modify anything

these will give you more info to understand what's happening.
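
When reading the /proc/mdstat output from the list above, the bracketed
member map is the part to look at: [UU] means both halves of the mirror
are up, while an underscore, as in [U_], marks a missing member. A small
helper along these lines could flag degraded arrays. The function name
is made up for this example, and it is only a sketch of how to read the
file, not a replacement for looking at the full output.

```shell
# degraded_arrays [FILE]: print the name of every md array listed in FILE
# (mdstat format, default /proc/mdstat) whose member map, e.g. [U_],
# contains an underscore, meaning at least one member is missing.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md0 : active raid1 sda2[0]"
        /^md[0-9]+ :/ { name = $1 }
        # The following status line carries the member map: [UU] or [U_]
        name != "" && match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Called with no argument it reads /proc/mdstat on the running system; a
saved copy of the file can be passed as the first argument instead.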

Best regards.
-- 
   Roberto Ragusa    mail at robertoragusa.it



Re: RAID 1 error question - boot problem.

2009-07-24 Thread Bill Davidsen

The linux-raid group would have been a better choice, but this is a simple 
question. The mdadm.conf file should get put in the initrd file, which is in the 
/boot partition, which you didn't mirror for some reason. I'm guessing that sda2 
is a better place to start, since that's recognizable as an ext3 partition. 
Having a partition identify as "Unknown" is usually not a good thing. I would 
mount that partition and copy the contents to a secure backup if this is critical.


"I don't have the exact error message right now" doesn't help. I suggest backing 
up sda2, and sda1 if you can, noting the error message, and posting back. Without 
more information I am guessing that the sdb2 partition is in some way hosed. Do 
NOT run fsck on sda2 before backing up, and when you do run it, use the "-n" 
option to see what condition the f/s is in. I doubt you've lost your data yet, 
so don't do anything which would change that.
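
The backup-first order described above could look roughly like the
sketch below. The helper name and the /mnt/backup path in the usage note
are hypothetical, and a plain dd copy is only one way to take the
backup; it assumes the source can be read without errors.

```shell
# image_partition SRC DEST: raw, byte-for-byte copy of SRC (a partition
# or a file) into the image file DEST. Nothing is written to SRC.
image_partition() {
    dd if="$1" of="$2" bs=1M
}
```

For example, as root from the live CD: image_partition /dev/sda2
/mnt/backup/sda2.img, and only once that copy is safely stored, fsck -n
/dev/sda2 to see what condition the filesystem is in without changing it.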


--
Bill Davidsen 
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
