Re: [CentOS] kernel not booting after update

2009-11-15 Thread Spiro Harvey
Larry Brigman larry.brig...@gmail.com wrote:
> If the /boot is also part of the raid and it is a soft raid (fake raid
> is the same)
> then maybe only one of the mirror is being updated and grub is
> looking at the other mirror and not finding the files needed.

I think you're on the right track here. I dropped the raid set and
rebuilt the box, and this time took note of the syncing. dmraid -s
kept telling me the mirror was ok, so I'm guessing it synced correctly.

I installed the update again, then set -53 to boot first in the grub
order, but it dumped me at the grub prompt.

So I typed kernel 2.6.18- and hit tab, and saw both files (-53 and
-164). Hit tab again, and it completed -53. I went back, typed -164,
and selected that. The first time I got Error 13 (invalid or unsupported
executable format). I reran the kernel line and this time got Error 15:
File not found. Ran it again and got Error 13. It pretty much alternated.

So it looks like one side of the mirror isn't getting synced properly.
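
One way to test that guess, as a rough sketch: with the array broken up (or the box booted with nodmraid) and each disk's /boot partition mounted read-only on its own, hash every file on both sides and list what differs. The function names are made up for illustration, and the mount points in the note below are hypothetical.

```python
import hashlib
import os


def hash_tree(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            h = hashlib.sha256()
            with open(full, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def compare_mirrors(root_a, root_b):
    """Return (only_in_a, only_in_b, differing) as sorted lists of paths."""
    a, b = hash_tree(root_a), hash_tree(root_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, differing
```

For example, compare_mirrors("/mnt/disk_a_boot", "/mnt/disk_b_boot") with the two halves mounted at those (hypothetical) paths. A kernel or initrd file that is missing on one side, or shows up in the differing list, would be consistent with the alternating Error 13 / Error 15 behaviour.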

There are 7 other boxen for which this has worked, so it's possible
this one is just faulty.

I'm also going to try Rob's idea of nodmraid to see what happens
there.

Appreciate all the help.


-- 
Spiro Harvey  Knossos Networks Ltd
021-295-1923  www.knossos.net.nz


___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] kernel not booting after update

2009-11-13 Thread Rob Kampen

Spiro Harvey wrote:
>> I would backup ALL your file systems off that disk, perhaps using a
>
> This is a fresh install, so that's not an issue.
>
>> Linux rescue CD, then configure the controller in the BIOS for JBOD,
>> use a rescue disk to build mdraid partitions, and restore your files
>> from the backups.   you may have to rebuild the /boot/initrd on the
>> system to dump the fakeraid (dmraid) driver and enable the mdraid
>> native linux raid driver
>
> I'm interested in knowing why the machine isn't booting some kernels,
> but will happily boot another. I figure if it's a hardware issue, then
> it should be an all-or-nothing issue? I'm positive this is the same
> spec as the last servers built for this same purpose, but the others
> are now on the other side of the country, so I can't access them to
> verify.
>
> So assuming the hardware is exactly the same, and assuming there's
> something in the -164 kernel that doesn't like that particular fake
> raid card, then I still can't see why I can't boot the -128 kernel as
> that's what the other boxes have running. :/

Spiro,
I had a similar problem with an Intel MB when moving from the -53 kernel
to a newer one, and ended up adding nodmraid to the kernel line in grub
so I could actually use the drives. For some reason no BIOS setting would
put the onboard fake raid into a mode that the kernel could deal with.
I suggest you do the backup and re-install with mdraid; it has worked like
a charm since I did this.
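
For anyone wanting to script that change, a minimal sketch in Python: it appends an argument (nodmraid by default) to every kernel line of a GRUB legacy grub.conf passed in as text. The function name is illustrative and nothing here comes from Rob's actual system. It only transforms a string, so reading and writing the real grub.conf is left to the caller.

```python
def add_kernel_arg(grub_conf_text, arg="nodmraid"):
    """Append arg to each 'kernel' line that does not already carry it."""
    out = []
    for line in grub_conf_text.splitlines():
        parts = line.split()
        # Commented-out lines start with '#', so they never match 'kernel'.
        if parts and parts[0] == "kernel" and arg not in parts[1:]:
            line = line.rstrip() + " " + arg
        out.append(line)
    # Preserve the presence or absence of a trailing newline.
    trailer = "\n" if grub_conf_text.endswith("\n") else ""
    return "\n".join(out) + trailer
```

The function is idempotent: running it twice on the same text adds the argument only once. Keep a copy of the original grub.conf before writing anything back.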

HTH
Rob


Re: [CentOS] kernel not booting after update

2009-11-12 Thread John R Pierce
Spiro Harvey wrote:
> Box is a dual core Xeon (E8500) with hardware SATA RAID on board.

The E8500 is a desktop Core2Duo CPU, I thought? What sort of
hardware SATA RAID? Do you mean Intel Matrix RAID? That's not actually
hardware, that's BIOS/driver-implemented fake raid, and frankly, you'd be
better off using native linux mdraid.

Or did you mean the older E8500 server chipset, for the Xeon MP 70x0 or
71x0 series (P4 Prescott based) CPUs, in a server with proper onboard
raid such as an HP SmartArray or Dell PERC?

If the latter, never mind what I said above. Needless to say,
these part numbers can be very confusing.


Re: [CentOS] kernel not booting after update

2009-11-12 Thread Spiro Harvey
John R Pierce pie...@hogranch.com wrote:

> The E8500 is a desktop Core2Duo CPU, I thought? What sort of

Yes, my mistake. It's a Core 2 Duo. 

I don't know where I saw the Xeon sticker. I saw the E8500
in /proc/cpuinfo but didn't RTFS properly. :/ This was further confused
when I googled E8500 and one of the hits mentioned Xeon...

The raid is indeed an Intel Matrix RAID. The BIOS is configured so that
the SATA controller is in RAID mode, and the OPROM is set to Matrix
RAID.



-- 
Spiro Harvey  Knossos Networks Ltd
021-295-1923  www.knossos.net.nz




Re: [CentOS] kernel not booting after update

2009-11-12 Thread John R Pierce

> The raid is indeed an Intel Matrix RAID. The BIOS is configured so that
> the sata controller is in RAID mode, and the OPROM is set to Matrix
> Raid.

I would back up ALL your file systems off that disk, perhaps using a
Linux rescue CD, then configure the controller in the BIOS for JBOD, use
a rescue disk to build mdraid partitions, and restore your files from
the backups. You may have to rebuild the /boot/initrd on the system to
drop the fakeraid (dmraid) driver and enable the native linux mdraid
driver.

Fake raid like Intel Matrix RAID is NOT recommended for linux/unix systems:
http://thebs413.blogspot.com/2005/09/fake-raid-fraid-sucks-even-more-at.html

Someone's procedure for undoing a fakeraid:
http://www.brandonchecketts.com/archives/disabling-dmraid-fakeraid-on-centos-5



Re: [CentOS] kernel not booting after update

2009-11-12 Thread Ross Walker
On Nov 12, 2009, at 7:53 PM, Spiro Harvey sp...@knossos.net.nz wrote:

>> I would backup ALL your file systems off that disk, perhaps using a
>
> This is a fresh install, so that's not an issue.
>
>> Linux rescue CD, then configure the controller in the BIOS for JBOD,
>> use a rescue disk to build mdraid partitions, and restore your files
>> from the backups.   you may have to rebuild the /boot/initrd on the
>> system to dump the fakeraid (dmraid) driver and enable the mdraid
>> native linux raid driver
>
> I'm interested in knowing why the machine isn't booting some kernels,
> but will happily boot another. I figure if it's a hardware issue, then
> it should be an all-or-nothing issue? I'm positive this is the same
> spec as the last servers built for this same purpose, but the others
> are now on the other side of the country, so I can't access them to
> verify.
>
> So assuming the hardware is exactly the same, and assuming there's
> something in the -164 kernel that doesn't like that particular fake
> raid card, then I still can't see why I can't boot the -128 kernel as
> that's what the other boxes have running. :/

You might have installed a driver for the fake raid at some earlier
point, which added it to /etc/modprobe.conf and ran mkinitrd to include
it in the initrd, and then later removed it, so that from that point on
newer kernels didn't get the driver in their initrd images?
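
To check this theory, the module lists of a working and a non-working initrd can be compared. A rough sketch, assuming the initrd images are gzip-compressed cpio archives in the "newc" format (which is what mkinitrd on CentOS 5 is generally understood to produce); the function name is made up for illustration.

```python
import gzip


def list_initrd_modules(initrd_path):
    """Return sorted basenames of .ko files in a gzip'd newc cpio initrd."""
    with gzip.open(initrd_path, "rb") as fh:
        data = fh.read()
    names = []
    pos = 0
    while pos + 110 <= len(data):
        if data[pos:pos + 6] not in (b"070701", b"070702"):
            raise ValueError("not a newc cpio header at offset %d" % pos)
        # Thirteen 8-digit hex fields follow the 6-byte magic; index 6 is
        # the file size and index 11 the name size (including the NUL).
        fields = [int(data[pos + 6 + 8 * i:pos + 14 + 8 * i], 16)
                  for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        name_start = pos + 110
        name = data[name_start:name_start + namesize - 1].decode(
            "utf-8", "replace")
        if name == "TRAILER!!!":
            break
        # Header plus name, and the file data, are each padded to 4 bytes.
        data_start = (name_start + namesize + 3) & ~3
        pos = (data_start + filesize + 3) & ~3
        if name.endswith(".ko"):
            names.append(name.rsplit("/", 1)[-1])
    return sorted(names)
```

Running it on the initrd for the kernel that boots and on the one that doesn't, then comparing the two lists, would show whether a storage module is missing from the newer image.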

Just an idea.

-Ross



Re: [CentOS] kernel not booting after update

2009-11-12 Thread Larry Brigman
On Thu, Nov 12, 2009 at 9:22 PM, Ross Walker rswwal...@gmail.com wrote:
> On Nov 12, 2009, at 7:53 PM, Spiro Harvey sp...@knossos.net.nz wrote:
>
>>> I would backup ALL your file systems off that disk, perhaps using a
>>
>> This is a fresh install, so that's not an issue.
>>
>>> Linux rescue CD, then configure the controller in the BIOS for JBOD,
>>> use a rescue disk to build mdraid partitions, and restore your files
>>> from the backups.   you may have to rebuild the /boot/initrd on the
>>> system to dump the fakeraid (dmraid) driver and enable the mdraid
>>> native linux raid driver
>>
>> I'm interested in knowing why the machine isn't booting some kernels,
>> but will happily boot another. I figure if it's a hardware issue, then
>> it should be an all-or-nothing issue? I'm positive this is the same
>> spec as the last servers built for this same purpose, but the others
>> are now on the other side of the country, so I can't access them to
>> verify.
>>
>> So assuming the hardware is exactly the same, and assuming there's
>> something in the -164 kernel that doesn't like that particular fake
>> raid card, then I still can't see why I can't boot the -128 kernel as
>> that's what the other boxes have running. :/
>
> You might have installed a driver for the fake raid before which added
> it to /etc/modprobe.conf and did a mkinitrd to add it to the initrd
> during boot, but at some point removed it and from that point on newer
> kernels didn't get the driver in their initrd images?


If /boot is also part of the raid and it is a soft raid (fake raid
is the same), then maybe only one side of the mirror is being updated
and grub is looking at the other side and not finding the files it needs.