Re: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

2008-02-14 Thread Wojtek Pilorz
On Thu, Feb 14, 2008 at 03:08:54PM +1100, Steven Haigh wrote:
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
  Behalf Of nate
  Sent: Thursday, 14 February 2008 2:46 PM
  To: centos@centos.org
  Subject: Re: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.
  
  Indunil Jayasooriya wrote:
  
   I also had this type of problem once before. Please check the initrd image.
   Please perform the steps below.
  
  While it's always good to make sure your initrd is in a good state,
  the network drivers don't need to be in the initrd (unless you're booting
  from NFS or something). They can be loaded fine from
  /lib/modules/`uname -r`.
  
  What kind of network chip(s) are in the system? What driver are they
  using (see /etc/modprobe.conf)? It would also be helpful to have the
  dmesg output from the kernel that doesn't provide networking support.
 
 The network is an e100 - dmesg shows the following:
   # dmesg | grep e100:
   e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
   e100: Copyright(c) 1999-2005 Intel Corporation
   e100: eth0: e100_probe: addr 0xdfffe000, irq 169, MAC addr 00:02:B3:8B:BE:26
   e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
 
 Of course, this doesn't give us the exact chip, however mii-tool is a bit
 more helpful:
   # mii-tool -v eth0
   eth0: negotiated 100baseTx-FD, link ok
     product info: Intel 82555 rev 4
     basic mode:   autonegotiation enabled
     basic status: autonegotiation complete, link ok
     capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
     advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
     link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
 
 The interesting part for me, however, is that certain things unrelated to the
 network also fail. I would expect iptables to come up OK on boot, even
 if no network device was configured, as it's independent of network
 configuration. It also doesn't explain why the firmware microcode update
 fails.
 
I had a similar problem with a Linux system (Fedora) that was running SELinux
in enforcing mode (as CentOS does by default). I had booted from a CD that
did not support SELinux and edited some configuration files (such as
ifcfg-eth0), which lost their SELinux labels as a result.
This is most probably different from what you are seeing (one kernel working
OK, the other not); no one was tinkering with /lib/modules from a
non-SELinux CD, right?



  You could write a script for someone at the remote co-lo to run
  when the system comes up without network. The results could be stored in
  a file on the disk, and once the system is rebooted under the
  old kernel you can examine them for possible causes.
  
  Some commands to try:
  dmesg
  ifconfig -a
  mii-tool
  route -n
  ping -c 5 (IP of default gateway)
  arping -c 5 (IP of default gateway)
  arp -an
  lsmod
 
 I have a bit of trouble with this, as the only person that can do it is
 around 30 minutes travel from the colo. As the system boots, I'm thinking of
 writing a script that will gather this, then reboot the system after
 changing the default=x line in /etc/grub.conf - however obviously I want to
 make sure it works 100% before I tell the machine to reboot ;)
An IP KVM device would be your friend (unfortunately, they are not cheap...)


 
 --
 Steven Haigh
 
 Email: [EMAIL PROTECTED]
 Web: http://www.crc.id.au
 Phone: (03) 9001 6090 - 0412 935 897
Best regards,

Wojtek

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


RE: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

2008-02-14 Thread William L. Maltby
On Thu, 2008-02-14 at 14:57 +1100, Steven Haigh wrote:
 There are a number of differences in the initrd, although nothing that I
 would call an obvious cause of the issue.
 
 -
 # gunzip -cd /boot/initrd-2.6.18-8.1.8.el5.img |cpio -t |more
 6097 blocks
 bin
 snip ...

 sys
 etc
 #
 -
 # gunzip -cd /boot/initrd-2.6.18-53.1.13.el5.img |cpio -t |more
 9679 blocks
 bin
 bin/dmraid
 snip

 sys
 etc
 #
 -

Do yourself a favor, as you'll probably have several more comparisons to
do.

When making the lists, sort the output (either pipe it to sort or make a
sorted version afterward) and use comm (see man comm). You get a nice
consolidated output, or you can select any combination of lines only in
file1, only in file2, or in both. It makes detecting differences much faster.
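
A minimal sketch of that comparison, assuming the two listings have already
been saved to files. The helper names only_in_new and only_in_old and the
file names old.lst and new.lst are made up for illustration; only sort and
comm themselves come from the suggestion above.

```shell
# Compare two initrd file listings with sort + comm.
# only_in_new / only_in_old are hypothetical helper names.

# Print entries that appear only in the second (newer) listing.
only_in_new() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort "$1" > "$old_sorted"
    LC_ALL=C sort "$2" > "$new_sorted"
    # comm columns: 1 = only in file1, 2 = only in file2, 3 = in both.
    # -13 suppresses columns 1 and 3, leaving lines unique to file2.
    LC_ALL=C comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Print entries that appear only in the first (older) listing.
only_in_old() {
    only_in_new "$2" "$1"
}

# Usage (cpio prints the "N blocks" line on stderr, so it is not saved):
#   gunzip -cd /boot/initrd-2.6.18-8.1.8.el5.img | cpio -t > old.lst
#   gunzip -cd /boot/initrd-2.6.18-53.1.13.el5.img | cpio -t > new.lst
#   only_in_new old.lst new.lst
```

Going by the two full listings Steven posted in this thread, the entries
present only in the newer image should be bin/dmraid, bin/kpartx, and
lib/firmware.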

 snip

If grub had a one-time "next boot" option like LILO, I'd have some more
thoughts, but *sigh*

-- 
Bill



RE: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

2008-02-14 Thread nate
Steven Haigh wrote:

 I have a bit of trouble with this, as the only person that can do it is
 around 30 minutes travel from the colo. As the system boots, I'm thinking of
 writing a script that will gather this, then reboot the system after
 changing the default=x line in /etc/grub.conf - however obviously I want to
 make sure it works 100% before I tell the machine to reboot ;)

I looked at your original email again, and if I read your previous
kernel version right, it's over a year since you last updated the kernel?
(2.6.18-8 was released 1/07 by RH, though I can't find 8.1.8.)

I was browsing through the changelog and saw several e100-related
changes, which could be related to the network end of your problems.
Without more detailed error messages for the failures, the best thing
I can suggest at this point is to try a few kernels in between the one
you were on and the latest and see whether any of them break. Most
likely they will, as the latest kernel only has one change in it. Maybe
you can narrow it down to a particular kernel revision.
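
The changelog browsing mentioned above can be done from the command line.
The following is only a sketch: it assumes the kernel package in question is
installed so that rpm can print its changelog, and changelog_mentions is a
made-up helper name.

```shell
# Filter changelog text on stdin for lines mentioning a driver.
# changelog_mentions is a hypothetical helper name.
changelog_mentions() {
    # -i: case-insensitive; -F: treat the driver name as a fixed string.
    # Note that "e100" also matches "e1000" entries, so read the hits.
    grep -i -F -- "$1"
}

# Usage on a system where the package is installed:
#   rpm -q --changelog kernel-2.6.18-53.1.13.el5 | changelog_mentions e100
# List the installed kernel packages to see which versions are available:
#   rpm -q kernel
```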

nate




Re: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

2008-02-13 Thread Indunil Jayasooriya
 After the latest lot of kernel security updates came out, I updated one
 of my colo boxes and rebooted. It didn't come back up, and it fails when
 booting on:
 * CPU Microcode update
 * iptables
 * eth0

 The boot process completes; however, as you can imagine, there is no
 network connectivity at all. The only config change was installing the new
 kernel. Booting back into 2.6.18-8.1.8.el5 makes things work 100% again.

I also had this type of problem once before. Please check the initrd image.
Please perform the steps below.

First, list the contents of the initrd for the working kernel:

 gunzip -cd /boot/initrd-2.6.18-8.1.8.el5.img |cpio -t |more

Then do the same for your newly installed kernel:

 gunzip -cd /boot/initrd-2.6.18-53.1.13.el5.img |cpio -t |more

Please check what is missing. If something is missing, all you have to do
is rebuild the initrd using the mkinitrd command.
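
For reference, rebuilding the initrd for a given kernel version could look
like the sketch below. It only prints the command rather than running it.
The helper name mkinitrd_command is hypothetical, and the image path assumes
the /boot/initrd-<version>.img naming seen in the commands above.

```shell
# Print (not run) the mkinitrd command for a kernel version.
# mkinitrd_command is a hypothetical helper; review the printed
# command and run it as root only once you are happy with it.
mkinitrd_command() {
    kver=$1
    # -f lets mkinitrd overwrite an existing image of the same name.
    echo "mkinitrd -f /boot/initrd-$kver.img $kver"
}

mkinitrd_command 2.6.18-53.1.13.el5
```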

Please also see the URL below:

http://readlist.com/lists/centos.org/centos/2/13952.html

-- 
Thank you
Indunil Jayasooriya


RE: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

2008-02-13 Thread Steven Haigh
There are a number of differences in the initrd, although nothing that I
would call an obvious cause of the issue.

-
# gunzip -cd /boot/initrd-2.6.18-8.1.8.el5.img |cpio -t |more
6097 blocks
bin
bin/modprobe
bin/insmod
bin/nash
dev
dev/tty6
dev/zero
dev/tty5
dev/console
dev/ram1
dev/ttyS3
dev/tty0
dev/ttyS0
dev/null
dev/tty3
dev/tty10
dev/ram0
dev/ptmx
dev/rtc
dev/tty
dev/tty8
dev/ttyS1
dev/systty
dev/ram
dev/tty7
dev/tty1
dev/tty11
dev/tty4
dev/tty2
dev/tty12
dev/tty9
dev/ttyS2
dev/mapper
proc
lib
lib/jbd.ko
lib/uhci-hcd.ko
lib/ext3.ko
lib/ohci-hcd.ko
lib/ehci-hcd.ko
init
sysroot
sbin
sys
etc
#
-
# gunzip -cd /boot/initrd-2.6.18-53.1.13.el5.img |cpio -t |more
9679 blocks
bin
bin/dmraid
bin/modprobe
bin/insmod
bin/kpartx
bin/nash
dev
dev/tty6
dev/zero
dev/tty5
dev/console
dev/ram1
dev/ttyS3
dev/tty0
dev/ttyS0
dev/null
dev/tty3
dev/tty10
dev/ram0
dev/ptmx
dev/rtc
dev/tty
dev/tty8
dev/ttyS1
dev/systty
dev/ram
dev/tty7
dev/tty1
dev/tty11
dev/tty4
dev/tty2
dev/tty12
dev/tty9
dev/ttyS2
dev/mapper
proc
lib
lib/jbd.ko
lib/uhci-hcd.ko
lib/ext3.ko
lib/firmware
lib/ohci-hcd.ko
lib/ehci-hcd.ko
init
sysroot
sbin
sys
etc
#
-

The obvious additions in .53 are kpartx and dmraid. However, as I'm using a
plain HDD (hda) with no RAID, I don't really think they would cause an
issue.

--
Steven Haigh

Email: [EMAIL PROTECTED]
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897




Re: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

2008-02-13 Thread nate
Indunil Jayasooriya wrote:

 I also had this type of problem once before. Please check the initrd image.
 Please perform the steps below.

While it's always good to make sure your initrd is in a good state,
the network drivers don't need to be in the initrd (unless you're booting
from NFS or something). They can be loaded fine from /lib/modules/`uname -r`.

What kind of network chip(s) are in the system? What driver are they
using (see /etc/modprobe.conf)? It would also be helpful to have the
dmesg output from the kernel that doesn't provide networking support.

You could write a script for someone at the remote co-lo to run
when the system comes up without network. The results could be stored in
a file on the disk, and once the system is rebooted under the
old kernel you can examine them for possible causes.

Some commands to try:
dmesg
ifconfig -a
mii-tool
route -n
ping -c 5 (IP of default gateway)
arping -c 5 (IP of default gateway)
arp -an
lsmod
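
A rough sketch of such a script, under the assumption that it is run as root
from the console. The helper names, the log file name, and the gateway
address in the usage comment are placeholders, not anything from the system
in question.

```shell
# Collect network diagnostics into a file for later inspection.
# run_and_log and collect_diagnostics are hypothetical helper names.

# Run one command, appending a header and its output (stdout and
# stderr) to the log; a missing or failing command is noted, not fatal.
run_and_log() {
    log=$1
    shift
    {
        echo "=== $* ==="
        if command -v "$1" >/dev/null 2>&1; then
            "$@" 2>&1 || echo "(exit status $?)"
        else
            echo "(command not found: $1)"
        fi
        echo
    } >> "$log"
}

# Run the commands listed above; $2 is the default gateway's IP.
collect_diagnostics() {
    log=$1
    gw=$2
    date >> "$log"
    run_and_log "$log" dmesg
    run_and_log "$log" ifconfig -a
    run_and_log "$log" mii-tool
    run_and_log "$log" route -n
    run_and_log "$log" ping -c 5 "$gw"
    run_and_log "$log" arping -c 5 "$gw"
    run_and_log "$log" arp -an
    run_and_log "$log" lsmod
}

# Example (file name and address are placeholders):
#   collect_diagnostics /root/netdiag.log 192.0.2.1
```

Logging each command under its own header, and carrying on when one fails,
means a single broken tool does not cost you the rest of the output.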

nate




RE: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.

2008-02-13 Thread Steven Haigh
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
 Behalf Of nate
 Sent: Thursday, 14 February 2008 2:46 PM
 To: centos@centos.org
 Subject: Re: [CentOS] Kernel 2.6.18-53.1.13.el5 fails on network.
 
 Indunil Jayasooriya wrote:
 
  I also had this type of problem once before. Please check the initrd image.
  Please perform the steps below.
 
 While it's always good to make sure your initrd is in a good state,
 the network drivers don't need to be in the initrd (unless you're booting
 from NFS or something). They can be loaded fine from
 /lib/modules/`uname -r`.
 
 What kind of network chip(s) are in the system? What driver are they
 using (see /etc/modprobe.conf)? It would also be helpful to have the
 dmesg output from the kernel that doesn't provide networking support.

The network is an e100 - dmesg shows the following:
# dmesg | grep e100:
e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
e100: eth0: e100_probe: addr 0xdfffe000, irq 169, MAC addr 00:02:B3:8B:BE:26
e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex

Of course, this doesn't give us the exact chip, however mii-tool is a bit
more helpful:
# mii-tool -v eth0
eth0: negotiated 100baseTx-FD, link ok
  product info: Intel 82555 rev 4
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

The interesting part for me, however, is that certain things unrelated to the
network also fail. I would expect iptables to come up OK on boot, even
if no network device was configured, as it's independent of network
configuration. It also doesn't explain why the firmware microcode update
fails.

 You could write a script for someone at the remote co-lo to run
 when the system comes up without network. The results could be stored in
 a file on the disk, and once the system is rebooted under the
 old kernel you can examine them for possible causes.
 
 Some commands to try:
 dmesg
 ifconfig -a
 mii-tool
 route -n
 ping -c 5 (IP of default gateway)
 arping -c 5 (IP of default gateway)
 arp -an
 lsmod

I have a bit of trouble with this, as the only person that can do it is
around 30 minutes travel from the colo. As the system boots, I'm thinking of
writing a script that will gather this, then reboot the system after
changing the default=x line in /etc/grub.conf - however obviously I want to
make sure it works 100% before I tell the machine to reboot ;)
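
The default=x change itself could be sketched as below. This is an
illustration only, not a tested recovery procedure: set_grub_default is a
made-up helper name, it deliberately does not reboot anything, and it is
best tried on a copy of the file first.

```shell
# Change the "default=N" line in a grub.conf-style file.
# set_grub_default is a hypothetical helper name.
set_grub_default() {
    conf=$1
    index=$2
    # Refuse anything that is not a plain non-negative integer.
    case $index in
        ''|*[!0-9]*) echo "bad index: $index" >&2; return 1 ;;
    esac
    # Refuse to touch a file that has no default= line at all.
    grep -q '^default=' "$conf" || {
        echo "no default= line in $conf" >&2
        return 1
    }
    # Keep a backup copy next to the original.
    cp -p "$conf" "$conf.bak" || return 1
    tmp=$(mktemp) || return 1
    # Write back with cat rather than sed -i, so that if the path is
    # a symlink it keeps pointing at the real file.
    sed "s/^default=.*/default=$index/" "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Dry run against a copy (paths are examples):
#   cp /etc/grub.conf /tmp/grub.conf.test
#   set_grub_default /tmp/grub.conf.test 1
#   diff /etc/grub.conf /tmp/grub.conf.test
```

Checking the diff on a copy shows exactly which line would change before the
real file, or the machine, is touched.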

--
Steven Haigh

Email: [EMAIL PROTECTED]
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897


