Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-05 Thread Kai Schaetzl
Kai Schaetzl wrote on Mon, 04 Mar 2013 19:15:46 +0100:

> Has anyone seen such a hardware failure where the link goes up but no
> packets go over the wire? It seems a bit unlikely that this hardware
> failure (and nothing else) should happen on a reboot after an upgrade.

It was indeed a weird hardware failure. Everything works fine with the
onboard LAN disabled and a cheap PCI network card installed.

Kai


___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-05 Thread SilverTip257
On Tue, Mar 5, 2013 at 9:22 AM, Kai Schaetzl mailli...@conactive.com wrote:

> Kai Schaetzl wrote on Mon, 04 Mar 2013 19:15:46 +0100:
>
> > Has anyone seen such a hardware failure where the link goes up but no
> > packets go over the wire? It seems a bit unlikely that this hardware
> > failure (and nothing else) should happen on a reboot after an upgrade.
>
> It was indeed a weird hardware failure. All works fine with disabled
> inboard LAN and a cheap PCI network card.


That's a suitable workaround for getting a system operational again.
In the end that is nothing more than a workaround, not a true solution. :-/

But it would have been helpful if you had shared more information (think
NIC model, NIC chipset, kernel module in use for that chipset).







-- 
---~~.~~---
Mike
//  SilverTip257  //


Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-05 Thread Kai Schaetzl
SilverTip257 wrote on Tue, 5 Mar 2013 12:28:29 -0500:

> But it would have been helpful if you had shared more information (think
> NIC model, NIC chipset, kernel module in use for that chipset).

Why? It's quite clear that this is a hardware failure. I tested a live CD
and PXE booting on it, with the same problem, before buying the new card.
The system disk also tested fine in another machine. It's got nothing to
do with the system, although it happened right after the update/reboot.

So, other than replacing the mobo, it *is* the solution. Mobo might be 
going haywire next as well, but currently it's absolutely stable. And I 
have a backup now in case it wants to go ...

Kai




[CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread Kai Schaetzl
I upgraded one of my old machines running 5.x to the latest kernel (from 
308.24.1 to 348.1.1).
After rebooting, network connectivity was gone. I rebooted with the old
kernel and also tried the one before it (308.20.1); still no luck. So I
assume it's got nothing to do with the kernel or even CentOS. But a
hardware failure also seems unlikely, see below.

ethtool shows the link as up, and as down if I remove the cable.
I attached a laptop via crossover cable, it detects the link, but same 
problem.
I disabled iptables and set selinux to disabled. No change.
There's a Xen VM running on that machine and I can ping it from the
host. So internal networking seems to be OK. I'm using bridged
networking for Xen connectivity, set up by normal Red Hat means, not via
Xen. Never had a problem.
There are no errors in the logs, except for dhcpd reporting that the
network is down, and named is also giving some weird errors. This is my
only dhcpd, so I would like to have it up ASAP :-(

Is there anything else besides a weird hardware failure that I could 
check? I'm going to get a new card tomorrow and see if that changes the 
situation. This is mobo internal networking based on nforce-MCP61.

Has anyone seen such a hardware failure where the link goes up but no 
packets go over the wire? It seems a bit unlikely that this hardware 
failure (and nothing else) should happen on a reboot after an upgrade.

Thanks.



Kai




Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread zGreenfelder
On Mon, Mar 4, 2013 at 1:15 PM, Kai Schaetzl mailli...@conactive.com wrote:
> I upgraded one of my old machines running 5.x to the latest kernel (from
> 308.24.1 to 348.1.1).
> After rebooting network connectivity was gone. I rebooted with the old
> kernel, I also tried the one before it (308.20.1) still no luck. So I
> assume it's got nothing to do with the kernel or even CentOS. But a
> hardware failure seems also unlikely, see below.
>
> ethtool shows the link as up and if I remove the cable as down.
> I attached a laptop via crossover cable, it detects the link, but same
> problem.
> I disabled iptables and set selinux to disabled. No change.
> There's a Xen VM running on that machine and I can ping it from the
> hardware. So, internal networking seems to be ok. I'm using bridged
> networking for Xen connectivity, setup by normal Red Hat means, not via
> Xen. Never had a problem.
> There are no errors in the logs, except for dhcpd telling network is down
> and named is also giving some weird errors. This is my only dhcpd, so I
> would like to have it up ASAP :-(
>
> Is there anything else besides a weird hardware failure that I could
> check? I'm going to get a new card tomorrow and see if that changes the
> situation. This is mobo internal networking based on nforce-MCP61.
>
> Has anyone seen such a hardware failure where the link goes up but no
> packets go over the wire? It seems a bit unlikely that this hardware
> failure (and nothing else) should happen on a reboot after an upgrade.



I've seen similarly weird things when running VMs behind some smart
switches where (and I'm not a networking guy, so my terminology
will get fuzzy) something was set to disable a port (port fast, maybe?)
if multiple MACs were seen on it. On machines other than my
desktop, I can normally get that fixed by having a trunk port and
default VLAN assigned to my port(s). Not sure if that applies
to your situation.




-- 
Even the Magic 8 ball has an opinion on email clients: Outlook not so good.


Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread Kai Schaetzl
Thanks for the tip, but, unfortunately, this cannot be the case here.
Networking of the host is also affected, even when Xen is shut off.
I have no smart switches in this office and I ruled out switches by using 
a direct connection to the laptop.

Kai




Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread James Hogarth

> thanks for the tip, but, unfortunately, this cannot be the case here.
> Networking of the host is also affected, even when Xen is shut off.
> I have no smart switches in this office and I ruled out switches by using
> a direct connection to the laptop.

So it's something unrelated to xen...

Is the host using a static address or dhcp?

If you run tcpdump, do you see all the packets you'd expect for layer 2
connectivity (i.e. ARP requests and responses)?

Does ifconfig (or ip -s link) show any transmit or receive errors? Do packet
counts go up?
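
One way to check whether the packet counts move, without comparing ifconfig output by eye, is to read the kernel's per-interface counters directly. This is only a minimal sketch, assuming a Linux host that exposes them under /sys/class/net/<iface>/statistics/. The function name and the optional second argument (an alternate sysfs root, there only so the function can be tried without a real NIC) are made up for illustration.

```shell
# Print the current TX/RX packet and error counters for one interface.
# $1 = interface name (e.g. eth0)
# $2 = sysfs root, defaults to /sys/class/net (hypothetical override for testing)
nic_counters() {
    iface=$1
    root=${2:-/sys/class/net}
    stats="$root/$iface/statistics"
    if [ ! -d "$stats" ]; then
        echo "no such interface: $iface" >&2
        return 1
    fi
    for c in tx_packets rx_packets tx_errors rx_errors; do
        printf '%s=%s\n' "$c" "$(cat "$stats/$c")"
    done
}
```

Running `nic_counters eth0` before and after a handful of pings and comparing the two outputs shows whether tx_packets and rx_packets increase and whether either error counter climbs.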

Given that ethtool states the link is up, I'd statically configure an
address and try to ping the gateway whilst running tcpdump. Then take
the packet dump (-w filename to save it) and take a look in Wireshark.
You should see 'who has gateway IP' as an ARP request and the response from
the gateway, along with the ICMP echo-request and echo-reply packets.

From there you can start diagnosis properly...
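
The capture-while-pinging step could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: it assumes the address is already configured statically, that it runs as root, and that tcpdump and ping are installed. The function name, the default output path and the 'arp or icmp' filter are illustrative choices.

```shell
# Capture ARP and ICMP on one interface while pinging the gateway,
# saving the packets to a file that can be opened in Wireshark later.
# $1 = interface, $2 = gateway IP, $3 = output file (default is hypothetical)
capture_ping() {
    iface=$1
    gw=$2
    out=${3:-/tmp/ping-test.pcap}
    tcpdump -nn -i "$iface" -w "$out" 'arp or icmp' &
    cap=$!
    sleep 1    # give the capture a moment to start
    ping -c 5 "$gw" || echo "no reply from $gw (the capture is still useful)"
    # Stop the capture; ignore errors if it has already exited.
    kill "$cap" 2>/dev/null || true
    wait "$cap" 2>/dev/null || true
    echo "capture written to $out"
}
```

For example, `capture_ping eth0 192.0.2.1` (a placeholder gateway address) leaves a file in which the ARP request for the gateway should be visible even when no reply ever arrives.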


Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread Gordon Messmer
On 03/04/2013 01:54 PM, James Hogarth wrote:
> If you tcpdump do you see all the packets you'd expect for layer 2
> connectivity (ie ARP requests and responses?)

Specifically, use tcpdump on your bridged interface:
  tcpdump -nn -i br0

Check your bridge details and make sure that the ethernet device is listed:
  brctl show

If those look good, send the content of 
/etc/sysconfig/network-scripts/ifcfg-{br0,eth0} (or whatever eth device 
is a member of the bridge).
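
The "is the ethernet device listed" check can also be done mechanically. The helper below is a sketch that assumes the usual `brctl show` layout: a header line, one line per bridge, and continuation lines that carry only an interface name for additional ports. The function name is made up for illustration.

```shell
# Succeed if interface $2 is attached to bridge $1, reading `brctl show`
# output on stdin. Extra ports appear on continuation lines that hold
# only the interface name, so remember which bridge we are inside.
bridge_has_port() {
    awk -v br="$1" -v port="$2" '
        NR == 1 { next }                        # skip the column header
        NF >= 3 { cur = $1 }                    # a line that starts a new bridge
        cur == br && $NF == port { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage would look like `brctl show | bridge_has_port br0 eth0 && echo "eth0 is in br0"`.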


Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread Kai Schaetzl
Gordon Messmer wrote on Mon, 04 Mar 2013 15:29:58 -0800:

> Check your bridge details and make sure that the ethernet device is listed:
>  brctl show
>
> If those look good, send the content of
> /etc/sysconfig/network-scripts/ifcfg-{br0,eth0} (or whatever eth device
> is a member of the bridge).

This is all fine, it's been this way for years. It looks as it always has. No 
errors, collisions, whatever anywhere. TX and RX are about the same.
Just to prove that config is fine I removed the bridge and brought up a 
normal eth0. It's got the same problem. I've never seen such a problem 
before.

The tcpdump shows a lot of ARP requests:
  who-has IP tell IP
As I understand it, these are requests for MAC addresses, and "tell" is the
asking IP address? In that case there is at least *some* outside connectivity. Most
of the requests are from the local IP and the IP of the VM, but a few are 
from other machines on the network, including the outbound router. The VM 
runs a monitoring system and these are the clients that want to call in.
Also a few UDP requests (port 1900 and NBT), and that's all.
There are also a few responses to the arp requests, but mostly it's requests. 
Makes sense if it doesn't have much in the arp cache. arp -a lists two 
machines with missing MAC data, that's all.

Kai




Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread Robert
On Tue, 05 Mar 2013 02:02:54 +0100
Kai Schaetzl mailli...@conactive.com wrote:

> Gordon Messmer wrote on Mon, 04 Mar 2013 15:29:58 -0800:
>
> This is all fine, it's been this way for years. It looks as it always has. No
> errors, collisions, whatever anywhere. TX and RX are about the same.
> Just to prove that config is fine I removed the bridge and brought up a
> normal eth0. It's got the same problem. I've never seen such a problem
> before.

Things I would look at

1. route to ensure that the routing table is correct.
2. ifcfg-eth0, to see if any MAC addresses are listed; if so, ensure they
match the MAC address in the ifconfig output.
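
The second check can be sketched as a small comparison. It assumes the config file uses the usual HWADDR= line and that the kernel's view of the MAC is readable from /sys/class/net/eth0/address. The function name is made up for illustration.

```shell
# Compare HWADDR= in an ifcfg file with the MAC the kernel reports.
# $1 = ifcfg file
# $2 = file holding the live MAC (e.g. /sys/class/net/eth0/address)
hwaddr_matches() {
    cfg=$(sed -n 's/^HWADDR=//p' "$1" | tr -d '"' | tr 'A-F' 'a-f')
    live=$(tr 'A-F' 'a-f' < "$2")
    if [ -z "$cfg" ]; then
        echo "no HWADDR set in $1"
        return 0
    fi
    if [ "$cfg" = "$live" ]; then
        echo "HWADDR matches ($live)"
    else
        echo "MISMATCH: config has $cfg, kernel reports $live"
        return 1
    fi
}
```

For example: `hwaddr_matches /etc/sysconfig/network-scripts/ifcfg-eth0 /sys/class/net/eth0/address`.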


--  
Regards
Robert

Linux
The adventure of a lifetime.

Linux User #296285
Get Counted
http://linuxcounter.net/


Re: [CentOS] network connectivity lost after reboot/upgrade

2013-03-04 Thread Gordon Messmer
On 03/04/2013 05:02 PM, Kai Schaetzl wrote:
> The tcpdump shows a lot of arp requests
> who-has IP tell IP
> As I understand these are requests for MAC addresses? And tell is the asking
> IP number?

The arp request will have both the source IP address and the Ethernet 
address of the requesting host.  tcpdump will only print the IP unless 
you use the -e flag.

If the layout of your network is such a closely guarded secret that you 
can't share the information that we need to help, you're mostly on your 
own here.

At this point, the problem could be almost anything.  A bad switch port, 
or a bad switch, or a bad cable seem very likely.  Try a new cable to a 
new switch port and reboot the switch if the problem continues.  Try a 
full power down (as in, remove the power cable) for the affected system 
and with the switch.  It sounds like your system is receiving packets 
but unable to send them to other hosts.

From any other host on the network, you should be able to run:
   tcpdump -nn -e ether host mac
where mac is the Ethernet address of the system with no connectivity.
If you try to ping any address at all from the affected system, the other
host should see it broadcasting ARP requests for the local destination or
the default gateway.  If you don't see ARP requests on the other host, then
you know that the affected system isn't able to send out traffic.
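
As a convenience, the watch-by-MAC command can be wrapped so that a mistyped address is caught before tcpdump starts. This is only a sketch: the function name and the eth0 default for the capture interface are assumptions, and it needs root on the observing host.

```shell
# Watch for frames to or from one Ethernet address, printing link-level
# headers (-e) and skipping name resolution (-nn).
# $1 = MAC of the affected machine
# $2 = capture interface on the observing host (default eth0, an assumption)
watch_mac() {
    mac=$1
    iface=${2:-eth0}
    # Reject anything that is not six colon-separated hex octets.
    if ! echo "$mac" | grep -Eqi '^([0-9a-f]{2}:){5}[0-9a-f]{2}$'; then
        echo "not a MAC address: $mac" >&2
        return 2
    fi
    tcpdump -nn -e -i "$iface" ether host "$mac"
}
```

With this running on a second machine, pinging anything from the affected system should produce ARP broadcasts in the output; silence suggests nothing is leaving the NIC.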

