[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table
Found a time when I didn't mind rebooting, and tested linux-image-3.18.0-031800rc7-generic, version 3.18.0-031800rc7.201411302035.

peter@tesla:~$ uname -a
Linux tesla 3.18.0-031800rc7-generic #201411302035 SMP Mon Dec 1 01:36:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
peter@tesla:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 eth0
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0        0 eth0
peter@tesla:~$ sudo ethtool -t eth0 offline
The test result is FAIL
The test extra info:
Register test  (offline)     0
Eeprom test    (offline)     0
Interrupt test (offline)     0
Loopback test  (offline)     0
Link test   (on/offline)     1
peter@tesla:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0        0 eth0
peter@tesla:~$ sudo route add default gw 10.0.0.1
peter@tesla:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 eth0
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0        0 eth0

dmesg (relevant section):
[ 2039.585309] e1000e :00:19.0 eth0: offline testing starting
[ 2039.805954] e1000e: eth0 NIC Link is Down
[ 2040.225964] e1000e :00:19.0 eth0: testing unshared interrupt
[ 2053.684783] e1000e :00:19.0: irq 29 for MSI/MSI-X
[ 2053.788074] e1000e :00:19.0: irq 29 for MSI/MSI-X
[ 2068.760891] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx
[ 2068.761001] e1000e :00:19.0 eth0: Link Speed was downgraded by SmartSpeed
[ 2068.761004] e1000e :00:19.0 eth0: 10/100 speed: disabling TSO

So it takes about 18 seconds for my NIC to fall back to a 10baseT link to my gigabit switch. I think my switch may be dying; I've seen some issues on other machines, too.

If anyone's interested in working on fixing this, you could simulate it by unplugging the ethernet cable as you press return on ethtool, then waiting 10 secs before plugging it back in. Or just run ethtool -t offline with your cable unplugged, I guess.
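For anyone reproducing this, the before/after comparison of the routing table shown above can be scripted. The following is only a sketch: missing_routes is a hypothetical helper name, not part of ethtool or iproute2, and it simply compares two captured snapshots line by line.

```shell
#!/bin/sh
# missing_routes BEFORE AFTER   (hypothetical helper, for illustration only)
# Print each non-empty line of BEFORE that does not appear verbatim in AFTER.
# Both arguments are multi-line strings, e.g. captured `ip route show` output.
missing_routes() {
    printf '%s\n' "$1" | while IFS= read -r line; do
        [ -n "$line" ] || continue
        # -F: fixed string, -x: whole-line match, -q: no output
        printf '%s\n' "$2" | grep -Fxq -e "$line" || printf '%s\n' "$line"
    done
}

# Possible use (needs root, and takes the link down just like the report):
#   before=$(ip route show)
#   ethtool -t eth0 offline
#   after=$(ip route show)
#   missing_routes "$before" "$after"
```

On a machine hit by this bug, the last command would be expected to list the default route. Lines are compared verbatim, so a route that is still present but printed with different flags would also be listed.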
** Tags added: e1000e kernel-bug-exists-upstream

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1395269

Title:
  [e1000e] ethtool -t eth0 offline loses routing table

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1395269/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table
Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag: 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example because it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as Confirmed.

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18-rc6-vivid/
[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table
Yup, will test this sometime this week and post again. Good suggestion, hadn't even thought of doing that, derp. :P
[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table
apport information

** Summary changed:

- ethtool -t eth0 offline loses routing table
+ [e1000e] ethtool -t eth0 offline loses routing table

** Tags added: apport-collected

** Description changed:

  ethtool -t eth0 offline does the tests, but leaves the routing table with only the entry for the local network. I had to sudo route add default gw 10.0.0.1, in my case. The online test didn't do this.

  Ubuntu 14.04, ethtool 1:3.13-1
  Linux tesla 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

  ethtool -i eth0:
  driver: e1000e
  version: 2.3.2-k
  firmware-version: 1.1-0
  bus-info: :00:19.0

  relevant kernel log:
  [637008.472410] e1000e :00:19.0 eth0: offline testing starting
  [637009.077985] e1000e :00:19.0 eth0: testing unshared interrupt
  [637022.468941] e1000e :00:19.0: irq 45 for MSI/MSI-X
  [637022.572094] e1000e :00:19.0: irq 45 for MSI/MSI-X
  [637022.572257] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
  [637037.432893] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx
  [637037.433003] e1000e :00:19.0 eth0: Link Speed was downgraded by SmartSpeed
  [637037.433005] e1000e :00:19.0 eth0: 10/100 speed: disabling TSO
  [637037.433035] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
  [637037.982611] net_ratelimit: 3 callbacks suppressed
  [637037.982623] IPv4: martian source 10.0.0.17 from 80.73.161.44, on dev eth0
  [637037.982628] ll header: : 00 19 d1 11 b4 9b 00 03 6d 11 34 1b 08 00m.4...

  (The martian packets are from TCP connections that my router is still NATing to this machine, even though, without its routing table, it's not happy to see them.)

  And yes, my e1000e is autonegotiating to 10baseT/Full on the same cables and switch that still work at 1000baseT with another machine, hence running self-tests... I thought this machine used to run at 1000baseT; weird if I went 5 years without noticing my desktop being slow. Not what this bug report is about, though.
  The e1000e hardware is on a DG965WH Intel mobo (ICH8 / g965 graphics, first-gen core2)

  00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)
      Subsystem: Intel Corporation Device 0001
      Flags: bus master, fast devsel, latency 0, IRQ 45
      Memory at e030 (32-bit, non-prefetchable) [size=128K]
      Memory at e0324000 (32-bit, non-prefetchable) [size=4K]
      I/O ports at 20e0 [size=32]
      Capabilities: [c8] Power Management version 2
      Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
      Kernel driver in use: e1000e

  $ ethtool eth0
  Settings for eth0:
      Supported ports: [ TP ]
      Supported link modes:   10baseT/Half 10baseT/Full
                              100baseT/Half 100baseT/Full
                              1000baseT/Full
      Supported pause frame use: No
      Supports auto-negotiation: Yes
      Advertised link modes:  10baseT/Full
                              100baseT/Full
                              1000baseT/Full
      Advertised pause frame use: No
      Advertised auto-negotiation: Yes
      Speed: 10Mb/s
      Duplex: Full
      Port: Twisted Pair
      PHYAD: 1
      Transceiver: internal
      Auto-negotiation: on
      MDI-X: on (auto)
  Cannot get wake-on-lan settings: Operation not permitted
      Current message level: 0x0007 (7)
                             drv probe link
      Link detected: yes

  Appears to be the same problem as someone reported to Red Hat a while ago, which got marked as fixed for the igb driver:
  https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=661976
  There's not much useful info in their BTS, because the bug that it's a dup of is now flagged private, so nobody can even look at it.

  Possibly this is a per-driver thing, unless the right fix is to have ethtool save/restore the routing table entries for that iface.
  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: ethtool 1:3.13-1
  ProcVersionSignature: Ubuntu 3.13.0-39.66-generic 3.13.11.8
  Uname: Linux 3.13.0-39-generic x86_64
  ApportVersion: 2.14.1-0ubuntu3.5
  Architecture: amd64
  Date: Sat Nov 22 03:00:35 2014
  Dependencies:
   gcc-4.9-base 4.9.1-0ubuntu1
   libc6 2.19-0ubuntu6.3
   libgcc1 1:4.9.1-0ubuntu1
   multiarch-support 2.19-0ubuntu6.3
  SourcePackage: ethtool
  UpgradeStatus: Upgraded to trusty on 2014-07-14 (130 days ago)
+ ---
+ ApportVersion: 2.14.1-0ubuntu3.5
+ Architecture: amd64
+ AudioDevicesInUse:
+  USER        PID ACCESS COMMAND
+  /dev/snd/controlC0:  peter      2715 F.... pulseaudio
+  /dev/snd/pcmC0D0p:   peter      2715 F...m pulseaudio
+ CRDA: Error: [Errno 2] No such file or directory
+ DistroRelease: Ubuntu 14.04
+ HibernationDevice:
[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table
So the idea is for drivers to not tell the kernel that the interface went down while it's doing self-tests? I guess igb had this problem fixed, according to the Red Hat bug, but apparently not e1000e.

Yes, I'm pretty sure the interface goes down during the offline portion of the full set of self-tests, for my e1000e. Connected to my switch, it takes longer than usual to autonegotiate a link. I should have posted this in the initial report, but here's the actual output:

sudo ethtool -t eth0 offline
The test result is FAIL
The test extra info:
Register test  (offline)     0
Eeprom test    (offline)     0
Interrupt test (offline)     0
Loopback test  (offline)     0
Link test   (on/offline)     1

If this doesn't usually happen with e1000e, the long autonegotiation is probably the corner case that's causing it. It's so long that the link test fails. (Also, would it make sense to do the link test first, before the offline tests that trigger autonegotiation? Or do we WANT to flag problems like sketchy setups that require SmartSpeed fallback to 10baseT to make a working link?)

The other solution would be to save/restore routing table entries for that interface. But that might cause problems in some corner cases, so it might be a lot of work to implement safely in the face of complex routing tables and/or changes made during the self-test while the interface was still online. Oh duh, nvm, there's more than just IPv4 to save/restore routing tables for. Some custom protocol that ethtool doesn't know about would not have its routing table saved/restored.

Anyway, thanks for having a look into this. It's not a problem for me now that I know about it; I just wanted to get it reported so at least the docs could include a warning. That's all that I think really needs doing, since checking every driver would be a lot of work.

How about:

ethtool(8):
...
  offline
      Perform all tests, including ones that interrupt normal operation. Some drivers may bring the interface down/up during this process, flushing routing table entries. They shouldn't, but be prepared just in case. Report problems with specific drivers against the Linux kernel (not ethtool).

The "report a bug" sentence is probably too much, and could go.
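Until drivers or docs change, the save/restore idea discussed in this comment can at least be approximated from userspace. The sketch below is an assumption-laden illustration, not something ethtool does: routes_on_dev, restore_cmds and selftest_keep_default are hypothetical names, it handles IPv4 default routes only, and it does nothing about routes changed while the test was running, which are exactly the corner cases raised above.

```shell
#!/bin/sh
# routes_on_dev IFACE   (hypothetical helper)
# Filter `ip route show` style lines on stdin, keeping only those whose
# "dev" field is exactly IFACE.
routes_on_dev() {
    awk -v d="$1" '{ for (i = 1; i < NF; i++) if ($i == "dev" && $(i + 1) == d) { print; next } }'
}

# restore_cmds   (hypothetical helper)
# Turn saved route lines on stdin into `ip route replace` commands on stdout.
restore_cmds() {
    while IFS= read -r route; do
        [ -n "$route" ] || continue
        printf 'ip -4 route replace %s\n' "$route"
    done
}

# selftest_keep_default [IFACE]   (hypothetical wrapper, needs root)
# Save the IPv4 default routes on IFACE, run the offline self-test, then
# put the saved routes back. Returns ethtool's exit status.
selftest_keep_default() {
    iface=${1:-eth0}
    saved=$(ip -4 route show default | routes_on_dev "$iface")
    ethtool -t "$iface" offline
    status=$?
    printf '%s\n' "$saved" | restore_cmds | sh
    return "$status"
}
```

Running `selftest_keep_default eth0` as root would then stand in for the manual `route add default gw ...` step. Using `ip route replace` rather than `ip route add` is a deliberate choice here, so that the restore does not fail if the route happens to survive the test.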