[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table

2014-12-04 Thread Peter Cordes
Found a time when I didn't mind rebooting, and tested
linux-image-3.18.0-031800rc7-generic, version 3.18.0-031800rc7.201411302035.

peter@tesla:~$ uname -a
Linux tesla 3.18.0-031800rc7-generic #201411302035 SMP Mon Dec 1 01:36:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

peter@tesla:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 eth0
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0        0 eth0

peter@tesla:~$ sudo ethtool -t eth0 offline
The test result is FAIL
The test extra info:
Register test  (offline)     0
Eeprom test    (offline)     0
Interrupt test (offline)     0
Loopback test  (offline)     0
Link test   (on/offline)     1

peter@tesla:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0        0 eth0

peter@tesla:~$ sudo route add default gw 10.0.0.1
peter@tesla:~$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 eth0
10.0.0.0        0.0.0.0         255.0.0.0       U     0      0        0 eth0


dmesg (relevant section):

[ 2039.585309] e1000e 0000:00:19.0 eth0: offline testing starting
[ 2039.805954] e1000e: eth0 NIC Link is Down
[ 2040.225964] e1000e 0000:00:19.0 eth0: testing unshared interrupt
[ 2053.684783] e1000e 0000:00:19.0: irq 29 for MSI/MSI-X
[ 2053.788074] e1000e 0000:00:19.0: irq 29 for MSI/MSI-X
[ 2068.760891] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx
[ 2068.761001] e1000e 0000:00:19.0 eth0: Link Speed was downgraded by SmartSpeed
[ 2068.761004] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO

So it takes about 18 seconds for my NIC to fall back to a 10baseT link
to my gigabit switch.  I think my switch may be dying; I've seen some
issues on other machines, too.

 If anyone's interested in working on fixing this, you could simulate
it by unplugging the ethernet cable as you press return on ethtool,
then waiting about 10 seconds before plugging it back in.  Or just run
ethtool -t eth0 offline with your cable unplugged, I guess.
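
 A minimal sketch of a check for this, for anyone reproducing it.  It is
not part of the original report: the helper names lost_routes and
check_selftest are made up for illustration, and it assumes a POSIX
shell, iproute2's ip, and root privileges for the self-test.  It
snapshots the routing table, runs the offline test, and prints any
route that was present before but is missing afterwards.

```shell
#!/bin/sh
# Sketch only: lost_routes and check_selftest are hypothetical helper
# names, not part of ethtool or iproute2.

# Print every line of file $1 (routes before) that is missing from
# file $2 (routes after).  Both files hold `ip route show` output.
lost_routes() {
    grep -Fxv -f "$2" "$1"
}

# Snapshot the routes, run the offline self-test on interface $1
# (default eth0), then report routes that disappeared.  Needs root.
check_selftest() {
    iface=${1:-eth0}
    before=$(mktemp) || return 1
    after=$(mktemp) || { rm -f "$before"; return 1; }
    ip route show > "$before"
    ethtool -t "$iface" offline
    ip route show > "$after"
    lost_routes "$before" "$after"
    rm -f "$before" "$after"
}
```

 Running check_selftest eth0 as root on an affected machine would be
expected to print the default route line; on an unaffected driver it
should print nothing.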


** Tags added: e1000e kernel-bug-exists-upstream

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1395269

Title:
  [e1000e] ethtool -t eth0 offline loses routing table

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1395269/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table

2014-11-24 Thread Joseph Salisbury
Would it be possible for you to test the latest upstream kernel? Refer
to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
v3.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following
tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag:
'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, 
please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as 
Confirmed.


Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.18-rc6-vivid/



[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table

2014-11-24 Thread Peter Cordes
Yup, will test this sometime this week and post again.  Good suggestion,
hadn't even thought of doing that, derp.  :P



[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table

2014-11-22 Thread Peter Cordes
apport information

** Summary changed:

- ethtool -t eth0 offline loses routing table
+ [e1000e] ethtool -t eth0 offline loses routing table

** Tags added: apport-collected

** Description changed:

  ethtool -t eth0 offline does the tests, but leaves the routing table
  with only the entry for the local network.  I had to sudo route add
  default gw 10.0.0.1, in my case.  The online test didn't do this.
  
  Ubuntu 14.04, ethtool 1:3.13-1
  
  Linux tesla 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC
  2014 x86_64 x86_64 x86_64 GNU/Linux
  
  ethtool -i eth0: 
  driver: e1000e
  version: 2.3.2-k
  firmware-version: 1.1-0
  bus-info: 0000:00:19.0
  
  relevant kernel log:
  [637008.472410] e1000e 0000:00:19.0 eth0: offline testing starting
  [637009.077985] e1000e 0000:00:19.0 eth0: testing unshared interrupt
  [637022.468941] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
  [637022.572094] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
  [637022.572257] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
  [637037.432893] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx
  [637037.433003] e1000e 0000:00:19.0 eth0: Link Speed was downgraded by SmartSpeed
  [637037.433005] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO
  [637037.433035] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
  [637037.982611] net_ratelimit: 3 callbacks suppressed
  [637037.982623] IPv4: martian source 10.0.0.17 from 80.73.161.44, on dev eth0
  [637037.982628] ll header: 00000000: 00 19 d1 11 b4 9b 00 03 6d 11 34 1b 08 00        ........m.4...
  
  (the martian packets are from TCP connections that my router is still
  NATing to this machine, even though without its routing table, it's not
  happy to see them.)
  
   And yes, my e1000e is autonegotiating to 10baseT/Full on the same
  cables and switch that still works at 1000baseT with another machine,
  hence running self-tests...  I thought this machine used to run at
  1000baseT, weird if I went 5 years without noticing my desktop being
  slow.  Not what this bug report is about, though.
  
   The e1000e hardware is on a DG965WH Intel mobo (ICH8 / g965 graphics, first-gen core2)
  00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)
          Subsystem: Intel Corporation Device 0001
          Flags: bus master, fast devsel, latency 0, IRQ 45
          Memory at e0300000 (32-bit, non-prefetchable) [size=128K]
          Memory at e0324000 (32-bit, non-prefetchable) [size=4K]
          I/O ports at 20e0 [size=32]
          Capabilities: [c8] Power Management version 2
          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
          Kernel driver in use: e1000e
  
  
  $ ethtool eth0
  Settings for eth0:
          Supported ports: [ TP ]
          Supported link modes:   10baseT/Half 10baseT/Full
                                  100baseT/Half 100baseT/Full
                                  1000baseT/Full
          Supported pause frame use: No
          Supports auto-negotiation: Yes
          Advertised link modes:  10baseT/Full
                                  100baseT/Full
                                  1000baseT/Full
          Advertised pause frame use: No
          Advertised auto-negotiation: Yes
          Speed: 10Mb/s
          Duplex: Full
          Port: Twisted Pair
          PHYAD: 1
          Transceiver: internal
          Auto-negotiation: on
          MDI-X: on (auto)
  Cannot get wake-on-lan settings: Operation not permitted
          Current message level: 0x00000007 (7)
                                 drv probe link
          Link detected: yes
  
  
  Appears to be the same problem as someone reported to Red Hat a while ago,
  which got marked as fixed for the igb driver:
  https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=661976
  Not very useful info in their BTS, because the bug that it's a dup of is now
  flagged private, so nobody can even look at it.
  
   Possibly this is a per-driver thing, unless the right fix is to have
  ethtool save/restore the routing table entries for that iface.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: ethtool 1:3.13-1
  ProcVersionSignature: Ubuntu 3.13.0-39.66-generic 3.13.11.8
  Uname: Linux 3.13.0-39-generic x86_64
  ApportVersion: 2.14.1-0ubuntu3.5
  Architecture: amd64
  Date: Sat Nov 22 03:00:35 2014
  Dependencies:
   gcc-4.9-base 4.9.1-0ubuntu1
   libc6 2.19-0ubuntu6.3
   libgcc1 1:4.9.1-0ubuntu1
   multiarch-support 2.19-0ubuntu6.3
  SourcePackage: ethtool
  UpgradeStatus: Upgraded to trusty on 2014-07-14 (130 days ago)
+ --- 
+ ApportVersion: 2.14.1-0ubuntu3.5
+ Architecture: amd64
+ AudioDevicesInUse:
+  USER        PID ACCESS COMMAND
+  /dev/snd/controlC0:  peter  2715 F pulseaudio
+  /dev/snd/pcmC0D0p:   peter  2715 F...m pulseaudio
+ CRDA: Error: [Errno 2] No such file or directory
+ DistroRelease: Ubuntu 14.04
+ HibernationDevice: 

[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table

2014-11-22 Thread Peter Cordes
So the idea is for drivers to not tell the kernel that the interface
went down while they're doing self-tests?  According to the Red Hat bug,
igb had this problem fixed, but apparently e1000e did not.

 Yes, I'm pretty sure the interface goes down during the offline portion
of the full set of self-tests, for my e1000e.  Connected to my switch,
it takes longer than usual to autonegotiate a link.  I should have
posted this in the initial report, but here's the actual output:

sudo ethtool -t eth0 offline
The test result is FAIL
The test extra info:
Register test  (offline)     0
Eeprom test    (offline)     0
Interrupt test (offline)     0
Loopback test  (offline)     0
Link test   (on/offline)     1

 If this doesn't usually happen with e1000e, the long autonegotiation is
probably the corner case that's causing it.  It's so long that the link
test fails.  (also, would it make sense to do the link test first,
before offline tests that trigger autonegotiation?  Or do we WANT to
flag problems like sketchy setups that require SmartSpeed fallback to
10baseT to make a working link?)


 The other solution would be to save/restore routing table entries for that 
interface.  But that might cause problems in some corner cases.  So it might be 
a lot of work to implement safely, in the face of complex routing tables and/or 
changes made during the self-test while the interface was still online.  Oh 
duh, nvm, there's more than just IPv4 to save/restore routing tables for.  Some 
custom protocol that ethtool doesn't know about would not have its routing 
table saved/restored.
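
 For illustration, here is roughly what a user-side save/restore could
look like.  This is a sketch under assumptions, not something ethtool
does: routes_for_iface and restore_cmds are made-up names, it handles
only the main IPv4 table, and it assumes iproute2's ip plus awk.  It
also shows the limits mentioned above: some printed route attributes
(flags such as linkdown, for example) are not accepted back by ip
route, and routes changed during the test would be overwritten.

```shell
#!/bin/sh
# Sketch only: routes_for_iface and restore_cmds are hypothetical names.

# Read `ip route show` output on stdin and print only the routes whose
# output device is $1.
routes_for_iface() {
    awk -v dev="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == "dev" && $(i + 1) == dev) { print; next }
    }'
}

# Read saved route lines on stdin and print one `ip route replace`
# command per non-empty line.  Printing instead of executing lets the
# commands be reviewed before they are run.
restore_cmds() {
    while read -r route; do
        if [ -n "$route" ]; then
            printf 'ip route replace %s\n' "$route"
        fi
    done
}

# Possible use around the self-test (as root):
#   ip -4 route show | routes_for_iface eth0 > /tmp/eth0.routes
#   ethtool -t eth0 offline
#   restore_cmds < /tmp/eth0.routes | sh
```

 Printing the commands rather than running them directly is deliberate:
it keeps the risky part (rewriting the routing table) as a separate,
reviewable step.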

 Anyway, thanks for having a look into this.  It's not a problem for me
now that I know about it, just wanted to get it reported so at least the
docs could include a warning.  That's all that I think really needs
doing, since checking every driver would be a lot of work.


how about:
 ethtool(8):
...
   offline
          Perform all tests, including ones that interrupt normal
          operation.  Some drivers may bring the interface down/up
          during this process, flushing routing table entries.  They
          shouldn't, but be prepared just in case.  Report problems
          with specific drivers against the Linux kernel (not ethtool).


 The "report problems" sentence is probably too much, and could go.
