Bug#945912: Kernel 5.3 e100e Detected Hardware Unit Hang

2019-12-03 T.A. van Roermund
This may be related:

https://bugzilla.kernel.org/show_bug.cgi?id=205047

Bug#945912: Kernel 5.3 e100e Detected Hardware Unit Hang

2019-11-30 Russell Mosemann

Package: linux-image-5.3.0-0.bpo.2-amd64
Severity: important

Dear Maintainer,

   * What led up to the situation?
 
Installed kernels 5.2 and 5.3 on two physical hosts in a KVM virtual cluster,
each host with two bonded Ethernet ports. After a random interval, ranging
from right after boot to several hours later, the e1000e driver hangs and the
kernel starts logging errors. This has happened on both hosts.
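
To see how often the hang recurs, the kernel log can be filtered for the
message named in the bug title. This is only a minimal sketch: it assumes the
log has been saved as text (for example from `journalctl -k`) and that each
hang line contains both the driver name and the phrase "Detected Hardware Unit
Hang". The function name `find_hangs` is hypothetical.

```python
HANG_MARKER = "Detected Hardware Unit Hang"  # phrase taken from the bug title


def find_hangs(lines, driver="e1000e"):
    """Return the log lines that mention both the driver and the hang marker.

    `lines` is any iterable of strings, such as an open text file.
    The substring match is an assumption about the log format, not a
    guarantee of how the driver words its messages.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if HANG_MARKER in line and driver in line
    ]
```

Passing an open file object with the saved log to `find_hangs` returns the
matching lines in order, so their timestamps can be compared with the boot
time.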

   * What exactly did you do (or not do) that was effective (or
 ineffective)?
 
This problem does not occur with kernel 4.19. I reverted to kernel 4.19.

   * What was the outcome of this action?
 
Using kernel 4.19 fixes the e1000e hang problem.

   * What outcome did you expect instead?
 
Networking would work perfectly in kernels 5.2 and 5.3, just like it does in 
4.19.
 
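The hang occurs on eth1, which uses the e1000e driver. According to the lspci
output and boot log below, that is the Intel I217-LM at 00:19.0; the I210
Gigabit adapter at 02:00.0 is eth0 and uses the igb driver.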
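
Which driver and PCI device sit behind an interface can be checked on the
running host through sysfs. The sketch below assumes the usual Linux layout,
where `/sys/class/net/<iface>/device` links to the PCI device and
`device/driver` links to the bound driver. The helper name `driver_for` and
its `sysfs_root` parameter are hypothetical; the parameter exists only so the
function can be pointed at a test tree.

```python
import os


def driver_for(iface, sysfs_root="/sys"):
    """Return (driver_name, pci_address) for a network interface.

    Returns None for interfaces with no backing device link, which is
    expected for virtual interfaces such as a bond or a bridge.
    """
    dev = os.path.join(sysfs_root, "class", "net", iface, "device")
    if not os.path.exists(dev):
        return None
    # 'device' resolves to the PCI device directory; its name is the address.
    pci_address = os.path.basename(os.path.realpath(dev))
    # 'device/driver' resolves to the directory of the bound driver.
    driver = os.path.basename(os.path.realpath(os.path.join(dev, "driver")))
    return driver, pci_address
```

On the hosts described here, `driver_for("eth1")` would be expected to name
e1000e and the 00:19.0 device if the boot log below reflects the current
binding.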

# lspci | grep I2
00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 05)
02:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
 
# Lines from the journal during boot
 
Nov 30 04:47:57 vhost002 kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
Nov 30 04:47:57 vhost002 kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
Nov 30 04:47:57 vhost002 kernel: e1000e :00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Nov 30 04:47:57 vhost002 kernel: igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
Nov 30 04:47:57 vhost002 kernel: igb: Copyright (c) 2007-2014 Intel Corporation.
Nov 30 04:47:57 vhost002 kernel: igb :02:00.0: PHY reset is blocked due to SOL/IDER session.
Nov 30 04:47:57 vhost002 kernel: igb :02:00.0: added PHC on eth0
Nov 30 04:47:57 vhost002 kernel: igb :02:00.0: Intel(R) Gigabit Ethernet Network Connection
Nov 30 04:47:57 vhost002 kernel: igb :02:00.0: eth0: (PCIe:2.5Gb/s:Width x1) d0:50:99:c0:38:b6
Nov 30 04:47:57 vhost002 kernel: igb :02:00.0: eth0: PBA No: 001300-000
Nov 30 04:47:57 vhost002 kernel: igb :02:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
Nov 30 04:47:57 vhost002 kernel: e1000e :00:19.0 :00:19.0 (uninitialized): registered PHC clock
Nov 30 04:47:57 vhost002 kernel: e1000e :00:19.0 eth1: (PCI Express:2.5GT/s:Width x1) d0:50:99:c0:38:b7
Nov 30 04:47:57 vhost002 kernel: e1000e :00:19.0 eth1: Intel(R) PRO/1000 Network Connection
Nov 30 04:47:57 vhost002 kernel: e1000e :00:19.0 eth1: MAC: 11, PHY: 12, PBA No: FF-0FF
Nov 30 04:47:58 vhost002 kernel: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Nov 30 04:47:58 vhost002 kernel: bonding: bond0 is being created...
Nov 30 04:47:58 vhost002 systemd-udevd[387]: Could not generate persistent MAC address for bond0: No such file or directory
Nov 30 04:47:59 vhost002 kernel: igb :02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Nov 30 04:47:59 vhost002 kernel: bond0: (slave eth0): Enslaving as a backup interface with an up link
Nov 30 04:47:59 vhost002 kernel: bond0: (slave eth1): Enslaving as a backup interface with a down link
Nov 30 04:47:59 vhost002 kernel: bond0: Warning: No 802.3ad response from the link partner for any adapters in the bond
Nov 30 04:47:59 vhost002 kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
Nov 30 04:47:59 vhost002 kernel: bond0: active interface up!
Nov 30 04:47:59 vhost002 systemd-udevd[445]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Nov 30 04:47:59 vhost002 systemd-udevd[445]: Could not generate persistent MAC address for kvmbr0: No such file or directory
Nov 30 04:47:59 vhost002 ifup[781]: Waiting for a max of 0 seconds for bond0 to become available.
Nov 30 04:47:59 vhost002 kernel: bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered blocking state
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered disabled state
Nov 30 04:47:59 vhost002 kernel: device bond0 entered promiscuous mode
Nov 30 04:47:59 vhost002 kernel: device eth0 entered promiscuous mode
Nov 30 04:47:59 vhost002 kernel: device eth1 entered promiscuous mode
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered blocking state
Nov 30 04:47:59 vhost002 kernel: kvmbr0: port 1(bond0) entered forwarding state
Nov 30 04:47:59 vhost002 ifup[781]: Waiting for kvmbr0 to get ready (MAXWAIT is 2 seconds).
Nov 30 04:47:59 vhost002 avahi-daemon[711]: Joining mDNS multicast group on interface kvmbr0.IPv4 with address 192.168.0.237.
Nov 30 04:47:59 vhost002 avahi-daemon[711]: New relevant interface kvmbr0.IPv4 for mDNS.
Nov 30 04:47:59 vhost002 avahi-daemon[711]: Registering new address record for 192.168.0.237 on kvmbr0.IPv4.
Nov 30 04:47:59 vhost002 avahi-daemon[711]: Register