With this kernel I could no longer reproduce the hang on demand using ethtool -G; however, it broke so many other things that I could not leave it running long enough to see whether it also fixed the spontaneous hangs.
Strangely, it broke nfs-kernel-server on the i7-6850k machine but not the i7-6700k machine. I did a lot of stare-and-compare to make sure they were configured the same. This forced me to back out this kernel. In addition to NFS, the nouveau driver needed on the i7-6850k machine had a bug that pixelated much of the screen in a semi-random fashion. Also, for whatever reason, x2goserver would not work properly with that kernel.

On one of the i7-6700k machines I had to restart lightdm several times to get it to actually start; it did not start on boot. On another I was unable to get lightdm to start at all and only console graphics worked, and for some reason they were in yellow instead of white. The i7-6700k machines use the integrated graphics of the i7-6700k processor, clocked very low to minimize the impact on the heat budget.

So on the kernel developers' PPA I saw another test kernel, 4.15.0-21, and installed it. It also made ethtool -G no longer induce an Ethernet hang, but on the i7-6850k machine it has already hung once spontaneously:

[ 4112.809034] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <4f>
  TDT                  <6f>
  next_to_use          <6f>
  next_to_clean        <4e>
buffer_info[next_to_clean]:
  time_stamp           <1003a2221>
  next_to_watch        <4f>
  jiffies              <1003a2d80>
  next_to_watch.status <0>
MAC Status             <80083>
PHY Status             <796d>
PHY 1000BASE-T Status  <7c00>
PHY Extended Status    <3000>
PCI Status             <10>
[ 4114.793198] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <4f>
  TDT                  <6f>
  next_to_use          <6f>
  next_to_clean        <4e>
buffer_info[next_to_clean]:
  time_stamp           <1003a2221>
  next_to_watch        <4f>
  jiffies              <1003a3540>
  next_to_watch.status <0>
MAC Status             <80083>
PHY Status             <796d>
PHY 1000BASE-T Status  <7c00>
PHY Extended Status    <3000>
PCI Status             <10>
[ 4116.008748] ------------[ cut here ]------------
[ 4116.008750] NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
[ 4116.008765] WARNING: CPU: 8 PID: 59 at /build/linux-QLn4bB/linux-4.15.0/net/sched/sch_generic.c:323 dev_watchdog+0x21d/0x230
[ 4116.008765] Modules linked in: tcp_diag inet_diag vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink rpcsec_gss_krb5 nfsv4 nfs fscache msr bridge stp llc binfmt_misc nls_iso8859_1 quota_v2 quota_tree intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel snd_hda_codec_hdmi aes_x86_64 input_leds crypto_simd glue_helper cryptd snd_hda_codec_realtek snd_hda_codec_generic snd_seq_midi snd_seq_midi_event snd_hda_intel snd_hda_codec snd_hda_core snd_rawmidi snd_hwdep snd_seq snd_pcm snd_seq_device
[ 4116.008792]  snd_timer intel_cstate eeepc_wmi snd asus_wmi sparse_keymap wmi_bmof lpc_ich intel_rapl_perf intel_wmi_thunderbolt shpchp soundcore mei_me mei mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nct6775 hwmon_vid coretemp parport_pc ppdev nfsd auth_rpcgss nfs_acl lockd grace sunrpc lp parport ip_tables x_tables autofs4 btrfs zstd_compress raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log hid_generic usbhid hid raid10 nouveau video i2c_algo_bit ttm drm_kms_helper mxm_wmi syscopyarea sysfillrect e1000e sysimgblt fb_sys_fops ahci drm ptp libahci pps_core wmi
[ 4116.008825] CPU: 8 PID: 59 Comm: ksoftirqd/8 Not tainted 4.15.0-21-lowlatency #22-Ubuntu
[ 4116.008825] Hardware name: ASUS All Series/X99-E, BIOS 1801 08/11/2017
[ 4116.008827] RIP: 0010:dev_watchdog+0x21d/0x230
[ 4116.008827] RSP: 0018:ffffa9b8c64cbd60 EFLAGS: 00010282
[ 4116.008828] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[ 4116.008829] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8e863f416490
[ 4116.008829] RBP: ffffa9b8c64cbd90 R08: 00000000000004f5 R09: 0000000000000004
[ 4116.008830] R10: ffffa9b8c64cbde8 R11: 0000000000000001 R12: ffff8e862ba8be80
[ 4116.008830] R13: ffff8e862a7f4000 R14: ffff8e862a7f4478 R15: 0000000000000001
[ 4116.008831] FS:  0000000000000000(0000) GS:ffff8e863f400000(0000) knlGS:0000000000000000
[ 4116.008832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4116.008832] CR2: 00007fd12a9df850 CR3: 00000018cc40a006 CR4: 00000000003626e0
[ 4116.008833] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4116.008833] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4116.008834] Call Trace:
[ 4116.008837]  ? qdisc_reset+0x70/0x70
[ 4116.008841]  call_timer_fn+0x30/0x160
[ 4116.008843]  ? qdisc_reset+0x70/0x70
[ 4116.008844]  run_timer_softirq+0x422/0x470
[ 4116.008847]  ? __switch_to+0x4c6/0x530
[ 4116.008848]  ? __switch_to+0x4c6/0x530
[ 4116.008852]  __do_softirq+0xdf/0x2e4
[ 4116.008855]  run_ksoftirqd+0x20/0x60
[ 4116.008857]  smpboot_thread_fn+0x131/0x1f0
[ 4116.008859]  kthread+0x121/0x140
[ 4116.008860]  ? sort_range+0x30/0x30
[ 4116.008861]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 4116.008863]  ret_from_fork+0x35/0x40
[ 4116.008863] Code: 37 00 49 63 4e e8 eb 92 4c 89 ef c6 05 3b 1c dc 00 01 e8 67 34 fd ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 a0 74 d9 8b e8 83 30 80 ff <0f> 0b eb c0 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
[ 4116.008880] ---[ end trace d9fd2f2b29f4469f ]---
[ 4116.008888] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
[ 4116.008959] bridge0: port 1(eno1) entered disabled state
[ 4116.008992] bridge0: topology change detected, propagating
[ 4119.688830] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 4119.688867] bridge0: port 1(eno1) entered blocking state
[ 4119.688871] bridge0: port 1(eno1) entered listening state
[ 4134.953070] bridge0: port 1(eno1) entered learning state
[ 4150.314349] bridge0: port 1(eno1) entered forwarding state
[ 4150.314351] bridge0: topology change detected, sending tcn bpdu

This data was produced by:

Linux iglulik 4.15.0-21-lowlatency #22-Ubuntu SMP PREEMPT Tue May 1 15:47:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Hope that something here is helpful.

-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
 Eskimo North Linux Friendly Internet Access, Shell Accounts, and Hosting.
 Knowledgeable human assistance, not telephone trees or script readers.
 See our web site: http://www.eskimo.com/ (206) 812-0051 or (800) 246-6874.

On Tue, 1 May 2018, Joseph Salisbury wrote:

> Date: Tue, 01 May 2018 19:16:25 -0000
> From: Joseph Salisbury <joseph.salisb...@canonical.com>
> Reply-To: Bug 1766377 <1766...@bugs.launchpad.net>
> To: nan...@eskimo.com
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
>
> Can you see if this bug also happens with the latest mainline kernel, or
> if it was already fixed upstream? It can be downloaded from:
>
> http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17-rc3
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1766377
>
> Title:
>   Ethernet E1000 Controller Hangs
>
> Status in linux package in Ubuntu:
>   Incomplete
> Status in linux source package in Bionic:
>   Incomplete
>
> Bug description:
>   With Bionic kernel 4.15.0-15 and 4.15.0-17 I am experiencing periodic
>   hanging of the LAN connection. This is happening on an Asus X99-DELUX
>   motherboard, controller specifications:
>   Intel® I218V, 1 x Gigabit LAN Controller(s)
>   Intel® I211-AT, 1 x Gigabit LAN
>   Dual Gigabit LAN controllers- 802.3az Energy Efficient Ethernet (EEE)
>   appliance
>   Support Teaming Technology
>   ASUS Turbo LAN Utility
>   The CPU is an i7-6850 and it is configured with 128GB of DDR4 RAM.
>   This machine has a number of Qemu/KVM virtual guests and is using a software
>   bridge to share the interface.
>   This did not happen with 17.10 and 4.13.0 kernel. It is happening on
>   multiple machines here.
>   Here are the messages from dmesg:
>   [1016198.957850] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
>     TDH                  <ea>
>     TDT                  <2d>
>     next_to_use          <2d>
>     next_to_clean        <e9>
>   buffer_info[next_to_clean]:
>     time_stamp           <13c8d0008>
>     next_to_watch        <ea>
>     jiffies              <13c8d0880>
>     next_to_watch.status <0>
>   MAC Status             <80083>
>   PHY Status             <796d>
>   PHY 1000BASE-T Status  <3c00>
>   PHY Extended Status    <3000>
>   PCI Status             <10>
>   [1016200.942072] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
>     TDH                  <ea>
>     TDT                  <2d>
>     next_to_use          <2d>
>     next_to_clean        <e9>
>   buffer_info[next_to_clean]:
>     time_stamp           <13c8d0008>
>     next_to_watch        <ea>
>     jiffies              <13c8d1040>
>     next_to_watch.status <0>
>   MAC Status             <80083>
>   PHY Status             <796d>
>   PHY 1000BASE-T Status  <3c00>
>   PHY Extended Status    <3000>
>   PCI Status             <10>
>   [1016202.413607] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
>   [1016202.413701] bridge0: port 1(eno1) entered disabled state
>   [1016202.413732] bridge0: topology change detected, propagating
>   [1016206.666676] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>   [1016206.666708] bridge0: port 1(eno1) entered blocking state
>   [1016206.666712] bridge0: port 1(eno1) entered listening state
>   [1016216.750911] bridge0: port 1(eno1) entered learning state
>   [1016232.110291] bridge0: port 1(eno1) entered forwarding state
>   [1016232.110294] bridge0: topology change detected, sending tcn bpdu
>   [1017834.390579] cfg80211: Loading compiled-in X.509 certificates for regulatory database
>   [1017834.390770] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
>   [1017834.414792] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
>   [1017834.414794] cfg80211: failed to load regulatory.db
>   If there is any other information I can provide to aid in resolution, please
>   contact me, nan...@eskimo.com. Thank you!
>
>   ProblemType: Bug
>   DistroRelease: Ubuntu 18.04
>   Package: linux-image-4.15.0-15-lowlatency 4.15.0-15.16
>   ProcVersionSignature: Ubuntu 4.15.0-15.16-lowlatency 4.15.15
>   Uname: Linux 4.15.0-15-lowlatency x86_64
>   ApportVersion: 2.20.9-0ubuntu6
>   Architecture: amd64
>   AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/hwC1D3',
>   '/dev/snd/hwC1D2', '/dev/snd/hwC1D1', '/dev/snd/hwC1D0', '/dev/snd/pcmC1D9p',
>   '/dev/snd/pcmC1D8p', '/dev/snd/pcmC1D7p', '/dev/snd/pcmC1D3p',
>   '/dev/snd/controlC1', '/dev/snd/by-path', '/dev/snd/hwC0D0',
>   '/dev/snd/pcmC0D2c', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D0c',
>   '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer']
>   failed with exit code 1:
>   CurrentDesktop: MATE
>   Date: Mon Apr 23 16:45:30 2018
>   HibernationDevice: RESUME=UUID=963cb206-8962-4fc0-82a1-fc4f02a9b5c5
>   InstallationDate: Installed on 2017-05-05 (353 days ago)
>   InstallationMedia: Ubuntu-MATE 17.04 "Zesty Zapus" - Release amd64 (20170412)
>   MachineType: ASUS All Series
>   ProcFB: 0 nouveaufb
>   ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-15-lowlatency
>   root=UUID=28825f5b-a6fd-4e09-982c-0513ae4d2842 ro quiet splash vt.handoff=1
>   PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No
>   PulseAudio daemon running, or not running as session daemon.
>   RelatedPackageVersions:
>    linux-restricted-modules-4.15.0-15-lowlatency N/A
>    linux-backports-modules-4.15.0-15-lowlatency N/A
>    linux-firmware 1.173
>   RfKill:
>
>   SourcePackage: linux
>   UpgradeStatus: Upgraded to bionic on 2018-04-12 (11 days ago)
>   dmi.bios.date: 08/11/2017
>   dmi.bios.vendor: American Megatrends Inc.
>   dmi.bios.version: 1801
>   dmi.board.asset.tag: Default string
>   dmi.board.name: X99-E
>   dmi.board.vendor: ASUSTeK COMPUTER INC.
>   dmi.board.version: Rev 1.xx
>   dmi.chassis.asset.tag: Default string
>   dmi.chassis.type: 3
>   dmi.chassis.vendor: Default string
>   dmi.chassis.version: Default string
>   dmi.modalias:
>   dmi:bvnAmericanMegatrendsInc.:bvr1801:bd08/11/2017:svnASUS:pnAllSeries:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnX99-E:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
>   dmi.product.family: ASUS MB
>   dmi.product.name: All Series
>   dmi.product.version: System Version
>   dmi.sys.vendor: ASUS
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1766377/+subscriptions
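
For anyone comparing the hang reports in this thread, here is a minimal Python sketch that pulls the TDH, TDT, time_stamp and jiffies values out of the e1000e "Detected Hardware Unit Hang" messages and derives a few numbers from them. It is not part of the bug report. The helper names are hypothetical, and the ring size of 256 descriptors is an assumption (it is believed to be the e1000e default, but the actual value on these machines is not stated here and would need checking with ethtool -g). The sample text is the pair of reports from the 4.15.0-21 log in this message.

```python
import re

# Hex fields of interest inside one e1000e hang report.
FIELD_RE = re.compile(r"\b(TDH|TDT|time_stamp|jiffies)\s+<([0-9a-fA-F]+)>")
# dmesg line that opens a report; group 1 is the dmesg timestamp in seconds.
HEADER_RE = re.compile(r"\[\s*(\d+\.\d+)\][^\n]*?Detected Hardware Unit Hang:")

# The two reports from the 4.15.0-21 log in this message.
SAMPLE = """\
[ 4112.809034] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <4f>
  TDT                  <6f>
  next_to_use          <6f>
  next_to_clean        <4e>
buffer_info[next_to_clean]:
  time_stamp           <1003a2221>
  next_to_watch        <4f>
  jiffies              <1003a2d80>
[ 4114.793198] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <4f>
  TDT                  <6f>
  next_to_use          <6f>
  next_to_clean        <4e>
buffer_info[next_to_clean]:
  time_stamp           <1003a2221>
  next_to_watch        <4f>
  jiffies              <1003a3540>
"""


def parse_hang_reports(log_text):
    """Return one dict per hang report: dmesg seconds plus hex fields as ints."""
    headers = list(HEADER_RE.finditer(log_text))
    reports = []
    for i, header in enumerate(headers):
        # A report's fields run from its header to the next header (or the end).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(log_text)
        fields = {"dmesg_seconds": float(header.group(1))}
        for name, value in FIELD_RE.findall(log_text[header.end():end]):
            fields.setdefault(name, int(value, 16))
        reports.append(fields)
    return reports


def pending_descriptors(report, ring_size=256):
    """Descriptors between head (TDH) and tail (TDT); ring_size is an assumption."""
    return (report["TDT"] - report["TDH"]) % ring_size


def stalled_jiffies(report):
    """Jiffies elapsed since the stuck buffer was time-stamped."""
    return report["jiffies"] - report["time_stamp"]


def estimate_hz(first, second):
    """Estimate the kernel tick rate from two reports of the same hang."""
    return (second["jiffies"] - first["jiffies"]) / (
        second["dmesg_seconds"] - first["dmesg_seconds"]
    )


if __name__ == "__main__":
    first, second = parse_hang_reports(SAMPLE)
    print("pending descriptors:", pending_descriptors(first))
    print("stalled jiffies:", stalled_jiffies(first), stalled_jiffies(second))
    print("estimated HZ:", round(estimate_hz(first, second)))
```

For the two reports above, TDH and TDT do not move between reports (32 descriptors stay queued), time_stamp is unchanged, and jiffies advances by 1984 over about 1.98 seconds, which suggests a tick rate close to 1000 Hz. That would put the stuck buffer at roughly 2.9 and 4.9 seconds old at the two reports.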