This kernel did make it so that I could not reproduce the hang on demand
using ethtool -G; however, it broke so many other things that I could not
leave it running to see whether it also fixed the spontaneous hangs.
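
      For anyone who wants to retry the on-demand test, here is a minimal
sketch of cycling the ring size with ethtool -G.  The exact arguments used for
the test above are not stated, so the interface name and the ring sizes below
are placeholders, and the helper names are made up for illustration.  It only
prints the commands unless dry_run is turned off, and actually running them
needs root.

```python
import subprocess


def ring_resize_commands(iface, sizes):
    """Build one `ethtool -G` argv per ring size (rx and tx set together)."""
    return [["ethtool", "-G", iface, "rx", str(n), "tx", str(n)] for n in sizes]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # ethtool exits non-zero if the size is unchanged, so do not raise.
            subprocess.run(argv, check=False)


if __name__ == "__main__":
    # "eno1" and the sizes 512/256 are assumptions, not values from the report.
    run(ring_resize_commands("eno1", [512, 256]))
```

      Watching dmesg for "Detected Hardware Unit Hang" while this runs would
show whether the resize still triggers the hang on a given kernel.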

      Strangely, it broke nfs-kernel-server on the i7-6850k machine but not on
the i7-6700k machine.  I did a lot of stare-and-compare to make sure they were
configured the same.  This forced me to back out this kernel.

      In addition to NFS, the nouveau driver needed on the i7-6850k machine
had some bug that would pixelate much of the screen in a semi-random fashion.
Also, for whatever reason, x2goserver would not work properly with that
kernel.

      On the i7-6700k machines there were other problems.  On one, lightdm did
not start at boot, and I had to restart it several times to get it to actually
start.  On another, I was unable to get lightdm to start at all; only console
graphics worked, and for some reason the text was yellow instead of white.
The i7-6700k machines use the integrated graphics of the i7-6700k processor,
clocked very slowly to minimize the impact on the heat budget.

      So on the kernel developers' PPA I saw another test kernel, 4.15.0-21,
and installed it.  It also stops ethtool -G from inducing the Ethernet hang,
but on the i7-6850k it has already hung once spontaneously:

[ 4112.809034] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH                  <4f>
                  TDT                  <6f>
                  next_to_use          <6f>
                  next_to_clean        <4e>
                buffer_info[next_to_clean]:
                  time_stamp           <1003a2221>
                  next_to_watch        <4f>
                  jiffies              <1003a2d80>
                  next_to_watch.status <0>
                MAC Status             <80083>
                PHY Status             <796d>
                PHY 1000BASE-T Status  <7c00>
                PHY Extended Status    <3000>
                PCI Status             <10>
[ 4114.793198] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
                  TDH                  <4f>
                  TDT                  <6f>
                  next_to_use          <6f>
                  next_to_clean        <4e>
                buffer_info[next_to_clean]:
                  time_stamp           <1003a2221>
                  next_to_watch        <4f>
                  jiffies              <1003a3540>
                  next_to_watch.status <0>
                MAC Status             <80083>
                PHY Status             <796d>
                PHY 1000BASE-T Status  <7c00>
                PHY Extended Status    <3000>
                PCI Status             <10>
[ 4116.008748] ------------[ cut here ]------------
[ 4116.008750] NETDEV WATCHDOG: eno1 (e1000e): transmit queue 0 timed out
[ 4116.008765] WARNING: CPU: 8 PID: 59 at /build/linux-QLn4bB/linux-4.15.0/net/sched/sch_generic.c:323 dev_watchdog+0x21d/0x230
[ 4116.008765] Modules linked in: tcp_diag inet_diag vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink rpcsec_gss_krb5 nfsv4 nfs fscache msr bridge stp llc binfmt_misc nls_iso8859_1 quota_v2 quota_tree intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel snd_hda_codec_hdmi aes_x86_64 input_leds crypto_simd glue_helper cryptd snd_hda_codec_realtek snd_hda_codec_generic snd_seq_midi snd_seq_midi_event snd_hda_intel snd_hda_codec snd_hda_core snd_rawmidi snd_hwdep snd_seq snd_pcm snd_seq_device
[ 4116.008792]  snd_timer intel_cstate eeepc_wmi snd asus_wmi sparse_keymap wmi_bmof lpc_ich intel_rapl_perf intel_wmi_thunderbolt shpchp soundcore mei_me mei mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nct6775 hwmon_vid coretemp parport_pc ppdev nfsd auth_rpcgss nfs_acl lockd grace sunrpc lp parport ip_tables x_tables autofs4 btrfs zstd_compress raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log hid_generic usbhid hid raid10 nouveau video i2c_algo_bit ttm drm_kms_helper mxm_wmi syscopyarea sysfillrect e1000e sysimgblt fb_sys_fops ahci drm ptp libahci pps_core wmi
[ 4116.008825] CPU: 8 PID: 59 Comm: ksoftirqd/8 Not tainted 4.15.0-21-lowlatency #22-Ubuntu
[ 4116.008825] Hardware name: ASUS All Series/X99-E, BIOS 1801 08/11/2017
[ 4116.008827] RIP: 0010:dev_watchdog+0x21d/0x230
[ 4116.008827] RSP: 0018:ffffa9b8c64cbd60 EFLAGS: 00010282
[ 4116.008828] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[ 4116.008829] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff8e863f416490
[ 4116.008829] RBP: ffffa9b8c64cbd90 R08: 00000000000004f5 R09: 0000000000000004
[ 4116.008830] R10: ffffa9b8c64cbde8 R11: 0000000000000001 R12: ffff8e862ba8be80
[ 4116.008830] R13: ffff8e862a7f4000 R14: ffff8e862a7f4478 R15: 0000000000000001
[ 4116.008831] FS:  0000000000000000(0000) GS:ffff8e863f400000(0000) knlGS:0000000000000000
[ 4116.008832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4116.008832] CR2: 00007fd12a9df850 CR3: 00000018cc40a006 CR4: 00000000003626e0
[ 4116.008833] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4116.008833] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4116.008834] Call Trace:
[ 4116.008837]  ? qdisc_reset+0x70/0x70
[ 4116.008841]  call_timer_fn+0x30/0x160
[ 4116.008843]  ? qdisc_reset+0x70/0x70
[ 4116.008844]  run_timer_softirq+0x422/0x470
[ 4116.008847]  ? __switch_to+0x4c6/0x530
[ 4116.008848]  ? __switch_to+0x4c6/0x530
[ 4116.008852]  __do_softirq+0xdf/0x2e4
[ 4116.008855]  run_ksoftirqd+0x20/0x60
[ 4116.008857]  smpboot_thread_fn+0x131/0x1f0
[ 4116.008859]  kthread+0x121/0x140
[ 4116.008860]  ? sort_range+0x30/0x30
[ 4116.008861]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 4116.008863]  ret_from_fork+0x35/0x40
[ 4116.008863] Code: 37 00 49 63 4e e8 eb 92 4c 89 ef c6 05 3b 1c dc 00 01 e8 67 34 fd ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 a0 74 d9 8b e8 83 30 80 ff <0f> 0b eb c0 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
[ 4116.008880] ---[ end trace d9fd2f2b29f4469f ]---
[ 4116.008888] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
[ 4116.008959] bridge0: port 1(eno1) entered disabled state
[ 4116.008992] bridge0: topology change detected, propagating
[ 4119.688830] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 4119.688867] bridge0: port 1(eno1) entered blocking state
[ 4119.688871] bridge0: port 1(eno1) entered listening state
[ 4134.953070] bridge0: port 1(eno1) entered learning state
[ 4150.314349] bridge0: port 1(eno1) entered forwarding state
[ 4150.314351] bridge0: topology change detected, sending tcn bpdu

      This data was produced by: Linux iglulik 4.15.0-21-lowlatency #22-Ubuntu 
SMP PREEMPT Tue May 1 15:47:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
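
      In case it helps with reading the two "Detected Hardware Unit Hang"
dumps above, here is a rough sketch that pulls the hex fields out of them and
does the arithmetic.  The function names are made up for illustration, and the
ring size of 256 is an assumption (I believe it is the e1000e default, but the
ring size actually in use here is not shown in the log).  Only the fields
needed for the arithmetic are copied into the sample text.

```python
import re

# Trimmed copy of the two hang dumps from the log above.
HANG_DUMP = """\
[ 4112.809034] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <4f>
  TDT                  <6f>
  next_to_use          <6f>
  next_to_clean        <4e>
  time_stamp           <1003a2221>
  jiffies              <1003a2d80>
[ 4114.793198] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
  TDH                  <4f>
  TDT                  <6f>
  next_to_use          <6f>
  next_to_clean        <4e>
  time_stamp           <1003a2221>
  jiffies              <1003a3540>
"""

HEADER = re.compile(r"^\[\s*(\d+\.\d+)\]")
FIELD = re.compile(r"^\s*([A-Za-z_]+)\s+<([0-9a-f]+)>\s*$")


def parse_dumps(text):
    """Return one dict per hang dump, with hex fields converted to int."""
    dumps = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header and "Detected Hardware Unit Hang" in line:
            dumps.append({"dmesg_time": float(header.group(1))})
            continue
        field = FIELD.match(line)
        if field and dumps:
            dumps[-1][field.group(1)] = int(field.group(2), 16)
    return dumps


def pending_descriptors(dump, ring_size=256):
    """Descriptors between hardware head and tail (ring_size is assumed)."""
    return (dump["TDT"] - dump["TDH"]) % ring_size


def stalled_jiffies(dump):
    """Jiffies elapsed since the stuck buffer was time-stamped."""
    return dump["jiffies"] - dump["time_stamp"]


if __name__ == "__main__":
    first, second = parse_dumps(HANG_DUMP)
    print("pending descriptors:", pending_descriptors(first))
    print("stalled jiffies:", stalled_jiffies(first), stalled_jiffies(second))
    # Estimate HZ by comparing the jiffies delta with the dmesg time delta.
    ticks = second["jiffies"] - first["jiffies"]
    seconds = second["dmesg_time"] - first["dmesg_time"]
    print("estimated HZ:", round(ticks / seconds))
```

      TDH, TDT and time_stamp are identical in both dumps while jiffies keeps
advancing, which is at least consistent with the transmit head not moving
between the two reports.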

      Hope that something here is helpful.

-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
  Eskimo North Linux Friendly Internet Access, Shell Accounts, and Hosting.
    Knowledgeable human assistance, not telephone trees or script readers.
  See our web site: http://www.eskimo.com/ (206) 812-0051 or (800) 246-6874.

On Tue, 1 May 2018, Joseph Salisbury wrote:

> Date: Tue, 01 May 2018 19:16:25 -0000
> From: Joseph Salisbury <joseph.salisb...@canonical.com>
> Reply-To: Bug 1766377 <1766...@bugs.launchpad.net>
> To: nan...@eskimo.com
> Subject: [Bug 1766377] Re: Ethernet E1000 Controller Hangs
> 
> Can you see if this bug also happens with the latest mainline kernel, or
> if it was already fixed upstream?  It can be downloaded from:
>
> http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17-rc3
>
> -- 
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1766377
>
> Title:
>  Ethernet E1000 Controller Hangs
>
> Status in linux package in Ubuntu:
>  Incomplete
> Status in linux source package in Bionic:
>  Incomplete
>
> Bug description:
>       With Bionic kernel 4.15.0-15 and 4.15.0-17 I am experiencing periodic 
> hanging of the LAN connection.  This is happening on an Asus X99-DELUX 
> motherboard, controller specifications:
>  Intel® I218V, 1 x Gigabit LAN Controller(s)
>  Intel® I211-AT, 1 x Gigabit LAN
>  Dual Gigabit LAN controllers- 802.3az Energy Efficient Ethernet (EEE) 
> appliance
>  Support Teaming Technology
>  ASUS Turbo LAN Utility
>  The CPU is an i7-6850 and it is configured with 128GB of DDR4 RAM.
>  This machine has a number of Qemu/KVM virtual guests and is using a software 
> bridge to share the interface.
>  This did not happen with 17.10 and 4.13.0 kernel.  It is happening on 
> multiple machines here.
>  Here are the messages from dmesg:
>  [1016198.957850] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
>                     TDH                  <ea>
>                     TDT                  <2d>
>                     next_to_use          <2d>
>                     next_to_clean        <e9>
>                   buffer_info[next_to_clean]:
>                     time_stamp           <13c8d0008>
>                     next_to_watch        <ea>
>                     jiffies              <13c8d0880>
>                     next_to_watch.status <0>
>                   MAC Status             <80083>
>                   PHY Status             <796d>
>                   PHY 1000BASE-T Status  <3c00>
>                   PHY Extended Status    <3000>
>                   PCI Status             <10>
>  [1016200.942072] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
>                     TDH                  <ea>
>                     TDT                  <2d>
>                     next_to_use          <2d>
>                     next_to_clean        <e9>
>                   buffer_info[next_to_clean]:
>                     time_stamp           <13c8d0008>
>                     next_to_watch        <ea>
>                     jiffies              <13c8d1040>
>                     next_to_watch.status <0>
>                   MAC Status             <80083>
>                   PHY Status             <796d>
>                   PHY 1000BASE-T Status  <3c00>
>                   PHY Extended Status    <3000>
>                   PCI Status             <10>
>  [1016202.413607] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
>  [1016202.413701] bridge0: port 1(eno1) entered disabled state
>  [1016202.413732] bridge0: topology change detected, propagating
>  [1016206.666676] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow 
> Control: Rx/Tx
>  [1016206.666708] bridge0: port 1(eno1) entered blocking state
>  [1016206.666712] bridge0: port 1(eno1) entered listening state
>  [1016216.750911] bridge0: port 1(eno1) entered learning state
>  [1016232.110291] bridge0: port 1(eno1) entered forwarding state
>  [1016232.110294] bridge0: topology change detected, sending tcn bpdu
>  [1017834.390579] cfg80211: Loading compiled-in X.509 certificates for 
> regulatory database
>  [1017834.390770] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
>  [1017834.414792] platform regulatory.0: Direct firmware load for 
> regulatory.db failed with error -2
>  [1017834.414794] cfg80211: failed to load regulatory.db
>  If there is any other information I can provide to aid in resolution, please 
> contact me, nan...@eskimo.com.  Thank you!
>
>  ProblemType: Bug
>  DistroRelease: Ubuntu 18.04
>  Package: linux-image-4.15.0-15-lowlatency 4.15.0-15.16
>  ProcVersionSignature: Ubuntu 4.15.0-15.16-lowlatency 4.15.15
>  Uname: Linux 4.15.0-15-lowlatency x86_64
>  ApportVersion: 2.20.9-0ubuntu6
>  Architecture: amd64
>  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/hwC1D3', 
> '/dev/snd/hwC1D2', '/dev/snd/hwC1D1', '/dev/snd/hwC1D0', '/dev/snd/pcmC1D9p', 
> '/dev/snd/pcmC1D8p', '/dev/snd/pcmC1D7p', '/dev/snd/pcmC1D3p', 
> '/dev/snd/controlC1', '/dev/snd/by-path', '/dev/snd/hwC0D0', 
> '/dev/snd/pcmC0D2c', '/dev/snd/pcmC0D1p', '/dev/snd/pcmC0D0c', 
> '/dev/snd/pcmC0D0p', '/dev/snd/controlC0', '/dev/snd/seq', '/dev/snd/timer'] 
> failed with exit code 1:
>  CurrentDesktop: MATE
>  Date: Mon Apr 23 16:45:30 2018
>  HibernationDevice: RESUME=UUID=963cb206-8962-4fc0-82a1-fc4f02a9b5c5
>  InstallationDate: Installed on 2017-05-05 (353 days ago)
>  InstallationMedia: Ubuntu-MATE 17.04 "Zesty Zapus" - Release amd64 (20170412)
>  MachineType: ASUS All Series
>  ProcFB: 0 nouveaufb
>  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-15-lowlatency 
> root=UUID=28825f5b-a6fd-4e09-982c-0513ae4d2842 ro quiet splash vt.handoff=1
>  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
> PulseAudio daemon running, or not running as session daemon.
>  RelatedPackageVersions:
>   linux-restricted-modules-4.15.0-15-lowlatency N/A
>   linux-backports-modules-4.15.0-15-lowlatency  N/A
>   linux-firmware                                1.173
>  RfKill:
>
>  SourcePackage: linux
>  UpgradeStatus: Upgraded to bionic on 2018-04-12 (11 days ago)
>  dmi.bios.date: 08/11/2017
>  dmi.bios.vendor: American Megatrends Inc.
>  dmi.bios.version: 1801
>  dmi.board.asset.tag: Default string
>  dmi.board.name: X99-E
>  dmi.board.vendor: ASUSTeK COMPUTER INC.
>  dmi.board.version: Rev 1.xx
>  dmi.chassis.asset.tag: Default string
>  dmi.chassis.type: 3
>  dmi.chassis.vendor: Default string
>  dmi.chassis.version: Default string
>  dmi.modalias: 
> dmi:bvnAmericanMegatrendsInc.:bvr1801:bd08/11/2017:svnASUS:pnAllSeries:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnX99-E:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
>  dmi.product.family: ASUS MB
>  dmi.product.name: All Series
>  dmi.product.version: System Version
>  dmi.sys.vendor: ASUS
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1766377/+subscriptions
>
