[Kernel-packages] [Bug 1921137] Re: mount.ocfs2 causes kernel BUG at lib/string.c:1149!

2021-10-20 Thread Mikko Tanner
Another data point here:

kernel: [5150033.094216] kernel BUG at lib/string.c:1149!
kernel: [5150033.094224] invalid opcode:  [#1] SMP NOPTI
kernel: [5150033.094229] CPU: 1 PID: 2940890 Comm: mount.ocfs2 Tainted: P   OE 5.13.12-051312-generic #202108181219-Ubuntu
kernel: [5150033.094233] Hardware name: Gigabyte Technology Co., Ltd. X399 DESIGNARE EX/X399 DESIGNARE EX-CF, BIOS F12i 09/24/2019
kernel: [5150033.094236] RIP: 0010:fortify_panic+0x13/0x15
kernel: [5150033.094244] Code: 35 37 a8 3b 01 48 c7 c7 93 63 01 b6 e8 c9 c9 fe ff 41 5c 41 5d 5d c3 55 48 89 fe 48 c7 c7 e0 63 01 b6 48 89 e5 e8 b0 c9 fe ff <0f> 0b 48 c7 c7 18 dc c8 b5 e8 df ff ff ff 48 c7 c7 10 dc c8 b5 e8
kernel: [5150033.094248] RSP: 0018:b4f1ee523c50 EFLAGS: 00010246
kernel: [5150033.094252] RAX: 0022 RBX: 9cf5639bb000 RCX: 

kernel: [5150033.094254] RDX:  RSI: 9d033e2589c0 RDI: 
9d033e2589c0
kernel: [5150033.094257] RBP: b4f1ee523c50 R08: 9d033e2589c0 R09: 
b4f1ee523a30
kernel: [5150033.094258] R10: 0001 R11: 0001 R12: 
0004
kernel: [5150033.094260] R13: 9cf496853000 R14: 9d00f6a91000 R15: 
9cf5639bb291
kernel: [5150033.094262] FS:  7fb7fd6d3b80() GS:9d033e24() 
knlGS:
kernel: [5150033.094265] CS:  0010 DS:  ES:  CR0: 80050033
kernel: [5150033.094267] CR2: 55f52c08f040 CR3: 00029a09e000 CR4: 
003506e0
kernel: [5150033.094270] Call Trace:
kernel: [5150033.094276]  ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2]
kernel: [5150033.094347]  ? ocfs2_verify_volume+0x143/0x310 [ocfs2]
kernel: [5150033.094410]  ocfs2_fill_super+0x262/0xda0 [ocfs2]
kernel: [5150033.094473]  mount_bdev+0x18d/0x1c0
kernel: [5150033.094478]  ? ocfs2_initialize_super.isra.0+0x1070/0x1070 [ocfs2]
kernel: [5150033.094539]  ocfs2_mount+0x15/0x20 [ocfs2]
kernel: [5150033.094599]  legacy_get_tree+0x2b/0x50
kernel: [5150033.094604]  vfs_get_tree+0x2a/0xc0
kernel: [5150033.094607]  ? capable+0x19/0x20
kernel: [5150033.094612]  path_mount+0x468/0xa60
kernel: [5150033.094617]  do_mount+0x7c/0xa0
kernel: [5150033.094620]  __x64_sys_mount+0x8b/0xe0
kernel: [5150033.094623]  do_syscall_64+0x61/0xb0
kernel: [5150033.094627]  ? syscall_exit_to_user_mode+0x27/0x50
kernel: [5150033.094632]  ? __x64_sys_readlink+0x1f/0x30
kernel: [5150033.094635]  ? do_syscall_64+0x6e/0xb0
kernel: [5150033.094638]  ? irqentry_exit+0x19/0x30
kernel: [5150033.094641]  ? exc_page_fault+0x8f/0x170
kernel: [5150033.094645]  ? asm_exc_page_fault+0x8/0x30
kernel: [5150033.094649]  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: [5150033.094651] RIP: 0033:0x7fb7fd88cdde
kernel: [5150033.094679] Code: 48 8b 0d b5 80 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 82 80 0c 00 f7 d8 64 89 01 48
kernel: [5150033.094682] RSP: 002b:7ffea9610c18 EFLAGS: 0246 ORIG_RAX: 
00a5
kernel: [5150033.094686] RAX: ffda RBX:  RCX: 
7fb7fd88cdde
kernel: [5150033.094688] RDX: 55cd6acb10ae RSI: 55cd6c9a7340 RDI: 
55cd6c9ac140
kernel: [5150033.094689] RBP: 7ffea9610dc0 R08: 55cd6c9ac0e0 R09: 
7ffea960e650
kernel: [5150033.094691] R10:  R11: 0246 R12: 
7ffea9610cb0
kernel: [5150033.094693] R13: 7ffea9610c30 R14: 55cd6c9ac0e0 R15: 

kernel: [5150033.094696] Modules linked in: ocfs2_stack_o2cb ocfs2_dlm ocfs2 
ocfs2_nodemanager ocfs2_stackglue quota_tree nft_reject_inet nf_reject_ipv4 
nf_reject_ipv6 nft_reject nft_ct nft_counter nft_limit nft_meta_bridge bridge 
stp llc snd_seq_dummy vhost_net vhost vhost_iotlb tap rfcomm nf_tables 
ip6table_filter ip6_tables iptable_filter bpfilter wireguard curve25519_x86_64 
libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 
libcurve25519_generic libchacha libblake2s_generic ip6_udp_tunnel udp_tunnel 
nfnetlink_cttimeout nfnetlink cmac algif_hash openvswitch nsh algif_skcipher 
nf_conncount af_alg nf_nat bnep nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 
binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common amd64_edac 
edac_mce_amd kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel 
snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio crypto_simd 
snd_hda_codec_hdmi cryptd rapl iwlmvm snd_hda_intel snd_intel_dspcfg 
snd_intel_sdw_acpi mac80211 uvcvideo
kernel: [5150033.094758]  snd_hda_codec videobuf2_vmalloc videobuf2_memops 
snd_hda_core snd_seq_midi videobuf2_v4l2 btusb snd_seq_midi_event libarc4 btrtl 
videobuf2_common snd_hwdep snd_rawmidi btbcm btintel videodev snd_seq joydev 
gigabyte_wmi mc input_leds serio_raw snd_pcm snd_seq_device wmi_bmof bluetooth 
iwlwifi snd_timer ecdh_generic efi_pstore snd ecc ccp mxm_wmi cfg80211 k10temp 
soundcore lz4 lz4_compress mac_hid nvidia_uvm(POE) tcp_htcp sch_cake ib_umad 
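
When reading a trace like the one above, it can help to separate the frames the unwinder considers reliable from those printed with a leading "?", which are addresses found on the stack that may not be part of the actual call chain. The following is a minimal Python sketch for doing that; the function name reliable_frames and the regular expression are illustrative assumptions, not part of any kernel tooling, and the pattern only covers the "symbol+0xoff/0xsize [module]" frame format seen in this log.

```python
import re

# One stack frame as printed in the oops above: an optional "? " marker,
# then symbol+offset/size, then an optional "[module]" suffix.
FRAME_RE = re.compile(
    r"\]\s+(?P<guess>\?\s+)?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def reliable_frames(log_lines):
    """Return (symbol, module) pairs for frames not marked with '?'.

    module is None for symbols that belong to the core kernel image.
    """
    frames = []
    for line in log_lines:
        m = FRAME_RE.search(line)
        if m and not m.group("guess"):
            frames.append((m.group("symbol"), m.group("module")))
    return frames
```

Fed the Call Trace lines from this report, the sketch would keep entries such as ocfs2_initialize_super.isra.0.cold and mount_bdev, and skip the "?" entries and the RIP line.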

[Kernel-packages] [Bug 1898057] Re: Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4

2020-12-11 Thread Mikko Tanner
After a reboot of the whole fabric (switches and machines), this problem
has not resurfaced. Conceivably this could have been a transient error
state, so I will close this with "invalid".

** Changed in: linux (Ubuntu)
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1898057

Title:
  Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4

Status in linux package in Ubuntu:
  Invalid

Bug description:
  After upgrading 3 servers from linux-image-5.3.0-40-generic to
  linux-image-5.4.0-48-generic I have started seeing the following queue
  timeouts from IP-over-Infiniband (ipoib) devices. The devices in
  question are (with newest available firmware, 2.42.5000):

  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
  Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
  Kernel driver in use: mlx4_core

  Below is the WARN from one machine's syslog. The others are
  practically identical. When the WARN happens on any of the machines,
  the other 2 will _also_ exhibit queue timeouts. Additionally, other
  (unrelated) machines connected to the same infiniband fabric will
  exhibit a 12-second transmission delay. This could conceivably be
  caused by these 3 servers also being Subnet Managers (opensm package).

  The infiniband fabric is partitioned, with the affected partition
  (8011) seeing most of the traffic.

  

  kernel: [52642.480066] [ cut here ]
  kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out
  kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at /build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
  kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw 
ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables 
nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc 
nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm 
sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal 
scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost 
kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh 
nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev 
input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich 
acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf 
ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables 
autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) 
spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456
  kernel: [52642.480182]  async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash 
dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast 
drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect 
sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo 
crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core 
libahci scsi_transport_sas mdio wmi
  kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P   OE 5.4.0-48-generic #52~18.04.1-Ubuntu
  kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 2.0b 04/13/2017
  kernel: [52642.480226] RIP: 0010:dev_watchdog+0x264/0x270
  kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
  kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282
  kernel: [52642.480233] RAX:  RBX:  RCX: 
083f
  kernel: [52642.480234] RDX:  RSI: 00f6 RDI: 
083f
  kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 
0003
  kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 
0001
  kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 
93f2d293c880
  kernel: [52642.480240] FS:  () GS:93f33f64() 
knlGS:
  kernel: [52642.480242] CS:  0010 DS:  ES:  CR0: 80050033
  kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 
001626e0
  kernel: [52642.480245] Call Trace:
  kernel: [52642.480247]  
  kernel: [52642.480252]  ? pfifo_fast_reset+0x110/0x110
  kernel: [52642.480255]  call_timer_fn+0x32/0x130
  kernel: [52642.480258]  run_timer_softirq+0x443/0x480
  kernel: [5264
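
Since the description says the timeout shows up on all three servers whenever it appears on one, a small script that pulls the NETDEV WATCHDOG events out of each machine's log may make that comparison easier. The following is a minimal Python sketch under stated assumptions: the function name find_watchdog_timeouts is hypothetical, and the pattern is written only against the message format quoted in this report. Note that the bracketed values are seconds since boot, so comparing events across hosts would additionally need the wall-clock syslog timestamps, which are not shown here.

```python
import re

# Matches the netdev watchdog message format quoted in the bug description.
# The parenthesised driver name may be empty, as it is in the archived log.
WATCHDOG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+"
    r"\((?P<driver>[^)]*)\):\s+transmit queue\s+(?P<queue>\d+)\s+timed\s+out"
)


def find_watchdog_timeouts(log_text):
    """Return (seconds_since_boot, interface, queue) for each timeout found."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [
        (float(m["ts"]), m["iface"], int(m["queue"]))
        for m in WATCHDOG_RE.finditer(flat)
    ]
```

Running this over the log from each of the three servers and comparing the interface and queue fields would show whether the same partition interface (ib0.8011 in this report) is the one timing out everywhere.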

[Kernel-packages] [Bug 1898057] Re: Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4

2020-10-01 Thread Mikko Tanner
** Changed in: linux (Ubuntu)
   Status: New => Confirmed


[Kernel-packages] [Bug 1898057] WifiSyslog.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "WifiSyslog.txt"
   
https://bugs.launchpad.net/bugs/1898057/+attachment/5416159/+files/WifiSyslog.txt

** Description changed:

  After upgrading 3 servers from linux-image-5.3.0-40-generic to
  linux-image-5.4.0-48-generic I have started seeing the following queue
  timeouts from IP-over-Infiniband (ipoib) devices. The devices in
  question are (with newest available firmware, 2.42.5000):
  
  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
- Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
- Kernel driver in use: mlx4_core
+ Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
+ Kernel driver in use: mlx4_core
  
  Below is the WARN from one machine's syslog. The others are practically
  identical. When the WARN happens on any of the machines, the other 2 will
  _also_ exhibit queue timeouts. Additionally, other (unrelated) machines
  connected to the same infiniband fabric will exhibit a 12-second
  transmission delay. This could conceivably be caused by these 3 servers
  also being Subnet Managers (opensm package).
  
  The infiniband fabric is partitioned, with the affected partition (8011)
  seeing most of the traffic.
  
  
  
  kernel: [52642.480066] [ cut here ]
  kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out
  kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at /build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
  kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw 
ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables 
nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc 
nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm 
sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal 
scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost 
kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh 
nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev 
input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich 
acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf 
ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables 
autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) 
spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456
  kernel: [52642.480182]  async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash 
dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast 
drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect 
sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo 
crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core 
libahci scsi_transport_sas mdio wmi
  kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P   OE 5.4.0-48-generic #52~18.04.1-Ubuntu
  kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 2.0b 04/13/2017
  kernel: [52642.480226] RIP: 0010:dev_watchdog+0x264/0x270
  kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
  kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282
  kernel: [52642.480233] RAX:  RBX:  RCX: 
083f
  kernel: [52642.480234] RDX:  RSI: 00f6 RDI: 
083f
  kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 
0003
  kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 
0001
  kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 
93f2d293c880
  kernel: [52642.480240] FS:  () GS:93f33f64() 
knlGS:
  kernel: [52642.480242] CS:  0010 DS:  ES:  CR0: 80050033
  kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 
001626e0
  kernel: [52642.480245] Call Trace:
  kernel: [52642.480247]  
  kernel: [52642.480252]  ? pfifo_fast_reset+0x110/0x110
  kernel: [52642.480255]  call_timer_fn+0x32/0x130
  kernel: [52642.480258]  run_timer_softirq+0x443/0x480
  kernel: [52642.480262]  ? ktime_get+0x43/0xa0
  kernel: [52642.480268]  ? lapic_next_deadline+0x26/0x30
  kernel: [52642.480273]  __do_softirq+0xe4/0x2da
  kernel: [52642.480278]  irq_exit+0xae/0xb0
  kernel: [52642.480282]  smp_apic_timer_interrupt+0x79/0x130
  kernel

[Kernel-packages] [Bug 1898057] Lspci.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "Lspci.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416152/+files/Lspci.txt


[Kernel-packages] [Bug 1898057] ProcModules.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "ProcModules.txt"
   
https://bugs.launchpad.net/bugs/1898057/+attachment/5416157/+files/ProcModules.txt


[Kernel-packages] [Bug 1898057] ProcCpuinfoMinimal.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "ProcCpuinfoMinimal.txt"
   
https://bugs.launchpad.net/bugs/1898057/+attachment/5416154/+files/ProcCpuinfoMinimal.txt


[Kernel-packages] [Bug 1898057] CurrentDmesg.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "CurrentDmesg.txt"
   
https://bugs.launchpad.net/bugs/1898057/+attachment/5416151/+files/CurrentDmesg.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1898057

Title:
  Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  After upgrading 3 servers from linux-image-5.3.0-40-generic to linux-
  image-5.4.0-48-generic I have started seeing the following queue
  timeouts from IP-over-Infiniband (ipoib) devices. The devices in
  question are (with newest available firmware, 2.42.5000):

  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
    Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
    Kernel driver in use: mlx4_core

  Below is the WARN from one machine's syslog. The others are
  practically identical. When the WARN happens on any of the machines,
  the other 2 will _also_ exhibit queue timeouts. Additionally, other
  (unrelated) machines connected to the same infiniband fabric will
  exhibit a 12-second transmission delay. This could conceivably be
  caused by these 3 servers also being Subnet Managers (opensm package).

  The infiniband fabric is partitioned, with the affected partition
  (8011) seeing most of the traffic.

  

  kernel: [52642.480066] ------------[ cut here ]------------
  kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out
  kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at 
/build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 
dev_watchdog+0x264/0x270
  kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw 
ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables 
nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc 
nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm 
sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal 
scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost 
kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh 
nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev 
input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich 
acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf 
ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables 
autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) 
spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456
  kernel: [52642.480182]  async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash 
dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast 
drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect 
sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo 
crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core 
libahci scsi_transport_sas mdio wmi
  kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P   
OE 5.4.0-48-generic #52~18.04.1-Ubuntu
  kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 
2.0b 04/13/2017
  kernel: [52642.480226] RIP: 0010:dev_watchdog+0x264/0x270
  kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 
01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff 
<0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
  kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282
  kernel: [52642.480233] RAX:  RBX:  RCX: 
083f
  kernel: [52642.480234] RDX:  RSI: 00f6 RDI: 
083f
  kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 
0003
  kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 
0001
  kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 
93f2d293c880
  kernel: [52642.480240] FS:  () GS:93f33f64() 
knlGS:
  kernel: [52642.480242] CS:  0010 DS:  ES:  CR0: 80050033
  kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 
001626e0
  kernel: [52642.480245] Call Trace:
  kernel: [52642.480247]  
  kernel: [52642.480252]  ? pfifo_fast_reset+0x110/0x110
  kernel: [52642.480255]  call_timer_fn+0x32/0x130
  kernel: [52642.480258]  run_timer_softirq+0x443/0x480
  kernel: [52642.480262]  ? ktime_get+0x43/0xa0
  kernel: [52642.480268]  ? lapic_next_deadline+0x26/0x30
  kernel: [526

[Kernel-packages] [Bug 1898057] UdevDb.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "UdevDb.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416158/+files/UdevDb.txt


[Kernel-packages] [Bug 1898057] ProcEnviron.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "ProcEnviron.txt"
   
https://bugs.launchpad.net/bugs/1898057/+attachment/5416155/+files/ProcEnviron.txt


[Kernel-packages] [Bug 1898057] Lsusb.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "Lsusb.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416153/+files/Lsusb.txt


[Kernel-packages] [Bug 1898057] Re: Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4

2020-10-01 Thread Mikko Tanner
apport information

** Package changed: linux-hwe-5.4 (Ubuntu) => linux (Ubuntu)

** Tags added: apport-collected bionic

** Description changed:

  After upgrading 3 servers from linux-image-5.3.0-40-generic to linux-
  image-5.4.0-48-generic I have started seeing the following queue
  timeouts from IP-over-Infiniband (ipoib) devices. The devices in
  question are (with newest available firmware, 2.42.5000):
  
  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
    Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
    Kernel driver in use: mlx4_core
  
  Below is the WARN from one machine's syslog. The others are practically
  identical. When the WARN happens on any of the machines, the other 2 will
  _also_ exhibit queue timeouts. Additionally, other (unrelated) machines
  connected to the same infiniband fabric will exhibit a 12-second
  transmission delay. This could conceivably be caused by these 3 servers
  also being Subnet Managers (opensm package).
  
  The infiniband fabric is partitioned, with the affected partition (8011)
  seeing most of the traffic.
  
  
  
  kernel: [52642.480066] ------------[ cut here ]------------
  kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out
  kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at 
/build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 
dev_watchdog+0x264/0x270
  kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw 
ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables 
nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc 
nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm 
sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal 
scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost 
kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh 
nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev 
input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich 
acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf 
ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables 
autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) 
spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456
  kernel: [52642.480182]  async_raid6_recov async_memcpy async_pq async_xor 
async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash 
dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast 
drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect 
sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo 
crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core 
libahci scsi_transport_sas mdio wmi
  kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P   
OE 5.4.0-48-generic #52~18.04.1-Ubuntu
  kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 
2.0b 04/13/2017
  kernel: [52642.480226] RIP: 0010:dev_watchdog+0x264/0x270
  kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 
01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff 
<0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
  kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282
  kernel: [52642.480233] RAX:  RBX:  RCX: 
083f
  kernel: [52642.480234] RDX:  RSI: 00f6 RDI: 
083f
  kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 
0003
  kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 
0001
  kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 
93f2d293c880
  kernel: [52642.480240] FS:  () GS:93f33f64() 
knlGS:
  kernel: [52642.480242] CS:  0010 DS:  ES:  CR0: 80050033
  kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 
001626e0
  kernel: [52642.480245] Call Trace:
  kernel: [52642.480247]  
  kernel: [52642.480252]  ? pfifo_fast_reset+0x110/0x110
  kernel: [52642.480255]  call_timer_fn+0x32/0x130
  kernel: [52642.480258]  run_timer_softirq+0x443/0x480
  kernel: [52642.480262]  ? ktime_get+0x43/0xa0
  kernel: [52642.480268]  ? lapic_next_deadline+0x26/0x30
  kernel: [52642.480273]  __do_softirq+0xe4/0x2da
  kernel: [52642.480278]  irq_exit+0xae/0xb0
  kernel: [52642.480282]  smp_apic_timer_interrupt+0x79/0x130
  kernel: [52642.480285]  apic_timer_interrupt+0xf/0x20
  kernel: [52642.480286]  
  kernel: [52642.480292] RIP: 0010:cpuidle_enter_state+0xbc/0x440
  kernel
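
The watchdog line in the trace above names the device (ib0.8011) and the stalled transmit queue (0). As a minimal sketch, not part of the original report, a shell function along these lines could pull those two fields out of a syslog stream so that timeouts on the three servers can be lined up. The function name `watchdog_timeouts` is hypothetical, and the pattern assumes unwrapped log lines, unlike the wrapped ones in this archive.

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<interface> <queue>" for every NETDEV WATCHDOG transmit timeout.
watchdog_timeouts() {
    sed -n 's/.*NETDEV WATCHDOG: \([^ ]*\) .*transmit queue \([0-9][0-9]*\) timed.*/\1 \2/p'
}
```

It could be fed with, for example, `grep 'NETDEV WATCHDOG' /var/log/syslog | watchdog_timeouts | sort | uniq -c`, assuming the kernel messages end up in /var/log/syslog as they do in the report.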

[Kernel-packages] [Bug 1898057] ProcInterrupts.txt

2020-10-01 Thread Mikko Tanner
apport information

** Attachment added: "ProcInterrupts.txt"
   
https://bugs.launchpad.net/bugs/1898057/+attachment/5416156/+files/ProcInterrupts.txt


[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS

2017-06-20 Thread Mikko Tanner
# apt update
[...]
# apt install linux-image-4.10.0-23-generic
N: Unable to locate package linux-image-4.10.0-23-generic

(Yes, I know it is in proposed.)

However, 4.10.0-24 does exist and gets installed with -edge, and it is
known to be broken. Why push it out at all? Am I wrong to assume that a
later-numbered (point?) release would contain the fixes from previous
versions?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1679823

Title:
  bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10
  in XENIAL LTS

Status in Linux:
  Unknown
Status in linux package in Ubuntu:
  Triaged
Status in linux-hwe package in Ubuntu:
  Fix Released
Status in linux-hwe-edge package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  Confirmed
Status in linux-hwe source package in Xenial:
  Fix Released
Status in linux-hwe-edge source package in Xenial:
  Confirmed
Status in linux source package in Yakkety:
  Fix Released
Status in linux-hwe source package in Yakkety:
  Fix Committed
Status in linux-hwe-edge source package in Yakkety:
  Fix Committed
Status in linux source package in Zesty:
  In Progress
Status in linux-hwe source package in Zesty:
  Confirmed
Status in linux-hwe-edge source package in Zesty:
  Confirmed

Bug description:
  Since I upgraded the kernel from linux-image-4.8.0-46-generic to
  linux-image-extra-4.10.0-14-generic I'm facing an issue when I want to
  change the MTU.

  It seems to be a known bug that has already been fixed:
  https://bugzilla.kernel.org/show_bug.cgi?id=194763

  # ip l sh eno49
  2: eno49:  mtu 9000 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
      link/ether 5c:b9:01:8a:61:e9 brd ff:ff:ff:ff:ff:ff

  # ip l sh eno50
  3: eno50:  mtu 9000 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
      link/ether 5c:b9:01:8a:61:e9 brd ff:ff:ff:ff:ff:ff

  # ip l sh bond0
  6: bond0:  mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
      link/ether 5c:b9:01:8a:61:e9 brd ff:ff:ff:ff:ff:ff

  # ip l set mtu 9000 bond0
  RTNETLINK answers: Invalid argument

  root@controller002[SRV][YUL]:~# tail -1 /var/log/syslog
  Apr  4 19:36:28 controller002 kernel: [ 8869.077853] bond0: Invalid MTU 9000 requested, hw max 1500

  # modinfo ixgbe
  filename:       /lib/modules/4.10.0-14-generic/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
  version:        4.4.0-k
  license:        GPL
  description:    Intel(R) 10 Gigabit PCI Express Network Driver

  # modinfo bonding
  filename:       /lib/modules/4.10.0-14-generic/kernel/drivers/net/bonding/bonding.ko
  author:         Thomas Davis, tada...@lbl.gov and many others
  description:    Ethernet Channel Bonding Driver, v3.7.1
  version:        3.7.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1679823/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS

2017-06-20 Thread Mikko Tanner
The MTU problem affecting OpenvSwitch bridges still exists (or
resurfaced) in linux-generic-hwe-16.04-edge (4.10.0.24.17).

# grep -i mtu /var/log/syslog
[...] br-int: Invalid MTU 9000 requested, hw max 1500
[...] lxcbr0: Invalid MTU 9000 requested, hw max 1500

Pretty please STOP BREAKING things. Losing network access is especially
annoying to fix remotely. Luckily I tested this before upgrading more
critical systems (as everyone should, of course).
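
For anyone checking several hosts, the grep above can be narrowed to the
affected interfaces. This is only a rough sketch: the function name
rejected_mtu is made up for this example, and it assumes the kernel message
has exactly the "Invalid MTU N requested, hw max M" form shown in the log
lines above, with some prefix and a space before the interface name.

```shell
#!/bin/sh
# Sketch: list interfaces whose MTU change was rejected by the kernel.
# Reads syslog-style lines on stdin and prints "<iface> <requested> <hw max>".
# rejected_mtu is a hypothetical helper name, not an existing tool.
rejected_mtu() {
    # Keep only lines with the "Invalid MTU" message and pull out three fields:
    # the interface name before the colon, the requested MTU, and the hw max.
    sed -n 's/.*[[:space:]]\([^[:space:]:]*\): Invalid MTU \([0-9][0-9]*\) requested, hw max \([0-9][0-9]*\).*/\1 \2 \3/p'
}
```

Typical use would be something like "rejected_mtu < /var/log/syslog | sort -u",
run before and after a kernel upgrade on a test machine.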


[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS

2017-06-08 Thread Mikko Tanner
Ok, so the fix is rather trivial. I've verified that building the
4.10.0-22 kernel from Ubuntu sources with the following patch fixes the
OpenvSwitch MTU issue:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/openvswitch/vport-internal_dev.c?id=425df17ce3a26d98f76e2b6b0af2acf4aeb0b026

Here are the instructions for people who want the fixed kernel asap:

# apt-get source linux-image-4.10.0-22-generic
# apt-get build-dep linux-image-4.10.0-22-generic

# fakeroot debian/rules binary-headers binary-generic binary-perarch

Install the needed DEBs and reboot.
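
The commands above do not spell out the patching step itself. A rough
sketch of the whole sequence as one shell function follows. Everything
beyond the three commands quoted above is an assumption: the function and
helper names (build_patched_kernel, run), the DRY_RUN switch, the source
directory argument (whatever directory "apt-get source" unpacked), and the
patch file argument (the commit linked above, saved locally as a patch).

```shell
#!/bin/sh
# Sketch only. Assumptions, not taken from the instructions above:
#   $1 = directory that "apt-get source" unpacked the kernel source into
#   $2 = absolute path to the upstream commit saved as a patch file
# Set DRY_RUN=1 to print the commands instead of running them.
KVER="4.10.0-22"

# Hypothetical helper: run a command, or just print it in dry-run mode.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the steps: fetch, patch, build.
build_patched_kernel() {
    srcdir=$1
    patchfile=$2
    run apt-get source "linux-image-${KVER}-generic" || return 1
    run apt-get build-dep "linux-image-${KVER}-generic" || return 1
    # The patch has to be applied from the top of the unpacked source tree.
    run cd "$srcdir" || return 1
    run patch -p1 -i "$patchfile" || return 1
    run fakeroot debian/rules binary-headers binary-generic binary-perarch
}
```

Running it once with DRY_RUN=1 shows the exact commands before anything is
downloaded or built. The resulting DEBs still have to be installed by hand.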

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1679823

Title:
  bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10
  in XENIAL LTS

Status in Linux:
  Unknown
Status in linux package in Ubuntu:
  Triaged
Status in linux-hwe package in Ubuntu:
  Fix Released
Status in linux-hwe-edge package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  New
Status in linux-hwe source package in Xenial:
  Fix Released
Status in linux-hwe-edge source package in Xenial:
  New
Status in linux source package in Yakkety:
  Fix Released
Status in linux-hwe source package in Yakkety:
  Fix Committed
Status in linux-hwe-edge source package in Yakkety:
  Fix Committed
Status in linux source package in Zesty:
  In Progress
Status in linux-hwe source package in Zesty:
  Confirmed
Status in linux-hwe-edge source package in Zesty:
  Confirmed


[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS

2017-06-08 Thread Mikko Tanner
The 4.10 series (hwe-edge) kernel doesn't seem to have gotten these fixes
yet, as 4.10.0-22-generic #24~16.04.1-Ubuntu still fails to set the
openvswitch MTU correctly. When can we expect a fix to be released for
-edge?
