[Kernel-packages] [Bug 1921137] Re: mount.ocfs2 causes kernel BUG at lib/string.c:1149!
Another data point here:

kernel: [5150033.094216] kernel BUG at lib/string.c:1149!
kernel: [5150033.094224] invalid opcode: [#1] SMP NOPTI
kernel: [5150033.094229] CPU: 1 PID: 2940890 Comm: mount.ocfs2 Tainted: P OE 5.13.12-051312-generic #202108181219-Ubuntu
kernel: [5150033.094233] Hardware name: Gigabyte Technology Co., Ltd. X399 DESIGNARE EX/X399 DESIGNARE EX-CF, BIOS F12i 09/24/2019
kernel: [5150033.094236] RIP: 0010:fortify_panic+0x13/0x15
kernel: [5150033.094244] Code: 35 37 a8 3b 01 48 c7 c7 93 63 01 b6 e8 c9 c9 fe ff 41 5c 41 5d 5d c3 55 48 89 fe 48 c7 c7 e0 63 01 b6 48 89 e5 e8 b0 c9 fe ff <0f> 0b 48 c7 c7 18 dc c8 b5 e8 df ff ff ff 48 c7 c7 10 dc c8 b5 e8
kernel: [5150033.094248] RSP: 0018:b4f1ee523c50 EFLAGS: 00010246
kernel: [5150033.094252] RAX: 0022 RBX: 9cf5639bb000 RCX:
kernel: [5150033.094254] RDX: RSI: 9d033e2589c0 RDI: 9d033e2589c0
kernel: [5150033.094257] RBP: b4f1ee523c50 R08: 9d033e2589c0 R09: b4f1ee523a30
kernel: [5150033.094258] R10: 0001 R11: 0001 R12: 0004
kernel: [5150033.094260] R13: 9cf496853000 R14: 9d00f6a91000 R15: 9cf5639bb291
kernel: [5150033.094262] FS: 7fb7fd6d3b80() GS:9d033e24() knlGS:
kernel: [5150033.094265] CS: 0010 DS: ES: CR0: 80050033
kernel: [5150033.094267] CR2: 55f52c08f040 CR3: 00029a09e000 CR4: 003506e0
kernel: [5150033.094270] Call Trace:
kernel: [5150033.094276] ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2]
kernel: [5150033.094347] ? ocfs2_verify_volume+0x143/0x310 [ocfs2]
kernel: [5150033.094410] ocfs2_fill_super+0x262/0xda0 [ocfs2]
kernel: [5150033.094473] mount_bdev+0x18d/0x1c0
kernel: [5150033.094478] ? ocfs2_initialize_super.isra.0+0x1070/0x1070 [ocfs2]
kernel: [5150033.094539] ocfs2_mount+0x15/0x20 [ocfs2]
kernel: [5150033.094599] legacy_get_tree+0x2b/0x50
kernel: [5150033.094604] vfs_get_tree+0x2a/0xc0
kernel: [5150033.094607] ? capable+0x19/0x20
kernel: [5150033.094612] path_mount+0x468/0xa60
kernel: [5150033.094617] do_mount+0x7c/0xa0
kernel: [5150033.094620] __x64_sys_mount+0x8b/0xe0
kernel: [5150033.094623] do_syscall_64+0x61/0xb0
kernel: [5150033.094627] ? syscall_exit_to_user_mode+0x27/0x50
kernel: [5150033.094632] ? __x64_sys_readlink+0x1f/0x30
kernel: [5150033.094635] ? do_syscall_64+0x6e/0xb0
kernel: [5150033.094638] ? irqentry_exit+0x19/0x30
kernel: [5150033.094641] ? exc_page_fault+0x8f/0x170
kernel: [5150033.094645] ? asm_exc_page_fault+0x8/0x30
kernel: [5150033.094649] entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: [5150033.094651] RIP: 0033:0x7fb7fd88cdde
kernel: [5150033.094679] Code: 48 8b 0d b5 80 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 82 80 0c 00 f7 d8 64 89 01 48
kernel: [5150033.094682] RSP: 002b:7ffea9610c18 EFLAGS: 0246 ORIG_RAX: 00a5
kernel: [5150033.094686] RAX: ffda RBX: RCX: 7fb7fd88cdde
kernel: [5150033.094688] RDX: 55cd6acb10ae RSI: 55cd6c9a7340 RDI: 55cd6c9ac140
kernel: [5150033.094689] RBP: 7ffea9610dc0 R08: 55cd6c9ac0e0 R09: 7ffea960e650
kernel: [5150033.094691] R10: R11: 0246 R12: 7ffea9610cb0
kernel: [5150033.094693] R13: 7ffea9610c30 R14: 55cd6c9ac0e0 R15:
kernel: [5150033.094696] Modules linked in: ocfs2_stack_o2cb ocfs2_dlm ocfs2 ocfs2_nodemanager ocfs2_stackglue quota_tree nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_counter nft_limit nft_meta_bridge bridge stp llc snd_seq_dummy vhost_net vhost vhost_iotlb tap rfcomm nf_tables ip6table_filter ip6_tables iptable_filter bpfilter wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 libcurve25519_generic libchacha libblake2s_generic ip6_udp_tunnel udp_tunnel nfnetlink_cttimeout nfnetlink cmac algif_hash openvswitch nsh algif_skcipher nf_conncount af_alg nf_nat bnep nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio crypto_simd snd_hda_codec_hdmi cryptd rapl iwlmvm snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi mac80211 uvcvideo
kernel: [5150033.094758] snd_hda_codec videobuf2_vmalloc videobuf2_memops snd_hda_core snd_seq_midi videobuf2_v4l2 btusb snd_seq_midi_event libarc4 btrtl videobuf2_common snd_hwdep snd_rawmidi btbcm btintel videodev snd_seq joydev gigabyte_wmi mc input_leds serio_raw snd_pcm snd_seq_device wmi_bmof bluetooth iwlwifi snd_timer ecdh_generic efi_pstore snd ecc ccp mxm_wmi cfg80211 k10temp soundcore lz4 lz4_compress mac_hid nvidia_uvm(POE) tcp_htcp sch_cake ib_umad
[Kernel-packages] [Bug 1898057] Re: Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4
After a reboot of the whole fabric (switches and machines), this problem has not resurfaced. Conceivably this could have been a transient error state, so I will close this with "invalid".

** Changed in: linux (Ubuntu)
       Status: Confirmed => Invalid

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1898057

Title:
  Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4

Status in linux package in Ubuntu:
  Invalid

Bug description:
  After upgrading 3 servers from linux-image-5.3.0-40-generic to linux-image-5.4.0-48-generic I have started seeing the following queue timeouts from IP-over-Infiniband (ipoib) devices.

  The devices in question are (with newest available firmware, 2.42.5000):

  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
          Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
          Kernel driver in use: mlx4_core

  Below is the WARN from one machine's syslog. The others are practically identical. When the WARN happens on any of the machines, other 2 will _also_ exhibit queue timeouts. Additionally, other (unrelated) machines connected to the same infiniband fabric will exhibit a 12-second transmission delay.

  This could conceivably be caused by these 3 servers also being Subnet Managers (opensm package). The infiniband fabric is partitioned, with the affected partition (8011) seeing most of the traffic.

  kernel: [52642.480066] [ cut here ]
  kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out
  kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at /build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
  kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456
  kernel: [52642.480182] async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core libahci scsi_transport_sas mdio wmi
  kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P OE 5.4.0-48-generic #52~18.04.1-Ubuntu
  kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 2.0b 04/13/2017
  kernel: [52642.480226] RIP: 0010:dev_watchdog+0x264/0x270
  kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
  kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282
  kernel: [52642.480233] RAX: RBX: RCX: 083f
  kernel: [52642.480234] RDX: RSI: 00f6 RDI: 083f
  kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 0003
  kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 0001
  kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 93f2d293c880
  kernel: [52642.480240] FS: () GS:93f33f64() knlGS:
  kernel: [52642.480242] CS: 0010 DS: ES: CR0: 80050033
  kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 001626e0
  kernel: [52642.480245] Call Trace:
  kernel: [52642.480247]
  kernel: [52642.480252] ? pfifo_fast_reset+0x110/0x110
  kernel: [52642.480255] call_timer_fn+0x32/0x130
  kernel: [52642.480258] run_timer_softirq+0x443/0x480
  kernel: [5264
[Kernel-packages] [Bug 1898057] Re: Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4
** Changed in: linux (Ubuntu)
       Status: New => Confirmed
[Kernel-packages] [Bug 1898057] WifiSyslog.txt
apport information

** Attachment added: "WifiSyslog.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416159/+files/WifiSyslog.txt

** Description changed:

  After upgrading 3 servers from linux-image-5.3.0-40-generic to linux-image-5.4.0-48-generic I have started seeing the following queue timeouts from IP-over-Infiniband (ipoib) devices.

  The devices in question are (with newest available firmware, 2.42.5000):

  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
- Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
- Kernel driver in use: mlx4_core
+ Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
+ Kernel driver in use: mlx4_core

  Below is the WARN from one machine's syslog. The others are practically identical. When the WARN happens on any of the machines, other 2 will _also_ exhibit queue timeouts. Additionally, other (unrelated) machines connected to the same infiniband fabric will exhibit a 12-second transmission delay.

  This could conceivably be caused by these 3 servers also being Subnet Managers (opensm package). The infiniband fabric is partitioned, with the affected partition (8011) seeing most of the traffic.

  kernel: [52642.480066] [ cut here ]
  kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out
  kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at /build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
  kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456
  kernel: [52642.480182] async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core libahci scsi_transport_sas mdio wmi
  kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P OE 5.4.0-48-generic #52~18.04.1-Ubuntu
  kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 2.0b 04/13/2017
  kernel: [52642.480226] RIP: 0010:dev_watchdog+0x264/0x270
  kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
  kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282
  kernel: [52642.480233] RAX: RBX: RCX: 083f
  kernel: [52642.480234] RDX: RSI: 00f6 RDI: 083f
  kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 0003
  kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 0001
  kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 93f2d293c880
  kernel: [52642.480240] FS: () GS:93f33f64() knlGS:
  kernel: [52642.480242] CS: 0010 DS: ES: CR0: 80050033
  kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 001626e0
  kernel: [52642.480245] Call Trace:
  kernel: [52642.480247]
  kernel: [52642.480252] ? pfifo_fast_reset+0x110/0x110
  kernel: [52642.480255] call_timer_fn+0x32/0x130
  kernel: [52642.480258] run_timer_softirq+0x443/0x480
  kernel: [52642.480262] ? ktime_get+0x43/0xa0
  kernel: [52642.480268] ? lapic_next_deadline+0x26/0x30
  kernel: [52642.480273] __do_softirq+0xe4/0x2da
  kernel: [52642.480278] irq_exit+0xae/0xb0
  kernel: [52642.480282] smp_apic_timer_interrupt+0x79/0x130
  kernel
[Kernel-packages] [Bug 1898057] Lspci.txt
apport information

** Attachment added: "Lspci.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416152/+files/Lspci.txt
[Kernel-packages] [Bug 1898057] ProcModules.txt
apport information

** Attachment added: "ProcModules.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416157/+files/ProcModules.txt
[Kernel-packages] [Bug 1898057] ProcCpuinfoMinimal.txt
apport information

** Attachment added: "ProcCpuinfoMinimal.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416154/+files/ProcCpuinfoMinimal.txt
[Kernel-packages] [Bug 1898057] CurrentDmesg.txt
apport information

** Attachment added: "CurrentDmesg.txt"
   https://bugs.launchpad.net/bugs/1898057/+attachment/5416151/+files/CurrentDmesg.txt

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1898057

Title:
  Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  After upgrading 3 servers from linux-image-5.3.0-40-generic to linux-image-5.4.0-48-generic, I have started seeing the following queue timeouts from IP-over-Infiniband (ipoib) devices. The devices in question are (with the newest available firmware, 2.42.5000):

  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
          Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
          Kernel driver in use: mlx4_core

  Below is the WARN from one machine's syslog. The others are practically identical. When the WARN happens on any of the machines, the other 2 will _also_ exhibit queue timeouts. Additionally, other (unrelated) machines connected to the same infiniband fabric will exhibit a 12-second transmission delay. This could conceivably be caused by these 3 servers also being Subnet Managers (opensm package). The infiniband fabric is partitioned, with the affected partition (8011) seeing most of the traffic.

kernel: [52642.480066] [ cut here ] kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at /build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270 kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456 kernel: [52642.480182] async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core libahci scsi_transport_sas mdio wmi kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P OE 5.4.0-48-generic #52~18.04.1-Ubuntu kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 2.0b 04/13/2017 kernel: [52642.480226] RIP: 
0010:dev_watchdog+0x264/0x270 kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282 kernel: [52642.480233] RAX: RBX: RCX: 083f kernel: [52642.480234] RDX: RSI: 00f6 RDI: 083f kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 0003 kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 0001 kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 93f2d293c880 kernel: [52642.480240] FS: () GS:93f33f64() knlGS: kernel: [52642.480242] CS: 0010 DS: ES: CR0: 80050033 kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 001626e0 kernel: [52642.480245] Call Trace: kernel: [52642.480247] kernel: [52642.480252] ? pfifo_fast_reset+0x110/0x110 kernel: [52642.480255] call_timer_fn+0x32/0x130 kernel: [52642.480258] run_timer_softirq+0x443/0x480 kernel: [52642.480262] ? ktime_get+0x43/0xa0 kernel: [52642.480268] ? lapic_next_deadline+0x26/0x30 kernel: [526
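Since the report says the same WARN shows up on several machines, it can help to count how often each interface hits the watchdog. The following is a minimal sketch, assuming syslog-style kernel log text on standard input that contains the `NETDEV WATCHDOG: <interface> (...): transmit queue N timed out` line shown in the trace. The function name `count_watchdog_timeouts` is made up for illustration and does not come from the report.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the bug report).
# Reads kernel log text on stdin and prints "<count> <interface>" for every
# interface that logged a "NETDEV WATCHDOG ... timed out" message.
count_watchdog_timeouts() {
    # Keep only the interface name from each matching line, then tally.
    sed -n 's/.*NETDEV WATCHDOG: \([^ ]*\) .*transmit queue [0-9]* timed out.*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2 }'
}
```

It could be fed from a saved log, for example `count_watchdog_timeouts < /var/log/syslog`. Running it on each of the three servers would show whether the timeouts are concentrated on the `ib0.8011` partition device.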
[Kernel-packages] [Bug 1898057] UdevDb.txt
apport information ** Attachment added: "UdevDb.txt" https://bugs.launchpad.net/bugs/1898057/+attachment/5416158/+files/UdevDb.txt
[Kernel-packages] [Bug 1898057] ProcEnviron.txt
apport information ** Attachment added: "ProcEnviron.txt" https://bugs.launchpad.net/bugs/1898057/+attachment/5416155/+files/ProcEnviron.txt
[Kernel-packages] [Bug 1898057] Lsusb.txt
apport information ** Attachment added: "Lsusb.txt" https://bugs.launchpad.net/bugs/1898057/+attachment/5416153/+files/Lsusb.txt
[Kernel-packages] [Bug 1898057] Re: Infiniband transmit queue timeouts after upgrading to linux-hwe-5.4
apport information

** Package changed: linux-hwe-5.4 (Ubuntu) => linux (Ubuntu)

** Tags added: apport-collected bionic

** Description changed:

  After upgrading 3 servers from linux-image-5.3.0-40-generic to linux-image-5.4.0-48-generic, I have started seeing the following queue timeouts from IP-over-Infiniband (ipoib) devices. The devices in question are (with the newest available firmware, 2.42.5000):

  # lspci -nnk -s 83:00.0
  83:00.0 Network controller [0280]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
          Subsystem: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:0027]
          Kernel driver in use: mlx4_core

  Below is the WARN from one machine's syslog. The others are practically identical. When the WARN happens on any of the machines, the other 2 will _also_ exhibit queue timeouts. Additionally, other (unrelated) machines connected to the same infiniband fabric will exhibit a 12-second transmission delay. This could conceivably be caused by these 3 servers also being Subnet Managers (opensm package). The infiniband fabric is partitioned, with the affected partition (8011) seeing most of the traffic.

kernel: [52642.480066] [ cut here ]
kernel: [52642.480092] NETDEV WATCHDOG: ib0.8011 (): transmit queue 0 timed out
kernel: [52642.480120] WARNING: CPU: 13 PID: 0 at /build/linux-hwe-5.4-8m2I8l/linux-hwe-5.4-5.4.0/net/sched/sch_generic.c:448 dev_watchdog+0x264/0x270
kernel: [52642.480121] Modules linked in: aufs overlay ip6table_raw ip6table_mangle ip6table_nat iptable_raw iptable_mangle iptable_nat nf_tables nfnetlink cfg80211 ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bpfilter mst_pciconf(OE) mst_pci(OE) 8021q garp mrp stp llc nls_iso8859_1 intel_rapl_msr lz4 lz4_compress intel_rapl_common ib_iser rdma_cm sb_edac iw_cm iscsi_tcp libiscsi_tcp libiscsi x86_pkg_temp_thermal scsi_transport_iscsi intel_powerclamp zram veth vhost_net tap coretemp vhost kvm_intel crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm openvswitch nsh nf_conncount nf_nat nf_conntrack rapl nf_defrag_ipv6 nf_defrag_ipv4 joydev input_leds intel_cstate ib_ipoib mei_me mei ib_cm ioatdma ib_umad lpc_ich acpi_pad acpi_power_meter mac_hid ipmi_si ipmi_ssif ipmi_devintf ipmi_msghandler kyber_iosched sch_fq_codel tcp_highspeed ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlua(POE) btrfs zstd_compress raid10 raid456
kernel: [52642.480182] async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear dm_mirror dm_region_hash dm_log mlx4_ib ib_uverbs ib_core hid_generic raid1 ses enclosure usbhid hid ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops ixgbe glue_helper mpt3sas nvme xfrm_algo crypto_simd ahci raid_class dca cryptd mlx4_core drm megaraid_sas nvme_core libahci scsi_transport_sas mdio wmi
kernel: [52642.480221] CPU: 13 PID: 0 Comm: swapper/13 Tainted: P OE 5.4.0-48-generic #52~18.04.1-Ubuntu
kernel: [52642.480223] Hardware name: Supermicro Super Server/X10DRW-iT, BIOS 2.0b 04/13/2017
kernel: [52642.480226] RIP: 0010:dev_watchdog+0x264/0x270
kernel: [52642.480229] Code: 48 85 c0 75 e6 eb a0 4c 89 ef c6 05 42 c1 e7 00 01 e8 30 b8 fa ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 50 05 63 ae e8 4c 31 71 ff <0f> 0b eb 82 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
kernel: [52642.480230] RSP: 0018:b4998c970e48 EFLAGS: 00010282
kernel: [52642.480233] RAX: RBX: RCX: 083f
kernel: [52642.480234] RDX: RSI: 00f6 RDI: 083f
kernel: [52642.480235] RBP: b4998c970e78 R08: 08fd R09: 0003
kernel: [52642.480237] R10: b4998c970ee8 R11: 0001 R12: 0001
kernel: [52642.480238] R13: 93d302465000 R14: 93d302465480 R15: 93f2d293c880
kernel: [52642.480240] FS: () GS:93f33f64() knlGS:
kernel: [52642.480242] CS: 0010 DS: ES: CR0: 80050033
kernel: [52642.480243] CR2: f90140127000 CR3: 002a26e0a004 CR4: 001626e0
kernel: [52642.480245] Call Trace:
kernel: [52642.480247]
kernel: [52642.480252] ? pfifo_fast_reset+0x110/0x110
kernel: [52642.480255] call_timer_fn+0x32/0x130
kernel: [52642.480258] run_timer_softirq+0x443/0x480
kernel: [52642.480262] ? ktime_get+0x43/0xa0
kernel: [52642.480268] ? lapic_next_deadline+0x26/0x30
kernel: [52642.480273] __do_softirq+0xe4/0x2da
kernel: [52642.480278] irq_exit+0xae/0xb0
kernel: [52642.480282] smp_apic_timer_interrupt+0x79/0x130
kernel: [52642.480285] apic_timer_interrupt+0xf/0x20
kernel: [52642.480286]
kernel: [52642.480292] RIP: 0010:cpuidle_enter_state+0xbc/0x440
kernel
[Kernel-packages] [Bug 1898057] ProcInterrupts.txt
apport information ** Attachment added: "ProcInterrupts.txt" https://bugs.launchpad.net/bugs/1898057/+attachment/5416156/+files/ProcInterrupts.txt
[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS
# apt update
[...]
# apt install linux-image-4.10.0-23-generic
N: Unable to locate package linux-image-4.10.0-23-generic

(Yes, I know it is in proposed.)

However, 4.10.0-24 does exist and gets installed with -edge, and it is known to be broken. Why push it out at all? Am I wrong to assume that a later-numbered (point?) release would contain the fixes from previous versions?

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1679823

Title:
  bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS

Status in Linux: Unknown
Status in linux package in Ubuntu: Triaged
Status in linux-hwe package in Ubuntu: Fix Released
Status in linux-hwe-edge package in Ubuntu: Confirmed
Status in linux source package in Xenial: Confirmed
Status in linux-hwe source package in Xenial: Fix Released
Status in linux-hwe-edge source package in Xenial: Confirmed
Status in linux source package in Yakkety: Fix Released
Status in linux-hwe source package in Yakkety: Fix Committed
Status in linux-hwe-edge source package in Yakkety: Fix Committed
Status in linux source package in Zesty: In Progress
Status in linux-hwe source package in Zesty: Confirmed
Status in linux-hwe-edge source package in Zesty: Confirmed

Bug description:
  Since I upgraded the kernel from linux-image-4.8.0-46-generic to linux-image-extra-4.10.0-14-generic, I have been facing an issue when I want to change the MTU.

  It seems to be a known bug that has already been fixed:
  https://bugzilla.kernel.org/show_bug.cgi?id=194763

  # ip l sh eno49
  2: eno49: mtu 9000 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
      link/ether 5c:b9:01:8a:61:e9 brd ff:ff:ff:ff:ff:ff
  # ip l sh eno50
  3: eno50: mtu 9000 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
      link/ether 5c:b9:01:8a:61:e9 brd ff:ff:ff:ff:ff:ff
  # ip l sh bond0
  6: bond0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
      link/ether 5c:b9:01:8a:61:e9 brd ff:ff:ff:ff:ff:ff

  # ip l set mtu 9000 bond0
  RTNETLINK answers: Invalid argument

  root@controller002[SRV][YUL]:~# tail -1 /var/log/syslog
  Apr 4 19:36:28 controller002 kernel: [ 8869.077853] bond0: Invalid MTU 9000 requested, hw max 1500

  # modinfo ixgbe
  filename: /lib/modules/4.10.0-14-generic/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
  version: 4.4.0-k
  license: GPL
  description: Intel(R) 10 Gigabit PCI Express Network Driver

  # modinfo bonding
  filename: /lib/modules/4.10.0-14-generic/kernel/drivers/net/bonding/bonding.ko
  author: Thomas Davis, tada...@lbl.gov and many others
  description: Ethernet Channel Bonding Driver, v3.7.1
  version: 3.7.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1679823/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp
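The `ip l sh` output in the description shows the mismatch at the heart of this bug: both slaves are at MTU 9000 while `bond0` is stuck at 1500. A minimal sketch for listing that at a glance follows, assuming `ip link show` text on standard input. The function name `list_link_mtus` is invented for illustration, and `-` is printed when an interface has no master.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the bug report).
# Reads "ip link show" output on stdin and prints one line per interface:
#   <interface> <mtu> <master or ->
list_link_mtus() {
    awk '
        # Interface header lines start with an index such as "2:".
        $1 ~ /^[0-9]+:$/ {
            name = $2
            sub(/:$/, "", name)
            mtu = "?"
            master = "-"
            # Scan the keyword/value pairs that follow the name.
            for (i = 3; i < NF; i++) {
                if ($i == "mtu")    mtu = $(i + 1)
                if ($i == "master") master = $(i + 1)
            }
            print name, mtu, master
        }'
}
```

Used as `ip link show | list_link_mtus`, the three interfaces from the description would come out as `eno49 9000 bond0`, `eno50 9000 bond0`, and `bond0 1500 -`, which makes it easy to spot a master whose MTU is lower than that of its slaves.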
[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS
The MTU problem affecting OpenvSwitch bridges still exists (or resurfaced) in linux-generic-hwe-16.04-edge (4.10.0.24.17). # grep -i mtu /var/log/syslog [...] br-int: Invalid MTU 9000 requested, hw max 1500 [...] lxcbr0: Invalid MTU 9000 requested, hw max 1500 Pretty please STOP BREAKING things. No network access is especially annoying to fix remotely. Luckily I tested this before upgrading more critical systems (as everyone should, of course). -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1679823 Title: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS Status in Linux: Unknown Status in linux package in Ubuntu: Triaged Status in linux-hwe package in Ubuntu: Fix Released Status in linux-hwe-edge package in Ubuntu: Confirmed Status in linux source package in Xenial: Confirmed Status in linux-hwe source package in Xenial: Fix Released Status in linux-hwe-edge source package in Xenial: Confirmed Status in linux source package in Yakkety: Fix Released Status in linux-hwe source package in Yakkety: Fix Committed Status in linux-hwe-edge source package in Yakkety: Fix Committed Status in linux source package in Zesty: In Progress Status in linux-hwe source package in Zesty: Confirmed Status in linux-hwe-edge source package in Zesty: Confirmed Bug description: Since I upgraded the kernel from linux-image-4.8.0-46-generic to linux-image-extra-4.10.0-14-generic I'm facing an issue when I want to change the MTU. 
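The syslog check in the message above can be turned into a small filter that lists which devices had an MTU change rejected. A minimal sketch, assuming the kernel message keeps the exact wording "<dev>: Invalid MTU <n> requested, hw max <m>" seen in this thread; the function name is hypothetical.

```shell
#!/bin/sh
# Read log lines from the given files (or stdin) and print
# "<device> <requested-mtu> <hw-max>" for each rejected MTU change.
invalid_mtu_devices() {
    awk '{
        for (i = 2; i <= NF - 6; i++) {
            if ($i == "Invalid" && $(i + 1) == "MTU" && $(i + 3) == "requested,") {
                dev = $(i - 1)
                sub(/:$/, "", dev)          # strip the trailing colon
                print dev, $(i + 2), $(i + 6)
            }
        }
    }' "$@"
}
```

It could be run as `invalid_mtu_devices /var/log/syslog`, or fed from `dmesg` on a pipe.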
[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS
Ok, so the fix is rather trivial. I've verified that building the 4.10.0-22 kernel from Ubuntu sources with the following patch fixes the OpenvSwitch MTU issue:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/net/openvswitch/vport-internal_dev.c?id=425df17ce3a26d98f76e2b6b0af2acf4aeb0b026

Here are the instructions for people who want the fixed kernel asap:

# apt-get source linux-image-4.10.0-22-generic
# apt-get build-dep linux-image-4.10.0-22-generic
# fakeroot debian/rules binary-headers binary-generic binary-perarch

Install needed DEBs, reboot.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1679823

Title:
  bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS

Status in Linux:
  Unknown
Status in linux package in Ubuntu:
  Triaged
Status in linux-hwe package in Ubuntu:
  Fix Released
Status in linux-hwe-edge package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  New
Status in linux-hwe source package in Xenial:
  Fix Released
Status in linux-hwe-edge source package in Xenial:
  New
Status in linux source package in Yakkety:
  Fix Released
Status in linux-hwe source package in Yakkety:
  Fix Committed
Status in linux-hwe-edge source package in Yakkety:
  Fix Committed
Status in linux source package in Zesty:
  In Progress
Status in linux-hwe source package in Zesty:
  Confirmed
Status in linux-hwe-edge source package in Zesty:
  Confirmed

Bug description:
  Since I upgraded the kernel from linux-image-4.8.0-46-generic to linux-image-extra-4.10.0-14-generic I'm facing an issue when I want to change the MTU.
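The build instructions in the message above leave out where the patch gets applied. One possible way to string the steps together is sketched below. The `cd` and `patch` steps are assumptions that are not in the original message, and SRC_DIR, PATCH_FILE, DRY_RUN and the function names are hypothetical. By default the script only prints what it would run.

```shell
#!/bin/sh
KVER="${KVER:-4.10.0-22}"
# Hypothetical locations: the directory unpacked by "apt-get source" and a
# locally saved copy of the upstream patch (use an absolute path).
SRC_DIR="${SRC_DIR:-./linux-src}"
PATCH_FILE="${PATCH_FILE:-/tmp/ovs-mtu.patch}"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_patched_kernel() {
    run apt-get source "linux-image-${KVER}-generic" &&
    run apt-get build-dep "linux-image-${KVER}-generic" &&
    run cd "$SRC_DIR" &&                 # assumed step, not in the original
    run patch -p1 -i "$PATCH_FILE" &&    # assumed step, not in the original
    run fakeroot debian/rules binary-headers binary-generic binary-perarch
}
```

Calling `build_patched_kernel` as-is shows the five commands; setting DRY_RUN=0 (as root, with SRC_DIR and PATCH_FILE adjusted) would actually run them.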
[Kernel-packages] [Bug 1679823] Re: bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS
4.10 series (hwe-edge) kernel doesn't seem to have gotten these fixes yet, as 4.10.0-22-generic #24~16.04.1-Ubuntu still fails to set openvswitch MTU correctly. When can we expect a fix to be released for -edge?

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1679823

Title:
  bond0: Invalid MTU 9000 requested, hw max 1500 with kernel 4.8 / 4.10 in XENIAL LTS

Status in Linux:
  Unknown
Status in linux package in Ubuntu:
  Triaged
Status in linux-hwe package in Ubuntu:
  Fix Released
Status in linux-hwe-edge package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  New
Status in linux-hwe source package in Xenial:
  Fix Released
Status in linux-hwe-edge source package in Xenial:
  New
Status in linux source package in Yakkety:
  Fix Released
Status in linux-hwe source package in Yakkety:
  Fix Committed
Status in linux-hwe-edge source package in Yakkety:
  Fix Committed
Status in linux source package in Zesty:
  In Progress
Status in linux-hwe source package in Zesty:
  Confirmed
Status in linux-hwe-edge source package in Zesty:
  Confirmed

Bug description:
  Since I upgraded the kernel from linux-image-4.8.0-46-generic to linux-image-extra-4.10.0-14-generic I'm facing an issue when I want to change the MTU.