Hi Mohammad,

It seems things have returned to normal for the current SRU cycle, and the two commits you requested:

net/mlx5e: Rx, Fix checksum calculation for new hardware
net/mlx5e: Rx, Fixup skb checksum for packets with tail padding

have been tagged and built into the 4.15.0-87-generic bionic kernel, which is currently sitting in -proposed awaiting validation.

Can you please install the kernel from -proposed, run the reproducer, and check that no kernel splat is generated when you send large IP packets with padding at the end?

Instructions to install (on a bionic system):

1) Add the -proposed repository by adding the following line to /etc/apt/sources.list:
deb http://archive.ubuntu.com/ubuntu/ bionic-proposed restricted main multiverse universe
2) sudo apt update
3) sudo apt install linux-image-4.15.0-87-generic linux-modules-4.15.0-87-generic \
   linux-modules-extra-4.15.0-87-generic linux-headers-4.15.0-87 linux-headers-4.15.0-87-generic
4) sudo reboot
5) uname -rv
4.15.0-87-generic #87-Ubuntu SMP Fri Jan 31 19:32:37 UTC 2020

Hopefully the reproducer shows everything has been fixed. I apologise again for the delay; the kernel team were adamant about having no regressions in the previous SRU cycle, but things should be back to normal now.

Let me know how the kernel in -proposed goes.

Thanks,
Matthew

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854842

Title:
mlx5_core reports hardware checksum error for padded packets on Mellanox NICs

Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Bionic:
Fix Committed

Bug description:
BugLink: https://bugs.launchpad.net/bugs/1854842

[Impact]

On machines equipped with Mellanox NICs, in this particular case Mellanox 5 series NICs using the mlx5_core driver, there is a kernel splat when sending large IP packets which have padding at the end.
enp6s0f0: hw csum failure
CPU: 19 PID: 0 Comm: swapper/19 Not tainted 4.15.0-72-generic
Call Trace:
<IRQ>
dump_stack+0x63/0x8e
netdev_rx_csum_fault+0x38/0x40
__skb_checksum_complete+0xbc/0xd0
nf_ip_checksum+0xc3/0xf0
icmp_error+0x27d/0x310 [nf_conntrack_ipv4]
nf_conntrack_in+0x15a/0x510 [nf_conntrack]
? __skb_checksum+0x68/0x330
ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4]
nf_hook_slow+0x48/0xc0
? skb_send_sock+0x50/0x50
ip_rcv+0x301/0x360
? inet_del_offload+0x40/0x40
__netif_receive_skb_core+0x432/0xb80
__netif_receive_skb+0x18/0x60
? __netif_receive_skb+0x18/0x60
netif_receive_skb_internal+0x45/0xe0
napi_gro_receive+0xc5/0xf0
mlx5e_handle_rx_cqe+0x48d/0x5e0 [mlx5_core]
? enqueue_task_rt+0x1b4/0x2e0
mlx5e_poll_rx_cq+0xd1/0x8c0 [mlx5_core]
mlx5e_napi_poll+0x9d/0x290 [mlx5_core]
net_rx_action+0x140/0x3a0
__do_softirq+0xe4/0x2d4
irq_exit+0xc5/0xd0
do_IRQ+0x86/0xe0
common_interrupt+0x8c/0x8c
</IRQ>

This bug is a further attempt to fix these splats, as there have been previous fixes in LP #1840854 and a series of commits which landed in 4.15.0-67 (LP #1847155) as part of upstream -stable patches.

This bug will also fix the same problems on the new Mellanox CX6 and Bluefield hardware, which has already been enabled via previous upstream -stable patches which landed in LP #1847155.

[Fix]

This particular issue was fixed for Mellanox series 5 drivers in the following commits:

commit 0aa1d18615c163f92935b806dcaff9157645233a
Author: Saeed Mahameed <sae...@mellanox.com>
Date: Tue Mar 12 00:24:52 2019 -0700
Subject: net/mlx5e: Rx, Fixup skb checksum for packets with tail padding

This commit required a minor backport. This commit was selected for upstream -stable in 4.19.76 and 5.0.10. This commit appears to be omitted from "Bionic update: upstream stable patchset 2019-10-07", which is LP #1847155, probably due to requiring a backport.
commit db849faa9bef993a1379dc510623f750a72fa7ce Author: Saeed Mahameed <sae...@mellanox.com> Date: Fri May 3 13:14:59 2019 -0700 Subject: net/mlx5e: Rx, Fix checksum calculation for new hardware This commit required a minor backport. This commit was selected for upstream -stable in 5.1.21 and 5.2.4. This commit has already been applied to the disco kernel, as part of stable updates. [Testcase] The following scapy script will reproduce this issue. Run from the machine with the Mellanox series 5 NIC: 1) a=Ether(dst='ff:ff:ff:ff:ff:ff')/IP(dst='127.0.0.1')/ICMP()/Padding(load='\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe') 2) sendp(a, iface='enp6s0f0') 3) Check dmesg on the reciever side. The example uses localhost, so check dmesg. I have built some test kernels, which are available here: https://launchpad.net/~mruffell/+archive/ubuntu/lp1854842-test This kernel contains 0aa1d18615c163f92935b806dcaff9157645233a. and https://launchpad.net/~mruffell/+archive/ubuntu/lp1854842-test-2 This kernel contains db849faa9bef993a1379dc510623f750a72fa7ce. If you install the test kernels the issue is resolved. [Regression Potential] The changes are limited to the mlx5_core driver, and only modify how packet checksums are calculated when padding is involved. Both patches have been accepted and published by upstream -stable, and are widely accepted by the community. Because of this, I believe the risk of regression is low. 
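For verification after running the reproducer, a small helper along these lines may save some scrolling through dmesg. It is only a sketch, not part of the fix: the names find_csum_failures and running_expected_kernel are made up for illustration, it assumes the splat starts with the "<iface>: hw csum failure" line shown in the trace above, and it assumes dmesg is readable by the current user.

```python
import platform
import re
import subprocess

# Assumption: the first line of the splat looks like
# "enp6s0f0: hw csum failure", optionally after a dmesg timestamp.
CSUM_FAILURE = re.compile(r"(\S+): hw csum failure")


def find_csum_failures(log_text):
    """Return the interface name from each 'hw csum failure' line, in order."""
    return [m.group(1) for m in CSUM_FAILURE.finditer(log_text)]


def running_expected_kernel(release, expected="4.15.0-87-generic"):
    """Compare a `uname -r` style string against the kernel under test."""
    return release.strip() == expected


if __name__ == "__main__":
    print("expected kernel running:", running_expected_kernel(platform.release()))
    try:
        # dmesg may need root when kernel.dmesg_restrict is set.
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError as exc:
        print("could not run dmesg:", exc)
    else:
        if result.returncode != 0:
            print("dmesg failed:", result.stderr.strip())
        else:
            hits = find_csum_failures(result.stdout)
            print("hw csum failures:", hits if hits else "none")
```

Run it once after booting into the -proposed kernel and sending the padded packet; an empty result on 4.15.0-87-generic is the outcome we are hoping for.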