Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack
On Sat, 7 Sep 2024 09:13:09 +0200 Salvatore Bonaccorso wrote:
> Hi,
>
> On Fri, Sep 06, 2024 at 10:47:04PM +0200, XXX XXX wrote:
> > On Fri, 6 Sep 2024 22:04:27 +0200
> > Salvatore Bonaccorso wrote:
> >
> > > Control: tags -1 + moreinfo
> > >
> > > Hi,
> > >
> > > On Mon, Sep 02, 2024 at 11:09:42PM +0200, XXX XXX wrote:
> > > > Hi,
> > > > this bug seems to be fixed in linux kernel 6.1.107,
> > > > I suspect the commit that fixed it is:
> > > >
> > > > commit 6dcc8ba8a6074bb79040f502dc66ad23a58a1c86
> > > > Author: Florian Westphal
> > > > Date: Wed Aug 7 21:28:41 2024 +0200
> > > >
> > > > netfilter: nf_queue: drop packets with cloned unconfirmed conntracks
> > > >
> > > > [ Upstream commit 7d8dc1c7be8d3509e8f5164dd5df64c8e34d7eeb ]
> > > >
> > > > Conntrack assumes an unconfirmed entry (not yet committed to global
> > > > hash table) has a refcount of 1 and is not visible to other cores.
> > > >
> > > > With multicast forwarding this assumption breaks down because such
> > > > skbs get cloned after being picked up, i.e. ct->use refcount is > 1.
> > > >
> > > > Likewise, bridge netfilter will clone broad/mutlicast frames and
> > > > all frames in case they need to be flood-forwarded during learning
> > > > phase.
> > > >
> > > > For ip multicast forwarding or plain bridge flood-forward this will
> > > > "work" because packets don't leave softirq and are implicitly
> > > > serialized.
> > > >
> > > > With nfqueue this no longer holds true, the packets get queued
> > > > and can be reinjected in arbitrary ways.
> > > >
> > > > Disable this feature, I see no other solution.
> > > >
> > > > After this patch, nfqueue cannot queue packets except the last
> > > > multicast/broadcast packet.
> > > >
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Florian Westphal
> > > > Signed-off-by: Pablo Neira Ayuso
> > > > Signed-off-by: Sasha Levin
> > >
> > > Would you be able to confirm this? In case this is true, then this
> > > would imply that the issue should be visible as well current testing
> > > until <= 6.10.7-1.
> > >
> > > Regards,
> > > Salvatore
> >
> > Hi,
> >
> > for sure it was visible in linux-image-6.10.6+bpo-amd64 that I tried from
> > stable backports after the trace popped up again after upgrading to
> > linux-image-6.1.0-25-amd64.
> > So by checking the changelog for the source file and line shown in the
> > traces on kernel.org
> > I've spotted this patch that was interesting because I use suricata in
> > nfqueue
> > mode and because the trace happened always at boot (during the learning
> > phase).
> > So I first erroneously I tried 6.1.106 and the trace was still there
> > and then 6.1.107 and it was gone.
> > Hope this helps.
>
> Yes thanks. One option to get a final confirmation and proper closure
> tracking, would be if you can cherry-pick the commit on top of the
> 6.1.106-3 version and see if it resolved the issue.
>
> You could proceed as described in
>
> https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#id-1.6.6.4
>
> Regards,
> Salvatore

Did so:

# apt-get install build-essential
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

# apt install kernel-wedge -t daedalus-backports
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be upgraded:
  kernel-wedge
1 upgraded, 0 newly installed, 0 to remove and 166 not upgraded.
Need to get 21.0 kB of archives.
After this operation, 24.6 kB disk space will be freed.
Get:1 http://deb.devuan.org/merged daedalus-backports/main amd64 kernel-wedge all 2.105~bpo12+1 [21.0 kB]
Fetched 21.0 kB in 0s (43.3 kB/s)
Reading changelogs... Done
(Reading database ... 563865 files and directories currently installed.)
Preparing to unpack .../kernel-wedge_2.105~bpo12+1_all.deb ...
Unpacking kernel-wedge (2.105~bpo12+1) over (2.104) ...
Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack
On Fri, 6 Sep 2024 22:04:27 +0200 Salvatore Bonaccorso wrote:
> Control: tags -1 + moreinfo
>
> Hi,
>
> On Mon, Sep 02, 2024 at 11:09:42PM +0200, XXX XXX wrote:
> > Hi,
> > this bug seems to be fixed in linux kernel 6.1.107,
> > I suspect the commit that fixed it is:
> >
> > commit 6dcc8ba8a6074bb79040f502dc66ad23a58a1c86
> > Author: Florian Westphal
> > Date: Wed Aug 7 21:28:41 2024 +0200
> >
> > netfilter: nf_queue: drop packets with cloned unconfirmed conntracks
> >
> > [ Upstream commit 7d8dc1c7be8d3509e8f5164dd5df64c8e34d7eeb ]
> >
> > Conntrack assumes an unconfirmed entry (not yet committed to global hash
> > table) has a refcount of 1 and is not visible to other cores.
> >
> > With multicast forwarding this assumption breaks down because such
> > skbs get cloned after being picked up, i.e. ct->use refcount is > 1.
> >
> > Likewise, bridge netfilter will clone broad/mutlicast frames and
> > all frames in case they need to be flood-forwarded during learning
> > phase.
> >
> > For ip multicast forwarding or plain bridge flood-forward this will
> > "work" because packets don't leave softirq and are implicitly
> > serialized.
> >
> > With nfqueue this no longer holds true, the packets get queued
> > and can be reinjected in arbitrary ways.
> >
> > Disable this feature, I see no other solution.
> >
> > After this patch, nfqueue cannot queue packets except the last
> > multicast/broadcast packet.
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Florian Westphal
> > Signed-off-by: Pablo Neira Ayuso
> > Signed-off-by: Sasha Levin
>
> Would you be able to confirm this? In case this is true, then this
> would imply that the issue should be visible as well current testing
> until <= 6.10.7-1.
>
> Regards,
> Salvatore

Hi,

for sure it was visible in linux-image-6.10.6+bpo-amd64, which I tried from
stable backports after the trace popped up again after upgrading to
linux-image-6.1.0-25-amd64.
So by checking the changelog on kernel.org for the source file and line shown
in the traces, I spotted this patch. It was interesting because I use suricata
in nfqueue mode and because the trace always happened at boot (during the
learning phase).
So I first erroneously tried 6.1.106 and the trace was still there,
and then 6.1.107 and it was gone.
Hope this helps.

Ciao,
Tito
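
Editorial aside on the lookup described above (going from the source file and line in the trace to the kernel.org changelog): the location can be pulled out of a log line mechanically. This is only a sketch. The function name warning_location is made up for illustration, and the sample input is the WARNING line from the original report in this bug.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): read kernel log lines on
# stdin and print "source-file:line function" for every WARNING line.
warning_location() {
    sed -n 's/.*WARNING: CPU: [0-9]* PID: [0-9]* at \([^ ]*\) \([^ +]*\)+.*/\1 \2/p'
}

# Sample input: the WARNING line quoted in the original report.
printf '%s\n' 'May 7 08:10:12 cerberus kernel: [ 76.203895] WARNING: CPU: 3 PID: 0 at net/bridge/br_netfilter_hooks.c:622 br_nf_local_in+0x1a9/0x1d0 [br_netfilter]' \
    | warning_location

# With a local clone of the stable tree (an assumption, not something done
# in this thread), the changes to that file between two releases could then
# be listed with something like:
#   git log --oneline v6.1.106..v6.1.107 -- net/bridge/br_netfilter_hooks.c
```

The file name that comes out is what one would then look for in the per-release changelog.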
Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack
Hi,
this bug seems to be fixed in linux kernel 6.1.107,
I suspect the commit that fixed it is:

commit 6dcc8ba8a6074bb79040f502dc66ad23a58a1c86
Author: Florian Westphal
Date:   Wed Aug 7 21:28:41 2024 +0200

    netfilter: nf_queue: drop packets with cloned unconfirmed conntracks

    [ Upstream commit 7d8dc1c7be8d3509e8f5164dd5df64c8e34d7eeb ]

    Conntrack assumes an unconfirmed entry (not yet committed to global hash
    table) has a refcount of 1 and is not visible to other cores.

    With multicast forwarding this assumption breaks down because such
    skbs get cloned after being picked up, i.e. ct->use refcount is > 1.

    Likewise, bridge netfilter will clone broad/mutlicast frames and
    all frames in case they need to be flood-forwarded during learning
    phase.

    For ip multicast forwarding or plain bridge flood-forward this will
    "work" because packets don't leave softirq and are implicitly
    serialized.

    With nfqueue this no longer holds true, the packets get queued
    and can be reinjected in arbitrary ways.

    Disable this feature, I see no other solution.

    After this patch, nfqueue cannot queue packets except the last
    multicast/broadcast packet.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Sasha Levin

Best regards,
Tito Ragusa
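
Editorial aside: on Debian the ABI name (for example 6.1.0-21-amd64) does not show the upstream stable release, but the version string in the report does ("Debian 6.1.90-1"). The sketch below compares that upstream version against 6.1.107, the release reported above as no longer showing the trace. Both function names are made up for illustration, the check only covers the 6.1.y series because this message states nothing about the others, and it relies on `sort -V` being available (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: extract the upstream version (e.g. 6.1.90) from a
# `uname -v` style string such as
#   "#1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)"
upstream_version() {
    printf '%s\n' "$1" | sed -n 's/.*Debian \([0-9][0-9.]*\)-[^ ]*.*/\1/p'
}

# Hypothetical helper: returns 0 if a 6.1.y version is >= 6.1.107,
# 1 if it is older, 2 if it is not a 6.1.y version at all.
reported_fixed_6_1() {
    case "$1" in
        6.1.*) ;;
        *) return 2 ;;
    esac
    lowest=$(printf '%s\n%s\n' "$1" "6.1.107" | sort -V | head -n 1)
    [ "$lowest" = "6.1.107" ]
}

# Example: check the running kernel.
v=$(upstream_version "$(uname -v)")
rc=0
reported_fixed_6_1 "$v" || rc=$?
case "$rc" in
    0) echo "upstream $v: at or after 6.1.107" ;;
    1) echo "upstream $v: older than 6.1.107" ;;
    *) echo "not a 6.1.y Debian kernel, no statement possible" ;;
esac
```

A result of 0 only says that the version is at or after the release in which the trace was reported gone. It does not prove that the commit above is the one responsible.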
Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack
On Tue, 7 May 2024 21:50:45 +0200 XXX XXX wrote:
> On Tue, 7 May 2024 21:01:06 +0200
> Salvatore Bonaccorso wrote:
>
> > Control: tags -1 + moreinfo
> >
> > Hi Tito,
> >
> > On Tue, May 07, 2024 at 10:19:44AM +0200, Tito Ragusa wrote:
> > > Package: src:linux
> > > Version: 6.1.90-1
> > > Severity: normal
> > >
> > > Dear Maintainer,
> > >
> > >    * What led up to the situation?
> > >
> > > Rebooting the box after kernel package upgrade
> > >
> > >    * What exactly did you do (or not do) that was effective (or
> > >      ineffective)?
> > >
> > > Nothing
> > >
> > >    * What was the outcome of this action?
> > >
> > > Nothing
> > >
> > >    * What outcome did you expect instead?
> > >
> > > Rebooting without traces in the logs
> > >
> > > -- Package-specific info:
> > > ** Version:
> > > Linux version 6.1.0-21-amd64 (debian-ker...@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)
> > >
> > > ** Command line:
> > > BOOT_IMAGE=/vmlinuz-6.1.0-21-amd64 root=UUID=a75e6ad5-37fc-4f69-9361-f94d6c0e5d2f ro net.ifnames=0 apparmor=0 selinux=0 noresume consoleblank=0 console=tty1
> > >
> > > ** Tainted: WOE (12800)
> > >  * kernel issued warning
> > >  * externally-built ("out-of-tree") module was loaded
> > >  * unsigned module was loaded
> > >
> > > ** Kernel log:
> > > May 7 08:10:12 cerberus kernel: [ 76.203881] [ cut here ]
> > > May 7 08:10:12 cerberus kernel: [ 76.203895] WARNING: CPU: 3 PID: 0 at net/bridge/br_netfilter_hooks.c:622 br_nf_local_in+0x1a9/0x1d0 [br_netfilter]
> > > May 7 08:10:12 cerberus kernel: [ 76.203911] Modules linked in: ctr ccm nf_tables xt_nat xt_recent xt_geoip(OE) xt_NFQUEUE xt_mark xt_CT xt_tcpudp xt_helper nf_nat_ftp nf_conntrack_ftp ip6table_raw ip6table_mangle ip6table_nat xt_MASQUERADE iptable_nat nf_nat xt_TCPMSS xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 iptable_raw iptable_mangle xt_multiport xt_state xt_limit xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip6table_filter ip6_tables iptable_filter ip_tables x_tables ovpn_dco_v2(OE) ip6_udp_tunnel udp_tunnel tcp_bbr nct6775 nct6775_core hwmon_vid br_netfilter bridge stp llc nfnetlink_queue nfnetlink i915 ppdev intel_rapl_msr evdev intel_rapl_common x86_pkg_temp_thermal intel_powerclamp drm_buddy coretemp video rt2800usb wmi ghash_clmulni_intel drm_display_helper rt2x00usb sha512_ssse3 sha512_generic rt2800lib rt2x00lib cec sha256_ssse3 sha1_ssse3 rc_core mac80211 aesni_intel ttm crypto_simd drm_kms_helper libarc4 cryptd cfg80211 rapl intel_cstate drm intel_uncore rfkill parport_pc pcspkr
> > > May 7 08:10:12 cerberus kernel: [ 76.203999] serio_raw iTCO_wdt intel_pmc_bxt iTCO_vendor_support parport watchdog at24 button ext4 crc16 mbcache jbd2 crc32c_generic sg sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic ahci libahci libata crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel psmouse scsi_mod i2c_i801 i2c_smbus ehci_pci ehci_hcd scsi_common lpc_ich usbcore igb i2c_algo_bit usb_common dca
> > > May 7 08:10:12 cerberus kernel: [ 76.204039] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 6.1.0-21-amd64 #1 Debian 6.1.90-1
> > > May 7 08:10:12 cerberus kernel: [ 76.204044] Hardware name: Sophos UTM/To be filled by O.E.M., BIOS 4.6.4 11/08/2011
> > > May 7 08:10:12 cerberus kernel: [ 76.204046] RIP: 0010:br_nf_local_in+0x1a9/0x1d0 [br_netfilter]
> > > May 7 08:10:12 cerberus kernel: [ 76.204056] Code: df e8 4b b7 cd fa 66 83 ab b8 00 00 00 08 eb 94 be 04 00 00 00 48 89 df e8 34 b7 cd fa 66 83 ab b8 00 00 00 04 e9 7a ff ff ff <0f> 0b e9 f0 fe ff ff 0f 0b e9 dd fe ff ff 48 89 ef e8 41 67 d8 fa
> > > May 7 08:10:12 cerberus kernel: [ 76.204059] RSP: 0018:bf5600144928 EFLAGS: 00010202
> > > May 7 08:10:12 cerberus kernel: [ 76.204062] RAX: 0002 RBX: 9ac2862
Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack
On Tue, 7 May 2024 21:01:06 +0200 Salvatore Bonaccorso wrote:
> Control: tags -1 + moreinfo
>
> Hi Tito,
>
> On Tue, May 07, 2024 at 10:19:44AM +0200, Tito Ragusa wrote:
> > Package: src:linux
> > Version: 6.1.90-1
> > Severity: normal
> >
> > Dear Maintainer,
> >
> >    * What led up to the situation?
> >
> > Rebooting the box after kernel package upgrade
> >
> >    * What exactly did you do (or not do) that was effective (or
> >      ineffective)?
> >
> > Nothing
> >
> >    * What was the outcome of this action?
> >
> > Nothing
> >
> >    * What outcome did you expect instead?
> >
> > Rebooting without traces in the logs
> >
> > -- Package-specific info:
> > ** Version:
> > Linux version 6.1.0-21-amd64 (debian-ker...@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)
> >
> > ** Command line:
> > BOOT_IMAGE=/vmlinuz-6.1.0-21-amd64 root=UUID=a75e6ad5-37fc-4f69-9361-f94d6c0e5d2f ro net.ifnames=0 apparmor=0 selinux=0 noresume consoleblank=0 console=tty1
> >
> > ** Tainted: WOE (12800)
> >  * kernel issued warning
> >  * externally-built ("out-of-tree") module was loaded
> >  * unsigned module was loaded
> >
> > ** Kernel log:
> > May 7 08:10:12 cerberus kernel: [ 76.203881] [ cut here ]
> > May 7 08:10:12 cerberus kernel: [ 76.203895] WARNING: CPU: 3 PID: 0 at net/bridge/br_netfilter_hooks.c:622 br_nf_local_in+0x1a9/0x1d0 [br_netfilter]
> > May 7 08:10:12 cerberus kernel: [ 76.203911] Modules linked in: ctr ccm nf_tables xt_nat xt_recent xt_geoip(OE) xt_NFQUEUE xt_mark xt_CT xt_tcpudp xt_helper nf_nat_ftp nf_conntrack_ftp ip6table_raw ip6table_mangle ip6table_nat xt_MASQUERADE iptable_nat nf_nat xt_TCPMSS xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 iptable_raw iptable_mangle xt_multiport xt_state xt_limit xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip6table_filter ip6_tables iptable_filter ip_tables x_tables ovpn_dco_v2(OE) ip6_udp_tunnel udp_tunnel tcp_bbr nct6775 nct6775_core hwmon_vid br_netfilter bridge stp llc nfnetlink_queue nfnetlink i915 ppdev intel_rapl_msr evdev intel_rapl_common x86_pkg_temp_thermal intel_powerclamp drm_buddy coretemp video rt2800usb wmi ghash_clmulni_intel drm_display_helper rt2x00usb sha512_ssse3 sha512_generic rt2800lib rt2x00lib cec sha256_ssse3 sha1_ssse3 rc_core mac80211 aesni_intel ttm crypto_simd drm_kms_helper libarc4 cryptd cfg80211 rapl intel_cstate drm intel_uncore rfkill parport_pc pcspkr
> > May 7 08:10:12 cerberus kernel: [ 76.203999] serio_raw iTCO_wdt intel_pmc_bxt iTCO_vendor_support parport watchdog at24 button ext4 crc16 mbcache jbd2 crc32c_generic sg sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic ahci libahci libata crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel psmouse scsi_mod i2c_i801 i2c_smbus ehci_pci ehci_hcd scsi_common lpc_ich usbcore igb i2c_algo_bit usb_common dca
> > May 7 08:10:12 cerberus kernel: [ 76.204039] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 6.1.0-21-amd64 #1 Debian 6.1.90-1
> > May 7 08:10:12 cerberus kernel: [ 76.204044] Hardware name: Sophos UTM/To be filled by O.E.M., BIOS 4.6.4 11/08/2011
> > May 7 08:10:12 cerberus kernel: [ 76.204046] RIP: 0010:br_nf_local_in+0x1a9/0x1d0 [br_netfilter]
> > May 7 08:10:12 cerberus kernel: [ 76.204056] Code: df e8 4b b7 cd fa 66 83 ab b8 00 00 00 08 eb 94 be 04 00 00 00 48 89 df e8 34 b7 cd fa 66 83 ab b8 00 00 00 04 e9 7a ff ff ff <0f> 0b e9 f0 fe ff ff 0f 0b e9 dd fe ff ff 48 89 ef e8 41 67 d8 fa
> > May 7 08:10:12 cerberus kernel: [ 76.204059] RSP: 0018:bf5600144928 EFLAGS: 00010202
> > May 7 08:10:12 cerberus kernel: [ 76.204062] RAX: 0002 RBX: 9ac2862ff300 RCX:
> > May 7 08:10:12 cerberus kernel: [ 76.204065] RDX: bf5600144980 RSI: 9ac2862ff300 RDI:
> > May 7 08:10:12 cerberus kernel: [ 76.204067] RBP: 9ac2848a8100 R08: 0001 R09: 9ac2872be980
> > May 7 08:10:12 cerberus kernel: [ 76.204070] R10: 9ac2872be000 R11: 0002 R12: bf5600144980
> > May 7 08:10:12 cerberus kernel: [ 76.204072] R13: R14: 9ac282f4bac0 R15: 9ac2d027da00
> > May 7 08:10:12 cerberus kernel: [ 76.204074] FS: () GS:9ac5b018() knlGS:
> > May 7 08:10:12 cerberus kernel: [ 76.204077] CS: 0010 DS: ES: CR0: 80050033
> > May 7 08:10:12 cerberus kernel: [ 76.204080] CR2: 5618751eb018 CR3: 2e610006 CR4: 000606e0
> > May 7 08:10:12 cerberus kernel: [ 76.204083] Call Trace:
> > May 7 08:10:12 cerberus kernel: [ 76.204087]
> > May 7 08:10