Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack

2024-09-07 Thread XXX XXX
On Sat, 7 Sep 2024 09:13:09 +0200
Salvatore Bonaccorso  wrote:

> Hi,
> 
> On Fri, Sep 06, 2024 at 10:47:04PM +0200, XXX XXX wrote:
> > On Fri, 6 Sep 2024 22:04:27 +0200
> > Salvatore Bonaccorso  wrote:
> > 
> > > Control: tags -1 + moreinfo
> > > 
> > > Hi,
> > > 
> > > On Mon, Sep 02, 2024 at 11:09:42PM +0200, XXX XXX wrote:
> > > > Hi,
> > > > this bug seems to be fixed in linux kernel 6.1.107,
> > > > I suspect the commit that fixed it is:
> > > > 
> > > > commit 6dcc8ba8a6074bb79040f502dc66ad23a58a1c86
> > > > Author: Florian Westphal 
> > > > Date:   Wed Aug 7 21:28:41 2024 +0200
> > > > 
> > > > netfilter: nf_queue: drop packets with cloned unconfirmed conntracks
> > > > 
> > > > [ Upstream commit 7d8dc1c7be8d3509e8f5164dd5df64c8e34d7eeb ]
> > > > 
> > > > Conntrack assumes an unconfirmed entry (not yet committed to global hash
> > > > table) has a refcount of 1 and is not visible to other cores.
> > > > 
> > > > With multicast forwarding this assumption breaks down because such
> > > > skbs get cloned after being picked up, i.e.  ct->use refcount is > 1.
> > > > 
> > > > Likewise, bridge netfilter will clone broad/mutlicast frames and
> > > > all frames in case they need to be flood-forwarded during learning
> > > > phase.
> > > > 
> > > > For ip multicast forwarding or plain bridge flood-forward this will
> > > > "work" because packets don't leave softirq and are implicitly
> > > > serialized.
> > > > 
> > > > With nfqueue this no longer holds true, the packets get queued
> > > > and can be reinjected in arbitrary ways.
> > > > 
> > > > Disable this feature, I see no other solution.
> > > > 
> > > > After this patch, nfqueue cannot queue packets except the last
> > > > multicast/broadcast packet.
> > > > 
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Florian Westphal 
> > > > Signed-off-by: Pablo Neira Ayuso 
> > > > Signed-off-by: Sasha Levin 
> > > 
> > > Would you be able to confirm this? If this is true, it would imply
> > > that the issue should also be visible in current testing, in versions
> > > up to and including 6.10.7-1.
> > > 
> > > Regards,
> > > Salvatore
> > 
> > Hi,
> > 
> > for sure it was visible in linux-image-6.10.6+bpo-amd64, which I tried
> > from stable backports after the trace popped up again following the
> > upgrade to linux-image-6.1.0-25-amd64.
> > By checking the changelog on kernel.org for the source file and line
> > shown in the traces, I spotted this patch. It looked relevant because
> > I use suricata in nfqueue mode and because the trace always happened
> > at boot (during the learning phase).
> > I first tried 6.1.106 by mistake and the trace was still there;
> > then I tried 6.1.107 and it was gone.
> > Hope this helps.
> 
> Yes, thanks. One option to get final confirmation and proper closure
> tracking would be to cherry-pick the commit on top of the 6.1.106-3
> version and see if it resolves the issue.
> 
> You could proceed as described in
> 
> https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#id-1.6.6.4
> 
> Regards,
> Salvatore

Did so:

# apt-get install build-essential
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
# apt install kernel-wedge -t daedalus-backports 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be upgraded:
  kernel-wedge
1 upgraded, 0 newly installed, 0 to remove and 166 not upgraded.
Need to get 21.0 kB of archives.
After this operation, 24.6 kB disk space will be freed.
Get:1 http://deb.devuan.org/merged daedalus-backports/main amd64 kernel-wedge all 2.105~bpo12+1 [21.0 kB]
Fetched 21.0 kB in 0s (43.3 kB/s) 
Reading changelogs... Done
(Reading database ... 563865 files and directories currently installed.)
Preparing to unpack .../kernel-wedge_2.105~bpo12+1_all.deb ...
Unpacking kernel-wedge (2.105~bpo12+1) over (2.104) ...

Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack

2024-09-06 Thread XXX XXX
On Fri, 6 Sep 2024 22:04:27 +0200
Salvatore Bonaccorso  wrote:

> Control: tags -1 + moreinfo
> 
> Hi,
> 
> On Mon, Sep 02, 2024 at 11:09:42PM +0200, XXX XXX wrote:
> > Hi,
> > this bug seems to be fixed in linux kernel 6.1.107,
> > I suspect the commit that fixed it is:
> > 
> > commit 6dcc8ba8a6074bb79040f502dc66ad23a58a1c86
> > Author: Florian Westphal 
> > Date:   Wed Aug 7 21:28:41 2024 +0200
> > 
> > netfilter: nf_queue: drop packets with cloned unconfirmed conntracks
> > 
> > [ Upstream commit 7d8dc1c7be8d3509e8f5164dd5df64c8e34d7eeb ]
> > 
> > Conntrack assumes an unconfirmed entry (not yet committed to global hash
> > table) has a refcount of 1 and is not visible to other cores.
> > 
> > With multicast forwarding this assumption breaks down because such
> > skbs get cloned after being picked up, i.e.  ct->use refcount is > 1.
> > 
> > Likewise, bridge netfilter will clone broad/mutlicast frames and
> > all frames in case they need to be flood-forwarded during learning
> > phase.
> > 
> > For ip multicast forwarding or plain bridge flood-forward this will
> > "work" because packets don't leave softirq and are implicitly
> > serialized.
> > 
> > With nfqueue this no longer holds true, the packets get queued
> > and can be reinjected in arbitrary ways.
> > 
> > Disable this feature, I see no other solution.
> > 
> > After this patch, nfqueue cannot queue packets except the last
> > multicast/broadcast packet.
> > 
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Florian Westphal 
> > Signed-off-by: Pablo Neira Ayuso 
> > Signed-off-by: Sasha Levin 
> 
> Would you be able to confirm this? If this is true, it would imply
> that the issue should also be visible in current testing, in versions
> up to and including 6.10.7-1.
> 
> Regards,
> Salvatore

Hi,

for sure it was visible in linux-image-6.10.6+bpo-amd64, which I tried
from stable backports after the trace popped up again following the
upgrade to linux-image-6.1.0-25-amd64.
By checking the changelog on kernel.org for the source file and line
shown in the traces, I spotted this patch. It looked relevant because
I use suricata in nfqueue mode and because the trace always happened
at boot (during the learning phase).
I first tried 6.1.106 by mistake and the trace was still there;
then I tried 6.1.107 and it was gone.
Hope this helps.

Ciao,
Tito
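
The 6.1.106 versus 6.1.107 observation above can be turned into a quick check. The following is only a sketch: it assumes the kernel banner has the Debian form shown in the original report ("... #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)"), it covers the 6.1 series only, and the names `upstream_version`, `has_fix_6_1` and `FIRST_FIXED_6_1` are made up for illustration.

```python
import re

# Assumption taken from this thread: 6.1.106 still warned, 6.1.107 did not.
FIRST_FIXED_6_1 = (6, 1, 107)


def upstream_version(banner):
    """Return the upstream version from a Debian kernel banner, or None.

    The pattern only looks after the "#<build number>" marker so that the
    compiler version "(Debian 12.2.0-14)" in /proc/version is not picked up.
    """
    match = re.search(r"#\d+ .*?Debian (\d+)\.(\d+)\.(\d+)-", banner)
    return tuple(int(part) for part in match.groups()) if match else None


def has_fix_6_1(banner):
    """True/False for 6.1.y kernels, None when the banner gives no answer."""
    version = upstream_version(banner)
    if version is None or version[:2] != (6, 1):
        return None  # other series are outside the scope of this sketch
    return version >= FIRST_FIXED_6_1


# Banner of the kernel from the original report.
print(has_fix_6_1("#1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)"))  # prints False
```

On a running system the banner could be taken from `uname -v` or /proc/version; a `None` result means the sketch cannot tell, not that the kernel is affected.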



Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack

2024-09-02 Thread XXX XXX
Hi,
this bug seems to be fixed in linux kernel 6.1.107,
I suspect the commit that fixed it is:

commit 6dcc8ba8a6074bb79040f502dc66ad23a58a1c86
Author: Florian Westphal 
Date:   Wed Aug 7 21:28:41 2024 +0200

netfilter: nf_queue: drop packets with cloned unconfirmed conntracks

[ Upstream commit 7d8dc1c7be8d3509e8f5164dd5df64c8e34d7eeb ]

Conntrack assumes an unconfirmed entry (not yet committed to global hash
table) has a refcount of 1 and is not visible to other cores.

With multicast forwarding this assumption breaks down because such
skbs get cloned after being picked up, i.e.  ct->use refcount is > 1.

Likewise, bridge netfilter will clone broad/mutlicast frames and
all frames in case they need to be flood-forwarded during learning
phase.

For ip multicast forwarding or plain bridge flood-forward this will
"work" because packets don't leave softirq and are implicitly
serialized.

With nfqueue this no longer holds true, the packets get queued
and can be reinjected in arbitrary ways.

Disable this feature, I see no other solution.

After this patch, nfqueue cannot queue packets except the last
multicast/broadcast packet.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin 

Best regards, 
Tito Ragusa
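
To make the commit message above easier to follow, here is a toy model of the rule it describes: a packet is dropped instead of queued when its conntrack entry is still unconfirmed but already shared by more than one skb. This is not the kernel code; `Conntrack`, `should_drop_on_queue` and `queue_clones` are hypothetical names, and the bookkeeping is simplified to one reference per clone.

```python
from dataclasses import dataclass


@dataclass
class Conntrack:
    """Toy stand-in for a conntrack entry (not the kernel's struct nf_conn)."""
    confirmed: bool  # True once the entry is committed to the global hash table
    use: int         # reference count; in this model each skb clone holds one


def should_drop_on_queue(ct):
    """Drop when an unconfirmed entry is referenced by more than one skb."""
    return not ct.confirmed and ct.use > 1


def queue_clones(ct):
    """Offer every clone that shares `ct` to the queue, one after another."""
    verdicts = []
    for _ in range(ct.use):
        if should_drop_on_queue(ct):
            ct.use -= 1  # freeing the dropped clone releases its reference
            verdicts.append("drop")
        else:
            verdicts.append("queue")
    return verdicts


# Three clones of a flood-forwarded frame share one unconfirmed entry.
print(queue_clones(Conntrack(confirmed=False, use=3)))  # prints ['drop', 'drop', 'queue']
```

Under these assumptions only the last clone reaches the queue, which matches the statement that nfqueue cannot queue packets except the last multicast/broadcast packet.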



Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack

2024-09-02 Thread XXX XXX
On Tue, 7 May 2024 21:50:45 +0200
XXX XXX  wrote:

> On Tue, 7 May 2024 21:01:06 +0200
> Salvatore Bonaccorso  wrote:
> 
> > Control: tags -1 + moreinfo
> > 
> > Hi Tito,
> > 
> > On Tue, May 07, 2024 at 10:19:44AM +0200, Tito Ragusa wrote:
> > > Package: src:linux
> > > Version: 6.1.90-1
> > > Severity: normal
> > > 
> > > Dear Maintainer,
> > > 
> > >* What led up to the situation?
> > > 
> > >Rebooting the box after kernel package upgrade
> > > 
> > >* What exactly did you do (or not do) that was effective (or
> > >  ineffective)?
> > >
> > >Nothing
> > > 
> > >* What was the outcome of this action?
> > >  
> > >Nothing 
> > > 
> > >* What outcome did you expect instead?
> > > 
> > >Rebooting without traces in the logs
> > >  
> > > -- Package-specific info:
> > > ** Version:
> > > Linux version 6.1.0-21-amd64 (debian-ker...@lists.debian.org) (gcc-12 
> > > (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP 
> > > PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)
> > > 
> > > ** Command line:
> > > BOOT_IMAGE=/vmlinuz-6.1.0-21-amd64 
> > > root=UUID=a75e6ad5-37fc-4f69-9361-f94d6c0e5d2f ro net.ifnames=0 
> > > apparmor=0 selinux=0 noresume consoleblank=0 console=tty1
> > > 
> > > ** Tainted: WOE (12800)
> > >  * kernel issued warning
> > >  * externally-built ("out-of-tree") module was loaded
> > >  * unsigned module was loaded
> > > 
> > > ** Kernel log:
> > > > May  7 08:10:12 cerberus kernel: [   76.203881] [ cut here ]
> > > May  7 08:10:12 cerberus kernel: [   76.203895] WARNING: CPU: 3 PID: 0 at 
> > > net/bridge/br_netfilter_hooks.c:622 br_nf_local_in+0x1a9/0x1d0 
> > > [br_netfilter]
> > > May  7 08:10:12 cerberus kernel: [   76.203911] Modules linked in: ctr 
> > > ccm nf_tables xt_nat xt_recent xt_geoip(OE) xt_NFQUEUE xt_mark xt_CT 
> > > xt_tcpudp xt_helper nf_nat_ftp nf_conntrack_ftp ip6table_raw 
> > > ip6table_mangle ip6table_nat xt_MASQUERADE iptable_nat nf_nat xt_TCPMSS 
> > > xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 iptable_raw iptable_mangle 
> > > xt_multiport xt_state xt_limit xt_conntrack nf_conntrack nf_defrag_ipv6 
> > > nf_defrag_ipv4 libcrc32c ip6table_filter ip6_tables iptable_filter 
> > > ip_tables x_tables ovpn_dco_v2(OE) ip6_udp_tunnel udp_tunnel tcp_bbr 
> > > nct6775 nct6775_core hwmon_vid br_netfilter bridge stp llc 
> > > nfnetlink_queue nfnetlink i915 ppdev intel_rapl_msr evdev 
> > > intel_rapl_common x86_pkg_temp_thermal intel_powerclamp drm_buddy 
> > > coretemp video rt2800usb wmi ghash_clmulni_intel drm_display_helper 
> > > rt2x00usb sha512_ssse3 sha512_generic rt2800lib rt2x00lib cec 
> > > sha256_ssse3 sha1_ssse3 rc_core mac80211 aesni_intel ttm crypto_simd 
> > > > drm_kms_helper libarc4 cryptd cfg80211 rapl intel_cstate drm
> > > > intel_uncore rfkill parport_pc pcspkr
> > > May  7 08:10:12 cerberus kernel: [   76.203999]  serio_raw iTCO_wdt 
> > > intel_pmc_bxt iTCO_vendor_support parport watchdog at24 button ext4 crc16 
> > > mbcache jbd2 crc32c_generic sg sd_mod t10_pi crc64_rocksoft crc64 
> > > crc_t10dif crct10dif_generic ahci libahci libata crct10dif_pclmul 
> > > crct10dif_common crc32_pclmul crc32c_intel psmouse scsi_mod i2c_i801 
> > > i2c_smbus ehci_pci ehci_hcd scsi_common lpc_ich usbcore igb i2c_algo_bit 
> > > usb_common dca
> > > May  7 08:10:12 cerberus kernel: [   76.204039] CPU: 3 PID: 0 Comm: 
> > > swapper/3 Tainted: G   OE  6.1.0-21-amd64 #1  Debian 6.1.90-1
> > > May  7 08:10:12 cerberus kernel: [   76.204044] Hardware name: Sophos 
> > > UTM/To be filled by O.E.M., BIOS 4.6.4 11/08/2011
> > > May  7 08:10:12 cerberus kernel: [   76.204046] RIP: 
> > > 0010:br_nf_local_in+0x1a9/0x1d0 [br_netfilter]
> > > May  7 08:10:12 cerberus kernel: [   76.204056] Code: df e8 4b b7 cd fa 
> > > 66 83 ab b8 00 00 00 08 eb 94 be 04 00 00 00 48 89 df e8 34 b7 cd fa 66 
> > > 83 ab b8 00 00 00 04 e9 7a ff ff ff <0f> 0b e9 f0 fe ff ff 0f 0b e9 dd fe 
> > > ff ff 48 89 ef e8 41 67 d8 fa
> > > May  7 08:10:12 cerberus kernel: [   76.204059] RSP: 
> > > 0018:bf5600144928 EFLAGS: 00010202
> > > May  7 08:10:12 cerberus kernel: [   76.204062] RAX: 0002 
> > > RBX: 9ac2862

Bug#1070685: linux-image-6.1.0-21-amd64: Found Trace in the logs about br_netfilter and nf_conntrack

2024-05-07 Thread XXX XXX
On Tue, 7 May 2024 21:01:06 +0200
Salvatore Bonaccorso  wrote:

> Control: tags -1 + moreinfo
> 
> Hi Tito,
> 
> On Tue, May 07, 2024 at 10:19:44AM +0200, Tito Ragusa wrote:
> > Package: src:linux
> > Version: 6.1.90-1
> > Severity: normal
> > 
> > Dear Maintainer,
> > 
> >* What led up to the situation?
> > 
> >Rebooting the box after kernel package upgrade
> > 
> >* What exactly did you do (or not do) that was effective (or
> >  ineffective)?
> >
> >Nothing
> > 
> >* What was the outcome of this action?
> >  
> >Nothing 
> > 
> >* What outcome did you expect instead?
> > 
> >Rebooting without traces in the logs
> >  
> > -- Package-specific info:
> > ** Version:
> > Linux version 6.1.0-21-amd64 (debian-ker...@lists.debian.org) (gcc-12 
> > (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP 
> > PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)
> > 
> > ** Command line:
> > BOOT_IMAGE=/vmlinuz-6.1.0-21-amd64 
> > root=UUID=a75e6ad5-37fc-4f69-9361-f94d6c0e5d2f ro net.ifnames=0 apparmor=0 
> > selinux=0 noresume consoleblank=0 console=tty1
> > 
> > ** Tainted: WOE (12800)
> >  * kernel issued warning
> >  * externally-built ("out-of-tree") module was loaded
> >  * unsigned module was loaded
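
The taint value 12800 quoted above decomposes as 512 + 4096 + 8192, i.e. bits 9, 12 and 13. A minimal sketch of that arithmetic follows; the letter table is the per-bit list from the kernel's tainted-kernels documentation as far as I know it (bits 0 to 17 only), and `decode_taint` is a hypothetical helper name.

```python
# One letter per taint bit, lowest bit first (bit 0 = P, bit 9 = W,
# bit 12 = O, bit 13 = E). Assumed to follow the kernel's
# Documentation/admin-guide/tainted-kernels.rst; only bits 0-17 are listed.
TAINT_LETTERS = "PFSRMBUDAWCIOELKXT"


def decode_taint(value):
    """Return the letters of all taint bits set in `value`."""
    return "".join(
        letter
        for bit, letter in enumerate(TAINT_LETTERS)
        if value & (1 << bit)
    )


print(decode_taint(12800))  # prints WOE
```

The result agrees with the three reasons listed in the report: a kernel warning (W), an out-of-tree module (O) and an unsigned module (E).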
> > 
> > ** Kernel log:
> > May  7 08:10:12 cerberus kernel: [   76.203881] [ cut here ]
> > May  7 08:10:12 cerberus kernel: [   76.203895] WARNING: CPU: 3 PID: 0 at 
> > net/bridge/br_netfilter_hooks.c:622 br_nf_local_in+0x1a9/0x1d0 
> > [br_netfilter]
> > May  7 08:10:12 cerberus kernel: [   76.203911] Modules linked in: ctr ccm 
> > nf_tables xt_nat xt_recent xt_geoip(OE) xt_NFQUEUE xt_mark xt_CT xt_tcpudp 
> > xt_helper nf_nat_ftp nf_conntrack_ftp ip6table_raw ip6table_mangle 
> > ip6table_nat xt_MASQUERADE iptable_nat nf_nat xt_TCPMSS xt_LOG 
> > nf_log_syslog ipt_REJECT nf_reject_ipv4 iptable_raw iptable_mangle 
> > xt_multiport xt_state xt_limit xt_conntrack nf_conntrack nf_defrag_ipv6 
> > nf_defrag_ipv4 libcrc32c ip6table_filter ip6_tables iptable_filter 
> > ip_tables x_tables ovpn_dco_v2(OE) ip6_udp_tunnel udp_tunnel tcp_bbr 
> > nct6775 nct6775_core hwmon_vid br_netfilter bridge stp llc nfnetlink_queue 
> > nfnetlink i915 ppdev intel_rapl_msr evdev intel_rapl_common 
> > x86_pkg_temp_thermal intel_powerclamp drm_buddy coretemp video rt2800usb 
> > wmi ghash_clmulni_intel drm_display_helper rt2x00usb sha512_ssse3 
> > sha512_generic rt2800lib rt2x00lib cec sha256_ssse3 sha1_ssse3 rc_core 
> > mac80211 aesni_intel ttm crypto_simd drm_kms_helper libarc4 cryptd cfg80211 
> > rapl intel_cstate drm intel_uncore rfkill parport_pc pcspkr
> > May  7 08:10:12 cerberus kernel: [   76.203999]  serio_raw iTCO_wdt 
> > intel_pmc_bxt iTCO_vendor_support parport watchdog at24 button ext4 crc16 
> > mbcache jbd2 crc32c_generic sg sd_mod t10_pi crc64_rocksoft crc64 
> > crc_t10dif crct10dif_generic ahci libahci libata crct10dif_pclmul 
> > crct10dif_common crc32_pclmul crc32c_intel psmouse scsi_mod i2c_i801 
> > i2c_smbus ehci_pci ehci_hcd scsi_common lpc_ich usbcore igb i2c_algo_bit 
> > usb_common dca
> > May  7 08:10:12 cerberus kernel: [   76.204039] CPU: 3 PID: 0 Comm: 
> > swapper/3 Tainted: G   OE  6.1.0-21-amd64 #1  Debian 6.1.90-1
> > May  7 08:10:12 cerberus kernel: [   76.204044] Hardware name: Sophos 
> > UTM/To be filled by O.E.M., BIOS 4.6.4 11/08/2011
> > May  7 08:10:12 cerberus kernel: [   76.204046] RIP: 
> > 0010:br_nf_local_in+0x1a9/0x1d0 [br_netfilter]
> > May  7 08:10:12 cerberus kernel: [   76.204056] Code: df e8 4b b7 cd fa 66 
> > 83 ab b8 00 00 00 08 eb 94 be 04 00 00 00 48 89 df e8 34 b7 cd fa 66 83 ab 
> > b8 00 00 00 04 e9 7a ff ff ff <0f> 0b e9 f0 fe ff ff 0f 0b e9 dd fe ff ff 
> > 48 89 ef e8 41 67 d8 fa
> > May  7 08:10:12 cerberus kernel: [   76.204059] RSP: 0018:bf5600144928 
> > EFLAGS: 00010202
> > May  7 08:10:12 cerberus kernel: [   76.204062] RAX: 0002 RBX: 
> > 9ac2862ff300 RCX: 
> > May  7 08:10:12 cerberus kernel: [   76.204065] RDX: bf5600144980 RSI: 
> > 9ac2862ff300 RDI: 
> > May  7 08:10:12 cerberus kernel: [   76.204067] RBP: 9ac2848a8100 R08: 
> > 0001 R09: 9ac2872be980
> > May  7 08:10:12 cerberus kernel: [   76.204070] R10: 9ac2872be000 R11: 
> > 0002 R12: bf5600144980
> > May  7 08:10:12 cerberus kernel: [   76.204072] R13:  R14: 
> > 9ac282f4bac0 R15: 9ac2d027da00
> > May  7 08:10:12 cerberus kernel: [   76.204074] FS:  () 
> > GS:9ac5b018() knlGS:
> > May  7 08:10:12 cerberus kernel: [   76.204077] CS:  0010 DS:  ES:  
> > CR0: 80050033
> > May  7 08:10:12 cerberus kernel: [   76.204080] CR2: 5618751eb018 CR3: 
> > 2e610006 CR4: 000606e0
> > May  7 08:10:12 cerberus kernel: [   76.204083] Call Trace:
> > May  7 08:10:12 cerberus kernel: [   76.204087]  
> > May  7 08:10