On 4/20/26 21:44, David CARLIER wrote:
Hi Justin,

On Mon, 20 Apr 2026 at 20:02, Justin Iurman <[email protected]> wrote:

On 4/17/26 07:54, David Carlier wrote:
gtp_genl_send_echo_req() runs as a generic netlink doit handler in
process context with BH not disabled. It calls udp_tunnel_xmit_skb(),
which eventually invokes iptunnel_xmit() — that uses __this_cpu_inc/dec
on softnet_data.xmit.recursion to track the tunnel xmit recursion level.

Without local_bh_disable(), the task may migrate between
dev_xmit_recursion_inc() and dev_xmit_recursion_dec(), breaking the
per-CPU counter pairing. The result is stale or negative recursion
levels that can later produce false-positive
SKB_DROP_REASON_RECURSION_LIMIT drops on either CPU.

The other udp_tunnel_xmit_skb() call sites in gtp.c are unaffected:
the data path runs under ndo_start_xmit and the echo response handlers
run from the UDP encap rx softirq, both with BH already disabled.

Fix it by disabling BH around the udp_tunnel_xmit_skb() call, mirroring
commit 2cd7e6971fc2 ("sctp: disable BH before calling
udp_tunnel_xmit_skb()").
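
The race and the fix described above can be illustrated with a small
userspace model. This is only a sketch and not kernel code: every
model_* name is hypothetical, the recursion[] array stands in for
softnet_data.xmit.recursion, and the model assumes that "BH disabled"
simply means "the task cannot migrate to another CPU".

```c
#include <assert.h>

#define NR_CPUS 2

/* Toy stand-in for the per-CPU softnet_data.xmit.recursion counter. */
static int recursion[NR_CPUS];
static int cur_cpu;     /* CPU the modeled task currently runs on */
static int bh_depth;    /* > 0 while the modeled task has BH disabled */

static void model_reset(void)
{
    recursion[0] = recursion[1] = 0;
    cur_cpu = 0;
    bh_depth = 0;
}

static void model_local_bh_disable(void) { bh_depth++; }

static void model_local_bh_enable(void)
{
    assert(bh_depth > 0);
    bh_depth--;
}

/* Assumption of this model: migration only happens with BH enabled. */
static void model_try_migrate(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (bh_depth == 0)
        cur_cpu = cpu;
}

/* Models the inc/dec pair that brackets the tunnel transmit. */
static void model_tunnel_xmit(int migrate_to)
{
    recursion[cur_cpu]++;           /* like dev_xmit_recursion_inc() */
    model_try_migrate(migrate_to);  /* preemption point mid-transmit */
    recursion[cur_cpu]--;           /* like dev_xmit_recursion_dec() */
}

/* Models the genl handler, with or without the BH protection. */
static void model_send_echo_req(int protect)
{
    if (protect)
        model_local_bh_disable();
    model_tunnel_xmit(1);
    if (protect)
        model_local_bh_enable();
}
```

In this model, calling model_send_echo_req(0) leaves CPU 0 with a stale
positive count and CPU 1 with a negative one, while
model_send_echo_req(1) keeps both counters at zero.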

Why not fix iptunnel_xmit() directly, rather than fixing all possible
callers? Basically, just like we did for lwtunnel_{output|xmit}(). The
advantage would be that we no longer have to worry about BHs in the
callers, and BHs would only be disabled when necessary.

Good point: your lwtunnel fix (c03a49f3093a) is a close parallel, and
a central fix would avoid chasing callers one by one (sctp was patched
last week, gtp is this one, and tipc/wireguard/ovpn genl paths look
similar).

Happy to respin as v2 with local_bh_disable/enable moved into
iptunnel_xmit() (and ip6tunnel_xmit() for symmetry), and drop the
gtp-local hunk. That would also supersede Xin Long's recent sctp
commit (2cd7e6971fc2), so I'll make sure to Cc him.

Jakub merged it already, so no need to respin. I guess we could revisit later if required.

One thing I'd like your take on before I send: iptunnel_xmit() feels
like the natural home since it owns the recursion counter, but would
you rather see it in udp_tunnel_xmit_skb()? I don't want to pick the
wrong spot if you already have a preference.

Since udp_tunnel_xmit_skb() is just another caller, I'd definitely do it in iptunnel_xmit() to centralize things (same for v6).
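
A userspace sketch of what centralizing could look like, under the same
caveats as any toy model: this is not kernel code, all model_* names
are hypothetical, and "BH disabled" is assumed to mean "cannot migrate".
The point it illustrates is that the callee takes the protection
itself, so a process-context caller needs no change, and a caller that
already runs with BH disabled just nests one level deeper.

```c
#include <assert.h>

#define NR_CPUS 2

static int recursion[NR_CPUS];  /* toy per-CPU recursion counters */
static int cur_cpu;             /* CPU the modeled task runs on */
static int bh_depth;            /* nesting depth of "BH disabled" */

static void model_local_bh_disable(void) { bh_depth++; }

static void model_local_bh_enable(void)
{
    assert(bh_depth > 0);
    bh_depth--;
}

/* Assumption of this model: migration only happens with BH enabled. */
static void model_try_migrate(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (bh_depth == 0)
        cur_cpu = cpu;
}

/* Hypothetical centralized variant: the callee disables BH itself. */
static void model_iptunnel_xmit(int migrate_to)
{
    model_local_bh_disable();
    recursion[cur_cpu]++;           /* counter owned by this function */
    model_try_migrate(migrate_to);  /* no effect: BH is disabled here */
    recursion[cur_cpu]--;
    model_local_bh_enable();
}

/* Process-context caller (genl-like): no BH handling of its own. */
static void model_genl_caller(void)
{
    model_iptunnel_xmit(1);
}

/* Caller that already runs with BH disabled (softirq-like). */
static void model_softirq_caller(void)
{
    model_local_bh_disable();
    model_iptunnel_xmit(1);         /* nests: depth goes 1 -> 2 -> 1 */
    model_local_bh_enable();
}

/* True when every counter and the BH depth are back to zero. */
static int model_balanced(void)
{
    return recursion[0] == 0 && recursion[1] == 0 && bh_depth == 0;
}
```

Both model_genl_caller() and model_softirq_caller() leave the model
balanced, which is the property the per-caller fixes were each trying
to restore.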
