On Mon, 20 Apr 2026 21:02:55 +0200 Justin Iurman wrote:
> On 4/17/26 07:54, David Carlier wrote:
> > gtp_genl_send_echo_req() runs as a generic netlink doit handler in
> > process context with BH not disabled. It calls udp_tunnel_xmit_skb(),
> > which eventually invokes iptunnel_xmit() — that uses __this_cpu_inc/dec
> > on softnet_data.xmit.recursion to track the tunnel xmit recursion level.
> >
> > Without local_bh_disable(), the task may migrate between
> > dev_xmit_recursion_inc() and dev_xmit_recursion_dec(), breaking the
> > per-CPU counter pairing. The result is stale or negative recursion
> > levels that can later produce false-positive
> > SKB_DROP_REASON_RECURSION_LIMIT drops on either CPU.
> >
> > The other udp_tunnel_xmit_skb() call sites in gtp.c are unaffected:
> > the data path runs under ndo_start_xmit and the echo response handlers
> > run from the UDP encap rx softirq, both with BH already disabled.
> >
> > Fix it by disabling BH around the udp_tunnel_xmit_skb() call, mirroring
> > commit 2cd7e6971fc2 ("sctp: disable BH before calling
> > udp_tunnel_xmit_skb()").
>
> Why not fix iptunnel_xmit() directly, rather than fixing all possible
> callers? Basically, just like we did for lwtunnel_{output|xmit}(). The
> advantage would be that we no longer have to worry about BHs in the
> callers, and BHs would only be disabled when necessary.

Oops, I pushed this already. The bot hasn't caught up yet.
Let's revisit this if we find another caller in process context?
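
For reference, the counter imbalance described in the quoted commit
message can be reproduced with a small userspace toy model. This is only
a sketch, not kernel code: struct task, try_migrate() and model_xmit()
are hypothetical names invented for illustration, the array stands in
for the per-CPU softnet_data.xmit.recursion counter, and the bh_disabled
flag merely models the effect of local_bh_disable() preventing
migration.

```c
#include <assert.h>

#define NR_CPUS 2

/* Stand-in for the per-CPU softnet_data.xmit.recursion counters. */
static int xmit_recursion[NR_CPUS];

/* Hypothetical model of a task that can be migrated while preemptible. */
struct task {
	int cpu;		/* CPU the task currently runs on */
	int bh_disabled;	/* nonzero models local_bh_disable() in effect */
};

/* Models __this_cpu_inc()/__this_cpu_dec(): acts on the current CPU only. */
static void recursion_inc(const struct task *t) { xmit_recursion[t->cpu]++; }
static void recursion_dec(const struct task *t) { xmit_recursion[t->cpu]--; }

/* In this model the task can only move when BH is not disabled. */
static void try_migrate(struct task *t, int new_cpu)
{
	if (!t->bh_disabled)
		t->cpu = new_cpu;
}

/* Models one tunnel xmit, with or without BH protection around it. */
static void model_xmit(struct task *t, int protect)
{
	t->bh_disabled = protect;
	recursion_inc(t);
	try_migrate(t, 1);	/* worst case: migration between inc and dec */
	recursion_dec(t);
	t->bh_disabled = 0;
}
```

Calling model_xmit() on a task that starts on CPU 0 with protect == 0
leaves CPU 0 at +1 and CPU 1 at -1, which is the stale/negative state
the commit message describes. With protect == 1 both counters return
to 0.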