On Tue, 2026-04-21 at 16:51 +0000, [email protected] wrote:
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 5fa9189eb772..96849f4c1fbc 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5833,6 +5833,12 @@ BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
> >     if (!is_locked_tcp_sock_ops(bpf_sock))
> >             return -EOPNOTSUPP;
> > 
> > +   /* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these callbacks. */
> > +   if ((bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB ||
> > +        bpf_sock->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB) &&
> > +       level == SOL_TCP && optname == TCP_NODELAY)
> > +           return -EOPNOTSUPP;
> > +
> 
> Does this same recursion vulnerability exist for BPF TCP congestion
> control algorithms using BPF_PROG_TYPE_STRUCT_OPS?
> 
> If a BPF congestion control algorithm invokes bpf_setsockopt(TCP_NODELAY)
> from its cwnd_event callback when handling CA_EVENT_TX_START, could it
> trigger unbounded recursion through this path:
> 
> BPF_PROG_TYPE_STRUCT_OPS cwnd_event callback (CA_EVENT_TX_START)
>   -> bpf_setsockopt(TCP_NODELAY)
>     -> tcp_push_pending_frames()
>       -> tcp_write_xmit()
>         -> tcp_transmit_skb()
>           -> tcp_event_data_sent()
>             -> cwnd_event()
>               -> BPF congestion control cwnd_event callback
> 
> The author confirmed this issue in the v3 discussion thread and Martin
> KaFai Lau suggested creating a new func_proto
> (bpf_sk_setsockopt_nodelay_proto) to reject TCP_NODELAY for all struct_ops
> programs, but this patch only addresses BPF_SOCK_OPS callbacks.

Patch #2 ("bpf: Reject TCP_NODELAY in bpf-tcp-cc") fixes this; please check.
> 
> Reference: 
> https://lore.kernel.org/bpf/[email protected]/
> 
> 
> Does this fix also need to block TCP_CORK when the value is 0?

TCP_CORK is not supported: sol_tcp_sockopt() returns -EINVAL for it by default.
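
To illustrate why TCP_CORK never reaches tcp_push_pending_frames() from
bpf_setsockopt(), here is a simplified userspace model of the two checks
involved. It is only a sketch, not the kernel code: every MODEL_* name and
both model_* functions are hypothetical, and the allow-list is deliberately
cut down to two options. It assumes the option switch rejects anything it
does not list with -EINVAL, as stated above.

```c
#include <errno.h>

/* Illustrative constants; the values mirror the Linux uapi numbering. */
enum { MODEL_SOL_TCP = 6 };
enum { MODEL_TCP_NODELAY = 1, MODEL_TCP_MAXSEG = 2, MODEL_TCP_CORK = 3 };

/* Hypothetical stand-ins for the sock_ops callback types. */
enum model_op {
	MODEL_OP_OTHER,
	MODEL_OP_HDR_OPT_LEN,	/* stands in for BPF_SOCK_OPS_HDR_OPT_LEN_CB */
	MODEL_OP_WRITE_HDR_OPT,	/* stands in for BPF_SOCK_OPS_WRITE_HDR_OPT_CB */
};

/* Models an allow-list switch: options not listed fall through to -EINVAL. */
static int model_sol_tcp_setsockopt(int optname)
{
	switch (optname) {
	case MODEL_TCP_NODELAY:
	case MODEL_TCP_MAXSEG:
		return 0;	/* would be applied to the socket */
	default:
		return -EINVAL;	/* MODEL_TCP_CORK ends up here */
	}
}

/* Models the guard from this patch followed by the allow-list check. */
static int model_sock_ops_setsockopt(enum model_op op, int level, int optname)
{
	/* Reject TCP_NODELAY in the header-option callbacks to avoid re-entry. */
	if ((op == MODEL_OP_HDR_OPT_LEN || op == MODEL_OP_WRITE_HDR_OPT) &&
	    level == MODEL_SOL_TCP && optname == MODEL_TCP_NODELAY)
		return -EOPNOTSUPP;

	/* The model only covers SOL_TCP; other levels are out of scope. */
	if (level != MODEL_SOL_TCP)
		return -EINVAL;

	return model_sol_tcp_setsockopt(optname);
}
```

In this model, TCP_NODELAY from a header-option callback is refused by the
new guard, while TCP_CORK is refused by the allow-list from any callback, so
no extra check for TCP_CORK is needed.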
> 
> When bpf_setsockopt() is called with SOL_TCP and TCP_CORK with value 0
> (disabling cork), the call path goes through __tcp_sock_set_cork() in
> net/ipv4/tcp.c, which unconditionally calls tcp_push_pending_frames(sk)
> when cork is disabled:
> 
> __tcp_sock_set_cork()
>   -> tcp_push_pending_frames()
>     -> tcp_current_mss()
>       -> tcp_established_options()
>         -> bpf_skops_hdr_opt_len()
>           -> BPF_SOCK_OPS_HDR_OPT_LEN_CB
> 
> This triggers the same recursion chain described in the commit message,
> but the patch only blocks TCP_NODELAY, not TCP_CORK with value 0.
> 
> Reference:
> https://lore.kernel.org/bpf/5fea6b5e9d1a13e72edf8222a8ebe50cbb660b37e62b87b65dab6f6f31794...@mail.kernel.org/
> 
> >     return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen);
> >  }
> > 
> 
> 
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> 
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24733356810

-- 
Thanks,
KaFai
