> diff --git a/net/core/filter.c b/net/core/filter.c
> index 5fa9189eb772..96849f4c1fbc 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -5833,6 +5833,12 @@ BPF_CALL_5(bpf_sock_ops_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
>       if (!is_locked_tcp_sock_ops(bpf_sock))
>               return -EOPNOTSUPP;
>
> +     /* TCP_NODELAY triggers tcp_push_pending_frames() and re-enters these callbacks. */
> +     if ((bpf_sock->op == BPF_SOCK_OPS_HDR_OPT_LEN_CB ||
> +          bpf_sock->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB) &&
> +         level == SOL_TCP && optname == TCP_NODELAY)
> +             return -EOPNOTSUPP;
> +

Does this same recursion vulnerability exist for BPF TCP congestion
control algorithms using BPF_PROG_TYPE_STRUCT_OPS?

If a BPF congestion control algorithm invokes bpf_setsockopt(TCP_NODELAY)
from its cwnd_event callback when handling CA_EVENT_TX_START, could it
trigger unbounded recursion through this path:

BPF_PROG_TYPE_STRUCT_OPS cwnd_event callback (CA_EVENT_TX_START)
  -> bpf_setsockopt(TCP_NODELAY)
    -> tcp_push_pending_frames()
      -> tcp_write_xmit()
        -> tcp_transmit_skb()
          -> tcp_event_data_sent()
            -> cwnd_event()
              -> BPF congestion control cwnd_event callback

The author confirmed this issue in the v3 discussion thread, and Martin
KaFai Lau suggested adding a new func_proto
(bpf_sk_setsockopt_nodelay_proto) that rejects TCP_NODELAY for all
struct_ops programs. This patch, however, addresses only the
BPF_SOCK_OPS callbacks.

Reference: 
https://lore.kernel.org/bpf/[email protected]/
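
As a rough illustration of the func_proto idea above, here is a standalone
userspace model, not kernel code. It assumes the new proto would point at a
thin wrapper that filters TCP_NODELAY before delegating to the existing
setsockopt path. The names model_sk_setsockopt(),
model_sk_setsockopt_no_nodelay() and the SKETCH_ constants are hypothetical
stand-ins and are not taken from the thread.

```c
#include <errno.h>

/* Hypothetical stand-ins so the sketch builds outside the kernel tree. */
#define SKETCH_SOL_TCP     6
#define SKETCH_TCP_NODELAY 1

/* Stand-in for the existing setsockopt path; it always succeeds here. */
static int model_sk_setsockopt(int level, int optname)
{
	(void)level;
	(void)optname;
	return 0;
}

/*
 * Assumed shape of the helper behind a proto such as
 * bpf_sk_setsockopt_nodelay_proto: refuse TCP_NODELAY up front so a
 * struct_ops callback cannot reach tcp_push_pending_frames() through it,
 * and pass every other option through unchanged.
 */
static int model_sk_setsockopt_no_nodelay(int level, int optname)
{
	if (level == SKETCH_SOL_TCP && optname == SKETCH_TCP_NODELAY)
		return -EOPNOTSUPP;

	return model_sk_setsockopt(level, optname);
}
```

In this model only the (SOL_TCP, TCP_NODELAY) pair is refused, so options
that never push pending frames keep working from struct_ops programs.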


Does this fix also need to block TCP_CORK when the value is 0?

When bpf_setsockopt() is called with SOL_TCP and TCP_CORK with value 0
(disabling cork), the call path goes through __tcp_sock_set_cork() in
net/ipv4/tcp.c, which unconditionally calls tcp_push_pending_frames(sk)
when cork is disabled:

__tcp_sock_set_cork()
  -> tcp_push_pending_frames()
    -> tcp_current_mss()
      -> tcp_established_options()
        -> bpf_skops_hdr_opt_len()
          -> BPF_SOCK_OPS_HDR_OPT_LEN_CB

This triggers the same recursion chain described in the commit message,
but the patch only blocks TCP_NODELAY, not TCP_CORK with value 0.

Reference: 
https://lore.kernel.org/bpf/5fea6b5e9d1a13e72edf8222a8ebe50cbb660b37e62b87b65dab6f6f31794...@mail.kernel.org/
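
One way the added check could be widened to cover the TCP_CORK case,
sketched as a standalone userspace model rather than kernel code. The helper
name sockopt_would_push_frames() and the MODEL_ constants are hypothetical
stand-ins, and the sketch assumes optval has already been read and validated
as an int, which the real helper would have to do first.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins so the sketch builds outside the kernel tree;
 * they do not claim to match the kernel's numeric values.
 */
enum model_sock_op {
	MODEL_OP_OTHER,
	MODEL_OP_HDR_OPT_LEN_CB,
	MODEL_OP_WRITE_HDR_OPT_CB,
};

#define MODEL_SOL_TCP     6
#define MODEL_TCP_NODELAY 1
#define MODEL_TCP_CORK    3

/*
 * Returns true when the request comes from a header-option callback and
 * would reach tcp_push_pending_frames(): TCP_NODELAY with any value, as
 * in the patch, plus TCP_CORK being cleared (value 0).
 */
static bool sockopt_would_push_frames(enum model_sock_op op, int level,
				      int optname, int optval)
{
	if (op != MODEL_OP_HDR_OPT_LEN_CB && op != MODEL_OP_WRITE_HDR_OPT_CB)
		return false;
	if (level != MODEL_SOL_TCP)
		return false;
	if (optname == MODEL_TCP_NODELAY)
		return true;

	return optname == MODEL_TCP_CORK && optval == 0;
}
```

A caller modeled on the patched helper would return -EOPNOTSUPP whenever
this predicate is true. Setting TCP_CORK to a non-zero value stays allowed
in the model, since only disabling cork pushes pending frames.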

>       return _bpf_setsockopt(bpf_sock->sk, level, optname, optval, optlen);
>  }
>


---
AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24733356810
