On 4/21/26 11:58 PM, KaFai Wan wrote:
A BPF TCP congestion control program can call bpf_setsockopt() from
its callbacks. In current kernels, if it calls
bpf_setsockopt(TCP_NODELAY) from cwnd_event_tx_start(), the call can
re-enter the TCP transmit path before the outer tcp_transmit_skb()
has completed and advanced the send head.

This can re-trigger CA_EVENT_TX_START and lead to unbounded recursion:

   tcp_transmit_skb()
     -> tcp_event_data_sent()
       -> tcp_ca_event(sk, CA_EVENT_TX_START)
         -> cwnd_event_tx_start()
           -> bpf_setsockopt(TCP_NODELAY)
             -> tcp_push_pending_frames()
               -> tcp_write_xmit()
                 -> tcp_transmit_skb()

Each re-entry consumes additional stack, so the recursion can overflow
the kernel stack.

Reject TCP_NODELAY with -EOPNOTSUPP for bpf-tcp-cc by introducing
a dedicated setsockopt proto for BPF_PROG_TYPE_STRUCT_OPS TCP
congestion control programs.
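
The intended behaviour can be sketched in user space as below. This is
only a model under stated assumptions, not the kernel change itself:
model_tcp_cc_setsockopt() and model_generic_setsockopt() are
hypothetical names standing in for the dedicated tcp-cc setsockopt
entry point and the generic bpf_setsockopt() path, and the generic path
is stubbed to always succeed.

```c
/* User-space model only; not kernel code. All function names are hypothetical. */
#include <errno.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY, TCP_MAXSEG */

/* Stand-in for the generic bpf_setsockopt() path; always succeeds here. */
static int model_generic_setsockopt(int level, int optname)
{
	(void)level;
	(void)optname;
	return 0;
}

/*
 * Hypothetical tcp-cc-specific wrapper: refuse TCP_NODELAY, since setting
 * it may push pending frames and re-enter the transmit path, and forward
 * every other option to the generic path unchanged.
 */
static int model_tcp_cc_setsockopt(int level, int optname)
{
	if (level == IPPROTO_TCP && optname == TCP_NODELAY)
		return -EOPNOTSUPP;
	return model_generic_setsockopt(level, optname);
}
```

In this model only the (IPPROTO_TCP, TCP_NODELAY) pair is refused;
other options, such as TCP_MAXSEG, still reach the generic path.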

Fixes: 7e41df5dbba2 ("bpf: Add a few optnames to bpf_setsockopt")
Suggested-by: Martin KaFai Lau <[email protected]>
Signed-off-by: KaFai Wan <[email protected]>
Reviewed-by: Jiayuan Chen <[email protected]>