On Mon, Jan 20, 2025 at 8:03 PM Breno Leitao <[email protected]> wrote:
>
> Add a tracepoint to monitor TCP congestion window adjustments through the
> tcp_cwnd_reduction() function. This tracepoint helps track:
> - TCP window size fluctuations
> - Active socket behavior
> - Congestion window reduction events
>
> Meta has been using BPF programs to monitor this function for years. By
> adding a proper tracepoint, we provide a stable API for all users who
> need to monitor TCP congestion window behavior.
>
> The tracepoint captures:
> - Socket source and destination IPs
> - Number of newly acknowledged packets
> - Number of newly lost packets
> - Packets in flight
>
> Here is an example of a tracepoint when viewed in the trace buffer:
>
> tcp_cwnd_reduction: src=[2401:db00:3021:10e1:face:0:32a:0]:45904
> dest=[2401:db00:3021:1fb:face:0:23:0]:5201 newly_lost=0 newly_acked_sacked=27
> in_flight=34
>
> CC: Yonghong Song <[email protected]>
> CC: Song Liu <[email protected]>
> CC: Martin KaFai Lau <[email protected]>
> Signed-off-by: Breno Leitao <[email protected]>
> ---
>  include/trace/events/tcp.h | 34 ++++++++++++++++++++++++++++++++++
>  net/ipv4/tcp_input.c       |  2 ++
>  2 files changed, 36 insertions(+)
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index
> a27c4b619dffd7dcc72fffa71bf0fd5e34fe6681..b3a636658b39721cca843c0000eaa573cf4d09d5
> 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -259,6 +259,40 @@ TRACE_EVENT(tcp_retransmit_synack,
>                   __entry->saddr_v6, __entry->daddr_v6)
>  );
>
> +TRACE_EVENT(tcp_cwnd_reduction,
> +
> +       TP_PROTO(const struct sock *sk, const int newly_acked_sacked,
> +                const int newly_lost, const int flag),
> +
> +       TP_ARGS(sk, newly_acked_sacked, newly_lost, flag),
> +
> +       TP_STRUCT__entry(
> +               __array(__u8, saddr, sizeof(struct sockaddr_in6))
> +               __array(__u8, daddr, sizeof(struct sockaddr_in6))
> +               __field(int, in_flight)
> +
> +               __field(int, newly_acked_sacked)
> +               __field(int, newly_lost)
> +       ),
> +
> +       TP_fast_assign(
> +               const struct inet_sock *inet = inet_sk(sk);
> +               const struct tcp_sock *tp = tcp_sk(sk);
> +
> +               memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
> +               memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
> +
> +               TP_STORE_ADDR_PORTS(__entry, inet, sk);
> +               __entry->in_flight = tcp_packets_in_flight(tp);
> +               __entry->newly_lost = newly_lost;
> +               __entry->newly_acked_sacked = newly_acked_sacked;
> +       ),
> +
> +       TP_printk("src=%pISpc dest=%pISpc newly_lost=%d newly_acked_sacked=%d
> in_flight=%d",
> +                 __entry->saddr, __entry->daddr, __entry->newly_lost,
> +                 __entry->newly_acked_sacked, __entry->in_flight)
> +);
> +
>  #include <trace/events/net_probe_common.h>
>
>  TRACE_EVENT(tcp_probe,
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index
> 4811727b8a02258ec6fa1fd129beecf7cbb0f90e..fc88c511e81bc12ec57e8dc3e9185a920d1bd079
> 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2710,6 +2710,8 @@ void tcp_cwnd_reduction(struct sock *sk, int
> newly_acked_sacked, int newly_lost,
>         if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd))
>                 return;
>
> +       trace_tcp_cwnd_reduction(sk, newly_acked_sacked, newly_lost, flag);
> +

Are there any other reasons for introducing a new tracepoint here?
AFAIK, it can easily be replaced by a BPF program or script that
monitors the same position.

Thanks,
Jason
