Re: [PATCH v3 net-next 2/5] net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint
On Sat, Dec 30, 2017 at 7:06 PM, Yafang Shao <laoar.s...@gmail.com> wrote: > On Sun, Dec 31, 2017 at 6:33 AM, Brendan Gregg > <brendan.d.gr...@gmail.com> wrote: >> On Tue, Dec 19, 2017 at 7:12 PM, Yafang Shao <laoar.s...@gmail.com> wrote: >>> As sk_state is a common field for struct sock, so the state >>> transition tracepoint should not be a TCP specific feature. >>> Currently it traces all AF_INET state transition, so I rename this >>> tracepoint to inet_sock_set_state tracepoint with some minor changes and >>> move it >>> into trace/events/sock.h. >> >> The tcp:tcp_set_state probe is tcp_set_state(), so it's only going to >> fire for TCP sessions. It's not broken, and we could add a >> sctp:sctp_set_state as well. Replacing tcp:tcp_set_state with >> inet_sk_set_state is feeling like we might be baking too much >> implementation detail into the tracepoint API. >> >> If we must have inet_sk_set_state, then must we also delete >> tcp:tcp_set_state? >> > > Hi Brendan, > > The reason we have to make this change could be got from this mail > thread, https://patchwork.kernel.org/patch/10099243/ . > > The original tcp:tcp_set_state probe doesn't traced all TCP state transitions. > There're some state transitions in inet_connection_sock.c and > inet_hashtables.c are missed. > So we have to place this probe into these two files to fix the issue. > But as inet_connection_sock.c and inet_hashtables.c are common files > for all IPv4 protocols, not only for TCP, so it is not proper to place > a tcp_ function in these two files. > That's why we decide to rename tcp:tcp_set_state probe to > sock:inet_sock_set_state. It kinda feels like we are fixing one exposing-implementation problem (the missing state changes, which I'm happy to see fixed), by exposing another (there's no tcp:tcp_set_state because we don't want to put tcp functions in inet*.c files). Anyway... If I'm to use sock:inet_sock_set_state for TCP tracing, I'd like sk->sk_protocol exposed as a tracepoint argument so I can match on IPPROTO_TCP. Otherwise I'll have to keep digging it out of (void *)skaddr. (And if we're adding arguments, maybe consider sk_family as well, to make it easier to see which address arguments to use). Brendan
Re: [PATCH v3 net-next 2/5] net: tracepoint: replace tcp_set_state tracepoint with inet_sock_set_state tracepoint
On Tue, Dec 19, 2017 at 7:12 PM, Yafang Shaowrote: > As sk_state is a common field for struct sock, so the state > transition tracepoint should not be a TCP specific feature. > Currently it traces all AF_INET state transition, so I rename this > tracepoint to inet_sock_set_state tracepoint with some minor changes and move > it > into trace/events/sock.h. The tcp:tcp_set_state probe is tcp_set_state(), so it's only going to fire for TCP sessions. It's not broken, and we could add a sctp:sctp_set_state as well. Replacing tcp:tcp_set_state with inet_sk_set_state is feeling like we might be baking too much implementation detail into the tracepoint API. If we must have inet_sk_set_state, then must we also delete tcp:tcp_set_state? Brendan > We dont need to create a file named trace/events/inet_sock.h for this one > single > tracepoint. > > Two helpers are introduced to trace sk_state transition > - void inet_sk_state_store(struct sock *sk, int newstate); > - void inet_sk_set_state(struct sock *sk, int state); > As trace header should not be included in other header files, > so they are defined in sock.c. > > The protocol such as SCTP maybe compiled as a ko, hence export > inet_sk_set_state(). >[...]
Re: [Patch net-next v3] tcp: add a tracepoint for tcp retransmission
On Fri, Oct 13, 2017 at 3:09 PM, Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote: > On Fri, Oct 13, 2017 at 01:50:44PM -0700, Brendan Gregg wrote: >> On Fri, Oct 13, 2017 at 1:03 PM, Cong Wang <xiyou.wangc...@gmail.com> wrote: >> > We need a real-time notification for tcp retransmission >> > for monitoring. >> > >> > Of course we could use ftrace to dynamically instrument this >> > kernel function too, however we can't retrieve the connection >> > information at the same time, for example perf-tools [1] reads >> > /proc/net/tcp for socket details, which is slow when we have >> > a lots of connections. >> > >> > Therefore, this patch adds a tracepoint for __tcp_retransmit_skb() >> > and exposes src/dst IP addresses and ports of the connection. >> > This also makes it easier to integrate into perf. >> > >> > Note, I expose both IPv4 and IPv6 addresses at the same time: >> > for a IPv4 socket, v4 mapped address is used as IPv6 addresses, >> > for a IPv6 socket, LOOPBACK4_IPV6 is already filled by kernel. >> > Also, add sk and skb pointers as they are useful for BPF. >> >> Thanks, a TCP retransmit tracepoint would be great. (tcp_set_state >> would be highly useful too, which Alexei already has in his list). >> >> Should skp->__sk_common.skc_state be included in the format string, so >> we don't have to always dig it out of the skaddr? For retransmits I >> always want to know the TCP state, to determine if it is ESTABLISHED >> (packet drop) or SYN_SENT (backlog full) or something else. > > let's not expose internal socket fields into tp fields. > Few people still believe that tp fields are abi, so to be safe > no such fields should be exposed. > It's trivial enough to read sk_state from bpf program > with bpf_probe_read(). Ah, right, the number mapping for TCP_ESTABLISHED and friends is a Linux implementation detail, and not from the RFCs. Ok, I can dig it from the skp instead. > >> We probably need a tracepoint for tcp_send_loss_probe() (TLP) as well, >> for tracing at the same time as retransmits (like my tools do), but >> that can be added later. > > hmm. why? > This single tracepoint will cover both cases of retransmits. I don't think tcp_send_loss_probe() TLP goes through __tcp_retransmit_skb(): look at the path to bumping LINUX_MIB_TCPLOSSPROBES. I was thinking that later on we might want to add a tcp:tcp_send_tlp tracepoint, in addition to this tcp:tcp_retransmit_skb tracepoint, for investigating the same kind of issues: packet loss. This existing tcp:tcp_retransmit_skb tracepoint patch is ok. Acked-by: Brendan Gregg <bgr...@netflix.com> (with or without %pI6c) Brendan
Re: [Patch net-next v3] tcp: add a tracepoint for tcp retransmission
On Fri, Oct 13, 2017 at 1:03 PM, Cong Wangwrote: > We need a real-time notification for tcp retransmission > for monitoring. > > Of course we could use ftrace to dynamically instrument this > kernel function too, however we can't retrieve the connection > information at the same time, for example perf-tools [1] reads > /proc/net/tcp for socket details, which is slow when we have > a lots of connections. > > Therefore, this patch adds a tracepoint for __tcp_retransmit_skb() > and exposes src/dst IP addresses and ports of the connection. > This also makes it easier to integrate into perf. > > Note, I expose both IPv4 and IPv6 addresses at the same time: > for a IPv4 socket, v4 mapped address is used as IPv6 addresses, > for a IPv6 socket, LOOPBACK4_IPV6 is already filled by kernel. > Also, add sk and skb pointers as they are useful for BPF. Thanks, a TCP retransmit tracepoint would be great. (tcp_set_state would be highly useful too, which Alexei already has in his list). Should skp->__sk_common.skc_state be included in the format string, so we don't have to always dig it out of the skaddr? For retransmits I always want to know the TCP state, to determine if it is ESTABLISHED (packet drop) or SYN_SENT (backlog full) or something else. We probably need a tracepoint for tcp_send_loss_probe() (TLP) as well, for tracing at the same time as retransmits (like my tools do), but that can be added later. Brendan
Re: [PATCH v2 net-next 0/6] perf, bpf: add support for bpf in sw/hw perf_events
On Wed, Aug 31, 2016 at 2:50 PM, Alexei Starovoitov <a...@fb.com> wrote: > Hi Peter, Dave, > > this patch set is a follow up to the discussion: > https://lkml.kernel.org/r/20160804142853.GO6862%20()%20twins%20!%20programming%20!%20kicks-ass%20!%20net > It turned out to be simpler than what we discussed. > > Patches 1-3 is bpf-side prep for the main patch 4 > that adds bpf program as an overflow_handler to sw and hw perf_events. > Peter, please review. > > Patches 5 and 6 are examples from myself and Brendan. > > v1-v2: fixed issues spotted by Peter and Daniel. Thanks Alexei! Tested-by: Brendan Gregg <bgr...@netflix.com> Brendan
Re: [PATCH net-next 0/6] perf, bpf: add support for bpf in sw/hw perf_events
On Mon, Aug 29, 2016 at 5:19 AM, Peter Zijlstrawrote: > > On Fri, Aug 26, 2016 at 07:31:18PM -0700, Alexei Starovoitov wrote: > > Hi Peter, Dave, > > > > this patch set is a follow up to the discussion: > > https://lkml.org/lkml/2016/8/4/304 > > It turned out to be simpler than what we discussed. > > > > Patches 1-3 is a bpf-side prep for the main patch 4 > > that adds bpf program as an overflow_handler to sw and hw perf_events. > > Peter, please review. > > > > Patches 5 and 6 are tests/examples from myself and Brendan. > > Brendan, so this works for you without extra hacks required? Yes, thanks for checking, I've done both IP and stack sampling so far with it. Brendan