Re: [PATCH] ftrace: make extra rcu_is_watching() validation check optional

2024-03-26 Thread Steven Rostedt
On Tue, 26 Mar 2024 09:16:33 -0700
Andrii Nakryiko  wrote:

> > It's no different than lockdep. Test boxes should have it enabled, but
> > there's no reason to have this enabled in a production system.
> >  
> 
> I tend to agree with Steven here (which is why I sent this patch as it
> is), but I'm happy to do it as an opt-out, if Masami insists. Please
> do let me know if I need to send v2 or this one is actually the one
> we'll end up using. Thanks!

Masami,

Are you OK with just keeping it set to N.

We could have other options like PROVE_LOCKING enable it.

-- Steve



Re: [PATCH] ftrace: make extra rcu_is_watching() validation check optional

2024-03-26 Thread Andrii Nakryiko
On Mon, Mar 25, 2024 at 3:11 PM Steven Rostedt  wrote:
>
> On Mon, 25 Mar 2024 11:38:48 +0900
> Masami Hiramatsu (Google)  wrote:
>
> > On Fri, 22 Mar 2024 09:03:23 -0700
> > Andrii Nakryiko  wrote:
> >
> > > Introduce CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING config option to
> > > control whether ftrace low-level code performs additional
> > > rcu_is_watching()-based validation logic in an attempt to catch noinstr
> > > violations.
> > >
> > > This check is expected to never be true in practice and would be best
> > > controlled with extra config to let users decide if they are willing to
> > > pay the price.
> >
> > Hmm, for me, it sounds like "WARN_ON(something) never be true in practice
> > so disable it by default". I think CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING
> > is OK, but tht should be set to Y by default. If you have already verified
> > that your system never make it true and you want to optimize your ftrace
> > path, you can manually set it to N at your own risk.
> >
>
> Really, it's for debugging. I would argue that it should *not* be default y.
> Peter added this to find all the locations that could be called where RCU
> is not watching. But the issue I have is that this is that it *does cause
> overhead* with function tracing.
>
> I believe we found pretty much all locations that were an issue, and we
> should now just make it an option for developers.
>
> It's no different than lockdep. Test boxes should have it enabled, but
> there's no reason to have this enabled in a production system.
>

I tend to agree with Steven here (which is why I sent this patch as it
is), but I'm happy to do it as an opt-out, if Masami insists. Please
do let me know if I need to send v2 or this one is actually the one
we'll end up using. Thanks!

> -- Steve
>
>
> > >
> > > Cc: Steven Rostedt 
> > > Cc: Masami Hiramatsu 
> > > Cc: Paul E. McKenney 
> > > Signed-off-by: Andrii Nakryiko 
> > > ---
> > >  include/linux/trace_recursion.h |  2 +-
> > >  kernel/trace/Kconfig| 13 +
> > >  2 files changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/trace_recursion.h 
> > > b/include/linux/trace_recursion.h
> > > index d48cd92d2364..24ea8ac049b4 100644
> > > --- a/include/linux/trace_recursion.h
> > > +++ b/include/linux/trace_recursion.h
> > > @@ -135,7 +135,7 @@ extern void ftrace_record_recursion(unsigned long ip, 
> > > unsigned long parent_ip);
> > >  # define do_ftrace_record_recursion(ip, pip)   do { } while (0)
> > >  #endif
> > >
> > > -#ifdef CONFIG_ARCH_WANTS_NO_INSTR
> > > +#ifdef CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING
> > >  # define trace_warn_on_no_rcu(ip)  \
> > > ({  \
> > > bool __ret = !rcu_is_watching();\
> >
> > BTW, maybe we can add "unlikely" in the next "if" line?
> >
> > > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > > index 61c541c36596..19bce4e217d6 100644
> > > --- a/kernel/trace/Kconfig
> > > +++ b/kernel/trace/Kconfig
> > > @@ -974,6 +974,19 @@ config FTRACE_RECORD_RECURSION_SIZE
> > >   This file can be reset, but the limit can not change in
> > >   size at runtime.
> > >
> > > +config FTRACE_VALIDATE_RCU_IS_WATCHING
> > > +   bool "Validate RCU is on during ftrace recursion check"
> > > +   depends on FUNCTION_TRACER
> > > +   depends on ARCH_WANTS_NO_INSTR
> >
> >   default y
> >
> > > +   help
> > > + All callbacks that attach to the function tracing have some sort
> > > + of protection against recursion. This option performs additional
> > > + checks to make sure RCU is on when ftrace callbacks recurse.
> > > +
> > > + This will add more overhead to all ftrace-based invocations.
> >
> >   ... invocations, but keep it safe.
> >
> > > +
> > > + If unsure, say N
> >
> >   If unsure, say Y
> >
> > Thank you,
> >
> > > +
> > >  config RING_BUFFER_RECORD_RECURSION
> > > bool "Record functions that recurse in the ring buffer"
> > > depends on FTRACE_RECORD_RECURSION
> > > --
> > > 2.43.0
> > >
> >
> >
>



Re: [PATCH net-next 0/3] trace: use TP_STORE_ADDRS macro

2024-03-26 Thread Jason Xing
On Tue, Mar 26, 2024 at 9:18 PM Eric Dumazet  wrote:
>
> On Tue, Mar 26, 2024 at 11:44 AM Jason Xing  wrote:
>
> > Well, it's a pity that it seems that we are about to abandon this
> > method but it's not that friendly to the users who are unable to
> > deploy BPF...
>
> It is a pity these tracepoint patches are consuming a lot of reviewer
> time, just because
> some people 'can not deploy BPF'

Sure, not everyone can do this easily. The phenomenon still exists and
we cannot ignore it. Do you remember that about a month ago someone
submitted one patch introducing a new tracepoint and then I replied
to/asked you if it's necessary that we replace most of the tracepoints
with BPF? Now I realise and accept the fact...

I'll keep reviewing such patches and hope it can give you maintainers
a break. I don't mind taking some time to do it, after all it's not a
bad thing to help some people.

>
> Well, I came up with more ideas about how to improve the
> > trace function in recent days. The motivation of doing this is that I
> > encountered some issues which could be traced/diagnosed by using trace
> > effortlessly without writing some bpftrace codes again and again. The
> > status of trace seems not active but many people are still using it, I
> > believe.
>
> 'Writing bpftrace codes again and again' is not a good reason to add
> maintenance costs
> to linux networking stack.

I'm just saying :)



Re: [PATCH net-next 0/3] trace: use TP_STORE_ADDRS macro

2024-03-26 Thread Eric Dumazet
On Tue, Mar 26, 2024 at 11:44 AM Jason Xing  wrote:

> Well, it's a pity that it seems that we are about to abandon this
> method but it's not that friendly to the users who are unable to
> deploy BPF...

It is a pity these tracepoint patches are consuming a lot of reviewer
time, just because
some people 'can not deploy BPF'

Well, I came up with more ideas about how to improve the
> trace function in recent days. The motivation of doing this is that I
> encountered some issues which could be traced/diagnosed by using trace
> effortlessly without writing some bpftrace codes again and again. The
> status of trace seems not active but many people are still using it, I
> believe.

'Writing bpftrace codes again and again' is not a good reason to add
maintenance costs
to linux networking stack.



Re: [PATCH net-next 0/3] trace: use TP_STORE_ADDRS macro

2024-03-26 Thread Paolo Abeni
On Tue, 2024-03-26 at 18:43 +0800, Jason Xing wrote:
> On Tue, Mar 26, 2024 at 6:29 PM Paolo Abeni  wrote:
> > 
> > On Tue, 2024-03-26 at 12:14 +0800, Jason Xing wrote:
> > > On Mon, Mar 25, 2024 at 11:43 AM Jason Xing  
> > > wrote:
> > > > 
> > > > From: Jason Xing 
> > > > 
> > > > Using the macro for other tracepoints use to be more concise.
> > > > No functional change.
> > > > 
> > > > Jason Xing (3):
> > > >   trace: move to TP_STORE_ADDRS related macro to net_probe_common.h
> > > >   trace: use TP_STORE_ADDRS() macro in inet_sk_error_report()
> > > >   trace: use TP_STORE_ADDRS() macro in inet_sock_set_state()
> > > > 
> > > >  include/trace/events/net_probe_common.h | 29 
> > > >  include/trace/events/sock.h | 35 -
> > > 
> > > I just noticed that some trace files in include/trace directory (like
> > > net_probe_common.h, sock.h, skb.h, net.h, sock.h, udp.h, sctp.h,
> > > qdisc.h, neigh.h, napi.h, icmp.h, ...) are not owned by networking
> > > folks while some files (like tcp.h) have been maintained by specific
> > > maintainers/experts (like Eric) because they belong to one specific
> > > area. I wonder if we can get more networking guys involved in net
> > > tracing.
> > > 
> > > I'm not sure if 1) we can put those files into the "NETWORKING
> > > [GENERAL]" category, or 2) we can create a new category to include
> > > them all.
> > 
> > I think all the file you mentioned are not under networking because of
> > MAINTAINER file inaccuracy, and we could move there them accordingly.
> 
> Yes, they are not under the networking category currently. So how
> could we move them? The MAINTAINER file doesn't have all the specific
> categories which are suitable for each of the trace files.

I think there is no need to other categories: adding the explicit 'F:'
entries for such files in the NETWORKING [GENERAL] section should fit.

> > > I know people start using BPF to trace them all instead, but I can see
> > > some good advantages of those hooks implemented in the kernel, say:
> > > 1) help those machines which are not easy to use BPF tools.
> > > 2) insert the tracepoint in the middle of some functions which cannot
> > > be replaced by bpf kprobe.
> > > 3) if we have enough tracepoints, we can generate a timeline to
> > > know/detect which flow/skb spends unexpected time at which point.
> > > ...
> > > We can do many things in this area, I think :)
> > > 
> > > What do you think about this, Jakub, Paolo, Eric ?
> > 
> > I agree tracepoints are useful, but I think the general agreement is
> > that they are the 'old way', we should try to avoid their
> > proliferation.
> 
> Well, it's a pity that it seems that we are about to abandon this
> method but it's not that friendly to the users who are unable to
> deploy BPF... Well, I came up with more ideas about how to improve the
> trace function in recent days. The motivation of doing this is that I
> encountered some issues which could be traced/diagnosed by using trace
> effortlessly without writing some bpftrace codes again and again. The
> status of trace seems not active but many people are still using it, I
> believe.

I don't think we should abandon it completely. My understanding is that
we should thing carefully before adding new tracepoints, and generally
speaking, avoid adding 'too many' of them.

Cheers,

Paolo





Re: [PATCH net-next v2 3/3] tcp: add location into reset trace process

2024-03-26 Thread Paolo Abeni
On Mon, 2024-03-25 at 14:28 +0800, Jason Xing wrote:
> From: Jason Xing 
> 
> In addition to knowing the 4-tuple of the flow which generates RST,
> the reason why it does so is very important because we have some
> cases where the RST should be sent and have no clue which one
> exactly.
> 
> Adding location of reset process can help us more, like what
> trace_kfree_skb does.
> 
> Signed-off-by: Jason Xing 
> ---
>  include/trace/events/tcp.h | 14 ++
>  net/ipv4/tcp_ipv4.c|  2 +-
>  net/ipv4/tcp_output.c  |  2 +-
>  net/ipv6/tcp_ipv6.c|  2 +-
>  4 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index a13eb2147a02..8f6c1a07503c 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -109,13 +109,17 @@ DEFINE_EVENT(tcp_event_sk_skb, tcp_retransmit_skb,
>   */
>  TRACE_EVENT(tcp_send_reset,
>  
> - TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
> + TP_PROTO(
> + const struct sock *sk,
> + const struct sk_buff *skb,
> + void *location),

Very minor nit: the above lines should be aligned with the open
bracket.

No need to repost just for this, but let's wait for Eric's feedback.

Cheers,

Paolo




Re: [PATCH net-next 0/3] trace: use TP_STORE_ADDRS macro

2024-03-26 Thread patchwork-bot+netdevbpf
Hello:

This series was applied to netdev/net-next.git (main)
by Paolo Abeni :

On Mon, 25 Mar 2024 11:43:44 +0800 you wrote:
> From: Jason Xing 
> 
> Using the macro for other tracepoints use to be more concise.
> No functional change.
> 
> Jason Xing (3):
>   trace: move to TP_STORE_ADDRS related macro to net_probe_common.h
>   trace: use TP_STORE_ADDRS() macro in inet_sk_error_report()
>   trace: use TP_STORE_ADDRS() macro in inet_sock_set_state()
> 
> [...]

Here is the summary with links:
  - [net-next,1/3] trace: move to TP_STORE_ADDRS related macro to 
net_probe_common.h
https://git.kernel.org/netdev/net-next/c/b3af9045b482
  - [net-next,2/3] trace: use TP_STORE_ADDRS() macro in inet_sk_error_report()
https://git.kernel.org/netdev/net-next/c/a24c855a5ef2
  - [net-next,3/3] trace: use TP_STORE_ADDRS() macro in inet_sock_set_state()
https://git.kernel.org/netdev/net-next/c/646700ce23f4

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html





Re: [PATCH net-next 0/3] trace: use TP_STORE_ADDRS macro

2024-03-26 Thread Jason Xing
On Tue, Mar 26, 2024 at 6:29 PM Paolo Abeni  wrote:
>
> On Tue, 2024-03-26 at 12:14 +0800, Jason Xing wrote:
> > On Mon, Mar 25, 2024 at 11:43 AM Jason Xing  
> > wrote:
> > >
> > > From: Jason Xing 
> > >
> > > Using the macro for other tracepoints use to be more concise.
> > > No functional change.
> > >
> > > Jason Xing (3):
> > >   trace: move to TP_STORE_ADDRS related macro to net_probe_common.h
> > >   trace: use TP_STORE_ADDRS() macro in inet_sk_error_report()
> > >   trace: use TP_STORE_ADDRS() macro in inet_sock_set_state()
> > >
> > >  include/trace/events/net_probe_common.h | 29 
> > >  include/trace/events/sock.h | 35 -
> >
> > I just noticed that some trace files in include/trace directory (like
> > net_probe_common.h, sock.h, skb.h, net.h, sock.h, udp.h, sctp.h,
> > qdisc.h, neigh.h, napi.h, icmp.h, ...) are not owned by networking
> > folks while some files (like tcp.h) have been maintained by specific
> > maintainers/experts (like Eric) because they belong to one specific
> > area. I wonder if we can get more networking guys involved in net
> > tracing.
> >
> > I'm not sure if 1) we can put those files into the "NETWORKING
> > [GENERAL]" category, or 2) we can create a new category to include
> > them all.
>
> I think all the file you mentioned are not under networking because of
> MAINTAINER file inaccuracy, and we could move there them accordingly.

Yes, they are not under the networking category currently. So how
could we move them? The MAINTAINER file doesn't have all the specific
categories which are suitable for each of the trace files.

> >
> > I know people start using BPF to trace them all instead, but I can see
> > some good advantages of those hooks implemented in the kernel, say:
> > 1) help those machines which are not easy to use BPF tools.
> > 2) insert the tracepoint in the middle of some functions which cannot
> > be replaced by bpf kprobe.
> > 3) if we have enough tracepoints, we can generate a timeline to
> > know/detect which flow/skb spends unexpected time at which point.
> > ...
> > We can do many things in this area, I think :)
> >
> > What do you think about this, Jakub, Paolo, Eric ?
>
> I agree tracepoints are useful, but I think the general agreement is
> that they are the 'old way', we should try to avoid their
> proliferation.

Well, it's a pity that it seems that we are about to abandon this
method but it's not that friendly to the users who are unable to
deploy BPF... Well, I came up with more ideas about how to improve the
trace function in recent days. The motivation of doing this is that I
encountered some issues which could be traced/diagnosed by using trace
effortlessly without writing some bpftrace codes again and again. The
status of trace seems not active but many people are still using it, I
believe.

Thanks,
Jason

>
> Cheers,
>
> Paolo
>



Re: [PATCH net-next 0/3] trace: use TP_STORE_ADDRS macro

2024-03-26 Thread Paolo Abeni
On Tue, 2024-03-26 at 12:14 +0800, Jason Xing wrote:
> On Mon, Mar 25, 2024 at 11:43 AM Jason Xing  wrote:
> > 
> > From: Jason Xing 
> > 
> > Using the macro for other tracepoints use to be more concise.
> > No functional change.
> > 
> > Jason Xing (3):
> >   trace: move to TP_STORE_ADDRS related macro to net_probe_common.h
> >   trace: use TP_STORE_ADDRS() macro in inet_sk_error_report()
> >   trace: use TP_STORE_ADDRS() macro in inet_sock_set_state()
> > 
> >  include/trace/events/net_probe_common.h | 29 
> >  include/trace/events/sock.h | 35 -
> 
> I just noticed that some trace files in include/trace directory (like
> net_probe_common.h, sock.h, skb.h, net.h, sock.h, udp.h, sctp.h,
> qdisc.h, neigh.h, napi.h, icmp.h, ...) are not owned by networking
> folks while some files (like tcp.h) have been maintained by specific
> maintainers/experts (like Eric) because they belong to one specific
> area. I wonder if we can get more networking guys involved in net
> tracing.
> 
> I'm not sure if 1) we can put those files into the "NETWORKING
> [GENERAL]" category, or 2) we can create a new category to include
> them all.

I think all the file you mentioned are not under networking because of
MAINTAINER file inaccuracy, and we could move there them accordingly.
> 
> I know people start using BPF to trace them all instead, but I can see
> some good advantages of those hooks implemented in the kernel, say:
> 1) help those machines which are not easy to use BPF tools.
> 2) insert the tracepoint in the middle of some functions which cannot
> be replaced by bpf kprobe.
> 3) if we have enough tracepoints, we can generate a timeline to
> know/detect which flow/skb spends unexpected time at which point.
> ...
> We can do many things in this area, I think :)
> 
> What do you think about this, Jakub, Paolo, Eric ?

I agree tracepoints are useful, but I think the general agreement is
that they are the 'old way', we should try to avoid their
proliferation. 

Cheers,

Paolo