Re: [PATCH] bpf/btf: Move tracing BTF APIs to the BTF library
On 10/10/2023 14:54, Masami Hiramatsu (Google) wrote: > From: Masami Hiramatsu (Google) > > Move the BTF APIs used in tracing to the BTF library code for sharing it > with others. > Previously, to avoid complex dependency in a series I made it on the > tracing tree, but now it is a good time to move it to BPF tree because > these functions are pure BTF functions. > Makes sense to me. Two very small things - usual practice for bpf-related changes is to specify "PATCH bpf-next" for changes like this that target the -next tree. Other thing is I'm reasonably sure no functional changes are intended - it's basically just a matter of moving code from trace_btf -> btf - but would be good to confirm that no functional changes are intended or similar in the commit message. It's sort of implicit when you say "move the BTF APIs", but would be good to confirm. > Signed-off-by: Masami Hiramatsu (Google) Reviewed-by: Alan Maguire > --- > include/linux/btf.h| 24 + > kernel/bpf/btf.c | 115 + > kernel/trace/Makefile |1 > kernel/trace/trace_btf.c | 122 > > kernel/trace/trace_btf.h | 11 > kernel/trace/trace_probe.c |2 - > 6 files changed, 140 insertions(+), 135 deletions(-) > delete mode 100644 kernel/trace/trace_btf.c > delete mode 100644 kernel/trace/trace_btf.h > > diff --git a/include/linux/btf.h b/include/linux/btf.h > index 928113a80a95..8372d93ea402 100644 > --- a/include/linux/btf.h > +++ b/include/linux/btf.h > @@ -507,6 +507,14 @@ btf_get_prog_ctx_type(struct bpf_verifier_log *log, > const struct btf *btf, > int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type > prog_type); > bool btf_types_are_same(const struct btf *btf1, u32 id1, > const struct btf *btf2, u32 id2); > +const struct btf_type *btf_find_func_proto(const char *func_name, > +struct btf **btf_p); > +const struct btf_param *btf_get_func_param(const struct btf_type *func_proto, > +s32 *nr); > +const struct btf_member *btf_find_struct_member(struct btf *btf, > + const struct btf_type *type, > + 
const char *member_name, > + u32 *anon_offset); > #else > static inline const struct btf_type *btf_type_by_id(const struct btf *btf, > u32 type_id) > @@ -559,6 +567,22 @@ static inline bool btf_types_are_same(const struct btf > *btf1, u32 id1, > { > return false; > } > +static inline const struct btf_type *btf_find_func_proto(const char > *func_name, > + struct btf **btf_p) > +{ > + return NULL; > +} > +static inline const struct btf_param * > +btf_get_func_param(const struct btf_type *func_proto, s32 *nr) > +{ > + return NULL; > +} > +static inline const struct btf_member * > +btf_find_struct_member(struct btf *btf, const struct btf_type *type, > +const char *member_name, u32 *anon_offset) > +{ > + return NULL; > +} > #endif > > static inline bool btf_type_is_struct_ptr(struct btf *btf, const struct > btf_type *t) > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > index 8090d7fb11ef..e5cbf3b31b78 100644 > --- a/kernel/bpf/btf.c > +++ b/kernel/bpf/btf.c > @@ -912,6 +912,121 @@ static const struct btf_type > *btf_type_skip_qualifiers(const struct btf *btf, > return t; > } > > +/* > + * Find a function proto type by name, and return the btf_type with its btf > + * in *@btf_p. Return NULL if not found. > + * Note that caller has to call btf_put(*@btf_p) after using the btf_type. > + */ > +const struct btf_type *btf_find_func_proto(const char *func_name, struct btf > **btf_p) > +{ > + const struct btf_type *t; > + s32 id; > + > + id = bpf_find_btf_id(func_name, BTF_KIND_FUNC, btf_p); > + if (id < 0) > + return NULL; > + > + /* Get BTF_KIND_FUNC type */ > + t = btf_type_by_id(*btf_p, id); > + if (!t || !btf_type_is_func(t)) > + goto err; > + > + /* The type of BTF_KIND_FUNC is BTF_KIND_FUNC_PROTO */ > + t = btf_type_by_id(*btf_p, t->type); > + if (!t || !btf_type_is_func_proto(t)) > + goto err; > + > + return t; > +err: > + btf_put(*btf_p); > + return NULL; > +} > + > +/* > + * Get function parameter with the number of parameters. &g
Re: [PATCH 0/4] tracing: improve symbolic printing
On 04/10/2023 22:43, Steven Rostedt wrote: > On Wed, 4 Oct 2023 22:35:07 +0100 > Alan Maguire wrote: > >> One thing we've heard from some embedded folks [1] is that having >> kernel BTF loadable as a separate module (rather than embedded in >> vmlinux) would help, as there are size limits on vmlinux that they can >> workaround by having modules on a different partition. We're hoping >> to get that working soon. I was wondering if you see other issues around >> BTF adoption for embedded systems that we could put on the to-do list? >> Not necessarily for this particular use-case (since there are >> complications with trace data as you describe), but just trying to make >> sure we can remove barriers to BTF adoption where possible. > > I wonder how easy is it to create subsets of BTF. For one thing, in the > future we want to be able to trace the arguments of all functions. That is, > tracing all functions at the same time (function tracer) and getting the > arguments within the trace. > > This would only require information about functions and their arguments, > which would be very useful. Is BTF easy to break apart? That is, just > generate the information needed for function arguments? > There has been a fair bit of effort around this from the userspace side; the BTF gen efforts were focused around applications carrying the minimum BTF for their needs, so just the structures needed by the particular BPF programs rather than the full set of vmlinux structures for example [1]. Parsing BTF in-kernel to pull out the BTF functions (BTF_KIND_FUNC), their prototypes (BTF_KIND_FUNC_PROTO) and all associated parameters would be pretty straightforward I think, especially if you don't need the structures that are passed via pointers. So if you're starting with the full BTF, creating a subset for use in tracing would be reasonably straightforward. 
My personal preference would always be to have the full BTF where possible,
but if that wasn't feasible on some systems we'd need to add some options
to pahole/libbpf to support such trimming during the DWARF->BTF translation
process.

Alan

[1] https://lore.kernel.org/bpf/20220209222646.348365-7-mauri...@kinvolk.io/

> Note, pretty much all functions do not pass structures by values, and this
> would not need to know the contents of a pointer to a structure. This would
> mean that structure layout information is not needed.
>
> -- Steve
>
Re: [PATCH 0/4] tracing: improve symbolic printing
On 04/10/2023 18:29, Steven Rostedt wrote: > On Wed, 4 Oct 2023 09:54:31 -0700 > Jakub Kicinski wrote: > >> On Wed, 4 Oct 2023 12:35:24 -0400 Steven Rostedt wrote: Potentially naive question - the trace point holds enum skb_drop_reason. The user space can get the names from BTF. Can we not teach user space to generically look up names of enums in BTF? >>> >>> That puts a hard requirement to include BTF in builds where it was not >>> needed before. I really do not want to build with BTF just to get access to >>> these symbols. And since this is used by the embedded world, and BTF is >>> extremely bloated, the short answer is "No". >> >> Dunno. BTF is there most of the time. It could make the life of >> majority of the users far more pleasant. > > BTF isn't there for a lot of developers working in embedded who use this > code. Most my users that I deal with have minimal environments, so BTF is a > showstopper. One thing we've heard from some embedded folks [1] is that having kernel BTF loadable as a separate module (rather than embedded in vmlinux) would help, as there are size limits on vmlinux that they can workaround by having modules on a different partition. We're hoping to get that working soon. I was wondering if you see other issues around BTF adoption for embedded systems that we could put on the to-do list? Not necessarily for this particular use-case (since there are complications with trace data as you describe), but just trying to make sure we can remove barriers to BTF adoption where possible. Thanks! Alan [1] https://lore.kernel.org/bpf/CAHBbfcUkr6fTm2X9GNsFNqV75fTG=abqxfx_8ayk+4hk7he...@mail.gmail.com/ > >> >> I hope we can at least agree that the current methods of generating >> the string arrays at C level are... aesthetically displeasing. > > I don't know, I kinda like it ;-) > > -- Steve >
Re: [PATCH v3 1/2] kunit: support failure from dynamic analysis tools
On Thu, 11 Feb 2021, David Gow wrote: > On Wed, Feb 10, 2021 at 6:14 AM Daniel Latypov wrote: > > > > From: Uriel Guajardo > > > > Add a kunit_fail_current_test() function to fail the currently running > > test, if any, with an error message. > > > > This is largely intended for dynamic analysis tools like UBSAN and for > > fakes. > > E.g. say I had a fake ops struct for testing and I wanted my `free` > > function to complain if it was called with an invalid argument, or > > caught a double-free. Most return void and have no normal means of > > signalling failure (e.g. super_operations, iommu_ops, etc.). > > > > Key points: > > * Always update current->kunit_test so anyone can use it. > > * commit 83c4e7a0363b ("KUnit: KASAN Integration") only updated it for > > CONFIG_KASAN=y > > > > * Create a new header so non-test code doesn't have > > to include all of (e.g. lib/ubsan.c) > > > > * Forward the file and line number to make it easier to track down > > failures > > > > * Declare the helper function for nice __printf() warnings about mismatched > > format strings even when KUnit is not enabled. 
> > > > Example output from kunit_fail_current_test("message"): > > [15:19:34] [FAILED] example_simple_test > > [15:19:34] # example_simple_test: initializing > > [15:19:34] # example_simple_test: lib/kunit/kunit-example-test.c:24: > > message > > [15:19:34] not ok 1 - example_simple_test > > > > Co-developed-by: Daniel Latypov > > Signed-off-by: Uriel Guajardo > > Signed-off-by: Daniel Latypov > > --- > > include/kunit/test-bug.h | 30 ++ > > lib/kunit/test.c | 37 + > > 2 files changed, 63 insertions(+), 4 deletions(-) > > create mode 100644 include/kunit/test-bug.h > > > > diff --git a/include/kunit/test-bug.h b/include/kunit/test-bug.h > > new file mode 100644 > > index ..18b1034ec43a > > --- /dev/null > > +++ b/include/kunit/test-bug.h > > @@ -0,0 +1,30 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +/* > > + * KUnit API allowing dynamic analysis tools to interact with KUnit tests > > + * > > + * Copyright (C) 2020, Google LLC. > > + * Author: Uriel Guajardo > > + */ > > + > > +#ifndef _KUNIT_TEST_BUG_H > > +#define _KUNIT_TEST_BUG_H > > + > > +#define kunit_fail_current_test(fmt, ...) \ > > + __kunit_fail_current_test(__FILE__, __LINE__, fmt, ##__VA_ARGS__) > > + > > +#if IS_ENABLED(CONFIG_KUNIT) > > As the kernel test robot has pointed out on the second patch, this > probably should be IS_BUILTIN(), otherwise this won't build if KUnit > is a module, and the code calling it isn't. > > This does mean that things like UBSAN integration won't work if KUnit > is a module, which is a shame. > > (It's worth noting that the KASAN integration worked around this by > only calling inline functions, which would therefore be built-in even > if the rest of KUnit was built as a module. I don't think it's quite > as convenient to do that here, though.) 
> Right, static inline'ing __kunit_fail_current_test() seems problematic because it calls other exported functions; more below > > + > > +extern __printf(3, 4) void __kunit_fail_current_test(const char *file, int > > line, > > + const char *fmt, ...); > > + > > +#else > > + > > +static __printf(3, 4) void __kunit_fail_current_test(const char *file, int > > line, > > + const char *fmt, ...) > > +{ > > +} > > + > > +#endif > > + > > + > > +#endif /* _KUNIT_TEST_BUG_H */ > > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > > index ec9494e914ef..5794059505cf 100644 > > --- a/lib/kunit/test.c > > +++ b/lib/kunit/test.c > > @@ -7,6 +7,7 @@ > > */ > > > > #include > > +#include > > #include > > #include > > #include > > @@ -16,6 +17,38 @@ > > #include "string-stream.h" > > #include "try-catch-impl.h" > > > > +/* > > + * Fail the current test and print an error message to the log. > > + */ > > +void __kunit_fail_current_test(const char *file, int line, const char > > *fmt, ...) > > +{ > > + va_list args; > > + int len; > > + char *buffer; > > + > > + if (!current->kunit_test) > > + return; > > + > > + kunit_set_failure(current->kunit_test); > > + currently kunit_set_failure() is static, but it could be inlined I suspect. > > + /* kunit_err() only accepts literals, so evaluate the args first. */ > > + va_start(args, fmt); > > + len = vsnprintf(NULL, 0, fmt, args) + 1; > > + va_end(args); > > + > > + buffer = kunit_kmalloc(current->kunit_test, len, GFP_KERNEL); kunit_kmalloc()/kunit_kfree() are exported also, but we could probably dodge allocation with a static buffer. In fact since we end up using an on-stack buffer for logging in kunit_log_append(), it might make sense to #define __kunit_fail_current_test() instead, i.e. #define __kunit_fail_current_test(file,
Re: [PATCH v3 0/2] kunit: fail tests on UBSAN errors
On Tue, 9 Feb 2021, Daniel Latypov wrote:

> v1 by Uriel is here: [1].
> Since it's been a while, I've dropped the Reviewed-By's.
>
> It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
> hadn't been merged yet, so that caused some kerfuffle with applying them
> previously and the series was reverted.
>
> This revives the series but makes the kunit_fail_current_test() function
> take a format string and logs the file and line number of the failing
> code, addressing Alan Maguire's comments on the previous version.
>
> As a result, the patch that makes UBSAN errors was tweaked slightly to
> include an error message.
>
> v2 -> v3:
>   Fix kunit_fail_current_test() so it works w/ CONFIG_KUNIT=m
>   s/_/__ on the helper func to match others in test.c
>
> [1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguajard...@gmail.com/
>

For the series:

Reviewed-by: Alan Maguire

Thanks!
Re: [PATCH v2 1/2] kunit: support failure from dynamic analysis tools
On Tue, 9 Feb 2021, Daniel Latypov wrote:

> On Tue, Feb 9, 2021 at 9:26 AM Alan Maguire wrote:
> >
> > On Fri, 5 Feb 2021, Daniel Latypov wrote:
> >
> > > From: Uriel Guajardo
> > >
> > > Add a kunit_fail_current_test() function to fail the currently running
> > > test, if any, with an error message.
> > >
> > > This is largely intended for dynamic analysis tools like UBSAN and for
> > > fakes.
> > > E.g. say I had a fake ops struct for testing and I wanted my `free`
> > > function to complain if it was called with an invalid argument, or
> > > caught a double-free. Most return void and have no normal means of
> > > signalling failure (e.g. super_operations, iommu_ops, etc.).
> > >
> > > Key points:
> > > * Always update current->kunit_test so anyone can use it.
> > >   * commit 83c4e7a0363b ("KUnit: KASAN Integration") only updated it for
> > >     CONFIG_KASAN=y
> > >
> > > * Create a new header so non-test code doesn't have
> > >   to include all of (e.g. lib/ubsan.c)
> > >
> > > * Forward the file and line number to make it easier to track down
> > >   failures
> > >
> >
> > Thanks for doing this!
> >
> > > * Declare it as a function for nice __printf() warnings about mismatched
> > >   format strings even when KUnit is not enabled.
> > >
> >
> > One thing I _think_ this assumes is that KUnit is builtin;
> > don't we need an
>
> Ah, you're correct.
> Also going to rename it to have two _ to match other functions used in
> macros like __kunit_test_suites_init.
>

Great! If you're sending out an updated version with these changes,
feel free to add

Reviewed-by: Alan Maguire
Re: [PATCH v2 1/2] kunit: support failure from dynamic analysis tools
On Fri, 5 Feb 2021, Daniel Latypov wrote: > From: Uriel Guajardo > > Add a kunit_fail_current_test() function to fail the currently running > test, if any, with an error message. > > This is largely intended for dynamic analysis tools like UBSAN and for > fakes. > E.g. say I had a fake ops struct for testing and I wanted my `free` > function to complain if it was called with an invalid argument, or > caught a double-free. Most return void and have no normal means of > signalling failure (e.g. super_operations, iommu_ops, etc.). > > Key points: > * Always update current->kunit_test so anyone can use it. > * commit 83c4e7a0363b ("KUnit: KASAN Integration") only updated it for > CONFIG_KASAN=y > > * Create a new header so non-test code doesn't have > to include all of (e.g. lib/ubsan.c) > > * Forward the file and line number to make it easier to track down > failures > Thanks for doing this! > * Declare it as a function for nice __printf() warnings about mismatched > format strings even when KUnit is not enabled. > One thing I _think_ this assumes is that KUnit is builtin; don't we need an EXPORT_SYMBOL_GPL(_kunit_fail_current_test); ? Without it, if an analysis tool (or indeed if KUnit) is built as a module, it won't be possible to use this functionality. 
> Example output from kunit_fail_current_test("message"): > [15:19:34] [FAILED] example_simple_test > [15:19:34] # example_simple_test: initializing > [15:19:34] # example_simple_test: lib/kunit/kunit-example-test.c:24: > message > [15:19:34] not ok 1 - example_simple_test > > Co-developed-by: Daniel Latypov > Signed-off-by: Uriel Guajardo > Signed-off-by: Daniel Latypov > --- > include/kunit/test-bug.h | 30 ++ > lib/kunit/test.c | 36 > 2 files changed, 62 insertions(+), 4 deletions(-) > create mode 100644 include/kunit/test-bug.h > > diff --git a/include/kunit/test-bug.h b/include/kunit/test-bug.h > new file mode 100644 > index ..4963ed52c2df > --- /dev/null > +++ b/include/kunit/test-bug.h > @@ -0,0 +1,30 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * KUnit API allowing dynamic analysis tools to interact with KUnit tests > + * > + * Copyright (C) 2020, Google LLC. nit; might want to update copyright year. > + * Author: Uriel Guajardo > + */ > + > +#ifndef _KUNIT_TEST_BUG_H > +#define _KUNIT_TEST_BUG_H > + > +#define kunit_fail_current_test(fmt, ...) \ > + _kunit_fail_current_test(__FILE__, __LINE__, fmt, ##__VA_ARGS__) > + > +#if IS_ENABLED(CONFIG_KUNIT) > + > +extern __printf(3, 4) void _kunit_fail_current_test(const char *file, int > line, > + const char *fmt, ...); > + > +#else > + > +static __printf(3, 4) void _kunit_fail_current_test(const char *file, int > line, > + const char *fmt, ...) > +{ > +} > + > +#endif > + > + > +#endif /* _KUNIT_TEST_BUG_H */ > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > index ec9494e914ef..7b16aae0ccae 100644 > --- a/lib/kunit/test.c > +++ b/lib/kunit/test.c > @@ -7,6 +7,7 @@ > */ > > #include > +#include > #include > #include > #include > @@ -16,6 +17,37 @@ > #include "string-stream.h" > #include "try-catch-impl.h" > > +/* > + * Fail the current test and print an error message to the log. > + */ > +void _kunit_fail_current_test(const char *file, int line, const char *fmt, > ...) 
> +{ > + va_list args; > + int len; > + char *buffer; > + > + if (!current->kunit_test) > + return; > + > + kunit_set_failure(current->kunit_test); > + > + /* kunit_err() only accepts literals, so evaluate the args first. */ > + va_start(args, fmt); > + len = vsnprintf(NULL, 0, fmt, args) + 1; > + va_end(args); > + > + buffer = kunit_kmalloc(current->kunit_test, len, GFP_KERNEL); > + if (!buffer) > + return; > + > + va_start(args, fmt); > + vsnprintf(buffer, len, fmt, args); > + va_end(args); > + > + kunit_err(current->kunit_test, "%s:%d: %s", file, line, buffer); > + kunit_kfree(current->kunit_test, buffer); > +} > + > /* > * Append formatted message to log, size of which is limited to > * KUNIT_LOG_SIZE bytes (including null terminating byte). > @@ -273,9 +305,7 @@ static void kunit_try_run_case(void *data) > struct kunit_suite *suite = ctx->suite; > struct kunit_case *test_case = ctx->test_case; > > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) > current->kunit_test = test; > -#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT) */ > > /* >* kunit_run_case_internal may encounter a fatal error; if it does, > @@ -624,9 +654,7 @@ void kunit_cleanup(struct kunit *test) > spin_unlock(&test->lock); > kunit_remove_resource(test, res); > } > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) >
Re: [PATCH v2 bpf-next 3/4] libbpf: BTF dumper support for typed data
On Thu, 21 Jan 2021, Andrii Nakryiko wrote:

> On Wed, Jan 20, 2021 at 10:56 PM Andrii Nakryiko
> wrote:
> >
> > On Sun, Jan 17, 2021 at 2:22 PM Alan Maguire
> > wrote:
> > >
> > > Add a BTF dumper for typed data, so that the user can dump a typed
> > > version of the data provided.
> > >
> > > The API is
> > >
> > > int btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
> > >                              const struct btf_dump_emit_type_data_opts *opts,
> > >                              void *data);
> > >
>
> Two more things I realized about this API overnight:
>
> 1. It's error-prone to specify only the pointer to data without
> specifying the size. If user screws up and specifies wrong type ID or
> if BTF data is corrupted, then this API would start reading and
> printing memory outside the bounds. I think it's much better to also
> require user to specify the size and bail out with error if we reach
> the end of the allowed memory area.

Yep, good point, especially given in the tracing context we will likely
only have a subset of the data (e.g. part of the 16k representing a
task_struct). The way I was approaching this was to return -E2BIG and
append a "..." to the dumped data denoting the data provided didn't
cover the size needed to fully represent the type. The idea is the
structure is too big for the data provided, hence E2BIG, but maybe
there's a more intuitive way to do this? See below for more...

> 2. This API would be more useful if it also returns the amount of
> "consumed" bytes. That way users can do more flexible and powerful
> pretty-printing of raw data. So on success we'll have >= 0 number of
> bytes used for dumping given BTF type, or <0 on error. WDYT?
>

I like it! So

1. if a user provides a too-big data object, we return the amount we
   used; and
2. if a user provides a too-small data object, we append "..." to the
   dump and return -E2BIG (or whatever error code).

However I wonder for case 2 if it'd be better to use a snprintf()-like
semantic rather than an error code, returning the amount we would have
used. That way we easily detect case 1 (size passed in > return value),
case 2 (size passed in < return value), and errors can be treated
separately. Feels to me that dealing with truncated data is going to be
sufficiently frequent it might be good not to classify it as an error.
Let me know if you think that makes sense.

I'm working on v3, and hope to have something early next week, but a
quick reply to a question below...

> > > ...where the id is the BTF id of the data pointed to by the "void *"
> > > argument; for example the BTF id of "struct sk_buff" for a
> > > "struct skb *" data pointer. Options supported are
> > >
> > > - a starting indent level (indent_lvl)
> > > - a set of boolean options to control dump display, similar to those
> > >   used for BPF helper bpf_snprintf_btf(). Options are
> > >   - compact : omit newlines and other indentation
> > >   - noname: omit member names
> > >   - zero: show zero-value members
> > >
> > > Default output format is identical to that dumped by bpf_snprintf_btf(),
> > > for example a "struct sk_buff" representation would look like this:
> > >
> > > (struct sk_buff){
> > >  (union){
> > >   (struct){
> >
> > Curious, these explicit anonymous (union) and (struct), is that
> > preferred way for explicitness, or is it just because it makes
> > implementation simpler and thus was chosen? I.e., if the goal was to
> > mimic C-style data initialization, you'd just have plain .next = ...,
> > .prev = ..., .dev = ..., .dev_scratch = ..., all on the same level. So
> > just checking for myself.

The idea here is that we want to clarify if we're dealing with an
anonymous struct or union. I wanted to have things work like a C-style
initializer as closely as possible, but I realized it's not legit to
initialize multiple values in a union, and more importantly when we're
trying to visually interpret data, we really want to know if an
anonymous container of data is a structure (where all values represent
different elements in the structure) or a union (where we're seeing
multiple interpretations of the same value).

Thanks again for the detailed review!

Alan
[PATCH v2 bpf-next 1/4] libbpf: add btf_has_size() and btf_int() inlines
BTF type data dumping will use them in later patches, and they are useful
generally when handling BTF data.

Signed-off-by: Alan Maguire
---
 tools/lib/bpf/btf.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 1237bcd..0c48f2e 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -294,6 +294,20 @@ static inline bool btf_is_datasec(const struct btf_type *t)
 	return btf_kind(t) == BTF_KIND_DATASEC;
 }
 
+static inline bool btf_has_size(const struct btf_type *t)
+{
+	switch (BTF_INFO_KIND(t->info)) {
+	case BTF_KIND_INT:
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+	case BTF_KIND_ENUM:
+	case BTF_KIND_DATASEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
 static inline __u8 btf_int_encoding(const struct btf_type *t)
 {
 	return BTF_INT_ENCODING(*(__u32 *)(t + 1));
@@ -309,6 +323,11 @@ static inline __u8 btf_int_bits(const struct btf_type *t)
 	return BTF_INT_BITS(*(__u32 *)(t + 1));
 }
 
+static inline __u32 btf_int(const struct btf_type *t)
+{
+	return *(__u32 *)(t + 1);
+}
+
 static inline struct btf_array *btf_array(const struct btf_type *t)
 {
 	return (struct btf_array *)(t + 1);
-- 
1.8.3.1
[PATCH v2 bpf-next 3/4] libbpf: BTF dumper support for typed data
Add a BTF dumper for typed data, so that the user can dump a typed version of the data provided. The API is int btf_dump__emit_type_data(struct btf_dump *d, __u32 id, const struct btf_dump_emit_type_data_opts *opts, void *data); ...where the id is the BTF id of the data pointed to by the "void *" argument; for example the BTF id of "struct sk_buff" for a "struct skb *" data pointer. Options supported are - a starting indent level (indent_lvl) - a set of boolean options to control dump display, similar to those used for BPF helper bpf_snprintf_btf(). Options are - compact : omit newlines and other indentation - noname: omit member names - zero: show zero-value members Default output format is identical to that dumped by bpf_snprintf_btf(), for example a "struct sk_buff" representation would look like this: struct sk_buff){ (union){ (struct){ .next = (struct sk_buff *)0x, .prev = (struct sk_buff *)0x, (union){ .dev = (struct net_device *)0x, .dev_scratch = (long unsigned int)18446744073709551615, }, }, ... 
Signed-off-by: Alan Maguire --- tools/lib/bpf/btf.h | 17 + tools/lib/bpf/btf_dump.c | 974 +++ tools/lib/bpf/libbpf.map | 5 + 3 files changed, 996 insertions(+) diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 0c48f2e..7937124 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -180,6 +180,23 @@ struct btf_dump_emit_type_decl_opts { btf_dump__emit_type_decl(struct btf_dump *d, __u32 id, const struct btf_dump_emit_type_decl_opts *opts); + +struct btf_dump_emit_type_data_opts { + /* size of this struct, for forward/backward compatibility */ + size_t sz; + int indent_level; + /* below match "show" flags for bpf_show_snprintf() */ + bool compact; + bool noname; + bool zero; +}; +#define btf_dump_emit_type_data_opts__last_field zero + +LIBBPF_API int +btf_dump__emit_type_data(struct btf_dump *d, __u32 id, +const struct btf_dump_emit_type_data_opts *opts, +void *data); + /* * A set of helpers for easier BTF types handling */ diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 2f9d685..04d604f 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -10,6 +10,8 @@ #include #include #include +#include +#include #include #include #include @@ -19,14 +21,31 @@ #include "libbpf.h" #include "libbpf_internal.h" +#define BITS_PER_BYTE 8 +#define BITS_PER_U128 (sizeof(__u64) * BITS_PER_BYTE * 2) +#define BITS_PER_BYTE_MASK (BITS_PER_BYTE - 1) +#define BITS_PER_BYTE_MASKED(bits) ((bits) & BITS_PER_BYTE_MASK) +#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3) +#define BITS_ROUNDUP_BYTES(bits) \ + (BITS_ROUNDDOWN_BYTES(bits) + !!BITS_PER_BYTE_MASKED(bits)) + static const char PREFIXES[] = "\t\t\t\t\t\t\t\t\t\t\t\t\t"; static const size_t PREFIX_CNT = sizeof(PREFIXES) - 1; + static const char *pfx(int lvl) { return lvl >= PREFIX_CNT ? 
PREFIXES : &PREFIXES[PREFIX_CNT - lvl]; } +static const char SPREFIXES[] = " "; +static const size_t SPREFIX_CNT = sizeof(SPREFIXES) - 1; + +static const char *spfx(int lvl) +{ + return lvl >= SPREFIX_CNT ? SPREFIXES : &SPREFIXES[SPREFIX_CNT - lvl]; +} + enum btf_dump_type_order_state { NOT_ORDERED, ORDERING, @@ -53,6 +72,49 @@ struct btf_dump_type_aux_state { __u8 referenced: 1; }; +#define BTF_DUMP_DATA_MAX_NAME_LEN 256 + +/* + * Common internal data for BTF type data dump operations. + * + * The implementation here is similar to that in kernel/bpf/btf.c + * that supports the bpf_snprintf_btf() helper, so any bugs in + * type data dumping here are likely in that code also. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by state.depth_check being set, and if we + * encounter a non-zero value we set state.depth_to_show to the depth + * at which we encountered it. When we have completed the first pass, + * we will know if anything needs to be displayed if + * state.depth_to_show > state.depth. See btf_dump_emit_[struct,array]_data() + * for the implementation of this. + * + */ +struct btf_dump_data { + bool compact; + bool noname; + bool zero; + __u8 indent_lv
[PATCH v2 bpf-next 0/4] libbpf: BTF dumper support for typed data
Add a libbpf dumper function that supports dumping a representation of
data passed in using the BTF id associated with the data in a manner
similar to the bpf_snprintf_btf helper.

Default output format is identical to that dumped by bpf_snprintf_btf(),
for example a "struct sk_buff" representation would look like this:

(struct sk_buff){
 (union){
  (struct){
   .next = (struct sk_buff *)0x,
   .prev = (struct sk_buff *)0x,
   (union){
    .dev = (struct net_device *)0x,
    .dev_scratch = (long unsigned int)18446744073709551615,
   },
  },
 ...

Patches 1 and 2 make functions available that are needed during dump
operations.

Patch 3 implements the dump functionality in a manner similar to that
in kernel/bpf/btf.c, but with a view to fitting into libbpf more
naturally. For example, rather than using flags, boolean dump options
are used to control output.

Patch 4 is a selftest that utilizes a dump printf function to snprintf
the dump output to a string for comparison with expected output. Tests
deliberately mirror those in snprintf_btf helper test to keep output
consistent.

Changes since RFC [1]

- The initial approach explored was to share the kernel code with
  libbpf using #defines to paper over the different needs; however it
  makes more sense to try and fit in with libbpf code style for
  maintenance. A comment in the code points at the implementation in
  kernel/bpf/btf.c and notes that any issues found in it should be
  fixed there or vice versa; mirroring the tests should help with this
  also (Andrii)

[1] https://lore.kernel.org/bpf/1610386373-24162-1-git-send-email-alan.magu...@oracle.com/T/#t

Alan Maguire (4):
  libbpf: add btf_has_size() and btf_int() inlines
  libbpf: make skip_mods_and_typedefs available internally in libbpf
  libbpf: BTF dumper support for typed data
  selftests/bpf: add dump type data tests to btf dump tests

 tools/lib/bpf/btf.h                               |  36 +
 tools/lib/bpf/btf_dump.c                          | 974 ++
 tools/lib/bpf/libbpf.c                            |   4 +-
 tools/lib/bpf/libbpf.map                          |   5 +
 tools/lib/bpf/libbpf_internal.h                   |   2 +
 tools/testing/selftests/bpf/prog_tests/btf_dump.c | 233 ++
 6 files changed, 1251 insertions(+), 3 deletions(-)

-- 
1.8.3.1
[PATCH v2 bpf-next 4/4] selftests/bpf: add dump type data tests to btf dump tests
Test various type data dumping operations by comparing expected format
with the dumped string; an snprintf-style printf function is used to
record the string dumped.

Signed-off-by: Alan Maguire
---
 tools/testing/selftests/bpf/prog_tests/btf_dump.c | 233 ++
 1 file changed, 233 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
index c60091e..262561f4 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
@@ -232,6 +232,237 @@ void test_btf_dump_incremental(void)
 	btf__free(btf);
 }
 
+#define STRSIZE			2048
+#define EXPECTED_STRSIZE	256
+
+void btf_dump_snprintf(void *ctx, const char *fmt, va_list args)
+{
+	char *s = ctx, new[STRSIZE];
+
+	vsnprintf(new, STRSIZE, fmt, args);
+	strncat(s, new, STRSIZE);
+}
+
+/* skip "enum "/"struct " prefixes */
+#define SKIP_PREFIX(_typestr, _prefix)				\
+	do {							\
+		if (strstr(_typestr, _prefix) == _typestr)	\
+			_typestr += strlen(_prefix) + 1;	\
+	} while (0)
+
+int btf_dump_data(struct btf *btf, struct btf_dump *d,
+		  char *ptrtype, __u64 flags, void *ptr,
+		  char *str, char *expectedval)
+{
+	struct btf_dump_emit_type_data_opts opts = { 0 };
+	int ret = 0, cmp;
+	__s32 type_id;
+
+	opts.sz = sizeof(opts);
+	opts.compact = true;
+	if (flags & BTF_F_NONAME)
+		opts.noname = true;
+	if (flags & BTF_F_ZERO)
+		opts.zero = true;
+	SKIP_PREFIX(ptrtype, "enum");
+	SKIP_PREFIX(ptrtype, "struct");
+	SKIP_PREFIX(ptrtype, "union");
+	type_id = btf__find_by_name(btf, ptrtype);
+	if (CHECK(type_id <= 0, "find type id",
+		  "no '%s' in BTF: %d\n", ptrtype, type_id)) {
+		ret = -ENOENT;
+		goto err;
+	}
+	str[0] = '\0';
+	ret = btf_dump__emit_type_data(d, type_id, &opts, ptr);
+	if (CHECK(ret < 0, "btf_dump__emit_type_data",
+		  "failed: %d\n", ret))
+		goto err;
+
+	cmp = strncmp(str, expectedval, EXPECTED_STRSIZE);
+	if (CHECK(cmp, "ensure expected/actual match",
+		  "'%s' does not match expected '%s': %d\n",
+		  str, expectedval, cmp))
+		ret = -EFAULT;
+
+err:
+	if (ret)
+		btf_dump__free(d);
+	return ret;
+}
+
+#define TEST_BTF_DUMP_DATA(_b, _d, _str, _type, _flags, _expected, ...)	\
+	do {								\
+		char _expectedval[EXPECTED_STRSIZE] = _expected;	\
+		char __ptrtype[64] = #_type;				\
+		char *_ptrtype = (char *)__ptrtype;			\
+		static _type _ptrdata = __VA_ARGS__;			\
+		void *_ptr = &_ptrdata;					\
+									\
+		if (btf_dump_data(_b, _d, _ptrtype, _flags, _ptr,	\
+				  _str, _expectedval))			\
+			return;						\
+	} while (0)
+
+/* Use where expected data string matches its stringified declaration */
+#define TEST_BTF_DUMP_DATA_C(_b, _d, _str, _type, _opts, ...)		\
+	TEST_BTF_DUMP_DATA(_b, _d, _str, _type, _opts,			\
+			   "(" #_type ")" #__VA_ARGS__, __VA_ARGS__)
+
+void test_btf_dump_data(void)
+{
+	struct btf *btf = libbpf_find_kernel_btf();
+	char str[STRSIZE];
+	struct btf_dump_opts opts = { .ctx = str };
+	struct btf_dump *d;
+
+	if (CHECK(!btf, "get kernel BTF", "no kernel BTF found"))
+		return;
+
+	d = btf_dump__new(btf, NULL, &opts, btf_dump_snprintf);
+
+	if (CHECK(!d, "new dump", "could not create BTF dump"))
+		return;
+
+	/* Verify type display for various types. */
+
+	/* simple int */
+	TEST_BTF_DUMP_DATA_C(btf, d, str, int, 0, 1234);
+	TEST_BTF_DUMP_DATA(btf, d, str, int, BTF_F_NONAME, "1234", 1234);
+
+	/* zero value should be printed at toplevel */
+	TEST_BTF_DUMP_DATA(btf, d, str, int, 0, "(int)0", 0);
+	TEST_BTF_DUMP_DATA(btf, d, str, int, BTF_F_NONAME, "0", 0);
+	TEST_BTF_DUMP_DATA(btf, d, s
[PATCH v2 bpf-next 2/4] libbpf: make skip_mods_and_typedefs available internally in libbpf
btf_dump.c will need it for type-based data display.

Signed-off-by: Alan Maguire
---
 tools/lib/bpf/libbpf.c          | 4 +---
 tools/lib/bpf/libbpf_internal.h | 2 ++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2abbc38..4ef84e1 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -73,8 +73,6 @@
 #define __printf(a, b)	__attribute__((format(printf, a, b)))
 
 static struct bpf_map *bpf_object__add_map(struct bpf_object *obj);
-static const struct btf_type *
-skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
 
 static int __base_pr(enum libbpf_print_level level, const char *format,
 		     va_list args)
@@ -1885,7 +1883,7 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
 	return 0;
 }
 
-static const struct btf_type *
+const struct btf_type *
 skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id)
 {
 	const struct btf_type *t = btf__type_by_id(btf, id);
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 969d0ac..c25d2df 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -108,6 +108,8 @@ static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size)
 void *btf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t cur_cnt,
 		  size_t max_cnt, size_t add_cnt);
 int btf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
+const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id,
+					      __u32 *res_id);
 
 static inline bool libbpf_validate_opts(const char *opts, size_t opts_sz,
 					size_t user_sz,
-- 
1.8.3.1
Re: [RFC PATCH bpf-next 1/2] bpf: share BTF "show" implementation between kernel and libbpf
On Mon, 11 Jan 2021, Andrii Nakryiko wrote: > On Mon, Jan 11, 2021 at 9:34 AM Alan Maguire wrote: > > Currently the only "show" function for userspace is to write the > > representation of the typed data to a string via > > > > LIBBPF_API int > > btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj, > > __u64 flags); > > > > ...but other approaches could be pursued including printf()-based > > show, or even a callback mechanism could be supported to allow > > user-defined show functions. > > > > It's strange that you saw btf_dump APIs, and yet decided to go with > this API instead. snprintf() is not a natural "method" of struct btf. > Using char buffer as an output is overly restrictive and inconvenient. > It's appropriate for kernel and BPF program due to their restrictions, > but there is no need to cripple libbpf APIs for that. I think it > should follow btf_dump APIs with custom callback so that it's easy to > just printf() everything, but also user can create whatever elaborate > mechanism they need and that fits their use case. > > Code reuse is not the ultimate goal, it should facilitate > maintainability, not harm it. There are times where sharing code > introduces unnecessary coupling and maintainability issues. And I > think this one is a very obvious case of that. > Okay, so I've been exploring adding dumper API support. 
The initial approach I've been using is to provide an API like this:

/* match show flags for bpf_snprintf_btf() */
enum {
	BTF_DUMP_F_COMPACT	= (1ULL << 0),
	BTF_DUMP_F_NONAME	= (1ULL << 1),
	BTF_DUMP_F_ZERO		= (1ULL << 3),
};

struct btf_dump_emit_type_data_opts {
	/* size of this struct, for forward/backward compatibility */
	size_t sz;
	void *data;
	int indent_level;
	__u64 flags;
};
#define btf_dump_emit_type_data_opts__last_field flags

LIBBPF_API int
btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
			 const struct btf_dump_emit_type_data_opts *opts);

...so the opts play a similar role to the struct btf_ptr + flags in
bpf_snprintf_btf. I've got this working, but the current
implementation is tied to emitting the same C-based syntax as
bpf_snprintf_btf(); though of course the user-supplied printf function
is invoked to do the displaying. So a use case looks something like
this:

	struct btf_dump_emit_type_data_opts opts;
	char skbufmem[1024], skbufstr[8192];
	struct btf *btf = libbpf_find_kernel_btf();
	struct btf_dump *d;
	__s32 skbid;
	int indent = 0;

	memset(skbufmem, 0xff, sizeof(skbufmem));
	opts.data = skbufmem;
	opts.sz = sizeof(opts);
	opts.indent_level = indent;
	d = btf_dump__new(btf, NULL, NULL, printffn);
	skbid = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
	if (skbid < 0) {
		fprintf(stderr, "no skbuff, err %d\n", skbid);
		exit(1);
	}
	btf_dump__emit_type_data(d, skbid, &opts);

...and we get output of the form:

(struct sk_buff){
 (union){
  (struct){
   .next = (struct sk_buff *)0x,
   .prev = (struct sk_buff *)0x,
   (union){
    .dev = (struct net_device *)0x,
    .dev_scratch = (long unsigned int)18446744073709551615,
   },
  },
...

etc. However it would be nice to find a way to help printf function
providers emit different formats such as JSON without having to parse
the data they are provided in the printf function. That would remove
the need for the output flags, since the printf function provider
could control display.
If we provided an option to provide a "kind" printf function, and
ensured that the BTF dumper sets a "kind" prior to each _internal_
call to the printf function, we could use that info to adapt output in
various ways. For example, consider the case where we want to emit
C-type output. We can use the kind info to control output for various
scenarios:

void c_dump_kind_printf(struct btf_dump *d, enum btf_dump_kind kind,
			void *ctx, const char *fmt, va_list args)
{
	switch (kind) {
	case BTF_DUMP_KIND_TYPE_NAME:
		/* For C, add brackets around the type name string ( ) */
		btf_dump__printf(d, "(");
		btf_dump__vprintf(d, fmt, args);
		btf_dump__printf(d, ")");
		break;
	case BTF_DUMP_KIND_MEMBER_NAME:
		/* for C, prefix a "." to member name, suffix a " = " */
		btf_dump__printf(d, ".");
		btf_dump__vprintf(d, fmt, args);
		btf_dump__printf(d, " = ");
		break;
	...
	}
}
[RFC PATCH bpf-next 1/2] bpf: share BTF "show" implementation between kernel and libbpf
libbpf already supports a "dumper" API for dumping type information,
but there is currently no support for dumping typed _data_ via libbpf.

However this functionality does exist in the kernel, in part to
facilitate the bpf_snprintf_btf() helper, which dumps a string
representation of the pointer passed in utilizing the BTF type id of
the data pointed to. For example, the pair of a pointer to a
"struct sk_buff" and the BTF type id of "struct sk_buff" can be used.

Here the kernel code is generalized into btf_show_common.c. For the
most part, code is identical for userspace and kernel, beyond a few
API differences and missing functions. The only significant
differences are

- the "safe copy" logic used by the kernel to ensure we do not induce
  a crash during BPF operation; and
- the BTF seq file support that is kernel-only.

The mechanics are to maintain identical btf_show_common.c files in
kernel/bpf and tools/lib/bpf, and a common header btf_common.h in
include/linux/ and tools/lib/bpf/. This file duplication seems to be
the common practice with duplication between kernel and tools/, so it
is the approach taken here. The common code approach could likely be
explored further, but here the minimum common code required to support
BTF show functionality is used.

Currently the only "show" function for userspace is to write the
representation of the typed data to a string via

LIBBPF_API int
btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj,
	      __u64 flags);

...but other approaches could be pursued, including printf()-based
show, or even a callback mechanism could be supported to allow
user-defined show functions.
Here's an example usage, storing a string representation of
struct sk_buff *skb in buf:

	struct btf *btf = libbpf_find_kernel_btf();
	char buf[8192];
	__s32 skb_id;

	skb_id = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
	if (skb_id < 0)
		fprintf(stderr, "no skbuff, err %d\n", skb_id);
	else
		btf__snprintf(btf, buf, sizeof(buf), skb_id, skb, 0);

Suggested-by: Alexei Starovoitov
Signed-off-by: Alan Maguire
---
 include/linux/btf.h             |  121 +---
 include/linux/btf_common.h      |  286 +
 kernel/bpf/Makefile             |    2 +-
 kernel/bpf/arraymap.c           |    1 +
 kernel/bpf/bpf_struct_ops.c     |    1 +
 kernel/bpf/btf.c                | 1215 +-
 kernel/bpf/btf_show_common.c    | 1218 +++
 kernel/bpf/core.c               |    1 +
 kernel/bpf/hashtab.c            |    1 +
 kernel/bpf/local_storage.c      |    1 +
 kernel/bpf/verifier.c           |    1 +
 kernel/trace/bpf_trace.c        |    1 +
 tools/lib/bpf/Build             |    2 +-
 tools/lib/bpf/btf.h             |    7 +
 tools/lib/bpf/btf_common.h      |  286 +
 tools/lib/bpf/btf_show_common.c | 1218 +++
 tools/lib/bpf/libbpf.map        |    1 +
 17 files changed, 3044 insertions(+), 1319 deletions(-)
 create mode 100644 include/linux/btf_common.h
 create mode 100644 kernel/bpf/btf_show_common.c
 create mode 100644 tools/lib/bpf/btf_common.h
 create mode 100644 tools/lib/bpf/btf_show_common.c

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 4c200f5..a1f6325 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -50,43 +50,6 @@
 const struct btf_type *btf_type_id_size(const struct btf *btf,
 					u32 *type_id,
 					u32 *ret_size);
 
-/*
- * Options to control show behaviour.
- *	- BTF_SHOW_COMPACT: no formatting around type information
- *	- BTF_SHOW_NONAME: no struct/union member names/types
- *	- BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values;
- *	  equivalent to %px.
- *	- BTF_SHOW_ZERO: show zero-valued struct/union members; they
- *	  are not displayed by default
- *	- BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read
- *	  data before displaying it.
- */
-#define BTF_SHOW_COMPACT	BTF_F_COMPACT
-#define BTF_SHOW_NONAME		BTF_F_NONAME
-#define BTF_SHOW_PTR_RAW	BTF_F_PTR_RAW
-#define BTF_SHOW_ZERO		BTF_F_ZERO
-#define BTF_SHOW_UNSAFE		(1ULL << 4)
-
-void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
-		       struct seq_file *m);
-int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj,
-			    struct seq_file *m, u64 flags);
-
-/*
- * Copy len bytes of string representation of obj of BTF type_id into buf.
- *
- * @btf: struct btf object
- * @type_id: type id of type obj points to
- * @obj: pointer to typed data
- * @buf: buffer to write to
- * @len: maximum length to write to buf
- * @flags: show options (see above)
- *
- * Ret
[RFC PATCH bpf-next 2/2] selftests/bpf: test libbpf-based type display
Test btf__snprintf with various base/kernel types and ensure display
is as expected; tests are identical to those in the snprintf_btf test,
save for the fact these run in userspace rather than in BPF program
context.

Signed-off-by: Alan Maguire
---
 .../selftests/bpf/prog_tests/snprintf_btf_user.c | 192 +
 1 file changed, 192 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c
new file mode 100644
index 000..9eb82b2
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c
@@ -0,0 +1,192 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021, Oracle and/or its affiliates. */
+#include
+#include
+#include
+
+#include
+#include
+
+#define STRSIZE			2048
+#define EXPECTED_STRSIZE	256
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x)	(sizeof(x) / sizeof((x)[0]))
+#endif
+
+/* skip "enum "/"struct " prefixes */
+#define SKIP_PREFIX(_typestr, _prefix)				\
+	do {							\
+		if (strstr(_typestr, _prefix) == _typestr)	\
+			_typestr += strlen(_prefix) + 1;	\
+	} while (0)
+
+#define TEST_BTF(btf, _str, _type, _flags, _expected, ...)	\
+	do {							\
+		const char _expectedval[EXPECTED_STRSIZE] = _expected;	\
+		const char __ptrtype[64] = #_type;		\
+		char *_ptrtype = (char *)__ptrtype;		\
+		__u64 _hflags = _flags | BTF_F_COMPACT;		\
+		static _type _ptrdata = __VA_ARGS__;		\
+		void *_ptr = &_ptrdata;				\
+		__s32 _type_id;					\
+		int _cmp, _ret;					\
+								\
+		SKIP_PREFIX(_ptrtype, "enum");			\
+		SKIP_PREFIX(_ptrtype, "struct");		\
+		SKIP_PREFIX(_ptrtype, "union");			\
+		_ptr = &_ptrdata;				\
+		_type_id = btf__find_by_name(btf, _ptrtype);	\
+		if (CHECK(_type_id <= 0, "find type id",	\
+			  "no '%s' in BTF: %d\n", _ptrtype, _type_id)) \
+			return;					\
+		_ret = btf__snprintf(btf, _str, STRSIZE, _type_id, _ptr,\
+				     _hflags);			\
+		if (CHECK(_ret < 0, "btf snprintf", "failed: %d\n",	\
+			  _ret))				\
+			return;					\
+		_cmp = strncmp(_str, _expectedval, EXPECTED_STRSIZE);	\
+		if (CHECK(_cmp, "ensure expected/actual match",	\
+			  "'%s' does not match expected '%s': %d\n",\
+			  _str, _expectedval, _cmp))		\
+			return;					\
+	} while (0)
+
+/* Use where expected data string matches its stringified declaration */
+#define TEST_BTF_C(btf, _str, _type, _flags, ...)		\
+	TEST_BTF(btf, _str, _type, _flags, "(" #_type ")" #__VA_ARGS__,	\
+		 __VA_ARGS__)
+
+/* Demonstrate that libbpf btf__snprintf succeeds and that various
+ * data types are formatted correctly.
+ */
+void test_snprintf_btf_user(void)
+{
+	struct btf *btf = libbpf_find_kernel_btf();
+	int duration = 0;
+	char str[STRSIZE];
+
+	if (CHECK(!btf, "get kernel BTF", "no kernel BTF found"))
+		return;
+
+	/* Verify type display for various types. */
+
+	/* simple int */
+	TEST_BTF_C(btf, str, int, 0, 1234);
+	TEST_BTF(btf, str, int, BTF_F_NONAME, "1234", 1234);
+
+	/* zero value should be printed at toplevel */
+	TEST_BTF(btf, str, int, 0, "(int)0", 0);
+	TEST_BTF(btf, str, int, BTF_F_NONAME, "0", 0);
+	TEST_BTF(btf, str, int, BTF_F_ZERO, "(int)0", 0);
+	TEST_BTF(btf, str, int, BTF_F_NONAME | BTF_F_ZERO, "0", 0);
+	TEST_BTF_C(btf, str, int, 0, -4567);
+	TEST_BTF(btf, str, int, BTF_F_NONAME, "-4567", -4567);
[RFC PATCH bpf-next 0/2] bpf, libbpf: share BTF data show functionality
The BPF Type Format (BTF) can be used in conjunction with the helper bpf_snprintf_btf() to display kernel data with type information. This series generalizes that support and shares it with libbpf so that libbpf can display typed data. BTF display functionality is factored out of kernel/bpf/btf.c into kernel/bpf/btf_show_common.c, and that file is duplicated in tools/lib/bpf. Similarly, common definitions and inline functions needed for this support are extracted into include/linux/btf_common.h and this header is again duplicated in tools/lib/bpf. Patch 1 carries out the refactoring, for which no kernel changes are intended, and introduces btf__snprintf() a libbpf function that supports dumping a string representation of typed data using the struct btf * and id associated with that type. Patch 2 tests btf__snprintf() with built-in and kernel types to ensure data is of expected format. The test closely mirrors the BPF program associated with the snprintf_btf.c; in this case however the string representations are verified in userspace rather than in BPF program context. 
Alan Maguire (2): bpf: share BTF "show" implementation between kernel and libbpf selftests/bpf: test libbpf-based type display include/linux/btf.h| 121 +- include/linux/btf_common.h | 286 + kernel/bpf/Makefile|2 +- kernel/bpf/arraymap.c |1 + kernel/bpf/bpf_struct_ops.c|1 + kernel/bpf/btf.c | 1215 +-- kernel/bpf/btf_show_common.c | 1218 kernel/bpf/core.c |1 + kernel/bpf/hashtab.c |1 + kernel/bpf/local_storage.c |1 + kernel/bpf/verifier.c |1 + kernel/trace/bpf_trace.c |1 + tools/lib/bpf/Build|2 +- tools/lib/bpf/btf.h|7 + tools/lib/bpf/btf_common.h | 286 + tools/lib/bpf/btf_show_common.c| 1218 tools/lib/bpf/libbpf.map |1 + .../selftests/bpf/prog_tests/snprintf_btf_user.c | 192 +++ 18 files changed, 3236 insertions(+), 1319 deletions(-) create mode 100644 include/linux/btf_common.h create mode 100644 kernel/bpf/btf_show_common.c create mode 100644 tools/lib/bpf/btf_common.h create mode 100644 tools/lib/bpf/btf_show_common.c create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c -- 1.8.3.1
[PATCH bpf] bpftool: fix compilation failure for net.o with older glibc
For older glibc ~2.17, #include'ing both linux/if.h and net/if.h fails
due to complaints about redefinition of interface flags:

  CC       net.o
In file included from net.c:13:0:
/usr/include/linux/if.h:71:2: error: redeclaration of enumerator ‘IFF_UP’
  IFF_UP			= 1<<0,	/* sysfs */
  ^
/usr/include/net/if.h:44:5: note: previous definition of ‘IFF_UP’ was here
     IFF_UP = 0x1,		/* Interface is up.  */

The issue was fixed in kernel headers in [1], but since compilation of
net.c picks up system headers the problem can recur.

Dropping the #include of linux/if.h resolves the issue, and it is not
needed for compilation anyhow.

[1] https://lore.kernel.org/netdev/1461512707-23058-1-git-send-email-mikko.rapeli__34748.27880641$1462831734$gmane$o...@iki.fi/

Fixes: f6f3bac08ff9 ("tools/bpf: bpftool: add net support")
Signed-off-by: Alan Maguire
---
 tools/bpf/bpftool/net.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index 3fae61e..ff3aa0c 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -11,7 +11,6 @@
 #include
 #include
 #include
-#include <linux/if.h>
 #include
 #include
 #include
-- 
1.8.3.1
Re: [RFC PATCH bpf-next] ksnoop: kernel argument/return value tracing/display using BTF
On Tue, 5 Jan 2021, Cong Wang wrote: > On Mon, Jan 4, 2021 at 7:29 AM Alan Maguire wrote: > > > > BPF Type Format (BTF) provides a description of kernel data structures > > and of the types kernel functions utilize as arguments and return values. > > > > A helper was recently added - bpf_snprintf_btf() - that uses that > > description to create a string representation of the data provided, > > using the BTF id of its type. For example to create a string > > representation of a "struct sk_buff", the pointer to the skb > > is provided along with the type id of "struct sk_buff". > > > > Here that functionality is utilized to support tracing kernel > > function entry and return using k[ret]probes. The "struct pt_regs" > > context can be used to derive arguments and return values, and > > when the user supplies a function name we > > > > - look it up in /proc/kallsyms to find its address/module > > - look it up in the BTF kernel data to get types of arguments > > and return value > > - store a map representation of the trace information, keyed by > > instruction pointer > > - on function entry/return we look up the map to retrieve the BTF > > ids of the arguments/return values and can call bpf_snprintf_btf() > > with these argument/return values along with the type ids to store > > a string representation in the map. > > - this is then sent via perf event to userspace where it can be > > displayed. > > > > ksnoop can be used to show function signatures; for example: > > This is definitely quite useful! > > Is it possible to integrate this with bpftrace? That would save people > from learning yet another tool. ;) > I'd imagine (and hope!) other tracing tools will do this, but right now the aim is to make the task of tracing kernel data structures simpler, so having a tool dedicated to just that can hopefully help those discussions. 
There's a bit more work to be done to simplify that task, for example implementing Alexei's suggestion to support pretty-printing of data structures using BTF in libbpf. My hope is that we can evolve this tool - or something like it - to the point where we can solve that one problem easily, and that other more general tracers can then make use of that solution. I probably should have made all of this clearer in the patch submission, sorry about that. Alan
[RFC PATCH bpf-next] ksnoop: kernel argument/return value tracing/display using BTF
}, .tcp_tsorted_anchor = (struct list_head){ .next = (struct list_head *)0x930b6729bb40, .prev = (struct list_head *)0xa5bfaf00, }, }, .len = (unsigned int)84, .ignore_df = (__u8)0x1, (union){ .csum = (__wsum)2619910871, (struct){ .csum_start = (__u16)43735, .csum_offset = (__u16)39976, }, }, .transport_header = (__u16)36, .network_header = (__u16)16, .mac_header = (__u16)65535, .tail = (sk_buff_data_t)100, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x930b9d3cf800, .data = (unsigned char *)0x930b9d3cf810, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } ); It is possible to combine a request for entry arguments with a predicate on return value; for example we might want to see skbs on entry for cases where ip_send_skb eventually returned an error value. To do this, a predicate such as $ ksnoop "ip_send_skb(skb, return!=0)" ...could be used. On entry, rather than sending perf events the skb argument string representation is "stashed", and on return if the predicate is satisfied, the stashed data along with return-value-related data is sent as a perf event. This allows us to satisfy requests such as "show me entry argument X when the function fails, returning a negative errno". A note about overhead: it is very high. The overhead costs are a combination of known kprobe overhead costs and the cost of assembling string representations of kernel data. Use of predicates can mitigate overhead, as collection of trace data will only occur when the predicate is satisfied; in such cases it is best to lead with the predicate, e.g. ksnoop "ip_send_skb(skb->dev == 0, skb)" ...as this will be evaluated before the skb is stringified, and we potentially avoid that operation if the predicate fails. The same is _not_ true however in the stash case; for ksnoop "ip_send_skb(skb, return!=0)" ...we must collect the skb representation on entry as we do not yet know if the function will fail or not. 
If it does, the data is discarded rather than sent as a perf event. Signed-off-by: Alan Maguire --- tools/bpf/Makefile| 16 +- tools/bpf/ksnoop/Makefile | 102 + tools/bpf/ksnoop/ksnoop.bpf.c | 336 +++ tools/bpf/ksnoop/ksnoop.c | 981 ++ tools/bpf/ksnoop/ksnoop.h | 110 + 5 files changed, 1542 insertions(+), 3 deletions(-) create mode 100644 tools/bpf/ksnoop/Makefile create mode 100644 tools/bpf/ksnoop/ksnoop.bpf.c create mode 100644 tools/bpf/ksnoop/ksnoop.c create mode 100644 tools/bpf/ksnoop/ksnoop.h diff --git a/tools/bpf/Makefile b/tools/bpf/Makefile index 39bb322..8b2b6c9 100644 --- a/tools/bpf/Makefile +++ b/tools/bpf/Makefile @@ -73,7 +73,7 @@ $(OUTPUT)%.lex.o: $(OUTPUT)%.lex.c PROGS = $(OUTPUT)bpf_jit_disasm $(OUTPUT)bpf_dbg $(OUTPUT)bpf_asm -all: $(PROGS) bpftool runqslower +all: $(PROGS) bpftool runqslower ksnoop $(OUTPUT)bpf_jit_disasm: CFLAGS += -DPACKAGE='bpf_jit_disasm' $(OUTPUT)bpf_jit_disasm: $(OUTPUT)bpf_jit_disasm.o @@ -89,7 +89,7 @@ $(OUTPUT)bpf_exp.lex.c: $(OUTPUT)bpf_exp.yacc.c $(OUTPUT)bpf_exp.yacc.o: $(OUTPUT)bpf_exp.yacc.c $(OUTPUT)bpf_exp.lex.o: $(OUTPUT)bpf_exp.lex.c -clean: bpftool_clean runqslower_clean resolve_btfids_clean +clean: bpftool_clean runqslower_clean resolve_btfids_clean ksnoop_clean $(call QUIET_CLEAN, bpf-progs) $(Q)$(RM) -r -- $(OUTPUT)*.o $(OUTPUT)bpf_jit_disasm $(OUTPUT)bpf_dbg \
Re: [PATCH v2 bpf-next 0/3] bpf: support module BTF in BTF display helpers
On Sat, 5 Dec 2020, Yonghong Song wrote: > > > __builtin_btf_type_id() is really only supported in llvm12 > and 64bit return value support is pushed to llvm12 trunk > a while back. The builtin is introduced in llvm11 but has a > corner bug, so llvm12 is recommended. So if people use the builtin, > you can assume 64bit return value. libbpf support is required > here. So in my opinion, there is no need to do feature detection. > > Andrii has a patch to support 64bit return value for > __builtin_btf_type_id() and I assume that one should > be landed before or together with your patch. > > Just for your info. The following is an example you could > use to determine whether __builtin_btf_type_id() > supports btf object id at llvm level. > > -bash-4.4$ cat t.c > int test(int arg) { > return __builtin_btf_type_id(arg, 1); > } > > Compile to generate assembly code with latest llvm12 trunk: > clang -target bpf -O2 -S -g -mcpu=v3 t.c > In the asm code, you should see one line with > r0 = 1 ll > > Or you can generate obj code: > clang -target bpf -O2 -c -g -mcpu=v3 t.c > and then you disassemble the obj file > llvm-objdump -d --no-show-raw-insn --no-leading-addr t.o > You should see below in the output > r0 = 1 ll > > Use earlier version of llvm12 trunk, the builtin has > 32bit return value, you will see > r0 = 1 > which is a 32bit imm to r0, while "r0 = 1 ll" is > 64bit imm to r0. > Thanks for this Yonghong! I'm thinking the way I'll tackle it is to simply verify that the upper 32 bits specifying the veth module object id are non-zero; if they are zero, we'll skip the test (I think a skip probably makes sense as not everyone will have llvm12). Does that seem reasonable? With the additional few minor changes on top of Andrii's patch, the use of __builtin_btf_type_id() worked perfectly. Thanks! Alan
[PATCH v2 bpf-next 1/3] bpf: eliminate btf_module_mutex as RCU synchronization can be used
btf_module_mutex is used when manipulating the BTF module list. However we will wish to look up this list from BPF program context, and such contexts can include interrupt state where we cannot sleep due to a mutex_lock(). RCU usage here conforms quite closely to the example in the system call auditing example in Documentation/RCU/listRCU.rst ; and as such we can eliminate the lock and use list_del_rcu()/call_rcu() on module removal, and list_add_rcu() for module addition. Signed-off-by: Alan Maguire --- kernel/bpf/btf.c | 31 +-- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 8d6bdb4..333f41c 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5758,13 +5758,13 @@ bool btf_id_set_contains(const struct btf_id_set *set, u32 id) #ifdef CONFIG_DEBUG_INFO_BTF_MODULES struct btf_module { struct list_head list; + struct rcu_head rcu; struct module *module; struct btf *btf; struct bin_attribute *sysfs_attr; }; static LIST_HEAD(btf_modules); -static DEFINE_MUTEX(btf_module_mutex); static ssize_t btf_module_read(struct file *file, struct kobject *kobj, @@ -5777,10 +5777,21 @@ struct btf_module { return len; } +static void btf_module_free(struct rcu_head *rcu) +{ + struct btf_module *btf_mod = container_of(rcu, struct btf_module, rcu); + + if (btf_mod->sysfs_attr) + sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr); + btf_put(btf_mod->btf); + kfree(btf_mod->sysfs_attr); + kfree(btf_mod); +} + static int btf_module_notify(struct notifier_block *nb, unsigned long op, void *module) { - struct btf_module *btf_mod, *tmp; + struct btf_module *btf_mod; struct module *mod = module; struct btf *btf; int err = 0; @@ -5811,11 +5822,9 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, goto out; } - mutex_lock(&btf_module_mutex); btf_mod->module = module; btf_mod->btf = btf; - list_add(&btf_mod->list, &btf_modules); - mutex_unlock(&btf_module_mutex); + list_add_rcu(&btf_mod->list, &btf_modules); if 
(IS_ENABLED(CONFIG_SYSFS)) { struct bin_attribute *attr; @@ -5845,20 +5854,14 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, break; case MODULE_STATE_GOING: - mutex_lock(&btf_module_mutex); - list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) { + list_for_each_entry(btf_mod, &btf_modules, list) { if (btf_mod->module != module) continue; - list_del(&btf_mod->list); - if (btf_mod->sysfs_attr) - sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr); - btf_put(btf_mod->btf); - kfree(btf_mod->sysfs_attr); - kfree(btf_mod); + list_del_rcu(&btf_mod->list); + call_rcu(&btf_mod->rcu, btf_module_free); break; } - mutex_unlock(&btf_module_mutex); break; } out: -- 1.8.3.1
[PATCH v2 bpf-next 3/3] selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf
Verify that specifying a module object id in "struct btf_ptr *" along with a type id of a module-specific type will succeed. veth_stats_rx() is chosen because its function signature consists of a module-specific type "struct veth_stats" and a kernel-specific one "struct net_device". Currently the tests take the messy approach of determining object and type ids for the relevant module/function; __builtin_btf_type_id() supports object ids by returning a 64-bit value, but need to find a good way to determine if that support is present. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 124 + tools/testing/selftests/bpf/progs/bpf_iter.h | 2 +- tools/testing/selftests/bpf/progs/btf_ptr.h| 2 +- tools/testing/selftests/bpf/progs/veth_stats_rx.c | 72 4 files changed, 198 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c new file mode 100644 index 000..89805d7 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c @@ -0,0 +1,124 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "veth_stats_rx.skel.h" + +#define VETH_NAME "bpfveth0" + +/* Demonstrate that bpf_snprintf_btf succeeds for both module-specific + * and kernel-defined data structures; veth_stats_rx() is used as + * it has both module-specific and kernel-defined data as arguments. + * This test assumes that veth is built as a module and will skip if not. 
+ */ +void test_snprintf_btf_mod(void) +{ + struct btf *vmlinux_btf = NULL, *veth_btf = NULL; + struct veth_stats_rx *skel = NULL; + struct veth_stats_rx__bss *bss; + int err, duration = 0; + __u32 id; + + err = system("ip link add name " VETH_NAME " type veth"); + if (CHECK(err, "system", "ip link add veth failed: %d\n", err)) + return; + + vmlinux_btf = btf__parse_raw("/sys/kernel/btf/vmlinux"); + err = libbpf_get_error(vmlinux_btf); + if (CHECK(err, "parse vmlinux BTF", "failed parsing vmlinux BTF: %d\n", + err)) + goto cleanup; + veth_btf = btf__parse_raw_split("/sys/kernel/btf/veth", vmlinux_btf); + err = libbpf_get_error(veth_btf); + if (err == -ENOENT) { + printf("%s:SKIP:no BTF info for veth\n", __func__); + test__skip(); + goto cleanup; + } + + if (CHECK(err, "parse veth BTF", "failed parsing veth BTF: %d\n", err)) + goto cleanup; + + skel = veth_stats_rx__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + goto cleanup; + + err = veth_stats_rx__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + /* This could all be replaced by __builtin_btf_type_id(); but need +* a way to determine if it supports object and type id. In the +* meantime, look up type id for veth_stats and object id for veth. 
+*/ + bss->veth_stats_btf_id = btf__find_by_name(veth_btf, "veth_stats"); + + if (CHECK(bss->veth_stats_btf_id <= 0, "find 'struct veth_stats'", + "could not find 'struct veth_stats' in veth BTF: %d", + bss->veth_stats_btf_id)) + goto cleanup; + + bss->veth_obj_id = 0; + + for (id = 1; bpf_btf_get_next_id(id, &id) == 0; ) { + struct bpf_btf_info info; + __u32 len = sizeof(info); + char name[64]; + int fd; + + fd = bpf_btf_get_fd_by_id(id); + if (fd < 0) + continue; + + memset(&info, 0, sizeof(info)); + info.name_len = sizeof(name); + info.name = (__u64)name; + if (bpf_obj_get_info_by_fd(fd, &info, &len) || + strcmp((char *)info.name, "veth") != 0) + continue; + bss->veth_obj_id = info.id; + } + + if (CHECK(bss->veth_obj_id == 0, "get obj id for veth module", + "could not get veth module id")) + goto cleanup; + + err = veth_stats_rx__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* g
[PATCH v2 bpf-next 2/3] bpf: add module support to btf display helpers
bpf_snprintf_btf and bpf_seq_printf_btf use a "struct btf_ptr *" argument that specifies type information about the type to be displayed. Augment this information to include an object id. If this id is 0, the assumption is that it refers to a core kernel type from vmlinux; otherwise the object id specifies the module the type is in, or if no such id is found in the module list, we fall back to vmlinux. Signed-off-by: Alan Maguire --- include/linux/btf.h| 12 include/uapi/linux/bpf.h | 13 +++-- kernel/bpf/btf.c | 18 + kernel/trace/bpf_trace.c | 44 +++--- tools/include/uapi/linux/bpf.h | 13 +++-- 5 files changed, 77 insertions(+), 23 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 4c200f5..688786a 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -214,6 +214,14 @@ static inline const struct btf_var_secinfo *btf_type_var_secinfo( const char *btf_name_by_offset(const struct btf *btf, u32 offset); struct btf *btf_parse_vmlinux(void); struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES +struct btf *bpf_get_btf_module(__u32 obj_id); +#else +static inline struct btf *bpf_get_btf_module(__u32 obj_id) +{ + return ERR_PTR(-ENOTSUPP); +} +#endif #else static inline const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id) @@ -225,6 +233,10 @@ static inline const char *btf_name_by_offset(const struct btf *btf, { return NULL; } +static inline struct btf *bpf_get_btf_module(__u32 obj_id) +{ + return ERR_PTR(-ENOTSUPP); +} #endif #endif diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 1233f14..ccb75299 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3641,7 +3641,9 @@ struct bpf_stack_build_id { * the pointer data is carried out to avoid kernel crashes during * operation. Smaller types can use string space on the stack; * larger programs can use map data to store the string - * representation. + * representation. 
Module-specific data structures can be + * displayed if the module BTF object id is supplied in the + * *ptr*->obj_id field. * * The string can be subsequently shared with userspace via * bpf_perf_event_output() or ring buffer interfaces. @@ -5115,15 +5117,14 @@ struct bpf_sk_lookup { /* * struct btf_ptr is used for typed pointer representation; the * type id is used to render the pointer data as the appropriate type - * via the bpf_snprintf_btf() helper described above. A flags field - - * potentially to specify additional details about the BTF pointer - * (rather than its mode of display) - is included for future use. - * Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately. + * via the bpf_snprintf_btf() helper described above. The obj_id + * is used to specify an object id (such as a module); if unset + * a core vmlinux type id is assumed. */ struct btf_ptr { void *ptr; __u32 type_id; - __u32 flags;/* BTF ptr flags; unused at present. */ + __u32 obj_id; /* BTF object; vmlinux if 0 */ }; /* diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 333f41c..8ee691e 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5777,6 +5777,24 @@ struct btf_module { return len; } +struct btf *bpf_get_btf_module(__u32 obj_id) +{ + struct btf *btf = ERR_PTR(-ENOENT); + struct btf_module *btf_mod; + + rcu_read_lock(); + list_for_each_entry_rcu(btf_mod, &btf_modules, list) { + if (!btf_mod->btf || obj_id != btf_mod->btf->id) + continue; + + refcount_inc(&btf_mod->btf->refcnt); + btf = btf_mod->btf; + break; + } + rcu_read_unlock(); + return btf; +} + static void btf_module_free(struct rcu_head *rcu) { struct btf_module *btf_mod = container_of(rcu, struct btf_module, rcu); diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 23a390a..66d4120 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -75,8 +75,8 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stack(u64 r1, u64 r2, u64 
r3, u64 r4, u64 r5); static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, - u64 flags, const struct btf **btf, - s32 *btf_id); + u64 flags, struct btf **btf, + bool *btf_is_vmlinux, s32 *btf_id); /** * trace_call_bpf - invoke BPF program @@ -786,15 +786,22 @@ struct bpf_seq_prin
[PATCH v2 bpf-next 0/3] bpf: support module BTF in BTF display helpers
This series aims to add support to bpf_snprintf_btf() and bpf_seq_printf_btf() allowing them to store string representations of module-specific types, as well as the kernel-specific ones they currently support. Patch 1 removes the btf_module_mutex, since we will need to look up module BTF during BPF program execution, and we don't want to risk sleeping in the various contexts in which BPF can run. The access patterns to the BTF module list seem to conform to classic list RCU usage, so with a few minor tweaks this seems workable. Patch 2 replaces the unused flags field in struct btf_ptr with an obj_id field, allowing the specification of the id of a BTF module. If the value is 0, the core kernel vmlinux is assumed to contain the type's BTF information. Otherwise the module with that id is used to identify the type. If the object-id based lookup fails, we again fall back to vmlinux BTF. Patch 3 is a selftest that uses veth (when built as a module) and a kprobe to display both a module-specific and kernel-specific type; both are arguments to veth_stats_rx(). Currently it looks up the module-specific type and object ids using libbpf; in future, these lookups will likely be supported directly in the BPF program via __builtin_btf_type_id(); but I need a good way to determine whether that builtin supports object ids. 
Changes since RFC - add patch to remove module mutex - modify to use obj_id instead of module name as identifier in "struct btf_ptr" (Andrii) Alan Maguire (3): bpf: eliminate btf_module_mutex as RCU synchronization can be used bpf: add module support to btf display helpers selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf include/linux/btf.h| 12 ++ include/uapi/linux/bpf.h | 13 ++- kernel/bpf/btf.c | 49 +--- kernel/trace/bpf_trace.c | 44 ++-- tools/include/uapi/linux/bpf.h | 13 ++- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 124 + tools/testing/selftests/bpf/progs/bpf_iter.h | 2 +- tools/testing/selftests/bpf/progs/btf_ptr.h| 2 +- tools/testing/selftests/bpf/progs/veth_stats_rx.c | 72 9 files changed, 292 insertions(+), 39 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c -- 1.8.3.1
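The obj_id selection order described for patch 2 (0 means vmlinux; a nonzero id selects a module's BTF; an unknown id falls back to vmlinux) can be sketched in plain C. Everything below - struct fake_btf, the id values, lookup_btf() - is an illustrative stand-in, not the kernel's real data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the lookup order patch 2 describes: obj_id 0 selects the
 * core kernel (vmlinux) BTF, a nonzero obj_id selects a module's BTF,
 * and an id that matches no module falls back to vmlinux. */
struct fake_btf {
	unsigned int id;
	const char *name;
};

static struct fake_btf vmlinux_btf = { 1, "vmlinux" };
static struct fake_btf module_btfs[] = {
	{ 19, "ixgbe" },
	{ 23, "veth" },
};

static struct fake_btf *lookup_btf(unsigned int obj_id)
{
	size_t i;

	if (obj_id == 0)			/* 0 == core kernel vmlinux */
		return &vmlinux_btf;

	for (i = 0; i < sizeof(module_btfs) / sizeof(module_btfs[0]); i++)
		if (module_btfs[i].id == obj_id)
			return &module_btfs[i];

	return &vmlinux_btf;			/* unknown id: fall back */
}
```

In the kernel the module leg of this lookup is the RCU-protected list walk in bpf_get_btf_module(), which also takes a reference on the BTF it returns.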
Re: [RFC bpf-next 1/3] bpf: add module support to btf display helpers
On Sat, 14 Nov 2020, Yonghong Song wrote: > > > On 11/14/20 8:04 AM, Alexei Starovoitov wrote: > > On Fri, Nov 13, 2020 at 10:59 PM Andrii Nakryiko > > wrote: > >> > >> On Fri, Nov 13, 2020 at 10:11 AM Alan Maguire > >> wrote: > >>> > >>> bpf_snprintf_btf and bpf_seq_printf_btf use a "struct btf_ptr *" > >>> argument that specifies type information about the type to > >>> be displayed. Augment this information to include a module > >>> name, allowing such display to support module types. > >>> > >>> Signed-off-by: Alan Maguire > >>> --- > >>> include/linux/btf.h| 8 > >>> include/uapi/linux/bpf.h | 5 - > >>> kernel/bpf/btf.c | 18 ++ > >>> kernel/trace/bpf_trace.c | 42 > >>> -- > >>> tools/include/uapi/linux/bpf.h | 5 - > >>> 5 files changed, 66 insertions(+), 12 deletions(-) > >>> > >>> diff --git a/include/linux/btf.h b/include/linux/btf.h > >>> index 2bf6418..d55ca00 100644 > >>> --- a/include/linux/btf.h > >>> +++ b/include/linux/btf.h > >>> @@ -209,6 +209,14 @@ static inline const struct btf_var_secinfo > >>> *btf_type_var_secinfo( > >>> const struct btf_type *btf_type_by_id(const struct btf *btf, u32 > >>> type_id); > >>> const char *btf_name_by_offset(const struct btf *btf, u32 offset); > >>> struct btf *btf_parse_vmlinux(void); > >>> +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES > >>> +struct btf *bpf_get_btf_module(const char *name); > >>> +#else > >>> +static inline struct btf *bpf_get_btf_module(const char *name) > >>> +{ > >>> + return ERR_PTR(-ENOTSUPP); > >>> +} > >>> +#endif > >>> struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); > >>> #else > >>> static inline const struct btf_type *btf_type_by_id(const struct btf > >>> *btf, > >>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > >>> index 162999b..26978be 100644 > >>> --- a/include/uapi/linux/bpf.h > >>> +++ b/include/uapi/linux/bpf.h > >>> @@ -3636,7 +3636,8 @@ struct bpf_stack_build_id { > >>>* the pointer data is carried out to avoid kernel crashes > >>>during > >>>* 
operation. Smaller types can use string space on the > >>>stack; > >>>* larger programs can use map data to store the string > >>> - * representation. > >>> + * representation. Module-specific data structures can be > >>> + * displayed if the module name is supplied. > >>>* > >>>* The string can be subsequently shared with userspace via > >>>* bpf_perf_event_output() or ring buffer interfaces. > >>> @@ -5076,11 +5077,13 @@ struct bpf_sk_lookup { > >>>* potentially to specify additional details about the BTF pointer > >>>* (rather than its mode of display) - is included for future use. > >>>* Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately. > >>> + * A module name can be specified for module-specific data. > >>> */ > >>> struct btf_ptr { > >>> void *ptr; > >>> __u32 type_id; > >>> __u32 flags;/* BTF ptr flags; unused at present. */ > >>> + const char *module; /* optional module name. */ > >> > >> I think module name is a wrong API here, similarly how type name was > >> wrong API for specifying the type (and thus we use type_id here). > >> Using the module's BTF ID seems like a more suitable interface. That's > >> what I'm going to use for all kinds of existing BPF APIs that expect > >> BTF type to attach BPF programs. > >> > >> Right now, we use only type_id and implicitly know that it's in > >> vmlinux BTF. With module BTFs, we now need a pair of BTF object ID + > >> BTF type ID to uniquely identify the type. vmlinux BTF now can be > >> specified in two different ways: either leaving BTF object ID as zero > >> (for simplicity and backwards compatibility) or specifying it's actual > >> BTF obj ID (which pretty much alwa
[PATCH bpf-next] libbpf: btf__find_by_name[_kind] should use btf__get_nr_types()
When operating on split BTF, btf__find_by_name[_kind] will not iterate over all types since they use btf->nr_types as the number of types to iterate over. For split BTF this is the number of types _on top of base BTF_, so it will underestimate the number of types to iterate over, especially for vmlinux + module BTF, where the latter is much smaller. Use btf__get_nr_types() instead. Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support") Signed-off-by: Alan Maguire --- tools/lib/bpf/btf.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 2d0d064..8ff46cd 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -674,12 +674,12 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id) __s32 btf__find_by_name(const struct btf *btf, const char *type_name) { - __u32 i; + __u32 i, nr_types = btf__get_nr_types(btf); if (!strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= nr_types; i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name = btf__name_by_offset(btf, t->name_off); @@ -693,12 +693,12 @@ __s32 btf__find_by_name(const struct btf *btf, const char *type_name) __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, __u32 kind) { - __u32 i; + __u32 i, nr_types = btf__get_nr_types(btf); if (kind == BTF_KIND_UNKN || !strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= nr_types; i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name; -- 1.8.3.1
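The underestimate this patch fixes comes from how split BTF numbers its types: ids are global and contiguous, base ids come first, and a split BTF's own nr_types counts only the types layered on top of the base. A toy model makes the arithmetic concrete; struct toy_btf and toy_total_types() are hypothetical stand-ins for libbpf's internals, not its real layout:

```c
#include <assert.h>

/* Toy model of split BTF numbering: bounding the name-lookup loop
 * with the split BTF's own nr_types visits far too few ids; the
 * base + split total (what btf__get_nr_types() reports) is the
 * correct bound. */
struct toy_btf {
	unsigned int base_nr;	/* types inherited from base BTF; 0 if none */
	unsigned int nr_types;	/* types this BTF object adds itself */
};

static unsigned int toy_total_types(const struct toy_btf *btf)
{
	return btf->base_nr + btf->nr_types;
}
```

With a vmlinux base of tens of thousands of types and a module adding a few hundred, a loop bounded by the module's own nr_types never even reaches the module's first id.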
[RFC bpf-next 3/3] selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf
Verify that specifying a module name in "struct btf_ptr *" along with a type id of a module-specific type will succeed. veth_stats_rx() is chosen because its function signature consists of a module-specific type "struct veth_stats" and a kernel-specific one "struct net_device". Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 96 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 1 + tools/testing/selftests/bpf/progs/veth_stats_rx.c | 73 3 files changed, 170 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c new file mode 100644 index 000..f1b12df --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c @@ -0,0 +1,96 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "veth_stats_rx.skel.h" + +#define VETH_NAME "bpfveth0" + +/* Demonstrate that bpf_snprintf_btf succeeds for both module-specific + * and kernel-defined data structures; veth_stats_rx() is used as + * it has both module-specific and kernel-defined data as arguments. + * This test assumes that veth is built as a module and will skip if not. 
+ */ +void test_snprintf_btf_mod(void) +{ + struct btf *vmlinux_btf = NULL, *veth_btf = NULL; + struct veth_stats_rx *skel = NULL; + struct veth_stats_rx__bss *bss; + int err, duration = 0; + __u32 id; + + err = system("ip link add name " VETH_NAME " type veth"); + if (CHECK(err, "system", "ip link add veth failed: %d\n", err)) + return; + + vmlinux_btf = btf__parse_raw("/sys/kernel/btf/vmlinux"); + err = libbpf_get_error(vmlinux_btf); + if (CHECK(err, "parse vmlinux BTF", "failed parsing vmlinux BTF: %d\n", + err)) + goto cleanup; + veth_btf = btf__parse_raw_split("/sys/kernel/btf/veth", vmlinux_btf); + err = libbpf_get_error(veth_btf); + if (err == -ENOENT) { + printf("%s:SKIP:no BTF info for veth\n", __func__); + test__skip(); +goto cleanup; + } + + if (CHECK(err, "parse veth BTF", "failed parsing veth BTF: %d\n", err)) + goto cleanup; + + skel = veth_stats_rx__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + goto cleanup; + + err = veth_stats_rx__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + bss->veth_stats_btf_id = btf__find_by_name(veth_btf, "veth_stats"); + + if (CHECK(bss->veth_stats_btf_id <= 0, "find 'struct veth_stats'", + "could not find 'struct veth_stats' in veth BTF: %d", + bss->veth_stats_btf_id)) + goto cleanup; + + err = veth_stats_rx__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate stats event, then delete; this ensures the program +* triggers prior to reading status. +*/ + err = system("ethtool -S " VETH_NAME " > /dev/null"); + if (CHECK(err, "system", "ethtool -S failed: %d\n", err)) + goto cleanup; + + system("ip link delete " VETH_NAME); + + /* +* Make sure veth_stats_rx program was triggered and it set +* expected return values from bpf_trace_printk()s and all +* tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_snprintf_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + system("ip link delete " VETH_NAME ">/dev/null 2>&1"); + if (skel) + veth_stats_rx__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/btf_ptr.h b/tools/testing/s
[RFC bpf-next 1/3] bpf: add module support to btf display helpers
bpf_snprintf_btf and bpf_seq_printf_btf use a "struct btf_ptr *" argument that specifies type information about the type to be displayed. Augment this information to include a module name, allowing such display to support module types. Signed-off-by: Alan Maguire --- include/linux/btf.h| 8 include/uapi/linux/bpf.h | 5 - kernel/bpf/btf.c | 18 ++ kernel/trace/bpf_trace.c | 42 -- tools/include/uapi/linux/bpf.h | 5 - 5 files changed, 66 insertions(+), 12 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 2bf6418..d55ca00 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -209,6 +209,14 @@ static inline const struct btf_var_secinfo *btf_type_var_secinfo( const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); const char *btf_name_by_offset(const struct btf *btf, u32 offset); struct btf *btf_parse_vmlinux(void); +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES +struct btf *bpf_get_btf_module(const char *name); +#else +static inline struct btf *bpf_get_btf_module(const char *name) +{ + return ERR_PTR(-ENOTSUPP); +} +#endif struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); #else static inline const struct btf_type *btf_type_by_id(const struct btf *btf, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 162999b..26978be 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3636,7 +3636,8 @@ struct bpf_stack_build_id { * the pointer data is carried out to avoid kernel crashes during * operation. Smaller types can use string space on the stack; * larger programs can use map data to store the string - * representation. + * representation. Module-specific data structures can be + * displayed if the module name is supplied. * * The string can be subsequently shared with userspace via * bpf_perf_event_output() or ring buffer interfaces. 
@@ -5076,11 +5077,13 @@ struct bpf_sk_lookup { * potentially to specify additional details about the BTF pointer * (rather than its mode of display) - is included for future use. * Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately. + * A module name can be specified for module-specific data. */ struct btf_ptr { void *ptr; __u32 type_id; __u32 flags;/* BTF ptr flags; unused at present. */ + const char *module; /* optional module name. */ }; /* diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 6b2d508..3ddd1fd 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5738,6 +5738,24 @@ struct btf_module { static LIST_HEAD(btf_modules); static DEFINE_MUTEX(btf_module_mutex); +struct btf *bpf_get_btf_module(const char *name) +{ + struct btf *btf = ERR_PTR(-ENOENT); + struct btf_module *btf_mod, *tmp; + + mutex_lock(&btf_module_mutex); + list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) { + if (!btf_mod->btf || strcmp(name, btf_mod->btf->name) != 0) + continue; + + refcount_inc(&btf_mod->btf->refcnt); + btf = btf_mod->btf; + break; + } + mutex_unlock(&btf_module_mutex); + return btf; +} + static ssize_t btf_module_read(struct file *file, struct kobject *kobj, struct bin_attribute *bin_attr, diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index cfce60a..a4d5a26 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -73,8 +73,7 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, - u64 flags, const struct btf **btf, - s32 *btf_id); + u64 flags, struct btf **btf, s32 *btf_id); /** * trace_call_bpf - invoke BPF program @@ -784,7 +783,7 @@ struct bpf_seq_printf_buf { BPF_CALL_4(bpf_seq_printf_btf, struct seq_file *, m, struct btf_ptr *, ptr, u32, btf_ptr_size, u64, flags) { - const struct btf *btf; + struct btf *btf; s32 btf_id; int ret; @@ -792,7 
+791,11 @@ struct bpf_seq_printf_buf { if (ret) return ret; - return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); + ret = btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); + if (btf_ptr_size == sizeof(struct btf_ptr) && ptr->module) + btf_put(btf); + + return ret; } static const struct bpf_func_proto bpf_seq_printf_btf_proto = { @@ -1199,18 +1202,33 @@ static bool bpf_d_path_allowed(const struct bpf_pro
[RFC bpf-next 2/3] libbpf: btf__find_by_name[_kind] should use btf__get_nr_types()
When operating on split BTF, btf__find_by_name[_kind] will not iterate over all types since they use btf->nr_types as the number of types to iterate over. For split BTF this is the number of types _on top of base BTF_, so it will underestimate the number of types to iterate over, especially for vmlinux + module BTF, where the latter is much smaller. Use btf__get_nr_types() instead. Signed-off-by: Alan Maguire --- tools/lib/bpf/btf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 2d0d064..0fccf4b 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -679,7 +679,7 @@ __s32 btf__find_by_name(const struct btf *btf, const char *type_name) if (!strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= btf__get_nr_types(btf); i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name = btf__name_by_offset(btf, t->name_off); @@ -698,7 +698,7 @@ __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, if (kind == BTF_KIND_UNKN || !strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= btf__get_nr_types(btf); i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name; -- 1.8.3.1
[RFC bpf-next 0/3] bpf: support module BTF in btf display helpers
This series aims to add support to bpf_snprintf_btf() and bpf_seq_printf_btf() allowing them to store string representations of module-specific types, as well as the kernel-specific ones they currently support. Patch 1 adds an additional field "const char *module" to "struct btf_ptr", allowing the specification of a module name along with a data pointer, BTF id, etc. It is then used to look up module BTF, rather than the default vmlinux BTF. Patch 2 makes a small fix to libbpf to allow btf__find_by_name[_kind] to work with split BTF. Without this fix, looking up the type id of a module-specific type will fail in patch 3. Patch 3 is a selftest that uses veth (when built as a module) and a kprobe to display both a module-specific and kernel-specific type; both are arguments to veth_stats_rx(). Alan Maguire (3): bpf: add module support to btf display helpers libbpf: btf__find_by_name[_kind] should use btf__get_nr_types() selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf include/linux/btf.h| 8 ++ include/uapi/linux/bpf.h | 5 +- kernel/bpf/btf.c | 18 kernel/trace/bpf_trace.c | 42 +++--- tools/include/uapi/linux/bpf.h | 5 +- tools/lib/bpf/btf.c| 4 +- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 96 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 1 + tools/testing/selftests/bpf/progs/veth_stats_rx.c | 73 9 files changed, 238 insertions(+), 14 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c -- 1.8.3.1
Re: [PATCH bpf-next 5/5] tools/bpftool: add support for in-kernel and named BTF in `btf show`
On Thu, 5 Nov 2020, Andrii Nakryiko wrote: > Display vmlinux BTF name and kernel module names when listing available BTFs > on the system. > > In human-readable output mode, module BTFs are reported with "name > [module-name]", while vmlinux BTF will be reported as "name [vmlinux]". > Square brackets are added by bpftool and follow kernel convention when > displaying modules in human-readable text outputs. > I had a go at testing this and all looks good, but I was curious if "bpftool btf dump" is expected to work with module BTF? I see the various modules in /sys/kernel/btf, but if I run: # bpftool btf dump file /sys/kernel/btf/ixgbe Error: failed to load BTF from /sys/kernel/btf/ixgbe: Invalid argument ...while it still works for vmlinux: # bpftool btf dump file /sys/kernel/btf/vmlinux [1] INT '(anon)' size=4 bits_offset=0 nr_bits=32 encoding=(none) [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none) ... "bpftool btf show" works for ixgbe: # bpftool btf show|grep ixgbe 19: name [ixgbe] size 182074B Is this perhaps not expected to work yet? (I updated pahole to the latest changes etc and BTF generation seemed to work fine for modules during kernel build). For the "bpftool btf show" functionality, feel free to add Tested-by: Alan Maguire Thanks! Alan
Re: [PATCH 5.8 574/633] selftests/bpf: Fix overflow tests to reflect iter size increase
On Tue, 27 Oct 2020, Greg Kroah-Hartman wrote: > From: Alan Maguire > > [ Upstream commit eb58bbf2e5c7917aa30bf8818761f26bbeeb2290 ] > > bpf iter size increase to PAGE_SIZE << 3 means overflow tests assuming > page size need to be bumped also. > Alexei can correct me if I've got this wrong but I don't believe it's a stable backport candidate. This selftests change should only be relevant when the BPF iterator size has been bumped up as it was in af65320 bpf: Bump iter seq size to support BTF representation of large data structures ...so I don't _think_ this commit belongs in stable unless the above commit is backported also (and unless I'm missing something I don't see a burning reason to do that currently). Backporting this alone will likely induce bpf test failures. Apologies if the "Fix" in the title was misleading; it should probably have been "Update" to reflect the fact it's not fixing an existing bug but rather updating the test to operate correctly in the context of other changes in the for-next patch series it was part of. Thanks! Alan
[PATCH bpf-next 0/2] selftests/bpf: BTF-based kernel data display fixes
Resolve issues introduced in the bpf selftests by the BTF-based kernel data display selftests; these are: - a warning introduced in snprintf_btf.c; and - compilation failures with an old kernel's vmlinux.h Alan Maguire (2): selftests/bpf: fix unused-result warning in snprintf_btf.c selftests/bpf: ensure snprintf_btf/bpf_iter tests compatibility with old vmlinux.h .../selftests/bpf/prog_tests/snprintf_btf.c| 2 +- tools/testing/selftests/bpf/progs/bpf_iter.h | 23 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 27 ++ .../selftests/bpf/progs/netif_receive_skb.c| 2 +- 4 files changed, 52 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/bpf/progs/btf_ptr.h -- 1.8.3.1
[PATCH bpf-next 2/2] selftests/bpf: ensure snprintf_btf/bpf_iter tests compatibility with old vmlinux.h
Andrii reports that bpf selftests relying on "struct btf_ptr" and BTF_F_* values will not build as vmlinux.h for older kernels will not include "struct btf_ptr" or the BTF_F_* enum values. Undefine and redefine them to work around this. Fixes: b72091bd4ee4 ("selftests/bpf: Add test for bpf_seq_printf_btf helper") Fixes: 076a95f5aff2 ("selftests/bpf: Add bpf_snprintf_btf helper tests") Reported-by: Andrii Nakryiko Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/progs/bpf_iter.h | 23 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 27 ++ .../selftests/bpf/progs/netif_receive_skb.c| 2 +- 3 files changed, 51 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/progs/btf_ptr.h diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h index df682af..6a12554 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter.h +++ b/tools/testing/selftests/bpf/progs/bpf_iter.h @@ -14,6 +14,11 @@ #define bpf_iter__bpf_map_elem bpf_iter__bpf_map_elem___not_used #define bpf_iter__bpf_sk_storage_map bpf_iter__bpf_sk_storage_map___not_used #define bpf_iter__sockmap bpf_iter__sockmap___not_used +#define btf_ptr btf_ptr___not_used +#define BTF_F_COMPACT BTF_F_COMPACT___not_used +#define BTF_F_NONAME BTF_F_NONAME___not_used +#define BTF_F_PTR_RAW BTF_F_PTR_RAW___not_used +#define BTF_F_ZERO BTF_F_ZERO___not_used #include "vmlinux.h" #undef bpf_iter_meta #undef bpf_iter__bpf_map @@ -28,6 +33,11 @@ #undef bpf_iter__bpf_map_elem #undef bpf_iter__bpf_sk_storage_map #undef bpf_iter__sockmap +#undef btf_ptr +#undef BTF_F_COMPACT +#undef BTF_F_NONAME +#undef BTF_F_PTR_RAW +#undef BTF_F_ZERO struct bpf_iter_meta { struct seq_file *seq; @@ -105,3 +115,16 @@ struct bpf_iter__sockmap { void *key; struct sock *sk; }; + +struct btf_ptr { + void *ptr; + __u32 type_id; + __u32 flags; +}; + +enum { + BTF_F_COMPACT = (1ULL << 0), + BTF_F_NONAME= (1ULL << 1), + BTF_F_PTR_RAW = (1ULL << 2), + BTF_F_ZERO = (1ULL << 3), +}; diff 
--git a/tools/testing/selftests/bpf/progs/btf_ptr.h b/tools/testing/selftests/bpf/progs/btf_ptr.h new file mode 100644 index 000..c3c9797 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/btf_ptr.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +/* "undefine" structs in vmlinux.h, because we "override" them below */ +#define btf_ptr btf_ptr___not_used +#define BTF_F_COMPACT BTF_F_COMPACT___not_used +#define BTF_F_NONAME BTF_F_NONAME___not_used +#define BTF_F_PTR_RAW BTF_F_PTR_RAW___not_used +#define BTF_F_ZERO BTF_F_ZERO___not_used +#include "vmlinux.h" +#undef btf_ptr +#undef BTF_F_COMPACT +#undef BTF_F_NONAME +#undef BTF_F_PTR_RAW +#undef BTF_F_ZERO + +struct btf_ptr { + void *ptr; + __u32 type_id; + __u32 flags; +}; + +enum { + BTF_F_COMPACT = (1ULL << 0), + BTF_F_NONAME= (1ULL << 1), + BTF_F_PTR_RAW = (1ULL << 2), + BTF_F_ZERO = (1ULL << 3), +}; diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c index b873d80..6b67003 100644 --- a/tools/testing/selftests/bpf/progs/netif_receive_skb.c +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2020, Oracle and/or its affiliates. */ -#include "vmlinux.h" +#include "btf_ptr.h" #include #include #include -- 1.8.3.1
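The define-before-include/undef/redefine pattern the patch applies to vmlinux.h collisions can be demonstrated standalone. In this sketch a struct declared under the renamed tag stands in for what vmlinux.h might declare:

```c
#include <stddef.h>

/* Rename the colliding identifier before the header's definition is
 * seen, then undef the macro and supply our own definition. */
#define btf_ptr btf_ptr___not_used

/* --- stand-in for the definition that would come from vmlinux.h --- */
struct btf_ptr {		/* really declares struct btf_ptr___not_used */
	void *ptr;
};
/* ------------------------------------------------------------------ */

#undef btf_ptr

/* Our own definition; its tag no longer collides with the header's. */
struct btf_ptr {
	void *ptr;
	unsigned int type_id;
	unsigned int flags;
};
```

Because the macro only rewrites the identifier, both struct tags end up defined and distinct, so code can use the local, possibly newer, layout even against an older vmlinux.h.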
[PATCH bpf-next 1/2] selftests/bpf: fix unused-result warning in snprintf_btf.c
Daniel reports: +system("ping -c 1 127.0.0.1 > /dev/null"); This generates the following new warning when compiling BPF selftests: [...] EXT-OBJ [test_progs] cgroup_helpers.o EXT-OBJ [test_progs] trace_helpers.o EXT-OBJ [test_progs] network_helpers.o EXT-OBJ [test_progs] testing_helpers.o TEST-OBJ [test_progs] snprintf_btf.test.o /root/bpf-next/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c: In function ‘test_snprintf_btf’: /root/bpf-next/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c:30:2: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result] system("ping -c 1 127.0.0.1 > /dev/null"); ^ [...] Fixes: 076a95f5aff2 ("selftests/bpf: Add bpf_snprintf_btf helper tests") Reported-by: Daniel Borkmann Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/snprintf_btf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c index 3a8ecf8..3c63a70 100644 --- a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c @@ -27,7 +27,7 @@ void test_snprintf_btf(void) goto cleanup; /* generate receive event */ - system("ping -c 1 127.0.0.1 > /dev/null"); + (void) system("ping -c 1 127.0.0.1 > /dev/null"); if (bss->skip) { printf("%s:SKIP:no __builtin_btf_type_id\n", __func__); -- 1.8.3.1
Re: [PATCH v6 bpf-next 6/6] selftests/bpf: add test for bpf_seq_printf_btf helper
On Thu, 24 Sep 2020, Alexei Starovoitov wrote: > to whatever number, but printing single task_struct needs ~800 lines and > ~18kbytes. Humans can scroll through that much spam, but can we make it less > verbose by default somehow? > May be not in this patch set, but in the follow up? > One approach that might work would be to devote 4 bits or so of flag space to a "maximum depth" specifier; i.e. at depth 1, only base types are displayed, no aggregate types like arrays, structs and unions. We've already got depth processing in the code to figure out if possibly zeroed nested data needs to be displayed, so it should hopefully be a simple follow-up. One way to express it would be to use "..." to denote field(s) were omitted. We could even use the number of "."s to denote cases where multiple fields were omitted, giving a visual sense of how much data was omitted. So for example with BTF_F_MAX_DEPTH(1), task_struct looks like this: (struct task_struct){ .state = ()1, .stack = ( *)0x029d1e6f, ... .flags = (unsigned int)4194560, ... .cpu = (unsigned int)36, .wakee_flips = (unsigned int)11, .wakee_flip_decay_ts = (long unsigned int)4294914874, .last_wakee = (struct task_struct *)0x6c7dfe6d, .recent_used_cpu = (int)19, .wake_cpu = (int)36, .prio = (int)120, .static_prio = (int)120, .normal_prio = (int)120, .sched_class = (struct sched_class *)0xad1561e6, ... .exec_start = (u64)674402577156, .sum_exec_runtime = (u64)5009664110, .vruntime = (u64)167038057, .prev_sum_exec_runtime = (u64)5009578167, .nr_migrations = (u64)54, .depth = (int)1, .parent = (struct sched_entity *)0xcba60e7d, .cfs_rq = (struct cfs_rq *)0x14f353ed, ... ...etc. What do you think? 
> > +SEC("iter/task") > > +int dump_task_fs_struct(struct bpf_iter__task *ctx) > > +{ > > + static const char fs_type[] = "struct fs_struct"; > > + struct seq_file *seq = ctx->meta->seq; > > + struct task_struct *task = ctx->task; > > + struct fs_struct *fs = (void *)0; > > + static struct btf_ptr ptr = { }; > > + long ret; > > + > > + if (task) > > + fs = task->fs; > > + > > + ptr.type = fs_type; > > + ptr.ptr = fs; > > imo the following is better: >ptr.type_id = __builtin_btf_type_id(*fs, 1); >ptr.ptr = fs; > I'm still seeing lookup failures using __builtin_btf_type_id(,1) - whereas both __builtin_btf_type_id(,0) and Andrii's suggestion of bpf_core_type_id_kernel() work. Not sure what's going on - pahole is v1.17, clang is clang version 12.0.0 (/mnt/src/llvm-project/clang 7ab7b979d29e1e43701cf690f5cf1903740f50e3) > > + > > + if (ctx->meta->seq_num == 0) > > + BPF_SEQ_PRINTF(seq, "Raw BTF fs_struct per task\n"); > > + > > + ret = bpf_seq_printf_btf(seq, &ptr, sizeof(ptr), 0); > > + switch (ret) { > > + case 0: > > + tasks++; > > + break; > > + case -ERANGE: > > + /* NULL task or task->fs, don't count it as an error. */ > > + break; > > + default: > > + seq_err = ret; > > + break; > > + } > > Please add handling of E2BIG to this switch. Otherwise > printing large amount of tiny structs will overflow PAGE_SIZE and E2BIG > will be send to user space. > Like this: > @@ -40,6 +40,8 @@ int dump_task_fs_struct(struct bpf_iter__task *ctx) > case -ERANGE: > /* NULL task or task->fs, don't count it as an error. */ > break; > + case -E2BIG: > + return 1; > Done. 
> Also please change bpf_seq_read() like this: > diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c > index 30833bbf3019..8f10e30ea0b0 100644 > --- a/kernel/bpf/bpf_iter.c > +++ b/kernel/bpf/bpf_iter.c > @@ -88,8 +88,8 @@ static ssize_t bpf_seq_read(struct file *file, char __user > *buf, size_t size, > mutex_lock(&seq->lock); > > if (!seq->buf) { > - seq->size = PAGE_SIZE; > - seq->buf = kmalloc(seq->size, GFP_KERNEL); > + seq->size = PAGE_SIZE << 3; > + seq->buf = kvmalloc(seq->size, GFP_KERNEL); > > So users can print task_struct by default. > Hopefully we will figure out how to deal with spam later. > Thanks for all the help and suggestions! I didn't want to attribute the patch bumping seq size in v7 to you without your permission, but it's all your work so if I need to respin let me know if you'd like me to fix that. Thanks again! Alan
[PATCH v7 bpf-next 8/8] selftests/bpf: add test for bpf_seq_printf_btf helper
Add a test verifying iterating over tasks and displaying BTF representation of task_struct succeeds. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 74 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 50 +++ 2 files changed, 124 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index ad9de13..af15630 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -7,6 +7,7 @@ #include "bpf_iter_task.skel.h" #include "bpf_iter_task_stack.skel.h" #include "bpf_iter_task_file.skel.h" +#include "bpf_iter_task_btf.skel.h" #include "bpf_iter_tcp4.skel.h" #include "bpf_iter_tcp6.skel.h" #include "bpf_iter_udp4.skel.h" @@ -167,6 +168,77 @@ static void test_task_file(void) bpf_iter_task_file__destroy(skel); } +#define TASKBUFSZ 32768 + +static char taskbuf[TASKBUFSZ]; + +static void do_btf_read(struct bpf_iter_task_btf *skel) +{ + struct bpf_program *prog = skel->progs.dump_task_struct; + struct bpf_iter_task_btf__bss *bss = skel->bss; + int iter_fd = -1, len = 0, bufleft = TASKBUFSZ; + struct bpf_link *link; + char *buf = taskbuf; + + link = bpf_program__attach_iter(prog, NULL); + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n")) + goto free_link; + + do { + len = read(iter_fd, buf, bufleft); + if (len > 0) { + buf += len; + bufleft -= len; + } + } while (len > 0); + + if (bss->skip) { + printf("%s:SKIP:no __builtin_btf_type_id\n", __func__); + test__skip(); + goto free_link; + } + + if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + goto free_link; + + CHECK(strstr(taskbuf, "(struct task_struct)") == NULL, + "check for btf representation of 
task_struct in iter data", + "struct task_struct not found"); +free_link: + if (iter_fd > 0) + close(iter_fd); + bpf_link__destroy(link); +} + +static void test_task_btf(void) +{ + struct bpf_iter_task_btf__bss *bss; + struct bpf_iter_task_btf *skel; + + skel = bpf_iter_task_btf__open_and_load(); + if (CHECK(!skel, "bpf_iter_task_btf__open_and_load", + "skeleton open_and_load failed\n")) + return; + + bss = skel->bss; + + do_btf_read(skel); + + if (CHECK(bss->tasks == 0, "check if iterated over tasks", + "no task iteration, did BPF program run?\n")) + goto cleanup; + + CHECK(bss->seq_err != 0, "check for unexpected err", + "bpf_seq_printf_btf returned %ld", bss->seq_err); + +cleanup: + bpf_iter_task_btf__destroy(skel); +} + static void test_tcp4(void) { struct bpf_iter_tcp4 *skel; @@ -957,6 +1029,8 @@ void test_bpf_iter(void) test_task_stack(); if (test__start_subtest("task_file")) test_task_file(); + if (test__start_subtest("task_btf")) + test_task_btf(); if (test__start_subtest("tcp4")) test_tcp4(); if (test__start_subtest("tcp6")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c new file mode 100644 index 000..a1ddc36 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "bpf_iter.h" +#include +#include +#include + +#include + +char _license[] SEC("license") = "GPL"; + +long tasks = 0; +long seq_err = 0; +bool skip = false; + +SEC("iter/task") +int dump_task_struct(struct bpf_iter__task *ctx) +{ + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + static struct btf_ptr ptr = { }; + long ret; + +#if __has_builtin(__builtin_btf_type_id) + ptr.type_id = bpf_core_type_id_kernel(struct task_struct); + ptr.ptr = task; + + if (ctx->meta->seq_num == 0) + BPF_SEQ_PRINTF(seq, "Raw BTF
[PATCH v7 bpf-next 5/8] bpf: bump iter seq size to support BTF representation of large data structures
BPF iter size is limited to PAGE_SIZE; if we wish to display BTF-based representations of larger kernel data structures such as task_struct, this will be insufficient. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- kernel/bpf/bpf_iter.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index 30833bb..8f10e30 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -88,8 +88,8 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size, mutex_lock(&seq->lock); if (!seq->buf) { - seq->size = PAGE_SIZE; - seq->buf = kmalloc(seq->size, GFP_KERNEL); + seq->size = PAGE_SIZE << 3; + seq->buf = kvmalloc(seq->size, GFP_KERNEL); if (!seq->buf) { err = -ENOMEM; goto done; -- 1.8.3.1
[PATCH v7 bpf-next 7/8] bpf: add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_snprintf_btf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- include/linux/btf.h| 2 ++ include/uapi/linux/bpf.h | 9 + kernel/bpf/btf.c | 4 ++-- kernel/bpf/core.c | 1 + kernel/trace/bpf_trace.c | 33 + tools/include/uapi/linux/bpf.h | 9 + 6 files changed, 56 insertions(+), 2 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 3e5cdc2..024e16f 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -68,6 +68,8 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags); /* * Copy len bytes of string representation of obj of BTF type_id into buf. diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index fcafe80..82817c4 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3623,6 +3623,14 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type_id as per bpf_snprintf_btf(). + * *flags* are identical to those used for bpf_snprintf_btf. + * Return + * 0 on success or a negative error in case of failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3775,6 +3783,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(snprintf_btf), \ + FN(seq_printf_btf), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index be5acf6..99e307a 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5346,8 +5346,8 @@ static void btf_seq_show(struct btf_show *show, const char *fmt, seq_vprintf((struct seq_file *)show->target, fmt, args); } -static int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, - void *obj, struct seq_file *m, u64 flags) +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, + void *obj, struct seq_file *m, u64 flags) { struct btf_show sseq; diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 403fb23..c4ba45f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2217,6 +2217,7 @@ void bpf_user_rnd_init_once(void) const struct bpf_func_proto bpf_get_local_storage_proto __weak; const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak; const struct bpf_func_proto bpf_snprintf_btf_proto __weak; +const struct bpf_func_proto bpf_seq_printf_btf_proto __weak; const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void) { diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 983cbd3..6ac254e 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -71,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); +static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id); + /** * trace_call_bpf - invoke BPF program * @call: tracepoint event @@ -776,6 +780,31 @@ struct bpf_seq_printf_buf { .arg3_type = ARG_CONST_SIZE_OR_ZERO, }; +BPF_CALL_4(bpf_seq_printf_btf, struct 
seq_file *, m, struct btf_ptr *, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf *btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + + return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); +} + +static const struct bpf_func_proto bpf_seq_printf_btf_proto = { + .func = bpf_seq_printf_btf,
[PATCH v7 bpf-next 6/8] selftests/bpf: fix overflow tests to reflect iter size increase
bpf iter size increase to PAGE_SIZE << 3 means overflow tests assuming page size need to be bumped also. Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index fe1a83b9..ad9de13 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -352,7 +352,7 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) struct bpf_map_info map_info = {}; struct bpf_iter_test_kern4 *skel; struct bpf_link *link; - __u32 page_size; + __u32 iter_size; char *buf; skel = bpf_iter_test_kern4__open(); @@ -374,19 +374,19 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) "map_creation failed: %s\n", strerror(errno))) goto free_map1; - /* bpf_seq_printf kernel buffer is one page, so one map + /* bpf_seq_printf kernel buffer is 8 pages, so one map * bpf_seq_write will mostly fill it, and the other map * will partially fill and then trigger overflow and need * bpf_seq_read restart. */ - page_size = sysconf(_SC_PAGE_SIZE); + iter_size = sysconf(_SC_PAGE_SIZE) << 3; if (test_e2big_overflow) { - skel->rodata->print_len = (page_size + 8) / 8; - expected_read_len = 2 * (page_size + 8); + skel->rodata->print_len = (iter_size + 8) / 8; + expected_read_len = 2 * (iter_size + 8); } else if (!ret1) { - skel->rodata->print_len = (page_size - 8) / 8; - expected_read_len = 2 * (page_size - 8); + skel->rodata->print_len = (iter_size - 8) / 8; + expected_read_len = 2 * (iter_size - 8); } else { skel->rodata->print_len = 1; expected_read_len = 2 * 8; -- 1.8.3.1
[PATCH v7 bpf-next 4/8] selftests/bpf: add bpf_snprintf_btf helper tests
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program. Tests are skipped if __builtin_btf_type_id is not available to retrieve BTF type ids. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf.c| 60 + .../selftests/bpf/progs/netif_receive_skb.c| 249 + 2 files changed, 309 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c new file mode 100644 index 000..3a8ecf8 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include "netif_receive_skb.skel.h" + +/* Demonstrate that bpf_snprintf_btf succeeds and that various data types + * are formatted correctly. + */ +void test_snprintf_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 > /dev/null"); + + if (bss->skip) { + printf("%s:SKIP:no __builtin_btf_type_id\n", __func__); + test__skip(); + goto cleanup; + } + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_snprintf_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..b873d80 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,249 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include "vmlinux.h" +#include +#include +#include + +#include + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; +bool skip = false; + +#define STRSIZE2048 +#define EXPECTED_STRSIZE 256 + +#ifndef ARRAY_SIZE +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#endif + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, char[STRSIZE]); +} strdata SEC(".maps"); + +static int __strncmp(const void *m1, const void *m2, size_t len) +{ + const unsigned char *s1 = m1; + const unsigned char *s2 = m2; + int i, delta = 0; + + for (i = 0; i < len; i++) { + delta = s1[i] - s2[i]; + if (delta || s1[i] == 0 || s2[i] == 0) + break; + } + return delta; +} + +#if __has_builtin(__builtin_btf_type_id) +#defineTEST_BTF(_str, _type, _flags, _expected, ...) \ + do {\ + static const char _expectedval[EXPECTED_STRSIZE] = \ + _expected; \ + static const char _ptrtype[64] = #_type;\ + __u64 _hflags = _flags | BTF_F_COMPACT; \ + static _type _ptrdata = __VA_ARGS__;\ + static struct btf_ptr _ptr = { }; \ + int _cmp; \ +
[PATCH v7 bpf-next 2/8] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 1007 +-- 2 files changed, 941 insertions(+), 102 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index a9af5e7..d0f5d3c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 5d3c36e..be5acf6 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -284,6 +284,91 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 32 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* Type name size */ +#define BTF_SHOW_NAME_SIZE 80 + +/* + * Common data to all BTF show operations. 
Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. See btf_type_show() below. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by show->state.depth_check being set, and if we + * encounter a non-zero value we set show->state.depth_to_show to + * the depth at which we encountered it. When we have completed the + * first pass, we will know if anything needs to be displayed if + * depth_to_show > depth. Se
[PATCH v7 bpf-next 3/8] bpf: add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); struct btf_ptr * specifies - a pointer to the data to be traced - the BTF id of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed. For example a BPF program with a struct sk_buff *skb could do the following: static struct btf_ptr b = { }; b.ptr = skb; b.type_id = __builtin_btf_type_id(struct sk_buff, 1); bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0); Default output looks like this: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x7524fd8b, .data = (unsigned char *)0x7524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags modifying display are as follows: - BTF_F_COMPACT:no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW:show raw (unobfuscated) pointer values; equivalent to %px. 
- BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 +++--- include/uapi/linux/bpf.h | 67 ++ kernel/bpf/core.c | 1 + kernel/bpf/helpers.c | 4 +++ kernel/trace/bpf_trace.c | 65 scripts/bpf_helpers_doc.py | 2 ++ tools/include/uapi/linux/bpf.h | 67 ++ 8 files changed, 212 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 2eae3f3..1d020d8 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1810,6 +1810,7 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_snprintf_btf_proto; const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/btf.h b/include/linux/btf.h index d0f5d3c..3e5cdc2 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -59,10 +60,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_F_COMPACT +#define BTF_SHOW_NONAMEBTF_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_F_ZERO #define BTF_SHOW_UNSAFE(1ULL << 4) void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 2d6519a..fcafe80 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3587,6 +3587,42 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type_id. This value should specify the type + * that *ptr*->ptr points to. LLVM __builtin_btf_type_id(type, 1) + * can be used to look up vmlinux BTF type ids. Traversing the + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + *
[PATCH v7 bpf-next 0/8] bpf: add helpers to support BTF-based kernel data display
ersion of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. - added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (8): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings bpf: add bpf_snprintf_btf helper selftests/bpf: add bpf_snprintf_btf helper tests bpf: bump iter seq size to support BTF representation of large data structures selftests/bpf: fix overflow tests to reflect iter size increase bpf: add bpf_seq_printf_btf helper selftests/bpf: add test for bpf_seq_printf_btf helper include/linux/bpf.h|3 + include/linux/btf.h| 39 + include/uapi/linux/bpf.h | 76 ++ kernel/bpf/bpf_iter.c |4 +- kernel/bpf/btf.c | 1007 ++-- kernel/bpf/core.c |2 + kernel/bpf/helpers.c |4 + kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 98 ++ scripts/bpf_helpers_doc.py |2 + tools/include/uapi/linux/bpf.h | 76 ++ tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 88 +- .../selftests/bpf/prog_tests/snprintf_btf.c| 60 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 50 + .../selftests/bpf/progs/netif_receive_skb.c| 249 + 15 files changed, 1659 insertions(+), 117 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf.c create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c -- 1.8.3.1
[PATCH v7 bpf-next 1/8] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 7990232..2eae3f3 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1355,6 +1355,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b25ba98..686f6a9 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11517,6 +11517,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11550,12 +11561,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
[PATCH v6 bpf-next 6/6] selftests/bpf: add test for bpf_seq_printf_btf helper
Add a test verifying iterating over tasks and displaying BTF representation of data succeeds. Note here that we do not display the task_struct itself, as it will overflow the PAGE_SIZE limit on seq data; instead we write task->fs (a struct fs_struct). Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 2 files changed, 115 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index fe1a83b9..323c48a 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -7,6 +7,7 @@ #include "bpf_iter_task.skel.h" #include "bpf_iter_task_stack.skel.h" #include "bpf_iter_task_file.skel.h" +#include "bpf_iter_task_btf.skel.h" #include "bpf_iter_tcp4.skel.h" #include "bpf_iter_tcp6.skel.h" #include "bpf_iter_udp4.skel.h" @@ -167,6 +168,69 @@ static void test_task_file(void) bpf_iter_task_file__destroy(skel); } +#define FSBUFSZ8192 + +static char fsbuf[FSBUFSZ]; + +static void do_btf_read(struct bpf_program *prog) +{ + int iter_fd = -1, len = 0, bufleft = FSBUFSZ; + struct bpf_link *link; + char *buf = fsbuf; + + link = bpf_program__attach_iter(prog, NULL); + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n")) + goto free_link; + + do { + len = read(iter_fd, buf, bufleft); + if (len > 0) { + buf += len; + bufleft -= len; + } + } while (len > 0); + + if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + goto free_link; + + CHECK(strstr(fsbuf, "(struct fs_struct)") == NULL, + "check for btf representation of fs_struct in iter data", + "struct fs_struct not found"); +free_link: + if (iter_fd > 0) + 
close(iter_fd); + bpf_link__destroy(link); +} + +static void test_task_btf(void) +{ + struct bpf_iter_task_btf__bss *bss; + struct bpf_iter_task_btf *skel; + + skel = bpf_iter_task_btf__open_and_load(); + if (CHECK(!skel, "bpf_iter_task_btf__open_and_load", + "skeleton open_and_load failed\n")) + return; + + bss = skel->bss; + + do_btf_read(skel->progs.dump_task_fs_struct); + + if (CHECK(bss->tasks == 0, "check if iterated over tasks", + "no task iteration, did BPF program run?\n")) + goto cleanup; + + CHECK(bss->seq_err != 0, "check for unexpected err", + "bpf_seq_printf_btf returned %ld", bss->seq_err); + +cleanup: + bpf_iter_task_btf__destroy(skel); +} + static void test_tcp4(void) { struct bpf_iter_tcp4 *skel; @@ -957,6 +1021,8 @@ void test_bpf_iter(void) test_task_stack(); if (test__start_subtest("task_file")) test_task_file(); + if (test__start_subtest("task_btf")) + test_task_btf(); if (test__start_subtest("tcp4")) test_tcp4(); if (test__start_subtest("tcp6")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c new file mode 100644 index 000..88631a8 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "bpf_iter.h" +#include +#include +#include + +char _license[] SEC("license") = "GPL"; + +long tasks = 0; +long seq_err = 0; + +/* struct task_struct's BTF representation will overflow PAGE_SIZE so cannot + * be used here; instead dump a structure associated with each task. + */ +SEC("iter/task") +int dump_task_fs_struct(struct bpf_iter__task *ctx) +{ + static const char fs_type[] = "struct fs_struct"; + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + struct fs_struct *fs = (void *)0; + static struct btf_ptr ptr = { }; + long ret; + + if (task) + fs = task->fs; + + ptr.type = fs_type; + ptr.ptr = fs; + + if (ctx->meta->
[PATCH v6 bpf-next 1/6] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index fc5c901..049e50f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1340,6 +1340,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 15ab889b..092ffd6 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11488,6 +11488,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11521,12 +11532,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
[PATCH v6 bpf-next 5/6] bpf: add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_snprintf_btf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- include/linux/btf.h| 2 ++ include/uapi/linux/bpf.h | 10 ++ kernel/bpf/btf.c | 4 ++-- kernel/bpf/core.c | 1 + kernel/trace/bpf_trace.c | 33 + tools/include/uapi/linux/bpf.h | 10 ++ 6 files changed, 58 insertions(+), 2 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 3e5cdc2..024e16f 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -68,6 +68,8 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags); /* * Copy len bytes of string representation of obj of BTF type_id into buf. diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c1675ad..c3231a8 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3621,6 +3621,15 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type name or *ptr*->type_id as per + * bpf_snprintf_btf() above. *flags* are identical to those + * used for bpf_snprintf_btf. + * Return + * 0 on success or a negative error in case of failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3773,6 +3782,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(snprintf_btf), \ + FN(seq_printf_btf), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 94190ec..dfc8654 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5316,8 +5316,8 @@ static __printf(2, 3) void btf_seq_show(struct btf_show *show, const char *fmt, va_end(args); } -static int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, - void *obj, struct seq_file *m, u64 flags) +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, + void *obj, struct seq_file *m, u64 flags) { struct btf_show sseq; diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 403fb23..c4ba45f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2217,6 +2217,7 @@ void bpf_user_rnd_init_once(void) const struct bpf_func_proto bpf_get_local_storage_proto __weak; const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak; const struct bpf_func_proto bpf_snprintf_btf_proto __weak; +const struct bpf_func_proto bpf_seq_printf_btf_proto __weak; const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void) { diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 61c274f8..e8fa1c0 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -71,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); +static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id); + /** * trace_call_bpf - invoke BPF program * @call: tracepoint event @@ -776,6 +780,31 @@ struct bpf_seq_printf_buf { .arg3_type = ARG_CONST_SIZE_OR_ZERO, }; +BPF_CALL_4(bpf_seq_printf_btf, struct seq_file *, m, struct btf_ptr 
*, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf *btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + + return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); +} + +static const struct bpf_func_proto bpf_seq_printf_btf_proto = { + .func = bpf_seq
[PATCH v6 bpf-next 2/6] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 980 ++-- 2 files changed, 914 insertions(+), 102 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index a9af5e7..d0f5d3c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 5d3c36e..94190ec 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -284,6 +284,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* + * Common data to all BTF show operations. Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. 
See btf_type_show() below. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by show->state.depth_check being set, and if we + * encounter a non-zero value we set show->state.depth_to_show to + * the depth at which we encountered it. When we have completed the + * first pass, we will know if anything needs to be displayed if + * depth_to_show > depth. See btf_[struct,array]_show() for the + * implementation of this.
[PATCH v6 bpf-next 0/6] bpf: add helpers to support BTF-based kernel data display
t approaches were explored including dynamic allocation and per-cpu buffers. The downside of dynamic allocation is that it would be done during BPF program execution for bpf_trace_printk()s using %pT format specifiers. The problem with per-cpu buffers is we'd have to manage preemption and since the display of an object occurs over an extended period and in printk context where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. The approach of utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2). - safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option. - pointers are prefixed with 0x for clarity (Alexei, patch 2) - added additional comments and explanations around BTF show code, especially around determining whether objects are zeroed. Also tried to comment the safe object scheme used. 
(Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8) Changes since v1: - changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifer so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. - removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. 
- added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (6): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings bpf: add bpf_snprintf_btf helper selftests/bpf: add bpf_snprintf_btf helper tests bpf: add bpf_seq_printf_btf helper selftests/bpf: add test for bpf_seq_printf_btf helper include/linux/bpf.h| 3 + include/linux/btf.h| 39 + include/uapi/linux/bpf.h | 78 ++ kernel/bpf/btf.c | 980 ++--- kernel/bpf/core.c | 2 + kernel/bpf/helpers.c | 4 + kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 134 +++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 78 ++ tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/prog_tests/snprintf_btf.c| 54 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 ++ .../selftests/bpf/progs/netif_receive_sk
[PATCH v6 bpf-next 4/6] selftests/bpf: add bpf_snprintf_btf helper tests
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf.c| 54 + .../selftests/bpf/progs/netif_receive_skb.c| 260 + 2 files changed, 314 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c new file mode 100644 index 000..855e11d --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include "netif_receive_skb.skel.h" + +/* Demonstrate that bpf_snprintf_btf succeeds and that various data types + * are formatted correctly. + */ +void test_snprintf_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 > /dev/null"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_snprintf_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..b4f96f1 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,260 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include "vmlinux.h" +#include +#include +#include + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; + +#define STRSIZE2048 +#define EXPECTED_STRSIZE 256 + +#ifndef ARRAY_SIZE +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#endif + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, char[STRSIZE]); +} strdata SEC(".maps"); + +static int __strncmp(const void *m1, const void *m2, size_t len) +{ + const unsigned char *s1 = m1; + const unsigned char *s2 = m2; + int i, delta = 0; + +#pragma clang loop unroll(full) + for (i = 0; i < len; i++) { + delta = s1[i] - s2[i]; + if (delta || s1[i] == 0 || s2[i] == 0) + break; + } + return delta; +} + +/* Use __builtin_btf_type_id to test snprintf_btf by type id instead of name */ +#if __has_builtin(__builtin_btf_type_id) +#define TEST_BTF_BY_ID(_str, _typestr, _ptr, _hflags) \ + do {\ + int _expected_ret = ret;\ + _ptr.type = 0; \ + _ptr.type_id = __builtin_btf_type_id(_typestr, 0); \ + ret = bpf_snprintf_btf(_str, STRSIZE, &_ptr,\ + sizeof(_ptr), _hflags); \ + if (ret != _expected_ret) { \ + bpf_printk("expected ret 
(%d), got (%d)", \ + _expected_ret, ret); \ + ret = -EBADMSG; \ + } \ + } while
[PATCH v6 bpf-next 3/6] bpf: add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); struct btf_ptr * specifies - a pointer to the data to be traced; - the BTF id of the type of data pointed to; or - a string representation of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed. For example a BPF program with a struct sk_buff *skb could do the following: static const char skb_type[] = "struct sk_buff"; static struct btf_ptr b = { }; b.ptr = skb; b.type = skb_type; bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0); Default output looks like this: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x7524fd8b, .data = (unsigned char *)0x7524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags modifying display are as follows: - BTF_F_COMPACT:no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW:show raw (unobfuscated) pointer values; equivalent to %px. 
- BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 ++-- include/uapi/linux/bpf.h | 68 +++ kernel/bpf/core.c | 1 + kernel/bpf/helpers.c | 4 ++ kernel/trace/bpf_trace.c | 101 + scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 68 +++ 8 files changed, 250 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 049e50f..a3b40a5 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1795,6 +1795,7 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_snprintf_btf_proto; const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/btf.h b/include/linux/btf.h index d0f5d3c..3e5cdc2 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -59,10 +60,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_F_COMPACT +#define BTF_SHOW_NONAMEBTF_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_F_ZERO #define BTF_SHOW_UNSAFE(1ULL << 4) void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index a228125..c1675ad 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3586,6 +3586,41 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type name or *ptr*->type_id. These values should + * specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + *
[PATCH v5 bpf-next 4/6] selftests/bpf: add bpf_btf_snprintf helper tests
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/btf_snprintf.c| 55 + .../selftests/bpf/progs/netif_receive_skb.c| 260 + 2 files changed, 315 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_snprintf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c b/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c new file mode 100644 index 000..8f277a5 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include "netif_receive_skb.skel.h" + +/* Demonstrate that bpf_btf_snprintf succeeds with non-zero return values, + * and that string representation of kernel data can then be displayed + * via bpf_trace_printk(). + */ +void test_btf_snprintf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 > /dev/null"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_btf_snprintf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..dd08a7d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,260 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include "vmlinux.h" +#include +#include +#include + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; + +#define STRSIZE2048 +#define EXPECTED_STRSIZE 256 + +#ifndef ARRAY_SIZE +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#endif + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, char[STRSIZE]); +} strdata SEC(".maps"); + +static int __strncmp(const void *m1, const void *m2, size_t len) +{ + const unsigned char *s1 = m1; + const unsigned char *s2 = m2; + int i, delta = 0; + +#pragma clang loop unroll(full) + for (i = 0; i < len; i++) { + delta = s1[i] - s2[i]; + if (delta || s1[i] == 0 || s2[i] == 0) + break; + } + return delta; +} + +/* Use __builtin_btf_type_id to test btf_snprintf by type id instead of name */ +#if __has_builtin(__builtin_btf_type_id) +#define TEST_BTF_BY_ID(_str, _typestr, _ptr, _hflags) \ + do {\ + int _expected_ret = ret;\ + _ptr.type = 0; \ + _ptr.type_id = __builtin_btf_type_id(_typestr, 0); \ + ret = bpf_btf_snprintf(_str, STRSIZE, &_ptr,\ + sizeof(_ptr), _hflags); \ + if (ret != _expected_ret) { \ + bpf_printk("expected ret 
(%d), got (%d)", \ + _expected_ret, ret); \ + ret = -EBADMSG; \ + }
[PATCH v5 bpf-next 2/6] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 971 ++-- 2 files changed, 904 insertions(+), 103 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index a9af5e7..d0f5d3c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index f9ac693..70f5b88 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -284,6 +284,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* + * Common data to all BTF show operations. Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. 
See btf_type_show() below.
+ *
+ * One challenge with showing nested data is we want to skip 0-valued
+ * data, but in order to figure out whether a nested object is all zeros
+ * we need to walk through it. As a result, we need to make two passes
+ * when handling structs, unions and arrays; the first pass simply looks
+ * for nonzero data, while the second actually does the display. The first
+ * pass is signalled by show->state.depth_check being set, and if we
+ * encounter a non-zero value we set show->state.depth_to_show to
+ * the depth at which we encountered it. When we have completed the
+ * first pass, we will know if anything needs to be displayed if
+ * depth_to_show > depth. See btf_[struct,array]_show() for the
+ * implementation of this.
[PATCH v5 bpf-next 0/6] bpf: add helpers to support BTF-based kernel data display
to manage preemption and since the display of an object occurs over an extended period and in printk context where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. Utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2).

- safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option.
- pointers are prefixed with 0x for clarity (Alexei, patch 2)
- added additional comments and explanations around BTF show code, especially around determining whether objects are zeroed. Also tried to comment on the safe object scheme used. (Yonghong, patch 2)
- added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5)
- removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour, and the behaviour of type-based printk can be determined via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6)
- fixed the bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing the need for perf events (Andrii, patch 8)

Changes since v1:
- changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei)
- zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei)
- added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below...
- reworked the printk format specifier so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name.
- removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei).
- fixed issues with negative snprintf format length (Rasmus)
- added test cases for various data structure formats; base types, typedefs, structs, etc.
- tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases.
- added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (6): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings bpf: add bpf_btf_snprintf helper selftests/bpf: add bpf_btf_snprintf helper tests bpf: add bpf_seq_btf_write helper selftests/bpf: add test for bpf_seq_btf_write helper include/linux/bpf.h| 3 + include/linux/btf.h| 40 + include/uapi/linux/bpf.h | 78 ++ kernel/bpf/btf.c | 978 ++--- kernel/bpf/helpers.c | 4 + kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 133 +++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 78 ++ tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/prog_tests/btf_snprintf.c| 55 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 ++ .../selftests/bpf/progs/netif_receive_skb.c| 260 ++ 13 files changed, 1656 insertions(+), 108 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_snprintf.c create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c -- 1.8.3.1
[PATCH v5 bpf-next 6/6] selftests/bpf: add test for bpf_seq_btf_write helper
Add a test verifying iterating over tasks and displaying BTF representation of data succeeds. Note here that we do not display the task_struct itself, as it will overflow the PAGE_SIZE limit on seq data; instead we write task->fs (a struct fs_struct). Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 2 files changed, 115 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index fe1a83b9..b9f13f9 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -7,6 +7,7 @@ #include "bpf_iter_task.skel.h" #include "bpf_iter_task_stack.skel.h" #include "bpf_iter_task_file.skel.h" +#include "bpf_iter_task_btf.skel.h" #include "bpf_iter_tcp4.skel.h" #include "bpf_iter_tcp6.skel.h" #include "bpf_iter_udp4.skel.h" @@ -167,6 +168,69 @@ static void test_task_file(void) bpf_iter_task_file__destroy(skel); } +#define FSBUFSZ8192 + +static char fsbuf[FSBUFSZ]; + +static void do_btf_read(struct bpf_program *prog) +{ + int iter_fd = -1, len = 0, bufleft = FSBUFSZ; + struct bpf_link *link; + char *buf = fsbuf; + + link = bpf_program__attach_iter(prog, NULL); + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n")) + goto free_link; + + do { + len = read(iter_fd, buf, bufleft); + if (len > 0) { + buf += len; + bufleft -= len; + } + } while (len > 0); + + if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + goto free_link; + + CHECK(strstr(fsbuf, "(struct fs_struct)") == NULL, + "check for btf representation of fs_struct in iter data", + "struct fs_struct not found"); +free_link: + if (iter_fd > 0) + 
close(iter_fd); + bpf_link__destroy(link); +} + +static void test_task_btf(void) +{ + struct bpf_iter_task_btf__bss *bss; + struct bpf_iter_task_btf *skel; + + skel = bpf_iter_task_btf__open_and_load(); + if (CHECK(!skel, "bpf_iter_task_btf__open_and_load", + "skeleton open_and_load failed\n")) + return; + + bss = skel->bss; + + do_btf_read(skel->progs.dump_task_fs_struct); + + if (CHECK(bss->tasks == 0, "check if iterated over tasks", + "no task iteration, did BPF program run?\n")) + goto cleanup; + + CHECK(bss->seq_err != 0, "check for unexpected err", + "bpf_seq_btf_write returned %ld", bss->seq_err); + +cleanup: + bpf_iter_task_btf__destroy(skel); +} + static void test_tcp4(void) { struct bpf_iter_tcp4 *skel; @@ -957,6 +1021,8 @@ void test_bpf_iter(void) test_task_stack(); if (test__start_subtest("task_file")) test_task_file(); + if (test__start_subtest("task_btf")) + test_task_btf(); if (test__start_subtest("tcp4")) test_tcp4(); if (test__start_subtest("tcp6")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c new file mode 100644 index 000..0451682 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "bpf_iter.h" +#include +#include +#include + +char _license[] SEC("license") = "GPL"; + +long tasks = 0; +long seq_err = 0; + +/* struct task_struct's BTF representation will overflow PAGE_SIZE so cannot + * be used here; instead dump a structure associated with each task. + */ +SEC("iter/task") +int dump_task_fs_struct(struct bpf_iter__task *ctx) +{ + static const char fs_type[] = "struct fs_struct"; + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + struct fs_struct *fs = (void *)0; + static struct btf_ptr ptr = { }; + long ret; + + if (task) + fs = task->fs; + + ptr.type = fs_type; + ptr.ptr = fs; + + if (ctx->meta->
[PATCH v5 bpf-next 3/6] bpf: add bpf_btf_snprintf helper
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is

  long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr,
			u32 btf_ptr_size, u64 flags);

struct btf_ptr * specifies
- a pointer to the data to be traced;
- the BTF id of the type of data pointed to; or
- a string representation of the type of data pointed to
- a flags field is provided for future use; these flags are not to be confused with the BTF_SNPRINTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is that these flags relate to the type and the information needed to identify it, not to how it is displayed.

For example a BPF program with a struct sk_buff *skb could do the following:

  static const char skb_type[] = "struct sk_buff";
  static struct btf_ptr b = { };

  b.ptr = skb;
  b.type = skb_type;
  bpf_btf_snprintf(str, sizeof(str), &b, sizeof(b), 0);

Default output looks like this:

  (struct sk_buff){
   .transport_header = (__u16)65535,
   .mac_header = (__u16)65535,
   .end = (sk_buff_data_t)192,
   .head = (unsigned char *)0x7524fd8b,
   .data = (unsigned char *)0x7524fd8b,
   .truesize = (unsigned int)768,
   .users = (refcount_t){
    .refs = (atomic_t){
     .counter = (int)1,
    },
   },
  }

Flags modifying display are as follows:
- BTF_F_COMPACT: no formatting around type information
- BTF_F_NONAME:  no struct/union member names/types
- BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px.
- BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 +++-- include/uapi/linux/bpf.h | 68 kernel/bpf/helpers.c | 4 ++ kernel/trace/bpf_trace.c | 88 ++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 68 7 files changed, 236 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c0ad5d8..9acbd59 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1787,6 +1787,7 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_btf_snprintf_proto; const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/btf.h b/include/linux/btf.h index d0f5d3c..3e5cdc2 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -59,10 +60,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_F_COMPACT +#define BTF_SHOW_NONAMEBTF_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_F_ZERO #define BTF_SHOW_UNSAFE(1ULL << 4) void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 7dd3141..9b89b67 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3579,6 +3579,41 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type name or *ptr*->type_id. These values should + * specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + * larger programs can use map
[PATCH v5 bpf-next 1/6] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c6d9f2c..c0ad5d8 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1330,6 +1330,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 814bc6c..11d7985 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11311,6 +11311,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11344,12 +11355,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
[PATCH v5 bpf-next 5/6] bpf: add bpf_seq_btf_write helper
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_btf_snprintf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- include/linux/btf.h| 3 ++ include/uapi/linux/bpf.h | 10 ++ kernel/bpf/btf.c | 17 +++--- kernel/trace/bpf_trace.c | 75 +- tools/include/uapi/linux/bpf.h | 10 ++ 5 files changed, 96 insertions(+), 19 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 3e5cdc2..eed23a4 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -69,6 +69,9 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags); + /* * Copy len bytes of string representation of obj of BTF type_id into buf. * diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 9b89b67..c0815f1 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3614,6 +3614,15 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type name or *ptr*->type_id as per + * bpf_btf_snprintf() above. *flags* are identical to those + * used for bpf_btf_snprintf. + * Return + * 0 on success or a negative error in case of failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3766,6 +3775,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(btf_snprintf), \ + FN(seq_btf_write), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 70f5b88..0902464 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5328,17 +5328,26 @@ static void btf_seq_show(struct btf_show *show, const char *fmt, ...) va_end(args); } -void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, - struct seq_file *m) +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags) { struct btf_show sseq; sseq.target = m; sseq.showfn = btf_seq_show; - sseq.flags = BTF_SHOW_NONAME | BTF_SHOW_COMPACT | BTF_SHOW_ZERO | -BTF_SHOW_UNSAFE; + sseq.flags = flags; btf_type_show(btf, type_id, obj, &sseq); + + return sseq.state.status; +} + +void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m) +{ + (void) btf_type_seq_show_flags(btf, type_id, obj, m, + BTF_SHOW_NONAME | BTF_SHOW_COMPACT | + BTF_SHOW_ZERO | BTF_SHOW_UNSAFE); } struct btf_show_snprintf { diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index f171e03..eee36a8 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -71,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); +static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id); + /** * trace_call_bpf - invoke BPF program * @call: tracepoint event @@ -780,6 +784,30 @@ struct bpf_seq_printf_buf { .btf_id = bpf_seq_write_btf_ids, }; +BPF_CALL_4(bpf_seq_btf_write, struct seq_file *, m, struct btf_ptr *, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf 
*btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + + return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); +} + +static const struct bpf_func_proto bpf_seq
Re: [RFC PATCH bpf-next 2/4] bpf: make BTF show support generic, apply to seq files/bpf_trace_printk
On Fri, 14 Aug 2020, Alexei Starovoitov wrote: > On Fri, Aug 14, 2020 at 02:06:37PM +0100, Alan Maguire wrote: > > On Wed, 12 Aug 2020, Alexei Starovoitov wrote: > > > > > On Thu, Aug 06, 2020 at 03:42:23PM +0100, Alan Maguire wrote: > > > > > > > > The bpf_trace_printk tracepoint is augmented with a "trace_id" > > > > field; it is used to allow tracepoint filtering as typed display > > > > information can easily be interspersed with other tracing data, > > > > making it hard to read. Specifying a trace_id will allow users > > > > to selectively trace data, eliminating noise. > > > > > > Since trace_id is not seen in trace_pipe, how do you expect users > > > to filter by it? > > > > Sorry should have specified this. The approach is to use trace > > instances and filtering such that we only see events associated > > with a specific trace_id. There's no need for the trace event to > > actually display the trace_id - it's still usable as a filter. > > The steps involved are: > > > > 1. create a trace instance within which we can specify a fresh > >set of trace event enablings, filters etc. > > > > mkdir /sys/kernel/debug/tracing/instances/traceid100 > > > > 2. enable the filter for the specific trace id > > > > echo "trace_id == 100" > > > /sys/kernel/debug/tracing/instances/traceid100/events/bpf_trace/bpf_trace_printk/filter > > > > 3. enable the trace event > > > > echo 1 > > > /sys/kernel/debug/tracing/instances/events/bpf_trace/bpf_trace_printk/enable > > > > 4. ensure the BPF program uses a trace_id 100 when calling bpf_trace_btf() > > ouch. > I think you interpreted the acceptance of the > commit 7fb20f9e901e ("bpf, doc: Remove references to warning message when > using bpf_trace_printk()") > in the wrong way. > > Everything that doc had said is still valid. In particular: > -A: This is done to nudge program authors into better interfaces when > -programs need to pass data to user space. 
Like bpf_perf_event_output() > -can be used to efficiently stream data via perf ring buffer. > -BPF maps can be used for asynchronous data sharing between kernel > -and user space. bpf_trace_printk() should only be used for debugging. > > bpf_trace_printk is for debugging only. _debugging of bpf programs > themselves_. > What you're describing above is logging and tracing. It's not debugging of > programs. > perf buffer, ring buffer, and seq_file interfaces are the right > interfaces for tracing, logging, and kernel debugging. > > > > It also feels like workaround. May be let bpf prog print the whole > > > struct in one go with multiple new lines and call > > > trace_bpf_trace_printk(buf) once? > > > > We can do that absolutely, but I'd be interested to get your take > > on the filtering mechanism before taking that approach. I'll add > > a description of the above mechanism to the cover letter and > > patch to be clearer next time too. > > I think patch 3 is no go, because it takes bpf_trace_printk in > the wrong direction. > Instead please refactor it to use string buffer or seq_file as an output. Fair enough. I'm thinking a helper like long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr, u32 ptr_size, u64 flags); Then the user can choose perf event or ringbuf interfaces to share the results with userspace. > If the user happen to use bpf_trace_printk("%s", buf); > after that to print that string buffer to trace_pipe that's user's choice. > I can see such use case when program author wants to debug > their bpf program. That's fine. But for kernel debugging, on demand and > "always on" logging and tracing the documentation should point > to sustainable interfaces that don't interfere with each other, > can be run in parallel by multiple users, etc. > The problem with bpf_trace_printk() under this approach is that the string size for %s arguments is very limited; bpf_trace_printk() restricts these to 64 bytes in size. 
Looks like bpf_seq_printf() restricts a %s string to 128 bytes also. We could add an additional helper for the bpf_seq case which calls bpf_seq_printf() for each component in the object, i.e. long bpf_seq_btf_printf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags); This would steer users away from bpf_trace_printk() for this use case - since it can print only a small amount of the string - while supporting all the other user-space communication mechanisms. Alan
Re: [RFC PATCH bpf-next 2/4] bpf: make BTF show support generic, apply to seq files/bpf_trace_printk
On Wed, 12 Aug 2020, Alexei Starovoitov wrote: > On Thu, Aug 06, 2020 at 03:42:23PM +0100, Alan Maguire wrote: > > > > The bpf_trace_printk tracepoint is augmented with a "trace_id" > > field; it is used to allow tracepoint filtering as typed display > > information can easily be interspersed with other tracing data, > > making it hard to read. Specifying a trace_id will allow users > > to selectively trace data, eliminating noise. > > Since trace_id is not seen in trace_pipe, how do you expect users > to filter by it? Sorry should have specified this. The approach is to use trace instances and filtering such that we only see events associated with a specific trace_id. There's no need for the trace event to actually display the trace_id - it's still usable as a filter. The steps involved are: 1. create a trace instance within which we can specify a fresh set of trace event enablings, filters etc. mkdir /sys/kernel/debug/tracing/instances/traceid100 2. enable the filter for the specific trace id echo "trace_id == 100" > /sys/kernel/debug/tracing/instances/traceid100/events/bpf_trace/bpf_trace_printk/filter 3. enable the trace event echo 1 > /sys/kernel/debug/tracing/instances/events/bpf_trace/bpf_trace_printk/enable 4. ensure the BPF program uses a trace_id 100 when calling bpf_trace_btf() So the above can be done for multiple programs; output can then be separated for different programs if trace_ids and filtering are used together. The above trace instance only sees bpf_trace_btf() events which specify trace_id 100. I've attached a tweaked version of the patch 4 in the patchset that ensures that a trace instance with filtering enabled as above sees the bpf_trace_btf events, but _not_ bpf_trace_printk events (since they have trace_id 0 by default). To me the above provides a simple way to separate BPF program output for simple BPF programs; ringbuf and perf events require a bit more work in both BPF and userspace to support such coordination. 
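Collected into one script, the steps look like this (a sketch only: it assumes root and tracefs mounted under /sys/kernel/debug, and the instance name and trace_id value are illustrative; note the instance name also belongs in the step 3 enable path):

```shell
#!/bin/sh
# Sketch: create a trace instance that only sees bpf_trace_printk events
# carrying trace_id == 100.
set -e

TRACEFS=/sys/kernel/debug/tracing
INST=$TRACEFS/instances/traceid100

# 1. fresh instance with its own event enablings and filters
mkdir -p "$INST"

# 2. filter on the trace_id the BPF program will pass to bpf_trace_btf()
echo 'trace_id == 100' > "$INST/events/bpf_trace/bpf_trace_printk/filter"

# 3. enable the event within this instance only
echo 1 > "$INST/events/bpf_trace/bpf_trace_printk/enable"

# 4. read the filtered output (the BPF program must use trace_id 100)
cat "$INST/trace_pipe"
```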
What do you think - does this approach seem worth using? If so we could also consider extending it to bpf_trace_printk(), if we can find a way to provide a trace_id there too. > It also feels like workaround. May be let bpf prog print the whole > struct in one go with multiple new lines and call > trace_bpf_trace_printk(buf) once? We can do that absolutely, but I'd be interested to get your take on the filtering mechanism before taking that approach. I'll add a description of the above mechanism to the cover letter and patch to be clearer next time too. > > Also please add interface into bpf_seq_printf. > BTF enabled struct prints is useful for iterators too > and generalization you've done in this patch pretty much takes it there. > Sure, I'll try and tackle that next time. > > +#define BTF_SHOW_COMPACT (1ULL << 0) > > +#define BTF_SHOW_NONAME(1ULL << 1) > > +#define BTF_SHOW_PTR_RAW (1ULL << 2) > > +#define BTF_SHOW_ZERO (1ULL << 3) > > +#define BTF_SHOW_NONEWLINE (1ULL << 32) > > +#define BTF_SHOW_UNSAFE(1ULL << 33) > > I could have missed it earlier, but what is the motivation to leave the gap > in bits? Just do bit 4 and 5 ? > Patch 3 uses the first 4 as flags to bpf_trace_btf(); the final two are not supported for the helper as flag values so I wanted to leave some space for additional bpf_trace_btf() flags. BTF_SHOW_NONEWLINE is always used for bpf_trace_btf(), since the tracing adds a newline for us and we don't want to double up on newlines, so it's ORed in as an implicit argument for the bpf_trace_btf() case. BTF_SHOW_UNSAFE isn't allowed within BPF so it's not available as a flag for the helper. Thanks! Alan >From 10bd268b2585084c8f35d1b6ab0c3df76203f5cc Mon Sep 17 00:00:00 2001 From: Alan Maguire Date: Thu, 6 Aug 2020 14:21:10 +0200 Subject: [PATCH] selftests/bpf: add bpf_trace_btf helper tests Basic tests verifying various flag combinations for bpf_trace_btf() using a tp_btf program to trace skb data. 
Also verify that we can create a trace instance to filter trace data, using the trace_id value passed to bpf_trace/bpf_trace_printk events. trace_id is specifiable for bpf_trace_btf() so the test ensures the trace instance sees the filtered events only. Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/trace_btf.c | 150 + .../selftests/bpf/progs/netif_receive_skb.c| 48 +++ 2 files changed, 198 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_btf.c b/tools/testing/selftests/bpf/prog_tests/trace_btf.
Re: [PATCH v2] kunit: added lockdep support
On Wed, 12 Aug 2020, Uriel Guajardo wrote: > KUnit will fail tests upon observing a lockdep failure. Because lockdep > turns itself off after its first failure, only fail the first test and > warn users to not expect any future failures from lockdep. > > Similar to lib/locking-selftest [1], we check if the status of > debug_locks has changed after the execution of a test case. However, we > do not reset lockdep afterwards. > > Like the locking selftests, we also fix possible preemption count > corruption from lock bugs. > > Depends on kunit: support failure from dynamic analysis tools [2] > > [1] > https://elixir.bootlin.com/linux/v5.7.12/source/lib/locking-selftest.c#L1137 > > [2] > https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguajard...@gmail.com/ > > Signed-off-by: Uriel Guajardo > --- > v2 Changes: > - Removed lockdep_reset > > - Added warning to users about lockdep shutting off > --- > lib/kunit/test.c | 27 ++- > 1 file changed, 26 insertions(+), 1 deletion(-) > > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > index d8189d827368..7e477482457b 100644 > --- a/lib/kunit/test.c > +++ b/lib/kunit/test.c > @@ -11,6 +11,7 @@ > #include > #include > #include > +#include > > #include "debugfs.h" > #include "string-stream.h" > @@ -22,6 +23,26 @@ void kunit_fail_current_test(void) > kunit_set_failure(current->kunit_test); > } > > +static void kunit_check_locking_bugs(struct kunit *test, > + unsigned long saved_preempt_count, > + bool saved_debug_locks) > +{ > + preempt_count_set(saved_preempt_count); > +#ifdef CONFIG_TRACE_IRQFLAGS > + if (softirq_count()) > + current->softirqs_enabled = 0; > + else > + current->softirqs_enabled = 1; > +#endif > +#if IS_ENABLED(CONFIG_LOCKDEP) > + if (saved_debug_locks && !debug_locks) { > + kunit_set_failure(test); > + kunit_warn(test, "Dynamic analysis tool failure from LOCKDEP."); > + kunit_warn(test, "Further tests will have LOCKDEP disabled."); > + } > +#endif > +} Nit: I could be wrong but the general 
approach for this sort of feature is to do conditional compilation combined with "static inline" definitions to handle the case where the feature isn't enabled. Could we tidy this up a bit and haul this stuff out into a conditionally-compiled (if CONFIG_LOCKDEP) kunit lockdep.c file? Then in kunit's lockdep.h we'd have struct kunit_lockdep { int preempt_count; bool debug_locks; }; #if IS_ENABLED(CONFIG_LOCKDEP) void kunit_test_init_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep); void kunit_test_check_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep); #else static inline void kunit_test_init_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep) { } static inline void kunit_test_check_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep) { } #endif The test execution code could then call struct kunit_lockdep lockdep; kunit_test_init_lockdep(test, &lockdep); kunit_test_check_lockdep(test, &lockdep); If that approach makes sense, we could go a bit further and we might benefit from a bit more generalization here. _If_ the pattern of needing pre- and post- test actions is sustained across multiple analysis tools, could we add generic hooks for this? That would allow any additional dynamic analysis tools to utilize them. So kunit_try_run_case() would then cycle through the registered pre- hooks prior to running the case and post- hooks after, failing if any of the latter returned a failure value. I'm thinking something like kunit_register_external_test("lockdep", lockdep_pre, lockdep_post, &kunit_lockdep); (or we could define a kunit_external_test struct for better extensibility). A void * would be passed to pre/post, in this case it'd be a pointer to a struct containing the saved preempt count/debug locks, and the registration could be called during kunit initialization. 
This doesn't need to be done with your change of course, but I wanted to float the idea: in addition to uncluttering the test case execution code, it might allow us to build facilities on top of that generic tool support for situations like "I'd like to see if the test passes absent any lockdep issues, so I'd like to disable lockdep-based failure". Such situations are admittedly more likely to arise in a world where kunit+tests are built as modules and run multiple times within a single system boot, but worth considering I think. For that we'd need a way to select which dynamic tools kunit enables (kernel/module parameters or debugfs could do this), but a generic approach might help that sort of thing.
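If it helps make the idea concrete, here is a rough userspace model of the suggested pre-/post-hook registry. Every name in it (kunit_register_external_test, run_case_with_hooks, and so on) is hypothetical: this is a sketch of the shape being proposed, not an existing KUnit API.

```c
#include <assert.h>

/* Hypothetical registry entry modelling the reviewer's suggestion. */
struct kunit_external_test {
	const char *name;
	void (*pre)(void *priv);  /* runs before each test case */
	int (*post)(void *priv);  /* runs after it; nonzero fails the case */
	void *priv;
};

#define MAX_EXTERNAL_TESTS 8
static struct kunit_external_test external_tests[MAX_EXTERNAL_TESTS];
static int num_external_tests;

static int kunit_register_external_test(const char *name,
					void (*pre)(void *),
					int (*post)(void *), void *priv)
{
	if (num_external_tests >= MAX_EXTERNAL_TESTS)
		return -1;
	external_tests[num_external_tests++] =
		(struct kunit_external_test){ name, pre, post, priv };
	return 0;
}

/* What kunit_try_run_case() would do around the case body: cycle through
 * pre- hooks, run the case, then fail if any post- hook reports a problem. */
static int run_case_with_hooks(void (*case_fn)(void))
{
	int i, failed = 0;

	for (i = 0; i < num_external_tests; i++)
		external_tests[i].pre(external_tests[i].priv);
	case_fn();
	for (i = 0; i < num_external_tests; i++)
		if (external_tests[i].post(external_tests[i].priv))
			failed = 1;
	return failed;
}

/* Example hook modelled on the lockdep case: snapshot state in pre,
 * compare in post. fake_debug_locks stands in for debug_locks. */
static int fake_debug_locks = 1;
struct lockdep_state { int debug_locks; };

static void lockdep_pre(void *priv)
{
	((struct lockdep_state *)priv)->debug_locks = fake_debug_locks;
}

static int lockdep_post(void *priv)
{
	struct lockdep_state *s = priv;

	/* "lockdep was on before the case and is off now" => failure */
	return s->debug_locks && !fake_debug_locks;
}

static void passing_case(void) { }
static void lock_bug_case(void) { fake_debug_locks = 0; }

static int demo_run(void)
{
	static struct lockdep_state state;

	if (kunit_register_external_test("lockdep", lockdep_pre,
					 lockdep_post, &state))
		return 0;
	return run_case_with_hooks(passing_case) == 0 &&
	       run_case_with_hooks(lock_bug_case) == 1;
}
```

The point of the void *priv argument is that each tool carries its own saved state (here the lockdep snapshot) without the test runner knowing its layout.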
Re: [PATCH 1/2] kunit: support failure from dynamic analysis tools
On Thu, 6 Aug 2020, Uriel Guajardo wrote: > Adds an API to allow dynamic analysis tools to fail the currently > running KUnit test case. > > - Always places the kunit test in the task_struct to allow other tools > to access the currently running KUnit test. > > - Creates a new header file to avoid circular dependencies that could be > created from the test.h file. > > Requires KASAN-KUnit integration patch to access the kunit test from > task_struct: > https://lore.kernel.org/linux-kselftest/20200606040349.246780-2-david...@google.com/ > > Signed-off-by: Uriel Guajardo > --- > include/kunit/test-bug.h | 24 > include/kunit/test.h | 1 + > lib/kunit/test.c | 10 ++ > 3 files changed, 31 insertions(+), 4 deletions(-) > create mode 100644 include/kunit/test-bug.h > > diff --git a/include/kunit/test-bug.h b/include/kunit/test-bug.h > new file mode 100644 > index ..283c19ec328f > --- /dev/null > +++ b/include/kunit/test-bug.h > @@ -0,0 +1,24 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * KUnit API allowing dynamic analysis tools to interact with KUnit tests > + * > + * Copyright (C) 2020, Google LLC. > + * Author: Uriel Guajardo > + */ > + > +#ifndef _KUNIT_TEST_BUG_H > +#define _KUNIT_TEST_BUG_H > + > +#if IS_ENABLED(CONFIG_KUNIT) > + > +extern void kunit_fail_current_test(void); > + > +#else > + > +static inline void kunit_fail_current_test(void) > +{ > +} > + > +#endif > + > +#endif /* _KUNIT_TEST_BUG_H */ This is great stuff! One thing I wonder though; how obvious will it be to someone running a KUnit test that the cause of the test failure is a dynamic analysis tool? Yes we'll see the dmesg logging from that tool but I don't think there's any context _within_ KUnit that could clarify the source of the failure. What about changing the above API to include a string message that KUnit can log, so it can at least identify the source of the failure (ubsan, kasan etc). 
That would alert anyone looking at KUnit output only that there's an external context to examine. > diff --git a/include/kunit/test.h b/include/kunit/test.h > index 3391f38389f8..81bf43a1abda 100644 > --- a/include/kunit/test.h > +++ b/include/kunit/test.h > @@ -11,6 +11,7 @@ > > #include > #include > +#include > #include > #include > #include > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > index dcc35fd30d95..d8189d827368 100644 > --- a/lib/kunit/test.c > +++ b/lib/kunit/test.c > @@ -16,6 +16,12 @@ > #include "string-stream.h" > #include "try-catch-impl.h" > > +void kunit_fail_current_test(void) > +{ > + if (current->kunit_test) > + kunit_set_failure(current->kunit_test); > +} > + > static void kunit_print_tap_version(void) > { > static bool kunit_has_printed_tap_version; > @@ -284,9 +290,7 @@ static void kunit_try_run_case(void *data) > struct kunit_suite *suite = ctx->suite; > struct kunit_case *test_case = ctx->test_case; > > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) > current->kunit_test = test; > -#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT) */ > > /* >* kunit_run_case_internal may encounter a fatal error; if it does, > @@ -602,9 +606,7 @@ void kunit_cleanup(struct kunit *test) > spin_unlock(&test->lock); > kunit_remove_resource(test, res); > } > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) > current->kunit_test = NULL; > -#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)*/ > } > EXPORT_SYMBOL_GPL(kunit_cleanup); > > -- > 2.28.0.163.g6104cc2f0b6-goog > >
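To make the string-message suggestion concrete, here is a rough userspace sketch of what a message-carrying kunit_fail_current_test() could look like. struct kunit and current_kunit_test are stubs standing in for the real kernel objects, and the API shape is the reviewer's suggestion, not an existing interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-ins: a real struct kunit and current->kunit_test live in
 * the kernel; these exist only so the sketch is self-contained. */
struct kunit {
	int failed;
	char fail_source[128];
};

static struct kunit *current_kunit_test;

/* Suggested API shape: fail the running test and record which analysis
 * tool asked for the failure, so KUnit output can name it. */
static void kunit_fail_current_test(const char *tool)
{
	struct kunit *test = current_kunit_test;

	if (!test)
		return; /* nothing running; mirrors the NULL check in the patch */
	test->failed = 1;
	snprintf(test->fail_source, sizeof(test->fail_source),
		 "failed by dynamic analysis tool: %s", tool);
}

static int demo(void)
{
	struct kunit t = { 0 };

	current_kunit_test = NULL;
	kunit_fail_current_test("kasan"); /* safe with no test running */

	current_kunit_test = &t;
	kunit_fail_current_test("kasan");
	return t.failed == 1 &&
	       strcmp(t.fail_source,
		      "failed by dynamic analysis tool: kasan") == 0;
}
```

A caller such as KASAN would then pass its own name, and the recorded string would appear alongside the KUnit failure output.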
[PATCH bpf] bpf: doc: remove references to warning message when using bpf_trace_printk()
The BPF helper bpf_trace_printk() no longer uses trace_printk(); it now triggers a dedicated trace event. Hence the described warning is no longer present, so remove the discussion of it as it may confuse people. Fixes: ac5a72ea5c89 ("bpf: Use dedicated bpf_trace_printk event instead of trace_printk()") Signed-off-by: Alan Maguire --- Documentation/bpf/bpf_design_QA.rst | 11 --- 1 file changed, 11 deletions(-) diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst index 12a246f..2df7b06 100644 --- a/Documentation/bpf/bpf_design_QA.rst +++ b/Documentation/bpf/bpf_design_QA.rst @@ -246,17 +246,6 @@ program is loaded the kernel will print warning message, so this helper is only useful for experiments and prototypes. Tracing BPF programs are root only. -Q: bpf_trace_printk() helper warning - -Q: When bpf_trace_printk() helper is used the kernel prints nasty -warning message. Why is that? - -A: This is done to nudge program authors into better interfaces when -programs need to pass data to user space. Like bpf_perf_event_output() -can be used to efficiently stream data via perf ring buffer. -BPF maps can be used for asynchronous data sharing between kernel -and user space. bpf_trace_printk() should only be used for debugging. - Q: New functionality via kernel modules? Q: Can BPF functionality such as new program or map types, new -- 1.8.3.1
[RFC PATCH bpf-next 4/4] selftests/bpf: add bpf_trace_btf helper tests
Basic tests verifying various flag combinations for bpf_trace_btf() using a tp_btf program to trace skb data. Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/trace_btf.c | 45 ++ .../selftests/bpf/progs/netif_receive_skb.c| 43 + 2 files changed, 88 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_btf.c b/tools/testing/selftests/bpf/prog_tests/trace_btf.c new file mode 100644 index 000..e64b69d --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_btf.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +#include "netif_receive_skb.skel.h" + +void test_trace_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 10 127.0.0.1"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_trace_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->num_subtests)) + goto cleanup; + + CHECK(bss->num_subtests != bss->ran_subtests, "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests); + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..cab764e --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; + +#define CHECK_TRACE(_p, flags) \ + do { \ + ++num_subtests; \ + if (ret >= 0) { \ + ++ran_subtests; \ + ret = bpf_trace_btf(_p, sizeof(*(_p)), 0, flags);\ + }\ + } while (0) + +/* TRACE_EVENT(netif_receive_skb, + * TP_PROTO(struct sk_buff *skb), + */ +SEC("tp_btf/netif_receive_skb") +int BPF_PROG(trace_netif_receive_skb, struct sk_buff *skb) +{ + static const char skb_type[] = "struct sk_buff"; + static struct btf_ptr p = { }; + + p.ptr = skb; + p.type = skb_type; + + CHECK_TRACE(&p, 0); + CHECK_TRACE(&p, BTF_TRACE_F_COMPACT); + CHECK_TRACE(&p, BTF_TRACE_F_NONAME); + CHECK_TRACE(&p, BTF_TRACE_F_PTR_RAW); + CHECK_TRACE(&p, BTF_TRACE_F_ZERO); + CHECK_TRACE(&p, BTF_TRACE_F_COMPACT | BTF_TRACE_F_NONAME | + BTF_TRACE_F_PTR_RAW | BTF_TRACE_F_ZERO); + + return 0; +} -- 1.8.3.1
[RFC PATCH bpf-next 3/4] bpf: add bpf_trace_btf helper
f_trace_printk: }, -0 [023] d.s. 1825.778448: bpf_trace_printk: } Flags modifying display are as follows: - BTF_TRACE_F_COMPACT: no formatting around type information - BTF_TRACE_F_NONAME: no struct/union member names/types - BTF_TRACE_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px. - BTF_TRACE_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 ++-- include/uapi/linux/bpf.h | 63 + kernel/bpf/core.c | 5 ++ kernel/bpf/helpers.c | 4 ++ kernel/trace/bpf_trace.c | 102 - scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 63 + 8 files changed, 243 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 6143b6e..f67819d 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -934,6 +934,7 @@ struct bpf_event_entry { const char *kernel_type_name(u32 btf_type_id); const struct bpf_func_proto *bpf_get_trace_printk_proto(void); +const struct bpf_func_proto *bpf_get_trace_btf_proto(void); typedef unsigned long (*bpf_ctx_copy_t)(void *dst, const void *src, unsigned long off, unsigned long len); diff --git a/include/linux/btf.h b/include/linux/btf.h index 46bf9f4..3d31e28 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -61,10 +62,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_TRACE_F_COMPACT +#define BTF_SHOW_NONAMEBTF_TRACE_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_TRACE_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_TRACE_F_ZERO #define BTF_SHOW_NONEWLINE (1ULL << 32) #define BTF_SHOW_UNSAFE(1ULL << 33) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index b134e67..726fee4 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3394,6 +3394,36 @@ struct bpf_stack_build_id { * A non-negative value equal to or less than *size* on success, * or a negative error in case of failure. * + * long bpf_trace_btf(struct btf_ptr *ptr, u32 btf_ptr_size, u32 trace_id, u64 flags) + * Description + * Utilize BTF to trace a representation of *ptr*->ptr, using + * *ptr*->type name or *ptr*->type_id. *ptr*->type_name + * should specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * bpf_trace_printk()ed. Safe copy of the pointer data is + * carried out to avoid kernel crashes during data display. + * Tracing specifies *trace_id* as the id associated with the + * trace event; this can be used to filter trace events + * to show a subset of all traced output, helping to avoid + * the situation where BTF output is intermixed with other + * output. + * + * *flags* is a combination of + * + * **BTF_TRACE_F_COMPACT** + * no formatting around type information + * **BTF_TRACE_F_NONAME** + * no struct/union member names/types + * **BTF_TRACE_F_PTR_RAW** + * show raw (unobfuscated) pointer values; + * equivalent to printk specifier %px. + * **BTF_TRACE_F_ZERO** + * show zero-valued struct/union members; they + * are not displayed by default + * + * Return + * The number of bytes traced, or a negative error in cases of + * failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3538,6 +3568,7 @@ struct bpf_stack_build_id { FN(skc_to_tcp_request_sock),\ FN(skc_to_udp6_sock), \ FN(get_task_stack), \ + FN(trace_btf), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -4446,4 +4477,36 @@ struct bpf_sk_lookup { __u32 local_port; /* Host byte order */ }; +/* + * struct btf_ptr is used for typed pointer display; the + * additional type string/BTF type id are used to render
[RFC PATCH bpf-next 0/4] bpf: add bpf-based bpf_trace_printk()-like support
large amounts of data using a complex mechanism such as BTF traversal, but still provides a way for the display of such data to be achieved via BPF programs. Future work could include a bpf_printk_btf() function to invoke display via printk() where the elements of a data structure are printk()ed one at a time. Thanks to Petr Mladek, Andy Shevchenko and Rasmus Villemoes who took time to look at the earlier printk() format-specifier-focused version of this and provided feedback clarifying the problems with that approach. - Added trace id to the bpf_trace_printk events as a means of separating output from standard bpf_trace_printk() events, ensuring it can be easily parsed by the reader. - Added bpf_trace_btf() helper tests which do simple verification of the various display options. Changes since v2: - Alexei and Yonghong suggested it would be good to use probe_kernel_read() on to-be-shown data to ensure safety during operation. Safe copy via probe_kernel_read() to a buffer object in "struct btf_show" is used to support this. A few different approaches were explored, including dynamic allocation and per-cpu buffers. The downside of dynamic allocation is that it would be done during BPF program execution for bpf_trace_printk()s using %pT format specifiers. The problem with per-cpu buffers is we'd have to manage preemption, and since the display of an object occurs over an extended period and in printk context, where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. Utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. 
Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2). - safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option. - pointers are prefixed with 0x for clarity (Alexei, patch 2) - added additional comments and explanations around BTF show code, especially around determining whether objects are zeroed. Also tried to comment the safe object scheme used. (Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8) Changes since v1: - changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifier so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. 
- removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. - added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (4): bpf: provide function to get vmlinux BTF information
[RFC PATCH bpf-next 2/4] bpf: make BTF show support generic, apply to seq files/bpf_trace_printk
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support three instances; - the current seq file show - a show which triggers the bpf_trace/bpf_trace_printk tracepoint for each portion of the data displayed Both classes of show function call btf_type_show() with different targets: - for seq_show, the seq file is the target - for bpf_trace_printk(), no target is needed. In the tracing case we need to also track additional data - length of data written specifically for the return value. By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONEWLINE - suppress newline only. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). The bpf_trace_printk tracepoint is augmented with a "trace_id" field; it is used to allow tracepoint filtering as typed display information can easily be interspersed with other tracing data, making it hard to read. Specifying a trace_id will allow users to selectively trace data, eliminating noise. 
Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 + include/linux/btf.h | 37 ++ kernel/bpf/btf.c | 962 ++- kernel/trace/bpf_trace.c | 19 +- kernel/trace/bpf_trace.h | 6 +- 5 files changed, 916 insertions(+), 110 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 55eb67d..6143b6e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -946,6 +946,8 @@ typedef u32 (*bpf_convert_ctx_access_t)(enum bpf_access_type type, u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size, void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy); +int bpf_trace_vprintk(__u32 trace_id, const char *fmt, va_list ap); + /* an array of programs to be executed under rcu_lock. * * Typical usage: diff --git a/include/linux/btf.h b/include/linux/btf.h index 8b81fbb..46bf9f4 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,44 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. + * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_NONEWLINE: include indent, but suppress newline; + * to be used when a show function implicitly includes a newline. + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. 
+ */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_NONEWLINE (1ULL << 32) +#define BTF_SHOW_UNSAFE(1ULL << 33) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Trace string representation of obj of BTF type_id. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_trace_show(const struct btf *btf, u32 type_id, void *obj, + u32 trace_id, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 91afdd4..be47304 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -282,6 +282,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't
[RFC PATCH bpf-next 1/4] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index cef4ef0..55eb67d 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1290,6 +1290,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b6ccfce..05dfc41 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11064,6 +11064,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11097,12 +11108,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
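The hunk moved into bpf_get_btf_vmlinux() is the usual check/lock/re-check lazy initializer: the unlocked fast path avoids taking bpf_verifier_lock once btf_vmlinux is set, and the re-check under the lock ensures btf_parse_vmlinux() runs at most once. A userspace sketch of the same shape, with a pthread mutex standing in for bpf_verifier_lock (this single-threaded demo sidesteps the memory-ordering subtleties of double-checked locking; all names are stand-ins):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-ins for btf_vmlinux, btf_parse_vmlinux() and bpf_verifier_lock. */
struct btf { int id; };

static struct btf *btf_vmlinux;
static pthread_mutex_t verifier_lock = PTHREAD_MUTEX_INITIALIZER;
static int parse_calls; /* counts how often the expensive parse runs */

static struct btf *parse_vmlinux(void)
{
	static struct btf vmlinux_btf = { 1 };

	parse_calls++;
	return &vmlinux_btf;
}

/* Same shape as bpf_get_btf_vmlinux(): unlocked fast-path check, then
 * re-check under the lock before doing the expensive parse. */
static struct btf *get_btf_vmlinux(void)
{
	if (!btf_vmlinux) {
		pthread_mutex_lock(&verifier_lock);
		if (!btf_vmlinux)
			btf_vmlinux = parse_vmlinux();
		pthread_mutex_unlock(&verifier_lock);
	}
	return btf_vmlinux;
}
```

Repeated callers always get the same pointer, and the parse happens exactly once, which is what makes the function safe to call from the later display paths as well as the verifier.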
[PATCH v3 bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
The bpf helper bpf_trace_printk() uses trace_printk() under the hood. This leads to an alarming warning message originating from trace buffer allocation which occurs the first time a program using bpf_trace_printk() is loaded. We can instead create a trace event for bpf_trace_printk() and enable it in-kernel when/if we encounter a program using the bpf_trace_printk() helper. With this approach, trace_printk() is not used directly and no warning message appears. This work was started by Steven (see Link) and finished by Alan; added Steven's Signed-off-by with his permission. Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Alan Maguire Acked-by: Andrii Nakryiko --- kernel/trace/Makefile| 2 ++ kernel/trace/bpf_trace.c | 42 +- kernel/trace/bpf_trace.h | 34 ++ 3 files changed, 73 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index 6575bb0..aeba5ee 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -31,6 +31,8 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE GCOV_PROFILE := y endif +CFLAGS_bpf_trace.o := -I$(src) + CFLAGS_trace_benchmark.o := -I$(src) CFLAGS_trace_events_filter.o := -I$(src) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index e0b7775..0a2716d 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -19,6 +20,9 @@ #include "trace_probe.h" #include "trace.h" +#define CREATE_TRACE_POINTS +#include "bpf_trace.h" + #define bpf_event_rcu_dereference(p) \ rcu_dereference_protected(p, lockdep_is_held(&bpf_event_mutex)) @@ -374,6 +378,30 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +static DEFINE_RAW_SPINLOCK(trace_printk_lock); + +#define BPF_TRACE_PRINTK_SIZE 1024 + +static inline __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...) 
+{ + static char buf[BPF_TRACE_PRINTK_SIZE]; + unsigned long flags; + va_list ap; + int ret; + + raw_spin_lock_irqsave(&trace_printk_lock, flags); + va_start(ap, fmt); + ret = vsnprintf(buf, sizeof(buf), fmt, ap); + va_end(ap); + /* vsnprintf() will not append null for zero-length strings */ + if (ret == 0) + buf[0] = '\0'; + trace_bpf_trace_printk(buf); + raw_spin_unlock_irqrestore(&trace_printk_lock, flags); + + return ret; +} + /* * Only limited trace_printk() conversion specifiers allowed: * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s @@ -483,8 +511,7 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, */ #define __BPF_TP_EMIT()__BPF_ARG3_TP() #define __BPF_TP(...) \ - __trace_printk(0 /* Fake ip */, \ - fmt, ##__VA_ARGS__) + bpf_do_trace_printk(fmt, ##__VA_ARGS__) #define __BPF_ARG1_TP(...) \ ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ @@ -521,10 +548,15 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, const struct bpf_func_proto *bpf_get_trace_printk_proto(void) { /* -* this program might be calling bpf_trace_printk, -* so allocate per-cpu printk buffers +* This program might be calling bpf_trace_printk, +* so enable the associated bpf_trace/bpf_trace_printk event. +* Repeat this each time as it is possible a user has +* disabled bpf_trace_printk events. By loading a program +* calling bpf_trace_printk() however the user has expressed +* the intent to see such events. 
*/ - trace_printk_init_buffers(); + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) + pr_warn_ratelimited("could not enable bpf_trace_printk events"); return &bpf_trace_printk_proto; } diff --git a/kernel/trace/bpf_trace.h b/kernel/trace/bpf_trace.h new file mode 100644 index 000..9acbc11 --- /dev/null +++ b/kernel/trace/bpf_trace.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM bpf_trace + +#if !defined(_TRACE_BPF_TRACE_H) || defined(TRACE_HEADER_MULTI_READ) + +#define _TRACE_BPF_TRACE_H + +#include + +TRACE_EVENT(bpf_trace_printk, + + TP_PROTO(const char *bpf_string), + + TP_ARGS(bpf_string), + + TP_STRUCT__entry( + __string(bpf_string, bpf_string) + ), + + TP_fast_assign( + __assign_str
[PATCH v3 bpf-next 0/2] bpf: fix use of trace_printk() in BPF
Steven suggested a way to resolve the appearance of the warning banner that appears as a result of using trace_printk() in BPF [1]. Applying the patch and testing reveals all works as expected; we can call bpf_trace_printk() and see the trace messages in /sys/kernel/debug/tracing/trace_pipe and no banner message appears. Also add a test prog to verify basic bpf_trace_printk() helper behaviour. Changes since v2: - fixed stray newline in bpf_trace_printk(), use sizeof(buf) rather than #defined value in vsnprintf() (Daniel, patch 1) - Daniel also pointed out that vsnprintf() returns 0 on error rather than a negative value; also turns out that a null byte is not appended if the length of the string written is zero, so to fix for cases where the string to be traced is zero length we set the null byte explicitly (Daniel, patch 1) - switch to using getline() for retrieving lines from trace buffer to ensure we don't read a portion of the search message in one read() operation and then fail to find it (Andrii, patch 2) Changes since v1: - reorder header inclusion in bpf_trace.c (Steven, patch 1) - trace zero-length messages also (Andrii, patch 1) - use a raw spinlock to ensure there are no issues for PREEMPT_RT kernels when using bpf_trace_printk() within other raw spinlocks (Steven, patch 1) - always enable bpf_trace_printk() tracepoint when loading programs using bpf_trace_printk() as this will ensure that a user disabling that tracepoint will not prevent tracing output from being logged (Steven, patch 1) - use "tp/raw_syscalls/sys_enter" and a usleep(1) to trigger events in the selftest ensuring test runs faster (Andrii, patch 2) [1] https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Alan Maguire (2): bpf: use dedicated bpf_trace_printk event instead of trace_printk() selftests/bpf: add selftests verifying bpf_trace_printk() behaviour kernel/trace/Makefile | 2 + kernel/trace/bpf_trace.c | 42 ++-- kernel/trace/bpf_trace.h | 34 ++ 
.../selftests/bpf/prog_tests/trace_printk.c| 75 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 5 files changed, 169 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c -- 1.8.3.1
[PATCH v3 bpf-next 2/2] selftests/bpf: add selftests verifying bpf_trace_printk() behaviour
Simple selftests that verifies bpf_trace_printk() returns a sensible value and tracing messages appear. Signed-off-by: Alan Maguire Acked-by: Andrii Nakryiko --- .../selftests/bpf/prog_tests/trace_printk.c| 75 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 2 files changed, 96 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk.c b/tools/testing/selftests/bpf/prog_tests/trace_printk.c new file mode 100644 index 000..39b0dec --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk.c @@ -0,0 +1,75 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include + +#include "trace_printk.skel.h" + +#define TRACEBUF "/sys/kernel/debug/tracing/trace_pipe" +#define SEARCHMSG "testing,testing" + +void test_trace_printk(void) +{ + int err, iter = 0, duration = 0, found = 0; + struct trace_printk__bss *bss; + struct trace_printk *skel; + char *buf = NULL; + FILE *fp = NULL; + size_t buflen; + + skel = trace_printk__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = trace_printk__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = trace_printk__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + fp = fopen(TRACEBUF, "r"); + if (CHECK(fp == NULL, "could not open trace buffer", + "error %d opening %s", errno, TRACEBUF)) + goto cleanup; + + /* We do not want to wait forever if this test fails... 
*/ + fcntl(fileno(fp), F_SETFL, O_NONBLOCK); + + /* wait for tracepoint to trigger */ + usleep(1); + trace_printk__detach(skel); + + if (CHECK(bss->trace_printk_ran == 0, + "bpf_trace_printk never ran", + "ran == %d", bss->trace_printk_ran)) + goto cleanup; + + if (CHECK(bss->trace_printk_ret <= 0, + "bpf_trace_printk returned <= 0 value", + "got %d", bss->trace_printk_ret)) + goto cleanup; + + /* verify our search string is in the trace buffer */ + while (getline(&buf, &buflen, fp) >= 0 || errno == EAGAIN) { + if (strstr(buf, SEARCHMSG) != NULL) + found++; + if (found == bss->trace_printk_ran) + break; + if (++iter > 1000) + break; + } + + if (CHECK(!found, "message from bpf_trace_printk not found", + "no instance of %s in %s", SEARCHMSG, TRACEBUF)) + goto cleanup; + +cleanup: + trace_printk__destroy(skel); + free(buf); + if (fp) + fclose(fp); +} diff --git a/tools/testing/selftests/bpf/progs/trace_printk.c b/tools/testing/selftests/bpf/progs/trace_printk.c new file mode 100644 index 000..8ca7f39 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/trace_printk.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2020, Oracle and/or its affiliates. + +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int trace_printk_ret = 0; +int trace_printk_ran = 0; + +SEC("tp/raw_syscalls/sys_enter") +int sys_enter(void *ctx) +{ + static const char fmt[] = "testing,testing %d\n"; + + trace_printk_ret = bpf_trace_printk(fmt, sizeof(fmt), + ++trace_printk_ran); + return 0; +} -- 1.8.3.1
[PATCH v2 bpf-next 0/2] bpf: fix use of trace_printk() in BPF
Steven suggested a way to resolve the appearance of the warning banner that appears as a result of using trace_printk() in BPF [1]. Applying the patch and testing reveals all works as expected; we can call bpf_trace_printk() and see the trace messages in /sys/kernel/debug/tracing/trace_pipe and no banner message appears. Also add a test prog to verify basic bpf_trace_printk() helper behaviour. Changes since v1: - reorder header inclusion in bpf_trace.c (Steven, patch 1) - trace zero-length messages also (Andrii, patch 1) - use a raw spinlock to ensure there are no issues for PREEMPT_RT kernels when using bpf_trace_printk() within other raw spinlocks (Steven, patch 1) - always enable bpf_trace_printk() tracepoint when loading programs using bpf_trace_printk() as this will ensure that a user disabling that tracepoint will not prevent tracing output from being logged (Steven, patch 1) - use "tp/raw_syscalls/sys_enter" and a usleep(1) to trigger events in the selftest ensuring test runs faster (Andrii, patch 2) [1] https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Alan Maguire (2): bpf: use dedicated bpf_trace_printk event instead of trace_printk() selftests/bpf: add selftests verifying bpf_trace_printk() behaviour kernel/trace/Makefile | 2 + kernel/trace/bpf_trace.c | 41 ++-- kernel/trace/bpf_trace.h | 34 ++ .../selftests/bpf/prog_tests/trace_printk.c| 74 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 5 files changed, 167 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c -- 1.8.3.1
[PATCH v2 bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
The bpf helper bpf_trace_printk() uses trace_printk() under the hood. This leads to an alarming warning message originating from trace buffer allocation which occurs the first time a program using bpf_trace_printk() is loaded. We can instead create a trace event for bpf_trace_printk() and enable it in-kernel when/if we encounter a program using the bpf_trace_printk() helper. With this approach, trace_printk() is not used directly and no warning message appears. This work was started by Steven (see Link) and finished by Alan; added Steven's Signed-off-by with his permission. Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Alan Maguire --- kernel/trace/Makefile| 2 ++ kernel/trace/bpf_trace.c | 41 - kernel/trace/bpf_trace.h | 34 ++ 3 files changed, 72 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index 6575bb0..aeba5ee 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -31,6 +31,8 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE GCOV_PROFILE := y endif +CFLAGS_bpf_trace.o := -I$(src) + CFLAGS_trace_benchmark.o := -I$(src) CFLAGS_trace_events_filter.o := -I$(src) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 1d874d8..1414bf5 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -19,6 +20,9 @@ #include "trace_probe.h" #include "trace.h" +#define CREATE_TRACE_POINTS +#include "bpf_trace.h" + #define bpf_event_rcu_dereference(p) \ rcu_dereference_protected(p, lockdep_is_held(&bpf_event_mutex)) @@ -374,6 +378,29 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +static DEFINE_RAW_SPINLOCK(trace_printk_lock); + +#define BPF_TRACE_PRINTK_SIZE 1024 + + +static inline __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...) 
+{ + static char buf[BPF_TRACE_PRINTK_SIZE]; + unsigned long flags; + va_list ap; + int ret; + + raw_spin_lock_irqsave(&trace_printk_lock, flags); + va_start(ap, fmt); + ret = vsnprintf(buf, BPF_TRACE_PRINTK_SIZE, fmt, ap); + va_end(ap); + if (ret >= 0) + trace_bpf_trace_printk(buf); + raw_spin_unlock_irqrestore(&trace_printk_lock, flags); + + return ret; +} + /* * Only limited trace_printk() conversion specifiers allowed: * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s @@ -483,8 +510,7 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, */ #define __BPF_TP_EMIT()__BPF_ARG3_TP() #define __BPF_TP(...) \ - __trace_printk(0 /* Fake ip */, \ - fmt, ##__VA_ARGS__) + bpf_do_trace_printk(fmt, ##__VA_ARGS__) #define __BPF_ARG1_TP(...) \ ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ @@ -521,10 +547,15 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, const struct bpf_func_proto *bpf_get_trace_printk_proto(void) { /* -* this program might be calling bpf_trace_printk, -* so allocate per-cpu printk buffers +* This program might be calling bpf_trace_printk, +* so enable the associated bpf_trace/bpf_trace_printk event. +* Repeat this each time as it is possible a user has +* disabled bpf_trace_printk events. By loading a program +* calling bpf_trace_printk() however the user has expressed +* the intent to see such events. 
*/ - trace_printk_init_buffers(); + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) + pr_warn_ratelimited("could not enable bpf_trace_printk events"); return &bpf_trace_printk_proto; } diff --git a/kernel/trace/bpf_trace.h b/kernel/trace/bpf_trace.h new file mode 100644 index 000..9acbc11 --- /dev/null +++ b/kernel/trace/bpf_trace.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM bpf_trace + +#if !defined(_TRACE_BPF_TRACE_H) || defined(TRACE_HEADER_MULTI_READ) + +#define _TRACE_BPF_TRACE_H + +#include + +TRACE_EVENT(bpf_trace_printk, + + TP_PROTO(const char *bpf_string), + + TP_ARGS(bpf_string), + + TP_STRUCT__entry( + __string(bpf_string, bpf_string) + ), + + TP_fast_assign( + __assign_str(bpf_string, bpf_string); + ), + + TP_printk("%s", __get_str(bpf_string)) +); + +#endif /* _TRACE
[PATCH v2 bpf-next 2/2] selftests/bpf: add selftests verifying bpf_trace_printk() behaviour
Simple selftests that verifies bpf_trace_printk() returns a sensible value and tracing messages appear. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/trace_printk.c| 74 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 2 files changed, 95 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk.c b/tools/testing/selftests/bpf/prog_tests/trace_printk.c new file mode 100644 index 000..25dd0f47 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include + +#include "trace_printk.skel.h" + +#define TRACEBUF "/sys/kernel/debug/tracing/trace_pipe" +#define SEARCHMSG "testing,testing" + +void test_trace_printk(void) +{ + int err, iter = 0, duration = 0, found = 0, fd = -1; + struct trace_printk__bss *bss; + struct trace_printk *skel; + char buf[1024]; + + skel = trace_printk__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = trace_printk__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = trace_printk__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + fd = open(TRACEBUF, O_RDONLY); + if (CHECK(fd < 0, "could not open trace buffer", + "error %d opening %s", errno, TRACEBUF)) + goto cleanup; + + /* We do not want to wait forever if this test fails... 
*/ + fcntl(fd, F_SETFL, O_NONBLOCK); + + /* wait for tracepoint to trigger */ + usleep(1); + trace_printk__detach(skel); + + if (CHECK(bss->trace_printk_ran == 0, + "bpf_trace_printk never ran", + "ran == %d", bss->trace_printk_ran)) + goto cleanup; + + if (CHECK(bss->trace_printk_ret <= 0, + "bpf_trace_printk returned <= 0 value", + "got %d", bss->trace_printk_ret)) + goto cleanup; + + /* verify our search string is in the trace buffer */ + while (read(fd, buf, sizeof(buf)) >= 0 || errno == EAGAIN) { + if (strstr(buf, SEARCHMSG) != NULL) + found++; + if (found == bss->trace_printk_ran) + break; + if (++iter > 1000) + break; + } + + if (CHECK(!found, "message from bpf_trace_printk not found", + "no instance of %s in %s", SEARCHMSG, TRACEBUF)) + goto cleanup; + + printf("ran %d times; last return value %d, with %d instances of msg\n", + bss->trace_printk_ran, bss->trace_printk_ret, found); +cleanup: + trace_printk__destroy(skel); + if (fd != -1) + close(fd); +} diff --git a/tools/testing/selftests/bpf/progs/trace_printk.c b/tools/testing/selftests/bpf/progs/trace_printk.c new file mode 100644 index 000..8ca7f39 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/trace_printk.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2020, Oracle and/or its affiliates. + +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int trace_printk_ret = 0; +int trace_printk_ran = 0; + +SEC("tp/raw_syscalls/sys_enter") +int sys_enter(void *ctx) +{ + static const char fmt[] = "testing,testing %d\n"; + + trace_printk_ret = bpf_trace_printk(fmt, sizeof(fmt), + ++trace_printk_ran); + return 0; +} -- 1.8.3.1
Re: [PATCH bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
On Tue, 7 Jul 2020, Andrii Nakryiko wrote: > On Fri, Jul 3, 2020 at 7:47 AM Alan Maguire wrote: > > > > The bpf helper bpf_trace_printk() uses trace_printk() under the hood. > > This leads to an alarming warning message originating from trace > > buffer allocation which occurs the first time a program using > > bpf_trace_printk() is loaded. > > > > We can instead create a trace event for bpf_trace_printk() and enable > > it in-kernel when/if we encounter a program using the > > bpf_trace_printk() helper. With this approach, trace_printk() > > is not used directly and no warning message appears. > > > > This work was started by Steven (see Link) and finished by Alan; added > > Steven's Signed-off-by with his permission. > > > > Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home > > Signed-off-by: Steven Rostedt (VMware) > > Signed-off-by: Alan Maguire > > --- > > kernel/trace/Makefile| 2 ++ > > kernel/trace/bpf_trace.c | 41 + > > kernel/trace/bpf_trace.h | 34 ++ > > 3 files changed, 73 insertions(+), 4 deletions(-) > > create mode 100644 kernel/trace/bpf_trace.h > > > > [...] > > > +static DEFINE_SPINLOCK(trace_printk_lock); > > + > > +#define BPF_TRACE_PRINTK_SIZE 1024 > > + > > +static inline int bpf_do_trace_printk(const char *fmt, ...) > > +{ > > + static char buf[BPF_TRACE_PRINTK_SIZE]; > > + unsigned long flags; > > + va_list ap; > > + int ret; > > + > > + spin_lock_irqsave(&trace_printk_lock, flags); > > + va_start(ap, fmt); > > + ret = vsnprintf(buf, BPF_TRACE_PRINTK_SIZE, fmt, ap); > > + va_end(ap); > > + if (ret > 0) > > + trace_bpf_trace_printk(buf); > > Is there any reason to artificially limit the case of printing empty > string? It's kind of an awkward use case, for sure, but having > guarantee that every bpf_trace_printk() invocation triggers tracepoint > is a nice property, no? > True enough; I'll modify the above to support empty string display also. 
> > + spin_unlock_irqrestore(&trace_printk_lock, flags); > > + > > + return ret; > > +} > > + > > /* > > * Only limited trace_printk() conversion specifiers allowed: > > * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s > > @@ -483,8 +510,7 @@ static void bpf_trace_copy_string(char *buf, void > > *unsafe_ptr, char fmt_ptype, > > */ > > #define __BPF_TP_EMIT()__BPF_ARG3_TP() > > #define __BPF_TP(...) \ > > - __trace_printk(0 /* Fake ip */, \ > > - fmt, ##__VA_ARGS__) > > + bpf_do_trace_printk(fmt, ##__VA_ARGS__) > > > > #define __BPF_ARG1_TP(...) \ > > ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ > > @@ -518,13 +544,20 @@ static void bpf_trace_copy_string(char *buf, void > > *unsafe_ptr, char fmt_ptype, > > .arg2_type = ARG_CONST_SIZE, > > }; > > > > +int bpf_trace_printk_enabled; > > static? > oops, will fix. > > + > > const struct bpf_func_proto *bpf_get_trace_printk_proto(void) > > { > > /* > > * this program might be calling bpf_trace_printk, > > -* so allocate per-cpu printk buffers > > +* so enable the associated bpf_trace/bpf_trace_printk event. > > */ > > - trace_printk_init_buffers(); > > + if (!bpf_trace_printk_enabled) { > > + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) > > just to double check, it's ok to simultaneously enable same event in > parallel, right? > From an ftrace perspective, it looks fine since the actual enable is mutex-protected. We could grab the trace_printk_lock here too I guess, but I don't _think_ there's a need. Thanks for reviewing! I'll spin up a v2 with the above fixes shortly plus I'll change to using tp/raw_syscalls/sys_enter in the test as you suggested. Alan
[PATCH bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
The bpf helper bpf_trace_printk() uses trace_printk() under the hood. This leads to an alarming warning message originating from trace buffer allocation which occurs the first time a program using bpf_trace_printk() is loaded. We can instead create a trace event for bpf_trace_printk() and enable it in-kernel when/if we encounter a program using the bpf_trace_printk() helper. With this approach, trace_printk() is not used directly and no warning message appears. This work was started by Steven (see Link) and finished by Alan; added Steven's Signed-off-by with his permission. Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Alan Maguire --- kernel/trace/Makefile| 2 ++ kernel/trace/bpf_trace.c | 41 + kernel/trace/bpf_trace.h | 34 ++ 3 files changed, 73 insertions(+), 4 deletions(-) create mode 100644 kernel/trace/bpf_trace.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index 6575bb0..aeba5ee 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -31,6 +31,8 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE GCOV_PROFILE := y endif +CFLAGS_bpf_trace.o := -I$(src) + CFLAGS_trace_benchmark.o := -I$(src) CFLAGS_trace_events_filter.o := -I$(src) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 1d874d8..cdbafc4 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2,6 +2,10 @@ /* Copyright (c) 2011-2015 PLUMgrid, http://plumgrid.com * Copyright (c) 2016 Facebook */ +#define CREATE_TRACE_POINTS + +#include "bpf_trace.h" + #include #include #include @@ -11,6 +15,7 @@ #include #include #include +#include #include #include @@ -374,6 +379,28 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +static DEFINE_SPINLOCK(trace_printk_lock); + +#define BPF_TRACE_PRINTK_SIZE 1024 + +static inline int bpf_do_trace_printk(const char *fmt, ...) 
+{ + static char buf[BPF_TRACE_PRINTK_SIZE]; + unsigned long flags; + va_list ap; + int ret; + + spin_lock_irqsave(&trace_printk_lock, flags); + va_start(ap, fmt); + ret = vsnprintf(buf, BPF_TRACE_PRINTK_SIZE, fmt, ap); + va_end(ap); + if (ret > 0) + trace_bpf_trace_printk(buf); + spin_unlock_irqrestore(&trace_printk_lock, flags); + + return ret; +} + /* * Only limited trace_printk() conversion specifiers allowed: * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s @@ -483,8 +510,7 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, */ #define __BPF_TP_EMIT()__BPF_ARG3_TP() #define __BPF_TP(...) \ - __trace_printk(0 /* Fake ip */, \ - fmt, ##__VA_ARGS__) + bpf_do_trace_printk(fmt, ##__VA_ARGS__) #define __BPF_ARG1_TP(...) \ ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ @@ -518,13 +544,20 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, .arg2_type = ARG_CONST_SIZE, }; +int bpf_trace_printk_enabled; + const struct bpf_func_proto *bpf_get_trace_printk_proto(void) { /* * this program might be calling bpf_trace_printk, -* so allocate per-cpu printk buffers +* so enable the associated bpf_trace/bpf_trace_printk event. 
*/ - trace_printk_init_buffers(); + if (!bpf_trace_printk_enabled) { + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) + pr_warn_ratelimited("could not enable bpf_trace_printk events"); + else + bpf_trace_printk_enabled = 1; + } return &bpf_trace_printk_proto; } diff --git a/kernel/trace/bpf_trace.h b/kernel/trace/bpf_trace.h new file mode 100644 index 000..9acbc11 --- /dev/null +++ b/kernel/trace/bpf_trace.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM bpf_trace + +#if !defined(_TRACE_BPF_TRACE_H) || defined(TRACE_HEADER_MULTI_READ) + +#define _TRACE_BPF_TRACE_H + +#include + +TRACE_EVENT(bpf_trace_printk, + + TP_PROTO(const char *bpf_string), + + TP_ARGS(bpf_string), + + TP_STRUCT__entry( + __string(bpf_string, bpf_string) + ), + + TP_fast_assign( + __assign_str(bpf_string, bpf_string); + ), + + TP_printk("%s", __get_str(bpf_string)) +); + +#endif /* _TRACE_BPF_TRACE_H */ + +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH . +#define TRACE_INCLUDE_FILE bpf_trace + +#include -- 1.8.3.1
[PATCH bpf-next 0/2] bpf: fix use of trace_printk() in BPF
Steven suggested a way to resolve the appearance of the warning banner that appears as a result of using trace_printk() in BPF [1]. Applying the patch and testing reveals all works as expected; we can call bpf_trace_printk() and see the trace messages in /sys/kernel/debug/tracing/trace_pipe and no banner message appears. Also add a test prog to verify basic bpf_trace_printk() helper behaviour. Possible future work: ftrace supports trace instances, and one thing that strikes me is that we could make use of these in BPF to separate BPF program bpf_trace_printk() output from output of other tracing activities. I was thinking something like a sysctl net.core.bpf_trace_instance, defaulting to an empty value signifying we use the root trace instance. This would preserve existing behaviour while giving a way to separate BPF tracing output from other tracing output if wanted. [1] https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Alan Maguire (2): bpf: use dedicated bpf_trace_printk event instead of trace_printk() selftests/bpf: add selftests verifying bpf_trace_printk() behaviour kernel/trace/Makefile | 2 + kernel/trace/bpf_trace.c | 41 +++-- kernel/trace/bpf_trace.h | 34 +++ .../selftests/bpf/prog_tests/trace_printk.c| 71 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 +++ 5 files changed, 165 insertions(+), 4 deletions(-) create mode 100644 kernel/trace/bpf_trace.h create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c -- 1.8.3.1
[PATCH bpf-next 2/2] selftests/bpf: add selftests verifying bpf_trace_printk() behaviour
Simple selftest that verifies bpf_trace_printk() returns a sensible value and tracing messages appear. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/trace_printk.c| 71 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 +++ 2 files changed, 92 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk.c b/tools/testing/selftests/bpf/prog_tests/trace_printk.c new file mode 100644 index 000..a850cba --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk.c @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include + +#include "trace_printk.skel.h" + +#define TRACEBUF "/sys/kernel/debug/tracing/trace_pipe" +#define SEARCHMSG "testing,testing" + +void test_trace_printk(void) +{ + int err, duration = 0, found = 0; + struct trace_printk *skel; + struct trace_printk__bss *bss; + char buf[1024]; + int fd = -1; + + skel = trace_printk__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = trace_printk__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = trace_printk__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + fd = open(TRACEBUF, O_RDONLY); + if (CHECK(fd < 0, "could not open trace buffer", + "error %d opening %s", errno, TRACEBUF)) + goto cleanup; + + /* We do not want to wait forever if this test fails... 
*/ + fcntl(fd, F_SETFL, O_NONBLOCK); + + /* wait for tracepoint to trigger */ + sleep(1); + trace_printk__detach(skel); + + if (CHECK(bss->trace_printk_ran == 0, + "bpf_trace_printk never ran", + "ran == %d", bss->trace_printk_ran)) + goto cleanup; + + if (CHECK(bss->trace_printk_ret <= 0, + "bpf_trace_printk returned <= 0 value", + "got %d", bss->trace_printk_ret)) + goto cleanup; + + /* verify our search string is in the trace buffer */ + while (read(fd, buf, sizeof(buf)) >= 0) { + if (strstr(buf, SEARCHMSG) != NULL) + found++; + } + + if (CHECK(!found, "message from bpf_trace_printk not found", + "no instance of %s in %s", SEARCHMSG, TRACEBUF)) + goto cleanup; + + printf("ran %d times; last return value %d, with %d instances of msg\n", + bss->trace_printk_ran, bss->trace_printk_ret, found); +cleanup: + trace_printk__destroy(skel); + if (fd != -1) + close(fd); +} diff --git a/tools/testing/selftests/bpf/progs/trace_printk.c b/tools/testing/selftests/bpf/progs/trace_printk.c new file mode 100644 index 000..8ff6d49 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/trace_printk.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2020, Oracle and/or its affiliates. + +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int trace_printk_ret = 0; +int trace_printk_ran = 0; + +SEC("tracepoint/sched/sched_switch") +int sched_switch(void *ctx) +{ + static const char fmt[] = "testing,testing %d\n"; + + trace_printk_ret = bpf_trace_printk(fmt, sizeof(fmt), + ++trace_printk_ran); + return 0; +} -- 1.8.3.1
Re: linux-next: build failure after merge of the thunderbolt tree
On Tue, 30 Jun 2020, Stephen Rothwell wrote: > Hi all, > > After merging the thunderbolt tree, today's linux-next build (powerpc > allyesconfig) failed like this: > > > Caused by commit > > 54509f5005ca ("thunderbolt: Add KUnit tests for path walking") > > interacting with commit > > d4cdd146d0db ("kunit: generalize kunit_resource API beyond allocated > resources") > > from the kunit-next tree. > > I have applied the following merge fix patch. > > From: Stephen Rothwell > Date: Tue, 30 Jun 2020 15:51:50 +1000 > Subject: [PATCH] thunderbolt: merge fix for kunix_resource changes > > Signed-off-by: Stephen Rothwell Thanks Stephen, resolution looks good to me! If you need it Reviewed-by: Alan Maguire Once the kunit and thunderbolt trees are merged there may be some additional things we can do to simplify kunit resource utilization in the thunderbolt tests using the new kunit resource APIs; no hurry with that though. Nice to see the kunit resources code being used! Alan
Re: [PATCH v3 bpf-next 4/8] printk: add type-printing %pT format specifier which uses BTF
On Fri, 26 Jun 2020, Petr Mladek wrote: > On Tue 2020-06-23 13:07:07, Alan Maguire wrote: > > printk supports multiple pointer object type specifiers (printing > > netdev features etc). Extend this support using BTF to cover > > arbitrary types. "%pT" specifies the typed format, and the pointer > > argument is a "struct btf_ptr *" where struct btf_ptr is as follows: > > > > struct btf_ptr { > > void *ptr; > > const char *type; > > u32 id; > > }; > > > > Either the "type" string ("struct sk_buff") or the BTF "id" can be > > used to identify the type to use in displaying the associated "ptr" > > value. A convenience function to create and point at the struct > > is provided: > > > > printk(KERN_INFO "%pT", BTF_PTR_TYPE(skb, struct sk_buff)); > > > > When invoked, BTF information is used to traverse the sk_buff * > > and display it. Support is present for structs, unions, enums, > > typedefs and core types (though in the latter case there's not > > much value in using this feature of course). > > > > Default output is indented, but compact output can be specified > > via the 'c' option. Type names/member values can be suppressed > > using the 'N' option. Zero values are not displayed by default > > but can be using the '0' option. Pointer values are obfuscated > > unless the 'x' option is specified. As an example: > > > > struct sk_buff *skb = alloc_skb(64, GFP_KERNEL); > > pr_info("%pT", BTF_PTR_TYPE(skb, struct sk_buff)); > > > > ...gives us: > > > > (struct sk_buff){ > > .transport_header = (__u16)65535, > > .mac_header = (__u16)65535, > > .end = (sk_buff_data_t)192, > > .head = (unsigned char *)0x6b71155a, > > .data = (unsigned char *)0x6b71155a, > > .truesize = (unsigned int)768, > > .users = (refcount_t){ > > .refs = (atomic_t){ > >.counter = (int)1, > > }, > > }, > > .extensions = (struct skb_ext *)0xf486a130, > > } > > > > printk output is truncated at 1024 bytes. For cases where overflow > > is likely, the compact/no type names display modes may be used. 
> > Hmm, this scares me: > >1. The long message and many lines are going to stretch printk > design in another dimensions. > >2. vsprintf() is important for debugging the system. It has to be > stable. But the btf code is too complex. > Right on both points, and there's no way around that really. Representing even small data structures will stretch us to or beyond the 1024 byte limit. This can be mitigated by using compact display mode and not printing field names, but the output becomes hard to parse then. I think a better approach might be to start small, adding the core btf_show functionality to BPF, allowing consumers to use it there, perhaps via a custom helper. In the current model bpf_trace_printk() inherits the functionality to display data from core printk, so a different approach would be needed there. Other consumers outside of BPF could potentially avail of the show functionality directly via the btf_show functions in the future, but at least it would have one consumer at the outset, and wouldn't present problems like these for printk. > I would strongly prefer to keep this outside vsprintf and printk. > Please, invert the logic and convert it into using separate printk() > call for each printed line. > I think the above is in line with what you're suggesting? > > More details: > > Add 1: Long messages with many lines: > > IMHO, all existing printk() users are far below this limit. And this is > even worse because there are many short lines. They would require > double space to add prefixes (loglevel, timestamp, caller id) when > printing to console. > > You might argue that 1024bytes are enough for you. But for how long? > > Now, we have huge troubles to make printk() lockless and thus more > reliable. There is no way to allocate any internal buffers > dynamically. People using kernel on small devices have problem > with large static buffers. > > printk() is primary designed to print single line messages. 
There are > many use cases where many lines are needed and they are solved by > many separate printk() calls. > > > Add 2: Complex code: > > vsprintf() is currently called in printk() under logbuf_lock. It > might block printk() on the entire system. > > Most existing %p handlers are implemented by relatively > simple routines inside lib/vsprinf.c. The other external routines > look simple as well. > > btf looks like a huge beast to me. For example, probe_kernel_read() > prevented boot recently, see the commit 2ac5a3bf7042a1c4abb > ("vsprintf: Do not break early boot with probing addresses"). > > Yep, no way round this either. I'll try a different approach. Thanks for taking a look! Alan > Best Regards, > Petr >
Re: RFC: KTAP documentation - expected messages
On Tue, 23 Jun 2020, David Gow wrote: > On Mon, Jun 22, 2020 at 6:45 AM Frank Rowand wrote: > > > > Tim Bird started a thread [1] proposing that he document the selftest result > > format used by Linux kernel tests. > > > > [1] > > https://lore.kernel.org/r/cy4pr13mb1175b804e31e502221bc8163fd...@cy4pr13mb1175.namprd13.prod.outlook.com > > > > The issue of messages generated by the kernel being tested (that are not > > messages directly created by the tests, but are instead triggered as a > > side effect of the test) came up. In this thread, I will call these > > messages "expected messages". Instead of sidetracking that thread with > > a proposal to handle expected messages, I am starting this new thread. > > Thanks for doing this: I think there are quite a few tests which could > benefit from something like this. > > I think there were actually two separate questions: what do we do with > unexpected messages (most of which I expect are useless, but some of > which may end up being related to an unexpected test failure), and how > to have tests "expect" a particular message to appear. I'll stick to > talking about the latter for this thread, but even there there's two > possible interpretations of "expected messages" we probably want to > explicitly distinguish between: a message which must be present for > the test to pass (which I think best fits the "expected message" > name), and a message which the test is likely to produce, but which > shouldn't alter the result (an "ignored message"). I don't see much > use for the latter at present, but if we wanted to do more things with > messages and had some otherwise very verbose tests, it could > potentially be useful. > > The other thing I'd note here is that this proposal seems to be doing > all of the actual message filtering in userspace, which makes a lot of > sense for kselftest tests, but does mean that the kernel can't know if > the test has passed or failed. 
There's definitely a tradeoff between > trying to put too much needless string parsing in the kernel and > having to have a userland tool determine the test results. The > proposed KCSAN test suite[1] is using tracepoints to do this in the > kernel. It's not the cleanest thing, but there's no reason KUnit or > similar couldn't implement a nicer API around it. > > [1]: https://lkml.org/lkml/2020/6/22/1506 > For KTF the way we handled this was to use the APIs for catching function entry and return (via kprobes), specifying printk as the function to catch, and checking its argument string to verify the expected message was seen. That allows you to verify that messages appear in kernel testing context, but it's not ideal as printk() has not yet filled in the arguments in the buffer for display (there may be a better place to trace). If it seems like it could be useful I could have a go at porting the kprobe stuff to KUnit, as it helps expand the vocabulary for what can be tested in kernel context; for example we can also override return values for kernel functions to simulate errors. Alan
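The userspace half of this kind of expected-message checking boils down to pattern matching over a captured log buffer. A minimal sketch of what a harness-side matcher might look like (the helper names here are invented for illustration; this is not KTF, KUnit or kselftest API):

```c
#include <string.h>
#include <stdbool.h>

/* Hypothetical harness-side helpers: scan a captured log buffer
 * (e.g. dmesg output) for an "expected message". */
static bool log_contains_expected(const char *log, const char *expected)
{
	return strstr(log, expected) != NULL;
}

/* Count occurrences; an "ignored message" filter could be layered on
 * top of the same scan without affecting pass/fail. */
static int count_expected(const char *log, const char *expected)
{
	int n = 0;
	const char *p;

	for (p = log; (p = strstr(p, expected)) != NULL; p++)
		n++;
	return n;
}
```

As noted above, doing the matching this way pushes the pass/fail decision into userspace; the tracepoint and kprobe approaches keep it in kernel testing context.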
[PATCH v3 bpf-next 8/8] bpf/selftests: add tests for %pT format specifier
Tests verify we get a successful return value from bpf_trace_printk() using %pT format specifier with various modifiers/pointer values. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/trace_printk_btf.c| 45 + .../selftests/bpf/progs/netif_receive_skb.c| 47 ++ 2 files changed, 92 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c b/tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c new file mode 100644 index 000..791eb97 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +#include "netif_receive_skb.skel.h" + +void test_trace_printk_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 >/dev/null"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran.
+*/ + if (CHECK(bss->ret <= 0, + "bpf_trace_printk: got return value", + "ret <= 0 %d test %d\n", bss->ret, bss->num_subtests)) + goto cleanup; + + CHECK(bss->num_subtests != bss->ran_subtests, "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests); + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..03ca1d8 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,47 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int ret; +int num_subtests; +int ran_subtests; + +#define CHECK_PRINTK(_fmt, _p)\ + do {\ + char fmt[] = _fmt; \ + ++num_subtests; \ + if (ret >= 0) { \ + ++ran_subtests; \ + ret = bpf_trace_printk(fmt, sizeof(fmt), (_p)); \ + } \ + } while (0) + +/* TRACE_EVENT(netif_receive_skb, + * TP_PROTO(struct sk_buff *skb), + */ +SEC("tp_btf/netif_receive_skb") +int BPF_PROG(trace_netif_receive_skb, struct sk_buff *skb) +{ + char skb_type[] = "struct sk_buff"; + struct btf_ptr nullp = { .ptr = 0, .type = skb_type }; + struct btf_ptr p = { .ptr = skb, .type = skb_type }; + + CHECK_PRINTK("%pT\n", &p); + CHECK_PRINTK("%pTc\n", &p); + CHECK_PRINTK("%pTN\n", &p); + CHECK_PRINTK("%pTx\n", &p); + CHECK_PRINTK("%pT0\n", &p); + CHECK_PRINTK("%pTcNx0\n", &p); + CHECK_PRINTK("%pT\n", &nullp); + CHECK_PRINTK("%pTc\n", &nullp); + CHECK_PRINTK("%pTN\n", &nullp); + CHECK_PRINTK("%pTx\n", &nullp); + CHECK_PRINTK("%pT0\n", &nullp); + CHECK_PRINTK("%pTcNx0\n", &nullp); + + return 0; +} -- 1.8.3.1
[PATCH v3 bpf-next 5/8] printk: initialize vmlinux BTF outside of printk in late_initcall()
vmlinux BTF initialization can take time so it's best to do that outside of printk context; otherwise the first printk() using %pT will trigger BTF initialization. Signed-off-by: Alan Maguire --- lib/vsprintf.c | 12 1 file changed, 12 insertions(+) diff --git a/lib/vsprintf.c b/lib/vsprintf.c index c0d209d..8ac136a 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -3628,3 +3628,15 @@ int sscanf(const char *buf, const char *fmt, ...) return i; } EXPORT_SYMBOL(sscanf); + +/* + * Initialize vmlinux BTF as it may be used by printk()s and it's better + * to incur the cost of initialization outside of printk context. + */ +static int __init init_btf_vmlinux(void) +{ + (void) bpf_get_btf_vmlinux(); + + return 0; +} +late_initcall(init_btf_vmlinux); -- 1.8.3.1
[PATCH v3 bpf-next 2/8] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 966 ++-- 2 files changed, 899 insertions(+), 103 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 5c1ea99..a8a4563 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 58c9af1..c82cb18 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -281,6 +281,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* + * Common data to all BTF show operations. Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. 
See btf_type_show() below. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by show->state.depth_check being set, and if we + * encounter a non-zero value we set show->state.depth_to_show to + * the depth at which we encountered it. When we have completed the + * first pass, we will know if anything needs to be displayed if + * depth_to_show > depth. See btf_[struct,array]_show() for the + * implementation of this.
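The two-pass "skip zeroed members" scheme described in the comment above can be illustrated with a small userspace analogue. This is a sketch of the idea only, not the kernel's btf_show code, and it handles just int members:

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* Pass 1 of the scheme: walk the object looking for any nonzero byte,
 * without emitting output. */
static bool obj_has_nonzero(const void *obj, size_t len)
{
	const unsigned char *p = obj;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i])
			return true;
	return false;
}

/* Pass 2: actually render the member. Zeroed members are skipped
 * unless show_zero is set (the BTF_SHOW_ZERO analogue). */
static int show_int_member(char *buf, size_t sz, const char *name,
			   const int *obj, bool show_zero)
{
	if (!show_zero && !obj_has_nonzero(obj, sizeof(*obj)))
		return 0;
	return snprintf(buf, sz, ".%s = (int)%d,", name, *obj);
}
```

For nested structs the kernel code recurses, recording in depth_to_show the depth at which pass 1 found nonzero data, so the whole subtree can be skipped in pass 2.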
[PATCH v3 bpf-next 4/8] printk: add type-printing %pT format specifier which uses BTF
printk supports multiple pointer object type specifiers (printing netdev features etc). Extend this support using BTF to cover arbitrary types. "%pT" specifies the typed format, and the pointer argument is a "struct btf_ptr *" where struct btf_ptr is as follows: struct btf_ptr { void *ptr; const char *type; u32 id; }; Either the "type" string ("struct sk_buff") or the BTF "id" can be used to identify the type to use in displaying the associated "ptr" value. A convenience function to create and point at the struct is provided: printk(KERN_INFO "%pT", BTF_PTR_TYPE(skb, struct sk_buff)); When invoked, BTF information is used to traverse the sk_buff * and display it. Support is present for structs, unions, enums, typedefs and core types (though in the latter case there's not much value in using this feature of course). Default output is indented, but compact output can be specified via the 'c' option. Type names/member values can be suppressed using the 'N' option. Zero values are not displayed by default but can be using the '0' option. Pointer values are obfuscated unless the 'x' option is specified. As an example: struct sk_buff *skb = alloc_skb(64, GFP_KERNEL); pr_info("%pT", BTF_PTR_TYPE(skb, struct sk_buff)); ...gives us: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x6b71155a, .data = (unsigned char *)0x6b71155a, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, .extensions = (struct skb_ext *)0xf486a130, } printk output is truncated at 1024 bytes. For cases where overflow is likely, the compact/no type names display modes may be used. 
Signed-off-by: Alan Maguire --- Documentation/core-api/printk-formats.rst | 17 ++ include/linux/btf.h | 3 +- include/linux/printk.h| 16 + lib/vsprintf.c| 98 +++ 4 files changed, 133 insertions(+), 1 deletion(-) diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst index 8c9aba2..8f255d0 100644 --- a/Documentation/core-api/printk-formats.rst +++ b/Documentation/core-api/printk-formats.rst @@ -563,6 +563,23 @@ For printing netdev_features_t. Passed by reference. +BTF-based printing of pointer data +-- +If '%pT' is specified, use the struct btf_ptr * along with kernel vmlinux +BPF Type Format (BTF) to show the typed data. For example, specifying + + printk(KERN_INFO "%pT", BTF_PTR_TYPE(skb, struct sk_buff)); + +will utilize BTF information to traverse the struct sk_buff * and display it. + +Supported modifiers are + 'c' compact output (no indentation, newlines etc) + 'N' do not show type names + 'u' unsafe printing; probe_kernel_read() is not used to copy data safely + before use + 'x' show raw pointers (no obfuscation) + '0' show zero-valued data (it is not shown by default) + Thanks == diff --git a/include/linux/btf.h b/include/linux/btf.h index a8a4563..e8dbf0c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -172,10 +172,11 @@ static inline const struct btf_member *btf_type_member(const struct btf_type *t) return (const struct btf_member *)(t + 1); } +struct btf *btf_parse_vmlinux(void); + #ifdef CONFIG_BPF_SYSCALL const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); const char *btf_name_by_offset(const struct btf *btf, u32 offset); -struct btf *btf_parse_vmlinux(void); struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); #else static inline const struct btf_type *btf_type_by_id(const struct btf *btf, diff --git a/include/linux/printk.h b/include/linux/printk.h index fc8f03c..8f8f5d2 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -618,4 +618,20 @@
static inline void print_hex_dump_debug(const char *prefix_str, int prefix_type, #define print_hex_dump_bytes(prefix_str, prefix_type, buf, len)\ print_hex_dump_debug(prefix_str, prefix_type, 16, 1, buf, len, true) +/** + * struct btf_ptr is used for %pT (typed pointer) display; the + * additional type string/BTF id are used to render the pointer + * data as the appropriate type. + */ +struct btf_ptr { + void *ptr; + const char *type; + u32 id; +}; + +#defineBTF_PTR_TYPE(ptrval, typeval) \ + (&((struct btf_ptr){.ptr = ptrval, .type = #typeval})) + +#define BTF_PTR_ID(ptrval, idval) \ + (&((struct btf_ptr){.ptr = ptrval, .id = idval})) #endif diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 259e558..c0d209d 100644 --- a/lib/vsprintf.c +++
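The BTF_PTR_TYPE() convenience macro in the patch above leans on a C99 compound literal to build the struct btf_ptr inline in the argument list, and on stringizing to capture the type name. A standalone sketch of the same trick (struct and macro renamed so as not to imply this is the kernel header):

```c
#include <string.h>

/* Renamed copies for illustration; the real definitions live in
 * include/linux/printk.h in the patch above. */
struct btf_ptr_sketch {
	void *ptr;
	const char *type;
	unsigned int id;	/* left 0 when the type string is used */
};

/* Compound literal + stringizing: the type name is captured as a
 * string, so no BTF id lookup is needed at the call site. */
#define BTF_PTR_TYPE_SKETCH(ptrval, typeval) \
	(&((struct btf_ptr_sketch){ .ptr = (ptrval), .type = #typeval }))
```

The compound literal has automatic storage duration, so the resulting pointer is only valid within the enclosing block, which is fine for use as a printk() argument.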
[PATCH v3 bpf-next 6/8] printk: extend test_printf to test %pT BTF-based format specifier
Add tests to verify basic type display and to iterate through all enums, structs, unions and typedefs ensuring expected behaviour occurs. Since test_printf can be built as a module we need to export a BTF kind iterator function to allow us to iterate over all names of a particular BTF kind. These changes add up to approximately 20,000 new tests covering all enum, struct, union and typedefs in vmlinux BTF. Individual tests are also added for int, char, struct, enum and typedefs which verify output is as expected. Signed-off-by: Alan Maguire --- include/linux/btf.h | 3 + kernel/bpf/btf.c| 33 ++ lib/test_printf.c | 316 3 files changed, 352 insertions(+) diff --git a/include/linux/btf.h b/include/linux/btf.h index e8dbf0c..e3102a7 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -191,4 +191,7 @@ static inline const char *btf_name_by_offset(const struct btf *btf, } #endif +/* Following function used for testing BTF-based printk-family support */ +const char *btf_vmlinux_next_type_name(u8 kind, s32 *id); + #endif diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index c82cb18..4e250cd 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5459,3 +5459,36 @@ u32 btf_id(const struct btf *btf) { return btf->id; } + +/* + * btf_vmlinux_next_type_name(): used in test_printf.c to + * iterate over types for testing. + * Exported as test_printf can be built as a module. + * + * @kind: BTF_KIND_* value + * @id: pointer to last id; value/result argument. When next + * type name is found, we set *id to associated id. + * Returns: + * Next type name, sets *id to associated id. 
+ */ +const char *btf_vmlinux_next_type_name(u8 kind, s32 *id) +{ + const struct btf *btf = bpf_get_btf_vmlinux(); + const struct btf_type *t; + const char *name; + + if (!btf || !id) + return NULL; + + for ((*id)++; *id <= btf->nr_types; (*id)++) { + t = btf->types[*id]; + if (BTF_INFO_KIND(t->info) != kind) + continue; + name = btf_name_by_offset(btf, t->name_off); + if (name && strlen(name) > 0) + return name; + } + + return NULL; +} +EXPORT_SYMBOL_GPL(btf_vmlinux_next_type_name); diff --git a/lib/test_printf.c b/lib/test_printf.c index 7ac87f1..7ce7387 100644 --- a/lib/test_printf.c +++ b/lib/test_printf.c @@ -23,6 +23,9 @@ #include #include +#include +#include +#include #include "../tools/testing/selftests/kselftest_module.h" @@ -669,6 +672,318 @@ static void __init fwnode_pointer(void) #endif } +#define__TEST_BTF(fmt, type, ptr, expected) \ + test(expected, "%pT"fmt, ptr) + +#define TEST_BTF_C(type, var, ...)\ + do { \ + type var = __VA_ARGS__;\ + struct btf_ptr *ptr = BTF_PTR_TYPE(&var, type);\ + pr_debug("type %s: %pTc", #type, ptr); \ + __TEST_BTF("c", type, ptr, "(" #type ")" #__VA_ARGS__);\ + } while (0) + +#define TEST_BTF(fmt, type, var, expected, ...) \ + do { \ + type var = __VA_ARGS__;\ + struct btf_ptr *ptr = BTF_PTR_TYPE(&var, type);\ + pr_debug("type %s: %pT"fmt, #type, ptr); \ + __TEST_BTF(fmt, type, ptr, expected); \ + } while (0) + +#defineBTF_MAX_DATA_SIZE 65536 + +static void __init +btf_print_kind(u8 kind, const char *kind_name, u64 fillval) +{ + const char *fmt1 = "%pT", *fmt2 = "%pTN", *fmt3 = "%pT0"; + const char *name, *fmt = fmt1; + int i, res1, res2, res3, res4; + char type_name[256]; + char *buf, *buf2; + u8 *dummy_data; + s32 id = 0; + + dummy_data = kzalloc(BTF_MAX_DATA_SIZE, GFP_KERNEL); + + /* fill our dummy data with supplied fillval. 
*/ + for (i = 0; i < BTF_MAX_DATA_SIZE; i++) + dummy_data[i] = fillval; + + buf = kzalloc(BTF_MAX_DATA_SIZE, GFP_KERNEL); + buf2 = kzalloc(BTF_MAX_DATA_SIZE, GFP_KERNEL); + + for (;;) { + name = btf_vmlinux_next_type_name(kind, &id); + if (!name) + break; + + total_tests++; + + snprintf(type_name, sizeof(type_name), "%s%s", +kind_name, name); + +
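btf_vmlinux_next_type_name() above uses *id as a value/result cursor so the caller can resume iteration where the previous call left off. The same pattern in miniature, over a made-up type table (kind numbers and names are invented for the sketch):

```c
#include <stddef.h>
#include <string.h>

/* A stand-in for the vmlinux type array; id 0 is reserved, mirroring
 * BTF's "void" type at id 0. Kind values here are arbitrary. */
struct sketch_type {
	int kind;
	const char *name;
};

static const struct sketch_type types[] = {
	{ 0, NULL },
	{ 4, "sk_buff" },	/* kind 4 plays BTF_KIND_STRUCT here */
	{ 6, "gfp_t" },
	{ 4, "net_device" },
};

/* Value/result iterator: *id carries the cursor between calls. */
static const char *next_type_name(int kind, int *id)
{
	size_t nr = sizeof(types) / sizeof(types[0]);

	for ((*id)++; (size_t)*id < nr; (*id)++) {
		if (types[*id].kind != kind)
			continue;
		if (types[*id].name && strlen(types[*id].name) > 0)
			return types[*id].name;
	}
	return NULL;
}
```

Starting from id 0 and calling repeatedly with the same kind walks every matching named type exactly once, which is what lets test_printf iterate all structs, unions, enums and typedefs.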
[PATCH v3 bpf-next 7/8] bpf: add support for %pT format specifier for bpf_trace_printk() helper
Allow %pT[cNx0] format specifier for BTF-based display of data associated with pointer. The unsafe data modifier 'u' - where the source data is traversed without copying it to a safe buffer via probe_kernel_read() - is not supported. Signed-off-by: Alan Maguire --- include/uapi/linux/bpf.h | 27 ++- kernel/trace/bpf_trace.c | 24 +++- tools/include/uapi/linux/bpf.h | 27 ++- 3 files changed, 67 insertions(+), 11 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 1968481..ea4fbf3 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -702,7 +702,12 @@ struct bpf_stack_build_id { * to file *\/sys/kernel/debug/tracing/trace* from DebugFS, if * available. It can take up to three additional **u64** * arguments (as an eBPF helpers, the total number of arguments is - * limited to five). + * limited to five), and also supports %pT (BTF-based type + * printing), as long as BPF_READ lockdown is not active. + * "%pT" takes a "struct __btf_ptr *" as an argument; it + * consists of a pointer value and specified BTF type string or id + * used to select the type for display. For more details, see + * Documentation/core-api/printk-formats.rst. * * Each time the helper is called, it appends a line to the trace. * Lines are discarded while *\/sys/kernel/debug/tracing/trace* is @@ -738,10 +743,10 @@ struct bpf_stack_build_id { * The conversion specifiers supported by *fmt* are similar, but * more limited than for printk(). They are **%d**, **%i**, * **%u**, **%x**, **%ld**, **%li**, **%lu**, **%lx**, **%lld**, - * **%lli**, **%llu**, **%llx**, **%p**, **%s**. No modifier (size - * of field, padding with zeroes, etc.) is available, and the - * helper will return **-EINVAL** (but print nothing) if it - * encounters an unknown specifier. + * **%lli**, **%llu**, **%llx**, **%p**, **%pT[cNx0]**, **%s**. + * Only %pT supports modifiers, and the helper will return + * **-EINVAL** (but print nothing) if it encounters an unknown + * specifier.
* * Also, note that **bpf_trace_printk**\ () is slow, and should * only be used for debugging purposes. For this reason, a notice @@ -4260,4 +4265,16 @@ struct bpf_pidns_info { __u32 pid; __u32 tgid; }; + +/* + * struct __btf_ptr is used for %pT (typed pointer) display; the + * additional type string/BTF id are used to render the pointer + * data as the appropriate type. + */ +struct __btf_ptr { + void *ptr; + const char *type; + __u32 id; +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index e729c9e5..33ddb31 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -374,9 +374,13 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +/* Unsafe BTF display ('u' modifier) is absent here. */ +#define is_btf_safe_modifier(c)\ + (c == 'c' || c == 'N' || c == 'x' || c == '0') + /* * Only limited trace_printk() conversion specifiers allowed: - * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pks %pus %s + * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pks %pus %s %pT */ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1, u64, arg2, u64, arg3) @@ -412,6 +416,24 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, i++; } else if (fmt[i] == 'p') { mod[fmt_cnt]++; + + /* +* allow BTF type-based printing, but disallow unsafe +* mode - this ensures the data is copied safely +* using probe_kernel_read() prior to traversing it. +*/ + if (fmt[i + 1] == 'T') { + int ret; + + ret = security_locked_down(LOCKDOWN_BPF_READ); + if (unlikely(ret < 0)) + return ret; + i += 2; + while (is_btf_safe_modifier(fmt[i])) + i++; + goto fmt_next; + } + if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') && fmt[i + 2] == 's') { diff --git a/tools/inc
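The specifier scan added to bpf_trace_printk() above — consume 'T', then any run of safe modifiers — is easy to exercise in isolation. A userspace sketch of just that scanning step (not the kernel function itself):

```c
#include <stdbool.h>

static bool is_btf_safe_modifier(char c)
{
	/* 'u' (unsafe) is deliberately absent, as in the patch above */
	return c == 'c' || c == 'N' || c == 'x' || c == '0';
}

/* Given fmt positioned at the 'p' of a conversion, return how many
 * characters a %pT specifier occupies, or 0 if this is not %pT. */
static int scan_pT_spec(const char *fmt)
{
	int i = 0;

	if (fmt[i] != 'p' || fmt[i + 1] != 'T')
		return 0;
	i += 2;
	while (is_btf_safe_modifier(fmt[i]))
		i++;
	return i;
}
```

Because 'u' is not in the safe set, a "%pTu" format stops scanning after the 'T'; in the kernel code the leftover character then falls through to the generic modifier handling and is rejected.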
[PATCH v3 bpf-next 0/8] bpf, printk: add BTF-based type printing
oed. Also tried to comment safe object scheme used. (Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8) Changes since v1: - changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifer so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. - removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. 
- added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Important note: if running test_printf.ko - the version in the bpf-next tree will induce a panic when running the fwnode_pointer() tests due to a kobject issue; applying the patch in https://lkml.org/lkml/2020/4/17/389 ...resolved this issue for me. Alan Maguire (8): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings checkpatch: add new BTF pointer format specifier printk: add type-printing %pT format specifier which uses BTF printk: initialize vmlinux BTF outside of printk in late_initcall() printk: extend test_printf to test %pT BTF-based format specifier bpf: add support for %pT format specifier for bpf_trace_printk() helper bpf/selftests: add tests for %pT format specifier Documentation/core-api/printk-formats.rst | 17 + include/linux/bpf.h| 2 + include/linux/btf.h| 42 +- include/linux/printk.h | 16 + include/uapi/linux/bpf.h | 27 +- kernel/bpf/btf.c | 999 ++--- kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 24 +- lib/test_printf.c | 316 +++ lib/vsprintf.c | 110 +++ scripts/checkpatch.pl | 2 +- tools/include/uapi/linux/bpf.h | 27 +- .../selftests/bpf/prog_tests/trace_printk_btf.c| 45 + .../selftests/bpf/progs/netif_receive_skb.c| 47 + 14 files changed, 1570 insertions(+), 122 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c -- 1.8.3.1
[PATCH v3 bpf-next 3/8] checkpatch: add new BTF pointer format specifier
checkpatch complains about unknown format specifiers, so add the BTF format specifier we will implement in a subsequent patch to avoid errors. Signed-off-by: Alan Maguire --- scripts/checkpatch.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 4c82060..e89631e 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -6148,7 +6148,7 @@ sub process { $specifier = $1; $extension = $2; $qualifier = $3; - if ($extension !~ /[SsBKRraEehMmIiUDdgVCbGNOxtf]/ || + if ($extension !~ /[SsBKRraEehMmIiUDdgVCbGNOxtfT]/ || ($extension eq "f" && defined $qualifier && $qualifier !~ /^w/)) { $bad_specifier = $specifier; -- 1.8.3.1
[PATCH v3 bpf-next 1/8] bpf: provide function to get vmlinux BTF information
It will be used later for BTF printk() support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 07052d4..a2ecebd 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1237,6 +1237,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a1857c4..d448aa8 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -10878,6 +10878,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -10911,12 +10922,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
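bpf_get_btf_vmlinux() above is the classic check/lock/re-check lazy-initialization shape. A userspace pthread sketch of the same pattern — the "parse" here is a stand-in, and this sketch glosses over the memory-ordering subtleties of the unlocked first read, which the kernel code's usage context tolerates:

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static int parsed_obj;		/* stands in for the parsed vmlinux BTF */
static int *cached;		/* NULL until first use */
static int parse_count;		/* proves we only "parse" once */

static int *get_cached_obj(void)
{
	if (!cached) {
		pthread_mutex_lock(&init_lock);
		/* re-check: another caller may have won the race */
		if (!cached) {
			parse_count++;
			parsed_obj = 42;
			cached = &parsed_obj;
		}
		pthread_mutex_unlock(&init_lock);
	}
	return cached;
}
```

Every caller after the first takes the fast path without touching the lock, which is why the later patch can also call this from a late_initcall() to pay the parse cost outside printk context.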
Re: common KUnit Kconfig and file naming (was: Re: [PATCH] lib: kunit_test_overflow: add KUnit test of check_*_overflow functions)
On Tue, 16 Jun 2020, David Gow wrote: > CONFIG_PM_QOS_KUNIT_TESTOn Mon, Jun 15, 2020 at 1:48 AM Kees Cook > wrote: > > > > On Sat, Jun 13, 2020 at 02:51:17PM +0800, David Gow wrote: > > > Yeah, _KUNIT_TEST was what we've sort-of implicitly decided on for > > > config names, but the documentation does need to happen. > > > > That works for me. It still feels redundant, but all I really want is a > > standard name. :) > > > > > We haven't put as much thought into standardising the filenames much, > > > though. > > > > I actually find this to be much more important because it is more > > end-user-facing (i.e. in module naming, in build logs, in scripts, on > > filesystem, etc -- CONFIG is basically only present during kernel build). > > Trying to do any sorting or greping really needs a way to find all the > > kunit pieces. > > > > Certainly this is more of an issue now we support building KUnit tests > as modules, rather than having them always be built-in. > > Having some halfway consistent config-name <-> filename <-> test suite > name could be useful down the line, too. Unfortunately, not > necessarily a 1:1 mapping, e.g.: > - CONFIG_KUNIT_TEST compiles both kunit-test.c and string-stream-test.c > - kunit-test.c has several test suites within it: > kunit-try-catch-test, kunit-resource-test & kunit-log-test. > - CONFIG_EXT4_KUNIT_TESTS currently only builds ext4-inode-test.c, but > as the plural name suggests, might build others later. > - CONFIG_SECURITY_APPARMOR_KUNIT_TEST doesn't actually have its own > source file: the test is built into policy_unpack.c > - &cetera > > Indeed, this made me quickly look up the names of suites, and there > are a few inconsistencies there: > - most have "-test" as a suffix > - some have "_test" as a suffix > - some have no suffix > > (I'm inclined to say that these don't need a suffix at all.) 
> A good convention for module names - which I _think_ is along the lines of what Kees is suggesting - might be something like [<subsystem>_]<suite>_kunit.ko So for example kunit_test -> test_kunit.ko string_stream_test.ko -> test_string_stream_kunit.ko kunit_example_test -> example_kunit.ko ext4_inode_test.ko -> ext4_inode_kunit.ko For the kunit selftests, "selftest_" might be a better name than "test_", as the latter might encourage people to reintroduce a redundant "test" into their module name. > Within test suites, we're also largely prefixing all of the tests with > a suite name (even if it's not actually the specified suite name). For > example, CONFIG_PM_QOS_KUNIT_TEST builds > drivers/base/power/qos-test.c which contains a suite called > "qos-kunit-test", with tests prefixed "freq_qos_test_". Some of this > clearly comes down to wanting to namespace things a bit more > ("qos-test" as a name could refer to a few things, I imagine), but > specifying how to do so consistently could help. > Could we add some definitions to help standardize this? For example, adding a "subsystem" field to "struct kunit_suite"? So for the ext4 tests the "subsystem" would be "ext4" and the name "inode" would specify the test area within that subsystem. For the KUnit selftests, the subsystem would be "test"/"selftest". Logging could utilize the subsystem definition to allow test writers to use less redundant test names too. For example the suite name logged could be constructed from the subsystem + area values associated with the kunit_suite, and individual test names could be shown as the suite area + test_name. Thanks! Alan
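The "subsystem" field proposed above does not exist in struct kunit_suite; a minimal userspace sketch of how a logged suite name could be built from subsystem + area under that proposal (all names hypothetical):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical extension of struct kunit_suite: today it carries a
 * single .name; the proposal splits that into a subsystem plus a
 * test area within that subsystem. */
struct kunit_suite {
	const char *subsystem;	/* e.g. "ext4" */
	const char *area;	/* e.g. "inode" */
};

/* Build the "<subsystem>-<area>" name a log line could show. */
static void suite_log_name(const struct kunit_suite *s, char *buf, size_t len)
{
	snprintf(buf, len, "%s-%s", s->subsystem, s->area);
}
```

Individual test names could then be logged as area + test name, dropping the redundant subsystem repetition the thread complains about.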
Re: [PATCH] Documentation: kunit: Add some troubleshooting tips to the FAQ
On Mon, 1 Jun 2020, David Gow wrote: > Add an FAQ entry to the KUnit documentation with some tips for > troubleshooting KUnit and kunit_tool. > > These suggestions largely came from an email thread: > https://lore.kernel.org/linux-kselftest/41db8bbd-3ba0-8bde-7352-083bf4b94...@intel.com/T/#m23213d4e156db6d59b0b460a9014950f5ff6eb03 > > Signed-off-by: David Gow > --- > Documentation/dev-tools/kunit/faq.rst | 32 +++ > 1 file changed, 32 insertions(+) > > diff --git a/Documentation/dev-tools/kunit/faq.rst > b/Documentation/dev-tools/kunit/faq.rst > index ea55b2467653..40109d425988 100644 > --- a/Documentation/dev-tools/kunit/faq.rst > +++ b/Documentation/dev-tools/kunit/faq.rst > @@ -61,3 +61,35 @@ test, or an end-to-end test. >kernel by installing a production configuration of the kernel on production >hardware with a production userspace and then trying to exercise some > behavior >that depends on interactions between the hardware, the kernel, and > userspace. > + > +KUnit isn't working, what should I do? > +== > + > +Unfortunately, there are a number of things which can break, but here are > some > +things to try. > + > +1. Try running ``./tools/testing/kunit/kunit.py run`` with the > ``--raw_output`` > + parameter. This might show details or error messages hidden by the > kunit_tool > + parser. > +2. Instead of running ``kunit.py run``, try running ``kunit.py config``, > + ``kunit.py build``, and ``kunit.py exec`` independently. This can help > track > + down where an issue is occurring. (If you think the parser is at fault, > you > + can run it manually against stdin or a file with ``kunit.py parse``.) > +3. Running the UML kernel directly can often reveal issues or error messages > + kunit_tool ignores. This should be as simple as running ``./vmlinux`` > after > + building the UML kernel (e.g., by using ``kunit.py build``). 
Note that UML > + has some unusual requirements (such as the host having a tmpfs filesystem > + mounted), and has had issues in the past when built statically and the > host > + has KASLR enabled. (On older host kernels, you may need to run ``setarch > + `uname -m` -R ./vmlinux`` to disable KASLR.) > +4. Make sure the kernel .config has ``CONFIG_KUNIT=y`` and at least one test > + (e.g. ``CONFIG_KUNIT_EXAMPLE_TEST=y``). kunit_tool will keep its .config > + around, so you can see what config was used after running ``kunit.py > run``. > + It also preserves any config changes you might make, so you can > + enable/disable things with ``make ARCH=um menuconfig`` or similar, and > then > + re-run kunit_tool. > +5. Finally, running ``make ARCH=um defconfig`` before running ``kunit.py > run`` > + may help clean up any residual config items which could be causing > problems. > + Looks great! Could we add something like: 6. Try running kunit standalone (without UML). KUnit and associated tests can be built into a standard kernel or built as a module; doing so allows us to verify test behaviour independent of UML so can be useful to do if running under UML is failing. When tests are built-in they will execute on boot, and modules will automatically execute associated tests when loaded. Test results can be collected from /sys/kernel/debug/kunit/<suite>/results. For more details see "KUnit on non-UML architectures" in :doc:`usage`. Reviewed-by: Alan Maguire
[PATCH v4 kunit-next 2/2] kunit: add support for named resources
The kunit resources API allows for custom initialization and cleanup code (init/fini); here a new resource add function sets the "struct kunit_resource" "name" field, and calls the standard add function. Having a simple way to name resources is useful in cases such as multithreaded tests where a set of resources are shared among threads; a pointer to the "struct kunit *" test state then is all that is needed to retrieve and use named resources. Support is provided to add, find and destroy named resources; the latter two are simply wrappers that use a "match-by-name" callback. If an attempt to add a resource with a name that already exists is made, kunit_add_named_resource() will return -EEXIST. Signed-off-by: Alan Maguire Reviewed-by: Brendan Higgins --- include/kunit/test.h | 54 ++ lib/kunit/kunit-test.c | 37 ++ lib/kunit/test.c | 24 ++ 3 files changed, 115 insertions(+) diff --git a/include/kunit/test.h b/include/kunit/test.h index f9b914e..59f3144 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -72,9 +72,15 @@ * return kunit_alloc_resource(test, kunit_kmalloc_init, * kunit_kmalloc_free, &params); * } + * + * Resources can also be named, with lookup/removal done on a name + * basis also. kunit_add_named_resource(), kunit_find_named_resource() + * and kunit_destroy_named_resource(). Resource names must be + * unique within the test instance. */ struct kunit_resource { void *data; + const char *name; /* optional name */ /* private: internal use only. */ kunit_resource_free_t free; @@ -344,6 +350,21 @@ int kunit_add_resource(struct kunit *test, kunit_resource_free_t free, struct kunit_resource *res, void *data); + +/** + * kunit_add_named_resource() - Add a named *test managed resource*. + * @test: The test context object. + * @init: a user-supplied function to initialize the resource data, if needed. + * @free: a user-supplied function to free the resource data, if needed. + * @res: The resource. + * @name: name to be set for resource. + * @data: data to be set for resource. 
+ */ +int kunit_add_named_resource(struct kunit *test, +kunit_resource_init_t init, +kunit_resource_free_t free, +struct kunit_resource *res, +const char *name, +void *data); + /** * kunit_alloc_resource() - Allocates a *test managed resource*. * @test: The test context object. @@ -399,6 +420,19 @@ static inline bool kunit_resource_instance_match(struct kunit *test, } /** + * kunit_resource_name_match() - Match a resource with the same name. + * @test: Test case to which the resource belongs. + * @res: The resource. + * @match_name: The name to match against. + */ +static inline bool kunit_resource_name_match(struct kunit *test, +struct kunit_resource *res, +void *match_name) +{ + return res->name && strcmp(res->name, match_name) == 0; +} + +/** * kunit_find_resource() - Find a resource using match function/data. * @test: Test case to which the resource belongs. * @match: match function to be applied to resources/match data. @@ -427,6 +461,19 @@ static inline bool kunit_resource_instance_match(struct kunit *test, } /** + * kunit_find_named_resource() - Find a resource using match name. + * @test: Test case to which the resource belongs. + * @name: match name. + */ +static inline struct kunit_resource * +kunit_find_named_resource(struct kunit *test, + const char *name) +{ + return kunit_find_resource(test, kunit_resource_name_match, + (void *)name); +} + +/** * kunit_destroy_resource() - Find a kunit_resource and destroy it. * @test: Test case to which the resource belongs. * @match: Match function. Returns whether a given resource matches @match_data. @@ -439,6 +486,13 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match, void *match_data); +static inline int kunit_destroy_named_resource(struct kunit *test, + const char *name) +{ + return kunit_destroy_resource(test, kunit_resource_name_match, + (void *)name); +} + /** * kunit_remove_resource: remove resource from resource list associated with * test. 
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c index 03f3eca..69f9024 100644 --- a/lib/kunit/kunit-test.c +++ b/lib/kunit/kunit-test.c @@ -325,6 +325,42 @@ static void kunit_resource_test_static(struct kunit *test)
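The named lookup and destroy helpers in the patch above are thin wrappers over a match-by-name callback. The matching logic in isolation, as a userspace mock (singly linked list instead of the kernel's list_head, no refcounting):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct kunit_resource. */
struct resource {
	const char *name;	/* NULL for unnamed resources */
	void *data;
	struct resource *next;
};

/* Mirrors kunit_resource_name_match(): unnamed resources never match. */
static int resource_name_match(const struct resource *res, const char *name)
{
	return res->name && strcmp(res->name, name) == 0;
}

/* Mirrors kunit_find_named_resource() over the mock list. */
static struct resource *find_named(struct resource *head, const char *name)
{
	struct resource *res;

	for (res = head; res; res = res->next)
		if (resource_name_match(res, name))
			return res;
	return NULL;
}
```

The NULL-name guard is what lets named and unnamed resources share one list: unnamed entries are simply invisible to named lookup.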
[PATCH v4 kunit-next 1/2] kunit: generalize kunit_resource API beyond allocated resources
In its original form, the kunit resources API - consisting of the struct kunit_resource and associated functions - was focused on adding allocated resources during test operation that would be automatically cleaned up on test completion. The recent RFC patch proposing converting KASAN tests to KUnit [1] showed another potential model - where outside of test context, but with a pointer to the test state, we wish to access/update test-related data, but expressly want to avoid allocations. It turns out we can generalize the kunit_resource to support static resources where the struct kunit_resource * is passed in and initialized for us. As part of this work, we also change the "allocation" field to the more general "data" name, as instead of associating an allocation, we can associate a pointer to static data. Static data is distinguished by a NULL free function. A test is added to cover using kunit_add_resource() with a static resource and data. Finally we also make use of the kernel's kref interfaces to manage reference counting of KUnit resources. The motivation for this is simple; if we have kernel threads accessing and using resources (say via kunit_find_resource()) we need to ensure we do not remove said resources (or indeed free them if they were dynamically allocated) until the reference count reaches zero. A new function - kunit_put_resource() - is added to handle this, and it should be called after a thread using kunit_find_resource() is finished with the retrieved resource. We ensure that the functions needed to look up, use and drop reference count are "static inline"-defined so that they can be used by builtin code as well as modules in the case that KUnit is built as a module. A cosmetic change here also: I've tried moving to kunit_[action]_resource() as the format of function names for consistency and readability. 
[1] https://lkml.org/lkml/2020/2/26/1286 Signed-off-by: Alan Maguire Reviewed-by: Brendan Higgins --- include/kunit/test.h | 156 +- lib/kunit/kunit-test.c| 74 -- lib/kunit/string-stream.c | 14 ++--- lib/kunit/test.c | 153 - 4 files changed, 268 insertions(+), 129 deletions(-) diff --git a/include/kunit/test.h b/include/kunit/test.h index 47e61e1..f9b914e 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -15,6 +15,7 @@ #include #include #include +#include struct kunit_resource; @@ -23,13 +24,19 @@ /** * struct kunit_resource - represents a *test managed resource* - * @allocation: for the user to store arbitrary data. + * @data: for the user to store arbitrary data. * @free: a user supplied function to free the resource. Populated by - * kunit_alloc_resource(). + * kunit_resource_alloc(). * * Represents a *test managed resource*, a resource which will automatically be * cleaned up at the end of a test case. * + * Resources are reference counted so if a resource is retrieved via + * kunit_alloc_and_get_resource() or kunit_find_resource(), we need + * to call kunit_put_resource() to reduce the resource reference count + * when finished with it. Note that kunit_alloc_resource() does not require a + * kunit_resource_put() because it does not retrieve the resource itself. + * * Example: * * .. 
code-block:: c @@ -42,9 +49,9 @@ * static int kunit_kmalloc_init(struct kunit_resource *res, void *context) * { * struct kunit_kmalloc_params *params = context; - * res->allocation = kmalloc(params->size, params->gfp); + * res->data = kmalloc(params->size, params->gfp); * - * if (!res->allocation) + * if (!res->data) * return -ENOMEM; * * return 0; @@ -52,30 +59,26 @@ * * static void kunit_kmalloc_free(struct kunit_resource *res) * { - * kfree(res->allocation); + * kfree(res->data); * } * * void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp) * { * struct kunit_kmalloc_params params; - * struct kunit_resource *res; * * params.size = size; * params.gfp = gfp; * - * res = kunit_alloc_resource(test, kunit_kmalloc_init, + * return kunit_alloc_resource(test, kunit_kmalloc_init, * kunit_kmalloc_free, &params); - * if (res) - * return res->allocation; - * - * return NULL; * } */ struct kunit_resource { - void *allocation; - kunit_resource_free_t free; + void *data; /* private: internal use only. */ + kunit_resource_free_t free; + struct kref refcount; struct list_head node; }; @@ -284,6 +287,64 @@ struct kunit_resource *kunit_alloc_and_get_resource(struct kunit
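The get/put lifecycle the commit message describes (a resource found via kunit_find_resource() must be released with kunit_put_resource(), and is only freed when the last reference drops) can be sketched in userspace with a bare counter standing in for struct kref:

```c
#include <assert.h>

/* Userspace stand-in for a kref-counted resource: a plain counter,
 * no atomics, plus a flag recording whether free ran. */
struct resource {
	int refcount;
	int freed;
};

static void resource_free(struct resource *res)
{
	res->freed = 1;	/* a real implementation would kfree() here */
}

/* Mirrors the kunit get/put semantics: each lookup takes a reference,
 * each put drops one, and the free callback fires only at zero. */
static void get_resource(struct resource *res)
{
	res->refcount++;
}

static void put_resource(struct resource *res)
{
	if (--res->refcount == 0)
		resource_free(res);
}
```

This is why kunit_find_resource() callers in another thread are safe: the resource cannot be freed out from under them while they hold their reference.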
[PATCH v4 kunit-next 0/2] kunit: extend kunit resources API
A recent RFC patch set [1] suggests some additional functionality may be needed around kunit resources. It seems to require 1. support for resources without allocation 2. support for lookup of such resources 3. support for access to resources across multiple kernel threads The proposed changes here are designed to address these needs. The idea is we first generalize the API to support adding resources with static data; then from there we support named resources. The latter support is needed because if we are in a different thread context and only have the "struct kunit *" to work with, we need a way to identify a resource in lookup. [1] https://lkml.org/lkml/2020/2/26/1286 Changes since v3: - removed unused "init" field from "struct kunit_resources" (Brendan) Changes since v2: - moved a few functions relating to resource retrieval in patches 1 and 2 into include/kunit/test.h and defined as "static inline"; this allows built-in consumers to use these functions when KUnit is built as a module Changes since v1: - reformatted longer parameter lists to have one parameter per-line (Brendan, patch 1) - fixed phrasing in various comments to clarify allocation of memory and added comment to kunit resource tests to clarify why kunit_put_resource() is used there (Brendan, patch 1) - changed #define to static inline function (Brendan, patch 2) - simplified kunit_add_named_resource() to use more of existing code for non-named resource (Brendan, patch 2) Alan Maguire (2): kunit: generalize kunit_resource API beyond allocated resources kunit: add support for named resources include/kunit/test.h | 210 +++--- lib/kunit/kunit-test.c| 111 +++- lib/kunit/string-stream.c | 14 ++-- lib/kunit/test.c | 171 ++--- 4 files changed, 380 insertions(+), 126 deletions(-) -- 1.8.3.1
Re: [PATCH v3 3/7] kunit: tests for stats_fs API
On Tue, 26 May 2020, Emanuele Giuseppe Esposito wrote: > Add kunit tests to extensively test the stats_fs API functionality. > I've added in the kunit-related folks. > In order to run them, the kernel .config must set CONFIG_KUNIT=y > and a new .kunitconfig file must be created with CONFIG_STATS_FS=y > and CONFIG_STATS_FS_TEST=y > It looks like CONFIG_STATS_FS is built-in, but it exports much of the functionality you are testing. However could the tests also be built as a module (i.e. make CONFIG_STATS_FS_TEST a tristate variable)? To test this you'd need to specify CONFIG_KUNIT=m and CONFIG_STATS_FS_TEST=m, and testing would simply be a case of "modprobe"ing the stats fs module and collecting results in /sys/kernel/debug/kunit/ (rather than running kunit.py). Are you relying on unexported internals in the tests that would prevent building them as a module? Thanks! Alan
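The module-based flow suggested above would correspond to a config fragment along these lines (CONFIG names taken from the patch under review; the tristate change to CONFIG_STATS_FS_TEST is the part being requested, not something the posted patch already supports):

```
CONFIG_KUNIT=m
CONFIG_STATS_FS=y
CONFIG_STATS_FS_TEST=m
```

With that in place, loading the test module would run the suite and its results would appear under the debugfs kunit directory rather than in kunit.py output.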
Re: [PATCH v2 bpf-next 2/7] bpf: move to generic BTF show support, apply it to seq files/strings
On Wed, 13 May 2020, Yonghong Song wrote: > > > +struct btf_show { > > + u64 flags; > > + void *target; /* target of show operation (seq file, buffer) */ > > + void (*showfn)(struct btf_show *show, const char *fmt, ...); > > + const struct btf *btf; > > + /* below are used during iteration */ > > + struct { > > + u8 depth; > > + u8 depth_shown; > > + u8 depth_check; > > I have some difficulties to understand the relationship between > the above three variables. Could you add some comments here? > Will do; sorry the code got a bit confusing. The goal is to track which sub-components in a data structure we need to display. The "depth" variable tracks where we are currently; "depth_shown" is the depth at which we have something nonzero to display (perhaps "depth_to_show" would be a better name?). "depth_check" tells us whether we are currently checking depth or doing printing. If we're checking, we don't actually print anything, we merely note if we hit a non-zero value, and if so, we set "depth_shown" to the depth at which we hit that value. When we show a struct, union or array, we will only display an object if it has one or more non-zero members. But because the struct can in turn nest a struct or array etc, we need to recurse into the object. When we are doing that, depth_check is set, and this tells us not to do any actual display. When that recursion is complete, we check if "depth_shown" (depth to show) is > depth (i.e. we found something) and if it is we go on to display the object (setting depth_check to 0). There may be a better way to solve this problem of course, but I wanted to avoid storing values where possible as deeply-nested data structures might overrun such storage. 
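The check-then-show scheme described above can be sketched as a toy userspace model (not the kernel code; one nesting level, names simplified): a first "check" pass only records whether anything nonzero exists below the current depth, and a second pass does the actual printing only if it does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct show_state {
	int depth_check;	/* 1: probing only, emit no output */
	int depth_to_show;	/* depth at which a nonzero value was found */
	int depth;
	char out[256];
};

static void show_int(struct show_state *s, const char *name, int v)
{
	/* check pass: just record that something nonzero exists here */
	if (v && s->depth_check && s->depth_to_show < s->depth)
		s->depth_to_show = s->depth;
	/* show pass: emit nonzero members only */
	if (!s->depth_check && v) {
		char tmp[64];

		snprintf(tmp, sizeof(tmp), "%s=%d;", name, v);
		strcat(s->out, tmp);
	}
}

/* Show a two-member "struct" only if at least one member is nonzero. */
static void show_pair(struct show_state *s, const char *name, int a, int b)
{
	s->depth++;
	/* pass 1: probe members below this depth without emitting */
	s->depth_check = 1;
	s->depth_to_show = 0;
	show_int(s, "a", a);
	show_int(s, "b", b);
	/* pass 2: emit only if the probe found something to show */
	s->depth_check = 0;
	if (s->depth_to_show >= s->depth) {
		strcat(s->out, name);
		strcat(s->out, "{");
		show_int(s, "a", a);
		show_int(s, "b", b);
		strcat(s->out, "}");
	}
	s->depth--;
}
```

An all-zero object produces no output at all, which is exactly the zero-suppression behaviour the depth_check machinery exists to provide.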
> > + u8 array_member:1, > > + array_terminated:1; > > + u16 array_encoding; > > + u32 type_id; > > + const struct btf_type *type; > > + const struct btf_member *member; > > + char name[KSYM_NAME_LEN]; /* scratch space for name */ > > + char type_name[KSYM_NAME_LEN]; /* scratch space for type */ > > KSYM_NAME_LEN is for symbol name, not for type name. But I guess in kernel we > probably do not have > 128 bytes type name so we should be > okay here. > Yeah, I couldn't find a good length to use here. We eliminate qualifiers such as "const" in the display, so it's unlikely we'd overrun. > > + } state; > > +}; > > + > > struct btf_kind_operations { > >s32 (*check_meta)(struct btf_verifier_env *env, > > const struct btf_type *t, > > @@ -297,9 +323,9 @@ struct btf_kind_operations { > > const struct btf_type *member_type); > >void (*log_details)(struct btf_verifier_env *env, > > const struct btf_type *t); > > - void (*seq_show)(const struct btf *btf, const struct btf_type *t, > > + void (*show)(const struct btf *btf, const struct btf_type *t, > > u32 type_id, void *data, u8 bits_offsets, > > -struct seq_file *m); > > +struct btf_show *show); > > }; > > > > static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS]; > > @@ -676,6 +702,340 @@ bool btf_member_is_reg_int(const struct btf *btf, > > const struct btf_type *s, > > return true; > > } > > > > +/* Similar to btf_type_skip_modifiers() but does not skip typedefs. 
*/ > > +static inline > > +const struct btf_type *btf_type_skip_qualifiers(const struct btf *btf, u32 > > id) > > +{ > > + const struct btf_type *t = btf_type_by_id(btf, id); > > + > > + while (btf_type_is_modifier(t) && > > + BTF_INFO_KIND(t->info) != BTF_KIND_TYPEDEF) { > > + id = t->type; > > + t = btf_type_by_id(btf, t->type); > > + } > > + > > + return t; > > +} > > + > > +#define BTF_SHOW_MAX_ITER 10 > > + > > +#define BTF_KIND_BIT(kind) (1ULL << kind) > > + > > +static inline const char *btf_show_type_name(struct btf_show *show, > > +const struct btf_type *t) > > +{ > > + const char *array_suffixes = "[][][][][][][][][][]"; > > Add a comment here saying length BTF_SHOW_MAX_ITER * 2 > so later on if somebody changes the BTF_SHOW_MAX_ITER from 10 to 12, > it won't miss here? > > > + const char *array_suffix = &array_suffixes[strlen(array_suffixes)]; > > + const char *ptr_suffixes = "**"; > > The same here. > Good idea; will do. > > + const char *ptr_suffix = &ptr_suffixes[strlen(ptr_suffixes)]; > > + const char *type_name = NULL, *prefix = "", *parens = ""; > > + const struct btf_array *array; > > + u32 id = show->state.type_id; > > + bool allow_anon = true; > > + u64 kinds = 0; > > + int i; > > + > > + show->state.type_name[0] = '\0'; > > + > > + /* > > +* Start with type_id, as we have have resolved the struct btf_type * > > +* via btf_modifier_show() past the parent typedef to the c
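The suffix handling in the quoted code avoids any scratch buffer by walking a pointer backwards through a fixed "[][]..." string, one step per array dimension. The trick in isolation (userspace sketch; the cap mirrors BTF_SHOW_MAX_ITER):

```c
#include <assert.h>
#include <string.h>

#define MAX_DIMS 10	/* mirrors BTF_SHOW_MAX_ITER */

/* Return "", "[]", "[][]", ... without building a string: each array
 * dimension moves the suffix pointer left by 2 from the end of a
 * fixed MAX_DIMS * 2 character string. */
static const char *array_suffix(int ndims)
{
	static const char suffixes[] = "[][][][][][][][][][]"; /* MAX_DIMS * 2 */
	const char *suffix = &suffixes[strlen(suffixes)];
	int i;

	for (i = 0; i < ndims && i < MAX_DIMS; i++)
		suffix -= 2;
	return suffix;
}
```

The comment Yonghong asks for is worth having precisely because the string length and the iteration cap must stay in sync.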
Re: [PATCH v2 bpf-next 6/7] bpf: add support for %pT format specifier for bpf_trace_printk() helper
On Wed, 13 May 2020, Yonghong Song wrote: > > > + while (isbtffmt(fmt[i])) > > + i++; > > The pointer passed to the helper may not be valid pointer. I think you > need to do a probe_read_kernel() here. Do an atomic memory allocation > here should be okay as this is mostly for debugging only. > Are there other examples of doing allocations in program execution context? I'd hate to be the first to introduce one if not. I was hoping I could get away with some per-CPU scratch space. Most data structures will fit within a small per-CPU buffer, but if multiple copies are required, performance isn't the key concern. It will make traversing the buffer during display a bit more complex but I think avoiding allocation might make that complexity worth it. The other thought I had was we could carry out an allocation associated with the attach, but that's messy as it's possible run-time might determine the type for display (and thus the amount of the buffer we need to copy safely). Great news about LLVM support for __builtin_btf_type_id()! Thanks! Alan
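The per-CPU scratch idea floated above has a close userspace analogue in thread-local storage: each execution context gets its own buffer, so the print path needs no allocation at all. A sketch under that assumption (names hypothetical, not the kernel per-CPU API):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SCRATCH_LEN 256

/* Userspace stand-in for a per-CPU scratch buffer: thread-local
 * storage gives each thread its own buffer, just as per-CPU data
 * gives each CPU its own, with no allocation in the hot path. */
static _Thread_local char scratch[SCRATCH_LEN];

/* Format into the calling thread's scratch space; no malloc needed. */
static const char *format_to_scratch(const char *name, int value)
{
	snprintf(scratch, sizeof(scratch), "%s=%d", name, value);
	return scratch;
}
```

The trade-off is the one Alan notes: the buffer is reused across calls, so consumers must copy or finish with the result before the next format, which complicates traversal but avoids allocation in execution context.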