Re: [PATCH] bpf/btf: Move tracing BTF APIs to the BTF library
On 10/10/2023 14:54, Masami Hiramatsu (Google) wrote: > From: Masami Hiramatsu (Google) > > Move the BTF APIs used in tracing to the BTF library code for sharing it > with others. > Previously, to avoid complex dependency in a series I made it on the > tracing tree, but now it is a good time to move it to BPF tree because > these functions are pure BTF functions. > Makes sense to me. Two very small things - usual practice for bpf-related changes is to specify "PATCH bpf-next" for changes like this that target the -next tree. Other thing is I'm reasonably sure no functional changes are intended - it's basically just a matter of moving code from trace_btf -> btf - but would be good to confirm that no functional changes are intended or similar in the commit message. It's sort of implicit when you say "move the BTF APIs", but would be good to confirm. > Signed-off-by: Masami Hiramatsu (Google) Reviewed-by: Alan Maguire > --- > include/linux/btf.h| 24 + > kernel/bpf/btf.c | 115 + > kernel/trace/Makefile |1 > kernel/trace/trace_btf.c | 122 > > kernel/trace/trace_btf.h | 11 > kernel/trace/trace_probe.c |2 - > 6 files changed, 140 insertions(+), 135 deletions(-) > delete mode 100644 kernel/trace/trace_btf.c > delete mode 100644 kernel/trace/trace_btf.h > > diff --git a/include/linux/btf.h b/include/linux/btf.h > index 928113a80a95..8372d93ea402 100644 > --- a/include/linux/btf.h > +++ b/include/linux/btf.h > @@ -507,6 +507,14 @@ btf_get_prog_ctx_type(struct bpf_verifier_log *log, > const struct btf *btf, > int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type > prog_type); > bool btf_types_are_same(const struct btf *btf1, u32 id1, > const struct btf *btf2, u32 id2); > +const struct btf_type *btf_find_func_proto(const char *func_name, > +struct btf **btf_p); > +const struct btf_param *btf_get_func_param(const struct btf_type *func_proto, > +s32 *nr); > +const struct btf_member *btf_find_struct_member(struct btf *btf, > + const struct btf_type *type, > + 
const char *member_name, > + u32 *anon_offset); > #else > static inline const struct btf_type *btf_type_by_id(const struct btf *btf, > u32 type_id) > @@ -559,6 +567,22 @@ static inline bool btf_types_are_same(const struct btf > *btf1, u32 id1, > { > return false; > } > +static inline const struct btf_type *btf_find_func_proto(const char > *func_name, > + struct btf **btf_p) > +{ > + return NULL; > +} > +static inline const struct btf_param * > +btf_get_func_param(const struct btf_type *func_proto, s32 *nr) > +{ > + return NULL; > +} > +static inline const struct btf_member * > +btf_find_struct_member(struct btf *btf, const struct btf_type *type, > +const char *member_name, u32 *anon_offset) > +{ > + return NULL; > +} > #endif > > static inline bool btf_type_is_struct_ptr(struct btf *btf, const struct > btf_type *t) > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > index 8090d7fb11ef..e5cbf3b31b78 100644 > --- a/kernel/bpf/btf.c > +++ b/kernel/bpf/btf.c > @@ -912,6 +912,121 @@ static const struct btf_type > *btf_type_skip_qualifiers(const struct btf *btf, > return t; > } > > +/* > + * Find a function proto type by name, and return the btf_type with its btf > + * in *@btf_p. Return NULL if not found. > + * Note that caller has to call btf_put(*@btf_p) after using the btf_type. > + */ > +const struct btf_type *btf_find_func_proto(const char *func_name, struct btf > **btf_p) > +{ > + const struct btf_type *t; > + s32 id; > + > + id = bpf_find_btf_id(func_name, BTF_KIND_FUNC, btf_p); > + if (id < 0) > + return NULL; > + > + /* Get BTF_KIND_FUNC type */ > + t = btf_type_by_id(*btf_p, id); > + if (!t || !btf_type_is_func(t)) > + goto err; > + > + /* The type of BTF_KIND_FUNC is BTF_KIND_FUNC_PROTO */ > + t = btf_type_by_id(*btf_p, t->type); > + if (!t || !btf_type_is_func_proto(t)) > + goto err; > + > + return t; > +err: > + btf_put(*btf_p); > + return NULL; > +} > + > +/* > + * Get function parameter with the number of parameters. &g
Re: [PATCH 0/4] tracing: improve symbolic printing
On 04/10/2023 22:43, Steven Rostedt wrote: > On Wed, 4 Oct 2023 22:35:07 +0100 > Alan Maguire wrote: > >> One thing we've heard from some embedded folks [1] is that having >> kernel BTF loadable as a separate module (rather than embedded in >> vmlinux) would help, as there are size limits on vmlinux that they can >> workaround by having modules on a different partition. We're hoping >> to get that working soon. I was wondering if you see other issues around >> BTF adoption for embedded systems that we could put on the to-do list? >> Not necessarily for this particular use-case (since there are >> complications with trace data as you describe), but just trying to make >> sure we can remove barriers to BTF adoption where possible. > > I wonder how easy is it to create subsets of BTF. For one thing, in the > future we want to be able to trace the arguments of all functions. That is, > tracing all functions at the same time (function tracer) and getting the > arguments within the trace. > > This would only require information about functions and their arguments, > which would be very useful. Is BTF easy to break apart? That is, just > generate the information needed for function arguments? > There has been a fair bit of effort around this from the userspace side; the BTF gen efforts were focused around applications carrying the minimum BTF for their needs, so just the structures needed by the particular BPF programs rather than the full set of vmlinux structures for example [1]. Parsing BTF in-kernel to pull out the BTF functions (BTF_KIND_FUNC), their prototypes (BTF_KIND_FUNC_PROTO) and all associated parameters would be pretty straightforward I think, especially if you don't need the structures that are passed via pointers. So if you're starting with the full BTF, creating a subset for use in tracing would be reasonably straightforward. 
My personal preference would always be to have the full BTF where possible,
but if that wasn't feasible on some systems we'd need to add some options
to pahole/libbpf to support such trimming during the DWARF->BTF translation
process.

Alan

[1] https://lore.kernel.org/bpf/20220209222646.348365-7-mauri...@kinvolk.io/

> Note, pretty much all functions do not pass structures by values, and this
> would not need to know the contents of a pointer to a structure. This would
> mean that structure layout information is not needed.
>
> -- Steve
>
Re: [PATCH 0/4] tracing: improve symbolic printing
On 04/10/2023 18:29, Steven Rostedt wrote: > On Wed, 4 Oct 2023 09:54:31 -0700 > Jakub Kicinski wrote: > >> On Wed, 4 Oct 2023 12:35:24 -0400 Steven Rostedt wrote: Potentially naive question - the trace point holds enum skb_drop_reason. The user space can get the names from BTF. Can we not teach user space to generically look up names of enums in BTF? >>> >>> That puts a hard requirement to include BTF in builds where it was not >>> needed before. I really do not want to build with BTF just to get access to >>> these symbols. And since this is used by the embedded world, and BTF is >>> extremely bloated, the short answer is "No". >> >> Dunno. BTF is there most of the time. It could make the life of >> majority of the users far more pleasant. > > BTF isn't there for a lot of developers working in embedded who use this > code. Most my users that I deal with have minimal environments, so BTF is a > showstopper. One thing we've heard from some embedded folks [1] is that having kernel BTF loadable as a separate module (rather than embedded in vmlinux) would help, as there are size limits on vmlinux that they can workaround by having modules on a different partition. We're hoping to get that working soon. I was wondering if you see other issues around BTF adoption for embedded systems that we could put on the to-do list? Not necessarily for this particular use-case (since there are complications with trace data as you describe), but just trying to make sure we can remove barriers to BTF adoption where possible. Thanks! Alan [1] https://lore.kernel.org/bpf/CAHBbfcUkr6fTm2X9GNsFNqV75fTG=abqxfx_8ayk+4hk7he...@mail.gmail.com/ > >> >> I hope we can at least agree that the current methods of generating >> the string arrays at C level are... aesthetically displeasing. > > I don't know, I kinda like it ;-) > > -- Steve >
Re: [PATCH v3 1/2] kunit: support failure from dynamic analysis tools
On Thu, 11 Feb 2021, David Gow wrote: > On Wed, Feb 10, 2021 at 6:14 AM Daniel Latypov wrote: > > > > From: Uriel Guajardo > > > > Add a kunit_fail_current_test() function to fail the currently running > > test, if any, with an error message. > > > > This is largely intended for dynamic analysis tools like UBSAN and for > > fakes. > > E.g. say I had a fake ops struct for testing and I wanted my `free` > > function to complain if it was called with an invalid argument, or > > caught a double-free. Most return void and have no normal means of > > signalling failure (e.g. super_operations, iommu_ops, etc.). > > > > Key points: > > * Always update current->kunit_test so anyone can use it. > > * commit 83c4e7a0363b ("KUnit: KASAN Integration") only updated it for > > CONFIG_KASAN=y > > > > * Create a new header so non-test code doesn't have > > to include all of (e.g. lib/ubsan.c) > > > > * Forward the file and line number to make it easier to track down > > failures > > > > * Declare the helper function for nice __printf() warnings about mismatched > > format strings even when KUnit is not enabled. 
> > > > Example output from kunit_fail_current_test("message"): > > [15:19:34] [FAILED] example_simple_test > > [15:19:34] # example_simple_test: initializing > > [15:19:34] # example_simple_test: lib/kunit/kunit-example-test.c:24: > > message > > [15:19:34] not ok 1 - example_simple_test > > > > Co-developed-by: Daniel Latypov > > Signed-off-by: Uriel Guajardo > > Signed-off-by: Daniel Latypov > > --- > > include/kunit/test-bug.h | 30 ++ > > lib/kunit/test.c | 37 + > > 2 files changed, 63 insertions(+), 4 deletions(-) > > create mode 100644 include/kunit/test-bug.h > > > > diff --git a/include/kunit/test-bug.h b/include/kunit/test-bug.h > > new file mode 100644 > > index ..18b1034ec43a > > --- /dev/null > > +++ b/include/kunit/test-bug.h > > @@ -0,0 +1,30 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +/* > > + * KUnit API allowing dynamic analysis tools to interact with KUnit tests > > + * > > + * Copyright (C) 2020, Google LLC. > > + * Author: Uriel Guajardo > > + */ > > + > > +#ifndef _KUNIT_TEST_BUG_H > > +#define _KUNIT_TEST_BUG_H > > + > > +#define kunit_fail_current_test(fmt, ...) \ > > + __kunit_fail_current_test(__FILE__, __LINE__, fmt, ##__VA_ARGS__) > > + > > +#if IS_ENABLED(CONFIG_KUNIT) > > As the kernel test robot has pointed out on the second patch, this > probably should be IS_BUILTIN(), otherwise this won't build if KUnit > is a module, and the code calling it isn't. > > This does mean that things like UBSAN integration won't work if KUnit > is a module, which is a shame. > > (It's worth noting that the KASAN integration worked around this by > only calling inline functions, which would therefore be built-in even > if the rest of KUnit was built as a module. I don't think it's quite > as convenient to do that here, though.) 
> Right, static inline'ing __kunit_fail_current_test() seems problematic because it calls other exported functions; more below > > + > > +extern __printf(3, 4) void __kunit_fail_current_test(const char *file, int > > line, > > + const char *fmt, ...); > > + > > +#else > > + > > +static __printf(3, 4) void __kunit_fail_current_test(const char *file, int > > line, > > + const char *fmt, ...) > > +{ > > +} > > + > > +#endif > > + > > + > > +#endif /* _KUNIT_TEST_BUG_H */ > > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > > index ec9494e914ef..5794059505cf 100644 > > --- a/lib/kunit/test.c > > +++ b/lib/kunit/test.c > > @@ -7,6 +7,7 @@ > > */ > > > > #include > > +#include > > #include > > #include > > #include > > @@ -16,6 +17,38 @@ > > #include "string-stream.h" > > #include "try-catch-impl.h" > > > > +/* > > + * Fail the current test and print an error message to the log. > > + */ > > +void __kunit_fail_current_test(const char *file, int line, const char > > *fmt, ...) > > +{ > > + va_list args; > > + int len; > > + char *buffer; > > + > > + if (!current->kunit_test) > > + return; > > + > > + kunit_set_failure(current->kunit_test); > > + currently kunit_set_failure() is static, but it could be inlined I suspect. > > + /* kunit_err() only accepts literals, so evaluate the args first. */ > > + va_start(args, fmt); > > + len = vsnprintf(NULL, 0, fmt, args) + 1; > > + va_end(args); > > + > > + buffer = kunit_kmalloc(current->kunit_test, len, GFP_KERNEL); kunit_kmalloc()/kunit_kfree() are exported also, but we could probably dodge allocation with a static buffer. In fact since we end up using an on-stack buffer for logging in kunit_log_append(), it might make sense to #define __kunit_fail_current_test() instead, i.e. #define __kunit_fail_current_test(file,
Re: [PATCH v3 0/2] kunit: fail tests on UBSAN errors
On Tue, 9 Feb 2021, Daniel Latypov wrote:

> v1 by Uriel is here: [1].
> Since it's been a while, I've dropped the Reviewed-By's.
>
> It depended on commit 83c4e7a0363b ("KUnit: KASAN Integration") which
> hadn't been merged yet, so that caused some kerfuffle with applying them
> previously and the series was reverted.
>
> This revives the series but makes the kunit_fail_current_test() function
> take a format string and logs the file and line number of the failing
> code, addressing Alan Maguire's comments on the previous version.
>
> As a result, the patch that makes UBSAN errors was tweaked slightly to
> include an error message.
>
> v2 -> v3:
>   Fix kunit_fail_current_test() so it works w/ CONFIG_KUNIT=m
>   s/_/__ on the helper func to match others in test.c
>
> [1] https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguajard...@gmail.com/
>

For the series:

Reviewed-by: Alan Maguire

Thanks!
Re: [PATCH v2 1/2] kunit: support failure from dynamic analysis tools
On Tue, 9 Feb 2021, Daniel Latypov wrote:

> On Tue, Feb 9, 2021 at 9:26 AM Alan Maguire wrote:
> >
> > On Fri, 5 Feb 2021, Daniel Latypov wrote:
> >
> > > From: Uriel Guajardo
> > >
> > > Add a kunit_fail_current_test() function to fail the currently running
> > > test, if any, with an error message.
> > >
> > > This is largely intended for dynamic analysis tools like UBSAN and for
> > > fakes.
> > > E.g. say I had a fake ops struct for testing and I wanted my `free`
> > > function to complain if it was called with an invalid argument, or
> > > caught a double-free. Most return void and have no normal means of
> > > signalling failure (e.g. super_operations, iommu_ops, etc.).
> > >
> > > Key points:
> > > * Always update current->kunit_test so anyone can use it.
> > >   * commit 83c4e7a0363b ("KUnit: KASAN Integration") only updated it for
> > >     CONFIG_KASAN=y
> > >
> > > * Create a new header so non-test code doesn't have
> > >   to include all of (e.g. lib/ubsan.c)
> > >
> > > * Forward the file and line number to make it easier to track down
> > >   failures
> > >
> >
> > Thanks for doing this!
> >
> > > * Declare it as a function for nice __printf() warnings about mismatched
> > >   format strings even when KUnit is not enabled.
> > >
> >
> > One thing I _think_ this assumes is that KUnit is builtin;
> > don't we need an
>
> Ah, you're correct.
> Also going to rename it to have two _ to match other functions used in
> macros like __kunit_test_suites_init.
>

Great! If you're sending out an updated version with these changes,
feel free to add

Reviewed-by: Alan Maguire
Re: [PATCH v2 1/2] kunit: support failure from dynamic analysis tools
On Fri, 5 Feb 2021, Daniel Latypov wrote: > From: Uriel Guajardo > > Add a kunit_fail_current_test() function to fail the currently running > test, if any, with an error message. > > This is largely intended for dynamic analysis tools like UBSAN and for > fakes. > E.g. say I had a fake ops struct for testing and I wanted my `free` > function to complain if it was called with an invalid argument, or > caught a double-free. Most return void and have no normal means of > signalling failure (e.g. super_operations, iommu_ops, etc.). > > Key points: > * Always update current->kunit_test so anyone can use it. > * commit 83c4e7a0363b ("KUnit: KASAN Integration") only updated it for > CONFIG_KASAN=y > > * Create a new header so non-test code doesn't have > to include all of (e.g. lib/ubsan.c) > > * Forward the file and line number to make it easier to track down > failures > Thanks for doing this! > * Declare it as a function for nice __printf() warnings about mismatched > format strings even when KUnit is not enabled. > One thing I _think_ this assumes is that KUnit is builtin; don't we need an EXPORT_SYMBOL_GPL(_kunit_fail_current_test); ? Without it, if an analysis tool (or indeed if KUnit) is built as a module, it won't be possible to use this functionality. 
> Example output from kunit_fail_current_test("message"): > [15:19:34] [FAILED] example_simple_test > [15:19:34] # example_simple_test: initializing > [15:19:34] # example_simple_test: lib/kunit/kunit-example-test.c:24: > message > [15:19:34] not ok 1 - example_simple_test > > Co-developed-by: Daniel Latypov > Signed-off-by: Uriel Guajardo > Signed-off-by: Daniel Latypov > --- > include/kunit/test-bug.h | 30 ++ > lib/kunit/test.c | 36 > 2 files changed, 62 insertions(+), 4 deletions(-) > create mode 100644 include/kunit/test-bug.h > > diff --git a/include/kunit/test-bug.h b/include/kunit/test-bug.h > new file mode 100644 > index ..4963ed52c2df > --- /dev/null > +++ b/include/kunit/test-bug.h > @@ -0,0 +1,30 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * KUnit API allowing dynamic analysis tools to interact with KUnit tests > + * > + * Copyright (C) 2020, Google LLC. nit; might want to update copyright year. > + * Author: Uriel Guajardo > + */ > + > +#ifndef _KUNIT_TEST_BUG_H > +#define _KUNIT_TEST_BUG_H > + > +#define kunit_fail_current_test(fmt, ...) \ > + _kunit_fail_current_test(__FILE__, __LINE__, fmt, ##__VA_ARGS__) > + > +#if IS_ENABLED(CONFIG_KUNIT) > + > +extern __printf(3, 4) void _kunit_fail_current_test(const char *file, int > line, > + const char *fmt, ...); > + > +#else > + > +static __printf(3, 4) void _kunit_fail_current_test(const char *file, int > line, > + const char *fmt, ...) > +{ > +} > + > +#endif > + > + > +#endif /* _KUNIT_TEST_BUG_H */ > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > index ec9494e914ef..7b16aae0ccae 100644 > --- a/lib/kunit/test.c > +++ b/lib/kunit/test.c > @@ -7,6 +7,7 @@ > */ > > #include > +#include > #include > #include > #include > @@ -16,6 +17,37 @@ > #include "string-stream.h" > #include "try-catch-impl.h" > > +/* > + * Fail the current test and print an error message to the log. > + */ > +void _kunit_fail_current_test(const char *file, int line, const char *fmt, > ...) 
> +{ > + va_list args; > + int len; > + char *buffer; > + > + if (!current->kunit_test) > + return; > + > + kunit_set_failure(current->kunit_test); > + > + /* kunit_err() only accepts literals, so evaluate the args first. */ > + va_start(args, fmt); > + len = vsnprintf(NULL, 0, fmt, args) + 1; > + va_end(args); > + > + buffer = kunit_kmalloc(current->kunit_test, len, GFP_KERNEL); > + if (!buffer) > + return; > + > + va_start(args, fmt); > + vsnprintf(buffer, len, fmt, args); > + va_end(args); > + > + kunit_err(current->kunit_test, "%s:%d: %s", file, line, buffer); > + kunit_kfree(current->kunit_test, buffer); > +} > + > /* > * Append formatted message to log, size of which is limited to > * KUNIT_LOG_SIZE bytes (including null terminating byte). > @@ -273,9 +305,7 @@ static void kunit_try_run_case(void *data) > struct kunit_suite *suite = ctx->suite; > struct kunit_case *test_case = ctx->test_case; > > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) > current->kunit_test = test; > -#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT) */ > > /* >* kunit_run_case_internal may encounter a fatal error; if it does, > @@ -624,9 +654,7 @@ void kunit_cleanup(struct kunit *test) > spin_unlock(&test->lock); > kunit_remove_resource(test, res); > } > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) >
Re: [PATCH v2 bpf-next 3/4] libbpf: BTF dumper support for typed data
On Thu, 21 Jan 2021, Andrii Nakryiko wrote:

> On Wed, Jan 20, 2021 at 10:56 PM Andrii Nakryiko
> wrote:
> >
> > On Sun, Jan 17, 2021 at 2:22 PM Alan Maguire
> > wrote:
> > >
> > > Add a BTF dumper for typed data, so that the user can dump a typed
> > > version of the data provided.
> > >
> > > The API is
> > >
> > > int btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
> > >                              const struct btf_dump_emit_type_data_opts *opts,
> > >                              void *data);
> > >
>
> Two more things I realized about this API overnight:
>
> 1. It's error-prone to specify only the pointer to data without
> specifying the size. If user screws up and specifies wrong type ID or
> if BTF data is corrupted, then this API would start reading and
> printing memory outside the bounds. I think it's much better to also
> require user to specify the size and bail out with error if we reach
> the end of the allowed memory area.

Yep, good point, especially given in the tracing context we will likely
only have a subset of the data (e.g. part of the 16k representing a
task_struct). The way I was approaching this was to return -E2BIG and
append a "..." to the dumped data denoting the data provided didn't
cover the size needed to fully represent the type. The idea is the
structure is too big for the data provided, hence E2BIG, but maybe
there's a more intuitive way to do this? See below for more...

> 2. This API would be more useful if it also returns the amount of
> "consumed" bytes. That way users can do more flexible and powerful
> pretty-printing of raw data. So on success we'll have >= 0 number of
> bytes used for dumping given BTF type, or <0 on error. WDYT?
>

I like it! So

1. if a user provides a too-big data object, we return the amount we
   used; and
2. if a user provides a too-small data object, we append "..." to the
   dump and return -E2BIG (or whatever error code).

However I wonder for case 2 if it'd be better to use a snprintf()-like
semantic rather than an error code, returning the amount we would have
used. That way we easily detect case 1 (size passed in > return value),
case 2 (size passed in < return value), and errors can be treated
separately. Feels to me that dealing with truncated data is going to be
sufficiently frequent it might be good not to classify it as an error.
Let me know if you think that makes sense.

I'm working on v3, and hope to have something early next week, but a
quick reply to a question below...

> > > ...where the id is the BTF id of the data pointed to by the "void *"
> > > argument; for example the BTF id of "struct sk_buff" for a
> > > "struct skb *" data pointer. Options supported are
> > >
> > > - a starting indent level (indent_lvl)
> > > - a set of boolean options to control dump display, similar to those
> > >   used for BPF helper bpf_snprintf_btf(). Options are
> > >   - compact : omit newlines and other indentation
> > >   - noname: omit member names
> > >   - zero: show zero-value members
> > >
> > > Default output format is identical to that dumped by bpf_snprintf_btf(),
> > > for example a "struct sk_buff" representation would look like this:
> > >
> > > (struct sk_buff){
> > >  (union){
> > >   (struct){
> >
> > Curious, these explicit anonymous (union) and (struct), is that
> > preferred way for explicitness, or is it just because it makes
> > implementation simpler and thus was chosen? I.e., if the goal was to
> > mimic C-style data initialization, you'd just have plain .next = ...,
> > .prev = ..., .dev = ..., .dev_scratch = ..., all on the same level. So
> > just checking for myself.

The idea here is that we want to clarify if we're dealing with an
anonymous struct or union. I wanted to have things work like a C-style
initializer as closely as possible, but I realized it's not legit to
initialize multiple values in a union, and more importantly when we're
trying to visually interpret data, we really want to know if an
anonymous container of data is a structure (where all values represent
different elements in the structure) or a union (where we're seeing
multiple interpretations of the same value).

Thanks again for the detailed review!

Alan
[PATCH v2 bpf-next 1/4] libbpf: add btf_has_size() and btf_int() inlines
BTF type data dumping will use them in later patches, and they are useful
generally when handling BTF data.

Signed-off-by: Alan Maguire
---
 tools/lib/bpf/btf.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 1237bcd..0c48f2e 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -294,6 +294,20 @@ static inline bool btf_is_datasec(const struct btf_type *t)
 	return btf_kind(t) == BTF_KIND_DATASEC;
 }
 
+static inline bool btf_has_size(const struct btf_type *t)
+{
+	switch (BTF_INFO_KIND(t->info)) {
+	case BTF_KIND_INT:
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+	case BTF_KIND_ENUM:
+	case BTF_KIND_DATASEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
 static inline __u8 btf_int_encoding(const struct btf_type *t)
 {
 	return BTF_INT_ENCODING(*(__u32 *)(t + 1));
@@ -309,6 +323,11 @@ static inline __u8 btf_int_bits(const struct btf_type *t)
 	return BTF_INT_BITS(*(__u32 *)(t + 1));
 }
 
+static inline __u32 btf_int(const struct btf_type *t)
+{
+	return *(__u32 *)(t + 1);
+}
+
 static inline struct btf_array *btf_array(const struct btf_type *t)
 {
 	return (struct btf_array *)(t + 1);
-- 
1.8.3.1
[PATCH v2 bpf-next 3/4] libbpf: BTF dumper support for typed data
Add a BTF dumper for typed data, so that the user can dump a typed version of the data provided. The API is int btf_dump__emit_type_data(struct btf_dump *d, __u32 id, const struct btf_dump_emit_type_data_opts *opts, void *data); ...where the id is the BTF id of the data pointed to by the "void *" argument; for example the BTF id of "struct sk_buff" for a "struct skb *" data pointer. Options supported are - a starting indent level (indent_lvl) - a set of boolean options to control dump display, similar to those used for BPF helper bpf_snprintf_btf(). Options are - compact : omit newlines and other indentation - noname: omit member names - zero: show zero-value members Default output format is identical to that dumped by bpf_snprintf_btf(), for example a "struct sk_buff" representation would look like this: struct sk_buff){ (union){ (struct){ .next = (struct sk_buff *)0x, .prev = (struct sk_buff *)0x, (union){ .dev = (struct net_device *)0x, .dev_scratch = (long unsigned int)18446744073709551615, }, }, ... 
Signed-off-by: Alan Maguire --- tools/lib/bpf/btf.h | 17 + tools/lib/bpf/btf_dump.c | 974 +++ tools/lib/bpf/libbpf.map | 5 + 3 files changed, 996 insertions(+) diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index 0c48f2e..7937124 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -180,6 +180,23 @@ struct btf_dump_emit_type_decl_opts { btf_dump__emit_type_decl(struct btf_dump *d, __u32 id, const struct btf_dump_emit_type_decl_opts *opts); + +struct btf_dump_emit_type_data_opts { + /* size of this struct, for forward/backward compatibility */ + size_t sz; + int indent_level; + /* below match "show" flags for bpf_show_snprintf() */ + bool compact; + bool noname; + bool zero; +}; +#define btf_dump_emit_type_data_opts__last_field zero + +LIBBPF_API int +btf_dump__emit_type_data(struct btf_dump *d, __u32 id, +const struct btf_dump_emit_type_data_opts *opts, +void *data); + /* * A set of helpers for easier BTF types handling */ diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 2f9d685..04d604f 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -10,6 +10,8 @@ #include #include #include +#include +#include #include #include #include @@ -19,14 +21,31 @@ #include "libbpf.h" #include "libbpf_internal.h" +#define BITS_PER_BYTE 8 +#define BITS_PER_U128 (sizeof(__u64) * BITS_PER_BYTE * 2) +#define BITS_PER_BYTE_MASK (BITS_PER_BYTE - 1) +#define BITS_PER_BYTE_MASKED(bits) ((bits) & BITS_PER_BYTE_MASK) +#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3) +#define BITS_ROUNDUP_BYTES(bits) \ + (BITS_ROUNDDOWN_BYTES(bits) + !!BITS_PER_BYTE_MASKED(bits)) + static const char PREFIXES[] = "\t\t\t\t\t\t\t\t\t\t\t\t\t"; static const size_t PREFIX_CNT = sizeof(PREFIXES) - 1; + static const char *pfx(int lvl) { return lvl >= PREFIX_CNT ? 
PREFIXES : &PREFIXES[PREFIX_CNT - lvl]; } +static const char SPREFIXES[] = " "; +static const size_t SPREFIX_CNT = sizeof(SPREFIXES) - 1; + +static const char *spfx(int lvl) +{ + return lvl >= SPREFIX_CNT ? SPREFIXES : &SPREFIXES[SPREFIX_CNT - lvl]; +} + enum btf_dump_type_order_state { NOT_ORDERED, ORDERING, @@ -53,6 +72,49 @@ struct btf_dump_type_aux_state { __u8 referenced: 1; }; +#define BTF_DUMP_DATA_MAX_NAME_LEN 256 + +/* + * Common internal data for BTF type data dump operations. + * + * The implementation here is similar to that in kernel/bpf/btf.c + * that supports the bpf_snprintf_btf() helper, so any bugs in + * type data dumping here are likely in that code also. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by state.depth_check being set, and if we + * encounter a non-zero value we set state.depth_to_show to the depth + * at which we encountered it. When we have completed the first pass, + * we will know if anything needs to be displayed if + * state.depth_to_show > state.depth. See btf_dump_emit_[struct,array]_data() + * for the implementation of this. + * + */ +struct btf_dump_data { + bool compact; + bool noname; + bool zero; + __u8 indent_lv
[PATCH v2 bpf-next 0/4] libbpf: BTF dumper support for typed data
Add a libbpf dumper function that supports dumping a representation of
data passed in using the BTF id associated with the data in a manner
similar to the bpf_snprintf_btf helper.

Default output format is identical to that dumped by bpf_snprintf_btf(),
for example a "struct sk_buff" representation would look like this:

(struct sk_buff){
 (union){
  (struct){
   .next = (struct sk_buff *)0x,
   .prev = (struct sk_buff *)0x,
   (union){
    .dev = (struct net_device *)0x,
    .dev_scratch = (long unsigned int)18446744073709551615,
   },
  },
 ...

Patches 1 and 2 make functions available that are needed during dump
operations.

Patch 3 implements the dump functionality in a manner similar to that
in kernel/bpf/btf.c, but with a view to fitting into libbpf more
naturally. For example, rather than using flags, boolean dump options
are used to control output.

Patch 4 is a selftest that utilizes a dump printf function to snprintf
the dump output to a string for comparison with expected output. Tests
deliberately mirror those in snprintf_btf helper test to keep output
consistent.

Changes since RFC [1]

- The initial approach explored was to share the kernel code with
  libbpf using #defines to paper over the different needs; however it
  makes more sense to try and fit in with libbpf code style for
  maintenance. A comment in the code points at the implementation in
  kernel/bpf/btf.c and notes that any issues found in it should be
  fixed there or vice versa; mirroring the tests should help with this
  also (Andrii)

[1] https://lore.kernel.org/bpf/1610386373-24162-1-git-send-email-alan.magu...@oracle.com/T/#t

Alan Maguire (4):
  libbpf: add btf_has_size() and btf_int() inlines
  libbpf: make skip_mods_and_typedefs available internally in libbpf
  libbpf: BTF dumper support for typed data
  selftests/bpf: add dump type data tests to btf dump tests

 tools/lib/bpf/btf.h                               |  36 +
 tools/lib/bpf/btf_dump.c                          | 974 ++
 tools/lib/bpf/libbpf.c                            |   4 +-
 tools/lib/bpf/libbpf.map                          |   5 +
 tools/lib/bpf/libbpf_internal.h                   |   2 +
 tools/testing/selftests/bpf/prog_tests/btf_dump.c | 233 ++
 6 files changed, 1251 insertions(+), 3 deletions(-)

-- 
1.8.3.1
[PATCH v2 bpf-next 4/4] selftests/bpf: add dump type data tests to btf dump tests
Test various type data dumping operations by comparing expected format
with the dumped string; an snprintf-style printf function is used to
record the string dumped.

Signed-off-by: Alan Maguire
---
 tools/testing/selftests/bpf/prog_tests/btf_dump.c | 233 ++
 1 file changed, 233 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dump.c b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
index c60091e..262561f4 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_dump.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_dump.c
@@ -232,6 +232,237 @@ void test_btf_dump_incremental(void)
 	btf__free(btf);
 }
 
+#define STRSIZE			2048
+#define EXPECTED_STRSIZE	256
+
+void btf_dump_snprintf(void *ctx, const char *fmt, va_list args)
+{
+	char *s = ctx, new[STRSIZE];
+
+	vsnprintf(new, STRSIZE, fmt, args);
+	strncat(s, new, STRSIZE);
+}
+
+/* skip "enum "/"struct " prefixes */
+#define SKIP_PREFIX(_typestr, _prefix)				\
+	do {							\
+		if (strstr(_typestr, _prefix) == _typestr)	\
+			_typestr += strlen(_prefix) + 1;	\
+	} while (0)
+
+int btf_dump_data(struct btf *btf, struct btf_dump *d,
+		  char *ptrtype, __u64 flags, void *ptr,
+		  char *str, char *expectedval)
+{
+	struct btf_dump_emit_type_data_opts opts = { 0 };
+	int ret = 0, cmp;
+	__s32 type_id;
+
+	opts.sz = sizeof(opts);
+	opts.compact = true;
+	if (flags & BTF_F_NONAME)
+		opts.noname = true;
+	if (flags & BTF_F_ZERO)
+		opts.zero = true;
+	SKIP_PREFIX(ptrtype, "enum");
+	SKIP_PREFIX(ptrtype, "struct");
+	SKIP_PREFIX(ptrtype, "union");
+	type_id = btf__find_by_name(btf, ptrtype);
+	if (CHECK(type_id <= 0, "find type id",
+		  "no '%s' in BTF: %d\n", ptrtype, type_id)) {
+		ret = -ENOENT;
+		goto err;
+	}
+	str[0] = '\0';
+	ret = btf_dump__emit_type_data(d, type_id, &opts, ptr);
+	if (CHECK(ret < 0, "btf_dump__emit_type_data",
+		  "failed: %d\n", ret))
+		goto err;
+
+	cmp = strncmp(str, expectedval, EXPECTED_STRSIZE);
+	if (CHECK(cmp, "ensure expected/actual match",
+		  "'%s' does not match expected '%s': %d\n",
+		  str, expectedval, cmp))
+		ret = -EFAULT;
+
+err:
+	if (ret)
+		btf_dump__free(d);
+	return ret;
+}
+
+#define TEST_BTF_DUMP_DATA(_b, _d, _str, _type, _flags, _expected, ...)	\
+	do {								\
+		char _expectedval[EXPECTED_STRSIZE] = _expected;	\
+		char __ptrtype[64] = #_type;				\
+		char *_ptrtype = (char *)__ptrtype;			\
+		static _type _ptrdata = __VA_ARGS__;			\
+		void *_ptr = &_ptrdata;					\
+									\
+		if (btf_dump_data(_b, _d, _ptrtype, _flags, _ptr,	\
+				  _str, _expectedval))			\
+			return;						\
+	} while (0)
+
+/* Use where expected data string matches its stringified declaration */
+#define TEST_BTF_DUMP_DATA_C(_b, _d, _str, _type, _opts, ...)		\
+	TEST_BTF_DUMP_DATA(_b, _d, _str, _type, _opts,			\
+			   "(" #_type ")" #__VA_ARGS__, __VA_ARGS__)
+
+void test_btf_dump_data(void)
+{
+	struct btf *btf = libbpf_find_kernel_btf();
+	char str[STRSIZE];
+	struct btf_dump_opts opts = { .ctx = str };
+	struct btf_dump *d;
+
+	if (CHECK(!btf, "get kernel BTF", "no kernel BTF found"))
+		return;
+
+	d = btf_dump__new(btf, NULL, &opts, btf_dump_snprintf);
+
+	if (CHECK(!d, "new dump", "could not create BTF dump"))
+		return;
+
+	/* Verify type display for various types. */
+
+	/* simple int */
+	TEST_BTF_DUMP_DATA_C(btf, d, str, int, 0, 1234);
+	TEST_BTF_DUMP_DATA(btf, d, str, int, BTF_F_NONAME, "1234", 1234);
+
+	/* zero value should be printed at toplevel */
+	TEST_BTF_DUMP_DATA(btf, d, str, int, 0, "(int)0", 0);
+	TEST_BTF_DUMP_DATA(btf, d, str, int, BTF_F_NONAME, "0", 0);
+	TEST_BTF_DUMP_DATA(btf, d, s
[PATCH v2 bpf-next 2/4] libbpf: make skip_mods_and_typedefs available internally in libbpf
btf_dump.c will need it for type-based data display.

Signed-off-by: Alan Maguire
---
 tools/lib/bpf/libbpf.c          | 4 +---
 tools/lib/bpf/libbpf_internal.h | 2 ++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2abbc38..4ef84e1 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -73,8 +73,6 @@
 #define __printf(a, b)	__attribute__((format(printf, a, b)))
 
 static struct bpf_map *bpf_object__add_map(struct bpf_object *obj);
-static const struct btf_type *
-skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
 
 static int __base_pr(enum libbpf_print_level level, const char *format,
 		     va_list args)
@@ -1885,7 +1883,7 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
 	return 0;
 }
 
-static const struct btf_type *
+const struct btf_type *
 skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id)
 {
 	const struct btf_type *t = btf__type_by_id(btf, id);
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 969d0ac..c25d2df 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -108,6 +108,8 @@ static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size)
 void *btf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t cur_cnt,
 		  size_t max_cnt, size_t add_cnt);
 int btf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
+const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id,
+					      __u32 *res_id);
 
 static inline bool libbpf_validate_opts(const char *opts, size_t opts_sz,
 					size_t user_sz,
-- 
1.8.3.1
Re: [RFC PATCH bpf-next 1/2] bpf: share BTF "show" implementation between kernel and libbpf
On Mon, 11 Jan 2021, Andrii Nakryiko wrote: > On Mon, Jan 11, 2021 at 9:34 AM Alan Maguire wrote: > > Currently the only "show" function for userspace is to write the > > representation of the typed data to a string via > > > > LIBBPF_API int > > btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj, > > __u64 flags); > > > > ...but other approaches could be pursued including printf()-based > > show, or even a callback mechanism could be supported to allow > > user-defined show functions. > > > > It's strange that you saw btf_dump APIs, and yet decided to go with > this API instead. snprintf() is not a natural "method" of struct btf. > Using char buffer as an output is overly restrictive and inconvenient. > It's appropriate for kernel and BPF program due to their restrictions, > but there is no need to cripple libbpf APIs for that. I think it > should follow btf_dump APIs with custom callback so that it's easy to > just printf() everything, but also user can create whatever elaborate > mechanism they need and that fits their use case. > > Code reuse is not the ultimate goal, it should facilitate > maintainability, not harm it. There are times where sharing code > introduces unnecessary coupling and maintainability issues. And I > think this one is a very obvious case of that. > Okay, so I've been exploring adding dumper API support. 
The initial approach I've been using is to provide an API like this:

/* match show flags for bpf_snprintf_btf() */
enum {
	BTF_DUMP_F_COMPACT	= (1ULL << 0),
	BTF_DUMP_F_NONAME	= (1ULL << 1),
	BTF_DUMP_F_ZERO		= (1ULL << 3),
};

struct btf_dump_emit_type_data_opts {
	/* size of this struct, for forward/backward compatibility */
	size_t sz;
	void *data;
	int indent_level;
	__u64 flags;
};
#define btf_dump_emit_type_data_opts__last_field flags

LIBBPF_API int
btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
			 const struct btf_dump_emit_type_data_opts *opts);

...so the opts play a similar role to the struct btf_ptr + flags in
bpf_snprintf_btf. I've got this working, but the current
implementation is tied to emitting the same C-based syntax as
bpf_snprintf_btf(); though of course the user-supplied printf function
is invoked to do the displaying. So a use case looks something like
this:

	struct btf_dump_emit_type_data_opts opts;
	char skbufmem[1024], skbufstr[8192];
	struct btf *btf = libbpf_find_kernel_btf();
	struct btf_dump *d;
	__s32 skbid;
	int indent = 0;

	memset(skbufmem, 0xff, sizeof(skbufmem));
	opts.data = skbufmem;
	opts.sz = sizeof(opts);
	opts.indent_level = indent;
	d = btf_dump__new(btf, NULL, NULL, printffn);
	skbid = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
	if (skbid < 0) {
		fprintf(stderr, "no skbuff, err %d\n", skbid);
		exit(1);
	}
	btf_dump__emit_type_data(d, skbid, &opts);

...and we get output of the form:

(struct sk_buff){
 (union){
  (struct){
   .next = (struct sk_buff *)0x,
   .prev = (struct sk_buff *)0x,
   (union){
    .dev = (struct net_device *)0x,
    .dev_scratch = (long unsigned int)18446744073709551615,
   },
  },
...

etc. However it would be nice to find a way to help printf function
providers emit different formats such as JSON without having to parse
the data they are provided in the printf function. That would remove
the need for the output flags, since the printf function provider
could control display.
If we provided an option to provide a "kind" printf function, and
ensured that the BTF dumper sets a "kind" prior to each _internal_
call to the printf function, we could use that info to adapt output in
various ways. For example, consider the case where we want to emit
C-type output. We can use the kind info to control output for various
scenarios:

void c_dump_kind_printf(struct btf_dump *d, enum btf_dump_kind kind,
			void *ctx, const char *fmt, va_list args)
{
	switch (kind) {
	case BTF_DUMP_KIND_TYPE_NAME:
		/* For C, add brackets around the type name string ( ) */
		btf_dump__printf(d, "(");
		btf_dump__vprintf(d, fmt, args);
		btf_dump__printf(d, ")");
		break;
	case BTF_DUMP_KIND_MEMBER_NAME:
		/* for C, prefix a "." to member name, suffix a " = " */
		btf_dump__printf(d, ".");
		btf_dump__vprintf(d, fmt, args);
		btf_dump__printf(d, " = ");
		break;
	...
	}
}
[RFC PATCH bpf-next 1/2] bpf: share BTF "show" implementation between kernel and libbpf
libbpf already supports a "dumper" API for dumping type information,
but there is currently no support for dumping typed _data_ via libbpf.

However this functionality does exist in the kernel, in part to
facilitate the bpf_snprintf_btf() helper, which dumps a string
representation of the pointer passed in utilizing the BTF type id of
the data pointed to. For example, the pair of a pointer to a
"struct sk_buff" and the BTF type id of "struct sk_buff" can be used.

Here the kernel code is generalized into btf_show_common.c. For the
most part, code is identical for userspace and kernel, beyond a few
API differences and missing functions. The only significant
differences are

- the "safe copy" logic used by the kernel to ensure we do not induce
  a crash during BPF operation; and
- the BTF seq file support that is kernel-only.

The mechanics are to maintain identical btf_show_common.c files in
kernel/bpf and tools/lib/bpf, and a common header btf_common.h in
include/linux/ and tools/lib/bpf/. This file duplication seems to be
the common practice with duplication between kernel and tools/, so it
is the approach taken here. The common code approach could likely be
explored further, but here the minimum common code required to support
BTF show functionality is used.

Currently the only "show" function for userspace is to write the
representation of the typed data to a string via

LIBBPF_API int
btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj,
	      __u64 flags);

...but other approaches could be pursued, including printf()-based
show, or even a callback mechanism could be supported to allow
user-defined show functions.
Here's an example usage, storing a string representation of
struct sk_buff *skb in buf:

	struct btf *btf = libbpf_find_kernel_btf();
	char buf[8192];
	__s32 skb_id;

	skb_id = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
	if (skb_id < 0)
		fprintf(stderr, "no skbuff, err %d\n", skb_id);
	else
		btf__snprintf(btf, buf, sizeof(buf), skb_id, skb, 0);

Suggested-by: Alexei Starovoitov
Signed-off-by: Alan Maguire
---
 include/linux/btf.h             |  121 +---
 include/linux/btf_common.h      |  286 +
 kernel/bpf/Makefile             |    2 +-
 kernel/bpf/arraymap.c           |    1 +
 kernel/bpf/bpf_struct_ops.c     |    1 +
 kernel/bpf/btf.c                | 1215 +-
 kernel/bpf/btf_show_common.c    | 1218 +++
 kernel/bpf/core.c               |    1 +
 kernel/bpf/hashtab.c            |    1 +
 kernel/bpf/local_storage.c      |    1 +
 kernel/bpf/verifier.c           |    1 +
 kernel/trace/bpf_trace.c        |    1 +
 tools/lib/bpf/Build             |    2 +-
 tools/lib/bpf/btf.h             |    7 +
 tools/lib/bpf/btf_common.h      |  286 +
 tools/lib/bpf/btf_show_common.c | 1218 +++
 tools/lib/bpf/libbpf.map        |    1 +
 17 files changed, 3044 insertions(+), 1319 deletions(-)
 create mode 100644 include/linux/btf_common.h
 create mode 100644 kernel/bpf/btf_show_common.c
 create mode 100644 tools/lib/bpf/btf_common.h
 create mode 100644 tools/lib/bpf/btf_show_common.c

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 4c200f5..a1f6325 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -50,43 +50,6 @@
 const struct btf_type *btf_type_id_size(const struct btf *btf,
 					u32 *type_id,
 					u32 *ret_size);
 
-/*
- * Options to control show behaviour.
- *	- BTF_SHOW_COMPACT: no formatting around type information
- *	- BTF_SHOW_NONAME: no struct/union member names/types
- *	- BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values;
- *	  equivalent to %px.
- *	- BTF_SHOW_ZERO: show zero-valued struct/union members; they
- *	  are not displayed by default
- *	- BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read
- *	  data before displaying it.
- */
-#define BTF_SHOW_COMPACT	BTF_F_COMPACT
-#define BTF_SHOW_NONAME		BTF_F_NONAME
-#define BTF_SHOW_PTR_RAW	BTF_F_PTR_RAW
-#define BTF_SHOW_ZERO		BTF_F_ZERO
-#define BTF_SHOW_UNSAFE		(1ULL << 4)
-
-void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
-		       struct seq_file *m);
-int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj,
-			    struct seq_file *m, u64 flags);
-
-/*
- * Copy len bytes of string representation of obj of BTF type_id into buf.
- *
- * @btf: struct btf object
- * @type_id: type id of type obj points to
- * @obj: pointer to typed data
- * @buf: buffer to write to
- * @len: maximum length to write to buf
- * @flags: show options (see above)
- *
- * Ret
[RFC PATCH bpf-next 2/2] selftests/bpf: test libbpf-based type display
Test btf__snprintf with various base/kernel types and ensure display
is as expected; tests are identical to those in the snprintf_btf test,
save for the fact these run in userspace rather than in BPF program
context.

Signed-off-by: Alan Maguire
---
 .../selftests/bpf/prog_tests/snprintf_btf_user.c | 192 +
 1 file changed, 192 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c
new file mode 100644
index 000..9eb82b2
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c
@@ -0,0 +1,192 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021, Oracle and/or its affiliates. */
+#include
+#include
+#include
+
+#include
+#include
+
+#define STRSIZE			2048
+#define EXPECTED_STRSIZE	256
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x)	(sizeof(x) / sizeof((x)[0]))
+#endif
+
+/* skip "enum "/"struct " prefixes */
+#define SKIP_PREFIX(_typestr, _prefix)				\
+	do {							\
+		if (strstr(_typestr, _prefix) == _typestr)	\
+			_typestr += strlen(_prefix) + 1;	\
+	} while (0)
+
+#define TEST_BTF(btf, _str, _type, _flags, _expected, ...)	\
+	do {							\
+		const char _expectedval[EXPECTED_STRSIZE] = _expected;	\
+		const char __ptrtype[64] = #_type;		\
+		char *_ptrtype = (char *)__ptrtype;		\
+		__u64 _hflags = _flags | BTF_F_COMPACT;		\
+		static _type _ptrdata = __VA_ARGS__;		\
+		void *_ptr = &_ptrdata;				\
+		__s32 _type_id;					\
+		int _cmp, _ret;					\
+								\
+		SKIP_PREFIX(_ptrtype, "enum");			\
+		SKIP_PREFIX(_ptrtype, "struct");		\
+		SKIP_PREFIX(_ptrtype, "union");			\
+		_ptr = &_ptrdata;				\
+		_type_id = btf__find_by_name(btf, _ptrtype);	\
+		if (CHECK(_type_id <= 0, "find type id",	\
+			  "no '%s' in BTF: %d\n", _ptrtype, _type_id)) \
+			return;					\
+		_ret = btf__snprintf(btf, _str, STRSIZE, _type_id, _ptr,\
+				     _hflags);			\
+		if (CHECK(_ret < 0, "btf snprintf", "failed: %d\n",	\
+			  _ret))				\
+			return;					\
+		_cmp = strncmp(_str, _expectedval, EXPECTED_STRSIZE);	\
+		if (CHECK(_cmp, "ensure expected/actual match",	\
+			  "'%s' does not match expected '%s': %d\n",\
+			  _str, _expectedval, _cmp))		\
+			return;					\
+	} while (0)
+
+/* Use where expected data string matches its stringified declaration */
+#define TEST_BTF_C(btf, _str, _type, _flags, ...)		\
+	TEST_BTF(btf, _str, _type, _flags, "(" #_type ")" #__VA_ARGS__,	\
+		 __VA_ARGS__)
+
+/* Demonstrate that libbpf btf__snprintf succeeds and that various
+ * data types are formatted correctly.
+ */
+void test_snprintf_btf_user(void)
+{
+	struct btf *btf = libbpf_find_kernel_btf();
+	int duration = 0;
+	char str[STRSIZE];
+
+	if (CHECK(!btf, "get kernel BTF", "no kernel BTF found"))
+		return;
+
+	/* Verify type display for various types. */
+
+	/* simple int */
+	TEST_BTF_C(btf, str, int, 0, 1234);
+	TEST_BTF(btf, str, int, BTF_F_NONAME, "1234", 1234);
+
+	/* zero value should be printed at toplevel */
+	TEST_BTF(btf, str, int, 0, "(int)0", 0);
+	TEST_BTF(btf, str, int, BTF_F_NONAME, "0", 0);
+	TEST_BTF(btf, str, int, BTF_F_ZERO, "(int)0", 0);
+	TEST_BTF(btf, str, int, BTF_F_NONAME | BTF_F_ZERO, "0", 0);
+	TEST_BTF_C(btf, str, int, 0, -4567);
+	TEST_BTF(btf, str, int, BTF_F_NONAME, "-4567", -4567);
[RFC PATCH bpf-next 0/2] bpf, libbpf: share BTF data show functionality
The BPF Type Format (BTF) can be used in conjunction with the helper bpf_snprintf_btf() to display kernel data with type information. This series generalizes that support and shares it with libbpf so that libbpf can display typed data. BTF display functionality is factored out of kernel/bpf/btf.c into kernel/bpf/btf_show_common.c, and that file is duplicated in tools/lib/bpf. Similarly, common definitions and inline functions needed for this support are extracted into include/linux/btf_common.h and this header is again duplicated in tools/lib/bpf. Patch 1 carries out the refactoring, for which no kernel changes are intended, and introduces btf__snprintf() a libbpf function that supports dumping a string representation of typed data using the struct btf * and id associated with that type. Patch 2 tests btf__snprintf() with built-in and kernel types to ensure data is of expected format. The test closely mirrors the BPF program associated with the snprintf_btf.c; in this case however the string representations are verified in userspace rather than in BPF program context. 
Alan Maguire (2): bpf: share BTF "show" implementation between kernel and libbpf selftests/bpf: test libbpf-based type display include/linux/btf.h| 121 +- include/linux/btf_common.h | 286 + kernel/bpf/Makefile|2 +- kernel/bpf/arraymap.c |1 + kernel/bpf/bpf_struct_ops.c|1 + kernel/bpf/btf.c | 1215 +-- kernel/bpf/btf_show_common.c | 1218 kernel/bpf/core.c |1 + kernel/bpf/hashtab.c |1 + kernel/bpf/local_storage.c |1 + kernel/bpf/verifier.c |1 + kernel/trace/bpf_trace.c |1 + tools/lib/bpf/Build|2 +- tools/lib/bpf/btf.h|7 + tools/lib/bpf/btf_common.h | 286 + tools/lib/bpf/btf_show_common.c| 1218 tools/lib/bpf/libbpf.map |1 + .../selftests/bpf/prog_tests/snprintf_btf_user.c | 192 +++ 18 files changed, 3236 insertions(+), 1319 deletions(-) create mode 100644 include/linux/btf_common.h create mode 100644 kernel/bpf/btf_show_common.c create mode 100644 tools/lib/bpf/btf_common.h create mode 100644 tools/lib/bpf/btf_show_common.c create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c -- 1.8.3.1
[PATCH bpf] bpftool: fix compilation failure for net.o with older glibc
For older glibc ~2.17, #include'ing both linux/if.h and net/if.h fails
due to complaints about redefinition of interface flags:

  CC       net.o
In file included from net.c:13:0:
/usr/include/linux/if.h:71:2: error: redeclaration of enumerator ‘IFF_UP’
  IFF_UP			= 1<<0,	/* sysfs */
  ^
/usr/include/net/if.h:44:5: note: previous definition of ‘IFF_UP’ was here
     IFF_UP = 0x1,		/* Interface is up.  */

The issue was fixed in kernel headers in [1], but since compilation of
net.c picks up system headers the problem can recur.

Dropping the #include of linux/if.h resolves the issue, and it is not
needed for compilation anyhow.

[1] https://lore.kernel.org/netdev/1461512707-23058-1-git-send-email-mikko.rapeli__34748.27880641$1462831734$gmane$o...@iki.fi/

Fixes: f6f3bac08ff9 ("tools/bpf: bpftool: add net support")
Signed-off-by: Alan Maguire
---
 tools/bpf/bpftool/net.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index 3fae61e..ff3aa0c 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -11,7 +11,6 @@
 #include
 #include
 #include
-#include <linux/if.h>
 #include
 #include
 #include
-- 
1.8.3.1
Re: [RFC PATCH bpf-next] ksnoop: kernel argument/return value tracing/display using BTF
On Tue, 5 Jan 2021, Cong Wang wrote: > On Mon, Jan 4, 2021 at 7:29 AM Alan Maguire wrote: > > > > BPF Type Format (BTF) provides a description of kernel data structures > > and of the types kernel functions utilize as arguments and return values. > > > > A helper was recently added - bpf_snprintf_btf() - that uses that > > description to create a string representation of the data provided, > > using the BTF id of its type. For example to create a string > > representation of a "struct sk_buff", the pointer to the skb > > is provided along with the type id of "struct sk_buff". > > > > Here that functionality is utilized to support tracing kernel > > function entry and return using k[ret]probes. The "struct pt_regs" > > context can be used to derive arguments and return values, and > > when the user supplies a function name we > > > > - look it up in /proc/kallsyms to find its address/module > > - look it up in the BTF kernel data to get types of arguments > > and return value > > - store a map representation of the trace information, keyed by > > instruction pointer > > - on function entry/return we look up the map to retrieve the BTF > > ids of the arguments/return values and can call bpf_snprintf_btf() > > with these argument/return values along with the type ids to store > > a string representation in the map. > > - this is then sent via perf event to userspace where it can be > > displayed. > > > > ksnoop can be used to show function signatures; for example: > > This is definitely quite useful! > > Is it possible to integrate this with bpftrace? That would save people > from learning yet another tool. ;) > I'd imagine (and hope!) other tracing tools will do this, but right now the aim is to make the task of tracing kernel data structures simpler, so having a tool dedicated to just that can hopefully help those discussions. 
There's a bit more work to be done to simplify that task, for example implementing Alexei's suggestion to support pretty-printing of data structures using BTF in libbpf. My hope is that we can evolve this tool - or something like it - to the point where we can solve that one problem easily, and that other more general tracers can then make use of that solution. I probably should have made all of this clearer in the patch submission, sorry about that. Alan
[RFC PATCH bpf-next] ksnoop: kernel argument/return value tracing/display using BTF
}, .tcp_tsorted_anchor = (struct list_head){ .next = (struct list_head *)0x930b6729bb40, .prev = (struct list_head *)0xa5bfaf00, }, }, .len = (unsigned int)84, .ignore_df = (__u8)0x1, (union){ .csum = (__wsum)2619910871, (struct){ .csum_start = (__u16)43735, .csum_offset = (__u16)39976, }, }, .transport_header = (__u16)36, .network_header = (__u16)16, .mac_header = (__u16)65535, .tail = (sk_buff_data_t)100, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x930b9d3cf800, .data = (unsigned char *)0x930b9d3cf810, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } ); It is possible to combine a request for entry arguments with a predicate on return value; for example we might want to see skbs on entry for cases where ip_send_skb eventually returned an error value. To do this, a predicate such as $ ksnoop "ip_send_skb(skb, return!=0)" ...could be used. On entry, rather than sending perf events the skb argument string representation is "stashed", and on return if the predicate is satisfied, the stashed data along with return-value-related data is sent as a perf event. This allows us to satisfy requests such as "show me entry argument X when the function fails, returning a negative errno". A note about overhead: it is very high. The overhead costs are a combination of known kprobe overhead costs and the cost of assembling string representations of kernel data. Use of predicates can mitigate overhead, as collection of trace data will only occur when the predicate is satisfied; in such cases it is best to lead with the predicate, e.g. ksnoop "ip_send_skb(skb->dev == 0, skb)" ...as this will be evaluated before the skb is stringified, and we potentially avoid that operation if the predicate fails. The same is _not_ true however in the stash case; for ksnoop "ip_send_skb(skb, return!=0)" ...we must collect the skb representation on entry as we do not yet know if the function will fail or not. 
If it does, the data is discarded rather than sent as a perf event. Signed-off-by: Alan Maguire --- tools/bpf/Makefile| 16 +- tools/bpf/ksnoop/Makefile | 102 + tools/bpf/ksnoop/ksnoop.bpf.c | 336 +++ tools/bpf/ksnoop/ksnoop.c | 981 ++ tools/bpf/ksnoop/ksnoop.h | 110 + 5 files changed, 1542 insertions(+), 3 deletions(-) create mode 100644 tools/bpf/ksnoop/Makefile create mode 100644 tools/bpf/ksnoop/ksnoop.bpf.c create mode 100644 tools/bpf/ksnoop/ksnoop.c create mode 100644 tools/bpf/ksnoop/ksnoop.h diff --git a/tools/bpf/Makefile b/tools/bpf/Makefile index 39bb322..8b2b6c9 100644 --- a/tools/bpf/Makefile +++ b/tools/bpf/Makefile @@ -73,7 +73,7 @@ $(OUTPUT)%.lex.o: $(OUTPUT)%.lex.c PROGS = $(OUTPUT)bpf_jit_disasm $(OUTPUT)bpf_dbg $(OUTPUT)bpf_asm -all: $(PROGS) bpftool runqslower +all: $(PROGS) bpftool runqslower ksnoop $(OUTPUT)bpf_jit_disasm: CFLAGS += -DPACKAGE='bpf_jit_disasm' $(OUTPUT)bpf_jit_disasm: $(OUTPUT)bpf_jit_disasm.o @@ -89,7 +89,7 @@ $(OUTPUT)bpf_exp.lex.c: $(OUTPUT)bpf_exp.yacc.c $(OUTPUT)bpf_exp.yacc.o: $(OUTPUT)bpf_exp.yacc.c $(OUTPUT)bpf_exp.lex.o: $(OUTPUT)bpf_exp.lex.c -clean: bpftool_clean runqslower_clean resolve_btfids_clean +clean: bpftool_clean runqslower_clean resolve_btfids_clean ksnoop_clean $(call QUIET_CLEAN, bpf-progs) $(Q)$(RM) -r -- $(OUTPUT)*.o $(OUTPUT)bpf_jit_disasm $(OUTPUT)bpf_dbg \
Re: [PATCH v2 bpf-next 0/3] bpf: support module BTF in BTF display helpers
On Sat, 5 Dec 2020, Yonghong Song wrote: > > > __builtin_btf_type_id() is really only supported in llvm12 > and 64bit return value support is pushed to llvm12 trunk > a while back. The builtin is introduced in llvm11 but has a > corner bug, so llvm12 is recommended. So if people use the builtin, > you can assume 64bit return value. libbpf support is required > here. So in my opinion, there is no need to do feature detection. > > Andrii has a patch to support 64bit return value for > __builtin_btf_type_id() and I assume that one should > be landed before or together with your patch. > > Just for your info. The following is an example you could > use to determine whether __builtin_btf_type_id() > supports btf object id at llvm level. > > -bash-4.4$ cat t.c > int test(int arg) { > return __builtin_btf_type_id(arg, 1); > } > > Compile to generate assembly code with latest llvm12 trunk: > clang -target bpf -O2 -S -g -mcpu=v3 t.c > In the asm code, you should see one line with > r0 = 1 ll > > Or you can generate obj code: > clang -target bpf -O2 -c -g -mcpu=v3 t.c > and then you disassemble the obj file > llvm-objdump -d --no-show-raw-insn --no-leading-addr t.o > You should see below in the output > r0 = 1 ll > > Use earlier version of llvm12 trunk, the builtin has > 32bit return value, you will see > r0 = 1 > which is a 32bit imm to r0, while "r0 = 1 ll" is > 64bit imm to r0. > Thanks for this Yonghong! I'm thinking the way I'll tackle it is to simply verify that the upper 32 bits specifying the veth module object id are non-zero; if they are zero, we'll skip the test (I think a skip probably makes sense as not everyone will have llvm12). Does that seem reasonable? With the additional few minor changes on top of Andrii's patch, the use of __builtin_btf_type_id() worked perfectly. Thanks! Alan
[PATCH v2 bpf-next 1/3] bpf: eliminate btf_module_mutex as RCU synchronization can be used
btf_module_mutex is used when manipulating the BTF module list. However we will wish to look up this list from BPF program context, and such contexts can include interrupt state where we cannot sleep due to a mutex_lock(). RCU usage here conforms quite closely to the example in the system call auditing example in Documentation/RCU/listRCU.rst ; and as such we can eliminate the lock and use list_del_rcu()/call_rcu() on module removal, and list_add_rcu() for module addition. Signed-off-by: Alan Maguire --- kernel/bpf/btf.c | 31 +-- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 8d6bdb4..333f41c 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5758,13 +5758,13 @@ bool btf_id_set_contains(const struct btf_id_set *set, u32 id) #ifdef CONFIG_DEBUG_INFO_BTF_MODULES struct btf_module { struct list_head list; + struct rcu_head rcu; struct module *module; struct btf *btf; struct bin_attribute *sysfs_attr; }; static LIST_HEAD(btf_modules); -static DEFINE_MUTEX(btf_module_mutex); static ssize_t btf_module_read(struct file *file, struct kobject *kobj, @@ -5777,10 +5777,21 @@ struct btf_module { return len; } +static void btf_module_free(struct rcu_head *rcu) +{ + struct btf_module *btf_mod = container_of(rcu, struct btf_module, rcu); + + if (btf_mod->sysfs_attr) + sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr); + btf_put(btf_mod->btf); + kfree(btf_mod->sysfs_attr); + kfree(btf_mod); +} + static int btf_module_notify(struct notifier_block *nb, unsigned long op, void *module) { - struct btf_module *btf_mod, *tmp; + struct btf_module *btf_mod; struct module *mod = module; struct btf *btf; int err = 0; @@ -5811,11 +5822,9 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, goto out; } - mutex_lock(&btf_module_mutex); btf_mod->module = module; btf_mod->btf = btf; - list_add(&btf_mod->list, &btf_modules); - mutex_unlock(&btf_module_mutex); + list_add_rcu(&btf_mod->list, &btf_modules); if 
(IS_ENABLED(CONFIG_SYSFS)) { struct bin_attribute *attr; @@ -5845,20 +5854,14 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, break; case MODULE_STATE_GOING: - mutex_lock(&btf_module_mutex); - list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) { + list_for_each_entry(btf_mod, &btf_modules, list) { if (btf_mod->module != module) continue; - list_del(&btf_mod->list); - if (btf_mod->sysfs_attr) - sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr); - btf_put(btf_mod->btf); - kfree(btf_mod->sysfs_attr); - kfree(btf_mod); + list_del_rcu(&btf_mod->list); + call_rcu(&btf_mod->rcu, btf_module_free); break; } - mutex_unlock(&btf_module_mutex); break; } out: -- 1.8.3.1
[PATCH v2 bpf-next 3/3] selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf
Verify that specifying a module object id in "struct btf_ptr *" along with a type id of a module-specific type will succeed. veth_stats_rx() is chosen because its function signature consists of a module-specific type "struct veth_stats" and a kernel-specific one "struct net_device". Currently the tests take the messy approach of determining object and type ids for the relevant module/function; __builtin_btf_type_id() supports object ids by returning a 64-bit value, but need to find a good way to determine if that support is present. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 124 + tools/testing/selftests/bpf/progs/bpf_iter.h | 2 +- tools/testing/selftests/bpf/progs/btf_ptr.h| 2 +- tools/testing/selftests/bpf/progs/veth_stats_rx.c | 72 4 files changed, 198 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c new file mode 100644 index 000..89805d7 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c @@ -0,0 +1,124 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "veth_stats_rx.skel.h" + +#define VETH_NAME "bpfveth0" + +/* Demonstrate that bpf_snprintf_btf succeeds for both module-specific + * and kernel-defined data structures; veth_stats_rx() is used as + * it has both module-specific and kernel-defined data as arguments. + * This test assumes that veth is built as a module and will skip if not. 
+ */ +void test_snprintf_btf_mod(void) +{ + struct btf *vmlinux_btf = NULL, *veth_btf = NULL; + struct veth_stats_rx *skel = NULL; + struct veth_stats_rx__bss *bss; + int err, duration = 0; + __u32 id; + + err = system("ip link add name " VETH_NAME " type veth"); + if (CHECK(err, "system", "ip link add veth failed: %d\n", err)) + return; + + vmlinux_btf = btf__parse_raw("/sys/kernel/btf/vmlinux"); + err = libbpf_get_error(vmlinux_btf); + if (CHECK(err, "parse vmlinux BTF", "failed parsing vmlinux BTF: %d\n", + err)) + goto cleanup; + veth_btf = btf__parse_raw_split("/sys/kernel/btf/veth", vmlinux_btf); + err = libbpf_get_error(veth_btf); + if (err == -ENOENT) { + printf("%s:SKIP:no BTF info for veth\n", __func__); + test__skip(); + goto cleanup; + } + + if (CHECK(err, "parse veth BTF", "failed parsing veth BTF: %d\n", err)) + goto cleanup; + + skel = veth_stats_rx__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + goto cleanup; + + err = veth_stats_rx__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + /* This could all be replaced by __builtin_btf_type_id(); but need +* a way to determine if it supports object and type id. In the +* meantime, look up type id for veth_stats and object id for veth. 
+*/ + bss->veth_stats_btf_id = btf__find_by_name(veth_btf, "veth_stats"); + + if (CHECK(bss->veth_stats_btf_id <= 0, "find 'struct veth_stats'", + "could not find 'struct veth_stats' in veth BTF: %d", + bss->veth_stats_btf_id)) + goto cleanup; + + bss->veth_obj_id = 0; + + for (id = 1; bpf_btf_get_next_id(id, &id) == 0; ) { + struct bpf_btf_info info; + __u32 len = sizeof(info); + char name[64]; + int fd; + + fd = bpf_btf_get_fd_by_id(id); + if (fd < 0) + continue; + + memset(&info, 0, sizeof(info)); + info.name_len = sizeof(name); + info.name = (__u64)name; + if (bpf_obj_get_info_by_fd(fd, &info, &len) || + strcmp((char *)info.name, "veth") != 0) + continue; + bss->veth_obj_id = info.id; + } + + if (CHECK(bss->veth_obj_id == 0, "get obj id for veth module", + "could not get veth module id")) + goto cleanup; + + err = veth_stats_rx__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* g
[PATCH v2 bpf-next 2/3] bpf: add module support to btf display helpers
bpf_snprintf_btf and bpf_seq_printf_btf use a "struct btf_ptr *" argument that specifies type information about the type to be displayed. Augment this information to include an object id. If this id is 0, the assumption is that it refers to a core kernel type from vmlinux; otherwise the object id specifies the module the type is in, or if no such id is found in the module list, we fall back to vmlinux. Signed-off-by: Alan Maguire --- include/linux/btf.h| 12 include/uapi/linux/bpf.h | 13 +++-- kernel/bpf/btf.c | 18 + kernel/trace/bpf_trace.c | 44 +++--- tools/include/uapi/linux/bpf.h | 13 +++-- 5 files changed, 77 insertions(+), 23 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 4c200f5..688786a 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -214,6 +214,14 @@ static inline const struct btf_var_secinfo *btf_type_var_secinfo( const char *btf_name_by_offset(const struct btf *btf, u32 offset); struct btf *btf_parse_vmlinux(void); struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES +struct btf *bpf_get_btf_module(__u32 obj_id); +#else +static inline struct btf *bpf_get_btf_module(__u32 obj_id) +{ + return ERR_PTR(-ENOTSUPP); +} +#endif #else static inline const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id) @@ -225,6 +233,10 @@ static inline const char *btf_name_by_offset(const struct btf *btf, { return NULL; } +static inline struct btf *bpf_get_btf_module(__u32 obj_id) +{ + return ERR_PTR(-ENOTSUPP); +} #endif #endif diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 1233f14..ccb75299 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3641,7 +3641,9 @@ struct bpf_stack_build_id { * the pointer data is carried out to avoid kernel crashes during * operation. Smaller types can use string space on the stack; * larger programs can use map data to store the string - * representation. + * representation. 
Module-specific data structures can be + * displayed if the module BTF object id is supplied in the + * *ptr*->obj_id field. * * The string can be subsequently shared with userspace via * bpf_perf_event_output() or ring buffer interfaces. @@ -5115,15 +5117,14 @@ struct bpf_sk_lookup { /* * struct btf_ptr is used for typed pointer representation; the * type id is used to render the pointer data as the appropriate type - * via the bpf_snprintf_btf() helper described above. A flags field - - * potentially to specify additional details about the BTF pointer - * (rather than its mode of display) - is included for future use. - * Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately. + * via the bpf_snprintf_btf() helper described above. The obj_id + * is used to specify an object id (such as a module); if unset + * a core vmlinux type id is assumed. */ struct btf_ptr { void *ptr; __u32 type_id; - __u32 flags;/* BTF ptr flags; unused at present. */ + __u32 obj_id; /* BTF object; vmlinux if 0 */ }; /* diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 333f41c..8ee691e 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5777,6 +5777,24 @@ struct btf_module { return len; } +struct btf *bpf_get_btf_module(__u32 obj_id) +{ + struct btf *btf = ERR_PTR(-ENOENT); + struct btf_module *btf_mod; + + rcu_read_lock(); + list_for_each_entry_rcu(btf_mod, &btf_modules, list) { + if (!btf_mod->btf || obj_id != btf_mod->btf->id) + continue; + + refcount_inc(&btf_mod->btf->refcnt); + btf = btf_mod->btf; + break; + } + rcu_read_unlock(); + return btf; +} + static void btf_module_free(struct rcu_head *rcu) { struct btf_module *btf_mod = container_of(rcu, struct btf_module, rcu); diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 23a390a..66d4120 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -75,8 +75,8 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stack(u64 r1, u64 r2, u64 
r3, u64 r4, u64 r5); static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, - u64 flags, const struct btf **btf, - s32 *btf_id); + u64 flags, struct btf **btf, + bool *btf_is_vmlinux, s32 *btf_id); /** * trace_call_bpf - invoke BPF program @@ -786,15 +786,22 @@ struct bpf_seq_prin
[PATCH v2 bpf-next 0/3] bpf: support module BTF in BTF display helpers
This series aims to add support to bpf_snprintf_btf() and bpf_seq_printf_btf() allowing them to store string representations of module-specific types, as well as the kernel-specific ones they currently support. Patch 1 removes the btf_module_mutex, since we will need to look up module BTF during BPF program execution, and we don't want to risk sleeping in the various contexts in which BPF can run. The access patterns to the BTF module list seem to conform to classic list RCU usage, so with a few minor tweaks this seems workable. Patch 2 replaces the unused flags field in struct btf_ptr with an obj_id field, allowing the specification of the id of a BTF module. If the value is 0, the core kernel vmlinux is assumed to contain the type's BTF information. Otherwise the module with that id is used to identify the type. If the object-id based lookup fails, we again fall back to vmlinux BTF. Patch 3 is a selftest that uses veth (when built as a module) and a kprobe to display both a module-specific and kernel-specific type; both are arguments to veth_stats_rx(). Currently it looks up the module-specific type and object ids using libbpf; in future, these lookups will likely be supported directly in the BPF program via __builtin_btf_type_id(); but I need a good way to determine whether that builtin supports object ids. 
Changes since RFC - add patch to remove module mutex - modify to use obj_id instead of module name as identifier in "struct btf_ptr" (Andrii) Alan Maguire (3): bpf: eliminate btf_module_mutex as RCU synchronization can be used bpf: add module support to btf display helpers selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf include/linux/btf.h| 12 ++ include/uapi/linux/bpf.h | 13 ++- kernel/bpf/btf.c | 49 +--- kernel/trace/bpf_trace.c | 44 ++-- tools/include/uapi/linux/bpf.h | 13 ++- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 124 + tools/testing/selftests/bpf/progs/bpf_iter.h | 2 +- tools/testing/selftests/bpf/progs/btf_ptr.h| 2 +- tools/testing/selftests/bpf/progs/veth_stats_rx.c | 72 9 files changed, 292 insertions(+), 39 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c -- 1.8.3.1
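The obj_id selection order described for patch 2 (0 means vmlinux; a nonzero id selects a module's BTF; an unknown id falls back to vmlinux) can be sketched in plain C. Everything below - struct fake_btf, the id values, lookup_btf() - is an illustrative stand-in, not the kernel's real data structures:

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the lookup order patch 2 describes: obj_id 0 selects the
 * core kernel (vmlinux) BTF, a nonzero obj_id selects a module's BTF,
 * and an id that matches no module falls back to vmlinux. */
struct fake_btf {
	unsigned int id;
	const char *name;
};

static struct fake_btf vmlinux_btf = { 1, "vmlinux" };
static struct fake_btf module_btfs[] = {
	{ 19, "ixgbe" },
	{ 23, "veth" },
};

static struct fake_btf *lookup_btf(unsigned int obj_id)
{
	size_t i;

	if (obj_id == 0)			/* 0 == core kernel vmlinux */
		return &vmlinux_btf;

	for (i = 0; i < sizeof(module_btfs) / sizeof(module_btfs[0]); i++)
		if (module_btfs[i].id == obj_id)
			return &module_btfs[i];

	return &vmlinux_btf;			/* unknown id: fall back */
}
```

In the kernel the module leg of this lookup is the RCU-protected list walk in bpf_get_btf_module(), which also takes a reference on the BTF it returns.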
Re: [RFC bpf-next 1/3] bpf: add module support to btf display helpers
On Sat, 14 Nov 2020, Yonghong Song wrote: > > > On 11/14/20 8:04 AM, Alexei Starovoitov wrote: > > On Fri, Nov 13, 2020 at 10:59 PM Andrii Nakryiko > > wrote: > >> > >> On Fri, Nov 13, 2020 at 10:11 AM Alan Maguire > >> wrote: > >>> > >>> bpf_snprintf_btf and bpf_seq_printf_btf use a "struct btf_ptr *" > >>> argument that specifies type information about the type to > >>> be displayed. Augment this information to include a module > >>> name, allowing such display to support module types. > >>> > >>> Signed-off-by: Alan Maguire > >>> --- > >>> include/linux/btf.h| 8 > >>> include/uapi/linux/bpf.h | 5 - > >>> kernel/bpf/btf.c | 18 ++ > >>> kernel/trace/bpf_trace.c | 42 > >>> -- > >>> tools/include/uapi/linux/bpf.h | 5 - > >>> 5 files changed, 66 insertions(+), 12 deletions(-) > >>> > >>> diff --git a/include/linux/btf.h b/include/linux/btf.h > >>> index 2bf6418..d55ca00 100644 > >>> --- a/include/linux/btf.h > >>> +++ b/include/linux/btf.h > >>> @@ -209,6 +209,14 @@ static inline const struct btf_var_secinfo > >>> *btf_type_var_secinfo( > >>> const struct btf_type *btf_type_by_id(const struct btf *btf, u32 > >>> type_id); > >>> const char *btf_name_by_offset(const struct btf *btf, u32 offset); > >>> struct btf *btf_parse_vmlinux(void); > >>> +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES > >>> +struct btf *bpf_get_btf_module(const char *name); > >>> +#else > >>> +static inline struct btf *bpf_get_btf_module(const char *name) > >>> +{ > >>> + return ERR_PTR(-ENOTSUPP); > >>> +} > >>> +#endif > >>> struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); > >>> #else > >>> static inline const struct btf_type *btf_type_by_id(const struct btf > >>> *btf, > >>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > >>> index 162999b..26978be 100644 > >>> --- a/include/uapi/linux/bpf.h > >>> +++ b/include/uapi/linux/bpf.h > >>> @@ -3636,7 +3636,8 @@ struct bpf_stack_build_id { > >>>* the pointer data is carried out to avoid kernel crashes > >>>during > >>>* 
operation. Smaller types can use string space on the > >>>stack; > >>>* larger programs can use map data to store the string > >>> - * representation. > >>> + * representation. Module-specific data structures can be > >>> + * displayed if the module name is supplied. > >>>* > >>>* The string can be subsequently shared with userspace via > >>>* bpf_perf_event_output() or ring buffer interfaces. > >>> @@ -5076,11 +5077,13 @@ struct bpf_sk_lookup { > >>>* potentially to specify additional details about the BTF pointer > >>>* (rather than its mode of display) - is included for future use. > >>>* Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately. > >>> + * A module name can be specified for module-specific data. > >>> */ > >>> struct btf_ptr { > >>> void *ptr; > >>> __u32 type_id; > >>> __u32 flags;/* BTF ptr flags; unused at present. */ > >>> + const char *module; /* optional module name. */ > >> > >> I think module name is a wrong API here, similarly how type name was > >> wrong API for specifying the type (and thus we use type_id here). > >> Using the module's BTF ID seems like a more suitable interface. That's > >> what I'm going to use for all kinds of existing BPF APIs that expect > >> BTF type to attach BPF programs. > >> > >> Right now, we use only type_id and implicitly know that it's in > >> vmlinux BTF. With module BTFs, we now need a pair of BTF object ID + > >> BTF type ID to uniquely identify the type. vmlinux BTF now can be > >> specified in two different ways: either leaving BTF object ID as zero > >> (for simplicity and backwards compatibility) or specifying it's actual > >> BTF obj ID (which pretty much alwa
[PATCH bpf-next] libbpf: btf__find_by_name[_kind] should use btf__get_nr_types()
When operating on split BTF, btf__find_by_name[_kind] will not iterate over all types since they use btf->nr_types as the number of types to iterate over. For split BTF this is the number of types _on top of base BTF_, so it will underestimate the number of types to iterate over, especially for vmlinux + module BTF, where the latter is much smaller. Use btf__get_nr_types() instead. Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support") Signed-off-by: Alan Maguire --- tools/lib/bpf/btf.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 2d0d064..8ff46cd 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -674,12 +674,12 @@ int btf__resolve_type(const struct btf *btf, __u32 type_id) __s32 btf__find_by_name(const struct btf *btf, const char *type_name) { - __u32 i; + __u32 i, nr_types = btf__get_nr_types(btf); if (!strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= nr_types; i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name = btf__name_by_offset(btf, t->name_off); @@ -693,12 +693,12 @@ __s32 btf__find_by_name(const struct btf *btf, const char *type_name) __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, __u32 kind) { - __u32 i; + __u32 i, nr_types = btf__get_nr_types(btf); if (kind == BTF_KIND_UNKN || !strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= nr_types; i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name; -- 1.8.3.1
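The underestimate this patch fixes comes from how split BTF numbers its types: ids are global and contiguous, base ids come first, and a split BTF's own nr_types counts only the types layered on top of the base. A toy model makes the arithmetic concrete; struct toy_btf and toy_total_types() are hypothetical stand-ins for libbpf's internals, not its real layout:

```c
#include <assert.h>

/* Toy model of split BTF numbering: bounding the name-lookup loop
 * with the split BTF's own nr_types visits far too few ids; the
 * base + split total (what btf__get_nr_types() reports) is the
 * correct bound. */
struct toy_btf {
	unsigned int base_nr;	/* types inherited from base BTF; 0 if none */
	unsigned int nr_types;	/* types this BTF object adds itself */
};

static unsigned int toy_total_types(const struct toy_btf *btf)
{
	return btf->base_nr + btf->nr_types;
}
```

With a vmlinux base of tens of thousands of types and a module adding a few hundred, a loop bounded by the module's own nr_types never even reaches the module's first id.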
[RFC bpf-next 3/3] selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf
Verify that specifying a module name in "struct btf_ptr *" along with a type id of a module-specific type will succeed. veth_stats_rx() is chosen because its function signature consists of a module-specific type "struct veth_stats" and a kernel-specific one "struct net_device". Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 96 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 1 + tools/testing/selftests/bpf/progs/veth_stats_rx.c | 73 3 files changed, 170 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c new file mode 100644 index 000..f1b12df --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c @@ -0,0 +1,96 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "veth_stats_rx.skel.h" + +#define VETH_NAME "bpfveth0" + +/* Demonstrate that bpf_snprintf_btf succeeds for both module-specific + * and kernel-defined data structures; veth_stats_rx() is used as + * it has both module-specific and kernel-defined data as arguments. + * This test assumes that veth is built as a module and will skip if not. 
+ */ +void test_snprintf_btf_mod(void) +{ + struct btf *vmlinux_btf = NULL, *veth_btf = NULL; + struct veth_stats_rx *skel = NULL; + struct veth_stats_rx__bss *bss; + int err, duration = 0; + __u32 id; + + err = system("ip link add name " VETH_NAME " type veth"); + if (CHECK(err, "system", "ip link add veth failed: %d\n", err)) + return; + + vmlinux_btf = btf__parse_raw("/sys/kernel/btf/vmlinux"); + err = libbpf_get_error(vmlinux_btf); + if (CHECK(err, "parse vmlinux BTF", "failed parsing vmlinux BTF: %d\n", + err)) + goto cleanup; + veth_btf = btf__parse_raw_split("/sys/kernel/btf/veth", vmlinux_btf); + err = libbpf_get_error(veth_btf); + if (err == -ENOENT) { + printf("%s:SKIP:no BTF info for veth\n", __func__); + test__skip(); +goto cleanup; + } + + if (CHECK(err, "parse veth BTF", "failed parsing veth BTF: %d\n", err)) + goto cleanup; + + skel = veth_stats_rx__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + goto cleanup; + + err = veth_stats_rx__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + bss->veth_stats_btf_id = btf__find_by_name(veth_btf, "veth_stats"); + + if (CHECK(bss->veth_stats_btf_id <= 0, "find 'struct veth_stats'", + "could not find 'struct veth_stats' in veth BTF: %d", + bss->veth_stats_btf_id)) + goto cleanup; + + err = veth_stats_rx__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate stats event, then delete; this ensures the program +* triggers prior to reading status. +*/ + err = system("ethtool -S " VETH_NAME " > /dev/null"); + if (CHECK(err, "system", "ethtool -S failed: %d\n", err)) + goto cleanup; + + system("ip link delete " VETH_NAME); + + /* +* Make sure veth_stats_rx program was triggered and it set +* expected return values from bpf_trace_printk()s and all +* tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_snprintf_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + system("ip link delete " VETH_NAME ">/dev/null 2>&1"); + if (skel) + veth_stats_rx__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/btf_ptr.h b/tools/testing/s
[RFC bpf-next 1/3] bpf: add module support to btf display helpers
bpf_snprintf_btf and bpf_seq_printf_btf use a "struct btf_ptr *" argument that specifies type information about the type to be displayed. Augment this information to include a module name, allowing such display to support module types. Signed-off-by: Alan Maguire --- include/linux/btf.h| 8 include/uapi/linux/bpf.h | 5 - kernel/bpf/btf.c | 18 ++ kernel/trace/bpf_trace.c | 42 -- tools/include/uapi/linux/bpf.h | 5 - 5 files changed, 66 insertions(+), 12 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 2bf6418..d55ca00 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -209,6 +209,14 @@ static inline const struct btf_var_secinfo *btf_type_var_secinfo( const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); const char *btf_name_by_offset(const struct btf *btf, u32 offset); struct btf *btf_parse_vmlinux(void); +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES +struct btf *bpf_get_btf_module(const char *name); +#else +static inline struct btf *bpf_get_btf_module(const char *name) +{ + return ERR_PTR(-ENOTSUPP); +} +#endif struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); #else static inline const struct btf_type *btf_type_by_id(const struct btf *btf, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 162999b..26978be 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3636,7 +3636,8 @@ struct bpf_stack_build_id { * the pointer data is carried out to avoid kernel crashes during * operation. Smaller types can use string space on the stack; * larger programs can use map data to store the string - * representation. + * representation. Module-specific data structures can be + * displayed if the module name is supplied. * * The string can be subsequently shared with userspace via * bpf_perf_event_output() or ring buffer interfaces. 
@@ -5076,11 +5077,13 @@ struct bpf_sk_lookup { * potentially to specify additional details about the BTF pointer * (rather than its mode of display) - is included for future use. * Display flags - BTF_F_* - are passed to bpf_snprintf_btf separately. + * A module name can be specified for module-specific data. */ struct btf_ptr { void *ptr; __u32 type_id; __u32 flags;/* BTF ptr flags; unused at present. */ + const char *module; /* optional module name. */ }; /* diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 6b2d508..3ddd1fd 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5738,6 +5738,24 @@ struct btf_module { static LIST_HEAD(btf_modules); static DEFINE_MUTEX(btf_module_mutex); +struct btf *bpf_get_btf_module(const char *name) +{ + struct btf *btf = ERR_PTR(-ENOENT); + struct btf_module *btf_mod, *tmp; + + mutex_lock(&btf_module_mutex); + list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) { + if (!btf_mod->btf || strcmp(name, btf_mod->btf->name) != 0) + continue; + + refcount_inc(&btf_mod->btf->refcnt); + btf = btf_mod->btf; + break; + } + mutex_unlock(&btf_module_mutex); + return btf; +} + static ssize_t btf_module_read(struct file *file, struct kobject *kobj, struct bin_attribute *bin_attr, diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index cfce60a..a4d5a26 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -73,8 +73,7 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, - u64 flags, const struct btf **btf, - s32 *btf_id); + u64 flags, struct btf **btf, s32 *btf_id); /** * trace_call_bpf - invoke BPF program @@ -784,7 +783,7 @@ struct bpf_seq_printf_buf { BPF_CALL_4(bpf_seq_printf_btf, struct seq_file *, m, struct btf_ptr *, ptr, u32, btf_ptr_size, u64, flags) { - const struct btf *btf; + struct btf *btf; s32 btf_id; int ret; @@ -792,7 
+791,11 @@ struct bpf_seq_printf_buf { if (ret) return ret; - return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); + ret = btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); + if (btf_ptr_size == sizeof(struct btf_ptr) && ptr->module) + btf_put(btf); + + return ret; } static const struct bpf_func_proto bpf_seq_printf_btf_proto = { @@ -1199,18 +1202,33 @@ static bool bpf_d_path_allowed(const struct bpf_pro
[RFC bpf-next 2/3] libbpf: btf__find_by_name[_kind] should use btf__get_nr_types()
When operating on split BTF, btf__find_by_name[_kind] will not iterate over all types since they use btf->nr_types as the number of types to iterate over. For split BTF this is the number of types _on top of base BTF_, so it will underestimate the number of types to iterate over, especially for vmlinux + module BTF, where the latter is much smaller. Use btf__get_nr_types() instead. Signed-off-by: Alan Maguire --- tools/lib/bpf/btf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 2d0d064..0fccf4b 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -679,7 +679,7 @@ __s32 btf__find_by_name(const struct btf *btf, const char *type_name) if (!strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= btf__get_nr_types(btf); i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name = btf__name_by_offset(btf, t->name_off); @@ -698,7 +698,7 @@ __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name, if (kind == BTF_KIND_UNKN || !strcmp(type_name, "void")) return 0; - for (i = 1; i <= btf->nr_types; i++) { + for (i = 1; i <= btf__get_nr_types(btf); i++) { const struct btf_type *t = btf__type_by_id(btf, i); const char *name; -- 1.8.3.1
[RFC bpf-next 0/3] bpf: support module BTF in btf display helpers
This series aims to add support to bpf_snprintf_btf() and bpf_seq_printf_btf() allowing them to store string representations of module-specific types, as well as the kernel-specific ones they currently support. Patch 1 adds an additional field "const char *module" to "struct btf_ptr", allowing the specification of a module name along with a data pointer, BTF id, etc. It is then used to look up module BTF, rather than the default vmlinux BTF. Patch 2 makes a small fix to libbpf to allow btf__find_by_name[_kind] to work with split BTF. Without this fix, looking up the type id of a module-specific type will fail in patch 3. Patch 3 is a selftest that uses veth (when built as a module) and a kprobe to display both a module-specific and kernel-specific type; both are arguments to veth_stats_rx(). Alan Maguire (3): bpf: add module support to btf display helpers libbpf: btf__find_by_name[_kind] should use btf__get_nr_types() selftests/bpf: verify module-specific types can be shown via bpf_snprintf_btf include/linux/btf.h| 8 ++ include/uapi/linux/bpf.h | 5 +- kernel/bpf/btf.c | 18 kernel/trace/bpf_trace.c | 42 +++--- tools/include/uapi/linux/bpf.h | 5 +- tools/lib/bpf/btf.c| 4 +- .../selftests/bpf/prog_tests/snprintf_btf_mod.c| 96 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 1 + tools/testing/selftests/bpf/progs/veth_stats_rx.c | 73 9 files changed, 238 insertions(+), 14 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_mod.c create mode 100644 tools/testing/selftests/bpf/progs/veth_stats_rx.c -- 1.8.3.1
Re: [PATCH bpf-next 5/5] tools/bpftool: add support for in-kernel and named BTF in `btf show`
On Thu, 5 Nov 2020, Andrii Nakryiko wrote: > Display vmlinux BTF name and kernel module names when listing available BTFs > on the system. > > In human-readable output mode, module BTFs are reported with "name > [module-name]", while vmlinux BTF will be reported as "name [vmlinux]". > Square brackets are added by bpftool and follow kernel convention when > displaying modules in human-readable text outputs. > I had a go at testing this and all looks good, but I was curious if "bpftool btf dump" is expected to work with module BTF? I see the various modules in /sys/kernel/btf, but if I run: # bpftool btf dump file /sys/kernel/btf/ixgbe Error: failed to load BTF from /sys/kernel/btf/ixgbe: Invalid argument ...while it still works for vmlinux: # bpftool btf dump file /sys/kernel/btf/vmlinux [1] INT '(anon)' size=4 bits_offset=0 nr_bits=32 encoding=(none) [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none) ... "bpftool btf show" works for ixgbe: # bpftool btf show|grep ixgbe 19: name [ixgbe] size 182074B Is this perhaps not expected to work yet? (I updated pahole to the latest changes etc and BTF generation seemed to work fine for modules during kernel build). For the "bpftool btf show" functionality, feel free to add Tested-by: Alan Maguire Thanks! Alan
Re: [PATCH 5.8 574/633] selftests/bpf: Fix overflow tests to reflect iter size increase
On Tue, 27 Oct 2020, Greg Kroah-Hartman wrote: > From: Alan Maguire > > [ Upstream commit eb58bbf2e5c7917aa30bf8818761f26bbeeb2290 ] > > bpf iter size increase to PAGE_SIZE << 3 means overflow tests assuming > page size need to be bumped also. > Alexei can correct me if I've got this wrong but I don't believe it's a stable backport candidate. This selftests change should only be relevant when the BPF iterator size has been bumped up as it was in af65320 bpf: Bump iter seq size to support BTF representation of large data structures ...so I don't _think_ this commit belongs in stable unless the above commit is backported also (and unless I'm missing something I don't see a burning reason to do that currently). Backporting this alone will likely induce bpf test failures. Apologies if the "Fix" in the title was misleading; it should probably have been "Update" to reflect the fact it's not fixing an existing bug but rather updating the test to operate correctly in the context of other changes in the for-next patch series it was part of. Thanks! Alan
[PATCH bpf-next 0/2] selftests/bpf: BTF-based kernel data display fixes
Resolve issues introduced in the bpf selftests by the BTF-based kernel data display selftests; these are: - a warning introduced in snprintf_btf.c; and - compilation failures with an old kernel's vmlinux.h Alan Maguire (2): selftests/bpf: fix unused-result warning in snprintf_btf.c selftests/bpf: ensure snprintf_btf/bpf_iter tests compatibility with old vmlinux.h .../selftests/bpf/prog_tests/snprintf_btf.c| 2 +- tools/testing/selftests/bpf/progs/bpf_iter.h | 23 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 27 ++ .../selftests/bpf/progs/netif_receive_skb.c| 2 +- 4 files changed, 52 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/bpf/progs/btf_ptr.h -- 1.8.3.1
[PATCH bpf-next 2/2] selftests/bpf: ensure snprintf_btf/bpf_iter tests compatibility with old vmlinux.h
Andrii reports that bpf selftests relying on "struct btf_ptr" and BTF_F_* values will not build as vmlinux.h for older kernels will not include "struct btf_ptr" or the BTF_F_* enum values. Undefine and redefine them to work around this. Fixes: b72091bd4ee4 ("selftests/bpf: Add test for bpf_seq_printf_btf helper") Fixes: 076a95f5aff2 ("selftests/bpf: Add bpf_snprintf_btf helper tests") Reported-by: Andrii Nakryiko Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/progs/bpf_iter.h | 23 ++ tools/testing/selftests/bpf/progs/btf_ptr.h| 27 ++ .../selftests/bpf/progs/netif_receive_skb.c| 2 +- 3 files changed, 51 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/progs/btf_ptr.h diff --git a/tools/testing/selftests/bpf/progs/bpf_iter.h b/tools/testing/selftests/bpf/progs/bpf_iter.h index df682af..6a12554 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter.h +++ b/tools/testing/selftests/bpf/progs/bpf_iter.h @@ -14,6 +14,11 @@ #define bpf_iter__bpf_map_elem bpf_iter__bpf_map_elem___not_used #define bpf_iter__bpf_sk_storage_map bpf_iter__bpf_sk_storage_map___not_used #define bpf_iter__sockmap bpf_iter__sockmap___not_used +#define btf_ptr btf_ptr___not_used +#define BTF_F_COMPACT BTF_F_COMPACT___not_used +#define BTF_F_NONAME BTF_F_NONAME___not_used +#define BTF_F_PTR_RAW BTF_F_PTR_RAW___not_used +#define BTF_F_ZERO BTF_F_ZERO___not_used #include "vmlinux.h" #undef bpf_iter_meta #undef bpf_iter__bpf_map @@ -28,6 +33,11 @@ #undef bpf_iter__bpf_map_elem #undef bpf_iter__bpf_sk_storage_map #undef bpf_iter__sockmap +#undef btf_ptr +#undef BTF_F_COMPACT +#undef BTF_F_NONAME +#undef BTF_F_PTR_RAW +#undef BTF_F_ZERO struct bpf_iter_meta { struct seq_file *seq; @@ -105,3 +115,16 @@ struct bpf_iter__sockmap { void *key; struct sock *sk; }; + +struct btf_ptr { + void *ptr; + __u32 type_id; + __u32 flags; +}; + +enum { + BTF_F_COMPACT = (1ULL << 0), + BTF_F_NONAME= (1ULL << 1), + BTF_F_PTR_RAW = (1ULL << 2), + BTF_F_ZERO = (1ULL << 3), +}; diff 
--git a/tools/testing/selftests/bpf/progs/btf_ptr.h b/tools/testing/selftests/bpf/progs/btf_ptr.h new file mode 100644 index 000..c3c9797 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/btf_ptr.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +/* "undefine" structs in vmlinux.h, because we "override" them below */ +#define btf_ptr btf_ptr___not_used +#define BTF_F_COMPACT BTF_F_COMPACT___not_used +#define BTF_F_NONAME BTF_F_NONAME___not_used +#define BTF_F_PTR_RAW BTF_F_PTR_RAW___not_used +#define BTF_F_ZERO BTF_F_ZERO___not_used +#include "vmlinux.h" +#undef btf_ptr +#undef BTF_F_COMPACT +#undef BTF_F_NONAME +#undef BTF_F_PTR_RAW +#undef BTF_F_ZERO + +struct btf_ptr { + void *ptr; + __u32 type_id; + __u32 flags; +}; + +enum { + BTF_F_COMPACT = (1ULL << 0), + BTF_F_NONAME= (1ULL << 1), + BTF_F_PTR_RAW = (1ULL << 2), + BTF_F_ZERO = (1ULL << 3), +}; diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c index b873d80..6b67003 100644 --- a/tools/testing/selftests/bpf/progs/netif_receive_skb.c +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright (c) 2020, Oracle and/or its affiliates. */ -#include "vmlinux.h" +#include "btf_ptr.h" #include #include #include -- 1.8.3.1
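The define-before-include/undef/redefine pattern the patch applies to vmlinux.h collisions can be demonstrated standalone. In this sketch a struct declared under the renamed tag stands in for what vmlinux.h might declare:

```c
#include <stddef.h>

/* Rename the colliding identifier before the header's definition is
 * seen, then undef the macro and supply our own definition. */
#define btf_ptr btf_ptr___not_used

/* --- stand-in for the definition that would come from vmlinux.h --- */
struct btf_ptr {		/* really declares struct btf_ptr___not_used */
	void *ptr;
};
/* ------------------------------------------------------------------ */

#undef btf_ptr

/* Our own definition; its tag no longer collides with the header's. */
struct btf_ptr {
	void *ptr;
	unsigned int type_id;
	unsigned int flags;
};
```

Because the macro only rewrites the identifier, both struct tags end up defined and distinct, so code can use the local, possibly newer, layout even against an older vmlinux.h.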
[PATCH bpf-next 1/2] selftests/bpf: fix unused-result warning in snprintf_btf.c
Daniel reports: +system("ping -c 1 127.0.0.1 > /dev/null"); This generates the following new warning when compiling BPF selftests: [...] EXT-OBJ [test_progs] cgroup_helpers.o EXT-OBJ [test_progs] trace_helpers.o EXT-OBJ [test_progs] network_helpers.o EXT-OBJ [test_progs] testing_helpers.o TEST-OBJ [test_progs] snprintf_btf.test.o /root/bpf-next/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c: In function ‘test_snprintf_btf’: /root/bpf-next/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c:30:2: warning: ignoring return value of ‘system’, declared with attribute warn_unused_result [-Wunused-result] system("ping -c 1 127.0.0.1 > /dev/null"); ^ [...] Fixes: 076a95f5aff2 ("selftests/bpf: Add bpf_snprintf_btf helper tests") Reported-by: Daniel Borkmann Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/snprintf_btf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c index 3a8ecf8..3c63a70 100644 --- a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c @@ -27,7 +27,7 @@ void test_snprintf_btf(void) goto cleanup; /* generate receive event */ - system("ping -c 1 127.0.0.1 > /dev/null"); + (void) system("ping -c 1 127.0.0.1 > /dev/null"); if (bss->skip) { printf("%s:SKIP:no __builtin_btf_type_id\n", __func__); -- 1.8.3.1
Re: [PATCH v6 bpf-next 6/6] selftests/bpf: add test for bpf_seq_printf_btf helper
On Thu, 24 Sep 2020, Alexei Starovoitov wrote: > to whatever number, but printing single task_struct needs ~800 lines and > ~18kbytes. Humans can scroll through that much spam, but can we make it less > verbose by default somehow? > May be not in this patch set, but in the follow up? > One approach that might work would be to devote 4 bits or so of flag space to a "maximum depth" specifier; i.e. at depth 1, only base types are displayed, no aggregate types like arrays, structs and unions. We've already got depth processing in the code to figure out if possibly zeroed nested data needs to be displayed, so it should hopefully be a simple follow-up. One way to express it would be to use "..." to denote field(s) were omitted. We could even use the number of "."s to denote cases where multiple fields were omitted, giving a visual sense of how much data was omitted. So for example with BTF_F_MAX_DEPTH(1), task_struct looks like this: (struct task_struct){ .state = ()1, .stack = ( *)0x029d1e6f, ... .flags = (unsigned int)4194560, ... .cpu = (unsigned int)36, .wakee_flips = (unsigned int)11, .wakee_flip_decay_ts = (long unsigned int)4294914874, .last_wakee = (struct task_struct *)0x6c7dfe6d, .recent_used_cpu = (int)19, .wake_cpu = (int)36, .prio = (int)120, .static_prio = (int)120, .normal_prio = (int)120, .sched_class = (struct sched_class *)0xad1561e6, ... .exec_start = (u64)674402577156, .sum_exec_runtime = (u64)5009664110, .vruntime = (u64)167038057, .prev_sum_exec_runtime = (u64)5009578167, .nr_migrations = (u64)54, .depth = (int)1, .parent = (struct sched_entity *)0xcba60e7d, .cfs_rq = (struct cfs_rq *)0x14f353ed, ... ...etc. What do you think? 
> > +SEC("iter/task") > > +int dump_task_fs_struct(struct bpf_iter__task *ctx) > > +{ > > + static const char fs_type[] = "struct fs_struct"; > > + struct seq_file *seq = ctx->meta->seq; > > + struct task_struct *task = ctx->task; > > + struct fs_struct *fs = (void *)0; > > + static struct btf_ptr ptr = { }; > > + long ret; > > + > > + if (task) > > + fs = task->fs; > > + > > + ptr.type = fs_type; > > + ptr.ptr = fs; > > imo the following is better: >ptr.type_id = __builtin_btf_type_id(*fs, 1); >ptr.ptr = fs; > I'm still seeing lookup failures using __builtin_btf_type_id(,1) - whereas both __builtin_btf_type_id(,0) and Andrii's suggestion of bpf_core_type_id_kernel() work. Not sure what's going on - pahole is v1.17, clang is clang version 12.0.0 (/mnt/src/llvm-project/clang 7ab7b979d29e1e43701cf690f5cf1903740f50e3) > > + > > + if (ctx->meta->seq_num == 0) > > + BPF_SEQ_PRINTF(seq, "Raw BTF fs_struct per task\n"); > > + > > + ret = bpf_seq_printf_btf(seq, &ptr, sizeof(ptr), 0); > > + switch (ret) { > > + case 0: > > + tasks++; > > + break; > > + case -ERANGE: > > + /* NULL task or task->fs, don't count it as an error. */ > > + break; > > + default: > > + seq_err = ret; > > + break; > > + } > > Please add handling of E2BIG to this switch. Otherwise > printing large amount of tiny structs will overflow PAGE_SIZE and E2BIG > will be send to user space. > Like this: > @@ -40,6 +40,8 @@ int dump_task_fs_struct(struct bpf_iter__task *ctx) > case -ERANGE: > /* NULL task or task->fs, don't count it as an error. */ > break; > + case -E2BIG: > + return 1; > Done. 
> Also please change bpf_seq_read() like this: > diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c > index 30833bbf3019..8f10e30ea0b0 100644 > --- a/kernel/bpf/bpf_iter.c > +++ b/kernel/bpf/bpf_iter.c > @@ -88,8 +88,8 @@ static ssize_t bpf_seq_read(struct file *file, char __user > *buf, size_t size, > mutex_lock(&seq->lock); > > if (!seq->buf) { > - seq->size = PAGE_SIZE; > - seq->buf = kmalloc(seq->size, GFP_KERNEL); > + seq->size = PAGE_SIZE << 3; > + seq->buf = kvmalloc(seq->size, GFP_KERNEL); > > So users can print task_struct by default. > Hopefully we will figure out how to deal with spam later. > Thanks for all the help and suggestions! I didn't want to attribute the patch bumping seq size in v7 to you without your permission, but it's all your work so if I need to respin let me know if you'd like me to fix that. Thanks again! Alan
[PATCH v7 bpf-next 8/8] selftests/bpf: add test for bpf_seq_printf_btf helper
Add a test verifying iterating over tasks and displaying BTF representation of task_struct succeeds. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 74 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 50 +++ 2 files changed, 124 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index ad9de13..af15630 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -7,6 +7,7 @@ #include "bpf_iter_task.skel.h" #include "bpf_iter_task_stack.skel.h" #include "bpf_iter_task_file.skel.h" +#include "bpf_iter_task_btf.skel.h" #include "bpf_iter_tcp4.skel.h" #include "bpf_iter_tcp6.skel.h" #include "bpf_iter_udp4.skel.h" @@ -167,6 +168,77 @@ static void test_task_file(void) bpf_iter_task_file__destroy(skel); } +#define TASKBUFSZ 32768 + +static char taskbuf[TASKBUFSZ]; + +static void do_btf_read(struct bpf_iter_task_btf *skel) +{ + struct bpf_program *prog = skel->progs.dump_task_struct; + struct bpf_iter_task_btf__bss *bss = skel->bss; + int iter_fd = -1, len = 0, bufleft = TASKBUFSZ; + struct bpf_link *link; + char *buf = taskbuf; + + link = bpf_program__attach_iter(prog, NULL); + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n")) + goto free_link; + + do { + len = read(iter_fd, buf, bufleft); + if (len > 0) { + buf += len; + bufleft -= len; + } + } while (len > 0); + + if (bss->skip) { + printf("%s:SKIP:no __builtin_btf_type_id\n", __func__); + test__skip(); + goto free_link; + } + + if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + goto free_link; + + CHECK(strstr(taskbuf, "(struct task_struct)") == NULL, + "check for btf representation of 
task_struct in iter data", + "struct task_struct not found"); +free_link: + if (iter_fd > 0) + close(iter_fd); + bpf_link__destroy(link); +} + +static void test_task_btf(void) +{ + struct bpf_iter_task_btf__bss *bss; + struct bpf_iter_task_btf *skel; + + skel = bpf_iter_task_btf__open_and_load(); + if (CHECK(!skel, "bpf_iter_task_btf__open_and_load", + "skeleton open_and_load failed\n")) + return; + + bss = skel->bss; + + do_btf_read(skel); + + if (CHECK(bss->tasks == 0, "check if iterated over tasks", + "no task iteration, did BPF program run?\n")) + goto cleanup; + + CHECK(bss->seq_err != 0, "check for unexpected err", + "bpf_seq_printf_btf returned %ld", bss->seq_err); + +cleanup: + bpf_iter_task_btf__destroy(skel); +} + static void test_tcp4(void) { struct bpf_iter_tcp4 *skel; @@ -957,6 +1029,8 @@ void test_bpf_iter(void) test_task_stack(); if (test__start_subtest("task_file")) test_task_file(); + if (test__start_subtest("task_btf")) + test_task_btf(); if (test__start_subtest("tcp4")) test_tcp4(); if (test__start_subtest("tcp6")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c new file mode 100644 index 000..a1ddc36 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "bpf_iter.h" +#include +#include +#include + +#include + +char _license[] SEC("license") = "GPL"; + +long tasks = 0; +long seq_err = 0; +bool skip = false; + +SEC("iter/task") +int dump_task_struct(struct bpf_iter__task *ctx) +{ + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + static struct btf_ptr ptr = { }; + long ret; + +#if __has_builtin(__builtin_btf_type_id) + ptr.type_id = bpf_core_type_id_kernel(struct task_struct); + ptr.ptr = task; + + if (ctx->meta->seq_num == 0) + BPF_SEQ_PRINTF(seq, "Raw BTF
[PATCH v7 bpf-next 5/8] bpf: bump iter seq size to support BTF representation of large data structures
BPF iter size is limited to PAGE_SIZE; if we wish to display BTF-based representations of larger kernel data structures such as task_struct, this will be insufficient. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- kernel/bpf/bpf_iter.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c index 30833bb..8f10e30 100644 --- a/kernel/bpf/bpf_iter.c +++ b/kernel/bpf/bpf_iter.c @@ -88,8 +88,8 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size, mutex_lock(&seq->lock); if (!seq->buf) { - seq->size = PAGE_SIZE; - seq->buf = kmalloc(seq->size, GFP_KERNEL); + seq->size = PAGE_SIZE << 3; + seq->buf = kvmalloc(seq->size, GFP_KERNEL); if (!seq->buf) { err = -ENOMEM; goto done; -- 1.8.3.1
[PATCH v7 bpf-next 7/8] bpf: add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_snprintf_btf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- include/linux/btf.h| 2 ++ include/uapi/linux/bpf.h | 9 + kernel/bpf/btf.c | 4 ++-- kernel/bpf/core.c | 1 + kernel/trace/bpf_trace.c | 33 + tools/include/uapi/linux/bpf.h | 9 + 6 files changed, 56 insertions(+), 2 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 3e5cdc2..024e16f 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -68,6 +68,8 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags); /* * Copy len bytes of string representation of obj of BTF type_id into buf. diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index fcafe80..82817c4 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3623,6 +3623,14 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type_id as per bpf_snprintf_btf(). + * *flags* are identical to those used for bpf_snprintf_btf. + * Return + * 0 on success or a negative error in case of failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3775,6 +3783,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(snprintf_btf), \ + FN(seq_printf_btf), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index be5acf6..99e307a 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5346,8 +5346,8 @@ static void btf_seq_show(struct btf_show *show, const char *fmt, seq_vprintf((struct seq_file *)show->target, fmt, args); } -static int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, - void *obj, struct seq_file *m, u64 flags) +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, + void *obj, struct seq_file *m, u64 flags) { struct btf_show sseq; diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 403fb23..c4ba45f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2217,6 +2217,7 @@ void bpf_user_rnd_init_once(void) const struct bpf_func_proto bpf_get_local_storage_proto __weak; const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak; const struct bpf_func_proto bpf_snprintf_btf_proto __weak; +const struct bpf_func_proto bpf_seq_printf_btf_proto __weak; const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void) { diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 983cbd3..6ac254e 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -71,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); +static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id); + /** * trace_call_bpf - invoke BPF program * @call: tracepoint event @@ -776,6 +780,31 @@ struct bpf_seq_printf_buf { .arg3_type = ARG_CONST_SIZE_OR_ZERO, }; +BPF_CALL_4(bpf_seq_printf_btf, struct 
seq_file *, m, struct btf_ptr *, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf *btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + + return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); +} + +static const struct bpf_func_proto bpf_seq_printf_btf_proto = { + .func = bpf_seq_printf_btf,
[PATCH v7 bpf-next 6/8] selftests/bpf: fix overflow tests to reflect iter size increase
bpf iter size increase to PAGE_SIZE << 3 means overflow tests assuming page size need to be bumped also. Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index fe1a83b9..ad9de13 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -352,7 +352,7 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) struct bpf_map_info map_info = {}; struct bpf_iter_test_kern4 *skel; struct bpf_link *link; - __u32 page_size; + __u32 iter_size; char *buf; skel = bpf_iter_test_kern4__open(); @@ -374,19 +374,19 @@ static void test_overflow(bool test_e2big_overflow, bool ret1) "map_creation failed: %s\n", strerror(errno))) goto free_map1; - /* bpf_seq_printf kernel buffer is one page, so one map + /* bpf_seq_printf kernel buffer is 8 pages, so one map * bpf_seq_write will mostly fill it, and the other map * will partially fill and then trigger overflow and need * bpf_seq_read restart. */ - page_size = sysconf(_SC_PAGE_SIZE); + iter_size = sysconf(_SC_PAGE_SIZE) << 3; if (test_e2big_overflow) { - skel->rodata->print_len = (page_size + 8) / 8; - expected_read_len = 2 * (page_size + 8); + skel->rodata->print_len = (iter_size + 8) / 8; + expected_read_len = 2 * (iter_size + 8); } else if (!ret1) { - skel->rodata->print_len = (page_size - 8) / 8; - expected_read_len = 2 * (page_size - 8); + skel->rodata->print_len = (iter_size - 8) / 8; + expected_read_len = 2 * (iter_size - 8); } else { skel->rodata->print_len = 1; expected_read_len = 2 * 8; -- 1.8.3.1
[PATCH v7 bpf-next 4/8] selftests/bpf: add bpf_snprintf_btf helper tests
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program. Tests are skipped if __builtin_btf_type_id is not available to retrieve BTF type ids. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf.c| 60 + .../selftests/bpf/progs/netif_receive_skb.c| 249 + 2 files changed, 309 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c new file mode 100644 index 000..3a8ecf8 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include "netif_receive_skb.skel.h" + +/* Demonstrate that bpf_snprintf_btf succeeds and that various data types + * are formatted correctly. + */ +void test_snprintf_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 > /dev/null"); + + if (bss->skip) { + printf("%s:SKIP:no __builtin_btf_type_id\n", __func__); + test__skip(); + goto cleanup; + } + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_snprintf_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..b873d80 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,249 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include "vmlinux.h" +#include +#include +#include + +#include + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; +bool skip = false; + +#define STRSIZE2048 +#define EXPECTED_STRSIZE 256 + +#ifndef ARRAY_SIZE +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#endif + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, char[STRSIZE]); +} strdata SEC(".maps"); + +static int __strncmp(const void *m1, const void *m2, size_t len) +{ + const unsigned char *s1 = m1; + const unsigned char *s2 = m2; + int i, delta = 0; + + for (i = 0; i < len; i++) { + delta = s1[i] - s2[i]; + if (delta || s1[i] == 0 || s2[i] == 0) + break; + } + return delta; +} + +#if __has_builtin(__builtin_btf_type_id) +#defineTEST_BTF(_str, _type, _flags, _expected, ...) \ + do {\ + static const char _expectedval[EXPECTED_STRSIZE] = \ + _expected; \ + static const char _ptrtype[64] = #_type;\ + __u64 _hflags = _flags | BTF_F_COMPACT; \ + static _type _ptrdata = __VA_ARGS__;\ + static struct btf_ptr _ptr = { }; \ + int _cmp; \ +
[PATCH v7 bpf-next 2/8] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 1007 +-- 2 files changed, 941 insertions(+), 102 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index a9af5e7..d0f5d3c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 5d3c36e..be5acf6 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -284,6 +284,91 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 32 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* Type name size */ +#define BTF_SHOW_NAME_SIZE 80 + +/* + * Common data to all BTF show operations. 
Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. See btf_type_show() below. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by show->state.depth_check being set, and if we + * encounter a non-zero value we set show->state.depth_to_show to + * the depth at which we encountered it. When we have completed the + * first pass, we will know if anything needs to be displayed if + * depth_to_show > depth. Se
[PATCH v7 bpf-next 3/8] bpf: add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); struct btf_ptr * specifies - a pointer to the data to be traced - the BTF id of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed. For example a BPF program with a struct sk_buff *skb could do the following: static struct btf_ptr b = { }; b.ptr = skb; b.type_id = __builtin_btf_type_id(struct sk_buff, 1); bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0); Default output looks like this: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x7524fd8b, .data = (unsigned char *)0x7524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags modifying display are as follows: - BTF_F_COMPACT:no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW:show raw (unobfuscated) pointer values; equivalent to %px. 
- BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 +++--- include/uapi/linux/bpf.h | 67 ++ kernel/bpf/core.c | 1 + kernel/bpf/helpers.c | 4 +++ kernel/trace/bpf_trace.c | 65 scripts/bpf_helpers_doc.py | 2 ++ tools/include/uapi/linux/bpf.h | 67 ++ 8 files changed, 212 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 2eae3f3..1d020d8 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1810,6 +1810,7 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_snprintf_btf_proto; const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/btf.h b/include/linux/btf.h index d0f5d3c..3e5cdc2 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -59,10 +60,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_F_COMPACT +#define BTF_SHOW_NONAMEBTF_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_F_ZERO #define BTF_SHOW_UNSAFE(1ULL << 4) void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 2d6519a..fcafe80 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3587,6 +3587,42 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type_id. This value should specify the type + * that *ptr*->ptr points to. LLVM __builtin_btf_type_id(type, 1) + * can be used to look up vmlinux BTF type ids. Traversing the + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + *
[PATCH v7 bpf-next 0/8] bpf: add helpers to support BTF-based kernel data display
ersion of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. - added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (8): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings bpf: add bpf_snprintf_btf helper selftests/bpf: add bpf_snprintf_btf helper tests bpf: bump iter seq size to support BTF representation of large data structures selftests/bpf: fix overflow tests to reflect iter size increase bpf: add bpf_seq_printf_btf helper selftests/bpf: add test for bpf_seq_printf_btf helper include/linux/bpf.h|3 + include/linux/btf.h| 39 + include/uapi/linux/bpf.h | 76 ++ kernel/bpf/bpf_iter.c |4 +- kernel/bpf/btf.c | 1007 ++-- kernel/bpf/core.c |2 + kernel/bpf/helpers.c |4 + kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 98 ++ scripts/bpf_helpers_doc.py |2 + tools/include/uapi/linux/bpf.h | 76 ++ tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 88 +- .../selftests/bpf/prog_tests/snprintf_btf.c| 60 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 50 + .../selftests/bpf/progs/netif_receive_skb.c| 249 + 15 files changed, 1659 insertions(+), 117 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf.c create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c -- 1.8.3.1
[PATCH v7 bpf-next 1/8] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 7990232..2eae3f3 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1355,6 +1355,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b25ba98..686f6a9 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11517,6 +11517,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11550,12 +11561,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
[PATCH v6 bpf-next 6/6] selftests/bpf: add test for bpf_seq_printf_btf helper
Add a test verifying iterating over tasks and displaying BTF representation of data succeeds. Note here that we do not display the task_struct itself, as it will overflow the PAGE_SIZE limit on seq data; instead we write task->fs (a struct fs_struct). Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 2 files changed, 115 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index fe1a83b9..323c48a 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -7,6 +7,7 @@ #include "bpf_iter_task.skel.h" #include "bpf_iter_task_stack.skel.h" #include "bpf_iter_task_file.skel.h" +#include "bpf_iter_task_btf.skel.h" #include "bpf_iter_tcp4.skel.h" #include "bpf_iter_tcp6.skel.h" #include "bpf_iter_udp4.skel.h" @@ -167,6 +168,69 @@ static void test_task_file(void) bpf_iter_task_file__destroy(skel); } +#define FSBUFSZ8192 + +static char fsbuf[FSBUFSZ]; + +static void do_btf_read(struct bpf_program *prog) +{ + int iter_fd = -1, len = 0, bufleft = FSBUFSZ; + struct bpf_link *link; + char *buf = fsbuf; + + link = bpf_program__attach_iter(prog, NULL); + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n")) + goto free_link; + + do { + len = read(iter_fd, buf, bufleft); + if (len > 0) { + buf += len; + bufleft -= len; + } + } while (len > 0); + + if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + goto free_link; + + CHECK(strstr(fsbuf, "(struct fs_struct)") == NULL, + "check for btf representation of fs_struct in iter data", + "struct fs_struct not found"); +free_link: + if (iter_fd > 0) + 
close(iter_fd); + bpf_link__destroy(link); +} + +static void test_task_btf(void) +{ + struct bpf_iter_task_btf__bss *bss; + struct bpf_iter_task_btf *skel; + + skel = bpf_iter_task_btf__open_and_load(); + if (CHECK(!skel, "bpf_iter_task_btf__open_and_load", + "skeleton open_and_load failed\n")) + return; + + bss = skel->bss; + + do_btf_read(skel->progs.dump_task_fs_struct); + + if (CHECK(bss->tasks == 0, "check if iterated over tasks", + "no task iteration, did BPF program run?\n")) + goto cleanup; + + CHECK(bss->seq_err != 0, "check for unexpected err", + "bpf_seq_printf_btf returned %ld", bss->seq_err); + +cleanup: + bpf_iter_task_btf__destroy(skel); +} + static void test_tcp4(void) { struct bpf_iter_tcp4 *skel; @@ -957,6 +1021,8 @@ void test_bpf_iter(void) test_task_stack(); if (test__start_subtest("task_file")) test_task_file(); + if (test__start_subtest("task_btf")) + test_task_btf(); if (test__start_subtest("tcp4")) test_tcp4(); if (test__start_subtest("tcp6")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c new file mode 100644 index 000..88631a8 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "bpf_iter.h" +#include +#include +#include + +char _license[] SEC("license") = "GPL"; + +long tasks = 0; +long seq_err = 0; + +/* struct task_struct's BTF representation will overflow PAGE_SIZE so cannot + * be used here; instead dump a structure associated with each task. + */ +SEC("iter/task") +int dump_task_fs_struct(struct bpf_iter__task *ctx) +{ + static const char fs_type[] = "struct fs_struct"; + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + struct fs_struct *fs = (void *)0; + static struct btf_ptr ptr = { }; + long ret; + + if (task) + fs = task->fs; + + ptr.type = fs_type; + ptr.ptr = fs; + + if (ctx->meta->
[PATCH v6 bpf-next 1/6] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index fc5c901..049e50f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1340,6 +1340,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 15ab889b..092ffd6 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11488,6 +11488,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11521,12 +11532,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
[PATCH v6 bpf-next 5/6] bpf: add bpf_seq_printf_btf helper
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_snprintf_btf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- include/linux/btf.h| 2 ++ include/uapi/linux/bpf.h | 10 ++ kernel/bpf/btf.c | 4 ++-- kernel/bpf/core.c | 1 + kernel/trace/bpf_trace.c | 33 + tools/include/uapi/linux/bpf.h | 10 ++ 6 files changed, 58 insertions(+), 2 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 3e5cdc2..024e16f 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -68,6 +68,8 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags); /* * Copy len bytes of string representation of obj of BTF type_id into buf. diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c1675ad..c3231a8 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3621,6 +3621,15 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type name or *ptr*->type_id as per + * bpf_snprintf_btf() above. *flags* are identical to those + * used for bpf_snprintf_btf. + * Return + * 0 on success or a negative error in case of failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3773,6 +3782,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(snprintf_btf), \ + FN(seq_printf_btf), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 94190ec..dfc8654 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5316,8 +5316,8 @@ static __printf(2, 3) void btf_seq_show(struct btf_show *show, const char *fmt, va_end(args); } -static int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, - void *obj, struct seq_file *m, u64 flags) +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, + void *obj, struct seq_file *m, u64 flags) { struct btf_show sseq; diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 403fb23..c4ba45f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2217,6 +2217,7 @@ void bpf_user_rnd_init_once(void) const struct bpf_func_proto bpf_get_local_storage_proto __weak; const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak; const struct bpf_func_proto bpf_snprintf_btf_proto __weak; +const struct bpf_func_proto bpf_seq_printf_btf_proto __weak; const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void) { diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 61c274f8..e8fa1c0 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -71,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); +static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id); + /** * trace_call_bpf - invoke BPF program * @call: tracepoint event @@ -776,6 +780,31 @@ struct bpf_seq_printf_buf { .arg3_type = ARG_CONST_SIZE_OR_ZERO, }; +BPF_CALL_4(bpf_seq_printf_btf, struct seq_file *, m, struct btf_ptr 
*, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf *btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + + return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); +} + +static const struct bpf_func_proto bpf_seq_printf_btf_proto = { + .func = bpf_seq
[PATCH v6 bpf-next 2/6] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 980 ++-- 2 files changed, 914 insertions(+), 102 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index a9af5e7..d0f5d3c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 5d3c36e..94190ec 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -284,6 +284,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* + * Common data to all BTF show operations. Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. 
See btf_type_show() below. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by show->state.depth_check being set, and if we + * encounter a non-zero value we set show->state.depth_to_show to + * the depth at which we encountered it. When we have completed the + * first pass, we will know if anything needs to be displayed if + * depth_to_show > depth. See btf_[struct,array]_show() for the + * implementation of this.
[PATCH v6 bpf-next 0/6] bpf: add helpers to support BTF-based kernel data display
t approaches were explored including dynamic allocation and per-cpu buffers. The downside of dynamic allocation is that it would be done during BPF program execution for bpf_trace_printk()s using %pT format specifiers. The problem with per-cpu buffers is we'd have to manage preemption and since the display of an object occurs over an extended period and in printk context where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. The approach of utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2). - safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option. - pointers are prefixed with 0x for clarity (Alexei, patch 2) - added additional comments and explanations around BTF show code, especially around determining whether objects are zeroed. Also tried to comment the safe object scheme used. 
(Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8) Changes since v1: - changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifer so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. - removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. 
- added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (6): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings bpf: add bpf_snprintf_btf helper selftests/bpf: add bpf_snprintf_btf helper tests bpf: add bpf_seq_printf_btf helper selftests/bpf: add test for bpf_seq_printf_btf helper include/linux/bpf.h| 3 + include/linux/btf.h| 39 + include/uapi/linux/bpf.h | 78 ++ kernel/bpf/btf.c | 980 ++--- kernel/bpf/core.c | 2 + kernel/bpf/helpers.c | 4 + kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 134 +++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 78 ++ tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/prog_tests/snprintf_btf.c| 54 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 ++ .../selftests/bpf/progs/netif_receive_sk
[PATCH v6 bpf-next 4/6] selftests/bpf: add bpf_snprintf_btf helper tests
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/snprintf_btf.c| 54 + .../selftests/bpf/progs/netif_receive_skb.c| 260 + 2 files changed, 314 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c new file mode 100644 index 000..855e11d --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/snprintf_btf.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include "netif_receive_skb.skel.h" + +/* Demonstrate that bpf_snprintf_btf succeeds and that various data types + * are formatted correctly. + */ +void test_snprintf_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 > /dev/null"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_snprintf_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..b4f96f1 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,260 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include "vmlinux.h" +#include +#include +#include + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; + +#define STRSIZE2048 +#define EXPECTED_STRSIZE 256 + +#ifndef ARRAY_SIZE +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#endif + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, char[STRSIZE]); +} strdata SEC(".maps"); + +static int __strncmp(const void *m1, const void *m2, size_t len) +{ + const unsigned char *s1 = m1; + const unsigned char *s2 = m2; + int i, delta = 0; + +#pragma clang loop unroll(full) + for (i = 0; i < len; i++) { + delta = s1[i] - s2[i]; + if (delta || s1[i] == 0 || s2[i] == 0) + break; + } + return delta; +} + +/* Use __builtin_btf_type_id to test snprintf_btf by type id instead of name */ +#if __has_builtin(__builtin_btf_type_id) +#define TEST_BTF_BY_ID(_str, _typestr, _ptr, _hflags) \ + do {\ + int _expected_ret = ret;\ + _ptr.type = 0; \ + _ptr.type_id = __builtin_btf_type_id(_typestr, 0); \ + ret = bpf_snprintf_btf(_str, STRSIZE, &_ptr,\ + sizeof(_ptr), _hflags); \ + if (ret != _expected_ret) { \ + bpf_printk("expected ret 
(%d), got (%d)", \ + _expected_ret, ret); \ + ret = -EBADMSG; \ + } \ + } while
[PATCH v6 bpf-next 3/6] bpf: add bpf_snprintf_btf helper
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); struct btf_ptr * specifies - a pointer to the data to be traced; - the BTF id of the type of data pointed to; or - a string representation of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed. For example a BPF program with a struct sk_buff *skb could do the following: static const char skb_type[] = "struct sk_buff"; static struct btf_ptr b = { }; b.ptr = skb; b.type = skb_type; bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0); Default output looks like this: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x7524fd8b, .data = (unsigned char *)0x7524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags modifying display are as follows: - BTF_F_COMPACT:no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW:show raw (unobfuscated) pointer values; equivalent to %px. 
- BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 ++-- include/uapi/linux/bpf.h | 68 +++ kernel/bpf/core.c | 1 + kernel/bpf/helpers.c | 4 ++ kernel/trace/bpf_trace.c | 101 + scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 68 +++ 8 files changed, 250 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 049e50f..a3b40a5 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1795,6 +1795,7 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_snprintf_btf_proto; const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/btf.h b/include/linux/btf.h index d0f5d3c..3e5cdc2 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -59,10 +60,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_F_COMPACT +#define BTF_SHOW_NONAMEBTF_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_F_ZERO #define BTF_SHOW_UNSAFE(1ULL << 4) void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index a228125..c1675ad 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3586,6 +3586,41 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type name or *ptr*->type_id. These values should + * specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + *
[PATCH v5 bpf-next 4/6] selftests/bpf: add bpf_btf_snprintf helper tests
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/btf_snprintf.c| 55 + .../selftests/bpf/progs/netif_receive_skb.c| 260 + 2 files changed, 315 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_snprintf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c b/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c new file mode 100644 index 000..8f277a5 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include "netif_receive_skb.skel.h" + +/* Demonstrate that bpf_btf_snprintf succeeds with non-zero return values, + * and that string representation of kernel data can then be displayed + * via bpf_trace_printk(). + */ +void test_btf_snprintf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 > /dev/null"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_btf_snprintf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..dd08a7d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,260 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include "vmlinux.h" +#include +#include +#include + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; + +#define STRSIZE2048 +#define EXPECTED_STRSIZE 256 + +#ifndef ARRAY_SIZE +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#endif + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, char[STRSIZE]); +} strdata SEC(".maps"); + +static int __strncmp(const void *m1, const void *m2, size_t len) +{ + const unsigned char *s1 = m1; + const unsigned char *s2 = m2; + int i, delta = 0; + +#pragma clang loop unroll(full) + for (i = 0; i < len; i++) { + delta = s1[i] - s2[i]; + if (delta || s1[i] == 0 || s2[i] == 0) + break; + } + return delta; +} + +/* Use __builtin_btf_type_id to test btf_snprintf by type id instead of name */ +#if __has_builtin(__builtin_btf_type_id) +#define TEST_BTF_BY_ID(_str, _typestr, _ptr, _hflags) \ + do {\ + int _expected_ret = ret;\ + _ptr.type = 0; \ + _ptr.type_id = __builtin_btf_type_id(_typestr, 0); \ + ret = bpf_btf_snprintf(_str, STRSIZE, &_ptr,\ + sizeof(_ptr), _hflags); \ + if (ret != _expected_ret) { \ + bpf_printk("expected ret 
(%d), got (%d)", \ + _expected_ret, ret); \ + ret = -EBADMSG; \ + }
[PATCH v5 bpf-next 2/6] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 971 ++-- 2 files changed, 904 insertions(+), 103 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index a9af5e7..d0f5d3c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index f9ac693..70f5b88 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -284,6 +284,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* + * Common data to all BTF show operations. Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. 
See btf_type_show() below.
+ *
+ * One challenge with showing nested data is we want to skip 0-valued
+ * data, but in order to figure out whether a nested object is all zeros
+ * we need to walk through it. As a result, we need to make two passes
+ * when handling structs, unions and arrays; the first pass simply looks
+ * for nonzero data, while the second actually does the display. The first
+ * pass is signalled by show->state.depth_check being set, and if we
+ * encounter a non-zero value we set show->state.depth_to_show to
+ * the depth at which we encountered it. When we have completed the
+ * first pass, we will know if anything needs to be displayed if
+ * depth_to_show > depth. See btf_[struct,array]_show() for the
+ * implementation of this.
[PATCH v5 bpf-next 0/6] bpf: add helpers to support BTF-based kernel data display
to manage preemption and since the display of an object occurs over an extended period and in printk context where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. Utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2).

- safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option.
- pointers are prefixed with 0x for clarity (Alexei, patch 2)
- added additional comments and explanations around BTF show code, especially around determining whether objects are zeroed. Also tried to comment on the safe object scheme used. (Yonghong, patch 2)
- added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5)
- removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour, and the behaviour of type-based printk can be determined via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6)
- fixed the bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing the need for perf events (Andrii, patch 8)

Changes since v1:
- changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei)
- zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei)
- added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below...
- reworked the printk format specifier so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name.
- removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei).
- fixed issues with negative snprintf format length (Rasmus)
- added test cases for various data structure formats; base types, typedefs, structs, etc.
- tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases.
- added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (6): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings bpf: add bpf_btf_snprintf helper selftests/bpf: add bpf_btf_snprintf helper tests bpf: add bpf_seq_btf_write helper selftests/bpf: add test for bpf_seq_btf_write helper include/linux/bpf.h| 3 + include/linux/btf.h| 40 + include/uapi/linux/bpf.h | 78 ++ kernel/bpf/btf.c | 978 ++--- kernel/bpf/helpers.c | 4 + kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 133 +++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 78 ++ tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/prog_tests/btf_snprintf.c| 55 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 ++ .../selftests/bpf/progs/netif_receive_skb.c| 260 ++ 13 files changed, 1656 insertions(+), 108 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_snprintf.c create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c -- 1.8.3.1
[PATCH v5 bpf-next 6/6] selftests/bpf: add test for bpf_seq_btf_write helper
Add a test verifying iterating over tasks and displaying BTF representation of data succeeds. Note here that we do not display the task_struct itself, as it will overflow the PAGE_SIZE limit on seq data; instead we write task->fs (a struct fs_struct). Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c| 49 2 files changed, 115 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index fe1a83b9..b9f13f9 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -7,6 +7,7 @@ #include "bpf_iter_task.skel.h" #include "bpf_iter_task_stack.skel.h" #include "bpf_iter_task_file.skel.h" +#include "bpf_iter_task_btf.skel.h" #include "bpf_iter_tcp4.skel.h" #include "bpf_iter_tcp6.skel.h" #include "bpf_iter_udp4.skel.h" @@ -167,6 +168,69 @@ static void test_task_file(void) bpf_iter_task_file__destroy(skel); } +#define FSBUFSZ8192 + +static char fsbuf[FSBUFSZ]; + +static void do_btf_read(struct bpf_program *prog) +{ + int iter_fd = -1, len = 0, bufleft = FSBUFSZ; + struct bpf_link *link; + char *buf = fsbuf; + + link = bpf_program__attach_iter(prog, NULL); + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n")) + goto free_link; + + do { + len = read(iter_fd, buf, bufleft); + if (len > 0) { + buf += len; + bufleft -= len; + } + } while (len > 0); + + if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + goto free_link; + + CHECK(strstr(fsbuf, "(struct fs_struct)") == NULL, + "check for btf representation of fs_struct in iter data", + "struct fs_struct not found"); +free_link: + if (iter_fd > 0) + 
close(iter_fd); + bpf_link__destroy(link); +} + +static void test_task_btf(void) +{ + struct bpf_iter_task_btf__bss *bss; + struct bpf_iter_task_btf *skel; + + skel = bpf_iter_task_btf__open_and_load(); + if (CHECK(!skel, "bpf_iter_task_btf__open_and_load", + "skeleton open_and_load failed\n")) + return; + + bss = skel->bss; + + do_btf_read(skel->progs.dump_task_fs_struct); + + if (CHECK(bss->tasks == 0, "check if iterated over tasks", + "no task iteration, did BPF program run?\n")) + goto cleanup; + + CHECK(bss->seq_err != 0, "check for unexpected err", + "bpf_seq_btf_write returned %ld", bss->seq_err); + +cleanup: + bpf_iter_task_btf__destroy(skel); +} + static void test_tcp4(void) { struct bpf_iter_tcp4 *skel; @@ -957,6 +1021,8 @@ void test_bpf_iter(void) test_task_stack(); if (test__start_subtest("task_file")) test_task_file(); + if (test__start_subtest("task_btf")) + test_task_btf(); if (test__start_subtest("tcp4")) test_tcp4(); if (test__start_subtest("tcp6")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c new file mode 100644 index 000..0451682 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "bpf_iter.h" +#include +#include +#include + +char _license[] SEC("license") = "GPL"; + +long tasks = 0; +long seq_err = 0; + +/* struct task_struct's BTF representation will overflow PAGE_SIZE so cannot + * be used here; instead dump a structure associated with each task. + */ +SEC("iter/task") +int dump_task_fs_struct(struct bpf_iter__task *ctx) +{ + static const char fs_type[] = "struct fs_struct"; + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + struct fs_struct *fs = (void *)0; + static struct btf_ptr ptr = { }; + long ret; + + if (task) + fs = task->fs; + + ptr.type = fs_type; + ptr.ptr = fs; + + if (ctx->meta->
[PATCH v5 bpf-next 3/6] bpf: add bpf_btf_snprintf helper
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is

  long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr,
			u32 btf_ptr_size, u64 flags);

struct btf_ptr * specifies
- a pointer to the data to be traced;
- the BTF id of the type of data pointed to; or
- a string representation of the type of data pointed to
- a flags field is provided for future use; these flags are not to be confused with the BTF_SNPRINTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is that these flags relate to the type and the information needed to identify it, not to how it is displayed.

For example a BPF program with a struct sk_buff *skb could do the following:

  static const char skb_type[] = "struct sk_buff";
  static struct btf_ptr b = { };

  b.ptr = skb;
  b.type = skb_type;
  bpf_btf_snprintf(str, sizeof(str), &b, sizeof(b), 0);

Default output looks like this:

  (struct sk_buff){
   .transport_header = (__u16)65535,
   .mac_header = (__u16)65535,
   .end = (sk_buff_data_t)192,
   .head = (unsigned char *)0x7524fd8b,
   .data = (unsigned char *)0x7524fd8b,
   .truesize = (unsigned int)768,
   .users = (refcount_t){
    .refs = (atomic_t){
     .counter = (int)1,
    },
   },
  }

Flags modifying display are as follows:
- BTF_F_COMPACT: no formatting around type information
- BTF_F_NONAME:  no struct/union member names/types
- BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px.
- BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 +++-- include/uapi/linux/bpf.h | 68 kernel/bpf/helpers.c | 4 ++ kernel/trace/bpf_trace.c | 88 ++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 68 7 files changed, 236 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c0ad5d8..9acbd59 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1787,6 +1787,7 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_btf_snprintf_proto; const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/btf.h b/include/linux/btf.h index d0f5d3c..3e5cdc2 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -59,10 +60,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_F_COMPACT +#define BTF_SHOW_NONAMEBTF_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_F_ZERO #define BTF_SHOW_UNSAFE(1ULL << 4) void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 7dd3141..9b89b67 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3579,6 +3579,41 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type name or *ptr*->type_id. These values should + * specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + * larger programs can use map
[PATCH v5 bpf-next 1/6] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c6d9f2c..c0ad5d8 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1330,6 +1330,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 814bc6c..11d7985 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11311,6 +11311,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11344,12 +11355,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
[PATCH v5 bpf-next 5/6] bpf: add bpf_seq_btf_write helper
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_btf_snprintf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov Signed-off-by: Alan Maguire --- include/linux/btf.h| 3 ++ include/uapi/linux/bpf.h | 10 ++ kernel/bpf/btf.c | 17 +++--- kernel/trace/bpf_trace.c | 75 +- tools/include/uapi/linux/bpf.h | 10 ++ 5 files changed, 96 insertions(+), 19 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 3e5cdc2..eed23a4 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -69,6 +69,9 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags); + /* * Copy len bytes of string representation of obj of BTF type_id into buf. * diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 9b89b67..c0815f1 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3614,6 +3614,15 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type name or *ptr*->type_id as per + * bpf_btf_snprintf() above. *flags* are identical to those + * used for bpf_btf_snprintf. + * Return + * 0 on success or a negative error in case of failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3766,6 +3775,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(btf_snprintf), \ + FN(seq_btf_write), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 70f5b88..0902464 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5328,17 +5328,26 @@ static void btf_seq_show(struct btf_show *show, const char *fmt, ...) va_end(args); } -void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, - struct seq_file *m) +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags) { struct btf_show sseq; sseq.target = m; sseq.showfn = btf_seq_show; - sseq.flags = BTF_SHOW_NONAME | BTF_SHOW_COMPACT | BTF_SHOW_ZERO | -BTF_SHOW_UNSAFE; + sseq.flags = flags; btf_type_show(btf, type_id, obj, &sseq); + + return sseq.state.status; +} + +void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m) +{ + (void) btf_type_seq_show_flags(btf, type_id, obj, m, + BTF_SHOW_NONAME | BTF_SHOW_COMPACT | + BTF_SHOW_ZERO | BTF_SHOW_UNSAFE); } struct btf_show_snprintf { diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index f171e03..eee36a8 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -71,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); +static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id); + /** * trace_call_bpf - invoke BPF program * @call: tracepoint event @@ -780,6 +784,30 @@ struct bpf_seq_printf_buf { .btf_id = bpf_seq_write_btf_ids, }; +BPF_CALL_4(bpf_seq_btf_write, struct seq_file *, m, struct btf_ptr *, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf 
*btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + + return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); +} + +static const struct bpf_func_proto bpf_seq
Re: [RFC PATCH bpf-next 2/4] bpf: make BTF show support generic, apply to seq files/bpf_trace_printk
On Fri, 14 Aug 2020, Alexei Starovoitov wrote: > On Fri, Aug 14, 2020 at 02:06:37PM +0100, Alan Maguire wrote: > > On Wed, 12 Aug 2020, Alexei Starovoitov wrote: > > > > > On Thu, Aug 06, 2020 at 03:42:23PM +0100, Alan Maguire wrote: > > > > > > > > The bpf_trace_printk tracepoint is augmented with a "trace_id" > > > > field; it is used to allow tracepoint filtering as typed display > > > > information can easily be interspersed with other tracing data, > > > > making it hard to read. Specifying a trace_id will allow users > > > > to selectively trace data, eliminating noise. > > > > > > Since trace_id is not seen in trace_pipe, how do you expect users > > > to filter by it? > > > > Sorry should have specified this. The approach is to use trace > > instances and filtering such that we only see events associated > > with a specific trace_id. There's no need for the trace event to > > actually display the trace_id - it's still usable as a filter. > > The steps involved are: > > > > 1. create a trace instance within which we can specify a fresh > >set of trace event enablings, filters etc. > > > > mkdir /sys/kernel/debug/tracing/instances/traceid100 > > > > 2. enable the filter for the specific trace id > > > > echo "trace_id == 100" > > > /sys/kernel/debug/tracing/instances/traceid100/events/bpf_trace/bpf_trace_printk/filter > > > > 3. enable the trace event > > > > echo 1 > > > /sys/kernel/debug/tracing/instances/events/bpf_trace/bpf_trace_printk/enable > > > > 4. ensure the BPF program uses a trace_id 100 when calling bpf_trace_btf() > > ouch. > I think you interpreted the acceptance of the > commit 7fb20f9e901e ("bpf, doc: Remove references to warning message when > using bpf_trace_printk()") > in the wrong way. > > Everything that doc had said is still valid. In particular: > -A: This is done to nudge program authors into better interfaces when > -programs need to pass data to user space. 
Like bpf_perf_event_output() > -can be used to efficiently stream data via perf ring buffer. > -BPF maps can be used for asynchronous data sharing between kernel > -and user space. bpf_trace_printk() should only be used for debugging. > > bpf_trace_printk is for debugging only. _debugging of bpf programs > themselves_. > What you're describing above is logging and tracing. It's not debugging of > programs. > perf buffer, ring buffer, and seq_file interfaces are the right > interfaces for tracing, logging, and kernel debugging. > > > > It also feels like workaround. May be let bpf prog print the whole > > > struct in one go with multiple new lines and call > > > trace_bpf_trace_printk(buf) once? > > > > We can do that absolutely, but I'd be interested to get your take > > on the filtering mechanism before taking that approach. I'll add > > a description of the above mechanism to the cover letter and > > patch to be clearer next time too. > > I think patch 3 is no go, because it takes bpf_trace_printk in > the wrong direction. > Instead please refactor it to use string buffer or seq_file as an output. Fair enough. I'm thinking a helper like long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr, u32 ptr_size, u64 flags); Then the user can choose perf event or ringbuf interfaces to share the results with userspace. > If the user happen to use bpf_trace_printk("%s", buf); > after that to print that string buffer to trace_pipe that's user's choice. > I can see such use case when program author wants to debug > their bpf program. That's fine. But for kernel debugging, on demand and > "always on" logging and tracing the documentation should point > to sustainable interfaces that don't interfere with each other, > can be run in parallel by multiple users, etc. > The problem with bpf_trace_printk() under this approach is that the string size for %s arguments is very limited; bpf_trace_printk() restricts these to 64 bytes in size. 
Looks like bpf_seq_printf() restricts a %s string to 128 bytes also. We could add an additional helper for the bpf_seq case which calls bpf_seq_printf() for each component in the object, i.e. long bpf_seq_btf_printf(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags); This would steer users away from bpf_trace_printk() for this use case - since it can print only a small amount of the string - while supporting all the other user-space communication mechanisms. Alan
Re: [RFC PATCH bpf-next 2/4] bpf: make BTF show support generic, apply to seq files/bpf_trace_printk
On Wed, 12 Aug 2020, Alexei Starovoitov wrote: > On Thu, Aug 06, 2020 at 03:42:23PM +0100, Alan Maguire wrote: > > > > The bpf_trace_printk tracepoint is augmented with a "trace_id" > > field; it is used to allow tracepoint filtering as typed display > > information can easily be interspersed with other tracing data, > > making it hard to read. Specifying a trace_id will allow users > > to selectively trace data, eliminating noise. > > Since trace_id is not seen in trace_pipe, how do you expect users > to filter by it? Sorry should have specified this. The approach is to use trace instances and filtering such that we only see events associated with a specific trace_id. There's no need for the trace event to actually display the trace_id - it's still usable as a filter. The steps involved are: 1. create a trace instance within which we can specify a fresh set of trace event enablings, filters etc. mkdir /sys/kernel/debug/tracing/instances/traceid100 2. enable the filter for the specific trace id echo "trace_id == 100" > /sys/kernel/debug/tracing/instances/traceid100/events/bpf_trace/bpf_trace_printk/filter 3. enable the trace event echo 1 > /sys/kernel/debug/tracing/instances/events/bpf_trace/bpf_trace_printk/enable 4. ensure the BPF program uses a trace_id 100 when calling bpf_trace_btf() So the above can be done for multiple programs; output can then be separated for different programs if trace_ids and filtering are used together. The above trace instance only sees bpf_trace_btf() events which specify trace_id 100. I've attached a tweaked version of the patch 4 in the patchset that ensures that a trace instance with filtering enabled as above sees the bpf_trace_btf events, but _not_ bpf_trace_printk events (since they have trace_id 0 by default). To me the above provides a simple way to separate BPF program output for simple BPF programs; ringbuf and perf events require a bit more work in both BPF and userspace to support such coordination. 
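Collected into one script, the steps look like this (a sketch only: it assumes root and tracefs mounted under /sys/kernel/debug, and the instance name and trace_id value are illustrative; note the instance name also belongs in the step 3 enable path):

```shell
#!/bin/sh
# Sketch: create a trace instance that only sees bpf_trace_printk events
# carrying trace_id == 100.
set -e

TRACEFS=/sys/kernel/debug/tracing
INST=$TRACEFS/instances/traceid100

# 1. fresh instance with its own event enablings and filters
mkdir -p "$INST"

# 2. filter on the trace_id the BPF program will pass to bpf_trace_btf()
echo 'trace_id == 100' > "$INST/events/bpf_trace/bpf_trace_printk/filter"

# 3. enable the event within this instance only
echo 1 > "$INST/events/bpf_trace/bpf_trace_printk/enable"

# 4. read the filtered output (the BPF program must use trace_id 100)
cat "$INST/trace_pipe"
```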
What do you think - does this approach seem worth using? If so we could also consider extending it to bpf_trace_printk(), if we can find a way to provide a trace_id there too. > It also feels like workaround. May be let bpf prog print the whole > struct in one go with multiple new lines and call > trace_bpf_trace_printk(buf) once? We can do that absolutely, but I'd be interested to get your take on the filtering mechanism before taking that approach. I'll add a description of the above mechanism to the cover letter and patch to be clearer next time too. > > Also please add interface into bpf_seq_printf. > BTF enabled struct prints is useful for iterators too > and generalization you've done in this patch pretty much takes it there. > Sure, I'll try and tackle that next time. > > +#define BTF_SHOW_COMPACT (1ULL << 0) > > +#define BTF_SHOW_NONAME(1ULL << 1) > > +#define BTF_SHOW_PTR_RAW (1ULL << 2) > > +#define BTF_SHOW_ZERO (1ULL << 3) > > +#define BTF_SHOW_NONEWLINE (1ULL << 32) > > +#define BTF_SHOW_UNSAFE(1ULL << 33) > > I could have missed it earlier, but what is the motivation to leave the gap > in bits? Just do bit 4 and 5 ? > Patch 3 uses the first 4 as flags to bpf_trace_btf(); the final two are not supported for the helper as flag values so I wanted to leave some space for additional bpf_trace_btf() flags. BTF_SHOW_NONEWLINE is always used for bpf_trace_btf(), since the tracing adds a newline for us and we don't want to double up on newlines, so it's ORed in as an implicit argument for the bpf_trace_btf() case. BTF_SHOW_UNSAFE isn't allowed within BPF so it's not available as a flag for the helper. Thanks! Alan >From 10bd268b2585084c8f35d1b6ab0c3df76203f5cc Mon Sep 17 00:00:00 2001 From: Alan Maguire Date: Thu, 6 Aug 2020 14:21:10 +0200 Subject: [PATCH] selftests/bpf: add bpf_trace_btf helper tests Basic tests verifying various flag combinations for bpf_trace_btf() using a tp_btf program to trace skb data. 
Also verify that we can create a trace instance to filter trace data, using the trace_id value passed to bpf_trace/bpf_trace_printk events. trace_id is specifiable for bpf_trace_btf() so the test ensures the trace instance sees the filtered events only. Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/trace_btf.c | 150 + .../selftests/bpf/progs/netif_receive_skb.c| 48 +++ 2 files changed, 198 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_btf.c b/tools/testing/selftests/bpf/prog_tests/trace_btf.
Re: [PATCH v2] kunit: added lockdep support
On Wed, 12 Aug 2020, Uriel Guajardo wrote: > KUnit will fail tests upon observing a lockdep failure. Because lockdep > turns itself off after its first failure, only fail the first test and > warn users to not expect any future failures from lockdep. > > Similar to lib/locking-selftest [1], we check if the status of > debug_locks has changed after the execution of a test case. However, we > do not reset lockdep afterwards. > > Like the locking selftests, we also fix possible preemption count > corruption from lock bugs. > > Depends on kunit: support failure from dynamic analysis tools [2] > > [1] > https://elixir.bootlin.com/linux/v5.7.12/source/lib/locking-selftest.c#L1137 > > [2] > https://lore.kernel.org/linux-kselftest/20200806174326.3577537-1-urielguajard...@gmail.com/ > > Signed-off-by: Uriel Guajardo > --- > v2 Changes: > - Removed lockdep_reset > > - Added warning to users about lockdep shutting off > --- > lib/kunit/test.c | 27 ++- > 1 file changed, 26 insertions(+), 1 deletion(-) > > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > index d8189d827368..7e477482457b 100644 > --- a/lib/kunit/test.c > +++ b/lib/kunit/test.c > @@ -11,6 +11,7 @@ > #include > #include > #include > +#include > > #include "debugfs.h" > #include "string-stream.h" > @@ -22,6 +23,26 @@ void kunit_fail_current_test(void) > kunit_set_failure(current->kunit_test); > } > > +static void kunit_check_locking_bugs(struct kunit *test, > + unsigned long saved_preempt_count, > + bool saved_debug_locks) > +{ > + preempt_count_set(saved_preempt_count); > +#ifdef CONFIG_TRACE_IRQFLAGS > + if (softirq_count()) > + current->softirqs_enabled = 0; > + else > + current->softirqs_enabled = 1; > +#endif > +#if IS_ENABLED(CONFIG_LOCKDEP) > + if (saved_debug_locks && !debug_locks) { > + kunit_set_failure(test); > + kunit_warn(test, "Dynamic analysis tool failure from LOCKDEP."); > + kunit_warn(test, "Further tests will have LOCKDEP disabled."); > + } > +#endif > +} Nit: I could be wrong but the general 
approach for this sort of feature is to do conditional compilation combined with "static inline" definitions to handle the case where the feature isn't enabled. Could we tidy this up a bit and haul this stuff out into a conditionally-compiled (if CONFIG_LOCKDEP) kunit lockdep.c file? Then in kunit's lockdep.h we'd have struct kunit_lockdep { int preempt_count; bool debug_locks; }; #if IS_ENABLED(CONFIG_LOCKDEP) void kunit_test_init_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep); void kunit_test_check_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep); #else static inline void kunit_test_init_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep) { } static inline void kunit_test_check_lockdep(struct kunit_test *test, struct kunit_lockdep *lockdep) { } #endif The test execution code could then call struct kunit_lockdep lockdep; kunit_test_init_lockdep(test, &lockdep); kunit_test_check_lockdep(test, &lockdep); If that approach makes sense, we could go a bit further and we might benefit from a bit more generalization here. _If_ the pattern of needing pre- and post- test actions is sustained across multiple analysis tools, could we add generic hooks for this? That would allow any additional dynamic analysis tools to utilize them. So kunit_try_run_case() would then cycle through the registered pre- hooks prior to running the case and post- hooks after, failing if any of the latter returned a failure value. I'm thinking something like kunit_register_external_test("lockdep", lockdep_pre, lockdep_post, &kunit_lockdep); (or we could define a kunit_external_test struct for better extensibility). A void * would be passed to pre/post, in this case it'd be a pointer to a struct containing the saved preempt count/debug locks, and the registration could be called during kunit initialization. 
This doesn't need to be done with your change of course, but I wanted to float the idea: in addition to uncluttering the test case execution code, it might allow us to build facilities on top of that generic tool support for situations like "I'd like to see if the test passes absent any lockdep issues, so I'd like to disable lockdep-based failure". Such situations are admittedly more likely to arise in a world where kunit+tests are built as modules and run multiple times within a single system boot, but worth considering I think. For that we'd need a way to select which dynamic tools kunit enables (kernel/module parameters or debugfs could do this), but a generic approach might help that sort of thing.
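If it helps make the idea concrete, here is a rough userspace model of the suggested pre-/post-hook registry. Every name in it (kunit_register_external_test, run_case_with_hooks, and so on) is hypothetical: this is a sketch of the shape being proposed, not an existing KUnit API.

```c
#include <assert.h>

/* Hypothetical registry entry modelling the reviewer's suggestion. */
struct kunit_external_test {
	const char *name;
	void (*pre)(void *priv);  /* runs before each test case */
	int (*post)(void *priv);  /* runs after it; nonzero fails the case */
	void *priv;
};

#define MAX_EXTERNAL_TESTS 8
static struct kunit_external_test external_tests[MAX_EXTERNAL_TESTS];
static int num_external_tests;

static int kunit_register_external_test(const char *name,
					void (*pre)(void *),
					int (*post)(void *), void *priv)
{
	if (num_external_tests >= MAX_EXTERNAL_TESTS)
		return -1;
	external_tests[num_external_tests++] =
		(struct kunit_external_test){ name, pre, post, priv };
	return 0;
}

/* What kunit_try_run_case() would do around the case body: cycle through
 * pre- hooks, run the case, then fail if any post- hook reports a problem. */
static int run_case_with_hooks(void (*case_fn)(void))
{
	int i, failed = 0;

	for (i = 0; i < num_external_tests; i++)
		external_tests[i].pre(external_tests[i].priv);
	case_fn();
	for (i = 0; i < num_external_tests; i++)
		if (external_tests[i].post(external_tests[i].priv))
			failed = 1;
	return failed;
}

/* Example hook modelled on the lockdep case: snapshot state in pre,
 * compare in post. fake_debug_locks stands in for debug_locks. */
static int fake_debug_locks = 1;
struct lockdep_state { int debug_locks; };

static void lockdep_pre(void *priv)
{
	((struct lockdep_state *)priv)->debug_locks = fake_debug_locks;
}

static int lockdep_post(void *priv)
{
	struct lockdep_state *s = priv;

	/* "lockdep was on before the case and is off now" => failure */
	return s->debug_locks && !fake_debug_locks;
}

static void passing_case(void) { }
static void lock_bug_case(void) { fake_debug_locks = 0; }

static int demo_run(void)
{
	static struct lockdep_state state;

	if (kunit_register_external_test("lockdep", lockdep_pre,
					 lockdep_post, &state))
		return 0;
	return run_case_with_hooks(passing_case) == 0 &&
	       run_case_with_hooks(lock_bug_case) == 1;
}
```

The point of the void *priv argument is that each tool carries its own saved state (here the lockdep snapshot) without the test runner knowing its layout.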
Re: [PATCH 1/2] kunit: support failure from dynamic analysis tools
On Thu, 6 Aug 2020, Uriel Guajardo wrote: > Adds an API to allow dynamic analysis tools to fail the currently > running KUnit test case. > > - Always places the kunit test in the task_struct to allow other tools > to access the currently running KUnit test. > > - Creates a new header file to avoid circular dependencies that could be > created from the test.h file. > > Requires KASAN-KUnit integration patch to access the kunit test from > task_struct: > https://lore.kernel.org/linux-kselftest/20200606040349.246780-2-david...@google.com/ > > Signed-off-by: Uriel Guajardo > --- > include/kunit/test-bug.h | 24 > include/kunit/test.h | 1 + > lib/kunit/test.c | 10 ++ > 3 files changed, 31 insertions(+), 4 deletions(-) > create mode 100644 include/kunit/test-bug.h > > diff --git a/include/kunit/test-bug.h b/include/kunit/test-bug.h > new file mode 100644 > index ..283c19ec328f > --- /dev/null > +++ b/include/kunit/test-bug.h > @@ -0,0 +1,24 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +/* > + * KUnit API allowing dynamic analysis tools to interact with KUnit tests > + * > + * Copyright (C) 2020, Google LLC. > + * Author: Uriel Guajardo > + */ > + > +#ifndef _KUNIT_TEST_BUG_H > +#define _KUNIT_TEST_BUG_H > + > +#if IS_ENABLED(CONFIG_KUNIT) > + > +extern void kunit_fail_current_test(void); > + > +#else > + > +static inline void kunit_fail_current_test(void) > +{ > +} > + > +#endif > + > +#endif /* _KUNIT_TEST_BUG_H */ This is great stuff! One thing I wonder though; how obvious will it be to someone running a KUnit test that the cause of the test failure is a dynamic analysis tool? Yes we'll see the dmesg logging from that tool but I don't think there's any context _within_ KUnit that could clarify the source of the failure. What about changing the above API to include a string message that KUnit can log, so it can at least identify the source of the failure (ubsan, kasan etc). 
That would alert anyone looking at KUnit output only that there's an external context to examine. > diff --git a/include/kunit/test.h b/include/kunit/test.h > index 3391f38389f8..81bf43a1abda 100644 > --- a/include/kunit/test.h > +++ b/include/kunit/test.h > @@ -11,6 +11,7 @@ > > #include > #include > +#include > #include > #include > #include > diff --git a/lib/kunit/test.c b/lib/kunit/test.c > index dcc35fd30d95..d8189d827368 100644 > --- a/lib/kunit/test.c > +++ b/lib/kunit/test.c > @@ -16,6 +16,12 @@ > #include "string-stream.h" > #include "try-catch-impl.h" > > +void kunit_fail_current_test(void) > +{ > + if (current->kunit_test) > + kunit_set_failure(current->kunit_test); > +} > + > static void kunit_print_tap_version(void) > { > static bool kunit_has_printed_tap_version; > @@ -284,9 +290,7 @@ static void kunit_try_run_case(void *data) > struct kunit_suite *suite = ctx->suite; > struct kunit_case *test_case = ctx->test_case; > > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) > current->kunit_test = test; > -#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT) */ > > /* >* kunit_run_case_internal may encounter a fatal error; if it does, > @@ -602,9 +606,7 @@ void kunit_cleanup(struct kunit *test) > spin_unlock(&test->lock); > kunit_remove_resource(test, res); > } > -#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)) > current->kunit_test = NULL; > -#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)*/ > } > EXPORT_SYMBOL_GPL(kunit_cleanup); > > -- > 2.28.0.163.g6104cc2f0b6-goog > >
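To make the string-message suggestion concrete, here is a rough userspace sketch of what a message-carrying kunit_fail_current_test() could look like. struct kunit and current_kunit_test are stubs standing in for the real kernel objects, and the API shape is the reviewer's suggestion, not an existing interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-ins: a real struct kunit and current->kunit_test live in
 * the kernel; these exist only so the sketch is self-contained. */
struct kunit {
	int failed;
	char fail_source[128];
};

static struct kunit *current_kunit_test;

/* Suggested API shape: fail the running test and record which analysis
 * tool asked for the failure, so KUnit output can name it. */
static void kunit_fail_current_test(const char *tool)
{
	struct kunit *test = current_kunit_test;

	if (!test)
		return; /* nothing running; mirrors the NULL check in the patch */
	test->failed = 1;
	snprintf(test->fail_source, sizeof(test->fail_source),
		 "failed by dynamic analysis tool: %s", tool);
}

static int demo(void)
{
	struct kunit t = { 0 };

	current_kunit_test = NULL;
	kunit_fail_current_test("kasan"); /* safe with no test running */

	current_kunit_test = &t;
	kunit_fail_current_test("kasan");
	return t.failed == 1 &&
	       strcmp(t.fail_source,
		      "failed by dynamic analysis tool: kasan") == 0;
}
```

A caller such as KASAN would then pass its own name, and the recorded string would appear alongside the KUnit failure output.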
[PATCH bpf] bpf: doc: remove references to warning message when using bpf_trace_printk()
The BPF helper bpf_trace_printk() no longer uses trace_printk(); it now triggers a dedicated trace event. Hence the described warning is no longer present, so remove the discussion of it as it may confuse people. Fixes: ac5a72ea5c89 ("bpf: Use dedicated bpf_trace_printk event instead of trace_printk()") Signed-off-by: Alan Maguire --- Documentation/bpf/bpf_design_QA.rst | 11 --- 1 file changed, 11 deletions(-) diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst index 12a246f..2df7b06 100644 --- a/Documentation/bpf/bpf_design_QA.rst +++ b/Documentation/bpf/bpf_design_QA.rst @@ -246,17 +246,6 @@ program is loaded the kernel will print warning message, so this helper is only useful for experiments and prototypes. Tracing BPF programs are root only. -Q: bpf_trace_printk() helper warning - -Q: When bpf_trace_printk() helper is used the kernel prints nasty -warning message. Why is that? - -A: This is done to nudge program authors into better interfaces when -programs need to pass data to user space. Like bpf_perf_event_output() -can be used to efficiently stream data via perf ring buffer. -BPF maps can be used for asynchronous data sharing between kernel -and user space. bpf_trace_printk() should only be used for debugging. - Q: New functionality via kernel modules? Q: Can BPF functionality such as new program or map types, new -- 1.8.3.1
[RFC PATCH bpf-next 4/4] selftests/bpf: add bpf_trace_btf helper tests
Basic tests verifying various flag combinations for bpf_trace_btf() using a tp_btf program to trace skb data. Signed-off-by: Alan Maguire --- tools/testing/selftests/bpf/prog_tests/trace_btf.c | 45 ++ .../selftests/bpf/progs/netif_receive_skb.c| 43 + 2 files changed, 88 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_btf.c b/tools/testing/selftests/bpf/prog_tests/trace_btf.c new file mode 100644 index 000..e64b69d --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_btf.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +#include "netif_receive_skb.skel.h" + +void test_trace_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 10 127.0.0.1"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran. 
+*/ + if (CHECK(bss->ret <= 0, + "bpf_trace_btf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->num_subtests)) + goto cleanup; + + CHECK(bss->num_subtests != bss->ran_subtests, "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests); + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..cab764e --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,43 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; + +#define CHECK_TRACE(_p, flags) \ + do { \ + ++num_subtests; \ + if (ret >= 0) { \ + ++ran_subtests; \ + ret = bpf_trace_btf(_p, sizeof(*(_p)), 0, flags);\ + }\ + } while (0) + +/* TRACE_EVENT(netif_receive_skb, + * TP_PROTO(struct sk_buff *skb), + */ +SEC("tp_btf/netif_receive_skb") +int BPF_PROG(trace_netif_receive_skb, struct sk_buff *skb) +{ + static const char skb_type[] = "struct sk_buff"; + static struct btf_ptr p = { }; + + p.ptr = skb; + p.type = skb_type; + + CHECK_TRACE(&p, 0); + CHECK_TRACE(&p, BTF_TRACE_F_COMPACT); + CHECK_TRACE(&p, BTF_TRACE_F_NONAME); + CHECK_TRACE(&p, BTF_TRACE_F_PTR_RAW); + CHECK_TRACE(&p, BTF_TRACE_F_ZERO); + CHECK_TRACE(&p, BTF_TRACE_F_COMPACT | BTF_TRACE_F_NONAME | + BTF_TRACE_F_PTR_RAW | BTF_TRACE_F_ZERO); + + return 0; +} -- 1.8.3.1
[RFC PATCH bpf-next 3/4] bpf: add bpf_trace_btf helper
f_trace_printk: }, -0 [023] d.s. 1825.778448: bpf_trace_printk: } Flags modifying display are as follows: - BTF_TRACE_F_COMPACT: no formatting around type information - BTF_TRACE_F_NONAME: no struct/union member names/types - BTF_TRACE_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px. - BTF_TRACE_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire --- include/linux/bpf.h| 1 + include/linux/btf.h| 9 ++-- include/uapi/linux/bpf.h | 63 + kernel/bpf/core.c | 5 ++ kernel/bpf/helpers.c | 4 ++ kernel/trace/bpf_trace.c | 102 - scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 63 + 8 files changed, 243 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 6143b6e..f67819d 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -934,6 +934,7 @@ struct bpf_event_entry { const char *kernel_type_name(u32 btf_type_id); const struct bpf_func_proto *bpf_get_trace_printk_proto(void); +const struct bpf_func_proto *bpf_get_trace_btf_proto(void); typedef unsigned long (*bpf_ctx_copy_t)(void *dst, const void *src, unsigned long off, unsigned long len); diff --git a/include/linux/btf.h b/include/linux/btf.h index 46bf9f4..3d31e28 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@ #include #include +#include #define BTF_TYPE_EMIT(type) ((void)(type *)0) @@ -61,10 +62,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. 
*/ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME(1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_TRACE_F_COMPACT +#define BTF_SHOW_NONAMEBTF_TRACE_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_TRACE_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_TRACE_F_ZERO #define BTF_SHOW_NONEWLINE (1ULL << 32) #define BTF_SHOW_UNSAFE(1ULL << 33) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index b134e67..726fee4 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3394,6 +3394,36 @@ struct bpf_stack_build_id { * A non-negative value equal to or less than *size* on success, * or a negative error in case of failure. * + * long bpf_trace_btf(struct btf_ptr *ptr, u32 btf_ptr_size, u32 trace_id, u64 flags) + * Description + * Utilize BTF to trace a representation of *ptr*->ptr, using + * *ptr*->type name or *ptr*->type_id. *ptr*->type_name + * should specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * bpf_trace_printk()ed. Safe copy of the pointer data is + * carried out to avoid kernel crashes during data display. + * Tracing specifies *trace_id* as the id associated with the + * trace event; this can be used to filter trace events + * to show a subset of all traced output, helping to avoid + * the situation where BTF output is intermixed with other + * output. + * + * *flags* is a combination of + * + * **BTF_TRACE_F_COMPACT** + * no formatting around type information + * **BTF_TRACE_F_NONAME** + * no struct/union member names/types + * **BTF_TRACE_F_PTR_RAW** + * show raw (unobfuscated) pointer values; + * equivalent to printk specifier %px. + * **BTF_TRACE_F_ZERO** + * show zero-valued struct/union members; they + * are not displayed by default + * + * Return + * The number of bytes traced, or a negative error in cases of + * failure. 
*/ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3538,6 +3568,7 @@ struct bpf_stack_build_id { FN(skc_to_tcp_request_sock),\ FN(skc_to_udp6_sock), \ FN(get_task_stack), \ + FN(trace_btf), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -4446,4 +4477,36 @@ struct bpf_sk_lookup { __u32 local_port; /* Host byte order */ }; +/* + * struct btf_ptr is used for typed pointer display; the + * additional type string/BTF type id are used to render
[RFC PATCH bpf-next 0/4] bpf: add bpf-based bpf_trace_printk()-like support
large amounts of data using a complex mechanism such as BTF traversal, but still provides a way for the display of such data to be achieved via BPF programs. Future work could include a bpf_printk_btf() function to invoke display via printk() where the elements of a data structure are printk()ed one at a time. Thanks to Petr Mladek, Andy Shevchenko and Rasmus Villemoes who took time to look at the earlier printk() format-specifier-focused version of this and provided feedback clarifying the problems with that approach. - Added trace id to the bpf_trace_printk events as a means of separating output from standard bpf_trace_printk() events, ensuring it can be easily parsed by the reader. - Added bpf_trace_btf() helper tests which do simple verification of the various display options. Changes since v2: - Alexei and Yonghong suggested it would be good to use probe_kernel_read() on to-be-shown data to ensure safety during operation. Safe copy via probe_kernel_read() to a buffer object in "struct btf_show" is used to support this. A few different approaches were explored, including dynamic allocation and per-cpu buffers. The downside of dynamic allocation is that it would be done during BPF program execution for bpf_trace_printk()s using %pT format specifiers. The problem with per-cpu buffers is we'd have to manage preemption, and since the display of an object occurs over an extended period and in printk context, where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. Utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. 
Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2). - safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option. - pointers are prefixed with 0x for clarity (Alexei, patch 2) - added additional comments and explanations around BTF show code, especially around determining whether objects are zeroed. Also tried to comment the safe object scheme used. (Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8) Changes since v1: - changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifier so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. 
- removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. - added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Alan Maguire (4): bpf: provide function to get vmlinux BTF information
[RFC PATCH bpf-next 2/4] bpf: make BTF show support generic, apply to seq files/bpf_trace_printk
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support three instances; - the current seq file show - a show which triggers the bpf_trace/bpf_trace_printk tracepoint for each portion of the data displayed Both classes of show function call btf_type_show() with different targets: - for seq_show, the seq file is the target - for bpf_trace_printk(), no target is needed. In the tracing case we need to also track additional data - length of data written specifically for the return value. By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONEWLINE - suppress newline only. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). The bpf_trace_printk tracepoint is augmented with a "trace_id" field; it is used to allow tracepoint filtering as typed display information can easily be interspersed with other tracing data, making it hard to read. Specifying a trace_id will allow users to selectively trace data, eliminating noise. 
Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 + include/linux/btf.h | 37 ++ kernel/bpf/btf.c | 962 ++- kernel/trace/bpf_trace.c | 19 +- kernel/trace/bpf_trace.h | 6 +- 5 files changed, 916 insertions(+), 110 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 55eb67d..6143b6e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -946,6 +946,8 @@ typedef u32 (*bpf_convert_ctx_access_t)(enum bpf_access_type type, u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size, void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy); +int bpf_trace_vprintk(__u32 trace_id, const char *fmt, va_list ap); + /* an array of programs to be executed under rcu_lock. * * Typical usage: diff --git a/include/linux/btf.h b/include/linux/btf.h index 8b81fbb..46bf9f4 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,44 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. + * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_NONEWLINE: include indent, but suppress newline; + * to be used when a show function implicitly includes a newline. + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. 
+ */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_NONEWLINE (1ULL << 32) +#define BTF_SHOW_UNSAFE(1ULL << 33) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Trace string representation of obj of BTF type_id. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_trace_show(const struct btf *btf, u32 type_id, void *obj, + u32 trace_id, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 91afdd4..be47304 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -282,6 +282,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't
[RFC PATCH bpf-next 1/4] bpf: provide function to get vmlinux BTF information
It will be used later for BPF structure display support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index cef4ef0..55eb67d 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1290,6 +1290,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index b6ccfce..05dfc41 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11064,6 +11064,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11097,12 +11108,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
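The hunk moved into bpf_get_btf_vmlinux() is the usual check/lock/re-check lazy initializer: the unlocked fast path avoids taking bpf_verifier_lock once btf_vmlinux is set, and the re-check under the lock ensures btf_parse_vmlinux() runs at most once. A userspace sketch of the same shape, with a pthread mutex standing in for bpf_verifier_lock (this single-threaded demo sidesteps the memory-ordering subtleties of double-checked locking; all names are stand-ins):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-ins for btf_vmlinux, btf_parse_vmlinux() and bpf_verifier_lock. */
struct btf { int id; };

static struct btf *btf_vmlinux;
static pthread_mutex_t verifier_lock = PTHREAD_MUTEX_INITIALIZER;
static int parse_calls; /* counts how often the expensive parse runs */

static struct btf *parse_vmlinux(void)
{
	static struct btf vmlinux_btf = { 1 };

	parse_calls++;
	return &vmlinux_btf;
}

/* Same shape as bpf_get_btf_vmlinux(): unlocked fast-path check, then
 * re-check under the lock before doing the expensive parse. */
static struct btf *get_btf_vmlinux(void)
{
	if (!btf_vmlinux) {
		pthread_mutex_lock(&verifier_lock);
		if (!btf_vmlinux)
			btf_vmlinux = parse_vmlinux();
		pthread_mutex_unlock(&verifier_lock);
	}
	return btf_vmlinux;
}
```

Repeated callers always get the same pointer, and the parse happens exactly once, which is what makes the function safe to call from the later display paths as well as the verifier.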
[PATCH v3 bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
The bpf helper bpf_trace_printk() uses trace_printk() under the hood. This leads to an alarming warning message originating from trace buffer allocation which occurs the first time a program using bpf_trace_printk() is loaded. We can instead create a trace event for bpf_trace_printk() and enable it in-kernel when/if we encounter a program using the bpf_trace_printk() helper. With this approach, trace_printk() is not used directly and no warning message appears. This work was started by Steven (see Link) and finished by Alan; added Steven's Signed-off-by with his permission. Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Alan Maguire Acked-by: Andrii Nakryiko --- kernel/trace/Makefile| 2 ++ kernel/trace/bpf_trace.c | 42 +- kernel/trace/bpf_trace.h | 34 ++ 3 files changed, 73 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index 6575bb0..aeba5ee 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -31,6 +31,8 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE GCOV_PROFILE := y endif +CFLAGS_bpf_trace.o := -I$(src) + CFLAGS_trace_benchmark.o := -I$(src) CFLAGS_trace_events_filter.o := -I$(src) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index e0b7775..0a2716d 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -19,6 +20,9 @@ #include "trace_probe.h" #include "trace.h" +#define CREATE_TRACE_POINTS +#include "bpf_trace.h" + #define bpf_event_rcu_dereference(p) \ rcu_dereference_protected(p, lockdep_is_held(&bpf_event_mutex)) @@ -374,6 +378,30 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +static DEFINE_RAW_SPINLOCK(trace_printk_lock); + +#define BPF_TRACE_PRINTK_SIZE 1024 + +static inline __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...) 
+{ + static char buf[BPF_TRACE_PRINTK_SIZE]; + unsigned long flags; + va_list ap; + int ret; + + raw_spin_lock_irqsave(&trace_printk_lock, flags); + va_start(ap, fmt); + ret = vsnprintf(buf, sizeof(buf), fmt, ap); + va_end(ap); + /* vsnprintf() will not append null for zero-length strings */ + if (ret == 0) + buf[0] = '\0'; + trace_bpf_trace_printk(buf); + raw_spin_unlock_irqrestore(&trace_printk_lock, flags); + + return ret; +} + /* * Only limited trace_printk() conversion specifiers allowed: * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s @@ -483,8 +511,7 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, */ #define __BPF_TP_EMIT()__BPF_ARG3_TP() #define __BPF_TP(...) \ - __trace_printk(0 /* Fake ip */, \ - fmt, ##__VA_ARGS__) + bpf_do_trace_printk(fmt, ##__VA_ARGS__) #define __BPF_ARG1_TP(...) \ ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ @@ -521,10 +548,15 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, const struct bpf_func_proto *bpf_get_trace_printk_proto(void) { /* -* this program might be calling bpf_trace_printk, -* so allocate per-cpu printk buffers +* This program might be calling bpf_trace_printk, +* so enable the associated bpf_trace/bpf_trace_printk event. +* Repeat this each time as it is possible a user has +* disabled bpf_trace_printk events. By loading a program +* calling bpf_trace_printk() however the user has expressed +* the intent to see such events. 
*/ - trace_printk_init_buffers(); + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) + pr_warn_ratelimited("could not enable bpf_trace_printk events"); return &bpf_trace_printk_proto; } diff --git a/kernel/trace/bpf_trace.h b/kernel/trace/bpf_trace.h new file mode 100644 index 000..9acbc11 --- /dev/null +++ b/kernel/trace/bpf_trace.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM bpf_trace + +#if !defined(_TRACE_BPF_TRACE_H) || defined(TRACE_HEADER_MULTI_READ) + +#define _TRACE_BPF_TRACE_H + +#include + +TRACE_EVENT(bpf_trace_printk, + + TP_PROTO(const char *bpf_string), + + TP_ARGS(bpf_string), + + TP_STRUCT__entry( + __string(bpf_string, bpf_string) + ), + + TP_fast_assign( + __assign_str
[PATCH v3 bpf-next 0/2] bpf: fix use of trace_printk() in BPF
Steven suggested a way to resolve the appearance of the warning banner that appears as a result of using trace_printk() in BPF [1]. Applying the patch and testing reveals all works as expected; we can call bpf_trace_printk() and see the trace messages in /sys/kernel/debug/tracing/trace_pipe and no banner message appears. Also add a test prog to verify basic bpf_trace_printk() helper behaviour. Changes since v2: - fixed stray newline in bpf_trace_printk(), use sizeof(buf) rather than #defined value in vsnprintf() (Daniel, patch 1) - Daniel also pointed out that vsnprintf() returns 0 on error rather than a negative value; also turns out that a null byte is not appended if the length of the string written is zero, so to fix for cases where the string to be traced is zero length we set the null byte explicitly (Daniel, patch 1) - switch to using getline() for retrieving lines from trace buffer to ensure we don't read a portion of the search message in one read() operation and then fail to find it (Andrii, patch 2) Changes since v1: - reorder header inclusion in bpf_trace.c (Steven, patch 1) - trace zero-length messages also (Andrii, patch 1) - use a raw spinlock to ensure there are no issues for PREEMPT_RT kernels when using bpf_trace_printk() within other raw spinlocks (Steven, patch 1) - always enable bpf_trace_printk() tracepoint when loading programs using bpf_trace_printk() as this will ensure that a user disabling that tracepoint will not prevent tracing output from being logged (Steven, patch 1) - use "tp/raw_syscalls/sys_enter" and a usleep(1) to trigger events in the selftest ensuring test runs faster (Andrii, patch 2) [1] https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Alan Maguire (2): bpf: use dedicated bpf_trace_printk event instead of trace_printk() selftests/bpf: add selftests verifying bpf_trace_printk() behaviour kernel/trace/Makefile | 2 + kernel/trace/bpf_trace.c | 42 ++-- kernel/trace/bpf_trace.h | 34 ++ 
.../selftests/bpf/prog_tests/trace_printk.c| 75 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 5 files changed, 169 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c -- 1.8.3.1
[PATCH v3 bpf-next 2/2] selftests/bpf: add selftests verifying bpf_trace_printk() behaviour
Simple selftests that verifies bpf_trace_printk() returns a sensible value and tracing messages appear. Signed-off-by: Alan Maguire Acked-by: Andrii Nakryiko --- .../selftests/bpf/prog_tests/trace_printk.c| 75 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 2 files changed, 96 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk.c b/tools/testing/selftests/bpf/prog_tests/trace_printk.c new file mode 100644 index 000..39b0dec --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk.c @@ -0,0 +1,75 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include + +#include "trace_printk.skel.h" + +#define TRACEBUF "/sys/kernel/debug/tracing/trace_pipe" +#define SEARCHMSG "testing,testing" + +void test_trace_printk(void) +{ + int err, iter = 0, duration = 0, found = 0; + struct trace_printk__bss *bss; + struct trace_printk *skel; + char *buf = NULL; + FILE *fp = NULL; + size_t buflen; + + skel = trace_printk__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = trace_printk__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = trace_printk__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + fp = fopen(TRACEBUF, "r"); + if (CHECK(fp == NULL, "could not open trace buffer", + "error %d opening %s", errno, TRACEBUF)) + goto cleanup; + + /* We do not want to wait forever if this test fails... 
*/ + fcntl(fileno(fp), F_SETFL, O_NONBLOCK); + + /* wait for tracepoint to trigger */ + usleep(1); + trace_printk__detach(skel); + + if (CHECK(bss->trace_printk_ran == 0, + "bpf_trace_printk never ran", + "ran == %d", bss->trace_printk_ran)) + goto cleanup; + + if (CHECK(bss->trace_printk_ret <= 0, + "bpf_trace_printk returned <= 0 value", + "got %d", bss->trace_printk_ret)) + goto cleanup; + + /* verify our search string is in the trace buffer */ + while (getline(&buf, &buflen, fp) >= 0 || errno == EAGAIN) { + if (strstr(buf, SEARCHMSG) != NULL) + found++; + if (found == bss->trace_printk_ran) + break; + if (++iter > 1000) + break; + } + + if (CHECK(!found, "message from bpf_trace_printk not found", + "no instance of %s in %s", SEARCHMSG, TRACEBUF)) + goto cleanup; + +cleanup: + trace_printk__destroy(skel); + free(buf); + if (fp) + fclose(fp); +} diff --git a/tools/testing/selftests/bpf/progs/trace_printk.c b/tools/testing/selftests/bpf/progs/trace_printk.c new file mode 100644 index 000..8ca7f39 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/trace_printk.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2020, Oracle and/or its affiliates. + +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int trace_printk_ret = 0; +int trace_printk_ran = 0; + +SEC("tp/raw_syscalls/sys_enter") +int sys_enter(void *ctx) +{ + static const char fmt[] = "testing,testing %d\n"; + + trace_printk_ret = bpf_trace_printk(fmt, sizeof(fmt), + ++trace_printk_ran); + return 0; +} -- 1.8.3.1
[PATCH v2 bpf-next 0/2] bpf: fix use of trace_printk() in BPF
Steven suggested a way to resolve the appearance of the warning banner that appears as a result of using trace_printk() in BPF [1]. Applying the patch and testing reveals all works as expected; we can call bpf_trace_printk() and see the trace messages in /sys/kernel/debug/tracing/trace_pipe and no banner message appears. Also add a test prog to verify basic bpf_trace_printk() helper behaviour. Changes since v1: - reorder header inclusion in bpf_trace.c (Steven, patch 1) - trace zero-length messages also (Andrii, patch 1) - use a raw spinlock to ensure there are no issues for PREEMPT_RT kernels when using bpf_trace_printk() within other raw spinlocks (Steven, patch 1) - always enable bpf_trace_printk() tracepoint when loading programs using bpf_trace_printk() as this will ensure that a user disabling that tracepoint will not prevent tracing output from being logged (Steven, patch 1) - use "tp/raw_syscalls/sys_enter" and a usleep(1) to trigger events in the selftest ensuring test runs faster (Andrii, patch 2) [1] https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Alan Maguire (2): bpf: use dedicated bpf_trace_printk event instead of trace_printk() selftests/bpf: add selftests verifying bpf_trace_printk() behaviour kernel/trace/Makefile | 2 + kernel/trace/bpf_trace.c | 41 ++-- kernel/trace/bpf_trace.h | 34 ++ .../selftests/bpf/prog_tests/trace_printk.c| 74 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 5 files changed, 167 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c -- 1.8.3.1
[PATCH v2 bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
The bpf helper bpf_trace_printk() uses trace_printk() under the hood. This leads to an alarming warning message originating from trace buffer allocation which occurs the first time a program using bpf_trace_printk() is loaded. We can instead create a trace event for bpf_trace_printk() and enable it in-kernel when/if we encounter a program using the bpf_trace_printk() helper. With this approach, trace_printk() is not used directly and no warning message appears. This work was started by Steven (see Link) and finished by Alan; added Steven's Signed-off-by with his permission. Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Alan Maguire --- kernel/trace/Makefile| 2 ++ kernel/trace/bpf_trace.c | 41 - kernel/trace/bpf_trace.h | 34 ++ 3 files changed, 72 insertions(+), 5 deletions(-) create mode 100644 kernel/trace/bpf_trace.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index 6575bb0..aeba5ee 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -31,6 +31,8 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE GCOV_PROFILE := y endif +CFLAGS_bpf_trace.o := -I$(src) + CFLAGS_trace_benchmark.o := -I$(src) CFLAGS_trace_events_filter.o := -I$(src) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 1d874d8..1414bf5 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -19,6 +20,9 @@ #include "trace_probe.h" #include "trace.h" +#define CREATE_TRACE_POINTS +#include "bpf_trace.h" + #define bpf_event_rcu_dereference(p) \ rcu_dereference_protected(p, lockdep_is_held(&bpf_event_mutex)) @@ -374,6 +378,29 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +static DEFINE_RAW_SPINLOCK(trace_printk_lock); + +#define BPF_TRACE_PRINTK_SIZE 1024 + + +static inline __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...) 
+{ + static char buf[BPF_TRACE_PRINTK_SIZE]; + unsigned long flags; + va_list ap; + int ret; + + raw_spin_lock_irqsave(&trace_printk_lock, flags); + va_start(ap, fmt); + ret = vsnprintf(buf, BPF_TRACE_PRINTK_SIZE, fmt, ap); + va_end(ap); + if (ret >= 0) + trace_bpf_trace_printk(buf); + raw_spin_unlock_irqrestore(&trace_printk_lock, flags); + + return ret; +} + /* * Only limited trace_printk() conversion specifiers allowed: * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s @@ -483,8 +510,7 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, */ #define __BPF_TP_EMIT()__BPF_ARG3_TP() #define __BPF_TP(...) \ - __trace_printk(0 /* Fake ip */, \ - fmt, ##__VA_ARGS__) + bpf_do_trace_printk(fmt, ##__VA_ARGS__) #define __BPF_ARG1_TP(...) \ ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ @@ -521,10 +547,15 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, const struct bpf_func_proto *bpf_get_trace_printk_proto(void) { /* -* this program might be calling bpf_trace_printk, -* so allocate per-cpu printk buffers +* This program might be calling bpf_trace_printk, +* so enable the associated bpf_trace/bpf_trace_printk event. +* Repeat this each time as it is possible a user has +* disabled bpf_trace_printk events. By loading a program +* calling bpf_trace_printk() however the user has expressed +* the intent to see such events. 
*/ - trace_printk_init_buffers(); + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) + pr_warn_ratelimited("could not enable bpf_trace_printk events"); return &bpf_trace_printk_proto; } diff --git a/kernel/trace/bpf_trace.h b/kernel/trace/bpf_trace.h new file mode 100644 index 000..9acbc11 --- /dev/null +++ b/kernel/trace/bpf_trace.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM bpf_trace + +#if !defined(_TRACE_BPF_TRACE_H) || defined(TRACE_HEADER_MULTI_READ) + +#define _TRACE_BPF_TRACE_H + +#include + +TRACE_EVENT(bpf_trace_printk, + + TP_PROTO(const char *bpf_string), + + TP_ARGS(bpf_string), + + TP_STRUCT__entry( + __string(bpf_string, bpf_string) + ), + + TP_fast_assign( + __assign_str(bpf_string, bpf_string); + ), + + TP_printk("%s", __get_str(bpf_string)) +); + +#endif /* _TRACE
[PATCH v2 bpf-next 2/2] selftests/bpf: add selftests verifying bpf_trace_printk() behaviour
Simple selftests that verifies bpf_trace_printk() returns a sensible value and tracing messages appear. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/trace_printk.c| 74 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 ++ 2 files changed, 95 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk.c b/tools/testing/selftests/bpf/prog_tests/trace_printk.c new file mode 100644 index 000..25dd0f47 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include + +#include "trace_printk.skel.h" + +#define TRACEBUF "/sys/kernel/debug/tracing/trace_pipe" +#define SEARCHMSG "testing,testing" + +void test_trace_printk(void) +{ + int err, iter = 0, duration = 0, found = 0, fd = -1; + struct trace_printk__bss *bss; + struct trace_printk *skel; + char buf[1024]; + + skel = trace_printk__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = trace_printk__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = trace_printk__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + fd = open(TRACEBUF, O_RDONLY); + if (CHECK(fd < 0, "could not open trace buffer", + "error %d opening %s", errno, TRACEBUF)) + goto cleanup; + + /* We do not want to wait forever if this test fails... 
*/ + fcntl(fd, F_SETFL, O_NONBLOCK); + + /* wait for tracepoint to trigger */ + usleep(1); + trace_printk__detach(skel); + + if (CHECK(bss->trace_printk_ran == 0, + "bpf_trace_printk never ran", + "ran == %d", bss->trace_printk_ran)) + goto cleanup; + + if (CHECK(bss->trace_printk_ret <= 0, + "bpf_trace_printk returned <= 0 value", + "got %d", bss->trace_printk_ret)) + goto cleanup; + + /* verify our search string is in the trace buffer */ + while (read(fd, buf, sizeof(buf)) >= 0 || errno == EAGAIN) { + if (strstr(buf, SEARCHMSG) != NULL) + found++; + if (found == bss->trace_printk_ran) + break; + if (++iter > 1000) + break; + } + + if (CHECK(!found, "message from bpf_trace_printk not found", + "no instance of %s in %s", SEARCHMSG, TRACEBUF)) + goto cleanup; + + printf("ran %d times; last return value %d, with %d instances of msg\n", + bss->trace_printk_ran, bss->trace_printk_ret, found); +cleanup: + trace_printk__destroy(skel); + if (fd != -1) + close(fd); +} diff --git a/tools/testing/selftests/bpf/progs/trace_printk.c b/tools/testing/selftests/bpf/progs/trace_printk.c new file mode 100644 index 000..8ca7f39 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/trace_printk.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2020, Oracle and/or its affiliates. + +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int trace_printk_ret = 0; +int trace_printk_ran = 0; + +SEC("tp/raw_syscalls/sys_enter") +int sys_enter(void *ctx) +{ + static const char fmt[] = "testing,testing %d\n"; + + trace_printk_ret = bpf_trace_printk(fmt, sizeof(fmt), + ++trace_printk_ran); + return 0; +} -- 1.8.3.1
Re: [PATCH bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
On Tue, 7 Jul 2020, Andrii Nakryiko wrote: > On Fri, Jul 3, 2020 at 7:47 AM Alan Maguire wrote: > > > > The bpf helper bpf_trace_printk() uses trace_printk() under the hood. > > This leads to an alarming warning message originating from trace > > buffer allocation which occurs the first time a program using > > bpf_trace_printk() is loaded. > > > > We can instead create a trace event for bpf_trace_printk() and enable > > it in-kernel when/if we encounter a program using the > > bpf_trace_printk() helper. With this approach, trace_printk() > > is not used directly and no warning message appears. > > > > This work was started by Steven (see Link) and finished by Alan; added > > Steven's Signed-off-by with his permission. > > > > Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home > > Signed-off-by: Steven Rostedt (VMware) > > Signed-off-by: Alan Maguire > > --- > > kernel/trace/Makefile| 2 ++ > > kernel/trace/bpf_trace.c | 41 + > > kernel/trace/bpf_trace.h | 34 ++ > > 3 files changed, 73 insertions(+), 4 deletions(-) > > create mode 100644 kernel/trace/bpf_trace.h > > > > [...] > > > +static DEFINE_SPINLOCK(trace_printk_lock); > > + > > +#define BPF_TRACE_PRINTK_SIZE 1024 > > + > > +static inline int bpf_do_trace_printk(const char *fmt, ...) > > +{ > > + static char buf[BPF_TRACE_PRINTK_SIZE]; > > + unsigned long flags; > > + va_list ap; > > + int ret; > > + > > + spin_lock_irqsave(&trace_printk_lock, flags); > > + va_start(ap, fmt); > > + ret = vsnprintf(buf, BPF_TRACE_PRINTK_SIZE, fmt, ap); > > + va_end(ap); > > + if (ret > 0) > > + trace_bpf_trace_printk(buf); > > Is there any reason to artificially limit the case of printing empty > string? It's kind of an awkward use case, for sure, but having > guarantee that every bpf_trace_printk() invocation triggers tracepoint > is a nice property, no? > True enough; I'll modify the above to support empty string display also. 
> > + spin_unlock_irqrestore(&trace_printk_lock, flags); > > + > > + return ret; > > +} > > + > > /* > > * Only limited trace_printk() conversion specifiers allowed: > > * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s > > @@ -483,8 +510,7 @@ static void bpf_trace_copy_string(char *buf, void > > *unsafe_ptr, char fmt_ptype, > > */ > > #define __BPF_TP_EMIT()__BPF_ARG3_TP() > > #define __BPF_TP(...) \ > > - __trace_printk(0 /* Fake ip */, \ > > - fmt, ##__VA_ARGS__) > > + bpf_do_trace_printk(fmt, ##__VA_ARGS__) > > > > #define __BPF_ARG1_TP(...) \ > > ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ > > @@ -518,13 +544,20 @@ static void bpf_trace_copy_string(char *buf, void > > *unsafe_ptr, char fmt_ptype, > > .arg2_type = ARG_CONST_SIZE, > > }; > > > > +int bpf_trace_printk_enabled; > > static? > oops, will fix. > > + > > const struct bpf_func_proto *bpf_get_trace_printk_proto(void) > > { > > /* > > * this program might be calling bpf_trace_printk, > > -* so allocate per-cpu printk buffers > > +* so enable the associated bpf_trace/bpf_trace_printk event. > > */ > > - trace_printk_init_buffers(); > > + if (!bpf_trace_printk_enabled) { > > + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) > > just to double check, it's ok to simultaneously enable same event in > parallel, right? > From an ftrace perspective, it looks fine since the actual enable is mutex-protected. We could grab the trace_printk_lock here too I guess, but I don't _think_ there's a need. Thanks for reviewing! I'll spin up a v2 with the above fixes shortly plus I'll change to using tp/raw_syscalls/sys_enter in the test as you suggested. Alan
[PATCH bpf-next 1/2] bpf: use dedicated bpf_trace_printk event instead of trace_printk()
The bpf helper bpf_trace_printk() uses trace_printk() under the hood. This leads to an alarming warning message originating from trace buffer allocation which occurs the first time a program using bpf_trace_printk() is loaded. We can instead create a trace event for bpf_trace_printk() and enable it in-kernel when/if we encounter a program using the bpf_trace_printk() helper. With this approach, trace_printk() is not used directly and no warning message appears. This work was started by Steven (see Link) and finished by Alan; added Steven's Signed-off-by with his permission. Link: https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Alan Maguire --- kernel/trace/Makefile| 2 ++ kernel/trace/bpf_trace.c | 41 + kernel/trace/bpf_trace.h | 34 ++ 3 files changed, 73 insertions(+), 4 deletions(-) create mode 100644 kernel/trace/bpf_trace.h diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile index 6575bb0..aeba5ee 100644 --- a/kernel/trace/Makefile +++ b/kernel/trace/Makefile @@ -31,6 +31,8 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE GCOV_PROFILE := y endif +CFLAGS_bpf_trace.o := -I$(src) + CFLAGS_trace_benchmark.o := -I$(src) CFLAGS_trace_events_filter.o := -I$(src) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 1d874d8..cdbafc4 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2,6 +2,10 @@ /* Copyright (c) 2011-2015 PLUMgrid, http://plumgrid.com * Copyright (c) 2016 Facebook */ +#define CREATE_TRACE_POINTS + +#include "bpf_trace.h" + #include #include #include @@ -11,6 +15,7 @@ #include #include #include +#include #include #include @@ -374,6 +379,28 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +static DEFINE_SPINLOCK(trace_printk_lock); + +#define BPF_TRACE_PRINTK_SIZE 1024 + +static inline int bpf_do_trace_printk(const char *fmt, ...) 
+{ + static char buf[BPF_TRACE_PRINTK_SIZE]; + unsigned long flags; + va_list ap; + int ret; + + spin_lock_irqsave(&trace_printk_lock, flags); + va_start(ap, fmt); + ret = vsnprintf(buf, BPF_TRACE_PRINTK_SIZE, fmt, ap); + va_end(ap); + if (ret > 0) + trace_bpf_trace_printk(buf); + spin_unlock_irqrestore(&trace_printk_lock, flags); + + return ret; +} + /* * Only limited trace_printk() conversion specifiers allowed: * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s @@ -483,8 +510,7 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, */ #define __BPF_TP_EMIT()__BPF_ARG3_TP() #define __BPF_TP(...) \ - __trace_printk(0 /* Fake ip */, \ - fmt, ##__VA_ARGS__) + bpf_do_trace_printk(fmt, ##__VA_ARGS__) #define __BPF_ARG1_TP(...) \ ((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))\ @@ -518,13 +544,20 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, .arg2_type = ARG_CONST_SIZE, }; +int bpf_trace_printk_enabled; + const struct bpf_func_proto *bpf_get_trace_printk_proto(void) { /* * this program might be calling bpf_trace_printk, -* so allocate per-cpu printk buffers +* so enable the associated bpf_trace/bpf_trace_printk event. 
*/ - trace_printk_init_buffers(); + if (!bpf_trace_printk_enabled) { + if (trace_set_clr_event("bpf_trace", "bpf_trace_printk", 1)) + pr_warn_ratelimited("could not enable bpf_trace_printk events"); + else + bpf_trace_printk_enabled = 1; + } return &bpf_trace_printk_proto; } diff --git a/kernel/trace/bpf_trace.h b/kernel/trace/bpf_trace.h new file mode 100644 index 000..9acbc11 --- /dev/null +++ b/kernel/trace/bpf_trace.h @@ -0,0 +1,34 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM bpf_trace + +#if !defined(_TRACE_BPF_TRACE_H) || defined(TRACE_HEADER_MULTI_READ) + +#define _TRACE_BPF_TRACE_H + +#include + +TRACE_EVENT(bpf_trace_printk, + + TP_PROTO(const char *bpf_string), + + TP_ARGS(bpf_string), + + TP_STRUCT__entry( + __string(bpf_string, bpf_string) + ), + + TP_fast_assign( + __assign_str(bpf_string, bpf_string); + ), + + TP_printk("%s", __get_str(bpf_string)) +); + +#endif /* _TRACE_BPF_TRACE_H */ + +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH . +#define TRACE_INCLUDE_FILE bpf_trace + +#include -- 1.8.3.1
[PATCH bpf-next 0/2] bpf: fix use of trace_printk() in BPF
Steven suggested a way to resolve the appearance of the warning banner that appears as a result of using trace_printk() in BPF [1]. Applying the patch and testing reveals all works as expected; we can call bpf_trace_printk() and see the trace messages in /sys/kernel/debug/tracing/trace_pipe and no banner message appears. Also add a test prog to verify basic bpf_trace_printk() helper behaviour. Possible future work: ftrace supports trace instances, and one thing that strikes me is that we could make use of these in BPF to separate BPF program bpf_trace_printk() output from output of other tracing activities. I was thinking something like a sysctl net.core.bpf_trace_instance, defaulting to an empty value signifying we use the root trace instance. This would preserve existing behaviour while giving a way to separate BPF tracing output from other tracing output if wanted. [1] https://lore.kernel.org/r/20200628194334.6238b...@oasis.local.home Alan Maguire (2): bpf: use dedicated bpf_trace_printk event instead of trace_printk() selftests/bpf: add selftests verifying bpf_trace_printk() behaviour kernel/trace/Makefile | 2 + kernel/trace/bpf_trace.c | 41 +++-- kernel/trace/bpf_trace.h | 34 +++ .../selftests/bpf/prog_tests/trace_printk.c| 71 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 +++ 5 files changed, 165 insertions(+), 4 deletions(-) create mode 100644 kernel/trace/bpf_trace.h create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c -- 1.8.3.1
[PATCH bpf-next 2/2] selftests/bpf: add selftests verifying bpf_trace_printk() behaviour
Simple selftest that verifies bpf_trace_printk() returns a sensible value and tracing messages appear. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/trace_printk.c| 71 ++ tools/testing/selftests/bpf/progs/trace_printk.c | 21 +++ 2 files changed, 92 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk.c create mode 100644 tools/testing/selftests/bpf/progs/trace_printk.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk.c b/tools/testing/selftests/bpf/prog_tests/trace_printk.c new file mode 100644 index 000..a850cba --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk.c @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include + +#include "trace_printk.skel.h" + +#define TRACEBUF "/sys/kernel/debug/tracing/trace_pipe" +#define SEARCHMSG "testing,testing" + +void test_trace_printk(void) +{ + int err, duration = 0, found = 0; + struct trace_printk *skel; + struct trace_printk__bss *bss; + char buf[1024]; + int fd = -1; + + skel = trace_printk__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = trace_printk__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = trace_printk__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + fd = open(TRACEBUF, O_RDONLY); + if (CHECK(fd < 0, "could not open trace buffer", + "error %d opening %s", errno, TRACEBUF)) + goto cleanup; + + /* We do not want to wait forever if this test fails... 
*/ + fcntl(fd, F_SETFL, O_NONBLOCK); + + /* wait for tracepoint to trigger */ + sleep(1); + trace_printk__detach(skel); + + if (CHECK(bss->trace_printk_ran == 0, + "bpf_trace_printk never ran", + "ran == %d", bss->trace_printk_ran)) + goto cleanup; + + if (CHECK(bss->trace_printk_ret <= 0, + "bpf_trace_printk returned <= 0 value", + "got %d", bss->trace_printk_ret)) + goto cleanup; + + /* verify our search string is in the trace buffer */ + while (read(fd, buf, sizeof(buf)) >= 0) { + if (strstr(buf, SEARCHMSG) != NULL) + found++; + } + + if (CHECK(!found, "message from bpf_trace_printk not found", + "no instance of %s in %s", SEARCHMSG, TRACEBUF)) + goto cleanup; + + printf("ran %d times; last return value %d, with %d instances of msg\n", + bss->trace_printk_ran, bss->trace_printk_ret, found); +cleanup: + trace_printk__destroy(skel); + if (fd != -1) + close(fd); +} diff --git a/tools/testing/selftests/bpf/progs/trace_printk.c b/tools/testing/selftests/bpf/progs/trace_printk.c new file mode 100644 index 000..8ff6d49 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/trace_printk.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2020, Oracle and/or its affiliates. + +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int trace_printk_ret = 0; +int trace_printk_ran = 0; + +SEC("tracepoint/sched/sched_switch") +int sched_switch(void *ctx) +{ + static const char fmt[] = "testing,testing %d\n"; + + trace_printk_ret = bpf_trace_printk(fmt, sizeof(fmt), + ++trace_printk_ran); + return 0; +} -- 1.8.3.1
Re: linux-next: build failure after merge of the thunderbolt tree
On Tue, 30 Jun 2020, Stephen Rothwell wrote: > Hi all, > > After merging the thunderbolt tree, today's linux-next build (powerpc > allyesconfig) failed like this: > > > Caused by commit > > 54509f5005ca ("thunderbolt: Add KUnit tests for path walking") > > interacting with commit > > d4cdd146d0db ("kunit: generalize kunit_resource API beyond allocated > resources") > > from the kunit-next tree. > > I have applied the following merge fix patch. > > From: Stephen Rothwell > Date: Tue, 30 Jun 2020 15:51:50 +1000 > Subject: [PATCH] thunderbolt: merge fix for kunix_resource changes > > Signed-off-by: Stephen Rothwell Thanks Stephen, resolution looks good to me! If you need it Reviewed-by: Alan Maguire Once the kunit and thunderbolt trees are merged there may be some additional things we can do to simplify kunit resource utilization in the thunderbolt tests using the new kunit resource APIs; no hurry with that though. Nice to see the kunit resources code being used! Alan
Re: [PATCH v3 bpf-next 4/8] printk: add type-printing %pT format specifier which uses BTF
On Fri, 26 Jun 2020, Petr Mladek wrote: > On Tue 2020-06-23 13:07:07, Alan Maguire wrote: > > printk supports multiple pointer object type specifiers (printing > > netdev features etc). Extend this support using BTF to cover > > arbitrary types. "%pT" specifies the typed format, and the pointer > > argument is a "struct btf_ptr *" where struct btf_ptr is as follows: > > > > struct btf_ptr { > > void *ptr; > > const char *type; > > u32 id; > > }; > > > > Either the "type" string ("struct sk_buff") or the BTF "id" can be > > used to identify the type to use in displaying the associated "ptr" > > value. A convenience function to create and point at the struct > > is provided: > > > > printk(KERN_INFO "%pT", BTF_PTR_TYPE(skb, struct sk_buff)); > > > > When invoked, BTF information is used to traverse the sk_buff * > > and display it. Support is present for structs, unions, enums, > > typedefs and core types (though in the latter case there's not > > much value in using this feature of course). > > > > Default output is indented, but compact output can be specified > > via the 'c' option. Type names/member values can be suppressed > > using the 'N' option. Zero values are not displayed by default > > but can be using the '0' option. Pointer values are obfuscated > > unless the 'x' option is specified. As an example: > > > > struct sk_buff *skb = alloc_skb(64, GFP_KERNEL); > > pr_info("%pT", BTF_PTR_TYPE(skb, struct sk_buff)); > > > > ...gives us: > > > > (struct sk_buff){ > > .transport_header = (__u16)65535, > > .mac_header = (__u16)65535, > > .end = (sk_buff_data_t)192, > > .head = (unsigned char *)0x6b71155a, > > .data = (unsigned char *)0x6b71155a, > > .truesize = (unsigned int)768, > > .users = (refcount_t){ > > .refs = (atomic_t){ > >.counter = (int)1, > > }, > > }, > > .extensions = (struct skb_ext *)0xf486a130, > > } > > > > printk output is truncated at 1024 bytes. For cases where overflow > > is likely, the compact/no type names display modes may be used. 
> > Hmm, this scares me: > >1. The long message and many lines are going to stretch printk > design in another dimensions. > >2. vsprintf() is important for debugging the system. It has to be > stable. But the btf code is too complex. > Right on both points, and there's no way around that really. Representing even small data structures will stretch us to or beyond the 1024 byte limit. This can be mitigated by using compact display mode and not printing field names, but the output becomes hard to parse then. I think a better approach might be to start small, adding the core btf_show functionality to BPF, allowing consumers to use it there, perhaps via a custom helper. In the current model bpf_trace_printk() inherits the functionality to display data from core printk, so a different approach would be needed there. Other consumers outside of BPF could potentially avail of the show functionality directly via the btf_show functions in the future, but at least it would have one consumer at the outset, and wouldn't present problems like these for printk. > I would strongly prefer to keep this outside vsprintf and printk. > Please, invert the logic and convert it into using separate printk() > call for each printed line. > I think the above is in line with what you're suggesting? > > More details: > > Add 1: Long messages with many lines: > > IMHO, all existing printk() users are far below this limit. And this is > even worse because there are many short lines. They would require > double space to add prefixes (loglevel, timestamp, caller id) when > printing to console. > > You might argue that 1024bytes are enough for you. But for how long? > > Now, we have huge troubles to make printk() lockless and thus more > reliable. There is no way to allocate any internal buffers > dynamically. People using kernel on small devices have problem > with large static buffers. > > printk() is primary designed to print single line messages. 
There are > many use cases where many lines are needed and they are solved by > many separate printk() calls. > > > Add 2: Complex code: > > vsprintf() is currently called in printk() under logbuf_lock. It > might block printk() on the entire system. > > Most existing %p handlers are implemented by relatively > simple routines inside lib/vsprinf.c. The other external routines > look simple as well. > > btf looks like a huge beast to me. For example, probe_kernel_read() > prevented boot recently, see the commit 2ac5a3bf7042a1c4abb > ("vsprintf: Do not break early boot with probing addresses"). > > Yep, no way round this either. I'll try a different approach. Thanks for taking a look! Alan > Best Regards, > Petr >
Re: RFC: KTAP documentation - expected messages
On Tue, 23 Jun 2020, David Gow wrote: > On Mon, Jun 22, 2020 at 6:45 AM Frank Rowand wrote: > > > > Tim Bird started a thread [1] proposing that he document the selftest result > > format used by Linux kernel tests. > > > > [1] > > https://lore.kernel.org/r/cy4pr13mb1175b804e31e502221bc8163fd...@cy4pr13mb1175.namprd13.prod.outlook.com > > > > The issue of messages generated by the kernel being tested (that are not > > messages directly created by the tests, but are instead triggered as a > > side effect of the test) came up. In this thread, I will call these > > messages "expected messages". Instead of sidetracking that thread with > > a proposal to handle expected messages, I am starting this new thread. > > Thanks for doing this: I think there are quite a few tests which could > benefit from something like this. > > I think there were actually two separate questions: what do we do with > unexpected messages (most of which I expect are useless, but some of > which may end up being related to an unexpected test failure), and how > to have tests "expect" a particular message to appear. I'll stick to > talking about the latter for this thread, but even there there's two > possible interpretations of "expected messages" we probably want to > explicitly distinguish between: a message which must be present for > the test to pass (which I think best fits the "expected message" > name), and a message which the test is likely to produce, but which > shouldn't alter the result (an "ignored message"). I don't see much > use for the latter at present, but if we wanted to do more things with > messages and had some otherwise very verbose tests, it could > potentially be useful. > > The other thing I'd note here is that this proposal seems to be doing > all of the actual message filtering in userspace, which makes a lot of > sense for kselftest tests, but does mean that the kernel can't know if > the test has passed or failed. 
There's definitely a tradeoff between > trying to put too much needless string parsing in the kernel and > having to have a userland tool determine the test results. The > proposed KCSAN test suite[1] is using tracepoints to do this in the > kernel. It's not the cleanest thing, but there's no reason KUnit or > similar couldn't implement a nicer API around it. > > [1]: https://lkml.org/lkml/2020/6/22/1506 > For KTF the way we handled this was to use the APIs for catching function entry and return (via kprobes), specifying printk as the function to catch, and checking its argument string to verify the expected message was seen. That allows you to verify that messages appear in kernel testing context, but it's not ideal as printk() has not yet filled in the arguments in the buffer for display (there may be a better place to trace). If it seems like it could be useful I could have a go at porting the kprobe stuff to KUnit, as it helps expand the vocabulary for what can be tested in kernel context; for example we can also override return values for kernel functions to simulate errors. Alan
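The userspace half of this kind of expected-message checking boils down to pattern matching over a captured log buffer. A minimal sketch of what a harness-side matcher might look like (the helper names here are invented for illustration; this is not KTF, KUnit or kselftest API):

```c
#include <string.h>
#include <stdbool.h>

/* Hypothetical harness-side helpers: scan a captured log buffer
 * (e.g. dmesg output) for an "expected message". */
static bool log_contains_expected(const char *log, const char *expected)
{
	return strstr(log, expected) != NULL;
}

/* Count occurrences; an "ignored message" filter could be layered on
 * top of the same scan without affecting pass/fail. */
static int count_expected(const char *log, const char *expected)
{
	int n = 0;
	const char *p;

	for (p = log; (p = strstr(p, expected)) != NULL; p++)
		n++;
	return n;
}
```

As noted above, doing the matching this way pushes the pass/fail decision into userspace; the tracepoint and kprobe approaches keep it in kernel testing context.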
[PATCH v3 bpf-next 8/8] bpf/selftests: add tests for %pT format specifier
Tests verify we get a successful return value from bpf_trace_printk() using %pT format specifier with various modifiers/pointer values. Signed-off-by: Alan Maguire --- .../selftests/bpf/prog_tests/trace_printk_btf.c| 45 + .../selftests/bpf/progs/netif_receive_skb.c| 47 ++ 2 files changed, 92 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c diff --git a/tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c b/tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c new file mode 100644 index 000..791eb97 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +#include "netif_receive_skb.skel.h" + +void test_trace_printk_btf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 >/dev/null"); + + /* +* Make sure netif_receive_skb program was triggered +* and it set expected return values from bpf_trace_printk()s +* and all tests ran.
+*/ + if (CHECK(bss->ret <= 0, + "bpf_trace_printk: got return value", + "ret <= 0 %d test %d\n", bss->ret, bss->num_subtests)) + goto cleanup; + + CHECK(bss->num_subtests != bss->ran_subtests, "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests); + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 000..03ca1d8 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,47 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +int ret; +int num_subtests; +int ran_subtests; + +#define CHECK_PRINTK(_fmt, _p)\ + do {\ + char fmt[] = _fmt; \ + ++num_subtests; \ + if (ret >= 0) { \ + ++ran_subtests; \ + ret = bpf_trace_printk(fmt, sizeof(fmt), (_p)); \ + } \ + } while (0) + +/* TRACE_EVENT(netif_receive_skb, + * TP_PROTO(struct sk_buff *skb), + */ +SEC("tp_btf/netif_receive_skb") +int BPF_PROG(trace_netif_receive_skb, struct sk_buff *skb) +{ + char skb_type[] = "struct sk_buff"; + struct btf_ptr nullp = { .ptr = 0, .type = skb_type }; + struct btf_ptr p = { .ptr = skb, .type = skb_type }; + + CHECK_PRINTK("%pT\n", &p); + CHECK_PRINTK("%pTc\n", &p); + CHECK_PRINTK("%pTN\n", &p); + CHECK_PRINTK("%pTx\n", &p); + CHECK_PRINTK("%pT0\n", &p); + CHECK_PRINTK("%pTcNx0\n", &p); + CHECK_PRINTK("%pT\n", &nullp); + CHECK_PRINTK("%pTc\n", &nullp); + CHECK_PRINTK("%pTN\n", &nullp); + CHECK_PRINTK("%pTx\n", &nullp); + CHECK_PRINTK("%pT0\n", &nullp); + CHECK_PRINTK("%pTcNx0\n", &nullp); + + return 0; +} -- 1.8.3.1
[PATCH v3 bpf-next 5/8] printk: initialize vmlinux BTF outside of printk in late_initcall()
vmlinux BTF initialization can take time so it's best to do that outside of printk context; otherwise the first printk() using %pT will trigger BTF initialization. Signed-off-by: Alan Maguire --- lib/vsprintf.c | 12 1 file changed, 12 insertions(+) diff --git a/lib/vsprintf.c b/lib/vsprintf.c index c0d209d..8ac136a 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -3628,3 +3628,15 @@ int sscanf(const char *buf, const char *fmt, ...) return i; } EXPORT_SYMBOL(sscanf); + +/* + * Initialize vmlinux BTF as it may be used by printk()s and it's better + * to incur the cost of initialization outside of printk context. + */ +static int __init init_btf_vmlinux(void) +{ + (void) bpf_get_btf_vmlinux(); + + return 0; +} +late_initcall(init_btf_vmlinux); -- 1.8.3.1
[PATCH v3 bpf-next 2/8] bpf: move to generic BTF show support, apply it to seq files/strings
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c| 966 ++-- 2 files changed, 899 insertions(+), 103 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 5c1ea99..a8a4563 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show; extern const struct file_operations btf_fops; @@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. 
+ * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME(1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE(1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + *negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 58c9af1..c82cb18 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -281,6 +281,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; } +/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE16 + +/* + * Common data to all BTF show operations. Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. 
See btf_type_show() below. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by show->state.depth_check being set, and if we + * encounter a non-zero value we set show->state.depth_to_show to + * the depth at which we encountered it. When we have completed the + * first pass, we will know if anything needs to be displayed if + * depth_to_show > depth. See btf_[struct,array]_show() for the + * implementation of this.
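The two-pass "skip zeroed members" scheme described in the comment above can be illustrated with a small userspace analogue. This is a sketch of the idea only, not the kernel's btf_show code, and it handles just int members:

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* Pass 1 of the scheme: walk the object looking for any nonzero byte,
 * without emitting output. */
static bool obj_has_nonzero(const void *obj, size_t len)
{
	const unsigned char *p = obj;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i])
			return true;
	return false;
}

/* Pass 2: actually render the member. Zeroed members are skipped
 * unless show_zero is set (the BTF_SHOW_ZERO analogue). */
static int show_int_member(char *buf, size_t sz, const char *name,
			   const int *obj, bool show_zero)
{
	if (!show_zero && !obj_has_nonzero(obj, sizeof(*obj)))
		return 0;
	return snprintf(buf, sz, ".%s = (int)%d,", name, *obj);
}
```

For nested structs the kernel code recurses, recording in depth_to_show the depth at which pass 1 found nonzero data, so the whole subtree can be skipped in pass 2.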
[PATCH v3 bpf-next 4/8] printk: add type-printing %pT format specifier which uses BTF
printk supports multiple pointer object type specifiers (printing netdev features etc). Extend this support using BTF to cover arbitrary types. "%pT" specifies the typed format, and the pointer argument is a "struct btf_ptr *" where struct btf_ptr is as follows: struct btf_ptr { void *ptr; const char *type; u32 id; }; Either the "type" string ("struct sk_buff") or the BTF "id" can be used to identify the type to use in displaying the associated "ptr" value. A convenience function to create and point at the struct is provided: printk(KERN_INFO "%pT", BTF_PTR_TYPE(skb, struct sk_buff)); When invoked, BTF information is used to traverse the sk_buff * and display it. Support is present for structs, unions, enums, typedefs and core types (though in the latter case there's not much value in using this feature of course). Default output is indented, but compact output can be specified via the 'c' option. Type names/member values can be suppressed using the 'N' option. Zero values are not displayed by default but can be using the '0' option. Pointer values are obfuscated unless the 'x' option is specified. As an example: struct sk_buff *skb = alloc_skb(64, GFP_KERNEL); pr_info("%pT", BTF_PTR_TYPE(skb, struct sk_buff)); ...gives us: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x6b71155a, .data = (unsigned char *)0x6b71155a, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, .extensions = (struct skb_ext *)0xf486a130, } printk output is truncated at 1024 bytes. For cases where overflow is likely, the compact/no type names display modes may be used. 
Signed-off-by: Alan Maguire --- Documentation/core-api/printk-formats.rst | 17 ++ include/linux/btf.h | 3 +- include/linux/printk.h| 16 + lib/vsprintf.c| 98 +++ 4 files changed, 133 insertions(+), 1 deletion(-) diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst index 8c9aba2..8f255d0 100644 --- a/Documentation/core-api/printk-formats.rst +++ b/Documentation/core-api/printk-formats.rst @@ -563,6 +563,23 @@ For printing netdev_features_t. Passed by reference. +BTF-based printing of pointer data +-- +If '%pT' is specified, use the struct btf_ptr * along with kernel vmlinux +BPF Type Format (BTF) to show the typed data. For example, specifying + + printk(KERN_INFO "%pT", BTF_PTR_TYPE(skb, struct sk_buff)); + +will utilize BTF information to traverse the struct sk_buff * and display it. + +Supported modifiers are + 'c' compact output (no indentation, newlines etc) + 'N' do not show type names + 'u' unsafe printing; probe_kernel_read() is not used to copy data safely + before use + 'x' show raw pointers (no obfuscation) + '0' show zero-valued data (it is not shown by default) + Thanks == diff --git a/include/linux/btf.h b/include/linux/btf.h index a8a4563..e8dbf0c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -172,10 +172,11 @@ static inline const struct btf_member *btf_type_member(const struct btf_type *t) return (const struct btf_member *)(t + 1); } +struct btf *btf_parse_vmlinux(void); + #ifdef CONFIG_BPF_SYSCALL const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); const char *btf_name_by_offset(const struct btf *btf, u32 offset); -struct btf *btf_parse_vmlinux(void); struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); #else static inline const struct btf_type *btf_type_by_id(const struct btf *btf, diff --git a/include/linux/printk.h b/include/linux/printk.h index fc8f03c..8f8f5d2 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h @@ -618,4 +618,20 @@
static inline void print_hex_dump_debug(const char *prefix_str, int prefix_type, #define print_hex_dump_bytes(prefix_str, prefix_type, buf, len)\ print_hex_dump_debug(prefix_str, prefix_type, 16, 1, buf, len, true) +/** + * struct btf_ptr is used for %pT (typed pointer) display; the + * additional type string/BTF id are used to render the pointer + * data as the appropriate type. + */ +struct btf_ptr { + void *ptr; + const char *type; + u32 id; +}; + +#defineBTF_PTR_TYPE(ptrval, typeval) \ + (&((struct btf_ptr){.ptr = ptrval, .type = #typeval})) + +#define BTF_PTR_ID(ptrval, idval) \ + (&((struct btf_ptr){.ptr = ptrval, .id = idval})) #endif diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 259e558..c0d209d 100644 --- a/lib/vsprintf.c +++
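The BTF_PTR_TYPE() convenience macro in the patch above leans on a C99 compound literal to build the struct btf_ptr inline in the argument list, and on stringizing to capture the type name. A standalone sketch of the same trick (struct and macro renamed so as not to imply this is the kernel header):

```c
#include <string.h>

/* Renamed copies for illustration; the real definitions live in
 * include/linux/printk.h in the patch above. */
struct btf_ptr_sketch {
	void *ptr;
	const char *type;
	unsigned int id;	/* left 0 when the type string is used */
};

/* Compound literal + stringizing: the type name is captured as a
 * string, so no BTF id lookup is needed at the call site. */
#define BTF_PTR_TYPE_SKETCH(ptrval, typeval) \
	(&((struct btf_ptr_sketch){ .ptr = (ptrval), .type = #typeval }))
```

The compound literal has automatic storage duration, so the resulting pointer is only valid within the enclosing block, which is fine for use as a printk() argument.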
[PATCH v3 bpf-next 6/8] printk: extend test_printf to test %pT BTF-based format specifier
Add tests to verify basic type display and to iterate through all enums, structs, unions and typedefs ensuring expected behaviour occurs. Since test_printf can be built as a module we need to export a BTF kind iterator function to allow us to iterate over all names of a particular BTF kind. These changes add up to approximately 20,000 new tests covering all enum, struct, union and typedefs in vmlinux BTF. Individual tests are also added for int, char, struct, enum and typedefs which verify output is as expected. Signed-off-by: Alan Maguire --- include/linux/btf.h | 3 + kernel/bpf/btf.c| 33 ++ lib/test_printf.c | 316 3 files changed, 352 insertions(+) diff --git a/include/linux/btf.h b/include/linux/btf.h index e8dbf0c..e3102a7 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -191,4 +191,7 @@ static inline const char *btf_name_by_offset(const struct btf *btf, } #endif +/* Following function used for testing BTF-based printk-family support */ +const char *btf_vmlinux_next_type_name(u8 kind, s32 *id); + #endif diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index c82cb18..4e250cd 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5459,3 +5459,36 @@ u32 btf_id(const struct btf *btf) { return btf->id; } + +/* + * btf_vmlinux_next_type_name(): used in test_printf.c to + * iterate over types for testing. + * Exported as test_printf can be built as a module. + * + * @kind: BTF_KIND_* value + * @id: pointer to last id; value/result argument. When next + * type name is found, we set *id to associated id. + * Returns: + * Next type name, sets *id to associated id. 
+ */ +const char *btf_vmlinux_next_type_name(u8 kind, s32 *id) +{ + const struct btf *btf = bpf_get_btf_vmlinux(); + const struct btf_type *t; + const char *name; + + if (!btf || !id) + return NULL; + + for ((*id)++; *id <= btf->nr_types; (*id)++) { + t = btf->types[*id]; + if (BTF_INFO_KIND(t->info) != kind) + continue; + name = btf_name_by_offset(btf, t->name_off); + if (name && strlen(name) > 0) + return name; + } + + return NULL; +} +EXPORT_SYMBOL_GPL(btf_vmlinux_next_type_name); diff --git a/lib/test_printf.c b/lib/test_printf.c index 7ac87f1..7ce7387 100644 --- a/lib/test_printf.c +++ b/lib/test_printf.c @@ -23,6 +23,9 @@ #include #include +#include +#include +#include #include "../tools/testing/selftests/kselftest_module.h" @@ -669,6 +672,318 @@ static void __init fwnode_pointer(void) #endif } +#define__TEST_BTF(fmt, type, ptr, expected) \ + test(expected, "%pT"fmt, ptr) + +#define TEST_BTF_C(type, var, ...)\ + do { \ + type var = __VA_ARGS__;\ + struct btf_ptr *ptr = BTF_PTR_TYPE(&var, type);\ + pr_debug("type %s: %pTc", #type, ptr); \ + __TEST_BTF("c", type, ptr, "(" #type ")" #__VA_ARGS__);\ + } while (0) + +#define TEST_BTF(fmt, type, var, expected, ...) \ + do { \ + type var = __VA_ARGS__;\ + struct btf_ptr *ptr = BTF_PTR_TYPE(&var, type);\ + pr_debug("type %s: %pT"fmt, #type, ptr); \ + __TEST_BTF(fmt, type, ptr, expected); \ + } while (0) + +#defineBTF_MAX_DATA_SIZE 65536 + +static void __init +btf_print_kind(u8 kind, const char *kind_name, u64 fillval) +{ + const char *fmt1 = "%pT", *fmt2 = "%pTN", *fmt3 = "%pT0"; + const char *name, *fmt = fmt1; + int i, res1, res2, res3, res4; + char type_name[256]; + char *buf, *buf2; + u8 *dummy_data; + s32 id = 0; + + dummy_data = kzalloc(BTF_MAX_DATA_SIZE, GFP_KERNEL); + + /* fill our dummy data with supplied fillval. 
*/ + for (i = 0; i < BTF_MAX_DATA_SIZE; i++) + dummy_data[i] = fillval; + + buf = kzalloc(BTF_MAX_DATA_SIZE, GFP_KERNEL); + buf2 = kzalloc(BTF_MAX_DATA_SIZE, GFP_KERNEL); + + for (;;) { + name = btf_vmlinux_next_type_name(kind, &id); + if (!name) + break; + + total_tests++; + + snprintf(type_name, sizeof(type_name), "%s%s", +kind_name, name); + +
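btf_vmlinux_next_type_name() above uses *id as a value/result cursor so the caller can resume iteration where the previous call left off. The same pattern in miniature, over a made-up type table (kind numbers and names are invented for the sketch):

```c
#include <stddef.h>
#include <string.h>

/* A stand-in for the vmlinux type array; id 0 is reserved, mirroring
 * BTF's "void" type at id 0. Kind values here are arbitrary. */
struct sketch_type {
	int kind;
	const char *name;
};

static const struct sketch_type types[] = {
	{ 0, NULL },
	{ 4, "sk_buff" },	/* kind 4 plays BTF_KIND_STRUCT here */
	{ 6, "gfp_t" },
	{ 4, "net_device" },
};

/* Value/result iterator: *id carries the cursor between calls. */
static const char *next_type_name(int kind, int *id)
{
	size_t nr = sizeof(types) / sizeof(types[0]);

	for ((*id)++; (size_t)*id < nr; (*id)++) {
		if (types[*id].kind != kind)
			continue;
		if (types[*id].name && strlen(types[*id].name) > 0)
			return types[*id].name;
	}
	return NULL;
}
```

Starting from id 0 and calling repeatedly with the same kind walks every matching named type exactly once, which is what lets test_printf iterate all structs, unions, enums and typedefs.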
[PATCH v3 bpf-next 7/8] bpf: add support for %pT format specifier for bpf_trace_printk() helper
Allow %pT[cNx0] format specifier for BTF-based display of data associated with pointer. The unsafe data modifier 'u' - where the source data is traversed without copying it to a safe buffer via probe_kernel_read() - is not supported. Signed-off-by: Alan Maguire --- include/uapi/linux/bpf.h | 27 ++- kernel/trace/bpf_trace.c | 24 +++- tools/include/uapi/linux/bpf.h | 27 ++- 3 files changed, 67 insertions(+), 11 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 1968481..ea4fbf3 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -702,7 +702,12 @@ struct bpf_stack_build_id { * to file *\/sys/kernel/debug/tracing/trace* from DebugFS, if * available. It can take up to three additional **u64** * arguments (as an eBPF helpers, the total number of arguments is - * limited to five). + * limited to five), and also supports %pT (BTF-based type + * printing), as long as BPF_READ lockdown is not active. + * "%pT" takes a "struct __btf_ptr *" as an argument; it + * consists of a pointer value and specified BTF type string or id + * used to select the type for display. For more details, see + * Documentation/core-api/printk-formats.rst. * * Each time the helper is called, it appends a line to the trace. * Lines are discarded while *\/sys/kernel/debug/tracing/trace* is @@ -738,10 +743,10 @@ struct bpf_stack_build_id { * The conversion specifiers supported by *fmt* are similar, but * more limited than for printk(). They are **%d**, **%i**, * **%u**, **%x**, **%ld**, **%li**, **%lu**, **%lx**, **%lld**, - * **%lli**, **%llu**, **%llx**, **%p**, **%s**. No modifier (size - * of field, padding with zeroes, etc.) is available, and the - * helper will return **-EINVAL** (but print nothing) if it - * encounters an unknown specifier. + * **%lli**, **%llu**, **%llx**, **%p**, **%pT[cNx0]**, **%s**. + * Only %pT supports modifiers, and the helper will return + * **-EINVAL** (but print nothing) if it encounters an unknown + * specifier.
* * Also, note that **bpf_trace_printk**\ () is slow, and should * only be used for debugging purposes. For this reason, a notice @@ -4260,4 +4265,16 @@ struct bpf_pidns_info { __u32 pid; __u32 tgid; }; + +/* + * struct __btf_ptr is used for %pT (typed pointer) display; the + * additional type string/BTF id are used to render the pointer + * data as the appropriate type. + */ +struct __btf_ptr { + void *ptr; + const char *type; + __u32 id; +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index e729c9e5..33ddb31 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -374,9 +374,13 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, } } +/* Unsafe BTF display ('u' modifier) is absent here. */ +#define is_btf_safe_modifier(c)\ + (c == 'c' || c == 'N' || c == 'x' || c == '0') + /* * Only limited trace_printk() conversion specifiers allowed: - * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pks %pus %s + * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pks %pus %s %pT */ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1, u64, arg2, u64, arg3) @@ -412,6 +416,24 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype, i++; } else if (fmt[i] == 'p') { mod[fmt_cnt]++; + + /* +* allow BTF type-based printing, but disallow unsafe +* mode - this ensures the data is copied safely +* using probe_kernel_read() prior to traversing it. +*/ + if (fmt[i + 1] == 'T') { + int ret; + + ret = security_locked_down(LOCKDOWN_BPF_READ); + if (unlikely(ret < 0)) + return ret; + i += 2; + while (is_btf_safe_modifier(fmt[i])) + i++; + goto fmt_next; + } + if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') && fmt[i + 2] == 's') { diff --git a/tools/inc
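The specifier scan added to bpf_trace_printk() above — consume 'T', then any run of safe modifiers — is easy to exercise in isolation. A userspace sketch of just that scanning step (not the kernel function itself):

```c
#include <stdbool.h>

static bool is_btf_safe_modifier(char c)
{
	/* 'u' (unsafe) is deliberately absent, as in the patch above */
	return c == 'c' || c == 'N' || c == 'x' || c == '0';
}

/* Given fmt positioned at the 'p' of a conversion, return how many
 * characters a %pT specifier occupies, or 0 if this is not %pT. */
static int scan_pT_spec(const char *fmt)
{
	int i = 0;

	if (fmt[i] != 'p' || fmt[i + 1] != 'T')
		return 0;
	i += 2;
	while (is_btf_safe_modifier(fmt[i]))
		i++;
	return i;
}
```

Because 'u' is not in the safe set, a "%pTu" format stops scanning after the 'T'; in the kernel code the leftover character then falls through to the generic modifier handling and is rejected.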
[PATCH v3 bpf-next 0/8] bpf, printk: add BTF-based type printing
oed. Also tried to comment safe object scheme used. (Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8) Changes since v1: - changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifer so that we no longer rely on format %pT but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. - removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. 
- added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). Important note: if running test_printf.ko - the version in the bpf-next tree will induce a panic when running the fwnode_pointer() tests due to a kobject issue; applying the patch in https://lkml.org/lkml/2020/4/17/389 ...resolved this issue for me. Alan Maguire (8): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings checkpatch: add new BTF pointer format specifier printk: add type-printing %pT format specifier which uses BTF printk: initialize vmlinux BTF outside of printk in late_initcall() printk: extend test_printf to test %pT BTF-based format specifier bpf: add support for %pT format specifier for bpf_trace_printk() helper bpf/selftests: add tests for %pT format specifier Documentation/core-api/printk-formats.rst | 17 + include/linux/bpf.h| 2 + include/linux/btf.h| 42 +- include/linux/printk.h | 16 + include/uapi/linux/bpf.h | 27 +- kernel/bpf/btf.c | 999 ++--- kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 24 +- lib/test_printf.c | 316 +++ lib/vsprintf.c | 110 +++ scripts/checkpatch.pl | 2 +- tools/include/uapi/linux/bpf.h | 27 +- .../selftests/bpf/prog_tests/trace_printk_btf.c| 45 + .../selftests/bpf/progs/netif_receive_skb.c| 47 + 14 files changed, 1570 insertions(+), 122 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/trace_printk_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c -- 1.8.3.1
[PATCH v3 bpf-next 3/8] checkpatch: add new BTF pointer format specifier
checkpatch complains about unknown format specifiers, so add the BTF format specifier we will implement in a subsequent patch to avoid errors. Signed-off-by: Alan Maguire --- scripts/checkpatch.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 4c82060..e89631e 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -6148,7 +6148,7 @@ sub process { $specifier = $1; $extension = $2; $qualifier = $3; - if ($extension !~ /[SsBKRraEehMmIiUDdgVCbGNOxtf]/ || + if ($extension !~ /[SsBKRraEehMmIiUDdgVCbGNOxtfT]/ || ($extension eq "f" && defined $qualifier && $qualifier !~ /^w/)) { $bad_specifier = $specifier; -- 1.8.3.1
[PATCH v3 bpf-next 1/8] bpf: provide function to get vmlinux BTF information
It will be used later for BTF printk() support Signed-off-by: Alan Maguire --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 -- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 07052d4..a2ecebd 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1237,6 +1237,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth); +struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a1857c4..d448aa8 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -10878,6 +10878,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } } +struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -10911,12 +10922,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable(); - if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux(); /* grab the mutex to protect few globals used by verifier */ if (!is_priv) -- 1.8.3.1
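bpf_get_btf_vmlinux() above is the classic check/lock/re-check lazy-initialization shape. A userspace pthread sketch of the same pattern — the "parse" here is a stand-in, and this sketch glosses over the memory-ordering subtleties of the unlocked first read, which the kernel code's usage context tolerates:

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static int parsed_obj;		/* stands in for the parsed vmlinux BTF */
static int *cached;		/* NULL until first use */
static int parse_count;		/* proves we only "parse" once */

static int *get_cached_obj(void)
{
	if (!cached) {
		pthread_mutex_lock(&init_lock);
		/* re-check: another caller may have won the race */
		if (!cached) {
			parse_count++;
			parsed_obj = 42;
			cached = &parsed_obj;
		}
		pthread_mutex_unlock(&init_lock);
	}
	return cached;
}
```

Every caller after the first takes the fast path without touching the lock, which is why the later patch can also call this from a late_initcall() to pay the parse cost outside printk context.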
Re: common KUnit Kconfig and file naming (was: Re: [PATCH] lib: kunit_test_overflow: add KUnit test of check_*_overflow functions)
On Tue, 16 Jun 2020, David Gow wrote: > CONFIG_PM_QOS_KUNIT_TESTOn Mon, Jun 15, 2020 at 1:48 AM Kees Cook > wrote: > > > > On Sat, Jun 13, 2020 at 02:51:17PM +0800, David Gow wrote: > > > Yeah, _KUNIT_TEST was what we've sort-of implicitly decided on for > > > config names, but the documentation does need to happen. > > > > That works for me. It still feels redundant, but all I really want is a > > standard name. :) > > > > > We haven't put as much thought into standardising the filenames much, > > > though. > > > > I actually find this to be much more important because it is more > > end-user-facing (i.e. in module naming, in build logs, in scripts, on > > filesystem, etc -- CONFIG is basically only present during kernel build). > > Trying to do any sorting or greping really needs a way to find all the > > kunit pieces. > > > > Certainly this is more of an issue now we support building KUnit tests > as modules, rather than having them always be built-in. > > Having some halfway consistent config-name <-> filename <-> test suite > name could be useful down the line, too. Unfortunately, not > necessarily a 1:1 mapping, e.g.: > - CONFIG_KUNIT_TEST compiles both kunit-test.c and string-stream-test.c > - kunit-test.c has several test suites within it: > kunit-try-catch-test, kunit-resource-test & kunit-log-test. > - CONFIG_EXT4_KUNIT_TESTS currently only builds ext4-inode-test.c, but > as the plural name suggests, might build others later. > - CONFIG_SECURITY_APPARMOR_KUNIT_TEST doesn't actually have its own > source file: the test is built into policy_unpack.c > - &cetera > > Indeed, this made me quickly look up the names of suites, and there > are a few inconsistencies there: > - most have "-test" as a suffix > - some have "_test" as a suffix > - some have no suffix > > (I'm inclined to say that these don't need a suffix at all.) 
> A good convention for module names - which I _think_ is along the lines of what Kees is suggesting - might be something like [<subsystem>_]<suite>_kunit.ko So for example kunit_test -> test_kunit.ko string_stream_test.ko -> test_string_stream_kunit.ko kunit_example_test -> example_kunit.ko ext4_inode_test.ko -> ext4_inode_kunit.ko For the kunit selftests, "selftest_" might be a better name than "test_", as the latter might encourage people to reintroduce a redundant "test" into their module name. > Within test suites, we're also largely prefixing all of the tests with > a suite name (even if it's not actually the specified suite name). For > example, CONFIG_PM_QOS_KUNIT_TEST builds > drivers/base/power/qos-test.c which contains a suite called > "qos-kunit-test", with tests prefixed "freq_qos_test_". Some of this > clearly comes down to wanting to namespace things a bit more > ("qos-test" as a name could refer to a few things, I imagine), but > specifying how to do so consistently could help. > Could we add some definitions to help standardize this? For example, adding a "subsystem" field to "struct kunit_suite"? So for the ext4 tests the "subsystem" would be "ext4" and the name "inode" would specify the test area within that subsystem. For the KUnit selftests, the subsystem would be "test"/"selftest". Logging could utilize the subsystem definition to allow test writers to use less redundant test names too. For example the suite name logged could be constructed from the subsystem + area values associated with the kunit_suite, and individual test names could be shown as the suite area + test_name. Thanks! Alan
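The "subsystem" field proposed above does not exist in struct kunit_suite; a minimal userspace sketch of how a logged suite name could be built from subsystem + area under that proposal (all names hypothetical):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical extension of struct kunit_suite: today it carries a
 * single .name; the proposal splits that into a subsystem plus a
 * test area within that subsystem. */
struct kunit_suite {
	const char *subsystem;	/* e.g. "ext4" */
	const char *area;	/* e.g. "inode" */
};

/* Build the "<subsystem>-<area>" name a log line could show. */
static void suite_log_name(const struct kunit_suite *s, char *buf, size_t len)
{
	snprintf(buf, len, "%s-%s", s->subsystem, s->area);
}
```

Individual test names could then be logged as area + test name, dropping the redundant subsystem repetition the thread complains about.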
Re: [PATCH] Documentation: kunit: Add some troubleshooting tips to the FAQ
On Mon, 1 Jun 2020, David Gow wrote: > Add an FAQ entry to the KUnit documentation with some tips for > troubleshooting KUnit and kunit_tool. > > These suggestions largely came from an email thread: > https://lore.kernel.org/linux-kselftest/41db8bbd-3ba0-8bde-7352-083bf4b94...@intel.com/T/#m23213d4e156db6d59b0b460a9014950f5ff6eb03 > > Signed-off-by: David Gow > --- > Documentation/dev-tools/kunit/faq.rst | 32 +++ > 1 file changed, 32 insertions(+) > > diff --git a/Documentation/dev-tools/kunit/faq.rst > b/Documentation/dev-tools/kunit/faq.rst > index ea55b2467653..40109d425988 100644 > --- a/Documentation/dev-tools/kunit/faq.rst > +++ b/Documentation/dev-tools/kunit/faq.rst > @@ -61,3 +61,35 @@ test, or an end-to-end test. >kernel by installing a production configuration of the kernel on production >hardware with a production userspace and then trying to exercise some > behavior >that depends on interactions between the hardware, the kernel, and > userspace. > + > +KUnit isn't working, what should I do? > +== > + > +Unfortunately, there are a number of things which can break, but here are > some > +things to try. > + > +1. Try running ``./tools/testing/kunit/kunit.py run`` with the > ``--raw_output`` > + parameter. This might show details or error messages hidden by the > kunit_tool > + parser. > +2. Instead of running ``kunit.py run``, try running ``kunit.py config``, > + ``kunit.py build``, and ``kunit.py exec`` independently. This can help > track > + down where an issue is occurring. (If you think the parser is at fault, > you > + can run it manually against stdin or a file with ``kunit.py parse``.) > +3. Running the UML kernel directly can often reveal issues or error messages > + kunit_tool ignores. This should be as simple as running ``./vmlinux`` > after > + building the UML kernel (e.g., by using ``kunit.py build``). 
Note that UML > + has some unusual requirements (such as the host having a tmpfs filesystem > + mounted), and has had issues in the past when built statically and the > host > + has KASLR enabled. (On older host kernels, you may need to run ``setarch > + `uname -m` -R ./vmlinux`` to disable KASLR.) > +4. Make sure the kernel .config has ``CONFIG_KUNIT=y`` and at least one test > + (e.g. ``CONFIG_KUNIT_EXAMPLE_TEST=y``). kunit_tool will keep its .config > + around, so you can see what config was used after running ``kunit.py > run``. > + It also preserves any config changes you might make, so you can > + enable/disable things with ``make ARCH=um menuconfig`` or similar, and > then > + re-run kunit_tool. > +5. Finally, running ``make ARCH=um defconfig`` before running ``kunit.py > run`` > + may help clean up any residual config items which could be causing > problems. > + Looks great! Could we add something like: 6. Try running kunit standalone (without UML). KUnit and associated tests can be built into a standard kernel or built as a module; doing so allows us to verify test behaviour independent of UML so can be useful to do if running under UML is failing. When tests are built-in they will execute on boot, and modules will automatically execute associated tests when loaded. Test results can be collected from /sys/kernel/debug/kunit/<suite>/results. For more details see "KUnit on non-UML architectures" in :doc:`usage`. Reviewed-by: Alan Maguire
[PATCH v4 kunit-next 2/2] kunit: add support for named resources
The kunit resources API allows for custom initialization and cleanup code (init/fini); here a new resource add function sets the "struct kunit_resource" "name" field, and calls the standard add function. Having a simple way to name resources is useful in cases such as multithreaded tests where a set of resources are shared among threads; a pointer to the "struct kunit *" test state then is all that is needed to retrieve and use named resources. Support is provided to add, find and destroy named resources; the latter two are simply wrappers that use a "match-by-name" callback. If an attempt to add a resource with a name that already exists is made, kunit_add_named_resource() will return -EEXIST. Signed-off-by: Alan Maguire Reviewed-by: Brendan Higgins --- include/kunit/test.h | 54 ++ lib/kunit/kunit-test.c | 37 ++ lib/kunit/test.c | 24 ++ 3 files changed, 115 insertions(+) diff --git a/include/kunit/test.h b/include/kunit/test.h index f9b914e..59f3144 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -72,9 +72,15 @@ * return kunit_alloc_resource(test, kunit_kmalloc_init, * kunit_kmalloc_free, &params); * } + * + * Resources can also be named, with lookup/removal done on a name + * basis also. kunit_add_named_resource(), kunit_find_named_resource() + * and kunit_destroy_named_resource(). Resource names must be + * unique within the test instance. */ struct kunit_resource { void *data; + const char *name; /* optional name */ /* private: internal use only. */ kunit_resource_free_t free; @@ -344,6 +350,21 @@ int kunit_add_resource(struct kunit *test, kunit_resource_free_t free, struct kunit_resource *res, void *data); + +/** + * kunit_add_named_resource() - Add a named *test managed resource*. + * @test: The test context object. + * @init: a user-supplied function to initialize the resource data, if needed. + * @free: a user-supplied function to free the resource data, if needed. + * @res: The resource. + * @name: name to be set for resource. + * @data: data to be set for resource. 
+ */ +int kunit_add_named_resource(struct kunit *test, +kunit_resource_init_t init, +kunit_resource_free_t free, +struct kunit_resource *res, +const char *name, +void *data); + /** * kunit_alloc_resource() - Allocates a *test managed resource*. * @test: The test context object. @@ -399,6 +420,19 @@ static inline bool kunit_resource_instance_match(struct kunit *test, } /** + * kunit_resource_name_match() - Match a resource with the same name. + * @test: Test case to which the resource belongs. + * @res: The resource. + * @match_name: The name to match against. + */ +static inline bool kunit_resource_name_match(struct kunit *test, +struct kunit_resource *res, +void *match_name) +{ + return res->name && strcmp(res->name, match_name) == 0; +} + +/** * kunit_find_resource() - Find a resource using match function/data. * @test: Test case to which the resource belongs. * @match: match function to be applied to resources/match data. @@ -427,6 +461,19 @@ static inline bool kunit_resource_instance_match(struct kunit *test, } /** + * kunit_find_named_resource() - Find a resource using match name. + * @test: Test case to which the resource belongs. + * @name: match name. + */ +static inline struct kunit_resource * +kunit_find_named_resource(struct kunit *test, + const char *name) +{ + return kunit_find_resource(test, kunit_resource_name_match, + (void *)name); +} + +/** * kunit_destroy_resource() - Find a kunit_resource and destroy it. * @test: Test case to which the resource belongs. * @match: Match function. Returns whether a given resource matches @match_data. @@ -439,6 +486,13 @@ int kunit_destroy_resource(struct kunit *test, kunit_resource_match_t match, void *match_data); +static inline int kunit_destroy_named_resource(struct kunit *test, + const char *name) +{ + return kunit_destroy_resource(test, kunit_resource_name_match, + (void *)name); +} + /** * kunit_remove_resource: remove resource from resource list associated with * test. 
diff --git a/lib/kunit/kunit-test.c b/lib/kunit/kunit-test.c index 03f3eca..69f9024 100644 --- a/lib/kunit/kunit-test.c +++ b/lib/kunit/kunit-test.c @@ -325,6 +325,42 @@ static void kunit_resource_test_static(struct kunit *test)
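The named lookup and destroy helpers in the patch above are thin wrappers over a match-by-name callback. The matching logic in isolation, as a userspace mock (singly linked list instead of the kernel's list_head, no refcounting):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct kunit_resource. */
struct resource {
	const char *name;	/* NULL for unnamed resources */
	void *data;
	struct resource *next;
};

/* Mirrors kunit_resource_name_match(): unnamed resources never match. */
static int resource_name_match(const struct resource *res, const char *name)
{
	return res->name && strcmp(res->name, name) == 0;
}

/* Mirrors kunit_find_named_resource() over the mock list. */
static struct resource *find_named(struct resource *head, const char *name)
{
	struct resource *res;

	for (res = head; res; res = res->next)
		if (resource_name_match(res, name))
			return res;
	return NULL;
}
```

The NULL-name guard is what lets named and unnamed resources share one list: unnamed entries are simply invisible to named lookup.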
[PATCH v4 kunit-next 1/2] kunit: generalize kunit_resource API beyond allocated resources
In its original form, the kunit resources API - consisting of the struct kunit_resource and associated functions - was focused on adding allocated resources during test operation that would be automatically cleaned up on test completion. The recent RFC patch proposing converting KASAN tests to KUnit [1] showed another potential model - where outside of test context, but with a pointer to the test state, we wish to access/update test-related data, but expressly want to avoid allocations. It turns out we can generalize the kunit_resource to support static resources where the struct kunit_resource * is passed in and initialized for us. As part of this work, we also change the "allocation" field to the more general "data" name, as instead of associating an allocation, we can associate a pointer to static data. Static data is distinguished by a NULL free function. A test is added to cover using kunit_add_resource() with a static resource and data. Finally we also make use of the kernel's kref interfaces to manage reference counting of KUnit resources. The motivation for this is simple; if we have kernel threads accessing and using resources (say via kunit_find_resource()) we need to ensure we do not remove said resources (or indeed free them if they were dynamically allocated) until the reference count reaches zero. A new function - kunit_put_resource() - is added to handle this, and it should be called after a thread using kunit_find_resource() is finished with the retrieved resource. We ensure that the functions needed to look up, use and drop reference count are "static inline"-defined so that they can be used by builtin code as well as modules in the case that KUnit is built as a module. A cosmetic change here also: I've tried moving to kunit_[action]_resource() as the format of function names for consistency and readability. 
[1] https://lkml.org/lkml/2020/2/26/1286 Signed-off-by: Alan Maguire Reviewed-by: Brendan Higgins --- include/kunit/test.h | 156 +- lib/kunit/kunit-test.c| 74 -- lib/kunit/string-stream.c | 14 ++--- lib/kunit/test.c | 153 - 4 files changed, 268 insertions(+), 129 deletions(-) diff --git a/include/kunit/test.h b/include/kunit/test.h index 47e61e1..f9b914e 100644 --- a/include/kunit/test.h +++ b/include/kunit/test.h @@ -15,6 +15,7 @@ #include #include #include +#include struct kunit_resource; @@ -23,13 +24,19 @@ /** * struct kunit_resource - represents a *test managed resource* - * @allocation: for the user to store arbitrary data. + * @data: for the user to store arbitrary data. * @free: a user supplied function to free the resource. Populated by - * kunit_alloc_resource(). + * kunit_resource_alloc(). * * Represents a *test managed resource*, a resource which will automatically be * cleaned up at the end of a test case. * + * Resources are reference counted so if a resource is retrieved via + * kunit_alloc_and_get_resource() or kunit_find_resource(), we need + * to call kunit_put_resource() to reduce the resource reference count + * when finished with it. Note that kunit_alloc_resource() does not require a + * kunit_resource_put() because it does not retrieve the resource itself. + * * Example: * * .. 
code-block:: c @@ -42,9 +49,9 @@ * static int kunit_kmalloc_init(struct kunit_resource *res, void *context) * { * struct kunit_kmalloc_params *params = context; - * res->allocation = kmalloc(params->size, params->gfp); + * res->data = kmalloc(params->size, params->gfp); * - * if (!res->allocation) + * if (!res->data) * return -ENOMEM; * * return 0; @@ -52,30 +59,26 @@ * * static void kunit_kmalloc_free(struct kunit_resource *res) * { - * kfree(res->allocation); + * kfree(res->data); * } * * void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp) * { * struct kunit_kmalloc_params params; - * struct kunit_resource *res; * * params.size = size; * params.gfp = gfp; * - * res = kunit_alloc_resource(test, kunit_kmalloc_init, + * return kunit_alloc_resource(test, kunit_kmalloc_init, * kunit_kmalloc_free, &params); - * if (res) - * return res->allocation; - * - * return NULL; * } */ struct kunit_resource { - void *allocation; - kunit_resource_free_t free; + void *data; /* private: internal use only. */ + kunit_resource_free_t free; + struct kref refcount; struct list_head node; }; @@ -284,6 +287,64 @@ struct kunit_resource *kunit_alloc_and_get_resource(struct kunit
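The get/put lifecycle the commit message describes (a resource found via kunit_find_resource() must be released with kunit_put_resource(), and is only freed when the last reference drops) can be sketched in userspace with a bare counter standing in for struct kref:

```c
#include <assert.h>

/* Userspace stand-in for a kref-counted resource: a plain counter,
 * no atomics, plus a flag recording whether free ran. */
struct resource {
	int refcount;
	int freed;
};

static void resource_free(struct resource *res)
{
	res->freed = 1;	/* a real implementation would kfree() here */
}

/* Mirrors the kunit get/put semantics: each lookup takes a reference,
 * each put drops one, and the free callback fires only at zero. */
static void get_resource(struct resource *res)
{
	res->refcount++;
}

static void put_resource(struct resource *res)
{
	if (--res->refcount == 0)
		resource_free(res);
}
```

This is why kunit_find_resource() callers in another thread are safe: the resource cannot be freed out from under them while they hold their reference.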
[PATCH v4 kunit-next 0/2] kunit: extend kunit resources API
A recent RFC patch set [1] suggests some additional functionality may be needed around kunit resources. It seems to require 1. support for resources without allocation 2. support for lookup of such resources 3. support for access to resources across multiple kernel threads The proposed changes here are designed to address these needs. The idea is we first generalize the API to support adding resources with static data; then from there we support named resources. The latter support is needed because if we are in a different thread context and only have the "struct kunit *" to work with, we need a way to identify a resource in lookup. [1] https://lkml.org/lkml/2020/2/26/1286 Changes since v3: - removed unused "init" field from "struct kunit_resources" (Brendan) Changes since v2: - moved a few functions relating to resource retrieval in patches 1 and 2 into include/kunit/test.h and defined as "static inline"; this allows built-in consumers to use these functions when KUnit is built as a module Changes since v1: - reformatted longer parameter lists to have one parameter per-line (Brendan, patch 1) - fixed phrasing in various comments to clarify allocation of memory and added comment to kunit resource tests to clarify why kunit_put_resource() is used there (Brendan, patch 1) - changed #define to static inline function (Brendan, patch 2) - simplified kunit_add_named_resource() to use more of existing code for non-named resource (Brendan, patch 2) Alan Maguire (2): kunit: generalize kunit_resource API beyond allocated resources kunit: add support for named resources include/kunit/test.h | 210 +++--- lib/kunit/kunit-test.c| 111 +++- lib/kunit/string-stream.c | 14 ++-- lib/kunit/test.c | 171 ++--- 4 files changed, 380 insertions(+), 126 deletions(-) -- 1.8.3.1
Re: [PATCH v3 3/7] kunit: tests for stats_fs API
On Tue, 26 May 2020, Emanuele Giuseppe Esposito wrote: > Add kunit tests to extensively test the stats_fs API functionality. > I've added in the kunit-related folks. > In order to run them, the kernel .config must set CONFIG_KUNIT=y > and a new .kunitconfig file must be created with CONFIG_STATS_FS=y > and CONFIG_STATS_FS_TEST=y > It looks like CONFIG_STATS_FS is built-in, but it exports much of the functionality you are testing. However could the tests also be built as a module (i.e. make CONFIG_STATS_FS_TEST a tristate variable)? To test this you'd need to specify CONFIG_KUNIT=m and CONFIG_STATS_FS_TEST=m, and testing would simply be a case of "modprobe"ing the stats fs module and collecting results in /sys/kernel/debug/kunit/ (rather than running kunit.py). Are you relying on unexported internals in the tests that would prevent building them as a module? Thanks! Alan
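The module-based flow suggested above would correspond to a config fragment along these lines (CONFIG names taken from the patch under review; the tristate change to CONFIG_STATS_FS_TEST is the part being requested, not something the posted patch already supports):

```
CONFIG_KUNIT=m
CONFIG_STATS_FS=y
CONFIG_STATS_FS_TEST=m
```

With that in place, loading the test module would run the suite and its results would appear under the debugfs kunit directory rather than in kunit.py output.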
Re: [PATCH v2 bpf-next 2/7] bpf: move to generic BTF show support, apply it to seq files/strings
On Wed, 13 May 2020, Yonghong Song wrote: > > > +struct btf_show { > > + u64 flags; > > + void *target; /* target of show operation (seq file, buffer) */ > > + void (*showfn)(struct btf_show *show, const char *fmt, ...); > > + const struct btf *btf; > > + /* below are used during iteration */ > > + struct { > > + u8 depth; > > + u8 depth_shown; > > + u8 depth_check; > > I have some difficulties to understand the relationship between > the above three variables. Could you add some comments here? > Will do; sorry the code got a bit confusing. The goal is to track which sub-components in a data structure we need to display. The "depth" variable tracks where we are currently; "depth_shown" is the depth at which we have something nonzero to display (perhaps "depth_to_show" would be a better name?). "depth_check" tells us whether we are currently checking depth or doing printing. If we're checking, we don't actually print anything, we merely note if we hit a non-zero value, and if so, we set "depth_shown" to the depth at which we hit that value. When we show a struct, union or array, we will only display an object if it has one or more non-zero members. But because the struct can in turn nest a struct or array etc, we need to recurse into the object. When we are doing that, depth_check is set, and this tells us not to do any actual display. When that recursion is complete, we check if "depth_shown" (depth to show) is > depth (i.e. we found something) and if it is we go on to display the object (setting depth_check to 0). There may be a better way to solve this problem of course, but I wanted to avoid storing values where possible as deeply-nested data structures might overrun such storage. 
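The check-then-show scheme described above can be sketched as a toy userspace model (not the kernel code; one nesting level, names simplified): a first "check" pass only records whether anything nonzero exists below the current depth, and a second pass does the actual printing only if it does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct show_state {
	int depth_check;	/* 1: probing only, emit no output */
	int depth_to_show;	/* depth at which a nonzero value was found */
	int depth;
	char out[256];
};

static void show_int(struct show_state *s, const char *name, int v)
{
	/* check pass: just record that something nonzero exists here */
	if (v && s->depth_check && s->depth_to_show < s->depth)
		s->depth_to_show = s->depth;
	/* show pass: emit nonzero members only */
	if (!s->depth_check && v) {
		char tmp[64];

		snprintf(tmp, sizeof(tmp), "%s=%d;", name, v);
		strcat(s->out, tmp);
	}
}

/* Show a two-member "struct" only if at least one member is nonzero. */
static void show_pair(struct show_state *s, const char *name, int a, int b)
{
	s->depth++;
	/* pass 1: probe members below this depth without emitting */
	s->depth_check = 1;
	s->depth_to_show = 0;
	show_int(s, "a", a);
	show_int(s, "b", b);
	/* pass 2: emit only if the probe found something to show */
	s->depth_check = 0;
	if (s->depth_to_show >= s->depth) {
		strcat(s->out, name);
		strcat(s->out, "{");
		show_int(s, "a", a);
		show_int(s, "b", b);
		strcat(s->out, "}");
	}
	s->depth--;
}
```

An all-zero object produces no output at all, which is exactly the zero-suppression behaviour the depth_check machinery exists to provide.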
> > + u8 array_member:1, > > + array_terminated:1; > > + u16 array_encoding; > > + u32 type_id; > > + const struct btf_type *type; > > + const struct btf_member *member; > > + char name[KSYM_NAME_LEN]; /* scratch space for name */ > > + char type_name[KSYM_NAME_LEN]; /* scratch space for type */ > > KSYM_NAME_LEN is for symbol name, not for type name. But I guess in kernel we > probably do not have > 128 bytes type name so we should be > okay here. > Yeah, I couldn't find a good length to use here. We eliminate qualifiers such as "const" in the display, so it's unlikely we'd overrun. > > + } state; > > +}; > > + > > struct btf_kind_operations { > >s32 (*check_meta)(struct btf_verifier_env *env, > > const struct btf_type *t, > > @@ -297,9 +323,9 @@ struct btf_kind_operations { > > const struct btf_type *member_type); > >void (*log_details)(struct btf_verifier_env *env, > > const struct btf_type *t); > > - void (*seq_show)(const struct btf *btf, const struct btf_type *t, > > + void (*show)(const struct btf *btf, const struct btf_type *t, > > u32 type_id, void *data, u8 bits_offsets, > > -struct seq_file *m); > > +struct btf_show *show); > > }; > > > > static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS]; > > @@ -676,6 +702,340 @@ bool btf_member_is_reg_int(const struct btf *btf, > > const struct btf_type *s, > > return true; > > } > > > > +/* Similar to btf_type_skip_modifiers() but does not skip typedefs. 
*/ > > +static inline > > +const struct btf_type *btf_type_skip_qualifiers(const struct btf *btf, u32 > > id) > > +{ > > + const struct btf_type *t = btf_type_by_id(btf, id); > > + > > + while (btf_type_is_modifier(t) && > > + BTF_INFO_KIND(t->info) != BTF_KIND_TYPEDEF) { > > + id = t->type; > > + t = btf_type_by_id(btf, t->type); > > + } > > + > > + return t; > > +} > > + > > +#define BTF_SHOW_MAX_ITER 10 > > + > > +#define BTF_KIND_BIT(kind) (1ULL << kind) > > + > > +static inline const char *btf_show_type_name(struct btf_show *show, > > +const struct btf_type *t) > > +{ > > + const char *array_suffixes = "[][][][][][][][][][]"; > > Add a comment here saying length BTF_SHOW_MAX_ITER * 2 > so later on if somebody changes the BTF_SHOW_MAX_ITER from 10 to 12, > it won't miss here? > > > + const char *array_suffix = &array_suffixes[strlen(array_suffixes)]; > > + const char *ptr_suffixes = "**"; > > The same here. > Good idea; will do. > > + const char *ptr_suffix = &ptr_suffixes[strlen(ptr_suffixes)]; > > + const char *type_name = NULL, *prefix = "", *parens = ""; > > + const struct btf_array *array; > > + u32 id = show->state.type_id; > > + bool allow_anon = true; > > + u64 kinds = 0; > > + int i; > > + > > + show->state.type_name[0] = '\0'; > > + > > + /* > > +* Start with type_id, as we have have resolved the struct btf_type * > > +* via btf_modifier_show() past the parent typedef to the c
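The suffix handling in the quoted code avoids any scratch buffer by walking a pointer backwards through a fixed "[][]..." string, one step per array dimension. The trick in isolation (userspace sketch; the cap mirrors BTF_SHOW_MAX_ITER):

```c
#include <assert.h>
#include <string.h>

#define MAX_DIMS 10	/* mirrors BTF_SHOW_MAX_ITER */

/* Return "", "[]", "[][]", ... without building a string: each array
 * dimension moves the suffix pointer left by 2 from the end of a
 * fixed MAX_DIMS * 2 character string. */
static const char *array_suffix(int ndims)
{
	static const char suffixes[] = "[][][][][][][][][][]"; /* MAX_DIMS * 2 */
	const char *suffix = &suffixes[strlen(suffixes)];
	int i;

	for (i = 0; i < ndims && i < MAX_DIMS; i++)
		suffix -= 2;
	return suffix;
}
```

The comment Yonghong asks for is worth having precisely because the string length and the iteration cap must stay in sync.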
Re: [PATCH v2 bpf-next 6/7] bpf: add support for %pT format specifier for bpf_trace_printk() helper
On Wed, 13 May 2020, Yonghong Song wrote: > > > + while (isbtffmt(fmt[i])) > > + i++; > > The pointer passed to the helper may not be valid pointer. I think you > need to do a probe_read_kernel() here. Do an atomic memory allocation > here should be okay as this is mostly for debugging only. > Are there other examples of doing allocations in program execution context? I'd hate to be the first to introduce one if not. I was hoping I could get away with some per-CPU scratch space. Most data structures will fit within a small per-CPU buffer, but if multiple copies are required, performance isn't the key concern. It will make traversing the buffer during display a bit more complex but I think avoiding allocation might make that complexity worth it. The other thought I had was we could carry out an allocation associated with the attach, but that's messy as it's possible run-time might determine the type for display (and thus the amount of the buffer we need to copy safely). Great news about LLVM support for __builtin_btf_type_id()! Thanks! Alan
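The per-CPU scratch idea floated above has a close userspace analogue in thread-local storage: each execution context gets its own buffer, so the print path needs no allocation at all. A sketch under that assumption (names hypothetical, not the kernel per-CPU API):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SCRATCH_LEN 256

/* Userspace stand-in for a per-CPU scratch buffer: thread-local
 * storage gives each thread its own buffer, just as per-CPU data
 * gives each CPU its own, with no allocation in the hot path. */
static _Thread_local char scratch[SCRATCH_LEN];

/* Format into the calling thread's scratch space; no malloc needed. */
static const char *format_to_scratch(const char *name, int value)
{
	snprintf(scratch, sizeof(scratch), "%s=%d", name, value);
	return scratch;
}
```

The trade-off is the one Alan notes: the buffer is reused across calls, so consumers must copy or finish with the result before the next format, which complicates traversal but avoids allocation in execution context.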