Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

2017-12-01 Thread Alexei Starovoitov

On 12/1/17 9:51 AM, Arnaldo Carvalho de Melo wrote:


> But this is not just testcase expectations, the usecase is someone
> wanting to use a newer tool, with perhaps some new features of interest
> that don't depend on changes in the kernel, in an older kernel on a
> system where updating it is not possible or desirable.


I think it's also dangerous for a core library like libbpf to
be smarter than the tool that is using it.
In this case we added prog and map names by default to the loader and
create_map functions to make sure that all tools pick them up
automatically and we can see more human-readable bpf names
in kernel stack traces and in debug tools like bpftool and bcc/bps.
When the kernel is older and doesn't support prog/map names, it's perfectly
reasonable to fall back to map creation without the name, but
the library shouldn't be doing that in all cases.
For example, the prog_load command recently got a new prog_ifindex field.
It would be incorrect to fall back to loading without it.
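
A minimal sketch of that policy, with the kernel replaced by a stand-in so it can run anywhere. The names `fake_attr`, `fake_old_kernel()` and `create_map_compat()` are hypothetical and are not libbpf API, and for brevity both fields sit on one made-up command. The only point is that a cosmetic field such as the name may be dropped on EINVAL, while a field that changes semantics, such as an ifindex, is never dropped silently.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for union bpf_attr. */
struct fake_attr {
	unsigned int map_type;
	unsigned int ifindex;	/* semantic: changes what the request means */
	char name[16];		/* cosmetic: only improves debuggability */
};

/*
 * Stand-in for an older kernel that predates both fields: it rejects
 * any request in which a field it does not know about is non-zero.
 */
static int fake_old_kernel(const struct fake_attr *attr)
{
	if (attr->name[0] != '\0' || attr->ifindex != 0) {
		errno = EINVAL;
		return -1;
	}
	return 3; /* pretend file descriptor */
}

/*
 * Per-command helper: retry without the name only, never without
 * the ifindex.
 */
static int create_map_compat(unsigned int map_type, const char *name,
			     unsigned int ifindex)
{
	struct fake_attr attr;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.map_type = map_type;
	attr.ifindex = ifindex;
	if (name)
		strncpy(attr.name, name, sizeof(attr.name) - 1);

	fd = fake_old_kernel(&attr);
	if (fd < 0 && errno == EINVAL && attr.name[0] != '\0') {
		/* The name is optional, so dropping it is a safe fallback. */
		memset(attr.name, 0, sizeof(attr.name));
		fd = fake_old_kernel(&attr);
	}
	return fd;
}
```

With this stand-in, a named map without an ifindex is created on the second attempt, while a request carrying a non-zero ifindex still fails with EINVAL instead of quietly succeeding with different semantics.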



Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

2017-12-01 Thread Arnaldo Carvalho de Melo
On Thu, Nov 30, 2017 at 01:51:15PM -0800, Alexei Starovoitov wrote:
> On 11/30/17 11:00 AM, Arnaldo Carvalho de Melo wrote:
> > > Instead of sinking all future bpf_attr's backward compatibility
> > > requirements to sys_bpf,  I would push it up to its own BPF_* command
> > > helper which has a better sense of its bpf_attr, i.e. push it up
> > > to bpf_create_map_node() and bpf_load_program_name() in this case.
> > Humm, we could try that approach, but the one in this patch seemed good
> > enough.
> > 
> > And after all, the first syscall() invocation, with the latest kernel
> > and latest tooling, will work, right?
> 
> I agree with Martin and I also don't think it will work to push
> logic of all bpf commands into single sys_bpf syscall wrapper.

Sure, that was just a POC, I'll work on something that takes into
account what you guys pointed out.

> This logic will become more and more complex over time.
> Like this case really belongs in bpf_create_map() which is a wrapper
> on top of single BPF_CREATE_MAP command.
 
> Note it's the first time we're facing this 'new libbpf.a running on
> top of old kernel' issue and should be very careful adding such
> fallback code to the generic bpf library, since all the selftests/bpf/
> are using this lib and relying on expected behavior.

Right, tools/perf/ uses it as well and relies on its continued
functioning.

> We don't want tests that want to test the latest kernel feature all of
> a sudden pass on old kernel that doesn't have it.

Sure, neither do I :-)
 
> To some degree perf and selftests/bpf needs are diverging here,
> so adding #ifdef to libbpf.a to match testcase expectations may be
> necessary.

But this is not just testcase expectations, the usecase is someone
wanting to use a newer tool, with perhaps some new features of interest
that don't depend on changes in the kernel, in an older kernel on a
system where updating it is not possible or desirable.

- Arnaldo


Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

2017-11-30 Thread Martin KaFai Lau
On Thu, Nov 30, 2017 at 04:00:42PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Nov 30, 2017 at 10:28:08AM -0800, Martin KaFai Lau wrote:
> > On Thu, Nov 30, 2017 at 01:53:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau wrote:
> > > > > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > On Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau wrote:
> > > > > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > > [root@jouet ~]# perf test -v bpf
> > > > > > > > 39: BPF filter:
> > > > > > > > 39.1: Basic BPF filtering :
> > > > > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > > > > [ ... ]
> > > > > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > > > > bpf: load objects failed
> > > > > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > > > > is introduced in 4.15.
> > > > 
> > > > > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > > > > the new bpf prog/map name is only introduced since 4.15.
> 
> > > > > > > The newer perf needs to be compatible with an older kernel?
> 
> > > > > > Sure :-)
> 
> > > > > Would the latest features introduced in perf/libbpf supposed to be
> > > > > available in the latest kernel only?  What may be the reason that the
> 
> > > > Yes, then the new perf binary should try to use the new stuff, if it
> > > > fails, use the old one, there is no requirement that one uses perf 4.14
> > > > in lockstep with the kernel 4.14 (or any other version), perf 4.15
> > > > should work with the 4.14 kernel as well as with 4.15 (or any other
> > > > future kernel), only limited by what it can grok up to when it was
> > > > released.
> 
> > > So, see the patch below, which makes 'perf test bpf' and my other test
> > > cases, including the one for probe_read_str(), work again; it just
> > > falls back to a behaviour the older kernels can accept.
> 
> > Thanks for the patch.
>  
> > > We can improve it so that that EINVAL fallback happens only for
> > > MAP_CREATE, and probably we don't need to change the size arg, just zero
> > > the unused fields, but I haven't checked that.
> > > 
> > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > > index 5128677e4117..3084f07c7c33 100644
> > > --- a/tools/lib/bpf/bpf.c
> > > +++ b/tools/lib/bpf/bpf.c
> > > @@ -19,6 +19,7 @@
> > >   * License along with this program; if not,  see 
> > > 
> > >   */
> > >  
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > @@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
> > >   return (__u64) (unsigned long) ptr;
> > >  }
> > >  
> > > -static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
> > > -   unsigned int size)
> > > +static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int 
> > > size)
> > >  {
> > > - return syscall(__NR_bpf, cmd, attr, size);
> > > + int err = syscall(__NR_bpf, cmd, attr, size);
> > > + if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
> > I would add a check to the length of map_name/prog_name.
> 
> Can you elaborate? What kind of check?
For example, if map_name is not set (i.e. its strlen is 0), there is no
need to retry.

>  
> > > + const unsigned int old_union_size = offsetof(union bpf_attr, 
> > > prog_name);
> > > + /*
> > > +  * These were the ones that added fields after the old bpf_attr
> > > +  * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
> > > +  * API support to specify BPF obj name") so zero that out to
> > > +  * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
> > > +  * kernels.
> > > +  */
> > > + if (cmd == BPF_MAP_CREATE)
> > > + memset(&attr->map_name, 0, size - offsetof(union 
> > > bpf_attr, map_name));
> > > + else
> > > + memset(&attr->prog_name, 0, size - old_union_size);
> 
> > If bpf_attr is extended in the future,  map_name/prog_name will still be
> > used as the anchor for backward compatibility instead of trial and error
> > attribute by attribute?
> 
> Then you will first try the latest and greatest, if it fails, go the
> previous (like here), if it fails, etc.
> 
> That or some sort of versioning to make sure the kernel and the tools
> can agree on a common set of functionality supported by both.
> 
> Again, this is how perf_evsel__open() works in the perf case, try the
> latest and go fallbacking to the most recent set of features tha

Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

2017-11-30 Thread Alexei Starovoitov

On 11/30/17 11:00 AM, Arnaldo Carvalho de Melo wrote:

> > Instead of sinking all future bpf_attr's backward compatibility
> > requirements to sys_bpf, I would push it up to its own BPF_* command
> > helper which has a better sense of its bpf_attr, i.e. push it up
> > to bpf_create_map_node() and bpf_load_program_name() in this case.
>
> Humm, we could try that approach, but the one in this patch seemed good
> enough.
>
> And after all, the first syscall() invocation, with the latest kernel
> and latest tooling, will work, right?


I agree with Martin and I also don't think it will work to push
logic of all bpf commands into single sys_bpf syscall wrapper.
This logic will become more and more complex over time.
Like this case really belongs in bpf_create_map() which is a wrapper
on top of single BPF_CREATE_MAP command.

Note it's the first time we're facing this
'new libbpf.a running on top of old kernel' issue and should be
very careful adding such fallback code to the generic bpf library,
since all the selftests/bpf/ are using this lib and relying on
expected behavior. We don't want tests that want to test the latest
kernel feature all of a sudden pass on old kernel that doesn't have it.

To some degree perf and selftests/bpf needs are diverging here,
so adding #ifdef to libbpf.a to match testcase expectations may be
necessary.



Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

2017-11-30 Thread Arnaldo Carvalho de Melo
On Thu, Nov 30, 2017 at 10:28:08AM -0800, Martin KaFai Lau wrote:
> On Thu, Nov 30, 2017 at 01:53:58PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau wrote:
> > > > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > On Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau wrote:
> > > > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > [root@jouet ~]# perf test -v bpf
> > > > > > > 39: BPF filter:
> > > > > > > 39.1: Basic BPF filtering :
> > > > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > > > [ ... ]
> > > > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > > > bpf: load objects failed
> > > > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > > > is introduced in 4.15.
> > > 
> > > > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > > > the new bpf prog/map name is only introduced since 4.15.

> > > > > > The newer perf needs to be compatible with an older kernel?

> > > > > Sure :-)

> > > > Would the latest features introduced in perf/libbpf supposed to be
> > > > available in the latest kernel only?  What may be the reason that the

> > > Yes, then the new perf binary should try to use the new stuff, if it
> > > fails, use the old one, there is no requirement that one uses perf 4.14
> > > in lockstep with the kernel 4.14 (or any other version), perf 4.15
> > > should work with the 4.14 kernel as well as with 4.15 (or any other
> > > future kernel), only limited by what it can grok up to when it was
> > > released.

> > So, see the patch below, which makes 'perf test bpf' and my other test
> > cases, including the one for probe_read_str(), work again; it just
> > falls back to a behaviour the older kernels can accept.

> Thanks for the patch.
 
> > We can improve it so that that EINVAL fallback happens only for
> > MAP_CREATE, and probably we don't need to change the size arg, just zero
> > the unused fields, but I haven't checked that.
> > 
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 5128677e4117..3084f07c7c33 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -19,6 +19,7 @@
> >   * License along with this program; if not,  see 
> > 
> >   */
> >  
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
> > return (__u64) (unsigned long) ptr;
> >  }
> >  
> > -static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
> > - unsigned int size)
> > +static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int 
> > size)
> >  {
> > -   return syscall(__NR_bpf, cmd, attr, size);
> > +   int err = syscall(__NR_bpf, cmd, attr, size);
> > +   if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
> I would add a check to the length of map_name/prog_name.

Can you elaborate? What kind of check?
 
> > +   const unsigned int old_union_size = offsetof(union bpf_attr, 
> > prog_name);
> > +   /*
> > +* These were the ones that added fields after the old bpf_attr
> > +* layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
> > +* API support to specify BPF obj name") so zero that out to
> > +* pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
> > +* kernels.
> > +*/
> > +   if (cmd == BPF_MAP_CREATE)
> > +   memset(&attr->map_name, 0, size - offsetof(union 
> > bpf_attr, map_name));
> > +   else
> > +   memset(&attr->prog_name, 0, size - old_union_size);

> If bpf_attr is extended in the future,  map_name/prog_name will still be
> used as the anchor for backward compatibility instead of trial and error
> attribute by attribute?

Then you first try the latest and greatest; if it fails, go to the
previous one (like here); if that fails, and so on.

That or some sort of versioning to make sure the kernel and the tools
can agree on a common set of functionality supported by both.

Again, this is how perf_evsel__open() works in the perf case: try the
latest and fall back to the most recent set of features that could
somehow service what is needed, or disable some feature and warn the
user, i.e. do the best you can with what you have.
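
The "try the latest, then step down" idea can be sketched as a small loop. This is only an illustration under stated assumptions: `fake_kernel_open()` and `open_with_fallback()` are hypothetical names, feature sets are reduced to an integer level, and none of this is meant to mirror how perf_evsel__open() is actually written.

```c
#include <errno.h>

/*
 * Stand-in for a kernel that understands feature levels 0..max_supported
 * and rejects anything newer with EINVAL.
 */
static int fake_kernel_open(int level, int max_supported)
{
	if (level > max_supported) {
		errno = EINVAL;
		return -1;
	}
	return 3; /* pretend file descriptor */
}

/*
 * Try the newest feature level first and step down one level each time
 * the kernel answers EINVAL. On success, *used holds the level that
 * worked so the caller can warn about anything that had to be disabled.
 */
static int open_with_fallback(int newest, int max_supported, int *used)
{
	int level, fd;

	for (level = newest; level >= 0; level--) {
		fd = fake_kernel_open(level, max_supported);
		if (fd >= 0) {
			*used = level;
			return fd;
		}
		if (errno != EINVAL)
			return -1; /* a real error, not a missing feature */
	}
	return -1; /* even the oldest level was rejected */
}
```

A caller would compare `*used` with `newest` and warn the user when they differ, which is the "do the best you can with what you have" part.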

With this patch in place I was able to have what was working before
88cda1c9da02 working again.
 
> Instead of sinking all future bpf_attr's backward compatibility
> requirements to sys_bpf,  I would push it

Re: [PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

2017-11-30 Thread Martin KaFai Lau
On Thu, Nov 30, 2017 at 01:53:58PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau wrote:
> > > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau wrote:
> > > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > [root@jouet ~]# perf test -v bpf
> > > > > > 39: BPF filter:
> > > > > > 39.1: Basic BPF filtering :
> > > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > > [ ... ]
> > > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > > bpf: load objects failed
> > > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > > is introduced in 4.15.
> > 
> > > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > > the new bpf prog/map name is only introduced since 4.15.
> > 
> > > > > The newer perf needs to be compatible with an older kernel?
> > 
> > > > Sure :-)
>  
> > > Would the latest features introduced in perf/libbpf supposed to be
> > > available in the latest kernel only?  What may be the reason that the
> > 
> > Yes, then the new perf binary should try to use the new stuff, if it
> > fails, use the old one, there is no requirement that one uses perf 4.14
> > in lockstep with the kernel 4.14 (or any other version), perf 4.15
> > should work with the 4.14 kernel as well as with 4.15 (or any other
> > future kernel), only limited by what it can grok up to when it was
> > released.
> 
> So, see the patch below, which makes 'perf test bpf' and my other test
> cases, including the one for probe_read_str(), work again; it just
> falls back to a behaviour the older kernels can accept.
Thanks for the patch.

> 
> We can improve it so that that EINVAL fallback happens only for
> MAP_CREATE, and probably we don't need to change the size arg, just zero
> the unused fields, but I haven't checked that.
> 
> - Arnaldo
> 
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 5128677e4117..3084f07c7c33 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -19,6 +19,7 @@
>   * License along with this program; if not,  see 
> 
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
>   return (__u64) (unsigned long) ptr;
>  }
>  
> -static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
> -   unsigned int size)
> +static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
>  {
> - return syscall(__NR_bpf, cmd, attr, size);
> + int err = syscall(__NR_bpf, cmd, attr, size);
> + if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
I would add a check to the length of map_name/prog_name.

> + const unsigned int old_union_size = offsetof(union bpf_attr, 
> prog_name);
> + /*
> +  * These were the ones that added fields after the old bpf_attr
> +  * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
> +  * API support to specify BPF obj name") so zero that out to
> +  * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
> +  * kernels.
> +  */
> + if (cmd == BPF_MAP_CREATE)
> + memset(&attr->map_name, 0, size - offsetof(union 
> bpf_attr, map_name));
> + else
> + memset(&attr->prog_name, 0, size - old_union_size);
If bpf_attr is extended in the future, will map_name/prog_name still be
used as the anchor for backward compatibility, instead of trial and error
attribute by attribute?

Instead of sinking all future bpf_attr backward compatibility
requirements into sys_bpf, I would push them up to the individual BPF_*
command helpers, which have a better sense of their own bpf_attr, i.e.
push it up to bpf_create_map_node() and bpf_load_program_name() in this case.

> +
> + err = syscall(__NR_bpf, cmd, attr, old_union_size);
> + }
> + return err;
>  }
>  
>  int bpf_create_map_node(enum bpf_map_type map_type, const char *name,


[PATCH/RFC] Re: 'perf test BPF' failing, libbpf regression wrt "basic API for BPF obj name"

2017-11-30 Thread Arnaldo Carvalho de Melo
On Thu, Nov 30, 2017 at 12:01:10AM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Nov 29, 2017 at 02:31:36PM -0800, Martin KaFai Lau wrote:
> > On Wed, Nov 29, 2017 at 06:15:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Wed, Nov 29, 2017 at 01:07:34PM -0800, Martin KaFai Lau wrote:
> > > > On Tue, Nov 28, 2017 at 04:05:19PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > [root@jouet ~]# perf test -v bpf
> > > > > 39: BPF filter:
> > > > > 39.1: Basic BPF filtering :
> > > > > Kernel build dir is set to /lib/modules/4.14.0+/build
> > > > [ ... ]
> > > > > libbpf: failed to create map (name: 'flip_table'): Invalid argument
> > > > > libbpf: failed to load object '[basic_bpf_test]'
> > > > > bpf: load objects failed
> > > > 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name")
> > > > is introduced in 4.15.
> 
> > > > I think the perf@kernel-4.15 broke on older kernels like 4.14 because
> > > > the new bpf prog/map name is only introduced since 4.15.
> 
> > > > The newer perf needs to be compatible with an older kernel?
> 
> > > Sure :-)
 
> > Would the latest features introduced in perf/libbpf supposed to be
> > available in the latest kernel only?  What may be the reason that the
> 
> Yes, then the new perf binary should try to use the new stuff, if it
> fails, use the old one, there is no requirement that one uses perf 4.14
> in lockstep with the kernel 4.14 (or any other version), perf 4.15
> should work with the 4.14 kernel as well as with 4.15 (or any other
> future kernel), only limited by what it can grok up to when it was
> released.

So, see the patch below, which makes 'perf test bpf' and my other test
cases, including the one for probe_read_str(), work again; it just
falls back to a behaviour the older kernels can accept.

We can improve it so that that EINVAL fallback happens only for
MAP_CREATE, and probably we don't need to change the size arg, just zero
the unused fields, but I haven't checked that.

- Arnaldo

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5128677e4117..3084f07c7c33 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -19,6 +19,7 @@
  * License along with this program; if not,  see 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -53,10 +54,26 @@ static inline __u64 ptr_to_u64(const void *ptr)
 	return (__u64) (unsigned long) ptr;
 }
 
-static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
-			  unsigned int size)
+static int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr, unsigned int size)
 {
-	return syscall(__NR_bpf, cmd, attr, size);
+	int err = syscall(__NR_bpf, cmd, attr, size);
+	if (err == -1 && (errno == EINVAL || errno == E2BIG)) {
+		const unsigned int old_union_size = offsetof(union bpf_attr, prog_name);
+		/*
+		 * These were the ones that added fields after the old bpf_attr
+		 * layout in commit 88cda1c9da02 ("bpf: libbpf: Provide basic
+		 * API support to specify BPF obj name") so zero that out to
+		 * pass the CHECK_ATTR() test in kernel/bpf/syscall.c in older
+		 * kernels.
+		 */
+		if (cmd == BPF_MAP_CREATE)
+			memset(&attr->map_name, 0, size - offsetof(union bpf_attr, map_name));
+		else
+			memset(&attr->prog_name, 0, size - old_union_size);
+
+		err = syscall(__NR_bpf, cmd, attr, old_union_size);
+	}
+	return err;
 }
 
 int bpf_create_map_node(enum bpf_map_type map_type, const char *name,
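
For readers wondering why the older kernel answers EINVAL at all: the comment in the patch refers to CHECK_ATTR(), which rejects a request when any byte after the last field the command knows about is non-zero. The following is a simplified userspace illustration of that kind of tail check, not the kernel macro itself; `struct demo_attr`, `tail_is_zero()`, `old_kernel_accepts()` and `make_attr()` are hypothetical names introduced only for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute layout: map_name was added after max_entries. */
struct demo_attr {
	unsigned int map_type;
	unsigned int max_entries;	/* last field the "old kernel" knows */
	char map_name[16];		/* field added by a newer release */
};

/* Return 1 if every byte in [known, total) is zero, 0 otherwise. */
static int tail_is_zero(const void *attr, size_t known, size_t total)
{
	const unsigned char *p = attr;
	size_t i;

	for (i = known; i < total; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}

/* The "old kernel" accepts a request only if the unknown tail is zero. */
static int old_kernel_accepts(const struct demo_attr *attr)
{
	return tail_is_zero(attr, offsetof(struct demo_attr, map_name),
			    sizeof(*attr));
}

/* Build a zero-initialised request, optionally carrying a name. */
static struct demo_attr make_attr(const char *name)
{
	struct demo_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.map_type = 1;
	attr.max_entries = 8;
	if (name)
		strncpy(attr.map_name, name, sizeof(attr.map_name) - 1);
	return attr;
}
```

A request built with a name is rejected by this check, and the same request is accepted once `map_name` has been zeroed, which is the effect the memset() calls in the patch are after.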