On Tue, Feb 23, 2021 at 1:12 PM Toke Høiland-Jørgensen <t...@redhat.com> wrote:
>
> "Tristan Mayfield" <mayfieldtris...@gmail.com> writes:
>
> > Toke, thanks for the quick response!
> >
> > Yes, I was checking the bpf_probe_read return values, and was reading
> > the number of bytes expected, so nothing wrong there!
>
> Right, in that case that's probably just because the struct in question
> is next to some other valid memory (not sure where tracepoints keep
> their data, but if it's on the stack, for instance, you'll have no
> problem reading past it).
>
> > Now that you mention CO-RE, it does actually make sense that these
> > sorts of errors could be shifted to load time rather than attach time
> > (is that the right phrase?). I've fiddled with CO-RE a bit but I haven't
> > adopted it for a few reasons (which I could certainly be mistaken
> > about).
>
> I'm by no means the leading authority on CO-RE, but I can take a shot at
> answering; hopefully someone will chime in to correct me if I'm wrong :)
>
> > I don't have control over kernel versions or compilation flags for the
> > kernel on the systems I'm targeting and I've had significant
> > difficulty trying to compile CO-RE programs (e.g. from the BCC repo's
> > libbpf-tools) on Linux <5.4 because I've had a hard time getting the
> > vmlinux. I can't remember if I used bpftool though (this was about a
> > year ago that I last played with CO-RE), so perhaps I'll give it
> > another shot.
>
> Yeah, getting all your ducks in a row when compiling can be a bit of an
> issue. However, I don't think you need anything special from the kernel
> at compile-time if you just compile your own programs with a vmlinux.h
> file you generated on a kernel that has been compiled with BTF.

As far as compiling CO-RE BPF programs goes, there shouldn't be much
difference between the latest kernel and an older one. Some of the
libbpf-tools may use features that only newer kernels support, but
that's a separate issue.

BTW, vmlinux.h is a pure convenience, so that you don't have to use
system headers or define your own types with
__attribute__((preserve_access_index)). vmlinux.h is not a
requirement. For libbpf-tools, though, it's pre-packaged to make life
easier (and now we have per-architecture vmlinux.h to facilitate
building libbpf-tools for various target arches).
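
To make the "define your own types" option concrete, here is a minimal
sketch. The CORE_TYPE macro and task_tgid() are names made up for this
example, and the struct lists only the two fields the example touches,
which is an assumption about what your program needs. The guard lets
the same file build on a non-BPF host compiler, where the attribute is
unavailable and the struct is just an ordinary local type.

```c
/* Hypothetical helper macro: expands to the CO-RE attribute only when the
 * compiler reports support for it (e.g. Clang targeting BPF). */
#if defined(__has_attribute)
# if __has_attribute(preserve_access_index)
#  define CORE_TYPE __attribute__((preserve_access_index))
# endif
#endif
#ifndef CORE_TYPE
# define CORE_TYPE /* host build: plain struct, no relocations recorded */
#endif

/* Minimal local view of the kernel type. Only the fields we read are
 * declared; with the attribute applied, field accesses are recorded by
 * name so the loader can adjust offsets against the target kernel's BTF. */
struct task_struct {
    int pid;
    int tgid;
} CORE_TYPE;

/* Field access through the annotated type. In a real BPF program, whether
 * a direct dereference is allowed depends on the program type; otherwise
 * you would go through a probe-read helper instead. */
int task_tgid(const struct task_struct *task)
{
    return task->tgid;
}
```

On a host build this only demonstrates the declaration pattern; the
relocation behaviour shows up when compiling for the BPF target with a
new enough Clang.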

New enough Clang is a requirement, though. Clang 11+ is preferred, but
I believe Clang 10 should have enough features for a lot of CO-RE
functionality.

>
> > I've also been very unclear, and have gotten many different answers
> > regarding the target systems and whether they need to be custom
> > compiled with BTF enabled for CO-RE programs to run on them, or if you
> > can put a CO-RE program onto a generic kernel build and it "just
> > works?" From your answer, the answer seems to be that
> > /sys/kernel/btf/vmlinux needs to be on the target system, so it must
> > have that BTF_ENABLE flag set?
>
> Well, you'll need the BTF information of the running kernel. It doesn't
> *have* to come from /sys/kernel/btf/vmlinux, libbpf will look for it in
> a few other locations as well:
>
> https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4583

Right. For older kernels that don't yet support
/sys/kernel/btf/vmlinux, it's possible to add .BTF data with pahole -J
after the kernel is built. It's also possible to provide the BTF data
separately through bpf_object_open_opts, if that's more convenient.
It's certainly an advanced use case, but doable.
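
For illustration, the kind of lookup Toke links to can be sketched in
plain C roughly like this. The candidate list is a small illustrative
subset written from memory, so treat it as an assumption and check the
linked btf.c for what libbpf really tries. The function names are made
up for this example and are not libbpf API.

```c
#include <stdio.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Illustrative candidate locations (an assumption, not libbpf's
 * authoritative list). "%s" is replaced by the kernel release string. */
static const char *const btf_candidates[] = {
    "/sys/kernel/btf/vmlinux",
    "/boot/vmlinux-%s",
    "/lib/modules/%s/build/vmlinux",
};

#define N_BTF_CANDIDATES (sizeof(btf_candidates) / sizeof(btf_candidates[0]))

/* Expand candidate number idx for the given release into buf.
 * Returns 0 on success, -1 if idx is out of range or buf is too small. */
int btf_candidate_path(size_t idx, const char *release, char *buf, size_t len)
{
    int n;

    if (idx >= N_BTF_CANDIDATES)
        return -1;
    n = snprintf(buf, len, btf_candidates[idx], release);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Fill buf with the first candidate that is readable on this machine.
 * Returns 0 if one was found, -1 otherwise. */
int find_kernel_btf(char *buf, size_t len)
{
    struct utsname un;
    size_t i;

    if (uname(&un) != 0)
        return -1;
    for (i = 0; i < N_BTF_CANDIDATES; i++) {
        if (btf_candidate_path(i, un.release, buf, len) != 0)
            continue;
        if (access(buf, R_OK) == 0)
            return 0;
    }
    return -1;
}
```

A quick way to use it is to call find_kernel_btf() with a PATH_MAX-sized
buffer on the target machine; a -1 result suggests you would need one of
the other options mentioned above.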

But, of course, having kernels built with BTF and exposing it through
/sys/kernel/btf/vmlinux is hands down the most convenient way, and
that is increasingly an option on popular Linux distros. See
[0] for a list (I think ALT Linux is going to have BTF built-in as
well).

  [0] https://github.com/libbpf/libbpf#bpf-co-re-compile-once--run-everywhere

>
> Distros have gotten pretty good about enabling BTF in their kernel
> builds, though, so it's getting increasingly feasible to rely on it. It
> should certainly be available on RHEL8 (and thus CentOS 8).
>
> > If that's set, do you also need a vmlinux.h file as well? A coworker
> > was recently messing with CO-RE and seemed to think that deploying a
> > CO-RE program required shipping the vmlinux.h file and I think he
> > mentioned that file was about 1 GB, which is certainly a no-go for
> > our position.
>
> No, you don't need to ship the vmlinux.h file. That's just a regular
> header file with an unusual amount of definitions in it, that will be
> used at compile time. It can be useful to include a copy of it in your
> source code repository, though, as mentioned above. That's what BCC
> does, for instance:
> https://github.com/iovisor/bcc/tree/master/libbpf-tools/x86
>
> And no, it's not 1GB in size. Maybe that size was from before BTF
> de-duplication got implemented? The one linked above is 2.7M.

Maybe if you build allyesconfig it can come closer to 1GB :) But as
Toke said, it's used during compilation only. After that you get a BPF
object file (the .o file), which contains all the necessary
relocation information internally and is very small. Then there is the
BPF skeleton, which lets you avoid distributing a separate .o file
(and provides a bunch of other convenience features, of course), but
it's not a requirement either.

>
> > In addition to that, I've been unclear on the role of BTF in BPF
> > generally. When I began tinkering with BPF I was under the impression
> > that BTF was *only* something used for CO-RE programs (something I
> > actually might've gotten from the article referenced and written by
> > Andrii), but I've periodically seen errors arise that cite BTF reasons
> > for erroring.
>

BTF started out as "just" compact debug info for your BPF programs,
but it quickly grew into much more and is used for many BPF-related
features. CO-RE is one big area, but there are kernel BPF features
that rely on in-kernel BTF heavily as well.

> One common cause for this has been when loading 'tc' programs with
> iproute2, because the iproute2 loader doesn't understand BTF and will
> complain about it. That is usually harmless, but I agree it's
> quite annoying. Fortunately, iproute2 has recently gained support for
> using libbpf for its BPF loading, so hopefully that particular error
> should go away before too long.
>
> > Unfortunately I haven't saved any of these errors and
> > can't remember the causes specifically, but something like the
> > "updated" maps declarations, i.e.
> >
> > struct {
> >     __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
> >     __uint(key_size, sizeof(u32));
> >     __uint(value_size, sizeof(u32));
> > } events SEC(".maps");
> >
> > I've learned does use BTF?
>
> Yes, the new-style map definitions use BTF. While BTF is ostensibly a
> type format (i.e., something that describes C data types), Andrii
> figured out that it is also possible to use it as a general purpose
> key/value store. You do this by being a bit clever about how you
> represent your data, which is what the __uint() macro in the above is
> doing (it's encoding the integer value as the size of an array, which
> becomes part of the type and thus embedded in the BTF). When loading,
> libbpf will parse this data back out of the BTF data and use it when
> creating the map. So you'll need BTF support in your compiler and in
> libbpf to use this style of map definitions.

Right. Clang 10+ should be enough (but I'm too lazy to check), which
coincides with CO-RE requirements.
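
To illustrate the array-size encoding Toke describes, here is a small
host-side sketch. DEMO_UINT mirrors the shape of libbpf's __uint()
helper, but it and every other DEMO_/demo_ name are hypothetical names
for this example. The value 4 for the perf event array map type is
assumed to match the kernel's enum bpf_map_type. libbpf reads the
length back out of the BTF at load time; the sizeof arithmetic below
only shows that the integer really is carried by the field's type.

```c
#include <stddef.h>

/* Hypothetical stand-in with the same shape as libbpf's __uint(): the
 * field is a pointer to an int array whose length is the value. */
#define DEMO_UINT(name, val) int (*name)[val]

/* Recover the value from the field's type alone:
 * sizeof(int[val]) / sizeof(int) == val. Nothing is dereferenced. */
#define DEMO_UINT_VALUE(field) (sizeof(*(field)) / sizeof(int))

/* Assumed to match BPF_MAP_TYPE_PERF_EVENT_ARRAY in enum bpf_map_type. */
enum { DEMO_MAP_TYPE_PERF_EVENT_ARRAY = 4 };

/* Same layout idea as the new-style map definition quoted above. */
struct demo_map_def {
    DEMO_UINT(type, DEMO_MAP_TYPE_PERF_EVENT_ARRAY);
    DEMO_UINT(key_size, sizeof(unsigned int));
    DEMO_UINT(value_size, sizeof(unsigned int));
};

struct demo_map_def demo_events;

/* The "stored" integers, read back out of the types. */
size_t demo_events_type(void)
{
    return DEMO_UINT_VALUE(demo_events.type);
}

size_t demo_events_key_size(void)
{
    return DEMO_UINT_VALUE(demo_events.key_size);
}
```

The struct never holds any data at runtime; the pointers stay NULL and
only their types matter, which is why the information survives into the
debug/type info.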

>
> > Am I misunderstanding what BTF is and the role it plays in BPF? Or
> > maybe has libbpf development moved so far toward CO-RE that non-CO-RE
> > development gets similar or the same error messages that just aren't
> > as clear for it?
>
> Hmm, no, CO-RE is the specific feature that does relocations of struct
> fields based on member names. This relies on BTF, but it's not the only

CO-RE is more than just field offset relocations, btw: you can detect
type and field existence, get type sizes, use relocatable enums
(internal kernel enums can get renumbered, and this feature lets you
accommodate that), etc.

> thing BTF is used for. The map definition is another, as you discovered,
> and there are some program types that cannot work without BTF
> information at all. Also, things like bpftool being able to print out
> the struct layout of map values is using BTF. So you're certainly right
> that the BPF ecosystem in general is moving towards using BTF in more
> and more places. And I guess you're also right that this leads to some
> cryptic error messages sometimes... :)
>

Thanks for your reply, Toke. I don't think I added much value here :)

> -Toke

