2018-04-09 11:01 UTC+0200 ~ Daniel Borkmann <dan...@iogearbox.net>
> On 04/06/2018 01:11 PM, Quentin Monnet wrote:
>> eBPF helper functions can be called from within eBPF programs to perform
>> a variety of tasks that would be otherwise hard or impossible to do with
>> eBPF itself. There is a growing number of such helper functions in the
>> kernel, but documentation is scarce. The main user space header file
>> does contain a short commented description of most helpers, but it is
>> somewhat outdated and not complete. It is more a "cheat sheet" than a
>> real documentation accessible to new eBPF developers.
>>
>> This commit attempts to improve the situation by replacing the existing
>> overview for the helpers with a more developed description. Furthermore,
>> a Python script is added to generate a manual page for eBPF helpers. The
>> workflow is the following, and requires the rst2man utility:
>>
>>     $ ./scripts/bpf_helpers_doc.py \
>>             --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst
>>     $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7
>>     $ man /tmp/bpf-helpers.7
>>
>> The objective is to keep all documentation related to the helpers in a
>> single place, and to be able to generate from here a manual page that
>> could be packaged in the man-pages repository and shipped with most
>> distributions [1].
>>
>> Additionally, parsing the prototypes of the helper functions could
>> hopefully be reused, with a different Printer object, to generate
>> header files needed in some eBPF-related projects.
>>
>> Regarding the description of each helper, it comprises several items:
>>
>> - The function prototype.
>> - A description of the function and of its arguments (except for a
>>   couple of cases, when there are no arguments and the return value
>>   makes the function usage really obvious).
>> - A description of return values (if not void).
>> - A listing of eBPF program types (if relevant, map types) compatible
>>   with the helper.
>> - Information about the helper being restricted to GPL programs, or not.
>> - The kernel version in which the helper was introduced.
>> - The commit that introduced the helper (this is mostly to have it in
>>   the source of the man page, as it can be used to track changes and
>>   update the page).
>>
>> For several helpers, descriptions are inspired (at times, nearly copied)
>> from the commit logs introducing them in the kernel--Many thanks to
>> their respective authors! They were completed as much as possible, the
>> objective being to have something easily accessible even for people just
>> starting with eBPF. There is probably a bit more work to do in this
>> direction for some helpers.
>>
>> Some RST formatting is used in the descriptions (not in function
>> prototypes, to keep them readable, but the Python script provided in
>> order to generate the RST for the manual page does add formatting to
>> prototypes, to produce something pretty) to get "bold" and "italics" in
>> manual pages. Hopefully, the descriptions in bpf.h file remains
>> perfectly readable. Note that the few trailing white spaces are
>> intentional, removing them would break paragraphs for rst2man.
>>
>> The descriptions should ideally be updated each time someone adds a new
>> helper, or updates the behaviour (compatibility extended to new program
>> types, new socket option supported...) or the interface (new flags
>> available, ...) of existing ones.
>>
>> [1] I have not contacted people from the man-pages project prior to
>>     sending this RFC, so I can offer no guaranty at this time that they
>>     would accept to take the generated man page.
>>
>> Cc: linux-...@vger.kernel.org
>> Cc: linux-...@vger.kernel.org
>> Signed-off-by: Quentin Monnet <quentin.mon...@netronome.com>
> 
> Great work, thanks a lot for doing this!
> 
> [...]
>> + * int bpf_probe_read(void *dst, u32 size, const void *src)
>> + *  Description
>> + *          For tracing programs, safely attempt to read *size* bytes from
>> + *          address *src* and store the data in *dst*.
>> + *  Return
>> + *          0 on success, or a negative error in case of failure.
>> + *  For
>> + *          *Tracing programs*.
>> + *  GPL only
>> + *          Yes
>> + *  Since
>> + *          Linux 4.1
>> + *
>> + * .. commit 2541517c32be
>>   *
>>   * u64 bpf_ktime_get_ns(void)
>> - *     Return: current ktime
>> - *
>> - * int bpf_trace_printk(const char *fmt, int fmt_size, ...)
>> - *     Return: length of buffer written or negative error
>> + *  Description
>> + *          Return the time elapsed since system boot, in nanoseconds.
>> + *  Return
>> + *          Current *ktime*.
>> + *  For
>> + *          All program types, except
>> + *          **BPF_PROG_TYPE_CGROUP_DEVICE**.
> 
> I think we should probably always just list the actual program types instead,
> since when new types get added to the kernel not supporting bpf_ktime_get_ns()
> helper in this example, the above statement would be misleading (potentially
> more misleading than the other way around when it's not yet mentioned to be
> supported).

Agreed. I realise “All program types” is really awkward here.

>> + *  GPL only
>> + *          Yes
>> + *  Since
>> + *          Linux 4.1
>> + *
>> + * .. commit d9847d310ab4
>> + *

[...]

>> + * u32 bpf_get_smp_processor_id(void)
>> + *  Return
>> + *          The SMP (Symmetric multiprocessing) processor id.
>> + *  For
>> + *          *Networking programs*.
> 
> Ditto plus above is actually not correct. See tracing_func_proto():
> 
>         [...]
>         case BPF_FUNC_get_smp_processor_id:
>                 return &bpf_get_smp_processor_id_proto;
>         [...]

Thanks for the catch! I will fix on my side, even if we drop the "For"
section for now.

>> + *  GPL only
>> + *          No
>> + *  Since
>> + *          Linux 4.1
>> + *
>> + * .. commit c04167ce2ca0
>> + *
>> + * int bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void 
>> *from, u32 len, u64 flags)
>> + *  Description
>> + *          Store *len* bytes from address *from* into the packet
>> + *          associated to *skb*, at *offset*. *flags* are a combination of
>> + *          **BPF_F_RECOMPUTE_CSUM** (automatically recompute the
>> + *          checksum for the packet after storing the bytes) and
>> + *          **BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
>> + *          **->swhash** and *skb*\ **->l4hash** to 0).
>> + *
>> + *          A call to this helper is susceptible to change data from the
>> + *          packet. Therefore, at load time, all checks on pointers
>> + *          previously done by the verifier are invalidated and must be
>> + *          performed again.
>> + *  Return
> [...]
> 
> Agree with Alexei that 'Description' but also 'Return' is the most useful
> (the latter we can still improve a bit wrt errors). 'For' gets indeed quickly
> out of hand but I do think that for BPF program developers 'Since' has good
> value to quickly check when a helper could be available, but then again
> this is actually tied to an individual kconfig and/or program type, and could
> potentially require to add or (worst case) remove the info in a second commit
> e.g. after the merge window similar with the commit sha. Probably best indeed
> to stick to the signature, 'Description' and 'Return' initially.

I understand that the "For" section will be the hardest to maintain
up-to-date, however, I believe that--along with minimal kernel version
and GPL information--it would be useful for readers (Alexei, what is
your case against the "GPL only" section?). Daniel, your proposal for
the "Since" information in a second time sounds good! But I do not have
a solution right now for maintaining the "For" section on the long term.

Anyway, I am fine with keeping just signatures, descriptions and return
values for now. I will submit a new version with only those items.

> We should probably also have the bpf_helpers_doc.py workflow in a separate
> comment above this one such that developers don't have to look up the original
> commit message, but have the required steps to render the rst/man page 
> available
> right where they need it.

Sure! I was considering adding some comments on top of the description,
but haven't done it for the RFC. I will do for the next version.

Thanks for the reviews!
Quentin

Reply via email to