[Added Eduard Zingerman in CC, who is implementing this same feature in
 clang/llvm and also the consumer component in the kernel (pahole).]

Hi Richard.

> On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Hello,
>>
>> This series adds support for a new attribute, "btf_decl_tag" in GCC.
>> The same attribute is already supported in clang, and is used by various
>> components of the BPF ecosystem.
>>
>> The purpose of the attribute is to associate (to "tag")
>> declarations with arbitrary string annotations, which are emitted into
>> debugging information (DWARF and/or BTF) to facilitate post-compilation
>> analysis (the motivating use case being the Linux kernel BPF verifier).
>> Multiple tags are allowed on the same declaration.
>>
>> These strings are not interpreted by the compiler, and the attribute
>> itself has no effect on generated code, other than to produce additional
>> DWARF DIEs and/or BTF records conveying the annotations.
>>
>> This entails:
>>
>> - A new C-language-level attribute which associates ("tags")
>>   particular declarations with arbitrary strings.
>>
>> - The conveyance of that information in DWARF in the form of a new DIE,
>>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
>>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
>>   the same purpose. These DIEs are already supported by BPF tooling,
>>   such as pahole.
>>
>> - The conveyance of that information in BTF debug info in the form of
>>   BTF_KIND_DECL_TAG records. These records are already supported by
>>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
>>   eBPF verifier.
>>
>>
>> Background
>> ==========
>>
>> The purpose of these tags is to convey additional semantic information
>> to post-compilation consumers, in particular the Linux kernel eBPF
>> verifier. The verifier can make use of that information while analyzing
>> a BPF program to aid in determining whether to allow or reject the
>> program to be run. More background on these tags can be found in the
>> early support for them in the kernel here [1] and [2].
>>
>> The "btf_decl_tag" attribute is half the story; the other half is a
>> sibling attribute "btf_type_tag" which serves the same purpose but
>> applies to types. Support for btf_type_tag will come in a separate
>> patch series, since it is impacted by GCC bug 110439 which needs to be
>> addressed first.
>>
>> I submitted an initial version of this work (including btf_type_tag)
>> last spring [3], however at the time there were some open questions
>> about the behavior of the btf_type_tag attribute and issues with its
>> implementation. Since then we have clarified these details and agreed
>> to solutions with the BPF community and LLVM BPF folks.
>>
>> The main motivation for emitting the tags in DWARF is that the Linux
>> kernel generates its BTF information via pahole, using DWARF as a source:
>>
>>     +--------+  BTF                  BTF   +----------+
>>     | pahole |-------> vmlinux.btf ------->| verifier |
>>     +--------+                             +----------+
>>         ^                                        ^
>>         |                                        |
>>   DWARF |                                    BTF |
>>         |                                        |
>>       vmlinux                              +-------------+
>>       module1.ko                           | BPF program |
>>       module2.ko                           +-------------+
>>         ...
>>
>> This is because:
>>
>> a)  pahole adds additional kernel-specific information into the
>>     produced BTF based on additional analysis of kernel objects.
>>
>> b)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>>
>> c)  GCC can generate BTF for any target with -gbtf, but there is no
>>     support for linking/deduplicating BTF in the linker.
>>
>> In the scenario above, the verifier needs access to the pointer tags of
>> both the kernel types/declarations (conveyed in the DWARF and translated
>> to BTF by pahole) and those of the BPF program (available directly in BTF).
>>
>>
>> DWARF Representation
>> ====================
>>
>> As noted above, btf_decl_tag is represented in DWARF via a new DIE
>> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
>> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
>> the following format:
>>
>>   DW_TAG_GNU_annotation (0x6000)
>>     DW_AT_name: "btf_decl_tag"
>>     DW_AT_const_value: <string argument>
>>
>> These DIEs are placed in the DWARF tree as children of the DIE for the
>> appropriate declaration, and one such DIE is created for each occurrence
>> of the btf_decl_tag attribute on a declaration.
>>
>> For example:
>>
>>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
>>
>> This declaration produces the following DWARF:
>>
>>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>>     <1f>   DW_AT_name        : c
>>     <24>   DW_AT_type        : <0x49>
>>     ...
>>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
>>     <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>>     <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
>>  <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
>>     <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>>     <44>   DW_AT_const_value : __c
>>  <2><48>: Abbrev Number: 0
>>  <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
>>  ...
>>
>> The DIEs for btf_decl_tag are placed as children of the DIE for
>> variable "c".
>
> It looks like a bit of overkill, and inefficient as well.  Why are the
> tags not referenced via the existing DW_AT_description?

The DWARF spec ("Entity Descriptions") seems to imply that the
DW_AT_description attribute is intended to hold alternative ways to
denote the same "debugging information" (object, type, ...), i.e.
aliases that refer to the same entity as the DW_AT_name.  For example,
for a type name='foo' we could have description='aka. long int'.  We
don't think this is the case for the btf tags, which are more like
properties partially characterizing the tagged "debugging information",
and which could not be used as an alias for the name.

Also, repurposing the DW_AT_description attribute to hold btf tag
information would require introducing a mini-language, and the clients
would then have to parse it: how to denote several tags, how to encode
the embedded string contents, etc.  You kick the complexity out the door
and it comes back in through the window :)
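
To make that concrete, here is a rough sketch of what every consumer
would need if the tags were packed into one description string.  The
encoding is purely hypothetical (a comma-separated list of
btf_decl_tag("...") items with backslash escapes); nothing in GCC, LLVM
or pahole emits or reads it, and the names parse_description_tags and
MAX_TAG_LEN are made up for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TAG_LEN 64	/* hypothetical limit, for the sketch only */

/* Extract the tag strings from DESC, assuming the made-up encoding
     btf_decl_tag("a"), btf_decl_tag("b")
   where a backslash escapes the next character inside the quotes.
   Returns the number of tags stored in TAGS (at most MAX_TAGS), or -1
   if a string is not terminated.  Over-long tags are truncated.  */
int
parse_description_tags (const char *desc, char tags[][MAX_TAG_LEN],
			int max_tags)
{
  static const char prefix[] = "btf_decl_tag(\"";
  int n = 0;

  assert (desc != NULL);

  while (n < max_tags)
    {
      const char *p = strstr (desc, prefix);
      size_t len = 0;

      if (p == NULL)
	break;
      p += sizeof prefix - 1;

      while (*p != '\0' && *p != '"')
	{
	  if (*p == '\\' && p[1] != '\0')
	    p++;		/* Drop the backslash, keep the escaped char.  */
	  if (len + 1 < MAX_TAG_LEN)
	    tags[n][len++] = *p;
	  p++;
	}
      if (*p != '"')
	return -1;		/* Unterminated string.  */

      tags[n][len] = '\0';
      n++;
      desc = p + 1;		/* Continue after the closing quote.  */
    }
  return n;
}
```

Even this toy version has to decide on escaping, truncation and error
handling, and every client would have to agree on the same answers.
With one DIE per tag, the string arrives as-is in DW_AT_const_value and
none of this is needed.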

Finally, for all we know, the existing attribute may already be used by
some language and handled by some debugger the way it is recommended in
the spec.  That would be incompatible with having btf tags encoded
there.

> Iff you want new TAGs why require them as children for each DIE rather
> than referencing (and sharing!) them via a DIE reference from a new
> attribute?

Hmm, that's a very good question.  The Linux kernel sources use both
declaration tags and type tags, and not sharing the DIEs may result in
serious bloat, since the tags are brought into declarations and type
specifiers via macros...
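
As an illustration of the pattern (a sketch only: DECL_TAG, DEVICEMEM,
struct nic_regs and nic_ready are invented names, not the kernel's
actual macros), a single macro definition ends up attaching the same
string to many declarations:

```c
#include <assert.h>

/* Fall back to "attribute not supported" on compilers that lack
   __has_attribute.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Expand to the attribute only where the compiler knows it, so the
   example also builds with a compiler lacking btf_decl_tag.  */
#if __has_attribute (btf_decl_tag)
# define DECL_TAG(s) __attribute__ ((btf_decl_tag (s)))
#else
# define DECL_TAG(s)
#endif

/* Hypothetical project-wide shorthand.  */
#define DEVICEMEM DECL_TAG ("devicemem")

struct nic_regs
{
  unsigned int status DEVICEMEM;	/* use 1 */
  unsigned int control DEVICEMEM;	/* use 2 */
};

struct nic_regs nic DEVICEMEM;		/* use 3 */

/* The tags do not change the generated code; this reads the field
   exactly as it would without them.  */
unsigned int
nic_ready (const struct nic_regs *r)
{
  return r->status & 1u;
}
```

With the representation in this series, each of the three uses gets its
own child annotation DIE carrying the same "devicemem" string, whereas
a reference from a new attribute would let all three point at one
shared DIE.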

> That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
>
> But well ...
>
> Richard.
>
>>
>> BTF Representation
>> ==================
>>
>> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
>> to the annotated object by BTF type ID, as well as a component index which is
>> used for btf_decl_tags placed on struct/union members or function arguments.
>>
>> For example, the BTF for the above declaration is:
>>
>>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>   [2] CONST '(anon)' type_id=1
>>   [3] PTR '(anon)' type_id=2
>>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
>>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
>>   [6] VAR 'c' type_id=3, linkage=global
>>   ...
>>
>> The BTF format is documented here [4].
>>
>>
>> References
>> ==========
>>
>> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
>> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-...@fb.com/
>> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
>> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
>>
>>
>> David Faust (9):
>>   c-family: add btf_decl_tag attribute
>>   include: add BTF decl tag defines
>>   dwarf: create annotation DIEs for decl tags
>>   dwarf: expose get_die_parent
>>   ctf: add support to pass through BTF tags
>>   dwarf2ctf: convert annotation DIEs to CTF types
>>   btf: create and output BTF_KIND_DECL_TAG types
>>   testsuite: add tests for BTF decl tags
>>   doc: document btf_decl_tag attribute
>>
>>  gcc/btfout.cc                                 | 81 ++++++++++++++++++-
>>  gcc/c-family/c-attribs.cc                     | 23 ++++++
>>  gcc/ctf-int.h                                 | 28 +++++++
>>  gcc/ctfc.cc                                   | 10 ++-
>>  gcc/ctfc.h                                    | 17 +++-
>>  gcc/doc/extend.texi                           | 47 +++++++++++
>>  gcc/dwarf2ctf.cc                              | 73 ++++++++++++++++-
>>  gcc/dwarf2out.cc                              | 37 ++++++++-
>>  gcc/dwarf2out.h                               |  1 +
>>  .../gcc.dg/debug/btf/btf-decltag-func.c       | 21 +++++
>>  .../gcc.dg/debug/btf/btf-decltag-sou.c        | 33 ++++++++
>>  .../gcc.dg/debug/btf/btf-decltag-var.c        | 19 +++++
>>  .../gcc.dg/debug/dwarf2/annotation-decl-1.c   |  9 +++
>>  .../gcc.dg/debug/dwarf2/annotation-decl-2.c   | 18 +++++
>>  .../gcc.dg/debug/dwarf2/annotation-decl-3.c   | 17 ++++
>>  include/btf.h                                 | 14 +++-
>>  include/dwarf2.def                            |  4 +
>>  17 files changed, 437 insertions(+), 15 deletions(-)
>>  create mode 100644 gcc/ctf-int.h
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
>>
>> --
>> 2.40.1
>>
