Hello,

This series adds support for a new attribute, "btf_decl_tag", in GCC.
The same attribute is already supported in clang, and is used by various
components of the BPF ecosystem.

The purpose of the attribute is to associate ("tag") declarations with
arbitrary string annotations, which are emitted into debugging
information (DWARF and/or BTF) to facilitate post-compilation analysis
(the motivating use case being the Linux kernel BPF verifier).  Multiple
tags are allowed on the same declaration.

These strings are not interpreted by the compiler, and the attribute
itself has no effect on generated code, other than to produce additional
DWARF DIEs and/or BTF records conveying the annotations.

This entails:

- A new C-language-level attribute which associates ("tags") particular
  declarations with arbitrary strings.

- The conveyance of that information in DWARF in the form of a new DIE,
  DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
  that of the DW_TAG_LLVM_annotation extension supported in LLVM for
  the same purpose.  These DIEs are already supported by BPF tooling,
  such as pahole.

- The conveyance of that information in BTF debug info in the form of
  BTF_KIND_DECL_TAG records.  These records are already supported by
  LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
  eBPF verifier.

Background
==========

The purpose of these tags is to convey additional semantic information
to post-compilation consumers, in particular the Linux kernel eBPF
verifier.  The verifier can make use of that information while analyzing
a BPF program to help determine whether to allow the program to run or
to reject it.  More background on these tags can be found in the early
kernel support for them; see [1] and [2].

The "btf_decl_tag" attribute is half the story; the other half is a
sibling attribute "btf_type_tag" which serves the same purpose but
applies to types.  Support for btf_type_tag will come in a separate
patch series, since it is impacted by GCC bug 110439, which needs to be
addressed first.

I submitted an initial version of this work (including btf_type_tag)
last spring [3], however at the time there were some open questions
about the behavior of the btf_type_tag attribute and issues with its
implementation.  Since then we have clarified these details and agreed
on solutions with the BPF community and LLVM BPF folks.

The main motivation for emitting the tags in DWARF is that the Linux
kernel generates its BTF information via pahole, using DWARF as a
source:

    +--------+  BTF                  BTF   +----------+
    | pahole |-------> vmlinux.btf ------->| verifier |
    +--------+                             +----------+
        ^                                        ^
        |                                        |
  DWARF |                                    BTF |
        |                                        |
     vmlinux                              +-------------+
     module1.ko                           | BPF program |
     module2.ko                           +-------------+
       ...

This is because:

a) pahole adds additional kernel-specific information into the produced
   BTF based on additional analysis of kernel objects.

b) Unlike GCC, LLVM will only generate BTF for BPF programs.

c) GCC can generate BTF for whatever target with -gbtf, but there is no
   support for linking/deduplicating BTF in the linker.

In the scenario above, the verifier needs access to the tags of both
the kernel types/declarations (conveyed in the DWARF and translated to
BTF by pahole) and those of the BPF program (available directly in BTF).

DWARF Representation
====================

As noted above, btf_decl_tag is represented in DWARF via a new DIE,
DW_TAG_GNU_annotation, with a format identical to that of the LLVM DWARF
extension DW_TAG_LLVM_annotation, which serves the same purpose.  The
DIE has the following format:

  DW_TAG_GNU_annotation (0x6000)
    DW_AT_name: "btf_decl_tag"
    DW_AT_const_value: <string argument>

These DIEs are placed in the DWARF tree as children of the DIE for the
appropriate declaration, and one such DIE is created for each occurrence
of the btf_decl_tag attribute on a declaration.

For example:

  const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));

This declaration produces the following DWARF:

 <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
    <1f>   DW_AT_name        : c
    <24>   DW_AT_type        : <0x49>
    ...
 <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
    <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
    <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
 <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
    <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
    <44>   DW_AT_const_value : __c
 <2><48>: Abbrev Number: 0
 <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
    ...

The DIEs for btf_decl_tag are placed as children of the DIE for
variable "c".

BTF Representation
==================

In BTF, BTF_KIND_DECL_TAG records convey the annotations.  These records
refer to the annotated object by BTF type ID, as well as a component
index which is used for btf_decl_tags placed on struct/union members or
function arguments.

For example, the BTF for the above declaration is:

  [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
  [2] CONST '(anon)' type_id=1
  [3] PTR '(anon)' type_id=2
  [4] DECL_TAG '__c' type_id=6 component_idx=-1
  [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
  [6] VAR 'c' type_id=3, linkage=global
  ...

The BTF format is documented here [4].

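The component index mentioned above can be illustrated with a small,
hypothetical C fragment.  This is only a sketch: the DECL_TAG macro, the
struct, and the function are invented for illustration and are not part
of this series.  Per the BTF documentation [4], component_idx is -1 when
the tag applies to the declaration itself, and is otherwise the
zero-based index of the tagged member or argument; the comments below
state the values expected under that assumption and have not been
checked against the output of this series.

```c
/* Hypothetical wrapper (not part of the series): expand to the attribute
   only when the compiler reports support for it, so the fragment also
   builds with compilers that lack btf_decl_tag.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute (btf_decl_tag)
# define DECL_TAG(str) __attribute__ ((btf_decl_tag (str)))
#else
# define DECL_TAG(str)
#endif

struct packet
{
  /* Tag on member 0: expected component_idx=0 on the struct's type ID.  */
  int len DECL_TAG ("len_tag");
  /* Two tags on member 1: expected two records with component_idx=1.  */
  const char *data DECL_TAG ("data_tag") DECL_TAG ("devicemem");
};

/* Tag on argument 0 (expected component_idx=0) and on the function
   itself (expected component_idx=-1).  */
int checksum (const struct packet *pkt DECL_TAG ("pkt_tag"))
  DECL_TAG ("fn_tag");

/* The tags do not affect generated code; this is an ordinary function
   that sums the bytes of the packet payload.  */
int
checksum (const struct packet *pkt)
{
  int sum = 0;
  for (int i = 0; i < pkt->len; i++)
    sum += (unsigned char) pkt->data[i];
  return sum;
}
```

With a compiler that supports the attribute, building this with -gbtf
(or with DWARF debug info enabled) should emit one annotation per
DECL_TAG use; elsewhere the macro expands to nothing and the code builds
unchanged.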

References
==========

[1] https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
[2] https://lore.kernel.org/bpf/20211011040608.3031468-1-...@fb.com/
[3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
[4] https://www.kernel.org/doc/Documentation/bpf/btf.rst

David Faust (9):
  c-family: add btf_decl_tag attribute
  include: add BTF decl tag defines
  dwarf: create annotation DIEs for decl tags
  dwarf: expose get_die_parent
  ctf: add support to pass through BTF tags
  dwarf2ctf: convert annotation DIEs to CTF types
  btf: create and output BTF_KIND_DECL_TAG types
  testsuite: add tests for BTF decl tags
  doc: document btf_decl_tag attribute

 gcc/btfout.cc                               | 81 ++++++++++++++++++-
 gcc/c-family/c-attribs.cc                   | 23 ++++++
 gcc/ctf-int.h                               | 28 +++++++
 gcc/ctfc.cc                                 | 10 ++-
 gcc/ctfc.h                                  | 17 +++-
 gcc/doc/extend.texi                         | 47 +++++++++++
 gcc/dwarf2ctf.cc                            | 73 ++++++++++++++++-
 gcc/dwarf2out.cc                            | 37 ++++++++-
 gcc/dwarf2out.h                             |  1 +
 .../gcc.dg/debug/btf/btf-decltag-func.c     | 21 +++++
 .../gcc.dg/debug/btf/btf-decltag-sou.c      | 33 ++++++++
 .../gcc.dg/debug/btf/btf-decltag-var.c      | 19 +++++
 .../gcc.dg/debug/dwarf2/annotation-decl-1.c |  9 +++
 .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++
 .../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++
 include/btf.h                               | 14 +++-
 include/dwarf2.def                          |  4 +
 17 files changed, 437 insertions(+), 15 deletions(-)
 create mode 100644 gcc/ctf-int.h
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c

-- 
2.40.1