On 5/6/22 2:18 PM, David Faust wrote:


On 5/5/22 16:00, Yonghong Song wrote:


On 5/4/22 10:03 AM, David Faust wrote:


On 5/3/22 15:32, Joseph Myers wrote:
On Mon, 2 May 2022, David Faust via Gcc-patches wrote:

Consider the following example:

     #define __typetag1 __attribute__((btf_type_tag("tag1")))
     #define __typetag2 __attribute__((btf_type_tag("tag2")))
     #define __typetag3 __attribute__((btf_type_tag("tag3")))

     int __typetag1 * __typetag2 __typetag3 * g;

The expected behavior is that 'g' is "a pointer with tags 'tag2' and
'tag3',
to a pointer with tag 'tag1' to an int". i.e.:

That's not a correct expectation for either GNU __attribute__ or C2x [[]]
attribute syntax.  In either syntax, __typetag2 __typetag3 should
apply to
the type to which g points, not to g or its type, just as if you had a
type qualifier there.  You'd need to put the attributes (or qualifier)
after the *, not before, to make them apply to the pointer type.  See
"Attribute Syntax" in the GCC manual for how the syntax is defined for
GNU
attributes and deduce in turn, for each subsequence of the tokens
matching
the syntax for some kind of declarator, what the type for "T D1" would be
as defined there and in the C standard, as deduced from the type for
"T D"
for a sub-declarator D.
  >> But GCC's attribute parsing produces a variable 'g' which is "a
pointer with
tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.

In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
syntax it applies to int.  Again, if you wanted it to apply to the
pointer
type it would need to go after the * not before.

If you are concerned with the fine details of what construct an attribute
appertains to, I recommend using C2x syntax not GNU syntax.


Joseph, thank you! This is very helpful. My understanding of the syntax
was not correct.

(Actually, I made a bad mistake in paraphrasing this example from the
discussion of it in the series cover letter. But, the reason why it is
incorrect is the same.)


Yonghong, is the specific ordering an expectation in BPF programs or
other users of the tags?

This is probably a language writing issue. We are saying tags only
apply to pointer. We probably should say it only apply to pointee.

$ cat t.c
int const *ptr;

the llvm ir debuginfo:

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
!6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

We could replace 'const' with a tag like below:

int __attribute__((btf_type_tag("tag"))) *ptr;

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
annotations: !7)
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!7 = !{!8}
!8 = !{!"btf_type_tag", !"tag"}

In the above IR, we generate annotations to pointer_type because
we didn't invent a new DI type for encode btf_type_tag. But it is
totally okay to have IR looks like

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
!11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

OK, thanks.

There is still the question of why the DWARF generated for this case that I have been concerned about:

   int __typetag1 * __typetag2 __typetag3 * g;

differs between GCC (with this series) and clang. After studying it, GCC is doing with the attributes exactly as is described in the Attribute Syntax portion of the GCC manual where the GNU syntax is described. I do not think there is any problem here.

So the difference in DWARF suggests to me that clang is not handling the GNU attribute syntax in this particular case correctly, since it seems to be associating __typetag2 and __typetag3 to g's type rather than the type to which it points.

I am not sure whether for the use purposes of the tags this difference is very important, but it is worth noting.


As Joseph suggested, it may be better to encourage users of these tags to use the C2x attribute syntax if they are concerned with precisely which construct the tag applies.

This would also be a way around any issues in handling the attributes due to the GNU syntax.

I tried a few test cases using C2x syntax BTF type tags with a clang-15 build, but ran into some issues (in particular, some of the tag attributes being ignored altogether). I couldn't find confirmation whether C2x attribute syntax is fully supported in clang yet, so maybe this isn't expected to work. Do you know whether the C2x syntax is fully supported in clang yet?

Actually, I don't know either. But since the btf decl_tag and type_tag
are also used to compile linux kernel and the minimum compiler version
to compile kernel is gcc5.1 and clang11. I am not sure whether gcc5.1
supports c2x or not, I guess probably not. So I think we most likely
cannot use c2x syntax.




This example comes from my testing against clang to check that the BTF
generated by both toolchains is compatible. In this case we get
different results when using the GNU attribute syntax.


To avoid confusion, here is the full example (from the cover letter).
The difference in the results is clear in the DWARF.

Consider the following example:

   #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
   #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
   #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))

   int __typetag1 * __typetag2 __typetag3 * g;

  <var_decl 0x7ffff7547090 g
     type <pointer_type 0x7ffff7548000
         type <pointer_type 0x7ffff75097e0 type <integer_type
0x7ffff74495e8 int>
             asm_written unsigned DI
             size <integer_cst 0x7ffff743c450 constant 64>
             unit-size <integer_cst 0x7ffff743c468 constant 8>
             align:64 warn_if_not_align:0 symtab:0 alias-set -1
canonical-type 0x7ffff7450888
             attributes <tree_list 0x7ffff75275c8
                 purpose <identifier_node 0x7ffff753a1e0 btf_type_tag>
                 value <tree_list 0x7ffff7527550
                     value <string_cst 0x7ffff75292e0 type <array_type
0x7ffff7509738>
                         readonly constant static "type-tag-3\000">>
                 chain <tree_list 0x7ffff75275a0 purpose
<identifier_node 0x7ffff753a1e0 btf_type_tag>
                     value <tree_list 0x7ffff75274d8
                         value <string_cst 0x7ffff75292c0 type
<array_type 0x7ffff7509738>
                             readonly constant static "type-tag-2\000">>>>
             pointer_to_this <pointer_type 0x7ffff7509888>>
         asm_written unsigned DI size <integer_cst 0x7ffff743c450 64>
unit-size <integer_cst 0x7ffff743c468 8>
         align:64 warn_if_not_align:0 symtab:0 alias-set -1
canonical-type 0x7ffff7509930
         attributes <tree_list 0x7ffff75275f0 purpose <identifier_node
0x7ffff753a1e0 btf_type_tag>
             value <tree_list 0x7ffff7527438
                 value <string_cst 0x7ffff75292a0 type <array_type
0x7ffff7509738>
                     readonly constant static "type-tag-1\000">>>>
     public static unsigned DI defer-output
/home/dfaust/playpen/btf/annotate.c:29:42 size <integer_cst
0x7ffff743c450 64> unit-size <integer_cst 0x7ffff743c468 8>
     align:64 warn_if_not_align:0>


The current implementation produces the following DWARF:

  <1><1e>: Abbrev Number: 4 (DW_TAG_variable)
     <1f>   DW_AT_name        : g
     <21>   DW_AT_decl_file   : 1
     <22>   DW_AT_decl_line   : 6
     <23>   DW_AT_decl_column : 42
     <24>   DW_AT_type        : <0x32>
     <28>   DW_AT_external    : 1
     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0
(DW_OP_addr: 0)
  <1><32>: Abbrev Number: 2 (DW_TAG_pointer_type)
     <33>   DW_AT_byte_size   : 8
     <33>   DW_AT_type        : <0x45>
     <37>   DW_AT_sibling     : <0x45>
  <2><3b>: Abbrev Number: 1 (User TAG value: 0x6000)
     <3c>   DW_AT_name        : (indirect string, offset: 0x18):
btf_type_tag
     <40>   DW_AT_const_value : (indirect string, offset: 0xc7):
type-tag-1
  <2><44>: Abbrev Number: 0
  <1><45>: Abbrev Number: 2 (DW_TAG_pointer_type)
     <46>   DW_AT_byte_size   : 8
     <46>   DW_AT_type        : <0x61>
     <4a>   DW_AT_sibling     : <0x61>
  <2><4e>: Abbrev Number: 1 (User TAG value: 0x6000)
     <4f>   DW_AT_name        : (indirect string, offset: 0x18):
btf_type_tag
     <53>   DW_AT_const_value : (indirect string, offset: 0xd): type-tag-3
  <2><57>: Abbrev Number: 1 (User TAG value: 0x6000)
     <58>   DW_AT_name        : (indirect string, offset: 0x18):
btf_type_tag
     <5c>   DW_AT_const_value : (indirect string, offset: 0xd2):
type-tag-2
  <2><60>: Abbrev Number: 0
  <1><61>: Abbrev Number: 5 (DW_TAG_base_type)
     <62>   DW_AT_byte_size   : 4
     <63>   DW_AT_encoding    : 5    (signed)
     <64>   DW_AT_name        : int
  <1><68>: Abbrev Number: 0

This does not agree with the DWARF produced by LLVM/clang for the same
case:
(clang 15.0.0 git 142501117a78080d2615074d3986fa42aa6a0734)

<1><1e>: Abbrev Number: 2 (DW_TAG_variable)
     <1f>   DW_AT_name        : (indexed string: 0x3): g
     <20>   DW_AT_type        : <0x29>
     <24>   DW_AT_external    : 1
     <24>   DW_AT_decl_file   : 0
     <25>   DW_AT_decl_line   : 6
     <26>   DW_AT_location    : 2 byte block: a1 0     ((Unknown
location op 0xa1))
  <1><29>: Abbrev Number: 3 (DW_TAG_pointer_type)
     <2a>   DW_AT_type        : <0x35>
  <2><2e>: Abbrev Number: 4 (User TAG value: 0x6000)
     <2f>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
     <30>   DW_AT_const_value : (indexed string: 0x7): type-tag-2
  <2><31>: Abbrev Number: 4 (User TAG value: 0x6000)
     <32>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
     <33>   DW_AT_const_value : (indexed string: 0x8): type-tag-3
  <2><34>: Abbrev Number: 0
  <1><35>: Abbrev Number: 3 (DW_TAG_pointer_type)
     <36>   DW_AT_type        : <0x3e>
  <2><3a>: Abbrev Number: 4 (User TAG value: 0x6000)
     <3b>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
     <3c>   DW_AT_const_value : (indexed string: 0x6): type-tag-1
  <2><3d>: Abbrev Number: 0
  <1><3e>: Abbrev Number: 5 (DW_TAG_base_type)
     <3f>   DW_AT_name        : (indexed string: 0x4): int
     <40>   DW_AT_encoding    : 5    (signed)
     <41>   DW_AT_byte_size   : 4
  <1><42>: Abbrev Number: 0


Because GCC produces BTF from the internal DWARF DIE tree, the BTF
also differs.
This can be seen most obviously in the BTF type reference chains:

   GCC
     VAR (g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int

   LLVM
     VAR (g) -> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int


Reply via email to