On 6/2/22 19:04, Yonghong Song wrote:
>
>
> On 5/27/22 12:56 PM, David Faust wrote:
>>
>>
>> On 5/26/22 00:29, Yonghong Song wrote:
>>>
>>>
>>> On 5/24/22 10:04 AM, David Faust wrote:
>>>>
>>>>
>>>> On 5/24/22 09:03, Yonghong Song wrote:
>>>>>
>>>>>
>>>>> On 5/24/22 8:53 AM, David Faust wrote:
>>>>>>
>>>>>>
>>>>>> On 5/24/22 04:07, Jose E. Marchesi wrote:
>>>>>>>
>>>>>>>> On 5/11/22 11:44 AM, David Faust wrote:
>>>>>>>>>
>>>>>>>>> On 5/10/22 22:05, Yonghong Song wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5/10/22 8:43 PM, Yonghong Song wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/6/22 2:18 PM, David Faust wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/5/22 16:00, Yonghong Song wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 5/4/22 10:03 AM, David Faust wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 5/3/22 15:32, Joseph Myers wrote:
>>>>>>>>>>>>>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Consider the following example:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>>>>>>>>>>>>> #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>>>>>>>>>>>>> #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The expected behavior is that 'g' is "a pointer with tags
>>>>>>>>>>>>>>>> 'tag2' and 'tag3', to a pointer with tag 'tag1' to an int". i.e.:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's not a correct expectation for either GNU __attribute__ or
>>>>>>>>>>>>>>> C2x [[]] attribute syntax. In either syntax, __typetag2 __typetag3
>>>>>>>>>>>>>>> should apply to the type to which g points, not to g or its type,
>>>>>>>>>>>>>>> just as if you had a type qualifier there. You'd need to put the
>>>>>>>>>>>>>>> attributes (or qualifier) after the *, not before, to make them
>>>>>>>>>>>>>>> apply to the pointer type. See "Attribute Syntax" in the GCC manual
>>>>>>>>>>>>>>> for how the syntax is defined for GNU attributes and deduce in turn,
>>>>>>>>>>>>>>> for each subsequence of the tokens matching the syntax for some kind
>>>>>>>>>>>>>>> of declarator, what the type for "T D1" would be as defined there
>>>>>>>>>>>>>>> and in the C standard, as deduced from the type for "T D" for a
>>>>>>>>>>>>>>> sub-declarator D.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But GCC's attribute parsing produces a variable 'g' which is "a
>>>>>>>>>>>>>>>> pointer with tag 'tag1' to a pointer with tags 'tag2' and 'tag3'
>>>>>>>>>>>>>>>> to an int", i.e.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In GNU syntax, __typetag1 applies to the declaration, whereas in
>>>>>>>>>>>>>>> C2x syntax it applies to int. Again, if you wanted it to apply to
>>>>>>>>>>>>>>> the pointer type it would need to go after the * not before.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you are concerned with the fine details of what construct an
>>>>>>>>>>>>>>> attribute
>>>>>>>>>>>>>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Joseph, thank you! This is very helpful. My understanding of
>>>>>>>>>>>>>> the syntax
>>>>>>>>>>>>>> was not correct.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Actually, I made a bad mistake in paraphrasing this example
>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>> discussion of it in the series cover letter. But, the reason
>>>>>>>>>>>>>> why it is
>>>>>>>>>>>>>> incorrect is the same.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yonghong, is the specific ordering an expectation in BPF
>>>>>>>>>>>>>> programs or
>>>>>>>>>>>>>> other users of the tags?
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is probably a wording issue. We have been saying that tags only
>>>>>>>>>>>>> apply to the pointer. We should probably say they only apply to the
>>>>>>>>>>>>> pointee.
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ cat t.c
>>>>>>>>>>>>> int const *ptr;
>>>>>>>>>>>>>
>>>>>>>>>>>>> the llvm ir debuginfo:
>>>>>>>>>>>>>
>>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size:
>>>>>>>>>>>>> 64)
>>>>>>>>>>>>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>>>>>>>>>>>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>>
>>>>>>>>>>>>> We could replace 'const' with a tag like below:
>>>>>>>>>>>>>
>>>>>>>>>>>>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>>>>>>>>>>>>
>>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size:
>>>>>>>>>>>>> 64,
>>>>>>>>>>>>> annotations: !7)
>>>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>> !7 = !{!8}
>>>>>>>>>>>>> !8 = !{!"btf_type_tag", !"tag"}
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the above IR, we attach annotations to pointer_type because
>>>>>>>>>>>>> we didn't invent a new DI type to encode btf_type_tag. But it is
>>>>>>>>>>>>> totally okay to have IR that looks like
>>>>>>>>>>>>>
>>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11,
>>>>>>>>>>>>> size: 64)
>>>>>>>>>>>>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>>>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>>
>>>>>>>>>>>> OK, thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> There is still the question of why the DWARF generated for this
>>>>>>>>>>>> case
>>>>>>>>>>>> that I have been concerned about:
>>>>>>>>>>>>
>>>>>>>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>>
>>>>>>>>>>>> differs between GCC (with this series) and clang. After studying
>>>>>>>>>>>> it, I believe GCC is handling the attributes exactly as described
>>>>>>>>>>>> in the Attribute Syntax portion of the GCC manual, where the GNU
>>>>>>>>>>>> syntax is defined. I do not think there is any problem here.
>>>>>>>>>>>>
>>>>>>>>>>>> So the difference in DWARF suggests to me that clang is not
>>>>>>>>>>>> handling
>>>>>>>>>>>> the GNU attribute syntax in this particular case correctly, since
>>>>>>>>>>>> it
>>>>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type
>>>>>>>>>>>> rather
>>>>>>>>>>>> than the type to which it points.
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure whether for the use purposes of the tags this
>>>>>>>>>>>> difference
>>>>>>>>>>>> is very important, but it is worth noting.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> As Joseph suggested, it may be better to encourage users of these
>>>>>>>>>>>> tags
>>>>>>>>>>>> to use the C2x attribute syntax if they are concerned with
>>>>>>>>>>>> precisely
>>>>>>>>>>>> which construct the tag applies.
>>>>>>>>>>>>
>>>>>>>>>>>> This would also be a way around any issues in handling the
>>>>>>>>>>>> attributes
>>>>>>>>>>>> due to the GNU syntax.
>>>>>>>>>>>>
>>>>>>>>>>>> I tried a few test cases using C2x syntax BTF type tags with a
>>>>>>>>>>>> clang-15 build, but ran into some issues (in particular, some of
>>>>>>>>>>>> the
>>>>>>>>>>>> tag attributes being ignored altogether). I couldn't find
>>>>>>>>>>>> confirmation
>>>>>>>>>>>> whether C2x attribute syntax is fully supported in clang yet, so
>>>>>>>>>>>> maybe
>>>>>>>>>>>> this isn't expected to work. Do you know whether the C2x syntax is
>>>>>>>>>>>> fully supported in clang yet?
>>>>>>>>>>>
>>>>>>>>>>> Actually, I don't know either. But the btf decl_tag and type_tag
>>>>>>>>>>> are also used to compile the Linux kernel, and the minimum compiler
>>>>>>>>>>> versions to compile the kernel are gcc5.1 and clang11. I am not sure
>>>>>>>>>>> whether gcc5.1 supports c2x or not; I guess probably not. So I think
>>>>>>>>>>> we most likely cannot use c2x syntax.
>>>>>>>>>>
>>>>>>>>>> Okay, I think we can guard the btf_tags with newer compiler versions.
>>>>>>>>>> What kind of c2x syntax do you intend to use? I can help compile the
>>>>>>>>>> kernel with that syntax and llvm15 to see what the issue is, and may
>>>>>>>>>> be able to help fix it in clang if possible.
>>>>>>>>>
>>>>>>>>> I am thinking to use the [[]] C2x standard attribute syntax. The
>>>>>>>>> syntax makes it quite clear to which entity each attribute applies,
>>>>>>>>> and in my opinion is a little more intuitive/less surprising too.
>>>>>>>>> It's documented here (PDF):
>>>>>>>>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2731.pdf
>>>>>>>>> See sections 6.7.11 for the syntax and 6.7.6 for
>>>>>>>>> declarations. Section 6.7.6.1 specifically describes using the
>>>>>>>>> attribute syntax with pointer declarators.
>>>>>>>>> The attribute syntax itself for BTF tags is:
>>>>>>>>> [[clang::btf_type_tag("tag1")]]
>>>>>>>>> or
>>>>>>>>> [[gnu::btf_type_tag("tag1")]]
>>>>>>>>>
>>>>>>>>> I am also looking into whether, with the C2x syntax, we really need
>>>>>>>>> two
>>>>>>>>> separate attributes (type_tag and decl_tag) at the language
>>>>>>>>> level. It might be possible with C2x syntax to use just one language
>>>>>>>>> attribute (e.g. just btf_tag).
>>>>>>>>>
>>>>>>>>> A simple declaration for a tagged pointer to an int:
>>>>>>>>> int * [[gnu::btf_type_tag("tag1")]] x;
>>>>>>>>> And for the example from this thread:
>>>>>>>>> #define __typetag1 [[gnu::btf_type_tag("type-tag-1")]]
>>>>>>>>> #define __typetag2 [[gnu::btf_type_tag("type-tag-2")]]
>>>>>>>>> #define __typetag3 [[gnu::btf_type_tag("type-tag-3")]]
>>>>>>>>> int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>>> Here each tag applies to the preceding pointer, so the result is
>>>>>>>>> unsurprising.
>>>>>>>>> Actually, this is where I found something that looks like an issue
>>>>>>>>> with the C2x attribute syntax in clang. The tags 2 and 3 go missing,
>>>>>>>>> but with no warning nor other indication.
>>>>>>>>> Compiling this example with gcc:
>>>>>>>>> $ ~/toolchains/bpf/bin/bpf-unknown-none-gcc -c -gbtf -gdwarf c2x.c
>>>>>>>>> -o c2x.o --std=c2x
>>>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o
>>>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>>> DW_AT_producer ("GNU C2X 12.0.1 20220401
>>>>>>>>> (experimental) -gbtf -gdwarf -std=c2x")
>>>>>>>>> DW_AT_language (DW_LANG_C11)
>>>>>>>>> DW_AT_name ("c2x.c")
>>>>>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>>>>>> DW_AT_stmt_list (0x00000000)
>>>>>>>>> 0x0000001e: DW_TAG_variable
>>>>>>>>> DW_AT_name ("g")
>>>>>>>>> DW_AT_decl_file
>>>>>>>>> ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>>> DW_AT_decl_line (16)
>>>>>>>>> DW_AT_decl_column (0x2a)
>>>>>>>>> DW_AT_type (0x00000032 "int **")
>>>>>>>>> DW_AT_external (true)
>>>>>>>>> DW_AT_location (DW_OP_addr 0x0)
>>>>>>>>> 0x00000032: DW_TAG_pointer_type
>>>>>>>>> DW_AT_byte_size (8)
>>>>>>>>> DW_AT_type (0x0000004e "int *")
>>>>>>>>> DW_AT_sibling (0x0000004e)
>>>>>>>>> 0x0000003b: DW_TAG_LLVM_annotation
>>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>>> DW_AT_const_value ("type-tag-3")
>>>>>>>>> 0x00000044: DW_TAG_LLVM_annotation
>>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>>> DW_AT_const_value ("type-tag-2")
>>>>>>>>> 0x0000004d: NULL
>>>>>>>>> 0x0000004e: DW_TAG_pointer_type
>>>>>>>>> DW_AT_byte_size (8)
>>>>>>>>> DW_AT_type (0x00000061 "int")
>>>>>>>>> DW_AT_sibling (0x00000061)
>>>>>>>>> 0x00000057: DW_TAG_LLVM_annotation
>>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>>> DW_AT_const_value ("type-tag-1")
>>>>>>>>> 0x00000060: NULL
>>>>>>>>> 0x00000061: DW_TAG_base_type
>>>>>>>>> DW_AT_byte_size (0x04)
>>>>>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>>>>>> DW_AT_name ("int")
>>>>>>>>> 0x00000068: NULL
>>>>>>>>>
>>>>>>>>> and with clang (changing the attribute prefix to clang::
>>>>>>>>> appropriately):
>>>>>>>>> $ ~/toolchains/llvm/bin/clang -target bpf -g -c c2x.c -o c2x.o.ll
>>>>>>>>> --std=c2x
>>>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o.ll
>>>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>>> DW_AT_producer ("clang version 15.0.0
>>>>>>>>> (https://github.com/llvm/llvm-project.git
>>>>>>>>> f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>>>>>> DW_AT_language (DW_LANG_C99)
>>>>>>>>> DW_AT_name ("c2x.c")
>>>>>>>>> DW_AT_str_offsets_base (0x00000008)
>>>>>>>>> DW_AT_stmt_list (0x00000000)
>>>>>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>>>>>> DW_AT_addr_base (0x00000008)
>>>>>>>>> 0x0000001e: DW_TAG_variable
>>>>>>>>> DW_AT_name ("g")
>>>>>>>>> DW_AT_type (0x00000029 "int **")
>>>>>>>>> DW_AT_external (true)
>>>>>>>>> DW_AT_decl_file
>>>>>>>>> ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>>> DW_AT_decl_line (12)
>>>>>>>>> DW_AT_location (DW_OP_addrx 0x0)
>>>>>>>>> 0x00000029: DW_TAG_pointer_type
>>>>>>>>> DW_AT_type (0x00000032 "int *")
>>>>>>>>> 0x0000002e: DW_TAG_LLVM_annotation
>>>>>>>>> DW_AT_name ("btf_type_tag")
>>>>>>>>> DW_AT_const_value ("type-tag-1")
>>>>>>>>> 0x00000031: NULL
>>>>>>>>> 0x00000032: DW_TAG_pointer_type
>>>>>>>>> DW_AT_type (0x00000037 "int")
>>>>>>>>> 0x00000037: DW_TAG_base_type
>>>>>>>>> DW_AT_name ("int")
>>>>>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>>>>>> DW_AT_byte_size (0x04)
>>>>>>>>> 0x0000003b: NULL
>>>>>>>>
>>>>>>>> Thanks. I checked with current clang. The generated code looks
>>>>>>>> like above. Basically, for code like below
>>>>>>>>
>>>>>>>> #define __typetag1 [[clang::btf_type_tag("type-tag-1")]]
>>>>>>>> #define __typetag2 [[clang::btf_type_tag("type-tag-2")]]
>>>>>>>> #define __typetag3 [[clang::btf_type_tag("type-tag-3")]]
>>>>>>>>
>>>>>>>> int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>>
>>>>>>>> The IR type looks like
>>>>>>>> __typetag3 -> __typetag2 -> * (ptr1) -> __typetag1 -> * (ptr2)
>>>>>>>> -> int
>>>>>>>>
>>>>>>>> The IR is similar to what we did if using
>>>>>>>> __attribute__((btf_type_tag(""))), but their
>>>>>>>> semantic interpretation is quite different.
>>>>>>>> For example, with the c2x format,
>>>>>>>> __typetag1 applies to ptr2;
>>>>>>>> with the __attribute__ format, it applies to the pointee of ptr1.
>>>>>>>>
>>>>>>>> But more importantly, the c2x format is incompatible with
>>>>>>>> existing usage in the Linux kernel. The following are a bunch of kernel
>>>>>>>> __user usages. Here, __user is intended to be replaced with a btf_type_tag.
>>>>>>>>
>>>>>>>> vfio_pci_core.h: ssize_t (*rw)(struct vfio_pci_core_device
>>>>>>>> *vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h: char __user *buf,
>>>>>>>> size_t count,
>>>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_bar_rw(struct
>>>>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_vga_rw(struct
>>>>>>>> vfio_pci_core_device *vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h: char __user
>>>>>>>> *buf, size_t count,
>>>>>>>> vfio_pci_core.h: void __user *arg,
>>>>>>>> size_t argsz);
>>>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_read(struct vfio_device
>>>>>>>> *core_vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_write(struct vfio_device
>>>>>>>> *core_vdev, const char __user *buf,
>>>>>>>> vringh.h: vring_desc_t __user *desc,
>>>>>>>> vringh.h: vring_avail_t __user *avail,
>>>>>>>> vringh.h: vring_used_t __user *used);
>>>>>>>> vt_kern.h:int con_set_cmap(unsigned char __user *cmap);
>>>>>>>> vt_kern.h:int con_get_cmap(unsigned char __user *cmap);
>>>>>>>> vt_kern.h:int con_set_trans_old(unsigned char __user * table);
>>>>>>>> vt_kern.h:int con_get_trans_old(unsigned char __user * table);
>>>>>>>> vt_kern.h:int con_set_trans_new(unsigned short __user * table);
>>>>>>>> vt_kern.h:int con_get_trans_new(unsigned short __user * table);
>>>>>>>>
>>>>>>>> As you can see, we will not be able to simply replace __user
>>>>>>>> with [[clang::btf_type_tag("user")]] because it won't work
>>>>>>>> according to c2x expectations.
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for checking this. I see that we probably cannot use the c2x
>>>>>> syntax in the kernel, since it will not work as a drop-in replacement
>>>>>> for the current uses.
>>>>>>
>>>>>>>
>>>>>>> Hi Yonghong.
>>>>>>>
>>>>>>> I am a bit confused regarding the GNU attributes problem: our patch
>>>>>>> supports it, but as David already noted:
>>>>>>>
>>>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>>>> that I have been concerned about:
>>>>>>>>>>>
>>>>>>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>
>>>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>>>
>>>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>>>> than the type to which it points.
>>>>>>>
>>>>>>> Note the example he uses is:
>>>>>>>
>>>>>>> (a) int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>
>>>>>>> Not
>>>>>>>
>>>>>>> (b) int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>
>>>>>>> Apparently for (a) clang is generating DWARF that associates __typetag2
>>>>>>> and __typetag3 to g's type (the pointer to pointer) instead of the
>>>>>>> pointer to int, which contravenes the GNU syntax rules.
>>>>>>>
>>>>>>> AFAIK that is where the DWARF we generate differs, and what is blocking
>>>>>>> us. David will correct me in the likely case I'm wrong :)
>>>>>>
>>>>>> Right. This is what I hoped maybe the C2x syntax could resolve.
>>>>>>
>>>>>> The issue I saw is that in case (a) above, when using the GNU
>>>>>> attribute syntax, GCC and clang produce different results. I think that
>>>>>> the underlying cause is some subtle difference in how clang handles
>>>>>> the GNU attribute syntax in this case compared to GCC.
>>>>>>
>>>>>>
>>>>>> To remind ourselves, here is the full example. Notice the significant
>>>>>> difference in which objects the tags are associated with in DWARF.
>>>>>>
>>>>>>
>>>>>> #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>>>>> #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>>>>> #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>>>>
>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>
>>>>>>
>>>>>> GCC: bpf-unknown-none-gcc -c -gdwarf -gbtf annotate.c
>>>>>>
>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>> DW_AT_producer ("GNU C17 12.0.1 20220401
>>>>>> (experimental) -gdwarf -gbtf")
>>>>>> DW_AT_language (DW_LANG_C11)
>>>>>> DW_AT_name ("annotate.c")
>>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>>> DW_AT_stmt_list (0x00000000)
>>>>>>
>>>>>> 0x0000001e: DW_TAG_variable
>>>>>> DW_AT_name ("g")
>>>>>> DW_AT_decl_file
>>>>>> ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>>> DW_AT_decl_line (11)
>>>>>> DW_AT_decl_column (0x2a)
>>>>>> DW_AT_type (0x00000032 "int **")
>>>>>> DW_AT_external (true)
>>>>>> DW_AT_location (DW_OP_addr 0x0)
>>>>>>
>>>>>> 0x00000032: DW_TAG_pointer_type
>>>>>> DW_AT_byte_size (8)
>>>>>> DW_AT_type (0x00000045 "int *")
>>>>>> DW_AT_sibling (0x00000045)
>>>>>>
>>>>>> 0x0000003b: DW_TAG_LLVM_annotation
>>>>>> DW_AT_name ("btf_type_tag")
>>>>>> DW_AT_const_value ("type-tag-1")
>>>>>>
>>>>>> 0x00000044: NULL
>>>>>>
>>>>>> 0x00000045: DW_TAG_pointer_type
>>>>>> DW_AT_byte_size (8)
>>>>>> DW_AT_type (0x00000061 "int")
>>>>>> DW_AT_sibling (0x00000061)
>>>>>>
>>>>>> 0x0000004e: DW_TAG_LLVM_annotation
>>>>>> DW_AT_name ("btf_type_tag")
>>>>>> DW_AT_const_value ("type-tag-3")
>>>>>>
>>>>>> 0x00000057: DW_TAG_LLVM_annotation
>>>>>> DW_AT_name ("btf_type_tag")
>>>>>> DW_AT_const_value ("type-tag-2")
>>>>>>
>>>>>> 0x00000060: NULL
>>>>>>
>>>>>> 0x00000061: DW_TAG_base_type
>>>>>> DW_AT_byte_size (0x04)
>>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>>> DW_AT_name ("int")
>>>>>>
>>>>>> 0x00000068: NULL
>>>>>
>>>>> Do you have documentation that shows why GNU generates the attributes
>>>>> this way? If dwarf generates
>>>>> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>>>> does this help?
>>>>
>>>> Okay, I think I see the problem. The internal representations between clang
>>>> and GCC attach the attributes to different nodes, and as a result they
>>>> produce different DWARF:
>>>>
>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>>>> annotations: !10)
>>>> !6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64,
>>>> annotations: !8)
>>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>> !8 = !{!9}
>>>> !9 = !{!"btf_type_tag", !"tag1"}
>>>> !10 = !{!11, !12}
>>>> !11 = !{!"btf_type_tag", !"tag2"}
>>>> !12 = !{!"btf_type_tag", !"tag3"}
>>>>
>>>> If I am reading this IR right, then the tags "tag2" and "tag3" are being
>>>> applied to the int**, and "tag1" is applied to the int*
>>>>
>>>> But I don't think this lines up with how the attribute syntax is defined.
>>>> See
>>>> https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>>>> In particular the "All other attributes" section. (It's a bit dense).
>>>> Or, as Joseph summed it up nicely earlier in the thread:
>>>>> In either syntax, __typetag2 __typetag3 should apply to
>>>>> the type to which g points, not to g or its type, just as if you had a
>>>>> type qualifier there. You'd need to put the attributes (or qualifier)
>>>>> after the *, not before, to make them apply to the pointer type.
>>>>
>>>>
>>>> Compare that to GCC's internal representation, from which DWARF is
>>>> generated:
>>>>
>>>> <var_decl 0x7ffff7535090 g
>>>> type <pointer_type 0x7ffff74f8888
>>>> type <pointer_type 0x7ffff74f8b28 type <integer_type
>>>> 0x7ffff74385e8 int>
>>>> unsigned DI
>>>> size <integer_cst 0x7ffff742b450 constant 64>
>>>> unit-size <integer_cst 0x7ffff742b468 constant 8>
>>>> align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>>> canonical-type 0x7ffff743f888
>>>> attributes <tree_list 0x7ffff75165c8
>>>> purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>>>> value <tree_list 0x7ffff7516550
>>>> value <string_cst 0x7ffff75182e0 type <array_type
>>>> 0x7ffff74f8738>
>>>> readonly constant static "type-tag-3\000">>
>>>> chain <tree_list 0x7ffff75165a0 purpose <identifier_node
>>>> 0x7ffff75290f0 btf_type_tag>
>>>> value <tree_list 0x7ffff75164d8
>>>> value <string_cst 0x7ffff75182c0 type
>>>> <array_type 0x7ffff74f8738>
>>>> readonly constant static "type-tag-2\000">>>>
>>>> pointer_to_this <pointer_type 0x7ffff74f8bd0>>
>>>> unsigned DI size <integer_cst 0x7ffff742b450 64> unit-size
>>>> <integer_cst 0x7ffff742b468 8>
>>>> align:64 warn_if_not_align:0 symtab:0 alias-set -1
>>>> canonical-type 0x7ffff74f87e0
>>>> attributes <tree_list 0x7ffff75165f0 purpose <identifier_node
>>>> 0x7ffff75290f0 btf_type_tag>
>>>> value <tree_list 0x7ffff7516438
>>>> value <string_cst 0x7ffff75182a0 type <array_type
>>>> 0x7ffff74f8738>
>>>> readonly constant static "type-tag-1\000">>>>
>>>> public static unsigned DI defer-output
>>>> /home/dfaust/playpen/btf/tags/annotate.c:10:42 size <integer_cst
>>>> 0x7ffff742b450 64> unit-size <integer_cst 0x7ffff742b468 8>
>>>> align:64 warn_if_not_align:0>
>>>>
>>>> See how tags "tag2" and "tag3" are associated with the pointer_type
>>>> 0x7ffff74f8b28,
>>>> that is, "the type to which g points"
>>>>
>>>> From GCC's DWARF the BTF we get currently looks like:
>>>> VAR(g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>>> which is obviously quite different and why this case caught my attention.
>>>>
>>>> I think this difference is the root of our problems. It might not be
>>>> specifically related to the BTF tag attributes but they do reveal some
>>>> discrepancy between how clang and GCC handle the attribute syntax.
>>>
>>> The btf_type_tag attribute is very similar to the address_space attribute.
>>> For example,
>>> $ cat t1.c
>>> int __attribute__((address_space(1))) * p;
>>> $ clang -g -S -emit-llvm t1.c
>>>
>>> In IR, we will have
>>> @p = dso_local global ptr addrspace(1) null, align 8, !dbg !0
>>> ...
>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>
>>> Replacing address_space with btf_type_tag, we will get
>>> ptr->type_tag->int in debuginfo.
>>>
>>> But it looks like gcc doesn't support address_space attribute
>>>
>>> $ gcc -g -S t1.c
>>> t1.c:1:1: warning: ‘address_space’ attribute directive ignored
>>> [-Wattributes]
>>> int __attribute__((address_space(1))) * p;
>>> ^~~
>>>
>>> Is it possible for gcc to go with address_space attribute
>>> semantics for btf_type_tag attribute?
>>
>> In cases like this the behavior is the same.
>> $ cat foo.c
>> int __attribute__((btf_type_tag("tag1"))) * p;
>> $ gcc -c -gdwarf -gbtf foo.c
>>
>> Internally:
>> <var_decl 0x7ffff743abd0 p
>> type <pointer_type 0x7ffff7590150
>> type <integer_type 0x7ffff74475e8 int public SI
>> size <integer_cst 0x7ffff742bf90 constant 32>
>> unit-size <integer_cst 0x7ffff742bfa8 constant 4>
>> align:32 warn_if_not_align:0 symtab:0 alias-set -1
>> canonical-type 0x7ffff74475e8 precision:32 min <integer_cst 0x7ffff742bf48
>> -2147483648> max <integer_cst 0x7ffff742bf60 2147483647>
>> pointer_to_this <pointer_type 0x7ffff744fa80>>
>> unsigned DI
>> size <integer_cst 0x7ffff742bd50 constant 64>
>> unit-size <integer_cst 0x7ffff742bd68 constant 8>
>> align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
>> 0x7ffff744fa80
>> attributes <tree_list 0x7ffff7564d70
>> purpose <identifier_node 0x7ffff757f2d0 btf_type_tag>
>> value <tree_list 0x7ffff7564cf8
>> value <string_cst 0x7ffff757c220 type <array_type
>> 0x7ffff75900a8>
>> readonly constant static "tag1\000">>>>
>> public static unsigned DI defer-output
>> /home/dfaust/playpen/btf/tags/foo.c:1:45 size <integer_cst 0x7ffff742bd50
>> 64> unit-size <integer_cst 0x7ffff742bd68 8>
>> align:64 warn_if_not_align:0>
>>
>> And the resulting BTF:
>>
>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>> [2] PTR '(anon)' type_id=3
>> [3] TYPE_TAG 'tag1' type_id=1
>> [4] VAR 'p' type_id=2, linkage=global
>> [5] DATASEC '.bss' size=0 vlen=1
>> type_id=4 offset=0 size=8 (VAR 'p')
>>
>> var(p) -> ptr -> type_tag -> int
>
> It would be good if we could generate a similar encoding in DWARF.
> Currently in clang, we generate
> var(p) -> ptr (type_tag) -> int
> but I am open to generating
> var(p) -> ptr -> type_tag -> int
> in DWARF as well if that is possible.
>
The DWARF encodings are the same between GCC and LLVM: in the dumps
quoted in this thread, both emit each tag as a DW_TAG_LLVM_annotation
child of a DW_TAG_pointer_type. In the case we've looked at in this
thread where the DWARF is not the same, the difference is a result of
clang's attribute parsing not following the GNU attribute syntax
correctly and associating the attribute with the wrong part of the
declaration. That is not a problem with DWARF.
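
To make the placement rule concrete without depending on btf_type_tag
support, here is a small sketch that leans on Joseph's remark that the
attributes bind "just as if you had a type qualifier there". It puts
'volatile' in the position of __typetag2/__typetag3 (and 'const' in the
position of __typetag1) and asks the compiler, via C11 _Generic, what
type 'g' ends up with. The macro and function names are made up for
illustration, and this is only an analogy with qualifiers: it does not
exercise the tag attributes themselves.

```c
#include <assert.h>   /* static_assert (C11) */

/*
 * Qualifier analogue of
 *     int __typetag1 * __typetag2 __typetag3 * g;
 * 'const' sits where __typetag1 was, 'volatile' where
 * __typetag2/__typetag3 were. Illustrative only.
 */
int const * volatile * g;

/* 1 if g has type "pointer to volatile pointer to const int",
 * i.e. the qualifier landed on the type g points to. */
#define G_POINTEE_IS_VOLATILE \
    _Generic(g, const int * volatile *: 1, default: 0)

/* 1 if g has type "pointer to (unqualified) pointer to const int",
 * i.e. the qualifier did not land on the type g points to. */
#define G_POINTEE_IS_UNQUALIFIED \
    _Generic(g, const int **: 1, default: 0)

/* Checked at compile time. */
static_assert(G_POINTEE_IS_VOLATILE == 1,
              "qualifier before the second '*' applies to g's pointee");
static_assert(G_POINTEE_IS_UNQUALIFIED == 0,
              "g's pointee type is not left unqualified");

/* Hypothetical helpers so the result can also be inspected at run time. */
int pointee_is_volatile(void)    { return G_POINTEE_IS_VOLATILE; }
int pointee_is_unqualified(void) { return G_POINTEE_IS_UNQUALIFIED; }
```

The qualifier written between the two '*' tokens ends up on the inner
pointer type, that is, on the type to which g points, rather than on
g's own type. That is the same association GCC makes for __typetag2
and __typetag3 in case (a).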
>>
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> clang: clang -target bpf -c -g annotate.c
>>>>>>
>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>> DW_AT_producer ("clang version 15.0.0
>>>>>> (https://github.com/llvm/llvm-project.git
>>>>>> f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>>> DW_AT_language (DW_LANG_C99)
>>>>>> DW_AT_name ("annotate.c")
>>>>>> DW_AT_str_offsets_base (0x00000008)
>>>>>> DW_AT_stmt_list (0x00000000)
>>>>>> DW_AT_comp_dir ("/home/dfaust/playpen/btf/tags")
>>>>>> DW_AT_addr_base (0x00000008)
>>>>>>
>>>>>> 0x0000001e: DW_TAG_variable
>>>>>> DW_AT_name ("g")
>>>>>> DW_AT_type (0x00000029 "int **")
>>>>>> DW_AT_external (true)
>>>>>> DW_AT_decl_file
>>>>>> ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>>> DW_AT_decl_line (11)
>>>>>> DW_AT_location (DW_OP_addrx 0x0)
>>>>>>
>>>>>> 0x00000029: DW_TAG_pointer_type
>>>>>> DW_AT_type (0x00000035 "int *")
>>>>>>
>>>>>> 0x0000002e: DW_TAG_LLVM_annotation
>>>>>> DW_AT_name ("btf_type_tag")
>>>>>> DW_AT_const_value ("type-tag-2")
>>>>>>
>>>>>> 0x00000031: DW_TAG_LLVM_annotation
>>>>>> DW_AT_name ("btf_type_tag")
>>>>>> DW_AT_const_value ("type-tag-3")
>>>>>>
>>>>>> 0x00000034: NULL
>>>>>>
>>>>>> 0x00000035: DW_TAG_pointer_type
>>>>>> DW_AT_type (0x0000003e "int")
>>>>>>
>>>>>> 0x0000003a: DW_TAG_LLVM_annotation
>>>>>> DW_AT_name ("btf_type_tag")
>>>>>> DW_AT_const_value ("type-tag-1")
>>>>>>
>>>>>> 0x0000003d: NULL
>>>>>>
>>>>>> 0x0000003e: DW_TAG_base_type
>>>>>> DW_AT_name ("int")
>>>>>> DW_AT_encoding (DW_ATE_signed)
>>>>>> DW_AT_byte_size (0x04)
>>>>>>
>>>>>> 0x00000042: NULL
>>>>>>
>>>>>>