On 6/2/22 19:04, Yonghong Song wrote:
> 
> 
> On 5/27/22 12:56 PM, David Faust wrote:
>>
>>
>> On 5/26/22 00:29, Yonghong Song wrote:
>>>
>>>
>>> On 5/24/22 10:04 AM, David Faust wrote:
>>>>
>>>>
>>>> On 5/24/22 09:03, Yonghong Song wrote:
>>>>>
>>>>>
>>>>> On 5/24/22 8:53 AM, David Faust wrote:
>>>>>>
>>>>>>
>>>>>> On 5/24/22 04:07, Jose E. Marchesi wrote:
>>>>>>>
>>>>>>>> On 5/11/22 11:44 AM, David Faust wrote:
>>>>>>>>>
>>>>>>>>> On 5/10/22 22:05, Yonghong Song wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 5/10/22 8:43 PM, Yonghong Song wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/6/22 2:18 PM, David Faust wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/5/22 16:00, Yonghong Song wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 5/4/22 10:03 AM, David Faust wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 5/3/22 15:32, Joseph Myers wrote:
>>>>>>>>>>>>>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Consider the following example:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>          #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>>>>>>>>>>>>>>>          #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>>>>>>>>>>>>>>>          #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>          int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The expected behavior is that 'g' is "a pointer with tags 'tag2' and
>>>>>>>>>>>>>>>> 'tag3', to a pointer with tag 'tag1' to an int". i.e.:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's not a correct expectation for either GNU __attribute__ or C2x
>>>>>>>>>>>>>>> [[]] attribute syntax.  In either syntax, __typetag2 __typetag3 should
>>>>>>>>>>>>>>> apply to the type to which g points, not to g or its type, just as if
>>>>>>>>>>>>>>> you had a type qualifier there.  You'd need to put the attributes (or
>>>>>>>>>>>>>>> qualifier) after the *, not before, to make them apply to the pointer
>>>>>>>>>>>>>>> type.  See "Attribute Syntax" in the GCC manual for how the syntax is
>>>>>>>>>>>>>>> defined for GNU attributes and deduce in turn, for each subsequence of
>>>>>>>>>>>>>>> the tokens matching the syntax for some kind of declarator, what the
>>>>>>>>>>>>>>> type for "T D1" would be as defined there and in the C standard, as
>>>>>>>>>>>>>>> deduced from the type for "T D" for a sub-declarator D.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But GCC's attribute parsing produces a variable 'g' which is "a
>>>>>>>>>>>>>>>> pointer with tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to
>>>>>>>>>>>>>>>> an int", i.e.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>>>>>>>>>>>>>> syntax it applies to int.  Again, if you wanted it to apply to the
>>>>>>>>>>>>>>> pointer type it would need to go after the * not before.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you are concerned with the fine details of what construct an
>>>>>>>>>>>>>>> attribute appertains to, I recommend using C2x syntax not GNU syntax.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Joseph, thank you! This is very helpful. My understanding of the
>>>>>>>>>>>>>> syntax was not correct.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Actually, I made a bad mistake in paraphrasing this example from the
>>>>>>>>>>>>>> discussion of it in the series cover letter. But, the reason why it is
>>>>>>>>>>>>>> incorrect is the same.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yonghong, is the specific ordering an expectation in BPF programs or
>>>>>>>>>>>>>> other users of the tags?
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is probably a wording issue. We have been saying tags only
>>>>>>>>>>>>> apply to pointers. We probably should say they only apply to the pointee.
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ cat t.c
>>>>>>>>>>>>> int const *ptr;
>>>>>>>>>>>>>
>>>>>>>>>>>>> the llvm ir debuginfo:
>>>>>>>>>>>>>
>>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>>>>>>>>>>>>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>>>>>>>>>>>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>>
>>>>>>>>>>>>> We could replace 'const' with a tag like below:
>>>>>>>>>>>>>
>>>>>>>>>>>>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>>>>>>>>>>>>
>>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64, annotations: !7)
>>>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>> !7 = !{!8}
>>>>>>>>>>>>> !8 = !{!"btf_type_tag", !"tag"}
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the above IR, we attach the annotations to the pointer_type because
>>>>>>>>>>>>> we didn't invent a new DI type to encode btf_type_tag. But it would be
>>>>>>>>>>>>> totally okay to have IR that looks like
>>>>>>>>>>>>>
>>>>>>>>>>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
>>>>>>>>>>>>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>>>>>>>>>>>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>>>>>>>>>>>
>>>>>>>>>>>> OK, thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>>>>> that I have been concerned about:
>>>>>>>>>>>>
>>>>>>>>>>>>        int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>>
>>>>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>>>>> GCC is handling the attributes exactly as described in the
>>>>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>>>>
>>>>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>>>>> than the type to which it points.
>>>>>>>>>>>>
>>>>>>>>>>>> I am not sure whether this difference matters much for how the tags
>>>>>>>>>>>> are used, but it is worth noting.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> As Joseph suggested, it may be better to encourage users of these tags
>>>>>>>>>>>> to use the C2x attribute syntax if they are concerned with precisely
>>>>>>>>>>>> which construct the tag applies to.
>>>>>>>>>>>>
>>>>>>>>>>>> This would also be a way around any issues in handling the attributes
>>>>>>>>>>>> due to the GNU syntax.
>>>>>>>>>>>>
>>>>>>>>>>>> I tried a few test cases using C2x syntax BTF type tags with a
>>>>>>>>>>>> clang-15 build, but ran into some issues (in particular, some of the
>>>>>>>>>>>> tag attributes being ignored altogether). I couldn't find confirmation
>>>>>>>>>>>> whether C2x attribute syntax is fully supported in clang yet, so maybe
>>>>>>>>>>>> this isn't expected to work. Do you know whether the C2x syntax is
>>>>>>>>>>>> fully supported in clang yet?
>>>>>>>>>>>
>>>>>>>>>>> Actually, I don't know either. But the btf decl_tag and type_tag
>>>>>>>>>>> are also used to compile the linux kernel, and the minimum compiler
>>>>>>>>>>> versions to compile the kernel are gcc5.1 and clang11. I am not sure
>>>>>>>>>>> whether gcc5.1 supports c2x or not, I guess probably not. So I think we
>>>>>>>>>>> most likely cannot use c2x syntax.
>>>>>>>>>>
>>>>>>>>>> Okay, I think we can guard btf_tag's with newer compiler versions.
>>>>>>>>>> What kind of c2x syntax do you intend to use? I can help compile the
>>>>>>>>>> kernel with that syntax and llvm15 to see what the issue is and may
>>>>>>>>>> help fix it in clang if possible.
>>>>>>>>>
>>>>>>>>> I am thinking to use the [[]] C2x standard attribute syntax. The
>>>>>>>>> syntax makes it quite clear to which entity each attribute applies,
>>>>>>>>> and in my opinion is a little more intuitive/less surprising too.
>>>>>>>>> It's documented here (PDF):
>>>>>>>>>       https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2731.pdf
>>>>>>>>> See sections 6.7.11 for the syntax and 6.7.6 for
>>>>>>>>> declarations. Section 6.7.6.1 specifically describes using the
>>>>>>>>> attribute syntax with pointer declarators.
>>>>>>>>> The attribute syntax itself for BTF tags is:
>>>>>>>>>       [[clang::btf_type_tag("tag1")]]
>>>>>>>>> or
>>>>>>>>>       [[gnu::btf_type_tag("tag1")]]
>>>>>>>>>
>>>>>>>>> I am also looking into whether, with the C2x syntax, we really need two
>>>>>>>>> separate attributes (type_tag and decl_tag) at the language
>>>>>>>>> level. It might be possible with C2x syntax to use just one language
>>>>>>>>> attribute (e.g. just btf_tag).
>>>>>>>>>
>>>>>>>>> A simple declaration for a tagged pointer to an int:
>>>>>>>>>       int * [[gnu::btf_type_tag("tag1")]] x;
>>>>>>>>> And for the example from this thread:
>>>>>>>>>       #define __typetag1 [[gnu::btf_type_tag("type-tag-1")]]
>>>>>>>>>       #define __typetag2 [[gnu::btf_type_tag("type-tag-2")]]
>>>>>>>>>       #define __typetag3 [[gnu::btf_type_tag("type-tag-3")]]
>>>>>>>>>       int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>>> Here each tag applies to the preceding pointer, so the result is
>>>>>>>>> unsurprising.
>>>>>>>>> Actually, this is where I found something that looks like an issue
>>>>>>>>> with the C2x attribute syntax in clang. The tags 2 and 3 go missing,
>>>>>>>>> but with no warning nor other indication.
>>>>>>>>> Compiling this example with gcc:
>>>>>>>>> $ ~/toolchains/bpf/bin/bpf-unknown-none-gcc -c -gbtf -gdwarf c2x.c -o c2x.o --std=c2x
>>>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o
>>>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>>>                   DW_AT_producer    ("GNU C2X 12.0.1 20220401 (experimental) -gbtf -gdwarf -std=c2x")
>>>>>>>>>                   DW_AT_language    (DW_LANG_C11)
>>>>>>>>>                   DW_AT_name    ("c2x.c")
>>>>>>>>>                   DW_AT_comp_dir    ("/home/dfaust/playpen/btf/tags")
>>>>>>>>>                   DW_AT_stmt_list    (0x00000000)
>>>>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>>>>                     DW_AT_name    ("g")
>>>>>>>>>                     DW_AT_decl_file    ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>>>                     DW_AT_decl_line    (16)
>>>>>>>>>                     DW_AT_decl_column    (0x2a)
>>>>>>>>>                     DW_AT_type    (0x00000032 "int **")
>>>>>>>>>                     DW_AT_external    (true)
>>>>>>>>>                     DW_AT_location    (DW_OP_addr 0x0)
>>>>>>>>> 0x00000032:   DW_TAG_pointer_type
>>>>>>>>>                     DW_AT_byte_size    (8)
>>>>>>>>>                     DW_AT_type    (0x0000004e "int *")
>>>>>>>>>                     DW_AT_sibling    (0x0000004e)
>>>>>>>>> 0x0000003b:     DW_TAG_LLVM_annotation
>>>>>>>>>                       DW_AT_name    ("btf_type_tag")
>>>>>>>>>                       DW_AT_const_value    ("type-tag-3")
>>>>>>>>> 0x00000044:     DW_TAG_LLVM_annotation
>>>>>>>>>                       DW_AT_name    ("btf_type_tag")
>>>>>>>>>                       DW_AT_const_value    ("type-tag-2")
>>>>>>>>> 0x0000004d:     NULL
>>>>>>>>> 0x0000004e:   DW_TAG_pointer_type
>>>>>>>>>                     DW_AT_byte_size    (8)
>>>>>>>>>                     DW_AT_type    (0x00000061 "int")
>>>>>>>>>                     DW_AT_sibling    (0x00000061)
>>>>>>>>> 0x00000057:     DW_TAG_LLVM_annotation
>>>>>>>>>                       DW_AT_name    ("btf_type_tag")
>>>>>>>>>                       DW_AT_const_value    ("type-tag-1")
>>>>>>>>> 0x00000060:     NULL
>>>>>>>>> 0x00000061:   DW_TAG_base_type
>>>>>>>>>                     DW_AT_byte_size    (0x04)
>>>>>>>>>                     DW_AT_encoding    (DW_ATE_signed)
>>>>>>>>>                     DW_AT_name    ("int")
>>>>>>>>> 0x00000068:   NULL
>>>>>>>>>
>>>>>>>>> and with clang (changing the attribute prefix to clang:: appropriately):
>>>>>>>>> $ ~/toolchains/llvm/bin/clang -target bpf -g -c c2x.c -o c2x.o.ll --std=c2x
>>>>>>>>> $ ~/toolchains/llvm/bin/llvm-dwarfdump c2x.o.ll
>>>>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>>>>                   DW_AT_producer    ("clang version 15.0.0 (https://github.com/llvm/llvm-project.git f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>>>>>>                   DW_AT_language    (DW_LANG_C99)
>>>>>>>>>                   DW_AT_name    ("c2x.c")
>>>>>>>>>                   DW_AT_str_offsets_base    (0x00000008)
>>>>>>>>>                   DW_AT_stmt_list    (0x00000000)
>>>>>>>>>                   DW_AT_comp_dir    ("/home/dfaust/playpen/btf/tags")
>>>>>>>>>                   DW_AT_addr_base    (0x00000008)
>>>>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>>>>                     DW_AT_name    ("g")
>>>>>>>>>                     DW_AT_type    (0x00000029 "int **")
>>>>>>>>>                     DW_AT_external    (true)
>>>>>>>>>                     DW_AT_decl_file    ("/home/dfaust/playpen/btf/tags/c2x.c")
>>>>>>>>>                     DW_AT_decl_line    (12)
>>>>>>>>>                     DW_AT_location    (DW_OP_addrx 0x0)
>>>>>>>>> 0x00000029:   DW_TAG_pointer_type
>>>>>>>>>                     DW_AT_type    (0x00000032 "int *")
>>>>>>>>> 0x0000002e:     DW_TAG_LLVM_annotation
>>>>>>>>>                       DW_AT_name    ("btf_type_tag")
>>>>>>>>>                       DW_AT_const_value    ("type-tag-1")
>>>>>>>>> 0x00000031:     NULL
>>>>>>>>> 0x00000032:   DW_TAG_pointer_type
>>>>>>>>>                     DW_AT_type    (0x00000037 "int")
>>>>>>>>> 0x00000037:   DW_TAG_base_type
>>>>>>>>>                     DW_AT_name    ("int")
>>>>>>>>>                     DW_AT_encoding    (DW_ATE_signed)
>>>>>>>>>                     DW_AT_byte_size    (0x04)
>>>>>>>>> 0x0000003b:   NULL
>>>>>>>>
>>>>>>>> Thanks. I checked with current clang. The generated code looks
>>>>>>>> like above. Basically, for code like below
>>>>>>>>
>>>>>>>>       #define __typetag1 [[clang::btf_type_tag("type-tag-1")]]
>>>>>>>>       #define __typetag2 [[clang::btf_type_tag("type-tag-2")]]
>>>>>>>>       #define __typetag3 [[clang::btf_type_tag("type-tag-3")]]
>>>>>>>>
>>>>>>>>       int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>>
>>>>>>>> The IR type looks like
>>>>>>>>       __typetag3 -> __typetag2 -> * (ptr1) -> __typetag1 -> * (ptr2) -> int
>>>>>>>>
>>>>>>>> The IR is similar to what we did if using
>>>>>>>> __attribute__((btf_type_tag(""))), but their
>>>>>>>> semantic interpretation is quite different.
>>>>>>>> For example, with c2x format,
>>>>>>>>       __typetag1 applies to ptr2
>>>>>>>> with __attribute__ format, it applies to the pointee of ptr1.
>>>>>>>>
>>>>>>>> But more importantly, c2x format is incompatible with
>>>>>>>> its usage in the linux kernel. The following are a bunch of kernel
>>>>>>>> __user usages. Here, __user is intended to be replaced with a btf_type_tag.
>>>>>>>>
>>>>>>>> vfio_pci_core.h:        ssize_t (*rw)(struct vfio_pci_core_device *vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h:                                  char __user *buf, size_t count,
>>>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h:extern ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h:                                      char __user *buf, size_t count,
>>>>>>>> vfio_pci_core.h:                                void __user *arg, size_t argsz);
>>>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
>>>>>>>> vfio_pci_core.h:ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,
>>>>>>>> vringh.h:                    vring_desc_t __user *desc,
>>>>>>>> vringh.h:                    vring_avail_t __user *avail,
>>>>>>>> vringh.h:                    vring_used_t __user *used);
>>>>>>>> vt_kern.h:int con_set_cmap(unsigned char __user *cmap);
>>>>>>>> vt_kern.h:int con_get_cmap(unsigned char __user *cmap);
>>>>>>>> vt_kern.h:int con_set_trans_old(unsigned char __user * table);
>>>>>>>> vt_kern.h:int con_get_trans_old(unsigned char __user * table);
>>>>>>>> vt_kern.h:int con_set_trans_new(unsigned short __user * table);
>>>>>>>> vt_kern.h:int con_get_trans_new(unsigned short __user * table);
>>>>>>>>
>>>>>>>> You can see, we will not be able to simply replace __user
>>>>>>>> with [[clang::btf_type_tag("user")]] because it won't work
>>>>>>>> according to c2x expectations.
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Thanks for checking this. I see that we probably cannot use the c2x
>>>>>> syntax in the kernel, since it will not work as a drop-in replacement
>>>>>> for the current uses.
>>>>>>
>>>>>>>
>>>>>>> Hi Yonghong.
>>>>>>>
>>>>>>> I am a bit confused regarding the GNU attributes problem: our patch
>>>>>>> supports it, but as David already noted:
>>>>>>>
>>>>>>>>>>> There is still the question of why the DWARF generated for this case
>>>>>>>>>>> that I have been concerned about:
>>>>>>>>>>>
>>>>>>>>>>>       int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>>>>>
>>>>>>>>>>> differs between GCC (with this series) and clang. After studying it,
>>>>>>>>>>> GCC is doing with the attributes exactly as is described in the
>>>>>>>>>>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>>>>>>>>>>> described. I do not think there is any problem here.
>>>>>>>>>>>
>>>>>>>>>>> So the difference in DWARF suggests to me that clang is not handling
>>>>>>>>>>> the GNU attribute syntax in this particular case correctly, since it
>>>>>>>>>>> seems to be associating __typetag2 and __typetag3 to g's type rather
>>>>>>>>>>> than the type to which it points.
>>>>>>>
>>>>>>> Note the example he uses is:
>>>>>>>
>>>>>>>      (a) int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>>
>>>>>>> Not
>>>>>>>
>>>>>>>      (b) int * __typetag1 * __typetag2 __typetag3 g;
>>>>>>>
>>>>>>> Apparently for (a) clang is generating DWARF that associates __typetag2
>>>>>>> and __typetag3 to g's type (the pointer to pointer) instead of the
>>>>>>> pointer to int, which contravenes the GNU syntax rules.
>>>>>>>
>>>>>>> AFAIK that is where the DWARF we generate differs, and what is blocking
>>>>>>> us.  David will correct me in the likely case I'm wrong :)
>>>>>>
>>>>>> Right. This is what I hoped maybe the C2x syntax could resolve.
>>>>>>
>>>>>> The issue I saw is that in case (a) above, when using the GNU
>>>>>> attribute syntax, GCC and clang produce different results. I think that
>>>>>> the underlying cause is some subtle difference in how clang handles
>>>>>> the GNU attribute syntax in this case compared to GCC.
>>>>>>
>>>>>>
>>>>>> To remind ourselves, here is the full example. Notice the significant
>>>>>> difference in which objects the tags are associated with in DWARF.
>>>>>>
>>>>>>
>>>>>> #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>>>>>> #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>>>>>> #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>>>>>>
>>>>>> int __typetag1 * __typetag2 __typetag3 * g;
>>>>>>
>>>>>>
>>>>>> GCC: bpf-unknown-none-gcc -c -gdwarf -gbtf annotate.c
>>>>>>
>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>                  DW_AT_producer  ("GNU C17 12.0.1 20220401 (experimental) -gdwarf -gbtf")
>>>>>>                  DW_AT_language  (DW_LANG_C11)
>>>>>>                  DW_AT_name      ("annotate.c")
>>>>>>                  DW_AT_comp_dir  ("/home/dfaust/playpen/btf/tags")
>>>>>>                  DW_AT_stmt_list (0x00000000)
>>>>>>
>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>                    DW_AT_name    ("g")
>>>>>>                    DW_AT_decl_file       ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>>>                    DW_AT_decl_line       (11)
>>>>>>                    DW_AT_decl_column     (0x2a)
>>>>>>                    DW_AT_type    (0x00000032 "int **")
>>>>>>                    DW_AT_external        (true)
>>>>>>                    DW_AT_location        (DW_OP_addr 0x0)
>>>>>>
>>>>>> 0x00000032:   DW_TAG_pointer_type
>>>>>>                    DW_AT_byte_size       (8)
>>>>>>                    DW_AT_type    (0x00000045 "int *")
>>>>>>                    DW_AT_sibling (0x00000045)
>>>>>>
>>>>>> 0x0000003b:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name  ("btf_type_tag")
>>>>>>                      DW_AT_const_value   ("type-tag-1")
>>>>>>
>>>>>> 0x00000044:     NULL
>>>>>>
>>>>>> 0x00000045:   DW_TAG_pointer_type
>>>>>>                    DW_AT_byte_size       (8)
>>>>>>                    DW_AT_type    (0x00000061 "int")
>>>>>>                    DW_AT_sibling (0x00000061)
>>>>>>
>>>>>> 0x0000004e:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name  ("btf_type_tag")
>>>>>>                      DW_AT_const_value   ("type-tag-3")
>>>>>>
>>>>>> 0x00000057:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name  ("btf_type_tag")
>>>>>>                      DW_AT_const_value   ("type-tag-2")
>>>>>>
>>>>>> 0x00000060:     NULL
>>>>>>
>>>>>> 0x00000061:   DW_TAG_base_type
>>>>>>                    DW_AT_byte_size       (0x04)
>>>>>>                    DW_AT_encoding        (DW_ATE_signed)
>>>>>>                    DW_AT_name    ("int")
>>>>>>
>>>>>> 0x00000068:   NULL
>>>>>
>>>>> Do you have documentation to show why gnu generates attributes this way?
>>>>> If dwarf generates
>>>>>        ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>>>>> does this help?
>>>>
>>>> Okay, I think I see the problem. The internal representations between clang
>>>> and GCC attach the attributes to different nodes, and as a result they
>>>> produce different DWARF:
>>>>
>>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64, annotations: !10)
>>>> !6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64, annotations: !8)
>>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>> !8 = !{!9}
>>>> !9 = !{!"btf_type_tag", !"tag1"}
>>>> !10 = !{!11, !12}
>>>> !11 = !{!"btf_type_tag", !"tag2"}
>>>> !12 = !{!"btf_type_tag", !"tag3"}
>>>>
>>>> If I am reading this IR right, then the tags "tag2" and "tag3" are being
>>>> applied to the int**, and "tag1" is applied to the int*.
>>>>
>>>> But I don't think this lines up with how the attribute syntax is defined.
>>>> See
>>>>     https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>>>> In particular the "All other attributes" section. (It's a bit dense).
>>>> Or, as Joseph summed it up nicely earlier in the thread:
>>>>> In either syntax, __typetag2 __typetag3 should apply to
>>>>> the type to which g points, not to g or its type, just as if you had a
>>>>> type qualifier there.  You'd need to put the attributes (or qualifier)
>>>>> after the *, not before, to make them apply to the pointer type.
>>>>
>>>>
>>>> Compare that to GCC's internal representation, from which DWARF is generated:
>>>>
>>>>    <var_decl 0x7ffff7535090 g
>>>>       type <pointer_type 0x7ffff74f8888
>>>>           type <pointer_type 0x7ffff74f8b28 type <integer_type 
>>>> 0x7ffff74385e8 int>
>>>>               unsigned DI
>>>>               size <integer_cst 0x7ffff742b450 constant 64>
>>>>               unit-size <integer_cst 0x7ffff742b468 constant 8>
>>>>               align:64 warn_if_not_align:0 symtab:0 alias-set -1 
>>>> canonical-type 0x7ffff743f888
>>>>               attributes <tree_list 0x7ffff75165c8
>>>>                   purpose <identifier_node 0x7ffff75290f0 btf_type_tag>
>>>>                   value <tree_list 0x7ffff7516550
>>>>                       value <string_cst 0x7ffff75182e0 type <array_type 
>>>> 0x7ffff74f8738>
>>>>                           readonly constant static "type-tag-3\000">>
>>>>                   chain <tree_list 0x7ffff75165a0 purpose <identifier_node 
>>>> 0x7ffff75290f0 btf_type_tag>
>>>>                       value <tree_list 0x7ffff75164d8
>>>>                           value <string_cst 0x7ffff75182c0 type 
>>>> <array_type 0x7ffff74f8738>
>>>>                               readonly constant static "type-tag-2\000">>>>
>>>>               pointer_to_this <pointer_type 0x7ffff74f8bd0>>
>>>>           unsigned DI size <integer_cst 0x7ffff742b450 64> unit-size 
>>>> <integer_cst 0x7ffff742b468 8>
>>>>           align:64 warn_if_not_align:0 symtab:0 alias-set -1 
>>>> canonical-type 0x7ffff74f87e0
>>>>           attributes <tree_list 0x7ffff75165f0 purpose <identifier_node 
>>>> 0x7ffff75290f0 btf_type_tag>
>>>>               value <tree_list 0x7ffff7516438
>>>>                   value <string_cst 0x7ffff75182a0 type <array_type 
>>>> 0x7ffff74f8738>
>>>>                       readonly constant static "type-tag-1\000">>>>
>>>>       public static unsigned DI defer-output 
>>>> /home/dfaust/playpen/btf/tags/annotate.c:10:42 size <integer_cst 
>>>> 0x7ffff742b450 64> unit-size <integer_cst 0x7ffff742b468 8>
>>>>       align:64 warn_if_not_align:0>
>>>>
>>>> See how tags "tag2" and "tag3" are associated with the pointer_type
>>>> 0x7ffff74f8b28, that is, "the type to which g points".
>>>>
>>>> From GCC's DWARF the BTF we get currently looks like:
>>>>     VAR(g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>>>> which is obviously quite different and why this case caught my attention.
>>>>
>>>> I think this difference is the root of our problems. It might not be
>>>> specifically related to the BTF tag attributes but they do reveal some
>>>> discrepancy between how clang and GCC handle the attribute syntax.
>>>
>>> The btf_type_tag attribute is very similar to the address_space attribute.
>>> For example,
>>> $ cat t1.c
>>> int __attribute__((address_space(1))) * p;
>>> $ clang -g -S -emit-llvm t1.c
>>>
>>> In IR, we will have
>>> @p = dso_local global ptr addrspace(1) null, align 8, !dbg !0
>>> ...
>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>
>>> Replacing address_space with btf_type_tag, we will get
>>> ptr->type_tag->int in debuginfo.
>>>
>>> But it looks like gcc doesn't support the address_space attribute:
>>>
>>> $ gcc -g -S t1.c
>>> t1.c:1:1: warning: ‘address_space’ attribute directive ignored [-Wattributes]
>>>    int __attribute__((address_space(1))) * p;
>>>    ^~~
>>>
>>> Is it possible for gcc to go with address_space attribute
>>> semantics for btf_type_tag attribute?
>>
>> In cases like this the behavior is the same.
>> $ cat foo.c
>> int __attribute__((btf_type_tag("tag1"))) * p;
>> $ gcc -c -gdwarf -gbtf foo.c
>>
>> Internally:
>>   <var_decl 0x7ffff743abd0 p
>>      type <pointer_type 0x7ffff7590150
>>          type <integer_type 0x7ffff74475e8 int public SI
>>              size <integer_cst 0x7ffff742bf90 constant 32>
>>              unit-size <integer_cst 0x7ffff742bfa8 constant 4>
>>              align:32 warn_if_not_align:0 symtab:0 alias-set -1 
>> canonical-type 0x7ffff74475e8 precision:32 min <integer_cst 0x7ffff742bf48 
>> -2147483648> max <integer_cst 0x7ffff742bf60 2147483647>
>>              pointer_to_this <pointer_type 0x7ffff744fa80>>
>>          unsigned DI
>>          size <integer_cst 0x7ffff742bd50 constant 64>
>>          unit-size <integer_cst 0x7ffff742bd68 constant 8>
>>          align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
>> 0x7ffff744fa80
>>          attributes <tree_list 0x7ffff7564d70
>>              purpose <identifier_node 0x7ffff757f2d0 btf_type_tag>
>>              value <tree_list 0x7ffff7564cf8
>>                  value <string_cst 0x7ffff757c220 type <array_type 
>> 0x7ffff75900a8>
>>                      readonly constant static "tag1\000">>>>
>>      public static unsigned DI defer-output 
>> /home/dfaust/playpen/btf/tags/foo.c:1:45 size <integer_cst 0x7ffff742bd50 
>> 64> unit-size <integer_cst 0x7ffff742bd68 8>
>>      align:64 warn_if_not_align:0>
>>
>> And the resulting BTF:
>>
>> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>> [2] PTR '(anon)' type_id=3
>> [3] TYPE_TAG 'tag1' type_id=1
>> [4] VAR 'p' type_id=2, linkage=global
>> [5] DATASEC '.bss' size=0 vlen=1
>>      type_id=4 offset=0 size=8 (VAR 'p')
>>
>> var(p) -> ptr -> type_tag -> int
> 
> It would be good if we can generate similar encoding in dwarf.
> Currently in clang, we generate
>      var(p) -> ptr (type_tag) -> int
> but I am open to generating
>      var(p) -> ptr -> type_tag -> int
> in dwarf as well if it is possible.
> 

The DWARF encodings are the same between GCC and LLVM.
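
As a rough sketch of that shared encoding, here is a toy Python model
(not a real DWARF or BTF API; the function name and the list layout are
made up for illustration). It assumes each DW_TAG_LLVM_annotation child
of a pointer DIE becomes a type-tag node between that pointer and its
pointee, in child order, which is what the GCC BTF chain for 'g' quoted
earlier in this thread suggests. The ordering of several tags on one
pointer is an assumption for anything other than that GCC case.

```python
# Hypothetical, simplified model. Each entry in `pointers` is the list of
# annotation values found under one pointer DIE (in DWARF child order),
# starting from the variable's own type and moving inward.

def btf_chain(var, pointers, base):
    """Render 'VAR(x) -> ptr -> tags... -> base' for annotated pointers."""
    parts = [f"VAR({var})"]
    for annotations in pointers:
        parts.append("ptr")
        # Assumption: every annotation child becomes one node placed
        # between this pointer and whatever it points to.
        parts.extend(annotations)
    parts.append(base)
    return " -> ".join(parts)


# Pointer DIEs from the GCC dump of annotate.c (tag names shortened).
gcc = [["tag1"], ["tag3", "tag2"]]
# Pointer DIEs from the clang dump of the same file.
clang = [["tag2", "tag3"], ["tag1"]]

print(btf_chain("g", gcc, "int"))
print(btf_chain("g", clang, "int"))
```

With the GCC lists this prints
VAR(g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int, the same chain
reported for GCC earlier in the thread. The two inputs go through the
same mapping and differ only in which pointer carries which tags.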

In the case we've looked at in this thread where the DWARF is not
the same, it is a result of clang attribute parsing not following
the GNU attribute syntax correctly and associating the attribute
with the wrong part of the declaration. But this is not a problem
with DWARF.
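
To make the placement rule concrete, here is a small Python sketch of
how plain C type qualifiers bind in a simple pointer declaration,
following Joseph's point that attributes in this position behave "just
as if you had a type qualifier there". The describe() helper, its token
handling and the requirement for spaces around '*' are all made up for
illustration; this is not how either compiler parses. It treats anything
before the first '*' as applying to the base type, which matches
qualifiers and the C2x reading described earlier, but not the GNU rule
that a tag in that position applies to the declaration.

```python
def describe(decl, base_types=("int", "char", "void")):
    """Describe a simple pointer declaration such as 'int const * volatile * g;'.

    Hypothetical helper: tokens must be separated by spaces, and the
    identifier must come last.
    """
    tokens = decl.rstrip(";").split()
    name = tokens.pop()      # identifier is the last token
    levels = [[]]            # levels[0] holds qualifiers on the base type
    base = None
    for tok in tokens:
        if tok == "*":
            levels.append([])            # each '*' opens a new pointer level
        elif base is None and tok in base_types and len(levels) == 1:
            base = tok
        else:
            levels[-1].append(tok)       # binds to the most recent level
    if base is None:
        raise ValueError("no base type found in: " + decl)
    # Read outward-in: the last '*' forms the declared object's own type.
    parts = [" ".join(quals + ["pointer to"]) for quals in reversed(levels[1:])]
    parts.append(" ".join(levels[0] + [base]))
    return name + ": " + " ".join(parts)


print(describe("int const * volatile * g;"))
print(describe("int tag1 * tag2 tag3 * g;"))   # form (a) from this thread
print(describe("int * tag1 * tag2 tag3 g;"))   # form (b) from this thread
```

For form (a) this yields "g: pointer to tag2 tag3 pointer to tag1 int",
that is, tag2 and tag3 land on the type to which g points rather than on
g's own type. For form (b) each tag lands on the pointer that precedes
it.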

>>
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> clang: clang -target bpf -c -g annotate.c
>>>>>>
>>>>>> 0x0000000c: DW_TAG_compile_unit
>>>>>>                  DW_AT_producer  ("clang version 15.0.0 (https://github.com/llvm/llvm-project.git f80e369f61ebd33dd9377bb42fcab64d17072b18)")
>>>>>>                  DW_AT_language  (DW_LANG_C99)
>>>>>>                  DW_AT_name      ("annotate.c")
>>>>>>                  DW_AT_str_offsets_base  (0x00000008)
>>>>>>                  DW_AT_stmt_list (0x00000000)
>>>>>>                  DW_AT_comp_dir  ("/home/dfaust/playpen/btf/tags")
>>>>>>                  DW_AT_addr_base (0x00000008)
>>>>>>
>>>>>> 0x0000001e:   DW_TAG_variable
>>>>>>                    DW_AT_name    ("g")
>>>>>>                    DW_AT_type    (0x00000029 "int **")
>>>>>>                    DW_AT_external        (true)
>>>>>>                    DW_AT_decl_file       ("/home/dfaust/playpen/btf/tags/annotate.c")
>>>>>>                    DW_AT_decl_line       (11)
>>>>>>                    DW_AT_location        (DW_OP_addrx 0x0)
>>>>>>
>>>>>> 0x00000029:   DW_TAG_pointer_type
>>>>>>                    DW_AT_type    (0x00000035 "int *")
>>>>>>
>>>>>> 0x0000002e:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name  ("btf_type_tag")
>>>>>>                      DW_AT_const_value   ("type-tag-2")
>>>>>>
>>>>>> 0x00000031:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name  ("btf_type_tag")
>>>>>>                      DW_AT_const_value   ("type-tag-3")
>>>>>>
>>>>>> 0x00000034:     NULL
>>>>>>
>>>>>> 0x00000035:   DW_TAG_pointer_type
>>>>>>                    DW_AT_type    (0x0000003e "int")
>>>>>>
>>>>>> 0x0000003a:     DW_TAG_LLVM_annotation
>>>>>>                      DW_AT_name  ("btf_type_tag")
>>>>>>                      DW_AT_const_value   ("type-tag-1")
>>>>>>
>>>>>> 0x0000003d:     NULL
>>>>>>
>>>>>> 0x0000003e:   DW_TAG_base_type
>>>>>>                    DW_AT_name    ("int")
>>>>>>                    DW_AT_encoding        (DW_ATE_signed)
>>>>>>                    DW_AT_byte_size       (0x04)
>>>>>>
>>>>>> 0x00000042:   NULL
>>>>>>
>>>>>>
