Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

Qing Zhao via Gcc-patches Fri, 01 Jul 2022 06:40:37 -0700


> On Jul 1, 2022, at 8:58 AM, Richard Biener <[email protected]> wrote:
> 
> On Fri, Jul 1, 2022 at 2:55 PM Qing Zhao <[email protected]> wrote:
>> 
>> 
>> 
>>> On Jul 1, 2022, at 2:49 AM, Richard Biener <[email protected]> 
>>> wrote:
>>> 
>>> On Thu, Jun 30, 2022 at 9:30 PM Qing Zhao <[email protected]> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Jun 30, 2022, at 1:03 PM, Jakub Jelinek <[email protected]> wrote:
>>>>> 
>>>>> On Thu, Jun 30, 2022 at 03:31:00PM +0000, Qing Zhao wrote:
>>>>>>> No, that’s not true.  A FIELD_DELC is only shared for cv variants of a 
>>>>>>> structure.
>>>>>> 
>>>>>> Sorry for my dump questions:
>>>>>> 
>>>>>> 1. What do you mean by “cv variants” of a structure?
>>>>> 
>>>>> const/volatile qualified variants.  So
>>>> Okay. I see. thanks.
>>>>> 
>>>>>> 2. For the following example:
>>>>>> 
>>>>>> struct AX { int n; short ax[];};
>>>>> 
>>>>> struct AX, const struct AX, volatile const struct AX etc. types will share
>>>>> the FIELD_DECLs.
>>>> 
>>>> Okay.
>>>>> 
>>>>>> struct UX {struct AX b; int m;};
>>>>>> 
>>>>>> Are there two different FIELD_DECLs in the IR, one for AX.ax, the other 
>>>>>> one is for UX.b.ax?
>>>>> 
>>>>> No, there are just n and ax FIELD_DECLs with DECL_CONTEXT of struct AX and
>>>>> b and m FIELD_DECLs with DECL_CONTEXT of struct UX.
>>>> 
>>>> Ah, right.
>>>> 
>>>> 
>>>>> 
>>>>> But, what is important is that when some FIELD_DECL is last in some
>>>>> structure and has array type, it doesn't mean it should have an
>>>>> unconstrained length.
>>>>> In the above case, when struct AX is is followed by some other member, it
>>>>> acts as a strict short ax[0]; field (even when that is an exception), one
>>>>> can tak address of &UX.b.ax[0], but can't dereference that, or 
>>>>> &UX.b.ax[1].
>>>> 
>>>> So, is this a GNU extension. I see that CLANG gives a warning by default 
>>>> and GCC gives a warning when specify -pedantic:
>>>> [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t3.c
>>>> struct AX
>>>> {
>>>> int n;
>>>> short ax[];
>>>> };
>>>> 
>>>> struct UX
>>>> {
>>>> struct AX b;
>>>> int m;
>>>> };
>>>> 
>>>> void warn_ax_local (struct AX *p, struct UX *q)
>>>> {
>>>> p->ax[2] = 0;
>>>> q->b.ax[2] = 0;
>>>> }
>>>> [opc@qinzhao-ol8u3-x86 trailing_array]$ clang -O2 -Wall t3.c -S
>>>> t3.c:9:13: warning: field 'b' with variable sized type 'struct AX' not at 
>>>> the end of a struct or class is a GNU extension 
>>>> [-Wgnu-variable-sized-type-not-at-end]
>>>> struct AX b;
>>>>           ^
>>>> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t3.c -pedantic -S
>>>> t3.c:9:13: warning: invalid use of structure with flexible array member 
>>>> [-Wpedantic]
>>>>   9 |   struct AX b;
>>>>     |             ^
>>>> 
>>>> But, Yes, I agree, even though this is only a GNU extension, We still need 
>>>> to handle it and accept it as legal code.
>>>> 
>>>> Then, yes, I also agree that encoding the info of is_flexible_array into 
>>>> FIELD_DECL is not good.
>>> 
>>> Which is why I suggested to encode 'not_flexible_array'.  This way the
>>> FE can mark all a[1] this way in some mode
>>> but leave a[] as possibly flexarray (depending on context).
>> 
>> Then, FE marking (not_flexible_array) can not do the complete job to mark
>> whether a field array is flexible array member or not,  Middle end still 
>> need to
>> check the “context” (i.e, whether the array ref is at the end of a 
>> structure?)
> 
> Yes, but at the very "root" the frontend get's to say whether char[1]
> is possibly
> flexarray or if only char[] is.


Okay. 
> 
>> So, only FE marking + Middle-end “context checking” together will decide a 
>> REAL flex array?
>> 
>> If so, comparing to the current implemenation to have all the checking in 
>> middle-end, what’s the
>> major benefit of moving part of the checking into FE, and leaving the other 
>> part in middle-end?
> 
> Because a frontend might decide based on language rules that char[1]
> is never a flexarray and
> in particular it can decide to do that only for user declared
> structures.  In particular the latter is
> difficult for the middle-end where some aggregates are built by the
> middle-end (gcov) or the
> targets.

That makes sense. 
> 
>>> 
>>>> How about encoding the info of “has_flexible_array” into the enclosing 
>>>> RECORD_TYPE or UNION_TYPE node?
>>> 
>>> But that has the same issue.  Consider
>>> 
>>> struct A { int n; int a[1]; };
>>> 
>>> where a is considered possibly a flexarray vs.
>>> 
>>> struct B { struct A a; int b; };
>>> 
>>> where B.a would be not considered to have a flexarray (again note
>>> 'possibly' vs. 'actually does').
>>> 
>>> Also
>>> 
>>> struct A a;
>>> 
>>> has 'a' as _not_ having a flexarray (because it's size is statically
>>> allocated) but
>>> 
>>> struct A *a;
>>> struct B *b;
>>> 
>>> a->a[n];
>>> 
>>> as possibly accessing the flexarray portion of *a while
>>> 
>>> b->a.a[n]
>>> 
>>> is not accessing a flexarray because there's a member after a in b.
>>> 
>>> For your original proposal it's really the field declaration itself
>>> which changes so annotating the FIELD_DECL
>>> seems correct to me.
>> 
>> Then middle-end still need to check the context, and combined
>> with the “not_flexible_array” flag that is encoded in FIELD_DECL
>> to make the final decision?
> 
> Yes.

Okay, I see now.

thanks.

Qing
> 
>> Thanks.
>> 
>> Qing
>>> 
>>>> For example, in the above example,  the RECORD_TYPE for “struct AX” will 
>>>> be marked as “has_flexible_array”, but that for “struct UX” will not.
>>>> 
>>>>> 
>>>>> I believe pedantically flexible array members in such cases don't
>>>>> necessarily mean zero length array, could be longer, e.g. for the usual
>>>>> x86_64 alignments
>>>>> struct BX { long long n; short o; short ax[]; };
>>>>> struct VX { struct BX b; int m; };
>>>>> I think it acts as short ax[3]; because the padding at the end of struct 
>>>>> BX
>>>>> is so long that 3 short elements fit in there.
>>>>> While if one uses
>>>>> struct BX bx = { 1LL, 2, { 3, 4, 5, 6, 7, 8, 9, 10 } };
>>>>> (a GNU extension), then it acts as short ax[11]; - the initializer is 8
>>>>> elements and after short ax[8]; is padding for another 3 full elemenets.
>>>>> And of course:
>>>>> struct BX *p = malloc (offsetof (struct BX, ax) + n * sizeof (short));
>>>>> means short ax[n].
>>>>> Whether struct WX { struct BX b; };
>>>>> struct WX *p = malloc (offsetof (struct WX, b.ax) + n * sizeof (short));
>>>>> is pedantically acting as short ax[n]; is unclear to me, but we are
>>>>> generally allowing that and people expect it.
>>>> 
>>>> Okay, I see now.
>>>>> 
>>>>> Though, on the GCC side, I think we are only treating like flexible arrays
>>>>> what is really at the end of structs, not followed by other members.
>>>> 
>>>> My understanding is, Permitting flexible array to be followed by other 
>>>> members is a GNU extension.  (Actually, it’s not allowed by standard?).
>>>> 
>>>> Thanks a lot for your patience and help.
>>>> 
>>>> Qing
>>>>> 
>>>>>     Jakub

Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

Reply via email to