Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-30 Thread Qing Zhao
Okay, Based on the comments so far, I will work on the 5th version of the 
patch, major changes will include:

1. Change the return type of the routine .ACCESS_WITH_SIZE 
FROM:
  Pointer to the type of the element of the flexible array;
TO:
   Pointer to the type of the flexible array;

 And then wrap the call with an indirection reference. 

2. Adjust all other parts with this change, (this will simplify the bound 
sanitizer instrument);

3. Add the fixes to the kernel building failures, which include:

 A. The operator “typeof” cannot return correct type for a->array; 
(I guess that the above change 1 might automatically resolve this issue)
 B. The operator “&” cannot return correct address for a->array;

4. Correctly handle the case when the value of “counted-by” is zero or negative 
as following

4.1 . Update the counted-by doc with the following:

 When the counted-by field is assigned a negative integer value, the 
compiler will treat the value as zero. 

4.2.   (possibly) Adjust __bdos and array bound sanitizer to handle 
correctly when “counted-by” is zero. 

   __bdos will return size 0 when counted-by is zero;

  Array bound sanitizer will report out-of-bound when the counted-by is 
zero for any array access. 

Let me know if I missed anything.

Thanks a lot for all the comments

 Qing


> On Jan 23, 2024, at 7:29 PM, Qing Zhao  wrote:
> 
> Hi,
> 
> This is the 4th version of the patch.
> 
> It based on the following proposal:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
> Represent the missing dependence for the "counted_by" attribute and its 
> consumers
> 
> **The summary of the proposal is:
> 
> * Add a new internal function ".ACCESS_WITH_SIZE" to carry the size 
> information for every reference to a FAM field;
> * In C FE, Replace every reference to a FAM field whose TYPE has the 
> "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
> * In every consumer of the size information, for example, BDOS or array bound 
> sanitizer, query the size information or ACCESS_MODE information from the new 
> internal function;
> * When expansing to RTL, replace the internal function with the actual 
> reference to the FAM field;
> * Some adjustment to ipa alias analysis, and other SSA passes to mitigate the 
> impact to the optimizer and code generation.
> 
> 
> **The new internal function
> 
>  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, 
> ACCESS_MODE, INDEX)
> 
> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> 
> which returns the "REF_TO_OBJ" same as the 1st argument;
> 
> Both the return type and the type of the first argument of this function have 
> been converted from the incomplete array type to the corresponding pointer 
> type.
> 
> Please see the following link for why:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> 
> 1st argument "REF_TO_OBJ": The reference to the object;
> 2nd argument "REF_TO_SIZE": The reference to the size of the object,
> 3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE 
> represents
>   0: unknown;
>   1: the number of the elements of the object type;
>   2: the number of bytes;
> 4th argument "PRECISION_OF_SIZE": The precision of the integer that 
> REF_TO_SIZE points;
> 5th argument "ACCESS_MODE":
>  -1: Unknown access semantics
>   0: none
>   1: read_only
>   2: write_only
>   3: read_write
> 6th argument "INDEX": the INDEX for the original array reference.
>  -1: Unknown
> 
> NOTE: The 6th Argument is added for bound sanitizer instrumentation.
> 
> ** The Patch sets included:
> 
> 1. Provide counted_by attribute to flexible array member field;
>  which includes:
>  * "counted_by" attribute documentation;
>  * C FE handling of the new attribute;
>syntax checking, error reporting;
>  * testing cases;
> 
> 2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
>  which includes:
>  * The definition of the new internal function .ACCESS_WITH_SIZE in 
> internal-fn.def.
>  * C FE converts every reference to a FAM with "counted_by" attribute to 
> a call to the internal function .ACCESS_WITH_SIZE.
>(build_component_ref in c_typeck.cc)
>This includes the case when the object is statically allocated and 
> initialized.
>In order to make this working, we should update 
> initializer_constant_valid_p_1 and output_constant in varasm.cc to include 
> calls to .ACCESS_WITH_SIZE.
> 
>However, for the reference inside "offsetof", ignore the "counted_by" 
> attribute since it's not useful at all. (c_parser_postfix_expression in 
> c/c-parser.cc)
> 
>  * Convert every call to .ACCESS_WITH_SIZE to its first argument.
>(expand_ACCESS_WITH_SIZE in internal-fn.cc)
>  * adjust alias analysis to exclude the new in

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-30 Thread Qing Zhao


> On Jan 30, 2024, at 12:41 AM, Kees Cook  wrote:
> 
> On Mon, Jan 29, 2024 at 10:45:23PM +, Qing Zhao wrote:
>> There are two things here. 
>> 
>> 1. The value of the “counted-by” is 0; (which is easy to be understood)
>> 2. The result of the _builtin_object_size when see a “counted-by” 0.
>> 
>> For 1, it’s simple, if we see a counted-by value <= 0,  then counted-by is 0;
> 
> Okay, that's good; this matches my understanding. :)
> 
>> But for 2, when the _builtin_object_size sees a “counted-by” 0, what’s value 
>> it will return for the object size?
>> 
>> Can we return 0 for the object size? 
> 
> I don't see why not. For example:
> 
> // -O2 -fstrict-flex-arrays=3
> struct s {
>int a;
>int b[4];
> } foo;
> 
> #define report(x)   printf("%s: %zu\n", #x, (size_t)(x))
> 
> int main(int argc, char *argv[])
> {
>struct s foo;
>report(__builtin_dynamic_object_size(&foo.b[4], 0));
>report(__builtin_dynamic_object_size(&foo.b[5], 0));
>report(__builtin_dynamic_object_size(&foo.b[-10], 0));
>report(__builtin_dynamic_object_size(&foo.b[4], 1));
>report(__builtin_dynamic_object_size(&foo.b[5], 1));
>report(__builtin_dynamic_object_size(&foo.b[-10], 1));
>report(__builtin_dynamic_object_size(&foo.b[4], 2));
>report(__builtin_dynamic_object_size(&foo.b[5], 2));
>report(__builtin_dynamic_object_size(&foo.b[-10], 2));
>report(__builtin_dynamic_object_size(&foo.b[4], 3));
>report(__builtin_dynamic_object_size(&foo.b[5], 3));
>report(__builtin_dynamic_object_size(&foo.b[-10], 3));
>return 0;
> }
> 
> shows:
> 
> __builtin_dynamic_object_size(&foo.b[4], 0): 0
> __builtin_dynamic_object_size(&foo.b[5], 0): 0
> __builtin_dynamic_object_size(&foo.b[-10], 0): 0
> __builtin_dynamic_object_size(&foo.b[4], 1): 0
> __builtin_dynamic_object_size(&foo.b[5], 1): 0
> __builtin_dynamic_object_size(&foo.b[-10], 1): 0
> __builtin_dynamic_object_size(&foo.b[4], 2): 0
> __builtin_dynamic_object_size(&foo.b[5], 2): 0
> __builtin_dynamic_object_size(&foo.b[-10], 2): 0
> __builtin_dynamic_object_size(&foo.b[4], 3): 0
> __builtin_dynamic_object_size(&foo.b[5], 3): 0
> __builtin_dynamic_object_size(&foo.b[-10], 3): 0
> 
> This is showing "no bytes left" for the end of the b array, and if this
> index keeps going, it still reports 0 if we're past the end of the object
> completely. And it is similarly capped for negative indexes. This is
> true for all the __bos type bits.
> 
> A "counted-by" of 0 (or below) would have the same meaning as an out of
> bounds index here.

Okay. I will keep this behavior when counted-by is zero (and negative) for 
__bos. 
> 
>> (As I mentioned in the previous email, 0 in __builtin_object_size doesn’t 
>> mean size 0,
>> it means UNKNOWN_SIZE when the type is 2/3, So, what’s value we should 
>> return for the size 0?)
>> https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
> 
> I think I see what you mean, but I still think it should be 0 for 2/3,
> regardless of the documented interpretation. If that's the current
> response for a pathological index under 2/3, then I think it's totally
> reasonable that it should do the same for pathological bounds.

Okay, will keep this behavior for “counted-by” zero. 

(But still feel that 0 for 2/3, i.e the MINIMUM size will represent as 
UNKNOWN_SIZE.
 If that’s the value kernel expected, that’s good)
> 
> 
> And BTW, it seems there are 0-sized objects, though maybe they're some
> kind of special case:
> 
> struct s {
>int a;
>struct { } nothing;
>int b;
> };
> 
> #define report(x)   printf("%s: %zu\n", #x, (size_t)(x))
> 
> int main(int argc, char *argv[])
> {
>struct s foo;
>report(__builtin_dynamic_object_size(&foo.nothing, 1));
> }
> 
> shows:
> 
> __builtin_dynamic_object_size(&foo.nothing, 1): 0

Looks like that GCC has such extension: 
https://gcc.gnu.org/onlinedocs/gcc/Empty-Structures.html

***GCC permits a C structure to have no members:
struct empty {
};

The structure has size zero. In C++, empty structures are part of the language. 
G++ treats empty structures as if they had a single member of type char.

Thanks.

Qing


> 
> -Kees
> 
> -- 
> Kees Cook



Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Kees Cook
On Mon, Jan 29, 2024 at 10:45:23PM +, Qing Zhao wrote:
> There are two things here. 
> 
> 1. The value of the “counted-by” is 0; (which is easy to be understood)
> 2. The result of the _builtin_object_size when see a “counted-by” 0.
> 
> For 1, it’s simple, if we see a counted-by value <= 0,  then counted-by is 0;

Okay, that's good; this matches my understanding. :)

> But for 2, when the _builtin_object_size sees a “counted-by” 0, what’s value 
> it will return for the object size?
> 
>  Can we return 0 for the object size? 

I don't see why not. For example:

// -O2 -fstrict-flex-arrays=3
struct s {
int a;
int b[4];
} foo;

#define report(x)   printf("%s: %zu\n", #x, (size_t)(x))

int main(int argc, char *argv[])
{
struct s foo;
report(__builtin_dynamic_object_size(&foo.b[4], 0));
report(__builtin_dynamic_object_size(&foo.b[5], 0));
report(__builtin_dynamic_object_size(&foo.b[-10], 0));
report(__builtin_dynamic_object_size(&foo.b[4], 1));
report(__builtin_dynamic_object_size(&foo.b[5], 1));
report(__builtin_dynamic_object_size(&foo.b[-10], 1));
report(__builtin_dynamic_object_size(&foo.b[4], 2));
report(__builtin_dynamic_object_size(&foo.b[5], 2));
report(__builtin_dynamic_object_size(&foo.b[-10], 2));
report(__builtin_dynamic_object_size(&foo.b[4], 3));
report(__builtin_dynamic_object_size(&foo.b[5], 3));
report(__builtin_dynamic_object_size(&foo.b[-10], 3));
return 0;
}

shows:

__builtin_dynamic_object_size(&foo.b[4], 0): 0
__builtin_dynamic_object_size(&foo.b[5], 0): 0
__builtin_dynamic_object_size(&foo.b[-10], 0): 0
__builtin_dynamic_object_size(&foo.b[4], 1): 0
__builtin_dynamic_object_size(&foo.b[5], 1): 0
__builtin_dynamic_object_size(&foo.b[-10], 1): 0
__builtin_dynamic_object_size(&foo.b[4], 2): 0
__builtin_dynamic_object_size(&foo.b[5], 2): 0
__builtin_dynamic_object_size(&foo.b[-10], 2): 0
__builtin_dynamic_object_size(&foo.b[4], 3): 0
__builtin_dynamic_object_size(&foo.b[5], 3): 0
__builtin_dynamic_object_size(&foo.b[-10], 3): 0

This is showing "no bytes left" for the end of the b array, and if this
index keeps going, it still reports 0 if we're past the end of the object
completely. And it is similarly capped for negative indexes. This is
true for all the __bos type bits.

A "counted-by" of 0 (or below) would have the same meaning as an out of
bounds index here.

> (As I mentioned in the previous email, 0 in __builtin_object_size doesn’t 
> mean size 0,
>  it means UNKNOWN_SIZE when the type is 2/3, So, what’s value we should 
> return for the size 0?)
> https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html

I think I see what you mean, but I still think it should be 0 for 2/3,
regardless of the documented interpretation. If that's the current
response for a pathological index under 2/3, then I think it's totally
reasonable that it should do the same for pathological bounds.


And BTW, it seems there are 0-sized objects, though maybe they're some
kind of special case:

struct s {
int a;
struct { } nothing;
int b;
};

#define report(x)   printf("%s: %zu\n", #x, (size_t)(x))

int main(int argc, char *argv[])
{
struct s foo;
report(__builtin_dynamic_object_size(&foo.nothing, 1));
}

shows:

__builtin_dynamic_object_size(&foo.nothing, 1): 0


-Kees

-- 
Kees Cook


Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Qing Zhao


> On Jan 29, 2024, at 3:19 PM, Kees Cook  wrote:
> 
> On Mon, Jan 29, 2024 at 07:32:06PM +, Qing Zhao wrote:
>> 
>> 
>>> On Jan 29, 2024, at 12:25 PM, Kees Cook  wrote:
>>> 
>>> On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote:
 An update on the kernel building with my version 4 patch.
 
 Kees reported two FE issues with the current version 4 patch:
 
 1. The operator “typeof” cannot return correct type for a->array;
 2. The operator “&” cannot return correct address for a->array;
 
 I fixed both in my local repository. 
 
 With these additional fix.  Kernel with counted-by annotation can be built 
 successfully. 
>>> 
>>> Thanks for the fixes!
>>> 
 
 And then, Kees reported one behavioral issue with the current counted-by:
 
 When the counted-by value is below zero, my current patch 
 
 A. Didn’t report any warning for it.
 B. Accepted the negative value as a wrapped size.
 
 i.e. for:
 
 struct foo {
 signed char size;
 unsigned char array[] __counted_by(size);
 } *a;
 
 ...
 a->size = -3;
 report(__builtin_dynamic_object_size(p->array, 1));
 
 this reports 253, rather than 0.
 
 And the array-bounds sanitizer doesn’t catch negative index bounds 
 neither. 
 
 a->size = -3;
 report(a->array[1]); // does not trap
 
 
 So, my questions are:
 
 How should we handle the negative counted-by value?
>>> 
>>> Treat it as always 0-bounded: count < 0 ? 0 : count
>> 
>> Then the size of the object is 0?
> 
> That would be the purpose, yes. It's possible something else has
> happened, but it would mean "the array contents should not be accessed
> (since we don't have a valid size)".

This might be a new concept we need to add, from my understanding,
 C/C++ don’t have the zero-sized object. 
So, I am a little worried about where should we add this concept?

The most reasonable place I am thinking is adding such concept to the 
doc of “counted-by” attribute, but still not very sure on this.
> 
>> 
>>> 
 
 My approach is:
 
  I think that this is a user error, the compiler need to Issue warning 
 during runtime about this user error.
 
 Since I have one remaining patch that has not been finished yet:
 
 6  Emit warnings when the user breaks the requirments for the new 
 counted_by attribute
 compilation time: -Wcounted-by
 run time: -fsanitizer=counted-by
* The initialization to the size field should be done before the first 
 reference to the FAM field.
>>> 
>>> I would hope that regular compile-time warnings would catch this.
>> If the value is known at compile-time, then compile-time should catch it.
>> 
>>> 
* the array has at least # of elements specified by the size field all 
 the time during the program.
* the value of counted-by should not be negative.
>>> 
>>> This seems reasonable for a very strict program, but it won't work for
>>> the kernel as-is: a negative "count" is sometimes used to carry failure
>>> details back to other users of the structure. This could be refactored in
>>> the kernel, but I'd prefer that even without -fsanitizer=counted-by the
>>> runtime behaviors will be "safe".
>> 
>> So, In the kernel’s source code, for example:
>> 
>> struct foo {
>>  int count;
>>  short array[] __counted_by(count);
>> };
>> 
>> The field “count” will be used for two purposes:
>> A. As the counted_by for the “array” when its value > 0;
>> B. As an errno when its value < 0;  under such condition, the size of 
>> “array” is zero. 
>> 
>> Is the understanding correct?
> 
> Yes.
> 
>> Is doing this for saving space?  (Curious -:)
> 
> It seems so, yes.
> 
>>> It does not seem sensible to me that adding a buffer size validation
>>> primitive to GCC will result in conditions where a size calculation
>>> will wrap around. I prefer no surprises. :)
>> 
>> Might be a bug here. I guess. 
>>> 
 Let me know your comment and suggestions.
>>> 
>>> Clang has implemented the safety logic I'd prefer:
>>> 
>>> * __bdos will report 0 for any sizing where the "counted_by" count
>>> variable is negative. Effectively, the count variable is always
>>> processed as: count < 0 ? 0 : count
>>> 
>>> struct foo {
>>> int count;
>>> short array[] __counted_by(count);
>>> } *p;
>>> 
>>> __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)
>> 
>> NOTE,  __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e:
>> 
>> size_t __builtin_object_size (const void * ptr, int type)
>> 
>> Will return 0 as UNKNOW_SIZE when type= 2 or 3.
>> 
>> So, I am wondering: should  the 0 here is  UNKNOWN_SIZE or 0 size?
>> 
>> I guess should be the UNKNOWN_SIZE?  (I,e, -1 for MAXIMUM type,  0 for 
>> MINIMUM type).
>> 
>> i.e, when the value of “count” is 0 or negative,  the __bdos will return 
>> UNKNOWN_SIZE.  Is this correct?
> 
> I would suggest that a negative count 

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Qing Zhao



> On Jan 29, 2024, at 3:35 PM, Joseph Myers  wrote:
> 
> On Mon, 29 Jan 2024, Qing Zhao wrote:
> 
>> Thank you!
>> 
>> Joseph and Richard,  could you also comment on this?
> 
> I think Martin's suggestions are reasonable.

Okay, I will update the patches based on this approach. 

Thanks a lot for the comment.

Qing
> 
> -- 
> Joseph S. Myers
> josmy...@redhat.com
> 



Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Joseph Myers
On Mon, 29 Jan 2024, Qing Zhao wrote:

> Thank you!
> 
> Joseph and Richard,  could you also comment on this?

I think Martin's suggestions are reasonable.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Kees Cook
On Mon, Jan 29, 2024 at 07:32:06PM +, Qing Zhao wrote:
> 
> 
> > On Jan 29, 2024, at 12:25 PM, Kees Cook  wrote:
> > 
> > On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote:
> >> An update on the kernel building with my version 4 patch.
> >> 
> >> Kees reported two FE issues with the current version 4 patch:
> >> 
> >> 1. The operator “typeof” cannot return correct type for a->array;
> >> 2. The operator “&” cannot return correct address for a->array;
> >> 
> >> I fixed both in my local repository. 
> >> 
> >> With these additional fix.  Kernel with counted-by annotation can be built 
> >> successfully. 
> > 
> > Thanks for the fixes!
> > 
> >> 
> >> And then, Kees reported one behavioral issue with the current counted-by:
> >> 
> >> When the counted-by value is below zero, my current patch 
> >> 
> >> A. Didn’t report any warning for it.
> >> B. Accepted the negative value as a wrapped size.
> >> 
> >> i.e. for:
> >> 
> >> struct foo {
> >> signed char size;
> >> unsigned char array[] __counted_by(size);
> >> } *a;
> >> 
> >> ...
> >> a->size = -3;
> >> report(__builtin_dynamic_object_size(p->array, 1));
> >> 
> >> this reports 253, rather than 0.
> >> 
> >> And the array-bounds sanitizer doesn’t catch negative index bounds 
> >> neither. 
> >> 
> >> a->size = -3;
> >> report(a->array[1]); // does not trap
> >> 
> >> 
> >> So, my questions are:
> >> 
> >> How should we handle the negative counted-by value?
> > 
> > Treat it as always 0-bounded: count < 0 ? 0 : count
> 
> Then the size of the object is 0?

That would be the purpose, yes. It's possible something else has
happened, but it would mean "the array contents should not be accessed
(since we don't have a valid size)".

> 
> > 
> >> 
> >> My approach is:
> >> 
> >>   I think that this is a user error, the compiler need to Issue warning 
> >> during runtime about this user error.
> >> 
> >> Since I have one remaining patch that has not been finished yet:
> >> 
> >> 6  Emit warnings when the user breaks the requirments for the new 
> >> counted_by attribute
> >>  compilation time: -Wcounted-by
> >>  run time: -fsanitizer=counted-by
> >> * The initialization to the size field should be done before the first 
> >> reference to the FAM field.
> > 
> > I would hope that regular compile-time warnings would catch this.
> If the value is known at compile-time, then compile-time should catch it.
> 
> > 
> >> * the array has at least # of elements specified by the size field all 
> >> the time during the program.
> >> * the value of counted-by should not be negative.
> > 
> > This seems reasonable for a very strict program, but it won't work for
> > the kernel as-is: a negative "count" is sometimes used to carry failure
> > details back to other users of the structure. This could be refactored in
> > the kernel, but I'd prefer that even without -fsanitizer=counted-by the
> > runtime behaviors will be "safe".
> 
> So, In the kernel’s source code, for example:
> 
> struct foo {
>   int count;
>   short array[] __counted_by(count);
> };
> 
> The field “count” will be used for two purposes:
> A. As the counted_by for the “array” when its value > 0;
> B. As an errno when its value < 0;  under such condition, the size of “array” 
> is zero. 
> 
> Is the understanding correct?

Yes.

> Is doing this for saving space?  (Curious -:)

It seems so, yes.

> > It does not seem sensible to me that adding a buffer size validation
> > primitive to GCC will result in conditions where a size calculation
> > will wrap around. I prefer no surprises. :)
> 
> Might be a bug here. I guess. 
> > 
> >> Let me know your comment and suggestions.
> > 
> > Clang has implemented the safety logic I'd prefer:
> > 
> > * __bdos will report 0 for any sizing where the "counted_by" count
> >  variable is negative. Effectively, the count variable is always
> >  processed as: count < 0 ? 0 : count
> > 
> >  struct foo {
> > int count;
> > short array[] __counted_by(count);
> >  } *p;
> > 
> >  __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)
> 
> NOTE,  __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e:
> 
> size_t __builtin_object_size (const void * ptr, int type)
> 
> Will return 0 as UNKNOW_SIZE when type= 2 or 3.
> 
> So, I am wondering: should  the 0 here is  UNKNOWN_SIZE or 0 size?
> 
> I guess should be the UNKNOWN_SIZE?  (I,e, -1 for MAXIMUM type,  0 for 
> MINIMUM type).
> 
> i.e, when the value of “count” is 0 or negative,  the __bdos will return 
> UNKNOWN_SIZE.  Is this correct?

I would suggest that a negative count should always return 0. The size
isn't "unknown", the "count" has been clamped to 0 to avoid surprises,
so the result is as if the "count" had a zero value.

> Okay, when the value of “count” is 0 or negative, bound sanitizer will report 
> out-of-bound (or trap) for any access to the array. 
> This should be reasonable.

Thanks! And with __bdos() following this logic there won't be a disconnect
between the two. i.e

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Qing Zhao


> On Jan 29, 2024, at 12:25 PM, Kees Cook  wrote:
> 
> On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote:
>> An update on the kernel building with my version 4 patch.
>> 
>> Kees reported two FE issues with the current version 4 patch:
>> 
>> 1. The operator “typeof” cannot return correct type for a->array;
>> 2. The operator “&” cannot return correct address for a->array;
>> 
>> I fixed both in my local repository. 
>> 
>> With these additional fix.  Kernel with counted-by annotation can be built 
>> successfully. 
> 
> Thanks for the fixes!
> 
>> 
>> And then, Kees reported one behavioral issue with the current counted-by:
>> 
>> When the counted-by value is below zero, my current patch 
>> 
>> A. Didn’t report any warning for it.
>> B. Accepted the negative value as a wrapped size.
>> 
>> i.e. for:
>> 
>> struct foo {
>> signed char size;
>> unsigned char array[] __counted_by(size);
>> } *a;
>> 
>> ...
>> a->size = -3;
>> report(__builtin_dynamic_object_size(p->array, 1));
>> 
>> this reports 253, rather than 0.
>> 
>> And the array-bounds sanitizer doesn’t catch negative index bounds neither. 
>> 
>> a->size = -3;
>> report(a->array[1]); // does not trap
>> 
>> 
>> So, my questions are:
>> 
>> How should we handle the negative counted-by value?
> 
> Treat it as always 0-bounded: count < 0 ? 0 : count

Then the size of the object is 0?

> 
>> 
>> My approach is:
>> 
>>   I think that this is a user error, the compiler need to Issue warning 
>> during runtime about this user error.
>> 
>> Since I have one remaining patch that has not been finished yet:
>> 
>> 6  Emit warnings when the user breaks the requirments for the new counted_by 
>> attribute
>>  compilation time: -Wcounted-by
>>  run time: -fsanitizer=counted-by
>> * The initialization to the size field should be done before the first 
>> reference to the FAM field.
> 
> I would hope that regular compile-time warnings would catch this.
If the value is known at compile-time, then compile-time should catch it.

> 
>> * the array has at least # of elements specified by the size field all 
>> the time during the program.
>> * the value of counted-by should not be negative.
> 
> This seems reasonable for a very strict program, but it won't work for
> the kernel as-is: a negative "count" is sometimes used to carry failure
> details back to other users of the structure. This could be refactored in
> the kernel, but I'd prefer that even without -fsanitizer=counted-by the
> runtime behaviors will be "safe".

So, In the kernel’s source code, for example:

struct foo {
  int count;
  short array[] __counted_by(count);
};

The field “count” will be used for two purposes:
A. As the counted_by for the “array” when its value > 0;
B. As an errno when its value < 0;  under such condition, the size of “array” 
is zero. 

Is the understanding correct?

Is doing this for saving space?  (Curious -:)


> 
> It does not seem sensible to me that adding a buffer size validation
> primitive to GCC will result in conditions where a size calculation
> will wrap around. I prefer no surprises. :)

Might be a bug here. I guess. 
> 
>> Let me know your comment and suggestions.
> 
> Clang has implemented the safety logic I'd prefer:
> 
> * __bdos will report 0 for any sizing where the "counted_by" count
>  variable is negative. Effectively, the count variable is always
>  processed as: count < 0 ? 0 : count
> 
>  struct foo {
> int count;
> short array[] __counted_by(count);
>  } *p;
> 
>  __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)

NOTE,  __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e:

size_t __builtin_object_size (const void * ptr, int type)

Will return 0 as UNKNOW_SIZE when type= 2 or 3.

So, I am wondering: should  the 0 here is  UNKNOWN_SIZE or 0 size?

I guess should be the UNKNOWN_SIZE?  (I,e, -1 for MAXIMUM type,  0 for MINIMUM 
type).

i.e, when the value of “count” is 0 or negative,  the __bdos will return 
UNKNOWN_SIZE.  Is this correct?

> 
>  The logic for this is that __bdos can be _certain_ that the size is 0
>  when the count variable is pathological.


> 
> * -fsanitize=array-bounds similarly treats count as above, so that:
> 
>  printf("%d\n", p->array[index]); ==> trap when index > (count < 0 ? 0 : 
> count)
> 
>  Same logic for the sanitizer: any access to the array when count is
>  invalid means the access is invalid and must be trapped.

Okay, when the value of “count” is 0 or negative, bound sanitizer will report 
out-of-bound (or trap) for any access to the array. 
This should be reasonable.

Qing


> 
> 
> This means that software can run safely even in pathological conditions.
> 
> -Kees
> 
> -- 
> Kees Cook




Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Kees Cook
On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote:
> An update on the kernel building with my version 4 patch.
> 
> Kees reported two FE issues with the current version 4 patch:
> 
> 1. The operator “typeof” cannot return correct type for a->array;
> 2. The operator “&” cannot return correct address for a->array;
> 
> I fixed both in my local repository. 
> 
> With these additional fix.  Kernel with counted-by annotation can be built 
> successfully. 

Thanks for the fixes!

> 
> And then, Kees reported one behavioral issue with the current counted-by:
> 
> When the counted-by value is below zero, my current patch 
> 
> A. Didn’t report any warning for it.
> B. Accepted the negative value as a wrapped size.
> 
> i.e. for:
> 
> struct foo {
> signed char size;
> unsigned char array[] __counted_by(size);
> } *a;
> 
> ...
> a->size = -3;
> report(__builtin_dynamic_object_size(p->array, 1));
> 
> this reports 253, rather than 0.
> 
> And the array-bounds sanitizer doesn’t catch negative index bounds neither. 
> 
> a->size = -3;
> report(a->array[1]); // does not trap
> 
> 
> So, my questions are:
> 
>  How should we handle the negative counted-by value?

Treat it as always 0-bounded: count < 0 ? 0 : count

> 
>  My approach is:
> 
>I think that this is a user error, the compiler need to Issue warning 
> during runtime about this user error.
> 
> Since I have one remaining patch that has not been finished yet:
> 
> 6  Emit warnings when the user breaks the requirments for the new counted_by 
> attribute
>   compilation time: -Wcounted-by
>   run time: -fsanitizer=counted-by
>  * The initialization to the size field should be done before the first 
> reference to the FAM field.

I would hope that regular compile-time warnings would catch this.

>  * the array has at least # of elements specified by the size field all 
> the time during the program.
>  * the value of counted-by should not be negative.

This seems reasonable for a very strict program, but it won't work for
the kernel as-is: a negative "count" is sometimes used to carry failure
details back to other users of the structure. This could be refactored in
the kernel, but I'd prefer that even without -fsanitizer=counted-by the
runtime behaviors will be "safe".

It does not seem sensible to me that adding a buffer size validation
primitive to GCC will result in conditions where a size calculation
will wrap around. I prefer no surprises. :)

> Let me know your comment and suggestions.

Clang has implemented the safety logic I'd prefer:

* __bdos will report 0 for any sizing where the "counted_by" count
  variable is negative. Effectively, the count variable is always
  processed as: count < 0 ? 0 : count

  struct foo {
int count;
short array[] __counted_by(count);
  } *p;

  __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count)

  The logic for this is that __bdos can be _certain_ that the size is 0
  when the count variable is pathological.

* -fsanitize=array-bounds similarly treats count as above, so that:

  printf("%d\n", p->array[index]); ==> trap when index > (count < 0 ? 0 : count)

  Same logic for the sanitizer: any access to the array when count is
  invalid means the access is invalid and must be trapped.


This means that software can run safely even in pathological conditions.

-Kees

-- 
Kees Cook


Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Qing Zhao


> On Jan 29, 2024, at 10:50 AM, Martin Uecker  wrote:
> 
> Am Montag, dem 29.01.2024 um 15:09 + schrieb Qing Zhao:
>> Thank you!
>> 
>> Joseph and Richard,  could you also comment on this?
>> 
>>> On Jan 28, 2024, at 5:09 AM, Martin Uecker  wrote:
>>> 
>>> Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao:
 
> On Jan 26, 2024, at 3:04 AM, Martin Uecker  wrote:
> 
> 
> I haven't looked at the patch, but it sounds you give the result
> the wrong type. Then patching up all use cases instead of the
> type seems wrong.
 
 Yes, this is for resolving a very early gimplification issue as I reported 
 last Nov:
 https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
 
 Since no-one responded at that time, I fixed the issue by replacing the 
 ARRAY_REF
 With a pointer indirection:
 https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
 
 The reason for such change is:  return a flexible array member TYPE is not 
 allowed
 by C language (our gimplification follows this rule), so, we have to 
 return a pointer TYPE instead. 
 
 **The new internal function
 
 .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, 
 ACCESS_MODE, INDEX)
 
 INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
 
 which returns the "REF_TO_OBJ" same as the 1st argument;
 
 Both the return type and the type of the first argument of this function 
 have been converted from 
 the incomplete array type to the corresponding pointer type.
 
 As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the 
 original INDEX of the ARRAY_REF was lost
 when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX 
 for bound sanitizer instrumentation, I added
 The 6th argument “INDEX”.
 
 What’s your comment and suggestion on this solution?
>>> 
>>> I am not entirely sure but changing types in the FE seems
>>> problematic because this breaks language semantics. And
>>> then adding special code everywhere to treat it specially
>>> in the FE does not seem a good way forward.
>>> 
>>> If I understand correctly, returning an incomplete array 
>>> type is not allowed and then fails during gimplification.
>> 
>> Yes, this is the problem in gimplification. 
>> 
>>> So I would suggest to make it return a pointer to the 
>>> incomplete array (and not the element type)
>> 
>> 
>> for the following:
>> 
>> struct annotated {
>>  unsigned int size;
>>  int array[] __attribute__((counted_by (size)));
>> };
>> 
>>  struct annotated * p = ….
>>  p->array[9] = 0;
>> 
>> The IL for the above array reference p->array[9] is:
>> 
>> 1. If the return type is the original incomplete array type, 
>> 
>> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
>> 
>> (this triggered the gimplification failure since the return type cannot be a 
>> complete type).
>> 
>> 2. When the return type is changed to a pointer to the element type of the 
>> incomplete array, (the current patch)
>> Then the original array reference naturally becomes an indirect reference 
>> through the pointer
>> 
>> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;
>> 
>> Since the original array reference becomes an indirect reference through the 
>> pointer to the element array, the INDEX info 
>> is mixed into the OFFSET of the indirect reference and lost, so, I added the 
>> 6th argument to the routine .ACCESS_WITH_SIZE
>> to record the INDEX. 
>> 
>> 3. With your suggestion, the return type is changed to a pointer to the 
>> incomplete array, 
>> I just tried this to change the result type :
>> 
>> 
>> --- a/gcc/c/c-typeck.cc
>> +++ b/gcc/c/c-typeck.cc
>> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, 
>> tree ref,
>>   tree counted_by_type)
>> {
>>   gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
>> -  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
>> +  tree result_type = build_pointer_type (TREE_TYPE (ref));
>> 
>> Then, I got the following FE errors:
>> 
>> test.c:10:11: error: invalid use of flexible array member
>>   10 |   p->array[9] = 0;
>> 
>> The reason for the error is: when the original array_ref becomes an 
>> indirect_ref through the pointer to the incomplete array,
>> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT 
>> (type) is invalid since the type is an incomplete array. 
>> As a result, the OFFSET cannot computed for the indirect_ref.
>> 
>> Looks like even more issues with this approach.
> 
> Yes, but only because the following is missing:
> 
>> 
>> 
>>> but then wrap
>>> it with an indirection when inserting this code in the FE
>>> so that the full replacement has the correct type again
>>> (of the incomplete array).
>> 
>> I don’t quite understand the ab

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Qing Zhao
An update on the kernel building with my version 4 patch.

Kees reported two FE issues with the current version 4 patch:

1. The operator “typeof” cannot return correct type for a->array;
2. The operator “&” cannot return correct address for a->array;

I fixed both in my local repository. 

With these additional fix.  Kernel with counted-by annotation can be built 
successfully. 

And then, Kees reported one behavioral issue with the current counted-by:

When the counted-by value is below zero, my current patch 

A. Didn’t report any warning for it.
B. Accepted the negative value as a wrapped size.

i.e. for:

struct foo {
signed char size;
unsigned char array[] __counted_by(size);
} *a;

...
a->size = -3;
report(__builtin_dynamic_object_size(p->array, 1));

this reports 253, rather than 0.

And the array-bounds sanitizer doesn’t catch negative index bounds neither. 

a->size = -3;
report(a->array[1]); // does not trap


So, my questions are:

 How should we handle the negative counted-by value?

 My approach is:

   I think that this is a user error, the compiler need to Issue warning during 
runtime about this user error.

Since I have one remaining patch that has not been finished yet:

6  Emit warnings when the user breaks the requirments for the new counted_by 
attribute
  compilation time: -Wcounted-by
  run time: -fsanitizer=counted-by
 * The initialization to the size field should be done before the first 
reference to the FAM field.
 * the array has at least # of elements specified by the size field all the 
time during the program.
 * the value of counted-by should not be negative.

Let me know your comment and suggestions.

Thanks

Qing

> On Jan 25, 2024, at 3:11 PM, Qing Zhao  wrote:
> 
> Thanks a lot for the testing.
> 
> Yes, I can repeat the issue with the following small example:
> 
> #include 
> #include 
> #include 
> 
> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> 
> struct untracked {
>   int size;
>   int array[] __attribute__((counted_by (size)));
> } *a;
> struct untracked * alloc_buf (int index)
> {
>  struct untracked *p;
>  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>(offsetof (struct untracked, array[0])
> + (index) * sizeof (int;
>  p->size = index;
>  return p;
> }
> 
> int main()
> {
>  a = alloc_buf(10);
> printf ("same_type is %d\n",
>  (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0];
>  return 0;
> }
> 
> 
> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> same_type is 1
> 
> Looks like that the “typeof” operator need to be handled specially in C FE
> for the new internal function .ACCESS_WITH_SIZE. 
> 
> (I have specially handle the operator “offsetof” in C FE already).
> 
> Will fix this issue.
> 
> Thanks.
> 
> Qing
> 
>> On Jan 24, 2024, at 7:51 PM, Kees Cook  wrote:
>> 
>> On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote:
>>> This is the 4th version of the patch.
>> 
>> Thanks very much for this!
>> 
>> I tripped over an unexpected behavioral change that the Linux kernel
>> depends on:
>> 
>> __builtin_types_compatible_p() no longer treats an array marked with
>> counted_by as different from that array's decayed pointer. Specifically,
>> the kernel uses these macros:
>> 
>> 
>> /*
>> * Force a compilation error if condition is true, but also produce a
>> * result (of value 0 and type int), so the expression can be used
>> * e.g. in a structure initializer (or where-ever else comma expressions
>> * aren't permitted).
>> */
>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>> 
>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>> 
>> /* &a[0] degrades to a pointer: a different type from an array */
>> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
>> 
>> 
>> This gets used in various places to make sure we're dealing with an
>> array for a macro:
>> 
>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
>> __must_be_array(arr))
>> 
>> 
>> So this builds:
>> 
>> struct untracked {
>>   int size;
>>   int array[];
>> } *a;
>> 
>> __must_be_array(a->array)
>> => 0 (as expected)
>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>> => 0 (as expected, array vs decayed array pointer)
>> 
>> 
>> But if counted_by is added, we get a build failure:
>> 
>> struct tracked {
>>   int size;
>>   int array[] __counted_by(size);
>> } *b;
>> 
>> __must_be_array(b->array)
>> => build failure (not expected)
>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>> => 1 (not expected, both pointers?)
>> 
>> 
>> 
>> 
>> -- 
>> Kees Cook
> 



Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Martin Uecker
Am Montag, dem 29.01.2024 um 15:09 + schrieb Qing Zhao:
> Thank you!
> 
> Joseph and Richard,  could you also comment on this?
> 
> > On Jan 28, 2024, at 5:09 AM, Martin Uecker  wrote:
> > 
> > Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao:
> > > 
> > > > On Jan 26, 2024, at 3:04 AM, Martin Uecker  wrote:
> > > > 
> > > > 
> > > > I haven't looked at the patch, but it sounds you give the result
> > > > the wrong type. Then patching up all use cases instead of the
> > > > type seems wrong.
> > > 
> > > Yes, this is for resolving a very early gimplification issue as I 
> > > reported last Nov:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> > > 
> > > Since no-one responded at that time, I fixed the issue by replacing the 
> > > ARRAY_REF
> > > With a pointer indirection:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> > > 
> > > The reason for such change is:  return a flexible array member TYPE is 
> > > not allowed
> > > by C language (our gimplification follows this rule), so, we have to 
> > > return a pointer TYPE instead. 
> > > 
> > > **The new internal function
> > > 
> > > .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, 
> > > ACCESS_MODE, INDEX)
> > > 
> > > INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> > > 
> > > which returns the "REF_TO_OBJ" same as the 1st argument;
> > > 
> > > Both the return type and the type of the first argument of this function 
> > > have been converted from 
> > > the incomplete array type to the corresponding pointer type.
> > > 
> > > As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the 
> > > original INDEX of the ARRAY_REF was lost
> > > when converting from ARRAY_REF to INDIRECT_REF, in order to keep the 
> > > INDEX for bound sanitizer instrumentation, I added
> > > The 6th argument “INDEX”.
> > > 
> > > What’s your comment and suggestion on this solution?
> > 
> > I am not entirely sure but changing types in the FE seems
> > problematic because this breaks language semantics. And
> > then adding special code everywhere to treat it specially
> > in the FE does not seem a good way forward.
> > 
> > If I understand correctly, returning an incomplete array 
> > type is not allowed and then fails during gimplification.
> 
> Yes, this is the problem in gimplification. 
> 
> > So I would suggest to make it return a pointer to the 
> > incomplete array (and not the element type)
> 
> 
> for the following:
> 
> struct annotated {
>   unsigned int size;
>   int array[] __attribute__((counted_by (size)));
> };
> 
>   struct annotated * p = ….
>   p->array[9] = 0;
> 
> The IL for the above array reference p->array[9] is:
> 
> 1. If the return type is the original incomplete array type, 
> 
> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
> 
> (this triggered the gimplification failure since the return type cannot be a 
> complete type).
> 
> 2. When the return type is changed to a pointer to the element type of the 
> incomplete array, (the current patch)
> Then the original array reference naturally becomes an indirect reference 
> through the pointer
> 
> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;
> 
> Since the original array reference becomes an indirect reference through the 
> pointer to the element array, the INDEX info 
> is mixed into the OFFSET of the indirect reference and lost, so, I added the 
> 6th argument to the routine .ACCESS_WITH_SIZE
> to record the INDEX. 
> 
> 3. With your suggestion, the return type is changed to a pointer to the 
> incomplete array, 
> I just tried this to change the result type :
> 
> 
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, 
> tree ref,
>tree counted_by_type)
>  {
>gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
> -  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
> +  tree result_type = build_pointer_type (TREE_TYPE (ref));
> 
> Then, I got the following FE errors:
> 
> test.c:10:11: error: invalid use of flexible array member
>10 |   p->array[9] = 0;
> 
> The reason for the error is: when the original array_ref becomes an 
> indirect_ref through the pointer to the incomplete array,
> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT 
> (type) is invalid since the type is an incomplete array. 
> As a result, the OFFSET cannot computed for the indirect_ref.
> 
> Looks like even more issues with this approach.

Yes, but only because the following is missing:

> 
> 
> > but then wrap
> > it with an indirection when inserting this code in the FE
> > so that the full replacement has the correct type again
> > (of the incomplete array).
> 
> I don’t quite understand the above, could you please explain this in more 
> details? (If possible, could you plea

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-29 Thread Qing Zhao
Thank you!

Joseph and Richard,  could you also comment on this?

> On Jan 28, 2024, at 5:09 AM, Martin Uecker  wrote:
> 
> Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao:
>> 
>>> On Jan 26, 2024, at 3:04 AM, Martin Uecker  wrote:
>>> 
>>> 
>>> I haven't looked at the patch, but it sounds you give the result
>>> the wrong type. Then patching up all use cases instead of the
>>> type seems wrong.
>> 
>> Yes, this is for resolving a very early gimplification issue as I reported 
>> last Nov:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
>> 
>> Since no-one responded at that time, I fixed the issue by replacing the 
>> ARRAY_REF
>> With a pointer indirection:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
>> 
>> The reason for such change is:  return a flexible array member TYPE is not 
>> allowed
>> by C language (our gimplification follows this rule), so, we have to return 
>> a pointer TYPE instead. 
>> 
>> **The new internal function
>> 
>> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, 
>> ACCESS_MODE, INDEX)
>> 
>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
>> 
>> which returns the "REF_TO_OBJ" same as the 1st argument;
>> 
>> Both the return type and the type of the first argument of this function 
>> have been converted from 
>> the incomplete array type to the corresponding pointer type.
>> 
>> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the 
>> original INDEX of the ARRAY_REF was lost
>> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX 
>> for bound sanitizer instrumentation, I added
>> The 6th argument “INDEX”.
>> 
>> What’s your comment and suggestion on this solution?
> 
> I am not entirely sure but changing types in the FE seems
> problematic because this breaks language semantics. And
> then adding special code everywhere to treat it specially
> in the FE does not seem a good way forward.
> 
> If I understand correctly, returning an incomplete array 
> type is not allowed and then fails during gimplification.

Yes, this is the problem in gimplification. 

> So I would suggest to make it return a pointer to the 
> incomplete array (and not the element type)


for the following:

struct annotated {
  unsigned int size;
  int array[] __attribute__((counted_by (size)));
};

  struct annotated * p = ….
  p->array[9] = 0;

The IL for the above array reference p->array[9] is:

1. If the return type is the original incomplete array type, 

.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;

(this triggered the gimplification failure since the return type cannot be a 
complete type).

2. When the return type is changed to a pointer to the element type of the 
incomplete array, (the current patch)
Then the original array reference naturally becomes an indirect reference 
through the pointer

*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;

Since the original array reference becomes an indirect reference through the 
pointer to the element array, the INDEX info 
is mixed into the OFFSET of the indirect reference and lost, so, I added the 
6th argument to the routine .ACCESS_WITH_SIZE
to record the INDEX. 

3. With your suggestion, the return type is changed to a pointer to the 
incomplete array, 
I just tried this to change the result type :


--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, 
tree ref,
   tree counted_by_type)
 {
   gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
-  tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
+  tree result_type = build_pointer_type (TREE_TYPE (ref));

Then, I got the following FE errors:

test.c:10:11: error: invalid use of flexible array member
   10 |   p->array[9] = 0;

The reason for the error is: when the original array_ref becomes an 
indirect_ref through the pointer to the incomplete array,
During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (type) 
is invalid since the type is an incomplete array. 
As a result, the OFFSET cannot computed for the indirect_ref.

Looks like even more issues with this approach.


> but then wrap
> it with an indirection when inserting this code in the FE
> so that the full replacement has the correct type again
> (of the incomplete array).

I don’t quite understand the above, could you please explain this in more 
details? (If possible, could you please use the above small example?)
thanks.

> 
> 
> Alternatively, one could allow this during gimplification
> or add some conversion.

Allowing this in gimplification might trigger some other issues.  I guess that 
adding conversion 
in the end of the FE or in the beginning of gimplification might be better.

i.e,  in FE, still keep the original incomplete array type as the return type 
for the routine .ACCESS_WITH_SIZ

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-28 Thread Martin Uecker
Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao:
> 
> > On Jan 26, 2024, at 3:04 AM, Martin Uecker  wrote:
> > 
> > 
> > I haven't looked at the patch, but it sounds you give the result
> > the wrong type. Then patching up all use cases instead of the
> > type seems wrong.
> 
> Yes, this is for resolving a very early gimplification issue as I reported 
> last Nov:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> 
> Since no-one responded at that time, I fixed the issue by replacing the 
> ARRAY_REF
> With a pointer indirection:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
> 
> The reason for such change is:  return a flexible array member TYPE is not 
> allowed
> by C language (our gimplification follows this rule), so, we have to return a 
> pointer TYPE instead. 
> 
> **The new internal function
> 
>  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, 
> ACCESS_MODE, INDEX)
> 
> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
> 
> which returns the "REF_TO_OBJ" same as the 1st argument;
> 
> Both the return type and the type of the first argument of this function have 
> been converted from 
> the incomplete array type to the corresponding pointer type.
> 
> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the 
> original INDEX of the ARRAY_REF was lost
> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX 
> for bound sanitizer instrumentation, I added
> The 6th argument “INDEX”.
> 
> What’s your comment and suggestion on this solution?

I am not entirely sure but changing types in the FE seems
problematic because this breaks language semantics. And
then adding special code everywhere to treat it specially
in the FE does not seem a good way forward.

If I understand correctly, returning an incomplete array 
type is not allowed and then fails during gimplification.
So I would suggest to make it return a pointer to the 
incomplete array (and not the element type) but then wrap
it with an indirection when inserting this code in the FE
so that the full replacement has the correct type again
(of the incomplete array).


Alternatively, one could allow this during gimplification
or add some conversion.

Martin


> 
> Thanks.
> 
> Qing
> 
> 
> > 
> > Martin
> > 
> > 
> > Am Donnerstag, dem 25.01.2024 um 20:11 + schrieb Qing Zhao:
> > > Thanks a lot for the testing.
> > > 
> > > Yes, I can repeat the issue with the following small example:
> > > 
> > > #include 
> > > #include 
> > > #include 
> > > 
> > > #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> > > 
> > > struct untracked {
> > >   int size;
> > >   int array[] __attribute__((counted_by (size)));
> > > } *a;
> > > struct untracked * alloc_buf (int index)
> > > {
> > >  struct untracked *p;
> > >  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
> > >(offsetof (struct untracked, 
> > > array[0])
> > > + (index) * sizeof (int;
> > >  p->size = index;
> > >  return p;
> > > }
> > > 
> > > int main()
> > > {
> > >  a = alloc_buf(10);
> > > printf ("same_type is %d\n",
> > >  (__builtin_types_compatible_p(typeof (a->array), typeof 
> > > (&(a->array)[0];
> > >  return 0;
> > > }
> > > 
> > > 
> > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> > > same_type is 1
> > > 
> > > Looks like that the “typeof” operator need to be handled specially in C FE
> > > for the new internal function .ACCESS_WITH_SIZE. 
> > > 
> > > (I have specially handle the operator “offsetof” in C FE already).
> > > 
> > > Will fix this issue.
> > > 
> > > Thanks.
> > > 
> > > Qing
> > > 
> > > > On Jan 24, 2024, at 7:51 PM, Kees Cook  wrote:
> > > > 
> > > > On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote:
> > > > > This is the 4th version of the patch.
> > > > 
> > > > Thanks very much for this!
> > > > 
> > > > I tripped over an unexpected behavioral change that the Linux kernel
> > > > depends on:
> > > > 
> > > > __builtin_types_compatible_p() no longer treats an array marked with
> > > > counted_by as different from that array's decayed pointer. Specifically,
> > > > the kernel uses these macros:
> > > > 
> > > > 
> > > > /*
> > > > * Force a compilation error if condition is true, but also produce a
> > > > * result (of value 0 and type int), so the expression can be used
> > > > * e.g. in a structure initializer (or where-ever else comma expressions
> > > > * aren't permitted).
> > > > */
> > > > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> > > > 
> > > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), 
> > > > typeof(b))
> > > > 
> > > > /* &a[0] degrades to a pointer: a different type from an array */
> > > > #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), 
> > > > &(a)[0]))
> > > > 
> > > > 
> > > > This gets used in various places to make sure we

Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-26 Thread Qing Zhao


> On Jan 26, 2024, at 3:04 AM, Martin Uecker  wrote:
> 
> 
> I haven't looked at the patch, but it sounds you give the result
> the wrong type. Then patching up all use cases instead of the
> type seems wrong.

Yes, this is for resolving a very early gimplification issue as I reported last 
Nov:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html

Since no-one responded at that time, I fixed the issue by replacing the 
ARRAY_REF
With a pointer indirection:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html

The reason for such change is:  return a flexible array member TYPE is not 
allowed
by C language (our gimplification follows this rule), so, we have to return a 
pointer TYPE instead. 

**The new internal function

 .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, 
ACCESS_MODE, INDEX)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)

which returns the "REF_TO_OBJ" same as the 1st argument;

Both the return type and the type of the first argument of this function have 
been converted from 
the incomplete array type to the corresponding pointer type.

As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the 
original INDEX of the ARRAY_REF was lost
when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for 
bound sanitizer instrumentation, I added
The 6th argument “INDEX”.

What’s your comment and suggestion on this solution?

Thanks.

Qing


> 
> Martin
> 
> 
> Am Donnerstag, dem 25.01.2024 um 20:11 + schrieb Qing Zhao:
>> Thanks a lot for the testing.
>> 
>> Yes, I can repeat the issue with the following small example:
>> 
>> #include 
>> #include 
>> #include 
>> 
>> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
>> 
>> struct untracked {
>>   int size;
>>   int array[] __attribute__((counted_by (size)));
>> } *a;
>> struct untracked * alloc_buf (int index)
>> {
>>  struct untracked *p;
>>  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>>(offsetof (struct untracked, array[0])
>> + (index) * sizeof (int;
>>  p->size = index;
>>  return p;
>> }
>> 
>> int main()
>> {
>>  a = alloc_buf(10);
>> printf ("same_type is %d\n",
>>  (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0];
>>  return 0;
>> }
>> 
>> 
>> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
>> same_type is 1
>> 
>> Looks like that the “typeof” operator need to be handled specially in C FE
>> for the new internal function .ACCESS_WITH_SIZE. 
>> 
>> (I have specially handle the operator “offsetof” in C FE already).
>> 
>> Will fix this issue.
>> 
>> Thanks.
>> 
>> Qing
>> 
>>> On Jan 24, 2024, at 7:51 PM, Kees Cook  wrote:
>>> 
>>> On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote:
 This is the 4th version of the patch.
>>> 
>>> Thanks very much for this!
>>> 
>>> I tripped over an unexpected behavioral change that the Linux kernel
>>> depends on:
>>> 
>>> __builtin_types_compatible_p() no longer treats an array marked with
>>> counted_by as different from that array's decayed pointer. Specifically,
>>> the kernel uses these macros:
>>> 
>>> 
>>> /*
>>> * Force a compilation error if condition is true, but also produce a
>>> * result (of value 0 and type int), so the expression can be used
>>> * e.g. in a structure initializer (or where-ever else comma expressions
>>> * aren't permitted).
>>> */
>>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>>> 
>>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
>>> 
>>> /* &a[0] degrades to a pointer: a different type from an array */
>>> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
>>> 
>>> 
>>> This gets used in various places to make sure we're dealing with an
>>> array for a macro:
>>> 
>>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
>>> __must_be_array(arr))
>>> 
>>> 
>>> So this builds:
>>> 
>>> struct untracked {
>>>   int size;
>>>   int array[];
>>> } *a;
>>> 
>>> __must_be_array(a->array)
>>> => 0 (as expected)
>>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>>> => 0 (as expected, array vs decayed array pointer)
>>> 
>>> 
>>> But if counted_by is added, we get a build failure:
>>> 
>>> struct tracked {
>>>   int size;
>>>   int array[] __counted_by(size);
>>> } *b;
>>> 
>>> __must_be_array(b->array)
>>> => build failure (not expected)
>>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>>> => 1 (not expected, both pointers?)
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Kees Cook
>> 
> 



Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-26 Thread Martin Uecker


I haven't looked at the patch, but it sounds you give the result
the wrong type. Then patching up all use cases instead of the
type seems wrong.

Martin


Am Donnerstag, dem 25.01.2024 um 20:11 + schrieb Qing Zhao:
> Thanks a lot for the testing.
> 
> Yes, I can repeat the issue with the following small example:
> 
> #include 
> #include 
> #include 
> 
> #define MAX(a, b)  ((a) > (b) ? (a) :  (b))
> 
> struct untracked {
>int size;
>int array[] __attribute__((counted_by (size)));
> } *a;
> struct untracked * alloc_buf (int index)
> {
>   struct untracked *p;
>   p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
> (offsetof (struct untracked, array[0])
>  + (index) * sizeof (int;
>   p->size = index;
>   return p;
> }
> 
> int main()
> {
>   a = alloc_buf(10);
>  printf ("same_type is %d\n",
>   (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0];
>   return 0;
> }
> 
> 
> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
> same_type is 1
> 
> Looks like that the “typeof” operator need to be handled specially in C FE
>  for the new internal function .ACCESS_WITH_SIZE. 
> 
> (I have specially handle the operator “offsetof” in C FE already).
> 
> Will fix this issue.
> 
> Thanks.
> 
> Qing
> 
> > On Jan 24, 2024, at 7:51 PM, Kees Cook  wrote:
> > 
> > On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote:
> > > This is the 4th version of the patch.
> > 
> > Thanks very much for this!
> > 
> > I tripped over an unexpected behavioral change that the Linux kernel
> > depends on:
> > 
> > __builtin_types_compatible_p() no longer treats an array marked with
> > counted_by as different from that array's decayed pointer. Specifically,
> > the kernel uses these macros:
> > 
> > 
> > /*
> > * Force a compilation error if condition is true, but also produce a
> > * result (of value 0 and type int), so the expression can be used
> > * e.g. in a structure initializer (or where-ever else comma expressions
> > * aren't permitted).
> > */
> > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> > 
> > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
> > 
> > /* &a[0] degrades to a pointer: a different type from an array */
> > #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> > 
> > 
> > This gets used in various places to make sure we're dealing with an
> > array for a macro:
> > 
> > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
> > __must_be_array(arr))
> > 
> > 
> > So this builds:
> > 
> > struct untracked {
> >int size;
> >int array[];
> > } *a;
> > 
> > __must_be_array(a->array)
> > => 0 (as expected)
> > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
> > => 0 (as expected, array vs decayed array pointer)
> > 
> > 
> > But if counted_by is added, we get a build failure:
> > 
> > struct tracked {
> >int size;
> >int array[] __counted_by(size);
> > } *b;
> > 
> > __must_be_array(b->array)
> > => build failure (not expected)
> > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
> > => 1 (not expected, both pointers?)
> > 
> > 
> > 
> > 
> > -- 
> > Kees Cook
> 



Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-25 Thread Qing Zhao
Thanks a lot for the testing.

Yes, I can repeat the issue with the following small example:

#include 
#include 
#include 

#define MAX(a, b)  ((a) > (b) ? (a) :  (b))

struct untracked {
   int size;
   int array[] __attribute__((counted_by (size)));
} *a;
struct untracked * alloc_buf (int index)
{
  struct untracked *p;
  p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
(offsetof (struct untracked, array[0])
 + (index) * sizeof (int;
  p->size = index;
  return p;
}

int main()
{
  a = alloc_buf(10);
 printf ("same_type is %d\n",
  (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0];
  return 0;
}


/home/opc/Install/latest-d/bin/gcc -O2 btcp.c
same_type is 1

Looks like that the “typeof” operator need to be handled specially in C FE
 for the new internal function .ACCESS_WITH_SIZE. 

(I have specially handle the operator “offsetof” in C FE already).

Will fix this issue.

Thanks.

Qing

> On Jan 24, 2024, at 7:51 PM, Kees Cook  wrote:
> 
> On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote:
>> This is the 4th version of the patch.
> 
> Thanks very much for this!
> 
> I tripped over an unexpected behavioral change that the Linux kernel
> depends on:
> 
> __builtin_types_compatible_p() no longer treats an array marked with
> counted_by as different from that array's decayed pointer. Specifically,
> the kernel uses these macros:
> 
> 
> /*
> * Force a compilation error if condition is true, but also produce a
> * result (of value 0 and type int), so the expression can be used
> * e.g. in a structure initializer (or where-ever else comma expressions
> * aren't permitted).
> */
> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
> 
> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
> 
> /* &a[0] degrades to a pointer: a different type from an array */
> #define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> 
> 
> This gets used in various places to make sure we're dealing with an
> array for a macro:
> 
> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
> __must_be_array(arr))
> 
> 
> So this builds:
> 
> struct untracked {
>int size;
>int array[];
> } *a;
> 
> __must_be_array(a->array)
> => 0 (as expected)
> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
> => 0 (as expected, array vs decayed array pointer)
> 
> 
> But if counted_by is added, we get a build failure:
> 
> struct tracked {
>int size;
>int array[] __counted_by(size);
> } *b;
> 
> __must_be_array(b->array)
> => build failure (not expected)
> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
> => 1 (not expected, both pointers?)
> 
> 
> 
> 
> -- 
> Kees Cook



Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-24 Thread Kees Cook
On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote:
> This is the 4th version of the patch.

Thanks very much for this!

I tripped over an unexpected behavioral change that the Linux kernel
depends on:

__builtin_types_compatible_p() no longer treats an array marked with
counted_by as different from that array's decayed pointer. Specifically,
the kernel uses these macros:


/*
 * Force a compilation error if condition is true, but also produce a
 * result (of value 0 and type int), so the expression can be used
 * e.g. in a structure initializer (or where-ever else comma expressions
 * aren't permitted).
 */
#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))

#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))

/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a)   BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))


This gets used in various places to make sure we're dealing with an
array for a macro:

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))


So this builds:

struct untracked {
int size;
int array[];
} *a;

__must_be_array(a->array)
=> 0 (as expected)
__builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
=> 0 (as expected, array vs decayed array pointer)


But if counted_by is added, we get a build failure:

struct tracked {
int size;
int array[] __counted_by(size);
} *b;

__must_be_array(b->array)
=> build failure (not expected)
__builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
=> 1 (not expected, both pointers?)




-- 
Kees Cook


[PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-01-23 Thread Qing Zhao
Hi,

This is the 4th version of the patch.

It based on the following proposal:

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
Represent the missing dependence for the "counted_by" attribute and its 
consumers

**The summary of the proposal is:

* Add a new internal function ".ACCESS_WITH_SIZE" to carry the size information 
for every reference to a FAM field;
* In C FE, Replace every reference to a FAM field whose TYPE has the 
"counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
* In every consumer of the size information, for example, BDOS or array bound 
sanitizer, query the size information or ACCESS_MODE information from the new 
internal function;
* When expansing to RTL, replace the internal function with the actual 
reference to the FAM field;
* Some adjustment to ipa alias analysis, and other SSA passes to mitigate the 
impact to the optimizer and code generation.


**The new internal function

  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, 
ACCESS_MODE, INDEX)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)

which returns the "REF_TO_OBJ" same as the 1st argument;

Both the return type and the type of the first argument of this function have 
been converted from the incomplete array type to the corresponding pointer type.

Please see the following link for why:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html

1st argument "REF_TO_OBJ": The reference to the object;
2nd argument "REF_TO_SIZE": The reference to the size of the object,
3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE represents
   0: unknown;
   1: the number of the elements of the object type;
   2: the number of bytes;
4th argument "PRECISION_OF_SIZE": The precision of the integer that REF_TO_SIZE 
points;
5th argument "ACCESS_MODE":
  -1: Unknown access semantics
   0: none
   1: read_only
   2: write_only
   3: read_write
6th argument "INDEX": the INDEX for the original array reference.
  -1: Unknown

NOTE: The 6th Argument is added for bound sanitizer instrumentation.

** The Patch sets included:

1. Provide counted_by attribute to flexible array member field;
  which includes:
  * "counted_by" attribute documentation;
  * C FE handling of the new attribute;
syntax checking, error reporting;
  * testing cases;

2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
  which includes:
  * The definition of the new internal function .ACCESS_WITH_SIZE in 
internal-fn.def.
  * C FE converts every reference to a FAM with "counted_by" attribute to a 
call to the internal function .ACCESS_WITH_SIZE.
(build_component_ref in c_typeck.cc)
This includes the case when the object is statically allocated and 
initialized.
In order to make this working, we should update 
initializer_constant_valid_p_1 and output_constant in varasm.cc to include 
calls to .ACCESS_WITH_SIZE.

However, for the reference inside "offsetof", ignore the "counted_by" 
attribute since it's not useful at all. (c_parser_postfix_expression in 
c/c-parser.cc)

  * Convert every call to .ACCESS_WITH_SIZE to its first argument.
(expand_ACCESS_WITH_SIZE in internal-fn.cc)
  * adjust alias analysis to exclude the new internal from clobbering 
anything.
(ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in 
tree-ssa-alias.cc)
  * adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE 
when
it's LHS is eliminated as dead code.
(eliminate_unnecessary_stmts in tree-ssa-dce.cc)
  * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
get the reference from the call to .ACCESS_WITH_SIZE.
(is_access_with_size_p and get_ref_from_access_with_size in tree.cc)
  * testing cases. (for offsetof, static initialization, generation of 
calls to
.ACCESS_WITH_SIZE, code runs correctly after calls to .ACCESS_WITH_SIZE 
are
converted back)

3. Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
  which includes:
  * use the size info of the .ACCESS_WITH_SIZE for sub-object.
  * testing cases. 

4 Use the .ACCESS_WITH_SIZE in bound sanitizer
  Since the result type of the call to .ACCESS_WITH_SIZE is a pointer to
the element type. The original array_ref is converted to an 
indirect_ref,
which introduced two issues for the instrumenation of bound sanitizer:

A. The index for the original array_ref was mixed into the offset
expression for the new indirect_ref.

In order to make the instrumentation for the bound sanitizer easier, one
more argument for the function .ACCESS_WITH_SIZE is added to record this
original index for the array_ref. then later during bound 
instrumentation,
get the index from the additi