Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Okay, Based on the comments so far, I will work on the 5th version of the patch, major changes will include: 1. Change the return type of the routine .ACCESS_WITH_SIZE FROM: Pointer to the type of the element of the flexible array; TO: Pointer to the type of the flexible array; And then wrap the call with an indirection reference. 2. Adjust all other parts with this change, (this will simplify the bound sanitizer instrument); 3. Add the fixes to the kernel building failures, which include: A. The operator “typeof” cannot return correct type for a->array; (I guess that the above change 1 might automatically resolve this issue) B. The operator “&” cannot return correct address for a->array; 4. Correctly handle the case when the value of “counted-by” is zero or negative as following 4.1 . Update the counted-by doc with the following: When the counted-by field is assigned a negative integer value, the compiler will treat the value as zero. 4.2. (possibly) Adjust __bdos and array bound sanitizer to handle correctly when “counted-by” is zero. __bdos will return size 0 when counted-by is zero; Array bound sanitizer will report out-of-bound when the counted-by is zero for any array access. Let me know if I missed anything. Thanks a lot for all the comments Qing > On Jan 23, 2024, at 7:29 PM, Qing Zhao wrote: > > Hi, > > This is the 4th version of the patch. > > It based on the following proposal: > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html > Represent the missing dependence for the "counted_by" attribute and its > consumers > > **The summary of the proposal is: > > * Add a new internal function ".ACCESS_WITH_SIZE" to carry the size > information for every reference to a FAM field; > * In C FE, Replace every reference to a FAM field whose TYPE has the > "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE"; > * In every consumer of the size information, for example, BDOS or array bound > sanitizer, query the size information or ACCESS_MODE information from the new > internal function; > * When expansing to RTL, replace the internal function with the actual > reference to the FAM field; > * Some adjustment to ipa alias analysis, and other SSA passes to mitigate the > impact to the optimizer and code generation. > > > **The new internal function > > .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, > ACCESS_MODE, INDEX) > > INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) > > which returns the "REF_TO_OBJ" same as the 1st argument; > > Both the return type and the type of the first argument of this function have > been converted from the incomplete array type to the corresponding pointer > type. > > Please see the following link for why: > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html > > 1st argument "REF_TO_OBJ": The reference to the object; > 2nd argument "REF_TO_SIZE": The reference to the size of the object, > 3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE > represents > 0: unknown; > 1: the number of the elements of the object type; > 2: the number of bytes; > 4th argument "PRECISION_OF_SIZE": The precision of the integer that > REF_TO_SIZE points; > 5th argument "ACCESS_MODE": > -1: Unknown access semantics > 0: none > 1: read_only > 2: write_only > 3: read_write > 6th argument "INDEX": the INDEX for the original array reference. > -1: Unknown > > NOTE: The 6th Argument is added for bound sanitizer instrumentation. > > ** The Patch sets included: > > 1. Provide counted_by attribute to flexible array member field; > which includes: > * "counted_by" attribute documentation; > * C FE handling of the new attribute; >syntax checking, error reporting; > * testing cases; > > 2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE. > which includes: > * The definition of the new internal function .ACCESS_WITH_SIZE in > internal-fn.def. > * C FE converts every reference to a FAM with "counted_by" attribute to > a call to the internal function .ACCESS_WITH_SIZE. >(build_component_ref in c_typeck.cc) >This includes the case when the object is statically allocated and > initialized. >In order to make this working, we should update > initializer_constant_valid_p_1 and output_constant in varasm.cc to include > calls to .ACCESS_WITH_SIZE. > >However, for the reference inside "offsetof", ignore the "counted_by" > attribute since it's not useful at all. (c_parser_postfix_expression in > c/c-parser.cc) > > * Convert every call to .ACCESS_WITH_SIZE to its first argument. >(expand_ACCESS_WITH_SIZE in internal-fn.cc) > * adjust alias analysis to exclude the new in
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
> On Jan 30, 2024, at 12:41 AM, Kees Cook wrote: > > On Mon, Jan 29, 2024 at 10:45:23PM +, Qing Zhao wrote: >> There are two things here. >> >> 1. The value of the “counted-by” is 0; (which is easy to be understood) >> 2. The result of the _builtin_object_size when see a “counted-by” 0. >> >> For 1, it’s simple, if we see a counted-by value <= 0, then counted-by is 0; > > Okay, that's good; this matches my understanding. :) > >> But for 2, when the _builtin_object_size sees a “counted-by” 0, what’s value >> it will return for the object size? >> >> Can we return 0 for the object size? > > I don't see why not. For example: > > // -O2 -fstrict-flex-arrays=3 > struct s { >int a; >int b[4]; > } foo; > > #define report(x) printf("%s: %zu\n", #x, (size_t)(x)) > > int main(int argc, char *argv[]) > { >struct s foo; >report(__builtin_dynamic_object_size(&foo.b[4], 0)); >report(__builtin_dynamic_object_size(&foo.b[5], 0)); >report(__builtin_dynamic_object_size(&foo.b[-10], 0)); >report(__builtin_dynamic_object_size(&foo.b[4], 1)); >report(__builtin_dynamic_object_size(&foo.b[5], 1)); >report(__builtin_dynamic_object_size(&foo.b[-10], 1)); >report(__builtin_dynamic_object_size(&foo.b[4], 2)); >report(__builtin_dynamic_object_size(&foo.b[5], 2)); >report(__builtin_dynamic_object_size(&foo.b[-10], 2)); >report(__builtin_dynamic_object_size(&foo.b[4], 3)); >report(__builtin_dynamic_object_size(&foo.b[5], 3)); >report(__builtin_dynamic_object_size(&foo.b[-10], 3)); >return 0; > } > > shows: > > __builtin_dynamic_object_size(&foo.b[4], 0): 0 > __builtin_dynamic_object_size(&foo.b[5], 0): 0 > __builtin_dynamic_object_size(&foo.b[-10], 0): 0 > __builtin_dynamic_object_size(&foo.b[4], 1): 0 > __builtin_dynamic_object_size(&foo.b[5], 1): 0 > __builtin_dynamic_object_size(&foo.b[-10], 1): 0 > __builtin_dynamic_object_size(&foo.b[4], 2): 0 > __builtin_dynamic_object_size(&foo.b[5], 2): 0 > __builtin_dynamic_object_size(&foo.b[-10], 2): 0 > __builtin_dynamic_object_size(&foo.b[4], 3): 0 > __builtin_dynamic_object_size(&foo.b[5], 3): 0 > __builtin_dynamic_object_size(&foo.b[-10], 3): 0 > > This is showing "no bytes left" for the end of the b array, and if this > index keeps going, it still reports 0 if we're past the end of the object > completely. And it is similarly capped for negative indexes. This is > true for all the __bos type bits. > > A "counted-by" of 0 (or below) would have the same meaning as an out of > bounds index here. Okay. I will keep this behavior when counted-by is zero (and negative) for __bos. > >> (As I mentioned in the previous email, 0 in __builtin_object_size doesn’t >> mean size 0, >> it means UNKNOWN_SIZE when the type is 2/3, So, what’s value we should >> return for the size 0?) >> https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html > > I think I see what you mean, but I still think it should be 0 for 2/3, > regardless of the documented interpretation. If that's the current > response for a pathological index under 2/3, then I think it's totally > reasonable that it should do the same for pathological bounds. Okay, will keep this behavior for “counted-by” zero. (But still feel that 0 for 2/3, i.e the MINIMUM size will represent as UNKNOWN_SIZE. If that’s the value kernel expected, that’s good) > > > And BTW, it seems there are 0-sized objects, though maybe they're some > kind of special case: > > struct s { >int a; >struct { } nothing; >int b; > }; > > #define report(x) printf("%s: %zu\n", #x, (size_t)(x)) > > int main(int argc, char *argv[]) > { >struct s foo; >report(__builtin_dynamic_object_size(&foo.nothing, 1)); > } > > shows: > > __builtin_dynamic_object_size(&foo.nothing, 1): 0 Looks like that GCC has such extension: https://gcc.gnu.org/onlinedocs/gcc/Empty-Structures.html ***GCC permits a C structure to have no members: struct empty { }; The structure has size zero. In C++, empty structures are part of the language. G++ treats empty structures as if they had a single member of type char. Thanks. Qing > > -Kees > > -- > Kees Cook
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On Mon, Jan 29, 2024 at 10:45:23PM +, Qing Zhao wrote: > There are two things here. > > 1. The value of the “counted-by” is 0; (which is easy to be understood) > 2. The result of the _builtin_object_size when see a “counted-by” 0. > > For 1, it’s simple, if we see a counted-by value <= 0, then counted-by is 0; Okay, that's good; this matches my understanding. :) > But for 2, when the _builtin_object_size sees a “counted-by” 0, what’s value > it will return for the object size? > > Can we return 0 for the object size? I don't see why not. For example: // -O2 -fstrict-flex-arrays=3 struct s { int a; int b[4]; } foo; #define report(x) printf("%s: %zu\n", #x, (size_t)(x)) int main(int argc, char *argv[]) { struct s foo; report(__builtin_dynamic_object_size(&foo.b[4], 0)); report(__builtin_dynamic_object_size(&foo.b[5], 0)); report(__builtin_dynamic_object_size(&foo.b[-10], 0)); report(__builtin_dynamic_object_size(&foo.b[4], 1)); report(__builtin_dynamic_object_size(&foo.b[5], 1)); report(__builtin_dynamic_object_size(&foo.b[-10], 1)); report(__builtin_dynamic_object_size(&foo.b[4], 2)); report(__builtin_dynamic_object_size(&foo.b[5], 2)); report(__builtin_dynamic_object_size(&foo.b[-10], 2)); report(__builtin_dynamic_object_size(&foo.b[4], 3)); report(__builtin_dynamic_object_size(&foo.b[5], 3)); report(__builtin_dynamic_object_size(&foo.b[-10], 3)); return 0; } shows: __builtin_dynamic_object_size(&foo.b[4], 0): 0 __builtin_dynamic_object_size(&foo.b[5], 0): 0 __builtin_dynamic_object_size(&foo.b[-10], 0): 0 __builtin_dynamic_object_size(&foo.b[4], 1): 0 __builtin_dynamic_object_size(&foo.b[5], 1): 0 __builtin_dynamic_object_size(&foo.b[-10], 1): 0 __builtin_dynamic_object_size(&foo.b[4], 2): 0 __builtin_dynamic_object_size(&foo.b[5], 2): 0 __builtin_dynamic_object_size(&foo.b[-10], 2): 0 __builtin_dynamic_object_size(&foo.b[4], 3): 0 __builtin_dynamic_object_size(&foo.b[5], 3): 0 __builtin_dynamic_object_size(&foo.b[-10], 3): 0 This is showing "no bytes left" for the end of the b array, and if this index keeps going, it still reports 0 if we're past the end of the object completely. And it is similarly capped for negative indexes. This is true for all the __bos type bits. A "counted-by" of 0 (or below) would have the same meaning as an out of bounds index here. > (As I mentioned in the previous email, 0 in __builtin_object_size doesn’t > mean size 0, > it means UNKNOWN_SIZE when the type is 2/3, So, what’s value we should > return for the size 0?) > https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html I think I see what you mean, but I still think it should be 0 for 2/3, regardless of the documented interpretation. If that's the current response for a pathological index under 2/3, then I think it's totally reasonable that it should do the same for pathological bounds. And BTW, it seems there are 0-sized objects, though maybe they're some kind of special case: struct s { int a; struct { } nothing; int b; }; #define report(x) printf("%s: %zu\n", #x, (size_t)(x)) int main(int argc, char *argv[]) { struct s foo; report(__builtin_dynamic_object_size(&foo.nothing, 1)); } shows: __builtin_dynamic_object_size(&foo.nothing, 1): 0 -Kees -- Kees Cook
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
> On Jan 29, 2024, at 3:19 PM, Kees Cook wrote: > > On Mon, Jan 29, 2024 at 07:32:06PM +, Qing Zhao wrote: >> >> >>> On Jan 29, 2024, at 12:25 PM, Kees Cook wrote: >>> >>> On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote: An update on the kernel building with my version 4 patch. Kees reported two FE issues with the current version 4 patch: 1. The operator “typeof” cannot return correct type for a->array; 2. The operator “&” cannot return correct address for a->array; I fixed both in my local repository. With these additional fix. Kernel with counted-by annotation can be built successfully. >>> >>> Thanks for the fixes! >>> And then, Kees reported one behavioral issue with the current counted-by: When the counted-by value is below zero, my current patch A. Didn’t report any warning for it. B. Accepted the negative value as a wrapped size. i.e. for: struct foo { signed char size; unsigned char array[] __counted_by(size); } *a; ... a->size = -3; report(__builtin_dynamic_object_size(p->array, 1)); this reports 253, rather than 0. And the array-bounds sanitizer doesn’t catch negative index bounds neither. a->size = -3; report(a->array[1]); // does not trap So, my questions are: How should we handle the negative counted-by value? >>> >>> Treat it as always 0-bounded: count < 0 ? 0 : count >> >> Then the size of the object is 0? > > That would be the purpose, yes. It's possible something else has > happened, but it would mean "the array contents should not be accessed > (since we don't have a valid size)". This might be a new concept we need to add, from my understanding, C/C++ don’t have the zero-sized object. So, I am a little worried about where should we add this concept? The most reasonable place I am thinking is adding such concept to the doc of “counted-by” attribute, but still not very sure on this. > >> >>> My approach is: I think that this is a user error, the compiler need to Issue warning during runtime about this user error. Since I have one remaining patch that has not been finished yet: 6 Emit warnings when the user breaks the requirments for the new counted_by attribute compilation time: -Wcounted-by run time: -fsanitizer=counted-by * The initialization to the size field should be done before the first reference to the FAM field. >>> >>> I would hope that regular compile-time warnings would catch this. >> If the value is known at compile-time, then compile-time should catch it. >> >>> * the array has at least # of elements specified by the size field all the time during the program. * the value of counted-by should not be negative. >>> >>> This seems reasonable for a very strict program, but it won't work for >>> the kernel as-is: a negative "count" is sometimes used to carry failure >>> details back to other users of the structure. This could be refactored in >>> the kernel, but I'd prefer that even without -fsanitizer=counted-by the >>> runtime behaviors will be "safe". >> >> So, In the kernel’s source code, for example: >> >> struct foo { >> int count; >> short array[] __counted_by(count); >> }; >> >> The field “count” will be used for two purposes: >> A. As the counted_by for the “array” when its value > 0; >> B. As an errno when its value < 0; under such condition, the size of >> “array” is zero. >> >> Is the understanding correct? > > Yes. > >> Is doing this for saving space? (Curious -:) > > It seems so, yes. > >>> It does not seem sensible to me that adding a buffer size validation >>> primitive to GCC will result in conditions where a size calculation >>> will wrap around. I prefer no surprises. :) >> >> Might be a bug here. I guess. >>> Let me know your comment and suggestions. >>> >>> Clang has implemented the safety logic I'd prefer: >>> >>> * __bdos will report 0 for any sizing where the "counted_by" count >>> variable is negative. Effectively, the count variable is always >>> processed as: count < 0 ? 0 : count >>> >>> struct foo { >>> int count; >>> short array[] __counted_by(count); >>> } *p; >>> >>> __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count) >> >> NOTE, __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e: >> >> size_t __builtin_object_size (const void * ptr, int type) >> >> Will return 0 as UNKNOW_SIZE when type= 2 or 3. >> >> So, I am wondering: should the 0 here is UNKNOWN_SIZE or 0 size? >> >> I guess should be the UNKNOWN_SIZE? (I,e, -1 for MAXIMUM type, 0 for >> MINIMUM type). >> >> i.e, when the value of “count” is 0 or negative, the __bdos will return >> UNKNOWN_SIZE. Is this correct? > > I would suggest that a negative count
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
> On Jan 29, 2024, at 3:35 PM, Joseph Myers wrote: > > On Mon, 29 Jan 2024, Qing Zhao wrote: > >> Thank you! >> >> Joseph and Richard, could you also comment on this? > > I think Martin's suggestions are reasonable. Okay, I will update the patches based on this approach. Thanks a lot for the comment. Qing > > -- > Joseph S. Myers > josmy...@redhat.com >
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On Mon, 29 Jan 2024, Qing Zhao wrote: > Thank you! > > Joseph and Richard, could you also comment on this? I think Martin's suggestions are reasonable. -- Joseph S. Myers josmy...@redhat.com
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On Mon, Jan 29, 2024 at 07:32:06PM +, Qing Zhao wrote: > > > > On Jan 29, 2024, at 12:25 PM, Kees Cook wrote: > > > > On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote: > >> An update on the kernel building with my version 4 patch. > >> > >> Kees reported two FE issues with the current version 4 patch: > >> > >> 1. The operator “typeof” cannot return correct type for a->array; > >> 2. The operator “&” cannot return correct address for a->array; > >> > >> I fixed both in my local repository. > >> > >> With these additional fix. Kernel with counted-by annotation can be built > >> successfully. > > > > Thanks for the fixes! > > > >> > >> And then, Kees reported one behavioral issue with the current counted-by: > >> > >> When the counted-by value is below zero, my current patch > >> > >> A. Didn’t report any warning for it. > >> B. Accepted the negative value as a wrapped size. > >> > >> i.e. for: > >> > >> struct foo { > >> signed char size; > >> unsigned char array[] __counted_by(size); > >> } *a; > >> > >> ... > >> a->size = -3; > >> report(__builtin_dynamic_object_size(p->array, 1)); > >> > >> this reports 253, rather than 0. > >> > >> And the array-bounds sanitizer doesn’t catch negative index bounds > >> neither. > >> > >> a->size = -3; > >> report(a->array[1]); // does not trap > >> > >> > >> So, my questions are: > >> > >> How should we handle the negative counted-by value? > > > > Treat it as always 0-bounded: count < 0 ? 0 : count > > Then the size of the object is 0? That would be the purpose, yes. It's possible something else has happened, but it would mean "the array contents should not be accessed (since we don't have a valid size)". > > > > >> > >> My approach is: > >> > >> I think that this is a user error, the compiler need to Issue warning > >> during runtime about this user error. > >> > >> Since I have one remaining patch that has not been finished yet: > >> > >> 6 Emit warnings when the user breaks the requirments for the new > >> counted_by attribute > >> compilation time: -Wcounted-by > >> run time: -fsanitizer=counted-by > >> * The initialization to the size field should be done before the first > >> reference to the FAM field. > > > > I would hope that regular compile-time warnings would catch this. > If the value is known at compile-time, then compile-time should catch it. > > > > >> * the array has at least # of elements specified by the size field all > >> the time during the program. > >> * the value of counted-by should not be negative. > > > > This seems reasonable for a very strict program, but it won't work for > > the kernel as-is: a negative "count" is sometimes used to carry failure > > details back to other users of the structure. This could be refactored in > > the kernel, but I'd prefer that even without -fsanitizer=counted-by the > > runtime behaviors will be "safe". > > So, In the kernel’s source code, for example: > > struct foo { > int count; > short array[] __counted_by(count); > }; > > The field “count” will be used for two purposes: > A. As the counted_by for the “array” when its value > 0; > B. As an errno when its value < 0; under such condition, the size of “array” > is zero. > > Is the understanding correct? Yes. > Is doing this for saving space? (Curious -:) It seems so, yes. > > It does not seem sensible to me that adding a buffer size validation > > primitive to GCC will result in conditions where a size calculation > > will wrap around. I prefer no surprises. :) > > Might be a bug here. I guess. > > > >> Let me know your comment and suggestions. > > > > Clang has implemented the safety logic I'd prefer: > > > > * __bdos will report 0 for any sizing where the "counted_by" count > > variable is negative. Effectively, the count variable is always > > processed as: count < 0 ? 0 : count > > > > struct foo { > > int count; > > short array[] __counted_by(count); > > } *p; > > > > __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count) > > NOTE, __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e: > > size_t __builtin_object_size (const void * ptr, int type) > > Will return 0 as UNKNOW_SIZE when type= 2 or 3. > > So, I am wondering: should the 0 here is UNKNOWN_SIZE or 0 size? > > I guess should be the UNKNOWN_SIZE? (I,e, -1 for MAXIMUM type, 0 for > MINIMUM type). > > i.e, when the value of “count” is 0 or negative, the __bdos will return > UNKNOWN_SIZE. Is this correct? I would suggest that a negative count should always return 0. The size isn't "unknown", the "count" has been clamped to 0 to avoid surprises, so the result is as if the "count" had a zero value. > Okay, when the value of “count” is 0 or negative, bound sanitizer will report > out-of-bound (or trap) for any access to the array. > This should be reasonable. Thanks! And with __bdos() following this logic there won't be a disconnect between the two. i.e
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
> On Jan 29, 2024, at 12:25 PM, Kees Cook wrote: > > On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote: >> An update on the kernel building with my version 4 patch. >> >> Kees reported two FE issues with the current version 4 patch: >> >> 1. The operator “typeof” cannot return correct type for a->array; >> 2. The operator “&” cannot return correct address for a->array; >> >> I fixed both in my local repository. >> >> With these additional fix. Kernel with counted-by annotation can be built >> successfully. > > Thanks for the fixes! > >> >> And then, Kees reported one behavioral issue with the current counted-by: >> >> When the counted-by value is below zero, my current patch >> >> A. Didn’t report any warning for it. >> B. Accepted the negative value as a wrapped size. >> >> i.e. for: >> >> struct foo { >> signed char size; >> unsigned char array[] __counted_by(size); >> } *a; >> >> ... >> a->size = -3; >> report(__builtin_dynamic_object_size(p->array, 1)); >> >> this reports 253, rather than 0. >> >> And the array-bounds sanitizer doesn’t catch negative index bounds neither. >> >> a->size = -3; >> report(a->array[1]); // does not trap >> >> >> So, my questions are: >> >> How should we handle the negative counted-by value? > > Treat it as always 0-bounded: count < 0 ? 0 : count Then the size of the object is 0? > >> >> My approach is: >> >> I think that this is a user error, the compiler need to Issue warning >> during runtime about this user error. >> >> Since I have one remaining patch that has not been finished yet: >> >> 6 Emit warnings when the user breaks the requirments for the new counted_by >> attribute >> compilation time: -Wcounted-by >> run time: -fsanitizer=counted-by >> * The initialization to the size field should be done before the first >> reference to the FAM field. > > I would hope that regular compile-time warnings would catch this. If the value is known at compile-time, then compile-time should catch it. > >> * the array has at least # of elements specified by the size field all >> the time during the program. >> * the value of counted-by should not be negative. > > This seems reasonable for a very strict program, but it won't work for > the kernel as-is: a negative "count" is sometimes used to carry failure > details back to other users of the structure. This could be refactored in > the kernel, but I'd prefer that even without -fsanitizer=counted-by the > runtime behaviors will be "safe". So, In the kernel’s source code, for example: struct foo { int count; short array[] __counted_by(count); }; The field “count” will be used for two purposes: A. As the counted_by for the “array” when its value > 0; B. As an errno when its value < 0; under such condition, the size of “array” is zero. Is the understanding correct? Is doing this for saving space? (Curious -:) > > It does not seem sensible to me that adding a buffer size validation > primitive to GCC will result in conditions where a size calculation > will wrap around. I prefer no surprises. :) Might be a bug here. I guess. > >> Let me know your comment and suggestions. > > Clang has implemented the safety logic I'd prefer: > > * __bdos will report 0 for any sizing where the "counted_by" count > variable is negative. Effectively, the count variable is always > processed as: count < 0 ? 0 : count > > struct foo { > int count; > short array[] __counted_by(count); > } *p; > > __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count) NOTE, __bdo will use value 0 as UNKNOWN_SIZE for MINMUM SIZE query, i.e: size_t __builtin_object_size (const void * ptr, int type) Will return 0 as UNKNOW_SIZE when type= 2 or 3. So, I am wondering: should the 0 here is UNKNOWN_SIZE or 0 size? I guess should be the UNKNOWN_SIZE? (I,e, -1 for MAXIMUM type, 0 for MINIMUM type). i.e, when the value of “count” is 0 or negative, the __bdos will return UNKNOWN_SIZE. Is this correct? > > The logic for this is that __bdos can be _certain_ that the size is 0 > when the count variable is pathological. > > * -fsanitize=array-bounds similarly treats count as above, so that: > > printf("%d\n", p->array[index]); ==> trap when index > (count < 0 ? 0 : > count) > > Same logic for the sanitizer: any access to the array when count is > invalid means the access is invalid and must be trapped. Okay, when the value of “count” is 0 or negative, bound sanitizer will report out-of-bound (or trap) for any access to the array. This should be reasonable. Qing > > > This means that software can run safely even in pathological conditions. > > -Kees > > -- > Kees Cook
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On Mon, Jan 29, 2024 at 04:00:20PM +, Qing Zhao wrote: > An update on the kernel building with my version 4 patch. > > Kees reported two FE issues with the current version 4 patch: > > 1. The operator “typeof” cannot return correct type for a->array; > 2. The operator “&” cannot return correct address for a->array; > > I fixed both in my local repository. > > With these additional fix. Kernel with counted-by annotation can be built > successfully. Thanks for the fixes! > > And then, Kees reported one behavioral issue with the current counted-by: > > When the counted-by value is below zero, my current patch > > A. Didn’t report any warning for it. > B. Accepted the negative value as a wrapped size. > > i.e. for: > > struct foo { > signed char size; > unsigned char array[] __counted_by(size); > } *a; > > ... > a->size = -3; > report(__builtin_dynamic_object_size(p->array, 1)); > > this reports 253, rather than 0. > > And the array-bounds sanitizer doesn’t catch negative index bounds neither. > > a->size = -3; > report(a->array[1]); // does not trap > > > So, my questions are: > > How should we handle the negative counted-by value? Treat it as always 0-bounded: count < 0 ? 0 : count > > My approach is: > >I think that this is a user error, the compiler need to Issue warning > during runtime about this user error. > > Since I have one remaining patch that has not been finished yet: > > 6 Emit warnings when the user breaks the requirments for the new counted_by > attribute > compilation time: -Wcounted-by > run time: -fsanitizer=counted-by > * The initialization to the size field should be done before the first > reference to the FAM field. I would hope that regular compile-time warnings would catch this. > * the array has at least # of elements specified by the size field all > the time during the program. > * the value of counted-by should not be negative. This seems reasonable for a very strict program, but it won't work for the kernel as-is: a negative "count" is sometimes used to carry failure details back to other users of the structure. This could be refactored in the kernel, but I'd prefer that even without -fsanitizer=counted-by the runtime behaviors will be "safe". It does not seem sensible to me that adding a buffer size validation primitive to GCC will result in conditions where a size calculation will wrap around. I prefer no surprises. :) > Let me know your comment and suggestions. Clang has implemented the safety logic I'd prefer: * __bdos will report 0 for any sizing where the "counted_by" count variable is negative. Effectively, the count variable is always processed as: count < 0 ? 0 : count struct foo { int count; short array[] __counted_by(count); } *p; __bdos(p->array, 1) ==> sizeof(*p->array) * (count < 0 ? 0 : count) The logic for this is that __bdos can be _certain_ that the size is 0 when the count variable is pathological. * -fsanitize=array-bounds similarly treats count as above, so that: printf("%d\n", p->array[index]); ==> trap when index > (count < 0 ? 0 : count) Same logic for the sanitizer: any access to the array when count is invalid means the access is invalid and must be trapped. This means that software can run safely even in pathological conditions. -Kees -- Kees Cook
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
> On Jan 29, 2024, at 10:50 AM, Martin Uecker wrote: > > Am Montag, dem 29.01.2024 um 15:09 + schrieb Qing Zhao: >> Thank you! >> >> Joseph and Richard, could you also comment on this? >> >>> On Jan 28, 2024, at 5:09 AM, Martin Uecker wrote: >>> >>> Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao: > On Jan 26, 2024, at 3:04 AM, Martin Uecker wrote: > > > I haven't looked at the patch, but it sounds you give the result > the wrong type. Then patching up all use cases instead of the > type seems wrong. Yes, this is for resolving a very early gimplification issue as I reported last Nov: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF With a pointer indirection: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html The reason for such change is: return a flexible array member TYPE is not allowed by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. **The new internal function .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX) INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) which returns the "REF_TO_OBJ" same as the 1st argument; Both the return type and the type of the first argument of this function have been converted from the incomplete array type to the corresponding pointer type. As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added The 6th argument “INDEX”. What’s your comment and suggestion on this solution? >>> >>> I am not entirely sure but changing types in the FE seems >>> problematic because this breaks language semantics. And >>> then adding special code everywhere to treat it specially >>> in the FE does not seem a good way forward. >>> >>> If I understand correctly, returning an incomplete array >>> type is not allowed and then fails during gimplification. >> >> Yes, this is the problem in gimplification. >> >>> So I would suggest to make it return a pointer to the >>> incomplete array (and not the element type) >> >> >> for the following: >> >> struct annotated { >> unsigned int size; >> int array[] __attribute__((counted_by (size))); >> }; >> >> struct annotated * p = …. >> p->array[9] = 0; >> >> The IL for the above array reference p->array[9] is: >> >> 1. If the return type is the original incomplete array type, >> >> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0; >> >> (this triggered the gimplification failure since the return type cannot be a >> complete type). >> >> 2. When the return type is changed to a pointer to the element type of the >> incomplete array, (the current patch) >> Then the original array reference naturally becomes an indirect reference >> through the pointer >> >> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0; >> >> Since the original array reference becomes an indirect reference through the >> pointer to the element array, the INDEX info >> is mixed into the OFFSET of the indirect reference and lost, so, I added the >> 6th argument to the routine .ACCESS_WITH_SIZE >> to record the INDEX. >> >> 3. With your suggestion, the return type is changed to a pointer to the >> incomplete array, >> I just tried this to change the result type : >> >> >> --- a/gcc/c/c-typeck.cc >> +++ b/gcc/c/c-typeck.cc >> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, >> tree ref, >> tree counted_by_type) >> { >> gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))); >> - tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref))); >> + tree result_type = build_pointer_type (TREE_TYPE (ref)); >> >> Then, I got the following FE errors: >> >> test.c:10:11: error: invalid use of flexible array member >> 10 | p->array[9] = 0; >> >> The reason for the error is: when the original array_ref becomes an >> indirect_ref through the pointer to the incomplete array, >> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT >> (type) is invalid since the type is an incomplete array. >> As a result, the OFFSET cannot computed for the indirect_ref. >> >> Looks like even more issues with this approach. > > Yes, but only because the following is missing: > >> >> >>> but then wrap >>> it with an indirection when inserting this code in the FE >>> so that the full replacement has the correct type again >>> (of the incomplete array). >> >> I don’t quite understand the ab
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
An update on the kernel building with my version 4 patch. Kees reported two FE issues with the current version 4 patch: 1. The operator “typeof” cannot return correct type for a->array; 2. The operator “&” cannot return correct address for a->array; I fixed both in my local repository. With these additional fix. Kernel with counted-by annotation can be built successfully. And then, Kees reported one behavioral issue with the current counted-by: When the counted-by value is below zero, my current patch A. Didn’t report any warning for it. B. Accepted the negative value as a wrapped size. i.e. for: struct foo { signed char size; unsigned char array[] __counted_by(size); } *a; ... a->size = -3; report(__builtin_dynamic_object_size(p->array, 1)); this reports 253, rather than 0. And the array-bounds sanitizer doesn’t catch negative index bounds neither. a->size = -3; report(a->array[1]); // does not trap So, my questions are: How should we handle the negative counted-by value? My approach is: I think that this is a user error, the compiler need to Issue warning during runtime about this user error. Since I have one remaining patch that has not been finished yet: 6 Emit warnings when the user breaks the requirments for the new counted_by attribute compilation time: -Wcounted-by run time: -fsanitizer=counted-by * The initialization to the size field should be done before the first reference to the FAM field. * the array has at least # of elements specified by the size field all the time during the program. * the value of counted-by should not be negative. Let me know your comment and suggestions. Thanks Qing > On Jan 25, 2024, at 3:11 PM, Qing Zhao wrote: > > Thanks a lot for the testing. > > Yes, I can repeat the issue with the following small example: > > #include > #include > #include > > #define MAX(a, b) ((a) > (b) ? (a) : (b)) > > struct untracked { > int size; > int array[] __attribute__((counted_by (size))); > } *a; > struct untracked * alloc_buf (int index) > { > struct untracked *p; > p = (struct untracked *) malloc (MAX (sizeof (struct untracked), >(offsetof (struct untracked, array[0]) > + (index) * sizeof (int; > p->size = index; > return p; > } > > int main() > { > a = alloc_buf(10); > printf ("same_type is %d\n", > (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]; > return 0; > } > > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c > same_type is 1 > > Looks like that the “typeof” operator need to be handled specially in C FE > for the new internal function .ACCESS_WITH_SIZE. > > (I have specially handle the operator “offsetof” in C FE already). > > Will fix this issue. > > Thanks. > > Qing > >> On Jan 24, 2024, at 7:51 PM, Kees Cook wrote: >> >> On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote: >>> This is the 4th version of the patch. >> >> Thanks very much for this! >> >> I tripped over an unexpected behavioral change that the Linux kernel >> depends on: >> >> __builtin_types_compatible_p() no longer treats an array marked with >> counted_by as different from that array's decayed pointer. Specifically, >> the kernel uses these macros: >> >> >> /* >> * Force a compilation error if condition is true, but also produce a >> * result (of value 0 and type int), so the expression can be used >> * e.g. in a structure initializer (or where-ever else comma expressions >> * aren't permitted). >> */ >> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) >> >> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) >> >> /* &a[0] degrades to a pointer: a different type from an array */ >> #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) >> >> >> This gets used in various places to make sure we're dealing with an >> array for a macro: >> >> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + >> __must_be_array(arr)) >> >> >> So this builds: >> >> struct untracked { >> int size; >> int array[]; >> } *a; >> >> __must_be_array(a->array) >> => 0 (as expected) >> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0])) >> => 0 (as expected, array vs decayed array pointer) >> >> >> But if counted_by is added, we get a build failure: >> >> struct tracked { >> int size; >> int array[] __counted_by(size); >> } *b; >> >> __must_be_array(b->array) >> => build failure (not expected) >> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0])) >> => 1 (not expected, both pointers?) >> >> >> >> >> -- >> Kees Cook >
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Am Montag, dem 29.01.2024 um 15:09 + schrieb Qing Zhao: > Thank you! > > Joseph and Richard, could you also comment on this? > > > On Jan 28, 2024, at 5:09 AM, Martin Uecker wrote: > > > > Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao: > > > > > > > On Jan 26, 2024, at 3:04 AM, Martin Uecker wrote: > > > > > > > > > > > > I haven't looked at the patch, but it sounds you give the result > > > > the wrong type. Then patching up all use cases instead of the > > > > type seems wrong. > > > > > > Yes, this is for resolving a very early gimplification issue as I > > > reported last Nov: > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html > > > > > > Since no-one responded at that time, I fixed the issue by replacing the > > > ARRAY_REF > > > With a pointer indirection: > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html > > > > > > The reason for such change is: return a flexible array member TYPE is > > > not allowed > > > by C language (our gimplification follows this rule), so, we have to > > > return a pointer TYPE instead. > > > > > > **The new internal function > > > > > > .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, > > > ACCESS_MODE, INDEX) > > > > > > INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) > > > > > > which returns the "REF_TO_OBJ" same as the 1st argument; > > > > > > Both the return type and the type of the first argument of this function > > > have been converted from > > > the incomplete array type to the corresponding pointer type. > > > > > > As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the > > > original INDEX of the ARRAY_REF was lost > > > when converting from ARRAY_REF to INDIRECT_REF, in order to keep the > > > INDEX for bound sanitizer instrumentation, I added > > > The 6th argument “INDEX”. > > > > > > What’s your comment and suggestion on this solution? > > > > I am not entirely sure but changing types in the FE seems > > problematic because this breaks language semantics. And > > then adding special code everywhere to treat it specially > > in the FE does not seem a good way forward. > > > > If I understand correctly, returning an incomplete array > > type is not allowed and then fails during gimplification. > > Yes, this is the problem in gimplification. > > > So I would suggest to make it return a pointer to the > > incomplete array (and not the element type) > > > for the following: > > struct annotated { > unsigned int size; > int array[] __attribute__((counted_by (size))); > }; > > struct annotated * p = …. > p->array[9] = 0; > > The IL for the above array reference p->array[9] is: > > 1. If the return type is the original incomplete array type, > > .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0; > > (this triggered the gimplification failure since the return type cannot be a > complete type). > > 2. When the return type is changed to a pointer to the element type of the > incomplete array, (the current patch) > Then the original array reference naturally becomes an indirect reference > through the pointer > > *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0; > > Since the original array reference becomes an indirect reference through the > pointer to the element array, the INDEX info > is mixed into the OFFSET of the indirect reference and lost, so, I added the > 6th argument to the routine .ACCESS_WITH_SIZE > to record the INDEX. > > 3. With your suggestion, the return type is changed to a pointer to the > incomplete array, > I just tried this to change the result type : > > > --- a/gcc/c/c-typeck.cc > +++ b/gcc/c/c-typeck.cc > @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, > tree ref, >tree counted_by_type) > { >gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))); > - tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref))); > + tree result_type = build_pointer_type (TREE_TYPE (ref)); > > Then, I got the following FE errors: > > test.c:10:11: error: invalid use of flexible array member >10 | p->array[9] = 0; > > The reason for the error is: when the original array_ref becomes an > indirect_ref through the pointer to the incomplete array, > During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT > (type) is invalid since the type is an incomplete array. > As a result, the OFFSET cannot computed for the indirect_ref. > > Looks like even more issues with this approach. Yes, but only because the following is missing: > > > > but then wrap > > it with an indirection when inserting this code in the FE > > so that the full replacement has the correct type again > > (of the incomplete array). > > I don’t quite understand the above, could you please explain this in more > details? (If possible, could you plea
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Thank you! Joseph and Richard, could you also comment on this? > On Jan 28, 2024, at 5:09 AM, Martin Uecker wrote: > > Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao: >> >>> On Jan 26, 2024, at 3:04 AM, Martin Uecker wrote: >>> >>> >>> I haven't looked at the patch, but it sounds you give the result >>> the wrong type. Then patching up all use cases instead of the >>> type seems wrong. >> >> Yes, this is for resolving a very early gimplification issue as I reported >> last Nov: >> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html >> >> Since no-one responded at that time, I fixed the issue by replacing the >> ARRAY_REF >> With a pointer indirection: >> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html >> >> The reason for such change is: return a flexible array member TYPE is not >> allowed >> by C language (our gimplification follows this rule), so, we have to return >> a pointer TYPE instead. >> >> **The new internal function >> >> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, >> ACCESS_MODE, INDEX) >> >> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) >> >> which returns the "REF_TO_OBJ" same as the 1st argument; >> >> Both the return type and the type of the first argument of this function >> have been converted from >> the incomplete array type to the corresponding pointer type. >> >> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the >> original INDEX of the ARRAY_REF was lost >> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX >> for bound sanitizer instrumentation, I added >> The 6th argument “INDEX”. >> >> What’s your comment and suggestion on this solution? > > I am not entirely sure but changing types in the FE seems > problematic because this breaks language semantics. And > then adding special code everywhere to treat it specially > in the FE does not seem a good way forward. > > If I understand correctly, returning an incomplete array > type is not allowed and then fails during gimplification. Yes, this is the problem in gimplification. > So I would suggest to make it return a pointer to the > incomplete array (and not the element type) for the following: struct annotated { unsigned int size; int array[] __attribute__((counted_by (size))); }; struct annotated * p = …. p->array[9] = 0; The IL for the above array reference p->array[9] is: 1. If the return type is the original incomplete array type, .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0; (this triggered the gimplification failure since the return type cannot be a complete type). 2. When the return type is changed to a pointer to the element type of the incomplete array, (the current patch) Then the original array reference naturally becomes an indirect reference through the pointer *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0; Since the original array reference becomes an indirect reference through the pointer to the element array, the INDEX info is mixed into the OFFSET of the indirect reference and lost, so, I added the 6th argument to the routine .ACCESS_WITH_SIZE to record the INDEX. 3. With your suggestion, the return type is changed to a pointer to the incomplete array, I just tried this to change the result type : --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref, tree counted_by_type) { gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))); - tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref))); + tree result_type = build_pointer_type (TREE_TYPE (ref)); Then, I got the following FE errors: test.c:10:11: error: invalid use of flexible array member 10 | p->array[9] = 0; The reason for the error is: when the original array_ref becomes an indirect_ref through the pointer to the incomplete array, During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT (type) is invalid since the type is an incomplete array. As a result, the OFFSET cannot computed for the indirect_ref. Looks like even more issues with this approach. > but then wrap > it with an indirection when inserting this code in the FE > so that the full replacement has the correct type again > (of the incomplete array). I don’t quite understand the above, could you please explain this in more details? (If possible, could you please use the above small example?) thanks. > > > Alternatively, one could allow this during gimplification > or add some conversion. Allowing this in gimplification might trigger some other issues. I guess that adding conversion in the end of the FE or in the beginning of gimplification might be better. i.e, in FE, still keep the original incomplete array type as the return type for the routine .ACCESS_WITH_SIZ
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Am Freitag, dem 26.01.2024 um 14:33 + schrieb Qing Zhao: > > > On Jan 26, 2024, at 3:04 AM, Martin Uecker wrote: > > > > > > I haven't looked at the patch, but it sounds you give the result > > the wrong type. Then patching up all use cases instead of the > > type seems wrong. > > Yes, this is for resolving a very early gimplification issue as I reported > last Nov: > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html > > Since no-one responded at that time, I fixed the issue by replacing the > ARRAY_REF > With a pointer indirection: > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html > > The reason for such change is: return a flexible array member TYPE is not > allowed > by C language (our gimplification follows this rule), so, we have to return a > pointer TYPE instead. > > **The new internal function > > .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, > ACCESS_MODE, INDEX) > > INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) > > which returns the "REF_TO_OBJ" same as the 1st argument; > > Both the return type and the type of the first argument of this function have > been converted from > the incomplete array type to the corresponding pointer type. > > As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the > original INDEX of the ARRAY_REF was lost > when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX > for bound sanitizer instrumentation, I added > The 6th argument “INDEX”. > > What’s your comment and suggestion on this solution? I am not entirely sure but changing types in the FE seems problematic because this breaks language semantics. And then adding special code everywhere to treat it specially in the FE does not seem a good way forward. If I understand correctly, returning an incomplete array type is not allowed and then fails during gimplification. So I would suggest to make it return a pointer to the incomplete array (and not the element type) but then wrap it with an indirection when inserting this code in the FE so that the full replacement has the correct type again (of the incomplete array). Alternatively, one could allow this during gimplification or add some conversion. Martin > > Thanks. > > Qing > > > > > > Martin > > > > > > Am Donnerstag, dem 25.01.2024 um 20:11 + schrieb Qing Zhao: > > > Thanks a lot for the testing. > > > > > > Yes, I can repeat the issue with the following small example: > > > > > > #include > > > #include > > > #include > > > > > > #define MAX(a, b) ((a) > (b) ? (a) : (b)) > > > > > > struct untracked { > > > int size; > > > int array[] __attribute__((counted_by (size))); > > > } *a; > > > struct untracked * alloc_buf (int index) > > > { > > > struct untracked *p; > > > p = (struct untracked *) malloc (MAX (sizeof (struct untracked), > > >(offsetof (struct untracked, > > > array[0]) > > > + (index) * sizeof (int; > > > p->size = index; > > > return p; > > > } > > > > > > int main() > > > { > > > a = alloc_buf(10); > > > printf ("same_type is %d\n", > > > (__builtin_types_compatible_p(typeof (a->array), typeof > > > (&(a->array)[0]; > > > return 0; > > > } > > > > > > > > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c > > > same_type is 1 > > > > > > Looks like that the “typeof” operator need to be handled specially in C FE > > > for the new internal function .ACCESS_WITH_SIZE. > > > > > > (I have specially handle the operator “offsetof” in C FE already). > > > > > > Will fix this issue. > > > > > > Thanks. > > > > > > Qing > > > > > > > On Jan 24, 2024, at 7:51 PM, Kees Cook wrote: > > > > > > > > On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote: > > > > > This is the 4th version of the patch. > > > > > > > > Thanks very much for this! > > > > > > > > I tripped over an unexpected behavioral change that the Linux kernel > > > > depends on: > > > > > > > > __builtin_types_compatible_p() no longer treats an array marked with > > > > counted_by as different from that array's decayed pointer. Specifically, > > > > the kernel uses these macros: > > > > > > > > > > > > /* > > > > * Force a compilation error if condition is true, but also produce a > > > > * result (of value 0 and type int), so the expression can be used > > > > * e.g. in a structure initializer (or where-ever else comma expressions > > > > * aren't permitted). > > > > */ > > > > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) > > > > > > > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), > > > > typeof(b)) > > > > > > > > /* &a[0] degrades to a pointer: a different type from an array */ > > > > #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), > > > > &(a)[0])) > > > > > > > > > > > > This gets used in various places to make sure we
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
> On Jan 26, 2024, at 3:04 AM, Martin Uecker wrote: > > > I haven't looked at the patch, but it sounds you give the result > the wrong type. Then patching up all use cases instead of the > type seems wrong. Yes, this is for resolving a very early gimplification issue as I reported last Nov: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html Since no-one responded at that time, I fixed the issue by replacing the ARRAY_REF With a pointer indirection: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html The reason for such change is: return a flexible array member TYPE is not allowed by C language (our gimplification follows this rule), so, we have to return a pointer TYPE instead. **The new internal function .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX) INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) which returns the "REF_TO_OBJ" same as the 1st argument; Both the return type and the type of the first argument of this function have been converted from the incomplete array type to the corresponding pointer type. As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the original INDEX of the ARRAY_REF was lost when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX for bound sanitizer instrumentation, I added The 6th argument “INDEX”. What’s your comment and suggestion on this solution? Thanks. Qing > > Martin > > > Am Donnerstag, dem 25.01.2024 um 20:11 + schrieb Qing Zhao: >> Thanks a lot for the testing. >> >> Yes, I can repeat the issue with the following small example: >> >> #include >> #include >> #include >> >> #define MAX(a, b) ((a) > (b) ? (a) : (b)) >> >> struct untracked { >> int size; >> int array[] __attribute__((counted_by (size))); >> } *a; >> struct untracked * alloc_buf (int index) >> { >> struct untracked *p; >> p = (struct untracked *) malloc (MAX (sizeof (struct untracked), >>(offsetof (struct untracked, array[0]) >> + (index) * sizeof (int; >> p->size = index; >> return p; >> } >> >> int main() >> { >> a = alloc_buf(10); >> printf ("same_type is %d\n", >> (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]; >> return 0; >> } >> >> >> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c >> same_type is 1 >> >> Looks like that the “typeof” operator need to be handled specially in C FE >> for the new internal function .ACCESS_WITH_SIZE. >> >> (I have specially handle the operator “offsetof” in C FE already). >> >> Will fix this issue. >> >> Thanks. >> >> Qing >> >>> On Jan 24, 2024, at 7:51 PM, Kees Cook wrote: >>> >>> On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote: This is the 4th version of the patch. >>> >>> Thanks very much for this! >>> >>> I tripped over an unexpected behavioral change that the Linux kernel >>> depends on: >>> >>> __builtin_types_compatible_p() no longer treats an array marked with >>> counted_by as different from that array's decayed pointer. Specifically, >>> the kernel uses these macros: >>> >>> >>> /* >>> * Force a compilation error if condition is true, but also produce a >>> * result (of value 0 and type int), so the expression can be used >>> * e.g. in a structure initializer (or where-ever else comma expressions >>> * aren't permitted). >>> */ >>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) >>> >>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) >>> >>> /* &a[0] degrades to a pointer: a different type from an array */ >>> #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) >>> >>> >>> This gets used in various places to make sure we're dealing with an >>> array for a macro: >>> >>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + >>> __must_be_array(arr)) >>> >>> >>> So this builds: >>> >>> struct untracked { >>> int size; >>> int array[]; >>> } *a; >>> >>> __must_be_array(a->array) >>> => 0 (as expected) >>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0])) >>> => 0 (as expected, array vs decayed array pointer) >>> >>> >>> But if counted_by is added, we get a build failure: >>> >>> struct tracked { >>> int size; >>> int array[] __counted_by(size); >>> } *b; >>> >>> __must_be_array(b->array) >>> => build failure (not expected) >>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0])) >>> => 1 (not expected, both pointers?) >>> >>> >>> >>> >>> -- >>> Kees Cook >> >
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
I haven't looked at the patch, but it sounds you give the result the wrong type. Then patching up all use cases instead of the type seems wrong. Martin Am Donnerstag, dem 25.01.2024 um 20:11 + schrieb Qing Zhao: > Thanks a lot for the testing. > > Yes, I can repeat the issue with the following small example: > > #include > #include > #include > > #define MAX(a, b) ((a) > (b) ? (a) : (b)) > > struct untracked { >int size; >int array[] __attribute__((counted_by (size))); > } *a; > struct untracked * alloc_buf (int index) > { > struct untracked *p; > p = (struct untracked *) malloc (MAX (sizeof (struct untracked), > (offsetof (struct untracked, array[0]) > + (index) * sizeof (int; > p->size = index; > return p; > } > > int main() > { > a = alloc_buf(10); > printf ("same_type is %d\n", > (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]; > return 0; > } > > > /home/opc/Install/latest-d/bin/gcc -O2 btcp.c > same_type is 1 > > Looks like that the “typeof” operator need to be handled specially in C FE > for the new internal function .ACCESS_WITH_SIZE. > > (I have specially handle the operator “offsetof” in C FE already). > > Will fix this issue. > > Thanks. > > Qing > > > On Jan 24, 2024, at 7:51 PM, Kees Cook wrote: > > > > On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote: > > > This is the 4th version of the patch. > > > > Thanks very much for this! > > > > I tripped over an unexpected behavioral change that the Linux kernel > > depends on: > > > > __builtin_types_compatible_p() no longer treats an array marked with > > counted_by as different from that array's decayed pointer. Specifically, > > the kernel uses these macros: > > > > > > /* > > * Force a compilation error if condition is true, but also produce a > > * result (of value 0 and type int), so the expression can be used > > * e.g. in a structure initializer (or where-ever else comma expressions > > * aren't permitted). > > */ > > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) > > > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) > > > > /* &a[0] degrades to a pointer: a different type from an array */ > > #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) > > > > > > This gets used in various places to make sure we're dealing with an > > array for a macro: > > > > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + > > __must_be_array(arr)) > > > > > > So this builds: > > > > struct untracked { > >int size; > >int array[]; > > } *a; > > > > __must_be_array(a->array) > > => 0 (as expected) > > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0])) > > => 0 (as expected, array vs decayed array pointer) > > > > > > But if counted_by is added, we get a build failure: > > > > struct tracked { > >int size; > >int array[] __counted_by(size); > > } *b; > > > > __must_be_array(b->array) > > => build failure (not expected) > > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0])) > > => 1 (not expected, both pointers?) > > > > > > > > > > -- > > Kees Cook >
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Thanks a lot for the testing. Yes, I can repeat the issue with the following small example: #include #include #include #define MAX(a, b) ((a) > (b) ? (a) : (b)) struct untracked { int size; int array[] __attribute__((counted_by (size))); } *a; struct untracked * alloc_buf (int index) { struct untracked *p; p = (struct untracked *) malloc (MAX (sizeof (struct untracked), (offsetof (struct untracked, array[0]) + (index) * sizeof (int; p->size = index; return p; } int main() { a = alloc_buf(10); printf ("same_type is %d\n", (__builtin_types_compatible_p(typeof (a->array), typeof (&(a->array)[0]; return 0; } /home/opc/Install/latest-d/bin/gcc -O2 btcp.c same_type is 1 Looks like that the “typeof” operator need to be handled specially in C FE for the new internal function .ACCESS_WITH_SIZE. (I have specially handle the operator “offsetof” in C FE already). Will fix this issue. Thanks. Qing > On Jan 24, 2024, at 7:51 PM, Kees Cook wrote: > > On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote: >> This is the 4th version of the patch. > > Thanks very much for this! > > I tripped over an unexpected behavioral change that the Linux kernel > depends on: > > __builtin_types_compatible_p() no longer treats an array marked with > counted_by as different from that array's decayed pointer. Specifically, > the kernel uses these macros: > > > /* > * Force a compilation error if condition is true, but also produce a > * result (of value 0 and type int), so the expression can be used > * e.g. in a structure initializer (or where-ever else comma expressions > * aren't permitted). > */ > #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) > > #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) > > /* &a[0] degrades to a pointer: a different type from an array */ > #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) > > > This gets used in various places to make sure we're dealing with an > array for a macro: > > #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + > __must_be_array(arr)) > > > So this builds: > > struct untracked { >int size; >int array[]; > } *a; > > __must_be_array(a->array) > => 0 (as expected) > __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0])) > => 0 (as expected, array vs decayed array pointer) > > > But if counted_by is added, we get a build failure: > > struct tracked { >int size; >int array[] __counted_by(size); > } *b; > > __must_be_array(b->array) > => build failure (not expected) > __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0])) > => 1 (not expected, both pointers?) > > > > > -- > Kees Cook
Re: [PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
On Wed, Jan 24, 2024 at 12:29:51AM +, Qing Zhao wrote: > This is the 4th version of the patch. Thanks very much for this! I tripped over an unexpected behavioral change that the Linux kernel depends on: __builtin_types_compatible_p() no longer treats an array marked with counted_by as different from that array's decayed pointer. Specifically, the kernel uses these macros: /* * Force a compilation error if condition is true, but also produce a * result (of value 0 and type int), so the expression can be used * e.g. in a structure initializer (or where-ever else comma expressions * aren't permitted). */ #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b)) /* &a[0] degrades to a pointer: a different type from an array */ #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) This gets used in various places to make sure we're dealing with an array for a macro: #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) So this builds: struct untracked { int size; int array[]; } *a; __must_be_array(a->array) => 0 (as expected) __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0])) => 0 (as expected, array vs decayed array pointer) But if counted_by is added, we get a build failure: struct tracked { int size; int array[] __counted_by(size); } *b; __must_be_array(b->array) => build failure (not expected) __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0])) => 1 (not expected, both pointers?) -- Kees Cook
[PATCH v4 0/4]New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)
Hi, This is the 4th version of the patch. It based on the following proposal: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html Represent the missing dependence for the "counted_by" attribute and its consumers **The summary of the proposal is: * Add a new internal function ".ACCESS_WITH_SIZE" to carry the size information for every reference to a FAM field; * In C FE, Replace every reference to a FAM field whose TYPE has the "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE"; * In every consumer of the size information, for example, BDOS or array bound sanitizer, query the size information or ACCESS_MODE information from the new internal function; * When expansing to RTL, replace the internal function with the actual reference to the FAM field; * Some adjustment to ipa alias analysis, and other SSA passes to mitigate the impact to the optimizer and code generation. **The new internal function .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, ACCESS_MODE, INDEX) INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) which returns the "REF_TO_OBJ" same as the 1st argument; Both the return type and the type of the first argument of this function have been converted from the incomplete array type to the corresponding pointer type. Please see the following link for why: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html 1st argument "REF_TO_OBJ": The reference to the object; 2nd argument "REF_TO_SIZE": The reference to the size of the object, 3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE represents 0: unknown; 1: the number of the elements of the object type; 2: the number of bytes; 4th argument "PRECISION_OF_SIZE": The precision of the integer that REF_TO_SIZE points; 5th argument "ACCESS_MODE": -1: Unknown access semantics 0: none 1: read_only 2: write_only 3: read_write 6th argument "INDEX": the INDEX for the original array reference. -1: Unknown NOTE: The 6th Argument is added for bound sanitizer instrumentation. ** The Patch sets included: 1. Provide counted_by attribute to flexible array member field; which includes: * "counted_by" attribute documentation; * C FE handling of the new attribute; syntax checking, error reporting; * testing cases; 2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE. which includes: * The definition of the new internal function .ACCESS_WITH_SIZE in internal-fn.def. * C FE converts every reference to a FAM with "counted_by" attribute to a call to the internal function .ACCESS_WITH_SIZE. (build_component_ref in c_typeck.cc) This includes the case when the object is statically allocated and initialized. In order to make this working, we should update initializer_constant_valid_p_1 and output_constant in varasm.cc to include calls to .ACCESS_WITH_SIZE. However, for the reference inside "offsetof", ignore the "counted_by" attribute since it's not useful at all. (c_parser_postfix_expression in c/c-parser.cc) * Convert every call to .ACCESS_WITH_SIZE to its first argument. (expand_ACCESS_WITH_SIZE in internal-fn.cc) * adjust alias analysis to exclude the new internal from clobbering anything. (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in tree-ssa-alias.cc) * adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when it's LHS is eliminated as dead code. (eliminate_unnecessary_stmts in tree-ssa-dce.cc) * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and get the reference from the call to .ACCESS_WITH_SIZE. (is_access_with_size_p and get_ref_from_access_with_size in tree.cc) * testing cases. (for offsetof, static initialization, generation of calls to .ACCESS_WITH_SIZE, code runs correctly after calls to .ACCESS_WITH_SIZE are converted back) 3. Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only) which includes: * use the size info of the .ACCESS_WITH_SIZE for sub-object. * testing cases. 4 Use the .ACCESS_WITH_SIZE in bound sanitizer Since the result type of the call to .ACCESS_WITH_SIZE is a pointer to the element type. The original array_ref is converted to an indirect_ref, which introduced two issues for the instrumenation of bound sanitizer: A. The index for the original array_ref was mixed into the offset expression for the new indirect_ref. In order to make the instrumentation for the bound sanitizer easier, one more argument for the function .ACCESS_WITH_SIZE is added to record this original index for the array_ref. then later during bound instrumentation, get the index from the additi