Re: "counted_by" and -fanalyzer

2024-06-05 Thread Qing Zhao


> On Jun 5, 2024, at 09:49, David Malcolm  wrote:
> 
> On Tue, 2024-06-04 at 22:09 +, Qing Zhao wrote:
>> 
>> 
>>> On Jun 4, 2024, at 17:55, David Malcolm 
>>> wrote:
>>> 
>>> On Fri, 2024-05-31 at 13:11 +, Qing Zhao wrote:
 
 
> 
> [...]
> 
 
 
 Thanks a lot for the review.
 Will commit the patch set soon.
>>> 
>>> [...snip...]
>>> 
>>> Congratulations on getting this merged.
>>> 
>>> FWIW I've started investigating adding support for the new
>>> attribute to
>>> -fanalyzer (and am tracked this as PR analyzer/111567
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111567 ).
>> 
>> Thank you for starting looking at this.
>>> 
>>> The docs for the attribute speak of the implied relationship
>>> between
>>> the count field and size of the flex array, and say that: "It's the
>>> user's responsibility to make sure the above requirements to be
>>> kept
>>> all the time.  Otherwise the compiler *reports warnings*, at the
>>> same
>>> time, the results of the array bound sanitizer and the
>>> '__builtin_dynamic_object_size' is undefined." (my emphasis).
>>> 
>>> What are these warnings that are reported?  I looked through 
>>> r15-944-gf824acd0e80754 through r15-948-g4c5bea7def1361 and I
>>> didn't
>>> see any new warnings or test coverage for warnings (beyond misuing
>>> the
>>> attribute).  Sorry if I'm missing something obvious here.
>> 
>> These warnings will be in the remaining work (I listed the remaining
>> work in all versions except the last one):
>> 
>> **Remaining works: 
>> 
>> 6  Improve __bdos to use the counted_by info in whole-object
>> size for the structure with FAM.
>> 7  Emit warnings when the user breaks the requirments for the
>> new counted_by attribute
>> compilation time: -Wcounted-by
>> run time: -fsanitizer=counted-by
>>* The initialization to the size field should be done
>> before the first reference to the FAM field.
>>* the array has at least # of elements specified by the
>> size field all the time during the program.
> 
> Aha - thanks.  Sorry for missing this, I confess I haven't been paying
> close attention to this thread.
> 
>> 
>> With the current patches that have been committed, the warnings are
>> not emitted. 
>> I believe that more analysis and more information are needed for
>> these warnings to be effective, it might not
>> be a trivial patch.  More discussion is needed for emitting such
>> warnings.
>> 
>>> 
>>> Does anyone have examples of cases that -fanalyzer ought to warn
>>> for?
>> 
>> At this moment, I don’t have concrete testing cases for this yet, but
>> I can come up with several small examples and share with you in a
>> later email.
> 
> FWIW I did some brainstorming and put together the following .c file,
> am posting it inline here for the sake of discussion; does this look
> like the kind of thing to test for (in terms of how users are expected
> to use the attribute, and the kinds of mistake they'd want warnings
> about) ?

From my understanding, there are two parts of work to support “counted-by” in 
-fanalyzer:

1. Use this new attribute to improve out-of-bound, buffer overflow detection in 
-fanalyzer (maybe -Wanalyzer-out-of-bounds can be improved with this new 
attribute?)
2. Report user errors that breaks the following 2 requirements for the new 
counted-by attribute:
-fsanitizer=counted-by
 * The initialization to the size field should be done before the first 
reference to the FAM field.
 * the array has at least # of elements specified by the size field all the 
time during the program.

So, the testing cases might include the above 1 and 2. 
From my understanding of the below testings, mostly belong to 2. 
Some more comments inlined below:

> 
> /* TODO:
>   Some ideas for dimensions of test matrix:
>   (a) concrete value vs symbolic value for "count"
>   (b) concrete value vs symbolic value for size of array
>   (c) dynamic vs static allocation of buffer (and malloc vs alloca)
>   (d) relative size of array and of count
>   - same size (not an issue)
>   - array is too small compared to "count"
> - off by one
> - off by more than one
> - size is zero (but not negative)
> - negative size (which the docs say is OK)
>   - array is too large compared to "count" (not an issue)
>   (e) type of flex array:
>   - char
>   - non-char
>   - type requiring padding
>   (f) type/size/signedness of count field; what about overflow
>   in (count * sizeof (type of array element)) ?
>   ... etc: ideas?
> 
>Other ideas for test coverage:
>- realloc
>  - growing object
>  - shrinking object
>  - symbolic sizes where could be growth or shrinkage
>  - failing realloc
>- ...etc: ideas?  */
> 
> #include 
> #include 
> #include 
> 
> /* Example from the docs.  */
> 
> struct P {
>  size_t count;
>  char other;
>  char array[] __attribute__ ((counted_by (count)));
> } *p;
> 
> struct P *

Re: "counted_by" and -fanalyzer

2024-06-05 Thread David Malcolm
On Tue, 2024-06-04 at 22:09 +, Qing Zhao wrote:
> 
> 
> > On Jun 4, 2024, at 17:55, David Malcolm 
> > wrote:
> > 
> > On Fri, 2024-05-31 at 13:11 +, Qing Zhao wrote:
> > > 
> > > 

[...]

> > > 
> > > 
> > > Thanks a lot for the review.
> > > Will commit the patch set soon.
> > 
> > [...snip...]
> > 
> > Congratulations on getting this merged.
> > 
> > FWIW I've started investigating adding support for the new
> > attribute to
> > -fanalyzer (and am tracked this as PR analyzer/111567
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111567 ).
> 
> Thank you for starting looking at this.
> > 
> > The docs for the attribute speak of the implied relationship
> > between
> > the count field and size of the flex array, and say that: "It's the
> > user's responsibility to make sure the above requirements to be
> > kept
> > all the time.  Otherwise the compiler *reports warnings*, at the
> > same
> > time, the results of the array bound sanitizer and the
> > '__builtin_dynamic_object_size' is undefined." (my emphasis).
> > 
> > What are these warnings that are reported?  I looked through 
> > r15-944-gf824acd0e80754 through r15-948-g4c5bea7def1361 and I
> > didn't
> > see any new warnings or test coverage for warnings (beyond misuing
> > the
> > attribute).  Sorry if I'm missing something obvious here.
> 
> These warnings will be in the remaining work (I listed the remaining
> work in all versions except the last one):
> 
> > > > > **Remaining works: 
> > > > > 
> > > > > 6  Improve __bdos to use the counted_by info in whole-object
> > > > > size for the structure with FAM.
> > > > > 7  Emit warnings when the user breaks the requirments for the
> > > > > new counted_by attribute
> > > > > compilation time: -Wcounted-by
> > > > > run time: -fsanitizer=counted-by
> > > > >    * The initialization to the size field should be done
> > > > > before the first reference to the FAM field.
> > > > >    * the array has at least # of elements specified by the
> > > > > size field all the time during the program.

Aha - thanks.  Sorry for missing this, I confess I haven't been paying
close attention to this thread.

> 
> With the current patches that have been committed, the warnings are
> not emitted. 
> I believe that more analysis and more information are needed for
> these warnings to be effective, it might not
> be a trivial patch.  More discussion is needed for emitting such
> warnings.
> 
> > 
> > Does anyone have examples of cases that -fanalyzer ought to warn
> > for?
> 
> At this moment, I don’t have concrete testing cases for this yet, but
> I can come up with several small examples and share with you in a
> later email.

FWIW I did some brainstorming and put together the following .c file,
am posting it inline here for the sake of discussion; does this look
like the kind of thing to test for (in terms of how users are expected
to use the attribute, and the kinds of mistake they'd want warnings
about) ?

/* TODO:
   Some ideas for dimensions of test matrix:
   (a) concrete value vs symbolic value for "count"
   (b) concrete value vs symbolic value for size of array
   (c) dynamic vs static allocation of buffer (and malloc vs alloca)
   (d) relative size of array and of count
   - same size (not an issue)
   - array is too small compared to "count"
 - off by one
 - off by more than one
 - size is zero (but not negative)
 - negative size (which the docs say is OK)
   - array is too large compared to "count" (not an issue)
   (e) type of flex array:
   - char
   - non-char
   - type requiring padding
   (f) type/size/signedness of count field; what about overflow
   in (count * sizeof (type of array element)) ?
   ... etc: ideas?

Other ideas for test coverage:
- realloc
  - growing object
  - shrinking object
  - symbolic sizes where could be growth or shrinkage
  - failing realloc
- ...etc: ideas?  */

#include 
#include 
#include 

/* Example from the docs.  */

struct P {
  size_t count;
  char other;
  char array[] __attribute__ ((counted_by (count)));
} *p;

struct P *
test_malloc_with_correct_symbolic (size_t n)
{
  struct P *p = malloc (sizeof (struct P) + n);
  if (!p)
return NULL;
  p->count = n; // don't warn here
  return p;  
}

struct P *
test_malloc_with_correct_count_concrete (void)
{
  struct P *p = malloc (sizeof (struct P) + 42);
  if (!p)
return NULL;
  p->count = 42; // don't warn here
  return p;  
}

struct P *
test_malloc_with_array_smaller_than_count_concrete (void)
{
  struct P *p = malloc (sizeof (struct P) + 42);
  if (!p)
return NULL;
  p->count = 80; // TODO: warn here
  return p;
}

struct P *
test_malloc_with_array_larger_than_count_concrete (void)
{
  struct P *p = malloc (sizeof (struct P) + 80);
  if (!p)
return NULL;
  p->count = 42; // don't warn here
  return p;  
}

struct P *
test_malloc_with_array_access_before_count_init_concrete_1 (void)
{
  struct P *p = 

Re: "counted_by" and -fanalyzer (was Re: [PATCH v10 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.)

2024-06-04 Thread Qing Zhao


> On Jun 4, 2024, at 17:55, David Malcolm  wrote:
> 
> On Fri, 2024-05-31 at 13:11 +, Qing Zhao wrote:
>> 
>> 
>>> On May 31, 2024, at 08:58, Richard Biener 
>>> wrote:
>>> 
>>> On Thu, 30 May 2024, Qing Zhao wrote:
>>> 
 Including the following changes:
 * The definition of the new internal function .ACCESS_WITH_SIZE
  in internal-fn.def.
 * C FE converts every reference to a FAM with a "counted_by"
 attribute
  to a call to the internal function .ACCESS_WITH_SIZE.
  (build_component_ref in c_typeck.cc)
 
  This includes the case when the object is statically allocated
 and
  initialized.
  In order to make this working, the routine digest_init in c-
 typeck.cc
  is updated to fold calls to .ACCESS_WITH_SIZE to its first
 argument
  when require_constant is TRUE.
 
  However, for the reference inside "offsetof", the "counted_by"
 attribute is
  ignored since it's not useful at all.
  (c_parser_postfix_expression in c/c-parser.cc)
 
  In addtion to "offsetof", for the reference inside operator
 "typeof" and
  "alignof", we ignore counted_by attribute too.
 
  When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
  replace the call with its first argument.
 
 * Convert every call to .ACCESS_WITH_SIZE to its first argument.
  (expand_ACCESS_WITH_SIZE in internal-fn.cc)
 * Provide the utility routines to check the call is
 .ACCESS_WITH_SIZE and
  get the reference from the call to .ACCESS_WITH_SIZE.
  (is_access_with_size_p and get_ref_from_access_with_size in
 tree.cc)
>>> 
>>> The middle-end parts of this revised patch are OK.
>> 
>> Thanks a lot for the review.
>> Will commit the patch set soon.
> 
> [...snip...]
> 
> Congratulations on getting this merged.
> 
> FWIW I've started investigating adding support for the new attribute to
> -fanalyzer (and am tracked this as PR analyzer/111567
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111567 ).

Thank you for starting looking at this.
> 
> The docs for the attribute speak of the implied relationship between
> the count field and size of the flex array, and say that: "It's the
> user's responsibility to make sure the above requirements to be kept
> all the time.  Otherwise the compiler *reports warnings*, at the same
> time, the results of the array bound sanitizer and the
> '__builtin_dynamic_object_size' is undefined." (my emphasis).
> 
> What are these warnings that are reported?  I looked through 
> r15-944-gf824acd0e80754 through r15-948-g4c5bea7def1361 and I didn't
> see any new warnings or test coverage for warnings (beyond misuing the
> attribute).  Sorry if I'm missing something obvious here.

These warnings will be in the remaining work (I listed the remaining work in 
all versions except the last one):

 **Remaining works: 
 
 6  Improve __bdos to use the counted_by info in whole-object size for the 
 structure with FAM.
 7  Emit warnings when the user breaks the requirments for the new 
 counted_by attribute
 compilation time: -Wcounted-by
 run time: -fsanitizer=counted-by
* The initialization to the size field should be done before the first 
 reference to the FAM field.
* the array has at least # of elements specified by the size field all 
 the time during the program.

With the current patches that have been committed, the warnings are not 
emitted. 
I believe that more analysis and more information are needed for these warnings 
to be effective, it might not
be a trivial patch.  More discussion is needed for emitting such warnings.

> 
> Does anyone have examples of cases that -fanalyzer ought to warn for?

At this moment, I don’t have concrete testing cases for this yet, but I can 
come up with several small examples and share with you in a later email.

Qing
> Presumably it would be helpful for the analyzer to report about code
> paths in which the requirements are violated (but it may be that the
> analyzer runs too late to do this...)
> 
> Thanks
> Dave
> 



"counted_by" and -fanalyzer (was Re: [PATCH v10 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.)

2024-06-04 Thread David Malcolm
On Fri, 2024-05-31 at 13:11 +, Qing Zhao wrote:
> 
> 
> > On May 31, 2024, at 08:58, Richard Biener 
> > wrote:
> > 
> > On Thu, 30 May 2024, Qing Zhao wrote:
> > 
> > > Including the following changes:
> > > * The definition of the new internal function .ACCESS_WITH_SIZE
> > >  in internal-fn.def.
> > > * C FE converts every reference to a FAM with a "counted_by"
> > > attribute
> > >  to a call to the internal function .ACCESS_WITH_SIZE.
> > >  (build_component_ref in c_typeck.cc)
> > > 
> > >  This includes the case when the object is statically allocated
> > > and
> > >  initialized.
> > >  In order to make this working, the routine digest_init in c-
> > > typeck.cc
> > >  is updated to fold calls to .ACCESS_WITH_SIZE to its first
> > > argument
> > >  when require_constant is TRUE.
> > > 
> > >  However, for the reference inside "offsetof", the "counted_by"
> > > attribute is
> > >  ignored since it's not useful at all.
> > >  (c_parser_postfix_expression in c/c-parser.cc)
> > > 
> > >  In addtion to "offsetof", for the reference inside operator
> > > "typeof" and
> > >  "alignof", we ignore counted_by attribute too.
> > > 
> > >  When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
> > >  replace the call with its first argument.
> > > 
> > > * Convert every call to .ACCESS_WITH_SIZE to its first argument.
> > >  (expand_ACCESS_WITH_SIZE in internal-fn.cc)
> > > * Provide the utility routines to check the call is
> > > .ACCESS_WITH_SIZE and
> > >  get the reference from the call to .ACCESS_WITH_SIZE.
> > >  (is_access_with_size_p and get_ref_from_access_with_size in
> > > tree.cc)
> > 
> > The middle-end parts of this revised patch are OK.
> 
> Thanks a lot for the review.
> Will commit the patch set soon.

[...snip...]

Congratulations on getting this merged.

FWIW I've started investigating adding support for the new attribute to
-fanalyzer (and am tracked this as PR analyzer/111567
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111567 ).

The docs for the attribute speak of the implied relationship between
the count field and size of the flex array, and say that: "It's the
user's responsibility to make sure the above requirements to be kept
all the time.  Otherwise the compiler *reports warnings*, at the same
time, the results of the array bound sanitizer and the
'__builtin_dynamic_object_size' is undefined." (my emphasis).

What are these warnings that are reported?  I looked through 
r15-944-gf824acd0e80754 through r15-948-g4c5bea7def1361 and I didn't
see any new warnings or test coverage for warnings (beyond misuing the
attribute).  Sorry if I'm missing something obvious here.

Does anyone have examples of cases that -fanalyzer ought to warn for?
Presumably it would be helpful for the analyzer to report about code
paths in which the requirements are violated (but it may be that the
analyzer runs too late to do this...)

Thanks
Dave