Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-12 Thread Qing Zhao
An update on this task. (Also need more suggestions -:)

I have an initial implemenation locally, with this gcc, I can get the following 
message with the testing case in PR109071:

[ 109071]$ cat t.c
extern void warn(void);
static inline void assign(int val, int *regs, int *index)
{
  if (*index >= 4)
warn();
  *regs = val;
}
struct nums {int vals[4];};

void sparx5_set (int *ptr, struct nums *sg, int index)
{
  int *val = >vals[index];

  assign(0,ptr, );
  assign(*val, ptr, );
}

[109071]$ sh t
/home/opc/Install/latest-d/bin/gcc -O2 -Warray-bounds=1 -c -o t.o t.c
t.c: In function ‘sparx5_set’:
t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ 
[-Warray-bounds=]
   12 |   int *val = >vals[index];
  |   ^~~
  event 1
|
|4 |   if (*index >= 4)
|  |  ^
|  |  |
|  |  (1) when the condition is evaluated to true
|
t.c:8:18: note: while referencing ‘vals’
8 | struct nums {int vals[4];};
  |  ^~~~

1. Is the above diagnostic message good enough? Any more suggestion?
2. It’s hard for me to come up with a more complicate testing case that has one 
basic block copied multiple times by the jump thread,  do you have any pointer 
to such testing cases? 

Thanks a lot for any help.

Qing

> On Jun 7, 2024, at 15:13, Qing Zhao  wrote:
> 
> Hi, Richard,
> 
>> On Jun 5, 2024, at 13:58, Qing Zhao  wrote:
>>> 
>> Like this?
>> 
>> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
>> index e6e2b0897572..ee344f91333b 100644
>> --- a/libcpp/include/line-map.h
>> +++ b/libcpp/include/line-map.h
>> @@ -761,8 +761,9 @@ struct GTY(()) maps_info_macro {
>> struct GTY(()) location_adhoc_data {
>> location_t locus;
>> source_range src_range;
>> -  void * GTY((skip)) data;
>> unsigned discriminator;
>> +  void * GTY((skip)) data;
>> +  void * GTY((skip)) copy_history;
>> };
>> struct htab;
> 
> Yes.
> 
>> How about the copy_history? Do we need a new data structure (like
>> the following, or other suggestion) for this? Where should I add
>> this new data structure?
> 
> As it needs to be managed by libcpp it should be in this very same
> file.
> 
>> struct copy_history {
>> location_t condition;
>> Bool is_true_path;
>> }
> 
> I think we want a pointer to the previous copy-of state as well in
> case a stmt
> was copied twice.  We'll see whether a single (condition) location
> plus edge flag
> is sufficient.  I'd say we should plan for an enum to indicate the
> duplication
> reason at least (jump threading, unswitching, unrolling come to my
> mind).  For
> jump threading being able to say "when  is true/false" is
> probably
> good enough, though it might not always be easy to identify a single
> condition
> here given a threading path starts at an incoming edge to a CFG merge
> and
> will usually end with the outgoing edge of a condition that we can
> statically
> evaluate.  The condition controlling the path entry might not be
> determined
> fully by a single condition location.
> 
> Possibly building a full "diagnostic path" object at threading time
> might be
> the only way to record all the facts, but that's also going to be
> more
> expensive.
 
 Note that a diagnostic_path represents a path through some kind of
 graph, whereas it sounds like you want to be storing the *nodes* in the
 graph, and later generating the diagnostic_path from that graph when we
 need it (which is trivial if the graph is actually just a tree: just
 follow the parent links backwards, then reverse it).
>>> 
>>> I think we are mixing two things - one is that a single transform like 
>>> jump
>>> threading produces a stmt copy and when we emit a diagnostic on that
>>> copied statement we want to tell the user the condition under which the
>>> copy is executed.  That "condition" can be actually a sequence of
>>> conditionals.  I wanted to point out that a diagnostic_path instance 
>>> could
>>> be used to describe such complex condition.
>>> 
>>> But then the other thing I wanted to address with the link to a previous
>>> copy_history - that's when a statement gets copied twice, for example
>>> by two distinct jump threading optimizations.  Like when dumping
>>> the inlining decisions for diagnostics we could dump the logical "and"
>>> of the conditions of the two threadings.  Since we have a single
>>> location per GIMPLE stmt we'd have to keep a "stack" of 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-07 Thread Qing Zhao
Hi, Richard,

> On Jun 5, 2024, at 13:58, Qing Zhao  wrote:
>> 
> Like this?
> 
> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index e6e2b0897572..ee344f91333b 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -761,8 +761,9 @@ struct GTY(()) maps_info_macro {
> struct GTY(()) location_adhoc_data {
> location_t locus;
> source_range src_range;
> -  void * GTY((skip)) data;
> unsigned discriminator;
> +  void * GTY((skip)) data;
> +  void * GTY((skip)) copy_history;
> };
> struct htab;
 
 Yes.
 
> How about the copy_history? Do we need a new data structure (like
> the following, or other suggestion) for this? Where should I add
> this new data structure?
 
 As it needs to be managed by libcpp it should be in this very same
 file.
 
> struct copy_history {
> location_t condition;
> Bool is_true_path;
> }
 
 I think we want a pointer to the previous copy-of state as well in
 case a stmt
 was copied twice.  We'll see whether a single (condition) location
 plus edge flag
 is sufficient.  I'd say we should plan for an enum to indicate the
 duplication
 reason at least (jump threading, unswitching, unrolling come to my
 mind).  For
 jump threading being able to say "when  is true/false" is
 probably
 good enough, though it might not always be easy to identify a single
 condition
 here given a threading path starts at an incoming edge to a CFG merge
 and
 will usually end with the outgoing edge of a condition that we can
 statically
 evaluate.  The condition controlling the path entry might not be
 determined
 fully by a single condition location.
 
 Possibly building a full "diagnostic path" object at threading time
 might be
 the only way to record all the facts, but that's also going to be
 more
 expensive.
>>> 
>>> Note that a diagnostic_path represents a path through some kind of
>>> graph, whereas it sounds like you want to be storing the *nodes* in the
>>> graph, and later generating the diagnostic_path from that graph when we
>>> need it (which is trivial if the graph is actually just a tree: just
>>> follow the parent links backwards, then reverse it).
>> 
>> I think we are mixing two things - one is that a single transform like 
>> jump
>> threading produces a stmt copy and when we emit a diagnostic on that
>> copied statement we want to tell the user the condition under which the
>> copy is executed.  That "condition" can be actually a sequence of
>> conditionals.  I wanted to point out that a diagnostic_path instance 
>> could
>> be used to describe such complex condition.
>> 
>> But then the other thing I wanted to address with the link to a previous
>> copy_history - that's when a statement gets copied twice, for example
>> by two distinct jump threading optimizations.  Like when dumping
>> the inlining decisions for diagnostics we could dump the logical "and"
>> of the conditions of the two threadings.  Since we have a single
>> location per GIMPLE stmt we'd have to keep a "stack" of copy events
>> associated with it.  That's the linked list (I think a linked list should
>> work fine here).
> Yes, the linked list to keep the “stack” of copy events should be good 
> enough
> to form the sequence of conditionals event for the diagnostic_path 
> instance.
>> 
>> I realize things may get a bit "fat", but eventually we are not 
>> duplicating
>> statements that much.  I do hope we can share for example a big
>> diagnostic_path when we duplicate a basic block during threading
>> and use one instance for all stmts in such block copy (IIRC we never
>> release locations or their ad-hoc data, we just make sure to never
>> use locations with ad-hoc data pointing to BLOCKs that we released,
>> but the linemap data will still have pointers in "dead" location entries,
>> right?)
> Are you still suggesting to add two artificial stmts in the beginning and 
> the
> end of the duplicated block to carry the copy history information for all 
> the
> stmts in the block to save space?
> 
> Compared with the approach to carry such information to each duplicated 
> stmts (which I preferred),
> The major concerns with the approach are:
> 1. Optimizations might move stmts out of these two artificial stmts, 
> therefore we need add
> Some memory barrier on these two artificial stmts to prevent such 
> movements.
>This 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-05 Thread Qing Zhao


> On Jun 5, 2024, at 13:07, Richard Biener  wrote:
> 
> 
> 
>> Am 05.06.2024 um 18:39 schrieb Qing Zhao :
>> 
>> 
>> 
>>> On Jun 5, 2024, at 03:26, Richard Biener  wrote:
>>> 
 On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao  wrote:
 
 
 
> On Jun 4, 2024, at 03:43, Richard Biener  
> wrote:
> 
> On Mon, Jun 3, 2024 at 4:48 PM David Malcolm  wrote:
>> 
>> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote:
>>> On Fri, May 31, 2024 at 11:23 PM Qing Zhao 
>>> wrote:
 
 
 
> On May 23, 2024, at 07:46, Richard Biener
>  wrote:
> 
> On Wed, May 22, 2024 at 8:53 PM Qing Zhao 
> wrote:
>> 
>> 
>> 
>>> On May 22, 2024, at 03:38, Richard Biener
>>>  wrote:
>>> 
>>> On Tue, May 21, 2024 at 11:36 PM David Malcolm
>>>  wrote:
 
 On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> Thanks for the comments and suggestions.
> 
>> On May 15, 2024, at 10:00, David Malcolm
>> 
>> wrote:
>> 
>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener
>> wrote:
>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>> 
 -Warray-bounds is an important option to enable
 linux kernal to
 keep
 the array out-of-bound errors out of the source
 tree.
 
 However, due to the false positive warnings
 reported in
 PR109071
 (-Warray-bounds false positive warnings due to code
 duplication
 from
 jump threading), -Warray-bounds=1 cannot be added
 on by
 default.
 
 Although it's impossible to elinimate all the false
 positive
 warnings
 from -Warray-bounds=1 (See PR104355 Misleading -
 Warray-bounds
 documentation says "always out of bounds"), we
 should minimize
 the
 false positive warnings in -Warray-bounds=1.
 
 The root reason for the false positive warnings
 reported in
 PR109071 is:
 
 When the thread jump optimization tries to reduce
 the # of
 branches
 inside the routine, sometimes it needs to duplicate
 the code
 and
 split into two conditional pathes. for example:
 
 The original code:
 
 void sparx5_set (int * ptr, struct nums * sg, int
 index)
 {
 if (index >= 4)
 warn ();
 *ptr = 0;
 *val = sg->vals[index];
 if (index >= 4)
 warn ();
 *ptr = *val;
 
 return;
 }
 
 With the thread jump, the above becomes:
 
 void sparx5_set (int * ptr, struct nums * sg, int
 index)
 {
 if (index >= 4)
 {
 warn ();
 *ptr = 0; // Code duplications since
 "warn" does
 return;
 *val = sg->vals[index];   // same this line.
   // In this path,
 since it's
 under
 the condition
   // "index >= 4", the
 compiler
 knows
 the value
   // of "index" is
 larger then 4,
 therefore the
   // out-of-bound
 warning.
 warn ();
 }
 else
 {
 *ptr = 0;
 *val = sg->vals[index];
 }
 *ptr = *val;
 return;
 }
 
 We can see, after the thread jump optimization, the
 # of
 branches
 inside
 the routine "sparx5_set" is reduced from 2 to 1,
 however,  due
 to
 the
 code duplication (which is needed for the
 correctness of the
 code),
 we
 got a false positive out-of-bound warning.
 
 In order to eliminate such false 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-05 Thread Richard Biener



> Am 05.06.2024 um 18:39 schrieb Qing Zhao :
> 
> 
> 
>> On Jun 5, 2024, at 03:26, Richard Biener  wrote:
>> 
>>> On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao  wrote:
>>> 
>>> 
>>> 
 On Jun 4, 2024, at 03:43, Richard Biener  
 wrote:
 
 On Mon, Jun 3, 2024 at 4:48 PM David Malcolm  wrote:
> 
> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote:
>> On Fri, May 31, 2024 at 11:23 PM Qing Zhao 
>> wrote:
>>> 
>>> 
>>> 
 On May 23, 2024, at 07:46, Richard Biener
  wrote:
 
 On Wed, May 22, 2024 at 8:53 PM Qing Zhao 
 wrote:
> 
> 
> 
>> On May 22, 2024, at 03:38, Richard Biener
>>  wrote:
>> 
>> On Tue, May 21, 2024 at 11:36 PM David Malcolm
>>  wrote:
>>> 
>>> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
 Thanks for the comments and suggestions.
 
> On May 15, 2024, at 10:00, David Malcolm
> 
> wrote:
> 
> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener
> wrote:
>> On Mon, 13 May 2024, Qing Zhao wrote:
>> 
>>> -Warray-bounds is an important option to enable
>>> linux kernal to
>>> keep
>>> the array out-of-bound errors out of the source
>>> tree.
>>> 
>>> However, due to the false positive warnings
>>> reported in
>>> PR109071
>>> (-Warray-bounds false positive warnings due to code
>>> duplication
>>> from
>>> jump threading), -Warray-bounds=1 cannot be added
>>> on by
>>> default.
>>> 
>>> Although it's impossible to elinimate all the false
>>> positive
>>> warnings
>>> from -Warray-bounds=1 (See PR104355 Misleading -
>>> Warray-bounds
>>> documentation says "always out of bounds"), we
>>> should minimize
>>> the
>>> false positive warnings in -Warray-bounds=1.
>>> 
>>> The root reason for the false positive warnings
>>> reported in
>>> PR109071 is:
>>> 
>>> When the thread jump optimization tries to reduce
>>> the # of
>>> branches
>>> inside the routine, sometimes it needs to duplicate
>>> the code
>>> and
>>> split into two conditional pathes. for example:
>>> 
>>> The original code:
>>> 
>>> void sparx5_set (int * ptr, struct nums * sg, int
>>> index)
>>> {
>>> if (index >= 4)
>>> warn ();
>>> *ptr = 0;
>>> *val = sg->vals[index];
>>> if (index >= 4)
>>> warn ();
>>> *ptr = *val;
>>> 
>>> return;
>>> }
>>> 
>>> With the thread jump, the above becomes:
>>> 
>>> void sparx5_set (int * ptr, struct nums * sg, int
>>> index)
>>> {
>>> if (index >= 4)
>>> {
>>>  warn ();
>>>  *ptr = 0; // Code duplications since
>>> "warn" does
>>> return;
>>>  *val = sg->vals[index];   // same this line.
>>>// In this path,
>>> since it's
>>> under
>>> the condition
>>>// "index >= 4", the
>>> compiler
>>> knows
>>> the value
>>>// of "index" is
>>> larger then 4,
>>> therefore the
>>>// out-of-bound
>>> warning.
>>>  warn ();
>>> }
>>> else
>>> {
>>>  *ptr = 0;
>>>  *val = sg->vals[index];
>>> }
>>> *ptr = *val;
>>> return;
>>> }
>>> 
>>> We can see, after the thread jump optimization, the
>>> # of
>>> branches
>>> inside
>>> the routine "sparx5_set" is reduced from 2 to 1,
>>> however,  due
>>> to
>>> the
>>> code duplication (which is needed for the
>>> correctness of the
>>> code),
>>> we
>>> got a false positive out-of-bound warning.
>>> 
>>> In order to eliminate such false positive out-of-
>>> bound warning,
>>> 
>>> A. Add one more flag for GIMPLE: is_splitted.
>>> B. During the thread jump optimization, when the

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-05 Thread Qing Zhao


> On Jun 5, 2024, at 03:26, Richard Biener  wrote:
> 
> On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Jun 4, 2024, at 03:43, Richard Biener  wrote:
>>> 
>>> On Mon, Jun 3, 2024 at 4:48 PM David Malcolm  wrote:
 
 On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote:
> On Fri, May 31, 2024 at 11:23 PM Qing Zhao 
> wrote:
>> 
>> 
>> 
>>> On May 23, 2024, at 07:46, Richard Biener
>>>  wrote:
>>> 
>>> On Wed, May 22, 2024 at 8:53 PM Qing Zhao 
>>> wrote:
 
 
 
> On May 22, 2024, at 03:38, Richard Biener
>  wrote:
> 
> On Tue, May 21, 2024 at 11:36 PM David Malcolm
>  wrote:
>> 
>> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
>>> Thanks for the comments and suggestions.
>>> 
 On May 15, 2024, at 10:00, David Malcolm
 
 wrote:
 
 On Tue, 2024-05-14 at 15:08 +0200, Richard Biener
 wrote:
> On Mon, 13 May 2024, Qing Zhao wrote:
> 
>> -Warray-bounds is an important option to enable
>> linux kernal to
>> keep
>> the array out-of-bound errors out of the source
>> tree.
>> 
>> However, due to the false positive warnings
>> reported in
>> PR109071
>> (-Warray-bounds false positive warnings due to code
>> duplication
>> from
>> jump threading), -Warray-bounds=1 cannot be added
>> on by
>> default.
>> 
>> Although it's impossible to elinimate all the false
>> positive
>> warnings
>> from -Warray-bounds=1 (See PR104355 Misleading -
>> Warray-bounds
>> documentation says "always out of bounds"), we
>> should minimize
>> the
>> false positive warnings in -Warray-bounds=1.
>> 
>> The root reason for the false positive warnings
>> reported in
>> PR109071 is:
>> 
>> When the thread jump optimization tries to reduce
>> the # of
>> branches
>> inside the routine, sometimes it needs to duplicate
>> the code
>> and
>> split into two conditional pathes. for example:
>> 
>> The original code:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int
>> index)
>> {
>> if (index >= 4)
>> warn ();
>> *ptr = 0;
>> *val = sg->vals[index];
>> if (index >= 4)
>> warn ();
>> *ptr = *val;
>> 
>> return;
>> }
>> 
>> With the thread jump, the above becomes:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int
>> index)
>> {
>> if (index >= 4)
>> {
>>   warn ();
>>   *ptr = 0; // Code duplications since
>> "warn" does
>> return;
>>   *val = sg->vals[index];   // same this line.
>> // In this path,
>> since it's
>> under
>> the condition
>> // "index >= 4", the
>> compiler
>> knows
>> the value
>> // of "index" is
>> larger then 4,
>> therefore the
>> // out-of-bound
>> warning.
>>   warn ();
>> }
>> else
>> {
>>   *ptr = 0;
>>   *val = sg->vals[index];
>> }
>> *ptr = *val;
>> return;
>> }
>> 
>> We can see, after the thread jump optimization, the
>> # of
>> branches
>> inside
>> the routine "sparx5_set" is reduced from 2 to 1,
>> however,  due
>> to
>> the
>> code duplication (which is needed for the
>> correctness of the
>> code),
>> we
>> got a false positive out-of-bound warning.
>> 
>> In order to eliminate such false positive out-of-
>> bound warning,
>> 
>> A. Add one more flag for GIMPLE: is_splitted.
>> B. During the thread jump optimization, when the
>> basic blocks
>> are
>> duplicated, mark all the STMTs inside the original
>> and
>> duplicated
>> basic blocks as "is_splitted";

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-05 Thread Richard Biener
On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao  wrote:
>
>
>
> > On Jun 4, 2024, at 03:43, Richard Biener  wrote:
> >
> > On Mon, Jun 3, 2024 at 4:48 PM David Malcolm  wrote:
> >>
> >> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote:
> >>> On Fri, May 31, 2024 at 11:23 PM Qing Zhao 
> >>> wrote:
> 
> 
> 
> > On May 23, 2024, at 07:46, Richard Biener
> >  wrote:
> >
> > On Wed, May 22, 2024 at 8:53 PM Qing Zhao 
> > wrote:
> >>
> >>
> >>
> >>> On May 22, 2024, at 03:38, Richard Biener
> >>>  wrote:
> >>>
> >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm
> >>>  wrote:
> 
>  On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> > Thanks for the comments and suggestions.
> >
> >> On May 15, 2024, at 10:00, David Malcolm
> >> 
> >> wrote:
> >>
> >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener
> >> wrote:
> >>> On Mon, 13 May 2024, Qing Zhao wrote:
> >>>
>  -Warray-bounds is an important option to enable
>  linux kernal to
>  keep
>  the array out-of-bound errors out of the source
>  tree.
> 
>  However, due to the false positive warnings
>  reported in
>  PR109071
>  (-Warray-bounds false positive warnings due to code
>  duplication
>  from
>  jump threading), -Warray-bounds=1 cannot be added
>  on by
>  default.
> 
>  Although it's impossible to elinimate all the false
>  positive
>  warnings
>  from -Warray-bounds=1 (See PR104355 Misleading -
>  Warray-bounds
>  documentation says "always out of bounds"), we
>  should minimize
>  the
>  false positive warnings in -Warray-bounds=1.
> 
>  The root reason for the false positive warnings
>  reported in
>  PR109071 is:
> 
>  When the thread jump optimization tries to reduce
>  the # of
>  branches
>  inside the routine, sometimes it needs to duplicate
>  the code
>  and
>  split into two conditional pathes. for example:
> 
>  The original code:
> 
>  void sparx5_set (int * ptr, struct nums * sg, int
>  index)
>  {
>  if (index >= 4)
>   warn ();
>  *ptr = 0;
>  *val = sg->vals[index];
>  if (index >= 4)
>   warn ();
>  *ptr = *val;
> 
>  return;
>  }
> 
>  With the thread jump, the above becomes:
> 
>  void sparx5_set (int * ptr, struct nums * sg, int
>  index)
>  {
>  if (index >= 4)
>   {
> warn ();
> *ptr = 0; // Code duplications since
>  "warn" does
>  return;
> *val = sg->vals[index];   // same this line.
>   // In this path,
>  since it's
>  under
>  the condition
>   // "index >= 4", the
>  compiler
>  knows
>  the value
>   // of "index" is
>  larger then 4,
>  therefore the
>   // out-of-bound
>  warning.
> warn ();
>   }
>  else
>   {
> *ptr = 0;
> *val = sg->vals[index];
>   }
>  *ptr = *val;
>  return;
>  }
> 
>  We can see, after the thread jump optimization, the
>  # of
>  branches
>  inside
>  the routine "sparx5_set" is reduced from 2 to 1,
>  however,  due
>  to
>  the
>  code duplication (which is needed for the
>  correctness of the
>  code),
>  we
>  got a false positive out-of-bound warning.
> 
>  In order to eliminate such false positive out-of-
>  bound warning,
> 
>  A. Add one more flag for GIMPLE: is_splitted.
>  B. During the thread jump optimization, when the
>  basic blocks
>  are
>  duplicated, mark all the STMTs inside the original
>  and
>  duplicated
>  basic blocks as "is_splitted";
>  C. Inside the array bound checker, add the
>  following 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-04 Thread Qing Zhao


> On Jun 4, 2024, at 03:43, Richard Biener  wrote:
> 
> On Mon, Jun 3, 2024 at 4:48 PM David Malcolm  wrote:
>> 
>> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote:
>>> On Fri, May 31, 2024 at 11:23 PM Qing Zhao 
>>> wrote:
 
 
 
> On May 23, 2024, at 07:46, Richard Biener
>  wrote:
> 
> On Wed, May 22, 2024 at 8:53 PM Qing Zhao 
> wrote:
>> 
>> 
>> 
>>> On May 22, 2024, at 03:38, Richard Biener
>>>  wrote:
>>> 
>>> On Tue, May 21, 2024 at 11:36 PM David Malcolm
>>>  wrote:
 
 On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> Thanks for the comments and suggestions.
> 
>> On May 15, 2024, at 10:00, David Malcolm
>> 
>> wrote:
>> 
>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener
>> wrote:
>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>> 
 -Warray-bounds is an important option to enable
 linux kernal to
 keep
 the array out-of-bound errors out of the source
 tree.
 
 However, due to the false positive warnings
 reported in
 PR109071
 (-Warray-bounds false positive warnings due to code
 duplication
 from
 jump threading), -Warray-bounds=1 cannot be added
 on by
 default.
 
 Although it's impossible to elinimate all the false
 positive
 warnings
 from -Warray-bounds=1 (See PR104355 Misleading -
 Warray-bounds
 documentation says "always out of bounds"), we
 should minimize
 the
 false positive warnings in -Warray-bounds=1.
 
 The root reason for the false positive warnings
 reported in
 PR109071 is:
 
 When the thread jump optimization tries to reduce
 the # of
 branches
 inside the routine, sometimes it needs to duplicate
 the code
 and
 split into two conditional pathes. for example:
 
 The original code:
 
 void sparx5_set (int * ptr, struct nums * sg, int
 index)
 {
 if (index >= 4)
  warn ();
 *ptr = 0;
 *val = sg->vals[index];
 if (index >= 4)
  warn ();
 *ptr = *val;
 
 return;
 }
 
 With the thread jump, the above becomes:
 
 void sparx5_set (int * ptr, struct nums * sg, int
 index)
 {
 if (index >= 4)
  {
warn ();
*ptr = 0; // Code duplications since
 "warn" does
 return;
*val = sg->vals[index];   // same this line.
  // In this path,
 since it's
 under
 the condition
  // "index >= 4", the
 compiler
 knows
 the value
  // of "index" is
 larger then 4,
 therefore the
  // out-of-bound
 warning.
warn ();
  }
 else
  {
*ptr = 0;
*val = sg->vals[index];
  }
 *ptr = *val;
 return;
 }
 
 We can see, after the thread jump optimization, the
 # of
 branches
 inside
 the routine "sparx5_set" is reduced from 2 to 1,
 however,  due
 to
 the
 code duplication (which is needed for the
 correctness of the
 code),
 we
 got a false positive out-of-bound warning.
 
 In order to eliminate such false positive out-of-
 bound warning,
 
 A. Add one more flag for GIMPLE: is_splitted.
 B. During the thread jump optimization, when the
 basic blocks
 are
 duplicated, mark all the STMTs inside the original
 and
 duplicated
 basic blocks as "is_splitted";
 C. Inside the array bound checker, add the
 following new
 heuristic:
 
 If
 1. the stmt is duplicated and splitted into two
 conditional
 paths;
 +  2. the warning level < 2;
 +  3. the current block is not dominating the exit
 block
 Then not report the warning.

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-04 Thread Richard Biener
On Mon, Jun 3, 2024 at 4:48 PM David Malcolm  wrote:
>
> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote:
> > On Fri, May 31, 2024 at 11:23 PM Qing Zhao 
> > wrote:
> > >
> > >
> > >
> > > > On May 23, 2024, at 07:46, Richard Biener
> > > >  wrote:
> > > >
> > > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao 
> > > > wrote:
> > > > >
> > > > >
> > > > >
> > > > > > On May 22, 2024, at 03:38, Richard Biener
> > > > > >  wrote:
> > > > > >
> > > > > > On Tue, May 21, 2024 at 11:36 PM David Malcolm
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> > > > > > > > Thanks for the comments and suggestions.
> > > > > > > >
> > > > > > > > > On May 15, 2024, at 10:00, David Malcolm
> > > > > > > > > 
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, 13 May 2024, Qing Zhao wrote:
> > > > > > > > > >
> > > > > > > > > > > -Warray-bounds is an important option to enable
> > > > > > > > > > > linux kernal to
> > > > > > > > > > > keep
> > > > > > > > > > > the array out-of-bound errors out of the source
> > > > > > > > > > > tree.
> > > > > > > > > > >
> > > > > > > > > > > However, due to the false positive warnings
> > > > > > > > > > > reported in
> > > > > > > > > > > PR109071
> > > > > > > > > > > (-Warray-bounds false positive warnings due to code
> > > > > > > > > > > duplication
> > > > > > > > > > > from
> > > > > > > > > > > jump threading), -Warray-bounds=1 cannot be added
> > > > > > > > > > > on by
> > > > > > > > > > > default.
> > > > > > > > > > >
> > > > > > > > > > > Although it's impossible to elinimate all the false
> > > > > > > > > > > positive
> > > > > > > > > > > warnings
> > > > > > > > > > > from -Warray-bounds=1 (See PR104355 Misleading -
> > > > > > > > > > > Warray-bounds
> > > > > > > > > > > documentation says "always out of bounds"), we
> > > > > > > > > > > should minimize
> > > > > > > > > > > the
> > > > > > > > > > > false positive warnings in -Warray-bounds=1.
> > > > > > > > > > >
> > > > > > > > > > > The root reason for the false positive warnings
> > > > > > > > > > > reported in
> > > > > > > > > > > PR109071 is:
> > > > > > > > > > >
> > > > > > > > > > > When the thread jump optimization tries to reduce
> > > > > > > > > > > the # of
> > > > > > > > > > > branches
> > > > > > > > > > > inside the routine, sometimes it needs to duplicate
> > > > > > > > > > > the code
> > > > > > > > > > > and
> > > > > > > > > > > split into two conditional pathes. for example:
> > > > > > > > > > >
> > > > > > > > > > > The original code:
> > > > > > > > > > >
> > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int
> > > > > > > > > > > index)
> > > > > > > > > > > {
> > > > > > > > > > > if (index >= 4)
> > > > > > > > > > >   warn ();
> > > > > > > > > > > *ptr = 0;
> > > > > > > > > > > *val = sg->vals[index];
> > > > > > > > > > > if (index >= 4)
> > > > > > > > > > >   warn ();
> > > > > > > > > > > *ptr = *val;
> > > > > > > > > > >
> > > > > > > > > > > return;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > With the thread jump, the above becomes:
> > > > > > > > > > >
> > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int
> > > > > > > > > > > index)
> > > > > > > > > > > {
> > > > > > > > > > > if (index >= 4)
> > > > > > > > > > >   {
> > > > > > > > > > > warn ();
> > > > > > > > > > > *ptr = 0; // Code duplications since
> > > > > > > > > > > "warn" does
> > > > > > > > > > > return;
> > > > > > > > > > > *val = sg->vals[index];   // same this line.
> > > > > > > > > > >   // In this path,
> > > > > > > > > > > since it's
> > > > > > > > > > > under
> > > > > > > > > > > the condition
> > > > > > > > > > >   // "index >= 4", the
> > > > > > > > > > > compiler
> > > > > > > > > > > knows
> > > > > > > > > > > the value
> > > > > > > > > > >   // of "index" is
> > > > > > > > > > > larger then 4,
> > > > > > > > > > > therefore the
> > > > > > > > > > >   // out-of-bound
> > > > > > > > > > > warning.
> > > > > > > > > > > warn ();
> > > > > > > > > > >   }
> > > > > > > > > > > else
> > > > > > > > > > >   {
> > > > > > > > > > > *ptr = 0;
> > > > > > > > > > > *val = sg->vals[index];
> > > > > > > > > > >   }
> > > > > > > > > > > *ptr = *val;
> > > > > > > > > > > return;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > We can see, after the thread jump optimization, the
> > > > > > > > > > > # of
> > > > > > > > > > > branches
> > > > > > > > > > > inside
> > > > > > > > > > > the routine "sparx5_set" is reduced from 2 to 1,
> > > > > > > > > > > however,  due
> > > > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > code duplication (which is needed for the
> > > > > > > > > > > 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-03 Thread David Malcolm
On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote:
> On Fri, May 31, 2024 at 11:23 PM Qing Zhao 
> wrote:
> > 
> > 
> > 
> > > On May 23, 2024, at 07:46, Richard Biener
> > >  wrote:
> > > 
> > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao 
> > > wrote:
> > > > 
> > > > 
> > > > 
> > > > > On May 22, 2024, at 03:38, Richard Biener
> > > > >  wrote:
> > > > > 
> > > > > On Tue, May 21, 2024 at 11:36 PM David Malcolm
> > > > >  wrote:
> > > > > > 
> > > > > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> > > > > > > Thanks for the comments and suggestions.
> > > > > > > 
> > > > > > > > On May 15, 2024, at 10:00, David Malcolm
> > > > > > > > 
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener
> > > > > > > > wrote:
> > > > > > > > > On Mon, 13 May 2024, Qing Zhao wrote:
> > > > > > > > > 
> > > > > > > > > > -Warray-bounds is an important option to enable
> > > > > > > > > > linux kernal to
> > > > > > > > > > keep
> > > > > > > > > > the array out-of-bound errors out of the source
> > > > > > > > > > tree.
> > > > > > > > > > 
> > > > > > > > > > However, due to the false positive warnings
> > > > > > > > > > reported in
> > > > > > > > > > PR109071
> > > > > > > > > > (-Warray-bounds false positive warnings due to code
> > > > > > > > > > duplication
> > > > > > > > > > from
> > > > > > > > > > jump threading), -Warray-bounds=1 cannot be added
> > > > > > > > > > on by
> > > > > > > > > > default.
> > > > > > > > > > 
> > > > > > > > > > Although it's impossible to elinimate all the false
> > > > > > > > > > positive
> > > > > > > > > > warnings
> > > > > > > > > > from -Warray-bounds=1 (See PR104355 Misleading -
> > > > > > > > > > Warray-bounds
> > > > > > > > > > documentation says "always out of bounds"), we
> > > > > > > > > > should minimize
> > > > > > > > > > the
> > > > > > > > > > false positive warnings in -Warray-bounds=1.
> > > > > > > > > > 
> > > > > > > > > > The root reason for the false positive warnings
> > > > > > > > > > reported in
> > > > > > > > > > PR109071 is:
> > > > > > > > > > 
> > > > > > > > > > When the thread jump optimization tries to reduce
> > > > > > > > > > the # of
> > > > > > > > > > branches
> > > > > > > > > > inside the routine, sometimes it needs to duplicate
> > > > > > > > > > the code
> > > > > > > > > > and
> > > > > > > > > > split into two conditional pathes. for example:
> > > > > > > > > > 
> > > > > > > > > > The original code:
> > > > > > > > > > 
> > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int
> > > > > > > > > > index)
> > > > > > > > > > {
> > > > > > > > > > if (index >= 4)
> > > > > > > > > >   warn ();
> > > > > > > > > > *ptr = 0;
> > > > > > > > > > *val = sg->vals[index];
> > > > > > > > > > if (index >= 4)
> > > > > > > > > >   warn ();
> > > > > > > > > > *ptr = *val;
> > > > > > > > > > 
> > > > > > > > > > return;
> > > > > > > > > > }
> > > > > > > > > > 
> > > > > > > > > > With the thread jump, the above becomes:
> > > > > > > > > > 
> > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int
> > > > > > > > > > index)
> > > > > > > > > > {
> > > > > > > > > > if (index >= 4)
> > > > > > > > > >   {
> > > > > > > > > >     warn ();
> > > > > > > > > >     *ptr = 0; // Code duplications since
> > > > > > > > > > "warn" does
> > > > > > > > > > return;
> > > > > > > > > >     *val = sg->vals[index];   // same this line.
> > > > > > > > > >   // In this path,
> > > > > > > > > > since it's
> > > > > > > > > > under
> > > > > > > > > > the condition
> > > > > > > > > >   // "index >= 4", the
> > > > > > > > > > compiler
> > > > > > > > > > knows
> > > > > > > > > > the value
> > > > > > > > > >   // of "index" is
> > > > > > > > > > larger then 4,
> > > > > > > > > > therefore the
> > > > > > > > > >   // out-of-bound
> > > > > > > > > > warning.
> > > > > > > > > >     warn ();
> > > > > > > > > >   }
> > > > > > > > > > else
> > > > > > > > > >   {
> > > > > > > > > >     *ptr = 0;
> > > > > > > > > >     *val = sg->vals[index];
> > > > > > > > > >   }
> > > > > > > > > > *ptr = *val;
> > > > > > > > > > return;
> > > > > > > > > > }
> > > > > > > > > > 
> > > > > > > > > > We can see, after the thread jump optimization, the
> > > > > > > > > > # of
> > > > > > > > > > branches
> > > > > > > > > > inside
> > > > > > > > > > the routine "sparx5_set" is reduced from 2 to 1,
> > > > > > > > > > however,  due
> > > > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > code duplication (which is needed for the
> > > > > > > > > > correctness of the
> > > > > > > > > > code),
> > > > > > > > > > we
> > > > > > > > > > got a false positive out-of-bound warning.
> > > > > > > > > > 
> > > > > > > > > > In order to eliminate such false positive out-of-
> > > > > > > > > > bound warning,
> > > > > > > > > > 
> > > > > > > > > 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-03 Thread Qing Zhao


> On Jun 3, 2024, at 02:29, Richard Biener  wrote:
> 
> On Fri, May 31, 2024 at 11:23 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On May 23, 2024, at 07:46, Richard Biener  
>>> wrote:
>>> 
>>> On Wed, May 22, 2024 at 8:53 PM Qing Zhao  wrote:
 
 
 
> On May 22, 2024, at 03:38, Richard Biener  
> wrote:
> 
> On Tue, May 21, 2024 at 11:36 PM David Malcolm  
> wrote:
>> 
>> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
>>> Thanks for the comments and suggestions.
>>> 
 On May 15, 2024, at 10:00, David Malcolm 
 wrote:
 
 On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> On Mon, 13 May 2024, Qing Zhao wrote:
> 
>> -Warray-bounds is an important option to enable linux kernal to
>> keep
>> the array out-of-bound errors out of the source tree.
>> 
>> However, due to the false positive warnings reported in
>> PR109071
>> (-Warray-bounds false positive warnings due to code duplication
>> from
>> jump threading), -Warray-bounds=1 cannot be added on by
>> default.
>> 
>> Although it's impossible to elinimate all the false positive
>> warnings
>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>> documentation says "always out of bounds"), we should minimize
>> the
>> false positive warnings in -Warray-bounds=1.
>> 
>> The root reason for the false positive warnings reported in
>> PR109071 is:
>> 
>> When the thread jump optimization tries to reduce the # of
>> branches
>> inside the routine, sometimes it needs to duplicate the code
>> and
>> split into two conditional pathes. for example:
>> 
>> The original code:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int index)
>> {
>> if (index >= 4)
>>  warn ();
>> *ptr = 0;
>> *val = sg->vals[index];
>> if (index >= 4)
>>  warn ();
>> *ptr = *val;
>> 
>> return;
>> }
>> 
>> With the thread jump, the above becomes:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int index)
>> {
>> if (index >= 4)
>>  {
>>warn ();
>>*ptr = 0; // Code duplications since "warn" does
>> return;
>>*val = sg->vals[index];   // same this line.
>>  // In this path, since it's
>> under
>> the condition
>>  // "index >= 4", the compiler
>> knows
>> the value
>>  // of "index" is larger then 4,
>> therefore the
>>  // out-of-bound warning.
>>warn ();
>>  }
>> else
>>  {
>>*ptr = 0;
>>*val = sg->vals[index];
>>  }
>> *ptr = *val;
>> return;
>> }
>> 
>> We can see, after the thread jump optimization, the # of
>> branches
>> inside
>> the routine "sparx5_set" is reduced from 2 to 1, however,  due
>> to
>> the
>> code duplication (which is needed for the correctness of the
>> code),
>> we
>> got a false positive out-of-bound warning.
>> 
>> In order to eliminate such false positive out-of-bound warning,
>> 
>> A. Add one more flag for GIMPLE: is_splitted.
>> B. During the thread jump optimization, when the basic blocks
>> are
>> duplicated, mark all the STMTs inside the original and
>> duplicated
>> basic blocks as "is_splitted";
>> C. Inside the array bound checker, add the following new
>> heuristic:
>> 
>> If
>> 1. the stmt is duplicated and splitted into two conditional
>> paths;
>> +  2. the warning level < 2;
>> +  3. the current block is not dominating the exit block
>> Then not report the warning.
>> 
>> The false positive warnings are moved from -Warray-bounds=1 to
>> -Warray-bounds=2 now.
>> 
>> Bootstrapped and regression tested on both x86 and aarch64.
>> adjusted
>> -Warray-bounds-61.c due to the false positive warnings.
>> 
>> Let me know if you have any comments and suggestions.
> 
> At the last Cauldron I talked with David Malcolm about these kind
> of
> issues and thought of instead of suppressing diagnostics to
> record
> how a block was duplicated.  For jump threading my idea was to
> record
> the condition that was proved true when entering the path and do
> this
> by recording 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-06-03 Thread Richard Biener
On Fri, May 31, 2024 at 11:23 PM Qing Zhao  wrote:
>
>
>
> > On May 23, 2024, at 07:46, Richard Biener  
> > wrote:
> >
> > On Wed, May 22, 2024 at 8:53 PM Qing Zhao  wrote:
> >>
> >>
> >>
> >>> On May 22, 2024, at 03:38, Richard Biener  
> >>> wrote:
> >>>
> >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm  
> >>> wrote:
> 
>  On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> > Thanks for the comments and suggestions.
> >
> >> On May 15, 2024, at 10:00, David Malcolm 
> >> wrote:
> >>
> >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> >>> On Mon, 13 May 2024, Qing Zhao wrote:
> >>>
>  -Warray-bounds is an important option to enable linux kernal to
>  keep
>  the array out-of-bound errors out of the source tree.
> 
>  However, due to the false positive warnings reported in
>  PR109071
>  (-Warray-bounds false positive warnings due to code duplication
>  from
>  jump threading), -Warray-bounds=1 cannot be added on by
>  default.
> 
>  Although it's impossible to elinimate all the false positive
>  warnings
>  from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>  documentation says "always out of bounds"), we should minimize
>  the
>  false positive warnings in -Warray-bounds=1.
> 
>  The root reason for the false positive warnings reported in
>  PR109071 is:
> 
>  When the thread jump optimization tries to reduce the # of
>  branches
>  inside the routine, sometimes it needs to duplicate the code
>  and
>  split into two conditional pathes. for example:
> 
>  The original code:
> 
>  void sparx5_set (int * ptr, struct nums * sg, int index)
>  {
>  if (index >= 4)
>    warn ();
>  *ptr = 0;
>  *val = sg->vals[index];
>  if (index >= 4)
>    warn ();
>  *ptr = *val;
> 
>  return;
>  }
> 
>  With the thread jump, the above becomes:
> 
>  void sparx5_set (int * ptr, struct nums * sg, int index)
>  {
>  if (index >= 4)
>    {
>  warn ();
>  *ptr = 0; // Code duplications since "warn" does
>  return;
>  *val = sg->vals[index];   // same this line.
>    // In this path, since it's
>  under
>  the condition
>    // "index >= 4", the compiler
>  knows
>  the value
>    // of "index" is larger then 4,
>  therefore the
>    // out-of-bound warning.
>  warn ();
>    }
>  else
>    {
>  *ptr = 0;
>  *val = sg->vals[index];
>    }
>  *ptr = *val;
>  return;
>  }
> 
>  We can see, after the thread jump optimization, the # of
>  branches
>  inside
>  the routine "sparx5_set" is reduced from 2 to 1, however,  due
>  to
>  the
>  code duplication (which is needed for the correctness of the
>  code),
>  we
>  got a false positive out-of-bound warning.
> 
>  In order to eliminate such false positive out-of-bound warning,
> 
>  A. Add one more flag for GIMPLE: is_splitted.
>  B. During the thread jump optimization, when the basic blocks
>  are
>   duplicated, mark all the STMTs inside the original and
>  duplicated
>   basic blocks as "is_splitted";
>  C. Inside the array bound checker, add the following new
>  heuristic:
> 
>  If
>   1. the stmt is duplicated and splitted into two conditional
>  paths;
>  +  2. the warning level < 2;
>  +  3. the current block is not dominating the exit block
>  Then not report the warning.
> 
>  The false positive warnings are moved from -Warray-bounds=1 to
>  -Warray-bounds=2 now.
> 
>  Bootstrapped and regression tested on both x86 and aarch64.
>  adjusted
>  -Warray-bounds-61.c due to the false positive warnings.
> 
>  Let me know if you have any comments and suggestions.
> >>>
> >>> At the last Cauldron I talked with David Malcolm about these kind
> >>> of
> >>> issues and thought of instead of suppressing diagnostics to
> >>> record
> >>> how a block was duplicated.  For jump threading my idea was to
> >>> record
> >>> the condition that was proved true when entering the path and do
> >>> this
> >>> by recording the corresponding locations
> >
> > Is only recording the 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-31 Thread Qing Zhao


> On May 23, 2024, at 07:46, Richard Biener  wrote:
> 
> On Wed, May 22, 2024 at 8:53 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On May 22, 2024, at 03:38, Richard Biener  
>>> wrote:
>>> 
>>> On Tue, May 21, 2024 at 11:36 PM David Malcolm  wrote:
 
 On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> Thanks for the comments and suggestions.
> 
>> On May 15, 2024, at 10:00, David Malcolm 
>> wrote:
>> 
>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>> 
 -Warray-bounds is an important option to enable linux kernal to
 keep
 the array out-of-bound errors out of the source tree.
 
 However, due to the false positive warnings reported in
 PR109071
 (-Warray-bounds false positive warnings due to code duplication
 from
 jump threading), -Warray-bounds=1 cannot be added on by
 default.
 
 Although it's impossible to elinimate all the false positive
 warnings
 from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
 documentation says "always out of bounds"), we should minimize
 the
 false positive warnings in -Warray-bounds=1.
 
 The root reason for the false positive warnings reported in
 PR109071 is:
 
 When the thread jump optimization tries to reduce the # of
 branches
 inside the routine, sometimes it needs to duplicate the code
 and
 split into two conditional pathes. for example:
 
 The original code:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   warn ();
 *ptr = 0;
 *val = sg->vals[index];
 if (index >= 4)
   warn ();
 *ptr = *val;
 
 return;
 }
 
 With the thread jump, the above becomes:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   {
 warn ();
 *ptr = 0; // Code duplications since "warn" does
 return;
 *val = sg->vals[index];   // same this line.
   // In this path, since it's
 under
 the condition
   // "index >= 4", the compiler
 knows
 the value
   // of "index" is larger then 4,
 therefore the
   // out-of-bound warning.
 warn ();
   }
 else
   {
 *ptr = 0;
 *val = sg->vals[index];
   }
 *ptr = *val;
 return;
 }
 
 We can see, after the thread jump optimization, the # of
 branches
 inside
 the routine "sparx5_set" is reduced from 2 to 1, however,  due
 to
 the
 code duplication (which is needed for the correctness of the
 code),
 we
 got a false positive out-of-bound warning.
 
 In order to eliminate such false positive out-of-bound warning,
 
 A. Add one more flag for GIMPLE: is_splitted.
 B. During the thread jump optimization, when the basic blocks
 are
  duplicated, mark all the STMTs inside the original and
 duplicated
  basic blocks as "is_splitted";
 C. Inside the array bound checker, add the following new
 heuristic:
 
 If
  1. the stmt is duplicated and splitted into two conditional
 paths;
 +  2. the warning level < 2;
 +  3. the current block is not dominating the exit block
 Then not report the warning.
 
 The false positive warnings are moved from -Warray-bounds=1 to
 -Warray-bounds=2 now.
 
 Bootstrapped and regression tested on both x86 and aarch64.
 adjusted
 -Warray-bounds-61.c due to the false positive warnings.
 
 Let me know if you have any comments and suggestions.
>>> 
>>> At the last Cauldron I talked with David Malcolm about these kind
>>> of
>>> issues and thought of instead of suppressing diagnostics to
>>> record
>>> how a block was duplicated.  For jump threading my idea was to
>>> record
>>> the condition that was proved true when entering the path and do
>>> this
>>> by recording the corresponding locations
> 
> Is only recording the location for the TRUE path  enough?
> We might need to record the corresponding locations for both TRUE and
> FALSE paths since the VRP might be more accurate on both paths.
> Is only recording the location is enough?
> Do we need to record the pointer to the original condition stmt?
 
 Just to be 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread Qing Zhao


> On May 23, 2024, at 10:13, David Malcolm  wrote:
> 
> On Thu, 2024-05-23 at 14:03 +, Qing Zhao wrote:
> 
> [...snip...]
> 
>> Is “location_adhoc_data” an available data structure in current GCC?
>> I just searched GCC source tree, cannot find it.
> 
> It's in libcpp/include/line-table.h; see the big comment at the top of
> that file.

Thanks a lot.

I found it in libcpp/include/line-map.h. reading it right now.

Qing
> 
> Dave
> 



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread David Malcolm
On Thu, 2024-05-23 at 14:03 +, Qing Zhao wrote:

[...snip...]

> Is “location_adhoc_data” an available data structure in current GCC?
> I just searched GCC source tree, cannot find it. 

It's in libcpp/include/line-table.h; see the big comment at the top of
that file.

Dave



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread Qing Zhao


> On May 23, 2024, at 07:46, Richard Biener  wrote:
> 
> On Wed, May 22, 2024 at 8:53 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On May 22, 2024, at 03:38, Richard Biener  
>>> wrote:
>>> 
>>> On Tue, May 21, 2024 at 11:36 PM David Malcolm  wrote:
 
 On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> Thanks for the comments and suggestions.
> 
>> On May 15, 2024, at 10:00, David Malcolm 
>> wrote:
>> 
>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>> 
 -Warray-bounds is an important option to enable linux kernal to
 keep
 the array out-of-bound errors out of the source tree.
 
 However, due to the false positive warnings reported in
 PR109071
 (-Warray-bounds false positive warnings due to code duplication
 from
 jump threading), -Warray-bounds=1 cannot be added on by
 default.
 
 Although it's impossible to elinimate all the false positive
 warnings
 from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
 documentation says "always out of bounds"), we should minimize
 the
 false positive warnings in -Warray-bounds=1.
 
 The root reason for the false positive warnings reported in
 PR109071 is:
 
 When the thread jump optimization tries to reduce the # of
 branches
 inside the routine, sometimes it needs to duplicate the code
 and
 split into two conditional pathes. for example:
 
 The original code:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   warn ();
 *ptr = 0;
 *val = sg->vals[index];
 if (index >= 4)
   warn ();
 *ptr = *val;
 
 return;
 }
 
 With the thread jump, the above becomes:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   {
 warn ();
 *ptr = 0; // Code duplications since "warn" does
 return;
 *val = sg->vals[index];   // same this line.
   // In this path, since it's
 under
 the condition
   // "index >= 4", the compiler
 knows
 the value
   // of "index" is larger then 4,
 therefore the
   // out-of-bound warning.
 warn ();
   }
 else
   {
 *ptr = 0;
 *val = sg->vals[index];
   }
 *ptr = *val;
 return;
 }
 
 We can see, after the thread jump optimization, the # of
 branches
 inside
 the routine "sparx5_set" is reduced from 2 to 1, however,  due
 to
 the
 code duplication (which is needed for the correctness of the
 code),
 we
 got a false positive out-of-bound warning.
 
 In order to eliminate such false positive out-of-bound warning,
 
 A. Add one more flag for GIMPLE: is_splitted.
 B. During the thread jump optimization, when the basic blocks
 are
  duplicated, mark all the STMTs inside the original and
 duplicated
  basic blocks as "is_splitted";
 C. Inside the array bound checker, add the following new
 heuristic:
 
 If
  1. the stmt is duplicated and splitted into two conditional
 paths;
 +  2. the warning level < 2;
 +  3. the current block is not dominating the exit block
 Then not report the warning.
 
 The false positive warnings are moved from -Warray-bounds=1 to
 -Warray-bounds=2 now.
 
 Bootstrapped and regression tested on both x86 and aarch64.
 adjusted
 -Warray-bounds-61.c due to the false positive warnings.
 
 Let me know if you have any comments and suggestions.
>>> 
>>> At the last Cauldron I talked with David Malcolm about these kind
>>> of
>>> issues and thought of instead of suppressing diagnostics to
>>> record
>>> how a block was duplicated.  For jump threading my idea was to
>>> record
>>> the condition that was proved true when entering the path and do
>>> this
>>> by recording the corresponding locations
> 
> Is only recording the location for the TRUE path  enough?
> We might need to record the corresponding locations for both TRUE and
> FALSE paths since the VRP might be more accurate on both paths.
> Is only recording the location is enough?
> Do we need to record the pointer to the original condition stmt?
 
 Just to be 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread Richard Biener
On Wed, May 22, 2024 at 8:53 PM Qing Zhao  wrote:
>
>
>
> > On May 22, 2024, at 03:38, Richard Biener  
> > wrote:
> >
> > On Tue, May 21, 2024 at 11:36 PM David Malcolm  wrote:
> >>
> >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> >>> Thanks for the comments and suggestions.
> >>>
>  On May 15, 2024, at 10:00, David Malcolm 
>  wrote:
> 
>  On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> > On Mon, 13 May 2024, Qing Zhao wrote:
> >
> >> -Warray-bounds is an important option to enable linux kernal to
> >> keep
> >> the array out-of-bound errors out of the source tree.
> >>
> >> However, due to the false positive warnings reported in
> >> PR109071
> >> (-Warray-bounds false positive warnings due to code duplication
> >> from
> >> jump threading), -Warray-bounds=1 cannot be added on by
> >> default.
> >>
> >> Although it's impossible to elinimate all the false positive
> >> warnings
> >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> >> documentation says "always out of bounds"), we should minimize
> >> the
> >> false positive warnings in -Warray-bounds=1.
> >>
> >> The root reason for the false positive warnings reported in
> >> PR109071 is:
> >>
> >> When the thread jump optimization tries to reduce the # of
> >> branches
> >> inside the routine, sometimes it needs to duplicate the code
> >> and
> >> split into two conditional pathes. for example:
> >>
> >> The original code:
> >>
> >> void sparx5_set (int * ptr, struct nums * sg, int index)
> >> {
> >>  if (index >= 4)
> >>warn ();
> >>  *ptr = 0;
> >>  *val = sg->vals[index];
> >>  if (index >= 4)
> >>warn ();
> >>  *ptr = *val;
> >>
> >>  return;
> >> }
> >>
> >> With the thread jump, the above becomes:
> >>
> >> void sparx5_set (int * ptr, struct nums * sg, int index)
> >> {
> >>  if (index >= 4)
> >>{
> >>  warn ();
> >>  *ptr = 0; // Code duplications since "warn" does
> >> return;
> >>  *val = sg->vals[index];   // same this line.
> >>// In this path, since it's
> >> under
> >> the condition
> >>// "index >= 4", the compiler
> >> knows
> >> the value
> >>// of "index" is larger then 4,
> >> therefore the
> >>// out-of-bound warning.
> >>  warn ();
> >>}
> >>  else
> >>{
> >>  *ptr = 0;
> >>  *val = sg->vals[index];
> >>}
> >>  *ptr = *val;
> >>  return;
> >> }
> >>
> >> We can see, after the thread jump optimization, the # of
> >> branches
> >> inside
> >> the routine "sparx5_set" is reduced from 2 to 1, however,  due
> >> to
> >> the
> >> code duplication (which is needed for the correctness of the
> >> code),
> >> we
> >> got a false positive out-of-bound warning.
> >>
> >> In order to eliminate such false positive out-of-bound warning,
> >>
> >> A. Add one more flag for GIMPLE: is_splitted.
> >> B. During the thread jump optimization, when the basic blocks
> >> are
> >>   duplicated, mark all the STMTs inside the original and
> >> duplicated
> >>   basic blocks as "is_splitted";
> >> C. Inside the array bound checker, add the following new
> >> heuristic:
> >>
> >> If
> >>   1. the stmt is duplicated and splitted into two conditional
> >> paths;
> >> +  2. the warning level < 2;
> >> +  3. the current block is not dominating the exit block
> >> Then not report the warning.
> >>
> >> The false positive warnings are moved from -Warray-bounds=1 to
> >> -Warray-bounds=2 now.
> >>
> >> Bootstrapped and regression tested on both x86 and aarch64.
> >> adjusted
> >> -Warray-bounds-61.c due to the false positive warnings.
> >>
> >> Let me know if you have any comments and suggestions.
> >
> > At the last Cauldron I talked with David Malcolm about these kind
> > of
> > issues and thought of instead of suppressing diagnostics to
> > record
> > how a block was duplicated.  For jump threading my idea was to
> > record
> > the condition that was proved true when entering the path and do
> > this
> > by recording the corresponding locations
> >>>
> >>> Is only recording the location for the TRUE path  enough?
> >>> We might need to record the corresponding locations for both TRUE and
> >>> FALSE paths since the VRP might be more accurate on both paths.
> >>> Is only recording the location is enough?
> >>> Do we need to record the pointer to the original condition stmt?
> >>
> >> Just to be clear: I don't plan to work on this myself (I have far too
> >> 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-22 Thread Qing Zhao


> On May 22, 2024, at 03:38, Richard Biener  wrote:
> 
> On Tue, May 21, 2024 at 11:36 PM David Malcolm  wrote:
>> 
>> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
>>> Thanks for the comments and suggestions.
>>> 
 On May 15, 2024, at 10:00, David Malcolm 
 wrote:
 
 On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> On Mon, 13 May 2024, Qing Zhao wrote:
> 
>> -Warray-bounds is an important option to enable linux kernal to
>> keep
>> the array out-of-bound errors out of the source tree.
>> 
>> However, due to the false positive warnings reported in
>> PR109071
>> (-Warray-bounds false positive warnings due to code duplication
>> from
>> jump threading), -Warray-bounds=1 cannot be added on by
>> default.
>> 
>> Although it's impossible to elinimate all the false positive
>> warnings
>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>> documentation says "always out of bounds"), we should minimize
>> the
>> false positive warnings in -Warray-bounds=1.
>> 
>> The root reason for the false positive warnings reported in
>> PR109071 is:
>> 
>> When the thread jump optimization tries to reduce the # of
>> branches
>> inside the routine, sometimes it needs to duplicate the code
>> and
>> split into two conditional pathes. for example:
>> 
>> The original code:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int index)
>> {
>>  if (index >= 4)
>>warn ();
>>  *ptr = 0;
>>  *val = sg->vals[index];
>>  if (index >= 4)
>>warn ();
>>  *ptr = *val;
>> 
>>  return;
>> }
>> 
>> With the thread jump, the above becomes:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int index)
>> {
>>  if (index >= 4)
>>{
>>  warn ();
>>  *ptr = 0; // Code duplications since "warn" does
>> return;
>>  *val = sg->vals[index];   // same this line.
>>// In this path, since it's
>> under
>> the condition
>>// "index >= 4", the compiler
>> knows
>> the value
>>// of "index" is larger then 4,
>> therefore the
>>// out-of-bound warning.
>>  warn ();
>>}
>>  else
>>{
>>  *ptr = 0;
>>  *val = sg->vals[index];
>>}
>>  *ptr = *val;
>>  return;
>> }
>> 
>> We can see, after the thread jump optimization, the # of
>> branches
>> inside
>> the routine "sparx5_set" is reduced from 2 to 1, however,  due
>> to
>> the
>> code duplication (which is needed for the correctness of the
>> code),
>> we
>> got a false positive out-of-bound warning.
>> 
>> In order to eliminate such false positive out-of-bound warning,
>> 
>> A. Add one more flag for GIMPLE: is_splitted.
>> B. During the thread jump optimization, when the basic blocks
>> are
>>   duplicated, mark all the STMTs inside the original and
>> duplicated
>>   basic blocks as "is_splitted";
>> C. Inside the array bound checker, add the following new
>> heuristic:
>> 
>> If
>>   1. the stmt is duplicated and splitted into two conditional
>> paths;
>> +  2. the warning level < 2;
>> +  3. the current block is not dominating the exit block
>> Then not report the warning.
>> 
>> The false positive warnings are moved from -Warray-bounds=1 to
>> -Warray-bounds=2 now.
>> 
>> Bootstrapped and regression tested on both x86 and aarch64.
>> adjusted
>> -Warray-bounds-61.c due to the false positive warnings.
>> 
>> Let me know if you have any comments and suggestions.
> 
> At the last Cauldron I talked with David Malcolm about these kind
> of
> issues and thought of instead of suppressing diagnostics to
> record
> how a block was duplicated.  For jump threading my idea was to
> record
> the condition that was proved true when entering the path and do
> this
> by recording the corresponding locations
>>> 
>>> Is only recording the location for the TRUE path  enough?
>>> We might need to record the corresponding locations for both TRUE and
>>> FALSE paths since the VRP might be more accurate on both paths.
>>> Is only recording the location is enough?
>>> Do we need to record the pointer to the original condition stmt?
>> 
>> Just to be clear: I don't plan to work on this myself (I have far too
>> much already to work on...); I'm assuming Richard Biener is
>> hoping/planning to implement this himself.
> 
> While I think some of this might be an improvement to those vast
> number of "false positive" diagnostics we have from (too) late diagnostic
> passes I do not have the cycles to work on this.

I can study a 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-22 Thread Richard Biener
On Tue, May 21, 2024 at 11:36 PM David Malcolm  wrote:
>
> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> > Thanks for the comments and suggestions.
> >
> > > On May 15, 2024, at 10:00, David Malcolm 
> > > wrote:
> > >
> > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> > > > On Mon, 13 May 2024, Qing Zhao wrote:
> > > >
> > > > > -Warray-bounds is an important option to enable linux kernal to
> > > > > keep
> > > > > the array out-of-bound errors out of the source tree.
> > > > >
> > > > > However, due to the false positive warnings reported in
> > > > > PR109071
> > > > > (-Warray-bounds false positive warnings due to code duplication
> > > > > from
> > > > > jump threading), -Warray-bounds=1 cannot be added on by
> > > > > default.
> > > > >
> > > > > Although it's impossible to elinimate all the false positive
> > > > > warnings
> > > > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> > > > > documentation says "always out of bounds"), we should minimize
> > > > > the
> > > > > false positive warnings in -Warray-bounds=1.
> > > > >
> > > > > The root reason for the false positive warnings reported in
> > > > > PR109071 is:
> > > > >
> > > > > When the thread jump optimization tries to reduce the # of
> > > > > branches
> > > > > inside the routine, sometimes it needs to duplicate the code
> > > > > and
> > > > > split into two conditional pathes. for example:
> > > > >
> > > > > The original code:
> > > > >
> > > > > void sparx5_set (int * ptr, struct nums * sg, int index)
> > > > > {
> > > > >   if (index >= 4)
> > > > > warn ();
> > > > >   *ptr = 0;
> > > > >   *val = sg->vals[index];
> > > > >   if (index >= 4)
> > > > > warn ();
> > > > >   *ptr = *val;
> > > > >
> > > > >   return;
> > > > > }
> > > > >
> > > > > With the thread jump, the above becomes:
> > > > >
> > > > > void sparx5_set (int * ptr, struct nums * sg, int index)
> > > > > {
> > > > >   if (index >= 4)
> > > > > {
> > > > >   warn ();
> > > > >   *ptr = 0; // Code duplications since "warn" does
> > > > > return;
> > > > >   *val = sg->vals[index];   // same this line.
> > > > > // In this path, since it's
> > > > > under
> > > > > the condition
> > > > > // "index >= 4", the compiler
> > > > > knows
> > > > > the value
> > > > > // of "index" is larger then 4,
> > > > > therefore the
> > > > > // out-of-bound warning.
> > > > >   warn ();
> > > > > }
> > > > >   else
> > > > > {
> > > > >   *ptr = 0;
> > > > >   *val = sg->vals[index];
> > > > > }
> > > > >   *ptr = *val;
> > > > >   return;
> > > > > }
> > > > >
> > > > > We can see, after the thread jump optimization, the # of
> > > > > branches
> > > > > inside
> > > > > the routine "sparx5_set" is reduced from 2 to 1, however,  due
> > > > > to
> > > > > the
> > > > > code duplication (which is needed for the correctness of the
> > > > > code),
> > > > > we
> > > > > got a false positive out-of-bound warning.
> > > > >
> > > > > In order to eliminate such false positive out-of-bound warning,
> > > > >
> > > > > A. Add one more flag for GIMPLE: is_splitted.
> > > > > B. During the thread jump optimization, when the basic blocks
> > > > > are
> > > > >duplicated, mark all the STMTs inside the original and
> > > > > duplicated
> > > > >basic blocks as "is_splitted";
> > > > > C. Inside the array bound checker, add the following new
> > > > > heuristic:
> > > > >
> > > > > If
> > > > >1. the stmt is duplicated and splitted into two conditional
> > > > > paths;
> > > > > +  2. the warning level < 2;
> > > > > +  3. the current block is not dominating the exit block
> > > > > Then not report the warning.
> > > > >
> > > > > The false positive warnings are moved from -Warray-bounds=1 to
> > > > >  -Warray-bounds=2 now.
> > > > >
> > > > > Bootstrapped and regression tested on both x86 and aarch64.
> > > > > adjusted
> > > > >  -Warray-bounds-61.c due to the false positive warnings.
> > > > >
> > > > > Let me know if you have any comments and suggestions.
> > > >
> > > > At the last Cauldron I talked with David Malcolm about these kind
> > > > of
> > > > issues and thought of instead of suppressing diagnostics to
> > > > record
> > > > how a block was duplicated.  For jump threading my idea was to
> > > > record
> > > > the condition that was proved true when entering the path and do
> > > > this
> > > > by recording the corresponding locations
> >
> > Is only recording the location for the TRUE path  enough?
> > We might need to record the corresponding locations for both TRUE and
> > FALSE paths since the VRP might be more accurate on both paths.
> > Is only recording the location is enough?
> > Do we need to record the pointer to the original condition stmt?
>
> Just to be clear: I don't plan to work on this myself (I have far too
> much already to 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-21 Thread David Malcolm
On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> Thanks for the comments and suggestions.
> 
> > On May 15, 2024, at 10:00, David Malcolm 
> > wrote:
> > 
> > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> > > On Mon, 13 May 2024, Qing Zhao wrote:
> > > 
> > > > -Warray-bounds is an important option to enable linux kernal to
> > > > keep
> > > > the array out-of-bound errors out of the source tree.
> > > > 
> > > > However, due to the false positive warnings reported in
> > > > PR109071
> > > > (-Warray-bounds false positive warnings due to code duplication
> > > > from
> > > > jump threading), -Warray-bounds=1 cannot be added on by
> > > > default.
> > > > 
> > > > Although it's impossible to elinimate all the false positive
> > > > warnings
> > > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> > > > documentation says "always out of bounds"), we should minimize
> > > > the
> > > > false positive warnings in -Warray-bounds=1.
> > > > 
> > > > The root reason for the false positive warnings reported in
> > > > PR109071 is:
> > > > 
> > > > When the thread jump optimization tries to reduce the # of
> > > > branches
> > > > inside the routine, sometimes it needs to duplicate the code
> > > > and
> > > > split into two conditional pathes. for example:
> > > > 
> > > > The original code:
> > > > 
> > > > void sparx5_set (int * ptr, struct nums * sg, int index)
> > > > {
> > > >   if (index >= 4)
> > > >     warn ();
> > > >   *ptr = 0;
> > > >   *val = sg->vals[index];
> > > >   if (index >= 4)
> > > >     warn ();
> > > >   *ptr = *val;
> > > > 
> > > >   return;
> > > > }
> > > > 
> > > > With the thread jump, the above becomes:
> > > > 
> > > > void sparx5_set (int * ptr, struct nums * sg, int index)
> > > > {
> > > >   if (index >= 4)
> > > >     {
> > > >   warn ();
> > > >   *ptr = 0; // Code duplications since "warn" does
> > > > return;
> > > >   *val = sg->vals[index];   // same this line.
> > > >     // In this path, since it's
> > > > under
> > > > the condition
> > > >     // "index >= 4", the compiler
> > > > knows
> > > > the value
> > > >     // of "index" is larger then 4,
> > > > therefore the
> > > >     // out-of-bound warning.
> > > >   warn ();
> > > >     }
> > > >   else
> > > >     {
> > > >   *ptr = 0;
> > > >   *val = sg->vals[index];
> > > >     }
> > > >   *ptr = *val;
> > > >   return;
> > > > }
> > > > 
> > > > We can see, after the thread jump optimization, the # of
> > > > branches
> > > > inside
> > > > the routine "sparx5_set" is reduced from 2 to 1, however,  due
> > > > to
> > > > the
> > > > code duplication (which is needed for the correctness of the
> > > > code),
> > > > we
> > > > got a false positive out-of-bound warning.
> > > > 
> > > > In order to eliminate such false positive out-of-bound warning,
> > > > 
> > > > A. Add one more flag for GIMPLE: is_splitted.
> > > > B. During the thread jump optimization, when the basic blocks
> > > > are
> > > >    duplicated, mark all the STMTs inside the original and
> > > > duplicated
> > > >    basic blocks as "is_splitted";
> > > > C. Inside the array bound checker, add the following new
> > > > heuristic:
> > > > 
> > > > If
> > > >    1. the stmt is duplicated and splitted into two conditional
> > > > paths;
> > > > +  2. the warning level < 2;
> > > > +  3. the current block is not dominating the exit block
> > > > Then not report the warning.
> > > > 
> > > > The false positive warnings are moved from -Warray-bounds=1 to
> > > >  -Warray-bounds=2 now.
> > > > 
> > > > Bootstrapped and regression tested on both x86 and aarch64.
> > > > adjusted
> > > >  -Warray-bounds-61.c due to the false positive warnings.
> > > > 
> > > > Let me know if you have any comments and suggestions.
> > > 
> > > At the last Cauldron I talked with David Malcolm about these kind
> > > of
> > > issues and thought of instead of suppressing diagnostics to
> > > record
> > > how a block was duplicated.  For jump threading my idea was to
> > > record
> > > the condition that was proved true when entering the path and do
> > > this
> > > by recording the corresponding locations
> 
> Is only recording the location for the TRUE path  enough?
> We might need to record the corresponding locations for both TRUE and
> FALSE paths since the VRP might be more accurate on both paths. 
> Is only recording the location is enough? 
> Do we need to record the pointer to the original condition stmt?

Just to be clear: I don't plan to work on this myself (I have far too
much already to work on...); I'm assuming Richard Biener is
hoping/planning to implement this himself.

But feel free to ask me any questions about the diagnostic_path
machinery within the diagnostics subsystem.

[...snip...]

Regards
Dave



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-21 Thread Qing Zhao
Thanks for the comments and suggestions.

> On May 15, 2024, at 10:00, David Malcolm  wrote:
> 
> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>> On Mon, 13 May 2024, Qing Zhao wrote:
>> 
>>> -Warray-bounds is an important option to enable linux kernal to
>>> keep
>>> the array out-of-bound errors out of the source tree.
>>> 
>>> However, due to the false positive warnings reported in PR109071
>>> (-Warray-bounds false positive warnings due to code duplication
>>> from
>>> jump threading), -Warray-bounds=1 cannot be added on by default.
>>> 
>>> Although it's impossible to elinimate all the false positive
>>> warnings
>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>>> documentation says "always out of bounds"), we should minimize the
>>> false positive warnings in -Warray-bounds=1.
>>> 
>>> The root reason for the false positive warnings reported in
>>> PR109071 is:
>>> 
>>> When the thread jump optimization tries to reduce the # of branches
>>> inside the routine, sometimes it needs to duplicate the code and
>>> split into two conditional pathes. for example:
>>> 
>>> The original code:
>>> 
>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>> {
>>>   if (index >= 4)
>>> warn ();
>>>   *ptr = 0;
>>>   *val = sg->vals[index];
>>>   if (index >= 4)
>>> warn ();
>>>   *ptr = *val;
>>> 
>>>   return;
>>> }
>>> 
>>> With the thread jump, the above becomes:
>>> 
>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>> {
>>>   if (index >= 4)
>>> {
>>>   warn ();
>>>   *ptr = 0; // Code duplications since "warn" does
>>> return;
>>>   *val = sg->vals[index];   // same this line.
>>> // In this path, since it's under
>>> the condition
>>> // "index >= 4", the compiler knows
>>> the value
>>> // of "index" is larger then 4,
>>> therefore the
>>> // out-of-bound warning.
>>>   warn ();
>>> }
>>>   else
>>> {
>>>   *ptr = 0;
>>>   *val = sg->vals[index];
>>> }
>>>   *ptr = *val;
>>>   return;
>>> }
>>> 
>>> We can see, after the thread jump optimization, the # of branches
>>> inside
>>> the routine "sparx5_set" is reduced from 2 to 1, however,  due to
>>> the
>>> code duplication (which is needed for the correctness of the code),
>>> we
>>> got a false positive out-of-bound warning.
>>> 
>>> In order to eliminate such false positive out-of-bound warning,
>>> 
>>> A. Add one more flag for GIMPLE: is_splitted.
>>> B. During the thread jump optimization, when the basic blocks are
>>>duplicated, mark all the STMTs inside the original and
>>> duplicated
>>>basic blocks as "is_splitted";
>>> C. Inside the array bound checker, add the following new heuristic:
>>> 
>>> If
>>>1. the stmt is duplicated and splitted into two conditional
>>> paths;
>>> +  2. the warning level < 2;
>>> +  3. the current block is not dominating the exit block
>>> Then not report the warning.
>>> 
>>> The false positive warnings are moved from -Warray-bounds=1 to
>>>  -Warray-bounds=2 now.
>>> 
>>> Bootstrapped and regression tested on both x86 and aarch64.
>>> adjusted
>>>  -Warray-bounds-61.c due to the false positive warnings.
>>> 
>>> Let me know if you have any comments and suggestions.
>> 
>> At the last Cauldron I talked with David Malcolm about these kind of
>> issues and thought of instead of suppressing diagnostics to record
>> how a block was duplicated.  For jump threading my idea was to record
>> the condition that was proved true when entering the path and do this
>> by recording the corresponding locations

Is only recording the location for the TRUE path  enough?
We might need to record the corresponding locations for both TRUE and FALSE 
paths since the VRP might be more accurate on both paths. 
Is only recording the location is enough? 
Do we need to record the pointer to the original condition stmt?


>> so that in the end we can
>> use the diagnostic-path infrastructure to say
>> 
>> warning: array index always above array bounds
>> events 1:
>> 
>>> 3 |  if (index >= 4)
>>  |
>> (1) when index >= 4
>> 
>> it would be possible to record the info as part of the ad-hoc
>> location data on each duplicated stmt or, possibly simpler,
>> as part of a debug stmt of new kind.
>> 
>> I'm not sure pruning the warnings is a good thing to do.  One
>> would argue we should instead isolate such path as unreachable
>> since it invokes undefined behavior.  In particular your
>> example is clearly a bug and should be diagnosed.
>> 
>> Note very similar issues happen when unrolling a loop.
>> 
>> Note all late diagnostics are prone to these kind of issues.
> 
> To recap our chat at Cauldron: any GCC diagnostic can potentially have
> a diagnostic_path associated with it (not just the analyzer).  The
> current mechanism is:
> (a) use a rich_location for the diagnostic, and 
> (b) create 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-15 Thread David Malcolm
On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> On Mon, 13 May 2024, Qing Zhao wrote:
> 
> > -Warray-bounds is an important option to enable linux kernal to
> > keep
> > the array out-of-bound errors out of the source tree.
> > 
> > However, due to the false positive warnings reported in PR109071
> > (-Warray-bounds false positive warnings due to code duplication
> > from
> > jump threading), -Warray-bounds=1 cannot be added on by default.
> > 
> > Although it's impossible to elinimate all the false positive
> > warnings
> > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> > documentation says "always out of bounds"), we should minimize the
> > false positive warnings in -Warray-bounds=1.
> > 
> > The root reason for the false positive warnings reported in
> > PR109071 is:
> > 
> > When the thread jump optimization tries to reduce the # of branches
> > inside the routine, sometimes it needs to duplicate the code and
> > split into two conditional pathes. for example:
> > 
> > The original code:
> > 
> > void sparx5_set (int * ptr, struct nums * sg, int index)
> > {
> >   if (index >= 4)
> >     warn ();
> >   *ptr = 0;
> >   *val = sg->vals[index];
> >   if (index >= 4)
> >     warn ();
> >   *ptr = *val;
> > 
> >   return;
> > }
> > 
> > With the thread jump, the above becomes:
> > 
> > void sparx5_set (int * ptr, struct nums * sg, int index)
> > {
> >   if (index >= 4)
> >     {
> >   warn ();
> >   *ptr = 0; // Code duplications since "warn" does
> > return;
> >   *val = sg->vals[index];   // same this line.
> > // In this path, since it's under
> > the condition
> > // "index >= 4", the compiler knows
> > the value
> > // of "index" is larger then 4,
> > therefore the
> > // out-of-bound warning.
> >   warn ();
> >     }
> >   else
> >     {
> >   *ptr = 0;
> >   *val = sg->vals[index];
> >     }
> >   *ptr = *val;
> >   return;
> > }
> > 
> > We can see, after the thread jump optimization, the # of branches
> > inside
> > the routine "sparx5_set" is reduced from 2 to 1, however,  due to
> > the
> > code duplication (which is needed for the correctness of the code),
> > we
> > got a false positive out-of-bound warning.
> > 
> > In order to eliminate such false positive out-of-bound warning,
> > 
> > A. Add one more flag for GIMPLE: is_splitted.
> > B. During the thread jump optimization, when the basic blocks are
> >    duplicated, mark all the STMTs inside the original and
> > duplicated
> >    basic blocks as "is_splitted";
> > C. Inside the array bound checker, add the following new heuristic:
> > 
> > If
> >    1. the stmt is duplicated and splitted into two conditional
> > paths;
> > +  2. the warning level < 2;
> > +  3. the current block is not dominating the exit block
> > Then not report the warning.
> > 
> > The false positive warnings are moved from -Warray-bounds=1 to
> >  -Warray-bounds=2 now.
> > 
> > Bootstrapped and regression tested on both x86 and aarch64.
> > adjusted
> >  -Warray-bounds-61.c due to the false positive warnings.
> > 
> > Let me know if you have any comments and suggestions.
> 
> At the last Cauldron I talked with David Malcolm about these kind of
> issues and thought of instead of suppressing diagnostics to record
> how a block was duplicated.  For jump threading my idea was to record
> the condition that was proved true when entering the path and do this
> by recording the corresponding locations so that in the end we can
> use the diagnostic-path infrastructure to say
> 
> warning: array index always above array bounds
> events 1:
> 
> > 3 |  if (index >= 4)
>  |
>     (1) when index >= 4
> 
> it would be possible to record the info as part of the ad-hoc
> location data on each duplicated stmt or, possibly simpler,
> as part of a debug stmt of new kind.
> 
> I'm not sure pruning the warnings is a good thing to do.  One
> would argue we should instead isolate such path as unreachable
> since it invokes undefined behavior.  In particular your
> example is clearly a bug and should be diagnosed.
> 
> Note very similar issues happen when unrolling a loop.
> 
> Note all late diagnostics are prone to these kind of issues.

To recap our chat at Cauldron: any GCC diagnostic can potentially have
a diagnostic_path associated with it (not just the analyzer).  The
current mechanism is:
(a) use a rich_location for the diagnostic, and 
(b) create an instance of some diagnostic_path subclass (e.g.
simple_diagnostic_path, or something else), and 
(c) call  richloc.set_path ();  to associate the path with the
rich_location

See gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_paths.c for an
example of using simple_diagnostic_path that doesn't use the analyzer.


If we want *every* late diagnostic to potentially have a path, it
sounds like we might want some extra infrastructure (perhaps 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-15 Thread Qing Zhao


> On May 15, 2024, at 02:09, Richard Biener  wrote:
> 
> On Tue, 14 May 2024, Qing Zhao wrote:
> 
>> 
>> 
>>> On May 14, 2024, at 13:14, Richard Biener  wrote:
>>> 
>>> On Tue, 14 May 2024, Qing Zhao wrote:
>>> 
 
 
> On May 14, 2024, at 10:29, Richard Biener  wrote:
> 
>>> [...]
> It would of course
> need experimenting since we can end up moving stmts and merging blocks
> though the linear traces created by jump threading should be quite
> stable (as opposed to say the unrolling case where multiple instances
> of the loop body likely will end up in the exact same basic block).
 
 Do you mean, for loop unrolling the approach with one extra stmt for one 
 basic block might be even harder and unreliable?
>>> 
>>> The question is whether the stmt marks the whole block or whether we
>>> for example add both a START and END stmt covering a copied path.
>>> I would guess for unrolling we need definitely need to do the latter
>>> (so we can diagnose "on the 3rd iteration of an unrolled loop" or
>>> similar).
>> 
>> Okay. I see. 
>> 
>> Is it possible that the START and END stmts might be moved around and 
>> out-of-place by the different optimizations?
> 
> There is nothign preventing stmts to be moved across START or END.
Then we have to add some artificial data dependency or memory barrier at START 
and END to prevent such transformation. However, this might also prevent some 
useful transformation, therefore impact the performance…
Not sure whether this is a good approach…

Yes, some experiments might need to be done to compare the cost of these 
different approaches.

Qing
> 
> Richard.




Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-15 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote:

> 
> 
> > On May 14, 2024, at 13:14, Richard Biener  wrote:
> > 
> > On Tue, 14 May 2024, Qing Zhao wrote:
> > 
> >> 
> >> 
> >>> On May 14, 2024, at 10:29, Richard Biener  wrote:
> >>> 
> > [...]
> >>> It would of course
> >>> need experimenting since we can end up moving stmts and merging blocks
> >>> though the linear traces created by jump threading should be quite
> >>> stable (as opposed to say the unrolling case where multiple instances
> >>> of the loop body likely will end up in the exact same basic block).
> >> 
> >> Do you mean, for loop unrolling the approach with one extra stmt for one 
> >> basic block might be even harder and unreliable?
> > 
> > The question is whether the stmt marks the whole block or whether we
> > for example add both a START and END stmt covering a copied path.
> > I would guess for unrolling we need definitely need to do the latter
> > (so we can diagnose "on the 3rd iteration of an unrolled loop" or
> > similar).
> 
> Okay. I see. 
> 
> Is it possible that the START and END stmts might be moved around and 
> out-of-place by the different optimizations?

There is nothign preventing stmts to be moved across START or END.

Richard.


Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Kees Cook wrote:

> On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote:
> > The current major issue with the warning is:  the constant index value 4
> > is not in the source code, it’s a compiler generated intermediate value
> > (even though it’s a correct value -:)). Such warning messages confuse
> > the end-users with information that cannot be connected directly to the
> > source code.
> 
> Right -- this "4" comes from -fsanitize=array-bounds (in "warn but
> continue" mode).
> 
> Now, the minimized PoC shows a situation that triggers the situation, but
> I think it's worth looking at the original code that caused this false
> positive:
> 
>   for (i = 0; i < sg->num_entries; i++) {
>   gce = >gce[i];
> 
> 
> The issue here is that sg->num_entries has already been bounds-checked
> in a separate function. As a result, the value range tracking for "i"
> here is unbounded.
> 
> Enabling -fsanitize=array-bounds means the sg->gce[i] access gets
> instrumented, and suddenly "i" gains an implicit range, induced by the
> sanitizer.
> 
> (I would point out that this is very similar to the problems we've had
> with -fsanitize=shift[1][2]: the sanitizer induces a belief about a
> given variable's range this isn't true.)
> 
> Now, there is an argument to be made that the original code should be
> doing:
> 
>   for (i = 0; i < 4 && i < sg->num_entries; i++) {
> 
> But this is:
> 
> a) logically redundant (Linux maintainers don't tend to like duplicating
>their range checking)
> 
> b) a very simple case
> 
> The point of the sanitizers is to catch "impossible" situations at
> run-time for the cases where some value may end up out of range. Having
> it _induce_ a range on the resulting code makes no sense.
> 
> Could we, perhaps, have sanitizer code not influence the value range
> tracking? That continues to look like the root cause for these things.

The sanitizer code adds checks that are not distinguishable from
user code exactly because we want value-range analysis to eventually
elide even (redundant) sanitizer checks.

I think the fix for the source when there's a-priori knowledge
of sg->num_entries is to assert that knowledge through language
features or when using C through GNU extensions like assert()
using __builtin_unreachable ().  That also serves documentation
purposes "this code expects sg->num_entries to be bounds-checked".

To me it doesn't make much sense to mix sanitizing of array
accesses and at the same time do -Warray-bound diagnostics.

Note I tried to teach jump threading to be less aggressive
threading paths to exceptional situations (I think the
sanitizer runtime calls are at least marked unlikely), but
the comment was that even those are very much desired but
I can't remember the details.  This was as part of PR111515
but I think I've run into this earlier when trying to
improve back_threader_profitability::possibly_profitable_path_p.
There we have

  /* Threading is profitable if the path duplicated is hot but also
 in a case we separate cold path from hot path and permit optimization
 of the hot path later.  Be on the agressive side here. In some testcases,
 as in PR 78407 this leads to noticeable improvements.  */

here we have

  if (A)
unlikely();
  B;
  if (A)
unlikely ();

and we choose to perform that path separation which optimizes
the not exceptional path which automatically separates the
exceptional path as well.

IMO that sanitizer mode that continues running is bad - it makes
the compiler aware of undefined behavior and make the code run
into it with open eyes.  You get what you asked for.

Richard.

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Kees Cook
On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote:
> The current major issue with the warning is:  the constant index value 4
> is not in the source code, it’s a compiler generated intermediate value
> (even though it’s a correct value -:)). Such warning messages confuse
> the end-users with information that cannot be connected directly to the
> source code.

Right -- this "4" comes from -fsanitize=array-bounds (in "warn but
continue" mode).

Now, the minimized PoC shows a situation that triggers the situation, but
I think it's worth looking at the original code that caused this false
positive:

for (i = 0; i < sg->num_entries; i++) {
gce = >gce[i];


The issue here is that sg->num_entries has already been bounds-checked
in a separate function. As a result, the value range tracking for "i"
here is unbounded.

Enabling -fsanitize=array-bounds means the sg->gce[i] access gets
instrumented, and suddenly "i" gains an implicit range, induced by the
sanitizer.

(I would point out that this is very similar to the problems we've had
with -fsanitize=shift[1][2]: the sanitizer induces a belief about a
given variable's range this isn't true.)

Now, there is an argument to be made that the original code should be
doing:

for (i = 0; i < 4 && i < sg->num_entries; i++) {

But this is:

a) logically redundant (Linux maintainers don't tend to like duplicating
   their range checking)

b) a very simple case

The point of the sanitizers is to catch "impossible" situations at
run-time for the cases where some value may end up out of range. Having
it _induce_ a range on the resulting code makes no sense.

Could we, perhaps, have sanitizer code not influence the value range
tracking? That continues to look like the root cause for these things.

-Kees

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108306

-- 
Kees Cook


Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao



> On May 14, 2024, at 13:14, Richard Biener  wrote:
> 
> On Tue, 14 May 2024, Qing Zhao wrote:
> 
>> 
>> 
>>> On May 14, 2024, at 10:29, Richard Biener  wrote:
>>> 
> [...]
>>> It would of course
>>> need experimenting since we can end up moving stmts and merging blocks
>>> though the linear traces created by jump threading should be quite
>>> stable (as opposed to say the unrolling case where multiple instances
>>> of the loop body likely will end up in the exact same basic block).
>> 
>> Do you mean, for loop unrolling the approach with one extra stmt for one 
>> basic block might be even harder and unreliable?
> 
> The question is whether the stmt marks the whole block or whether we
> for example add both a START and END stmt covering a copied path.
> I would guess for unrolling we need definitely need to do the latter
> (so we can diagnose "on the 3rd iteration of an unrolled loop" or
> similar).

Okay. I see. 

Is it possible that the START and END stmts might be moved around and 
out-of-place by the different optimizations?

Qing
> 
> Richard.
> 



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote:

> 
> 
> > On May 14, 2024, at 10:29, Richard Biener  wrote:
> > 
[...]
> >  It would of course
> > need experimenting since we can end up moving stmts and merging blocks
> > though the linear traces created by jump threading should be quite
> > stable (as opposed to say the unrolling case where multiple instances
> > of the loop body likely will end up in the exact same basic block).
> 
> Do you mean, for loop unrolling the approach with one extra stmt for one 
> basic block might be even harder and unreliable?

The question is whether the stmt marks the whole block or whether we
for example add both a START and END stmt covering a copied path.
I would guess for unrolling we need definitely need to do the latter
(so we can diagnose "on the 3rd iteration of an unrolled loop" or
similar).

Richard.



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Kees Cook
On Mon, May 13, 2024 at 07:48:30PM +, Qing Zhao wrote:
> The false positive warnings are moved from -Warray-bounds=1 to
>  -Warray-bounds=2 now.

On a Linux kernel x86_64 allmodconfig build, this takes the -Warray-bounds
warnings from 14 to 9. After examining these 9, I see:

- 4: legitimate bugs in Linux source (3 distinct, 1 repeat). I'll be
  reporting/fixing these in Linux.
- 4: 4 instances of 1 case of buggy interaction with -fsanitize=shift,
 looking similar to these past bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108306
  the difference being operating on an enum. I will reduce the case
  and open a bug report for it.
- 1: still examining. It looks like a false positive so far.

Thanks!

-Kees

-- 
Kees Cook


Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao


> On May 14, 2024, at 11:08, Jeff Law  wrote:
> 
> 
> 
> On 5/14/24 8:57 AM, Qing Zhao wrote:
>>> On May 13, 2024, at 20:14, Kees Cook  wrote:
>>> 
>>> On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote:
 On Mon, May 13, 2024, 11:41 PM Kees Cook  wrote:
> But it makes no sense to warn about:
> 
> void sparx5_set (int * ptr, struct nums * sg, int index)
> {
>   if (index >= 4)
> warn ();
>   *ptr = 0;
>   *val = sg->vals[index];
>   if (index >= 4)
> warn ();
>   *ptr = *val;
> }
> 
> Because at "*val = sg->vals[index];" the actual value range tracking for
> index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the
> "if" statements is the range tracking [4,INT_MAX].)
> 
> However, in the case where jump threading has split the execution flow
> and produced a copy of "*val = sg->vals[index];" where the value range
> tracking for "index" is now [4,INT_MAX], is the warning valid. But it
> is only for that instance. Reporting it for effectively both (there is
> only 1 source line for the array indexing) is misleading because there
> is nothing the user can do about it -- the compiler created the copy and
> then noticed it had a range it could apply to that array index.
> 
 
 "there is nothing the user can do about it" is very much false. They could
 change warn call into a noreturn function call instead.  (In the case of
 the Linux kernel panic). There are things the user can do to fix the
 warning and even get better code generation out of the compilers.
>>> 
>>> This isn't about warn() not being noreturn. The warn() could be any
>>> function call; the jump threading still happens.
>> When the program is executed on the “if (index > = 4)” path,  the value of 
>> “index” is definitely
>>> = 4, when sg->vals[index] is referenced on this path (the case when the 
>>> routine “warn” is NOT noreturn), it’s
>> definitely an out-of-bounds array access.  So, the compiler’s warning is 
>> correct. And this warning does catch
>> a potential issue in the source code that need to be fixed by either of the 
>> following two solutions:
>>1. Make the routine “warn” as noreturn and mark it noreturn;
> This would be my recommendation.  We're about to execute undefined behavior.  
> I don't see a way to necessarily recover safely here, so I'd suggest having 
> warn() not return and mark it appropriately.
> 
> That'll have numerous secondary benefits as well.

Yes, agreed.  Such source code change should be a nice security improvement to 
the linux kernel source code. 

Qing
> 
> jeff




Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao


> On May 14, 2024, at 10:29, Richard Biener  wrote:
> 
> On Tue, 14 May 2024, Qing Zhao wrote:
> 
>> 
>> 
>>> On May 14, 2024, at 09:08, Richard Biener  wrote:
>>> 
>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>> 
 -Warray-bounds is an important option to enable linux kernal to keep
 the array out-of-bound errors out of the source tree.
 
 However, due to the false positive warnings reported in PR109071
 (-Warray-bounds false positive warnings due to code duplication from
 jump threading), -Warray-bounds=1 cannot be added on by default.
 
 Although it's impossible to elinimate all the false positive warnings
 from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
 documentation says "always out of bounds"), we should minimize the
 false positive warnings in -Warray-bounds=1.
 
 The root reason for the false positive warnings reported in PR109071 is:
 
 When the thread jump optimization tries to reduce the # of branches
 inside the routine, sometimes it needs to duplicate the code and
 split into two conditional pathes. for example:
 
 The original code:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   warn ();
 *ptr = 0;
 *val = sg->vals[index];
 if (index >= 4)
   warn ();
 *ptr = *val;
 
 return;
 }
 
 With the thread jump, the above becomes:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   {
 warn ();
 *ptr = 0; // Code duplications since "warn" does return;
 *val = sg->vals[index]; // same this line.
 // In this path, since it's under the condition
 // "index >= 4", the compiler knows the value
 // of "index" is larger then 4, therefore the
 // out-of-bound warning.
 warn ();
   }
 else
   {
 *ptr = 0;
 *val = sg->vals[index];
   }
 *ptr = *val;
 return;
 }
 
 We can see, after the thread jump optimization, the # of branches inside
 the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
 code duplication (which is needed for the correctness of the code), we
 got a false positive out-of-bound warning.
 
 In order to eliminate such false positive out-of-bound warning,
 
 A. Add one more flag for GIMPLE: is_splitted.
 B. During the thread jump optimization, when the basic blocks are
  duplicated, mark all the STMTs inside the original and duplicated
  basic blocks as "is_splitted";
 C. Inside the array bound checker, add the following new heuristic:
 
 If
  1. the stmt is duplicated and splitted into two conditional paths;
 +  2. the warning level < 2;
 +  3. the current block is not dominating the exit block
 Then not report the warning.
 
 The false positive warnings are moved from -Warray-bounds=1 to
 -Warray-bounds=2 now.
 
 Bootstrapped and regression tested on both x86 and aarch64. adjusted
 -Warray-bounds-61.c due to the false positive warnings.
 
 Let me know if you have any comments and suggestions.
>>> 
>>> At the last Cauldron I talked with David Malcolm about these kind of
>>> issues and thought of instead of suppressing diagnostics to record
>>> how a block was duplicated.  For jump threading my idea was to record
>>> the condition that was proved true when entering the path and do this
>>> by recording the corresponding locations so that in the end we can
>>> use the diagnostic-path infrastructure to say
>>> 
>>> warning: array index always above array bounds
>>> events 1:
>>> 
>>> | 3 |  if (index >= 4)
>>>|
>>>   (1) when index >= 4
> 
> As it's been quite some time I think I remeber that I thought of
> constructing the diagnostic path at jump threading time and associating
> that with the location.  But I don't remember exactly where I wanted to
> put it - I think it was on an extra stmt to avoid having too many
> ad-hoc locations as I'm not sure of their cost.

Yes, an extra stmt for each basic block might  be less memory cost than an 
extra location
info for each stmt.  

The major concern with the extra stmt for each basic block is how to  keep this 
special stmt in place
after stmts moving and basic block merging as you mentioned below.  During my 
debugging of bug
109071, I noticed that the duplicated basic blocks are merged with old blocks 
to form new blocks. And
the original block is deleted.  I expected that the implementation with the 
extra stmt for each basic block
might be much harder and error prone.

>  It would of course
> need experimenting since we can end up moving stmts and merging blocks
> though the linear traces created by jump threading should be quite
> stable (as opposed to say the unrolling case where multiple instances
> of the loop body likely will end up in the exact same basic block).

Do you mean, for loop 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Jeff Law




On 5/14/24 8:57 AM, Qing Zhao wrote:




On May 13, 2024, at 20:14, Kees Cook  wrote:

On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote:

On Mon, May 13, 2024, 11:41 PM Kees Cook  wrote:

But it makes no sense to warn about:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
   if (index >= 4)
 warn ();
   *ptr = 0;
   *val = sg->vals[index];
   if (index >= 4)
 warn ();
   *ptr = *val;
}

Because at "*val = sg->vals[index];" the actual value range tracking for
index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the
"if" statements is the range tracking [4,INT_MAX].)

However, in the case where jump threading has split the execution flow
and produced a copy of "*val = sg->vals[index];" where the value range
tracking for "index" is now [4,INT_MAX], is the warning valid. But it
is only for that instance. Reporting it for effectively both (there is
only 1 source line for the array indexing) is misleading because there
is nothing the user can do about it -- the compiler created the copy and
then noticed it had a range it could apply to that array index.



"there is nothing the user can do about it" is very much false. They could
change warn call into a noreturn function call instead.  (In the case of
the Linux kernel panic). There are things the user can do to fix the
warning and even get better code generation out of the compilers.


This isn't about warn() not being noreturn. The warn() could be any
function call; the jump threading still happens.


When the program is executed on the “if (index > = 4)” path,  the value of 
“index” is definitely

= 4, when sg->vals[index] is referenced on this path (the case when the routine 
“warn” is NOT noreturn), it’s

definitely an out-of-bounds array access.  So, the compiler’s warning is 
correct. And this warning does catch
a potential issue in the source code that need to be fixed by either of the 
following two solutions:

1. Make the routine “warn” as noreturn and mark it noreturn;
This would be my recommendation.  We're about to execute undefined 
behavior.  I don't see a way to necessarily recover safely here, so I'd 
suggest having warn() not return and mark it appropriately.


That'll have numerous secondary benefits as well.

jeff



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao


> On May 13, 2024, at 20:14, Kees Cook  wrote:
> 
> On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote:
>> On Mon, May 13, 2024, 11:41 PM Kees Cook  wrote:
>>> But it makes no sense to warn about:
>>> 
>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>> {
>>>   if (index >= 4)
>>> warn ();
>>>   *ptr = 0;
>>>   *val = sg->vals[index];
>>>   if (index >= 4)
>>> warn ();
>>>   *ptr = *val;
>>> }
>>> 
>>> Because at "*val = sg->vals[index];" the actual value range tracking for
>>> index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the
>>> "if" statements is the range tracking [4,INT_MAX].)
>>> 
>>> However, in the case where jump threading has split the execution flow
>>> and produced a copy of "*val = sg->vals[index];" where the value range
>>> tracking for "index" is now [4,INT_MAX], is the warning valid. But it
>>> is only for that instance. Reporting it for effectively both (there is
>>> only 1 source line for the array indexing) is misleading because there
>>> is nothing the user can do about it -- the compiler created the copy and
>>> then noticed it had a range it could apply to that array index.
>>> 
>> 
>> "there is nothing the user can do about it" is very much false. They could
>> change warn call into a noreturn function call instead.  (In the case of
>> the Linux kernel panic). There are things the user can do to fix the
>> warning and even get better code generation out of the compilers.
> 
> This isn't about warn() not being noreturn. The warn() could be any
> function call; the jump threading still happens.

When the program is executed on the “if (index > = 4)” path,  the value of 
“index” is definitely
>= 4, when sg->vals[index] is referenced on this path (the case when the 
>routine “warn” is NOT noreturn), it’s
definitely an out-of-bounds array access.  So, the compiler’s warning is 
correct. And this warning does catch 
a potential issue in the source code that need to be fixed by either of the 
following two solutions:

   1. Make the routine “warn” as noreturn and mark it noreturn;
Or
   2. On the path “if (index >= 4)”, make the value of “index” in the bound of 
the array. 

With either of the above source code changes, we can fix this potential 
out-of-bound array access bug in the source code.

Qing
> 
> GCC is warning about a compiler-constructed situation that cannot be
> reliably fixed on the source side (GCC emitting the warning is highly
> unstable in these cases), since the condition is not *always* true for
> the given line of code. If it is not useful to warn for "array[index]"
> being out of range when "index" is always [INT_MIN,INT_MAX], then it
> is not useful to warn when "index" MAY be [INT_MIN,INT_MAX] for a given
> line of code.
> 
> -Kees
> 
> -- 
> Kees Cook



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote:

> 
> 
> > On May 14, 2024, at 09:08, Richard Biener  wrote:
> > 
> > On Mon, 13 May 2024, Qing Zhao wrote:
> > 
> >> -Warray-bounds is an important option to enable linux kernal to keep
> >> the array out-of-bound errors out of the source tree.
> >> 
> >> However, due to the false positive warnings reported in PR109071
> >> (-Warray-bounds false positive warnings due to code duplication from
> >> jump threading), -Warray-bounds=1 cannot be added on by default.
> >> 
> >> Although it's impossible to elinimate all the false positive warnings
> >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> >> documentation says "always out of bounds"), we should minimize the
> >> false positive warnings in -Warray-bounds=1.
> >> 
> >> The root reason for the false positive warnings reported in PR109071 is:
> >> 
> >> When the thread jump optimization tries to reduce the # of branches
> >> inside the routine, sometimes it needs to duplicate the code and
> >> split into two conditional pathes. for example:
> >> 
> >> The original code:
> >> 
> >> void sparx5_set (int * ptr, struct nums * sg, int index)
> >> {
> >>  if (index >= 4)
> >>warn ();
> >>  *ptr = 0;
> >>  *val = sg->vals[index];
> >>  if (index >= 4)
> >>warn ();
> >>  *ptr = *val;
> >> 
> >>  return;
> >> }
> >> 
> >> With the thread jump, the above becomes:
> >> 
> >> void sparx5_set (int * ptr, struct nums * sg, int index)
> >> {
> >>  if (index >= 4)
> >>{
> >>  warn ();
> >>  *ptr = 0; // Code duplications since "warn" does return;
> >>  *val = sg->vals[index]; // same this line.
> >> // In this path, since it's under the condition
> >> // "index >= 4", the compiler knows the value
> >> // of "index" is larger then 4, therefore the
> >> // out-of-bound warning.
> >>  warn ();
> >>}
> >>  else
> >>{
> >>  *ptr = 0;
> >>  *val = sg->vals[index];
> >>}
> >>  *ptr = *val;
> >>  return;
> >> }
> >> 
> >> We can see, after the thread jump optimization, the # of branches inside
> >> the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
> >> code duplication (which is needed for the correctness of the code), we
> >> got a false positive out-of-bound warning.
> >> 
> >> In order to eliminate such false positive out-of-bound warning,
> >> 
> >> A. Add one more flag for GIMPLE: is_splitted.
> >> B. During the thread jump optimization, when the basic blocks are
> >>   duplicated, mark all the STMTs inside the original and duplicated
> >>   basic blocks as "is_splitted";
> >> C. Inside the array bound checker, add the following new heuristic:
> >> 
> >> If
> >>   1. the stmt is duplicated and splitted into two conditional paths;
> >> +  2. the warning level < 2;
> >> +  3. the current block is not dominating the exit block
> >> Then not report the warning.
> >> 
> >> The false positive warnings are moved from -Warray-bounds=1 to
> >> -Warray-bounds=2 now.
> >> 
> >> Bootstrapped and regression tested on both x86 and aarch64. adjusted
> >> -Warray-bounds-61.c due to the false positive warnings.
> >> 
> >> Let me know if you have any comments and suggestions.
> > 
> > At the last Cauldron I talked with David Malcolm about these kind of
> > issues and thought of instead of suppressing diagnostics to record
> > how a block was duplicated.  For jump threading my idea was to record
> > the condition that was proved true when entering the path and do this
> > by recording the corresponding locations so that in the end we can
> > use the diagnostic-path infrastructure to say
> > 
> > warning: array index always above array bounds
> > events 1:
> > 
> > | 3 |  if (index >= 4)
> > |
> >(1) when index >= 4

As it's been quite some time I think I remeber that I thought of
constructing the diagnostic path at jump threading time and associating
that with the location.  But I don't remember exactly where I wanted to
put it - I think it was on an extra stmt to avoid having too many
ad-hoc locations as I'm not sure of their cost.  It would of course
need experimenting since we can end up moving stmts and merging blocks
though the linear traces created by jump threading should be quite
stable (as opposed to say the unrolling case where multiple instances
of the loop body likely will end up in the exact same basic block).

> Yes, this is a good idea. 
> 
> The current major issue with the warning is:  the constant index value 4 is 
> not in the source code, it’s a compiler generated intermediate value (even 
> though it’s a correct value -:)). Such warning messages confuse the end-users 
> with information that cannot be connected directly to the source code. 
> 
> With the above recorded “events” information, the warning messages should 
> make good sense to the end user, and also help the end user to locate the 
> place where the fix in the source code can be added. 
> 
> Actually, with the above warning information, the user can locate the place 
> 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao


> On May 14, 2024, at 09:08, Richard Biener  wrote:
> 
> On Mon, 13 May 2024, Qing Zhao wrote:
> 
>> -Warray-bounds is an important option to enable linux kernal to keep
>> the array out-of-bound errors out of the source tree.
>> 
>> However, due to the false positive warnings reported in PR109071
>> (-Warray-bounds false positive warnings due to code duplication from
>> jump threading), -Warray-bounds=1 cannot be added on by default.
>> 
>> Although it's impossible to elinimate all the false positive warnings
>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>> documentation says "always out of bounds"), we should minimize the
>> false positive warnings in -Warray-bounds=1.
>> 
>> The root reason for the false positive warnings reported in PR109071 is:
>> 
>> When the thread jump optimization tries to reduce the # of branches
>> inside the routine, sometimes it needs to duplicate the code and
>> split into two conditional pathes. for example:
>> 
>> The original code:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int index)
>> {
>>  if (index >= 4)
>>warn ();
>>  *ptr = 0;
>>  *val = sg->vals[index];
>>  if (index >= 4)
>>warn ();
>>  *ptr = *val;
>> 
>>  return;
>> }
>> 
>> With the thread jump, the above becomes:
>> 
>> void sparx5_set (int * ptr, struct nums * sg, int index)
>> {
>>  if (index >= 4)
>>{
>>  warn ();
>>  *ptr = 0; // Code duplications since "warn" does return;
>>  *val = sg->vals[index]; // same this line.
>> // In this path, since it's under the condition
>> // "index >= 4", the compiler knows the value
>> // of "index" is larger then 4, therefore the
>> // out-of-bound warning.
>>  warn ();
>>}
>>  else
>>{
>>  *ptr = 0;
>>  *val = sg->vals[index];
>>}
>>  *ptr = *val;
>>  return;
>> }
>> 
>> We can see, after the thread jump optimization, the # of branches inside
>> the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
>> code duplication (which is needed for the correctness of the code), we
>> got a false positive out-of-bound warning.
>> 
>> In order to eliminate such false positive out-of-bound warning,
>> 
>> A. Add one more flag for GIMPLE: is_splitted.
>> B. During the thread jump optimization, when the basic blocks are
>>   duplicated, mark all the STMTs inside the original and duplicated
>>   basic blocks as "is_splitted";
>> C. Inside the array bound checker, add the following new heuristic:
>> 
>> If
>>   1. the stmt is duplicated and splitted into two conditional paths;
>> +  2. the warning level < 2;
>> +  3. the current block is not dominating the exit block
>> Then not report the warning.
>> 
>> The false positive warnings are moved from -Warray-bounds=1 to
>> -Warray-bounds=2 now.
>> 
>> Bootstrapped and regression tested on both x86 and aarch64. adjusted
>> -Warray-bounds-61.c due to the false positive warnings.
>> 
>> Let me know if you have any comments and suggestions.
> 
> At the last Cauldron I talked with David Malcolm about these kind of
> issues and thought of instead of suppressing diagnostics to record
> how a block was duplicated.  For jump threading my idea was to record
> the condition that was proved true when entering the path and do this
> by recording the corresponding locations so that in the end we can
> use the diagnostic-path infrastructure to say
> 
> warning: array index always above array bounds
> events 1:
> 
> | 3 |  if (index >= 4)
> |
>(1) when index >= 4

Yes, this is a good idea. 

The current major issue with the warning is:  the constant index value 4 is not 
in the source code, it’s a compiler generated intermediate value (even though 
it’s a correct value -:)). Such warning messages confuse the end-users with 
information that cannot be connected directly to the source code. 

With the above recorded “events” information, the warning messages should make 
good sense to the end user, and also help the end user to locate the place 
where the fix in the source code can be added. 

Actually, with the above warning information, the user can locate the place 
“line 3” to add fixes as following:

if (*index >= 4)
  {
warn();
*index = 3;
  }

I.e.
[109071]$ diff t_org.c t.c
2c2
< static inline void assign(int val, int *regs, int index)
---
> static inline void assign(int val, int *regs, int *index)
4c4,5
< if (index >= 4)
---
> if (*index >= 4)
>   {
5a7,8
> *index = 3;
>   }
14,15c17,18
< assign(0,ptr, index);
< assign(*val, ptr, index);
---
> assign(0,ptr, );
> assign(*val, ptr, );

> 
> it would be possible to record the info as part of the ad-hoc
> location data on each duplicated stmt or, possibly simpler,
> as part of a debug stmt of new kind.

Recording such info to each stmt might be more reliable? 

> 
> I'm not sure pruning the warnings is a good thing to do.  One
> would argue we should instead isolate such path as unreachable
> since it invokes undefined behavior.  

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Mon, 13 May 2024, Qing Zhao wrote:

> -Warray-bounds is an important option to enable linux kernal to keep
> the array out-of-bound errors out of the source tree.
> 
> However, due to the false positive warnings reported in PR109071
> (-Warray-bounds false positive warnings due to code duplication from
> jump threading), -Warray-bounds=1 cannot be added on by default.
> 
> Although it's impossible to elinimate all the false positive warnings
> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> documentation says "always out of bounds"), we should minimize the
> false positive warnings in -Warray-bounds=1.
> 
> The root reason for the false positive warnings reported in PR109071 is:
> 
> When the thread jump optimization tries to reduce the # of branches
> inside the routine, sometimes it needs to duplicate the code and
> split into two conditional pathes. for example:
> 
> The original code:
> 
> void sparx5_set (int * ptr, struct nums * sg, int index)
> {
>   if (index >= 4)
> warn ();
>   *ptr = 0;
>   *val = sg->vals[index];
>   if (index >= 4)
> warn ();
>   *ptr = *val;
> 
>   return;
> }
> 
> With the thread jump, the above becomes:
> 
> void sparx5_set (int * ptr, struct nums * sg, int index)
> {
>   if (index >= 4)
> {
>   warn ();
>   *ptr = 0;   // Code duplications since "warn" does return;
>   *val = sg->vals[index]; // same this line.
>   // In this path, since it's under the condition
>   // "index >= 4", the compiler knows the value
>   // of "index" is larger then 4, therefore the
>   // out-of-bound warning.
>   warn ();
> }
>   else
> {
>   *ptr = 0;
>   *val = sg->vals[index];
> }
>   *ptr = *val;
>   return;
> }
> 
> We can see, after the thread jump optimization, the # of branches inside
> the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
> code duplication (which is needed for the correctness of the code), we
> got a false positive out-of-bound warning.
> 
> In order to eliminate such false positive out-of-bound warning,
> 
> A. Add one more flag for GIMPLE: is_splitted.
> B. During the thread jump optimization, when the basic blocks are
>duplicated, mark all the STMTs inside the original and duplicated
>basic blocks as "is_splitted";
> C. Inside the array bound checker, add the following new heuristic:
> 
> If
>1. the stmt is duplicated and splitted into two conditional paths;
> +  2. the warning level < 2;
> +  3. the current block is not dominating the exit block
> Then not report the warning.
> 
> The false positive warnings are moved from -Warray-bounds=1 to
>  -Warray-bounds=2 now.
> 
> Bootstrapped and regression tested on both x86 and aarch64. adjusted
>  -Warray-bounds-61.c due to the false positive warnings.
> 
> Let me know if you have any comments and suggestions.

At the last Cauldron I talked with David Malcolm about these kind of
issues and thought of instead of suppressing diagnostics to record
how a block was duplicated.  For jump threading my idea was to record
the condition that was proved true when entering the path and do this
by recording the corresponding locations so that in the end we can
use the diagnostic-path infrastructure to say

warning: array index always above array bounds
events 1:

| 3 |  if (index >= 4)
 |
(1) when index >= 4

it would be possible to record the info as part of the ad-hoc
location data on each duplicated stmt or, possibly simpler,
as part of a debug stmt of new kind.

I'm not sure pruning the warnings is a good thing to do.  One
would argue we should instead isolate such path as unreachable
since it invokes undefined behavior.  In particular your
example is clearly a bug and should be diagnosed.

Note very similar issues happen when unrolling a loop.

Note all late diagnostics are prone to these kind of issues.

Richard.

> Thanks.
> 
> Qing
> 
> 
>   PR tree optimization/109071
> 
> gcc/ChangeLog:
> 
>   * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two new
>   arguments for the new heuristic to not issue warnings.
>   (array_bounds_checker::check_array_ref): Call the new prototype of the
>   routine check_out_of_bounds_and_warn.
>   (array_bounds_checker::check_mem_ref): Add one new argument for the
>   new heuristic to not issue warnings.
>   (array_bounds_checker::check_addr_expr): Call the new prototype of the
>   routine check_mem_ref, add new heuristic for not issue warnings.
>   (array_bounds_checker::check_array_bounds): Call the new prototype of
>   the routine check_mem_ref.
>   * gimple-array-bounds.h: New prototype of check_mem_ref.
>   * gimple.h (struct GTY): Add one new flag is_splitted for gimple.
>   (gimple_is_splitted_p): New function.
>   (gimple_set_is_splitted): New function.
>   * 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Kees Cook
On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote:
> On Mon, May 13, 2024, 11:41 PM Kees Cook  wrote:
> > But it makes no sense to warn about:
> >
> > void sparx5_set (int * ptr, struct nums * sg, int index)
> > {
> >if (index >= 4)
> >  warn ();
> >*ptr = 0;
> >*val = sg->vals[index];
> >if (index >= 4)
> >  warn ();
> >*ptr = *val;
> > }
> >
> > Because at "*val = sg->vals[index];" the actual value range tracking for
> > index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the
> > "if" statements is the range tracking [4,INT_MAX].)
> >
> > However, in the case where jump threading has split the execution flow
> > and produced a copy of "*val = sg->vals[index];" where the value range
> > tracking for "index" is now [4,INT_MAX], is the warning valid. But it
> > is only for that instance. Reporting it for effectively both (there is
> > only 1 source line for the array indexing) is misleading because there
> > is nothing the user can do about it -- the compiler created the copy and
> > then noticed it had a range it could apply to that array index.
> >
> 
> "there is nothing the user can do about it" is very much false. They could
> change warn call into a noreturn function call instead.  (In the case of
> the Linux kernel panic). There are things the user can do to fix the
> warning and even get better code generation out of the compilers.

This isn't about warn() not being noreturn. The warn() could be any
function call; the jump threading still happens.

GCC is warning about a compiler-constructed situation that cannot be
reliably fixed on the source side (GCC emitting the warning is highly
unstable in these cases), since the condition is not *always* true for
the given line of code. If it is not useful to warn for "array[index]"
being out of range when "index" is always [INT_MIN,INT_MAX], then it
is not useful to warn when "index" MAY be [INT_MIN,INT_MAX] for a given
line of code.

-Kees

-- 
Kees Cook


Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Andrew Pinski
On Mon, May 13, 2024, 11:41 PM Kees Cook  wrote:

> On Mon, May 13, 2024 at 02:46:32PM -0600, Jeff Law wrote:
> >
> >
> > On 5/13/24 1:48 PM, Qing Zhao wrote:
> > > -Warray-bounds is an important option to enable linux kernal to keep
> > > the array out-of-bound errors out of the source tree.
> > >
> > > However, due to the false positive warnings reported in PR109071
> > > (-Warray-bounds false positive warnings due to code duplication from
> > > jump threading), -Warray-bounds=1 cannot be added on by default.
> > >
> > > Although it's impossible to elinimate all the false positive warnings
> > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> > > documentation says "always out of bounds"), we should minimize the
> > > false positive warnings in -Warray-bounds=1.
> > >
> > > The root reason for the false positive warnings reported in PR109071
> is:
> > >
> > > When the thread jump optimization tries to reduce the # of branches
> > > inside the routine, sometimes it needs to duplicate the code and
> > > split into two conditional pathes. for example:
> > >
> > > The original code:
> > >
> > > void sparx5_set (int * ptr, struct nums * sg, int index)
> > > {
> > >if (index >= 4)
> > >  warn ();
> > >*ptr = 0;
> > >*val = sg->vals[index];
> > >if (index >= 4)
> > >  warn ();
> > >*ptr = *val;
> > >
> > >return;
> > > }
> > >
> > > With the thread jump, the above becomes:
> > >
> > > void sparx5_set (int * ptr, struct nums * sg, int index)
> > > {
> > >if (index >= 4)
> > >  {
> > >warn ();
> > >*ptr = 0;// Code duplications since "warn" does
> return;
> > >*val = sg->vals[index];  // same this line.
> > > // In this path, since it's under the
> condition
> > > // "index >= 4", the compiler knows the
> value
> > > // of "index" is larger then 4, therefore
> the
> > > // out-of-bound warning.
> > >warn ();
> > >  }
> > >else
> > >  {
> > >*ptr = 0;
> > >*val = sg->vals[index];
> > >  }
> > >*ptr = *val;
> > >return;
> > > }
> > >
> > > We can see, after the thread jump optimization, the # of branches
> inside
> > > the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
> > > code duplication (which is needed for the correctness of the code), we
> > > got a false positive out-of-bound warning.
> > >
> > > In order to eliminate such false positive out-of-bound warning,
> > >
> > > A. Add one more flag for GIMPLE: is_splitted.
> > > B. During the thread jump optimization, when the basic blocks are
> > > duplicated, mark all the STMTs inside the original and duplicated
> > > basic blocks as "is_splitted";
> > > C. Inside the array bound checker, add the following new heuristic:
> > >
> > > If
> > > 1. the stmt is duplicated and splitted into two conditional paths;
> > > +  2. the warning level < 2;
> > > +  3. the current block is not dominating the exit block
> > > Then not report the warning.
> > >
> > > The false positive warnings are moved from -Warray-bounds=1 to
> > >   -Warray-bounds=2 now.
> > >
> > > Bootstrapped and regression tested on both x86 and aarch64. adjusted
> > >   -Warray-bounds-61.c due to the false positive warnings.
> > >
> > > Let me know if you have any comments and suggestions.
> > This sounds horribly wrong.   In the code above, the warning is correct.
>
> It's not sensible from a user's perspective.
>
> If this doesn't warn:
>
> void sparx5_set (int * ptr, struct nums * sg, int index)
> {
>*ptr = 0;
>*val = sg->vals[index];
>*ptr = *val;
> }
>
> ... because the value range tracking of "index" spans [INT_MIN,INT_MAX],
> and warnings based on the value range are silenced if they haven't been
> clamped at all. (Otherwise warnings would be produced everywhere: only
> when a limited set of values is known is it useful to produce a warning.)
>
>
> But it makes no sense to warn about:
>
> void sparx5_set (int * ptr, struct nums * sg, int index)
> {
>if (index >= 4)
>  warn ();
>*ptr = 0;
>*val = sg->vals[index];
>if (index >= 4)
>  warn ();
>*ptr = *val;
> }
>
> Because at "*val = sg->vals[index];" the actual value range tracking for
> index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the
> "if" statements is the range tracking [4,INT_MAX].)
>
> However, in the case where jump threading has split the execution flow
> and produced a copy of "*val = sg->vals[index];" where the value range
> tracking for "index" is now [4,INT_MAX], is the warning valid. But it
> is only for that instance. Reporting it for effectively both (there is
> only 1 source line for the array indexing) is misleading because there
> is nothing the user can do about it -- the compiler created the copy and
> then noticed it had a range it could apply to that array index.
>

"there is 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Kees Cook
On Mon, May 13, 2024 at 02:46:32PM -0600, Jeff Law wrote:
> 
> 
> On 5/13/24 1:48 PM, Qing Zhao wrote:
> > -Warray-bounds is an important option to enable linux kernal to keep
> > the array out-of-bound errors out of the source tree.
> > 
> > However, due to the false positive warnings reported in PR109071
> > (-Warray-bounds false positive warnings due to code duplication from
> > jump threading), -Warray-bounds=1 cannot be added on by default.
> > 
> > Although it's impossible to elinimate all the false positive warnings
> > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> > documentation says "always out of bounds"), we should minimize the
> > false positive warnings in -Warray-bounds=1.
> > 
> > The root reason for the false positive warnings reported in PR109071 is:
> > 
> > When the thread jump optimization tries to reduce the # of branches
> > inside the routine, sometimes it needs to duplicate the code and
> > split into two conditional pathes. for example:
> > 
> > The original code:
> > 
> > void sparx5_set (int * ptr, struct nums * sg, int index)
> > {
> >if (index >= 4)
> >  warn ();
> >*ptr = 0;
> >*val = sg->vals[index];
> >if (index >= 4)
> >  warn ();
> >*ptr = *val;
> > 
> >return;
> > }
> > 
> > With the thread jump, the above becomes:
> > 
> > void sparx5_set (int * ptr, struct nums * sg, int index)
> > {
> >if (index >= 4)
> >  {
> >warn ();
> >*ptr = 0;// Code duplications since "warn" does return;
> >*val = sg->vals[index];  // same this line.
> > // In this path, since it's under the condition
> > // "index >= 4", the compiler knows the value
> > // of "index" is larger then 4, therefore the
> > // out-of-bound warning.
> >warn ();
> >  }
> >else
> >  {
> >*ptr = 0;
> >*val = sg->vals[index];
> >  }
> >*ptr = *val;
> >return;
> > }
> > 
> > We can see, after the thread jump optimization, the # of branches inside
> > the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
> > code duplication (which is needed for the correctness of the code), we
> > got a false positive out-of-bound warning.
> > 
> > In order to eliminate such false positive out-of-bound warning,
> > 
> > A. Add one more flag for GIMPLE: is_splitted.
> > B. During the thread jump optimization, when the basic blocks are
> > duplicated, mark all the STMTs inside the original and duplicated
> > basic blocks as "is_splitted";
> > C. Inside the array bound checker, add the following new heuristic:
> > 
> > If
> > 1. the stmt is duplicated and splitted into two conditional paths;
> > +  2. the warning level < 2;
> > +  3. the current block is not dominating the exit block
> > Then not report the warning.
> > 
> > The false positive warnings are moved from -Warray-bounds=1 to
> >   -Warray-bounds=2 now.
> > 
> > Bootstrapped and regression tested on both x86 and aarch64. adjusted
> >   -Warray-bounds-61.c due to the false positive warnings.
> > 
> > Let me know if you have any comments and suggestions.
> This sounds horribly wrong.   In the code above, the warning is correct.

It's not sensible from a user's perspective.

If this doesn't warn:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
   *ptr = 0;
   *val = sg->vals[index];
   *ptr = *val;
}

... because the value range tracking of "index" spans [INT_MIN,INT_MAX],
and warnings based on the value range are silenced if they haven't been
clamped at all. (Otherwise warnings would be produced everywhere: only
when a limited set of values is known is it useful to produce a warning.)


But it makes no sense to warn about:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
   if (index >= 4)
 warn ();
   *ptr = 0;
   *val = sg->vals[index];
   if (index >= 4)
 warn ();
   *ptr = *val;
}

Because at "*val = sg->vals[index];" the actual value range tracking for
index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the
"if" statements is the range tracking [4,INT_MAX].)

However, in the case where jump threading has split the execution flow
and produced a copy of "*val = sg->vals[index];" where the value range
tracking for "index" is now [4,INT_MAX], is the warning valid. But it
is only for that instance. Reporting it for effectively both (there is
only 1 source line for the array indexing) is misleading because there
is nothing the user can do about it -- the compiler created the copy and
then noticed it had a range it could apply to that array index.

This situation makes -Warray-bounds unusable for the Linux kernel (we
cannot have false positives says BDFL), but we'd *really* like to have
it enabled since it usually finds real bugs. But these false positives
can't be fixed on our end. :( So, moving them to -Warray-bounds=2 makes
sense as that's the level 

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Jeff Law




On 5/13/24 1:48 PM, Qing Zhao wrote:

-Warray-bounds is an important option to enable linux kernal to keep
the array out-of-bound errors out of the source tree.

However, due to the false positive warnings reported in PR109071
(-Warray-bounds false positive warnings due to code duplication from
jump threading), -Warray-bounds=1 cannot be added on by default.

Although it's impossible to elinimate all the false positive warnings
from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
documentation says "always out of bounds"), we should minimize the
false positive warnings in -Warray-bounds=1.

The root reason for the false positive warnings reported in PR109071 is:

When the thread jump optimization tries to reduce the # of branches
inside the routine, sometimes it needs to duplicate the code and
split into two conditional pathes. for example:

The original code:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
   if (index >= 4)
 warn ();
   *ptr = 0;
   *val = sg->vals[index];
   if (index >= 4)
 warn ();
   *ptr = *val;

   return;
}

With the thread jump, the above becomes:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
   if (index >= 4)
 {
   warn ();
   *ptr = 0;// Code duplications since "warn" does return;
   *val = sg->vals[index];   // same this line.
// In this path, since it's under the condition
// "index >= 4", the compiler knows the value
// of "index" is larger then 4, therefore the
// out-of-bound warning.
   warn ();
 }
   else
 {
   *ptr = 0;
   *val = sg->vals[index];
 }
   *ptr = *val;
   return;
}

We can see, after the thread jump optimization, the # of branches inside
the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
code duplication (which is needed for the correctness of the code), we
got a false positive out-of-bound warning.

In order to eliminate such false positive out-of-bound warning,

A. Add one more flag for GIMPLE: is_splitted.
B. During the thread jump optimization, when the basic blocks are
duplicated, mark all the STMTs inside the original and duplicated
basic blocks as "is_splitted";
C. Inside the array bound checker, add the following new heuristic:

If
1. the stmt is duplicated and splitted into two conditional paths;
+  2. the warning level < 2;
+  3. the current block is not dominating the exit block
Then not report the warning.

The false positive warnings are moved from -Warray-bounds=1 to
  -Warray-bounds=2 now.

Bootstrapped and regression tested on both x86 and aarch64. adjusted
  -Warray-bounds-61.c due to the false positive warnings.

Let me know if you have any comments and suggestions.

This sounds horribly wrong.   In the code above, the warning is correct.

Jeff


[RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Qing Zhao
-Warray-bounds is an important option to enable linux kernal to keep
the array out-of-bound errors out of the source tree.

However, due to the false positive warnings reported in PR109071
(-Warray-bounds false positive warnings due to code duplication from
jump threading), -Warray-bounds=1 cannot be added on by default.

Although it's impossible to elinimate all the false positive warnings
from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
documentation says "always out of bounds"), we should minimize the
false positive warnings in -Warray-bounds=1.

The root reason for the false positive warnings reported in PR109071 is:

When the thread jump optimization tries to reduce the # of branches
inside the routine, sometimes it needs to duplicate the code and
split into two conditional pathes. for example:

The original code:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
  if (index >= 4)
warn ();
  *ptr = 0;
  *val = sg->vals[index];
  if (index >= 4)
warn ();
  *ptr = *val;

  return;
}

With the thread jump, the above becomes:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
  if (index >= 4)
{
  warn ();
  *ptr = 0; // Code duplications since "warn" does return;
  *val = sg->vals[index];   // same this line.
// In this path, since it's under the condition
// "index >= 4", the compiler knows the value
// of "index" is larger then 4, therefore the
// out-of-bound warning.
  warn ();
}
  else
{
  *ptr = 0;
  *val = sg->vals[index];
}
  *ptr = *val;
  return;
}

We can see, after the thread jump optimization, the # of branches inside
the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
code duplication (which is needed for the correctness of the code), we
got a false positive out-of-bound warning.

In order to eliminate such false positive out-of-bound warning,

A. Add one more flag for GIMPLE: is_splitted.
B. During the thread jump optimization, when the basic blocks are
   duplicated, mark all the STMTs inside the original and duplicated
   basic blocks as "is_splitted";
C. Inside the array bound checker, add the following new heuristic:

If
   1. the stmt is duplicated and splitted into two conditional paths;
+  2. the warning level < 2;
+  3. the current block is not dominating the exit block
Then not report the warning.

The false positive warnings are moved from -Warray-bounds=1 to
 -Warray-bounds=2 now.

Bootstrapped and regression tested on both x86 and aarch64. adjusted
 -Warray-bounds-61.c due to the false positive warnings.

Let me know if you have any comments and suggestions.

Thanks.

Qing


PR tree optimization/109071

gcc/ChangeLog:

* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two new
arguments for the new heuristic to not issue warnings.
(array_bounds_checker::check_array_ref): Call the new prototype of the
routine check_out_of_bounds_and_warn.
(array_bounds_checker::check_mem_ref): Add one new argument for the
new heuristic to not issue warnings.
(array_bounds_checker::check_addr_expr): Call the new prototype of the
routine check_mem_ref, add new heuristic for not issue warnings.
(array_bounds_checker::check_array_bounds): Call the new prototype of
the routine check_mem_ref.
* gimple-array-bounds.h: New prototype of check_mem_ref.
* gimple.h (struct GTY): Add one new flag is_splitted for gimple.
(gimple_is_splitted_p): New function.
(gimple_set_is_splitted): New function.
* tree-ssa-threadupdate.cc (set_stmts_in_bb_is_splitted): New function.
(back_jt_path_registry::duplicate_thread_path): Mark all the stmts in
both original and copied blocks as IS_SPLITTED.

gcc/testsuite/ChangeLog:

* gcc.dg/Warray-bounds-61.c: Adjust testing case.
* gcc.dg/pr109071-1.c: New test.
* gcc.dg/pr109071.c: New test.
---
 gcc/gimple-array-bounds.cc  | 46 +
 gcc/gimple-array-bounds.h   |  2 +-
 gcc/gimple.h| 21 +--
 gcc/testsuite/gcc.dg/Warray-bounds-61.c |  6 ++--
 gcc/testsuite/gcc.dg/pr109071-1.c   | 22 
 gcc/testsuite/gcc.dg/pr109071.c | 22 
 gcc/tree-ssa-threadupdate.cc| 15 
 7 files changed, 122 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr109071-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071.c

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 008071cd5464..4a2975623bc1 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -264,7 +264,9 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
  tree up_bound, tree up_bound_p1,