Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
An update on this task. (Also need more suggestions -:) I have an initial implemenation locally, with this gcc, I can get the following message with the testing case in PR109071: [ 109071]$ cat t.c extern void warn(void); static inline void assign(int val, int *regs, int *index) { if (*index >= 4) warn(); *regs = val; } struct nums {int vals[4];}; void sparx5_set (int *ptr, struct nums *sg, int index) { int *val = >vals[index]; assign(0,ptr, ); assign(*val, ptr, ); } [109071]$ sh t /home/opc/Install/latest-d/bin/gcc -O2 -Warray-bounds=1 -c -o t.o t.c t.c: In function ‘sparx5_set’: t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=] 12 | int *val = >vals[index]; | ^~~ event 1 | |4 | if (*index >= 4) | | ^ | | | | | (1) when the condition is evaluated to true | t.c:8:18: note: while referencing ‘vals’ 8 | struct nums {int vals[4];}; | ^~~~ 1. Is the above diagnostic message good enough? Any more suggestion? 2. It’s hard for me to come up with a more complicate testing case that has one basic block copied multiple times by the jump thread, do you have any pointer to such testing cases? Thanks a lot for any help. Qing > On Jun 7, 2024, at 15:13, Qing Zhao wrote: > > Hi, Richard, > >> On Jun 5, 2024, at 13:58, Qing Zhao wrote: >>> >> Like this? >> >> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h >> index e6e2b0897572..ee344f91333b 100644 >> --- a/libcpp/include/line-map.h >> +++ b/libcpp/include/line-map.h >> @@ -761,8 +761,9 @@ struct GTY(()) maps_info_macro { >> struct GTY(()) location_adhoc_data { >> location_t locus; >> source_range src_range; >> - void * GTY((skip)) data; >> unsigned discriminator; >> + void * GTY((skip)) data; >> + void * GTY((skip)) copy_history; >> }; >> struct htab; > > Yes. > >> How about the copy_history? Do we need a new data structure (like >> the following, or other suggestion) for this? Where should I add >> this new data structure? > > As it needs to be managed by libcpp it should be in this very same > file. > >> struct copy_history { >> location_t condition; >> Bool is_true_path; >> } > > I think we want a pointer to the previous copy-of state as well in > case a stmt > was copied twice. We'll see whether a single (condition) location > plus edge flag > is sufficient. I'd say we should plan for an enum to indicate the > duplication > reason at least (jump threading, unswitching, unrolling come to my > mind). For > jump threading being able to say "when is true/false" is > probably > good enough, though it might not always be easy to identify a single > condition > here given a threading path starts at an incoming edge to a CFG merge > and > will usually end with the outgoing edge of a condition that we can > statically > evaluate. The condition controlling the path entry might not be > determined > fully by a single condition location. > > Possibly building a full "diagnostic path" object at threading time > might be > the only way to record all the facts, but that's also going to be > more > expensive. Note that a diagnostic_path represents a path through some kind of graph, whereas it sounds like you want to be storing the *nodes* in the graph, and later generating the diagnostic_path from that graph when we need it (which is trivial if the graph is actually just a tree: just follow the parent links backwards, then reverse it). >>> >>> I think we are mixing two things - one is that a single transform like >>> jump >>> threading produces a stmt copy and when we emit a diagnostic on that >>> copied statement we want to tell the user the condition under which the >>> copy is executed. That "condition" can be actually a sequence of >>> conditionals. I wanted to point out that a diagnostic_path instance >>> could >>> be used to describe such complex condition. >>> >>> But then the other thing I wanted to address with the link to a previous >>> copy_history - that's when a statement gets copied twice, for example >>> by two distinct jump threading optimizations. Like when dumping >>> the inlining decisions for diagnostics we could dump the logical "and" >>> of the conditions of the two threadings. Since we have a single >>> location per GIMPLE stmt we'd have to keep a "stack" of
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
Hi, Richard, > On Jun 5, 2024, at 13:58, Qing Zhao wrote: >> > Like this? > > diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h > index e6e2b0897572..ee344f91333b 100644 > --- a/libcpp/include/line-map.h > +++ b/libcpp/include/line-map.h > @@ -761,8 +761,9 @@ struct GTY(()) maps_info_macro { > struct GTY(()) location_adhoc_data { > location_t locus; > source_range src_range; > - void * GTY((skip)) data; > unsigned discriminator; > + void * GTY((skip)) data; > + void * GTY((skip)) copy_history; > }; > struct htab; Yes. > How about the copy_history? Do we need a new data structure (like > the following, or other suggestion) for this? Where should I add > this new data structure? As it needs to be managed by libcpp it should be in this very same file. > struct copy_history { > location_t condition; > Bool is_true_path; > } I think we want a pointer to the previous copy-of state as well in case a stmt was copied twice. We'll see whether a single (condition) location plus edge flag is sufficient. I'd say we should plan for an enum to indicate the duplication reason at least (jump threading, unswitching, unrolling come to my mind). For jump threading being able to say "when is true/false" is probably good enough, though it might not always be easy to identify a single condition here given a threading path starts at an incoming edge to a CFG merge and will usually end with the outgoing edge of a condition that we can statically evaluate. The condition controlling the path entry might not be determined fully by a single condition location. Possibly building a full "diagnostic path" object at threading time might be the only way to record all the facts, but that's also going to be more expensive. >>> >>> Note that a diagnostic_path represents a path through some kind of >>> graph, whereas it sounds like you want to be storing the *nodes* in the >>> graph, and later generating the diagnostic_path from that graph when we >>> need it (which is trivial if the graph is actually just a tree: just >>> follow the parent links backwards, then reverse it). >> >> I think we are mixing two things - one is that a single transform like >> jump >> threading produces a stmt copy and when we emit a diagnostic on that >> copied statement we want to tell the user the condition under which the >> copy is executed. That "condition" can be actually a sequence of >> conditionals. I wanted to point out that a diagnostic_path instance >> could >> be used to describe such complex condition. >> >> But then the other thing I wanted to address with the link to a previous >> copy_history - that's when a statement gets copied twice, for example >> by two distinct jump threading optimizations. Like when dumping >> the inlining decisions for diagnostics we could dump the logical "and" >> of the conditions of the two threadings. Since we have a single >> location per GIMPLE stmt we'd have to keep a "stack" of copy events >> associated with it. That's the linked list (I think a linked list should >> work fine here). > Yes, the linked list to keep the “stack” of copy events should be good > enough > to form the sequence of conditionals event for the diagnostic_path > instance. >> >> I realize things may get a bit "fat", but eventually we are not >> duplicating >> statements that much. I do hope we can share for example a big >> diagnostic_path when we duplicate a basic block during threading >> and use one instance for all stmts in such block copy (IIRC we never >> release locations or their ad-hoc data, we just make sure to never >> use locations with ad-hoc data pointing to BLOCKs that we released, >> but the linemap data will still have pointers in "dead" location entries, >> right?) > Are you still suggesting to add two artificial stmts in the beginning and > the > end of the duplicated block to carry the copy history information for all > the > stmts in the block to save space? > > Compared with the approach to carry such information to each duplicated > stmts (which I preferred), > The major concerns with the approach are: > 1. Optimizations might move stmts out of these two artificial stmts, > therefore we need add > Some memory barrier on these two artificial stmts to prevent such > movements. >This
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On Jun 5, 2024, at 13:07, Richard Biener wrote: > > > >> Am 05.06.2024 um 18:39 schrieb Qing Zhao : >> >> >> >>> On Jun 5, 2024, at 03:26, Richard Biener wrote: >>> On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao wrote: > On Jun 4, 2024, at 03:43, Richard Biener > wrote: > > On Mon, Jun 3, 2024 at 4:48 PM David Malcolm wrote: >> >> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote: >>> On Fri, May 31, 2024 at 11:23 PM Qing Zhao >>> wrote: > On May 23, 2024, at 07:46, Richard Biener > wrote: > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao > wrote: >> >> >> >>> On May 22, 2024, at 03:38, Richard Biener >>> wrote: >>> >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm >>> wrote: On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > Thanks for the comments and suggestions. > >> On May 15, 2024, at 10:00, David Malcolm >> >> wrote: >> >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener >> wrote: >>> On Mon, 13 May 2024, Qing Zhao wrote: >>> -Warray-bounds is an important option to enable linux kernal to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication from jump threading), -Warray-bounds=1 cannot be added on by default. Although it's impossible to elinimate all the false positive warnings from -Warray-bounds=1 (See PR104355 Misleading - Warray-bounds documentation says "always out of bounds"), we should minimize the false positive warnings in -Warray-bounds=1. The root reason for the false positive warnings reported in PR109071 is: When the thread jump optimization tries to reduce the # of branches inside the routine, sometimes it needs to duplicate the code and split into two conditional pathes. for example: The original code: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; return; } With the thread jump, the above becomes: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) { warn (); *ptr = 0; // Code duplications since "warn" does return; *val = sg->vals[index]; // same this line. // In this path, since it's under the condition // "index >= 4", the compiler knows the value // of "index" is larger then 4, therefore the // out-of-bound warning. warn (); } else { *ptr = 0; *val = sg->vals[index]; } *ptr = *val; return; } We can see, after the thread jump optimization, the # of branches inside the routine "sparx5_set" is reduced from 2 to 1, however, due to the code duplication (which is needed for the correctness of the code), we got a false positive out-of-bound warning. In order to eliminate such false
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> Am 05.06.2024 um 18:39 schrieb Qing Zhao : > > > >> On Jun 5, 2024, at 03:26, Richard Biener wrote: >> >>> On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao wrote: >>> >>> >>> On Jun 4, 2024, at 03:43, Richard Biener wrote: On Mon, Jun 3, 2024 at 4:48 PM David Malcolm wrote: > > On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote: >> On Fri, May 31, 2024 at 11:23 PM Qing Zhao >> wrote: >>> >>> >>> On May 23, 2024, at 07:46, Richard Biener wrote: On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote: > > > >> On May 22, 2024, at 03:38, Richard Biener >> wrote: >> >> On Tue, May 21, 2024 at 11:36 PM David Malcolm >> wrote: >>> >>> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: Thanks for the comments and suggestions. > On May 15, 2024, at 10:00, David Malcolm > > wrote: > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener > wrote: >> On Mon, 13 May 2024, Qing Zhao wrote: >> >>> -Warray-bounds is an important option to enable >>> linux kernal to >>> keep >>> the array out-of-bound errors out of the source >>> tree. >>> >>> However, due to the false positive warnings >>> reported in >>> PR109071 >>> (-Warray-bounds false positive warnings due to code >>> duplication >>> from >>> jump threading), -Warray-bounds=1 cannot be added >>> on by >>> default. >>> >>> Although it's impossible to elinimate all the false >>> positive >>> warnings >>> from -Warray-bounds=1 (See PR104355 Misleading - >>> Warray-bounds >>> documentation says "always out of bounds"), we >>> should minimize >>> the >>> false positive warnings in -Warray-bounds=1. >>> >>> The root reason for the false positive warnings >>> reported in >>> PR109071 is: >>> >>> When the thread jump optimization tries to reduce >>> the # of >>> branches >>> inside the routine, sometimes it needs to duplicate >>> the code >>> and >>> split into two conditional pathes. for example: >>> >>> The original code: >>> >>> void sparx5_set (int * ptr, struct nums * sg, int >>> index) >>> { >>> if (index >= 4) >>> warn (); >>> *ptr = 0; >>> *val = sg->vals[index]; >>> if (index >= 4) >>> warn (); >>> *ptr = *val; >>> >>> return; >>> } >>> >>> With the thread jump, the above becomes: >>> >>> void sparx5_set (int * ptr, struct nums * sg, int >>> index) >>> { >>> if (index >= 4) >>> { >>> warn (); >>> *ptr = 0; // Code duplications since >>> "warn" does >>> return; >>> *val = sg->vals[index]; // same this line. >>>// In this path, >>> since it's >>> under >>> the condition >>>// "index >= 4", the >>> compiler >>> knows >>> the value >>>// of "index" is >>> larger then 4, >>> therefore the >>>// out-of-bound >>> warning. >>> warn (); >>> } >>> else >>> { >>> *ptr = 0; >>> *val = sg->vals[index]; >>> } >>> *ptr = *val; >>> return; >>> } >>> >>> We can see, after the thread jump optimization, the >>> # of >>> branches >>> inside >>> the routine "sparx5_set" is reduced from 2 to 1, >>> however, due >>> to >>> the >>> code duplication (which is needed for the >>> correctness of the >>> code), >>> we >>> got a false positive out-of-bound warning. >>> >>> In order to eliminate such false positive out-of- >>> bound warning, >>> >>> A. Add one more flag for GIMPLE: is_splitted. >>> B. During the thread jump optimization, when the
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On Jun 5, 2024, at 03:26, Richard Biener wrote: > > On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao wrote: >> >> >> >>> On Jun 4, 2024, at 03:43, Richard Biener wrote: >>> >>> On Mon, Jun 3, 2024 at 4:48 PM David Malcolm wrote: On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote: > On Fri, May 31, 2024 at 11:23 PM Qing Zhao > wrote: >> >> >> >>> On May 23, 2024, at 07:46, Richard Biener >>> wrote: >>> >>> On Wed, May 22, 2024 at 8:53 PM Qing Zhao >>> wrote: > On May 22, 2024, at 03:38, Richard Biener > wrote: > > On Tue, May 21, 2024 at 11:36 PM David Malcolm > wrote: >> >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: >>> Thanks for the comments and suggestions. >>> On May 15, 2024, at 10:00, David Malcolm wrote: On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > On Mon, 13 May 2024, Qing Zhao wrote: > >> -Warray-bounds is an important option to enable >> linux kernal to >> keep >> the array out-of-bound errors out of the source >> tree. >> >> However, due to the false positive warnings >> reported in >> PR109071 >> (-Warray-bounds false positive warnings due to code >> duplication >> from >> jump threading), -Warray-bounds=1 cannot be added >> on by >> default. >> >> Although it's impossible to elinimate all the false >> positive >> warnings >> from -Warray-bounds=1 (See PR104355 Misleading - >> Warray-bounds >> documentation says "always out of bounds"), we >> should minimize >> the >> false positive warnings in -Warray-bounds=1. >> >> The root reason for the false positive warnings >> reported in >> PR109071 is: >> >> When the thread jump optimization tries to reduce >> the # of >> branches >> inside the routine, sometimes it needs to duplicate >> the code >> and >> split into two conditional pathes. for example: >> >> The original code: >> >> void sparx5_set (int * ptr, struct nums * sg, int >> index) >> { >> if (index >= 4) >> warn (); >> *ptr = 0; >> *val = sg->vals[index]; >> if (index >= 4) >> warn (); >> *ptr = *val; >> >> return; >> } >> >> With the thread jump, the above becomes: >> >> void sparx5_set (int * ptr, struct nums * sg, int >> index) >> { >> if (index >= 4) >> { >> warn (); >> *ptr = 0; // Code duplications since >> "warn" does >> return; >> *val = sg->vals[index]; // same this line. >> // In this path, >> since it's >> under >> the condition >> // "index >= 4", the >> compiler >> knows >> the value >> // of "index" is >> larger then 4, >> therefore the >> // out-of-bound >> warning. >> warn (); >> } >> else >> { >> *ptr = 0; >> *val = sg->vals[index]; >> } >> *ptr = *val; >> return; >> } >> >> We can see, after the thread jump optimization, the >> # of >> branches >> inside >> the routine "sparx5_set" is reduced from 2 to 1, >> however, due >> to >> the >> code duplication (which is needed for the >> correctness of the >> code), >> we >> got a false positive out-of-bound warning. >> >> In order to eliminate such false positive out-of- >> bound warning, >> >> A. Add one more flag for GIMPLE: is_splitted. >> B. During the thread jump optimization, when the >> basic blocks >> are >> duplicated, mark all the STMTs inside the original >> and >> duplicated >> basic blocks as "is_splitted";
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, Jun 4, 2024 at 10:31 PM Qing Zhao wrote: > > > > > On Jun 4, 2024, at 03:43, Richard Biener wrote: > > > > On Mon, Jun 3, 2024 at 4:48 PM David Malcolm wrote: > >> > >> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote: > >>> On Fri, May 31, 2024 at 11:23 PM Qing Zhao > >>> wrote: > > > > > On May 23, 2024, at 07:46, Richard Biener > > wrote: > > > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao > > wrote: > >> > >> > >> > >>> On May 22, 2024, at 03:38, Richard Biener > >>> wrote: > >>> > >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm > >>> wrote: > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > > Thanks for the comments and suggestions. > > > >> On May 15, 2024, at 10:00, David Malcolm > >> > >> wrote: > >> > >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener > >> wrote: > >>> On Mon, 13 May 2024, Qing Zhao wrote: > >>> > -Warray-bounds is an important option to enable > linux kernal to > keep > the array out-of-bound errors out of the source > tree. > > However, due to the false positive warnings > reported in > PR109071 > (-Warray-bounds false positive warnings due to code > duplication > from > jump threading), -Warray-bounds=1 cannot be added > on by > default. > > Although it's impossible to elinimate all the false > positive > warnings > from -Warray-bounds=1 (See PR104355 Misleading - > Warray-bounds > documentation says "always out of bounds"), we > should minimize > the > false positive warnings in -Warray-bounds=1. > > The root reason for the false positive warnings > reported in > PR109071 is: > > When the thread jump optimization tries to reduce > the # of > branches > inside the routine, sometimes it needs to duplicate > the code > and > split into two conditional pathes. for example: > > The original code: > > void sparx5_set (int * ptr, struct nums * sg, int > index) > { > if (index >= 4) > warn (); > *ptr = 0; > *val = sg->vals[index]; > if (index >= 4) > warn (); > *ptr = *val; > > return; > } > > With the thread jump, the above becomes: > > void sparx5_set (int * ptr, struct nums * sg, int > index) > { > if (index >= 4) > { > warn (); > *ptr = 0; // Code duplications since > "warn" does > return; > *val = sg->vals[index]; // same this line. > // In this path, > since it's > under > the condition > // "index >= 4", the > compiler > knows > the value > // of "index" is > larger then 4, > therefore the > // out-of-bound > warning. > warn (); > } > else > { > *ptr = 0; > *val = sg->vals[index]; > } > *ptr = *val; > return; > } > > We can see, after the thread jump optimization, the > # of > branches > inside > the routine "sparx5_set" is reduced from 2 to 1, > however, due > to > the > code duplication (which is needed for the > correctness of the > code), > we > got a false positive out-of-bound warning. > > In order to eliminate such false positive out-of- > bound warning, > > A. Add one more flag for GIMPLE: is_splitted. > B. During the thread jump optimization, when the > basic blocks > are > duplicated, mark all the STMTs inside the original > and > duplicated > basic blocks as "is_splitted"; > C. Inside the array bound checker, add the > following
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On Jun 4, 2024, at 03:43, Richard Biener wrote: > > On Mon, Jun 3, 2024 at 4:48 PM David Malcolm wrote: >> >> On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote: >>> On Fri, May 31, 2024 at 11:23 PM Qing Zhao >>> wrote: > On May 23, 2024, at 07:46, Richard Biener > wrote: > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao > wrote: >> >> >> >>> On May 22, 2024, at 03:38, Richard Biener >>> wrote: >>> >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm >>> wrote: On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > Thanks for the comments and suggestions. > >> On May 15, 2024, at 10:00, David Malcolm >> >> wrote: >> >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener >> wrote: >>> On Mon, 13 May 2024, Qing Zhao wrote: >>> -Warray-bounds is an important option to enable linux kernal to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication from jump threading), -Warray-bounds=1 cannot be added on by default. Although it's impossible to elinimate all the false positive warnings from -Warray-bounds=1 (See PR104355 Misleading - Warray-bounds documentation says "always out of bounds"), we should minimize the false positive warnings in -Warray-bounds=1. The root reason for the false positive warnings reported in PR109071 is: When the thread jump optimization tries to reduce the # of branches inside the routine, sometimes it needs to duplicate the code and split into two conditional pathes. for example: The original code: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; return; } With the thread jump, the above becomes: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) { warn (); *ptr = 0; // Code duplications since "warn" does return; *val = sg->vals[index]; // same this line. // In this path, since it's under the condition // "index >= 4", the compiler knows the value // of "index" is larger then 4, therefore the // out-of-bound warning. warn (); } else { *ptr = 0; *val = sg->vals[index]; } *ptr = *val; return; } We can see, after the thread jump optimization, the # of branches inside the routine "sparx5_set" is reduced from 2 to 1, however, due to the code duplication (which is needed for the correctness of the code), we got a false positive out-of-bound warning. In order to eliminate such false positive out-of- bound warning, A. Add one more flag for GIMPLE: is_splitted. B. During the thread jump optimization, when the basic blocks are duplicated, mark all the STMTs inside the original and duplicated basic blocks as "is_splitted"; C. Inside the array bound checker, add the following new heuristic: If 1. the stmt is duplicated and splitted into two conditional paths; + 2. the warning level < 2; + 3. the current block is not dominating the exit block Then not report the warning.
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Mon, Jun 3, 2024 at 4:48 PM David Malcolm wrote: > > On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote: > > On Fri, May 31, 2024 at 11:23 PM Qing Zhao > > wrote: > > > > > > > > > > > > > On May 23, 2024, at 07:46, Richard Biener > > > > wrote: > > > > > > > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao > > > > wrote: > > > > > > > > > > > > > > > > > > > > > On May 22, 2024, at 03:38, Richard Biener > > > > > > wrote: > > > > > > > > > > > > On Tue, May 21, 2024 at 11:36 PM David Malcolm > > > > > > wrote: > > > > > > > > > > > > > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > > > > > > > > Thanks for the comments and suggestions. > > > > > > > > > > > > > > > > > On May 15, 2024, at 10:00, David Malcolm > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener > > > > > > > > > wrote: > > > > > > > > > > On Mon, 13 May 2024, Qing Zhao wrote: > > > > > > > > > > > > > > > > > > > > > -Warray-bounds is an important option to enable > > > > > > > > > > > linux kernal to > > > > > > > > > > > keep > > > > > > > > > > > the array out-of-bound errors out of the source > > > > > > > > > > > tree. > > > > > > > > > > > > > > > > > > > > > > However, due to the false positive warnings > > > > > > > > > > > reported in > > > > > > > > > > > PR109071 > > > > > > > > > > > (-Warray-bounds false positive warnings due to code > > > > > > > > > > > duplication > > > > > > > > > > > from > > > > > > > > > > > jump threading), -Warray-bounds=1 cannot be added > > > > > > > > > > > on by > > > > > > > > > > > default. > > > > > > > > > > > > > > > > > > > > > > Although it's impossible to elinimate all the false > > > > > > > > > > > positive > > > > > > > > > > > warnings > > > > > > > > > > > from -Warray-bounds=1 (See PR104355 Misleading - > > > > > > > > > > > Warray-bounds > > > > > > > > > > > documentation says "always out of bounds"), we > > > > > > > > > > > should minimize > > > > > > > > > > > the > > > > > > > > > > > false positive warnings in -Warray-bounds=1. > > > > > > > > > > > > > > > > > > > > > > The root reason for the false positive warnings > > > > > > > > > > > reported in > > > > > > > > > > > PR109071 is: > > > > > > > > > > > > > > > > > > > > > > When the thread jump optimization tries to reduce > > > > > > > > > > > the # of > > > > > > > > > > > branches > > > > > > > > > > > inside the routine, sometimes it needs to duplicate > > > > > > > > > > > the code > > > > > > > > > > > and > > > > > > > > > > > split into two conditional pathes. for example: > > > > > > > > > > > > > > > > > > > > > > The original code: > > > > > > > > > > > > > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int > > > > > > > > > > > index) > > > > > > > > > > > { > > > > > > > > > > > if (index >= 4) > > > > > > > > > > > warn (); > > > > > > > > > > > *ptr = 0; > > > > > > > > > > > *val = sg->vals[index]; > > > > > > > > > > > if (index >= 4) > > > > > > > > > > > warn (); > > > > > > > > > > > *ptr = *val; > > > > > > > > > > > > > > > > > > > > > > return; > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > With the thread jump, the above becomes: > > > > > > > > > > > > > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int > > > > > > > > > > > index) > > > > > > > > > > > { > > > > > > > > > > > if (index >= 4) > > > > > > > > > > > { > > > > > > > > > > > warn (); > > > > > > > > > > > *ptr = 0; // Code duplications since > > > > > > > > > > > "warn" does > > > > > > > > > > > return; > > > > > > > > > > > *val = sg->vals[index]; // same this line. > > > > > > > > > > > // In this path, > > > > > > > > > > > since it's > > > > > > > > > > > under > > > > > > > > > > > the condition > > > > > > > > > > > // "index >= 4", the > > > > > > > > > > > compiler > > > > > > > > > > > knows > > > > > > > > > > > the value > > > > > > > > > > > // of "index" is > > > > > > > > > > > larger then 4, > > > > > > > > > > > therefore the > > > > > > > > > > > // out-of-bound > > > > > > > > > > > warning. > > > > > > > > > > > warn (); > > > > > > > > > > > } > > > > > > > > > > > else > > > > > > > > > > > { > > > > > > > > > > > *ptr = 0; > > > > > > > > > > > *val = sg->vals[index]; > > > > > > > > > > > } > > > > > > > > > > > *ptr = *val; > > > > > > > > > > > return; > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > We can see, after the thread jump optimization, the > > > > > > > > > > > # of > > > > > > > > > > > branches > > > > > > > > > > > inside > > > > > > > > > > > the routine "sparx5_set" is reduced from 2 to 1, > > > > > > > > > > > however, due > > > > > > > > > > > to > > > > > > > > > > > the > > > > > > > > > > > code duplication (which is needed for the > > > > > > > > > > >
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Mon, 2024-06-03 at 08:29 +0200, Richard Biener wrote: > On Fri, May 31, 2024 at 11:23 PM Qing Zhao > wrote: > > > > > > > > > On May 23, 2024, at 07:46, Richard Biener > > > wrote: > > > > > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao > > > wrote: > > > > > > > > > > > > > > > > > On May 22, 2024, at 03:38, Richard Biener > > > > > wrote: > > > > > > > > > > On Tue, May 21, 2024 at 11:36 PM David Malcolm > > > > > wrote: > > > > > > > > > > > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > > > > > > > Thanks for the comments and suggestions. > > > > > > > > > > > > > > > On May 15, 2024, at 10:00, David Malcolm > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener > > > > > > > > wrote: > > > > > > > > > On Mon, 13 May 2024, Qing Zhao wrote: > > > > > > > > > > > > > > > > > > > -Warray-bounds is an important option to enable > > > > > > > > > > linux kernal to > > > > > > > > > > keep > > > > > > > > > > the array out-of-bound errors out of the source > > > > > > > > > > tree. > > > > > > > > > > > > > > > > > > > > However, due to the false positive warnings > > > > > > > > > > reported in > > > > > > > > > > PR109071 > > > > > > > > > > (-Warray-bounds false positive warnings due to code > > > > > > > > > > duplication > > > > > > > > > > from > > > > > > > > > > jump threading), -Warray-bounds=1 cannot be added > > > > > > > > > > on by > > > > > > > > > > default. > > > > > > > > > > > > > > > > > > > > Although it's impossible to elinimate all the false > > > > > > > > > > positive > > > > > > > > > > warnings > > > > > > > > > > from -Warray-bounds=1 (See PR104355 Misleading - > > > > > > > > > > Warray-bounds > > > > > > > > > > documentation says "always out of bounds"), we > > > > > > > > > > should minimize > > > > > > > > > > the > > > > > > > > > > false positive warnings in -Warray-bounds=1. > > > > > > > > > > > > > > > > > > > > The root reason for the false positive warnings > > > > > > > > > > reported in > > > > > > > > > > PR109071 is: > > > > > > > > > > > > > > > > > > > > When the thread jump optimization tries to reduce > > > > > > > > > > the # of > > > > > > > > > > branches > > > > > > > > > > inside the routine, sometimes it needs to duplicate > > > > > > > > > > the code > > > > > > > > > > and > > > > > > > > > > split into two conditional pathes. for example: > > > > > > > > > > > > > > > > > > > > The original code: > > > > > > > > > > > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int > > > > > > > > > > index) > > > > > > > > > > { > > > > > > > > > > if (index >= 4) > > > > > > > > > > warn (); > > > > > > > > > > *ptr = 0; > > > > > > > > > > *val = sg->vals[index]; > > > > > > > > > > if (index >= 4) > > > > > > > > > > warn (); > > > > > > > > > > *ptr = *val; > > > > > > > > > > > > > > > > > > > > return; > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > With the thread jump, the above becomes: > > > > > > > > > > > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int > > > > > > > > > > index) > > > > > > > > > > { > > > > > > > > > > if (index >= 4) > > > > > > > > > > { > > > > > > > > > > warn (); > > > > > > > > > > *ptr = 0; // Code duplications since > > > > > > > > > > "warn" does > > > > > > > > > > return; > > > > > > > > > > *val = sg->vals[index]; // same this line. > > > > > > > > > > // In this path, > > > > > > > > > > since it's > > > > > > > > > > under > > > > > > > > > > the condition > > > > > > > > > > // "index >= 4", the > > > > > > > > > > compiler > > > > > > > > > > knows > > > > > > > > > > the value > > > > > > > > > > // of "index" is > > > > > > > > > > larger then 4, > > > > > > > > > > therefore the > > > > > > > > > > // out-of-bound > > > > > > > > > > warning. > > > > > > > > > > warn (); > > > > > > > > > > } > > > > > > > > > > else > > > > > > > > > > { > > > > > > > > > > *ptr = 0; > > > > > > > > > > *val = sg->vals[index]; > > > > > > > > > > } > > > > > > > > > > *ptr = *val; > > > > > > > > > > return; > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > We can see, after the thread jump optimization, the > > > > > > > > > > # of > > > > > > > > > > branches > > > > > > > > > > inside > > > > > > > > > > the routine "sparx5_set" is reduced from 2 to 1, > > > > > > > > > > however, due > > > > > > > > > > to > > > > > > > > > > the > > > > > > > > > > code duplication (which is needed for the > > > > > > > > > > correctness of the > > > > > > > > > > code), > > > > > > > > > > we > > > > > > > > > > got a false positive out-of-bound warning. > > > > > > > > > > > > > > > > > > > > In order to eliminate such false positive out-of- > > > > > > > > > > bound warning, > > > > > > > > > > > > > > > > > > >
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On Jun 3, 2024, at 02:29, Richard Biener wrote: > > On Fri, May 31, 2024 at 11:23 PM Qing Zhao wrote: >> >> >> >>> On May 23, 2024, at 07:46, Richard Biener >>> wrote: >>> >>> On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote: > On May 22, 2024, at 03:38, Richard Biener > wrote: > > On Tue, May 21, 2024 at 11:36 PM David Malcolm > wrote: >> >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: >>> Thanks for the comments and suggestions. >>> On May 15, 2024, at 10:00, David Malcolm wrote: On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > On Mon, 13 May 2024, Qing Zhao wrote: > >> -Warray-bounds is an important option to enable linux kernal to >> keep >> the array out-of-bound errors out of the source tree. >> >> However, due to the false positive warnings reported in >> PR109071 >> (-Warray-bounds false positive warnings due to code duplication >> from >> jump threading), -Warray-bounds=1 cannot be added on by >> default. >> >> Although it's impossible to elinimate all the false positive >> warnings >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds >> documentation says "always out of bounds"), we should minimize >> the >> false positive warnings in -Warray-bounds=1. >> >> The root reason for the false positive warnings reported in >> PR109071 is: >> >> When the thread jump optimization tries to reduce the # of >> branches >> inside the routine, sometimes it needs to duplicate the code >> and >> split into two conditional pathes. for example: >> >> The original code: >> >> void sparx5_set (int * ptr, struct nums * sg, int index) >> { >> if (index >= 4) >> warn (); >> *ptr = 0; >> *val = sg->vals[index]; >> if (index >= 4) >> warn (); >> *ptr = *val; >> >> return; >> } >> >> With the thread jump, the above becomes: >> >> void sparx5_set (int * ptr, struct nums * sg, int index) >> { >> if (index >= 4) >> { >>warn (); >>*ptr = 0; // Code duplications since "warn" does >> return; >>*val = sg->vals[index]; // same this line. >> // In this path, since it's >> under >> the condition >> // "index >= 4", the compiler >> knows >> the value >> // of "index" is larger then 4, >> therefore the >> // out-of-bound warning. >>warn (); >> } >> else >> { >>*ptr = 0; >>*val = sg->vals[index]; >> } >> *ptr = *val; >> return; >> } >> >> We can see, after the thread jump optimization, the # of >> branches >> inside >> the routine "sparx5_set" is reduced from 2 to 1, however, due >> to >> the >> code duplication (which is needed for the correctness of the >> code), >> we >> got a false positive out-of-bound warning. >> >> In order to eliminate such false positive out-of-bound warning, >> >> A. Add one more flag for GIMPLE: is_splitted. >> B. During the thread jump optimization, when the basic blocks >> are >> duplicated, mark all the STMTs inside the original and >> duplicated >> basic blocks as "is_splitted"; >> C. Inside the array bound checker, add the following new >> heuristic: >> >> If >> 1. the stmt is duplicated and splitted into two conditional >> paths; >> + 2. the warning level < 2; >> + 3. the current block is not dominating the exit block >> Then not report the warning. >> >> The false positive warnings are moved from -Warray-bounds=1 to >> -Warray-bounds=2 now. >> >> Bootstrapped and regression tested on both x86 and aarch64. >> adjusted >> -Warray-bounds-61.c due to the false positive warnings. >> >> Let me know if you have any comments and suggestions. > > At the last Cauldron I talked with David Malcolm about these kind > of > issues and thought of instead of suppressing diagnostics to > record > how a block was duplicated. For jump threading my idea was to > record > the condition that was proved true when entering the path and do > this > by recording
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Fri, May 31, 2024 at 11:23 PM Qing Zhao wrote: > > > > > On May 23, 2024, at 07:46, Richard Biener > > wrote: > > > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote: > >> > >> > >> > >>> On May 22, 2024, at 03:38, Richard Biener > >>> wrote: > >>> > >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm > >>> wrote: > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > > Thanks for the comments and suggestions. > > > >> On May 15, 2024, at 10:00, David Malcolm > >> wrote: > >> > >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > >>> On Mon, 13 May 2024, Qing Zhao wrote: > >>> > -Warray-bounds is an important option to enable linux kernal to > keep > the array out-of-bound errors out of the source tree. > > However, due to the false positive warnings reported in > PR109071 > (-Warray-bounds false positive warnings due to code duplication > from > jump threading), -Warray-bounds=1 cannot be added on by > default. > > Although it's impossible to elinimate all the false positive > warnings > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > documentation says "always out of bounds"), we should minimize > the > false positive warnings in -Warray-bounds=1. > > The root reason for the false positive warnings reported in > PR109071 is: > > When the thread jump optimization tries to reduce the # of > branches > inside the routine, sometimes it needs to duplicate the code > and > split into two conditional pathes. for example: > > The original code: > > void sparx5_set (int * ptr, struct nums * sg, int index) > { > if (index >= 4) > warn (); > *ptr = 0; > *val = sg->vals[index]; > if (index >= 4) > warn (); > *ptr = *val; > > return; > } > > With the thread jump, the above becomes: > > void sparx5_set (int * ptr, struct nums * sg, int index) > { > if (index >= 4) > { > warn (); > *ptr = 0; // Code duplications since "warn" does > return; > *val = sg->vals[index]; // same this line. > // In this path, since it's > under > the condition > // "index >= 4", the compiler > knows > the value > // of "index" is larger then 4, > therefore the > // out-of-bound warning. > warn (); > } > else > { > *ptr = 0; > *val = sg->vals[index]; > } > *ptr = *val; > return; > } > > We can see, after the thread jump optimization, the # of > branches > inside > the routine "sparx5_set" is reduced from 2 to 1, however, due > to > the > code duplication (which is needed for the correctness of the > code), > we > got a false positive out-of-bound warning. > > In order to eliminate such false positive out-of-bound warning, > > A. Add one more flag for GIMPLE: is_splitted. > B. During the thread jump optimization, when the basic blocks > are > duplicated, mark all the STMTs inside the original and > duplicated > basic blocks as "is_splitted"; > C. Inside the array bound checker, add the following new > heuristic: > > If > 1. the stmt is duplicated and splitted into two conditional > paths; > + 2. the warning level < 2; > + 3. the current block is not dominating the exit block > Then not report the warning. > > The false positive warnings are moved from -Warray-bounds=1 to > -Warray-bounds=2 now. > > Bootstrapped and regression tested on both x86 and aarch64. > adjusted > -Warray-bounds-61.c due to the false positive warnings. > > Let me know if you have any comments and suggestions. > >>> > >>> At the last Cauldron I talked with David Malcolm about these kind > >>> of > >>> issues and thought of instead of suppressing diagnostics to > >>> record > >>> how a block was duplicated. For jump threading my idea was to > >>> record > >>> the condition that was proved true when entering the path and do > >>> this > >>> by recording the corresponding locations > > > > Is only recording the
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 23, 2024, at 07:46, Richard Biener wrote: > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote: >> >> >> >>> On May 22, 2024, at 03:38, Richard Biener >>> wrote: >>> >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > Thanks for the comments and suggestions. > >> On May 15, 2024, at 10:00, David Malcolm >> wrote: >> >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: >>> On Mon, 13 May 2024, Qing Zhao wrote: >>> -Warray-bounds is an important option to enable linux kernal to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication from jump threading), -Warray-bounds=1 cannot be added on by default. Although it's impossible to elinimate all the false positive warnings from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds documentation says "always out of bounds"), we should minimize the false positive warnings in -Warray-bounds=1. The root reason for the false positive warnings reported in PR109071 is: When the thread jump optimization tries to reduce the # of branches inside the routine, sometimes it needs to duplicate the code and split into two conditional pathes. for example: The original code: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; return; } With the thread jump, the above becomes: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) { warn (); *ptr = 0; // Code duplications since "warn" does return; *val = sg->vals[index]; // same this line. // In this path, since it's under the condition // "index >= 4", the compiler knows the value // of "index" is larger then 4, therefore the // out-of-bound warning. warn (); } else { *ptr = 0; *val = sg->vals[index]; } *ptr = *val; return; } We can see, after the thread jump optimization, the # of branches inside the routine "sparx5_set" is reduced from 2 to 1, however, due to the code duplication (which is needed for the correctness of the code), we got a false positive out-of-bound warning. In order to eliminate such false positive out-of-bound warning, A. Add one more flag for GIMPLE: is_splitted. B. During the thread jump optimization, when the basic blocks are duplicated, mark all the STMTs inside the original and duplicated basic blocks as "is_splitted"; C. Inside the array bound checker, add the following new heuristic: If 1. the stmt is duplicated and splitted into two conditional paths; + 2. the warning level < 2; + 3. the current block is not dominating the exit block Then not report the warning. The false positive warnings are moved from -Warray-bounds=1 to -Warray-bounds=2 now. Bootstrapped and regression tested on both x86 and aarch64. adjusted -Warray-bounds-61.c due to the false positive warnings. Let me know if you have any comments and suggestions. >>> >>> At the last Cauldron I talked with David Malcolm about these kind >>> of >>> issues and thought of instead of suppressing diagnostics to >>> record >>> how a block was duplicated. For jump threading my idea was to >>> record >>> the condition that was proved true when entering the path and do >>> this >>> by recording the corresponding locations > > Is only recording the location for the TRUE path enough? > We might need to record the corresponding locations for both TRUE and > FALSE paths since the VRP might be more accurate on both paths. > Is only recording the location is enough? > Do we need to record the pointer to the original condition stmt? Just to be
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 23, 2024, at 10:13, David Malcolm wrote: > > On Thu, 2024-05-23 at 14:03 +, Qing Zhao wrote: > > [...snip...] > >> Is “location_adhoc_data” an available data structure in current GCC? >> I just searched GCC source tree, cannot find it. > > It's in libcpp/include/line-table.h; see the big comment at the top of > that file. Thanks a lot. I found it in libcpp/include/line-map.h. reading it right now. Qing > > Dave >
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Thu, 2024-05-23 at 14:03 +, Qing Zhao wrote: [...snip...] > Is “location_adhoc_data” an available data structure in current GCC? > I just searched GCC source tree, cannot find it. It's in libcpp/include/line-table.h; see the big comment at the top of that file. Dave
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 23, 2024, at 07:46, Richard Biener wrote: > > On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote: >> >> >> >>> On May 22, 2024, at 03:38, Richard Biener >>> wrote: >>> >>> On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > Thanks for the comments and suggestions. > >> On May 15, 2024, at 10:00, David Malcolm >> wrote: >> >> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: >>> On Mon, 13 May 2024, Qing Zhao wrote: >>> -Warray-bounds is an important option to enable linux kernal to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication from jump threading), -Warray-bounds=1 cannot be added on by default. Although it's impossible to elinimate all the false positive warnings from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds documentation says "always out of bounds"), we should minimize the false positive warnings in -Warray-bounds=1. The root reason for the false positive warnings reported in PR109071 is: When the thread jump optimization tries to reduce the # of branches inside the routine, sometimes it needs to duplicate the code and split into two conditional pathes. for example: The original code: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; return; } With the thread jump, the above becomes: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) { warn (); *ptr = 0; // Code duplications since "warn" does return; *val = sg->vals[index]; // same this line. // In this path, since it's under the condition // "index >= 4", the compiler knows the value // of "index" is larger then 4, therefore the // out-of-bound warning. warn (); } else { *ptr = 0; *val = sg->vals[index]; } *ptr = *val; return; } We can see, after the thread jump optimization, the # of branches inside the routine "sparx5_set" is reduced from 2 to 1, however, due to the code duplication (which is needed for the correctness of the code), we got a false positive out-of-bound warning. In order to eliminate such false positive out-of-bound warning, A. Add one more flag for GIMPLE: is_splitted. B. During the thread jump optimization, when the basic blocks are duplicated, mark all the STMTs inside the original and duplicated basic blocks as "is_splitted"; C. Inside the array bound checker, add the following new heuristic: If 1. the stmt is duplicated and splitted into two conditional paths; + 2. the warning level < 2; + 3. the current block is not dominating the exit block Then not report the warning. The false positive warnings are moved from -Warray-bounds=1 to -Warray-bounds=2 now. Bootstrapped and regression tested on both x86 and aarch64. adjusted -Warray-bounds-61.c due to the false positive warnings. Let me know if you have any comments and suggestions. >>> >>> At the last Cauldron I talked with David Malcolm about these kind >>> of >>> issues and thought of instead of suppressing diagnostics to >>> record >>> how a block was duplicated. For jump threading my idea was to >>> record >>> the condition that was proved true when entering the path and do >>> this >>> by recording the corresponding locations > > Is only recording the location for the TRUE path enough? > We might need to record the corresponding locations for both TRUE and > FALSE paths since the VRP might be more accurate on both paths. > Is only recording the location is enough? > Do we need to record the pointer to the original condition stmt? Just to be
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote: > > > > > On May 22, 2024, at 03:38, Richard Biener > > wrote: > > > > On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: > >> > >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > >>> Thanks for the comments and suggestions. > >>> > On May 15, 2024, at 10:00, David Malcolm > wrote: > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > > On Mon, 13 May 2024, Qing Zhao wrote: > > > >> -Warray-bounds is an important option to enable linux kernal to > >> keep > >> the array out-of-bound errors out of the source tree. > >> > >> However, due to the false positive warnings reported in > >> PR109071 > >> (-Warray-bounds false positive warnings due to code duplication > >> from > >> jump threading), -Warray-bounds=1 cannot be added on by > >> default. > >> > >> Although it's impossible to elinimate all the false positive > >> warnings > >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > >> documentation says "always out of bounds"), we should minimize > >> the > >> false positive warnings in -Warray-bounds=1. > >> > >> The root reason for the false positive warnings reported in > >> PR109071 is: > >> > >> When the thread jump optimization tries to reduce the # of > >> branches > >> inside the routine, sometimes it needs to duplicate the code > >> and > >> split into two conditional pathes. for example: > >> > >> The original code: > >> > >> void sparx5_set (int * ptr, struct nums * sg, int index) > >> { > >> if (index >= 4) > >>warn (); > >> *ptr = 0; > >> *val = sg->vals[index]; > >> if (index >= 4) > >>warn (); > >> *ptr = *val; > >> > >> return; > >> } > >> > >> With the thread jump, the above becomes: > >> > >> void sparx5_set (int * ptr, struct nums * sg, int index) > >> { > >> if (index >= 4) > >>{ > >> warn (); > >> *ptr = 0; // Code duplications since "warn" does > >> return; > >> *val = sg->vals[index]; // same this line. > >>// In this path, since it's > >> under > >> the condition > >>// "index >= 4", the compiler > >> knows > >> the value > >>// of "index" is larger then 4, > >> therefore the > >>// out-of-bound warning. > >> warn (); > >>} > >> else > >>{ > >> *ptr = 0; > >> *val = sg->vals[index]; > >>} > >> *ptr = *val; > >> return; > >> } > >> > >> We can see, after the thread jump optimization, the # of > >> branches > >> inside > >> the routine "sparx5_set" is reduced from 2 to 1, however, due > >> to > >> the > >> code duplication (which is needed for the correctness of the > >> code), > >> we > >> got a false positive out-of-bound warning. > >> > >> In order to eliminate such false positive out-of-bound warning, > >> > >> A. Add one more flag for GIMPLE: is_splitted. > >> B. During the thread jump optimization, when the basic blocks > >> are > >> duplicated, mark all the STMTs inside the original and > >> duplicated > >> basic blocks as "is_splitted"; > >> C. Inside the array bound checker, add the following new > >> heuristic: > >> > >> If > >> 1. the stmt is duplicated and splitted into two conditional > >> paths; > >> + 2. the warning level < 2; > >> + 3. the current block is not dominating the exit block > >> Then not report the warning. > >> > >> The false positive warnings are moved from -Warray-bounds=1 to > >> -Warray-bounds=2 now. > >> > >> Bootstrapped and regression tested on both x86 and aarch64. > >> adjusted > >> -Warray-bounds-61.c due to the false positive warnings. > >> > >> Let me know if you have any comments and suggestions. > > > > At the last Cauldron I talked with David Malcolm about these kind > > of > > issues and thought of instead of suppressing diagnostics to > > record > > how a block was duplicated. For jump threading my idea was to > > record > > the condition that was proved true when entering the path and do > > this > > by recording the corresponding locations > >>> > >>> Is only recording the location for the TRUE path enough? > >>> We might need to record the corresponding locations for both TRUE and > >>> FALSE paths since the VRP might be more accurate on both paths. > >>> Is only recording the location is enough? > >>> Do we need to record the pointer to the original condition stmt? > >> > >> Just to be clear: I don't plan to work on this myself (I have far too > >>
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 22, 2024, at 03:38, Richard Biener wrote: > > On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: >> >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: >>> Thanks for the comments and suggestions. >>> On May 15, 2024, at 10:00, David Malcolm wrote: On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > On Mon, 13 May 2024, Qing Zhao wrote: > >> -Warray-bounds is an important option to enable linux kernal to >> keep >> the array out-of-bound errors out of the source tree. >> >> However, due to the false positive warnings reported in >> PR109071 >> (-Warray-bounds false positive warnings due to code duplication >> from >> jump threading), -Warray-bounds=1 cannot be added on by >> default. >> >> Although it's impossible to elinimate all the false positive >> warnings >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds >> documentation says "always out of bounds"), we should minimize >> the >> false positive warnings in -Warray-bounds=1. >> >> The root reason for the false positive warnings reported in >> PR109071 is: >> >> When the thread jump optimization tries to reduce the # of >> branches >> inside the routine, sometimes it needs to duplicate the code >> and >> split into two conditional pathes. for example: >> >> The original code: >> >> void sparx5_set (int * ptr, struct nums * sg, int index) >> { >> if (index >= 4) >>warn (); >> *ptr = 0; >> *val = sg->vals[index]; >> if (index >= 4) >>warn (); >> *ptr = *val; >> >> return; >> } >> >> With the thread jump, the above becomes: >> >> void sparx5_set (int * ptr, struct nums * sg, int index) >> { >> if (index >= 4) >>{ >> warn (); >> *ptr = 0; // Code duplications since "warn" does >> return; >> *val = sg->vals[index]; // same this line. >>// In this path, since it's >> under >> the condition >>// "index >= 4", the compiler >> knows >> the value >>// of "index" is larger then 4, >> therefore the >>// out-of-bound warning. >> warn (); >>} >> else >>{ >> *ptr = 0; >> *val = sg->vals[index]; >>} >> *ptr = *val; >> return; >> } >> >> We can see, after the thread jump optimization, the # of >> branches >> inside >> the routine "sparx5_set" is reduced from 2 to 1, however, due >> to >> the >> code duplication (which is needed for the correctness of the >> code), >> we >> got a false positive out-of-bound warning. >> >> In order to eliminate such false positive out-of-bound warning, >> >> A. Add one more flag for GIMPLE: is_splitted. >> B. During the thread jump optimization, when the basic blocks >> are >> duplicated, mark all the STMTs inside the original and >> duplicated >> basic blocks as "is_splitted"; >> C. Inside the array bound checker, add the following new >> heuristic: >> >> If >> 1. the stmt is duplicated and splitted into two conditional >> paths; >> + 2. the warning level < 2; >> + 3. the current block is not dominating the exit block >> Then not report the warning. >> >> The false positive warnings are moved from -Warray-bounds=1 to >> -Warray-bounds=2 now. >> >> Bootstrapped and regression tested on both x86 and aarch64. >> adjusted >> -Warray-bounds-61.c due to the false positive warnings. >> >> Let me know if you have any comments and suggestions. > > At the last Cauldron I talked with David Malcolm about these kind > of > issues and thought of instead of suppressing diagnostics to > record > how a block was duplicated. For jump threading my idea was to > record > the condition that was proved true when entering the path and do > this > by recording the corresponding locations >>> >>> Is only recording the location for the TRUE path enough? >>> We might need to record the corresponding locations for both TRUE and >>> FALSE paths since the VRP might be more accurate on both paths. >>> Is only recording the location is enough? >>> Do we need to record the pointer to the original condition stmt? >> >> Just to be clear: I don't plan to work on this myself (I have far too >> much already to work on...); I'm assuming Richard Biener is >> hoping/planning to implement this himself. > > While I think some of this might be an improvement to those vast > number of "false positive" diagnostics we have from (too) late diagnostic > passes I do not have the cycles to work on this. I can study a
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > > Thanks for the comments and suggestions. > > > > > On May 15, 2024, at 10:00, David Malcolm > > > wrote: > > > > > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > > > > On Mon, 13 May 2024, Qing Zhao wrote: > > > > > > > > > -Warray-bounds is an important option to enable linux kernal to > > > > > keep > > > > > the array out-of-bound errors out of the source tree. > > > > > > > > > > However, due to the false positive warnings reported in > > > > > PR109071 > > > > > (-Warray-bounds false positive warnings due to code duplication > > > > > from > > > > > jump threading), -Warray-bounds=1 cannot be added on by > > > > > default. > > > > > > > > > > Although it's impossible to elinimate all the false positive > > > > > warnings > > > > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > > > > > documentation says "always out of bounds"), we should minimize > > > > > the > > > > > false positive warnings in -Warray-bounds=1. > > > > > > > > > > The root reason for the false positive warnings reported in > > > > > PR109071 is: > > > > > > > > > > When the thread jump optimization tries to reduce the # of > > > > > branches > > > > > inside the routine, sometimes it needs to duplicate the code > > > > > and > > > > > split into two conditional pathes. for example: > > > > > > > > > > The original code: > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > > > > { > > > > > if (index >= 4) > > > > > warn (); > > > > > *ptr = 0; > > > > > *val = sg->vals[index]; > > > > > if (index >= 4) > > > > > warn (); > > > > > *ptr = *val; > > > > > > > > > > return; > > > > > } > > > > > > > > > > With the thread jump, the above becomes: > > > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > > > > { > > > > > if (index >= 4) > > > > > { > > > > > warn (); > > > > > *ptr = 0; // Code duplications since "warn" does > > > > > return; > > > > > *val = sg->vals[index]; // same this line. > > > > > // In this path, since it's > > > > > under > > > > > the condition > > > > > // "index >= 4", the compiler > > > > > knows > > > > > the value > > > > > // of "index" is larger then 4, > > > > > therefore the > > > > > // out-of-bound warning. > > > > > warn (); > > > > > } > > > > > else > > > > > { > > > > > *ptr = 0; > > > > > *val = sg->vals[index]; > > > > > } > > > > > *ptr = *val; > > > > > return; > > > > > } > > > > > > > > > > We can see, after the thread jump optimization, the # of > > > > > branches > > > > > inside > > > > > the routine "sparx5_set" is reduced from 2 to 1, however, due > > > > > to > > > > > the > > > > > code duplication (which is needed for the correctness of the > > > > > code), > > > > > we > > > > > got a false positive out-of-bound warning. > > > > > > > > > > In order to eliminate such false positive out-of-bound warning, > > > > > > > > > > A. Add one more flag for GIMPLE: is_splitted. > > > > > B. During the thread jump optimization, when the basic blocks > > > > > are > > > > >duplicated, mark all the STMTs inside the original and > > > > > duplicated > > > > >basic blocks as "is_splitted"; > > > > > C. Inside the array bound checker, add the following new > > > > > heuristic: > > > > > > > > > > If > > > > >1. the stmt is duplicated and splitted into two conditional > > > > > paths; > > > > > + 2. the warning level < 2; > > > > > + 3. the current block is not dominating the exit block > > > > > Then not report the warning. > > > > > > > > > > The false positive warnings are moved from -Warray-bounds=1 to > > > > > -Warray-bounds=2 now. > > > > > > > > > > Bootstrapped and regression tested on both x86 and aarch64. > > > > > adjusted > > > > > -Warray-bounds-61.c due to the false positive warnings. > > > > > > > > > > Let me know if you have any comments and suggestions. > > > > > > > > At the last Cauldron I talked with David Malcolm about these kind > > > > of > > > > issues and thought of instead of suppressing diagnostics to > > > > record > > > > how a block was duplicated. For jump threading my idea was to > > > > record > > > > the condition that was proved true when entering the path and do > > > > this > > > > by recording the corresponding locations > > > > Is only recording the location for the TRUE path enough? > > We might need to record the corresponding locations for both TRUE and > > FALSE paths since the VRP might be more accurate on both paths. > > Is only recording the location is enough? > > Do we need to record the pointer to the original condition stmt? > > Just to be clear: I don't plan to work on this myself (I have far too > much already to
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > Thanks for the comments and suggestions. > > > On May 15, 2024, at 10:00, David Malcolm > > wrote: > > > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > > > On Mon, 13 May 2024, Qing Zhao wrote: > > > > > > > -Warray-bounds is an important option to enable linux kernal to > > > > keep > > > > the array out-of-bound errors out of the source tree. > > > > > > > > However, due to the false positive warnings reported in > > > > PR109071 > > > > (-Warray-bounds false positive warnings due to code duplication > > > > from > > > > jump threading), -Warray-bounds=1 cannot be added on by > > > > default. > > > > > > > > Although it's impossible to elinimate all the false positive > > > > warnings > > > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > > > > documentation says "always out of bounds"), we should minimize > > > > the > > > > false positive warnings in -Warray-bounds=1. > > > > > > > > The root reason for the false positive warnings reported in > > > > PR109071 is: > > > > > > > > When the thread jump optimization tries to reduce the # of > > > > branches > > > > inside the routine, sometimes it needs to duplicate the code > > > > and > > > > split into two conditional pathes. for example: > > > > > > > > The original code: > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > > > { > > > > if (index >= 4) > > > > warn (); > > > > *ptr = 0; > > > > *val = sg->vals[index]; > > > > if (index >= 4) > > > > warn (); > > > > *ptr = *val; > > > > > > > > return; > > > > } > > > > > > > > With the thread jump, the above becomes: > > > > > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > > > { > > > > if (index >= 4) > > > > { > > > > warn (); > > > > *ptr = 0; // Code duplications since "warn" does > > > > return; > > > > *val = sg->vals[index]; // same this line. > > > > // In this path, since it's > > > > under > > > > the condition > > > > // "index >= 4", the compiler > > > > knows > > > > the value > > > > // of "index" is larger then 4, > > > > therefore the > > > > // out-of-bound warning. > > > > warn (); > > > > } > > > > else > > > > { > > > > *ptr = 0; > > > > *val = sg->vals[index]; > > > > } > > > > *ptr = *val; > > > > return; > > > > } > > > > > > > > We can see, after the thread jump optimization, the # of > > > > branches > > > > inside > > > > the routine "sparx5_set" is reduced from 2 to 1, however, due > > > > to > > > > the > > > > code duplication (which is needed for the correctness of the > > > > code), > > > > we > > > > got a false positive out-of-bound warning. > > > > > > > > In order to eliminate such false positive out-of-bound warning, > > > > > > > > A. Add one more flag for GIMPLE: is_splitted. > > > > B. During the thread jump optimization, when the basic blocks > > > > are > > > > duplicated, mark all the STMTs inside the original and > > > > duplicated > > > > basic blocks as "is_splitted"; > > > > C. Inside the array bound checker, add the following new > > > > heuristic: > > > > > > > > If > > > > 1. the stmt is duplicated and splitted into two conditional > > > > paths; > > > > + 2. the warning level < 2; > > > > + 3. the current block is not dominating the exit block > > > > Then not report the warning. > > > > > > > > The false positive warnings are moved from -Warray-bounds=1 to > > > > -Warray-bounds=2 now. > > > > > > > > Bootstrapped and regression tested on both x86 and aarch64. > > > > adjusted > > > > -Warray-bounds-61.c due to the false positive warnings. > > > > > > > > Let me know if you have any comments and suggestions. > > > > > > At the last Cauldron I talked with David Malcolm about these kind > > > of > > > issues and thought of instead of suppressing diagnostics to > > > record > > > how a block was duplicated. For jump threading my idea was to > > > record > > > the condition that was proved true when entering the path and do > > > this > > > by recording the corresponding locations > > Is only recording the location for the TRUE path enough? > We might need to record the corresponding locations for both TRUE and > FALSE paths since the VRP might be more accurate on both paths. > Is only recording the location is enough? > Do we need to record the pointer to the original condition stmt? Just to be clear: I don't plan to work on this myself (I have far too much already to work on...); I'm assuming Richard Biener is hoping/planning to implement this himself. But feel free to ask me any questions about the diagnostic_path machinery within the diagnostics subsystem. [...snip...] Regards Dave
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
Thanks for the comments and suggestions. > On May 15, 2024, at 10:00, David Malcolm wrote: > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: >> On Mon, 13 May 2024, Qing Zhao wrote: >> >>> -Warray-bounds is an important option to enable linux kernal to >>> keep >>> the array out-of-bound errors out of the source tree. >>> >>> However, due to the false positive warnings reported in PR109071 >>> (-Warray-bounds false positive warnings due to code duplication >>> from >>> jump threading), -Warray-bounds=1 cannot be added on by default. >>> >>> Although it's impossible to elinimate all the false positive >>> warnings >>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds >>> documentation says "always out of bounds"), we should minimize the >>> false positive warnings in -Warray-bounds=1. >>> >>> The root reason for the false positive warnings reported in >>> PR109071 is: >>> >>> When the thread jump optimization tries to reduce the # of branches >>> inside the routine, sometimes it needs to duplicate the code and >>> split into two conditional pathes. for example: >>> >>> The original code: >>> >>> void sparx5_set (int * ptr, struct nums * sg, int index) >>> { >>> if (index >= 4) >>> warn (); >>> *ptr = 0; >>> *val = sg->vals[index]; >>> if (index >= 4) >>> warn (); >>> *ptr = *val; >>> >>> return; >>> } >>> >>> With the thread jump, the above becomes: >>> >>> void sparx5_set (int * ptr, struct nums * sg, int index) >>> { >>> if (index >= 4) >>> { >>> warn (); >>> *ptr = 0; // Code duplications since "warn" does >>> return; >>> *val = sg->vals[index]; // same this line. >>> // In this path, since it's under >>> the condition >>> // "index >= 4", the compiler knows >>> the value >>> // of "index" is larger then 4, >>> therefore the >>> // out-of-bound warning. >>> warn (); >>> } >>> else >>> { >>> *ptr = 0; >>> *val = sg->vals[index]; >>> } >>> *ptr = *val; >>> return; >>> } >>> >>> We can see, after the thread jump optimization, the # of branches >>> inside >>> the routine "sparx5_set" is reduced from 2 to 1, however, due to >>> the >>> code duplication (which is needed for the correctness of the code), >>> we >>> got a false positive out-of-bound warning. >>> >>> In order to eliminate such false positive out-of-bound warning, >>> >>> A. Add one more flag for GIMPLE: is_splitted. >>> B. During the thread jump optimization, when the basic blocks are >>>duplicated, mark all the STMTs inside the original and >>> duplicated >>>basic blocks as "is_splitted"; >>> C. Inside the array bound checker, add the following new heuristic: >>> >>> If >>>1. the stmt is duplicated and splitted into two conditional >>> paths; >>> + 2. the warning level < 2; >>> + 3. the current block is not dominating the exit block >>> Then not report the warning. >>> >>> The false positive warnings are moved from -Warray-bounds=1 to >>> -Warray-bounds=2 now. >>> >>> Bootstrapped and regression tested on both x86 and aarch64. >>> adjusted >>> -Warray-bounds-61.c due to the false positive warnings. >>> >>> Let me know if you have any comments and suggestions. >> >> At the last Cauldron I talked with David Malcolm about these kind of >> issues and thought of instead of suppressing diagnostics to record >> how a block was duplicated. For jump threading my idea was to record >> the condition that was proved true when entering the path and do this >> by recording the corresponding locations Is only recording the location for the TRUE path enough? We might need to record the corresponding locations for both TRUE and FALSE paths since the VRP might be more accurate on both paths. Is only recording the location is enough? Do we need to record the pointer to the original condition stmt? >> so that in the end we can >> use the diagnostic-path infrastructure to say >> >> warning: array index always above array bounds >> events 1: >> >>> 3 | if (index >= 4) >> | >> (1) when index >= 4 >> >> it would be possible to record the info as part of the ad-hoc >> location data on each duplicated stmt or, possibly simpler, >> as part of a debug stmt of new kind. >> >> I'm not sure pruning the warnings is a good thing to do. One >> would argue we should instead isolate such path as unreachable >> since it invokes undefined behavior. In particular your >> example is clearly a bug and should be diagnosed. >> >> Note very similar issues happen when unrolling a loop. >> >> Note all late diagnostics are prone to these kind of issues. > > To recap our chat at Cauldron: any GCC diagnostic can potentially have > a diagnostic_path associated with it (not just the analyzer). The > current mechanism is: > (a) use a rich_location for the diagnostic, and > (b) create
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > On Mon, 13 May 2024, Qing Zhao wrote: > > > -Warray-bounds is an important option to enable linux kernal to > > keep > > the array out-of-bound errors out of the source tree. > > > > However, due to the false positive warnings reported in PR109071 > > (-Warray-bounds false positive warnings due to code duplication > > from > > jump threading), -Warray-bounds=1 cannot be added on by default. > > > > Although it's impossible to elinimate all the false positive > > warnings > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > > documentation says "always out of bounds"), we should minimize the > > false positive warnings in -Warray-bounds=1. > > > > The root reason for the false positive warnings reported in > > PR109071 is: > > > > When the thread jump optimization tries to reduce the # of branches > > inside the routine, sometimes it needs to duplicate the code and > > split into two conditional pathes. for example: > > > > The original code: > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > { > > if (index >= 4) > > warn (); > > *ptr = 0; > > *val = sg->vals[index]; > > if (index >= 4) > > warn (); > > *ptr = *val; > > > > return; > > } > > > > With the thread jump, the above becomes: > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > { > > if (index >= 4) > > { > > warn (); > > *ptr = 0; // Code duplications since "warn" does > > return; > > *val = sg->vals[index]; // same this line. > > // In this path, since it's under > > the condition > > // "index >= 4", the compiler knows > > the value > > // of "index" is larger then 4, > > therefore the > > // out-of-bound warning. > > warn (); > > } > > else > > { > > *ptr = 0; > > *val = sg->vals[index]; > > } > > *ptr = *val; > > return; > > } > > > > We can see, after the thread jump optimization, the # of branches > > inside > > the routine "sparx5_set" is reduced from 2 to 1, however, due to > > the > > code duplication (which is needed for the correctness of the code), > > we > > got a false positive out-of-bound warning. > > > > In order to eliminate such false positive out-of-bound warning, > > > > A. Add one more flag for GIMPLE: is_splitted. > > B. During the thread jump optimization, when the basic blocks are > > duplicated, mark all the STMTs inside the original and > > duplicated > > basic blocks as "is_splitted"; > > C. Inside the array bound checker, add the following new heuristic: > > > > If > > 1. the stmt is duplicated and splitted into two conditional > > paths; > > + 2. the warning level < 2; > > + 3. the current block is not dominating the exit block > > Then not report the warning. > > > > The false positive warnings are moved from -Warray-bounds=1 to > > -Warray-bounds=2 now. > > > > Bootstrapped and regression tested on both x86 and aarch64. > > adjusted > > -Warray-bounds-61.c due to the false positive warnings. > > > > Let me know if you have any comments and suggestions. > > At the last Cauldron I talked with David Malcolm about these kind of > issues and thought of instead of suppressing diagnostics to record > how a block was duplicated. For jump threading my idea was to record > the condition that was proved true when entering the path and do this > by recording the corresponding locations so that in the end we can > use the diagnostic-path infrastructure to say > > warning: array index always above array bounds > events 1: > > > 3 | if (index >= 4) > | > (1) when index >= 4 > > it would be possible to record the info as part of the ad-hoc > location data on each duplicated stmt or, possibly simpler, > as part of a debug stmt of new kind. > > I'm not sure pruning the warnings is a good thing to do. One > would argue we should instead isolate such path as unreachable > since it invokes undefined behavior. In particular your > example is clearly a bug and should be diagnosed. > > Note very similar issues happen when unrolling a loop. > > Note all late diagnostics are prone to these kind of issues. To recap our chat at Cauldron: any GCC diagnostic can potentially have a diagnostic_path associated with it (not just the analyzer). The current mechanism is: (a) use a rich_location for the diagnostic, and (b) create an instance of some diagnostic_path subclass (e.g. simple_diagnostic_path, or something else), and (c) call richloc.set_path (); to associate the path with the rich_location See gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_paths.c for an example of using simple_diagnostic_path that doesn't use the analyzer. If we want *every* late diagnostic to potentially have a path, it sounds like we might want some extra infrastructure (perhaps
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 15, 2024, at 02:09, Richard Biener wrote: > > On Tue, 14 May 2024, Qing Zhao wrote: > >> >> >>> On May 14, 2024, at 13:14, Richard Biener wrote: >>> >>> On Tue, 14 May 2024, Qing Zhao wrote: >>> > On May 14, 2024, at 10:29, Richard Biener wrote: > >>> [...] > It would of course > need experimenting since we can end up moving stmts and merging blocks > though the linear traces created by jump threading should be quite > stable (as opposed to say the unrolling case where multiple instances > of the loop body likely will end up in the exact same basic block). Do you mean, for loop unrolling the approach with one extra stmt for one basic block might be even harder and unreliable? >>> >>> The question is whether the stmt marks the whole block or whether we >>> for example add both a START and END stmt covering a copied path. >>> I would guess for unrolling we need definitely need to do the latter >>> (so we can diagnose "on the 3rd iteration of an unrolled loop" or >>> similar). >> >> Okay. I see. >> >> Is it possible that the START and END stmts might be moved around and >> out-of-place by the different optimizations? > > There is nothign preventing stmts to be moved across START or END. Then we have to add some artificial data dependency or memory barrier at START and END to prevent such transformation. However, this might also prevent some useful transformation, therefore impact the performance… Not sure whether this is a good approach… Yes, some experiments might need to be done to compare the cost of these different approaches. Qing > > Richard.
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 13:14, Richard Biener wrote: > > > > On Tue, 14 May 2024, Qing Zhao wrote: > > > >> > >> > >>> On May 14, 2024, at 10:29, Richard Biener wrote: > >>> > > [...] > >>> It would of course > >>> need experimenting since we can end up moving stmts and merging blocks > >>> though the linear traces created by jump threading should be quite > >>> stable (as opposed to say the unrolling case where multiple instances > >>> of the loop body likely will end up in the exact same basic block). > >> > >> Do you mean, for loop unrolling the approach with one extra stmt for one > >> basic block might be even harder and unreliable? > > > > The question is whether the stmt marks the whole block or whether we > > for example add both a START and END stmt covering a copied path. > > I would guess for unrolling we need definitely need to do the latter > > (so we can diagnose "on the 3rd iteration of an unrolled loop" or > > similar). > > Okay. I see. > > Is it possible that the START and END stmts might be moved around and > out-of-place by the different optimizations? There is nothign preventing stmts to be moved across START or END. Richard.
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, 14 May 2024, Kees Cook wrote: > On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote: > > The current major issue with the warning is: the constant index value 4 > > is not in the source code, it’s a compiler generated intermediate value > > (even though it’s a correct value -:)). Such warning messages confuse > > the end-users with information that cannot be connected directly to the > > source code. > > Right -- this "4" comes from -fsanitize=array-bounds (in "warn but > continue" mode). > > Now, the minimized PoC shows a situation that triggers the situation, but > I think it's worth looking at the original code that caused this false > positive: > > for (i = 0; i < sg->num_entries; i++) { > gce = >gce[i]; > > > The issue here is that sg->num_entries has already been bounds-checked > in a separate function. As a result, the value range tracking for "i" > here is unbounded. > > Enabling -fsanitize=array-bounds means the sg->gce[i] access gets > instrumented, and suddenly "i" gains an implicit range, induced by the > sanitizer. > > (I would point out that this is very similar to the problems we've had > with -fsanitize=shift[1][2]: the sanitizer induces a belief about a > given variable's range this isn't true.) > > Now, there is an argument to be made that the original code should be > doing: > > for (i = 0; i < 4 && i < sg->num_entries; i++) { > > But this is: > > a) logically redundant (Linux maintainers don't tend to like duplicating >their range checking) > > b) a very simple case > > The point of the sanitizers is to catch "impossible" situations at > run-time for the cases where some value may end up out of range. Having > it _induce_ a range on the resulting code makes no sense. > > Could we, perhaps, have sanitizer code not influence the value range > tracking? That continues to look like the root cause for these things. The sanitizer code adds checks that are not distinguishable from user code exactly because we want value-range analysis to eventually elide even (redundant) sanitizer checks. I think the fix for the source when there's a-priori knowledge of sg->num_entries is to assert that knowledge through language features or when using C through GNU extensions like assert() using __builtin_unreachable (). That also serves documentation purposes "this code expects sg->num_entries to be bounds-checked". To me it doesn't make much sense to mix sanitizing of array accesses and at the same time do -Warray-bound diagnostics. Note I tried to teach jump threading to be less aggressive threading paths to exceptional situations (I think the sanitizer runtime calls are at least marked unlikely), but the comment was that even those are very much desired but I can't remember the details. This was as part of PR111515 but I think I've run into this earlier when trying to improve back_threader_profitability::possibly_profitable_path_p. There we have /* Threading is profitable if the path duplicated is hot but also in a case we separate cold path from hot path and permit optimization of the hot path later. Be on the agressive side here. In some testcases, as in PR 78407 this leads to noticeable improvements. */ here we have if (A) unlikely(); B; if (A) unlikely (); and we choose to perform that path separation which optimizes the not exceptional path which automatically separates the exceptional path as well. IMO that sanitizer mode that continues running is bad - it makes the compiler aware of undefined behavior and make the code run into it with open eyes. You get what you asked for. Richard.
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote: > The current major issue with the warning is: the constant index value 4 > is not in the source code, it’s a compiler generated intermediate value > (even though it’s a correct value -:)). Such warning messages confuse > the end-users with information that cannot be connected directly to the > source code. Right -- this "4" comes from -fsanitize=array-bounds (in "warn but continue" mode). Now, the minimized PoC shows a situation that triggers the situation, but I think it's worth looking at the original code that caused this false positive: for (i = 0; i < sg->num_entries; i++) { gce = >gce[i]; The issue here is that sg->num_entries has already been bounds-checked in a separate function. As a result, the value range tracking for "i" here is unbounded. Enabling -fsanitize=array-bounds means the sg->gce[i] access gets instrumented, and suddenly "i" gains an implicit range, induced by the sanitizer. (I would point out that this is very similar to the problems we've had with -fsanitize=shift[1][2]: the sanitizer induces a belief about a given variable's range this isn't true.) Now, there is an argument to be made that the original code should be doing: for (i = 0; i < 4 && i < sg->num_entries; i++) { But this is: a) logically redundant (Linux maintainers don't tend to like duplicating their range checking) b) a very simple case The point of the sanitizers is to catch "impossible" situations at run-time for the cases where some value may end up out of range. Having it _induce_ a range on the resulting code makes no sense. Could we, perhaps, have sanitizer code not influence the value range tracking? That continues to look like the root cause for these things. -Kees [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679 [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108306 -- Kees Cook
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 14, 2024, at 13:14, Richard Biener wrote: > > On Tue, 14 May 2024, Qing Zhao wrote: > >> >> >>> On May 14, 2024, at 10:29, Richard Biener wrote: >>> > [...] >>> It would of course >>> need experimenting since we can end up moving stmts and merging blocks >>> though the linear traces created by jump threading should be quite >>> stable (as opposed to say the unrolling case where multiple instances >>> of the loop body likely will end up in the exact same basic block). >> >> Do you mean, for loop unrolling the approach with one extra stmt for one >> basic block might be even harder and unreliable? > > The question is whether the stmt marks the whole block or whether we > for example add both a START and END stmt covering a copied path. > I would guess for unrolling we need definitely need to do the latter > (so we can diagnose "on the 3rd iteration of an unrolled loop" or > similar). Okay. I see. Is it possible that the START and END stmts might be moved around and out-of-place by the different optimizations? Qing > > Richard. >
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 10:29, Richard Biener wrote: > > [...] > > It would of course > > need experimenting since we can end up moving stmts and merging blocks > > though the linear traces created by jump threading should be quite > > stable (as opposed to say the unrolling case where multiple instances > > of the loop body likely will end up in the exact same basic block). > > Do you mean, for loop unrolling the approach with one extra stmt for one > basic block might be even harder and unreliable? The question is whether the stmt marks the whole block or whether we for example add both a START and END stmt covering a copied path. I would guess for unrolling we need definitely need to do the latter (so we can diagnose "on the 3rd iteration of an unrolled loop" or similar). Richard.
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Mon, May 13, 2024 at 07:48:30PM +, Qing Zhao wrote: > The false positive warnings are moved from -Warray-bounds=1 to > -Warray-bounds=2 now. On a Linux kernel x86_64 allmodconfig build, this takes the -Warray-bounds warnings from 14 to 9. After examining these 9, I see: - 4: legitimate bugs in Linux source (3 distinct, 1 repeat). I'll be reporting/fixing these in Linux. - 4: 4 instances of 1 case of buggy interaction with -fsanitize=shift, looking similar to these past bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108306 the difference being operating on an enum. I will reduce the case and open a bug report for it. - 1: still examining. It looks like a false positive so far. Thanks! -Kees -- Kees Cook
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 14, 2024, at 11:08, Jeff Law wrote: > > > > On 5/14/24 8:57 AM, Qing Zhao wrote: >>> On May 13, 2024, at 20:14, Kees Cook wrote: >>> >>> On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: > But it makes no sense to warn about: > > void sparx5_set (int * ptr, struct nums * sg, int index) > { > if (index >= 4) > warn (); > *ptr = 0; > *val = sg->vals[index]; > if (index >= 4) > warn (); > *ptr = *val; > } > > Because at "*val = sg->vals[index];" the actual value range tracking for > index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the > "if" statements is the range tracking [4,INT_MAX].) > > However, in the case where jump threading has split the execution flow > and produced a copy of "*val = sg->vals[index];" where the value range > tracking for "index" is now [4,INT_MAX], is the warning valid. But it > is only for that instance. Reporting it for effectively both (there is > only 1 source line for the array indexing) is misleading because there > is nothing the user can do about it -- the compiler created the copy and > then noticed it had a range it could apply to that array index. > "there is nothing the user can do about it" is very much false. They could change warn call into a noreturn function call instead. (In the case of the Linux kernel panic). There are things the user can do to fix the warning and even get better code generation out of the compilers. >>> >>> This isn't about warn() not being noreturn. The warn() could be any >>> function call; the jump threading still happens. >> When the program is executed on the “if (index > = 4)” path, the value of >> “index” is definitely >>> = 4, when sg->vals[index] is referenced on this path (the case when the >>> routine “warn” is NOT noreturn), it’s >> definitely an out-of-bounds array access. So, the compiler’s warning is >> correct. And this warning does catch >> a potential issue in the source code that need to be fixed by either of the >> following two solutions: >>1. Make the routine “warn” as noreturn and mark it noreturn; > This would be my recommendation. We're about to execute undefined behavior. > I don't see a way to necessarily recover safely here, so I'd suggest having > warn() not return and mark it appropriately. > > That'll have numerous secondary benefits as well. Yes, agreed. Such source code change should be a nice security improvement to the linux kernel source code. Qing > > jeff
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 14, 2024, at 10:29, Richard Biener wrote: > > On Tue, 14 May 2024, Qing Zhao wrote: > >> >> >>> On May 14, 2024, at 09:08, Richard Biener wrote: >>> >>> On Mon, 13 May 2024, Qing Zhao wrote: >>> -Warray-bounds is an important option to enable linux kernal to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication from jump threading), -Warray-bounds=1 cannot be added on by default. Although it's impossible to elinimate all the false positive warnings from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds documentation says "always out of bounds"), we should minimize the false positive warnings in -Warray-bounds=1. The root reason for the false positive warnings reported in PR109071 is: When the thread jump optimization tries to reduce the # of branches inside the routine, sometimes it needs to duplicate the code and split into two conditional pathes. for example: The original code: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; return; } With the thread jump, the above becomes: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) { warn (); *ptr = 0; // Code duplications since "warn" does return; *val = sg->vals[index]; // same this line. // In this path, since it's under the condition // "index >= 4", the compiler knows the value // of "index" is larger then 4, therefore the // out-of-bound warning. warn (); } else { *ptr = 0; *val = sg->vals[index]; } *ptr = *val; return; } We can see, after the thread jump optimization, the # of branches inside the routine "sparx5_set" is reduced from 2 to 1, however, due to the code duplication (which is needed for the correctness of the code), we got a false positive out-of-bound warning. In order to eliminate such false positive out-of-bound warning, A. Add one more flag for GIMPLE: is_splitted. B. During the thread jump optimization, when the basic blocks are duplicated, mark all the STMTs inside the original and duplicated basic blocks as "is_splitted"; C. Inside the array bound checker, add the following new heuristic: If 1. the stmt is duplicated and splitted into two conditional paths; + 2. the warning level < 2; + 3. the current block is not dominating the exit block Then not report the warning. The false positive warnings are moved from -Warray-bounds=1 to -Warray-bounds=2 now. Bootstrapped and regression tested on both x86 and aarch64. adjusted -Warray-bounds-61.c due to the false positive warnings. Let me know if you have any comments and suggestions. >>> >>> At the last Cauldron I talked with David Malcolm about these kind of >>> issues and thought of instead of suppressing diagnostics to record >>> how a block was duplicated. For jump threading my idea was to record >>> the condition that was proved true when entering the path and do this >>> by recording the corresponding locations so that in the end we can >>> use the diagnostic-path infrastructure to say >>> >>> warning: array index always above array bounds >>> events 1: >>> >>> | 3 | if (index >= 4) >>>| >>> (1) when index >= 4 > > As it's been quite some time I think I remeber that I thought of > constructing the diagnostic path at jump threading time and associating > that with the location. But I don't remember exactly where I wanted to > put it - I think it was on an extra stmt to avoid having too many > ad-hoc locations as I'm not sure of their cost. Yes, an extra stmt for each basic block might be less memory cost than an extra location info for each stmt. The major concern with the extra stmt for each basic block is how to keep this special stmt in place after stmts moving and basic block merging as you mentioned below. During my debugging of bug 109071, I noticed that the duplicated basic blocks are merged with old blocks to form new blocks. And the original block is deleted. I expected that the implementation with the extra stmt for each basic block might be much harder and error prone. > It would of course > need experimenting since we can end up moving stmts and merging blocks > though the linear traces created by jump threading should be quite > stable (as opposed to say the unrolling case where multiple instances > of the loop body likely will end up in the exact same basic block). Do you mean, for loop
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On 5/14/24 8:57 AM, Qing Zhao wrote: On May 13, 2024, at 20:14, Kees Cook wrote: On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: But it makes no sense to warn about: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; } Because at "*val = sg->vals[index];" the actual value range tracking for index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the "if" statements is the range tracking [4,INT_MAX].) However, in the case where jump threading has split the execution flow and produced a copy of "*val = sg->vals[index];" where the value range tracking for "index" is now [4,INT_MAX], is the warning valid. But it is only for that instance. Reporting it for effectively both (there is only 1 source line for the array indexing) is misleading because there is nothing the user can do about it -- the compiler created the copy and then noticed it had a range it could apply to that array index. "there is nothing the user can do about it" is very much false. They could change warn call into a noreturn function call instead. (In the case of the Linux kernel panic). There are things the user can do to fix the warning and even get better code generation out of the compilers. This isn't about warn() not being noreturn. The warn() could be any function call; the jump threading still happens. When the program is executed on the “if (index > = 4)” path, the value of “index” is definitely = 4, when sg->vals[index] is referenced on this path (the case when the routine “warn” is NOT noreturn), it’s definitely an out-of-bounds array access. So, the compiler’s warning is correct. And this warning does catch a potential issue in the source code that need to be fixed by either of the following two solutions: 1. Make the routine “warn” as noreturn and mark it noreturn; This would be my recommendation. We're about to execute undefined behavior. I don't see a way to necessarily recover safely here, so I'd suggest having warn() not return and mark it appropriately. That'll have numerous secondary benefits as well. jeff
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 13, 2024, at 20:14, Kees Cook wrote: > > On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: >> On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: >>> But it makes no sense to warn about: >>> >>> void sparx5_set (int * ptr, struct nums * sg, int index) >>> { >>> if (index >= 4) >>> warn (); >>> *ptr = 0; >>> *val = sg->vals[index]; >>> if (index >= 4) >>> warn (); >>> *ptr = *val; >>> } >>> >>> Because at "*val = sg->vals[index];" the actual value range tracking for >>> index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the >>> "if" statements is the range tracking [4,INT_MAX].) >>> >>> However, in the case where jump threading has split the execution flow >>> and produced a copy of "*val = sg->vals[index];" where the value range >>> tracking for "index" is now [4,INT_MAX], is the warning valid. But it >>> is only for that instance. Reporting it for effectively both (there is >>> only 1 source line for the array indexing) is misleading because there >>> is nothing the user can do about it -- the compiler created the copy and >>> then noticed it had a range it could apply to that array index. >>> >> >> "there is nothing the user can do about it" is very much false. They could >> change warn call into a noreturn function call instead. (In the case of >> the Linux kernel panic). There are things the user can do to fix the >> warning and even get better code generation out of the compilers. > > This isn't about warn() not being noreturn. The warn() could be any > function call; the jump threading still happens. When the program is executed on the “if (index > = 4)” path, the value of “index” is definitely >= 4, when sg->vals[index] is referenced on this path (the case when the >routine “warn” is NOT noreturn), it’s definitely an out-of-bounds array access. So, the compiler’s warning is correct. And this warning does catch a potential issue in the source code that need to be fixed by either of the following two solutions: 1. Make the routine “warn” as noreturn and mark it noreturn; Or 2. On the path “if (index >= 4)”, make the value of “index” in the bound of the array. With either of the above source code changes, we can fix this potential out-of-bound array access bug in the source code. Qing > > GCC is warning about a compiler-constructed situation that cannot be > reliably fixed on the source side (GCC emitting the warning is highly > unstable in these cases), since the condition is not *always* true for > the given line of code. If it is not useful to warn for "array[index]" > being out of range when "index" is always [INT_MIN,INT_MAX], then it > is not useful to warn when "index" MAY be [INT_MIN,INT_MAX] for a given > line of code. > > -Kees > > -- > Kees Cook
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 09:08, Richard Biener wrote: > > > > On Mon, 13 May 2024, Qing Zhao wrote: > > > >> -Warray-bounds is an important option to enable linux kernal to keep > >> the array out-of-bound errors out of the source tree. > >> > >> However, due to the false positive warnings reported in PR109071 > >> (-Warray-bounds false positive warnings due to code duplication from > >> jump threading), -Warray-bounds=1 cannot be added on by default. > >> > >> Although it's impossible to elinimate all the false positive warnings > >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > >> documentation says "always out of bounds"), we should minimize the > >> false positive warnings in -Warray-bounds=1. > >> > >> The root reason for the false positive warnings reported in PR109071 is: > >> > >> When the thread jump optimization tries to reduce the # of branches > >> inside the routine, sometimes it needs to duplicate the code and > >> split into two conditional pathes. for example: > >> > >> The original code: > >> > >> void sparx5_set (int * ptr, struct nums * sg, int index) > >> { > >> if (index >= 4) > >>warn (); > >> *ptr = 0; > >> *val = sg->vals[index]; > >> if (index >= 4) > >>warn (); > >> *ptr = *val; > >> > >> return; > >> } > >> > >> With the thread jump, the above becomes: > >> > >> void sparx5_set (int * ptr, struct nums * sg, int index) > >> { > >> if (index >= 4) > >>{ > >> warn (); > >> *ptr = 0; // Code duplications since "warn" does return; > >> *val = sg->vals[index]; // same this line. > >> // In this path, since it's under the condition > >> // "index >= 4", the compiler knows the value > >> // of "index" is larger then 4, therefore the > >> // out-of-bound warning. > >> warn (); > >>} > >> else > >>{ > >> *ptr = 0; > >> *val = sg->vals[index]; > >>} > >> *ptr = *val; > >> return; > >> } > >> > >> We can see, after the thread jump optimization, the # of branches inside > >> the routine "sparx5_set" is reduced from 2 to 1, however, due to the > >> code duplication (which is needed for the correctness of the code), we > >> got a false positive out-of-bound warning. > >> > >> In order to eliminate such false positive out-of-bound warning, > >> > >> A. Add one more flag for GIMPLE: is_splitted. > >> B. During the thread jump optimization, when the basic blocks are > >> duplicated, mark all the STMTs inside the original and duplicated > >> basic blocks as "is_splitted"; > >> C. Inside the array bound checker, add the following new heuristic: > >> > >> If > >> 1. the stmt is duplicated and splitted into two conditional paths; > >> + 2. the warning level < 2; > >> + 3. the current block is not dominating the exit block > >> Then not report the warning. > >> > >> The false positive warnings are moved from -Warray-bounds=1 to > >> -Warray-bounds=2 now. > >> > >> Bootstrapped and regression tested on both x86 and aarch64. adjusted > >> -Warray-bounds-61.c due to the false positive warnings. > >> > >> Let me know if you have any comments and suggestions. > > > > At the last Cauldron I talked with David Malcolm about these kind of > > issues and thought of instead of suppressing diagnostics to record > > how a block was duplicated. For jump threading my idea was to record > > the condition that was proved true when entering the path and do this > > by recording the corresponding locations so that in the end we can > > use the diagnostic-path infrastructure to say > > > > warning: array index always above array bounds > > events 1: > > > > | 3 | if (index >= 4) > > | > >(1) when index >= 4 As it's been quite some time I think I remeber that I thought of constructing the diagnostic path at jump threading time and associating that with the location. But I don't remember exactly where I wanted to put it - I think it was on an extra stmt to avoid having too many ad-hoc locations as I'm not sure of their cost. It would of course need experimenting since we can end up moving stmts and merging blocks though the linear traces created by jump threading should be quite stable (as opposed to say the unrolling case where multiple instances of the loop body likely will end up in the exact same basic block). > Yes, this is a good idea. > > The current major issue with the warning is: the constant index value 4 is > not in the source code, it’s a compiler generated intermediate value (even > though it’s a correct value -:)). Such warning messages confuse the end-users > with information that cannot be connected directly to the source code. > > With the above recorded “events” information, the warning messages should > make good sense to the end user, and also help the end user to locate the > place where the fix in the source code can be added. > > Actually, with the above warning information, the user can locate the place >
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
> On May 14, 2024, at 09:08, Richard Biener wrote: > > On Mon, 13 May 2024, Qing Zhao wrote: > >> -Warray-bounds is an important option to enable linux kernal to keep >> the array out-of-bound errors out of the source tree. >> >> However, due to the false positive warnings reported in PR109071 >> (-Warray-bounds false positive warnings due to code duplication from >> jump threading), -Warray-bounds=1 cannot be added on by default. >> >> Although it's impossible to elinimate all the false positive warnings >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds >> documentation says "always out of bounds"), we should minimize the >> false positive warnings in -Warray-bounds=1. >> >> The root reason for the false positive warnings reported in PR109071 is: >> >> When the thread jump optimization tries to reduce the # of branches >> inside the routine, sometimes it needs to duplicate the code and >> split into two conditional pathes. for example: >> >> The original code: >> >> void sparx5_set (int * ptr, struct nums * sg, int index) >> { >> if (index >= 4) >>warn (); >> *ptr = 0; >> *val = sg->vals[index]; >> if (index >= 4) >>warn (); >> *ptr = *val; >> >> return; >> } >> >> With the thread jump, the above becomes: >> >> void sparx5_set (int * ptr, struct nums * sg, int index) >> { >> if (index >= 4) >>{ >> warn (); >> *ptr = 0; // Code duplications since "warn" does return; >> *val = sg->vals[index]; // same this line. >> // In this path, since it's under the condition >> // "index >= 4", the compiler knows the value >> // of "index" is larger then 4, therefore the >> // out-of-bound warning. >> warn (); >>} >> else >>{ >> *ptr = 0; >> *val = sg->vals[index]; >>} >> *ptr = *val; >> return; >> } >> >> We can see, after the thread jump optimization, the # of branches inside >> the routine "sparx5_set" is reduced from 2 to 1, however, due to the >> code duplication (which is needed for the correctness of the code), we >> got a false positive out-of-bound warning. >> >> In order to eliminate such false positive out-of-bound warning, >> >> A. Add one more flag for GIMPLE: is_splitted. >> B. During the thread jump optimization, when the basic blocks are >> duplicated, mark all the STMTs inside the original and duplicated >> basic blocks as "is_splitted"; >> C. Inside the array bound checker, add the following new heuristic: >> >> If >> 1. the stmt is duplicated and splitted into two conditional paths; >> + 2. the warning level < 2; >> + 3. the current block is not dominating the exit block >> Then not report the warning. >> >> The false positive warnings are moved from -Warray-bounds=1 to >> -Warray-bounds=2 now. >> >> Bootstrapped and regression tested on both x86 and aarch64. adjusted >> -Warray-bounds-61.c due to the false positive warnings. >> >> Let me know if you have any comments and suggestions. > > At the last Cauldron I talked with David Malcolm about these kind of > issues and thought of instead of suppressing diagnostics to record > how a block was duplicated. For jump threading my idea was to record > the condition that was proved true when entering the path and do this > by recording the corresponding locations so that in the end we can > use the diagnostic-path infrastructure to say > > warning: array index always above array bounds > events 1: > > | 3 | if (index >= 4) > | >(1) when index >= 4 Yes, this is a good idea. The current major issue with the warning is: the constant index value 4 is not in the source code, it’s a compiler generated intermediate value (even though it’s a correct value -:)). Such warning messages confuse the end-users with information that cannot be connected directly to the source code. With the above recorded “events” information, the warning messages should make good sense to the end user, and also help the end user to locate the place where the fix in the source code can be added. Actually, with the above warning information, the user can locate the place “line 3” to add fixes as following: if (*index >= 4) { warn(); *index = 3; } I.e. [109071]$ diff t_org.c t.c 2c2 < static inline void assign(int val, int *regs, int index) --- > static inline void assign(int val, int *regs, int *index) 4c4,5 < if (index >= 4) --- > if (*index >= 4) > { 5a7,8 > *index = 3; > } 14,15c17,18 < assign(0,ptr, index); < assign(*val, ptr, index); --- > assign(0,ptr, ); > assign(*val, ptr, ); > > it would be possible to record the info as part of the ad-hoc > location data on each duplicated stmt or, possibly simpler, > as part of a debug stmt of new kind. Recording such info to each stmt might be more reliable? > > I'm not sure pruning the warnings is a good thing to do. One > would argue we should instead isolate such path as unreachable > since it invokes undefined behavior.
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Mon, 13 May 2024, Qing Zhao wrote: > -Warray-bounds is an important option to enable linux kernal to keep > the array out-of-bound errors out of the source tree. > > However, due to the false positive warnings reported in PR109071 > (-Warray-bounds false positive warnings due to code duplication from > jump threading), -Warray-bounds=1 cannot be added on by default. > > Although it's impossible to elinimate all the false positive warnings > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > documentation says "always out of bounds"), we should minimize the > false positive warnings in -Warray-bounds=1. > > The root reason for the false positive warnings reported in PR109071 is: > > When the thread jump optimization tries to reduce the # of branches > inside the routine, sometimes it needs to duplicate the code and > split into two conditional pathes. for example: > > The original code: > > void sparx5_set (int * ptr, struct nums * sg, int index) > { > if (index >= 4) > warn (); > *ptr = 0; > *val = sg->vals[index]; > if (index >= 4) > warn (); > *ptr = *val; > > return; > } > > With the thread jump, the above becomes: > > void sparx5_set (int * ptr, struct nums * sg, int index) > { > if (index >= 4) > { > warn (); > *ptr = 0; // Code duplications since "warn" does return; > *val = sg->vals[index]; // same this line. > // In this path, since it's under the condition > // "index >= 4", the compiler knows the value > // of "index" is larger then 4, therefore the > // out-of-bound warning. > warn (); > } > else > { > *ptr = 0; > *val = sg->vals[index]; > } > *ptr = *val; > return; > } > > We can see, after the thread jump optimization, the # of branches inside > the routine "sparx5_set" is reduced from 2 to 1, however, due to the > code duplication (which is needed for the correctness of the code), we > got a false positive out-of-bound warning. > > In order to eliminate such false positive out-of-bound warning, > > A. Add one more flag for GIMPLE: is_splitted. > B. During the thread jump optimization, when the basic blocks are >duplicated, mark all the STMTs inside the original and duplicated >basic blocks as "is_splitted"; > C. Inside the array bound checker, add the following new heuristic: > > If >1. the stmt is duplicated and splitted into two conditional paths; > + 2. the warning level < 2; > + 3. the current block is not dominating the exit block > Then not report the warning. > > The false positive warnings are moved from -Warray-bounds=1 to > -Warray-bounds=2 now. > > Bootstrapped and regression tested on both x86 and aarch64. adjusted > -Warray-bounds-61.c due to the false positive warnings. > > Let me know if you have any comments and suggestions. At the last Cauldron I talked with David Malcolm about these kind of issues and thought of instead of suppressing diagnostics to record how a block was duplicated. For jump threading my idea was to record the condition that was proved true when entering the path and do this by recording the corresponding locations so that in the end we can use the diagnostic-path infrastructure to say warning: array index always above array bounds events 1: | 3 | if (index >= 4) | (1) when index >= 4 it would be possible to record the info as part of the ad-hoc location data on each duplicated stmt or, possibly simpler, as part of a debug stmt of new kind. I'm not sure pruning the warnings is a good thing to do. One would argue we should instead isolate such path as unreachable since it invokes undefined behavior. In particular your example is clearly a bug and should be diagnosed. Note very similar issues happen when unrolling a loop. Note all late diagnostics are prone to these kind of issues. Richard. > Thanks. > > Qing > > > PR tree optimization/109071 > > gcc/ChangeLog: > > * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two new > arguments for the new heuristic to not issue warnings. > (array_bounds_checker::check_array_ref): Call the new prototype of the > routine check_out_of_bounds_and_warn. > (array_bounds_checker::check_mem_ref): Add one new argument for the > new heuristic to not issue warnings. > (array_bounds_checker::check_addr_expr): Call the new prototype of the > routine check_mem_ref, add new heuristic for not issue warnings. > (array_bounds_checker::check_array_bounds): Call the new prototype of > the routine check_mem_ref. > * gimple-array-bounds.h: New prototype of check_mem_ref. > * gimple.h (struct GTY): Add one new flag is_splitted for gimple. > (gimple_is_splitted_p): New function. > (gimple_set_is_splitted): New function. > *
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: > On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: > > But it makes no sense to warn about: > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > { > >if (index >= 4) > > warn (); > >*ptr = 0; > >*val = sg->vals[index]; > >if (index >= 4) > > warn (); > >*ptr = *val; > > } > > > > Because at "*val = sg->vals[index];" the actual value range tracking for > > index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the > > "if" statements is the range tracking [4,INT_MAX].) > > > > However, in the case where jump threading has split the execution flow > > and produced a copy of "*val = sg->vals[index];" where the value range > > tracking for "index" is now [4,INT_MAX], is the warning valid. But it > > is only for that instance. Reporting it for effectively both (there is > > only 1 source line for the array indexing) is misleading because there > > is nothing the user can do about it -- the compiler created the copy and > > then noticed it had a range it could apply to that array index. > > > > "there is nothing the user can do about it" is very much false. They could > change warn call into a noreturn function call instead. (In the case of > the Linux kernel panic). There are things the user can do to fix the > warning and even get better code generation out of the compilers. This isn't about warn() not being noreturn. The warn() could be any function call; the jump threading still happens. GCC is warning about a compiler-constructed situation that cannot be reliably fixed on the source side (GCC emitting the warning is highly unstable in these cases), since the condition is not *always* true for the given line of code. If it is not useful to warn for "array[index]" being out of range when "index" is always [INT_MIN,INT_MAX], then it is not useful to warn when "index" MAY be [INT_MIN,INT_MAX] for a given line of code. -Kees -- Kees Cook
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: > On Mon, May 13, 2024 at 02:46:32PM -0600, Jeff Law wrote: > > > > > > On 5/13/24 1:48 PM, Qing Zhao wrote: > > > -Warray-bounds is an important option to enable linux kernal to keep > > > the array out-of-bound errors out of the source tree. > > > > > > However, due to the false positive warnings reported in PR109071 > > > (-Warray-bounds false positive warnings due to code duplication from > > > jump threading), -Warray-bounds=1 cannot be added on by default. > > > > > > Although it's impossible to elinimate all the false positive warnings > > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > > > documentation says "always out of bounds"), we should minimize the > > > false positive warnings in -Warray-bounds=1. > > > > > > The root reason for the false positive warnings reported in PR109071 > is: > > > > > > When the thread jump optimization tries to reduce the # of branches > > > inside the routine, sometimes it needs to duplicate the code and > > > split into two conditional pathes. for example: > > > > > > The original code: > > > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > > { > > >if (index >= 4) > > > warn (); > > >*ptr = 0; > > >*val = sg->vals[index]; > > >if (index >= 4) > > > warn (); > > >*ptr = *val; > > > > > >return; > > > } > > > > > > With the thread jump, the above becomes: > > > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > > { > > >if (index >= 4) > > > { > > >warn (); > > >*ptr = 0;// Code duplications since "warn" does > return; > > >*val = sg->vals[index]; // same this line. > > > // In this path, since it's under the > condition > > > // "index >= 4", the compiler knows the > value > > > // of "index" is larger then 4, therefore > the > > > // out-of-bound warning. > > >warn (); > > > } > > >else > > > { > > >*ptr = 0; > > >*val = sg->vals[index]; > > > } > > >*ptr = *val; > > >return; > > > } > > > > > > We can see, after the thread jump optimization, the # of branches > inside > > > the routine "sparx5_set" is reduced from 2 to 1, however, due to the > > > code duplication (which is needed for the correctness of the code), we > > > got a false positive out-of-bound warning. > > > > > > In order to eliminate such false positive out-of-bound warning, > > > > > > A. Add one more flag for GIMPLE: is_splitted. > > > B. During the thread jump optimization, when the basic blocks are > > > duplicated, mark all the STMTs inside the original and duplicated > > > basic blocks as "is_splitted"; > > > C. Inside the array bound checker, add the following new heuristic: > > > > > > If > > > 1. the stmt is duplicated and splitted into two conditional paths; > > > + 2. the warning level < 2; > > > + 3. the current block is not dominating the exit block > > > Then not report the warning. > > > > > > The false positive warnings are moved from -Warray-bounds=1 to > > > -Warray-bounds=2 now. > > > > > > Bootstrapped and regression tested on both x86 and aarch64. adjusted > > > -Warray-bounds-61.c due to the false positive warnings. > > > > > > Let me know if you have any comments and suggestions. > > This sounds horribly wrong. In the code above, the warning is correct. > > It's not sensible from a user's perspective. > > If this doesn't warn: > > void sparx5_set (int * ptr, struct nums * sg, int index) > { >*ptr = 0; >*val = sg->vals[index]; >*ptr = *val; > } > > ... because the value range tracking of "index" spans [INT_MIN,INT_MAX], > and warnings based on the value range are silenced if they haven't been > clamped at all. (Otherwise warnings would be produced everywhere: only > when a limited set of values is known is it useful to produce a warning.) > > > But it makes no sense to warn about: > > void sparx5_set (int * ptr, struct nums * sg, int index) > { >if (index >= 4) > warn (); >*ptr = 0; >*val = sg->vals[index]; >if (index >= 4) > warn (); >*ptr = *val; > } > > Because at "*val = sg->vals[index];" the actual value range tracking for > index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the > "if" statements is the range tracking [4,INT_MAX].) > > However, in the case where jump threading has split the execution flow > and produced a copy of "*val = sg->vals[index];" where the value range > tracking for "index" is now [4,INT_MAX], is the warning valid. But it > is only for that instance. Reporting it for effectively both (there is > only 1 source line for the array indexing) is misleading because there > is nothing the user can do about it -- the compiler created the copy and > then noticed it had a range it could apply to that array index. > "there is
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On Mon, May 13, 2024 at 02:46:32PM -0600, Jeff Law wrote: > > > On 5/13/24 1:48 PM, Qing Zhao wrote: > > -Warray-bounds is an important option to enable linux kernal to keep > > the array out-of-bound errors out of the source tree. > > > > However, due to the false positive warnings reported in PR109071 > > (-Warray-bounds false positive warnings due to code duplication from > > jump threading), -Warray-bounds=1 cannot be added on by default. > > > > Although it's impossible to elinimate all the false positive warnings > > from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds > > documentation says "always out of bounds"), we should minimize the > > false positive warnings in -Warray-bounds=1. > > > > The root reason for the false positive warnings reported in PR109071 is: > > > > When the thread jump optimization tries to reduce the # of branches > > inside the routine, sometimes it needs to duplicate the code and > > split into two conditional pathes. for example: > > > > The original code: > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > { > >if (index >= 4) > > warn (); > >*ptr = 0; > >*val = sg->vals[index]; > >if (index >= 4) > > warn (); > >*ptr = *val; > > > >return; > > } > > > > With the thread jump, the above becomes: > > > > void sparx5_set (int * ptr, struct nums * sg, int index) > > { > >if (index >= 4) > > { > >warn (); > >*ptr = 0;// Code duplications since "warn" does return; > >*val = sg->vals[index]; // same this line. > > // In this path, since it's under the condition > > // "index >= 4", the compiler knows the value > > // of "index" is larger then 4, therefore the > > // out-of-bound warning. > >warn (); > > } > >else > > { > >*ptr = 0; > >*val = sg->vals[index]; > > } > >*ptr = *val; > >return; > > } > > > > We can see, after the thread jump optimization, the # of branches inside > > the routine "sparx5_set" is reduced from 2 to 1, however, due to the > > code duplication (which is needed for the correctness of the code), we > > got a false positive out-of-bound warning. > > > > In order to eliminate such false positive out-of-bound warning, > > > > A. Add one more flag for GIMPLE: is_splitted. > > B. During the thread jump optimization, when the basic blocks are > > duplicated, mark all the STMTs inside the original and duplicated > > basic blocks as "is_splitted"; > > C. Inside the array bound checker, add the following new heuristic: > > > > If > > 1. the stmt is duplicated and splitted into two conditional paths; > > + 2. the warning level < 2; > > + 3. the current block is not dominating the exit block > > Then not report the warning. > > > > The false positive warnings are moved from -Warray-bounds=1 to > > -Warray-bounds=2 now. > > > > Bootstrapped and regression tested on both x86 and aarch64. adjusted > > -Warray-bounds-61.c due to the false positive warnings. > > > > Let me know if you have any comments and suggestions. > This sounds horribly wrong. In the code above, the warning is correct. It's not sensible from a user's perspective. If this doesn't warn: void sparx5_set (int * ptr, struct nums * sg, int index) { *ptr = 0; *val = sg->vals[index]; *ptr = *val; } ... because the value range tracking of "index" spans [INT_MIN,INT_MAX], and warnings based on the value range are silenced if they haven't been clamped at all. (Otherwise warnings would be produced everywhere: only when a limited set of values is known is it useful to produce a warning.) But it makes no sense to warn about: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; } Because at "*val = sg->vals[index];" the actual value range tracking for index is _still_ [INT_MIN,INT_MAX]. (Only within the "then" side of the "if" statements is the range tracking [4,INT_MAX].) However, in the case where jump threading has split the execution flow and produced a copy of "*val = sg->vals[index];" where the value range tracking for "index" is now [4,INT_MAX], is the warning valid. But it is only for that instance. Reporting it for effectively both (there is only 1 source line for the array indexing) is misleading because there is nothing the user can do about it -- the compiler created the copy and then noticed it had a range it could apply to that array index. This situation makes -Warray-bounds unusable for the Linux kernel (we cannot have false positives says BDFL), but we'd *really* like to have it enabled since it usually finds real bugs. But these false positives can't be fixed on our end. :( So, moving them to -Warray-bounds=2 makes sense as that's the level
Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
On 5/13/24 1:48 PM, Qing Zhao wrote: -Warray-bounds is an important option to enable linux kernal to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication from jump threading), -Warray-bounds=1 cannot be added on by default. Although it's impossible to elinimate all the false positive warnings from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds documentation says "always out of bounds"), we should minimize the false positive warnings in -Warray-bounds=1. The root reason for the false positive warnings reported in PR109071 is: When the thread jump optimization tries to reduce the # of branches inside the routine, sometimes it needs to duplicate the code and split into two conditional pathes. for example: The original code: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; return; } With the thread jump, the above becomes: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) { warn (); *ptr = 0;// Code duplications since "warn" does return; *val = sg->vals[index]; // same this line. // In this path, since it's under the condition // "index >= 4", the compiler knows the value // of "index" is larger then 4, therefore the // out-of-bound warning. warn (); } else { *ptr = 0; *val = sg->vals[index]; } *ptr = *val; return; } We can see, after the thread jump optimization, the # of branches inside the routine "sparx5_set" is reduced from 2 to 1, however, due to the code duplication (which is needed for the correctness of the code), we got a false positive out-of-bound warning. In order to eliminate such false positive out-of-bound warning, A. Add one more flag for GIMPLE: is_splitted. B. During the thread jump optimization, when the basic blocks are duplicated, mark all the STMTs inside the original and duplicated basic blocks as "is_splitted"; C. Inside the array bound checker, add the following new heuristic: If 1. the stmt is duplicated and splitted into two conditional paths; + 2. the warning level < 2; + 3. the current block is not dominating the exit block Then not report the warning. The false positive warnings are moved from -Warray-bounds=1 to -Warray-bounds=2 now. Bootstrapped and regression tested on both x86 and aarch64. adjusted -Warray-bounds-61.c due to the false positive warnings. Let me know if you have any comments and suggestions. This sounds horribly wrong. In the code above, the warning is correct. Jeff
[RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading
-Warray-bounds is an important option to enable linux kernal to keep the array out-of-bound errors out of the source tree. However, due to the false positive warnings reported in PR109071 (-Warray-bounds false positive warnings due to code duplication from jump threading), -Warray-bounds=1 cannot be added on by default. Although it's impossible to elinimate all the false positive warnings from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds documentation says "always out of bounds"), we should minimize the false positive warnings in -Warray-bounds=1. The root reason for the false positive warnings reported in PR109071 is: When the thread jump optimization tries to reduce the # of branches inside the routine, sometimes it needs to duplicate the code and split into two conditional pathes. for example: The original code: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) warn (); *ptr = 0; *val = sg->vals[index]; if (index >= 4) warn (); *ptr = *val; return; } With the thread jump, the above becomes: void sparx5_set (int * ptr, struct nums * sg, int index) { if (index >= 4) { warn (); *ptr = 0; // Code duplications since "warn" does return; *val = sg->vals[index]; // same this line. // In this path, since it's under the condition // "index >= 4", the compiler knows the value // of "index" is larger then 4, therefore the // out-of-bound warning. warn (); } else { *ptr = 0; *val = sg->vals[index]; } *ptr = *val; return; } We can see, after the thread jump optimization, the # of branches inside the routine "sparx5_set" is reduced from 2 to 1, however, due to the code duplication (which is needed for the correctness of the code), we got a false positive out-of-bound warning. In order to eliminate such false positive out-of-bound warning, A. Add one more flag for GIMPLE: is_splitted. B. During the thread jump optimization, when the basic blocks are duplicated, mark all the STMTs inside the original and duplicated basic blocks as "is_splitted"; C. Inside the array bound checker, add the following new heuristic: If 1. the stmt is duplicated and splitted into two conditional paths; + 2. the warning level < 2; + 3. the current block is not dominating the exit block Then not report the warning. The false positive warnings are moved from -Warray-bounds=1 to -Warray-bounds=2 now. Bootstrapped and regression tested on both x86 and aarch64. adjusted -Warray-bounds-61.c due to the false positive warnings. Let me know if you have any comments and suggestions. Thanks. Qing PR tree optimization/109071 gcc/ChangeLog: * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two new arguments for the new heuristic to not issue warnings. (array_bounds_checker::check_array_ref): Call the new prototype of the routine check_out_of_bounds_and_warn. (array_bounds_checker::check_mem_ref): Add one new argument for the new heuristic to not issue warnings. (array_bounds_checker::check_addr_expr): Call the new prototype of the routine check_mem_ref, add new heuristic for not issue warnings. (array_bounds_checker::check_array_bounds): Call the new prototype of the routine check_mem_ref. * gimple-array-bounds.h: New prototype of check_mem_ref. * gimple.h (struct GTY): Add one new flag is_splitted for gimple. (gimple_is_splitted_p): New function. (gimple_set_is_splitted): New function. * tree-ssa-threadupdate.cc (set_stmts_in_bb_is_splitted): New function. (back_jt_path_registry::duplicate_thread_path): Mark all the stmts in both original and copied blocks as IS_SPLITTED. gcc/testsuite/ChangeLog: * gcc.dg/Warray-bounds-61.c: Adjust testing case. * gcc.dg/pr109071-1.c: New test. * gcc.dg/pr109071.c: New test. --- gcc/gimple-array-bounds.cc | 46 + gcc/gimple-array-bounds.h | 2 +- gcc/gimple.h| 21 +-- gcc/testsuite/gcc.dg/Warray-bounds-61.c | 6 ++-- gcc/testsuite/gcc.dg/pr109071-1.c | 22 gcc/testsuite/gcc.dg/pr109071.c | 22 gcc/tree-ssa-threadupdate.cc| 15 7 files changed, 122 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr109071-1.c create mode 100644 gcc/testsuite/gcc.dg/pr109071.c diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc index 008071cd5464..4a2975623bc1 100644 --- a/gcc/gimple-array-bounds.cc +++ b/gcc/gimple-array-bounds.cc @@ -264,7 +264,9 @@ check_out_of_bounds_and_warn (location_t location, tree ref, tree up_bound, tree up_bound_p1,