On Thu, Jan 9, 2025 at 9:08 AM Richard Biener
<[email protected]> wrote:
>
> On Wed, Jan 8, 2025 at 5:34 PM Qing Zhao <[email protected]> wrote:
> >
> >
> >
> > > On Jan 7, 2025, at 07:29, Richard Biener <[email protected]>
> > > wrote:
> > >
> > > On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao <[email protected]> wrote:
> > >>
> > >>
> > >>
> > >>> On Jan 6, 2025, at 11:01, Richard Biener <[email protected]>
> > >>> wrote:
> > >>>
> > >>> On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao <[email protected]> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>>> On Jan 6, 2025, at 09:21, Jeff Law <[email protected]> wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On 1/6/25 7:11 AM, Qing Zhao wrote:
> > >>>>>>>
> > >>>>>>> Given it doesn't cause user-visible UB, we could insert the trap
> > >>>>>>> *before* the UB-inducing statement. That would then make the
> > >>>>>>> statement unreachable and it'd get removed, avoiding the false
> > >>>>>>> positive diagnostic.
> > >>>>>> Yes, that’s a good idea.
> > >>>>>> However, in order to distinguish user-visible UB from UB in the
> > >>>>>> IL that is introduced purely by the compiler, we might need some new
> > >>>>>> marking in the IR?
> > >>>>> I don't think we've ever really tackled that question; the closest I
> > >>>>> can think of would be things like integer overflow, which we try to
> > >>>>> avoid allowing the compiler to introduce. If we take integer
> > >>>>> overflow as the model, then that would say we should be tackling this
> > >>>>> during loop unrolling.
> > >>>>
> > >>>> UB that is introduced by compiler transformation is one important
> > >>>> cause of false positive warnings.
> > >>>>
> > >>>> From my understanding, there are two approaches to tackle this problem:
> > >>>>
> > >>>> 1. Avoid generating such UB in the first place, i.e., for every
> > >>>> compiler transformation that might introduce such UB, add a check to
> > >>>> avoid generating it.
> > >>>>
> > >>>> 2. Mark the IR portions that were generated by compiler
> > >>>> transformations, then check whether the UB is compiler-generated when
> > >>>> issuing static checker warnings.
> > >>>>
> > >>>> Are there other approaches?
> > >>>
> > >>> Note unrolling doesn't introduce UB - it makes conditional UB
> > >>> "obvious".
> > >>
> > >> So, you mean this is the same issue as PR109071 (and PR85788, PR88771,
> > >> etc.), i.e., the compiler optimization makes the conditional UB that's
> > >> originally in the source code "obvious" after code duplication?
> > >>
> > >> (I need to study the testing case in PR92539 more carefully to make sure
> > >> this is the case...)
> > >>
> > >> If so, then the claimed false positive warning in PR92539 is actually a
> > >> real bug in the original source code, and my patch that introduced the
> > >> new option "-fdiagnostics-details" should also cover loop unrolling
> > >> to provide more details on the warnings introduced by loop unrolling.
> > >>
> > >>
> > >>> Note -Warray-bounds wants to
> > >>> diagnose UB, so doing path isolation and removing the UB would make
> > >>> -Warray-bounds useless.
> > >>>
> > >>> So unless the condition guarding the UB that unrolling exposes is visibly
> > >>> false to the compiler but we fail
> > >>> to exploit that (a missed optimization), there's not much that we can do.
> > >>> I think "folding" away the UB
> > >>> as Jeff proposes trades false negatives for the false positive
> > >>> diagnostics.
> > >>>
> > >>> Note the unroller knows about UB that effectively bounds the number of
> > >>> iterations, even on conditional
> > >>> paths, and it uses this to limit the number of copies _and_ to prune
> > >>> unreachable paths (exploiting the
> > >>> UB, avoiding diagnostics). But one of the limitations is that it only
> > >>> prunes paths in the last unrolled
> > >>> copy, which can be insufficient (ISTR some PR where I noticed this).
> > >>>
> > >>> That said - I think for these unroller-exposed cases of apparent false
> > >>> positives we should improve
> > >>> the path pruning in the unroller itself. For the other cases, the path
> > >>> diagnostic might help clarify
> > >>> that the UB happens on the 'n-th' iteration of the loop when some
> > >>> additional condition is true/false.
> > >>
> > >> So, the "other cases" refer to situations similar to PR109071, i.e.,
> > >> "conditional UB" in the original source code is made obvious after loop
> > >> unrolling?
> > >> Yes, for such cases, the new option I have been trying to add,
> > >> "-fdiagnostics-details", should be able to track and provide more details
> > >> on the conditions that lead to the UB.
> > >> Is this understanding correct?
> > >
> > > I think so, but I didn't look into the testcase of the referenced PR.
> >
> > I studied the test case of PR92539 in detail yesterday. The
> > following is a brief summary:
> >
> > 1. The pass that caused the issue is cunrolli.
> > Adding -fdisable-tree-cunrolli eliminates the false positive warnings.
> >
> > 2. The IR Before cunrolli:
> >
> > const char *local_iterator = beginning address of string "aa";
> > const char *last = last address of string "aa";
> >
> > for (int i = 0; i < 3; ++i)
> >   if (local_iterator != last)        // pointer comparison 1
> >     {
> >       local_iterator++;
> >       if (local_iterator != last)    // pointer comparison 2
> >         local_iterator++;
> >     }
> >
> > I think that the IR has NO UB at this point;
> > (pointer comparison 2 in the 2nd iteration of the "i" loop and
> > both pointer comparisons in the 3rd iteration of the "i" loop will
> > NOT be executed at all)
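> >
> > (To make the pseudo-IR above concrete, here is a compilable sketch of the
> > same shape - my own reconstruction for illustration, not the actual PR92539
> > testcase; the function name count_chars is made up:)
> >
> > int count_chars (void)
> > {
> >   const char *local_iterator = "aa";
> >   const char *last = local_iterator + 2;  /* the terminating NUL of "aa"  */
> >   int i;
> >
> >   for (i = 0; i < 3; ++i)
> >     if (local_iterator != last)           /* pointer comparison 1  */
> >       {
> >         local_iterator++;
> >         if (local_iterator != last)       /* pointer comparison 2  */
> >           local_iterator++;
> >       }
> >   return i;
> > }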
> >
> > ** After cunrolli (full unrolling of the above i loop):
> >
> > const char *local_iterator = beginning address of string "aa";
> > const char *last = last address of string "aa";
> > int i = 0;
> >
> > if (local_iterator != last)                      // pointer comparison 1
> >   {
> >     local_iterator++;
> >     if (local_iterator != last)                  // pointer comparison 2
> >       {
> >         local_iterator++;
> >         i++;
> >         if (local_iterator != last)              // pointer comparison 3
> >           {
> >             local_iterator++;
> >             if (local_iterator != last)          // pointer comparison 4
> >               {
> >                 local_iterator++;
> >                 i++;
> >                 if (local_iterator != last)      // pointer comparison 5
> >                   {
> >                     local_iterator++;
> >                     if (local_iterator != last)  // pointer comparison 6
> >                       {
> >                         local_iterator++;
> >                         i++;
> >                       }
> >                   }
> >               }
> >           }
> >       }
> >   }
> >
> > Again, I think that the IR has NO UB at this point.
> > (Pointer comparison 3 already compares equal, so pointer comparisons 4 and
> > later will NOT be executed at all.)
> >
> > ** The false positive warnings are issued for pointer comparisons 4, 5, and 6,
> > assuming that these comparisons will be executed at runtime, but actually
> > they will never be executed.
> >
> >
> > So, based on the above study, my impression is:
> >
> > A. The unrolling transformation didn't do anything wrong.
> > B. Only after constant propagation does the compiler know that the pointer
> > "local_iterator" points to an out-of-bound position of the string "aa" and
> > that pointer comparisons 4, 5, and 6 are invalid.
> >
> > So, I guess that Jeff's patch to identify the invalid pointer address at
> > "vrp1" is reasonable.
> >
> > What's your opinion on this?
>
> Your analysis is correct - the long-standing design issue is that
> -Warray-bounds does not distinguish conditional UB from unconditional UB
> (but that concept is somewhat wishful thinking given a function isn't
> guaranteed to be invoked either, unless it is main()). The other thing is
> that while -Warray-bounds without unrolling likely sees that the value-range
> of i_1 for &a[i_1] includes out-of-bound values, it chooses to only diagnose
> cases where the whole value-range is out-of-bound (correctly so, to avoid
> false positives just because of conservative ranges).
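>
> For illustration of that distinction, a minimal example (mine, not from the
> PR; the names a, f, and g are made up):
>
>   int a[4];
>
>   /* Conditional out-of-bounds: the value-range of i includes in-bounds
>      values, so -Warray-bounds does not diagnose the access.  */
>   int f (int i)
>   {
>     return (i >= 0 && i < 8) ? a[i] : 0;
>   }
>
>   /* Unconditional out-of-bounds: the access is invalid on every execution
>      of g, so -Warray-bounds diagnoses it.  */
>   int g (void)
>   {
>     return a[7];
>   }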
>
> I think Jeff's patch is not reasonable since it boils down to not diagnosing
> the -Warray-bounds cases but instead removing those stmts.
>
> The problem in this particular testcase is that we fail to optimize the
> compares - not because we don't exploit the visible UB but because
> we fail to compute the final value of 'last' from
>
> auto last = "aa";
> while (*last)
> ++last;
>
> One way to attack this is to recognize this pattern in niter analysis
> and record niter as strlen (last) (but it would be a "first" to have
> a niter pattern with a memory operation); another would be to
> handle this in loop_niter_by_eval and allow us to take advantage
> of the constant &"aa" initial value (that would only scale so much).
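>
> To make the missed folding concrete, a sketch (mine; the helper name
> compute_last is made up) of what either approach would enable:
>
>   static const char *
>   compute_last (void)
>   {
>     const char *last = "aa";
>     while (*last)    /* executes exactly strlen ("aa") == 2 times  */
>       ++last;
>     return last;     /* final value replacement could fold this to "aa" + 2  */
>   }
>
> Once 'last' is the known constant "aa" + 2, pointer comparison 3 in the
> unrolled copies folds to false, comparisons 4-6 become unreachable, and the
> bogus -Warray-bounds diagnostics disappear with them.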
>
> >
> > > Note the unroller explicitly tries to
> > > prune those "bugs" given it happily exploits UB when computing the
> > > number of iterations of a loop,
> > > like for
> > >
> > > int a[6];
> > > for (int i = 0; i < n; ++i)
> > >   a[i] = 1;
> > >
> > > it knows the loop can at most iterate 5 times because a[i] = 1 with i
> > > == 6 would invoke UB.
> >
> > So, the unroller has some heuristic to limit the potential UB invoked by
> > unrolling? Could you point me to the corresponding routine for this heuristic,
> > if convenient?
>
> It cuts off those via
> tree-ssa-loop-ivcanon.cc:remove_exits_and_undefined_stmts, but this is
> invoked only for the last copy,
> and it only handles stmts that we recorded bounds for via
> infer_loop_bounds_from_undefined.
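>
> My rough reading of what that achieves, sketched at the source level for the
> a[6] example above (an illustration only, not the GIMPLE the pass actually
> emits): in the last unrolled copy, a statement whose recorded bound says it
> would invoke UB is cut off with an unreachable marker, roughly
>
>   /* ...earlier unrolled copies of a[i] = 1;...  */
>   if (i < n)                      /* entry test of the last copy  */
>     {
>       __builtin_unreachable ();   /* the a[i] = 1 here could only invoke UB,
>                                      so the path is pruned instead of kept  */
>     }
>
> so the would-be out-of-bounds store never survives to be diagnosed.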
>
> It's a bit difficult to extend this to non-last copies; we'd have to
> copy incrementally, and we'd possibly disrupt the
> batching of transforms, causing compile-time issues. But I haven't
> had time to investigate.
>
> To summarize, the following things should be done to improve this and
> similar situations:
> a) improve niter analysis / final value replacement to constant fold
> 'last' in such cases
> b) improve unroll to cut out UB stmts of earlier iterations than the
> last (size estimates also do not
> consider this, so it's also a missed optimization)
> c) better explain diagnostics
d) I checked, and niter analysis doesn't consider the UB with the
{(const char *) "aa" + 1, +, 2}_1 IV - in infer_loop_bounds_from_pointer_arith
we bail out early for !expr_invariant_in_loop_p (for an unknown reason).
But we also only use conservative low/high instead of deriving low/high
from the CHREC's initial value when it is, as in this case, based on a STRING_CST
(or, in other cases, on the address of a decl). But then the code in
record_nonwrapping_iv might not be able to deal with a low of &"aa" and a
high of &"aa" + 2.
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 9e966266ca3..8c5c0852420 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -4307,10 +4312,6 @@ infer_loop_bounds_from_pointer_arith (class loop *loop, gimple *stmt)
if (!nowrap_type_p (type))
return;
- ptr = gimple_assign_rhs1 (stmt);
- if (!expr_invariant_in_loop_p (loop, ptr))
- return;
-
var = gimple_assign_rhs2 (stmt);
if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (var)))
return;
Richard.
>
> There's also the possibility to disable UB diagnostics for all unroll
> copies when unrolling used the
> iteration bound rather than the exact number of iterations (I think we
> don't record whether the
> bound was from UB or from something else). Possibly we can try to disable
> UB diagnostics only for
> stmts with UB as recorded from bounds, similar to
> remove_exits_and_undefined_stmts, but
> in a way that's easier to guarantee not to affect batching, since we're not
> inserting stmts or removing edges.
>
> Richard.
>
> > thanks.
> >
> > Qing
> >
> > >
> > > Richard.
> > >
> > >>
> > >> thanks.
> > >>
> > >> Qing
> > >>>
> > >>> So in the end Jeff - I think your patch isn't a good approach for the
> > >>> issue at hand.
> > >>>
> > >>> Richard.
> > >>>
> > >>>> The above is a very rough, initial idea at this moment.
> > >>>>
> > >>>> Qing
> > >>>>
> > >>>>>
> > >>>>> jeff
> >
> >