On Thu, Jan 9, 2025 at 9:08 AM Richard Biener
<[email protected]> wrote:
>
> On Wed, Jan 8, 2025 at 5:34 PM Qing Zhao <[email protected]> wrote:
> >
> >
> >
> > > On Jan 7, 2025, at 07:29, Richard Biener <[email protected]>
> > > wrote:
> > >
> > > On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao <[email protected]> wrote:
> > >>
> > >>
> > >>
> > >>> On Jan 6, 2025, at 11:01, Richard Biener <[email protected]>
> > >>> wrote:
> > >>>
> > >>> On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao <[email protected]> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>>> On Jan 6, 2025, at 09:21, Jeff Law <[email protected]> wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On 1/6/25 7:11 AM, Qing Zhao wrote:
> > >>>>>>>
> > >>>>>>> Given it doesn't cause user-visible UB, we could insert the trap
> > >>>>>>> *before* the UB-inducing statement. That would then make the
> > >>>>>>> statement unreachable and it'd get removed, avoiding the false
> > >>>>>>> positive diagnostic.
> > >>>>>> Yes, that’s a good idea.
> > >>>>>> However, in order to distinguish user-visible UB from UB in the
> > >>>>>> IL that is introduced purely by the compiler, we might need some new
> > >>>>>> marking in the IR?
> > >>>>> I don't think we've ever really tackled that question; the closest I
> > >>>>> can think of would be things like integer overflow, which we try to
> > >>>>> avoid allowing the compiler to introduce. If we take integer
> > >>>>> overflow as the model, then that would say we should be tackling this
> > >>>>> during loop unrolling.
> > >>>>
> > >>>> UB that is introduced by compiler transformation is one important
> > >>>> cause of false positive warnings.
> > >>>>
> > >>>> From my understanding, there are two approaches to tackle this problem:
> > >>>>
> > >>>> 1. Avoid generating such UB in the first place, i.e., for every
> > >>>> compiler transformation that might introduce such UB, add a check to
> > >>>> avoid generating it.
> > >>>>
> > >>>> 2. Mark the IR portions that were generated by compiler
> > >>>> transformations, then check whether the UB is compiler-generated when
> > >>>> issuing static checker warnings.
> > >>>>
> > >>>> Are there other approaches?
> > >>>
> > >>> Note unrolling doesn't introduce UB - it makes conditional UB
> > >>> "obvious".
> > >>
> > >> So, you mean this is the same issue as PR109071 (and PR85788, PR88771,
> > >> etc.), i.e., the compiler optimization makes the conditional UB that's
> > >> originally in the source code "obvious" after code duplication?
> > >>
> > >> (I need to study the testing case in PR92539 more carefully to make sure
> > >> this is the case...)
> > >>
> > >> If so, then the claimed false positive warning in PR92539 is actually a
> > >> real bug in the original source code, and my patch that introduced the
> > >> new option "-fdiagnostics-details" should also cover loop unrolling
> > >> to provide more details on the warnings introduced by loop unrolling.
> > >>
> > >>
> > >>> Note -Warray-bounds wants to
> > >>> diagnose UB, so doing path isolation and removing the UB would make
> > >>> -Warray-bounds useless.
> > >>>
> > >>> So unless the condition guarding the UB that unrolling exposes is visibly
> > >>> false to the compiler but we fail
> > >>> to exploit that (a missed optimization), there's not much that we can do.
> > >>> I think "folding" away the UB
> > >>> as Jeff proposes trades false negatives for the false positive
> > >>> diagnostics.
> > >>>
> > >>> Note the unroller knows about UB that effectively bounds the number of
> > >>> iterations, even on conditional
> > >>> paths, and it uses this to limit the number of copies _and_ to prune
> > >>> unreachable paths (exploiting the
> > >>> UB, avoiding diagnostics). But one of the limitations is that it only
> > >>> prunes paths in the last unrolled
> > >>> copy, which can be insufficient (ISTR some PR where I noticed this).
> > >>>
> > >>> That said - I think for these unroller-exposed cases of apparent false
> > >>> positives we should improve
> > >>> the path pruning in the unroller itself. For the other cases, the path
> > >>> diagnostic might help clarify
> > >>> that the UB happens on the 'n-th' iteration of the loop when some
> > >>> additional condition is true/false.
> > >>
> > >> So, the "other cases" refer to situations similar to PR109071, i.e.,
> > >> "conditional UB" in the original source code is made obvious after loop
> > >> unrolling?
> > >> Yes, for such cases, the new option I have been trying to add,
> > >> "-fdiagnostics-details", should be able to track and provide more details
> > >> on the conditions that lead to the UB.
> > >> Is this understanding correct?
> > >
> > > I think so, but I didn't look into the testcase of the referenced PR.
> >
> > I studied the test case of PR92539 in detail yesterday. The
> > following is a brief summary:
> >
> > 1. The pass that caused the issue is cunrolli.
> > Adding -fdisable-tree-cunrolli eliminates the false positive warnings.
> >
> > 2. The IR Before cunrolli:
> >
> > const char *local_iterator = beginning address of string "aa";
> > const char *last = last address of string "aa";
> >
> > for (int i = 0; i < 3; ++i)
> >   if (local_iterator != last)        // pointer comparison 1
> >     {
> >       local_iterator++;
> >       if (local_iterator != last)    // pointer comparison 2
> >         local_iterator++;
> >     }
> >
> > I think that the IR has NO UB at this point;
> > (pointer comparison 2 in the 2nd iteration of the "i" loop and
> > both pointer comparisons in the 3rd iteration of the "i" loop will
> > NOT be executed at all)
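> >
> > (To make the pseudo-IR above concrete, here is a compilable sketch of the
> > same shape - my own reconstruction for illustration, not the actual PR92539
> > testcase; the function name count_chars is made up:)
> >
> > int count_chars (void)
> > {
> >   const char *local_iterator = "aa";
> >   const char *last = local_iterator + 2;  /* the terminating NUL of "aa"  */
> >   int i;
> >
> >   for (i = 0; i < 3; ++i)
> >     if (local_iterator != last)           /* pointer comparison 1  */
> >       {
> >         local_iterator++;
> >         if (local_iterator != last)       /* pointer comparison 2  */
> >           local_iterator++;
> >       }
> >   return i;
> > }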
> >
> > ** After cunrolli (full unrolling of the above i loop):
> >
> > const char *local_iterator = beginning address of string "aa";
> > const char *last = last address of string "aa";
> > int i = 0;
> >
> > if (local_iterator != last)                      // pointer comparison 1
> >   {
> >     local_iterator++;
> >     if (local_iterator != last)                  // pointer comparison 2
> >       {
> >         local_iterator++;
> >         i++;
> >         if (local_iterator != last)              // pointer comparison 3
> >           {
> >             local_iterator++;
> >             if (local_iterator != last)          // pointer comparison 4
> >               {
> >                 local_iterator++;
> >                 i++;
> >                 if (local_iterator != last)      // pointer comparison 5
> >                   {
> >                     local_iterator++;
> >                     if (local_iterator != last)  // pointer comparison 6
> >                       {
> >                         local_iterator++;
> >                         i++;
> >                       }
> >                   }
> >               }
> >           }
> >       }
> >   }
> >
> > Again, I think that the IR has NO UB at this point.
> > (Pointer comparison 3 already compares equal, so pointer comparisons 4 and
> > later will NOT be executed at all.)
> >
> > ** The false positive warnings are issued for pointer comparisons 4, 5, and 6,
> > assuming that these comparisons will be executed at runtime, but actually
> > they will never be executed.
> >
> >
> > So, based on the above study, my impression is:
> >
> > A. The unrolling transformation didn't do anything wrong.
> > B. Only after constant propagation does the compiler know that the pointer
> > "local_iterator" points to an out-of-bound position of the string "aa" and
> > that pointer comparisons 4, 5, and 6 are invalid.
> >
> > So, I guess that Jeff's patch to identify the invalid pointer address at
> > "vrp1" is reasonable.
> >
> > What's your opinion on this?
>
> Your analysis is correct - the long-standing design issue is that
> -Warray-bounds does not distinguish conditional UB from unconditional UB
> (but that concept is somewhat wishful thinking given a function isn't
> guaranteed to be invoked either, unless it is main()). The other thing is
> that while -Warray-bounds without unrolling likely sees that the value-range
> of i_1 for &a[i_1] includes out-of-bound values, it chooses to only diagnose
> cases where the whole value-range is out-of-bound (correctly so, to avoid
> false positives just because of conservative ranges).
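>
> For illustration of that distinction, a minimal example (mine, not from the
> PR; the names a, f, and g are made up):
>
>   int a[4];
>
>   /* Conditional out-of-bounds: the value-range of i includes in-bounds
>      values, so -Warray-bounds does not diagnose the access.  */
>   int f (int i)
>   {
>     return (i >= 0 && i < 8) ? a[i] : 0;
>   }
>
>   /* Unconditional out-of-bounds: the access is invalid on every execution
>      of g, so -Warray-bounds diagnoses it.  */
>   int g (void)
>   {
>     return a[7];
>   }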
>
> I think Jeff's patch is not reasonable since it boils down to not diagnosing
> the -Warray-bounds cases but instead removing those stmts.
>
> The problem in this particular testcase is that we fail to optimize the
> compares - not because we don't exploit the visible UB but because
> we fail to compute the final value of 'last' from
>
> auto last = "aa";
> while (*last)
> ++last;
>
> One way to attack this is to recognize this pattern in niter analysis
> and record niter as strlen (last) (but it would be a "first" to have
> a niter pattern with a memory operation); another would be to
> handle this in loop_niter_by_eval and allow us to take advantage
> of the constant &"aa" initial value (that would only scale so much).
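>
> To make the missed folding concrete, a sketch (mine; the helper name
> compute_last is made up) of what either approach would enable:
>
>   static const char *
>   compute_last (void)
>   {
>     const char *last = "aa";
>     while (*last)    /* executes exactly strlen ("aa") == 2 times  */
>       ++last;
>     return last;     /* final value replacement could fold this to "aa" + 2  */
>   }
>
> Once 'last' is the known constant "aa" + 2, pointer comparison 3 in the
> unrolled copies folds to false, comparisons 4-6 become unreachable, and the
> bogus -Warray-bounds diagnostics disappear with them.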
>
> >
> > > Note the unroller explicitly tries to
> > > prune those "bugs" given it happily exploits UB when computing the
> > > number of iterations of a loop,
> > > like for
> > >
> > > int a[6];
> > > for (int i = 0; i < n; ++i)
> > >   a[i] = 1;
> > >
> > > it knows the loop can at most iterate 5 times because a[i] = 1 with i
> > > == 6 would invoke UB.
> >
> > So, the unroller has some heuristic to limit the potential UB invoked by
> > unrolling? Could you point me to the corresponding routine for this heuristic,
> > if convenient?
>
> It cuts off those via
> tree-ssa-loop-ivcanon.cc:remove_exits_and_undefined_stmts, but this is
> invoked only for the last copy,
> and it only handles stmts that we recorded bounds for via
> infer_loop_bounds_from_undefined.
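>
> My rough reading of what that achieves, sketched at the source level for the
> a[6] example above (an illustration only, not the GIMPLE the pass actually
> emits): in the last unrolled copy, a statement whose recorded bound says it
> would invoke UB is cut off with an unreachable marker, roughly
>
>   /* ...earlier unrolled copies of a[i] = 1;...  */
>   if (i < n)                      /* entry test of the last copy  */
>     {
>       __builtin_unreachable ();   /* the a[i] = 1 here could only invoke UB,
>                                      so the path is pruned instead of kept  */
>     }
>
> so the would-be out-of-bounds store never survives to be diagnosed.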
>
> It's a bit difficult to extend this to non-last copies; we'd have to
> copy incrementally, and we'd possibly disrupt the
> batching of transforms, causing compile-time issues. But I haven't
> had time to investigate.
>
> To summarize, the following things should be done to improve this and
> similar situations:
> a) improve niter analysis / final value replacement to constant fold
> 'last' in such cases
> b) improve unroll to cut out UB stmts of earlier iterations than the
> last (size estimates also do not
> consider this, so it's also a missed optimization)
> c) better explain diagnostics
d) I checked, and niter analysis doesn't consider the UB with the
{(const char *) "aa" + 1, +, 2}_1 IV - in infer_loop_bounds_from_pointer_arith
we bail out early for !expr_invariant_in_loop_p (for an unknown reason).
But we also only use conservative low/high instead of deriving low/high
from the CHREC's initial value when it is, as in this case, based on a STRING_CST
(or, in other cases, on the address of a decl). But then the code in
record_nonwrapping_iv might not be able to deal with a low of &"aa" and a
high of &"aa" + 2.
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 9e966266ca3..8c5c0852420 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -4307,10 +4312,6 @@ infer_loop_bounds_from_pointer_arith (class loop *loop, gimple *stmt)
if (!nowrap_type_p (type))
return;
- ptr = gimple_assign_rhs1 (stmt);
- if (!expr_invariant_in_loop_p (loop, ptr))
- return;
-
var = gimple_assign_rhs2 (stmt);
if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (var)))
return;
Richard.
>
> There's also the possibility to disable UB diagnostics for all unroll
> copies when unrolling used the
> iteration bound rather than the exact number of iterations (I think we
> don't record whether the
> bound was from UB or from something else). Possibly we can try to disable
> UB diagnostics only for
> stmts with UB as recorded from bounds, similar to
> remove_exits_and_undefined_stmts, but
> in a way that's easier to guarantee not to affect batching, since we're not
> inserting stmts or removing edges.
>
> Richard.
>
> > thanks.
> >
> > Qing
> >
> > >
> > > Richard.
> > >
> > >>
> > >> thanks.
> > >>
> > >> Qing
> > >>>
> > >>> So in the end Jeff - I think your patch isn't a good approach for the
> > >>> issue at hand.
> > >>>
> > >>> Richard.
> > >>>
> > >>>> The above is a very rough, initial idea at this moment.
> > >>>>
> > >>>> Qing
> > >>>>
> > >>>>>
> > >>>>> jeff
> >
> >