On 11/5/20 7:50 PM, Martin Sebor wrote:
On 11/5/20 5:02 PM, Andrew MacLeod wrote:
On 11/5/20 4:02 PM, Martin Sebor wrote:
On 11/5/20 12:29 PM, Martin Sebor wrote:
On 10/1/20 11:25 AM, Martin Sebor wrote:
On 10/1/20 9:34 AM, Aldy Hernandez wrote:


On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't understand  >>> in the sprintf changes so no questions from me (well, almost none).
 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions that take  >>> a directive argument, as dir->info.callstmt.  There should be no need  >>> to also add it as a new argument to the functions that now need it.
 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> +         value_range vr;
 >>> +         if (!query->range_of_expr (vr, arg, stmt))
 >>> +           vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are being used on  >> supported types (integers or pointers).  So range_of_expr will always
 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still be nice
 >>> if the added code could be written as if the range were set to
 >>> varying in this case and (ideally) reduced to just initialization:
 >>>
 >>>     value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for the sprintf  >>> pass (and maybe also strlen which also seems to use the same pattern),  >>> or it could be a value_range AKA irange ctor, or it could be a member
 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too expensive,
 >>> declaring it first and then passing it to that helper to set it
 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant?  (I.e., does
 >>> the ranger solve the problem?)
 >>>
 >>> -      /* The range below may be "inaccurate" if a constant has been  >>> -        substituted earlier for VAL by this pass that hasn't been  >>> -        propagated through the CFG.  This shoud be fixed by the new  >>> -        on-demand VRP if/when it becomes available (hopefully in
 >>> -        GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to get_range_strlen_dynamic
 >>> and other places:
 >>>
 >>> +         // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or should
 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it would be
 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not cleared whether Andrew approved all 3 parts of the patchset  >> or just the valuation part.  I'll wait for his nod before committing
 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.

 >
 > Just an observation that should be pointed out, I believe Aldy has all  > the code for converting to a ranger, but we have not pursued that any  > further yet since there is a regression due to our lack of equivalence  > processing I think?  That should be resolved in the coming month, but at  > the moment is a holdback/concern for converting these passes... iirc.

Yes.  Martin, the take away here is that the strlen/sprintf pass has been converted to the new API, but ranger is still not up and running on it (even on the branch).

With the new API, all you have to do is remove all instances of evrp_range_analyzer and replace them with a ranger. That's it. Below is an untested patch that would convert you to a ranger once it's contributed.

IIRC when I enabled the ranger for your pass a while back, there was one or two regressions due to missing equivalences, and the rest were because the tests were expecting an actual specific range, and the ranger returned a slightly different/better one. You'll need to adjust your tests.

Ack.  I'll be on the lookout for the ranger commit (if you hppen
to remember and CC me on it just in case I might miss it that would
be great).

I have applied the patch and ran some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h() is a false positive and
the reason for at least some of the regressions.  Somehow,
the conversions between int and char are causing Ranger to lose
the range.

$ cat t.c && gcc -O2 -S -Wall t.c
char a[2];

extern int x;

signed char f (int min, int max)
{
   signed char i = x;
   return i < min || max < i ? min : i;
}

void ff (signed char i)
{
   __builtin_sprintf (a, "%i", f (0, 9));   // okay
}

signed char g (signed char min, signed char max)
{
   signed char i = x;
   return i < min || max < i ? min : i;
}

void gg (void)
{
   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
}

t.c: In function ‘gg’:
t.c:24:26: warning: ‘%i’ directive writing between 1 and 4 bytes into a region of size 2 [-Wformat-overflow=]
    24 |   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
       |                          ^~
t.c:24:25: note: directive argument in the range [-128, 127]
    24 |   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
       |                         ^~~~
t.c:24:3: note: ‘__builtin_sprintf’ output between 2 and 5 bytes into a destination of size 2
    24 |   __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Another failure (the one in builtin-sprintf-warn-22.c) is this where
the false positive is due to the strlen pass somehow having lost
the upper bound on the sum of the two string lengths.

I'm guessing this might have something to do with the strlen pass
calling set_range_info() on the nonconstant results of strlen calls
it can determine the range for.  How would I go about doing it with
ranger?  I see ranger_cache::set_global_range() and the class ctor
takes a gimple_ranger as argument.  Would calling the function be
the right way to do it?  (I couldn't find anything on the Wiki.)

I haven't had time to write a new modern wiki yet... probably won't until well after stage 1 closes either.

And no, thats the internal cache of the ranger, which you dont have access to.

At the moment, there isnt a way to inject a "new" global range out of the blue.  It hasnt been a  requirement before, but wouldn't be particularly difficult.

Short answer, there is no current equivalent of set_range_info() because we dont have a persistent global cache in the same way, and the ranger doesnt set  the RANGE_INFO field at this point.. we havent gotten to the point of reworking that yet.

So the question is, what exactly are you trying to do? I presume you are analyzing  a strlen statement, and are making range conclusions about either the result or some of the operands are that are not otherwise exposed in the code?

And you are looking to have those values injected into the rangers calculations as refinements?

Yes, exactly.  This is used both for optimization and for warnings
(both to enable them and to suppress false positives).

Besides folding strlen and sprintf calls into constants, the strlen
pass sets the range of the results of the calls when the lengths of
the arguments/output are known to be at least some number of bytes.

Besides exposing optimization opportunities downstream, the pass
uses this information to either issue or suppress warnings like in
the sprintf case above.  -Wformat-overflow issues warnings either
based on string lengths (or ranges) if they're known, or based on
the sizes of the arrays the strings are stored in when nothing is
known about the lengths.  The array size is used as the worst case
estimate, which can cause false positives.

Here's an example of a downstream optimization made possible by
this.  Because i's range is known, the range of the snprintf result
is also known (the range is computed for all snprintf arguments).
The strlen pass sets the range and subsequent passes eliminate
the unreachable code.

  void f (int i)
  {
    if (i < 100 || 9999 < i)
      i = 100;

    int n = __builtin_snprintf (0, 0, "%i", i);
    if (n < 3 || 4 < n)
      __builtin_abort ();
  }

Martin

That seems like somethings that the range code for builtins should take care of rather than be calculated externally and injected

 they are just like any other  builtins that if the range of certain arguments are known, you can produce a result? I gather the range of i being known to be [100,9999] means we know something about n.  The the ranger would be able to do that anywhere and everywhere, not just within this pass.  One of the ranger goals is to centralize all range tracking

and the warning pass can continue to check the ranges and issue warnings when the ranges arent what it likes.

There may be other hoops you have to continue to jump thru, i don't know the specifics of what you deal with,  but at least the basics of ranges produced from  a builtin sounds like something that should be internal to the range engine.

Andrew

Reply via email to