Hi Jeff,

> > By the way, I hacked tree-vrp to start all value ranges for INTEGRAL_TYPE_P
> > variables to [TYPE_MIN, TYPE_MAX].  It certainly helps with eliminating many
> > Ada range checks.  Maybe the compiler will even bootstrap :)
> The thing to check will be compile-time performance -- in general
> with propagators such as VRP, CCP and CPROP it's compile-time
> advantageous to drop to a VARYING state in the lattice as quickly
> as possible.  The flipside is you can sometimes miss transformations
> when there's cases when you can use a VARYING object in an expression
> and get a useful value/range.
> 
> Basically the two extremes we need to look at are:
> 
>   1. Give everything a range, even if it's just TYPE_MIN/TYPE_MAX.  In
>      this case VR_VARYING disappears completely as it's meaningless.
> 
>   2. Anytime we're going to record a range TYPE_MIN/TYPE_MAX, drop to
>      VARYING.  The trick then becomes to find all those cases where we
>      have an expression involving a VARYING which produces a useful
>      range (such as widening typecasts, IOR with nonzero, etc etc).

I've thought of a different approach: add ASSERT_EXPRs on those subroutine
parameters, function return values and (perhaps) global variables which have
exotic TYPE_MIN/TYPE_MAX values ("exotic" means different from what you'd
expect given the signedness and TYPE_PRECISION).  The assertion is that the
variable is in the range [TYPE_MIN,TYPE_MAX].  Except for this, VRP would
completely ignore exotic TYPE_MIN/TYPE_MAX values.  In particular, nothing
special would be done for local variables of a type with an exotic range.

For example, consider

int v (restricted_int i) {
        restricted_int j;

        if (j > 100) return -1;
        j = i;
        if (j > 100) return -1;
        return 0;
}

where restricted_int has TYPE_MAX=100 (this type does not exist in C of course).
Then a single ASSERT_EXPR would be added, something like this (I know the syntax
is wrong, this is just to convey the idea):

int v (restricted_int i) {
        ASSERT_EXPR (i < 100);

        restricted_int j;

        if (j > 100) return -1;
        j = i;
        if (j > 100) return -1;
        return 0;
}

Thus the second "if" statement would be eliminated but not the first.
As mentioned, except for inserting the ASSERT_EXPR, VRP can ignore
the fact that the type has an exotic TYPE_MAX, and reason as if TYPE_MAX
was that given by TYPE_PRECISION, as for a standard C int.  The
exoticness is all captured in the ASSERT_EXPR.

I think this scheme is enough to get good optimisation for a language like
Ada.  Advantages of this approach:

(A) No overhead for languages like C which lack exotic type ranges.
(B) Simplifies VRP, since (except for inserting the ASSERT_EXPRs) it
can completely ignore the existence of exotic ranges; and people working
on it won't need to think about them either.  From a maintenance point
of view this is excellent: as second class citizens, types with exotic
ranges are sure to be forgotten about 99% of the time when someone modifies
VRP; with this scheme, being forgotten about doesn't matter.
(C) Avoids over-aggressive optimisation on uninitialised variables.  Consider
the variable j in the example above.  If VRP considered it to be in the range
[-INF,100] because it is of type restricted_int, then it would eliminate the
first "if" statement.  This kind of thing is extremely dangerous.  Probably
the programmer placed the "if" statement there to check that the variable
really was in it's range.  Eliminating it would be legal in Ada, indeed the
program is wrong, however in practice this kind of thing is unwise: it doesn't
gain you much from an optimisation point of view, but it can make programs
behave in very unexpected ways.  It may be unwise to add ASSERT_EXPRs for
global variables with exotic ranges, since they could be uninitialised.

Furthermore, this scheme agrees well with Ada semantics.  What does
it mean to say that a subroutine parameter is of type restricted_int?  It
really means that (1) it is of type int; and (2) it's value has been checked,
before the subroutine call, and is guaranteed to be <= 100.  The code to do
the checking is automatically inserted by the compiler, which adds some code
like this at the point of the subroutine call:

if x > 100
        raise_some_exception;
else
        result = v (x);

Running VRP on this would cause an ASSERT_EXPR to be placed before the call:

if x > 100
        raise_some_exception;
else {
        ASSERT_EXPR (x <= 100);
        result = v (x);
}

I'm basically suggesting that this same ASSERT_EXPR be duplicated inside the
subroutine v.  This is OK, because the compiler guarantees that any parameter
passed to v always satisfies this assertion, for every call to v.

The compiler also guarantees that the result of a function call is in the range
of the return type (by inserting a check at the end of the function, if 
necessary).
Thus an ASSERT_EXPR can be placed on the result of a function call.

Finally, the language does not guarantee that an uninitialised variable has a
value that is in range.  Thus there is no need to add ASSERT_EXPRs for variable
declarations in general.

I'm quite interested in trying out this scheme.  Unfortunately it's not clear
to me how I would go about finding which variables are subroutine parameters,
and adding ASSERT_EXPRs for them; any hints would be welcome.

Ciao,

Duncan.

Reply via email to