On 08/23/2013 11:02 AM, Richard Sandiford wrote:
Hi Kenny,
This is the first time I've looked at the implementation of wide-int.h
(rather than just looking at the rtl changes, which as you know I like
in general), so FWIW here are some comments on wide-int.h. I expect
a lot of them overlap with Richard B.'s comments.
I also expect many of them are going to be annoying, sorry, but this
first one definitely will. The coding conventions say that functions
should be defined outside the class:
http://gcc.gnu.org/codingconventions.html
and that opening braces should be on their own line, so most of the file
needs to be reformatted. I went through and made that change with the
patch below, in the process of reading through. I also removed "SGN
must be SIGNED or UNSIGNED." because it seemed redundant when those are
the only two values available. The patch fixes a few other coding standard
problems and typos, but I've not made any actual code changes (or at least,
I didn't mean to).
I had started the file with the functions outside of the class and mike
had stated putting them in the class, and so i went with putting them in
the class since many of them were one liners and so having them out of
line just doubled the size of everything. however, we did not look at
the coding conventions and that really settles the argument. Thanks for
doing this.
Does it look OK to install?
you can check it in.
I'm still unsure about these "infinite" precision types, but I understand
the motivation and I have no objections. However:
* Code that does widening conversions. The canonical way that
this is performed is to sign or zero extend the input value to
the max width based on the sign of the type of the source and
then to truncate that value to the target type. This is in
preference to using the sign of the target type to extend the
value directly (which gets the wrong value for the conversion
of large unsigned numbers to larger signed types).
I don't understand this particular reason. Using the sign of the source
type is obviously right, but why does that mean we need "infinite" precision,
rather than just doubling the precision of the source?
in a sense, "infinite" does not really mean infinite, it really just
means large enough so that you never loose any information from the
top. For widening all that you really need to be "infinite" is one
more bit larger than the destination type. We could have had an abi
where you specified the precision of every operation based on the
precisions of the inputs. Instead, for these kinds of operations, we
decided to sniff the port and determine a fixed width that was large
enough for everything that was needed for the port. We call that number
infinite. This sort of follows the convention that double-int was used
with were infinite was 128 bits, but with our design/implementation, we
(hopefully) have no bugs where the size of the datatypes needed never
runs into implementation limits.
* When a constant that has an integer type is converted to a
wide-int it comes in with precision 0. For these constants the
top bit does accurately reflect the sign of that constant; this
is an exception to the normal rule that the signedness is not
represented. When used in a binary operation, the wide-int
implementation properly extends these constants so that they
properly match the other operand of the computation. This allows
you write:
tree t = ...
wide_int x = t + 6;
assuming t is a int_cst.
This seems dangerous. Not all code that uses "unsigned HOST_WIDE_INT"
actually wants it to be an unsigned value. Some code uses it to avoid
the undefinedness of signed overflow. So these overloads could lead
to us accidentally zero-extending what's conceptually a signed value
without any obvious indication that that's happening. Also, hex constants
are unsigned int, but it doesn't seem safe to assume that 0x80000000 was
meant to be zero-extended.
I realise the same thing can happen if you mix "unsigned int" with
HOST_WIDE_INT, but the point is that you shouldn't really do that
in general, whereas we're defining these overloads precisely so that
a mixture can be used.
I'd prefer some explicit indication of the sign, at least for anything
other than plain "int" (so that the compiler will complain about uses
of "unsigned int" and above).
There is a part of me that finds this scary and a part of me that feels
that the concern is largely theoretical. It does make it much easier
to read and understand the code to be able to write "t + 6" rather than
"wide_int (t) + wide_int::from_uhwi" (6) but of course you loose some
control over how 6 is converted.
Note that the bits above the precision are not defined and the
algorithms used here are careful not to depend on their value. In
particular, values that come in from rtx constants may have random
bits.
I have a feeling I'm rehashing a past debate, sorry, but rtx constants can't
have random bits. The upper bits must be a sign extension of the value.
There's exactly one valid rtx for each (value, mode) pair. If you saw
something different then that sounds like a bug. The rules should already
be fairly well enforced though, since something like (const_int 128) --
or (const_int 256) -- will not match a QImode operand.
This is probably the part of the representation that I disagree most with.
There seem to be two main ways we could hande the extension to whole HWIs:
(1) leave the stored upper bits undefined and extend them on read
(2) keep the stored upper bits in extended form
It is not a matter of opening old wounds. I had run some tests on
x86-64 and was never able to assume that the bits above the precision
had always been canonized. I will admit that i did not try to run down
the bugs because i thought that since the rtx constructors did not have
a mode associated with them now was one required to in the constructors,
that this was not an easily solvable problem. So i have no idea if i
hit the one and only bug or was about to start drinking from a fire
hose. But the back ends are full of GEN_INT (a) where a came from god
knows where and you almost never see it properly canonized. I think
that until GEN_INT takes a mandatory non VOIDmode mode parameter, and
that constructor canonizes it, you are doomed chasing this bug
forever. My/our experience on the dataflow branch was that unless you
go clean things up AND put in a bunch of traps to keep people honest,
you are never going to be able to make this assumption.
Having said that, we actually do neither of (1) or (2) inside of
wide-int. For rtl to wide-int, we leave the upper bits undefined and
never allow you to look at them because the constructor has a mode and
that mode allows you to draw a line in the sand. There is no
constructor for the "infinite" wide ints from rtl so you have no way to
look.
Doing this allows us to do something that richi really wanted: avoiding
copying. We do not get to do as much richi would like and when he
comes back from leave, he may have other places where can apply it, but
right now if you say w = t + 6 as above, it "borrows" the rep from t to
do the add, it does not really build a wide-int. We also do this if t is
an rtx const. But if we had to canonize the number, then we could not
borrow the rep.
The patch goes for (1) but (2) seems better to me, for a few reasons:
* As above, constants coming from rtl are already in the right form,
so if you create a wide_int from an rtx and only query it, no explicit
extension is needed.
* Things like logical operations and right shifts naturally preserve
the sign-extended form, so only a subset of write operations need
to take special measures.
right now the internals of wide-int do not keep the bits above the
precision clean. as you point out this could be fixed by changing
lshift, add, sub, mul, div (and anything else i have forgotten about)
and removing the code that cleans this up on exit. I actually do not
really care which way we go here but if we do go on keeping the bits
clean above the precision inside wide-int, we are going to have to clean
the bits in the constructors from rtl, or fix some/a lot of bugs.
But if you want to go with the stay clean plan you have to start clean,
so at the rtl level this means copying. and it is the not copying trick
that pushed me in the direction we went.
At the tree level, this is not an issue. There are no constructors for
tree-csts that do not have a type and before we copy the rep from the
wide-int to the tree, we clean the top bits. (BTW - If i had to guess
what the bug is with the missing messages on the mips port, it would be
because the front ends HAD a bad habit of creating constants that did
not fit into a type and then later checking to see if there were any
interesting bits above the precision in the int-cst. This now does not
work because we clean out those top bits on construction but it would
not surprise me if we missed the fixed point constant path.) So at the
tree level, we could easily go either way here, but there is a cost at
the rtl level with doing (2).
* You have a public interface that exposes the underlying HWIs
(which is fine with me FWIW), so it seems better to expose a fully-defined
HWI rather than only a partially-defined HWI.
E.g. zero_p is:
HOST_WIDE_INT x;
if (precision && precision < HOST_BITS_PER_WIDE_INT)
x = sext_hwi (val[0], precision);
else if (len == 0)
{
gcc_assert (precision == 0);
return true;
}
the above test for len==0 has been removed because it is rot.
else
x = val[0];
return len == 1 && x == 0;
but I think it really ought to be just:
return len == 1 && val[0] == 0;
If we did your 2, it would be this way.
When the precision is 0, all the bits in the LEN elements of
VEC are significant with no undefined bits. Precisionless
constants are limited to being one or two HOST_WIDE_INTs. When two
are used the upper value is 0, and the high order bit of the first
value is set. (Note that this may need to be generalized if it is
ever necessary to support 32bit HWIs again).
I didn't understand this. When are two HOST_WIDE_INTs needed for
"precision 0"?
if a large unsigned constant comes in with the top bit set, the
canonized value takes 2 hwis, the top hwi being 0.
#define addr_max_bitsize (64)
#define addr_max_precision \
These should either be lower-case C++ constants or upper-case macros.
this will be fixed.
/* VAL is set to a size that is capable of computing a full
multiplication on the largest mode that is represented on the
target. Currently there is a part of tree-vrp that requires 2x +
2 bits of precision where x is the precision of the variables
being optimized. */
HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
This comment seems redundant with the one above WIDE_INT_MAX_ELTS
and likely to get out of date.
this will be fixed
So real hardware only looks at a small part of the shift amount.
On IBM machines, this tends to be 1 more than what is necessary
to encode the shift amount. The rest of the world looks at only
the minimum number of bits. This means that only 3 gate delays
are necessary to set up the shifter.
I agree that it makes sense for wide_int to provide a form of shift
in which the shift amount is truncated. However, I strongly believe
wide-int.h should not test SHIFT_COUNT_TRUNCATED directly. It should
be up to the callers to decide when truncation is needed (and to what width).
richi does not like this either so i will get rid of it.
We really need to get rid of the #include "tm.h" in wide-int.h.
MAX_BITSIZE_MODE_ANY_INT should be the only partially-target-dependent
thing in there. If that comes from tm.h then perhaps we should put it
into a new header file instead.
I will talk to mike about fixing this.
/* Return THIS as a signed HOST_WIDE_INT. If THIS does not fit in
PREC, the information is lost. */
HOST_WIDE_INT
to_shwi (unsigned int prec = 0) const
Just dropping the excess bits seems dangerous. I think we should assert
instead, at least when prec is 0.
there are times when this is useful. there is a lot of code that just
wants to look at the bottom bits to do some alignment stuff. I guess
that code could just grab the bottom elt of the array, but has not
generally been how these this has been done.
/* Return true if THIS is negative based on the interpretation of SGN.
For UNSIGNED, this is always false. This is correct even if
precision is 0. */
inline bool
wide_int::neg_p (signop sgn) const
It seems odd that you have to pass SIGNED here. I assume you were doing
it so that the caller is forced to confirm signedness in the cases where
a tree type is involved, but:
* neg_p kind of implies signedness anyway
* you don't require this for minus_one_p, so the interface isn't consistent
* at the rtl level signedness isn't a property of the "type" (mode),
so it seems strange to add an extra hoop there
it was done this way so that you can pass in TYPE_SIGN (t) in as the
second parameter. We could default the parameter to SIGNED and that
would solve both cases. I will look into minus_one_p.
/* Return true if THIS fits in an unsigned HOST_WIDE_INT with no
loss of precision. */
inline bool
wide_int_ro::fits_uhwi_p () const
{
return len == 1 || (len == 2 && val[1] == 0);
}
This doesn't look right, since len == 1 could mean that you have a
gazillion-bit all-ones number. Also, the val[1] == 0 check seems
to assume the upper bits are defined when the precision isn't a multiple
of the HWI size (although as above I that's a good thing and should be
enforced).
you are correct.
sign_mask has:
gcc_unreachable ();
#if 0
return val[len - 1] >> (HOST_BITS_PER_WIDE_INT - 1);
#endif
Maybe remove this?
The inline div routines do:
i will work on this this weekend. tree vrp has not been our friend and
sometimes does not like to compile this function.
if (overflow)
*overflow = false;
but then just pass overflow to divmod_internal. Seems better to initialise
*overflow there instead.
div_floor has:
return divmod_internal (true, val, len, p1, s, cl, p2, sgn,
&remainder, false, overflow);
if (quotient.neg_p (sgn) && !remainder.zero_p ())
return quotient - 1;
return quotient;
}
where the last bit is unreachable.
not to mention that the compiler never complained.
/* Divide DIVISOR into THIS producing the remainder. The result is
the same size as the operands. The sign is specified in SGN.
The output is floor truncated. OVERFLOW is set to true if the
result overflows, false otherwise. */
template <typename T>
inline wide_int_ro
wide_int_ro::mod_floor (const T &c, signop sgn, bool *overflow = 0) const
It's really the quotient that's floor-truncated, not the output
(the remainder). I was a bit confused at first why you'd need to
truncate an integer remainder. Same for the other functions.
The comments needs work, not the code. You do have to adjust the
remainder in some cases, but it is not truncation.
#ifdef DEBUG_WIDE_INT
debug_vwa ("wide_int_ro:: %d = (%s == %s)\n", result, *this, s, cl, p2);
#endif
I think these are going to bitrot quickly if we #if 0 then out.
I think we should either use:
if (DEBUG_WIDE_INT)
debug_vwa ("wide_int_ro:: %d = (%s == %s)\n", result, *this, s, cl, p2);
or drop them.
my plan is to leave these in while the branch is still being developed
and then get rid of them before it is merged. My guess is that i am
going to need them still when i try the 32bit hwi test.
The implementations of the general to_shwi1 and to_shwi2 functions look
identical. I think the comment should explain why two functions are needed.
I will check this
/* Negate THIS. */
inline wide_int_ro
wide_int_ro::operator - () const
{
wide_int_ro r;
r = wide_int_ro (0) - *this;
return r;
}
/* Negate THIS. */
inline wide_int_ro
wide_int_ro::neg () const
{
wide_int_ro z = wide_int_ro::from_shwi (0, precision);
gcc_checking_assert (precision);
return z - *this;
}
Why do we need both of these, and why are they implemented slightly
differently?
neg should go away.
template <int bitsize>
inline bool
fixed_wide_int <bitsize>::multiple_of_p (const wide_int_ro &factor,
signop sgn,
fixed_wide_int *multiple) const
{
return wide_int_ro::multiple_of_p (factor, sgn,
reinterpret_cast <wide_int *> (multiple));
}
The patch has several instances of this kind of reinterpret_cast.
It looks like an aliasing violation.
The main thing that's changed since the early patches is that we now
have a mixture of wide-int types. This seems to have led to a lot of
boiler-plate forwarding functions (or at least it felt like that while
moving them all out the class). And that in turn seems to be because
you're trying to keep everything as member functions. E.g. a lot of the
forwarders are from a member function to a static function.
Wouldn't it be better to have the actual classes be light-weight,
with little more than accessors, and do the actual work with non-member
template functions? There seems to be 3 grades of wide-int:
(1) read-only, constant precision (from int, etc.)
(2) read-write, constant precision (fixed_wide_int)
(3) read-write, variable precision (wide_int proper)
but we should be able to hide that behind templates, with compiler errors
if you try to write to (1), etc.
To take one example, the reason we can't simply use things like
std::min on wide ints is because signedness needs to be specified
explicitly, but there's a good reason why the standard defined
std::min (x, y) rather than x.min (y). It seems like we ought
to have smin and umin functions alongside std::min, rather than
make them member functions. We could put them in a separate namespace
if necessary.
I might have a go at trying this last part next week, unless Richard is
already in the process of rewriting things :-)
mike will answer this.
Thanks,
Richard