On Tue, Apr 2, 2013 at 7:35 PM, Kenneth Zadeck <zad...@naturalbridge.com> wrote: > Yes, I agree that you win the challenge that it can be done. What you > have always failed to address is why anyone would want to do this. Or how > this would at all be desirable. But I completely agree that from a purely > abstract point of view you can add a storage model. > > Now here is why we REALLY do not want to go down this road: > > 1) The following comment from your earlier mail is completely wrong > > >> +#ifdef NEW_REP_FOR_INT_CST >> + /* This is the code once the tree level is converted. */ >> + wide_int result; >> + int i; >> + >> + tree type = TREE_TYPE (tcst); >> + >> + result.bitsize = GET_MODE_BITSIZE (TYPE_MODE (type)); >> + result.precision = TYPE_PRECISION (type); >> + result.len = TREE_INT_CST_LEN (tcst); >> + for (i = 0; i < result.len; i++) >> + result.val[i] = TREE_INT_CST_ELT (tcst, i); >> + >> + return result; >> +#else > > >> this also shows the main reason I was asking for storage abstraction. >> The initialization from tree is way too expensive. > > > In almost all cases, constants will fit in a single HWI. Thus, the only > thing that you are copying is the length and a single HWI. So you are > dragging in a lot of machinery just to save these two copies? Certainly > there has to be more to it than that.
In the end you will have variable-size storage in TREE_INT_CST, thus you will at least have to emit _code_ copying over meta-data and data from the tree representation to the wide-int (similarly for RTX CONST_DOUBLE/INT). I'm objecting to the amount of code you emit, and I agree that the runtime cost is copying the meta-data (hopefully optimizable via CSE / SRA) and in most cases one (or two) iterations of the loop copying the data (not optimizable).

> 2) You present this as if the implementor actually should care about the
> implementation and you give 3 alternatives: the double_int, the current
> one, and HWI. We have tried to make it so that the client should not
> care. Certainly in my experience here, I have not found a place to care.

Well, similar to the copying overhead for trees, your approach requires overloading operations for HOST_WIDE_INT operands to be able to say wi + 1 (which is certainly desirable), or it incurs the overhead of using wide_int_one ().

> In my opinion double_int needs to go away. That is the main thrust of my
> patches. There is no place in a compiler for an ABI that depends on
> constants fitting into two words whose size is defined by the host.

That's true. I'm not arguing to preserve "double-int" - I'm arguing to preserve a way to ask for an integer type on the host with (at least) N bits. Almost all double-int users really ask for an integer type on the host that has at least as many bits as the pointer representation (or word_mode) on the target (we do have HOST_WIDEST_INT == 32 bits for 64-bit pointer targets). No double-int user specifically wants 2 * HOST_WIDE_INT precision - that is just what happens to be there. Thus I am providing a way to say: get me a host integer with at least N bits (VRP asks for this, for example).

What I was asking for is that whatever can provide the above should share the functional interface with wide-int (or the other way around).
And I was claiming that wide-int is too fat, because current users of double-int eventually store double-ints permanently.

> This is not a beauty contest argument; we have public ports beginning to
> use modes that are larger than two x86-64 HWIs, and I have a private port
> that has such modes. It is my experience that any pass that uses this
> interface has one of three behaviors: it silently gets the wrong answer, it
> ICEs, or it fails to do the transformation. If we leave double_int as an
> available option, then any use of it potentially will have one of these
> three behaviors. And so one of my strong objections to this direction is
> that I do not want to fight this kind of bug for the rest of my life.
> Having a single storage model that just always works is in my opinion a
> highly desirable option. What you have never answered in a concrete manner
> is, if we decide to provide this generality, what it would be used for.
> There is no place in a portable compiler where the right answer for every
> target is two HOST wide integers.
>
> However, I will admit that the HWI option has some merits. We try to
> address this in our implementation by dividing what is done inline in
> wide-int.h into the cases that fit in an HWI, and then only dropping into
> the heavy code in wide-int.c if the mode is larger (which it rarely will
> be). However, a case could be made that for certain kinds of things like
> string lengths and such, we could use another interface or, as you argue,
> a different storage model with the same interface. I just do not see that
> the cost of the conversion code is really going to show up on anyone's
> radar.

What's the issue with abstracting away the model so that a fixed-size 'len' is possible? (Leaving aside the argument that this would easily allow an adaptor to tree.)

> 3) Your trick will work at the tree level, but not at the rtl level. The
> wide-int code cannot share storage with the CONST_INTs.
> We tried this,
> and there are a million bugs that would have to be fixed to make it work.
> It could have worked if CONST_INTs had carried a mode around, but since they
> do not, you end up with the same CONST_INT sharing the rep for several
> different types, and that just did not work unless you are willing to do
> substantial cleanups.

I don't believe you. I am only asking for the adaptors to tree and RTL to work in an rvalue-ish way (no modification, as obviously RTL and tree constants may be shared). I think your claim stems from having those precision and bitsize members in your wide-int, which I believe is a design red herring. I suppose we should concentrate on addressing that one first.

Thus, let me repeat a few questions on your design to eventually let you understand my concerns:

Given two wide-ints, a and b, what precision will the result of a + b have? Consider a having precision 32 and b having precision 64 on a 32-bit HWI host.

You define wide-int to have no sign information:

+   The representation does not contain any information about
+   signedness of the represented value, so it can be used to represent
+   both signed and unsigned numbers.  For operations where the results
+   depend on signedness (division, comparisons), the signedness must
+   be specified separately.  For operations where the signedness
+   matters, one of the operands to the operation specifies either
+   wide_int::SIGNED or wide_int::UNSIGNED.

but the result of operations on mixed-precision operands _does_ depend on the sign; nevertheless most operations do not get a signedness argument. Nor would one help, because the result depends on the signedness of the individual operands!

double-int gets around this by having a common "precision" to which all smaller-precision constants have to be sign-/zero-extended. So do CONST_INT and CONST_DOUBLE. Note that even with the same precision you have introduced the same problem via the variable len.

My proposal is simple - all wide-ints are signed!
wide-int is basically an arbitrary-precision signed integer format. The sign is encoded in the most significant bit of the last active HWI of the representation (val[len - 1] & ((HOST_WIDE_INT) 1 << (HOST_BITS_PER_WIDE_INT - 1))). All values with less precision than len * HOST_BITS_PER_WIDE_INT are properly sign-/zero-extended to precision len * HOST_BITS_PER_WIDE_INT.

This lets you define mixed-len operations by implicitly sign-/zero-extending the operands to whatever len is required for the operation. Furthermore, it allows you to get rid of the precision member (and bitsize anyway). Conversion from tree / RTL requires information on the signedness of the input (trivial for tree; for RTL all constants are signed - well, sign-extended). Whenever you want to transfer the wide-int to tree / RTL you have to sign-/zero-extend according to the desired precision. If you need something other than arbitrary-precision arithmetic you have to explicitly truncate / extend at suitable places - with overflow checking being trivial here. For optimization purposes, selected operations may benefit from a combined implementation receiving a target precision and signedness.

Whatever extra meta-data RTL requires does not belong in wide-int but in the RTX. Trivially a mode comes to my mind (on tree we have a type), and trivially each RTX has a mode. And each mode has a precision and bitsize. It lacks a sign, so all RTL integer constants are sign-extended for encoding-efficiency purposes. Mixed-mode operations will not occur (mixed-len operations will); mixed-mode ops are exclusively sign-/zero-extensions and truncations.

The representation of (unsigned HOST_WIDE_INT)-1 would necessarily be { 0, (unsigned HOST_WIDE_INT)-1 }, while the representation of -1 in any precision would be { -1 }.

That was my proposal. Now, can you please properly specify yours?

Thanks,
Richard.
> > On 04/02/2013 11:04 AM, Richard Biener wrote: >> >> On Wed, Feb 27, 2013 at 2:59 AM, Kenneth Zadeck >> <zad...@naturalbridge.com> wrote: >>> >>> This patch contains a large number of the changes requested by Richi. >>> It >>> does not contain any of the changes that he requested to abstract the >>> storage layer. That suggestion appears to be quite unworkable. >> >> I of course took this claim as a challenge ... with the following result. >> It is >> of course quite workable ;) >> >> The attached patch implements the core wide-int class and three storage >> models (fixed size for things like plain HWI and double-int, variable size >> similar to how your wide-int works and an adaptor for the double-int as >> contained in trees). With that you can now do >> >> HOST_WIDE_INT >> wi_test (tree x) >> { >> // template argument deduction doesn't do the magic we want it to do >> // to make this kind of implicit conversions work >> // overload resolution considers this kind of conversions so we >> // need some magic that combines both ... but seeding the overload >> // set with some instantiations doesn't seem to be possible :/ >> // wide_int<> w = x + 1; >> wide_int<> w; >> w += x; >> w += 1; >> // template argument deduction doesn't deduce the return value type, >> // not considering the template default argument either ... >> // w = wi (x) + 1; >> // we could support this by providing rvalue-to-lvalue promotion >> // via a traits class? >> // otoh it would lead to sub-optimal code anyway so we should >> // make the result available as reference parameter and only support >> // wide_int <> res; add (res, x, 1); ? >> w = wi (x).operator+<wide_int<> >(1); >> wide_int<>::add(w, x, 1); >> return w.to_hwi (); >> } >> >> we are somewhat limited with C++ unless we want to get really fancy. 
>> Eventually providing operator+ just doesn't make much sense for
>> generic wide-int combinations (though then the issue is that its operands
>> are no longer commutative, which I think is the case with your wide-int
>> or double-int as well - they don't support "1 + wide_int" for obvious
>> reasons).
>>
>> So there are implementation design choices left undecided.
>>
>> Oh, and the operation implementations are crap (they compute nonsense).
>>
>> But you should get the idea.
>>
>> Richard.
>
>