Re: patch to fix constant math

Richard Guenther Sun, 07 Oct 2012 05:47:36 -0700

On Fri, Oct 5, 2012 at 6:34 PM, Kenneth Zadeck <[email protected]> wrote:
> richard s,
> there are two comments that i deferred to you.   that have the word richard
> in them,
>
> richi,
> thank, i will start doing this now.
>
> kenny
>
> On 10/05/2012 09:49 AM, Richard Guenther wrote:
>>
>> On Fri, Oct 5, 2012 at 2:39 PM, Richard Guenther
>> <[email protected]> wrote:
>>>
>>> Ok, I see where you are going.  Let me look at the patch again.
>>
>> * The introduction and use of CONST_SCALAR_INT_P could be split out
>>    (obvious and good)
>>
>> * DEF_RTL_EXPR(CONST_WIDE_INT, "const_wide_int", "", RTX_CONST_OBJ)
>>    defining that new RTX is orthogonal to providing the wide_int
>> abstraction
>>    for operating on CONST_INT and CONST_DOUBLE, right?
>>
>> @@ -144,6 +144,7 @@ static bool
>>   prefer_and_bit_test (enum machine_mode mode, int bitnum)
>>   {
>>     bool speed_p;
>> +  wide_int mask = wide_int::set_bit_in_zero (bitnum, mode);
>>
>> set_bit_in_zero ... why use this instead of
>> wide_int::zero (mode).set_bit (bitnum) that would match the old
>> double_int_zero.set_bit interface and be more composed of primitives?
>
> adding something like this was just based on usage.    We do this operation
> all over the place, why not make it concise and efficient.     The api is
> very bottom up.   I looked at what the clients were doing all over the place
> and added those functions.
>
> wide-int has both and_not and or_not.   double-int only has and_not, but
> there are a lot of places in where you do or_nots, so we have or_not also.
>
>
>>     if (and_test == 0)
>>       {
>> @@ -164,8 +165,7 @@ prefer_and_bit_test (enum machine_mode m
>>       }
>>
>>     /* Fill in the integers.  */
>> -  XEXP (and_test, 1)
>> -    = immed_double_int_const (double_int_zero.set_bit (bitnum), mode);
>> +  XEXP (and_test, 1) = immed_wide_int_const (mask);
>>
>> I suppose immed_wide_int_const may create a CONST_INT, a CONST_DOUBLE
>> or a CONST_WIDE_INT?
>
> yes, on converted targets it builds const_ints and const_wide_ints and on
> unconverted targets it builds const_ints and const_doubles.    The reason i
> did not want to convert the targets is that the code that lives in the
> targets generally wants to use the values to create constants and this code
> is very very very target specific.
>
>>
>> +#if TARGET_SUPPORTS_WIDE_INT
>> +/* Returns 1 if OP is an operand that is a CONST_INT or CONST_WIDE_INT.
>> */
>> +int
>> +const_scalar_int_operand (rtx op, enum machine_mode mode)
>> +{
>>
>> why is it necessary to condition anything on TARGET_SUPPORTS_WIDE_INT?
>> It seems testing/compile coverage of this code will be bad ...
>>
>> Can't you simply map CONST_WIDE_INT to CONST_DOUBLE for targets
>
> The accessors for the two are completely different.    const doubles "know"
> that there are exactly two hwi's.   for const_wide_ints, you only know that
> the len is greater than 1.   anything with len 1 would be CONST_INT.
>
> In a modern c++ world, const_int would be a subtype of const_int, but that
> is a huge patch.
>
>
>> not supporting CONST_WIDE_INT (what does it mean to "support"
>> CONST_WIDE_INT?  Does a target that supports CONST_WIDE_INT no
>> longer use CONST_DOUBLEs for integers?)
>
> yes, that is exactly what it means.
>
>> +  if (!CONST_WIDE_INT_P (op))
>> +    return 0;
>>
>> hm, that doesn't match the comment (CONST_INT is supposed to return 1 as
>> well).
>> The comment doesn't really tell what the function does it seems,
>
> I need some context here to reply.
>
>
>> +      int prec = GET_MODE_PRECISION (mode);
>> +      int bitsize = GET_MODE_BITSIZE (mode);
>> +
>> +      if (CONST_WIDE_INT_NUNITS (op) * HOST_BITS_PER_WIDE_INT > bitsize)
>> +       return 0;
>>
>> it's mode argument is not documented.  And this doesn't even try to see if
>> the constant would fit the mode anyway (like if it's zero).  Not sure what
>> the function is for.
>
> I will upgrade the comments, they were inherited from the old version with
> the const_double changed to the const_wide_int
>
>
>> +       {
>> +         /* Multiword partial int.  */
>> +         HOST_WIDE_INT x
>> +           = CONST_WIDE_INT_ELT (op, CONST_WIDE_INT_NUNITS (op) - 1);
>> +         return (wide_int::sext (x, prec & (HOST_BITS_PER_WIDE_INT - 1))
>> +                 == x);
>>
>> err - so now we have wide_int with a mode but we pass in another mode
>> and see if they have the same value?  Same value as-in signed value?
>> Which probably is why you need to rule out different size constants above?
>> Because you don't know whether to sign or zero extend?
>
> no it is worse than that.   I made the decision, which i think is correct,
> that we were not going to carry the mode inside const_wide_int.   The
> justification was that we do not do it for const_int.    There is a comment
> in the constructor for const_wide_int that says it would be so easy to just
> put this in.
>
> But, this is an old api that did not change.   only the internals changed to
> support const_wide_int.
>
>
>>
>> +/* Returns 1 if OP is an operand that is a CONST_WIDE_INT.  */
>> +int
>> +const_wide_int_operand (rtx op, enum machine_mode mode)
>> +{
>>
>> similar comments, common code should be factored out at least.
>> Instead of conditioning a whole set of new function on supports-wide-int
>> please add cases to the existing functions to avoid diverging in pieces
>> that both kind of targets support.
>
> in some places i do one and in some i do the other.   it really just depends
> on how much common code there was.   For these there was almost no common
> code.
>
>
>> @@ -2599,7 +2678,7 @@ constrain_operands (int strict)
>>                  break;
>>
>>                case 'n':
>> -               if (CONST_INT_P (op) || CONST_DOUBLE_AS_INT_P (op))
>> +               if (CONST_SCALAR_INT_P (op))
>>                    win = 1;
>>
>> factoring out this changes would really make the patch less noisy.
>
> i will this weekend.
>
>> skipping to rtl.h bits
>>
>> +struct GTY((variable_size)) hwivec_def {
>> +  int num_elem;                /* number of elements */
>> +  HOST_WIDE_INT elem[1];
>> +};
>>
>> num_elem seems redundant and computable from GET_MODE_SIZE.
>
> NOT AT ALL.   CONST_WIDE_INT and TREE_INT_CST only have enough elements in
> the array to hold the actual value, they do not have enough elements to hold
> the mode or type.
> in real life almost all integer constants of any type are very small.    in
> the rtl word, we almost never actually create any CONST_WIDE_INT outside of
> the test cases, because CONST_INT handles the cases, and i assume that at
> the tree level, almost every int cst will have an len of 1.
>
>
>
>> Or do you want to optimize encoding like for CONST_INT (and unlike
>> CONST_DOUBLE)?  I doubt the above packs nicely into rtx_def?
>> A general issue of it though - we waste 32bits on 64bit hosts in
>> rtx_def between the bits and the union.  Perfect place for num_elem
>> (if you really need it) and avoids another 4 byte hole.  Similar issue
>> exists in rtvec_def.
>
> I am going to let richard answer this.
>
>
>> back to where I was
>>
>> @@ -645,6 +673,10 @@ iterative_hash_rtx (const_rtx x, hashval
>>         return iterative_hash_object (i, hash);
>>       case CONST_INT:
>>         return iterative_hash_object (INTVAL (x), hash);
>> +    case CONST_WIDE_INT:
>> +      for (i = 0; i < CONST_WIDE_INT_NUNITS (x); i++)
>> +       hash = iterative_hash_object (CONST_WIDE_INT_ELT (x, i), hash);
>> +      return hash;
>>
>> I see CONST_DOUBLEs value is not hashed.  Why hash wide-ints value?
>
> There are a lot of ugly things that one uncovers when one looks at code this
> old.
> the idea was to bring whatever i touched up to best practice standards.
>
>
>> Seeing
>>
>> +#define HWI_GET_NUM_ELEM(HWIVEC)       ((HWIVEC)->num_elem)
>> +#define HWI_PUT_NUM_ELEM(HWIVEC, NUM)  ((HWIVEC)->num_elem = (NUM))
>>
>> skipping to
>>
>> +#if TARGET_SUPPORTS_WIDE_INT
>> +  {
>> +    rtx value = const_wide_int_alloc (len);
>> +    unsigned int i;
>> +
>> +    /* It is so tempting to just put the mode in here.  Must control
>> +       myself ... */
>> +    PUT_MODE (value, VOIDmode);
>> +    HWI_PUT_NUM_ELEM (CONST_WIDE_INT_VEC (value), len);
>>
>> why is NUM_ELEM not set in const_wide_int_alloc, and only there?
>>
>> +extern rtx rtx_alloc_stat_v (RTX_CODE MEM_STAT_DECL, int);
>> +#define rtx_alloc_v(c, SZ) rtx_alloc_stat_v (c MEM_STAT_INFO, SZ)
>> +#define const_wide_int_alloc(NWORDS)                           \
>> +  rtx_alloc_v (CONST_WIDE_INT,                                 \
>> +              (sizeof (struct hwivec_def)                      \
>> +               + ((NWORDS)-1) * sizeof (HOST_WIDE_INT)))       \
>>
>> because it's a macro(?).  Ugh.
>>
>> I see CONST_DOUBLE for ints and CONST_WIDE_INT for ints use
>> is mutually exclusive.  Good.  What's the issue with converting targets?
>> Just retain the existing 2 * HWI limit by default.
>
> I have answered this elsewhere, the constant generation code on some
> platforms is tricky and i felt that would require knowledge of the port that
> i did not have.   it is not just a bunch of c code, it is also changing
> patterns in the md files.
>
>
>> +#if TARGET_SUPPORTS_WIDE_INT
>> +
>> +/* Match CONST_*s that can represent compile-time constant integers.  */
>> +#define CASE_CONST_SCALAR_INT \
>> +   case CONST_INT: \
>> +   case CONST_WIDE_INT
>> +
>>
>> I regard this abstraction is mostly because the transition is not
>> finished.
>> Not sure if seeing CASE_CONST_SCALAR_INT, CASE_CONST_UNIQUE (?),
>> CASE_CONST_ANY adds more to the confusion than spelling out all of them.
>> Isn't there sth like a tree-code-class for RTXen?  That would catch
>> CASE_CONST_ANY, no?
>
> however, those abstractions are already in the trunk.   They were put in
> earlier to make this patch smaller.
>
>> @@ -3081,6 +3081,8 @@ commutative_operand_precedence (rtx op)
>>     /* Constants always come the second operand.  Prefer "nice" constants.
>> */
>>     if (code == CONST_INT)
>>       return -8;
>> +  if (code == CONST_WIDE_INT)
>> +    return -8;
>>     if (code == CONST_DOUBLE)
>>       return -7;
>>     if (code == CONST_FIXED)
>>
>> I think it should be the same as CONST_DOUBLE which it "replaces".
>> Likewise below, that avoids code generation changes (which there should
>> be none for a target that is converted, right?).
>
> not clear because it does not really replace const double - remember that
> const double still holds fp stuff.    another one for richard to comment on.
>
>> @@ -5351,6 +5355,20 @@ split_double (rtx value, rtx *first, rtx
>>              }
>>          }
>>       }
>> +  else if (GET_CODE (value) == CONST_WIDE_INT)
>> +    {
>> +      gcc_assert (CONST_WIDE_INT_NUNITS (value) == 2);
>> +      if (WORDS_BIG_ENDIAN)
>> +       {
>> +         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
>> +         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
>> +       }
>> +      else
>> +       {
>> +         *first = GEN_INT (CONST_WIDE_INT_ELT (value, 0));
>> +         *second = GEN_INT (CONST_WIDE_INT_ELT (value, 1));
>> +       }
>> +    }
>>
>> scary ;)
>
> unbelievably scary.    This is one of those places where i really do not
> know what the code is doing and so i just put in something that was
> minimally correct.    when we get some code where the assert hits then i
> will figure it out.
>
>
>> Index: gcc/sched-vis.c
>> ===================================================================
>> --- gcc/sched-vis.c     (revision 191978)
>> +++ gcc/sched-vis.c     (working copy)
>> @@ -444,14 +444,31 @@ print_value (char *buf, const_rtx x, int
>> ...
>>       case CONST_DOUBLE:
>> -      if (FLOAT_MODE_P (GET_MODE (x)))
>> -       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0,
>> 1);
>> -      else
>> +     if (TARGET_SUPPORTS_WIDE_INT == 0 && !FLOAT_MODE_P (GET_MODE (x)))
>>          sprintf (t,
>>                   "<" HOST_WIDE_INT_PRINT_HEX "," HOST_WIDE_INT_PRINT_HEX
>> ">",
>>                   (unsigned HOST_WIDE_INT) CONST_DOUBLE_LOW (x),
>>                   (unsigned HOST_WIDE_INT) CONST_DOUBLE_HIGH (x));
>> +      else
>> +       real_to_decimal (t, CONST_DOUBLE_REAL_VALUE (x), sizeof (t), 0,
>> 1);
>>         cur = safe_concat (buf, cur, t);
>>         break;
>>
>> I don't see that this hunk is needed?  In fact it's more confusing this
>> way.
>
> you are likely right.   this is really there to say that this code would go
> away if
> (when) all targets support wide int,  but i will get rid of it.
>
>> +/* If the target supports integers that are wider than two
>> +   HOST_WIDE_INTs on the host compiler, then the target should define
>> +   TARGET_SUPPORTS_WIDE_INT and make the appropriate fixups.
>> +   Otherwise the compiler really is not robust.  */
>> +#ifndef TARGET_SUPPORTS_WIDE_INT
>> +#define TARGET_SUPPORTS_WIDE_INT 0
>> +#endif
>>
>> I wonder what are the appropriate fixups?
>
> it was in the mail of the original patch.    i will in the next round,
> transfer a cleaned up version of that into the doc for this target hook.
>
>>
>> -/* Return a CONST_INT or CONST_DOUBLE corresponding to target reading
>> -   GET_MODE_BITSIZE (MODE) bits from string constant STR.  */
>> +/* Return a CONST_INT, CONST_WIDE_INT, or CONST_DOUBLE corresponding
>> +   to target reading GET_MODE_BITSIZE (MODE) bits from string constant
>> +   STR.  */
>>
>> maybe being less specific in this kind of comments and just refering to
>> RTX integer constants would be better?
>
> will do, i have done some like that.
>
>> ... skip ...
>>
>> Index: gcc/hwint.c
>> ===================================================================
>> --- gcc/hwint.c (revision 191978)
>> +++ gcc/hwint.c (working copy)
>> @@ -112,6 +112,29 @@ ffs_hwi (unsigned HOST_WIDE_INT x)
>>   int
>>   popcount_hwi (unsigned HOST_WIDE_INT x)
>>   {
>> +  /* Compute the popcount of a HWI using the algorithm from
>> +     Hacker's Delight.  */
>> +#if HOST_BITS_PER_WIDE_INT == 32
>>
>> separate patch please.  With a rationale.
>
> fine.
>
>> ...
>>
>> +/* Constructs tree in type TYPE from with value given by CST.  Signedness
>> +   of CST is assumed to be the same as the signedness of TYPE.  */
>> +
>> +tree
>> +wide_int_to_tree (tree type, const wide_int &cst)
>> +{
>> +  wide_int v;
>> +  if (TYPE_UNSIGNED (type))
>> +    v = cst.zext (TYPE_PRECISION (type));
>> +  else
>> +    v = cst.sext (TYPE_PRECISION (type));
>> +
>> +  return build_int_cst_wide (type, v.elt (0), v.elt (1));
>>
>> this needs an assert that the wide-int contains two HWIs.
>
> as i said in an earlier mail, this is just transition code for when the tree
> level code goes in.
> but i see your point and will add the assert.
>
>
>
>> +#ifndef GENERATOR_FILE
>> +extern tree wide_int_to_tree (tree type, const wide_int &cst);
>> +#endif
>>
>> why's that?  The header inclusion isn't guarded that way.
>
> there was a problem building without it.   but i will look into it.
>
>> Stopping here, skipping to wide-int.[ch]
>>
>> +#define DEBUG_WIDE_INT
>> +#ifdef DEBUG_WIDE_INT
>> +  /* Debugging routines.  */
>> +  static void debug_vw  (const char* name, int r, const wide_int& o0);
>> +  static void debug_vwh (const char* name, int r, const wide_int &o0,
>> +                        HOST_WIDE_INT o1);
>>
>> there is DEBUG_FUNCTION.
>>
>> +/* This is the maximal size of the buffer needed for dump.  */
>> +const int MAX = (MAX_BITSIZE_MODE_ANY_INT / 4
>> +                + MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT +
>> 32);
>>
>> I'd prefer a target macro for this, not anything based off
>> MAX_BITSIZE_MODE_ANY_INT just to allow no-op transition of a
>> target to wide-ints (which IMHO should be the default for all targets,
>> simplifying the patch).
>>
>> +wide_int
>> +wide_int::from_uhwi (unsigned HOST_WIDE_INT op0, enum machine_mode
>> mode, bool *overflow)
>> +{
>> ...
>> +#ifdef DEBUG_WIDE_INT
>> +  if (dump_file)
>> +    {
>> +      char buf0[MAX];
>> +      fprintf (dump_file, "%s: %s = " HOST_WIDE_INT_PRINT_HEX "\n",
>> +              "wide_int::from_uhwi", result.dump (buf0), op0);
>> +    }
>> +#endif
>>
>> hm, ok ... I suppose the debug stuff (which looks very noisy) is going to
>> be
>> removed before this gets committed anywhere?
>
> it is very noisy and i was just planning to turn it off.    but it is very
> useful when there is a problem so i was not planning to actually remove it
> immediately.
>
>
>>
>> +wide_int
>> +wide_int::from_int_cst (const_tree tcst)
>> +{
>> +#if 1
>> +  wide_int result;
>> +  tree type = TREE_TYPE (tcst);
>>
>> should just call wide_it::from_double_int (tree_to_double_int (tcst))
>>
>> As I said elsewhere I don't think only allowing precisions that are
>> somewhere
>> in any mode is good enough.  We need the actual precision here
>> (everywhere).
>
> i think that the case has been made that what wide int needs to store inside
> it is the precision and bitsize and that passing in the mode/tree type is
> just a convenience.     I will make this change. Then we can have exposed
> constructors that just take bitsize and precision.
>
>
>
>> Index: gcc/wide-int.h
>> ===================================================================
>> --- gcc/wide-int.h      (revision 0)
>> +++ gcc/wide-int.h      (revision 0)
>> @@ -0,0 +1,718 @@
>> +/* Operations with very long integers.
>> +   Copyright (C) 2012 Free Software Foundation, Inc.
>> ...
>> +
>> +#ifndef GENERATOR_FILE
>> +#include "hwint.h"
>> +#include "options.h"
>> +#include "tm.h"
>> +#include "insn-modes.h"
>> +#include "machmode.h"
>> +#include "double-int.h"
>> +#include <gmp.h>
>> +#include "insn-modes.h"
>>
>> ugh.  Compare this with double-int.h which just needs <gmp.h> (and even
>> that
>> is a red herring) ...
>>
>> +#define wide_int_minus_one(MODE) (wide_int::from_shwi (-1, MODE))
>> +#define wide_int_zero(MODE)      (wide_int::from_shwi (0, MODE))
>> +#define wide_int_one(MODE)       (wide_int::from_shwi (1, MODE))
>> +#define wide_int_two(MODE)       (wide_int::from_shwi (2, MODE))
>> +#define wide_int_ten(MODE)       (wide_int::from_shwi (10, MODE))
>>
>> add static functions like
>>
>>    wide_int wide_int::zero (MODE)
>
> ok
>
>> +class wide_int {
>> +  /* Internal representation.  */
>> +  HOST_WIDE_INT val[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
>> +  unsigned short len;
>> +  enum machine_mode mode;
>>
>> you really need to store the precision here, not the mode.  We do not have
>> a distinct mode for each of the tree constant precisions we see.  So what
>> might work for RTL will later fail on trees, so better do it correct
>> from the start
>> (overloads using mode rather than precision may be acceptable - similarly
>> I'd consider overloads for tree types).
>
> discussed above.
>
>> len seems redundant unless you want to optimize encoding.
>> len == (precision + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT.
>
> that is exactly what we do.   However, we are optimizing time, not space.
> the value is always stored in compressed form, i.e the same representation
> as is used inside the CONST_WIDE_INTs and INT_CSTs. this makes the
> transformation between them very fast.   And since we do this a lot, it
> needs to be fast.   So the len is the number of HWIs needed to represent the
> value which is typically less than what would be implied by the precision.


But doesn't this require a sign?  Otherwise how do you encode TImode 0xffffffff?
Would len be 1 here?  (and talking about CONST_WIDE_INT vs CONST_INT,
wouldn't CONST_INT be used anyway for all ints we can encode "small"?  Or
is it (hopefully) the goal to replace CONST_INT with CONST_WIDE_INT
everywhere?)  Or do you say wide_int is always "signed', thus TImode 0xffffffff
needs len == 2 and an explicit zero upper HWI word?  Or do you say wide_int
is always "unsigned", thus TImode -1 needs len == 2?  Noting that double-ints
were supposed to be twos-complement (thus, 'unsigned') numbers having
implicit non-zero bits sounds error-prone.

That said - I don't see any value in optimizing storage for short-lived
(as you say) wide-ints (apart from limiting it to the mode size).  For
example mutating operations on wide-ints are not really possible
if you need to extend storage.

>> +  enum Op {
>> +    NONE,
>>
>> we don't have sth like this in double-int.h, why was the double-int
>> mechanism
>> not viable?
>
> i have chosen to use enums for things rather than passing booleans.

But it's bad to use an enum with 4 values for a thing that can only possibly
expect two.  You cannot optimize tests as easily.  Please retain the bool uns
parameters instead.

Richard.

Re: patch to fix constant math

Reply via email to