On 10/17/2013 07:49 AM, Richard Biener wrote:
On Thu, 17 Oct 2013, Kenneth Zadeck wrote:

On 10/17/2013 04:46 AM, Richard Biener wrote:
The case actually comes up on the ppc because they do a lot of 128-bit
math.  I think I got through the x86-64 without noticing this.
Well, it'd be suspicious if we're directly using 128-bit numbers
in addr_wide_int.  The justification for the assertion was that we
should explicitly truncate to addr_wide_int when deliberately
ignoring upper bits, beyond bit or byte address width.  128 bits
definitely falls into that category on powerpc.
My question is whether, with an 8-bit HWI, 0x00 0xff 0xff is a valid
wide-int value if it has precision 16.  AFAIK that is what the
code produces, but now Kenny says this is only for some kinds
of wide-int but not all?  That is, why is
The issue is not that the rules are different between the different flavors of
wide int; it is that the circumstances are different.  The rule is that bits
above the precision exist only when the precision is not an even multiple of
HOST_BITS_PER_WIDE_INT, and in that case those bits are always an extension
of the sign bit.

max_wide_int and addr_wide_int have large enough precisions that it is
impossible to generate an unsigned number on the target big enough to run
up against the precision.  However, for the fixed-precision case you can,
and at least on the ppc you do, generate unsigned numbers that are big
enough to have to be recanonicalized like this.

inline wi::storage_ref
wi::int_traits <const_tree>::decompose (HOST_WIDE_INT *scratch,
                                        unsigned int precision,
                                        const_tree x)
{
  unsigned int len = TREE_INT_CST_NUNITS (x);
  const HOST_WIDE_INT *val
    = (const HOST_WIDE_INT *) &TREE_INT_CST_ELT (x, 0);
  return wi::storage_ref (val, len, precision);
}

not a valid implementation together with making sure that the
INTEGER_CST tree rep has that extra word of zeros if required?
I would like to see us move in that direction (given that the
tree rep of INTEGER_CST has transitioned to variable-length already).

Btw, code such as

tree
wide_int_to_tree (tree type, const wide_int_ref &pcst)
{
...
    unsigned int small_prec = prec & (HOST_BITS_PER_WIDE_INT - 1);
    bool recanonize = sgn == UNSIGNED
      && small_prec
      && (prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT == len;

definitely needs a comment.  I would have thought _all_ unsigned
numbers need re-canonicalization.  Well, maybe only if we're
forcing the extra zeros.
It will get a comment.
[This function shows another optimization issue:

      case BOOLEAN_TYPE:
        /* Cache false or true.  */
        limit = 2;
        if (wi::leu_p (cst, 1))
          ix = cst.to_uhwi ();

I would have expected cst <= 1 to be optimized to cst.len == 1 &&
cst.val[0] <= 1.  It expands to

<L27>:
    MEM[(long int *)&D.50698 + 16B] = 1;
    MEM[(struct wide_int_ref_storage *)&D.50698]
      = &MEM[(struct wide_int_ref_storage *)&D.50698].scratch;
    MEM[(struct wide_int_ref_storage *)&D.50698 + 8B] = 1;
    MEM[(struct wide_int_ref_storage *)&D.50698 + 12B] = 32;
    _277 = MEM[(const struct wide_int_storage *)&cst + 260B];
    if (_277 <= 64)
      goto <bb 42>;
    else
      goto <bb 43>;

    <bb 42>:
    xl_491 = zext_hwi (1, 32);  // ok, checking enabled and thus out-of-line
    _494 = MEM[(const long int *)&cst];
    _495 = (long unsigned int) _494;
    yl_496 = zext_hwi (_495, _277);
    _497 = xl_491 < yl_496;
    goto <bb 44>;

    <bb 43>:
    _503 = wi::ltu_p_large (&MEM[(struct wide_int_ref_storage
                            *)&D.50698].scratch, 1, 32,
                            &MEM[(const struct wide_int_storage
                            *)&cst].val, len_274, _277);

this keeps D.50698 and cst un-SRAable - inline storage is problematic
for this reason.  But the representation should guarantee the
compare with a low precision (32 bit) constant is evaluatable
at compile-time if len of the larger value is > 1, no?

    <bb 44>:
    # _504 = PHI <_497(42), _503(43)>
    D.50698 ={v} {CLOBBER};
    if (_504 != 0)
      goto <bb 45>;
    else
      goto <bb 46>;

    <bb 45>:
    pretmp_563 = MEM[(const struct wide_int_storage *)&cst + 256B];
    goto <bb 229> (<L131>);

    <bb 46>:
    _65 = generic_wide_int<wide_int_storage>::to_uhwi (&cst, 0);
    ix_66 = (int) _65;
    goto <bb 91>;

The question is whether we should try to optimize wide-int for
such cases or simply not use wi::leu_p (cst, 1) but rather

   if (cst.fits_uhwi_p () && cst.to_uhwi () <= 1)

?
I find this ugly, but I see where you are coming from.  The problem is that
both you and I know that the len has to be 1, but the optimizer does not.
This is a case where I think we made a mistake getting rid of
wi::one_p, wi::zero_p and wi::minus_one_p.  The internals of one_p were
return (len == 1 && val[0] == 1), and I think that is much nicer than what
you put there.  On the other hand, it seems that a person more skilled than
I am with C++ could specialize the comparisons with an integer constant,
since I believe that constant must fit in one HWI (I am a little concerned
about large unsigned constants).
one_p and zero_p wouldn't catch a compare with 3 ;)  So yes, it looks
like we need something of that sort if we cannot trick the optimizers
into doing the right thing, like with
inline bool
wi::lts_p (const wide_int_ref &x, const wide_int_ref &y)
{
   if (x.precision <= HOST_BITS_PER_WIDE_INT
       && y.precision <= HOST_BITS_PER_WIDE_INT)
     return x.slow () < y.slow ();
   else if (x.precision <= HOST_BITS_PER_WIDE_INT)   // why not len == 1?
     ... optimize
   else if (y.precision <= HOST_BITS_PER_WIDE_INT)
     ... optimize
   else
     return lts_p_large (x.val, x.len, x.precision, y.val, y.len,
                         y.precision);
}

which of course pessimizes the case where the precisions are not
known at compile-time.  So maybe the above should always be

   if (__builtin_constant_p (x.precision)
       && x.precision <= HOST_BITS_PER_WIDE_INT
      ....

and lts_p_large should handle the small precision case efficiently
(but out-of-line?)

Richard.

What I was thinking about was to overload lts_p and its friends so that the case of the second arg being an int or a long is handled separately.  In the case of any int and signed long, you know that the value can always be checked with a len check and a single compare.  Unsigned longs can cause trouble, so you may not want to overload that.

The part that I do not know how to do is to write the signature of the function.  If I just make the sig of the second arg HOST_WIDE_INT, am I safe to get int, unsigned int and long but not unsigned long?
