On Wed, Jul 31, 2024 at 10:48 AM Richard Biener <rguent...@suse.de> wrote:
>
> On Wed, 31 Jul 2024, Uros Bizjak wrote:
>
> > On Wed, Jul 31, 2024 at 10:24 AM Jakub Jelinek <ja...@redhat.com> wrote:
> > >
> > > On Wed, Jul 31, 2024 at 10:11:44AM +0200, Uros Bizjak wrote:
> > > > OK. Richard, can you please mention the above in the comment why
> > > > XFmode is rejected in the hook?
> > > >
> > > > Later, we can perhaps benchmark XFmode move vs. generic memory copy to
> > > > get some hard data.
> > >
> > > My (limited) understanding was that the hook would be used only for cases
> > > where we'd like to e.g. value number some SF/DF/XF etc. mode loads and 
> > > some
> > > subsequent loads from the same address with different mode but same size
> > > the same and replace say int or long long later load with 
> > > VIEW_CONVERT_EXPR
> > > of the result of the SF/SF mode load.  That is what was incorrect, because
> > > the load didn't preserve all the bits.  The patch would still keep doing
> > > normal SF/DF/XF etc. mode copies if that is all that happens in the 
> > > program,
> > > load some floating point value and store it elsewhere or as part of larger
> > > aggregate copy.
> >
> > So, the hook should allow everything besides SF/DFmode, simply:
> >
> >
> >     switch (GET_MODE_INNER (mode))
> >       {
> >       case SFmode:
> >       case DFmode:
> >         /* These suffer from normalization upon load when not using SSE.  */
> >         return !(ix86_fpmath & FPMATH_387);
> >       default:
> >         return true;
> >       }
>
> OK, I think I'll go with this then.  I'm now unsure whether the
> wrapper around the hook should reject modes with padding or if
> the supposed users (value-numbering and SRA) should deal with that
> issue separately.  I do wonder whether
>
> ADJUST_FLOAT_FORMAT (XF, (TARGET_128BIT_LONG_DOUBLE
>                           ? &ieee_extended_intel_128_format
>                           : TARGET_96_ROUND_53_LONG_DOUBLE
>                           ? &ieee_extended_intel_96_round_53_format
>                           : &ieee_extended_intel_96_format));
> ADJUST_BYTESIZE  (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 12);
> ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 4);
>
> unambiguously specifies where the padding is - m68k has
>
> FRACTIONAL_FLOAT_MODE (XF, 80, 12, ieee_extended_motorola_format);
>
> It's also not clear we can model a x87 10 byte memory copy in RTL since
> a mem:XF still touches 12 or 16 bytes - IIRC a store leaves
> possible padding as unspecified and not "masked out" even if
> the actual fstp will only store 10 bytes.

The hardware will never touch bytes outside 10 bytes range, the
padding is some artificial compiler thingy, so IMO it should be
handled before the hook is called. Please find attached the source I
have used to confirm that a) the copied bits will never be mangled and
b) there is no access outside the 10 bytes range. (BTW: these
particular values are to test the effect of leading bit 63, the
non-hidden normalized bit).

Thanks,
Uros.
int main ()
{
  volatile union cvt
  {
    short s[6];
    int i[3];
    long double d;
  } x, y;

  x.s[5] = 0x5a5a; // guard
  x.s[4] = 0xffff;
  x.s[3] = 0x4000;
  x.s[2] = 0x1;
  x.s[1] = 0x0;
  x.s[0] = 0x5555;

  __builtin_printf("%08x %08x %08x\n", x.i[2], x.i[1], x.i[0]);

  y.s[5] = 0xa5a5;  // guard
   
  asm ("" : "=t" (y.d) : "0" (x.d));

  __builtin_printf("%08x %08x %08x\n", y.i[2], y.i[1], y.i[0]);

  if (y.s[0] != x.s[0]
      || y.s[1] != x.s[1]
      || y.s[2] != x.s[2]
      || y.s[3] != x.s[3]
      || y.s[4] != x.s[4])
    __builtin_abort();

  return 0;
}

Reply via email to