On Wed, Jul 31, 2024 at 10:02 AM Hongtao Liu <crazy...@gmail.com> wrote:

> > > > > > On Tue, 30 Jul 2024, Richard Biener wrote:
> > > > > >
> > > > > > > > Oh, and please add a small comment why we don't use XFmode here.
> > > > > > >
> > > > > > > Will do.
> > > > > > >
> > > > > > >         /* Do not enable XFmode, there is padding in it and it suffers
> > > > > > >            from normalization upon load like SFmode and DFmode when
> > > > > > >            not using SSE.  */
> > > > > >
> > > > > > Is it really true? I have no evidence of FLDT performing normalization
> > > > > > (as mentioned in PR 114659, if it did, there would be no way to
> > > > > > spill/reload x87 registers).
> > > > >
> > > > > What mangling fld performs depends on the contents of the FP control
> > > > > word which is awkward.  IIRC there's at least one bug report that it
> > > > > turns sNaN into a qNaN; it seems I was wrong about denormals
> > > > > (when DM is not masked).  And yes, IIRC x87 instability is also
> > > > > related to spills (IIRC we spill in the actual mode of the reg, not in
> > > > > XFmode), but -fexcess-precision=standard should hopefully avoid that.
> > > > > It's also not clear whether all implementations conformed to the
> > > > > specs wrt extended-precision format loads.
> > > >
> > > > FYI, FLDT does not mangle long-double values and does not generate
> > > > exceptions. Please see [1], but ignore shadowed text and instead read
> > > > the "Floating-Point Exceptions" section. So, as far as hardware is
> > > > concerned, it *can* be used to transfer 10-byte values, but I don't
> > > > want to judge from the compiler PoV if this is the way to go. We can
> > > > enable it, perhaps temporarily to experiment a bit - it is easy to
> > > > disable if it causes problems.
> > > >
> > > > Let's CC Intel folks for their opinion, if it is worth using an aging
> > > > x87 to transfer 80-bit data.
> > > I prefer not to; in another hook, ix86_can_change_mode_class, we have
> > >
> > >   /* x87 registers can't do subreg at all, as all values are reformatted
> > >      to extended precision.  */
> > >   if (MAYBE_FLOAT_CLASS_P (regclass))
> > >     return false;
> >
> > No, the above applies to SFmode subreg of XFmode value, which is a
> > no-go. My question refers to the plain XFmode (80-bit) moves, where
> > x87 is used simply to:
> >
> > fldt mem1
> > ...
> > fstpt mem2
> >
> > where x87 is used to perform a move from one 80-bit location to the other.
> >
> > > I guess it eventually needs reload for XFmode.
> >
> > There are no reloads, as we would like to perform a bit-exact 80-bit
> > move, e.g. of an array of 10 chars.
> Oh, it's a memory copy.
> I suspect that the hardware doesn't enable memory renaming for x87
> instructions.
> So I prefer not.

OK. Richard, can you please mention the above in the comment explaining
why XFmode is rejected in the hook?

Later, we can perhaps benchmark XFmode move vs. generic memory copy to
get some hard data.

Thanks,
Uros.
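
For reference, the two ways of moving a 10-byte object discussed in this
thread could be compared with something like the sketch below. It is only an
illustration under stated assumptions: the function names (copy10_generic,
copy10_x87) are made up for this example and are not part of GCC, the x87
path assumes a GNU-style inline asm compiler on i386/x86-64, and other
targets simply fall back to memcpy. It is not the code the compiler would
emit for an XFmode move.

```c
#include <string.h>

/* Size of the x87 extended-precision payload (without padding). */
enum { X87_EXT_BYTES = 10 };

/* Generic path: a plain byte copy, bit-exact by definition. */
static void copy10_generic(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, X87_EXT_BYTES);
}

/* Hypothetical x87 path: load the 80-bit value onto the x87 stack and
   store it back with a pop.  The "st(7)" clobber asks the compiler to
   keep one x87 stack slot free for the push; "memory" tells it that
   the asm reads and writes memory behind the two pointers.  */
static void copy10_x87(unsigned char *dst, const unsigned char *src)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
    __asm__ volatile ("fldt (%1)\n\t"
                      "fstpt (%0)"
                      : /* no register outputs */
                      : "r" (dst), "r" (src)
                      : "memory", "st(7)");
#else
    /* Assumption: no x87 available, so use the generic copy instead. */
    memcpy(dst, src, X87_EXT_BYTES);
#endif
}
```

A benchmark along the lines suggested above would call each function in a
timed loop over the same buffers and would first check, e.g. with memcmp on
an sNaN bit pattern, that both paths leave the 10 bytes unchanged.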
