Re: Fwd: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC

2020-03-05 Thread Richard Henderson
On 3/4/20 10:43 AM, G 3 wrote:
> I am intrigued by these vector instructions. Apple was really big on using
> them back in the day, so programs like QuickTime and iTunes definitely use
> them. I'm not sure if the PowerPC's AltiVec vector instructions map to host
> vector instructions already, but if they don't, mapping them would give us a
> huge speedup in certain places. Would anyone know if this was already done in
> QEMU?

They are, provided that your x86 host supports AVX, which should be everything
manufactured after about 2011.


r~



Fwd: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC

2020-03-04 Thread G 3
---------- Forwarded message ---------
From: G 3 
Date: Wed, Mar 4, 2020 at 1:35 PM
Subject: Re: [RFC PATCH v2] target/ppc: Enable hardfloat for PPC
To: BALATON Zoltan 




On Mon, Mar 2, 2020 at 6:16 PM BALATON Zoltan  wrote:

> On Mon, 2 Mar 2020, Richard Henderson wrote:
> > On 3/2/20 3:42 AM, BALATON Zoltan wrote:
> >>> The "hardfloat" option works (with other targets) only with IEEE 754
> >>> accumulative exceptions, when the most common of those exceptions, inexact,
> >>> has already been raised, and thus need not be raised a second time.
> >>
> >> Why exactly is it done that way? What are the differences between IEEE FP
> >> implementations that prevent using hardfloat most of the time instead of
> >> only using it in some (although supposedly common) special cases?
> >
> > While it is possible to read the host's IEEE exception word after the
> > hardfloat operation, there are two reasons that is undesirable:
> >
> > (1) It is *slow*.  So slow that it's faster to run the softfloat code
> > instead.  I thought it would be easier to find the benchmark numbers that
> > Emilio generated when this was tested, but I can't find them.
>
> I remember those benchmarks too, and the paper Alex referred to confirmed
> this as well. I've also found that enabling hardfloat for PPC without doing
> anything else is slightly slower (on a recent CPU; on older CPUs it could be
> even slower). Interestingly, however, it does give a speedup for vector
> instructions (maybe because they don't clear flags between each
> sub-operation). Does that mean these vector instruction helpers are also
> buggy regarding exceptions?
>
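
For readers following the quoted exchange, the condition Richard describes (use the host FPU only once the sticky inexact flag is already set) can be sketched roughly as below. This is a simplified illustration, not QEMU's code: GuestFpStatus, can_use_host_fpu, slow_add and guest_add are hypothetical names, the software path is faked with a TwoSum error check instead of a real softfloat routine, and overflow, underflow and invalid-operation handling are left out entirely.

```c
#include <stdbool.h>

/* Hypothetical, simplified guest FP state; not QEMU's actual float_status. */
enum { GUEST_FLAG_INEXACT = 1u << 0 };

typedef struct {
    unsigned flags;           /* sticky (accumulated) guest exception flags */
    bool round_nearest_even;  /* guest rounding mode matches the host default */
    unsigned fast_path_hits;  /* illustration only: counts host-FPU additions */
} GuestFpStatus;

/*
 * The host result can be used without reading the host exception word only
 * when the one flag the operation might raise (inexact) is already recorded.
 */
static bool can_use_host_fpu(const GuestFpStatus *s)
{
    return (s->flags & GUEST_FLAG_INEXACT) && s->round_nearest_even;
}

/*
 * Stand-in for the software path. It detects a rounding error with the
 * TwoSum error-free transformation (valid for round-to-nearest doubles
 * when nothing overflows). A real softfloat works on the bit pattern.
 * The volatile temporaries discourage reassociation and excess precision.
 */
static double slow_add(double a, double b, GuestFpStatus *s)
{
    volatile double sum = a + b;
    volatile double bb = sum - a;
    volatile double err = (a - (sum - bb)) + (b - bb);

    if (err != 0.0) {
        s->flags |= GUEST_FLAG_INEXACT;  /* result was rounded */
    }
    return sum;
}

/* Guest addition: take the fast path only when no new flag could appear. */
static double guest_add(double a, double b, GuestFpStatus *s)
{
    if (can_use_host_fpu(s)) {
        s->fast_path_hits++;
        return a + b;  /* inexact already set, nothing new to record */
    }
    return slow_add(a, b, s);
}
```

With a freshly cleared status, guest_add(1.0, 2.0, &st) goes through slow_add and leaves the flags untouched because the sum is exact. A rounded sum such as guest_add(1.0, 0x1p-60, &st) sets inexact, and every later addition then goes straight to the host FPU until the guest clears its flags again.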

I am intrigued by these vector instructions. Apple was really big on using
them back in the day, so programs like QuickTime and iTunes definitely use
them. I'm not sure if the PowerPC's AltiVec vector instructions map to host
vector instructions already, but if they don't, mapping them would give us a
huge speedup in certain places. Would anyone know if this was already done in
QEMU?