[Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

2018-02-03 Thread Richard Henderson
As discussed on list, the structure and inline function solution that
Alex and I have been writing from scratch introduces a sizeable
performance regression.  Alex and I have done some work earlier
in the week that improved things some, but not enough.

Which leaves us with a bit of a problem.  There were two existing
code bases that we originally considered:

There's softfloat v3, which would need a large structural reorg in
order to be able to handle multiple float_status contexts.  But when
Alex communicated with upstream they weren't ready to accept patches.
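
To make "multiple float_status contexts" concrete: the point is that rounding mode and exception flags travel with each call through a status pointer, rather than living in library-wide state. The following is only a toy sketch of that calling convention. The names toy_float_status and toy_halve are hypothetical, and it works on integers rather than real floating-point formats; it is neither QEMU nor SoftFloat code.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a float_status-like context. */
enum toy_round { TOY_ROUND_NEAREST_EVEN, TOY_ROUND_TO_ZERO };
#define TOY_FLAG_INEXACT 1u

typedef struct toy_float_status {
    enum toy_round rounding_mode;
    unsigned exception_flags;
} toy_float_status;

/*
 * Halve x under the rounding mode stored in *s.  Any "inexact"
 * condition is recorded in *s only, so two contexts never interfere.
 */
int32_t toy_halve(int32_t x, toy_float_status *s)
{
    int32_t q = x / 2;                  /* C division truncates toward zero */

    if (x % 2 != 0) {
        s->exception_flags |= TOY_FLAG_INEXACT;
        /* Exact tie (.5): nearest-even picks the even neighbour. */
        if (s->rounding_mode == TOY_ROUND_NEAREST_EVEN && q % 2 != 0) {
            q += (x > 0) ? 1 : -1;
        }
    }
    return q;
}
```

With two independent contexts, one per emulated CPU say, the same input can round differently and raise flags in only one of them.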

Or there's the code from glibc.  I know Peter didn't like the idea;
debugging this code is fairly painful -- the massive preprocessor
macros mean that you can't step through anything.  But at least we
have a good relationship with glibc, so merging patches back and
forth should be easy.
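
As a rough illustration of the debugging complaint, compare a statement macro with an equivalent inline function. This is a generic sketch with hypothetical names (TOY_SAT_ADD_U8, toy_sat_add_u8); it is not taken from soft-fp, whose macros are far larger.

```c
#include <stdint.h>

/*
 * Macro style: the whole body expands onto the single source line of
 * the call site, so a debugger has no separate lines to step through.
 */
#define TOY_SAT_ADD_U8(R, X, Y)                                  \
    do {                                                         \
        unsigned toy_sum_ = (unsigned)(X) + (unsigned)(Y);       \
        (R) = (uint8_t)(toy_sum_ > 255u ? 255u : toy_sum_);      \
    } while (0)

/*
 * Function style: same computation, but each statement sits on its
 * own line, so line-by-line stepping and breakpoints work as usual.
 */
static inline uint8_t toy_sat_add_u8(uint8_t x, uint8_t y)
{
    unsigned sum = (unsigned)x + (unsigned)y;

    if (sum > 255u) {
        sum = 255u;
    }
    return (uint8_t)sum;
}
```

Both forms compute the same saturating sum; the difference is only in how much source-level structure survives for the debugger.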

The result seems to perform slightly better than mainline.
With an aarch64 guest and an i7-8550U host, nbench gives

- FLOATING-POINT INDEX: 3.095
+ FLOATING-POINT INDEX: 3.438

I've also run this through my usual set of aarch64 RISU tests.

Thoughts?


r~


Alex Bennée (9):
  fpu/softfloat: implement float16_squash_input_denormal
  include/fpu/softfloat: remove USE_SOFTFLOAT_STRUCT_TYPES
  fpu/softfloat-types: new header to prevent excessive re-builds
  target/*/cpu.h: remove softfloat.h
  include/fpu/softfloat: implement float16_abs helper
  include/fpu/softfloat: implement float16_chs helper
  include/fpu/softfloat: implement float16_set_sign helper
  include/fpu/softfloat: add some float16 constants
  fpu/softfloat: improve comments on ARM NaN propagation

Richard Henderson (15):
  fpu/soft-fp: Import soft-fp from glibc
  fpu/soft-fp: Adjust soft-fp types
  fpu/soft-fp: Add ties_away and to_odd rounding modes
  fpu/soft-fp: Add arithmetic macros to half.h
  fpu/soft-fp: Adjust _FP_CMP_CHECK_NAN
  fpu: Implement add/sub/mul/div with soft-fp.h
  fpu: Implement float_to_int/uint with soft-fp.h
  fpu: Implement int/uint_to_float with soft-fp.h
  fpu: Implement compares with soft-fp.h
  fpu: Implement min/max with soft-fp.h
  fpu: Implement sqrt with soft-fp.h
  fpu: Implement scalbn with soft-fp.h
  fpu: Implement float_to_float with soft-fp.h
  fpu: Implement muladd with soft-fp.h
  fpu: Implement round_to_int with soft-fp.h

 Makefile.target                 |    5 +
 fpu/double.h                    |  321 +++
 fpu/half.h                      |  180 ++
 fpu/op-1.h                      |  369 +++
 fpu/op-2.h                      |  705 ++
 fpu/op-4.h                      |  875 +++
 fpu/op-8.h                      |    1 +
 fpu/op-common.h                 | 2154 +
 fpu/quad.h                      |  328 +++
 fpu/sfp-machine.h               |  222 ++
 fpu/single.h                    |  197 ++
 fpu/soft-fp-specialize.h        |  254 ++
 fpu/soft-fp.h                   |  379 +++
 fpu/softfloat-specialize.h      |  273 +--
 include/fpu/softfloat-types.h   |  179 ++
 include/fpu/softfloat.h         |  254 +-
 include/qemu/bswap.h            |    2 +-
 target/alpha/cpu.h              |    2 -
 target/arm/cpu.h                |    2 -
 target/hppa/cpu.h               |    1 -
 target/i386/cpu.h               |    4 -
 target/m68k/cpu.h               |    1 -
 target/microblaze/cpu.h         |    2 +-
 target/moxie/cpu.h              |    1 -
 target/nios2/cpu.h              |    1 -
 target/openrisc/cpu.h           |    1 -
 target/ppc/cpu.h                |    1 -
 target/s390x/cpu.h              |    2 -
 target/sh4/cpu.h                |    2 -
 target/sparc/cpu.h              |    2 -
 target/tricore/cpu.h            |    1 -
 target/unicore32/cpu.h          |    1 -
 target/xtensa/cpu.h             |    1 -
 fpu/float128.c                  |   35 +
 fpu/float16.c                   |   43 +
 fpu/float32.c                   |   35 +
 fpu/float64.c                   |   35 +
 fpu/floatconv.c                 |  154 ++
 fpu/floatxx.inc.c               |  541 +
 fpu/softfloat.c                 | 5092 +--
 target/arm/cpu.c                |    1 +
 target/arm/helper-a64.c         |    1 +
 target/arm/helper.c             |    1 +
 target/arm/neon_helper.c        |    1 +
 target/hppa/cpu.c               |    1 +
 target/hppa/op_helper.c         |    1 +
 target/i386/fpu_helper.c        |    1 +
 target/m68k/cpu.c               |    2 +-
 target/m68k/fpu_helper.c        |    1 +
 target/m68k/helper.c            |    1 +
 target/m68k/translate.c         |    2 +
 target/microblaze/cpu.c         |    1 +
 target/microblaze/op_helper.c   |    1 +
 target/openrisc/fpu_helper.c    |    1 +
 target/ppc/fpu_helper.c         |    1 +
 target/ppc/int_helper.c         |    1 +
 target/ppc/translate_init.c     |    1 +
 target/s390x/cpu.c              |    1 +
 target/s390x/fpu_helper.c       |    1 +
 target/sh4/cpu.c                |    1 +
 target/sh4/op_helper.c          |    1 +
 target/sparc/fop_helper.c       |    1 +

Re: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

2018-02-04 Thread Howard Spoelstra
On Sun, Feb 4, 2018 at 5:11 AM, Richard Henderson
 wrote:
> As discussed on list, the structure and inline function solution that
> Alex and I have been writing from scratch introduces a sizeable
> performance regression.  Alex and I have done some work earlier
> in the week that improved things some, but not enough.
>
> Which leaves us with a bit of a problem.  There were two existing
> code bases that we originally considered:
>
> There's softfloat v3, which would need a large structural reorg in
> order to be able to handle multiple float_status contexts.  But when
> Alex communicated with upstream they weren't ready to accept patches.
>
> Or there's the code from glibc.  I know Peter didn't like the idea;
> debugging this code is fairly painful -- the massive preprocessor
> macros mean that you can't step through anything.  But at least we
> have a good relationship with glibc, so merging patches back and
> forth should be easy.
>
> The result seems to perform slightly better than mainline.
> With an aarch64 guest and an i7-8550U host, nbench gives
>
> - FLOATING-POINT INDEX: 3.095
> + FLOATING-POINT INDEX: 3.438
>
> I've also run this through my usual set of aarch64 RISU tests.
>
> Thoughts?
>
>
Hi,

Thanks for looking into this. It seems this code builds neither on OSX
Sierra nor when cross-compiling for Windows on Fedora 27:

In file included from /Users/hsp/src/qemu-softfloatglibc/fpu/float16.c:20:
/Users/hsp/src/qemu-softfloatglibc/fpu/soft-fp.h:50:4: error:
"endianness not defined by sfp-machine.h"
#  error "endianness not defined by sfp-machine.h"
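
For context, the error text suggests that soft-fp.h expects sfp-machine.h to supply a byte-order definition on hosts where it cannot get one from the C library. One possible way to provide one, sketched here with hypothetical SKETCH_* names and assuming a GCC or Clang host compiler (both predefine __BYTE_ORDER__), would be something like the following. It is an illustration of the idea, not the fix for this series.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; not the real sfp-machine.h macros. */
#define SKETCH_LITTLE_ENDIAN 1234
#define SKETCH_BIG_ENDIAN    4321

/* Derive the host byte order from compiler-predefined macros. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) \
    && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define SKETCH_BYTE_ORDER SKETCH_LITTLE_ENDIAN
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) \
    && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define SKETCH_BYTE_ORDER SKETCH_BIG_ENDIAN
#else
# error "endianness not known to this sketch"
#endif

/* Runtime cross-check: look at the first byte of the integer 1. */
int sketch_runtime_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

Because this relies only on compiler predefines, it does not need <endian.h>, which is the header that tends to be missing on macOS and mingw hosts.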

Best,
Howard



Re: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

2018-02-04 Thread Peter Maydell
On 4 February 2018 at 04:11, Richard Henderson
 wrote:
> Or there's the code from glibc.  I know Peter didn't like the idea;
> debugging this code is fairly painful -- the massive preprocessor
> macros mean that you can't step through anything.  But at least we
> have a good relationship with glibc, so merging patches back and
> forth should be easy.

Yeah. I didn't like dealing with this code two decades ago
when I first encountered it, and it hasn't improved any.
It's pretty much write-only code, and it isn't going to be
any fun for debugging.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

2018-02-06 Thread Alex Bennée

Peter Maydell  writes:

> On 4 February 2018 at 04:11, Richard Henderson
>  wrote:
>> Or there's the code from glibc.  I know Peter didn't like the idea;
>> debugging this code is fairly painful -- the massive preprocessor
>> macros mean that you can't step through anything.  But at least we
>> have a good relationship with glibc, so merging patches back and
>> forth should be easy.
>
> Yeah. I didn't like dealing with this code two decades ago
> when I first encountered it, and it hasn't improved any.
> It's pretty much write-only code, and it isn't going to be
> any fun for debugging.

I think I've managed to pull the performance back on softfloat-v4 thanks
to the attribute(flatten) changes to addsub/div/mul/muladd.
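
For readers unfamiliar with it, the flatten function attribute (GCC and Clang) asks the compiler to inline, where possible, every call made inside the annotated function. A minimal sketch of the pattern, with hypothetical helper names (toy_unpack, toy_repack, toy_op) that only stand in for the real unpack/operate/repack helpers:

```c
#include <stdint.h>

/* Small helpers of the kind one would like folded into the caller. */
static inline uint32_t toy_unpack(uint32_t x) { return x >> 1; }
static inline uint32_t toy_repack(uint32_t x) { return x << 1; }

/* Only GCC-compatible compilers are assumed to understand the attribute. */
#if defined(__GNUC__)
# define TOY_FLATTEN __attribute__((flatten))
#else
# define TOY_FLATTEN
#endif

/*
 * With flatten, the calls to toy_unpack/toy_repack below are inlined
 * into toy_op if the compiler is able to, without having to mark
 * every helper always_inline.
 */
TOY_FLATTEN uint32_t toy_op(uint32_t a, uint32_t b)
{
    return toy_repack(toy_unpack(a) + toy_unpack(b));
}
```

The attribute changes code generation only; the result of toy_op is the same with or without it.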

--
Alex Bennée



Re: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

2018-02-08 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180204041136.17525-1-richard.hender...@linaro.org
Subject: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: The remote end hung up unexpectedly
error: Could not fetch 3c8cf5a9c21ff8782164d1def7f44bd888713384
Traceback (most recent call last):
  File "/usr/bin/patchew", line 442, in test_one
    git_clone_repo(clone, r["repo"], r["head"], logf)
  File "/usr/bin/patchew", line 48, in git_clone_repo
    stdout=logf, stderr=logf)
  File "/usr/lib64/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'remote', 'add', '-f', '--mirror=fetch', '3c8cf5a9c21ff8782164d1def7f44bd888713384', 'https://github.com/patchew-project/qemu']' returned non-zero exit status 1.



---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

2018-02-09 Thread no-reply
Hi,

This series failed docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can
probably reproduce it locally.

Type: series
Message-id: 20180204041136.17525-1-richard.hender...@linaro.org
Subject: [Qemu-devel] [PATCH 00/24] re-factor and add fp16 using glibc soft-fp

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
54bfdc61a5 fpu: Implement round_to_int with soft-fp.h
cea1e682ca fpu: Implement muladd with soft-fp.h
70b4528db0 fpu: Implement float_to_float with soft-fp.h
b4c110b84f fpu: Implement scalbn with soft-fp.h
5b30da2aa0 fpu: Implement sqrt with soft-fp.h
a4e5a62d79 fpu: Implement min/max with soft-fp.h
14c4d2bcdf fpu: Implement compares with soft-fp.h
f54c32c68c fpu: Implement int/uint_to_float with soft-fp.h
a8f491ad0e fpu: Implement float_to_int/uint with soft-fp.h
6d17d64dde fpu: Implement add/sub/mul/div with soft-fp.h
017e0c5da1 fpu/soft-fp: Adjust _FP_CMP_CHECK_NAN
4017e90c32 fpu/soft-fp: Add arithmetic macros to half.h
b60dd3e9a3 fpu/soft-fp: Add ties_away and to_odd rounding modes
207fb14412 fpu/soft-fp: Adjust soft-fp types
c319fb5b30 fpu/soft-fp: Import soft-fp from glibc
c45db50d7c fpu/softfloat: improve comments on ARM NaN propagation
1e8300958f include/fpu/softfloat: add some float16 constants
c1f9b7e53d include/fpu/softfloat: implement float16_set_sign helper
b592870a92 include/fpu/softfloat: implement float16_chs helper
6044c4568a include/fpu/softfloat: implement float16_abs helper
e08ee277b9 target/*/cpu.h: remove softfloat.h
5e9ee7ddaa fpu/softfloat-types: new header to prevent excessive re-builds
7f1df09de5 include/fpu/softfloat: remove USE_SOFTFLOAT_STRUCT_TYPES
ed6075583d fpu/softfloat: implement float16_squash_input_denormal

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-cgiev8vr/src/dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
  BUILD   fedora
  GEN     /var/tmp/patchew-tester-tmp-cgiev8vr/src/docker-src.2018-02-08-16.06.27.17466/qemu.tar
Cloning into '/var/tmp/patchew-tester-tmp-cgiev8vr/src/docker-src.2018-02-08-16.06.27.17466/qemu.tar.vroot'...
done.
Checking out files:  29% (1729/5796)   
Checking out files