v1: https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg05908.html

Changes from v1:

- Rename series from "hostfloat" to "hardfloat". The series already uses
  "host" as an option for fp-test, so this change should make things clearer.

- Rebase on top of master (4c2c101590).

- Move code from fpu/hostfloat.c to fpu/softfloat.c. I am not mentioning
  anything about the license; I read the softfloat-2a license and I'm OK
  with it. [ Laurent: thanks for the clarification on this. ]

- Fix target-m68k build breakage

- Merge is_normal and is_denormal additions into a single commit

- Add tricore patch to use float32_is_denormal

- Keep the flatten attribute for the soft-fp implementations that
  have now become a slow path

- Add the noinline attribute to the soft-fp primitives. Not doing
  this reduces performance significantly

- Add a comment about why dealing with denormals in hardfloat is
  a bad idea

- Keep separate float32 and float64 implementations for most ops. This
  improves performance as shown in the commit logs.
  + I'm keeping the macro-based definitions to make testing easier.
  + In v1 I wrongly reported similar float/double results for fp-bench;
    I noticed that in my testing I forgot to set -p single/double, so I was
    benchmarking only with the default precision (single). Ouch!

- Update commit logs with fresh (correct) numbers from fp-bench.

- Move some zero-input detection (addsub/div) *after* checking for
  <= min_normal. This makes the common case (i.e. not all inputs are zero)
  faster, while still allowing us to handle the 0-input cases in hardfloat.

- Update the commit log of the comparison patch to mention that
  int64_to_float32/64 are still in soft-fp and take quite a bit of
  execution time for fp-bench -o cmp.

- fp-test:
  + add *.txt to fp-test/.gitignore instead of just whitelist.txt

- fp-bench:
  + generate only positive numbers for testing sqrt
  + add -o cmp
  + use g_strjoinv to print the list of available ops in the
    help message
  + remove libc headers except math.h
  + use qemu/timer.h's get_clock_realtime instead of open-coding it
  + add entry to tests/Makefile.include to call fp-test/Makefile
    when building anything in tests/fp-test/
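
To illustrate the check ordering and the noinline slow path described in
the list above, here is a minimal standalone sketch for float32 addition.
It is not the code from the series: hard_add, soft_add, is_zero_or_normal
and slow_path_calls are hypothetical names, the "slow path" is a counting
placeholder rather than a real soft-fp implementation, and the noinline
attribute syntax assumes GCC or Clang.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical counter, only here to make the path taken observable. */
static int slow_path_calls;

/*
 * Placeholder for the soft-fp slow path. A real one would emulate the
 * operation in software and raise exception flags. It is kept out of
 * line so that the fast path stays small (GCC/Clang attribute syntax).
 */
static __attribute__((noinline)) float soft_add(float a, float b)
{
    slow_path_calls++;
    return a + b;
}

/* True for +/-0 and normal numbers; false for denormals, inf and NaN. */
static bool is_zero_or_normal(float x)
{
    return x == 0.0f || isnormal(x);
}

/* Sketch of a host-FPU fast path with a software fallback. */
static float hard_add(float a, float b)
{
    float r;

    /* Denormal, inf or NaN inputs are left to the slow path. */
    if (!is_zero_or_normal(a) || !is_zero_or_normal(b)) {
        return soft_add(a, b);
    }

    r = a + b;

    /* Simplification for this sketch: overflow goes to the slow path. */
    if (isinf(r)) {
        return soft_add(a, b);
    }

    /*
     * Only when the result is tiny do we look for zero inputs. The common
     * case (a result above min_normal) never pays for the zero checks.
     */
    if (fabsf(r) <= FLT_MIN) {
        if (!(a == 0.0f && b == 0.0f)) {
            return soft_add(a, b); /* possible underflow or inexact zero */
        }
    }
    return r;
}
```

With this ordering, hard_add(1.5f, 2.25f) and hard_add(0.0f, 0.0f) both stay
on the fast path, while hard_add(1.0f, -1.0f) has a tiny result with nonzero
inputs and so falls back to soft_add.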

Perf numbers are in the last patch. They are a little different from
last week's; I cannot replicate last week's performance (even with
the very same binaries; I might have to reboot the machine I'm using
soon), but as of today v2 is certainly faster than v1 (e.g. 5% faster
for nbench-fp).

I have checked all checkpatch warnings; they're all false positives.

You can fetch the series from:
  https://github.com/cota/qemu/tree/hardfloat-v2

Thanks,

                Emilio

diffstat:
 configure                   |    2 +
 fpu/softfloat.c             |  619 ++++++++++++++++++--
 include/fpu/softfloat.h     |   20 +
 target/tricore/fpu_helper.c |    9 +-
 tests/.gitignore            |    2 +
 tests/Makefile.include      |    6 +-
 tests/fp-bench.c            |  334 +++++++++++
 tests/fp-test/.gitignore    |    3 +
 tests/fp-test/Makefile      |   34 ++
 tests/fp-test/fp-test.c     | 1183 ++++++++++++++++++++++++++++++++++++++
 tests/fp-test/muladd.fptest |   51 ++
 11 files changed, 2212 insertions(+), 51 deletions(-)
 create mode 100644 tests/fp-bench.c
 create mode 100644 tests/fp-test/.gitignore
 create mode 100644 tests/fp-test/Makefile
 create mode 100644 tests/fp-test/fp-test.c
 create mode 100644 tests/fp-test/muladd.fptest
