Thomas Huth <th...@redhat.com> writes:
> On 2019-01-16 18:08, Alex Bennée wrote:
>>
>> Thomas Huth <th...@redhat.com> writes:
>>
>>> On 2019-01-15 21:05, Emilio G. Cota wrote:
>>>> On Tue, Jan 15, 2019 at 16:01:32 +0000, Alex Bennée wrote:
>>>>> Ahh I should have mentioned we already have the technology for this ;-)
>>>>>
>>>>> If you build the fpu/next tree on a s390x you can then run:
>>>>>
>>>>>   ./tests/fp/fp-bench f64_div
>>>>>
>>>>> with and without the CONFIG_128 path. To get an idea of the real world
>>>>> impact you can compile a foreign binary and run it on a s390x system
>>>>> with:
>>>>>
>>>>>   $QEMU ./tests/fp/fp-bench f64_div -t host
>>>>>
>>>>> And that will give you the peak performance assuming your program is
>>>>> doing nothing but f64_div operations. If the two QEMU's are basically in
>>>>> the same ballpark then it doesn't make enough difference. That said:
>>>>
>>>> I think you mean here `tests/fp/fp-bench -o div -p double', otherwise
>>>> you'll get the default op (-o add).
>>>
>>> I tried that now, too, and -o div -p double does not really seem to
>>> exercise this function at all.
>>
>> How do you mean? It should do because by default it should be calling
>> the softfloat implementations.
>
> I've added a puts("hello") into the udiv_qrnd() function. When I then
> run "fp-bench -o div -p double", it only prints out "hello" a single
> time, so the function is only called once during the whole test.

That's very odd. With the following on my aarch64 box:

modified   include/fpu/softfloat-macros.h
@@ -637,6 +637,8 @@ static inline uint64_t estimateDiv128To64(uint64_t a0, uint64_t a1, uint64_t b)
 static inline uint64_t udiv_qrnnd(uint64_t *r, uint64_t n1, uint64_t n0, uint64_t d)
 {
+    static int iter = 0;
+    fprintf(stderr, "%s: %d\n", __func__, iter++);
 #if defined(__x86_64__)
     uint64_t q;
     asm("divq %4" : "=a"(q), "=d"(*r) : "0"(n0), "1"(n1), "rm"(d));

I get:

  udiv_qrnnd: 0
  udiv_qrnnd: 1
  ..
  ..<all the way to>
  udiv_qrnnd: 99998
  udiv_qrnnd: 99999

So I'm not sure what is different on s390

--
Alex Bennée