Emilio G. Cota writes:
> Performance results for fp-bench:
>
> Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
> - before:
> sqrt-single: 42.30 MFlops
> sqrt-double: 22.97 MFlops
> - after:
> sqrt-single: 311.42 MFlops
> sqrt-double: 311.08 MFlops
>
> Here USE_FP makes a huge difference for f64's, with throughput
> going from ~200 MFlops to ~3