On 27/09/16 23:50 +0200, Marc Glisse wrote:
>On Tue, 27 Sep 2016, Jonathan Wakely wrote:
>
>>This adds the new 3D std::hypot() functions. This implementation seems
>>to be faster than the naïve sqrt(x*x + y*y + z*z) implementation, or
>>hypot(hypot(x, y), z), and should be a bit more accurate at very large
>>or very small values due to reducing the arguments by the largest one.
>>Improvements welcome though, as this is not my forte.
>
>I understand the claims about accuracy, but the one about speed seems
>very surprising to me. Was your test specifically with denormals on
>not-so-recent hardware, or specifically when sqrt does a library call
>for errno? Otherwise, I have a hard time believing that 3
>multiplications and 2 additions can be slower than 4 multiplications,
>2 additions, plus a bunch of tests and divisions.

I didn't test any denormals, just similar inputs to the ones in the
new testcase. Looking at my crappy benchmark again it seems I'd
commented out the sqrt(x*x+y*y+z*z) version in favour of a call to
gsl_hypot3 ... oops!

So what I committed is faster than hypot(hypot(x, y), z) but as you
guessed, slower than the simple sqrt(x*x+y*y+z*z).
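
For reference, the three variants being compared look roughly like the
sketch below. This is only an illustration of the approaches, not the
code that was committed: the names hypot3_naive, hypot3_nested and
hypot3_scaled are made up here, and NaN handling is deliberately
ignored.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical names for illustration; not the libstdc++ implementation.

// Simple version: 3 multiplications, 2 additions, 1 sqrt.
// Fast, but x*x can overflow or underflow for extreme inputs.
double hypot3_naive(double x, double y, double z)
{
    return std::sqrt(x * x + y * y + z * z);
}

// Nested version: relies on the 2D std::hypot to avoid
// intermediate overflow, at the cost of two library calls.
double hypot3_nested(double x, double y, double z)
{
    return std::hypot(std::hypot(x, y), z);
}

// Scaled version: reduce the arguments by the largest magnitude so
// the squares stay in range, then scale the result back up.
// Assumption: NaN inputs are not handled specially in this sketch.
double hypot3_scaled(double x, double y, double z)
{
    x = std::fabs(x);
    y = std::fabs(y);
    z = std::fabs(z);
    const double m = std::max({x, y, z});
    if (m == 0.0 || std::isinf(m))
        return m; // avoid 0/0 and inf/inf below
    x /= m;
    y /= m;
    z /= m;
    return m * std::sqrt(x * x + y * y + z * z);
}
```

With inputs such as (1e200, 1e200, 1e200) the naive version overflows
to infinity while the scaled one stays finite, which is the accuracy
argument. The extra fabs calls, comparisons and three divisions in the
scaled version are the likely reason it loses to the naive one on
speed.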

I'm happy to defer to you in terms of what would be a better
implementation, as I'm very unlikely to ever need to use this. I don't
know whether we should be optimizing for speed or accuracy. But since
nobody else had contributed an implementation (and this was the one
C++17 feature libc++ has that we don't :-) I decided to add it.
