On Friday, 27 June 2014 at 10:51:05 UTC, Manu via Digitalmars-d wrote:
On 27 June 2014 11:31, David Nadlinger via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
Hi all,

right now, the use of std.math over core.stdc.math can cause a huge performance problem in typical floating point graphics code. An instance of this has recently been discussed here in the "Perlin noise benchmark speed" thread [1], where even LDC, which already beat DMD by a factor of two, generated code more than twice as slow as that by Clang and GCC. Here, the use of floor() causes trouble. [2]
Besides the somewhat slow pure D implementations in std.math, the biggest problem is the fact that std.math almost exclusively uses reals in its API. When working with single- or double-precision floating point numbers, this is not only more data to shuffle around than necessary, but on x86_64 it requires the caller to transfer the arguments from the SSE registers onto the x87 stack and then convert the result back again. Needless to say, this is a serious performance hazard. In fact, this accounts for a 1.9x slowdown in the above benchmark with LDC.
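A minimal sketch of the hazard, assuming the real-only API of the time; the helper gridCell and the choice of core.stdc.math.floorf are mine, for illustration:

import std.math : floor;        // at the time: only real floor(real)
import core.stdc.math : floorf; // float floorf(float)

float gridCell(float x)
{
    // float r = floor(x); // would widen x to real, round-trip through
    // the x87 stack, and narrow the result back to float
    // The C single-precision variant keeps the value in SSE registers
    // under the x86_64 ABI:
    return floorf(x);
}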
Because of this, I propose to add float and double overloads (at the very least the double ones) for all of the commonly used functions in std.math. This is unlikely to break much code, but:

a) Somebody could rely on the fact that the calls effectively widen the calculation to 80 bits on x86 when using type deduction.

b) Additional overloads make e.g. "&floor" ambiguous without context, of course (see the sketch below).
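To illustrate point b, a hypothetical sketch of what callers would have to do once the overloads exist; assigning to an explicitly typed function pointer is one way to select an instance:

import std.math : floor;

void main()
{
    // auto p = &floor; // no longer picks out a unique overload
    double function(double) pd = &floor; // selects the double overload
    float  function(float)  pf = &floor; // selects the float overload
}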
What do you think?
Cheers,
David
[1] http://forum.dlang.org/thread/lo19l7$n2a$1...@digitalmars.com
[2] Fun fact: As the program happens to deal only with positive numbers, the author could have just inserted an int-to-float cast, sidestepping the issue altogether. All the other language implementations have the floor() call too, though, so it doesn't matter for this discussion.
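For reference, a minimal sketch of the cast trick mentioned in [2]; the function name is mine, and it is valid only under the stated assumption of non-negative inputs (below int.max):

// For 0 <= x < int.max, truncation toward zero equals floor(x),
// so an int round trip sidesteps the floor() call entirely:
float floorNonNegative(float x)
{
    return cast(float) cast(int) x;
}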
Totally agree.

Maintaining commitment to deprecated hardware which could be removed from the silicon at any time is a bit of a problem looking forwards.
Regardless of the decision about whether overloads are created, at the very least I'd suggest x64 should define real as double, since the x87 is deprecated and the x64 ABI uses the SSE unit. It makes no sense at all to use real under any general circumstances in x64 builds.
And aside from that, if you *think* you need real for precision, the truth is you probably have bigger problems.

Double already has massive precision. I find it's extremely rare to have precision problems even with float under most normal usage circumstances, assuming you are conscious of the relative magnitudes of your terms.
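A small demonstration of the magnitude point; the constants are arbitrary, and the spacing figures assume IEEE 754 binary32/binary64:

void main()
{
    // At a magnitude of 1e8, float's spacing between adjacent values
    // is 8.0, so a term of 1.0 vanishes entirely:
    float big = 1e8f;
    assert(big + 1.0f == big);

    // double's spacing at 1e8 is ~1.5e-8, so the same sum is exact:
    double bigD = 1e8;
    assert(bigD + 1.0 != bigD);
}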
I think real should stay how it is, as the largest hardware-supported floating point type on a system. What needs to change is dmd and phobos' default usage of real. Double should be the standard. People should be able to reach for real if they really need it, but normal D code should target the sweet spot that is double*.

I understand why the current situation exists. In 2000, x87 was the standard and the 80-bit precision came for free.
*The number of algorithms that are both numerically stable/correct and benefit significantly from >64-bit doubles is very small. The same can't be said for 32-bit floats (see the sketch below).
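A hedged illustration of that footnote; the iteration count is arbitrary, and exact results vary with platform rounding:

import std.stdio : writefln;

void main()
{
    // Naive summation of 0.1 ten million times: float drifts far from
    // the exact 1e6, while double stays within a tiny fraction of it.
    float  fs = 0;
    double ds = 0;
    foreach (i; 0 .. 10_000_000)
    {
        fs += 0.1f;
        ds += 0.1;
    }
    writefln("float: %s  double: %s", fs, ds);
}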