On Friday, 27 June 2014 at 11:10:57 UTC, John Colvin wrote:
On Friday, 27 June 2014 at 10:51:05 UTC, Manu via Digitalmars-d wrote:
On 27 June 2014 11:31, David Nadlinger via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
Hi all,

right now, the use of std.math over core.stdc.math can cause a huge performance problem in typical floating point graphics code. An instance of this has recently been discussed here in the "Perlin noise benchmark speed" thread [1], where even LDC, which already beat DMD by a factor of two, generated code more than twice as slow as that by Clang and GCC. Here, the use of floor() causes trouble. [2]
Besides the somewhat slow pure D implementations in std.math, the biggest problem is the fact that std.math almost exclusively uses reals in its API. When working with single- or double-precision floating point numbers, this is not only more data to shuffle around than necessary, but on x86_64 requires the caller to transfer the arguments from the SSE registers onto the x87 stack and then convert the result back again. Needless to say, this is a serious performance hazard. In fact, this accounts for a 1.9x slowdown in the above benchmark with LDC.
Because of this, I propose to add float and double overloads (at the very least the double ones) for all of the commonly used functions in std.math. This is unlikely to break much code, but:

a) Somebody could rely on the fact that the calls effectively widen the calculation to 80 bits on x86 when using type deduction.

b) Additional overloads make e.g. "&floor" ambiguous without context, of course.
What do you think?
Cheers,
David
[1] http://forum.dlang.org/thread/lo19l7$n2a$1...@digitalmars.com
[2] Fun fact: As the program happens to deal only with positive numbers, the author could have just inserted an int-to-float cast, sidestepping the issue altogether. All the other language implementations have the floor() call too, though, so it doesn't matter for this discussion.
Totally agree.
Maintaining commitment to deprecated hardware which could be removed from the silicon at any time is a bit of a problem looking forwards.

Regardless of the decision about whether overloads are created, at the very least, I'd suggest x64 should define real as double, since the x87 is deprecated, and the x64 ABI uses the SSE unit. It makes no sense at all to use real under any general circumstances in x64 builds.

And aside from that, if you *think* you need real for precision, the truth is, you probably have bigger problems.
Double already has massive precision. I find it's extremely rare to have precision problems even with float under most normal usage circumstances, assuming you are conscious of the relative magnitudes of your terms.
I think real should stay how it is, as the largest hardware-supported floating point type on a system. What needs to change is dmd and phobos' default usage of real. Double should be the standard. People should be able to reach for real if they really need it, but normal D code should target the sweet spot that is double*.

I understand why the current situation exists. In 2000 x87 was the standard and the 80bit precision came for free.

*The number of algorithms that are both numerically stable/correct and benefit significantly from > 64bit doubles is very small. The same can't be said for 32bit floats.
Totally agree!
Please add float and double overloads and make double the default. Sometimes float is just enough, but in most cases double should be used. If someone needs more precision than double can provide, then 80 bits will probably not be enough anyway.

IMHO, intrinsics should be used by default if possible.