On Friday, 27 June 2014 at 10:51:05 UTC, Manu via Digitalmars-d wrote:
On 27 June 2014 11:31, David Nadlinger via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
Hi all,

right now, the use of std.math over core.stdc.math can cause a huge performance problem in typical floating point graphics code. An instance of this has recently been discussed here in the "Perlin noise benchmark speed" thread [1], where even LDC, which already beat DMD by a factor of two, generated code more than twice as slow as that by Clang and GCC. Here, the use of floor() causes trouble. [2]
Besides the somewhat slow pure D implementations in std.math, the biggest problem is the fact that std.math almost exclusively uses reals in its API. When working with single- or double-precision floating point numbers, this is not only more data to shuffle around than necessary, but on x86_64 it requires the caller to transfer the arguments from the SSE registers onto the x87 stack and then convert the result back again. Needless to say, this is a serious performance hazard. In fact, this accounts for a 1.9x slowdown in the above benchmark with LDC.
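A minimal sketch of the hazard, assuming the real-only API of the time; the helper gridCell and the choice of core.stdc.math.floorf are mine, for illustration:

import std.math : floor;        // at the time: only real floor(real)
import core.stdc.math : floorf; // float floorf(float)

float gridCell(float x)
{
    // float r = floor(x); // would widen x to real, round-trip through
    // the x87 stack, and narrow the result back to float
    // The C single-precision variant keeps the value in SSE registers
    // under the x86_64 ABI:
    return floorf(x);
}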
Because of this, I propose to add float and double overloads (at the very least the double ones) for all of the commonly used functions in std.math. This is unlikely to break much code, but:

a) Somebody could rely on the fact that the calls effectively widen the calculation to 80 bits on x86 when using type deduction.

b) Additional overloads make e.g. "&floor" ambiguous without context, of course (see the sketch below).
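To illustrate point b, a hypothetical sketch of what callers would have to do once the overloads exist; assigning to an explicitly typed function pointer is one way to select an instance:

import std.math : floor;

void main()
{
    // auto p = &floor; // no longer picks out a unique overload
    double function(double) pd = &floor; // selects the double overload
    float  function(float)  pf = &floor; // selects the float overload
}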
What do you think?
Cheers,
David
[1] http://forum.dlang.org/thread/lo19l7$n2a$1...@digitalmars.com
[2] Fun fact: As the program happens to deal only with positive numbers, the author could have just inserted an int-to-float cast, sidestepping the issue altogether. All the other language implementations have the floor() call too, though, so it doesn't matter for this discussion.
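For reference, a minimal sketch of the cast trick mentioned in [2]; the function name is mine, and it is valid only under the stated assumption of non-negative inputs (below int.max):

// For 0 <= x < int.max, truncation toward zero equals floor(x),
// so an int round trip sidesteps the floor() call entirely:
float floorNonNegative(float x)
{
    return cast(float) cast(int) x;
}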
Totally agree.

Maintaining commitment to deprecated hardware which could be removed from the silicon at any time is a bit of a problem looking forwards.
Regardless of the decision about whether overloads are created, at the very least I'd suggest x64 should define real as double, since the x87 is deprecated and the x64 ABI uses the SSE unit. It makes no sense at all to use real under any general circumstances in x64 builds.
And aside from that, if you *think* you need real for precision, the truth is you probably have bigger problems.

Double already has massive precision. I find it's extremely rare to have precision problems even with float under most normal usage circumstances, assuming you are conscious of the relative magnitudes of your terms.
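A small demonstration of the magnitude point; the constants are arbitrary, and the spacing figures assume IEEE 754 binary32/binary64:

void main()
{
    // At a magnitude of 1e8, float's spacing between adjacent values
    // is 8.0, so a term of 1.0 vanishes entirely:
    float big = 1e8f;
    assert(big + 1.0f == big);

    // double's spacing at 1e8 is ~1.5e-8, so the same sum is exact:
    double bigD = 1e8;
    assert(bigD + 1.0 != bigD);
}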
I think real should stay how it is, as the largest hardware-supported floating point type on a system. What needs to change is dmd and phobos' default usage of real. Double should be the standard. People should be able to reach for real if they really need it, but normal D code should target the sweet spot that is double*.

I understand why the current situation exists. In 2000, x87 was the standard and the 80-bit precision came for free.
*The number of algorithms that are both numerically stable/correct and benefit significantly from >64-bit doubles is very small. The same can't be said for 32-bit floats (see the sketch below).
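A hedged illustration of that footnote; the iteration count is arbitrary, and exact results vary with platform rounding:

import std.stdio : writefln;

void main()
{
    // Naive summation of 0.1 ten million times: float drifts far from
    // the exact 1e6, while double stays within a tiny fraction of it.
    float  fs = 0;
    double ds = 0;
    foreach (i; 0 .. 10_000_000)
    {
        fs += 0.1f;
        ds += 0.1;
    }
    writefln("float: %s  double: %s", fs, ds);
}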