Re: Speed of math function atan: comparison D and C++

2018-03-06 Thread jmh530 via Digitalmars-d-learn

On Tuesday, 6 March 2018 at 18:41:15 UTC, H. S. Teoh wrote:
> The fix itself may be straightforward, but how to do it without
> breaking tons of existing code and provoking user backlash is
> the tricky part.
>
> [snip]


Ah, I see what you're saying. People may be depending on the 
extra accuracy for these functions.


It would just require something like

  double sin(double x) @safe pure nothrow @nogc
  {
      version (FP_Math)
      {
          // double-precision sin implementation goes here
      }
      else
      {
          // current behaviour: compute in real, then narrow to double
          return sin(cast(real) x);
      }
  }


Re: Speed of math function atan: comparison D and C++

2018-03-06 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Mar 06, 2018 at 06:05:59PM +0000, jmh530 via Digitalmars-d-learn wrote:
> On Tuesday, 6 March 2018 at 17:51:54 UTC, H. S. Teoh wrote:
> > [snip]
> > 
> > I'm not advocating for getting *rid* of 80-bit float support, but
> > only to make it *optional* rather than the default, as currently
> > done in std.math.
[...]
> Aren't there two issues: 1) std.math functions that cast to real to
> perform calculations, 2) the compiler sometimes converts things to
> real in the background when people don't want it to.
> 
> Number 1 seems straightforward to fix. Introduce new versions of the
> std.math functions for float/double and the user can cast to real if
> the additional accuracy is necessary.

The fix itself may be straightforward, but how to do it without breaking
tons of existing code and provoking user backlash is the tricky part.


> Number 2 would require a compiler switch, I imagine.

It may not always be the compiler's fault. In the case of x87, it's the
hardware itself that internally promotes to 80-bit and truncates later.
IIRC, the original intent was that user code would only deal with
64-bit, and the 80-bit stuff would only happen inside the x87 (C, for
example, does not provide direct access to this type, except via vendor
extensions). However, due to the necessity to be able to save
intermediate computational states, there are instructions that can
load/extract 80-bit intermediate values to/from the x87, and eventually
people ended up just using these instructions for working with the
80-bit type directly.  You can stop the compiler from issuing these
instructions, but 64-bit doubles may still be internally converted by
the hardware to 80-bit intermediate values during computation.

But I suppose you could force the compiler to use SSE instructions for
double operations instead of x87, then it would bypass the 80-bit
intermediate values completely.
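
As a rough way to see which of the two situations described above applies to a given C++ toolchain, the standard `FLT_EVAL_METHOD` macro and the mantissa width of `long double` can be inspected. This is only a sketch: the helper names `describe_eval_method`, `current_eval_method` and `long_double_mantissa_bits` are made up for illustration, and what the macro reports depends on the compiler and its flags (with GCC on x86, for example, `-mfpmath=sse` versus `-mfpmath=387`).

```cpp
#include <cfloat>   // FLT_EVAL_METHOD
#include <limits>   // std::numeric_limits
#include <string>

// Translate a FLT_EVAL_METHOD value into words.
//   0 -> every operation is evaluated in the type of its operands
//        (what SSE2 code generation normally gives for double)
//   1 -> float and double operations are evaluated as double
//   2 -> all operations are evaluated as long double
//        (what classic x87 code generation gives)
//   anything else -> indeterminable or implementation-defined
std::string describe_eval_method(int method) {
    switch (method) {
        case 0:  return "evaluated in the operand type";
        case 1:  return "float and double evaluated as double";
        case 2:  return "everything evaluated as long double";
        default: return "indeterminable or implementation-defined";
    }
}

// What the current compiler and flags report.
std::string current_eval_method() {
    return describe_eval_method(FLT_EVAL_METHOD);
}

// Mantissa width of long double: 64 indicates the x87 80-bit format,
// 53 means long double has the same format as a 64-bit double.
constexpr int long_double_mantissa_bits() {
    return std::numeric_limits<long double>::digits;
}
```

Printing `current_eval_method()` once with each set of compiler flags is usually enough to tell whether double arithmetic is being widened behind your back.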


T

-- 
Being able to learn is a great learning; being able to unlearn is a greater 
learning.


Re: Speed of math function atan: comparison D and C++

2018-03-06 Thread jmh530 via Digitalmars-d-learn

On Tuesday, 6 March 2018 at 17:51:54 UTC, H. S. Teoh wrote:
> [snip]
>
> I'm not advocating for getting *rid* of 80-bit float support,
> but only to make it *optional* rather than the default, as
> currently done in std.math.
>
> T

Aren't there two issues: 1) std.math functions that cast to real 
to perform calculations, 2) the compiler sometimes converts 
things to real in the background when people don't want it to.


Number 1 seems straightforward to fix. Introduce new versions of 
the std.math functions for float/double and the user can cast to 
real if the additional accuracy is necessary.


Number 2 would require a compiler switch, I imagine.


Re: Speed of math function atan: comparison D and C++

2018-03-06 Thread H. S. Teoh via Digitalmars-d-learn
On Tue, Mar 06, 2018 at 08:12:57AM +0100, Robert M. Münch via 
Digitalmars-d-learn wrote:
> On 2018-03-05 20:11:06 +0000, H. S. Teoh said:
> 
> > Walter has been adamant that we should always compute std.math.*
> > functions with the `real` type, which on x86 maps to the non-IEEE
> > 80-bit floats.  However, 80-bit floats have been deprecated for a
> > while now,
> 
> Hi, do you have a reference for this? I can't believe this, as the
> 80-bit are pretty important for a lot of optimization algorithms. We
> use it all the time and it's absolutely necessary.
[...]

http://www.zdnet.com/article/nvidia-de-optimizes-physx-for-the-cpu/?tag=nl.e539

Quotation:

Intel started discouraging the use of x87 with the introduction
of the P4 in late 2000. AMD deprecated x87 since the K8 in 2003,
as x86-64 is defined with SSE2 support; VIA’s C7 has supported
SSE2 since 2005. In 64-bit versions of Windows, x87 is
deprecated for user-mode, and prohibited entirely in
kernel-mode. Pretty much everyone in the industry has
recommended SSE over x87 since 2005 and there are no reasons to
use x87, unless software has to run on an embedded Pentium or
486. 

I'm not advocating for getting *rid* of 80-bit float support, but only
to make it *optional* rather than the default, as currently done in
std.math.


T

-- 
Once bitten, twice cry...


Re: Speed of math function atan: comparison D and C++

2018-03-06 Thread Uknown via Digitalmars-d-learn

On Tuesday, 6 March 2018 at 08:20:05 UTC, J-S Caux wrote:
> On Tuesday, 6 March 2018 at 07:12:57 UTC, Robert M. Münch wrote:
> > On 2018-03-05 20:11:06 +0000, H. S. Teoh said:
>
> [snip]
> Now, with Uknown's trick of using the C math functions, I can
> reconsider. It's a bit of a "patch" but at least it works.


I'm glad I could help!


> In an ideal world, I'd like the language I use to:
> - have double-precision arithmetic with equal performance to
>   C/C++
> - have all basic mathematical functions implemented, including
>   for complex types
> - *big bonus*: have the ability to do extended-precision
>   arithmetic (integer, but most importantly (complex)
>   floating-point) on-the-fly if I so wish, without having to rely
>   on external libraries.


D has std.complex and inbuilt complex types, just like C [0][1]. 
I modified the mandelbrot generator on Wikipedia, using D's 
std.complex and didn't have too much of an issue with 
performance.[2]

Also, std.bigint and mir might be of interest to you.[3]

> C++ was always fine, with external libraries for extended
> precision, but D is so much more pleasant to use. Many of my
> colleagues are switching to e.g. Julia despite the performance
> costs, because it is by design a very maths/science-friendly
> language. D is however much closer to a whole stack of existing
> codebases, so switching to it would involve much less extensive
> refactoring.

There's a good chance D can interface with those libraries you
mentioned...


[0]: https://dlang.org/phobos/std_complex.html
[1]: https://dlang.org/phobos/core_stdc_complex.html
[2]: https://github.com/Sirsireesh/Khoj-2017/blob/master/Mandelbrot-set/mandelbrot.d

[3]: https://github.com/libmir


Re: Speed of math function atan: comparison D and C++

2018-03-06 Thread J-S Caux via Digitalmars-d-learn

On Tuesday, 6 March 2018 at 07:12:57 UTC, Robert M. Münch wrote:
> On 2018-03-05 20:11:06 +0000, H. S. Teoh said:
> > Walter has been adamant that we should always compute std.math.*
> > functions with the `real` type, which on x86 maps to the non-IEEE
> > 80-bit floats.  However, 80-bit floats have been deprecated for a
> > while now,
>
> Hi, do you have a reference for this? I can't believe this, as
> the 80-bit are pretty important for a lot of optimization
> algorithms. We use it all the time and it's absolutely necessary.
>
> > and pretty much nobody cares to improve their performance on
> > newer CPUs,
>
> Really?
>
> > focusing instead on SSE/MMX performance with 64-bit doubles.
> > People have been clamoring for using 64-bit doubles by default
> > rather than 80-bit floats, but so far Walter has refused to budge.
>
> IMO this is all driven by the GPU/AI hype that just (seems) to
> be happy with rough precision.


Speaking for myself, the reason I didn't switch from C++ to D
many years ago for all my scientific work is that for many
computations 64-bit precision is certainly sufficient, and the
performance I could get out of D (a factor of 4 to 6 slower in
my tests) was simply insufficient.


Now, with Uknown's trick of using the C math functions, I can 
reconsider. It's a bit of a "patch" but at least it works.


In an ideal world, I'd like the language I use to:
- have double-precision arithmetic with equal performance to C/C++
- have all basic mathematical functions implemented, including 
for complex types
- *big bonus*: have the ability to do extended-precision 
arithmetic (integer, but most importantly (complex) 
floating-point) on-the-fly if I so wish, without having to rely 
on external libraries.


C++ was always fine, with external libraries for extended 
precision, but D is so much more pleasant to use. Many of my 
colleagues are switching to e.g. Julia despite the performance 
costs, because it is by design a very maths/science-friendly 
language. D is however much closer to a whole stack of existing 
codebases, so switching to it would involve much less extensive 
refactoring.


Re: Speed of math function atan: comparison D and C++

2018-03-06 Thread Andrea Fontana via Digitalmars-d-learn

On Monday, 5 March 2018 at 20:11:06 UTC, H. S. Teoh wrote:
> Walter has been adamant that we should always compute
> std.math.* functions with the `real` type
>
> T


I don't understand why atan(float) returns real and atan(double)
returns real too. If I'm working with float, why does it return a
real? If you want to compute with real that's OK, but shouldn't it
be T atan(T) rather than real atan(T)?


I'm missing something.

Andrea
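
For comparison, here is how the "T atan(T)" shape asked about above looks on the C++ side, where `std::atan` is overloaded per floating-point type. This is only a sketch for illustration: `atan_extended` is a hypothetical helper, not part of any library, showing how extended precision could stay available as an explicit opt-in while the default keeps the caller's type.

```cpp
#include <cmath>
#include <type_traits>

// The result type of std::atan follows the argument type.
static_assert(std::is_same<decltype(std::atan(1.0f)), float>::value,
              "float in, float out");
static_assert(std::is_same<decltype(std::atan(1.0)), double>::value,
              "double in, double out");
static_assert(std::is_same<decltype(std::atan(1.0L)), long double>::value,
              "long double in, long double out");

// Hypothetical opt-in helper: compute in long double, then narrow back
// to the caller's type. Only callers that ask for it pay for the wider
// computation.
template <typename T>
T atan_extended(T x) {
    return static_cast<T>(std::atan(static_cast<long double>(x)));
}
```

With this shape, `std::atan(x)` on a `float` or `double` stays in that type, and the wider path is spelled out at the call site as `atan_extended(x)`.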


Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread Robert M. Münch via Digitalmars-d-learn

On 2018-03-05 20:11:06 +0000, H. S. Teoh said:

> Walter has been adamant that we should always compute std.math.*
> functions with the `real` type, which on x86 maps to the non-IEEE 80-bit
> floats.  However, 80-bit floats have been deprecated for a while now,

Hi, do you have a reference for this? I can't believe this, as the
80-bit are pretty important for a lot of optimization algorithms. We
use it all the time and it's absolutely necessary.

> and pretty much nobody cares to improve their performance on newer CPUs,

Really?

> focusing instead on SSE/MMX performance with 64-bit doubles.  People
> have been clamoring for using 64-bit doubles by default rather than
> 80-bit floats, but so far Walter has refused to budge.

IMO this is all driven by the GPU/AI hype that just (seems) to be happy
with rough precision.


--
Robert M. Münch
http://www.saphirion.com
smarter | better | faster



Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread psychoticRabbit via Digitalmars-d-learn

On Monday, 5 March 2018 at 06:01:27 UTC, J-S Caux wrote:
> So the codes are trivial, simply some check of raw speed:
>
>   double x = 0.0;
>   for (int a = 0; a < 10; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
>   double x = 0.0;
>   for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.


Should `a` be an int?

Make it a double ;-)



Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread jmh530 via Digitalmars-d-learn

On Monday, 5 March 2018 at 21:05:19 UTC, bachmeier wrote:
> I wonder if Ilya has worked on any of this for Mir.

Mir has sin and cos, but that's it. It looks like they use LLVM
intrinsics on LDC and then fall back to Phobos' implementation.


Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread bachmeier via Digitalmars-d-learn

On Monday, 5 March 2018 at 20:11:06 UTC, H. S. Teoh wrote:
> Walter has been adamant that we should always compute
> std.math.* functions with the `real` type, which on x86 maps to
> the non-IEEE 80-bit floats.  However, 80-bit floats have been
> deprecated for a while now, and pretty much nobody cares to
> improve their performance on newer CPUs, focusing instead on
> SSE/MMX performance with 64-bit doubles.  People have been
> clamoring for using 64-bit doubles by default rather than
> 80-bit floats, but so far Walter has refused to budge.


I wonder if Ilya has worked on any of this for Mir.


Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread H. S. Teoh via Digitalmars-d-learn
On Mon, Mar 05, 2018 at 06:39:21PM +0000, J-S Caux via Digitalmars-d-learn wrote:
[...]
> I've tested these two very basic representative codes:
> https://www.dropbox.com/s/b5o4i8h43qh1saf/test.cc?dl=0
> https://www.dropbox.com/s/zsaikhdoyun3olk/test.d?dl=0
> 
> Results:
> 
> C++:
> g++ (Apple LLVM version 7.3.0):  9.5 secs
> g++ (GCC 7.1.0):  10.7 secs
> 
> D:
> dmd :  35.5 secs
> dmd -release -inline -O : 29.5 secs
> ldc2 :  34.4 secs
> ldc2 -release -O : 31.5 secs
> 
> But now: using the core.stdc.math atan as per Uknown's suggestion:
> D:
> dmd:  9 secs
> dmd -release -inline -O :  6.8 secs
> ldc2 : 10 secs
> ldc2 -release -O :  6.5 secs   <- best
> 
> So indeed the difference is between the `std.math atan` versus the
> `core.stdc.math atan`. Thanks Uknown! Just knowing this trick could
> make the difference between me and other scientists switching over to
> D...
> 
> But now comes the question: can the D fundamental maths functions be
> propped up to be as fast as the C ones?

Walter has been adamant that we should always compute std.math.*
functions with the `real` type, which on x86 maps to the non-IEEE 80-bit
floats.  However, 80-bit floats have been deprecated for a while now,
and pretty much nobody cares to improve their performance on newer CPUs,
focusing instead on SSE/MMX performance with 64-bit doubles.  People
have been clamoring for using 64-bit doubles by default rather than
80-bit floats, but so far Walter has refused to budge.

But perhaps this time, we might have a strong case for pushing this into
D.  IMO, it has been long overdue.  I filed an issue for this:

https://issues.dlang.org/show_bug.cgi?id=18559

If you have any additional relevant information, please post it there so
that we can build a strong case to convince Walter about this issue.


T

-- 
Heuristics are bug-ridden by definition. If they didn't have bugs, they'd be 
algorithms.


Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread bauss via Digitalmars-d-learn

On Monday, 5 March 2018 at 18:39:21 UTC, J-S Caux wrote:
> But now comes the question: can the D fundamental maths
> functions be propped up to be as fast as the C ones?


Probably, if someone takes the time to look at the bottlenecks.


Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread J-S Caux via Digitalmars-d-learn

On Monday, 5 March 2018 at 09:48:49 UTC, Uknown wrote:
> Depending on your platform, the size of `double` could be
> different between C++ and D. Could you check that the size and
> precision are indeed the same?
> Also, benchmark method is just as important as benchmark code.
> Did you use DMD or LDC as the D compiler? In this case it
> shouldn't matter, but try with LDC if you haven't. Also ensure
> that you've used the right flags:
>
> `-release -inline -O`.
>
> If the D version is still slower, you could try using the C
> version of the function.
> Simply change `import std.math: atan;` to
> `import core.stdc.math: atan;` [0]
>
> [0]: https://dlang.org/phobos/core_stdc_math.html#.atan


Thanks all for the info.

I've tested these two very basic representative codes:
https://www.dropbox.com/s/b5o4i8h43qh1saf/test.cc?dl=0
https://www.dropbox.com/s/zsaikhdoyun3olk/test.d?dl=0

Results:

C++:
g++ (Apple LLVM version 7.3.0):  9.5 secs
g++ (GCC 7.1.0):  10.7 secs

D:
dmd :  35.5 secs
dmd -release -inline -O : 29.5 secs
ldc2 :  34.4 secs
ldc2 -release -O : 31.5 secs

But now: using the core.stdc.math atan as per Uknown's suggestion:
D:
dmd:  9 secs
dmd -release -inline -O :  6.8 secs
ldc2 : 10 secs
ldc2 -release -O :  6.5 secs   <- best

So indeed the difference is between the `std.math atan` versus 
the `core.stdc.math atan`. Thanks Uknown! Just knowing this trick 
could make the difference between me and other scientists 
switching over to D...


But now comes the question: can the D fundamental maths functions 
be propped up to be as fast as the C ones?
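
A minimal C++ sketch of how the two code paths compared above could be timed side by side. It assumes that computing in `long double` stands in for the `real`-based `std.math` path and computing in `double` stands in for the `core.stdc.math` path; that analogy only holds where `long double` is the 80-bit x87 type (GCC and Clang on x86), and the names `atan_sum`, `Timing` and `time_atan_sum` are invented for this example.

```cpp
#include <chrono>
#include <cmath>
#include <cstdint>

// Sum atan(1/(1 + sqrt(1 + a))) for a = 0 .. n-1, computing entirely in T.
template <typename T>
T atan_sum(std::int64_t n) {
    T x = 0;
    for (std::int64_t a = 0; a < n; ++a)
        x += std::atan(T(1) / (T(1) + std::sqrt(T(1) + static_cast<T>(a))));
    return x;
}

// Wall-clock time plus the computed sum. Returning the sum keeps the
// optimizer from discarding the loop as dead code.
struct Timing {
    double seconds;
    double result;
};

template <typename T>
Timing time_atan_sum(std::int64_t n) {
    const auto start = std::chrono::steady_clock::now();
    const T sum = atan_sum<T>(n);
    const auto stop = std::chrono::steady_clock::now();
    return { std::chrono::duration<double>(stop - start).count(),
             static_cast<double>(sum) };
}
```

Calling `time_atan_sum<double>(n)` and `time_atan_sum<long double>(n)` with the same `n` and the same optimization flags gives a like-for-like comparison; the numbers will of course differ by machine and compiler.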


Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread Johan Engelen via Digitalmars-d-learn

On Monday, 5 March 2018 at 06:01:27 UTC, J-S Caux wrote:
> On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> > On 05/03/2018 6:35 PM, J-S Caux wrote:
> > > I'm considering shifting a large existing C++ codebase into D
> > > (it's a scientific code making much use of functions like
> > > atan, log etc).
> > >
> > > I've compared the raw speed of atan between C++ (Apple LLVM
> > > version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also
> > > ldc2 1.7.0) by doing long loops of such functions.
> > >
> > > I can't get the D to run faster than about half the speed of
> > > C++.
>
>   double x = 0.0;
>   for (int a = 0; a < 10; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
>   double x = 0.0;
>   for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.


The performance problem with this code is that LDC does not yet
do cross-module inlining by default. GDC does. If you pass
`-enable-cross-module-inlining` to LDC, things should be faster.
In particular, std.math.sqrt is not inlined although it is
profitable to do so (it becomes one machine instruction). Things
become worse when using core.stdc.math.sqrt: no implementation
source is available there, so no inlining is possible.


Another problem is that std.math.atan(double) just calls 
std.math.atan(real). Calculations are more expensive on platforms 
where real==80bits (i.e. x86), and that's not solvable with a 
compile flag. What it takes is someone to write the double and 
float versions of atan (and other math functions), but it 
requires someone with the right knowledge to do it.
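
To give a feel for what a double-only version involves, here is a deliberately simple C++ sketch that never leaves `double`: argument reduction followed by a truncated Taylor series. It is an illustration under stated assumptions, not the algorithm used by Phobos or by any C library (production implementations use carefully fitted polynomials and tighter error control), and `atan_double` is a made-up name.

```cpp
#include <cmath>   // std::sqrt, std::isnan

// Illustrative double-only arctangent; hypothetical, not a library routine.
double atan_double(double x) {
    if (std::isnan(x)) return x;
    if (x < 0.0) return -atan_double(-x);             // atan is an odd function
    const double half_pi = 1.5707963267948966;        // pi/2 rounded to double
    if (x > 1.0) return half_pi - atan_double(1.0 / x);

    // Two halving steps: atan(x) = 2 * atan(x / (1 + sqrt(1 + x*x))).
    // Starting from x <= 1 this leaves x <= tan(pi/16), about 0.2.
    for (int i = 0; i < 2; ++i)
        x = x / (1.0 + std::sqrt(1.0 + x * x));

    // Taylor series atan(x) = x * sum_{k=0..12} (-x^2)^k / (2k + 1),
    // evaluated in Horner form starting from the k = 12 coefficient.
    const double y = -x * x;
    double p = 1.0 / 25.0;
    for (int k = 11; k >= 0; --k)
        p = p * y + 1.0 / (2 * k + 1);

    return 4.0 * x * p;                               // undo the two halvings
}
```

The point is only that nothing in the computation needs an 80-bit intermediate; getting a version that is both fast and accurate to the last bit is the part that needs the specialist knowledge mentioned above.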


Your tests (and reporting about them) are much appreciated. 
Please do file bug reports for these things. Perhaps you can take 
a stab at implementing double-versions of the functions you need?


cheers,
  Johan






Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread Marc via Digitalmars-d-learn

On Monday, 5 March 2018 at 05:35:28 UTC, J-S Caux wrote:
> I'm considering shifting a large existing C++ codebase into D
> (it's a scientific code making much use of functions like atan,
> log etc).
>
> I've compared the raw speed of atan between C++ (Apple LLVM
> version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2
> 1.7.0) by doing long loops of such functions.
>
> I can't get the D to run faster than about half the speed of
> C++.
>
> Are there benchmarks for such scientific functions published
> somewhere?

What compiler flags did you use to compile both the C++ and D
versions?


Re: Speed of math function atan: comparison D and C++

2018-03-05 Thread Uknown via Digitalmars-d-learn

On Monday, 5 March 2018 at 06:01:27 UTC, J-S Caux wrote:
> On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> > On 05/03/2018 6:35 PM, J-S Caux wrote:
> > > I'm considering shifting a large existing C++ codebase into D
> > > (it's a scientific code making much use of functions like
> > > atan, log etc).
> > >
> > > I've compared the raw speed of atan between C++ (Apple LLVM
> > > version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also
> > > ldc2 1.7.0) by doing long loops of such functions.
> > >
> > > I can't get the D to run faster than about half the speed of
> > > C++.
> > >
> > > Are there benchmarks for such scientific functions published
> > > somewhere
> >
> > Gonna need to disassemble and compare them.
> >
> > atan should work out to only be a few instructions (inline
> > assembly) from what I've looked at in the source.
> >
> > Also you should post the code you used for each.
>
> So the codes are trivial, simply some check of raw speed:
>
>   double x = 0.0;
>   for (int a = 0; a < 10; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
>   double x = 0.0;
>   for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.


Depending on your platform, the size of `double` could be 
different between C++ and D. Could you check that the size and 
precision are indeed the same?
Also, benchmark method is just as important as benchmark code. 
Did you use DMD or LDC as the D compiler? In this case it 
shouldn't matter, but try with LDC if you haven't. Also ensure 
that you've used the right flags:

`-release -inline -O`.

If the D version is still slower, you could try using the C
version of the function.
Simply change `import std.math: atan;` to
`import core.stdc.math: atan;` [0]


[0]: https://dlang.org/phobos/core_stdc_math.html#.atan
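
For the size-and-precision check suggested above, a small C++ sketch. The D language specification defines `double` as a 64-bit floating-point type, so it is the C++ side that is worth confirming on a given platform; `double_is_ieee_binary64` is a hypothetical helper name used only here.

```cpp
#include <climits>   // CHAR_BIT
#include <limits>    // std::numeric_limits

// True when the platform's C++ double is the 64-bit IEEE 754 format:
// 64 bits of storage, IEC 559 (IEEE 754) semantics, 53-bit mantissa.
constexpr bool double_is_ieee_binary64() {
    return sizeof(double) * CHAR_BIT == 64
        && std::numeric_limits<double>::is_iec559
        && std::numeric_limits<double>::digits == 53;
}
```

If this returns true, any speed difference between the two programs is not down to the two languages using different `double` formats.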


Re: Speed of math function atan: comparison D and C++

2018-03-04 Thread Era Scarecrow via Digitalmars-d-learn

On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> atan should work out to only be a few instructions (inline
> assembly) from what I've looked at in the source.
>
> Also you should post the code you used for each.


Should be 3-4 instructions: load the input to the FPU (optional?
Depends on whether it already has the value loaded), Atan, Fwait
(optional?), retrieve the value.


Offhand, as far as I remember, FPU instructions run in their own
separate space and should more or less take up only a few cycles
by themselves (and also run in parallel with the CPU code).


At which point, if the code is running at half the speed of C++'s,
that probably means bad optimization elsewhere, or even the
control settings for the FPU.


I really haven't looked at the FPU stuff in that much depth since
about 2000...


Re: Speed of math function atan: comparison D and C++

2018-03-04 Thread rikki cattermole via Digitalmars-d-learn

On 05/03/2018 7:01 PM, J-S Caux wrote:
> On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> > On 05/03/2018 6:35 PM, J-S Caux wrote:
> > > I'm considering shifting a large existing C++ codebase into D (it's a
> > > scientific code making much use of functions like atan, log etc).
> > >
> > > I've compared the raw speed of atan between C++ (Apple LLVM version
> > > 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by
> > > doing long loops of such functions.
> > >
> > > I can't get the D to run faster than about half the speed of C++.
> > >
> > > Are there benchmarks for such scientific functions published somewhere
> >
> > Gonna need to disassemble and compare them.
> >
> > atan should work out to only be a few instructions (inline assembly)
> > from what I've looked at in the source.
> >
> > Also you should post the code you used for each.
>
> So the codes are trivial, simply some check of raw speed:
>
>    double x = 0.0;
>    for (int a = 0; a < 10; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for C++ and
>
>    double x = 0.0;
>    for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));
>
> for D. C++ exec takes 40 seconds, D exec takes 68 seconds.


Yes, but that doesn't show me how you benchmarked.


Re: Speed of math function atan: comparison D and C++

2018-03-04 Thread J-S Caux via Digitalmars-d-learn

On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> On 05/03/2018 6:35 PM, J-S Caux wrote:
> > I'm considering shifting a large existing C++ codebase into D
> > (it's a scientific code making much use of functions like
> > atan, log etc).
> >
> > I've compared the raw speed of atan between C++ (Apple LLVM
> > version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2
> > 1.7.0) by doing long loops of such functions.
> >
> > I can't get the D to run faster than about half the speed of
> > C++.
> >
> > Are there benchmarks for such scientific functions published
> > somewhere
>
> Gonna need to disassemble and compare them.
>
> atan should work out to only be a few instructions (inline
> assembly) from what I've looked at in the source.
>
> Also you should post the code you used for each.

So the codes are trivial, simply some check of raw speed:

  double x = 0.0;
  for (int a = 0; a < 10; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));

for C++ and

  double x = 0.0;
  for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a)));

for D. C++ exec takes 40 seconds, D exec takes 68 seconds.


Re: Speed of math function atan: comparison D and C++

2018-03-04 Thread rikki cattermole via Digitalmars-d-learn

On 05/03/2018 6:35 PM, J-S Caux wrote:
> I'm considering shifting a large existing C++ codebase into D (it's a
> scientific code making much use of functions like atan, log etc).
>
> I've compared the raw speed of atan between C++ (Apple LLVM version
> 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing
> long loops of such functions.
>
> I can't get the D to run faster than about half the speed of C++.
>
> Are there benchmarks for such scientific functions published somewhere


Gonna need to disassemble and compare them.

atan should work out to only be a few instructions (inline assembly) 
from what I've looked at in the source.


Also you should post the code you used for each.


Speed of math function atan: comparison D and C++

2018-03-04 Thread J-S Caux via Digitalmars-d-learn
I'm considering shifting a large existing C++ codebase into D 
(it's a scientific code making much use of functions like atan, 
log etc).


I've compared the raw speed of atan between C++ (Apple LLVM 
version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 
1.7.0) by doing long loops of such functions.


I can't get the D to run faster than about half the speed of C++.

Are there benchmarks for such scientific functions published 
somewhere?