On Friday, 24 May 2019 at 11:45:46 UTC, Ola Fosheim Grøstad wrote:
On Friday, 24 May 2019 at 08:33:34 UTC, Ola Fosheim Grøstad wrote:
On Thursday, 23 May 2019 at 21:47:45 UTC, Alex wrote:
Either way, sin is still twice as fast. Also, in the code the sinTab version is missing the writeln, so it would have been faster... so it is not being optimized out.

Well, when I run this modified version:

https://gist.github.com/run-dlang/9f29a83b7b6754da98993063029ef93c

on https://run.dlang.io/

then I get:

LUT:    709
sin(x): 2761

So the LUT is 3-4 times faster even with your quarter-LUT overhead.

FWIW, as far as I can tell I managed to get the lookup version down to 104 by using bit manipulation tricks like these:

auto fastQuarterLookup(double x){
    const ulong mantissa = cast(ulong)( (x - floor(x)) * (cast(double)(1UL<<63)*2.0) );
    const double sign = cast(double)(-cast(uint)((mantissa>>63)&1));
    … etc

So it seems like a quarter-wave LUT is 27 times faster than sin…
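
As a rough illustration of the kind of bit manipulation mentioned above (not the elided code itself), the sketch below converts x to a 64-bit fixed-point phase and reads the quadrant and a table index straight from its top bits. It assumes x is measured in turns (1.0 = one full period); the names toPhase, Decoded, decode and indexBits are hypothetical.

```d
import std.math : floor;
import std.stdio : writefln;

/// Map x (in turns) to a fixed-point phase: the full ulong range covers one period.
/// Assumes x - floor(x) stays strictly below 1.0, otherwise the cast overflows.
ulong toPhase(double x)
{
    return cast(ulong)((x - floor(x)) * (cast(double)(1UL << 63) * 2.0));
}

/// Hypothetical container for the decoded fields.
struct Decoded
{
    bool negative;  // second half of the period: result is negated
    bool mirrored;  // second quarter of each half: table is read backwards
    uint index;     // position inside the quarter, indexBits wide
}

/// Split the phase with shifts and masks only. indexBits must be in 1 .. 31.
Decoded decode(ulong phase, uint indexBits)
{
    Decoded d;
    d.negative = ((phase >> 63) & 1) != 0;
    d.mirrored = ((phase >> 62) & 1) != 0;
    d.index = cast(uint)((phase >> (62 - indexBits)) & ((1UL << indexBits) - 1));
    return d;
}

void main()
{
    foreach (x; [0.0, 0.125, 0.25, 0.75])
        writefln("%s turns -> %s", x, decode(toPhase(x), 10));
}
```

The point of this layout is that the quadrant logic needs no comparisons or branches on floating-point values, which may help keep the pipeline busy; whether it reproduces the timings quoted here would have to be measured.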

You just have to make sure that the generated instructions fill the entire CPU pipeline.


Well, the QuarterWave was supposed to generate just a quarter, since that is all that is required for these functions due to symmetry and periodicity. I started with a half to get that working, and then to figure out the sign flipping.

Essentially one just has to tabulate a quarter of sin, that is, from 0 to 90°, and then get the sign right. This allows one to have 4 times the resolution or 1/4 the size at the same cost.

Or, to put it another way, sin has 4-fold redundancy.
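
A minimal sketch of that symmetry argument, assuming the input is given in turns (1.0 = one full period) and using nearest-sample lookup without interpolation. The names makeQuarterTable and quarterSin and the table size 1024 are hypothetical, not taken from the gist.

```d
import std.math : floor, PI, sin;
import std.stdio : writefln;

/// Tabulate sin over the first quarter period only, [0, 90°]: n + 1 samples.
double[] makeQuarterTable(size_t n)
{
    auto tab = new double[](n + 1);
    foreach (i; 0 .. n + 1)
        tab[i] = sin(0.5 * PI * i / n);
    return tab;
}

/// Approximate sin(2*PI*turns) from the quarter table alone.
double quarterSin(const double[] tab, double turns)
{
    const size_t n = tab.length - 1;
    const double q = (turns - floor(turns)) * 4.0;   // wrapped phase, scaled to quadrants
    const int quadrant = cast(int) q;                // 0 .. 3
    const double frac = q - quadrant;                // position inside the quadrant
    // Quadrants 1 and 3 read the table backwards (mirror around 90°).
    const double pos = (quadrant & 1) ? 1.0 - frac : frac;
    const size_t idx = cast(size_t)(pos * n + 0.5);  // round to the nearest sample
    // Quadrants 2 and 3 are the negative half of the wave.
    return (quadrant & 2) ? -tab[idx] : tab[idx];
}

void main()
{
    auto tab = makeQuarterTable(1024);
    foreach (t; [0.0, 0.1, 0.25, 0.6, 0.75])
        writefln("%.2f turns: table %.5f, sin %.5f",
                 t, quarterSin(tab, t), sin(2 * PI * t));
}
```

With the same number of entries, a full-period table would have a sample spacing four times coarser, which is the "4 times the resolution or 1/4 the size" trade-off described above.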

I'll check out your code, thanks for looking into it.
