On Thursday, 23 May 2019 at 21:30:47 UTC, Alex wrote:
I don't see how that can necessarily be faster. A LUT can give full 64-bit precision with one operation, whereas CORDIC needs iteration, at least 10 iterations to be of any use. LUTs are precision-independent (the lookup cost does not grow with the target precision), assuming the creation cost is not included.
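For concreteness, a minimal sketch of the LUT approach in C. The table size, the linear interpolation step and all names here are illustrative assumptions, not anything from this thread:

#include <math.h>
#include <stddef.h>

#define TWO_PI   6.283185307179586
#define LUT_SIZE 4096                  /* assumed table size */

static double sin_tab[LUT_SIZE + 1];   /* +1 so interpolation never wraps */

/* One-time fill: this is the creation cost the comparison above excludes. */
static void init_sin_tab(void)
{
    for (size_t i = 0; i <= LUT_SIZE; i++)
        sin_tab[i] = sin(TWO_PI * (double)i / LUT_SIZE);
}

/* One table read plus one linear interpolation, no iteration. Accuracy is
   bounded by table density: full 64-bit precision needs a very large table
   or a higher-order correction on top of the lookup. */
static double sin_lut(double x)
{
    double t = x * (LUT_SIZE / TWO_PI);
    t -= floor(t / LUT_SIZE) * LUT_SIZE;   /* range-reduce to [0, LUT_SIZE) */
    if (t >= LUT_SIZE) t = 0.0;            /* guard the rounding edge case */
    size_t i = (size_t)t;
    double f = t - (double)i;
    return sin_tab[i] + f * (sin_tab[i + 1] - sin_tab[i]);
}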

It isn't; an IEEE-accurate sin is typically not fast. ARM CPUs do let you run fewer iterations for nonstandard floats, though.
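For comparison, a sketch of the CORDIC side, also in C. This is a software model in doubles rather than the fixed-point shift-and-add a real implementation would use, but it shows the roughly one-bit-per-iteration tradeoff the posts above refer to:

#include <math.h>
#include <stdio.h>

/* Rotation-mode CORDIC for sin/cos on |angle| <= pi/2. Each iteration buys
   roughly one bit of precision, hence "at least 10 to be of any use" and
   50+ to approach full double precision. */
static void cordic_sincos(double angle, int iters, double *s, double *c)
{
    double x = 1.0, y = 0.0, pow2 = 1.0, gain = 1.0;

    for (int i = 0; i < iters; i++) {
        double sigma = (angle >= 0.0) ? 1.0 : -1.0;
        double xn = x - sigma * y * pow2;   /* in hardware: shifts and adds */
        y += sigma * x * pow2;
        x = xn;
        angle -= sigma * atan(pow2);        /* atan(2^-i) comes from a table */
        gain *= sqrt(1.0 + pow2 * pow2);    /* accumulated rotation gain */
        pow2 *= 0.5;
    }
    *s = y / gain;
    *c = x / gain;
}

int main(void)
{
    /* Error shrinks by roughly a factor of two per added iteration. */
    for (int n = 4; n <= 24; n += 4) {
        double s, c;
        cordic_sincos(0.5, n, &s, &c);
        printf("%2d iterations: |sin error| = %.1e\n", n, fabs(s - sin(0.5)));
    }
    return 0;
}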

I have some code that uses sin, exp and a few other elementary functions, and in one case the code is extremely slow (it uses exp). I haven't done much testing of all this, but something just seems off somewhere.

Don't know about exp, but some operations are slow when you get too close to zero, the so-called denormal (subnormal) numbers.
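On x86 the usual escape hatch is flushing those to zero via the SSE control register. A sketch in C, assuming an x86-64 target with SSE3 intrinsics (the xmmintrin/pmmintrin calls are real intrinsics; the rest is illustrative):

#include <float.h>
#include <stdio.h>
#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>   /* _MM_SET_DENORMALS_ZERO_MODE */

int main(void)
{
    /* DBL_MIN is the smallest *normal* double; dividing it further gives a
       subnormal, and arithmetic on those can be 10-100x slower because many
       CPUs handle them via a microcode assist. */
    volatile double tiny = DBL_MIN / 4.0;
    printf("with subnormals: %g\n", tiny * 1.5);

    /* FTZ flushes subnormal results to zero, DAZ treats subnormal inputs as
       zero. This trades strict IEEE semantics near zero for speed. */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);

    printf("with FTZ/DAZ on: %g\n", tiny * 1.5);   /* now exactly 0 */
    return 0;
}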

I guess there is no tool that can tell one exactly what is happening to a piece of code in the CPU... basically an instruction-level profiler?

VTune from Intel; not free, AFAIK.

