On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote:
> atan should work out to only be a few instructions (inline
> assembly) from what I've looked at in the source.
> Also you should post the code you used for each.
Should be 3-4 instructions: load the input onto the FPU stack
(optional? depends on whether the value is already loaded), the
arctangent instruction, FWAIT (optional?), and retrieve the value.
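
To make that sequence concrete, here is a rough sketch in C++ (used here only because the thread compares against C++). It assumes GCC or Clang extended inline assembly on x86 and falls back to std::atan everywhere else. The names x87_atan and self_check are made up for illustration. One detail worth knowing: the x87 FPATAN instruction takes two operands and computes atan(ST(1) / ST(0)), so the sketch also loads 1.0 to get a plain atan(x).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not from the thread: arctangent through the
// x87 FPATAN instruction where available, std::atan elsewhere.
long double x87_atan(long double x) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    long double result;
    // FPATAN computes atan(ST(1) / ST(0)), pops the stack once and
    // leaves the result in the new ST(0). With ST(0) = 1.0 and
    // ST(1) = x that is atan(x). The compiler emits the loads and
    // the final store around the single instruction.
    __asm__("fpatan"
            : "=t"(result)       // output: top of the x87 stack
            : "0"(1.0L), "u"(x)  // inputs: ST(0) = 1.0, ST(1) = x
            : "st(1)");          // ST(1) is consumed by the pop
    return result;
#else
    return std::atan(x);         // portable fallback
#endif
}

// Compare the sketch against the library for a few inputs.
void self_check() {
    const long double inputs[] = {-2.0L, 0.0L, 0.5L, 1.0L, 10.0L};
    for (long double x : inputs) {
        assert(std::fabs(x87_atan(x) - std::atan(x)) < 1e-12L);
    }
}
```

Calling self_check() from main is enough to confirm the two paths agree. Whether a given compiler or runtime library actually uses FPATAN for atan is a separate question that only the generated assembly can answer.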
Offhand, as far as I remember, FPU instructions run on their own
separate unit and should take more or less only a few cycles each
(and they also run in parallel with the CPU code).
At that point, if the code is running at half the speed of the
C++ version, that probably means bad optimization elsewhere, or
perhaps the control settings for the FPU.
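
One cheap way to rule out the control settings is to look at the floating-point environment at run time. The sketch below is an illustration only, again in C++, and uses the standard <cfenv> header to report the current rounding mode. The function name rounding_mode_name is made up. Note the limitation: standard C++ exposes the rounding mode but not the x87 precision-control bits, so this covers only part of the control word.

```cpp
#include <cassert>
#include <cfenv>
#include <string>

// Hypothetical helper: name of the current rounding mode.
// Each FE_* macro is defined only on platforms that support that
// mode, so every case is guarded.
std::string rounding_mode_name() {
    switch (std::fegetround()) {
#ifdef FE_TONEAREST
        case FE_TONEAREST:  return "to nearest";
#endif
#ifdef FE_TOWARDZERO
        case FE_TOWARDZERO: return "toward zero";
#endif
#ifdef FE_UPWARD
        case FE_UPWARD:     return "upward";
#endif
#ifdef FE_DOWNWARD
        case FE_DOWNWARD:   return "downward";
#endif
        default:            return "unknown";
    }
}
```

A program that has not touched the environment should report "to nearest". Anything else in either the D or the C++ build would be worth chasing before blaming atan itself.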
I really haven't looked at the FPU stuff in that much depth
since about 2000...