Update: The results I talked about yesterday were obtained with Nim compiled 
simply with -d:release, but with quite some optimization for the C reference code.

Today I cleaned up some minor loose ends and did some polishing (for both C and 
Nim) and set Nim to compile with --opt:speed plus some checks disabled (the 
checks are unnecessary in this case, and disabling them is only fair because C 
has none of them at all).

And - I hope you are seated properly - Bang, the algorithm implemented in Nim 
is on average 2% to 3% **faster than the C version!**

And no, that's not due to an error. I cross-checked over 100K test vectors. The 
Nim implementation is correct.

Kudos to @Araq and the Nim team!
