On Wednesday, 25 January 2012 at 00:49:15 UTC, bearophile wrote:
a:

Because dmd currently doesn't have an intrinsic for the SHUFPS instruction, I've included a version block with some GDC-specific code (this gave me a speedup of up to 80%).

It seems an instruction worth having in dmd too.


Chart: http://cloud.github.com/downloads/jerro/pfft/image.png

I know your code is relatively simple, so it's not meant to be the fastest around, but in your nice graph I'd like to see a line for FFTW too, _as a reference point_. Such a line would show us how close to or how far from industry-standard performance all this is. (And if possible, I'd like to see two lines for the LDC2 compiler too.)

Bye,
bearophile

I have updated the graph now.
