Re: SIMD on Windows

jerro Sat, 22 Jun 2013 09:06:15 -0700

On Saturday, 22 June 2013 at 15:41:43 UTC, Benjamin Thaut wrote:

On 22.06.2013 15:53, jerro wrote:
In its current state you don't want to be using SIMD with dmd because the generated assembly will be significantly slower than if you just
use the default FPU math.
That may be true for some kinds of code, but it isn't true in general. For example, see the comparison of pfft's performance when built with 64
bit DMD using SIMD and without SIMD:

http://i.imgur.com/kYYI9R9.png
This benchmark was run on a Core i5 2500K on 64 bit Debian Wheezy.
Ok, I saw that you did write quite a few critical functions in inline assembly. Not really a good argument for dmd's codegen with SIMD intrinsics.
Kind Regards
Benjamin Thaut


I have actually run that benchmark with the code from this branch:

https://github.com/jerro/pfft/tree/experimental

The only function in sse_float.d on that branch that uses inline assembly is scalar_to_vector. The reason why I used more inline assembly in the master branch is that DMD didn't have intrinsics for some instructions such as shufps at the time.

I'm not really arguing for DMD's codegen with SIMD intrinsics. It's more that, from what I've seen, it doesn't produce very good scalar floating point code either (at least when compared to LDC or GDC). Whether I use scalar floating point or SIMD, pfft is about two times slower if I compile it with DMD than it is if I compile it with GDC.
