Re: Any usable SIMD implementation?

Walter Bright via Digitalmars-d Tue, 05 Apr 2016 01:36:29 -0700

On 4/4/2016 11:10 PM, 9il wrote:

It is impossible to deduce from that combination that Xeon Phi has 32 FP 
registers.

Since dmd doesn't generate specific code for a Xeon Phi, having a compile-time switch for it is meaningless.

"Since the compiler never generates AVX or AVX2" - this is definitely not true,
see, for example, LLVM vectorization and SLP vectorization.


dmd is not LLVM.

It's entirely practical to compile code with different source code, link them
*both* into the executable, and switch between them based on runtime detection
of the CPU.

This approach is complex,


Not at all. Used to do it all the time in the DOS world (FPU vs emulation).
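
As an illustration of the runtime-detection approach described above, here is a minimal sketch in D. It assumes druntime's core.cpuid module for the CPU query; the function names (sumScalar, sumWide, pickSum) are hypothetical, and sumWide is only a stand-in for a real SIMD kernel, not actual vector code.

```d
import core.cpuid : avx2;

// Baseline implementation, safe on any CPU.
float sumScalar(const(float)[] a)
{
    float s = 0;
    foreach (x; a)
        s += x;
    return s;
}

// Hypothetical stand-in for a wide kernel: four independent
// accumulators, then a scalar tail for the leftover elements.
float sumWide(const(float)[] a)
{
    float[4] acc = 0;
    size_t i = 0;
    for (; i + 4 <= a.length; i += 4)
        foreach (j; 0 .. 4)
            acc[j] += a[i + j];
    float s = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < a.length; ++i)
        s += a[i];
    return s;
}

alias SumFn = float function(const(float)[]);

// Both kernels are linked into the executable; the choice
// between them is made once, at run time.
SumFn pickSum()
{
    return avx2 ? &sumWide : &sumScalar;
}
```

A caller would typically store the result of pickSum() in a variable at startup and call through it afterwards, so the CPU check is not repeated on every call.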

I just want a unified instrument to receive CT information about target and
optimization switches. It is OK if this information would have different
switches on different compilers.

Optimizations simply do not transfer from one compiler to another, whether the switch is the same or not. They are highly implementation dependent.

Auto vectorization is only an example (maybe bad). I would use SIMD vectors, but I
need CT information about target CPU, because it is impossible to build optimal
BLAS kernels without it!

I still don't understand why you cannot just set '-version=xxx' on the command line and then switch off that version in your custom code.
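
A minimal sketch of what the '-version=xxx' suggestion could look like in practice. The version identifier UseAVX2 and the names kernelName, blockWidth and dot are hypothetical, chosen only for this example; the block width stands in for whatever target-specific parameter a BLAS kernel would need at compile time.

```d
// Build with, for example:  dmd -version=UseAVX2 kernel.d
// Without the flag, the else branch is compiled instead.
version (UseAVX2)
{
    enum kernelName = "avx2";
    enum size_t blockWidth = 8; // floats per 256-bit vector
}
else
{
    enum kernelName = "generic";
    enum size_t blockWidth = 4;
}

// Dot product whose blocking factor is fixed at compile time
// by the version identifier selected on the command line.
float dot(const(float)[] a, const(float)[] b)
{
    assert(a.length == b.length);
    float[blockWidth] acc = 0;
    size_t i = 0;
    for (; i + blockWidth <= a.length; i += blockWidth)
        foreach (j; 0 .. blockWidth)
            acc[j] += a[i + j] * b[i + j];
    float s = 0;
    foreach (x; acc)
        s += x;
    for (; i < a.length; ++i)
        s += a[i] * b[i];
    return s;
}
```

The compiler itself knows nothing about UseAVX2 here; the build script is responsible for passing the identifier that matches the intended target.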
