On Wednesday, 13 February 2019 at 19:55:05 UTC, Guillaume Piolat
wrote:
On Wednesday, 13 February 2019 at 04:57:29 UTC, Crayo List
wrote:
On Wednesday, 6 February 2019 at 01:05:29 UTC, Guillaume
Piolat wrote:
"intel-intrinsics" is a DUB package for people interested in
x86 performance who want to write neither assembly nor an
LDC-specific snippet, and still get the fastest possible code.
This is really cool and I appreciate your efforts!
However (for those who are unaware) there is an alternative
way that is (arguably) better:
https://ispc.github.io/index.html
You can write portable vectorized code that can be trivially
invoked from D.
ispc is another compiler in your build, and you'd write in
another language, so it's not really the same thing.
That's mostly what I said, except that I did not say it's the
same thing.
It's an alternative way to produce vectorized code in a
deterministic and portable way.
This is NOT an auto-vectorizing compiler!
I haven't used it (nor do I know anyone who does), so I don't
really know why it would be any better.
And that's precisely why I posted here: so that people who are
interested in vectorizing their code in a portable way know
that there is another (arguably) better way.
I highly recommend browsing through the walkthrough example:
https://ispc.github.io/example.html
For example, I have code that I can run on my Xeon Phi 7250
Knights Landing CPU by compiling with --target=avx512knl-i32x16,
then I can run the exact same code with no change at all on my
i7-5820k by compiling with --target=avx2-i32x8. Each time I get
optimal code. This is not something you can easily do with
intrinsics!
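
To make the "invoked from D" and "same code, different --target" points
concrete, here is a minimal sketch. It is not taken from the ispc
examples: the kernel name scale_add and the file names kernel.ispc,
kernel.o and kernel.h are hypothetical. The ispc source and the build
lines are shown only as comments, and a plain C stand-in with the same
C-ABI signature is used so the block compiles without ispc installed.

```c
/* Hypothetical ispc kernel (kernel.ispc), for reference only:
 *
 *   export void scale_add(uniform float a, uniform float x[],
 *                         uniform float y[], uniform int n) {
 *       foreach (i = 0 ... n) {
 *           y[i] = a * x[i] + y[i];
 *       }
 *   }
 *
 * The same file would be built once per CPU, with the targets
 * mentioned above:
 *   ispc kernel.ispc --target=avx512knl-i32x16 -o kernel.o -h kernel.h
 *   ispc kernel.ispc --target=avx2-i32x8       -o kernel.o -h kernel.h
 *
 * Exported ispc functions use the C calling convention, so a D caller
 * would declare it roughly as (assumed declaration):
 *   extern(C) void scale_add(float a, float* x, float* y, int n);
 */
#include <assert.h>
#include <stddef.h>

/* Scalar stand-in for the exported kernel: y[i] = a * x[i] + y[i].
 * It mirrors the signature above but is NOT vectorized by itself. */
void scale_add(float a, const float *x, float *y, int n)
{
    assert(x != NULL && y != NULL);
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The caller only ever sees a plain C function, which is why switching
the --target does not require touching the calling code.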