On 7 August 2012 16:56, jerro <a...@a.com> wrote:

>> That said, almost all simd opcodes are directly accessible in std.simd.
>> There are relatively few obscure operations that don't have a representing
>> function.
>> The unpck/shuf example above for instance, they both effectively perform a
>> sort of swizzle, and both are accessible through swizzle!().
>
> They aren't. Swizzle only takes one argument, so you can't use it to select
> elements from two vectors. Both unpcklps and shufps take two arguments.
> Writing a swizzle with two arguments would be much harder.

Any usages I've missed/haven't thought of; I'm all ears.

>> The swizzle mask is analysed by the template, and it produces the best
>> opcode to match the pattern. Take a look at swizzle, it's bloody
>> complicated to do that the most efficient way on x86.
>
> Now imagine how complicated it would be to write a swizzle with two vector
> arguments.

I can imagine, I'll have a go at it... it's something I considered, but not
all architectures can do it efficiently. That said, a most-efficient
implementation would probably still be useful on all architectures, but for
cross-platform code, I usually prefer to encourage people to take another
approach rather than supply a function that is not particularly portable (or
not efficient when ported).

>> The reason I didn't write the DMD support yet is because it was incomplete,
>> and many opcodes weren't yet accessible, like shuf for instance... and I
>> just wasn't finished. Stopped to wait for DMD to be feature complete.
>> I'm not opposed to this idea, although I do have a concern that, because
>> there's no __forceinline in D (or macros), adding another layer of
>> abstraction will make maths code REALLY slow in unoptimised builds.
>> Can you suggest a method where these would be treated as C macros, and not
>> produce additional layers of function calls?
>
> Unfortunately I can't, at least not a clean one. Using string mixins would
> be one way but I think no one wants that kind of API in Druntime or Phobos.

Yeah, absolutely not. This is possibly the most compelling motivation behind
a __forceinline mechanism that I've seen come up... ;)

>> I'm already unhappy that std.simd produces redundant function calls.
>>
>> <rant> please please please can haz __forceinline! </rant>
>
> I agree that we need that.

Huzzah! :)
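
For illustration only, here is a minimal sketch of what a swizzle over two
source vectors might look like in D. The name shuffle2 and its mask
convention (indices 0..3 pick from the first vector, 4..7 from the second)
are assumptions made for this example and are not part of std.simd. It uses
plain float[4] arrays and only models the lane selection; a real
implementation would inspect the mask at compile time and choose a suitable
opcode for each architecture.

```d
import std.stdio : writeln;

/// Hypothetical two-source shuffle (NOT a std.simd function).
/// Each mask index selects one lane: 0..3 from `a`, 4..7 from `b`.
float[4] shuffle2(int i0, int i1, int i2, int i3)(float[4] a, float[4] b)
{
    // The mask is a template argument, so it can be checked at compile time.
    static foreach (i; [i0, i1, i2, i3])
        static assert(i >= 0 && i < 8, "lane index must be in 0..7");

    // Concatenate both sources so a single index addresses either vector.
    float[8] src;
    src[0 .. 4] = a[];
    src[4 .. 8] = b[];

    float[4] result = [src[i0], src[i1], src[i2], src[i3]];
    return result;
}

void main()
{
    float[4] a = [0, 1, 2, 3];
    float[4] b = [10, 11, 12, 13];

    // Same lane pattern as unpcklps: a0, b0, a1, b1.
    auto lo = shuffle2!(0, 4, 1, 5)(a, b);

    // A pattern shufps can express: two lanes from a, then two from b.
    auto mix = shuffle2!(2, 3, 4, 5)(a, b);

    writeln(lo);
    writeln(mix);
}
```

Because the mask is known at compile time, a template like this could in
principle recognise patterns such as (0, 4, 1, 5) and map them to a single
instruction where one exists, falling back to a slower sequence elsewhere.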