On Tue, Oct 23, 2012 at 1:16 AM, jerro wrote:
> While working on TurkeyMan's std.simd I noticed that some things are still
> impossible to implement efficiently using LDC. One example is the equivalent
> of the _mm_cmpgt_epi32 intrinsic
> (http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_sse2_int_comparison.htm).
> LLVM does not provide the GCC builtin __builtin_ia32_pcmpgtd128, which is
> needed to implement such a function. Clang implements it using a comparison
> operator on integer vectors. This compiles to an LLVM comparison
> instruction followed by a sext. There is no way to express this in D.
>
> We can't do what clang does here because the vectors in LDC are part of the
> language and not a compiler extension. One way to solve this would be to add
> another pragma, but I don't think that's a good option in this case. This
> intrinsic would cover only a very specific use case and there are probably
> other cases where one would face similar problems. There will probably be
> more such cases in the future, when LDC supports more platforms. It also
> seems that LLVM often removes intrinsics when what they do can be expressed
> in other ways. So I think that adding a pragma for each operation that may
> be hard to efficiently express in D (and for which there is no intrinsic in
> LLVM) could eventually lead to quite a lot of pragmas.
>
> I think the best way to solve this problem would be to add support for
> inline LLVM assembly language. I have one implementation of it at
> https://github.com/jerro/ldc/tree/pragma-llvm-inline-ir . It is used like
> this:
>
> import core.simd; // provides int4
>
> pragma(llvm_inline_ir)
>     R inlineIR(string s, R, P...)(P);
>
> void foo(int4 a, int4 b)
> {
>    auto gtMask = inlineIR!(`
>         %cmp = icmp sgt <4 x i32> %0, %1
>         %r = sext <4 x i1> %cmp to <4 x i32>
>         ret <4 x i32> %r`,
>         int4)(a, b);
> }
>
> Do you think adding support for inline LLVM IR is a good idea?
>
> Regards,
> Jernej

FWIW, I suspect this is generally impossible, since LDC uses the MC framework.

I could be wrong, though.

Regards,
Alex
