While working on TurkeyMan's std.simd I noticed that some things are still 
impossible to implement efficiently using LDC. One example is an 
equivalent of the _mm_cmpgt_epi32 intrinsic 
(http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/intref_cls/common/intref_sse2_int_comparison.htm).
 
LLVM does not provide the GCC builtin __builtin_ia32_pcmpgtd128, which is 
needed to implement such a function. Clang implements it using a comparison 
operator on integer vectors. This compiles to an LLVM comparison 
instruction followed by a sext. There is no way to express this in D.
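
For reference, the result of the comparison followed by sext can be written as a plain scalar loop in D. This is only a sketch of the semantics, and the name cmpGtMask is made up for illustration; nothing here implies that LDC compiles it to a single pcmpgtd instruction.

```d
// Hypothetical scalar reference, not part of std.simd or LDC:
// lane-wise signed greater-than producing a mask.
int[4] cmpGtMask(const int[4] a, const int[4] b)
{
    int[4] r;
    foreach (i; 0 .. 4)
    {
        // -1 has all bits set, which is what sext of a true i1 gives;
        // a false i1 extends to 0.
        r[i] = a[i] > b[i] ? -1 : 0;
    }
    return r;
}
```

Each lane of the result is either all ones or all zeros, so it can be used directly as a bit mask.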

We can't do what clang does here because the vectors in LDC are part of the 
language and not a compiler extension. One way to solve this would be to 
add another pragma, but I don't think that's a good option in this case. 
This intrinsic would cover only a very specific use case and there are 
probably other cases where one would face similar problems. There will 
probably be more such cases in the future, when LDC supports more 
platforms. It also seems that LLVM often removes intrinsics when what they 
do can be expressed in other ways. So I think that adding a pragma for each 
operation that may be hard to efficiently express in D (and for which there 
is no intrinsic in LLVM) could eventually lead to quite a lot of pragmas.  

I think the best way to solve this problem would be to add support for 
inline LLVM assembly language. I have one implementation of it 
at https://github.com/jerro/ldc/tree/pragma-llvm-inline-ir . It is used 
like this:

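import core.simd; // for int4

pragma(llvm_inline_ir)
    R inlineIR(string s, R, P...)(P);

void foo(int4 a, int4 b)
{
    // %0 and %1 refer to the arguments a and b.
    // Each lane of gtMask is all ones where a > b (signed), otherwise 0.
    auto gtMask = inlineIR!(`
        %cmp = icmp sgt <4 x i32> %0, %1
        %r = sext <4 x i1> %cmp to <4 x i32>
        ret <4 x i32> %r`,
        int4)(a, b);
}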

Do you think adding support for inline LLVM IR is a good idea? 

Regards,
Jernej
