http://llvm.org/bugs/show_bug.cgi?id=21713

            Bug ID: 21713
           Summary: inefficient lowering of vectorized code with
                    intrinsics
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
    Classification: Unclassified

(A bug report from [email protected])

Here is a case where functionally identical code gets compiled differently.
These functions were written to emulate _mm_insert_epi64 on i386 machines,
which do not have it. When disassembling them, I noticed that the emul2
version does not use the pinsrd instruction, but instead does some odd
shuffling with movps. Though emul2 and emul3 are functionally the same,
the code generated for emul3 seems to be more efficient.
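
For readers without the foow.c attachment, here is a minimal sketch of the
kind of emulation being described. It is not the reporter's emul2 or emul3:
the names insert_epi64_lane0 and lane64 are hypothetical, and the sketch
assumes an x86 target, a GCC- or clang-compatible compiler, and only lane 0
being replaced. The idea is that a 64-bit insert can be written as two
32-bit inserts, each of which ideally lowers to one pinsrd.

```c
#include <stdint.h>
#include <smmintrin.h>  /* SSE4.1 intrinsics, including _mm_insert_epi32 */

/* Hypothetical helper, not taken from foow.c: replaces the low 64-bit lane
 * of v with x using two 32-bit inserts. The target attribute is an
 * assumption that lets the sketch build without -msse4 on the command line. */
__attribute__((target("sse4.1")))
static __m128i insert_epi64_lane0(__m128i v, int64_t x)
{
    uint64_t bits = (uint64_t)x;

    /* Low half goes to 32-bit element 0, high half to element 1. */
    v = _mm_insert_epi32(v, (int)(uint32_t)bits, 0);
    v = _mm_insert_epi32(v, (int)(uint32_t)(bits >> 32), 1);
    return v;
}

/* Hypothetical helper for inspecting the result: returns 64-bit lane i
 * (0 or 1) of v by storing the vector to memory. */
__attribute__((target("sse4.1")))
static int64_t lane64(__m128i v, int i)
{
    int64_t out[2];
    _mm_storeu_si128((__m128i *)out, v);
    return out[i];
}
```

Whether a given compiler turns each insert into a single pinsrd is exactly
what this report is about, so the sketch only shows the source-level
pattern, not the expected assembly.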

Similar behavior is observed between emul84 and emul85: though both
functions result in the same set of arguments to _mm_insert_epi32,
emul84 uses the intrinsic and emul85 does not.

Compiled with:
clang -c -o foow.o foow.c -msse4 -O3

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
LLVMbugs mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/llvmbugs
