On Tue, Jun 19, 2012 at 8:38 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Tue, Jun 19, 2012 at 12:07 AM, Richard Henderson <r...@redhat.com> wrote:
>> On 2012-06-18 13:19, Uros Bizjak wrote:
>>>    /* ??? The builtin doesn't understand that the PCMPESTRI read from
>>>       memory need not be aligned.  */
>>> -  __asm ("%vpcmpestri $0, (%1), %2"
>>> -         : "=c"(index) : "r"(s), "x"(search), "a"(4), "d"(16));
>>> +  sv = __builtin_ia32_loaddqu ((const char *) s);
>>> +  index = __builtin_ia32_pcmpestri128 (search, 4, sv, 16, 0);
>>> +
>>
>>
>> Surely the comment can be removed too then?
>
> I'm not sure there. The builtin, as defined, expects V16QI operand
> with xm constraint. Using:
>
> int test (const char *s1)
> {
>   const v16qi *p = (const v16qi *)(unsigned long) s1;
>   return __builtin_ia32_pcmpistri128 (*p, ...);
> }
>
> will generate movdqa before pcmpistri.

Pedantic correction:

  __builtin_ia32_pcmpistri128 (v16qi_arg, *p, N);

movdqa in front of this builtin will be generated with -O0.

Uros.
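
For readers following the alignment point, here is a minimal sketch, not part of the patch, of what an explicit unaligned load looks like at the intrinsics level. The helper name first_match_16 is hypothetical, and the byte search uses plain SSE2 compares instead of PCMPESTRI so that it builds without -msse4.2. A scalar fallback is included on the assumption that the code may be compiled for a target without SSE2.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, for illustration only: return the index of the
   first byte equal to C among the 16 bytes starting at S, or 16 if there
   is none.  S need not be 16-byte aligned.  */
static int
first_match_16 (const char *s, char c)
{
#if defined(__SSE2__)
  /* _mm_loadu_si128 requests an unaligned load (movdqu).  Dereferencing
     a pointer cast to __m128i * would instead promise 16-byte alignment
     and permit an aligned load (movdqa).  */
  __m128i v = _mm_loadu_si128 ((const __m128i *) s);
  __m128i eq = _mm_cmpeq_epi8 (v, _mm_set1_epi8 (c));
  int mask = _mm_movemask_epi8 (eq);

  /* Bit i of MASK is set when byte i matched.  */
  return mask ? __builtin_ctz (mask) : 16;
#else
  /* Scalar fallback for targets without SSE2.  */
  for (int i = 0; i < 16; i++)
    if (s[i] == c)
      return i;
  return 16;
#endif
}
```

Calling first_match_16 (buf + 1, 'x') on a char buffer deliberately reads from a misaligned address, which is the case the quoted comment is concerned with.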