http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43653

--- Comment #16 from Uros Bizjak <ubizjak at gmail dot com> 2011-02-17 21:05:11 UTC ---
The assembly generated with -O1 -ftree-vectorize -msse3 shows another case that
would benefit from enhancement PR19398 (secondary reloads don't consider "m"
alternatives):

.LFB0:
    .cfi_startproc
    subq    $416, %rsp
    .cfi_def_cfa_offset 424
    movq    .LC1(%rip), %rax
    leaq    (%rsp,%rax), %rax
    movq    %rax, -112(%rsp)
(*)    movq    -112(%rsp), %xmm1
(*)    punpcklqdq    %xmm1, %xmm1
    movdqa    %xmm1, %xmm0
    leaq    -104(%rsp), %rax
    leaq    408(%rsp), %rdx
.L2:


Looking at the definition of

(define_insn "*vec_dupv2di_sse3"
  [(set (match_operand:V2DI 0 "register_operand"     "=x,x")
    (vec_duplicate:V2DI
      (match_operand:DI 1 "nonimmediate_operand" " 0,m")))]
  "TARGET_SSE3"
  "@
   punpcklqdq\t%0, %0
   movddup\t{%1, %0|%0, %1}"
  [(set_attr "type" "sselog1")
   (set_attr "mode" "TI,DF")])

the two insns marked with (*) can be substituted with the second alternative:

    movddup    -112(%rsp), %xmm1
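
As a rough sanity check of why the substitution is valid, here is a minimal sketch in portable C that models the lane semantics of the three instructions. The type xmm_t and the model_* / sequences_agree / check_equivalence names are hypothetical helpers introduced only for illustration; this does not use real SSE intrinsics and says nothing about how reload actually chooses alternatives.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit XMM register as two 64-bit lanes. */
typedef struct { uint64_t lo, hi; } xmm_t;

/* movq mem, %xmm: load 64 bits into the low lane, zero the high lane. */
static xmm_t model_movq_load(const uint64_t *mem)
{
    xmm_t r = { *mem, 0 };
    return r;
}

/* punpcklqdq src, dst: dst becomes { dst.lo, src.lo }. */
static xmm_t model_punpcklqdq(xmm_t dst, xmm_t src)
{
    xmm_t r = { dst.lo, src.lo };
    return r;
}

/* movddup mem, %xmm: load 64 bits and copy them into both lanes. */
static xmm_t model_movddup_load(const uint64_t *mem)
{
    xmm_t r = { *mem, *mem };
    return r;
}

/* Returns 1 if the two-insn sequence and the single movddup agree. */
static int sequences_agree(uint64_t value)
{
    uint64_t slot = value;              /* stands in for -112(%rsp) */

    /* First alternative: movq from memory, then punpcklqdq %xmm1, %xmm1. */
    xmm_t a = model_movq_load(&slot);
    a = model_punpcklqdq(a, a);

    /* Second alternative: movddup straight from memory. */
    xmm_t b = model_movddup_load(&slot);

    return a.lo == b.lo && a.hi == b.hi;
}

/* Small self-check over a few sample values. */
static void check_equivalence(void)
{
    assert(sequences_agree(0));
    assert(sequences_agree(0x1122334455667788ULL));
    assert(sequences_agree(UINT64_MAX));
}
```

In this model both paths leave the same 64-bit value in both lanes, which is what the vec_duplicate:V2DI pattern asks for; the memory alternative just gets there in one instruction instead of two.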
