------- Comment #2 from pluto at agmk dot net  2007-03-14 19:05 -------
(In reply to comment #1)
> ifcvt could do this.  But is cmpxchgq really faster with its atomicity
> guarantee?

Only `lock; cmpxchg' has an atomicity guarantee on SMP; a plain cmpxchg
without the lock prefix does not.

> They are all vector-path instructions, a compare - cmov sequence looks
> faster (8 cycle latency vs. 10 and also with less constraints on register
> allocation).  Even the code we emit now:
> 
> emit_cmpxchg:
> .LFB2:
>         movq    (%rdi), %rax
>         cmpq    %rsi, %rax
>         je      .L6
>         rep ; ret
>         .p2align 4,,7
> .L6:
>         movq    %rdx, (%rdi)
>         ret
> 
> could be faster dependent on branch probability.

Yes, it could be faster, but for -Os we could emit smaller,
branchless code:

movq %rsi, %rax
cmpxchgq %rdx, (%rdi)
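
For reference, a minimal C sketch of the kind of function under discussion.
The original test case is not part of this message, so the signature and
parameter names below are assumptions reconstructed from the quoted
assembly (pointer in %rdi, compare value in %rsi, new value in %rdx, old
value returned in %rax). The second function, atomic_cmpxchg, is a
hypothetical name for the atomic counterpart; it uses GCC's
__sync_val_compare_and_swap builtin, for which GCC emits the
lock-prefixed form on x86-64.

```c
/* Assumed reconstruction of the non-atomic function from the quoted
 * assembly; the name emit_cmpxchg comes from the label in the listing,
 * the parameter names are illustrative. */
long emit_cmpxchg(long *p, long expected, long desired)
{
    long old = *p;           /* movq (%rdi), %rax              */
    if (old == expected)     /* cmpq %rsi, %rax ; je .L6       */
        *p = desired;        /* movq %rdx, (%rdi)              */
    return old;              /* old value is returned in %rax  */
}

/* Hypothetical atomic counterpart: same result in the single-threaded
 * case, but the compare and the store happen as one atomic operation
 * (lock; cmpxchg on x86-64). */
long atomic_cmpxchg(long *p, long expected, long desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}
```

Both functions return the value that was in *p before the call, so a
caller can tell whether the store happened by comparing the result with
the expected value.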


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31170
