https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102294
Hongtao.liu <crazylht at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |crazylht at gmail dot com
--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Andrew Pinski from comment #6)
> This is literally just measuring memset times of a small structure.
>
> -mtune=intel changes the timings too.
> Doing -mstringop-strategy=libcall also changes the timing to the point where
> they are about the same as clang.
>
> So this is a target issue and not a middle-end.
>
> You need to do timings on many more processors to have the -mtune=generic
> changed.
Yes, it's related to the stringop strategy, w/ -mtune=skylake
gcc -O2 -march=x86-64 test.c -mtune=skylake
Elapsed time: 0.353267 s
Elapsed time: 0.515796 s
Elapsed time: 0.352953 s
gcc -O2 -march=x86-64 test.c
Elapsed time: 0.892582 s
Elapsed time: 0.515735 s
Elapsed time: 0.843342 s
w/ -mtune=skylake, xmm mov is used.
bio_init3:
.LFB30:
        .cfi_startproc
        pxor    %xmm15, %xmm15
        movups  %xmm15, 96(%rdi)
        movups  %xmm15, 32(%rdi)
        movw    %dx, 98(%rdi)
        movl    $1, 32(%rdi)
        movl    $1, 100(%rdi)
        movq    %rsi, 104(%rdi)
        movups  %xmm15, (%rdi)
        movups  %xmm15, 16(%rdi)
        movups  %xmm15, 48(%rdi)
        movups  %xmm15, 64(%rdi)
        movups  %xmm15, 80(%rdi)
        movq    %xmm15, 112(%rdi)
        ret
        .cfi_endproc
w/ -mtune=generic, rep stosq is used.
bio_init3:
.LFB30:
        .cfi_startproc
        movq    %rdi, %r8
        movl    $15, %ecx
        xorl    %eax, %eax
        rep stosq
        movl    $1, 32(%r8)
        movw    %dx, 98(%r8)
        movl    $1, 100(%r8)
        movq    %rsi, 104(%r8)
        ret
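
For reference, a minimal sketch of the kind of function that would produce
code of this shape. This is not the test.c from the report: the struct name,
the field names and the padding are hypothetical, chosen only so that the
size (15 * 8 = 120 bytes, matching the rep stosq count) and the store offsets
(32, 98, 100, 104) line up with the assembly above. It assumes an x86-64 LP64
target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, reverse-engineered from the offsets in the
   assembly; the real structure in test.c may look quite different. */
struct bio_like {
    char pad0[32];            /* offsets 0..31, only zeroed */
    int cnt;                  /* offset 32, set to 1 */
    char pad1[62];            /* offsets 36..97, only zeroed */
    unsigned short max_vecs;  /* offset 98, set from the third argument */
    int remaining;            /* offset 100, set to 1 */
    void *table;              /* offset 104, set from the second argument */
    char pad2[8];             /* offsets 112..119, only zeroed */
};

/* 120 bytes = 15 quadwords, the count loaded into %ecx before rep stosq. */
static_assert(sizeof(struct bio_like) == 120, "assumes x86-64 LP64 layout");

/* Zero the whole structure, then overwrite a few fields. The memset of a
   small, compile-time-known size is what the stringop strategy expands
   either inline (rep stosq, xmm stores) or as a library call. */
void bio_like_init(struct bio_like *b, void *table, unsigned short max_vecs)
{
    memset(b, 0, sizeof(*b));
    b->cnt = 1;
    b->max_vecs = max_vecs;
    b->remaining = 1;
    b->table = table;
}
```

Compiling this with gcc -O2 -march=x86-64 -S, with and without
-mtune=skylake, should make it possible to compare the expansion chosen for
the memset; the exact output will likely vary with the GCC version.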