https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102294

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Andrew Pinski from comment #6)
> This is literally just measuring memset times of a small structure.
> 
> -mtune=intel changes the timings too.
> Doing -mstringop-strategy=libcall also changes the timing to the point where
> they are about the same as clang.
> 
> So this is a target issue and not a middle-end.
> 
> You need to do timings on many more processors to have the -mtune=generic
> changed.

Yes, it's related to the stringop strategy. With -mtune=skylake:

gcc -O2 -march=x86-64 test.c  -mtune=skylake

Elapsed time: 0.353267 s
Elapsed time: 0.515796 s
Elapsed time: 0.352953 s

gcc -O2 -march=x86-64 test.c

Elapsed time: 0.892582 s
Elapsed time: 0.515735 s
Elapsed time: 0.843342 s

w/ -mtune=skylake, xmm mov is used.

bio_init3:
.LFB30:
        .cfi_startproc
        pxor    %xmm15, %xmm15
        movups  %xmm15, 96(%rdi)
        movups  %xmm15, 32(%rdi)
        movw    %dx, 98(%rdi)
        movl    $1, 32(%rdi)
        movl    $1, 100(%rdi)
        movq    %rsi, 104(%rdi)
        movups  %xmm15, (%rdi)
        movups  %xmm15, 16(%rdi)
        movups  %xmm15, 48(%rdi)
        movups  %xmm15, 64(%rdi)
        movups  %xmm15, 80(%rdi)
        movq    %xmm15, 112(%rdi)
        ret
        .cfi_endproc

w/ -mtune=generic, rep stosq is used.

bio_init3:
.LFB30:
        .cfi_startproc
        movq    %rdi, %r8
        movl    $15, %ecx
        xorl    %eax, %eax
        rep stosq
        movl    $1, 32(%r8)
        movw    %dx, 98(%r8)
        movl    $1, 100(%r8)
        movq    %rsi, 104(%r8)
        ret
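
For reference, here is a minimal sketch of the kind of source that could
produce the two listings above. test.c is not shown in this comment, so the
struct name, field names, field types and the argument order are all
assumptions; they are chosen only so that the size (15 quadwords = 120 bytes,
from movl $15, %ecx) and the store offsets (32, 98, 100, 104) match the
assembly on an LP64 target such as x86-64.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: names and types are guesses, picked only so that the
   offsets line up with the stores seen in the assembly (LP64 assumed). */
struct bio_sketch {
    unsigned char head[32];   /* offsets 0..31, only zeroed */
    int32_t       flag_a;     /* offset 32,  set to 1 (movl $1, 32(...)) */
    unsigned char mid[62];    /* offsets 36..97, only zeroed */
    uint16_t      count;      /* offset 98,  set from %dx */
    int32_t       flag_b;     /* offset 100, set to 1 (movl $1, 100(...)) */
    void         *table;      /* offset 104, set from %rsi */
    unsigned char tail[8];    /* offsets 112..119, only zeroed */
};

/* Zero the whole structure, then fill in a few fields. The memset is the part
   that the stringop strategy expands inline: either as SSE stores
   (-mtune=skylake) or as rep stosq (-mtune=generic). */
void bio_init_sketch(struct bio_sketch *b, void *table, unsigned short count)
{
    memset(b, 0, sizeof(*b));
    b->flag_a = 1;
    b->count  = count;
    b->flag_b = 1;
    b->table  = table;
}
```

Compiling a file like this with gcc -O2 -S, with and without -mtune=skylake
or -mstringop-strategy=libcall, should be enough to see which expansion of
the memset is chosen. The exact instruction sequence may differ from the
listings above depending on the GCC version.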
