https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to rsand...@gcc.gnu.org from comment #8)
> Fixed for the reduced testcase.  Please reopen if there's still a problem
> with the SPEC test itself.

Please note that when the testcase from comment #5 is compiled with
"-march=skylake -O2 -mavx512f", the vzeroupper before the call to "foo" is
now missing:

bar:
        pushq   %rbp
        movq    %rsp, %rbp
        andq    $-32, %rsp
        subq    $32, %rsp
        vmovdqa x1(%rip), %ymm0
        vmovdqa %ymm0, (%rsp)
        call    foo
        vmovdqa (%rsp), %ymm0
        vmovdqa %ymm0, x3(%rip)
        vzeroupper
        leave
        ret

gcc-9.2.1 compiles the function to:

bar:
        pushq   %rbp
        movq    %rsp, %rbp
        andq    $-32, %rsp
        subq    $32, %rsp
        vmovdqa x1(%rip), %ymm1
        vmovdqa %ymm1, (%rsp)
        vzeroupper                      <---- here
        call    foo
        vmovdqa (%rsp), %ymm1
        vmovdqa %ymm1, x3(%rip)
        vzeroupper
        leave
        ret

(I would also expect %ymm16+ to be used as a temporary, as it is not
clobbered by a vzeroupper in "foo").
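
For readers without comment #5 at hand, the following is a minimal sketch of
source that has the same shape as the listings above: a 256-bit value is
loaded from x1, kept live across a call to "foo", and then stored to x3. It
is not necessarily the testcase from comment #5. The vector typedef "v4di"
(used here in place of __m256i so that the file builds without any -m flags),
the global "x2", the body of "foo" and the helper "check_bar" are all
assumptions made for illustration.

```c
#include <string.h>

/* Assumption: a 256-bit vector type built with the GCC vector extension,
   standing in for __m256i so the file compiles without -mavx.  */
typedef long long v4di __attribute__ ((vector_size (32)));

v4di x1, x2, x3;

/* Hypothetical callee.  noinline keeps the "call foo" in bar at -O2.  */
__attribute__ ((noinline)) void
foo (void)
{
  x1 = x2;
}

void
bar (void)
{
  v4di x = x1;   /* 256-bit value that must stay live across the call.  */
  foo ();
  x3 = x;
}

/* Hypothetical self-check: returns 1 if the value loaded before the call
   is the value stored after it, 0 otherwise.  */
int
check_bar (void)
{
  v4di expected = { 3, 3, 3, 3 };

  x1 = expected;
  bar ();
  return memcmp (&x3, &expected, sizeof expected) == 0;
}
```

Compiling a file like this with "-march=skylake -O2 -mavx512f -S" and looking
at "bar" should show whether a vzeroupper is emitted before the call; the
exact register allocation may differ from the listings above.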
