https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92038

            Bug ID: 92038
           Summary: Extremely inefficient x86_64 code for trivially
                    copyable types passed in registers.
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: maxim.yegorushkin at gmail dot com
  Target Milestone: ---

The following code:

    #include <variant>
    void f(std::variant<int, int>);
    void g() { f({}); }

When compiled with `g++-9.2 -std=gnu++17 -O3 -march=skylake`, it generates the
following assembly:

    g():
        mov     DWORD PTR [rsp-16], 0
        mov     BYTE PTR [rsp-12], 0
        mov     rdi, QWORD PTR [rsp-16]
        jmp     f(std::variant<int, int>)

This is rather poor. The memory stores are unnecessary, and the 8-byte load
into rdi depends on the 3 bytes of padding at [rsp-11], [rsp-10], and [rsp-9],
which are never written; this may prevent store-to-load forwarding.
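
As a rough illustration of where those three padding bytes come from, the
sketch below uses a hypothetical struct, `VariantLike`, that mimics the layout
implied by the assembly above: a 4-byte payload at offset 0 and a 1-byte index
at offset 4. The struct and the helpers `padding_bytes` and `check_layout` are
assumptions made for illustration, not part of libstdc++, and the numbers
assume a typical x86_64 ABI with a 4-byte, 4-byte-aligned `int`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the storage of std::variant<int, int>,
// matching the stores seen in the g++ output (DWORD at +0, BYTE at +4).
struct VariantLike {
    int           payload;  // bytes 0..3
    unsigned char index;    // byte 4
    // bytes 5..7: tail padding added to satisfy alignof(int)
};

// Number of tail padding bytes after the last member.
constexpr std::size_t padding_bytes() {
    return sizeof(VariantLike)
         - (offsetof(VariantLike, index) + sizeof(unsigned char));
}

// Sanity check of the layout assumptions stated above.
inline void check_layout() {
    assert(sizeof(VariantLike) == 8);          // fits one 64-bit register
    assert(offsetof(VariantLike, index) == 4); // index follows the payload
    assert(padding_bytes() == 3);              // the three unset bytes
}
```

Under these assumptions the whole object occupies exactly 8 bytes, so an
8-byte load from its stack slot necessarily reads the 3 bytes that neither
store wrote.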

`clang++-8.0 -std=gnu++17 -O3 -march=skylake` generates the expected assembly:

    g():
        xor     edi, edi
        jmp     f(std::variant<int, int>)
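
To show why a single `xor edi, edi` is sufficient, here is a minimal sketch
that builds the 64-bit value passed in rdi directly, without going through
memory. It assumes the System V x86_64 calling convention, where an 8-byte
trivially copyable object is passed in one integer register, and the layout
implied by the g++ output (payload in the low 32 bits, index in byte 4). The
names `register_image` and `check_default_argument` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the 64-bit register image of a variant-like object
// with a 4-byte payload in bits 0..31 and a 1-byte index in bits 32..39.
// Bits 40..63 correspond to the padding bytes and are left as zero.
constexpr std::uint64_t register_image(int payload, unsigned char index) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(payload))
         | (static_cast<std::uint64_t>(index) << 32);
}

// For the value-initialized argument in f({}), payload and index are both 0,
// so the entire register is zero.
inline void check_default_argument() {
    assert(register_image(0, 0) == 0);
}
```

Because the padding bits carry no meaning, a compiler is free to pick any
value for them; choosing zero lets the whole argument be materialized with one
register-clearing instruction.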
