https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92841

--- Comment #6 from Boris <bp at alien8 dot de> ---
Ok, so there was a mix-up between the patterns with and without multi-nodes
in your untested fix, which Micha found and fixed; see the attached patch.
(Without his fix, it wouldn't even build a whole kernel.)

With the patch applied, the pattern I reported is fixed and RAX is now
properly overwritten:

  mov    %gs:0x28,%rax
  mov    %rax,0x10(%rsp)
  mov    $0x6,%eax

Other places look good too:

  mov    %gs:0x28,%rax
  mov    %rax,0xf0(%rsp)
  mov    0x120e664(%rip),%rax        # ffffffff8220f020 <blacklisted_initcalls>


But browsing the generated code, I still see suboptimal patterns like:

  mov    %gs:0x28,%rax
  mov    %rax,0x8(%rsp)
  xor    %eax,%eax
  mov    0x98(%rdi),%rax

or

  mov    %gs:0x28,%rax
  mov    %rax,0x80(%rsp)
  xor    %eax,%eax
  mov    %rbx,%rax

I'm guessing this is because the patterns should probably also match mem
and reg source operands. Just guessing here, though.
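
To make the guess concrete, here is a rough sketch in Python (not GCC code,
and not how the machine description expresses it) of the check I have in
mind: the XOR is dead when the very next instruction is a MOV that fully
overwrites RAX without reading it, whatever kind of source it has. The
helper names `source_kind` and `mov_kills_rax` are made up for illustration,
it only understands a plain `mov` mnemonic in objdump's AT&T syntax, and it
assumes the flags set by the XOR are dead.

```python
import re

# Any alias of RAX appearing in an operand counts as a read of the register.
RAX_ALIAS = re.compile(r"%(rax|eax|ax|al|ah)\b")


def source_kind(src):
    """Classify an AT&T-syntax source operand as 'imm', 'reg' or 'mem'."""
    if src.startswith("$"):
        return "imm"
    if re.fullmatch(r"%\w+", src):
        return "reg"
    return "mem"  # e.g. 0x98(%rdi) or %gs:0x28


def mov_kills_rax(insn):
    """Return the source kind if insn is a mov that fully overwrites RAX
    without reading it (so a preceding xor %eax,%eax is dead), else None."""
    text = insn.split("#")[0].strip()      # drop objdump's trailing comment
    parts = text.split(None, 1)
    if len(parts) != 2 or parts[0] != "mov":
        return None
    src, _, dst = parts[1].rpartition(",")
    src, dst = src.strip(), dst.strip()
    # Writing %eax zero-extends into %rax; %ax/%al would be partial writes.
    if dst not in ("%rax", "%eax") or RAX_ALIAS.search(src):
        return None
    return source_kind(src)


for insn in ("mov    $0x6,%eax", "mov    0x98(%rdi),%rax", "mov    %rbx,%rax"):
    print(insn, "->", mov_kills_rax(insn))
```

All three source kinds end up in the same place here: once the destination
is RAX and the source does not mention it, the zeroing before it does no
useful work.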

Also, there's more complex stuff like:

  mov    %gs:0x28,%rax
  mov    %rax,-0x28(%rbp)
  xor    %eax,%eax
  add    %gs:0x7f00ffc4(%rip),%r13        # 12360 <this_cpu_off>
  mov    0x80(%rdi),%rax
  test   %rax,%rax

and if that ADD could be moved below the following MOV, then one could
remove the XOR too. But maybe that's too complicated to do, or it would
become too ugly...

Oh, and there actually are cases where the XOR *is* needed:

  mov    %gs:0x28,%rax
  mov    %rax,0x68(%rsp)
  xor    %eax,%eax
  rep stos %rax,%es:(%rdi)
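
A way to look at the last two cases together, sketched in Python purely as
an illustration (this is not how a GCC peephole works, and the names
`xor_is_needed`, `EXPLICIT_ONLY` and friends are invented here): walk the
instructions after the XOR and ask which comes first, a read of RAX or a
full overwrite. The list of "harmless" mnemonics is deliberately tiny and
anything unknown is treated as a possible reader, so the answer errs on the
side of keeping the XOR. It also assumes the flags produced by the XOR are
dead.

```python
import re

RAX_ALIAS = re.compile(r"%(rax|eax|ax|al|ah)\b")
PREFIXES = {"rep", "repz", "repnz", "lock"}
# Assumed to touch only their explicit operands. Everything else (stos,
# mul, call, ...) is conservatively treated as a possible RAX reader.
EXPLICIT_ONLY = {"mov", "add", "sub", "and", "or", "lea", "test", "cmp"}


def split_dest(operands):
    """Split an operand string at its last top-level comma: (sources, dest)."""
    depth = 0
    for i in range(len(operands) - 1, -1, -1):
        ch = operands[i]
        if ch == ")":
            depth += 1
        elif ch == "(":
            depth -= 1
        elif ch == "," and depth == 0:
            return operands[:i].strip(), operands[i + 1:].strip()
    return "", operands.strip()


def xor_is_needed(following):
    """True if the zeroed RAX may be read before it is fully overwritten."""
    for line in following:
        tokens = line.split("#")[0].split()   # drop objdump's comment
        while tokens and tokens[0] in PREFIXES:
            tokens.pop(0)
        if not tokens:
            continue
        mnemonic, operands = tokens[0], " ".join(tokens[1:])
        src, dst = split_dest(operands)
        if (mnemonic == "mov" and dst in ("%rax", "%eax")
                and not RAX_ALIAS.search(src)):
            return False   # RAX fully overwritten before any read
        if RAX_ALIAS.search(operands) or mnemonic not in EXPLICIT_ONLY:
            return True    # reads, or may read, the zeroed RAX
    return True            # ran out of instructions: stay conservative


print(xor_is_needed(["add    %gs:0x7f00ffc4(%rip),%r13",
                     "mov    0x80(%rdi),%rax",
                     "test   %rax,%rax"]))
print(xor_is_needed(["rep stos %rax,%es:(%rdi)"]))
```

With the ADD sequence above, the walk skips the ADD (it does not touch RAX)
and stops at the MOV, so the XOR is not needed; with the rep stos it stops
right away because the store uses RAX as its data.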

Fun.

Thanks!
