I think you misunderstand partial register stalls. They happen (on some microarchitectures) when you write part of a register and then use the whole register.
As you say, we do need the branch anyway, which is a good reason to do it, but the motivation is wrong. Sent from my tablet, pardon any formatting problems. > On Jun 11, 2015, at 04:47, Denys Vlasenko <dvlas...@redhat.com> wrote: > > "setbe %al" insn has a register merge stall: it needs to combine > previous %eax value with new value for the lowest byte. > Subsequent "movzbl %al,%edi" in turn depends on its completion. > > This patch replaces "setbe %al + movzbl %al,%edi" pair of insns > with "xor %edi,%edi" before the comparison, and conditional "inc %edi". > > This results in the same value of %edi as produced by old code, > but first insn has no dependencies, and we end up with having > only one insn with deps which executes only if %eax contains error > return, and both insns are shorter: 2 bytes each versus 3 bytes each. > > (The old code was inherited from 32-bit code, where it allowed to avoid > a conditional jump. Here we have to use a jump anyway). > > Signed-off-by: Denys Vlasenko <dvlas...@redhat.com> > CC: Linus Torvalds <torva...@linux-foundation.org> > CC: Steven Rostedt <rost...@goodmis.org> > CC: Ingo Molnar <mi...@kernel.org> > CC: Borislav Petkov <b...@alien8.de> > CC: "H. Peter Anvin" <h...@zytor.com> > CC: Andy Lutomirski <l...@amacapital.net> > CC: Oleg Nesterov <o...@redhat.com> > CC: Frederic Weisbecker <fweis...@gmail.com> > CC: Alexei Starovoitov <a...@plumgrid.com> > CC: Will Drewry <w...@chromium.org> > CC: Kees Cook <keesc...@chromium.org> > CC: x...@kernel.org > CC: linux-kernel@vger.kernel.org > --- > arch/x86/entry/entry_64_compat.S | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/entry/entry_64_compat.S > b/arch/x86/entry/entry_64_compat.S > index bb187a6..96f33a4 100644 > --- a/arch/x86/entry/entry_64_compat.S > +++ b/arch/x86/entry/entry_64_compat.S > @@ -213,12 +213,13 @@ sysexit_from_sys_call: > jnz ia32_ret_from_sys_call > TRACE_IRQS_ON > ENABLE_INTERRUPTS(CLBR_NONE) > + xor %edi, %edi > movl %eax, %esi /* second arg, syscall return value */ > cmpl $-MAX_ERRNO, %eax /* is it an error ? */ > jbe 1f > movslq %eax, %rsi /* if error sign extend to 64 bits */ > -1: setbe %al /* 1 if error, 0 if not */ > - movzbl %al, %edi /* zero-extend that into %edi */ > + inc %edi > +1: /* edi: 1 if error, 0 if not */ > call __audit_syscall_exit > movq RAX(%rsp), %rax /* reload syscall return value */ > movl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), %edi > -- > 1.8.1.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/