I think you misunderstand partial register stalls.  They happen (on some
microarchitectures) when you write part of a register and then read the whole
register.  Here, setbe writes %al and the following movzbl reads only %al,
not the whole of %eax.
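
A rough sketch of that distinction in C.  It models only the architectural
register semantics, not any pipeline behaviour, and the helper names are
invented for illustration:

```c
#include <stdint.h>

/* Model of "setbe %al": only the low byte is written; bits 31:8 of
 * %eax keep their previous value. */
uint32_t write_al(uint32_t eax, uint8_t al)
{
    return (eax & 0xffffff00u) | al;
}

/* Model of "movzbl %al, %edi": only the low byte of %eax is read, so
 * the result does not depend on the old upper bits. */
uint32_t movzbl_al(uint32_t eax)
{
    return eax & 0xffu;
}

/* Model of "movl %eax, %edi": the whole register is read, so the
 * result needs the new low byte merged with the old upper bits. */
uint32_t movl_eax(uint32_t eax)
{
    return eax;
}
```

With %eax holding 0x12345678, movzbl_al(write_al(eax, 1)) is 1 whatever the
upper bits were, while movl_eax(write_al(eax, 1)) is 0x12345601.  Only the
second case is a partial write followed by a full read.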

As you say, we need the branch anyway, and that is a good reason to make this
change, but the motivation given in the changelog is wrong.

> On Jun 11, 2015, at 04:47, Denys Vlasenko <dvlas...@redhat.com> wrote:
> 
> "setbe %al" insn has a register merge stall: it needs to combine
> previous %eax value with new value for the lowest byte.
> Subsequent "movzbl %al,%edi" in turn depends on its completion.
> 
> This patch replaces "setbe %al + movzbl %al,%edi" pair of insns
> with "xor %edi,%edi" before the comparison, and conditional "inc %edi".
> 
> This results in the same value of %edi as produced by old code,
> but first insn has no dependencies, and we end up with having
> only one insn with deps which executes only if %eax contains error
> return, and both insns are shorter: 2 bytes each versus 3 bytes each.
> 
> (The old code was inherited from 32-bit code, where it allowed to avoid
> a conditional jump. Here we have to use a jump anyway).
> 
> Signed-off-by: Denys Vlasenko <dvlas...@redhat.com>
> CC: Linus Torvalds <torva...@linux-foundation.org>
> CC: Steven Rostedt <rost...@goodmis.org>
> CC: Ingo Molnar <mi...@kernel.org>
> CC: Borislav Petkov <b...@alien8.de>
> CC: "H. Peter Anvin" <h...@zytor.com>
> CC: Andy Lutomirski <l...@amacapital.net>
> CC: Oleg Nesterov <o...@redhat.com>
> CC: Frederic Weisbecker <fweis...@gmail.com>
> CC: Alexei Starovoitov <a...@plumgrid.com>
> CC: Will Drewry <w...@chromium.org>
> CC: Kees Cook <keesc...@chromium.org>
> CC: x...@kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
> arch/x86/entry/entry_64_compat.S | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
> index bb187a6..96f33a4 100644
> --- a/arch/x86/entry/entry_64_compat.S
> +++ b/arch/x86/entry/entry_64_compat.S
> @@ -213,12 +213,13 @@ sysexit_from_sys_call:
>    jnz    ia32_ret_from_sys_call
>    TRACE_IRQS_ON
>    ENABLE_INTERRUPTS(CLBR_NONE)
> +    xor    %edi, %edi
>    movl    %eax, %esi        /* second arg, syscall return value */
>    cmpl    $-MAX_ERRNO, %eax    /* is it an error ? */
>    jbe    1f
>    movslq    %eax, %rsi        /* if error sign extend to 64 bits */
> -1:    setbe    %al            /* 1 if error, 0 if not */
> -    movzbl    %al, %edi        /* zero-extend that into %edi */
> +    inc    %edi
> +1:    /* edi: 1 if error, 0 if not */
>    call    __audit_syscall_exit
>    movq    RAX(%rsp), %rax        /* reload syscall return value */
>    movl    $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), %edi
> -- 
> 1.8.1.4
> 