On Wed, Oct 7, 2015 at 10:39 AM, Denys Vlasenko <dvlas...@redhat.com> wrote:
> On 10/06/2015 02:47 AM, Andy Lutomirski wrote:
>> We eventually want to make it all the way into C code before
>> enabling interrupts.  We need to rework our flags handling slightly
>> to delay enabling interrupts.
>>
>> Signed-off-by: Andy Lutomirski <l...@kernel.org>
>> ---
>>  arch/x86/entry/entry_64_compat.S | 30 ++++++++++++++++++++++--------
>>  1 file changed, 22 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
>> index aa76864a8a6b..1432d60a1f4a 100644
>> --- a/arch/x86/entry/entry_64_compat.S
>> +++ b/arch/x86/entry/entry_64_compat.S
>> @@ -58,14 +58,9 @@ ENDPROC(native_usergs_sysret32)
>>   * with the int 0x80 path.
>>   */
>>  ENTRY(entry_SYSENTER_compat)
>> -     /*
>> -      * Interrupts are off on entry.
>> -      * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
>> -      * it is too small to ever cause noticeable irq latency.
>> -      */
>> +     /* Interrupts are off on entry. */
>>       SWAPGS_UNSAFE_STACK
>>       movq    PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>> -     ENABLE_INTERRUPTS(CLBR_NONE)
>>
>>       /* Zero-extending 32-bit regs, do not remove */
>>       movl    %ebp, %ebp
>> @@ -76,7 +71,16 @@ ENTRY(entry_SYSENTER_compat)
>>       /* Construct struct pt_regs on stack */
>>       pushq   $__USER32_DS            /* pt_regs->ss */
>>       pushq   %rbp                    /* pt_regs->sp */
>> -     pushfq                          /* pt_regs->flags */
>> +
>> +     /*
>> +      * Push flags.  This is nasty.  First, interrupts are currently
>> +      * off, but we need pt_regs->flags to have IF set.  Second, even
>> +      * if TF was set when SYSENTER started, it's clear by now.  We fix
>> +      * that later using TIF_SINGLESTEP.
>> +      */
>> +     pushfq                          /* pt_regs->flags (except IF = 0) */
>> +     orl     $X86_EFLAGS_IF, (%rsp)  /* Fix saved flags */
>
> The sequence of "push + insn_using_rsp" is a bit slow
> on most CPUs because the stack engine (the machinery which makes
> consecutive pushes fast) needs synchronizing with the register file.
>
> It may be better to move the ORL insn here:
>
>         push, push, push
>         cld
>         sub     $(10*8), %rsp /* pt_regs->r8-11, bp, bx, r12-15 not saved */
> +       orl     $X86_EFLAGS_IF, EFLAGS(%rsp)  /* Fix saved flags to have .IF = 1 */
>
> where we already eat that penalty.
>
>

I'll benchmark this and, if it's a win, I'll tack it on to the end of
the series.

-- 
Andy Lutomirski
AMA Capital Management, LLC