On Sun, 2018-01-07 at 15:03 +0100, Borislav Petkov wrote:
>
> My fear is if some funky compiler changes the sizes of the insns in
> RETPOLINE_CALL/JMP and then the padding becomes wrong. But looking at the
> labels, they're all close so you have a 2-byte jmp already and the
>
> call 1112f
>
>
On Sun, 2018-01-07 at 23:06 -0600, Josh Poimboeuf wrote:
>
> Here's the use case I had in mind before. With paravirt,
>
> ENABLE_INTERRUPTS(CLBR_NONE)
>
> becomes
>
> push %rax
> call *pv_irq_ops.irq_enable
> pop %rax
>
> and I wanted to apply those instructions with an alternativ
On Sat, Jan 06, 2018 at 01:30:59AM +0100, Borislav Petkov wrote:
> On Fri, Jan 05, 2018 at 11:08:06AM -0600, Josh Poimboeuf wrote:
> > I seem to recall that we also discussed the need for this for converting
> > pvops to use alternatives, though the "why" is eluding me at the moment.
>
> Ok, here'
On Sun, Jan 07, 2018 at 12:21:29PM +0000, David Woodhouse wrote:
> http://git.infradead.org/users/dwmw2/linux-retpoline.git
>
> In particular, this call site in entry_64.S:
> http://git.infradead.org/users/dwmw2/linux-retpoline.git/blob/0f5c54a36e:/arch/x86/entry/entry_64.S#l270
>
> It's still ju
On Sun, 2018-01-07 at 12:46 +0100, Borislav Petkov wrote:
>
> >
> > The other fun one for alternatives is in entry_64.S, where we really
> > need the return address of the call instruction to be *precisely* the
> > .Lentry_SYSCALL_64_after_fastpath_call label, so we have to eschew the
> > normal
On Sun, Jan 07, 2018 at 09:40:42AM +0000, David Woodhouse wrote:
> Right, so it all tends to work out OK purely by virtue of the fact that
> oldinstr and altinstr end up far enough apart in the image that they're
> 5-byte jumps. Which isn't perfect but we've lived with worse.
Well, the reference p
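The 2-byte versus 5-byte distinction discussed here comes from the two direct `jmp` encodings on x86: the short form (opcode `EB` with an 8-bit displacement) and the near form (opcode `E9` with a 32-bit displacement). A minimal sketch of the size rule follows; the function name and the addresses are hypothetical, and it models only the encoding choice, not what any particular assembler or the kernel's alternatives code does.

```python
def jmp_length(src: int, dst: int) -> int:
    """Length in bytes of the shortest direct x86 jmp from src to dst.

    Displacements are measured from the end of the instruction.
    """
    # Short form: EB rel8, 2 bytes total.
    rel8 = dst - (src + 2)
    if -128 <= rel8 <= 127:
        return 2
    # Near form: E9 rel32, 5 bytes total.
    return 5


# Hypothetical addresses, chosen only for illustration.
print(jmp_length(0x1000, 0x1010))  # nearby label: short form
print(jmp_length(0x1000, 0x9000))  # distant target: near form
```

Because the length depends on the distance to the target, padding computed for one encoding no longer fits if the other encoding is chosen.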
On Sat, 2018-01-06 at 18:02 +0100, Borislav Petkov wrote:
> On Sat, Jan 06, 2018 at 08:23:21AM +0000, David Woodhouse wrote:
> > Thanks. From code inspection, I couldn't see that it was smart enough
> > *not* to process a relative jump in the 'altinstr' section which was
> > jumping to a target *wi
On Sat, Jan 06, 2018 at 08:23:21AM +0000, David Woodhouse wrote:
> Thanks. From code inspection, I couldn't see that it was smart enough
> *not* to process a relative jump in the 'altinstr' section which was
> jumping to a target *within* that same altinstr section, and thus
> didn't need to be tou
On Fri, 2018-01-05 at 15:50 -0800, Linus Torvalds wrote:
>
> > +
> > +.macro RETPOLINE_CALL reg:req
> > + jmp 1113f
> > +1110: RETPOLINE_JMP \reg
> > +1113: call 1110b
> > +.endm
(Note that RETPOLINE_CALL is purely internal to nospec-branch.h, used
only from the NOSPEC_CALL macro
On Sat, 2018-01-06 at 01:30 +0100, Borislav Petkov wrote:
> On Fri, Jan 05, 2018 at 11:08:06AM -0600, Josh Poimboeuf wrote:
> > I seem to recall that we also discussed the need for this for converting
> > pvops to use alternatives, though the "why" is eluding me at the moment.
>
> Ok, here's somet
On Fri, Jan 05, 2018 at 11:08:06AM -0600, Josh Poimboeuf wrote:
> I seem to recall that we also discussed the need for this for converting
> pvops to use alternatives, though the "why" is eluding me at the moment.
Ok, here's something which seems to work in my VM here. I'll continue
playing with i
On Fri, Jan 5, 2018 at 2:00 PM, Woodhouse, David wrote:
> +.macro RETPOLINE_JMP reg:req
> + call 1112f
> +1111: lfence
> + jmp 1111b
> +1112: mov %\reg, (%_ASM_SP)
> + ret
> +.endm
> +
> +.macro RETPOLINE_CALL reg:req
> + jmp 1113f
> +1110: RETPOLINE_JMP \
On Fri, Jan 05, 2018 at 10:16:54PM +0000, Woodhouse, David wrote:
> You'd still want a RETPOLINE_AMD flag to enable that lfence; it's not
> just K8.
I think you're forgetting that we set K8 on everything >= K8 on AMD. So
this:
+ if (c->x86_vendor == X86_VENDOR_AMD)
+ setup_for
On Fri, Jan 05, 2018 at 10:00:19PM +0000, Woodhouse, David wrote:
> OK, this one looks saner, and I think I've tested all the 32/64 bit
Dunno, I think Brian's suggestion will make this even simpler:
ALTERNATIVE(NOP, K8: lfence)
ALTERNATIVE(jmp indirect, RETPOLINE: jmp thunk)
Hmm?
--
Regards/Gr
On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote:
> That said, I honestly like the inline version (the one that is in the
> google paper first) of the retpoline more than the out-of-line one.
> And that one shouldn't have any relocation issues, because all the
> offsets are relative.
>
> W
On Fri, Jan 5, 2018 at 3:32 PM, Woodhouse, David wrote:
> On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote:
>>
>> Yes, I would suggest against expecting altinstructions to have
>> relocation information. They are generated in a different place, so..
>>
>> That said, I honestly like the inli
On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote:
>
> Yes, I would suggest against expecting altinstructions to have
> relocation information. They are generated in a different place, so..
>
> That said, I honestly like the inline version (the one that is in the
> google paper first) of th
> If the *compiler* uses the out-of-line version, that's a separate
> thing. But for our asm cases, let's just make it all be the inline
> case, ok?
Should be a simple change.
>
> It also should simplify the whole target generation. None of this
> silly "__x86.indirect_thunk.\reg" crap with diff
On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote:
>
> Yes, I would suggest against expecting altinstructions to have
> relocation information. They are generated in a different place, so..
>
> That said, I honestly like the inline version (the one that is in the
> google paper first) of th
On Fri, Jan 5, 2018 at 9:12 AM, Woodhouse, David wrote:
>
> I typed 'jmp __x86.indirect_thunk' and it actually jumped to an address
> which I believe is (__x86.indirect_thunk + &altinstr - &oldinstr).
> Which made me sad, and took a while to debug.
Yes, I would suggest against expecting altinstru
On Fri, 2018-01-05 at 17:45 +0100, Borislav Petkov wrote:
> On Fri, Jan 05, 2018 at 04:41:46PM +0000, Woodhouse, David wrote:
> > Nope, alternatives are broken. Only a jmp as the *first* opcode of
> > altinstr gets handled by recompute_jump(), while any subsequent insn is
> > just copied untouched.
On Fri, Jan 05, 2018 at 05:45:06PM +0100, Borislav Petkov wrote:
> On Fri, Jan 05, 2018 at 04:41:46PM +0000, Woodhouse, David wrote:
> > Nope, alternatives are broken. Only a jmp as the *first* opcode of
> > altinstr gets handled by recompute_jump(), while any subsequent insn is
> > just copied unt
On Fri, Jan 05, 2018 at 04:41:46PM +0000, Woodhouse, David wrote:
> Nope, alternatives are broken. Only a jmp as the *first* opcode of
> altinstr gets handled by recompute_jump(), while any subsequent insn is
> just copied untouched.
Not broken - simply no one needed it until now. I'm looking into
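Why a copied jump needs fixing up can be shown with the displacement arithmetic alone. A `jmp rel32` stores its target relative to the end of the instruction, so copying the bytes to a different address without adjusting the displacement moves the landing address by the same distance. The sketch below is a model of that arithmetic only, not the kernel's recompute_jump(); the helper names and the addresses `ALT`, `OLD`, and `THUNK` are hypothetical.

```python
import struct

JMP_REL32 = 0xE9  # opcode of the 5-byte near jmp


def encode_jmp(at: int, target: int) -> bytes:
    """Encode 'jmp target' as if the instruction were placed at address `at`."""
    rel = target - (at + 5)
    return bytes([JMP_REL32]) + struct.pack("<i", rel)


def landing_address(insn: bytes, at: int) -> int:
    """Where the encoded jmp lands when it executes at address `at`."""
    (rel,) = struct.unpack("<i", insn[1:5])
    return at + 5 + rel


def relocate_jmp(insn: bytes, src: int, dst: int) -> bytes:
    """Adjust the displacement so the jmp keeps its target after moving src -> dst."""
    (rel,) = struct.unpack("<i", insn[1:5])
    return bytes([JMP_REL32]) + struct.pack("<i", rel + (src - dst))


# Hypothetical layout: replacement code assembled at ALT, patched in at OLD.
ALT, OLD, THUNK = 0x2000, 0x1000, 0x5000

insn = encode_jmp(ALT, THUNK)
raw_copy = landing_address(insn, OLD)                         # bytes copied untouched
fixed = landing_address(relocate_jmp(insn, ALT, OLD), OLD)    # displacement adjusted

print(hex(raw_copy), hex(fixed))
```

In this model the untouched copy misses the target by exactly the distance between the two locations, which is the kind of offset error described in the thread.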
On Fri, 2018-01-05 at 13:56 +0000, Woodhouse, David wrote:
>
> At some point during this whole painful mess, I had come to the
> conclusion that having relocations in altinstr didn't work, and that's
> why I had X86_xx_NO_RETPOLINE instead of X86_xx_RETPOLINE. I now think
> that something else was
On Fri, 2018-01-05 at 13:54 +0100, Thomas Gleixner wrote:
> On Thu, 4 Jan 2018, David Woodhouse wrote:
> > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > index 07cdd1715705..900fa7016d3f 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arc
On Fri, 5 Jan 2018, Juergen Gross wrote:
> On 05/01/18 13:54, Thomas Gleixner wrote:
> > On Thu, 4 Jan 2018, David Woodhouse wrote:
> >> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> >> index 07cdd1715705..900fa7016d3f 100644
> >> --- a/arch/x86/includ
On 05/01/18 13:54, Thomas Gleixner wrote:
> On Thu, 4 Jan 2018, David Woodhouse wrote:
>> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
>> index 07cdd1715705..900fa7016d3f 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpu
On Thu, 4 Jan 2018, David Woodhouse wrote:
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 07cdd1715705..900fa7016d3f 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -342,5 +342,6 @@
> #define X86_B
On Fri, Jan 5, 2018 at 3:26 AM, Paolo Bonzini wrote:
> On 05/01/2018 11:28, Paul Turner wrote:
>>
>> The "pause; jmp" sequence proved minutely faster than "lfence; jmp", which is
>> why it was chosen.
>>
>> "pause; jmp" 33.231 cycles/call 9.517 ns/call
>> "lfence; jmp" 33.354 cycles/call 9.5
On 05/01/2018 11:28, Paul Turner wrote:
>
> The "pause; jmp" sequence proved minutely faster than "lfence; jmp", which is
> why it was chosen.
>
> "pause; jmp"   33.231 cycles/call   9.517 ns/call
> "lfence; jmp"  33.354 cycles/call   9.552 ns/call
Do you have timings for a non-retpolined indirect
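To put the two quoted measurements in proportion, the gap can be worked out directly from the figures in the message. The sketch below uses only those four numbers; the variable and function names are illustrative, and the clock frequency is merely what the cycles-to-nanoseconds ratio implies, not a value stated in the thread.

```python
# Figures quoted in the message above.
pause = {"cycles": 33.231, "ns": 9.517}
lfence = {"cycles": 33.354, "ns": 9.552}


def implied_ghz(measurement: dict) -> float:
    """Clock frequency implied by a cycles/call and ns/call pair."""
    return measurement["cycles"] / measurement["ns"]


delta_cycles = lfence["cycles"] - pause["cycles"]
relative = delta_cycles / pause["cycles"]

print(f"implied clock: {implied_ghz(pause):.2f} GHz")
print(f"difference: {delta_cycles:.3f} cycles/call ({relative:.2%})")
```

Both rows imply a clock of about 3.49 GHz, and the difference is 0.123 cycles per call, well under half a percent, which fits the description "minutely faster".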
On Fri, Jan 05, 2018 at 10:55:38AM +0000, David Woodhouse wrote:
> On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote:
> > On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote:
> > > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote:
> > > >
> > > > Pretty much.
> > > > Paul
On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote:
> On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote:
> > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote:
> > >
> > > Pretty much.
> > > Paul's writeup: https://support.google.com/faqs/answer/7625886
> > > tldr: jmp *%
On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov wrote:
> >
> > Clearly Paul's approach to retpoline without lfence is faster.
Using pause rather than lfence does not represent a fundamental difference here.
A protected indir
On Thu, Jan 04, 2018 at 10:40:23AM -0800, Andi Kleen wrote:
> > Clearly Paul's approach to retpoline without lfence is faster.
> > I'm guessing it wasn't shared with amazon/intel until now and
> > this set of patches going to adopt it, right?
> >
> > Paul, could you share a link to a set of altern
On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote:
> On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote:
> >
> > Pretty much.
> > Paul's writeup: https://support.google.com/faqs/answer/7625886
> > tldr: jmp *%r11 gets converted to:
> > call set_up_target;
> > capture_spec:
>
On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote:
>
> Pretty much.
> Paul's writeup: https://support.google.com/faqs/answer/7625886
> tldr: jmp *%r11 gets converted to:
> call set_up_target;
> capture_spec:
> pause;
> jmp capture_spec;
> set_up_target:
> mov %r11, (%rsp);
> ret;
> Clearly Paul's approach to retpoline without lfence is faster.
> I'm guessing it wasn't shared with amazon/intel until now and
> this set of patches going to adopt it, right?
>
> Paul, could you share a link to a set of alternative gcc patches
> that do retpoline similar to llvm diff ?
I don't
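The sequence quoted above can be traced step by step with a toy model. The sketch below is a simplification under stated assumptions: it treats the return predictor as a plain stack of return addresses pushed by `call`, which is the usual explanation of the technique but is an assumption about hardware behaviour, not something stated in the thread. The function name and the example target are hypothetical.

```python
def retpoline_jmp(r11: int):
    """Trace 'jmp *%r11' rewritten as the call/mov/ret sequence.

    Returns (speculative_target, architectural_target).
    """
    stack = []      # architectural stack in memory
    predictor = []  # assumed model: return predictor as a stack of call sites

    # call set_up_target: pushes the address of the next instruction,
    # which is the capture_spec label.
    stack.append("capture_spec")
    predictor.append("capture_spec")

    # set_up_target: mov %r11, (%rsp) overwrites the return address in
    # memory; the predictor's copy is not updated.
    stack[-1] = r11

    # ret: speculation follows the predictor, retirement follows memory.
    speculative = predictor.pop()
    architectural = stack.pop()
    return speculative, architectural


spec, arch = retpoline_jmp(0xFFFF8000)  # hypothetical branch target
print(spec, hex(arch))
```

In this model any speculation lands in the `capture_spec` loop (the `pause; jmp` or `lfence; jmp` pair), while the retired instruction stream continues at the address held in `%r11`.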
On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov wrote:
> >
> > Clearly Paul's approach to retpoline without lfence is faster.
> > I'm guessing it wasn't shared with amazon/intel until now and
> > this set of patches going to a
On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov wrote:
>
> Clearly Paul's approach to retpoline without lfence is faster.
> I'm guessing it wasn't shared with amazon/intel until now and
> this set of patches going to adopt it, right?
>
> Paul, could you share a link to a set of alternative gcc
On Thu, Jan 04, 2018 at 02:36:58PM +0000, David Woodhouse wrote:
> Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
> the corresponding thunks. Provide assembler macros for invoking the thunks
> in the same way that GCC does, from native and inline assembler.
>
> This add
David,
these are all marked as spam, because your emails have screwed up
DKIM. You used
From: David Woodhouse
but then you used infradead as a mailer, so it has the DKIM signature
from infradead, not from Amazon.co.uk.
The DKIM signature does pass for infradead, but amazon dmarc - quite
re
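The mismatch described in this message, a valid DKIM signature from one domain on mail whose From: header names another, can be illustrated as an identifier-alignment check. The sketch below is deliberately naive: real DMARC relaxed alignment compares organizational domains using a public suffix list, which is replaced here by a simple subdomain test. The function name is hypothetical; the two domains are the ones named in the message.

```python
def dmarc_dkim_aligned(from_domain: str, dkim_domain: str, strict: bool = False) -> bool:
    """Naive check that the DKIM signing domain aligns with the From: domain.

    Simplification: relaxed mode is approximated by a subdomain test
    instead of an organizational-domain comparison.
    """
    f = from_domain.lower().rstrip(".")
    d = dkim_domain.lower().rstrip(".")
    if strict:
        return f == d
    return f == d or f.endswith("." + d) or d.endswith("." + f)


# Domains named in the message above.
print(dmarc_dkim_aligned("amazon.co.uk", "infradead.org"))
```

Under this model the signature can verify correctly and the message still fails the alignment check, because the signing domain is unrelated to the From: domain.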