Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-08 Thread David Woodhouse
On Sun, 2018-01-07 at 15:03 +0100, Borislav Petkov wrote: > > My fear is if some funky compiler changes the sizes of the insns in > RETPOLINE_CALL/JMP and then the padding becomes wrong. But looking at the > labels, they're all close so you have a 2-byte jmp already and the > > call    1112f > >

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-08 Thread Woodhouse, David
On Sun, 2018-01-07 at 23:06 -0600, Josh Poimboeuf wrote: > > Here's the use case I had in mind before.  With paravirt, > >   ENABLE_INTERRUPTS(CLBR_NONE) > > becomes > >   push  %rax >   call  *pv_irq_ops.irq_enable >   pop   %rax > > and I wanted to apply those instructions with an alternativ

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-07 Thread Josh Poimboeuf
On Sat, Jan 06, 2018 at 01:30:59AM +0100, Borislav Petkov wrote: > On Fri, Jan 05, 2018 at 11:08:06AM -0600, Josh Poimboeuf wrote: > > I seem to recall that we also discussed the need for this for converting > > pvops to use alternatives, though the "why" is eluding me at the moment. > > Ok, here'

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-07 Thread Borislav Petkov
On Sun, Jan 07, 2018 at 12:21:29PM +0000, David Woodhouse wrote: > http://git.infradead.org/users/dwmw2/linux-retpoline.git > > In particular, this call site in entry_64.S: > http://git.infradead.org/users/dwmw2/linux-retpoline.git/blob/0f5c54a36e:/arch/x86/entry/entry_64.S#l270 > > It's still ju

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-07 Thread David Woodhouse
On Sun, 2018-01-07 at 12:46 +0100, Borislav Petkov wrote: > > >  > > The other fun one for alternatives is in entry_64.S, where we really > > need the return address of the call instruction to be *precisely* the  > > .Lentry_SYSCALL_64_after_fastpath_call label, so we have to eschew the > > normal

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-07 Thread Borislav Petkov
On Sun, Jan 07, 2018 at 09:40:42AM +0000, David Woodhouse wrote: > Right, so it all tends to work out OK purely by virtue of the fact that > oldinstr and altinstr end up far enough apart in the image that they're > 5-byte jumps. Which isn't perfect but we've lived with worse. Well, the reference p

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-07 Thread David Woodhouse
On Sat, 2018-01-06 at 18:02 +0100, Borislav Petkov wrote: > On Sat, Jan 06, 2018 at 08:23:21AM +0000, David Woodhouse wrote: > > Thanks. From code inspection, I couldn't see that it was smart enough > > *not* to process a relative jump in the 'altinstr' section which was > > jumping to a target *wi

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-06 Thread Borislav Petkov
On Sat, Jan 06, 2018 at 08:23:21AM +0000, David Woodhouse wrote: > Thanks. From code inspection, I couldn't see that it was smart enough > *not* to process a relative jump in the 'altinstr' section which was > jumping to a target *within* that same altinstr section, and thus > didn't need to be tou

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-06 Thread Woodhouse, David
On Fri, 2018-01-05 at 15:50 -0800, Linus Torvalds wrote: > > > + > > +.macro RETPOLINE_CALL reg:req > > +   jmp 1113f > > +1110:  RETPOLINE_JMP \reg > > +1113:  call    1110b > > +.endm (Note that RETPOLINE_CALL is purely internal to nospec-branch.h, used only from the NOSPEC_CALL macro

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-06 Thread David Woodhouse
On Sat, 2018-01-06 at 01:30 +0100, Borislav Petkov wrote: > On Fri, Jan 05, 2018 at 11:08:06AM -0600, Josh Poimboeuf wrote: > > I seem to recall that we also discussed the need for this for converting > > pvops to use alternatives, though the "why" is eluding me at the moment. > > Ok, here's somet

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Borislav Petkov
On Fri, Jan 05, 2018 at 11:08:06AM -0600, Josh Poimboeuf wrote: > I seem to recall that we also discussed the need for this for converting > pvops to use alternatives, though the "why" is eluding me at the moment. Ok, here's something which seems to work in my VM here. I'll continue playing with i

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Linus Torvalds
On Fri, Jan 5, 2018 at 2:00 PM, Woodhouse, David wrote: > +.macro RETPOLINE_JMP reg:req > + call 1112f > +1111: lfence > + jmp 1111b > +1112: mov %\reg, (%_ASM_SP) > + ret > +.endm > + > +.macro RETPOLINE_CALL reg:req > + jmp 1113f > +1110: RETPOLINE_JMP \

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Borislav Petkov
On Fri, Jan 05, 2018 at 10:16:54PM +0000, Woodhouse, David wrote: > You'd still want a RETPOLINE_AMD flag to enable that lfence; it's not > just K8. I think you're forgetting that we set K8 on everything >= K8 on AMD. So this: + if (c->x86_vendor == X86_VENDOR_AMD) + setup_for

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Borislav Petkov
On Fri, Jan 05, 2018 at 10:00:19PM +0000, Woodhouse, David wrote: > OK, this one looks saner, and I think I've tested all the 32/64 bit Dunno, I think Brian's suggestion will make this even simpler: ALTERNATIVE(NOP, K8: lfence) ALTERNATIVE(jmp indirect, RETPOLINE: jmp thunk) Hmm? -- Regards/Gr

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Woodhouse, David
On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote: > That said, I honestly like the inline version (the one that is in the > google paper first) of the retpoline more than the out-of-line one. > And that one shouldn't have any relocation issues, because all the > offsets are relative. > > W

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Brian Gerst
On Fri, Jan 5, 2018 at 3:32 PM, Woodhouse, David wrote: > On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote: >> >> Yes, I would suggest against expecting altinstructions to have >> relocation information. They are generated in a different place, so.. >> >> That said, I honestly like the inli

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Woodhouse, David
On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote: > > Yes, I would suggest against expecting altinstructions to have > relocation information. They are generated in a different place, so.. > > That said, I honestly like the inline version (the one that is in the > google paper first) of th

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Andi Kleen
> If the *compiler* uses the out-of-line version, that's a separate > thing. But for our asm cases, let's just make it all be the inline > case, ok? Should be a simple change. > > It also should simplify the whole target generation. None of this > silly "__x86.indirect_thunk.\reg" crap with diff

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread David Woodhouse
On Fri, 2018-01-05 at 09:28 -0800, Linus Torvalds wrote: > > Yes, I would suggest against expecting altinstructions to have > relocation information. They are generated in a different place, so.. > > That said, I honestly like the inline version (the one that is in the > google paper first) of th

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Linus Torvalds
On Fri, Jan 5, 2018 at 9:12 AM, Woodhouse, David wrote: > > I typed 'jmp __x86.indirect_thunk' and it actually jumped to an address > which I believe is (__x86.indirect_thunk + &altinstr - &oldinstr). > Which made me sad, and took a while to debug. Yes, I would suggest against expecting altinstru
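Illustrative note on the address described above (a sketch, not code from the thread or the kernel): a near `jmp rel32` stores its target as a displacement from the end of the instruction, so copying the bytes to a different address without adjusting that displacement moves the landing point by the distance between the two locations. The Python model below uses made-up addresses (ALTINSTR, OLDINSTR and THUNK are hypothetical) and its own sign convention. It only demonstrates that the error equals the separation between the two copies, and that re-encoding the displacement at the destination removes it.

```python
import struct

JMP_REL32_OPCODE = 0xE9  # x86 near jmp with a signed 32-bit displacement
JMP_REL32_LEN = 5        # 1 opcode byte + 4 displacement bytes


def encode_jmp_rel32(insn_addr, target):
    """Encode 'jmp target' as it would be assembled at insn_addr."""
    disp = target - (insn_addr + JMP_REL32_LEN)
    return struct.pack("<Bi", JMP_REL32_OPCODE, disp)


def decode_jmp_target(insn_addr, insn_bytes):
    """Return where the encoded jmp lands when executed at insn_addr."""
    opcode, disp = struct.unpack("<Bi", insn_bytes)
    if opcode != JMP_REL32_OPCODE:
        raise ValueError("not a jmp rel32")
    return insn_addr + JMP_REL32_LEN + disp


# Hypothetical addresses, chosen only for illustration.
ALTINSTR = 0x9000  # where the replacement instruction was assembled
OLDINSTR = 0x4000  # where its bytes end up after patching
THUNK = 0x8000     # intended jump target

insn = encode_jmp_rel32(ALTINSTR, THUNK)

# Copying the bytes verbatim keeps the old displacement, so the jump misses.
landed = decode_jmp_target(OLDINSTR, insn)
shift = landed - THUNK

# Recomputing the displacement for the new address restores the target.
fixed = encode_jmp_rel32(OLDINSTR, THUNK)

print(hex(landed), hex(shift), hex(decode_jmp_target(OLDINSTR, fixed)))
```

In this model the verbatim copy lands at THUNK + (OLDINSTR - ALTINSTR); only the magnitude of the shift is the point here, not the sign convention used in the message above.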

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Woodhouse, David
On Fri, 2018-01-05 at 17:45 +0100, Borislav Petkov wrote: > On Fri, Jan 05, 2018 at 04:41:46PM +0000, Woodhouse, David wrote: > > Nope, alternatives are broken. Only a jmp as the *first* opcode of > > altinstr gets handled by recompute_jump(), while any subsequent insn is > > just copied untouched.

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Josh Poimboeuf
On Fri, Jan 05, 2018 at 05:45:06PM +0100, Borislav Petkov wrote: > On Fri, Jan 05, 2018 at 04:41:46PM +0000, Woodhouse, David wrote: > > Nope, alternatives are broken. Only a jmp as the *first* opcode of > > altinstr gets handled by recompute_jump(), while any subsequent insn is > > just copied unt

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Borislav Petkov
On Fri, Jan 05, 2018 at 04:41:46PM +0000, Woodhouse, David wrote: > Nope, alternatives are broken. Only a jmp as the *first* opcode of > altinstr gets handled by recompute_jump(), while any subsequent insn is > just copied untouched. Not broken - simply no one needed it until now. I'm looking into

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Woodhouse, David
On Fri, 2018-01-05 at 13:56 +0000, Woodhouse, David wrote: > > At some point during this whole painful mess, I had come to the > conclusion that having relocations in altinstr didn't work, and that's > why I had X86_xx_NO_RETPOLINE instead of X86_xx_RETPOLINE. I now think > that something else was

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Woodhouse, David
On Fri, 2018-01-05 at 13:54 +0100, Thomas Gleixner wrote: > On Thu, 4 Jan 2018, David Woodhouse wrote: > > diff --git a/arch/x86/include/asm/cpufeatures.h > > b/arch/x86/include/asm/cpufeatures.h > > index 07cdd1715705..900fa7016d3f 100644 > > --- a/arch/x86/include/asm/cpufeatures.h > > +++ b/arc

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Thomas Gleixner
On Fri, 5 Jan 2018, Juergen Gross wrote: > On 05/01/18 13:54, Thomas Gleixner wrote: > > On Thu, 4 Jan 2018, David Woodhouse wrote: > >> diff --git a/arch/x86/include/asm/cpufeatures.h > >> b/arch/x86/include/asm/cpufeatures.h > >> index 07cdd1715705..900fa7016d3f 100644 > >> --- a/arch/x86/includ

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Juergen Gross
On 05/01/18 13:54, Thomas Gleixner wrote: > On Thu, 4 Jan 2018, David Woodhouse wrote: >> diff --git a/arch/x86/include/asm/cpufeatures.h >> b/arch/x86/include/asm/cpufeatures.h >> index 07cdd1715705..900fa7016d3f 100644 >> --- a/arch/x86/include/asm/cpufeatures.h >> +++ b/arch/x86/include/asm/cpu

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Thomas Gleixner
On Thu, 4 Jan 2018, David Woodhouse wrote: > diff --git a/arch/x86/include/asm/cpufeatures.h > b/arch/x86/include/asm/cpufeatures.h > index 07cdd1715705..900fa7016d3f 100644 > --- a/arch/x86/include/asm/cpufeatures.h > +++ b/arch/x86/include/asm/cpufeatures.h > @@ -342,5 +342,6 @@ > #define X86_B

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Fri, Jan 5, 2018 at 3:26 AM, Paolo Bonzini wrote: > On 05/01/2018 11:28, Paul Turner wrote: >> >> The "pause; jmp" sequence proved minutely faster than "lfence;jmp" which is >> why >> it was chosen. >> >> "pause; jmp" 33.231 cycles/call 9.517 ns/call >> "lfence; jmp" 33.354 cycles/call 9.5

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paolo Bonzini
On 05/01/2018 11:28, Paul Turner wrote: > > The "pause; jmp" sequence proved minutely faster than "lfence;jmp" which is > why > it was chosen. > > "pause; jmp" 33.231 cycles/call 9.517 ns/call > "lfence; jmp" 33.354 cycles/call 9.552 ns/call Do you have timings for a non-retpolined indirect
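A quick check of the figures quoted above, as a sketch: dividing cycles per call by nanoseconds per call gives the clock rate implied by each measurement, and subtracting the rows gives the size of the gap. Only the four quoted numbers come from the thread; the variable names are illustrative.

```python
# Figures as quoted in the thread: (cycles per call, ns per call).
timings = {
    "pause; jmp": (33.231, 9.517),
    "lfence; jmp": (33.354, 9.552),
}

# cycles / ns is cycles per nanosecond, i.e. the implied clock in GHz.
implied_ghz = {name: cycles / ns for name, (cycles, ns) in timings.items()}

pause_cycles, pause_ns = timings["pause; jmp"]
lfence_cycles, lfence_ns = timings["lfence; jmp"]

delta_cycles = lfence_cycles - pause_cycles
delta_ns = lfence_ns - pause_ns
relative_gap = delta_cycles / pause_cycles

for name, ghz in implied_ghz.items():
    print(f"{name}: implied clock {ghz:.2f} GHz")
print(f"gap: {delta_cycles:.3f} cycles, {delta_ns:.3f} ns ({relative_gap:.2%})")
```

Both rows imply roughly the same clock (about 3.5 GHz), and the gap is well under one percent, which matches the description of "pause; jmp" as minutely faster.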

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Fri, Jan 05, 2018 at 10:55:38AM +0000, David Woodhouse wrote: > On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote: > > On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote: > > > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > > > >  > > > > Pretty much. > > > > Paul

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Fri, Jan 05, 2018 at 10:55:38AM +0000, David Woodhouse wrote: > On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote: > > On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote: > > > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > > > >  > > > > Pretty much. > > > > Paul

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread David Woodhouse
On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote: > On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote: > > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > > >  > > > Pretty much. > > > Paul's writeup: https://support.google.com/faqs/answer/7625886 > > > tldr: jmp *%

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote: > On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov > wrote: > > > > Clearly Paul's approach to retpoline without lfence is faster. Using pause rather than lfence does not represent a fundamental difference here. A protected indir

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Thu, Jan 04, 2018 at 10:40:23AM -0800, Andi Kleen wrote: > > Clearly Paul's approach to retpoline without lfence is faster. > > I'm guessing it wasn't shared with amazon/intel until now and > > this set of patches going to adopt it, right? > > > > Paul, could you share a link to a set of altern

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Thu, Jan 04, 2018 at 07:27:58PM +0000, David Woodhouse wrote: > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > > > > Pretty much. > > Paul's writeup: https://support.google.com/faqs/answer/7625886 > > tldr: jmp *%r11 gets converted to: > > call set_up_target; > > capture_spec: >

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread David Woodhouse
On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > > Pretty much. > Paul's writeup: https://support.google.com/faqs/answer/7625886 > tldr: jmp *%r11 gets converted to: > call set_up_target; > capture_spec: >   pause; >   jmp capture_spec; > set_up_target: >   mov %r11, (%rsp); >   ret;
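To make the control flow of the quoted sequence concrete, here is a minimal sketch that steps through it architecturally in Python. It is a toy model, not kernel or compiler code: the five instructions are placed at made-up addresses 0 to 4, `store_top` stands for `mov %r11, (%rsp)`, and the value passed as `r11` is a hypothetical target that must lie outside that range.

```python
def run_retpoline(r11, max_steps=16):
    """Architecturally execute the quoted sequence.

    Returns (final_target, trace), where trace lists the toy addresses executed.
    """
    # Toy addresses 0..4 stand in for the five instructions (hypothetical layout).
    program = {
        0: ("call", 3),     # call set_up_target
        1: ("pause",),      # capture_spec: pause
        2: ("jmp", 1),      #   jmp capture_spec
        3: ("store_top",),  # set_up_target: mov %r11, (%rsp)
        4: ("ret",),        #   ret
    }
    stack, trace, pc = [], [], 0
    for _ in range(max_steps):
        if pc not in program:
            return pc, trace  # control has left the sequence
        op = program[pc]
        trace.append(pc)
        if op[0] == "call":
            stack.append(pc + 1)  # return address is the capture loop
            pc = op[1]
        elif op[0] == "jmp":
            pc = op[1]
        elif op[0] == "store_top":
            stack[-1] = r11       # overwrite the return address with the target
            pc += 1
        elif op[0] == "ret":
            pc = stack.pop()      # architecturally returns to r11
        else:                     # pause
            pc += 1
    raise RuntimeError("sequence did not exit")


target, trace = run_retpoline(r11=0x1000)
print(hex(target), trace)
```

The architectural path never executes addresses 1 and 2. Those form the loop in which a return predicted from the address pushed by the `call` would spin speculatively until the real `ret` target resolves.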

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread Andi Kleen
> Clearly Paul's approach to retpoline without lfence is faster. > I'm guessing it wasn't shared with amazon/intel until now and > this set of patches going to adopt it, right? > > Paul, could you share a link to a set of alternative gcc patches > that do retpoline similar to llvm diff ? I don't

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread Alexei Starovoitov
On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote: > On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov > wrote: > > > > Clearly Paul's approach to retpoline without lfence is faster. > > I'm guessing it wasn't shared with amazon/intel until now and > > this set of patches going to a

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread Linus Torvalds
On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov wrote: > > Clearly Paul's approach to retpoline without lfence is faster. > I'm guessing it wasn't shared with amazon/intel until now and > this set of patches going to adopt it, right? > > Paul, could you share a link to a set of alternative gcc

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread Alexei Starovoitov
On Thu, Jan 04, 2018 at 02:36:58PM +0000, David Woodhouse wrote: > Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide > the corresponding thunks. Provide assembler macros for invoking the thunks > in the same way that GCC does, from native and inline assembler. > > This add

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread Linus Torvalds
David, these are all marked as spam, because your emails have screwed up DKIM. You used From: David Woodhouse but then you used infradead as a mailer, so it has the DKIM signature from infradead, not from Amazon.co.uk. The DKIM signature does pass for infradead, but amazon dmarc - quite re