Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-10 Thread Linus Torvalds
On Sat, Feb 10, 2018 at 7:26 AM, David Laight wrote: > > The alignment doesn't matter, 'rep movsl' will still work. .. no it won't. It might not copy the last two bytes or whatever, because the shift of the count will have ignored the low bits. But since an unaligned stack pointer really shouldn

RE: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-10 Thread David Laight
From: Linus Torvalds > Sent: 09 February 2018 19:49 ... > I think the instruction scheduling ends up basically breaking around > microcoded instructions, which is why you'll get something like 12+n > cycles for "rep movs" on some uarchs, but at that point it's probably > mostly in the noise compare

RE: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-10 Thread David Laight
From: Denys Vlasenko > Sent: 09 February 2018 17:17 > On 02/09/2018 06:05 PM, Linus Torvalds wrote: > > On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel wrote: > >> + > >> + /* Copy over the stack-frame */ > >> + cld > >> + rep movsb > > > > Ugh. This is going to be horrendous. Maybe

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Linus Torvalds
On Fri, Feb 9, 2018 at 11:25 AM, Joerg Roedel wrote: > > Ugh, okay. So I switch to movsl, that should at least perform on-par > with the chain of 'pushl' instructions I had before. It should generally be roughly in the same ballpark. I think the instruction scheduling ends up basically breaking

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Denys Vlasenko
On 02/09/2018 08:02 PM, Joerg Roedel wrote: > On Fri, Feb 09, 2018 at 09:05:02AM -0800, Linus Torvalds wrote: > > On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel wrote: > > > + > > > + /* Copy over the stack-frame */ > > > + cld > > > + rep movsb > > > > Ugh. This is going to be horrendous. Maybe not noticeable on

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Joerg Roedel
On Fri, Feb 09, 2018 at 11:17:35AM -0800, Linus Torvalds wrote: > Yeah, it's only true on the very latest uarchs, and even there it's > not perfect for small copies. > > On the older machines that are relevant for 32-bit code, it's often > tens of cycles just for the ucode overhead, I think, and "

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Linus Torvalds
On Fri, Feb 9, 2018 at 11:02 AM, Joerg Roedel wrote: > > Okay, I used movsb because I remembered that being the recommendation > for the most efficient memcpy, and it saves me an instruction. But that > is probably only true on modern CPUs. Yeah, it's only true on the very latest uarchs, and even

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Joerg Roedel
On Fri, Feb 09, 2018 at 05:43:55PM +, Andy Lutomirski wrote: > The 64-bit code mostly uses a bunch of push instructions for this. I had it implemented with tons of push instructions first, but that doesn't work in cases where the stack switch needs to happen only after everything is copied o

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Joerg Roedel
On Fri, Feb 09, 2018 at 09:05:02AM -0800, Linus Torvalds wrote: > On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel wrote: > > + > > + /* Copy over the stack-frame */ > > + cld > > + rep movsb > > Ugh. This is going to be horrendous. Maybe not noticeable on modern > CPU's, but the wh

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Andy Lutomirski
On Fri, Feb 9, 2018 at 5:05 PM, Linus Torvalds wrote: > On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel wrote: >> + >> + /* Copy over the stack-frame */ >> + cld >> + rep movsb > > Ugh. This is going to be horrendous. Maybe not noticeable on modern > CPU's, but the whole 32-bit cod

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Denys Vlasenko
On 02/09/2018 06:05 PM, Linus Torvalds wrote: > On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel wrote: >> + >> + /* Copy over the stack-frame */ >> + cld >> + rep movsb > > Ugh. This is going to be horrendous. Maybe not noticeable on modern CPU's, but the whole 32-bit code is kind of pointless o

Re: [PATCH 09/31] x86/entry/32: Leave the kernel via trampoline stack

2018-02-09 Thread Linus Torvalds
On Fri, Feb 9, 2018 at 1:25 AM, Joerg Roedel wrote: > + > + /* Copy over the stack-frame */ > + cld > + rep movsb Ugh. This is going to be horrendous. Maybe not noticeable on modern CPU's, but the whole 32-bit code is kind of pointless on a modern CPU. At least use "rep movsl".
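
A minimal C sketch of the count issue raised in this thread, not code from the patch: moving from "rep movsb" to "rep movsl" means the byte count gets shifted right by two to form a dword count, so the low bits are dropped and a length that is not a multiple of 4 leaves its tail uncopied. The helper name copy_dwords is hypothetical; it only models the copy loop in portable C, assuming 4-byte units and a forward copy (as after cld).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of "rep movsl" with a count derived from a byte
 * length. Copies bytes >> 2 dwords, front to back, and returns the
 * number of bytes actually copied.
 */
static size_t copy_dwords(void *dst, const void *src, size_t bytes)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t dwords = bytes >> 2;   /* low two bits of the length are discarded */

    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < dwords; i++)
        memcpy(d + 4 * i, s + 4 * i, 4);   /* one 4-byte move per iteration */

    return dwords * 4;
}
```

With a 10-byte length this copies 8 bytes and leaves the last 2 untouched, so the dword variant is only safe when the frame size is known to be a multiple of 4.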