Some results (was: Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs)

2015-02-25 Thread Borislav Petkov
On Fri, Feb 20, 2015 at 10:58:15AM -0800, Andy Lutomirski wrote:
> -	/* Auto enable eagerfpu for xsaveopt */
> -	if (cpu_has_xsaveopt && eagerfpu != DISABLE)
> +	/* Auto enable eagerfpu for everyone */
> +	if (eagerfpu != DISABLE)
> 		eagerfpu = ENABLE;

So Mel did run
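The quoted hunk changes when the default setting is promoted to eager mode. As a rough illustration only, the decision can be modelled in Python. The enum and function names below are invented for this sketch and are not kernel identifiers; `ENABLE` and `DISABLE` come from the hunk, `AUTO` is mentioned elsewhere in the thread, and nothing outside the quoted condition is modelled.

```python
from enum import Enum


class EagerFpu(Enum):
    # Tri-state modelled on the names used in the thread; the string
    # values are arbitrary and only exist for this sketch.
    AUTO = "auto"
    ENABLE = "enable"
    DISABLE = "disable"


def resolve_before_patch(setting, cpu_has_xsaveopt):
    """Old condition: promote to ENABLE only when the CPU has xsaveopt."""
    if cpu_has_xsaveopt and setting != EagerFpu.DISABLE:
        return EagerFpu.ENABLE
    return setting


def resolve_after_patch(setting):
    """New condition: promote to ENABLE unless explicitly disabled."""
    if setting != EagerFpu.DISABLE:
        return EagerFpu.ENABLE
    return setting


if __name__ == "__main__":
    # A CPU without xsaveopt stays on AUTO before the patch and becomes
    # ENABLE after it; an explicit DISABLE is respected either way.
    print(resolve_before_patch(EagerFpu.AUTO, cpu_has_xsaveopt=False))
    print(resolve_after_patch(EagerFpu.AUTO))
    print(resolve_after_patch(EagerFpu.DISABLE))
```

The only behavioural difference in this model is for CPUs without xsaveopt that were left on the default setting.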

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-25 Thread Ingo Molnar
* Borislav Petkov wrote: > On Tue, Feb 24, 2015 at 04:07:07PM -0800, Andy Lutomirski wrote: > > > I'd prefer a different partial solution: encourage > > everyone to clear the xstate before making syscalls > > (using e.g. vzeroall). In fact, maybe user code should > > aggressively clear

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-25 Thread Ingo Molnar
* Andy Lutomirski wrote: > > I'm a big fan of simplifying things, but. > > > > SIMD registers were growing in x86, and they are going > > to grow again, this time four-fold in Intel MIC: from > > sixteen 256-bit registers to thirty two 512-bit > > registers. > > > > That's 2 kbytes of data.
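The register-file arithmetic in the quoted text can be checked directly. This is a small sketch with a made-up helper name; it counts only the vector registers named in the quote and ignores any other saved state.

```python
def simd_state_bytes(num_registers, bits_per_register):
    """Size in bytes of a vector register file (registers only)."""
    return num_registers * bits_per_register // 8


# Sixteen 256-bit registers versus thirty-two 512-bit registers.
current = simd_state_bytes(16, 256)
grown = simd_state_bytes(32, 512)

print(current, grown, grown // current)  # 512 2048 4
```

The second figure is the "2 kbytes" mentioned in the message, and the ratio is the four-fold growth.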

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-25 Thread Borislav Petkov
On Tue, Feb 24, 2015 at 04:07:07PM -0800, Andy Lutomirski wrote: > I'd prefer a different partial solution: encourage everyone to clear > the xstate before making syscalls (using e.g. vzeroall). In fact, > maybe user code should aggressively clear newly-unused xstate. We don't trust userspace.

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-24 Thread Andy Lutomirski
On Tue, Feb 24, 2015 at 11:15 AM, Denys Vlasenko wrote: > On Fri, Feb 20, 2015 at 7:58 PM, Andy Lutomirski wrote: >> We have eager and lazy fpu modes, introduced in: >> >> 304bceda6a18 x86, fpu: use non-lazy fpu restore for processors supporting >> xsave >> >> The result is rather messy. There

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-24 Thread Denys Vlasenko
On Fri, Feb 20, 2015 at 7:58 PM, Andy Lutomirski wrote: > We have eager and lazy fpu modes, introduced in: > > 304bceda6a18 x86, fpu: use non-lazy fpu restore for processors supporting > xsave > > The result is rather messy. There are two code paths in almost all of the > FPU code, and only one

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-24 Thread Rik van Riel
On 02/23/2015 09:31 PM, Andy Lutomirski wrote: > On Mon, Feb 23, 2015 at 6:14 PM, Maciej W. Rozycki > wrote: >> That's an interesting case too, although not necessarily related. >> If you say that we always save the FP context eagerly for the >>

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Andy Lutomirski
On Mon, Feb 23, 2015 at 6:14 PM, Maciej W. Rozycki wrote: > On Mon, 23 Feb 2015, Andy Lutomirski wrote: > >> >> After a context switch, the instructions from the old task are no >> >> longer in the pipeline. >> > >> > I'd say it's implementation-specific. As I mentioned the i486 aborted >> >

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Maciej W. Rozycki
On Mon, 23 Feb 2015, Andy Lutomirski wrote: > >> After a context switch, the instructions from the old task are no > >> longer in the pipeline. > > > > I'd say it's implementation-specific. As I mentioned the i486 aborted > > any transcendental x87 instruction in progress upon taking an

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Maciej W. Rozycki
On Mon, 23 Feb 2015, Linus Torvalds wrote: > We have one traditional special case, which actually did something > like Maciej's nightmare scenario: the completely broken "FPU errors > over irq13" IBM PC/AT FPU linkage. > > But since we don't actually support old i386 machines any more, we >

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Andy Lutomirski
On Mon, Feb 23, 2015 at 4:56 PM, Maciej W. Rozycki wrote: > On Mon, 23 Feb 2015, Linus Torvalds wrote: > >> We have one traditional special case, which actually did something >> like Maciej's nightmare scenario: the completely broken "FPU errors >> over irq13" IBM PC/AT FPU linkage. >> >> But

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Andy Lutomirski
On Mon, Feb 23, 2015 at 2:27 PM, Maciej W. Rozycki wrote: > On Mon, 23 Feb 2015, Rik van Riel wrote: > >> > I meant something else -- a slow FPU instruction can retire after a >> > task has been switched where the FP context has been left intact, >> > i.e. in the lazy FP context switching case,

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Maciej W. Rozycki
On Mon, 23 Feb 2015, Rik van Riel wrote: > > I meant something else -- a slow FPU instruction can retire after a > > task has been switched where the FP context has been left intact, > > i.e. in the lazy FP context switching case, where only the MMU > > context and GPRs have been replaced. > > I

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Linus Torvalds
On Mon, Feb 23, 2015 at 1:21 PM, Rik van Riel wrote: > > On 02/23/2015 04:17 PM, Maciej W. Rozycki wrote: >>> >>> It seems highly unlikely to me that a slow FPU instruction can >>> retire *after* a subsequent fxsave, which would need to happen >>> for this to work. >> >> I meant something else --

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Rik van Riel
On 02/23/2015 04:17 PM, Maciej W. Rozycki wrote: > On Sat, 21 Feb 2015, Andy Lutomirski wrote: > >>> Additionally I believe long-executing FPU instructions (i.e. >>> transcendentals) can take advantage of continuing to execute in >>> parallel where

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Maciej W. Rozycki
On Sat, 21 Feb 2015, Andy Lutomirski wrote: > > Additionally I believe long-executing FPU instructions (i.e. > > transcendentals) can take advantage of continuing to execute in parallel > > where the context has already been switched rather than stalling an eager > > FPU context switch until the

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Oleg Nesterov
On 02/23, Rik van Riel wrote: > > On 02/23/2015 10:11 AM, Borislav Petkov wrote: > > On Mon, Feb 23, 2015 at 03:59:29PM +0100, Oleg Nesterov wrote: > >> Well, but if we want this change then perhaps we should simply > >> change the default value?

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Borislav Petkov
On Mon, Feb 23, 2015 at 10:51:26AM -0500, Rik van Riel wrote: > However, we would still need the rest of the kernel code to ... Yeah, let's wait out first and see what the benchmarks say. Mel started a bunch of them on a couple of boxes here, we'll have results in the coming days. --

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Rik van Riel
On 02/23/2015 10:11 AM, Borislav Petkov wrote: > On Mon, Feb 23, 2015 at 03:59:29PM +0100, Oleg Nesterov wrote: >> Well, but if we want this change then perhaps we should simply >> change the default value? This way "AUTO" still can work. > > Yeah,

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Rik van Riel
On 02/23/2015 10:03 AM, Borislav Petkov wrote: > On Mon, Feb 23, 2015 at 07:51:04AM -0500, Rik van Riel wrote: >> At that point we either load the FPU context, or we set CR0.TS. > > Right, but provided eager doesn't bring any slowdown, we can drop >

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Borislav Petkov
On Mon, Feb 23, 2015 at 03:59:29PM +0100, Oleg Nesterov wrote: > Well, but if we want this change then perhaps we should simply change > the default value? This way "AUTO" still can work. Yeah, sure, let's do some measurements first, to see whether this is even worth it. Btw, Mel pointed me at

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Borislav Petkov
On Mon, Feb 23, 2015 at 07:51:04AM -0500, Rik van Riel wrote: > At that point we either load the FPU context, or we > set CR0.TS. Right, but provided eager doesn't bring any slowdown, we can drop the TS fiddling altogether and only load FPU context. -- Regards/Gruss, Boris. ECO tip #101:
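The trade-off under discussion (load the FPU context at every switch, or set CR0.TS and load it only on first use) can be sketched as a deliberately simplified counting model. Everything here is an assumption made for illustration: the function name, the per-time-slice accounting, and the idea that lazy mode costs exactly one trap per slice that touches the FPU. It does not model the kernel's real heuristics.

```python
def simulate(schedule, fpu_users, eager):
    """Count FPU saves, restores and TS-style traps over a run schedule.

    schedule:  task names, one per time slice, in scheduling order.
    fpu_users: set of task names that touch the FPU in every slice.
    eager:     True  -> restore at switch-in and save at switch-out, always;
               False -> trap on first FPU use, then restore, and save at
                        switch-out only if the FPU was used in the slice.
    """
    counts = {"saves": 0, "restores": 0, "traps": 0}
    for task in schedule:
        if eager:
            counts["restores"] += 1
            counts["saves"] += 1
        elif task in fpu_users:
            counts["traps"] += 1      # stands in for the TS fiddling
            counts["restores"] += 1
            counts["saves"] += 1
    return counts


if __name__ == "__main__":
    schedule = ["a", "b", "a", "b"]
    # Only one task uses the FPU: lazy mode halves the saves and restores.
    print(simulate(schedule, {"a"}, eager=False))
    print(simulate(schedule, {"a"}, eager=True))
    # Every task uses the FPU: lazy mode does the same work plus traps.
    print(simulate(schedule, {"a", "b"}, eager=False))
```

In this toy model, lazy mode only wins when a meaningful share of time slices never touch the FPU, which is the question the benchmarks in the thread try to answer.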

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Oleg Nesterov
On 02/20, Andy Lutomirski wrote: > > We have eager and lazy fpu modes, introduced in: > > 304bceda6a18 x86, fpu: use non-lazy fpu restore for processors supporting > xsave > > The result is rather messy. There are two code paths in almost all of the > FPU code, and only one of them (the eager

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-23 Thread Rik van Riel
On 02/23/2015 12:22 AM, Andy Lutomirski wrote: > On Sun, Feb 22, 2015 at 5:45 PM, Rik van Riel > wrote: >> One implication of this is that in kernel mode, we can no longer >> just assume that the user space FPU state is always loaded, and >> we need

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Andy Lutomirski
On Sun, Feb 22, 2015 at 5:45 PM, Rik van Riel wrote: > On 02/22/2015 06:06 AM, Borislav Petkov wrote: >> On Sat, Feb 21, 2015 at 06:18:01PM -0800, Andy Lutomirski wrote: >>> That's true. The question is whether there are enough of them, >>> and

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Rik van Riel
On 02/22/2015 06:06 AM, Borislav Petkov wrote: > On Sat, Feb 21, 2015 at 06:18:01PM -0800, Andy Lutomirski wrote: >> That's true. The question is whether there are enough of them, >> and whether twiddling TS is fast enough, that it's worth it. > >

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Borislav Petkov
On Sun, Feb 22, 2015 at 01:57:36PM +0100, Ingo Molnar wrote: > This is also very similar to the ~0.6 secs improvement your > first set of numbers gave. Yeah, running without --repeat was simply misleading. > So now that it appears we have consistent numbers, it would > be nice to check it on

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Ingo Molnar
* Borislav Petkov wrote: > Lazy FPU: > 219.406449195 seconds time elapsed >( +- 0.17% ) > Eager FPU: > 218.791122148 seconds time elapsed >( +- 0.13% ) > Timing improvement of 0.6 secs on average
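The two elapsed times quoted above can be compared with a few lines of arithmetic. The helper name is invented for this sketch; the inputs are the figures from the message.

```python
def compare_runs(lazy_seconds, eager_seconds):
    """Return (delta in seconds, delta as a percentage of the lazy run)."""
    delta = lazy_seconds - eager_seconds
    return delta, 100.0 * delta / lazy_seconds


delta, percent = compare_runs(219.406449195, 218.791122148)
print(f"eager is {delta:.3f} s faster ({percent:.2f}% of the lazy run)")
```

The difference works out to roughly 0.6 seconds, or a little under 0.3% of the run, which is of the same order as the reported run-to-run variation (+- 0.17% and +- 0.13%).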

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Borislav Petkov
On Sun, Feb 22, 2015 at 09:18:40AM +0100, Ingo Molnar wrote: > - It might make sense to do a 'perf stat --null --repeat' > measurement as well [without any -e arguments], to make > sure the rich PMU stats you are gathering are not > interfering? Well, the --repeat thing definitely

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Borislav Petkov
On Sat, Feb 21, 2015 at 06:18:01PM -0800, Andy Lutomirski wrote: > That's true. The question is whether there are enough of them, and > whether twiddling TS is fast enough, that it's worth it. Yes, and let me make it clear what I'm trying to do here: I want to make sure that eager FPU handling

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Borislav Petkov
On Sun, Feb 22, 2015 at 09:18:40AM +0100, Ingo Molnar wrote: > So am I interpreting the older and your latest numbers > correctly in stating that the cost observation has flipped > around 180 degrees: the first measurement showed eager FPU > to be a win, but now that we can do more precise >

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Ingo Molnar
* Ingo Molnar wrote: > - Do you have enough RAM that there's essentially no IO > in the system worth speaking of? Do you have enough RAM > to copy a whole kernel tree to /tmp/linux/ and do the > measurement there, on ramfs? Doing that will also pin down the page cache: kernel

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-22 Thread Ingo Molnar
* Borislav Petkov wrote: > which spit this: > > Lazy FPU: > 219.127929718 seconds time elapsed > Eager FPU: > 220.148034331 seconds time elapsed > so we have a second slowdown and 200K FPU saves more in eager mode. So am I interpreting the older and your latest numbers correctly

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Andy Lutomirski
On Sat, Feb 21, 2015 at 4:34 PM, Maciej W. Rozycki wrote: > On Sat, 21 Feb 2015, Borislav Petkov wrote: > >> Provided I've not made a mistake, this leads me to think that this >> simple workload and pretty much everything else uses the FPU through >> glibc which does the SSE memcpy and so on.

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Maciej W. Rozycki
On Sat, 21 Feb 2015, Borislav Petkov wrote: > Provided I've not made a mistake, this leads me to think that this > simple workload and pretty much everything else uses the FPU through > glibc which does the SSE memcpy and so on. Which basically kills the > whole idea behind lazy FPU as

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Borislav Petkov
On Sat, Feb 21, 2015 at 08:23:52PM +0100, Ingo Molnar wrote: > to switch between the modes? I went all out and did a debugfs file, see patch at the end, which counts FPU saves. Then I ran this script: --- #!/bin/bash D="/sys/kernel/debug/fpu/eager" echo "Lazy FPU: " echo 0 > $D echo -n " FPU

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Ingo Molnar
* Borislav Petkov wrote: > > I'd sleep a lot better if we had some runtime debug > > flag to be able to do run-to-run comparisons on the > > same booted up kernel, or so. > > Let me take a look whether we could do some knob... The > nice thing is, code uses use_eager_fpu() to check stuff >

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Borislav Petkov
On Sat, Feb 21, 2015 at 07:39:52PM +0100, Ingo Molnar wrote: > So the workload improved by ~600,000 usecs, and there's > 68,000 less calls, so it saved 8.8 usecs per call. Isn't I think you mean more calls. The eager measurement has more calls. Let me do some primitive math: def
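The per-call figure in the quoted text is a simple division, sketched below. The helper name is made up. The rounded inputs (~600,000 usecs, 68,000 calls) come from the quote, and the unrounded elapsed times are the "plain 3.19" and "eagerfpu=ENABLE" results reported elsewhere in this thread. The reply disputes which mode has more calls, so only the magnitude is computed here.

```python
def usecs_per_call(delta_seconds, delta_calls):
    """Time difference divided by call-count difference, in microseconds."""
    return delta_seconds * 1_000_000 / delta_calls


# Rounded figures from the quoted message.
rounded = usecs_per_call(0.6, 68_000)

# Unrounded elapsed times: plain 3.19 versus eagerfpu=ENABLE.
exact = usecs_per_call(234.681331200 - 234.066525648, 68_000)

print(f"{rounded:.1f} usecs per call (rounded inputs)")
print(f"{exact:.1f} usecs per call (unrounded elapsed times)")
```

With the rounded inputs this reproduces the 8.8 usecs quoted; the unrounded times give a value slightly above 9 usecs.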

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Ingo Molnar
* Borislav Petkov wrote: > On Sat, Feb 21, 2015 at 05:38:40PM +0100, Borislav Petkov wrote: > > My assumption is that libc uses SSE for memcpy and thus the FPU will > > be used. (I'll trace FPU-specific PMCs later to confirm). > > Ok, so I slapped a trace_printk() at the beginning of

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Ingo Molnar
* Borislav Petkov wrote: > plain 3.19: > > 234.681331200 seconds time elapsed >( +- 0.15% ) > > eagerfpu=ENABLE > > 234.066525648 seconds time elapsed >( +- 0.19% ) hm, a win of more than 600

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Borislav Petkov
On Sat, Feb 21, 2015 at 05:38:40PM +0100, Borislav Petkov wrote: > My assumption is that libc uses SSE for memcpy and thus the FPU will > be used. (I'll trace FPU-specific PMCs later to confirm). Ok, so I slapped a trace_printk() at the beginning of fpu_save_init() and did a kernel build once

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Borislav Petkov
On Sat, Feb 21, 2015 at 10:31:50AM +0100, Ingo Molnar wrote: > So it would be nice to test this on at least one reasonably old (but > not uncomfortably old - say 5 years old) system, to get a feel for > what kind of performance impact it has there. Yeah, this is exactly what Andy and I were

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-21 Thread Ingo Molnar
* Andy Lutomirski wrote: > We have eager and lazy fpu modes, introduced in: > > 304bceda6a18 x86, fpu: use non-lazy fpu restore for processors supporting > xsave > > The result is rather messy. There are two code paths in > almost all of the FPU code, and only one of them (the > eager

Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-20 Thread Borislav Petkov
+ Linus. I'm sure he'll have something to say about it :-) On Fri, Feb 20, 2015 at 10:58:15AM -0800, Andy Lutomirski wrote: > We have eager and lazy fpu modes, introduced in: > > 304bceda6a18 x86, fpu: use non-lazy fpu restore for processors supporting > xsave > > The result is rather messy.

[RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

2015-02-20 Thread Andy Lutomirski
We have eager and lazy fpu modes, introduced in: 304bceda6a18 x86, fpu: use non-lazy fpu restore for processors supporting xsave The result is rather messy. There are two code paths in almost all of the FPU code, and only one of them (the eager case) is tested frequently, since most kernel
