Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Sun, Feb 22, 2015 at 10:31:26AM -0800, Arjan van de Ven wrote:
> >>To show the boot time, I'm using the timestamp of the "Write protecting" line,
> >>that's pretty much the last thing we print prior to ring 3 execution.
> >
> >That's a little sad; we ought to be write-protecting kernel read-only
> >data as *early* as possible.
>
> well... if you are compromised before the first ring 3 instruction...
> you have a slightly bigger problem than where in the kernel we write
> protect things.

Definitely not talking about malicious compromise here; malicious code
could just remove the write protection.

However, write-protecting kernel read-only data also protects against a
class of bugs.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
>>To show the boot time, I'm using the timestamp of the "Write protecting" line,
>>that's pretty much the last thing we print prior to ring 3 execution.
>
>That's a little sad; we ought to be write-protecting kernel read-only
>data as *early* as possible.

well... if you are compromised before the first ring 3 instruction...
you have a slightly bigger problem than where in the kernel we write
protect things.
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Sat, Feb 21, 2015 at 07:58:07PM -0800, Josh Triplett wrote:
> On Sat, Feb 21, 2015 at 07:51:34AM -0800, Arjan van de Ven wrote:
> > >>
> > >>there's a few others as well that I'm chasing down...
> > >>.. but the flip side, prior to running ring 3 code, why NOT do fast
> > >>expedites?
> > >
> > >It would be good to have before-and-after measurements of actual
> > >boot time. Are these numbers available?
> >
> > To show the boot time, I'm using the timestamp of the "Write protecting" line,
> > that's pretty much the last thing we print prior to ring 3 execution.
>
> That's a little sad; we ought to be write-protecting kernel read-only
> data as *early* as possible.
>
> > A kernel with default RCU behavior (inside KVM, only virtual devices) looks
> > like this:
> >
> > [0.038724] Write protecting the kernel read-only data: 10240k
> >
> > a kernel with expedited RCU (using the command line option, so that I don't have
> > to recompile between measurements and thus am completely oranges-to-oranges)
> >
> > [0.031768] Write protecting the kernel read-only data: 10240k
> >
> > which, in percentage, is an 18% improvement.
>
> Nice improvement, but that suggests that we're spending far too much
> time waiting on RCU grace periods at boot time.

Let's see... 0.038724-0.031768=0.006956, or about seven milliseconds.
This might be as many as ten grace periods, but is more likely to be
about two of them. Of course, this counts only the grace periods after
the scheduler starts, as those prior to scheduler start are no-ops,
courtesy of your single-CPU optimization.

So, how many grace periods between scheduler start and init spawning do
you feel would be appropriate?

Thanx, Paul
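For readers who want to check the arithmetic in this message, here is a minimal user-space sketch in C. It only reproduces the subtraction and the percentage from the two dmesg timestamps quoted above (about 17.96%, which rounds to the 18% quoted). The function names are made up for illustration and are not part of any kernel interface.

```c
#include <assert.h>
#include <stdio.h>

/* Absolute time saved, in seconds, between two boot timestamps. */
static double boot_delta_seconds(double baseline, double expedited)
{
    return baseline - expedited;
}

/* Relative improvement, as a percentage of the baseline timestamp. */
static double boot_improvement_percent(double baseline, double expedited)
{
    assert(baseline > 0.0);
    return 100.0 * (baseline - expedited) / baseline;
}

/* Print both figures for the two timestamps quoted in the thread. */
static void report_boot_delta(void)
{
    const double baseline = 0.038724;  /* default RCU behavior */
    const double expedited = 0.031768; /* expedited RCU */

    printf("saved: %.6f s (%.1f%%)\n",
           boot_delta_seconds(baseline, expedited),
           boot_improvement_percent(baseline, expedited));
}
```

Calling report_boot_delta() from a main() prints the delta in seconds and the percentage. Note that this measures only the time up to the "Write protecting" line, as described in the quoted message, not total boot time.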
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Sat, Feb 21, 2015 at 07:51:34AM -0800, Arjan van de Ven wrote:
> >>
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast
> >>expedites?
> >
> >It would be good to have before-and-after measurements of actual
> >boot time. Are these numbers available?
>
> To show the boot time, I'm using the timestamp of the "Write protecting" line,
> that's pretty much the last thing we print prior to ring 3 execution.

That's a little sad; we ought to be write-protecting kernel read-only
data as *early* as possible.

> A kernel with default RCU behavior (inside KVM, only virtual devices) looks
> like this:
>
> [0.038724] Write protecting the kernel read-only data: 10240k
>
> a kernel with expedited RCU (using the command line option, so that I don't have
> to recompile between measurements and thus am completely oranges-to-oranges)
>
> [0.031768] Write protecting the kernel read-only data: 10240k
>
> which, in percentage, is an 18% improvement.

Nice improvement, but that suggests that we're spending far too much
time waiting on RCU grace periods at boot time.

- Josh Triplett
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Sat, Feb 21, 2015 at 05:08:52PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 09:45:39AM -0800, Arjan van de Ven wrote:
> > On 2/20/2015 9:43 AM, Peter Zijlstra wrote:
> > >On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > >>there's a few others as well that I'm chasing down...
> > >>.. but the flip side, prior to running ring 3 code, why NOT do fast
> > >>expedites?
> > >
> > >So my objections are twofold:
> > >
> > > - I object to fast expedites in principle; they spray IPIs across the
> > >   system, so ideally we'd not have them at all, therefore also not at
> > >   boot.
> > >
> > >   Because as soon as the option exists, people will use it for other
> > >   things too.
> >
> > the option exists today in sysfs and kernel parameter...
>
> Yeah, Paul and me have been having this argument for a while now ;-)

Indeed we have. ;-)

And if expedited grace periods start causing latency issues in real-world
workloads, I will address those issues. In the meantime, one of the nice
things about NO_HZ_FULL is that synchronize_sched_expedited() avoids
IPIing CPUs having a single runnable task that is running in nohz_full
mode. ;-)

Thanx, Paul

> > >And esp. in bootup code you can special case a lot of stuff; there's
> > >limited concurrency esp. because userspace is not there yet. So we might
> > >not actually need those sync calls.
> >
> > yeah I am going down that angle as well absolutely.
> > but there are cases that may well be legit (or are 5 function calls deep
> > into common code)
>
> Good ;-)
>
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Sat, Feb 21, 2015 at 03:12:01PM +, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Josh Triplett"
> > To: "Peter Zijlstra"
> > Cc: "Paul E. McKenney", linux-kernel@vger.kernel.org, mi...@kernel.org,
> > la...@cn.fujitsu.com, dipan...@in.ibm.com, a...@linux-foundation.org,
> > "mathieu desnoyers", t...@linutronix.de, rost...@goodmis.org,
> > dhowe...@redhat.com, eduma...@google.com,
> > dvh...@linux.intel.com, fweis...@gmail.com, o...@redhat.com, "bobby prani"
> > Sent: Saturday, February 21, 2015 1:04:28 AM
> > Subject: Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace
> > periods
> >
> > On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> > > On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > > > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > > > Does it really make a machine boot much faster? Why are people using
> > > > > synchronous gp primitives if they care about speed? Should we not fix
> > > > > that instead?
> > > >
> > > > The report I heard was that it provided 10-15% faster boot times.
> > >
> > > That's not insignificant; got more details? I think we should really
> > > look at why people are using the sync primitives.
> >
> > Paul, what do you think about adding a compile-time debug option to
> > synchronize_rcu() that causes it to capture the time on entry and exit
> > and print the duration together with the file:line of the caller?
> > Similar to initcall_debug, but for blocking calls to synchronize_rcu().
> > Put that together with initcall_debug, and you'd have a pretty good idea
> > of where that holds up boot.
> >
> > We do want early boot to run as asynchronously as possible, and to avoid
> > having later bits of boot waiting on a synchronize_rcu from earlier bits
> > of boot. Switching a caller over to call_rcu() doesn't actually help if
> > it still has to finish a grace period before it can allow later bits to
> > run. Ideally, we ought to be able to work out the "depth" of boot in
> > grace-periods.
> >
> > Has anyone wired initcall_debug up to a bootchart-like graph?
>
> The information about begin/end of synchronize_rcu, as well as begin/end
> of rcu_barrier() seems to be very relevant here. This should perhaps be
> covered by tracepoints? Isn't it already?

Good points, but they did measure this somehow.

Wouldn't some ftrace magic get this result?

Thanx, Paul

> Thanks,
>
> Mathieu
>
> >
> > - Josh Triplett
> >
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Sat, Feb 21, 2015 at 07:51:34AM -0800, Arjan van de Ven wrote:
> >>
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast
> >>expedites?
> >
> >It would be good to have before-and-after measurements of actual
> >boot time. Are these numbers available?
>
>
> To show the boot time, I'm using the timestamp of the "Write protecting" line,
> that's pretty much the last thing we print prior to ring 3 execution.
>
> A kernel with default RCU behavior (inside KVM, only virtual devices) looks
> like this:
>
> [0.038724] Write protecting the kernel read-only data: 10240k
>
> a kernel with expedited RCU (using the command line option, so that I don't have
> to recompile between measurements and thus am completely oranges-to-oranges)
>
> [0.031768] Write protecting the kernel read-only data: 10240k
>
> which, in percentage, is an 18% improvement.

Thank you, will repost with this info.

Thanx, Paul
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Sat, Feb 21, 2015 at 05:11:30PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 10:38:49AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 20, 2015 at 06:43:59PM +0100, Peter Zijlstra wrote:
> > > On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > > > there's a few others as well that I'm chasing down...
> > > > .. but the flip side, prior to running ring 3 code, why NOT do fast
> > > > expedites?
> > >
> > > So my objections are twofold:
> > >
> > >  - I object to fast expedites in principle; they spray IPIs across the
> > >    system, so ideally we'd not have them at all, therefore also not at
> > >    boot.
> >
> > There are only a few uses of expedited grace periods, despite their
> > having been in the kernel for some years. So people do seem to be
> > exercising appropriate restraint here.
>
> Or people just don't know about it :-)

"Ignorance: The #1 contributor to appropriate restraint!" ;-)

> > >    Because as soon as the option exists, people will use it for other
> > >    things too.
> > >
> > >  - The proposed interface is very much exposed to everybody who wants
> > >    it; this again is wide open to (ab)use.
> > >
> > >    Once it exists people will start to use, and before you know it we'll
> > >    always have that fast counter incremented and we're in IPI hell. Most
> > >    likely because someone was lazy and it seemed like a quick fix for
> > >    some stupid code.
> >
> > I suppose that another way to keep it private would be to have the
> > declaration in both update.c and rcutorture.c. This would mean that no
> > other file could invoke it, and should keep 0day happy. It would mean
> > that the declarations would be duplicated, but worse things could happen.
>
> Why do you need it for rcu torture? That can call the regular expedited
> call to exercise those rcu paths, right?

Yes, but not the ability to turn expediting on and off in the normal path.

> That would allow you to use system_state < SYSTEM_RUNNING if you really
> wanted to do this without exposing any interface for this.

This decision is not up to RCU. Something else must tell RCU whether or
not and when to treat normal grace periods as expedited.

Thanx, Paul
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 10:38:49AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 20, 2015 at 06:43:59PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > > there's a few others as well that I'm chasing down...
> > > .. but the flip side, prior to running ring 3 code, why NOT do fast
> > > expedites?
> >
> > So my objections are twofold:
> >
> >  - I object to fast expedites in principle; they spray IPIs across the
> >    system, so ideally we'd not have them at all, therefore also not at
> >    boot.
>
> There are only a few uses of expedited grace periods, despite their
> having been in the kernel for some years. So people do seem to be
> exercising appropriate restraint here.

Or people just don't know about it :-)

> >    Because as soon as the option exists, people will use it for other
> >    things too.
> >
> >  - The proposed interface is very much exposed to everybody who wants
> >    it; this again is wide open to (ab)use.
> >
> >    Once it exists people will start to use, and before you know it we'll
> >    always have that fast counter incremented and we're in IPI hell. Most
> >    likely because someone was lazy and it seemed like a quick fix for
> >    some stupid code.
>
> I suppose that another way to keep it private would be to have the
> declaration in both update.c and rcutorture.c. This would mean that no
> other file could invoke it, and should keep 0day happy. It would mean
> that the declarations would be duplicated, but worse things could happen.

Why do you need it for rcu torture? That can call the regular expedited
call to exercise those rcu paths, right?

That would allow you to use system_state < SYSTEM_RUNNING if you really
wanted to do this without exposing any interface for this.
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 09:45:39AM -0800, Arjan van de Ven wrote:
> On 2/20/2015 9:43 AM, Peter Zijlstra wrote:
> >On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast
> >>expedites?
> >
> >So my objections are twofold:
> >
> > - I object to fast expedites in principle; they spray IPIs across the
> >   system, so ideally we'd not have them at all, therefore also not at
> >   boot.
> >
> >   Because as soon as the option exists, people will use it for other
> >   things too.
>
> the option exists today in sysfs and kernel parameter...

Yeah, Paul and me have been having this argument for a while now ;-)

> >And esp. in bootup code you can special case a lot of stuff; there's
> >limited concurrency esp. because userspace is not there yet. So we might
> >not actually need those sync calls.
>
> yeah I am going down that angle as well absolutely.
> but there are cases that may well be legit (or are 5 function calls deep into
> common code)

Good ;-)
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
>>
>>there's a few others as well that I'm chasing down...
>>.. but the flip side, prior to running ring 3 code, why NOT do fast
>>expedites?
>
>It would be good to have before-and-after measurements of actual
>boot time. Are these numbers available?


To show the boot time, I'm using the timestamp of the "Write protecting" line,
that's pretty much the last thing we print prior to ring 3 execution.

A kernel with default RCU behavior (inside KVM, only virtual devices) looks like this:

[0.038724] Write protecting the kernel read-only data: 10240k

a kernel with expedited RCU (using the command line option, so that I don't have
to recompile between measurements and thus am completely oranges-to-oranges)

[0.031768] Write protecting the kernel read-only data: 10240k

which, in percentage, is an 18% improvement.
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
- Original Message -
> From: "Josh Triplett"
> To: "Peter Zijlstra"
> Cc: "Paul E. McKenney", linux-kernel@vger.kernel.org, mi...@kernel.org,
> la...@cn.fujitsu.com, dipan...@in.ibm.com, a...@linux-foundation.org,
> "mathieu desnoyers", t...@linutronix.de, rost...@goodmis.org,
> dhowe...@redhat.com, eduma...@google.com,
> dvh...@linux.intel.com, fweis...@gmail.com, o...@redhat.com, "bobby prani"
> Sent: Saturday, February 21, 2015 1:04:28 AM
> Subject: Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace
> periods
>
> On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > > Does it really make a machine boot much faster? Why are people using
> > > > synchronous gp primitives if they care about speed? Should we not fix
> > > > that instead?
> > >
> > > The report I heard was that it provided 10-15% faster boot times.
> >
> > That's not insignificant; got more details? I think we should really
> > look at why people are using the sync primitives.
>
> Paul, what do you think about adding a compile-time debug option to
> synchronize_rcu() that causes it to capture the time on entry and exit
> and print the duration together with the file:line of the caller?
> Similar to initcall_debug, but for blocking calls to synchronize_rcu().
> Put that together with initcall_debug, and you'd have a pretty good idea
> of where that holds up boot.
>
> We do want early boot to run as asynchronously as possible, and to avoid
> having later bits of boot waiting on a synchronize_rcu from earlier bits
> of boot. Switching a caller over to call_rcu() doesn't actually help if
> it still has to finish a grace period before it can allow later bits to
> run. Ideally, we ought to be able to work out the "depth" of boot in
> grace-periods.
>
> Has anyone wired initcall_debug up to a bootchart-like graph?

The information about begin/end of synchronize_rcu, as well as begin/end
of rcu_barrier() seems to be very relevant here. This should perhaps be
covered by tracepoints? Isn't it already?

Thanks,

Mathieu

>
> - Josh Triplett
>

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > Does it really make a machine boot much faster? Why are people using
> > > synchronous gp primitives if they care about speed? Should we not fix
> > > that instead?
> >
> > The report I heard was that it provided 10-15% faster boot times.
>
> That's not insignificant; got more details? I think we should really
> look at why people are using the sync primitives.

Paul, what do you think about adding a compile-time debug option to
synchronize_rcu() that causes it to capture the time on entry and exit
and print the duration together with the file:line of the caller?
Similar to initcall_debug, but for blocking calls to synchronize_rcu().
Put that together with initcall_debug, and you'd have a pretty good idea
of where that holds up boot.

We do want early boot to run as asynchronously as possible, and to avoid
having later bits of boot waiting on a synchronize_rcu from earlier bits
of boot. Switching a caller over to call_rcu() doesn't actually help if
it still has to finish a grace period before it can allow later bits to
run. Ideally, we ought to be able to work out the "depth" of boot in
grace-periods.

Has anyone wired initcall_debug up to a bootchart-like graph?

- Josh Triplett
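To make the proposal in this message concrete, here is a rough user-space sketch in C of the "capture the time on entry and exit and print the duration together with the file:line of the caller" idea. It is not kernel code and not the patch under discussion: fake_blocking_sync(), timed_call() and TIMED_SYNC() are hypothetical names, and a short nanosleep() stands in for a blocking call such as synchronize_rcu().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for a blocking call such as synchronize_rcu(). */
static void fake_blocking_sync(void)
{
    struct timespec ts = { 0, 2 * 1000 * 1000 }; /* sleep about 2 ms */

    nanosleep(&ts, NULL);
}

/* Elapsed nanoseconds between two CLOCK_MONOTONIC samples. */
static long long elapsed_ns(const struct timespec *start,
                            const struct timespec *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
           + (end->tv_nsec - start->tv_nsec);
}

/* Run fn(), then report how long it blocked and where it was called from. */
static long long timed_call(void (*fn)(void), const char *file, int line)
{
    struct timespec start, end;
    long long ns;

    assert(fn != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    fn();
    clock_gettime(CLOCK_MONOTONIC, &end);
    ns = elapsed_ns(&start, &end);
    fprintf(stderr, "%s:%d: blocked for %lld ns\n", file, line, ns);
    return ns;
}

/* The macro is what captures the caller's file:line. */
#define TIMED_SYNC(fn) timed_call((fn), __FILE__, __LINE__)
```

A caller would write TIMED_SYNC(fake_blocking_sync); and get one line per blocking call on stderr. A kernel version would presumably sit behind a debug config option and use kernel timekeeping and printk instead, which is outside what this sketch shows.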
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 10:29:34AM -0800, Arjan van de Ven wrote:
> On 2/20/2015 10:27 AM, Paul E. McKenney wrote:
> >On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> >>>>>>Does it really make a machine boot much faster? Why are people using
> >>>>>>synchronous gp primitives if they care about speed? Should we not fix
> >>>>>>that instead?
> >>>>>
> >>>>>The report I heard was that it provided 10-15% faster boot times.
> >>>>
> >>>>That's not insignificant; got more details? I think we should really
> >>>>look at why people are using the sync primitives.
> >>>
> >>>I must defer to the people who took the exact measurements.
> >>>
> >>>But yes, once I have that info, I should add it to the commit log.
> >>
> >>so the two most obvious cases are
> >>
> >>Registering sysrq keys ... even when the old key code had no handler
> >>(have a patch pending for this)
> >>
> >>registering idle handlers
> >>(this is more tricky, it's very obvious abuse but the fix is less clear)
> >>
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast
> >>expedites?
> >
> >It would be good to have before-and-after measurements of actual
> >boot time. Are these numbers available?
>
> I'll make you pretty graphs when I get home from collab summit, which
> should be later today

Very good, looking forward to seeing them. ;-)

Thanx, Paul
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 06:43:59PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > there's a few others as well that I'm chasing down...
> > .. but the flip side, prior to running ring 3 code, why NOT do fast
> > expedites?
>
> So my objections are twofold:
>
>  - I object to fast expedites in principle; they spray IPIs across the
>    system, so ideally we'd not have them at all, therefore also not at
>    boot.

There are only a few uses of expedited grace periods, despite their
having been in the kernel for some years. So people do seem to be
exercising appropriate restraint here.

>    Because as soon as the option exists, people will use it for other
>    things too.
>
>  - The proposed interface is very much exposed to everybody who wants
>    it; this again is wide open to (ab)use.
>
>    Once it exists people will start to use, and before you know it we'll
>    always have that fast counter incremented and we're in IPI hell. Most
>    likely because someone was lazy and it seemed like a quick fix for
>    some stupid code.

I suppose that another way to keep it private would be to have the
declaration in both update.c and rcutorture.c. This would mean that no
other file could invoke it, and should keep 0day happy. It would mean
that the declarations would be duplicated, but worse things could happen.

> And esp. in bootup code you can special case a lot of stuff; there's
> limited concurrency esp. because userspace is not there yet. So we might
> not actually need those sync calls.

I expect that some could be rewritten, but it might not work well for
code common to boot and to runtime.

Thanx, Paul
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On 2/20/2015 10:27 AM, Paul E. McKenney wrote:
>On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
>>>>>>Does it really make a machine boot much faster? Why are people using
>>>>>>synchronous gp primitives if they care about speed? Should we not fix
>>>>>>that instead?
>>>>>
>>>>>The report I heard was that it provided 10-15% faster boot times.
>>>>
>>>>That's not insignificant; got more details? I think we should really
>>>>look at why people are using the sync primitives.
>>>
>>>I must defer to the people who took the exact measurements.
>>>
>>>But yes, once I have that info, I should add it to the commit log.
>>
>>so the two most obvious cases are
>>
>>Registering sysrq keys ... even when the old key code had no handler
>>(have a patch pending for this)
>>
>>registering idle handlers
>>(this is more tricky, it's very obvious abuse but the fix is less clear)
>>
>>there's a few others as well that I'm chasing down...
>>.. but the flip side, prior to running ring 3 code, why NOT do fast
>>expedites?
>
>It would be good to have before-and-after measurements of actual
>boot time. Are these numbers available?

I'll make you pretty graphs when I get home from collab summit, which
should be later today
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> >>>>Does it really make a machine boot much faster? Why are people using
> >>>>synchronous gp primitives if they care about speed? Should we not fix
> >>>>that instead?
> >>>
> >>>The report I heard was that it provided 10-15% faster boot times.
> >>
> >>That's not insignificant; got more details? I think we should really
> >>look at why people are using the sync primitives.
> >
> >I must defer to the people who took the exact measurements.
> >
> >But yes, once I have that info, I should add it to the commit log.
>
> so the two most obvious cases are
>
> Registering sysrq keys ... even when the old key code had no handler
> (have a patch pending for this)
>
> registering idle handlers
> (this is more tricky, it's very obvious abuse but the fix is less clear)
>
> there's a few others as well that I'm chasing down...
> .. but the flip side, prior to running ring 3 code, why NOT do fast expedites?

It would be good to have before-and-after measurements of actual
boot time. Are these numbers available?

Thanx, Paul
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On 2/20/2015 9:43 AM, Peter Zijlstra wrote:
>On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
>>there's a few others as well that I'm chasing down...
>>.. but the flip side, prior to running ring 3 code, why NOT do fast
>>expedites?
>
>So my objections are twofold:
>
> - I object to fast expedites in principle; they spray IPIs across the
>   system, so ideally we'd not have them at all, therefore also not at
>   boot.
>
>   Because as soon as the option exists, people will use it for other
>   things too.

the option exists today in sysfs and kernel parameter...

>And esp. in bootup code you can special case a lot of stuff; there's
>limited concurrency esp. because userspace is not there yet. So we might
>not actually need those sync calls.

yeah I am going down that angle as well absolutely.
but there are cases that may well be legit (or are 5 function calls deep into common code)
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> there's a few others as well that I'm chasing down...
> .. but the flip side, prior to running ring 3 code, why NOT do fast expedites?

So my objections are twofold:

 - I object to fast expedites in principle; they spray IPIs across the
   system, so ideally we'd not have them at all, therefore also not at
   boot.

   Because as soon as the option exists, people will use it for other
   things too.

 - The proposed interface is very much exposed to everybody who wants
   it; this again is wide open to (ab)use.

   Once it exists people will start to use, and before you know it we'll
   always have that fast counter incremented and we're in IPI hell. Most
   likely because someone was lazy and it seemed like a quick fix for
   some stupid code.

And esp. in bootup code you can special case a lot of stuff; there's
limited concurrency esp. because userspace is not there yet. So we might
not actually need those sync calls.
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
>>>>Does it really make a machine boot much faster? Why are people using
>>>>synchronous gp primitives if they care about speed? Should we not fix
>>>>that instead?
>>>
>>>The report I heard was that it provided 10-15% faster boot times.
>>
>>That's not insignificant; got more details? I think we should really
>>look at why people are using the sync primitives.
>
>I must defer to the people who took the exact measurements.
>
>But yes, once I have that info, I should add it to the commit log.

so the two most obvious cases are

Registering sysrq keys ... even when the old key code had no handler
(have a patch pending for this)

registering idle handlers
(this is more tricky, it's very obvious abuse but the fix is less clear)

there's a few others as well that I'm chasing down...
.. but the flip side, prior to running ring 3 code, why NOT do fast expedites?
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > So I thought we wanted to get rid / limit the expedited stuff because it's
> > > IPI happy, and here it's spreading.
> >
> > Well, at least it no longer IPIs idle CPUs. ;-)
> >
> > And this is during boot, when a few extra IPIs should not be a big deal.
>
> Well the one application now is during boot; but you expose the
> interface for all to use, and therefore someone will.

I could make rcu_expedite_gp() and rcu_unexpedite_gp() be static,
I suppose. Except that I need to test them with rcutorture. I suppose
I could put the declaration in rcutorture.c, but then 0day will tell me
to make them static. :-/

> > > Does it really make a machine boot much faster? Why are people using
> > > synchronous gp primitives if they care about speed? Should we not fix
> > > that instead?
> >
> > The report I heard was that it provided 10-15% faster boot times.
>
> That's not insignificant; got more details? I think we should really
> look at why people are using the sync primitives.

I must defer to the people who took the exact measurements. But yes,
once I have that info, I should add it to the commit log.

							Thanx, Paul
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > So I thought we wanted to get rid / limit the expedited stuff because it's
> > IPI happy, and here it's spreading.
>
> Well, at least it no longer IPIs idle CPUs. ;-)
>
> And this is during boot, when a few extra IPIs should not be a big deal.

Well the one application now is during boot; but you expose the
interface for all to use, and therefore someone will.

> > Does it really make a machine boot much faster? Why are people using
> > synchronous gp primitives if they care about speed? Should we not fix
> > that instead?
>
> The report I heard was that it provided 10-15% faster boot times.

That's not insignificant; got more details? I think we should really
look at why people are using the sync primitives.
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 19, 2015 at 09:08:50PM -0800, Paul E. McKenney wrote:
> > Hello!
> >
> > This series, possibly for v3.21, contains changes that allow in-kernel
> > code to specify that all subsequent synchronous grace-period primitives
> > (synchronize_rcu() and friends) be expedited. New rcu_expedite_gp()
> > and rcu_unexpedite_gp() primitives enable and disable expediting,
> > and these may be nested. Note that the rcu_expedited boot/sysfs
> > variable, if non-zero, causes expediting to happen regardless of calls
> > to rcu_expedite_gp().
> >
> > Because one of the use cases for these primitives is to expedite
> > grace periods during the in-kernel portion of boot, a new Kconfig
> > parameter named CONFIG_RCU_EXPEDITE_BOOT causes the kernel to act
> > as if rcu_expedite_gp() was called very early in boot. At the end
> > of boot (presumably just before init is spawned), a call to
> > rcu_end_inkernel_boot() will provide the matching rcu_unexpedite_gp()
> > if required.
>
> So I thought we wanted to get rid / limit the expedited stuff because it's
> IPI happy, and here it's spreading.

Well, at least it no longer IPIs idle CPUs. ;-)

And this is during boot, when a few extra IPIs should not be a big deal.

> Does it really make a machine boot much faster? Why are people using
> synchronous gp primitives if they care about speed? Should we not fix
> that instead?

The report I heard was that it provided 10-15% faster boot times.

							Thanx, Paul
Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
On Thu, Feb 19, 2015 at 09:08:50PM -0800, Paul E. McKenney wrote:
> Hello!
>
> This series, possibly for v3.21, contains changes that allow in-kernel
> code to specify that all subsequent synchronous grace-period primitives
> (synchronize_rcu() and friends) be expedited. New rcu_expedite_gp()
> and rcu_unexpedite_gp() primitives enable and disable expediting,
> and these may be nested. Note that the rcu_expedited boot/sysfs
> variable, if non-zero, causes expediting to happen regardless of calls
> to rcu_expedite_gp().
>
> Because one of the use cases for these primitives is to expedite
> grace periods during the in-kernel portion of boot, a new Kconfig
> parameter named CONFIG_RCU_EXPEDITE_BOOT causes the kernel to act
> as if rcu_expedite_gp() was called very early in boot. At the end
> of boot (presumably just before init is spawned), a call to
> rcu_end_inkernel_boot() will provide the matching rcu_unexpedite_gp()
> if required.

So I thought we wanted to get rid / limit the expedited stuff because it's
IPI happy, and here it's spreading.

Does it really make a machine boot much faster? Why are people using
synchronous gp primitives if they care about speed? Should we not fix
that instead?
[PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods
Hello!

This series, possibly for v3.21, contains changes that allow in-kernel
code to specify that all subsequent synchronous grace-period primitives
(synchronize_rcu() and friends) be expedited. New rcu_expedite_gp()
and rcu_unexpedite_gp() primitives enable and disable expediting,
and these may be nested. Note that the rcu_expedited boot/sysfs
variable, if non-zero, causes expediting to happen regardless of calls
to rcu_expedite_gp().

Because one of the use cases for these primitives is to expedite
grace periods during the in-kernel portion of boot, a new Kconfig
parameter named CONFIG_RCU_EXPEDITE_BOOT causes the kernel to act
as if rcu_expedite_gp() was called very early in boot. At the end
of boot (presumably just before init is spawned), a call to
rcu_end_inkernel_boot() will provide the matching rcu_unexpedite_gp()
if required.

The patches in this series are as follows:

1.	Add rcu_expedite_gp() and rcu_unexpedite_gp() functions.

2.	Add rcutorture tests for rcu_expedite_gp() and rcu_unexpedite_gp().

3.	Change open-coded access to the rcu_expedited variable to instead
	use a new rcu_gp_is_expedited() function.

4.	Add the CONFIG_RCU_EXPEDITE_BOOT Kconfig parameter and the
	rcu_end_inkernel_boot() function.

This passes light rcutorture testing.

							Thanx, Paul

 b/include/linux/rcupdate.h |   21
 b/init/Kconfig             |   13 +
 b/kernel/rcu/rcutorture.c  |   24 ++
 b/kernel/rcu/srcu.c        |    2 -
 b/kernel/rcu/tree.c        |    9 +++---
 b/kernel/rcu/tree_plugin.h |    2 -
 b/kernel/rcu/update.c      |   59 -
 7 files changed, 123 insertions(+), 7 deletions(-)
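For readers following the thread, the nesting semantics described in the cover letter can be illustrated with a small userspace model. This is only a sketch under assumptions, not the code from the patches: every model_-prefixed name is hypothetical, and the counter-plus-override layout is a guess suggested by the words "may be nested" and by the "fast counter" mentioned in the replies. It shows that expediting stays in effect until every enable call has a matching disable call, and that a non-zero override (standing in for the rcu_expedited boot/sysfs variable) forces expediting regardless of the count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rcu_expedited boot/sysfs variable. */
static int model_rcu_expedited;

/* Hypothetical nesting count: number of enable calls not yet matched. */
static atomic_int model_expedite_nesting;

/* Models rcu_expedite_gp(): ask for later sync grace periods to be expedited. */
static void model_rcu_expedite_gp(void)
{
	atomic_fetch_add(&model_expedite_nesting, 1);
}

/* Models rcu_unexpedite_gp(): undo one earlier request; calls must balance. */
static void model_rcu_unexpedite_gp(void)
{
	atomic_fetch_sub(&model_expedite_nesting, 1);
}

/* Models rcu_gp_is_expedited(): the override or any outstanding request wins. */
static bool model_rcu_gp_is_expedited(void)
{
	return model_rcu_expedited || atomic_load(&model_expedite_nesting) > 0;
}

/* Walks through one nested use and returns the final nesting count. */
static int model_nesting_walkthrough(void)
{
	assert(!model_rcu_gp_is_expedited());
	model_rcu_expedite_gp();		/* outer user, e.g. boot */
	model_rcu_expedite_gp();		/* inner user nests inside */
	model_rcu_unexpedite_gp();		/* inner user is done */
	assert(model_rcu_gp_is_expedited());	/* outer request still holds */
	model_rcu_unexpedite_gp();		/* outer user is done */
	assert(!model_rcu_gp_is_expedited());
	return atomic_load(&model_expedite_nesting);
}
```

In this model, a caller that forgets its matching model_rcu_unexpedite_gp() leaves the count above zero indefinitely, which is the failure mode the replies worry about when they say the counter could end up always incremented.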