Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-22 Thread Josh Triplett
On Sun, Feb 22, 2015 at 10:31:26AM -0800, Arjan van de Ven wrote:
> >>To show the boot time, I'm using the timestamp of the "Write protecting" line,
> >>that's pretty much the last thing we print prior to ring 3 execution.
> >
> >That's a little sad; we ought to be write-protecting kernel read-only
> >data as *early* as possible.
> 
> well... if you are compromised before the first ring 3 instruction...
>  you have a slightly bigger problem than where in the kernel we write 
> protect things.

Definitely not talking about malicious compromise here; malicious code
could just remove the write protection.  However, write-protecting
kernel read-only data also protects against a class of bugs.

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-22 Thread Arjan van de Ven


>>To show the boot time, I'm using the timestamp of the "Write protecting" line,
>>that's pretty much the last thing we print prior to ring 3 execution.
>
>That's a little sad; we ought to be write-protecting kernel read-only
>data as *early* as possible.

well... if you are compromised before the first ring 3 instruction...
you have a slightly bigger problem than where in the kernel we write
protect things.




Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Paul E. McKenney
On Sat, Feb 21, 2015 at 07:58:07PM -0800, Josh Triplett wrote:
> On Sat, Feb 21, 2015 at 07:51:34AM -0800, Arjan van de Ven wrote:
> > >>
> > >>there's a few others as well that I'm chasing down...
> > >>.. but the flip side, prior to running ring 3 code, why NOT do fast 
> > >>expedites?
> > >
> > >It would be good to have before-and-after measurements of actual
> > >boot time.  Are these numbers available?
> > 
> > To show the boot time, I'm using the timestamp of the "Write protecting" line,
> > that's pretty much the last thing we print prior to ring 3 execution.
> 
> That's a little sad; we ought to be write-protecting kernel read-only
> data as *early* as possible.
> 
> > A kernel with default RCU behavior (inside KVM, only virtual devices) looks 
> > like this:
> > 
> > [0.038724] Write protecting the kernel read-only data: 10240k
> > 
> > a kernel with expedited RCU (using the command line option, so that I don't have
> > to recompile between measurements and thus am completely oranges-to-oranges)
> > 
> > [0.031768] Write protecting the kernel read-only data: 10240k
> > 
> > which, in percentage, is an 18% improvement.
> 
> Nice improvement, but that suggests that we're spending far too much
> time waiting on RCU grace periods at boot time.

Let's see...  0.038724-0.031768=0.006956, or about seven milliseconds.
This might be as many as ten grace periods, but is more likely to be
about two of them.  Of course, this counts only the grace periods after
the scheduler starts, as those prior to scheduler start are no-ops,
courtesy of your single-CPU optimization.
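
As a cross-check on the arithmetic above, here is a minimal sketch in C. The two timestamps are the ones quoted in this thread; the helper names (boot_delta_s, improvement_pct, print_boot_comparison) are made up for illustration and do not come from any kernel source.

```c
#include <assert.h>
#include <stdio.h>

/* Seconds saved between two boot timestamps (baseline minus expedited). */
static double boot_delta_s(double baseline_s, double expedited_s)
{
    return baseline_s - expedited_s;
}

/* Relative improvement, as a percentage of the baseline timestamp. */
static double improvement_pct(double baseline_s, double expedited_s)
{
    assert(baseline_s > 0.0);
    return 100.0 * boot_delta_s(baseline_s, expedited_s) / baseline_s;
}

/* Hypothetical helper: prints both figures for the timestamps in the thread. */
static void print_boot_comparison(void)
{
    const double baseline = 0.038724;   /* default RCU behavior */
    const double expedited = 0.031768;  /* expedited RCU */

    printf("delta: %.6f s\n", boot_delta_s(baseline, expedited));
    printf("improvement: %.1f%%\n", improvement_pct(baseline, expedited));
}
```

Calling print_boot_comparison() from a main() prints the roughly seven-millisecond delta and the roughly 18% figure discussed above.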

So, how many grace periods between scheduler start and init spawning
do you feel would be appropriate?

Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Josh Triplett
On Sat, Feb 21, 2015 at 07:51:34AM -0800, Arjan van de Ven wrote:
> >>
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast 
> >>expedites?
> >
> >It would be good to have before-and-after measurements of actual
> >boot time.  Are these numbers available?
> 
> To show the boot time, I'm using the timestamp of the "Write protecting" line,
> that's pretty much the last thing we print prior to ring 3 execution.

That's a little sad; we ought to be write-protecting kernel read-only
data as *early* as possible.

> A kernel with default RCU behavior (inside KVM, only virtual devices) looks 
> like this:
> 
> [0.038724] Write protecting the kernel read-only data: 10240k
> 
> a kernel with expedited RCU (using the command line option, so that I don't have
> to recompile between measurements and thus am completely oranges-to-oranges)
> 
> [0.031768] Write protecting the kernel read-only data: 10240k
> 
> which, in percentage, is an 18% improvement.

Nice improvement, but that suggests that we're spending far too much
time waiting on RCU grace periods at boot time.

- Josh Triplett


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Paul E. McKenney
On Sat, Feb 21, 2015 at 05:08:52PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 09:45:39AM -0800, Arjan van de Ven wrote:
> > On 2/20/2015 9:43 AM, Peter Zijlstra wrote:
> > >On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > >>there's a few others as well that I'm chasing down...
> > >>.. but the flip side, prior to running ring 3 code, why NOT do fast 
> > >>expedites?
> > >
> > >So my objections are twofold:
> > >
> > >  - I object to fast expedites in principle; they spray IPIs across the
> > >    system, so ideally we'd not have them at all, therefore also not at
> > >    boot.
> > >
> > >    Because as soon as the option exists, people will use it for other
> > >    things too.
> > 
> > the option exists today in sysfs and kernel parameter...
> 
> Yeah, Paul and me have been having this argument for a while now ;-)

Indeed we have.  ;-)

And if expedited grace periods start causing latency issues in real-world
workloads, I will address those issues.

In the meantime, one of the nice things about NO_HZ_FULL is that
synchronize_sched_expedited() avoids IPIing CPUs having a single runnable
task that is running in nohz_full mode.  ;-)

Thanx, Paul

> > >And esp. in bootup code you can special case a lot of stuff; there's
> > >limited concurrency esp. because userspace is not there yet. So we might
> > >not actually need those sync calls.
> > 
> > yeah I am going down that angle as well absolutely.
> > but there are cases that may well be legit (or are 5 function calls deep 
> > into common code)
> 
> Good ;-)
> 



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Paul E. McKenney
On Sat, Feb 21, 2015 at 03:12:01PM +0000, Mathieu Desnoyers wrote:
> ----- Original Message -----
> > From: "Josh Triplett" 
> > To: "Peter Zijlstra" 
> > Cc: "Paul E. McKenney" , 
> > linux-kernel@vger.kernel.org, mi...@kernel.org,
> > la...@cn.fujitsu.com, dipan...@in.ibm.com, a...@linux-foundation.org, 
> > "mathieu desnoyers"
> > , t...@linutronix.de, rost...@goodmis.org, 
> > dhowe...@redhat.com, eduma...@google.com,
> > dvh...@linux.intel.com, fweis...@gmail.com, o...@redhat.com, "bobby prani" 
> > 
> > Sent: Saturday, February 21, 2015 1:04:28 AM
> > Subject: Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace 
> > periods
> > 
> > On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> > > On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > > > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > > > Does it really make a machine boot much faster? Why are people using
> > > > > synchronous gp primitives if they care about speed? Should we not fix
> > > > > that instead?
> > > > 
> > > > The report I heard was that it provided 10-15% faster boot times.
> > > 
> > > That's not insignificant; got more details? I think we should really
> > > look at why people are using the sync primitives.
> > 
> > Paul, what do you think about adding a compile-time debug option to
> > synchronize_rcu() that causes it to capture the time on entry and exit
> > and print the duration together with the file:line of the caller?
> > Similar to initcall_debug, but for blocking calls to synchronize_rcu().
> > Put that together with initcall_debug, and you'd have a pretty good idea
> > of where that holds up boot.
> > 
> > We do want early boot to run as asynchronously as possible, and to avoid
> > having later bits of boot waiting on a synchronize_rcu from earlier bits
> > of boot.  Switching a caller over to call_rcu() doesn't actually help if
> > it still has to finish a grace period before it can allow later bits to
> > run.  Ideally, we ought to be able to work out the "depth" of boot in
> > grace-periods.
> > 
> > Has anyone wired initcall_debug up to a bootchart-like graph?
> 
> The information about begin/end of synchronize_rcu, as well as begin/end
> of rcu_barrier() seems to be very relevant here. This should perhaps be
> covered by tracepoints? Isn't it already?

Good points, but they did measure this somehow.  Wouldn't some ftrace
magic get this result?

Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> > 
> > - Josh Triplett
> > 
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
> 



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Paul E. McKenney
On Sat, Feb 21, 2015 at 07:51:34AM -0800, Arjan van de Ven wrote:
> >>
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast 
> >>expedites?
> >
> >It would be good to have before-and-after measurements of actual
> >boot time.  Are these numbers available?
> 
> 
> To show the boot time, I'm using the timestamp of the "Write protecting" line,
> that's pretty much the last thing we print prior to ring 3 execution.
> 
> A kernel with default RCU behavior (inside KVM, only virtual devices) looks 
> like this:
> 
> [0.038724] Write protecting the kernel read-only data: 10240k
> 
> a kernel with expedited RCU (using the command line option, so that I don't have
> to recompile between measurements and thus am completely oranges-to-oranges)
> 
> [0.031768] Write protecting the kernel read-only data: 10240k
> 
> which, in percentage, is an 18% improvement.

Thank you, will repost with this info.

Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Paul E. McKenney
On Sat, Feb 21, 2015 at 05:11:30PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 10:38:49AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 20, 2015 at 06:43:59PM +0100, Peter Zijlstra wrote:
> > > On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > > > there's a few others as well that I'm chasing down...
> > > > .. but the flip side, prior to running ring 3 code, why NOT do fast 
> > > > expedites?
> > > 
> > > So my objections are twofold:
> > > 
> > >  - I object to fast expedites in principle; they spray IPIs across the
> > >    system, so ideally we'd not have them at all, therefore also not at
> > >    boot.
> > 
> > There are only a few uses of expedited grace periods, despite their
> > having been in the kernel for some years.  So people do seem to be
> > exercising appropriate restraint here.
> 
> Or people just don't know about it :-)

"Ignorance: The #1 contributor to appropriate restraint!"  ;-)

> > >    Because as soon as the option exists, people will use it for other
> > >    things too.
> > > 
> > >  - The proposed interface is very much exposed to everybody who wants
> > >    it; this again is wide open to (ab)use.
> > > 
> > >    Once it exists people will start to use, and before you know it we'll
> > >    always have that fast counter incremented and we're in IPI hell. Most
> > >    likely because someone was lazy and it seemed like a quick fix for
> > >    some stupid code.
> > 
> > I suppose that another way to keep it private would be to have the
> > declaration in both update.c and rcutorture.c.  This would mean that no
> > other file could invoke it, and should keep 0day happy.  It would mean
> > that the declarations would be duplicated, but worse things could happen.
> 
> Why do you need it for rcu torture? That can call the regular expedited
> call to exercise those rcu paths, right?

Yes, but not the ability to turn expediting on and off in the normal path.

> That would allow you to use system_state < SYSTEM_RUNNING if you really
> wanted to do this without exposing any interface for this.

This decision is not up to RCU.  Something else must tell RCU whether or
not and when to treat normal grace periods as expedited.

Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Peter Zijlstra
On Fri, Feb 20, 2015 at 10:38:49AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 20, 2015 at 06:43:59PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > > there's a few others as well that I'm chasing down...
> > > .. but the flip side, prior to running ring 3 code, why NOT do fast 
> > > expedites?
> > 
> > So my objections are twofold:
> > 
> >  - I object to fast expedites in principle; they spray IPIs across the
> >    system, so ideally we'd not have them at all, therefore also not at
> >    boot.
> 
> There are only a few uses of expedited grace periods, despite their
> having been in the kernel for some years.  So people do seem to be
> exercising appropriate restraint here.

Or people just don't know about it :-)

> >    Because as soon as the option exists, people will use it for other
> >    things too.
> > 
> >  - The proposed interface is very much exposed to everybody who wants
> >    it; this again is wide open to (ab)use.
> > 
> >    Once it exists people will start to use, and before you know it we'll
> >    always have that fast counter incremented and we're in IPI hell. Most
> >    likely because someone was lazy and it seemed like a quick fix for
> >    some stupid code.
> 
> I suppose that another way to keep it private would be to have the
> declaration in both update.c and rcutorture.c.  This would mean that no
> other file could invoke it, and should keep 0day happy.  It would mean
> that the declarations would be duplicated, but worse things could happen.

Why do you need it for rcu torture? That can call the regular expedited
call to exercise those rcu paths, right?

That would allow you to use system_state < SYSTEM_RUNNING if you really
wanted to do this without exposing any interface for this.
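
The system_state < SYSTEM_RUNNING test suggested above can be pictured with a small userspace model. This is a sketch only: the enum below is an illustrative stand-in with made-up MODEL_ names, not the kernel's actual definition, and should_expedite() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's system_state values (illustrative subset). */
enum system_state_model {
    MODEL_SYSTEM_BOOTING,
    MODEL_SYSTEM_RUNNING,
    MODEL_SYSTEM_HALT,
};

/* The "<" comparison below only works if booting sorts before running. */
static_assert(MODEL_SYSTEM_BOOTING < MODEL_SYSTEM_RUNNING,
              "boot state must order before running state");

/* Hypothetical policy: expedite only while the system is still booting. */
static bool should_expedite(enum system_state_model state)
{
    return state < MODEL_SYSTEM_RUNNING;
}
```

The appeal of this form is that it needs no new exported interface; the open question in the thread is whether RCU should be making that policy decision by itself.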


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Peter Zijlstra
On Fri, Feb 20, 2015 at 09:45:39AM -0800, Arjan van de Ven wrote:
> On 2/20/2015 9:43 AM, Peter Zijlstra wrote:
> >On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast 
> >>expedites?
> >
> >So my objections are twofold:
> >
> >  - I object to fast expedites in principle; they spray IPIs across the
> >    system, so ideally we'd not have them at all, therefore also not at
> >    boot.
> >
> >    Because as soon as the option exists, people will use it for other
> >    things too.
> 
> the option exists today in sysfs and kernel parameter...

Yeah, Paul and me have been having this argument for a while now ;-)

> >And esp. in bootup code you can special case a lot of stuff; there's
> >limited concurrency esp. because userspace is not there yet. So we might
> >not actually need those sync calls.
> 
> yeah I am going down that angle as well absolutely.
> but there are cases that may well be legit (or are 5 function calls deep into 
> common code)

Good ;-)


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Arjan van de Ven


>>there's a few others as well that I'm chasing down...
>>.. but the flip side, prior to running ring 3 code, why NOT do fast expedites?
>
>It would be good to have before-and-after measurements of actual
>boot time.  Are these numbers available?



To show the boot time, I'm using the timestamp of the "Write protecting" line,
that's pretty much the last thing we print prior to ring 3 execution.

A kernel with default RCU behavior (inside KVM, only virtual devices) looks 
like this:

[0.038724] Write protecting the kernel read-only data: 10240k

a kernel with expedited RCU (using the command line option, so that I don't have
to recompile between measurements and thus am completely oranges-to-oranges)

[0.031768] Write protecting the kernel read-only data: 10240k

which, in percentage, is an 18% improvement.



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-21 Thread Mathieu Desnoyers
----- Original Message -----
> From: "Josh Triplett" 
> To: "Peter Zijlstra" 
> Cc: "Paul E. McKenney" , 
> linux-kernel@vger.kernel.org, mi...@kernel.org,
> la...@cn.fujitsu.com, dipan...@in.ibm.com, a...@linux-foundation.org, 
> "mathieu desnoyers"
> , t...@linutronix.de, rost...@goodmis.org, 
> dhowe...@redhat.com, eduma...@google.com,
> dvh...@linux.intel.com, fweis...@gmail.com, o...@redhat.com, "bobby prani" 
> 
> Sent: Saturday, February 21, 2015 1:04:28 AM
> Subject: Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace 
> periods
> 
> On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > > Does it really make a machine boot much faster? Why are people using
> > > > synchronous gp primitives if they care about speed? Should we not fix
> > > > that instead?
> > > 
> > > The report I heard was that it provided 10-15% faster boot times.
> > 
> > That's not insignificant; got more details? I think we should really
> > look at why people are using the sync primitives.
> 
> Paul, what do you think about adding a compile-time debug option to
> synchronize_rcu() that causes it to capture the time on entry and exit
> and print the duration together with the file:line of the caller?
> Similar to initcall_debug, but for blocking calls to synchronize_rcu().
> Put that together with initcall_debug, and you'd have a pretty good idea
> of where that holds up boot.
> 
> We do want early boot to run as asynchronously as possible, and to avoid
> having later bits of boot waiting on a synchronize_rcu from earlier bits
> of boot.  Switching a caller over to call_rcu() doesn't actually help if
> it still has to finish a grace period before it can allow later bits to
> run.  Ideally, we ought to be able to work out the "depth" of boot in
> grace-periods.
> 
> Has anyone wired initcall_debug up to a bootchart-like graph?

The information about begin/end of synchronize_rcu, as well as begin/end
of rcu_barrier() seems to be very relevant here. This should perhaps be
covered by tracepoints? Isn't it already?

Thanks,

Mathieu

> 
> - Josh Triplett
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Josh Triplett
On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > Does it really make a machine boot much faster? Why are people using
> > > synchronous gp primitives if they care about speed? Should we not fix
> > > that instead?
> > 
> > The report I heard was that it provided 10-15% faster boot times.
> 
> That's not insignificant; got more details? I think we should really
> look at why people are using the sync primitives.

Paul, what do you think about adding a compile-time debug option to
synchronize_rcu() that causes it to capture the time on entry and exit
and print the duration together with the file:line of the caller?
Similar to initcall_debug, but for blocking calls to synchronize_rcu().
Put that together with initcall_debug, and you'd have a pretty good idea
of where that holds up boot.
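
A userspace sketch of the idea, for illustration: a wrapper macro records the caller's file:line and times a blocking call. Everything here is an assumption rather than kernel code: timed_sync(), TIMED_SYNC(), last_sync_ns and fake_grace_period() are hypothetical names, and the stand-in simply sleeps where a real caller would block in synchronize_rcu().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Duration of the most recent timed call, kept so callers can inspect it. */
static long long last_sync_ns;

/* Hypothetical wrapper: run a blocking call, then report duration and call site. */
static void timed_sync(void (*blocking_call)(void), const char *file, int line)
{
    long long start = now_ns();

    blocking_call();
    last_sync_ns = now_ns() - start;
    assert(last_sync_ns >= 0);
    fprintf(stderr, "%s:%d: blocking call took %lld ns\n",
            file, line, last_sync_ns);
}

/* Captures the caller's file:line at the point of use. */
#define TIMED_SYNC(fn) timed_sync((fn), __FILE__, __LINE__)

/* Stand-in for a grace-period wait: sleeps for about one millisecond. */
static void fake_grace_period(void)
{
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000 };

    nanosleep(&delay, NULL);
}
```

Writing TIMED_SYNC(fake_grace_period); at a call site logs one line per blocking wait, which is the kind of per-caller record that could then be lined up against initcall_debug output.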

We do want early boot to run as asynchronously as possible, and to avoid
having later bits of boot waiting on a synchronize_rcu from earlier bits
of boot.  Switching a caller over to call_rcu() doesn't actually help if
it still has to finish a grace period before it can allow later bits to
run.  Ideally, we ought to be able to work out the "depth" of boot in
grace-periods.
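
One way to picture the "depth" of boot in grace periods is as the longest chain of grace-period waits through the boot-step dependencies. The sketch below is a simplified model under stated assumptions: struct boot_step and boot_depth() are hypothetical, each step depends on at most one earlier step, and the numbers a caller feeds in would have to come from real measurements.

```c
#include <assert.h>
#include <stddef.h>

/* One boot step: how many grace periods it waits for, and which earlier
 * step (by index, or -1 for none) must finish first. Hypothetical model. */
struct boot_step {
    int grace_periods;
    int depends_on;
};

/* Longest chain of grace-period waits ending at any step. Steps must be
 * listed so that each dependency has a smaller index than its dependent.
 * depth[] receives the per-step chain length and must hold n entries. */
static int boot_depth(const struct boot_step *steps, size_t n, int *depth)
{
    int max_depth = 0;

    for (size_t i = 0; i < n; i++) {
        int base = 0;

        if (steps[i].depends_on >= 0) {
            assert((size_t)steps[i].depends_on < i);
            base = depth[steps[i].depends_on];
        }
        depth[i] = base + steps[i].grace_periods;
        if (depth[i] > max_depth)
            max_depth = depth[i];
    }
    return max_depth;
}
```

In this model, moving a step from a blocking wait to a callback lowers the result only if nothing later on the chain still depends on that grace period having ended.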

Has anyone wired initcall_debug up to a bootchart-like graph?

- Josh Triplett


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Paul E. McKenney
On Fri, Feb 20, 2015 at 10:29:34AM -0800, Arjan van de Ven wrote:
> On 2/20/2015 10:27 AM, Paul E. McKenney wrote:
> >On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> >>>>>>Does it really make a machine boot much faster? Why are people using
> >>>>>>synchronous gp primitives if they care about speed? Should we not fix
> >>>>>>that instead?
> >>>>>
> >>>>>The report I heard was that it provided 10-15% faster boot times.
> >>>>
> >>>>That's not insignificant; got more details? I think we should really
> >>>>look at why people are using the sync primitives.
> >>>
> >>>I must defer to the people who took the exact measurements.
> >>>
> >>>But yes, once I have that info, I should add it to the commit log.
> >>
> >>so the two most obvious cases are
> >>
> >>Registering sysrq keys ... even when the old key code had no handler
> >>(have a patch pending for this)
> >>
> >>registering idle handlers
> >>(this is more tricky, it's very obvious abuse but the fix is less clear)
> >>
> >>there's a few others as well that I'm chasing down...
> >>.. but the flip side, prior to running ring 3 code, why NOT do fast 
> >>expedites?
> >
> >It would be good to have before-and-after measurements of actual
> >boot time.  Are these numbers available?
> 
> I'll make you pretty graphs when I get home from collab summit, which
> should be later today

Very good, looking forward to seeing them.  ;-)

Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Paul E. McKenney
On Fri, Feb 20, 2015 at 06:43:59PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> > there's a few others as well that I'm chasing down...
> > .. but the flip side, prior to running ring 3 code, why NOT do fast 
> > expedites?
> 
> So my objections are twofold:
> 
>  - I object to fast expedites in principle; they spray IPIs across the
>    system, so ideally we'd not have them at all, therefore also not at
>    boot.

There are only a few uses of expedited grace periods, despite their
having been in the kernel for some years.  So people do seem to be
exercising appropriate restraint here.

>    Because as soon as the option exists, people will use it for other
>    things too.
> 
>  - The proposed interface is very much exposed to everybody who wants
>    it; this again is wide open to (ab)use.
> 
>    Once it exists people will start to use, and before you know it we'll
>    always have that fast counter incremented and we're in IPI hell. Most
>    likely because someone was lazy and it seemed like a quick fix for
>    some stupid code.

I suppose that another way to keep it private would be to have the
declaration in both update.c and rcutorture.c.  This would mean that no
other file could invoke it, and should keep 0day happy.  It would mean
that the declarations would be duplicated, but worse things could happen.

> And esp. in bootup code you can special case a lot of stuff; there's
> limited concurrency esp. because userspace is not there yet. So we might
> not actually need those sync calls.

I expect that some could be rewritten, but it might not work well for
code common to boot and to runtime.


Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Arjan van de Ven

On 2/20/2015 10:27 AM, Paul E. McKenney wrote:
>On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
>>>>>>Does it really make a machine boot much faster? Why are people using
>>>>>>synchronous gp primitives if they care about speed? Should we not fix
>>>>>>that instead?
>>>>>
>>>>>The report I heard was that it provided 10-15% faster boot times.
>>>>
>>>>That's not insignificant; got more details? I think we should really
>>>>look at why people are using the sync primitives.
>>>
>>>I must defer to the people who took the exact measurements.
>>>
>>>But yes, once I have that info, I should add it to the commit log.
>>
>>so the two most obvious cases are
>>
>>Registering sysrq keys ... even when the old key code had no handler
>>(have a patch pending for this)
>>
>>registering idle handlers
>>(this is more tricky, it's very obvious abuse but the fix is less clear)
>>
>>there's a few others as well that I'm chasing down...
>>.. but the flip side, prior to running ring 3 code, why NOT do fast expedites?
>
>It would be good to have before-and-after measurements of actual
>boot time.  Are these numbers available?

I'll make you pretty graphs when I get home from collab summit, which
should be later today



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Paul E. McKenney
On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> >>>>Does it really make a machine boot much faster? Why are people using
> >>>>synchronous gp primitives if they care about speed? Should we not fix
> >>>>that instead?
> >>>
> >>>The report I heard was that it provided 10-15% faster boot times.
> >>
> >>That's not insignificant; got more details? I think we should really
> >>look at why people are using the sync primitives.
> >
> >I must defer to the people who took the exact measurements.
> >
> >But yes, once I have that info, I should add it to the commit log.
> 
> so the two most obvious cases are
> 
> Registering sysrq keys ... even when the old key code had no handler
> (have a patch pending for this)
> 
> registering idle handlers
> (this is more tricky, it's very obvious abuse but the fix is less clear)
> 
> there's a few others as well that I'm chasing down...
> .. but the flip side, prior to running ring 3 code, why NOT do fast expedites?

It would be good to have before-and-after measurements of actual
boot time.  Are these numbers available?

Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Arjan van de Ven

On 2/20/2015 9:43 AM, Peter Zijlstra wrote:
>On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
>>there's a few others as well that I'm chasing down...
>>.. but the flip side, prior to running ring 3 code, why NOT do fast expedites?
>
>So my objections are twofold:
>
>  - I object to fast expedites in principle; they spray IPIs across the
>    system, so ideally we'd not have them at all, therefore also not at
>    boot.
>
>    Because as soon as the option exists, people will use it for other
>    things too.

the option exists today in sysfs and kernel parameter...

>And esp. in bootup code you can special case a lot of stuff; there's
>limited concurrency esp. because userspace is not there yet. So we might
>not actually need those sync calls.

yeah I am going down that angle as well absolutely.
but there are cases that may well be legit (or are 5 function calls deep into
common code)



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Peter Zijlstra
On Fri, Feb 20, 2015 at 09:32:39AM -0800, Arjan van de Ven wrote:
> there's a few others as well that I'm chasing down...
> .. but the flip side, prior to running ring 3 code, why NOT do fast expedites?

So my objections are twofold:

 - I object to fast expedites in principle; they spray IPIs across the
   system, so ideally we'd not have them at all, therefore also not at
   boot.

   Because as soon as the option exists, people will use it for other
   things too.

 - The proposed interface is very much exposed to everybody who wants
   it; this again is wide open to (ab)use.

   Once it exists people will start to use it, and before you know it we'll
   always have that fast counter incremented and we're in IPI hell. Most
   likely because someone was lazy and it seemed like a quick fix for
   some stupid code.

And esp. in bootup code you can special case a lot of stuff; there's
limited concurrency esp. because userspace is not there yet. So we might
not actually need those sync calls.


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Arjan van de Ven

> > > > Does it really make a machine boot much faster? Why are people using
> > > > synchronous gp primitives if they care about speed? Should we not fix
> > > > that instead?
> > >
> > > The report I heard was that it provided 10-15% faster boot times.
> >
> > That's not insignificant; got more details? I think we should really
> > look at why people are using the sync primitives.
>
> I must defer to the people who took the exact measurements.
>
> But yes, once I have that info, I should add it to the commit log.

so the two most obvious cases are

Registering sysrq keys ... even when the old key code had no handler
(have a patch pending for this)

registering idle handlers
(this is more tricky, it's very obvious abuse but the fix is less clear)

there's a few others as well that I'm chasing down...
.. but the flip side, prior to running ring 3 code, why NOT do fast expedites?



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Paul E. McKenney
On Fri, Feb 20, 2015 at 05:54:09PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > > So I thought we wanted to get rid of / limit the expedited stuff because
> > > it's IPI happy, and here it's spreading.
> > 
> > Well, at least it no longer IPIs idle CPUs.  ;-)
> > 
> > And this is during boot, when a few extra IPIs should not be a big deal.
> 
> Well the one application now is during boot; but you expose the
> interface for all to use, and therefore someone will.

I could make rcu_expedite_gp() and rcu_unexpedite_gp() be static,
I suppose.  Except that I need to test them with rcutorture.
I suppose I could put the declaration in rcutorture.c, but then
0day will tell me to make them static.  :-/

> > > Does it really make a machine boot much faster? Why are people using
> > > synchronous gp primitives if they care about speed? Should we not fix
> > > that instead?
> > 
> > The report I heard was that it provided 10-15% faster boot times.
> 
> That's not insignificant; got more details? I think we should really
> look at why people are using the sync primitives.

I must defer to the people who took the exact measurements.

But yes, once I have that info, I should add it to the commit log.

Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Peter Zijlstra
On Fri, Feb 20, 2015 at 08:37:37AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> > So I thought we wanted to get rid of / limit the expedited stuff because
> > it's IPI happy, and here it's spreading.
> 
> Well, at least it no longer IPIs idle CPUs.  ;-)
> 
> And this is during boot, when a few extra IPIs should not be a big deal.

Well the one application now is during boot; but you expose the
interface for all to use, and therefore someone will.

> > Does it really make a machine boot much faster? Why are people using
> > synchronous gp primitives if they care about speed? Should we not fix
> > that instead?
> 
> The report I heard was that it provided 10-15% faster boot times.

That's not insignificant; got more details? I think we should really
look at why people are using the sync primitives.


Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Paul E. McKenney
On Fri, Feb 20, 2015 at 10:11:07AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 19, 2015 at 09:08:50PM -0800, Paul E. McKenney wrote:
> > Hello!
> > 
> > This series, possibly for v3.21, contains changes that allow in-kernel
> > code to specify that all subsequent synchronous grace-period primitives
> > (synchronize_rcu() and friends) be expedited.  New rcu_expedite_gp()
> > and rcu_unexpedite_gp() primitives enable and disable expediting,
> > and these may be nested.  Note that the rcu_expedited boot/sysfs
> > variable, if non-zero, causes expediting to happen regardless of calls
> > to rcu_expedite_gp().
> > 
> > Because one of the use cases for these primitives is to expedite
> > grace periods during the in-kernel portion of boot, a new Kconfig
> > parameter named CONFIG_RCU_EXPEDITE_BOOT causes the kernel to act
> > as if rcu_expedite_gp() was called very early in boot.  At the end
> > of boot (presumably just before init is spawned), a call to
> > rcu_end_inkernel_boot() will provide the matching rcu_unexpedite_gp()
> > if required.
> 
> So I thought we wanted to get rid of / limit the expedited stuff because
> it's IPI happy, and here it's spreading.

Well, at least it no longer IPIs idle CPUs.  ;-)

And this is during boot, when a few extra IPIs should not be a big deal.

> Does it really make a machine boot much faster? Why are people using
> synchronous gp primitives if they care about speed? Should we not fix
> that instead?

The report I heard was that it provided 10-15% faster boot times.

Thanx, Paul



Re: [PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-20 Thread Peter Zijlstra
On Thu, Feb 19, 2015 at 09:08:50PM -0800, Paul E. McKenney wrote:
> Hello!
> 
> This series, possibly for v3.21, contains changes that allow in-kernel
> code to specify that all subsequent synchronous grace-period primitives
> (synchronize_rcu() and friends) be expedited.  New rcu_expedite_gp()
> and rcu_unexpedite_gp() primitives enable and disable expediting,
> and these may be nested.  Note that the rcu_expedited boot/sysfs
> variable, if non-zero, causes expediting to happen regardless of calls
> to rcu_expedite_gp().
> 
> Because one of the use cases for these primitives is to expedite
> grace periods during the in-kernel portion of boot, a new Kconfig
> parameter named CONFIG_RCU_EXPEDITE_BOOT causes the kernel to act
> as if rcu_expedite_gp() was called very early in boot.  At the end
> of boot (presumably just before init is spawned), a call to
> rcu_end_inkernel_boot() will provide the matching rcu_unexpedite_gp()
> if required.

So I thought we wanted to get rid of / limit the expedited stuff because
it's IPI happy, and here it's spreading.

Does it really make a machine boot much faster? Why are people using
synchronous gp primitives if they care about speed? Should we not fix
that instead?


[PATCH tip/core/rcu 0/4] Programmatic nestable expedited grace periods

2015-02-19 Thread Paul E. McKenney
Hello!

This series, possibly for v3.21, contains changes that allow in-kernel
code to specify that all subsequent synchronous grace-period primitives
(synchronize_rcu() and friends) be expedited.  New rcu_expedite_gp()
and rcu_unexpedite_gp() primitives enable and disable expediting,
and these may be nested.  Note that the rcu_expedited boot/sysfs
variable, if non-zero, causes expediting to happen regardless of calls
to rcu_expedite_gp().

Because one of the use cases for these primitives is to expedite
grace periods during the in-kernel portion of boot, a new Kconfig
parameter named CONFIG_RCU_EXPEDITE_BOOT causes the kernel to act
as if rcu_expedite_gp() was called very early in boot.  At the end
of boot (presumably just before init is spawned), a call to
rcu_end_inkernel_boot() will provide the matching rcu_unexpedite_gp()
if required.

The patches in this series are as follows:

1.  Add rcu_expedite_gp() and rcu_unexpedite_gp() functions.

2.  Add rcutorture tests for rcu_expedite_gp() and rcu_unexpedite_gp().

3.  Change open-coded access to the rcu_expedited variable to
    instead use a new rcu_gp_is_expedited() function.

4.  Add the CONFIG_RCU_EXPEDITE_BOOT Kconfig parameter and
    the rcu_end_inkernel_boot() function.
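
To make the nesting semantics described above concrete, here is a rough
userspace model.  It is only a sketch under stated assumptions, not the
code from the patches: every model_* name and the MODEL_RCU_EXPEDITE_BOOT
macro are hypothetical stand-ins, and the use of a single atomic nesting
count is an assumption about how "may be nested" could be realized.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: 1 models a kernel built with CONFIG_RCU_EXPEDITE_BOOT. */
#define MODEL_RCU_EXPEDITE_BOOT 1

/* Stand-in for the rcu_expedited boot/sysfs variable. */
static int model_rcu_expedited;

/*
 * Hypothetical nesting count.  Starting it at 1 models "act as if
 * rcu_expedite_gp() was called very early in boot".
 */
static atomic_int model_nesting = MODEL_RCU_EXPEDITE_BOOT;

/* Request expediting; each call needs a matching unexpedite call. */
static void model_rcu_expedite_gp(void)
{
	atomic_fetch_add(&model_nesting, 1);
}

/* Drop one outstanding expediting request. */
static void model_rcu_unexpedite_gp(void)
{
	atomic_fetch_sub(&model_nesting, 1);
}

/* Expedited if the boot/sysfs variable is set or any request is pending. */
static bool model_rcu_gp_is_expedited(void)
{
	return model_rcu_expedited != 0 || atomic_load(&model_nesting) > 0;
}

/* Provide the unexpedite that matches the implicit boot-time request. */
static void model_rcu_end_inkernel_boot(void)
{
	if (MODEL_RCU_EXPEDITE_BOOT)
		model_rcu_unexpedite_gp();
}

/* Walk through boot, one nested caller, and the variable override. */
static void model_walkthrough(void)
{
	assert(model_rcu_gp_is_expedited());	/* boot-time request active */

	model_rcu_expedite_gp();		/* a caller nests a request */
	model_rcu_end_inkernel_boot();		/* boot-time request dropped */
	assert(model_rcu_gp_is_expedited());	/* caller's request remains */

	model_rcu_unexpedite_gp();		/* caller releases */
	assert(!model_rcu_gp_is_expedited());

	model_rcu_expedited = 1;		/* variable forces expediting */
	assert(model_rcu_gp_is_expedited());
	model_rcu_expedited = 0;
	assert(!model_rcu_gp_is_expedited());
}
```

Calling model_walkthrough() from a test harness exercises each case.  In
this model expediting stays on for as long as any request is outstanding,
so an unbalanced caller would leave it on indefinitely.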

This passes light rcutorture testing.

Thanx, Paul



 b/include/linux/rcupdate.h |   21
 b/init/Kconfig             |   13 +
 b/kernel/rcu/rcutorture.c  |   24 ++
 b/kernel/rcu/srcu.c        |    2 -
 b/kernel/rcu/tree.c        |    9 +++---
 b/kernel/rcu/tree_plugin.h |    2 -
 b/kernel/rcu/update.c      |   59 -
 7 files changed, 123 insertions(+), 7 deletions(-)
