Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-18 Thread Luiz Capitulino
On Thu, 18 Jan 2018 04:04:43 +0100
Frederic Weisbecker  wrote:

> On Wed, Jan 17, 2018 at 12:38:01PM -0500, Luiz Capitulino wrote:
> > On Tue, 16 Jan 2018 23:51:29 +0100
> > Frederic Weisbecker  wrote:
> >   
> > > On Tue, Jan 16, 2018 at 11:52:11AM -0500, Luiz Capitulino wrote:  
> > > > On Tue, 16 Jan 2018 16:41:00 +0100
> > > > Frederic Weisbecker  wrote:
> > > > > So isolcpus= is now the place where we control the isolation features
> > > > > and nohz is one of them.
> > > > 
> > > > That's the part I'm not very sure about. We've been advising users to
> > > > move away from isolcpus= when possible, but this very wanted nohz_offload
> > > > feature will force everyone back to using isolcpus= again.
> > > 
> > > Note "isolcpus=nohz" only implies nohz. You need to add "domain" to get
> > > the behaviour that you've been advising users against. We are simply
> > > reusing a kernel parameter that was abandoned to now control the isolation
> > > features that were disorganized and opaque behind nohz.
> > >   
> > > > 
> > > > I have the impression this series is trying to solve two problems:
> > > > 
> > > >  1. How (and where) we control the various isolation features in the
> > > > kernel
> > > 
> > > No, that has already been done in the previous merge window. We have a
> > > dedicated isolation subsystem now (kernel/sched/isolation.c) and
> > > an interface to control all these isolation features that were abusively
> > > implied by nohz. The initial plan was to introduce "cpu_isolation=" but it
> > > looked too much like "isolcpus=". Then in fact, why not use "isolcpus=" and
> > > give it a second life? And there we are.
> > 
> > OK, I get it now. But then the series has to un-deprecate isolcpus=,
> > otherwise it doesn't make sense to use it.
> 
> Good point. Also I think you convinced me to just apply that tick offload
> to the existing nohz kernel parameters right away, that is, to both the
> existing "nohz_full=" and "isolcpus=nohz".
> 
> After all, that tick offload is an implementation detail.
> 
> Like you said if people complain about a regression, we can still fix it
> with a new option. But eventually I doubt this will be needed.
> 
> I'll respin with that.

Exciting times!

Btw, I do have this problem where I have a hog app on an isolated core
with isolcpus=nohz_offload,domain,... and I see top -d1 going from 100%
to 0% and then back from 0% to 100% every few seconds or so. I'll debug
it when you post the next version.
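
For anyone who wants to watch the same per-CPU numbers without top, they can be derived from two samples of /proc/stat. This is only a minimal sketch, assuming the usual `cpuN user nice system idle iowait irq softirq steal ...` field order; the function names are made up for illustration and nothing here comes from the series itself.

```python
def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each cpuN line."""
    out = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate "cpu" line and non-CPU lines (intr, ctxt, ...).
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user nice system idle iowait irq softirq steal; the guest fields
        # are already folded into user/nice, so they are left out.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        out[fields[0]] = (total - idle, total)
    return out


def busy_percent(before, after):
    """Per-CPU busy percentage between two parse_proc_stat() samples."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        dt = total1 - total0
        # No accounted jiffies in the interval: report 0 instead of dividing by 0.
        result[cpu] = 100.0 * (busy1 - busy0) / dt if dt > 0 else 0.0
    return result
```

Reading /proc/stat twice about a second apart and passing both parsed samples to busy_percent() gives one number per CPU for that interval, which is roughly what "top -d1" displays.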


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Frederic Weisbecker
On Wed, Jan 17, 2018 at 12:38:01PM -0500, Luiz Capitulino wrote:
> On Tue, 16 Jan 2018 23:51:29 +0100
> Frederic Weisbecker  wrote:
> 
> > On Tue, Jan 16, 2018 at 11:52:11AM -0500, Luiz Capitulino wrote:
> > > On Tue, 16 Jan 2018 16:41:00 +0100
> > > Frederic Weisbecker  wrote:  
> > > > So isolcpus= is now the place where we control the isolation features
> > > > and nohz is one of them.  
> > > 
> > > That's the part I'm not very sure about. We've been advising users to
> > > move away from isolcpus= when possible, but this very wanted nohz_offload
> > > feature will force everyone back to using isolcpus= again.  
> > 
> > Note "isolcpus=nohz" only implies nohz. You need to add "domain" to get
> > the behaviour that you've been advising users against. We are simply
> > reusing a kernel parameter that was abandoned to now control the isolation
> > features that were disorganized and opaque behind nohz.
> > 
> > > 
> > > I have the impression this series is trying to solve two problems:
> > > 
> > >  1. How (and where) we control the various isolation features in the
> > > kernel  
> > 
> > No, that has already been done in the previous merge window. We have a
> > dedicated isolation subsystem now (kernel/sched/isolation.c) and
> > an interface to control all these isolation features that were abusively
> > implied by nohz. The initial plan was to introduce "cpu_isolation=" but it
> > looked too much like "isolcpus=". Then in fact, why not use "isolcpus=" and
> > give it a second life? And there we are.
> 
> OK, I get it now. But then the series has to un-deprecate isolcpus=,
> otherwise it doesn't make sense to use it.

Good point. Also I think you convinced me to just apply that tick offload
to the existing nohz kernel parameters right away, that is, to both the
existing "nohz_full=" and "isolcpus=nohz".

After all, that tick offload is an implementation detail.

Like you said if people complain about a regression, we can still fix it
with a new option. But eventually I doubt this will be needed.

I'll respin with that.

Thanks!


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Luiz Capitulino
On Tue, 16 Jan 2018 23:51:29 +0100
Frederic Weisbecker  wrote:

> On Tue, Jan 16, 2018 at 11:52:11AM -0500, Luiz Capitulino wrote:
> > On Tue, 16 Jan 2018 16:41:00 +0100
> > Frederic Weisbecker  wrote:  
> > > So isolcpus= is now the place where we control the isolation features
> > > and nohz is one of them.  
> > 
> > That's the part I'm not very sure about. We've been advising users to
> > move away from isolcpus= when possible, but this very wanted nohz_offload
> > feature will force everyone back to using isolcpus= again.  
> 
> Note "isolcpus=nohz" only implies nohz. You need to add "domain" to get
> the behaviour that you've been advising users against. We are simply
> reusing a kernel parameter that was abandoned to now control the isolation
> features that were disorganized and opaque behind nohz.
> 
> > 
> > I have the impression this series is trying to solve two problems:
> > 
> >  1. How (and where) we control the various isolation features in the
> > kernel  
> 
> No, that has already been done in the previous merge window. We have a
> dedicated isolation subsystem now (kernel/sched/isolation.c) and
> an interface to control all these isolation features that were abusively
> implied by nohz. The initial plan was to introduce "cpu_isolation=" but it
> looked too much like "isolcpus=". Then in fact, why not use "isolcpus=" and
> give it a second life? And there we are.

OK, I get it now. But then the series has to un-deprecate isolcpus=,
otherwise it doesn't make sense to use it.


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Mike Galbraith
On Wed, 2018-01-17 at 10:32 -0600, Christopher Lameter wrote:
> On Wed, 17 Jan 2018, Mike Galbraith wrote:
> 
> > Domain connectivity very much is a property of a set of CPUs, a rather
> > important one, and one managed by cpusets.  NOHZ_FULL is a property of
> > a set of cpus, thus a most excellent fit.  Other things are as well.
> 
> Not sure what "domain" refers to in this context.

Scheduler domains, load balancing.

> > > We have sets of cpus associated with affinity masks in the form of bitmaps
> > > etc etc which is much more lightweight than having to lug the cgroup
> > > overhead around everywhere.
> >
> > What does everywhere mean, set creation time?
> 
> You would need to create multiple cgroups to create what you want. Those
> will "inherit" characteristics from higher levels etc etc. It gets
> needlessly complicated and difficult to debug if something goes wrong.

It's only as complicated as you make it.  What I create is dirt simple,
an exclusive system set and an exclusive realtime set, both directly
under root.  It doesn't get any simpler than that.

> > > A simple bitmask is much better if you have to control detailed system
> > > behavior for each core and are planning each process's role because you
> > > need to make full use of the hardware resources available.
> >
> > If you live in a static world, maybe.
> 
> Why would that be restricted to a static world?

Guess I misunderstood, unimportant.

> > I like the flexibility of being able to configure on the fly.  One tiny
> > example: for a high performance aircraft manufacturer, having military
> > simulation background, I know that simulators frequently have to be
> > ready to go at the drop of a hat, so I twiddled cpusets to let them
> > flip their extra fancy video game (80 cores, real controls/avionics...
> > "game over, insert one gold bar to continue" kind of fancy) from low
> > power idle to full bore hard realtime with one poke to a cpuset file.
> >
> > Static may be fine for some, for others, dynamic is much better.
> 
> The problem is that I may be flipping a flag in a cpuset to enable
> something but some other cpuset somewhere in the complex hierarchy does
> something different that causes a conflict.

That's what exclusive sets are for, zero set overlap.  It would be very
difficult to both connect and disconnect scheduler domains :)

>  The directness of control is
> lost. Instead there is the fog of complexity created by the cgroups that
> have various plugins and whatnot.

You don't have to use any of the other controllers, I don't, just tell
systemthing to pretty please NOT co-mount controllers, and whatever to
ensure it keeps its tentacles off of your toys, and you're fine.

-Mike
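
The layout described above (two exclusive sets directly under root, with zero overlap) can be sketched as a list of cpuset file writes. This is a rough illustration only: it assumes a cgroup v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset, the set names "system" and "rt" and the CPU ranges are hypothetical, the helper names are invented, and nothing is actually written to disk.

```python
def parse_cpulist(spec):
    """Expand a cpulist such as "0,2-3" into a set of CPU numbers."""
    cpus = set()
    for chunk in spec.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        if "-" in chunk:
            lo, hi = (int(x) for x in chunk.split("-", 1))
            if hi < lo:
                raise ValueError(f"bad range: {chunk!r}")
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(chunk))
    return cpus


def plan_exclusive_sets(sets, mount="/sys/fs/cgroup/cpuset"):
    """Return (path, value) writes for exclusive cpusets directly under root.

    `sets` maps a set name to a cpulist string. Raises ValueError if any two
    sets share a CPU, since exclusive sets must not overlap.
    """
    owner = {}
    for name, spec in sets.items():
        for cpu in parse_cpulist(spec):
            if cpu in owner:
                raise ValueError(
                    f"CPU {cpu} is in both {owner[cpu]!r} and {name!r}")
            owner[cpu] = name
    writes = []
    for name, spec in sets.items():
        base = f"{mount}/{name}"
        writes.append((f"{base}/cpuset.cpus", spec))
        writes.append((f"{base}/cpuset.cpu_exclusive", "1"))
    return writes
```

For example, plan_exclusive_sets({"system": "0", "rt": "1-7"}) yields four writes, two per set. A real setup would also have to create the directories and fill in cpuset.mems before tasks can be attached; that part is left out here.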


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Christopher Lameter
On Wed, 17 Jan 2018, Mike Galbraith wrote:

> Domain connectivity very much is a property of a set of CPUs, a rather
> important one, and one managed by cpusets.  NOHZ_FULL is a property of
> a set of cpus, thus a most excellent fit.  Other things are as well.

Not sure what "domain" refers to in this context.

> > We have sets of cpus associated with affinity masks in the form of bitmaps
> > etc etc which is much more lightweight than having to lug the cgroup
> > overhead around everywhere.
>
> What does everywhere mean, set creation time?

You would need to create multiple cgroups to create what you want. Those
will "inherit" characteristics from higher levels etc etc. It gets
needlessly complicated and difficult to debug if something goes wrong.

> > A simple bitmask is much better if you have to control detailed system
> > behavior for each core and are planning each process's role because you
> > need to make full use of the hardware resources available.
>
> If you live in a static world, maybe.

Why would that be restricted to a static world?

> I like the flexibility of being able to configure on the fly.  One tiny
> example: for a high performance aircraft manufacturer, having military
> simulation background, I know that simulators frequently have to be
> ready to go at the drop of a hat, so I twiddled cpusets to let them
> flip their extra fancy video game (80 cores, real controls/avionics...
> "game over, insert one gold bar to continue" kind of fancy) from low
> power idle to full bore hard realtime with one poke to a cpuset file.
>
> Static may be fine for some, for others, dynamic is much better.

The problem is that I may be flipping a flag in a cpuset to enable
something but some other cpuset somewhere in the complex hierarchy does
something different that causes a conflict. The directness of control is
lost. Instead there is the fog of complexity created by the cgroups that
have various plugins and whatnot.


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Mike Galbraith
On Wed, 2018-01-17 at 08:51 -0600, Christopher Lameter wrote:
> On Tue, 16 Jan 2018, Mike Galbraith wrote:
> 
> > > I tried to remove isolcpus or at least change the way it works so that its
> > > effects are reversible (ie: affine the init task instead of isolating
> > > domains) but that got nacked due to the behaviour's expectations for
> > > userspace.
> >
> > So we paint ourselves into a static corner forever more, despite every
> > bit of this being all about "properties of sets of cpus", ie precisely
> > what cpusets was born to do.  That's sad, dynamic wasn't that far away.
> 
> cpusets was born in order to isolate applications to sets of processors.
> The properties of sets of cpus were not on the horizon when SGI started
> this.

Domain connectivity very much is a property of a set of CPUs, a rather
important one, and one managed by cpusets.  NOHZ_FULL is a property of
a set of cpus, thus a most excellent fit.  Other things are as well.

> We have sets of cpus associated with affinity masks in the form of bitmaps
> etc etc which is much more lightweight than having to lug the cgroup
> overhead around everywhere.

What does everywhere mean, set creation time?

> A simple bitmask is much better if you have to control detailed system
> behavior for each core and are planning each process's role because you
> need to make full use of the hardware resources available.

If you live in a static world, maybe.

I like the flexibility of being able to configure on the fly.  One tiny
example: for a high performance aircraft manufacturer, having military
simulation background, I know that simulators frequently have to be
ready to go at the drop of a hat, so I twiddled cpusets to let them
flip their extra fancy video game (80 cores, real controls/avionics...
"game over, insert one gold bar to continue" kind of fancy) from low
power idle to full bore hard realtime with one poke to a cpuset file.

Static may be fine for some, for others, dynamic is much better.

-Mike


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Christopher Lameter
On Tue, 16 Jan 2018, Mike Galbraith wrote:

> > I tried to remove isolcpus or at least change the way it works so that its
> > effects are reversible (ie: affine the init task instead of isolating
> > domains) but that got nacked due to the behaviour's expectations for
> > userspace.
>
> So we paint ourselves into a static corner forever more, despite every
> bit of this being all about "properties of sets of cpus", ie precisely
> what cpusets was born to do.  That's sad, dynamic wasn't that far away.

cpusets was born in order to isolate applications to sets of processors.
The properties of sets of cpus were not on the horizon when SGI started
this.

We have sets of cpus associated with affinity masks in the form of bitmaps
etc etc which is much more lightweight than having to lug the cgroup
overhead around everywhere.

A simple bitmask is much better if you have to control detailed system
behavior for each core and are planning each process's role because you
need to make full use of the hardware resources available.
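
To make the bitmask idea concrete: an affinity mask carries one bit per CPU, so a CPU list converts to a mask and back with a few shifts. A small sketch follows; `cpus_to_mask` and `mask_to_cpus` are illustrative names, not kernel or library functions.

```python
def cpus_to_mask(cpus):
    """Build an affinity bitmask with bit N set for every CPU N in `cpus`."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError(f"negative CPU number: {cpu}")
        mask |= 1 << cpu
    return mask


def mask_to_cpus(mask):
    """Return the sorted list of CPU numbers whose bits are set in `mask`."""
    if mask < 0:
        raise ValueError("mask must be non-negative")
    cpus = []
    bit = 0
    while mask:
        if mask & 1:
            cpus.append(bit)
        mask >>= 1
        bit += 1
    return cpus
```

CPUs 1-7 give the mask 0xfe, for instance. On Linux, Python's os.sched_setaffinity(0, cpus) takes the CPU numbers directly and applies them to the calling thread, so the mask form is mainly useful for tools and interfaces that expect a hex mask.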

Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-16 Thread Frederic Weisbecker
On Tue, Jan 16, 2018 at 06:58:18PM +0100, Mike Galbraith wrote:
> On Tue, 2018-01-16 at 16:41 +0100, Frederic Weisbecker wrote:
> > On Fri, Jan 12, 2018 at 02:18:13PM -0500, Luiz Capitulino wrote:
> > 
> > > Why are we extending isolcpus= given that it's a deprecated interface?
> > > Some people have already moved away from isolcpus= now, but with this
> > > new feature they will be forced back to using it.
> > 
> > I tried to remove isolcpus or at least change the way it works so that its
> > effects are reversible (ie: affine the init task instead of isolating
> > domains) but that got nacked due to the behaviour's expectations for
> > userspace.
> 
> So we paint ourselves into a static corner forever more, despite every
> bit of this being all about "properties of sets of cpus", ie precisely
> what cpusets was born to do.  That's sad, dynamic wasn't that far away.

Hence why we need to propagate "isolcpus=" to cpusets.


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-16 Thread Frederic Weisbecker
On Tue, Jan 16, 2018 at 11:52:11AM -0500, Luiz Capitulino wrote:
> On Tue, 16 Jan 2018 16:41:00 +0100
> Frederic Weisbecker  wrote:
> > So isolcpus= is now the place where we control the isolation features
> > and nohz is one of them.
> 
> That's the part I'm not very sure about. We've been advising users to
> move away from isolcpus= when possible, but this very wanted nohz_offload
> feature will force everyone back to using isolcpus= again.

Note "isolcpus=nohz" only implies nohz. You need to add "domain" to get
the behaviour that you've been advising users against. We are simply
reusing a kernel parameter that was abandoned to now control the isolation
features that were disorganized and opaque behind nohz.

> 
> I have the impression this series is trying to solve two problems:
> 
>  1. How (and where) we control the various isolation features in the
> kernel

No, that has already been done in the previous merge window. We have a
dedicated isolation subsystem now (kernel/sched/isolation.c) and
an interface to control all these isolation features that were abusively
implied by nohz. The initial plan was to introduce "cpu_isolation=" but it
looked too much like "isolcpus=". Then in fact, why not use "isolcpus=" and
give it a second life? And there we are.

In the end the goal is to propagate what is passed to "isolcpus=" to cpusets.


> 
>  2. Where we add the control for the tick offload feature
> 
> I think item 1 is too complex to solve right now. IMHO, this series
> should focus on item 2. And regarding item 2, I think we have two
> choices to make:
> 
>  1. Make tick offload a first class citizen by making it default to
> nohz_full=. If there are regressions, we handle them

That's a possible way to go.

> 
>  2. Add a new option to nohz_full=, like nohz_full=tick_offload
> 
> As an avid user of nohz_full I'm dying to see option 1 happening,
> but I'm not totally sure what the consequences can be.

"nohz_full=" parameter has been badly designed as it implies much more
than just full dynticks. So I'm not really looking forward to expanding
it.

> Another idea is to add CONFIG_NOHZ_TICK_OFFLOAD as an experimental
> feature.

I fear it's way too distro-unfriendly. They will want to have it as a
capability without necessarily running it. Just like they do with
CONFIG_NO_HZ_FULL.

> 
> > The complaint about isolcpus is the immutable result. I'm thinking about
> > making it modifiable to cpuset but I only see two possible solutions:
> > 
> > - Make the root cpuset modifiable
> > - Create a directory called "isolcpus" visible on the first cpuset mount
> >   and move all processes there.
> 
> So, if we move the control of the tick offload to nohz_full= itself,
> we can completely ditch any isolcpus= change in this series.
> 
> I think this should give you a great relief :)

Not at all :)

What would be a great relief to me is that we can finally propagate isolcpus=
to cpusets so that we can continue to expand it without a second thought.


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-16 Thread Mike Galbraith
On Tue, 2018-01-16 at 16:41 +0100, Frederic Weisbecker wrote:
> On Fri, Jan 12, 2018 at 02:18:13PM -0500, Luiz Capitulino wrote:
> 
> > Why are we extending isolcpus= given that it's a deprecated interface?
> > Some people have already moved away from isolcpus= now, but with this
> > new feature they will be forced back to using it.
> 
> I tried to remove isolcpus or at least change the way it works so that its
> effects are reversible (ie: affine the init task instead of isolating domains)
> but that got nacked due to the behaviour's expectations for userspace.

So we paint ourselves into a static corner forever more, despite every
bit of this being all about "properties of sets of cpus", ie precisely
what cpusets was born to do.  That's sad, dynamic wasn't that far away.

-Mike




Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-16 Thread Luiz Capitulino
On Tue, 16 Jan 2018 16:41:00 +0100
Frederic Weisbecker  wrote:

> On Fri, Jan 12, 2018 at 02:18:13PM -0500, Luiz Capitulino wrote:
> > On Thu,  4 Jan 2018 05:25:32 +0100
> > Frederic Weisbecker  wrote:
> >   
> > > Ingo,
> > > 
> > > Please pull the sched/0hz branch that can be found at:
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > >   sched/0hz
> > > 
> > > HEAD: 9e932b2cc707209febd130978a5eb9f4a943a3f4
> > > 
> > > --
> > > Now that scheduler_tick() has become resilient towards the absence of
> > > ticks, current->sched_class->task_tick() is the last piece that needs
> > > at least 1Hz tick to keep scheduler stats alive.
> > > 
> > > This patchset adds a flag to the isolcpus boot option to offload the
> > > residual 1Hz tick. This way the nohz_full CPUs no longer have any tick
> > > (assuming nothing else requires it) as their residual 1Hz tick is
> > > offloaded to the housekeepers.
> > > 
> > > For quick testing, say on CPUs 1-7:
> > > 
> > >   "isolcpus=nohz_offload,domain,1-7"  
> > 
> > Sorry for being very late to this series, but I've a few comments to
> > make (one right now and others in individual patches).
> > 
> > Why are we extending isolcpus= given that it's a deprecated interface?
> > Some people have already moved away from isolcpus= now, but with this
> > new feature they will be forced back to using it.  
> 
> I tried to remove isolcpus or at least change the way it works so that its
> effects are reversible (ie: affine the init task instead of isolating domains)
> but that got nacked due to the behaviour's expectations for userspace.
> 
> That's when I realized that kernel parameters are like userspace ABIs,
> they can't be removed easily whether we deprecate them or not.
> 
> Also I needed to be able to control the various isolation features, and
> nohz_full is the wrong place to do that as nohz_full is really just an
> isolation feature like the others, nohz_full= should really just imply
> full dynticks and not watchdog, workqueue or tilegx NAPI isolation...

Yeah, I completely agree with that.

> So isolcpus= is now the place where we control the isolation features
> and nohz is one of them.

That's the part I'm not very sure about. We've been advising users to
move away from isolcpus= when possible, but this very wanted nohz_offload
feature will force everyone back to using isolcpus= again.

I have the impression this series is trying to solve two problems:

 1. How (and where) we control the various isolation features in the
kernel

 2. Where we add the control for the tick offload feature

I think item 1 is too complex to solve right now. IMHO, this series
should focus on item 2. And regarding item 2, I think we have two
choices to make:

 1. Make tick offload a first class citizen by making it default to
nohz_full=. If there are regressions, we handle them

 2. Add a new option to nohz_full=, like nohz_full=tick_offload

As an avid user of nohz_full I'm dying to see option 1 happening,
but I'm not totally sure what the consequences can be.

Another idea is to add CONFIG_NOHZ_TICK_OFFLOAD as an experimental
feature.

> The complaint about isolcpus is the immutable result. I'm thinking about
> making it modifiable to cpuset but I only see two possible solutions:
> 
> - Make the root cpuset modifiable
> - Create a directory called "isolcpus" visible on the first cpuset mount
>   and move all processes there.

So, if we move the control of the tick offload to nohz_full= itself,
we can completely ditch any isolcpus= change in this series.

I think this should give you a great relief :)

> > What about just adding the new functionality to nohz_full=? That is,
> > no new options, just make the tick go away since this has always been
> > what nohz_full= was intended to do?  
> 
> We can, or have isolcpus=nohz to do it, as both do almost the same.
> 
> But I'm afraid about the overhead for people used to nohz_full= once
> they upgrade their kernels and see those workqueues once per second.
> 
> We can still affine those workqueues (in fact the whole unbound workqueue
> mask) outside the nohz_full range. Still current users may be surprised
> about that new overhead on housekeeping CPUs...
> 



Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-16 Thread Frederic Weisbecker
On Fri, Jan 12, 2018 at 02:18:13PM -0500, Luiz Capitulino wrote:
> On Thu,  4 Jan 2018 05:25:32 +0100
> Frederic Weisbecker  wrote:
> 
> > Ingo,
> > 
> > Please pull the sched/0hz branch that can be found at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > sched/0hz
> > 
> > HEAD: 9e932b2cc707209febd130978a5eb9f4a943a3f4
> > 
> > --
> > Now that scheduler_tick() has become resilient towards the absence of
> > ticks, current->sched_class->task_tick() is the last piece that needs
> > at least 1Hz tick to keep scheduler stats alive.
> > 
> > This patchset adds a flag to the isolcpus boot option to offload the
> > residual 1Hz tick. This way the nohz_full CPUs no longer have any tick
> > (assuming nothing else requires it) as their residual 1Hz tick is
> > offloaded to the housekeepers.
> > 
> > For quick testing, say on CPUs 1-7:
> > 
> > "isolcpus=nohz_offload,domain,1-7"
> 
> Sorry for being very late to this series, but I've a few comments to
> make (one right now and others in individual patches).
> 
> Why are we extending isolcpus= given that it's a deprecated interface?
> Some people have already moved away from isolcpus= now, but with this
> new feature they will be forced back to using it.

I tried to remove isolcpus or at least change the way it works so that its
effects are reversible (ie: affine the init task instead of isolating domains)
but that got nacked due to the behaviour's expectations for userspace.

That's when I realized that kernel parameters are like userspace ABIs,
they can't be removed easily whether we deprecate them or not.

Also I needed to be able to control the various isolation features, and
nohz_full is the wrong place to do that as nohz_full is really just an
isolation feature like the others, nohz_full= should really just imply
full dynticks and not watchdog, workqueue or tilegx NAPI isolation...

So isolcpus= is now the place where we control the isolation features
and nohz is one of them.

The complaint about isolcpus is the immutable result. I'm thinking about
making it modifiable to cpuset but I only see two possible solutions:

- Make the root cpuset modifiable
- Create a directory called "isolcpus" visible on the first cpuset mount
  and move all processes there.
 
> What about just adding the new functionality to nohz_full=? That is,
> no new options, just make the tick go away since this has always been
> what nohz_full= was intended to do?

We can, or have isolcpus=nohz to do it, as both do almost the same.

But I'm afraid about the overhead for people used to nohz_full= once
they upgrade their kernels and see those workqueues once per second.

We can still affine those workqueues (in fact the whole unbound workqueue
mask) outside the nohz_full range. Still current users may be surprised
about that new overhead on housekeeping CPUs...
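
For readers unfamiliar with the syntax, the "isolcpus=" value from the cover letter is a comma-separated list of flags followed by a CPU list. Below is a rough sketch of how such a string splits, assuming a flag is any leading item that starts with a letter. The function name is hypothetical, and the kernel's real parser (kernel/sched/isolation.c) may differ in its details.

```python
def split_isolcpus(value):
    """Split e.g. "nohz_offload,domain,1-7" into (flags, sorted CPU list)."""
    items = [item.strip() for item in value.split(",")]
    flags = []
    # Assumption: flags come first and each one begins with a letter.
    while items and items[0][:1].isalpha():
        flags.append(items.pop(0))
    cpus = set()
    for item in items:
        if not item:
            continue
        if "-" in item:
            lo, hi = (int(x) for x in item.split("-", 1))
            if hi < lo:
                raise ValueError(f"bad range: {item!r}")
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(item))
    return flags, sorted(cpus)
```

With the quick-testing string from the pull request, split_isolcpus("nohz_offload,domain,1-7") separates the two flags from CPUs 1 through 7.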


Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-12 Thread Luiz Capitulino
On Thu,  4 Jan 2018 05:25:32 +0100
Frederic Weisbecker  wrote:

> Ingo,
> 
> Please pull the sched/0hz branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
>   sched/0hz
> 
> HEAD: 9e932b2cc707209febd130978a5eb9f4a943a3f4
> 
> --
> Now that scheduler_tick() has become resilient towards the absence of
> ticks, current->sched_class->task_tick() is the last piece that needs
> at least 1Hz tick to keep scheduler stats alive.
> 
> This patchset adds a flag to the isolcpus boot option to offload the
> residual 1Hz tick. This way the nohz_full CPUs no longer have any tick
> (assuming nothing else requires it) as their residual 1Hz tick is
> offloaded to the housekeepers.
> 
> For quick testing, say on CPUs 1-7:
> 
>   "isolcpus=nohz_offload,domain,1-7"

Sorry for being very late to this series, but I've a few comments to
make (one right now and others in individual patches).

Why are we extending isolcpus= given that it's a deprecated interface?
Some people have already moved away from isolcpus= now, but with this
new feature they will be forced back to using it.

What about just adding the new functionality to nohz_full=? That is,
no new options, just make the tick go away since this has always been
what nohz_full= was intended to do?

> 
> Thanks,
>   Frederic
> ---
> 
> Frederic Weisbecker (5):
>   sched: Rename init_rq_hrtick to hrtick_rq_init
>   sched/isolation: Add scheduler tick offloading interface
>   nohz: Allow to check if remote CPU tick is stopped
>   sched/isolation: Residual 1Hz scheduler tick offload
>   sched/isolation: Document "nohz_offload" flag
> 
> 
>  Documentation/admin-guide/kernel-parameters.txt |  7 +-
>  include/linux/sched/isolation.h                 |  3 +-
>  include/linux/tick.h                            |  2 +
>  kernel/sched/core.c                             | 94 +++--
>  kernel/sched/isolation.c                        | 10 +++
>  kernel/sched/sched.h                            |  2 +
>  kernel/time/tick-sched.c                        |  7 ++
>  7 files changed, 117 insertions(+), 8 deletions(-)
>