Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-18 Thread Phil Auld
On Mon, Mar 18, 2019 at 10:14:22AM -0700 bseg...@google.com wrote:
> Phil Auld writes:
>
> > On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote:
> >> On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote:
> >>
> >> >> I'll rework the maths in the averaged version and post v2 if

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-18 Thread bsegall
Phil Auld writes:

> On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote:
>> On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote:
>>
>> >> I'll rework the maths in the averaged version and post v2 if that makes
>> >> sense.
>> >
>> > It may have the extra timer fetch, although

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-18 Thread Phil Auld
On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote:
>
> >> I'll rework the maths in the averaged version and post v2 if that makes
> >> sense.
> >
> > It may have the extra timer fetch, although maybe I could rework it so

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Peter Zijlstra
On Fri, Mar 15, 2019 at 09:30:42AM -0400, Phil Auld wrote:
> On Fri, Mar 15, 2019 at 11:11:50AM +0100 Peter Zijlstra wrote:
> > Computers _suck_ at /100. And since you're free to pick the constant,
> > pick a power of two, computers love those.
> >
>
> Fair enough, I was thinking percents. And

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 04:59:33PM +0100 Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 09:51:25AM -0400, Phil Auld wrote:
> > On Fri, Mar 15, 2019 at 11:33:57AM +0100 Peter Zijlstra wrote:
> > > On Fri, Mar 15, 2019 at 11:11:50AM +0100, Peter Zijlstra wrote:
> > > > diff --git

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote:
>
> > In my defense here, all the fair.c imbalance pct code also uses 100 :)
>
> Yes, I know, I hate on that too ;-) Just never got around to fixing
> that.
>
>
> > with

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Peter Zijlstra
On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote:

> In my defense here, all the fair.c imbalance pct code also uses 100 :)

Yes, I know, I hate on that too ;-) Just never got around to fixing that.

> with the below:
>
> [ 117.235804] cfs_period_timer[cpu2]: period too short, scaling

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Peter Zijlstra
On Fri, Mar 15, 2019 at 09:51:25AM -0400, Phil Auld wrote:
> On Fri, Mar 15, 2019 at 11:33:57AM +0100 Peter Zijlstra wrote:
> > On Fri, Mar 15, 2019 at 11:11:50AM +0100, Peter Zijlstra wrote:
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index ea74d43924b2..b71557be6b42 100644

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 11:11:50AM +0100 Peter Zijlstra wrote:
> On Wed, Mar 13, 2019 at 11:08:26AM -0400, Phil Auld wrote:

...

> Computers _suck_ at /100. And since you're free to pick the constant,
> pick a power of two, computers love those.
>
> > +
> > + if (new_period >

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 11:33:57AM +0100 Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 11:11:50AM +0100, Peter Zijlstra wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index ea74d43924b2..b71557be6b42 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 11:11:50AM +0100 Peter Zijlstra wrote:
> On Wed, Mar 13, 2019 at 11:08:26AM -0400, Phil Auld wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 310d0637fe4b..90cc67bbf592 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Peter Zijlstra
On Fri, Mar 15, 2019 at 11:11:50AM +0100, Peter Zijlstra wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ea74d43924b2..b71557be6b42 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4885,6 +4885,8 @@ static enum hrtimer_restart
>

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Peter Zijlstra
On Wed, Mar 13, 2019 at 11:08:26AM -0400, Phil Auld wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 310d0637fe4b..90cc67bbf592 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4859,19 +4859,51 @@ static enum hrtimer_restart
>

[PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-13 Thread Phil Auld
With an extremely short cfs_period_us setting on a parent task group with a large number of children, the for loop in sched_cfs_period_timer() can run until the watchdog fires. There is no guarantee that the call to hrtimer_forward_now() will ever return 0. The large number of children can make
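
For readers who want to see the failure mode in isolation, here is a minimal user-space sketch of the problem described in the patch summary. It is not the patch itself. The names sim_timer, sim_forward and run_bounded are hypothetical, and the pass limit and the doubling of the period are assumptions chosen for illustration (doubling is a single shift, in the spirit of the power-of-two remark in the thread). sim_forward only mimics the relevant behaviour: it returns 0 once the expiry lies in the future. When one pass of work costs more than a period, that never happens unless the period grows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; these are not kernel APIs. */
struct sim_timer {
	uint64_t expires;	/* next expiry time, in ns */
	uint64_t period;	/* timer period, in ns */
};

/*
 * Push the expiry forward in whole periods until it lies after 'now'.
 * Returns the number of periods added, or 0 if the timer has not expired.
 */
static uint64_t sim_forward(struct sim_timer *t, uint64_t now)
{
	uint64_t overrun;

	assert(t->period > 0);
	if (now < t->expires)
		return 0;
	overrun = (now - t->expires) / t->period + 1;
	t->expires += overrun * t->period;
	return overrun;
}

/*
 * Model of the handler loop. Each pass costs 'work_ns' of simulated time.
 * After more than 'max_passes' consecutive passes the period is doubled
 * (an assumed scaling factor) so that the loop can eventually exit.
 * Returns the total number of passes executed.
 */
static unsigned int run_bounded(struct sim_timer *t, uint64_t *now,
				uint64_t work_ns, unsigned int max_passes)
{
	unsigned int total = 0, count = 0;

	for (;;) {
		if (!sim_forward(t, *now))
			break;
		if (++count > max_passes) {
			t->period <<= 1;	/* power-of-two scaling: one shift */
			count = 0;		/* start counting again */
		}
		*now += work_ns;		/* stand-in for the per-period work */
		total++;
	}
	return total;
}
```

With expires = 100, period = 100, now = 100, work_ns = 250 and max_passes = 3, the loop exits after 10 passes with the period scaled up to 400. Without the scaling branch the same inputs would never leave the loop, since every pass takes longer than one period.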