Re: [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-04-09 Thread Phil Auld
On Tue, Apr 09, 2019 at 03:05:27PM +0200 Peter Zijlstra wrote: > On Tue, Apr 09, 2019 at 08:48:16AM -0400, Phil Auld wrote: > > Hi Ingo, Peter, > > > > On Wed, Apr 03, 2019 at 01:38:39AM -0700 tip-bot for Phil Auld wrote: > > > Commit-ID: 06ec5d30e8d57b820d44df6

Re: [tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-04-09 Thread Phil Auld
Hi Ingo, Peter, On Wed, Apr 03, 2019 at 01:38:39AM -0700 tip-bot for Phil Auld wrote: > Commit-ID: 06ec5d30e8d57b820d44df6340dcb25010d6d0fa > Gitweb: > https://git.kernel.org/tip/06ec5d30e8d57b820d44df6340dcb25010d6d0fa > Author: Phil Auld > AuthorDate: Tue, 19 Mar 2019

[tip:sched/core] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-04-03 Thread tip-bot for Phil Auld
Commit-ID: 06ec5d30e8d57b820d44df6340dcb25010d6d0fa Gitweb: https://git.kernel.org/tip/06ec5d30e8d57b820d44df6340dcb25010d6d0fa Author: Phil Auld AuthorDate: Tue, 19 Mar 2019 09:00:05 -0400 Committer: Ingo Molnar CommitDate: Wed, 3 Apr 2019 09:50:23 +0200 sched/fair: Limit

Re: [PATCH v2] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-21 Thread Phil Auld
On Thu, Mar 21, 2019 at 07:01:37PM +0100 Peter Zijlstra wrote: > On Tue, Mar 19, 2019 at 09:00:05AM -0400, Phil Auld wrote: > > sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup > > > > With extremely short cfs_period_us setting on a parent task group with a

[PATCH v2] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-19 Thread Phil Auld
is state and the new values. v2: Math reworked/simplified by Peter Zijlstra. Signed-off-by: Phil Auld Cc: Ben Segall Cc: Ingo Molnar Cc: Peter Zijlstra (Intel) Cc: Anton Blanchard --- kernel/sched/fair.c | 25 + 1 file changed, 25 insertions(+) diff --git a/kernel/sc

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-18 Thread Phil Auld
On Mon, Mar 18, 2019 at 10:14:22AM -0700 bseg...@google.com wrote: > Phil Auld writes: > > > On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote: > >> On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote: > >> > >> >> I'll rewor

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-18 Thread Phil Auld
On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote: > On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote: > > >> I'll rework the maths in the averaged version and post v2 if that makes > >> sense. > > > > It may have the extra timer f

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 04:59:33PM +0100 Peter Zijlstra wrote: > On Fri, Mar 15, 2019 at 09:51:25AM -0400, Phil Auld wrote: > > On Fri, Mar 15, 2019 at 11:33:57AM +0100 Peter Zijlstra wrote: > > > On Fri, Mar 15, 2019 at 11:11:50AM +0100, Peter Zijlstra wrote: > > >

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 05:03:47PM +0100 Peter Zijlstra wrote: > On Fri, Mar 15, 2019 at 11:30:42AM -0400, Phil Auld wrote: > > > In my defense here, all the fair.c imbalance pct code also uses 100 :) > > Yes, I know, I hate on that too ;-) Just never got around to fixing >

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 11:11:50AM +0100 Peter Zijlstra wrote: > On Wed, Mar 13, 2019 at 11:08:26AM -0400, Phil Auld wrote: ... > Computers _suck_ at /100. And since you're free to pick the constant, > pick a power of two, computers love those. > > > + > > +

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 11:33:57AM +0100 Peter Zijlstra wrote: > On Fri, Mar 15, 2019 at 11:11:50AM +0100, Peter Zijlstra wrote: > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index ea74d43924b2..b71557be6b42 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@

Re: [PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-15 Thread Phil Auld
On Fri, Mar 15, 2019 at 11:11:50AM +0100 Peter Zijlstra wrote: > On Wed, Mar 13, 2019 at 11:08:26AM -0400, Phil Auld wrote: > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index 310d0637fe4b..90cc67bbf592 100644 > > --- a/kernel/sched/fair.c > >

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-13 Thread Phil Auld
On Wed, Mar 13, 2019 at 01:26:51PM -0700 bseg...@google.com wrote: > Phil Auld writes: > > > On Wed, Mar 13, 2019 at 10:44:09AM -0700 bseg...@google.com wrote: > >> Phil Auld writes: > >> > >> > On Mon, Mar 11, 2019 at 04:25:36PM -0400 Phil Auld wr

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-13 Thread Phil Auld
On Wed, Mar 13, 2019 at 10:44:09AM -0700 bseg...@google.com wrote: > Phil Auld writes: > > > On Mon, Mar 11, 2019 at 04:25:36PM -0400 Phil Auld wrote: > >> On Mon, Mar 11, 2019 at 10:44:25AM -0700 bseg...@google.com wrote: > >> > Letting it spin for 100ms and

[PATCH] sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup

2019-03-13 Thread Phil Auld
as suggested by Ben Segall . Signed-off-by: Phil Auld Cc: Ben Segall Cc: Ingo Molnar Cc: Peter Zijlstra (Intel) --- Note: This is against v5.0 as suggested by the documentation. It won't apply to 5.0+ due to the change to raw_spin_lock_irqsave. I can respin as needed. kernel/

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-12 Thread Phil Auld
On Mon, Mar 11, 2019 at 04:25:36PM -0400 Phil Auld wrote: > On Mon, Mar 11, 2019 at 10:44:25AM -0700 bseg...@google.com wrote: > > Letting it spin for 100ms and then only increasing by 6% seems extremely > > generous. If we went this route I'd probably say "after loo

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-11 Thread Phil Auld
On Mon, Mar 11, 2019 at 10:44:25AM -0700 bseg...@google.com wrote: > Phil Auld writes: > > > On Wed, Mar 06, 2019 at 11:25:02AM -0800 bseg...@google.com wrote: > >> Phil Auld writes: > >> > >> > On Tue, Mar 05, 2019 at 12:45:34PM -0800 bseg...@g

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-09 Thread Phil Auld
On Wed, Mar 06, 2019 at 11:25:02AM -0800 bseg...@google.com wrote: > Phil Auld writes: > > > On Tue, Mar 05, 2019 at 12:45:34PM -0800 bseg...@google.com wrote: > >> Phil Auld writes: > >> > >> > Interestingly, if I limit the number of child cgroups

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-06 Thread Phil Auld
On Tue, Mar 05, 2019 at 12:45:34PM -0800 bseg...@google.com wrote: > Phil Auld writes: > > > Interestingly, if I limit the number of child cgroups to the number of > > them I'm actually putting processes into (16 down from 2500) the problem > > does not r

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-05 Thread Phil Auld
On Tue, Mar 05, 2019 at 10:49:01AM -0800 bseg...@google.com wrote: > Phil Auld writes: > > >> > > >> > raw_spin_lock(&cfs_b->lock); > >> > for (;;) { > >> > overrun = hrtimer_forward_now(timer, cfs_b->peri

Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-04 Thread Phil Auld
On Mon, Mar 04, 2019 at 10:13:49AM -0800 bseg...@google.com wrote: > Phil Auld writes: > > > Hi, > > > > I have a reproducible case of this: > > > > [ 217.264946] NMI watchdog: Watchdog detected hard LOCKUP on cpu 24 > > [ 217.264948] M

[RFC] sched/fair: hard lockup in sched_cfs_period_timer

2019-03-01 Thread Phil Auld
Hi, I have a reproducible case of this: [ 217.264946] NMI watchdog: Watchdog detected hard LOCKUP on cpu 24 [ 217.264948] Modules linked in: sunrpc iTCO_wdt gpio_ich iTCO_vendor_support intel_powerclamp coretemp kvm_intel kvm ipmi_ssif irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_in

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-02-19 Thread Phil Auld
On Tue, Feb 19, 2019 at 05:22:50PM +0100 Peter Zijlstra wrote: > On Tue, Feb 19, 2019 at 11:13:43AM -0500, Phil Auld wrote: > > On Mon, Feb 18, 2019 at 05:56:23PM +0100 Peter Zijlstra wrote: > > > In preparation of playing games with rq->lock, abstract the thing &g

Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

2019-02-19 Thread Phil Auld
On Mon, Feb 18, 2019 at 05:56:23PM +0100 Peter Zijlstra wrote: > In preparation of playing games with rq->lock, abstract the thing > using an accessor. > > Signed-off-by: Peter Zijlstra (Intel) Hi Peter, Sorry... what tree are these for? They don't apply to mainline. Some branch on tip, I gue

[tip:sched/urgent] sched/fair: Fix throttle_list starvation with low CFS quota

2018-10-11 Thread tip-bot for Phil Auld
Commit-ID: baa9be4ffb55876923dc9716abc0a448e510ba30 Gitweb: https://git.kernel.org/tip/baa9be4ffb55876923dc9716abc0a448e510ba30 Author: Phil Auld AuthorDate: Mon, 8 Oct 2018 10:36:40 -0400 Committer: Ingo Molnar CommitDate: Thu, 11 Oct 2018 13:10:18 +0200 sched/fair: Fix throttle_list

[tip:sched/urgent] sched/fair: Fix throttle_list starvation with low CFS quota

2018-10-11 Thread tip-bot for Phil Auld
Commit-ID: 8b48300108248e950cde0bdc5708039fc3836623 Gitweb: https://git.kernel.org/tip/8b48300108248e950cde0bdc5708039fc3836623 Author: Phil Auld AuthorDate: Mon, 8 Oct 2018 10:36:40 -0400 Committer: Ingo Molnar CommitDate: Thu, 11 Oct 2018 11:18:32 +0200 sched/fair: Fix throttle_list

Re: [Patch] sched/fair: Avoid throttle_list starvation with low cfs quota

2018-10-10 Thread Phil Auld
eeds to be fixed - and at first sight the quota > > of 1000 looks very > > low - could we improve the arithmetics perhaps? > > > > A low quota of 1000 is used because there's many VMs or containers > > provisioned on the system > > that is trig

Re: [Patch] sched/fair: Avoid throttle_list starvation with low cfs quota

2018-10-09 Thread Phil Auld
I believe that's a different issue, though. The kernel allows this setting and should handle it better than it currently does. The proposed patch fixes it so that all the tasks make progress (even if not much progress) rather than having some starve at the back of the list. Cheers, Ph

[Patch] sched/fair: Avoid throttle_list starvation with low cfs quota

2018-10-08 Thread Phil Auld
From: "Phil Auld" sched/fair: Avoid throttle_list starvation with low cfs quota With a very low cpu.cfs_quota_us setting, such as the minimum of 1000, distribute_cfs_runtime may not empty the throttled_list before it runs out of runtime to distribute. In that case, due to the c

Re: Configure.help is complete

2001-06-01 Thread Phil Auld
Alexander Viro wrote: ...snip... > > We should start removing the crap from procfs in 2.5. Documenting shit is > a good step, but taking it out would be better. > Not to open what may be a can of worms but ... What's wrong with procfs? It allows a general interface to the kernel that does

Re: Stale super_blocks in 2.2

2001-02-14 Thread Phil Auld
Alan Cox wrote: > > > That can be a problem for fiber channel devices. I saw some issues with > > invalidate_buffers and page caching discussed in 2.4 space. Any reasons > > come to mind why I shouldn't call invalidate on the way down instead > > (or in addition)? > > The I/O completed a few

Re: Stale super_blocks in 2.2

2001-02-13 Thread Phil Auld
Alan Cox wrote: > > > does not do anything to invalidate the buffers associated with the > > unmounted device. We then rely on disk change detection on a > > subsequent mount to prevent us from seeing the old super_block. > > 2.2 yes, 2.4 no That can be a problem for fiber channel devices. I sa

Stale super_blocks in 2.2

2001-02-13 Thread Phil Auld
Hello, It appears that the umount path in the 2.2 series kernels does not do anything to invalidate the buffers associated with the unmounted device. We then rely on disk change detection on a subsequent mount to prevent us from seeing the old super_block. Since deja was gobbled by goog
