Re: sched: hang in migrate_swap

2015-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2015 at 04:38:21PM -0300, Rafael David Tinoco wrote: > Any thoughts ? This recently came up again and I proposed the below. Reposting because the original had a silly compile fail. --- Subject: stop_machine: Fix deadlock between multiple stop_two_cpus() From: Peter Zijlstra Date:

Re: sched: hang in migrate_swap

2015-06-15 Thread Rafael David Tinoco
Peter, Sasha, coming back to this… Not that this is happening frequently or I can easily reproduce, but… > On May14, 2014, at 07:26 AM, Peter Zijlstra wrote: > > On Wed, May 14, 2014 at 02:21:04PM +0400, Kirill Tkhai wrote: >> >> >> 14.05.2014, 14:14, "Peter Zijlstra" : >>> On Wed, May 14, 20

Re: sched: hang in migrate_swap

2014-05-14 Thread Peter Zijlstra
On Wed, May 14, 2014 at 12:26:02PM +0200, Peter Zijlstra wrote: > so we serialize stop_cpus_work() vs stop_two_cpus() with an l/g lock. > > Ah, but stop_cpus_work() only holds the global lock over queueing, it > doesn't wait for completion, that might indeed cause a problem. Hmm, is this so? If

Re: sched: hang in migrate_swap

2014-05-14 Thread Peter Zijlstra
On Wed, May 14, 2014 at 02:21:04PM +0400, Kirill Tkhai wrote: > > > 14.05.2014, 14:14, "Peter Zijlstra" : > > On Wed, May 14, 2014 at 01:42:32PM +0400, Kirill Tkhai wrote: > > > >>  Peter, do we have to queue stop works orderly? > >> > >>  Is there is not a possibility, when two pair of works que

Re: sched: hang in migrate_swap

2014-05-14 Thread Kirill Tkhai
14.05.2014, 14:14, "Peter Zijlstra" : > On Wed, May 14, 2014 at 01:42:32PM +0400, Kirill Tkhai wrote: > >>  Peter, do we have to queue stop works orderly? >> >>  Is there is not a possibility, when two pair of works queued different on >>  different cpus? >> >>   kernel/stop_machine.c | 10 ++

Re: sched: hang in migrate_swap

2014-05-14 Thread Peter Zijlstra
On Wed, May 14, 2014 at 01:42:32PM +0400, Kirill Tkhai wrote: > Peter, do we have to queue stop works orderly? > > Is there is not a possibility, when two pair of works queued different on > different cpus? > > > kernel/stop_machine.c | 10 -- > 1 file changed, 8 insertions(+), 2 deleti
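
[Editor's illustration, not part of the thread] The question quoted above, whether two pairs of stop works can end up queued in different orders on different CPUs, can be modelled with a small user-space sketch. Everything here (struct demo_queue, the demo_* helpers, the work ids 1 and 2) is hypothetical and is not the kernel/stop_machine.c code. It only shows that interleaved queueing by two callers can leave the two per-CPU queues disagreeing about which work comes first, whereas fully serialized queueing cannot.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_QLEN 4

/* Hypothetical per-CPU stopper queue: work ids in arrival order. */
struct demo_queue {
	int work[DEMO_QLEN];
	size_t n;
};

static void demo_enqueue(struct demo_queue *q, int work_id)
{
	if (q->n < DEMO_QLEN)
		q->work[q->n++] = work_id;
}

/* Position of work_id in q, or -1 if it was never queued there. */
static int demo_pos(const struct demo_queue *q, int work_id)
{
	for (size_t i = 0; i < q->n; i++)
		if (q->work[i] == work_id)
			return (int)i;
	return -1;
}

/* True if works a and b appear in the same relative order on both queues. */
bool demo_same_order(const struct demo_queue *q1,
		     const struct demo_queue *q2, int a, int b)
{
	return (demo_pos(q1, a) < demo_pos(q1, b)) ==
	       (demo_pos(q2, a) < demo_pos(q2, b));
}

/* Caller A (work 1) and caller B (work 2) interleave their queueing. */
bool demo_interleaved_is_consistent(void)
{
	struct demo_queue cpu0 = { 0 }, cpu1 = { 0 };

	demo_enqueue(&cpu0, 1);	/* A queues on cpu0 */
	demo_enqueue(&cpu1, 2);	/* B queues on cpu1 */
	demo_enqueue(&cpu1, 1);	/* A queues on cpu1 */
	demo_enqueue(&cpu0, 2);	/* B queues on cpu0 */
	return demo_same_order(&cpu0, &cpu1, 1, 2);
}

/* Caller A finishes queueing on both CPUs before caller B starts. */
bool demo_serialized_is_consistent(void)
{
	struct demo_queue cpu0 = { 0 }, cpu1 = { 0 };

	demo_enqueue(&cpu0, 1);
	demo_enqueue(&cpu1, 1);
	demo_enqueue(&cpu1, 2);
	demo_enqueue(&cpu0, 2);
	return demo_same_order(&cpu0, &cpu1, 1, 2);
}
```

In the interleaved case cpu0 sees work 1 first while cpu1 sees work 2 first. If each work has to wait for its partner CPU to reach the same work, neither CPU can make progress, which is the kind of hang the question is about. The model says nothing about how the actual patches in this thread serialize queueing.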

Re: sched: hang in migrate_swap

2014-05-14 Thread Kirill Tkhai
12.05.2014, 22:49, "Sasha Levin" : > On 04/11/2014 11:16 AM, Kirill Tkhai wrote: > >>  11.04.2014, 18:33, "Sasha Levin" :  On 04/10/2014 09:38 AM, Kirill Tkhai wrote: >>   10.04.2014, 11:00, "Michael wang" : >>   On 04/10/2014 11:31 AM, Sasha Levin wrote: >>   [snip] >>

Re: sched: hang in migrate_swap

2014-05-12 Thread Sasha Levin
On 04/11/2014 11:16 AM, Kirill Tkhai wrote: > 11.04.2014, 18:33, "Sasha Levin" : >> > On 04/10/2014 09:38 AM, Kirill Tkhai wrote: >> > >>> >> 10.04.2014, 11:00, "Michael wang" : > On 04/10/2014 11:31 AM, Sasha Levin wrote: > [snip] >>> >> I'd like to re-open this issu

Re: sched: hang in migrate_swap

2014-04-11 Thread Kirill Tkhai
11.04.2014, 18:33, "Sasha Levin" : > On 04/10/2014 09:38 AM, Kirill Tkhai wrote: > >>  10.04.2014, 11:00, "Michael wang" :  On 04/10/2014 11:31 AM, Sasha Levin wrote:  [snip] >>   I'd like to re-open this issue. It seems that something broke and I'm >>   now seeing the same issues

Re: sched: hang in migrate_swap

2014-04-11 Thread Sasha Levin
On 04/10/2014 09:38 AM, Kirill Tkhai wrote: > 10.04.2014, 11:00, "Michael wang" : >> > On 04/10/2014 11:31 AM, Sasha Levin wrote: >> > [snip] >> > >>> >> I'd like to re-open this issue. It seems that something broke and I'm >>> >> now seeing the same issues that have gone away 2 months with this

Re: sched: hang in migrate_swap

2014-04-10 Thread Kirill Tkhai
10.04.2014, 11:00, "Michael wang" : > On 04/10/2014 11:31 AM, Sasha Levin wrote: > [snip] > >>  I'd like to re-open this issue. It seems that something broke and I'm >>  now seeing the same issues that have gone away 2 months with this patch >>  again. > > A new mechanism has been designed to move

Re: sched: hang in migrate_swap

2014-04-10 Thread Peter Zijlstra
On Wed, Apr 09, 2014 at 11:31:48PM -0400, Sasha Levin wrote: > I'd like to re-open this issue. It seems that something broke and I'm > now seeing the same issues that have gone away 2 months with this patch > again. Weird; we didn't touch anything in the last few weeks :-/ > Stack trace is simila

Re: sched: hang in migrate_swap

2014-04-10 Thread Michael wang
On 04/10/2014 11:31 AM, Sasha Levin wrote: [snip] > > I'd like to re-open this issue. It seems that something broke and I'm > now seeing the same issues that have gone away 2 months with this patch > again. A new mechanism has been designed to move the priority checking inside idle_balance(), inc

Re: sched: hang in migrate_swap

2014-04-09 Thread Sasha Levin
On 02/24/2014 07:12 AM, Peter Zijlstra wrote: > Subject: sched: Guarantee task priority in pick_next_task() > From: Peter Zijlstra > Date: Fri Feb 14 12:25:08 CET 2014 > > Michael spotted that the idle_balance() push down created a task > priority problem. > > Previously, when we called idle_bal

Re: sched: hang in migrate_swap

2014-02-25 Thread Michael wang
On 02/25/2014 06:49 PM, Peter Zijlstra wrote: > On Tue, Feb 25, 2014 at 12:47:01PM +0800, Michael wang wrote: >> On 02/24/2014 09:10 PM, Peter Zijlstra wrote: >>> On Mon, Feb 24, 2014 at 01:12:18PM +0100, Peter Zijlstra wrote: + if (p) { + if (unlikely(p == RETRY

Re: sched: hang in migrate_swap

2014-02-25 Thread Peter Zijlstra
On Mon, Feb 24, 2014 at 01:21:48PM -0500, Sasha Levin wrote: > Sign me up to the fan club of this patch, I love it. I've converted that in a Tested-by: tag :-)

Re: sched: hang in migrate_swap

2014-02-25 Thread Peter Zijlstra
On Tue, Feb 25, 2014 at 12:47:01PM +0800, Michael wang wrote: > On 02/24/2014 09:10 PM, Peter Zijlstra wrote: > > On Mon, Feb 24, 2014 at 01:12:18PM +0100, Peter Zijlstra wrote: > >> + if (p) { > >> + if (unlikely(p == RETRY_TASK)) > >> + goto agai

Re: sched: hang in migrate_swap

2014-02-24 Thread Michael wang
On 02/24/2014 09:10 PM, Peter Zijlstra wrote: > On Mon, Feb 24, 2014 at 01:12:18PM +0100, Peter Zijlstra wrote: >> +if (p) { >> +if (unlikely(p == RETRY_TASK)) >> +goto again; > > We could even make that: unlikely(p & 1), I think most CPU

Re: sched: hang in migrate_swap

2014-02-24 Thread Michael wang
On 02/24/2014 08:12 PM, Peter Zijlstra wrote: [snip] >> >> ...what about move idle_balance() back to it's old position? > > I've always hated that, idle_balance() is very much a fair policy thing > and shouldn't live in the core code. > >> pull_rt_task() logical could be after idle_balance() if s

Re: sched: hang in migrate_swap

2014-02-24 Thread Michael wang
On 02/25/2014 02:21 AM, Sasha Levin wrote: [snip] >> >> Fixes: 38033c37faab ("sched: Push down pre_schedule() and >> idle_balance()") >> Cc: Juri Lelli >> Cc: Ingo Molnar >> Cc: Steven Rostedt >> Reported-by: Michael Wang >> Signed-off-by: Peter Zijlstra > > Sign me up to the fan club of this patc

Re: sched: hang in migrate_swap

2014-02-24 Thread Sasha Levin
On 02/24/2014 07:12 AM, Peter Zijlstra wrote: Anyway, the below seems to work; it avoids playing tricks with the idle thread and instead uses a magic constant. The comparison should be faster too; seeing how we avoid dereferencing p->sched_class. --- Subject: sched: Guarantee task priority in p
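
[Editor's illustration, not part of the thread] The "magic constant" idea described above can be sketched in user-space C. This is a minimal model under stated assumptions, not the scheduler patch itself: struct task, struct demo_rq, the two pick functions and the sentinel value chosen for RETRY_TASK are all hypothetical. The point is that a pick function can return a reserved pointer value that is never a real task, and the caller compares against that value directly instead of dereferencing the returned pointer.

```c
#include <stddef.h>
#include <stdint.h>

struct task {
	int prio;
};

/* Assumed sentinel: an address that no real task object can occupy. */
#define RETRY_TASK ((struct task *)UINTPTR_MAX)

/* Hypothetical run queue with one slot per scheduling class. */
struct demo_rq {
	struct task *rt;		/* higher-priority class */
	struct task *fair;		/* lower-priority class */
	struct task *incoming_rt;	/* arrives while fair is balancing */
};

static struct task *pick_rt(struct demo_rq *rq)
{
	return rq->rt;
}

static struct task *pick_fair(struct demo_rq *rq)
{
	/*
	 * Model of balancing when the fair class has nothing to run: a
	 * higher-priority task may show up meanwhile, so ask for a restart.
	 */
	if (!rq->fair && rq->incoming_rt) {
		rq->rt = rq->incoming_rt;
		rq->incoming_rt = NULL;
		return RETRY_TASK;
	}
	return rq->fair;
}

/* Walk the classes from highest priority down; restart on RETRY_TASK. */
struct task *pick_next(struct demo_rq *rq)
{
	struct task *(*const classes[])(struct demo_rq *) = {
		pick_rt, pick_fair,
	};
	struct task *p;

again:
	for (size_t i = 0; i < sizeof(classes) / sizeof(classes[0]); i++) {
		p = classes[i](rq);
		if (p) {
			if (p == RETRY_TASK)	/* pointer compare, no dereference */
				goto again;
			return p;
		}
	}
	return NULL;	/* nothing runnable */
}

/* Priority of the task picked, or -1 if none; for demonstration only. */
int demo_pick_prio(int rt_arrives)
{
	struct task rt_task = { .prio = 1 };
	struct demo_rq rq = {
		.incoming_rt = rt_arrives ? &rt_task : NULL,
	};
	struct task *p = pick_next(&rq);

	return p ? p->prio : -1;
}
```

With rt_arrives set, the first pass finds nothing in the higher class, the lower class signals RETRY_TASK, and the second pass returns the newly arrived higher-priority task. Without it, the walk ends with no task picked.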

Re: sched: hang in migrate_swap

2014-02-24 Thread Peter Zijlstra
On Mon, Feb 24, 2014 at 01:12:18PM +0100, Peter Zijlstra wrote: > + if (p) { > + if (unlikely(p == RETRY_TASK)) > + goto again; We could even make that: unlikely(p & 1), I think most CPUs can encode that far better than the full pointer i

Re: sched: hang in migrate_swap

2014-02-24 Thread Peter Zijlstra
On Mon, Feb 24, 2014 at 06:14:24PM +0800, Michael wang wrote: > On 02/24/2014 03:10 PM, Peter Zijlstra wrote: > > On Mon, Feb 24, 2014 at 01:19:15PM +0800, Michael wang wrote: > >> Peter, do we accidentally missed this commit? > >> > >> http://git.kernel.org/tip/477af336ba06ef4c32e97892bb0d2027ce30

Re: sched: hang in migrate_swap

2014-02-24 Thread Michael wang
On 02/24/2014 03:10 PM, Peter Zijlstra wrote: > On Mon, Feb 24, 2014 at 01:19:15PM +0800, Michael wang wrote: >> Peter, do we accidentally missed this commit? >> >> http://git.kernel.org/tip/477af336ba06ef4c32e97892bb0d2027ce30f466 > > Ingo dropped it on Saturday because it makes locking_selftest(

Re: sched: hang in migrate_swap

2014-02-23 Thread Peter Zijlstra
On Mon, Feb 24, 2014 at 01:19:15PM +0800, Michael wang wrote: > Peter, do we accidentally missed this commit? > > http://git.kernel.org/tip/477af336ba06ef4c32e97892bb0d2027ce30f466 Ingo dropped it on Saturday because it makes locking_selftest() unhappy. That is because we call locking_selftest()

Re: sched: hang in migrate_swap

2014-02-23 Thread Sasha Levin
On 02/24/2014 12:19 AM, Michael wang wrote: On 02/24/2014 11:23 AM, Sasha Levin wrote: [snip] >>> >>>Could I get a link to the patch please? It's a pain testing other things >>>with this issue reproducing every time. >> >>Please take a try on the latest tip tree, patches has been merged as I >>

Re: sched: hang in migrate_swap

2014-02-23 Thread Michael wang
On 02/24/2014 11:23 AM, Sasha Levin wrote: [snip] >>> >>> Could I get a link to the patch please? It's a pain testing other things >>> with this issue reproducing every time. >> >> Please take a try on the latest tip tree, patches has been merged as I >> saw :) > > Nope, still see it with latest -

Re: sched: hang in migrate_swap

2014-02-23 Thread Sasha Levin
On 02/21/2014 08:45 PM, Michael wang wrote: On 02/22/2014 12:43 AM, Sasha Levin wrote: On 02/19/2014 11:32 PM, Michael wang wrote: On 02/20/2014 02:08 AM, Sasha Levin wrote: Hi all, While fuzzing with trinity inside a KVM tools guest, running latest -next kernel, I see to hit the following ha

Re: sched: hang in migrate_swap

2014-02-21 Thread Michael wang
On 02/22/2014 12:43 AM, Sasha Levin wrote: > On 02/19/2014 11:32 PM, Michael wang wrote: >> On 02/20/2014 02:08 AM, Sasha Levin wrote: >>> >Hi all, >>> > >>> >While fuzzing with trinity inside a KVM tools guest, running latest >>> >-next kernel, I see to hit the following hang quite often. >> Fix f

Re: sched: hang in migrate_swap

2014-02-21 Thread Sasha Levin
On 02/19/2014 11:32 PM, Michael wang wrote: On 02/20/2014 02:08 AM, Sasha Levin wrote: >Hi all, > >While fuzzing with trinity inside a KVM tools guest, running latest >-next kernel, I see to hit the following hang quite often. Fix for the stuck issue around idle_balance() is now in progress, th

Re: sched: hang in migrate_swap

2014-02-19 Thread Michael wang
On 02/20/2014 02:08 AM, Sasha Levin wrote: > Hi all, > > While fuzzing with trinity inside a KVM tools guest, running latest > -next kernel, I see to hit the following hang quite often. Fix for the stuck issue around idle_balance() is now in progress, this may be caused by the same problem, I sugges

sched: hang in migrate_swap

2014-02-19 Thread Sasha Levin
Hi all, While fuzzing with trinity inside a KVM tools guest, running latest -next kernel, I see to hit the following hang quite often. The initial spew is: [ 293.110057] BUG: soft lockup - CPU#8 stuck for 22s! [migration/8:258] [ 293.110057] Modules linked in: [ 293.110057] irq event stamp