Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 5:27 PM, Peter Zijlstra wrote: > On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote: >> I'm going to fix up rbd_request_fn(), but I want to make sure >> I understand this in full. >> >> - Previously the danger of calling blocking primitives on the way to >>

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-08-01 Thread Peter Zijlstra
On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote: > I'm going to fix up rbd_request_fn(), but I want to make sure > I understand this in full. > > - Previously the danger of calling blocking primitives on the way to > schedule(), i.e. with task->state != TASK_RUNNING, was that if

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-08-01 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 6:39 PM, Peter Zijlstra wrote: > On Thu, Jul 31, 2014 at 04:30:52PM +0200, Mike Galbraith wrote: >> On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: >> >> > Smells like maybe current->state != TASK_RUNNING >> >> Bingo >> >> [ 1200.851004] kjournald D

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-08-01 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 6:39 PM, Peter Zijlstra pet...@infradead.org wrote: On Thu, Jul 31, 2014 at 04:30:52PM +0200, Mike Galbraith wrote: On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: Smells like maybe current-state != TASK_RUNNING Bingo [ 1200.851004] kjournald D

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-08-01 Thread Peter Zijlstra
On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote: I'm going to fix up rbd_request_fn(), but I want to make sure I understand this in full. - Previously the danger of calling blocking primitives on the way to schedule(), i.e. with task-state != TASK_RUNNING, was that if the

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-08-01 Thread Ilya Dryomov
On Fri, Aug 1, 2014 at 5:27 PM, Peter Zijlstra pet...@infradead.org wrote: On Fri, Aug 01, 2014 at 04:56:27PM +0400, Ilya Dryomov wrote: I'm going to fix up rbd_request_fn(), but I want to make sure I understand this in full. - Previously the danger of calling blocking primitives on the way

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 6:30 PM, Mike Galbraith wrote: > On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: > >> Smells like maybe current->state != TASK_RUNNING It just triggered for me too, took longer than usual. Sorry for the churn Peter, this was really confusing. Onto finding the

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 04:30:52PM +0200, Mike Galbraith wrote: > On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: > > > Smells like maybe current->state != TASK_RUNNING > > Bingo > > [ 1200.851004] kjournald D 0002 0 4398 2 > 0x > [ 1200.858283]

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Mike Galbraith
On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: > Smells like maybe current->state != TASK_RUNNING Bingo [ 1200.851004] kjournald D 0002 0 4398 2 0x [ 1200.858283] 8803537bb788 0046 8803537bb7a8 [ 1200.865914]

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 03:44:11PM +0200, Ingo Molnar wrote: > > > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c > > index ae712b25e492..3d726fdaa764 100644 > > --- a/kernel/locking/mutex.c > > +++ b/kernel/locking/mutex.c > > @@ -473,8 +473,12 @@ __mutex_lock_common(struct mutex

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Ingo Molnar
* Peter Zijlstra wrote: > On Thu, Jul 31, 2014 at 04:37:29PM +0400, Ilya Dryomov wrote: > > > This didn't make sense to me at first too, and I'll be happy to be > > proven wrong, but we can reproduce this with rbd very reliably under > > higher than usual load, and the revert makes it go away.

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 5:13 PM, Peter Zijlstra wrote: > On Thu, Jul 31, 2014 at 04:37:29PM +0400, Ilya Dryomov wrote: > >> This didn't make sense to me at first too, and I'll be happy to be >> proven wrong, but we can reproduce this with rbd very reliably under >> higher than usual load, and the

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 04:37:29PM +0400, Ilya Dryomov wrote: > This didn't make sense to me at first too, and I'll be happy to be > proven wrong, but we can reproduce this with rbd very reliably under > higher than usual load, and the revert makes it go away. What we are > seeing in the rbd

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 3:57 PM, Peter Zijlstra wrote: > On Thu, Jul 31, 2014 at 02:16:37PM +0400, Ilya Dryomov wrote: >> This reverts commit 34c6bc2c919a55e5ad4e698510a2f35ee13ab900. >> >> This commit can lead to deadlocks by way of what at a high level >> appears to look like a missing wakeup

Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 02:16:37PM +0400, Ilya Dryomov wrote: > This reverts commit 34c6bc2c919a55e5ad4e698510a2f35ee13ab900. > > This commit can lead to deadlocks by way of what at a high level > appears to look like a missing wakeup on mutex_unlock() when > CONFIG_MUTEX_SPIN_ON_OWNER is set,

[PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

2014-07-31 Thread Ilya Dryomov
This reverts commit 34c6bc2c919a55e5ad4e698510a2f35ee13ab900. This commit can lead to deadlocks by way of what at a high level appears to look like a missing wakeup on mutex_unlock() when CONFIG_MUTEX_SPIN_ON_OWNER is set, which is how most distributions ship their kernels. In particular, it

[PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Ilya Dryomov
This reverts commit 34c6bc2c919a55e5ad4e698510a2f35ee13ab900. This commit can lead to deadlocks by way of what at a high level appears to look like a missing wakeup on mutex_unlock() when CONFIG_MUTEX_SPIN_ON_OWNER is set, which is how most distributions ship their kernels. In particular, it

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 02:16:37PM +0400, Ilya Dryomov wrote: This reverts commit 34c6bc2c919a55e5ad4e698510a2f35ee13ab900. This commit can lead to deadlocks by way of what at a high level appears to look like a missing wakeup on mutex_unlock() when CONFIG_MUTEX_SPIN_ON_OWNER is set, which

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 3:57 PM, Peter Zijlstra pet...@infradead.org wrote: On Thu, Jul 31, 2014 at 02:16:37PM +0400, Ilya Dryomov wrote: This reverts commit 34c6bc2c919a55e5ad4e698510a2f35ee13ab900. This commit can lead to deadlocks by way of what at a high level appears to look like a

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 04:37:29PM +0400, Ilya Dryomov wrote: This didn't make sense to me at first too, and I'll be happy to be proven wrong, but we can reproduce this with rbd very reliably under higher than usual load, and the revert makes it go away. What we are seeing in the rbd

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 5:13 PM, Peter Zijlstra pet...@infradead.org wrote: On Thu, Jul 31, 2014 at 04:37:29PM +0400, Ilya Dryomov wrote: This didn't make sense to me at first too, and I'll be happy to be proven wrong, but we can reproduce this with rbd very reliably under higher than usual

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Ingo Molnar
* Peter Zijlstra pet...@infradead.org wrote: On Thu, Jul 31, 2014 at 04:37:29PM +0400, Ilya Dryomov wrote: This didn't make sense to me at first too, and I'll be happy to be proven wrong, but we can reproduce this with rbd very reliably under higher than usual load, and the revert makes

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 03:44:11PM +0200, Ingo Molnar wrote: diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index ae712b25e492..3d726fdaa764 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -473,8 +473,12 @@ __mutex_lock_common(struct mutex *lock, long

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Mike Galbraith
On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: Smells like maybe current-state != TASK_RUNNING Bingo [ 1200.851004] kjournald D 0002 0 4398 2 0x [ 1200.858283] 8803537bb788 0046 8803537bb7a8 [ 1200.865914]

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Peter Zijlstra
On Thu, Jul 31, 2014 at 04:30:52PM +0200, Mike Galbraith wrote: On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: Smells like maybe current-state != TASK_RUNNING Bingo [ 1200.851004] kjournald D 0002 0 4398 2 0x [ 1200.858283]

Re: [PATCH] locking/mutexes: Revert locking/mutexes: Add extra reschedule point

2014-07-31 Thread Ilya Dryomov
On Thu, Jul 31, 2014 at 6:30 PM, Mike Galbraith umgwanakikb...@gmail.com wrote: On Thu, 2014-07-31 at 15:13 +0200, Peter Zijlstra wrote: Smells like maybe current-state != TASK_RUNNING It just triggered for me too, took longer than usual. Sorry for the churn Peter, this was really confusing.