Re: [PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-21 Thread Xunlei Pang
Hi Peter,

On 2016/04/20 at 21:49, Xunlei Pang wrote:
> On 2016/04/20 at 21:19, Peter Zijlstra wrote:
>> On Thu, Apr 14, 2016 at 07:37:03PM +0800, Xunlei Pang wrote:
>>> +   /* Updated under pi_lock and rtmutex lock */
>>> struct rb_node *pi_waiters_leftmost;
>>> +   struct rb_node *pi_waiters_leftmost_copy;
>>>  struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
>>>  {
>>> +   if (!task->pi_waiters_leftmost_copy)
>>> return NULL;
>>>  
>>> +   return rb_entry(task->pi_waiters_leftmost_copy,
>>> +   struct rt_mutex_waiter, pi_tree_entry)->task;
>>>  }
>> why ?! Why not keep a regular task_struct pointer and avoid this stuff?
> I meant to make it semantically consistent, but I can change it since you 
> think task_struct is better.

What do you think this version of PATCH1 and PATCH2?
If you are fine with it, I can sent it out as v4 separately, then we can focus 
on the issue in PATCH5.

Thanks!

>
> Regards,
> Xunlei



Re: [PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-21 Thread Xunlei Pang
Hi Peter,

On 2016/04/20 at 21:49, Xunlei Pang wrote:
> On 2016/04/20 at 21:19, Peter Zijlstra wrote:
>> On Thu, Apr 14, 2016 at 07:37:03PM +0800, Xunlei Pang wrote:
>>> +   /* Updated under pi_lock and rtmutex lock */
>>> struct rb_node *pi_waiters_leftmost;
>>> +   struct rb_node *pi_waiters_leftmost_copy;
>>>  struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
>>>  {
>>> +   if (!task->pi_waiters_leftmost_copy)
>>> return NULL;
>>>  
>>> +   return rb_entry(task->pi_waiters_leftmost_copy,
>>> +   struct rt_mutex_waiter, pi_tree_entry)->task;
>>>  }
>> why ?! Why not keep a regular task_struct pointer and avoid this stuff?
> I meant to make it semantically consistent, but I can change it since you 
> think task_struct is better.

What do you think this version of PATCH1 and PATCH2?
If you are fine with it, I can sent it out as v4 separately, then we can focus 
on the issue in PATCH5.

Thanks!

>
> Regards,
> Xunlei



Re: [PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-20 Thread Xunlei Pang
On 2016/04/20 at 21:19, Peter Zijlstra wrote:
> On Thu, Apr 14, 2016 at 07:37:03PM +0800, Xunlei Pang wrote:
>> +/* Updated under pi_lock and rtmutex lock */
>>  struct rb_node *pi_waiters_leftmost;
>> +struct rb_node *pi_waiters_leftmost_copy;
>>  struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
>>  {
>> +if (!task->pi_waiters_leftmost_copy)
>>  return NULL;
>>  
>> +return rb_entry(task->pi_waiters_leftmost_copy,
>> +struct rt_mutex_waiter, pi_tree_entry)->task;
>>  }
> why ?! Why not keep a regular task_struct pointer and avoid this stuff?

I meant to make it semantically consistent, but I can change it since you think 
task_struct is better.

Regards,
Xunlei


Re: [PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-20 Thread Xunlei Pang
On 2016/04/20 at 21:19, Peter Zijlstra wrote:
> On Thu, Apr 14, 2016 at 07:37:03PM +0800, Xunlei Pang wrote:
>> +/* Updated under pi_lock and rtmutex lock */
>>  struct rb_node *pi_waiters_leftmost;
>> +struct rb_node *pi_waiters_leftmost_copy;
>>  struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
>>  {
>> +if (!task->pi_waiters_leftmost_copy)
>>  return NULL;
>>  
>> +return rb_entry(task->pi_waiters_leftmost_copy,
>> +struct rt_mutex_waiter, pi_tree_entry)->task;
>>  }
> why ?! Why not keep a regular task_struct pointer and avoid this stuff?

I meant to make it semantically consistent, but I can change it since you think 
task_struct is better.

Regards,
Xunlei


Re: [PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-20 Thread Peter Zijlstra
On Thu, Apr 14, 2016 at 07:37:03PM +0800, Xunlei Pang wrote:
> + /* Updated under pi_lock and rtmutex lock */
>   struct rb_node *pi_waiters_leftmost;
> + struct rb_node *pi_waiters_leftmost_copy;

>  struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
>  {
> + if (!task->pi_waiters_leftmost_copy)
>   return NULL;
>  
> + return rb_entry(task->pi_waiters_leftmost_copy,
> + struct rt_mutex_waiter, pi_tree_entry)->task;
>  }

why ?! Why not keep a regular task_struct pointer and avoid this stuff?


Re: [PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-20 Thread Peter Zijlstra
On Thu, Apr 14, 2016 at 07:37:03PM +0800, Xunlei Pang wrote:
> + /* Updated under pi_lock and rtmutex lock */
>   struct rb_node *pi_waiters_leftmost;
> + struct rb_node *pi_waiters_leftmost_copy;

>  struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
>  {
> + if (!task->pi_waiters_leftmost_copy)
>   return NULL;
>  
> + return rb_entry(task->pi_waiters_leftmost_copy,
> + struct rt_mutex_waiter, pi_tree_entry)->task;
>  }

why ?! Why not keep a regular task_struct pointer and avoid this stuff?


[PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-14 Thread Xunlei Pang
A crash happened while I was playing with deadline PI rtmutex.

BUG: unable to handle kernel NULL pointer dereference at 0018
IP: [] rt_mutex_get_top_task+0x1f/0x30
PGD 232a75067 PUD 230947067 PMD 0
Oops:  [#1] SMP
CPU: 1 PID: 10994 Comm: a.out Not tainted

Call Trace:
[] ? enqueue_task_dl+0x2a/0x320
[] enqueue_task+0x2c/0x80
[] activate_task+0x23/0x30
[] pull_dl_task+0x1d5/0x260
[] pre_schedule_dl+0x16/0x20
[] __schedule+0xd3/0x900
[] schedule+0x29/0x70
[] __rt_mutex_slowlock+0x4b/0xc0
[] rt_mutex_slowlock+0xd1/0x190
[] rt_mutex_timed_lock+0x53/0x60
[] futex_lock_pi.isra.18+0x28c/0x390
[] ? enqueue_task_dl+0x195/0x320
[] ? prio_changed_dl+0x6c/0x90
[] do_futex+0x190/0x5b0
[] SyS_futex+0x80/0x180
[] system_call_fastpath+0x16/0x1b
RIP  [] rt_mutex_get_top_task+0x1f/0x30

This is because rt_mutex_enqueue_pi() and rt_mutex_dequeue_pi()
are only protected by pi_lock when operating pi waiters, while
rt_mutex_get_top_task()(now replaced by rt_mutex_get_top_waiter()
by previous patches, same issue), will access them with rq lock
held but not holding pi_lock.

In order to tackle it, we introduce new "pi_waiters_leftmost_copy"
in task_struct, and add a new function named rt_mutex_update_copy()
to do the copy, it can be called by rt_mutex_setprio() which held
owner's pi_lock, rq lock, and rt_mutex lock. But one exception is
that rt_mutex_setprio() can be called without rtmutex locked after
mark_wakeup_next_waiter() by rt_mutex_adjust_prio(), so we need a
new function in mark_wakeup_next_waiter() to lock rq first then
call rt_mutex_update_copy().

We avoid adding new logic by calling __rt_mutex_adjust_prio() directly
in mark_wakeup_next_waiter()and remove the old rt_mutex_adjust_prio()
outside the lock. Since we moved the deboost point, in order to avoid
current process to be preempted(due to deboost earlier) before finishing
wake_up_q(), we also move preempt_disable() before unlocking rtmutex.

Originally-From: Peter Zijlstra 
Signed-off-by: Xunlei Pang 
---
 include/linux/init_task.h  |  3 ++-
 include/linux/sched.h  |  3 +++
 include/linux/sched/deadline.h |  2 ++
 kernel/fork.c  |  1 +
 kernel/locking/rtmutex.c   | 61 +-
 kernel/sched/core.c|  2 ++
 6 files changed, 35 insertions(+), 37 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index f2cb8d4..7be5a83 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -162,7 +162,8 @@ extern struct task_group root_task_group;
 #ifdef CONFIG_RT_MUTEXES
 # define INIT_RT_MUTEXES(tsk)  \
.pi_waiters = RB_ROOT,  \
-   .pi_waiters_leftmost = NULL,
+   .pi_waiters_leftmost = NULL,\
+   .pi_waiters_leftmost_copy = NULL,
 #else
 # define INIT_RT_MUTEXES(tsk)
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 45e848c..332a6b5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1621,7 +1621,10 @@ struct task_struct {
 #ifdef CONFIG_RT_MUTEXES
/* PI waiters blocked on a rt_mutex held by this task */
struct rb_root pi_waiters;
+   /* Updated under pi_lock and rtmutex lock */
struct rb_node *pi_waiters_leftmost;
+   /* Updated under pi_lock, rq_lock, and rtmutex lock */
+   struct rb_node *pi_waiters_leftmost_copy;
/* Deadlock detection and priority inheritance handling */
struct rt_mutex_waiter *pi_blocked_on;
 #endif
diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 9089a2a..e8304d4 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -26,4 +26,6 @@ static inline bool dl_time_before(u64 a, u64 b)
return (s64)(a - b) < 0;
 }
 
+extern void rt_mutex_update_copy(struct task_struct *p);
+
 #endif /* _SCHED_DEADLINE_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index a453546..4b847e4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1225,6 +1225,7 @@ static void rt_mutex_init_task(struct task_struct *p)
 #ifdef CONFIG_RT_MUTEXES
p->pi_waiters = RB_ROOT;
p->pi_waiters_leftmost = NULL;
+   p->pi_waiters_leftmost_copy = NULL;
p->pi_blocked_on = NULL;
 #endif
 }
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 42bc59b..00c6560 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -273,10 +273,11 @@ int rt_mutex_getprio(struct task_struct *task)
 
 struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
 {
-   if (likely(!task_has_pi_waiters(task)))
+   if (!task->pi_waiters_leftmost_copy)
return NULL;
 
-   return task_top_pi_waiter(task)->task;
+   return rb_entry(task->pi_waiters_leftmost_copy,
+ 

[PATCH v3 2/6] sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2016-04-14 Thread Xunlei Pang
A crash happened while I was playing with deadline PI rtmutex.

BUG: unable to handle kernel NULL pointer dereference at 0018
IP: [] rt_mutex_get_top_task+0x1f/0x30
PGD 232a75067 PUD 230947067 PMD 0
Oops:  [#1] SMP
CPU: 1 PID: 10994 Comm: a.out Not tainted

Call Trace:
[] ? enqueue_task_dl+0x2a/0x320
[] enqueue_task+0x2c/0x80
[] activate_task+0x23/0x30
[] pull_dl_task+0x1d5/0x260
[] pre_schedule_dl+0x16/0x20
[] __schedule+0xd3/0x900
[] schedule+0x29/0x70
[] __rt_mutex_slowlock+0x4b/0xc0
[] rt_mutex_slowlock+0xd1/0x190
[] rt_mutex_timed_lock+0x53/0x60
[] futex_lock_pi.isra.18+0x28c/0x390
[] ? enqueue_task_dl+0x195/0x320
[] ? prio_changed_dl+0x6c/0x90
[] do_futex+0x190/0x5b0
[] SyS_futex+0x80/0x180
[] system_call_fastpath+0x16/0x1b
RIP  [] rt_mutex_get_top_task+0x1f/0x30

This is because rt_mutex_enqueue_pi() and rt_mutex_dequeue_pi()
are only protected by pi_lock when operating pi waiters, while
rt_mutex_get_top_task()(now replaced by rt_mutex_get_top_waiter()
by previous patches, same issue), will access them with rq lock
held but not holding pi_lock.

In order to tackle it, we introduce new "pi_waiters_leftmost_copy"
in task_struct, and add a new function named rt_mutex_update_copy()
to do the copy, it can be called by rt_mutex_setprio() which held
owner's pi_lock, rq lock, and rt_mutex lock. But one exception is
that rt_mutex_setprio() can be called without rtmutex locked after
mark_wakeup_next_waiter() by rt_mutex_adjust_prio(), so we need a
new function in mark_wakeup_next_waiter() to lock rq first then
call rt_mutex_update_copy().

We avoid adding new logic by calling __rt_mutex_adjust_prio() directly
in mark_wakeup_next_waiter()and remove the old rt_mutex_adjust_prio()
outside the lock. Since we moved the deboost point, in order to avoid
current process to be preempted(due to deboost earlier) before finishing
wake_up_q(), we also move preempt_disable() before unlocking rtmutex.

Originally-From: Peter Zijlstra 
Signed-off-by: Xunlei Pang 
---
 include/linux/init_task.h  |  3 ++-
 include/linux/sched.h  |  3 +++
 include/linux/sched/deadline.h |  2 ++
 kernel/fork.c  |  1 +
 kernel/locking/rtmutex.c   | 61 +-
 kernel/sched/core.c|  2 ++
 6 files changed, 35 insertions(+), 37 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index f2cb8d4..7be5a83 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -162,7 +162,8 @@ extern struct task_group root_task_group;
 #ifdef CONFIG_RT_MUTEXES
 # define INIT_RT_MUTEXES(tsk)  \
.pi_waiters = RB_ROOT,  \
-   .pi_waiters_leftmost = NULL,
+   .pi_waiters_leftmost = NULL,\
+   .pi_waiters_leftmost_copy = NULL,
 #else
 # define INIT_RT_MUTEXES(tsk)
 #endif
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 45e848c..332a6b5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1621,7 +1621,10 @@ struct task_struct {
 #ifdef CONFIG_RT_MUTEXES
/* PI waiters blocked on a rt_mutex held by this task */
struct rb_root pi_waiters;
+   /* Updated under pi_lock and rtmutex lock */
struct rb_node *pi_waiters_leftmost;
+   /* Updated under pi_lock, rq_lock, and rtmutex lock */
+   struct rb_node *pi_waiters_leftmost_copy;
/* Deadlock detection and priority inheritance handling */
struct rt_mutex_waiter *pi_blocked_on;
 #endif
diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 9089a2a..e8304d4 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -26,4 +26,6 @@ static inline bool dl_time_before(u64 a, u64 b)
return (s64)(a - b) < 0;
 }
 
+extern void rt_mutex_update_copy(struct task_struct *p);
+
 #endif /* _SCHED_DEADLINE_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index a453546..4b847e4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1225,6 +1225,7 @@ static void rt_mutex_init_task(struct task_struct *p)
 #ifdef CONFIG_RT_MUTEXES
p->pi_waiters = RB_ROOT;
p->pi_waiters_leftmost = NULL;
+   p->pi_waiters_leftmost_copy = NULL;
p->pi_blocked_on = NULL;
 #endif
 }
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 42bc59b..00c6560 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -273,10 +273,11 @@ int rt_mutex_getprio(struct task_struct *task)
 
 struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
 {
-   if (likely(!task_has_pi_waiters(task)))
+   if (!task->pi_waiters_leftmost_copy)
return NULL;
 
-   return task_top_pi_waiter(task)->task;
+   return rb_entry(task->pi_waiters_leftmost_copy,
+   struct