Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-03-09 Thread Sebastian Andrzej Siewior
On 03/09/2015 05:36 PM, Steven Rostedt wrote:
> BTW, I'm going to start with 3.18-rt1 and see what's been added to the
> other -rt updates. If there's something I need that was added to
> 3.18-rt1 can you let me know. That is, if it wasn't marked with a
> stable-rt tag. My scripts will find those.

I tried to create new patch files for new things and mark them stable
where possible.
I didn't mark the "simple-work" (a workqueue based on swait to avoid a
new kernel thread for each of those things where we need to schedule a
workqueue or something like that from atomic context, like MCE) with cc
stable.

The patch where I reverted this timer thingy is
  Revert-timers-do-not-raise-softirq-unconditionally.patch

and it was due to hrtimer beeing broken on 3.18 and has no stable tag.

I'm not sure about the multi-queue block stuff (*-mq-* in the queue).
You might want take a look. I updated a few patches because I managed
to create deadlocks. I'm not sure if it was possible deadlock in v3.14
and I simply didn't trigger if the code changed and it become possible.
I remember that the multi queue block code entered v3.12 but was unused
and started with v3.14 it gained a user.

> 
> -- Steve

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-03-09 Thread Steven Rostedt
On Fri, 6 Mar 2015 13:19:21 +0100
Sebastian Andrzej Siewior  wrote:

> * Steven Rostedt | 2015-02-26 09:06:10 [-0500]:
> 
> >If we can pull that off and remove all rtmutex trylocks from hardirq
> >context, I would much rather do that.
> >
> >This hocus pocus coding is just going to lead us down the path of the
> >black arts. I already have a black cat, so I'm good to go.
> 
> Okay. So what is the fix for v3.14 and other trees where the rtmutex
> reworked has been integrated? Revert the optimization like I did in
> v3.18?

Probably.

I'm going to start looking at what's in 3.18 and start porting changes
to 3.14 and earlier. I've only been updating the stable kernels to pull
in the mainline stable updates. I haven't reverted anything (unless
mainline stable caused things to be obsolete), nor have I added any
updates.

BTW, I'm going to start with 3.18-rt1 and see what's been added to the
other -rt updates. If there's something I need that was added to
3.18-rt1 can you let me know. That is, if it wasn't marked with a
stable-rt tag. My scripts will find those.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-03-09 Thread Sebastian Andrzej Siewior
On 03/09/2015 05:36 PM, Steven Rostedt wrote:
 BTW, I'm going to start with 3.18-rt1 and see what's been added to the
 other -rt updates. If there's something I need that was added to
 3.18-rt1 can you let me know. That is, if it wasn't marked with a
 stable-rt tag. My scripts will find those.

I tried to create new patch files for new things and mark them stable
where possible.
I didn't mark the simple-work (a workqueue based on swait to avoid a
new kernel thread for each of those things where we need to schedule a
workqueue or something like that from atomic context, like MCE) with cc
stable.

The patch where I reverted this timer thingy is
  Revert-timers-do-not-raise-softirq-unconditionally.patch

and it was due to hrtimer beeing broken on 3.18 and has no stable tag.

I'm not sure about the multi-queue block stuff (*-mq-* in the queue).
You might want take a look. I updated a few patches because I managed
to create deadlocks. I'm not sure if it was possible deadlock in v3.14
and I simply didn't trigger if the code changed and it become possible.
I remember that the multi queue block code entered v3.12 but was unused
and started with v3.14 it gained a user.

 
 -- Steve

Sebastian
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-03-09 Thread Steven Rostedt
On Fri, 6 Mar 2015 13:19:21 +0100
Sebastian Andrzej Siewior bige...@linutronix.de wrote:

 * Steven Rostedt | 2015-02-26 09:06:10 [-0500]:
 
 If we can pull that off and remove all rtmutex trylocks from hardirq
 context, I would much rather do that.
 
 This hocus pocus coding is just going to lead us down the path of the
 black arts. I already have a black cat, so I'm good to go.
 
 Okay. So what is the fix for v3.14 and other trees where the rtmutex
 reworked has been integrated? Revert the optimization like I did in
 v3.18?

Probably.

I'm going to start looking at what's in 3.18 and start porting changes
to 3.14 and earlier. I've only been updating the stable kernels to pull
in the mainline stable updates. I haven't reverted anything (unless
mainline stable caused things to be obsolete), nor have I added any
updates.

BTW, I'm going to start with 3.18-rt1 and see what's been added to the
other -rt updates. If there's something I need that was added to
3.18-rt1 can you let me know. That is, if it wasn't marked with a
stable-rt tag. My scripts will find those.

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-03-06 Thread Sebastian Andrzej Siewior
* Steven Rostedt | 2015-02-26 09:06:10 [-0500]:

>If we can pull that off and remove all rtmutex trylocks from hardirq
>context, I would much rather do that.
>
>This hocus pocus coding is just going to lead us down the path of the
>black arts. I already have a black cat, so I'm good to go.

Okay. So what is the fix for v3.14 and other trees where the rtmutex
reworked has been integrated? Revert the optimization like I did in
v3.18?

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-03-06 Thread Sebastian Andrzej Siewior
* Steven Rostedt | 2015-02-26 09:06:10 [-0500]:

If we can pull that off and remove all rtmutex trylocks from hardirq
context, I would much rather do that.

This hocus pocus coding is just going to lead us down the path of the
black arts. I already have a black cat, so I'm good to go.

Okay. So what is the fix for v3.14 and other trees where the rtmutex
reworked has been integrated? Revert the optimization like I did in
v3.18?

Sebastian
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-26 Thread Steven Rostedt
On Thu, 26 Feb 2015 14:56:30 +0100
Sebastian Andrzej Siewior  wrote:

> I am not sure if we want keep doing that. The only reason why we grab
> the lock in the first place was to check if there is a timer pending
> and we run on the isolated CPU. It should not matter for the other CPUs,
> right?
> So instead going further that road, what about storing base->next_timer
> someplace so it can be obtained via atomic_read() for the isolated CPUs?
> 

If we can pull that off and remove all rtmutex trylocks from hardirq
context, I would much rather do that.

This hocus pocus coding is just going to lead us down the path of the
black arts. I already have a black cat, so I'm good to go.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-26 Thread Sebastian Andrzej Siewior
* Steven Rostedt | 2015-02-23 19:57:43 [-0500]:

>On Mon, 23 Feb 2015 17:16:27 -0700
>Thavatchai Makphaibulchoke  wrote:
>> If I'm not mistaken, another reason could also be due to the rate of the
>> timer interrupt, in the case that the mutex is highly contested IH could
>> stall the non-real-time requester for a long time, even to the point of
>> the cpu is perceived as hung.
>
>Perhaps we should just have trylocks fail if there are other waiters.
>As it wont slow down the high priority task. And doing that would
>probably even help other rt tasks that are blocked on the lock waiting
>for a release. Why make those tasks wait more if even a higher priority
>task is simply doing a trylock and can safely fail it. At least we
>could do that if the task trying to get the lock is a interrupt.

What happened so far? The events I remember:
- we gained FULL_NO_HZ
- people started to isolate CPUs and push their work / RT tasks there
- it has been noticed that the kernel is raising the timer softirq even
  there is nothing going on once the softirq was started.
- tglx came with a patch which could go mainline if solves the problem.
- this patch did not make its way upstream (yet) and I pulled it into
  -RT. Isn't this a problem in mainline, too? Why isn't there anyone
  screaming?
- we had dead locks because the inner-lock of the sleeping was not safe
  to be used from hardirq context. #1
- we had boxes freezing on startup and not making progress due to missed
  irq_work wakeups. #2
- we had a deadlock splat on UP because the trylock failed. This was
  fixed by only allowing this feature on SMP (since it only makes sense
  with isolated CPUs). #3
- Now since the rtmutex rework we have dead lock splats which BUG()s the
  systems. #4

The four problems we had/have so far are -RT specific but still
plainfull when I think back.
rtmutex wasn't made to be accessed from hardirq context. That is where we
use the rawlocks. One problem that we still have and Peter pointer out
around #1 is about owner boosting if the lock is held in hardirq context
and the wrong owner is recorded. This problem was ignored so far.

Using a fake task as you suggest in irq context and ignoring would
somehow fix the boosting problem and avoid the deadlock we see now.

I am not sure if we want keep doing that. The only reason why we grab
the lock in the first place was to check if there is a timer pending
and we run on the isolated CPU. It should not matter for the other CPUs,
right?
So instead going further that road, what about storing base->next_timer
someplace so it can be obtained via atomic_read() for the isolated CPUs?

>-- Steve

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-26 Thread Steven Rostedt
On Thu, 26 Feb 2015 14:56:30 +0100
Sebastian Andrzej Siewior bige...@linutronix.de wrote:

 I am not sure if we want keep doing that. The only reason why we grab
 the lock in the first place was to check if there is a timer pending
 and we run on the isolated CPU. It should not matter for the other CPUs,
 right?
 So instead going further that road, what about storing base-next_timer
 someplace so it can be obtained via atomic_read() for the isolated CPUs?
 

If we can pull that off and remove all rtmutex trylocks from hardirq
context, I would much rather do that.

This hocus pocus coding is just going to lead us down the path of the
black arts. I already have a black cat, so I'm good to go.

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-26 Thread Sebastian Andrzej Siewior
* Steven Rostedt | 2015-02-23 19:57:43 [-0500]:

On Mon, 23 Feb 2015 17:16:27 -0700
Thavatchai Makphaibulchoke thavatchai.makpahibulch...@hp.com wrote:
 If I'm not mistaken, another reason could also be due to the rate of the
 timer interrupt, in the case that the mutex is highly contested IH could
 stall the non-real-time requester for a long time, even to the point of
 the cpu is perceived as hung.

Perhaps we should just have trylocks fail if there are other waiters.
As it wont slow down the high priority task. And doing that would
probably even help other rt tasks that are blocked on the lock waiting
for a release. Why make those tasks wait more if even a higher priority
task is simply doing a trylock and can safely fail it. At least we
could do that if the task trying to get the lock is a interrupt.

What happened so far? The events I remember:
- we gained FULL_NO_HZ
- people started to isolate CPUs and push their work / RT tasks there
- it has been noticed that the kernel is raising the timer softirq even
  there is nothing going on once the softirq was started.
- tglx came with a patch which could go mainline if solves the problem.
- this patch did not make its way upstream (yet) and I pulled it into
  -RT. Isn't this a problem in mainline, too? Why isn't there anyone
  screaming?
- we had dead locks because the inner-lock of the sleeping was not safe
  to be used from hardirq context. #1
- we had boxes freezing on startup and not making progress due to missed
  irq_work wakeups. #2
- we had a deadlock splat on UP because the trylock failed. This was
  fixed by only allowing this feature on SMP (since it only makes sense
  with isolated CPUs). #3
- Now since the rtmutex rework we have dead lock splats which BUG()s the
  systems. #4

The four problems we had/have so far are -RT specific but still
plainfull when I think back.
rtmutex wasn't made to be accessed from hardirq context. That is where we
use the rawlocks. One problem that we still have and Peter pointer out
around #1 is about owner boosting if the lock is held in hardirq context
and the wrong owner is recorded. This problem was ignored so far.

Using a fake task as you suggest in irq context and ignoring would
somehow fix the boosting problem and avoid the deadlock we see now.

I am not sure if we want keep doing that. The only reason why we grab
the lock in the first place was to check if there is a timer pending
and we run on the isolated CPU. It should not matter for the other CPUs,
right?
So instead going further that road, what about storing base-next_timer
someplace so it can be obtained via atomic_read() for the isolated CPUs?

-- Steve

Sebastian
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-23 Thread Steven Rostedt
On Mon, 23 Feb 2015 17:16:27 -0700
Thavatchai Makphaibulchoke  wrote:


> Thanks again for the comments and suggestion.
> 
> Yes, creating a per cpu fake task was one of the alternative considered.
>  I believe one of the reasons I did not purse is the amount of extra
> storage it requires (sizeof(struct task_struct) * number of cpus.
> Though the changes may not be as intrusive as the one I sent, some are
> still required, mainly with current (one in particular came to mind is
> in wakeup_next-watier()).

Yeah, that's a draw back, but still. Big CPU machines should have big
memory. If we really are concerned, we could tidy up task_struct a bit,
rearranging the fields to make sure like fields are together (helps
with cache too) and move the RT stuff up further. Then we could
allocate just enough to cover all the task struct fields that are
accessed in the rtmutex code.


> 
> If I'm not mistaken, another reason could also be due to the rate of the
> timer interrupt, in the case that the mutex is highly contested IH could
> stall the non-real-time requester for a long time, even to the point of
> the cpu is perceived as hung.

Perhaps we should just have trylocks fail if there are other waiters.
As it wont slow down the high priority task. And doing that would
probably even help other rt tasks that are blocked on the lock waiting
for a release. Why make those tasks wait more if even a higher priority
task is simply doing a trylock and can safely fail it. At least we
could do that if the task trying to get the lock is a interrupt.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-23 Thread Thavatchai Makphaibulchoke


On 02/23/2015 11:37 AM, Steven Rostedt wrote:
> 
> OK, I believe I understand the issue. Perhaps it would be much better
> to create a fake task per CPU that we use when grabbing locks in
> interrupt mode. And make these have a priority of 0 (highest), since
> they can not be preempted, they do have such a priority.
> 
> Then in the fast trylock and unlock code, we can add:
> 
>   struct task_struct *curr = current;
> 
>   if (unlikely(in_irq()))
>   curr = this_cpu_read(irq_task);
> 
> This way the priority inheritance will stop when it hits this task (no
> need to boost a task of highest priority), and we can leave that code
> alone.
> 

Thanks again for the comments and suggestion.

Yes, creating a per cpu fake task was one of the alternative considered.
 I believe one of the reasons I did not purse is the amount of extra
storage it requires (sizeof(struct task_struct) * number of cpus.
Though the changes may not be as intrusive as the one I sent, some are
still required, mainly with current (one in particular came to mind is
in wakeup_next-watier()).

If I'm not mistaken, another reason could also be due to the rate of the
timer interrupt, in the case that the mutex is highly contested IH could
stall the non-real-time requester for a long time, even to the point of
the cpu is perceived as hung.

Anyway, I'll retry the fake task approach a try and report back if there
is any issue.

Thanks,
Mak.


> -- Steve
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-23 Thread Steven Rostedt
On Thu, 19 Feb 2015 18:31:05 -0700
Thavatchai Makphaibulchoke  wrote:

> This patch fixes the problem that the ownership of a mutex acquired by an
> interrupt handler(IH) gets incorrectly attributed to the interrupted thread.
> 
> This could result in an incorrect deadlock detection in function
> rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly leading
> up to a system hang.
> 
> Here is the approach taken: when calling from an interrupt handler, instead of
> attributing ownership to the interrupted task, use a reserved task_struct 
> value
> to indicate that the owner is a interrupt handler.  This approach avoids the
> incorrect deadlock detection.
> 
> This also includes changes in several function in rtmutex.c now that the 
> lock's
> requester may be a interrupt handler, not a real task struct.  This impacts
> the way how the lock is acquired and prioritized and decision whether to do
> the house keeping functions required for a real task struct.
> 
> The reserved task_struct values for interrupt handler are
> 
>   current | 0x2
> 
> where current is the task_struct value of the interrupted task.
> 
> Since IH will both acquire and release the lock only during an interrupt
> handling, during which current is not changed, the reserved task_struct value
> for an IH should be distinct from another instances of IH on a different cpu.
> 
> Kernel version 3.14.25 + patch-3.14.25-rt22
> 
> Signed-off-by: T. Makphaibulchoke 

OK, I believe I understand the issue. Perhaps it would be much better
to create a fake task per CPU that we use when grabbing locks in
interrupt mode. And make these have a priority of 0 (highest), since
they can not be preempted, they do have such a priority.

Then in the fast trylock and unlock code, we can add:

struct task_struct *curr = current;

if (unlikely(in_irq()))
curr = this_cpu_read(irq_task);

This way the priority inheritance will stop when it hits this task (no
need to boost a task of highest priority), and we can leave that code
alone.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-23 Thread Steven Rostedt
On Thu, 19 Feb 2015 18:31:05 -0700
Thavatchai Makphaibulchoke t...@hp.com wrote:

 This patch fixes the problem that the ownership of a mutex acquired by an
 interrupt handler(IH) gets incorrectly attributed to the interrupted thread.
 
 This could result in an incorrect deadlock detection in function
 rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly leading
 up to a system hang.
 
 Here is the approach taken: when calling from an interrupt handler, instead of
 attributing ownership to the interrupted task, use a reserved task_struct 
 value
 to indicate that the owner is a interrupt handler.  This approach avoids the
 incorrect deadlock detection.
 
 This also includes changes in several function in rtmutex.c now that the 
 lock's
 requester may be a interrupt handler, not a real task struct.  This impacts
 the way how the lock is acquired and prioritized and decision whether to do
 the house keeping functions required for a real task struct.
 
 The reserved task_struct values for interrupt handler are
 
   current | 0x2
 
 where current is the task_struct value of the interrupted task.
 
 Since IH will both acquire and release the lock only during an interrupt
 handling, during which current is not changed, the reserved task_struct value
 for an IH should be distinct from another instances of IH on a different cpu.
 
 Kernel version 3.14.25 + patch-3.14.25-rt22
 
 Signed-off-by: T. Makphaibulchoke t...@hp.com

OK, I believe I understand the issue. Perhaps it would be much better
to create a fake task per CPU that we use when grabbing locks in
interrupt mode. And make these have a priority of 0 (highest), since
they can not be preempted, they do have such a priority.

Then in the fast trylock and unlock code, we can add:

struct task_struct *curr = current;

if (unlikely(in_irq()))
curr = this_cpu_read(irq_task);

This way the priority inheritance will stop when it hits this task (no
need to boost a task of highest priority), and we can leave that code
alone.

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-23 Thread Steven Rostedt
On Mon, 23 Feb 2015 17:16:27 -0700
Thavatchai Makphaibulchoke thavatchai.makpahibulch...@hp.com wrote:


 Thanks again for the comments and suggestion.
 
 Yes, creating a per cpu fake task was one of the alternative considered.
  I believe one of the reasons I did not purse is the amount of extra
 storage it requires (sizeof(struct task_struct) * number of cpus.
 Though the changes may not be as intrusive as the one I sent, some are
 still required, mainly with current (one in particular came to mind is
 in wakeup_next-watier()).

Yeah, that's a draw back, but still. Big CPU machines should have big
memory. If we really are concerned, we could tidy up task_struct a bit,
rearranging the fields to make sure like fields are together (helps
with cache too) and move the RT stuff up further. Then we could
allocate just enough to cover all the task struct fields that are
accessed in the rtmutex code.


 
 If I'm not mistaken, another reason could also be due to the rate of the
 timer interrupt, in the case that the mutex is highly contested IH could
 stall the non-real-time requester for a long time, even to the point of
 the cpu is perceived as hung.

Perhaps we should just have trylocks fail if there are other waiters.
As it wont slow down the high priority task. And doing that would
probably even help other rt tasks that are blocked on the lock waiting
for a release. Why make those tasks wait more if even a higher priority
task is simply doing a trylock and can safely fail it. At least we
could do that if the task trying to get the lock is a interrupt.

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-23 Thread Thavatchai Makphaibulchoke


On 02/23/2015 11:37 AM, Steven Rostedt wrote:
 
 OK, I believe I understand the issue. Perhaps it would be much better
 to create a fake task per CPU that we use when grabbing locks in
 interrupt mode. And make these have a priority of 0 (highest), since
 they can not be preempted, they do have such a priority.
 
 Then in the fast trylock and unlock code, we can add:
 
   struct task_struct *curr = current;
 
   if (unlikely(in_irq()))
   curr = this_cpu_read(irq_task);
 
 This way the priority inheritance will stop when it hits this task (no
 need to boost a task of highest priority), and we can leave that code
 alone.
 

Thanks again for the comments and suggestion.

Yes, creating a per cpu fake task was one of the alternative considered.
 I believe one of the reasons I did not purse is the amount of extra
storage it requires (sizeof(struct task_struct) * number of cpus.
Though the changes may not be as intrusive as the one I sent, some are
still required, mainly with current (one in particular came to mind is
in wakeup_next-watier()).

If I'm not mistaken, another reason could also be due to the rate of the
timer interrupt, in the case that the mutex is highly contested IH could
stall the non-real-time requester for a long time, even to the point of
the cpu is perceived as hung.

Anyway, I'll retry the fake task approach a try and report back if there
is any issue.

Thanks,
Mak.


 -- Steve
 --
 To unsubscribe from this list: send the line unsubscribe linux-rt-users in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-20 Thread Steven Rostedt
On Fri, 20 Feb 2015 11:54:45 -0700
Thavatchai Makphaibulchoke  wrote:

> Sorry for not explaining the problem in more details.
> 
> IH here means the bottom half of interrupt handler, executing in the
> interrupt context (IC), not the preemptible interrupt kernel thread.
> interrupt.
> 
> Here is the problem we encountered.
> 
> An smp_apic_timer_interrupt comes in while task X is in the process of
> waiting for mutex A .  The IH successfully locks mutex B (in this case
> run_local_timers() gets the timer base's lock, base->lock, via
> spin_trylock()).
> 
> At the same time, task Y holding mutex A requests mutex B.
> 
> With current rtmutex code, mutex B ownership is incorrectly attributed
> to task X (using current, which is inaccurate in the IC). To task Y the
> situation effectively looks like it is holding mutex A and reuqesting B,
> which is held by task X holding mutex B and is now waiting for mutex A.
>  The deadlock detection is correct, a classic potential circular mutex
> deadlock.
> 
> In reality, it is not.  The IH the actual owner of mutex B will
> eventually completes and releases mutex B and task Y will eventually get
> mutex B and proceed and so will task X.  Actually either deleting or
> changing BUG_ON(ret) to WARN_ON(ret) in line 997 in fucntion
> rt_spin_lock_slowlock(), the test ran fine without any problem.
> 

Ah, I see the problem you have. Let me explain it in my own words to
make sure that you and I are on the same page.

Task X tries to grab rt_mutex A, but it's held by task Y. But as Y is
still running, the adaptive mutex code is in play and task X is
spinning (with it's blocked on A set). An interrupt comes in,
preempting task X and does a trylock on rt_mutex B, and succeeds. Now it
looks like task X has mutex B and is blocked on mutex A.

Task Y tries to take mutex B and sees that is held by task X which is
blocked on mutex A which Y owns. Thus you get a false deadlock detect.

I'm I correct?

Now, the question is, can we safely change the ownership of mutex B in
the interrupt context where it wont cause another side effect?


> A more detailed description of the problem could also be found at,
> 
> http://markmail.org/message/np33it233hoot4b2#query:+page:1+mid:np33it233hoot4b2+state:results
> 
> 
> Please let me know what you think or need any additional info.
> 

I haven't looked at the above link. I'll have to think about this some
more, but as I'm currently traveling, it will have to be done sometime
next week. Feel free top ping me on Monday or Tuesday. Tuesday would
probably be better, as I'm sure I'm going to be overloaded with other
work when I get back to my office on Monday.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-20 Thread Thavatchai Makphaibulchoke

On 02/19/2015 09:53 PM, Steven Rostedt wrote:
> On Thu, 19 Feb 2015 18:31:05 -0700
> Thavatchai Makphaibulchoke  wrote:
> 
>> This patch fixes the problem that the ownership of a mutex acquired by an
>> interrupt handler(IH) gets incorrectly attributed to the interrupted thread.
> 
> *blink*
> 
>>
>> This could result in an incorrect deadlock detection in function
>> rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly 
>> leading
>> up to a system hang.
> 
> I highly doubt this is an incorrect deadlock that was detected. My
> money is on a real deadlock that happened.
> 
>>
>> Here is the approach taken: when calling from an interrupt handler, instead 
>> of
>> attributing ownership to the interrupted task, use a reserved task_struct 
>> value
>> to indicate that the owner is a interrupt handler.  This approach avoids the
>> incorrect deadlock detection.
> 
> How is this an incorrect deadlock? Please explain.
> 

Thanks for the comments.

Sorry for not explaining the problem in more details.

IH here means the bottom half of interrupt handler, executing in the
interrupt context (IC), not the preemptible interrupt kernel thread.
interrupt.

Here is the problem we encountered.

An smp_apic_timer_interrupt comes in while task X is in the process of
waiting for mutex A .  The IH successfully locks mutex B (in this case
run_local_timers() gets the timer base's lock, base->lock, via
spin_trylock()).

At the same time, task Y holding mutex A requests mutex B.

With current rtmutex code, mutex B ownership is incorrectly attributed
to task X (using current, which is inaccurate in the IC). To task Y the
situation effectively looks like it is holding mutex A and reuqesting B,
which is held by task X holding mutex B and is now waiting for mutex A.
 The deadlock detection is correct, a classic potential circular mutex
deadlock.

In reality, it is not.  The IH the actual owner of mutex B will
eventually completes and releases mutex B and task Y will eventually get
mutex B and proceed and so will task X.  Actually either deleting or
changing BUG_ON(ret) to WARN_ON(ret) in line 997 in fucntion
rt_spin_lock_slowlock(), the test ran fine without any problem.

A more detailed description of the problem could also be found at,

http://markmail.org/message/np33it233hoot4b2#query:+page:1+mid:np33it233hoot4b2+state:results


Please let me know what you think or need any additional info.

Thanks,
Mak.

>>
>> This also includes changes in several function in rtmutex.c now that the 
>> lock's
>> requester may be a interrupt handler, not a real task struct.  This impacts
>> the way how the lock is acquired and prioritized and decision whether to do
>> the house keeping functions required for a real task struct.
>>
>> The reserved task_struct values for interrupt handler are
>>
>>  current | 0x2
>>
>> where current is the task_struct value of the interrupted task.
>>
>> Since IH will both acquire and release the lock only during an interrupt
>> handling, during which current is not changed, the reserved task_struct value
>> for an IH should be distinct from another instances of IH on a different cpu.
>>
> 
> The interrupt handler is a thread just like any other task. It's not
> special. If there was a deadlock detected, it most likely means that a
> deadlock exists.
> 
> -- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-20 Thread Steven Rostedt
On Fri, 20 Feb 2015 11:54:45 -0700
Thavatchai Makphaibulchoke thavatchai.makpahibulch...@hp.com wrote:

 Sorry for not explaining the problem in more details.
 
 IH here means the bottom half of interrupt handler, executing in the
 interrupt context (IC), not the preemptible interrupt kernel thread.
 interrupt.
 
 Here is the problem we encountered.
 
 An smp_apic_timer_interrupt comes in while task X is in the process of
 waiting for mutex A .  The IH successfully locks mutex B (in this case
 run_local_timers() gets the timer base's lock, base-lock, via
 spin_trylock()).
 
 At the same time, task Y holding mutex A requests mutex B.
 
 With current rtmutex code, mutex B ownership is incorrectly attributed
 to task X (using current, which is inaccurate in the IC). To task Y the
 situation effectively looks like it is holding mutex A and reuqesting B,
 which is held by task X holding mutex B and is now waiting for mutex A.
  The deadlock detection is correct, a classic potential circular mutex
 deadlock.
 
 In reality, it is not.  The IH the actual owner of mutex B will
 eventually completes and releases mutex B and task Y will eventually get
 mutex B and proceed and so will task X.  Actually either deleting or
 changing BUG_ON(ret) to WARN_ON(ret) in line 997 in fucntion
 rt_spin_lock_slowlock(), the test ran fine without any problem.
 

Ah, I see the problem you have. Let me explain it in my own words to
make sure that you and I are on the same page.

Task X tries to grab rt_mutex A, but it's held by task Y. But as Y is
still running, the adaptive mutex code is in play and task X is
spinning (with it's blocked on A set). An interrupt comes in,
preempting task X and does a trylock on rt_mutex B, and succeeds. Now it
looks like task X has mutex B and is blocked on mutex A.

Task Y tries to take mutex B and sees that is held by task X which is
blocked on mutex A which Y owns. Thus you get a false deadlock detect.

I'm I correct?

Now, the question is, can we safely change the ownership of mutex B in
the interrupt context where it wont cause another side effect?


 A more detailed description of the problem could also be found at,
 
 http://markmail.org/message/np33it233hoot4b2#query:+page:1+mid:np33it233hoot4b2+state:results
 
 
 Please let me know what you think or need any additional info.
 

I haven't looked at the above link. I'll have to think about this some
more, but as I'm currently traveling, it will have to be done sometime
next week. Feel free top ping me on Monday or Tuesday. Tuesday would
probably be better, as I'm sure I'm going to be overloaded with other
work when I get back to my office on Monday.

-- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-20 Thread Thavatchai Makphaibulchoke

On 02/19/2015 09:53 PM, Steven Rostedt wrote:
 On Thu, 19 Feb 2015 18:31:05 -0700
 Thavatchai Makphaibulchoke t...@hp.com wrote:
 
 This patch fixes the problem that the ownership of a mutex acquired by an
 interrupt handler(IH) gets incorrectly attributed to the interrupted thread.
 
 *blink*
 

 This could result in an incorrect deadlock detection in function
 rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly 
 leading
 up to a system hang.
 
 I highly doubt this is an incorrect deadlock that was detected. My
 money is on a real deadlock that happened.
 

 Here is the approach taken: when calling from an interrupt handler, instead 
 of
 attributing ownership to the interrupted task, use a reserved task_struct 
 value
 to indicate that the owner is a interrupt handler.  This approach avoids the
 incorrect deadlock detection.
 
 How is this an incorrect deadlock? Please explain.
 

Thanks for the comments.

Sorry for not explaining the problem in more details.

IH here means the bottom half of interrupt handler, executing in the
interrupt context (IC), not the preemptible interrupt kernel thread.
interrupt.

Here is the problem we encountered.

An smp_apic_timer_interrupt comes in while task X is in the process of
waiting for mutex A .  The IH successfully locks mutex B (in this case
run_local_timers() gets the timer base's lock, base-lock, via
spin_trylock()).

At the same time, task Y holding mutex A requests mutex B.

With current rtmutex code, mutex B ownership is incorrectly attributed
to task X (using current, which is inaccurate in the IC). To task Y the
situation effectively looks like it is holding mutex A and reuqesting B,
which is held by task X holding mutex B and is now waiting for mutex A.
 The deadlock detection is correct, a classic potential circular mutex
deadlock.

In reality, it is not.  The IH the actual owner of mutex B will
eventually completes and releases mutex B and task Y will eventually get
mutex B and proceed and so will task X.  Actually either deleting or
changing BUG_ON(ret) to WARN_ON(ret) in line 997 in fucntion
rt_spin_lock_slowlock(), the test ran fine without any problem.

A more detailed description of the problem could also be found at,

http://markmail.org/message/np33it233hoot4b2#query:+page:1+mid:np33it233hoot4b2+state:results


Please let me know what you think or need any additional info.

Thanks,
Mak.


 This also includes changes in several function in rtmutex.c now that the 
 lock's
 requester may be a interrupt handler, not a real task struct.  This impacts
 the way how the lock is acquired and prioritized and decision whether to do
 the house keeping functions required for a real task struct.

 The reserved task_struct values for interrupt handler are

  current | 0x2

 where current is the task_struct value of the interrupted task.

 Since IH will both acquire and release the lock only during an interrupt
 handling, during which current is not changed, the reserved task_struct value
 for an IH should be distinct from another instances of IH on a different cpu.

 
 The interrupt handler is a thread just like any other task. It's not
 special. If there was a deadlock detected, it most likely means that a
 deadlock exists.
 
 -- Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-19 Thread Steven Rostedt
On Thu, 19 Feb 2015 18:31:05 -0700
Thavatchai Makphaibulchoke  wrote:

> This patch fixes the problem that the ownership of a mutex acquired by an
> interrupt handler(IH) gets incorrectly attributed to the interrupted thread.

*blink*

> 
> This could result in an incorrect deadlock detection in function
> rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly leading
> up to a system hang.

I highly doubt this is an incorrect deadlock that was detected. My
money is on a real deadlock that happened.

> 
> Here is the approach taken: when calling from an interrupt handler, instead of
> attributing ownership to the interrupted task, use a reserved task_struct 
> value
> to indicate that the owner is a interrupt handler.  This approach avoids the
> incorrect deadlock detection.

How is this an incorrect deadlock? Please explain.

> 
> This also includes changes in several function in rtmutex.c now that the 
> lock's
> requester may be a interrupt handler, not a real task struct.  This impacts
> the way how the lock is acquired and prioritized and decision whether to do
> the house keeping functions required for a real task struct.
> 
> The reserved task_struct values for interrupt handler are
> 
>   current | 0x2
> 
> where current is the task_struct value of the interrupted task.
> 
> Since IH will both acquire and release the lock only during an interrupt
> handling, during which current is not changed, the reserved task_struct value
> for an IH should be distinct from another instances of IH on a different cpu.
> 

The interrupt handler is a thread just like any other task. It's not
special. If there was a deadlock detected, it most likely means that a
deadlock exists.

-- Steve


> Kernel version 3.14.25 + patch-3.14.25-rt22
> 
> Signed-off-by: T. Makphaibulchoke 
> ---
>  include/linux/spinlock_rt.h |   4 +
>  kernel/locking/rtmutex-debug.c  |  15 ++-
>  kernel/locking/rtmutex.c| 212 
> 
>  kernel/locking/rtmutex_common.h |  21 
>  kernel/timer.c  |   4 +-
>  5 files changed, 188 insertions(+), 68 deletions(-)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-19 Thread Thavatchai Makphaibulchoke
This patch fixes the problem that the ownership of a mutex acquired by an
interrupt handler(IH) gets incorrectly attributed to the interrupted thread.

This could result in an incorrect deadlock detection in function
rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly leading
up to a system hang.

Here is the approach taken: when calling from an interrupt handler, instead of
attributing ownership to the interrupted task, use a reserved task_struct value
to indicate that the owner is a interrupt handler.  This approach avoids the
incorrect deadlock detection.

This also includes changes in several function in rtmutex.c now that the lock's
requester may be a interrupt handler, not a real task struct.  This impacts
the way how the lock is acquired and prioritized and decision whether to do
the house keeping functions required for a real task struct.

The reserved task_struct values for interrupt handler are

current | 0x2

where current is the task_struct value of the interrupted task.

Since IH will both acquire and release the lock only during an interrupt
handling, during which current is not changed, the reserved task_struct value
for an IH should be distinct from another instances of IH on a different cpu.

Kernel version 3.14.25 + patch-3.14.25-rt22

Signed-off-by: T. Makphaibulchoke 
---
 include/linux/spinlock_rt.h |   4 +
 kernel/locking/rtmutex-debug.c  |  15 ++-
 kernel/locking/rtmutex.c| 212 
 kernel/locking/rtmutex_common.h |  21 
 kernel/timer.c  |   4 +-
 5 files changed, 188 insertions(+), 68 deletions(-)

diff --git a/include/linux/spinlock_rt.h b/include/linux/spinlock_rt.h
index c0d1367..eeb4188 100644
--- a/include/linux/spinlock_rt.h
+++ b/include/linux/spinlock_rt.h
@@ -27,6 +27,7 @@ extern void __lockfunc rt_spin_unlock_wait(spinlock_t *lock);
 extern int __lockfunc rt_spin_trylock_irqsave(spinlock_t *lock, unsigned long 
*flags);
 extern int __lockfunc rt_spin_trylock_bh(spinlock_t *lock);
 extern int __lockfunc rt_spin_trylock(spinlock_t *lock);
+extern int __lockfunc rt_spin_trylock_in_interrupt(spinlock_t *lock);
 extern int atomic_dec_and_spin_lock(atomic_t *atomic, spinlock_t *lock);
 
 /*
@@ -52,6 +53,9 @@ extern int __lockfunc __rt_spin_trylock(struct rt_mutex 
*lock);
 
 #define spin_lock_irq(lock)spin_lock(lock)
 
+#define spin_do_trylock_in_interrupt(lock) \
+   __cond_lock(lock, rt_spin_trylock_in_interrupt(lock))
+
 #define spin_do_trylock(lock)  __cond_lock(lock, rt_spin_trylock(lock))
 
 #define spin_trylock(lock) \
diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 49b2ed3..c36d629 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -40,6 +40,8 @@ static void printk_task(struct task_struct *p)
 
 static void printk_lock(struct rt_mutex *lock, int print_owner)
 {
+   struct task_struct *owner = rt_mutex_owner(lock);
+
if (lock->name)
printk(" [%p] {%s}\n",
lock, lock->name);
@@ -47,10 +49,13 @@ static void printk_lock(struct rt_mutex *lock, int 
print_owner)
printk(" [%p] {%s:%d}\n",
lock, lock->file, lock->line);
 
-   if (print_owner && rt_mutex_owner(lock)) {
+   if (print_owner && owner) {
printk(".. ->owner: %p\n", lock->owner);
printk(".. held by:  ");
-   printk_task(rt_mutex_owner(lock));
+   if (rt_mutex_owner_is_task(owner))
+   printk_task(owner);
+   else
+   printk(" an interrupt handler.");
printk("\n");
}
 }
@@ -76,6 +81,8 @@ void debug_rt_mutex_deadlock(int detect, struct 
rt_mutex_waiter *act_waiter,
 
task = rt_mutex_owner(act_waiter->lock);
if (task && task != current) {
+   /* Interrupt handler should not be deadlocking. */
+   BUG_ON(!rt_mutex_owner_is_task(task));
act_waiter->deadlock_task_pid = get_pid(task_pid(task));
act_waiter->deadlock_lock = lock;
}
@@ -138,7 +145,9 @@ void debug_rt_mutex_lock(struct rt_mutex *lock)
 
 void debug_rt_mutex_unlock(struct rt_mutex *lock)
 {
-   DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current);
+   DEBUG_LOCKS_WARN_ON(in_interrupt() ?
+   !rt_mutex_owner_is_task(rt_mutex_owner(lock)) :
+   rt_mutex_owner(lock) != current);
 }
 
 void
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 6c40660..8b66f81 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -51,6 +51,9 @@
  * waiters. This can happen when grabbing the lock in the slow path.
  * To prevent a cmpxchg of the owner releasing the lock, we need to
  * set this bit before looking at the lock.
+ *
+ * owner can also be reserved value, INTERRUPT_HANDLER, in case the 

Re: [PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-19 Thread Steven Rostedt
On Thu, 19 Feb 2015 18:31:05 -0700
Thavatchai Makphaibulchoke t...@hp.com wrote:

 This patch fixes the problem that the ownership of a mutex acquired by an
 interrupt handler(IH) gets incorrectly attributed to the interrupted thread.

*blink*

 
 This could result in an incorrect deadlock detection in function
 rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly leading
 up to a system hang.

I highly doubt this is an incorrect deadlock that was detected. My
money is on a real deadlock that happened.

 
 Here is the approach taken: when calling from an interrupt handler, instead of
 attributing ownership to the interrupted task, use a reserved task_struct 
 value
 to indicate that the owner is a interrupt handler.  This approach avoids the
 incorrect deadlock detection.

How is this an incorrect deadlock? Please explain.

 
 This also includes changes in several function in rtmutex.c now that the 
 lock's
 requester may be a interrupt handler, not a real task struct.  This impacts
 the way how the lock is acquired and prioritized and decision whether to do
 the house keeping functions required for a real task struct.
 
 The reserved task_struct values for interrupt handler are
 
   current | 0x2
 
 where current is the task_struct value of the interrupted task.
 
 Since IH will both acquire and release the lock only during an interrupt
 handling, during which current is not changed, the reserved task_struct value
 for an IH should be distinct from another instances of IH on a different cpu.
 

The interrupt handler is a thread just like any other task. It's not
special. If there was a deadlock detected, it most likely means that a
deadlock exists.

-- Steve


 Kernel version 3.14.25 + patch-3.14.25-rt22
 
 Signed-off-by: T. Makphaibulchoke t...@hp.com
 ---
  include/linux/spinlock_rt.h |   4 +
  kernel/locking/rtmutex-debug.c  |  15 ++-
  kernel/locking/rtmutex.c| 212 
 
  kernel/locking/rtmutex_common.h |  21 
  kernel/timer.c  |   4 +-
  5 files changed, 188 insertions(+), 68 deletions(-)
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3.14.25-rt22 1/2] rtmutex Real-Time Linux: Fixing kernel BUG at kernel/locking/rtmutex.c:997!

2015-02-19 Thread Thavatchai Makphaibulchoke
This patch fixes the problem that the ownership of a mutex acquired by an
interrupt handler(IH) gets incorrectly attributed to the interrupted thread.

This could result in an incorrect deadlock detection in function
rt_mutex_adjust_prio_chain(), causing thread to be killed and possibly leading
up to a system hang.

Here is the approach taken: when calling from an interrupt handler, instead of
attributing ownership to the interrupted task, use a reserved task_struct value
to indicate that the owner is a interrupt handler.  This approach avoids the
incorrect deadlock detection.

This also includes changes in several function in rtmutex.c now that the lock's
requester may be a interrupt handler, not a real task struct.  This impacts
the way how the lock is acquired and prioritized and decision whether to do
the house keeping functions required for a real task struct.

The reserved task_struct values for interrupt handler are

current | 0x2

where current is the task_struct value of the interrupted task.

Since IH will both acquire and release the lock only during an interrupt
handling, during which current is not changed, the reserved task_struct value
for an IH should be distinct from another instances of IH on a different cpu.

Kernel version 3.14.25 + patch-3.14.25-rt22

Signed-off-by: T. Makphaibulchoke t...@hp.com
---
 include/linux/spinlock_rt.h |   4 +
 kernel/locking/rtmutex-debug.c  |  15 ++-
 kernel/locking/rtmutex.c| 212 
 kernel/locking/rtmutex_common.h |  21 
 kernel/timer.c  |   4 +-
 5 files changed, 188 insertions(+), 68 deletions(-)

diff --git a/include/linux/spinlock_rt.h b/include/linux/spinlock_rt.h
index c0d1367..eeb4188 100644
--- a/include/linux/spinlock_rt.h
+++ b/include/linux/spinlock_rt.h
@@ -27,6 +27,7 @@ extern void __lockfunc rt_spin_unlock_wait(spinlock_t *lock);
 extern int __lockfunc rt_spin_trylock_irqsave(spinlock_t *lock, unsigned long 
*flags);
 extern int __lockfunc rt_spin_trylock_bh(spinlock_t *lock);
 extern int __lockfunc rt_spin_trylock(spinlock_t *lock);
+extern int __lockfunc rt_spin_trylock_in_interrupt(spinlock_t *lock);
 extern int atomic_dec_and_spin_lock(atomic_t *atomic, spinlock_t *lock);
 
 /*
@@ -52,6 +53,9 @@ extern int __lockfunc __rt_spin_trylock(struct rt_mutex 
*lock);
 
 #define spin_lock_irq(lock)spin_lock(lock)
 
+#define spin_do_trylock_in_interrupt(lock) \
+   __cond_lock(lock, rt_spin_trylock_in_interrupt(lock))
+
 #define spin_do_trylock(lock)  __cond_lock(lock, rt_spin_trylock(lock))
 
 #define spin_trylock(lock) \
diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 49b2ed3..c36d629 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -40,6 +40,8 @@ static void printk_task(struct task_struct *p)
 
 static void printk_lock(struct rt_mutex *lock, int print_owner)
 {
+   struct task_struct *owner = rt_mutex_owner(lock);
+
if (lock-name)
printk( [%p] {%s}\n,
lock, lock-name);
@@ -47,10 +49,13 @@ static void printk_lock(struct rt_mutex *lock, int 
print_owner)
printk( [%p] {%s:%d}\n,
lock, lock-file, lock-line);
 
-   if (print_owner  rt_mutex_owner(lock)) {
+   if (print_owner  owner) {
printk(.. -owner: %p\n, lock-owner);
printk(.. held by:  );
-   printk_task(rt_mutex_owner(lock));
+   if (rt_mutex_owner_is_task(owner))
+   printk_task(owner);
+   else
+   printk( an interrupt handler.);
printk(\n);
}
 }
@@ -76,6 +81,8 @@ void debug_rt_mutex_deadlock(int detect, struct 
rt_mutex_waiter *act_waiter,
 
task = rt_mutex_owner(act_waiter-lock);
if (task  task != current) {
+   /* Interrupt handler should not be deadlocking. */
+   BUG_ON(!rt_mutex_owner_is_task(task));
act_waiter-deadlock_task_pid = get_pid(task_pid(task));
act_waiter-deadlock_lock = lock;
}
@@ -138,7 +145,9 @@ void debug_rt_mutex_lock(struct rt_mutex *lock)
 
 void debug_rt_mutex_unlock(struct rt_mutex *lock)
 {
-   DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current);
+   DEBUG_LOCKS_WARN_ON(in_interrupt() ?
+   !rt_mutex_owner_is_task(rt_mutex_owner(lock)) :
+   rt_mutex_owner(lock) != current);
 }
 
 void
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 6c40660..8b66f81 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -51,6 +51,9 @@
  * waiters. This can happen when grabbing the lock in the slow path.
  * To prevent a cmpxchg of the owner releasing the lock, we need to
  * set this bit before looking at the lock.
+ *
+ * owner can also be reserved value, INTERRUPT_HANDLER, in case the mutex
+ * is