Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-11 Thread Pekka Enberg
On Tue, Sep 11, 2012 at 5:50 AM, Michael Wang wrote:
> On 09/08/2012 04:39 PM, Pekka Enberg wrote:
>> On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney wrote:
>>> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>>>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>>>>> On Wed, 5 Sep 2012, Michael Wang wrote:
>>>>>
>>>>>> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
>>>>>> fake report generated.
>>>>>
>>>>> Ahh... That is a key insight into why this occurs.
>>>>>
>>>>>> This should not happen since we already have init_lock_keys() which will
>>>>>> reassign the lock class for both l3 list and l3 alien.
>>>>>
>>>>> Right. I was wondering why we still get intermittent reports on this.
>>>>>
>>>>>> This patch will invoke init_lock_keys() after we done enable_cpucache()
>>>>>> instead of before to avoid the fake DEADLOCK report.
>>>>>
>>>>> Acked-by: Christoph Lameter
>>>>
>>>> Thanks for your review.
>>>>
>>>> And add Paul to the cc list(my skills on mailing is really poor...).
>>>
>>> Tested-by: Paul E. McKenney 
>>
>> I'd also like to tag this for the stable tree to avoid bogus lockdep
>> reports. How far back in release history should we queue this?
> Hi, Pekka
>
> Sorry for the delayed reply, I try to find out the reason for commit
> 30765b92 but not get it yet, so I add Peter to the cc list.
>
> The below patch for release 3.0.0 is the one to cause the bogus report.
>
> commit 30765b92ada267c5395fc788623cb15233276f5c
> Author: Peter Zijlstra 
> Date:   Thu Jul 28 23:22:56 2011 +0200
>
> slab, lockdep: Annotate the locks before using them
>
> Fernando found we hit the regular OFF_SLAB 'recursion' before we
> annotate the locks, cure this.
>
> The relevant portion of the stack-trace:
>
> > [0.00]  [] rt_spin_lock+0x50/0x56
> > [0.00]  [] __cache_free+0x43/0xc3
> > [0.00]  [] kmem_cache_free+0x6c/0xdc
> > [0.00]  [] slab_destroy+0x4f/0x53
> > [0.00]  [] free_block+0x94/0xc1
> > [0.00]  [] do_tune_cpucache+0x10b/0x2bb
> > [0.00]  [] enable_cpucache+0x7b/0xa7
> > [0.00]  [] kmem_cache_init_late+0x1f/0x61
> > [0.00]  [] start_kernel+0x24c/0x363
> > [0.00]  [] i386_start_kernel+0xa9/0xaf
>
> Reported-by: Fernando Lopez-Lezcano 
> Acked-by: Pekka Enberg 
> Signed-off-by: Peter Zijlstra 
> Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop
> Signed-off-by: Ingo Molnar 
>
> It moved init_lock_keys() before we build up the alien, so we failed to
> reclass it.

I've queued the patch for v3.7. Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-10 Thread Michael Wang
On 09/08/2012 04:39 PM, Pekka Enberg wrote:
> On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney wrote:
>> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>>>> On Wed, 5 Sep 2012, Michael Wang wrote:
>>>>
>>>>> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
>>>>> fake report generated.
>>>>
>>>> Ahh... That is a key insight into why this occurs.
>>>>
>>>>> This should not happen since we already have init_lock_keys() which will
>>>>> reassign the lock class for both l3 list and l3 alien.
>>>>
>>>> Right. I was wondering why we still get intermittent reports on this.
>>>>
>>>>> This patch will invoke init_lock_keys() after we done enable_cpucache()
>>>>> instead of before to avoid the fake DEADLOCK report.
>>>>
>>>> Acked-by: Christoph Lameter
>>>
>>> Thanks for your review.
>>>
>>> And add Paul to the cc list(my skills on mailing is really poor...).
>>
>> Tested-by: Paul E. McKenney 
> 
> I'd also like to tag this for the stable tree to avoid bogus lockdep
> reports. How far back in release history should we queue this?
Hi, Pekka

Sorry for the delayed reply. I tried to find out the reason for commit
30765b92 but have not figured it out yet, so I have added Peter to the cc list.

The patch below, from release 3.0.0, is the one that causes the bogus report.

commit 30765b92ada267c5395fc788623cb15233276f5c
Author: Peter Zijlstra 
Date:   Thu Jul 28 23:22:56 2011 +0200

slab, lockdep: Annotate the locks before using them

Fernando found we hit the regular OFF_SLAB 'recursion' before we
annotate the locks, cure this.

The relevant portion of the stack-trace:

> [0.00]  [] rt_spin_lock+0x50/0x56
> [0.00]  [] __cache_free+0x43/0xc3
> [0.00]  [] kmem_cache_free+0x6c/0xdc
> [0.00]  [] slab_destroy+0x4f/0x53
> [0.00]  [] free_block+0x94/0xc1
> [0.00]  [] do_tune_cpucache+0x10b/0x2bb
> [0.00]  [] enable_cpucache+0x7b/0xa7
> [0.00]  [] kmem_cache_init_late+0x1f/0x61
> [0.00]  [] start_kernel+0x24c/0x363
> [0.00]  [] i386_start_kernel+0xa9/0xaf

Reported-by: Fernando Lopez-Lezcano 
Acked-by: Pekka Enberg 
Signed-off-by: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop
Signed-off-by: Ingo Molnar 

That commit moved init_lock_keys() to before the alien is built up, so we
fail to reclass it.

Regards,
Michael Wang



Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-08 Thread Pekka Enberg
On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney wrote:
> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>> > On Wed, 5 Sep 2012, Michael Wang wrote:
>> >
>> >> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
>> >> fake report generated.
>> >
>> > Ahh... That is a key insight into why this occurs.
>> >
>> >> This should not happen since we already have init_lock_keys() which will
>> >> reassign the lock class for both l3 list and l3 alien.
>> >
>> > Right. I was wondering why we still get intermittent reports on this.
>> >
>> >> This patch will invoke init_lock_keys() after we done enable_cpucache()
>> >> instead of before to avoid the fake DEADLOCK report.
>> >
>> > Acked-by: Christoph Lameter 
>>
>> Thanks for your review.
>>
>> And add Paul to the cc list(my skills on mailing is really poor...).
>
> Tested-by: Paul E. McKenney 

I'd also like to tag this for the stable tree to avoid bogus lockdep
reports. How far back in release history should we queue this?


Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-06 Thread Paul E. McKenney
On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
> > On Wed, 5 Sep 2012, Michael Wang wrote:
> > 
> >> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
> >> fake report generated.
> > 
> > Ahh... That is a key insight into why this occurs.
> > 
> >> This should not happen since we already have init_lock_keys() which will
> >> reassign the lock class for both l3 list and l3 alien.
> > 
> > Right. I was wondering why we still get intermittent reports on this.
> > 
> >> This patch will invoke init_lock_keys() after we done enable_cpucache()
> >> instead of before to avoid the fake DEADLOCK report.
> > 
> > Acked-by: Christoph Lameter 
> 
> Thanks for your review.
> 
> And add Paul to the cc list(my skills on mailing is really poor...).

Tested-by: Paul E. McKenney 



Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-05 Thread Michael Wang
On 09/05/2012 09:55 PM, Christoph Lameter wrote:
> On Wed, 5 Sep 2012, Michael Wang wrote:
> 
>> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
>> fake report generated.
> 
> Ahh... That is a key insight into why this occurs.
> 
>> This should not happen since we already have init_lock_keys() which will
>> reassign the lock class for both l3 list and l3 alien.
> 
> Right. I was wondering why we still get intermittent reports on this.
> 
>> This patch will invoke init_lock_keys() after we done enable_cpucache()
>> instead of before to avoid the fake DEADLOCK report.
> 
> Acked-by: Christoph Lameter 

Thanks for your review.

And adding Paul to the cc list (my mailing skills are really poor...).

Regards,
Michael Wang



Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-05 Thread Christoph Lameter
On Wed, 5 Sep 2012, Michael Wang wrote:

> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
> fake report generated.

Ahh... That is a key insight into why this occurs.

> This should not happen since we already have init_lock_keys() which will
> reassign the lock class for both l3 list and l3 alien.

Right. I was wondering why we still get intermittent reports on this.

> This patch will invoke init_lock_keys() after we done enable_cpucache()
> instead of before to avoid the fake DEADLOCK report.

Acked-by: Christoph Lameter 


[PATCH] slab: fix the DEADLOCK issue on l3 alien lock

2012-09-04 Thread Michael Wang
From: Michael Wang 

A DEADLOCK will be reported while running a kernel with NUMA and LOCKDEP
enabled. The sequence that leads to this false report is:

   kmem_cache_free()        // free obj in cachep
-> cache_free_alien()       // acquire cachep's l3 alien lock
-> __drain_alien_cache()
-> free_block()
-> slab_destroy()
-> kmem_cache_free()        // free slab in cachep->slabp_cache
-> cache_free_alien()       // acquire cachep->slabp_cache's l3 alien lock

Since the l3 alien locks of cachep and cachep->slabp_cache are in the same
lock class, lockdep generates a false report.

This should not happen, since we already have init_lock_keys(), which
reassigns the lock class for both the l3 list and the l3 alien.

However, init_lock_keys() is invoked at the wrong position: before we
invoke enable_cpucache() on each cache.

Until slab_state is set to FULL, enable_cpucache() is not invoked on caches
to build their l3 alien when they are created. So although init_lock_keys()
has been invoked, the l3 alien lock class does not change, because the alien
does not exist until enable_cpucache() is invoked later.

This patch invokes init_lock_keys() after enable_cpucache() has been run,
instead of before it, to avoid the false DEADLOCK report.

Signed-off-by: Michael Wang 
---
 mm/slab.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index d4715e5..cc679ef 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1780,9 +1780,6 @@ void __init kmem_cache_init_late(void)
 
 	slab_state = UP;
 
-	/* Annotate slab for lockdep -- annotate the malloc caches */
-	init_lock_keys();
-
 	/* 6) resize the head arrays to their final sizes */
 	mutex_lock(&slab_mutex);
 	list_for_each_entry(cachep, &slab_caches, list)
@@ -1790,6 +1787,9 @@ void __init kmem_cache_init_late(void)
 			BUG();
 	mutex_unlock(&slab_mutex);
 
+	/* Annotate slab for lockdep -- annotate the malloc caches */
+	init_lock_keys();
+
 	/* Done! */
 	slab_state = FULL;
 
-- 
1.7.4.1
