Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock
On Tue, Sep 11, 2012 at 5:50 AM, Michael Wang wrote:
> On 09/08/2012 04:39 PM, Pekka Enberg wrote:
>> On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney wrote:
>>> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>>>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>>>>> On Wed, 5 Sep 2012, Michael Wang wrote:
>>>>>
>>>>>> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock
>>>>>> class, fake report generated.
>>>>>
>>>>> Ahh... That is a key insight into why this occurs.
>>>>>
>>>>>> This should not happen since we already have init_lock_keys() which will
>>>>>> reassign the lock class for both l3 list and l3 alien.
>>>>>
>>>>> Right. I was wondering why we still get intermitted reports on this.
>>>>>
>>>>>> This patch will invoke init_lock_keys() after we done enable_cpucache()
>>>>>> instead of before to avoid the fake DEADLOCK report.
>>>>>
>>>>> Acked-by: Christoph Lameter
>>>>
>>>> Thanks for your review.
>>>>
>>>> And add Paul to the cc list(my skills on mailing is really poor...).
>>>
>>> Tested-by: Paul E. McKenney
>>
>> I'd also like to tag this for the stable tree to avoid bogus lockdep
>> reports. How far back in release history should we queue this?
>
> Hi, Pekka
>
> Sorry for the delayed reply, I try to find out the reason for commit
> 30765b92 but not get it yet, so I add Peter to the cc list.
>
> The below patch for release 3.0.0 is the one to cause the bogus report.
>
> commit 30765b92ada267c5395fc788623cb15233276f5c
> Author: Peter Zijlstra
> Date:   Thu Jul 28 23:22:56 2011 +0200
>
>     slab, lockdep: Annotate the locks before using them
>
>     Fernando found we hit the regular OFF_SLAB 'recursion' before we
>     annotate the locks, cure this.
>
>     The relevant portion of the stack-trace:
>
>     > [0.00] [] rt_spin_lock+0x50/0x56
>     > [0.00] [] __cache_free+0x43/0xc3
>     > [0.00] [] kmem_cache_free+0x6c/0xdc
>     > [0.00] [] slab_destroy+0x4f/0x53
>     > [0.00] [] free_block+0x94/0xc1
>     > [0.00] [] do_tune_cpucache+0x10b/0x2bb
>     > [0.00] [] enable_cpucache+0x7b/0xa7
>     > [0.00] [] kmem_cache_init_late+0x1f/0x61
>     > [0.00] [] start_kernel+0x24c/0x363
>     > [0.00] [] i386_start_kernel+0xa9/0xaf
>
>     Reported-by: Fernando Lopez-Lezcano
>     Acked-by: Pekka Enberg
>     Signed-off-by: Peter Zijlstra
>     Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop
>     Signed-off-by: Ingo Molnar
>
> It moved init_lock_keys() before we build up the alien, so we failed to
> reclass it.

I've queued the patch for v3.7. Thanks!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock
On 09/08/2012 04:39 PM, Pekka Enberg wrote:
> On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney wrote:
>> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>>>> On Wed, 5 Sep 2012, Michael Wang wrote:
>>>>
>>>>> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock
>>>>> class, fake report generated.
>>>>
>>>> Ahh... That is a key insight into why this occurs.
>>>>
>>>>> This should not happen since we already have init_lock_keys() which will
>>>>> reassign the lock class for both l3 list and l3 alien.
>>>>
>>>> Right. I was wondering why we still get intermitted reports on this.
>>>>
>>>>> This patch will invoke init_lock_keys() after we done enable_cpucache()
>>>>> instead of before to avoid the fake DEADLOCK report.
>>>>
>>>> Acked-by: Christoph Lameter
>>>
>>> Thanks for your review.
>>>
>>> And add Paul to the cc list(my skills on mailing is really poor...).
>>
>> Tested-by: Paul E. McKenney
>
> I'd also like to tag this for the stable tree to avoid bogus lockdep
> reports. How far back in release history should we queue this?

Hi, Pekka

Sorry for the delayed reply, I try to find out the reason for commit
30765b92 but not get it yet, so I add Peter to the cc list.

The below patch for release 3.0.0 is the one to cause the bogus report.

commit 30765b92ada267c5395fc788623cb15233276f5c
Author: Peter Zijlstra
Date:   Thu Jul 28 23:22:56 2011 +0200

    slab, lockdep: Annotate the locks before using them

    Fernando found we hit the regular OFF_SLAB 'recursion' before we
    annotate the locks, cure this.

    The relevant portion of the stack-trace:

    > [0.00] [] rt_spin_lock+0x50/0x56
    > [0.00] [] __cache_free+0x43/0xc3
    > [0.00] [] kmem_cache_free+0x6c/0xdc
    > [0.00] [] slab_destroy+0x4f/0x53
    > [0.00] [] free_block+0x94/0xc1
    > [0.00] [] do_tune_cpucache+0x10b/0x2bb
    > [0.00] [] enable_cpucache+0x7b/0xa7
    > [0.00] [] kmem_cache_init_late+0x1f/0x61
    > [0.00] [] start_kernel+0x24c/0x363
    > [0.00] [] i386_start_kernel+0xa9/0xaf

    Reported-by: Fernando Lopez-Lezcano
    Acked-by: Pekka Enberg
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop
    Signed-off-by: Ingo Molnar

It moved init_lock_keys() before we build up the alien, so we failed to
reclass it.

Regards,
Michael Wang
Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock
On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney wrote:
> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>> > On Wed, 5 Sep 2012, Michael Wang wrote:
>> >
>> >> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock
>> >> class, fake report generated.
>> >
>> > Ahh... That is a key insight into why this occurs.
>> >
>> >> This should not happen since we already have init_lock_keys() which will
>> >> reassign the lock class for both l3 list and l3 alien.
>> >
>> > Right. I was wondering why we still get intermitted reports on this.
>> >
>> >> This patch will invoke init_lock_keys() after we done enable_cpucache()
>> >> instead of before to avoid the fake DEADLOCK report.
>> >
>> > Acked-by: Christoph Lameter
>>
>> Thanks for your review.
>>
>> And add Paul to the cc list(my skills on mailing is really poor...).
>
> Tested-by: Paul E. McKenney

I'd also like to tag this for the stable tree to avoid bogus lockdep
reports. How far back in release history should we queue this?
Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock
On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
> > On Wed, 5 Sep 2012, Michael Wang wrote:
> >
> >> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock
> >> class, fake report generated.
> >
> > Ahh... That is a key insight into why this occurs.
> >
> >> This should not happen since we already have init_lock_keys() which will
> >> reassign the lock class for both l3 list and l3 alien.
> >
> > Right. I was wondering why we still get intermitted reports on this.
> >
> >> This patch will invoke init_lock_keys() after we done enable_cpucache()
> >> instead of before to avoid the fake DEADLOCK report.
> >
> > Acked-by: Christoph Lameter
>
> Thanks for your review.
>
> And add Paul to the cc list(my skills on mailing is really poor...).

Tested-by: Paul E. McKenney
Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock
On 09/05/2012 09:55 PM, Christoph Lameter wrote:
> On Wed, 5 Sep 2012, Michael Wang wrote:
>
>> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock
>> class, fake report generated.
>
> Ahh... That is a key insight into why this occurs.
>
>> This should not happen since we already have init_lock_keys() which will
>> reassign the lock class for both l3 list and l3 alien.
>
> Right. I was wondering why we still get intermitted reports on this.
>
>> This patch will invoke init_lock_keys() after we done enable_cpucache()
>> instead of before to avoid the fake DEADLOCK report.
>
> Acked-by: Christoph Lameter

Thanks for your review.

And add Paul to the cc list(my skills on mailing is really poor...).

Regards,
Michael Wang
Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock
On Wed, 5 Sep 2012, Michael Wang wrote:

> Since the cachep and cachep->slabp_cache's l3 alien are in the same lock
> class, fake report generated.

Ahh... That is a key insight into why this occurs.

> This should not happen since we already have init_lock_keys() which will
> reassign the lock class for both l3 list and l3 alien.

Right. I was wondering why we still get intermitted reports on this.

> This patch will invoke init_lock_keys() after we done enable_cpucache()
> instead of before to avoid the fake DEADLOCK report.

Acked-by: Christoph Lameter
[PATCH] slab: fix the DEADLOCK issue on l3 alien lock
From: Michael Wang

DEADLOCK will be report while running a kernel with NUMA and LOCKDEP enabled,
the process of this fake report is:

	   kmem_cache_free()	//free obj in cachep
	-> cache_free_alien()	//acquire cachep's l3 alien lock

	-> __drain_alien_cache()
	-> free_block()
	-> slab_destroy()
	-> kmem_cache_free()	//free slab in cachep->slabp_cache
	-> cache_free_alien()	//acquire cachep->slabp_cache's l3 alien lock

Since the cachep and cachep->slabp_cache's l3 alien are in the same lock class,
fake report generated.

This should not happen since we already have init_lock_keys() which will
reassign the lock class for both l3 list and l3 alien.

However, init_lock_keys() was invoked at a wrong position which is before we
invoke enable_cpucache() on each cache.

Since until set slab_state to be FULL, we won't invoke enable_cpucache()
on caches to build their l3 alien while creating them, so although we invoked
init_lock_keys(), the l3 alien lock class won't change since we don't have
them until invoked enable_cpucache() later.

This patch will invoke init_lock_keys() after we done enable_cpucache()
instead of before to avoid the fake DEADLOCK report.

Signed-off-by: Michael Wang
---
 mm/slab.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index d4715e5..cc679ef 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1780,9 +1780,6 @@ void __init kmem_cache_init_late(void)
 
 	slab_state = UP;
 
-	/* Annotate slab for lockdep -- annotate the malloc caches */
-	init_lock_keys();
-
 	/* 6) resize the head arrays to their final sizes */
 	mutex_lock(&slab_mutex);
 	list_for_each_entry(cachep, &slab_caches, list)
@@ -1790,6 +1787,9 @@ void __init kmem_cache_init_late(void)
 			BUG();
 	mutex_unlock(&slab_mutex);
 
+	/* Annotate slab for lockdep -- annotate the malloc caches */
+	init_lock_keys();
+
 	/* Done! */
 	slab_state = FULL;
 
-- 
1.7.4.1