Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Jens Axboe
On Thu, Feb 21 2008, Andrew Morton wrote: > On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote: > > > But I think the radix 'scan over entire tree' is a bit fragile. > > eek, it had better not be. Was this an error in the caller? Hope so. The cfq use of it, not the radix

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Andrew Morton
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote: > But I think the radix 'scan over entire tree' is a bit fragile. eek, it had better not be. Was this an error in the caller? Hope so. > This > patch adds a parallel hlist for ease of properly browsing the members, Even

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Andrew Morton
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe [EMAIL PROTECTED] wrote: But I think the radix 'scan over entire tree' is a bit fragile. eek, it had better not be. Was this an error in the caller? Hope so. This patch adds a parallel hlist for ease of properly browsing the members, Even

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-21 Thread Jens Axboe
On Thu, Feb 21 2008, Andrew Morton wrote: On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe [EMAIL PROTECTED] wrote: But I think the radix 'scan over entire tree' is a bit fragile. eek, it had better not be. Was this an error in the caller? Hope so. The cfq use of it, not the radix tree

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
On 2/20/2008, "Zhang, Yanmin" <[EMAIL PROTECTED]> wrote: > Kernel with the reverting patch is ok. > I ran reboot/hackbench for more than 10 times on every one of my 3 x86-64 > machines, and kernel didn't crash. Great, Linus reverted the patch yesterday. Thanks for testing! -- To unsubscribe

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin
On Wed, 2008-02-20 at 10:08 +0800, Zhang, Yanmin wrote: > On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote: > > On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote: > > > Ingo Molnar wrote: > > > > * Pekka Enberg <[EMAIL PROTECTED]> wrote: > > > > > > > >>> Yes, this can happen. Are you

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin
On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote: > On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote: > > Ingo Molnar wrote: > > > * Pekka Enberg <[EMAIL PROTECTED]> wrote: > > > > > >>> Yes, this can happen. Are you saying it is not safe to be in the > > >>> lockless path when an IRQ

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin
On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote: > Ingo Molnar wrote: > > * Pekka Enberg <[EMAIL PROTECTED]> wrote: > > > >>> Yes, this can happen. Are you saying it is not safe to be in the > >>> lockless path when an IRQ triggers? > >> Hmm. The barrier() in slab_free() looks fishy. The

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
* Pekka Enberg ([EMAIL PROTECTED]) wrote: > Hi Mathieu, > > On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote: > > - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore > > indicating it is not reentrant if IRQs are disabled. Since those are > > only stats, I

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
LOCAL > > > > > - bool > > > > > - default y > > > > > - > > > > > config MMU > > > > > def_bool y > > > > > > > > > > > > > $ grep FAST_CMPXCHG_LOCAL */.config &

Re: Linux 2.6.25-rc2

2008-02-19 Thread Torsten Kaiser
On Feb 19, 2008 5:20 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote: > So: > - it might be something else entirely > - it might still be the local cmpxchg, just Torsten didn't happen to >notice it until later. My new hackbench-testcase also killed 2.6.24-rc2-mm1, so I really noticed to late.

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > Earlier today i turned off local-cmpxchg and havent had a crash or > hang since then - but at 200 bootups and 4-5 crashes in a week that's > not conclusive yet. I think others might have workloads that trigger > this bug more often. i mean, today

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > So: > - it might be something else entirely > - it might still be the local cmpxchg, just Torsten didn't happen to >notice it until later. > - it might still be the local cmpxchg, but something else changed its >patterns to actually make

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds
On Tue, 19 Feb 2008, Eric Dumazet wrote: > > cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed, > while an interrupt came (on this cpu), and several allocations were done, > and one free was performed at the end of this interruption, so 'object' > was recycled. I think you may
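The race Eric Dumazet describes above looks like a classic ABA problem: the compare only checks that the list head is the same pointer as before, not that the object behind it still links to the same successor. A minimal single-threaded sketch in C follows. It is an illustration under assumptions, not the code in mm/slub.c: struct obj, cmpxchg_sim(), irq_sim() and aba_demo() are hypothetical names, the "interrupt" is an ordinary function call placed between the read and the compare-and-exchange, and cmpxchg_sim() only mimics the return-previous-value convention of cmpxchg_local().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free object: its first word links to the next free object. */
struct obj {
	struct obj *next;
};

/*
 * Stand-in for cmpxchg_local(): if *ptr equals old, store new_val.
 * Always returns the value found in *ptr before the call.
 * Plain C, not atomic; good enough for a single-threaded illustration.
 */
static struct obj *cmpxchg_sim(struct obj **ptr, struct obj *old,
			       struct obj *new_val)
{
	struct obj *prev = *ptr;

	if (prev == old)
		*ptr = new_val;
	return prev;
}

/*
 * Simulated interrupt: allocate two objects, then free the first one.
 * Afterwards the list head is the same pointer as before, but its
 * successor has changed and the second object is still in use.
 */
static void irq_sim(struct obj **freelist, struct obj **still_in_use)
{
	struct obj *first = *freelist;		/* allocate first */
	*freelist = first->next;
	struct obj *second = *freelist;		/* allocate second */
	*freelist = second->next;

	first->next = *freelist;		/* free first again */
	*freelist = first;

	*still_in_use = second;
}

/* Returns 1 if the fast path puts an in-use object back on the freelist. */
int aba_demo(void)
{
	struct obj c = { NULL }, b = { &c }, a = { &b };
	struct obj *freelist = &a;		/* a -> b -> c */
	struct obj *in_use = NULL;

	struct obj *object = freelist;		/* fast path reads the head */
	assert(object != NULL);
	struct obj *next = object->next;	/* and its successor (b) */

	irq_sim(&freelist, &in_use);		/* interrupt lands here */

	/* The head is 'a' again, so the compare succeeds with a stale 'next'. */
	if (cmpxchg_sim(&freelist, object, next) != object)
		return 0;

	return freelist == in_use;		/* 'b' is allocated and on the list */
}
```

In this sketch aba_demo() reports that the freelist head ends up pointing at an object that is still allocated, which is the kind of corruption that could later surface as a list-debug BUG.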

Re: Linux 2.6.25-rc2

2008-02-19 Thread Eric Dumazet
> > $ grep FAST_CMPXCHG_LOCAL */.config > > > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > > > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > > > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > > > linux-2.6.24-rc6-mm

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds
On Tue, 19 Feb 2008, Pekka Enberg wrote: > > Hmm. The barrier() in slab_free() looks fishy. The comment says it's > there to make sure we've retrieved c->freelist before c->page but then > it uses a _compiler barrier_ which doesn't affect the CPU and the > reads may still be re-ordered... Not

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Ingo Molnar wrote: * Ingo Molnar <[EMAIL PROTECTED]> wrote: If this (or my other patch) indeed solves the problem i'd still favor a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks quite un-cooked and quite un-tested for multiple independent reasons. Sigh, why do i

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Ingo Molnar wrote: * Pekka Enberg <[EMAIL PROTECTED]> wrote: Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers? Hmm. The barrier() in slab_free() looks fishy. The comment says it's there to make sure we've retrieved c->freelist before c->page

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > If this (or my other patch) indeed solves the problem i'd still favor > a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it > looks quite un-cooked and quite un-tested for multiple independent > reasons. > > Sigh, why do i again

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Pekka Enberg <[EMAIL PROTECTED]> wrote: > > Yes, this can happen. Are you saying it is not safe to be in the > > lockless path when an IRQ triggers? > > Hmm. The barrier() in slab_free() looks fishy. The comment says it's > there to make sure we've retrieved c->freelist before c->page but

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Hi Mathieu, On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote: > > Since this shows mostly with network card drivers, I think the most > > plausible cause would be an IRQ nesting over kmem_cache_alloc_node and > > calling it. On Feb 19, 2008 4:21 PM, Pekka Enberg <[EMAIL

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Hi Mathieu, On Feb 19, 2008 4:02 PM, Mathieu Desnoyers <[EMAIL PROTECTED]> wrote: > - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore > indicating it is not reentrant if IRQs are disabled. Since those are > only stats, I guess it's ok, but still weird. What is not re-entrant?

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
ef_bool y > > > > > > > $ grep FAST_CMPXCHG_LOCAL */.config > > linux-2.6.24-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > > linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > > linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > > lin

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Mathieu Desnoyers <[EMAIL PROTECTED]> wrote: > Ingo, a comment in slub.c explains it : > > /* > * The SLUB_FASTPATH path is provisional and is currently disabled if the > * kernel is compiled with preemption or if the arch does not support > * fast cmpxchg operations. There are a couple of

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Kamalesh Babulal
Jens Axboe wrote: > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: >> On Sun, 17 Feb 2008 20:29:13 +0100 >> Jens Axboe <[EMAIL PROTECTED]> wrote: >> >>> It's odd stuff. Could you perhaps try and add some printks to >>> block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return >>> from

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
* Ingo Molnar ([EMAIL PROTECTED]) wrote: > > * Pekka Enberg <[EMAIL PROTECTED]> wrote: > > > Mathieu, Christoph is on vacation and I'm not at all that familiar > > with this cmpxchg_local() optimization, so if you could take a peek at > > this bug report to see if you can spot something

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Hi, Pekka Enberg <[EMAIL PROTECTED]> wrote: > > Mathieu, Christoph is on vacation and I'm not at all that familiar > > with this cmpxchg_local() optimization, so if you could take a peek at > > this bug report to see if you can spot something obviously wrong with > > it, I would much appreciate

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Pekka Enberg <[EMAIL PROTECTED]> wrote: > Mathieu, Christoph is on vacation and I'm not at all that familiar > with this cmpxchg_local() optimization, so if you could take a peek at > this bug report to see if you can spot something obviously wrong with > it, I would much appreciate that.

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > On Tue, 19 Feb 2008 09:58:38 +0100 > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > when I inserted printk here > > > == > > > for (i = 0; i < nr; i++) > > > func(ioc, cics[i]); > > > printk("%d %lx\n", nr, index); > > > == > > > index was

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
On Tue, 19 Feb 2008 09:58:38 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote: > > when I inserted printk here > > == > > for (i = 0; i < nr; i++) > > func(ioc, cics[i]); > > printk("%d %lx\n", nr, index); > > == > > index was always "1" and nr was always 32. > > > > So,

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > On Tue, 19 Feb 2008 09:36:34 +0100 > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > > > On Sun, 17 Feb 2008 20:29:13 +0100 > > > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > > > > > It's odd stuff. Could

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > On Tue, 19 Feb 2008 09:36:34 +0100 > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > > > On Sun, 17 Feb 2008 20:29:13 +0100 > > > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > > > > > It's odd stuff. Could

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote: > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > > On Sun, 17 Feb 2008 20:29:13 +0100 > > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > > > It's odd stuff. Could you perhaps try and add some printks to > > >

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-19 Thread Tilman Schmidt
[added CCs from the other thread on this topic] Alasdair G Kergon wrote: On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote: # CONFIG_SYSFS_DEPRECATED is not set IMHO That should be *set* by default until everyone has had time to update their userspace software to cope with the

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe <[EMAIL PROTECTED]> wrote: > On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > > On Sun, 17 Feb 2008 20:29:13 +0100 > > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > > > It's odd stuff. Could you perhaps try and add some printks to > > >

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: > On Sun, 17 Feb 2008 20:29:13 +0100 > Jens Axboe <[EMAIL PROTECTED]> wrote: > > > It's odd stuff. Could you perhaps try and add some printks to > > block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return > > from

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
that case, cics[]->dead_key has key value. Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.25-rc2/block/cfq-iosched.c =========== --- linux-2.6.25-rc2.orig/block/cfq-iosched.c +++ linux-2.6.25-rc2/blo

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
value. Signed-off-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED] Index: linux-2.6.25-rc2/block/cfq-iosched.c === --- linux-2.6.25-rc2.orig/block/cfq-iosched.c +++ linux-2.6.25-rc2/block/cfq-iosched.c @@ -1171,7 +1171,11 @@ call_for_each_cic

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Sun, 17 Feb 2008 20:29:13 +0100 Jens Axboe [EMAIL PROTECTED] wrote: It's odd stuff. Could you perhaps try and add some printks to block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return from radix_tree_gang_lookup() and the

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe [EMAIL PROTECTED] wrote: On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Sun, 17 Feb 2008 20:29:13 +0100 Jens Axboe [EMAIL PROTECTED] wrote: It's odd stuff. Could you perhaps try and add some printks to

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Pekka Enberg [EMAIL PROTECTED] wrote: Mathieu, Christoph is on vacation and I'm not at all that familiar with this cmpxchg_local() optimization, so if you could take a peek at this bug report to see if you can spot something obviously wrong with it, I would much appreciate that. hm,

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Hi, Pekka Enberg [EMAIL PROTECTED] wrote: Mathieu, Christoph is on vacation and I'm not at all that familiar with this cmpxchg_local() optimization, so if you could take a peek at this bug report to see if you can spot something obviously wrong with it, I would much appreciate that. On

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Tue, 19 Feb 2008 09:58:38 +0100 Jens Axboe [EMAIL PROTECTED] wrote: when I inserted printk here == for (i = 0; i < nr; i++) func(ioc, cics[i]); printk("%d %lx\n", nr, index); == index was always "1" and nr was always

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe [EMAIL PROTECTED] wrote: On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Sun, 17 Feb 2008 20:29:13 +0100 Jens Axboe [EMAIL PROTECTED] wrote: It's odd stuff. Could you perhaps try and

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Jens Axboe
On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe [EMAIL PROTECTED] wrote: On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Sun, 17 Feb 2008 20:29:13 +0100 Jens Axboe [EMAIL PROTECTED] wrote: It's odd stuff. Could you perhaps try and

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
On Tue, 19 Feb 2008 09:36:34 +0100 Jens Axboe [EMAIL PROTECTED] wrote: On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Sun, 17 Feb 2008 20:29:13 +0100 Jens Axboe [EMAIL PROTECTED] wrote: It's odd stuff. Could you perhaps try and add some printks to

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread KAMEZAWA Hiroyuki
On Tue, 19 Feb 2008 09:58:38 +0100 Jens Axboe [EMAIL PROTECTED] wrote: when I inserted printk here == for (i = 0; i < nr; i++) func(ioc, cics[i]); printk("%d %lx\n", nr, index); == index was always "1" and nr was always 32. So, cics[31]->key was always NULL when
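The observation above (index stuck at 1, nr stuck at 32) is what a batched tree walk would produce if the next start index is derived from a field of the last returned entry and that field has been cleared. The C sketch below models this under stated assumptions: it assumes the walk advances with index = 1 + key of the last entry in the batch, uses a flat array in place of the radix tree, and a batch size of 4 instead of 32. struct entry, gang_lookup_sim() and walk_sim() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

#define GANG_NR 4	/* hypothetical batch size; the thread reports 32 */

/* slot: where the entry sits in the tree; key: field the walker reads. */
struct entry {
	unsigned long slot;
	unsigned long key;
};

/*
 * Stand-in for a gang lookup: collect up to max entries whose slot is
 * at or after 'first', in ascending slot order (tree[] is kept sorted).
 */
static unsigned int gang_lookup_sim(struct entry *tree, unsigned int n,
				    struct entry **out, unsigned long first,
				    unsigned int max)
{
	unsigned int found = 0;

	for (unsigned int i = 0; i < n && found < max; i++)
		if (tree[i].slot >= first)
			out[found++] = &tree[i];
	return found;
}

/*
 * Walk the whole tree in batches. Returns the number of passes made,
 * or -1 if max_passes is reached, which stands in for the soft lockup.
 */
int walk_sim(struct entry *tree, unsigned int n, int max_passes)
{
	struct entry *batch[GANG_NR];
	unsigned long index = 0;
	unsigned int nr;
	int passes = 0;

	assert(max_passes > 0);
	do {
		if (passes++ == max_passes)
			return -1;	/* no forward progress */
		nr = gang_lookup_sim(tree, n, batch, index, GANG_NR);
		if (nr)
			index = 1 + batch[nr - 1]->key;	/* a zero key gives index 1 */
	} while (nr == GANG_NR);

	return passes;
}
```

With six entries whose key matches their slot, the walk finishes in two passes. Clearing the key of the last entry of the first full batch makes every pass restart from index 1 and return the same full batch, so the loop condition never becomes false.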

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
* Ingo Molnar ([EMAIL PROTECTED]) wrote: * Pekka Enberg [EMAIL PROTECTED] wrote: Mathieu, Christoph is on vacation and I'm not at all that familiar with this cmpxchg_local() optimization, so if you could take a peek at this bug report to see if you can spot something obviously wrong

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-19 Thread Kamalesh Babulal
Jens Axboe wrote: On Tue, Feb 19 2008, KAMEZAWA Hiroyuki wrote: On Sun, 17 Feb 2008 20:29:13 +0100 Jens Axboe [EMAIL PROTECTED] wrote: It's odd stuff. Could you perhaps try and add some printks to block/cfq-iosched.c:call_for_each_cic(), like dumping the 'nr' return from

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Mathieu Desnoyers [EMAIL PROTECTED] wrote: Ingo, a comment in slub.c explains it : /* * The SLUB_FASTPATH path is provisional and is currently disabled if the * kernel is compiled with preemption or if the arch does not support * fast cmpxchg operations. There are a couple of coming

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Hi Mathieu, On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote: Since this shows mostly with network card drivers, I think the most plausible cause would be an IRQ nesting over kmem_cache_alloc_node and calling it. On Feb 19, 2008 4:21 PM, Pekka Enberg [EMAIL PROTECTED]

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y -rc2-mm1 still worked for me. Did you mean the new SLUB_FASTPATH? $ grep define SLUB_FASTPATH */mm/slub.c linux-2.6.25-rc1/mm/slub.c:#define SLUB_FASTPATH linux-2.6.25-rc2-mm1

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Pekka Enberg [EMAIL PROTECTED] wrote: Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers? Hmm. The barrier() in slab_free() looks fishy. The comment says it's there to make sure we've retrieved c->freelist before c->page but then it

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Ingo Molnar [EMAIL PROTECTED] wrote: If this (or my other patch) indeed solves the problem i'd still favor a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks quite un-cooked and quite un-tested for multiple independent reasons. Sigh, why do i again have to be

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Hi Mathieu, On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote: - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore indicating it is not reentrant if IRQs are disabled. Since those are only stats, I guess it's ok, but still weird. What is not re-entrant? On

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds
On Tue, 19 Feb 2008, Eric Dumazet wrote: cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed, while an interrupt came (on this cpu), and several allocations were done, and one free was performed at the end of this interruption, so 'object' was recycled. I think you may well be

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Ingo Molnar wrote: * Ingo Molnar [EMAIL PROTECTED] wrote: If this (or my other patch) indeed solves the problem i'd still favor a full revert of the SLUB_FASTPATH (commit 1f84260c8ce3b1ce26d4), it looks quite un-cooked and quite un-tested for multiple independent reasons. Sigh, why do i

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
Ingo Molnar wrote: * Pekka Enberg [EMAIL PROTECTED] wrote: Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers? Hmm. The barrier() in slab_free() looks fishy. The comment says it's there to make sure we've retrieved c->freelist before c->page but

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Ingo Molnar [EMAIL PROTECTED] wrote: Earlier today i turned off local-cmpxchg and havent had a crash or hang since then - but at 200 bootups and 4-5 crashes in a week that's not conclusive yet. I think others might have workloads that trigger this bug more often. i mean, today i've

Re: Linux 2.6.25-rc2

2008-02-19 Thread Linus Torvalds
On Tue, 19 Feb 2008, Pekka Enberg wrote: Hmm. The barrier() in slab_free() looks fishy. The comment says it's there to make sure we've retrieved c->freelist before c->page but then it uses a _compiler barrier_ which doesn't affect the CPU and the reads may still be re-ordered... Not sure if

Re: Linux 2.6.25-rc2

2008-02-19 Thread Eric Dumazet
=y linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y -rc2-mm1 still worked for me. Did you mean the new

Re: Linux 2.6.25-rc2

2008-02-19 Thread Ingo Molnar
* Linus Torvalds [EMAIL PROTECTED] wrote: So: - it might be something else entirely - it might still be the local cmpxchg, just Torsten didn't happen to notice it until later. - it might still be the local cmpxchg, but something else changed its patterns to actually make it

Re: Linux 2.6.25-rc2

2008-02-19 Thread Torsten Kaiser
On Feb 19, 2008 5:20 PM, Linus Torvalds [EMAIL PROTECTED] wrote: So: - it might be something else entirely - it might still be the local cmpxchg, just Torsten didn't happen to notice it until later. My new hackbench-testcase also killed 2.6.24-rc2-mm1, so I really noticed to late. -

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2

Re: Linux 2.6.25-rc2

2008-02-19 Thread Mathieu Desnoyers
* Pekka Enberg ([EMAIL PROTECTED]) wrote: Hi Mathieu, On Feb 19, 2008 4:02 PM, Mathieu Desnoyers [EMAIL PROTECTED] wrote: - stat(c, ALLOC_FASTPATH); seems to be using a var++, therefore indicating it is not reentrant if IRQs are disabled. Since those are only stats, I guess it's ok,

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin
On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote: Ingo Molnar wrote: * Pekka Enberg [EMAIL PROTECTED] wrote: Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers? Hmm. The barrier() in slab_free() looks fishy. The comment says it's

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin
On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote: On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote: Ingo Molnar wrote: * Pekka Enberg [EMAIL PROTECTED] wrote: Yes, this can happen. Are you saying it is not safe to be in the lockless path when an IRQ triggers? Hmm.

Re: Linux 2.6.25-rc2

2008-02-19 Thread Zhang, Yanmin
On Wed, 2008-02-20 at 10:08 +0800, Zhang, Yanmin wrote: On Wed, 2008-02-20 at 08:36 +0800, Zhang, Yanmin wrote: On Tue, 2008-02-19 at 17:52 +0200, Pekka Enberg wrote: Ingo Molnar wrote: * Pekka Enberg [EMAIL PROTECTED] wrote: Yes, this can happen. Are you saying it is not safe

Re: Linux 2.6.25-rc2

2008-02-19 Thread Pekka Enberg
On 2/20/2008, Zhang, Yanmin [EMAIL PROTECTED] wrote: Kernel with the reverting patch is ok. I ran reboot/hackbench for more than 10 times on every one of my 3 x86-64 machines, and kernel didn't crash. Great, Linus reverted the patch yesterday. Thanks for testing!

Re: Linux 2.6.25-rc2

2008-02-18 Thread Pekka Enberg
.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y > l

Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser
linux-2.6.24-rc3-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc3-mm2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc6-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.24-rc8-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc

Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser
On Feb 19, 2008 12:54 AM, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > On Sat, 16 Feb 2008, Torsten Kaiser wrote: > > > > [ 5282.056415] [ cut here ] > > [ 5282.059757] kernel BUG at lib/list_debug.c:33! > > Is there any chance that you could try to bisect this, if it's

Re: Linux 2.6.25-rc2

2008-02-18 Thread Ingo Molnar
* Torsten Kaiser <[EMAIL PROTECTED]> wrote: > On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > > Ok, > > this kernel is a winner. > > Sadly not for me: > [ 5282.056415] [ cut here ] > [ 5282.059757] kernel BUG at lib/list_debug.c:33! > [

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-18 Thread Alasdair G Kergon
On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote: > # CONFIG_SYSFS_DEPRECATED is not set IMHO That should be *set* by default until everyone has had time to update their userspace software to cope with the changed sysfs layout. Alasdair -- [EMAIL PROTECTED]

Re: Linux 2.6.25-rc2

2008-02-18 Thread Linus Torvalds
On Sat, 16 Feb 2008, Torsten Kaiser wrote: > > [ 5282.056415] [ cut here ] > [ 5282.059757] kernel BUG at lib/list_debug.c:33! Is there any chance that you could try to bisect this, if it's repeatable enough for you? Even if you can't bisect it *all* the way, it would

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Frans Pop
Jeff Garzik wrote: > Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. > One running Fedora 8 + X (GNOME) and one a headless file server. > configs and lspci attached. Unable to capture any splatter so far. Sounds like it may be http://lkml.org/lkml/2008/2/17/78. Suggest you

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Jeff Garzik
Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. One running Fedora 8 + X (GNOME) and one a headless file server. configs and lspci attached. Unable to capture any splatter so far. Bisecting... 00:00.0 Host bridge: Intel Corporation 82955X Memory Controller Hub 00:01.0

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Andrew Morton
On Sat, 16 Feb 2008 11:14:46 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote: > The 2.6.25-rc2 kernel oopses while running dbench on ext3 filesystem > mounted with mount -o data=writeback,nobh option on the x86_64 box > > BUG: unable to handle kernel NULL pointer dereference at

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-18 Thread Tilman Schmidt
On 17.02.2008, Jeff Chua wrote: I faced the same problem, but resolved with ... vgscan vgchange -a y Sorry, I'm not sure what to do with those two commands. Running them once manually doesn't seem to change anything, and my initrd already contains them AFAICS. Also, ensure you set

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Andrew Morton
On Sat, 16 Feb 2008 11:14:46 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote: The 2.6.25-rc2 kernel oopses while running dbench on ext3 filesystem mounted with mount -o data=writeback,nobh option on the x86_64 box BUG: unable to handle kernel NULL pointer dereference at IP:

Re: [BUG] Linux 2.6.25-rc2 - Kernel Ooops while running dbench

2008-02-18 Thread Frans Pop
Jeff Garzik wrote: Two x86-64 boxes here lock up here on 2.6.25-rc2, shortly after boot. One running Fedora 8 + X (GNOME) and one a headless file server. configs and lspci attached. Unable to capture any splatter so far. Sounds like it may be http://lkml.org/lkml/2008/2/17/78. Suggest you

Re: Linux 2.6.25-rc2

2008-02-18 Thread Linus Torvalds
On Sat, 16 Feb 2008, Torsten Kaiser wrote: [ 5282.056415] [ cut here ] [ 5282.059757] kernel BUG at lib/list_debug.c:33! Is there any chance that you could try to bisect this, if it's repeatable enough for you? Even if you can't bisect it *all* the way, it would be

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-18 Thread Alasdair G Kergon
On Sat, Feb 16, 2008 at 11:37:37PM +0100, Jiri Slaby wrote: # CONFIG_SYSFS_DEPRECATED is not set IMHO That should be *set* by default until everyone has had time to update their userspace software to cope with the changed sysfs layout. Alasdair -- [EMAIL PROTECTED]

Re: Linux 2.6.25-rc2

2008-02-18 Thread Ingo Molnar
* Torsten Kaiser [EMAIL PROTECTED] wrote: On Feb 15, 2008 10:23 PM, Linus Torvalds [EMAIL PROTECTED] wrote: Ok, this kernel is a winner. Sadly not for me: [ 5282.056415] [ cut here ] [ 5282.059757] kernel BUG at lib/list_debug.c:33! [ 5282.062055] invalid

Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser
On Feb 19, 2008 12:54 AM, Linus Torvalds [EMAIL PROTECTED] wrote: On Sat, 16 Feb 2008, Torsten Kaiser wrote: [ 5282.056415] [ cut here ] [ 5282.059757] kernel BUG at lib/list_debug.c:33! Is there any chance that you could try to bisect this, if it's repeatable

Re: Linux 2.6.25-rc2

2008-02-18 Thread Torsten Kaiser
-2.6.25-rc1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2-mm1/.config:CONFIG_FAST_CMPXCHG_LOCAL=y linux-2.6.25-rc2/.config:CONFIG_FAST_CMPXCHG_LOCAL=y -rc2-mm1 still worked for me. Did you mean the new SLUB_FASTPATH? $ grep define SLUB_FASTPATH */mm/slub.c linux-2.6.25-rc1/mm/slub.c:#define

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-17 Thread Jeff Chua
On Feb 18, 2008 8:57 AM, Tilman Schmidt <[EMAIL PROTECTED]> wrote: > On 16.02.2008 23:37, Jiri Slaby wrote: > > On 02/16/2008 09:12 PM, Alan Cox wrote: > > Try to upgrade to at least lvm 2.02.29 (I guess this is the first version which understands the new sysfs layout). > I'll have to

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-17 Thread Tilman Schmidt
On 16.02.2008 23:37, Jiri Slaby wrote: On 02/16/2008 09:12 PM, Alan Cox wrote: On Sat, 16 Feb 2008 20:14:30 +0100 Tilman Schmidt <[EMAIL PROTECTED]> wrote: 2.6.25-rc2 fails to bring up my openSUSE 10.3 PC because LVM cannot find the volume group containing the root file system. 2.6.25-rc1

Re: Linux 2.6.25-rc2

2008-02-17 Thread Torsten Kaiser
On Feb 17, 2008 9:25 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote: > There's the Bugzilla entry for it at > http://bugzilla.kernel.org/show_bug.cgi?id=9973 Thank you. > Please update it with the current information. Crash for 2.6.25-rc2-mm1 added. That one had a complete stacktrace, but the

Re: Linux 2.6.25-rc2

2008-02-17 Thread Rafael J. Wysocki
On Saturday, 16 of February 2008, Torsten Kaiser wrote: > On Feb 15, 2008 10:23 PM, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > > Ok, > > this kernel is a winner. > > Sadly not for me: > [ 5282.056415] [ cut here ] > [ 5282.059757] kernel BUG at lib/list_debug.c:33! >

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-17 Thread Rafael J. Wysocki
On Saturday, 16 of February 2008, Kamalesh Babulal wrote: > Hi, Hi, > The softlockup is seen from 2.6.25-rc1-git{1,3} and is visible in the > 2.6.24-rc2 kernel, > While booting up with the 2.6.25-rc1-git{1,3} and 2.6.25-rc2 kernel(s) on the > powerbox Can you update the Bugzilla entry at:

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-17 Thread Jens Axboe
On Sat, Feb 16 2008, Kamalesh Babulal wrote: > Hi, > > The softlockup is seen from 2.6.25-rc1-git{1,3} and is visible in the > 2.6.24-rc2 kernel, > While booting up with the 2.6.25-rc1-git{1,3} and 2.6.25-rc2 kernel(s) on the > powerbox > > Loading st.ko module > BUG: soft lockup - CPU#1 stuck

Re: [BUG] Linux 2.6.25-rc2 - Regression from 2.6.24-rc1-git1 softlockup while bootup on powerpc

2008-02-17 Thread Jens Axboe
On Sat, Feb 16 2008, Kamalesh Babulal wrote: Hi, The softlockup is seen from 2.6.25-rc1-git{1,3} and is visible in the 2.6.24-rc2 kernel, While booting up with the 2.6.25-rc1-git{1,3} and 2.6.25-rc2 kernel(s) on the powerbox Loading st.ko module BUG: soft lockup - CPU#1 stuck for 61s!

Re: Linux 2.6.25-rc2

2008-02-17 Thread Torsten Kaiser
On Feb 17, 2008 9:25 PM, Rafael J. Wysocki [EMAIL PROTECTED] wrote: There's the Bugzilla entry for it at http://bugzilla.kernel.org/show_bug.cgi?id=9973 Thank you. Please update it with the current information. Crash for 2.6.25-rc2-mm1 added. That one had a complete stacktrace, but the trace

Re: Linux 2.6.25-rc2 regression: LVM cannot find volume group

2008-02-17 Thread Tilman Schmidt
On 16.02.2008 23:37, Jiri Slaby wrote: On 02/16/2008 09:12 PM, Alan Cox wrote: On Sat, 16 Feb 2008 20:14:30 +0100 Tilman Schmidt [EMAIL PROTECTED] wrote: 2.6.25-rc2 fails to bring up my openSUSE 10.3 PC because LVM cannot find the volume group containing the root file system. 2.6.25-rc1 has
