Re: [PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state
On Sat, Aug 25, 2007 at 11:43:08AM +0200, Mariusz Kozlowski wrote:
> > > =================================
> > > [ INFO: inconsistent lock state ]
> > > 2.6.23-rc2-mm1 #7
> > > ---------------------------------
> > > inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> > > ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > >  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
...
> I tested your patch and it still happens. Dmesg info from patched kernel
> attached. I couldn't reproduce that on 2.6.23-rc3-mm1 - but on
> 2.6.23-rc2-mm2 it is easily reproducible.
>
> If you need more info, test some patches, etc. - just mail me.
>
...
> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.23-rc2-mm2 #2
> ---------------------------------------------------------
> runscript.sh/5065 just changed the state of lock:
>  (_xmit_ETHER){-+..}, at: [<c03cb659>] dev_watchdog+0x17/0xcc
> but this lock took another, soft-irq-unsafe lock in the past:
>  (&tp->lock){--..}
>
> and interrupts could create inverse lock ordering between them.

It's OK! These are 2 different warnings. As a matter of fact, my patch
wasn't supposed to fix either of them, but something similar to the
first one, which was possible but for some reason wasn't reported by
lockdep.

The first warning was fixed by Andrew Morton's patch to free_irq(), so
it shouldn't happen in -rc3-mm. The second warning could have been fixed
too, I don't know, but since it's quite long, I would prefer to think
about it only if it still happens in current -mm's.

Thanks,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
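To make the first warning easier to follow, here is a rough userspace sketch of the kind of check behind "inconsistent {in-hardirq-W} -> {hardirq-on-W} usage". It is not lockdep's code: struct lock_usage and record_acquire() are hypothetical names, and real lockdep tracks lock classes and many more states. The idea is only that a lock once taken from hard interrupt context must never be taken with hard irqs enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified usage record for one lock. */
struct lock_usage {
	bool used_in_hardirq;       /* taken from hard interrupt context */
	bool used_with_hardirqs_on; /* taken while hard irqs were enabled */
};

/*
 * Record one acquisition of the lock.
 * Returns true while the usage is still consistent, false once the lock
 * has been seen both in hardirq context and with hardirqs enabled.
 */
bool record_acquire(struct lock_usage *u, bool in_hardirq,
		    bool hardirqs_enabled)
{
	assert(u != NULL);

	if (in_hardirq)
		u->used_in_hardirq = true;
	else if (hardirqs_enabled)
		u->used_with_hardirqs_on = true;

	return !(u->used_in_hardirq && u->used_with_hardirqs_on);
}
```

In the report above, the first acquisition corresponds to rtl8139_interrupt() running from do_IRQ(), and the second to the same handler being called from free_irq() in process context with irqs on.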
Re: [PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state
> > =================================
> > [ INFO: inconsistent lock state ]
> > 2.6.23-rc2-mm1 #7
> > ---------------------------------
> > inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> > ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
> >  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> > {in-hardirq-W} state was registered at:
> >   [<c0138eeb>] __lock_acquire+0x949/0x11ac
> >   [<c01397e7>] lock_acquire+0x99/0xb2
> >   [<c0452ff3>] _spin_lock+0x35/0x42
> >   [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> >   [<c0147a5d>] handle_IRQ_event+0x28/0x59
> >   [<c01493ca>] handle_level_irq+0xad/0x10b
> >   [<c0105a13>] do_IRQ+0x93/0xd0
> >   [<c010441e>] common_interrupt+0x2e/0x34
> ...
> > other info that might help us debug this:
> > 1 lock held by ifconfig/5492:
> >  #0:  (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
> >
> > stack backtrace:
> ...
> >  [<c0452ff3>] _spin_lock+0x35/0x42
> >  [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> >  [<c01480fd>] free_irq+0x11b/0x146
> >  [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
> >  [<c03bde63>] dev_close+0x57/0x74
> ...
>
> It looks like this was possible after David's fix, which really
> enabled running of the handler in free_irq, but before Andrew's patch
> disabling local irqs for this time.
>
> So, this bug should be fixed, but IMHO similar problem is possible in
> request_irq. And, I think, this is not only about lockdep complaining,
> but real lockup possibility, because any locks in such a handler are
> taken in another, not expected for them context, and could be
> vulnerable (especially with softirqs, but probably hardirqs as well).
>
> Reported-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
> Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
>
> ---
>
> diff -Nurp 2.6.23-rc3-mm1-/kernel/irq/manage.c 2.6.23-rc3-mm1/kernel/irq/manage.c
> --- 2.6.23-rc3-mm1-/kernel/irq/manage.c	2007-08-22 13:58:58.0 +0200
> +++ 2.6.23-rc3-mm1/kernel/irq/manage.c	2007-08-22 14:12:21.0 +0200
> @@ -546,14 +546,11 @@ int request_irq(unsigned int irq, irq_ha
>  		 * We do this before actually registering it, to make sure that
>  		 * a 'real' IRQ doesn't run in parallel with our fake
>  		 */
> -		if (irqflags & IRQF_DISABLED) {
> -			unsigned long flags;
> +		unsigned long flags;
>
> -			local_irq_save(flags);
> -			handler(irq, dev_id);
> -			local_irq_restore(flags);
> -		} else
> -			handler(irq, dev_id);
> +		local_irq_save(flags);
> +		handler(irq, dev_id);
> +		local_irq_restore(flags);
>  	}
>  #endif

I tested your patch and it still happens. Dmesg info from patched kernel
attached. I couldn't reproduce that on 2.6.23-rc3-mm1 - but on
2.6.23-rc2-mm2 it is easily reproducible.

If you need more info, test some patches, etc. - just mail me.

Pozdrawiam,

	Mariusz

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.23-rc2-mm2 #2
---------------------------------------------------------
runscript.sh/5065 just changed the state of lock:
 (_xmit_ETHER){-+..}, at: [<c03cb659>] dev_watchdog+0x17/0xcc
but this lock took another, soft-irq-unsafe lock in the past:
 (&tp->lock){--..}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
1 lock held by runscript.sh/5065:
 #0:  (&mm->mmap_sem){}, at: [<c0454569>] do_page_fault+0x159/0x6f0

the first lock's dependencies:
-> (_xmit_ETHER){-+..} ops: 21 {
   initial-use  at:
      [<c0138ea9>] __lock_acquire+0x217/0x11ac
      [<c0139ed7>] lock_acquire+0x99/0xb2
      [<c045281a>] _spin_lock_bh+0x3a/0x47
      [<c03bc096>] dev_set_rx_mode+0x14/0x3b
      [<c03bc59f>] dev_change_flags+0x68/0x190
      [<c03fcb4c>] devinet_ioctl+0x4af/0x652
      [<c03fd432>] inet_ioctl+0x56/0x71
      [<c03b151a>] sock_ioctl+0xa5/0x1d4
      [<c0178a42>] do_ioctl+0x22/0x71
      [<c0178ae6>] vfs_ioctl+0x55/0x29e
      [<c0178d62>] sys_ioctl+0x33/0x69
      [<c01041da>] sysenter_past_esp+0x5f/0x99
      [] 0x
   in-softirq-W at:
      [<c0139384>] __lock_acquire+0x6f2/0x11ac
      [<c0139ed7>] lock_acquire+0x99/0xb2
      [<c04527d3>] _spin_lock+0x35/0x42
      [<c03cb659>] dev_watchdog+0x17/0xcc
      [<c01224b7>] run_timer_softirq+0x14b/0x1a9
      [<c011ecc2>] __do_softirq+0x5b/0xb2
      [<c011ed66>] do_softirq+0x4d/0x4f
      [<c011f04b>] irq_exit+0x48/0x4a
      [<c01058f8>] do_IRQ+0x98/0xd0
      [<c010444e>] common_interrupt+0x2e/0x34
Re: 2.6.23-rc2-mm1: irq lock inversion dependency detected
On Fri, Aug 24, 2007 at 10:27:25AM +0200, Jarek Poplawski wrote:
> On 10-08-2007 09:06, Mariusz Kozlowski wrote:
...
> > =========================================================
> > [ INFO: possible irq lock inversion dependency detected ]
> > 2.6.23-rc2-mm1 #7
> > ---------------------------------------------------------
> > runscript.sh/5843 just changed the state of lock:
> >  (_xmit_ETHER){-+..}, at: [<c03cbe79>] dev_watchdog+0x17/0xcc
> > but this lock took another, soft-irq-unsafe lock in the past:
> >  (&tp->lock){--..}
> >
> > and interrupts could create inverse lock ordering between them.
> ...
> > Really no idea who to CC here ;)
>
> IMHO, this should be fixed by last changes to free_irq & request_irq.
> (Seems to be possible only with CONFIG_DEBUG_SHIRQ?) Otherwise I can
> be CC-ed - my pleasure!

OOPS! But, since it's about inversion - not state - there should be
no connection... Anyway, if this returns currently (and if _SHIRQ only)
I'm interested.

Jarek P.
Re: 2.6.23-rc2-mm1: irq lock inversion dependency detected
On 10-08-2007 09:06, Mariusz Kozlowski wrote:
> Hello,
>
> And the winner of today is ...
>
> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.23-rc2-mm1 #7
> ---------------------------------------------------------
> runscript.sh/5843 just changed the state of lock:
>  (_xmit_ETHER){-+..}, at: [<c03cbe79>] dev_watchdog+0x17/0xcc
> but this lock took another, soft-irq-unsafe lock in the past:
>  (&tp->lock){--..}
>
> and interrupts could create inverse lock ordering between them.
...
> Really no idea who to CC here ;)

IMHO, this should be fixed by the last changes to free_irq & request_irq.
(Seems to be possible only with CONFIG_DEBUG_SHIRQ?) Otherwise I can
be CC-ed - my pleasure!

Cheers,
Jarek P.
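For readers unfamiliar with this second kind of report, the rule behind "this lock took another, soft-irq-unsafe lock in the past" can be sketched as follows. This is a simplified model under assumptions, not lockdep's implementation: struct lock_class here and dependency_is_inversion() are hypothetical. A lock that is also taken in softirq context (softirq-safe) must not be held while acquiring a lock that is elsewhere taken with softirqs enabled (softirq-unsafe), because a softirq arriving on the CPU that holds the inner lock could then wait on the outer one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of a lock class. */
struct lock_class {
	const char *name;
	bool softirq_safe;   /* has been taken from softirq context */
	bool softirq_unsafe; /* has been taken with softirqs enabled */
};

/*
 * Check one recorded dependency "inner was taken while outer was held".
 * Returns true if the pair allows an inverse lock ordering via a softirq.
 */
bool dependency_is_inversion(const struct lock_class *outer,
			     const struct lock_class *inner)
{
	assert(outer != NULL && inner != NULL);

	return outer->softirq_safe && inner->softirq_unsafe;
}
```

In the report, _xmit_ETHER plays the role of the outer lock (it became softirq-safe when dev_watchdog() took it from the timer softirq) and tp->lock is the inner, soft-irq-unsafe one.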
Re: [PATCH (take 2)] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state
On Thu, Aug 23, 2007 at 10:44:30AM +0200, Jarek Poplawski wrote:
> Andrew Morton pointed out that my changelog was unusable. Sorry!
> Here is a second try with the changelog and kernel version changed.
...
> >(take 2)
>
> Subject: request_irq() - fix DEBUG_SHIRQ handling
...
> Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
>
> ---
>
> diff -Nurp 2.6.23-rc3-git6-/kernel/irq/manage.c 2.6.23-rc3-git6/kernel/irq/manage.c
> --- 2.6.23-rc3-git6-/kernel/irq/manage.c	2007-08-23 10:11:35.0 +0200
> +++ 2.6.23-rc3-git6/kernel/irq/manage.c	2007-08-23 10:16:29.0 +0200

So, this time I f-ed the diff part: it's not exactly against
2.6.23-rc-git6. But, it's Andrew to blame: he should've known that
some old & slow chips can't do science and poetry at the same time.
Sorry (for him)!

Anyway, beside an offset, should be OK...

Regards,
Jarek P.
[PATCH (take 2)] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state
Andrew Morton pointed out that my changelog was unusable. Sorry!
Here is a second try with the changelog and kernel version changed.

Regards,
Jarek P.

>(take 2)

Subject: request_irq() - fix DEBUG_SHIRQ handling

Mariusz Kozlowski reported lockdep's warning:

> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc2-mm1 #7
> ---------------------------------
> inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
>  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> {in-hardirq-W} state was registered at:
>   [<c0138eeb>] __lock_acquire+0x949/0x11ac
>   [<c01397e7>] lock_acquire+0x99/0xb2
>   [<c0452ff3>] _spin_lock+0x35/0x42
>   [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>   [<c0147a5d>] handle_IRQ_event+0x28/0x59
>   [<c01493ca>] handle_level_irq+0xad/0x10b
>   [<c0105a13>] do_IRQ+0x93/0xd0
>   [<c010441e>] common_interrupt+0x2e/0x34
...
> other info that might help us debug this:
> 1 lock held by ifconfig/5492:
>  #0:  (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
>
> stack backtrace:
...
>  [<c0452ff3>] _spin_lock+0x35/0x42
>  [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>  [<c01480fd>] free_irq+0x11b/0x146
>  [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
>  [<c03bde63>] dev_close+0x57/0x74
...

This shows that a driver's irq handler was running both in hard interrupt
context and in process context with irqs enabled. The latter happened
during the free_irq() call and was possible only with CONFIG_DEBUG_SHIRQ
enabled. This was fixed by another patch.

But a similar problem is possible with request_irq(): any locks taken
from the irq handler could be vulnerable - especially with soft
interrupts. This patch fixes it by disabling local interrupts while the
handler runs. (It seems disabling softirqs should be enough, but that
needs more checking for possible races or other special cases.)

This patch is recommended for all stable versions since 2.6.21, too.

Reported-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.23-rc3-git6-/kernel/irq/manage.c 2.6.23-rc3-git6/kernel/irq/manage.c
--- 2.6.23-rc3-git6-/kernel/irq/manage.c	2007-08-23 10:11:35.0 +0200
+++ 2.6.23-rc3-git6/kernel/irq/manage.c	2007-08-23 10:16:29.0 +0200
@@ -555,14 +555,11 @@ int request_irq(unsigned int irq, irq_ha
 		 * We do this before actually registering it, to make sure that
 		 * a 'real' IRQ doesn't run in parallel with our fake
 		 */
-		if (irqflags & IRQF_DISABLED) {
-			unsigned long flags;
+		unsigned long flags;

-			local_irq_save(flags);
-			handler(irq, dev_id);
-			local_irq_restore(flags);
-		} else
-			handler(irq, dev_id);
+		local_irq_save(flags);
+		handler(irq, dev_id);
+		local_irq_restore(flags);
 	}
 #endif
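The effect of the patch can be modelled in plain userspace C, as a sketch only. Everything prefixed sim_ is a made-up stand-in (a global flag instead of the CPU interrupt flag, functions instead of the kernel's local_irq_save()/local_irq_restore() macros); none of it is kernel API. What it shows is the shape of the fix: the fake call to the handler is always wrapped in save/disable and restore, regardless of IRQF_DISABLED, so the handler never sees irqs enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the local CPU's interrupt-enable flag (assumption). */
bool sim_irqs_enabled = true;

/* Save the current state and disable "interrupts". */
unsigned long sim_local_irq_save(void)
{
	unsigned long flags = sim_irqs_enabled ? 1UL : 0UL;

	sim_irqs_enabled = false;
	return flags;
}

/* Restore the state saved by sim_local_irq_save(). */
void sim_local_irq_restore(unsigned long flags)
{
	sim_irqs_enabled = (flags != 0);
}

typedef int (*sim_handler_t)(int irq, void *dev_id);

/* Unconditionally run the handler with "interrupts" off, as in the patch. */
int sim_fake_irq(sim_handler_t handler, int irq, void *dev_id)
{
	unsigned long flags;
	int ret;

	assert(handler != NULL);

	flags = sim_local_irq_save();
	ret = handler(irq, dev_id);
	sim_local_irq_restore(flags);
	return ret;
}

/* Example handler: records whether "interrupts" were enabled when it ran. */
int sim_probe_handler(int irq, void *dev_id)
{
	(void)irq;
	*(bool *)dev_id = sim_irqs_enabled;
	return 1;
}
```

Calling sim_fake_irq(sim_probe_handler, 5, &seen) leaves seen false and the flag restored to its previous value afterwards, which is the property the handler's spinlocks rely on.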
[PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state
On 10-08-2007 01:49, Mariusz Kozlowski wrote:
> Hello,
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc2-mm1 #7
> ---------------------------------
> inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
>  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> {in-hardirq-W} state was registered at:
>   [<c0138eeb>] __lock_acquire+0x949/0x11ac
>   [<c01397e7>] lock_acquire+0x99/0xb2
>   [<c0452ff3>] _spin_lock+0x35/0x42
>   [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>   [<c0147a5d>] handle_IRQ_event+0x28/0x59
>   [<c01493ca>] handle_level_irq+0xad/0x10b
>   [<c0105a13>] do_IRQ+0x93/0xd0
>   [<c010441e>] common_interrupt+0x2e/0x34
...
> other info that might help us debug this:
> 1 lock held by ifconfig/5492:
>  #0:  (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
>
> stack backtrace:
...
>  [<c0452ff3>] _spin_lock+0x35/0x42
>  [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>  [<c01480fd>] free_irq+0x11b/0x146
>  [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
>  [<c03bde63>] dev_close+0x57/0x74
...

It looks like this was possible after David's fix, which really
enabled running of the handler in free_irq, but before Andrew's patch
disabling local irqs for this time.

So, this bug should be fixed, but IMHO a similar problem is possible in
request_irq. And, I think, this is not only about lockdep complaining,
but a real lockup possibility, because any locks in such a handler are
taken in another context, not expected for them, and could be
vulnerable (especially with softirqs, but probably hardirqs as well).

Reported-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.23-rc3-mm1-/kernel/irq/manage.c 2.6.23-rc3-mm1/kernel/irq/manage.c
--- 2.6.23-rc3-mm1-/kernel/irq/manage.c	2007-08-22 13:58:58.0 +0200
+++ 2.6.23-rc3-mm1/kernel/irq/manage.c	2007-08-22 14:12:21.0 +0200
@@ -546,14 +546,11 @@ int request_irq(unsigned int irq, irq_ha
 		 * We do this before actually registering it, to make sure that
 		 * a 'real' IRQ doesn't run in parallel with our fake
 		 */
-		if (irqflags & IRQF_DISABLED) {
-			unsigned long flags;
+		unsigned long flags;

-			local_irq_save(flags);
-			handler(irq, dev_id);
-			local_irq_restore(flags);
-		} else
-			handler(irq, dev_id);
+		local_irq_save(flags);
+		handler(irq, dev_id);
+		local_irq_restore(flags);
 	}
 #endif
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Sat, Aug 11, 2007 at 08:09:09PM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <[EMAIL PROTECTED]> wrote:
>
> > Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> > Compiles, but not yet tested.
> >
> > Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
>
> > --- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
> > +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
> > @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
> >  	return now;
> >  }
> >
> > +EXPORT_SYMBOL_GPL(cpu_clock);
>
> sure enough,
>
> Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Thank you!

Just for the record, given that the xtime API that it replaces was
EXPORT_SYMBOL(), I would have no objection to this also being
EXPORT_SYMBOL(). That said, I know of no specific reason for it being
other than EXPORT_SYMBOL_GPL().

						Thanx, Paul
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
* Paul E. McKenney <[EMAIL PROTECTED]> wrote:

> Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> Compiles, but not yet tested.
>
> Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>

> --- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
> @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
>  	return now;
>  }
>
> +EXPORT_SYMBOL_GPL(cpu_clock);

sure enough,

Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

	Ingo
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
On Fri, Aug 10, 2007 at 12:55:17AM -0700, Andrew Morton wrote:
> On Fri, 10 Aug 2007 09:40:00 +0200 Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> >
> > * Andrew Morton <[EMAIL PROTECTED]> wrote:
> >
> > > We seem to have made a mess in there. timer_list_show() ends up
> > > calling lookup_module_symbol_name(), which takes a mutex. However
> > > print_symbol() (which is called at oops time, interrupt time, etc)
> > > calls module_address_lookup(), which is basically the same, only it
> > > doesn't take the mutex.
> >
> > hm, current upstream does:
> >
> >  static void print_name_offset(struct seq_file *m, void *sym)
> >  {
> >          char symname[KSYM_NAME_LEN];
> >
> >          if (lookup_symbol_name((unsigned long)sym, symname) < 0)
> >
> > why was that changed?
>
> It wasn't.

Oh no, it was!

commit 9d65cb4a1718a072898c7a57a3bc61b2dc4bcd4d
Fix race between cat /proc/*/wchan and rmmod et al

	kallsyms_lookup() can go iterating over modules list unprotected
	which is OK for emergency situations (oops), but not OK for
	regular stuff like /proc/*/wchan.

> lookup_symbol_name() calls lookup_module_symbol_name() which
> calls mutex_lock().
>
> > I think symbol lookups for debug purposes have to
> > be lockless, fundamentally.
> >
>
> Sure, especially a sysrq thingy.

I imagine a user running powertop, which IIRC trolls /proc/timer_list,
and doing rmmod following powertop's instructions.

> It's a bit nasty to just go in there and start walking data structures
> without holding the needed lock though.

Yep.
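The trade-off discussed above (a locked lookup for regular callers such as /proc readers, a lockless one for oops or interrupt time) might be pictured with the following userspace sketch. All names are hypothetical and the "mutex" is only indicated by comments; the counter stands in for the "sleeping function called from invalid context" warning that fires when the locked variant is used where sleeping is not allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical symbol table standing in for the module list. */
struct sim_symbol {
	unsigned long addr;
	const char *name;
};

static const struct sim_symbol sim_symbols[] = {
	{ 0x1000, "alpha" },
	{ 0x2000, "beta" },
};

bool sim_in_atomic = false;   /* true while "in interrupt or oops context" */
int sim_sleep_violations = 0; /* locked lookups attempted in atomic context */

/* Shared table walk: copies the name into symname, returns 0 or -1. */
static int sim_scan(unsigned long addr, char *symname, size_t len)
{
	size_t i;

	assert(symname != NULL && len > 0);

	for (i = 0; i < sizeof(sim_symbols) / sizeof(sim_symbols[0]); i++) {
		if (sim_symbols[i].addr == addr) {
			snprintf(symname, len, "%s", sim_symbols[i].name);
			return 0;
		}
	}
	return -1;
}

/* For regular callers: would sleep on a mutex, so unsafe in atomic context. */
int sim_lookup_locked(unsigned long addr, char *symname, size_t len)
{
	int ret;

	if (sim_in_atomic)
		sim_sleep_violations++;
	/* a real implementation would take the mutex here ... */
	ret = sim_scan(addr, symname, len);
	/* ... and release it here */
	return ret;
}

/* For emergency paths: no lock, so it may race with a concurrent removal. */
int sim_lookup_lockless(unsigned long addr, char *symname, size_t len)
{
	return sim_scan(addr, symname, len);
}
```

The sketch does not resolve the disagreement in the thread; it only shows why neither variant alone fits both kinds of caller.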
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Sat, Aug 11, 2007 at 08:09:09PM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <[EMAIL PROTECTED]> wrote:
>
> > Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> > Compiles, but not yet tested.
> >
> > Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
>
> > --- linux-2.6.23-rc2/kernel/sched.c 2007-08-03 19:49:55.0 -0700
> > +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c 2007-08-10 17:22:57.0 -0700
> > @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
> >  	return now;
> >  }
> >
> > +EXPORT_SYMBOL_GPL(cpu_clock);
>
> sure enough,
>
> Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Thank you!

Just for the record, given that the xtime API that it replaces was
EXPORT_SYMBOL(), I would have no objection to this also being
EXPORT_SYMBOL(). That said, I know of no specific reason for it being
other than EXPORT_SYMBOL_GPL().

Thanx, Paul
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 05:29:49PM -0700, Paul E. McKenney wrote:
>
> Errmmm... No joy.
>
> ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!
>
> Turns out that cpu_clock also ain't exported, and rcutorture.c is
> a module. Would adding an EXPORT_SYMBOL_GPL() as in the patch below
> be acceptable?

Except that the old xtime symbol was EXPORT_SYMBOL() rather than my
proposed EXPORT_SYMBOL_GPL() for the equivalent new cpu_clock(). Sigh!!!

I will leave this one for others to sort out. Andrew, please consider
this patch withdrawn and apply the version that does not rely on time
for entropy. Please let me know if you would like me to resend it.

Thanx, Paul

> If not, I have a tested patch to rcutorture.c that leverages statistical
> counters. Your choice.
>
> Thanx, Paul
>
> Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> Compiles, but not yet tested.
>
> Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
> ---
>
>  rcutorture.c |    8 ++--
>  sched.c      |    2 ++
>  2 files changed, 4 insertions(+), 6 deletions(-)
>
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
> --- linux-2.6.23-rc2/kernel/rcutorture.c 2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c 2007-08-10 17:15:22.0 -0700
> @@ -42,7 +42,6 @@
>  #include
>  #include
>  #include
> -#include
>  #include
>  #include
>  #include
> @@ -166,16 +165,13 @@ struct rcu_random_state {
>
>  /*
>   * Crude but fast random-number generator. Uses a linear congruential
> - * generator, with occasional help from get_random_bytes().
> + * generator, with occasional help from cpu_clock().
>   */
>  static unsigned long
>  rcu_random(struct rcu_random_state *rrsp)
>  {
> -	long refresh;
> -
>  	if (--rrsp->rrs_count < 0) {
> -		get_random_bytes(&refresh, sizeof(refresh));
> -		rrsp->rrs_state += refresh;
> +		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
>  		rrsp->rrs_count = RCU_RANDOM_REFRESH;
>  	}
>  	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
> --- linux-2.6.23-rc2/kernel/sched.c 2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c 2007-08-10 17:22:57.0 -0700
> @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
>  	return now;
>  }
>
> +EXPORT_SYMBOL_GPL(cpu_clock);
> +
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  /* Change a task's ->cfs_rq if it moves across CPUs */
>  static inline void set_task_cfs_rq(struct task_struct *p)
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 01:30:55PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
> > On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote:
> >
> > > > One used to use sched_clock() for this, then get frowned at. Now we
> > > > have cpu_clock()...
> > >
> > > Hmmm... And cpu_clock() is not in 2.6.22, so must appear in some later
> > > release. Which means that the rate of API change in this area is a
> > > bit high, so I should avoid it like the plague.
> >
> > eh, it's been there for weeks. It is dust-encrusted.
> >
> > > Therefore, I should
> > > look for some other convenient source of entropy.
> > >
> > > One convenient source would be the per-CPU statistics that rcutorture
> > > maintains. Of course, a given CPU's RNG is nearly in lock-step with
> > > its own statistics, but not with the adjacent CPU's statistics...
> > >
> > > I will send a patch.
> >
> > Please use cpu_clock(). It ain't going away.
>
> D'accord...

Errmmm... No joy.

ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!

Turns out that cpu_clock also ain't exported, and rcutorture.c is
a module. Would adding an EXPORT_SYMBOL_GPL() as in the patch below
be acceptable?

If not, I have a tested patch to rcutorture.c that leverages statistical
counters. Your choice.

Thanx, Paul

Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
Compiles, but not yet tested.

Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
---

 rcutorture.c |    8 ++--
 sched.c      |    2 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.23-rc2/kernel/rcutorture.c 2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c 2007-08-10 17:15:22.0 -0700
@@ -42,7 +42,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -166,16 +165,13 @@ struct rcu_random_state {

 /*
  * Crude but fast random-number generator. Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from cpu_clock().
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-	long refresh;
-
 	if (--rrsp->rrs_count < 0) {
-		get_random_bytes(&refresh, sizeof(refresh));
-		rrsp->rrs_state += refresh;
+		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
 		rrsp->rrs_count = RCU_RANDOM_REFRESH;
 	}
 	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
--- linux-2.6.23-rc2/kernel/sched.c 2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c 2007-08-10 17:22:57.0 -0700
@@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
 	return now;
 }

+EXPORT_SYMBOL_GPL(cpu_clock);
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /* Change a task's ->cfs_rq if it moves across CPUs */
 static inline void set_task_cfs_rq(struct task_struct *p)
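The generator that the patch above modifies is a linear congruential generator whose state is perturbed every so often from an outside source. A minimal user-space sketch of that shape follows; it is not the rcutorture code. The names lcg_state, lcg_init(), lcg_next(), fixed_perturb() and zero_perturb() are hypothetical, the constants are illustrative rather than taken from the patch, and the perturb callback merely stands in for cpu_clock(smp_processor_id()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; the patch above does not show the real values. */
#define LCG_MULT    39916801UL
#define LCG_ADD     479001599UL
#define LCG_REFRESH 4	/* steps between perturbations */

/* Hypothetical stand-in for cpu_clock(smp_processor_id()). */
typedef uint64_t (*perturb_fn)(void);

struct lcg_state {
	unsigned long state;	/* current generator state */
	long count;		/* steps left before the next perturbation */
	perturb_fn perturb;	/* outside source mixed into the state */
};

static void lcg_init(struct lcg_state *s, unsigned long seed, perturb_fn perturb)
{
	assert(s != NULL && perturb != NULL);
	s->state = seed;
	s->count = LCG_REFRESH;
	s->perturb = perturb;
}

/* One step: occasionally add the outside value, then apply the LCG update. */
static unsigned long lcg_next(struct lcg_state *s)
{
	assert(s != NULL);
	if (--s->count < 0) {
		s->state += (unsigned long)s->perturb();
		s->count = LCG_REFRESH;
	}
	s->state = s->state * LCG_MULT + LCG_ADD;
	return s->state;
}

/* Deterministic sources, so that the behaviour can be checked. */
static uint64_t fixed_perturb(void) { return 12345; }
static uint64_t zero_perturb(void) { return 0; }
```

Two generators seeded identically stay in lock-step until the first perturbation and diverge afterwards if their outside sources differ. That is the only property this use needs: the perturbation does not have to be high-quality entropy, it only has to differ between CPUs.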
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote: > On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> > wrote: > > > > One used to use sched_clock() for this, then get frowned at. Now we > > > have cpu_clock()... > > > > Hmmm... And cpu_clock() is not in 2.6.22, so must appear in some later > > release. Which means that the rate of API change in this area is a > > bit high, so I should avoid it like the plague. > > eh, it's been there for weeks. It is dust-encrusted. > > > Therefore, I should > > look for some other convenient source of entropy. > > > > One convenient source would the per-CPU statistics that rcutorture > > maintains. Of course, a given CPU's RNG is nearly in lock-step with > > its own statistics, but not with the adjacent CPU's statistics... > > > > I will send a patch. > > Please use cpu_clock(). It ain't going away. D'accord... Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote: > > One used to use sched_clock() for this, then get frowned at. Now we > > have cpu_clock()... > > Hmmm... And cpu_clock() is not in 2.6.22, so must appear in some later > release. Which means that the rate of API change in this area is a > bit high, so I should avoid it like the plague. eh, it's been there for weeks. It is dust-encrusted. > Therefore, I should > look for some other convenient source of entropy. > > One convenient source would the per-CPU statistics that rcutorture > maintains. Of course, a given CPU's RNG is nearly in lock-step with > its own statistics, but not with the adjacent CPU's statistics... > > I will send a patch. Please use cpu_clock(). It ain't going away. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ___cpu_physical_id___
On Fri, 10 Aug 2007 15:27:42 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Thursday 09 August 2007 20:52:58 Andrew Morton wrote:
> > On Thu, 9 Aug 2007 10:18:15 -0400
> > "Miles Lane" <[EMAIL PROTECTED]> wrote:
> >
> > > CC drivers/dma/ioat_dca.o
> > > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > > 'cpu_physical_id'
> >
> > Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
> >
> > Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> > supported architectures (i386, x86_64, ia64) should provide a non-SMP
> > version of cpu_physical_id(). Preferably the latter, I'd say.
>
>
> It doesn't make much sense in smp.h because there is not really
> a concept of physical id on most architectures i expect. Better
> to put it into the individual asm files.
>

I gave up and did this:

From: Andrew Morton <[EMAIL PROTECTED]>

drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

This is so screwed up.

Root cause: linux/smp.h only includes asm/smp.h if CONFIG_SMP=y. To get
at cpu_physical_id() on UP, the user must include asm/smp.h, not
linux/smp.h.

Cc: "Luck, Tony" <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Shannon Nelson <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 drivers/dma/ioat_dca.c |    3 +++
 1 file changed, 3 insertions(+)

diff -puN drivers/dma/ioat_dca.c~git-dma-up-fix drivers/dma/ioat_dca.c
--- a/drivers/dma/ioat_dca.c~git-dma-up-fix
+++ a/drivers/dma/ioat_dca.c
@@ -25,6 +25,9 @@
 #include
 #include
 #include
+
+#include <asm/smp.h>
+
 #include "ioatdma.h"
 #include "ioatdma_registers.h"
_
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, Aug 09, 2007 at 07:06:23PM -0700, Andrew Morton wrote: > On Thu, 9 Aug 2007 19:00:40 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> > wrote: > > > On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote: > > > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote: > > > >... > > > > Changes since 2.6.23-rc2-mm1: > > > >... > > > > +allow-rcutorture-to-handle-synchronize_sched.patch > > > >... > > > > 2.6.23 queue > > > >... > > > > > > All drivers were converted to no longer use xtime directly since it > > > might be quite outdated, but this patch adds a usage of xtime.tv_nsec > > > as RNG... > > > > This code doesn't care if the time is outdated, as it is simply > > periodically perturbing an RNG, but OK. > > > > So, what interface are we supposed to be using instead? I cannot use > > get_random_bytes() due to locking issues. This is not a cryptographically > > secure usage, so the perturbation does not need to be extremely high > > quality. > > > > On x86, I would just grab the low-order bits of the TSC, but all of the > > world is not an x86. ;-) > > One used to use sched_clock() for this, then get frowned at. Now we > have cpu_clock()... Hmmm... And cpu_clock() is not in 2.6.22, so must appear in some later release. Which means that the rate of API change in this area is a bit high, so I should avoid it like the plague. Therefore, I should look for some other convenient source of entropy. One convenient source would the per-CPU statistics that rcutorture maintains. Of course, a given CPU's RNG is nearly in lock-step with its own statistics, but not with the adjacent CPU's statistics... I will send a patch. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ___cpu_physical_id___
On Thursday 09 August 2007 20:52:58 Andrew Morton wrote: > On Thu, 9 Aug 2007 10:18:15 -0400 > "Miles Lane" <[EMAIL PROTECTED]> wrote: > > > CC drivers/dma/ioat_dca.o > > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag': > > drivers/dma/ioat_dca.c:177: error: implicit declaration of function > > 'cpu_physical_id' > > Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n. > > Either ioat needs to stop using cpu_physical_id() if SMP=n, or the > supported architectures (i386, x86_64, ia64) should provide a non-SMP > version of cpu_physical_id(). Preferably the latter, I'd say. It doesn't make much sense in smp.h because there is not really a concept of physical id on most architectures i expect. Better to put it into the individual asm files. -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ___cpu_physical_id___
On 8/9/07, Andrew Morton <[EMAIL PROTECTED]> wrote: > On Thu, 9 Aug 2007 10:18:15 -0400 > "Miles Lane" <[EMAIL PROTECTED]> wrote: > > > CC drivers/dma/ioat_dca.o > > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag': > > drivers/dma/ioat_dca.c:177: error: implicit declaration of function > > 'cpu_physical_id' > > Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n. > > Either ioat needs to stop using cpu_physical_id() if SMP=n, or the > supported architectures (i386, x86_64, ia64) should provide a non-SMP > version of cpu_physical_id(). Preferably the latter, I'd say. > > Something like this, I suppose... > > > From: Andrew Morton <[EMAIL PROTECTED]> > > i386, x86_64 and ia64 implement cpu_physical_id() if CONFIG_SMP=y. > > Provide a uniprocessor stub so that callers will dtrt. > > Cc: Andi Kleen <[EMAIL PROTECTED]> > Cc: "Luck, Tony" <[EMAIL PROTECTED]> > Cc: Shannon Nelson <[EMAIL PROTECTED]> > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> > --- > > include/linux/smp.h |5 + > 1 files changed, 5 insertions(+) > > diff -puN include/linux/smp.h~implement-cpu_physical_id-on-smp=n > include/linux/smp.h > --- a/include/linux/smp.h~implement-cpu_physical_id-on-smp=n > +++ a/include/linux/smp.h > @@ -108,6 +108,11 @@ static inline void smp_send_reschedule(i > 0; \ > }) > > +static inline unsigned cpu_physical_id(unsigned cpu) > +{ > + return 0; > +} > + > #endif /* !SMP */ > > /* > _ Worked for me. Thanks, Miles - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
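The uniprocessor stub quoted above follows a common header pattern: declare the real function when the feature is configured in, and supply a trivial inline otherwise, so that callers compile unchanged either way. A small user-space sketch of the pattern follows. DEMO_SMP, demo_cpu_physical_id() and demo_get_tag() are hypothetical names used only for illustration; they are not kernel symbols, and the tag computation is invented.

```c
/* DEMO_SMP is a hypothetical stand-in for CONFIG_SMP. */
#ifdef DEMO_SMP
/* Multiprocessor build: each architecture would supply the real lookup. */
unsigned demo_cpu_physical_id(unsigned cpu);
#else
/* Uniprocessor build: there is only one CPU, so its physical id is 0. */
static inline unsigned demo_cpu_physical_id(unsigned cpu)
{
	(void)cpu;	/* unused on UP */
	return 0;
}
#endif

/* A caller that needs no #ifdef of its own (invented example). */
static unsigned demo_get_tag(unsigned cpu)
{
	return demo_cpu_physical_id(cpu) & 0xff;
}
```

The caller is identical in both configurations; only the header decides which definition it sees.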
Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)
Krzysztof Helt wrote:
> On Thu, 9 Aug 2007 14:04:49 +0100
> Andy Whitcroft <[EMAIL PROTECTED]> wrote:
>
>> Seeing the following compile error on a G5 mac:
>>
>> drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
>> drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this function)
>> drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is reported only once
>> drivers/video/tdfxfb.c:1341: error: for each function it appears in.)
>>
>> This seems to be the following fragment from tdfxfb-hardware-cursor:
>>
>> +	} else if (!strcmp(this_opt, "hwcursor")) {
>> +		hwcursor = simple_strtoul(opt + 9, NULL, 0);
>>
>> I guess the naive fix would be s/opt/this_opt, but I am also
>> suspicious of the +9 here as hwcursor is only 8 long? Now this
>> seems to take a numeric value and I assume that is via hwcursor=N,
>> if so then the +9 would make sense _if_ the strcmp was against
>> "hwcursor=".
>>
>
> The patch below fixes all issues you have pointed out. It also fixes
> the description of the nomtrr option.
>
> ---
>
> From: Krzysztof Helt <[EMAIL PROTECTED]>
>
> This patch fixes compilation with setup options bug and corrects
> description of the nomtrr option.
>
> Signed-off-by: Krzysztof Helt <[EMAIL PROTECTED]>
>
> ---
>
> --- linux-2.6.22.new/drivers/video/tdfxfb.c 2007-08-09 16:11:23.870028259 +0200
> +++ linux-2.6.23/drivers/video/tdfxfb.c 2007-08-09 16:15:07.654781024 +0200
> @@ -1337,8 +1337,8 @@ static void tdfxfb_setup(char *options)
>  			nopan = 1;
>  		} else if (!strcmp(this_opt, "nowrap")) {
>  			nowrap = 1;
> -		} else if (!strcmp(this_opt, "hwcursor")) {
> -			hwcursor = simple_strtoul(opt + 9, NULL, 0);
> +		} else if (!strncmp(this_opt, "hwcursor=", 9)) {
> +			hwcursor = simple_strtoul(this_opt + 9, NULL, 0);
>  #ifdef CONFIG_MTRR
>  		} else if (!strncmp(this_opt, "nomtrr", 6)) {
>  			nomtrr = 1;
> @@ -1409,7 +1409,7 @@ MODULE_PARM_DESC(hwcursor, "Enable hardw
>  		"(1=enable, 0=disable, default=1)");
>  #ifdef CONFIG_MTRR
>  module_param(nomtrr, bool, 0);
> -MODULE_PARM_DESC(nomtrr, "Disable MTRR support (0 or 1=disabled) (default=0)");
> +MODULE_PARM_DESC(nomtrr, "Disable MTRR support (default: enabled)");
>  #endif
>
>  module_init(tdfxfb_init);

Confirmed that this gets my kernel compiled and the result boots.

Tested-by: Andy Whitcroft <[EMAIL PROTECTED]>

-apw
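The point about "+ 9" in the thread above can be checked in user space: an exact strcmp() against "hwcursor" never matches a token such as "hwcursor=1", whereas a prefix match against "hwcursor=" both matches and leaves the value at offset 9. The sketch below assumes nothing about the driver beyond what the patch shows. parse_hwcursor() is a hypothetical helper, and strtoul() stands in for the kernel's simple_strtoul().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one "hwcursor=N" token.  Returns 1 and stores N in *value when the
 * token carries the prefix, 0 otherwise (*value is then left untouched).
 * parse_hwcursor() is a hypothetical helper, not part of tdfxfb.
 */
static int parse_hwcursor(const char *this_opt, unsigned long *value)
{
	static const char prefix[] = "hwcursor=";
	const size_t len = sizeof(prefix) - 1;	/* 9, the "+ 9" in the patch */

	assert(this_opt != NULL && value != NULL);

	/* Prefix match: the token must start with "hwcursor=". */
	if (strncmp(this_opt, prefix, len) != 0)
		return 0;

	/* The number starts right after the '='; base 0 accepts 0x.. too. */
	*value = strtoul(this_opt + len, NULL, 0);
	return 1;
}
```

Deriving the length from the prefix string itself, rather than writing a literal 9, keeps the comparison length and the offset from drifting apart if the option is ever renamed.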
Re: Softlockup detected with 2.6.23-rc2-mm1
Andrew Morton wrote: On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote: I get call trace, when the file system stress is run on the 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron (processor 270) \BUG: spinlock bad magic on CPU#1, fsx-linux/19721 Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug. Those patches were removed from 2.6.23-rc2-mm2 - please test that instead. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ The Call trace is not reproducible in the 2.6.23-rc2-mm2. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected
On Fri, 2007-08-10 at 02:47 +0400, Alexey Starikovskiy wrote: > > Presumably the new debugging patches in -mm > > (workqueue-debug-flushing-deadlocks-with-lockdep.patch and > > workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have > > found a potential deadlock in ACPI. I don't have time to pick through the > > code to confirm that, but boy I'm good at adding cc's ;) > Yep, it indeed may lock up... Here is a patch to avoid it Cool. I'm impressed this stuff actually finds something :) johannes
Re: Softlockup detected with 2.6.23-rc2-mm1
On Fri, Aug 10, 2007 at 01:06:58PM +0530, Kamalesh Babulal wrote: > Andrew Morton wrote: > >On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal > ><[EMAIL PROTECTED]> wrote: > > > > > >>I get call trace, when the file system stress is run on the > >>2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron > >>(processor 270) > >> > >>\BUG: spinlock bad magic on > >>CPU#1, fsx-linux/19721 > >> > > > >Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug. Those patches > >were > >removed from 2.6.23-rc2-mm2 - please test that instead. > >- > >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > >the body of a message to [EMAIL PROTECTED] > >More majordomo info at http://vger.kernel.org/majordomo-info.html > >Please read the FAQ at http://www.tux.org/lkml/ > > > I get different call trace on AMD Opteron(tm) Processor 844 machine > , I am not sure where it is related to the same patch > > ============= > BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272] > CPU 3: > Modules linked in: > Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1 > RIP: 0010:[] [] > flush_tlb_others+0x69/0x95 Cannot be 100% sure but of the group of machines showing your original problem one showed this form. Dropping the patches indicated by Andrew seemed to fix both symptoms. So I think it is highly likely the same thing. -apw - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
On Fri, 10 Aug 2007 09:40:00 +0200 Ingo Molnar <[EMAIL PROTECTED]> wrote: > > * Andrew Morton <[EMAIL PROTECTED]> wrote: > > > We seem to have made a mess in there. timer_list_show() ends up > > calling lookup_module_symbol_name(), which takes a mutex. However > > print_symbol() (which is called at oops time, interrupt time, etc) > > calls module_address_lookup(), which is basically the same, only it > > doesn't take the mutex. > > hm, current upstream does: > > static void print_name_offset(struct seq_file *m, void *sym) > { > char symname[KSYM_NAME_LEN]; > > if (lookup_symbol_name((unsigned long)sym, symname) < 0) > > why was that changed? It wasn't. lookup_symbol_name() calls lookup_module_symbol_name() which calls mutex_lock(). > I think symbol lookups for debug purposes have to > be lockless, fundamentally. > Sure, especially a sysrq thingy. It's a bit nasty to just go in there and start walking data structures without holding the needed lock though. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
* Andrew Morton <[EMAIL PROTECTED]> wrote: > We seem to have made a mess in there. timer_list_show() ends up > calling lookup_module_symbol_name(), which takes a mutex. However > print_symbol() (which is called at oops time, interrupt time, etc) > calls module_address_lookup(), which is basically the same, only it > doesn't take the mutex. hm, current upstream does: static void print_name_offset(struct seq_file *m, void *sym) { char symname[KSYM_NAME_LEN]; if (lookup_symbol_name((unsigned long)sym, symname) < 0) why was that changed? I think symbol lookups for debug purposes have to be lockless, fundamentally. Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Softlockup detected with 2.6.23-rc2-mm1
Andrew Morton wrote: On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote: I get call trace, when the file system stress is run on the 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron (processor 270) \BUG: spinlock bad magic on CPU#1, fsx-linux/19721 Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug. Those patches were removed from 2.6.23-rc2-mm2 - please test that instead. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ I get different call trace on AMD Opteron(tm) Processor 844 machine , I am not sure where it is related to the same patch = BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272] CPU 3: Modules linked in: Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1 RIP: 0010:[] [] flush_tlb_others+0x69/0x95 RSP: :810001f15a90 EFLAGS: 0202 RAX: 0003 RBX: 810001f15ac0 RCX: 0008 RDX: 08f3 RSI: 00f3 RDI: 0002 RBP: R08: 810082f05210 R09: 802e60c1 R10: 8100815e6e70 R11: R12: 8101ffc38080 R13: 80358b47 R14: 810001f15a40 R15: 810081d73208 FS: () GS:810180724280() knlGS:f7f75b80 CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: f7e20494 CR3: 029f CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Call Trace: [] flush_tlb_page+0x8f/0x97 [] page_mkclean+0x120/0x171 [] ext3_ordered_writepage+0x13f/0x16c [] clear_page_dirty_for_io+0x52/0xba [] write_cache_pages+0x1b2/0x33a [] update_curr+0xd9/0xf8 [] __writepage+0x0/0x2a [] generic_writepages+0x1f/0x25 [] do_writepages+0x2c/0x35 [] __writeback_single_inode+0x1c9/0x346 [] try_to_del_timer_sync+0x55/0x60 [] del_timer_sync+0x12/0x1f [] update_curr+0xd9/0xf8 [] dequeue_entity+0x7d/0x92 [] generic_sync_sb_inodes+0x216/0x372 [] sync_sb_inodes+0x1d/0x1f [] writeback_inodes+0x83/0xd6 [] wb_kupdate+0xa0/0x113 [] pdflush+0x156/0x206 [] wb_kupdate+0x0/0x113 [] pdflush+0x0/0x206 [] kthread+0x44/0x6d [] 
child_rip+0xa/0x12 [] kthread+0x0/0x6d [] child_rip+0x0/0x12 Thanks Kamalesh Babulal. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
> >>This probably doesn't have great impact ;) but ... > >> To reproduce: run torture tests for RCU and then sysrq+q. > >> > >> SysRq : Show Pending Timers > >> Timer List Version: v0.3 > >> HRTIMER_MAX_CLOCK_BASES: 2 > >> now at 1764338760370 nsecs > >> > >> cpu: 0 > >> clock 0: > >> .index: 0 > >> .resolution: 1 nsecs > >> .get_time: ktime_get_real > >> .offset: 1186699025823815427 nsecs > >> active timers: > >> clock 1: > >> .index: 1 > >> .resolution: 1 nsecs > >> .get_time: ktime_get > >> .offset: 0 nsecs > >> active timers: > >> #0: <3>BUG: sleeping function called from invalid context at > >> kernel/mutex.c:86 > >> in_atomic():1, irqs_disabled():1 > >> INFO: lockdep is turned off. > >> irq event stamp: 0 > >> hardirqs last enabled at (0): [<>] 0x0 > >> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c > >> softirqs last enabled at (0): [] copy_process+0x4c6/0x144c > >> softirqs last disabled at (0): [<>] 0x0 > >> [] show_trace_log_lvl+0x1a/0x30 > >> [] show_trace+0x12/0x14 > >> [] dump_stack+0x15/0x17 > >> [] __might_sleep+0xb7/0xc9 > >> [] mutex_lock+0x15/0x1f > >> [] lookup_module_symbol_name+0x17/0xc0 > >> [] lookup_symbol_name+0x3f/0x43 > >> [] print_name_offset+0x1f/0x96 > >> [] timer_list_show+0x802/0xcbd > >> [] sysrq_timer_list_show+0xc/0xe > >> [] sysrq_handle_show_timers+0x8/0xa > >> [] __handle_sysrq+0x7b/0x115 > >> [] handle_sysrq+0x20/0x24 > >> [] kbd_event+0x3a8/0x5c7 > >> [] input_pass_event+0x8f/0x91 > >> [] input_handle_event+0x98/0x38d > >> [] input_event+0x54/0x67 > >> [] atkbd_interrupt+0x200/0x59e > >> [] serio_interrupt+0x7c/0x80 > >> [] i8042_interrupt+0x17a/0x289 > >> [] handle_IRQ_event+0x28/0x59 > >> [] handle_level_irq+0xad/0x10b > >> [] do_IRQ+0x93/0xd0 > >> [] common_interrupt+0x2e/0x34 > >> [] rcu_read_delay+0x8/0x36 [rcutorture] > >> [] rcu_torture_reader+0x6e/0x169 [rcutorture] > >> [] kthread+0x36/0x58 > >> [] kernel_thread_helper+0x7/0x1c > >> === > > > > We seem to have made a mess in there. 
timer_list_show() ends up calling > > lookup_module_symbol_name(), which takes a mutex. However print_symbol() > > (which is called at oops time, interrupt time, etc) calls > > module_address_lookup(), which is basically the same, only it doesn't take > > the mutex. > > > > I guess a quicky fix would be to switch > > kernel/time/timer_list.c:print_name_offset() from > > lookup_module_symbol_name() to module_address_lookup(). But we'd still > > have a mess in there. > > > > (adds ccs, runs away) > > I don't think rcutorture matters for this bug. Maybe not but that's the only way I could trigger it (insmod rcutorture and sysrq+q). Thanks, Mariusz - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc2-mm1: irq lock inversion dependency detected
Hello,

	And the winner of today is ...

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.23-rc2-mm1 #7
---------------------------------------------------------
runscript.sh/5843 just changed the state of lock:
 (_xmit_ETHER){-+..}, at: [<c03cbe79>] dev_watchdog+0x17/0xcc
but this lock took another, soft-irq-unsafe lock in the past:
 (&tp->lock){--..}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
no locks held by runscript.sh/5843.

the first lock's dependencies:
-> (_xmit_ETHER){-+..} ops: 21 {
   initial-use  at:
        [<c01387b9>] __lock_acquire+0x217/0x11ac
        [<c01397e7>] lock_acquire+0x99/0xb2
        [<c045303a>] _spin_lock_bh+0x3a/0x47
        [<c03bc936>] dev_set_rx_mode+0x14/0x3b
        [<c03bce3f>] dev_change_flags+0x68/0x190
        [<c03fd37c>] devinet_ioctl+0x4af/0x652
        [<c03fdc62>] inet_ioctl+0x56/0x71
        [<c03b1dba>] sock_ioctl+0xa5/0x1d4
        [<c0178b42>] do_ioctl+0x22/0x71
        [<c0178be6>] vfs_ioctl+0x55/0x29e
        [<c0178e62>] sys_ioctl+0x33/0x69
        [<c01041aa>] sysenter_past_esp+0x5f/0x99
        [] 0x
   in-softirq-W at:
        [<c0138c94>] __lock_acquire+0x6f2/0x11ac
        [<c01397e7>] lock_acquire+0x99/0xb2
        [<c0452ff3>] _spin_lock+0x35/0x42
        [<c03cbe79>] dev_watchdog+0x17/0xcc
        [<c0122587>] run_timer_softirq+0x14b/0x1a9
        [<c011ee12>] __do_softirq+0x5b/0xb2
        [<c011eeb6>] do_softirq+0x4d/0x4f
        [<c011f19b>] irq_exit+0x48/0x4a
        [<c0105a18>] do_IRQ+0x98/0xd0
        [<c010441e>] common_interrupt+0x2e/0x34
        [<c0453922>] error_code+0x6a/0x70
        [] 0x
   hardirq-on-W at:
        [<c0138ce0>] __lock_acquire+0x73e/0x11ac
        [<c01397e7>] lock_acquire+0x99/0xb2
        [<c045303a>] _spin_lock_bh+0x3a/0x47
        [<c03bc936>] dev_set_rx_mode+0x14/0x3b
        [<c03bce3f>] dev_change_flags+0x68/0x190
        [<c03fd37c>] devinet_ioctl+0x4af/0x652
        [<c03fdc62>] inet_ioctl+0x56/0x71
        [<c03b1dba>] sock_ioctl+0xa5/0x1d4
        [<c0178b42>] do_ioctl+0x22/0x71
        [<c0178be6>] vfs_ioctl+0x55/0x29e
        [<c0178e62>] sys_ioctl+0x33/0x69
        [<c01041aa>] sysenter_past_esp+0x5f/0x99
        [] 0x
 }
 ... key      at: [<c087aae8>] netdev_xmit_lock_key+0x8/0x1c0
 -> (&tp->lock){--..} ops: 44 {
    initial-use  at:
        [<c01387b9>] __lock_acquire+0x217/0x11ac
        [<c01397e7>] lock_acquire+0x99/0xb2
        [<c0452ff3>] _spin_lock+0x35/0x42
        [<de84d6e0>] rtl8139_interrupt+0x27/0x46b [8139too]
        [<c01484a2>] request_irq+0xba/0x108
        [<de84e5f6>] rtl8139_open+0x2f/0x1e2 [8139too]
        [<c03bf09d>] dev_open+0x37/0x76
        [<c03bce65>] dev_change_flags+0x8e/0x190
        [<c03fd37c>] devinet_ioctl+0x4af/0x652
        [<c03fdc62>] inet_ioctl+0x56/0x71
        [<c03b1dba>] sock_ioctl+0xa5/0x1d4
        [<c0178b42>] do_ioctl+0x22/0x71
        [<c0178be6>] vfs_ioctl+0x55/0x29e
        [<c0178e62>] sys_ioctl+0x33/0x69
        [<c01041aa>] sysenter_past_esp+0x5f/0x99
        [] 0x
    softirq-on-W at:
        [<c0138d09>] __lock_acquire+0x767/0x11ac
        [<c01397e7>] lock_acquire+0x99/0xb2
        [<c0452ff3>] _spin_lock+0x35/0x42
        [<de84d6e0>] rtl8139_interrupt+0x27/0x46b [8139too]
        [<c01480fd>] free_irq+0x11b/0x146
        [<de84ed59>] rtl8139_close+0x8a/0x14a [8139too]
        [<c03bde63>] dev_close+0x57/0x74
        [<c03bce65>] dev_change_flags+0x8e/0x190
        [<c03fd37c>] devinet_ioctl+0x4af/0x652
        [<c03fdc62>] inet_ioctl+0x56/0x71
        [<c03b1dba>] sock_ioctl+0xa5/0x1d4
        [<c0178b42>] do_ioctl+0x22/0x71
        [<c0178be6>] vfs_ioctl+0x55/0x29e
        [<c0178e62>] sys_ioctl+0x33/0x69
        [<c01041aa>] sysenter_past_esp+0x5f/0x99
        [] 0x
    hardirq-on-W at:
        [<c0138ce0>] __lock_acquire+0x73e/0x11ac
        [<c01397e7>] lock_acquire+0x99/0xb2
        [<c0452ff3>] _spin_lock+0x35/0x42
        [<de84d6e0>] rtl8139_interrupt+0x27/0x46b [8139too]
        [<c01480fd>] free_irq+0x11b/0x146
        [<de84ed59>] rtl8139_close+0x8a/0x14a [8139too]
        [<c03bde63>] dev_close+0x57/0x74
        [<c03bce65>] dev_change_flags+0x8e/0x190
        [<c03fd37c>] devinet_ioctl+0x4af/0x652
        [] inet_ioc
kernel BUG at mm/swap_state.c:78 with the 2.6.23-rc2-mm1
Hi,

I got the following kernel Bug on the 2.6.23-rc2-mm1 kernel on a Dual Core
AMD Opteron (processor 270), while testing the LTP runall

kernel BUG at mm/swap_state.c:78!
invalid opcode: [1] SMP
CPU 0
Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp
parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button
Pid: 262, comm: kprefetchd Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[8027c443]  [8027c443] __add_to_swap_cache+0x12/0xa6
RSP: 0018:81000299bea0  EFLAGS: 00010246
RAX:  RBX: 81003f3baec0 RCX: 8100048c33b0
RDX: 00d0 RSI: 0001 RDI: 00d0
RBP: 81003f3baec0 R08: 810001423f14 R09: bb27
R10:  R11: 0001 R12: 0001
R13: 0002 R14: 0001 R15: 8100048c33b0
FS:  2b941f4a40f0() GS:8067() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 005b9db0 CR3: 04ed7000 CR4: 06e0
DR0:  DR1:  DR2:
DR3:  DR6: 0ff0 DR7: 0400
Process kprefetchd (pid: 262, threadinfo 81000299a000, task 810001c98040)
Stack:  0001 81003f3baec0 8027c50d 0002 81003f3baec0 8027f318 81000299bf20
Call Trace:
 [8027c50d] add_to_swap_cache+0x36/0x5f
 [8027f318] kprefetchd+0x248/0x40c
 [8027f0d0] kprefetchd+0x0/0x40c
 [80248360] kthread+0x47/0x73
 [8020ca78] child_rip+0xa/0x12
 [80248319] kthread+0x0/0x73
 [8020ca6e] child_rip+0x0/0x12

Code: 0f 0b eb fe 8b 03 66 85 c0 79 04 0f 0b eb fe 8b 03 f6 c4 08

Thanks,
Kamalesh Babulal.
Re: Softlockup detected with 2.6.23-rc2-mm1
On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> I get call trace, when the file system stress is run on the
> 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
> (processor 270)
>
> \BUG: spinlock bad magic on
> CPU#1, fsx-linux/19721

Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches
were removed from 2.6.23-rc2-mm2 - please test that instead.
Softlockup detected with 2.6.23-rc2-mm1
Hi, I get call trace, when the file system stress is run on the 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron (processor 270) \BUG: spinlock bad magic on CPU#1, fsx-linux/19721 lock: 8100028cef48, .magic: , .owner: /-1, .owner_cpu: 0 Call Trace: [] _raw_spin_lock+0x22/0xf6 [] _spin_lock_irqsave+0x9/0xe [] prop_norm_single+0x40/0x9a [] set_page_dirty+0x8d/0xc9 [] set_page_dirty_balance+0x9/0x39 [] __do_fault+0x37a/0x395 [] handle_mm_fault+0x342/0x6c3 [] do_page_fault+0x3e5/0x7ab [] arch_get_unmapped_area+0x184/0x1f9 [] _spin_lock_irqsave+0x9/0xe [] __up_write+0x21/0x10d [] error_exit+0x0/0x84 BUG: spinlock lockup on CPU#1, fsx-linux/19721, 8100028cef48 Call Trace: [] _raw_spin_lock+0xcf/0xf6 [] _spin_lock_irqsave+0x9/0xe [] prop_norm_single+0x40/0x9a [] set_page_dirty+0x8d/0xc9 [] set_page_dirty_balance+0x9/0x39 [] __do_fault+0x37a/0x395 [] handle_mm_fault+0x342/0x6c3 [] do_page_fault+0x3e5/0x7ab [] arch_get_unmapped_area+0x184/0x1f9 [] _spin_lock_irqsave+0x9/0xe [] __up_write+0x21/0x10d [] error_exit+0x0/0x84 BUG: soft lockup - CPU#2 stuck for 11s! 
[events/2:17] CPU 2: Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button Pid: 17, comm: events/2 Not tainted 2.6.23-rc2-mm1-autokern1 #1 RIP: 0010:[] [] __smp_call_function+0x63/0x84 RSP: 0018:810001727e00 EFLAGS: 0297 RAX: 08fc RBX: 0003 RCX: RDX: 08fc RSI: 810001727de0 RDI: 00fc RBP: 0246 R08: 0003 R09: 0005 R10: 0010 R11: 0246 R12: 0400 R13: 0400 R14: R15: 81000102d980 FS: 2b8de51be6f0() GS:8100016123c0() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 0036d4b938a0 CR3: 1659f000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Call Trace: [] mcheck_check_cpu+0x0/0x30 [] smp_call_function+0x32/0x49 [] mcheck_check_cpu+0x0/0x30 [] on_each_cpu+0x10/0x22 [] mcheck_timer+0x0/0x7c [] mcheck_timer+0x1d/0x7c [] _spin_unlock_irq+0x9/0xc [] run_workqueue+0x8d/0x11a [] worker_thread+0x0/0xe4 [] worker_thread+0xda/0xe4 [] autoremove_wake_function+0x0/0x2e [] kthread+0x47/0x73 [] child_rip+0xa/0x12 [] kthread+0x0/0x73 [] child_rip+0x0/0x12 BUG: soft lockup - CPU#2 stuck for 11s! 
[events/2:17] CPU 2: Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button Pid: 17, comm: events/2 Not tainted 2.6.23-rc2-mm1-autokern1 #1 RIP: 0010:[] [] __smp_call_function+0x63/0x84 RSP: 0018:810001727e00 EFLAGS: 0297 RAX: 08fc RBX: 0003 RCX: RDX: 08fc RSI: 810001727de0 RDI: 00fc RBP: 0246 R08: 0003 R09: 0005 R10: 0010 R11: 0246 R12: 0400 R13: 0400 R14: R15: 81000102d980 FS: 2b8de51be6f0() GS:8100016123c0() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 0036d4b938a0 CR3: 1659f000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Call Trace: [] mcheck_check_cpu+0x0/0x30 [] smp_call_function+0x32/0x49 [] mcheck_check_cpu+0x0/0x30 [] on_each_cpu+0x10/0x22 [] mcheck_timer+0x0/0x7c [] mcheck_timer+0x1d/0x7c [] _spin_unlock_irq+0x9/0xc [] run_workqueue+0x8d/0x11a [] worker_thread+0x0/0xe4 [] worker_thread+0xda/0xe4 [] autoremove_wake_function+0x0/0x2e [] kthread+0x47/0x73 [] child_rip+0xa/0x12 [] kthread+0x0/0x73 [] child_rip+0x0/0x12 BUG: soft lockup - CPU#2 stuck for 11s! [events/2:17] CPU 2: Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button Pid: 17, comm: events/2 Not tainted 2.6.23-rc2-mm1-autokern1 #1 RIP: 0010:[] [] __smp_call_function+0x63/0x84 RSP: 0018:810001727e00 EFLAGS: 0297 RAX: 08fc RBX: 0003 RCX: RDX: 08fc RSI: 810001727de0 RDI: 00fc RBP: 0246 R08: 0003 R09: 0005 R10: 0010 R11: 0246 R12: 0400 R13: 0400 R14: R15: 81000102d980 FS: 2b8de51be6f0() GS:8100016123c0() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 0036d4b938a0 CR3: 00201000 CR4: 06e0 DR0: DR1:
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'
On Fri, 10 Aug 2007 15:27:42 +0200 Andi Kleen [EMAIL PROTECTED] wrote:

> On Thursday 09 August 2007 20:52:58 Andrew Morton wrote:
> > On Thu, 9 Aug 2007 10:18:15 -0400 Miles Lane [EMAIL PROTECTED] wrote:
> >
> > >   CC      drivers/dma/ioat_dca.o
> > > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > > drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'
> >
> > Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
> >
> > Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> > supported architectures (i386, x86_64, ia64) should provide a non-SMP
> > version of cpu_physical_id().  Preferably the latter, I'd say.
>
> It doesn't make much sense in smp.h because there is not really
> a concept of physical id on most architectures i expect. Better
> to put it into the individual asm files.

I gave up and did this:

From: Andrew Morton [EMAIL PROTECTED]

drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

This is so screwed up.

Root cause: <linux/smp.h> only includes <asm/smp.h> if CONFIG_SMP=y.  To get
at cpu_physical_id() on UP, the user must include <asm/smp.h>, not
<linux/smp.h>.

Cc: Luck, Tony [EMAIL PROTECTED]
Cc: Andi Kleen [EMAIL PROTECTED]
Cc: Shannon Nelson [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 drivers/dma/ioat_dca.c |    3 +++
 1 file changed, 3 insertions(+)

diff -puN drivers/dma/ioat_dca.c~git-dma-up-fix drivers/dma/ioat_dca.c
--- a/drivers/dma/ioat_dca.c~git-dma-up-fix
+++ a/drivers/dma/ioat_dca.c
@@ -25,6 +25,9 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/dca.h>
+
+#include <asm/smp.h>
+
 #include "ioatdma.h"
 #include "ioatdma_registers.h"
 
_
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, Aug 09, 2007 at 07:06:23PM -0700, Andrew Morton wrote:
> On Thu, 9 Aug 2007 19:00:40 -0700 Paul E. McKenney [EMAIL PROTECTED] wrote:
>
> > On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > > >...
> > > > Changes since 2.6.23-rc2-mm1:
> > > >...
> > > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > > >...
> > > >  2.6.23 queue
> > > >...
> > >
> > > All drivers were converted to no longer use xtime directly since it
> > > might be quite outdated, but this patch adds a usage of xtime.tv_nsec as
> > > RNG...
> >
> > This code doesn't care if the time is outdated, as it is simply
> > periodically perturbing an RNG, but OK.  So, what interface are we
> > supposed to be using instead?  I cannot use get_random_bytes() due to
> > locking issues.  This is not a cryptographically secure usage, so the
> > perturbation does not need to be extremely high quality.
> >
> > On x86, I would just grab the low-order bits of the TSC, but all of the
> > world is not an x86.  ;-)
>
> One used to use sched_clock() for this, then get frowned at.  Now we have
> cpu_clock()...

Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some
later release.  Which means that the rate of API change in this area is
a bit high, so I should avoid it like the plague.

Therefore, I should look for some other convenient source of entropy.
One convenient source would be the per-CPU statistics that rcutorture
maintains.  Of course, a given CPU's RNG is nearly in lock-step with its
own statistics, but not with the adjacent CPU's statistics...

I will send a patch.

						Thanx, Paul
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, 10 Aug 2007 08:12:08 -0700 Paul E. McKenney [EMAIL PROTECTED] wrote:

> > One used to use sched_clock() for this, then get frowned at.  Now we have
> > cpu_clock()...
>
> Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some
> later release.  Which means that the rate of API change in this area is
> a bit high, so I should avoid it like the plague.

eh, it's been there for weeks.  It is dust-encrusted.

> Therefore, I should look for some other convenient source of entropy.
> One convenient source would be the per-CPU statistics that rcutorture
> maintains.  Of course, a given CPU's RNG is nearly in lock-step with its
> own statistics, but not with the adjacent CPU's statistics...
>
> I will send a patch.

Please use cpu_clock().  It ain't going away.
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'
On Thursday 09 August 2007 20:52:58 Andrew Morton wrote:
> On Thu, 9 Aug 2007 10:18:15 -0400 Miles Lane [EMAIL PROTECTED] wrote:
>
> >   CC      drivers/dma/ioat_dca.o
> > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'
>
> Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
>
> Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> supported architectures (i386, x86_64, ia64) should provide a non-SMP
> version of cpu_physical_id().  Preferably the latter, I'd say.

It doesn't make much sense in smp.h because there is not really
a concept of physical id on most architectures i expect. Better
to put it into the individual asm files.

-Andi
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'
On 8/9/07, Andrew Morton [EMAIL PROTECTED] wrote:
> On Thu, 9 Aug 2007 10:18:15 -0400 Miles Lane [EMAIL PROTECTED] wrote:
>
> >   CC      drivers/dma/ioat_dca.o
> > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'
>
> Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
>
> Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> supported architectures (i386, x86_64, ia64) should provide a non-SMP
> version of cpu_physical_id().  Preferably the latter, I'd say.
>
> Something like this, I suppose...
>
>
> From: Andrew Morton [EMAIL PROTECTED]
>
> i386, x86_64 and ia64 implement cpu_physical_id() if CONFIG_SMP=y.  Provide a
> uniprocessor stub so that callers will dtrt.
>
> Cc: Andi Kleen [EMAIL PROTECTED]
> Cc: Luck, Tony [EMAIL PROTECTED]
> Cc: Shannon Nelson [EMAIL PROTECTED]
> Signed-off-by: Andrew Morton [EMAIL PROTECTED]
> ---
>
>  include/linux/smp.h |    5 +++++
>  1 files changed, 5 insertions(+)
>
> diff -puN include/linux/smp.h~implement-cpu_physical_id-on-smp=n include/linux/smp.h
> --- a/include/linux/smp.h~implement-cpu_physical_id-on-smp=n
> +++ a/include/linux/smp.h
> @@ -108,6 +108,11 @@ static inline void smp_send_reschedule(i
>  	0;				\
>  })
>  
> +static inline unsigned cpu_physical_id(unsigned cpu)
> +{
> +	return 0;
> +}
> +
>  #endif /* !SMP */
>  
>  /*
> _

Worked for me.  Thanks,

        Miles
Re: Softlockup detected with 2.6.23-rc2-mm1
On Fri, Aug 10, 2007 at 01:06:58PM +0530, Kamalesh Babulal wrote:
> Andrew Morton wrote:
> > On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:
> >
> > > I get call trace, when the file system stress is run on the
> > > 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
> > > (processor 270)
> > >
> > > \BUG: spinlock bad magic on
> > > CPU#1, fsx-linux/19721
> >
> > Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches
> > were removed from 2.6.23-rc2-mm2 - please test that instead.
>
> I get different call trace on AMD Opteron(tm) Processor 844 machine, I am
> not sure whether it is related to the same patch
>
> =
> BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272]
> CPU 3:
> Modules linked in:
> Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1
> RIP: 0010:[8021a9c3]  [8021a9c3] flush_tlb_others+0x69/0x95

Cannot be 100% sure but of the group of machines showing your original
problem one showed this form.  Dropping the patches indicated by Andrew
seemed to fix both symptoms.  So I think it is highly likely the same
thing.

-apw
Re: Softlockup detected with 2.6.23-rc2-mm1
Andrew Morton wrote:
> On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:
>
> > I get call trace, when the file system stress is run on the
> > 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
> > (processor 270)
> >
> > \BUG: spinlock bad magic on
> > CPU#1, fsx-linux/19721
>
> Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches
> were removed from 2.6.23-rc2-mm2 - please test that instead.

I get different call trace on AMD Opteron(tm) Processor 844 machine, I am
not sure whether it is related to the same patch

=
BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272]
CPU 3:
Modules linked in:
Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[8021a9c3]  [8021a9c3] flush_tlb_others+0x69/0x95
RSP: :810001f15a90  EFLAGS: 0202
RAX: 0003 RBX: 810001f15ac0 RCX: 0008
RDX: 08f3 RSI: 00f3 RDI: 0002
RBP:  R08: 810082f05210 R09: 802e60c1
R10: 8100815e6e70 R11:  R12: 8101ffc38080
R13: 80358b47 R14: 810001f15a40 R15: 810081d73208
FS:  () GS:810180724280() knlGS:f7f75b80
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: f7e20494 CR3: 029f CR4: 06e0
DR0:  DR1:  DR2:
DR3:  DR6: 0ff0 DR7: 0400

Call Trace:
 [8021abd0] flush_tlb_page+0x8f/0x97
 [8026daee] page_mkclean+0x120/0x171
 [802e6227] ext3_ordered_writepage+0x13f/0x16c
 [8025ed09] clear_page_dirty_for_io+0x52/0xba
 [8025f002] write_cache_pages+0x1b2/0x33a
 [8022a149] update_curr+0xd9/0xf8
 [8025e9ca] __writepage+0x0/0x2a
 [8025f1a9] generic_writepages+0x1f/0x25
 [8025f1db] do_writepages+0x2c/0x35
 [8029b453] __writeback_single_inode+0x1c9/0x346
 [8023b809] try_to_del_timer_sync+0x55/0x60
 [8023b826] del_timer_sync+0x12/0x1f
 [8022a149] update_curr+0xd9/0xf8
 [8022a446] dequeue_entity+0x7d/0x92
 [8029ba20] generic_sync_sb_inodes+0x216/0x372
 [8029bb99] sync_sb_inodes+0x1d/0x1f
 [8029bdd9] writeback_inodes+0x83/0xd6
 [8025e82b] wb_kupdate+0xa0/0x113
 [8025f658] pdflush+0x156/0x206
 [8025e78b] wb_kupdate+0x0/0x113
 [8025f502] pdflush+0x0/0x206
 [80245660] kthread+0x44/0x6d
 [8020c5e8] child_rip+0xa/0x12
 [8024561c] kthread+0x0/0x6d
 [8020c5de] child_rip+0x0/0x12

Thanks
Kamalesh Babulal.
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
* Andrew Morton [EMAIL PROTECTED] wrote:

> We seem to have made a mess in there. timer_list_show() ends up calling
> lookup_module_symbol_name(), which takes a mutex. However print_symbol()
> (which is called at oops time, interrupt time, etc) calls
> module_address_lookup(), which is basically the same, only it doesn't take
> the mutex.

hm, current upstream does:

 static void print_name_offset(struct seq_file *m, void *sym)
 {
         char symname[KSYM_NAME_LEN];

         if (lookup_symbol_name((unsigned long)sym, symname) < 0)

why was that changed? I think symbol lookups for debug purposes have to
be lockless, fundamentally.

	Ingo
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
On Fri, 10 Aug 2007 09:40:00 +0200 Ingo Molnar [EMAIL PROTECTED] wrote:

> * Andrew Morton [EMAIL PROTECTED] wrote:
>
> > We seem to have made a mess in there. timer_list_show() ends up calling
> > lookup_module_symbol_name(), which takes a mutex. However print_symbol()
> > (which is called at oops time, interrupt time, etc) calls
> > module_address_lookup(), which is basically the same, only it doesn't take
> > the mutex.
>
> hm, current upstream does:
>
>  static void print_name_offset(struct seq_file *m, void *sym)
>  {
>          char symname[KSYM_NAME_LEN];
>
>          if (lookup_symbol_name((unsigned long)sym, symname) < 0)
>
> why was that changed?

It wasn't. lookup_symbol_name() calls lookup_module_symbol_name() which
calls mutex_lock().

> I think symbol lookups for debug purposes have to be lockless,
> fundamentally.

Sure, especially a sysrq thingy.

It's a bit nasty to just go in there and start walking data structures
without holding the needed lock though.
Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected
On Fri, 2007-08-10 at 02:47 +0400, Alexey Starikovskiy wrote:

> > Presumably the new debugging patches in -mm
> > (workqueue-debug-flushing-deadlocks-with-lockdep.patch and
> > workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have
> > found a potential deadlock in ACPI.
> >
> > I don't have time to pick through the code to confirm that, but boy I'm
> > good at adding cc's ;)
>
> Yep, it indeed may lock up... Here is a patch to avoid it

Cool. I'm impressed this stuff actually finds something :)

johannes
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
> On Fri, 10 Aug 2007 08:12:08 -0700 Paul E. McKenney [EMAIL PROTECTED] wrote:
>
> > > One used to use sched_clock() for this, then get frowned at. Now we
> > > have cpu_clock()...
> >
> > Hmmm... And cpu_clock() is not in 2.6.22, so must appear in some
> > later release. Which means that the rate of API change in this area
> > is a bit high, so I should avoid it like the plague.
>
> eh, it's been there for weeks. It is dust-encrusted.
>
> > Therefore, I should look for some other convenient source of entropy.
> > One convenient source would be the per-CPU statistics that rcutorture
> > maintains. Of course, a given CPU's RNG is nearly in lock-step with
> > its own statistics, but not with the adjacent CPU's statistics...
> >
> > I will send a patch.
>
> Please use cpu_clock(). It ain't going away.

D'accord...

						Thanx, Paul
Re: Softlockup detected with 2.6.23-rc2-mm1
Andrew Morton wrote:
> On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:
>
> > I get call trace, when the file system stress is run on the
> > 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron (processor 270)
> >
> > BUG: spinlock bad magic on CPU#1, fsx-linux/19721
>
> Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug. Those patches
> were removed from 2.6.23-rc2-mm2 - please test that instead.

The call trace is not reproducible in 2.6.23-rc2-mm2.
Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)
Krzysztof Helt wrote:
> On Thu, 9 Aug 2007 14:04:49 +0100
> Andy Whitcroft [EMAIL PROTECTED] wrote:
>
> > Seeing the following compile error on a G5 mac:
> >
> >   drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
> >   drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this function)
> >   drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is reported only once
> >   drivers/video/tdfxfb.c:1341: error: for each function it appears in.)
> >
> > This seems to be the following fragment from tdfxfb-hardware-cursor:
> >
> > +		} else if (!strcmp(this_opt, "hwcursor")) {
> > +			hwcursor = simple_strtoul(opt + 9, NULL, 0);
> >
> > I guess the naive fix would be s/opt/this_opt, but I am also suspicious of
> > the +9 here as hwcursor is only 8 long? Now this seems to take a numeric
> > value and I assume that is via hwcursor=N, if so then the +9 would make
> > sense _if_ the strcmp was against hwcursor=.
>
> The patch below fixes all issues you have pointed out. It also
> fixes the description of the nomtrr option.
>
> ---
> From: Krzysztof Helt [EMAIL PROTECTED]
>
> This patch fixes compilation with setup options bug
> and corrects description of the nomtrr option.
>
> Signed-off-by: Krzysztof Helt [EMAIL PROTECTED]
> ---
>
> --- linux-2.6.22.new/drivers/video/tdfxfb.c	2007-08-09 16:11:23.870028259 +0200
> +++ linux-2.6.23/drivers/video/tdfxfb.c	2007-08-09 16:15:07.654781024 +0200
> @@ -1337,8 +1337,8 @@ static void tdfxfb_setup(char *options)
>  			nopan = 1;
>  		} else if (!strcmp(this_opt, "nowrap")) {
>  			nowrap = 1;
> -		} else if (!strcmp(this_opt, "hwcursor")) {
> -			hwcursor = simple_strtoul(opt + 9, NULL, 0);
> +		} else if (!strncmp(this_opt, "hwcursor=", 9)) {
> +			hwcursor = simple_strtoul(this_opt + 9, NULL, 0);
>  #ifdef CONFIG_MTRR
>  		} else if (!strncmp(this_opt, "nomtrr", 6)) {
>  			nomtrr = 1;
> @@ -1409,7 +1409,7 @@ MODULE_PARM_DESC(hwcursor, "Enable hardw
>  			"(1=enable, 0=disable, default=1)");
>  #ifdef CONFIG_MTRR
>  module_param(nomtrr, bool, 0);
> -MODULE_PARM_DESC(nomtrr, "Disable MTRR support (0 or 1=disabled) (default=0)");
> +MODULE_PARM_DESC(nomtrr, "Disable MTRR support (default: enabled)");
>  #endif
>
>  module_init(tdfxfb_init);

Confirmed that this gets my kernel compiled and the result boots.

Tested-by: Andy Whitcroft [EMAIL PROTECTED]

-apw
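The prefix-match reasoning in the patch above can be tried outside the kernel. What follows is only a userspace sketch under stated assumptions: parse_hwcursor() is a hypothetical helper that does not exist in tdfxfb.c, and strtoul() stands in for the kernel's simple_strtoul(). It shows why an exact strcmp() against the bare word can never see "hwcursor=N", and why the value starts at offset 9.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of tdfxfb.c: parse one "hwcursor=N" token.
 * Returns 1 and stores N in *value on a match, 0 otherwise.
 */
static int parse_hwcursor(const char *this_opt, unsigned long *value)
{
	static const char prefix[] = "hwcursor=";
	const size_t len = sizeof(prefix) - 1;	/* 9, excluding the NUL */

	assert(this_opt != NULL && value != NULL);

	/*
	 * An exact strcmp() against "hwcursor" matches only the bare word,
	 * so "hwcursor=1" would fall through. Compare the prefix instead.
	 */
	if (strncmp(this_opt, prefix, len) != 0)
		return 0;

	/* The number starts right after the '=', i.e. at offset 9. */
	*value = strtoul(this_opt + len, NULL, 0);
	return 1;
}
```

With this helper, "hwcursor=0" and "hwcursor=1" both match and yield their number, while the bare word "hwcursor" does not match at all, which is the behaviour the strncmp() change in the patch appears to aim for.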
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 01:30:55PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
> > On Fri, 10 Aug 2007 08:12:08 -0700 Paul E. McKenney [EMAIL PROTECTED] wrote:
> >
> > > > One used to use sched_clock() for this, then get frowned at. Now we
> > > > have cpu_clock()...
> > >
> > > Hmmm... And cpu_clock() is not in 2.6.22, so must appear in some
> > > later release. Which means that the rate of API change in this area
> > > is a bit high, so I should avoid it like the plague.
> >
> > eh, it's been there for weeks. It is dust-encrusted.
> >
> > > Therefore, I should look for some other convenient source of entropy.
> > > One convenient source would be the per-CPU statistics that rcutorture
> > > maintains. Of course, a given CPU's RNG is nearly in lock-step with
> > > its own statistics, but not with the adjacent CPU's statistics...
> > >
> > > I will send a patch.
> >
> > Please use cpu_clock(). It ain't going away.
>
> D'accord...

Errmmm... No joy.

  ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!

Turns out that cpu_clock also ain't exported, and rcutorture.c is
a module. Would adding an EXPORT_SYMBOL_GPL() as in the patch below
be acceptable?

If not, I have a tested patch to rcutorture.c that leverages statistical
counters. Your choice.

						Thanx, Paul

Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
Compiles, but not yet tested.

Signed-off-by: Paul E. McKenney [EMAIL PROTECTED]
---

 rcutorture.c |8 ++--
 sched.c |2 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.23-rc2/kernel/rcutorture.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c	2007-08-10 17:15:22.0 -0700
@@ -42,7 +42,6 @@
 #include <linux/notifier.h>
 #include <linux/freezer.h>
 #include <linux/cpu.h>
-#include <linux/random.h>
 #include <linux/delay.h>
 #include <linux/byteorder/swabb.h>
 #include <linux/stat.h>
@@ -166,16 +165,13 @@ struct rcu_random_state {
 
 /*
  * Crude but fast random-number generator. Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from cpu_clock().
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-	long refresh;
-
 	if (--rrsp->rrs_count < 0) {
-		get_random_bytes(&refresh, sizeof(refresh));
-		rrsp->rrs_state += refresh;
+		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
 		rrsp->rrs_count = RCU_RANDOM_REFRESH;
 	}
 	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
--- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
@@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
 	return now;
 }
 
+EXPORT_SYMBOL_GPL(cpu_clock);
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /* Change a task's ->cfs_rq if it moves across CPUs */
 static inline void set_task_cfs_rq(struct task_struct *p)
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 05:29:49PM -0700, Paul E. McKenney wrote:
> Errmmm... No joy.
>
>   ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!
>
> Turns out that cpu_clock also ain't exported, and rcutorture.c is
> a module. Would adding an EXPORT_SYMBOL_GPL() as in the patch below
> be acceptable?

Except that the old xtime symbol was EXPORT_SYMBOL() rather than my
proposed EXPORT_SYMBOL_GPL() for the equivalent new cpu_clock(). Sigh!!!

I will leave this one for others to sort out. Andrew, please consider
this patch withdrawn and apply the version that does not rely on time
for entropy. Please let me know if you would like me to resend it.

						Thanx, Paul

> If not, I have a tested patch to rcutorture.c that leverages statistical
> counters. Your choice.
>
> 						Thanx, Paul
>
> Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> Compiles, but not yet tested.
>
> Signed-off-by: Paul E. McKenney [EMAIL PROTECTED]
> ---
>
>  rcutorture.c |8 ++--
>  sched.c |2 ++
>  2 files changed, 4 insertions(+), 6 deletions(-)
>
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
> --- linux-2.6.23-rc2/kernel/rcutorture.c	2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c	2007-08-10 17:15:22.0 -0700
> @@ -42,7 +42,6 @@
>  #include <linux/notifier.h>
>  #include <linux/freezer.h>
>  #include <linux/cpu.h>
> -#include <linux/random.h>
>  #include <linux/delay.h>
>  #include <linux/byteorder/swabb.h>
>  #include <linux/stat.h>
> @@ -166,16 +165,13 @@ struct rcu_random_state {
>
>  /*
>   * Crude but fast random-number generator. Uses a linear congruential
> - * generator, with occasional help from get_random_bytes().
> + * generator, with occasional help from cpu_clock().
>   */
>  static unsigned long
>  rcu_random(struct rcu_random_state *rrsp)
>  {
> -	long refresh;
> -
>  	if (--rrsp->rrs_count < 0) {
> -		get_random_bytes(&refresh, sizeof(refresh));
> -		rrsp->rrs_state += refresh;
> +		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
>  		rrsp->rrs_count = RCU_RANDOM_REFRESH;
>  	}
>  	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
> --- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
> @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
>  	return now;
>  }
>
> +EXPORT_SYMBOL_GPL(cpu_clock);
> +
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  /* Change a task's ->cfs_rq if it moves across CPUs */
>  static inline void set_task_cfs_rq(struct task_struct *p)
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
Andrew Morton wrote: > On Fri, 10 Aug 2007 01:23:07 +0200 > Mariusz Kozlowski <[EMAIL PROTECTED]> wrote: > >> Hello, >> >> This probably doesn't have great impact ;) but ... >> To reproduce: run torture tests for RCU and then sysrq+q. >> >> SysRq : Show Pending Timers >> Timer List Version: v0.3 >> HRTIMER_MAX_CLOCK_BASES: 2 >> now at 1764338760370 nsecs >> >> cpu: 0 >> clock 0: >> .index: 0 >> .resolution: 1 nsecs >> .get_time: ktime_get_real >> .offset: 1186699025823815427 nsecs >> active timers: >> clock 1: >> .index: 1 >> .resolution: 1 nsecs >> .get_time: ktime_get >> .offset: 0 nsecs >> active timers: >> #0: <3>BUG: sleeping function called from invalid context at >> kernel/mutex.c:86 >> in_atomic():1, irqs_disabled():1 >> INFO: lockdep is turned off. >> irq event stamp: 0 >> hardirqs last enabled at (0): [<>] 0x0 >> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c >> softirqs last enabled at (0): [] copy_process+0x4c6/0x144c >> softirqs last disabled at (0): [<>] 0x0 >> [] show_trace_log_lvl+0x1a/0x30 >> [] show_trace+0x12/0x14 >> [] dump_stack+0x15/0x17 >> [] __might_sleep+0xb7/0xc9 >> [] mutex_lock+0x15/0x1f >> [] lookup_module_symbol_name+0x17/0xc0 >> [] lookup_symbol_name+0x3f/0x43 >> [] print_name_offset+0x1f/0x96 >> [] timer_list_show+0x802/0xcbd >> [] sysrq_timer_list_show+0xc/0xe >> [] sysrq_handle_show_timers+0x8/0xa >> [] __handle_sysrq+0x7b/0x115 >> [] handle_sysrq+0x20/0x24 >> [] kbd_event+0x3a8/0x5c7 >> [] input_pass_event+0x8f/0x91 >> [] input_handle_event+0x98/0x38d >> [] input_event+0x54/0x67 >> [] atkbd_interrupt+0x200/0x59e >> [] serio_interrupt+0x7c/0x80 >> [] i8042_interrupt+0x17a/0x289 >> [] handle_IRQ_event+0x28/0x59 >> [] handle_level_irq+0xad/0x10b >> [] do_IRQ+0x93/0xd0 >> [] common_interrupt+0x2e/0x34 >> [] rcu_read_delay+0x8/0x36 [rcutorture] >> [] rcu_torture_reader+0x6e/0x169 [rcutorture] >> [] kthread+0x36/0x58 >> [] kernel_thread_helper+0x7/0x1c >> === > > We seem to have made a mess in there. 
timer_list_show() ends up calling
> lookup_module_symbol_name(), which takes a mutex. However print_symbol()
> (which is called at oops time, interrupt time, etc) calls
> module_address_lookup(), which is basically the same, only it doesn't take
> the mutex.
>
> I guess a quicky fix would be to switch
> kernel/time/timer_list.c:print_name_offset() from
> lookup_module_symbol_name() to module_address_lookup(). But we'd still
> have a mess in there.
>
> (adds ccs, runs away)

I don't think rcutorture matters for this bug. As far as I can tell,
Andrew's description of this problem will always apply to this particular
sysrq: the keyboard interrupt leads to handle_sysrq, which leads to
timer_list_show, which leads to lookup_module_symbol_name, which acquires
a mutex.

- Josh Triplett
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, Aug 09, 2007 at 07:00:40PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > > 2.6.23 queue
> > >...
> >
> > All drivers were converted to no longer use xtime directly since it
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
>
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
>...

I should have been a bit more concrete:

I have a patch pending to unexport xtime for catching unsafe usages,
and you added an (ab)user.

> 						Thanx, Paul

cu
Adrian

--

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, 9 Aug 2007 19:00:40 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > > 2.6.23 queue
> > >...
> >
> > All drivers were converted to no longer use xtime directly since it
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
>
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
>
> So, what interface are we supposed to be using instead? I cannot use
> get_random_bytes() due to locking issues. This is not a cryptographically
> secure usage, so the perturbation does not need to be extremely high
> quality.
>
> On x86, I would just grab the low-order bits of the TSC, but all of the
> world is not an x86. ;-)
>

One used to use sched_clock() for this, then get frowned at. Now we have
cpu_clock()...
Re: 2.6.23-rc2-mm1: rcutorture xtime usage
On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc2-mm1:
> >...
> > +allow-rcutorture-to-handle-synchronize_sched.patch
> >...
> > 2.6.23 queue
> >...
>
> All drivers were converted to no longer use xtime directly since it
> might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> as RNG...

This code doesn't care if the time is outdated, as it is simply
periodically perturbing an RNG, but OK.

So, what interface are we supposed to be using instead? I cannot use
get_random_bytes() due to locking issues. This is not a cryptographically
secure usage, so the perturbation does not need to be extremely high
quality.

On x86, I would just grab the low-order bits of the TSC, but all of the
world is not an x86. ;-)

						Thanx, Paul
2.6.23-rc2-mm1: rcutorture xtime usage
On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc2-mm1:
>...
> +allow-rcutorture-to-handle-synchronize_sched.patch
>...
> 2.6.23 queue
>...

All drivers were converted to no longer use xtime directly since it
might be quite outdated, but this patch adds a usage of xtime.tv_nsec
as RNG...

cu
Adrian

--

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!
On Thu, Aug 09, 2007 at 04:37:35PM +0100, Hugh Dickins wrote: > On Thu, 9 Aug 2007, Mariusz Kozlowski wrote: > > Hello, > > > > Nothing unusual happening, allmodconfig compiling etc. > > Not sure why it says kernel was tainted though ... hmmm. > > > > [ cut here ] > > kernel BUG at mm/swap_state.c:78! > > invalid opcode: [#1] > > PREEMPT > > Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia > > 8250_pci 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too > > CPU:0 > > EIP:0060:[]Tainted: PVLI > > EFLAGS: 00010246 (2.6.23-rc2-mm1 #1) > > EIP is at __add_to_swap_cache+0xc6/0xd7 > > eax: 4000 ebx: c11285c0 ecx: 00d0 edx: 0283 > > esi: c11285c0 edi: 0283 ebp: c1858f90 esp: c1858f84 > > ds: 007b es: 007b fs: gs: ss: 0068 > > Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000) > > Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0 > > c1858fcc > >c015307c 0001 0007 0002 0002 0283 > > fffc > > c0152d5c c1858fe0 c0127f2e c0127ef8 > > > > Call Trace: > > [] show_trace_log_lvl+0x1a/0x30 > > [] show_stack_log_lvl+0xa9/0xd5 > > [] show_registers+0x219/0x38d > > [] die+0x104/0x23e > > [] do_trap+0x83/0xad > > [] do_invalid_op+0x88/0x92 > > [] error_code+0x6a/0x70 > > [] add_to_swap_cache+0x22/0x58 > > [] kprefetchd+0x320/0x364 > > [] kthread+0x36/0x58 > > [] kernel_thread_helper+0x7/0x14 > > === > > INFO: lockdep is turned off. > > Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 > > 03 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe > > 0f 0b eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 > > EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84 > > Don't worry about reproducing untainted, I got the same earlier > and was just preparing and testing the hotfix: here it is... > > > Nick's mm-clarify-__add_to_swap_cache-locking.patch is fine for mainline, > but soon generates a "kernel BUG at mm/swap_state.c:78!" 
when it meets
> mm-implement-swap-prefetching.patch in 2.6.23-rc2-mm1. We could add a
> fix to the latter, but I think it's better to adjust Nick's, so that
> it's right for whichever tree it's in: move the responsibility to
> SetPageLocked from read_swap_cache_async to add_to_swap_cache.

Hmm, yeah I like this better, it is more like add_to_page_cache now.

Thanks.
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
On Fri, 10 Aug 2007 01:23:07 +0200 Mariusz Kozlowski <[EMAIL PROTECTED]> wrote: > Hello, > > This probably doesn't have great impact ;) but ... > To reproduce: run torture tests for RCU and then sysrq+q. > > SysRq : Show Pending Timers > Timer List Version: v0.3 > HRTIMER_MAX_CLOCK_BASES: 2 > now at 1764338760370 nsecs > > cpu: 0 > clock 0: > .index: 0 > .resolution: 1 nsecs > .get_time: ktime_get_real > .offset: 1186699025823815427 nsecs > active timers: > clock 1: > .index: 1 > .resolution: 1 nsecs > .get_time: ktime_get > .offset: 0 nsecs > active timers: > #0: <3>BUG: sleeping function called from invalid context at > kernel/mutex.c:86 > in_atomic():1, irqs_disabled():1 > INFO: lockdep is turned off. > irq event stamp: 0 > hardirqs last enabled at (0): [<>] 0x0 > hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c > softirqs last enabled at (0): [] copy_process+0x4c6/0x144c > softirqs last disabled at (0): [<>] 0x0 > [] show_trace_log_lvl+0x1a/0x30 > [] show_trace+0x12/0x14 > [] dump_stack+0x15/0x17 > [] __might_sleep+0xb7/0xc9 > [] mutex_lock+0x15/0x1f > [] lookup_module_symbol_name+0x17/0xc0 > [] lookup_symbol_name+0x3f/0x43 > [] print_name_offset+0x1f/0x96 > [] timer_list_show+0x802/0xcbd > [] sysrq_timer_list_show+0xc/0xe > [] sysrq_handle_show_timers+0x8/0xa > [] __handle_sysrq+0x7b/0x115 > [] handle_sysrq+0x20/0x24 > [] kbd_event+0x3a8/0x5c7 > [] input_pass_event+0x8f/0x91 > [] input_handle_event+0x98/0x38d > [] input_event+0x54/0x67 > [] atkbd_interrupt+0x200/0x59e > [] serio_interrupt+0x7c/0x80 > [] i8042_interrupt+0x17a/0x289 > [] handle_IRQ_event+0x28/0x59 > [] handle_level_irq+0xad/0x10b > [] do_IRQ+0x93/0xd0 > [] common_interrupt+0x2e/0x34 > [] rcu_read_delay+0x8/0x36 [rcutorture] > [] rcu_torture_reader+0x6e/0x169 [rcutorture] > [] kthread+0x36/0x58 > [] kernel_thread_helper+0x7/0x1c > === We seem to have made a mess in there. timer_list_show() ends up calling lookup_module_symbol_name(), which takes a mutex. 
However print_symbol() (which is called at oops time, interrupt time, etc)
calls module_address_lookup(), which is basically the same, only it doesn't
take the mutex.

I guess a quicky fix would be to switch
kernel/time/timer_list.c:print_name_offset() from
lookup_module_symbol_name() to module_address_lookup(). But we'd still
have a mess in there.

(adds ccs, runs away)
Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state
Hello,

=
[ INFO: inconsistent lock state ]
2.6.23-rc2-mm1 #7
-
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&tp->lock){+...}, at: [] rtl8139_interrupt+0x27/0x46b [8139too]
{in-hardirq-W} state was registered at:
  [] __lock_acquire+0x949/0x11ac
  [] lock_acquire+0x99/0xb2
  [] _spin_lock+0x35/0x42
  [] rtl8139_interrupt+0x27/0x46b [8139too]
  [] handle_IRQ_event+0x28/0x59
  [] handle_level_irq+0xad/0x10b
  [] do_IRQ+0x93/0xd0
  [] common_interrupt+0x2e/0x34
  [] cpuidle_idle_call+0x74/0x99
  [] cpu_idle+0x87/0x89
  [] rest_init+0x60/0x62
  [] start_kernel+0x23a/0x2c5
  [<>] 0x0
  [] 0x
irq event stamp: 1777
hardirqs last enabled at (1777): [] kfree+0xee/0x105
hardirqs last disabled at (1776): [] kfree+0x87/0x105
softirqs last enabled at (1756): [] dev_deactivate+0x86/0xa5
softirqs last disabled at (1754): [] _spin_lock_bh+0xe/0x47

other info that might help us debug this:
1 lock held by ifconfig/5492:
 #0: (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x30
 [] show_trace+0x12/0x14
 [] dump_stack+0x15/0x17
 [] print_usage_bug+0x145/0x14f
 [] mark_lock+0x61f/0x70c
 [] __lock_acquire+0x73e/0x11ac
 [] lock_acquire+0x99/0xb2
 [] _spin_lock+0x35/0x42
 [] rtl8139_interrupt+0x27/0x46b [8139too]
 [] free_irq+0x11b/0x146
 [] rtl8139_close+0x8a/0x14a [8139too]
 [] dev_close+0x57/0x74
 [] dev_change_flags+0x8e/0x190
 [] devinet_ioctl+0x4af/0x652
 [] inet_ioctl+0x56/0x71
 [] sock_ioctl+0xa5/0x1d4
 [] do_ioctl+0x22/0x71
 [] vfs_ioctl+0x55/0x29e
 [] sys_ioctl+0x33/0x69
 [] sysenter_past_esp+0x5f/0x99
 ===

Regards,

	Mariusz
Re: 2.6.23-rc2-mm1
Alan Cox wrote:
>>> [ 28.828484] :00:1f.1: cannot adjust BAR0 (not I/O)
>>> [ 28.828487] :00:1f.1: cannot adjust BAR1 (not I/O)
>>> [ 28.828489] :00:1f.1: cannot adjust BAR2 (not I/O)
>>> [ 28.828491] :00:1f.1: cannot adjust BAR3 (not I/O)
>
> This means it didn't do anything. (wrongly because its checking I/O bits
> on a BAR which are ignored according to the spec)
>
>>> Region 0: [virtual] Memory at 01f0 (32-bit, non-prefetchable)
>>> [disabled] [size=8]
>>> Region 1: [virtual] Memory at 03f0 (type 3, non-prefetchable)
>>> [disabled] [size=1]
>>> Region 2: [virtual] Memory at 0170 (32-bit, non-prefetchable)
>>> [disabled] [size=8]
>>> Region 3: [virtual] Memory at 0370 (type 3, non-prefetchable)
>>> [disabled] [size=1]
>
> The controller is disabled and when disabled it seems to think its
> memory. Valid but interesting.
>

The box is a Dell Precision WorkStation 530 MT.

Actually I have an ATA-7 disc on the primary EIDE connector (one port free)
and an oldish CDROM on the secondary EIDE connector (one port free).

http://194.231.229.228/lara/lara.dmesg ( from 2.6.23-rc2-mm1 with the 2 patches reverted )
http://194.231.229.228/lara/lara.lspci ( lspci - -nn )
http://194.231.229.228/lara/lara.html ( lshw html output )

If you want me to do/try something let me know.

Gabriel
Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86
Hello, This probably doesn't have great impact ;) but ... To reproduce: run torture tests for RCU and then sysrq+q. SysRq : Show Pending Timers Timer List Version: v0.3 HRTIMER_MAX_CLOCK_BASES: 2 now at 1764338760370 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1186699025823815427 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <3>BUG: sleeping function called from invalid context at kernel/mutex.c:86 in_atomic():1, irqs_disabled():1 INFO: lockdep is turned off. irq event stamp: 0 hardirqs last enabled at (0): [<>] 0x0 hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c softirqs last enabled at (0): [] copy_process+0x4c6/0x144c softirqs last disabled at (0): [<>] 0x0 [] show_trace_log_lvl+0x1a/0x30 [] show_trace+0x12/0x14 [] dump_stack+0x15/0x17 [] __might_sleep+0xb7/0xc9 [] mutex_lock+0x15/0x1f [] lookup_module_symbol_name+0x17/0xc0 [] lookup_symbol_name+0x3f/0x43 [] print_name_offset+0x1f/0x96 [] timer_list_show+0x802/0xcbd [] sysrq_timer_list_show+0xc/0xe [] sysrq_handle_show_timers+0x8/0xa [] __handle_sysrq+0x7b/0x115 [] handle_sysrq+0x20/0x24 [] kbd_event+0x3a8/0x5c7 [] input_pass_event+0x8f/0x91 [] input_handle_event+0x98/0x38d [] input_event+0x54/0x67 [] atkbd_interrupt+0x200/0x59e [] serio_interrupt+0x7c/0x80 [] i8042_interrupt+0x17a/0x289 [] handle_IRQ_event+0x28/0x59 [] handle_level_irq+0xad/0x10b [] do_IRQ+0x93/0xd0 [] common_interrupt+0x2e/0x34 [] rcu_read_delay+0x8/0x36 [rcutorture] [] rcu_torture_reader+0x6e/0x169 [rcutorture] [] kthread+0x36/0x58 [] kernel_thread_helper+0x7/0x1c === , tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0 # expires at 176433900 nsecs [in 239630 nsecs] #1: , it_real_fn, S:01, do_setitimer, artsd/7461 # expires at 1764742781512 nsecs [in 404021142 nsecs] #2: , hrtimer_wakeup, S:01, do_nanosleep, kwrapper/7452 # expires at 1764922105491 nsecs [in 583345121 nsecs] #3: , it_real_fn, S:01, 
do_setitimer, syslogd/6719 # expires at 1790027922194 nsecs [in 25689161824 nsecs] .expires_next : 176433900 nsecs .hres_active: 1 .nr_events : 1422687 .nohz_mode : 2 .idle_tick : 46585900 nsecs .tick_stopped : 0 .idle_jiffies : 165857 .idle_calls : 1812679 .idle_sleeps: 1761361 .idle_entrytime : 466865075138 nsecs .idle_sleeptime : 357976883572 nsecs .last_jiffies : 166865 .next_jiffies : 166866 .idle_expires : 46595100 nsecs jiffies: 1464338 Tick Device: mode: 1 Clock Event Device: pit max_delta_ns: 27461866 min_delta_ns: 12571 mult: 5124677 shift: 32 mode: 3 next_event: 176433900 nsecs set_next_event: pit_next_event set_mode: init_pit_timer event_handler: hrtimer_interrupt Regards, Mariusz # # Automatically generated make config: don't edit # Linux kernel version: 2.6.23-rc2-mm1 # Fri Aug 10 00:12:50 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_NONIRQ_WAKEUP=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SWAP_PREFETCH=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 # CONFIG_CONTAINERS is not set CONFIG_SYSFS_DEPRECATED=y # CONFIG_RELAY is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y # 
CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLUB_DEBUG=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_PROC_PAGE_MONITOR=y CONFIG_PROC_KPAGEMAP=y CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # CONFIG_MODULE_FORCE_UNLOAD is not set # CONFIG_MODVE
Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected
Andrew Morton wrote: > On Thu, 9 Aug 2007 16:24:48 -0400 > "Miles Lane" <[EMAIL PROTECTED]> wrote: > >> [ INFO: possible circular locking dependency detected ] >> 2.6.23-rc2-mm1 #7 >> --- >> kacpid/53 is trying to acquire lock: >> (>lock){--..}, at: [] mutex_lock+0x1c/0x1f >> >> but task is already holding lock: >> (>work){--..}, at: [] run_workqueue+0xa0/0x182 >> >> which lock already depends on the new lock. >> >> >> the existing dependency chain (in reverse order) is: >> >> -> #2 (>work){--..}: >>[] __lock_acquire+0x9a6/0xb6f >>[] lock_acquire+0x61/0x7d >>[] run_workqueue+0xb5/0x182 >>[] worker_thread+0xb7/0xc2 >>[] kthread+0x39/0x61 >>[] kernel_thread_helper+0x7/0x10 >>[] 0x >> >> -> #1 (kacpid){--..}: >>[] __lock_acquire+0x9a6/0xb6f >>[] lock_acquire+0x61/0x7d >>[] flush_workqueue+0x2d/0x4f >>[] acpi_os_wait_events_complete+0xd/0xf >>[] acpi_remove_gpe_handler+0x7b/0xdd >>[] ec_remove_handlers+0x26/0x29 >>[] acpi_ec_add+0x8f/0x13e >>[] acpi_device_probe+0x3e/0xdb >>[] driver_probe_device+0xd7/0x14d >>[] __driver_attach+0x6a/0xa1 >>[] bus_for_each_dev+0x36/0x5b >>[] driver_attach+0x14/0x16 >>[] bus_add_driver+0x70/0x16c >>[] driver_register+0x60/0x65 >>[] acpi_bus_register_driver+0x3a/0x3c >>[] acpi_ec_init+0x36/0x55 >>[] kernel_init+0xc5/0x20f >>[] kernel_thread_helper+0x7/0x10 >>[] 0x >> >> -> #0 (>lock){--..}: >>[] __lock_acquire+0x8c6/0xb6f >>[] lock_acquire+0x61/0x7d >>[] __mutex_lock_slowpath+0xbc/0x241 >>[] mutex_lock+0x1c/0x1f >>[] acpi_ec_transaction+0x65/0x1c1 >>[] acpi_ec_gpe_query+0x2b/0xab >>[] acpi_os_execute_deferred+0x20/0x31 >>[] run_workqueue+0xba/0x182 >>[] worker_thread+0xb7/0xc2 >>[] kthread+0x39/0x61 >>[] kernel_thread_helper+0x7/0x10 >>[] 0x >> >> other info that might help us debug this: >> >> 2 locks held by kacpid/53: >> #0: (kacpid){--..}, at: [] run_workqueue+0x85/0x182 >> #1: (>work){--..}, at: [] run_workqueue+0xa0/0x182 >> >> stack backtrace: >> [] show_trace_log_lvl+0x12/0x25 >> [] show_trace+0xd/0x10 >> [] dump_stack+0x15/0x17 
>> [] print_circular_bug_tail+0x5a/0x65
>> [] __lock_acquire+0x8c6/0xb6f
>> [] lock_acquire+0x61/0x7d
>> [] __mutex_lock_slowpath+0xbc/0x241
>> [] mutex_lock+0x1c/0x1f
>> [] acpi_ec_transaction+0x65/0x1c1
>> [] acpi_ec_gpe_query+0x2b/0xab
>> [] acpi_os_execute_deferred+0x20/0x31
>> [] run_workqueue+0xba/0x182
>> [] worker_thread+0xb7/0xc2
>> [] kthread+0x39/0x61
>> [] kernel_thread_helper+0x7/0x10
>> ===
>
> Presumably the new debugging patches in -mm
> (workqueue-debug-flushing-deadlocks-with-lockdep.patch and
> workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have
> found a potential deadlock in ACPI. I don't have time to pick through the
> code to confirm that, but boy I'm good at adding cc's ;)

Yep, it indeed may lock up... Here is a patch to avoid it.

Thanks,
Alex.

ACPI EC: remove potential deadlock from EC.

From: Alexey Starikovskiy <[EMAIL PROTECTED]>

Signed-off-by: Alexey Starikovskiy <[EMAIL PROTECTED]>
---

 drivers/acpi/ec.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index ceb7c3f..4b299fd 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -723,9 +723,7 @@ static int acpi_ec_add(struct acpi_device *device)
 	/* Check if we found the boot EC */
 	if (boot_ec) {
 		if (boot_ec->gpe == ec->gpe) {
-			mutex_lock(&boot_ec->lock);
 			ec_remove_handlers(boot_ec);
-			mutex_unlock(&boot_ec->lock);
 			mutex_destroy(&boot_ec->lock);
 			kfree(boot_ec);
 			first_ec = boot_ec = NULL;
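For readers following the lockdep report: the cycle is that ec_remove_handlers() ends up in flush_workqueue(), which waits for queued work, while the queued work (acpi_ec_gpe_query -> acpi_ec_transaction) wants the same EC mutex. The toy below is only a single-threaded model of that ordering, not kernel code; struct model_ec and all model_* names are made up for illustration, and a "false" return stands in for "would block forever".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded model; every name here is hypothetical. */
struct model_ec {
	bool lock_held;     /* stands in for the EC mutex */
	bool work_pending;  /* a queued GPE query that has not run yet */
};

/* Models the queued work: it needs the EC lock, like acpi_ec_transaction(). */
static bool model_run_work(struct model_ec *ec)
{
	if (ec->lock_held)
		return false;        /* would sleep forever on the held lock */
	ec->lock_held = true;    /* take the lock, do the transaction ... */
	ec->lock_held = false;   /* ... and drop it again */
	ec->work_pending = false;
	return true;
}

/* Models flush_workqueue(): cannot return until pending work has finished. */
static bool model_flush(struct model_ec *ec)
{
	return ec->work_pending ? model_run_work(ec) : true;
}

/* Old ordering: flush while holding the lock. Returns 1 if it completes. */
static int model_remove_locked(void)
{
	struct model_ec ec = { .lock_held = false, .work_pending = true };
	bool done;

	ec.lock_held = true;
	done = model_flush(&ec);
	ec.lock_held = false;
	return done;
}

/* Patched ordering: flush without holding the lock. Returns 1 if it completes. */
static int model_remove_unlocked(void)
{
	struct model_ec ec = { .lock_held = false, .work_pending = true };

	return model_flush(&ec);
}
```

In this model the locked variant cannot complete while the unlocked one can, which is the same shape as dropping the mutex_lock()/mutex_unlock() pair around ec_remove_handlers() in the patch.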
Re: 2.6.23-rc2-mm1
> > [ 28.828484] :00:1f.1: cannot adjust BAR0 (not I/O)
> > [ 28.828487] :00:1f.1: cannot adjust BAR1 (not I/O)
> > [ 28.828489] :00:1f.1: cannot adjust BAR2 (not I/O)
> > [ 28.828491] :00:1f.1: cannot adjust BAR3 (not I/O)

This means it didn't do anything (wrongly, because it's checking I/O bits
on a BAR which are ignored according to the spec).

> > Region 0: [virtual] Memory at 01f0 (32-bit, non-prefetchable) [disabled] [size=8]
> > Region 1: [virtual] Memory at 03f0 (type 3, non-prefetchable) [disabled] [size=1]
> > Region 2: [virtual] Memory at 0170 (32-bit, non-prefetchable) [disabled] [size=8]
> > Region 3: [virtual] Memory at 0370 (type 3, non-prefetchable) [disabled] [size=1]

The controller is disabled, and when disabled it seems to think it's memory.
Valid but interesting.
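As a rough illustration of the point about BARs being ignored: the lspci output in this thread shows prog-if 80, and in the PCI IDE programming-interface byte bit 0 and bit 2 select native mode for the primary and secondary channel. With both clear, the channels run in legacy mode on the fixed ports and BAR0-3 are not what the device decodes. The helper names below are made up for this sketch; it is not the code from the dropped patches.

```c
#include <assert.h>
#include <stdint.h>

/* Programming-interface bits for a PCI IDE controller (class 01, subclass 01). */
#define IDE_PROGIF_PRIMARY_NATIVE   0x01
#define IDE_PROGIF_SECONDARY_NATIVE 0x04

/* Hypothetical helper: 1 if the channel (0 = primary, 1 = secondary) is legacy. */
static int ide_channel_is_legacy(uint8_t prog_if, int channel)
{
	uint8_t bit = channel == 0 ? IDE_PROGIF_PRIMARY_NATIVE
				   : IDE_PROGIF_SECONDARY_NATIVE;

	return (prog_if & bit) == 0;
}

/*
 * Hypothetical helper: 1 if the contents of this BAR should be looked at.
 * BAR0/1 belong to the primary channel and BAR2/3 to the secondary one;
 * other BARs are not governed by the legacy/native bits.
 */
static int ide_bar_is_meaningful(uint8_t prog_if, int bar)
{
	if (bar < 0 || bar > 3)
		return 1;
	return !ide_channel_is_legacy(prog_if, bar / 2);
}
```

With prog_if = 0x80 as on the reported machine, both channels come out as legacy, so a check of the I/O bit in BAR0-3 is looking at registers that do not matter in that mode.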
Re: 2.6.23-rc2-mm1
On Thu, 09 Aug 2007 23:36:04 +0200 Gabriel C <[EMAIL PROTECTED]> wrote: > > ... > > > +pci-align-bar-settings-for-legacy-mode-ide.patch > > +pci-align-bar-settings-for-legacy-mode-ide-fix.patch > > > > 2.6.23 material, but these belong to subssytem trees > > > > ... > > > These broke the IDE controller , using libata on my Dell Workstation .. > > Reverting both fixes the problem. > > > .. > > [ 28.828484] :00:1f.1: cannot adjust BAR0 (not I/O) > [ 28.828487] :00:1f.1: cannot adjust BAR1 (not I/O) > [ 28.828489] :00:1f.1: cannot adjust BAR2 (not I/O) > [ 28.828491] :00:1f.1: cannot adjust BAR3 (not I/O) > ... > > [ 44.564308] ata_piix :00:1f.1: version 2.11 > [ 44.564365] ata_piix :00:1f.1: no available native port > > ... > > And my CDROM and second ide disc gone. > > $ lspci -vvvxxx > > ... > > 00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 04) > (prog-if 80 [Master]) > Subsystem: Dell Unknown device 00d8 > Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- > Stepping- SERR- FastB2B- > Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- > SERR- Latency: 0 > Region 0: [virtual] Memory at 01f0 (32-bit, non-prefetchable) > [disabled] [size=8] > Region 1: [virtual] Memory at 03f0 (type 3, non-prefetchable) > [disabled] [size=1] > Region 2: [virtual] Memory at 0170 (32-bit, non-prefetchable) > [disabled] [size=8] > Region 3: [virtual] Memory at 0370 (type 3, non-prefetchable) > [disabled] [size=1] > Region 4: I/O ports at ffa0 [size=16] > 00: 86 80 4b 24 05 00 80 02 04 80 01 01 00 00 00 00 > 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 20: a1 ff 00 00 00 00 00 00 00 00 00 00 28 10 d8 00 > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 40: 07 e3 03 e3 00 00 00 00 05 00 01 02 00 00 00 00 > 50: 00 00 00 00 50 10 00 00 00 00 00 00 00 00 00 00 > 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 47 0f 00 00 00 00 00 00
>
> ...
>
> config is attached.
>

Great, thanks for working that out.

I'll drop them, thereby breaking other people's stuff.

This is rather a disaster.
Re: 2.6.23-rc2-mm1
... > +pci-align-bar-settings-for-legacy-mode-ide.patch > +pci-align-bar-settings-for-legacy-mode-ide-fix.patch > > 2.6.23 material, but these belong to subssytem trees > ... These broke the IDE controller , using libata on my Dell Workstation .. Reverting both fixes the problem. .. [ 28.828484] :00:1f.1: cannot adjust BAR0 (not I/O) [ 28.828487] :00:1f.1: cannot adjust BAR1 (not I/O) [ 28.828489] :00:1f.1: cannot adjust BAR2 (not I/O) [ 28.828491] :00:1f.1: cannot adjust BAR3 (not I/O) ... [ 44.564308] ata_piix :00:1f.1: version 2.11 [ 44.564365] ata_piix :00:1f.1: no available native port ... And my CDROM and second ide disc gone. $ lspci -vvvxxx ... 00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 04) (prog-if 80 [Master]) Subsystem: Dell Unknown device 00d8 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- # # Automatically generated make config: don't edit # Linux kernel version: 2.6.23-rc2-mm1 # Thu Aug 9 15:19:34 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_NONIRQ_WAKEUP=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_SWAP=y CONFIG_SWAP_PREFETCH=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # CONFIG_AUDIT is not 
set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 # CONFIG_CONTAINERS is not set # CONFIG_SYSFS_DEPRECATED is not set CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y # CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_EXTRA_PASS=y CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLUB_DEBUG=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_PROC_PAGE_MONITOR=y CONFIG_PROC_KPAGEMAP=y CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y CONFIG_MODVERSIONS=y # CONFIG_MODULE_SRCVERSION_ALL is not set CONFIG_KMOD=y CONFIG_STOP_MACHINE=y CONFIG_BLOCK=y CONFIG_LBD=y CONFIG_BLK_DEV_IO_TRACE=y CONFIG_LSF=y CONFIG_BLK_DEV_BSG=y # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y # CONFIG_DEFAULT_AS is not set # CONFIG_DEFAULT_DEADLINE is not set CONFIG_DEFAULT_CFQ=y # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="cfq" # # Processor type and features # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_SMP=y CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_PARAVIRT is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set CONFIG_MPENTIUM4=y # 
CONFIG_MCORE2 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=7 CONFIG_X86_XADD=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_
Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected
On Thu, 9 Aug 2007 16:24:48 -0400 "Miles Lane" <[EMAIL PROTECTED]> wrote: > [ INFO: possible circular locking dependency detected ] > 2.6.23-rc2-mm1 #7 > --- > kacpid/53 is trying to acquire lock: > (>lock){--..}, at: [] mutex_lock+0x1c/0x1f > > but task is already holding lock: > (>work){--..}, at: [] run_workqueue+0xa0/0x182 > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: > > -> #2 (>work){--..}: >[] __lock_acquire+0x9a6/0xb6f >[] lock_acquire+0x61/0x7d >[] run_workqueue+0xb5/0x182 >[] worker_thread+0xb7/0xc2 >[] kthread+0x39/0x61 >[] kernel_thread_helper+0x7/0x10 >[] 0x > > -> #1 (kacpid){--..}: >[] __lock_acquire+0x9a6/0xb6f >[] lock_acquire+0x61/0x7d >[] flush_workqueue+0x2d/0x4f >[] acpi_os_wait_events_complete+0xd/0xf >[] acpi_remove_gpe_handler+0x7b/0xdd >[] ec_remove_handlers+0x26/0x29 >[] acpi_ec_add+0x8f/0x13e >[] acpi_device_probe+0x3e/0xdb >[] driver_probe_device+0xd7/0x14d >[] __driver_attach+0x6a/0xa1 >[] bus_for_each_dev+0x36/0x5b >[] driver_attach+0x14/0x16 >[] bus_add_driver+0x70/0x16c >[] driver_register+0x60/0x65 >[] acpi_bus_register_driver+0x3a/0x3c >[] acpi_ec_init+0x36/0x55 >[] kernel_init+0xc5/0x20f >[] kernel_thread_helper+0x7/0x10 >[] 0x > > -> #0 (>lock){--..}: >[] __lock_acquire+0x8c6/0xb6f >[] lock_acquire+0x61/0x7d >[] __mutex_lock_slowpath+0xbc/0x241 >[] mutex_lock+0x1c/0x1f >[] acpi_ec_transaction+0x65/0x1c1 >[] acpi_ec_gpe_query+0x2b/0xab >[] acpi_os_execute_deferred+0x20/0x31 >[] run_workqueue+0xba/0x182 >[] worker_thread+0xb7/0xc2 >[] kthread+0x39/0x61 >[] kernel_thread_helper+0x7/0x10 >[] 0x > > other info that might help us debug this: > > 2 locks held by kacpid/53: > #0: (kacpid){--..}, at: [] run_workqueue+0x85/0x182 > #1: (>work){--..}, at: [] run_workqueue+0xa0/0x182 > > stack backtrace: > [] show_trace_log_lvl+0x12/0x25 > [] show_trace+0xd/0x10 > [] dump_stack+0x15/0x17 > [] print_circular_bug_tail+0x5a/0x65 > [] __lock_acquire+0x8c6/0xb6f > [] 
lock_acquire+0x61/0x7d
> [] __mutex_lock_slowpath+0xbc/0x241
> [] mutex_lock+0x1c/0x1f
> [] acpi_ec_transaction+0x65/0x1c1
> [] acpi_ec_gpe_query+0x2b/0xab
> [] acpi_os_execute_deferred+0x20/0x31
> [] run_workqueue+0xba/0x182
> [] worker_thread+0xb7/0xc2
> [] kthread+0x39/0x61
> [] kernel_thread_helper+0x7/0x10
> ===

Presumably the new debugging patches in -mm
(workqueue-debug-flushing-deadlocks-with-lockdep.patch and
workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have
found a potential deadlock in ACPI. I don't have time to pick through the
code to confirm that, but boy I'm good at adding cc's ;)
Re: 2.6.23-rc2-mm1: silly df numbers on 32bit extN
On Thu, 9 Aug 2007 21:17:20 +0100 (BST) Hugh Dickins <[EMAIL PROTECTED]> wrote:

> On Thu, 9 Aug 2007, Andrew Morton wrote:
> >
> > +lib-make-percpu_counter_add-take-s64.patch
>
> lib-make-percpu_counter_add-take-s64.patch looks sensible, but it doesn't
> actually work on 32-bit architectures: several users of percpu_counter_add
> are passing -unsignedlong as the amount, which is not promoted to s64 in
> the desired way, so "df" on extN filesystems is showing silly numbers.
>
> The hack below (say long instead of s64 or s32) may be good as hotfix for
> 2.6.23-rc2-mm1, but is probably the worst of solutions. Perhaps take-s64
> should be reverted, perhaps there should be a percpu_counter_sub and the
> filesystems use that instead of saying -unsignedlong, perhaps they should
> use a cast or a long or an s64. I don't know, but here's this for now...

Thanks. I think I'll quietly tip the whole patch series overboard and
shoot for a quick rc2-mm2 rather than trying to patch it up in-situ.

I haven't had a chance to review it all in recent months. Vague first
impressions are that it all goes a bit rampant and changes more than it
needs to, but I'll take a closer look at that if Peter can provide us with
the next version (please).
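The promotion problem Hugh describes can be reproduced in plain userspace C. This is only a sketch under assumptions: ulong32 stands in for unsigned long on a 32-bit architecture (so it behaves the same on a 64-bit host), int is assumed to be 32 bits wide, and add_s64(), sub_by_negating(), sub_with_cast() and sub_via_long32() are invented names, not the percpu_counter API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for "unsigned long" on a 32-bit architecture. */
typedef uint32_t ulong32;

/* Stand-in for an add routine whose amount parameter is a signed 64-bit value. */
static int64_t add_s64(int64_t count, int64_t amount)
{
	return count + amount;
}

/* Buggy call pattern: negate the unsigned value, then let it widen to s64. */
static int64_t sub_by_negating(int64_t count, ulong32 blocks)
{
	/* -blocks is still an unsigned 32-bit value, so it is zero-extended. */
	return add_s64(count, -blocks);
}

/* One possible fix: widen to a signed 64-bit type before negating. */
static int64_t sub_with_cast(int64_t count, ulong32 blocks)
{
	return add_s64(count, -(int64_t)blocks);
}

/*
 * The hotfix idea: squeeze the amount through a signed 32-bit parameter first.
 * The unsigned-to-signed conversion is implementation-defined in C; gcc and
 * clang wrap modulo 2^32, which is what this sketch assumes.
 */
static int64_t sub_via_long32(int64_t count, ulong32 blocks)
{
	int32_t amount = (int32_t)-blocks;

	return add_s64(count, amount);
}
```

Starting from a count of 100 and subtracting 5 blocks, the negating variant ends up a little above 2^32 instead of at 95, which is the kind of number "df" would then report; both the cast and the narrower parameter give 95.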
Re: 2.6.23-rc2-mm1
On 09/08/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
...
> - If there's a patch in here which you think should be in 2.6.23 but I do
>   not have it designated in that way, please be sure to let me know.
...

Well, if you want to clean up your patch queue a bit then I have a few
suggestions for some patches of mine that are currently in -mm that you
could push to Linus. They are not really important, so if you'd rather
keep them around in -mm until the next merge window then that's fine by
me, but they should be safe to push and would cut down a bit on the
number of patches you track :-)

This fix was already merged for the ati side of things, this is an
identical fix for the amd side of things - I see no reason why we
shouldn't get this fix into 2.6.23 as well :

  fix-use-after-free--double-free-bug-in-amd_create_gatt_pages--amd_free_gatt_pages.patch

This patch only changes the output of the script when run without
arguments, so as far as building the kernel goes it can't cause any
regressions and it's a clear improvement over what we currently have, so
might as well get it out of your queue and upstream :

  improve-scripts-gcc-versionsh-output-a-bit-when-called-without-args.patch

When people use scripts/ver_linux in bugreports we want correct info -
currently we often print wrong info for the binutils version. This patch
doesn't hurt existing working scenarios but does fix a few broken ones,
might as well get that merged, it's a clear fix :

  scripts-ver_linux-correct-printing-of-binutils-version.patch

These should all be trivially correct since they just remove duplicate
#include's of the same header into a .c file outside any #ifdef and
similar magic, so they should be quite safe to push. Also, I haven't seen
anything but ACK's in response to them (when I've seen a response), and a
few of my similar patches have already been merged :

  powerpc-clean-out-a-bunch-of-duplicate-includes.patch
  clean-up-duplicate-includes-in-drivers-input.patch
  clean-up-duplicate-includes-in-drivers-net.patch
  clean-up-duplicate-includes-in-drivers-atm.patch
  clean-up-duplicate-includes-in-net-atm.patch
  clean-up-duplicate-includes-in-net-ipv4.patch
  clean-up-duplicate-includes-in-net-ipv6.patch
  clean-up-duplicate-includes-in-net-sched.patch
  clean-up-duplicate-includes-in-net-sunrpc.patch
  clean-up-duplicate-includes-in-net-tipc.patch
  clean-up-duplicate-includes-in-net-xfrm.patch
  clean-up-duplicate-includes-in-include-linux-nfs_fsh.patch
  clean-up-duplicate-includes-in-fs-ntfs.patch
  clean-up-duplicate-includes-in-drivers-scsi.patch
  clean-up-duplicate-includes-in-drivers-block.patch
  clean-up-duplicate-includes-in-arch-i386-xen.patch
  clean-up-duplicate-includes-in-include-linux-memory_hotplugh.patch
  clean-up-duplicate-includes-in-mm.patch
  clean-up-duplicate-includes-in-drivers-char.patch
  clean-up-duplicate-includes-in-drivers-w1.patch
  clean-up-duplicate-includes-in-fs.patch
  clean-up-duplicate-includes-in-fs-ecryptfs.patch
  clean-up-duplicate-includes-in-kernel.patch
  clean-up-duplicate-includes-in-drivers-spi.patch
  clean-up-duplicate-includes-in-documentation.patch

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected
[ INFO: possible circular locking dependency detected ] 2.6.23-rc2-mm1 #7 --- kacpid/53 is trying to acquire lock: (>lock){--..}, at: [] mutex_lock+0x1c/0x1f but task is already holding lock: (>work){--..}, at: [] run_workqueue+0xa0/0x182 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (>work){--..}: [] __lock_acquire+0x9a6/0xb6f [] lock_acquire+0x61/0x7d [] run_workqueue+0xb5/0x182 [] worker_thread+0xb7/0xc2 [] kthread+0x39/0x61 [] kernel_thread_helper+0x7/0x10 [] 0x -> #1 (kacpid){--..}: [] __lock_acquire+0x9a6/0xb6f [] lock_acquire+0x61/0x7d [] flush_workqueue+0x2d/0x4f [] acpi_os_wait_events_complete+0xd/0xf [] acpi_remove_gpe_handler+0x7b/0xdd [] ec_remove_handlers+0x26/0x29 [] acpi_ec_add+0x8f/0x13e [] acpi_device_probe+0x3e/0xdb [] driver_probe_device+0xd7/0x14d [] __driver_attach+0x6a/0xa1 [] bus_for_each_dev+0x36/0x5b [] driver_attach+0x14/0x16 [] bus_add_driver+0x70/0x16c [] driver_register+0x60/0x65 [] acpi_bus_register_driver+0x3a/0x3c [] acpi_ec_init+0x36/0x55 [] kernel_init+0xc5/0x20f [] kernel_thread_helper+0x7/0x10 [] 0x -> #0 (>lock){--..}: [] __lock_acquire+0x8c6/0xb6f [] lock_acquire+0x61/0x7d [] __mutex_lock_slowpath+0xbc/0x241 [] mutex_lock+0x1c/0x1f [] acpi_ec_transaction+0x65/0x1c1 [] acpi_ec_gpe_query+0x2b/0xab [] acpi_os_execute_deferred+0x20/0x31 [] run_workqueue+0xba/0x182 [] worker_thread+0xb7/0xc2 [] kthread+0x39/0x61 [] kernel_thread_helper+0x7/0x10 [] 0x other info that might help us debug this: 2 locks held by kacpid/53: #0: (kacpid){--..}, at: [] run_workqueue+0x85/0x182 #1: (>work){--..}, at: [] run_workqueue+0xa0/0x182 stack backtrace: [] show_trace_log_lvl+0x12/0x25 [] show_trace+0xd/0x10 [] dump_stack+0x15/0x17 [] print_circular_bug_tail+0x5a/0x65 [] __lock_acquire+0x8c6/0xb6f [] lock_acquire+0x61/0x7d [] __mutex_lock_slowpath+0xbc/0x241 [] mutex_lock+0x1c/0x1f [] acpi_ec_transaction+0x65/0x1c1 [] acpi_ec_gpe_query+0x2b/0xab [] acpi_os_execute_deferred+0x20/0x31 [] 
run_workqueue+0xba/0x182 [] worker_thread+0xb7/0xc2 [] kthread+0x39/0x61 [] kernel_thread_helper+0x7/0x10 ===
Re: 2.6.23-rc2-mm1: silly df numbers on 32bit extN
On Thu, 9 Aug 2007, Andrew Morton wrote:
>
> +lib-make-percpu_counter_add-take-s64.patch

lib-make-percpu_counter_add-take-s64.patch looks sensible, but it doesn't
actually work on 32-bit architectures: several users of percpu_counter_add
are passing -unsignedlong as the amount, which is not promoted to s64 in
the desired way, so "df" on extN filesystems is showing silly numbers.

The hack below (say long instead of s64 or s32) may be good as hotfix for
2.6.23-rc2-mm1, but is probably the worst of solutions. Perhaps take-s64
should be reverted, perhaps there should be a percpu_counter_sub and the
filesystems use that instead of saying -unsignedlong, perhaps they should
use a cast or a long or an s64. I don't know, but here's this for now...

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 include/linux/percpu_counter.h |   13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

--- 2.6.23-rc2-mm1/include/linux/percpu_counter.h	2007-08-09 13:15:35.0 +0100
+++ linux/include/linux/percpu_counter.h	2007-08-09 20:34:23.0 +0100
@@ -37,7 +37,7 @@ void percpu_counter_set(struct percpu_co
 void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch);
 s64 __percpu_counter_sum(struct percpu_counter *fbc);
 
-static inline void percpu_counter_add(struct percpu_counter *fbc, s64 amount)
+static inline void percpu_counter_add(struct percpu_counter *fbc, long amount)
 {
 	__percpu_counter_add(fbc, amount, FBC_BATCH);
 }
@@ -96,11 +96,16 @@ static inline void percpu_counter_set(st
 	fbc->count = amount;
 }
 
-#define __percpu_counter_add(fbc, amount, batch) \
-	percpu_counter_add(fbc, amount)
+static inline void
+__percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch)
+{
+	preempt_disable();
+	fbc->count += amount;
+	preempt_enable();
+}
 
 static inline void
-percpu_counter_add(struct percpu_counter *fbc, s64 amount)
+percpu_counter_add(struct percpu_counter *fbc, long amount)
 {
 	preempt_disable();
 	fbc->count += amount;
Re: 2.6.23-rc2-mm1
Andrew Morton wrote:
> umm, I was hoping to find out which of those two patches was the culprit.
> Almost surely it was 9ee6b32a47b9abc565466a9c3b127a5246b452e5?

Highly likely it is my patch in #ALL.

	Jeff
Re: 2.6.23-rc2-mm1
On Thu, 9 Aug 2007 21:04:35 +0200 "Michal Piotrowski" <[EMAIL PROTECTED]> wrote: > On 09/08/07, Andrew Morton <[EMAIL PROTECTED]> wrote: > > On Thu, 09 Aug 2007 15:23:41 +0200 > > Michal Piotrowski <[EMAIL PROTECTED]> wrote: > > > > > Andrew Morton pisze: > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/ > > > > > > I am experiencing some problems with 8139too > > > > > > [ 28.847004] 8139too :02:0d.0: region #0 not a PIO resource, > > > aborting > > > [ 28.854722] Bad IO access at port 0 () > > > [ 28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 > > > bad_io_access() > > > [ 28.867415] [] show_trace_log_lvl+0x1a/0x30 > > > [ 28.873568] [] show_trace+0x12/0x14 > > > [ 28.879015] [] dump_stack+0x16/0x18 > > > [ 28.884451] [] bad_io_access+0x58/0x5a > > > [ 28.890129] [] pci_iounmap+0x21/0x2b > > > [ 28.895635] [] __rtl8139_cleanup_dev+0x75/0xc6 > > > [ 28.902037] [] rtl8139_init_one+0x59b/0xa9f > > > [ 28.908170] [] pci_device_probe+0x44/0x5f > > > [ 28.914116] [] driver_probe_device+0xa7/0x19a > > > [ 28.920402] [] __driver_attach+0xa6/0xa8 > > > [ 28.926236] [] bus_for_each_dev+0x43/0x61 > > > [ 28.932139] [] driver_attach+0x19/0x1b > > > [ 28.937776] [] bus_add_driver+0x7e/0x1a5 > > > [ 28.943567] [] driver_register+0x45/0x75 > > > [ 28.949358] [] __pci_register_driver+0x56/0x84 > > > [ 28.955678] [] rtl8139_init_module+0x14/0x1c > > > [ 28.961832] [] kernel_init+0x132/0x306 > > > [ 28.967451] [] kernel_thread_helper+0x7/0x14 > > > [ 28.973588] === > > > [ 28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() > > > returned 0. 
> > > [ 28.986114] initcall 0xc0819104 ran for 161 msecs: rtl8139_init_module+0x0/0x1c()
> > >
> > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg
> > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config
> > >
> >
> > Please try reverting 8139too-force-media-setting-fix.patch, then
> > applying this:
> >
>
> Problem fixed, thanks!
>

umm, I was hoping to find out which of those two patches was the culprit.

Almost surely it was 9ee6b32a47b9abc565466a9c3b127a5246b452e5?
Re: 2.6.23-rc2-mm1
On 09/08/07, Andrew Morton <[EMAIL PROTECTED]> wrote: > On Thu, 09 Aug 2007 15:23:41 +0200 > Michal Piotrowski <[EMAIL PROTECTED]> wrote: > > > Andrew Morton pisze: > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/ > > > > I am experiencing some problems with 8139too > > > > [ 28.847004] 8139too :02:0d.0: region #0 not a PIO resource, aborting > > [ 28.854722] Bad IO access at port 0 () > > [ 28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 > > bad_io_access() > > [ 28.867415] [] show_trace_log_lvl+0x1a/0x30 > > [ 28.873568] [] show_trace+0x12/0x14 > > [ 28.879015] [] dump_stack+0x16/0x18 > > [ 28.884451] [] bad_io_access+0x58/0x5a > > [ 28.890129] [] pci_iounmap+0x21/0x2b > > [ 28.895635] [] __rtl8139_cleanup_dev+0x75/0xc6 > > [ 28.902037] [] rtl8139_init_one+0x59b/0xa9f > > [ 28.908170] [] pci_device_probe+0x44/0x5f > > [ 28.914116] [] driver_probe_device+0xa7/0x19a > > [ 28.920402] [] __driver_attach+0xa6/0xa8 > > [ 28.926236] [] bus_for_each_dev+0x43/0x61 > > [ 28.932139] [] driver_attach+0x19/0x1b > > [ 28.937776] [] bus_add_driver+0x7e/0x1a5 > > [ 28.943567] [] driver_register+0x45/0x75 > > [ 28.949358] [] __pci_register_driver+0x56/0x84 > > [ 28.955678] [] rtl8139_init_module+0x14/0x1c > > [ 28.961832] [] kernel_init+0x132/0x306 > > [ 28.967451] [] kernel_thread_helper+0x7/0x14 > > [ 28.973588] === > > [ 28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned > > 0. > > [ 28.986114] initcall 0xc0819104 ran for 161 msecs: > > rtl8139_init_module+0x0/0x1c() > > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config > > > > Please try reverting 8139too-force-media-setting-fix.patch, then > applying this: > > Problem fixed, thanks! 
Regards, Michal -- LOG http://www.stardust.webpages.pl/log/
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'
On Thu, 9 Aug 2007 10:18:15 -0400 "Miles Lane" <[EMAIL PROTECTED]> wrote:

>   CC      drivers/dma/ioat_dca.o
> drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.

Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
supported architectures (i386, x86_64, ia64) should provide a non-SMP
version of cpu_physical_id(). Preferably the latter, I'd say.

Something like this, I suppose...

From: Andrew Morton <[EMAIL PROTECTED]>

i386, x86_64 and ia64 implement cpu_physical_id() if CONFIG_SMP=y. Provide a
uniprocessor stub so that callers will dtrt.

Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: "Luck, Tony" <[EMAIL PROTECTED]>
Cc: Shannon Nelson <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 include/linux/smp.h |    5 +
 1 files changed, 5 insertions(+)

diff -puN include/linux/smp.h~implement-cpu_physical_id-on-smp=n include/linux/smp.h
--- a/include/linux/smp.h~implement-cpu_physical_id-on-smp=n
+++ a/include/linux/smp.h
@@ -108,6 +108,11 @@ static inline void smp_send_reschedule(i
 	0;					\
 })
 
+static inline unsigned cpu_physical_id(unsigned cpu)
+{
+	return 0;
+}
+
 #endif /* !SMP */
 
 /*
_
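The stub approach in the patch above can be tried outside the kernel. A minimal standalone sketch, assuming CONFIG_SMP is simply left undefined to model a uniprocessor build; demo_tag_for_cpu() is a made-up caller and is not the ioat_dca code.

```c
#include <assert.h>

/* CONFIG_SMP is left undefined here to model a uniprocessor build. */
#ifdef CONFIG_SMP
unsigned cpu_physical_id(unsigned cpu);   /* would come from the architecture */
#else
/* Uniprocessor stub: there is only one CPU, so its physical id is 0. */
static inline unsigned cpu_physical_id(unsigned cpu)
{
	(void)cpu;
	return 0;
}
#endif

/* Hypothetical caller: builds the same way with or without CONFIG_SMP. */
static unsigned demo_tag_for_cpu(unsigned cpu)
{
	return cpu_physical_id(cpu) & 0xff;
}
```

The point of doing it in the header is that callers need no #ifdef of their own: the UP build compiles the call down to a constant.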
Re: 2.6.23-rc2-mm1
On Thu, 09 Aug 2007 15:23:41 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote: > Andrew Morton pisze: > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/ > > I am experiencing some problems with 8139too > > [ 28.847004] 8139too :02:0d.0: region #0 not a PIO resource, aborting > [ 28.854722] Bad IO access at port 0 () > [ 28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 bad_io_access() > [ 28.867415] [] show_trace_log_lvl+0x1a/0x30 > [ 28.873568] [] show_trace+0x12/0x14 > [ 28.879015] [] dump_stack+0x16/0x18 > [ 28.884451] [] bad_io_access+0x58/0x5a > [ 28.890129] [] pci_iounmap+0x21/0x2b > [ 28.895635] [] __rtl8139_cleanup_dev+0x75/0xc6 > [ 28.902037] [] rtl8139_init_one+0x59b/0xa9f > [ 28.908170] [] pci_device_probe+0x44/0x5f > [ 28.914116] [] driver_probe_device+0xa7/0x19a > [ 28.920402] [] __driver_attach+0xa6/0xa8 > [ 28.926236] [] bus_for_each_dev+0x43/0x61 > [ 28.932139] [] driver_attach+0x19/0x1b > [ 28.937776] [] bus_add_driver+0x7e/0x1a5 > [ 28.943567] [] driver_register+0x45/0x75 > [ 28.949358] [] __pci_register_driver+0x56/0x84 > [ 28.955678] [] rtl8139_init_module+0x14/0x1c > [ 28.961832] [] kernel_init+0x132/0x306 > [ 28.967451] [] kernel_thread_helper+0x7/0x14 > [ 28.973588] === > [ 28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned 0. 
> [ 28.986114] initcall 0xc0819104 ran for 161 msecs: > rtl8139_init_module+0x0/0x1c() > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config > Please try reverting 8139too-force-media-setting-fix.patch, then applying this: From: Andrew Morton <[EMAIL PROTECTED]> Revert git-netdev-all's 9ee6b32a47b9abc565466a9c3b127a5246b452e5 Cc: Michal Piotrowski <[EMAIL PROTECTED]> Cc: Jeff Garzik <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- drivers/net/8139too.c | 50 +++- 1 file changed, 29 insertions(+), 21 deletions(-) diff -puN drivers/net/8139too.c~revert-8139too-clean-up-i-o-remapping drivers/net/8139too.c --- a/drivers/net/8139too.c~revert-8139too-clean-up-i-o-remapping +++ a/drivers/net/8139too.c @@ -121,15 +121,8 @@ /* enable PIO instead of MMIO, if CONFIG_8139TOO_PIO is selected */ -enum rtl_bar_map_info { - rtl_pio_bar = 0,/* PCI BAR #0: PIO */ - rtl_mmio_bar= 1,/* PCI BAR #1: MMIO */ -}; - #ifdef CONFIG_8139TOO_PIO -static int use_pio = 1; -#else -static int use_pio; +#define USE_IO_OPS 1 #endif /* define to 1, 2 or 3 to enable copious debugging info */ @@ -620,17 +613,14 @@ MODULE_DESCRIPTION ("RealTek RTL-8139 Fa MODULE_LICENSE("GPL"); MODULE_VERSION(DRV_VERSION); -module_param(multicast_filter_limit, int, 0444); +module_param(multicast_filter_limit, int, 0); module_param_array(media, int, NULL, 0); module_param_array(full_duplex, int, NULL, 0); -module_param(debug, int, 0444); -module_param(use_pio, int, 0444); - +module_param(debug, int, 0); +MODULE_PARM_DESC (debug, "8139too bitmapped message enable number"); MODULE_PARM_DESC (multicast_filter_limit, "8139too maximum number of filtered multicast addresses"); MODULE_PARM_DESC (media, "8139too: Bits 4+9: force full duplex, bit 5: 100Mbps"); MODULE_PARM_DESC (full_duplex, "8139too: Force full duplex for board(s) (1)"); -MODULE_PARM_DESC (debug, "8139too bitmapped message 
enable number"); -MODULE_PARM_DESC (use_pio, "Non-zero to enable PIO (rather than MMIO) register mapping"); static int read_eeprom (void __iomem *ioaddr, int location, int addr_len); static int rtl8139_open (struct net_device *dev); @@ -718,7 +708,13 @@ static void __rtl8139_cleanup_dev (struc assert (tp->pci_dev != NULL); pdev = tp->pci_dev; - pci_iounmap (pdev, tp->mmio_addr); +#ifdef USE_IO_OPS + if (tp->mmio_addr) + ioport_unmap (tp->mmio_addr); +#else + if (tp->mmio_addr) + pci_iounmap (pdev, tp->mmio_addr); +#endif /* USE_IO_OPS */ /* it's ok to call this even if we have no regions to free */ pci_release_regions (pdev); @@ -794,32 +790,32 @@ static int __devinit rtl8139_init_board DPRINTK("PIO region size == 0x%02X\n", pio_len); DPRINTK("MMIO region size == 0x%02lX\n", mmio_len); +#ifdef USE_IO_OPS /* make sure PCI base addr 0 is PIO */ if (!(pio_flags & IORESOURCE_IO)) { dev_err(>dev, "region #0 not a PIO resource, aborting\n"); rc = -ENODEV; goto err_out; } - /* check for weird/broke
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ‘cpu_physical_id’
On 8/9/07, Adrian Bunk <[EMAIL PROTECTED]> wrote: > On Thu, Aug 09, 2007 at 10:18:15AM -0400, Miles Lane wrote: > > CC drivers/dma/ioat_dca.o > > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag': > > drivers/dma/ioat_dca.c:177: error: implicit declaration of function > > 'cpu_physical_id' > > make[2]: *** [drivers/dma/ioat_dca.o] Error 1 > > -ENODOTCONFIG # # Automatically generated make config: don't edit # Linux kernel version: 2.6.23-rc2-mm1 # Thu Aug 9 12:18:45 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_NONIRQ_WAKEUP=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_SWAP=y CONFIG_SWAP_PREFETCH=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=18 CONFIG_CONTAINERS=y CONFIG_CONTAINER_DEBUG=y CONFIG_CONTAINER_NS=y CONFIG_CONTAINER_CPUACCT=y # CONFIG_SYSFS_DEPRECATED is not set CONFIG_RELAY=y CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y # CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y 
CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLUB_DEBUG=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_PROC_PAGE_MONITOR=y CONFIG_PROC_KPAGEMAP=y CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y # CONFIG_MODVERSIONS is not set # CONFIG_MODULE_SRCVERSION_ALL is not set CONFIG_KMOD=y CONFIG_BLOCK=y CONFIG_LBD=y # CONFIG_BLK_DEV_IO_TRACE is not set # CONFIG_LSF is not set # CONFIG_BLK_DEV_BSG is not set # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=m CONFIG_IOSCHED_CFQ=m CONFIG_DEFAULT_AS=y # CONFIG_DEFAULT_DEADLINE is not set # CONFIG_DEFAULT_CFQ is not set # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="anticipatory" # # Processor type and features # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y # CONFIG_SMP is not set CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_PARAVIRT is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set CONFIG_MPENTIUM4=y # CONFIG_MCORE2 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y 
CONFIG_X86_L1_CACHE_SHIFT=7 CONFIG_X86_XADD=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_CPU_FAMILY=4 CONFIG_HPET_TIMER=y # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y CONFIG_PREEMPT_BKL=y CONFIG_X86_UP_APIC=y CONFIG_X86_UP_IOAPIC=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_MCE=y CONFIG_X86_MCE_NONFATAL=m CONFIG_X86_MCE_P4THERMAL=y CONFIG_VM86=y # CONFIG_TOSHIBA is not set # CONFIG_I8K is not set # CONFIG_X86_REBOOTFIXUPS is not set CONFIG_MICROCODE=m CONFIG_MICROCODE_OLD_INTERFACE=y CONFIG_X86_MSR=m CONFIG_X86_CPUID=m
Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)
On Thu, Aug 09, 2007 at 04:20:06PM +0200, Krzysztof Helt wrote:
> On Thu, 9 Aug 2007 14:04:49 +0100
> Andy Whitcroft <[EMAIL PROTECTED]> wrote:
>
> > Seeing the following compile error on a G5 mac:
> >
> > drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
> > drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this function)
> > drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is reported only once
> > drivers/video/tdfxfb.c:1341: error: for each function it appears in.)
> >
> > This seems to be the following fragment from tdfxfb-hardware-cursor:
> >
> > + } else if (!strcmp(this_opt, "hwcursor")) {
> > + hwcursor = simple_strtoul(opt + 9, NULL, 0);
> >
> > I guess the naive fix would be s/opt/this_opt, but I am also
> > suspicious of the +9 here as hwcursor is only 8 long? Now this
> > seems to take a numeric value and I assume that is via hwcursor=N,
> > if so then the +9 would make sense _if_ the strcmp was against
> > "hwcursor=".
>
> The patch below fixes all issues you have pointed out. It also fixes
> the description of the nomtrr option.

Will push this through our tests and let you know.

-apw
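To make the "+9" point concrete, here is a small userspace sketch of matching a "hwcursor=N" option by prefix. parse_hwcursor() is a hypothetical helper, not code from tdfxfb.c or from Krzysztof's patch, and it uses the C library's strtoul() where the driver uses simple_strtoul().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores N if this_opt is "hwcursor=N",
 * returns 0 (leaving *value untouched) otherwise. */
static int parse_hwcursor(const char *this_opt, unsigned long *value)
{
	static const char key[] = "hwcursor=";
	size_t len = sizeof(key) - 1;	/* 9: eight letters plus '=' */

	/* Compare only the prefix; strcmp() would demand an exact match. */
	if (strncmp(this_opt, key, len) != 0)
		return 0;

	/* The number starts right after the '=', i.e. at offset 9. */
	*value = strtoul(this_opt + len, NULL, 0);
	return 1;
}
```

The offset and the compared prefix come from the same string here, so they cannot drift apart the way "hwcursor" and "+ 9" did.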
Re: 2.6.23-rc2-mm1 -- spinlock bad magic
On Thu, Aug 09, 2007 at 01:53:06PM +0100, Andy Whitcroft wrote:
> Seeing spinlock bad magic BUG's from x86 and x86_64 test boxes,
> fsx-linux seems to be tickling it. Adding Peter as prop_norm_single
> seems to be his:

Talking to Peter on IRC he suggested I back out the patch below and retest on these machines:

mm-dirty-balancing-for-tasks

One machine seems to have hung elsewhere (probably another bug, sigh), and one has run the fsx-linux tests successfully. So this patch does seem suspect. I will report back on the other tests when they complete.

-apw
Re: [E1000-devel] 2.6.23-rc2-mm1: e1000e global symbols must be renamed
Adrian Bunk wrote:
> On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > ...
> > - There is a new e1000 driver in git-netdev-all, called e1000e. I'm sure
> >   the developers would like it tested. Please cc [EMAIL PROTECTED] on any reports.
> > ...
> > Changes since 2.6.23-rc2-mm1:
> > ...
> > git-netdev-all.patch
> > ...
> > git trees
> > ...
>
> <-- snip -->
>
> ...
> LD drivers/net/built-in.o
> drivers/net/e1000e/built-in.o: In function `e1000_read_mac_addr':
> (.text+0x3470): multiple definition of `e1000_read_mac_addr'
> drivers/net/e1000/built-in.o:(.text+0xb6cc): first defined here
> drivers/net/e1000e/built-in.o: In function `e1000_set_ethtool_ops':
> (.text+0x594d): multiple definition of `e1000_set_ethtool_ops'
> drivers/net/e1000/built-in.o:(.text+0xc97a): first defined here
> ...
> make[3]: *** [drivers/net/built-in.o] Error 1

ack, I'll step on that and make it go away :)

Auke
Re: 2.6.23-rc2-mm1
On Thu, 09 Aug 2007 18:19:30 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> This might be related. The kernel is tainted because I hit
> kernel BUG at /home/devel/linux-mm/mm/swap_state.c:78!

umm, possibly. If we went BUG while holding a spinlock then sure, a future lockup is pretty much inevitable. But the lockdep uninitialised-lock complaint is a bit of a surprise.

Can you please retest with Hugh's fix applied?
Re: 2.6.23-rc2-mm1
On Thu, 09 Aug 2007 17:36:57 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote: > Andrew Morton pisze: > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/ > > > > bash_shared_mapping triggered this > > [ 874.714700] INFO: trying to register non-static key. > [ 874.719659] the code is fine but needs lockdep annotation. > [ 874.725133] turning off the locking correctness validator. > [ 874.730606] [] show_trace_log_lvl+0x1a/0x30 > [ 874.735759] [] show_trace+0x12/0x14 > [ 874.740218] [] dump_stack+0x16/0x18 > [ 874.744679] [] __lock_acquire+0x598/0x125c > [ 874.749745] [] lock_acquire+0xa7/0xc1 > [ 874.754378] [] _spin_lock_irqsave+0x41/0x6e > [ 874.759529] [] prop_norm_single+0x34/0x8a > [ 874.764508] [] set_page_dirty+0xa1/0x13b > [ 874.769402] [] try_to_unmap_one+0xb8/0x1e7 > [ 874.774467] [] try_to_unmap+0x8f/0x40d > [ 874.779187] [] shrink_page_list+0x278/0x750 > [ 874.784339] [] shrink_inactive_list+0xf6/0x328 > [ 874.789749] [] shrink_zone+0xad/0x10b > [ 874.794383] [] try_to_free_pages+0x178/0x274 > [ 874.799620] [] __alloc_pages+0x169/0x431 > [ 874.804514] [] __do_page_cache_readahead+0x141/0x207 > [ 874.810443] [] do_page_cache_readahead+0x48/0x5c > [ 874.816027] [] filemap_fault+0x2dd/0x4cf > [ 874.820921] [] __do_fault+0xb6/0x42d > [ 874.825466] [] handle_mm_fault+0x1b6/0x750 > [ 874.830533] [] do_page_fault+0x334/0x5f9 > [ 874.835425] [] error_code+0x72/0x78 > [ 874.839886] === I'd assume that the lib/proportions code went and passed a garbage pointer into spin_lock_irqsave(). Or maybe it has a correct pointer but didn't initialise the spinlock. 

> [ 880.621883] BUG: NMI Watchdog detected LOCKUP on CPU1, eip c0529022, registers:
> [ 880.629200] Modules linked in: ext2 loop autofs4 af_packet nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT ip6table_filter ip6_tables x_tables firmware_class binfmt_misc fan ipv6 nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer evdev snd soundcore i2c_i801 snd_page_alloc intel_agp agpgart rtc
> [ 880.672397] CPU:1

This will be a consequence of that.
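As a rough illustration of why a never-initialised lock (or a garbage pointer that happens to land on arbitrary memory) can be detected at acquire time, here is a single-threaded toy model in the spirit of the "spinlock bad magic" reports in this thread. struct demo_lock, DEMO_LOCK_MAGIC and both functions are invented for this sketch; this is not the kernel's spinlock debugging code and it provides no real mutual exclusion.

```c
#include <stdio.h>

#define DEMO_LOCK_MAGIC 0x5a5a1234u	/* arbitrary value chosen for this sketch */

struct demo_lock {
	unsigned magic;		/* set once by demo_lock_init() */
	int locked;
};

static void demo_lock_init(struct demo_lock *l)
{
	l->magic = DEMO_LOCK_MAGIC;
	l->locked = 0;
}

/* Returns 0 on success, -1 if the lock looks uninitialised or corrupted. */
static int demo_lock_acquire(struct demo_lock *l)
{
	if (l->magic != DEMO_LOCK_MAGIC) {
		fprintf(stderr, "demo_lock: bad magic %#x\n", l->magic);
		return -1;
	}
	l->locked = 1;
	return 0;
}
```

Both failure modes suggested above look the same to such a check: the memory behind the pointer does not carry the expected marker, so the acquire is refused before any spinning starts.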
Re: 2.6.23-rc2-mm1
Michal Piotrowski pisze: > Andrew Morton pisze: >> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/ >> > > bash_shared_mapping triggered this > > [ 874.714700] INFO: trying to register non-static key. > [ 874.719659] the code is fine but needs lockdep annotation. > [ 874.725133] turning off the locking correctness validator. > [ 874.730606] [] show_trace_log_lvl+0x1a/0x30 > [ 874.735759] [] show_trace+0x12/0x14 > [ 874.740218] [] dump_stack+0x16/0x18 > [ 874.744679] [] __lock_acquire+0x598/0x125c > [ 874.749745] [] lock_acquire+0xa7/0xc1 > [ 874.754378] [] _spin_lock_irqsave+0x41/0x6e > [ 874.759529] [] prop_norm_single+0x34/0x8a > [ 874.764508] [] set_page_dirty+0xa1/0x13b > [ 874.769402] [] try_to_unmap_one+0xb8/0x1e7 > [ 874.774467] [] try_to_unmap+0x8f/0x40d > [ 874.779187] [] shrink_page_list+0x278/0x750 > [ 874.784339] [] shrink_inactive_list+0xf6/0x328 > [ 874.789749] [] shrink_zone+0xad/0x10b > [ 874.794383] [] try_to_free_pages+0x178/0x274 > [ 874.799620] [] __alloc_pages+0x169/0x431 > [ 874.804514] [] __do_page_cache_readahead+0x141/0x207 > [ 874.810443] [] do_page_cache_readahead+0x48/0x5c > [ 874.816027] [] filemap_fault+0x2dd/0x4cf > [ 874.820921] [] __do_fault+0xb6/0x42d > [ 874.825466] [] handle_mm_fault+0x1b6/0x750 > [ 874.830533] [] do_page_fault+0x334/0x5f9 > [ 874.835425] [] error_code+0x72/0x78 > [ 874.839886] === > [ 880.621883] BUG: NMI Watchdog detected LOCKUP on CPU1, eip c0529022, > registers: > [ 880.629200] Modules linked in: ext2 loop autofs4 af_packet > nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink > ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT ip6table_filter > ip6_tables x_tables firmware_class binfmt_misc fan ipv6 nvram snd_intel8x0 > snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq > snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer evdev snd > soundcore i2c_i801 snd_page_alloc intel_agp agpgart rtc > [ 880.672397] 
CPU:1 > [ 880.672398] EIP:0060:[]Not tainted VLI > [ 880.672400] EFLAGS: 0046 (2.6.23-rc2-mm1 #3) > [ 880.684735] EIP is at delay_tsc+0xe/0x17 > > l *delay_tsc+0xe > 0xc1129022 is in delay_tsc (/home/devel/linux-mm/arch/i386/lib/delay.c:49). > 44 > 45 rdtscl(bclock); > 46 do { > 47 rep_nop(); > 48 rdtscl(now); > 49 } while ((now-bclock) < loops); > 50 } > 51 > 52 /* > 53 * Since we calibrate only once at boot, this > > > [ 880.688646] eax: 393e5d7c ebx: 0001 ecx: 393e5d04 edx: 023f > [ 880.695414] esi: edi: cabbf5cc ebp: caf29ae8 esp: caf29ae4 > [ 880.702183] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 > [ 880.708002] Process firefox-bin (pid: 2625, ti=caf29000 task=cabbe900 > task.ti=caf29000) > [ 880.715805] Stack: 02e6eb94 caf29af0 c0528fdd caf29b28 c05375ac 0046 > caf29b28 > [ 880.724345]0046 a6c999f0 0001 a6c999f0 > cabbf5e0 cabbf5cc > [ 880.732887]0086 caf29b48 c069f76f 0002 c05259db > cabbf5c0 8000 > [ 880.741425] Call Trace: > [ 880.744073] [] show_trace_log_lvl+0x1a/0x30 > [ 880.749233] [] show_stack_log_lvl+0xa9/0xd5 > [ 880.754386] [] show_registers+0x21a/0x3ac > [ 880.759365] [] die_nmi+0x84/0xd7 > [ 880.763566] [] nmi_watchdog_tick+0x14d/0x168 > [ 880.768803] [] do_nmi+0x8b/0x284 > [ 880.773004] [] nmi_stack_correct+0x26/0x2b > [ 880.778069] [] __delay+0x9/0xb > [ 880.782098] [] _raw_spin_lock+0xd8/0x18a > [ 880.786991] [] _spin_lock_irqsave+0x5d/0x6e > [ 880.792143] [] prop_norm_single+0x34/0x8a > [ 880.797122] [] set_page_dirty+0xa1/0x13b > [ 880.802015] [] try_to_unmap_one+0xb8/0x1e7 > [ 880.807079] [] try_to_unmap+0x8f/0x40d > [ 880.811798] [] shrink_page_list+0x278/0x750 > [ 880.816950] [] shrink_inactive_list+0xf6/0x328 > [ 880.822362] [] shrink_zone+0xad/0x10b > [ 880.826997] [] try_to_free_pages+0x178/0x274 > [ 880.832235] [] __alloc_pages+0x169/0x431 > [ 880.837126] [] __do_page_cache_readahead+0x141/0x207 > [ 880.843056] [] do_page_cache_readahead+0x48/0x5c > [ 880.848641] [] filemap_fault+0x2dd/0x4cf > [ 880.853534] [] __do_fault+0xb6/0x42d 
> [ 880.858081] [] handle_mm_fault+0x1b6/0x750 > [ 880.863146] [] do_page_fault+0x334/0x5f9 > [ 880.868037] [] error_code+0x72/0x78 > [ 880.872497] === > [ 880.876068] INFO: lockdep is turned off. > [ 880.879983] Code: 8d 0c 1b 01 c9 89 da c1 e2 07 29 ca 01 da 0
Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!
Hugh Dickins pisze: > On Thu, 9 Aug 2007, Mariusz Kozlowski wrote: >> Hello, >> >> Nothing unusual happening, allmodconfig compiling etc. >> Not sure why it says kernel was tainted though ... hmmm. >> >> [ cut here ] >> kernel BUG at mm/swap_state.c:78! >> invalid opcode: [#1] >> PREEMPT >> Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia >> 8250_pci 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too >> CPU:0 >> EIP:0060:[]Tainted: PVLI >> EFLAGS: 00010246 (2.6.23-rc2-mm1 #1) >> EIP is at __add_to_swap_cache+0xc6/0xd7 >> eax: 4000 ebx: c11285c0 ecx: 00d0 edx: 0283 >> esi: c11285c0 edi: 0283 ebp: c1858f90 esp: c1858f84 >> ds: 007b es: 007b fs: gs: ss: 0068 >> Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000) >> Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0 >> c1858fcc >>c015307c 0001 0007 0002 0002 0283 >> fffc >> c0152d5c c1858fe0 c0127f2e c0127ef8 >> >> Call Trace: >> [] show_trace_log_lvl+0x1a/0x30 >> [] show_stack_log_lvl+0xa9/0xd5 >> [] show_registers+0x219/0x38d >> [] die+0x104/0x23e >> [] do_trap+0x83/0xad >> [] do_invalid_op+0x88/0x92 >> [] error_code+0x6a/0x70 >> [] add_to_swap_cache+0x22/0x58 >> [] kprefetchd+0x320/0x364 >> [] kthread+0x36/0x58 >> [] kernel_thread_helper+0x7/0x14 >> === >> INFO: lockdep is turned off. >> Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 >> 03 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 0f >> 0b eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 >> EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84 > The same issue here. > Don't worry about reproducing untainted, I got the same earlier > and was just preparing and testing the hotfix: here it is... > Thanks for the patch. 
Regards, Michal -- LOG http://www.stardust.webpages.pl/log/
Re: 2.6.23-rc2-mm1
Andrew Morton pisze: > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/ > bash_shared_mapping triggered this [ 874.714700] INFO: trying to register non-static key. [ 874.719659] the code is fine but needs lockdep annotation. [ 874.725133] turning off the locking correctness validator. [ 874.730606] [] show_trace_log_lvl+0x1a/0x30 [ 874.735759] [] show_trace+0x12/0x14 [ 874.740218] [] dump_stack+0x16/0x18 [ 874.744679] [] __lock_acquire+0x598/0x125c [ 874.749745] [] lock_acquire+0xa7/0xc1 [ 874.754378] [] _spin_lock_irqsave+0x41/0x6e [ 874.759529] [] prop_norm_single+0x34/0x8a [ 874.764508] [] set_page_dirty+0xa1/0x13b [ 874.769402] [] try_to_unmap_one+0xb8/0x1e7 [ 874.774467] [] try_to_unmap+0x8f/0x40d [ 874.779187] [] shrink_page_list+0x278/0x750 [ 874.784339] [] shrink_inactive_list+0xf6/0x328 [ 874.789749] [] shrink_zone+0xad/0x10b [ 874.794383] [] try_to_free_pages+0x178/0x274 [ 874.799620] [] __alloc_pages+0x169/0x431 [ 874.804514] [] __do_page_cache_readahead+0x141/0x207 [ 874.810443] [] do_page_cache_readahead+0x48/0x5c [ 874.816027] [] filemap_fault+0x2dd/0x4cf [ 874.820921] [] __do_fault+0xb6/0x42d [ 874.825466] [] handle_mm_fault+0x1b6/0x750 [ 874.830533] [] do_page_fault+0x334/0x5f9 [ 874.835425] [] error_code+0x72/0x78 [ 874.839886] === [ 880.621883] BUG: NMI Watchdog detected LOCKUP on CPU1, eip c0529022, registers: [ 880.629200] Modules linked in: ext2 loop autofs4 af_packet nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT ip6table_filter ip6_tables x_tables firmware_class binfmt_misc fan ipv6 nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer evdev snd soundcore i2c_i801 snd_page_alloc intel_agp agpgart rtc [ 880.672397] CPU:1 [ 880.672398] EIP:0060:[]Not tainted VLI [ 880.672400] EFLAGS: 0046 (2.6.23-rc2-mm1 #3) [ 880.684735] 
EIP is at delay_tsc+0xe/0x17 l *delay_tsc+0xe 0xc1129022 is in delay_tsc (/home/devel/linux-mm/arch/i386/lib/delay.c:49). 44 45 rdtscl(bclock); 46 do { 47 rep_nop(); 48 rdtscl(now); 49 } while ((now-bclock) < loops); 50 } 51 52 /* 53 * Since we calibrate only once at boot, this [ 880.688646] eax: 393e5d7c ebx: 0001 ecx: 393e5d04 edx: 023f [ 880.695414] esi: edi: cabbf5cc ebp: caf29ae8 esp: caf29ae4 [ 880.702183] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 [ 880.708002] Process firefox-bin (pid: 2625, ti=caf29000 task=cabbe900 task.ti=caf29000) [ 880.715805] Stack: 02e6eb94 caf29af0 c0528fdd caf29b28 c05375ac 0046 caf29b28 [ 880.724345]0046 a6c999f0 0001 a6c999f0 cabbf5e0 cabbf5cc [ 880.732887]0086 caf29b48 c069f76f 0002 c05259db cabbf5c0 8000 [ 880.741425] Call Trace: [ 880.744073] [] show_trace_log_lvl+0x1a/0x30 [ 880.749233] [] show_stack_log_lvl+0xa9/0xd5 [ 880.754386] [] show_registers+0x21a/0x3ac [ 880.759365] [] die_nmi+0x84/0xd7 [ 880.763566] [] nmi_watchdog_tick+0x14d/0x168 [ 880.768803] [] do_nmi+0x8b/0x284 [ 880.773004] [] nmi_stack_correct+0x26/0x2b [ 880.778069] [] __delay+0x9/0xb [ 880.782098] [] _raw_spin_lock+0xd8/0x18a [ 880.786991] [] _spin_lock_irqsave+0x5d/0x6e [ 880.792143] [] prop_norm_single+0x34/0x8a [ 880.797122] [] set_page_dirty+0xa1/0x13b [ 880.802015] [] try_to_unmap_one+0xb8/0x1e7 [ 880.807079] [] try_to_unmap+0x8f/0x40d [ 880.811798] [] shrink_page_list+0x278/0x750 [ 880.816950] [] shrink_inactive_list+0xf6/0x328 [ 880.822362] [] shrink_zone+0xad/0x10b [ 880.826997] [] try_to_free_pages+0x178/0x274 [ 880.832235] [] __alloc_pages+0x169/0x431 [ 880.837126] [] __do_page_cache_readahead+0x141/0x207 [ 880.843056] [] do_page_cache_readahead+0x48/0x5c [ 880.848641] [] filemap_fault+0x2dd/0x4cf [ 880.853534] [] __do_fault+0xb6/0x42d [ 880.858081] [] handle_mm_fault+0x1b6/0x750 [ 880.863146] [] do_page_fault+0x334/0x5f9 [ 880.868037] [] error_code+0x72/0x78 [ 880.872497] === [ 880.876068] INFO: lockdep is turned off. 
[ 880.879983] Code: 8d 0c 1b 01 c9 89 da c1 e2 07 29 ca 01 da 01 d2 f7 e2 8d 42 01 e8 c3 ff ff ff 5b 5d c3 55 89 e5 53 89 c3 0f 31 89 c1 f3 90 0f 31 <29> c8 39 d8 72 f6 5b 5d c3 55 89 e5 53 69 c0 1c 43 00 00 64 8b [ 880.900092] Kernel panic - not syncing: Aiee, killing interrupt handler! [ 880.906791] WARNING: at /home/devel/linux-mm/arch/i386/kernel/smp.c:474 native_smp_send_reschedule() [ 880.915892] [] show_trace_log_lvl+0x1a/0x30 [ 880.921043] [] show_trace+0x12/0x14 [ 880.925504] [] dump_stack+0x16/0x18 [ 880.929964] [] native_smp_send_reschedule+0x8b/0x98 [ 880.935808] [] re
Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!
On Thu, 9 Aug 2007, Mariusz Kozlowski wrote: > Hello, > > Nothing unusual happening, allmodconfig compiling etc. > Not sure why it says kernel was tainted though ... hmmm. > > [ cut here ] > kernel BUG at mm/swap_state.c:78! > invalid opcode: [#1] > PREEMPT > Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia 8250_pci > 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too > CPU:0 > EIP:0060:[]Tainted: P VLI > EFLAGS: 00010246 (2.6.23-rc2-mm1 #1) > EIP is at __add_to_swap_cache+0xc6/0xd7 > eax: 4000 ebx: c11285c0 ecx: 00d0 edx: 0283 > esi: c11285c0 edi: 0283 ebp: c1858f90 esp: c1858f84 > ds: 007b es: 007b fs: gs: ss: 0068 > Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000) > Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0 > c1858fcc >c015307c 0001 0007 0002 0002 0283 > fffc > c0152d5c c1858fe0 c0127f2e c0127ef8 > > Call Trace: > [] show_trace_log_lvl+0x1a/0x30 > [] show_stack_log_lvl+0xa9/0xd5 > [] show_registers+0x219/0x38d > [] die+0x104/0x23e > [] do_trap+0x83/0xad > [] do_invalid_op+0x88/0x92 > [] error_code+0x6a/0x70 > [] add_to_swap_cache+0x22/0x58 > [] kprefetchd+0x320/0x364 > [] kthread+0x36/0x58 > [] kernel_thread_helper+0x7/0x14 > === > INFO: lockdep is turned off. > Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 03 > 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 0f 0b > eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 > EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84 Don't worry about reproducing untainted, I got the same earlier and was just preparing and testing the hotfix: here it is... Nick's mm-clarify-__add_to_swap_cache-locking.patch is fine for mainline, but soon generates a "kernel BUG at mm/swap_state.c:78!" when it meets mm-implement-swap-prefetching.patch in 2.6.23-rc2-mm1. 
We could add a fix to the latter, but I think it's better to adjust Nick's, so that it's right for whichever tree it's in: move the responsibility to SetPageLocked from read_swap_cache_async to add_to_swap_cache.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 mm/swap_state.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- 2.6.23-rc2-mm1/mm/swap_state.c	2007-08-09 13:15:36.0 +0100
+++ linux/mm/swap_state.c	2007-08-09 14:40:27.0 +0100
@@ -100,15 +100,18 @@ int add_to_swap_cache(struct page *page,
 {
 	int error;
 
+	BUG_ON(PageLocked(page));
 	if (!swap_duplicate(entry)) {
 		INC_CACHE_INFO(noent_race);
 		return -ENOENT;
 	}
+	SetPageLocked(page);
 	error = __add_to_swap_cache(page, entry, GFP_KERNEL);
 	/*
 	 * Anon pages are already on the LRU, we don't run lru_cache_add here.
 	 */
 	if (error) {
+		ClearPageLocked(page);
 		swap_free(entry);
 		if (error == -EEXIST)
 			INC_CACHE_INFO(exist_race);
@@ -345,7 +348,6 @@ struct page *read_swap_cache_async(swp_e
 						vma, addr);
 			if (!new_page)
 				break;		/* Out of memory */
-			SetPageLocked(new_page);/* could be non-atomic op */
 		}
 
 		/*
@@ -369,9 +371,7 @@ struct page *read_swap_cache_async(swp_e
 		}
 	} while (err != -ENOENT && err != -ENOMEM);
 
-	if (new_page) {
-		ClearPageLocked(new_page);
+	if (new_page)
 		page_cache_release(new_page);
-	}
 	return found_page;
 }
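The ownership change can be modelled in a few lines of userspace C. This is only a sketch: struct demo_page and both functions are hypothetical, demo_insert() is assumed (not confirmed from the source) to require a locked page the way __add_to_swap_cache() apparently does, and error returns stand in for the kernel's BUG_ON().

```c
#include <errno.h>

struct demo_page {
	int locked;
	int in_cache;
};

/* Hypothetical low-level insert; assumed to require a locked page. */
static int demo_insert(struct demo_page *p)
{
	if (!p->locked)
		return -EINVAL;	/* the kernel would BUG here, by assumption */
	if (p->in_cache)
		return -EEXIST;
	p->in_cache = 1;
	return 0;
}

/* The wrapper owns the lock: take it before inserting, drop it on failure. */
static int demo_add_to_cache(struct demo_page *p)
{
	int error;

	if (p->locked)
		return -EBUSY;	/* the patch uses BUG_ON(PageLocked(page)) */
	p->locked = 1;
	error = demo_insert(p);
	if (error)
		p->locked = 0;	/* undo on failure, as the patch does */
	return error;
}
```

Every caller of the wrapper then gets the locking right without knowing about it, which matches the stated aim of making the fix correct for whichever tree it is in.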
Re: 2.6.23-rc2-mm1: hang, prop_norm_single involved
On Thu, 2007-08-09 at 14:45 +0200, Peter Zijlstra wrote:
> On Thu, 2007-08-09 at 15:10 +0400, Alexey Dobriyan wrote:
> > LTP run reproducibly hangs during rwtest01 test
> > rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> > Calltrace is always the same:
>
> [EMAIL PROTECTED] ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> rwtest011 PASS : Test passed
> [EMAIL PROTECTED] ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
>
> I can reproduce, but not always.
>
> Also, since the task->dirties member is initialized in fork.c this
> should either _always_ happen or never. So this does point to some
> memory corruption, ->dirties is the very last member of the task struct.
>
> /me goes try with slab_debug,...

to no avail, banging head against the wall what is happening here.

Andrew, could you:

# mm-dirty-balancing-for-tasks.patch

while I try to figure this one out?

That seems to make the unhappies I could reproduce here go away.
Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!
> >...
> > Not sure why it says kernel was tainted though ... hmmm.
> >...
>
> What does your syslog say when it was tainted?

Shit. My fault. I'll try to reproduce it on an untainted kernel.

Thanks,

Mariusz
Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!
On Thu, Aug 09, 2007 at 05:11:52PM +0200, Mariusz Kozlowski wrote:
>...
> Not sure why it says kernel was tainted though ... hmmm.
>...

What does your syslog say when it was tainted?

> Regards,
>
> Mariusz

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!
Hello,

Nothing unusual happening, allmodconfig compiling etc.
Not sure why it says kernel was tainted though ... hmmm.

[ cut here ]
kernel BUG at mm/swap_state.c:78!
invalid opcode: [#1] PREEMPT
Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia 8250_pci 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too
CPU: 0
EIP: 0060:[] Tainted: P VLI
EFLAGS: 00010246 (2.6.23-rc2-mm1 #1)
EIP is at __add_to_swap_cache+0xc6/0xd7
eax: 4000 ebx: c11285c0 ecx: 00d0 edx: 0283
esi: c11285c0 edi: 0283 ebp: c1858f90 esp: c1858f84
ds: 007b es: 007b fs: gs: ss: 0068
Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000)
Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0 c1858fcc c015307c 0001 0007 0002 0002 0283 fffc c0152d5c c1858fe0 c0127f2e c0127ef8
Call Trace:
 [] show_trace_log_lvl+0x1a/0x30
 [] show_stack_log_lvl+0xa9/0xd5
 [] show_registers+0x219/0x38d
 [] die+0x104/0x23e
 [] do_trap+0x83/0xad
 [] do_invalid_op+0x88/0x92
 [] error_code+0x6a/0x70
 [] add_to_swap_cache+0x22/0x58
 [] kprefetchd+0x320/0x364
 [] kthread+0x36/0x58
 [] kernel_thread_helper+0x7/0x14
 ===
INFO: lockdep is turned off.
Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 03 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56
EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84

Regards,

Mariusz
Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ‘cpu_physical_id’
On Thu, Aug 09, 2007 at 10:18:15AM -0400, Miles Lane wrote:
> CC drivers/dma/ioat_dca.o
> drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> 'cpu_physical_id'
> make[2]: *** [drivers/dma/ioat_dca.o] Error 1

-ENODOTCONFIG

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)
On Thu, 9 Aug 2007 14:04:49 +0100 Andy Whitcroft <[EMAIL PROTECTED]> wrote:
> Seeing the following compile error on a G5 mac:
>
> drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
> drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this
> function)
> drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is
> reported only once
> drivers/video/tdfxfb.c:1341: error: for each function it appears in.)
>
> This seems to be the following fragment from tdfxfb-hardware-cursor:
>
> + } else if (!strcmp(this_opt, "hwcursor")) {
> + hwcursor = simple_strtoul(opt + 9, NULL, 0);
>
> I guess the naive fix would be s/opt/this_opt, but I am also
> suspicious of the +9 here as hwcursor is only 8 long? Now this
> seems to take a numeric value and I assume that is via hwcursor=N,
> if so then the +9 would make sense _if_ the strcmp was against
> "hwcursor=".
>

The patch below fixes all the issues you have pointed out. It also fixes
the description of the nomtrr option.

---
From: Krzysztof Helt <[EMAIL PROTECTED]>

This patch fixes the compilation error and the setup option parsing bug,
and corrects the description of the nomtrr option.

Signed-off-by: Krzysztof Helt <[EMAIL PROTECTED]>
---

--- linux-2.6.22.new/drivers/video/tdfxfb.c 2007-08-09 16:11:23.870028259 +0200
+++ linux-2.6.23/drivers/video/tdfxfb.c 2007-08-09 16:15:07.654781024 +0200
@@ -1337,8 +1337,8 @@ static void tdfxfb_setup(char *options)
 			nopan = 1;
 		} else if (!strcmp(this_opt, "nowrap")) {
 			nowrap = 1;
-		} else if (!strcmp(this_opt, "hwcursor")) {
-			hwcursor = simple_strtoul(opt + 9, NULL, 0);
+		} else if (!strncmp(this_opt, "hwcursor=", 9)) {
+			hwcursor = simple_strtoul(this_opt + 9, NULL, 0);
 #ifdef CONFIG_MTRR
 		} else if (!strncmp(this_opt, "nomtrr", 6)) {
 			nomtrr = 1;
@@ -1409,7 +1409,7 @@ MODULE_PARM_DESC(hwcursor, "Enable hardw
 			"(1=enable, 0=disable, default=1)");
 #ifdef CONFIG_MTRR
 module_param(nomtrr, bool, 0);
-MODULE_PARM_DESC(nomtrr, "Disable MTRR support (0 or 1=disabled) (default=0)");
+MODULE_PARM_DESC(nomtrr, "Disable MTRR support (default: enabled)");
 #endif
 
 module_init(tdfxfb_init);
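The option-parsing point discussed above (an exact strcmp() against "hwcursor" can never match a token of the form hwcursor=N, and the +9 offset only makes sense once the '=' is part of the compared prefix) can be tried in userspace. This is only an illustrative sketch: `parse_hwcursor()` is a hypothetical helper that does not exist in tdfxfb.c, and the standard strtoul() stands in for the kernel's simple_strtoul().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of tdfxfb.c: parses one option token.
 * Returns 1 and stores N if the token has the form "hwcursor=N",
 * otherwise returns 0 and leaves *value untouched.
 */
static int parse_hwcursor(const char *this_opt, unsigned long *value)
{
	static const char prefix[] = "hwcursor=";
	const size_t len = sizeof(prefix) - 1;	/* 9 characters, '=' included */

	assert(this_opt != NULL && value != NULL);

	/* Prefix match, as in the strncmp(this_opt, "hwcursor=", 9) fix. */
	if (strncmp(this_opt, prefix, len) != 0)
		return 0;

	/* Skip the 9-character prefix so parsing starts at the digits. */
	*value = strtoul(this_opt + len, NULL, 0);
	return 1;
}
```

With this helper, "hwcursor=1" yields the value 1, while a bare "hwcursor" or an unrelated token such as "nowrap" is rejected, which matches the behaviour the corrected strncmp() branch is aiming for.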