Re: [PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-26 Thread Jarek Poplawski
On Sat, Aug 25, 2007 at 11:43:08AM +0200, Mariusz Kozlowski wrote:
> > > =
> > > [ INFO: inconsistent lock state ]
> > > 2.6.23-rc2-mm1 #7
> > > -
> > > inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> > > ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > >  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
...
> I tested your patch and it still happens. Dmesg info from patched kernel
> attached.
> I couldn't reproduce that on 2.6.23-rc3-mm1 - but on 2.6.23-rc2-mm2 it is
> easily reproducible.
> 
> If you need more info, test some patches, etc. - just mail me.
> 
...
> =
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.23-rc2-mm2 #2
> -
> runscript.sh/5065 just changed the state of lock:
>  (_xmit_ETHER){-+..}, at: [<c03cb659>] dev_watchdog+0x17/0xcc
> but this lock took another, soft-irq-unsafe lock in the past:
>  (&tp->lock){--..}
> 
> and interrupts could create inverse lock ordering between them.

It's OK! These are two different warnings. As a matter of fact, my patch
wasn't supposed to fix either of them, but something similar to the first
one, which was possible but for some reason wasn't reported by lockdep.

The first warning was fixed by Andrew Morton's patch to free_irq(),
so it shouldn't happen in -rc3-mm. The second warning may have been
fixed too, I don't know; since the report is quite long, I would prefer
to look into it only if it still happens in current -mm kernels.

Thanks,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
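
For readers following the two reports: the first one ("inconsistent lock
state") can be pictured with a small userspace model. This is only a sketch
under loud assumptions: toy_lock, toy_lock_acquire() and toy_state_demo() are
made-up names, and real lockdep tracks far more state than these two bits. It
only shows why a lock that is taken in a hard interrupt handler should never
be taken elsewhere with hard irqs enabled.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model, not kernel code: two usage bits per lock, loosely mirroring
 * lockdep's {in-hardirq-W} and {hardirq-on-W} states.
 */
struct toy_lock {
	bool used_in_hardirq;	/* ever taken inside a hard irq handler */
	bool used_hardirqs_on;	/* ever taken with hard irqs enabled */
};

/* Record one acquisition; return true while the usage stays consistent. */
bool toy_lock_acquire(struct toy_lock *l, bool in_hardirq, bool hardirqs_enabled)
{
	if (in_hardirq)
		l->used_in_hardirq = true;
	else if (hardirqs_enabled)
		l->used_hardirqs_on = true;

	/*
	 * Both bits set: an interrupt could arrive while the lock is held
	 * with irqs on, and the handler would then spin on the same lock.
	 */
	return !(l->used_in_hardirq && l->used_hardirqs_on);
}

/* Replays the reported sequence on the toy model. */
void toy_state_demo(void)
{
	struct toy_lock tp_lock = { false, false };

	/* Normal case: the handler takes the lock from a real hard irq. */
	assert(toy_lock_acquire(&tp_lock, true, false));

	/* Reported case: handler called from process context, irqs enabled. */
	assert(!toy_lock_acquire(&tp_lock, false, true));
}
```

The second call models the free_irq() path from the backtrace, where
rtl8139_interrupt() was run from process context with irqs enabled.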


Re: [PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-25 Thread Mariusz Kozlowski
> > =
> > [ INFO: inconsistent lock state ]
> > 2.6.23-rc2-mm1 #7
> > -
> > inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> > ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
> >  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> > {in-hardirq-W} state was registered at:
> >   [<c0138eeb>] __lock_acquire+0x949/0x11ac
> >   [<c01397e7>] lock_acquire+0x99/0xb2
> >   [<c0452ff3>] _spin_lock+0x35/0x42
> >   [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> >   [<c0147a5d>] handle_IRQ_event+0x28/0x59
> >   [<c01493ca>] handle_level_irq+0xad/0x10b
> >   [<c0105a13>] do_IRQ+0x93/0xd0
> >   [<c010441e>] common_interrupt+0x2e/0x34
> ...
> > other info that might help us debug this:
> > 1 lock held by ifconfig/5492:
> >  #0:  (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
> > 
> > stack backtrace:
> ...
> >  [<c0452ff3>] _spin_lock+0x35/0x42
> >  [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> >  [<c01480fd>] free_irq+0x11b/0x146
> >  [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
> >  [<c03bde63>] dev_close+0x57/0x74
> ...
> 
> It looks like this was possible after David's fix, which actually
> enabled running the handler in free_irq(), but before Andrew's patch,
> which disables local irqs while the handler runs.
> 
> So this bug should be fixed, but IMHO a similar problem is possible in
> request_irq(). And I think this is not only about lockdep complaining
> but a real lockup possibility, because any locks in such a handler are
> taken in a context they were not designed for, and could be
> vulnerable (especially to softirqs, but probably to hardirqs as well).
> 
> Reported-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
> Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
> 
> ---
> 
> diff -Nurp 2.6.23-rc3-mm1-/kernel/irq/manage.c 2.6.23-rc3-mm1/kernel/irq/manage.c
> --- 2.6.23-rc3-mm1-/kernel/irq/manage.c	2007-08-22 13:58:58.0 +0200
> +++ 2.6.23-rc3-mm1/kernel/irq/manage.c	2007-08-22 14:12:21.0 +0200
> @@ -546,14 +546,11 @@ int request_irq(unsigned int irq, irq_ha
>  		 * We do this before actually registering it, to make sure that
>  		 * a 'real' IRQ doesn't run in parallel with our fake
>  		 */
> -		if (irqflags & IRQF_DISABLED) {
> -			unsigned long flags;
> +		unsigned long flags;
>  
> -			local_irq_save(flags);
> -			handler(irq, dev_id);
> -			local_irq_restore(flags);
> -		} else
> -			handler(irq, dev_id);
> +		local_irq_save(flags);
> +		handler(irq, dev_id);
> +		local_irq_restore(flags);
>  	}
>  #endif

I tested your patch and it still happens. Dmesg info from the patched kernel
is attached.
I couldn't reproduce it on 2.6.23-rc3-mm1, but on 2.6.23-rc2-mm2 it is easily
reproducible.

If you need more info, want me to test some patches, etc. - just mail me.

Regards,

Mariusz


=
[ INFO: possible irq lock inversion dependency detected ]
2.6.23-rc2-mm2 #2
-
runscript.sh/5065 just changed the state of lock:
 (_xmit_ETHER){-+..}, at: [<c03cb659>] dev_watchdog+0x17/0xcc
but this lock took another, soft-irq-unsafe lock in the past:
 (&tp->lock){--..}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
1 lock held by runscript.sh/5065:
 #0:  (&mm->mmap_sem){}, at: [<c0454569>] do_page_fault+0x159/0x6f0

the first lock's dependencies:
-> (_xmit_ETHER){-+..} ops: 21 {
   initial-use  at:
     [<c0138ea9>] __lock_acquire+0x217/0x11ac
     [<c0139ed7>] lock_acquire+0x99/0xb2
     [<c045281a>] _spin_lock_bh+0x3a/0x47
     [<c03bc096>] dev_set_rx_mode+0x14/0x3b
     [<c03bc59f>] dev_change_flags+0x68/0x190
     [<c03fcb4c>] devinet_ioctl+0x4af/0x652
     [<c03fd432>] inet_ioctl+0x56/0x71
     [<c03b151a>] sock_ioctl+0xa5/0x1d4
     [<c0178a42>] do_ioctl+0x22/0x71
     [<c0178ae6>] vfs_ioctl+0x55/0x29e
     [<c0178d62>] sys_ioctl+0x33/0x69
     [<c01041da>] sysenter_past_esp+0x5f/0x99
     [] 0x
   in-softirq-W at:
     [<c0139384>] __lock_acquire+0x6f2/0x11ac
     [<c0139ed7>] lock_acquire+0x99/0xb2
     [<c04527d3>] _spin_lock+0x35/0x42
     [<c03cb659>] dev_watchdog+0x17/0xcc
     [<c01224b7>] run_timer_softirq+0x14b/0x1a9
     [<c011ecc2>] __do_softirq+0x5b/0xb2
     [<c011ed66>] do_softirq+0x4d/0x4f
     [<c011f04b>] irq_exit+0x48/0x4a
     [<c01058f8>] do_IRQ+0x98/0xd0
     [<c010444e>] common_interrupt+0x2e/0x34

Re: 2.6.23-rc2-mm1: irq lock inversion dependency detected

2007-08-24 Thread Jarek Poplawski
On Fri, Aug 24, 2007 at 10:27:25AM +0200, Jarek Poplawski wrote:
> On 10-08-2007 09:06, Mariusz Kozlowski wrote:
...
> > =
> > [ INFO: possible irq lock inversion dependency detected ]
> > 2.6.23-rc2-mm1 #7
> > -
> > runscript.sh/5843 just changed the state of lock:
> >  (_xmit_ETHER){-+..}, at: [<c03cbe79>] dev_watchdog+0x17/0xcc
> > but this lock took another, soft-irq-unsafe lock in the past:
> >  (&tp->lock){--..}
> > 
> > and interrupts could create inverse lock ordering between them.
> ...
> > Really no idea who to CC here ;)
> 
> IMHO, this should be fixed by the latest changes to free_irq() and
> request_irq(). (It seems to be possible only with CONFIG_DEBUG_SHIRQ?)
> Otherwise I can be CC-ed - my pleasure!

Oops! But since this one is about lock inversion, not lock state, there
should be no connection... Anyway, if it still shows up in current
kernels (and only with CONFIG_DEBUG_SHIRQ), I'm interested.

Jarek P.
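
The "inversion" report is about the ordering between two locks, not about
the state of a single one. A rough userspace sketch of the rule follows, with
made-up names (toy_class, toy_inversion_possible(), toy_inversion_demo()) and
none of lockdep's real dependency-graph machinery; it is an illustration
under those assumptions, not how lockdep is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: softirq usage bits for one lock class. */
struct toy_class {
	const char *name;
	bool used_in_softirq;	/* "softirq-safe": taken from softirq context */
	bool used_softirqs_on;	/* "softirq-unsafe": taken with softirqs enabled */
};

/*
 * True if taking 'inner' while holding 'outer' is the risky pattern:
 * the outer lock is used from softirq context, while the inner lock is
 * sometimes held with softirqs enabled.
 */
bool toy_inversion_possible(const struct toy_class *outer,
			    const struct toy_class *inner)
{
	return outer->used_in_softirq && inner->used_softirqs_on;
}

/* Replays the reported lock pair on the toy model. */
void toy_inversion_demo(void)
{
	/* _xmit_ETHER was just seen in softirq context (dev_watchdog). */
	struct toy_class xmit = { "_xmit_ETHER", true, false };
	/* tp->lock was taken earlier with softirqs enabled. */
	struct toy_class tp = { "tp->lock", false, true };

	assert(toy_inversion_possible(&xmit, &tp));

	/* If tp->lock were only ever taken with softirqs off, no report. */
	tp.used_softirqs_on = false;
	assert(!toy_inversion_possible(&xmit, &tp));
}
```

The danger in the first case: one CPU holds tp->lock with softirqs on and is
interrupted by a softirq that wants _xmit_ETHER, while another CPU holds
_xmit_ETHER and spins on tp->lock.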


Re: 2.6.23-rc2-mm1: irq lock inversion dependency detected

2007-08-24 Thread Jarek Poplawski
On 10-08-2007 09:06, Mariusz Kozlowski wrote:
> Hello,
> 
>   And the winner of today is ...
> 
> 
> 
> =
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.23-rc2-mm1 #7
> -
> runscript.sh/5843 just changed the state of lock:
>  (_xmit_ETHER){-+..}, at: [<c03cbe79>] dev_watchdog+0x17/0xcc
> but this lock took another, soft-irq-unsafe lock in the past:
>  (&tp->lock){--..}
> 
> and interrupts could create inverse lock ordering between them.
...
> Really no idea who to CC here ;)

IMHO, this should be fixed by the latest changes to free_irq() and
request_irq(). (It seems to be possible only with CONFIG_DEBUG_SHIRQ?)
Otherwise I can be CC-ed - my pleasure!

Cheers,
Jarek P.


Re: [PATCH (take 2)] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-23 Thread Jarek Poplawski
On Thu, Aug 23, 2007 at 10:44:30AM +0200, Jarek Poplawski wrote:
> Andrew Morton pointed out that my changelog was unusable. Sorry!
> Here is a second try with the changelog and kernel version changed.
...
> >(take 2)
> 
> Subject: request_irq() - fix DEBUG_SHIRQ handling
...
> Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>
> 
> ---
> 
> diff -Nurp 2.6.23-rc3-git6-/kernel/irq/manage.c 2.6.23-rc3-git6/kernel/irq/manage.c
> --- 2.6.23-rc3-git6-/kernel/irq/manage.c	2007-08-23 10:11:35.0 +0200
> +++ 2.6.23-rc3-git6/kernel/irq/manage.c	2007-08-23 10:16:29.0 +0200

So, this time I f-ed the diff part: it's not exactly against 2.6.23-rc3-git6.
But Andrew is to blame: he should've known that some old & slow chips
can't do science and poetry at the same time. Sorry (for him)!

Anyway, apart from an offset, it should be OK...

Regards,
Jarek P.


[PATCH (take 2)] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-23 Thread Jarek Poplawski
Andrew Morton pointed out that my changelog was unusable. Sorry!
Here is a second try with the changelog and kernel version changed.

Regards,
Jarek P.

>(take 2)

Subject: request_irq() - fix DEBUG_SHIRQ handling

Mariusz Kozlowski reported lockdep's warning:

> =
> [ INFO: inconsistent lock state ]
> 2.6.23-rc2-mm1 #7
> -
> inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
>  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> {in-hardirq-W} state was registered at:
>   [<c0138eeb>] __lock_acquire+0x949/0x11ac
>   [<c01397e7>] lock_acquire+0x99/0xb2
>   [<c0452ff3>] _spin_lock+0x35/0x42
>   [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>   [<c0147a5d>] handle_IRQ_event+0x28/0x59
>   [<c01493ca>] handle_level_irq+0xad/0x10b
>   [<c0105a13>] do_IRQ+0x93/0xd0
>   [<c010441e>] common_interrupt+0x2e/0x34
...
> other info that might help us debug this:
> 1 lock held by ifconfig/5492:
>  #0:  (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
> 
> stack backtrace:
...
>  [<c0452ff3>] _spin_lock+0x35/0x42
>  [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>  [<c01480fd>] free_irq+0x11b/0x146
>  [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
>  [<c03bde63>] dev_close+0x57/0x74
...

This shows that a driver's irq handler was running both in hard interrupt
context and in process context with irqs enabled. The latter happened
during a free_irq() call and was possible only with CONFIG_DEBUG_SHIRQ
enabled. This was fixed by another patch.

But a similar problem is possible with request_irq(): any locks taken from
the irq handler could be vulnerable, especially to soft interrupts. This
patch fixes it by disabling local interrupts while the handler runs. (It
seems that disabling softirqs should be enough, but that needs more
checking for possible races or other special cases.)

This patch is recommended for all stable versions since 2.6.21, too.

Reported-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.23-rc3-git6-/kernel/irq/manage.c 2.6.23-rc3-git6/kernel/irq/manage.c
--- 2.6.23-rc3-git6-/kernel/irq/manage.c	2007-08-23 10:11:35.0 +0200
+++ 2.6.23-rc3-git6/kernel/irq/manage.c	2007-08-23 10:16:29.0 +0200
@@ -555,14 +555,11 @@ int request_irq(unsigned int irq, irq_ha
 		 * We do this before actually registering it, to make sure that
 		 * a 'real' IRQ doesn't run in parallel with our fake
 		 */
-		if (irqflags & IRQF_DISABLED) {
-			unsigned long flags;
+		unsigned long flags;
 
-			local_irq_save(flags);
-			handler(irq, dev_id);
-			local_irq_restore(flags);
-		} else
-			handler(irq, dev_id);
+		local_irq_save(flags);
+		handler(irq, dev_id);
+		local_irq_restore(flags);
 	}
 #endif
 
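
The change in the diff above can be tried out in a userspace model.
Everything named toy_* here is a hypothetical stand-in (there is no real
interrupt masking in userspace); the two functions only mirror the control
flow of the CONFIG_DEBUG_SHIRQ block in request_irq() before and after the
patch, as far as the diff shows it.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated per-CPU interrupt-enable flag (assumption: single "CPU"). */
bool toy_irqs_enabled = true;

/* Stand-in for local_irq_save(): remember the old state, then disable. */
static unsigned long toy_local_irq_save(void)
{
	unsigned long flags = toy_irqs_enabled;

	toy_irqs_enabled = false;
	return flags;
}

/* Stand-in for local_irq_restore(): put the saved state back. */
static void toy_local_irq_restore(unsigned long flags)
{
	toy_irqs_enabled = (flags != 0);
}

typedef int (*toy_handler_t)(int irq, void *dev_id);

/* Before the patch: irqs are disabled only if IRQF_DISABLED was passed. */
int toy_fake_irq_old(toy_handler_t handler, int irq, void *dev_id,
		     bool irqf_disabled)
{
	int ret;

	if (irqf_disabled) {
		unsigned long flags = toy_local_irq_save();

		ret = handler(irq, dev_id);
		toy_local_irq_restore(flags);
	} else
		ret = handler(irq, dev_id);
	return ret;
}

/* After the patch: the fake call always runs with irqs disabled. */
int toy_fake_irq_new(toy_handler_t handler, int irq, void *dev_id)
{
	unsigned long flags = toy_local_irq_save();
	int ret = handler(irq, dev_id);

	toy_local_irq_restore(flags);
	return ret;
}

/* Example handler: records whether it ran with simulated irqs disabled. */
int toy_handler(int irq, void *dev_id)
{
	(void)irq;
	*(bool *)dev_id = !toy_irqs_enabled;
	return 1;
}

/* Compares both paths for a handler registered without IRQF_DISABLED. */
void toy_request_irq_demo(void)
{
	bool ran_with_irqs_off = false;

	toy_fake_irq_old(toy_handler, 11, &ran_with_irqs_off, false);
	assert(!ran_with_irqs_off);	/* old path: irqs stayed enabled */

	toy_fake_irq_new(toy_handler, 11, &ran_with_irqs_off);
	assert(ran_with_irqs_off);	/* patched path: irqs were disabled */
	assert(toy_irqs_enabled);	/* previous state restored afterwards */
}
```

In the old path a handler registered without IRQF_DISABLED sees irqs
enabled, which is the context its locks are not prepared for; in the patched
path it always sees them disabled, as it would in a real hard interrupt.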


[PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-22 Thread Jarek Poplawski
On 10-08-2007 01:49, Mariusz Kozlowski wrote:
> Hello,
> 
> =
> [ INFO: inconsistent lock state ]
> 2.6.23-rc2-mm1 #7
> -
> inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
> ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
>  (&tp->lock){+...}, at: [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
> {in-hardirq-W} state was registered at:
>   [<c0138eeb>] __lock_acquire+0x949/0x11ac
>   [<c01397e7>] lock_acquire+0x99/0xb2
>   [<c0452ff3>] _spin_lock+0x35/0x42
>   [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>   [<c0147a5d>] handle_IRQ_event+0x28/0x59
>   [<c01493ca>] handle_level_irq+0xad/0x10b
>   [<c0105a13>] do_IRQ+0x93/0xd0
>   [<c010441e>] common_interrupt+0x2e/0x34
...
> other info that might help us debug this:
> 1 lock held by ifconfig/5492:
>  #0:  (rtnl_mutex){--..}, at: [<c0451778>] mutex_lock+0x1c/0x1f
> 
> stack backtrace:
...
>  [<c0452ff3>] _spin_lock+0x35/0x42
>  [<de8706e0>] rtl8139_interrupt+0x27/0x46b [8139too]
>  [<c01480fd>] free_irq+0x11b/0x146
>  [<de871d59>] rtl8139_close+0x8a/0x14a [8139too]
>  [<c03bde63>] dev_close+0x57/0x74
...

It looks like this was possible after David's fix, which actually
enabled running the handler in free_irq(), but before Andrew's patch,
which disables local irqs while the handler runs.

So this bug should be fixed, but IMHO a similar problem is possible in
request_irq(). And I think this is not only about lockdep complaining
but a real lockup possibility, because any locks in such a handler are
taken in a context they were not designed for, and could be
vulnerable (especially to softirqs, but probably to hardirqs as well).

Reported-by: Mariusz Kozlowski <[EMAIL PROTECTED]>
Signed-off-by: Jarek Poplawski <[EMAIL PROTECTED]>

---

diff -Nurp 2.6.23-rc3-mm1-/kernel/irq/manage.c 2.6.23-rc3-mm1/kernel/irq/manage.c
--- 2.6.23-rc3-mm1-/kernel/irq/manage.c	2007-08-22 13:58:58.0 +0200
+++ 2.6.23-rc3-mm1/kernel/irq/manage.c	2007-08-22 14:12:21.0 +0200
@@ -546,14 +546,11 @@ int request_irq(unsigned int irq, irq_ha
 		 * We do this before actually registering it, to make sure that
 		 * a 'real' IRQ doesn't run in parallel with our fake
 		 */
-		if (irqflags & IRQF_DISABLED) {
-			unsigned long flags;
+		unsigned long flags;
 
-			local_irq_save(flags);
-			handler(irq, dev_id);
-			local_irq_restore(flags);
-		} else
-			handler(irq, dev_id);
+		local_irq_save(flags);
+		handler(irq, dev_id);
+		local_irq_restore(flags);
 	}
 #endif
 


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-11 Thread Paul E. McKenney
On Sat, Aug 11, 2007 at 08:09:09PM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney <[EMAIL PROTECTED]> wrote:
> 
> > Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> > Compiles, but not yet tested.
> > 
> > Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
> 
> > --- linux-2.6.23-rc2/kernel/sched.c 2007-08-03 19:49:55.0 -0700
> > +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
> > @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
> > return now;
> >  }
> >  
> > +EXPORT_SYMBOL_GPL(cpu_clock);
> 
> sure enough,
> 
> Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Thank you!

Just for the record, given that the xtime API that it replaces
was EXPORT_SYMBOL(), I would have no objection to this also being
EXPORT_SYMBOL().  That said, I know of no specific reason for it to be
anything other than EXPORT_SYMBOL_GPL().

Thanx, Paul


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-11 Thread Ingo Molnar

* Paul E. McKenney <[EMAIL PROTECTED]> wrote:

> Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> Compiles, but not yet tested.
> 
> Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>

> --- linux-2.6.23-rc2/kernel/sched.c   2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
> @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
>   return now;
>  }
>  
> +EXPORT_SYMBOL_GPL(cpu_clock);

sure enough,

Acked-by: Ingo Molnar <[EMAIL PROTECTED]>

Ingo


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-11 Thread Alexey Dobriyan
On Fri, Aug 10, 2007 at 12:55:17AM -0700, Andrew Morton wrote:
> On Fri, 10 Aug 2007 09:40:00 +0200 Ingo Molnar <[EMAIL PROTECTED]> wrote:
> 
> > 
> > * Andrew Morton <[EMAIL PROTECTED]> wrote:
> > 
> > > We seem to have made a mess in there.  timer_list_show() ends up 
> > > calling lookup_module_symbol_name(), which takes a mutex.  However 
> > > print_symbol() (which is called at oops time, interrupt time, etc) 
> > > calls module_address_lookup(), which is basically the same, only it 
> > > doesn't take the mutex.
> > 
> > hm, current upstream does:
> > 
> >  static void print_name_offset(struct seq_file *m, void *sym)
> >  {
> >  char symname[KSYM_NAME_LEN];
> > 
> >  if (lookup_symbol_name((unsigned long)sym, symname) < 0)
> > 
> > why was that changed?
> 
> It wasn't.

Oh no, it was! commit 9d65cb4a1718a072898c7a57a3bc61b2dc4bcd4d

Fix race between cat /proc/*/wchan and rmmod et al

kallsyms_lookup() can go iterating over modules list unprotected which is OK
for emergency situations (oops), but not OK for regular stuff like
/proc/*/wchan.

> lookup_symbol_name() calls lookup_module_symbol_name() which
> calls mutex_lock().
> 
> > I think symbol lookups for debug purposes have to 
> > be lockless, fundamentally.
> > 
> 
> Sure, especially a sysrq thingy.

I imagine a user running powertop, which IIRC trolls /proc/timer_list, and
doing rmmod following powertop's instructions.

> It's a bit nasty to just go in there and start walking data structures
> without holding the needed lock though.

Yep.
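
A rough user-space sketch of the split discussed above: one lookup path that
may sleep and therefore takes a mutex, and one emergency path that walks the
table unprotected. Every name here (the sketch_* functions, the table, the
pthread mutex standing in for the lock that lookup_module_symbol_name() takes)
is an assumption made for illustration; this is not the kernel's
implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical symbol table, sorted by address; stands in for the module list. */
struct sketch_sym {
	unsigned long addr;
	const char *name;
};

static const struct sketch_sym sketch_syms[] = {
	{ 0x1000, "alpha" },
	{ 0x2000, "beta" },
	{ 0x3000, "gamma" },
};

/* Stand-in for the mutex protecting the module list (an assumption). */
static pthread_mutex_t sketch_module_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Unlocked walk: return the last symbol whose address is <= addr, or NULL. */
static const char *sketch_walk(unsigned long addr)
{
	const char *best = NULL;
	size_t i;

	for (i = 0; i < sizeof(sketch_syms) / sizeof(sketch_syms[0]); i++) {
		if (sketch_syms[i].addr > addr)
			break;
		best = sketch_syms[i].name;
	}
	return best;
}

/* Emergency path (oops, sysrq): must not sleep, so it walks unprotected. */
const char *sketch_lookup_emergency(unsigned long addr)
{
	return sketch_walk(addr);
}

/* Regular path (e.g. a /proc reader): may sleep, so it takes the mutex. */
const char *sketch_lookup_locked(unsigned long addr)
{
	const char *name;
	int ret;

	ret = pthread_mutex_lock(&sketch_module_mutex);
	assert(ret == 0);
	name = sketch_walk(addr);
	pthread_mutex_unlock(&sketch_module_mutex);
	return name;
}
```

The trade-off is the one argued over in this thread: the emergency variant can
be called from any context but may race with a concurrent table update, while
the locked variant is safe against updates but must never be called where
sleeping is forbidden.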



Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 05:29:49PM -0700, Paul E. McKenney wrote:
> 
> Errmmm...  No joy.
> 
>   ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!
> 
> Turns out that cpu_clock also ain't exported, and rcutorture.c is
> a module.  Would adding an EXPORT_SYMBOL_GPL() as in the patch below
> be acceptable?

Except that the old xtime symbol was EXPORT_SYMBOL() rather than my
proposed EXPORT_SYMBOL_GPL() for the equivalent new cpu_clock().

Sigh!!!  I will leave this one for others to sort out.

Andrew, please consider this patch withdrawn and apply the version that
does not rely on time for entropy.  Please let me know if you would like
me to resend it.

Thanx, Paul

> If not, I have a tested patch to rcutorture.c that leverages statistical
> counters.  Your choice.
> 
>   Thanx, Paul
> 
> Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
> Compiles, but not yet tested.
> 
> Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
> ---
> 
>  rcutorture.c |8 ++--
>  sched.c  |2 ++
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c 
> linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
> --- linux-2.6.23-rc2/kernel/rcutorture.c  2007-08-03 19:49:55.0 
> -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c  2007-08-10 
> 17:15:22.0 -0700
> @@ -42,7 +42,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -166,16 +165,13 @@ struct rcu_random_state {
>  
>  /*
>   * Crude but fast random-number generator.  Uses a linear congruential
> - * generator, with occasional help from get_random_bytes().
> + * generator, with occasional help from cpu_clock().
>   */
>  static unsigned long
>  rcu_random(struct rcu_random_state *rrsp)
>  {
> - long refresh;
> -
>   if (--rrsp->rrs_count < 0) {
> - get_random_bytes(&refresh, sizeof(refresh));
> - rrsp->rrs_state += refresh;
> + rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
>   rrsp->rrs_count = RCU_RANDOM_REFRESH;
>   }
>   rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
> diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c 
> linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
> --- linux-2.6.23-rc2/kernel/sched.c   2007-08-03 19:49:55.0 -0700
> +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c   2007-08-10 
> 17:22:57.0 -0700
> @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
>   return now;
>  }
>  
> +EXPORT_SYMBOL_GPL(cpu_clock);
> +
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  /* Change a task's ->cfs_rq if it moves across CPUs */
>  static inline void set_task_cfs_rq(struct task_struct *p)


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 01:30:55PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
> > On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> 
> > wrote:
> > 
> > > > One used to use sched_clock() for this, then get frowned at.  Now we
> > > > have cpu_clock()...
> > > 
> > > Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
> > > release.  Which means that the rate of API change in this area is a
> > > bit high, so I should avoid it like the plague.
> > 
> > eh, it's been there for weeks.  It is dust-encrusted.
> > 
> > >  Therefore, I should
> > > look for some other convenient source of entropy.
> > > 
> > > > One convenient source would be the per-CPU statistics that rcutorture
> > > maintains.  Of course, a given CPU's RNG is nearly in lock-step with
> > > its own statistics, but not with the adjacent CPU's statistics...
> > > 
> > > I will send a patch.
> > 
> > Please use cpu_clock().  It ain't going away.
> 
> D'accord...

Errmmm...  No joy.

ERROR: "cpu_clock" [kernel/rcutorture.ko] undefined!

Turns out that cpu_clock also ain't exported, and rcutorture.c is
a module.  Would adding an EXPORT_SYMBOL_GPL() as in the patch below
be acceptable?

If not, I have a tested patch to rcutorture.c that leverages statistical
counters.  Your choice.

Thanx, Paul

Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
Compiles, but not yet tested.

Signed-off-by: Paul E. McKenney <[EMAIL PROTECTED]>
---

 rcutorture.c |8 ++--
 sched.c  |2 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.23-rc2/kernel/rcutorture.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c	2007-08-10 17:15:22.0 -0700
@@ -42,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -166,16 +165,13 @@ struct rcu_random_state {
 
 /*
  * Crude but fast random-number generator.  Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from cpu_clock().
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-	long refresh;
-
 	if (--rrsp->rrs_count < 0) {
-		get_random_bytes(&refresh, sizeof(refresh));
-		rrsp->rrs_state += refresh;
+		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
 		rrsp->rrs_count = RCU_RANDOM_REFRESH;
 	}
 	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
--- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
@@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
 	return now;
 }
 
+EXPORT_SYMBOL_GPL(cpu_clock);
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /* Change a task's ->cfs_rq if it moves across CPUs */
 static inline void set_task_cfs_rq(struct task_struct *p)
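
For readers who want to try the idea outside the kernel, here is a minimal
user-space sketch of the technique the patch touches: a linear congruential
generator whose state is perturbed every so often by a cheap, changing value.
All sketch_* names are hypothetical, the three constants are illustrative
assumptions rather than values taken from this patch, and clock() merely
stands in for cpu_clock(). As noted earlier in the thread, this is not meant
for cryptographic use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Illustrative constants; assumptions, not taken from the patch above. */
#define SKETCH_MULT    39916801ULL
#define SKETCH_ADD     479001599ULL
#define SKETCH_REFRESH 10000

struct sketch_rng {
	uint64_t state;
	long count;	/* calls remaining until the next perturbation */
};

/* Stand-in for cpu_clock(): any cheap, changing value will do here. */
uint64_t sketch_clock_perturb(void)
{
	return (uint64_t)clock();
}

/* Deterministic perturbation source, handy for reproducible runs. */
uint64_t sketch_no_perturb(void)
{
	return 0;
}

/*
 * Advance the generator.  Once the countdown drops below zero, fold in a
 * perturbation and restart the countdown; then take one LCG step.
 */
uint64_t sketch_random(struct sketch_rng *r, uint64_t (*perturb)(void))
{
	assert(r != NULL && perturb != NULL);

	if (--r->count < 0) {
		r->state += perturb();
		r->count = SKETCH_REFRESH;
	}
	r->state = r->state * SKETCH_MULT + SKETCH_ADD;
	return r->state;
}
```

Passing sketch_clock_perturb gives behaviour in the spirit of the patch, while
sketch_no_perturb reduces the generator to a plain LCG, which makes the
refresh logic easy to check by hand.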


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
> On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> 
> wrote:
> 
> > > One used to use sched_clock() for this, then get frowned at.  Now we
> > > have cpu_clock()...
> > 
> > Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
> > release.  Which means that the rate of API change in this area is a
> > bit high, so I should avoid it like the plague.
> 
> eh, it's been there for weeks.  It is dust-encrusted.
> 
> >  Therefore, I should
> > look for some other convenient source of entropy.
> > 
> > > One convenient source would be the per-CPU statistics that rcutorture
> > maintains.  Of course, a given CPU's RNG is nearly in lock-step with
> > its own statistics, but not with the adjacent CPU's statistics...
> > 
> > I will send a patch.
> 
> Please use cpu_clock().  It ain't going away.

D'accord...

Thanx, Paul


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 08:12:08 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote:

> > One used to use sched_clock() for this, then get frowned at.  Now we
> > have cpu_clock()...
> 
> Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
> release.  Which means that the rate of API change in this area is a
> bit high, so I should avoid it like the plague.

eh, it's been there for weeks.  It is dust-encrusted.

>  Therefore, I should
> look for some other convenient source of entropy.
> 
> > One convenient source would be the per-CPU statistics that rcutorture
> maintains.  Of course, a given CPU's RNG is nearly in lock-step with
> its own statistics, but not with the adjacent CPU's statistics...
> 
> I will send a patch.

Please use cpu_clock().  It ain't going away.


Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 15:27:42 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Thursday 09 August 2007 20:52:58 Andrew Morton wrote:
> > On Thu, 9 Aug 2007 10:18:15 -0400
> > "Miles Lane" <[EMAIL PROTECTED]> wrote:
> > 
> > >   CC  drivers/dma/ioat_dca.o
> > > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > > 'cpu_physical_id'
> > 
> > Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
> > 
> > Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> > supported architectures (i386, x86_64, ia64) should provide a non-SMP
> > version of cpu_physical_id().  Preferably the latter, I'd say.
> 
> 
> It doesn't make much sense in smp.h because there is not really
> a concept of physical id on most architectures i expect. Better 
> to put it into the individual asm files.
> 

I gave up and did this:

From: Andrew Morton <[EMAIL PROTECTED]>

drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
drivers/dma/ioat_dca.c:177: error: implicit declaration of function 
'cpu_physical_id'

This is so screwed up.  Root cause: linux/smp.h only includes asm/smp.h if
CONFIG_SMP=y.

To get at cpu_physical_id() on UP, the user must include asm/smp.h, not
linux/smp.h.

Cc: "Luck, Tony" <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Shannon Nelson <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 drivers/dma/ioat_dca.c |3 +++
 1 file changed, 3 insertions(+)

diff -puN drivers/dma/ioat_dca.c~git-dma-up-fix drivers/dma/ioat_dca.c
--- a/drivers/dma/ioat_dca.c~git-dma-up-fix
+++ a/drivers/dma/ioat_dca.c
@@ -25,6 +25,9 @@
 #include 
 #include 
 #include 
+
+#include <asm/smp.h>
+
 #include "ioatdma.h"
 #include "ioatdma_registers.h"
 
_



Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Thu, Aug 09, 2007 at 07:06:23PM -0700, Andrew Morton wrote:
> On Thu, 9 Aug 2007 19:00:40 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> 
> wrote:
> 
> > On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > > >...
> > > > Changes since 2.6.23-rc2-mm1:
> > > >...
> > > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > > >...
> > > >  2.6.23 queue
> > > >...
> > > 
> > > All drivers were converted to no longer use xtime directly since it 
> > > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > > as RNG...
> > 
> > This code doesn't care if the time is outdated, as it is simply
> > periodically perturbing an RNG, but OK.
> > 
> > So, what interface are we supposed to be using instead?  I cannot use
> > get_random_bytes() due to locking issues.  This is not a cryptographically
> > secure usage, so the perturbation does not need to be extremely high
> > quality.
> > 
> > On x86, I would just grab the low-order bits of the TSC, but all of the
> > world is not an x86.  ;-)
> 
> One used to use sched_clock() for this, then get frowned at.  Now we
> have cpu_clock()...

Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
release.  Which means that the rate of API change in this area is a
bit high, so I should avoid it like the plague.  Therefore, I should
look for some other convenient source of entropy.

One convenient source would be the per-CPU statistics that rcutorture
maintains.  Of course, a given CPU's RNG is nearly in lock-step with
its own statistics, but not with the adjacent CPU's statistics...

I will send a patch.

Thanx, Paul


Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

2007-08-10 Thread Andi Kleen
On Thursday 09 August 2007 20:52:58 Andrew Morton wrote:
> On Thu, 9 Aug 2007 10:18:15 -0400
> "Miles Lane" <[EMAIL PROTECTED]> wrote:
> 
> >   CC  drivers/dma/ioat_dca.o
> > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > 'cpu_physical_id'
> 
> Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
> 
> Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> supported architectures (i386, x86_64, ia64) should provide a non-SMP
> version of cpu_physical_id().  Preferably the latter, I'd say.


It doesn't make much sense in smp.h because there is not really
a concept of a physical id on most architectures, I expect. Better
to put it into the individual asm files.

-Andi



Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

2007-08-10 Thread Miles Lane
On 8/9/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Thu, 9 Aug 2007 10:18:15 -0400
> "Miles Lane" <[EMAIL PROTECTED]> wrote:
>
> >   CC  drivers/dma/ioat_dca.o
> > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > 'cpu_physical_id'
>
> Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
>
> Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> supported architectures (i386, x86_64, ia64) should provide a non-SMP
> version of cpu_physical_id().  Preferably the latter, I'd say.
>
> Something like this, I suppose...
>
>
> From: Andrew Morton <[EMAIL PROTECTED]>
>
> i386, x86_64 and ia64 implement cpu_physical_id() if CONFIG_SMP=y.
>
> Provide a uniprocessor stub so that callers will dtrt.
>
> Cc: Andi Kleen <[EMAIL PROTECTED]>
> Cc: "Luck, Tony" <[EMAIL PROTECTED]>
> Cc: Shannon Nelson <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> ---
>
>  include/linux/smp.h |5 +
>  1 files changed, 5 insertions(+)
>
> diff -puN include/linux/smp.h~implement-cpu_physical_id-on-smp=n 
> include/linux/smp.h
> --- a/include/linux/smp.h~implement-cpu_physical_id-on-smp=n
> +++ a/include/linux/smp.h
> @@ -108,6 +108,11 @@ static inline void smp_send_reschedule(i
> 0;  \
>  })
>
> +static inline unsigned cpu_physical_id(unsigned cpu)
> +{
> +   return 0;
> +}
> +
>  #endif /* !SMP */
>
>  /*
> _
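
The stub approach quoted above can be sketched in plain C as follows. This is
only an illustration under stated assumptions: SKETCH_CONFIG_SMP is a
hypothetical stand-in for CONFIG_SMP, the logical-to-physical map is made up,
and sketch_get_tag() is an invented caller, not the ioat driver's function.

```c
#include <assert.h>

/*
 * SKETCH_CONFIG_SMP is a hypothetical stand-in for CONFIG_SMP; leave it
 * undefined to get the uniprocessor build.
 */
#ifdef SKETCH_CONFIG_SMP
#define SKETCH_NR_CPUS 4
/* Made-up logical-to-physical map, for illustration only. */
static const unsigned sketch_phys_map[SKETCH_NR_CPUS] = { 0, 2, 4, 6 };

static inline unsigned sketch_cpu_physical_id(unsigned cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	return sketch_phys_map[cpu];
}
#else
/* Uniprocessor stub: there is only one CPU, so its physical id is 0. */
static inline unsigned sketch_cpu_physical_id(unsigned cpu)
{
	(void)cpu;
	return 0;
}
#endif

/* A caller that compiles unchanged in both configurations. */
unsigned sketch_get_tag(unsigned cpu)
{
	return sketch_cpu_physical_id(cpu) & 0xffu;
}
```

The point of the stub is that callers need no #ifdef of their own: the same
call site builds whether or not the SMP switch is defined.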

Worked for me.
Thanks,
  Miles


Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)

2007-08-10 Thread Andy Whitcroft
Krzysztof Helt wrote:
> On Thu, 9 Aug 2007 14:04:49 +0100
> Andy Whitcroft <[EMAIL PROTECTED]> wrote:
> 
>> Seeing the following compile error on a G5 mac:
>>
>>   drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
>>   drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this
>>  function)
>>   drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is
>> reported only once
>>   drivers/video/tdfxfb.c:1341: error: for each function it appears in.)
>>
>> This seems to be the following fragment from tdfxfb-hardware-cursor:
>>
>> +   } else if (!strcmp(this_opt, "hwcursor")) {
>> +   hwcursor = simple_strtoul(opt + 9, NULL, 0);
>>
>> I guess the naive fix would be s/opt/this_opt, but I am also
>> suspicious of the +9 here as hwcursor is only 8 long?  Now this
>> seems to take a numeric value and I assume that is via hwcursor=N,
>> if so then the +9 would make sense _if_ the strcmp was against
>> "hwcursor=".
>>
> 
> The patch below fixes all issues you have pointed out. It also fixes
> the description of the nomtrr option.
> 
> ---
> 
> From: Krzysztof Helt <[EMAIL PROTECTED]>
> 
> This patch fixes compilation with setup options bug and corrects
> description of the nomtrr option.
> 
> Signed-off-by: Krzysztof Helt <[EMAIL PROTECTED]>
> 
> ---
> 
> --- linux-2.6.22.new/drivers/video/tdfxfb.c   2007-08-09 16:11:23.870028259 
> +0200
> +++ linux-2.6.23/drivers/video/tdfxfb.c   2007-08-09 16:15:07.654781024 
> +0200
> @@ -1337,8 +1337,8 @@ static void tdfxfb_setup(char *options)
>   nopan = 1;
>   } else if (!strcmp(this_opt, "nowrap")) {
>   nowrap = 1;
> - } else if (!strcmp(this_opt, "hwcursor")) {
> - hwcursor = simple_strtoul(opt + 9, NULL, 0);
> + } else if (!strncmp(this_opt, "hwcursor=", 9)) {
> + hwcursor = simple_strtoul(this_opt + 9, NULL, 0);
>  #ifdef CONFIG_MTRR
>   } else if (!strncmp(this_opt, "nomtrr", 6)) {
>   nomtrr = 1;
> @@ -1409,7 +1409,7 @@ MODULE_PARM_DESC(hwcursor, "Enable hardw
>   "(1=enable, 0=disable, default=1)");
>  #ifdef CONFIG_MTRR
>  module_param(nomtrr, bool, 0);
> -MODULE_PARM_DESC(nomtrr, "Disable MTRR support (0 or 1=disabled) 
> (default=0)");
> +MODULE_PARM_DESC(nomtrr, "Disable MTRR support (default: enabled)");
>  #endif
>  
>  module_init(tdfxfb_init);
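
The parsing fix in the patch above boils down to matching the key together
with its '=' and then skipping exactly that many characters. A small
user-space sketch of the same idea follows; sketch_parse_hwcursor() is a
hypothetical helper, and the standard strtoul() is used here in place of the
kernel's simple_strtoul().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse an option of the form "hwcursor=N".  Returns 1 and stores N in
 * *value when the option matches, 0 otherwise (leaving *value untouched).
 */
int sketch_parse_hwcursor(const char *this_opt, unsigned long *value)
{
	static const char key[] = "hwcursor=";
	const size_t key_len = sizeof(key) - 1;	/* 9, excluding the NUL */

	assert(this_opt != NULL && value != NULL);

	/* Compare only the prefix, so the number after '=' is allowed. */
	if (strncmp(this_opt, key, key_len) != 0)
		return 0;

	/* Skip the whole "hwcursor=" prefix before converting the number. */
	*value = strtoul(this_opt + key_len, NULL, 0);
	return 1;
}
```

Deriving the offset from the key itself avoids the mismatch Andy spotted,
where an 8-character comparison was paired with a hard-coded offset of 9.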

Confirmed that this gets my kernel compiled and the result boots.

Tested-by: Andy Whitcroft <[EMAIL PROTECTED]>

-apw


Re: Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Kamalesh Babulal

Andrew Morton wrote:
> On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
>
> > I get a call trace when the file system stress is run on the
> > 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
> > (processor 270)
> >
> > BUG: spinlock bad magic on CPU#1, fsx-linux/19721
>
> Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches were
> removed from 2.6.23-rc2-mm2 - please test that instead.

The call trace is not reproducible in 2.6.23-rc2-mm2.



Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected

2007-08-10 Thread Johannes Berg
On Fri, 2007-08-10 at 02:47 +0400, Alexey Starikovskiy wrote:

> > Presumably the new debugging patches in -mm
> > (workqueue-debug-flushing-deadlocks-with-lockdep.patch and
> > workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have
> > found a potential deadlock in ACPI.  I don't have time to pick through the
> > code to confirm that, but boy I'm good at adding cc's ;)

> Yep, it indeed may lock up... Here is a patch to avoid it

Cool. I'm impressed this stuff actually finds something :)

johannes


Re: Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Andy Whitcroft
On Fri, Aug 10, 2007 at 01:06:58PM +0530, Kamalesh Babulal wrote:
> Andrew Morton wrote:
> >On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal 
> ><[EMAIL PROTECTED]> wrote:
> >
> >  
> >>I get call trace, when the file system stress is run on the
> >>2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
> >>(processor 270)
> >>
> >>BUG: spinlock bad magic on CPU#1, fsx-linux/19721
> >>
> >
> >Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches 
> >were
> >removed from 2.6.23-rc2-mm2 - please test that instead.
> >  
> I get a different call trace on an AMD Opteron(tm) Processor 844 machine;
> I am not sure whether it is related to the same patch
> 
> =============
> BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272]
> CPU 3:
> Modules linked in:
> Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1
> RIP: 0010:[]  [] 
> flush_tlb_others+0x69/0x95

I cannot be 100% sure, but of the group of machines showing your original
problem, one showed this form.  Dropping the patches indicated by Andrew
seemed to fix both symptoms, so I think it is highly likely to be the same
thing.

-apw


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 09:40:00 +0200 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > We seem to have made a mess in there.  timer_list_show() ends up 
> > calling lookup_module_symbol_name(), which takes a mutex.  However 
> > print_symbol() (which is called at oops time, interrupt time, etc) 
> > calls module_address_lookup(), which is basically the same, only it 
> > doesn't take the mutex.
> 
> hm, current upstream does:
> 
>  static void print_name_offset(struct seq_file *m, void *sym)
>  {
>  char symname[KSYM_NAME_LEN];
> 
>  if (lookup_symbol_name((unsigned long)sym, symname) < 0)
> 
> why was that changed?

It wasn't.  lookup_symbol_name() calls lookup_module_symbol_name() which
calls mutex_lock().

> I think symbol lookups for debug purposes have to 
> be lockless, fundamentally.
> 

Sure, especially a sysrq thingy.

It's a bit nasty to just go in there and start walking data structures
without holding the needed lock though.



Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-10 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> We seem to have made a mess in there.  timer_list_show() ends up 
> calling lookup_module_symbol_name(), which takes a mutex.  However 
> print_symbol() (which is called at oops time, interrupt time, etc) 
> calls module_address_lookup(), which is basically the same, only it 
> doesn't take the mutex.

hm, current upstream does:

 static void print_name_offset(struct seq_file *m, void *sym)
 {
 char symname[KSYM_NAME_LEN];

 if (lookup_symbol_name((unsigned long)sym, symname) < 0)

why was that changed? I think symbol lookups for debug purposes have to 
be lockless, fundamentally.

Ingo


Re: Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Kamalesh Babulal

Andrew Morton wrote:
> On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
>
> > I get a call trace when the file system stress is run on the
> > 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
> > (processor 270)
> >
> > BUG: spinlock bad magic on CPU#1, fsx-linux/19721
>
> Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches were
> removed from 2.6.23-rc2-mm2 - please test that instead.

I get a different call trace on an AMD Opteron(tm) Processor 844 machine;
I am not sure whether it is related to the same patch.

=============
BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272]
CPU 3:
Modules linked in:
Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[]  [] 
flush_tlb_others+0x69/0x95

RSP: :810001f15a90  EFLAGS: 0202
RAX: 0003 RBX: 810001f15ac0 RCX: 0008
RDX: 08f3 RSI: 00f3 RDI: 0002
RBP:  R08: 810082f05210 R09: 802e60c1
R10: 8100815e6e70 R11:  R12: 8101ffc38080
R13: 80358b47 R14: 810001f15a40 R15: 810081d73208
FS:  () GS:810180724280() knlGS:f7f75b80
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: f7e20494 CR3: 029f CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400

Call Trace:
[] flush_tlb_page+0x8f/0x97
[] page_mkclean+0x120/0x171
[] ext3_ordered_writepage+0x13f/0x16c
[] clear_page_dirty_for_io+0x52/0xba
[] write_cache_pages+0x1b2/0x33a
[] update_curr+0xd9/0xf8
[] __writepage+0x0/0x2a
[] generic_writepages+0x1f/0x25
[] do_writepages+0x2c/0x35
[] __writeback_single_inode+0x1c9/0x346
[] try_to_del_timer_sync+0x55/0x60
[] del_timer_sync+0x12/0x1f
[] update_curr+0xd9/0xf8
[] dequeue_entity+0x7d/0x92
[] generic_sync_sb_inodes+0x216/0x372
[] sync_sb_inodes+0x1d/0x1f
[] writeback_inodes+0x83/0xd6
[] wb_kupdate+0xa0/0x113
[] pdflush+0x156/0x206
[] wb_kupdate+0x0/0x113
[] pdflush+0x0/0x206
[] kthread+0x44/0x6d
[] child_rip+0xa/0x12
[] kthread+0x0/0x6d
[] child_rip+0x0/0x12

Thanks
Kamalesh Babulal.


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-10 Thread Mariusz Kozlowski
> >>This probably doesn't have great impact ;) but ...
> >> To reproduce: run torture tests for RCU and then sysrq+q.
> >>
> >> SysRq : Show Pending Timers
> >> Timer List Version: v0.3
> >> HRTIMER_MAX_CLOCK_BASES: 2
> >> now at 1764338760370 nsecs
> >>
> >> cpu: 0
> >>  clock 0:
> >>   .index:  0
> >>   .resolution: 1 nsecs
> >>   .get_time:   ktime_get_real
> >>   .offset: 1186699025823815427 nsecs
> >> active timers:
> >>  clock 1:
> >>   .index:  1
> >>   .resolution: 1 nsecs
> >>   .get_time:   ktime_get
> >>   .offset: 0 nsecs
> >> active timers:
> >>  #0: <3>BUG: sleeping function called from invalid context at 
> >> kernel/mutex.c:86
> >> in_atomic():1, irqs_disabled():1
> >> INFO: lockdep is turned off.
> >> irq event stamp: 0
> >> hardirqs last  enabled at (0): [<>] 0x0
> >> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
> >> softirqs last  enabled at (0): [] copy_process+0x4c6/0x144c
> >> softirqs last disabled at (0): [<>] 0x0
> >>  [] show_trace_log_lvl+0x1a/0x30
> >>  [] show_trace+0x12/0x14
> >>  [] dump_stack+0x15/0x17
> >>  [] __might_sleep+0xb7/0xc9
> >>  [] mutex_lock+0x15/0x1f
> >>  [] lookup_module_symbol_name+0x17/0xc0
> >>  [] lookup_symbol_name+0x3f/0x43
> >>  [] print_name_offset+0x1f/0x96
> >>  [] timer_list_show+0x802/0xcbd
> >>  [] sysrq_timer_list_show+0xc/0xe
> >>  [] sysrq_handle_show_timers+0x8/0xa
> >>  [] __handle_sysrq+0x7b/0x115
> >>  [] handle_sysrq+0x20/0x24
> >>  [] kbd_event+0x3a8/0x5c7
> >>  [] input_pass_event+0x8f/0x91
> >>  [] input_handle_event+0x98/0x38d
> >>  [] input_event+0x54/0x67
> >>  [] atkbd_interrupt+0x200/0x59e
> >>  [] serio_interrupt+0x7c/0x80
> >>  [] i8042_interrupt+0x17a/0x289
> >>  [] handle_IRQ_event+0x28/0x59
> >>  [] handle_level_irq+0xad/0x10b
> >>  [] do_IRQ+0x93/0xd0
> >>  [] common_interrupt+0x2e/0x34
> >>  [] rcu_read_delay+0x8/0x36 [rcutorture]
> >>  [] rcu_torture_reader+0x6e/0x169 [rcutorture]
> >>  [] kthread+0x36/0x58
> >>  [] kernel_thread_helper+0x7/0x1c
> >>  ===
> > 
> > We seem to have made a mess in there.  timer_list_show() ends up calling
> > lookup_module_symbol_name(), which takes a mutex.  However print_symbol()
> > (which is called at oops time, interrupt time, etc) calls
> > module_address_lookup(), which is basically the same, only it doesn't take
> > the mutex.
> > 
> > I guess a quicky fix would be to switch
> > kernel/time/timer_list.c:print_name_offset() from
> > lookup_module_symbol_name() to module_address_lookup().  But we'd still
> > have a mess in there.
> > 
> > (adds ccs, runs away)
> 
> I don't think rcutorture matters for this bug. 

Maybe not but that's the only way I could trigger it (insmod rcutorture and 
sysrq+q).

Thanks,

Mariusz
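
For readers following the thread, the mismatch described in the quoted
analysis above can be modelled in plain userspace C. This is only a sketch
under stated assumptions: every model_* name, the tiny symbol table and the
model_in_atomic flag are hypothetical stand-ins, not kernel APIs.
model_lookup_locked() plays the role of lookup_module_symbol_name() (it takes
a sleeping lock) and model_lookup_lockless() plays the role of
module_address_lookup() (same search, no lock).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for kernel state; none of these names exist in the kernel. */
static bool model_in_atomic;        /* true while "in interrupt context"  */
static int model_sleep_violations;  /* counts might_sleep()-style reports */

/* Model of a sleeping lock: taking it in atomic context is the bug. */
static void model_mutex_lock(void)
{
	if (model_in_atomic)
		model_sleep_violations++;
}

static void model_mutex_unlock(void)
{
}

struct model_sym {
	unsigned long addr;
	const char *name;
};

/* Tiny made-up symbol table, sorted by start address. */
static const struct model_sym model_syms[] = {
	{ 0x1000, "model_timer_fn" },
	{ 0x2000, "model_other_fn" },
};

/* Shared search: the last symbol whose start address is <= addr. */
static const char *model_find(unsigned long addr)
{
	const char *name = NULL;
	size_t i;

	for (i = 0; i < sizeof(model_syms) / sizeof(model_syms[0]); i++)
		if (model_syms[i].addr <= addr)
			name = model_syms[i].name;
	return name;
}

/* Role of lookup_module_symbol_name(): takes the mutex around the search. */
static const char *model_lookup_locked(unsigned long addr)
{
	const char *name;

	model_mutex_lock();
	name = model_find(addr);
	model_mutex_unlock();
	return name;
}

/* Role of module_address_lookup(): the same search without the mutex. */
static const char *model_lookup_lockless(unsigned long addr)
{
	return model_find(addr);
}
```

Calling model_lookup_locked() while model_in_atomic is set bumps
model_sleep_violations, which is the userspace analogue of the
__might_sleep() report in the trace; the lockless variant returns the same
name without tripping the check.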


Re: 2.6.23-rc2-mm1: irq lock inversion dependency detected

2007-08-10 Thread Mariusz Kozlowski
Hello,

And the winner of today is ...



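=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.23-rc2-mm1 #7
---------------------------------------------------------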
runscript.sh/5843 just changed the state of lock:
 (_xmit_ETHER){-+..}, at: [] dev_watchdog+0x17/0xcc
but this lock took another, soft-irq-unsafe lock in the past:
 (&tp->lock){--..}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
no locks held by runscript.sh/5843.

the first lock's dependencies:
-> (_xmit_ETHER){-+..} ops: 21 {
   initial-use  at:
[] __lock_acquire+0x217/0x11ac
[] lock_acquire+0x99/0xb2
[] _spin_lock_bh+0x3a/0x47
[] dev_set_rx_mode+0x14/0x3b
[] dev_change_flags+0x68/0x190
[] devinet_ioctl+0x4af/0x652
[] inet_ioctl+0x56/0x71
[] sock_ioctl+0xa5/0x1d4
[] do_ioctl+0x22/0x71
[] vfs_ioctl+0x55/0x29e
[] sys_ioctl+0x33/0x69
[] sysenter_past_esp+0x5f/0x99
[] 0x
   in-softirq-W at:
[] __lock_acquire+0x6f2/0x11ac
[] lock_acquire+0x99/0xb2
[] _spin_lock+0x35/0x42
[] dev_watchdog+0x17/0xcc
[] run_timer_softirq+0x14b/0x1a9
[] __do_softirq+0x5b/0xb2
[] do_softirq+0x4d/0x4f
[] irq_exit+0x48/0x4a
[] do_IRQ+0x98/0xd0
[] common_interrupt+0x2e/0x34
[] error_code+0x6a/0x70
[] 0x
   hardirq-on-W at:
[] __lock_acquire+0x73e/0x11ac
[] lock_acquire+0x99/0xb2
[] _spin_lock_bh+0x3a/0x47
[] dev_set_rx_mode+0x14/0x3b
[] dev_change_flags+0x68/0x190
[] devinet_ioctl+0x4af/0x652
[] inet_ioctl+0x56/0x71
[] sock_ioctl+0xa5/0x1d4
[] do_ioctl+0x22/0x71
[] vfs_ioctl+0x55/0x29e
[] sys_ioctl+0x33/0x69
[] sysenter_past_esp+0x5f/0x99
[] 0x
 }
 ... key  at: [] netdev_xmit_lock_key+0x8/0x1c0
 -> (&tp->lock){--..} ops: 44 {
initial-use  at:
  [] __lock_acquire+0x217/0x11ac
  [] lock_acquire+0x99/0xb2
  [] _spin_lock+0x35/0x42
  [] rtl8139_interrupt+0x27/0x46b [8139too]
  [] request_irq+0xba/0x108
  [] rtl8139_open+0x2f/0x1e2 [8139too]
  [] dev_open+0x37/0x76
  [] dev_change_flags+0x8e/0x190
  [] devinet_ioctl+0x4af/0x652
  [] inet_ioctl+0x56/0x71
  [] sock_ioctl+0xa5/0x1d4
  [] do_ioctl+0x22/0x71
  [] vfs_ioctl+0x55/0x29e
  [] sys_ioctl+0x33/0x69
  [] sysenter_past_esp+0x5f/0x99
  [] 0x
softirq-on-W at:
  [] __lock_acquire+0x767/0x11ac
  [] lock_acquire+0x99/0xb2
  [] _spin_lock+0x35/0x42
  [] rtl8139_interrupt+0x27/0x46b [8139too]
  [] free_irq+0x11b/0x146
  [] rtl8139_close+0x8a/0x14a [8139too]
  [] dev_close+0x57/0x74
  [] dev_change_flags+0x8e/0x190
  [] devinet_ioctl+0x4af/0x652
  [] inet_ioctl+0x56/0x71
  [] sock_ioctl+0xa5/0x1d4
  [] do_ioctl+0x22/0x71
  [] vfs_ioctl+0x55/0x29e
  [] sys_ioctl+0x33/0x69
  [] sysenter_past_esp+0x5f/0x99
  [] 0x
hardirq-on-W at:
  [] __lock_acquire+0x73e/0x11ac
  [] lock_acquire+0x99/0xb2
  [] _spin_lock+0x35/0x42
  [] rtl8139_interrupt+0x27/0x46b [8139too]
  [] free_irq+0x11b/0x146
  [] rtl8139_close+0x8a/0x14a [8139too]
  [] dev_close+0x57/0x74
  [] dev_change_flags+0x8e/0x190
  [] devinet_ioctl+0x4af/0x652
  [] inet_ioc

kernel BUG at mm/swap_state.c:78 with the 2.6.23-rc2-mm1

2007-08-10 Thread Kamalesh Babulal

Hi,

I got the following kernel BUG on the 2.6.23-rc2-mm1 kernel on
a Dual Core AMD Opteron (processor 270) while running the LTP
runall tests:


kernel BUG at mm/swap_state.c:78!
invalid opcode:  [1] SMP
CPU 0
Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button
Pid: 262, comm: kprefetchd Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[]  [] __add_to_swap_cache+0x12/0xa6
RSP: 0018:81000299bea0  EFLAGS: 00010246
RAX:  RBX: 81003f3baec0 RCX: 8100048c33b0
RDX: 00d0 RSI: 0001 RDI: 00d0
RBP: 81003f3baec0 R08: 810001423f14 R09: bb27
R10:  R11: 0001 R12: 0001
R13: 0002 R14: 0001 R15: 8100048c33b0
FS:  2b941f4a40f0() GS:8067() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 005b9db0 CR3: 04ed7000 CR4: 06e0
DR0:  DR1:  DR2:
DR3:  DR6: 0ff0 DR7: 0400
Process kprefetchd (pid: 262, threadinfo 81000299a000, task 810001c98040)
Stack:  0001 81003f3baec0  8027c50d
0002 81003f3baec0  8027f318
  81000299bf20

Call Trace:
[] add_to_swap_cache+0x36/0x5f
[] kprefetchd+0x248/0x40c
[] kprefetchd+0x0/0x40c
[] kthread+0x47/0x73
[] child_rip+0xa/0x12
[] kthread+0x0/0x73
[] child_rip+0x0/0x12

Code: 0f 0b eb fe 8b 03 66 85 c0 79 04 0f 0b eb fe 8b 03 f6 c4 08

Thanks,
Kamalesh Babulal.


Re: Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> I get call trace, when the file system stress is run on the
> 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
> (processor 270)
> 
> \BUG: spinlock bad magic on 
> CPU#1, fsx-linux/19721

Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches were
removed from 2.6.23-rc2-mm2 - please test that instead.


Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Kamalesh Babulal

Hi,

I get call trace, when the file system stress is run on the
2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
(processor 270)

\BUG: spinlock bad magic on CPU#1, fsx-linux/19721
lock: 8100028cef48, .magic: , .owner: <none>/-1, .owner_cpu: 0

Call Trace:
[] _raw_spin_lock+0x22/0xf6
[] _spin_lock_irqsave+0x9/0xe
[] prop_norm_single+0x40/0x9a
[] set_page_dirty+0x8d/0xc9
[] set_page_dirty_balance+0x9/0x39
[] __do_fault+0x37a/0x395
[] handle_mm_fault+0x342/0x6c3
[] do_page_fault+0x3e5/0x7ab
[] arch_get_unmapped_area+0x184/0x1f9
[] _spin_lock_irqsave+0x9/0xe
[] __up_write+0x21/0x10d
[] error_exit+0x0/0x84



BUG: spinlock lockup on CPU#1, fsx-linux/19721, 8100028cef48

Call Trace:
[] _raw_spin_lock+0xcf/0xf6
[] _spin_lock_irqsave+0x9/0xe
[] prop_norm_single+0x40/0x9a
[] set_page_dirty+0x8d/0xc9
[] set_page_dirty_balance+0x9/0x39
[] __do_fault+0x37a/0x395
[] handle_mm_fault+0x342/0x6c3
[] do_page_fault+0x3e5/0x7ab
[] arch_get_unmapped_area+0x184/0x1f9
[] _spin_lock_irqsave+0x9/0xe
[] __up_write+0x21/0x10d
[] error_exit+0x0/0x84



BUG: soft lockup - CPU#2 stuck for 11s! [events/2:17]
CPU 2:
Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button
Pid: 17, comm: events/2 Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[]  [] __smp_call_function+0x63/0x84
RSP: 0018:810001727e00  EFLAGS: 0297
RAX: 08fc RBX: 0003 RCX:
RDX: 08fc RSI: 810001727de0 RDI: 00fc
RBP: 0246 R08: 0003 R09: 0005
R10: 0010 R11: 0246 R12: 0400
R13: 0400 R14:  R15: 81000102d980
FS:  2b8de51be6f0() GS:8100016123c0() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0036d4b938a0 CR3: 1659f000 CR4: 06e0
DR0:  DR1:  DR2:
DR3:  DR6: 0ff0 DR7: 0400

Call Trace:
[] mcheck_check_cpu+0x0/0x30
[] smp_call_function+0x32/0x49
[] mcheck_check_cpu+0x0/0x30
[] on_each_cpu+0x10/0x22
[] mcheck_timer+0x0/0x7c
[] mcheck_timer+0x1d/0x7c
[] _spin_unlock_irq+0x9/0xc
[] run_workqueue+0x8d/0x11a
[] worker_thread+0x0/0xe4
[] worker_thread+0xda/0xe4
[] autoremove_wake_function+0x0/0x2e
[] kthread+0x47/0x73
[] child_rip+0xa/0x12
[] kthread+0x0/0x73
[] child_rip+0x0/0x12



BUG: soft lockup - CPU#2 stuck for 11s! [events/2:17]
CPU 2:
Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button
Pid: 17, comm: events/2 Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[]  [] __smp_call_function+0x63/0x84
RSP: 0018:810001727e00  EFLAGS: 0297
RAX: 08fc RBX: 0003 RCX:
RDX: 08fc RSI: 810001727de0 RDI: 00fc
RBP: 0246 R08: 0003 R09: 0005
R10: 0010 R11: 0246 R12: 0400
R13: 0400 R14:  R15: 81000102d980
FS:  2b8de51be6f0() GS:8100016123c0() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0036d4b938a0 CR3: 1659f000 CR4: 06e0
DR0:  DR1:  DR2:
DR3:  DR6: 0ff0 DR7: 0400

Call Trace:
[] mcheck_check_cpu+0x0/0x30
[] smp_call_function+0x32/0x49
[] mcheck_check_cpu+0x0/0x30
[] on_each_cpu+0x10/0x22
[] mcheck_timer+0x0/0x7c
[] mcheck_timer+0x1d/0x7c
[] _spin_unlock_irq+0x9/0xc
[] run_workqueue+0x8d/0x11a
[] worker_thread+0x0/0xe4
[] worker_thread+0xda/0xe4
[] autoremove_wake_function+0x0/0x2e
[] kthread+0x47/0x73
[] child_rip+0xa/0x12
[] kthread+0x0/0x73
[] child_rip+0x0/0x12



BUG: soft lockup - CPU#2 stuck for 11s! [events/2:17]
CPU 2:
Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc battery ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core button
Pid: 17, comm: events/2 Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[]  [] __smp_call_function+0x63/0x84
RSP: 0018:810001727e00  EFLAGS: 0297
RAX: 08fc RBX: 0003 RCX:
RDX: 08fc RSI: 810001727de0 RDI: 00fc
RBP: 0246 R08: 0003 R09: 0005
R10: 0010 R11: 0246 R12: 0400
R13: 0400 R14:  R15: 81000102d980
FS:  2b8de51be6f0() GS:8100016123c0() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0036d4b938a0 CR3: 00201000 CR4: 06e0
DR0:  DR1:

Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 15:27:42 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Thursday 09 August 2007 20:52:58 Andrew Morton wrote:
> > On Thu, 9 Aug 2007 10:18:15 -0400
> > Miles Lane <[EMAIL PROTECTED]> wrote:
> >
> > >   CC      drivers/dma/ioat_dca.o
> > > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > > 'cpu_physical_id'
> >
> > Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
> >
> > Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> > supported architectures (i386, x86_64, ia64) should provide a non-SMP
> > version of cpu_physical_id().  Preferably the latter, I'd say.
>
>
> It doesn't make much sense in smp.h because there is not really
> a concept of physical id on most architectures i expect. Better
> to put it into the individual asm files.
>

I gave up and did this:

From: Andrew Morton <[EMAIL PROTECTED]>

drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

This is so screwed up.  Root cause: <linux/smp.h> only includes <asm/smp.h> if
CONFIG_SMP=y.

To get at cpu_physical_id() on UP, the user must include <asm/smp.h>, not
<linux/smp.h>.

Cc: "Luck, Tony" <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Shannon Nelson <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 drivers/dma/ioat_dca.c |3 +++
 1 file changed, 3 insertions(+)

diff -puN drivers/dma/ioat_dca.c~git-dma-up-fix drivers/dma/ioat_dca.c
--- a/drivers/dma/ioat_dca.c~git-dma-up-fix
+++ a/drivers/dma/ioat_dca.c
@@ -25,6 +25,9 @@
 #include <linux/smp.h>
 #include <linux/interrupt.h>
 #include <linux/dca.h>
+
+#include <asm/smp.h>
+
 #include "ioatdma.h"
 #include "ioatdma_registers.h"
 
_



Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Thu, Aug 09, 2007 at 07:06:23PM -0700, Andrew Morton wrote:
> On Thu, 9 Aug 2007 19:00:40 -0700 Paul E. McKenney <[EMAIL PROTECTED]> wrote:
>
> > On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > > ...
> > > > Changes since 2.6.23-rc2-mm1:
> > > ...
> > > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > > ...
> > > >  2.6.23 queue
> > > ...
> > >
> > > All drivers were converted to no longer use xtime directly since it
> > > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > > as RNG...
> >
> > This code doesn't care if the time is outdated, as it is simply
> > periodically perturbing an RNG, but OK.
> >
> > So, what interface are we supposed to be using instead?  I cannot use
> > get_random_bytes() due to locking issues.  This is not a cryptographically
> > secure usage, so the perturbation does not need to be extremely high
> > quality.
> >
> > On x86, I would just grab the low-order bits of the TSC, but all of the
> > world is not an x86.  ;-)
>
> One used to use sched_clock() for this, then get frowned at.  Now we
> have cpu_clock()...

Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
release.  Which means that the rate of API change in this area is a
bit high, so I should avoid it like the plague.  Therefore, I should
look for some other convenient source of entropy.

One convenient source would be the per-CPU statistics that rcutorture
maintains.  Of course, a given CPU's RNG is nearly in lock-step with
its own statistics, but not with the adjacent CPU's statistics...

I will send a patch.

Thanx, Paul


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 08:12:08 -0700 Paul E. McKenney <[EMAIL PROTECTED]> wrote:

> > One used to use sched_clock() for this, then get frowned at.  Now we
> > have cpu_clock()...
>
> Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
> release.  Which means that the rate of API change in this area is a
> bit high, so I should avoid it like the plague.

eh, it's been there for weeks.  It is dust-encrusted.

>  Therefore, I should
> look for some other convenient source of entropy.
>
> One convenient source would be the per-CPU statistics that rcutorture
> maintains.  Of course, a given CPU's RNG is nearly in lock-step with
> its own statistics, but not with the adjacent CPU's statistics...
>
> I will send a patch.

Please use cpu_clock().  It ain't going away.
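
As an illustration of what "periodically perturbing an RNG" from a clock
looks like, here is a minimal userspace sketch. It is not the rcutorture
code and not the patch discussed in this thread: all model_* names and the
constants are hypothetical, and the kernel's cpu_clock() is replaced by an
injected callback so the example stays deterministic.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_RNG_MULT    39916801u   /* odd multiplier for the LCG step   */
#define MODEL_RNG_ADD     479001599u  /* additive constant for the LCG step */
#define MODEL_RNG_REFRESH 4           /* LCG steps between perturbations    */

struct model_rng {
	uint32_t state;            /* current LCG state                   */
	int count;                 /* steps left before next perturbation */
	uint64_t (*clock)(void);   /* injected time source (assumption)   */
};

/* Advance the generator; every so often mix in the low clock bits. */
static uint32_t model_rng_next(struct model_rng *r)
{
	if (--r->count < 0) {
		r->state += (uint32_t)r->clock();
		r->count = MODEL_RNG_REFRESH;
	}
	r->state = r->state * MODEL_RNG_MULT + MODEL_RNG_ADD;
	return r->state;
}

/* Deterministic fake clock: advances by 1000 "ns" on every call. */
static uint64_t model_fake_clock(void)
{
	static uint64_t now;

	now += 1000;
	return now;
}
```

The clock value only nudges the state now and then, so its quality and
freshness matter little, which matches the point made earlier in the thread
that this is not a cryptographic use.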


Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

2007-08-10 Thread Andi Kleen
On Thursday 09 August 2007 20:52:58 Andrew Morton wrote:
> On Thu, 9 Aug 2007 10:18:15 -0400
> Miles Lane <[EMAIL PROTECTED]> wrote:
>
> >   CC      drivers/dma/ioat_dca.o
> > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > 'cpu_physical_id'
>
> Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
>
> Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> supported architectures (i386, x86_64, ia64) should provide a non-SMP
> version of cpu_physical_id().  Preferably the latter, I'd say.


It doesn't make much sense in smp.h because there is not really
a concept of physical id on most architectures i expect. Better 
to put it into the individual asm files.

-Andi



Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function 'cpu_physical_id'

2007-08-10 Thread Miles Lane
On 8/9/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Thu, 9 Aug 2007 10:18:15 -0400
> Miles Lane <[EMAIL PROTECTED]> wrote:
>
> >   CC      drivers/dma/ioat_dca.o
> > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > 'cpu_physical_id'
>
> Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.
>
> Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
> supported architectures (i386, x86_64, ia64) should provide a non-SMP
> version of cpu_physical_id().  Preferably the latter, I'd say.
>
> Something like this, I suppose...
>
>
> From: Andrew Morton <[EMAIL PROTECTED]>
>
> i386, x86_64 and ia64 implement cpu_physical_id() if CONFIG_SMP=y.
>
> Provide a uniprocessor stub so that callers will dtrt.
>
> Cc: Andi Kleen <[EMAIL PROTECTED]>
> Cc: "Luck, Tony" <[EMAIL PROTECTED]>
> Cc: Shannon Nelson <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> ---
>
>  include/linux/smp.h |    5 +++++
>  1 files changed, 5 insertions(+)
>
> diff -puN include/linux/smp.h~implement-cpu_physical_id-on-smp=n include/linux/smp.h
> --- a/include/linux/smp.h~implement-cpu_physical_id-on-smp=n
> +++ a/include/linux/smp.h
> @@ -108,6 +108,11 @@ static inline void smp_send_reschedule(i
>         0;  \
>  })
>
> +static inline unsigned cpu_physical_id(unsigned cpu)
> +{
> +       return 0;
> +}
> +
>  #endif /* !SMP */
>
>  /*
> _

Worked for me.
Thanks,
  Miles
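
As a rough illustration of the stub approach in the patch above, the same
pattern can be tried in plain userspace C. This is only a sketch, not the
kernel header: CONFIG_SMP is simply left undefined here, and
dca_tag_for_cpu() is a hypothetical caller standing in for a driver such
as ioat.

```c
/* Userspace sketch of the UP-stub pattern; not the real include/linux/smp.h. */

#ifdef CONFIG_SMP
/* On SMP builds the architecture would supply the real mapping. */
unsigned cpu_physical_id(unsigned cpu);
#else
/* Uniprocessor stub: there is only one CPU, so report physical id 0. */
static inline unsigned cpu_physical_id(unsigned cpu)
{
	(void)cpu;	/* unused on UP */
	return 0;
}
#endif

/* Hypothetical caller: compiles unchanged whether or not CONFIG_SMP is set. */
static inline unsigned dca_tag_for_cpu(unsigned cpu)
{
	return cpu_physical_id(cpu) & 0xff;
}
```

The point of the stub is that callers need no #ifdef of their own; the
mask in the caller is purely illustrative.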


Re: Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Andy Whitcroft
On Fri, Aug 10, 2007 at 01:06:58PM +0530, Kamalesh Babulal wrote:
 Andrew Morton wrote:
 On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal 
 [EMAIL PROTECTED] wrote:
 
   
 I get call trace, when the file system stress is run on the
 2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
 (processor 270)
 
 BUG: spinlock bad magic on CPU#1, fsx-linux/19721
 
 
 Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches 
 were
 removed from 2.6.23-rc2-mm2 - please test that instead.
   
 I get a different call trace on an AMD Opteron(tm) Processor 844 machine;
 I am not sure whether it is related to the same patch.
 
 =
 BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272]
 CPU 3:
 Modules linked in:
 Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1
 RIP: 0010:[8021a9c3]  [8021a9c3] 
 flush_tlb_others+0x69/0x95

Cannot be 100% sure but of the group of machines showing your original
problem one showed this form.  Dropping the patches indicated by Andrew
seemed to fix both symptoms.  So I think it is highly likely the same
thing.

-apw


Re: Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Kamalesh Babulal

Andrew Morton wrote:

On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

  

I get call trace, when the file system stress is run on the
2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
(processor 270)

BUG: spinlock bad magic on CPU#1, fsx-linux/19721



Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches were
removed from 2.6.23-rc2-mm2 - please test that instead.
  

I get a different call trace on an AMD Opteron(tm) Processor 844 machine;
I am not sure whether it is related to the same patch.

=
BUG: soft lockup - CPU#3 stuck for 11s! [pdflush:272]
CPU 3:
Modules linked in:
Pid: 272, comm: pdflush Not tainted 2.6.23-rc2-mm1-autokern1 #1
RIP: 0010:[8021a9c3]  [8021a9c3] 
flush_tlb_others+0x69/0x95

RSP: :810001f15a90  EFLAGS: 0202
RAX: 0003 RBX: 810001f15ac0 RCX: 0008
RDX: 08f3 RSI: 00f3 RDI: 0002
RBP:  R08: 810082f05210 R09: 802e60c1
R10: 8100815e6e70 R11:  R12: 8101ffc38080
R13: 80358b47 R14: 810001f15a40 R15: 810081d73208
FS:  () GS:810180724280() knlGS:f7f75b80
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: f7e20494 CR3: 029f CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400

Call Trace:
[8021abd0] flush_tlb_page+0x8f/0x97
[8026daee] page_mkclean+0x120/0x171
[802e6227] ext3_ordered_writepage+0x13f/0x16c
[8025ed09] clear_page_dirty_for_io+0x52/0xba
[8025f002] write_cache_pages+0x1b2/0x33a
[8022a149] update_curr+0xd9/0xf8
[8025e9ca] __writepage+0x0/0x2a
[8025f1a9] generic_writepages+0x1f/0x25
[8025f1db] do_writepages+0x2c/0x35
[8029b453] __writeback_single_inode+0x1c9/0x346
[8023b809] try_to_del_timer_sync+0x55/0x60
[8023b826] del_timer_sync+0x12/0x1f
[8022a149] update_curr+0xd9/0xf8
[8022a446] dequeue_entity+0x7d/0x92
[8029ba20] generic_sync_sb_inodes+0x216/0x372
[8029bb99] sync_sb_inodes+0x1d/0x1f
[8029bdd9] writeback_inodes+0x83/0xd6
[8025e82b] wb_kupdate+0xa0/0x113
[8025f658] pdflush+0x156/0x206
[8025e78b] wb_kupdate+0x0/0x113
[8025f502] pdflush+0x0/0x206
[80245660] kthread+0x44/0x6d
[8020c5e8] child_rip+0xa/0x12
[8024561c] kthread+0x0/0x6d
[8020c5de] child_rip+0x0/0x12

Thanks
Kamalesh Babulal.


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-10 Thread Ingo Molnar

* Andrew Morton [EMAIL PROTECTED] wrote:

 We seem to have made a mess in there.  timer_list_show() ends up 
 calling lookup_module_symbol_name(), which takes a mutex.  However 
 print_symbol() (which is called at oops time, interrupt time, etc) 
 calls module_address_lookup(), which is basically the same, only it 
 doesn't take the mutex.

hm, current upstream does:

 static void print_name_offset(struct seq_file *m, void *sym)
 {
 char symname[KSYM_NAME_LEN];

 if (lookup_symbol_name((unsigned long)sym, symname) < 0)

why was that changed? I think symbol lookups for debug purposes have to 
be lockless, fundamentally.

Ingo


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-10 Thread Andrew Morton
On Fri, 10 Aug 2007 09:40:00 +0200 Ingo Molnar [EMAIL PROTECTED] wrote:

 
 * Andrew Morton [EMAIL PROTECTED] wrote:
 
  We seem to have made a mess in there.  timer_list_show() ends up 
  calling lookup_module_symbol_name(), which takes a mutex.  However 
  print_symbol() (which is called at oops time, interrupt time, etc) 
  calls module_address_lookup(), which is basically the same, only it 
  doesn't take the mutex.
 
 hm, current upstream does:
 
  static void print_name_offset(struct seq_file *m, void *sym)
  {
  char symname[KSYM_NAME_LEN];
 
  if (lookup_symbol_name((unsigned long)sym, symname) < 0)
 
 why was that changed?

It wasn't.  lookup_symbol_name() calls lookup_module_symbol_name() which
calls mutex_lock().

 I think symbol lookups for debug purposes have to 
 be lockless, fundamentally.
 

Sure, especially a sysrq thingy.

It's a bit nasty to just go in there and start walking data structures
without holding the needed lock though.



Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected

2007-08-10 Thread Johannes Berg
On Fri, 2007-08-10 at 02:47 +0400, Alexey Starikovskiy wrote:

  Presumably the new debugging patches in -mm
  (workqueue-debug-flushing-deadlocks-with-lockdep.patch and
  workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have
  found a potential deadlock in ACPI.  I don't have time to pick through the
  code to confirm that, but boy I'm good at adding cc's ;)

 Yep, it indeed may lock up... Here is a patch to avoid it

Cool. I'm impressed this stuff actually finds something :)

johannes




Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
 On Fri, 10 Aug 2007 08:12:08 -0700 Paul E. McKenney [EMAIL PROTECTED] 
 wrote:
 
   One used to use sched_clock() for this, then get frowned at.  Now we
   have cpu_clock()...
  
  Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
  release.  Which means that the rate of API change in this area is a
  bit high, so I should avoid it like the plague.
 
 eh, it's been there for weeks.  It is dust-encrusted.
 
   Therefore, I should
  look for some other convenient source of entropy.
  
  One convenient source would be the per-CPU statistics that rcutorture
  maintains.  Of course, a given CPU's RNG is nearly in lock-step with
  its own statistics, but not with the adjacent CPU's statistics...
  
  I will send a patch.
 
 Please use cpu_clock().  It ain't going away.

D'accord...

Thanx, Paul


Re: Softlockup detected with 2.6.23-rc2-mm1

2007-08-10 Thread Kamalesh Babulal

Andrew Morton wrote:

On Fri, 10 Aug 2007 11:46:25 +0530 Kamalesh Babulal [EMAIL PROTECTED] wrote:

  

I get call trace, when the file system stress is run on the
2.6.23-rc2-mm1 kernel on a Dual Core AMD Opteron
(processor 270)

BUG: spinlock bad magic on CPU#1, fsx-linux/19721



Yes, sorry, mm-dirty-balancing-for-tasks.patch had a bug.  Those patches were
removed from 2.6.23-rc2-mm2 - please test that instead.
  

The call trace is not reproducible in 2.6.23-rc2-mm2.



Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)

2007-08-10 Thread Andy Whitcroft
Krzysztof Helt wrote:
 On Thu, 9 Aug 2007 14:04:49 +0100
 Andy Whitcroft [EMAIL PROTECTED] wrote:
 
 Seeing the following compile error on a G5 mac:

   drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
   drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this
  function)
   drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is
 reported only once
   drivers/video/tdfxfb.c:1341: error: for each function it appears in.)

 This seems to be the following fragment from tdfxfb-hardware-cursor:

 +   } else if (!strcmp(this_opt, "hwcursor")) {
 +   hwcursor = simple_strtoul(opt + 9, NULL, 0);

 I guess the naive fix would be s/opt/this_opt, but I am also
 suspicious of the +9 here as "hwcursor" is only 8 long?  Now this
 seems to take a numeric value and I assume that is via hwcursor=N,
 if so then the +9 would make sense _if_ the strcmp was against
 "hwcursor=".

 
 The patch below fixes all issues you have pointed out. It also fixes
 the description of the nomtrr option.
 
 ---
 
 From: Krzysztof Helt [EMAIL PROTECTED]
 
 This patch fixes compilation with setup options bug and corrects
 description of the nomtrr option.
 
 Signed-off-by: Krzysztof Helt [EMAIL PROTECTED]
 
 ---
 
 --- linux-2.6.22.new/drivers/video/tdfxfb.c   2007-08-09 16:11:23.870028259 +0200
 +++ linux-2.6.23/drivers/video/tdfxfb.c   2007-08-09 16:15:07.654781024 +0200
 @@ -1337,8 +1337,8 @@ static void tdfxfb_setup(char *options)
   nopan = 1;
   } else if (!strcmp(this_opt, "nowrap")) {
   nowrap = 1;
 - } else if (!strcmp(this_opt, "hwcursor")) {
 - hwcursor = simple_strtoul(opt + 9, NULL, 0);
 + } else if (!strncmp(this_opt, "hwcursor=", 9)) {
 + hwcursor = simple_strtoul(this_opt + 9, NULL, 0);
  #ifdef CONFIG_MTRR
   } else if (!strncmp(this_opt, "nomtrr", 6)) {
   nomtrr = 1;
 @@ -1409,7 +1409,7 @@ MODULE_PARM_DESC(hwcursor, "Enable hardw
   "(1=enable, 0=disable, default=1)");
  #ifdef CONFIG_MTRR
  module_param(nomtrr, bool, 0);
 -MODULE_PARM_DESC(nomtrr, "Disable MTRR support (0 or 1=disabled) (default=0)");
 +MODULE_PARM_DESC(nomtrr, "Disable MTRR support (default: enabled)");
  #endif
  
  module_init(tdfxfb_init);

Confirmed that this gets my kernel compiled and the result boots.

Tested-by: Andy Whitcroft [EMAIL PROTECTED]

-apw
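
The prefix-match-plus-offset idiom that the fix relies on can be sketched
in userspace C roughly as follows. This is an illustration only:
parse_hwcursor() is a hypothetical helper, and the standard strtoul()
stands in for the kernel's simple_strtoul().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Userspace sketch of the option parsing discussed above.
 * Returns the parsed value, or -1 if this_opt is not a "hwcursor=N" option.
 */
static long parse_hwcursor(const char *this_opt)
{
	static const char prefix[] = "hwcursor=";
	const size_t len = sizeof(prefix) - 1;	/* 9, excluding the NUL */

	/* Compare only the prefix; strcmp() would never match "hwcursor=1". */
	if (strncmp(this_opt, prefix, len) != 0)
		return -1;

	/* The value starts right after the '=' sign, i.e. at offset 9. */
	return (long)strtoul(this_opt + len, NULL, 0);
}
```

Deriving the offset from the prefix string keeps the compared length and
the skipped length from drifting apart, which is the mismatch (8 versus 9)
that was questioned in the original fragment.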


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 01:30:55PM -0700, Paul E. McKenney wrote:
 On Fri, Aug 10, 2007 at 10:12:12AM -0700, Andrew Morton wrote:
  On Fri, 10 Aug 2007 08:12:08 -0700 Paul E. McKenney [EMAIL PROTECTED] 
  wrote:
  
One used to use sched_clock() for this, then get frowned at.  Now we
have cpu_clock()...
   
   Hmmm...  And cpu_clock() is not in 2.6.22, so must appear in some later
   release.  Which means that the rate of API change in this area is a
   bit high, so I should avoid it like the plague.
  
  eh, it's been there for weeks.  It is dust-encrusted.
  
Therefore, I should
   look for some other convenient source of entropy.
   
   One convenient source would be the per-CPU statistics that rcutorture
   maintains.  Of course, a given CPU's RNG is nearly in lock-step with
   its own statistics, but not with the adjacent CPU's statistics...
   
   I will send a patch.
  
  Please use cpu_clock().  It ain't going away.
 
 D'accord...

Errmmm...  No joy.

ERROR: cpu_clock [kernel/rcutorture.ko] undefined!

Turns out that cpu_clock also ain't exported, and rcutorture.c is
a module.  Would adding an EXPORT_SYMBOL_GPL() as in the patch below
be acceptable?

If not, I have a tested patch to rcutorture.c that leverages statistical
counters.  Your choice.

Thanx, Paul

Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
Compiles, but not yet tested.

Signed-off-by: Paul E. McKenney [EMAIL PROTECTED]
---

 rcutorture.c |8 ++--
 sched.c  |2 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
--- linux-2.6.23-rc2/kernel/rcutorture.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c	2007-08-10 17:15:22.0 -0700
@@ -42,7 +42,6 @@
 #include <linux/notifier.h>
 #include <linux/freezer.h>
 #include <linux/cpu.h>
-#include <linux/random.h>
 #include <linux/delay.h>
 #include <linux/byteorder/swabb.h>
 #include <linux/stat.h>
@@ -166,16 +165,13 @@ struct rcu_random_state {
 
 /*
  * Crude but fast random-number generator.  Uses a linear congruential
- * generator, with occasional help from get_random_bytes().
+ * generator, with occasional help from cpu_clock().
  */
 static unsigned long
 rcu_random(struct rcu_random_state *rrsp)
 {
-	long refresh;
-
 	if (--rrsp->rrs_count < 0) {
-		get_random_bytes(&refresh, sizeof(refresh));
-		rrsp->rrs_state += refresh;
+		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
 		rrsp->rrs_count = RCU_RANDOM_REFRESH;
 	}
 	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
--- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
+++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
@@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
 	return now;
 }
 
+EXPORT_SYMBOL_GPL(cpu_clock);
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /* Change a task's ->cfs_rq if it moves across CPUs */
 static inline void set_task_cfs_rq(struct task_struct *p)


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-10 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 05:29:49PM -0700, Paul E. McKenney wrote:
 
 Errmmm...  No joy.
 
   ERROR: cpu_clock [kernel/rcutorture.ko] undefined!
 
 Turns out that cpu_clock also ain't exported, and rcutorture.c is
 a module.  Would adding an EXPORT_SYMBOL_GPL() as in the patch below
 be acceptable?

Except that the old xtime symbol was EXPORT_SYMBOL() rather than my
proposed EXPORT_SYMBOL_GPL() for the equivalent new cpu_clock().

Sigh!!!  I will leave this one for others to sort out.

Andrew, please consider this patch withdrawn and apply the version that
does not rely on time for entropy.  Please let me know if you would like
me to resend it.

Thanx, Paul

 If not, I have a tested patch to rcutorture.c that leverages statistical
 counters.  Your choice.
 
   Thanx, Paul
 
 Add an EXPORT_SYMBOL_GPL() for cpu_clock() and make rcutorture.c use it.
 Compiles, but not yet tested.
 
 Signed-off-by: Paul E. McKenney [EMAIL PROTECTED]
 ---
 
  rcutorture.c |8 ++--
  sched.c  |2 ++
  2 files changed, 4 insertions(+), 6 deletions(-)
 
 diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/rcutorture.c linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c
 --- linux-2.6.23-rc2/kernel/rcutorture.c	2007-08-03 19:49:55.0 -0700
 +++ linux-2.6.23-rc2-rcutorturesched/kernel/rcutorture.c	2007-08-10 17:15:22.0 -0700
 @@ -42,7 +42,6 @@
  #include <linux/notifier.h>
  #include <linux/freezer.h>
  #include <linux/cpu.h>
 -#include <linux/random.h>
  #include <linux/delay.h>
  #include <linux/byteorder/swabb.h>
  #include <linux/stat.h>
 @@ -166,16 +165,13 @@ struct rcu_random_state {
  
  /*
   * Crude but fast random-number generator.  Uses a linear congruential
 - * generator, with occasional help from get_random_bytes().
 + * generator, with occasional help from cpu_clock().
   */
  static unsigned long
  rcu_random(struct rcu_random_state *rrsp)
  {
 -	long refresh;
 -
  	if (--rrsp->rrs_count < 0) {
 -		get_random_bytes(&refresh, sizeof(refresh));
 -		rrsp->rrs_state += refresh;
 +		rrsp->rrs_state += (unsigned long)cpu_clock(smp_processor_id());
  		rrsp->rrs_count = RCU_RANDOM_REFRESH;
  	}
  	rrsp->rrs_state = rrsp->rrs_state * RCU_RANDOM_MULT + RCU_RANDOM_ADD;
 diff -urpNa -X dontdiff linux-2.6.23-rc2/kernel/sched.c linux-2.6.23-rc2-rcutorturesched/kernel/sched.c
 --- linux-2.6.23-rc2/kernel/sched.c	2007-08-03 19:49:55.0 -0700
 +++ linux-2.6.23-rc2-rcutorturesched/kernel/sched.c	2007-08-10 17:22:57.0 -0700
 @@ -394,6 +394,8 @@ unsigned long long cpu_clock(int cpu)
  	return now;
  }
  
 +EXPORT_SYMBOL_GPL(cpu_clock);
 +
  #ifdef CONFIG_FAIR_GROUP_SCHED
  /* Change a task's ->cfs_rq if it moves across CPUs */
  static inline void set_task_cfs_rq(struct task_struct *p)


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-09 Thread Josh Triplett
Andrew Morton wrote:
> On Fri, 10 Aug 2007 01:23:07 +0200
> Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:
> 
>> Hello,
>>
>>  This probably doesn't have great impact ;) but ...
>> To reproduce: run torture tests for RCU and then sysrq+q.
>>
>> SysRq : Show Pending Timers
>> Timer List Version: v0.3
>> HRTIMER_MAX_CLOCK_BASES: 2
>> now at 1764338760370 nsecs
>>
>> cpu: 0
>>  clock 0:
>>   .index:  0
>>   .resolution: 1 nsecs
>>   .get_time:   ktime_get_real
>>   .offset: 1186699025823815427 nsecs
>> active timers:
>>  clock 1:
>>   .index:  1
>>   .resolution: 1 nsecs
>>   .get_time:   ktime_get
>>   .offset: 0 nsecs
>> active timers:
>>  #0: <3>BUG: sleeping function called from invalid context at 
>> kernel/mutex.c:86
>> in_atomic():1, irqs_disabled():1
>> INFO: lockdep is turned off.
>> irq event stamp: 0
>> hardirqs last  enabled at (0): [<>] 0x0
>> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
>> softirqs last  enabled at (0): [] copy_process+0x4c6/0x144c
>> softirqs last disabled at (0): [<>] 0x0
>>  [] show_trace_log_lvl+0x1a/0x30
>>  [] show_trace+0x12/0x14
>>  [] dump_stack+0x15/0x17
>>  [] __might_sleep+0xb7/0xc9
>>  [] mutex_lock+0x15/0x1f
>>  [] lookup_module_symbol_name+0x17/0xc0
>>  [] lookup_symbol_name+0x3f/0x43
>>  [] print_name_offset+0x1f/0x96
>>  [] timer_list_show+0x802/0xcbd
>>  [] sysrq_timer_list_show+0xc/0xe
>>  [] sysrq_handle_show_timers+0x8/0xa
>>  [] __handle_sysrq+0x7b/0x115
>>  [] handle_sysrq+0x20/0x24
>>  [] kbd_event+0x3a8/0x5c7
>>  [] input_pass_event+0x8f/0x91
>>  [] input_handle_event+0x98/0x38d
>>  [] input_event+0x54/0x67
>>  [] atkbd_interrupt+0x200/0x59e
>>  [] serio_interrupt+0x7c/0x80
>>  [] i8042_interrupt+0x17a/0x289
>>  [] handle_IRQ_event+0x28/0x59
>>  [] handle_level_irq+0xad/0x10b
>>  [] do_IRQ+0x93/0xd0
>>  [] common_interrupt+0x2e/0x34
>>  [] rcu_read_delay+0x8/0x36 [rcutorture]
>>  [] rcu_torture_reader+0x6e/0x169 [rcutorture]
>>  [] kthread+0x36/0x58
>>  [] kernel_thread_helper+0x7/0x1c
>>  ===
> 
> We seem to have made a mess in there.  timer_list_show() ends up calling
> lookup_module_symbol_name(), which takes a mutex.  However print_symbol()
> (which is called at oops time, interrupt time, etc) calls
> module_address_lookup(), which is basically the same, only it doesn't take
> the mutex.
> 
> I guess a quicky fix would be to switch
> kernel/time/timer_list.c:print_name_offset() from
> lookup_module_symbol_name() to module_address_lookup().  But we'd still
> have a mess in there.
> 
> (adds ccs, runs away)

I don't think rcutorture matters for this bug.  As far as I can tell, Andrew's
description of this problem will always apply to this particular sysrq: the
keyboard interrupt leads to handle_sysrq, which leads to timer_list_show,
which leads to lookup_module_symbol_name, which acquires a mutex.

- Josh Triplett
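
The call chain described above can be modelled in a few lines of userspace
C. This is a toy sketch under stated assumptions, not kernel code: the
toy_* functions, atomic_depth and might_sleep_reports are all hypothetical
names, and the check only imitates what a might_sleep()-style debug test
reports when a possibly-blocking lock is taken from interrupt context.

```c
#include <stdio.h>

/* Toy model: nonzero while running in (simulated) interrupt context. */
static int atomic_depth;
/* Number of "sleeping function called from invalid context" reports. */
static int might_sleep_reports;

/* Complain if a function that may block is reached in atomic context. */
static void might_sleep_check(const char *where)
{
	if (atomic_depth > 0) {
		might_sleep_reports++;
		fprintf(stderr,
			"BUG: sleeping function called from invalid context at %s\n",
			where);
	}
}

/* Stand-in for mutex_lock(): may block, so it checks the context first. */
static void toy_mutex_lock(void)
{
	might_sleep_check("toy_mutex_lock");
}

/* Stand-in for a symbol lookup that takes a mutex. */
static void toy_lookup_symbol_name(void)
{
	toy_mutex_lock();
}

/* Stand-in for the sysrq path, which is entered from a keyboard interrupt. */
static void toy_sysrq_show_timers(void)
{
	atomic_depth++;		/* entering interrupt context */
	toy_lookup_symbol_name();
	atomic_depth--;		/* leaving interrupt context */
}
```

Calling toy_lookup_symbol_name() directly produces no report, while going
through toy_sysrq_show_timers() produces one, which matches the
observation that the warning depends on the sysrq path rather than on
rcutorture.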


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 07:00:40PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > >  2.6.23 queue
> > >...
> > 
> > All drivers were converted to no longer use xtime directly since it 
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
> 
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
>...

I should have been a bit more concrete:

I have a patch pending to unexport xtime for catching unsafe usages, and 
you added an (ab)user.

>   Thanx, Paul

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Andrew Morton
On Thu, 9 Aug 2007 19:00:40 -0700 "Paul E. McKenney" <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> > On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> > >...
> > > Changes since 2.6.23-rc2-mm1:
> > >...
> > > +allow-rcutorture-to-handle-synchronize_sched.patch
> > >...
> > >  2.6.23 queue
> > >...
> > 
> > All drivers were converted to no longer use xtime directly since it 
> > might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> > as RNG...
> 
> This code doesn't care if the time is outdated, as it is simply
> periodically perturbing an RNG, but OK.
> 
> So, what interface are we supposed to be using instead?  I cannot use
> get_random_bytes() due to locking issues.  This is not a cryptographically
> secure usage, so the perturbation does not need to be extremely high
> quality.
> 
> On x86, I would just grab the low-order bits of the TSC, but all of the
> world is not an x86.  ;-)
> 

One used to use sched_clock() for this, then get frowned at.  Now we
have cpu_clock()...


Re: 2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Paul E. McKenney
On Fri, Aug 10, 2007 at 03:31:46AM +0200, Adrian Bunk wrote:
> On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc2-mm1:
> >...
> > +allow-rcutorture-to-handle-synchronize_sched.patch
> >...
> >  2.6.23 queue
> >...
> 
> All drivers were converted to no longer use xtime directly since it 
> might be quite outdated, but this patch adds a usage of xtime.tv_nsec
> as RNG...

This code doesn't care if the time is outdated, as it is simply
periodically perturbing an RNG, but OK.

So, what interface are we supposed to be using instead?  I cannot use
get_random_bytes() due to locking issues.  This is not a cryptographically
secure usage, so the perturbation does not need to be extremely high
quality.

On x86, I would just grab the low-order bits of the TSC, but all of the
world is not an x86.  ;-)

Thanx, Paul
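
For readers unfamiliar with the scheme, a periodically perturbed linear
congruential generator can be sketched in userspace C as below. This is a
simplified illustration, not the rcutorture code: the constants are
assumed values chosen to resemble rcutorture's, and the rnd_* names,
perturb_fn hook and const_perturb() are hypothetical.

```c
/* Assumed, illustrative constants; not taken from this thread. */
#define RND_MULT     39916801UL
#define RND_ADD      479001599UL
#define RND_REFRESH  10000

struct rnd_state {
	unsigned long state;	/* current LCG state */
	long count;		/* calls left before the next perturbation */
};

/* Hypothetical noise hook, e.g. low bits of a clock. It must not block. */
typedef unsigned long (*perturb_fn)(void);

/* Fixed "noise" so the sketch is deterministic when experimenting. */
static unsigned long const_perturb(void)
{
	return 42UL;
}

static unsigned long rnd_next(struct rnd_state *s, perturb_fn perturb)
{
	/* Once the countdown expires, mix in some outside noise. */
	if (--s->count < 0) {
		s->state += perturb();
		s->count = RND_REFRESH;
	}
	/* Plain LCG step; unsigned overflow wraps, which is intended. */
	s->state = s->state * RND_MULT + RND_ADD;
	return s->state;
}
```

Because the noise is only added to the state now and then, its quality
matters little; it merely keeps separate generators from staying in
lock-step, which is why a cheap non-blocking source is sufficient here.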


2.6.23-rc2-mm1: rcutorture xtime usage

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc2-mm1:
>...
> +allow-rcutorture-to-handle-synchronize_sched.patch
>...
>  2.6.23 queue
>...

All drivers were converted to no longer use xtime directly since it 
might be quite outdated, but this patch adds a usage of xtime.tv_nsec
as RNG...

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!

2007-08-09 Thread Nick Piggin
On Thu, Aug 09, 2007 at 04:37:35PM +0100, Hugh Dickins wrote:
> On Thu, 9 Aug 2007, Mariusz Kozlowski wrote:
> > Hello,
> > 
> > Nothing unusual happening, allmodconfig compiling etc.
> > Not sure why it says kernel was tainted though ... hmmm.
> > 
> > [ cut here ]
> > kernel BUG at mm/swap_state.c:78!
> > invalid opcode:  [#1]
> > PREEMPT 
> > Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia 
> > 8250_pci 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too
> > CPU:0
> > EIP:0060:[]Tainted: PVLI
> > EFLAGS: 00010246   (2.6.23-rc2-mm1 #1)
> > EIP is at __add_to_swap_cache+0xc6/0xd7
> > eax: 4000   ebx: c11285c0   ecx: 00d0   edx: 0283
> > esi: c11285c0   edi: 0283   ebp: c1858f90   esp: c1858f84
> > ds: 007b   es: 007b   fs:   gs:   ss: 0068
> > Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000)
> > Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0  
> > c1858fcc 
> >c015307c 0001 0007 0002 0002 0283  
> > fffc 
> > c0152d5c c1858fe0 c0127f2e c0127ef8   
> >  
> > Call Trace:
> >  [] show_trace_log_lvl+0x1a/0x30
> >  [] show_stack_log_lvl+0xa9/0xd5
> >  [] show_registers+0x219/0x38d
> >  [] die+0x104/0x23e
> >  [] do_trap+0x83/0xad
> >  [] do_invalid_op+0x88/0x92
> >  [] error_code+0x6a/0x70
> >  [] add_to_swap_cache+0x22/0x58
> >  [] kprefetchd+0x320/0x364
> >  [] kthread+0x36/0x58
> >  [] kernel_thread_helper+0x7/0x14
> >  ===
> > INFO: lockdep is turned off.
> > Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 
> > 03 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 
> > 0f 0b eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 
> > EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84
> 
> Don't worry about reproducing untainted, I got the same earlier
> and was just preparing and testing the hotfix: here it is...
> 
> 
> Nick's mm-clarify-__add_to_swap_cache-locking.patch is fine for mainline,
> but soon generates a "kernel BUG at mm/swap_state.c:78!" when it meets
> mm-implement-swap-prefetching.patch in 2.6.23-rc2-mm1.  We could add a
> fix to the latter, but I think it's better to adjust Nick's, so that
> it's right for whichever tree it's in: move the responsibility to
> SetPageLocked from read_swap_cache_async to add_to_swap_cache.

Hmm, yeah I like this better, it is more like add_to_page_cache now.
Thanks.



Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-09 Thread Andrew Morton
On Fri, 10 Aug 2007 01:23:07 +0200
Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:

> Hello,
> 
>   This probably doesn't have great impact ;) but ...
> To reproduce: run torture tests for RCU and then sysrq+q.
> 
> SysRq : Show Pending Timers
> Timer List Version: v0.3
> HRTIMER_MAX_CLOCK_BASES: 2
> now at 1764338760370 nsecs
> 
> cpu: 0
>  clock 0:
>   .index:  0
>   .resolution: 1 nsecs
>   .get_time:   ktime_get_real
>   .offset: 1186699025823815427 nsecs
> active timers:
>  clock 1:
>   .index:  1
>   .resolution: 1 nsecs
>   .get_time:   ktime_get
>   .offset: 0 nsecs
> active timers:
>  #0: <3>BUG: sleeping function called from invalid context at 
> kernel/mutex.c:86
> in_atomic():1, irqs_disabled():1
> INFO: lockdep is turned off.
> irq event stamp: 0
> hardirqs last  enabled at (0): [<>] 0x0
> hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
> softirqs last  enabled at (0): [] copy_process+0x4c6/0x144c
> softirqs last disabled at (0): [<>] 0x0
>  [] show_trace_log_lvl+0x1a/0x30
>  [] show_trace+0x12/0x14
>  [] dump_stack+0x15/0x17
>  [] __might_sleep+0xb7/0xc9
>  [] mutex_lock+0x15/0x1f
>  [] lookup_module_symbol_name+0x17/0xc0
>  [] lookup_symbol_name+0x3f/0x43
>  [] print_name_offset+0x1f/0x96
>  [] timer_list_show+0x802/0xcbd
>  [] sysrq_timer_list_show+0xc/0xe
>  [] sysrq_handle_show_timers+0x8/0xa
>  [] __handle_sysrq+0x7b/0x115
>  [] handle_sysrq+0x20/0x24
>  [] kbd_event+0x3a8/0x5c7
>  [] input_pass_event+0x8f/0x91
>  [] input_handle_event+0x98/0x38d
>  [] input_event+0x54/0x67
>  [] atkbd_interrupt+0x200/0x59e
>  [] serio_interrupt+0x7c/0x80
>  [] i8042_interrupt+0x17a/0x289
>  [] handle_IRQ_event+0x28/0x59
>  [] handle_level_irq+0xad/0x10b
>  [] do_IRQ+0x93/0xd0
>  [] common_interrupt+0x2e/0x34
>  [] rcu_read_delay+0x8/0x36 [rcutorture]
>  [] rcu_torture_reader+0x6e/0x169 [rcutorture]
>  [] kthread+0x36/0x58
>  [] kernel_thread_helper+0x7/0x1c
>  ===

We seem to have made a mess in there.  timer_list_show() ends up calling
lookup_module_symbol_name(), which takes a mutex.  However print_symbol()
(which is called at oops time, interrupt time, etc) calls
module_address_lookup(), which is basically the same, only it doesn't take
the mutex.

I guess a quicky fix would be to switch
kernel/time/timer_list.c:print_name_offset() from
lookup_module_symbol_name() to module_address_lookup().  But we'd still
have a mess in there.
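
To make the difference between the two lookup paths concrete, here is a small user-space model. It is only a sketch: model_might_sleep(), lookup_with_mutex(), lookup_lockless() and the two-entry symbol table are hypothetical stand-ins, not the kernel functions. It assumes only what is described above, namely that one lookup takes a mutex (and so may sleep) and the other does the same search without it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model; none of these are the real kernel symbols. */
static int atomic_depth;      /* > 0 models in_atomic() / irqs disabled */
static int sleep_violations;  /* counts "sleeping function called from
                                 invalid context" events */

/* Stand-in for the check a sleeping lock performs before blocking. */
static void model_might_sleep(void)
{
	if (atomic_depth > 0)
		sleep_violations++;
}

struct sym {
	unsigned long start, size;
	const char *name;
};

/* Hypothetical symbol table, addresses made up for the example. */
static const struct sym symtab[] = {
	{ 0x1000, 0x100, "tick_sched_timer" },
	{ 0x2000, 0x080, "it_real_fn" },
};

static const char *find_sym(unsigned long addr)
{
	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (addr >= symtab[i].start &&
		    addr - symtab[i].start < symtab[i].size)
			return symtab[i].name;
	return NULL;
}

/* Models the mutex-taking lookup: it may sleep before searching. */
static const char *lookup_with_mutex(unsigned long addr)
{
	model_might_sleep();
	return find_sym(addr);
}

/* Models the lockless lookup: same search, no sleeping lock. */
static const char *lookup_lockless(unsigned long addr)
{
	return find_sym(addr);
}

/* Models the sysrq path: run a lookup from "interrupt context". */
static const char *lookup_from_irq(const char *(*lookup)(unsigned long),
				   unsigned long addr)
{
	const char *name;

	atomic_depth++;
	name = lookup(addr);
	atomic_depth--;
	return name;
}
```

In this model both lookups return the same name, but only the mutex-taking one bumps sleep_violations when driven through lookup_from_irq(), which is the shape of the BUG in the trace.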

(adds ccs, runs away)



Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-09 Thread Mariusz Kozlowski
Hello,

=
[ INFO: inconsistent lock state ]
2.6.23-rc2-mm1 #7
-
inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (>lock){+...}, at: [] rtl8139_interrupt+0x27/0x46b [8139too]
{in-hardirq-W} state was registered at:
  [] __lock_acquire+0x949/0x11ac
  [] lock_acquire+0x99/0xb2
  [] _spin_lock+0x35/0x42
  [] rtl8139_interrupt+0x27/0x46b [8139too]
  [] handle_IRQ_event+0x28/0x59
  [] handle_level_irq+0xad/0x10b
  [] do_IRQ+0x93/0xd0
  [] common_interrupt+0x2e/0x34
  [] cpuidle_idle_call+0x74/0x99
  [] cpu_idle+0x87/0x89
  [] rest_init+0x60/0x62
  [] start_kernel+0x23a/0x2c5
  [<>] 0x0
  [] 0x
irq event stamp: 1777
hardirqs last  enabled at (1777): [] kfree+0xee/0x105
hardirqs last disabled at (1776): [] kfree+0x87/0x105
softirqs last  enabled at (1756): [] dev_deactivate+0x86/0xa5
softirqs last disabled at (1754): [] _spin_lock_bh+0xe/0x47

other info that might help us debug this:
1 lock held by ifconfig/5492:
 #0:  (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x30
 [] show_trace+0x12/0x14
 [] dump_stack+0x15/0x17
 [] print_usage_bug+0x145/0x14f
 [] mark_lock+0x61f/0x70c
 [] __lock_acquire+0x73e/0x11ac
 [] lock_acquire+0x99/0xb2
 [] _spin_lock+0x35/0x42
 [] rtl8139_interrupt+0x27/0x46b [8139too]
 [] free_irq+0x11b/0x146
 [] rtl8139_close+0x8a/0x14a [8139too]
 [] dev_close+0x57/0x74
 [] dev_change_flags+0x8e/0x190
 [] devinet_ioctl+0x4af/0x652
 [] inet_ioctl+0x56/0x71
 [] sock_ioctl+0xa5/0x1d4
 [] do_ioctl+0x22/0x71
 [] vfs_ioctl+0x55/0x29e
 [] sys_ioctl+0x33/0x69
 [] sysenter_past_esp+0x5f/0x99
 ===

Regards,

Mariusz


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Gabriel C
Alan Cox wrote:
>>> [   28.828484] :00:1f.1: cannot adjust BAR0 (not I/O)
>>> [   28.828487] :00:1f.1: cannot adjust BAR1 (not I/O)
>>> [   28.828489] :00:1f.1: cannot adjust BAR2 (not I/O)
>>> [   28.828491] :00:1f.1: cannot adjust BAR3 (not I/O)
> 
> This means it didn't do anything (wrongly, because it's checking I/O bits
> on a BAR which are ignored according to the spec).
> 
>>> Region 0: [virtual] Memory at 01f0 (32-bit, non-prefetchable) [disabled] [size=8]
>>> Region 1: [virtual] Memory at 03f0 (type 3, non-prefetchable) [disabled] [size=1]
>>> Region 2: [virtual] Memory at 0170 (32-bit, non-prefetchable) [disabled] [size=8]
>>> Region 3: [virtual] Memory at 0370 (type 3, non-prefetchable) [disabled] [size=1]
> 
> The controller is disabled, and when disabled it seems to think it's
> memory. Valid but interesting.
> 
> 

The box is a Dell Precision WorkStation 530 MT.

Actually I have an ATA-7 disc on the primary EIDE connector (one port free)
and an oldish CDROM on the secondary EIDE connector (one port free).

http://194.231.229.228/lara/lara.dmesg ( from 2.6.23-rc2-mm1 with the 2 patches reverted )
http://194.231.229.228/lara/lara.lspci ( lspci - -nn )
http://194.231.229.228/lara/lara.html ( lshw html output )

If you want me to do/try something let me know.


Gabriel


Re: 2.6.23-rc2-mm1: sleeping function called from invalid context at kernel/mutex.c:86

2007-08-09 Thread Mariusz Kozlowski
Hello,

This probably doesn't have great impact ;) but ...
To reproduce: run torture tests for RCU and then sysrq+q.

SysRq : Show Pending Timers
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 1764338760370 nsecs

cpu: 0
 clock 0:
  .index:  0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset: 1186699025823815427 nsecs
active timers:
 clock 1:
  .index:  1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset: 0 nsecs
active timers:
 #0: <3>BUG: sleeping function called from invalid context at kernel/mutex.c:86
in_atomic():1, irqs_disabled():1
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last  enabled at (0): [<>] 0x0
hardirqs last disabled at (0): [] copy_process+0x4a8/0x144c
softirqs last  enabled at (0): [] copy_process+0x4c6/0x144c
softirqs last disabled at (0): [<>] 0x0
 [] show_trace_log_lvl+0x1a/0x30
 [] show_trace+0x12/0x14
 [] dump_stack+0x15/0x17
 [] __might_sleep+0xb7/0xc9
 [] mutex_lock+0x15/0x1f
 [] lookup_module_symbol_name+0x17/0xc0
 [] lookup_symbol_name+0x3f/0x43
 [] print_name_offset+0x1f/0x96
 [] timer_list_show+0x802/0xcbd
 [] sysrq_timer_list_show+0xc/0xe
 [] sysrq_handle_show_timers+0x8/0xa
 [] __handle_sysrq+0x7b/0x115
 [] handle_sysrq+0x20/0x24
 [] kbd_event+0x3a8/0x5c7
 [] input_pass_event+0x8f/0x91
 [] input_handle_event+0x98/0x38d
 [] input_event+0x54/0x67
 [] atkbd_interrupt+0x200/0x59e
 [] serio_interrupt+0x7c/0x80
 [] i8042_interrupt+0x17a/0x289
 [] handle_IRQ_event+0x28/0x59
 [] handle_level_irq+0xad/0x10b
 [] do_IRQ+0x93/0xd0
 [] common_interrupt+0x2e/0x34
 [] rcu_read_delay+0x8/0x36 [rcutorture]
 [] rcu_torture_reader+0x6e/0x169 [rcutorture]
 [] kthread+0x36/0x58
 [] kernel_thread_helper+0x7/0x1c
 ===
, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, swapper/0
 # expires at 176433900 nsecs [in 239630 nsecs]
 #1: , it_real_fn, S:01, do_setitimer, artsd/7461
 # expires at 1764742781512 nsecs [in 404021142 nsecs]
 #2: , hrtimer_wakeup, S:01, do_nanosleep, kwrapper/7452
 # expires at 1764922105491 nsecs [in 583345121 nsecs]
 #3: , it_real_fn, S:01, do_setitimer, syslogd/6719
 # expires at 1790027922194 nsecs [in 25689161824 nsecs]
  .expires_next   : 176433900 nsecs
  .hres_active: 1
  .nr_events  : 1422687
  .nohz_mode  : 2
  .idle_tick  : 46585900 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 165857
  .idle_calls : 1812679
  .idle_sleeps: 1761361
  .idle_entrytime : 466865075138 nsecs
  .idle_sleeptime : 357976883572 nsecs
  .last_jiffies   : 166865
  .next_jiffies   : 166866
  .idle_expires   : 46595100 nsecs
jiffies: 1464338


Tick Device: mode: 1
Clock Event Device: pit
 max_delta_ns:   27461866
 min_delta_ns:   12571
 mult:   5124677
 shift:  32
 mode:   3
 next_event: 176433900 nsecs
 set_next_event: pit_next_event
 set_mode:   init_pit_timer
 event_handler:  hrtimer_interrupt

Regards,

Mariusz
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23-rc2-mm1
# Fri Aug 10 00:12:50 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_NONIRQ_WAKEUP=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SWAP_PREFETCH=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CONTAINERS is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_PROC_KPAGEMAP=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVE

Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected

2007-08-09 Thread Alexey Starikovskiy
Andrew Morton wrote:
> On Thu, 9 Aug 2007 16:24:48 -0400
> "Miles Lane" <[EMAIL PROTECTED]> wrote:
> 
>> [ INFO: possible circular locking dependency detected ]
>> 2.6.23-rc2-mm1 #7
>> ---
>> kacpid/53 is trying to acquire lock:
>>  (>lock){--..}, at: [] mutex_lock+0x1c/0x1f
>>
>> but task is already holding lock:
>>  (>work){--..}, at: [] run_workqueue+0xa0/0x182
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #2 (>work){--..}:
>>[] __lock_acquire+0x9a6/0xb6f
>>[] lock_acquire+0x61/0x7d
>>[] run_workqueue+0xb5/0x182
>>[] worker_thread+0xb7/0xc2
>>[] kthread+0x39/0x61
>>[] kernel_thread_helper+0x7/0x10
>>[] 0x
>>
>> -> #1 (kacpid){--..}:
>>[] __lock_acquire+0x9a6/0xb6f
>>[] lock_acquire+0x61/0x7d
>>[] flush_workqueue+0x2d/0x4f
>>[] acpi_os_wait_events_complete+0xd/0xf
>>[] acpi_remove_gpe_handler+0x7b/0xdd
>>[] ec_remove_handlers+0x26/0x29
>>[] acpi_ec_add+0x8f/0x13e
>>[] acpi_device_probe+0x3e/0xdb
>>[] driver_probe_device+0xd7/0x14d
>>[] __driver_attach+0x6a/0xa1
>>[] bus_for_each_dev+0x36/0x5b
>>[] driver_attach+0x14/0x16
>>[] bus_add_driver+0x70/0x16c
>>[] driver_register+0x60/0x65
>>[] acpi_bus_register_driver+0x3a/0x3c
>>[] acpi_ec_init+0x36/0x55
>>[] kernel_init+0xc5/0x20f
>>[] kernel_thread_helper+0x7/0x10
>>[] 0x
>>
>> -> #0 (>lock){--..}:
>>[] __lock_acquire+0x8c6/0xb6f
>>[] lock_acquire+0x61/0x7d
>>[] __mutex_lock_slowpath+0xbc/0x241
>>[] mutex_lock+0x1c/0x1f
>>[] acpi_ec_transaction+0x65/0x1c1
>>[] acpi_ec_gpe_query+0x2b/0xab
>>[] acpi_os_execute_deferred+0x20/0x31
>>[] run_workqueue+0xba/0x182
>>[] worker_thread+0xb7/0xc2
>>[] kthread+0x39/0x61
>>[] kernel_thread_helper+0x7/0x10
>>[] 0x
>>
>> other info that might help us debug this:
>>
>> 2 locks held by kacpid/53:
>>  #0:  (kacpid){--..}, at: [] run_workqueue+0x85/0x182
>>  #1:  (>work){--..}, at: [] run_workqueue+0xa0/0x182
>>
>> stack backtrace:
>>  [] show_trace_log_lvl+0x12/0x25
>>  [] show_trace+0xd/0x10
>>  [] dump_stack+0x15/0x17
>>  [] print_circular_bug_tail+0x5a/0x65
>>  [] __lock_acquire+0x8c6/0xb6f
>>  [] lock_acquire+0x61/0x7d
>>  [] __mutex_lock_slowpath+0xbc/0x241
>>  [] mutex_lock+0x1c/0x1f
>>  [] acpi_ec_transaction+0x65/0x1c1
>>  [] acpi_ec_gpe_query+0x2b/0xab
>>  [] acpi_os_execute_deferred+0x20/0x31
>>  [] run_workqueue+0xba/0x182
>>  [] worker_thread+0xb7/0xc2
>>  [] kthread+0x39/0x61
>>  [] kernel_thread_helper+0x7/0x10
>>  ===
> 
> Presumably the new debugging patches in -mm
> (workqueue-debug-flushing-deadlocks-with-lockdep.patch and
> workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have
> found a potential deadlock in ACPI.  I don't have time to pick through the
> code to confirm that, but boy I'm good at adding cc's ;)

Yep, it may indeed lock up... Here is a patch to avoid it.

Thanks,
Alex.


ACPI EC: remove potential deadlock from EC.

From: Alexey Starikovskiy <[EMAIL PROTECTED]>

Signed-off-by: Alexey Starikovskiy <[EMAIL PROTECTED]>
---

 drivers/acpi/ec.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index ceb7c3f..4b299fd 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -723,9 +723,7 @@ static int acpi_ec_add(struct acpi_device *device)
 	/* Check if we found the boot EC */
 	if (boot_ec) {
 		if (boot_ec->gpe == ec->gpe) {
-			mutex_lock(&boot_ec->lock);
 			ec_remove_handlers(boot_ec);
-			mutex_unlock(&boot_ec->lock);
 			mutex_destroy(&boot_ec->lock);
 			kfree(boot_ec);
 			first_ec = boot_ec = NULL;


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Alan Cox
> > [   28.828484] :00:1f.1: cannot adjust BAR0 (not I/O)
> > [   28.828487] :00:1f.1: cannot adjust BAR1 (not I/O)
> > [   28.828489] :00:1f.1: cannot adjust BAR2 (not I/O)
> > [   28.828491] :00:1f.1: cannot adjust BAR3 (not I/O)

This means it didn't do anything (wrongly, because it's checking I/O bits
on a BAR which are ignored according to the spec).

> > Region 0: [virtual] Memory at 01f0 (32-bit, non-prefetchable) [disabled] [size=8]
> > Region 1: [virtual] Memory at 03f0 (type 3, non-prefetchable) [disabled] [size=1]
> > Region 2: [virtual] Memory at 0170 (32-bit, non-prefetchable) [disabled] [size=8]
> > Region 3: [virtual] Memory at 0370 (type 3, non-prefetchable) [disabled] [size=1]

The controller is disabled, and when disabled it seems to think it's
memory. Valid but interesting.
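
For readers following along, the decoding involved can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper names are made up, the sample bytes are copied from the lspci -xxx dump posted in this thread (offsets 0x10 and 0x20, prog-if 0x80), and the prog-if bit layout (bit 0 = primary channel native, bit 2 = secondary channel native) is taken from the PCI IDE controller specification rather than from anything in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Sample config-space bytes from the dump in this thread. */
static const uint8_t sample_bar0[4] = { 0x00, 0x00, 0x00, 0x00 }; /* 0x10 */
static const uint8_t sample_bar4[4] = { 0xa1, 0xff, 0x00, 0x00 }; /* 0x20 */
static const uint8_t sample_prog_if = 0x80;

/* Assemble a little-endian dword from four config-space bytes. */
static uint32_t le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Bit 0 of a BAR: 1 = I/O space, 0 = memory space. */
static int bar_is_io(uint32_t bar)
{
	return (bar & 0x1u) != 0;
}

/* For an I/O BAR the port base lives in bits 31:2. */
static uint32_t bar_io_base(uint32_t bar)
{
	return bar & ~0x3u;
}

/*
 * channel 0 = primary, 1 = secondary.  A clear mode bit means the
 * channel is in legacy (compatibility) mode, in which case its BARs
 * are not what the hardware decodes.
 */
static int ide_channel_is_legacy(uint8_t prog_if, int channel)
{
	return (prog_if & (1u << (2 * channel))) == 0;
}
```

With the bytes above, BAR0 reads as all zeroes, so a plain "is bit 0 set?" test reports it as not I/O, while the prog-if says both channels are in legacy mode and BAR4 decodes to I/O ports at 0xffa0, matching the lspci output.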



Re: 2.6.23-rc2-mm1

2007-08-09 Thread Andrew Morton
On Thu, 09 Aug 2007 23:36:04 +0200
Gabriel C <[EMAIL PROTECTED]> wrote:

> 
> ...
> 
> > +pci-align-bar-settings-for-legacy-mode-ide.patch
> > +pci-align-bar-settings-for-legacy-mode-ide-fix.patch
> > 
> >  2.6.23 material, but these belong to subsystem trees
> >
> 
> ...
> 
>  
> These broke the IDE controller, using libata on my Dell Workstation.
> 
> Reverting both fixes the problem.
> 
> 
> ..
> 
> [   28.828484] :00:1f.1: cannot adjust BAR0 (not I/O)
> [   28.828487] :00:1f.1: cannot adjust BAR1 (not I/O)
> [   28.828489] :00:1f.1: cannot adjust BAR2 (not I/O)
> [   28.828491] :00:1f.1: cannot adjust BAR3 (not I/O)
> ...
> 
> [   44.564308] ata_piix :00:1f.1: version 2.11
> [   44.564365] ata_piix :00:1f.1: no available native port
> 
> ...
> 
> And my CDROM and second IDE disc are gone.
> 
> $ lspci -vvvxxx
> 
> ...
> 
> 00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 04) (prog-if 80 [Master])
> Subsystem: Dell Unknown device 00d8
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
> SERR-  Latency: 0
> Region 0: [virtual] Memory at 01f0 (32-bit, non-prefetchable) [disabled] [size=8]
> Region 1: [virtual] Memory at 03f0 (type 3, non-prefetchable) [disabled] [size=1]
> Region 2: [virtual] Memory at 0170 (32-bit, non-prefetchable) [disabled] [size=8]
> Region 3: [virtual] Memory at 0370 (type 3, non-prefetchable) [disabled] [size=1]
> Region 4: I/O ports at ffa0 [size=16]
> 00: 86 80 4b 24 05 00 80 02 04 80 01 01 00 00 00 00
> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 20: a1 ff 00 00 00 00 00 00 00 00 00 00 28 10 d8 00
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 40: 07 e3 03 e3 00 00 00 00 05 00 01 02 00 00 00 00
> 50: 00 00 00 00 50 10 00 00 00 00 00 00 00 00 00 00
> 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 47 0f 00 00 00 00 00 00
>  
> ...
> 
> config is attached.
> 

Great, thanks for working that out.  I'll drop them, thereby breaking other
people's stuff.  This is rather a disaster.




Re: 2.6.23-rc2-mm1

2007-08-09 Thread Gabriel C

...

> +pci-align-bar-settings-for-legacy-mode-ide.patch
> +pci-align-bar-settings-for-legacy-mode-ide-fix.patch
> 
>  2.6.23 material, but these belong to subsystem trees
>

...

 
These broke the IDE controller, using libata on my Dell Workstation.

Reverting both fixes the problem.


..

[   28.828484] :00:1f.1: cannot adjust BAR0 (not I/O)
[   28.828487] :00:1f.1: cannot adjust BAR1 (not I/O)
[   28.828489] :00:1f.1: cannot adjust BAR2 (not I/O)
[   28.828491] :00:1f.1: cannot adjust BAR3 (not I/O)
...

[   44.564308] ata_piix :00:1f.1: version 2.11
[   44.564365] ata_piix :00:1f.1: no available native port

...

And my CDROM and second IDE disc are gone.

$ lspci -vvvxxx

...

00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 04) (prog-if 80 [Master])
Subsystem: Dell Unknown device 00d8
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR-
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23-rc2-mm1
# Thu Aug  9 15:19:34 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_NONIRQ_WAKEUP=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SWAP_PREFETCH=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CONTAINERS is not set
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_PROC_KPAGEMAP=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_LSF=y
CONFIG_BLK_DEV_BSG=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MCORE2 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_

Re: 2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected

2007-08-09 Thread Andrew Morton
On Thu, 9 Aug 2007 16:24:48 -0400
"Miles Lane" <[EMAIL PROTECTED]> wrote:

> [ INFO: possible circular locking dependency detected ]
> 2.6.23-rc2-mm1 #7
> ---
> kacpid/53 is trying to acquire lock:
>  (>lock){--..}, at: [] mutex_lock+0x1c/0x1f
> 
> but task is already holding lock:
>  (>work){--..}, at: [] run_workqueue+0xa0/0x182
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #2 (>work){--..}:
>[] __lock_acquire+0x9a6/0xb6f
>[] lock_acquire+0x61/0x7d
>[] run_workqueue+0xb5/0x182
>[] worker_thread+0xb7/0xc2
>[] kthread+0x39/0x61
>[] kernel_thread_helper+0x7/0x10
>[] 0x
> 
> -> #1 (kacpid){--..}:
>[] __lock_acquire+0x9a6/0xb6f
>[] lock_acquire+0x61/0x7d
>[] flush_workqueue+0x2d/0x4f
>[] acpi_os_wait_events_complete+0xd/0xf
>[] acpi_remove_gpe_handler+0x7b/0xdd
>[] ec_remove_handlers+0x26/0x29
>[] acpi_ec_add+0x8f/0x13e
>[] acpi_device_probe+0x3e/0xdb
>[] driver_probe_device+0xd7/0x14d
>[] __driver_attach+0x6a/0xa1
>[] bus_for_each_dev+0x36/0x5b
>[] driver_attach+0x14/0x16
>[] bus_add_driver+0x70/0x16c
>[] driver_register+0x60/0x65
>[] acpi_bus_register_driver+0x3a/0x3c
>[] acpi_ec_init+0x36/0x55
>[] kernel_init+0xc5/0x20f
>[] kernel_thread_helper+0x7/0x10
>[] 0x
> 
> -> #0 (>lock){--..}:
>[] __lock_acquire+0x8c6/0xb6f
>[] lock_acquire+0x61/0x7d
>[] __mutex_lock_slowpath+0xbc/0x241
>[] mutex_lock+0x1c/0x1f
>[] acpi_ec_transaction+0x65/0x1c1
>[] acpi_ec_gpe_query+0x2b/0xab
>[] acpi_os_execute_deferred+0x20/0x31
>[] run_workqueue+0xba/0x182
>[] worker_thread+0xb7/0xc2
>[] kthread+0x39/0x61
>[] kernel_thread_helper+0x7/0x10
>[] 0x
> 
> other info that might help us debug this:
> 
> 2 locks held by kacpid/53:
>  #0:  (kacpid){--..}, at: [] run_workqueue+0x85/0x182
>  #1:  (>work){--..}, at: [] run_workqueue+0xa0/0x182
> 
> stack backtrace:
>  [] show_trace_log_lvl+0x12/0x25
>  [] show_trace+0xd/0x10
>  [] dump_stack+0x15/0x17
>  [] print_circular_bug_tail+0x5a/0x65
>  [] __lock_acquire+0x8c6/0xb6f
>  [] lock_acquire+0x61/0x7d
>  [] __mutex_lock_slowpath+0xbc/0x241
>  [] mutex_lock+0x1c/0x1f
>  [] acpi_ec_transaction+0x65/0x1c1
>  [] acpi_ec_gpe_query+0x2b/0xab
>  [] acpi_os_execute_deferred+0x20/0x31
>  [] run_workqueue+0xba/0x182
>  [] worker_thread+0xb7/0xc2
>  [] kthread+0x39/0x61
>  [] kernel_thread_helper+0x7/0x10
>  ===

Presumably the new debugging patches in -mm
(workqueue-debug-flushing-deadlocks-with-lockdep.patch and
workqueue-debug-work-related-deadlocks-with-lockdep.patch) think they have
found a potential deadlock in ACPI.  I don't have time to pick through the
code to confirm that, but boy I'm good at adding cc's ;)
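
As a rough illustration of what the report is saying, the dependency chain can be modelled as a tiny directed graph and checked for a cycle. This is a toy sketch, not how lockdep is implemented: the node names and helpers below are made up, and the three edges are read off the "existing dependency chain" in the report (the EC mutex held across flush_workqueue(), the workqueue "lock" held while running a work item, and the work item taking the EC mutex).

```c
#include <assert.h>

/* Hypothetical node names for the three lock classes in the report. */
enum { EC_LOCK, KACPID_WQ, WORK_ITEM, NR_LOCKS };

/* dep[a][b] != 0 means "b was acquired while a was held". */
static int dep[NR_LOCKS][NR_LOCKS];

static void add_dep(int held, int acquired)
{
	dep[held][acquired] = 1;
}

/* Depth-first search: is 'target' reachable from 'from'? */
static int reaches(int from, int target, int *seen)
{
	if (from == target)
		return 1;
	if (seen[from])
		return 0;
	seen[from] = 1;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && reaches(next, target, seen))
			return 1;
	return 0;
}

/*
 * Would taking 'acquired' while holding 'held' close a cycle?
 * It does if 'held' is already reachable from 'acquired'.
 */
static int would_deadlock(int held, int acquired)
{
	int seen[NR_LOCKS] = { 0 };

	return reaches(acquired, held, seen);
}
```

Recording EC_LOCK -> KACPID_WQ and KACPID_WQ -> WORK_ITEM and then asking would_deadlock(WORK_ITEM, EC_LOCK) reports a cycle; without the first edge (that is, if nothing flushes the workqueue while holding the EC mutex) the same query comes back clean.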




Re: 2.6.23-rc2-mm1: silly df numbers on 32bit extN

2007-08-09 Thread Andrew Morton
On Thu, 9 Aug 2007 21:17:20 +0100 (BST)
Hugh Dickins <[EMAIL PROTECTED]> wrote:

> On Thu, 9 Aug 2007, Andrew Morton wrote:
> > 
> > +lib-make-percpu_counter_add-take-s64.patch
> 
> lib-make-percpu_counter_add-take-s64.patch looks sensible, but it doesn't
> actually work on 32-bit architectures: several users of percpu_counter_add
> are passing -unsignedlong as the amount, which is not promoted to s64 in
> the desired way, so "df" on extN filesystems is showing silly numbers.
> 
> The hack below (say long instead of s64 or s32) may be good as hotfix for
> 2.6.23-rc2-mm1, but is probably the worst of solutions.  Perhaps take-s64
> should be reverted, perhaps there should be a percpu_counter_sub and the
> filesystems use that instead of saying -unsignedlong, perhaps they should
> use a cast or a long or an s64.  I don't know, but here's this for now...

Thanks.  I think I'll quietly tip the whole patch series overboard and
shoot for a quick rc2-mm2 rather than trying to patch it up in-situ.

I haven't had a chance to review it all in recent months.  Vague first
impressions are that it all goes a bit rampant and changes more than it
needs to, but I'll take a closer look at that if Peter can provide us with
the next version (please).



Re: 2.6.23-rc2-mm1

2007-08-09 Thread Jesper Juhl
On 09/08/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
...
> - If there's a patch in here which you think should be in 2.6.23 but I do
>   not have it designated in that way, please be sure to let me know.
...

Well, if you want to clean up your patch queue a bit then I have a few
suggestions for some patches of mine that are currently in -mm that
you could push to Linus.  They are not really important, so if you'd
rather keep them around in -mm until the next merge window then that's
fine by me, but they should be safe to push and would cut down on
the number of patches you track a bit :-)


This fix was already merged for the ATI side of things; this is an
identical fix for the AMD side of things - I see no reason why we
shouldn't get this fix into 2.6.23 as well :

 fix-use-after-free--double-free-bug-in-amd_create_gatt_pages--amd_free_gatt_pages.patch


This patch only changes the output of the script when run without
arguments, so as far as building the kernel goes it can't cause any
regressions and it's a clear improvement over what we currently have,
so might as well get it out of your queue and upstream :

 improve-scripts-gcc-versionsh-output-a-bit-when-called-without-args.patch


When people use scripts/ver_linux in bugreports we want correct info -
currently we often print wrong info for the binutils version. This
patch doesn't hurt existing working scenarios but does fix a few
broken ones, might as well get that merged, it's a clear fix :

 scripts-ver_linux-correct-printing-of-binutils-version.patch


These should all be trivially correct since they just remove duplicate
#include's of the same header into a .c file outside any #ifdef and
similar magic, so they should be quite safe to push. Also, I haven't
seen anything but ACK's in response to them (when I've seen a
response), and a few of my similar patches have already been merged :

 powerpc-clean-out-a-bunch-of-duplicate-includes.patch
 clean-up-duplicate-includes-in-drivers-input.patch
 clean-up-duplicate-includes-in-drivers-net.patch
 clean-up-duplicate-includes-in-drivers-atm.patch
 clean-up-duplicate-includes-in-net-atm.patch
 clean-up-duplicate-includes-in-net-ipv4.patch
 clean-up-duplicate-includes-in-net-ipv6.patch
 clean-up-duplicate-includes-in-net-sched.patch
 clean-up-duplicate-includes-in-net-sunrpc.patch
 clean-up-duplicate-includes-in-net-tipc.patch
 clean-up-duplicate-includes-in-net-xfrm.patch
 clean-up-duplicate-includes-in-include-linux-nfs_fsh.patch
 clean-up-duplicate-includes-in-fs-ntfs.patch
 clean-up-duplicate-includes-in-drivers-scsi.patch
 clean-up-duplicate-includes-in-drivers-block.patch
 clean-up-duplicate-includes-in-arch-i386-xen.patch
 clean-up-duplicate-includes-in-include-linux-memory_hotplugh.patch
 clean-up-duplicate-includes-in-mm.patch
 clean-up-duplicate-includes-in-drivers-char.patch
 clean-up-duplicate-includes-in-drivers-w1.patch
 clean-up-duplicate-includes-in-fs.patch
 clean-up-duplicate-includes-in-fs-ecryptfs.patch
 clean-up-duplicate-includes-in-kernel.patch
 clean-up-duplicate-includes-in-drivers-spi.patch
 clean-up-duplicate-includes-in-documentation.patch


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


2.6.23-rc2-mm1 -- INFO: possible circular locking dependency detected

2007-08-09 Thread Miles Lane
[ INFO: possible circular locking dependency detected ]
2.6.23-rc2-mm1 #7
---
kacpid/53 is trying to acquire lock:
 (>lock){--..}, at: [] mutex_lock+0x1c/0x1f

but task is already holding lock:
 (>work){--..}, at: [] run_workqueue+0xa0/0x182

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (>work){--..}:
   [] __lock_acquire+0x9a6/0xb6f
   [] lock_acquire+0x61/0x7d
   [] run_workqueue+0xb5/0x182
   [] worker_thread+0xb7/0xc2
   [] kthread+0x39/0x61
   [] kernel_thread_helper+0x7/0x10
   [] 0x

-> #1 (kacpid){--..}:
   [] __lock_acquire+0x9a6/0xb6f
   [] lock_acquire+0x61/0x7d
   [] flush_workqueue+0x2d/0x4f
   [] acpi_os_wait_events_complete+0xd/0xf
   [] acpi_remove_gpe_handler+0x7b/0xdd
   [] ec_remove_handlers+0x26/0x29
   [] acpi_ec_add+0x8f/0x13e
   [] acpi_device_probe+0x3e/0xdb
   [] driver_probe_device+0xd7/0x14d
   [] __driver_attach+0x6a/0xa1
   [] bus_for_each_dev+0x36/0x5b
   [] driver_attach+0x14/0x16
   [] bus_add_driver+0x70/0x16c
   [] driver_register+0x60/0x65
   [] acpi_bus_register_driver+0x3a/0x3c
   [] acpi_ec_init+0x36/0x55
   [] kernel_init+0xc5/0x20f
   [] kernel_thread_helper+0x7/0x10
   [] 0x

-> #0 (>lock){--..}:
   [] __lock_acquire+0x8c6/0xb6f
   [] lock_acquire+0x61/0x7d
   [] __mutex_lock_slowpath+0xbc/0x241
   [] mutex_lock+0x1c/0x1f
   [] acpi_ec_transaction+0x65/0x1c1
   [] acpi_ec_gpe_query+0x2b/0xab
   [] acpi_os_execute_deferred+0x20/0x31
   [] run_workqueue+0xba/0x182
   [] worker_thread+0xb7/0xc2
   [] kthread+0x39/0x61
   [] kernel_thread_helper+0x7/0x10
   [] 0x

other info that might help us debug this:

2 locks held by kacpid/53:
 #0:  (kacpid){--..}, at: [] run_workqueue+0x85/0x182
 #1:  (>work){--..}, at: [] run_workqueue+0xa0/0x182

stack backtrace:
 [] show_trace_log_lvl+0x12/0x25
 [] show_trace+0xd/0x10
 [] dump_stack+0x15/0x17
 [] print_circular_bug_tail+0x5a/0x65
 [] __lock_acquire+0x8c6/0xb6f
 [] lock_acquire+0x61/0x7d
 [] __mutex_lock_slowpath+0xbc/0x241
 [] mutex_lock+0x1c/0x1f
 [] acpi_ec_transaction+0x65/0x1c1
 [] acpi_ec_gpe_query+0x2b/0xab
 [] acpi_os_execute_deferred+0x20/0x31
 [] run_workqueue+0xba/0x182
 [] worker_thread+0xb7/0xc2
 [] kthread+0x39/0x61
 [] kernel_thread_helper+0x7/0x10
 ===


Re: 2.6.23-rc2-mm1: silly df numbers on 32bit extN

2007-08-09 Thread Hugh Dickins
On Thu, 9 Aug 2007, Andrew Morton wrote:
> 
> +lib-make-percpu_counter_add-take-s64.patch

lib-make-percpu_counter_add-take-s64.patch looks sensible, but it doesn't
actually work on 32-bit architectures: several users of percpu_counter_add
are passing -unsignedlong as the amount, which is not promoted to s64 in
the desired way, so "df" on extN filesystems is showing silly numbers.
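
A minimal sketch of the promotion problem described above, not part of the
patch. It assumes that uint32_t stands in for "unsigned long" on a 32-bit
architecture and int32_t for "long"; ulong32, add_s64, add_long32 and the
two sub_via_* helpers are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Assumption: models "unsigned long" as it is on a 32-bit architecture. */
typedef uint32_t ulong32;

/* Hypothetical stand-in for a counter add that takes an s64 amount. */
static int64_t add_s64(int64_t count, int64_t amount)
{
	return count + amount;
}

/* Hypothetical stand-in for the hotfix: the amount is a 32-bit "long". */
static int64_t add_long32(int64_t count, int32_t amount)
{
	return count + amount;
}

/* Subtract n the way the callers do: by passing -n as the amount. */
static int64_t sub_via_s64(int64_t count, ulong32 n)
{
	/*
	 * -n is computed in 32-bit unsigned arithmetic (it wraps to
	 * 2^32 - n) and is then zero-extended to 64 bits, so the
	 * counter receives a huge positive amount.
	 */
	return add_s64(count, -n);
}

static int64_t sub_via_long32(int64_t count, ulong32 n)
{
	/*
	 * Here the wrapped value is first converted to a signed 32-bit
	 * type (implementation-defined, but -n on common compilers) and
	 * only then sign-extended to 64 bits.
	 */
	return add_long32(count, -n);
}
```

With a count of 100 and n = 5, the s64 variant ends up far above 2^32
instead of at 95, which is the kind of silly number "df" then reports.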

The hack below (say long instead of s64 or s32) may be good as hotfix for
2.6.23-rc2-mm1, but is probably the worst of solutions.  Perhaps take-s64
should be reverted, perhaps there should be a percpu_counter_sub and the
filesystems use that instead of saying -unsignedlong, perhaps they should
use a cast or a long or an s64.  I don't know, but here's this for now...

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 include/linux/percpu_counter.h |   13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

--- 2.6.23-rc2-mm1/include/linux/percpu_counter.h   2007-08-09 13:15:35.0 +0100
+++ linux/include/linux/percpu_counter.h   2007-08-09 20:34:23.0 +0100
@@ -37,7 +37,7 @@ void percpu_counter_set(struct percpu_co
 void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch);
 s64 __percpu_counter_sum(struct percpu_counter *fbc);
 
-static inline void percpu_counter_add(struct percpu_counter *fbc, s64 amount)
+static inline void percpu_counter_add(struct percpu_counter *fbc, long amount)
 {
__percpu_counter_add(fbc, amount, FBC_BATCH);
 }
@@ -96,11 +96,16 @@ static inline void percpu_counter_set(st
fbc->count = amount;
 }
 
-#define __percpu_counter_add(fbc, amount, batch) \
-   percpu_counter_add(fbc, amount)
+static inline void
+__percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch)
+{
+   preempt_disable();
+   fbc->count += amount;
+   preempt_enable();
+}
 
 static inline void
-percpu_counter_add(struct percpu_counter *fbc, s64 amount)
+percpu_counter_add(struct percpu_counter *fbc, long amount)
 {
preempt_disable();
fbc->count += amount;


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Jeff Garzik

Andrew Morton wrote:
> umm, I was hoping to find out which of those two patches was the culprit.
> Almost surely it was 9ee6b32a47b9abc565466a9c3b127a5246b452e5?

Highly likely it is my patch in #ALL.

Jeff




Re: 2.6.23-rc2-mm1

2007-08-09 Thread Andrew Morton
On Thu, 9 Aug 2007 21:04:35 +0200
"Michal Piotrowski" <[EMAIL PROTECTED]> wrote:

> On 09/08/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Thu, 09 Aug 2007 15:23:41 +0200
> > Michal Piotrowski <[EMAIL PROTECTED]> wrote:
> >
> > > Andrew Morton pisze:
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> > >
> > > I am experiencing some problems with 8139too
> > >
> > > [   28.847004] 8139too :02:0d.0: region #0 not a PIO resource, aborting
> > > [   28.854722] Bad IO access at port 0 ()
> > > [   28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 bad_io_access()
> > > [   28.867415]  [] show_trace_log_lvl+0x1a/0x30
> > > [   28.873568]  [] show_trace+0x12/0x14
> > > [   28.879015]  [] dump_stack+0x16/0x18
> > > [   28.884451]  [] bad_io_access+0x58/0x5a
> > > [   28.890129]  [] pci_iounmap+0x21/0x2b
> > > [   28.895635]  [] __rtl8139_cleanup_dev+0x75/0xc6
> > > [   28.902037]  [] rtl8139_init_one+0x59b/0xa9f
> > > [   28.908170]  [] pci_device_probe+0x44/0x5f
> > > [   28.914116]  [] driver_probe_device+0xa7/0x19a
> > > [   28.920402]  [] __driver_attach+0xa6/0xa8
> > > [   28.926236]  [] bus_for_each_dev+0x43/0x61
> > > [   28.932139]  [] driver_attach+0x19/0x1b
> > > [   28.937776]  [] bus_add_driver+0x7e/0x1a5
> > > [   28.943567]  [] driver_register+0x45/0x75
> > > [   28.949358]  [] __pci_register_driver+0x56/0x84
> > > [   28.955678]  [] rtl8139_init_module+0x14/0x1c
> > > [   28.961832]  [] kernel_init+0x132/0x306
> > > [   28.967451]  [] kernel_thread_helper+0x7/0x14
> > > [   28.973588]  ===
> > > [   28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned 0.
> > > [   28.986114] initcall 0xc0819104 ran for 161 msecs: rtl8139_init_module+0x0/0x1c()
> > >
> > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg
> > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config
> > >
> >
> > Please try reverting 8139too-force-media-setting-fix.patch, then
> > applying this:
> >
> >
> 
> Problem fixed, thanks!
> 

umm, I was hoping to find out which of those two patches was the culprit.
Almost surely it was 9ee6b32a47b9abc565466a9c3b127a5246b452e5?


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Michal Piotrowski
On 09/08/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Thu, 09 Aug 2007 15:23:41 +0200
> Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>
> > Andrew Morton pisze:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> >
> > I am experiencing some problems with 8139too
> >
> > [   28.847004] 8139too :02:0d.0: region #0 not a PIO resource, aborting
> > [   28.854722] Bad IO access at port 0 ()
> > [   28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 bad_io_access()
> > [   28.867415]  [] show_trace_log_lvl+0x1a/0x30
> > [   28.873568]  [] show_trace+0x12/0x14
> > [   28.879015]  [] dump_stack+0x16/0x18
> > [   28.884451]  [] bad_io_access+0x58/0x5a
> > [   28.890129]  [] pci_iounmap+0x21/0x2b
> > [   28.895635]  [] __rtl8139_cleanup_dev+0x75/0xc6
> > [   28.902037]  [] rtl8139_init_one+0x59b/0xa9f
> > [   28.908170]  [] pci_device_probe+0x44/0x5f
> > [   28.914116]  [] driver_probe_device+0xa7/0x19a
> > [   28.920402]  [] __driver_attach+0xa6/0xa8
> > [   28.926236]  [] bus_for_each_dev+0x43/0x61
> > [   28.932139]  [] driver_attach+0x19/0x1b
> > [   28.937776]  [] bus_add_driver+0x7e/0x1a5
> > [   28.943567]  [] driver_register+0x45/0x75
> > [   28.949358]  [] __pci_register_driver+0x56/0x84
> > [   28.955678]  [] rtl8139_init_module+0x14/0x1c
> > [   28.961832]  [] kernel_init+0x132/0x306
> > [   28.967451]  [] kernel_thread_helper+0x7/0x14
> > [   28.973588]  ===
> > [   28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned 0.
> > [   28.986114] initcall 0xc0819104 ran for 161 msecs: rtl8139_init_module+0x0/0x1c()
> >
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg
> > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config
> >
>
> Please try reverting 8139too-force-media-setting-fix.patch, then
> applying this:
>
>

Problem fixed, thanks!

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/


Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ‘cpu_physical_id’

2007-08-09 Thread Andrew Morton
On Thu, 9 Aug 2007 10:18:15 -0400
"Miles Lane" <[EMAIL PROTECTED]> wrote:

>   CC  drivers/dma/ioat_dca.o
> drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> 'cpu_physical_id'

Looks like cpu_physical_id() doesn't get implemented if CONFIG_SMP=n.

Either ioat needs to stop using cpu_physical_id() if SMP=n, or the
supported architectures (i386, x86_64, ia64) should provide a non-SMP
version of cpu_physical_id().  Preferably the latter, I'd say.

Something like this, I suppose...


From: Andrew Morton <[EMAIL PROTECTED]>

i386, x86_64 and ia64 implement cpu_physical_id() if CONFIG_SMP=y.

Provide a uniprocessor stub so that callers will dtrt.

Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: "Luck, Tony" <[EMAIL PROTECTED]>
Cc: Shannon Nelson <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 include/linux/smp.h |5 +
 1 files changed, 5 insertions(+)

diff -puN include/linux/smp.h~implement-cpu_physical_id-on-smp=n include/linux/smp.h
--- a/include/linux/smp.h~implement-cpu_physical_id-on-smp=n
+++ a/include/linux/smp.h
@@ -108,6 +108,11 @@ static inline void smp_send_reschedule(i
0;  \
 })
 
+static inline unsigned cpu_physical_id(unsigned cpu)
+{
+   return 0;
+}
+
 #endif /* !SMP */
 
 /*
_



Re: 2.6.23-rc2-mm1

2007-08-09 Thread Andrew Morton
On Thu, 09 Aug 2007 15:23:41 +0200
Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> Andrew Morton pisze:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> 
> I am experiencing some problems with 8139too
> 
> [   28.847004] 8139too :02:0d.0: region #0 not a PIO resource, aborting
> [   28.854722] Bad IO access at port 0 ()
> [   28.859459] WARNING: at /home/devel/linux-mm/lib/iomap.c:44 bad_io_access()
> [   28.867415]  [] show_trace_log_lvl+0x1a/0x30
> [   28.873568]  [] show_trace+0x12/0x14
> [   28.879015]  [] dump_stack+0x16/0x18
> [   28.884451]  [] bad_io_access+0x58/0x5a
> [   28.890129]  [] pci_iounmap+0x21/0x2b
> [   28.895635]  [] __rtl8139_cleanup_dev+0x75/0xc6
> [   28.902037]  [] rtl8139_init_one+0x59b/0xa9f
> [   28.908170]  [] pci_device_probe+0x44/0x5f
> [   28.914116]  [] driver_probe_device+0xa7/0x19a
> [   28.920402]  [] __driver_attach+0xa6/0xa8
> [   28.926236]  [] bus_for_each_dev+0x43/0x61
> [   28.932139]  [] driver_attach+0x19/0x1b
> [   28.937776]  [] bus_add_driver+0x7e/0x1a5
> [   28.943567]  [] driver_register+0x45/0x75
> [   28.949358]  [] __pci_register_driver+0x56/0x84
> [   28.955678]  [] rtl8139_init_module+0x14/0x1c
> [   28.961832]  [] kernel_init+0x132/0x306
> [   28.967451]  [] kernel_thread_helper+0x7/0x14
> [   28.973588]  ===
> [   28.978151] initcall 0xc0819104: rtl8139_init_module+0x0/0x1c() returned 0.
> [   28.986114] initcall 0xc0819104 ran for 161 msecs: rtl8139_init_module+0x0/0x1c()
> 
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-dmesg
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.23-rc2-mm1/mm-config
> 

Please try reverting 8139too-force-media-setting-fix.patch, then
applying this:


From: Andrew Morton <[EMAIL PROTECTED]>

Revert git-netdev-all's 9ee6b32a47b9abc565466a9c3b127a5246b452e5

Cc: Michal Piotrowski <[EMAIL PROTECTED]>
Cc: Jeff Garzik <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 drivers/net/8139too.c |   50 +++-
 1 file changed, 29 insertions(+), 21 deletions(-)

diff -puN drivers/net/8139too.c~revert-8139too-clean-up-i-o-remapping drivers/net/8139too.c
--- a/drivers/net/8139too.c~revert-8139too-clean-up-i-o-remapping
+++ a/drivers/net/8139too.c
@@ -121,15 +121,8 @@
 
 
 /* enable PIO instead of MMIO, if CONFIG_8139TOO_PIO is selected */
-enum rtl_bar_map_info {
-   rtl_pio_bar = 0,/* PCI BAR #0: PIO */
-   rtl_mmio_bar= 1,/* PCI BAR #1: MMIO */
-};
-
 #ifdef CONFIG_8139TOO_PIO
-static int use_pio = 1;
-#else
-static int use_pio;
+#define USE_IO_OPS 1
 #endif
 
 /* define to 1, 2 or 3 to enable copious debugging info */
@@ -620,17 +613,14 @@ MODULE_DESCRIPTION ("RealTek RTL-8139 Fa
 MODULE_LICENSE("GPL");
 MODULE_VERSION(DRV_VERSION);
 
-module_param(multicast_filter_limit, int, 0444);
+module_param(multicast_filter_limit, int, 0);
 module_param_array(media, int, NULL, 0);
 module_param_array(full_duplex, int, NULL, 0);
-module_param(debug, int, 0444);
-module_param(use_pio, int, 0444);
-
+module_param(debug, int, 0);
+MODULE_PARM_DESC (debug, "8139too bitmapped message enable number");
 MODULE_PARM_DESC (multicast_filter_limit, "8139too maximum number of filtered multicast addresses");
 MODULE_PARM_DESC (media, "8139too: Bits 4+9: force full duplex, bit 5: 100Mbps");
 MODULE_PARM_DESC (full_duplex, "8139too: Force full duplex for board(s) (1)");
-MODULE_PARM_DESC (debug, "8139too bitmapped message enable number");
-MODULE_PARM_DESC (use_pio, "Non-zero to enable PIO (rather than MMIO) register mapping");
 
 static int read_eeprom (void __iomem *ioaddr, int location, int addr_len);
 static int rtl8139_open (struct net_device *dev);
@@ -718,7 +708,13 @@ static void __rtl8139_cleanup_dev (struc
assert (tp->pci_dev != NULL);
pdev = tp->pci_dev;
 
-   pci_iounmap (pdev, tp->mmio_addr);
+#ifdef USE_IO_OPS
+   if (tp->mmio_addr)
+   ioport_unmap (tp->mmio_addr);
+#else
+   if (tp->mmio_addr)
+   pci_iounmap (pdev, tp->mmio_addr);
+#endif /* USE_IO_OPS */
 
/* it's ok to call this even if we have no regions to free */
pci_release_regions (pdev);
@@ -794,32 +790,32 @@ static int __devinit rtl8139_init_board 
DPRINTK("PIO region size == 0x%02X\n", pio_len);
DPRINTK("MMIO region size == 0x%02lX\n", mmio_len);
 
+#ifdef USE_IO_OPS
/* make sure PCI base addr 0 is PIO */
if (!(pio_flags & IORESOURCE_IO)) {
dev_err(>dev, "region #0 not a PIO resource, aborting\n");
rc = -ENODEV;
goto err_out;
}
-
/* check for weird/broke

Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ‘cpu_physical_id’

2007-08-09 Thread Miles Lane
On 8/9/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> On Thu, Aug 09, 2007 at 10:18:15AM -0400, Miles Lane wrote:
> >   CC  drivers/dma/ioat_dca.o
> > drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> > drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> > 'cpu_physical_id'
> > make[2]: *** [drivers/dma/ioat_dca.o] Error 1
>
> -ENODOTCONFIG

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23-rc2-mm1
# Thu Aug  9 12:18:45 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_NONIRQ_WAKEUP=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SWAP_PREFETCH=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_CONTAINERS=y
CONFIG_CONTAINER_DEBUG=y
CONFIG_CONTAINER_NS=y
CONFIG_CONTAINER_CPUACCT=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_PROC_KPAGEMAP=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_BLOCK=y
CONFIG_LBD=y
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set
# CONFIG_BLK_DEV_BSG is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=m
CONFIG_IOSCHED_CFQ=m
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
# CONFIG_SMP is not set
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MCORE2 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_HPET_TIMER=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=m
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m

Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)

2007-08-09 Thread Andy Whitcroft
On Thu, Aug 09, 2007 at 04:20:06PM +0200, Krzysztof Helt wrote:
> On Thu, 9 Aug 2007 14:04:49 +0100
> Andy Whitcroft <[EMAIL PROTECTED]> wrote:
> 
> > Seeing the following compile error on a G5 mac:
> > 
> >   drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
> >   drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this
> >  function)
> >   drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is
> > reported only once
> >   drivers/video/tdfxfb.c:1341: error: for each function it appears in.)
> > 
> > This seems to be the following fragment from tdfxfb-hardware-cursor:
> > 
> > +   } else if (!strcmp(this_opt, "hwcursor")) {
> > +   hwcursor = simple_strtoul(opt + 9, NULL, 0);
> > 
> > > I guess the naive fix would be s/opt/this_opt, but I am also
> > suspicious of the +9 here as hwcursor is only 8 long?  Now this
> > seems to take a numeric value and I assume that is via hwcursor=N,
> > if so then the +9 would make sense _if_ the strcmp was against
> > "hwcursor=".
> > 
> 
> The patch below fixes all issues you have pointed out. It also fixes
> the description of the nomtrr option.

Will push this through our tests and let you know.
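
For reference, the option parsing discussed above can be sketched as
follows. This is a hypothetical userspace helper, not the tdfxfb patch
(which is not shown in this thread): parse_hwcursor is an invented name,
and standard strtoul() is used in place of the kernel's simple_strtoul().
The point is only that the prefix compared against includes the '=', so
the offset of 9 matches the length of what was actually matched.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse an option of the form "hwcursor=N".
 * Returns 1 and stores N in *value on a match, 0 otherwise.
 */
static int parse_hwcursor(const char *this_opt, unsigned long *value)
{
	static const char prefix[] = "hwcursor=";
	size_t len = sizeof(prefix) - 1;	/* 9, including the '=' */

	/* Compare only the prefix; a plain strcmp() would never match
	 * once a value follows the '='. */
	if (strncmp(this_opt, prefix, len) != 0)
		return 0;

	/* The number starts right after the '=' that was matched. */
	*value = strtoul(this_opt + len, NULL, 0);
	return 1;
}
```

A bare "hwcursor" without "=N" is rejected by this sketch; whether the
driver should accept that form is a separate design choice.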

-apw


Re: 2.6.23-rc2-mm1 -- spinlock bad magic

2007-08-09 Thread Andy Whitcroft
On Thu, Aug 09, 2007 at 01:53:06PM +0100, Andy Whitcroft wrote:
> Seeing spinlock bad magic BUG's from x86 and x86_64 test boxes,
> fsx-linux seems to be tickling it.  Adding Peter as prop_norm_single
> seems to be his:

Talking to Peter on IRC he suggested I back out the patch below and
retest on these machines:

mm-dirty-balancing-for-tasks

One machine seems to have hung elsewhere (probably another bug, sigh),
and one has run the fsx-linux tests successfully.  So this patch does
seem suspect.  I will report back on the other tests when they complete.

-apw


Re: [E1000-devel] 2.6.23-rc2-mm1: e1000e global symbols must be renamed

2007-08-09 Thread Kok, Auke

Adrian Bunk wrote:
> On Thu, Aug 09, 2007 at 01:51:06AM -0700, Andrew Morton wrote:
>> ...
>> - There is a new e1000 driver in git-netdev-all, called e1000e.  I'm sure
>>   the developers would like it tested.  Please cc [EMAIL PROTECTED] on
>>   any reports.
>> ...
>> Changes since 2.6.23-rc2-mm1:
>> ...
>>  git-netdev-all.patch
>> ...
>>  git trees
>> ...
>
> <--  snip  -->
>
> ...
>   LD  drivers/net/built-in.o
> drivers/net/e1000e/built-in.o: In function `e1000_read_mac_addr':
> (.text+0x3470): multiple definition of `e1000_read_mac_addr'
> drivers/net/e1000/built-in.o:(.text+0xb6cc): first defined here
> drivers/net/e1000e/built-in.o: In function `e1000_set_ethtool_ops':
> (.text+0x594d): multiple definition of `e1000_set_ethtool_ops'
> drivers/net/e1000/built-in.o:(.text+0xc97a): first defined here
> ...
> make[3]: *** [drivers/net/built-in.o] Error 1

ack, I'll step on that and make it go away :)

Auke


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Andrew Morton
On Thu, 09 Aug 2007 18:19:30 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> This might be related. The kernel is tainted because I hit
> kernel BUG at /home/devel/linux-mm/mm/swap_state.c:78!

umm, possibly.  If we went BUG while holding a spinlock then sure, 
a future lockup is pretty much inevitable.  But the lockdep
uninitialised-lock complaint is a bit of a surprise.

Can you please retest with Hugh's fix applied?


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Andrew Morton
On Thu, 09 Aug 2007 17:36:57 +0200 Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> Andrew Morton pisze:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> > 
> 
> bash_shared_mapping triggered this
> 
> [  874.714700] INFO: trying to register non-static key.
> [  874.719659] the code is fine but needs lockdep annotation.
> [  874.725133] turning off the locking correctness validator.
> [  874.730606]  [] show_trace_log_lvl+0x1a/0x30
> [  874.735759]  [] show_trace+0x12/0x14
> [  874.740218]  [] dump_stack+0x16/0x18
> [  874.744679]  [] __lock_acquire+0x598/0x125c
> [  874.749745]  [] lock_acquire+0xa7/0xc1
> [  874.754378]  [] _spin_lock_irqsave+0x41/0x6e
> [  874.759529]  [] prop_norm_single+0x34/0x8a
> [  874.764508]  [] set_page_dirty+0xa1/0x13b
> [  874.769402]  [] try_to_unmap_one+0xb8/0x1e7
> [  874.774467]  [] try_to_unmap+0x8f/0x40d
> [  874.779187]  [] shrink_page_list+0x278/0x750
> [  874.784339]  [] shrink_inactive_list+0xf6/0x328
> [  874.789749]  [] shrink_zone+0xad/0x10b
> [  874.794383]  [] try_to_free_pages+0x178/0x274
> [  874.799620]  [] __alloc_pages+0x169/0x431
> [  874.804514]  [] __do_page_cache_readahead+0x141/0x207
> [  874.810443]  [] do_page_cache_readahead+0x48/0x5c
> [  874.816027]  [] filemap_fault+0x2dd/0x4cf
> [  874.820921]  [] __do_fault+0xb6/0x42d
> [  874.825466]  [] handle_mm_fault+0x1b6/0x750
> [  874.830533]  [] do_page_fault+0x334/0x5f9
> [  874.835425]  [] error_code+0x72/0x78
> [  874.839886]  ===

I'd assume that the lib/proportions code went and passed a garbage pointer into
spin_lock_irqsave().  Or maybe it has a correct pointer but didn't initialise
the spinlock.

> [  880.621883] BUG: NMI Watchdog detected LOCKUP on CPU1, eip c0529022, 
> registers:
> [  880.629200] Modules linked in: ext2 loop autofs4 af_packet 
> nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink 
> ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT ip6table_filter 
> ip6_tables x_tables firmware_class binfmt_misc fan ipv6 nvram snd_intel8x0 
> snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
> snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer evdev snd 
> soundcore i2c_i801 snd_page_alloc intel_agp agpgart rtc
> [  880.672397] CPU:1

This will be a consequence of that.


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Michal Piotrowski
Michal Piotrowski pisze:
> Andrew Morton pisze:
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
>>
> 
> bash_shared_mapping triggered this
> 
> [  874.714700] INFO: trying to register non-static key.
> [  874.719659] the code is fine but needs lockdep annotation.
> [  874.725133] turning off the locking correctness validator.
> [  874.730606]  [] show_trace_log_lvl+0x1a/0x30
> [  874.735759]  [] show_trace+0x12/0x14
> [  874.740218]  [] dump_stack+0x16/0x18
> [  874.744679]  [] __lock_acquire+0x598/0x125c
> [  874.749745]  [] lock_acquire+0xa7/0xc1
> [  874.754378]  [] _spin_lock_irqsave+0x41/0x6e
> [  874.759529]  [] prop_norm_single+0x34/0x8a
> [  874.764508]  [] set_page_dirty+0xa1/0x13b
> [  874.769402]  [] try_to_unmap_one+0xb8/0x1e7
> [  874.774467]  [] try_to_unmap+0x8f/0x40d
> [  874.779187]  [] shrink_page_list+0x278/0x750
> [  874.784339]  [] shrink_inactive_list+0xf6/0x328
> [  874.789749]  [] shrink_zone+0xad/0x10b
> [  874.794383]  [] try_to_free_pages+0x178/0x274
> [  874.799620]  [] __alloc_pages+0x169/0x431
> [  874.804514]  [] __do_page_cache_readahead+0x141/0x207
> [  874.810443]  [] do_page_cache_readahead+0x48/0x5c
> [  874.816027]  [] filemap_fault+0x2dd/0x4cf
> [  874.820921]  [] __do_fault+0xb6/0x42d
> [  874.825466]  [] handle_mm_fault+0x1b6/0x750
> [  874.830533]  [] do_page_fault+0x334/0x5f9
> [  874.835425]  [] error_code+0x72/0x78
> [  874.839886]  ===
> [  880.621883] BUG: NMI Watchdog detected LOCKUP on CPU1, eip c0529022, 
> registers:
> [  880.629200] Modules linked in: ext2 loop autofs4 af_packet 
> nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink 
> ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT ip6table_filter 
> ip6_tables x_tables firmware_class binfmt_misc fan ipv6 nvram snd_intel8x0 
> snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
> snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer evdev snd 
> soundcore i2c_i801 snd_page_alloc intel_agp agpgart rtc
> [  880.672397] CPU:1
> [  880.672398] EIP:0060:[]Not tainted VLI
> [  880.672400] EFLAGS: 0046   (2.6.23-rc2-mm1 #3)
> [  880.684735] EIP is at delay_tsc+0xe/0x17
> 
> l *delay_tsc+0xe
> 0xc1129022 is in delay_tsc (/home/devel/linux-mm/arch/i386/lib/delay.c:49).
> 44
> 45  rdtscl(bclock);
> 46  do {
> 47  rep_nop();
> 48  rdtscl(now);
> 49  } while ((now-bclock) < loops);
> 50  }
> 51
> 52  /*
> 53   * Since we calibrate only once at boot, this
> 
> 
> [  880.688646] eax: 393e5d7c   ebx: 0001   ecx: 393e5d04   edx: 023f
> [  880.695414] esi:    edi: cabbf5cc   ebp: caf29ae8   esp: caf29ae4
> [  880.702183] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
> [  880.708002] Process firefox-bin (pid: 2625, ti=caf29000 task=cabbe900 
> task.ti=caf29000)
> [  880.715805] Stack: 02e6eb94 caf29af0 c0528fdd caf29b28 c05375ac 0046 
>  caf29b28 
> [  880.724345]0046  a6c999f0 0001 a6c999f0  
> cabbf5e0 cabbf5cc 
> [  880.732887]0086 caf29b48 c069f76f  0002 c05259db 
> cabbf5c0 8000 
> [  880.741425] Call Trace:
> [  880.744073]  [] show_trace_log_lvl+0x1a/0x30
> [  880.749233]  [] show_stack_log_lvl+0xa9/0xd5
> [  880.754386]  [] show_registers+0x21a/0x3ac
> [  880.759365]  [] die_nmi+0x84/0xd7
> [  880.763566]  [] nmi_watchdog_tick+0x14d/0x168
> [  880.768803]  [] do_nmi+0x8b/0x284
> [  880.773004]  [] nmi_stack_correct+0x26/0x2b
> [  880.778069]  [] __delay+0x9/0xb
> [  880.782098]  [] _raw_spin_lock+0xd8/0x18a
> [  880.786991]  [] _spin_lock_irqsave+0x5d/0x6e
> [  880.792143]  [] prop_norm_single+0x34/0x8a
> [  880.797122]  [] set_page_dirty+0xa1/0x13b
> [  880.802015]  [] try_to_unmap_one+0xb8/0x1e7
> [  880.807079]  [] try_to_unmap+0x8f/0x40d
> [  880.811798]  [] shrink_page_list+0x278/0x750
> [  880.816950]  [] shrink_inactive_list+0xf6/0x328
> [  880.822362]  [] shrink_zone+0xad/0x10b
> [  880.826997]  [] try_to_free_pages+0x178/0x274
> [  880.832235]  [] __alloc_pages+0x169/0x431
> [  880.837126]  [] __do_page_cache_readahead+0x141/0x207
> [  880.843056]  [] do_page_cache_readahead+0x48/0x5c
> [  880.848641]  [] filemap_fault+0x2dd/0x4cf
> [  880.853534]  [] __do_fault+0xb6/0x42d
> [  880.858081]  [] handle_mm_fault+0x1b6/0x750
> [  880.863146]  [] do_page_fault+0x334/0x5f9
> [  880.868037]  [] error_code+0x72/0x78
> [  880.872497]  ===
> [  880.876068] INFO: lockdep is turned off.
> [  880.879983] Code: 8d 0c 1b 01 c9 89 da c1 e2 07 29 ca 01 da 0

Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!

2007-08-09 Thread Michal Piotrowski
Hugh Dickins pisze:
> On Thu, 9 Aug 2007, Mariusz Kozlowski wrote:
>> Hello,
>>
>>  Nothing unusual happening, allmodconfig compiling etc.
>> Not sure why it says kernel was tainted though ... hmmm.
>>
>> [ cut here ]
>> kernel BUG at mm/swap_state.c:78!
>> invalid opcode:  [#1]
>> PREEMPT 
>> Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia 
>> 8250_pci 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too
>> CPU:0
>> EIP:0060:[]Tainted: PVLI
>> EFLAGS: 00010246   (2.6.23-rc2-mm1 #1)
>> EIP is at __add_to_swap_cache+0xc6/0xd7
>> eax: 4000   ebx: c11285c0   ecx: 00d0   edx: 0283
>> esi: c11285c0   edi: 0283   ebp: c1858f90   esp: c1858f84
>> ds: 007b   es: 007b   fs:   gs:   ss: 0068
>> Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000)
>> Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0  
>> c1858fcc 
>>c015307c 0001 0007 0002 0002 0283  
>> fffc 
>> c0152d5c c1858fe0 c0127f2e c0127ef8   
>>  
>> Call Trace:
>>  [] show_trace_log_lvl+0x1a/0x30
>>  [] show_stack_log_lvl+0xa9/0xd5
>>  [] show_registers+0x219/0x38d
>>  [] die+0x104/0x23e
>>  [] do_trap+0x83/0xad
>>  [] do_invalid_op+0x88/0x92
>>  [] error_code+0x6a/0x70
>>  [] add_to_swap_cache+0x22/0x58
>>  [] kprefetchd+0x320/0x364
>>  [] kthread+0x36/0x58
>>  [] kernel_thread_helper+0x7/0x14
>>  ===
>> INFO: lockdep is turned off.
>> Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 
>> 03 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 0f 
>> 0b eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 
>> EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84
> 

The same issue here.

> Don't worry about reproducing untainted, I got the same earlier
> and was just preparing and testing the hotfix: here it is...
> 

Thanks for the patch.

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.23-rc2-mm1

2007-08-09 Thread Michal Piotrowski
Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc2/2.6.23-rc2-mm1/
> 

Running bash_shared_mapping triggered this:

[  874.714700] INFO: trying to register non-static key.
[  874.719659] the code is fine but needs lockdep annotation.
[  874.725133] turning off the locking correctness validator.
[  874.730606]  [] show_trace_log_lvl+0x1a/0x30
[  874.735759]  [] show_trace+0x12/0x14
[  874.740218]  [] dump_stack+0x16/0x18
[  874.744679]  [] __lock_acquire+0x598/0x125c
[  874.749745]  [] lock_acquire+0xa7/0xc1
[  874.754378]  [] _spin_lock_irqsave+0x41/0x6e
[  874.759529]  [] prop_norm_single+0x34/0x8a
[  874.764508]  [] set_page_dirty+0xa1/0x13b
[  874.769402]  [] try_to_unmap_one+0xb8/0x1e7
[  874.774467]  [] try_to_unmap+0x8f/0x40d
[  874.779187]  [] shrink_page_list+0x278/0x750
[  874.784339]  [] shrink_inactive_list+0xf6/0x328
[  874.789749]  [] shrink_zone+0xad/0x10b
[  874.794383]  [] try_to_free_pages+0x178/0x274
[  874.799620]  [] __alloc_pages+0x169/0x431
[  874.804514]  [] __do_page_cache_readahead+0x141/0x207
[  874.810443]  [] do_page_cache_readahead+0x48/0x5c
[  874.816027]  [] filemap_fault+0x2dd/0x4cf
[  874.820921]  [] __do_fault+0xb6/0x42d
[  874.825466]  [] handle_mm_fault+0x1b6/0x750
[  874.830533]  [] do_page_fault+0x334/0x5f9
[  874.835425]  [] error_code+0x72/0x78
[  874.839886]  ===
[  880.621883] BUG: NMI Watchdog detected LOCKUP on CPU1, eip c0529022, 
registers:
[  880.629200] Modules linked in: ext2 loop autofs4 af_packet 
nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink 
ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT ip6table_filter 
ip6_tables x_tables firmware_class binfmt_misc fan ipv6 nvram snd_intel8x0 
snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq 
snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer evdev snd soundcore 
i2c_i801 snd_page_alloc intel_agp agpgart rtc
[  880.672397] CPU:1
[  880.672398] EIP:0060:[]Not tainted VLI
[  880.672400] EFLAGS: 0046   (2.6.23-rc2-mm1 #3)
[  880.684735] EIP is at delay_tsc+0xe/0x17

l *delay_tsc+0xe
0xc1129022 is in delay_tsc (/home/devel/linux-mm/arch/i386/lib/delay.c:49).
44
45  rdtscl(bclock);
46  do {
47  rep_nop();
48  rdtscl(now);
49  } while ((now-bclock) < loops);
50  }
51
52  /*
53   * Since we calibrate only once at boot, this


[  880.688646] eax: 393e5d7c   ebx: 0001   ecx: 393e5d04   edx: 023f
[  880.695414] esi:    edi: cabbf5cc   ebp: caf29ae8   esp: caf29ae4
[  880.702183] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
[  880.708002] Process firefox-bin (pid: 2625, ti=caf29000 task=cabbe900 
task.ti=caf29000)
[  880.715805] Stack: 02e6eb94 caf29af0 c0528fdd caf29b28 c05375ac 0046 
 caf29b28 
[  880.724345]0046  a6c999f0 0001 a6c999f0  
cabbf5e0 cabbf5cc 
[  880.732887]0086 caf29b48 c069f76f  0002 c05259db 
cabbf5c0 8000 
[  880.741425] Call Trace:
[  880.744073]  [] show_trace_log_lvl+0x1a/0x30
[  880.749233]  [] show_stack_log_lvl+0xa9/0xd5
[  880.754386]  [] show_registers+0x21a/0x3ac
[  880.759365]  [] die_nmi+0x84/0xd7
[  880.763566]  [] nmi_watchdog_tick+0x14d/0x168
[  880.768803]  [] do_nmi+0x8b/0x284
[  880.773004]  [] nmi_stack_correct+0x26/0x2b
[  880.778069]  [] __delay+0x9/0xb
[  880.782098]  [] _raw_spin_lock+0xd8/0x18a
[  880.786991]  [] _spin_lock_irqsave+0x5d/0x6e
[  880.792143]  [] prop_norm_single+0x34/0x8a
[  880.797122]  [] set_page_dirty+0xa1/0x13b
[  880.802015]  [] try_to_unmap_one+0xb8/0x1e7
[  880.807079]  [] try_to_unmap+0x8f/0x40d
[  880.811798]  [] shrink_page_list+0x278/0x750
[  880.816950]  [] shrink_inactive_list+0xf6/0x328
[  880.822362]  [] shrink_zone+0xad/0x10b
[  880.826997]  [] try_to_free_pages+0x178/0x274
[  880.832235]  [] __alloc_pages+0x169/0x431
[  880.837126]  [] __do_page_cache_readahead+0x141/0x207
[  880.843056]  [] do_page_cache_readahead+0x48/0x5c
[  880.848641]  [] filemap_fault+0x2dd/0x4cf
[  880.853534]  [] __do_fault+0xb6/0x42d
[  880.858081]  [] handle_mm_fault+0x1b6/0x750
[  880.863146]  [] do_page_fault+0x334/0x5f9
[  880.868037]  [] error_code+0x72/0x78
[  880.872497]  ===
[  880.876068] INFO: lockdep is turned off.
[  880.879983] Code: 8d 0c 1b 01 c9 89 da c1 e2 07 29 ca 01 da 01 d2 f7 e2 8d 
42 01 e8 c3 ff ff ff 5b 5d c3 55 89 e5 53 89 c3 0f 31 89 c1 f3 90 0f 31 <29> c8 
39 d8 72 f6 5b 5d c3 55 89 e5 53 69 c0 1c 43 00 00 64 8b 
[  880.900092] Kernel panic - not syncing: Aiee, killing interrupt handler!
[  880.906791] WARNING: at /home/devel/linux-mm/arch/i386/kernel/smp.c:474 
native_smp_send_reschedule()
[  880.915892]  [] show_trace_log_lvl+0x1a/0x30
[  880.921043]  [] show_trace+0x12/0x14
[  880.925504]  [] dump_stack+0x16/0x18
[  880.929964]  [] native_smp_send_reschedule+0x8b/0x98
[  880.935808]  [] re

Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!

2007-08-09 Thread Hugh Dickins
On Thu, 9 Aug 2007, Mariusz Kozlowski wrote:
> Hello,
> 
>   Nothing unusual happening, allmodconfig compiling etc.
> Not sure why it says kernel was tainted though ... hmmm.
> 
> [ cut here ]
> kernel BUG at mm/swap_state.c:78!
> invalid opcode:  [#1]
> PREEMPT 
> Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia 8250_pci 
> 8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too
> CPU:0
> EIP:0060:[]Tainted: P        VLI
> EFLAGS: 00010246   (2.6.23-rc2-mm1 #1)
> EIP is at __add_to_swap_cache+0xc6/0xd7
> eax: 4000   ebx: c11285c0   ecx: 00d0   edx: 0283
> esi: c11285c0   edi: 0283   ebp: c1858f90   esp: c1858f84
> ds: 007b   es: 007b   fs:   gs:   ss: 0068
> Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000)
> Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0  
> c1858fcc 
>c015307c 0001 0007 0002 0002 0283  
> fffc 
> c0152d5c c1858fe0 c0127f2e c0127ef8   
>  
> Call Trace:
>  [] show_trace_log_lvl+0x1a/0x30
>  [] show_stack_log_lvl+0xa9/0xd5
>  [] show_registers+0x219/0x38d
>  [] die+0x104/0x23e
>  [] do_trap+0x83/0xad
>  [] do_invalid_op+0x88/0x92
>  [] error_code+0x6a/0x70
>  [] add_to_swap_cache+0x22/0x58
>  [] kprefetchd+0x320/0x364
>  [] kthread+0x36/0x58
>  [] kernel_thread_helper+0x7/0x14
>  ===
> INFO: lockdep is turned off.
> Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 03 
> 29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 0f 0b 
> eb fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 
> EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84

Don't worry about reproducing untainted, I got the same earlier
and was just preparing and testing the hotfix: here it is...


Nick's mm-clarify-__add_to_swap_cache-locking.patch is fine for mainline,
but soon generates a "kernel BUG at mm/swap_state.c:78!" when it meets
mm-implement-swap-prefetching.patch in 2.6.23-rc2-mm1.  We could add a
fix to the latter, but I think it's better to adjust Nick's, so that
it's right for whichever tree it's in: move the responsibility to
SetPageLocked from read_swap_cache_async to add_to_swap_cache.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 mm/swap_state.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- 2.6.23-rc2-mm1/mm/swap_state.c  2007-08-09 13:15:36.0 +0100
+++ linux/mm/swap_state.c   2007-08-09 14:40:27.0 +0100
@@ -100,15 +100,18 @@ int add_to_swap_cache(struct page *page,
 {
 	int error;
 
+	BUG_ON(PageLocked(page));
 	if (!swap_duplicate(entry)) {
 		INC_CACHE_INFO(noent_race);
 		return -ENOENT;
 	}
+	SetPageLocked(page);
 	error = __add_to_swap_cache(page, entry, GFP_KERNEL);
 	/*
 	 * Anon pages are already on the LRU, we don't run lru_cache_add here.
 	 */
 	if (error) {
+		ClearPageLocked(page);
 		swap_free(entry);
 		if (error == -EEXIST)
 			INC_CACHE_INFO(exist_race);
@@ -345,7 +348,6 @@ struct page *read_swap_cache_async(swp_e
 								vma, addr);
 			if (!new_page)
 				break;		/* Out of memory */
-			SetPageLocked(new_page);	/* could be non-atomic op */
 		}
 
 		/*
@@ -369,9 +371,7 @@ struct page *read_swap_cache_async(swp_e
 		}
 	} while (err != -ENOENT && err != -ENOMEM);
 
-	if (new_page) {
-		ClearPageLocked(new_page);
+	if (new_page)
 		page_cache_release(new_page);
-	}
 	return found_page;
 }
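
For readers following the thread, here is a minimal userspace sketch of the
ownership change the patch above makes: the function that adds the page takes
the page lock itself and drops it again on failure, so every caller gets the
same behaviour. Everything in it (struct toy_page, toy_insert, toy_add) is a
hypothetical stand-in, not kernel code, and the assert in toy_insert only
assumes that the inner function expects a locked page; the thread does not
show what the check at mm/swap_state.c:78 actually is.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the lock bit is modelled. */
struct toy_page {
	bool locked;
};

/*
 * Stand-in for the inner insert step. Assumption: it requires the page
 * to be locked already and fails with -EEXIST when the slot is taken.
 */
static int toy_insert(struct toy_page *page, bool slot_taken)
{
	assert(page->locked);
	return slot_taken ? -EEXIST : 0;
}

/*
 * Stand-in for the outer add step after the patch: it owns the locking,
 * so callers hand in an unlocked page and never touch the lock bit.
 */
static int toy_add(struct toy_page *page, bool slot_taken)
{
	int error;

	assert(!page->locked);		/* mirrors BUG_ON(PageLocked(page)) */
	page->locked = true;		/* mirrors SetPageLocked(page) */
	error = toy_insert(page, slot_taken);
	if (error)
		page->locked = false;	/* mirrors ClearPageLocked(page) */
	return error;
}
```

Calling toy_add() on an unlocked page with a free slot returns 0 and leaves
the page locked; with the slot taken it returns -EEXIST and leaves the page
unlocked, so a caller such as the prefetch path needs no lock handling of
its own.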


Re: 2.6.23-rc2-mm1: hang, prop_norm_single involved

2007-08-09 Thread Peter Zijlstra
On Thu, 2007-08-09 at 14:45 +0200, Peter Zijlstra wrote:
> On Thu, 2007-08-09 at 15:10 +0400, Alexey Dobriyan wrote:
> > LTP run reproducibly hangs during the rwtest01 test
> > rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> > Calltrace is always the same:
> > 

> [EMAIL PROTECTED] ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N 
> rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
> rwtest011  PASS  :  Test passed
> [EMAIL PROTECTED] ~]# PATH=/testcases/bin/:$PATH /testcases/bin/rwtest -N 
> rwtest01 -c -q -i 60s -f sync 10%25000:rs-sync=$$
>
> I can reproduce, but not always.
> 
> Also, since the task->dirties member is initialized in fork.c this
> should either _always_ happen or never. So this does point to some
> memory corruption, ->dirties is the very last member of the task struct.
> 
> /me goes try with slab_debug,...

To no avail; I'm banging my head against the wall trying to work out what is
happening here.

Andrew, could you:

# mm-dirty-balancing-for-tasks.patch

while I try to figure this one out?

That seems to make the unhappies I could reproduce here go away.



Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!

2007-08-09 Thread Mariusz Kozlowski
> >...
> > Not sure why it says kernel was tainted though ... hmmm.
> >...
> 
> What does your syslog say when it was tainted?

Shit. My fault. I'll try to reproduce it on untainted kernel.

Thanks,

Mariusz


Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 05:11:52PM +0200, Mariusz Kozlowski wrote:
>...
> Not sure why it says kernel was tainted though ... hmmm.
>...

What does your syslog say when it was tainted?

> Regards,
> 
>   Mariusz

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: 2.6.23-rc2-mm1: kernel BUG at mm/swap_state.c:78!

2007-08-09 Thread Mariusz Kozlowski
Hello,

Nothing unusual happening, allmodconfig compiling etc.
Not sure why it says kernel was tainted though ... hmmm.

[ cut here ]
kernel BUG at mm/swap_state.c:78!
invalid opcode:  [#1]
PREEMPT 
Modules linked in: orinoco_cs orinoco hermes pl2303 usbserial pcmcia 8250_pci 
8250 serial_core yenta_socket rsrc_nonstatic pcmcia_core 8139too
CPU:0
EIP:0060:[]Tainted: PVLI
EFLAGS: 00010246   (2.6.23-rc2-mm1 #1)
EIP is at __add_to_swap_cache+0xc6/0xd7
eax: 4000   ebx: c11285c0   ecx: 00d0   edx: 0283
esi: c11285c0   edi: 0283   ebp: c1858f90   esp: c1858f84
ds: 007b   es: 007b   fs:   gs:   ss: 0068
Process kprefetchd (pid: 236, ti=c1858000 task=c18d14d0 task.ti=c1858000)
Stack: 0283 c11285c0 c3d5a3c8 c1858fa0 c01504ea c11285c0  c1858fcc 
   c015307c 0001 0007 0002 0002 0283  fffc 
    c0152d5c c1858fe0 c0127f2e c0127ef8    
Call Trace:
 [] show_trace_log_lvl+0x1a/0x30
 [] show_stack_log_lvl+0xa9/0xd5
 [] show_registers+0x219/0x38d
 [] die+0x104/0x23e
 [] do_trap+0x83/0xad
 [] do_invalid_op+0x88/0x92
 [] error_code+0x6a/0x70
 [] add_to_swap_cache+0x22/0x58
 [] kprefetchd+0x320/0x364
 [] kthread+0x36/0x58
 [] kernel_thread_helper+0x7/0x14
 ===
INFO: lockdep is turned off.
Code: 0f 89 7b 0c 83 05 fc c9 53 c0 01 8b 13 c1 ea 1e 8d 04 12 01 d0 c1 e0 03 
29 d0 c1 e0 05 ff 80 b8 c0 53 c0 ff 05 34 1d 68 c0 eb 96 <0f> 0b eb fe 0f 0b eb 
fe 0f 0b eb fe 8b 53 0c eb be 55 89 e5 56 
EIP: [] __add_to_swap_cache+0xc6/0xd7 SS:ESP 0068:c1858f84

Regards,

Mariusz


Re: 2.6.23-rc2-mm1 -- drivers/dma/ioat_dca.c:177: error: implicit declaration of function ‘cpu_physical_id’

2007-08-09 Thread Adrian Bunk
On Thu, Aug 09, 2007 at 10:18:15AM -0400, Miles Lane wrote:
>   CC  drivers/dma/ioat_dca.o
> drivers/dma/ioat_dca.c: In function 'ioat_dca_get_tag':
> drivers/dma/ioat_dca.c:177: error: implicit declaration of function
> 'cpu_physical_id'
> make[2]: *** [drivers/dma/ioat_dca.o] Error 1

-ENODOTCONFIG

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: 2.6.23-rc2-mm1 -- PPC G5 kernel compile failure (patch)

2007-08-09 Thread Krzysztof Helt
On Thu, 9 Aug 2007 14:04:49 +0100
Andy Whitcroft <[EMAIL PROTECTED]> wrote:

> Seeing the following compile error on a G5 mac:
> 
>   drivers/video/tdfxfb.c: In function 'tdfxfb_setup':
>   drivers/video/tdfxfb.c:1341: error: 'opt' undeclared (first use in this
>  function)
>   drivers/video/tdfxfb.c:1341: error: (Each undeclared identifier is
> reported only once
>   drivers/video/tdfxfb.c:1341: error: for each function it appears in.)
> 
> This seems to be the following fragment from tdfxfb-hardware-cursor:
> 
> +   } else if (!strcmp(this_opt, "hwcursor")) {
> +   hwcursor = simple_strtoul(opt + 9, NULL, 0);
> 
> I guess the naive fix would be s/opt/this_opt, but I am also
> suspicious of the +9 here as hwcursor is only 8 long?  Now this
> seems to take a numeric value and I assume that is via hwcursor=N,
> if so then the +9 would make sense _if_ the strcmp was against
> "hwcursor=".
> 

The patch below fixes all issues you have pointed out. It also fixes
the description of the nomtrr option.

---

From: Krzysztof Helt <[EMAIL PROTECTED]>

This patch fixes a compile error in the setup-option parsing and corrects the
description of the nomtrr option.

Signed-off-by: Krzysztof Helt <[EMAIL PROTECTED]>

---

--- linux-2.6.22.new/drivers/video/tdfxfb.c	2007-08-09 16:11:23.870028259 +0200
+++ linux-2.6.23/drivers/video/tdfxfb.c	2007-08-09 16:15:07.654781024 +0200
@@ -1337,8 +1337,8 @@ static void tdfxfb_setup(char *options)
 			nopan = 1;
 		} else if (!strcmp(this_opt, "nowrap")) {
 			nowrap = 1;
-		} else if (!strcmp(this_opt, "hwcursor")) {
-			hwcursor = simple_strtoul(opt + 9, NULL, 0);
+		} else if (!strncmp(this_opt, "hwcursor=", 9)) {
+			hwcursor = simple_strtoul(this_opt + 9, NULL, 0);
 #ifdef CONFIG_MTRR
 		} else if (!strncmp(this_opt, "nomtrr", 6)) {
 			nomtrr = 1;
@@ -1409,7 +1409,7 @@ MODULE_PARM_DESC(hwcursor, "Enable hardw
 			"(1=enable, 0=disable, default=1)");
 #ifdef CONFIG_MTRR
 module_param(nomtrr, bool, 0);
-MODULE_PARM_DESC(nomtrr, "Disable MTRR support (0 or 1=disabled) (default=0)");
+MODULE_PARM_DESC(nomtrr, "Disable MTRR support (default: enabled)");
 #endif
 
 module_init(tdfxfb_init);
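
As an illustration of the point raised about the "+ 9" offset, here is a small
userspace sketch of the same prefix-match-then-parse idea. The helper name
parse_hwcursor is hypothetical and not part of tdfxfb.c, and it uses the C
library's strtoul where the driver uses the kernel's simple_strtoul. The
offset is derived from the key string itself, so the compared prefix and the
skipped length cannot drift apart.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from tdfxfb.c: parse one "hwcursor=N"
 * token. Returns 1 and stores N in *value on a match; returns 0 and
 * leaves *value untouched otherwise.
 */
static int parse_hwcursor(const char *opt, unsigned long *value)
{
	static const char key[] = "hwcursor=";
	const size_t key_len = sizeof(key) - 1;	/* 9, excluding the NUL */

	/* Match the key including '=', so a bare "hwcursor" is rejected. */
	if (strncmp(opt, key, key_len) != 0)
		return 0;

	/* Skip exactly the bytes that were compared, then parse the number. */
	*value = strtoul(opt + key_len, NULL, 0);
	return 1;
}
```

With this shape, "hwcursor=1" yields 1 and "hwcursor=0" yields 0, while a bare
"hwcursor" or an unrelated option such as "nowrap" does not match at all.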

