Re: Linux 2.6.20-rc2

2007-01-03 Thread Jiri Kosina
On Mon, 25 Dec 2006, Florin Iucha wrote:

> I left the machine to run the diff and when I came back, the USB 
> keyboard was unresponsive although the USB mice plugged in the hub built 
> into the keyboard were working fine.  I was able to ssh into the box, 
> capture the dmesg and reboot.  Everything went down quietly but the box 
> froze at the "... will restart".  I had no working keyboard and no way 
> to see if it was indeed frozen or not.

Hi Florin,

I have not seen any similar bug reports, but it seems that you are able to
reproduce the problem fairly reliably.

Could you try to narrow down whether the HID core patches that went into
2.6.20-rc1 are causing your problem?

The easiest way is probably to revert the following commits and see
whether you can still reproduce the problem. It would be nice if you could
try, so that we know whether it is caused by the HID core or by some other
post-2.6.19 USB/input change.

10f549fa1538849548787879d96bbb3450f06117
4ef4caad41630c7caa6e2b94c6e7dda7e9689714
1c1e40b5ad6e345feba69bc612db006efccf4cdc
e3a0dd7ced76bb439ddeda244a9667e7b3800fc8
63f3861d2fbf8ccbad1386ac9ac8b822c036ea00
4c2ae844b5ef85fd4b571c9c91ac48afa6ef2dfc
aa8de2f038baec993f07ef66fb3e94481d1ec22b
aa938f7974b82cfd9ee955031987344f332b7c77
4916b3a57fc94664677d439b911b8aaf86c7ec23
229695e51efc4ed5e04ab471c82591d0f432909d
dde5845a529ff753364a6d1aea61180946270bfa
64bb67b1702958759f650adb64ab33496641e526

They should be revertible without conflict in this order.

Thanks,

-- 
Jiri Kosina
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.20-rc2

2006-12-26 Thread Florin Iucha
On Wed, Dec 27, 2006 at 12:42:53AM +0100, Ingo Molnar wrote:
> * Florin Iucha <[EMAIL PROTECTED]> wrote:
> > I saw your subsequent message and will apply the patch, retest and 
> > report.
> 
> yeah. Just to make sure i've attached the latest and greatest version of 
> the patch - please make sure you have this one applied.

The good news is, with this patch there is no oops.

The bad news is, the USB keyboard is still not functioning, but the
mice plugged in the keyboard hub are working.

One down, one more to go...

florin

> -->
> Subject: [patch] sched: fix cond_resched_softirq() offset
> From: Ingo Molnar <[EMAIL PROTECTED]>
> 
> remove the __resched_legal() check: it is conceptually broken.
> The biggest problem it had is that it can mask buggy cond_resched()
> calls. A cond_resched() call is only legal if we are not in an
> atomic context, with two narrow exceptions:
> 
>  - if the system is booting
>  - a reacquire_kernel_lock() down() done while PREEMPT_ACTIVE is set
> 
> But __resched_legal() hid this and just silently returned whenever
> these primitives were called from invalid contexts. (Same goes for
> cond_resched_lock() and cond_resched_softirq()).
> 
> furthermore, the __resched_legal(0) call was buggy in that it caused
> unnecessarily long softirq latencies via cond_resched_softirq(). (which
> is only called from softirq-off sections, hence the code did nothing.)
> 
> the fix is to resurrect the efficiency of the might_sleep checks and to
> only allow the narrow exceptions.
> 
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  kernel/sched.c |   18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)
> 
> Index: linux/kernel/sched.c
> ===================================================================
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -4617,17 +4617,6 @@ asmlinkage long sys_sched_yield(void)
>  	return 0;
>  }
>  
> -static inline int __resched_legal(int expected_preempt_count)
> -{
> -#ifdef CONFIG_PREEMPT
> -	if (unlikely(preempt_count() != expected_preempt_count))
> -		return 0;
> -#endif
> -	if (unlikely(system_state != SYSTEM_RUNNING))
> -		return 0;
> -	return 1;
> -}
> -
>  static void __cond_resched(void)
>  {
>  #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
> @@ -4647,7 +4636,8 @@ static void __cond_resched(void)
>  
>  int __sched cond_resched(void)
>  {
> -	if (need_resched() && __resched_legal(0)) {
> +	if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) &&
> +					system_state == SYSTEM_RUNNING) {
>  		__cond_resched();
>  		return 1;
>  	}
> @@ -4673,7 +4663,7 @@ int cond_resched_lock(spinlock_t *lock)
>  		ret = 1;
>  		spin_lock(lock);
>  	}
> -	if (need_resched() && __resched_legal(1)) {
> +	if (need_resched() && system_state == SYSTEM_RUNNING) {
>  		spin_release(&lock->dep_map, 1, _THIS_IP_);
>  		_raw_spin_unlock(lock);
>  		preempt_enable_no_resched();
> @@ -4689,7 +4679,7 @@ int __sched cond_resched_softirq(void)
>  {
>  	BUG_ON(!in_softirq());
>  
> -	if (need_resched() && __resched_legal(0)) {
> +	if (need_resched() && system_state == SYSTEM_RUNNING) {
>  		raw_local_irq_disable();
>  		_local_bh_enable();
>  		raw_local_irq_enable();
> 

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163




Re: Linux 2.6.20-rc2

2006-12-26 Thread Fabio Comolli

Hi.
Can you confirm that the problem I mentioned in
http://lkml.org/lkml/2006/12/24/32 is the same?

Best regards,
Fabio




On 12/26/06, Ingo Molnar <[EMAIL PROTECTED]> wrote:


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > I've had at least one more occurrence of it:
> >
> > [   78.804940] BUG: scheduling while atomic: kbd/0x2000/3444
> > [   78.804944]
> > [   78.804945] Call Trace:
>
> ok, i can think of a simpler scenario:
> add_preempt_count(PREEMPT_ACTIVE) /twice/, nested into each other.

doh - the BKL! That does a down() in a PREEMPT_ACTIVE section, which can
trigger cond_resched(). The fix is to check for PREEMPT_ACTIVE in
cond_resched(). (and only in cond_resched())

Updated fix (against -rc2) attached.

Ingo

-->
Subject: [patch] sched: fix cond_resched_softirq() offset
From: Ingo Molnar <[EMAIL PROTECTED]>

remove the __resched_legal() check: it is conceptually broken.
The biggest problem it had is that it can mask buggy cond_resched()
calls. A cond_resched() call is only legal if we are not in an
atomic context, with two narrow exceptions:

 - if the system is booting
 - a reacquire_kernel_lock() down() done while PREEMPT_ACTIVE is set

But __resched_legal() hid this and just silently returned whenever
these primitives were called from invalid contexts. (Same goes for
cond_resched_lock() and cond_resched_softirq()).

furthermore, the __resched_legal(0) call was buggy in that it caused
unnecessarily long softirq latencies via cond_resched_softirq(). (which
is only called from softirq-off sections, hence the code did nothing.)

the fix is to resurrect the efficiency of the might_sleep checks and to
only allow the narrow exceptions.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4617,17 +4617,6 @@ asmlinkage long sys_sched_yield(void)
 	return 0;
 }
 
-static inline int __resched_legal(int expected_preempt_count)
-{
-#ifdef CONFIG_PREEMPT
-	if (unlikely(preempt_count() != expected_preempt_count))
-		return 0;
-#endif
-	if (unlikely(system_state != SYSTEM_RUNNING))
-		return 0;
-	return 1;
-}
-
 static void __cond_resched(void)
 {
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
@@ -4647,7 +4636,8 @@ static void __cond_resched(void)
 
 int __sched cond_resched(void)
 {
-	if (need_resched() && __resched_legal(0)) {
+	if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) &&
+					system_state == SYSTEM_RUNNING) {
 		__cond_resched();
 		return 1;
 	}
@@ -4673,7 +4663,7 @@ int cond_resched_lock(spinlock_t *lock)
 		ret = 1;
 		spin_lock(lock);
 	}
-	if (need_resched() && __resched_legal(1)) {
+	if (need_resched() && system_state == SYSTEM_RUNNING) {
 		spin_release(&lock->dep_map, 1, _THIS_IP_);
 		_raw_spin_unlock(lock);
 		preempt_enable_no_resched();
@@ -4689,7 +4679,7 @@ int __sched cond_resched_softirq(void)
 {
 	BUG_ON(!in_softirq());
 
-	if (need_resched() && __resched_legal(0)) {
+	if (need_resched() && system_state == SYSTEM_RUNNING) {
 		raw_local_irq_disable();
 		_local_bh_enable();
 		raw_local_irq_enable();


Re: BUG: scheduling while atomic - Linux 2.6.20-rc2-ga3d89517

2006-12-26 Thread Randy Dunlap
On Tue, 26 Dec 2006 18:15:31 +0100 Pavel Machek wrote:

> Hi!
> 
> > some days and will let you know if the problem reappears. Please note
> > that it happened only twice and I don't have any clue on how to
> > reproduce it.
> > 
> > I added Pavel and Rafael to CC-list because for the first time in at
> > least six months my laptop failed to resume after suspend-to-disk
> > (userland tools) with this kernel. Guys, do you think that this
> > failure could be related to this BUG?
> 
> everything is possible, but this one does not seem too likely. Is
> failure reproducible?

Ingo just posted a patch for this problem.

http://marc.theaimsgroup.com/?l=linux-kernel&m=116715139714252&w=2

---
~Randy


Re: BUG: scheduling while atomic - Linux 2.6.20-rc2-ga3d89517

2006-12-26 Thread Fabio Comolli

Hi

On 12/26/06, Pavel Machek <[EMAIL PROTECTED]> wrote:

Hi!

> some days and will let you know if the problem reappears. Please note
> that it happened only twice and I don't have any clue on how to
> reproduce it.
>
> I added Pavel and Rafael to CC-list because for the first time in at
> least six months my laptop failed to resume after suspend-to-disk
> (userland tools) with this kernel. Guys, do you think that this
> failure could be related to this BUG?

everything is possible, but this one does not seem too likely. Is
failure reproducible?



Not at all. I applied Hirofumi's patch and the problem seems to be
gone. But it was impossible to reproduce even without it: the BUG
happened only twice and the resume failure only once.



Pavel


Fabio


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: BUG: scheduling while atomic - Linux 2.6.20-rc2-ga3d89517

2006-12-26 Thread Pavel Machek
Hi!

> some days and will let you know if the problem reappears. Please note
> that it happened only twice and I don't have any clue on how to
> reproduce it.
> 
> I added Pavel and Rafael to CC-list because for the first time in at
> least six months my laptop failed to resume after suspend-to-disk
> (userland tools) with this kernel. Guys, do you think that this
> failure could be related to this BUG?

everything is possible, but this one does not seem too likely. Is
failure reproducible?

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Linux 2.6.20-rc2

2006-12-26 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > I've had at least one more occurrence of it:
> > 
> > [   78.804940] BUG: scheduling while atomic: kbd/0x2000/3444
> > [   78.804944] 
> > [   78.804945] Call Trace:
> 
> ok, i can think of a simpler scenario: 
> add_preempt_count(PREEMPT_ACTIVE) /twice/, nested into each other.

doh - the BKL! That does a down() in a PREEMPT_ACTIVE section, which can 
trigger cond_resched(). The fix is to check for PREEMPT_ACTIVE in 
cond_resched(). (and only in cond_resched())
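
The recursion described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not kernel code: the
struct and every model_* name are hypothetical, PREEMPT_ACTIVE is assumed to
be bit 28, the system_state test is left out, and the BKL down() is reduced
to "schedule calls cond_resched again".

```c
#include <assert.h>
#include <stdbool.h>

#define PREEMPT_ACTIVE 0x10000000u	/* assumed: bit 28 */

/* Hypothetical stand-in for the per-task state involved. */
struct model {
	unsigned int preempt_count;
	int resched_requests;	/* how often need_resched() will be true */
	bool check_active;	/* true: cond_resched() tests PREEMPT_ACTIVE */
	unsigned int max_seen;	/* largest count observed inside schedule */
};

static bool model_cond_resched(struct model *m);

/* Stand-in for schedule(): record the count, then reacquire the BKL,
 * whose down() may reach cond_resched() again. */
static void model_schedule(struct model *m)
{
	if (m->preempt_count > m->max_seen)
		m->max_seen = m->preempt_count;
	m->resched_requests--;		/* this request has been served */
	model_cond_resched(m);		/* down() -> might_sleep() path */
}

/* Stand-in for __cond_resched(): schedule with PREEMPT_ACTIVE added. */
static void model_do_resched(struct model *m)
{
	m->preempt_count += PREEMPT_ACTIVE;
	model_schedule(m);
	m->preempt_count -= PREEMPT_ACTIVE;
}

static bool model_cond_resched(struct model *m)
{
	if (m->resched_requests <= 0)
		return false;
	/* The check the fix adds: never nest inside a PREEMPT_ACTIVE section. */
	if (m->check_active && (m->preempt_count & PREEMPT_ACTIVE))
		return false;
	model_do_resched(m);
	return true;
}

/* Run one cond_resched() with two pending requests; return the peak count. */
static unsigned int run_model(bool patched)
{
	struct model m = {
		.preempt_count = 0,
		.resched_requests = 2,
		.check_active = patched,
		.max_seen = 0,
	};

	model_cond_resched(&m);
	assert(m.preempt_count == 0);	/* every add is paired with a sub */
	return m.max_seen;
}
```

In this model, run_model(false) peaks at twice PREEMPT_ACTIVE because the
nested call adds it a second time, while run_model(true) never goes above a
single PREEMPT_ACTIVE.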

Updated fix (against -rc2) attached.

Ingo

-->
Subject: [patch] sched: fix cond_resched_softirq() offset
From: Ingo Molnar <[EMAIL PROTECTED]>

remove the __resched_legal() check: it is conceptually broken.
The biggest problem it had is that it can mask buggy cond_resched()
calls. A cond_resched() call is only legal if we are not in an
atomic context, with two narrow exceptions:

 - if the system is booting
 - a reacquire_kernel_lock() down() done while PREEMPT_ACTIVE is set

But __resched_legal() hid this and just silently returned whenever
these primitives were called from invalid contexts. (Same goes for
cond_resched_lock() and cond_resched_softirq()).

furthermore, the __resched_legal(0) call was buggy in that it caused
unnecessarily long softirq latencies via cond_resched_softirq(). (which
is only called from softirq-off sections, hence the code did nothing.)

the fix is to resurrect the efficiency of the might_sleep checks and to
only allow the narrow exceptions.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched.c |   18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -4617,17 +4617,6 @@ asmlinkage long sys_sched_yield(void)
 	return 0;
 }
 
-static inline int __resched_legal(int expected_preempt_count)
-{
-#ifdef CONFIG_PREEMPT
-	if (unlikely(preempt_count() != expected_preempt_count))
-		return 0;
-#endif
-	if (unlikely(system_state != SYSTEM_RUNNING))
-		return 0;
-	return 1;
-}
-
 static void __cond_resched(void)
 {
 #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP
@@ -4647,7 +4636,8 @@ static void __cond_resched(void)
 
 int __sched cond_resched(void)
 {
-	if (need_resched() && __resched_legal(0)) {
+	if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) &&
+					system_state == SYSTEM_RUNNING) {
 		__cond_resched();
 		return 1;
 	}
@@ -4673,7 +4663,7 @@ int cond_resched_lock(spinlock_t *lock)
 		ret = 1;
 		spin_lock(lock);
 	}
-	if (need_resched() && __resched_legal(1)) {
+	if (need_resched() && system_state == SYSTEM_RUNNING) {
 		spin_release(&lock->dep_map, 1, _THIS_IP_);
 		_raw_spin_unlock(lock);
 		preempt_enable_no_resched();
@@ -4689,7 +4679,7 @@ int __sched cond_resched_softirq(void)
 {
 	BUG_ON(!in_softirq());
 
-	if (need_resched() && __resched_legal(0)) {
+	if (need_resched() && system_state == SYSTEM_RUNNING) {
 		raw_local_irq_disable();
 		_local_bh_enable();
 		raw_local_irq_enable();


Re: Linux 2.6.20-rc2

2006-12-26 Thread Ingo Molnar

* Randy Dunlap <[EMAIL PROTECTED]> wrote:

> I've had at least one more occurrence of it:
> 
> [   78.804940] BUG: scheduling while atomic: kbd/0x2000/3444
> [   78.804944] 
> [   78.804945] Call Trace:

ok, i can think of a simpler scenario: add_preempt_count(PREEMPT_ACTIVE) 
/twice/, nested into each other.
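
The arithmetic behind that scenario is small enough to check in user space.
The sketch below assumes PREEMPT_ACTIVE is bit 28, as stated elsewhere in
this thread, and models the scheduler's atomic test as "any bit other than
PREEMPT_ACTIVE is set", which is an assumption about the check rather than a
copy of it. Both function names are hypothetical.

```c
#include <assert.h>

#define PREEMPT_ACTIVE 0x10000000u	/* assumed: bit 28 */

/* Count that results from adding PREEMPT_ACTIVE 'depth' times without
 * an intervening subtraction, starting from zero. */
static unsigned int nested_active_count(int depth)
{
	unsigned int count = 0;
	int i;

	assert(depth >= 0);
	for (i = 0; i < depth; i++)
		count += PREEMPT_ACTIVE;
	return count;
}

/* Assumed shape of the atomic test: ignore PREEMPT_ACTIVE itself and
 * complain if anything else is set. */
static int looks_atomic(unsigned int count)
{
	return (count & ~PREEMPT_ACTIVE) != 0;
}
```

One level of nesting gives bit 28 alone, which the test ignores. Two levels
carry into bit 29, which nothing masks, so the task looks atomic even though
no lock or interrupt context is involved.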

Ingo


Re: Linux 2.6.20-rc2

2006-12-26 Thread Randy Dunlap
On Tue, 26 Dec 2006 13:40:19 +0100 Ingo Molnar wrote:

> 
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > > [ 2844.871895] BUG: scheduling while atomic: cp/0x2000/2965
> 
> > This is the second report we've had where bit 29 of ->preempt_count is 
> > getting set.  I don't think there's any legitimate way in which that 
> > bit can get set.  (Ingo?)

First one was me, on x86_64 UP.  I ran memtest86 many hours
with no problems found.  It's an almost-new system fwiw.

> It's not legitimate (the highest legitimate bit is PREEMPT_ACTIVE, bit 
> 28). Nor can i think of any bug scenario barring outright memory 
> corruption (either hardware or kernel induced) that could cause this. 
> It's quite hard to trigger bit 29 there via any of the scheduling 
> mechanisms: either the preempt count would have to underflow massively 
> /and/ avoid detection during that underflow (sheer impossible) or the 
> HARDIRQ_COUNT would have to overflow to more than 4096 (again near 
> impossible to trigger), and simultaneously the softirq and preempt count 
> would have to overflow by 256 /at once/, or underflow by 1 at once. The 
> likelihood of that makes the likelihood of GPL-ed Windows a sure bet in 
> comparison.
> 
> So my guess would still be memory corruption of some sort, or some 
> really weird compiler bug. We just recently mandated REGPARM on i386 for 
> example, it would be interesting to know whether an older (say 2.6.18 or 
> 19) config had CONFIG_REGPARM enabled or not? Regparm can also tax the 
> hardware (the CPU in particular) a bit more.

I've had at least one more occurrence of it:

[   78.804940] BUG: scheduling while atomic: kbd/0x2000/3444
[   78.804944] 
[   78.804945] Call Trace:
[   78.804952]  [] __sched_text_start+0x60/0xae0
[   78.804958]  [] default_wake_function+0xd/0xf
[   78.804962]  [] __wake_up_common+0x3e/0x68
[   78.804966]  [] __cond_resched+0x1c/0x44
[   78.804969]  [] cond_resched+0x29/0x30
[   78.804973]  [] __reacquire_kernel_lock+0x29/0x49
[   78.804977]  [] thread_return+0xa3/0xe2
[   78.804981]  [] __cond_resched+0x1c/0x44
[   78.804985]  [] cond_resched+0x29/0x30
[   78.804989]  [] device_add+0x3e1/0x53e
[   78.804993]  [] device_register+0x19/0x1d
[   78.804996]  [] device_create+0xdf/0x110
[   78.805001]  [] set_palette+0x5c/0x60
[   78.805005]  [] reset_terminal+0x1f0/0x1f5
[   78.805010]  [] vcs_make_sysfs+0x5e/0x62
[   78.805014]  [] con_open+0x88/0x9b
[   78.805018]  [] tty_open+0x19c/0x310
[   78.805022]  [] chrdev_open+0x164/0x19d
[   78.805026]  [] chrdev_open+0x0/0x19d
[   78.805030]  [] __dentry_open+0xe9/0x1ba
[   78.805034]  [] nameidata_to_filp+0x2d/0x3f
[   78.805038]  [] do_filp_open+0x36/0x46
[   78.805042]  [] get_unused_fd+0x70/0x105
[   78.805046]  [] do_sys_open+0x4f/0xd7
[   78.805050]  [] sys_open+0x1b/0x1d
[   78.805054]  [] system_call+0x7e/0x83

---
~Randy


Re: Linux 2.6.20-rc2

2006-12-26 Thread Ingo Molnar

* Florin Iucha <[EMAIL PROTECTED]> wrote:

> This is my year-old workstation that I've built from good parts (Asus 
> A8N-SLI premium, OCZ memory), not overclocked, not overheated (it is 
> in an Antec P180 case with 12 cm fans -> CPU max is 43'C when not used 
> for my hour-long simulations).  I will leave it running memtest for a 
> couple of hours.
> 
> The compiler is "gcc version 4.1.2 20061028 (prerelease) (Debian 
> 4.1.1-19)" and the .config is attached.

could you send a config that you used with the 2.6.19 (or 2.6.18) 
kernel?

Ingo


Re: Linux 2.6.20-rc2

2006-12-26 Thread Florin Iucha
On Tue, Dec 26, 2006 at 01:40:19PM +0100, Ingo Molnar wrote:
> 
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > > [ 2844.871895] BUG: scheduling while atomic: cp/0x2000/2965
> 
> > This is the second report we've had where bit 29 of ->preempt_count is 
> > getting set.  I don't think there's any legitimate way in which that 
> > bit can get set.  (Ingo?)
> 
> It's not legitimate (the highest legitimate bit is PREEMPT_ACTIVE, bit 
> 28). Nor can i think of any bug scenario barring outright memory 
> corruption (either hardware or kernel induced) that could cause this. 
> It's quite hard to trigger bit 29 there via any of the scheduling 
> mechanisms: either the preempt count would have to underflow massively 
> /and/ avoid detection during that underflow (sheer impossible) or the 
> HARDIRQ_COUNT would have to overflow to more than 4096 (again near 
> impossible to trigger), and simultaneously the softirq and preempt count 
> would have to overflow by 256 /at once/, or underflow by 1 at once. The 
> likelihood of that makes the likelihood of GPL-ed Windows a sure bet in 
> comparison.
> 
> So my guess would still be memory corruption of some sort, or some 
> really weird compiler bug. We just recently mandated REGPARM on i386 for 
> example, it would be interesting to know whether an older (say 2.6.18 or 
> 19) config had CONFIG_REGPARM enabled or not? Regparm can also tax the 
> hardware (the CPU in particular) a bit more.

This is my year-old workstation that I've built from good parts (Asus
A8N-SLI premium, OCZ memory), not overclocked, not overheated (it is
in an Antec P180 case with 12 cm fans -> CPU max is 43'C when not used
for my hour-long simulations).  I will leave it running memtest for a
couple of hours.

The compiler is "gcc version 4.1.2 20061028 (prerelease) (Debian
4.1.1-19)" and the .config is attached.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.20-rc2
# Mon Dec 25 10:48:26 2006
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
# CONFIG_IKCONFIG_PROC is not set
# CONFIG_CPUSETS is not set
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_UID16=y
# CONFIG_SYSCTL_SYSCALL is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
# CONFIG_VM_EVENT_COUNTERS is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_VSMP is not set
CONFIG_MK8=y
# CONFIG_MPSC is not set
# CONFIG_MCORE2 is not set
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=m
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_MTRR=y
CONFIG_SMP=y
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
# CONFIG_NUMA is not set
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_SELECT_MEMORY_MODEL=y
CO

Re: Linux 2.6.20-rc2

2006-12-26 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > [ 2844.871895] BUG: scheduling while atomic: cp/0x2000/2965

> This is the second report we've had where bit 29 of ->preempt_count is 
> getting set.  I don't think there's any legitimate way in which that 
> bit can get set.  (Ingo?)

It's not legitimate (the highest legitimate bit is PREEMPT_ACTIVE, bit 
28). Nor can i think of any bug scenario barring outright memory 
corruption (either hardware or kernel induced) that could cause this. 
It's quite hard to trigger bit 29 there via any of the scheduling 
mechanisms: either the preempt count would have to underflow massively 
/and/ avoid detection during that underflow (sheer impossible) or the 
HARDIRQ_COUNT would have to overflow to more than 4096 (again near 
impossible to trigger), and simultaneously the softirq and preempt count 
would have to overflow by 256 /at once/, or underflow by 1 at once. The 
likelihood of that makes the likelihood of GPL-ed Windows a sure bet in 
comparison.

So my guess would still be memory corruption of some sort, or some 
really weird compiler bug. We just recently mandated REGPARM on i386 for 
example, it would be interesting to know whether an older (say 2.6.18 or 
19) config had CONFIG_REGPARM enabled or not? Regparm can also tax the 
hardware (the CPU in particular) a bit more.
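
The field widths mentioned above can be made concrete with a small decoder.
This is a user-space sketch, assuming the layout implied by the numbers in
this message (256 preempt levels, 256 softirq levels, 4096 hardirq levels,
PREEMPT_ACTIVE at bit 28); struct pc_fields and pc_decode() are hypothetical
names, not kernel API.

```c
#include <assert.h>

/* Assumed widths, taken from the figures quoted above. */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	12

#define SOFTIRQ_SHIFT	PREEMPT_BITS
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define ACTIVE_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

struct pc_fields {
	unsigned int preempt;	/* bits 0-7: preempt-disable depth */
	unsigned int softirq;	/* bits 8-15: softirq count */
	unsigned int hardirq;	/* bits 16-27: hardirq count */
	unsigned int active;	/* bit 28: PREEMPT_ACTIVE */
	unsigned int stray;	/* bits 29 and up, shifted down to bit 0 */
};

/* Split a raw preempt_count value into the assumed fields. */
static struct pc_fields pc_decode(unsigned int pc)
{
	struct pc_fields f;

	assert(ACTIVE_SHIFT == 28);	/* the widths must add up to bit 28 */

	f.preempt = pc & ((1u << PREEMPT_BITS) - 1);
	f.softirq = (pc >> SOFTIRQ_SHIFT) & ((1u << SOFTIRQ_BITS) - 1);
	f.hardirq = (pc >> HARDIRQ_SHIFT) & ((1u << HARDIRQ_BITS) - 1);
	f.active  = (pc >> ACTIVE_SHIFT) & 1u;
	f.stray   = pc >> (ACTIVE_SHIFT + 1);
	return f;
}
```

Under these assumptions a value with only bit 29 set decodes to all-zero
fields plus a nonzero stray part, which is why no ordinary combination of
preempt, softirq and hardirq nesting accounts for it.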

Ingo


Re: Linux 2.6.20-rc2

2006-12-26 Thread Andrew Morton
On Mon, 25 Dec 2006 16:56:16 -0600
[EMAIL PROTECTED] (Florin Iucha) wrote:

> > The dmesg from the client machine is attached.
> 
> Now, really.
> 
> BTW, I am using NFSv4 exported async from the server and mounted
> without any extra options on the client.
> 
> florin
> 
> -- 
> Bruce Schneier expects the Spanish Inquisition.
>   http://geekz.co.uk/schneierfacts/fact/163
> 
> 
> [the_oops  text/plain (9.9KB)]
> [ 2844.871895] BUG: scheduling while atomic: cp/0x2000/2965
> [ 2844.871900] 
> [ 2844.871901] Call Trace:
> [ 2844.871910]  [] __sched_text_start+0x5d/0x7a6
> [ 2844.871914]  [] submit_bio+0x84/0x8b
> [ 2844.871918]  [] ext3_get_block+0x0/0xe4
> [ 2844.871922]  [] __pagevec_lru_add+0xb6/0xc6
> [ 2844.871927]  [] mpage_bio_submit+0x22/0x26
> [ 2844.871931]  [] unix_poll+0x0/0xa4
> [ 2844.871936]  [] __cond_resched+0x1c/0x44
> [ 2844.871940]  [] cond_resched+0x29/0x30
> [ 2844.871943]  [] __reacquire_kernel_lock+0x26/0x44
> [ 2844.871948]  [] thread_return+0xa3/0xe1
> [ 2844.871953]  [] unlock_page+0x9/0x26
> [ 2844.871957]  [] __cond_resched+0x1c/0x44
> [ 2844.871961]  [] cond_resched+0x29/0x30
> [ 2844.871965]  [] generic_writepages+0x113/0x2d8
> [ 2844.871970]  [] nfs_writepage+0x0/0x22
> [ 2844.871976]  [] nfs_writepages+0x45/0x13c
> [ 2844.871980]  [] do_writepages+0x20/0x2d
> [ 2844.871984]  [] __filemap_fdatawrite_range+0x51/0x5b
> [ 2844.871989]  [] filemap_write_and_wait+0x17/0x31
> [ 2844.871993]  [] nfs_setattr+0x98/0x108
> [ 2844.871996]  [] mntput_no_expire+0x19/0x7b
> [ 2844.872000]  [] link_path_walk+0xc5/0xd7
> [ 2844.872005]  [] current_fs_time+0x3b/0x40
> [ 2844.872009]  [] notify_change+0x122/0x22f
> [ 2844.872014]  [] do_utimes+0x106/0x129
> [ 2844.872019]  [] vfs_read+0xaa/0x152
> [ 2844.872023]  [] sys_futimesat+0x3c/0x4b
> [ 2844.872027]  [] system_call+0x7e/0x83
> [ 2844.872030] 

This is the second report we've had where bit 29 of ->preempt_count is
getting set.  I don't think there's any legitimate way in which that bit
can get set.  (Ingo?)

I'd suggested that the first report (which was in i386 iirc) was due to
memory corruption (hardware or software).  And this might also be a
hardware error, but that's looking pretty unlikely now.

If this is real, it's going to be hard to find, unless someone finds a
way to make it happen with some repeatability.


Re: Linux 2.6.20-rc2

2006-12-25 Thread Florin Iucha
On Tue, Dec 26, 2006 at 12:06:58AM +0100, Trond Myklebust wrote:
> On Mon, 2006-12-25 at 16:56 -0600, Florin Iucha wrote:
> > BTW, I am using NFSv4 exported async from the server and mounted
> > without any extra options on the client.
> 
> Doesn't look like it has much to do with NFS. The Oopses appear mainly
> to be occurring when assorted ext3 code calls submit_bio(). Was that the
> entire Oops text?

Yes, that was the entire oops text.  NFS appeared on the stack trace
and I thought it might be useful to know more about the code paths.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163




Re: Linux 2.6.20-rc2

2006-12-25 Thread Trond Myklebust
On Mon, 2006-12-25 at 16:56 -0600, Florin Iucha wrote:
> > The dmesg from the client machine is attached.
> 
> Now, really.
> 
> BTW, I am using NFSv4 exported async from the server and mounted
> without any extra options on the client.
> 
> florin

Doesn't look like it has much to do with NFS. The Oopses appear mainly
to be occurring when assorted ext3 code calls submit_bio(). Was that the
entire Oops text?

Cheers
  Trond



Linux 2.6.20-rc2

2006-12-25 Thread Florin Iucha
I got an oops or two while copying 60 GB of files over NFS and then
comparing them using diff.  The client is AMD64 running Debian
testing/unstable with the shiny new 2.6.20-rc2 kernel.  The server is
Debian testing with the 2.6.18-3 distribution kernel.  The source
filesystem is ext3.

I left the machine to run the diff and when I came back, the USB keyboard
was unresponsive although the USB mice plugged in the hub built into
the keyboard were working fine.  I was able to ssh into the box,
capture the dmesg and reboot.  Everything went down quietly but the
box froze at the "... will restart".  I had no working keyboard and
no way to see if it was indeed frozen or not.

I got a similar event of keyboard loss while copying the files using
2.6.20-rc1.  I was able to copy the files using 2.6.19.

The dmesg from the client machine is attached.
florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163




Re: Linux 2.6.20-rc2

2006-12-25 Thread Florin Iucha
> The dmesg from the client machine is attached.

Now, really.

BTW, I am using NFSv4 exported async from the server and mounted
without any extra options on the client.

florin

-- 
Bruce Schneier expects the Spanish Inquisition.
  http://geekz.co.uk/schneierfacts/fact/163
[ 2844.871895] BUG: scheduling while atomic: cp/0x2000/2965
[ 2844.871900] 
[ 2844.871901] Call Trace:
[ 2844.871910]  [] __sched_text_start+0x5d/0x7a6
[ 2844.871914]  [] submit_bio+0x84/0x8b
[ 2844.871918]  [] ext3_get_block+0x0/0xe4
[ 2844.871922]  [] __pagevec_lru_add+0xb6/0xc6
[ 2844.871927]  [] mpage_bio_submit+0x22/0x26
[ 2844.871931]  [] unix_poll+0x0/0xa4
[ 2844.871936]  [] __cond_resched+0x1c/0x44
[ 2844.871940]  [] cond_resched+0x29/0x30
[ 2844.871943]  [] __reacquire_kernel_lock+0x26/0x44
[ 2844.871948]  [] thread_return+0xa3/0xe1
[ 2844.871953]  [] unlock_page+0x9/0x26
[ 2844.871957]  [] __cond_resched+0x1c/0x44
[ 2844.871961]  [] cond_resched+0x29/0x30
[ 2844.871965]  [] generic_writepages+0x113/0x2d8
[ 2844.871970]  [] nfs_writepage+0x0/0x22
[ 2844.871976]  [] nfs_writepages+0x45/0x13c
[ 2844.871980]  [] do_writepages+0x20/0x2d
[ 2844.871984]  [] __filemap_fdatawrite_range+0x51/0x5b
[ 2844.871989]  [] filemap_write_and_wait+0x17/0x31
[ 2844.871993]  [] nfs_setattr+0x98/0x108
[ 2844.871996]  [] mntput_no_expire+0x19/0x7b
[ 2844.872000]  [] link_path_walk+0xc5/0xd7
[ 2844.872005]  [] current_fs_time+0x3b/0x40
[ 2844.872009]  [] notify_change+0x122/0x22f
[ 2844.872014]  [] do_utimes+0x106/0x129
[ 2844.872019]  [] vfs_read+0xaa/0x152
[ 2844.872023]  [] sys_futimesat+0x3c/0x4b
[ 2844.872027]  [] system_call+0x7e/0x83
[ 2844.872030] 
[ 3606.114991] [drm] Loading R300 Microcode
[ 3878.479521] BUG: scheduling while atomic: cp/0x2000/3129
[ 3878.479526] 
[ 3878.479527] Call Trace:
[ 3878.479536]  [] __sched_text_start+0x5d/0x7a6
[ 3878.479541]  [] __down+0xbe/0x100
[ 3878.479546]  [] default_wake_function+0x0/0xe
[ 3878.479571]  [] unx_validate+0x0/0x56
[ 3878.479575]  [] __cond_resched+0x1c/0x44
[ 3878.479579]  [] cond_resched+0x29/0x30
[ 3878.479583]  [] __reacquire_kernel_lock+0x26/0x44
[ 3878.479587]  [] thread_return+0xa3/0xe1
[ 3878.479591]  [] thread_return+0xa3/0xe1
[ 3878.479595]  [] cache_alloc_refill+0x63/0x4ac
[ 3878.479600]  [] __cond_resched+0x1c/0x44
[ 3878.479606]  [] cond_resched+0x29/0x30
[ 3878.479609]  [] kmem_cache_alloc+0x14/0x54
[ 3878.479613]  [] nfs_create_request+0x3d/0x109
[ 3878.479618]  [] nfs_writepage_setup+0x1ab/0x3b5
[ 3878.479624]  [] nfs_updatepage+0xf5/0x134
[ 3878.479628]  [] nfs_commit_write+0x2e/0x41
[ 3878.479758]  [] generic_file_buffered_write+0x482/0x690
[ 3878.479764]  [] copy_user_generic_string+0x17/0x40
[ 3878.479770]  [] __generic_file_aio_write_nolock+0x379/0x3ec
[ 3878.479775]  [] generic_file_aio_write+0x61/0xc1
[ 3878.479780]  [] nfs_file_write+0xb4/0x121
[ 3878.479784]  [] do_sync_write+0xc9/0x10c
[ 3878.479790]  [] autoremove_wake_function+0x0/0x2e
[ 3878.479794]  [] thread_return+0x0/0xe1
[ 3878.479799]  [] vfs_write+0xad/0x156
[ 3878.479802]  [] sys_write+0x45/0x6e
[ 3878.479807]  [] system_call+0x7e/0x83
[ 3878.479810] 
[ 4280.656585] BUG: scheduling while atomic: cp/0x2000/3129
[ 4280.656590] 
[ 4280.656591] Call Trace:
[ 4280.656600]  [] __sched_text_start+0x5d/0x7a6
[ 4280.656605]  [] __alloc_pages+0x61/0x2a8
[ 4280.656609]  [] ext3_get_block+0x0/0xe4
[ 4280.656612]  [] __pagevec_lru_add+0xb6/0xc6
[ 4280.656617]  [] unix_poll+0x0/0xa4
[ 4280.656622]  [] __cond_resched+0x1c/0x44
[ 4280.656626]  [] cond_resched+0x29/0x30
[ 4280.656629]  [] __reacquire_kernel_lock+0x26/0x44
[ 4280.656634]  [] thread_return+0xa3/0xe1
[ 4280.656638]  [] nfs_unlock_request+0x1/0x37
[ 4280.656644]  [] __cond_resched+0x1c/0x44
[ 4280.656647]  [] cond_resched+0x29/0x30
[ 4280.656652]  [] generic_writepages+0x113/0x2d8
[ 4280.656656]  [] nfs_writepage+0x0/0x22
[ 4280.656662]  [] nfs_writepages+0x45/0x13c
[ 4280.656677]  [] do_writepages+0x20/0x2d
[ 4280.656681]  [] __filemap_fdatawrite_range+0x51/0x5b
[ 4280.656686]  [] filemap_write_and_wait+0x17/0x31
[ 4280.656690]  [] nfs_setattr+0x98/0x108
[ 4280.656693]  [] mntput_no_expire+0x19/0x7b
[ 4280.656697]  [] link_path_walk+0xc5/0xd7
[ 4280.656702]  [] current_fs_time+0x3b/0x40
[ 4280.656706]  [] notify_change+0x122/0x22f
[ 4280.656710]  [] do_utimes+0x106/0x129
[ 4280.656715]  [] thread_return+0x0/0xe1
[ 4280.656719]  [] vfs_read+0xaa/0x152
[ 4280.656723]  [] sys_write+0x2d/0x6e
[ 4280.656727]  [] sys_futimesat+0x3c/0x4b
[ 4280.656731]  [] system_call+0x7e/0x83
[ 4280.656734] 
[ 4811.737194] [drm] Loading R300 Microcode
[ 5355.331624] BUG: scheduling while atomic: hald-addon-stor/0x2000/2066
[ 5355.331630] 
[ 5355.331631] Call Trace:
[ 5355.331641]  [] __sched_text_start+0x5d/0x7a6
[ 5355.331646]  [] task_rq_lock+0x3d/0x6f
[ 5355.331650]  [] __activate_task+0x27/0x39
[ 5355.331655]  [] try_to_wake_up+0x363/0x374
[ 5355.331660]  [] __cond_resched+0x1c/0x44
[ 5355.331677]  [] cond_resched+0x29/0x30
[ 5355.331681]  [] __reacquire_ker

swsusp testing wanted (was Re: Linux 2.6.20-rc2)

2006-12-25 Thread Pavel Machek
Hi!

> Rafael J. Wysocki (1):
>   ACPI: S4: Use "platform" rather than "shutdown" mode by default

...platform is the right thing to do, but it is also "more aggressive" than
"shutdown" -- it needs a bigger chunk of the ACPI BIOS to work properly.

So, it would be nice to test 2.6.20-rc2 on your favourite system (if
it breaks, check whether echo "shutdown" > /sys/power/disk fixes it), and
report results. Thanks,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
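
For testers who would rather script the fallback than type the echo by hand, the following is a minimal sketch of the same write done from C. The function name set_hibernate_mode and the idea of passing the path as a parameter are assumptions made for illustration; on a real system the path would be /sys/power/disk, the mode would be "shutdown" or "platform" as discussed above, and root privileges are needed.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a hibernation mode string (for example
 * "shutdown" or "platform") followed by a newline to the file at `path`,
 * mirroring what `echo "shutdown" > /sys/power/disk` does.
 * Returns 0 on success and -1 on any failure.
 */
int set_hibernate_mode(const char *path, const char *mode)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (f == NULL)
		return -1;

	ok = fprintf(f, "%s\n", mode) >= 0;

	/* A rejected write may only be reported when the stream is flushed. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Taking the path as an argument keeps the helper easy to try against an ordinary file before pointing it at sysfs.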


Re: Linux 2.6.20-rc2: forgot how to make a zImage on powerpc?

2006-12-25 Thread Guennadi Liakhovetski
On Mon, 25 Dec 2006, Mark Glines wrote:

> Mark Glines wrote:
> > Hmm.  I'm trying to build 2.6.20-rc2 on a little powerpc box with
> > arch/powerpc/configs/linkstation_defconfig, and I get:
> ...
> >   MODPOST vmlinux
> > ln: accessing `arch/powerpc/boot/zImage': No such file or directory
> > make[1]: *** [arch/powerpc/boot/zImage] Error 1
> > make: *** [zImage] Error 2
> > 
> > So, uh, are we forgetting to go into the right subdirectory to make the
> > actual zImage, or what?  If I'm just doing something wrong, I'd love to know
> > what it is.
> > 
> > I'll follow up here on lkml if I diagnose this further.  Thanks,
> 
> 
> Followup:  Yeah, it looks like it just doesn't know which format of zImage to
> produce for linkstation.
> 
> I'm not sure what image should be used by default.  I guess it depends on the
> bootloader.  Maybe default to uImage, as uBoot seems to be fairly common on
> these devices?

Yes, uImage is the format used on linkstation. Is there a way to cleanly 
specify this in the kernel sources apart from a comment in Kconfig?

Thanks
Guennadi
---
Guennadi Liakhovetski


Re: Linux 2.6.20-rc2: forgot how to make a zImage on powerpc?

2006-12-25 Thread Mark Glines

Mark Glines wrote:
> Hmm.  I'm trying to build 2.6.20-rc2 on a little powerpc box with
> arch/powerpc/configs/linkstation_defconfig, and I get:

...

>   MODPOST vmlinux
> ln: accessing `arch/powerpc/boot/zImage': No such file or directory
> make[1]: *** [arch/powerpc/boot/zImage] Error 1
> make: *** [zImage] Error 2
>
> So, uh, are we forgetting to go into the right subdirectory to make the
> actual zImage, or what?  If I'm just doing something wrong, I'd love to
> know what it is.
>
> I'll follow up here on lkml if I diagnose this further.  Thanks,

Followup:  Yeah, it looks like it just doesn't know which format of
zImage to produce for linkstation.

I'm not sure what image should be used by default.  I guess it depends
on the bootloader.  Maybe default to uImage, as uBoot seems to be fairly
common on these devices?


Mark


Re: BUG: scheduling while atomic - Linux 2.6.20-rc2-ga3d89517

2006-12-25 Thread Fabio Comolli

OK, I applied your patch to yesterday's snapshot of Linus' git tree. I will
run it for some days and will let you know if the problem reappears. Please
note that it happened only twice and I don't have any clue how to
reproduce it.

I added Pavel and Rafael to CC-list because for the first time in at
least six months my laptop failed to resume after suspend-to-disk
(userland tools) with this kernel. Guys, do you think that this
failure could be related to this BUG?

Best regards and Happy Holidays,
Fabio




On 12/24/06, OGAWA Hirofumi <[EMAIL PROTECTED]> wrote:
> "Fabio Comolli" <[EMAIL PROTECTED]> writes:
>
> > Just found this in syslog. It was during normal activity, about 6
> > minutes after resume-from-ram. I never saw this before.
>
> It seems someone forgot to check PREEMPT_ACTIVE in __resched_legal().
> Could you please test the following patch?
> --
> OGAWA Hirofumi <[EMAIL PROTECTED]>
>
> Signed-off-by: OGAWA Hirofumi <[EMAIL PROTECTED]>
> ---
>
>  kernel/sched.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff -puN kernel/sched.c~__resched_legal kernel/sched.c
> --- linux-2.6/kernel/sched.c~__resched_legal	2006-12-24 22:40:19.0 +0900
> +++ linux-2.6-hirofumi/kernel/sched.c	2006-12-24 23:54:01.0 +0900
> @@ -4619,10 +4619,11 @@ asmlinkage long sys_sched_yield(void)
>  
>  static inline int __resched_legal(int expected_preempt_count)
>  {
> -#ifdef CONFIG_PREEMPT
> +#ifndef CONFIG_PREEMPT
> +	expected_preempt_count = 0;
> +#endif
>  	if (unlikely(preempt_count() != expected_preempt_count))
>  		return 0;
> -#endif
>  	if (unlikely(system_state != SYSTEM_RUNNING))
>  		return 0;
>  	return 1;
> _




Re: Linux 2.6.20-rc2: forgot how to make a zImage on powerpc?

2006-12-24 Thread Mark Glines

Linus Torvalds wrote:
> (much of the latter syntactic cleanups). And arm and powerpc updates.


Hmm.  I'm trying to build 2.6.20-rc2 on a little powerpc box with 
arch/powerpc/configs/linkstation_defconfig, and I get:


[EMAIL PROTECTED] /usr/src/linux $ make zImage
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/basic/docproc
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/kxgettext.o
[snip lots of buildspam]
  LD  lib/zlib_deflate/built-in.o
  LD  lib/zlib_inflate/built-in.o
  GEN .version
  LD  .tmp_vmlinux1
  KSYM  .tmp_kallsyms1.S
  AS  .tmp_kallsyms1.o
  LD  .tmp_vmlinux2
  KSYM  .tmp_kallsyms2.S
  AS  .tmp_kallsyms2.o
  LD  vmlinux
  SYSMAP  System.map
  SYSMAP  .tmp_System.map
  MODPOST vmlinux
ln: accessing `arch/powerpc/boot/zImage': No such file or directory
make[1]: *** [arch/powerpc/boot/zImage] Error 1
make: *** [zImage] Error 2

So, uh, are we forgetting to go into the right subdirectory to make the 
actual zImage, or what?  If I'm just doing something wrong, I'd love to 
know what it is.


I'll follow up here on lkml if I diagnose this further.  Thanks,

Mark



Re: Linux 2.6.20-rc2

2006-12-24 Thread Andreas Schwab
Linus Torvalds <[EMAIL PROTECTED]> writes:

> Yu Luming (1):
>   ACPI: video: Add dev argument for backlight_device_register

Fix compilation of via-pmu-backlight.

Signed-off-by: Andreas Schwab <[EMAIL PROTECTED]>

---
 drivers/macintosh/via-pmu-backlight.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.20-rc2/drivers/macintosh/via-pmu-backlight.c
=======
--- linux-2.6.20-rc2.orig/drivers/macintosh/via-pmu-backlight.c	2006-11-30 23:33:39.00000 +0100
+++ linux-2.6.20-rc2/drivers/macintosh/via-pmu-backlight.c	2006-12-24 17:58:18.0 +0100
@@ -147,7 +147,7 @@ void __init pmu_backlight_init()
 
 	snprintf(name, sizeof(name), "pmubl");
 
-	bd = backlight_device_register(name, NULL, &pmu_backlight_data);
+	bd = backlight_device_register(name, NULL, NULL, &pmu_backlight_data);
 	if (IS_ERR(bd)) {
 		printk("pmubl: Backlight registration failed\n");
 		goto error;

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: BUG: scheduling while atomic - Linux 2.6.20-rc2-ga3d89517

2006-12-24 Thread OGAWA Hirofumi
"Fabio Comolli" <[EMAIL PROTECTED]> writes:

> Just found this in syslog. It was during normal activity, about 6
> minutes after resume-from-ram. I never saw this before.

It seems someone forgot to check PREEMPT_ACTIVE in __resched_legal().
Could you please test the following patch?
-- 
OGAWA Hirofumi <[EMAIL PROTECTED]>



Signed-off-by: OGAWA Hirofumi <[EMAIL PROTECTED]>
---

 kernel/sched.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff -puN kernel/sched.c~__resched_legal kernel/sched.c
--- linux-2.6/kernel/sched.c~__resched_legal	2006-12-24 22:40:19.0 +0900
+++ linux-2.6-hirofumi/kernel/sched.c	2006-12-24 23:54:01.0 +0900
@@ -4619,10 +4619,11 @@ asmlinkage long sys_sched_yield(void)
 
 static inline int __resched_legal(int expected_preempt_count)
 {
-#ifdef CONFIG_PREEMPT
+#ifndef CONFIG_PREEMPT
+	expected_preempt_count = 0;
+#endif
 	if (unlikely(preempt_count() != expected_preempt_count))
 		return 0;
-#endif
 	if (unlikely(system_state != SYSTEM_RUNNING))
 		return 0;
 	return 1;
_
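
To make the effect of the patch easier to see, here is a minimal userspace sketch of the check before and after the change. It is a model only, not kernel code: the CONFIG_PREEMPT #ifdef is replaced by a boolean parameter, preempt_count() by a plain global, the system_state test is left out, and MODEL_PREEMPT_ACTIVE is an illustrative value (the real bit is architecture-specific). All names starting with model_ or resched_ are hypothetical. It also assumes that on a non-preemptible kernel taking a spinlock does not raise the preempt count, which is why the patched version forces the expected count to 0 there.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value only; the real PREEMPT_ACTIVE bit differs per arch. */
#define MODEL_PREEMPT_ACTIVE 0x10000000

/* Stand-in for the kernel's preempt_count(). */
int model_preempt_count = 0;

/* Before the patch: the count is only checked when CONFIG_PREEMPT is set. */
bool resched_legal_before(bool config_preempt, int expected_preempt_count)
{
	if (config_preempt && model_preempt_count != expected_preempt_count)
		return false;
	return true;
}

/*
 * After the patch: the check always runs; without CONFIG_PREEMPT the
 * expected count is forced to 0 first.
 */
bool resched_legal_after(bool config_preempt, int expected_preempt_count)
{
	if (!config_preempt)
		expected_preempt_count = 0;
	if (model_preempt_count != expected_preempt_count)
		return false;
	return true;
}

/* Walks through the cases that matter for the report in this thread. */
void resched_model_demo(void)
{
	/* Non-preempt kernel, PREEMPT_ACTIVE already set by a rescheduler. */
	model_preempt_count = MODEL_PREEMPT_ACTIVE;
	assert(resched_legal_before(false, 0));   /* old code lets it through */
	assert(!resched_legal_after(false, 0));   /* patched code refuses */

	/* Non-preempt kernel, caller holds a lock and passes 1. */
	model_preempt_count = 0;
	assert(resched_legal_after(false, 1));    /* expected forced to 0 */

	/* Preemptible kernel: both versions compare against the argument. */
	model_preempt_count = 1;
	assert(resched_legal_before(true, 1));
	assert(resched_legal_after(true, 1));
	assert(!resched_legal_after(true, 0));

	model_preempt_count = 0;
}
```

Calling resched_model_demo() from a small main() exercises all cases; the first pair of assertions is the one that corresponds to the nested rescheduling this patch is meant to prevent.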


Re: Linux 2.6.20-rc2

2006-12-24 Thread Jeff Garzik

Alessandro Suardi wrote:
> On 12/24/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > Ok,
> >  it's a couple of days delayed, because we've been trying to figure out
> > what is up with the rtorrent hash failures since 2.6.18.3. I don't think
> > we've made any progress, but we've cleaned up a number of suspects in the
> > meantime.
> >
> > It's a bit sad, if only because I was really hoping to make 2.6.20 an easy
> > release, and held back on merging some stuff during the merge window for
> > that reason. And now we're battling something that was introduced much
> > earlier..
> >
> > Now, practically speaking this isn't likely to affect a lot of people, but
> > it's still a worrisome problem, and we've had "top people" looking at it.
> > And they'll continue, but xmas is coming.
> >
> > In the meantime, we'll continue with the stabilization, and this mainly
> > does some driver updates (usb, sound, dri, pci hotplug) and ACPI updates
> > (much of the latter syntactic cleanups). And arm and powerpc updates.
> >
> > Shortlog appended.
> >
> > For developers: if you sent me a patch, and I didn't apply it, it was
> > probably just missed because I concentrated on other issues. So pls
> > re-send.. Unless I explicitly told you that I'm not going to pull it due
> > to the merge window being over, of course ;)
> >
> > Linus
>
> [shortlog snipped]
>
> As already reported multiple times, including at -rc1 time...
>
> still need this libata-sff.c patch:
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=116343564202844&q=raw
>
> to have my root device detected, ata_piix probe would otherwise
> fail as described in this thread:
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0612.0/0690.html


I've got a patch that should work for those cases.  Alan's patch 
contained some bugs.


Jeff





Re: Linux 2.6.20-rc2

2006-12-24 Thread Alessandro Suardi

On 12/24/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> Ok,
>  it's a couple of days delayed, because we've been trying to figure out
> what is up with the rtorrent hash failures since 2.6.18.3. I don't think
> we've made any progress, but we've cleaned up a number of suspects in the
> meantime.
>
> It's a bit sad, if only because I was really hoping to make 2.6.20 an easy
> release, and held back on merging some stuff during the merge window for
> that reason. And now we're battling something that was introduced much
> earlier..
>
> Now, practically speaking this isn't likely to affect a lot of people, but
> it's still a worrisome problem, and we've had "top people" looking at it.
> And they'll continue, but xmas is coming.
>
> In the meantime, we'll continue with the stabilization, and this mainly
> does some driver updates (usb, sound, dri, pci hotplug) and ACPI updates
> (much of the latter syntactic cleanups). And arm and powerpc updates.
>
> Shortlog appended.
>
> For developers: if you sent me a patch, and I didn't apply it, it was
> probably just missed because I concentrated on other issues. So pls
> re-send.. Unless I explicitly told you that I'm not going to pull it due
> to the merge window being over, of course ;)
>
> Linus

[shortlog snipped]

As already reported multiple times, including at -rc1 time...

still need this libata-sff.c patch:

http://marc.theaimsgroup.com/?l=linux-kernel&m=116343564202844&q=raw

to have my root device detected, ata_piix probe would otherwise
fail as described in this thread:

http://www.ussg.iu.edu/hypermail/linux/kernel/0612.0/0690.html

Enjoy the holiday season,

--alessandro

"...when I get it, I _get_ it"

(Lara Eidemiller)


Linux 2.6.20-rc2

2006-12-23 Thread Linus Torvalds
di (4):
  ACPI: dock: use mutex instead of spinlock
  ACPI: dock: Make the dock station driver a platform device driver.
  ACPI: dock: add uevent to indicate change in device status
  acpiphp: Link-time error for PCI Hotplug

Krzysztof Helt (1):
  [ARM] 4015/1: s3c2410 cpu ifdefs

Leigh Brown (2):
  [TCP]: Fix oops caused by tcp_v4_md5_do_del
  [TCP]: Trivial fix to message in tcp_v4_inbound_md5_hash

Len Brown (3):
  ACPI: dock: fix build warning
  ACPI: ibm_acpi: respond to workqueue update
  ACPI: fix git automerge failure

Lennert Buytenhek (6):
  [ARM] 4054/1: ep93xx: add HWCAP_CRUNCH
  [ARM] 4055/1: iop13xx: fix phys_io/io_pg_offst for iq81340mc/sc
  [ARM] 4056/1: iop13xx: fix resource.end off-by-one in flash setup
  [ARM] 4057/1: ixp23xx: unconditionally enable hardware coherency
  [ARM] 4061/1: xsc3: change of maintainer
  [ARM] 4060/1: update several ARM defconfigs

Leonid Arsh (1):
  IB/mthca: Add HCA profile module parameters

Li Yewang (1):
  [IPV4]: Fix BUG of ip_rt_send_redirect()

Linas Vepstas (2):
  [POWERPC] Fix PCI device channel state initialization
  rpaphp: compiler warning cleanup

Linus Torvalds (11):
  Remove stack unwinder for now
  Fix "delayed_work_pending()" macro expansion
  Fix incorrect user space access locking in mincore()
  Make workqueue bit operations work on "atomic_long_t"
  Fix up mm/mincore.c error value cases
  Clean up and make try_to_free_buffers() not race with dirty pages
  VM: Remove "clear_page_dirty()" and "test_clear_page_dirty()" functions
  Clean up and export cancel_dirty_page() to modules
  Fix reiserfs after "test_clear_page_dirty()" removal
  Fix up CIFS for "test_clear_page_dirty()" removal
  Linux 2.6.20-rc2

Maciej W. Rozycki (1):
  mips: if_fddi.h: Add a missing inclusion

Magnus Damm (1):
  fix vm_events_fold_cpu() build breakage

Marcel Holtmann (1):
  Call init_timer() for ISDN PPP CCP reset state timer

Mark Fasheh (1):
  Conditionally check expected_preempt_count in __resched_legal()

Martin Bligh (1):
  ACPI: avoid gcc warnings in ACPI mutex debug code

Martin Schwidefsky (1):
  [S390] update default configuration

Martin Waitz (1):
  kernel-doc: remove Martin from MAINTAINERS

Mattia Dongili (1):
  [CPUFREQ] set policy->curfreq on initialization

Michael Chan (7):
  [BNX2]: Fix panic in bnx2_tx_int().
  [BNX2]: Fix bug in bnx2_nvram_write().
  [BNX2]: Fix minor loopback problem.
  [TG3]: Assign tp->link_config.orig_* values.
  [TG3]: Fix race condition when calling register_netdev().
  [TG3]: Power down/up 5906 PHY correctly.
  [TG3]: Update version and reldate.

Michel Dänzer (2):
  i915_vblank_tasklet: Try harder to avoid tearing.
  drm: Unify radeon offset checking.

Michael Ellerman (6):
  PCI: Create __pci_bus_find_cap_start() from __pci_bus_find_cap()
  PCI: Add pci_find_ht_capability() for finding Hypertransport capabilities
  PCI: Use pci_find_ht_capability() in drivers/pci/htirq.c
  PCI: Add #defines for Hypertransport MSI fields
  PCI: Use pci_find_ht_capability() in drivers/pci/quirks.c
  PCI: Only check the HT capability bits in mpic.c

Michael Halcrow (1):
  fsstack: Remove inode copy

Michael Holzheu (3):
  [S390] Fix reboot hang on LPARs
  [S390] Fix reboot hang
  [S390] Save prefix register for dump on panic

Michael Riepe (3):
  KVM: Do not export unsupported msrs to userspace
  KVM: Force real-mode cs limit to 64K
  KVM: Handle p5 mce msrs

Mike Miller (2):
  cciss: set default raid level when reading geometry fails
  cciss: fix XFER_READ/XFER_WRITE in do_cciss_request

Miklos Szeredi (1):
  fuse: remove clear_page_dirty() call

NeilBrown (1):
  md: fix a few problems with the interface (sysfs and ioctl) to md

Nick Piggin (1):
  mm: more rmap debugging

Nickolay V. Shmyrev (1):
  [ALSA] snd_hda_intel 3stack mode for ASUS P5P-L2

Nigel Cunningham (1):
  Fix swapped parameters in mm/vmscan.c

OGAWA Hirofumi (1):
  arch/i386/pci/mmconfig.c tlb flush fix

Oleg Nesterov (1):
  sys_mincore: s/max/min/

Oliver Neukum (3):
  USB: fix transvibrator disconnect race
  USB: removing ifdefed code from gl620a
  USB: mutexification of usblp

Olivier Galibert (1):
  bluetooth: add support for another Kensington dongle

Patrick Caulfield (1):
  [DLM] fix compile warning

Paul Jackson (1):
  CONFIG_VM_EVENT_COUNTER comment decrustify

Paul Mackerras (2):
  [POWERPC] Fix register save area alignment for swapcontext syscall
  gxt4500: Fix colormap and PLL setting, support GXT6000P

Paul Moore (2):
  NetLabel: perform input validation earlier on CIPSOv4 DOI add ops
  NetLabel: correctly fill in unused CIPSOv4 level and category mappings

Pavel Machek (1):
  [ARM