Re: Linux 2.6.21-rc5

2007-03-28 Thread Tilman Schmidt
On 27.03.2007 08:17, Andrew Morton wrote:
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
[...]
> Maintainers are cc'ed.  Please promptly ack, nack or otherwise quack, else
> I'll be making my own decisions ;)

[CC list trimmed]

It's not on that list, but would you mind slipping
drivers-isdn-gigaset-mark-some-static-data-as-const-v2.patch
into 2.6.21 too? It's largely trivial but I'd like to get it
out of the door.

Thanks,
Tilman

-- 
Tilman Schmidt  E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.21-rc5

2007-03-28 Thread Linus Torvalds


On Wed, 28 Mar 2007, Andi Kleen wrote:
> 
> Can you test this patch please? 

This patch is totally broken.

> i386/x86-64: Convert nmi reservation to be global
> 
> It doesn't make much sense to have this per CPU, because all
> the services using NMIs run on all CPUs. So make it global.

NO!

If you do this, then you must make all *callers* be global too. But they 
aren't. Right now all callers do per-CPU setup!

See for example enable_lapic_nmi_watchdog():

	on_each_cpu(setup_apic_nmi_watchdog, NULL, 0, 1);

where "setup_apic_nmi_watchdog()" will call "setup_k7_watchdog()", which 
in turn will do a per-CPU reservation of the perfctr for the watchdog.

So I agree that it probably doesn't make sense to have NMI/perfctr 
reservation per-CPU, but you can't just change the reservation and ignore 
all the *users* of that reservation that assumed that it was per-CPU.

Is that code insane? Probably. But it probably also works. After your 
patch, one CPU will be able to reserve the NMI/perfctr thing (fine so far) 
but then all the other CPUs that try to do it will fail.

Linus
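
The failure mode described in this message can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code: `NR_CPUS`, `test_and_set_bit_sketch()`, `reserve_sketch()`, `reserve_per_cpu()` and `reserve_global()` are hypothetical names, and the bit operation here is a plain, non-atomic stand-in for the kernel's `test_and_set_bit()`. Each simulated CPU tries to reserve the same counter, the way the per-CPU callers do.

```c
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Userspace stand-in for test_and_set_bit(): sets the bit and returns
 * its previous value. Not atomic; this sketch is single-threaded. */
static bool test_and_set_bit_sketch(unsigned int bit, unsigned long *word)
{
	bool old = (*word >> bit) & 1UL;

	*word |= 1UL << bit;
	return old;
}

/* Same return convention as reserve_perfctr_nmi() in the patch:
 * 1 on success, 0 if the counter is already taken. */
static int reserve_sketch(unsigned int counter, unsigned long *owner)
{
	return !test_and_set_bit_sketch(counter, owner);
}

/* Every CPU reserves the counter in its own owner word.
 * Returns the number of CPUs whose reservation succeeded. */
int reserve_per_cpu(unsigned int counter)
{
	unsigned long owner[NR_CPUS] = { 0 };
	int cpu, ok = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		ok += reserve_sketch(counter, &owner[cpu]);
	return ok;
}

/* Same per-CPU callers, but one global owner word shared by all CPUs. */
int reserve_global(unsigned int counter)
{
	unsigned long owner = 0;
	int cpu, ok = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		ok += reserve_sketch(counter, &owner);
	return ok;
}
```

With per-CPU owner words every simulated CPU succeeds; with a single global word only the first caller does, which is the situation the per-CPU setup code would run into if only the reservation were made global.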


Re: Linux 2.6.21-rc5

2007-03-28 Thread Michal Piotrowski

On 28/03/07, Jiri Kosina <[EMAIL PROTECTED]> wrote:
> On Wed, 28 Mar 2007, Michal Piotrowski wrote:
>
> > BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
> > is fixed, thanks.
> > but I still get this
> > [  208.523901] =
> > [  208.529739] [ INFO: inconsistent lock state ]
> > [  208.534087] 2.6.21-rc5-g28defbea-dirty #131
> > [  208.538260] -
> > [  208.542611] inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
> > [  208.548600] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>
> Perhaps something like the one below?

Problem seems to be fixed. Thanks!

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


Re: Linux 2.6.21-rc5

2007-03-28 Thread Jiri Kosina
On Wed, 28 Mar 2007, Michal Piotrowski wrote:

> BUG: using smp_processor_id() in preemptible [0001] code: mount/7245 
> is fixed, thanks.
> but I still get this
> [  208.523901] =
> [  208.529739] [ INFO: inconsistent lock state ]
> [  208.534087] 2.6.21-rc5-g28defbea-dirty #131
> [  208.538260] -
> [  208.542611] inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
> [  208.548600] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:

Perhaps something like the one below?


From: Jiri Kosina <[EMAIL PROTECTED]>

oprofile: fix potential deadlock on oprofilefs_lock

nmi_cpu_setup() is called from hardirq context and acquires oprofilefs_lock.
alloc_event_buffer() and oprofilefs_ulong_from_user() acquire this lock
without disabling irqs, which could deadlock.

Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

 drivers/oprofile/event_buffer.c |    5 +++--
 drivers/oprofile/oprofilefs.c   |    5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/oprofile/event_buffer.c b/drivers/oprofile/event_buffer.c
index 00e937e..e7fbac5 100644
--- a/drivers/oprofile/event_buffer.c
+++ b/drivers/oprofile/event_buffer.c
@@ -70,11 +70,12 @@ void wake_up_buffer_waiter(void)
 int alloc_event_buffer(void)
 {
 	int err = -ENOMEM;
+	unsigned long flags;
 
-	spin_lock(&oprofilefs_lock);
+	spin_lock_irqsave(&oprofilefs_lock, flags);
 	buffer_size = fs_buffer_size;
 	buffer_watershed = fs_buffer_watershed;
-	spin_unlock(&oprofilefs_lock);
+	spin_unlock_irqrestore(&oprofilefs_lock, flags);
 
 	if (buffer_watershed >= buffer_size)
 		return -EINVAL;
diff --git a/drivers/oprofile/oprofilefs.c b/drivers/oprofile/oprofilefs.c
index 6e67b42..8543cb2 100644
--- a/drivers/oprofile/oprofilefs.c
+++ b/drivers/oprofile/oprofilefs.c
@@ -65,6 +65,7 @@ ssize_t oprofilefs_ulong_to_user(unsigned long val, char __user * buf, size_t co
 int oprofilefs_ulong_from_user(unsigned long * val, char const __user * buf, size_t count)
 {
 	char tmpbuf[TMPBUFSIZE];
+	unsigned long flags;
 
 	if (!count)
 		return 0;
@@ -77,9 +78,9 @@ int oprofilefs_ulong_from_user(unsigned long * val, char const __user * buf, siz
 	if (copy_from_user(tmpbuf, buf, count))
 		return -EFAULT;
 
-	spin_lock(&oprofilefs_lock);
+	spin_lock_irqsave(&oprofilefs_lock, flags);
 	*val = simple_strtoul(tmpbuf, NULL, 0);
-	spin_unlock(&oprofilefs_lock);
+	spin_unlock_irqrestore(&oprofilefs_lock, flags);
 	return 0;
 }
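
The deadlock this patch addresses can be modelled with a toy single-CPU simulation in userspace C. It is a sketch under stated assumptions rather than kernel code: `struct cpu_model`, `irq_arrives()`, `plain_lock_deadlocks()` and `irqsave_lock_deadlocks()` are hypothetical names, the lock and interrupt state are plain booleans, and "deadlock" is merely recorded instead of actually spinning. The interrupt handler stands in for a hardirq-context user of the lock such as nmi_cpu_setup().

```c
#include <stdbool.h>

/* Toy model of one CPU: a lock, the local interrupt-enable state, and a
 * marker that is set when a handler would spin forever. */
struct cpu_model {
	bool lock_held;		/* stands in for oprofilefs_lock */
	bool irqs_enabled;	/* local interrupt state */
	bool deadlocked;	/* handler found the lock held by its own CPU */
};

/* An interrupt whose handler takes the lock. It only runs while
 * interrupts are enabled; otherwise it stays pending. */
static void irq_arrives(struct cpu_model *cpu)
{
	if (!cpu->irqs_enabled)
		return;
	if (cpu->lock_held)
		cpu->deadlocked = true;	/* would spin on a lock that is never released */
}

/* Process context using the spin_lock() pattern: interrupts stay on. */
bool plain_lock_deadlocks(void)
{
	struct cpu_model cpu = { .irqs_enabled = true };

	cpu.lock_held = true;		/* spin_lock() */
	irq_arrives(&cpu);		/* interrupt hits inside the critical section */
	cpu.lock_held = false;		/* spin_unlock() */
	return cpu.deadlocked;
}

/* Process context using the spin_lock_irqsave() pattern. */
bool irqsave_lock_deadlocks(void)
{
	struct cpu_model cpu = { .irqs_enabled = true };
	bool flags;

	flags = cpu.irqs_enabled;	/* save interrupt state ... */
	cpu.irqs_enabled = false;	/* ... and disable interrupts */
	cpu.lock_held = true;
	irq_arrives(&cpu);		/* deferred: interrupts are off */
	cpu.lock_held = false;
	cpu.irqs_enabled = flags;	/* restore the saved state */
	irq_arrives(&cpu);		/* pending interrupt runs, lock is free */
	return cpu.deadlocked;
}
```

In this model the plain-lock variant reports a deadlock while the irqsave variant does not, because the handler can no longer run on the CPU that holds the lock.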



Re: Linux 2.6.21-rc5

2007-03-28 Thread Michal Piotrowski
Andi Kleen wrote:
> On Tuesday 27 March 2007 20:53, Michal Piotrowski wrote:
>> Linus Torvalds wrote:
>>> There's various fixes here, ranging from some architecture updates (ia64, 
>>> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
>>>
>>> And random one-liners.
>>>
>> I found this in mm snapshot
>> http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/1367.html
>> it's in mainline too.
>>
>> Andi, any progress with this bug?
> 
> Can you test this patch please? 
> 

BUG: using smp_processor_id() in preemptible [0001] code: mount/7245 
is fixed, thanks.


but I still get this

[  208.523901] =
[  208.529739] [ INFO: inconsistent lock state ]
[  208.534087] 2.6.21-rc5-g28defbea-dirty #131
[  208.538260] -
[  208.542611] inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
[  208.548600] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  208.553553]  (oprofilefs_lock){+-..}, at: [] nmi_cpu_setup+0x15/0x4f [oprofile]
[  208.561800] {hardirq-on-W} state was registered at:
[  208.55]   [] __lock_acquire+0x442/0xba1
[  208.571765]   [] lock_acquire+0x68/0x82
[  208.576519]   [] _spin_lock+0x35/0x42
[  208.581102]   [] oprofilefs_ulong_from_user+0x4e/0x74 [oprofile]
[  208.588026]   [] ulong_write_file+0x2a/0x38 [oprofile]
[  208.594084]   [] vfs_write+0xaf/0x138
[  208.598658]   [] sys_write+0x3d/0x61
[  208.603171]   [] syscall_call+0x7/0xb
[  208.607751]   [] 0x
[  208.611478] irq event stamp: 575782
[  208.614960] hardirqs last  enabled at (575781): [] default_idle+0x3e/0x59
[  208.622645] hardirqs last disabled at (575782): [] call_function_interrupt+0x29/0x38
[  208.631281] softirqs last  enabled at (575768): [] __do_softirq+0xe4/0xea
[  208.638965] softirqs last disabled at (575759): [] do_softirq+0x64/0xd1
[  208.646478]
[  208.646479] other info that might help us debug this:
[  208.653003] no locks held by swapper/0.
[  208.656832]
[  208.656833] stack backtrace:
[  208.661199]  [] show_trace_log_lvl+0x1a/0x2f
[  208.666350]  [] show_trace+0x12/0x14
[  208.670811]  [] dump_stack+0x16/0x18
[  208.675272]  [] print_usage_bug+0x140/0x14a
[  208.680336]  [] mark_lock+0xa1/0x40b
[  208.684796]  [] __lock_acquire+0x3b3/0xba1
[  208.689775]  [] lock_acquire+0x68/0x82
[  208.694410]  [] _spin_lock+0x35/0x42
[  208.698869]  [] nmi_cpu_setup+0x15/0x4f [oprofile]
[  208.704540]  [] smp_call_function_interrupt+0x3a/0x56
[  208.710470]  [] call_function_interrupt+0x33/0x38
[  208.716053]  [] cpu_idle+0xb6/0xeb
[  208.720342]  [] start_secondary+0x333/0x33b
[  208.725407]  [<>] 0x0
[  208.728397]  ===


Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


Re: Linux 2.6.21-rc5

2007-03-28 Thread Andi Kleen
On Tuesday 27 March 2007 20:53, Michal Piotrowski wrote:
> Linus Torvalds wrote:
> > There's various fixes here, ranging from some architecture updates (ia64, 
> > ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
> > 
> > And random one-liners.
> > 
> 
> I found this in mm snapshot
> http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/1367.html
> it's in mainline too.
> 
> Andi, any progress with this bug?

Can you test this patch please? 

-Andi

i386/x86-64: Convert nmi reservation to be global

It doesn't make much sense to have this per CPU, because all
the services using NMIs run on all CPUs. So make it global.

This also fixes a warning about unprotected use of smp_processor_id
on preemptible kernels.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux/arch/i386/kernel/nmi.c
===
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -41,8 +41,8 @@ int nmi_watchdog_enabled;
  *   different subsystems this reservation system just tries to coordinate
  *   things a little
  */
-static DEFINE_PER_CPU(unsigned long, perfctr_nmi_owner);
-static DEFINE_PER_CPU(unsigned long, evntsel_nmi_owner[3]);
+static unsigned long perfctr_nmi_owner;
+static unsigned long evntsel_nmi_owner[3];
 
 static cpumask_t backtrace_mask = CPU_MASK_NONE;
 
@@ -124,7 +124,7 @@ int avail_to_resrv_perfctr_nmi_bit(unsig
 {
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }
 
 /* checks the an msr for availability */
@@ -135,7 +135,7 @@ int avail_to_resrv_perfctr_nmi(unsigned 
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }
 
 int reserve_perfctr_nmi(unsigned int msr)
@@ -145,7 +145,7 @@ int reserve_perfctr_nmi(unsigned int msr
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	if (!test_and_set_bit(counter, &__get_cpu_var(perfctr_nmi_owner)))
+	if (!test_and_set_bit(counter, &perfctr_nmi_owner))
 		return 1;
 	return 0;
 }
@@ -157,7 +157,7 @@ void release_perfctr_nmi(unsigned int ms
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	clear_bit(counter, &__get_cpu_var(perfctr_nmi_owner));
+	clear_bit(counter, &perfctr_nmi_owner);
 }
 
 int reserve_evntsel_nmi(unsigned int msr)
@@ -167,7 +167,7 @@ int reserve_evntsel_nmi(unsigned int msr
 	counter = nmi_evntsel_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	if (!test_and_set_bit(counter, &__get_cpu_var(evntsel_nmi_owner)[0]))
+	if (!test_and_set_bit(counter, &evntsel_nmi_owner[0]))
 		return 1;
 	return 0;
 }
@@ -179,7 +179,7 @@ void release_evntsel_nmi(unsigned int ms
 	counter = nmi_evntsel_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	clear_bit(counter, &__get_cpu_var(evntsel_nmi_owner)[0]);
+	clear_bit(counter, &evntsel_nmi_owner[0]);
 }
 
 static __cpuinit inline int nmi_known_cpu(void)
Index: linux/arch/x86_64/kernel/nmi.c
===
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -39,8 +39,8 @@ int panic_on_unrecovered_nmi;
  *   different subsystems this reservation system just tries to coordinate
  *   things a little
  */
-static DEFINE_PER_CPU(unsigned, perfctr_nmi_owner);
-static DEFINE_PER_CPU(unsigned, evntsel_nmi_owner[2]);
+static unsigned perfctr_nmi_owner;
+static unsigned evntsel_nmi_owner[2];
 
 static cpumask_t backtrace_mask = CPU_MASK_NONE;
 
@@ -110,7 +110,7 @@ int avail_to_resrv_perfctr_nmi_bit(unsig
 {
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }
 
 /* checks the an msr for availability */
@@ -121,7 +121,7 @@ int avail_to_resrv_perfctr_nmi(unsigned 
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	return (!test_bit(counter, &__get_cpu_var(perfctr_nmi_owner)));
+	return (!test_bit(counter, &perfctr_nmi_owner));
 }
 
 int reserve_perfctr_nmi(unsigned int msr)
@@ -131,7 +131,7 @@ int reserve_perfctr_nmi(unsigned int msr
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_COUNTER_BITS);
 
-	if (!test_and_set_bit(counter, &__get_cpu_var(perfctr_nmi_owner)))
+	if (!test_and_set_bit(counter, &perfctr_nmi_owner))
 		return 1;
 	return 0;
 }
@@ -143,7 +143,7 @@ void release_perfctr_nmi(unsigned int ms
 	counter = nmi_perfctr_msr_to_bit(msr);
 	BUG_ON(counter > NMI_MAX_

Re: ATA ACPI (was Re: Linux 2.6.21-rc5)

2007-03-28 Thread Tejun Heo
Pavel Machek wrote:
> Hi!
> 
>>>> So if you have reported a regression in the 2.6.21-rc 
>>>> series, please check 2.6.21-rc5, and update your 
>>>> report as appropriate (whether fixed or "still 
>>>> problems with xyzzy").
>>> [just got back from vacation, or would have sent this 
>>> earlier]
>>>
>>> FWIW, I'm still leaning towards disabling libata ACPI 
>>> support by default for 2.6.21.
>>>
>>> Upstream has Alan's fix for the worst PATA problems, 
>>> but for different reasons, I think PATA ACPI and SATA 
>>> ACPI support in libata does not feel quite ready for 
>>> prime time in 2.6.21.
>>>
>>> Scream now, or hold your peace until 2.6.22... :)
>> I second disabling ACPI for 2.6.21.
> 
> Ugh.. does that mean we'll have 'regression reports' as in 'it worked
> ok in -rc5, broken in final'?
> 
> Well, suspend is currently so broken that we'll be flooded by reports
> anyway, but could we at least get a define in the code so that we can
> tell users to flip it?

Only the default value of libata.noacpi is changed to 1, so users can
easily re-enable it by passing a boot/module parameter.

-- 
tejun
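
The effect of flipping the default can be sketched as a small userspace C function. This is only an illustration and not how the kernel parses module parameters: `NOACPI_DEFAULT` and `effective_noacpi()` are hypothetical names, and the lookup is a deliberately naive substring search over a boot command line. The only detail taken from the thread is the parameter name libata.noacpi and its proposed default of 1.

```c
#include <stdlib.h>
#include <string.h>

/* Default proposed in the thread: libata ACPI support is off unless the
 * user explicitly asks for it. */
#define NOACPI_DEFAULT 1

/* Returns the effective noacpi value (0 or 1) for a boot command line.
 * Simplified: looks for the first "libata.noacpi=" substring and parses
 * the integer that follows it. */
int effective_noacpi(const char *cmdline)
{
	static const char key[] = "libata.noacpi=";
	const char *p = strstr(cmdline, key);

	if (!p)
		return NOACPI_DEFAULT;
	return atoi(p + sizeof(key) - 1) != 0;
}
```

Under these assumptions, a command line without the parameter leaves ACPI support disabled, and one containing libata.noacpi=0 turns it back on.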


Re: ATA ACPI (was Re: Linux 2.6.21-rc5)

2007-03-28 Thread Pavel Machek
Hi!

> >>So if you have reported a regression in the 2.6.21-rc 
> >>series, please check 2.6.21-rc5, and update your 
> >>report as appropriate (whether fixed or "still 
> >>problems with xyzzy").
> >
> >[just got back from vacation, or would have sent this 
> >earlier]
> >
> >FWIW, I'm still leaning towards disabling libata ACPI 
> >support by default for 2.6.21.
> >
> >Upstream has Alan's fix for the worst PATA problems, 
> >but for different reasons, I think PATA ACPI and SATA 
> >ACPI support in libata does not feel quite ready for 
> >prime time in 2.6.21.
> >
> >Scream now, or hold your peace until 2.6.22... :)
> 
> I second disabling ACPI for 2.6.21.

Ugh.. does that mean we'll have 'regression reports' as in 'it worked
ok in -rc5, broken in final'?

Well, suspend is currently so broken that we'll be flooded by reports
anyway, but could we at least get a define in the code so that we can
tell users to flip it?

Or maybe it is enough to make libata dependent on EXPERIMENTAL?
...making it dependent on BROKEN should definitely be enough...

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Linux 2.6.21-rc5

2007-03-27 Thread Michal Piotrowski
Pavel Machek wrote:
> Hi!
>>> There's various fixes here, ranging from some architecture updates (ia64,
>>> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
>>>
>> Suspend to disk doesn't work for me with this patch. It hangs after
>> PM: Preparing devices for restore.
>> Suspending console(s)
>> during resuming.
>>
>> a504e64ab42bcc27074ea37405d06833ed6e0820 is first bad commit
>> commit a504e64ab42bcc27074ea37405d06833ed6e0820
>> Author: Stephen Hemminger <[EMAIL PROTECTED]>
>> Date:   Fri Feb 2 08:22:53 2007 -0800
>>
>>skge: WOL support
>>
>>Add WOL support for Yukon chipsets in skge device.
>>
>>Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>>Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
>>
>> :040000 040000 d3d4335e6cba330b7880b4787fbe48733e69f8fc 5845b004228d811de912a55da6a7843b72f23f81 M  drivers
>>
>> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2
> 
> Do you use skge as your network device?

Yes, I have a Marvell based onboard NIC.

02:05.0 Ethernet controller: 3Com Corporation 3c940 10/100/1000Base-T [Marvell] (rev 12)
Subsystem: ASUSTeK Computer Inc. A7V600/P4P800/K8V motherboard
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR-Pavel

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


Re: Linux 2.6.21-rc5

2007-03-27 Thread Pavel Machek
Hi!
> >There's various fixes here, ranging from some architecture updates (ia64,
> >ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
> >
> 
> Suspend to disk doesn't work for me with this patch. It hangs after
> PM: Preparing devices for restore.
> Suspending console(s)
> during resuming.
> 
> a504e64ab42bcc27074ea37405d06833ed6e0820 is first bad commit
> commit a504e64ab42bcc27074ea37405d06833ed6e0820
> Author: Stephen Hemminger <[EMAIL PROTECTED]>
> Date:   Fri Feb 2 08:22:53 2007 -0800
> 
>skge: WOL support
> 
>Add WOL support for Yukon chipsets in skge device.
> 
>Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
> 
> :040000 040000 d3d4335e6cba330b7880b4787fbe48733e69f8fc 5845b004228d811de912a55da6a7843b72f23f81 M  drivers
> 
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2

Do you use skge as your network device?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Linux 2.6.21-rc5

2007-03-27 Thread Michal Piotrowski
Linus Torvalds wrote:
> There's various fixes here, ranging from some architecture updates (ia64, 
> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
> 
> And random one-liners.
> 

I found this in mm snapshot
http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/1367.html
it's in mainline too.

Andi, any progress with this bug?

BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
BUG: using smp_processor_id() in preemptible [0001] code: mount/7245
caller is avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] debug_smp_processor_id+0xa2/0xb4
 [] avail_to_resrv_perfctr_nmi_bit+0x1a/0x32
 [] nmi_create_files+0x2a/0x10e [oprofile]
 [] oprofile_create_files+0xe6/0xec [oprofile]
 [] oprofilefs_fill_super+0x78/0x7e [oprofile]
 [] get_sb_single+0x46/0x8c
 [] oprofilefs_get_sb+0x1c/0x1e [oprofile]
 [] vfs_kern_mount+0x81/0xf1
 [] do_kern_mount+0x30/0x42
 [] do_mount+0x601/0x678
 [] sys_mount+0x6f/0xa4
 [] syscall_call+0x7/0xb
 ===
SELinux: initialized (dev oprofilefs, type oprofilefs), uses genfs_contexts

=
[ INFO: inconsistent lock state ]
2.6.21-rc5-gd4590940-dirty #128
-
inconsistent {hardirq-on-W} -> {in-hardirq-W} usage.
firefox-bin/3542 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (oprofilefs_lock){+-..}, at: [] nmi_cpu_setup+0x15/0x4f [oprofile]
{hardirq-on-W} state was registered at:
  [] __lock_acquire+0x442/0xba1
  [] lock_acquire+0x68/0x82
  [] _spin_lock+0x35/0x42
  [] oprofilefs_ulong_from_user+0x4e/0x74 [oprofile]
  [] ulong_write_file+0x2a/0x38 [oprofile]
  [] vfs_write+0xaf/0x138
  [] sys_write+0x3d/0x61
  [] syscall_call+0x7/0xb
  [] 0x
irq event stamp: 50464270
hardirqs last  enabled at (50464269): [] syscall_exit_work+0x11/0x26
hardirqs last disabled at (50464270): [] call_function_interrupt+0x29/0x38
softirqs last  enabled at (50462522): [] __do_softirq+0xe4/0xea
softirqs last disabled at (50462515): [] do_softirq+0x64/0xd1

other info that might help us debug this:
no locks held by firefox-bin/3542.

stack backtrace:
 [] show_trace_log_lvl+0x1a/0x2f
 [] show_trace+0x12/0x14
 [] dump_stack+0x16/0x18
 [] print_usage_bug+0x140/0x14a
 [] mark_lock+0xa1/0x40b
 [] __lock_acquire+0x3b3/0xba1
 [] lock_acquire+0x68/0x82
 [] _spin_lock+0x35/0x42
 [] nmi_cpu_setup+0x15/0x4f [oprofile]
 [] smp_call_function_interrupt+0x3a/0x56
 [] call_function_interrupt+0x33/0x38
 ===

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-dmesg

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

Re: ATA ACPI (was Re: Linux 2.6.21-rc5)

2007-03-27 Thread Jeff Garzik

Linus Torvalds wrote:
> On Tue, 27 Mar 2007, Jeff Garzik wrote:
>> FWIW, I'm still leaning towards disabling libata ACPI support by default for
>> 2.6.21.
>
> Hey, I'm not going to argue against anything that says "disable ACPI". Of 
> *course* it should be disabled if there aren't thousands of machines that 
> are in user hands that actually need it (and none that regress).

It's required to access data at all (BIOS-supplied password [un]locks 
disk), in a small minority of configurations.  It's strongly suggested 
for reliable suspend/resume, particularly on laptops, where libata ACPI 
support fixes some suspend/resume problems.

Some BIOSen also want to apply drive+board-specific errata workarounds. 
That's OK, but ideally we should know about those in the kernel.

"none that regress" is the problem though.  Buggy tables, unexercised 
ACPI code paths, and in a few cases unexpected post-ACPI 
drive/controller behavior expose regressions.

> Anybody want to send me a patch?

Since everybody is OK with my plan, I'll send one today along with the 
rest of the post-vacation 2.6.21-rc bug fixes.

Jeff




Re: Linux 2.6.21-rc5

2007-03-27 Thread Michal Piotrowski

Hi,

On 26/03/07, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> There's various fixes here, ranging from some architecture updates (ia64,
> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.

Suspend to disk doesn't work for me with this patch. It hangs after
PM: Preparing devices for restore.
Suspending console(s)
during resuming.

a504e64ab42bcc27074ea37405d06833ed6e0820 is first bad commit
commit a504e64ab42bcc27074ea37405d06833ed6e0820
Author: Stephen Hemminger <[EMAIL PROTECTED]>
Date:   Fri Feb 2 08:22:53 2007 -0800

   skge: WOL support

   Add WOL support for Yukon chipsets in skge device.

   Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
   Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>

:040000 040000 d3d4335e6cba330b7880b4787fbe48733e69f8fc 5845b004228d811de912a55da6a7843b72f23f81 M  drivers

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/git-config2

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)


Re: ATA ACPI (was Re: Linux 2.6.21-rc5)

2007-03-27 Thread Linus Torvalds


On Tue, 27 Mar 2007, Jeff Garzik wrote:
> 
> FWIW, I'm still leaning towards disabling libata ACPI support by default for
> 2.6.21.

Hey, I'm not going to argue against anything that says "disable ACPI". Of 
*course* it should be disabled if there aren't thousands of machines that 
are in user hands that actually need it (and none that regress).

Anybody want to send me a patch?

Linus


Re: Linux 2.6.21-rc5

2007-03-27 Thread Jesse Barnes
On Monday, March 26, 2007 11:20 pm Greg KH wrote:
> Already in Linus's tree.
>
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sysfs-rom-file-creation-for-bios-rom-shadows.patch
>
> I'd prefer to wait until 2.6.22 for this one, I've had too many odd
> reports of problems in this area, and since no one has reported this
> issue, it's not a real rush at all.

Yeah, I don't think this one is critical.  These files aren't in heavy 
use yet, so fixing this in 2.6.22 should be ok.  I've only heard one 
complaint about this bug so far, and that was caused by some code still 
in development.

Jesse


Re: Linux 2.6.21-rc5

2007-03-27 Thread Andrew Morton
On Tue, 27 Mar 2007 14:25:07 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote:

> On Tuesday 27 March 2007 08:17, Andrew Morton wrote:
> > 
> > I have a few fixes here which belong to subsystem trees, which were missed
> > by the maintainers and which we probably want to get into 2.6.21.
> > 
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/make-aout-executables-work-again.patch
> 
> This is already fixed in a different way.
> 

oh yeah.  Did that fix make it into 2.6.20.x?

I think we decided that make-aout-executables-work-again.patch might still
be a desirable thing to have, but I don't recall the reasoning for that.

Anyway, if it doesn't fix a bug it is nowhere near a high-priority patch
for that seething bugfest which we like to call a kernel, so I'll drop it.


Re: Linux 2.6.21-rc5

2007-03-27 Thread Dmitry Torokhov

On 3/27/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sudden-warps-in-mousedev.patch

A slightly different fix is already in the input tree. I'd prefer waiting
for 2.6.22 as this only affects touchpads when not using the synaptics X
driver (which most distributions use by default).

--
Dmitry


Re: Linux 2.6.21-rc5

2007-03-27 Thread Andi Kleen
On Tuesday 27 March 2007 08:17, Andrew Morton wrote:
> 
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
> 
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/make-aout-executables-work-again.patch

This is already fixed in a different way.

-Andi




Re: Linux 2.6.21-rc5

2007-03-27 Thread Takashi Iwai
At Mon, 26 Mar 2007 22:17:31 -0800,
Andrew Morton wrote:
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/revert-ac97-fix-microphone-and-line_in-selection-logic.patch

The better fix is already in rc5, so please drop this one from your
tree.

c26a8de23a4417f556250c4c099b048b26c430be
[ALSA] ac97 - fix AD shared shared jack control logic


thanks,

Takashi


Re: Linux 2.6.21-rc5

2007-03-26 Thread Greg KH
On Mon, Mar 26, 2007 at 10:17:31PM -0800, Andrew Morton wrote:
> 
> I have a few fixes here which belong to subsystem trees, which were missed
> by the maintainers and which we probably want to get into 2.6.21.
> 
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/pci-set-pci=bfsort-for-poweredge-r900.patch

Already in Linus's tree.

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sysfs-rom-file-creation-for-bios-rom-shadows.patch

I'd prefer to wait until 2.6.22 for this one, I've had too many odd
reports of problems in this area, and since no one has reported this
issue, it's not a real rush at all.

thanks,

greg k-h


Re: Linux 2.6.21-rc5

2007-03-26 Thread Andrew Morton

I have a few fixes here which belong to subsystem trees, which were missed
by the maintainers and which we probably want to get into 2.6.21.


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/make-aout-executables-work-again.patch

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/revert-ac97-fix-microphone-and-line_in-selection-logic.patch

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/drivers-mfd-sm501c-fix-an-off-by-one.patch

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/pci-set-pci=bfsort-for-poweredge-r900.patch

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sudden-warps-in-mousedev.patch

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm2/broken-out/fix-sysfs-rom-file-creation-for-bios-rom-shadows.patch

Maintainers are cc'ed.  Please promptly ack, nack or otherwise quack, else
I'll be making my own decisions ;)



Re: ATA ACPI (was Re: Linux 2.6.21-rc5)

2007-03-26 Thread Tejun Heo

Jeff Garzik wrote:
> Linus Torvalds wrote:
>> There's various fixes here, ranging from some architecture updates
>> (ia64, ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
>>
>> And random one-liners.
>>
>> But probably more important, and likely much more visible to most
>> people is the fixes for the fallout from the hrtimers and no-HZ
>> changes, and some of the ACPI regressions.
>>
>> Those timer changes ended up much more painful than anybody wished
>> for, but big thanks to Thomas Gleixner for being on it like a weasel
>> on a dead rat, and the regression list has kept shrinking.
>>
>> So if you have reported a regression in the 2.6.21-rc series, please
>> check 2.6.21-rc5, and update your report as appropriate (whether fixed
>> or "still problems with xyzzy").
>>
>> Linus
>
> [just got back from vacation, or would have sent this earlier]
>
> FWIW, I'm still leaning towards disabling libata ACPI support by default
> for 2.6.21.
>
> Upstream has Alan's fix for the worst PATA problems, but for different
> reasons, I think PATA ACPI and SATA ACPI support in libata does not feel
> quite ready for prime time in 2.6.21.
>
> Scream now, or hold your peace until 2.6.22... :)

I second disabling ACPI for 2.6.21.

--
tejun


ATA ACPI (was Re: Linux 2.6.21-rc5)

2007-03-26 Thread Jeff Garzik

Linus Torvalds wrote:
> There's various fixes here, ranging from some architecture updates (ia64,
> ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.
>
> And random one-liners.
>
> But probably more important, and likely much more visible to most people
> is the fixes for the fallout from the hrtimers and no-HZ changes, and some
> of the ACPI regressions.
>
> Those timer changes ended up much more painful than anybody wished for,
> but big thanks to Thomas Gleixner for being on it like a weasel on a dead
> rat, and the regression list has kept shrinking.
>
> So if you have reported a regression in the 2.6.21-rc series, please check
> 2.6.21-rc5, and update your report as appropriate (whether fixed or "still
> problems with xyzzy").
>
> Linus

[just got back from vacation, or would have sent this earlier]

FWIW, I'm still leaning towards disabling libata ACPI support by default
for 2.6.21.

Upstream has Alan's fix for the worst PATA problems, but for different
reasons, I think PATA ACPI and SATA ACPI support in libata does not feel
quite ready for prime time in 2.6.21.

Scream now, or hold your peace until 2.6.22... :)

Jeff




Re: Linux 2.6.21-rc5

2007-03-26 Thread Ayaz Abdulla
This issue might be resolved with the patch provided in the following 
bug report: http://bugzilla.kernel.org/show_bug.cgi?id=8058


Please try out the patch in the bug report without your patch and see if 
the issue reproduces.


Ayaz


Ingo Molnar wrote:

* Linus Torvalds <[EMAIL PROTECTED]> wrote:


There's various fixes here, ranging from some architecture updates 
(ia64, ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.



here's a new v2.6.20 -> v2.6.21 forcedeth.c regression:

in the last week or so i've been seeing sporadic under-load forcedeth.c 
crashes (see the full oops further below):


 eth1: too many iterations (6) in nv_nic_irq.
 Unable to handle kernel NULL pointer dereference at 0088 RIP: 
 [] nv_tx_done+0xf4/0x1cf


this is line 1906 of drivers/net/forcedeth.c:

np->stats.tx_bytes += np->get_tx_ctx->skb->len;

struct sk_buff's len field is at offset 0x88, so np->get_tx_ctx->skb is
NULL. That is an 'impossible' scenario for tx descriptors here - the tx 
ring descriptors are always set up with a valid skb (and a valid dma 
address), and their completion is serialized via np->lock.


these crashes are almost instant on the .21-rc5-rt kernel, but extremely 
sporadic on the upstream kernel and needed very high networking loads to 
trigger. Today i found a good way to trigger it almost instantly on 
upstream kernels too: apply the debug patch attached further below and 
do:


echo 100 > /proc/sys/kernel/panic

that will inject 100 artificial 'too many iterations' failures and
provoke a TX timeout - and that TX timeout will crash. (i've used a
dual-core Athlon64 system in this test)


my first quick guess was to extend np->lock locking to the whole of
nv_start_xmit/nv_start_xmit_optimized - while that appeared to make the
crash a bit less likely, it did not prevent it. So there must be some
other, more fundamental problem left as well. At first glance the SMP
locking looks OK, so maybe the ring indices are messed up somehow and we
got into a 'ring head bites the tail' scenario?


i can provide more info if needed.

Ingo

-->
eth1: too many iterations (6) in nv_nic_irq.
Unable to handle kernel NULL pointer dereference at 0088 RIP: 
 [] nv_tx_done+0xf4/0x1cf
PGD 34d03067 PUD 34d02067 PMD 0 
Oops:  [1] PREEMPT SMP 
CPU 1 
Modules linked in:

Pid: 0, comm: swapper Not tainted 2.6.21-rc5 #8
RIP: 0010:[]  [] nv_tx_done+0xf4/0x1cf
RSP: 0018:81003ff6be40  EFLAGS: 00010206
RAX:  RBX: 810002e26700 RCX: 0001
RDX: 0042 RSI: 3ef00cbe RDI: 81003fbeb070
RBP: 81003ff6be60 R08: 810002e26a00 R09: 0003
R10: 81003ff4e100 R11: 810001e283f8 R12: 3ef00cbe
R13: 810002e26000 R14: 810002e28fc0 R15: 
FS:  2b6cb57f1db0() GS:81003ff4ad40() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0088 CR3: 34c87000 CR4: 06e0
Process swapper (pid: 0, threadinfo 81003ff64000, task 81003ff4e100)
Stack:  810002e26700 0032 c201a000 810002e26000
 81003ff6bea0 80406dae 810002e26700 810002e26700
 810002e26000 00ff c201a000 80749080
Call Trace:
   [] nv_nic_irq+0x76/0x261
 [] nv_do_nic_poll+0x200/0x284
 [] nv_do_nic_poll+0x0/0x284
 [] run_timer_softirq+0x167/0x1dd
 [] __do_softirq+0x5b/0xc9
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x31/0x84
 [] irq_exit+0x3f/0x50
 [] smp_apic_timer_interrupt+0x49/0x5b
 [] default_idle+0x0/0x44
 [] apic_timer_interrupt+0x66/0x70
   [] default_idle+0x2f/0x44
 [] enter_idle+0x22/0x24
 [] cpu_idle+0x91/0xd4
 [] start_secondary+0x2e3/0x2f5

---
 drivers/net/forcedeth.c |   20 
 1 file changed, 20 insertions(+)

Index: linux/drivers/net/forcedeth.c
===
--- linux.orig/drivers/net/forcedeth.c
+++ linux/drivers/net/forcedeth.c
@@ -2908,6 +2908,10 @@ static irqreturn_t nv_nic_irq(int foo, v
spin_unlock(&np->lock);
break;
}
+   if (panic_timeout > 0) {
+   panic_timeout--;
+   i = max_interrupt_work+1;
+   }
if (unlikely(i > max_interrupt_work)) {
spin_lock(&np->lock);
/* disable interrupts on the nic */
@@ -3026,6 +3030,10 @@ static irqreturn_t nv_nic_irq_optimized(
break;
}
 
+   if (panic_timeout > 0) {
+   panic_timeout--;
+   i = max_interrupt_work+1;
+   }
if (unlikely(i > max_interrupt_work)) {
spin_lock(&np->lock);
/* disable interrupts on the nic */
@@ -3076,6 +3084,10 @@ static irqreturn_t nv_nic_irq_tx(int foo
 

Re: Linux 2.6.21-rc5

2007-03-26 Thread Thomas Gleixner
On Mon, 2007-03-26 at 07:25 -0500, Bob Tracy wrote:
> Thomas Gleixner wrote:
> > This fix from John Stultz is still missing:
> > 
> > http://lkml.org/lkml/2007/3/22/287
> > 
> > It's in Andrew's queue already and is waiting to be sent to you.
> 
> In summary, that fix is a workaround to allow the acpi_pm clocksource
> to be selected instead of the pit clocksource, thereby allowing my
> Dell laptop with the PIIX4 bug to boot.  Other apic, clocksource, etc.
> patches that were included in -rc5 fixed the problem that caused the
> boot process to hang when the pit clocksource was selected, as I
> suspected would be the case :-).

Ah. Ok

> Per John's message in the above URL, while the fix is no longer needed
> for allowing the laptop to boot, it's probably still "a good thing" to
> allow a better clocksource to be selected.

Yes. The read-three-times pmtimer is faster and more reliable than the
PIT.
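
To illustrate what "read three times" means here, a minimal sketch, assuming a counter that can occasionally return a bogus value: read it three times and retry until the three samples are in order. This is a simplified model and not the kernel's acpi_pm code; it ignores counter wraparound, and `read_verified`, `read_fn`, `struct scripted` and `scripted_read` are hypothetical names used only for the demonstration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw timer read; a real driver would read a hardware port. */
typedef uint32_t (*read_fn)(void *ctx);

/* Sample the counter three times and retry until the samples are
 * non-decreasing, then return the middle one. Wraparound is ignored
 * in this simplified model. */
static uint32_t read_verified(read_fn read_raw, void *ctx)
{
    uint32_t v1, v2, v3;

    do {
        v1 = read_raw(ctx);
        v2 = read_raw(ctx);
        v3 = read_raw(ctx);
    } while (!(v1 <= v2 && v2 <= v3));

    return v2;
}

/* Demo source: replays a fixed list of values, repeating the last one. */
struct scripted {
    const uint32_t *vals;
    size_t n;
    size_t pos;
};

static uint32_t scripted_read(void *ctx)
{
    struct scripted *s = ctx;
    uint32_t v = s->vals[s->pos];

    if (s->pos + 1 < s->n)
        s->pos++;
    return v;
}
```

Feeding `scripted_read` a sequence with one glitched sample makes `read_verified` discard the first triple and return the middle value of the next consistent one. The cost is at least three reads per call, which is the overhead being weighed against the PIT above.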

tglx





Re: Linux 2.6.21-rc5

2007-03-26 Thread Bob Tracy
Thomas Gleixner wrote:
> This fix from John Stultz is still missing:
> 
> http://lkml.org/lkml/2007/3/22/287
> 
> It's in Andrew's queue already and is waiting to be sent to you.

In summary, that fix is a workaround to allow the acpi_pm clocksource
to be selected instead of the pit clocksource, thereby allowing my
Dell laptop with the PIIX4 bug to boot.  Other apic, clocksource, etc.
patches that were included in -rc5 fixed the problem that caused the
boot process to hang when the pit clocksource was selected, as I
suspected would be the case :-).

Per John's message in the above URL, while the fix is no longer needed
for allowing the laptop to boot, it's probably still "a good thing" to
allow a better clocksource to be selected.

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---


Re: Linux 2.6.21-rc5

2007-03-26 Thread Ingo Molnar

* Linus Torvalds <[EMAIL PROTECTED]> wrote:

> There's various fixes here, ranging from some architecture updates 
> (ia64, ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.

here's a new v2.6.20 -> v2.6.21 forcedeth.c regression:

in the last week or so i've been seeing sporadic under-load forcedeth.c 
crashes (see the full oops further below):

 eth1: too many iterations (6) in nv_nic_irq.
 Unable to handle kernel NULL pointer dereference at 0088 RIP: 
 [] nv_tx_done+0xf4/0x1cf

this is line 1906 of drivers/net/forcedeth.c:

np->stats.tx_bytes += np->get_tx_ctx->skb->len;

struct sk_buff's len field is at offset 0x88, so np->get_tx_ctx->skb is
NULL. That is an 'impossible' scenario for tx descriptors here - the tx 
ring descriptors are always set up with a valid skb (and a valid dma 
address), and their completion is serialized via np->lock.
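
The reasoning from fault address to NULL pointer can be sketched in plain C. This is only an illustration: `struct fake_skb` is a hypothetical stand-in padded so that `len` lands at 0x88 to match the oops, and the real sk_buff layout depends on kernel version and config.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff. The padding is chosen only so
 * that 'len' sits at offset 0x88, matching the fault address in the oops. */
struct fake_skb {
    unsigned char other_fields[0x88];
    unsigned int len;
};

/* Address the CPU touches when it loads skb->len through the pointer
 * value 'skb': the base pointer plus the member offset. */
static uintptr_t len_access_address(uintptr_t skb)
{
    return skb + offsetof(struct fake_skb, len);
}
```

With a NULL base, `len_access_address(0)` is 0x88, which is how a small non-zero fault address points at a member access through a NULL struct pointer.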

these crashes are almost instant on the .21-rc5-rt kernel, but extremely 
sporadic on the upstream kernel and needed very high networking loads to 
trigger. Today i found a good way to trigger it almost instantly on 
upstream kernels too: apply the debug patch attached further below and 
do:

echo 100 > /proc/sys/kernel/panic

that will inject 100 artificial 'too many iterations' failures and
provoke a TX timeout - and that TX timeout will crash. (i've used a
dual-core Athlon64 system in this test)
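
The injection trick can be modelled outside the kernel as a countdown that forces the failure branch while budget remains. A minimal sketch, assuming hypothetical names: `inject_budget` plays the role of panic_timeout and `max_work` the role of max_interrupt_work.

```c
/* Hypothetical user-space model of the debug patch: a global countdown
 * that forces the "too many iterations" path until it reaches zero. */
static int inject_budget;

/* Equivalent in spirit to writing a number into the sysctl. */
static void set_inject_budget(int n)
{
    inject_budget = n;
}

/* Returns 1 when the error path should be taken for a loop that has
 * done 'iterations' passes, 0 otherwise. While budget remains, the
 * iteration count is overridden so the check always fails. */
static int too_many_iterations(int iterations, int max_work)
{
    if (inject_budget > 0) {
        inject_budget--;
        iterations = max_work + 1;   /* force the failure path */
    }
    return iterations > max_work;
}
```

After `set_inject_budget(100)` the next 100 calls report failure regardless of the real count, and normal behaviour resumes once the budget is used up. Reusing an existing sysctl variable as the counter avoids adding any new interface for a throwaway test.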

my first quick guess was to extend np->lock locking to the whole of
nv_start_xmit/nv_start_xmit_optimized - while that appeared to make the
crash a bit less likely, it did not prevent it. So there must be some
other, more fundamental problem left as well. At first glance the SMP
locking looks OK, so maybe the ring indices are messed up somehow and we
got into a 'ring head bites the tail' scenario?
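
A 'ring head bites the tail' situation can be shown with a generic index-based ring. This is a sketch under stated assumptions, not forcedeth's code: `struct ring` and its helpers are hypothetical, and the point is only that a producer which advances by a full ring's worth of entries ends up with put == get, which is indistinguishable from an empty ring.

```c
#include <stddef.h>

/* Hypothetical ring: 'put' is where the producer writes next,
 * 'get' is where the consumer completes next. */
struct ring {
    size_t size;
    size_t put;
    size_t get;
};

/* Entries currently in flight between get and put. */
static size_t ring_used(const struct ring *r)
{
    return (r->size + r->put - r->get) % r->size;
}

/* Slots the producer believes it may still fill. */
static size_t ring_free(const struct ring *r)
{
    return r->size - ring_used(r);
}

/* Producer side: claim n slots without checking for space. */
static void ring_put_advance(struct ring *r, size_t n)
{
    r->put = (r->put + n) % r->size;
}
```

On an 8-entry ring, advancing put by 3 and then by 5 wraps it back onto get, so `ring_used()` reports 0 while every slot still holds an uncompleted entry. A producer has to stop before that point, and the check and the advance have to be done atomically with respect to the consumer.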

i can provide more info if needed.

Ingo

-->
eth1: too many iterations (6) in nv_nic_irq.
Unable to handle kernel NULL pointer dereference at 0088 RIP: 
 [] nv_tx_done+0xf4/0x1cf
PGD 34d03067 PUD 34d02067 PMD 0 
Oops:  [1] PREEMPT SMP 
CPU 1 
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.21-rc5 #8
RIP: 0010:[]  [] nv_tx_done+0xf4/0x1cf
RSP: 0018:81003ff6be40  EFLAGS: 00010206
RAX:  RBX: 810002e26700 RCX: 0001
RDX: 0042 RSI: 3ef00cbe RDI: 81003fbeb070
RBP: 81003ff6be60 R08: 810002e26a00 R09: 0003
R10: 81003ff4e100 R11: 810001e283f8 R12: 3ef00cbe
R13: 810002e26000 R14: 810002e28fc0 R15: 
FS:  2b6cb57f1db0() GS:81003ff4ad40() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0088 CR3: 34c87000 CR4: 06e0
Process swapper (pid: 0, threadinfo 81003ff64000, task 81003ff4e100)
Stack:  810002e26700 0032 c201a000 810002e26000
 81003ff6bea0 80406dae 810002e26700 810002e26700
 810002e26000 00ff c201a000 80749080
Call Trace:
   [] nv_nic_irq+0x76/0x261
 [] nv_do_nic_poll+0x200/0x284
 [] nv_do_nic_poll+0x0/0x284
 [] run_timer_softirq+0x167/0x1dd
 [] __do_softirq+0x5b/0xc9
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x31/0x84
 [] irq_exit+0x3f/0x50
 [] smp_apic_timer_interrupt+0x49/0x5b
 [] default_idle+0x0/0x44
 [] apic_timer_interrupt+0x66/0x70
   [] default_idle+0x2f/0x44
 [] enter_idle+0x22/0x24
 [] cpu_idle+0x91/0xd4
 [] start_secondary+0x2e3/0x2f5

---
 drivers/net/forcedeth.c |   20 
 1 file changed, 20 insertions(+)

Index: linux/drivers/net/forcedeth.c
===
--- linux.orig/drivers/net/forcedeth.c
+++ linux/drivers/net/forcedeth.c
@@ -2908,6 +2908,10 @@ static irqreturn_t nv_nic_irq(int foo, v
spin_unlock(&np->lock);
break;
}
+   if (panic_timeout > 0) {
+   panic_timeout--;
+   i = max_interrupt_work+1;
+   }
if (unlikely(i > max_interrupt_work)) {
spin_lock(&np->lock);
/* disable interrupts on the nic */
@@ -3026,6 +3030,10 @@ static irqreturn_t nv_nic_irq_optimized(
break;
}
 
+   if (panic_timeout > 0) {
+   panic_timeout--;
+   i = max_interrupt_work+1;
+   }
if (unlikely(i > max_interrupt_work)) {
spin_lock(&np->lock);
/* disable interrupts on the nic */
@@ -3076,6 +3084,10 @@ static irqreturn_t nv_nic_irq_tx(int foo
dprintk(KERN_DEBUG "%s: received irq with events 0x%x. Probably TX fail.\n",
dev->name, events);
}
+   if (panic_timeout > 0) {
+   panic_timeo

Re: Linux 2.6.21-rc5

2007-03-26 Thread Ingo Molnar

* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> my first quick guess was to extend np->lock locking to the whole of
> nv_start_xmit/nv_start_xmit_optimized - while that appeared to make
> the crash a bit less likely, it did not prevent it. So there must be
> some other, more fundamental problem left as well. At first glance
> the SMP locking looks OK, so maybe the ring indices are messed up
> somehow and we got into a 'ring head bites the tail' scenario?

to be specific, the patch below is what i tried - but it didn't
completely fix the crash.

Ingo

---
 drivers/net/forcedeth.c |   15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

Index: linux/drivers/net/forcedeth.c
===
--- linux.orig/drivers/net/forcedeth.c
+++ linux/drivers/net/forcedeth.c
@@ -1650,9 +1650,10 @@ static int nv_start_xmit(struct sk_buff 
   ((skb_shinfo(skb)->frags[i].size & (NV_TX2_TSO_MAX_SIZE-1)) ? 1 : 0);
}
 
+   spin_lock_irq(&np->lock);
+
empty_slots = nv_get_empty_tx_slots(np);
if (unlikely(empty_slots <= entries)) {
-   spin_lock_irq(&np->lock);
netif_stop_queue(dev);
np->tx_stop = 1;
spin_unlock_irq(&np->lock);
@@ -1718,8 +1719,6 @@ static int nv_start_xmit(struct sk_buff 
tx_flags_extra = skb->ip_summed == CHECKSUM_PARTIAL ?
 NV_TX2_CHECKSUM_L3 | NV_TX2_CHECKSUM_L4 : 0;
 
-   spin_lock_irq(&np->lock);
-
/* set tx flags */
start_tx->flaglen |= cpu_to_le32(tx_flags | tx_flags_extra);
np->put_tx.orig = put_tx;
@@ -1766,9 +1765,10 @@ static int nv_start_xmit_optimized(struc
   ((skb_shinfo(skb)->frags[i].size & (NV_TX2_TSO_MAX_SIZE-1)) ? 1 : 0);
}
 
+   spin_lock_irq(&np->lock);
+
empty_slots = nv_get_empty_tx_slots(np);
if (unlikely(empty_slots <= entries)) {
-   spin_lock_irq(&np->lock);
netif_stop_queue(dev);
np->tx_stop = 1;
spin_unlock_irq(&np->lock);
@@ -1846,8 +1846,6 @@ static int nv_start_xmit_optimized(struc
start_tx->txvlan = 0;
}
 
-   spin_lock_irq(&np->lock);
-
/* set tx flags */
start_tx->flaglen |= cpu_to_le32(tx_flags | tx_flags_extra);
np->put_tx.ex = put_tx;
@@ -3484,6 +3482,7 @@ static void nv_do_nic_poll(unsigned long
struct net_device *dev = (struct net_device *) data;
struct fe_priv *np = netdev_priv(dev);
u8 __iomem *base = get_hwbase(dev);
+   unsigned long flags;
u32 mask = 0;
 
/*
@@ -3519,7 +3518,7 @@ static void nv_do_nic_poll(unsigned long
printk(KERN_INFO "forcedeth: MAC in recoverable error state\n");
if (netif_running(dev)) {
netif_tx_lock_bh(dev);
-   spin_lock(&np->lock);
+   spin_lock_irqsave(&np->lock, flags);
/* stop engines */
nv_stop_rx(dev);
nv_stop_tx(dev);
@@ -3545,7 +3544,7 @@ static void nv_do_nic_poll(unsigned long
/* restart rx engine */
nv_start_rx(dev);
nv_start_tx(dev);
-   spin_unlock(&np->lock);
+   spin_unlock_irqrestore(&np->lock, flags);
netif_tx_unlock_bh(dev);
}
}


Re: Linux 2.6.21-rc5

2007-03-26 Thread Thomas Gleixner
On Sun, 2007-03-25 at 16:08 -0700, Linus Torvalds wrote:
> Those timer changes ended up much more painful than anybody wished for, 
> but big thanks to Thomas Gleixner for being on it like a weasel on a dead 
> rat, and the regression list has kept shrinking.

Why, certainly! I caused them, so I have to fix them. There are still a
few to hunt down and I want to sort them out before 2.6.21 final.

> So if you have reported a regression in the 2.6.21-rc series, please check 
> 2.6.21-rc5, and update your report as appropriate (whether fixed or "still 
> problems with xyzzy").

This fix from John Stultz is still missing:

http://lkml.org/lkml/2007/3/22/287

It's in Andrew's queue already and is waiting to be sent to you.

tglx


