RE: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Thomas Gleixner
On Thu, 4 Oct 2007, Pallipadi, Venkatesh wrote:
> >-Original Message-
> >From: [EMAIL PROTECTED] 
> >[mailto:[EMAIL PROTECTED] On Behalf Of 
> >Thomas Gleixner
> >Sent: Monday, October 01, 2007 11:19 PM
> >To: Andi Kleen
> >Cc: Arjan van de Ven; David Bahi; LKML; 
> >[EMAIL PROTECTED]; Andrew Morton; Ingo Molnar; 
> >Gregory Haskins
> >Subject: Re: nmi_watchdog fix for x86_64 to be more like i386
> >
> >>
> >> The only workaround for chipsets ignoring IRQ affinity would 
> >be to keep
> >> track on which CPU irq 0 happens and then restart APIC timer 
> >interrupts
> >> on the others (or send IPIs) as needed. But that would be 
> >fairly ugly.
> >
> >The clock events code does handle this already. The broadcast 
> >interrupt 
> >can come in on any cpu. It's just the nmi watchdog which would 
> >be affected 
> >by that.
> >
> 
> Probably we can workaround this by keeping track of IRQ0 count at percpu
> level and
> use local apic timer + this percpu counter in NMI. Or just increment
> local
> apic timer count in IRQ0 with nohz enabled.

No, I tried that. It's ugly.

The per cpu accounting is the correct way to go if we want to take
care of those systems, which ignore the CPU0 binding of irq0.

See patch against the x86 tree below.

tglx

>
commit 093976c7ad206a008bd5de4619f40f6bca4a79c3
Author: Thomas Gleixner <[EMAIL PROTECTED]>
Date:   Fri Oct 5 22:19:18 2007 +0200

x86: Fix irq0 / local apic timer accounting

The clock events merge introduced a change to the nmi watchdog code to
handle the not longer increasing local apic timer count in the
broadcast mode. This is fine for UP, but on SMP it pampers over a
stuck CPU which is not handling the broadcast interrupt due to the
unconditional sum up of local apic timer count and irq0 count.

To cover all cases we need to keep track on which CPU irq0 is
handled. In theory this is CPU#0 due to the explicit disabling of irq
balancing for irq0, but there are systems which ignore this on the
hardware level. The per cpu irq0 accounting allows us to remove the
irq0 to CPU0 binding as well.

Add a per cpu counter for irq0 and evaluate this instead of the global
irq0 count in the nmi watchdog code.

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

diff --git a/arch/x86/kernel/nmi_32.c b/arch/x86/kernel/nmi_32.c
index c7227e2..95d3fc2 100644
--- a/arch/x86/kernel/nmi_32.c
+++ b/arch/x86/kernel/nmi_32.c
@@ -353,7 +353,8 @@ __kprobes int nmi_watchdog_tick(struct pt_regs * regs, 
unsigned reason)
 * Take the local apic timer and PIT/HPET into account. We don't
 * know which one is active, when we have highres/dyntick on
 */
-   sum = per_cpu(irq_stat, cpu).apic_timer_irqs + kstat_cpu(cpu).irqs[0];
+   sum = per_cpu(irq_stat, cpu).apic_timer_irqs +
+   per_cpu(irq_stat, cpu).irq0_irqs;
 
/* if the none of the timers isn't firing, this cpu isn't doing much */
if (!touched && last_irq_sums[cpu] == sum) {
diff --git a/arch/x86/kernel/time_32.c b/arch/x86/kernel/time_32.c
index 19a6c67..3571d0a 100644
--- a/arch/x86/kernel/time_32.c
+++ b/arch/x86/kernel/time_32.c
@@ -157,6 +157,9 @@ EXPORT_SYMBOL(profile_pc);
  */
 irqreturn_t timer_interrupt(int irq, void *dev_id)
 {
+   /* Keep nmi watchdog up to date */
+   per_cpu(irq_stat, cpu).irq0_irqs++;
+
 #ifdef CONFIG_X86_IO_APIC
if (timer_ack) {
/*
diff --git a/include/asm-x86/hardirq_32.h b/include/asm-x86/hardirq_32.h
index ed7cf97..9188635 100644
--- a/include/asm-x86/hardirq_32.h
+++ b/include/asm-x86/hardirq_32.h
@@ -9,6 +9,7 @@ typedef struct {
unsigned long idle_timestamp;
unsigned int __nmi_count;   /* arch dependent */
unsigned int apic_timer_irqs;   /* arch dependent */
+   unsigned int irq0_irqs;
 } cacheline_aligned irq_cpustat_t;
 
 DECLARE_PER_CPU(irq_cpustat_t, irq_stat);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Pallipadi, Venkatesh
 

>-Original Message-
>From: [EMAIL PROTECTED] 
>[mailto:[EMAIL PROTECTED] On Behalf Of 
>Thomas Gleixner
>Sent: Monday, October 01, 2007 11:19 PM
>To: Andi Kleen
>Cc: Arjan van de Ven; David Bahi; LKML; 
>[EMAIL PROTECTED]; Andrew Morton; Ingo Molnar; 
>Gregory Haskins
>Subject: Re: nmi_watchdog fix for x86_64 to be more like i386
>
>>
>> The only workaround for chipsets ignoring IRQ affinity would 
>be to keep
>> track on which CPU irq 0 happens and then restart APIC timer 
>interrupts
>> on the others (or send IPIs) as needed. But that would be 
>fairly ugly.
>
>The clock events code does handle this already. The broadcast 
>interrupt 
>can come in on any cpu. It's just the nmi watchdog which would 
>be affected 
>by that.
>

Probably we can workaround this by keeping track of IRQ0 count at percpu
level and
use local apic timer + this percpu counter in NMI. Or just increment
local
apic timer count in IRQ0 with nohz enabled.

Thanks,
Venki
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Andi Kleen

> Hummm, a directory appears to be protected, or DNE.  I get  "failed to
> change directory". 

Sorry typo
 
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Peter W. Morreale
On Fri, 2007-10-05 at 18:03 +0200, Andi Kleen wrote:
> > If it's agreed that this is the fix - can you submit a proper [PATCH] so
> > all users of watchdog_use_timer_and_hpet_on_x86_64.patch can be removed,
> > and replaced with yours.
> 
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/watchdog-fix
> 
> -Andi
> 

Hummm, a directory appears to be protected, or DNE.  I get  "failed to
change directory". 

Thx,
-PWM



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Andi Kleen

> If it's agreed that this is the fix - can you submit a proper [PATCH] so
> all users of watchdog_use_timer_and_hpet_on_x86_64.patch can be removed,
> and replaced with yours.

ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/watchdog-fix

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread David Bahi
On Mon, 2007-10-01 at 23:41 +0200, Andi Kleen wrote:
> > IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the
> > next CPU, but the check in NMI watchdog for CPU == 0 would not longer
> > work.
> 
> That cannot happen right now because cpu_disable() on both i386/x86-64
> reject CPU #0. So just setting IRQ_NOBALANCING is sufficient and both
> do that already. I was wrong earlier in being concerned about this.
> 
> >  int tick_do_broadcast(cpumask_t mask)
> > @@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
> > cpu_clear(cpu, mask);
> > td = _cpu(tick_cpu_device, cpu);
> > td->evtdev->event_handler(td->evtdev);
> > +   tick_broadcast_account(cpu);
> 
> That would not handle the case with a single CPU running only
> irq  0 but not broadcasting I think.
> 
> I believe ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog
> is the correct fix
> 
> -Andi

Andi, 

If it's agreed that this is the fix - can you submit a proper [PATCH] so
all users of watchdog_use_timer_and_hpet_on_x86_64.patch can be removed,
and replaced with yours.

Thank you very much




signature.asc
Description: PGP signature


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Andi Kleen

 If it's agreed that this is the fix - can you submit a proper [PATCH] so
 all users of watchdog_use_timer_and_hpet_on_x86_64.patch can be removed,
 and replaced with yours.

ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/watchdog-fix

-Andi

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread David Bahi
On Mon, 2007-10-01 at 23:41 +0200, Andi Kleen wrote:
  IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the
  next CPU, but the check in NMI watchdog for CPU == 0 would not longer
  work.
 
 That cannot happen right now because cpu_disable() on both i386/x86-64
 reject CPU #0. So just setting IRQ_NOBALANCING is sufficient and both
 do that already. I was wrong earlier in being concerned about this.
 
   int tick_do_broadcast(cpumask_t mask)
  @@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
  cpu_clear(cpu, mask);
  td = per_cpu(tick_cpu_device, cpu);
  td-evtdev-event_handler(td-evtdev);
  +   tick_broadcast_account(cpu);
 
 That would not handle the case with a single CPU running only
 irq  0 but not broadcasting I think.
 
 I believe ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog
 is the correct fix
 
 -Andi

Andi, 

If it's agreed that this is the fix - can you submit a proper [PATCH] so
all users of watchdog_use_timer_and_hpet_on_x86_64.patch can be removed,
and replaced with yours.

Thank you very much




signature.asc
Description: PGP signature


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Peter W. Morreale
On Fri, 2007-10-05 at 18:03 +0200, Andi Kleen wrote:
  If it's agreed that this is the fix - can you submit a proper [PATCH] so
  all users of watchdog_use_timer_and_hpet_on_x86_64.patch can be removed,
  and replaced with yours.
 
 ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/watchdog-fix
 
 -Andi
 

Hummm, a directory appears to be protected, or DNE.  I get  failed to
change directory. 

Thx,
-PWM



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Andi Kleen

 Hummm, a directory appears to be protected, or DNE.  I get  failed to
 change directory. 

Sorry typo
 
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog

-Andi

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Pallipadi, Venkatesh
 

-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of 
Thomas Gleixner
Sent: Monday, October 01, 2007 11:19 PM
To: Andi Kleen
Cc: Arjan van de Ven; David Bahi; LKML; 
[EMAIL PROTECTED]; Andrew Morton; Ingo Molnar; 
Gregory Haskins
Subject: Re: nmi_watchdog fix for x86_64 to be more like i386


 The only workaround for chipsets ignoring IRQ affinity would 
be to keep
 track on which CPU irq 0 happens and then restart APIC timer 
interrupts
 on the others (or send IPIs) as needed. But that would be 
fairly ugly.

The clock events code does handle this already. The broadcast 
interrupt 
can come in on any cpu. It's just the nmi watchdog which would 
be affected 
by that.


Probably we can workaround this by keeping track of IRQ0 count at percpu
level and
use local apic timer + this percpu counter in NMI. Or just increment
local
apic timer count in IRQ0 with nohz enabled.

Thanks,
Venki
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: nmi_watchdog fix for x86_64 to be more like i386

2007-10-05 Thread Thomas Gleixner
On Thu, 4 Oct 2007, Pallipadi, Venkatesh wrote:
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Thomas Gleixner
 Sent: Monday, October 01, 2007 11:19 PM
 To: Andi Kleen
 Cc: Arjan van de Ven; David Bahi; LKML; 
 [EMAIL PROTECTED]; Andrew Morton; Ingo Molnar; 
 Gregory Haskins
 Subject: Re: nmi_watchdog fix for x86_64 to be more like i386
 
 
  The only workaround for chipsets ignoring IRQ affinity would 
 be to keep
  track on which CPU irq 0 happens and then restart APIC timer 
 interrupts
  on the others (or send IPIs) as needed. But that would be 
 fairly ugly.
 
 The clock events code does handle this already. The broadcast 
 interrupt 
 can come in on any cpu. It's just the nmi watchdog which would 
 be affected 
 by that.
 
 
 Probably we can workaround this by keeping track of IRQ0 count at percpu
 level and
 use local apic timer + this percpu counter in NMI. Or just increment
 local
 apic timer count in IRQ0 with nohz enabled.

No, I tried that. It's ugly.

The per cpu accounting is the correct way to go if we want to take
care of those systems, which ignore the CPU0 binding of irq0.

See patch against the x86 tree below.

tglx


commit 093976c7ad206a008bd5de4619f40f6bca4a79c3
Author: Thomas Gleixner [EMAIL PROTECTED]
Date:   Fri Oct 5 22:19:18 2007 +0200

x86: Fix irq0 / local apic timer accounting

The clock events merge introduced a change to the nmi watchdog code to
handle the not longer increasing local apic timer count in the
broadcast mode. This is fine for UP, but on SMP it pampers over a
stuck CPU which is not handling the broadcast interrupt due to the
unconditional sum up of local apic timer count and irq0 count.

To cover all cases we need to keep track on which CPU irq0 is
handled. In theory this is CPU#0 due to the explicit disabling of irq
balancing for irq0, but there are systems which ignore this on the
hardware level. The per cpu irq0 accounting allows us to remove the
irq0 to CPU0 binding as well.

Add a per cpu counter for irq0 and evaluate this instead of the global
irq0 count in the nmi watchdog code.

Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]

diff --git a/arch/x86/kernel/nmi_32.c b/arch/x86/kernel/nmi_32.c
index c7227e2..95d3fc2 100644
--- a/arch/x86/kernel/nmi_32.c
+++ b/arch/x86/kernel/nmi_32.c
@@ -353,7 +353,8 @@ __kprobes int nmi_watchdog_tick(struct pt_regs * regs, 
unsigned reason)
 * Take the local apic timer and PIT/HPET into account. We don't
 * know which one is active, when we have highres/dyntick on
 */
-   sum = per_cpu(irq_stat, cpu).apic_timer_irqs + kstat_cpu(cpu).irqs[0];
+   sum = per_cpu(irq_stat, cpu).apic_timer_irqs +
+   per_cpu(irq_stat, cpu).irq0_irqs;
 
/* if the none of the timers isn't firing, this cpu isn't doing much */
if (!touched  last_irq_sums[cpu] == sum) {
diff --git a/arch/x86/kernel/time_32.c b/arch/x86/kernel/time_32.c
index 19a6c67..3571d0a 100644
--- a/arch/x86/kernel/time_32.c
+++ b/arch/x86/kernel/time_32.c
@@ -157,6 +157,9 @@ EXPORT_SYMBOL(profile_pc);
  */
 irqreturn_t timer_interrupt(int irq, void *dev_id)
 {
+   /* Keep nmi watchdog up to date */
+   per_cpu(irq_stat, cpu).irq0_irqs++;
+
 #ifdef CONFIG_X86_IO_APIC
if (timer_ack) {
/*
diff --git a/include/asm-x86/hardirq_32.h b/include/asm-x86/hardirq_32.h
index ed7cf97..9188635 100644
--- a/include/asm-x86/hardirq_32.h
+++ b/include/asm-x86/hardirq_32.h
@@ -9,6 +9,7 @@ typedef struct {
unsigned long idle_timestamp;
unsigned int __nmi_count;   /* arch dependent */
unsigned int apic_timer_irqs;   /* arch dependent */
+   unsigned int irq0_irqs;
 } cacheline_aligned irq_cpustat_t;
 
 DECLARE_PER_CPU(irq_cpustat_t, irq_stat);

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-02 Thread Thomas Gleixner
On Tue, 2 Oct 2007, Andi Kleen wrote:
> > Agreed.
> > 
> > I just got a x8664-hrt report, where I found the following oddity:
> > 
> >  0:   1197 172881   IO-APIC-edge  timer
> > 
> > That's one of those infamous AMD C1E boxen. Strange, all my systems have 
> > IRQ#0 on CPU#0 and nowhere else. Any idea ?
> 
> Hmm, in lowestpriority mode it would be possible that the APIC changes
> the CPU to #1 once; but IRQ 0 is always set to fixed mode. Also even 
> if that happens you should have them all on 1.
> 
> Maybe the chipset is just ignoring the IO-APIC configuration in this case?
> 
> Is it always the same chipset? Is it seen on i386 too?
> 
> The problem is really that if this happens it's more than the NMI watchdog
> that is broken. If you don't run an additional APIC timer interrupt on CPU #0
> it's possible that CPU #0 won't schedule at all.
>
> The only workaround for chipsets ignoring IRQ affinity would be to keep
> track on which CPU irq 0 happens and then restart APIC timer interrupts
> on the others (or send IPIs) as needed. But that would be fairly ugly.

The clock events code does handle this already. The broadcast interrupt 
can come in on any cpu. It's just the nmi watchdog which would be affected 
by that.

tglx
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-02 Thread Thomas Gleixner
On Tue, 2 Oct 2007, Andi Kleen wrote:
  Agreed.
  
  I just got a x8664-hrt report, where I found the following oddity:
  
   0:   1197 172881   IO-APIC-edge  timer
  
  That's one of those infamous AMD C1E boxen. Strange, all my systems have 
  IRQ#0 on CPU#0 and nowhere else. Any idea ?
 
 Hmm, in lowestpriority mode it would be possible that the APIC changes
 the CPU to #1 once; but IRQ 0 is always set to fixed mode. Also even 
 if that happens you should have them all on 1.
 
 Maybe the chipset is just ignoring the IO-APIC configuration in this case?
 
 Is it always the same chipset? Is it seen on i386 too?
 
 The problem is really that if this happens it's more than the NMI watchdog
 that is broken. If you don't run an additional APIC timer interrupt on CPU #0
 it's possible that CPU #0 won't schedule at all.

 The only workaround for chipsets ignoring IRQ affinity would be to keep
 track on which CPU irq 0 happens and then restart APIC timer interrupts
 on the others (or send IPIs) as needed. But that would be fairly ugly.

The clock events code does handle this already. The broadcast interrupt 
can come in on any cpu. It's just the nmi watchdog which would be affected 
by that.

tglx
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen

> Agreed.
> 
> I just got a x8664-hrt report, where I found the following oddity:
> 
>  0:   1197 172881   IO-APIC-edge  timer
> 
> That's one of those infamous AMD C1E boxen. Strange, all my systems have 
> IRQ#0 on CPU#0 and nowhere else. Any idea ?

Hmm, in lowestpriority mode it would be possible that the APIC changes
the CPU to #1 once; but IRQ 0 is always set to fixed mode. Also even 
if that happens you should have them all on 1.

Maybe the chipset is just ignoring the IO-APIC configuration in this case?

Is it always the same chipset? Is it seen on i386 too?

The problem is really that if this happens it's more than the NMI watchdog
that is broken. If you don't run an additional APIC timer interrupt on CPU #0
it's possible that CPU #0 won't schedule at all.

The only workaround for chipsets ignoring IRQ affinity would be to keep
track on which CPU irq 0 happens and then restart APIC timer interrupts
on the others (or send IPIs) as needed. But that would be fairly ugly.

IRQ 0 is unfortunately an generally under-validated area because Windows doesn't
use it.

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Arjan van de Ven
On Tue, 02 Oct 2007 07:56:24 +0300
Mika Penttilä <[EMAIL PROTECTED]> wrote:


> Here I have with stock FC7 (2.6.22.9-91) kernel :
>  0: 107835  133459760   IO-APIC-edge  timer
> 


fwiw this is entirely done by the hardware; no irq balancer has touched
this one (fc7 has the new irqbalancer which leaves irq0 alone)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Mika Penttilä

Thomas Gleixner wrote:

On Tue, 2 Oct 2007, Andi Kleen wrote:
  

OTOH, the accounting hook would allow us to remove the IRQ#0 -> CPU#0
restriction. Not sure whether it's worth the trouble.
  

Some SIS chipsets hang the machine when you migrate irq 0 to another
CPU. It's better to keep that Also I wouldn't be surprised if there are some
other assumptions about this elsewhere.

Ok in theory it could be done only on SIS, but that probably would really
not be worth the trouble



Agreed.

I just got a x8664-hrt report, where I found the following oddity:

 0:   1197 172881   IO-APIC-edge  timer

That's one of those infamous AMD C1E boxen. Strange, all my systems have 
IRQ#0 on CPU#0 and nowhere else. Any idea ?


tglx

  

Here I have with stock FC7 (2.6.22.9-91) kernel :
0: 107835  133459760   IO-APIC-edge  timer

Processor:
vendor_id: AuthenticAMD
cpu family: 15
model: 107
model name: AMD Athlon(tm) 64 X2 Dual Core Processor 4000+
stepping: 1
cpu MHz: 2109.721
cache size: 512 KB

MB:
Asus M2N-E (NF570)

--Mika


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Arjan van de Ven
On Tue, 2 Oct 2007 00:47:12 +0200 (CEST)
Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Tue, 2 Oct 2007, Andi Kleen wrote:
> > 
> > > OTOH, the accounting hook would allow us to remove the IRQ#0 ->
> > > CPU#0 restriction. Not sure whether it's worth the trouble.
> > 
> > Some SIS chipsets hang the machine when you migrate irq 0 to another
> > CPU. It's better to keep that Also I wouldn't be surprised if there
> > are some other assumptions about this elsewhere.
> > 
> > Ok in theory it could be done only on SIS, but that probably would
> > really not be worth the trouble
> 
> Agreed.
> 
> I just got a x8664-hrt report, where I found the following oddity:
> 
>  0:   1197 172881   IO-APIC-edge  timer
> 
> That's one of those infamous AMD C1E boxen. Strange, all my systems
> have IRQ#0 on CPU#0 and nowhere else. Any idea ?

2 things;
the current irq balancers don't balance the timer interrupt, in fact
they'll leave it alone (so the chipset might balance)
older ones pin it to cpu 0 or rotate it
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Tue, 2 Oct 2007, Andi Kleen wrote:
> 
> > OTOH, the accounting hook would allow us to remove the IRQ#0 -> CPU#0
> > restriction. Not sure whether it's worth the trouble.
> 
> Some SIS chipsets hang the machine when you migrate irq 0 to another
> CPU. It's better to keep that Also I wouldn't be surprised if there are some
> other assumptions about this elsewhere.
> 
> Ok in theory it could be done only on SIS, but that probably would really
> not be worth the trouble

Agreed.

I just got a x8664-hrt report, where I found the following oddity:

 0:   1197 172881   IO-APIC-edge  timer

That's one of those infamous AMD C1E boxen. Strange, all my systems have 
IRQ#0 on CPU#0 and nowhere else. Any idea ?

tglx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen

> OTOH, the accounting hook would allow us to remove the IRQ#0 -> CPU#0
> restriction. Not sure whether it's worth the trouble.

Some SIS chipsets hang the machine when you migrate irq 0 to another
CPU. It's better to keep that Also I wouldn't be surprised if there are some
other assumptions about this elsewhere.

Ok in theory it could be done only on SIS, but that probably would really
not be worth the trouble

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
> 
> > IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the
> > next CPU, but the check in NMI watchdog for CPU == 0 would not longer
> > work.
> 
> That cannot happen right now because cpu_disable() on both i386/x86-64
> reject CPU #0. So just setting IRQ_NOBALANCING is sufficient and both
> do that already. I was wrong earlier in being concerned about this.
> 
> >  int tick_do_broadcast(cpumask_t mask)
> > @@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
> > cpu_clear(cpu, mask);
> > td = _cpu(tick_cpu_device, cpu);
> > td->evtdev->event_handler(td->evtdev);
> > +   tick_broadcast_account(cpu);
> 
> That would not handle the case with a single CPU running only
> irq  0 but not broadcasting I think.

Hmm. The only situation where this can happen is when you add
"nolapic_timer" to the command line on a single CPU system. We do not
register the lapic "dummy" clock event device then.

> I believe 
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog
> is the correct fix

Yup, I completely missed the fact, that we reject CPU#0 unplugging, so
your fix seems indeed to be more correct and simpler.

OTOH, the accounting hook would allow us to remove the IRQ#0 -> CPU#0
restriction. Not sure whether it's worth the trouble.

 tglx
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen

> IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the
> next CPU, but the check in NMI watchdog for CPU == 0 would not longer
> work.

That cannot happen right now because cpu_disable() on both i386/x86-64
reject CPU #0. So just setting IRQ_NOBALANCING is sufficient and both
do that already. I was wrong earlier in being concerned about this.

>  int tick_do_broadcast(cpumask_t mask)
> @@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
>   cpu_clear(cpu, mask);
>   td = _cpu(tick_cpu_device, cpu);
>   td->evtdev->event_handler(td->evtdev);
> + tick_broadcast_account(cpu);

That would not handle the case with a single CPU running only
irq  0 but not broadcasting I think.

I believe ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog
is the correct fix

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Arjan van de Ven wrote:
> > > I already did this here by checking for cpu != 0. But it also needs
> > > either tracking or forbidding migrations of irq 0. I can take care
> > > of the patch.
> > 
> > I was thinking about the same fix. On i386 we already have the irq 
> > migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
> > IRQ_NOBALANCING.
> 
> btw doing this is a problem if the user decides to hot(un)plug cpu 0...
> he then can't move the irqs away to do that

IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the 
next CPU, but the check in NMI watchdog for CPU == 0 would not longer 
work.

Fix below. Post .23 material. I work out a separate one for the x8664 
clock events series.

tglx

>

[PATCH] i386: Fix nmi watchdog per cpu timer irq accounting

The clock events patches changed the interrupt distribution and the
local apic timer interrupt accounting for the broadcast case. The per
cpu clock events handler of the cpu, which runs the broadcast
interrupt, is executed directly in the broadcast irq context. This
does not invoke the low level arch code, which does the local apic
timer irq accounting. The work around for false positives in the nmi
watchdog was to add the irq0 interrupts (broadcast device) to the
local apic timer interrupts. This falsifies the results for the CPUs
which are not handling the broadcast interrupt, i.e. stuck CPUs might
be not detected, as noticed by Andi Kleen.

It would be possible to move the clockevents handler invocation of the
CPU which runs the broadcast interrupt into the tick device broadcast
function, but this would require to handle the per cpu device to this
function and perform the direct operation in the clock device specific
architecture code. Right now this is only i386 and x86_64, but MIPS is
on the way to use the broadcast mode as well.

Introduce a weak function tick_broadcast_account(), which allows x86
to adjust the local apic timer interrupt counter in the case when the
cpu local timer handler has been invoked. This keeps the cpu local
handler decision and invocation in the common code and allows x86 to
handle the nmi watchdog accounting correctly.

Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
index 3d67ae1..180dde8 100644
--- a/arch/i386/kernel/apic.c
+++ b/arch/i386/kernel/apic.c
@@ -283,6 +283,16 @@ static void lapic_timer_broadcast(cpumask_t mask)
 }
 
 /*
+ * Called from the broadcasting code to keep the local apic timer irq
+ * accounting straight for the nmi watchdog. Is called with interrupts
+ * disabled.
+ */
+void tick_broadcast_account(int cpu)
+{
+   per_cpu(irq_stat, cpu).apic_timer_irqs++;
+}
+
+/*
  * Setup the local APIC timer for this CPU. Copy the initilized values
  * of the boot CPU and register the clock event in the framework.
  */
diff --git a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
index c7227e2..03cdcaf 100644
--- a/arch/i386/kernel/nmi.c
+++ b/arch/i386/kernel/nmi.c
@@ -349,11 +349,7 @@ __kprobes int nmi_watchdog_tick(struct pt_regs * regs, 
unsigned reason)
cpu_clear(cpu, backtrace_mask);
}
 
-   /*
-* Take the local apic timer and PIT/HPET into account. We don't
-* know which one is active, when we have highres/dyntick on
-*/
-   sum = per_cpu(irq_stat, cpu).apic_timer_irqs + kstat_cpu(cpu).irqs[0];
+   sum = per_cpu(irq_stat, cpu).apic_timer_irqs;
 
/* if the none of the timers isn't firing, this cpu isn't doing much */
if (!touched && last_irq_sums[cpu] == sum) {
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9a7252e..99b3021 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -73,6 +73,7 @@ static inline void tick_cancel_sched_timer(int cpu) { }
 # ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 extern struct tick_device *tick_get_broadcast_device(void);
 extern cpumask_t *tick_get_broadcast_mask(void);
+extern void tick_broadcast_account(int cpu);
 
 #  ifdef CONFIG_TICK_ONESHOT
 extern cpumask_t *tick_get_broadcast_oneshot_mask(void);
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 0962e05..43d0085 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -123,6 +123,16 @@ int tick_device_uses_broadcast(struct clock_event_device 
*dev, int cpu)
 }
 
 /*
+ * Weak function for cpu local interrupt accounting. Used by x86 to
+ * keep the lapic accounting correct for nmi_watchdog.
+ *
+ * Must be called with interrupts disabled.
+ */
+void __attribute__((weak)) tick_broadcast_account(int cpu)
+{
+}
+
+/*
  * Broadcast the event to the cpus, which are set in the mask
  */
 int tick_do_broadcast(cpumask_t mask)
@@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
cpu_clear(cpu, mask);
td = _cpu(tick_cpu_device, cpu);
td->evtdev->event_handler(td->evtdev);
+  

Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Paul E. McKenney
On Mon, Oct 01, 2007 at 12:56:26PM -0700, Arjan van de Ven wrote:
> On Mon, 1 Oct 2007 21:27:39 +0200 (CEST)
> 
> > > I already did this here by checking for cpu != 0. But it also needs
> > > either tracking or forbidding migrations of irq 0. I can take care
> > > of the patch.
> > 
> > I was thinking about the same fix. On i386 we already have the irq 
> > migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
> > IRQ_NOBALANCING.
> 
> btw doing this is a problem if the user decides to hot(un)plug cpu 0...
> he then can't move the irqs away to do that

Last I tried it, x86_64 and i386 in fact prohibit hotunplugging CPU 0.

Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Dave Jones
On Mon, Oct 01, 2007 at 12:56:26PM -0700, Arjan van de Ven wrote:
 > On Mon, 1 Oct 2007 21:27:39 +0200 (CEST)
 > 
 > > > I already did this here by checking for cpu != 0. But it also needs
 > > > either tracking or forbidding migrations of irq 0. I can take care
 > > > of the patch.
 > > 
 > > I was thinking about the same fix. On i386 we already have the irq 
 > > migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
 > > IRQ_NOBALANCING.
 > 
 > btw doing this is a problem if the user decides to hot(un)plug cpu 0...
 > he then can't move the irqs away to do that
 
You can't hot unplug cpu0.
Take a look in sysfs, no /sys/devices/system/cpu/cpu0/online file.

Dave

-- 
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Arjan van de Ven
On Mon, 1 Oct 2007 21:27:39 +0200 (CEST)

> > I already did this here by checking for cpu != 0. But it also needs
> > either tracking or forbidding migrations of irq 0. I can take care
> > of the patch.
> 
> I was thinking about the same fix. On i386 we already have the irq 
> migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
> IRQ_NOBALANCING.

btw doing this is a problem if the user decides to hot(un)plug cpu 0...
he then can't move the irqs away to do that
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
> On Monday 01 October 2007 20:54:21 Thomas Gleixner wrote:
> > On Mon, 1 Oct 2007, Andi Kleen wrote:
> > 
> > > On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
> > > > Thanks to tglx and ghaskins for all the help in tracking down a very
> > > > early nmi_watchdog crash on certain x86_64 machines.
> > > 
> > > The patch is totally bogus. irq 0 doesn't say anything about whether
> > > the current CPU still works or not. You always need some local
> > > interrupt. This basically disables the NMI watchdog for the non boot CPUs.
> > > 
> > > It's even wrong on i386 -- i wonder how that broken patch
> > > made it in there. I'll remove it there.
> > 
> > Right, it's wrong for the broadcast case, but simply removing it will
> > trigger false positives on the CPU which runs the broadcast timer. I
> > fix this proper.
> 
> I already did this here by checking for cpu != 0. But it also needs either 
> tracking
> or forbidding migrations of irq 0. I can take care of the patch.

I was thinking about the same fix. On i386 we already have the irq 
migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
IRQ_NOBALANCING.

tglx




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen
On Monday 01 October 2007 20:54:21 Thomas Gleixner wrote:
> On Mon, 1 Oct 2007, Andi Kleen wrote:
> 
> > On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
> > > Thanks to tglx and ghaskins for all the help in tracking down a very
> > > early nmi_watchdog crash on certain x86_64 machines.
> > 
> > The patch is totally bogus. irq 0 doesn't say anything about whether
> > the current CPU still works or not. You always need some local
> > interrupt. This basically disables the NMI watchdog for the non boot CPUs.
> > 
> > It's even wrong on i386 -- i wonder how that broken patch
> > made it in there. I'll remove it there.
> 
> Right, it's wrong for the broadcast case, but simply removing it will
> trigger false positives on the CPU which runs the broadcast timer. I
> fix this proper.

I already did this here by checking for cpu != 0. But it also needs either 
tracking
or forbidding migrations of irq 0. I can take care of the patch.

-Andi 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:

> On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
> > Thanks to tglx and ghaskins for all the help in tracking down a very
> > early nmi_watchdog crash on certain x86_64 machines.
> 
> The patch is totally bogus. irq 0 doesn't say anything about whether
> the current CPU still works or not. You always need some local
> interrupt. This basically disables the NMI watchdog for the non boot CPUs.
> 
> It's even wrong on i386 -- i wonder how that broken patch
> made it in there. I'll remove it there.

Right, it's wrong for the broadcast case, but simply removing it will
trigger false positives on the CPU which runs the broadcast timer. I
fix this proper.

tglx
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen
On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
> Thanks to tglx and ghaskins for all the help in tracking down a very
> early nmi_watchdog crash on certain x86_64 machines.

The patch is totally bogus. irq 0 doesn't say anything about whether
the current CPU still works or not. You always need some local
interrupt. This basically disables the NMI watchdog for the non boot CPUs.

It's even wrong on i386 -- i wonder how that broken patch
made it in there. I'll remove it there.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen
On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
 Thanks to tglx and ghaskins for all the help in tracking down a very
 early nmi_watchdog crash on certain x86_64 machines.

The patch is totally bogus. irq 0 doesn't say anything about whether
the current CPU still works or not. You always need some local
interrupt. This basically disables the NMI watchdog for the non boot CPUs.

It's even wrong on i386 -- i wonder how that broken patch
made it in there. I'll remove it there.

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:

 On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
  Thanks to tglx and ghaskins for all the help in tracking down a very
  early nmi_watchdog crash on certain x86_64 machines.
 
 The patch is totally bogus. irq 0 doesn't say anything about whether
 the current CPU still works or not. You always need some local
 interrupt. This basically disables the NMI watchdog for the non boot CPUs.
 
 It's even wrong on i386 -- i wonder how that broken patch
 made it in there. I'll remove it there.

Right, it's wrong for the broadcast case, but simply removing it will
trigger false positives on the CPU which runs the broadcast timer. I
fix this proper.

tglx
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen
On Monday 01 October 2007 20:54:21 Thomas Gleixner wrote:
 On Mon, 1 Oct 2007, Andi Kleen wrote:
 
  On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
   Thanks to tglx and ghaskins for all the help in tracking down a very
   early nmi_watchdog crash on certain x86_64 machines.
  
  The patch is totally bogus. irq 0 doesn't say anything about whether
  the current CPU still works or not. You always need some local
  interrupt. This basically disables the NMI watchdog for the non boot CPUs.
  
  It's even wrong on i386 -- i wonder how that broken patch
  made it in there. I'll remove it there.
 
 Right, it's wrong for the broadcast case, but simply removing it will
 trigger false positives on the CPU which runs the broadcast timer. I
 fix this proper.

I already did this here by checking for cpu != 0. But it also needs either 
tracking
or forbidding migrations of irq 0. I can take care of the patch.

-Andi 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
 On Monday 01 October 2007 20:54:21 Thomas Gleixner wrote:
  On Mon, 1 Oct 2007, Andi Kleen wrote:
  
   On Wednesday 26 September 2007 20:03:12 David Bahi wrote:
Thanks to tglx and ghaskins for all the help in tracking down a very
early nmi_watchdog crash on certain x86_64 machines.
   
   The patch is totally bogus. irq 0 doesn't say anything about whether
   the current CPU still works or not. You always need some local
   interrupt. This basically disables the NMI watchdog for the non boot CPUs.
   
   It's even wrong on i386 -- i wonder how that broken patch
   made it in there. I'll remove it there.
  
  Right, it's wrong for the broadcast case, but simply removing it will
  trigger false positives on the CPU which runs the broadcast timer. I
  fix this proper.
 
 I already did this here by checking for cpu != 0. But it also needs either 
 tracking
 or forbidding migrations of irq 0. I can take care of the patch.

I was thinking about the same fix. On i386 we already have the irq 
migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
IRQ_NOBALANCING.

tglx




-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Arjan van de Ven
On Mon, 1 Oct 2007 21:27:39 +0200 (CEST)

  I already did this here by checking for cpu != 0. But it also needs
  either tracking or forbidding migrations of irq 0. I can take care
  of the patch.
 
 I was thinking about the same fix. On i386 we already have the irq 
 migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
 IRQ_NOBALANCING.

btw doing this is a problem if the user decides to hot(un)plug cpu 0...
he then can't move the irqs away to do that
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Dave Jones
On Mon, Oct 01, 2007 at 12:56:26PM -0700, Arjan van de Ven wrote:
  On Mon, 1 Oct 2007 21:27:39 +0200 (CEST)
  
I already did this here by checking for cpu != 0. But it also needs
either tracking or forbidding migrations of irq 0. I can take care
of the patch.
   
   I was thinking about the same fix. On i386 we already have the irq 
   migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
   IRQ_NOBALANCING.
  
  btw doing this is a problem if the user decides to hot(un)plug cpu 0...
  he then can't move the irqs away to do that
 
You can't hot unplug cpu0.
Take a look in sysfs, no /sys/devices/system/cpu/cpu0/online file.

Dave

-- 
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Paul E. McKenney
On Mon, Oct 01, 2007 at 12:56:26PM -0700, Arjan van de Ven wrote:
 On Mon, 1 Oct 2007 21:27:39 +0200 (CEST)
 
   I already did this here by checking for cpu != 0. But it also needs
   either tracking or forbidding migrations of irq 0. I can take care
   of the patch.
  
  I was thinking about the same fix. On i386 we already have the irq 
  migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
  IRQ_NOBALANCING.
 
 btw doing this is a problem if the user decides to hot(un)plug cpu 0...
 he then can't move the irqs away to do that

Last I tried it, x86_64 and i386 in fact prohibit hotunplugging CPU 0.

Thanx, Paul
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Arjan van de Ven wrote:
   I already did this here by checking for cpu != 0. But it also needs
   either tracking or forbidding migrations of irq 0. I can take care
   of the patch.
  
  I was thinking about the same fix. On i386 we already have the irq 
  migration / balancing of irq 0 disabled. That's why we setup IRQ0 with
  IRQ_NOBALANCING.
 
 btw doing this is a problem if the user decides to hot(un)plug cpu 0...
 he then can't move the irqs away to do that

IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the 
next CPU, but the check in NMI watchdog for CPU == 0 would not longer 
work.

Fix below. Post .23 material. I work out a separate one for the x8664 
clock events series.

tglx



[PATCH] i386: Fix nmi watchdog per cpu timer irq accounting

The clock events patches changed the interrupt distribution and the
local apic timer interrupt accounting for the broadcast case. The per
cpu clock events handler of the cpu, which runs the broadcast
interrupt, is executed directly in the broadcast irq context. This
does not invoke the low level arch code, which does the local apic
timer irq accounting. The work around for false positives in the nmi
watchdog was to add the irq0 interrupts (broadcast device) to the
local apic timer interrupts. This falsifies the results for the CPUs
which are not handling the broadcast interrupt, i.e. stuck CPUs might
be not detected, as noticed by Andi Kleen.

It would be possible to move the clockevents handler invocation of the
CPU which runs the broadcast interrupt into the tick device broadcast
function, but this would require to handle the per cpu device to this
function and perform the direct operation in the clock device specific
architecture code. Right now this is only i386 and x86_64, but MIPS is
on the way to use the broadcast mode as well.

Introduce a weak function tick_broadcast_account(), which allows x86
to adjust the local apic timer interrupt counter in the case when the
cpu local timer handler has been invoked. This keeps the cpu local
handler decision and invocation in the common code and allows x86 to
handle the nmi watchdog accounting correctly.

Signed-off-by: Thomas Gleixner [EMAIL PROTECTED]

diff --git a/arch/i386/kernel/apic.c b/arch/i386/kernel/apic.c
index 3d67ae1..180dde8 100644
--- a/arch/i386/kernel/apic.c
+++ b/arch/i386/kernel/apic.c
@@ -283,6 +283,16 @@ static void lapic_timer_broadcast(cpumask_t mask)
 }
 
 /*
+ * Called from the broadcasting code to keep the local apic timer irq
+ * accounting straight for the nmi watchdog. Is called with interrupts
+ * disabled.
+ */
+void tick_broadcast_account(int cpu)
+{
+   per_cpu(irq_stat, cpu).apic_timer_irqs++;
+}
+
+/*
  * Setup the local APIC timer for this CPU. Copy the initilized values
  * of the boot CPU and register the clock event in the framework.
  */
diff --git a/arch/i386/kernel/nmi.c b/arch/i386/kernel/nmi.c
index c7227e2..03cdcaf 100644
--- a/arch/i386/kernel/nmi.c
+++ b/arch/i386/kernel/nmi.c
@@ -349,11 +349,7 @@ __kprobes int nmi_watchdog_tick(struct pt_regs * regs, 
unsigned reason)
cpu_clear(cpu, backtrace_mask);
}
 
-   /*
-* Take the local apic timer and PIT/HPET into account. We don't
-* know which one is active, when we have highres/dyntick on
-*/
-   sum = per_cpu(irq_stat, cpu).apic_timer_irqs + kstat_cpu(cpu).irqs[0];
+   sum = per_cpu(irq_stat, cpu).apic_timer_irqs;
 
/* if the none of the timers isn't firing, this cpu isn't doing much */
if (!touched  last_irq_sums[cpu] == sum) {
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9a7252e..99b3021 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -73,6 +73,7 @@ static inline void tick_cancel_sched_timer(int cpu) { }
 # ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 extern struct tick_device *tick_get_broadcast_device(void);
 extern cpumask_t *tick_get_broadcast_mask(void);
+extern void tick_broadcast_account(int cpu);
 
 #  ifdef CONFIG_TICK_ONESHOT
 extern cpumask_t *tick_get_broadcast_oneshot_mask(void);
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 0962e05..43d0085 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -123,6 +123,16 @@ int tick_device_uses_broadcast(struct clock_event_device 
*dev, int cpu)
 }
 
 /*
+ * Weak function for cpu local interrupt accounting. Used by x86 to
+ * keep the lapic accounting correct for nmi_watchdog.
+ *
+ * Must be called with interrupts disabled.
+ */
+void __attribute__((weak)) tick_broadcast_account(int cpu)
+{
+}
+
+/*
  * Broadcast the event to the cpus, which are set in the mask
  */
 int tick_do_broadcast(cpumask_t mask)
@@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
cpu_clear(cpu, mask);
td = per_cpu(tick_cpu_device, cpu);
td-evtdev-event_handler(td-evtdev);
+   

Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen

 IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the
 next CPU, but the check in NMI watchdog for CPU == 0 would not longer
 work.

That cannot happen right now because cpu_disable() on both i386/x86-64
reject CPU #0. So just setting IRQ_NOBALANCING is sufficient and both
do that already. I was wrong earlier in being concerned about this.

  int tick_do_broadcast(cpumask_t mask)
 @@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
   cpu_clear(cpu, mask);
   td = per_cpu(tick_cpu_device, cpu);
   td-evtdev-event_handler(td-evtdev);
 + tick_broadcast_account(cpu);

That would not handle the case with a single CPU running only
irq  0 but not broadcasting I think.

I believe ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog
is the correct fix

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Mon, 1 Oct 2007, Andi Kleen wrote:
 
  IRQ_NOBALANCING is not preventing cpu unplug. It moves the affinity to the
  next CPU, but the check in NMI watchdog for CPU == 0 would not longer
  work.
 
 That cannot happen right now because cpu_disable() on both i386/x86-64
 reject CPU #0. So just setting IRQ_NOBALANCING is sufficient and both
 do that already. I was wrong earlier in being concerned about this.
 
   int tick_do_broadcast(cpumask_t mask)
  @@ -137,6 +147,7 @@ int tick_do_broadcast(cpumask_t mask)
  cpu_clear(cpu, mask);
  td = per_cpu(tick_cpu_device, cpu);
  td-evtdev-event_handler(td-evtdev);
  +   tick_broadcast_account(cpu);
 
 That would not handle the case with a single CPU running only
 irq  0 but not broadcasting I think.

Hmm. The only situation where this can happen is when you add
nolapic_timer to the command line on a single CPU system. We do not
register the lapic dummy clock event device then.

 I believe 
 ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/fix-watchdog
 is the correct fix

Yup, I completely missed the fact, that we reject CPU#0 unplugging, so
your fix seems indeed to be more correct and simpler.

OTOH, the accounting hook would allow us to remove the IRQ#0 - CPU#0
restriction. Not sure whether it's worth the trouble.

 tglx
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen

 OTOH, the accounting hook would allow us to remove the IRQ#0 - CPU#0
 restriction. Not sure whether it's worth the trouble.

Some SIS chipsets hang the machine when you migrate irq 0 to another
CPU. It's better to keep that Also I wouldn't be surprised if there are some
other assumptions about this elsewhere.

Ok in theory it could be done only on SIS, but that probably would really
not be worth the trouble

-Andi
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Thomas Gleixner
On Tue, 2 Oct 2007, Andi Kleen wrote:
 
  OTOH, the accounting hook would allow us to remove the IRQ#0 - CPU#0
  restriction. Not sure whether it's worth the trouble.
 
 Some SIS chipsets hang the machine when you migrate irq 0 to another
 CPU. It's better to keep that Also I wouldn't be surprised if there are some
 other assumptions about this elsewhere.
 
 Ok in theory it could be done only on SIS, but that probably would really
 not be worth the trouble

Agreed.

I just got a x8664-hrt report, where I found the following oddity:

 0:   1197 172881   IO-APIC-edge  timer

That's one of those infamous AMD C1E boxen. Strange, all my systems have 
IRQ#0 on CPU#0 and nowhere else. Any idea ?

tglx

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Arjan van de Ven
On Tue, 2 Oct 2007 00:47:12 +0200 (CEST)
Thomas Gleixner [EMAIL PROTECTED] wrote:

 On Tue, 2 Oct 2007, Andi Kleen wrote:
  
   OTOH, the accounting hook would allow us to remove the IRQ#0 -
   CPU#0 restriction. Not sure whether it's worth the trouble.
  
  Some SIS chipsets hang the machine when you migrate irq 0 to another
  CPU. It's better to keep that Also I wouldn't be surprised if there
  are some other assumptions about this elsewhere.
  
  Ok in theory it could be done only on SIS, but that probably would
  really not be worth the trouble
 
 Agreed.
 
 I just got a x8664-hrt report, where I found the following oddity:
 
  0:   1197 172881   IO-APIC-edge  timer
 
 That's one of those infamous AMD C1E boxen. Strange, all my systems
 have IRQ#0 on CPU#0 and nowhere else. Any idea ?

2 things;
the current irq balancers don't balance the timer interrupt, in fact
they'll leave it alone (so the chipset might balance)
older ones pin it to cpu 0 or rotate it
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Mika Penttilä

Thomas Gleixner wrote:

On Tue, 2 Oct 2007, Andi Kleen wrote:
  

OTOH, the accounting hook would allow us to remove the IRQ#0 - CPU#0
restriction. Not sure whether it's worth the trouble.
  

Some SIS chipsets hang the machine when you migrate irq 0 to another
CPU. It's better to keep that Also I wouldn't be surprised if there are some
other assumptions about this elsewhere.

Ok in theory it could be done only on SIS, but that probably would really
not be worth the trouble



Agreed.

I just got a x8664-hrt report, where I found the following oddity:

 0:   1197 172881   IO-APIC-edge  timer

That's one of those infamous AMD C1E boxen. Strange, all my systems have 
IRQ#0 on CPU#0 and nowhere else. Any idea ?


tglx

  

Here I have with stock FC7 (2.6.22.9-91) kernel :
0: 107835  133459760   IO-APIC-edge  timer

Processor:
vendor_id: AuthenticAMD
cpu family: 15
model: 107
model name: AMD Athlon(tm) 64 X2 Dual Core Processor 4000+
stepping: 1
cpu MHz: 2109.721
cache size: 512 KB

MB:
Asus M2N-E (NF570)

--Mika


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Arjan van de Ven
On Tue, 02 Oct 2007 07:56:24 +0300
Mika Penttilä [EMAIL PROTECTED] wrote:


 Here I have with stock FC7 (2.6.22.9-91) kernel :
  0: 107835  133459760   IO-APIC-edge  timer
 


fwiw this is entirely done by the hardware; no irq balancer has touched
this one (fc7 has the new irqbalancer which leaves irq0 alone)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nmi_watchdog fix for x86_64 to be more like i386

2007-10-01 Thread Andi Kleen

 Agreed.
 
 I just got a x8664-hrt report, where I found the following oddity:
 
  0:   1197 172881   IO-APIC-edge  timer
 
 That's one of those infamous AMD C1E boxen. Strange, all my systems have 
 IRQ#0 on CPU#0 and nowhere else. Any idea ?

Hmm, in lowestpriority mode it would be possible that the APIC changes
the CPU to #1 once; but IRQ 0 is always set to fixed mode. Also even 
if that happens you should have them all on 1.

Maybe the chipset is just ignoring the IO-APIC configuration in this case?

Is it always the same chipset? Is it seen on i386 too?

The problem is really that if this happens it's more than the NMI watchdog
that is broken. If you don't run an additional APIC timer interrupt on CPU #0
it's possible that CPU #0 won't schedule at all.

The only workaround for chipsets ignoring IRQ affinity would be to keep
track on which CPU irq 0 happens and then restart APIC timer interrupts
on the others (or send IPIs) as needed. But that would be fairly ugly.

IRQ 0 is unfortunately an generally under-validated area because Windows doesn't
use it.

-Andi

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


nmi_watchdog fix for x86_64 to be more like i386

2007-09-26 Thread David Bahi
Thanks to tglx and ghaskins for all the help in tracking down a very
early nmi_watchdog crash on certain x86_64 machines.

This modifies nmi_watchdog_tick behavior for 
x86_64 arch to consider both timer and hpet IRQs
just as the i386 arch does.

Signed-off-by: David Bahi <[EMAIL PROTECTED]>

---
 arch/x86_64/kernel/nmi.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Index: linux-2.6.22.8-rt9_1267/arch/x86_64/kernel/nmi.c
===
--- linux-2.6.22.8-rt9_1267.orig/arch/x86_64/kernel/nmi.c
+++ linux-2.6.22.8-rt9_1267/arch/x86_64/kernel/nmi.c
@@ -369,8 +369,6 @@ int notrace __kprobes nmi_watchdog_tick(
touched = 1;
}
 
-   sum = read_pda(apic_timer_irqs);
-
if (__get_cpu_var(nmi_touch)) {
__get_cpu_var(nmi_touch) = 0;
touched = 1;
@@ -386,6 +384,12 @@ int notrace __kprobes nmi_watchdog_tick(
cpu_clear(cpu, backtrace_mask);
}
 
+   /*
+* Take the local apic timer and PIT/HPET into account. We don't
+* know which one is active, when we have highres/dyntick on
+*/
+   sum = read_pda(apic_timer_irqs) + kstat_cpu(cpu).irqs[0];
+
 #ifdef CONFIG_X86_MCE
/* Could check oops_in_progress here too, but it's safer
   not too */


signature.asc
Description: PGP signature


nmi_watchdog fix for x86_64 to be more like i386

2007-09-26 Thread David Bahi
Thanks to tglx and ghaskins for all the help in tracking down a very
early nmi_watchdog crash on certain x86_64 machines.

This modifies nmi_watchdog_tick behavior for 
x86_64 arch to consider both timer and hpet IRQs
just as the i386 arch does.

Signed-off-by: David Bahi [EMAIL PROTECTED]

---
 arch/x86_64/kernel/nmi.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Index: linux-2.6.22.8-rt9_1267/arch/x86_64/kernel/nmi.c
===
--- linux-2.6.22.8-rt9_1267.orig/arch/x86_64/kernel/nmi.c
+++ linux-2.6.22.8-rt9_1267/arch/x86_64/kernel/nmi.c
@@ -369,8 +369,6 @@ int notrace __kprobes nmi_watchdog_tick(
touched = 1;
}
 
-   sum = read_pda(apic_timer_irqs);
-
if (__get_cpu_var(nmi_touch)) {
__get_cpu_var(nmi_touch) = 0;
touched = 1;
@@ -386,6 +384,12 @@ int notrace __kprobes nmi_watchdog_tick(
cpu_clear(cpu, backtrace_mask);
}
 
+   /*
+* Take the local apic timer and PIT/HPET into account. We don't
+* know which one is active, when we have highres/dyntick on
+*/
+   sum = read_pda(apic_timer_irqs) + kstat_cpu(cpu).irqs[0];
+
 #ifdef CONFIG_X86_MCE
/* Could check oops_in_progress here too, but it's safer
   not too */


signature.asc
Description: PGP signature