Re: [PATCH] enable K7 nmi watchdog

2001-01-15 Thread Mikael Pettersson

On Mon, 15 Jan 2001 04:00:29 +0100, Petr Vandrovec wrote:

>(1) You missed some zeros in the MSR_K7_ definitions.

Oops :-(

>(2) AMD's MSRs are real 64-bit (well, 47-bit) values, so the high
>MSR dword must be set to -1, not to 0.

Correct. That was a copy-paste error from the P6 code.
When writing to a perfctr MSR, Intel P6 sign-extends bit 31.
P5, Pentium 4 [*], and AMD K7 don't sign-extend, so on those one
has to pass -1 in the high word.

[*] P4? PIV? P15? NB? Oh why oh why couldn't they just have named
the core P7 ...

>(3) On my CPU, performance counter event 0x76 counts who knows what...
>When the machine is idle, there is exactly one NMI per second.
>When the machine is loaded, the NMI rate climbs up to 100 NMIs
>per second. I have no idea whether someone slows the clock down
>to 10MHz on hlt, or what happens. Maybe they removed this event
>from the documentation because of this. This also means that the
>bootup check for a stuck NMI probably passed only by pure luck,
>because mdelay()/udelay() is implemented as a tight loop.

The varying speed of this counter is unfortunate, but at least
it doesn't stop completely. The NMI oopser should still trigger,
although perhaps after a much longer delay.

>Otherwise it works

Great. Thanks.

/Mikael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] enable K7 nmi watchdog

2001-01-14 Thread Petr Vandrovec

On Sat, Jan 13, 2001 at 04:16:59PM +0100, Mikael Pettersson wrote:
> This patch (against 2.4.0-ac8) _may_ enable the NMI watchdog on
> some K7 systems. It won't help if you have an old K7 without a
> local APIC, or if your BIOS disables it.
> 
> This is a quick hack to test the mechanism -- I'll submit a
> cleaner patch later if this one works.
> 
> If you try this, please cc: me the result (positive or negative)
> and a copy of the kernel's boot log.

Hi,
  I had to change a couple of things in your patch:
(1) You missed some zeros in the MSR_K7_ definitions.
(2) AMD's MSRs are real 64-bit (well, 47-bit) values, so the high
MSR dword must be set to -1, not to 0.
(3) On my CPU, performance counter event 0x76 counts who knows what...
When the machine is idle, there is exactly one NMI per second.
When the machine is loaded, the NMI rate climbs up to 100 NMIs
per second. I have no idea whether someone slows the clock down
to 10MHz on hlt, or what happens. Maybe they removed this event
from the documentation because of this. This also means that the
bootup check for a stuck NMI probably passed only by pure luck,
because mdelay()/udelay() is implemented as a tight loop.

Otherwise it works. I have not checked vmware yet, and I expect
that vmmon will die a painful death, because it disables NMI only
on SMP kernels... But that is easily fixable.

I did not check what happens if I do cli; hlt;, but as the
NMI count climbs, I believe that it will work.

BTW, it was really painful to find which AMD document describes
these MSRs (it is 22007). Their search engine did not find them...

BTW#2: Should the Intel code not use -1 for the high dword too? I have
no documentation handy, but as MSRs are 64-bit registers...

Patch:

diff -urdN linux/arch/i386/kernel/nmi.c linux/arch/i386/kernel/nmi.c
--- linux/arch/i386/kernel/nmi.c  Sun Jan 14 05:02:42 2001
+++ linux/arch/i386/kernel/nmi.c  Mon Jan 15 03:26:15 2001
@@ -64,6 +64,10 @@
(boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) &&
(boot_cpu_data.x86 == 6))
nmi_watchdog = nmi;
+   if ((nmi == NMI_LOCAL_APIC) &&
+   (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
+   (boot_cpu_data.x86 == 6))
+   nmi_watchdog = nmi;
/*
 * We can enable the IO-APIC watchdog
 * unconditionally.
@@ -80,10 +84,34 @@
  * Original code written by Keith Owens.
  */
 
+#define MSR_K7_EVNTSEL0 0xC0010000
+#define MSR_K7_PERFCTR0 0xC0010004
+
 void setup_apic_nmi_watchdog (void)
 {
int value;
 
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
+   boot_cpu_data.x86 == 6) {
+   unsigned evntsel = (1<<20)|(3<<16); /* INT, OS, USR */
+#if 1  /* listed in old docs */
+   evntsel |= 0x76;/* CYCLES_PROCESSOR_IS_RUNNING */
+#else  /* try this if the above doesn't work */
+   evntsel |= 0xC0;/* RETIRED_INSTRUCTIONS */
+#endif
+   wrmsr(MSR_K7_EVNTSEL0, 0, 0);
+   wrmsr(MSR_K7_PERFCTR0, 0, 0);
+   wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
+   printk("setting K7_PERFCTR0 to %08lx\n", -(cpu_khz/HZ*1000));
+   wrmsr(MSR_K7_PERFCTR0, -(cpu_khz/HZ*1000), -1);
+   printk("setting K7 LVTPC to DM_NMI\n");
+   apic_write(APIC_LVTPC, APIC_DM_NMI);
+   evntsel |= (1<<22); /* ENable */
+   printk("setting K7_EVNTSEL0 to %08x\n", evntsel);
+   wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
+   return;
+   }
+
/* clear performance counters 0, 1 */
 
wrmsr(MSR_IA32_EVNTSEL0, 0, 0);
@@ -162,7 +190,14 @@
last_irq_sums[cpu] = sum;
alert_counter[cpu] = 0;
}
-   if (cpu_has_apic && (nmi_watchdog == NMI_LOCAL_APIC))
-   wrmsr(MSR_IA32_PERFCTR1, -(cpu_khz/HZ*1000), 0);
+   if (cpu_has_apic && (nmi_watchdog == NMI_LOCAL_APIC)) {
+   /* XXX: nmi_watchdog should carry this info */
+   unsigned msr;
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
+   wrmsr(MSR_K7_PERFCTR0, -(cpu_khz/HZ*1000), -1);
+   } else {
+   wrmsr(MSR_IA32_PERFCTR1, -(cpu_khz/HZ*1000), 0);
+   }
+   }
 }
 
diff -urdN linux/arch/i386/kernel/setup.c linux/arch/i386/kernel/setup.c
--- linux/arch/i386/kernel/setup.c  Sun Jan 14 05:02:42 2001
+++ linux/arch/i386/kernel/setup.c  Sun Jan 14 05:04:24 2001
@@ -1926,14 +1926,6 @@
c->x86 = 4;
}
 
-   /*
-* Athlons have an APIC, but the APIC-programming
-* MSRs are in different places. If you want NMI-watchdog
-* on Athlons, please fix setup_apic_nmi_watchdog().
-*/
-   if (c->x86_vendor == X86_VENDOR_AMD)
-   clear_bit(X86_FEATURE_APIC, &c->x86_capability);
-

[PATCH] enable K7 nmi watchdog

2001-01-13 Thread Mikael Pettersson

This patch (against 2.4.0-ac8) _may_ enable the NMI watchdog on
some K7 systems. It won't help if you have an old K7 without a
local APIC, or if your BIOS disables it.

This is a quick hack to test the mechanism -- I'll submit a
cleaner patch later if this one works.

If you try this, please cc: me the result (positive or negative)
and a copy of the kernel's boot log.

/Mikael

--- linux-2.4.0-ac8/arch/i386/kernel/nmi.c.~1~  Sat Jan 13 14:57:09 2001
+++ linux-2.4.0-ac8/arch/i386/kernel/nmi.c  Sat Jan 13 16:00:27 2001
@@ -64,6 +64,10 @@
(boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) &&
(boot_cpu_data.x86 == 6))
nmi_watchdog = nmi;
+   if ((nmi == NMI_LOCAL_APIC) &&
+   (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) &&
+   (boot_cpu_data.x86 == 6))
+   nmi_watchdog = nmi;
/*
 * We can enable the IO-APIC watchdog
 * unconditionally.
@@ -80,10 +84,34 @@
  * Original code written by Keith Owens.
  */
 
+#define MSR_K7_EVNTSEL0 0xC001000
+#define MSR_K7_PERFCTR0 0xC001004
+
 void setup_apic_nmi_watchdog (void)
 {
int value;
 
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
+   boot_cpu_data.x86 == 6) {
+   unsigned evntsel = (1<<20)|(3<<16); /* INT, OS, USR */
+#if 1  /* listed in old docs */
+   evntsel |= 0x76;/* CYCLES_PROCESSOR_IS_RUNNING */
+#else  /* try this if the above doesn't work */
+   evntsel |= 0xC0;/* RETIRED_INSTRUCTIONS */
+#endif
+   wrmsr(MSR_K7_EVNTSEL0, 0, 0);
+   wrmsr(MSR_K7_PERFCTR0, 0, 0);
+   wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
+   printk("setting K7_PERFCTR0 to %08lx\n", -(cpu_khz/HZ*1000));
+   wrmsr(MSR_K7_PERFCTR0, -(cpu_khz/HZ*1000), 0);
+   printk("setting K7 LVTPC to DM_NMI\n");
+   apic_write(APIC_LVTPC, APIC_DM_NMI);
+   evntsel |= (1<<22); /* ENable */
+   printk("setting K7_EVNTSEL0 to %08x\n", evntsel);
+   wrmsr(MSR_K7_EVNTSEL0, evntsel, 0);
+   return;
+   }
+
/* clear performance counters 0, 1 */
 
wrmsr(MSR_IA32_EVNTSEL0, 0, 0);
@@ -162,7 +190,14 @@
last_irq_sums[cpu] = sum;
alert_counter[cpu] = 0;
}
-   if (cpu_has_apic && (nmi_watchdog == NMI_LOCAL_APIC))
-   wrmsr(MSR_IA32_PERFCTR1, -(cpu_khz/HZ*1000), 0);
+   if (cpu_has_apic && (nmi_watchdog == NMI_LOCAL_APIC)) {
+   /* XXX: nmi_watchdog should carry this info */
+   unsigned msr;
+   if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+   msr = MSR_K7_PERFCTR0;
+   else
+   msr = MSR_IA32_PERFCTR1;
+   wrmsr(msr, -(cpu_khz/HZ*1000), 0);
+   }
 }
 
--- linux-2.4.0-ac8/arch/i386/kernel/setup.c.~1~  Sat Jan 13 14:57:09 2001
+++ linux-2.4.0-ac8/arch/i386/kernel/setup.c  Sat Jan 13 14:57:48 2001
@@ -1926,14 +1926,6 @@
c->x86 = 4;
}
 
-   /*
-* Athlons have an APIC, but the APIC-programming
-* MSRs are in different places. If you want NMI-watchdog
-* on Athlons, please fix setup_apic_nmi_watchdog().
-*/
-   if (c->x86_vendor == X86_VENDOR_AMD)
-   clear_bit(X86_FEATURE_APIC, &c->x86_capability);
-
/* AMD-defined flags: level 0x80000001 */
xlvl = cpuid_eax(0x80000000);
if ( (xlvl & 0xffff0000) == 0x80000000 ) {