RE: 2.6.25-current-git hangs on boot
>-----Original Message-----
>From: Rafael J. Wysocki [mailto:[EMAIL PROTECTED]
>Sent: Sunday, February 24, 2008 3:18 AM
>To: Soeren Sonnenburg
>Cc: Oliver Pinter; Linux Kernel; Pallipadi, Venkatesh
>Subject: Re: 2.6.25-current-git hangs on boot
>
>On Sunday, 24 of February 2008, Soeren Sonnenburg wrote:
>> On Sat, 2008-02-23 at 20:00 +0100, Oliver Pinter wrote:
>> > the pci=nommconf kernel parameter helped it?
>>
>> yes indeed, this switch reliably helps to overcome the hang at *this
>> stage* (I tried booting both with the switch and w/o).
>>
>> however with 50% chance I still see a hang directly after
>>
>> cpuidle: using governor ladder
>
>Do you have CONFIG_CPU_IDLE set? If you have, please try to unset it and
>retest.
>

Rafael,

I am looking at the CPU_IDLE part of this regression. I just want to note
that there is another regression in current git: booting now needs
pci=nommconf, which was not required in .24. I am not sure whether you are
already tracking that as a separate issue.

Thanks,
Venki
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
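When comparing boots with and without pci=nommconf, it helps to confirm which parameters the running kernel was actually started with. A minimal sketch, assuming a POSIX shell; the function name has_boot_param and its optional file argument are made up here for illustration and are not part of any kernel tooling:

```shell
#!/bin/sh
# Return success if an exact boot parameter appears on the kernel command line.
# The optional second argument (a file to read instead of /proc/cmdline) is an
# assumption added so the function can be exercised against a saved copy.
has_boot_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Compare each whitespace-separated word as a string, not a number.
    awk -v p="$param" '
        { for (i = 1; i <= NF; i++) if (($i "") == (p "")) found = 1 }
        END { exit !found }
    ' "$cmdline_file"
}

# Usage:
#   has_boot_param pci=nommconf && echo "booted with pci=nommconf"
```

Matching whole words avoids a false positive when one parameter is a prefix of another.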
RE: 100% C0 with 2.6.25-rc
>-----Original Message-----
>From: Jan Willies [mailto:[EMAIL PROTECTED]
>Sent: Friday, February 22, 2008 9:50 AM
>To: Pallipadi, Venkatesh
>Cc: Rafael J. Wysocki; [EMAIL PROTECTED]; Ingo Molnar; LKML;
>Thomas Gleixner
>Subject: Re: 100% C0 with 2.6.25-rc
>
>Pallipadi, Venkatesh wrote:
>> One question. Do you have CONFIG_CPU_IDLE enabled in your config?
>
>Yes, I have. Attached is my .config file
>

Can you try with it disabled and check whether you still see the problem?

Thanks,
Venki
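Whether CONFIG_CPU_IDLE is set can be read straight from the kernel .config, where an enabled symbol appears as "CONFIG_FOO=y" (or "=m") and a disabled one as "# CONFIG_FOO is not set" or not at all. A small sketch under that assumption; kconfig_state is a hypothetical helper name, not an existing kernel script:

```shell
#!/bin/sh
# Print the value of a symbol from a kernel .config file, or "n" if the
# symbol is unset ("# CONFIG_FOO is not set") or missing entirely.
kconfig_state() {
    symbol=$1
    config_file=${2:-.config}
    # Anchor on "SYMBOL=" so CONFIG_CPU does not match CONFIG_CPU_IDLE.
    value=$(sed -n "s/^${symbol}=\(.*\)\$/\1/p" "$config_file" | head -n 1)
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo n
    fi
}

# Usage:
#   kconfig_state CONFIG_CPU_IDLE /usr/src/linux/.config
```

To retest with the option off, disable the symbol in the config, rebuild, and boot the new kernel; the helper only reports what a given .config file says.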
RE: 100% C0 with 2.6.25-rc
>-----Original Message-----
>From: Jan Willies [mailto:[EMAIL PROTECTED]
>Sent: Friday, February 22, 2008 7:56 AM
>To: Rafael J. Wysocki
>Cc: [EMAIL PROTECTED]; Ingo Molnar; LKML; Thomas Gleixner;
>Pallipadi, Venkatesh
>Subject: Re: 100% C0 with 2.6.25-rc
>
>Rafael J. Wysocki wrote:
>> On Thursday, 21 of February 2008, Jan Willies wrote:
>>> Since 2.6.25-rc1 I have a lot of wakeups/s (≈134191,4) and spend 100% in C0.
>>> It worked fine with 2.6.24 and commandline nolapic. Without nolapic I had 80k
>>> wakeups/s after some time, but not right from the start like now.
>>
>> We have a regression from 2.6.24, apparently interrupts-related.
>
>After a lot of bisecting I've found the bad commit:
>
>9b12e18cdc1553de62d931e73443c806347cd974 is first bad commit
>commit 9b12e18cdc1553de62d931e73443c806347cd974
>Author: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
>Date: Thu Jan 31 17:35:05 2008 -0800
>
>ACPI: cpuidle: Support C1 idle time accounting
>
>Show C1 idle time in /sysfs cpuidle interface. C1 idle time may not
>be entirely accurate in all cases. It includes the time spent
>in the interrupt handler after wakeup with "hlt" based C1. But, it will
>be accurate with "mwait" based C1.
>
>
>Reverting the commit brings my laptop back to C2.
>

One question. Do you have CONFIG_CPU_IDLE enabled in your config?

Thanks,
Venki
RE: 100% C0 with 2.6.25-rc
>-----Original Message-----
>From: Jan Willies [mailto:[EMAIL PROTECTED]
>Sent: Friday, February 22, 2008 7:56 AM
>To: Rafael J. Wysocki
>Cc: [EMAIL PROTECTED]; Ingo Molnar; LKML; Thomas Gleixner;
>Pallipadi, Venkatesh
>Subject: Re: 100% C0 with 2.6.25-rc
>
>Rafael J. Wysocki wrote:
>> On Thursday, 21 of February 2008, Jan Willies wrote:
>>> Since 2.6.25-rc1 I have a lot of wakeups/s (≈134191,4) and spend 100% in C0.
>>> It worked fine with 2.6.24 and commandline nolapic. Without nolapic I had 80k
>>> wakeups/s after some time, but not right from the start like now.
>>
>> We have a regression from 2.6.24, apparently interrupts-related.
>
>After a lot of bisecting I've found the bad commit:
>
>9b12e18cdc1553de62d931e73443c806347cd974 is first bad commit
>commit 9b12e18cdc1553de62d931e73443c806347cd974
>Author: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
>Date: Thu Jan 31 17:35:05 2008 -0800
>
>ACPI: cpuidle: Support C1 idle time accounting
>
>Show C1 idle time in /sysfs cpuidle interface. C1 idle time may not
>be entirely accurate in all cases. It includes the time spent
>in the interrupt handler after wakeup with "hlt" based C1. But, it will
>be accurate with "mwait" based C1.
>
>
>Reverting the commit brings my laptop back to C2.
>

Thanks for the bisect info. I will look at the bad side effects that patch
may be having, and I should have a patch for you to test later today.

Thanks,
Venki
RE: kernel-2.6.25-rc2, CPU C state "active state" always remain C0 unchangeable by cat 'proc/'
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of
>[EMAIL PROTECTED]
>Sent: Wednesday, February 20, 2008 11:01 PM
>To: linux-kernel@vger.kernel.org
>Cc: Song, Youquan
>Subject: kernel-2.6.25-rc2, CPU C state "active state" always
>remain C0 unchangeable by cat 'proc/'
>
>On Fedora9 Alpha with kernel-2.6.25-rc2,
>I run " watch -n 0.1 "cat /proc/acpi/processor/CPU*/power |
>grep 'active state'" ", I find that the CPU C state information
>"active state" remains at "C0" and does not change at any time.
>The same issue also exists on the 2.6.24-rc2 kernel.
>But with RHEL5.1 kernel-2.6.18 the CPU C state information is normal.
>
>
>Every 0.1s: cat /proc/acpi/processor/CPU*/power | grep 'active state'
>
>
>Thu Feb 21 09:44:33 2008
>
>active state:C0
>active state:C0
>active state:C0
>active state:C0
>active state:C0
>active state:C0
>active state:C0
>active state:C0
>
>On RHEL5.1 kernel-2.6.18-53.el5
>[EMAIL PROTECTED] ~]# cat /proc/acpi/processor/CPU*/power | grep
>'active state'
>active state:C2
>active state:C2
>active state:C2
>active state:C2
>active state:C2
>active state:C3
>active state:C2
>active state:C3
>
>The hardware platform: bensley and santarosa.
>

I am not sure "active state" gives a lot of information about C-states. And
I really hate to see code waking up every .1 sec to check the active
state :-).

/proc/acpi/processor/*/power is marked deprecated upstream, and if you
really want this field, you will have to disable CONFIG_CPU_IDLE in the
upstream kernel. With CPU_IDLE you get more detailed information about the
entry counts and residencies of all C-states under
/sys/devices/system/cpu/cpu*/cpuidle/

Thanks,
Venki
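The cpuidle directory mentioned above can be read once, without polling. A rough sketch, assuming each stateN subdirectory carries "name", "usage" (entry count) and "time" (residency) files as in the cpuidle sysfs layout; cpuidle_summary is a made-up helper name, and the set of files may differ between kernel versions:

```shell
#!/bin/sh
# Print one line per idle state of a CPU: "<state dir> <name> <usage> <time>".
# "usage" is the number of times the state was entered and "time" the total
# residency, as reported by the sysfs files of the same names.
cpuidle_summary() {
    cpu_dir=${1:-/sys/devices/system/cpu/cpu0}
    for state in "$cpu_dir"/cpuidle/state*; do
        # Skip the unexpanded glob when no state directories exist.
        [ -d "$state" ] || continue
        printf '%s %s %s %s\n' \
            "$(basename "$state")" \
            "$(cat "$state/name")" \
            "$(cat "$state/usage")" \
            "$(cat "$state/time")"
    done
}

# Usage:
#   cpuidle_summary /sys/devices/system/cpu/cpu1
```

Taking two snapshots some seconds apart and comparing the counters shows which C-states were entered in between, which the single "active state" field cannot.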
RE: [2.6.25-rc1 regression] Suspend to RAM (bisected)
>-----Original Message-----
>From: Calvin Walton [mailto:[EMAIL PROTECTED]
>Sent: Sunday, February 10, 2008 9:48 PM
>To: Carlos R. Mafra
>Cc: linux-kernel@vger.kernel.org; Pallipadi, Venkatesh
>Subject: Re: [2.6.25-rc1 regression] Suspend to RAM (bisected)
>
>On Mon, 2008-02-11 at 03:25 -0200, Carlos R. Mafra wrote:
>> Hi,
>>
>> I want to report that suspend to RAM stopped working on my Sony Vaio
>> VGN-FZ240E in 2.6.25-rc1 and that I could bisect the problem down
>> to:
>>
>> commit bc71bec91f9875ef825d12104acf3bf4ca215fa4
>> Author: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
>> Date: Thu Jan 31 17:35:04 2008 -0800
>>
>> ACPI: enable MWAIT for C1 idle
>
>I normally hate to throw in a 'me-too', but I'm also seeing a
>suspend-to-ram regression on my Thinkpad R61i that I've managed to
>bisect down to the same patch series.
>

Carlos, Calvin,

Can you send me the output of acpidump and the full dmesg? It looks like a
platform issue that prevents us from using C1 mwait idle during
suspend/resume, similar to the issue we had with using C2/C3 states during
idle.

Thanks,
Venki
RE: (ondemand) CPU governor regression between 2.6.23 and 2.6.24
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Andrew Morton
>Sent: Sunday, February 03, 2008 4:33 PM
>To: Toralf Förster
>Cc: linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
>Subject: Re: (ondemand) CPU governor regression between 2.6.23
>and 2.6.24
>
>On Sat, 26 Jan 2008 15:06:25 +0100 Toralf Förster
><[EMAIL PROTECTED]> wrote:
>
>> I use a 1-liner for a simple performance check : "time factor 819734028463158891"
>> Here is the result for the new (Gentoo) kernel 2.6.24:
>>
>> With the ondemand governor I get:
>>
>> [EMAIL PROTECTED] ~/tmp $ time factor 819734028463158891
>> 819734028463158891: 3 273244676154386297
>>
>> real    0m32.997s
>> user    0m15.732s
>> sys     0m0.014s
>>
>> With the ondemand governor the CPU runs at 600 MHz,
>> whereas with the performance governor I get :
>>
>> [EMAIL PROTECTED] ~/tmp $ time factor 819734028463158891
>> 819734028463158891: 3 273244676154386297
>>
>> real    0m10.893s
>> user    0m5.444s
>> sys     0m0.000s
>>
>> (~5.5 sec as I expected) b/c the CPU is set to 1.7 GHz.
>>
>> The ondemand governor of previous kernel versions however automatically increased
>> the CPU speed from 600 MHz to 1.7 GHz.
>>
>> My system is a ThinkPad T41, I'll attach the .config
>>
>

This looks like it is related to the report here:
http://www.ussg.iu.edu/hypermail/linux/kernel/0801.3/1260.html

Can you try the workarounds on that thread and see whether the problem goes
away?

Thanks,
Venki
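One way to confirm that the governor never raises the frequency is to read the cpufreq sysfs files while the benchmark runs. A minimal sketch, assuming the usual scaling_governor, scaling_cur_freq and scaling_max_freq files (frequencies in kHz) are present; cpufreq_status is a hypothetical helper name:

```shell
#!/bin/sh
# Print the active governor plus the current and maximum frequency in MHz.
# The directory argument defaults to cpu0 and exists mainly so the function
# can be pointed at another CPU or at a saved copy of the files.
cpufreq_status() {
    cpufreq_dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    governor=$(cat "$cpufreq_dir/scaling_governor")
    cur_khz=$(cat "$cpufreq_dir/scaling_cur_freq")
    max_khz=$(cat "$cpufreq_dir/scaling_max_freq")
    echo "governor=$governor cur=$((cur_khz / 1000))MHz max=$((max_khz / 1000))MHz"
}

# Usage (in a second terminal while "time factor ..." is running):
#   cpufreq_status
```

The quoted timings fit a CPU that stayed at 600 MHz: the user-time ratio 15.732 / 5.444 is about 2.89, close to the frequency ratio 1700 / 600 of about 2.83.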
RE: [PATCH] Force enable HPET on (some?) ICH9 boards
Patch looks good. If the BIOS does not report HPET on more such systems, we
may have to add other chipsets in the ICH9 family (ICH9_8, ...) as well.

Acked-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

>-----Original Message-----
>From: Alistair John Strachan [mailto:[EMAIL PROTECTED]
>Sent: Sunday, January 27, 2008 6:33 AM
>To: Pallipadi, Venkatesh
>Cc: Ingo Molnar; Linux Kernel Mailing List; Alistair John Strachan
>Subject: [PATCH] Force enable HPET on (some?) ICH9 boards
>
>Some consumer ICH9 boards (such as the Abit IP35 Pro) do not provide a BIOS
>option for enabling the HPET. The same ICH workaround used for 6,7,8 can be
>applied to 9. Here I enable the only PCI id that was visible on my system.
>
>I have confirmed the HPETs work both from userspace and as a clocksource for
>the running kernel (2.6.24 here) after applying this patch.
>
>Force enabled HPET at base address 0xfed0
>hpet clockevent registered
>hpet0: at MMIO 0xfed0, IRQs 2, 8, 0, 0
>hpet0: 4 64-bit timers, 14318180 Hz
>
>Signed-off-by: Alistair John Strachan <[EMAIL PROTECTED]>
>
>---
> arch/x86/kernel/quirks.c |    2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
>diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
>index fab30e1..150ba29 100644
>--- a/arch/x86/kernel/quirks.c
>+++ b/arch/x86/kernel/quirks.c
>@@ -162,6 +162,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH7_31,
> 			 ich_force_enable_hpet);
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH8_1,
> 			 ich_force_enable_hpet);
>+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH9_7,
>+			 ich_force_enable_hpet);
>
>
> static struct pci_dev *cached_dev;
>
>--
>Cheers,
>Alistair.
>
>137/1 Warrender Park Road, Edinburgh, UK.
>
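Besides the boot messages quoted in the patch description, the clocksource sysfs files give a quick check that the force-enabled HPET is usable by the running kernel. A sketch under the assumption that the standard clocksource0 directory with available_clocksource and current_clocksource is present; both function names are invented for this example:

```shell
#!/bin/sh
# Return success if the named clocksource is listed as available.
# The directory argument is an assumption added so the function can be
# pointed at a test tree; by default it reads the real sysfs location.
clocksource_available() {
    wanted=$1
    cs_dir=${2:-/sys/devices/system/clocksource/clocksource0}
    for cs in $(cat "$cs_dir/available_clocksource"); do
        if [ "$cs" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Print the clocksource the running kernel is currently using.
current_clocksource() {
    cs_dir=${1:-/sys/devices/system/clocksource/clocksource0}
    cat "$cs_dir/current_clocksource"
}

# Usage:
#   clocksource_available hpet && echo "hpet available, using $(current_clocksource)"
```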
RE: [PATCH] X86: fix typo PAT to X86_PAT
>-----Original Message-----
>From: Dave Jones [mailto:[EMAIL PROTECTED]
>Sent: Friday, January 18, 2008 7:29 PM
>To: Ingo Molnar
>Cc: Yinghai Lu; Pallipadi, Venkatesh; LKML
>Subject: Re: [PATCH] X86: fix typo PAT to X86_PAT
>
>On Fri, Jan 18, 2008 at 10:02:10PM +0100, Ingo Molnar wrote:
> >
> > * Dave Jones <[EMAIL PROTECTED]> wrote:
> >
> > > > you mean modifies MTRRs? Which code is that? (besides the
> > > > /proc/mtrr userspace API)
> > >
> > > This exclusion is going to be a real pain in the ass for distro
> > > kernels. It's impossible for example to build a kernel that will now
> > > support the MTRR-alike registers on the AMD K6/early Cyrix etc and
> > > also support PAT.
> > >
> > > Additionally, given people tend to update their kernels a lot more
> > > often than they update to a whole new version of X, it means until
> > > userspace has caught up, we can't ship a kernel with PAT supported, or
> > > else X gets a lot slower due to the missing mtrr support.
> >
> > there's no exclusion enforced right now, and if a CPU is PAT-incapable
> > (or if the kernel is booted nopat) then the MTRR bits should be usable.
> > But if we boot with PAT enabled, and Xorg gets /proc/mtrr wrong, we'll
> > see nasty crashes. If it gets them right, it should all still work just
> > fine. Is this ok? Then, in a year or two, distros can disable write
> > support to /proc/mtrr. Hm?
>
>A crazy idea just occurred to me.. We could make /proc/mtrr an interface
>to set PAT on a range of memory. This would make it transparently work
>without any changes in X or anything else that sets them in userspace.
>

Yes. We actually used this earlier while we were testing PAT functionality
internally :).

There are some issues though.

1) Current X does a /dev/mem mapping of the region followed by the MTRR
   setting for this region. For this to work with PAT-based MTRR, either
   the order has to change (so that there won't be any conflict due to the
   WB devmem mapping when we try to simulate mtrr) or we need a mechanism
   to go and change the devmem mapping to reflect the later PAT attribute
   changes.
2) We will have to fail the mtrr setting when there are hard conflicts
   with PAT requests.

We will look at this as a possible optimization for the next round of PAT
patches. But, to work with existing X, we will need a mechanism to go and
change existing mappings, which is slightly more complicated than what we
already have with the current PAT changes.

Thanks,
Venki
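To see which ranges X (or anything else) has registered through the existing interface, /proc/mtrr can be summarized per register. A sketch only, assuming the usual line format "regNN: base=0x... ( ...MB), size= ...MB: <type>, count=N"; mtrr_regions is a hypothetical helper, and lines in any other format are silently skipped:

```shell
#!/bin/sh
# Print "<register> <base> <size> <type>" for each entry in /proc/mtrr.
# Example input line:
#   reg00: base=0x00000000 (   0MB), size= 256MB: write-back, count=1
mtrr_regions() {
    mtrr_file=${1:-/proc/mtrr}
    sed -n 's/^\(reg[0-9]*\): base=\(0x[0-9a-fA-F]*\) .*size= *\([0-9]*[KMG]B\): \([a-z-]*\),.*$/\1 \2 \3 \4/p' "$mtrr_file"
}

# Usage:
#   mtrr_regions | grep write-combining
```

A write-combining entry over the framebuffer range is the kind of setting that would have to be translated into a PAT attribute, and rejected on a hard conflict, under the scheme discussed in this thread.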
RE: [PATCH] X86: fix typo PAT to X86_PAT
>-----Original Message-----
>From: Dave Jones [mailto:[EMAIL PROTECTED]
>Sent: Friday, January 18, 2008 10:25 AM
>To: Ingo Molnar
>Cc: Yinghai Lu; Pallipadi, Venkatesh; LKML
>Subject: Re: [PATCH] X86: fix typo PAT to X86_PAT
>
>On Fri, Jan 18, 2008 at 01:31:40PM +0100, Ingo Molnar wrote:
>
> > * Yinghai Lu <[EMAIL PROTECTED]> wrote:
> >
> > > > thanks. But, i think we should rather do the following: if X86_PAT
> > > > is enabled then /proc/mtrr should be read-only. There's no problem
> > > > _looking_ at MTRR contents, as long as we do not try to modify them.
> > > > Hm?
> > >
> > > anyway
> > >
> > > depends on !PAT
> > >
> > > need to be removed.
> > >
> > > it seems when PAT is used, some code still touch MTRR.
> >
> > you mean modifies MTRRs? Which code is that? (besides the /proc/mtrr
> > userspace API)
>
>This exclusion is going to be a real pain in the ass for distro kernels.
>It's impossible for example to build a kernel that will now support
>the MTRR-alike registers on the AMD K6/early Cyrix etc and also
>support PAT.
>

Actually, this exclusion will not work at all with the current code. In
fact, for the current code it should be "PAT selects MTRR", because
pat_init() is called during MTRR init, as the rules for changing PAT and
for changing MTRRs are the same. Further, MTRR is always required on SMP,
as we read the MTRR settings from the boot CPU and set them on the APs at
boot time.

We should only remove the /proc/mtrr write permissions with CONFIG_PAT. We
need to deprecate it for a while before that...

Ingo, can you remove this PAT/MTRR exclusion?

Thanks,
Venki
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>-----Original Message-----
>From: Andreas Herrmann3 [mailto:[EMAIL PROTECTED]
>Sent: Friday, January 18, 2008 8:11 AM
>To: Pallipadi, Venkatesh
>Cc: Ingo Molnar; Siddha, Suresh B; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org
>Subject: Re: [patch 0/4] x86: PAT followup - Incremental
>changes and bug fixes
>
>On Thu, Jan 17, 2008 at 03:04:10PM -0800, Venki Pallipadi wrote:
>>
>> Below is another potential fix for the problem here. Going through ACPI
>> ioremap usages, we found at one place the mapping is cached for possible
>> optimization reason and not unmapped later. Patch below always unmaps
>> ioremap at this place in ACPICA.
>
>The patch does not fix the problem. The conflicting cache attributes are
>still there.
>

Andreas,

Could you also try the patch Suresh Siddha sent out yesterday? It covers
the case where the attribute was not getting removed even after unmap was
called.

Thanks,
Venki
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>-----Original Message-----
>From: Andreas Herrmann3 [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 17, 2008 3:25 PM
>To: Pallipadi, Venkatesh
>Cc: Ingo Molnar; Siddha, Suresh B; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org
>Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>
>On Thu, Jan 17, 2008 at 03:04:10PM -0800, Venki Pallipadi wrote:
>>
>> Below is another potential fix for the problem here. Going through ACPI
>> ioremap usages, we found at one place the mapping is cached for possible
>> optimization reason and not unmapped later. Patch below always unmaps
>> ioremap at this place in ACPICA.
>>
>> Thanks,
>> Venki
>>
>>
>> Index: linux-2.6.git/drivers/acpi/executer/exregion.c
>> ===
>> --- linux-2.6.git.orig/drivers/acpi/executer/exregion.c	2008-01-17 03:18:39.0 -0800
>> +++ linux-2.6.git/drivers/acpi/executer/exregion.c	2008-01-17 07:34:33.0 -0800
>> @@ -48,6 +48,8 @@
>>  #define _COMPONENT ACPI_EXECUTER
>>  ACPI_MODULE_NAME("exregion")
>>
>> +static int ioremap_cache;
>> +
>>  /**
>>   *
>>   * FUNCTION:acpi_ex_system_memory_space_handler
>> @@ -249,6 +251,13 @@
>>  		break;
>>  	}
>>
>> +	if (!ioremap_cache) {
>> +		acpi_os_unmap_memory(mem_info->mapped_logical_address,
>> +				     window_size);
>> +		mem_info->mapped_logical_address = 0;
>> +		mem_info->mapped_physical_address = 0;
>> +		mem_info->mapped_length = 0;
>> +	}
>>  	return_ACPI_STATUS(status);
>>  }
>>
>
>
>Applying and compiling your patch I see:
>
>  CC      drivers/acpi/executer/exregion.o
>drivers/acpi/executer/exregion.c: In function 'acpi_ex_system_memory_space_handler':
>drivers/acpi/executer/exregion.c:81: warning: 'window_size' may be used uninitialized in this function
>
>
>After glancing through this file it seems that ioremap_cache is always 0
>and acpi_os_unmap_memory will unconditionally be executed at end of this function.
>I am not familiar with that code. But I just want to reinsure that this
>is what you want. And if so, why is that variable needed?
>But maybe I missed something ...

I missed that warning. But it should not matter for testing this patch,
as we always initialize window_size with the patch.

Yes. The variable is not needed. With the patch I always map at the
beginning of this function and unmap at the end. I just kept the
variable as I was initially planning to add a boot option to control
this. But I later decided to keep the test patch simple, without any
boot option. We can come up with a better patch once we know that the
test patch helps.

Thanks,
Venki
RE: [-mm Patch] uml: fix a building error
>-----Original Message-----
>From: Jeff Dike [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 17, 2008 3:08 PM
>To: Pallipadi, Venkatesh
>Cc: Andrew Morton; Mariusz Kozlowski; WANG Cong;
>linux-kernel@vger.kernel.org;
>[EMAIL PROTECTED]; David Miller;
>[EMAIL PROTECTED]; Ingo Molnar; Thomas Gleixner
>Subject: Re: [-mm Patch] uml: fix a building error
>
>On Thu, Jan 17, 2008 at 01:41:50PM -0800, Venki Pallipadi wrote:
>> > And while we're on the subject, what's the deal with these, in
>> > include/asm-x86/io.h?
>> >
>> > #define ioremap_wc ioremap_wc
>> > #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr
>> >
>>
>> If archs want to override the defaults for these two functions, they define
>> the above and then include asm-generic/iomap.h.
>
>That wasn't really the question.
>
>#define X X
>
>is a no-op, yes?
>

Later there is code in generic.h which is doing

#ifndef ioremap_wc
#define ioremap_wc ioremap_nocache
#endif

Thanks,
Venki
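For readers following the "#define X X" exchange, here is a minimal user-space sketch of the idiom, not the kernel headers themselves. The names map_wc, map_wt and map_nocache are hypothetical stand-ins chosen for illustration. The self-referential define does not change what the name expands to; it only makes the name visible to a later #ifndef, so the generic fallback is skipped for names an architecture has announced and taken for the rest.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel's functions. */

/* "Arch header": provides its own map_wc and announces it. */
static const char *map_wc(void) { return "arch write-combining"; }
#define map_wc map_wc          /* expands to itself; exists so #ifndef sees it */

/* "Generic header": supplies defaults for whatever was not announced. */
static const char *map_nocache(void) { return "generic uncached"; }

#ifndef map_wc
#define map_wc map_nocache     /* skipped: the arch announced map_wc */
#endif

#ifndef map_wt
#define map_wt map_nocache     /* taken: nothing announced map_wt */
#endif

/* Report whether map_wc still resolves to the arch-provided function. */
int arch_override_used(void)
{
    return strcmp(map_wc(), "arch write-combining") == 0;
}

/* Report whether map_wt fell back to the generic default. */
int generic_fallback_used(void)
{
    return strcmp(map_wt(), "generic uncached") == 0;
}
```

So the define is a no-op for the preprocessor's expansion of the name, but not for conditional compilation that tests whether the name is defined.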
RE: [-mm Patch] uml: fix a building error
>-----Original Message-----
>From: Andrew Morton [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 17, 2008 10:56 AM
>To: Mariusz Kozlowski
>Cc: WANG Cong; linux-kernel@vger.kernel.org; Jeff Dike;
>[EMAIL PROTECTED]; David Miller;
>[EMAIL PROTECTED]; Ingo Molnar; Thomas Gleixner;
>Pallipadi, Venkatesh
>Subject: Re: [-mm Patch] uml: fix a building error
>
>On Thu, 17 Jan 2008 19:11:13 +0100 Mariusz Kozlowski <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> > This patch fixes this building error:
>> > ...
>> > drivers/char/mem.c: In function 'read_mem':
>> > drivers/char/mem.c:136: error: implicit declaration of function 'unxlate_dev_mem_ptr'
>> > ...
>>
>> I see this on sparc64 as well:
>>
>>   CC      drivers/char/mem.o
>> drivers/char/mem.c: In function 'read_mem':
>> drivers/char/mem.c:136: error: implicit declaration of function 'unxlate_dev_mem_ptr'
>> make[2]: *** [drivers/char/mem.o] Error 1
>> make[1]: *** [drivers/char] Error 2
>> make: *** [drivers] Error 2
>>
>> Does sparc64 need similar fix?
>>
>
>The PAT patches strike again.
>
>Ingo, I think you might need to toss some cross-compilers into that build
>test setup of yours.
>

These functions were defined for other archs in asm-generic/iomap.h. We
need all archs to include it in io.h, but I now see that only a few
archs do. Apart from unxlate, there is also ioremap_wc, which is defined
in the same way.

I can send a patch for this. But I don't have a cross-compiler setup for
all archs to test. Andrew, I will need your help.

Thanks,
Venki
RE: 2.6.24-rc8-mm1
>-----Original Message-----
>From: Andrew Morton [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 17, 2008 10:40 AM
>To: [EMAIL PROTECTED]
>Cc: linux-kernel@vger.kernel.org; Linux ACPI mailing list;
>Intel E/100 mailing list; Ingo Molnar; Thomas Gleixner;
>Pallipadi, Venkatesh
>Subject: Re: 2.6.24-rc8-mm1
>
>On Thu, 17 Jan 2008 18:16:22 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote:
>
>> * Andrew Morton <[EMAIL PROTECTED]> [2008-01-17 02:35:14]:
>>
>> >
>> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc8/2.6.24-rc8-mm1/
>> >
>> > - selinux is busted on one of my two selinux-enabled test machines.
>> >
>> > - suspend-to-ram and suspend-to-disk are totally hosed on one of my test
>> >   machines. I guess I get to bisect this.
>> >
>> > - git-nfsd is dropped due to conflicts with git-nfs
>> >
>> > - git-newsetup is dropped due to conflicts with git-x86 (I think)
>> >
>> > - git-perfmon is dropped due to conflicts with git-x86 (I think)
>> >
>> > - git-kgdb is dropped due to conflicts with git-damn-near-everything
>> >
>> > - git-block is dropped due to conflicts with the IDE tree
>> >
>> > - kvm probably doesn't work properly because I couldn't be bothered fixing
>> >   the conflicts between git-kvm and the driver tree
>> >
>> > - the volume of rejects and build errors which are caused by subsystem
>> >   maintainers fiddling with other people's stuff is quite out of control.
>> >   Something needs to happen here.
>>
>> Hi, Andrew,
>>
>> May be it was one of the conflicts, but my system fails to get
>> ethernet working with this version. I see
>>
>> e100: Intel(R) PRO/100 Network Driver, 3.5.23-k4-NAPI
>> e100: Copyright(c) 1999-2006 Intel Corporation
>> ACPI: PCI Interrupt :04:08.0[A] -> GSI 20 (level, low) -> IRQ 20
>> modprobe:2584 conflicting cache attribute 5000-50001000 uncached<->default
>> e100: :04:08.0: e100_probe: Cannot map device registers, aborting.
>> ACPI: PCI interrupt for device :04:08.0 disabled
>> e100: probe of :04:08.0 failed with error -12
>>
>It appears that the new PAT code didn't like e100's pci_iomap(). Venki, can you
>take a look please?
>

This seems similar to a problem we saw yesterday. It may not be specific
to e100; it may be in some generic PCI code. The problem is

>> modprobe:2584 conflicting cache attribute 5000-50001000 uncached<->default

Some address range here is being mapped with conflicting types.
Somewhere the range was mapped with default (write-back). Later
pci_iomap() maps that region as uncacheable, which is basically
aliasing. The PAT code detects the aliasing and fails the second,
uncacheable request, which leads to the failure. We are trying to find
who exactly is mapping this with default at the beginning.

Balbir: Full dmesg with the debug boot parameter may help.

Thanks,
Venki
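The aliasing check described above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the PAT tracking code from the patch series: the names reserve_range, free_range, struct mem_range and the fixed-size table are all hypothetical, and the addresses used with it are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names; the real PAT tracking code differs. */
enum mem_type { MT_DEFAULT_WB, MT_UNCACHED };

struct mem_range {
    uint64_t start, end;       /* half-open interval [start, end) */
    enum mem_type type;
};

#define MAX_RANGES 16
static struct mem_range ranges[MAX_RANGES];
static size_t nr_ranges;

/*
 * Record a mapping request. Returns 0 on success, -1 if the range
 * overlaps an existing entry of a different type (aliasing) or the
 * table is full.
 */
int reserve_range(uint64_t start, uint64_t end, enum mem_type type)
{
    assert(start < end);

    for (size_t i = 0; i < nr_ranges; i++) {
        int overlaps = start < ranges[i].end && ranges[i].start < end;

        if (overlaps && ranges[i].type != type)
            return -1;         /* conflicting cache attribute */
    }
    if (nr_ranges == MAX_RANGES)
        return -1;

    ranges[nr_ranges++] = (struct mem_range){ start, end, type };
    return 0;
}

/* Drop a recorded range so it can later be mapped with another type. */
int free_range(uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < nr_ranges; i++) {
        if (ranges[i].start == start && ranges[i].end == end) {
            ranges[i] = ranges[--nr_ranges];
            return 0;
        }
    }
    return -1;
}
```

In this model, a write-back reservation followed by an uncached request for the same range fails, which mirrors the e100 probe error. It also shows why a mapping that is never unmapped, or whose attribute is not released on unmap, keeps blocking later requests of a different type.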
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>-----Original Message-----
>From: Andi Kleen [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, January 16, 2008 2:02 PM
>To: Pallipadi, Venkatesh
>Cc: Andreas Herrmann; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse;
>[EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>
>> This ioremap failing seems to be the real problem. This can be due to
>> new tracking of ioremaps introduced by PAT patches. We do not allow
>> conflicting ioremaps to same region. Probably that is happening
>
>Normally if there is a conflict there should be a printk (or at least it was
>so in the original mattr code if you haven't changed it)
>

Yes. The printks are there, but they use KERN_DEBUG now. We should
change them to at least WARNING.

Thanks,
Venki
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
Sorry. Never mind about e820 map. Somehow I did not notice the boot.log
you had attached earlier.

Thanks,
Venki

>-----Original Message-----
>From: Pallipadi, Venkatesh
>Sent: Wednesday, January 16, 2008 11:06 AM
>To: 'Andreas Herrmann'
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org
>Subject: RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>
>
>Can you attach the e820 map from the top of your dmesg.
>
>Thanks,
>Venki
>
>>-----Original Message-----
>>From: Andreas Herrmann [mailto:[EMAIL PROTECTED]
>>Sent: Wednesday, January 16, 2008 10:58 AM
>>To: Pallipadi, Venkatesh
>>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>>[EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED];
>>linux-kernel@vger.kernel.org
>>Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>>
>>Hi,
>>
>>I just want to report that the PAT support in x86/mm causes crashes
>>on two of my test machines. On both boxes the SATA detection does
>>not work when the PAT support is patched into the kernel.
>>
>>Symptoms are as follows -- best described by a diff between the
>>two boot.logs:
>>
>># diff boot-failing.log boot-working.log
>>
>>-Linux version 2.6.24-rc8-ga9f7faa5 ([EMAIL PROTECTED]) (gcc version ...
>>+Linux version 2.6.24-rc8-g2ea3cf43 ([EMAIL PROTECTED]) (gcc version ...
>>...
>> early_iounmap(82a0b000, 1000)
>>-early_ioremap(c000, 1000) => -02103394304
>>-early_iounmap(82a0c000, 1000)
>> early_iounmap(82808000, 1000)
>>...
>>-ACPI: PCI interrupt for device :00:12.0 disabled
>>-sata_sil: probe of :00:12.0 failed with error -12
>>+scsi0 : sata_sil
>>+scsi1 : sata_sil
>>+ata1: SATA max UDMA/100 mmio [EMAIL PROTECTED] tf 0xc0403080 irq 22
>>...
>>-AC'97 space ioremap problem
>>-ACPI: PCI interrupt for device :00:14.5 disabled
>>-ATI IXP AC97 controller: probe of :00:14.5 failed with error -5
>> ALSA device list:
>>- No soundcards found.
>>+ #0: ATI IXP rev 80 with ALC655 at 0xc0403800, irq 17
>>...
>>-VFS: Cannot open root device "sda1" or unknown-block(0,0)
>>-Please append a correct "root=" boot option; here are the available partitions:
>>-16004194302 hdc driver: ide-cdrom
>>-Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
>>+kjournald starting. Commit interval 5 seconds
>>+EXT3-fs: mounted filesystem with ordered data mode.
>>+VFS: Mounted root (ext3 filesystem) readonly.
>>...
>>
>>
>>
>>The second test machine uses ahci. But the symptoms are similar.
>>
>>I performed a git-bisect on x86/mm. Last commit that worked for me was
>>
>>2ea3cf43fddecbfd66353caafdf73ec21ea3760b (x86: fix early_ioremap() ISA window)
>>
>>The subsequent commits for PAT support introduced the problem.
>>I noticed that PAT should be disabled by default, but obviously the patches
>>still have some side-effect. (Maybe ioremap changes lead to the problem?)
>>
>>Boot-logs are attached:
>>
>> boot-failing.log for x86/mm as of v2.6.24-rc8-672-ga9f7faa
>> boot-working.log for x86/mm as of v2.6.24-rc8-621-g2ea3cf4
>>
>>Hopefully it helps to track down the problem.
>>Maybe someone has an idea why the PAT patches are causing that
>>ominous "PCI interrupt for device ... disabled" messages.
>>
>>
>>Thanks and regards,
>>
>>Andreas
>>
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
Can you attach the e820 map from the top of your dmesg.

Thanks,
Venki

>-----Original Message-----
>From: Andreas Herrmann [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, January 16, 2008 10:58 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org
>Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
>
>Hi,
>
>I just want to report that the PAT support in x86/mm causes crashes
>on two of my test machines. On both boxes the SATA detection does
>not work when the PAT support is patched into the kernel.
>
>Symptoms are as follows -- best described by a diff between the
>two boot.logs:
>
># diff boot-failing.log boot-working.log
>
>-Linux version 2.6.24-rc8-ga9f7faa5 ([EMAIL PROTECTED]) (gcc version ...
>+Linux version 2.6.24-rc8-g2ea3cf43 ([EMAIL PROTECTED]) (gcc version ...
>...
> early_iounmap(82a0b000, 1000)
>-early_ioremap(c000, 1000) => -02103394304
>-early_iounmap(82a0c000, 1000)
> early_iounmap(82808000, 1000)
>...
>-ACPI: PCI interrupt for device :00:12.0 disabled
>-sata_sil: probe of :00:12.0 failed with error -12
>+scsi0 : sata_sil
>+scsi1 : sata_sil
>+ata1: SATA max UDMA/100 mmio [EMAIL PROTECTED] tf 0xc0403080 irq 22
>...
>-AC'97 space ioremap problem
>-ACPI: PCI interrupt for device :00:14.5 disabled
>-ATI IXP AC97 controller: probe of :00:14.5 failed with error -5
> ALSA device list:
>- No soundcards found.
>+ #0: ATI IXP rev 80 with ALC655 at 0xc0403800, irq 17
>...
>-VFS: Cannot open root device "sda1" or unknown-block(0,0)
>-Please append a correct "root=" boot option; here are the available partitions:
>-16004194302 hdc driver: ide-cdrom
>-Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
>+kjournald starting. Commit interval 5 seconds
>+EXT3-fs: mounted filesystem with ordered data mode.
>+VFS: Mounted root (ext3 filesystem) readonly.
>...
>
>
>
>The second test machine uses ahci. But the symptoms are similar.
>
>I performed a git-bisect on x86/mm. Last commit that worked for me was
>
>2ea3cf43fddecbfd66353caafdf73ec21ea3760b (x86: fix early_ioremap() ISA window)
>
>The subsequent commits for PAT support introduced the problem.
>I noticed that PAT should be disabled by default, but obviously the patches
>still have some side-effect. (Maybe ioremap changes lead to the problem?)
>
>Boot-logs are attached:
>
> boot-failing.log for x86/mm as of v2.6.24-rc8-672-ga9f7faa
> boot-working.log for x86/mm as of v2.6.24-rc8-621-g2ea3cf4
>
>Hopefully it helps to track down the problem.
>Maybe someone has an idea why the PAT patches are causing that
>ominous "PCI interrupt for device ... disabled" messages.
>
>
>Thanks and regards,
>
>Andreas
>
RE: [patch 2/4] x86: PAT followup - Remove KERNPG_TABLE from pte entry
>-----Original Message-----
>From: Mika Penttilä [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, January 16, 2008 12:14 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 2/4] x86: PAT followup - Remove KERNPG_TABLE from pte entry
>
>[EMAIL PROTECTED] wrote:
>> KERNPG_TABLE was a bug in earlier patch. Remove it from pte.
>> pte_val() check is redundant as this routine is called immediately after a
>> ptepage is allocated afresh.
>>
>> Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>
>> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
>>
>> Index: linux-2.6.git/arch/x86/mm/init_64.c
>> ===
>> --- linux-2.6.git.orig/arch/x86/mm/init_64.c	2008-01-15 11:02:23.0 -0800
>> +++ linux-2.6.git/arch/x86/mm/init_64.c	2008-01-15 11:06:37.0 -0800
>> @@ -541,9 +541,6 @@
>>  		if (address >= end)
>>  			break;
>>
>> -		if (pte_val(*pte))
>> -			continue;
>> -
>>  		/* Nothing to map. Map the null page */
>>  		if (!(address & (~PAGE_MASK)) &&
>>  		    (address + PAGE_SIZE <= end) &&
>> @@ -561,9 +558,9 @@
>>  		}
>>
>>  		if (exec)
>> -			entry = _PAGE_NX|_KERNPG_TABLE|_PAGE_GLOBAL|address;
>> +			entry = _PAGE_NX|_PAGE_GLOBAL|address;
>>  		else
>> -			entry = _KERNPG_TABLE|_PAGE_GLOBAL|address;
>> +			entry = _PAGE_GLOBAL|address;
>>  		entry &= __supported_pte_mask;
>>  		set_pte(pte, __pte(entry));
>>  	}
>>
>>
>
>Hmm then what's the point of mapping not present 4k pages for valid mem
>here?
>

My bad... Thanks for the catch. I should have replaced KERNPG_TABLE with
PAGE_KERNEL here.

Thanks,
Venki
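To make the "not present" objection concrete, here is a small user-space sketch. It uses the hardware-defined x86 page-table bit positions, but the macro and function names (PTE_*, make_entry_patched, make_entry_fixed, entry_present) are hypothetical, and PTE_KERNEL_DATA is only an assumed stand-in for a kernel data-mapping flag set, not a copy of the kernel's PAGE_KERNEL definition.

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-table entry bits (hardware-defined positions). */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_RW       (1ULL << 1)
#define PTE_ACCESSED (1ULL << 5)
#define PTE_DIRTY    (1ULL << 6)
#define PTE_GLOBAL   (1ULL << 8)
#define PTE_NX       (1ULL << 63)

/* Assumed composite for a kernel data mapping; illustrative only. */
#define PTE_KERNEL_DATA (PTE_PRESENT | PTE_RW | PTE_ACCESSED | PTE_DIRTY)

/* Entry as built by the patch under review: GLOBAL (plus NX) only. */
uint64_t make_entry_patched(uint64_t address, int nx)
{
    return (nx ? PTE_NX : 0) | PTE_GLOBAL | address;
}

/* Entry with the present/writable bits restored. */
uint64_t make_entry_fixed(uint64_t address, int nx)
{
    return (nx ? PTE_NX : 0) | PTE_KERNEL_DATA | PTE_GLOBAL | address;
}

/* The MMU treats an entry as valid only if the present bit is set. */
int entry_present(uint64_t entry)
{
    return (entry & PTE_PRESENT) != 0;
}
```

With the flags from the patch, the entry carries a page-aligned address and the global bit but no present bit, so the hardware would fault on access; that is the gap Mika points out and the reply acknowledges.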
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
Can you attach the e820 map from the top of your dmesg.

Thanks,
Venki

-----Original Message-----
From: Andreas Herrmann [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 16, 2008 10:58 AM
To: Pallipadi, Venkatesh
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org
Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Hi,

I just want to report that the PAT support in x86/mm causes crashes on two of my test machines. On both boxes the SATA detection does not work when the PAT support is patched into the kernel. Symptoms are as follows -- best described by a diff between the two boot.logs:

# diff boot-failing.log boot-working.log
-Linux version 2.6.24-rc8-ga9f7faa5 ([EMAIL PROTECTED]) (gcc version ...
+Linux version 2.6.24-rc8-g2ea3cf43 ([EMAIL PROTECTED]) (gcc version ...
...
 early_iounmap(82a0b000, 1000)
-early_ioremap(c000, 1000) = -02103394304
-early_iounmap(82a0c000, 1000)
 early_iounmap(82808000, 1000)
...
-ACPI: PCI interrupt for device :00:12.0 disabled
-sata_sil: probe of :00:12.0 failed with error -12
+scsi0 : sata_sil
+scsi1 : sata_sil
+ata1: SATA max UDMA/100 mmio [EMAIL PROTECTED] tf 0xc0403080 irq 22
...
-AC'97 space ioremap problem
-ACPI: PCI interrupt for device :00:14.5 disabled
-ATI IXP AC97 controller: probe of :00:14.5 failed with error -5
 ALSA device list:
- No soundcards found.
+ #0: ATI IXP rev 80 with ALC655 at 0xc0403800, irq 17
...
-VFS: Cannot open root device sda1 or unknown-block(0,0)
-Please append a correct root= boot option; here are the available partitions:
-16004194302 hdc driver: ide-cdrom
-Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
+kjournald starting. Commit interval 5 seconds
+EXT3-fs: mounted filesystem with ordered data mode.
+VFS: Mounted root (ext3 filesystem) readonly.
...
snip

The second test machine uses ahci. But the symptoms are similar.

I performed a git-bisect on x86/mm. Last commit that worked for me was
2ea3cf43fddecbfd66353caafdf73ec21ea3760b (x86: fix early_ioremap() ISA window)

The subsequent commits for PAT support introduced the problem. I noticed that PAT should be disabled by default, but obviously the patches still have some side-effect. (Maybe ioremap changes lead to the problem?)

Boot-logs are attached:
boot-failing.log for x86/mm as of v2.6.24-rc8-672-ga9f7faa
boot-working.log for x86/mm as of v2.6.24-rc8-621-g2ea3cf4

Hopefully it helps to track down the problem. Maybe someone has an idea why the PAT patches are causing those ominous "PCI interrupt for device ... disabled" messages.

Thanks and regards,
Andreas
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
Sorry. Never mind about e820 map. Somehow I did not notice the boot.log you had attached earlier.

Thanks,
Venki

-----Original Message-----
From: Pallipadi, Venkatesh
Sent: Wednesday, January 16, 2008 11:06 AM
To: 'Andreas Herrmann'
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org
Subject: RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Can you attach the e820 map from the top of your dmesg.

Thanks,
Venki
RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes
-----Original Message-----
From: Andi Kleen [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 16, 2008 2:02 PM
To: Pallipadi, Venkatesh
Cc: Andreas Herrmann; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

>> This ioremap failing seems to be the real problem. This can be due to new
>> tracking of ioremaps introduced by PAT patches. We do not allow conflicting
>> ioremaps to same region. Probably that is happening
>
>Normally if there is a conflict there should be a printk (or at least it was
>so in the original mattr code if you haven't changed it)

Yes. The printks are there, but they are at KERN_DEBUG now. We should change them to at least WARNING.

Thanks,
Venki
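The conflict check discussed above (no two mappings of the same region with different attributes, and a visible warning when one is refused) can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the names demo_reserve, demo_free and the demo_attr values are hypothetical and are not the interfaces of the PAT patches, and the list is a plain unsorted linked list without locking.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical cache-attribute tags, not the kernel's definitions. */
enum demo_attr { DEMO_WB, DEMO_UC, DEMO_WC };

struct demo_range {
	uint64_t start;          /* inclusive */
	uint64_t end;            /* exclusive */
	enum demo_attr attr;
	struct demo_range *next;
};

static struct demo_range *demo_ranges; /* unsorted singly linked list */

/*
 * Record [start, end) with the given attribute.
 * Returns 0 on success, -1 if the range overlaps an existing entry that
 * has a different attribute, or if memory allocation fails.
 */
int demo_reserve(uint64_t start, uint64_t end, enum demo_attr attr)
{
	struct demo_range *r;

	assert(start < end);
	for (r = demo_ranges; r != NULL; r = r->next) {
		if (start < r->end && r->start < end && r->attr != attr) {
			/* Make the refusal visible instead of failing silently. */
			fprintf(stderr, "warning: 0x%" PRIx64 "-0x%" PRIx64
				" conflicts with 0x%" PRIx64 "-0x%" PRIx64 "\n",
				start, end, r->start, r->end);
			return -1;
		}
	}

	r = malloc(sizeof(*r));
	if (r == NULL)
		return -1;
	r->start = start;
	r->end = end;
	r->attr = attr;
	r->next = demo_ranges;
	demo_ranges = r;
	return 0;
}

/* Drop the first entry that matches exactly. Returns 0 if found, -1 if not. */
int demo_free(uint64_t start, uint64_t end, enum demo_attr attr)
{
	struct demo_range **pp;

	for (pp = &demo_ranges; *pp != NULL; pp = &(*pp)->next) {
		struct demo_range *r = *pp;

		if (r->start == start && r->end == end && r->attr == attr) {
			*pp = r->next;
			free(r);
			return 0;
		}
	}
	return -1;
}
```

A linear walk is the simplest structure that gives the behaviour described in the thread; a driver whose second request overlaps an earlier one with a different attribute gets an error, which would match the failing probe in the report if that is indeed what happens there.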
RE: [PATCH -mm 0/3] i386 boot: replace boot_ioremap with enhanced bt_ioremap
>-----Original Message-----
>From: Huang, Ying
>Sent: Tuesday, January 15, 2008 1:49 AM
>To: Ingo Molnar; Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; H. Peter Anvin; Thomas Gleixner; Ingo Molnar; Andi Kleen; linux-kernel@vger.kernel.org
>Subject: Re: [PATCH -mm 0/3] i386 boot: replace boot_ioremap with enhanced bt_ioremap
>
>On Tue, 2008-01-15 at 09:44 +0100, Ingo Molnar wrote:
>> * Huang, Ying <[EMAIL PROTECTED]> wrote:
>>
>> > This patchset replaces boot_ioremap with a enhanced version of
>> > bt_ioremap and renames the bt_ioremap to early_ioremap. This reduces
>> > 12k from .init.data segment and increases the size of memory that can
>> > be re-mapped before paging_init to 64k.
>>
>> in latest x86.git#mm there's an early_ioremap() introduced as part of
>> the PAT series - available on both 32-bit and 64-bit. Could you take a
>> look at it and use that if it's OK for your purposes?
>
>After checking the early_ioremap() implementation in
>arch/x86/kernel/setup_32.c, I found that it is a duplication of
>bt_ioremap() implementation in arch/x86/mm/ioremap_32.c. Both
>implementations use set_fixmap(), so they can be used only after
>paging_init().
>
>The early_ioremap implementation provided in this patchset works as
>follows:
>
>- Enhances bt_ioremap, make it usable before paging_init() via a
>  dedicated PTE page.
>- Rename bt_ioremap to early_ioremap
>
>So I think maybe we should replace the early_ioremap() implementation in
>PAT series with that of this series.

Agreed. PAT can use this for early mappings. Thanks for the patches :)

-Venki
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: Linus Torvalds [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 2:15 PM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>
>On Thu, 10 Jan 2008, Pallipadi, Venkatesh wrote:
>>
>> Yes. I had those pages not mapped at all earlier. The reason I switched
>> to zero page is to continue support cases like:
>> BIOS-e820: - 0009cc00 (usable)
>> BIOS-e820: 0009cc00 - 000a (reserved)
>> BIOS-e820: 000cc000 - 000d (reserved)
>> BIOS-e820: 000e4000 - 0010 (reserved)
>> BIOS-e820: 0010 - cff6 (usable)
>>
>> In this case if some one does a dd of /dev/mem before they can read the
>> contents of usable memory in 0x10-0xcff6 range.
>
>Well, I think that /dev/mem should simply give them the right info. That's
>what people use /dev/mem for - doing things like reading BIOS images etc.
>
>So returning *either* a zero page *or* stopping at the first hole is both
>equally wrong.

I was not fully clear in my earlier email. Mapping /dev/mem would still work with our changes, as mappings go through the proper map interface. It is a dd of /dev/mem, which does a read, that has the problem. I was wondering about apps that use dd.

Thanks,
Venki
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andi Kleen
>Sent: Thursday, January 10, 2008 1:17 PM
>To: Pallipadi, Venkatesh
>Cc: Andi Kleen; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>
>> I think it is unsafe to access any reserved areas through "WB" not just
>> mmio regions. In the above case 0xe000-0xf000 is one such
>> region.
>
>That is 2MB aligned.

That e820 also has a reserved region here at 0x9d000.

BIOS-e820: - 0009cc00 (usable)
BIOS-e820: 0009cc00 - 000a (reserved)
BIOS-e820: 000cc000 - 000d (reserved)

If we keep a mapping for such pages, it will be problematic: if a driver later does an ioremap, we have to go through split-pages and cpa. By not mapping any reserved regions at all, we can avoid cpa for all maps of reserved regions. Reducing the complications at setup will make the code more complicated at ioremap, etc. Most of the holes/reserved areas will be 2M aligned, other than the initial 2M and possibly the 2M around the ACPI region. So, we may end up mapping some of those pages with small pages. Even though it was not enforced until now, I feel that is required for correctness.

>> >
>> >Exactly it's already broken.
>> >
>> >Anyways if someone accesses mmio through /dev/mem I think they definitely
>> >want the real mappings, not a zero page. And dev/mem should provide.
>> >The trick is just to do it without caching attribute violations,
>> >but with mattr it is possible.
>>
>> I don't like /dev/mem supporting access to mmio. We do not know what
>
>But it always did that. I'm sure you'll break stuff if you forbid
>it suddenly.
>
>> attributes to use for these regions. We can potentially map all these
>> pages uncacheable.
>
>That is what current /dev/mem does.

Maybe I am missing something, but I don't think I saw /dev/mem checking whether some region is reserved and mapping those pages as uncacheable. As I thought, it is mostly done because the MTRR has such a setting. If I do a dd of /dev/mem today, which ends up reading all reserved regions, I see one of my systems dying horribly with NMI "dazed and confused" and the other gets SCSI errors etc. I am not sure how some apps can depend on reading mmio regions through /dev/mem. Any particular app you are thinking about?

>> But there may be cases where reading an address can
>> block too possibly?
>
>Yes sure, machine may hang, but that was always the case and I don't
>think it can be changed.
>
>> >> >Anyways you could make that a zillion times more simple by just rounding
>> >> >the e820 areas to 2MB -- for the holes only that should be ok I think;
>> >> >i would expect them to be near always already suitably aligned.
>> >> >
>> >> >In short this can be all done much simpler.
>> >>
>> >> On systems I tested, ACPI regions are typically not 2MB aligned. And on
>> >
>> >ACPI regions don't need to be unmapped.
>> >
>> >> some systems there are few 4k pages of reserved holes just before
>> >
>> >reserved shouldn't be unmapped, just holes. Do they have holes
>> >there or reserved areas?
>> >
>> >I still hope 2MB alignment will work out.
>>
>> E820 above has a combination of reserved and holes.
>> The problem is that we end up depending on specific e820s and platform
>> specific problems/workarounds. This is not a real problem for i386 at
>> all, as we map only < 1G memory there.
>
>First there is the 2GB and in theory 1/3 GB split too which are supported.
>And then in theory someone could put mmio in the first 1GB anyways (e.g.
>in the 1MB hole)
>
>I don't think you can ignore i386 here.

OK. I was thinking that we would have a smaller subset of systems to worry about with x86_64. With the above, yes, we need to worry about i386 as well.

Other than the complicated code, do you see any issues with identity mapping only "usable" and "ACPI" regions as per e820? We can possibly try to simplify the code, if that is the only concern.

Thanks,
Venki
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: Linus Torvalds [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 1:05 PM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>
>On Thu, 10 Jan 2008, [EMAIL PROTECTED] wrote:
>>
>> x86_64: Map only usable memory in identity map. All reserved memory maps to a
>> zero page.
>
>I don't mind this horribly per se, but why a zero page?
>
>Accessing that page without mapping it explicitly would be a bug with
>your change - if only because you'd get the wrong value!
>
>So why map it at all? The only thing mapping it can do is to hide bugs.

Yes. I had those pages not mapped at all earlier. The reason I switched to a zero page is to continue supporting cases like:

BIOS-e820: - 0009cc00 (usable)
BIOS-e820: 0009cc00 - 000a (reserved)
BIOS-e820: 000cc000 - 000d (reserved)
BIOS-e820: 000e4000 - 0010 (reserved)
BIOS-e820: 0010 - cff6 (usable)

In this case, someone doing a dd of /dev/mem could previously read the contents of usable memory in the 0x10-0xcff6 range. But if I do not map reserved regions, dd will stop after the first such hole. Even though this may not be a good usage model, I thought there may be apps depending on such things.

Having said that, I do not like having a dummy zero page there very much. So, if we do not see any regressions due to usages like the above, I will be happy to remove the mapping of reserved regions altogether.

Thanks,
Venki
RE: [patch 11/11] PAT x86: Expose uc and wc interfaces in /sysfsvor pci_mmap_resource
>-----Original Message-----
>From: Greg KH [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:43 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 11/11] PAT x86: Expose uc and wc interfaces in /sysfsvor pci_mmap_resource
>
>On Thu, Jan 10, 2008 at 10:48:51AM -0800, [EMAIL PROTECTED] wrote:
>> New interfaces exported for uc and wc accesses. Apps has to change to use
>> these new interfaces.
>>
>> Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>
>> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
>
>Please update the documentation for this change, as well as adding
>something to Documentation/ABI/ for these new sysfs files.

OK. Will do.

Thanks,
Venki
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: Andi Kleen [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:28 AM
>To: Pallipadi, Venkatesh
>Cc: Andi Kleen; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>
>On Thu, Jan 10, 2008 at 11:17:07AM -0800, Pallipadi, Venkatesh wrote:
>> >I don't think that is needed or makes sense for reserved/ACPI * etc.
>> >Only e820 holes should be truly unmapped because only those should
>> >contain mmio.
>>
>> Do you mean just the regions that are not listed in e820 at all? We
>> should also not map anything marked "RESERVED" in e820. Right?
>
>RESERVED is usually memory used by the BIOS. Properly MMIO areas
>should be in holes.
>
>Of course there might be buggy BIOS who violate that but the
>only way to find out is to check for the case in ioremap and
>warn. I would be still optimistic of it being correct.
>
>Another way would be to double check against the MTRRs - if it's UC then
>it should be unmapped. Maybe that would be a good idea. That should
>catch all true mmio holes unless a BIOS maps them cached but if it does
>that it's already beyond help.

One of the test systems I have has the following E820:

BIOS-e820: - 0009cc00 (usable)
BIOS-e820: 0009cc00 - 000a (reserved)
BIOS-e820: 000cc000 - 000d (reserved)
BIOS-e820: 000e4000 - 0010 (reserved)
BIOS-e820: 0010 - cff6 (usable)
BIOS-e820: cff6 - cff69000 (ACPI data)
BIOS-e820: cff69000 - cff8 (ACPI NVS)
BIOS-e820: cff8 - d000 (reserved)
BIOS-e820: e000 - f000 (reserved)
BIOS-e820: fec0 - fec1 (reserved)
BIOS-e820: fee0 - fee01000 (reserved)
BIOS-e820: ff00 - 0001 (reserved)
BIOS-e820: 0001 - 00013000 (usable)

I think it is unsafe to access any reserved areas through "WB", not just mmio regions. In the above case 0xe000-0xf000 is one such region.

Also, relying on MTRR is like giving more importance to the BIOS writer than required :-). I think the best way to deal with MTRRs is just to not touch them. Leave them as they are and do not assume that they are correct, as frequently they will not be.

>> >> All reserved memory maps to a
>> >> zero page.
>> >
>> >Why zero page? Why not unmap.
>>
>> I had it unmapped first. Then thought of zero mapping for dd of devmem
>> to continue working. May be there are apps that depend on that?
>> Also, dd of devmem seems to be already broken with big memory without
>> any of these changes.
>
>Exactly it's already broken.
>
>Anyways if someone accesses mmio through /dev/mem I think they definitely
>want the real mappings, not a zero page. And dev/mem should provide.
>The trick is just to do it without caching attribute violations,
>but with mattr it is possible.

I don't like /dev/mem supporting access to mmio. We do not know what attributes to use for these regions. We can potentially map all these pages uncacheable. But there may also be cases where reading an address can block?

>> >Anyways you could make that a zillion times more simple by just rounding
>> >the e820 areas to 2MB -- for the holes only that should be ok I think;
>> >i would expect them to be near always already suitably aligned.
>> >
>> >In short this can be all done much simpler.
>>
>> On systems I tested, ACPI regions are typically not 2MB aligned. And on
>
>ACPI regions don't need to be unmapped.
>
>> some systems there are few 4k pages of reserved holes just before
>
>reserved shouldn't be unmapped, just holes. Do they have holes
>there or reserved areas?
>
>I still hope 2MB alignment will work out.

The E820 above has a combination of reserved regions and holes. The problem is that we end up depending on specific e820s and platform-specific problems/workarounds. This is not a real problem for i386 at all, as we map only < 1G memory there. So, it is limited to x86_64 systems, which should be fewer in number.

Thanks,
Venki
RE: [patch 07/11] PAT x86: pat-conflict resolution using linear list
>-----Original Message-----
>From: Andi Kleen [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:13 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 07/11] PAT x86: pat-conflict resolution using linear list
>
>[EMAIL PROTECTED] writes:
>>
>> /* Reset the direct mapping. Can block */
>> -if (p->flags >> 20)
>> -ioremap_change_attr(p->phys_addr, p->size, 0);
>> +if (p->flags >> 20) {
>> +free_mattr(p->phys_addr, p->phys_addr + get_vm_area_size(p),
>> + p->flags>>20);
>> +ioremap_change_attr(p->phys_addr, get_vm_area_size(p), 0);
>
>If you really unmap all holes and forbid (or let it just return the
>__va address) ioremap on anything mapped (which is probably ok) then
>you can eliminate that completely.

We heard X can allocate a page and then map it UC, using it through gart. So, I don't think we can forbid all ioremaps for RAM.

Thanks,
Venki
RE: [patch 09/11] PAT x86: Add ioremap_wc support
>-----Original Message-----
>From: Andi Kleen [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:09 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 09/11] PAT x86: Add ioremap_wc support
>
>[EMAIL PROTECTED] writes:
>> Index: linux-2.6.git/include/asm-generic/iomap.h
>> ===
>> --- linux-2.6.git.orig/include/asm-generic/iomap.h 2008-01-08 03:31:37.0 -0800
>> +++ linux-2.6.git/include/asm-generic/iomap.h 2008-01-08 05:15:56.0 -0800
>> @@ -65,4 +65,8 @@
>>  extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
>>  extern void pci_iounmap(struct pci_dev *dev, void __iomem *);
>>
>> +#ifndef ioremap_wc
>> +#define ioremap_wc ioremap_nocache
>> +#endif
>
>I don't think that's a good idea. Drivers should be able to detect this somehow.
>Handling UC mappings as WC will probably give very poor results.

It is the other way around: ioremap_wc aliases to ioremap_nocache, so a WC request falls back to UC. This was based on earlier feedback from Roland.

>From: Roland Dreier [mailto:[EMAIL PROTECTED]
>I think ioremap_wc() needs to be available on all archs for this to be
>really useful to drivers. It can be a fallback to ioremap_nocache()
>everywhere except 64-bit x86, but it's not nice for every driver that
>wants to use this to need an "#ifdef X86" or whatever.

Thanks,
Venki
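The direction of the fallback in the quoted hunk can be seen in a small stand-alone sketch. Everything prefixed demo_ is hypothetical and only mimics the shape of the macro; it is not kernel code. When no write-combining variant has been defined earlier, the WC name resolves to the uncached routine, so callers can use one name on every architecture without an #ifdef.

```c
#include <assert.h>
#include <stddef.h>

static unsigned char demo_window[16]; /* stands in for a mapped device window */
int demo_nocache_calls;               /* counts uses of the uncached path */

/* Hypothetical stand-in for an uncached mapping routine. */
void *demo_ioremap_nocache(unsigned long phys, size_t size)
{
	(void)phys;
	assert(size <= sizeof(demo_window));
	demo_nocache_calls++;
	return demo_window;
}

/*
 * An architecture with real write-combining support would define
 * demo_ioremap_wc before this point. Everyone else gets the uncached
 * routine under the WC name.
 */
#ifndef demo_ioremap_wc
#define demo_ioremap_wc demo_ioremap_nocache
#endif

/* Driver-style caller: asks for WC and works either way. */
void *demo_map_device(unsigned long phys, size_t size)
{
	return demo_ioremap_wc(phys, size);
}
```

The cost of the fallback is performance only: the caller gets a stricter (uncached) mapping than it asked for, never a weaker one.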
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andi Kleen
>Sent: Thursday, January 10, 2008 11:07 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>
>[EMAIL PROTECTED] writes:
>
>> x86_64: Map only usable memory in identity map.
>
>I don't think that is needed or makes sense for reserved/ACPI * etc.
>Only e820 holes should be truly unmapped because only those should
>contain mmio.

Do you mean just the regions that are not listed in e820 at all? We should also not map anything marked "RESERVED" in e820. Right?

>> All reserved memory maps to a
>> zero page.
>
>Why zero page? Why not unmap.

I had it unmapped first. Then I thought of the zero mapping so that dd of /dev/mem continues working. Maybe there are apps that depend on that? Also, dd of /dev/mem seems to be already broken with big memory without any of these changes.

>Anyways you could make that a zillion times more simple by just rounding
>the e820 areas to 2MB -- for the holes only that should be ok I think;
>i would expect them to be near always already suitably aligned.
>
>In short this can be all done much simpler.

On systems I tested, ACPI regions are typically not 2MB aligned. And on some systems there are a few 4k pages of reserved holes just before 0xa. PCI reserved regions are 2MB aligned, however. I agree that making this 2MB aligned will make this patch a lot simpler. But not all reserved regions seem to be aligned that way.

Thanks,
Venki
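To make the alignment argument above concrete, here is a rough sketch of how many 4 KB and 2 MB pages an exact mapping of a range would need. The function name, the struct and the example addresses used with it are hypothetical and not taken from the patch; it only assumes 4 KB small pages and 2 MB large pages. A range whose ends are both 2 MB aligned needs no small pages, while unaligned ends force 4 KB pages at the edges.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SMALL_PAGE (4ULL << 10) /* 4 KB */
#define DEMO_LARGE_PAGE (2ULL << 20) /* 2 MB */

struct demo_page_count {
	uint64_t small; /* number of 4 KB pages needed */
	uint64_t large; /* number of 2 MB pages needed */
};

static uint64_t demo_round_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

static uint64_t demo_round_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}

/*
 * Count the pages needed to map exactly [start, end), using 2 MB pages
 * wherever a whole aligned 2 MB block fits and 4 KB pages elsewhere.
 * Both ends must be 4 KB aligned. Addresses close to UINT64_MAX are not
 * handled (the rounding would overflow).
 */
struct demo_page_count demo_count_pages(uint64_t start, uint64_t end)
{
	struct demo_page_count c = { 0, 0 };
	uint64_t lo, hi;

	assert(start < end);
	assert(start % DEMO_SMALL_PAGE == 0 && end % DEMO_SMALL_PAGE == 0);

	lo = demo_round_up(start, DEMO_LARGE_PAGE);
	hi = demo_round_down(end, DEMO_LARGE_PAGE);
	if (lo >= hi) {
		/* No aligned 2 MB block fits: small pages only. */
		c.small = (end - start) / DEMO_SMALL_PAGE;
		return c;
	}

	c.small = ((lo - start) + (end - hi)) / DEMO_SMALL_PAGE;
	c.large = (hi - lo) / DEMO_LARGE_PAGE;
	return c;
}
```

Rounding every region to 2 MB, as suggested in the quoted text, would make the small-page count zero everywhere at the price of mapping (or unmapping) slightly more than the e820 map describes.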
RE: [PATCH] Kick CPUS that might be sleeping in cpus_idle_wait
>-----Original Message-----
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Steven Rostedt
>Sent: Thursday, January 10, 2008 6:44 AM
>To: Ingo Molnar
>Cc: LKML; Linus Torvalds; Andrew Morton; Thomas Gleixner; Brown, Len; Pallipadi, Venkatesh; Adam Belay; Peter Zijlstra; Andi Kleen
>Subject: Re: [PATCH] Kick CPUS that might be sleeping in cpus_idle_wait
>
>On Thu, 10 Jan 2008, Ingo Molnar wrote:
>> > FYI, I just hit this hang on 2.6.24-rc6 without any extra patches. So,
>> > unless 2.6.24-rc7 did anything to fix this issue, this is a high
>> > priority bug (IMHO).
>>
>> i'm wondering why this only triggered now. Is this something new in
>> 2.6.24?
>
>It only triggers with the switching of the idle governors. And not just
>one, you need to switch twice. The first loading of a governor does not
>call cpu_idle_wait, but the second one does. NO_HZ must also be enabled,
>plus this needs to happen when no events or threads are scheduled to run
>on a CPU, which limits this to boot up.
>
>Also, this only seems to happen on my 2x2 (4way) and only once in a while.
>
>I'm surprised that I'm the only one so far to report it. I can boot up the
>2.6.23 kernel on this box to see if it also hangs sometimes. But, as I
>said, it may take several hundreds of tries to see it.

With 2.6.23, you can try compiling acpi processor.ko as a module and doing insmod and rmmod in a loop. That should call cpu_idle_wait very frequently.

Thanks,
Venki
RE: [patch 09/11] PAT x86: Add ioremap_wc support
>-----Original Message-----
>From: Andi Kleen [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:09 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 09/11] PAT x86: Add ioremap_wc support
>
>[EMAIL PROTECTED] writes:
>
>> Index: linux-2.6.git/include/asm-generic/iomap.h
>> ===================================================================
>> --- linux-2.6.git.orig/include/asm-generic/iomap.h	2008-01-08 03:31:37.000000000 -0800
>> +++ linux-2.6.git/include/asm-generic/iomap.h	2008-01-08 05:15:56.000000000 -0800
>> @@ -65,4 +65,8 @@
>>  extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
>>  extern void pci_iounmap(struct pci_dev *dev, void __iomem *);
>>
>> +#ifndef ioremap_wc
>> +#define ioremap_wc ioremap_nocache
>> +#endif
>
>I don't think that's a good idea. Drivers should be able to detect this
>somehow. Handling UC mappings as WC will probably give very poor results.

It is the other way around: ioremap_wc falls back to ioremap_nocache, so
a WC request is handled as UC, not the reverse. This was based on earlier
feedback from Roland.

>From: Roland Dreier [mailto:[EMAIL PROTECTED]
>I think ioremap_wc() needs to be available on all archs for this to be
>really useful to drivers. It can be a fallback to ioremap_nocache()
>everywhere except 64-bit x86, but it's not nice for every driver that
>wants to use this to need an #ifdef X86 or whatever.

Thanks,
Venki
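The fallback discussed in this message can be illustrated outside the kernel. What follows is only a userspace sketch, not the patch itself: `ioremap_nocache()` here is a hypothetical stub that records the requested mapping type, and `map_framebuffer()`, `fake_window` and the `REQ_*` names are invented for the illustration. It shows that a driver can call `ioremap_wc()` unconditionally and end up with an uncached mapping on an architecture that does not provide a real WC implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Which kind of mapping the stub was last asked for (illustration only). */
enum map_request { REQ_NONE, REQ_UC, REQ_WC };
static enum map_request last_request = REQ_NONE;

/* Dummy memory standing in for a device window. */
static char fake_window[64];

/*
 * Hypothetical userspace stand-in for the kernel's ioremap_nocache().
 * It maps nothing; it only records that an uncached mapping was requested.
 */
static void *ioremap_nocache(uint64_t phys, size_t size)
{
    (void)phys;
    (void)size;
    last_request = REQ_UC;
    return fake_window;
}

/*
 * An architecture with real write-combining support would define
 * ioremap_wc before this point. Every other architecture gets the
 * uncached fallback, as in the quoted hunk.
 */
#ifndef ioremap_wc
#define ioremap_wc ioremap_nocache
#endif

/* Driver-side code: no per-architecture #ifdef is needed. */
static void *map_framebuffer(uint64_t phys, size_t size)
{
    return ioremap_wc(phys, size);
}
```

With the stub above, `map_framebuffer()` returns a valid pointer and `last_request` ends up as `REQ_UC`, which is the direction of the aliasing described in the reply (WC requests degrade to UC, never UC to WC).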
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: Andi Kleen [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:28 AM
>To: Pallipadi, Venkatesh
>Cc: Andi Kleen; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 02/11] PAT x86: Map only usable memory in
>x86_64 identity map and kernel text
>
>On Thu, Jan 10, 2008 at 11:17:07AM -0800, Pallipadi, Venkatesh wrote:
>>>I don't think that is needed or makes sense for reserved/ACPI * etc.
>>>Only e820 holes should be truly unmapped because only those should
>>>contain mmio.
>>
>> Do you mean just the regions that are not listed in e820 at all? We
>> should also not map anything marked "RESERVED" in e820. Right?
>
>RESERVED is usually memory used by the BIOS. Properly MMIO areas
>should be in holes. Of course there might be buggy BIOS who violate
>that but the only way to find out is to check for the case in ioremap
>and warn. I would be still optimistic of it being correct.
>
>Another way would be to double check against the MTRRs - if it's UC
>then it should be unmapped. Maybe that would be a good idea. That should
>catch all true mmio holes unless a BIOS maps them cached but if it
>does that it's already beyond help.

One of the test systems I have has the following E820 map:

 BIOS-e820: 0000000000000000 - 000000000009cc00 (usable)
 BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
 BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cff60000 (usable)
 BIOS-e820: 00000000cff60000 - 00000000cff69000 (ACPI data)
 BIOS-e820: 00000000cff69000 - 00000000cff80000 (ACPI NVS)
 BIOS-e820: 00000000cff80000 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000130000000 (usable)

I think it is unsafe to access any reserved area through WB, not just
mmio regions. In the above case 0xe0000000-0xf0000000 is one such region.

Also, relying on MTRRs gives more importance to the BIOS writer than
required :-). I think the best way to deal with the MTRRs is to not touch
them: leave them as they are and do not assume that they are correct, as
frequently they will not be.

>>>> All reserved memory maps to a
>>>> zero page.
>>>
>>>Why zero page? Why not unmap.
>>
>> I had it unmapped first. Then thought of zero mapping for dd of devmem
>> to continue working. May be there are apps that depend on that? Also, dd
>> of devmem seems to be already broken with big memory without any of
>> these changes.
>
>Exactly it's already broken.
>
>Anyways if someone accesses mmio through /dev/mem I think they
>definitely want the real mappings, not a zero page. And dev/mem
>should provide. The trick is just to do it without caching attribute
>violations, but with mattr it is possible.

I don't like /dev/mem supporting access to mmio. We do not know what
attributes to use for these regions. We can potentially map all these
pages uncacheable. But there may be cases where even reading an address
can block?

>>>Anyways you could make that a zillion times more simple by just rounding
>>>the e820 areas to 2MB -- for the holes only that should be ok I think;
>>>i would expect them to be near always already suitably aligned.
>>>
>>>In short this can be all done much simpler.
>>
>> On systems I tested, ACPI regions are typically not 2MB aligned. And on
>
>ACPI regions don't need to be unmapped.
>
>> some systems there are few 4k pages of reserved holes just before
>
>reserved shouldn't be unmapped, just holes. Do they have holes there
>or reserved areas?
>
>I still hope 2MB alignment will work out.

The E820 map above has a combination of reserved regions and holes. The
problem is that we end up depending on specific e820 layouts and on
platform-specific problems and workarounds.

This is not a real problem for i386 at all, as we map only 1G of memory
there. So it is limited to x86_64 systems, which should be fewer in
number.

Thanks,
Venki
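The alignment question raised in this message (which non-usable e820 regions begin and end on 2MB boundaries) can be checked mechanically. The following is a minimal userspace sketch, not kernel code: `struct region`, `is_2mb_aligned()`, `count_unaligned_nonusable()` and the `sample_map` values are all assumptions made for illustration, and the table is only loosely modeled on the kind of layout discussed in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LARGE_PAGE_SIZE (2ULL << 20) /* 2MB large page */

enum region_type { REGION_USABLE, REGION_RESERVED, REGION_ACPI };

/* One e820-like entry; 'end' is exclusive. Hypothetical layout. */
struct region {
    uint64_t start;
    uint64_t end;
    enum region_type type;
};

/* True when both edges of the region sit on a 2MB boundary. */
static bool is_2mb_aligned(const struct region *r)
{
    return (r->start % LARGE_PAGE_SIZE) == 0 &&
           (r->end % LARGE_PAGE_SIZE) == 0;
}

/*
 * Count the non-usable regions with at least one edge off a 2MB
 * boundary. Each of these forces 4K granularity somewhere if the
 * region is to be left out of the identity map exactly.
 */
static size_t count_unaligned_nonusable(const struct region *map, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (map[i].type != REGION_USABLE && !is_2mb_aligned(&map[i]))
            count++;
    }
    return count;
}

/* Illustrative values only, not taken from any real machine. */
static const struct region sample_map[] = {
    { 0x0ULL,        0x9cc00ULL,    REGION_USABLE   },
    { 0x9cc00ULL,    0xa0000ULL,    REGION_RESERVED },
    { 0xcff60000ULL, 0xcff69000ULL, REGION_ACPI     },
    { 0xe0000000ULL, 0xf0000000ULL, REGION_RESERVED },
};
```

In this sample, the low reserved region and the ACPI region are not 2MB aligned while the large reserved window is, which mirrors the observation that rounding to 2MB works for some regions but not for all of them.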
RE: [patch 11/11] PAT x86: Expose uc and wc interfaces in /sysfsvor pci_mmap_resource
>-----Original Message-----
>From: Greg KH [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:43 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse;
>[EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 11/11] PAT x86: Expose uc and wc interfaces in
>/sysfsvor pci_mmap_resource
>
>On Thu, Jan 10, 2008 at 10:48:51AM -0800, [EMAIL PROTECTED] wrote:
>> New interfaces exported for uc and wc accesses. Apps has to change to
>> use these new interfaces.
>>
>> Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>
>> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
>
>Please update the documentation for this change, as well as adding
>something to Documentation/ABI/ for these new sysfs files.

OK. Will do.

Thanks,
Venki
RE: [patch 07/11] PAT x86: pat-conflict resolution using linear list
>-----Original Message-----
>From: Andi Kleen [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 11:13 AM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 07/11] PAT x86: pat-conflict resolution using
>linear list
>
>[EMAIL PROTECTED] writes:
>
>>  	/* Reset the direct mapping. Can block */
>> -	if (p->flags >> 20)
>> -		ioremap_change_attr(p->phys_addr, p->size, 0);
>> +	if (p->flags >> 20) {
>> +		free_mattr(p->phys_addr, p->phys_addr + get_vm_area_size(p),
>> +				p->flags>>20);
>> +		ioremap_change_attr(p->phys_addr, get_vm_area_size(p), 0);
>
>If you really unmap all holes and forbid (or let it just return the __va
>address) ioremap on anything mapped (which is probably ok) then you can
>eliminate that completely.

We heard that X can allocate a page and then map it UC, using it through
the GART. So I don't think we can forbid all ioremaps of RAM.

Thanks,
Venki
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: Linus Torvalds [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 1:05 PM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; Barnes, Jesse;
>[EMAIL PROTECTED]; linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 02/11] PAT x86: Map only usable memory in
>x86_64 identity map and kernel text
>
>On Thu, 10 Jan 2008, [EMAIL PROTECTED] wrote:
>>
>> x86_64: Map only usable memory in identity map. All reserved memory
>> maps to a zero page.
>
>I don't mind this horribly per se, but why a zero page?
>
>Accessing that page without mapping it explicitly would be a bug with
>your change - if only because you'd get the wrong value! So why map it
>at all? The only thing mapping it can do is to hide bugs.

Yes. I had those pages not mapped at all earlier. The reason I switched
to a zero page is to continue to support cases like:

 BIOS-e820: 0000000000000000 - 000000000009cc00 (usable)
 BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
 BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cff60000 (usable)

In this case, someone doing a dd of /dev/mem could previously read the
contents of usable memory in the 0x100000-0xcff60000 range. But if I do
not map reserved regions, dd will stop after the first such hole. Even
though this may not be a good usage model, I thought there may be apps
depending on such things.

Having said that, I do not much like having a dummy zero page there. So,
if we do not see any regressions due to usages like the above, I will be
happy to stop mapping reserved regions altogether.

Thanks,
Venki
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Andi Kleen
>Sent: Thursday, January 10, 2008 1:17 PM
>To: Pallipadi, Venkatesh
>Cc: Andi Kleen; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: Re: [patch 02/11] PAT x86: Map only usable memory in
>x86_64 identity map and kernel text
>
>> I think it is unsafe to access any reserved areas through WB not just
>> mmio regions. In the above case 0xe0000000-0xf0000000 is one such region.
>
>That is 2MB aligned.

That e820 also has a reserved region at 0x9d000:

 BIOS-e820: 0000000000000000 - 000000000009cc00 (usable)
 BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)

If we keep a mapping for such pages, it will be problematic: if a driver
later does an ioremap, we have to go through split-pages and cpa. By not
mapping any reserved regions at all, we can avoid cpa for all maps of
reserved regions. Reducing the complications at setup will make the code
more complicated at ioremap, etc.

Most of the holes/reserved areas will be 2M aligned, other than the
initial 2M and possibly the 2M around the ACPI region. So we may end up
mapping some of those pages with small pages. Even though it was not
enforced until now, I feel that is required for correctness.

>>>Exactly it's already broken.
>>>
>>>Anyways if someone accesses mmio through /dev/mem I think they
>>>definitely want the real mappings, not a zero page. And dev/mem
>>>should provide. The trick is just to do it without caching attribute
>>>violations, but with mattr it is possible.
>>
>> I don't like /dev/mem supporting access to mmio. We do not know what
>
>But it always did that. I'm sure you'll break stuff if you forbid
>it suddenly.
>
>> attributes to use for these regions. We can potentially map all these
>> pages uncacheable.
>
>That is what current /dev/mem does.

Maybe I am missing something, but I don't think I saw /dev/mem checking
whether some region is reserved and mapping those pages as uncacheable.
As I thought, it is mostly done because the MTRRs have such a setting.

If I do a dd of /dev/mem today, which ends up reading all reserved
regions, one of my systems dies horribly with "NMI dazed and confused"
and the other gets SCSI errors, etc. I am not sure how any app can depend
on reading mmio regions through /dev/mem. Is there any particular app you
are thinking about?

>> But there may be cases where reading an address can
>> block too possibly?
>
>Yes sure, machine may hang, but that was always the case and I don't
>think it can be changed.
>
>>>>>Anyways you could make that a zillion times more simple by just rounding
>>>>>the e820 areas to 2MB -- for the holes only that should be ok I think;
>>>>>i would expect them to be near always already suitably aligned.
>>>>>
>>>>>In short this can be all done much simpler.
>>>>
>>>> On systems I tested, ACPI regions are typically not 2MB aligned. And on
>>>
>>>ACPI regions don't need to be unmapped.
>>>
>>>> some systems there are few 4k pages of reserved holes just before
>>>
>>>reserved shouldn't be unmapped, just holes. Do they have holes there
>>>or reserved areas?
>>>
>>>I still hope 2MB alignment will work out.
>>
>> E820 above has a combination of reserved and holes. The problem is that
>> we end up depending on specific e820s and platform specific
>> problems/workarounds.
>>
>> This is not a real problem for i386 at all, as we map only 1G memory
>> there.
>
>First there is the 2GB and in theory 1/3 GB split too which are supported.
>And then in theory someone could put mmio in the first 1GB anyways
>(e.g. in the 1MB hole)
>
>I don't think you can ignore i386 here.

OK. I was thinking that we would have a smaller subset of systems to
worry about with x86_64. Given the above, yes, we need to worry about
i386 as well.

Other than the complicated code, do you see any issues with identity
mapping only the usable and ACPI regions as per e820? We can possibly try
to simplify the code, if that is the only concern.

Thanks,
Venki
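The remark that the edges of a usable range may have to be mapped with small pages can be made concrete with a little arithmetic. This is a hedged userspace sketch, not the patch's actual mapping code: `plan_mapping()`, `struct page_plan` and the helper names are hypothetical, and the input range is assumed to be 4K-aligned with `start <= end`. It splits a range into a 4K head, a 2MB-page body, and a 4K tail.

```c
#include <stdint.h>

#define SMALL_PAGE (4ULL << 10) /* 4K page */
#define LARGE_PAGE (2ULL << 20) /* 2MB page */

/* How many mappings of each size a range needs (illustration only). */
struct page_plan {
    uint64_t small_pages; /* 4K mappings at the unaligned edges */
    uint64_t large_pages; /* 2MB mappings for the aligned middle */
};

static uint64_t align_up(uint64_t x, uint64_t a)
{
    return (x + a - 1) / a * a;
}

static uint64_t align_down(uint64_t x, uint64_t a)
{
    return x / a * a;
}

/*
 * Plan the mappings for [start, end). Assumes both bounds are 4K-aligned
 * and start <= end; no overflow handling is attempted.
 */
static struct page_plan plan_mapping(uint64_t start, uint64_t end)
{
    struct page_plan plan = { 0, 0 };
    uint64_t body_start = align_up(start, LARGE_PAGE);
    uint64_t body_end = align_down(end, LARGE_PAGE);

    if (body_start >= body_end) {
        /* No complete 2MB page fits: everything is mapped with 4K pages. */
        plan.small_pages = (end - start) / SMALL_PAGE;
        return plan;
    }

    plan.small_pages = (body_start - start) / SMALL_PAGE +
                       (end - body_end) / SMALL_PAGE;
    plan.large_pages = (body_end - body_start) / LARGE_PAGE;
    return plan;
}
```

For a range such as 0x100000 to 0xcff60000, the sketch yields 608 small pages and 1662 large pages, so only the first and last partial 2MB chunks pay the 4K cost; a range that is already 2MB aligned at both ends needs no small pages at all.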
RE: [patch 02/11] PAT x86: Map only usable memory in x86_64 identity map and kernel text
>-----Original Message-----
>From: Linus Torvalds [mailto:[EMAIL PROTECTED]
>Sent: Thursday, January 10, 2008 2:15 PM
>To: Pallipadi, Venkatesh
>Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Barnes, Jesse; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org; Siddha, Suresh B
>Subject: RE: [patch 02/11] PAT x86: Map only usable memory in
>x86_64 identity map and kernel text
>
>On Thu, 10 Jan 2008, Pallipadi, Venkatesh wrote:
>>
>> Yes. I had those pages not mapped at all earlier. The reason I switched
>> to zero page is to continue support cases like:
>> BIOS-e820: 0000000000000000 - 000000000009cc00 (usable)
>> BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
>> BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
>> BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
>> BIOS-e820: 0000000000100000 - 00000000cff60000 (usable)
>>
>> In this case if some one does a dd of /dev/mem before they can read the
>> contents of usable memory in 0x100000-0xcff60000 range.
>
>Well, I think that /dev/mem should simply give them the right info.
>That's what people use /dev/mem for - doing things like reading BIOS
>images etc.
>
>So returning *either* a zero page *or* stopping at the first hole is
>both equally wrong.

I was not fully clear in my earlier email. Mapping /dev/mem would still
work with our changes, as it goes through the proper map interface. It is
a dd of /dev/mem, which goes through read, that has the problem. I was
wondering about apps that use dd.

Thanks,
Venki
RE: [PATCH] Kick CPUS that might be sleeping in cpus_idle_wait
>-----Original Message-----
>From: Steven Rostedt [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, January 09, 2008 12:42 PM
>To: LKML
>Cc: Linus Torvalds; Andrew Morton; Ingo Molnar; Thomas
>Gleixner; Brown, Len; Pallipadi, Venkatesh; Adam Belay; Peter
>Zijlstra; Andi Kleen
>Subject: [PATCH] Kick CPUS that might be sleeping in cpus_idle_wait
>
>This patch is different than the first patch I sent out.
>This one just sends an IPI to all CPUS that don't check in after 1 sec.
>
>
>Sometimes cpu_idle_wait gets stuck because it might miss CPUS that are
>already in idle, have no tasks waiting to run and have no interrupts
>going to them. This is common on bootup when switching cpu idle
>governors.
>
>This patch gives those CPUS that don't check in an IPI kick.
>

I think your RFC patch is the right solution here. As I see it, there is
no race with your RFC patch. As long as you call a dummy
smp_call_function on all CPUs, we should be OK. With the dummy
smp_call_function we can get rid of cpu_idle_state and the current
wait-forever logic altogether, so there won't be any wait-forever
scenario.

The whole point of cpu_idle_wait() is to make all CPUs come out of the
idle loop at least once. The caller will use cpu_idle_wait something like
this:

// Want to change idle handler

- Switch the global idle handler to the always-present default_idle

- Call cpu_idle_wait so that all CPUs come out of idle for an instant,
  stop using the old idle pointer, and start using default_idle

- Change the idle handler to the new handler

- Optionally call cpu_idle_wait again if you want all CPUs to start
  using the new handler immediately

Maybe the 1s patch below is a safe bet for .24. But for .25, I would say
we just replace all the complicated logic with a simple dummy
smp_call_function and remove cpu_idle_state altogether.

Thanks,
Venki

>Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>
>---
> arch/x86/kernel/process_32.c |   11 +++++++++++
> arch/x86/kernel/process_64.c |   11 +++++++++++
> 2 files changed, 22 insertions(+)
>
>Index: linux-compile-i386.git/arch/x86/kernel/process_32.c
>===================================================================
>--- linux-compile-i386.git.orig/arch/x86/kernel/process_32.c	2008-01-09 14:09:36.000000000 -0500
>+++ linux-compile-i386.git/arch/x86/kernel/process_32.c	2008-01-09 14:09:45.000000000 -0500
>@@ -204,6 +204,10 @@ void cpu_idle(void)
> 	}
> }
>
>+static void do_nothing(void *unused)
>+{
>+}
>+
> void cpu_idle_wait(void)
> {
> 	unsigned int cpu, this_cpu = get_cpu();
>@@ -228,6 +232,13 @@ void cpu_idle_wait(void)
> 				cpu_clear(cpu, map);
> 		}
> 		cpus_and(map, map, cpu_online_map);
>+		/*
>+		 * We waited 1 sec, if a CPU still did not call idle
>+		 * it may be because it is in idle and not waking up
>+		 * because it has nothing to do.
>+		 * Give all the remaining CPUS a kick.
>+		 */
>+		smp_call_function_mask(map, do_nothing, 0, 0);
> 	} while (!cpus_empty(map));
>
> 	set_cpus_allowed(current, tmp);
>Index: linux-compile-i386.git/arch/x86/kernel/process_64.c
>===================================================================
>--- linux-compile-i386.git.orig/arch/x86/kernel/process_64.c	2008-01-09 14:09:36.000000000 -0500
>+++ linux-compile-i386.git/arch/x86/kernel/process_64.c	2008-01-09 15:17:20.000000000 -0500
>@@ -135,6 +135,10 @@ static void poll_idle (void)
> 		cpu_relax();
> }
>
>+static void do_nothing(void *unused)
>+{
>+}
>+
> void cpu_idle_wait(void)
> {
> 	unsigned int cpu, this_cpu = get_cpu();
>@@ -160,6 +164,13 @@ void cpu_idle_wait(void)
> 				cpu_clear(cpu, map);
> 		}
> 		cpus_and(map, map, cpu_online_map);
>+		/*
>+		 * We waited 1 sec, if a CPU still did not call idle
>+		 * it may be because it is in idle and not waking up
>+		 * because it has nothing to do.
>+		 * Give all the remaining CPUS a kick.
>+		 */
>+		smp_call_function_mask(map, do_nothing, 0, 0);
> 	} while (!cpus_empty(map));
>
> 	set_cpus_allowed(current, tmp);
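The handler-switch sequence described in the reply (park on default_idle, kick every CPU, install the new handler, optionally kick again) can be modeled in plain C. This is a userspace simulation under stated assumptions, not kernel code: `smp_call_function_stub()` is a hypothetical stand-in for a cross-CPU call that makes each simulated CPU leave idle once and re-read the global pointer, and `cpu_idle_seen`, `kicks`, `deep_idle` and `switch_idle_handler()` are names invented for the sketch.

```c
#include <stddef.h>

#define NR_CPUS 4

typedef void (*idle_fn)(void);

/* Global idle handler pointer, modeled on the kernel's pm_idle. */
static idle_fn pm_idle;

/* Handler each simulated CPU picked up the last time it left idle. */
static idle_fn cpu_idle_seen[NR_CPUS];

/* Number of simulated IPIs delivered, for illustration. */
static unsigned int kicks;

static void default_idle(void) { }
static void deep_idle(void) { }

/* Dummy cross-CPU function: its only purpose is to wake the target. */
static void do_nothing(void *unused)
{
    (void)unused;
}

/*
 * Hypothetical stand-in for smp_call_function(): every simulated CPU
 * runs fn, which forces it out of idle, and then re-reads pm_idle on
 * its way back into the idle loop.
 */
static void smp_call_function_stub(void (*fn)(void *), void *info)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        fn(info);
        cpu_idle_seen[cpu] = pm_idle;
        kicks++;
    }
}

/* The simplified cpu_idle_wait(): one dummy call, no wait-forever loop. */
static void cpu_idle_wait_sketch(void)
{
    smp_call_function_stub(do_nothing, NULL);
}

/* Caller-side sequence for changing the idle handler. */
static void switch_idle_handler(idle_fn new_handler)
{
    pm_idle = default_idle;  /* park everyone on the always-present handler */
    cpu_idle_wait_sketch();  /* no CPU is inside the old handler any more */
    pm_idle = new_handler;   /* install the new handler */
    cpu_idle_wait_sketch();  /* optional: adopt it immediately */
}
```

After `switch_idle_handler(deep_idle)`, every simulated CPU has observed the new handler and each has been kicked exactly twice. The real kernel has to deal with preemption, CPU hotplug and memory ordering, none of which this model attempts to cover.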
RE: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>-----Original Message-----
>From: Andrew Morton [mailto:[EMAIL PROTECTED]
>Sent: Sunday, January 06, 2008 11:19 PM
>To: Mark Lord
>Cc: Pallipadi, Venkatesh; Arjan van de Ven; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Ingo Molnar; linux-kernel@vger.kernel.org;
>[EMAIL PROTECTED]; [EMAIL PROTECTED]
>Subject: Re: + restore-missing-sysfs-max_cstate-attr.patch
>added to -mm tree
>
>On Sun, 06 Jan 2008 16:34:16 -0500 Mark Lord <[EMAIL PROTECTED]> wrote:
>
>> Venki Pallipadi wrote:
>> > Reintroduce run time configurable max_cstate for !CPU_IDLE case.
>> >
>> Can we get this patch upstream so that a stock 2.6.24 will work here?
>
>umm, OK, I queued it for 2.6.24. I'll give people a day or so to comment
>on this.
>
>I had to invent some silly changelog for it. Please review it for
>accuracy and completeness?
>
>It isn't complete, really. How come we only make max_cstate writeable if
>CONFIG_CPU_IDLE=n? What happens to people who were reliant upon writeable
>max_cstate who now enable CPU_IDLE? Things still break? What is the
>rationale behind this? What constraints led us to this decision?

It is done only for the !CPU_IDLE case to take care of the regression at
hand. The CPU_IDLE case is technically not a regression, as it is a new
config option.

It is not easy to implement this with CPU_IDLE, because with CPU_IDLE the
acpi driver only provides the C-state mechanism and no longer contains
the policy. It can still be done with some hacky code, but I am inclined
to switch this over to the latency interface, which is cleaner than the
max_cstate interface. For example, max_cstate does not mean much to the
user, as BIOSes tend to hide one or more real C-states behind a single
OS-visible C-state (C2 mapped to a real C3, etc.). Saying that I don't
want CPUs to enter any C-state with more than 100us latency is cleaner in
comparison (even though we depend on the latency number coming from the
BIOS).

Mark said this latency interface is not working as expected at the
moment. I will look at that soon, and then we will have an alternate
mechanism for limiting C-states.

I am OK with the changelog below.

Thanks,
Venki

>From: Venki Pallipadi <[EMAIL PROTECTED]>
>
>This was writeable in 2.6.23 but the cpuidle merge made it read-only. But
>some people's scripts (ie: Mark's) were writing to it.
>
>As an unhappy compromise, make max_cstate writeable again if the kernel was
>configured without CONFIG_CPU_IDLE.
>
>Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>
>Cc: Mark Lord <[EMAIL PROTECTED]>
>Cc: Arjan van de Ven <[EMAIL PROTECTED]>
>Cc: Len Brown <[EMAIL PROTECTED]>
>Cc: Ingo Molnar <[EMAIL PROTECTED]>
>Cc: "Rafael J. Wysocki" <[EMAIL PROTECTED]>
>Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
>---
>
> drivers/acpi/processor_idle.c |    4 ++++
> 1 file changed, 4 insertions(+)
>
>diff -puN drivers/acpi/processor_idle.c~reintroduce-run-time-configurable-max_cstate-for-cpu_idle-case drivers/acpi/processor_idle.c
>--- a/drivers/acpi/processor_idle.c~reintroduce-run-time-configurable-max_cstate-for-cpu_idle-case
>+++ a/drivers/acpi/processor_idle.c
>@@ -76,7 +76,11 @@ static void (*pm_idle_save) (void) __rea
> #define PM_TIMER_TICKS_TO_US(p) (((p) * 1000)/(PM_TIMER_FREQUENCY/1000))
>
> static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
>+#ifdef CONFIG_CPU_IDLE
> module_param(max_cstate, uint, 0000);
>+#else
>+module_param(max_cstate, uint, 0644);
>+#endif
> static unsigned int nocst __read_mostly;
> module_param(nocst, uint, 0000);
>
>_
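The contrast drawn in this reply, limiting idle depth by a latency bound rather than by a C-state number, can be sketched as a small selection function. This is an illustration under assumptions, not the cpuidle governor's actual logic: `struct cstate`, `pick_cstate()` and the `sample_states` latencies are hypothetical, and the states are assumed to be ordered from shallowest to deepest with non-decreasing exit latency.

```c
#include <stddef.h>

/* One idle state as the OS sees it; latency values come from firmware. */
struct cstate {
    const char *name;
    unsigned int exit_latency_us;
};

/*
 * Return the index of the deepest state whose exit latency does not
 * exceed the limit. States are assumed sorted shallow to deep with
 * non-decreasing latency. The shallowest state (index 0) is always
 * allowed, so it is returned when nothing else fits.
 */
static size_t pick_cstate(const struct cstate *states, size_t n,
                          unsigned int latency_limit_us)
{
    size_t best = 0;

    for (size_t i = 1; i < n; i++) {
        if (states[i].exit_latency_us <= latency_limit_us)
            best = i;
    }
    return best;
}

/* Illustrative numbers only, not taken from any real platform. */
static const struct cstate sample_states[] = {
    { "C1", 1 },
    { "C2", 20 },
    { "C3", 250 },
};
```

With these sample values, a 100us limit selects the state named C2 regardless of which hardware state the BIOS actually hides behind that name, which is the property the reply argues makes a latency bound more meaningful to users than a max_cstate number.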
RE: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>-----Original Message-----
>From: Mark Lord [mailto:[EMAIL PROTECTED]
>Sent: Friday, January 04, 2008 1:53 PM
>To: Pallipadi, Venkatesh
>Cc: Arjan van de Ven; Andrew Morton; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Ingo Molnar; linux-kernel@vger.kernel.org;
>[EMAIL PROTECTED]; [EMAIL PROTECTED]
>Subject: Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>
>Mark Lord wrote:
>> Venki Pallipadi wrote:
>>> Reintroduce run time configurable max_cstate for !CPU_IDLE case.
>>>
>>> Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>
>>>
>>> Index: linux-2.6.24-rc/drivers/acpi/processor_idle.c
>>> ===================================================================
>>> --- linux-2.6.24-rc.orig/drivers/acpi/processor_idle.c
>>> +++ linux-2.6.24-rc/drivers/acpi/processor_idle.c
>>> @@ -76,7 +76,11 @@ static void (*pm_idle_save) (void) __rea
>>>  #define PM_TIMER_TICKS_TO_US(p) (((p) * 1000)/(PM_TIMER_FREQUENCY/1000))
>>>
>>>  static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
>>> +#ifdef CONFIG_CPU_IDLE
>>>  module_param(max_cstate, uint, 0000);
>>> +#else
>>> +module_param(max_cstate, uint, 0644);
>>> +#endif
>>>  static unsigned int nocst __read_mostly;
>>>  module_param(nocst, uint, 0000);
>>>
>> ..
>>
>> I'll try and re-test with this on Friday.
>..
>
>Okay, with !CONFIG_CPU_IDLE, this works fine -- same as 2.6.23
>and earlier.
>

Good to know. At least we do not have a regression for 2.6.24 now.

>> Meanwhile, can you give a short summary of how behaviour differs
>> between CONFIG_CPU_IDLE and !CONFIG_CPU_IDLE ??
>>
>> I'm not at all clear on how this really affects things.
>

With CPU_IDLE, the C-state policy is removed from the ACPI driver. Ideally,
policy should have nothing to do with ACPI, as ACPI only provides the
C-state mechanisms. So, with CPU_IDLE, it is not easy to control this
variable through an ACPI driver module parameter at run time.

Also, the latency interface that was mentioned before serves the same
purpose in a clearer manner: it is based on the wakeup latency instead of a
C-state number, which may not mean much from the end user's point of view.
I will look soon at why latency does not work on a single-core system (was
that with a UP kernel or an SMP kernel?). That way we will have proper
cover for this with CPU_IDLE in future.

Thanks,
Venki
RE: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>-----Original Message-----
>From: Andrew Morton [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, January 02, 2008 4:52 PM
>To: Pallipadi, Venkatesh
>Cc: Mark Lord; Arjan van de Ven; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Ingo Molnar; linux-kernel@vger.kernel.org;
>[EMAIL PROTECTED]
>Subject: Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>
>On Wed, 2 Jan 2008 16:06:20 -0800 "Pallipadi, Venkatesh"
><[EMAIL PROTECTED]> wrote:
>
>> >-----Original Message-----
>> >From: Mark Lord [mailto:[EMAIL PROTECTED]
>> >Sent: Wednesday, January 02, 2008 3:42 PM
>> >To: Arjan van de Ven
>> >Cc: Pallipadi, Venkatesh; Andrew Morton; [EMAIL PROTECTED];
>> >[EMAIL PROTECTED]; Ingo Molnar; linux-kernel@vger.kernel.org;
>> >[EMAIL PROTECTED]
>> >Subject: Re: + restore-missing-sysfs-max_cstate-attr.patch
>> >added to -mm tree
>> >
>> >Arjan van de Ven wrote:
>> >> On Fri, 30 Nov 2007 22:31:17 -0500
>> >> Mark Lord <[EMAIL PROTECTED]> wrote:
>> >>
>> >>> Arjan van de Ven wrote:
>> >>>> On Fri, 30 Nov 2007 22:14:08 -0500
>> >>>> Mark Lord <[EMAIL PROTECTED]> wrote:
>> >>>>
>> >>>>>> in -mm there is.. the QoS stuff allows you to set maximum
>> >>>>>> tolerable
>> >>>>> ..
>> >>>>>
>> >>>>> That's encouraging, I think, but not for 2.6.24.
>> >>>>>
>> >>>>>> latency. If your app can't take any latency, you should set
>> >>>>>> those... and the side effect is that the kernel will not do
>> >>>>>> long-latency C-states or P-state transitions..
>> >>>>> ..
>> >>>>>
>> >>>>> I don't mind the cpufreq changing (actually, I want it to drop in
>> >>>>> cpufreq to save power and keep the fan off), but the C-states just
>> >>>>> kill this app.
>> >>>>>
>> >>>>> The app is VMware. I force the max_state=1 when launching,
>> >>>> ah but then it's even easier... and can be done in 2.6.24 already.
>> >>>> VMWare after all has a kernel module, and the latency stuff is in
>> >>>> 2.6.23 and 2.6.24 available inside the kernel already.
>> >>> ..
>> >>>
>> >>> Oh, I'm perfectly happy to write my own kernel module if that's what
>> >>
>> >> all you need to do in your kernel module is call
>> >>
>> >> add_latency_constraint("mark_wants_his_mouse", 5);
>> >>
>> >> or so
>> >..
>> >
>> >Dredging up an old regression again now:
>> >
>> >The "make my own module to replace /sys/.../max_cstate" doesn't work
>> >for the single-core machine we use a lot around here.
>> >
>> >VMware is totally sluggish unless I go to another text window
>> >and do this:
>> >
>> >while ( true ); do echo -n ; done
>> >
>> >At which point VMware performs well again,
>> >the same as with "echo 1 > max_cstate" in 2.6.23.
>> >
>> >Anyone got any suggestions on how to fix this regression
>> >or work around it for 2.6.24 ?
>> >
>>
>> Easiest and clean way to do it is to have a driver with
>> set_acceptable_latency() for 1uS or so in init and
>> remove_acceptable_latency() at exit.
>
>err, you appear to be suggesting that Mark patch his kernel to make it work
>as well as 2.6.23?  That would be a wrong answer.
>
>This regression was known six weeks ago.  What do we need to do (or revert)
>to fix it in 2.6.24?
>

As I responded earlier here
http://www.ussg.iu.edu/hypermail/linux/kernel/0711.3/2348.html

This interface cannot be supported cleanly with cpuidle. The cleanest way
to do this is to go through the latency interfaces. We have changed all
in-kernel drivers to use this new interface. The issue here is that I
removed this sysfs interface without deprecating it. We can call it a
regression and add it back for the moment, but it will go away from sysfs
sooner or later, and the latency interface will have to be used in future.
Mark also responded earlier in this thread saying he is OK with adding
something in the kernel to get this working; that is why I suggested the
option above.

As I saw it six weeks back, the max_cstate option works as a boot
parameter. I did not see anyone else (apart from Mark) saying they depend
on this sysfs interface to change max_cstate at run time, and Mark said he
can make do with a kernel change if possible. Please let me know if you
think this interface is a must-fix for .24. I will send a minimal patch to
add it back for .24 for the !CPU_IDLE case.

Thanks,
Venki
RE: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>-----Original Message-----
>From: Mark Lord [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, January 02, 2008 3:42 PM
>To: Arjan van de Ven
>Cc: Pallipadi, Venkatesh; Andrew Morton; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; Ingo Molnar; linux-kernel@vger.kernel.org;
>[EMAIL PROTECTED]
>Subject: Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>
>Arjan van de Ven wrote:
>> On Fri, 30 Nov 2007 22:31:17 -0500
>> Mark Lord <[EMAIL PROTECTED]> wrote:
>>
>>> Arjan van de Ven wrote:
>>>> On Fri, 30 Nov 2007 22:14:08 -0500
>>>> Mark Lord <[EMAIL PROTECTED]> wrote:
>>>>
>>>>>> in -mm there is.. the QoS stuff allows you to set maximum
>>>>>> tolerable
>>>>> ..
>>>>>
>>>>> That's encouraging, I think, but not for 2.6.24.
>>>>>
>>>>>> latency. If your app can't take any latency, you should set
>>>>>> those... and the side effect is that the kernel will not do
>>>>>> long-latency C-states or P-state transitions..
>>>>> ..
>>>>>
>>>>> I don't mind the cpufreq changing (actually, I want it to drop in
>>>>> cpufreq to save power and keep the fan off), but the C-states just
>>>>> kill this app.
>>>>>
>>>>> The app is VMware. I force the max_state=1 when launching,
>>>> ah but then it's even easier... and can be done in 2.6.24 already.
>>>> VMWare after all has a kernel module, and the latency stuff is in
>>>> 2.6.23 and 2.6.24 available inside the kernel already.
>>> ..
>>>
>>> Oh, I'm perfectly happy to write my own kernel module if that's what
>>
>> all you need to do in your kernel module is call
>>
>> add_latency_constraint("mark_wants_his_mouse", 5);
>>
>> or so
>..
>
>Dredging up an old regression again now:
>
>The "make my own module to replace /sys/.../max_cstate" doesn't work
>for the single-core machine we use a lot around here.
>
>VMware is totally sluggish unless I go to another text window
>and do this:
>
>while ( true ); do echo -n ; done
>
>At which point VMware performs well again,
>the same as with "echo 1 > max_cstate" in 2.6.23.
>
>Anyone got any suggestions on how to fix this regression
>or work around it for 2.6.24 ?
>

The easiest and cleanest way to do it is to have a driver that calls
set_acceptable_latency() with 1 us or so in init and
remove_acceptable_latency() at exit.

Thanks,
Venki
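The idea behind a latency interface, where several parties each register the longest wakeup latency they can tolerate and the system honours the tightest one, can be modelled in userspace C. This is a toy sketch under that assumption, not the kernel's implementation: the `toy_*` names and the fixed-size table are hypothetical stand-ins, and the "audio" entry in the usage note is invented.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define TOY_MAX_CONSTRAINTS 8
#define TOY_NO_CONSTRAINT   INT_MAX   /* nobody has asked for a limit */

/* One registered request: "identifier tolerates at most usecs of latency". */
struct toy_constraint {
    const char *id;   /* NULL marks a free slot */
    int usecs;
};

static struct toy_constraint toy_table[TOY_MAX_CONSTRAINTS];

static struct toy_constraint *toy_find(const char *id)
{
    int i;

    for (i = 0; i < TOY_MAX_CONSTRAINTS; i++)
        if (toy_table[i].id != NULL && strcmp(toy_table[i].id, id) == 0)
            return &toy_table[i];
    return NULL;
}

/* Register or update a constraint; returns 0 on success, -1 if the table is full. */
int toy_set_latency(const char *id, int usecs)
{
    struct toy_constraint *c;
    int i;

    assert(id != NULL && usecs >= 0);
    c = toy_find(id);
    if (c != NULL) {
        c->usecs = usecs;
        return 0;
    }
    for (i = 0; i < TOY_MAX_CONSTRAINTS; i++) {
        if (toy_table[i].id == NULL) {
            toy_table[i].id = id;
            toy_table[i].usecs = usecs;
            return 0;
        }
    }
    return -1;
}

/* Drop a constraint; unknown identifiers are ignored. */
void toy_remove_latency(const char *id)
{
    struct toy_constraint *c;

    assert(id != NULL);
    c = toy_find(id);
    if (c != NULL)
        c->id = NULL;
}

/* The effective limit is the smallest registered value. */
int toy_system_latency(void)
{
    int i, min = TOY_NO_CONSTRAINT;

    for (i = 0; i < TOY_MAX_CONSTRAINTS; i++)
        if (toy_table[i].id != NULL && toy_table[i].usecs < min)
            min = toy_table[i].usecs;
    return min;
}
```

In this model, registering "audio" at 500 us and "mark_wants_his_mouse" at 5 us gives an effective limit of 5 us; removing the 5 us entry relaxes it back to 500 us. A driver that sets its constraint in init and removes it at exit, as suggested above, follows the same set/remove pairing.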
RE: [PATCH] x86: export 'leave_mm' symbol
Acked-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

>-----Original Message-----
>From: Miguel Botón [mailto:[EMAIL PROTECTED]
>Sent: Friday, December 28, 2007 7:58 AM
>To: Linux Kernel Mailing List
>Cc: Ingo Molnar; Thomas Gleixner; Pallipadi, Venkatesh
>Subject: [PATCH] x86: export 'leave_mm' symbol
>
>This patch fixes a linking error if CONFIG_ACPI_PROCESSOR=m
>(this error occurs in the 'mm' branch of linux-2.6-x86.git).
>
>We need to export the 'leave_mm' symbol so it can be accessible
>for the ACPI processor module.
>
>Signed-off-by: Miguel Botón <[EMAIL PROTECTED]>
>
>diff --git a/arch/x86/kernel/smp_32.c b/arch/x86/kernel/smp_32.c
>index 596d002..dc0cde9 100644
>--- a/arch/x86/kernel/smp_32.c
>+++ b/arch/x86/kernel/smp_32.c
>@@ -263,6 +263,7 @@ void leave_mm(int cpu)
>         cpu_clear(cpu, per_cpu(cpu_tlbstate, cpu).active_mm->cpu_vm_mask);
>         load_cr3(swapper_pg_dir);
> }
>+EXPORT_SYMBOL(leave_mm);
>
> /*
>  *
>diff --git a/arch/x86/kernel/smp_64.c b/arch/x86/kernel/smp_64.c
>index 1334afe..2fd74b0 100644
>--- a/arch/x86/kernel/smp_64.c
>+++ b/arch/x86/kernel/smp_64.c
>@@ -76,6 +76,7 @@ void leave_mm(int cpu)
>         cpu_clear(cpu, read_pda(active_mm)->cpu_vm_mask);
>         load_cr3(swapper_pg_dir);
> }
>+EXPORT_SYMBOL(leave_mm);
>
> /*
>  *
>
>--
>        Miguel Botón
RE: soft lockup - CPU#1 stuck for 15s! [swapper:0]
>-----Original Message-----
>From: Parag Warudkar [mailto:[EMAIL PROTECTED]
>Sent: Friday, December 07, 2007 2:54 PM
>To: LKML
>Cc: Andrew Morton; Pallipadi, Venkatesh; Linus Torvalds
>Subject: BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
>
>Got this on today's git (2.6.24-rc4) while compiling stuff - Looks
>like it is related to CpuIdle stuff.
>I chose CONFIG_CPU_IDLE for the first time so I don't know when this
>was introduced.
>
>This is on x86_32, SMP.
>
>BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0]
>
>Pid: 0, comm: swapper Not tainted (2.6.24-rc4 #3)
>EIP: 0060:[<c0603e22>] EFLAGS: 0202 CPU: 1
>EIP is at _spin_lock_irqsave+0x16/0x27
>EAX: c06b4110 EBX: 0001 ECX: f7873808 EDX: 0293
>ESI: 0005 EDI: f7873808 EBP: ESP: f7829f10
> DS: 007b ES: 007b FS: 00d8 GS: SS: 0068
>CR0: 8005003b CR2: 004f5960 CR3: 372c5000 CR4: 06d0
>DR0: DR1: DR2: DR3:
>DR6: 0ff0 DR7: 0400
> [<c0438233>] tick_broadcast_oneshot_control+0x10/0xda
> [<c0437c82>] tick_notify+0x1d4/0x2eb
> [<c04281b4>] get_next_timer_interrupt+0x143/0x1b4
> [<c0605819>] notifier_call_chain+0x2a/0x47
> [<c04345a0>] raw_notifier_call_chain+0x17/0x1a
> [<c04378b7>] clockevents_notify+0x19/0x4f
> [<c0533cc3>] acpi_idle_enter_simple+0x183/0x1d0
> [<c058cea3>] cpuidle_idle_call+0x53/0x78
> [<c058ce50>] cpuidle_idle_call+0x0/0x78
> [<c0402575>] cpu_idle+0x97/0xb8
>

Looks like tick_broadcast_lock did not get released in some path. Do you
not see this when CPU_IDLE is not configured?

Thanks,
Venki
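The kind of bug suspected above, a lock that stays held because one return path skips the unlock, can be shown with a toy lock in userspace C. This is a sketch of the general pattern only; it is not the tick broadcast code, and `struct toy_lock`, `control_buggy()` and `control_fixed()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: a real spinlock would spin instead of failing. */
struct toy_lock {
    bool held;
};

static bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void toy_unlock(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Buggy variant: the early return leaves the lock held. */
int control_buggy(struct toy_lock *l, int reason)
{
    if (!toy_trylock(l))
        return -1;      /* a spinning caller would be stuck here forever */
    if (reason < 0)
        return -2;      /* BUG: returns without toy_unlock() */
    toy_unlock(l);
    return 0;
}

/* Fixed variant: every path after the lock is taken goes through "out". */
int control_fixed(struct toy_lock *l, int reason)
{
    int ret = 0;

    if (!toy_trylock(l))
        return -1;
    if (reason < 0) {
        ret = -2;
        goto out;
    }
out:
    toy_unlock(l);
    return ret;
}
```

After `control_buggy()` takes its error path once, every later call fails to get the lock. With a real spinlock the next caller would spin in the lock routine indefinitely, which is consistent with a soft lockup report whose EIP sits in `_spin_lock_irqsave`.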
RE: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>-----Original Message-----
>From: Pavel Machek [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, December 05, 2007 3:17 AM
>To: Pallipadi, Venkatesh
>Cc: Andrew Morton; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
>linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
>Subject: Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>
>Hi!
>
>> >It is not known whether Mark is actually writing to this
>> >thing.  Perhaps
>> >read-only permissions would be a suitable fix?
>> >
>>
>> Exporting it as read only should be OK. We also need to know if there
>> are hard user space dependency on writing to this from userspace.
>
>Some people are manipulating it from their suspend scripts..
>

That is done by default in the kernel now. Deep C-states are disabled
between suspend and resume.

Thanks,
Venki
RE: constant_tsc and TSC unstable
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Paul Rolland
>Sent: Thursday, November 29, 2007 10:53 PM
>To: Pallipadi, Venkatesh
>Cc: Linux Kernel; Siddha, Suresh B; [EMAIL PROTECTED]
>Subject: Re: constant_tsc and TSC unstable
>
>Hello,
>
>On Thu, 29 Nov 2007 15:29:49 -0800
>"Pallipadi, Venkatesh" <[EMAIL PROTECTED]> wrote:
>
>> TSCs on Core 2 Duo are supposed to be in sync unless CPU supports deep idle
>> states like C2, C3. Can you send the full /proc/cpuinfo and full dmesg.
>>
>Sure I can...
>[EMAIL PROTECTED] log]# cat /proc/cpuinfo
>processor       : 0
>vendor_id       : GenuineIntel
>cpu family      : 6
>model           : 15
>model name      : Intel(R) Core(TM)2 CPU T5300 @ 1.73GHz
>stepping        : 2
>cpu MHz         : 800.000
>cache size      : 2048 KB
>physical id     : 0
>siblings        : 2
>core id         : 0
>cpu cores       : 2
>fdiv_bug        : no
>hlt_bug         : no
>f00f_bug        : no
>coma_bug        : no
>fpu             : yes
>fpu_exception   : yes
>cpuid level     : 10
>wp              : yes
>flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
>pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc
>arch_perfmon pebs bts pni monitor ds_cpl est tm2 ssse3 cx16 xtpr lahf_lm
>bogomips        : 3461.13
>clflush size    : 64
>

I tried reproducing this here, but on a similar system (slightly newer CPU
stepping) I don't see it happening.

This error does not matter on this particular system: even if TSC
synchronization passes, the TSC is going to be disabled later due to the C2
and C3 states, with a message like this:

Marking TSC unstable due to: TSC halts in idle.

I will try to reproduce it on other systems to see whether there are any
bugs in the sync routines.

Thanks,
Venki
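The point made above, that a `constant_tsc` flag by itself does not make the TSC usable on a machine that also enters C2 or C3, can be sketched as a small userspace check. This is a simplified model of the reasoning in this message for CPUs of that era, not the kernel's logic: `has_cpu_flag()` and `tsc_likely_usable()` are hypothetical helpers, and the flags string is assumed to be the space-separated list from /proc/cpuinfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if "flag" appears as a whole word in a space-separated flags string. */
bool has_cpu_flag(const char *flags, const char *flag)
{
    const char *p = flags;
    size_t len;

    assert(flags != NULL && flag != NULL);
    len = strlen(flag);
    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[len] == '\0' || p[len] == ' ' ||
                    p[len] == '\t' || p[len] == '\n';

        if (starts && ends)
            return true;
        p += len;   /* skip this partial match, e.g. "tsc" in "constant_tsc" */
    }
    return false;
}

/*
 * Simplified rule from the message: the TSC is only worth keeping if the
 * CPU advertises constant_tsc and never goes deeper than C1.
 */
bool tsc_likely_usable(const char *flags, int deepest_cstate)
{
    return has_cpu_flag(flags, "constant_tsc") && deepest_cstate < 2;
}
```

For the T5300 flags quoted above, `has_cpu_flag()` finds `constant_tsc`, yet `tsc_likely_usable()` returns false once the deepest C-state is 2 or 3, which matches the "TSC halts in idle" outcome described in the reply.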
RE: WARNING: smp_call_function_single() and smp_call_function_mask()
>-Original Message-
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tomas Carnecky
>Sent: Monday, December 03, 2007 12:14 PM
>To: Arjan van de Ven
>Cc: linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
>Subject: Re: WARNING: smp_call_function_single() and smp_call_function_mask()
>
>Arjan van de Ven wrote:
>> On Sun, 02 Dec 2007 09:43:39 +0100
>> Tomas Carnecky <[EMAIL PROTECTED]> wrote:
>>
>>> WARNING: at arch/x86/kernel/smp_64.c:427 smp_call_function_single()
>>> WARNING: at arch/x86/kernel/smp_64.c:397 smp_call_function_mask()
>>>
>>> dmesg and config attached.
>>>
>>> I'm getting about three of each at boot. I'm running:
>>> commit e1cca7e8d484390169777b423a7fe46c7021fec1
>>> Date: Thu Nov 29 16:25:29 2007 -0800
>>> which is the latest git as of yesterday plus a one (unrelated) debug
>>> statement patch in usb uhci.
>>>
>>> There was a similar bug report after 2.6.23-rc8-mm was released.
>>> Though there seems to be a fundamental problem with how people use
>>> smp_call_function*() [1]. And this can just as well be another
>>> incarnation of it.
>>>
>>> Is that easy enough to fix or do I need to bisect (it didn't happen in
>>> 2.6.24-rc3)?
>>>
>>
>> this appears to be a bug in the acpi code, to be exact in
>> processor_throttling.c file, function
>> acpi_processor_set_throttling_ptc(); it disables interrupts and then
>> appears to do a cross-cpu IPI to set the state. Well... we can't do
>> that due to deadlock reasons (you can't do IPI's with interrupts off or
>> you can get a very nice deadlock with the cpu that you IPI trying to
>> do the same thing to you).
>>
>
>I updated the kernel today (to 1a2edea9aff48...) and the warnings are gone.

Yes. This was reported here earlier and fixed by this patch from Yakui.
http://www.ussg.iu.edu/hypermail/linux/kernel/0711.3/1596.html
which should now be merged upstream.

Thanks,
Venki
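A note on the deadlock Arjan describes above. The following is only a toy user-space model in C, not kernel code: the struct `toy_cpu` and the function `toy_cross_call` are hypothetical names made up for illustration. It assumes two CPUs that each send an IPI to the other at the same moment and then spin until the peer acknowledges; an IPI is serviced only while the receiving CPU has interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU taking part in a cross-CPU call (hypothetical). */
struct toy_cpu {
    bool irqs_enabled;  /* can this CPU service an incoming IPI? */
    bool ipi_pending;   /* an IPI from the peer is waiting */
    bool got_ack;       /* the peer has serviced the IPI we sent */
};

/*
 * Both CPUs IPI each other and then spin waiting for the acknowledgement.
 * Returns true if both finish within max_steps, false if they are still
 * spinning at that point (a deadlock in this model).
 */
bool toy_cross_call(bool irqs_enabled, int max_steps)
{
    struct toy_cpu cpu[2] = {
        { irqs_enabled, true, false },
        { irqs_enabled, true, false },
    };

    assert(max_steps > 0);

    for (int step = 0; step < max_steps; step++) {
        for (int i = 0; i < 2; i++) {
            /* A pending IPI is only handled when interrupts are on. */
            if (cpu[i].irqs_enabled && cpu[i].ipi_pending) {
                cpu[i].ipi_pending = false;
                cpu[1 - i].got_ack = true;  /* acknowledge the sender */
            }
        }
        if (cpu[0].got_ack && cpu[1].got_ack)
            return true;
    }
    return false;
}
```

In this model `toy_cross_call(true, 10)` completes on the first step, while `toy_cross_call(false, 10)` never does: each CPU waits for an acknowledgement the other cannot deliver because its interrupts are off.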
RE: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree
>On Fri, 30 Nov 2007 14:06:55 -0800
>"Pallipadi, Venkatesh" <[EMAIL PROTECTED]> wrote:
>
>Please don't go off-list like this. I put Mark's original mailing list cc's
>back.

Sorry for missing some cc's earlier. I blindly did a reply-all to the
mm-commits mail I got.

>> I will have to Nack this. max_cstate was intentionally
>> removed for a couple of reasons:
>
>It broke userspace without any warning or migration period, afaict.

Yes. That's true. I will have to take the blame for that. It has been known
for a while during cpuidle development. But, it was never documented as
deprecated.

>> 1) All in kernel users of max_cstate should rather be using
>> pm_qos/latency interfaces. All such max_cstate usages must already be
>> migrated.
>
>That code isn't merged.

All kernel part is already merged. I mean, there are no drivers that depend
on max_cstate. They use latency_notifier thing today and their migration to
pm_qos part is not merged yet.

>> 2) Supporting max_cstate as a dynamic parameter cleanly is no longer
>> possible in acpi/processor_idle.c as the C-state policy has moved to
>> cpuidle instead. It can be done if it is needed. But, just below patch
>> will not really work with cpuidle.
>>
>> Selecting max_cstate at boot time as a debug option still works without
>> this patch.
>>
>> So, just this patch will not get back the functionality with cpuidle.
>> In fact changing it at run time will have no effect. Question however is:
>> Is there a real need to revive this parameter so that user can change
>> max_cstate at run time?
>
>It is not known whether Mark is actually writing to this thing. Perhaps
>read-only permissions would be a suitable fix?
>

Exporting it as read only should be OK. We also need to know if there is a
hard user space dependency on writing to this from userspace.

Thanks,
Venki
RE: constant_tsc and TSC unstable
>-Original Message-
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Rolland (???・???)
>Sent: Thursday, November 29, 2007 8:12 AM
>To: Linux Kernel
>Cc: [EMAIL PROTECTED]
>Subject: constant_tsc and TSC unstable
>
>Hello,
>
>I've a machine with a Core2Duo CPU. /proc/cpuinfo reports the flag
>constant_tsc, but at boot time, I have the log :
>
>...
>Total of 2 processors activated (6919.15 BogoMIPS).
>ENABLING IO-APIC IRQs
>..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
>checking TSC synchronization [CPU#0 -> CPU#1]:
>Measured 3978592228 cycles TSC warp between CPUs, turning off TSC clock.
>Marking TSC unstable due to: check_tsc_sync_source failed.
>Brought up 2 CPUs
>...
>
>This machine is running 2.6.23.1-21.fc7. I know I should report to Fedora,
>but I was wondering if this is a bug or a feature ;)
>

TSCs on Core 2 Duo are supposed to be in sync unless CPU supports deep idle
states like C2, C3. Can you send the full /proc/cpuinfo and full dmesg.

Thanks,
Venki
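A note on the "TSC warp" figure in the boot log above: a warp is the amount by which a timestamp appears to run backwards compared with the previous reading taken on the other CPU. The sketch below is a simplified user-space illustration in C, not the kernel's routine. It assumes the interleaved readings have already been collected into an array; the function name `max_tsc_warp` and the sample values in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Given timestamps in the order they were observed (alternating between
 * CPUs), return the largest backwards jump between consecutive readings.
 * A result of 0 means no warp was seen in this sample.
 */
uint64_t max_tsc_warp(const uint64_t *readings, size_t n)
{
    uint64_t max_warp = 0;

    assert(readings != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        if (readings[i] < readings[i - 1]) {
            /* Time went backwards: record how far. */
            uint64_t warp = readings[i - 1] - readings[i];
            if (warp > max_warp)
                max_warp = warp;
        }
    }
    return max_warp;
}
```

For the readings 100, 110, 105, 120, 90 the backwards jumps are 5 and 30, so the function returns 30. Any non-zero result corresponds to the situation where the kernel reports a warp and stops trusting the TSC as a clock.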
RE: kondemand: kernel BUG at kernel/workqueue.c:258!
>-Original Message-
>From: Jiri Slaby [mailto:[EMAIL PROTECTED]
>Sent: Thursday, November 29, 2007 1:43 PM
>To: Pallipadi, Venkatesh; Nakajima, Jun
>Cc: Linux kernel mailing list
>Subject: kondemand: kernel BUG at kernel/workqueue.c:258!
>
>Hi,
>
>while trying to evoke another bug by endlessly change governors, this appeared:
>kernel BUG at .../kernel/workqueue.c:258!
>invalid opcode: [1] PREEMPT SMP
>CPU 0
>Modules linked in: iwl3945 mac80211 cfg80211 tun cpufreq_userspace rfcomm
>l2cap hci_usb bluetooth kvm_intel arc4 ecb blkcipher kvm cryptomgr
>crypto_algapi acpi_cpufreq fglrx(P) asus_laptop sr_mod cdrom ehci_hcd
>uhci_hcd battery
>Pid: 443, comm: kondemand/0 Tainted: P 2.6.23 #38
>RIP: 0010:[802484ba] [802484ba] run_workqueue+0xca/0x120
>RSP: :81003ff3bea0 EFLAGS: 00010283
>RAX: 81003fc08200 RBX: 810001e0a300 RCX:
>RDX: 810001e0a300 RSI: 81003ff3bed0 RDI: 81003fc08180
>RBP: 81003fc08180 R08: 81003ff3a000 R09:
>R10: 003a18d529f0 R11: 0001 R12: 810001e0a2f8
>R13: 803c7160 R14: 81003fc08188 R15:
>FS: () GS:805db000() knlGS:
>CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
>CR2: 003a18a9afa0 CR3: 2e431000 CR4: 26e0
>DR0: DR1: DR2:
>DR3: DR6: 0ff0 DR7: 0400
>Process kondemand/0 (pid: 443, threadinfo 81003ff3a000, task 81003fe67560)
>Stack: 81003fc08198 81003fc08180 80249060 81003fc08188
> 80249125 81003fe67560
> 8024cb20 81003ff3bee8 81003ff3bee8 fffc
>Call Trace:
> [80249060] worker_thread+0x0/0x130
> [80249125] worker_thread+0xc5/0x130
> [8024cb20] autoremove_wake_function+0x0/0x30
> [80249060] worker_thread+0x0/0x130
> [80249060] worker_thread+0x0/0x130
> [8024c71b] kthread+0x4b/0x80
> [8020cbb8] child_rip+0xa/0x12
> [8021fff0] flat_send_IPI_mask+0x0/0x70
> [8024c6d0] kthread+0x0/0x80
> [8020cbae] child_rip+0x0/0x12
>
>
>Code: 0f 0b eb fe 66 90 65 48 8b 34 25 00 00 00 00 48 c7 c7 10 98
>RIP [802484ba] run_workqueue+0xca/0x120
> RSP 81003ff3bea0
>

Kernel version?

Thanks,
Venki
RE: ACPI related Warning in 2.6.24-rc3-git2
Yakui,

Can you look at this? It seems to be coming from commit f79f06ab9f86.
FixedHW support tries to read MSR with interrupts disabled.

Thanks,
Venki

>-Original Message-
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rafael J. Wysocki
>Sent: Tuesday, November 27, 2007 7:37 AM
>To: Lukas Hejtmanek
>Cc: linux-kernel@vger.kernel.org; ACPI Devel Maling List; Len Brown; Alexey Starikovskiy
>Subject: Re: ACPI related Warning in 2.6.24-rc3-git2
>
>On Tuesday, 27 of November 2007, Lukas Hejtmanek wrote:
>> Hello,
>>
>> in recent kernel, I got the following warnings while booting. It's ACPI
>> related. Does anybode care? Lenovo ThinkPad T61 (6465CTO).
>
>Appropriate Ccs added.
>
>Did it happen before?
>
>> [ 13.114814] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-git2 #3
>> [ 13.114885]
>> [ 13.114885] Call Trace:
>> [ 13.115020] [80357ab6] acpi_ut_update_ref_count+0x50/0x9d
>> [ 13.115095] [8021e7ad] smp_call_function_single+0xbd/0xd0
>> [ 13.115169] [80331dbc] _rdmsr_on_cpu+0x5c/0x60
>> [ 13.115241] [803631c7] acpi_processor_get_throttling_ptc+0xf3/0x158
>> [ 13.115323] [80362f04] acpi_processor_get_throttling_info+0x460/0x4af
>> [ 13.115406] [80362264] acpi_processor_start+0x54a/0x606
>> [ 13.115478] [802abc38] ifind+0x48/0xd0
>> [ 13.115550] [8035a31e] acpi_start_single_object+0x24/0x46
>> [ 13.115622] [8035b716] acpi_device_probe+0x7d/0x91
>> [ 13.115694] [8038effc] driver_probe_device+0x9c/0x1b0
>> [ 13.115766] [8038f2c9] __driver_attach+0xc9/0xd0
>> [ 13.115840] [8038f200] __driver_attach+0x0/0xd0
>> [ 13.115924] [8038e1dd] bus_for_each_dev+0x4d/0x80
>> [ 13.115994] [8038e64c] bus_add_driver+0xac/0x220
>> [ 13.116080] [8064dd7c] acpi_processor_init+0x8f/0xfc
>> [ 13.116153] [806386f4] kernel_init+0x154/0x330
>> [ 13.116225] [8020d178] child_rip+0xa/0x12
>> [ 13.116295] [806385a0] kernel_init+0x0/0x330
>> [ 13.116365] [8020d16e] child_rip+0x0/0x12
>> [ 13.116435]
>> [ 13.116504] WARNING: at arch/x86/kernel/smp_64.c:397 smp_call_function_mask()
>> [ 13.116577] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-git2 #3
>> [ 13.116648]
>> [ 13.116648] Call Trace:
>> [ 13.116779] [80357ab6] acpi_ut_update_ref_count+0x50/0x9d
>> [ 13.116851] [8021e4af] smp_call_function_mask+0x8f/0xa0
>> [ 13.116923] [80331dbc] _rdmsr_on_cpu+0x5c/0x60
>> [ 13.116994] [803631c7] acpi_processor_get_throttling_ptc+0xf3/0x158
>> [ 13.117077] [80362f04] acpi_processor_get_throttling_info+0x460/0x4af
>> [ 13.117169] [80362264] acpi_processor_start+0x54a/0x606
>> [ 13.117248] [802abc38] ifind+0x48/0xd0
>> [ 13.117330] [8035a31e] acpi_start_single_object+0x24/0x46
>> [ 13.117402] [8035b716] acpi_device_probe+0x7d/0x91
>> [ 13.117488] [8038effc] driver_probe_device+0x9c/0x1b0
>> [ 13.117559] [8038f2c9] __driver_attach+0xc9/0xd0
>> [ 13.117631] [8038f200] __driver_attach+0x0/0xd0
>> [ 13.117715] [8038e1dd] bus_for_each_dev+0x4d/0x80
>> [ 13.117786] [8038e64c] bus_add_driver+0xac/0x220
>> [ 13.117856] [8064dd7c] acpi_processor_init+0x8f/0xfc
>> [ 13.117941] [806386f4] kernel_init+0x154/0x330
>> [ 13.118018] [8020d178] child_rip+0xa/0x12
>> [ 13.118088] [806385a0] kernel_init+0x0/0x330
>> [ 13.118158] [8020d16e] child_rip+0x0/0x12
>> [ 13.118227]
>> [...]
>> [ 13.124714] WARNING: at arch/x86/kernel/smp_64.c:427 smp_call_function_single()
>> [ 13.124798] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-git2 #3
>> [ 13.125460]
>> [ 13.125461] Call Trace:
>> [ 13.125592] [80357ab6] acpi_ut_update_ref_count+0x50/0x9d
>> [ 13.125665] [8021e7ad] smp_call_function_single+0xbd/0xd0
>> [ 13.125737] [80331dbc] _rdmsr_on_cpu+0x5c/0x60
>> [ 13.125807] [803631c7] acpi_processor_get_throttling_ptc+0xf3/0x158
>> [ 13.125903] [80362f04] acpi_processor_get_throttling_info+0x460/0x4af
>> [ 13.125999] [80362264] acpi_processor_start+0x54a/0x606
>> [ 13.126071] [803625ed] acpi_processor_add+0x24/0x6b
>> [ 13.126142] [8035a31e] acpi_start_single_object+0x24/0x46
>> [ 13.126214] [8035b716] acpi_device_probe+0x7d/0x91
>> [ 13.126285] [8038effc] driver_probe_device+0x9c/0x1b0
>> [ 13.126357] [8038f2c9] __driver_attach+0xc9/0xd0
>> [ 13.126441] [8038f200] __driver_attach+0x0/0xd0
>> [ 13.126518] [8038e1dd] bus_for_each_dev+0x4d/0x80
>> [ 13.126600] [8038e64c] bus_add_driver+0xac/0x220
>> [ 13.126670] [8064dd7c] acpi_processor_init+0x8f/0xfc
>> [ 13.126755] [806386f4] kernel_init+0x154/0x330
>> [ 13.126832] [8020d178] child_rip+0xa/0x12
>> [ 13.126916] [806385a0] kernel_init+0x0/0x330
>> [ 13.126986] [8020d16e] child_rip+0x0/0x12
>> [ 13.127059]
>> [ 13.127124] WARNING: at arch/x86/kernel/smp_64.c:397 smp_call_function_mask()
>> [ 13.127197] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-git2 #3
>> [ 13.127267]
>> [ 13.127268] Call Trace:
>> [ 13.127398] [80357ab6] acpi_ut_update_ref_count+0x50/0x9d
>> [ 13.127473] [8021e4af] smp_call_function_mask+0x8f/0xa0
>> [ 13.127545] [80331dbc] _rdmsr_on_cpu+0x5c/0x60
>> [ 13.127616] [803631c7] acpi_processor_get_throttling_ptc+0xf3/0x158
>> [ 13.127699] [80362f04] acpi_processor_get_throttling_info+0x460/0x4af
>> [ 13.127782] [80362264] acpi_processor_start+0x54a/0x606
>> [ 13.127861] [803625ed] acpi_processor_add+0x24/0x6b
>> [ 13.127933] [8035a31e] acpi_start_single_object+0x24/0x46
>> [ 13.128005] []
RE: [PATCH] x86: disable hpet legacy replacement for kexec
Ack.

Thanks,
Venki

>-Original Message-
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of OGAWA Hirofumi
>Sent: Monday, November 26, 2007 1:43 PM
>To: Ingo Molnar
>Cc: Linus Torvalds; linux-kernel@vger.kernel.org; Thomas Gleixner; H. Peter Anvin
>Subject: [PATCH] x86: disable hpet legacy replacement for kexec
>
>Hi,
>
>This seems to introduced after 2.6.23, so if possible, I'd like to fix
>before 2.6.24. What do you think the following?
>
>Thanks.
>--
>OGAWA Hirofumi <[EMAIL PROTECTED]>
>
>
>If HPET was enabled by pci quirks, we use i8253 as initial clockevent
>because pci quirks doesn't run until pci is initialized.
>
>The above means the kernel (or something) is assuming HPET legacy
>replacement is disabled and can use i8253 at boot.
>
>If we used kexec, it isn't true. So, this patch disables HPET legacy
>replacement for kexec in machine_shutdown().
>
>Signed-off-by: OGAWA Hirofumi <[EMAIL PROTECTED]>
>---
>
> arch/x86/kernel/hpet.c      |   14 ++++++++++++++
> arch/x86/kernel/reboot_32.c |    4 ++++
> arch/x86/kernel/reboot_64.c |    4 ++++
> include/asm-x86/hpet.h      |    1 +
> 4 files changed, 23 insertions(+)
>
>diff -puN arch/x86/kernel/hpet.c~kexec-need-to-disable-hpet arch/x86/kernel/hpet.c
>--- linux-2.6/arch/x86/kernel/hpet.c~kexec-need-to-disable-hpet	2007-11-24 09:38:23.0 +0900
>+++ linux-2.6-hirofumi/arch/x86/kernel/hpet.c	2007-11-27 04:57:00.0 +0900
>@@ -446,6 +446,20 @@ static __init int hpet_late_init(void)
> }
> fs_initcall(hpet_late_init);
>
>+void hpet_disable(void)
>+{
>+	if (is_hpet_capable()) {
>+		unsigned long cfg = hpet_readl(HPET_CFG);
>+
>+		if (hpet_legacy_int_enabled) {
>+			cfg &= ~HPET_CFG_LEGACY;
>+			hpet_legacy_int_enabled = 0;
>+		}
>+		cfg &= ~HPET_CFG_ENABLE;
>+		hpet_writel(cfg, HPET_CFG);
>+	}
>+}
>+
> #ifdef CONFIG_HPET_EMULATE_RTC
>
> /* HPET in LegacyReplacement Mode eats up RTC interrupt line. When, HPET
>diff -puN arch/x86/kernel/reboot_32.c~kexec-need-to-disable-hpet arch/x86/kernel/reboot_32.c
>--- linux-2.6/arch/x86/kernel/reboot_32.c~kexec-need-to-disable-hpet	2007-11-24 09:38:23.0 +0900
>+++ linux-2.6-hirofumi/arch/x86/kernel/reboot_32.c	2007-11-27 04:57:50.0 +0900
>@@ -11,6 +11,7 @@
> #include <linux/reboot.h>
> #include <asm/uaccess.h>
> #include <asm/apic.h>
>+#include <asm/hpet.h>
> #include <asm/desc.h>
> #include "mach_reboot.h"
> #include <asm/reboot_fixups.h>
>@@ -326,6 +327,9 @@ static void native_machine_shutdown(void
> #ifdef CONFIG_X86_IO_APIC
> 	disable_IO_APIC();
> #endif
>+#ifdef CONFIG_HPET_TIMER
>+	hpet_disable();
>+#endif
> }
>
> void __attribute__((weak)) mach_reboot_fixups(void)
>diff -puN arch/x86/kernel/reboot_64.c~kexec-need-to-disable-hpet arch/x86/kernel/reboot_64.c
>--- linux-2.6/arch/x86/kernel/reboot_64.c~kexec-need-to-disable-hpet	2007-11-24 09:38:23.0 +0900
>+++ linux-2.6-hirofumi/arch/x86/kernel/reboot_64.c	2007-11-27 04:57:56.0 +0900
>@@ -17,6 +17,7 @@
> #include <asm/pgtable.h>
> #include <asm/tlbflush.h>
> #include <asm/apic.h>
>+#include <asm/hpet.h>
> #include <asm/gart.h>
>
> /*
>@@ -113,6 +114,9 @@ void machine_shutdown(void)
>
> 	disable_IO_APIC();
>
>+#ifdef CONFIG_HPET_TIMER
>+	hpet_disable();
>+#endif
> 	local_irq_restore(flags);
>
> 	pci_iommu_shutdown();
>diff -puN include/asm-x86/hpet.h~kexec-need-to-disable-hpet include/asm-x86/hpet.h
>--- linux-2.6/include/asm-x86/hpet.h~kexec-need-to-disable-hpet	2007-11-24 09:38:23.0 +0900
>+++ linux-2.6-hirofumi/include/asm-x86/hpet.h	2007-11-27 04:54:32.0 +0900
>@@ -61,6 +61,7 @@ extern unsigned long force_hpet_address;
> extern int hpet_force_user;
> extern int is_hpet_enabled(void);
> extern int hpet_enable(void);
>+extern void hpet_disable(void);
> extern unsigned long hpet_readl(unsigned long a);
> extern void force_hpet_resume(void);
>
>_
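A note on the read-modify-write in the patch's hpet_disable(): the sketch below is a user-space model in C of the bit manipulation only, under stated assumptions. The function `toy_hpet_disable_cfg` and the `TOY_` constants are hypothetical names; the bit values 0x001 (enable) and 0x002 (legacy replacement) are assumed here for illustration and are not taken from the patch text. No hardware register is touched.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define TOY_HPET_CFG_ENABLE 0x001u  /* main counter enable */
#define TOY_HPET_CFG_LEGACY 0x002u  /* legacy replacement routing */

/*
 * Compute the configuration value that would be written back:
 * clear the legacy bit only if legacy routing was in use, then
 * always clear the enable bit. All other bits are preserved.
 */
uint32_t toy_hpet_disable_cfg(uint32_t cfg, int legacy_int_enabled)
{
    if (legacy_int_enabled)
        cfg &= ~TOY_HPET_CFG_LEGACY;
    cfg &= ~TOY_HPET_CFG_ENABLE;
    return cfg;
}
```

With both bits set (0x3) and legacy routing in use, the result is 0; without legacy routing the legacy bit is left alone and the result is 0x2. The `&=` matters: a plain assignment of `~FLAG` would set every other bit in the value instead of clearing just one.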
RE: 2.6.23.8, ondemand scaling governor: "BUG: soft lockup detected on CPU#0!"
>-Original Message- >From: Harald Dunkel [mailto:[EMAIL PROTECTED] >Sent: Tuesday, November 20, 2007 12:17 AM >To: Pallipadi, Venkatesh >Cc: Linux Kernel list >Subject: Re: 2.6.23.8, ondemand scaling governor: "BUG: soft >lockup detected on CPU#0!" > >Pallipadi, Venkatesh wrote: >> >> Can you try switching to powersave governor (which should >always run CPU >> at 400MHz) and see whether you see similar error? >> > >Yes, if I move from performance to powersave, then I see a similar >error: > >Nov 20 09:06:48 bugs kernel: BUG: soft lockup detected on CPU#0! >Nov 20 09:06:48 bugs kernel: [] softlockup_tick+0x91/0xa6 >Nov 20 09:06:48 bugs kernel: [] >update_process_times+0x3a/0x5d >Nov 20 09:06:48 bugs kernel: [] tick_sched_timer+0x115/0x164 >Nov 20 09:06:48 bugs kernel: [] >hrtimer_interrupt+0x102/0x191 >Nov 20 09:06:48 bugs kernel: [] timer_interrupt+0x2e/0x34 >Nov 20 09:06:48 bugs kernel: [] handle_IRQ_event+0x1a/0x3f >Nov 20 09:06:48 bugs kernel: [] handle_level_irq+0xa8/0xb7 >Nov 20 09:06:48 bugs kernel: [] do_IRQ+0x53/0x6c >Nov 20 09:06:48 bugs kernel: [] common_interrupt+0x23/0x28 >Nov 20 09:06:48 bugs kernel: === > > This looks like TSC related issue. Ingo's patch commit id a3b13c23f186ecb57204580cc1f2dbe9c284953a http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.g it;a=commit;h=a3b13c23f186ecb57204580cc1f2dbe9c284953a should help. Thanks, Venki - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: 2.6.23.8, ondemand scaling governor: "BUG: soft lockup detected on CPU#0!"
>-Original Message-
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Harald Dunkel
>Sent: Monday, November 19, 2007 4:19 PM
>To: Linux Kernel list
>Subject: 2.6.23.8, ondemand scaling governor: "BUG: soft
>lockup detected on CPU#0!"
>
>Hi folks,
>
>using the ondemand scaling governor I see some error messages
>in kern.log, e.g.:
>
>Nov 20 01:00:46 bugs kernel: BUG: soft lockup detected on CPU#0!
>Nov 20 01:00:46 bugs kernel: [c013cf8d] softlockup_tick+0x91/0xa6
>Nov 20 01:00:46 bugs kernel: [c012269c] update_process_times+0x3a/0x5d
>Nov 20 01:00:46 bugs kernel: [c0131219] tick_sched_timer+0x115/0x164
>Nov 20 01:00:46 bugs kernel: [c012d311] hrtimer_interrupt+0x102/0x191
>Nov 20 01:00:46 bugs kernel: [c0106cd6] timer_interrupt+0x2e/0x34
>Nov 20 01:00:46 bugs kernel: [c013d1f6] handle_IRQ_event+0x1a/0x3f
>Nov 20 01:00:46 bugs kernel: [c013e4e1] handle_level_irq+0xa8/0xb7
>Nov 20 01:00:46 bugs kernel: [c0106367] do_IRQ+0x53/0x6c
>Nov 20 01:00:46 bugs kernel: [c0104853] common_interrupt+0x23/0x28
>Nov 20 01:00:46 bugs kernel: [c011007b] smp_apic_timer_interrupt+0x1a/0x70
>Nov 20 01:00:46 bugs kernel: [c0102a36] default_idle+0x27/0x39
>Nov 20 01:00:46 bugs kernel: [c010234c] cpu_idle+0x46/0x68
>Nov 20 01:00:46 bugs kernel: [c032e9e8] start_kernel+0x24d/0x252
>Nov 20 01:00:46 bugs kernel: [c032e317] unknown_bootoption+0x0/0x196
>Nov 20 01:00:46 bugs kernel: ===
>
>This seems to happen when the load drops below the threshold and
>the ondemand governor changes the CPU from 2GHz to 400MHz. If I use
>the "performance" governor instead, then there is no such message.
>If I set it back to "ondemand", then the message is printed
>immediately.
>
># uname -a
>Linux bugs 2.6.23.8 #1 PREEMPT Sun Nov 18 09:14:13 CET 2007 i686 GNU/Linux
># cat /proc/cpuinfo
>processor       : 0
>vendor_id       : GenuineIntel
>cpu family      : 6
>model           : 13
>model name      : Intel(R) Pentium(R) M processor 2.00GHz
>stepping        : 8
>cpu MHz         : 400.000
>cache size      : 2048 KB
>fdiv_bug        : no
>hlt_bug         : no
>f00f_bug        : no
>coma_bug        : no
>fpu             : yes
>fpu_exception   : yes
>cpuid level     : 2
>wp              : yes
>flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr
>pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe
>nx bts est tm2
>bogomips        : 798.34
>clflush size    : 64
>
>
>Please mail if I can help to track this down.
>

Can you try switching to the powersave governor (which should always run
the CPU at 400MHz) and see whether you see a similar error?

Thanks,
Venki
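The governor switch requested above is normally done through the cpufreq
sysfs interface. The following is only a minimal userspace sketch of that
step, not something posted in this thread: the helper names set_governor()
and get_governor() are hypothetical, and the sysfs path in the note below
is assumed to be the usual per-CPU location on a kernel built with cpufreq
support.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a governor name (e.g. "powersave") to a
 * scaling_governor file. Returns 0 on success, -1 on failure.
 */
int set_governor(const char *path, const char *governor)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "%s\n", governor) > 0;
    /* fclose() flushes the write; a failure here means it did not land. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read the current governor into buf and strip the
 * trailing newline. Returns 0 on success, -1 on failure.
 */
int get_governor(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Run as root, set_governor("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor",
"powersave") would request the switch, and get_governor() on the same path
would confirm which governor is active before checking kern.log for the
soft lockup message.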
RE: 2.6.24-rc1 and 2.6.24.rc2 hangs while running udev on my laptop
>-Original Message-
>From: Andrew Morton [mailto:[EMAIL PROTECTED]
>Sent: Friday, November 09, 2007 2:03 AM
>To: SANGOI DINO LEONARDO
>Cc: linux-kernel@vger.kernel.org; Rafael J. Wysocki; Brown,
>Len; Pallipadi, Venkatesh; [EMAIL PROTECTED]
>Subject: Re: 2.6.24-rc1 and 2.6.24.rc2 hangs while running
>udev on my laptop
>
>
>(cc's added)
>
>On Fri, 9 Nov 2007 09:47:02 +0100 SANGOI DINO LEONARDO
><[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> My laptop (an HP nx6125) doesn't boot with kernels 2.6.24-rc1 and
>> 2.6.24.rc2.
>> It works fine with 2.6.23 and older.
>>
>> I first saw this bug while running fedora rawhide, so you can find
>> hardware info and boot logs at
>> https://bugzilla.redhat.com/show_bug.cgi?id=312201.
>>
>> I did a git bisect, and got this:
>>
>> $ git bisect bad
>> 4f86d3a8e297205780cca027e974fd5f81064780 is first bad commit
>> commit 4f86d3a8e297205780cca027e974fd5f81064780
>> Author: Len Brown <[EMAIL PROTECTED]>
>> Date: Wed Oct 3 18:58:00 2007 -0400
>>
>>     cpuidle: consolidate 2.6.22 cpuidle branch into one patch
>>     [SNIP full commit log]
>>
>
><snip>
>
>>
>> Config is taken from Fedora kernel. CONFIG_CPU_IDLE is set to y (tell me if
>> full config is needed).
>>
>> If I use 'nolapic' parameter, kernel 2.6.24-rc1 boots fine.
>> Setting CONFIG_CPU_IDLE=n also gives me a working kernel.
>>
>> Ask me if more info is needed (please CC me).
>>
>> Thanks,
>>
>> Dino

Dino,

Thanks for all the dumps and information in bugzilla. I am looking at it
to root-cause the failure and I should have more updates later today.
Can you also post the full config you are using with rc1 and rc2?

Thanks,
Venki
RE: [2.6 patch] unexport tick_nohz_get_sleep_length
>-Original Message-
>From: Thomas Gleixner [mailto:[EMAIL PROTECTED]
>Sent: Thursday, October 25, 2007 2:28 AM
>To: Adrian Bunk
>Cc: Brown, Len; LKML; Pallipadi, Venkatesh
>Subject: Re: [2.6 patch] unexport tick_nohz_get_sleep_length
>
>On Wed, 24 Oct 2007, Adrian Bunk wrote:
>
>> This patch removes the unused
>> EXPORT_SYMBOL_GPL(tick_nohz_get_sleep_length).
>>
>> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
>>
>> ---
>> f7c83dfe117f4fd072b2506ae090e4145abda362
>> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>> index 10a1347..5997456 100644
>> --- a/kernel/time/tick-sched.c
>> +++ b/kernel/time/tick-sched.c
>> @@ -320,8 +320,6 @@ ktime_t tick_nohz_get_sleep_length(void)
>>  	return ts->sleep_length;
>>  }
>>
>> -EXPORT_SYMBOL_GPL(tick_nohz_get_sleep_length);
>> -
>
>Hmm, this was added to allow the cpuidle governors modular
>build. Seems this was changed to compiled in only.
>
>Len, Venki, is this the final decision ?
>

Yes. This was done recently for proper fallback to the old ACPI policy in
case CPUIDLE is not selected. With that being a module, falling back to
the ACPI policy at run time makes things ugly.

Thanks,
Venki
RE: nmi_watchdog fix for x86_64 to be more like i386
>-Original Message-
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of
>Thomas Gleixner
>Sent: Monday, October 01, 2007 11:19 PM
>To: Andi Kleen
>Cc: Arjan van de Ven; David Bahi; LKML;
>[EMAIL PROTECTED]; Andrew Morton; Ingo Molnar;
>Gregory Haskins
>Subject: Re: nmi_watchdog fix for x86_64 to be more like i386
>
>>
>> The only workaround for chipsets ignoring IRQ affinity would be to keep
>> track on which CPU irq 0 happens and then restart APIC timer interrupts
>> on the others (or send IPIs) as needed. But that would be fairly ugly.
>
>The clock events code does handle this already. The broadcast interrupt
>can come in on any cpu. It's just the nmi watchdog which would be affected
>by that.
>

We can probably work around this by keeping track of the IRQ0 count at the
per-CPU level and using the local APIC timer + this per-CPU counter in the
NMI. Or just increment the local APIC timer count in IRQ0 with nohz enabled.

Thanks,
Venki
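The per-CPU counter idea in the reply above can be illustrated with a small
userspace model. This is a sketch under stated assumptions, not kernel code
and not a patch from this thread: struct cpu_wd, note_apic_timer(),
note_irq0(), nmi_check(), NR_CPUS_SKETCH and STALL_LIMIT are all hypothetical
names, and the threshold of five stalled checks is arbitrary. It only shows
the logic of treating either a local APIC timer tick or an IRQ0 on the same
CPU as evidence of progress.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 4   /* hypothetical CPU count for the model */
#define STALL_LIMIT    5   /* arbitrary: NMI checks without progress */

/* Hypothetical per-CPU watchdog state. */
struct cpu_wd {
    unsigned long apic_timer_irqs; /* local APIC timer interrupts seen */
    unsigned long irq0_irqs;       /* IRQ0 (broadcast timer) seen on this CPU */
    unsigned long last_sum;        /* sum observed at the last progress */
    unsigned int  stalled_checks;  /* consecutive NMI checks without progress */
};

static struct cpu_wd wd[NR_CPUS_SKETCH];

/* Called from the (modelled) local APIC timer interrupt. */
void note_apic_timer(int cpu)
{
    wd[cpu].apic_timer_irqs++;
}

/* Called from the (modelled) IRQ0 handler on whichever CPU received it. */
void note_irq0(int cpu)
{
    wd[cpu].irq0_irqs++;
}

/*
 * Called from the (modelled) NMI watchdog tick. Returns true when the CPU
 * has shown no timer activity of either kind for STALL_LIMIT checks.
 */
bool nmi_check(int cpu)
{
    struct cpu_wd *w = &wd[cpu];
    unsigned long sum = w->apic_timer_irqs + w->irq0_irqs;

    if (sum != w->last_sum) {
        /* Either counter moved, so the CPU is making progress. */
        w->last_sum = sum;
        w->stalled_checks = 0;
        return false;
    }
    return ++w->stalled_checks >= STALL_LIMIT;
}
```

With only the local APIC timer counter, a CPU that is woken by the broadcast
IRQ0 while its local timer is stopped would look stalled; summing both
counters is one way to avoid that false positive in this model.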
RE: cpu hotplug support broken in 2.6.23-rc3
>-Original Message-
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of
>Thomas Gleixner
>Sent: Friday, September 14, 2007 5:51 AM
>To: Pavel Machek
>Cc: Rafael J. Wysocki; Jeff Chua; [EMAIL PROTECTED];
>[EMAIL PROTECTED]; [EMAIL PROTECTED]; kernel list; Len Brown
>Subject: Re: cpu hotplug support broken in 2.6.23-rc3
>
>Pavel,
>
>On Fri, 2007-09-14 at 14:38 +0200, Pavel Machek wrote:
>> > I have an yet untested fix, which preserves the broadcast state across
>> > the offline state, but Len is looking into it as well, whether we can
>> > just reevaluate the power states (and the broadcast flags) when a cpu
>> > becomes online again. If Len can do that easily for 2.6.23, I'd prefer
>> > that.
>>
>> Is there a patch you want me to test? Or does Len have anything to
>> play with?
>
>Venki sent me an initial patch, but it has issues with the notify
>ordering. Find below my "cache the broadcast flags" version
>for testing.
>

While writing that patch, I knew the solution could not be that simple :(.
Does the patch work for the online/offline case at least? I will look at
the suspend/resume ordering part in that case.

Thanks,
Venki
RE: [patch] enable userspace cpu core voltage control with acpi-cpufreq
>-Original Message-
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED] On Behalf Of Dave Jones
>Sent: Monday, September 03, 2007 8:25 AM
>To: Andi Kleen
>Cc: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org
>Subject: Re: [patch] enable userspace cpu core voltage control
>with acpi-cpufreq
>
>On Mon, Sep 03, 2007 at 12:56:13PM +0200, Andi Kleen wrote:
> > <[EMAIL PROTECTED]> writes:
> >
> > > i want to make a patch known that provides a userspace
> > > interface to control the core voltage of a computer processor(s).
> >
> > That would be essentially linux supported undervolting which
> > for stability is as bad as overclocking. The problem is that
> > such games tend to generate weird kernel crashes and then
> > chew up development issues when kernel hackers have to chase
> > ghost bugs. I don't think we should support it. Developer
> > time is too precious.
>
>Seconded. Exactly the same reasons I've refused to merge patches
>into cpufreq to allow arbitrary tables to override BIOS tables.
>Or patches to remove boundary checks. Even when correctly
>implemented, this stuff can be fragile as hell, so introducing
>more things that cast doubt over its stability isn't something
>I'm keen on at all.
>

And acpi-cpufreq does not seem to be the place to be doing this. I would
say it should be a new driver or go into speedstep-centrino, which has
similar user-defined frequency and voltage values. Ideally it would be a
standalone driver, so that users do not get confused by a single driver
working in different modes, and distributors can choose not to ship it in
cases where they do not like the driver tainting the kernel.

Thanks,
Venki