Re: PATCH: Assume PM Timer to be reliable on broken board/BIOS
On 7/27/05, Robert Hancock <[EMAIL PROTECTED]> wrote:
> > In a nutshell, sometimes, the PIT/TSC timer runs 3x too fast [1]. That
> > causes many issues, including DMA errors, MCE, and clock running way too
> > fast (making the laptop unusable for any software development). So far,
> > no BIOS update was able to fix the issue for me.
>
> Shouldn't this be looked into further rather than adding this
> workaround? Surely Windows is using the PIT as well, so there must be
> some way to get it to behave properly..

Sorry for the late follow up.

Well, the timer management in Windows depends on the HAL used. By default,
the ACPI HAL is used on this laptop.

I re-installed Windows XP, forcing the "Standard PC" HAL during
installation, and without ACPI Windows exhibits the exact same problem as
Linux or any other system: the clock runs 3 times too fast in Windows too...

So my guess is that the ACPI HAL in Windows does more or less the same
thing as my patch (updated, available here:
http://www.xfce.org/~olivier/r3000): it calibrates the PIT timer based on
the ACPI (PM) timer.

Cheers,
Olivier.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Time Flies (Twice as Fast)
Kurt,

Did you try the "no_timer_check" boot option?

HTH
Olivier.

On Thu, 2005-07-28 at 22:03 -0400, Kurt Wall wrote:
> Hola,
>
> I have an eMachines T6212 Opteron system on which the system clock
> seems to run at ~twice the speed of the wall clock. The main board
> is an ASUS K8 of some description with an ATI SB400 southbridge and
> an ATI RS480 northbridge. Kernel version is 2.6.12.3.
>
> If I disable ACPI, the clock slows down to what seems to be the proper
> speed, but then my NIC doesn't work, presumably because it shares
> an interrupt with something else.
>
> I've tried booting with clock=tsc and clock=pit to no effect. Based
> on my review of the list archives, there appear to be issues with
> the chipset, but I haven't been able to sort out what the real problem
> is and the appropriate solution.
>
> There's an ACPI error that seems potentially troublesome:
>
> ACPI: Subsystem revision 20050309
> ACPI-0352: *** Error: Looking up [\_SB_.PCI0.LPC0.LNK0] in namespace,
> AE_NOT_FOUND
> search_node 81001fec9440 start_node 81001fec9440 return_node
>
> I also see this message from the PCI subsystem:
>
> PCI: Ignoring BAR0-3 of IDE controller :00:14.1
>
> As a starting point, I've attached lspci output and the boot log. I'm
> willing to provide more information and try patches and such.
>
> Thanks.
>
> Kurt
Re: PATCH: Assume PM Timer to be reliable on broken board/BIOS
On Tue, 2005-07-26 at 17:34 -0600, Robert Hancock wrote:
> > In a nutshell, sometimes, the PIT/TSC timer runs 3x too fast [1]. That
> > causes many issues, including DMA errors, MCE, and clock running way too
> > fast (making the laptop unusable for any software development). So far,
> > no BIOS update was able to fix the issue for me.
>
> Shouldn't this be looked into further rather than adding this
> workaround? Surely Windows is using the PIT as well, so there must be
> some way to get it to behave properly..

Surely, but I've been desperately trying to find the cause without success
for months.

My first idea was that the BIOS doesn't set the CPU voltage properly at
boot, so I made up a patch that sets the right fid/vid before any
calibration, but that didn't help.

The BIOS is wrong (i.e. the BIOS reports 1/3 of the actual CPU speed), and
memtest86+, which doesn't use ACPI at all, reports the wrong time too, so
it's definitely not a Linux bug.

My guess is that Windows reinitializes some register, but it's hard to tell.

Cheers,
Olivier.
PATCH: Assume PM Timer to be reliable on broken board/BIOS
Hi all,

Background
==========

I have a laptop (Compaq R3480EA, AMD 64 3400+ with NForce3) and reported
multiple problems related to timer issues.

In a nutshell, sometimes, the PIT/TSC timer runs 3x too fast [1]. That
causes many issues, including DMA errors, MCE, and clock running way too
fast (making the laptop unusable for any software development). So far,
no BIOS update was able to fix the issue for me.

As I first reported to the LKML back in March [2], the only reliable time
source on this laptop seems to be the PM timer. However, the time in Linux
is tick based and forcing the PM timer doesn't help. Also, since the PIT
timer is used to calibrate the lpj (loops per jiffy), the wrong lpj was
causing the nasty errors I had with DMA and the MCEs. Although the lpj can
be forced at boot, having it right in the first place, even on hardware as
broken as my laptop, can save novice users quite a lot of time and
investigation.

Many similar reports can be found on the web for the Compaq R3000 and
HP zv5000 laptops, with either 64 or 32 bit CPUs [3]. Similar bug reports
with no fix can also be found in the SuSE and Red Hat bugzilla databases.

What the patch does
===================

Basically, the patch adjusts the values passed to the PIT/TSC based on the
PM timer rate. The PM timer is compared to the TSC/PIT rate and a
multiplier is computed. On a "normal" system, the ratio is 1. On my broken
laptop, the ratio is 3.

That ratio is then applied to all values passed to the PIT timer. For
example, instead of using:

	outb_p(LATCH & 0xff, PIT_CH0);
	outb(LATCH >> 8, PIT_CH0);

The patch uses:

	outb_p((LATCH * timer_mult) & 0xff, PIT_CH0);
	outb((LATCH * timer_mult) >> 8, PIT_CH0);

Also, the ratio is computed/used only if the user has specified the
"clock=pmtmr" boot option on i386 or "pmtmr" on x86_64.

If the user has not explicitly asked for the PM timer to be used, and if
there is a delta of more than 5% between the PM timer and the PIT, then
the PM timer is not used (just as in the current implementation for the
i386 arch).

What is included in the patch
=============================

The patch includes the code that implements the workaround described above
for the x86_64 and i386 archs. The patch applies to Linux 2.6.12.3.
Documentation is also updated.

==========

Please let me know if there are some fixes or improvements to add and
whether such a patch could be suitable for the kernel.

As a side note, this patch is very useful for me as it makes the laptop
usable under Linux, and I plan to keep it available somewhere on xfce.org
so that other Compaq R3000 and HP zv5000 owners can use it.

Ref.

[1] http://kerneltrap.org/mailarchive/1/message/43741/thread
[2] http://lkml.org/lkml/2005/3/29/265
[3] http://lists.pcxperience.com/pipermail/linuxr3000/2004-September/003678.html
    http://lists.pcxperience.com/pipermail/linuxr3000/2004-September/003788.html
    http://lists.pcxperience.com/pipermail/linuxr3000/2005-July/006763.html
    http://lists.pcxperience.com/pipermail/linuxr3000/2005-January/004650.html

Thanks,
Regards,
Olivier.

diff -Naur linux-2.6.12.3/arch/i386/kernel/time.c linux-2.6.12.3-pmtimer/arch/i386/kernel/time.c
--- linux-2.6.12.3/arch/i386/kernel/time.c 2005-06-17 21:48:29.0 +0200
+++ linux-2.6.12.3-pmtimer/arch/i386/kernel/time.c 2005-07-26 22:30:52.0 +0200
@@ -77,6 +77,12 @@
 EXPORT_SYMBOL(jiffies_64);
 
+/*
+ * timer_mult is a multiplier used to work around some very buggy BIOS
+ * or hardware where the PIT/TSC timer runs n times too fast.
+ */
+u16 timer_mult = 1;
+
 unsigned long cpu_khz;	/* Detected as we calibrate the TSC */
 
 extern unsigned long wall_jiffies;
diff -Naur linux-2.6.12.3/arch/i386/kernel/timers/timer_cyclone.c linux-2.6.12.3-pmtimer/arch/i386/kernel/timers/timer_cyclone.c
--- linux-2.6.12.3/arch/i386/kernel/timers/timer_cyclone.c 2005-06-17 21:48:29.0 +0200
+++ linux-2.6.12.3-pmtimer/arch/i386/kernel/timers/timer_cyclone.c 2005-07-26 22:52:24.0 +0200
@@ -21,6 +21,12 @@
 extern spinlock_t i8253_lock;
 
+/*
+ * timer_mult is a multiplier used to work around some very buggy BIOS
+ * or hardware where the PIT/TSC timer runs n times too fast.
+ */
+extern u16 timer_mult;
+
 /* Number of usecs that the last interrupt was delayed */
 static int delay_at_last_interrupt;
 
@@ -70,8 +76,8 @@
 	 */
 	if (count > LATCH) {
 		outb_p(0x34, PIT_MODE);
-		outb_p(LATCH & 0xff, PIT_CH0);
-		outb(LATCH >> 8, PIT_CH0);
+		outb_p((LATCH * timer_mult) & 0xff, PIT_CH0);
+		outb((LATCH * timer_mult) >> 8, PIT_CH0);
 		count = LATCH - 1;
 	}
 	spin_unlock(&i8253_lock);
diff -Naur linux-2.6.12.3/arch/i386/kernel/timers/timer_pit.c linux-2.6.12.3-pmtimer/arch/i386/kernel/timers/timer_pit.c
--- linux-2.6.12.3/arch/i386/kernel/timers/timer_pit.c 2005-06-17 21:48:29.0 +0200
+++ linux-2.6.12.3-pmtimer/arch/i386/kernel/timers/timer_pit.c
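
The arithmetic behind the multiplier can be illustrated with a small user-space sketch. This is not the code from the patch: the helper names (compute_timer_mult, latch_low_byte, latch_high_byte), the round-to-nearest rule and HZ = 1000 are assumptions made for the example. PIT_HZ is the nominal 1193182 Hz PIT input clock.

```c
#include <stdint.h>

#define PIT_HZ 1193182u                  /* nominal PIT input clock */
#define HZ     1000u                     /* assumed kernel tick rate */
#define LATCH  ((PIT_HZ + HZ / 2) / HZ)  /* PIT reload value for one tick */

/*
 * Hypothetical helper: ratio between the PM timer ticks expected over a
 * PIT-timed interval and the ticks actually counted, rounded to nearest.
 * Falls back to 1 (no correction) when nothing was counted.
 */
static uint16_t compute_timer_mult(uint32_t expected_pm_ticks,
                                   uint32_t measured_pm_ticks)
{
    uint32_t mult;

    if (measured_pm_ticks == 0)
        return 1;
    mult = (expected_pm_ticks + measured_pm_ticks / 2) / measured_pm_ticks;
    return mult == 0 ? 1 : (uint16_t)mult;
}

/* Low and high bytes of the scaled reload value, as written to PIT_CH0. */
static uint8_t latch_low_byte(uint16_t mult)
{
    return (uint8_t)((LATCH * mult) & 0xff);
}

static uint8_t latch_high_byte(uint16_t mult)
{
    return (uint8_t)((LATCH * mult) >> 8);
}
```

With these assumptions, a PM timer that should count about 35795 ticks in 10 ms but only counts about 11932 gives a multiplier of 3, and the reload value becomes 3 * 1193 = 3579 (0x0DFB), so the PIT is asked to count three times as many input clocks per tick.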
Is it possible to "reset" the processor to a sane state at boot?
Hi,

Sorry if this post sounds a bit off topic now.

It seems I've narrowed down the issue with the timer running too fast on
my AMD 64 based Compaq laptop.

As said previously, after a cold restart, the system runs 3x too fast. The
processor speed as reported by both the Linux kernel and memtest86 is
266MHz while the lowest speed is actually 800MHz (1).

Even the BIOS shows the problem: instead of reporting the correct 800MHz
speed for the CPU (as it normally does when the system is fine), it shows
"???MHz" at boot. So it's probably a hardware or a BIOS issue (or both).

What puzzles me is that this doesn't make any difference for WinXP.
Everything works just fine in WinXP (2).

So I wonder, is there a way to "reset" the processor to a sane state?

If such a workaround is doable, could someone point me to where I should
look?

Thanks in advance
Olivier

(1) memtest86 uses "rdtsc" to compute cpu speed.
(2) The laptop came preloaded with WinXP and it runs fine with it, so I
guess that from a "support" point of view, the system is fine.
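
The 266MHz figure is consistent with a 3x-fast reference timer, which the following sketch tries to show. It assumes the TSC is calibrated against an interval timed by the PIT (the post only says memtest86 uses rdtsc); the function names and the 30 ms interval are made up for the example.

```c
#include <stdint.h>

/*
 * Cycles a CPU running at cpu_mhz really executes during
 * real_interval_us microseconds (1 MHz == 1 cycle per microsecond).
 */
static uint64_t tsc_cycles_elapsed(uint32_t cpu_mhz, uint32_t real_interval_us)
{
    return (uint64_t)cpu_mhz * real_interval_us;
}

/*
 * Speed estimate in MHz: cycles counted divided by the interval the
 * calibration code believes has passed (must be non-zero).
 */
static uint32_t tsc_mhz_estimate(uint64_t tsc_cycles, uint32_t assumed_interval_us)
{
    return (uint32_t)(tsc_cycles / assumed_interval_us);
}
```

If the calibration code waits for what it believes is 30000 us but the reference timer runs 3x too fast, only 10000 us really pass: an 800MHz CPU executes 8,000,000 cycles, and 8,000,000 / 30000 gives 266MHz.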
Re: Clock 3x too fast on AMD64 laptop [WAS Re: Various issues after rebooting]
Hi John, Dominik,

On Tue, 2005-03-29 at 14:11 -0800, john stultz wrote:
> Yea. From your description this is most likely the cause of the issue.
> Currently the time of day is still tick-based, using the tsc/pmtmr/hpet
> only for interpolating between ticks.

Sorry for the late follow up.

Unfortunately, a quick hack to disable the "pmtmr" check shows that even
when "trusting" the PM-Timer, the clock and interrupts still run 3x too
fast. It makes no difference.

> Well, if you tried the time of day re-work I've been working on it would
> mask the issue somewhat, but you'd still have the problem that you are
> taking too many timer interrupts.

Where could I get that patch from? I'd be glad to do some testing for you
if you need it.

> One thing you could try is playing with the CLOCK_TICK_RATE value to see
> if you just have very unique hardware.

The problem is that the issue shows up exactly after one quick power
off/power on sequence. It doesn't show up after a real cold start (leaving
the laptop off for a couple of hours) or even after a reboot.

> A similar sounding issue has also been reported here:
> http://bugme.osdl.org/show_bug.cgi?id=3927

Not sure if that's the exact same problem. What I can say, after reading
that bug report, is that disabling ACPI and/or APIC makes no difference.
Specifying clock=... makes no difference either. It doesn't seem related
to the AMD64 part of the kernel since it shows up equally with a 64bit
kernel and a 32bit kernel.

Moreover, when that bug shows up, other problems appear as well (such as
the cdrom not being able to mount anything, or ndiswrapper crashing the
system with an MCE error).

At first, I thought the issue might be related to the nforce3, but the bug
refers to an ATI chipset so I guess it's not related to the nforce.

Anyway, it doesn't seem to be an uncommon issue with AMD64 based hardware.
I don't know where to start from though.

Cheers,
Olivier.
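
Why a tick-based clock follows the timer interrupt rate, whatever is used to interpolate between ticks, can be shown with a rough sketch. HZ = 1000 and both function names are assumptions for the example, not kernel code.

```c
#include <stdint.h>

#define HZ 1000u  /* assumed number of timer ticks per second */

/* Timer interrupts delivered in real_seconds when the PIT fires pit_speedup times too often. */
static uint64_t ticks_delivered(uint32_t real_seconds, uint32_t pit_speedup)
{
    return (uint64_t)real_seconds * HZ * pit_speedup;
}

/* Seconds a purely tick-based clock reports for that many ticks. */
static uint64_t tick_based_seconds(uint64_t ticks)
{
    return ticks / HZ;
}
```

After 60 real seconds with a 3x-fast PIT, 180000 ticks have been counted and the tick-based clock has advanced by 180 seconds. Trusting the PM timer for interpolation alone does not change the number of ticks, which matches the observation above.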
Re: Clock 3x too fast on AMD64 laptop [WAS Re: Various issues after rebooting]
Hi,

A quick look at the source shows that the error is triggered in
arch/i386/kernel/timers/timer_pm.c by the verify_pmtmr_rate() function.

My guess is that the pmtmr timer is right and the pit is wrong in my case.
That would explain why the clock is wrong when based on the pit (as when
forced with "clock=pit").

Maybe, if I can prove my guess, a fix could be to "trust" the pmtmr clock
when the user has passed a "clock=pmtmr" argument? Does that make any
sense?

TIA
Olivier.

On Tue, 2005-03-29 at 23:28 +0200, Olivier Fourdan wrote:
> Hi all
>
> Following my own thread, I found the following error in dmesg:
>
> PM-Timer running at invalid rate: 33% of normal - aborting.
>
> I found that interesting because 33% is 1/3 and the clock runs exactly
> 3x faster than normal...
>
> A bit of searching on google gave me several links to posts from other
> people with the exact same problem on similar hardware (AMD64 laptop)
> but I could find neither the cause nor the fix of that issue (as I
> think it might be related to the other issues I observe when the clock
> goes too fast)
>
> Does that PM-Timer message make sense to someone knowledgeable?
>
> Thanks in advance,
>
> Cheers,
> Olivier.
>
> On Mon, 2005-03-28 at 21:39 +0200, Willy Tarreau wrote:
> > On Mon, Mar 28, 2005 at 09:30:26PM +0200, Olivier Fourdan wrote:
> > > Hi Willy
> > >
> > > On Mon, 2005-03-28 at 21:20 +0200, Willy Tarreau wrote:
> > > > Now I have a compaq (nc8000) which does not exhibit such buggy
> > > > behaviour, but you can try disabling the APIC too just in case
> > > > it's a similar problem (at least in 32 bits, I don't know if you
> > > > can disable it in 64 bits mode).
> > >
> > > Thanks for the hint, but unfortunately, it's one of the first things I
> > > tried, and that makes no difference.
> >
> > Sorry, at first I only noticed ACPI in your mail, but after reading it
> > again, I also noticed APIC. So now, you can only try not to initialize
> > some peripherals (IDE, network, display, etc...) by removing their drivers
> > from the kernel. You may end up with a kernel panic, but that does not
> > matter if you boot it with "panic=5" so that it automatically reboots
> > 5 seconds after the panic. You should then finally identify the subsystem
> > which is responsible for your problems. Perhaps you'll even need to remove
> > PCI support :-(
> >
> > Regards,
> > Willy
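
The "33% of normal" figure fits the guess that the PM timer is right and the PIT is wrong. The sketch below is not the kernel's verify_pmtmr_rate() code: the function names and the 10 ms interval are assumptions. PMTMR_HZ is the 3.579545 MHz ACPI PM timer frequency, and the tolerance mirrors the 5% delta mentioned elsewhere in this thread.

```c
#include <stdint.h>

#define PMTMR_HZ 3579545u  /* ACPI PM timer frequency in Hz */

/* PM timer ticks that elapse during interval_us real microseconds. */
static uint32_t pm_ticks_in(uint32_t interval_us)
{
    return (uint32_t)(((uint64_t)PMTMR_HZ * interval_us) / 1000000u);
}

/* Measured count as a percentage of the expected count (expected must be non-zero). */
static uint32_t pm_rate_percent(uint32_t measured, uint32_t expected)
{
    return (uint32_t)((100ull * measured) / expected);
}

/* 1 when the measured count is within +/-5% of the expected one, 0 otherwise. */
static int pm_rate_is_plausible(uint32_t measured, uint32_t expected)
{
    uint64_t low = (uint64_t)expected * 19 / 20;
    uint64_t high = (uint64_t)expected * 21 / 20;

    return measured >= low && measured <= high;
}
```

If the check waits for a PIT interval it believes to be 10000 us, a correct PM timer should advance by pm_ticks_in(10000) = 35795 ticks. With a PIT running 3x too fast, only about 3333 us really pass, the PM timer advances by 11930 ticks, and pm_rate_percent(11930, 35795) is 33: a correct PM timer looks like it runs at a third of its rate because the reference interval is too short.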
Clock 3x too fast on AMD64 laptop [WAS Re: Various issues after rebooting]
Hi all

Following my own thread, I found the following error in dmesg:

PM-Timer running at invalid rate: 33% of normal - aborting.

I found that interesting because 33% is 1/3 and the clock runs exactly
3x faster than normal...

A bit of searching on google gave me several links to posts from other
people with the exact same problem on similar hardware (AMD64 laptop),
but I could find neither the cause nor the fix of that issue (I think it
might be related to the other issues I observe when the clock goes too
fast).

Does that PM-Timer message make sense to someone knowledgeable?

Thanks in advance,

Cheers,
Olivier.

On Mon, 2005-03-28 at 21:39 +0200, Willy Tarreau wrote:
> On Mon, Mar 28, 2005 at 09:30:26PM +0200, Olivier Fourdan wrote:
> > Hi Willy
> >
> > On Mon, 2005-03-28 at 21:20 +0200, Willy Tarreau wrote:
> > > Now I have a compaq (nc8000) which does not exhibit such buggy behaviour,
> > > but you can try disabling the APIC too just in case it's a similar problem
> > > (at least in 32 bits, I don't know if you can disable it in 64 bits mode).
> >
> > Thanks for the hint, but unfortunately, it's one of the first things I
> > tried, and that makes no difference.
>
> Sorry, at first I only noticed ACPI in your mail, but after reading it
> again, I also noticed APIC. So now, you can only try not to initialize
> some peripherals (IDE, network, display, etc...) by removing their drivers
> from the kernel. You may end up with a kernel panic, but that does not
> matter if you boot it with "panic=5" so that it automatically reboots
> 5 seconds after the panic. You should then finally identify the subsystem
> which is responsible for your problems. Perhaps you'll even need to remove
> PCI support :-(
>
> Regards,
> Willy
Clock 3x too fast on AMD64 laptop [WAS Re: Various issues after rebooting]
Hi all Following my own thread, I found the following error in dmesg: PM-Timer running at invalid rate: 33% of normal - aborting. I found that interesting because 33% is 1/3 and the clock runs exactly 3x faster than normal... A bit of search on google gave me several links to posts from other people with the exact same problem on similar hardware (AMD64 laptop) but I couldn't find neither the cause nor the fix of that issue (as I think it might be related to the other issues I observe when the clock goes too fast) Does that PM-Timer message makes sense to someone knowledgeable? Thanks in advance, Cheers, Olivier. On Mon, 2005-03-28 at 21:39 +0200, Willy Tarreau wrote: On Mon, Mar 28, 2005 at 09:30:26PM +0200, Olivier Fourdan wrote: Hi Willy On Mon, 2005-03-28 at 21:20 +0200, Willy Tarreau wrote: Now I have a compaq (nc8000) which does not exhibit such buggy behaviour, but you can try disabling the APIC too just in case it's a similar problem (at least in 32 bits, I don't know if you can disable it in 64 bits mode). Thanks for the hint, but unfortunately, it's one of the first things I tried, and that makes no difference. Sorry, at first I only noticed ACPI in your mail, but after reading it again, I also noticed APIC. So now, you can only try not to initialize some peripherals (IDE, network, display, etc...) by removing their drivers from the kernel. You may end up with a kernel panic, but that does not matter is you boot it with panic=5 so that it automatically reboots 5 seconds after the panic. You should then finally identify the subsystem which is responsible for your problems. Perhaps you'll even need to remove PCI support :-( Regards, Willy - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Clock 3x too fast on AMD64 laptop [WAS Re: Various issues after rebooting]
Hi,

A quick look at the source shows that the error is triggered in arch/i386/kernel/timers/timer_pm.c by the verify_pmtmr_rate() function.

My guess is that the pmtmr timer is right and the PIT is wrong in my case. That would explain why the clock is wrong when it is based on the PIT (as when forced with clock=pit).

Maybe, if I can prove my guess, a fix could be to trust the pmtmr clock when the user has passed a clock=pmtmr argument?

Does that make any sense?

TIA
Olivier.

On Tue, 2005-03-29 at 23:28 +0200, Olivier Fourdan wrote:
> Hi all
>
> Following my own thread, I found the following error in dmesg:
>
>   PM-Timer running at invalid rate: 33% of normal - aborting.
>
> I found that interesting because 33% is 1/3 and the clock runs exactly
> 3x faster than normal...
>
> A bit of searching on Google gave me several links to posts from other
> people with the exact same problem on similar hardware (AMD64 laptop),
> but I could find neither the cause nor a fix for the issue (I think it
> might be related to the other issues I observe when the clock runs too
> fast).
>
> Does that PM-Timer message make sense to someone knowledgeable?
>
> Thanks in advance,
> Cheers,
> Olivier.
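The proposal above (trust the PM timer when the user explicitly asks for it) can be sketched as a small decision function. This is only an illustration of the idea, not the kernel's code and not the actual patch: the name `accept_pmtmr`, the `user_forced` flag, and the 5% tolerance are all assumptions made for the example.

```python
def accept_pmtmr(percent_of_normal, user_forced=False, tolerance=5):
    """Decide whether to use the PM timer as the clock source.

    percent_of_normal: PM-timer rate measured against the PIT, in percent
    user_forced:       hypothetical flag, True if booted with clock=pmtmr
    tolerance:         accepted deviation in percent (assumed value)
    """
    if abs(percent_of_normal - 100) <= tolerance:
        return True
    # Proposed behaviour, not the kernel's: if the user explicitly asked for
    # the PM timer, assume the PIT is the broken reference and keep pmtmr.
    return user_forced


print(accept_pmtmr(33))                    # rate check fails, pmtmr rejected
print(accept_pmtmr(33, user_forced=True))  # user override, pmtmr kept
```

The trade-off is that an override like this would also keep a PM timer that really is broken, which is why the sketch applies it only when the user has asked for that clock source.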
Re: Various issues after rebooting
Hi Willy,

On Mon, 2005-03-28 at 21:39 +0200, Willy Tarreau wrote:
> Sorry, at first I only noticed ACPI in your mail, but after reading it
> again, I also noticed APIC. So now, you can only try not to initialize
> some peripherals (IDE, network, display, etc...) by removing their drivers
> from the kernel. You may end up with a kernel panic, but that does not
> matter if you boot it with "panic=5" so that it automatically reboots
> 5 seconds after the panic. You should then finally identify the subsystem
> which is responsible for your problems. Perhaps you'll even need to remove
> PCI support :-(

Well, actually, the system runs (at least) unless I try to load "ndiswrapper", which leads to a kernel panic. I tried to bring the issue to the ndiswrapper ML, but I doubt that ndiswrapper is at fault.

I can reliably predict the crash. If the clock (and all other time-based events) runs too fast, then modprobing ndiswrapper will crash the system, just as mounting a CD-ROM will fail.

I think the clock speed and the other effects are just symptoms, not the cause of the problem. What I'd like to determine is what would need to be done to avoid the root cause, or whether there is anything that can be done in Linux to avoid it.

I just tried "acpi_fake_ecdt", but that leads to an immediate kernel panic.

PS: Given the crash (Machine check exception), the sleep option seems to have no effect.

Thanks,
Olivier.
Re: Various issues after rebooting
Hi Willy,

On Mon, 2005-03-28 at 21:20 +0200, Willy Tarreau wrote:
> Now I have a compaq (nc8000) which does not exhibit such buggy behaviour,
> but you can try disabling the APIC too just in case it's a similar problem
> (at least in 32 bits, I don't know if you can disable it in 64 bits mode).

Thanks for the hint, but unfortunately, it's one of the first things I tried, and that makes no difference.

Regards,
Olivier.
Various issues after rebooting
Hi all,

I'm facing various odd issues with an AMD64-based laptop (Compaq R3480EA) I bought recently.

On first boot, everything is all right. The laptop runs flawlessly. But if I shut down the laptop and restart it, I can see all kinds of strange things happening:

1) the system clock runs 3 times faster,
2) the system is unable to mount CD-ROMs,
3) modprobing ndiswrapper causes a whole-system freeze with the following message:

  CPU 0: Machine Check Exception: 0004
  Bank 4: b2070f0f
  Kernel panic - not syncing: CPU context corrupt

I've tried various kernels and distributions in 32-bit and 64-bit modes, but that makes no difference. I also tried disabling ACPI, setting clock=[tsc|pmtmr|pit], disabling APIC, etc. No luck.

No matter how many reboots I do, the problem remains. The only way to fix the problem is to keep the laptop off for a couple of hours.

I thought of a hardware issue, but in WinXP, everything is fine. And in the case of a hardware issue, I guess the problem would always show, not just in Linux after a reboot.

My guess is that the BIOS doesn't re-initialize the hardware correctly in case of a quick shutdown/reboot, but WinXP might be initializing things by itself (it's a guess, I'm probably completely wrong).

Does that make any sense to someone? How could I help track down this issue?

Thanks in advance,
Best regards,
Olivier.