Re: Re: Re: fast clock wreaks havoc on amd64 dual core - hp1250n
Too late to answer Dave now. He passed away. -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Re: Re: fast clock wreaks havoc on amd64 dual core - hp1250n
So, let me jump in line here as another interested party -- I've got the fast clock problem on an HP zv6130us laptop (Athlon 3200+) running Debian sarge AMD64. I've seen the bugzilla.kernel.org report on this (bug#3927), and I'd love to know when and how a fix for this might make it into the Debian release stream. Anyone?

Thanks,
--dave
Re: fast clock wreaks havoc on amd64 dual core - hp1250n
On Sat, Oct 22, 2005 at 09:07:47PM -0400, Nathan O. Siemers wrote:

Unfortunately 2 hours is not enough time to say this is a stable configuration (24 hours is better), but encouraging. I have hit the system with high cpu, video, disk, and ieee load to test but sometimes the clock starts going haywire again after several more hours... Thanks to all that responded so far.

Try booting with disable_timer_pin_1. This option was introduced in 2.6.14-rc as a workaround for buggy ATI chipsets.

Best regards
Frederik Schueler

--
ENOSIG
Re: fast clock wreaks havoc on amd64 dual core - hp1250n
Frederik,

Thanks much. I will test this.

With the exception of some disquieting kernel logs:

Oct 23 07:00:38 line kernel: APIC error on CPU0: 40(40)
Oct 23 07:00:38 line kernel: APIC error on CPU1: 40(40)
Oct 23 07:07:19 line kernel: APIC error on CPU0: 40(40)
Oct 23 07:07:19 line kernel: APIC error on CPU1: 40(40)
(etc)

the system I have written about earlier (new HP a1250n, dual-core Athlon 64, ATI motherboard) is up and running well over days of use. I am using a boot configuration (grub):

title Debian GNU/Linux, kernel 2.6.14-rc5
root (hd0,0)
kernel /boot/vmlinuz-2.6.14-rc5ns1-rc5 root=/dev/sda1 ro notsc no_timer_check
initrd /boot/initrd.img-2.6.14-rc5ns1-rc5
boot

Questions:

Will disable_timer_pin_1 eliminate the need for the no_timer_check and notsc options? Sounds to me like it will.

How far back is this option? Do I have to stay at rc5? Although I have to say that so far I have had no other problems with this relatively bleeding-edge release.

P.S. If anyone on the kernel list is reading, I am willing to attempt some testing on this machine if it can help you. For some reason I did not get access to lkml after I applied.

I imagine a lot of folks will be buying this hardware - it is a truly awesome computing environment if you are not out to get a graphics-intensive gaming machine.

nathan
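A quick way to confirm which of the boot options discussed in this thread actually reached the kernel is to inspect the kernel command line, which Linux exposes in /proc/cmdline. The following is only a minimal sketch in Python: the function name `timer_flags` and the list of options are illustrative assumptions, limited to options that posters in this thread mention.

```python
# Options mentioned by posters in this thread; illustrative, not exhaustive.
THREAD_FLAGS = ("notsc", "no_timer_check", "noapic", "noapictimer",
                "disable_timer_pin_1")


def timer_flags(cmdline):
    """Return the thread's timer-related options found in a kernel command line."""
    # An option may carry a value (e.g. no_timer_check=0), so compare
    # only the part before the first '='.
    names = {token.split("=", 1)[0] for token in cmdline.split()}
    return [flag for flag in THREAD_FLAGS if flag in names]


# The kernel line from the grub entry quoted above:
print(timer_flags("root=/dev/sda1 ro notsc no_timer_check"))
```

On a running system, the same function could be fed the contents of /proc/cmdline instead of a literal string.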
Re: fast clock wreaks havoc on amd64 dual core - hp1250n
On Saturday, 22 October 2005, 03:52, Nathan O. Siemers wrote:

Hello all,

I've spent the last 5 days trying to fix an issue with a brand new hp a1250n dual core athlon64 machine. ATI motherboard with embedded radeon xpress 200 graphics. I've installed the pure64 sarge distribution. This is my first 64 bit debian attempt, although I am running a 2 cpu opteron workstation with suse at work.

The system is in many ways okay, but there is a serious problem with interrupts and clock speed which wreaks general havoc on the machine. The clock is running about 2x speed - I think perhaps two clock ticks (from each core?) are happening for each one that should. X windows keyboard behavior is quite erratic, I often get 2-4 chars repeated for each key typed. I believe this is consistent with lots of interrupt activity?

Summary of my experiments so far (2.6.13.4 kernel):

1. Turning off smp in kernel compile configuration does not correct the problem.
2. no_timer_check and/or notsc does not reliably correct the problem - I have seen some help for periods of time.
3. Moving from athlon64 to generic x86_64 during kernel compile does nothing.
4. The no_timer_check pci=noacpi pci=routeirq kernel boot options correct the 2x clock speed problem, but break a lot of other things - I am running this at the moment so I can use the computer (but my firewire drive is not recognized, for example).
5. The PM_timer kernel compile option does nothing.
6. Changing timer frequency does nothing.

I wanted to check with older 2.6 kernels but experience a failed boot on the stock debian 2.6.8 amd64-smp kernel. I don't think this is indicative of a problem other than misconfiguration of grub or devfs subsystems (there is a pivot_root at boot time that fails)...

Some interesting log entries (kern.log):

Oct 18 14:57:48 localhost kernel: Losing some ticks... checking if CPU frequency changed.
Oct 18 23:36:50 line kernel: Your time source seems to be instable or some driver is hogging interupts
Oct 19 05:40:05 line kernel: rtc: lost some interrupts at 2048Hz.
Oct 19 05:49:33 line kernel: rtc: lost some interrupts at 2048Hz.

This seems like it could be related to kernel bug 3927:

http://bugzilla.kernel.org/show_bug.cgi?id=3927

which has been marked as resolved, but my reading suggests that a sufficient number of people found workarounds to let the bug subside rather than fixing it...

In any case, my deep appreciation to anyone who has a solution after days of kernel recompiles and rebooting with various boot options. Happy to send more detailed logs and kernel compile options if there is interest.

Nathan

Just a little workaround. You need an smp numa kernel. Compile it with 500 Hz.

Markus
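When comparing boot options over hours or days, it can help to count how often the timer-related kernel messages quoted in this thread show up in kern.log. The sketch below is an assumption-laden illustration rather than anything the posters used: the labels and the function name `count_timer_messages` are made up here, and the patterns are just substrings copied from the log lines quoted in the thread (including the kernel's own spelling of "instable").

```python
import re
from collections import Counter

# Substrings copied from the kern.log entries quoted in this thread.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "lost_ticks": re.compile(r"Losing some ticks"),
    "unstable_time_source": re.compile(r"time source seems to be instable"),
    "rtc_lost_interrupts": re.compile(r"rtc: lost some interrupts"),
    "apic_error": re.compile(r"APIC error on CPU\d+"),
}


def count_timer_messages(lines):
    """Count occurrences of each timer-related message in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


sample = [
    "Oct 18 14:57:48 localhost kernel: Losing some ticks... checking if CPU frequency changed.",
    "Oct 19 05:40:05 line kernel: rtc: lost some interrupts at 2048Hz.",
    "Oct 19 05:49:33 line kernel: rtc: lost some interrupts at 2048Hz.",
]
print(dict(count_timer_messages(sample)))
```

To scan a real log, pass an open file object (for example the system's kern.log, wherever it lives on that installation) in place of `sample`.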
Re: fast clock wreaks havoc on amd64 dual core - hp1250n
On Fri, Oct 21, 2005 at 09:52:05PM -0400, Nathan O. Siemers wrote:

Hello all, I've spent the last 5 days trying to fix an issue with a brand new hp a1250n dual core athlon64 machine. ATI motherboard with embedded radeon xpress 200 graphics. [...] The system is in many ways okay, but there is a serious problem with interrupts and clock speed which wreaks general havoc on the machine. The clock is running about 2x speed - I think perhaps two clock ticks (from each core?) are happening for each one that should. [...]

The double-clock rate is a problem I also encountered on my ATI X200 chipset laptop, MSI S270. See http://starynkevitch.net/Basile/msi_s270_linux.html

A workaround is to boot with the noapic flag.

Regards.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basile(at)starynkevitch(dot)net
8, rue de la Faïencerie, 92340 Bourg La Reine, France
Re: fast clock wreaks havoc on amd64 dual core - hp1250n
Nathan,

I can empathise. I have a two week old HP NX6125 notebook that also suffers from the same problem. I think that you have identified the bug correctly as #3927. I've just checked bugzilla.kernel.org/show_bug.cgi?id=3927 and the status is ASSIGNED. The owner is Andi Kleen (at suse). As far as I can tell there are some patches available, but none deemed stable enough to incorporate into current kernels.

The usual workarounds do not work for me either. I have used boot options of noapic and noapictimer, and both cause a kernel panic at boot. I have tried no_timer_check=0 and this stops my fans (not something you want on a new laptop--during the boot phase my CPU temp rapidly rose to about 68 deg C!). So I used no_timer_check=0 and set acpid to poll my thermal zones to try to get the fans to kick in, and this did not work reliably.

So for now I, like you, am at the mercy of the kernel maintainers. (When I use my laptop, my typed text usually looks like... ths dooubl timer bu realy sucks. At times I've considered the aerodynamic properties of this notebook ;-). As far as I can tell, this bug is already a few months old. If anyone has any news as to when a fix will be available, there are several out here who would be happy to hear it.

Regards
Richard

On Saturday 22 October 2005 03:52, Nathan O. Siemers wrote:

Hello all,

I've spent the last 5 days trying to fix an issue with a brand new hp a1250n dual core athlon64 machine. ATI motherboard with embedded radeon xpress 200 graphics. I've installed the pure64 sarge distribution. This is my first 64 bit debian attempt, although I am running a 2 cpu opteron workstation with suse at work.

The system is in many ways okay, but there is a serious problem with interrupts and clock speed which wreaks general havoc on the machine. The clock is running about 2x speed - I think perhaps two clock ticks (from each core?) are happening for each one that should.
Re: fast clock wreaks havoc on amd64 dual core - hp1250n
All:

So far, good news to report. 2.6.14-rc5 compiled still has a 2x clock speed, but adding notsc and no_timer_check (not sure if both are necessary) to the boot seems to result in a system that has a clock with reasonable integrity, without all the problems of noapic, etc. I have seen apic errors, so far twice in kern.log:

Oct 22 20:52:44 line kernel: APIC error on CPU0: 40(40)
Oct 22 20:52:44 line kernel: APIC error on CPU1: 40(40)

Unfortunately 2 hours is not enough time to say this is a stable configuration (24 hours is better), but encouraging. I have hit the system with high cpu, video, disk, and ieee load to test but sometimes the clock starts going haywire again after several more hours...

Thanks to all that responded so far.

mikepolniak wrote:

On 11:35 Sat 22 Oct, Richard Mace wrote:

Nathan, I can empathise. I have a two week old HP NX6125 notebook that also suffers from the same problem. I think that you have identified the bug correctly as #3927. I've just checked bugzilla.kernel.org/show_bug.cgi?id=3927 and the status is ASSIGNED. The owner is Andi Kleen (at suse). As far as I can tell there are some patches available, but none deemed stable enough to incorporate into current kernels.

The ChangeLog-2.6.14-rc5 has this fix for TSC timers:

Author: Andi Kleen [EMAIL PROTECTED]
Date: Thu Oct 13 14:41:44 2005 -0700

[NET]: Disable NET_SCH_CLK_CPU for SMP x86 hosts

Opterons with frequency scaling have fully unsynchronized TSCs running at different frequencies, so using TSCs there is not a good idea. Also some other x86 boxes have this problem. gettimeofday should be good enough, so just disable it.

Signed-off-by: Andi Kleen [EMAIL PROTECTED]

I have had the 'lost ticks' problem with AMD x2 dual cpu and SMP. Now I am running kernel-2.6.14-rc5 and the problem appears fixed.
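To put a number on "reasonable integrity" over a longer soak test, one option is to note the system clock and an independent reference (a wristwatch, or the time reported by another machine) at the start and end of the run, then compare the elapsed intervals. This is only a sketch under that assumption: the function name `clock_rate_ratio` and the choice of reference are hypothetical, and nothing in the thread says the posters measured it this way.

```python
def clock_rate_ratio(ref_start, ref_end, sys_start, sys_end):
    """Return elapsed system-clock time divided by elapsed reference time.

    All arguments are timestamps in seconds. A healthy clock gives a ratio
    close to 1.0; the doubled clock described in this thread would give
    roughly 2.0.
    """
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference interval must be positive")
    return (sys_end - sys_start) / ref_elapsed


# Two hours on the reference clock, four hours on the system clock:
print(clock_rate_ratio(0, 7200, 0, 14400))
```

A longer reference interval (the 24 hours suggested above rather than 2) makes small drifts easier to distinguish from measurement error in reading the two clocks.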