On Mon, 31 Mar 2008 22:49:05 -0700 Lucky Greeen <[EMAIL PROTECTED]>
wrote:
>Scott Bennett wrote:
>> It appears that the "clock jump" problem does persist into 0.2.0.23-rc.
>> Here are the notice-level log messages since I started up the new version
>> this afternoon:
>> [remainder elided]
>>
>Scott,
>If you are using a dual-core or multiprocessor system, it is likely
>that you too are experiencing issues with the TSC being out of sync
>between the cores. The TSC counters on physical CPUs can get out of sync
>just as those on virtual CPUs can. Physical CPUs just experience this
>issue so rarely that most users will never encounter it. The
>likelihood that the TSCs between two cores on the same system get out of
>sync appears to be system/motherboard/CPU dependent.

     It's actually a hyperthreading 3.4 GHz P4. FreeBSD supports such chips
as SMP systems, although it's really a single core with two logical CPUs
vying for pipeline slots. Running systat's vmstat display with two-second
updates, the timer interrupt counts for cpu0 and cpu1 hover around 2000 and
are usually equal. Only rarely do they differ by more than a single
interrupt in any two-second interval, so it's difficult to imagine the
kernel tolerating an accumulated difference of over 100 K interrupts
between the two counts. It's also hard to imagine how that could happen
when there's really almost no load on the system.
>
>See http://kerneltrap.org/node/14003 on the rationale for adding the
>"notsc" option to the Linux kernel. Also see

     I notice it refers to a problem in the LINUX x86_64 support for dual
core systems. Mine is only hyperthreading-capable, not dual cored, and is
only a 32-bit chip, so I run FreeBSD's i386 versions, not the amd64
versions.

>http://lkml.org/lkml/2005/11/4/173 for a detailed explanation by an AMD
>engineer of why and under which conditions the TSCs might go out of sync.

     Fascinating, indeed, but probably irrelevant to a single-cored,
HTT-enabled P4 chip. sysctl reveals quite a few variables that are
available on such a system for the chip as a whole, but not for individual
(logical) CPUs.

>At least in the past FreeBSD was also impacted by TSCs getting out of
>sync, causing the time to jump as the process migrates between CPUs.
>See
>http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2004-08/1105.html

     Interesting thread. Thanks.
>
>There is a very simple way for you to test if the time jumps you have
>been seeing are caused by the TSCs between two CPU cores being out of
>sync: simply temporarily disable SMP in your kernel. If the problem goes
>away, the TSCs likely are out of sync.
>If so, you may want to find out
>if there is a way to disable FreeBSD's use of the TSC similar to the
>"notsc" option in Linux and see if that addresses the problem.
>

     Here's a bit more information, the clue for which came from the second
URL's article above.

[hellas] 333 % sysctl kern.timecounter
kern.timecounter.stepwarnings: 0
kern.timecounter.nbinuptime: 751724079
kern.timecounter.nnanouptime: 5273
kern.timecounter.nmicrouptime: 353617
kern.timecounter.nbintime: 295482161
kern.timecounter.nnanotime: 157914288
kern.timecounter.nmicrotime: 137569526
kern.timecounter.ngetbinuptime: 2285019
kern.timecounter.ngetnanouptime: 18897431
kern.timecounter.ngetmicrouptime: 133591454
kern.timecounter.ngetbintime: 0
kern.timecounter.ngetnanotime: 70820
kern.timecounter.ngetmicrotime: 602672586
kern.timecounter.nsetclock: 4
kern.timecounter.hardware: ACPI-fast
kern.timecounter.choice: TSC(-100) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.tick: 1
kern.timecounter.smp_tsc: 0
[hellas] 334 % sysctl hw.acpi
hw.acpi.supported_sleep_state: S1 S3 S4 S5
hw.acpi.power_button_state: S5
hw.acpi.sleep_button_state: S1
hw.acpi.lid_switch_state: NONE
hw.acpi.standby_state: S1
hw.acpi.suspend_state: S3
hw.acpi.sleep_delay: 1
hw.acpi.s4bios: 1
hw.acpi.verbose: 0
hw.acpi.disable_on_reboot: 0
hw.acpi.handle_reboot: 0
hw.acpi.reset_video: 0
hw.acpi.cpu.cx_lowest: C1
hw.acpi.acline: 1
hw.acpi.battery.life: 100
hw.acpi.battery.time: -1
hw.acpi.battery.state: 0
hw.acpi.battery.units: 1
hw.acpi.battery.info_expire: 5
hw.acpi.thermal.min_runtime: 0
hw.acpi.thermal.polling_rate: 10
hw.acpi.thermal.user_override: 0
hw.acpi.thermal.tz0.temperature: 55.5C
hw.acpi.thermal.tz0.active: -1
hw.acpi.thermal.tz0.passive_cooling: 0
hw.acpi.thermal.tz0.thermal_flags: 0
hw.acpi.thermal.tz0._PSV: -1
hw.acpi.thermal.tz0._HOT: -1
hw.acpi.thermal.tz0._CRT: 89.0C
hw.acpi.thermal.tz0._ACx: -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
hw.acpi.thermal.tz0._TC1: -1
hw.acpi.thermal.tz0._TC2: -1
hw.acpi.thermal.tz0._TSP: -1
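
     A note on reading the kern.timecounter.choice line above: each
candidate counter is listed with a quality score in parentheses. The
following is only a minimal sketch in Python, under the assumption (not
something the output itself states) that the kernel defaults to the
candidate with the highest score. The names parse_choice and best_counter
are hypothetical helpers made up for this illustration.

```python
import re

# Value copied from the kern.timecounter.choice line in the output above.
CHOICE = "TSC(-100) ACPI-fast(1000) i8254(0) dummy(-1000000)"


def parse_choice(choice):
    """Return {name: quality} parsed from a kern.timecounter.choice string."""
    pairs = re.findall(r"(\S+?)\((-?\d+)\)", choice)
    return {name: int(quality) for name, quality in pairs}


def best_counter(choice):
    """Pick the highest-quality candidate.

    Assumption: this mirrors the kernel's default selection; it is an
    illustration, not a copy of the kernel's logic.
    """
    qualities = parse_choice(choice)
    return max(qualities, key=qualities.get)


if __name__ == "__main__":
    print(parse_choice(CHOICE))
    print(best_counter(CHOICE))
```

     For the string above this picks ACPI-fast, which agrees with the
reported kern.timecounter.hardware value, and it ranks the TSC (-100) below
even the i8254 (0).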
[hellas] 335 %

     Now, I make no claim to knowing the significance of all of the above.
My guess is that the value of kern.timecounter.hardware (ACPI-fast) means
that the kernel is not using the TSC anyway, so the problem you describe
above would not be what is happening here. Another guess is that
kern.timecounter.hardware may be set that way because
kern.timecounter.smp_tsc is set to 0. (The hw.acpi output shows some
examples of variables not duplicated for the two logical CPUs, e.g.,
cpu.cx_lowest and thermal.tz0.*. There are some others hidden elsewhere,
too.) How does it look to you?
     BTW, I'm running FreeBSD 6.3-STABLE. The message your second URL
pointed to is from 2004, and an awful lot of things have changed since
then. (FreeBSD 7 includes many, many more changes, including the change of
default scheduler to the ULE scheduler, made reliable at last for SMP
operations, so if anything, 7.x is probably even less vulnerable to the
vagaries of various timer support mechanisms than 6.x, but I'm not running
that system yet.)


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:       bennett at cs.niu.edu                              *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************