Re: time keeps on slipping... slipping...
On 11.10.2012 09:30, John-Mark Gurney wrote: Alexander Motin wrote this message on Thu, Oct 11, 2012 at 01:43 +0300: On 08.10.2012 07:02, John-Mark Gurney wrote: I recently put together a new machine w/ a SuperMicro H8SCM and an AMD Opteron 4228 HE... I've having an issue where the clock on the machine skips around... The wierd part is that it's very sudden when it happens... ntp sometimes brings it back, but it can't when the clock gets too far ahread (1000 seconds), ntp dies... In order to catch it happening, I ran a sleep 60 loop fetching time >from another server that keeps time correctly via: while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; done here are some snippits: h2:Sun Oct 7 17:12:54 2012^M Sun Oct 7 17:12:54 PDT 2012 7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting h2:Sun Oct 7 17:13:48 2012^M Sun Oct 7 17:20:21 PDT 2012 7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting but then ntp brings it back in sync: h2:Sun Oct 7 17:28:49 2012^M Sun Oct 7 17:35:21 PDT 2012 7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting h2:Sun Oct 7 17:29:49 2012^M Sun Oct 7 17:29:49 PDT 2012 7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting It happens pretty often: Oct 7 00:19:13 gold ntpd[3721]: time reset -785.347912 s Oct 7 00:46:37 gold ntpd[3721]: time reset -392.673256 s Oct 7 01:04:24 gold ntpd[3721]: time reset -785.346533 s Oct 7 15:00:59 gold ntpd[3721]: time reset -392.681720 s Oct 7 16:32:11 gold ntpd[3721]: time reset -392.671268 s Oct 7 17:29:29 gold ntpd[3721]: time reset -392.671752 s Oct 7 18:04:37 gold ntpd[3721]: time reset -785.346987 s but as you can see above, the time slip happens abruptly.. looks like a rounding error or something... I'm now reducing the sleep to 5 seconds... but as you can see the sleep ends a few seconds early and local time suddenly jumped forward 6 minutes 33 seconds... $ sysctl kern.timecounter kern.timecounter.fast_gettime: 1 kern.timecounter.tick: 1 kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) dummy(-100) kern.timecounter.hardware: TSC-low kern.timecounter.stepwarnings: 0 kern.timecounter.tc.i8254.mask: 65535 kern.timecounter.tc.i8254.counter: 11598 kern.timecounter.tc.i8254.frequency: 1193182 kern.timecounter.tc.i8254.quality: 0 kern.timecounter.tc.HPET.mask: 4294967295 kern.timecounter.tc.HPET.counter: 3257069245 kern.timecounter.tc.HPET.frequency: 14318180 kern.timecounter.tc.HPET.quality: 950 kern.timecounter.tc.ACPI-safe.mask: 16777215 kern.timecounter.tc.ACPI-safe.counter: 4219134510 kern.timecounter.tc.ACPI-safe.frequency: 3579545 kern.timecounter.tc.ACPI-safe.quality: 850 kern.timecounter.tc.TSC-low.mask: 4294967295 kern.timecounter.tc.TSC-low.counter: 2854866610 kern.timecounter.tc.TSC-low.frequency: 10937740 kern.timecounter.tc.TSC-low.quality: 1000 kern.timecounter.smp_tsc: 1 kern.timecounter.invariant_tsc: 1 $ sysctl kern.eventtimer kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0) kern.eventtimer.et.LAPIC.flags: 15 kern.eventtimer.et.LAPIC.frequency: 12217 kern.eventtimer.et.LAPIC.quality: 400 kern.eventtimer.et.i8254.flags: 1 kern.eventtimer.et.i8254.frequency: 1193182 kern.eventtimer.et.i8254.quality: 100 kern.eventtimer.et.RTC.flags: 17 kern.eventtimer.et.RTC.frequency: 32768 kern.eventtimer.et.RTC.quality: 0 kern.eventtimer.periodic: 0 kern.eventtimer.timer: LAPIC kern.eventtimer.activetick: 1 kern.eventtimer.idletick: 0 kern.eventtimer.singlemul: 2 I have switched my timecounter to HPET to see if things are different... Any clues? Mentioned switching to HPET could tell a lot about the problem. Switching event timer also may be interesting. Since I switch to HPET, it hasn't happened at all in the last 3 days.. That is probably tells about some problems with TSC timecounter. What is strange to me is time jump size of 5 minutes. TSC timecounter should overflow each few seconds, so single jump should be just that big. Should I try switching back to TSC and switching event timer? do you need any other info, or want me to try anything else? You may try to do it to be sure eventtimers are not related to the case. Oh, forgot to include the specific processor info in my previous email: CPU: AMD Opteron(tm) Processor 4228 HE (2800.05-MHz K8-class CPU) Origin = "AuthenticAMD" Id = 0x600f12 Family = 0x15 Model = 0x1 Stepping = 2 Features=0x178bfbff Features2=0x1e98220b AMD Features=0x2e500800 AMD Features2=0x1c9bfff,> TSC: P-state invariant, performance statistics Unfortunately, I don't know AMD processors specifics. May be jkim@ or avg@ may remember something. As far as I know, kernel should block enter sleep states on AMD CPUs when LAPIC eventtimer is used (by default). In such case I guess TSC should also work fine. But I don't know what other possible sources of asynchronicity may be there. -- Alexander Motin
Re: time keeps on slipping... slipping...
On 2012-Oct-10 23:30:30 -0700, John-Mark Gurney wrote: >> >kern.timecounter.tc.TSC-low.mask: 4294967295 >> >kern.timecounter.tc.TSC-low.counter: 2854866610 >> >kern.timecounter.tc.TSC-low.frequency: 10937740 >> >kern.timecounter.tc.TSC-low.quality: 1000 ... >Since I switch to HPET, it hasn't happened at all in the last 3 days.. That suggests that there's something peculiar about your TSC. There are a variety of possibilities... Does your CPU support multiple Cx states and are you using them ("sysctl dev.cpu | grep cx_")? -- Peter Jeremy pgpeGw8Kk7Rhc.pgp Description: PGP signature
Re: time keeps on slipping... slipping...
Alexander Motin wrote this message on Thu, Oct 11, 2012 at 01:43 +0300: > On 08.10.2012 07:02, John-Mark Gurney wrote: > >I recently put together a new machine w/ a SuperMicro H8SCM and an > >AMD Opteron 4228 HE... I've having an issue where the clock on the > >machine skips around... The wierd part is that it's very sudden when > >it happens... ntp sometimes brings it back, but it can't when the clock > >gets too far ahread (1000 seconds), ntp dies... > > > >In order to catch it happening, I ran a sleep 60 loop fetching time > >from another server that keeps time correctly via: > >while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; > >done > > > >here are some snippits: > >h2:Sun Oct 7 17:12:54 2012^M > >Sun Oct 7 17:12:54 PDT 2012 > > 7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting > >h2:Sun Oct 7 17:13:48 2012^M > >Sun Oct 7 17:20:21 PDT 2012 > > 7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting > > > >but then ntp brings it back in sync: > >h2:Sun Oct 7 17:28:49 2012^M > >Sun Oct 7 17:35:21 PDT 2012 > > 7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting > >h2:Sun Oct 7 17:29:49 2012^M > >Sun Oct 7 17:29:49 PDT 2012 > > 7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting > > > >It happens pretty often: > >Oct 7 00:19:13 gold ntpd[3721]: time reset -785.347912 s > >Oct 7 00:46:37 gold ntpd[3721]: time reset -392.673256 s > >Oct 7 01:04:24 gold ntpd[3721]: time reset -785.346533 s > >Oct 7 15:00:59 gold ntpd[3721]: time reset -392.681720 s > >Oct 7 16:32:11 gold ntpd[3721]: time reset -392.671268 s > >Oct 7 17:29:29 gold ntpd[3721]: time reset -392.671752 s > >Oct 7 18:04:37 gold ntpd[3721]: time reset -785.346987 s > > > >but as you can see above, the time slip happens abruptly.. looks like > >a rounding error or something... > > > >I'm now reducing the sleep to 5 seconds... but as you can see the sleep > >ends a few seconds early and local time suddenly jumped forward 6 > >minutes 33 seconds... > > > >$ sysctl kern.timecounter > >kern.timecounter.fast_gettime: 1 > >kern.timecounter.tick: 1 > >kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) > >dummy(-100) > >kern.timecounter.hardware: TSC-low > >kern.timecounter.stepwarnings: 0 > >kern.timecounter.tc.i8254.mask: 65535 > >kern.timecounter.tc.i8254.counter: 11598 > >kern.timecounter.tc.i8254.frequency: 1193182 > >kern.timecounter.tc.i8254.quality: 0 > >kern.timecounter.tc.HPET.mask: 4294967295 > >kern.timecounter.tc.HPET.counter: 3257069245 > >kern.timecounter.tc.HPET.frequency: 14318180 > >kern.timecounter.tc.HPET.quality: 950 > >kern.timecounter.tc.ACPI-safe.mask: 16777215 > >kern.timecounter.tc.ACPI-safe.counter: 4219134510 > >kern.timecounter.tc.ACPI-safe.frequency: 3579545 > >kern.timecounter.tc.ACPI-safe.quality: 850 > >kern.timecounter.tc.TSC-low.mask: 4294967295 > >kern.timecounter.tc.TSC-low.counter: 2854866610 > >kern.timecounter.tc.TSC-low.frequency: 10937740 > >kern.timecounter.tc.TSC-low.quality: 1000 > >kern.timecounter.smp_tsc: 1 > >kern.timecounter.invariant_tsc: 1 > >$ sysctl kern.eventtimer > >kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0) > >kern.eventtimer.et.LAPIC.flags: 15 > >kern.eventtimer.et.LAPIC.frequency: 12217 > >kern.eventtimer.et.LAPIC.quality: 400 > >kern.eventtimer.et.i8254.flags: 1 > >kern.eventtimer.et.i8254.frequency: 1193182 > >kern.eventtimer.et.i8254.quality: 100 > >kern.eventtimer.et.RTC.flags: 17 > >kern.eventtimer.et.RTC.frequency: 32768 > >kern.eventtimer.et.RTC.quality: 0 > >kern.eventtimer.periodic: 0 > >kern.eventtimer.timer: LAPIC > >kern.eventtimer.activetick: 1 > >kern.eventtimer.idletick: 0 > >kern.eventtimer.singlemul: 2 > > > >I have switched my timecounter to HPET to see if things are different... > > > >Any clues? > > Mentioned switching to HPET could tell a lot about the problem. > Switching event timer also may be interesting. Since I switch to HPET, it hasn't happened at all in the last 3 days.. Should I try switching back to TSC and switching event timer? do you need any other info, or want me to try anything else? Oh, forgot to include the specific processor info in my previous email: CPU: AMD Opteron(tm) Processor 4228 HE (2800.05-MHz K8-class CPU) Origin = "AuthenticAMD" Id = 0x600f12 Family = 0x15 Model = 0x1 Stepping = 2 Features=0x178bfbff Features2=0x1e98220b AMD Features=0x2e500800 AMD Features2=0x1c9bfff,> TSC: P-state invariant, performance statistics -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: time keeps on slipping... slipping...
On 08.10.2012 07:02, John-Mark Gurney wrote: I recently put together a new machine w/ a SuperMicro H8SCM and an AMD Opteron 4228 HE... I've having an issue where the clock on the machine skips around... The wierd part is that it's very sudden when it happens... ntp sometimes brings it back, but it can't when the clock gets too far ahread (1000 seconds), ntp dies... In order to catch it happening, I ran a sleep 60 loop fetching time from another server that keeps time correctly via: while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; done here are some snippits: h2:Sun Oct 7 17:12:54 2012^M Sun Oct 7 17:12:54 PDT 2012 7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting h2:Sun Oct 7 17:13:48 2012^M Sun Oct 7 17:20:21 PDT 2012 7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting but then ntp brings it back in sync: h2:Sun Oct 7 17:28:49 2012^M Sun Oct 7 17:35:21 PDT 2012 7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting h2:Sun Oct 7 17:29:49 2012^M Sun Oct 7 17:29:49 PDT 2012 7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting It happens pretty often: Oct 7 00:19:13 gold ntpd[3721]: time reset -785.347912 s Oct 7 00:46:37 gold ntpd[3721]: time reset -392.673256 s Oct 7 01:04:24 gold ntpd[3721]: time reset -785.346533 s Oct 7 15:00:59 gold ntpd[3721]: time reset -392.681720 s Oct 7 16:32:11 gold ntpd[3721]: time reset -392.671268 s Oct 7 17:29:29 gold ntpd[3721]: time reset -392.671752 s Oct 7 18:04:37 gold ntpd[3721]: time reset -785.346987 s but as you can see above, the time slip happens abruptly.. looks like a rounding error or something... I'm now reducing the sleep to 5 seconds... but as you can see the sleep ends a few seconds early and local time suddenly jumped forward 6 minutes 33 seconds... $ sysctl kern.timecounter kern.timecounter.fast_gettime: 1 kern.timecounter.tick: 1 kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) dummy(-100) kern.timecounter.hardware: TSC-low kern.timecounter.stepwarnings: 0 kern.timecounter.tc.i8254.mask: 65535 kern.timecounter.tc.i8254.counter: 11598 kern.timecounter.tc.i8254.frequency: 1193182 kern.timecounter.tc.i8254.quality: 0 kern.timecounter.tc.HPET.mask: 4294967295 kern.timecounter.tc.HPET.counter: 3257069245 kern.timecounter.tc.HPET.frequency: 14318180 kern.timecounter.tc.HPET.quality: 950 kern.timecounter.tc.ACPI-safe.mask: 16777215 kern.timecounter.tc.ACPI-safe.counter: 4219134510 kern.timecounter.tc.ACPI-safe.frequency: 3579545 kern.timecounter.tc.ACPI-safe.quality: 850 kern.timecounter.tc.TSC-low.mask: 4294967295 kern.timecounter.tc.TSC-low.counter: 2854866610 kern.timecounter.tc.TSC-low.frequency: 10937740 kern.timecounter.tc.TSC-low.quality: 1000 kern.timecounter.smp_tsc: 1 kern.timecounter.invariant_tsc: 1 $ sysctl kern.eventtimer kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0) kern.eventtimer.et.LAPIC.flags: 15 kern.eventtimer.et.LAPIC.frequency: 12217 kern.eventtimer.et.LAPIC.quality: 400 kern.eventtimer.et.i8254.flags: 1 kern.eventtimer.et.i8254.frequency: 1193182 kern.eventtimer.et.i8254.quality: 100 kern.eventtimer.et.RTC.flags: 17 kern.eventtimer.et.RTC.frequency: 32768 kern.eventtimer.et.RTC.quality: 0 kern.eventtimer.periodic: 0 kern.eventtimer.timer: LAPIC kern.eventtimer.activetick: 1 kern.eventtimer.idletick: 0 kern.eventtimer.singlemul: 2 I have switched my timecounter to HPET to see if things are different... Any clues? Mentioned switching to HPET could tell a lot about the problem. Switching event timer also may be interesting. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: time keeps on slipping... slipping...
On Sun, Oct 07, 2012 at 21:02:39 -0700 , John-Mark Gurney wrote: > I recently put together a new machine w/ a SuperMicro H8SCM and an > AMD Opteron 4228 HE... I've having an issue where the clock on the > machine skips around... The wierd part is that it's very sudden when > it happens... ntp sometimes brings it back, but it can't when the clock > gets too far ahread (1000 seconds), ntp dies... > > In order to catch it happening, I ran a sleep 60 loop fetching time > from another server that keeps time correctly via: > while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; done > > here are some snippits: > h2:Sun Oct 7 17:12:54 2012^M > Sun Oct 7 17:12:54 PDT 2012 > 7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting > h2:Sun Oct 7 17:13:48 2012^M > Sun Oct 7 17:20:21 PDT 2012 > 7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting > > but then ntp brings it back in sync: > h2:Sun Oct 7 17:28:49 2012^M > Sun Oct 7 17:35:21 PDT 2012 > 7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting > h2:Sun Oct 7 17:29:49 2012^M > Sun Oct 7 17:29:49 PDT 2012 > 7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting > > I have switched my timecounter to HPET to see if things are different... > > Any clues? Looks like faulty hardware to me. You shouldn't need to work that hard to keep the clock in sync. Can you boot a livecd of e.g. Linux or OpenIndiana and reproduce the issue? -- Thanks and best regards, Chris Nehren pgppOHFTirbxg.pgp Description: PGP signature