Re: time issues and some more
> On Wed, 16 Jan 2013 10:45:49 +0100, Daniel Braniss da...@cs.huji.ac.il wrote:
>
> > I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
> > as a backup for several zfs servers (doing send|receive). It ran without
> > major issues until the upgrade; it was running 8.2-stable.
> >
> > Now we see that the time sometimes drifts, and today I noticed that it was
> > hung. Once I got onto the ipmi console, this is what I got:
> >
> > [SOL Session operational. Use ~? for help]
> > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> >
> > and things started moving again. In /var/log/messages:
> >
> > Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> >
> > but the REAL time is 7 hours ahead! So time stood still? And now, of
> > course, we get:
> >
> > Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
> >
> > I will now reboot, try a newer kernel and check, but any insight will be
> > very helpful.
> >
> > thanks,
> > danny
>
> Does BSD 9 choose another timer source than BSD 8? Use sysctl to check
> these values on your system:
>
> kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
> kern.eventtimer.timer: LAPIC
>
> Or these ones. I always confuse these.
>
> kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-100)
> kern.timecounter.hardware: TSC-low

Under 8.3 it's kern.timecounter, so this is what I get:

sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 52515
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 925448
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 1472869277
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 900
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 4125922088
kern.timecounter.tc.TSC.frequency: 2329838875
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

so I assume the choice is HPET; under 9.1:

kern.eventtimer.timer: HPET

so it seems to be the same.

BTW, this morning I see that it's behind by more than 1 hour, and there are
no signs of ntpd! The logs show:

...
Jan 17 00:40:52 store-02 kernel: usb_dev_suspend_peer: Setting device remote wakeup failed
Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
...

It seems to me that the 7854 seconds is exactly the time difference. date on
this host says:

Thu Jan 17 08:46:18 IST 2013

Adding the 7854 sec gives (almost) the current real date:

Thu Jan 17 10:57:13 IST 2013

Something is very fishy here.

cheers,
danny

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
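The arithmetic in the message above can be checked with a few lines of Python. This is only a sketch: the three values are copied from the message, and the variable names are made up for illustration. Both timestamps are IST, so they are compared as plain local times.

```python
from datetime import datetime, timedelta

# Values copied from the message above. Both timestamps are in the same
# zone (IST), so they are treated as naive local times.
host_clock = datetime(2013, 1, 17, 8, 46, 18)    # what `date` printed on the host
real_clock = datetime(2013, 1, 17, 10, 57, 13)   # the (almost) real time
ntpd_offset = timedelta(seconds=7854)            # from the ntpd log line

# Host time plus the correction ntpd refused to apply
predicted = host_clock + ntpd_offset
# What is left over after applying that correction
residual = real_clock - predicted

print("host + offset:", predicted)
print("residual (s): ", residual.total_seconds())
```

The offset ntpd logged at 01:05:46 still accounts for the whole difference more than seven hours later, to within about a second. That would fit a clock that lost time in one step rather than one that drifts steadily, although the thread does not confirm this.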
Re: time issues and some more
On Thu, 17 Jan 2013 09:58:07 +0100, Daniel Braniss da...@cs.huji.ac.il wrote:

> > On Wed, 16 Jan 2013 10:45:49 +0100, Daniel Braniss da...@cs.huji.ac.il wrote:
> >
> > > I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
> > > as a backup for several zfs servers (doing send|receive). It ran without
> > > major issues until the upgrade; it was running 8.2-stable.
> > >
> > > Now we see that the time sometimes drifts, and today I noticed that it was
> > > hung. Once I got onto the ipmi console, this is what I got:
> > >
> > > [SOL Session operational. Use ~? for help]
> > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> > >
> > > and things started moving again. In /var/log/messages:
> > >
> > > Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> > >
> > > but the REAL time is 7 hours ahead! So time stood still? And now, of
> > > course, we get:
> > >
> > > Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
> > >
> > > I will now reboot, try a newer kernel and check, but any insight will be
> > > very helpful.
> > >
> > > thanks,
> > > danny
> >
> > Does BSD 9 choose another timer source than BSD 8? Use sysctl to check
> > these values on your system:
> >
> > kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
> > kern.eventtimer.timer: LAPIC
> >
> > Or these ones. I always confuse these.
> >
> > kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-100)
> > kern.timecounter.hardware: TSC-low
>
> Under 8.3 it's kern.timecounter, so this is what I get:
>
> sysctl kern.timecounter
> kern.timecounter.tick: 1
> kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)
> kern.timecounter.hardware: ACPI-fast
> kern.timecounter.stepwarnings: 0
> kern.timecounter.tc.i8254.mask: 65535
> kern.timecounter.tc.i8254.counter: 52515
> kern.timecounter.tc.i8254.frequency: 1193182
> kern.timecounter.tc.i8254.quality: 0
> kern.timecounter.tc.ACPI-fast.mask: 16777215
> kern.timecounter.tc.ACPI-fast.counter: 925448
> kern.timecounter.tc.ACPI-fast.frequency: 3579545
> kern.timecounter.tc.ACPI-fast.quality: 1000
> kern.timecounter.tc.HPET.mask: 4294967295
> kern.timecounter.tc.HPET.counter: 1472869277
> kern.timecounter.tc.HPET.frequency: 14318180
> kern.timecounter.tc.HPET.quality: 900
> kern.timecounter.tc.TSC.mask: 4294967295
> kern.timecounter.tc.TSC.counter: 4125922088
> kern.timecounter.tc.TSC.frequency: 2329838875
> kern.timecounter.tc.TSC.quality: -100
> kern.timecounter.smp_tsc: 0
> kern.timecounter.invariant_tsc: 1
>
> so I assume the choice is HPET; under 9.1:
>
> kern.eventtimer.timer: HPET

Your server uses:

kern.timecounter.hardware: ACPI-fast

Please check that value on 9.1 and 8.3.

> so it seems to be the same.
>
> BTW, this morning I see that it's behind by more than 1 hour, and there are
> no signs of ntpd! The logs show:
>
> ...
> Jan 17 00:40:52 store-02 kernel: usb_dev_suspend_peer: Setting device remote wakeup failed
> Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
> ...
>
> It seems to me that the 7854 seconds is exactly the time difference. date on
> this host says:
>
> Thu Jan 17 08:46:18 IST 2013
>
> Adding the 7854 sec gives (almost) the current real date:
>
> Thu Jan 17 10:57:13 IST 2013
>
> Something is very fishy here.

Are you doing suspend/resume stuff on your machine? Or does
usb_dev_suspend_peer mean suspend in another way?

Ronald.
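To compare hosts quickly, the kern.timecounter.choice line can be parsed and checked against kern.timecounter.hardware. The sketch below assumes that the number in parentheses is the counter's quality and that the kernel defaults to the counter with the highest quality, which matches both outputs quoted in this thread. The function names are hypothetical.

```python
import re

def parse_choice(line):
    """Turn 'NAME(quality) NAME(quality) ...' into a {name: quality} dict."""
    pairs = re.findall(r"([^\s()]+)\((-?\d+)\)", line)
    return {name: int(quality) for name, quality in pairs}

def best_counter(line):
    """Return the counter name with the highest quality value."""
    choices = parse_choice(line)
    return max(choices, key=choices.get)

# kern.timecounter.choice lines quoted in the thread
store02 = "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)"
ronald = "TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-100)"

print(best_counter(store02))
print(best_counter(ronald))
```

If the name printed for a host differs from its kern.timecounter.hardware value, someone has probably overridden the default on that host.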
Re: time issues and some more
> On Thu, 17 Jan 2013 09:58:07 +0100, Daniel Braniss da...@cs.huji.ac.il wrote:
>
> > > On Wed, 16 Jan 2013 10:45:49 +0100, Daniel Braniss da...@cs.huji.ac.il wrote:
> > >
> > > > I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
> > > > as a backup for several zfs servers (doing send|receive). It ran without
> > > > major issues until the upgrade; it was running 8.2-stable.
> > > >
> > > > Now we see that the time sometimes drifts, and today I noticed that it was
> > > > hung. Once I got onto the ipmi console, this is what I got:
> > > >
> > > > [SOL Session operational. Use ~? for help]
> > > > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> > > >
> > > > and things started moving again. In /var/log/messages:
> > > >
> > > > Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
> > > >
> > > > but the REAL time is 7 hours ahead! So time stood still? And now, of
> > > > course, we get:
> > > >
> > > > Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
> > > >
> > > > I will now reboot, try a newer kernel and check, but any insight will be
> > > > very helpful.
> > > >
> > > > thanks,
> > > > danny
> > >
> > > Does BSD 9 choose another timer source than BSD 8? Use sysctl to check
> > > these values on your system:
> > >
> > > kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
> > > kern.eventtimer.timer: LAPIC
> > >
> > > Or these ones. I always confuse these.
> > >
> > > kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-100)
> > > kern.timecounter.hardware: TSC-low
> >
> > Under 8.3 it's kern.timecounter, so this is what I get:
> >
> > sysctl kern.timecounter
> > kern.timecounter.tick: 1
> > kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)
> > kern.timecounter.hardware: ACPI-fast
> > kern.timecounter.stepwarnings: 0
> > kern.timecounter.tc.i8254.mask: 65535
> > kern.timecounter.tc.i8254.counter: 52515
> > kern.timecounter.tc.i8254.frequency: 1193182
> > kern.timecounter.tc.i8254.quality: 0
> > kern.timecounter.tc.ACPI-fast.mask: 16777215
> > kern.timecounter.tc.ACPI-fast.counter: 925448
> > kern.timecounter.tc.ACPI-fast.frequency: 3579545
> > kern.timecounter.tc.ACPI-fast.quality: 1000
> > kern.timecounter.tc.HPET.mask: 4294967295
> > kern.timecounter.tc.HPET.counter: 1472869277
> > kern.timecounter.tc.HPET.frequency: 14318180
> > kern.timecounter.tc.HPET.quality: 900
> > kern.timecounter.tc.TSC.mask: 4294967295
> > kern.timecounter.tc.TSC.counter: 4125922088
> > kern.timecounter.tc.TSC.frequency: 2329838875
> > kern.timecounter.tc.TSC.quality: -100
> > kern.timecounter.smp_tsc: 0
> > kern.timecounter.invariant_tsc: 1
> >
> > so I assume the choice is HPET; under 9.1:
> >
> > kern.eventtimer.timer: HPET
>
> Your server uses:
>
> kern.timecounter.hardware: ACPI-fast
>
> Please check that value on 9.1 and 8.3.

They both choose the same: ACPI-fast.

> > so it seems to be the same.
> >
> > BTW, this morning I see that it's behind by more than 1 hour, and there are
> > no signs of ntpd! The logs show:
> >
> > ...
> > Jan 17 00:40:52 store-02 kernel: usb_dev_suspend_peer: Setting device remote wakeup failed
> > Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
> > ...
> >
> > It seems to me that the 7854 seconds is exactly the time difference. date on
> > this host says:
> >
> > Thu Jan 17 08:46:18 IST 2013
> >
> > Adding the 7854 sec gives (almost) the current real date:
> >
> > Thu Jan 17 10:57:13 IST 2013
> >
> > Something is very fishy here.
>
> Are you doing suspend/resume stuff on your machine? Or does
> usb_dev_suspend_peer mean suspend in another way?

Not that I know of, but the previous time it complained about something else:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

Since I have other such boxes without the problem, my bet is on mfdi/zfs.

danny
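The mask and frequency values in the sysctl output quoted above also give a rough idea of how long each counter runs before it wraps. The sketch below simply divides the counter range by its frequency; the dictionary and function names are made up for illustration. As far as I understand, the kernel has to sample the active timecounter more often than it wraps, so a stall longer than that period could lose time. Whether that is what happened on store-02 is not established in this thread.

```python
# (mask, frequency in Hz) copied from the kern.timecounter.tc.* output quoted above
counters = {
    "i8254": (65535, 1193182),
    "ACPI-fast": (16777215, 3579545),
    "HPET": (4294967295, 14318180),
    "TSC": (4294967295, 2329838875),
}

def wrap_seconds(mask, frequency):
    """Seconds until a counter with this mask overflows at this frequency."""
    return (mask + 1) / frequency

wrap = {name: wrap_seconds(mask, freq) for name, (mask, freq) in counters.items()}

# Print the counters from shortest to longest wrap period
for name, secs in sorted(wrap.items(), key=lambda item: item[1]):
    print(f"{name:10s} wraps after about {secs:.3f} s")
```

With these numbers ACPI-fast, the counter both systems select, wraps in under five seconds, while HPET lasts roughly five minutes.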
Re: time issues and some more
I have noticed similar time issues after upgrading from 8.2 to 8.3. I haven't
had time to investigate, but the time drift often exceeds the sanity limit.
So the problem is present before 9.x. I never had this problem when running
8.2.

From: Daniel Braniss da...@cs.huji.ac.il
To: freebsd-stable@freebsd.org
Sent: Wed, January 16, 2013 3:54:44 AM
Subject: time issues and some more

I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
as a backup for several zfs servers (doing send|receive). It ran without
major issues until the upgrade; it was running 8.2-stable.

Now we see that the time sometimes drifts, and today I noticed that it was
hung. Once I got onto the ipmi console, this is what I got:

[SOL Session operational. Use ~? for help]
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

and things started moving again. In /var/log/messages:

Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

but the REAL time is 7 hours ahead! So time stood still? And now, of
course, we get:

Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.

I will now reboot, try a newer kernel and check, but any insight will be
very helpful.

thanks,
danny
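For anyone collecting these ntpd messages from several machines, a small parser makes the offsets easier to read. This is a sketch that assumes the message format stays exactly as quoted in this thread; the pattern and function name are hypothetical.

```python
import re
from datetime import timedelta

# Matches the ntpd message text as quoted in this thread
PATTERN = re.compile(
    r"time correction of (-?\d+) seconds exceeds sanity limit \((\d+)\)"
)

def parse_ntpd_panic(line):
    """Return (offset, limit_in_seconds) or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return timedelta(seconds=int(match.group(1))), int(match.group(2))

lines = [
    "Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds "
    "exceeds sanity limit (1000); set clock manually to the correct UTC time.",
    "Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds "
    "exceeds sanity limit (1000); set clock manually to the correct UTC time.",
]

for line in lines:
    offset, limit = parse_ntpd_panic(line)
    print(f"offset {offset} (h:mm:ss), limit {limit} s")
```

The first offset comes out at just over seven hours, which agrees with the "7 hours ahead" observation in the original post.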
Re: time issues and some more
On Wed, 16 Jan 2013 10:45:49 +0100, Daniel Braniss da...@cs.huji.ac.il wrote:

> I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
> as a backup for several zfs servers (doing send|receive). It ran without
> major issues until the upgrade; it was running 8.2-stable.
>
> Now we see that the time sometimes drifts, and today I noticed that it was
> hung. Once I got onto the ipmi console, this is what I got:
>
> [SOL Session operational. Use ~? for help]
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
>
> and things started moving again. In /var/log/messages:
>
> Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
>
> but the REAL time is 7 hours ahead! So time stood still? And now, of
> course, we get:
>
> Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
>
> I will now reboot, try a newer kernel and check, but any insight will be
> very helpful.
>
> thanks,
> danny

Does BSD 9 choose another timer source than BSD 8? Use sysctl to check
these values on your system:

kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.timer: LAPIC

Or these ones. I always confuse these.

kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-100)
kern.timecounter.hardware: TSC-low

Ronald.