Re: time issues and some more

2013-01-17 Thread Daniel Braniss
 On Wed, 16 Jan 2013 10:45:49 +0100, Daniel Braniss da...@cs.huji.ac.il  
 wrote:
 
  I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
  as a backup to several zfs servers (doing send|receive); until the upgrade
  it was running 8.2-stable without major issues.
 
  Now we see that the time sometimes drifts, and today I noticed that it was
  hung. Once I got onto the ipmi console, this is what I got:
  [SOL Session operational.  Use ~? for help]
  swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
 
  and things started moving again,
 
  in /var/log/messages:
  Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288
 
  but the REAL time is 7 hours ahead! So did time stand still?
  And now, of course, we get:
  Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
 
  I will now reboot, and try a newer kernel and check, but any insight will
  be very helpful,
 
  thanks,
  danny
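
For scale, the 25216-second correction in the ntpd message quoted above is almost exactly seven hours, which matches the offset described. A minimal Python sketch of the conversion (the helper name split_seconds is made up for illustration and is not part of any tool mentioned in this thread):

```python
def split_seconds(total):
    """Split a number of seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds

# 25216 is the correction reported by ntpd in the quoted log line.
print("%dh %02dm %02ds" % split_seconds(25216))
```
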
 
 Does FreeBSD 9 choose a different timer source than FreeBSD 8?
 Use sysctl to check these values on your system.
 kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
 kern.eventtimer.timer: LAPIC
 
 Or these ones. I always confuse them.
 kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-100)
 kern.timecounter.hardware: TSC-low
 

under 8.3 it's kern.timecounter, so this is what I get:

 sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 52515
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 925448
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 1472869277
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 900
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 4125922088
kern.timecounter.tc.TSC.frequency: 2329838875
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

so I assume the choice is HPET, under 9.1:
kern.eventtimer.timer: HPET

so it seems to be the same.
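
Note that kern.eventtimer (the source of timer interrupts) and kern.timecounter (the source used to read the current time) are separate settings, so HPET as event timer and ACPI-fast as timecounter can coexist. By default the kernel picks the timecounter with the highest quality value listed in kern.timecounter.choice. A small Python sketch, assuming the choice string is pasted by hand from the sysctl output (the function name best_timecounter is hypothetical), that shows which entry wins:

```python
import re

def best_timecounter(choice):
    """Return (name, quality) of the highest-quality entry in a
    kern.timecounter.choice string such as 'TSC(-100) HPET(900)'."""
    entries = re.findall(r"([^\s(]+)\((-?\d+)\)", choice)
    return max(((name, int(q)) for name, q in entries), key=lambda e: e[1])

# String copied from the 8.3 sysctl output in this message.
choice = "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)"
print(best_timecounter(choice))
```

This only ranks the listed values; it does not account for a manual override of kern.timecounter.hardware.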

btw, this morning I see that it's more than 1 hour behind, and no sign of ntpd!

the logs show:
...
Jan 17 00:40:52 store-02 kernel: usb_dev_suspend_peer: Setting device remote wakeup failed
Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
...

it seems to me that the 7854 seconds is exactly the time difference.
date on this host says:
Thu Jan 17 08:46:18 IST 2013

adding the 7854 sec gives (almost) the current real date:
Thu Jan 17 10:57:13 IST 2013

something is very fishy here.
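
The arithmetic can be checked with a few lines of Python. This is only a sketch: the two input values are copied from the date output and the ntpd log line in this message, and the time zone is ignored because both timestamps are in IST.

```python
from datetime import datetime, timedelta

# Local time shown by `date` on the host, and the correction from the ntpd log.
host_time = datetime(2013, 1, 17, 8, 46, 18)
correction = timedelta(seconds=7854)

# Host time plus the rejected correction should land close to the real time.
print(host_time + correction)
```

The result is within a second of the real time quoted above, which supports reading the 7854 seconds as the accumulated lag of the system clock.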

cheers,

danny



___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: time issues and some more

2013-01-17 Thread Ronald Klop
On Thu, 17 Jan 2013 09:58:07 +0100, Daniel Braniss da...@cs.huji.ac.il  
wrote:



 under 8.3 it's kern.timecounter, so this is what I get:

 sysctl kern.timecounter
 kern.timecounter.tick: 1
 kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)
 kern.timecounter.hardware: ACPI-fast
 kern.timecounter.stepwarnings: 0
 kern.timecounter.tc.i8254.mask: 65535
 kern.timecounter.tc.i8254.counter: 52515
 kern.timecounter.tc.i8254.frequency: 1193182
 kern.timecounter.tc.i8254.quality: 0
 kern.timecounter.tc.ACPI-fast.mask: 16777215
 kern.timecounter.tc.ACPI-fast.counter: 925448
 kern.timecounter.tc.ACPI-fast.frequency: 3579545
 kern.timecounter.tc.ACPI-fast.quality: 1000
 kern.timecounter.tc.HPET.mask: 4294967295
 kern.timecounter.tc.HPET.counter: 1472869277
 kern.timecounter.tc.HPET.frequency: 14318180
 kern.timecounter.tc.HPET.quality: 900
 kern.timecounter.tc.TSC.mask: 4294967295
 kern.timecounter.tc.TSC.counter: 4125922088
 kern.timecounter.tc.TSC.frequency: 2329838875
 kern.timecounter.tc.TSC.quality: -100
 kern.timecounter.smp_tsc: 0
 kern.timecounter.invariant_tsc: 1

 so I assume the choice is HPET, under 9.1:
 kern.eventtimer.timer: HPET

Your server uses:
kern.timecounter.hardware: ACPI-fast

Please check that value on 9.1 and 8.3.




 so it seems to be the same.

 btw, this morning I see that it's more than 1 hour behind, and no sign of ntpd!

 the logs show:
 ...
 Jan 17 00:40:52 store-02 kernel: usb_dev_suspend_peer: Setting device remote wakeup failed
 Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
 ...

 it seems to me that the 7854 seconds is exactly the time difference.
 date on this host says:
 Thu Jan 17 08:46:18 IST 2013

 adding the 7854 sec gives (almost) the current real date:
 Thu Jan 17 10:57:13 IST 2013

 something is very fishy here.



Are you doing suspend/resume stuff on your machine? Or does  
usb_dev_suspend_peer mean suspend in another way?


Ronald.


Re: time issues and some more

2013-01-17 Thread Daniel Braniss
 On Thu, 17 Jan 2013 09:58:07 +0100, Daniel Braniss da...@cs.huji.ac.il  
 wrote:
 
  under 8.3 it's kern.timecounter, so this is what I get:
 
  sysctl kern.timecounter
  kern.timecounter.tick: 1
  kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-100)
  kern.timecounter.hardware: ACPI-fast
  kern.timecounter.stepwarnings: 0
  kern.timecounter.tc.i8254.mask: 65535
  kern.timecounter.tc.i8254.counter: 52515
  kern.timecounter.tc.i8254.frequency: 1193182
  kern.timecounter.tc.i8254.quality: 0
  kern.timecounter.tc.ACPI-fast.mask: 16777215
  kern.timecounter.tc.ACPI-fast.counter: 925448
  kern.timecounter.tc.ACPI-fast.frequency: 3579545
  kern.timecounter.tc.ACPI-fast.quality: 1000
  kern.timecounter.tc.HPET.mask: 4294967295
  kern.timecounter.tc.HPET.counter: 1472869277
  kern.timecounter.tc.HPET.frequency: 14318180
  kern.timecounter.tc.HPET.quality: 900
  kern.timecounter.tc.TSC.mask: 4294967295
  kern.timecounter.tc.TSC.counter: 4125922088
  kern.timecounter.tc.TSC.frequency: 2329838875
  kern.timecounter.tc.TSC.quality: -100
  kern.timecounter.smp_tsc: 0
  kern.timecounter.invariant_tsc: 1
 
  so I assume the choice is HPET, under 9.1:
  kern.eventtimer.timer: HPET
 
 Your server uses:
  kern.timecounter.hardware: ACPI-fast
 
 Please check that value on 9.1 and 8.3.
 
they both choose the same, ACPI-fast

 
 
  so it seems to be the same.
 
  btw, this morning I see that it's more than 1 hour behind, and no sign of ntpd!
 
  the logs show:
  ...
  Jan 17 00:40:52 store-02 kernel: usb_dev_suspend_peer: Setting device remote wakeup failed
  Jan 17 01:05:46 store-02 ntpd[1845]: time correction of 7854 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
  ...
 
  it seems to me that the 7854 seconds is exactly the time difference.
  date on this host says:
  Thu Jan 17 08:46:18 IST 2013
 
  adding the 7854 sec gives (almost) the current real date:
  Thu Jan 17 10:57:13 IST 2013
 
  something is very fishy here.
 
 
 Are you doing suspend/resume stuff on your machine? Or does  
 usb_dev_suspend_peer mean suspend in another way?
not that I know of, but the previous time it complained about something else:
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

Since I have other such boxes without the problem, my bet is on mfdi/zfs

danny




Re: time issues and some more

2013-01-16 Thread Dan Daley

I have noticed similar time issues after upgrading from 8.2 to 8.3. I haven't
had time to investigate, but the time drift often exceeds the sanity limit.

So the problem is already present before 9.x. I never had this problem when running 8.2.





From: Daniel Braniss da...@cs.huji.ac.il
To: freebsd-stable@freebsd.org
Sent: Wed, January 16, 2013 3:54:44 AM
Subject: time issues and some more

I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
as a backup to several zfs servers (doing send|receive); until the upgrade
it was running 8.2-stable without major issues.

Now we see that the time sometimes drifts, and today I noticed that it was
hung. Once I got onto the ipmi console, this is what I got:
[SOL Session operational.  Use ~? for help]
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

and things started moving again,

in /var/log/messages:
Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

but the REAL time is 7 hours ahead! So did time stand still?
And now, of course, we get:
Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.

I will now reboot, and try a newer kernel and check, but any insight will
be very helpful,

thanks,
danny




Re: time issues and some more

2013-01-16 Thread Ronald Klop
On Wed, 16 Jan 2013 10:45:49 +0100, Daniel Braniss da...@cs.huji.ac.il  
wrote:


 I recently upgraded a Dell PowerEdge R710 to 9.1-stable. We mainly use it
 as a backup to several zfs servers (doing send|receive); until the upgrade
 it was running 8.2-stable without major issues.

 Now we see that the time sometimes drifts, and today I noticed that it was
 hung. Once I got onto the ipmi console, this is what I got:
 [SOL Session operational.  Use ~? for help]
 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

 and things started moving again,

 in /var/log/messages:
 Jan 16 03:27:35 store-02 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 3864, size: 12288

 but the REAL time is 7 hours ahead! So did time stand still?
 And now, of course, we get:
 Jan 16 03:54:19 store-02 ntpd[38163]: time correction of 25216 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.

 I will now reboot, and try a newer kernel and check, but any insight will
 be very helpful,

 thanks,
 danny


Does FreeBSD 9 choose a different timer source than FreeBSD 8?
Use sysctl to check these values on your system.
kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.timer: LAPIC

Or these ones. I always confuse them.
kern.timecounter.choice: TSC-low(1000) ACPI-fast(900) i8254(0) dummy(-100)
kern.timecounter.hardware: TSC-low

Ronald.