On Fri, 2017-10-20 at 21:15 +0300, Boris Samorodov wrote:
> 20.10.2017 21:04, Ian Lepore пишет:
> > 
> > On Fri, 2017-10-20 at 20:20 +0300, Boris Samorodov wrote:
> > > 
> > > (CC to freebsd-virtualization@)
> > > 
> > > 20.10.2017 19:32, Ian Lepore пишет:
> > > > 
> > > > 
> > > > On Fri, 2017-10-20 at 18:36 +0300, Boris Samorodov wrote:
> > > > > 
> > > > > 
> > > > > 20.10.2017 18:31, Boris Samorodov пишет:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 20.10.2017 18:12, Ian Lepore пишет:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > On Fri, 2017-10-20 at 14:46 +0300, Boris Samorodov wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Hi All,
> > > > > > > > 
> > > > > > > > I have got a host:
> > > > > > > > ---
> > > > > > > > bhyve-host% uname -a
> > > > > > > > FreeBSD sm.bsnet 12.0-CURRENT FreeBSD 12.0-CURRENT #3 r322868: 
> > > > > > > > Fri Aug
> > > > > > > > 25 05:25:26 MSK 2017
> > > > > > > > bsam@builder.bsnet:/usr/obj/usr/src/sys/GENERIC-FAST  amd64 
> > > > > > > > amd64
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > And a bhyve vm:
> > > > > > > > ---
> > > > > > > > bhyve-vm: uname -a
> > > > > > > > FreeBSD builder.bsnet 12.0-CURRENT FreeBSD 12.0-CURRENT #58 
> > > > > > > > r324782: Fri
> > > > > > > > Oct 20 05:12:17 MSK 2017
> > > > > > > > bsam@builder.bsnet:/usr/obj/usr/src/sys/PKG64X  amd64 amd64
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > The only difference at kernel configs is a colored console. :-)
> > > > > > > > 
> > > > > > > > And here I get some weird (is it?) result at the VM (I expect 
> > > > > > > > ntpd to be
> > > > > > > > more stable):
> > > > > > > > ---
> > > > > > > > bhyve-vm% for t in `jot 10`; do ntpq -pn; sleep 64; done
> > > > > > > >      remote           refid      st t when poll reach   delay   
> > > > > > > > offset
> > > > > > > > jitter
> > > > > > > > ==============================================================================
> > > > > > > >  XX.XX.XX.1      XX.XX.XX.245     4 u    9   64    3    0.605   
> > > > > > > > -1.202
> > > > > > > > 316.407
> > > > > > > >  XX.XX.XX.1      XX.XX.XX.245     4 u    7   64    7    0.605   
> > > > > > > > -1.202
> > > > > > > > 358.395
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u    5   64   17    0.615  
> > > > > > > > -328.42
> > > > > > > > 181.405
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u    3   64   37    0.615  
> > > > > > > > -328.42
> > > > > > > > 214.868
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   67   64   37    0.615  
> > > > > > > > -328.42
> > > > > > > > 214.868
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   63   64   77    0.615  
> > > > > > > > -328.42
> > > > > > > > 268.618
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   60   64  177    0.615  
> > > > > > > > -328.42
> > > > > > > > 333.175
> > > > > > > >  XX.XX.XX.1      .STEP.          16 u 1910   64    0    0.000   
> > > > > > > >  0.000
> > > > > > > > 0.000
> > > > > > > >  XX.XX.XX.1      XX.XX.XX.245     4 u   27   64    1    0.703  
> > > > > > > > -262.63
> > > > > > > > 0.004
> > > > > > > >  XX.XX.XX.1      XX.XX.XX.245     4 u   31   64    1    0.649  
> > > > > > > > -331.43
> > > > > > > > 68.800
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > At the same time host's results are very stable:
> > > > > > > > ---
> > > > > > > > bhyve-host% for t in `jot 10`; do ntpq -pn; sleep 64; done
> > > > > > > >      remote           refid      st t when poll reach   delay   
> > > > > > > > offset
> > > > > > > > jitter
> > > > > > > > ==============================================================================
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u    1   64    1    0.401   
> > > > > > > >  0.176
> > > > > > > > 0.106
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u    6   64    3    0.401   
> > > > > > > >  0.176
> > > > > > > > 0.459
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u    3   64    7    0.401   
> > > > > > > >  0.176
> > > > > > > > 0.940
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   67   64    7    0.401   
> > > > > > > >  0.176
> > > > > > > > 0.940
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   64   64   17    0.401   
> > > > > > > >  0.176
> > > > > > > > 1.566
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   60   64   37    0.448   
> > > > > > > >  1.275
> > > > > > > > 1.739
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   55   64   77    0.448   
> > > > > > > >  1.275
> > > > > > > > 2.365
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   53   64  177    0.448   
> > > > > > > >  1.275
> > > > > > > > 3.110
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   50   64  377    0.448   
> > > > > > > >  1.275
> > > > > > > > 3.929
> > > > > > > > *XX.XX.XX.1      XX.XX.XX.245     4 u   45   64  377    0.443   
> > > > > > > >  8.750
> > > > > > > > 4.722
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > The network is organized via bridge -- host igb and vm tap 
> > > > > > > > interfaces
> > > > > > > > are members of one bridge.
> > > > > > > > 
> > > > > > > > Are those results expected? Does it smell like a bug? Should I 
> > > > > > > > dig
> > > > > > > > furter?
> > > > > > > > 
> > > > > > > So it is repeatedly stepping the clock in the VM? (Set
> > > > > > > kern.timecounter.stepwarnings=1 to log steps).
> > > > > > No kernel/ntpd messages for 20 minutes after setting this sysctl.
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > >  That is usually a sign
> > > > > > > that the chosen timecounter is running at a different frequency 
> > > > > > > than it
> > > > > > > claimed to be when it registered itself -- the host may not be
> > > > > > > emulating the timer hardware properly in the guest.  What is the 
> > > > > > > output
> > > > > > > of sysctl kern.timecounter in the vm?
> > > > > > ---
> > > > > > bhyve-vm% sysctl kern.timecounter
> > > > > > 
> > > > > > kern.timecounter.tsc_shift: 1
> > > > > > kern.timecounter.smp_tsc_adjust: 0
> > > > > > kern.timecounter.smp_tsc: 0
> > > > > > kern.timecounter.invariant_tsc: 1
> > > > > > kern.timecounter.fast_gettime: 1
> > > > > > kern.timecounter.tick: 1
> > > > > > kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) 
> > > > > > TSC-low(-100)
> > > > > > dummy(-1000000)
> > > > > > kern.timecounter.hardware: HPET
> > > > > > kern.timecounter.alloweddeviation: 5
> > > > > > kern.timecounter.stepwarnings: 1
> > > > > > kern.timecounter.tc.ACPI-fast.quality: 900
> > > > > > kern.timecounter.tc.ACPI-fast.frequency: 3579545
> > > > > > kern.timecounter.tc.ACPI-fast.counter: 4161213491
> > > > > > kern.timecounter.tc.ACPI-fast.mask: 4294967295
> > > > > > kern.timecounter.tc.HPET.quality: 950
> > > > > > kern.timecounter.tc.HPET.frequency: 10000000
> > > > > > kern.timecounter.tc.HPET.counter: 3518036865
> > > > > > kern.timecounter.tc.HPET.mask: 4294967295
> > > > > > kern.timecounter.tc.i8254.quality: 0
> > > > > > kern.timecounter.tc.i8254.frequency: 1193182
> > > > > > kern.timecounter.tc.i8254.counter: 47597
> > > > > > kern.timecounter.tc.i8254.mask: 65535
> > > > > > kern.timecounter.tc.TSC-low.quality: -100
> > > > > > kern.timecounter.tc.TSC-low.frequency: 1199886114
> > > > > > kern.timecounter.tc.TSC-low.counter: 1274338278
> > > > > > kern.timecounter.tc.TSC-low.mask: 4294967295
> > > > > > ---
> > > > > BTW, here is the host kern.timecounter:
> > > > > ---
> > > > > kern.timecounter.tsc_shift: 1
> > > > > kern.timecounter.smp_tsc_adjust: 0
> > > > > kern.timecounter.smp_tsc: 1
> > > > > kern.timecounter.invariant_tsc: 1
> > > > > kern.timecounter.fast_gettime: 1
> > > > > kern.timecounter.tick: 1
> > > > > kern.timecounter.choice: ACPI-fast(900) HPET(950) i8254(0) 
> > > > > TSC-low(1000)
> > > > > dummy(-1000000)
> > > > > kern.timecounter.hardware: TSC-low
> > > > > kern.timecounter.alloweddeviation: 5
> > > > > kern.timecounter.stepwarnings: 0
> > > > > kern.timecounter.tc.ACPI-fast.quality: 900
> > > > > kern.timecounter.tc.ACPI-fast.frequency: 3579545
> > > > > kern.timecounter.tc.ACPI-fast.counter: 9047194
> > > > > kern.timecounter.tc.ACPI-fast.mask: 16777215
> > > > > kern.timecounter.tc.HPET.quality: 950
> > > > > kern.timecounter.tc.HPET.frequency: 14318180
> > > > > kern.timecounter.tc.HPET.counter: 2232552795
> > > > > kern.timecounter.tc.HPET.mask: 4294967295
> > > > > kern.timecounter.tc.i8254.quality: 0
> > > > > kern.timecounter.tc.i8254.frequency: 1193182
> > > > > kern.timecounter.tc.i8254.counter: 43410
> > > > > kern.timecounter.tc.i8254.mask: 65535
> > > > > kern.timecounter.tc.TSC-low.quality: 1000
> > > > > kern.timecounter.tc.TSC-low.frequency: 1200067168
> > > > > kern.timecounter.tc.TSC-low.counter: 2463146362
> > > > > kern.timecounter.tc.TSC-low.mask: 4294967295
> > > > > ---
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > Also, what is the output of ntptime(8) in the vm?
> > > > > > ---
> > > > > > bhyve-vm% ntptime
> > > > > > 
> > > > > > ntp_gettime() returns code 0 (OK)
> > > > > >   time dd94930f.20ea2900  Fri, Oct 20 2017 18:21:51.128, 
> > > > > > (.128573699),
> > > > > >   maximum error 1309110 us, estimated error 3 us, TAI offset 37
> > > > > > ntp_adjtime() returns code 0 (OK)
> > > > > >   modes 0x0 (),
> > > > > >   offset 0.000 us, frequency 0.000 ppm, interval 1 s,
> > > > > >   maximum error 1309110 us, estimated error 3 us,
> > > > > >   status 0x2001 (PLL,NANO),
> > > > > >   time constant 6, precision 0.001 us, tolerance 496 ppm,
> > > > > > ---
> > > > > > 
> > > > > > Ian, thank you for your help!
> > > > > > 
> > > > It seems odd to me that the frequency of the host HPET is 14.3mhz and
> > > > that of the guest is 10.0mhz, but maybe that's a normal condition for
> > > > bhyve.  I did find some google hits[1] for bhyve guest timekeeping
> > > > trouble with the HPET timer which was solved by switching to a
> > > > different timecounter.  Timecounter choices can't be controlled from
> > > > loader.conf, so I guess a sysctl.conf entry of
> > > > kern.timecounter.hardware="ACPI-fast" is the only way to fix that.  You
> > > > can also just do that command interactively first and see if it stops
> > > > the time steps and ntp settles down.
> > > The process seems to become more monotonic. But steps nevertheless:
> > > ---
> > > *XX.1      XX.245     4 u   20   64    1    0.717  -12.771   4.193
> > > *XX.1      XX.245     4 u   28   64    3    0.751  -41.970  32.342
> > > *XX.1      XX.245     4 u   23   64    7    0.748  -59.505  46.624
> > > *XX.1      XX.245     4 u   18   64   17    0.699  -75.164  56.482
> > > *XX.1      XX.245     4 u   14   64   37    0.669  -90.112  63.767
> > > *XX.1      XX.245     4 u   11   64   77    0.605  -10.567  60.914
> > > *XX.1      XX.245     4 u    7   64  177    0.591  -169.54 116.762
> > > *XX.1      XX.245     4 u    3   64  377    0.591  -169.54 102.107
> > > *XX.1      XX.245     4 u   68   64  377    0.591  -169.54 102.107
> > > *XX.1      XX.245     4 u   63   64  377    0.591  -169.54  88.424
> > > *XX.1      XX.245     4 u   58   64  377    0.591  -169.54  92.949
> > > *XX.1      XX.245     4 u   55   64  377    0.591  -169.54 111.512
> > > *XX.1      XX.245     4 u   50   64  377    0.591  -169.54 140.827
> > > *XX.1      XX.245     4 u   45   64  377    0.591  -169.54 177.360
> > > *XX.1      XX.245     4 u   43   64  377    0.591  -169.54 219.057
> > >  XX.1      .STEP.    16 u  588   64    0    0.000    0.000   0.000
> > > ---
> > > 
> > > > 
> > > > 
> > > > This would be a workaround, not a fix per se.  If the time steps go
> > > > away, then something in bhyve's emulation of HPET (maybe only on some
> > > > hardware?) must be buggy.
> > > > 
> > > > 
> > > > [1] 
> > > > https://lists.freebsd.org/pipermail/freebsd-virtualization/2015-April/003492.html
> > > Also thanks for the link. Unfortunately the problem seems to persist.
> > > 
> > Hmm, that almost looks normal... it noticed the clock was drifting
> > fast, so it went into a frequency-training cycle (at the point where
> > the offset stopped changing at -169.54) and then once it had figured
> > out the frequency it did a step to get realigned, and (presumably)
> > adjusted the frequency of the kernel clock.  The output of ntptime
> > before and after the step would show that... before the step the
> > frequency would show as zero (which was the case in your ntptime output
> > earlier), and after the step it would be non-zero.  No more steps would
> > occur after the first one, if that's what is happening here.
> Those were following steps:
> ---
>  XX.1      XX.245     4 u   15   64    1    0.597  -51.802   7.547
>  XX.1      XX.245     4 u   17   64    1    0.597  -51.802  29.526
>  XX.1      XX.245     4 u   19   64    1    0.597  -51.802  52.760
>  XX.1      XX.245     4 u   19   64    3    0.597  -51.802  77.354
>  XX.1      XX.245     4 u   16   64    7    0.597  -51.802 103.182
>  XX.1      XX.245     4 u   12   64   17    0.597  -51.802 138.976
> *XX.1      XX.245     4 u    9   64   37    0.700  -200.07 109.275
>  XX.1      .STEP.    16 u 1100   64    0    0.000    0.000   0.002
>  XX.1      XX.245     4 u   63   64    1    0.645   -3.620   0.002
>  XX.1      XX.245     4 u   34   64    1    0.645   -3.620  66.907
>  XX.1      XX.245     4 u   36   64    1    0.643  -115.30  85.082---
> 
> No stability. :-) But I see your point. I'll leave it till morning
> with hope it may stabilize.
> 

I don't think it will.  ntpd doesn't need more than one cycle to figure
out the frequency of a clock and adjust it (and record the amount in
ntpd.drift for future use).  Either the clock is drifting at more than
the max 500ppm rate, or it isn't really drifting steadily, it's still
jumping around in an uneven manner.

It might be interesting to try one of the lower-quality clocks such as 
i8254, or try TSC-low, although there was a comment in that mail thread
about how TSC might be a bad choice.

Beyond that, I'm not sure what else to try.  It might be necessary to
get some bhyve developers involved (I know almost nothing about it).

-- Ian

_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to