2017-01-18 13:28-0200, Marcelo Tosatti:
> On Wed, Jan 18, 2017 at 04:20:33PM +0100, Radim Krcmar wrote:
>> 2017-01-18 12:53-0200, Marcelo Tosatti:
>> > On Wed, Jan 18, 2017 at 12:37:25PM -0200, Marcelo Tosatti wrote:
>> > > On Wed, Jan 18, 2017 at 01:46:58PM +0100, Paolo Bonzini wrote:
>> > > >
>> > > >
>> > > > On 18/01/2017 13:24, Marcelo Tosatti wrote:
>> > > > > On Wed, Jan 18, 2017 at 10:17:38AM -0200, Marcelo Tosatti wrote:
>> > > > >> On Tue, Jan 17, 2017 at 04:36:21PM +0100, Radim Krcmar wrote:
>> > > > >>> 2017-01-17 09:30-0200, Marcelo Tosatti:
>> > > > >>>> On Tue, Jan 17, 2017 at 09:03:27AM +0100, Miroslav Lichvar wrote:
>> > > > >>>>> Users of the PTP_SYS_OFFSET ioctl assume that (ts[0]+ts[2])/2
>> > > > >>>>> corresponds to ts[1], (ts[2]+ts[4])/2 corresponds to ts[3], and
>> > > > >>>>> so on.
>> > > > >>>>>
>> > > > >>>>>                   ts[1]     ts[3]
>> > > > >>>>> Host time  ---------+---------+........
>> > > > >>>>>                     |         |
>> > > > >>>>>                     |         |
>> > > > >>>>> Guest time ----+---------+---------+......
>> > > > >>>>>              ts[0]     ts[2]     ts[4]
>> > > > >>>
>> > > > >>> KVM PTP delay moves host ts[i] to be close to guest ts[i+1] and
>> > > > >>> makes the offset very consistent, so the graph would look like:
>> > > > >>>
>> > > > >>>                       ts[1]     ts[3]
>> > > > >>> Host time  -------------+---------+........
>> > > > >>>                         |         |
>> > > > >>>                         |         |
>> > > > >>> Guest time ----+---------+---------+......
>> > > > >>>              ts[0]     ts[2]     ts[4]
>> > > > >>>
>> > > > >>> which doesn't sound good if users assume that the host reading is
>> > > > >>> in the middle -- the guest time would be ahead of the host time.
>> > > > >>
>> > > > >> Testcase: run a guest and a loop sending SIGUSR1 to vcpu0 (emulating
>> > > > >> intense interrupts). Follows results:
>> > > > >>
>> > > > >> Without TSC delta calculation:
>> > > > >> =============================
>> > > > >>
>> > > > >> #* PHC0  0  3  377   2   -99ns[ +206ns] +/- 116ns
>> > > > >> #* PHC0  0  3  377   8  +202ns[ +249ns] +/- 111ns
>> > > > >> #* PHC0  0  3  377   8  -213ns[ +683ns] +/-  88ns
>> > > > >> #* PHC0  0  3  377   6   +77ns[ +319ns] +/-  56ns
>> > > > >> #* PHC0  0  3  377   4  -771ns[-1029ns] +/-  93ns
>> > > > >> #* PHC0  0  3  377  10   -49ns[  -58ns] +/- 121ns
>> > > > >> #* PHC0  0  3  377   9  +562ns[ +703ns] +/- 107ns
>> > > > >> #* PHC0  0  3  377   6    -2ns[   -3ns] +/-  94ns
>> > > > >> #* PHC0  0  3  377   4  +451ns[ +494ns] +/- 138ns
>> > > > >> #* PHC0  0  3  377  11   -67ns[  -74ns] +/- 113ns
>> > > > >> #* PHC0  0  3  377   8  +244ns[ +264ns] +/- 119ns
>> > > > >> #* PHC0  0  3  377   7  -696ns[ -890ns] +/-  89ns
>> > > > >> #* PHC0  0  3  377   4  +468ns[ +560ns] +/- 110ns
>> > > > >> #* PHC0  0  3  377  11  -310ns[ -430ns] +/-  72ns
>> > > > >> #* PHC0  0  3  377   9  +189ns[ +298ns] +/-  54ns
>> > > > >> #* PHC0  0  3  377   7  +594ns[ +473ns] +/-  96ns
>> > > > >> #* PHC0  0  3  377   5  +151ns[ +280ns] +/-  71ns
>> > > > >> #* PHC0  0  3  377  10  -590ns[ -696ns] +/-  94ns
>> > > > >> #* PHC0  0  3  377   8  +415ns[ +526ns] +/-  74ns
>> > > > >> #* PHC0  0  3  377   6 +1381ns[+1469ns] +/- 101ns
>> > > > >> #* PHC0  0  3  377   4  +571ns[+1304ns] +/-  54ns
>> > > > >> #* PHC0  0  3  377   8    -5ns[  +71ns] +/- 139ns
>> > > > >> #* PHC0  0  3  377   7  -247ns[ -502ns] +/-  69ns
>> > > > >> #* PHC0  0  3  377   5  -283ns[ +879ns] +/-  73ns
>> > > > >> #* PHC0  0  3  377   3  +148ns[ -109ns] +/-  61ns
>> > > > >>
>> > > > >> With TSC delta calculation:
>> > > > >> ============================
>> > > > >>
>> > > > >> #* PHC0  0  3  377   7  +379ns[ +432ns] +/-  53ns
>> > > > >> #* PHC0  0  3  377   9  +106ns[ +420ns] +/-  42ns
>> > > > >> #* PHC0  0  3  377   7   -58ns[ -136ns] +/-  62ns
>> > > > >> #* PHC0  0  3  377  12   +93ns[  -38ns] +/-  64ns
>> > > > >> #* PHC0  0  3  377   8   +84ns[ +107ns] +/-  69ns
>> > > > >> #* PHC0  0  3  377   3   -76ns[ -103ns] +/-  52ns
>> > > > >> #* PHC0  0  3  377   7   +52ns[  +63ns] +/-  50ns
>> > > > >> #* PHC0  0  3  377  11   +29ns[  +31ns] +/-  70ns
>> > > > >> #* PHC0  0  3  377   7   -47ns[  -56ns] +/-  42ns
>> > > > >> #* PHC0  0  3  377  10   -35ns[  -42ns] +/-  33ns
>> > > > >> #* PHC0  0  3  377   7   -32ns[  -34ns] +/-  42ns
>> > > > >> #* PHC0  0  3  377  11  -172ns[ -173ns] +/- 118ns
>> > > > >> #* PHC0  0  3  377   6   +65ns[  +76ns] +/-  23ns
>> > > > >> #* PHC0  0  3  377   9   +18ns[  +23ns] +/-  37ns
>> > > > >> #* PHC0  0  3  377   6   +41ns[  -60ns] +/-  30ns
>> > > > >> #* PHC0  0  3  377  10   +39ns[ +183ns] +/-  42ns
>> > > > >> #* PHC0  0  3  377   6   +50ns[ +102ns] +/-  86ns
>> > > > >> #* PHC0  0  3  377  11   +50ns[  +75ns] +/-  52ns
>> > > > >> #* PHC0  0  3  377   6   +50ns[ +116ns] +/- 100ns
>> > > > >> #* PHC0  0  3  377  10   +46ns[  +65ns] +/-  79ns
>> > > > >> #* PHC0  0  3  377   7   -38ns[  -51ns] +/-  29ns
>> > > > >> #* PHC0  0  3  377  10   -11ns[  -12ns] +/-  32ns
>> > > > >> #* PHC0  0  3  377   7   -31ns[  -32ns] +/-  99ns
>> > > > >> #* PHC0  0  3  377  10  +222ns[ +238ns] +/-  58ns
>> > > > >> #* PHC0  0  3  377   6  +185ns[ +207ns] +/-  39ns
>> > > > >> #* PHC0  0  3  377  10  -392ns[ -394ns] +/- 118ns
>> > > > >> #* PHC0  0  3  377   6    -9ns[  -50ns] +/-  35ns
>> > > > >> #* PHC0  0  3  377  10  -346ns[ -355ns] +/- 111ns
>> > > > >>
>> > > > >>
>> > > > >> Do you still want to drop it in favour of simplicity?
>> > > > >
>> > > > > This is the output of "chronyc sources". See section "Time sources"
>> > > > > of https://chrony.tuxfamily.org/doc/2.4/chronyc.html.
>> > > >
>> > > > It's just that it's not obvious why you get better results with biased
>> > > > host timestamps.  What makes the biased host timestamp more precise?
>> > > >
>> > > > I'd rather use PTP_SYS_OFFSET_PRECISE instead, but unfortunately chrony
>> > > > does not support it---but I would still prefer you to support
>> > > > PTP_SYS_OFFSET_PRECISE as well.
>> > >
>> > > A single TSC read could be used to implement the PRECISE ioctl, but if
>> > > a timer interrupt takes place on either the host or the guest, and that
>> > > timer interrupt "adds" the TSC delta to xtime.nsec/xtime.sec, then that
>> > > single TSC read cannot be used.
>> > >
>> > > So you would have to stop timer interrupts (in guest and host) for the
>> > > duration of the PRECISE ioctl in the guest to avoid that situation,
>> > > which seems a bit overkill to me.
>> > >
>> > > Any other ideas?
>> >
>> > Could have a hypercall that disables host timer interrupts for
>> > a specified amount of time... But that does not scale with multiple VMs.
>>
>> No need to disable interrupts on guest nor host as both protect the time
>> by a seqlock.
>
> Still not scalable with multiple VMs... so need a different solution.

What doesn't scale?  The VM hypercall only takes the read side of
tk_core.seq, which doesn't block other VMs from doing the same.
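
To illustrate why seqlock readers don't get in each other's way, here is
a rough userspace sketch using C11 atomics.  It is not the kernel's
seqcount API and not the actual hypercall code; struct demo_timekeeper
and all demo_* functions are hypothetical names made up for this
example.  A reader only loads the sequence counter and the data, and
retries if a writer raced with it, so any number of readers can run
concurrently without writing shared state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for time state guarded by a sequence counter. */
struct demo_timekeeper {
	atomic_uint seq;        /* even: stable, odd: update in progress */
	_Atomic uint64_t sec;
	_Atomic uint64_t nsec;
};

struct demo_time {
	uint64_t sec;
	uint64_t nsec;
};

/* Reader entry: wait until no update is in progress, return the count. */
static unsigned demo_read_begin(struct demo_timekeeper *tk)
{
	unsigned s;

	while ((s = atomic_load_explicit(&tk->seq, memory_order_acquire)) & 1u)
		; /* a writer is mid-update, spin */
	return s;
}

/* Reader exit: nonzero if a writer changed the count since 'start'. */
static int demo_read_retry(struct demo_timekeeper *tk, unsigned start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&tk->seq, memory_order_relaxed) != start;
}

/* Single writer (assumed): bump to odd, update, bump back to even. */
static void demo_write(struct demo_timekeeper *tk, uint64_t sec, uint64_t nsec)
{
	unsigned s = atomic_load_explicit(&tk->seq, memory_order_relaxed);

	assert((s & 1u) == 0);  /* no nested or concurrent writers */
	atomic_store_explicit(&tk->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&tk->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&tk->nsec, nsec, memory_order_relaxed);
	atomic_store_explicit(&tk->seq, s + 2, memory_order_release);
}

/* Readers never store to tk, so they cannot block one another. */
static struct demo_time demo_read(struct demo_timekeeper *tk)
{
	struct demo_time t;
	unsigned start;

	do {
		start = demo_read_begin(tk);
		t.sec = atomic_load_explicit(&tk->sec, memory_order_relaxed);
		t.nsec = atomic_load_explicit(&tk->nsec, memory_order_relaxed);
	} while (demo_read_retry(tk, start));
	return t;
}
```

The only cost a reader can see is a retry when an update lands in the
middle of its read; in this sketch that cost does not grow with the
number of other readers.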

