2017-01-18 13:28-0200, Marcelo Tosatti:
> On Wed, Jan 18, 2017 at 04:20:33PM +0100, Radim Krcmar wrote:
>> 2017-01-18 12:53-0200, Marcelo Tosatti:
>> > On Wed, Jan 18, 2017 at 12:37:25PM -0200, Marcelo Tosatti wrote:
>> > > On Wed, Jan 18, 2017 at 01:46:58PM +0100, Paolo Bonzini wrote:
>> > > > 
>> > > > 
>> > > > On 18/01/2017 13:24, Marcelo Tosatti wrote:
>> > > > > On Wed, Jan 18, 2017 at 10:17:38AM -0200, Marcelo Tosatti wrote:
>> > > > >> On Tue, Jan 17, 2017 at 04:36:21PM +0100, Radim Krcmar wrote:
>> > > > >>> 2017-01-17 09:30-0200, Marcelo Tosatti:
>> > > > >>>> On Tue, Jan 17, 2017 at 09:03:27AM +0100, Miroslav Lichvar wrote:
>> > > > >>>>> Users of the PTP_SYS_OFFSET ioctl assume that (ts[0]+ts[2])/2
>> > > > >>>>> corresponds to ts[1], (ts[2]+ts[4])/2 corresponds to ts[3], and 
>> > > > >>>>> so on.
>> > > > >>>>>
>> > > > >>>>>                     ts[1]     ts[3]
>> > > > >>>>> Host time    ---------+---------+........
>> > > > >>>>>                       |         |
>> > > > >>>>>                       |         |
>> > > > >>>>> Guest time   ----+---------+---------+......
>> > > > >>>>>                 ts[0]    ts[2]     ts[4]
>> > > > >>>
>> > > > >>> KVM PTP delay moves host ts[i] to be close to guest ts[i+1] and makes
>> > > > >>> the offset very consistent, so the graph would look like:
>> > > > >>>
>> > > > >>>                         ts[1]     ts[3]
>> > > > >>> Host time    -------------+---------+........
>> > > > >>>                           |         |
>> > > > >>>                           |         |
>> > > > >>> Guest time   ----+---------+---------+......
>> > > > >>>                 ts[0]    ts[2]     ts[4]
>> > > > >>>
>> > > > >>> which doesn't sound good if users assume that the host reading is in the
>> > > > >>> middle -- the guest time would be ahead of the host time.
>> > > > >>
>> > > > >> Testcase: run a guest and a loop sending SIGUSR1 to vcpu0 (emulating
>> > > > >> intense interrupts). Follows results:
>> > > > >>
>> > > > >> Without TSC delta calculation:
>> > > > >> =============================
>> > > > >>
>> > > > >> #* PHC0                          0   3   377     2    -99ns[ +206ns] +/-  116ns
>> > > > >> #* PHC0                          0   3   377     8   +202ns[ +249ns] +/-  111ns
>> > > > >> #* PHC0                          0   3   377     8   -213ns[ +683ns] +/-   88ns
>> > > > >> #* PHC0                          0   3   377     6    +77ns[ +319ns] +/-   56ns
>> > > > >> #* PHC0                          0   3   377     4   -771ns[-1029ns] +/-   93ns
>> > > > >> #* PHC0                          0   3   377    10    -49ns[  -58ns] +/-  121ns
>> > > > >> #* PHC0                          0   3   377     9   +562ns[ +703ns] +/-  107ns
>> > > > >> #* PHC0                          0   3   377     6     -2ns[   -3ns] +/-   94ns
>> > > > >> #* PHC0                          0   3   377     4   +451ns[ +494ns] +/-  138ns
>> > > > >> #* PHC0                          0   3   377    11    -67ns[  -74ns] +/-  113ns
>> > > > >> #* PHC0                          0   3   377     8   +244ns[ +264ns] +/-  119ns
>> > > > >> #* PHC0                          0   3   377     7   -696ns[ -890ns] +/-   89ns
>> > > > >> #* PHC0                          0   3   377     4   +468ns[ +560ns] +/-  110ns
>> > > > >> #* PHC0                          0   3   377    11   -310ns[ -430ns] +/-   72ns
>> > > > >> #* PHC0                          0   3   377     9   +189ns[ +298ns] +/-   54ns
>> > > > >> #* PHC0                          0   3   377     7   +594ns[ +473ns] +/-   96ns
>> > > > >> #* PHC0                          0   3   377     5   +151ns[ +280ns] +/-   71ns
>> > > > >> #* PHC0                          0   3   377    10   -590ns[ -696ns] +/-   94ns
>> > > > >> #* PHC0                          0   3   377     8   +415ns[ +526ns] +/-   74ns
>> > > > >> #* PHC0                          0   3   377     6  +1381ns[+1469ns] +/-  101ns
>> > > > >> #* PHC0                          0   3   377     4   +571ns[+1304ns] +/-   54ns
>> > > > >> #* PHC0                          0   3   377     8     -5ns[  +71ns] +/-  139ns
>> > > > >> #* PHC0                          0   3   377     7   -247ns[ -502ns] +/-   69ns
>> > > > >> #* PHC0                          0   3   377     5   -283ns[ +879ns] +/-   73ns
>> > > > >> #* PHC0                          0   3   377     3   +148ns[ -109ns] +/-   61ns
>> > > > >>
>> > > > >> With TSC delta calculation:
>> > > > >> ============================
>> > > > >>
>> > > > >> #* PHC0                          0   3   377     7   +379ns[ +432ns] +/-   53ns
>> > > > >> #* PHC0                          0   3   377     9   +106ns[ +420ns] +/-   42ns
>> > > > >> #* PHC0                          0   3   377     7    -58ns[ -136ns] +/-   62ns
>> > > > >> #* PHC0                          0   3   377    12    +93ns[  -38ns] +/-   64ns
>> > > > >> #* PHC0                          0   3   377     8    +84ns[ +107ns] +/-   69ns
>> > > > >> #* PHC0                          0   3   377     3    -76ns[ -103ns] +/-   52ns
>> > > > >> #* PHC0                          0   3   377     7    +52ns[  +63ns] +/-   50ns
>> > > > >> #* PHC0                          0   3   377    11    +29ns[  +31ns] +/-   70ns
>> > > > >> #* PHC0                          0   3   377     7    -47ns[  -56ns] +/-   42ns
>> > > > >> #* PHC0                          0   3   377    10    -35ns[  -42ns] +/-   33ns
>> > > > >> #* PHC0                          0   3   377     7    -32ns[  -34ns] +/-   42ns
>> > > > >> #* PHC0                          0   3   377    11   -172ns[ -173ns] +/-  118ns
>> > > > >> #* PHC0                          0   3   377     6    +65ns[  +76ns] +/-   23ns
>> > > > >> #* PHC0                          0   3   377     9    +18ns[  +23ns] +/-   37ns
>> > > > >> #* PHC0                          0   3   377     6    +41ns[  -60ns] +/-   30ns
>> > > > >> #* PHC0                          0   3   377    10    +39ns[ +183ns] +/-   42ns
>> > > > >> #* PHC0                          0   3   377     6    +50ns[ +102ns] +/-   86ns
>> > > > >> #* PHC0                          0   3   377    11    +50ns[  +75ns] +/-   52ns
>> > > > >> #* PHC0                          0   3   377     6    +50ns[ +116ns] +/-  100ns
>> > > > >> #* PHC0                          0   3   377    10    +46ns[  +65ns] +/-   79ns
>> > > > >> #* PHC0                          0   3   377     7    -38ns[  -51ns] +/-   29ns
>> > > > >> #* PHC0                          0   3   377    10    -11ns[  -12ns] +/-   32ns
>> > > > >> #* PHC0                          0   3   377     7    -31ns[  -32ns] +/-   99ns
>> > > > >> #* PHC0                          0   3   377    10   +222ns[ +238ns] +/-   58ns
>> > > > >> #* PHC0                          0   3   377     6   +185ns[ +207ns] +/-   39ns
>> > > > >> #* PHC0                          0   3   377    10   -392ns[ -394ns] +/-  118ns
>> > > > >> #* PHC0                          0   3   377     6     -9ns[  -50ns] +/-   35ns
>> > > > >> #* PHC0                          0   3   377    10   -346ns[ -355ns] +/-  111ns
>> > > > >>
>> > > > >>
>> > > > >> Do you still want to drop it in favour of simplicity?
>> > > > > 
>> > > > > This is the output of "chronyc sources". See section "Time sources"
>> > > > > of https://chrony.tuxfamily.org/doc/2.4/chronyc.html.
>> > > > 
>> > > > It's just that it's not obvious why you get better results with biased
>> > > > host timestamps.  What makes the biased host timestamp more precise?
>> > > > 
>> > > > I'd rather use PTP_SYS_OFFSET_PRECISE instead, but unfortunately chrony
>> > > > does not support it---but I would still prefer you to support
>> > > > PTP_SYS_OFFSET_PRECISE as well.
>> > > 
>> > > A single TSC read could be used to implement the PRECISE ioctl, but if
>> > > a timer interrupt takes place on either the host or the guest, and that
>> > > timer interrupt "adds" the TSC delta to xtime.nsec/xtime.sec, then that
>> > > single TSC read cannot be used.
>> > > 
>> > > So you would have to stop timer interrupts (in guest and host) for the
>> > > duration of the PRECISE ioctl in the guest to avoid that situation, which
>> > > seems a bit overkill to me.
>> > > 
>> > > Any other ideas?
>> > 
>> > Could have a hypercall that disables host timer interrupts for 
>> > a specified amount of time... But that does not scale with multiple VMs.
>> 
>> No need to disable interrupts on guest nor host as both protect the time
>> by a seqlock.  
> 
> Still not scalable with multiple VMs... so need a different solution.

What doesn't scale?
The VM hypercall takes a read on tk_core.seq, which doesn't block
other VMs from doing the same.
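
To illustrate why seqlock readers don't block each other, here is a
minimal userspace sketch using C11 atomics. It is not the kernel's
seqcount code: struct timekeeper_sketch, tk_read() and tk_write() are
hypothetical names made up for this example, and a single writer is
assumed. Readers never store to the shared state, so any number of
them can run concurrently; a reader only retries if a writer updated
the sequence while it was reading.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the time state guarded by a sequence counter. */
struct timekeeper_sketch {
	atomic_uint seq;        /* even: stable, odd: update in progress */
	_Atomic uint64_t sec;
	_Atomic uint64_t nsec;
};

/* Writer side (single writer assumed): bump seq to odd, update, bump to even. */
static void tk_write(struct timekeeper_sketch *tk, uint64_t sec, uint64_t nsec)
{
	unsigned int s = atomic_load_explicit(&tk->seq, memory_order_relaxed);

	atomic_store_explicit(&tk->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&tk->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&tk->nsec, nsec, memory_order_relaxed);
	atomic_store_explicit(&tk->seq, s + 2, memory_order_release);
}

/*
 * Reader side: only loads, no stores to shared state. Retry if an update
 * was in progress (odd seq) or completed while we were reading.
 */
static void tk_read(struct timekeeper_sketch *tk, uint64_t *sec, uint64_t *nsec)
{
	unsigned int start;

	do {
		start = atomic_load_explicit(&tk->seq, memory_order_acquire);
		*sec = atomic_load_explicit(&tk->sec, memory_order_relaxed);
		*nsec = atomic_load_explicit(&tk->nsec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((start & 1) ||
		 start != atomic_load_explicit(&tk->seq, memory_order_relaxed));
}
```

The cost for a reader is a possible retry when it races with an
update; it never takes a lock that another reader could contend on.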
