Taylor R Campbell <riastr...@netbsd.org> writes:

>> Date: Tue, 01 Aug 2023 16:02:17 -0400
>> From: Brad Spencer <b...@anduin.eldar.org>
>> 
>> Taylor R Campbell <riastr...@netbsd.org> writes:
>> 
>> > So I just added a printf to the kernel in case this jump happens.  Can
>> > you update to xen_clock.c 1.15 (and sys/arch/x86/include/cpu.h 1.135)
>> > and try again?
>> 
>> Sure...
>
> Correction: xen_clock.c 1.16 and sys/arch/x86/include/cpu.h 1.136
> (missed a spot).

I noticed the second update and believe that I am running the latest of
those files.  I have xen_clock.c at 1.17 and cpu.h at 1.136.

>> >> If the dtrace does continue to run, sometimes, it is impossible to exit
>> >> with CTRL-C.  The process seems stuck in this:
>> >> 
>> >> [ 4261.7158728] load: 2.64  cmd: dtrace 3295 [xclocv] 0.01u 0.02s 0% 7340k
>> >
>> > Interesting.  If this is reproducible, can you enter crash or ddb and
>> > get a stack trace for the dtrace process, as well as output from ps,
>> > ps/w, and `show all tstiles'?
>> 
>> It appears to be reproducible, in the sense that I encountered it a
>> couple of times doing exactly the same workload test.  I am more or less
>> completely unsure as to what the trigger is, however.  I probably should
>> have mentioned, but when this happened the last time, I did have other
>> newly created processes hang in tstile (the one in particular that I
>> noticed was 'fortune' from a ssh attempt .. it got stuck on login and
>> when I did a CTRL-T tstile was shown).
>
> `show all tstiles' output in crash or ddb would definitely be helpful
> here.

I started the last dtrace that was mentioned before the latest abuse
test, and it has already exited claiming an unresponsive system.  I
highly suspect that if I start it again it will hang, as that seems to
be what occurred previously.  However, that will have to wait a bit
until the abuse is done.  If there are no additional messages about
negative runtime in the morning, I will try starting the dtrace again,
see if it hangs, and capture the ddb output.  That will also mean
rebooting the guest to get everything unhung again.
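
For reference, the sequence I plan to capture at the ddb prompt, per
the request above, is roughly the following (the bt/t-by-pid form for
the per-process stack trace is my assumption; bt/a on the lwp address
shown by ps should work as well):

	ps
	ps/w
	show all tstiles
	bt/t 0t3295

where 3295 is whatever pid the stuck dtrace has at the time (the 0t
prefix makes ddb read the number as decimal).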

>> I also probably should have mentioned that the DOM0 (NOT the DOMU) that
>> the target system is running under has HZ set to 1000.  This is mostly
>> to help keep the ntpd and chronyd happy on the Xen guests.  If the DOM0
>> is left at 100 the drift can be too much on the DOMU systems.  Been
>> running like this for a long time...
>
> Interesting.  Why would the dom0's HZ choice make a difference?
> Nothing in the guest should depend substantively on the host's tick
> rate.

I really have no idea why it matters, but it did seem to help some.
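
For the record, that is just the usual kernel config knob; the dom0
kernel is built from something like the stock amd64 XEN3_DOM0 config
with roughly:

	# in the dom0 kernel config (an assumption about the exact
	# config, but this is the standard way to bump the tick rate)
	options 	HZ=1000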

> A NetBSD XEN3_DOM0 kernel periodically updates the hypervisor with a
> real-time clock derived from NTP (the `timepush' callout in
> xen_clock.c), but the period between updates is 53 sec + 3 ticks, and
> it's hard to imagine that setting the clock every 53.03 sec vs every
> 53.003 sec should make any difference for whether guests drift.
>
> The resolution of the real-time clock sent to NTP is 1/hz, because
> resettodr uses getmicrotime instead of microtime, but while that might
> introduce jitter from rounding, I'm not sure it should cause
> persistent drift in one direction or the other and I don't think
> guests are likely to periodically query the Xen wall clock time often
> enough for this jitter to matter.
>
> Does the dom0 have any substantive continuous influence on domU
> scheduling and timing?  I always assumed the hypervisor would have all
> the say in that.
>
> As an aside, I wonder whether it's even worthwhile to run ntpd or
> chronyd on the domU instead of just letting the dom0 set it and
> arranging to do the equivalent of inittodr periodically in the domU?

Something will need to keep the time synced.  I can tell you with 100%
assurance that ntpd will report differences in its statistics and
metrics between a DOM0 and a PV DOMU on the same hardware.  I have been
gathering metrics from ntpq output for years and this is 100% the case.
Currently I use ntpd on single processor DOMUs and chronyd on
multiprocessor DOMUs (because ntpd very often cannot deal with the
drift present when a DOMU has vcpu > 1).  I can also say with 100%
certainty that for a PV DOMU one of those, or something like it, will
be required.

Your suggestion is something like what Solaris zones do, where you
don't run ntpd in the zone and it gets its time from the global zone.
This would probably be a very useful thing to do for Xen if it were
possible.
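
If someone wanted to prototype that on the NetBSD side, my naive
picture of it (purely a sketch on my part: hypothetical names, an
arbitrary 60 second period, and it ignores locking and the question of
whether inittodr() is really safe to call repeatedly from a callout)
would be a small callout in the domU that keeps re-reading the
hypervisor's wall clock:

	/*
	 * Hypothetical sketch: periodically re-sync the domU system
	 * clock from the time-of-day source (the Xen wall clock, via
	 * the todr hooks) instead of running ntpd/chronyd in the guest.
	 */
	#include <sys/param.h>
	#include <sys/kernel.h>		/* hz */
	#include <sys/systm.h>		/* inittodr() */
	#include <sys/callout.h>

	static callout_t timepull_ch;	/* made-up name */

	static void
	timepull(void *arg)
	{
		/* Pull the time-of-day clock into the system clock. */
		inittodr(0);
		callout_schedule(&timepull_ch, 60 * hz);
	}

	void
	timepull_init(void)
	{
		callout_init(&timepull_ch, CALLOUT_MPSAFE);
		callout_setfunc(&timepull_ch, timepull, NULL);
		callout_schedule(&timepull_ch, 60 * hz);
	}

That would be more or less the mirror image of the timepush callout
you mentioned, pulling in the domU instead of pushing from the dom0.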

> Can you try the attached patch on a dom0 and see if you still observe
> drift?

For the DOM0 that is a little more involved than it should have to be.
For various other reasons, it is running a 9.99.104 kernel, and in an
unfortunate "zfs destroy ..." incident I no longer have the source tree
that built that kernel (or the artifacts).  I was going to move the
DOM0 to 10.x_BETA, but had not gotten around to doing that yet.  I can
probably move it to the -current that is being used for this test, but
neither 10.x nor -current can happen right away...  probably at least a
week or two away.




-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
