Taylor R Campbell <riastr...@netbsd.org> writes:

>> Date: Mon, 31 Jul 2023 12:47:20 -0400
>> 
>> # dtrace -x nolibs -n 'sdt:xen:hardclock:jump { @ = quantize(arg1 - arg0) } 
>> sdt:xen:hardclock:jump /arg2 >= 430/ { printf("hardclock jump violated 
>> timecounter contract") }'
>> dtrace: description 'sdt:xen:hardclock:jump ' matched 2 probes
>> dtrace: processing aborted: Abort due to systemic unresponsiveness
>
> Well!  dtrace might be unhappy if the timecounter is broken too, heh.
> So I just added a printf to the kernel in case this jump happens.  Can
> you update to xen_clock.c 1.15 (and sys/arch/x86/include/cpu.h 1.135)
> and try again?

Sure...

>> The system is fine just after a reboot, it certainly seems to be a
>> requirement that a fair bit of work must be done before it gets into a
>> bad state.
>> 
>> If the dtrace does continue to run, sometimes, it is impossible to exit
>> with CTRL-C.  The process seems stuck in this:
>> 
>> [ 4261.7158728] load: 2.64  cmd: dtrace 3295 [xclocv] 0.01u 0.02s 0% 7340k
>
> Interesting.  If this is reproducible, can you enter crash or ddb and
> get a stack trace for the dtrace process, as well as output from ps,
> ps/w, and `show all tstiles'?

It appears to be reproducible, in the sense that I encountered it a
couple of times while running exactly the same workload test.  However,
I do not know what the trigger is.  I probably should have mentioned
that the last time this happened, other newly created processes also
hung in tstile.  The one I noticed in particular was 'fortune', run
during an ssh login attempt: the login got stuck, and CTRL-T showed it
waiting in tstile.


I also probably should have mentioned that the DOM0 (NOT the DOMU) that
the target system is running under has HZ set to 1000.  This is mostly
to help keep ntpd and chronyd happy on the Xen guests.  If the DOM0 is
left at HZ=100, the clock drift on the DOMU systems can be too much.  I
have been running it this way for a long time.



-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
