Taylor R Campbell <riastr...@netbsd.org> writes:

[snip]

> I don't know what the sparc timecounter frequency is, but the Xen
> system timecounter returns units of nanoseconds, i.e., runs at 1 GHz,
> well within these bounds.  So this kind of wraparound leading to
> apparently negative runtime -- that is, l->l_stime going backwards --
> should not be possible as long as we are calling tc_windup() at a
> frequency of at least 1 GHz / (2^k / 2) = 0.47 Hz.
>
> That said, at a 32-bit timecounter frequency of 1 GHz, if there is a
> period of about 2^32 / 1 GHz ~= 4.3sec during which we miss all
> consecutive hardclock ticks, that would violate the timecounter(9)
> assumptions, and tc_delta(th) may go backwards if that happens.
>
> So I think we need to find out why we're missing Xen hardclock timer
> interrupts.  Should also make the dtrace probe show exactly how many
> hardclock ticks in a batch happened, and should raise an alarm (with
> or without dtrace) if it exceeds a threshold.

[snip]

On the system that I have that exhibits the negative runtime problem,
it may very well be the case that hardclocks are missed for 4.3sec.
The system has to have been up for a while and busy as a prerequisite,
but if I then run:

    dtrace -x nolibs -n 'sdt:xen:clock:, sdt:xen:hardclock:, sdt:xen:timecounter: { printf("%d %d %d %d %d %d %d %d", arg0, arg1, arg2, arg3, arg4, arg5, arg6, arg7) }'

on what is otherwise an idle system, there is a continuous stream of
hardclock missed events.  And 'vmstat -e | grep -e tsc -e systime -e
hardclock' shows:

    vcpu0 missed hardclock      6882953   74 intr
    vcpu1 missed hardclock        32268    0 intr

The second number, 74 in this case, is a rate; that is, there are
continuous missed hardclocks even while idle.

If the system is freshly rebooted, vmstat will not show anything
until after the system has been busy for a while, and likewise the
dtrace output is nearly free of events.

-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org