On Wed, Jan 20, 2016 at 9:42 AM, Thomas Gleixner <t...@linutronix.de> wrote:
> On Wed, 20 Jan 2016, John Stultz wrote:
>> Ehrm. A more productive route in solving this might be to cap the
>> cycle delta we return from timekeeping_get_delta().
>>
>> We already do this in the CONFIG_DEBUG_TIMEKEEPING, but adding a
>> simple check it to the non-debug case should be doable w/o adding too
>> much overhead to this very hot path.
>>
>> Something like:
>>         if (delta > tkr->clock->max_cycles)
>>                 delta = tkr->clock->max_cycles;
>>
>>         return delta;
>
> Well, you can make CONFIG_KDB select CONFIG_DEBUG_TIMEKEEPING.

True. And turning on DEBUG_TIMEKEEPING is probably the easiest thing
for Jeff to try.

Though, there's still the same issue w/ paused VMs. Most of the design
for the timekeeping code has been that it can't properly function if
you block update_wall_time() calls, but it shouldn't kill the box.

With most clocksources, the issue is that the counter wraps and we lose
time. But in this case, with the TSC, it's the *very* large cycle delta
turning into an unexpectedly large nanosecond value.

Hrm.. I do also wonder: the logarithmic accumulation chews through
large cycle deltas efficiently, but it does have some design limits, so
it might also hit the rails and take a while to spin accumulating time
with such large offsets.

Jeff: Can you try the config option above and let me know if that
avoids the issue? And if not, can you provide some analysis of what
else is going on?

thanks
-john
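
For reference, the cap suggested in the quoted text can be tried out in userspace with a minimal sketch like the one below. This is not the kernel code: struct clock_sketch, capped_delta() and cycles_to_ns() are hypothetical names made up for illustration, and only the max_cycles comparison comes from the snippet quoted above. The conversion assumes the usual (cycles * mult) >> shift form, which is where a very large delta silently wraps in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few clocksource fields used here. */
struct clock_sketch {
	uint64_t mask;       /* counter width mask */
	uint64_t max_cycles; /* largest delta assumed safe to convert */
	uint32_t mult;       /* cycles -> ns multiplier */
	uint32_t shift;      /* cycles -> ns shift */
};

/* Cycles elapsed since the last update, capped at max_cycles. */
static uint64_t capped_delta(const struct clock_sketch *c,
			     uint64_t now, uint64_t last)
{
	uint64_t delta = (now - last) & c->mask;

	/* The check proposed for the non-debug path. */
	if (delta > c->max_cycles)
		delta = c->max_cycles;

	return delta;
}

/*
 * mult/shift conversion. The multiplication is done in 64 bits, so it
 * wraps (well-defined for unsigned types) when delta * mult does not fit.
 */
static uint64_t cycles_to_ns(const struct clock_sketch *c, uint64_t delta)
{
	assert(c->shift < 64);
	return (delta * c->mult) >> c->shift;
}
```

With mult = 1 << 31 and shift = 31 (one cycle per nanosecond), an uncapped delta of 1 << 40 cycles wraps in the multiplication and converts to a bogus value, while any delta at or below a max_cycles of 1 << 32 converts correctly. The cap does not recover the lost time; it only keeps the conversion inside the range it was designed for.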