Hi Thomas,

The VMware PhotonOS team is evaluating 4.19-rt against CentOS 3.10-rt
(the franken kernel from Red Hat). They found a regression between the
two kernels, which was traced to:

 d25408756accb ("clockevents: Stop unused clockevent devices")

The issue shows up when running in a guest, where it causes a
noticeable wakeup latency in cyclictest. The 4.19-rt kernel issues two
extra APIC accesses, causing two extra VMEXITs compared to the 3.10-rt
kernel. I found out why, and the same is true for vanilla 5.9-rc.

When running with isolcpus and NOHZ_FULL, I see the following:

  tick_nohz_idle_stop_tick() {
        hrtimer_start_range_ns() {
                remove_hrtimer(timer) {
                        /* no more timers on the base */
                        expires = KTIME_MAX;
                        tick_program_event() {
                                clock_switch_state(ONESHOT_STOPPED);
                                /* call to apic to shutdown timer */
                        }
                }
                [..]
                hrtimer_reprogram(timer) {
                        tick_program_event() {
                                clock_switch_state(ONESHOT);
                                /* call to apic to enable timer again! */
                        }
                }
        }
  }


Thus, we needlessly shut down and restart the APIC timer every time we
call tick_nohz_stop_tick() if there is a timer still on the queue.

I'm not exactly sure how to fix this. Is there a way we can hold off
disabling the clock here until we know that it isn't going to be
immediately enabled again?
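
To make the question concrete, here is a rough userspace model of what I
mean by holding off. It is not kernel code and not a proposed patch.
Every name in it (struct ce_dev, program_event_eager(),
program_event_deferred(), flush_pending_stop()) is made up for the
sketch, EXPIRES_NONE just stands in for KTIME_MAX, and it only counts
state switches, assuming each one costs one APIC access:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the clockevent states; not the kernel's enum. */
enum ce_state { CE_ONESHOT, CE_ONESHOT_STOPPED };

struct ce_dev {
        enum ce_state state;    /* state the (modelled) hardware is in */
        bool stop_pending;      /* stop requested but not yet applied */
        unsigned int hw_writes; /* one per state switch, i.e. one VMEXIT in a guest */
};

#define EXPIRES_NONE UINT64_MAX /* plays the role of KTIME_MAX */

/* Touch the hardware only when the state really changes. */
static void hw_switch_state(struct ce_dev *dev, enum ce_state state)
{
        if (dev->state != state) {
                dev->state = state;
                dev->hw_writes++;
        }
}

/* Eager variant: stop the device as soon as no timer is queued. */
static void program_event_eager(struct ce_dev *dev, uint64_t expires)
{
        if (expires == EXPIRES_NONE)
                hw_switch_state(dev, CE_ONESHOT_STOPPED);
        else
                hw_switch_state(dev, CE_ONESHOT);
}

/* Deferred variant: only remember that a stop was requested. */
static void program_event_deferred(struct ce_dev *dev, uint64_t expires)
{
        if (expires == EXPIRES_NONE) {
                dev->stop_pending = true;
                return;
        }
        dev->stop_pending = false;      /* re-armed, so the stop is cancelled */
        hw_switch_state(dev, CE_ONESHOT);
}

/* Called once the caller knows nothing will be re-armed right away. */
static void flush_pending_stop(struct ce_dev *dev)
{
        if (dev->stop_pending) {
                dev->stop_pending = false;
                hw_switch_state(dev, CE_ONESHOT_STOPPED);
        }
}

/* Replay the remove-then-re-arm sequence from the trace, eager variant. */
static unsigned int replay_eager(void)
{
        struct ce_dev dev = { .state = CE_ONESHOT };

        program_event_eager(&dev, EXPIRES_NONE);  /* remove_hrtimer() path */
        program_event_eager(&dev, 1000);          /* hrtimer_reprogram() path */
        return dev.hw_writes;
}

/* Same sequence, deferred variant, with the flush at the end. */
static unsigned int replay_deferred(void)
{
        struct ce_dev dev = { .state = CE_ONESHOT };

        program_event_deferred(&dev, EXPIRES_NONE);
        program_event_deferred(&dev, 1000);
        flush_pending_stop(&dev);
        assert(!dev.stop_pending);
        return dev.hw_writes;
}
```

In this model the eager replay does two state switches and the deferred
one does none, while a stop that is not followed by a re-arm still
reaches the device at the flush. Where such a flush point would belong
in the real code is exactly the part I don't know.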

-- Steve
