On Fri, Jul 5, 2019 at 7:22 PM Thomas Lindroth <[email protected]> wrote: > > On recent kernels the tick remains active on idle adaptive-tick CPUs when a > small value is written to /dev/cpu_dma_latency to restrict the highest > C-state. Before the idle loop redesign in 4.17 idle CPUs had the tick > disabled even when C-state were restricted. Is this change intentional or a > regression?
It was intentional, but this kind of is a gray area. At least for the menu governor you may argue that the decision on whether or not to stop the tick should be based on the predicted idle duration. > I use an x86_64 system built with CONFIG_NO_HZ_FULL that I recently upgraded > to the 4.19 series from the 4.14 series. I noticed that adaptive-tick CPUs > (nohz_full=1-7) still fire timer interrupts about 1000 times/s > (CONFIG_HZ_1000=y) even when they are mostly idle. Some debugging showed that > this only happens when a program is writing to /dev/cpu_dma_latency to > restrict C-states. The old 4.14 kernel only have around 10 timer interrupts > per second on idle adaptive-tick CPU even when C-states are restricted that > way. > > I would expect an adaptive-tick CPU to turn off the tick when it has 0 or 1 > processes to run and enable the tick for >2 processes. Kernels after 4.17 > instead have the tick on when 0 or >2 processes are running and the tick off > in the 1 process case. Since the tick is off when a single process is running > that workload isn't directly harmed by the change but if the CPU use > hyperthreading the constant wakeups on an idle HT sibling will reduce > performance on the other sibling. > > They way I look for timer interrupts is by comparing the LOC line in > /proc/interrupts or using the hrtimer_expire_entry tracepoint when > function=tick_sched_timer. Both methods seem to give the same results. > > I can reproduce the problem using an i7-4790K CPU with > /sys/devices/system/cpu/cpuidle/current_driver:intel_idle. I can also > reproduce the problem on an old core2duo laptop with current_driver:acpi_idle > but I can't reproduce the problem in a virtual machine where > current_driver:none. I also can't reproduce the problem if C-states are > restricted using the intel_idle.max_cstate=1 kernel argument instead of > /dev/cpu_dma_latency. > > The commit that introduced the change is 554c8aa8ec "sched: idle: Select idle > state before stopping the tick" in v4.17 and the problem exists at least up > to kernel 5.1 using the menu cpuidle governor. Restoring the previous behavior in this case should be relatively straightforward. I'll send you a patch to do that later.

