Re: [PATCH 1/3] cpuidle,x86: increase forced cut-off for polling to 20us

Daniel Lezcano Thu, 29 Oct 2015 06:03:21 -0700

On 10/29/2015 12:54 PM, Rik van Riel wrote:

On 10/29/2015 06:17 AM, Daniel Lezcano wrote:

On 10/28/2015 11:46 PM, [email protected] wrote:

From: Rik van Riel <[email protected]>


The cpuidle menu governor has a forced cut-off for polling at 5us,
in order to deal with firmware that gives the OS bad information
on cpuidle states, leading to the system spending way too much time
in polling.


Maybe I am misunderstanding your explanation, but that is not how I read
the code.

The default idle state is C1 (hlt) if no other state suits the
constraint. If a timer is happening really soon, then set the default
idle state to POLL if no other idle state suits the constraint.

That applies only on x86.


With the current code, the default idle state is C1 (hlt) even if
C1 does not suit the constraint.

This is not related to break-even but exit latency.


Why would we not care about break-even for C1?

On systems where going into C1 for too-short periods wastes
power, why would we waste the power when we expect a very
short sleep?

IMO, we should just drop this 5us and the POLL state selection in the
menu governor, as we have had hyper fast C1 exit for a while. Except a few
embedded processors where polling is not adequate.


We have hyper fast C1 exit on Nehalem and newer high performance
chips. On those chips, we will pick C1 (or deeper) when we have
an expected sleep time of just a few microseconds.

However, on Atom, and for the paravirt cpuidle driver I am
working on, C1 exit latency and target residence are higher
than the cut-off hardcoded in the menu governor.

Furthermore, the number of times the poll state is selected vs the other
states is negligible.


And it will continue to be with this patch, on CPUs with
hyper fast C1 exit.

Which makes me confused about what you are objecting to,
since the system should continue to behave the way you want,
with the patch applied.

OK, I am not objecting to the correctness of your patch, but to the reasoning behind this small optimization, which brings a lot of mess into the cpuidle code.

As you are touching this part of the code, I take the opportunity to raise a discussion about it.

From my POV, the poll state is *not* an idle state. It is like a vehicle burnout [1].

But it is inserted into the idle state tables using a trick with the CPUIDLE_DRIVER_STATE_START macro, which has already led us to some bugs.

So instead of falling back to the poll state under certain circumstances, I propose we extract this state from the idle state table and let the menu governor fail to choose a state (or not).

From the caller, we decide what to do (poll or C1) if the idle state selection fails, or we choose to poll *before*, like what we already have in kernel/sched/idle.c:


in the idle loop:

if (cpu_idle_force_poll || tick_check_broadcast_expired())
        cpu_idle_poll();
else
        cpuidle_idle_call();

This way, we:

1) factor out the idle state selection with the find_deepest_idle_state
2) remove the CPUIDLE_DRIVER_STATE_START macro
3) concentrate the optimization logic outside of a governor, which will benefit all architectures
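
To make the proposal a bit more concrete, here is a rough user-space sketch of the idea, not kernel code. It assumes a state table that contains no poll entry, a selection function that is allowed to fail (return -1), and a caller that picks the fallback. All names here (struct idle_state, select_idle_state, idle_decide, enum idle_action) are hypothetical and only for illustration; they are not the existing cpuidle API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified descriptor; NOT the kernel's struct cpuidle_state. */
struct idle_state {
        const char *name;
        unsigned int exit_latency_us;
        unsigned int target_residency_us;
};

enum idle_action {
        IDLE_ACTION_POLL,        /* busy-wait, no idle state entered */
        IDLE_ACTION_ENTER_STATE  /* enter the state at *state_index */
};

/*
 * Governor side: return the index of the deepest state that fits both
 * the predicted idle time and the latency requirement, or -1 if none
 * fits. The table holds real idle states only; there is no poll entry.
 */
int select_idle_state(const struct idle_state *states, size_t count,
                      unsigned int predicted_us, unsigned int latency_req_us)
{
        int best = -1;
        size_t i;

        assert(states != NULL || count == 0);

        for (i = 0; i < count; i++) {
                if (states[i].target_residency_us > predicted_us)
                        continue;
                if (states[i].exit_latency_us > latency_req_us)
                        continue;
                best = (int)i;
        }

        return best;
}

/*
 * Caller side: poll up front when forced to, otherwise ask the governor
 * and fall back to polling when the selection fails. Entering the
 * shallowest state instead would be the other possible fallback.
 */
enum idle_action idle_decide(const struct idle_state *states, size_t count,
                             unsigned int predicted_us,
                             unsigned int latency_req_us,
                             int force_poll, int *state_index)
{
        *state_index = -1;

        if (force_poll)
                return IDLE_ACTION_POLL;

        *state_index = select_idle_state(states, count, predicted_us,
                                         latency_req_us);
        if (*state_index < 0)
                return IDLE_ACTION_POLL;

        return IDLE_ACTION_ENTER_STATE;
}
```

With a table such as { C1: 2us exit / 2us residency, C6: 100us exit / 400us residency }, a predicted idle time of 1us makes the selection fail and the caller polls, while the governor itself never needs to know that polling exists.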


Does it make sense?

  -- Daniel

[1] https://en.wikipedia.org/wiki/Burnout_%28vehicle%29


--
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

