On Wed, Jul 12, 2017 at 11:19:59AM +0800, Li, Aubrey wrote:
> On 2017/7/12 2:11, Paul E. McKenney wrote:
> > On Tue, Jul 11, 2017 at 06:33:55PM +0200, Frederic Weisbecker wrote:
> >> On Tue, Jul 11, 2017 at 05:58:47AM -0700, Paul E. McKenney wrote:
> >>> On Mon, Jul 10, 2017 at 09:38:34AM +0800, Aubrey Li wrote:
> >>>> From: Aubrey Li <aubrey...@linux.intel.com>
> >>>>
> >>>> The system will enter a fast idle loop if the predicted idle period
> >>>> is shorter than the threshold.
> >>>> ---
> >>>>  kernel/sched/idle.c | 9 ++++++++-
> >>>>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> >>>> index cf6c11f..16a766c 100644
> >>>> --- a/kernel/sched/idle.c
> >>>> +++ b/kernel/sched/idle.c
> >>>> @@ -280,6 +280,8 @@ static void cpuidle_generic(void)
> >>>>   */
> >>>>  static void do_idle(void)
> >>>>  {
> >>>> +	unsigned int predicted_idle_us;
> >>>> +	unsigned int short_idle_threshold = jiffies_to_usecs(1) / 2;
> >>>>  	/*
> >>>>  	 * If the arch has a polling bit, we maintain an invariant:
> >>>>  	 *
> >>>> @@ -291,7 +293,12 @@ static void do_idle(void)
> >>>>
> >>>>  		__current_set_polling();
> >>>>
> >>>> -		cpuidle_generic();
> >>>> +		predicted_idle_us = cpuidle_predict();
> >>>> +
> >>>> +		if (likely(predicted_idle_us < short_idle_threshold))
> >>>> +			cpuidle_fast();
> >>>
> >>> What if we get here from nohz_full usermode execution?  In that
> >>> case, if I remember correctly, the scheduling-clock interrupt
> >>> will still be disabled, and would have to be re-enabled before
> >>> we could safely invoke cpuidle_fast().
> >>>
> >>> Or am I missing something here?
> >>
> >> That's a good point. It's partially ok because if the tick is needed
> >> for something specific, it is not entirely stopped but programmed to that
> >> deadline.
> >>
> >> Now there is some idle specific code when we enter dynticks-idle.
> >> See tick_nohz_start_idle(), tick_nohz_stop_idle(),
> >> sched_clock_idle_wakeup_event()
> >> and some subsystems that react differently when we enter dyntick idle
> >> mode (scheduler_tick_max_deferment) so the tick may need a reevaluation.
> >>
> >> For now I'd rather suggest that we treat full nohz as an exception case
> >> here and do:
> >>
> >>	if (!tick_nohz_full_cpu(smp_processor_id()) &&
> >>	    likely(predicted_idle_us < short_idle_threshold))
> >>		cpuidle_fast();
> >>
> >> Ugly but safer!
> >
> > Works for me!
>
> I guess who enabled full nohz(for example the financial guys who need the
> system response as fast as possible) does not like this compromise, ;)

And some HPC guys and some real-time guys with CPU-bound real-time
processing, so there are likely quite a few different views on this
compromise.

> How about add rcu_idle enter/exit back only for full nohz case in fast idle?
> RCU idle is the only risky ops if removing them from fast idle path.
> Comparing to adding RCU idle back, going to normal idle path has more
> overhead IMHO.

That might work, but I would need to see the actual patch.  Frederic
Weisbecker should look at it as well.

							Thanx, Paul
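
For reference, the guarded check suggested earlier in the thread can be modelled in user space as the rough sketch below. It is not kernel code and not the actual patch. Every model_* name is hypothetical, and the model assumes that jiffies_to_usecs(1) equals 1000000 / HZ, which holds when HZ divides 1000000 evenly. The nohz_full_cpu argument stands in for tick_nohz_full_cpu(smp_processor_id()).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for jiffies_to_usecs(1): one jiffy in microseconds.
 * Assumes hz divides 1000000 evenly (e.g. 100, 250, 1000). */
static unsigned int model_jiffy_usecs(unsigned int hz)
{
	assert(hz != 0);	/* guard the division below */
	return 1000000u / hz;
}

/* Threshold as in the quoted hunk: half of one jiffy, in microseconds. */
static unsigned int model_short_idle_threshold(unsigned int hz)
{
	return model_jiffy_usecs(hz) / 2;
}

/* The two outcomes of the check; "normal" is whatever the non-fast path is. */
enum model_idle_path { MODEL_IDLE_FAST, MODEL_IDLE_NORMAL };

/* Models the suggested condition: take the fast path only when the CPU is
 * not a full-nohz CPU and the predicted idle period is below the threshold. */
static enum model_idle_path model_pick_idle_path(bool nohz_full_cpu,
						 unsigned int predicted_idle_us,
						 unsigned int hz)
{
	if (!nohz_full_cpu &&
	    predicted_idle_us < model_short_idle_threshold(hz))
		return MODEL_IDLE_FAST;
	return MODEL_IDLE_NORMAL;
}
```

Under these assumptions, HZ=1000 gives a 500 us threshold and HZ=250 gives 2000 us. A full-nohz CPU never takes the fast path in this model, which is exactly the compromise being debated above.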