> -----Original Message-----
> From: Andy Lutomirski [mailto:l...@amacapital.net]
> Sent: Friday, August 15, 2014 5:23 AM
> To: Peter Zijlstra
> Cc: Daniel Lezcano; Liu, Chuansheng; Rafael J. Wysocki;
> linux...@vger.kernel.org; LKML; Liu, Changcheng; Wang, Xiaoming;
> Chakravarty, Souvik K
> Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS
> back to DEFAULT
> 
> On Thu, Aug 14, 2014 at 2:16 PM, Peter Zijlstra <pet...@infradead.org>
> wrote:
> > On Thu, Aug 14, 2014 at 02:12:27PM -0700, Andy Lutomirski wrote:
> >> On 08/14/2014 04:14 AM, Daniel Lezcano wrote:
> >> > On 08/14/2014 01:00 PM, Peter Zijlstra wrote:
> >> >>
> >> >> So seeing how you're from @intel.com I'm assuming you're using x86
> here.
> >> >>
> >> >> I'm not seeing how this can be possible, MWAIT is interrupted by IPIs
> >> >> just fine, which means we'll fall out of the cpuidle_enter(), which
> >> >> means we'll cpuidle_reflect(), and then leave cpuidle_idle_call().
> >> >>
> >> >> It will indeed not leave the cpu_idle_loop() function and go right back
> >> >> into cpuidle_idle_call(), but that will then call cpuidle_select() which
> >> >> should pick a new C state.
> >> >>
> >> >> So the interrupt _should_ work. If it doesn't you need to explain why.
> >> >
> >> > I think the issue is related to the poll_idle state, in
> >> > drivers/cpuidle/driver.c. This state is x86 specific and inserted in the
> >> > cpuidle table as the state 0 (POLL). There is no mwait for this state.
> >> > It is a bit confusing because this state is not listed in the acpi /
> >> > intel idle driver but inserted implicitly at the beginning of the idle
> >> > table by the cpuidle framework when the driver is registered.
> >> >
> >> > static int poll_idle(struct cpuidle_device *dev,
> >> >                 struct cpuidle_driver *drv, int index)
> >> > {
> >> >         local_irq_enable();
> >> >         if (!current_set_polling_and_test()) {
> >> >                 while (!need_resched())
> >> >                         cpu_relax();
> >> >         }
> >> >         current_clr_polling();
> >> >
> >> >         return index;
> >> > }
> >>
> >> As the most recent person to have modified this function, and as an
> >> avowed hater of pointless IPIs, let me ask a rather different question:
> >> why are you sending IPIs at all?  As of Linux 3.16, poll_idle actually
> >> supports the polling idle interface :)
> >>
> >> Can't you just do:
> >>
> >> if (set_nr_if_polling(rq->idle)) {
> >>       trace_sched_wake_idle_without_ipi(cpu);
> >> } else {
> >>       spin_lock_irqsave(&rq->lock, flags);
> >>       if (rq->curr == rq->idle)
> >>               smp_send_reschedule(cpu);
> >>       // else the CPU wasn't idle; nothing to do
> >>       raw_spin_unlock_irqrestore(&rq->lock, flags);
> >> }
> >>
> >> In the common case (wake from C0, i.e. polling idle), this will skip the
> >> IPI entirely unless you race with idle entry/exit, saving a few more
> >> precious electrons and all of the latency involved in poking the APIC
> >> registers.
> >
> > They could and they probably should, but that logic should _not_ live in
> > the cpuidle driver.
> 
> Sure.  My point is that fixing the IPI handler is, I think, totally
> bogus, because the IPI API isn't the right way to do this at all.
> 
> It would be straightforward to add a new function wake_if_idle(int
> cpu) to sched/core.c.
> 
Thanks Andy and Peter's suggestion, it will save some IPI things in case the 
cores are not
in idle.

There is one similar API in sched/core.c wake_up_idle_cpu(),
then just need add one new common smp API:

smp_wake_up_cpus() {
for_each_online_cpu()
  wake_up_idle_cpu();
}

Will try one patch for it.


N�����r��y����b�X��ǧv�^�)޺{.n�+����{����zX����ܨ}���Ơz�&j:+v�������zZ+��+zf���h���~����i���z��w���?�����&�)ߢf��^jǫy�m��@A�a���
0��h���i

Reply via email to