On 11/10/2014 08:47 PM, Peter Zijlstra wrote: > On Mon, Nov 10, 2014 at 07:50:22PM +0530, Preeti U Murthy wrote: >> Hi Peter, >> >> On 11/10/2014 05:59 PM, Peter Zijlstra wrote: >>> On Fri, Nov 07, 2014 at 03:31:22PM +0100, Daniel Lezcano wrote: >>>> The poll function is called when a timer expired or if we force to poll >>>> when >>>> the cpu_idle_force_poll option is set. >>>> >>>> The poll function does: >>>> >>>> local_irq_enable(); >>>> while (!tif_need_resched()) >>>> cpu_relax(); >>>> >>>> This default poll function suits for the x86 arch because its rep; nop; >>>> hardware power optimization. But on other archs, this optimization does not >>>> exists and we are not saving power. The arch specific bits may want to >>>> optimize this loop by adding their own optimization. >>> >>> This doesn't make sense to me; should an arch not either implement an >>> actual idle driver or implement cpu_relax() properly, why allow for a >>> third weird option? >>> >> >> The previous version of this patch simply invoked cpu_idle_loop() for >> cases where latency_req was 0. This would have changed the behavior >> on PowerPC wherein earlier the 0th idle index was returned which is also >> a polling loop but differs from cpu_idle_loop() in two ways: >> >> a. It polls at a relatively lower power state than cpu_relax(). >> b. We set certain registers to indicate that the cpu is idle. > > So I'm confused; the current code runs the generic cpu_relax idle poll > loop for the broadcast case. I suppose you want to retain this because > not doing your a-b above will indeed give you a lower latency. > > Therefore one could argue that latency_req==0 should indeed use this, > and your a-b idle state should be latency_req==1 or higher. > > Thus yes it changes behaviour, but I think it actually fixes something. > You cannot have a latency_req==0 state which has higher latency than the > actual polling loop, as you appear to have.
Yes you are right. This is fixing the current behavior. So we should be good to call cpuidle_idle_loop() when latency_req=0. > >> Hence for all such cases wherein the cpu is required to poll while idle >> (only for cases such as force poll, broadcast ipi to arrive soon and >> latency_req = 0), we should be able to call into cpuidle_idle_loop() >> only if the cpuidle driver's 0th idle state has an exit_latency > 0. >> (The 0th idle state is expected to be a polling loop with >> exit_latency = 0). >> >> If otherwise, it would mean the driver has an optimized polling loop >> when idle. But instead of adding in the logic of checking the >> exit_latency, we thought it would be simpler to call into an arch >> defined polling idle loop under the above circumstances. If that is no >> better we could fall back to cpuidle_idle_loop(). > > That still doesn't make sense to me; suppose the implementation of this > special poll state differs on different uarchs for the same arch, then > we'll end up with another registration and selection interface parallel > to cpuidle. Yeah this will only get complicated. I was trying to see how to keep the current behavior unchanged but looks like it is not worth it. Thanks Regards Preeti U Murthy > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/