On 27/03/2018 05:35, Leo Yan wrote:
> On Wed, Feb 21, 2018 at 04:29:27PM +0100, Daniel Lezcano wrote:
> 
> [...]
> 
>> +/**
>> + * cpuidle_cooling_injection_thread - Idle injection mainloop thread function
>> + * @arg: a void pointer containing the idle cooling device address
>> + *
>> + * This main function does basically two operations:
>> + *
>> + * - Goes idle for a specific amount of time
>> + *
>> + * - Sets a timer to wake up all the idle injection threads after a
>> + *   running period
>> + *
>> + * That happens only when the mitigation is enabled, otherwise the
>> + * task is scheduled out.
>> + *
>> + * In order to keep the tasks synchronized together, it is the last
>> + * task exiting the idle period which is in charge of setting the
>> + * timer.
>> + *
>> + * This function never returns.
>> + */
>> +static int cpuidle_cooling_injection_thread(void *arg)
>> +{
>> +	struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO/2 };
> 
> I am just wondering if we should set the priority to (MAX_RT_PRIO - 1)?
> Otherwise I am concerned a CPU cannot enter a deep idle state when its
> idle injection thread is preempted by a higher priority RT thread, so
> the CPUs no longer enter and exit the idle state in alignment.

I do believe we should consider other RT tasks as more important than
the idle injection threads.
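To put that trade-off in concrete terms, here is a rough userspace
analogue (not the kernel code; the mid-range computation below is only
illustrative): the thread picks a SCHED_FIFO priority in the middle of
the range, the userspace equivalent of MAX_USER_RT_PRIO/2, so higher
priority RT tasks can still preempt it. Running it needs CAP_SYS_NICE.

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	struct sched_param param;
	int min = sched_get_priority_min(SCHED_FIFO);
	int max = sched_get_priority_max(SCHED_FIFO);
	int ret;

	/* Mid-range priority: leaves headroom for more important RT tasks */
	param.sched_priority = min + (max - min) / 2;

	ret = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
	if (ret) {
		fprintf(stderr, "pthread_setschedparam: %s\n", strerror(ret));
		return 1;
	}

	printf("SCHED_FIFO priority %d (max would be %d)\n",
	       param.sched_priority, max);

	return 0;
}

Build with 'gcc -pthread'. Whether the injection threads should instead
sit at the top of the RT range is exactly the question above.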
>> +	struct cpuidle_cooling_device *idle_cdev = arg;
>> +	struct cpuidle_cooling_tsk *cct = per_cpu_ptr(&cpuidle_cooling_tsk,
>> +						      smp_processor_id());
>> +	DEFINE_WAIT(wait);
>> +
>> +	set_freezable();
>> +
>> +	sched_setscheduler(current, SCHED_FIFO, &param);
>> +
>> +	while (1) {
>> +		s64 next_wakeup;
>> +
>> +		prepare_to_wait(&cct->waitq, &wait, TASK_INTERRUPTIBLE);
>> +
>> +		schedule();
>> +
>> +		atomic_inc(&idle_cdev->count);
>> +
>> +		play_idle(idle_cdev->idle_cycle / USEC_PER_MSEC);
>> +
>> +		/*
>> +		 * The last CPU waking up is in charge of setting the
>> +		 * timer. If the CPU is hotplugged, the timer will
>> +		 * move to another CPU (which may not belong to the
>> +		 * same cluster) but that is not a problem as the
>> +		 * timer will be set again by another CPU belonging to
>> +		 * the cluster, so this mechanism is self adaptive and
>> +		 * does not require any hotplugging dance.
>> +		 */
>> +		if (!atomic_dec_and_test(&idle_cdev->count))
>> +			continue;
>> +
>> +		if (!idle_cdev->state)
>> +			continue;
>> +
>> +		next_wakeup = cpuidle_cooling_runtime(idle_cdev);
>> +
>> +		hrtimer_start(&idle_cdev->timer, ns_to_ktime(next_wakeup),
>> +			      HRTIMER_MODE_REL_PINNED);
> 
> If the SoC temperature decreases below the tipping point, will the
> timer be disabled in this case? Or will the next timer event be set
> with a big value for next_wakeup?

Another timer (the polling one) will update the 'state' variable to zero
in set_cur_state(). In the worst case, we check idle_cdev->state right
before it is updated and we end up with one extra idle injection cycle,
which is perfectly fine.
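To show why that race is harmless, here is a small userspace sketch
(an analogue under my own naming, not the kernel implementation) of the
two mechanisms discussed above: the last worker leaving the idle period
is the one that re-arms the next cycle, and the mitigation flag is only
read at that point, so a flag cleared concurrently by the polling path
costs at most one extra injection cycle.

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define NR_WORKERS	4

static atomic_int count;		/* like idle_cdev->count */
static atomic_bool mitigation = true;	/* like idle_cdev->state */

static void arm_next_period(void)
{
	/* Stand-in for hrtimer_start(): schedule the next wake up */
	printf("last worker out, arming the next injection period\n");
}

static void *injection_worker(void *arg)
{
	(void)arg;

	atomic_fetch_add(&count, 1);

	usleep(10 * 1000);		/* stand-in for play_idle() */

	/*
	 * Only the last worker out (counter back to zero) checks the
	 * mitigation flag and re-arms the period, mirroring
	 * atomic_dec_and_test() plus the idle_cdev->state test. A
	 * stale read of the flag just means one extra cycle.
	 */
	if (atomic_fetch_sub(&count, 1) == 1 && atomic_load(&mitigation))
		arm_next_period();

	return NULL;
}

int main(void)
{
	pthread_t tid[NR_WORKERS];
	int i;

	for (i = 0; i < NR_WORKERS; i++)
		pthread_create(&tid[i], NULL, injection_worker, NULL);

	for (i = 0; i < NR_WORKERS; i++)
		pthread_join(tid[i], NULL);

	return 0;
}

Build with 'gcc -pthread'. Clearing 'mitigation' from another thread
while the workers are in their idle period changes nothing except
whether the next period gets armed.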