On Sun, May 24, 2020 at 09:29:56PM +0100, Mel Gorman wrote:
> The patch "sched: Optimize ttwu() spinning on p->on_cpu" avoids spinning
> on p->on_rq when the task is descheduling but only if the wakee is on
> a CPU that does not share cache with the waker. This patch offloads the
> activation of the wakee to the CPU that is about to go idle if the task
> is the only one on the runqueue. This potentially allows the waker task
> to continue making progress when the wakeup is not strictly synchronous.
> 
> This is very obvious with netperf UDP_STREAM running on localhost. The
> waker is sending packets as quickly as possible without waiting for any
> reply. It frequently wakes the server for the processing of packets and
> when netserver is using local memory, it quickly completes the processing
> and goes back to idle. The waker often observes that netserver is on_rq
> and spins excessively leading to a drop in throughput.
> 
> This is a comparison of 5.7-rc6 against "sched: Optimize ttwu() spinning
> on p->on_cpu" and against this patch, labeled vanilla, optttwu-v1r1 and
> localwakelist-v1r2 respectively.
> 
>                                   5.7.0-rc6              5.7.0-rc6              5.7.0-rc6
>                                     vanilla           optttwu-v1r1     localwakelist-v1r2
> Hmean     send-64         251.49 (   0.00%)      258.05 *   2.61%*      305.59 *  21.51%*
> Hmean     send-128        497.86 (   0.00%)      519.89 *   4.43%*      600.25 *  20.57%*
> Hmean     send-256        944.90 (   0.00%)      997.45 *   5.56%*     1140.19 *  20.67%*
> Hmean     send-1024      3779.03 (   0.00%)     3859.18 *   2.12%*     4518.19 *  19.56%*
> Hmean     send-2048      7030.81 (   0.00%)     7315.99 *   4.06%*     8683.01 *  23.50%*
> Hmean     send-3312     10847.44 (   0.00%)    11149.43 *   2.78%*    12896.71 *  18.89%*
> Hmean     send-4096     13436.19 (   0.00%)    13614.09 (   1.32%)    15041.09 *  11.94%*
> Hmean     send-8192     22624.49 (   0.00%)    23265.32 *   2.83%*    24534.96 *   8.44%*
> Hmean     send-16384    34441.87 (   0.00%)    36457.15 *   5.85%*    35986.21 *   4.48%*
> 
> Note that this benefit is not universal to all wakeups; it only applies
> to the case where the waker often spins on p->on_rq.
> 
Thanks for the detailed description. I think you meant p->on_cpu here, not
p->on_rq. I know this patch is already included; if possible, please make
this edit here and below.
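
For reference, the two fields are checked at different points in
try_to_wake_up(). A simplified excerpt from the 5.7-era
kernel/sched/core.c, trimmed down to just the two checks (locking,
statistics and the rest of the function omitted):

	/* Wakee is still on a runqueue: just remote-wake it back up. */
	smp_rmb();
	if (p->on_rq && ttwu_remote(p, wake_flags))
		goto unlock;

	/*
	 * Wakee is off the runqueue but may still be running its
	 * schedule() tail on another CPU. This is the spin the
	 * changelog describes, and it is on p->on_cpu, not p->on_rq.
	 */
	smp_cond_load_acquire(&p->on_cpu, !VAL);

So both the changelog text and the new flag name below should say on_cpu.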

> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index db3a57675ccf..06297d1142a0 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1688,7 +1688,8 @@ static inline int task_on_rq_migrating(struct task_struct *p)
>   */
>  #define WF_SYNC                      0x01            /* Waker goes to sleep after wakeup */
>  #define WF_FORK                      0x02            /* Child wakeup after fork */
> -#define WF_MIGRATED          0x4             /* Internal use, task got migrated */
> +#define WF_MIGRATED          0x04            /* Internal use, task got migrated */
> +#define WF_ON_RQ             0x08            /* Wakee is on_rq */
>  
Should this be WF_ON_CPU? There is a different check for p->on_rq in
try_to_wake_up().
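
If the flag marks the wakee-still-on-cpu case, WF_ON_CPU would also read
better at the use site. A sketch of how the queueing decision could
consume it, based purely on the changelog description above (the helper
name and exact conditions here are illustrative, not quoted from the
actual patch):

	static inline bool ttwu_queue_cond(int cpu, int wake_flags)
	{
		/* CPUs that do not share cache always use the wake list. */
		if (!cpus_share_cache(smp_processor_id(), cpu))
			return true;

		/*
		 * The wakee is still descheduling (p->on_cpu != 0) and
		 * will be the only runnable task on its CPU: offload the
		 * activation to that soon-to-be-idle CPU instead of
		 * spinning on p->on_cpu in the waker.
		 */
		if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
			return true;

		return false;
	}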

Thanks,
Pavan

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.
