Hi, Srikar,

Srikar Dronamraju <[email protected]> writes:

> * Huang, Ying <[email protected]> [2019-07-25 16:01:24]:
>
>> From: Huang Ying <[email protected]>
>> 
>> From the commit log and comments of commit 37ec97deb3a8 ("sched/numa:
>> Slow down scan rate if shared faults dominate"), the autonuma scan
>> period should be increased (scanning is slowed down) if the majority
>> of the page accesses are shared with other processes.  But in current
>> code, the scan period will be decreased (scanning is speeded up) in
>> that situation.
>> 
>> The commit log and comments make more sense.  So this patch fixes the
>> code to make it match the commit log and comments.  And this has been
>> verified via tracing the scan period changing and /proc/vmstat
>> numa_pte_updates counter when running a multi-threaded memory
>> accessing program (most memory areas are accessed by multiple
>> threads).
>> 
>
> Let's split into 4 modes.
> More Local and Private Page Accesses:
> We definitely want to scan slowly, i.e. increase the scan window.
>
> More Local and Shared Page Accesses:
> We still want to scan slowly because we have consolidated and there is no
> point in scanning faster. So scan slowly + increase the scan window.
> (Do remember access on any active node counts as local!!!)
>
> More Remote + Private page Accesses:
> Most likely the Private accesses are going to be local accesses.
>
> In the unlikely event of the private accesses not being local, we should
> scan faster so that the memory and task consolidates.
>
> More Remote + Shared page Accesses: This means the workload has not
> consolidated and needs to scan faster. So we need to scan faster.

This sounds reasonable.  But

lr_ratio < NUMA_PERIOD_THRESHOLD

doesn't necessarily indicate More Remote.  It is also true if
Local = Remote.  If the accesses are also mostly Shared in that case,
we should slow down the scanning.  So, the logic could be

if (lr_ratio >= NUMA_PERIOD_THRESHOLD)
    slow down scanning
else if (sp_ratio >= NUMA_PERIOD_THRESHOLD) {
    if (NUMA_PERIOD_SLOTS - lr_ratio >= NUMA_PERIOD_THRESHOLD)
        speed up scanning
    else
        slow down scanning
} else
    speed up scanning

Does this follow your idea better?
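
For concreteness, here is a minimal user-space sketch of the decision
above.  It is not the actual patch: slot_ratio() and should_slow_down()
are hypothetical helpers made up for illustration, and the values 10
and 7 for NUMA_PERIOD_SLOTS and NUMA_PERIOD_THRESHOLD are assumed to
match kernel/sched/fair.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, meant to mirror the constants in kernel/sched/fair.c. */
#define NUMA_PERIOD_SLOTS	10
#define NUMA_PERIOD_THRESHOLD	7

/* Hypothetical helper: scale part / (part + rest) to 0..NUMA_PERIOD_SLOTS. */
static int slot_ratio(unsigned long part, unsigned long rest)
{
	/* The caller must have seen at least one fault of either kind. */
	assert(part + rest > 0);
	return (int)((part * NUMA_PERIOD_SLOTS) / (part + rest));
}

/*
 * Hypothetical helper: returns true if scanning should be slowed down,
 * false if it should be sped up, following the pseudo code above.
 */
static bool should_slow_down(unsigned long local_faults,
			     unsigned long remote_faults,
			     unsigned long shared_faults,
			     unsigned long private_faults)
{
	int lr_ratio = slot_ratio(local_faults, remote_faults);
	int sp_ratio = slot_ratio(shared_faults, private_faults);

	/* Mostly local: nothing left to consolidate. */
	if (lr_ratio >= NUMA_PERIOD_THRESHOLD)
		return true;

	/* Mostly shared: speed up only if accesses are clearly remote. */
	if (sp_ratio >= NUMA_PERIOD_THRESHOLD)
		return NUMA_PERIOD_SLOTS - lr_ratio < NUMA_PERIOD_THRESHOLD;

	/* Mostly private but not local: speed up. */
	return false;
}
```

With Local = Remote and 90% Shared faults, lr_ratio is 5 and sp_ratio
is 9, so the sketch slows down; with 10% Local and 90% Shared faults,
lr_ratio is 1 and it speeds up.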

Best Regards,
Huang, Ying

> So I would think we should go back to before 37ec97deb3a8.
>
> i.e.
>
>       int slot = lr_ratio - NUMA_PERIOD_THRESHOLD;
>
>       if (!slot)
>               slot = 1;
>       diff = slot * period_slot;
>
>
> No?
>
>> Fixes: 37ec97deb3a8 ("sched/numa: Slow down scan rate if shared faults dominate")
>> Signed-off-by: "Huang, Ying" <[email protected]>
>> Cc: Rik van Riel <[email protected]>
>> Cc: Peter Zijlstra (Intel) <[email protected]>
>> Cc: Mel Gorman <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: Ingo Molnar <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> ---
>>  kernel/sched/fair.c | 20 ++++++++++----------
>>  1 file changed, 10 insertions(+), 10 deletions(-)
>> 
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 036be95a87e9..468a1c5038b2 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -1940,7 +1940,7 @@ static void update_task_scan_period(struct task_struct *p,
>>                      unsigned long shared, unsigned long private)
>>  {
>>      unsigned int period_slot;
>> -    int lr_ratio, ps_ratio;
>> +    int lr_ratio, sp_ratio;
>>      int diff;
>>  
>>      unsigned long remote = p->numa_faults_locality[0];
>> @@ -1971,22 +1971,22 @@ static void update_task_scan_period(struct task_struct *p,
>>       */
>>      period_slot = DIV_ROUND_UP(p->numa_scan_period, NUMA_PERIOD_SLOTS);
>>      lr_ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote);
>> -    ps_ratio = (private * NUMA_PERIOD_SLOTS) / (private + shared);
>> +    sp_ratio = (shared * NUMA_PERIOD_SLOTS) / (private + shared);
>>  
>> -    if (ps_ratio >= NUMA_PERIOD_THRESHOLD) {
>> +    if (sp_ratio >= NUMA_PERIOD_THRESHOLD) {
>>              /*
>> -             * Most memory accesses are local. There is no need to
>> -             * do fast NUMA scanning, since memory is already local.
>> +             * Most memory accesses are shared with other tasks.
>> +             * There is no point in continuing fast NUMA scanning,
>> +             * since other tasks may just move the memory elsewhere.
>
> With this change, I would expect consolidation to take a hit with
> Shared page accesses.
>
>>               */
>> -            int slot = ps_ratio - NUMA_PERIOD_THRESHOLD;
>> +            int slot = sp_ratio - NUMA_PERIOD_THRESHOLD;
>>              if (!slot)
>>                      slot = 1;
>>              diff = slot * period_slot;
>>      } else if (lr_ratio >= NUMA_PERIOD_THRESHOLD) {
>>              /*
>> -             * Most memory accesses are shared with other tasks.
>> -             * There is no point in continuing fast NUMA scanning,
>> -             * since other tasks may just move the memory elsewhere.
>> +             * Most memory accesses are local. There is no need to
>> +             * do fast NUMA scanning, since memory is already local.
>
> Comment-wise this makes sense.
>
>>               */
>>              int slot = lr_ratio - NUMA_PERIOD_THRESHOLD;
>>              if (!slot)
>> @@ -1998,7 +1998,7 @@ static void update_task_scan_period(struct task_struct *p,
>>               * yet they are not on the local NUMA node. Speed up
>>               * NUMA scanning to get the memory moved over.
>>               */
>> -            int ratio = max(lr_ratio, ps_ratio);
>> +            int ratio = max(lr_ratio, sp_ratio);
>>              diff = -(NUMA_PERIOD_THRESHOLD - ratio) * period_slot;
>>      }
>>  
>> -- 
>> 2.20.1
>> 
