> -----Original Message-----
> From: Valentin Schneider [mailto:valentin.schnei...@arm.com]
> Sent: Thursday, February 4, 2021 4:55 AM
> To: linux-kernel@vger.kernel.org
> Cc: vincent.guit...@linaro.org; mgor...@suse.de; mi...@kernel.org;
> pet...@infradead.org; dietmar.eggem...@arm.com; morten.rasmus...@arm.com;
> linux...@openeuler.org; xuwei (O) <xuw...@huawei.com>; Liguozhu (Kenneth)
> <liguo...@hisilicon.com>; tiantao (H) <tiant...@hisilicon.com>; wanghuiqiang
> <wanghuiqi...@huawei.com>; Zengtao (B) <prime.z...@hisilicon.com>; Jonathan
> Cameron <jonathan.came...@huawei.com>; guodong...@linaro.org; Song Bao Hua
> (Barry Song) <song.bao....@hisilicon.com>; Meelis Roos <mr...@linux.ee>
> Subject: [RFC PATCH 2/2] Revert "sched/topology: Warn when NUMA diameter > 2"
> 
> The scheduler topology code can now figure out what to do with such
> topologies.
> 
> This reverts commit b5b217346de85ed1b03fdecd5c5076b34fbb2f0b.
> 
> Signed-off-by: Valentin Schneider <valentin.schnei...@arm.com>

Yes, this is fine. I have also seen some other problems that we need
to consider.

The current code is probably only well tuned for machines with
2 hops or fewer. So even after we fix the 3-hops span issue, I
can still see other issues.

For example, if we change the sd flags and remove the SD_BALANCE_EXEC,
SD_BALANCE_FORK and SD_WAKE_AFFINE flags for the last hop in sd_init(),
we see a large score increase in unixbench.

                if (sched_domains_numa_distance[tl->numa_level] > node_reclaim_distance ||
                    is_3rd_hops_domain(...)) {
                        sd->flags &= ~(SD_BALANCE_EXEC |
                                       SD_BALANCE_FORK |
                                       SD_WAKE_AFFINE);
                }

So I guess something needs to be tuned for machines with 3 hops or more.
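
To make the experiment above concrete, here is a minimal userspace sketch
of the flag masking. Everything in it is a hypothetical stand-in: the
TOY_SD_* bit values are not the kernel's definitions, struct toy_sd is
not struct sched_domain, and the is_last_hop argument replaces
is_3rd_hops_domain(), which is not an existing kernel helper. It only
shows the logic, not the actual change to sd_init().

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration; not the kernel's definitions. */
#define TOY_SD_BALANCE_EXEC  (1u << 0)
#define TOY_SD_BALANCE_FORK  (1u << 1)
#define TOY_SD_WAKE_AFFINE   (1u << 2)
#define TOY_SD_NUMA          (1u << 3)

/* Toy stand-in for struct sched_domain: only the flags word is modelled. */
struct toy_sd {
	unsigned int flags;
};

/*
 * Clear exec/fork balancing and wake-affine when the domain is farther
 * away than the reclaim distance, or when the caller says it is the
 * last hop. All other flags are left untouched.
 */
void toy_trim_flags(struct toy_sd *sd, int distance,
		    int reclaim_distance, bool is_last_hop)
{
	if (distance > reclaim_distance || is_last_hop)
		sd->flags &= ~(TOY_SD_BALANCE_EXEC |
			       TOY_SD_BALANCE_FORK |
			       TOY_SD_WAKE_AFFINE);
}
```

Calling toy_trim_flags() on a domain that has all four toy flags set
leaves only TOY_SD_NUMA when either condition holds, and leaves the
flags unchanged otherwise.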

But we need a kernel that has the fix for the 3-hops issue before we can
do more work.
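
For reference, the 3-hops span issue can be checked on paper with the
4-node line topology from the comment removed by this patch. The sketch
below is a userspace illustration only: nodes_within() and mask_subset()
are hypothetical helper names, and the assumption (as I understand the
overlapping group construction) is that the second group of node 0's
NUMA-2 domain is taken from node 2's NUMA-1 span.

```c
#include <stdbool.h>

#define NR_NODES 4

/* Distance table from the removed comment: 0 --- 1 --- 2 --- 3 */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

/* Bitmask of nodes whose distance from 'node' is at most max_dist. */
unsigned int nodes_within(int node, int max_dist)
{
	unsigned int mask = 0;

	for (int i = 0; i < NR_NODES; i++)
		if (node_distance[node][i] <= max_dist)
			mask |= 1u << i;
	return mask;
}

/* True if every node set in 'sub' is also set in 'sup'. */
bool mask_subset(unsigned int sub, unsigned int sup)
{
	return (sub & ~sup) == 0;
}
```

Here nodes_within(0, 30) is node 0's NUMA-2 span {0-2}, while
nodes_within(2, 20) is node 2's NUMA-1 span {1-3}, so mask_subset() of
the latter against the former is false. That matches the {0-1},{1-3}
groups listed for node 0 in the removed comment.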

> ---
>  kernel/sched/topology.c | 33 ---------------------------------
>  1 file changed, 33 deletions(-)
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index a8f69f234258..0fa41aab74e0 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -688,7 +688,6 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
>  {
>       struct rq *rq = cpu_rq(cpu);
>       struct sched_domain *tmp;
> -     int numa_distance = 0;
> 
>       /* Remove the sched domains which do not contribute to scheduling. */
>       for (tmp = sd; tmp; ) {
> @@ -720,38 +719,6 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
>                       sd->child = NULL;
>       }
> 
> -     for (tmp = sd; tmp; tmp = tmp->parent)
> -             numa_distance += !!(tmp->flags & SD_NUMA);
> -
> -     /*
> -      * FIXME: Diameter >=3 is misrepresented.
> -      *
> -      * Smallest diameter=3 topology is:
> -      *
> -      *   node   0   1   2   3
> -      *     0:  10  20  30  40
> -      *     1:  20  10  20  30
> -      *     2:  30  20  10  20
> -      *     3:  40  30  20  10
> -      *
> -      *   0 --- 1 --- 2 --- 3
> -      *
> -      * NUMA-3       0-3             N/A             N/A             0-3
> -      *  groups:     {0-2},{1-3}                                     {1-3},{0-2}
> -      *
> -      * NUMA-2       0-2             0-3             0-3             1-3
> -      *  groups:     {0-1},{1-3}     {0-2},{2-3}     {1-3},{0-1}     {2-3},{0-2}
> -      *
> -      * NUMA-1       0-1             0-2             1-3             2-3
> -      *  groups:     {0},{1}         {1},{2},{0}     {2},{3},{1}     {3},{2}
> -      *
> -      * NUMA-0       0               1               2               3
> -      *
> -      * The NUMA-2 groups for nodes 0 and 3 are obviously buggered, as the
> -      * group span isn't a subset of the domain span.
> -      */
> -     WARN_ONCE(numa_distance > 2, "Shortest NUMA path spans too many nodes\n");
> -
>       sched_domain_debug(sd, cpu);
> 
>       rq_attach_root(rq, rd);
> --
> 2.27.0

Thanks
Barry
