Hi,
On 01/02/21 16:38, Barry Song wrote:
> A tricky thing is that we shouldn't use the sgc of the 1st CPU of node2
> for the sched_group generated by grandchild, otherwise, when this CPU
> becomes the balance_cpu of another sched_group of CPUs other than node0,
> our sched_group generated by grandchild will access the same sgc as
> the sched_group generated by the child of another CPU.
>
> So in init_overlap_sched_group(), the sgc's capacity will be overwritten:
>
> 	build_balance_mask(sd, sg, mask);
> 	cpu = cpumask_first_and(sched_group_span(sg), mask);
>
> 	sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
>
> And WARN_ON_ONCE(!cpumask_equal(group_balance_mask(sg), mask)) will
> also be triggered:
>
> 	static void init_overlap_sched_group(struct sched_domain *sd,
> 					     struct sched_group *sg)
> 	{
> 		if (atomic_inc_return(&sg->sgc->ref) == 1)
> 			cpumask_copy(group_balance_mask(sg), mask);
> 		else
> 			WARN_ON_ONCE(!cpumask_equal(group_balance_mask(sg), mask));
> 	}
>
> So here we move to using the sgc of the 2nd CPU. As a corner case, if a
> NUMA node has only one CPU, we will still trigger this WARN_ON_ONCE. But
> it is really unlikely that a real system would have a NUMA node with
> only one CPU.
>

Well, it's trivial to boot such a topology with QEMU (see the sketch at
the end of this email), and it's actually the example the comment atop
that WARN_ON_ONCE() is based on. Also, you could end up with a single
CPU on a node during hotplug operations...

I am not entirely sure whether having more than one CPU per node is a
sufficient condition for the warning not to fire. I'm starting to
*think* it is, but I'm not entirely convinced yet - and now I need a new
notebook.
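
For reference, something along these lines gives you a "line" of four
nodes with a single CPU each (the node count and distance values are
just an illustration of a diameter > 2 topology, not taken from the
patch; you'd add the usual -kernel/-m/disk arguments, and depending on
your QEMU version you may need per-node memdev= backends and the
symmetric reverse dist pairs spelled out):

	qemu-system-x86_64 -smp 4 \
		-numa node,cpus=0,nodeid=0 \
		-numa node,cpus=1,nodeid=1 \
		-numa node,cpus=2,nodeid=2 \
		-numa node,cpus=3,nodeid=3 \
		-numa dist,src=0,dst=1,val=20 \
		-numa dist,src=0,dst=2,val=30 \
		-numa dist,src=0,dst=3,val=40 \
		-numa dist,src=1,dst=2,val=20 \
		-numa dist,src=1,dst=3,val=30 \
		-numa dist,src=2,dst=3,val=20

Each node having exactly one CPU is precisely the corner case described
above, so booting something like this should make the WARN_ON_ONCE()
fire.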