Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2022-01-14 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> Hi,
>
> On 12/28/21 10:22, Yanan Wang wrote:

[...]

>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index edeab6084b..ff0ab4ca20 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -1404,7 +1404,9 @@
>>  #
>>  # @dies: number of dies per socket in the CPU topology
>>  #
>> -# @cores: number of cores per die in the CPU topology
>> +# @clusters: number of clusters per die in the CPU topology
>
> Missing:
>
>#(since 7.0)
>
>> +#
>> +# @cores: number of cores per cluster in the CPU topology
>>  #
>>  # @threads: number of threads per core in the CPU topology
>>  #
>> @@ -1416,6 +1418,7 @@
>>   '*cpus': 'int',
>>   '*sockets': 'int',
>>   '*dies': 'int',
>> + '*clusters': 'int',
>>   '*cores': 'int',
>>   '*threads': 'int',
>>   '*maxcpus': 'int' } }
>
> If you want I can update the doc when applying.

With the update, QAPI schema
Acked-by: Markus Armbruster 




Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-29 Thread Philippe Mathieu-Daudé
On 12/29/21 14:04, wangyanan (Y) wrote:
> 
> On 2021/12/29 18:44, Philippe Mathieu-Daudé wrote:
>> On 12/29/21 04:48, wangyanan (Y) wrote:
>>> Hi Philippe,
>>> Thanks for your review.
>>>
>>> On 2021/12/29 3:17, Philippe Mathieu-Daudé wrote:
 Hi,

 On 12/28/21 10:22, Yanan Wang wrote:
> The new Cluster-Aware Scheduling support has landed in Linux 5.16,
> which has been proved to benefit the scheduling performance (e.g.
> load balance and wake_affine strategy) on both x86_64 and AArch64.
>
> So now in Linux 5.16 we have four-level arch-neutral CPU topology
> definition like below and a new scheduler level for clusters.
> struct cpu_topology {
>   int thread_id;
>   int core_id;
>   int cluster_id;
>   int package_id;
>   int llc_id;
>   cpumask_t thread_sibling;
>   cpumask_t core_sibling;
>   cpumask_t cluster_sibling;
>   cpumask_t llc_sibling;
> }
>
> A cluster generally means a group of CPU cores which share L2 cache
> or other mid-level resources, and it is the shared resources that
> is used to improve scheduler's behavior. From the point of view of
> the size range, it's between CPU die and CPU core. For example, on
> some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
> and 4 CPU cores in each cluster. The 4 CPU cores share a separate
> L2 cache and a L3 cache tag, which brings cache affinity advantage.
>
> In virtualization, on the Hosts which have pClusters, if we can
 Maybe [*] -> reference to pClusters?
>>> Hm, I'm not sure what kind of reference is appropriate here.
>> So I guess the confusion comes from a simple typo =)
> I tried to mean "physical clusters" on host by pClusters, on the contrary
> to "virtual clusters" on guest. But obviously it brings confusion.

OK, I got confused because you don't use "vClusters".

>> Is it OK if I replace "pClusters" by "Clusters"?
> Sure, it's clearer to just use "clusters", please do that.

OK.




Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-29 Thread wangyanan (Y)



On 2021/12/29 18:44, Philippe Mathieu-Daudé wrote:

On 12/29/21 04:48, wangyanan (Y) wrote:

Hi Philippe,
Thanks for your review.

On 2021/12/29 3:17, Philippe Mathieu-Daudé wrote:

Hi,

On 12/28/21 10:22, Yanan Wang wrote:

The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proved to benefit the scheduling performance (e.g.
load balance and wake_affine strategy) on both x86_64 and AArch64.

So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
  int thread_id;
  int core_id;
  int cluster_id;
  int package_id;
  int llc_id;
  cpumask_t thread_sibling;
  cpumask_t core_sibling;
  cpumask_t cluster_sibling;
  cpumask_t llc_sibling;
}

A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
is used to improve scheduler's behavior. From the point of view of
the size range, it's between CPU die and CPU core. For example, on
some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
and 4 CPU cores in each cluster. The 4 CPU cores share a separate
L2 cache and a L3 cache tag, which brings cache affinity advantage.

In virtualization, on the Hosts which have pClusters, if we can

Maybe [*] -> reference to pClusters?

Hm, I'm not sure what kind of reference is appropriate here.

So I guess the confusion comes from a simple typo =)

I tried to mean "physical clusters" on host by pClusters, on the contrary
to "virtual clusters" on guest. But obviously it brings confusion.

Is it OK if I replace "pClusters" by "Clusters"?

Sure, it's clearer to just use "clusters", please do that.

The third paragraph in the commit message has explained what
a cluster generally means. We can also read the description of
clusters in Linux kernel Kconfig files: [1] and [2].

[1]arm64: https://github.com/torvalds/linux/blob/master/arch/arm64/Kconfig

config SCHED_CLUSTER
    bool "Cluster scheduler support"
    help
  Cluster scheduler support improves the CPU scheduler's decision
  making when dealing with machines that have clusters of CPUs.
  Cluster usually means a couple of CPUs which are placed closely
  by sharing mid-level caches, last-level cache tags or internal
  busses.

[2]x86: https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig

config SCHED_CLUSTER
    bool "Cluster scheduler support"
    depends on SMP
    default y
    help
  Cluster scheduler support improves the CPU scheduler's decision
  making when dealing with machines that have clusters of CPUs.
  Cluster usually means a couple of CPUs which are placed closely
  by sharing mid-level caches, last-level cache tags or internal
  busses.

design a vCPU topology with cluster level for guest kernel and
have a dedicated vCPU pinning. A Cluster-Aware Guest kernel can
also make use of the cache affinity of CPU clusters to gain
similar scheduling performance.

This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify cluster
parameter if their machines support it.

Signed-off-by: Yanan Wang 
---
   hw/core/machine-smp.c | 26 +++---
   hw/core/machine.c |  3 +++
   include/hw/boards.h   |  6 +-
   qapi/machine.json |  5 -
   qemu-options.hx   |  7 ---
   softmmu/vl.c  |  3 +++
   6 files changed, 38 insertions(+), 12 deletions(-)
diff --git a/qapi/machine.json b/qapi/machine.json
index edeab6084b..ff0ab4ca20 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1404,7 +1404,9 @@
   #
   # @dies: number of dies per socket in the CPU topology
   #
-# @cores: number of cores per die in the CPU topology
+# @clusters: number of clusters per die in the CPU topology

Missing:

     #    (since 7.0)

Ah, yes.

+#
+# @cores: number of cores per cluster in the CPU topology
   #
   # @threads: number of threads per core in the CPU topology
   #
@@ -1416,6 +1418,7 @@
    '*cpus': 'int',
    '*sockets': 'int',
    '*dies': 'int',
+ '*clusters': 'int',
    '*cores': 'int',
    '*threads': 'int',
    '*maxcpus': 'int' } }

If you want I can update the doc when applying.

Do you mean the missing "since 7.0"?
If you have a plan to apply the first 1-7 patches separately and
I don't need to respin, please help to update it, thank you! :)

Yes, that is the plan.

Thank you! I will pack the rest for ARM into next version separately
after you queue the generic part.

Thanks,
Yanan

Thanks,
Yanan

Thanks,

Phil.






Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-29 Thread Philippe Mathieu-Daudé
On 12/29/21 04:48, wangyanan (Y) wrote:
> Hi Philippe,
> Thanks for your review.
> 
> On 2021/12/29 3:17, Philippe Mathieu-Daudé wrote:
>> Hi,
>>
>> On 12/28/21 10:22, Yanan Wang wrote:
>>> The new Cluster-Aware Scheduling support has landed in Linux 5.16,
>>> which has been proved to benefit the scheduling performance (e.g.
>>> load balance and wake_affine strategy) on both x86_64 and AArch64.
>>>
>>> So now in Linux 5.16 we have four-level arch-neutral CPU topology
>>> definition like below and a new scheduler level for clusters.
>>> struct cpu_topology {
>>>  int thread_id;
>>>  int core_id;
>>>  int cluster_id;
>>>  int package_id;
>>>  int llc_id;
>>>  cpumask_t thread_sibling;
>>>  cpumask_t core_sibling;
>>>  cpumask_t cluster_sibling;
>>>  cpumask_t llc_sibling;
>>> }
>>>
>>> A cluster generally means a group of CPU cores which share L2 cache
>>> or other mid-level resources, and it is the shared resources that
>>> is used to improve scheduler's behavior. From the point of view of
>>> the size range, it's between CPU die and CPU core. For example, on
>>> some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
>>> and 4 CPU cores in each cluster. The 4 CPU cores share a separate
>>> L2 cache and a L3 cache tag, which brings cache affinity advantage.
>>>
>>> In virtualization, on the Hosts which have pClusters, if we can
>> Maybe [*] -> reference to pClusters?
> Hm, I'm not sure what kind of reference is appropriate here.

So I guess the confusion comes from a simple typo =)

Is it OK if I replace "pClusters" by "Clusters"?

> The third paragraph in the commit message has explained what
> a cluster generally means. We can also read the description of
> clusters in Linux kernel Kconfig files: [1] and [2].
> 
> [1]arm64: https://github.com/torvalds/linux/blob/master/arch/arm64/Kconfig
> 
> config SCHED_CLUSTER
>    bool "Cluster scheduler support"
>    help
>  Cluster scheduler support improves the CPU scheduler's decision
>  making when dealing with machines that have clusters of CPUs.
>  Cluster usually means a couple of CPUs which are placed closely
>  by sharing mid-level caches, last-level cache tags or internal
>  busses.
> 
> [2]x86: https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig
> 
> config SCHED_CLUSTER
>    bool "Cluster scheduler support"
>    depends on SMP
>    default y
>    help
>  Cluster scheduler support improves the CPU scheduler's decision
>  making when dealing with machines that have clusters of CPUs.
>  Cluster usually means a couple of CPUs which are placed closely
>  by sharing mid-level caches, last-level cache tags or internal
>  busses.
>>> design a vCPU topology with cluster level for guest kernel and
>>> have a dedicated vCPU pinning. A Cluster-Aware Guest kernel can
>>> also make use of the cache affinity of CPU clusters to gain
>>> similar scheduling performance.
>>>
>>> This patch adds infrastructure for CPU cluster level topology
>>> configuration and parsing, so that the user can specify cluster
>>> parameter if their machines support it.
>>>
>>> Signed-off-by: Yanan Wang 
>>> ---
>>>   hw/core/machine-smp.c | 26 +++---
>>>   hw/core/machine.c |  3 +++
>>>   include/hw/boards.h   |  6 +-
>>>   qapi/machine.json |  5 -
>>>   qemu-options.hx   |  7 ---
>>>   softmmu/vl.c  |  3 +++
>>>   6 files changed, 38 insertions(+), 12 deletions(-)
>>> diff --git a/qapi/machine.json b/qapi/machine.json
>>> index edeab6084b..ff0ab4ca20 100644
>>> --- a/qapi/machine.json
>>> +++ b/qapi/machine.json
>>> @@ -1404,7 +1404,9 @@
>>>   #
>>>   # @dies: number of dies per socket in the CPU topology
>>>   #
>>> -# @cores: number of cores per die in the CPU topology
>>> +# @clusters: number of clusters per die in the CPU topology
>> Missing:
>>
>>     #    (since 7.0)
> Ah, yes.
>>> +#
>>> +# @cores: number of cores per cluster in the CPU topology
>>>   #
>>>   # @threads: number of threads per core in the CPU topology
>>>   #
>>> @@ -1416,6 +1418,7 @@
>>>    '*cpus': 'int',
>>>    '*sockets': 'int',
>>>    '*dies': 'int',
>>> + '*clusters': 'int',
>>>    '*cores': 'int',
>>>    '*threads': 'int',
>>>    '*maxcpus': 'int' } }
>> If you want I can update the doc when applying.
> Do you mean the missing "since 7.0"?
> If you have a plan to apply the first 1-7 patches separately and
> I don't need to respin, please help to update it, thank you! :)

Yes, that is the plan.

> 
> Thanks,
> Yanan
>> Thanks,
>>
>> Phil.
>>
>> .
> 




Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-28 Thread wangyanan (Y)

Hi Philippe,
Thanks for your review.

On 2021/12/29 3:17, Philippe Mathieu-Daudé wrote:

Hi,

On 12/28/21 10:22, Yanan Wang wrote:

The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proved to benefit the scheduling performance (e.g.
load balance and wake_affine strategy) on both x86_64 and AArch64.

So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
 int thread_id;
 int core_id;
 int cluster_id;
 int package_id;
 int llc_id;
 cpumask_t thread_sibling;
 cpumask_t core_sibling;
 cpumask_t cluster_sibling;
 cpumask_t llc_sibling;
}

A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
is used to improve scheduler's behavior. From the point of view of
the size range, it's between CPU die and CPU core. For example, on
some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
and 4 CPU cores in each cluster. The 4 CPU cores share a separate
L2 cache and a L3 cache tag, which brings cache affinity advantage.

In virtualization, on the Hosts which have pClusters, if we can

Maybe [*] -> reference to pClusters?

Hm, I'm not sure what kind of reference is appropriate here.
The third paragraph in the commit message has explained what
a cluster generally means. We can also read the description of
clusters in Linux kernel Kconfig files: [1] and [2].

[1]arm64: https://github.com/torvalds/linux/blob/master/arch/arm64/Kconfig

config SCHED_CLUSTER
   bool "Cluster scheduler support"
   help
 Cluster scheduler support improves the CPU scheduler's decision
 making when dealing with machines that have clusters of CPUs.
 Cluster usually means a couple of CPUs which are placed closely
 by sharing mid-level caches, last-level cache tags or internal
 busses.

[2]x86: https://github.com/torvalds/linux/blob/master/arch/x86/Kconfig

config SCHED_CLUSTER
   bool "Cluster scheduler support"
   depends on SMP
   default y
   help
 Cluster scheduler support improves the CPU scheduler's decision
 making when dealing with machines that have clusters of CPUs.
 Cluster usually means a couple of CPUs which are placed closely
 by sharing mid-level caches, last-level cache tags or internal
 busses.

design a vCPU topology with cluster level for guest kernel and
have a dedicated vCPU pinning. A Cluster-Aware Guest kernel can
also make use of the cache affinity of CPU clusters to gain
similar scheduling performance.

This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify cluster
parameter if their machines support it.

Signed-off-by: Yanan Wang 
---
  hw/core/machine-smp.c | 26 +++---
  hw/core/machine.c |  3 +++
  include/hw/boards.h   |  6 +-
  qapi/machine.json |  5 -
  qemu-options.hx   |  7 ---
  softmmu/vl.c  |  3 +++
  6 files changed, 38 insertions(+), 12 deletions(-)
diff --git a/qapi/machine.json b/qapi/machine.json
index edeab6084b..ff0ab4ca20 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1404,7 +1404,9 @@
  #
  # @dies: number of dies per socket in the CPU topology
  #
-# @cores: number of cores per die in the CPU topology
+# @clusters: number of clusters per die in the CPU topology

Missing:

#(since 7.0)

Ah, yes.

+#
+# @cores: number of cores per cluster in the CPU topology
  #
  # @threads: number of threads per core in the CPU topology
  #
@@ -1416,6 +1418,7 @@
   '*cpus': 'int',
   '*sockets': 'int',
   '*dies': 'int',
+ '*clusters': 'int',
   '*cores': 'int',
   '*threads': 'int',
   '*maxcpus': 'int' } }

If you want I can update the doc when applying.

Do you mean the missing "since 7.0"?
If you have a plan to apply the first 1-7 patches separately and
I don't need to respin, please help to update it, thank you! :)

Thanks,
Yanan

Thanks,

Phil.






Re: [PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-28 Thread Philippe Mathieu-Daudé
Hi,

On 12/28/21 10:22, Yanan Wang wrote:
> The new Cluster-Aware Scheduling support has landed in Linux 5.16,
> which has been proved to benefit the scheduling performance (e.g.
> load balance and wake_affine strategy) on both x86_64 and AArch64.
> 
> So now in Linux 5.16 we have four-level arch-neutral CPU topology
> definition like below and a new scheduler level for clusters.
> struct cpu_topology {
> int thread_id;
> int core_id;
> int cluster_id;
> int package_id;
> int llc_id;
> cpumask_t thread_sibling;
> cpumask_t core_sibling;
> cpumask_t cluster_sibling;
> cpumask_t llc_sibling;
> }
> 
> A cluster generally means a group of CPU cores which share L2 cache
> or other mid-level resources, and it is the shared resources that
> is used to improve scheduler's behavior. From the point of view of
> the size range, it's between CPU die and CPU core. For example, on
> some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
> and 4 CPU cores in each cluster. The 4 CPU cores share a separate
> L2 cache and a L3 cache tag, which brings cache affinity advantage.
> 
> In virtualization, on the Hosts which have pClusters, if we can

Maybe [*] -> reference to pClusters?

> design a vCPU topology with cluster level for guest kernel and
> have a dedicated vCPU pinning. A Cluster-Aware Guest kernel can
> also make use of the cache affinity of CPU clusters to gain
> similar scheduling performance.
> 
> This patch adds infrastructure for CPU cluster level topology
> configuration and parsing, so that the user can specify cluster
> parameter if their machines support it.
> 
> Signed-off-by: Yanan Wang 
> ---
>  hw/core/machine-smp.c | 26 +++---
>  hw/core/machine.c |  3 +++
>  include/hw/boards.h   |  6 +-
>  qapi/machine.json |  5 -
>  qemu-options.hx   |  7 ---
>  softmmu/vl.c  |  3 +++
>  6 files changed, 38 insertions(+), 12 deletions(-)

> diff --git a/qapi/machine.json b/qapi/machine.json
> index edeab6084b..ff0ab4ca20 100644
> --- a/qapi/machine.json
> +++ b/qapi/machine.json
> @@ -1404,7 +1404,9 @@
>  #
>  # @dies: number of dies per socket in the CPU topology
>  #
> -# @cores: number of cores per die in the CPU topology
> +# @clusters: number of clusters per die in the CPU topology

Missing:

   #(since 7.0)

> +#
> +# @cores: number of cores per cluster in the CPU topology
>  #
>  # @threads: number of threads per core in the CPU topology
>  #
> @@ -1416,6 +1418,7 @@
>   '*cpus': 'int',
>   '*sockets': 'int',
>   '*dies': 'int',
> + '*clusters': 'int',
>   '*cores': 'int',
>   '*threads': 'int',
>   '*maxcpus': 'int' } }
If you want I can update the doc when applying.

Thanks,

Phil.




[PATCH v5 02/14] hw/core/machine: Introduce CPU cluster topology support

2021-12-28 Thread Yanan Wang via
The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proved to benefit the scheduling performance (e.g.
load balance and wake_affine strategy) on both x86_64 and AArch64.

So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
int thread_id;
int core_id;
int cluster_id;
int package_id;
int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
cpumask_t cluster_sibling;
cpumask_t llc_sibling;
}

A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
is used to improve scheduler's behavior. From the point of view of
the size range, it's between CPU die and CPU core. For example, on
some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
and 4 CPU cores in each cluster. The 4 CPU cores share a separate
L2 cache and a L3 cache tag, which brings cache affinity advantage.

In virtualization, on the Hosts which have pClusters, if we can
design a vCPU topology with cluster level for guest kernel and
have a dedicated vCPU pinning. A Cluster-Aware Guest kernel can
also make use of the cache affinity of CPU clusters to gain
similar scheduling performance.

This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify cluster
parameter if their machines support it.
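With this infrastructure in place, a guest cluster topology could be specified on the command line roughly as below. This is a hypothetical invocation: it assumes a machine type whose `smp_props.clusters_supported` is set (the ARM-specific enablement is in later patches of this series, so the exact machine and board support are assumptions, not something this patch alone provides).

```shell
# Hypothetical example: 16 vCPUs arranged as
# 1 socket * 1 die * 2 clusters * 4 cores * 2 threads,
# on a machine type that declares clusters_supported.
qemu-system-aarch64 -machine virt -smp 16,sockets=1,clusters=2,cores=4,threads=2
```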

Signed-off-by: Yanan Wang 
---
 hw/core/machine-smp.c | 26 +++---
 hw/core/machine.c |  3 +++
 include/hw/boards.h   |  6 +-
 qapi/machine.json |  5 -
 qemu-options.hx   |  7 ---
 softmmu/vl.c  |  3 +++
 6 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 2cbfd57429..b39ed21e65 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -37,6 +37,10 @@ static char *cpu_hierarchy_to_string(MachineState *ms)
 g_string_append_printf(s, " * dies (%u)", ms->smp.dies);
 }
 
+if (mc->smp_props.clusters_supported) {
+g_string_append_printf(s, " * clusters (%u)", ms->smp.clusters);
+}
+
 g_string_append_printf(s, " * cores (%u)", ms->smp.cores);
 g_string_append_printf(s, " * threads (%u)", ms->smp.threads);
 
@@ -71,6 +75,7 @@ void machine_parse_smp_config(MachineState *ms,
 unsigned cpus= config->has_cpus ? config->cpus : 0;
 unsigned sockets = config->has_sockets ? config->sockets : 0;
 unsigned dies= config->has_dies ? config->dies : 0;
+unsigned clusters = config->has_clusters ? config->clusters : 0;
 unsigned cores   = config->has_cores ? config->cores : 0;
 unsigned threads = config->has_threads ? config->threads : 0;
 unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0;
@@ -82,6 +87,7 @@ void machine_parse_smp_config(MachineState *ms,
 if ((config->has_cpus && config->cpus == 0) ||
 (config->has_sockets && config->sockets == 0) ||
 (config->has_dies && config->dies == 0) ||
+(config->has_clusters && config->clusters == 0) ||
 (config->has_cores && config->cores == 0) ||
 (config->has_threads && config->threads == 0) ||
 (config->has_maxcpus && config->maxcpus == 0)) {
@@ -97,8 +103,13 @@ void machine_parse_smp_config(MachineState *ms,
 error_setg(errp, "dies not supported by this machine's CPU topology");
 return;
 }
+if (!mc->smp_props.clusters_supported && clusters > 1) {
+error_setg(errp, "clusters not supported by this machine's CPU topology");
+return;
+}
 
 dies = dies > 0 ? dies : 1;
+clusters = clusters > 0 ? clusters : 1;
 
 /* compute missing values based on the provided ones */
 if (cpus == 0 && maxcpus == 0) {
@@ -113,41 +124,42 @@ void machine_parse_smp_config(MachineState *ms,
 if (sockets == 0) {
 cores = cores > 0 ? cores : 1;
 threads = threads > 0 ? threads : 1;
-sockets = maxcpus / (dies * cores * threads);
+sockets = maxcpus / (dies * clusters * cores * threads);
 } else if (cores == 0) {
 threads = threads > 0 ? threads : 1;
-cores = maxcpus / (sockets * dies * threads);
+cores = maxcpus / (sockets * dies * clusters * threads);
 }
 } else {
 /* prefer cores over sockets since 6.2 */
 if (cores == 0) {
 sockets = sockets > 0 ? sockets : 1;
 threads = threads > 0 ? threads : 1;
-cores = maxcpus / (sockets * dies * threads);
+cores = maxcpus / (sockets * dies * clusters * threads);
 } else if (sockets == 0) {
 threads = threads > 0 ? threads : 1;
-sockets = maxcpus / (dies * cores * threads);
+sockets = maxcpus / (dies * cluste