Hi Drew,
On 2021/4/28 18:13, Andrew Jones wrote:
On Wed, Apr 28, 2021 at 05:36:43PM +0800, wangyanan (Y) wrote:
On 2021/4/27 22:58, Andrew Jones wrote:
On Tue, Apr 13, 2021 at 04:07:45PM +0800, Yanan Wang wrote:
From: Andrew Jones <drjo...@redhat.com>
The virt machine type has never used the CPU topology parameters, other
than number of online CPUs and max CPUs. When choosing how to allocate
those CPUs the default has been to assume cores. In preparation for
using the other CPU topology parameters let's use an smp_parse that
prefers cores over sockets. We can also enforce that the topology matches
max_cpus, because we have no legacy to preserve.
Signed-off-by: Andrew Jones <drjo...@redhat.com>
Signed-off-by: Yanan Wang <wangyana...@huawei.com>
---
hw/arm/virt.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 76 insertions(+)
Thanks, this patch matches [1]. Of course, I've always considered this
patch to be something of an RFC, though. Is there any harm in defaulting
to sockets over cores? If not, I wonder if we shouldn't just leave the
default as it is to avoid a mach-virt specific smp parser. The "no
topology" compat variable will keep existing machine types from switching
from cores to sockets, so we don't need to worry about that.
[1] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd
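For reference, the rough shape of that cores-over-sockets parser is
something like the sketch below (not the literal 76-line patch: the helper
name, defaulting rules and error handling here are placeholders; only the
"fill cores before sockets, then require the product to equal maxcpus"
idea is the point):

static void virt_smp_parse(MachineState *ms, QemuOpts *opts)
{
    unsigned cpus    = qemu_opt_get_number(opts, "cpus", 0);
    unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
    unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
    unsigned threads = qemu_opt_get_number(opts, "threads", 0);

    threads = threads > 0 ? threads : 1;

    if (cpus == 0) {
        /* No cpu count given: derive it from whatever was specified. */
        sockets = sockets > 0 ? sockets : 1;
        cores   = cores   > 0 ? cores   : 1;
        cpus = sockets * cores * threads;
    } else if (cores == 0) {
        /* Prefer cores: absorb the remaining cpus into cores first... */
        sockets = sockets > 0 ? sockets : 1;
        cores = cpus / (sockets * threads);
    } else if (sockets == 0) {
        /* ...and only spill over into sockets when cores were given. */
        sockets = cpus / (cores * threads);
    }

    ms->smp.max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);

    /* No legacy to preserve on virt: reject inconsistent topologies. */
    if (sockets * cores * threads != ms->smp.max_cpus) {
        error_report("cpu topology: sockets (%u) * cores (%u) * "
                     "threads (%u) != maxcpus (%u)",
                     sockets, cores, threads, ms->smp.max_cpus);
        exit(1);
    }

    ms->smp.cpus = cpus;
    ms->smp.sockets = sockets;
    ms->smp.cores = cores;
    ms->smp.threads = threads;
}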
For CPU topology population, the ARM64 kernel first tries to parse the
ACPI PPTT table and then the DT in init_cpu_topology(); if both fail, it
falls back to the MPIDR value in store_cpu_topology(). But MPIDR cannot be
trusted and is ignored for any topology deduction. Instead, a topology of
one single socket with multiple cores is made up, which may not represent
the real underlying system topology. I think this is the reason why VMs
prefer cores over sockets by default when no topology description is
provided.
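In kernel terms, the fallback order described above is roughly the
following (a much-simplified sketch, not the actual arch/arm64 code; the
parse_* helpers and the made_up_topology flag are placeholders):

/* init_cpu_topology(): try firmware-provided topology first. */
static bool made_up_topology;

void init_cpu_topology(void)
{
    if (parse_acpi_topology() == 0)     /* ACPI PPTT described it */
        return;
    if (parse_dt_topology() == 0)       /* DT cpu-map described it */
        return;
    made_up_topology = true;            /* neither source available */
}

/* store_cpu_topology(): per-cpu fallback when firmware gave nothing. */
void store_cpu_topology(unsigned int cpu)
{
    if (!made_up_topology)
        return;
    /* MPIDR is not trusted: assume one socket, one core per cpu. */
    cpu_topology[cpu].package_id = 0;
    cpu_topology[cpu].core_id    = cpu;
    cpu_topology[cpu].thread_id  = -1;
}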
With the feature introduced by this series, the guest kernel can
successfully get CPU information from one of the two sources (ACPI or DT)
for topology population.
According to the analysis above, IMO, whether the parsing logic is
"sockets over cores" or "cores over sockets", it just provides different
topology information and consequently different scheduling performance.
Apart from that, I don't think any harm or problems would be caused.
So maybe it's fine to just use the arch-neutral parsing logic?
What do you think?
Can you do an experiment where you create a guest with N vcpus, where N is
the number of cores in a single socket. Then, pin each of those vcpus to a
core in a single physical socket. Then, boot the VM with a topology of one
socket and N cores and run some benchmarks. Then, boot the VM again with N
sockets, one core each, and run the same benchmarks.
I'm guessing we'll see the same benchmark numbers (within noise allowance)
for both runs. If we don't see the same numbers, then that'd be
interesting.
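For concreteness, with N=8 the two boots would differ only in the -smp
option, roughly as below (the rest of the command line is elided, and the
vcpu-to-physical-core pinning is done separately, e.g. with taskset or
libvirt's vcpupin):

# Run 1: one socket with eight cores
qemu-system-aarch64 ... -smp 8,sockets=1,cores=8,threads=1

# Run 2: eight sockets with one core each
qemu-system-aarch64 ... -smp 8,sockets=8,cores=1,threads=1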
Yes, I can do the experiment, and will post the results later.
Thanks,
Yanan
Thanks,
drew