On Thu, Feb 25, 2021 at 04:56:22PM +0800, Ying Fang wrote:
> An accurate cpu topology may help improve the cpu scheduler's decision
> making when dealing with multi-core systems. So a cpu topology
> description is helpful to provide the guest with the right view. Dario
> Faggioli's talk in [0] also shows that the virtual topology may have an
> impact on sched performance. Thus this patch series is posted to
> introduce cpu topology support for the arm platform.
>
> Both fdt and ACPI are introduced to present the cpu topology. To
> describe the cpu topology via ACPI, a PPTT table is introduced
> according to the processor hierarchy node structure. This series is
> derived from [1], in which we tried to bring both cpu and cache
> topology support to the arm platform, but there are still some issues
> to solve to support the cache hierarchy. So we split the cpu topology
> part out and send it separately. The patch series to support the cache
> hierarchy will be sent later, since Salil Mehta's cpu hotplug feature
> needs the cpu topology enabled first and he is waiting for it to be
> upstreamed.
>
> This patch series was initially based on the patches posted by Andrew
> Jones [2]. I jumped in on it since some OS vendor partners are eager
> for it. Thanks for Andrew's contribution.
>
> After applying this patch series, launch a guest with virt-6.0 and the
> cpu topology configured as sockets:cores:threads = 2:4:2, and you will
> get the output below from the lscpu command.
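For reference, an invocation along the following lines should presumably
produce the topology quoted below; only the -smp argument matters here,
and the accelerator, memory size, and kernel image are placeholder
assumptions:

  # 2 sockets * 4 cores * 2 threads = 16 vcpus
  qemu-system-aarch64 \
      -machine virt-6.0 -accel kvm -cpu host \
      -smp 16,sockets=2,cores=4,threads=2 \
      -m 4G -kernel Image -nographic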
> -----------------------------------------
> Architecture:        aarch64
> CPU op-mode(s):      64-bit
> Byte Order:          Little Endian
> CPU(s):              16
> On-line CPU(s) list: 0-15
> Thread(s) per core:  2

What CPU model was used? Did it actually support threads? If these were
KVM VCPUs, then I guess MPIDR.MT was not set on the CPUs. Apparently
that didn't confuse Linux? See [1] for how I once tried to deal with
threads.

[1] https://github.com/rhdrjones/qemu/commit/60218e0dd7b331031b644872d56f2aca42d0ff1e

> Core(s) per socket:  4
> Socket(s):           2

Good, but what happens if you specify '-smp 16'? Do you get 16 sockets,
each with 1 core? Or do you get 1 socket with 16 cores? And which do we
want, and why? If you look at [2], you'll see I was assuming we want to
prefer cores over sockets, since without topology descriptions that is
what the Linux guest kernel would do. (A quick way to check is sketched
at the end of this mail.)

[2] https://github.com/rhdrjones/qemu/commit/c0670b1bccb4d08c7cf7c6957cc8878a2af131dd

> NUMA node(s):        2

Why do we have two NUMA nodes in the guest? The two sockets in the
guest should not imply this.

Thanks,
drew

> Vendor ID:           HiSilicon
> Model:               0
> Model name:          Kunpeng-920
> Stepping:            0x1
> BogoMIPS:            200.00
> NUMA node0 CPU(s):   0-7
> NUMA node1 CPU(s):   8-15
>
> [0] https://kvmforum2020.sched.com/event/eE1y/virtual-topology-for-virtual-machines-friend-or-foe-dario-faggioli-suse
> [1] https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg02166.html
> [2] https://patchwork.ozlabs.org/project/qemu-devel/cover/20180704124923.32483-1-drjo...@redhat.com
>
> Ying Fang (5):
>   device_tree: Add qemu_fdt_add_path
>   hw/arm/virt: Add cpu-map to device tree
>   hw/arm/virt-acpi-build: distinguish possible and present cpus
>   hw/acpi/aml-build: add processor hierarchy node structure
>   hw/arm/virt-acpi-build: add PPTT table
>
>  hw/acpi/aml-build.c          | 40 ++++++++++++++++++++++
>  hw/arm/virt-acpi-build.c     | 64 +++++++++++++++++++++++++++++++++---
>  hw/arm/virt.c                | 40 +++++++++++++++++++++-
>  include/hw/acpi/acpi-defs.h  | 13 ++++++++
>  include/hw/acpi/aml-build.h  |  7 ++++
>  include/hw/arm/virt.h        |  1 +
>  include/sysemu/device_tree.h |  1 +
>  softmmu/device_tree.c        | 45 +++++++++++++++++++++++--
>  8 files changed, 204 insertions(+), 7 deletions(-)
>
> --
> 2.23.0
>
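A hypothetical way to check what '-smp 16' gives you, assuming a Linux
guest with sysfs mounted (the accelerator, memory, and kernel image
options below are placeholders, as above):

  # Boot with no explicit sockets/cores/threads split, so that the
  # default chosen by QEMU/this series is what the guest sees.
  qemu-system-aarch64 \
      -machine virt-6.0 -accel kvm -cpu host \
      -smp 16 -m 4G -kernel Image -nographic

  # Inside the guest: lscpu summarizes the resulting topology ...
  lscpu | grep -E 'Socket|Core|Thread'

  # ... and sysfs exposes the raw ids the kernel derived from the
  # cpu-map/PPTT description (or from its defaults when none exists).
  cat /sys/devices/system/cpu/cpu0/topology/physical_package_id
  cat /sys/devices/system/cpu/cpu0/topology/core_id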