An accurate cpu topology may help improve the cpu scheduler's decision making when dealing with multi-core systems, so a cpu topology description helps provide the guest with the right view. Cpu cache information may also have a slight impact on the sched domain, and even userspace software may check the cpu cache information to do some optimizations. Dario Faggioli's talk in [0] also shows that the virtual topology may have an impact on sched performance. Thus this patch series is posted to provide cpu and cache topology support for the arm platform.
Both fdt and ACPI are used to present the cpu and cache topology. To describe the cpu topology via ACPI, a PPTT table is introduced according to the processor hierarchy node structure. To describe the cpu cache information, a default cache hierarchy is given and built according to the cache type structure defined by ACPI; it can be made configurable later.

RFC v1 was posted at [1]. There we tried to map the MPIDR register into the cpu topology, which was wrong. Andrew pointed out that the Linux kernel is going to stop using MPIDR for topology information [2]. The root cause is that the MPIDR register has been abused by ARM OEM manufacturers. It is only used as an identifier for a specific cpu, not as a representation of the topology. Since v2, the series is rebased on Andrew's latest branch, shared at [4].

This patch series was initially based on the patches posted by Andrew Jones [3]. I picked it up because some of our OS vendor partners are eager for it. Thanks to Andrew for his contribution.

After applying this patch series, launch a guest with virt-5.3 and the cpu topology configured with sockets:cores:threads = 2:4:2, and you will get the messages below from the lscpu command.
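To make the expected layout concrete, the following is a minimal sketch (not code from this series) of how a sockets:cores:threads configuration determines the vcpu count and how a linear cpu index could be split into topology ids. The struct and function names are hypothetical, and the numbering order (threads fastest, then cores, then sockets) is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical types for illustration; not taken from the series. */
struct topo_cfg {
    unsigned sockets;
    unsigned cores;    /* cores per socket */
    unsigned threads;  /* threads per core */
};

struct topo_id {
    unsigned socket;
    unsigned core;     /* core index within its socket */
    unsigned thread;   /* thread index within its core */
};

/* Total number of vcpus described by the configuration. */
unsigned topo_total_cpus(const struct topo_cfg *cfg)
{
    return cfg->sockets * cfg->cores * cfg->threads;
}

/*
 * Split a linear cpu index into (socket, core, thread).
 * Assumption: threads are numbered fastest, then cores, then sockets.
 */
struct topo_id topo_decompose(const struct topo_cfg *cfg, unsigned cpu)
{
    struct topo_id id;

    assert(cpu < topo_total_cpus(cfg));
    id.thread = cpu % cfg->threads;
    id.core = (cpu / cfg->threads) % cfg->cores;
    id.socket = cpu / (cfg->threads * cfg->cores);
    return id;
}

/* Print one line per vcpu for comparison with the guest's view. */
void topo_dump(const struct topo_cfg *cfg)
{
    unsigned cpu;

    for (cpu = 0; cpu < topo_total_cpus(cfg); cpu++) {
        struct topo_id id = topo_decompose(cfg, cpu);

        printf("cpu %2u: socket %u core %u thread %u\n",
               cpu, id.socket, id.core, id.thread);
    }
}
```

With the 2:4:2 configuration above, topo_total_cpus() gives 16, which matches the CPU(s) count reported by lscpu.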
-----------------------------------------
Architecture:        aarch64
CPU op-mode(s):      64-bit
Byte Order:          Little Endian
CPU(s):              16
On-line CPU(s) list: 0-15
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           2
NUMA node(s):        2
Vendor ID:           HiSilicon
Model:               0
Model name:          Kunpeng-920
Stepping:            0x1
BogoMIPS:            200.00
L1d cache:           512 KiB
L1i cache:           512 KiB
L2 cache:            4 MiB
L3 cache:            128 MiB
NUMA node0 CPU(s):   0-7
NUMA node1 CPU(s):   8-15

changelog
v2 -> v3:
- Make use of possible_cpus->cpus[i].cpu to check against current online cpus

v1 -> v2:
- Rebased to the latest branch shared by Andrew Jones [4]
- Stop mapping MPIDR into vcpu topology

[0] https://kvmforum2020.sched.com/event/eE1y/virtual-topology-for-virtual-machines-friend-or-foe-dario-faggioli-suse
[1] https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg06027.html
[2] https://patchwork.kernel.org/project/linux-arm-kernel/patch/20200829130016.26106-1-valentin.schnei...@arm.com/
[3] https://patchwork.ozlabs.org/project/qemu-devel/cover/20180704124923.32483-1-drjo...@redhat.com
[4] https://github.com/rhdrjones/qemu/commits/virt-cpu-topology-refresh

Andrew Jones (5):
  hw/arm/virt: Spell out smp.cpus and smp.max_cpus
  hw/arm/virt: Remove unused variable
  hw/arm/virt: Replace smp_parse with one that prefers cores
  device_tree: Add qemu_fdt_add_path
  hw/arm/virt: DT: add cpu-map

Ying Fang (8):
  hw: add compat machines for 5.3
  hw/arm/virt-acpi-build: distinguish possible and present cpus
  hw/acpi/aml-build: add processor hierarchy node structure
  hw/arm/virt-acpi-build: add PPTT table
  target/arm/cpu: Add cpu cache description for arm
  hw/arm/virt: add fdt cache information
  hw/acpi/aml-build: Build ACPI cpu cache hierarchy information
  hw/arm/virt-acpi-build: Enable cpu and cache topology

 device_tree.c                |  45 +++++-
 hw/acpi/aml-build.c          |  68 +++++++++
 hw/arm/virt-acpi-build.c     |  99 ++++++++++++-
 hw/arm/virt.c                | 273 +++++++++++++++++++++++++++++++----
 hw/core/machine.c            |   3 +
 hw/i386/pc.c                 |   3 +
 hw/i386/pc_piix.c            |  15 +-
 hw/i386/pc_q35.c             |  14 +-
 hw/ppc/spapr.c               |  15 +-
 hw/s390x/s390-virtio-ccw.c   |  14 +-
 include/hw/acpi/acpi-defs.h  |  14 ++
 include/hw/acpi/aml-build.h  |  11 ++
 include/hw/arm/virt.h        |   4 +-
 include/hw/boards.h          |   3 +
 include/hw/i386/pc.h         |   3 +
 include/sysemu/device_tree.h |   1 +
 target/arm/cpu.c             |  42 ++++++
 target/arm/cpu.h             |  27 ++++
 18 files changed, 609 insertions(+), 45 deletions(-)

-- 
2.23.0