On 23.08.2024 14:54, Alireza Sanaee wrote:
Failure cases:
1) There are cases where QEMU might not have any clusters selected in the
-smp option, while the user specifies caches to be shared at cluster level. In
this situation, QEMU returns an error.
2) There are other scenarios where caches exist in the system's registers but
are left unspecified by users. In this case QEMU returns failure.
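To make the two failure cases concrete, here is a minimal Python sketch of the checks as I read them. It is not QEMU code: the function name, the argument names and the error strings are all hypothetical, and the real checks in the patch may well differ in detail.

```python
# Hypothetical sketch only; none of these names exist in QEMU.
def check_cache_topology(smp, cache_levels, hw_caches):
    """Return a list of error strings; an empty list means "accepted".

    smp          -- dict of topology options given explicitly in -smp
    cache_levels -- dict mapping a cache name (e.g. "l2") to its sharing level
    hw_caches    -- set of cache names described by the CPU registers
    """
    errors = []
    # Case 1: cache shared at cluster level, but -smp has no clusters.
    for cache, level in cache_levels.items():
        if level == "cluster" and "clusters" not in smp:
            errors.append(f"{cache}: shared at cluster level but -smp has no clusters")
    # Case 2: cache exists in the registers, but the user gave no level for it.
    for cache in sorted(hw_caches - set(cache_levels)):
        errors.append(f"{cache}: present in hardware but no topology level given")
    return errors


if __name__ == "__main__":
    print(check_cache_topology({"sockets": 1, "cores": 4}, {"l2": "cluster"}, {"l2"}))
```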
Sockets, clusters, cores, threads. And then caches. Sounds like more fun
than it is already.
IIRC Arm hardware can have up to 16 cores per cluster (virt uses 16,
sbsa-ref uses 8) as this is a GIC limitation.
I have a script to visualize Arm topology:
https://github.com/hrw/sbsa-ref-status/blob/main/parse-pptt-log.py
It uses 'EFIShell> acpiview -s PPTT' output and gives something like this:
-smp 24,sockets=1,clusters=2,cores=3,threads=4
socket: offset: 0x24 parent: 0x0
cluster: offset: 0x38 parent: 0x24
core: offset: 0x4C parent: 0x38 cpuId: 0x0 L1i: 0x68 L1d: 0x84
cache: offset: 0x68 cacheId: 1 size: 0x10000 next: 0xA0
cache: offset: 0x84 cacheId: 2 size: 0x10000 next: 0xA0
cache: offset: 0xA0 cacheId: 3 size: 0x80000
thread: offset: 0xBC parent: 0x4C cpuId: 0x0
thread: offset: 0xD0 parent: 0x4C cpuId: 0x1
thread: offset: 0xE4 parent: 0x4C cpuId: 0x2
thread: offset: 0xF8 parent: 0x4C cpuId: 0x3
core: offset: 0x10C parent: 0x38 cpuId: 0x0 L1i: 0x128 L1d: 0x144
cache: offset: 0x128 cacheId: 4 size: 0x10000 next: 0x160
cache: offset: 0x144 cacheId: 5 size: 0x10000 next: 0x160
cache: offset: 0x160 cacheId: 6 size: 0x80000
thread: offset: 0x17C parent: 0x10C cpuId: 0x4
thread: offset: 0x190 parent: 0x10C cpuId: 0x5
thread: offset: 0x1A4 parent: 0x10C cpuId: 0x6
thread: offset: 0x1B8 parent: 0x10C cpuId: 0x7
core: offset: 0x1CC parent: 0x38 cpuId: 0x0 L1i: 0x1E8 L1d: 0x204
cache: offset: 0x1E8 cacheId: 7 size: 0x10000 next: 0x220
cache: offset: 0x204 cacheId: 8 size: 0x10000 next: 0x220
cache: offset: 0x220 cacheId: 9 size: 0x80000
thread: offset: 0x23C parent: 0x1CC cpuId: 0x8
thread: offset: 0x250 parent: 0x1CC cpuId: 0x9
thread: offset: 0x264 parent: 0x1CC cpuId: 0xA
thread: offset: 0x278 parent: 0x1CC cpuId: 0xB
cluster: offset: 0x28C parent: 0x24
core: offset: 0x2A0 parent: 0x28C cpuId: 0x0 L1i: 0x2BC L1d: 0x2D8
cache: offset: 0x2BC cacheId: 10 size: 0x10000 next: 0x2F4
cache: offset: 0x2D8 cacheId: 11 size: 0x10000 next: 0x2F4
cache: offset: 0x2F4 cacheId: 12 size: 0x80000
thread: offset: 0x310 parent: 0x2A0 cpuId: 0xC
thread: offset: 0x324 parent: 0x2A0 cpuId: 0xD
thread: offset: 0x338 parent: 0x2A0 cpuId: 0xE
thread: offset: 0x34C parent: 0x2A0 cpuId: 0xF
core: offset: 0x360 parent: 0x28C cpuId: 0x0 L1i: 0x37C L1d: 0x398
cache: offset: 0x37C cacheId: 13 size: 0x10000 next: 0x3B4
cache: offset: 0x398 cacheId: 14 size: 0x10000 next: 0x3B4
cache: offset: 0x3B4 cacheId: 15 size: 0x80000
thread: offset: 0x3D0 parent: 0x360 cpuId: 0x10
thread: offset: 0x3E4 parent: 0x360 cpuId: 0x11
thread: offset: 0x3F8 parent: 0x360 cpuId: 0x12
thread: offset: 0x40C parent: 0x360 cpuId: 0x13
core: offset: 0x420 parent: 0x28C cpuId: 0x0 L1i: 0x43C L1d: 0x458
cache: offset: 0x43C cacheId: 16 size: 0x10000 next: 0x474
cache: offset: 0x458 cacheId: 17 size: 0x10000 next: 0x474
cache: offset: 0x474 cacheId: 18 size: 0x80000
thread: offset: 0x490 parent: 0x420 cpuId: 0x14
thread: offset: 0x4A4 parent: 0x420 cpuId: 0x15
thread: offset: 0x4B8 parent: 0x420 cpuId: 0x16
thread: offset: 0x4CC parent: 0x420 cpuId: 0x17
You may find it useful. I tested it only with cache at either core or
cluster level.
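If you want to sanity-check a dump like the one above without the full script, a rough Python sketch could look like this. It is not taken from parse-pptt-log.py; the helper names are made up for illustration, and it assumes the "kind: key: value key: value" line format shown in the listing.

```python
import math
import re
from collections import Counter


def expected_threads(sockets, clusters, cores, threads):
    """Total CPU count implied by the -smp topology values."""
    return math.prod((sockets, clusters, cores, threads))


def parse_line(line):
    """Split 'kind: key: value key: value ...' into (kind, {key: value})."""
    kind, _, rest = line.partition(":")
    fields = dict(re.findall(r"(\w+):\s*(\S+)", rest))
    return kind.strip(), fields


def summarize(lines):
    """Count nodes per kind and children per parent offset."""
    counts = Counter()
    children = Counter()
    for line in lines:
        kind, fields = parse_line(line)
        counts[kind] += 1
        if "parent" in fields:
            children[int(fields["parent"], 16)] += 1
    return counts, children


# First core of the first cluster, copied from the listing above.
SAMPLE = """\
socket: offset: 0x24 parent: 0x0
cluster: offset: 0x38 parent: 0x24
core: offset: 0x4C parent: 0x38 cpuId: 0x0 L1i: 0x68 L1d: 0x84
cache: offset: 0x68 cacheId: 1 size: 0x10000 next: 0xA0
cache: offset: 0x84 cacheId: 2 size: 0x10000 next: 0xA0
cache: offset: 0xA0 cacheId: 3 size: 0x80000
thread: offset: 0xBC parent: 0x4C cpuId: 0x0
thread: offset: 0xD0 parent: 0x4C cpuId: 0x1
thread: offset: 0xE4 parent: 0x4C cpuId: 0x2
thread: offset: 0xF8 parent: 0x4C cpuId: 0x3
""".splitlines()

if __name__ == "__main__":
    counts, children = summarize(SAMPLE)
    print(dict(counts))
    print("threads under core 0x4C:", children[0x4C])
    print("CPUs implied by -smp:", expected_threads(1, 2, 3, 4))
```

Fed the whole listing, the thread count from summarize() should match what expected_threads() gives for the -smp line, and each core offset should show up as parent of as many threads as threads= says.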