On Mon, Jul 03, 2023 at 06:03:08PM +0200, Pierrick Bouvier wrote:
> Hi everyone,
>
> Recently (in d135f781 [1], between v7.0.0 and v8.0.0), qemu-user default cpu
> was updated to "max" instead of qemu32/qemu64.
>
> This change "broke" qemu self emulation if this new default cpu is used.
>
> $ ./qemu-x86_64 ./qemu-x86_64 --version
> qemu-x86_64: ../util/cacheflush.c:212: init_cache_info: Assertion `(isize &
> (isize - 1)) == 0' failed.
> qemu: uncaught target signal 6 (Aborted) - core dumped
> Aborted
>
> By setting cpu back to qemu64, it works again.
> $ ./qemu-x86_64 -cpu qemu64 ./qemu-x86_64 --version
> qemu-x86_64 version 8.0.50 (v8.0.0-2317-ge125b08ed6)
> Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
>
> Commenting assert does not work, as qemu aligned malloc fail shortly after.
>
> I'm willing to fix it, but I'm not sure what is the issue with "max" cpu
> exactly. Is it missing CPU cache line, or something else?
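For anyone following along, the failing assertion is a power-of-two test
on a cache line size. Below is a minimal Python sketch of the same test.
The function name is hypothetical and not taken from QEMU. The inputs are
the LEVEL1_DCACHE_LINESIZE values from the getconf output at the end of
this mail; the quoted assertion is on isize, so applying it to the dcache
values is an assumption made purely for illustration.

```python
def passes_pow2_check(size: int) -> bool:
    """Mirror of the C expression (size & (size - 1)) == 0."""
    return (size & (size - 1)) == 0


# Line sizes taken from the getconf output later in this mail:
# 64 (glibc 2.36), 0 (qemu64), 833 (Opteron_G4), 2697 (max).
for linesize in (64, 0, 833, 2697):
    print(linesize, passes_pow2_check(linesize))
```

Note that zero also satisfies the expression, which is consistent with
qemu64 not hitting the assertion while still reporting a bogus line size.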
I've observed glibc is issuing CPUID leaf 0x8000_001d.

The QEMU 'max' CPU model doesn't define xlevel, so QEMU makes it default
to the same as min_xlevel, which is calculated to be 0x8000_000a.

cpu_x86_cpuid() in QEMU sees that CPUID leaf 0x8000_001d is above
0x8000_000a, so it considers it an invalid CPUID leaf and forces it to
report the data for 0x0000_000d, which is supposedly what an invalid
CPUID leaf should do.

Net result: glibc is asking for 0x8000_001d, but getting back data
for 0x0000_000d.

This doesn't end happily for obvious reasons, getting garbage for
the dcache sizes.

The 'qemu64' CPU model also gets CPUID leaf 0x8000_001d capped back
to 0x0000_000d, but crucially qemu64 lacks the 'xsave' feature bit,
so QEMU returns all-zeroes for CPUID leaf 0x0000_000d. Still not
good, but this makes glibc report 0 for DCACHE_*, which in turn
avoids tripping up the nested qemu which queries DCACHE sysconf.

So the problem is more widespread than just the 'max' CPU model.

Any QEMU CPU model with vendor=AuthenticAMD and the xsave feature,
and the xlevel unset, will cause glibc to report garbage for the
L1D cache info.

Any QEMU CPU model with vendor=AuthenticAMD and without the xsave
feature, and the xlevel unset, will cause glibc to report zeroes
for L1D cache info.

Neither is good, but the latter at least doesn't trip up the
nested QEMU when it queries L1D cache info.

I'm unsure if QEMU's behaviour is correct with calculating the
default 'xlevel' values for 'max', but I'm assuming the xlevel
was correct for Opteron_G4/5 since those have been explicitly set
in the code for a long time.

Over to the glibc side, I see there was a recent change:

commit 103a469dc7755fd9e8ccf362f3dd4c55dc761908
Author: Sajan Karumanchi <sajan.karuman...@amd.com>
Date:   Wed Jan 18 18:29:04 2023 +0100

    x86: Cache computation for AMD architecture.

    All AMD architectures cache details will be computed based on
    __cpuid__ `0x8000_001D` and the reference to __cpuid__ `0x8000_0006`
    will be zeroed out for future architectures.
    Reviewed-by: Premachandra Mallappa <premachandra.malla...@amd.com>

This introduced the use of CPUID leaf 0x8000_001D. Before this point
glibc would use 0x8000_0000 and 0x8000_0005 to calculate the cache
size. QEMU worked correctly with this implementation.

  https://sourceware.org/pipermail/libc-alpha/2023-January/144815.html

The patch author said

  "Though we have done the testing on Zen and pre-Zen architectures,
   we recommend to carryout the tests from your end too."

It is unclear if their testing would have covered the Opteron_G4/Opteron_G5
architectures, and I'm not expecting it to have had QEMU testing, of course.

I don't have any non-virtual pre-Zen silicon I could verify CPUID
behaviour on.

I've not found historic versions of the AMD architecture reference
to see when they first documented 0x8000_001d as a valid CPUID
leaf for getting cache info.

IOW it is still unclear to me whether the root cause bug here is in
QEMU's emulation of CPUID 0x8000_001d, or whether this was actually
a real regression introduced in glibc >= 2.37.

I'm tending towards glibc regression though.
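To make the leaf capping described earlier in this mail concrete, here
is a simplified Python model of it. This is a sketch and not QEMU's
actual code: the function and parameter names are hypothetical, only the
basic and extended leaf ranges are modelled, and the level value of
0x0000_000d is assumed from the leaf this mail says gets reported
instead.

```python
def effective_leaf(index: int, level: int, xlevel: int) -> int:
    """Return the leaf whose data is reported for a requested leaf."""
    # Extended leaves are limited by xlevel, basic leaves by level.
    limit = xlevel if index >= 0x8000_0000 else level
    # An out-of-range request reports the data of the highest basic leaf.
    return level if index > limit else index


LEVEL = 0x0000_000D   # assumed highest basic leaf for 'max'
XLEVEL = 0x8000_000A  # default xlevel for 'max', as stated in this mail

print(hex(effective_leaf(0x8000_001D, LEVEL, XLEVEL)))  # 0xd
print(hex(effective_leaf(0x8000_0005, LEVEL, XLEVEL)))  # 0x80000005
```

With these values a request for 0x8000_001d is answered with the data
of leaf 0x0000_000d, while 0x8000_0005, which older glibc used, is below
the limit and is answered as requested.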
Copying Florian and the original AMD patch author.

Brief summary:

With old glibc 2.36, using QEMU's qemu64/max CPU models:

  # qemu-x86_64-static -cpu qemu64 /bin/getconf -a | grep DCACHE
  LEVEL1_DCACHE_SIZE                 65536
  LEVEL1_DCACHE_ASSOC                2
  LEVEL1_DCACHE_LINESIZE             64

  # qemu-x86_64-static -cpu Opteron_G4 /bin/getconf -a | grep DCACHE
  LEVEL1_DCACHE_SIZE                 65536
  LEVEL1_DCACHE_ASSOC                2
  LEVEL1_DCACHE_LINESIZE             64

  # qemu-x86_64-static -cpu max /bin/getconf -a | grep DCACHE
  LEVEL1_DCACHE_SIZE                 65536
  LEVEL1_DCACHE_ASSOC                2
  LEVEL1_DCACHE_LINESIZE             64

With new glibc 2.37:

  # qemu-x86_64-static -cpu qemu64 /bin/getconf -a | grep DCACHE
  LEVEL1_DCACHE_SIZE                 0
  LEVEL1_DCACHE_ASSOC                0
  LEVEL1_DCACHE_LINESIZE             0

  # qemu-x86_64-static -cpu Opteron_G4 /bin/getconf -a | grep DCACHE
  LEVEL1_DCACHE_SIZE                 693889
  LEVEL1_DCACHE_ASSOC                1
  LEVEL1_DCACHE_LINESIZE             833

  # qemu-x86_64-static -cpu max /bin/getconf -a | grep DCACHE
  LEVEL1_DCACHE_SIZE                 7273809
  LEVEL1_DCACHE_ASSOC                1
  LEVEL1_DCACHE_LINESIZE             2697

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|