Hi Ard,
On 2022/11/30 16:18, Ard Biesheuvel wrote:
On Wed, 30 Nov 2022 at 08:53, Marc Zyngier <m...@kernel.org> wrote:
On Wed, 30 Nov 2022 02:52:35 +0000,
"chenxiang (M)" <chenxian...@hisilicon.com> wrote:
Hi,
We boot the VM using the following command (with nvdimm on) (qemu
version 6.1.50, kernel 6.0-r4):
How relevant is the presence of the nvdimm? Do you observe the failure
without this?
qemu-system-aarch64 -machine
virt,kernel_irqchip=on,gic-version=3,nvdimm=on -kernel
/home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios
/root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m
2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0
ealycon=pl0ll,0x90000000 pcie_ports=native pciehp.pciehp_debug=1'
-object memory-backend-ram,id=ram1,size=10G -device
nvdimm,id=dimm1,memdev=ram1 -device ioh3420,id=root_port1,chassis=1
-device vfio-pci,host=7d:01.0,id=net0,bus=root_port1
Then in the VM we insmod a module, and a vmalloc error occurs as follows
(kernel 5.19-rc4 is fine, and the issue is still present on kernel 6.1-rc4):
estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko
[ 8.186563] vmap allocation for size 20480 failed: use
vmalloc=<size> to increase size
Have you tried increasing the vmalloc size to check that this is
indeed the problem?
[...]
We bisected the code with git, and found the patch c5a89f75d2a ("arm64:
kaslr: defer initialization to initcall where permitted").
I guess you mean commit fc5a89f75d2a instead, right?
Do you have any idea about the issue?
I sort of suspect that the nvdimm gets vmap-ed and consumes a large
portion of the vmalloc space, but you give very little information
that could help here...
Ouch. I suspect what's going on here: that patch defers the
randomization of the module region, so that we can decouple it from
the very early init code.
Obviously, it is happening too late now, and the randomized module
region is overlapping with a vmalloc region that is in use by the time
the randomization occurs.
Does the below fix the issue?
The issue still occurs, but the change seems to reduce its probability:
before, it occurred almost every time; after the change, I had to try
2-3 times before it occurred.
But when I change "subsys_initcall" to "core_initcall" instead, I have
tested more than 20 times and it is still OK.
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
index 37a9deed2aec..71fb18b2f304 100644
--- a/arch/arm64/kernel/kaslr.c
+++ b/arch/arm64/kernel/kaslr.c
@@ -90,4 +90,4 @@ static int __init kaslr_init(void)
return 0;
}
-subsys_initcall(kaslr_init)
+arch_initcall(kaslr_init)
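
For reference, the variants discussed above (subsys_initcall,
arch_initcall, core_initcall) differ only in when kaslr_init() runs
during boot. Below is a minimal user-space sketch of that ordering,
assuming the level numbers from include/linux/init.h (core = 1,
postcore = 2, arch = 3, subsys = 4). The enum, struct, and function
names are hypothetical and exist only for illustration; this is not
kernel code, and it does not model the ordering of calls within one
level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Initcall levels, numbered as in include/linux/init.h.
 * Lower numbers run earlier during boot. */
enum initcall_level {
    LEVEL_CORE     = 1, /* core_initcall()     */
    LEVEL_POSTCORE = 2, /* postcore_initcall() */
    LEVEL_ARCH     = 3, /* arch_initcall()     */
    LEVEL_SUBSYS   = 4, /* subsys_initcall()   */
};

/* Hypothetical record for one registered initcall. */
struct initcall {
    enum initcall_level level;
    const char *name;
};

/* Nonzero if an initcall at level a runs before one at level b. */
static int runs_before(enum initcall_level a, enum initcall_level b)
{
    return a < b;
}

/* Stable insertion sort by level: entries at the same level keep
 * their original relative order. */
static void order_initcalls(struct initcall *calls, size_t n)
{
    assert(calls != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        struct initcall key = calls[i];
        size_t j = i;
        while (j > 0 && calls[j - 1].level > key.level) {
            calls[j] = calls[j - 1];
            j--;
        }
        calls[j] = key;
    }
}
```

In this model, anything registered at the core or postcore level runs
before an arch-level kaslr_init(), while nothing does so before a
core-level one except earlier core-level calls. That would be
consistent with arch_initcall only lowering the failure rate and
core_initcall not reproducing it in 20+ runs, but it is only a guess
at the cause.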