[Touch-packages] [Bug 1842320] Re: Can't boot: "error: out of memory." immediately after the grub menu
Ah, didn't realize Debian had a different limit. Thanks @juliank.

@anourzad, I'm not sure I would want to start adding custom steps to the kernel update path. I'm much more inclined to add an /etc option for a supported automatic mechanism, like compression or module selection.

@adrien-n, great notes in #125, I'm sure this will help others.

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to initramfs-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1842320

Title:
Can't boot: "error: out of memory." immediately after the grub menu

Status in grub: Unknown
Status in OEM Priority Project: Triaged
Status in grub2-signed package in Ubuntu: Triaged
Status in grub2-unsigned package in Ubuntu: Triaged
Status in initramfs-tools package in Ubuntu: Won't Fix
Status in linux package in Ubuntu: Confirmed

Bug description:

[Impact]

* If a user’s initramfs grows large enough, grub2 may no longer be able to load it.

* A real case from OEM projects: on many laptops with a built-in 4K display and nvidia drivers, u-d-c puts nvidia*.ko into the initramfs, which grows the initramfs to ~120M. In addition, gfxpayload=auto keeps the 4K resolution, since that is what EFI POST passed. In this case grub cannot load the initramfs, because grub_memalign() cannot find suitable memory for the larger file:

```
#0  grub_memalign (align=1, size=592214020) at ../../../grub-core/kern/mm.c:376
#1  0x7dd7b074 in grub_malloc (size=592214020) at ../../../grub-core/kern/mm.c:408
#2  0x7dd7a2c8 in grub_verifiers_open (io=0x7bc02d80, type=131076) at ../../../grub-core/kern/verifiers.c:150
#3  0x7dd801d4 in grub_file_open (name=0x7bc02f00 "/boot/initrd.img-5.17.0-1011-oem", type=131076) at ../../../grub-core/kern/file.c:121
#4  0x7bcd5a30 in ?? ()
#5  0x7fe21247 in ?? ()
#6  0x7bc030c8 in ?? ()
#7  0x00017fe21238 in ?? ()
#8  0x7bcd5320 in ?? ()
#9  0x7fe21250 in ?? ()
#10 0x in ?? ()
```

The output of grub_mm_dump shows that memory is fragmented (some regions are probably in use because of the 4K resolution?), so no contiguous free region is large enough for the file and this check never succeeds:

```
grub_real_malloc(...)
...
if (cur->size >= n + extra)
```

According to UEFI Specification Section 7.2[1] and the UEFI driver writers’ guide 4.2.3[2], AllocatePages() can be asked for memory above the 32-bit boundary. Since most x86_64 platforms should support 64-bit addressing, we should extend GRUB_EFI_MAX_USABLE_ADDRESS to 64 bits to get more available memory.

* When a user's initramfs grows, they will probably get an "initramfs not found" failure. This is a serious hit to the user experience, because the system cannot boot.

[Test Plan]

* Detailed instructions to reproduce the bug:

1. Grow the initramfs by any method, such as installing nvidia-driver.

2. Developers who want to reproduce it can pad the initramfs with dd if=/dev/random of=... bs=1M count=500, using a hook such as:

```
$ cat /usr/share/initramfs-tools/hooks/zzz-touch-a-file
#!/bin/sh

PREREQ=""

prereqs()
{
    echo "$PREREQ"
}

case $1 in
# get pre-requisites
prereqs)
    prereqs
    exit 0
    ;;
esac

. /usr/share/initramfs-tools/hook-functions

# Pad the initramfs with 500 MiB of random (incompressible) data.
dd if=/dev/random of="${DESTDIR}/test-500M" bs=1M count=500
```

Then run update-initramfs.

* After applying my patches, the issue is gone.

* I also tested my test grubx64.efi on:

1. x86_64 QEMU with
1.1. 60M initramfs + 5.15.0-37-generic kernel
1.2. 565M initramfs + 5.17.0-1011-oem kernel

2. amd64 HP mobile workstation with
2.1. 65M initramfs + 5.15.0-39-generic kernel
2.2. 771M initramfs + 5.17.0-1011-oem kernel

All of them work well.

[Where problems could occur]

* The changes are almost entirely in i386/efi, so the impact is limited to i386 / x86_64 EFI systems. The other change modifies “grub-core/kern/efi/mm.c”, but I keep the original addressing for “arm/arm64/ia64/riscv32/riscv64”, so it should not affect them.
* There is a “#if defined(__x86_64__)” guard, which is intended to keep the > 32-bit code out of i386 systems, and also:

```
 #if defined (__code_model_large__)
-#define GRUB_EFI_MAX_USABLE_ADDRESS 0x
+#define GRUB_EFI_MAX_USABLE_ADDRESS __UINTPTR_MAX__
+#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x7fff
 #else
 #define GRUB_EFI_MAX_USABLE_ADDRESS 0x7fff
+#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x3fff
 #endif
```

If everything works as expected, i386 should keep working. If we are unlucky, then based on the “UEFI writers’ guide”[2], i386 could be given a memory region above 4GB that it can never access.

[Other Info]

* Upstream grub2 bug #61058
https://savannah.gnu.org/bugs/index.php?61058
* Test PPA: https://launchpad.net/~os369510/+archive/ubuntu/lp1842320
* Test grubx64.efi:
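
As a rough companion to the test plan in this bug description, the following sketch lists initramfs images with their sizes so a large one can be spotted before rebooting. The function name `check_initrd_sizes` and the 100 MiB default threshold are assumptions made for illustration only. The size at which grub actually fails depends on how fragmented firmware memory is, so treat the result as a heuristic.

```shell
#!/bin/sh
# Hypothetical helper (not part of initramfs-tools or grub): print every
# initrd.img-* file in a directory with its size in MiB and flag the ones at
# or above a threshold. The 100 MiB default is an assumed heuristic, not a
# documented grub limit.
check_initrd_sizes() {
    dir=${1:-/boot}
    limit_mib=${2:-100}
    for img in "$dir"/initrd.img-*; do
        # Skip the unexpanded glob when no image exists.
        [ -f "$img" ] || continue
        size_bytes=$(wc -c < "$img")
        size_mib=$((size_bytes / 1048576))
        if [ "$size_mib" -ge "$limit_mib" ]; then
            echo "LARGE $size_mib MiB $img"
        else
            echo "ok $size_mib MiB $img"
        fi
    done
}
```

Calling `check_initrd_sizes /boot 100` after update-initramfs prints one line per image. Sizes are rounded down to whole MiB.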
[Touch-packages] [Bug 1842320] Re: Can't boot: "error: out of memory." immediately after the grub menu
Confirmed that reducing the size of the initrd.img slightly allowed my machine to boot.

I first enabled `COMPRESSLEVEL=19` with `COMPRESS=zstd` (default on Debian) in initramfs.conf. That reduced my initrd.img from 72MB to 62MB, which was enough for my machine to boot.

Ultimately, I set `MODULES=dep` (instead of `MODULES=most`) and reverted the compression to defaults. This reduced my initrd.img to 22MB, which still includes Intel microcode (specific binary), nvme, crypto, and lvm support, among other things.

---

For the record: my machine is running Debian bookworm with the Nvidia proprietary drivers and 32GB of RAM. I first noticed the issue when, coincidentally, upgrading from kernel 5.19 to 6.0.

```
Loading Linux 6.0.0-4amd64 ...
Loading initial ramdisk ...
error: out of memory.

Press any key to continue...
```
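
The two workarounds described in this comment could look roughly like the excerpt below. It is a sketch, assuming the stock initramfs-tools configuration file at /etc/initramfs-tools/initramfs.conf. The option names come from the comment itself, while the layout and the choice to leave the compression lines commented out are illustrative.

```shell
# /etc/initramfs-tools/initramfs.conf (excerpt, illustrative)

# Workaround 1: keep MODULES=most but compress harder.
# In the report above this shrank the image from 72MB to 62MB.
# COMPRESS=zstd
# COMPRESSLEVEL=19

# Workaround 2: include only the modules this machine appears to need.
# In the report above this shrank the image to 22MB with default compression.
MODULES=dep
```

After editing the file, the image has to be regenerated (for example with `update-initramfs -u`) before the change takes effect. One trade-off to keep in mind: an image built with `MODULES=dep` is tailored to the current hardware, so it may not boot if the disk is moved to a different machine.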