** Tags added: originate-from-1994098 stella

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1842320
Title: Can't boot: "error: out of memory." immediately after the grub menu

Status in grub: Unknown
Status in OEM Priority Project: Triaged
Status in grub2-signed package in Ubuntu: Triaged
Status in grub2-unsigned package in Ubuntu: Triaged
Status in initramfs-tools package in Ubuntu: Won't Fix
Status in linux package in Ubuntu: Confirmed

Bug description:

[Impact]

* If a user's initramfs grows large enough, grub2 may no longer be able to load it.

* Some real cases from OEM projects:
On many laptops with a built-in 4K monitor and nvidia drivers, u-d-c puts nvidia*.ko into the initramfs, which grows the initramfs to ~120M. In addition, gfxpayload=auto keeps using the 4K resolution, since that is what EFI POST passed.
In this case grub cannot load the initramfs, because grub_memalign() cannot find suitable memory for the larger file:
```
#0 grub_memalign (align=1, size=592214020) at ../../../grub-core/kern/mm.c:376
#1 0x000000007dd7b074 in grub_malloc (size=592214020) at ../../../grub-core/kern/mm.c:408
#2 0x000000007dd7a2c8 in grub_verifiers_open (io=0x7bc02d80, type=131076) at ../../../grub-core/kern/verifiers.c:150
#3 0x000000007dd801d4 in grub_file_open (name=0x7bc02f00 "/boot/initrd.img-5.17.0-1011-oem", type=131076) at ../../../grub-core/kern/file.c:121
#4 0x000000007bcd5a30 in ?? ()
#5 0x000000007fe21247 in ?? ()
#6 0x000000007bc030c8 in ?? ()
#7 0x000000017fe21238 in ?? ()
#8 0x000000007bcd5320 in ?? ()
#9 0x000000007fe21250 in ?? ()
#10 0x0000000000000000 in ?? ()
```
The grub_mm_dump output shows that memory is fragmented (some regions are probably in use because of the 4K resolution?) and that no contiguous free region is large enough for the larger file to satisfy this check:
```
grub_real_malloc(...)
...
if (cur->size >= n + extra)
```
According to UEFI Specification section 7.2 [1] and the UEFI driver writers’ guide section 4.2.3 [2], we can request memory above the 32-bit address range from AllocatePages().
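To make the fragmentation point concrete, here is a minimal sketch in Python of a first-fit search over free regions under an address ceiling. It is a toy model and not GRUB's allocator: the region layout, the `first_fit` helper and the two ceilings are assumptions made up for illustration; only the request size is taken from the backtrace above.

```python
# Toy model of a first-fit search under an address ceiling.
# NOT GRUB's allocator: the free-region layout below is hypothetical.
REQUEST = 592214020  # size passed to grub_malloc in the backtrace

MiB = 1024 * 1024
GiB = 1024 * MiB

# (start_address, size_in_bytes) of each free region, all made up.
FREE_REGIONS = [
    (0x0010_0000, 300 * MiB),   # below 4 GiB, too small on its own
    (0x4000_0000, 400 * MiB),   # below 4 GiB, too small on its own
    (0x6000_0000, 200 * MiB),   # below 4 GiB, too small on its own
    (0x1_0000_0000, 2 * GiB),   # above 4 GiB, large enough
]


def first_fit(regions, size, max_address):
    """Return the start of the first region that can hold `size` bytes
    entirely at or below `max_address`, or None if no region qualifies."""
    for start, region_size in regions:
        usable_end = min(start + region_size - 1, max_address)
        if usable_end >= start and usable_end - start + 1 >= size:
            return start
    return None


# Total free memory below 4 GiB exceeds the request...
total_below_4g = sum(size for start, size in FREE_REGIONS if start < 4 * GiB)
# ...but no single region below the 32-bit ceiling is large enough.
fits_32bit = first_fit(FREE_REGIONS, REQUEST, 0xFFFF_FFFF)
# With a 64-bit ceiling the region above 4 GiB becomes eligible.
fits_64bit = first_fit(FREE_REGIONS, REQUEST, 0xFFFF_FFFF_FFFF_FFFF)

print(total_below_4g >= REQUEST, fits_32bit, hex(fits_64bit))
```

In this made-up layout 900 MiB is free below 4 GiB, yet the roughly 565 MiB request fails under the 32-bit ceiling and succeeds only once regions above 4 GiB may be used.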
As most x86_64 platforms should support 64-bit addressing, we should extend GRUB_EFI_MAX_USABLE_ADDRESS to 64 bits to make more memory available.

* When the initramfs grows, users are likely to end up with the initramfs not being found, which leaves the system unable to boot and seriously hurts the user experience.

[Test Plan]

* Detailed instructions on how to reproduce the bug:
1. Use any method that grows the initramfs, such as installing nvidia-driver.
2. Developers who want to reproduce it can use dd if=/dev/random of=... bs=1M count=500 in an initramfs hook, something like:
```
$ cat /usr/share/initramfs-tools/hooks/zzz-touch-a-file
#!/bin/sh

PREREQ=""

prereqs()
{
    echo "$PREREQ"
}

case $1 in
# get pre-requisites
prereqs)
    prereqs
    exit 0
    ;;
esac

. /usr/share/initramfs-tools/hook-functions
dd if=/dev/random of=${DESTDIR}/test-500M bs=1M count=500
```
And then run update-initramfs.

* After applying my patches, the issue is gone.

* I also tested my test grubx64.efi on:
1. x86_64 qemu with
1.1. 60M initramfs + 5.15.0-37-generic kernel
1.2. 565M initramfs + 5.17.0-1011-oem kernel
2. amd64 HP mobile workstation with
2.1. 65M initramfs + 5.15.0-39-generic kernel
2.2. 771M initramfs + 5.17.0-1011-oem kernel
All worked well.

[Where problems could occur]

* The changes are almost all in i386/efi, so the impact is limited to i386 / x86_64 EFI systems. The other change modifies “grub-core/kern/efi/mm.c”, but I keep the original addressing for “arm/arm64/ia64/riscv32/riscv64”, so it should not affect them.

* There is a “#if defined(__x86_64__)” that is intended to keep the > 32-bit code out of i386 systems, and also
```
 #if defined (__code_model_large__)
-#define GRUB_EFI_MAX_USABLE_ADDRESS 0xffffffff
+#define GRUB_EFI_MAX_USABLE_ADDRESS __UINTPTR_MAX__
+#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x7fffffff
 #else
 #define GRUB_EFI_MAX_USABLE_ADDRESS 0x7fffffff
+#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x3fffffff
 #endif
```
If everything works as expected, i386 should keep working.
If we are unlucky, then according to the “UEFI writers’ guide” [2], i386 will be given a memory region above 4GB that it can never access.

[Other Info]

* Upstream grub2 bug #61058
https://savannah.gnu.org/bugs/index.php?61058
* Test PPA: https://launchpad.net/~os369510/+archive/ubuntu/lp1842320
* Test grubx64.efi: https://people.canonical.com/~jeremysu/lp1842320/grubx64.efi.lp1842320
* Test source code: https://github.com/os369510/grub2/tree/lp1842320
* If you build the package, the test grubx64.efi is under “obj/monolithic/grub-efi-amd64/grubx64.efi”, in my case: `/var/cache/pbuilder/build/276481/build/grub2-2.06/obj/monolithic/grub-efi-amd64/grubx64.efi`
* My build command: `sudo PBSHELL=1 pbuilder build --hookdir ~/hook-dir ubuntu-grub/grub2_2.06-2ubuntu7+jeremydev2.dsc 2>&1 | tee build.log`
* My qemu command: `qemu-system-x86_64 -bios edk2/Build/OvmfX64/DEBUG_GCC5/FV/OVMF.fd -hda Templates/grub.qcow2 -m 6G -vga cirrus -smp 8 -machine type=q35,accel=kvm -cpu host -enable-kvm -boot menu=on` (I built an edk2 binary with debug logging.)
* You can use my grubx64.efi with debug symbols from https://people.canonical.com/~jeremysu/lp1842320/grubx64.efi.lp1842320-dev-with-debug-symbols and the source code from https://github.com/os369510/grub2/tree/jeremy-dev . After building the package from the source code, you can use gdb to attach to the qemu session as:
```
ubuntu@ubuntu-HP-ZBook-Fury-16-G9-Mobile-Workstation-PC [ /var/cache/pbuilder/build/35354/tmp/buildd/grub2-2.06/obj/grub-efi-amd64/grub-core ] $ gdb -x gdb_grub # with “add-symbol-file kernel.img ${address}”
```
The address can be read from the qemu serial port output: find the last “Loading driver at 0x000xxxxxxxxxx EntryPoint=0x000xxxxxxxabc” line. In the case above, use “0x000xxxxxxxabc” as ${address}.
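Picking the address out of a long serial log by eye is error-prone, so the lookup can be scripted. The following is a minimal sketch in Python, assuming the log contains lines in the “Loading driver at ... EntryPoint=...” form quoted above; the `last_entry_point` helper, the sample log lines and the `Foo.efi` name are hypothetical and only serve as illustration.

```python
import re

# Pattern follows the message quoted in the bug description. Field widths in
# a real edk2 log may differ, so any hexadecimal value is accepted.
LOADING_RE = re.compile(
    r"Loading driver at (0x[0-9A-Fa-f]+) EntryPoint=(0x[0-9A-Fa-f]+)"
)


def last_entry_point(log_text):
    """Return the EntryPoint value of the last 'Loading driver at' line,
    or None if the log contains no such line."""
    matches = LOADING_RE.findall(log_text)
    if not matches:
        return None
    _load_base, entry_point = matches[-1]
    return entry_point


# Hypothetical serial output, made up for illustration only.
SAMPLE_LOG = """\
Loading driver at 0x0007E100000 EntryPoint=0x0007E101000 Foo.efi
some unrelated debug output
Loading driver at 0x0007DD6F000 EntryPoint=0x0007DD70000
"""

address = last_entry_point(SAMPLE_LOG)
print(f"add-symbol-file kernel.img {address}")
```

The printed line can then be placed in the gdb script; for a real session the log text would be read from wherever the qemu serial output is captured.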
[1] https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
[2] https://edk2-docs.gitbook.io/edk-ii-uefi-driver-writer-s-guide/4_general_driver_design_guidelines/readme.2/423_use_uefi_memory_allocation_services

---

Upgraded from 19.04 to current 19.10 using "do-release-upgrade -d".

Can still boot using the previous 5.0.0-25-generic kernel, but the 5.2.0-15-generic fails to start.

On selecting Ubuntu from Grub, the message "error: out of memory." is immediately shown. Pressing a key attempts to start boot-up but fails to mount root fs.

Machine is HP Spectre X360 with 8GB RAM. Under kernel 5.0.0, free shows the following (run from Gnome terminal):

              total        used        free      shared  buff/cache   available
Mem:        7906564     1761196     3833240     1020216     2312128     4849224
Swap:       1003516           0     1003516

Kernel packages installed:

linux-generic                          5.2.0.15.16          amd64
linux-headers-5.2.0-15                 5.2.0-15.16          all
linux-headers-5.2.0-15-generic         5.2.0-15.16          amd64
linux-headers-generic                  5.2.0.15.16          amd64
linux-image-5.0.0-25-generic           5.0.0-25.26          amd64
linux-image-5.2.0-15-generic           5.2.0-15.16+signed1  amd64
linux-image-generic                    5.2.0.15.16          amd64
linux-modules-5.0.0-25-generic         5.0.0-25.26          amd64
linux-modules-5.2.0-15-generic         5.2.0-15.16          amd64
linux-modules-extra-5.0.0-25-generic   5.0.0-25.26          amd64
linux-modules-extra-5.2.0-15-generic   5.2.0-15.16          amd64

Photo of kernel panic attached.

NVMe drive partition layout (GPT):

Device           Start        End    Sectors   Size Type
/dev/nvme0n1p1    2048    1050623    1048576   512M EFI System
/dev/nvme0n1p2 1050624    2549759    1499136   732M Linux filesystem
/dev/nvme0n1p3 2549760 1000214527  997664768 475.7G Linux filesystem

$ sudo pvs
PV                          VG        Fmt  Attr PSize    PFree
/dev/mapper/nvme0n1p3_crypt ubuntu-vg lvm2 a--  <475.71g    0

$ sudo lvs
LV     VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
root   ubuntu-vg -wi-ao---- 474.75g
swap_1 ubuntu-vg -wi-ao---- 980.00m

Partition 3 is LUKS encrypted. Root LV is ext4.
---
ProblemType: Bug
ApportVersion: 2.20.11-0ubuntu7
Architecture: amd64
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC0:  gmckeown   1647 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 19.10
InstallationDate: Installed on 2019-08-15 (18 days ago)
InstallationMedia: Ubuntu 19.04 "Disco Dingo" - Release amd64 (20190416)
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 8087:0a2b Intel Corp.
 Bus 001 Device 002: ID 04f2:b593 Chicony Electronics Co., Ltd HP Wide Vision FHD Camera
 Bus 001 Device 004: ID 046d:c52b Logitech, Inc. Unifying Receiver
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: HP HP Spectre x360 Convertible 13-ae0xx
Package: linux (not installed)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.0.0-25-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash
ProcVersionSignature: Ubuntu 5.0.0-25.26-generic 5.0.18
RelatedPackageVersions:
 linux-restricted-modules-5.0.0-25-generic N/A
 linux-backports-modules-5.0.0-25-generic  N/A
 linux-firmware                            1.181
Tags: eoan
Uname: Linux 5.0.0-25-generic x86_64
UpgradeStatus: Upgraded to eoan on 2019-09-02 (0 days ago)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 05/17/2019
dmi.bios.vendor: AMI
dmi.bios.version: F.25
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 83B9
dmi.board.vendor: HP
dmi.board.version: 56.43
dmi.chassis.type: 31
dmi.chassis.vendor: HP
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAMI:bvrF.25:bd05/17/2019:svnHP:pnHPSpectrex360Convertible13-ae0xx:pvr:rvnHP:rn83B9:rvr56.43:cvnHP:ct31:cvrChassisVersion:
dmi.product.family: 103C_5335KV HP Spectre
dmi.product.name: HP Spectre x360 Convertible 13-ae0xx
dmi.product.sku: 2QH38EA#ABU
dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/grub/+bug/1842320/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp