This bug is awaiting verification that the linux-bluefield/5.4.0-1070.76 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-bluefield' to 'verification-done-focal-linux-bluefield'. If the problem still exists, change the tag 'verification-needed-focal-linux-bluefield' to 'verification-failed-focal-linux-bluefield'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: kernel-spammed-focal-linux-bluefield-v2 verification-needed-focal-linux-bluefield

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1978079

Title:
  EFI pstore not cleared on boot

Status in linux-bluefield package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  Fix Released
Status in linux-bluefield source package in Focal:
  Fix Committed
Status in systemd source package in Focal:
  Fix Released
Status in systemd source package in Impish:
  Won't Fix
Status in linux-bluefield source package in Jammy:
  Fix Committed
Status in systemd source package in Jammy:
  Fix Released
Status in systemd source package in Kinetic:
  Fix Released

Bug description:

[Impact]

Systemd has a systemd-pstore component that scans the pstore on boot and, if it is non-empty, takes all previously created dumps, transfers them into its journal and removes the pstore elements. This is very important on UEFI systems, which only have a limited amount of space for variables.

In Ubuntu, the kernel is configured with CONFIG_EFI_VARS_PSTORE=m, which means the EFI pstore support gets loaded dynamically. In all of my boots, this dynamic module loading happened *after* systemd tried to check for pstore variables. So systemd-pstore never starts and never clears the UEFI variable store.

I see this happening in AWS on Graviton instances, which eventually run out of space to store the dumps. On real hardware, this behavior may lead to unbootable systems.

```
$ systemctl status systemd-pstore
○ systemd-pstore.service - Platform Persistent Storage Archival
     Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
  Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
             └─ ConditionDirectoryNotEmpty=/sys/fs/pstore was not met
       Docs: man:systemd-pstore(8)

Jun 09 09:11:41 ip-172-31-0-61 systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.

$ ls -la /sys/fs/pstore
total 0
drwxr-x--- 2 root root 0 Jun 9 09:11 .
drwxr-xr-x 8 root root 0 Jun 9 09:11 ..
-r--r--r-- 1 root root 1803 Jun 9 09:07 dmesg-efi-165476562001001
-r--r--r-- 1 root root 1777 Jun 9 09:07 dmesg-efi-165476562002001
-r--r--r-- 1 root root 1773 Jun 9 09:07 dmesg-efi-165476562003001
-r--r--r-- 1 root root 1815 Jun 9 09:07 dmesg-efi-165476562004001
-r--r--r-- 1 root root 1826 Jun 9 09:07 dmesg-efi-165476562005001
-r--r--r-- 1 root root 1754 Jun 9 09:07 dmesg-efi-165476562006001
-r--r--r-- 1 root root 1821 Jun 9 09:07 dmesg-efi-165476562007001
-r--r--r-- 1 root root 1767 Jun 9 09:07 dmesg-efi-165476562008001
-r--r--r-- 1 root root 1729 Jun 9 09:07 dmesg-efi-165476562009001
-r--r--r-- 1 root root 1819 Jun 9 09:07 dmesg-efi-165476562010001
-r--r--r-- 1 root root 1767 Jun 9 09:07 dmesg-efi-165476562011001
-r--r--r-- 1 root root 1775 Jun 9 09:07 dmesg-efi-165476562012001
-r--r--r-- 1 root root 1802 Jun 9 09:07 dmesg-efi-165476562013001
-r--r--r-- 1 root root 1812 Jun 9 09:07 dmesg-efi-165476562014001
-r--r--r-- 1 root root 1764 Jun 9 09:07 dmesg-efi-165476562015001
-r--r--r-- 1 root root 1795 Jun 9 09:11 dmesg-efi-165476589801001
-r--r--r-- 1 root root 1785 Jun 9 09:11 dmesg-efi-165476589802001
-r--r--r-- 1 root root 1683 Jun 9 09:11 dmesg-efi-165476589803001
-r--r--r-- 1 root root 1785 Jun 9 09:11 dmesg-efi-165476589804001
-r--r--r-- 1 root root 1771 Jun 9 09:11 dmesg-efi-165476589805001
-r--r--r-- 1 root root 1797 Jun 9 09:11 dmesg-efi-165476589806001
-r--r--r-- 1 root root 1805 Jun 9 09:11 dmesg-efi-165476589807001
-r--r--r-- 1 root root 1781 Jun 9 09:11 dmesg-efi-165476589808001
-r--r--r-- 1 root root 1806 Jun 9 09:11 dmesg-efi-165476589809001
-r--r--r-- 1 root root 1821 Jun 9 09:11 dmesg-efi-165476589810001
-r--r--r-- 1 root root 1763 Jun 9 09:11 dmesg-efi-165476589811001
-r--r--r-- 1 root root 1783 Jun 9 09:11 dmesg-efi-165476589812001
-r--r--r-- 1 root root 1788 Jun 9 09:11 dmesg-efi-165476589813001
-r--r--r-- 1 root root 1788 Jun 9 09:11 dmesg-efi-165476589814001
-r--r--r-- 1 root root 1786 Jun 9 09:11 dmesg-efi-165476589815001
```

This problem affects (at least) Ubuntu 20.04 and 22.04. A quick fix would be to configure CONFIG_EFI_VARS_PSTORE=y so that it's always available. A long-term fix would make systemd rescan the directory after all module probing has settled.

[Test Plan]

To reproduce this issue, the system must have EFI-backed pstore. To check which backend pstore is using, run `cat /sys/module/pstore/parameters/backend`. If it says `efi`, the steps below are applicable. Otherwise, find an environment that has EFI-backed pstore.

# Enable the pstore service. This service is supposed to move the data in /sys/fs/pstore
# to the `/var/lib/systemd/pstore` path on boot.
systemctl enable systemd-pstore.service # (or can be vendor enabled)

# Crash the kernel
echo 1 > /proc/sys/kernel/sysrq
echo 1 > /proc/sys/kernel/panic # this is usually set to zero, which leaves the kernel frozen in the panic instead of rebooting
echo "c" > /proc/sysrq-trigger

# The system will reboot itself. Check the `/sys/fs/pstore` path first:
ls /sys/fs/pstore # The path should not be empty, which means systemd-pstore has failed to do its job
ls /var/lib/systemd/pstore # The path should be empty.
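
The two `ls` checks in this test plan (here, and again once the fix is applied) could be scripted so that the result is reported as a single word. The following is only a minimal sketch: `dir_state` and `pstore_verdict` are hypothetical helper names, not part of systemd or of the proposed fix, and the classification is just one reading of the expectations stated in the test plan.

```shell
#!/bin/sh
# Sketch only: dir_state and pstore_verdict are hypothetical helpers,
# not part of systemd or of the fix discussed in this bug.

# dir_state DIR: print "missing", "unreadable", "non-empty" or "empty".
dir_state() {
    if [ ! -d "$1" ]; then
        echo missing
    elif [ ! -r "$1" ]; then
        # /sys/fs/pstore is typically readable by root only
        echo unreadable
    elif [ -n "$(ls -A "$1" 2>/dev/null)" ]; then
        echo non-empty
    else
        echo empty
    fi
}

# pstore_verdict PSTORE_DIR ARCHIVE_DIR: classify the state after a reboot.
#   "not archived": dumps are still sitting in the pstore directory
#   "archived":     pstore is empty and the archive directory has content
#   "inconclusive": anything else (no dump recorded, unreadable path, ...)
pstore_verdict() {
    pstore=$(dir_state "$1")
    archive=$(dir_state "$2")
    case "$pstore/$archive" in
        non-empty/*)     echo "not archived" ;;
        empty/non-empty) echo "archived" ;;
        *)               echo "inconclusive" ;;
    esac
}
```

Run as root after the reboot, `pstore_verdict /sys/fs/pstore /var/lib/systemd/pstore` would be expected to print "not archived" on an affected system and "archived" once the fix is in place. An "inconclusive" result most likely means that no dump was recorded or that the script was not run as root.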

# Apply the fix
sudo add-apt-repository ppa:mustafakemalgilor/lp-1978079-1
sudo apt upgrade

# Crash the kernel
echo 1 > /proc/sys/kernel/sysrq
echo 1 > /proc/sys/kernel/panic # this is usually set to zero, which leaves the kernel frozen in the panic instead of rebooting
echo "c" > /proc/sysrq-trigger

# The system will reboot itself. After reboot, the contents of `/sys/fs/pstore` must have been moved to the `/var/lib/systemd/pstore` path.
ls /sys/fs/pstore # The path should be empty
ls /var/lib/systemd/pstore # The path should not be empty

[Where problems could occur]

On some systems, even though the described bug is present, its effect may not be observable. The nature of the issue suggests that this is due to a timing issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/1978079/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp