** Description changed:

+ [Impact]
+ 
  Systemd has a systemd-pstore component that scans the pstore on boot and
  if non-empty, takes all previously created dumps, transfers them into
  its journal and removes the pstore elements. This is very important on
  UEFI systems, which only have a limited amount of space for variables.
  
  In Ubuntu, the kernel is configured with CONFIG_EFI_VARS_PSTORE=m which
  means the EFI pstore support gets loaded dynamically. In all of my
  boots, this dynamic module loading happened *after* systemd tried to
  check for pstore variables. So systemd-pstore never starts and never
  clears the UEFI variable store. I see this happening in AWS on Graviton
  instances, which eventually run out of space to store the dumps. On real
  hardware, this behavior may lead to unbootable systems.
  
  ```
  $ systemctl status systemd-pstore
  ○ systemd-pstore.service - Platform Persistent Storage Archival
-      Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; vendor preset: enabled)
-      Active: inactive (dead)
-   Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
-              └─ ConditionDirectoryNotEmpty=/sys/fs/pstore was not met
-        Docs: man:systemd-pstore(8)
+      Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; vendor preset: enabled)
+      Active: inactive (dead)
+   Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
+              └─ ConditionDirectoryNotEmpty=/sys/fs/pstore was not met
+        Docs: man:systemd-pstore(8)
  
  Jun 09 09:11:41 ip-172-31-0-61 systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.
  
  $ ls -la /sys/fs/pstore
  total 0
  drwxr-x--- 2 root root    0 Jun  9 09:11 .
  drwxr-xr-x 8 root root    0 Jun  9 09:11 ..
  -r--r--r-- 1 root root 1803 Jun  9 09:07 dmesg-efi-165476562001001
  -r--r--r-- 1 root root 1777 Jun  9 09:07 dmesg-efi-165476562002001
  -r--r--r-- 1 root root 1773 Jun  9 09:07 dmesg-efi-165476562003001
  -r--r--r-- 1 root root 1815 Jun  9 09:07 dmesg-efi-165476562004001
  -r--r--r-- 1 root root 1826 Jun  9 09:07 dmesg-efi-165476562005001
  -r--r--r-- 1 root root 1754 Jun  9 09:07 dmesg-efi-165476562006001
  -r--r--r-- 1 root root 1821 Jun  9 09:07 dmesg-efi-165476562007001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562008001
  -r--r--r-- 1 root root 1729 Jun  9 09:07 dmesg-efi-165476562009001
  -r--r--r-- 1 root root 1819 Jun  9 09:07 dmesg-efi-165476562010001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562011001
  -r--r--r-- 1 root root 1775 Jun  9 09:07 dmesg-efi-165476562012001
  -r--r--r-- 1 root root 1802 Jun  9 09:07 dmesg-efi-165476562013001
  -r--r--r-- 1 root root 1812 Jun  9 09:07 dmesg-efi-165476562014001
  -r--r--r-- 1 root root 1764 Jun  9 09:07 dmesg-efi-165476562015001
  -r--r--r-- 1 root root 1795 Jun  9 09:11 dmesg-efi-165476589801001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589802001
  -r--r--r-- 1 root root 1683 Jun  9 09:11 dmesg-efi-165476589803001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589804001
  -r--r--r-- 1 root root 1771 Jun  9 09:11 dmesg-efi-165476589805001
  -r--r--r-- 1 root root 1797 Jun  9 09:11 dmesg-efi-165476589806001
  -r--r--r-- 1 root root 1805 Jun  9 09:11 dmesg-efi-165476589807001
  -r--r--r-- 1 root root 1781 Jun  9 09:11 dmesg-efi-165476589808001
  -r--r--r-- 1 root root 1806 Jun  9 09:11 dmesg-efi-165476589809001
  -r--r--r-- 1 root root 1821 Jun  9 09:11 dmesg-efi-165476589810001
  -r--r--r-- 1 root root 1763 Jun  9 09:11 dmesg-efi-165476589811001
  -r--r--r-- 1 root root 1783 Jun  9 09:11 dmesg-efi-165476589812001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589813001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589814001
  -r--r--r-- 1 root root 1786 Jun  9 09:11 dmesg-efi-165476589815001
  ```
  
  This problem affects (at least) Ubuntu 20.04 and 22.04. A quick fix
  would be to configure CONFIG_EFI_VARS_PSTORE=y so that it's always
  available. A long-term fix would make systemd rescan the directory after
  all module probing has settled.
+ 
+ [Test Plan]
+ 
+ In order to be able to reproduce this issue, the system must have EFI-
+ backed pstore.
+ 
+ To check which backend pstore is using, run `cat
+ /sys/module/pstore/parameters/backend`.
+ 
+ If it says `efi`, the steps below are applicable. Otherwise, find an
+ environment that has EFI-backed pstore.
+ 
+ # Enable the pstore service. This service is supposed to move the data in
+ # /sys/fs/pstore to the `/var/lib/systemd/pstore` path on boot.
+ systemctl enable systemd-pstore.service # (or can be vendor enabled)
+ 
+ # Crash the kernel
+ echo 1 > /proc/sys/kernel/sysrq
+ # kernel.panic is usually 0, which makes the kernel hang after a panic
+ # instead of rebooting; set it to 1 so the system reboots.
+ echo 1 > /proc/sys/kernel/panic
+ echo "c" > /proc/sysrq-trigger
+ 
+ # The system will reboot itself. Check the `/sys/fs/pstore` path first:
+ ls /sys/fs/pstore # Should not be empty: systemd-pstore failed to do its job
+ ls /var/lib/systemd/pstore # The path should be empty.
+ 
+ # Apply the fix
+ sudo add-apt-repository ppa:mustafakemalgilor/lp-1978079-1
+ sudo apt upgrade
+ 
+ # Crash the kernel
+ echo 1 > /proc/sys/kernel/sysrq
+ # kernel.panic is usually 0, which makes the kernel hang after a panic
+ # instead of rebooting; set it to 1 so the system reboots.
+ echo 1 > /proc/sys/kernel/panic
+ echo "c" > /proc/sysrq-trigger
+ 
+ # The system will reboot itself. After the reboot, the contents of
+ # `/sys/fs/pstore` must have been moved to the `/var/lib/systemd/pstore` path.
+ ls /sys/fs/pstore # The path should be empty
+ ls /var/lib/systemd/pstore # The path should not be empty
+ 
+ [Where problems could occur]
+ 
+ On some systems the effect of the bug may not be observable even though
+ the bug is present. The nature of the issue suggests that this is due to
+ timing.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1978079

Title:
  EFI pstore not cleared on boot

Status in systemd package in Ubuntu:
  In Progress

Bug description:
  [Impact]

  Systemd has a systemd-pstore component that scans the pstore on boot
  and if non-empty, takes all previously created dumps, transfers them
  into its journal and removes the pstore elements. This is very
  important on UEFI systems, which only have a limited amount of space
  for variables.

  In Ubuntu, the kernel is configured with CONFIG_EFI_VARS_PSTORE=m
  which means the EFI pstore support gets loaded dynamically. In all of
  my boots, this dynamic module loading happened *after* systemd tried
  to check for pstore variables. So systemd-pstore never starts and
  never clears the UEFI variable store. I see this happening in AWS on
  Graviton instances, which eventually run out of space to store the
  dumps. On real hardware, this behavior may lead to unbootable systems.

  ```
  $ systemctl status systemd-pstore
  ○ systemd-pstore.service - Platform Persistent Storage Archival
       Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; vendor preset: enabled)
       Active: inactive (dead)
    Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
               └─ ConditionDirectoryNotEmpty=/sys/fs/pstore was not met
         Docs: man:systemd-pstore(8)

  Jun 09 09:11:41 ip-172-31-0-61 systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.

  $ ls -la /sys/fs/pstore
  total 0
  drwxr-x--- 2 root root    0 Jun  9 09:11 .
  drwxr-xr-x 8 root root    0 Jun  9 09:11 ..
  -r--r--r-- 1 root root 1803 Jun  9 09:07 dmesg-efi-165476562001001
  -r--r--r-- 1 root root 1777 Jun  9 09:07 dmesg-efi-165476562002001
  -r--r--r-- 1 root root 1773 Jun  9 09:07 dmesg-efi-165476562003001
  -r--r--r-- 1 root root 1815 Jun  9 09:07 dmesg-efi-165476562004001
  -r--r--r-- 1 root root 1826 Jun  9 09:07 dmesg-efi-165476562005001
  -r--r--r-- 1 root root 1754 Jun  9 09:07 dmesg-efi-165476562006001
  -r--r--r-- 1 root root 1821 Jun  9 09:07 dmesg-efi-165476562007001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562008001
  -r--r--r-- 1 root root 1729 Jun  9 09:07 dmesg-efi-165476562009001
  -r--r--r-- 1 root root 1819 Jun  9 09:07 dmesg-efi-165476562010001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562011001
  -r--r--r-- 1 root root 1775 Jun  9 09:07 dmesg-efi-165476562012001
  -r--r--r-- 1 root root 1802 Jun  9 09:07 dmesg-efi-165476562013001
  -r--r--r-- 1 root root 1812 Jun  9 09:07 dmesg-efi-165476562014001
  -r--r--r-- 1 root root 1764 Jun  9 09:07 dmesg-efi-165476562015001
  -r--r--r-- 1 root root 1795 Jun  9 09:11 dmesg-efi-165476589801001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589802001
  -r--r--r-- 1 root root 1683 Jun  9 09:11 dmesg-efi-165476589803001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589804001
  -r--r--r-- 1 root root 1771 Jun  9 09:11 dmesg-efi-165476589805001
  -r--r--r-- 1 root root 1797 Jun  9 09:11 dmesg-efi-165476589806001
  -r--r--r-- 1 root root 1805 Jun  9 09:11 dmesg-efi-165476589807001
  -r--r--r-- 1 root root 1781 Jun  9 09:11 dmesg-efi-165476589808001
  -r--r--r-- 1 root root 1806 Jun  9 09:11 dmesg-efi-165476589809001
  -r--r--r-- 1 root root 1821 Jun  9 09:11 dmesg-efi-165476589810001
  -r--r--r-- 1 root root 1763 Jun  9 09:11 dmesg-efi-165476589811001
  -r--r--r-- 1 root root 1783 Jun  9 09:11 dmesg-efi-165476589812001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589813001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589814001
  -r--r--r-- 1 root root 1786 Jun  9 09:11 dmesg-efi-165476589815001
  ```

  This problem affects (at least) Ubuntu 20.04 and 22.04. A quick fix
  would be to configure CONFIG_EFI_VARS_PSTORE=y so that it's always
  available. A long-term fix would make systemd rescan the directory
  after all module probing has settled.
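
  Whether a given kernel is exposed to this ordering problem can be read
  from its build configuration. The following is only a minimal sketch and
  not part of the original report: it assumes the kernel config is installed
  as /boot/config-$(uname -r), as on stock Ubuntu kernels, and
  `pstore_efi_mode` is a hypothetical helper name.

  ```shell
  # Hypothetical helper: print how CONFIG_EFI_VARS_PSTORE is set in a kernel
  # config file: "y" (built in), "m" (module) or "unset".
  pstore_efi_mode() {
      config_file="$1"
      value=$(sed -n 's/^CONFIG_EFI_VARS_PSTORE=\([ym]\)$/\1/p' "$config_file")
      echo "${value:-unset}"
  }

  # Assumption: the running kernel's config lives in /boot (Ubuntu layout).
  config="/boot/config-$(uname -r)"
  if [ -r "$config" ]; then
      pstore_efi_mode "$config"
  fi
  ```

  A result of `m` matches the configuration described above, while `y` means
  the EFI backend does not depend on module loading at all.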

  [Test Plan]

  In order to be able to reproduce this issue, the system must have EFI-
  backed pstore.

  To check which backend pstore is using, run `cat
  /sys/module/pstore/parameters/backend`.

  If it says `efi`, the steps below are applicable. Otherwise, find an
  environment that has EFI-backed pstore.
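
  The backend check can be scripted so that the test plan aborts early on
  unsuitable machines. This is a sketch only: `pstore_backend_is_efi` is a
  hypothetical helper, and the expected value `efi` is the one given in the
  text above (other kernel versions may report a different name).

  ```shell
  # Hypothetical helper: succeed only if pstore reports the EFI backend.
  # The optional argument overrides the sysfs path (useful for testing).
  pstore_backend_is_efi() {
      backend_file="${1:-/sys/module/pstore/parameters/backend}"
      [ -r "$backend_file" ] && [ "$(cat "$backend_file")" = "efi" ]
  }

  if pstore_backend_is_efi; then
      echo "EFI-backed pstore: the steps below apply"
  else
      echo "pstore is not EFI-backed here: use another environment"
  fi
  ```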

  # Enable the pstore service. This service is supposed to move the data in
  # /sys/fs/pstore to the `/var/lib/systemd/pstore` path on boot.
  systemctl enable systemd-pstore.service # (or can be vendor enabled)

  # Crash the kernel
  echo 1 > /proc/sys/kernel/sysrq
  # kernel.panic is usually 0, which makes the kernel hang after a panic
  # instead of rebooting; set it to 1 so the system reboots.
  echo 1 > /proc/sys/kernel/panic
  echo "c" > /proc/sysrq-trigger

  # The system will reboot itself. Check the `/sys/fs/pstore` path first:
  ls /sys/fs/pstore # Should not be empty: systemd-pstore failed to do its job
  ls /var/lib/systemd/pstore # The path should be empty.

  # Apply the fix
  sudo add-apt-repository ppa:mustafakemalgilor/lp-1978079-1
  sudo apt upgrade

  # Crash the kernel
  echo 1 > /proc/sys/kernel/sysrq
  # kernel.panic is usually 0, which makes the kernel hang after a panic
  # instead of rebooting; set it to 1 so the system reboots.
  echo 1 > /proc/sys/kernel/panic
  echo "c" > /proc/sysrq-trigger

  # The system will reboot itself. After the reboot, the contents of
  # `/sys/fs/pstore` must have been moved to the `/var/lib/systemd/pstore` path.
  ls /sys/fs/pstore # The path should be empty
  ls /var/lib/systemd/pstore # The path should not be empty
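
  The final check of the test plan can also be expressed as a pass/fail
  function instead of two manual `ls` calls. This is a hedged sketch:
  `pstore_archived` is a hypothetical helper name, and its default paths
  are simply the two directories used in the steps above.

  ```shell
  # Hypothetical helper: report whether systemd-pstore archived the dumps.
  # Arguments default to the paths used in the test plan. Run it as root,
  # because /sys/fs/pstore is not readable by other users.
  pstore_archived() {
      pstore_dir="${1:-/sys/fs/pstore}"
      archive_dir="${2:-/var/lib/systemd/pstore}"
      # ls -A omits . and .., so empty output means an empty directory.
      if [ -n "$(ls -A "$pstore_dir" 2>/dev/null)" ]; then
          echo "FAIL: $pstore_dir still contains dumps"
          return 1
      fi
      if [ -z "$(ls -A "$archive_dir" 2>/dev/null)" ]; then
          echo "FAIL: $archive_dir is empty"
          return 1
      fi
      echo "PASS: dumps were moved to $archive_dir"
  }
  ```

  After the second reboot, calling `pstore_archived` with no arguments is
  expected to print the PASS line once the fix is installed.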

  [Where problems could occur]

  On some systems the effect of the bug may not be observable even
  though the bug is present. The nature of the issue suggests that this
  is due to timing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1978079/+subscriptions

