** Summary changed:

- why does booting any livefs squashfs cause the kernel to complain about being 
unable to read metadata
+ why does booting any livefs squashfs cause the kernel to complain about being 
unable to read metadata‽

** Tags added: rls-ff-incoming

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  why does booting any livefs squashfs cause the kernel to complain
  about being unable to read metadata‽

Status in linux package in Ubuntu:
  Confirmed
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  Confirmed

Bug description:
  1) Download focal subiquity daily image
  2) Boot, press ESC, and edit the boot command line (F6 in BIOS, e in UEFI)
  3) Before ---, insert the following options:
   bebroken debug init=/bin/bash
  4) Continue boot (Enter in BIOS, Ctrl+X in UEFI)

  5) You will be dropped into the pivoted root filesystem, before systemd
  is exec'ed as PID 1.
  6) /run/initramfs/ will contain a debug log showing how everything was
  mounted, i.e. the cdrom is mounted, the squashfs images are set up with
  losetup from there, a multi-lower overlay is set up from them and moved
  to /root, and finally pivot-root to /root makes it /. The underlying
  layers are moved into /cow for your convenience.

  7) At this point, modifying zero-byte-length files that exist in the
  lowest layer but not in the middle one, in certain ways, will result
  in them being corrupted after / is remounted.

  8) Exhibit A:
  $ cat /etc/machine-id
  (no output)
  $ systemd-machine-id-setup
  $ cat /etc/machine-id
  (some machine id)
  $ mount -o remount /
  $ cat /etc/machine-id
  I/O error
  with overlay errors in dmesg

  Similarly one can reproduce this with /etc/.pwd.lock & executing
  systemd-sysusers.
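
  To compare both files before and after the remount in one pass, a
  small check along these lines may help. This is only a sketch: the
  helper name check_readable is made up here, and it should be run as
  root since /etc/.pwd.lock is normally not world-readable.

```shell
#!/bin/sh
# Hypothetical helper: report whether a file can be read in full.
# A file corrupted as described in this report is expected to fail
# here with an I/O error.
check_readable() {
    if cat -- "$1" > /dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "FAILED: $1"
        return 1
    fi
}

# Both paths come from the report; run once before and once after
# `mount -o remount /` and compare the results.
for f in /etc/machine-id /etc/.pwd.lock; do
    check_readable "$f" || true
done
```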

  systemd-machine-id-setup is probably the easiest to trace. It does a
  simple open, truncate, lseek, write. On boot, the actual remount is
  done by starting a unit which calls /lib/systemd/systemd-remount-fs.

  Lots of things break once machine-id and .pwd.lock are corrupted: for
  example, it becomes impossible to obtain a DHCP lease, connect to
  D-Bus, or add/remove/change users or groups.

  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root.
  But hopefully booting to a quiet state with nothing running is
  sufficient to reproduce this.
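
  For anyone attempting a static reproduction on a regular host, the
  layering from step 6 can be approximated with a two-lower overlay
  mount. The sketch below only prints the mount command instead of
  running it; the helper name and all paths are made up for
  illustration and are not what casper uses.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the mount command for an
# overlay with two lower layers. overlayfs takes the lower layers as a
# colon-separated list, topmost layer first.
overlay_mount_cmd() {
    top_lower=$1   # middle layer, e.g. a loop-mounted squashfs
    base_lower=$2  # lowest layer, e.g. another loop-mounted squashfs
    upper=$3       # writable upper directory
    work=$4        # overlayfs work directory, on the same fs as upper
    target=$5      # mount point for the merged view
    printf 'mount -t overlay overlay -o lowerdir=%s:%s,upperdir=%s,workdir=%s %s\n' \
        "$top_lower" "$base_lower" "$upper" "$work" "$target"
}

# Example with made-up paths; the printed command needs root to run.
overlay_mount_cmd /tmp/ovl/middle /tmp/ovl/lowest \
    /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
```

  A zero-length file placed only in the lowest directory would then
  correspond to /etc/machine-id in the scenario above.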

  Instead of booting with `bebroken init=/bin/bash`, you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`. This will complete
  the boot with /etc/machine-id & .pwd.lock modified, meaning that a
  remount of / will cause I/O errors on those files.

  Currently, we are shipping two hacks in casper to "rm" the offending
  files and create them again on the upper rw layer. They then survive
  the remount without I/O errors. However, we'd rather not ship those
  hacks, and would prefer to have the kernel overlay fixed to work
  correctly with multi-lower-dir and not corrupt files upon remounting /.
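
  The idea behind those hacks can be sketched roughly as follows. This
  is not the actual casper code; the function name recreate_if_empty is
  made up, and the sketch does not preserve the original mode or
  ownership of the file.

```shell
#!/bin/sh
# Hypothetical sketch of the workaround: remove a zero-length file and
# create it again, so that on an overlay the new file lives on the
# writable upper layer instead of being copied up from a lower one.
recreate_if_empty() {
    f=$1
    # Only act on regular files that exist and are empty.
    if [ -f "$f" ] && [ ! -s "$f" ]; then
        rm -f -- "$f" && : > "$f"
    fi
}
```

  It would be called once per affected file before anything writes to
  it, e.g. `recreate_if_empty /etc/machine-id`.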

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1824407/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
