Attached debdiff lp2103533-noble.debdiff bumps the noble package
24.004.60-1ubuntu7.1 -> 24.004.60-1ubuntu7.2 and cherry-picks the two
upstream NULL-guard commits that fix this crash:

  63597f92d108237a3ab7d2343a602a95edddd4e5 (ply-terminal: guard NULL
  terminal in set_disabled/_unbuffered/_buffered_input)
  5c10072a978dd7566559f44a54c3e031bb4cb216 (renderers: only call
  ply_terminal_set_unbuffered_input when there is a terminal, in the drm
  and frame-buffer close_input_source paths)

Both are by nerdopolis, dated 2024-01-04, and landed immediately after
the 24.004.60 release tag, so 24.004.60 is affected. They are already
Fix Released in plucky (24.004.60-2ubuntu6) and in Fedora
(plymouth-24.004.60-14.fc41 / -19.fc42). The plucky -2ubuntu6 packaging
is not directly copyable to noble (different Debian base), so the
debdiff applies the two commits as quilt patches on top of noble's
-1ubuntu7.1 delta.

I have no upload rights, so I have subscribed ubuntu-sponsors to this
bug for upload to noble-unapproved. I have built these exact two patches
on noble (24.004.60-1ubuntu7.1+evdifix1) and confirmed the fix on the
affected hardware (HP ZBook Studio x360 G5; Intel i915 eDP-1 + NVIDIA
PRIME on-demand; StarTech USB32DP4K DisplayLink via the evdi DKMS
module). I can run verification on the -proposed build and flip
verification-needed-noble -> verification-done-noble.

Proposed bug description follows; please copy it into the description.

[Impact]

 * plymouthd SIGSEGVs during boot on systems that have a terminal-less
   secondary DRM head while the boot splash is active. Effect for users:
   the boot splash dies and a plymouthd coredump is logged on every boot.
   On an encrypted-disk install, the splash that dies is the one that
   presents the LUKS passphrase prompt.

 * Root cause is a NULL dereference of backend->terminal in
   ply_terminal_set_disabled_input(). Plymouth assigns a terminal only to
   the primary console DRM card but still builds a backend and watches
   input on secondary, terminal-less DRM heads. The public reports trigger
   this with NVIDIA multi-GPU / vt-less kernels. A second trigger is a
   DisplayLink evdi connector that reports connected+enabled with a 0-byte
   EDID, which is also a terminal-less head.

 * The upload fixes it by NULL-guarding the terminal pointer before the
   terminal-input calls: in the set_disabled/_unbuffered/_buffered_input
   functions in ply-terminal.c, and before the call in the drm and
   frame-buffer renderers' close_input_source(). It cherry-picks the two
   upstream commits listed under [Other Info] onto noble's -1ubuntu7.1 as
   -1ubuntu7.2.

 * Justification for backporting: noble is 24.04 LTS in standard support.
   Affected users get a crashing boot splash and a coredump on every boot,
   and the noble task has had no fix for roughly 14 months while the fix
   has shipped in plucky and Fedora. The change is small and defensive.
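
For reviewers, the shape of the guard described in the third [Impact]
bullet can be sketched as follows. This is an illustrative sketch only, not
the upstream patch: every sketch_* type and function is a hypothetical
stand-in and does not match plymouth's real structures or signatures. It
shows the two guard sites, one inside the terminal function and one at the
renderer call site.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT plymouth's real types. */
typedef struct {
        bool input_disabled;
} sketch_terminal_t;

typedef struct {
        sketch_terminal_t *terminal; /* NULL on a terminal-less head */
} sketch_backend_t;

/* Guard site 1, inside the terminal function: a NULL terminal becomes a
 * no-op instead of a dereference. Returns true if a terminal was touched. */
bool
sketch_terminal_set_disabled_input (sketch_terminal_t *terminal)
{
        if (terminal == NULL)
                return false;

        terminal->input_disabled = true;
        return true;
}

/* Guard site 2, at the renderer call site: input is only re-enabled on
 * the terminal when the backend actually has one. */
bool
sketch_close_input_source (sketch_backend_t *backend)
{
        if (backend->terminal == NULL)
                return false;

        backend->terminal->input_disabled = false;
        return true;
}
```

With a backend whose terminal is NULL, both calls return false without
touching memory; with a real terminal the behaviour is the same as without
the guard.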

[Test Plan]

 * Affected configuration: noble with plymouth 24.004.60-1ubuntu7.1 and a
   terminal-less secondary DRM head present while the splash is up: either
   an NVIDIA multi-GPU / vt-less-kernel setup, or a DisplayLink/evdi device
   whose connector reports connected+enabled with an empty EDID.

 * Reproduce without the patch:
   1. Ensure "splash" is on the kernel command line.
   2. Boot with the affected head live during the splash.
   3. After boot, check for a plymouthd crash:
        coredumpctl list plymouthd
        coredumpctl info plymouthd
      The backtrace shows the crash in ply_terminal_set_disabled_input
      (here at +0x27 with rdi=0), reached via the drm renderer
      add_input_device -> watch_input_device path.

 * Reproducer used by the reporter: HP ZBook Studio x360 G5, Intel i915 +
   NVIDIA PRIME, StarTech USB32DP4K DisplayLink dongle (evdi) attached,
   dracut initramfs that force-loads evdi early so the empty-EDID head is
   live across switch_root. Booting with "splash" SIGSEGVs plymouthd as
   above.

 * Verify with the -proposed package: install plymouth 24.004.60-1ubuntu7.2
   from noble-proposed, regenerate the initramfs, and boot the same way.
   Expected: the splash completes and no new plymouthd coredump appears
   (coredumpctl list plymouthd shows no new entry for the boot). The
   reporter confirmed this with a local rebuild of the same two patches.

[Where problems could occur]

 * The change adds NULL guards (~7 lines) in two places: the
   set_disabled/_unbuffered/_buffered_input functions in
   src/libply-splash-core/ply-terminal.c, and close_input_source() in the
   drm and frame-buffer renderers (src/plugins/renderers/*/plugin.c).

 * If the guards were wrong, the failure mode would be input not being
   toggled on a backend that legitimately has a terminal, i.e. the
   terminal-input enable/disable on the primary console head being skipped.
   In practice that would show up as the splash keyboard path
   misbehaving -- the encrypted-disk passphrase prompt not accepting input,
   or echo/no-echo being wrong at the prompt. Regression testers should
   confirm the LUKS passphrase entry at the splash still accepts input with
   correct echoing, and that the splash hands off to the display manager
   cleanly, on a normal single-GPU machine with no secondary head as well
   as on the affected configuration.

 * The change only puts a NULL check in front of the existing
   terminal-input calls; it adds no new code paths. The same two commits are
   shipping in plucky (24.004.60-2ubuntu6) and Fedora without reported
   regressions, which lowers the risk but does not remove it, hence the
   passphrase-input check above.

[Other Info]

 * Upstream: commits 63597f92d108237a3ab7d2343a602a95edddd4e5 and
   5c10072a978dd7566559f44a54c3e031bb4cb216 (freedesktop issue 288). The
   first upstream tag containing them is 26.134.222. Related: Red Hat bz
   2350956, apport duplicates LP #2104358 and LP #2104360, earlier
   LP #2060086.

 * Development-release-first: the fix is Fix Released in plucky
   (24.004.60-2ubuntu6) and in Fedora. I have not confirmed which plymouth
   version the current development release ships or the state of its task,
   so please confirm the development-release task is Fix Released (or get
   it fixed there first) before this SRU is accepted.

 * The evdi empty-EDID head is an additional reproducer beyond the
   NVIDIA multi-GPU trigger framed in the public reports; it is the same
   NULL-terminal bug reached from a terminal-less head.

 * I have no upload rights. The debdiff is attached and ubuntu-sponsors is
   subscribed for upload to noble-unapproved. I have not set the task to
   In Progress; the sponsor uploads and sets status. I can run the Test
   Plan against the -proposed build and mark verification-done-noble.

** Patch added: "lp2103533-noble.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/2103533/+attachment/5978994/+files/lp2103533-noble.debdiff

https://bugs.launchpad.net/bugs/2103533

Title:
  plymouth crashes with SIGSEGV in ply_terminal_set_disabled_input()
  from open_input_source() [drm.so] from
  ply_renderer_open_input_source()
