Wiki - https://fedoraproject.org/wiki/Changes/EnableDrmPanic
Discussion Thread -
https://discussion.fedoraproject.org/t/f42-change-proposal-enable-drm-panic-system-wide/125542

This document represents a proposed Change for Fedora Linux. As part of
the Changes process, proposals are publicly announced in order to
receive community feedback. This proposal will only be implemented if
approved by the Fedora Engineering Steering Committee.

== Summary ==
Drm_panic is a new feature in the Linux kernel that displays a panic
screen when a kernel panic occurs. This proposal is to enable
DRM_PANIC in the Fedora kernel, to improve the kernel panic user
experience.


== Owner ==

* Name: [[User:jfalempe| Jocelyn Falempe]], [[User:Javierm| Javier
Martinez Canillas]]

* Email: <jfale...@redhat.com>, <javi...@redhat.com>


== Detailed Description ==
When the Linux kernel panics in Fedora 40, in most cases the screen
just freezes.
If you're in a VT console, you'll be able to see the kernel debug
information, but that is pretty hard to understand for users who are
not kernel developers.
With this feature, they will instead see a message saying that the
computer has crashed and that they need to reboot it.
Drm_panic was introduced in kernel v6.10, but is still under
active development.

In order to enable DRM_PANIC, you need to disable VT_CONSOLE in the
kernel. This prevents a race condition: if you are in a VT console
when the panic occurs, both fbcon and drm_panic write to the
framebuffer at the same time, leading to corrupted output.
https://patchwork.freedesktop.org/series/134831/
The drawback is that tty0 won't show the kernel kmsg, which can make
boot issues harder to debug. But plymouth already takes care of this,
and can display the boot kmsg when no VT console is present.
https://gitlab.freedesktop.org/plymouth/plymouth/-/merge_requests/224
The user experience should also be better, because plymouth has better
font and color support than fbcon.
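
To check whether a given kernel was built with these two options set as
described, something like the following might help. It is only a sketch:
it assumes the kernel config is installed as /boot/config-$(uname -r),
and check_drm_panic_config is a made-up helper name, not an existing
tool.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel config file enables
# DRM_PANIC and whether VT_CONSOLE is still enabled.
# Usage: check_drm_panic_config [config-file]
check_drm_panic_config() {
    # Assumption: the running kernel's config lives in /boot.
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 2
    fi
    if grep -q '^CONFIG_DRM_PANIC=y' "$cfg"; then
        echo "DRM_PANIC: enabled"
    else
        echo "DRM_PANIC: not enabled"
    fi
    # A disabled option shows up as "# CONFIG_VT_CONSOLE is not set".
    if grep -q '^CONFIG_VT_CONSOLE=y' "$cfg"; then
        echo "VT_CONSOLE: enabled"
    else
        echo "VT_CONSOLE: not enabled"
    fi
}
```

Called without an argument, it inspects the config of the running
kernel; a path to any other config file can be passed instead.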

Supported drivers are simpledrm, mgag200 and ast (plus imx and tidss on
aarch64). I'm working on nouveau support, and I hope i915 and amdgpu
will add support too.
If the driver is not supported, you won't see the panic screen, but it
won't be worse than what you have today.

Drm panic provides different panic screens. The default is "user",
which displays a simple, friendly message telling the user to
reboot the computer. Kernel developers can instead set it to
"kmsg" to see the last kmsg lines (so this is equivalent to the
current fbcon). You can select the panic screen in Kconfig, as a
module parameter (drm.panic_screen=user), or at runtime with "echo -n
kmsg > /sys/module/drm/parameters/panic_screen".
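
The runtime switch can be wrapped in two small helpers, sketched below.
The function names and the PANIC_SCREEN_PARAM variable are hypothetical;
only the sysfs path and the "user" and "kmsg" values come from the text
above. Writing to the real parameter file needs root.

```shell
#!/bin/sh
# Path of the module parameter; overridable so the helpers can be
# tried against a scratch file first (PANIC_SCREEN_PARAM is made up).
PANIC_SCREEN_PARAM="${PANIC_SCREEN_PARAM:-/sys/module/drm/parameters/panic_screen}"

# Print the currently selected panic screen.
get_panic_screen() {
    cat "$PANIC_SCREEN_PARAM"
}

# Select a panic screen, accepting only the values named above.
set_panic_screen() {
    case "$1" in
        user|kmsg) ;;
        *) echo "unsupported panic screen: $1" >&2; return 1 ;;
    esac
    # printf '%s' writes no trailing newline, like "echo -n".
    printf '%s' "$1" > "$PANIC_SCREEN_PARAM"
}
```

For example, "set_panic_screen kmsg" followed by "get_panic_screen"
should report kmsg on a kernel that supports that screen.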

I've also made a proof of concept to add a panic screen with a QR code
containing debugging information, which will make it easier for users
to report kernel panics in Fedora. An example can be seen here:
https://github.com/kdj0c/panic_report/issues/1

== Feedback ==


== Benefit to Fedora ==
This change will improve the user experience when a kernel panic occurs.

It's also a first step towards switching to a userspace console and
being able to disable CONFIG_VT in the kernel.
VT and fbcon are legacy parts of the kernel; disabling them would
reduce the maintenance burden.
It would also reduce CVE impact, as userspace vulnerabilities are
usually less critical.

== Scope ==
* Proposal owners:

Write documentation on how to debug boot issues without VT_CONSOLE.
Maybe also change the systemd log configuration, so that it defaults to
writing the log to the console.

* Other developers:

* Release engineering: [https://pagure.io/releng/issues #Releng issue number]

I'm unsure if it has impact on the installer.

* Policies and guidelines: N/A (not needed for this Change)

* Trademark approval: N/A (not needed for this Change)

* Alignment with the Fedora Strategy:
I think it perfectly fits the "Fedora is for everyone" goal, as the
current kernel panic (either UI freeze or kmsg output in VT) is not
user-friendly.

== Upgrade/compatibility impact ==
Enabling DRM_PANIC should be transparent to users, but disabling
VT_CONSOLE may have a visible impact.
Fortunately, since Fedora 40, plymouth is able to display the kmsg messages.

For non-graphical boot, you can use systemd.log_target=console
systemd.log_level=info and remove rhgb and quiet to see the kernel
boot messages.
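
As an illustration of that command-line change, here is a small sketch
that rewrites a kernel command line string accordingly. debug_cmdline
is a hypothetical helper; it only transforms text and does not touch
the bootloader configuration.

```shell
#!/bin/sh
# Hypothetical helper: print the given kernel command line with rhgb
# and quiet removed and the systemd console logging options appended.
debug_cmdline() {
    out=""
    # Intentional word splitting: one kernel argument per iteration.
    for arg in $1; do
        case "$arg" in
            rhgb|quiet) ;;  # drop the graphical/quiet boot switches
            systemd.log_target=*|systemd.log_level=*) ;;  # re-added below
            *) out="${out:+$out }$arg" ;;
        esac
    done
    echo "${out:+$out }systemd.log_target=console systemd.log_level=info"
}

# Example: show what the running kernel's command line would become.
# debug_cmdline "$(cat /proc/cmdline)"
```

To make such a change persistent on Fedora, one option (an assumption
about tooling, not part of the proposal) is grubby, for instance
grubby --update-kernel=ALL --remove-args="rhgb quiet"
--args="systemd.log_target=console systemd.log_level=info".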

But this needs to be documented and communicated, so that users who
debug boot issues know about this change.



== Early Testing (Optional) ==
Do you require 'QA Blueprint' support? Y/N

== How To Test ==
Currently the easiest way to test is to use the simpledrm driver, as
it can run on all hardware. So first blacklist your driver (i915,
amdgpu or nouveau), then boot and check that you're using
simpledrm.
Then you can trigger a kernel panic with:
echo c > /proc/sysrq-trigger

As this will crash your machine, it's also possible to do this in a VM
(in that case, disable virtio-gpu or vmwgfx).
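
For the "check that you're using simpledrm" step, a sketch like the one
below lists the kernel driver bound to each DRM card through sysfs.
list_drm_drivers is a made-up name, and the layout under /sys/class/drm
is assumed to be the usual one. As far as I know, simpledrm registers
its platform driver under the name simple-framebuffer, so that is the
name I would expect to see here rather than simpledrm.

```shell
#!/bin/sh
# Hypothetical helper: print "cardN: driver" for each DRM card.
# Usage: list_drm_drivers [sysfs-drm-dir]
list_drm_drivers() {
    drm_root="${1:-/sys/class/drm}"
    for card in "$drm_root"/card[0-9]*; do
        # Skip connector entries such as card0-eDP-1.
        case "${card##*/}" in
            *-*) continue ;;
        esac
        # Skip cards without a bound driver (and unmatched globs).
        [ -e "$card/device/driver" ] || continue
        # The driver entry is a symlink; its target's name is the driver.
        driver=$(basename "$(readlink -f "$card/device/driver")")
        echo "${card##*/}: $driver"
    done
}
```

Running "list_drm_drivers" before triggering the panic makes it easy to
confirm that i915, amdgpu or nouveau is no longer bound.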

Also, to check that you can still see the kernel messages at boot,
remove the "quiet" kernel command-line argument in the grub menu; you
should still see the kernel boot messages on the plymouth screen.


== User Experience ==
With DRM panic, users will be notified that their computer crashed,
instead of it being unresponsive.

With v6.10, it's only available for a few GPU drivers (simpledrm,
mgag200, ast), but with simpledrm it will already catch some common
kernel panic cases, like root filesystem not found or ramdisk
corruption (simpledrm is used at boot, and is later replaced with
i915/amdgpu/nouveau ...).

It also prepares for future drm panic improvements, like a kmsg
panic screen (should be available in v6.11) or better
debugging information using a QR code. A test sample is shown at
https://github.com/kdj0c/panic_report/issues/1

== Dependencies ==
The main dependency is a kernel v6.10 or later.
To still see the kernel boot messages, there is also a dependency on
plymouth and systemd, but the versions in F40 are already sufficient.


== Contingency Plan ==
* Contingency mechanism: Revert the kernel configuration changes.
* Contingency deadline: N/A (not a System Wide Change)
* Blocks release? N/A


== Documentation ==
Kernel Kconfig for DRM_PANIC:
https://elixir.bootlin.com/linux/v6.10-rc7/source/drivers/gpu/drm/Kconfig#L107

== Release Notes ==

-- 
Aoife Moloney

Fedora Operations Architect

Fedora Project

Matrix: @amoloney:fedora.im

IRC: amoloney

-- 
_______________________________________________
devel-announce mailing list -- devel-annou...@lists.fedoraproject.org
To unsubscribe send an email to devel-announce-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel-annou...@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue