On Mon, May 11, 2026 at 12:33:28PM +0200, Anthony Iliopoulos wrote:
On Sat, May 09, 2026 at 08:34:11AM -0400, Sasha Levin wrote:
On Sat, May 09, 2026 at 02:02:24PM +0200, Florian Weimer wrote:
> * Sasha Levin:
>
> > When a kernel (security) issue goes public, fleets stay exposed until a
> > patched kernel is built, distributed, and rebooted into.
> >
> > For many such issues the simplest mitigation is to stop calling the buggy
> > function. Killswitch provides that. An admin writes:
> >
> >     echo "engage af_alg_sendmsg -1" \
> >         > /sys/kernel/security/killswitch/control
> >
> > After this, af_alg_sendmsg() returns -EPERM on every call without
> > running its body. The mitigation takes effect immediately, and is dropped on
> > the next reboot -- by which point a patched kernel is hopefully in place.
>
> Do you expect this to be safe to enable in kernel lockdown mode (i.e.,
> with typical Secure Boot configurations in distributions)?

Yes: under lockdown, killswitch has to be configured on the cmdline. Runtime
engage is gated on the new LOCKDOWN_KILLSWITCH reason.

Basically this proposal allows for any function to be overridden on a
production kernel as long as no lockdown level is enabled, which is quite
dangerous.

Assuming this is acceptable (which I am not sure it should be), then this
is equivalent to the existing error injection code that we already have in
the kernel (CONFIG_FAIL_FUNCTION) minus the explicit whitelisting on a per
function basis required to permit injection.

The mechanism is the same, but I don't think reusing fail_function works for
what killswitch is trying to do.

Given that this achieves the exact same result, then why don't we consider
simply removing the whitelisting restriction from fail_function altogether
and use that instead? The only thing missing then would be the boot param
parsing and setup.

fail_function lives in debugfs, and on a typical Secure Boot distro debugfs is
itself blocked by LOCKDOWN_DEBUGFS at integrity level. Dropping the whitelist
doesn't help when the operator can't write to the file in the first place.
Killswitch is in securityfs so that engaging it can be its own lockdown
decision rather than being lumped in with everything debugfs exposes.
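
For comparison, here is a minimal sketch of what the fail_function route asks
of an operator today, following the steps in the kernel's fault-injection
documentation. It assumes CONFIG_FAIL_FUNCTION is enabled, debugfs is mounted
and writable (so no lockdown), and the target function is tagged with
ALLOW_ERROR_INJECTION. The ff_engage helper name and the FF_DIR override are
made up for illustration; they exist only so the steps can be dry-run against
a scratch directory.

```shell
#!/bin/sh
# Sketch: arm fail_function so one function returns an errno on every call.
# FF_DIR is a hypothetical override so the steps can be tried on a scratch dir.
FF_DIR="${FF_DIR:-/sys/kernel/debug/fail_function}"

# ff_engage FUNC ERRNO  (hypothetical helper, not an existing tool)
ff_engage() {
    func="$1"     # must appear in /sys/kernel/debug/error_injection/list
    errno="$2"    # negative errno, e.g. -12 for -ENOMEM

    # Register the function; the kernel then creates a per-function directory.
    echo "$func" > "$FF_DIR/inject" || return 1

    # Set the value the overridden function will return (written as hex).
    printf '%#x' "$errno" > "$FF_DIR/$func/retval" || return 1

    # Fail every call, for every task, with no limit on the number of hits.
    echo N   > "$FF_DIR/task-filter"
    echo 100 > "$FF_DIR/probability"
    echo 0   > "$FF_DIR/interval"
    echo -1  > "$FF_DIR/times"
}
```

Usage would look like "ff_engage open_ctree -12" for a function that is
already on the whitelist. Every write above lands in debugfs, which is the
part that LOCKDOWN_DEBUGFS blocks.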

Fault injection in general isn't enabled on production kernels - having to
enable CONFIG_FUNCTION_ERROR_INJECTION would drag that entire infrastructure
into kernels that don't need it.

This way we'll be removing a few hundred lines of code instead of adding
more duplication, while enabling the same functionality.

I'm not even sure there would be hundreds of lines saved here...

The pieces that make killswitch what it is (cmdline parser,
LOCKDOWN_KILLSWITCH, TAINT_KILLSWITCH, audit on engage and disengage, the
module-unload notifier, etc.) add up to roughly 200 lines that would move into
fail_function unchanged. I really don't think we'd end up with much of a line
delta.

That said, the kprobe and override machinery underneath both of these is fair
game for a shared helper that fail_function and killswitch both build on. We can
look at extracting that as a follow-up once killswitch lands, but it's a
separate piece of work from the policy questions in this thread.

--
Thanks,
Sasha
