On Mon, May 11, 2026 at 07:15:10AM -0400, Sasha Levin wrote:
> On Mon, May 11, 2026 at 12:33:28PM +0200, Anthony Iliopoulos wrote:
> > On Sat, May 09, 2026 at 08:34:11AM -0400, Sasha Levin wrote:
> > > On Sat, May 09, 2026 at 02:02:24PM +0200, Florian Weimer wrote:
> > > > * Sasha Levin:
> > > >
> > > > > > When a kernel (security) issue goes public, fleets stay exposed until
> > > > > > a patched kernel is built, distributed, and rebooted into.
> > > > >
> > > > > > For many such issues the simplest mitigation is to stop calling the
> > > > > > buggy function. Killswitch provides that. An admin writes:
> > > > >
> > > > >     echo "engage af_alg_sendmsg -1" \
> > > > >         > /sys/kernel/security/killswitch/control
> > > > >
> > > > > After this, af_alg_sendmsg() returns -EPERM on every call without
> > > > > > running its body. The mitigation takes effect immediately, and is
> > > > > > dropped on the next reboot -- by which point a patched kernel is
> > > > > > hopefully in place.
> > > >
> > > > Do you expect this to be safe to enable in kernel lockdown mode (i.e.,
> > > > with typical Secure Boot configurations in distributions)?
> > > 
> > > > Yes: under lockdown, killswitch has to be configured on the cmdline.
> > > > Runtime engage is gated on the new LOCKDOWN_KILLSWITCH reason.
> > 
> > Basically this proposal allows for any function to be overridden on a
> > production kernel as long as no lockdown level is enabled, which is quite
> > dangerous.
> > 
> > Assuming this is acceptable (which I am not sure it should be), then this
> > is equivalent to the existing error injection code that we already have in
> > the kernel (CONFIG_FAIL_FUNCTION) minus the explicit whitelisting on a per
> > function basis required to permit injection.
> 
> The mechanism is the same, but I don't think reusing fail_function works for
> what killswitch is trying to do.

How so? The kprobe handler is essentially the same. Setting the
whitelisting aside, it is currently possible to do:

echo af_alg_sendmsg > /sys/kernel/debug/fail_function/inject
echo 0xffffffffffffffff > /sys/kernel/debug/fail_function/af_alg_sendmsg/retval
echo 100 > /sys/kernel/debug/fail_function/probability
echo -1 > /sys/kernel/debug/fail_function/times

and that will return -EPERM, taint the kernel, and log a stack trace to
dmesg on every rejected call.
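
For illustration, the same four writes wrapped in a small helper. The
function name and the FAIL_ROOT override are hypothetical (not part of any
kernel interface), and the default path assumes debugfs is mounted at
/sys/kernel/debug:

```shell
#!/bin/sh
# Hypothetical helper: make <func> fail on every call via fail_function.
# FAIL_ROOT defaults to the usual debugfs location; point it at a scratch
# directory to dry-run the writes without touching a real kernel.
fail_function_engage() {
    func=$1
    retval=$2
    root=${FAIL_ROOT:-/sys/kernel/debug/fail_function}

    # Register the function; on a real kernel this creates $root/$func/.
    echo "$func" > "$root/inject" || return 1
    # 0xffffffffffffffff is -1 as a 64-bit value, i.e. -EPERM (EPERM == 1).
    echo "$retval" > "$root/$func/retval" || return 1
    echo 100 > "$root/probability" || return 1   # fail 100% of calls
    echo -1 > "$root/times" || return 1          # no cap on the number of failures
}

# Usage (as root, on a kernel with CONFIG_FAIL_FUNCTION):
#   fail_function_engage af_alg_sendmsg 0xffffffffffffffff
```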

> > Given that this achieves the exact same result, then why don't we consider
> > simply removing the whitelisting restriction from fail_function altogether
> > and use that instead? The only thing missing then would be the boot param
> > parsing and setup.
> 
> fail_function lives in debugfs, and on a typical Secure Boot distro debugfs is
> itself blocked by LOCKDOWN_DEBUGFS at integrity level. Dropping the whitelist
> doesn't help when the operator can't write to the file in the first place.

Agreed, for this to work fail_function would also need to parse boot
params similarly.

> Killswitch is in securityfs so that engaging it can be its own lockdown
> decision rather than being lumped in with everything debugfs exposes.

Sure, but when a kernel is locked down at integrity level, runtime use of
either solution is blocked anyway, so this makes no practical difference.
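
As a quick way to see which case a given machine is in, the active lockdown
level can be read from securityfs, where the current level is the bracketed
word. The helper name below is made up for this mail, and the optional file
argument exists only so it can be tried on a sample file:

```shell
#!/bin/sh
# Hypothetical helper: print the active lockdown level.
# The lockdown file lists all levels and brackets the active one,
# e.g. "none [integrity] confidentiality".
lockdown_level() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "${1:-/sys/kernel/security/lockdown}"
}

# Usage:
#   lockdown_level        # prints none, integrity, or confidentiality
```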

> Fault injection in general isn't enabled on production kernels - having to
> enable CONFIG_FUNCTION_ERROR_INJECTION will drag in that entire infra into
> kernels that don't need it.

CONFIG_FUNCTION_ERROR_INJECTION brings in very little code apart from the
override_function_with_return trampoline and lib/error-inject.c, and the
latter becomes obsolete once whitelisting is no longer needed.

Your proposal necessarily depends on FUNCTION_ERROR_INJECTION as well.

The only piece that would be missing, and that is not usually compiled in,
is CONFIG_FAIL_FUNCTION, which just implements the debugfs interface that
you are exposing via securityfs instead.

> > This way we'll be removing a few hundred lines of code instead of adding
> > more duplication, while enabling the same functionality.
> 
> I'm not even sure there would be hundreds of lines saved here...

I'm talking specifically about the whitelisting, which would essentially
be useless:

wc -l lib/error-inject.c include/asm-generic/error-injection.h \
      include/linux/error-injection.h
 246 lib/error-inject.c
  43 include/asm-generic/error-injection.h
  28 include/linux/error-injection.h
 317 total

plus a hundred or so annotations of ALLOW_ERROR_INJECT and a tiny bit of
image space savings from dropping that whitelist section from the binary.
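
The annotation count can be approximated from a source tree with something
like the following. The function name is made up for this mail, and since
the match also hits the macro definitions themselves, the result should be
treated as a rough upper bound:

```shell
#!/bin/sh
# Hypothetical helper: count lines that use ALLOW_ERROR_INJECT( in the
# .c and .h files under a tree (defaults to the current directory).
count_allow_error_inject() {
    grep -rF --include='*.c' --include='*.h' 'ALLOW_ERROR_INJECT(' "${1:-.}" | wc -l
}

# Usage, from the top of a kernel tree:
#   count_allow_error_inject .
```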

> The pieces that make killswitch what it is (cmdline parser,
> LOCKDOWN_KILLSWITCH, TAINT_KILLSWITCH, audit on engage and disengage, the
> module-unload notifier, etc) add up to roughly 200 lines that would move into
> fail_function unchanged. I really don't think we'd end up with much of a line
> delta.

All of that, apart from the cmdline parser, is already present in the
fault/error injection code, directly or indirectly. I can see the appeal
of having killswitch cleanly separated from everything else, but perhaps
changing the existing code is more approachable.

> That said, the kprobe and override machinery underneath both of these is fair
> game for a shared helper that fail_function and killswitch both build on. We
> can look at extracting that as a follow-up once killswitch lands, but it's a
> separate piece of work from the policy questions in this thread.

Sure, but my point is that if this is acceptable, then it follows that:

- whitelisting becomes irrelevant (even if fail_function remains
  separate), since the exact same capability will be exposed via the
  killswitch interface for all functions anyway, so there is no reason
  left to keep it to protect error injection

and subsequently:

- fail_function would become somewhat redundant, since the same
  functionality would be available via securityfs (or just bpf, which
  is already the case).

Regards,
Anthony
