Daniel P. Berrangé <[email protected]> writes:

> Our docs/system/security.rst file loosely classifies code into that
> applicable for 'virtualization' vs 'non-virtualization' use cases.
> Only code relevant to the former group is eligible for security
> bug handling. Peter's recent proposal pointed out that we are
> increasingly hitting the limits of such a crude classification:
>
>   https://lists.nongnu.org/archive/html/qemu-devel/2025-09/msg01520.html

Yes, we do.

> Michael suggested that with the increased complexity, docs are not
> going to be an effective way to convey the information, and we
> need to re-consider embedding this info in code:
>
>   https://lists.nongnu.org/archive/html/qemu-devel/2025-09/msg01566.html
>
> This also allows users to validate a configuration's security status
> when starting a guest, or modifying a running guest. This series is
> an attempt to start the embedding process.

I like the idea.

We have a long list of configuration choices that might / are known to
punch holes into security boundaries.  Documenting them is entirely
inadequate; telling users who got p0wned it's their own fault for having
missed this particular drop in the sea of QEMU documentation reminds me
of Douglas Adams' “Beware of The Leopard”.

And we don't have even that!  Just handwavy talk about a "virtualization
use case".

We can and should do better.

> It starts with QOM, adding "bool secure" and "bool insecure"
> properties to the TypeInfo struct, which get turned into flags
> on the Type struct. This enables querying any ObjectClass to
> ask whether or not it is declared secure or insecure.

We should clearly document what "declared secure" actually means.
Here's my attempt at it: supported for use cases that require certain
security boundaries.

> By default no statement will be made about whether a class is
> secure or insecure, reflecting our historical defaults. Over
> time we should annotate as many classes as possible with an
> explicit statement.
>
> The "-machine" argument gains two new parameters
>
>   * prohibit-insecure=yes|no  - a weak security boundary, only
>     excluding stuff that is explicitly declared insecure,
>     permitting stuff that is secure & anything without a statement

This isn't what users need.

>   * require-secure=yes|no - a strong security boundary, only
>     permitting stuff that is explicitly declared secure,
>     excluding insecure stuff & anything without a statement

This would be, if it covered everything accessible at the security
boundaries.  It doesn't for now: only QOM.

It might still be better than nothing.

However, it may well be unusable until enough of QOM is declared secure.

What would our advice to users be?  I'm afraid something complicated and
impermanent like "try require-secure=yes, and if you can't make it work
because parts of QOM you can't do without are still undeclared, fall
back to prohibit-insecure=yes, and be aware this avoids only some, but
not all security boundary death traps in either case."
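
To make the semantics concrete, here's a rough Python model of how I
read the two switches.  It's a sketch, not the series' code: the names
Decl and check_type() are mine, the type names other than 'tcg-accel'
are for illustration only, and the message for the require-secure case
is invented.  Only the "is declared as insecure" message comes from the
cover letter.

```python
from enum import Enum
from typing import Optional


class Decl(Enum):
    """What a QOM type says about itself (hypothetical model)."""
    SECURE = "secure"
    INSECURE = "insecure"
    UNSPECIFIED = "unspecified"  # no statement, the historical default


def check_type(name: str, decl: Decl,
               prohibit_insecure: bool = False,
               require_secure: bool = False) -> Optional[str]:
    """Return an error message if the policy rejects the type, else None."""
    # Both switches reject anything explicitly declared insecure.
    if decl is Decl.INSECURE and (prohibit_insecure or require_secure):
        return f"Type '{name}' is declared as insecure"
    # Only require-secure also rejects types that make no statement.
    # (Message text is an assumption, not taken from the series.)
    if decl is not Decl.SECURE and require_secure:
        return f"Type '{name}' is not declared as secure"
    return None


if __name__ == "__main__":
    for name, decl in [("kvm-accel", Decl.SECURE),
                       ("tcg-accel", Decl.INSECURE),
                       ("qtest-accel", Decl.UNSPECIFIED)]:
        weak = check_type(name, decl, prohibit_insecure=True)
        strong = check_type(name, decl, require_secure=True)
        print(f"{name}: prohibit-insecure -> {weak or 'ok'}; "
              f"require-secure -> {strong or 'ok'}")
```

The undeclared row is where the two switches disagree, and that's the
row most of QOM sits in today.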

This is an awful user interface.  But it's also a step towards the user
interface we want: a single, unchanging switch that ensures you're
running something that's fully supported for use cases that require
certain security boundaries.

A next step could be getting enough of QOM declared so we can move to a
single switch, with the (hopefully temporary) caveat about "only QOM".

We should clearly and prominently document the limitations at each step.

> As illustration, I have added explicit annotations for many machine
> types, some accelerators, all NICs (all insecure except xen,
> e1000(e) and virtio), and all PCI virtio devices (all secure).
>
> Example: TCG is explicitly insecure, KVM is explicitly secure,
>          qtest has no statement:
>
>   $ qemu-system-x86_64 -display none -machine pc,prohibit-insecure=yes -accel tcg
>   qemu-system-x86_64: Type 'tcg-accel' is declared as insecure

[...]

> Some questions....
>
>   * Is using '-machine' the right place to express the policy ?

Not sure.  The guest boundary is just one of several security boundaries
listed in docs/system/security.rst.  Some of them aren't really about
the guest / the machine.

Maybe -compat?  It lets you exclude unstable or deprecated bits from the
user interface.  Feels similar to excluding insecure bits.

>   * Can we change '-accel help' to report 'secure' / 'insecure'
>     as we did for '-machine help' and '-device help'.

No idea, guess it's at worst a matter of shaving the yak?

>   * Should we have 'query-devices' for QMP to allow the 'secure'
>     or 'insecure' status to be queried for every device.
>
>   * Should we have 'query-accel' for QMP to allow the 'secure'
>     or 'insecure' status to be queried for every accelerator.

I recommend qom-list-types.  Covers all of QOM, not just devices and
accelerators.
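
Roughly what I have in mind, as a Python sketch.  The command and its
"implements" / "abstract" arguments exist today; the "secure" and
"insecure" members in the reply do NOT, they're my assumption of how
the status could be exposed.  The sample reply is hand-written, not
captured from a real QEMU.

```python
import json

# Real command, real arguments: list concrete subtypes of "accel".
QUERY = {
    "execute": "qom-list-types",
    "arguments": {"implements": "accel", "abstract": False},
}

# Hand-written sample reply.  "secure" / "insecure" are HYPOTHETICAL
# optional members; today's replies carry no such information.
SAMPLE_REPLY = """
{"return": [
    {"name": "kvm-accel",   "parent": "accel", "secure": true},
    {"name": "tcg-accel",   "parent": "accel", "insecure": true},
    {"name": "qtest-accel", "parent": "accel"}
]}
"""


def classify(reply: dict) -> dict:
    """Group type names by their (hypothetical) declared status."""
    groups = {"secure": [], "insecure": [], "unspecified": []}
    for info in reply["return"]:
        if info.get("secure"):
            status = "secure"
        elif info.get("insecure"):
            status = "insecure"
        else:
            status = "unspecified"  # absent members mean "no statement"
        groups[status].append(info["name"])
    return groups


if __name__ == "__main__":
    print(json.dumps(QUERY))
    print(classify(json.loads(SAMPLE_REPLY)))
```

Optional members that default to absent would keep the reply unchanged
for types that make no statement.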

>   * Should we enforce checks for -object & object_add too ?
>     Easy to add code for this, but do we need the ability to
>     exclude some object backends of dubious code quality ?
>
>   * Likewise for -chardev / -netdev / etc which are
>     conceptual specializations of -object

I lean towards all of QOM, no ifs, no buts.

>   * BlockDriver structs don't use QOM, so we can't mark
>     'vvfat' block backend as insecure

I think this is the biggest gap.

> The first one about '-machine' is probably the main blocker
> from a design POV. Other things are just potential future
> incremental work.
>
> This series has had only 1/2 a day's work / thought put into
> it, hence RFC status. It has been compiled and minimally tested
> with the examples shown above. I have not pushed this through
> CI nor considered tests yet. Still it gives a good illustration
> of what's involved in recording security info in code.

Thanks for that!

