On 10/21/20 10:34 AM, Hongyan Xia wrote:
Hi,
A while ago there was a quick chat on IRC about how XSM interacts with
the idle domain. The conversation did not reach any clear conclusions
so it might be a good idea to summarise the questions in an email.
Basically there were two questions in that conversation:
1. In its current state, are security modules able to limit what the
idle domain can do?
Yes, in the sense that the idle domain has a type and you can constrain
which actions that type is allowed to perform. In reality, though, the
idle domain is given the same type as the hypervisor itself and thus
must retain the ability to perform certain actions.
2. Should security modules be able to restrict the idle domain?
IMHO this question should be reversed to ask whether the actions the
idle domain is being used for are appropriate from a security point of
view. AIUI the idle domain is a mechanism for the scheduler to use as a
place to schedule an idle vcpu. And yes, I understand that some limited
work is done there, e.g. memory scrubbing, but 1.) there is a difference
between light/limited work that can be done within the confines of a
domain and work requiring hypercalls, and 2.) this precedent may have
been set due to limitations rather than being the necessarily correct
approach.
The first question came up during ongoing work in LiveUpdate. After an
LU, the next Xen needs to restore all domains. To do that, some
hypercalls need to be issued from the idle domain context and
apparently XSM does not like it. We need to introduce hacks in the
dummy module to leave the idle domain alone. Our work is not compiled
with CONFIG_XSM at all, but with CONFIG_XSM, are we able to enforce
security policies against the idle domain? Of course, without any LU
work this does not make any difference because the idle domain does not
do any useful work to be restricted anyway.
Why do they "need to be issued from the idle domain"? As was suggested
by Jason, why isn't this done from a construction domain context? I will
interject here that with DomB that is what we will be doing and it
sounds like LiveUpdate is very similar to the relaunch concept that DomB
is being constructed to support.
Yes, XSM did not like it, because what is being done is analogous to
trying to make a system call from inside an OS kernel. Again, AIUI the
idle domain is not a real domain but an internal construct for the
scheduler to manage idle vcpus, and attempting to make hypercalls from
it is in fact attempting to turn it into a full-fledged domain.
From a security perspective, if hacks to the XSM hooks are necessary to
make something work then it is highly recommended to take a step back
and ask why and whether you are doing something that is not safe from a
security perspective.
Also, should the idle domain be restricted? IMO the idle domain is Xen
itself which mostly bootstraps the system and performs limited work
when switched to, and is not something a user (either dom0 or domU)
directly interacts with. I doubt XSM was designed to include the idle
domain (although there is an ID allocated for it in the code), so I
would say just exclude idle in all security policy checks.
The idle domain is a limited, internal construct within the hypervisor
and should be constrained as part of the hypervisor, which is why its
domain id gets labeled with the same label as the hypervisor. For this
reason I would wholeheartedly disagree with exempting the idle domain id
from XSM hooks as that would effectively be saying the core hypervisor
should not be constrained. The purpose of the XSM hooks is to control
the flow of information in the system in a non-bypassable way. Codifying
bypasses completely subverts the security model behind XSM, on which the
Flask security server depends.
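
To make the difference concrete, here is a toy sketch in C. It is not
Xen code: every name in it (toy_domain, toy_allow, toy_check, and so
on) is hypothetical, and the two-permission policy table is made up for
illustration. It only contrasts a check that always consults the policy
with one that special-cases the idle domain.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical label types; not taken from any real Flask policy. */
enum toy_type { TOY_T_XEN, TOY_T_DOM0, TOY_T_DOMU, TOY_T_COUNT };

/* Hypothetical permissions, one bit each. */
enum toy_perm { TOY_P_SCRUB = 1 << 0, TOY_P_CREATE = 1 << 1 };

struct toy_domain {
    int id;
    enum toy_type type;   /* the idle domain shares the hypervisor type */
    bool is_idle;
};

/* toy_allow[source][target] is the set of permitted actions. */
static const unsigned int toy_allow[TOY_T_COUNT][TOY_T_COUNT] = {
    [TOY_T_XEN]  = { [TOY_T_XEN]  = TOY_P_SCRUB  },
    [TOY_T_DOM0] = { [TOY_T_DOMU] = TOY_P_CREATE },
};

/* Non-bypassable variant: every caller goes through the policy table. */
static bool toy_check(const struct toy_domain *src,
                      const struct toy_domain *tgt, unsigned int perm)
{
    assert(src->type < TOY_T_COUNT && tgt->type < TOY_T_COUNT);
    return (toy_allow[src->type][tgt->type] & perm) == perm;
}

/* Bypass variant: the idle domain is exempted before policy is consulted. */
static bool toy_check_with_bypass(const struct toy_domain *src,
                                  const struct toy_domain *tgt,
                                  unsigned int perm)
{
    if (src->is_idle)
        return true;      /* policy can no longer say anything here */
    return toy_check(src, tgt, perm);
}
```

In this toy policy the idle domain may scrub within the hypervisor type
but may not create a domU. With the bypass variant the same create
request is allowed no matter what the policy says, which is the concern
with exempting the idle domain id.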
I may have missed some points in that discussion, so please feel free
to add.
Hongyan
V/r,
DPS