On 17.05.2024 03:22, Daniel P. Smith wrote:
> On 5/16/24 02:41, Jan Beulich wrote:
>> On 14.05.2024 11:25, Jan Beulich wrote:
>>> On 03.04.2024 08:16, Jan Beulich wrote:
>>>> On 02.04.2024 19:06, Andrew Cooper wrote:
>>>>> The commit makes a claim without any kind of justification.
>>>>
>>>> Well, what does "have no business" leave open?
>>>>
>>>>> The claim is false, and the commit broke lsevtchn in dom0.
>>>>
>>>> Or alternatively lsevtchn was doing something that was never meant to work
>>>> (from Xen's perspective).
>>>>
>>>>>   It is also quite
>>>>> obvious from XSM_TARGET that it has broken device model stubdoms too.
>>>>
>>>> Why would that be "obvious"? What business would a stubdom have to look at
>>>> Xen's side of an evtchn?
>>>>
>>>>> Whether to return information about a xen-owned evtchn is a matter of 
>>>>> policy,
>>>>> and it's not acceptable to short circuit the XSM on the matter.
>>>>
>>>> I can certainly accept this as one possible view point. As in so many cases
>>>> I'm afraid I dislike you putting it as if it was the only possible one.
>>>>
>>>> In summary: The supposed justification you claim is missing in the original
>>>> change is imo also missing here then: What business would any entity in the
>>>> system have to look at Xen's side of an event channel? Back at the time, 3
>>>> people agreed that it's "none".
>>>
>>> You've never responded to this reply of mine, or its follow-up. You also
>>> didn't chime in on the discussion Daniel and I were having. I consider my
>>> objections unaddressed, and in fact I continue to consider the change to
>>> be wrong. Therefore it was inappropriate for you to commit it; it needs
>>> reverting asap. If you're not going to do so, I will.
>>
>> For the record: With Andrew having clarified that in his revert's
>> description:
>>
>> "The claim is false; it broke lsevtchn in dom0, a debugging utility which
>>   absolutely does care about all of the domain's event channels."
>>
>> "all of the domain's event channels" means to include event channels Xen
>> uses for its own, and merely puts in the domain's table for ease of
>> handling, I've agreed that - in the absence of any better debugging
>> means - the revert may stay in place. For context, my prior understanding
>> of the issue was that lsevtchn simply stops probing further channels when
>> getting back any kind of error from EVTCHNOP_status (which I continue to
>> think wants addressing there, by a future version of a patch that was
>> already sent). However, there are follow-on concerns I have:
>>
>> 1) In the discussion George claimed that exposing status information in
>> an uncontrolled manner is okay. I'm afraid I have to disagree, seeing
>> how a similar assumption by CPU designers has led to a flood of
>> vulnerabilities over the last 6+ years. Information exposure imo is never
>> okay, unless it can be _proven_ that absolutely nothing "useful" can be
>> inferred from it. (I'm having difficulty seeing how such a proof might
>> look like.)
> 
> Jan, I would agree with you that any resources exposed to the guest, 
> even if that resource is status information, should be behind an XSM 
> check. I would, however, have to disagree the position over proving 
> absolutely nothing "useful" can be inferred. Prior to spectre becoming 
> commonly aware, no one considered proving there needed to be protections 
> against out-of-order instruction execution being used to bypass branch 
> conditions.

Interesting perspective.

> Predicting the future will more often than not fail, but 
> with an XSM check in place, the dummy/default policy can just be updated 
> with the general safe decision and others can use FLASK to provide fine 
> grain access.

I have to admit I have difficulty seeing how one would adjust dummy to do
restrict things by default, at least as long as XSM_TARGET is used.

>> 2) Me pointing out that the XSM hook might similarly get in the way of
>> debugging, Andrew suggested that this is not an issue because any sensible
>> XSM policy used in such an environment would grant sufficient privilege to
>> Dom0. Yet that then still doesn't cover why DomU-s also can obtain status
>> for Xen-internal event channels. The debugging argument then becomes weak,
>> as in that case the XSM hook is possibly going to get in the way.
> 
> And this is where we have a fundamental difference. Event channels are 
> XSM labeled objects that are always connected to a guest. If they were 
> truly Xen-internal, then a guest would have no way to get 
> access/reference them.

And that's what I'd like to achieve. The fact that Xen puts these event
channels in the guest's table is a mere implementation detail. They could
as well be kept elsewhere, just that handling then would (likely) be more
cumbersome.

> Again, I never disagreed with whether the guest 
> should or should not be able to access them. I did ask for a better 
> explanation than, "has no business", which is a statement of opinion not 
> of fact or reason. The point is these event channels are a resource 
> attached to the guest

As per above - not really, they merely appear like that.

> and access control decisions to them should be 
> made under XSM. Injecting a hard-coded access control in front of the 
> XSM check subverted/invalidated rules in the existing FLASK policy.

I'd agree if these were truly guest resources. Elsewhere I pointed at
the equivalent in memory management: Xen-internal memory also isn't
protected by XSM checks. It's inaccessible for entirely different reasons.

>> 3) In the discussion Andrew further gave the impression that evtchn_send()
>> had no XSM check. Yet it has; the difference to evtchn_status() is that
>> the latter uses XSM_TARGET while the former uses XSM_HOOK. (Much like
>> evtchn_status() may indeed be useful for debugging, evtchn_send() may be
>> similarly useful to allow getting a stuck channel unstuck.)
> 
> A more appropriate default might be XSM_OTHER with conditions with in 
> the hook  as to whether the policy should be XSM_HOOK, XSM_TARGET or 
> flat denial.

That you mean for send, status, or both?

>> In summary I continue to think that an outright revert was inappropriate.
>> DomU-s should continue to be denied status information on Xen-internal
>> event channels, unconditionally and independent of whether dummy, silo, or
>> Flask is in use.
> 
> Any guest facing resources, regardless if the backing end is the 
> hypervisor, should be protected with XSM. This allows the maintainers to 
> codify what they believe is a sane policy in the dummy policy and the 
> end user can use FLASK to enforce what they believe is acceptable.

I continue to be under the impression that either I don't look at things
correctly, or you don't. What's "guest facing" here? Xen-internal channels
aren't guest facing aiui. Their remote end is, in Dom0 or a stubdom. The
Xen side of it isn't guest facing at all; it's merely a vehicle to ease
handling that they're put in the guest's table.

>> Plus, as stated before, evtchn_send() would better remain in sync in this
>> regard with evtchn_status(). The situation is less clear for evtchn_close()
>> and evtchn_bind_vcpu(): Those indeed have no XSM checks while they do deny
>> operation on Xen-internal channels. It is likely more far fetched to
>> assume permitting them for debugging purposes might in fact help in rare
>> situations. Yet it may still be a matter of consistency to keep them in
>> sync with the other two. (Note that there's also evtchn_usable(), which
>> might then also need tweaking itself or at the use sites.)
> 
> Just because that is how it was, doesn't mean it was correct. I had a 
> discussion with one of the original authors of FLASK before taking up 
> the maintainership and he felt there were likely more XSM checks that 
> should have been in place originally. He considered it a first order 
> approximation of what should be protected.

I'm afraid it's not really clear to me what to take from this. Are you
suggesting that further XSM checks may be wanted? If so, where, and who
would take care of adding them?

Jan

Reply via email to