On 09.02.2021 22:14, Julien Grall wrote:
> On Tue, 9 Feb 2021 at 20:28, Paul Durrant <xadimg...@gmail.com> wrote:
>>> From: Julien Grall <jul...@xen.org>
>>> Sent: 09 February 2021 15:28
>>>
>>> It is a bit pointless to crash a domain that is already dying. This will
>>> become more an annoyance with a follow-up change where page-table
>>> allocation will be forbidden when the domain is dying.
>>>
>>> Security wise, there is no change as the devices would still have access
>>> to the IOMMU page-tables even if the domain has crashed until Xen
>>> start to relinquish the resources.
>>>
>>> For x86, we rely on dom_iommu(d)->arch.mapping.lock to ensure
>>> d->is_dying is correctly observed (a follow-up patch will held it in the
>>> relinquish path).

Am I to understand this to mean that at this point of the series
things aren't really correct yet in this regard? If so, wouldn't
it be better to re-order?

>>> For Arm, there is still a small race possible. But there is so far no
>>> failure specific to a domain dying.
>>>
>>> Signed-off-by: Julien Grall <jgr...@amazon.com>
>>>
>>> ---
>>>
>>> This was spotted when trying to destroy IOREQ servers while the domain
>>> is dying. The code will try to add the entry back in the P2M and
>>> therefore update the P2M (see arch_ioreq_server_disable() ->
>>> hvm_add_ioreq_gfn()).
>>>
>>> It should be possible to skip the mappin in hvm_add_ioreq_gfn(), however
>>> I didn't try a patch yet because checking d->is_dying can be racy (I
>>> can't find a proper lock).

I understand the concern. I find it odd though that we permit
iommu_map() to do anything at all when the domain is already
dying. So irrespective of the remark below, how about bailing
from iommu_map() earlier when the domain is dying?

>>> --- a/xen/drivers/passthrough/iommu.c
>>> +++ b/xen/drivers/passthrough/iommu.c
>>> @@ -272,7 +272,7 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
>>>                              flush_flags) )
>>>                  continue;
>>>
>>> -        if ( !is_hardware_domain(d) )
>>> +        if ( !is_hardware_domain(d) && !d->is_dying )
>>>              domain_crash(d);
>>
>> Would it make more sense to check is_dying inside domain_crash() (and turn 
>> it into a no-op in that case)?
> 
> Jan also suggested moving the check in domain_crash(). However, I felt
> it is potentially a too risky change for 4.15 as there are quite a few
> callers.

This is a fair point. However, in such a case I'd prefer symmetry
at least throughout this one source file (there are three more
places), unless there are strong reasons against doing so.

Jan

Reply via email to