Re: [PATCH 3/6] VT-d: don't leak domid mapping on error path

2021-11-15 Thread Jan Beulich
On 12.11.2021 15:35, Roger Pau Monné wrote:
> On Fri, Nov 12, 2021 at 02:45:14PM +0100, Jan Beulich wrote:
>> On 12.11.2021 14:42, Roger Pau Monné wrote:
>>> On Fri, Nov 12, 2021 at 10:48:43AM +0100, Jan Beulich wrote:
 While domain_context_mapping() invokes domain_context_unmap() in a sub-
 case of handling DEV_TYPE_PCI when encountering an error, thus avoiding
 a leak, individual calls to domain_context_mapping_one() aren't
 similarly covered. Such a leak might persist until domain destruction.
 Leverage that these cases can be recognized by pdev being non-NULL.
>>>
>>> Would it help to place the domid cleanup in domain_context_unmap_one,
>>> as that would then cover calls from domain_context_unmap and the
>>> failure path in domain_context_mapping_one.
>>
>> I don't think that would work (without further convolution), because of
>> the up to 3 successive calls in DEV_TYPE_PCI handling. Cleanup may happen
>> only on the first map's error path or after the last unmap.
> 
> Hm, I see. And AFAICT that's because some devices that get assigned to
> a guest iommu context are not actually assigned to the guest (ie:
> pdev->domain doesn't get set, neither the device is added to the
> per-domain list), which makes them invisible to
> any_pdev_behind_iommu.
> 
> I dislike that the domid is added in domain_context_mapping_one, while
> the cleanup is not done in domain_context_unmap_one, and that some
> devices context could be using the domain id without being assigned to
> the domain.

This all isn't really pretty, is it? As Andrew has been saying (I think
more than once), ideally we'd rewrite IOMMU code from scratch. But I
don't see anyone having enough spare time to actually do so ...

Jan




RE: [PATCH 3/6] VT-d: don't leak domid mapping on error path

2021-11-14 Thread Tian, Kevin
> From: Jan Beulich 
> Sent: Friday, November 12, 2021 5:49 PM
> 
> While domain_context_mapping() invokes domain_context_unmap() in a
> sub-
> case of handling DEV_TYPE_PCI when encountering an error, thus avoiding
> a leak, individual calls to domain_context_mapping_one() aren't
> similarly covered. Such a leak might persist until domain destruction.
> Leverage that these cases can be recognized by pdev being non-NULL.
> 
> Fixes: dec403cc668f ("VT-d: fix iommu_domid for PCI/PCIx devices
> assignment")
> Signed-off-by: Jan Beulich 

Reviewed-by: Kevin Tian 

> ---
> The Fixes: tag isn't strictly correct, as error handling had more severe
> shortcomings at the time. But I wouldn't want to blame a commit
> improving error handling to have introduced the leak.
> 
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1518,7 +1518,12 @@ int domain_context_mapping_one(
>  rc = me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
> 
>  if ( rc )
> -domain_context_unmap_one(domain, iommu, bus, devfn);
> +{
> +ret = domain_context_unmap_one(domain, iommu, bus, devfn);
> +
> +if ( !ret && pdev && pdev->devfn == devfn )
> +check_cleanup_domid_map(domain, pdev, iommu);
> +}
> 
>  return rc;
>  }



Re: [PATCH 3/6] VT-d: don't leak domid mapping on error path

2021-11-12 Thread Roger Pau Monné
On Fri, Nov 12, 2021 at 02:45:14PM +0100, Jan Beulich wrote:
> On 12.11.2021 14:42, Roger Pau Monné wrote:
> > On Fri, Nov 12, 2021 at 10:48:43AM +0100, Jan Beulich wrote:
> >> While domain_context_mapping() invokes domain_context_unmap() in a sub-
> >> case of handling DEV_TYPE_PCI when encountering an error, thus avoiding
> >> a leak, individual calls to domain_context_mapping_one() aren't
> >> similarly covered. Such a leak might persist until domain destruction.
> >> Leverage that these cases can be recognized by pdev being non-NULL.
> > 
> > Would it help to place the domid cleanup in domain_context_unmap_one,
> > as that would then cover calls from domain_context_unmap and the
> > failure path in domain_context_mapping_one.
> 
> I don't think that would work (without further convolution), because of
> the up to 3 successive calls in DEV_TYPE_PCI handling. Cleanup may happen
> only on the first map's error path or after the last unmap.

Hm, I see. And AFAICT that's because some devices that get assigned to
a guest iommu context are not actually assigned to the guest (ie:
pdev->domain doesn't get set, neither the device is added to the
per-domain list), which makes them invisible to
any_pdev_behind_iommu.

I dislike that the domid is added in domain_context_mapping_one, while
the cleanup is not done in domain_context_unmap_one, and that some
devices context could be using the domain id without being assigned to
the domain.

Thanks, Roger.



Re: [PATCH 3/6] VT-d: don't leak domid mapping on error path

2021-11-12 Thread Jan Beulich
On 12.11.2021 14:42, Roger Pau Monné wrote:
> On Fri, Nov 12, 2021 at 10:48:43AM +0100, Jan Beulich wrote:
>> While domain_context_mapping() invokes domain_context_unmap() in a sub-
>> case of handling DEV_TYPE_PCI when encountering an error, thus avoiding
>> a leak, individual calls to domain_context_mapping_one() aren't
>> similarly covered. Such a leak might persist until domain destruction.
>> Leverage that these cases can be recognized by pdev being non-NULL.
> 
> Would it help to place the domid cleanup in domain_context_unmap_one,
> as that would then cover calls from domain_context_unmap and the
> failure path in domain_context_mapping_one.

I don't think that would work (without further convolution), because of
the up to 3 successive calls in DEV_TYPE_PCI handling. Cleanup may happen
only on the first map's error path or after the last unmap.

Jan




Re: [PATCH 3/6] VT-d: don't leak domid mapping on error path

2021-11-12 Thread Roger Pau Monné
On Fri, Nov 12, 2021 at 10:48:43AM +0100, Jan Beulich wrote:
> While domain_context_mapping() invokes domain_context_unmap() in a sub-
> case of handling DEV_TYPE_PCI when encountering an error, thus avoiding
> a leak, individual calls to domain_context_mapping_one() aren't
> similarly covered. Such a leak might persist until domain destruction.
> Leverage that these cases can be recognized by pdev being non-NULL.

Would it help to place the domid cleanup in domain_context_unmap_one,
as that would then cover calls from domain_context_unmap and the
failure path in domain_context_mapping_one.

Thanks, Roger.