Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag

2024-05-20 Thread Jan Beulich
On 20.05.2024 13:36, Roger Pau Monné wrote:
> On Wed, May 15, 2024 at 12:42:40PM +0200, Jan Beulich wrote:
>> On 06.05.2024 15:38, Roger Pau Monné wrote:
>>> On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
>>>> When the flag is set, permit Dom0 to control the device (no worse than
>>>> what we had before and in line with other "best effort" behavior we use
>>>> when it comes to Dom0),
>>>
>>> I think we should somehow be able to signal dom0 that this device
>>> might not operate as expected, otherwise dom0 might use it and the
>>> device could silently malfunction due to ATS not being enabled.
>>
>> Whatever signaling we invented, no Dom0 would be required to respect it,
>> and for (perhaps quite) some time no Dom0 kernel would even exist to query
>> that property.
>>
>>> Otherwise we should just hide the device from dom0.
>>
>> This would feel wrong to me, almost like a regression from what we had
>> before.
> 
> Exposing a device to dom0 that won't be functional doesn't seem like a
> very wise choice from Xen TBH.

Yes but. That's what we're doing right now, after all.

>>> I assume setting the IOMMU context entry to passthrough mode would
>>> also be fine for such devices that require ATS?
>>
>> I'm afraid I'm lacking the connection of the question to what is being
>> done here. Can you perhaps provide some more context? To provide some
>> context from my side: Using pass-through mode would be excluded when Dom0
>> is PVH. Hence why I'm not getting why we would want to even just consider
>> doing so.
>>
>> Yet, looking at the spec, in pass-through mode translation requests are
>> treated as UR. So maybe your question was towards there needing to be
>> handling (whichever way) for the case where pass-through mode was
>> requested for PV Dom0? The only half-way sensible thing to do in that case
>> that I can think of right now would be to ignore that command line option,
> 
> Hm, maybe I'm confused, but if the IOMMU device context entry is set
> in pass-through mode ATS won't be enabled and hence no translation
> requests would be sent from the device?
> 
> IOW, devices listed in the SATC can only mandate ATS enabled when the
> IOMMU is enforcing translation.  If the IOMMU is not enabled or if
> the device is in passthrough mode then the requirement for having ATS
> enabled no longer applies.

Oh, I think I now get what your original question was about: Instead of
enabling ATS on such devices, we might run them in pass-through mode.
For PV that would appear to be an option, yes. But with PVH (presumably)
being the future I'd be rather hesitant to go that route.

Jan



Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag

2024-05-20 Thread Roger Pau Monné
On Wed, May 15, 2024 at 12:42:40PM +0200, Jan Beulich wrote:
> On 06.05.2024 15:38, Roger Pau Monné wrote:
> > On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
> >> When the flag is set, permit Dom0 to control the device (no worse than
> >> what we had before and in line with other "best effort" behavior we use
> >> when it comes to Dom0),
> > 
> > I think we should somehow be able to signal dom0 that this device
> > might not operate as expected, otherwise dom0 might use it and the
> > device could silently malfunction due to ATS not being enabled.
> 
> Whatever signaling we invented, no Dom0 would be required to respect it,
> and for (perhaps quite) some time no Dom0 kernel would even exist to query
> that property.
> 
> > Otherwise we should just hide the device from dom0.
> 
> This would feel wrong to me, almost like a regression from what we had
> before.

Exposing a device to dom0 that won't be functional doesn't seem like a
very wise choice from Xen TBH.

> > I assume setting the IOMMU context entry to passthrough mode would
> > also be fine for such devices that require ATS?
> 
> I'm afraid I'm lacking the connection of the question to what is being
> done here. Can you perhaps provide some more context? To provide some
> context from my side: Using pass-through mode would be excluded when Dom0
> is PVH. Hence why I'm not getting why we would want to even just consider
> doing so.
> 
> Yet, looking at the spec, in pass-through mode translation requests are
> treated as UR. So maybe your question was towards there needing to be
> handling (whichever way) for the case where pass-through mode was
> requested for PV Dom0? The only half-way sensible thing to do in that case
> that I can think of right now would be to ignore that command line option,

Hm, maybe I'm confused, but if the IOMMU device context entry is set
in pass-through mode ATS won't be enabled and hence no translation
requests would be sent from the device?

IOW, devices listed in the SATC can only mandate ATS enabled when the
IOMMU is enforcing translation.  If the IOMMU is not enabled or if
the device is in passthrough mode then the requirement for having ATS
enabled no longer applies.

Thanks, Roger.



Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag

2024-05-15 Thread Jan Beulich
On 06.05.2024 15:38, Roger Pau Monné wrote:
> On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
>> When the flag is set, permit Dom0 to control the device (no worse than
>> what we had before and in line with other "best effort" behavior we use
>> when it comes to Dom0),
> 
> I think we should somehow be able to signal dom0 that this device
> might not operate as expected, otherwise dom0 might use it and the
> device could silently malfunction due to ATS not being enabled.

Whatever signaling we invented, no Dom0 would be required to respect it,
and for (perhaps quite) some time no Dom0 kernel would even exist to query
that property.

> Otherwise we should just hide the device from dom0.

This would feel wrong to me, almost like a regression from what we had
before.

> I assume setting the IOMMU context entry to passthrough mode would
> also be fine for such devices that require ATS?

I'm afraid I'm lacking the connection of the question to what is being
done here. Can you perhaps provide some more context? To provide some
context from my side: Using pass-through mode would be excluded when Dom0
is PVH. Hence why I'm not getting why we would want to even just consider
doing so.

Yet, looking at the spec, in pass-through mode translation requests are
treated as UR. So maybe your question was towards there needing to be
handling (whichever way) for the case where pass-through mode was
requested for PV Dom0? The only half-way sensible thing to do in that case
that I can think of right now would be to ignore that command line option,
just like we do when Dom0 is PVH. Yet that would equally apply to use of
"ats" on the command line, i.e. would likely first require yet another
separate patch. Except that in the "ats" case it may be reasonable to
instead panic(), for there being conflicting requests on the command line
(and it being unclear which one would be better to ignore).

>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
>>  if ( ret )
>>  dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
>>  pdev->domain);
>> +else if ( !pdev->broken )
>> +{
>> +const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
>> +const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
>> +
>> +/*
>> + * Prevent the device from getting assigned to an unprivileged domain
>> + * when firmware indicates ATS is required, but ATS could not be enabled
>> + * or was not explicitly enabled via command line option.
>> + */
>> +if ( satc && satc->atc_required &&
>> + (!drhd || ats_device(pdev, drhd) <= 0 ||
>> +  !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
>> +  opt_ats < 0) )
> 
> Do you need the opt_ats check here?
> 
> I don't think it's possible for pci_ats_enabled() to return true if
> opt_ats is <= 0, and hence the opt_ats < 0 check can be dropped from
> the conditional?

In the present tristate mode of opt_ats a device can have ATS enabled when
opt_ats is -1 (i.e. no command line override): For devices with ATC_REQUIRED
set.
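
The tristate can be modelled as follows. This is a minimal sketch, not
Xen code: ats_may_enable() and the ATS_* names are hypothetical, and only
the three values of opt_ats (-1, 0, 1) and the ATC_REQUIRED flag are taken
from the discussion above. It shows why ATS can end up enabled while
opt_ats is still -1, which is what keeps the separate "opt_ats < 0" check
from being redundant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three states of the command line setting. */
enum { ATS_DEFAULT = -1, ATS_OFF = 0, ATS_ON = 1 };

/*
 * Model only: may ATS be turned on for a device, given the command line
 * setting and whether firmware flagged the device as ATC_REQUIRED?
 */
static bool ats_may_enable(int opt_ats, bool atc_required)
{
    assert(opt_ats >= ATS_DEFAULT && opt_ats <= ATS_ON);

    if ( opt_ats == ATS_OFF )
        return false;        /* option used in its negative form */

    if ( opt_ats == ATS_ON )
        return true;         /* option used in its positive form */

    /* No command line override: only ATC_REQUIRED devices get ATS. */
    return atc_required;
}
```

With no override, ats_may_enable(ATS_DEFAULT, true) yields true, so a
check of the device's ATS state alone cannot tell "explicitly enabled"
apart from "enabled because firmware requires it".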

>> @@ -2375,12 +2395,26 @@ static int cf_check intel_iommu_enable_d
>>  
>>  pci_vtd_quirk(pdev);
>>  
>> -if ( ret <= 0 )
>> -return ret;
>> +if ( ret <= 0 ||
>> + (ret = enable_ats_device(pdev, &drhd->iommu->ats_devices)) < 0 ||
>> + opt_ats < 0 )
> 
> Shouldn't this be opt_ats <= 0?

No, again not as long as this variable is a tristate one.

>> --- a/xen/drivers/passthrough/vtd/x86/ats.c
>> +++ b/xen/drivers/passthrough/vtd/x86/ats.c
>> @@ -45,8 +45,9 @@ int ats_device(const struct pci_dev *pde
>>  {
>>  struct acpi_drhd_unit *ats_drhd;
>>  unsigned int pos, expfl = 0;
>> +const struct acpi_satc_unit *satc;
>>  
>> -if ( opt_ats <= 0 || !iommu_qinval )
>> +if ( !opt_ats || !iommu_qinval )
>>  return 0;
> 
> FWIW, I find this change confusing, hence my request earlier that
> opt_ats must be set to 0 or 1 by the point it gets used.

Right, but as said in particular on the subthread of patch 5, for now it has
to remain a full tristate. Whereas if the spec was changed, I expect the
variable could be switched to bool, and hence no overriding from -1 to 0/1
would be needed anymore at all.

Jan



Re: [PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag

2024-05-06 Thread Roger Pau Monné
On Thu, Feb 15, 2024 at 11:16:11AM +0100, Jan Beulich wrote:
> When the flag is set, permit Dom0 to control the device (no worse than
> what we had before and in line with other "best effort" behavior we use
> when it comes to Dom0),

I think we should somehow be able to signal dom0 that this device
might not operate as expected, otherwise dom0 might use it and the
device could silently malfunction due to ATS not being enabled.

Otherwise we should just hide the device from dom0.

I assume setting the IOMMU context entry to passthrough mode would
also be fine for such devices that require ATS?

> but suppress passing through to DomU-s unless
> ATS can actually be enabled for such devices (and was explicitly enabled
> on the command line).
> 
> Signed-off-by: Jan Beulich 
> ---
> v2: Re-base over new earlier patches.
> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -225,7 +225,11 @@ exceptions (watchdog NMIs and unexpected
>  > Default: `false`
>  
>  Permits Xen to set up and use PCI Address Translation Services.  This is a
> -performance optimisation for PCI Passthrough.
> +performance optimisation for PCI Passthrough.  Note that firmware may indicate
> +that certain devices need to have ATS enabled for proper operation. For such
> +devices ATS will be enabled by default, unless the option is used in its
> +negative form.  Such devices will still not be eligible for passing through to
> +guests, unless the option is used in its positive form.
>  
>  **WARNING: Xen cannot currently safely use ATS because of its synchronous wait
>  loops for Queued Invalidation completions.**
> --- a/xen/drivers/passthrough/vtd/dmar.c
> +++ b/xen/drivers/passthrough/vtd/dmar.c
> @@ -253,6 +253,24 @@ struct acpi_atsr_unit *acpi_find_matched
>  return all_ports;
>  }
>  
> +const struct acpi_satc_unit *acpi_find_matched_satc_unit(
> +const struct pci_dev *pdev)
> +{
> +const struct acpi_satc_unit *satc;
> +
> +list_for_each_entry ( satc, &acpi_satc_units, list )
> +{
> +if ( satc->segment != pdev->seg )
> +continue;
> +
> +for ( unsigned int i = 0; i < satc->scope.devices_cnt; ++i )
> +if ( satc->scope.devices[i] == pdev->sbdf.bdf )
> +return satc;
> +}
> +
> +return NULL;
> +}
> +
>  struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd)
>  {
>  struct acpi_rhsa_unit *rhsa;
> --- a/xen/drivers/passthrough/vtd/dmar.h
> +++ b/xen/drivers/passthrough/vtd/dmar.h
> @@ -112,6 +112,8 @@ struct acpi_satc_unit {
>  
>  struct acpi_drhd_unit *acpi_find_matched_drhd_unit(const struct pci_dev *);
>  struct acpi_atsr_unit *acpi_find_matched_atsr_unit(const struct pci_dev *);
> +const struct acpi_satc_unit *acpi_find_matched_satc_unit(
> +const struct pci_dev *pdev);
>  
>  #define DMAR_TYPE 1
>  #define RMRR_TYPE 2
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
>  if ( ret )
>  dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
>  pdev->domain);
> +else if ( !pdev->broken )
> +{
> +const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
> +const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
> +
> +/*
> + * Prevent the device from getting assigned to an unprivileged domain
> + * when firmware indicates ATS is required, but ATS could not be enabled
> + * or was not explicitly enabled via command line option.
> + */
> +if ( satc && satc->atc_required &&
> + (!drhd || ats_device(pdev, drhd) <= 0 ||
> +  !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
> +  opt_ats < 0) )

Do you need the opt_ats check here?

I don't think it's possible for pci_ats_enabled() to return true if
opt_ats is <= 0, and hence the opt_ats < 0 check can be dropped from
the conditional?

> +{
> +printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
> +   &pdev->sbdf);
> +pdev->broken = true;
> +}
> +}
>  
>  return ret;
>  }
> @@ -2375,12 +2395,26 @@ static int cf_check intel_iommu_enable_d
>  
>  pci_vtd_quirk(pdev);
>  
> -if ( ret <= 0 )
> -return ret;
> +if ( ret <= 0 ||
> + (ret = enable_ats_device(pdev, &drhd->iommu->ats_devices)) < 0 ||
> + opt_ats < 0 )

Shouldn't this be opt_ats <= 0?

> +{
> +const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
> +
> +/*
> + * Besides in error cases also prevent the device from getting assigned
> + * to an unprivileged domain when firmware indicates ATS is required,
> + * but ATS use was not explicitly enabled via command line option.
> + */
> +if ( satc && satc->atc_requi

[PATCH v2 06/12] VT-d: respect ACPI SATC's ATC_REQUIRED flag

2024-02-15 Thread Jan Beulich
When the flag is set, permit Dom0 to control the device (no worse than
what we had before and in line with other "best effort" behavior we use
when it comes to Dom0), but suppress passing through to DomU-s unless
ATS can actually be enabled for such devices (and was explicitly enabled
on the command line).
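
The intended policy can be summarised in a small model. This is a sketch
under stated assumptions, not the code of this patch: struct dev_state and
domu_assignable() are hypothetical names, and the model covers only the
ATC_REQUIRED check (other reasons for refusing assignment are ignored).
Dom0 is not modelled because it is always permitted to control the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state, reduced to what the policy looks at. */
struct dev_state {
    bool atc_required;   /* firmware (SATC) indicates ATS is required */
    bool ats_enabled;    /* ATS could actually be enabled for the device */
};

/*
 * Model only: may the device be passed through to a DomU?
 * opt_ats is the tristate command line setting: -1 (not specified),
 * 0 (negative form), 1 (positive form).
 */
static bool domu_assignable(const struct dev_state *d, int opt_ats)
{
    assert(d);

    if ( !d->atc_required )
        return true;     /* this check does not apply to the device */

    /* ATS must be on and must have been requested explicitly. */
    return d->ats_enabled && opt_ats > 0;
}
```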

Signed-off-by: Jan Beulich 
---
v2: Re-base over new earlier patches.

--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -225,7 +225,11 @@ exceptions (watchdog NMIs and unexpected
 > Default: `false`
 
 Permits Xen to set up and use PCI Address Translation Services.  This is a
-performance optimisation for PCI Passthrough.
+performance optimisation for PCI Passthrough.  Note that firmware may indicate
+that certain devices need to have ATS enabled for proper operation. For such
+devices ATS will be enabled by default, unless the option is used in its
+negative form.  Such devices will still not be eligible for passing through to
+guests, unless the option is used in its positive form.
 
 **WARNING: Xen cannot currently safely use ATS because of its synchronous wait
 loops for Queued Invalidation completions.**
--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -253,6 +253,24 @@ struct acpi_atsr_unit *acpi_find_matched
 return all_ports;
 }
 
+const struct acpi_satc_unit *acpi_find_matched_satc_unit(
+const struct pci_dev *pdev)
+{
+const struct acpi_satc_unit *satc;
+
+list_for_each_entry ( satc, &acpi_satc_units, list )
+{
+if ( satc->segment != pdev->seg )
+continue;
+
+for ( unsigned int i = 0; i < satc->scope.devices_cnt; ++i )
+if ( satc->scope.devices[i] == pdev->sbdf.bdf )
+return satc;
+}
+
+return NULL;
+}
+
 struct acpi_rhsa_unit *drhd_to_rhsa(const struct acpi_drhd_unit *drhd)
 {
 struct acpi_rhsa_unit *rhsa;
--- a/xen/drivers/passthrough/vtd/dmar.h
+++ b/xen/drivers/passthrough/vtd/dmar.h
@@ -112,6 +112,8 @@ struct acpi_satc_unit {
 
 struct acpi_drhd_unit *acpi_find_matched_drhd_unit(const struct pci_dev *);
 struct acpi_atsr_unit *acpi_find_matched_atsr_unit(const struct pci_dev *);
+const struct acpi_satc_unit *acpi_find_matched_satc_unit(
+const struct pci_dev *pdev);
 
 #define DMAR_TYPE 1
 #define RMRR_TYPE 2
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2364,6 +2364,26 @@ static int cf_check intel_iommu_add_devi
 if ( ret )
 dprintk(XENLOG_ERR VTDPREFIX, "%pd: context mapping failed\n",
 pdev->domain);
+else if ( !pdev->broken )
+{
+const struct acpi_drhd_unit *drhd = acpi_find_matched_drhd_unit(pdev);
+const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
+
+/*
+ * Prevent the device from getting assigned to an unprivileged domain
+ * when firmware indicates ATS is required, but ATS could not be enabled
+ * or was not explicitly enabled via command line option.
+ */
+if ( satc && satc->atc_required &&
+ (!drhd || ats_device(pdev, drhd) <= 0 ||
+  !pci_ats_enabled(pdev->seg, pdev->bus, pdev->devfn) ||
+  opt_ats < 0) )
+{
+printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
+   &pdev->sbdf);
+pdev->broken = true;
+}
+}
 
 return ret;
 }
@@ -2375,12 +2395,26 @@ static int cf_check intel_iommu_enable_d
 
 pci_vtd_quirk(pdev);
 
-if ( ret <= 0 )
-return ret;
+if ( ret <= 0 ||
+ (ret = enable_ats_device(pdev, &drhd->iommu->ats_devices)) < 0 ||
+ opt_ats < 0 )
+{
+const struct acpi_satc_unit *satc = acpi_find_matched_satc_unit(pdev);
+
+/*
+ * Besides in error cases also prevent the device from getting assigned
+ * to an unprivileged domain when firmware indicates ATS is required,
+ * but ATS use was not explicitly enabled via command line option.
+ */
+if ( satc && satc->atc_required && !pdev->broken )
+{
+printk(XENLOG_WARNING "ATS: %pp is not eligible for pass-through\n",
+   &pdev->sbdf);
+pdev->broken = true;
+}
+}
 
-ret = enable_ats_device(pdev, &drhd->iommu->ats_devices);
-
-return ret >= 0 ? 0 : ret;
+return ret <= 0 ? ret : 0;
 }
 
 static int cf_check intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -45,8 +45,9 @@ int ats_device(const struct pci_dev *pde
 {
 struct acpi_drhd_unit *ats_drhd;
 unsigned int pos, expfl = 0;
+const struct acpi_satc_unit *satc;
 
-if ( opt_ats <= 0 || !iommu_qinval )
+if ( !opt_ats || !iommu_qinval )
 return 0;
 
 if ( !ecap_queued_inval(drhd->iommu->ecap) ||
@@ -61,6 +62,10 @@ int ats_device(const struct pci_dev *pde