Re: [PATCH v5 09/14] PCI: portdrv: Suppress kernel DMA ownership auto-claiming

2022-01-06 Thread Lu Baolu

On 1/7/22 2:32 AM, Bjorn Helgaas wrote:

On Thu, Jan 06, 2022 at 12:12:35PM +0800, Lu Baolu wrote:

On 1/5/22 1:06 AM, Bjorn Helgaas wrote:

On Tue, Jan 04, 2022 at 09:56:39AM +0800, Lu Baolu wrote:

If a switch lacks ACS P2P Request Redirect, a device below the switch can
bypass the IOMMU and DMA directly to other devices below the switch, so
all the downstream devices must be in the same IOMMU group as the switch
itself.

Help me think through what's going on here.  IIUC, we put devices in
the same IOMMU group when they can interfere with each other in any
way (DMA, config access, etc).

(We said "DMA" above, but I guess this would also apply to config
requests, right?)


I am not sure whether devices can interfere with each other through config
space accesses. The IOMMU hardware only protects and isolates DMA
accesses, so that userspace can control DMA directly. Config accesses
are always intercepted by VFIO. Hence, I don't see a problem.


I was wondering about config accesses generated by an endpoint, e.g.,
an endpoint doing config writes to a peer or the upstream bridge.

But I think that is prohibited by spec - PCIe r5.0, sec 7.3.3, says
"Propagation of Configuration Requests from Downstream to Upstream as
well as peer-to-peer are not supported" and "Configuration Requests
are initiated only by the Host Bridge, including those passed through
the SFI CAM mechanism."


That's clear. Thank you for the clarification.



Bjorn



Best regards,
baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 01/14] iommu: Add dma ownership management interfaces

2022-01-06 Thread Lu Baolu

On 1/6/22 11:46 PM, Jason Gunthorpe wrote:

On Thu, Jan 06, 2022 at 11:54:06AM +0800, Lu Baolu wrote:

On 1/5/22 3:23 AM, Jason Gunthorpe wrote:

The device driver oriented interfaces are,

int iommu_device_use_dma_api(struct device *dev);
void iommu_device_unuse_dma_api(struct device *dev);

Nit, do we care whether it uses the actual DMA API?  Or is it just
that iommu_device_use_dma_api() tells us the driver may program the
device to do DMA?

As the main purpose, yes this is all about the DMA API because it
asserts the group domain is the DMA API's domain.

There is a secondary purpose that has to do with the user/kernel
attack you mentioned above. Maintaining the DMA API domain also
prevents VFIO from allowing userspace to operate any device in the
group which blocks P2P attacks to MMIO of other devices.

This is why, even if the driver doesn't use DMA, it should still do a
iommu_device_use_dma_api(), except in the special cases where we don't
care about P2P attacks (eg pci-stub, bridges, etc).



By the way, use_dma_api seems hard to read. How about

iommu_device_use_default_dma()?


You could just say "use default domain"

IMHO the way the iommu subsystem has its own wonky language is a
little troublesome. In the rest of the kernel we call this the DMA
API, while the iommu subsystem calls the domain that the DMA API uses
the 'default domain' not the 'DMA API' domain.

Still, it is probably better to align to the iommu language - just be
sure to put in the function comment that this API 'allows the driver
to use the DMA API eg dma_map_sg()'


iommu_device_use_default_domain() reads better. I'll add some comments
linking "default domain" with the "DMA API". Thanks!



Jason



Best regards,
baolu


Re: [PATCH v1 6/8] gpu/host1x: Use iommu_attach/detach_device()

2022-01-06 Thread Lu Baolu

On 1/7/22 8:48 AM, Jason Gunthorpe wrote:

On Fri, Jan 07, 2022 at 08:35:34AM +0800, Lu Baolu wrote:

On 1/6/22 11:35 PM, Jason Gunthorpe wrote:

On Thu, Jan 06, 2022 at 10:20:51AM +0800, Lu Baolu wrote:

Ordinary drivers should use iommu_attach/detach_device() for domain
attaching and detaching.

Signed-off-by: Lu Baolu 
   drivers/gpu/host1x/dev.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index fbb6447b8659..6e08cb6202cc 100644
+++ b/drivers/gpu/host1x/dev.c
@@ -265,7 +265,7 @@ static struct iommu_domain *host1x_iommu_attach(struct 
host1x *host)
goto put_cache;
}
-   err = iommu_attach_group(host->domain, host->group);
+   err = iommu_attach_device(host->domain, host->dev);
if (err) {
if (err == -ENODEV)
err = 0;
@@ -335,7 +335,7 @@ static void host1x_iommu_exit(struct host1x *host)
   {
if (host->domain) {
put_iova_domain(&host->iova);
-   iommu_detach_group(host->domain, host->group);
+   iommu_detach_device(host->domain, host->dev);
iommu_domain_free(host->domain);
host->domain = NULL;


Shouldn't this add the flag to tegra_host1x_driver ?


This is called for a single driver. The call trace looks like below:

static struct platform_driver tegra_host1x_driver = {
 .driver = {
 .name = "tegra-host1x",
 .of_match_table = host1x_of_match,
 },
 .probe = host1x_probe,
 .remove = host1x_remove,
};

host1x_probe(dev)
-> host1x_iommu_init(host)      // host is a wrapper of dev
   iommu_domain_alloc(&platform_bus_type)
   iommu_attach_group(domain, group);


The main question is if the iommu group is being shared with other
drivers, not the call chain for this function.

For tegra you have to go look in each entry of the of_match_table:

 { .compatible = "nvidia,tegra114-host1x", .data = &host1x02_info, },

And find the DTS block:

 host1x@5000 {
 compatible = "nvidia,tegra114-host1x";
 reg = <0x5000 0x00028000>;
 interrupts = , /* syncpt */
  ; /* general */
 interrupt-names = "syncpt", "host1x";
 clocks = <&tegra_car TEGRA114_CLK_HOST1X>;
 clock-names = "host1x";
 resets = <&tegra_car 28>;
 reset-names = "host1x";
 iommus = <&mc TEGRA_SWGROUP_HC>;

Then check if any other devices in the DTS use the same 'iommus' which
is how the groups are setup.


Yes, exactly.



I checked everything and it does look like this is a single device
group.


Okay, thanks!
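As an aside, the DTS-level check above can also be cross-checked at runtime from sysfs. A sketch (the IOMMU_GROUPS_DIR override is only an illustration hook; the real path is /sys/kernel/iommu_groups):

```shell
# Print every IOMMU group and its member devices. A host1x entry that
# appears alone in its group confirms the singleton-group assumption.
base=${IOMMU_GROUPS_DIR:-/sys/kernel/iommu_groups}
for g in "$base"/*/; do
    [ -d "$g" ] || continue
    printf 'group %s: %s\n' "$(basename "$g")" \
        "$(ls "$g/devices" 2>/dev/null | tr '\n' ' ')"
done
```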



Jason



Best regards,
baolu


Re: [PATCH v1 3/8] iommu: Extend iommu_at[de]tach_device() for multi-device groups

2022-01-06 Thread Jason Gunthorpe via iommu
On Fri, Jan 07, 2022 at 09:14:38AM +0800, Lu Baolu wrote:

> > Once we know our calling context we can always automatic switch from
> > DMA API mode to another domain without any trouble or special
> > counters:
> > 
> > if (!dev->driver->no_kernel_api_dma) {
> >  if (group->owner_cnt > 1 || group->owner)
> >  return -EBUSY;
> >  return __iommu_attach_group(domain, group);
> > }
> 
> Is there any lock issue when referencing dev->driver here? I guess this
> requires iommu_attach_device() only being called during the driver's
> lifetime (i.e., between driver .probe and .release).

Yes, that is correct. That would need to be documented.

It is the same reason the routine was able to get the group from the
dev. The dev's group must be stable so long as a driver is attached or
everything is broken :)

Much of the group refcounting code is useless for this reason. The
group simply cannot be concurrently destroyed in these contexts.

Jason


Re: [PATCH v1 3/8] iommu: Extend iommu_at[de]tach_device() for multi-device groups

2022-01-06 Thread Lu Baolu

Hi Jason,

On 1/7/22 1:22 AM, Jason Gunthorpe wrote:

On Thu, Jan 06, 2022 at 10:20:48AM +0800, Lu Baolu wrote:

The iommu_attach/detach_device() interfaces were exposed for the device
drivers to attach/detach their own domains. The commit <426a273834eae>
("iommu: Limit iommu_attach/detach_device to device with their own group")
restricted them to singleton groups to avoid different devices in a group
attaching different domains.

Now that we've introduced device DMA ownership into the iommu core, we can
extend these interfaces to multiple-device groups, and "all devices are in
the same address space" is still guaranteed.

For multiple devices belonging to the same group, iommu_device_use_dma_api()
and iommu_attach_device() are exclusive. Therefore, when drivers decide to
use iommu_attach_device(), they cannot call iommu_device_use_dma_api() at
the same time.

Signed-off-by: Jason Gunthorpe 
Signed-off-by: Lu Baolu 
  drivers/iommu/iommu.c | 79 +--
  1 file changed, 62 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ab8ab95969f5..2c9efd85e447 100644
+++ b/drivers/iommu/iommu.c
@@ -47,6 +47,7 @@ struct iommu_group {
struct iommu_domain *domain;
struct list_head entry;
unsigned int owner_cnt;
+   unsigned int attach_cnt;


Why did we suddenly need another counter? None of the prior versions
needed this. I suppose this is being used as some flag to indicate
whether owner_cnt == 1 or owner_cnt == 0 should restore the default domain?


Yes, exactly.


Would rather have a flag 'auto_no_kernel_dma_api_compat' or something


Adding a flag also works.





+/**
+ * iommu_attach_device() - attach external or UNMANAGED domain to device
+ * @domain: the domain about to attach
+ * @dev: the device about to be attached
+ *
+ * For devices belonging to the same group, iommu_device_use_dma_api() and
+ * iommu_attach_device() are exclusive. Therefore, when drivers decide to
+ * use iommu_attach_domain(), they cannot call iommu_device_use_dma_api()
+ * at the same time.
+ */
  int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
  {
struct iommu_group *group;
+   int ret = 0;
+
+   if (domain->type != IOMMU_DOMAIN_UNMANAGED)
+   return -EINVAL;
  
  	group = iommu_group_get(dev);

if (!group)
return -ENODEV;
  
+	if (group->owner_cnt) {

+   /*
+* Group has been used for kernel-api dma or claimed explicitly
+* for exclusive occupation. For backward compatibility, device
+* in a singleton group is allowed to ignore setting the
+* drv.no_kernel_api_dma field.


BTW why is this called 'no kernel api dma'? That reads backwards; 'no
kernel dma api', right?


Yes. Need to rephrase this wording.



Another appeal of putting no_kernel_api_dma in the struct device_driver
is that this code could simply do 'dev->driver->no_kernel_api_dma' to
figure out how it is being called and avoid this messy implicitness.


Yes.



Once we know our calling context we can always automatic switch from
DMA API mode to another domain without any trouble or special
counters:

if (!dev->driver->no_kernel_api_dma) {
 if (group->owner_cnt > 1 || group->owner)
 return -EBUSY;
 return __iommu_attach_group(domain, group);
}


Is there any lock issue when referencing dev->driver here? I guess this
requires iommu_attach_device() only being called during the driver's
lifetime (i.e., between driver .probe and .release).



if (!group->owner_cnt) {
 ret = __iommu_attach_group(domain, group);
 if (ret)
 return ret;
} else if (group->owner || group->domain != domain)
 return -EBUSY;
group->owner_cnt++;

Right?


Yes. It's more straightforward if there's no issue around dev->driver
referencing.




+   if (!group->attach_cnt) {
+   ret = __iommu_attach_group(domain, group);


How come we don't have to detach the default domain here? Doesn't
that mean that the iommu_replace_group could also just call attach
directly without going through detach?


__iommu_attach_group() allows replacing the default domain with a
private domain. The corresponding __iommu_detach_group() automatically
replaces the private domain with the default domain.

The auto-switch logic should not apply to iommu_group_replace_domain(),
which is designed for components that have called iommu_group_set_dma_owner().



Jason



Best regards,
baolu


Re: [PATCH v1 6/8] gpu/host1x: Use iommu_attach/detach_device()

2022-01-06 Thread Jason Gunthorpe via iommu
On Fri, Jan 07, 2022 at 08:35:34AM +0800, Lu Baolu wrote:
> On 1/6/22 11:35 PM, Jason Gunthorpe wrote:
> > On Thu, Jan 06, 2022 at 10:20:51AM +0800, Lu Baolu wrote:
> > > Ordinary drivers should use iommu_attach/detach_device() for domain
> > > attaching and detaching.
> > > 
> > > Signed-off-by: Lu Baolu 
> > >   drivers/gpu/host1x/dev.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> > > index fbb6447b8659..6e08cb6202cc 100644
> > > +++ b/drivers/gpu/host1x/dev.c
> > > @@ -265,7 +265,7 @@ static struct iommu_domain 
> > > *host1x_iommu_attach(struct host1x *host)
> > >   goto put_cache;
> > >   }
> > > - err = iommu_attach_group(host->domain, host->group);
> > > + err = iommu_attach_device(host->domain, host->dev);
> > >   if (err) {
> > >   if (err == -ENODEV)
> > >   err = 0;
> > > @@ -335,7 +335,7 @@ static void host1x_iommu_exit(struct host1x *host)
> > >   {
> > >   if (host->domain) {
> > >   put_iova_domain(&host->iova);
> > > - iommu_detach_group(host->domain, host->group);
> > > + iommu_detach_device(host->domain, host->dev);
> > >   iommu_domain_free(host->domain);
> > >   host->domain = NULL;
> > 
> > Shouldn't this add the flag to tegra_host1x_driver ?
> 
> This is called for a single driver. The call trace looks like below:
> 
> static struct platform_driver tegra_host1x_driver = {
> .driver = {
> .name = "tegra-host1x",
> .of_match_table = host1x_of_match,
> },
> .probe = host1x_probe,
> .remove = host1x_remove,
> };
> 
> host1x_probe(dev)
> ->host1x_iommu_init(host) //host is a wrapper of dev
>  iommu_domain_alloc(&platform_bus_type)
>  iommu_attach_group(domain, group);

The main question is if the iommu group is being shared with other
drivers, not the call chain for this function.

For tegra you have to go look in each entry of the of_match_table:

{ .compatible = "nvidia,tegra114-host1x", .data = &host1x02_info, },

And find the DTS block:

host1x@5000 {
compatible = "nvidia,tegra114-host1x";
reg = <0x5000 0x00028000>;
interrupts = , /* syncpt */
 ; /* general */
interrupt-names = "syncpt", "host1x";
clocks = <&tegra_car TEGRA114_CLK_HOST1X>;
clock-names = "host1x";
resets = <&tegra_car 28>;
reset-names = "host1x";
iommus = <&mc TEGRA_SWGROUP_HC>;

Then check if any other devices in the DTS use the same 'iommus' which
is how the groups are setup.

I checked everything and it does look like this is a single device
group.

Jason


Re: [PATCH v1 6/8] gpu/host1x: Use iommu_attach/detach_device()

2022-01-06 Thread Lu Baolu

On 1/6/22 11:35 PM, Jason Gunthorpe wrote:

On Thu, Jan 06, 2022 at 10:20:51AM +0800, Lu Baolu wrote:

Ordinary drivers should use iommu_attach/detach_device() for domain
attaching and detaching.

Signed-off-by: Lu Baolu 
  drivers/gpu/host1x/dev.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index fbb6447b8659..6e08cb6202cc 100644
+++ b/drivers/gpu/host1x/dev.c
@@ -265,7 +265,7 @@ static struct iommu_domain *host1x_iommu_attach(struct 
host1x *host)
goto put_cache;
}
  
-		err = iommu_attach_group(host->domain, host->group);

+   err = iommu_attach_device(host->domain, host->dev);
if (err) {
if (err == -ENODEV)
err = 0;
@@ -335,7 +335,7 @@ static void host1x_iommu_exit(struct host1x *host)
  {
if (host->domain) {
put_iova_domain(&host->iova);
-   iommu_detach_group(host->domain, host->group);
+   iommu_detach_device(host->domain, host->dev);
  
  		iommu_domain_free(host->domain);

host->domain = NULL;


Shouldn't this add the flag to tegra_host1x_driver ?


This is called for a single driver. The call trace looks like below:

static struct platform_driver tegra_host1x_driver = {
.driver = {
.name = "tegra-host1x",
.of_match_table = host1x_of_match,
},
.probe = host1x_probe,
.remove = host1x_remove,
};

host1x_probe(dev)
-> host1x_iommu_init(host)      // host is a wrapper of dev
   -> host1x_iommu_attach(host)
      -> iommu_group_get(host->dev)
         iommu_domain_alloc(&platform_bus_type)
         iommu_attach_group(domain, group);

It seems that the existing code only works for a singleton group.



And do like we did in the other tegra stuff and switch to the dma api
when !host1x_wants_iommu() ?

Jason



Best regards,
baolu


Re: [PATCH v1 1/8] iommu: Add iommu_group_replace_domain()

2022-01-06 Thread Lu Baolu

On 1/7/22 1:06 AM, Jason Gunthorpe wrote:

On Thu, Jan 06, 2022 at 10:20:46AM +0800, Lu Baolu wrote:

Expose an interface to replace the domain of an iommu group for frameworks
like vfio which claims the ownership of the whole iommu group.

Signed-off-by: Lu Baolu 
  include/linux/iommu.h | 10 ++
  drivers/iommu/iommu.c | 37 +
  2 files changed, 47 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 408a6d2b3034..66ebce3d1e11 100644
+++ b/include/linux/iommu.h
@@ -677,6 +677,9 @@ void iommu_device_unuse_dma_api(struct device *dev);
  int iommu_group_set_dma_owner(struct iommu_group *group, void *owner);
  void iommu_group_release_dma_owner(struct iommu_group *group);
  bool iommu_group_dma_owner_claimed(struct iommu_group *group);
+int iommu_group_replace_domain(struct iommu_group *group,
+  struct iommu_domain *old,
+  struct iommu_domain *new);
  
  #else /* CONFIG_IOMMU_API */
  
@@ -1090,6 +1093,13 @@ static inline bool iommu_group_dma_owner_claimed(struct iommu_group *group)

  {
return false;
  }
+
+static inline int
+iommu_group_replace_domain(struct iommu_group *group, struct iommu_domain *old,
+  struct iommu_domain *new)
+{
+   return -ENODEV;
+}
  #endif /* CONFIG_IOMMU_API */
  
  /**

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 72a95dea688e..ab8ab95969f5 100644
+++ b/drivers/iommu/iommu.c
@@ -3431,3 +3431,40 @@ bool iommu_group_dma_owner_claimed(struct iommu_group 
*group)
return user;
  }
  EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
+
+/**
+ * iommu_group_replace_domain() - Replace group's domain
+ * @group: The group.
+ * @old: The previous attached domain. NULL for none.
+ * @new: The new domain about to be attached.
+ *
+ * This is to support backward compatibility for vfio which manages the dma
+ * ownership in iommu_group level.


This should mention it can only be used with iommu_group_set_dma_owner()


Sure.




+   if (old)
+   __iommu_detach_group(old, group);
+
+   if (new) {
+   ret = __iommu_attach_group(new, group);
+   if (ret && old)
+   __iommu_attach_group(old, group);
+   }


The sketchy error unwind here gives me some pause for sure. Maybe we
should define that on error this leaves the domain as NULL.

That complicates vfio a tiny bit to cope with this failure, but it seems
cleaner than leaving it indeterminate.


Fair enough.



Jason



Best regards,
baolu


Re: [PATCH v1 5/8] iommu/amd: Use iommu_attach/detach_device()

2022-01-06 Thread Lu Baolu

Hi Jason,

On 1/6/22 10:33 PM, Jason Gunthorpe wrote:

On Thu, Jan 06, 2022 at 10:20:50AM +0800, Lu Baolu wrote:

The individual device driver should use iommu_attach/detach_device()
for domain attachment/detachment.

Signed-off-by: Lu Baolu 
  drivers/iommu/amd/iommu_v2.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 58da08cc3d01..7d9d0fe89064 100644
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -133,7 +133,7 @@ static void free_device_state(struct device_state 
*dev_state)
if (WARN_ON(!group))
return;
  
-	iommu_detach_group(dev_state->domain, group);

+   iommu_detach_device(dev_state->domain, &dev_state->pdev->dev);
  
  	iommu_group_put(group);


This is the only user of the group in the function; all the
group_get/put calls should be deleted too.

Joerg said in commit 55c99a4dc50f ("iommu/amd: Use
iommu_attach_group()") that the device API doesn't work here because
there are multi-device groups?

But I'm not sure how this can work with multi-device groups - this
seems to assign a domain set up for direct map, so perhaps this only
works if all devices are set up for direct map?


It's also difficult for me to understand how this can work with
multi-device groups. iommu_attach_group() returns -EBUSY if _init_device()
is called for the second device in the group. That's the reason why I
didn't set no_kernel_dma.




@@ -791,7 +791,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
goto out_free_domain;
}
  
-	ret = iommu_attach_group(dev_state->domain, group);

+   ret = iommu_attach_device(dev_state->domain, &pdev->dev);
if (ret != 0)
goto out_drop_group;


Same comment here


Yes.



Jason



Best regards,
baolu


Re: [PATCH v5 09/14] PCI: portdrv: Suppress kernel DMA ownership auto-claiming

2022-01-06 Thread Bjorn Helgaas
On Thu, Jan 06, 2022 at 12:12:35PM +0800, Lu Baolu wrote:
> On 1/5/22 1:06 AM, Bjorn Helgaas wrote:
> > On Tue, Jan 04, 2022 at 09:56:39AM +0800, Lu Baolu wrote:
> > > If a switch lacks ACS P2P Request Redirect, a device below the switch can
> > > bypass the IOMMU and DMA directly to other devices below the switch, so
> > > all the downstream devices must be in the same IOMMU group as the switch
> > > itself.
> > Help me think through what's going on here.  IIUC, we put devices in
> > the same IOMMU group when they can interfere with each other in any
> > way (DMA, config access, etc).
> > 
> > (We said "DMA" above, but I guess this would also apply to config
> > requests, right?)
> 
> I am not sure whether devices can interfere with each other through config
> space accesses. The IOMMU hardware only protects and isolates DMA
> accesses, so that userspace can control DMA directly. Config accesses
> are always intercepted by VFIO. Hence, I don't see a problem.

I was wondering about config accesses generated by an endpoint, e.g.,
an endpoint doing config writes to a peer or the upstream bridge.

But I think that is prohibited by spec - PCIe r5.0, sec 7.3.3, says
"Propagation of Configuration Requests from Downstream to Upstream as
well as peer-to-peer are not supported" and "Configuration Requests
are initiated only by the Host Bridge, including those passed through
the SFI CAM mechanism."

Bjorn


Re: [PATCH v1 3/8] iommu: Extend iommu_at[de]tach_device() for multi-device groups

2022-01-06 Thread Jason Gunthorpe via iommu
On Thu, Jan 06, 2022 at 10:20:48AM +0800, Lu Baolu wrote:
> The iommu_attach/detach_device() interfaces were exposed for the device
> drivers to attach/detach their own domains. The commit <426a273834eae>
> ("iommu: Limit iommu_attach/detach_device to device with their own group")
> restricted them to singleton groups to avoid different devices in a group
> attaching different domains.
> 
> Now that we've introduced device DMA ownership into the iommu core, we can
> extend these interfaces to multiple-device groups, and "all devices are in
> the same address space" is still guaranteed.
> 
> For multiple devices belonging to the same group, iommu_device_use_dma_api()
> and iommu_attach_device() are exclusive. Therefore, when drivers decide to
> use iommu_attach_device(), they cannot call iommu_device_use_dma_api() at
> the same time.
> 
> Signed-off-by: Jason Gunthorpe 
> Signed-off-by: Lu Baolu 
>  drivers/iommu/iommu.c | 79 +--
>  1 file changed, 62 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ab8ab95969f5..2c9efd85e447 100644
> +++ b/drivers/iommu/iommu.c
> @@ -47,6 +47,7 @@ struct iommu_group {
>   struct iommu_domain *domain;
>   struct list_head entry;
>   unsigned int owner_cnt;
> + unsigned int attach_cnt;

Why did we suddenly need another counter? None of the prior versions
needed this. I suppose this is being used as some flag to indicate
whether owner_cnt == 1 or owner_cnt == 0 should restore the default domain?
Would rather have a flag 'auto_no_kernel_dma_api_compat' or something


> +/**
> + * iommu_attach_device() - attach external or UNMANAGED domain to device
> + * @domain: the domain about to attach
> + * @dev: the device about to be attached
> + *
> + * For devices belonging to the same group, iommu_device_use_dma_api() and
> + * iommu_attach_device() are exclusive. Therefore, when drivers decide to
> + * use iommu_attach_domain(), they cannot call iommu_device_use_dma_api()
> + * at the same time.
> + */
>  int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
>  {
>   struct iommu_group *group;
> + int ret = 0;
> +
> + if (domain->type != IOMMU_DOMAIN_UNMANAGED)
> + return -EINVAL;
>  
>   group = iommu_group_get(dev);
>   if (!group)
>   return -ENODEV;
>  
> + if (group->owner_cnt) {
> + /*
> +  * Group has been used for kernel-api dma or claimed explicitly
> +  * for exclusive occupation. For backward compatibility, device
> +  * in a singleton group is allowed to ignore setting the
> +  * drv.no_kernel_api_dma field.

BTW why is this called 'no kernel api dma'? That reads backwards; 'no
kernel dma api', right?

Another appeal of putting no_kernel_api_dma in the struct device_driver
is that this code could simply do 'dev->driver->no_kernel_api_dma' to
figure out how it is being called and avoid this messy implicitness.

Once we know our calling context we can always automatic switch from
DMA API mode to another domain without any trouble or special
counters:

if (!dev->driver->no_kernel_api_dma) {
if (group->owner_cnt > 1 || group->owner)
return -EBUSY;
return __iommu_attach_group(domain, group);
}

if (!group->owner_cnt) {
ret = __iommu_attach_group(domain, group);
if (ret)
return ret;
} else if (group->owner || group->domain != domain)
return -EBUSY;
group->owner_cnt++;

Right?

> + if (!group->attach_cnt) {
> + ret = __iommu_attach_group(domain, group);

How come we don't have to detach the default domain here? Doesn't
that mean that the iommu_replace_group could also just call attach
directly without going through detach?

Jason


Re: [PATCH v1 1/8] iommu: Add iommu_group_replace_domain()

2022-01-06 Thread Jason Gunthorpe via iommu
On Thu, Jan 06, 2022 at 10:20:46AM +0800, Lu Baolu wrote:
> Expose an interface to replace the domain of an iommu group for frameworks
> like vfio which claims the ownership of the whole iommu group.
> 
> Signed-off-by: Lu Baolu 
>  include/linux/iommu.h | 10 ++
>  drivers/iommu/iommu.c | 37 +
>  2 files changed, 47 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 408a6d2b3034..66ebce3d1e11 100644
> +++ b/include/linux/iommu.h
> @@ -677,6 +677,9 @@ void iommu_device_unuse_dma_api(struct device *dev);
>  int iommu_group_set_dma_owner(struct iommu_group *group, void *owner);
>  void iommu_group_release_dma_owner(struct iommu_group *group);
>  bool iommu_group_dma_owner_claimed(struct iommu_group *group);
> +int iommu_group_replace_domain(struct iommu_group *group,
> +struct iommu_domain *old,
> +struct iommu_domain *new);
>  
>  #else /* CONFIG_IOMMU_API */
>  
> @@ -1090,6 +1093,13 @@ static inline bool 
> iommu_group_dma_owner_claimed(struct iommu_group *group)
>  {
>   return false;
>  }
> +
> +static inline int
> +iommu_group_replace_domain(struct iommu_group *group, struct iommu_domain 
> *old,
> +struct iommu_domain *new)
> +{
> + return -ENODEV;
> +}
>  #endif /* CONFIG_IOMMU_API */
>  
>  /**
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 72a95dea688e..ab8ab95969f5 100644
> +++ b/drivers/iommu/iommu.c
> @@ -3431,3 +3431,40 @@ bool iommu_group_dma_owner_claimed(struct iommu_group 
> *group)
>   return user;
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
> +
> +/**
> + * iommu_group_replace_domain() - Replace group's domain
> + * @group: The group.
> + * @old: The previous attached domain. NULL for none.
> + * @new: The new domain about to be attached.
> + *
> + * This is to support backward compatibility for vfio which manages the dma
> + * ownership in iommu_group level.

This should mention it can only be used with iommu_group_set_dma_owner()

> + if (old)
> + __iommu_detach_group(old, group);
> +
> + if (new) {
> + ret = __iommu_attach_group(new, group);
> + if (ret && old)
> + __iommu_attach_group(old, group);
> + }

The sketchy error unwind here gives me some pause for sure. Maybe we
should define that on error this leaves the domain as NULL.

That complicates vfio a tiny bit to cope with this failure, but it seems
cleaner than leaving it indeterminate.

Jason


Re: [PATCH v5 01/14] iommu: Add dma ownership management interfaces

2022-01-06 Thread Jason Gunthorpe via iommu
On Thu, Jan 06, 2022 at 11:54:06AM +0800, Lu Baolu wrote:
> On 1/5/22 3:23 AM, Jason Gunthorpe wrote:
> > > > > The device driver oriented interfaces are,
> > > > > 
> > > > >   int iommu_device_use_dma_api(struct device *dev);
> > > > >   void iommu_device_unuse_dma_api(struct device *dev);
> > > Nit, do we care whether it uses the actual DMA API?  Or is it just
> > > that iommu_device_use_dma_api() tells us the driver may program the
> > > device to do DMA?
> > As the main purpose, yes this is all about the DMA API because it
> > asserts the group domain is the DMA API's domain.
> > 
> > There is a secondary purpose that has to do with the user/kernel
> > attack you mentioned above. Maintaining the DMA API domain also
> > prevents VFIO from allowing userspace to operate any device in the
> > group which blocks P2P attacks to MMIO of other devices.
> > 
> > This is why, even if the driver doesn't use DMA, it should still do a
> > iommu_device_use_dma_api(), except in the special cases where we don't
> > care about P2P attacks (eg pci-stub, bridges, etc).
> > 
> 
> By the way, use_dma_api seems hard to read. How about
> 
>   iommu_device_use_default_dma()?

You could just say "use default domain"

IMHO the way the iommu subsystem has its own wonky language is a
little troublesome. In the rest of the kernel we call this the DMA
API, while the iommu subsystem calls the domain that the DMA API uses
the 'default domain' not the 'DMA API' domain.

Still, it is probably better to align to the iommu language - just be
sure to put in the function comment that this API 'allows the driver
to use the DMA API eg dma_map_sg()'

Jason


Re: [PATCH v1 6/8] gpu/host1x: Use iommu_attach/detach_device()

2022-01-06 Thread Jason Gunthorpe via iommu
On Thu, Jan 06, 2022 at 10:20:51AM +0800, Lu Baolu wrote:
> Ordinary drivers should use iommu_attach/detach_device() for domain
> attaching and detaching.
> 
> Signed-off-by: Lu Baolu 
>  drivers/gpu/host1x/dev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index fbb6447b8659..6e08cb6202cc 100644
> +++ b/drivers/gpu/host1x/dev.c
> @@ -265,7 +265,7 @@ static struct iommu_domain *host1x_iommu_attach(struct 
> host1x *host)
>   goto put_cache;
>   }
>  
> - err = iommu_attach_group(host->domain, host->group);
> + err = iommu_attach_device(host->domain, host->dev);
>   if (err) {
>   if (err == -ENODEV)
>   err = 0;
> @@ -335,7 +335,7 @@ static void host1x_iommu_exit(struct host1x *host)
>  {
>   if (host->domain) {
>   put_iova_domain(&host->iova);
> - iommu_detach_group(host->domain, host->group);
> + iommu_detach_device(host->domain, host->dev);
>  
>   iommu_domain_free(host->domain);
>   host->domain = NULL;

Shouldn't this add the flag to tegra_host1x_driver ?

And do like we did in the other tegra stuff and switch to the dma api
when !host1x_wants_iommu() ?

Jason


Re: [PATCH v1 5/8] iommu/amd: Use iommu_attach/detach_device()

2022-01-06 Thread Jason Gunthorpe via iommu
On Thu, Jan 06, 2022 at 10:20:50AM +0800, Lu Baolu wrote:
> The individual device driver should use iommu_attach/detach_device()
> for domain attachment/detachment.
> 
> Signed-off-by: Lu Baolu 
>  drivers/iommu/amd/iommu_v2.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
> index 58da08cc3d01..7d9d0fe89064 100644
> +++ b/drivers/iommu/amd/iommu_v2.c
> @@ -133,7 +133,7 @@ static void free_device_state(struct device_state 
> *dev_state)
>   if (WARN_ON(!group))
>   return;
>  
> - iommu_detach_group(dev_state->domain, group);
> + iommu_detach_device(dev_state->domain, &dev_state->pdev->dev);
>  
>   iommu_group_put(group);

This is the only user of the group in the function; all the
group_get/put calls should be deleted too.

Joerg said in commit 55c99a4dc50f ("iommu/amd: Use
iommu_attach_group()") that the device API doesn't work here because
there are multi-device groups?

But I'm not sure how this can work with multi-device groups - this
seems to assign a domain set up for direct map, so perhaps this only
works if all devices are set up for direct map?

> @@ -791,7 +791,7 @@ int amd_iommu_init_device(struct pci_dev *pdev, int 
> pasids)
>   goto out_free_domain;
>   }
>  
> - ret = iommu_attach_group(dev_state->domain, group);
> + ret = iommu_attach_device(dev_state->domain, &pdev->dev);
>   if (ret != 0)
>   goto out_drop_group;

Same comment here

Jason