Hi,

Also adding Joerg to the Cc list.

Thanks.
Lianbo

On 2020-12-26 13:39, Lianbo Jiang wrote:
> Currently, because domain attach allows to be deferred from iommu
> driver to device driver, and when iommu initializes, the devices
> on the bus will be scanned and the default groups will be allocated.
>
> Due to the above changes, some devices could be added to the same
> group as below:
>
> [    3.859417] pci 0000:01:00.0: Adding to iommu group 16
> [    3.864572] pci 0000:01:00.1: Adding to iommu group 16
> [    3.869738] pci 0000:02:00.0: Adding to iommu group 17
> [    3.874892] pci 0000:02:00.1: Adding to iommu group 17
>
> But when attaching these devices, it doesn't allow that a group has
> more than one device, otherwise it will return an error. This conflicts
> with the deferred attaching. Unfortunately, it has two devices in the
> same group for my side, for example:
>
> [    9.627014] iommu_group_device_count(): device name[0]:0000:01:00.0
> [    9.633545] iommu_group_device_count(): device name[1]:0000:01:00.1
> ...
> [   10.255609] iommu_group_device_count(): device name[0]:0000:02:00.0
> [   10.262144] iommu_group_device_count(): device name[1]:0000:02:00.1
>
> Finally, which caused the failure of tg3 driver when tg3 driver calls
> the dma_alloc_coherent() to allocate coherent memory in the tg3_test_dma().
>
> [    9.660310] tg3 0000:01:00.0: DMA engine test failed, aborting
> [    9.754085] tg3: probe of 0000:01:00.0 failed with error -12
> [    9.997512] tg3 0000:01:00.1: DMA engine test failed, aborting
> [   10.043053] tg3: probe of 0000:01:00.1 failed with error -12
> [   10.288905] tg3 0000:02:00.0: DMA engine test failed, aborting
> [   10.334070] tg3: probe of 0000:02:00.0 failed with error -12
> [   10.578303] tg3 0000:02:00.1: DMA engine test failed, aborting
> [   10.622629] tg3: probe of 0000:02:00.1 failed with error -12
>
> In addition, the similar situations also occur in other drivers such
> as the bnxt_en driver. That can be reproduced easily in kdump kernel
> when SME is active.
>
> Add a check for the deferred attach in the iommu_attach_device() and
> allow to attach the deferred device regardless of how many devices
> are in a group.
>
> Signed-off-by: Lianbo Jiang <liji...@redhat.com>
> ---
>  drivers/iommu/iommu.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ffeebda8d6de..dccab7b133fb 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1967,8 +1967,11 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
>  	 */
>  	mutex_lock(&group->mutex);
>  	ret = -EINVAL;
> -	if (iommu_group_device_count(group) != 1)
> +	if (!iommu_is_attach_deferred(domain, dev) &&
> +	    iommu_group_device_count(group) != 1) {
> +		dev_err_ratelimited(dev, "Group has more than one device\n");
>  		goto out_unlock;
> +	}
>
>  	ret = __iommu_attach_group(domain, group);
>
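
For readers who want to see the effect of the changed condition in isolation, here is a minimal user-space sketch of the check. It is not kernel code: struct mock_group, struct mock_device and mock_attach_device() are hypothetical stand-ins, and the attach_deferred flag only models what iommu_is_attach_deferred() would report for the device.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IOMMU group; not the kernel's struct iommu_group. */
struct mock_group {
	size_t device_count;	/* models iommu_group_device_count(group) */
};

/* Hypothetical stand-in for a device; not the kernel's struct device. */
struct mock_device {
	bool attach_deferred;	/* models iommu_is_attach_deferred(domain, dev) */
};

/*
 * Models only the condition changed by the patch: a multi-device group
 * is rejected with -EINVAL unless the device's attach was deferred.
 * Returning 0 stands in for a successful __iommu_attach_group().
 */
static int mock_attach_device(const struct mock_group *group,
			      const struct mock_device *dev)
{
	if (!dev->attach_deferred && group->device_count != 1)
		return -EINVAL;

	return 0;
}
```

With this model, a group holding two devices (as for 0000:01:00.0 and 0000:01:00.1 above) is rejected for a non-deferred device and accepted for a deferred one, while a single-device group is accepted either way.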