Re: [PATCH v5 4/7] PCI/ATS: Add PRI support for PCIe VF devices
On Mon, Aug 19, 2019 at 03:53:31PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On Mon, Aug 19, 2019 at 09:15:00AM -0500, Bjorn Helgaas wrote:
> > On Thu, Aug 15, 2019 at 03:39:03PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > > On 8/15/19 3:20 PM, Bjorn Helgaas wrote:
> > > > [+cc Joerg, David, iommu list: because IOMMU drivers are the only
> > > > callers of pci_enable_pri() and pci_enable_pasid()]
> > > >
> > > > On Thu, Aug 01, 2019 at 05:06:01PM -0700, sathyanarayanan.kuppusw...@linux.intel.com wrote:
> > > > > From: Kuppuswamy Sathyanarayanan
> > > > >
> > > > > When IOMMU tries to enable Page Request Interface (PRI) for VF device
> > > > > in iommu_enable_dev_iotlb(), it always fails because PRI support for
> > > > > PCIe VF device is currently broken. Current implementation expects
> > > > > the given PCIe device (PF & VF) to implement PRI capability before
> > > > > enabling the PRI support. But this assumption is incorrect. As per PCIe
> > > > > spec r4.0, sec 9.3.7.11, all VFs associated with PF can only use the
> > > > > PRI of the PF and not implement it. Hence we need to create exception
> > > > > for handling the PRI support for PCIe VF device.
> > > > >
> > > > > Also, since PRI is a shared resource between PF/VF, following rules
> > > > > should apply.
> > > > >
> > > > > 1. Use proper locking before accessing/modifying PF resources in VF
> > > > >    PRI enable/disable call.
> > > > > 2. Use reference count logic to track the usage of PRI resource.
> > > > > 3. Disable PRI only if the PRI reference count (pri_ref_cnt) is zero.
> > > >
> > > > Wait, why do we need this at all?  I agree the spec says VFs may not
> > > > implement PRI or PASID capabilities and that VFs use the PRI and
> > > > PASID of the PF.
> > > >
> > > > But why do we need to support pci_enable_pri() and pci_enable_pasid()
> > > > for VFs?  There's nothing interesting we can *do* in the VF, and
> > > > passing it off to the PF adds all this locking mess.  For VFs, can we
> > > > just make them do nothing or return -EINVAL?  What functionality would
> > > > we be missing if we did that?
> > >
> > > Currently PRI/PASID capabilities are not enabled by default. IOMMU can
> > > enable PRI/PASID for VF first (and not enable it for PF). In this case,
> > > doing nothing for VF device will break the functionality.
> >
> > What is the path where we can enable PRI/PASID for VF but not for the
> > PF?  The call chains leading to pci_enable_pri() go through the
> > iommu_ops.add_device interface, which makes me think this is part of
> > the device enumeration done by the PCI core, and in that case I would
> > think it should be done for the PF before VFs.  But maybe this
> > path isn't exercised until a driver does a DMA map or something
> > similar?
>
> AFAIK, this path will only get exercised when the device does DMA and
> hence there is no specific order in which PRI/PASID is enabled in PF/VF.
> In fact, my v2 version of this patch set had a check to ensure PF
> PRI/PASID enable happens before VF attempts PRI/PASID
> enable/disable. But I had to remove it in a later version of this series
> due to a failure case reported by one of the testers of this code.

What's the path?  And does that path make sense?

I got this far before giving up:

  iommu_go_to_state                             # AMD
    state_next
      amd_iommu_init_pci
        amd_iommu_init_api
          bus_set_iommu
            iommu_bus_init
              bus_for_each_dev(..., add_iommu_group)
                add_iommu_group
                  iommu_probe_device
                    amd_iommu_add_device        # amd_iommu_ops.add_device
                      init_iommu_group
                        iommu_group_get_for_dev
                          iommu_group_add_device
                            __iommu_attach_device
                              amd_iommu_attach_device  # amd_iommu_ops.attach_dev
                                attach_device          # amd_iommu
                                  pdev_iommuv2_enable
                                    pci_enable_pri

  iommu_probe_device
    intel_iommu_add_device                      # intel_iommu_ops.add_device
      domain_add_dev_info
        dmar_insert_one_dev_info
          domain_context_mapping
            domain_context_mapping_one
              iommu_enable_dev_iotlb
                pci_enable_pri

These *look* like enumeration paths, not DMA setup paths.  But I could
be wrong, since I gave up before getting to the source.

I don't want to add all this complexity because we *think* we need it.
I want to think about whether it makes *sense*.  Maybe it's sensible
for the PF enumeration or a PF driver to enable the hardware it owns.
If we leave it to the VFs, then we have issues with coordinating
between VFs that want different settings, etc.

If we understand the whole picture and it needs
Re: [PATCH v5 4/7] PCI/ATS: Add PRI support for PCIe VF devices
On Mon, Aug 19, 2019 at 09:15:00AM -0500, Bjorn Helgaas wrote:
> On Thu, Aug 15, 2019 at 03:39:03PM -0700, Kuppuswamy Sathyanarayanan wrote:
> > On 8/15/19 3:20 PM, Bjorn Helgaas wrote:
> > > [+cc Joerg, David, iommu list: because IOMMU drivers are the only
> > > callers of pci_enable_pri() and pci_enable_pasid()]
> > >
> > > On Thu, Aug 01, 2019 at 05:06:01PM -0700, sathyanarayanan.kuppusw...@linux.intel.com wrote:
> > > > From: Kuppuswamy Sathyanarayanan
> > > >
> > > > When IOMMU tries to enable Page Request Interface (PRI) for VF device
> > > > in iommu_enable_dev_iotlb(), it always fails because PRI support for
> > > > PCIe VF device is currently broken. Current implementation expects
> > > > the given PCIe device (PF & VF) to implement PRI capability before
> > > > enabling the PRI support. But this assumption is incorrect. As per PCIe
> > > > spec r4.0, sec 9.3.7.11, all VFs associated with PF can only use the
> > > > PRI of the PF and not implement it. Hence we need to create exception
> > > > for handling the PRI support for PCIe VF device.
> > > >
> > > > Also, since PRI is a shared resource between PF/VF, following rules
> > > > should apply.
> > > >
> > > > 1. Use proper locking before accessing/modifying PF resources in VF
> > > >    PRI enable/disable call.
> > > > 2. Use reference count logic to track the usage of PRI resource.
> > > > 3. Disable PRI only if the PRI reference count (pri_ref_cnt) is zero.
> > >
> > > Wait, why do we need this at all?  I agree the spec says VFs may not
> > > implement PRI or PASID capabilities and that VFs use the PRI and
> > > PASID of the PF.
> > >
> > > But why do we need to support pci_enable_pri() and pci_enable_pasid()
> > > for VFs?  There's nothing interesting we can *do* in the VF, and
> > > passing it off to the PF adds all this locking mess.  For VFs, can we
> > > just make them do nothing or return -EINVAL?  What functionality would
> > > we be missing if we did that?
> >
> > Currently PRI/PASID capabilities are not enabled by default. IOMMU can
> > enable PRI/PASID for VF first (and not enable it for PF). In this case,
> > doing nothing for VF device will break the functionality.
>
> What is the path where we can enable PRI/PASID for VF but not for the
> PF?  The call chains leading to pci_enable_pri() go through the
> iommu_ops.add_device interface, which makes me think this is part of
> the device enumeration done by the PCI core, and in that case I would
> think it should be done for the PF before VFs.  But maybe this
> path isn't exercised until a driver does a DMA map or something
> similar?

AFAIK, this path will only get exercised when the device does DMA and
hence there is no specific order in which PRI/PASID is enabled in PF/VF.
In fact, my v2 version of this patch set had a check to ensure PF
PRI/PASID enable happens before VF attempts PRI/PASID
enable/disable. But I had to remove it in a later version of this series
due to a failure case reported by one of the testers of this code.

>
> Bjorn

--
Sathyanarayanan Kuppuswamy
Linux kernel developer
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
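The three rules from the commit message (take a lock on the PF, count
users, disable only when the count drops to zero) can be modeled in plain
userspace C. The sketch below is not the patch and not kernel code:
struct model_dev, model_enable_pri() and the other model_* names are
hypothetical, invented only for illustration, and a pthread mutex stands
in for the kernel mutex. It only shows why the enable order between a PF
and its VFs stops mattering once the count lives in the PF.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct pci_dev; not the kernel type. */
struct model_dev {
	struct model_dev *physfn;  /* NULL for a PF, the owning PF for a VF */
	bool pri_enabled;          /* this function has asked for PRI */
	bool hw_pri_on;            /* PF only: models the shared PRI enable bit */
	int pri_ref_cnt;           /* PF only: functions currently using PRI */
	pthread_mutex_t pri_lock;  /* PF only: guards hw_pri_on and pri_ref_cnt */
};

/* Return the PF that owns the shared PRI state (the device itself for a PF). */
static struct model_dev *model_physfn(struct model_dev *dev)
{
	return dev->physfn ? dev->physfn : dev;
}

static void model_pf_init(struct model_dev *pf)
{
	*pf = (struct model_dev){ .physfn = NULL };
	pthread_mutex_init(&pf->pri_lock, NULL);
}

static void model_vf_init(struct model_dev *vf, struct model_dev *pf)
{
	/* The VF's own pri_lock is never used; all locking goes through the PF. */
	*vf = (struct model_dev){ .physfn = pf };
}

/* Rules 1 and 2: lock the PF, then count this function as a PRI user. */
static int model_enable_pri(struct model_dev *dev)
{
	struct model_dev *pf = model_physfn(dev);
	int ret = 0;

	pthread_mutex_lock(&pf->pri_lock);
	if (dev->pri_enabled) {
		ret = -EBUSY;
		goto unlock;
	}
	if (pf->pri_ref_cnt == 0)
		pf->hw_pri_on = true;  /* first user programs the shared state */
	pf->pri_ref_cnt++;
	dev->pri_enabled = true;
unlock:
	pthread_mutex_unlock(&pf->pri_lock);
	return ret;
}

/* Rule 3: only the last user switches the shared state off. */
static void model_disable_pri(struct model_dev *dev)
{
	struct model_dev *pf = model_physfn(dev);

	pthread_mutex_lock(&pf->pri_lock);
	if (dev->pri_enabled) {
		dev->pri_enabled = false;
		if (--pf->pri_ref_cnt == 0)
			pf->hw_pri_on = false;
	}
	pthread_mutex_unlock(&pf->pri_lock);
}
```

In this model a VF that enables first turns the shared state on, a later
PF enable only bumps the count, and the shared state is switched off only
when the last user calls model_disable_pri().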
Re: [PATCH v5 4/7] PCI/ATS: Add PRI support for PCIe VF devices
On Thu, Aug 15, 2019 at 03:39:03PM -0700, Kuppuswamy Sathyanarayanan wrote:
> On 8/15/19 3:20 PM, Bjorn Helgaas wrote:
> > [+cc Joerg, David, iommu list: because IOMMU drivers are the only
> > callers of pci_enable_pri() and pci_enable_pasid()]
> >
> > On Thu, Aug 01, 2019 at 05:06:01PM -0700, sathyanarayanan.kuppusw...@linux.intel.com wrote:
> > > From: Kuppuswamy Sathyanarayanan
> > >
> > > When IOMMU tries to enable Page Request Interface (PRI) for VF device
> > > in iommu_enable_dev_iotlb(), it always fails because PRI support for
> > > PCIe VF device is currently broken. Current implementation expects
> > > the given PCIe device (PF & VF) to implement PRI capability before
> > > enabling the PRI support. But this assumption is incorrect. As per PCIe
> > > spec r4.0, sec 9.3.7.11, all VFs associated with PF can only use the
> > > PRI of the PF and not implement it. Hence we need to create exception
> > > for handling the PRI support for PCIe VF device.
> > >
> > > Also, since PRI is a shared resource between PF/VF, following rules
> > > should apply.
> > >
> > > 1. Use proper locking before accessing/modifying PF resources in VF
> > >    PRI enable/disable call.
> > > 2. Use reference count logic to track the usage of PRI resource.
> > > 3. Disable PRI only if the PRI reference count (pri_ref_cnt) is zero.
> >
> > Wait, why do we need this at all?  I agree the spec says VFs may not
> > implement PRI or PASID capabilities and that VFs use the PRI and
> > PASID of the PF.
> >
> > But why do we need to support pci_enable_pri() and pci_enable_pasid()
> > for VFs?  There's nothing interesting we can *do* in the VF, and
> > passing it off to the PF adds all this locking mess.  For VFs, can we
> > just make them do nothing or return -EINVAL?  What functionality would
> > we be missing if we did that?
>
> Currently PRI/PASID capabilities are not enabled by default. IOMMU can
> enable PRI/PASID for VF first (and not enable it for PF). In this case,
> doing nothing for VF device will break the functionality.

What is the path where we can enable PRI/PASID for VF but not for the
PF?  The call chains leading to pci_enable_pri() go through the
iommu_ops.add_device interface, which makes me think this is part of
the device enumeration done by the PCI core, and in that case I would
think it should be done for the PF before VFs.  But maybe this
path isn't exercised until a driver does a DMA map or something
similar?

Bjorn
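The alternative raised in this thread, where a VF call does nothing or
returns -EINVAL and the callers cope with it, might look roughly like the
following userspace model. All names here (struct alt_dev,
alt_enable_pri(), alt_pri_usable(), alt_caller_setup()) are hypothetical,
and the use of -ENODEV for a VF whose PF has not enabled PRI is an
assumption made for the sketch, not something proposed on the list.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI function; not the kernel's struct pci_dev. */
struct alt_dev {
	struct alt_dev *physfn;  /* NULL for a PF, the owning PF for a VF */
	bool pri_enabled;        /* meaningful for PFs only in this model */
};

/* PF-only enable: a VF has no PRI capability of its own, so reject it. */
static int alt_enable_pri(struct alt_dev *dev)
{
	if (dev->physfn)
		return -EINVAL;
	if (dev->pri_enabled)
		return -EBUSY;
	dev->pri_enabled = true;
	return 0;
}

/* A function can use PRI only if the PF that owns the capability enabled it. */
static bool alt_pri_usable(const struct alt_dev *dev)
{
	const struct alt_dev *pf = dev->physfn ? dev->physfn : dev;

	return pf->pri_enabled;
}

/*
 * Caller-side handling: skip the enable call for VFs and rely on the PF's
 * state instead. Returning -ENODEV here is an assumption of this sketch.
 */
static int alt_caller_setup(struct alt_dev *dev)
{
	if (dev->physfn)
		return alt_pri_usable(dev) ? 0 : -ENODEV;
	return alt_enable_pri(dev);
}
```

The model makes the ordering question concrete: alt_caller_setup() on a
VF fails until the PF has been set up, which is the VF-before-PF case that
the reply quoted above says can occur.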
Re: [PATCH v5 4/7] PCI/ATS: Add PRI support for PCIe VF devices
On 8/15/19 3:20 PM, Bjorn Helgaas wrote:
> [+cc Joerg, David, iommu list: because IOMMU drivers are the only
> callers of pci_enable_pri() and pci_enable_pasid()]
>
> On Thu, Aug 01, 2019 at 05:06:01PM -0700, sathyanarayanan.kuppusw...@linux.intel.com wrote:
> > From: Kuppuswamy Sathyanarayanan
> >
> > When IOMMU tries to enable Page Request Interface (PRI) for VF device
> > in iommu_enable_dev_iotlb(), it always fails because PRI support for
> > PCIe VF device is currently broken. Current implementation expects
> > the given PCIe device (PF & VF) to implement PRI capability before
> > enabling the PRI support. But this assumption is incorrect. As per PCIe
> > spec r4.0, sec 9.3.7.11, all VFs associated with PF can only use the
> > PRI of the PF and not implement it. Hence we need to create exception
> > for handling the PRI support for PCIe VF device.
> >
> > Also, since PRI is a shared resource between PF/VF, following rules
> > should apply.
> >
> > 1. Use proper locking before accessing/modifying PF resources in VF
> >    PRI enable/disable call.
> > 2. Use reference count logic to track the usage of PRI resource.
> > 3. Disable PRI only if the PRI reference count (pri_ref_cnt) is zero.
>
> Wait, why do we need this at all?  I agree the spec says VFs may not
> implement PRI or PASID capabilities and that VFs use the PRI and
> PASID of the PF.
>
> But why do we need to support pci_enable_pri() and pci_enable_pasid()
> for VFs?  There's nothing interesting we can *do* in the VF, and
> passing it off to the PF adds all this locking mess.  For VFs, can we
> just make them do nothing or return -EINVAL?  What functionality would
> we be missing if we did that?

Currently PRI/PASID capabilities are not enabled by default. IOMMU can
enable PRI/PASID for VF first (and not enable it for PF). In this case,
doing nothing for VF device will break the functionality.

Also the PRI/PASID config options like "PRI Outstanding Page Request
Allocation" or "PASID Execute Permission" or "PASID Privileged Mode" are
currently configured as per device feature. And hence there is a chance
for VF/PF to use different values for these options.

> (Obviously returning -EINVAL would require tweaks in the callers to
> either avoid the call for VFs or handle the -EINVAL gracefully.)
>
> > Cc: Ashok Raj
> > Cc: Keith Busch
> > Suggested-by: Ashok Raj
> > Signed-off-by: Kuppuswamy Sathyanarayanan
> > ---
> >  drivers/pci/ats.c   | 143 ++--
> >  include/linux/pci.h |   2 +
> >  2 files changed, 112 insertions(+), 33 deletions(-)
> >
> > diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> > index 1f4be27a071d..079dc544 100644
> > --- a/drivers/pci/ats.c
> > +++ b/drivers/pci/ats.c
> > @@ -189,6 +189,8 @@ void pci_pri_init(struct pci_dev *pdev)
> >  	if (pdev->is_virtfn)
> >  		return;
> >
> > +	mutex_init(&pdev->pri_lock);
> > +
> >  	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
> >  	if (!pos)
> >  		return;
> > @@ -221,29 +223,57 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs)
> >  {
> >  	u16 control, status;
> >  	u32 max_requests;
> > +	int ret = 0;
> > +	struct pci_dev *pf = pci_physfn(pdev);
> >
> > -	if (WARN_ON(pdev->pri_enabled))
> > -		return -EBUSY;
> > +	mutex_lock(&pf->pri_lock);
> >
> > -	if (!pdev->pri_cap)
> > -		return -EINVAL;
> > +	if (WARN_ON(pdev->pri_enabled)) {
> > +		ret = -EBUSY;
> > +		goto pri_unlock;
> > +	}
> >
> > -	pci_read_config_word(pdev, pdev->pri_cap + PCI_PRI_STATUS, &status);
> > -	if (!(status & PCI_PRI_STATUS_STOPPED))
> > -		return -EBUSY;
> > +	if (!pf->pri_cap) {
> > +		ret = -EINVAL;
> > +		goto pri_unlock;
> > +	}
> > +
> > +	if (pdev->is_virtfn && pf->pri_enabled)
> > +		goto update_status;
> > +
> > +	/*
> > +	 * Before updating PRI registers, make sure there is no
> > +	 * outstanding PRI requests.
> > +	 */
> > +	pci_read_config_word(pf, pf->pri_cap + PCI_PRI_STATUS, &status);
> > +	if (!(status & PCI_PRI_STATUS_STOPPED)) {
> > +		ret = -EBUSY;
> > +		goto pri_unlock;
> > +	}
> >
> > -	pci_read_config_dword(pdev, pdev->pri_cap + PCI_PRI_MAX_REQ,
> > -			      &max_requests);
> > +	pci_read_config_dword(pf, pf->pri_cap + PCI_PRI_MAX_REQ, &max_requests);
> >  	reqs = min(max_requests, reqs);
> > -	pdev->pri_reqs_alloc = reqs;
> > -	pci_write_config_dword(pdev, pdev->pri_cap + PCI_PRI_ALLOC_REQ, reqs);
> > +	pf->pri_reqs_alloc = reqs;
> > +	pci_write_config_dword(pf, pf->pri_cap + PCI_PRI_ALLOC_REQ, reqs);
> >
> >  	control = PCI_PRI_CTRL_ENABLE;
> > -	pci_write_config_word(pdev, pdev->pri_cap + PCI_PRI_CTRL, control);
> > +	pci_write_config_word(pf, pf->pri_cap + PCI_PRI_CTRL, control);
> >
> > -	pdev->pri_enabled = 1;
> > +	/*
> > +	 * If PRI is not already enabled in PF, increment the PF
> > +	 * pri_ref_cnt to track the usage of PRI interface.
> > +	 */
> > +	if (pdev->is_virtfn && !pf->pri_enabled) {
> > +		atomic_inc(&pf->pri_ref_cnt);
> > +		pf->pri_enabled = 1;
> > +	}
> >
> > -	return 0;
> > +up
Re: [PATCH v5 4/7] PCI/ATS: Add PRI support for PCIe VF devices
[+cc Joerg, David, iommu list: because IOMMU drivers are the only
callers of pci_enable_pri() and pci_enable_pasid()]

On Thu, Aug 01, 2019 at 05:06:01PM -0700, sathyanarayanan.kuppusw...@linux.intel.com wrote:
> From: Kuppuswamy Sathyanarayanan
>
> When IOMMU tries to enable Page Request Interface (PRI) for VF device
> in iommu_enable_dev_iotlb(), it always fails because PRI support for
> PCIe VF device is currently broken. Current implementation expects
> the given PCIe device (PF & VF) to implement PRI capability before
> enabling the PRI support. But this assumption is incorrect. As per PCIe
> spec r4.0, sec 9.3.7.11, all VFs associated with PF can only use the
> PRI of the PF and not implement it. Hence we need to create exception
> for handling the PRI support for PCIe VF device.
>
> Also, since PRI is a shared resource between PF/VF, following rules
> should apply.
>
> 1. Use proper locking before accessing/modifying PF resources in VF
>    PRI enable/disable call.
> 2. Use reference count logic to track the usage of PRI resource.
> 3. Disable PRI only if the PRI reference count (pri_ref_cnt) is zero.

Wait, why do we need this at all?  I agree the spec says VFs may not
implement PRI or PASID capabilities and that VFs use the PRI and
PASID of the PF.

But why do we need to support pci_enable_pri() and pci_enable_pasid()
for VFs?  There's nothing interesting we can *do* in the VF, and
passing it off to the PF adds all this locking mess.  For VFs, can we
just make them do nothing or return -EINVAL?  What functionality would
we be missing if we did that?

(Obviously returning -EINVAL would require tweaks in the callers to
either avoid the call for VFs or handle the -EINVAL gracefully.)

> Cc: Ashok Raj
> Cc: Keith Busch
> Suggested-by: Ashok Raj
> Signed-off-by: Kuppuswamy Sathyanarayanan
>
> ---
>  drivers/pci/ats.c   | 143 ++--
>  include/linux/pci.h |   2 +
>  2 files changed, 112 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index 1f4be27a071d..079dc544 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -189,6 +189,8 @@ void pci_pri_init(struct pci_dev *pdev)
>  	if (pdev->is_virtfn)
>  		return;
>
> +	mutex_init(&pdev->pri_lock);
> +
>  	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
>  	if (!pos)
>  		return;
> @@ -221,29 +223,57 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs)
>  {
>  	u16 control, status;
>  	u32 max_requests;
> +	int ret = 0;
> +	struct pci_dev *pf = pci_physfn(pdev);
>
> -	if (WARN_ON(pdev->pri_enabled))
> -		return -EBUSY;
> +	mutex_lock(&pf->pri_lock);
>
> -	if (!pdev->pri_cap)
> -		return -EINVAL;
> +	if (WARN_ON(pdev->pri_enabled)) {
> +		ret = -EBUSY;
> +		goto pri_unlock;
> +	}
>
> -	pci_read_config_word(pdev, pdev->pri_cap + PCI_PRI_STATUS, &status);
> -	if (!(status & PCI_PRI_STATUS_STOPPED))
> -		return -EBUSY;
> +	if (!pf->pri_cap) {
> +		ret = -EINVAL;
> +		goto pri_unlock;
> +	}
> +
> +	if (pdev->is_virtfn && pf->pri_enabled)
> +		goto update_status;
> +
> +	/*
> +	 * Before updating PRI registers, make sure there is no
> +	 * outstanding PRI requests.
> +	 */
> +	pci_read_config_word(pf, pf->pri_cap + PCI_PRI_STATUS, &status);
> +	if (!(status & PCI_PRI_STATUS_STOPPED)) {
> +		ret = -EBUSY;
> +		goto pri_unlock;
> +	}
>
> -	pci_read_config_dword(pdev, pdev->pri_cap + PCI_PRI_MAX_REQ,
> -			      &max_requests);
> +	pci_read_config_dword(pf, pf->pri_cap + PCI_PRI_MAX_REQ, &max_requests);
>  	reqs = min(max_requests, reqs);
> -	pdev->pri_reqs_alloc = reqs;
> -	pci_write_config_dword(pdev, pdev->pri_cap + PCI_PRI_ALLOC_REQ, reqs);
> +	pf->pri_reqs_alloc = reqs;
> +	pci_write_config_dword(pf, pf->pri_cap + PCI_PRI_ALLOC_REQ, reqs);
>
>  	control = PCI_PRI_CTRL_ENABLE;
> -	pci_write_config_word(pdev, pdev->pri_cap + PCI_PRI_CTRL, control);
> +	pci_write_config_word(pf, pf->pri_cap + PCI_PRI_CTRL, control);
>
> -	pdev->pri_enabled = 1;
> +	/*
> +	 * If PRI is not already enabled in PF, increment the PF
> +	 * pri_ref_cnt to track the usage of PRI interface.
> +	 */
> +	if (pdev->is_virtfn && !pf->pri_enabled) {
> +		atomic_inc(&pf->pri_ref_cnt);
> +		pf->pri_enabled = 1;
> +	}
>
> -	return 0;
> +update_status:
> +	atomic_inc(&pf->pri_ref_cnt);
> +	pdev->pri_enabled = 1;
> +pri_unlock:
> +	mutex_unlock(&pf->pri_lock);
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(pci_enable_pri);
>
> @@ -256,18 +286,30 @@ EXPORT_SYMBOL_GPL(pci_enable_pri);
>  void pci_disable_pri(struct pci_dev *pdev)
>  {
>  	u16 control;
> +	struct pci_dev *pf = pci_physfn(pdev);
>
> -	if (WARN_ON(