Re: [PATCH] dma-direct: use the correct size for dma_set_encrypted()
On Wed, Jun 22, 2022 at 12:14:24PM -0700, Dexuan Cui wrote: > The third parameter of dma_set_encrypted() is a size in bytes rather than > the number of pages. > > Fixes: 4d0564785bb0 ("dma-direct: factor out dma_set_{de,en}crypted helpers") > Signed-off-by: Dexuan Cui see: commit 4a37f3dd9a83186cb88d44808ab35b78375082c9 (tag: dma-mapping-5.19-2022-05-25) Author: Robin Murphy Date: Fri May 20 18:10:13 2022 +0100 dma-direct: don't over-decrypt memory ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/2] dma-mapping, remoteproc: Fix dma_mem leak after rproc_shutdown
On Wed, Jun 22, 2022 at 10:25:40AM -0600, Mathieu Poirier wrote: > On Sat, Apr 23, 2022 at 07:46:50PM +0200, Christoph Hellwig wrote: > > Sigh. In theory drivers should never declare coherent memory like > > this, and there has been some work to fix remoteproc in that regard. > > > > But I guess until that is merged we'll need somthing like this fix. > > Should I take this in the remoteproc tree? If so, can I get an RB? Reluctantly-Acked-by: Christoph Hellwig ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
Hi Robin, I love your patch! Yet something to improve: [auto build test ERROR on v5.19-rc3] [also build test ERROR on linus/master next-20220622] [cannot apply to awilliam-vfio/next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/intel-lab-lkp/linux/commits/Robin-Murphy/vfio-type1-Simplify-bus_type-determination/20220622-200503 base:a111daf0c53ae91e71fd2bfe7497862d14132e3e config: x86_64-rhel-8.3-kselftests (https://download.01.org/0day-ci/archive/20220623/202206231208.pgasmluw-...@intel.com/config) compiler: gcc-11 (Debian 11.3.0-3) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/7a6e1ddc765bde40f879995137a2ff20cb0eda47 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Robin-Murphy/vfio-type1-Simplify-bus_type-determination/20220622-200503 git checkout 7a6e1ddc765bde40f879995137a2ff20cb0eda47 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot All errors (new ones prefixed by >>, old ones prefixed by <<): >> ERROR: modpost: "vfio_device_get_from_iommu" >> [drivers/vfio/vfio_iommu_type1.ko] undefined! >> ERROR: modpost: "vfio_device_put" [drivers/vfio/vfio_iommu_type1.ko] >> undefined! -- 0-DAY CI Kernel Test Service https://01.org/lkp ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 3/5] vfio/iommu_type1: Remove the domain->ops comparison
> From: Robin Murphy > Sent: Wednesday, June 22, 2022 3:55 PM > > On 2022-06-16 23:23, Nicolin Chen wrote: > > On Thu, Jun 16, 2022 at 06:40:14AM +, Tian, Kevin wrote: > > > >>> The domain->ops validation was added, as a precaution, for mixed- > driver > >>> systems. However, at this moment only one iommu driver is possible. So > >>> remove it. > >> > >> It's true on a physical platform. But I'm not sure whether a virtual > platform > >> is allowed to include multiple e.g. one virtio-iommu alongside a virtual > >> VT- > d > >> or a virtual smmu. It might be clearer to claim that (as Robin pointed out) > >> there is plenty more significant problems than this to solve instead of > simply > >> saying that only one iommu driver is possible if we don't have explicit > code > >> to reject such configuration. > > > > Will edit this part. Thanks! > > Oh, physical platforms with mixed IOMMUs definitely exist already. The > main point is that while bus_set_iommu still exists, the core code > effectively *does* prevent multiple drivers from registering - even in > emulated cases like the example above, virtio-iommu and VT-d would both > try to bus_set_iommu(_bus_type), and one of them will lose. The > aspect which might warrant clarification is that there's no combination > of supported drivers which claim non-overlapping buses *and* could > appear in the same system - even if you tried to contrive something by > emulating, say, VT-d (PCI) alongside rockchip-iommu (platform), you > could still only describe one or the other due to ACPI vs. Devicetree. > This explanation is much clearer! thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On 2022/6/23 10:51, Jerry Snitselaar wrote: The real problem here is that the iommu sequence ID overflows if DMAR_UNITS_SUPPORTED is not big enough. This is purely a software implementation issue, I am not sure whether user opt-in when building a kernel package could help a lot here. Is this something that could be figured out when parsing the dmar table? It looks like currently iommu_refcnt[], iommu_did[], and dmar_seq_ids[] depend on it. That's definitely a better solution. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Thu, Jun 23, 2022 at 10:29:35AM +0800, Baolu Lu wrote: > On 2022/6/22 23:05, Jerry Snitselaar wrote: > > On Wed, Jun 22, 2022 at 7:52 AM Baolu Lu wrote: > > > On 2022/6/16 02:36, Steve Wahl wrote: > > > > To support up to 64 sockets with 10 DMAR units each (640), make the > > > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > > > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > > > > set. > > > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > > > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > > > > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > > > > remapping doesn't support X2APIC mode x2apic disabled"; and the system > > > > fails to boot properly. > > > > > > > > Signed-off-by: Steve Wahl > > > > Reviewed-by: Kevin Tian > > > > --- > > > > > > > > Note that we could not find a reason for connecting > > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > > > > it seemed like the two would continue to match on earlier processors. > > > > There doesn't appear to be kernel code that assumes that the value of > > > > one is related to the other. > > > > > > > > v2: Make this value a config option, rather than a fixed constant. The > > > > default > > > > values should match previous configuration except in the MAXSMP case. > > > > Keeping the > > > > value at a power of two was requested by Kevin Tian. > > > > > > > > v3: Make the config option dependent upon DMAR_TABLE, as it is not used > > > > without this. > > > > > > > >drivers/iommu/intel/Kconfig | 7 +++ > > > >include/linux/dmar.h| 6 +- > > > >2 files changed, 8 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig > > > > index 39a06d245f12..07aaebcb581d 100644 > > > > --- a/drivers/iommu/intel/Kconfig > > > > +++ b/drivers/iommu/intel/Kconfig > > > > @@ -9,6 +9,13 @@ config DMAR_PERF > > > >config DMAR_DEBUG > > > >bool > > > > > > > > +config DMAR_UNITS_SUPPORTED > > > > + int "Number of DMA Remapping Units supported" > > > > + depends on DMAR_TABLE > > > > + default 1024 if MAXSMP > > > > + default 128 if X86_64 > > > > + default 64 > > > With this patch applied, the IOMMU configuration looks like: > > > > > > [*] AMD IOMMU support > > > AMD IOMMU Version 2 driver > > > [*] Enable AMD IOMMU internals in DebugFS > > > (1024) Number of DMA Remapping Units supported NEW > > > [*] Support for Intel IOMMU using DMA Remapping Devices > > > [*] Export Intel IOMMU internals in Debugfs > > > [*] Support for Shared Virtual Memory with Intel IOMMU > > > [*] Enable Intel DMA Remapping Devices by default > > > [*] Enable Intel IOMMU scalable mode by default > > > [*] Support for Interrupt Remapping > > > [*] OMAP IOMMU Support > > > [*] Export OMAP IOMMU internals in DebugFS > > > [*] Rockchip IOMMU Support > > > > > > The NEW item looks confusing. It looks to be a generic configurable > > > value though it's actually Intel DMAR specific. Any thoughts? > > > > > > Best regards, > > > baolu > > > > > Would moving it under INTEL_IOMMU at least have it show up below > > "Support for Intel IOMMU using DMA Remapping Devices"? I'm not sure it > > can be better than that, because IRQ_REMAP selects DMAR_TABLE, so we > > can't stick it in the if INTEL_IOMMU section. > > It's more reasonable to move it under INTEL_IOMMU, but the trouble is > that this also stands even if INTEL_IOMMU is not configured. My thought only was with it after the 'config INTEL_IOMMU' block and before 'if INTEL_IOMMU' it would show up like: [*] Support for Intel IOMMU using DMA Remapping Devices (1024) Number of DMA Remapping Units supported NEW > > The real problem here is that the iommu sequence ID overflows if > DMAR_UNITS_SUPPORTED is not big enough. This is purely a software > implementation issue, I am not sure whether user opt-in when building a > kernel package could help a lot here. > Is this something that could be figured out when parsing the dmar table? It looks like currently iommu_refcnt[], iommu_did[], and dmar_seq_ids[] depend on it. Regards, Jerry > If we can't find a better way, can we just step back? > > Best regards, > baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On 2022/6/22 23:05, Jerry Snitselaar wrote: On Wed, Jun 22, 2022 at 7:52 AM Baolu Lu wrote: On 2022/6/16 02:36, Steve Wahl wrote: To support up to 64 sockets with 10 DMAR units each (640), make the value of DMAR_UNITS_SUPPORTED adjustable by a config variable, CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is set. If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ remapping doesn't support X2APIC mode x2apic disabled"; and the system fails to boot properly. Signed-off-by: Steve Wahl Reviewed-by: Kevin Tian --- Note that we could not find a reason for connecting DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps it seemed like the two would continue to match on earlier processors. There doesn't appear to be kernel code that assumes that the value of one is related to the other. v2: Make this value a config option, rather than a fixed constant. The default values should match previous configuration except in the MAXSMP case. Keeping the value at a power of two was requested by Kevin Tian. v3: Make the config option dependent upon DMAR_TABLE, as it is not used without this. drivers/iommu/intel/Kconfig | 7 +++ include/linux/dmar.h| 6 +- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig index 39a06d245f12..07aaebcb581d 100644 --- a/drivers/iommu/intel/Kconfig +++ b/drivers/iommu/intel/Kconfig @@ -9,6 +9,13 @@ config DMAR_PERF config DMAR_DEBUG bool +config DMAR_UNITS_SUPPORTED + int "Number of DMA Remapping Units supported" + depends on DMAR_TABLE + default 1024 if MAXSMP + default 128 if X86_64 + default 64 With this patch applied, the IOMMU configuration looks like: [*] AMD IOMMU support AMD IOMMU Version 2 driver [*] Enable AMD IOMMU internals in DebugFS (1024) Number of DMA Remapping Units supported NEW [*] Support for Intel IOMMU using DMA Remapping Devices [*] Export Intel IOMMU internals in Debugfs [*] Support for Shared Virtual Memory with Intel IOMMU [*] Enable Intel DMA Remapping Devices by default [*] Enable Intel IOMMU scalable mode by default [*] Support for Interrupt Remapping [*] OMAP IOMMU Support [*] Export OMAP IOMMU internals in DebugFS [*] Rockchip IOMMU Support The NEW item looks confusing. It looks to be a generic configurable value though it's actually Intel DMAR specific. Any thoughts? Best regards, baolu Would moving it under INTEL_IOMMU at least have it show up below "Support for Intel IOMMU using DMA Remapping Devices"? I'm not sure it can be better than that, because IRQ_REMAP selects DMAR_TABLE, so we can't stick it in the if INTEL_IOMMU section. It's more reasonable to move it under INTEL_IOMMU, but the trouble is that this also stands even if INTEL_IOMMU is not configured. The real problem here is that the iommu sequence ID overflows if DMAR_UNITS_SUPPORTED is not big enough. This is purely a software implementation issue, I am not sure whether user opt-in when building a kernel package could help a lot here. If we can't find a better way, can we just step back? Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 2/5] iommu/mediatek: Add error path for loop of mm_dts_parse
On Thu, 2022-06-16 at 15:49 +0200, Matthias Brugger wrote: > > On 16/06/2022 07:42, Yong Wu wrote: > > The mtk_iommu_mm_dts_parse will parse the smi larbs nodes. if the > > i+1 > > larb is parsed fail(return -EINVAL), we should of_node_put for the > > 0..i > > larbs. In the fail path, one of_node_put matches with > > of_parse_phandle in > > it. > > > > Fixes: d2e9a1102cfc ("iommu/mediatek: Contain MM IOMMU flow with > > the MM TYPE") > > Signed-off-by: Yong Wu > > --- > > drivers/iommu/mtk_iommu.c | 21 - > > 1 file changed, 16 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index 3b2489e8a6dd..ab24078938bf 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -1071,12 +1071,12 @@ static int mtk_iommu_mm_dts_parse(struct > > device *dev, struct component_match **m > > > > Don't we need to call the goto also on error case of: > > larbnode = of_parse_phandle(dev->of_node, "mediatek,larbs", i); Thanks very much. exactly right. I will add in next version. > Regards, > Matthias ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 2/5] iommu/mediatek: Add error path for loop of mm_dts_parse
On Thu, 2022-06-16 at 11:31 +0100, Robin Murphy wrote: > On 2022-06-16 11:08, Yong Wu wrote: > > On Thu, 2022-06-16 at 09:59 +0100, Robin Murphy wrote: > > > On 2022-06-16 06:42, Yong Wu wrote: > > > > The mtk_iommu_mm_dts_parse will parse the smi larbs nodes. if > > > > the > > > > i+1 > > > > larb is parsed fail(return -EINVAL), we should of_node_put for > > > > the > > > > 0..i > > > > larbs. In the fail path, one of_node_put matches with > > > > of_parse_phandle in > > > > it. > > > > > > > > Fixes: d2e9a1102cfc ("iommu/mediatek: Contain MM IOMMU flow > > > > with > > > > the MM TYPE") > > > > Signed-off-by: Yong Wu > > > > --- > > > >drivers/iommu/mtk_iommu.c | 21 - > > > >1 file changed, 16 insertions(+), 5 deletions(-) [snip..] > > > > +err_larbnode_put: > > > > + while (i--) { > > > > + larbnode = of_parse_phandle(dev->of_node, > > > > "mediatek,larbs", i); > > > > + if (larbnode && > > > > of_device_is_available(larbnode)) { > > > > + of_node_put(larbnode); > > > > + of_node_put(larbnode); > > > > + } > > > > > > This looks a bit awkward - could we not just iterate through > > > data->larb_imu and put dev->of_node for each valid dev? > > > > It should work. Thanks very much. > > > > > > > > Also, of_find_device_by_node() takes a reference on the struct > > > device > > > itself, so strictly we should be doing put_device() on those as > > > well > > > if we're bailing out. > > > > Thanks for this hint. A new reference for me. I will add it. > > In fact, thinking about it some more we may as well do the > of_node_put() > unconditionally immediately after the of_find_device_by_node() call, of_node_put is called in component_release_of in the normal case, thus we shouldn't call of_node_put unconditionally. Right? (Sorry for reply late) > so > then it's *only* the device references we'd need to worry about > cleaning > up in the failure path. > > Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 2/2] vfio: Use device_iommu_capable()
On 2022/6/22 20:04, Robin Murphy wrote: Use the new interface to check the capabilities for our device specifically. Signed-off-by: Robin Murphy --- drivers/vfio/vfio.c | 2 +- drivers/vfio/vfio_iommu_type1.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 73bab04880d0..765d68192c88 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -621,7 +621,7 @@ int vfio_register_group_dev(struct vfio_device *device) * VFIO always sets IOMMU_CACHE because we offer no way for userspace to * restore cache coherency. */ - if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY)) + if (!device_iommu_capable(device->dev, IOMMU_CAP_CACHE_COHERENCY)) return -EINVAL; return __vfio_register_dev(device, diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index e38b8bfde677..2107e95eb743 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -2247,7 +2247,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, list_add(>next, >group_list); msi_remap = irq_domain_check_msi_remap() || - iommu_capable(iommu_api_dev->dev->bus, IOMMU_CAP_INTR_REMAP); + device_iommu_capable(iommu_api_dev->dev, IOMMU_CAP_INTR_REMAP); if (!allow_unsafe_interrupts && !msi_remap) { pr_warn("%s: No interrupt remapping support. Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n", Reviewed-by: Lu Baolu Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
On 2022/6/22 20:04, Robin Murphy wrote: Since IOMMU groups are mandatory for drivers to support, it stands to reason that any device which has been successfully be added to a group must be on a bus supported by that IOMMU driver, and therefore a domain viable for any device in the group must be viable for all devices in the group. This already has to be the case for the IOMMU API's internal default domain, for instance. Thus even if the group contains devices on different buses, that can only mean that the IOMMU driver actually supports such an odd topology, and so without loss of generality we can expect the bus type of any device in a group to be suitable for IOMMU API calls. Ideally we could remove bus->iommu_ops and all IOMMU APIs go through the dev_iommu_ops(). Replace vfio_bus_type() with a simple call to resolve an appropriate member device from which to then derive a bus type. This is also a step towards removing the vague bus-based interfaces from the IOMMU API, when we can subsequently switch to using this device directly. Furthermore, scrutiny reveals a lack of protection for the bus being removed while vfio_iommu_type1_attach_group() is using it; the reference that VFIO holds on the iommu_group ensures that data remains valid, but does not prevent the group's membership changing underfoot. Holding the vfio_device for as long as we need here also neatly solves this. Signed-off-by: Robin Murphy Reviewed-by: Lu Baolu Best regards, baolu --- After sleeping on it, I decided to type up the helper function approach to see how it looked in practice, and in doing so realised that with one more tweak it could also subsume the locking out of the common paths as well, so end up being a self-contained way for type1 to take care of its own concern, which I rather like. drivers/vfio/vfio.c | 18 +- drivers/vfio/vfio.h | 3 +++ drivers/vfio/vfio_iommu_type1.c | 30 +++--- 3 files changed, 31 insertions(+), 20 deletions(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 61e71c1154be..73bab04880d0 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -448,7 +448,7 @@ static void vfio_group_get(struct vfio_group *group) * Device objects - create, release, get, put, search */ /* Device reference always implies a group reference */ -static void vfio_device_put(struct vfio_device *device) +void vfio_device_put(struct vfio_device *device) { if (refcount_dec_and_test(>refcount)) complete(>comp); @@ -475,6 +475,22 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group, return NULL; } +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group) +{ + struct vfio_group *group = vfio_group_get_from_iommu(iommu_group); + struct vfio_device *device; + + mutex_lock(>device_lock); + list_for_each_entry(device, >device_list, group_next) { + if (vfio_device_try_get(device)) { + mutex_unlock(>device_lock); + return device; + } + } + mutex_unlock(>device_lock); + return NULL; +} + /* * VFIO driver API */ diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index a67130221151..e8f21e64541b 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -70,3 +70,6 @@ struct vfio_iommu_driver_ops { int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops); void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops); + +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group); +void vfio_device_put(struct vfio_device *device); diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index c13b9290e357..e38b8bfde677 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, return ret; } -static int vfio_bus_type(struct device *dev, void *data) -{ - struct bus_type **bus = data; - - if (*bus && *bus != dev->bus) - return -EINVAL; - - *bus = dev->bus; - - return 0; -} - static int vfio_iommu_replay(struct vfio_iommu *iommu, struct vfio_domain *domain) { @@ -2159,7 +2147,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, struct vfio_iommu *iommu = iommu_data; struct vfio_iommu_group *group; struct vfio_domain *domain, *d; - struct bus_type *bus = NULL; + struct vfio_device *iommu_api_dev; bool resv_msi, msi_remap; phys_addr_t resv_msi_base = 0; struct iommu_domain_geometry *geo; @@ -2192,18 +2180,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, goto out_unlock; } - /* Determine bus_type in order to allocate a domain */ - ret =
Re: [PATCH v1 0/2] Fix console probe delay due to fw_devlink
On Wed, Jun 22, 2022 at 3:32 PM Peng Fan wrote: > > > Subject: [PATCH v1 0/2] Fix console probe delay due to fw_devlink > > > > fw_devlink.strict=1 has been enabled by default. This was delaying the probe > > of console devices. This series fixes that. > > > > Sasha/Peng, > > > > Can you test this please? > > Thanks, just give a test on i.MX8MP-EVK, works well now. > > Tested-by: Peng Fan #i.MX8MP-EVK Lol, that was quick! Thanks! -Saravana > > Thanks, > Peng. > > > > > -Saravana > > > > Cc: Sascha Hauer > > Cc: Peng Fan > > Cc: Kevin Hilman > > Cc: Ulf Hansson > > Cc: Len Brown > > Cc: Pavel Machek > > Cc: Joerg Roedel > > Cc: Will Deacon > > Cc: Andrew Lunn > > Cc: Heiner Kallweit > > Cc: Russell King > > Cc: "David S. Miller" > > Cc: Eric Dumazet > > Cc: Jakub Kicinski > > Cc: Paolo Abeni > > Cc: Linus Walleij > > Cc: Hideaki YOSHIFUJI > > Cc: David Ahern > > Cc: kernel-t...@android.com > > Cc: linux-ker...@vger.kernel.org > > Cc: linux...@vger.kernel.org > > Cc: iommu@lists.linux-foundation.org > > Cc: net...@vger.kernel.org > > Cc: linux-g...@vger.kernel.org > > Cc: ker...@pengutronix.de > > > > Saravana Kannan (2): > > driver core: fw_devlink: Allow firmware to mark devices as best effort > > of: base: Avoid console probe delay when fw_devlink.strict=1 > > > > drivers/base/core.c| 3 ++- > > drivers/of/base.c | 2 ++ > > include/linux/fwnode.h | 4 > > 3 files changed, 8 insertions(+), 1 deletion(-) > > > > -- > > 2.37.0.rc0.161.g10f37bed90-goog > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v1 0/2] Fix console probe delay due to fw_devlink
> Subject: [PATCH v1 0/2] Fix console probe delay due to fw_devlink > > fw_devlink.strict=1 has been enabled by default. This was delaying the probe > of console devices. This series fixes that. > > Sasha/Peng, > > Can you test this please? Thanks, just give a test on i.MX8MP-EVK, works well now. Tested-by: Peng Fan #i.MX8MP-EVK Thanks, Peng. > > -Saravana > > Cc: Sascha Hauer > Cc: Peng Fan > Cc: Kevin Hilman > Cc: Ulf Hansson > Cc: Len Brown > Cc: Pavel Machek > Cc: Joerg Roedel > Cc: Will Deacon > Cc: Andrew Lunn > Cc: Heiner Kallweit > Cc: Russell King > Cc: "David S. Miller" > Cc: Eric Dumazet > Cc: Jakub Kicinski > Cc: Paolo Abeni > Cc: Linus Walleij > Cc: Hideaki YOSHIFUJI > Cc: David Ahern > Cc: kernel-t...@android.com > Cc: linux-ker...@vger.kernel.org > Cc: linux...@vger.kernel.org > Cc: iommu@lists.linux-foundation.org > Cc: net...@vger.kernel.org > Cc: linux-g...@vger.kernel.org > Cc: ker...@pengutronix.de > > Saravana Kannan (2): > driver core: fw_devlink: Allow firmware to mark devices as best effort > of: base: Avoid console probe delay when fw_devlink.strict=1 > > drivers/base/core.c| 3 ++- > drivers/of/base.c | 2 ++ > include/linux/fwnode.h | 4 > 3 files changed, 8 insertions(+), 1 deletion(-) > > -- > 2.37.0.rc0.161.g10f37bed90-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/9] driver core: Set fw_devlink.strict=1 by default
On Wed, Jun 22, 2022 at 1:35 PM Saravana Kannan wrote: > > On Wed, Jun 22, 2022 at 12:40 PM Saravana Kannan wrote: > > > > On Wed, Jun 22, 2022 at 1:44 AM Linus Walleij > > wrote: > > > > > > On Wed, Jun 22, 2022 at 9:48 AM Sascha Hauer wrote: > > > > > > > This patch has the effect that console UART devices which have "dmas" > > > > properties specified in the device tree get deferred for 10 to 20 > > > > seconds. This happens on i.MX and likely on other SoCs as well. On i.MX > > > > the dma channel is only requested at UART startup time and not at probe > > > > time. dma is not used for the console. Nevertheless with this driver > > > > probe > > > > defers until the dma engine driver is available. > > > > FYI, if most of the drivers are built in, you could set > > deferred_probe_timeout=1 to reduce the impact of this (should drop > > down to 1 to 2 seconds). Is that an option until we figure out > > something better? > > > > Actually, why isn't earlyconsole being used? That doesn't get blocked > > on anything and the main point of that is to have console working from > > really early on. > > > > > > > > > > It shouldn't go in as-is. > > > > > > This affects all machines with the PL011 UART and DMAs specified as > > > well. > > > > > > It would be best if the console subsystem could be treated special and > > > not require DMA devlink to be satisfied before probing. > > > > If we can mark the console devices somehow before their drivers probe > > them, I can make fw_devlink give them special treatment. Is there any > > way I could identify them before their drivers probe? > > > > > It seems devlink is not quite aware of the concept of resources that are > > > necessary to probe vs resources that are nice to have and might be > > > added after probe. > > > > Correct, it can't tell them apart. Which is why it tries its best to > > enforce them, get most of them ordered properly and then gives up > > enforcing the rest after deferred_probe_timeout= expires. There's a > > bit more nuance than what I explained here (explained in earlier > > commit texts, LPC talks), but that's the gist of it. That's what's > > going on in this case Sascha is pointing out.z > > > > > We need a strong devlink for the first category > > > and maybe a weak devlink for the latter category. > > > > > > I don't know if this is a generic hardware property for all operating > > > systems so it could be a DT property such as dma-weak-dependency? > > > > > > Or maybe compromize and add a linux,dma-weak-dependency; > > > property? > > > > The linux,dma-weak-dependency might be an option, but then if the > > kernel version changes and we want to enforce it because we now have a > > dma driver (not related to Shasha's example) support, then the > > fw_devlink still can't enforce it because of that property. But maybe > > that's okay? The consumer can try to use dma and defer probe if it > > fails? > > > > Another option is to mark console devices in DT with some property and > > we can give special treatment for those without waiting for > > deferred_probe_timeout= to expire. > > Heh, looks like there's already a property for that: stdout-path. > > Let me send a series that'll use that to give special treatment to > console devices. Here's the fix. https://lore.kernel.org/lkml/20220622215912.550419-1-sarava...@google.com/ Sascha, can you give it a shot? -Saravana ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/2] vfio/type1: Simplify bus_type determination
On Wed, 22 Jun 2022 13:04:11 +0100 Robin Murphy wrote: > Since IOMMU groups are mandatory for drivers to support, it stands to > reason that any device which has been successfully be added to a group s/be // > must be on a bus supported by that IOMMU driver, and therefore a domain > viable for any device in the group must be viable for all devices in > the group. This already has to be the case for the IOMMU API's internal > default domain, for instance. Thus even if the group contains devices on > different buses, that can only mean that the IOMMU driver actually > supports such an odd topology, and so without loss of generality we can > expect the bus type of any device in a group to be suitable for IOMMU > API calls. > > Replace vfio_bus_type() with a simple call to resolve an appropriate > member device from which to then derive a bus type. This is also a step > towards removing the vague bus-based interfaces from the IOMMU API, when > we can subsequently switch to using this device directly. > > Furthermore, scrutiny reveals a lack of protection for the bus being > removed while vfio_iommu_type1_attach_group() is using it; the reference > that VFIO holds on the iommu_group ensures that data remains valid, but > does not prevent the group's membership changing underfoot. Holding the > vfio_device for as long as we need here also neatly solves this. > > Signed-off-by: Robin Murphy > --- > > After sleeping on it, I decided to type up the helper function approach > to see how it looked in practice, and in doing so realised that with one > more tweak it could also subsume the locking out of the common paths as > well, so end up being a self-contained way for type1 to take care of its > own concern, which I rather like. > > drivers/vfio/vfio.c | 18 +- > drivers/vfio/vfio.h | 3 +++ > drivers/vfio/vfio_iommu_type1.c | 30 +++--- > 3 files changed, 31 insertions(+), 20 deletions(-) > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index 61e71c1154be..73bab04880d0 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -448,7 +448,7 @@ static void vfio_group_get(struct vfio_group *group) > * Device objects - create, release, get, put, search > */ > /* Device reference always implies a group reference */ > -static void vfio_device_put(struct vfio_device *device) > +void vfio_device_put(struct vfio_device *device) > { > if (refcount_dec_and_test(>refcount)) > complete(>comp); > @@ -475,6 +475,22 @@ static struct vfio_device *vfio_group_get_device(struct > vfio_group *group, > return NULL; > } > > +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group > *iommu_group) > +{ > + struct vfio_group *group = vfio_group_get_from_iommu(iommu_group); > + struct vfio_device *device; Check group for NULL. > + > + mutex_lock(>device_lock); > + list_for_each_entry(device, >device_list, group_next) { > + if (vfio_device_try_get(device)) { > + mutex_unlock(>device_lock); > + return device; > + } > + } > + mutex_unlock(>device_lock); > + return NULL; No vfio_group_put() on either path. > +} > + > /* > * VFIO driver API > */ > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > index a67130221151..e8f21e64541b 100644 > --- a/drivers/vfio/vfio.h > +++ b/drivers/vfio/vfio.h > @@ -70,3 +70,6 @@ struct vfio_iommu_driver_ops { > > int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops); > void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops); > + > +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group > *iommu_group); > +void vfio_device_put(struct vfio_device *device); > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index c13b9290e357..e38b8bfde677 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, > return ret; > } > > -static int vfio_bus_type(struct device *dev, void *data) > -{ > - struct bus_type **bus = data; > - > - if (*bus && *bus != dev->bus) > - return -EINVAL; > - > - *bus = dev->bus; > - > - return 0; > -} > - > static int vfio_iommu_replay(struct vfio_iommu *iommu, >struct vfio_domain *domain) > { > @@ -2159,7 +2147,7 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, > struct vfio_iommu *iommu = iommu_data; > struct vfio_iommu_group *group; > struct vfio_domain *domain, *d; > - struct bus_type *bus = NULL; > + struct vfio_device *iommu_api_dev; > bool resv_msi, msi_remap; > phys_addr_t resv_msi_base = 0; > struct iommu_domain_geometry *geo; > @@ -2192,18 +2180,19 @@ static int vfio_iommu_type1_attach_group(void > *iommu_data, >
[PATCH v1 2/2] of: base: Avoid console probe delay when fw_devlink.strict=1
Commit 71066545b48e ("driver core: Set fw_devlink.strict=1 by default") enabled iommus and dmas dependency enforcement by default. On some systems, this caused the console device's probe to get delayed until the deferred_probe_timeout expires. We need consoles to work as soon as possible, so mark the console device node with FWNODE_FLAG_BEST_EFFORT so that fw_delink knows not to delay the probe of the console device for suppliers without drivers. The driver can then make the decision on where it can probe without those suppliers or defer its probe. Fixes: 71066545b48e ("driver core: Set fw_devlink.strict=1 by default") Reported-by: Sascha Hauer Reported-by: Peng Fan Signed-off-by: Saravana Kannan --- drivers/of/base.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/of/base.c b/drivers/of/base.c index d4f98c8469ed..a19cd0c73644 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1919,6 +1919,8 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align)) of_property_read_string(of_aliases, "stdout", ); if (name) of_stdout = of_find_node_opts_by_path(name, _stdout_options); + if (of_stdout) + of_stdout->fwnode.flags |= FWNODE_FLAG_BEST_EFFORT; } if (!of_aliases) -- 2.37.0.rc0.161.g10f37bed90-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v1 1/2] driver core: fw_devlink: Allow firmware to mark devices as best effort
When firmware sets the FWNODE_FLAG_BEST_EFFORT flag for a fwnode, fw_devlink will do a best effort ordering for that device where it'll only enforce the probe/suspend/resume ordering of that device with suppliers that have drivers. The driver of that device can then decide if it wants to defer probe or probe without the suppliers. This will be useful for avoid probe delays of the console device that were caused by commit 71066545b48e ("driver core: Set fw_devlink.strict=1 by default"). Fixes: 71066545b48e ("driver core: Set fw_devlink.strict=1 by default") Reported-by: Sascha Hauer Reported-by: Peng Fan Signed-off-by: Saravana Kannan --- drivers/base/core.c| 3 ++- include/linux/fwnode.h | 4 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 839f64485a55..61edd18b7bf3 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -968,7 +968,8 @@ static void device_links_missing_supplier(struct device *dev) static bool dev_is_best_effort(struct device *dev) { - return fw_devlink_best_effort && dev->can_match; + return (fw_devlink_best_effort && dev->can_match) || + dev->fwnode->flags & FWNODE_FLAG_BEST_EFFORT; } /** diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h index 9a81c4410b9f..89b9bdfca925 100644 --- a/include/linux/fwnode.h +++ b/include/linux/fwnode.h @@ -27,11 +27,15 @@ struct device; * driver needs its child devices to be bound with * their respective drivers as soon as they are * added. + * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some + * suppliers. Only enforce ordering with suppliers that have + * drivers. */ #define FWNODE_FLAG_LINKS_ADDEDBIT(0) #define FWNODE_FLAG_NOT_DEVICE BIT(1) #define FWNODE_FLAG_INITIALIZEDBIT(2) #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3) +#define FWNODE_FLAG_BEST_EFFORTBIT(4) struct fwnode_handle { struct fwnode_handle *secondary; -- 2.37.0.rc0.161.g10f37bed90-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v1 0/2] Fix console probe delay due to fw_devlink
fw_devlink.strict=1 has been enabled by default. This was delaying the probe of console devices. This series fixes that. Sasha/Peng, Can you test this please? -Saravana Cc: Sascha Hauer Cc: Peng Fan Cc: Kevin Hilman Cc: Ulf Hansson Cc: Len Brown Cc: Pavel Machek Cc: Joerg Roedel Cc: Will Deacon Cc: Andrew Lunn Cc: Heiner Kallweit Cc: Russell King Cc: "David S. Miller" Cc: Eric Dumazet Cc: Jakub Kicinski Cc: Paolo Abeni Cc: Linus Walleij Cc: Hideaki YOSHIFUJI Cc: David Ahern Cc: kernel-t...@android.com Cc: linux-ker...@vger.kernel.org Cc: linux...@vger.kernel.org Cc: iommu@lists.linux-foundation.org Cc: net...@vger.kernel.org Cc: linux-g...@vger.kernel.org Cc: ker...@pengutronix.de Saravana Kannan (2): driver core: fw_devlink: Allow firmware to mark devices as best effort of: base: Avoid console probe delay when fw_devlink.strict=1 drivers/base/core.c| 3 ++- drivers/of/base.c | 2 ++ include/linux/fwnode.h | 4 3 files changed, 8 insertions(+), 1 deletion(-) -- 2.37.0.rc0.161.g10f37bed90-goog ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/9] driver core: Set fw_devlink.strict=1 by default
On Wed, Jun 22, 2022 at 12:40 PM Saravana Kannan wrote: > > On Wed, Jun 22, 2022 at 1:44 AM Linus Walleij > wrote: > > > > On Wed, Jun 22, 2022 at 9:48 AM Sascha Hauer wrote: > > > > > This patch has the effect that console UART devices which have "dmas" > > > properties specified in the device tree get deferred for 10 to 20 > > > seconds. This happens on i.MX and likely on other SoCs as well. On i.MX > > > the dma channel is only requested at UART startup time and not at probe > > > time. dma is not used for the console. Nevertheless with this driver probe > > > defers until the dma engine driver is available. > > FYI, if most of the drivers are built in, you could set > deferred_probe_timeout=1 to reduce the impact of this (should drop > down to 1 to 2 seconds). Is that an option until we figure out > something better? > > Actually, why isn't earlyconsole being used? That doesn't get blocked > on anything and the main point of that is to have console working from > really early on. > > > > > > > It shouldn't go in as-is. > > > > This affects all machines with the PL011 UART and DMAs specified as > > well. > > > > It would be best if the console subsystem could be treated special and > > not require DMA devlink to be satisfied before probing. > > If we can mark the console devices somehow before their drivers probe > them, I can make fw_devlink give them special treatment. Is there any > way I could identify them before their drivers probe? > > > It seems devlink is not quite aware of the concept of resources that are > > necessary to probe vs resources that are nice to have and might be > > added after probe. > > Correct, it can't tell them apart. Which is why it tries its best to > enforce them, get most of them ordered properly and then gives up > enforcing the rest after deferred_probe_timeout= expires. There's a > bit more nuance than what I explained here (explained in earlier > commit texts, LPC talks), but that's the gist of it. That's what's > going on in this case Sascha is pointing out.z > > > We need a strong devlink for the first category > > and maybe a weak devlink for the latter category. > > > > I don't know if this is a generic hardware property for all operating > > systems so it could be a DT property such as dma-weak-dependency? > > > > Or maybe compromize and add a linux,dma-weak-dependency; > > property? > > The linux,dma-weak-dependency might be an option, but then if the > kernel version changes and we want to enforce it because we now have a > dma driver (not related to Shasha's example) support, then the > fw_devlink still can't enforce it because of that property. But maybe > that's okay? The consumer can try to use dma and defer probe if it > fails? > > Another option is to mark console devices in DT with some property and > we can give special treatment for those without waiting for > deferred_probe_timeout= to expire. Heh, looks like there's already a property for that: stdout-path. Let me send a series that'll use that to give special treatment to console devices. -Saravana ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/9] driver core: Set fw_devlink.strict=1 by default
On Wed, Jun 22, 2022 at 1:44 AM Linus Walleij wrote: > > On Wed, Jun 22, 2022 at 9:48 AM Sascha Hauer wrote: > > > This patch has the effect that console UART devices which have "dmas" > > properties specified in the device tree get deferred for 10 to 20 > > seconds. This happens on i.MX and likely on other SoCs as well. On i.MX > > the dma channel is only requested at UART startup time and not at probe > > time. dma is not used for the console. Nevertheless with this driver probe > > defers until the dma engine driver is available. FYI, if most of the drivers are built in, you could set deferred_probe_timeout=1 to reduce the impact of this (should drop down to 1 to 2 seconds). Is that an option until we figure out something better? Actually, why isn't earlyconsole being used? That doesn't get blocked on anything and the main point of that is to have console working from really early on. > > > > It shouldn't go in as-is. > > This affects all machines with the PL011 UART and DMAs specified as > well. > > It would be best if the console subsystem could be treated special and > not require DMA devlink to be satisfied before probing. If we can mark the console devices somehow before their drivers probe them, I can make fw_devlink give them special treatment. Is there any way I could identify them before their drivers probe? > It seems devlink is not quite aware of the concept of resources that are > necessary to probe vs resources that are nice to have and might be > added after probe. Correct, it can't tell them apart. Which is why it tries its best to enforce them, get most of them ordered properly and then gives up enforcing the rest after deferred_probe_timeout= expires. There's a bit more nuance than what I explained here (explained in earlier commit texts, LPC talks), but that's the gist of it. That's what's going on in this case Sascha is pointing out.z > We need a strong devlink for the first category > and maybe a weak devlink for the latter category. > > I don't know if this is a generic hardware property for all operating > systems so it could be a DT property such as dma-weak-dependency? > > Or maybe compromize and add a linux,dma-weak-dependency; > property? The linux,dma-weak-dependency might be an option, but then if the kernel version changes and we want to enforce it because we now have a dma driver (not related to Shasha's example) support, then the fw_devlink still can't enforce it because of that property. But maybe that's okay? The consumer can try to use dma and defer probe if it fails? Another option is to mark console devices in DT with some property and we can give special treatment for those without waiting for deferred_probe_timeout= to expire. -Saravana ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] dma-direct: use the correct size for dma_set_encrypted()
The third parameter of dma_set_encrypted() is a size in bytes rather than the number of pages. Fixes: 4d0564785bb0 ("dma-direct: factor out dma_set_{de,en}crypted helpers") Signed-off-by: Dexuan Cui --- kernel/dma/direct.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index e978f36e6be8..8d0b68a17042 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -357,7 +357,7 @@ void dma_direct_free(struct device *dev, size_t size, } else { if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_CLEAR_UNCACHED)) arch_dma_clear_uncached(cpu_addr, size); - if (dma_set_encrypted(dev, cpu_addr, 1 << page_order)) + if (dma_set_encrypted(dev, cpu_addr, size)) return; } @@ -392,7 +392,6 @@ void dma_direct_free_pages(struct device *dev, size_t size, struct page *page, dma_addr_t dma_addr, enum dma_data_direction dir) { - unsigned int page_order = get_order(size); void *vaddr = page_address(page); /* If cpu_addr is not from an atomic pool, dma_free_from_pool() fails */ @@ -400,7 +399,7 @@ void dma_direct_free_pages(struct device *dev, size_t size, dma_free_from_pool(dev, vaddr, size)) return; - if (dma_set_encrypted(dev, vaddr, 1 << page_order)) + if (dma_set_encrypted(dev, vaddr, size)) return; __dma_direct_free_pages(dev, page, size); } -- 2.25.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/9] PM: domains: Delete usage of driver_deferred_probe_check_state()
On Tue, Jun 21, 2022 at 9:59 PM Tony Lindgren wrote: > > Hi, > > * Saravana Kannan [220621 19:29]: > > On Tue, Jun 21, 2022 at 12:28 AM Tony Lindgren wrote: > > > > > > Hi, > > > > > > * Saravana Kannan [700101 02:00]: > > > > Now that fw_devlink=on by default and fw_devlink supports > > > > "power-domains" property, the execution will never get to the point > > > > where driver_deferred_probe_check_state() is called before the supplier > > > > has probed successfully or before deferred probe timeout has expired. > > > > > > > > So, delete the call and replace it with -ENODEV. > > > > > > Looks like this causes omaps to not boot in Linux next. > > > > Can you please point me to an example DTS I could use for debugging > > this? I'm assuming you are leaving fw_devlink=on and not turning it > > off or putting it in permissive mode. > > Sure, this seems to happen at least with simple-pm-bus as the top > level interconnect with a configured power-domains property: > > $ git grep -A10 "ocp {" arch/arm/boot/dts/*.dtsi | grep -B3 -A4 simple-pm-bus Thanks for the example. I generally start looking from dts (not dtsi) files in case there are some DT property override/additions after the dtsi files are included in the dts file. But I'll assume for now that's not the case. If there's a specific dts file for a board I can look from that'd be helpful to rule out those kinds of issues. For now, I looked at arch/arm/boot/dts/omap4.dtsi. > > This issue is no directly related fw_devlink. It is a side effect of > removing driver_deferred_probe_check_state(). We no longer return > -EPROBE_DEFER at the end of driver_deferred_probe_check_state(). Yes, I understand the issue. But driver_deferred_probe_check_state() was deleted because fw_devlink=on should have short circuited the probe attempt with an -EPROBE_DEFER before reaching the bus/driver probe function and hitting this -ENOENT failure. That's why I was asking the other questions. > > > On platform_probe() genpd_get_from_provider() returns > > > -ENOENT. > > > > This error is with the series I assume? > > On the first probe genpd_get_from_provider() will return -ENOENT in > both cases. The list is empty on the first probe and there are no > genpd providers at this point. > > Earlier with driver_deferred_probe_check_state(), the initial -ENOENT > ends up getting changed to -EPROBE_DEFER at the end of > driver_deferred_probe_check_state(), we are now missing that. Right, I was aware -ENOENT would be returned if we got this far. But the point of this series is that you shouldn't have gotten that far before your pm domain device is ready. Hence my questions from the earlier reply. Can I get answers to rest of my questions in the first reply please? That should help us figure out why fw_devlink let us get this far. Summarize them here to make it easy: * Are you running with fw_devlink=on? * Is the"ti,omap4-prm-inst"/"ti,omap-prm-inst" built-in in this case? * If it's not built-in, can you please try deferred_probe_timeout=0 and deferred_probe_timeout=30 and see if either one of them help? * Can I get the output of "ls -d supplier:*" and "cat supplier:*/status" output from the sysfs dir for the ocp device without this series where it boots properly. Thanks, Saravana ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 7/7] iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled
The IOMMUv2 APIs (for supporting shared virtual memory with PASID) configures the domain with IOMMU v2 page table, and sets DTE[Mode]=0. This configuration cannot be supported on SNP-enabled system. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index f5695ccb7c81..4c9b96160a8b 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -3448,7 +3448,12 @@ __setup("ivrs_acpihid", parse_ivrs_acpihid); bool amd_iommu_v2_supported(void) { - return amd_iommu_v2_present; + /* +* Since DTE[Mode]=0 is prohibited on SNP-enabled system +* (i.e. EFR[SNPSup]=1), IOMMUv2 page table cannot be used without +* setting up IOMMUv1 page table. +*/ + return amd_iommu_v2_present && !amd_iommu_snp_en; } EXPORT_SYMBOL(amd_iommu_v2_supported); -- 2.32.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 0/7] iommu/amd: Enforce IOMMU restrictions for SNP-enabled system
SNP-enabled system requires IOMMU v1 page table to be configured with non-zero DTE[Mode] for DMA-capable devices. This effects a number of usecases such as IOMMU pass-through mode and AMD IOMMUv2 APIs for binding/unbinding pasid. The series introduce a global variable to check SNP-enabled state during driver initialization, and use it to enforce the SNP restrictions during runtime. Also, for non-DMA-capable devices such as IOAPIC, the recommendation is to set DTE[TV] and DTE[Mode] to zero on SNP-enabled system. Therefore, additinal checks is added before setting DTE[TV]. Testing: - Tested booting and verify dmesg. - Tested booting with iommu=pt - Tested loading amd_iommu_v2 driver - Tested changing the iommu domain at runtime - Tested booting SEV/SNP-enabled guest - Tested when CONFIG_AMD_MEM_ENCRYPT is not set Pre-requisite: - [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support https://lore.kernel.org/linux-iommu/20220511072141.15485-29-vasant.he...@amd.com/T/ Chanages from V2: (https://lists.linuxfoundation.org/pipermail/iommu/2022-June/066392.html) - Patch 4: * Update pr_err message to report SNP not supported. * Remove export GPL. * Remove function stub when CONFIG_AMD_MEM_ENCRYPT is not set. - Patch 6: Change to WARN_ONCE. Best Regards, Suravee Brijesh Singh (1): iommu/amd: Introduce function to check and enable SNP Suravee Suthikulpanit (6): iommu/amd: Warn when found inconsistency EFR mask iommu/amd: Process all IVHDs before enabling IOMMU features iommu/amd: Introduce an iommu variable for tracking SNP support status iommu/amd: Set translation valid bit only when IO page tables are in use iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled drivers/iommu/amd/amd_iommu_types.h | 5 ++ drivers/iommu/amd/init.c| 109 +++- drivers/iommu/amd/iommu.c | 27 ++- include/linux/amd-iommu.h | 4 + 4 files changed, 123 insertions(+), 22 deletions(-) -- 2.32.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
Once SNP is enabled (by executing SNP_INIT command), IOMMU can no longer support the passthrough domain (i.e. IOMMU_DOMAIN_IDENTITY). The SNP_INIT command is called early in the boot process, and would fail if the kernel is configure to default to passthrough mode. After the system is already booted, users can try to change IOMMU domain type of a particular IOMMU group. In this case, the IOMMU driver needs to check the SNP-enable status and return failure when requesting to change domain type to identity. Therefore, return failure when trying to allocate identity domain. Reviewed-by: Robin Murphy Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 4f4571d3ff61..7093e26fec59 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2119,6 +2119,14 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type) { struct protection_domain *domain; + /* +* Since DTE[Mode]=0 is prohibited on SNP-enabled system, +* default to use IOMMU_DOMAIN_DMA[_FQ]. +*/ + if (WARN_ONCE(amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY), + "Cannot allocate identity domain due to SNP\n")) + return NULL; + domain = protection_domain_alloc(type); if (!domain) return NULL; -- 2.32.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 4/7] iommu/amd: Introduce function to check and enable SNP
From: Brijesh Singh To support SNP, IOMMU needs to be enabled, and prohibits IOMMU configurations where DTE[Mode]=0, which means it cannot be supported with IOMMU passthrough domain (a.k.a IOMMU_DOMAIN_IDENTITY), and when AMD IOMMU driver is configured to not use the IOMMU host (v1) page table. Otherwise, RMP table initialization could cause the system to crash. The request to enable SNP support in IOMMU must be done before PCI initialization state of the IOMMU driver because enabling SNP affects how IOMMU driver sets up IOMMU data structures (i.e. DTE). Unlike other IOMMU features, SNP feature does not have an enable bit in the IOMMU control register. Instead, the IOMMU driver introduces an amd_iommu_snp_en variable to track enabling state of SNP. Introduce amd_iommu_snp_enable() for other drivers to request enabling the SNP support in IOMMU, which checks all prerequisites and determines if the feature can be safely enabled. Please see the IOMMU spec section 2.12 for further details. Reviewed-by: Robin Murphy Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Brijesh Singh --- drivers/iommu/amd/amd_iommu_types.h | 5 drivers/iommu/amd/init.c| 44 +++-- drivers/iommu/amd/iommu.c | 4 +-- include/linux/amd-iommu.h | 4 +++ 4 files changed, 53 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 73b729be7410..ce4db2835b36 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -463,6 +463,9 @@ extern bool amd_iommu_irq_remap; /* kmem_cache to get tables with 128 byte alignement */ extern struct kmem_cache *amd_iommu_irq_cache; +/* SNP is enabled on the system? */ +extern bool amd_iommu_snp_en; + #define PCI_SBDF_TO_SEGID(sbdf)(((sbdf) >> 16) & 0x) #define PCI_SBDF_TO_DEVID(sbdf)((sbdf) & 0x) #define PCI_SEG_DEVID_TO_SBDF(seg, devid) u32)(seg) & 0x) << 16) | \ @@ -1013,4 +1016,6 @@ extern struct amd_irte_ops irte_32_ops; extern struct amd_irte_ops irte_128_ops; #endif +extern struct iommu_ops amd_iommu_ops; + #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */ diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 013c55e3c2f2..c62fb4470519 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -95,8 +95,6 @@ * out of it. */ -extern const struct iommu_ops amd_iommu_ops; - /* * structure describing one IOMMU in the ACPI table. Typically followed by one * or more ivhd_entrys. @@ -168,6 +166,9 @@ static int amd_iommu_target_ivhd_type; static bool amd_iommu_snp_sup; +bool amd_iommu_snp_en; +EXPORT_SYMBOL(amd_iommu_snp_en); + LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */ LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the system */ @@ -3549,3 +3550,42 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64 return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true); } + +#ifdef CONFIG_AMD_MEM_ENCRYPT +int amd_iommu_snp_enable(void) +{ + /* +* The SNP support requires that IOMMU must be enabled, and is +* not configured in the passthrough mode. +*/ + if (no_iommu || iommu_default_passthrough()) { + pr_err("SNP: IOMMU is disabled or configured in passthrough mode, SNP cannot be supported"); + return -EINVAL; + } + + /* +* Prevent enabling SNP after IOMMU_ENABLED state because this process +* affect how IOMMU driver sets up data structures and configures +* IOMMU hardware. +*/ + if (init_state > IOMMU_ENABLED) { + pr_err("SNP: Too late to enable SNP for IOMMU.\n"); + return -EINVAL; + } + + amd_iommu_snp_en = amd_iommu_snp_sup; + if (!amd_iommu_snp_en) + return -EINVAL; + + pr_info("SNP enabled\n"); + + /* Enforce IOMMU v1 pagetable when SNP is enabled. */ + if (amd_iommu_pgtable != AMD_IOMMU_V1) { + pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n"); + amd_iommu_pgtable = AMD_IOMMU_V1; + amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES; + } + + return 0; +} +#endif diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 86045dc50a0f..0792cd618dba 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -71,7 +71,7 @@ LIST_HEAD(acpihid_map); * Domain for untranslated devices - only allocated * if iommu=pt passed on kernel cmd line. */ -const struct iommu_ops amd_iommu_ops; +struct iommu_ops amd_iommu_ops; static ATOMIC_NOTIFIER_HEAD(ppr_notifier); int amd_iommu_max_glx_val = -1; @@ -2412,7 +2412,7 @@ static int amd_iommu_def_domain_type(struct device *dev)
[PATCH v3 5/7] iommu/amd: Set translation valid bit only when IO page tables are in use
On AMD system with SNP enabled, IOMMU hardware checks the host translation valid (TV) and guest translation valid (GV) bits in the device table entry (DTE) before accessing the corresponded page tables. However, current IOMMU driver sets the TV bit for all devices regardless of whether the host page table is in use. This results in ILLEGAL_DEV_TABLE_ENTRY event for devices, which do not the host page table root pointer set up. Thefore, when SNP is enabled, only set TV bit when DMA remapping is not used, which is when domain ID in the AMD IOMMU device table entry (DTE) is zero. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 3 ++- drivers/iommu/amd/iommu.c | 15 +-- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index c62fb4470519..f5695ccb7c81 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -2544,7 +2544,8 @@ static void init_device_table_dma(struct amd_iommu_pci_seg *pci_seg) for (devid = 0; devid <= pci_seg->last_bdf; ++devid) { __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_VALID); - __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION); + if (!amd_iommu_snp_en) + __set_dev_entry_bit(dev_table, devid, DEV_ENTRY_TRANSLATION); } } diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 0792cd618dba..4f4571d3ff61 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1563,7 +1563,14 @@ static void set_dte_entry(struct amd_iommu *iommu, u16 devid, (domain->flags & PD_GIOV_MASK)) pte_root |= DTE_FLAG_GIOV; - pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV; + pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V; + + /* +* When SNP is enabled, Only set TV bit when IOMMU +* page translation is in use. +*/ + if (!amd_iommu_snp_en || (domain->id != 0)) + pte_root |= DTE_FLAG_TV; flags = dev_table[devid].data[1]; @@ -1625,7 +1632,11 @@ static void clear_dte_entry(struct amd_iommu *iommu, u16 devid) struct dev_table_entry *dev_table = get_dev_table(iommu); /* remove entry from the device table seen by the hardware */ - dev_table[devid].data[0] = DTE_FLAG_V | DTE_FLAG_TV; + dev_table[devid].data[0] = DTE_FLAG_V; + + if (!amd_iommu_snp_en) + dev_table[devid].data[0] |= DTE_FLAG_TV; + dev_table[devid].data[1] &= DTE_FLAG_MASK; amd_iommu_apply_erratum_63(iommu, devid); -- 2.32.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 3/7] iommu/amd: Introduce an iommu variable for tracking SNP support status
EFR[SNPSup] needs to be checked early in the boot process, since it is used to determine how IOMMU driver configures other IOMMU features and data structures. This check can be done as soon as the IOMMU driver finishes parsing IVHDs. Introduce a variable for tracking the SNP support status, which is initialized before enabling the rest of IOMMU features. Also report IOMMU SNP support information for each IOMMU. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 5f86e357dbaa..013c55e3c2f2 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -166,6 +166,8 @@ static bool amd_iommu_disabled __initdata; static bool amd_iommu_force_enable __initdata; static int amd_iommu_target_ivhd_type; +static bool amd_iommu_snp_sup; + LIST_HEAD(amd_iommu_pci_seg_list); /* list of all PCI segments */ LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the system */ @@ -260,7 +262,6 @@ int amd_iommu_get_num_iommus(void) return amd_iommus_present; } -#ifdef CONFIG_IRQ_REMAP /* * Iterate through all the IOMMUs to verify if the specified * EFR bitmask of IOMMU feature are set. @@ -285,7 +286,6 @@ static bool check_feature_on_all_iommus(u64 mask) } return ret; } -#endif /* * For IVHD type 0x11/0x40, EFR is also available via IVHD. @@ -368,7 +368,7 @@ static void iommu_set_cwwb_range(struct amd_iommu *iommu) u64 start = iommu_virt_to_phys((void *)iommu->cmd_sem); u64 entry = start & PM_ADDR_MASK; - if (!iommu_feature(iommu, FEATURE_SNP)) + if (!amd_iommu_snp_sup) return; /* Note: @@ -783,7 +783,7 @@ static void *__init iommu_alloc_4k_pages(struct amd_iommu *iommu, void *buf = (void *)__get_free_pages(gfp, order); if (buf && - iommu_feature(iommu, FEATURE_SNP) && + amd_iommu_snp_sup && set_memory_4k((unsigned long)buf, (1 << order))) { free_pages((unsigned long)buf, order); buf = NULL; @@ -1882,6 +1882,7 @@ static int __init init_iommu_all(struct acpi_table_header *table) WARN_ON(p != end); /* Phase 2 : Early feature support check */ + amd_iommu_snp_sup = check_feature_on_all_iommus(FEATURE_SNP); /* Phase 3 : Enabling IOMMU features */ for_each_iommu(iommu) { @@ -2118,6 +2119,9 @@ static void print_iommu_info(void) if (iommu->features & FEATURE_GAM_VAPIC) pr_cont(" GA_vAPIC"); + if (iommu->features & FEATURE_SNP) + pr_cont(" SNP"); + pr_cont("\n"); } } -- 2.32.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 1/7] iommu/amd: Warn when found inconsistency EFR mask
The function check_feature_on_all_iommus() checks to ensure if an IOMMU feature support bit is set on the Extended Feature Register (EFR). Current logic iterates through all IOMMU, and returns false when it found the first unset bit. To provide more thorough checking, modify the logic to iterate through all IOMMUs even when found that the bit is not set, and also throws a FW_BUG warning if inconsistency is found. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 3dd0f26039c7..b3e4551ce9dd 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -261,18 +261,29 @@ int amd_iommu_get_num_iommus(void) } #ifdef CONFIG_IRQ_REMAP +/* + * Iterate through all the IOMMUs to verify if the specified + * EFR bitmask of IOMMU feature are set. + * Warn and return false if found inconsistency. + */ static bool check_feature_on_all_iommus(u64 mask) { bool ret = false; struct amd_iommu *iommu; for_each_iommu(iommu) { - ret = iommu_feature(iommu, mask); - if (!ret) + bool tmp = iommu_feature(iommu, mask); + + if ((ret != tmp) && + !list_is_first(>list, _iommu_list)) { + pr_err(FW_BUG "Found inconsistent EFR mask (%#llx) on iommu%d (%04x:%02x:%02x.%01x).\n", + mask, iommu->index, iommu->pci_seg->id, PCI_BUS_NUM(iommu->devid), + PCI_SLOT(iommu->devid), PCI_FUNC(iommu->devid)); return false; + } + ret = tmp; } - - return true; + return ret; } #endif -- 2.32.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3 2/7] iommu/amd: Process all IVHDs before enabling IOMMU features
The ACPI IVRS table can contain multiple IVHD blocks. Each block contains information used to initialize each IOMMU instance. Currently, init_iommu_all sequentially process IVHD block and initialize IOMMU instance one-by-one. However, certain features require all IOMMUs to be configured in the same way system-wide. In case certain IVHD blocks contain inconsistent information (most likely FW bugs), the driver needs to go through and try to revert settings on IOMMUs that have already been configured. A solution is to split IOMMU initialization into 3 phases: Phase1 : Processes information of the IVRS table for all IOMMU instances. This allow all IVHDs to be processed prior to enabling features. Phase2 : Early feature support check on all IOMMUs (using information in IVHD blocks. Phase3 : Iterates through all IOMMU instances and enabling features. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/init.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index b3e4551ce9dd..5f86e357dbaa 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -1692,7 +1692,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h, struct acpi_table_header *ivrs_base) { struct amd_iommu_pci_seg *pci_seg; - int ret; pci_seg = get_pci_segment(h->pci_seg, ivrs_base); if (pci_seg == NULL) @@ -1773,6 +1772,13 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h, if (!iommu->mmio_base) return -ENOMEM; + return init_iommu_from_acpi(iommu, h); +} + +static int __init init_iommu_one_late(struct amd_iommu *iommu) +{ + int ret; + if (alloc_cwwb_sem(iommu)) return -ENOMEM; @@ -1794,10 +1800,6 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h, if (amd_iommu_pre_enabled) amd_iommu_pre_enabled = translation_pre_enabled(iommu); - ret = init_iommu_from_acpi(iommu, h); - if (ret) - return ret; - if (amd_iommu_irq_remap) { ret = amd_iommu_create_irq_domain(iommu); if (ret) @@ -1808,7 +1810,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h, * Make sure IOMMU is not considered to translate itself. The IVRS * table tells us so, but this is a lie! */ - pci_seg->rlookup_table[iommu->devid] = NULL; + iommu->pci_seg->rlookup_table[iommu->devid] = NULL; return 0; } @@ -1853,6 +1855,7 @@ static int __init init_iommu_all(struct acpi_table_header *table) end += table->length; p += IVRS_HEADER_LENGTH; + /* Phase 1: Process all IVHD blocks */ while (p < end) { h = (struct ivhd_header *)p; if (*p == amd_iommu_target_ivhd_type) { @@ -1878,6 +1881,15 @@ static int __init init_iommu_all(struct acpi_table_header *table) } WARN_ON(p != end); + /* Phase 2 : Early feature support check */ + + /* Phase 3 : Enabling IOMMU features */ + for_each_iommu(iommu) { + ret = init_iommu_one_late(iommu); + if (ret) + return ret; + } + return 0; } -- 2.32.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/2] dma-mapping, remoteproc: Fix dma_mem leak after rproc_shutdown
On Sat, Apr 23, 2022 at 07:46:50PM +0200, Christoph Hellwig wrote: > Sigh. In theory drivers should never declare coherent memory like > this, and there has been some work to fix remoteproc in that regard. > > But I guess until that is merged we'll need somthing like this fix. Should I take this in the remoteproc tree? If so, can I get an RB? Thanks, Mathieu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Wed, Jun 22, 2022 at 08:05:12AM -0700, Jerry Snitselaar wrote: > On Wed, Jun 22, 2022 at 7:52 AM Baolu Lu wrote: > > > > On 2022/6/16 02:36, Steve Wahl wrote: > > > To support up to 64 sockets with 10 DMAR units each (640), make the > > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > > > set. > > > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > > > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > > > remapping doesn't support X2APIC mode x2apic disabled"; and the system > > > fails to boot properly. > > > > > > Signed-off-by: Steve Wahl > > > Reviewed-by: Kevin Tian > > > --- > > > > > > Note that we could not find a reason for connecting > > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > > > it seemed like the two would continue to match on earlier processors. > > > There doesn't appear to be kernel code that assumes that the value of > > > one is related to the other. > > > > > > v2: Make this value a config option, rather than a fixed constant. The > > > default > > > values should match previous configuration except in the MAXSMP case. > > > Keeping the > > > value at a power of two was requested by Kevin Tian. > > > > > > v3: Make the config option dependent upon DMAR_TABLE, as it is not used > > > without this. > > > > > > drivers/iommu/intel/Kconfig | 7 +++ > > > include/linux/dmar.h| 6 +- > > > 2 files changed, 8 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig > > > index 39a06d245f12..07aaebcb581d 100644 > > > --- a/drivers/iommu/intel/Kconfig > > > +++ b/drivers/iommu/intel/Kconfig > > > @@ -9,6 +9,13 @@ config DMAR_PERF > > > config DMAR_DEBUG > > > bool > > > > > > +config DMAR_UNITS_SUPPORTED > > > + int "Number of DMA Remapping Units supported" > > > + depends on DMAR_TABLE > > > + default 1024 if MAXSMP > > > + default 128 if X86_64 > > > + default 64 > > > > With this patch applied, the IOMMU configuration looks like: > > > > [*] AMD IOMMU support > > AMD IOMMU Version 2 driver > > [*] Enable AMD IOMMU internals in DebugFS > > (1024) Number of DMA Remapping Units supported NEW > > [*] Support for Intel IOMMU using DMA Remapping Devices > > [*] Export Intel IOMMU internals in Debugfs > > [*] Support for Shared Virtual Memory with Intel IOMMU > > [*] Enable Intel DMA Remapping Devices by default > > [*] Enable Intel IOMMU scalable mode by default > > [*] Support for Interrupt Remapping > > [*] OMAP IOMMU Support > > [*] Export OMAP IOMMU internals in DebugFS > > [*] Rockchip IOMMU Support > > > > The NEW item looks confusing. It looks to be a generic configurable > > value though it's actually Intel DMAR specific. Any thoughts? > > > > Best regards, > > baolu > > > > Would moving it under INTEL_IOMMU at least have it show up below > "Support for Intel IOMMU using DMA Remapping Devices"? I'm not sure it > can be better than that, because IRQ_REMAP selects DMAR_TABLE, so we > can't stick it in the if INTEL_IOMMU section. > > Regards, > Jerry I have no qualms with Jerry's suggestion. --> Steve Wahl -- Steve Wahl, Hewlett Packard Enterprise ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On Wed, Jun 22, 2022 at 7:52 AM Baolu Lu wrote: > > On 2022/6/16 02:36, Steve Wahl wrote: > > To support up to 64 sockets with 10 DMAR units each (640), make the > > value of DMAR_UNITS_SUPPORTED adjustable by a config variable, > > CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is > > set. > > > > If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set > > to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to > > allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ > > remapping doesn't support X2APIC mode x2apic disabled"; and the system > > fails to boot properly. > > > > Signed-off-by: Steve Wahl > > Reviewed-by: Kevin Tian > > --- > > > > Note that we could not find a reason for connecting > > DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps > > it seemed like the two would continue to match on earlier processors. > > There doesn't appear to be kernel code that assumes that the value of > > one is related to the other. > > > > v2: Make this value a config option, rather than a fixed constant. The > > default > > values should match previous configuration except in the MAXSMP case. > > Keeping the > > value at a power of two was requested by Kevin Tian. > > > > v3: Make the config option dependent upon DMAR_TABLE, as it is not used > > without this. > > > > drivers/iommu/intel/Kconfig | 7 +++ > > include/linux/dmar.h| 6 +- > > 2 files changed, 8 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig > > index 39a06d245f12..07aaebcb581d 100644 > > --- a/drivers/iommu/intel/Kconfig > > +++ b/drivers/iommu/intel/Kconfig > > @@ -9,6 +9,13 @@ config DMAR_PERF > > config DMAR_DEBUG > > bool > > > > +config DMAR_UNITS_SUPPORTED > > + int "Number of DMA Remapping Units supported" > > + depends on DMAR_TABLE > > + default 1024 if MAXSMP > > + default 128 if X86_64 > > + default 64 > > With this patch applied, the IOMMU configuration looks like: > > [*] AMD IOMMU support > AMD IOMMU Version 2 driver > [*] Enable AMD IOMMU internals in DebugFS > (1024) Number of DMA Remapping Units supported NEW > [*] Support for Intel IOMMU using DMA Remapping Devices > [*] Export Intel IOMMU internals in Debugfs > [*] Support for Shared Virtual Memory with Intel IOMMU > [*] Enable Intel DMA Remapping Devices by default > [*] Enable Intel IOMMU scalable mode by default > [*] Support for Interrupt Remapping > [*] OMAP IOMMU Support > [*] Export OMAP IOMMU internals in DebugFS > [*] Rockchip IOMMU Support > > The NEW item looks confusing. It looks to be a generic configurable > value though it's actually Intel DMAR specific. Any thoughts? > > Best regards, > baolu > Would moving it under INTEL_IOMMU at least have it show up below "Support for Intel IOMMU using DMA Remapping Devices"? I'm not sure it can be better than that, because IRQ_REMAP selects DMAR_TABLE, so we can't stick it in the if INTEL_IOMMU section. Regards, Jerry ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3] iommu/vt-d: Make DMAR_UNITS_SUPPORTED a config setting
On 2022/6/16 02:36, Steve Wahl wrote: To support up to 64 sockets with 10 DMAR units each (640), make the value of DMAR_UNITS_SUPPORTED adjustable by a config variable, CONFIG_DMAR_UNITS_SUPPORTED, and make it's default 1024 when MAXSMP is set. If the available hardware exceeds DMAR_UNITS_SUPPORTED (previously set to MAX_IO_APICS, or 128), it causes these messages: "DMAR: Failed to allocate seq_id", "DMAR: Parse DMAR table failure.", and "x2apic: IRQ remapping doesn't support X2APIC mode x2apic disabled"; and the system fails to boot properly. Signed-off-by: Steve Wahl Reviewed-by: Kevin Tian --- Note that we could not find a reason for connecting DMAR_UNITS_SUPPORTED to MAX_IO_APICS as was done previously. Perhaps it seemed like the two would continue to match on earlier processors. There doesn't appear to be kernel code that assumes that the value of one is related to the other. v2: Make this value a config option, rather than a fixed constant. The default values should match previous configuration except in the MAXSMP case. Keeping the value at a power of two was requested by Kevin Tian. v3: Make the config option dependent upon DMAR_TABLE, as it is not used without this. drivers/iommu/intel/Kconfig | 7 +++ include/linux/dmar.h| 6 +- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/intel/Kconfig b/drivers/iommu/intel/Kconfig index 39a06d245f12..07aaebcb581d 100644 --- a/drivers/iommu/intel/Kconfig +++ b/drivers/iommu/intel/Kconfig @@ -9,6 +9,13 @@ config DMAR_PERF config DMAR_DEBUG bool +config DMAR_UNITS_SUPPORTED + int "Number of DMA Remapping Units supported" + depends on DMAR_TABLE + default 1024 if MAXSMP + default 128 if X86_64 + default 64 With this patch applied, the IOMMU configuration looks like: [*] AMD IOMMU support AMD IOMMU Version 2 driver [*] Enable AMD IOMMU internals in DebugFS (1024) Number of DMA Remapping Units supported NEW [*] Support for Intel IOMMU using DMA Remapping Devices [*] Export Intel IOMMU internals in Debugfs [*] Support for Shared Virtual Memory with Intel IOMMU [*] Enable Intel DMA Remapping Devices by default [*] Enable Intel IOMMU scalable mode by default [*] Support for Interrupt Remapping [*] OMAP IOMMU Support [*] Export OMAP IOMMU internals in DebugFS [*] Rockchip IOMMU Support The NEW item looks confusing. It looks to be a generic configurable value though it's actually Intel DMAR specific. Any thoughts? Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] swiotlb: Remove redundant swiotlb_force
On 22/06/2022 15:32, Christoph Hellwig wrote: > On Wed, Jun 22, 2022 at 03:29:52PM +0100, Steven Price wrote: >> The variable (and enum) was removed in commit c6af2aa9ffc9 ("swiotlb: >> make the swiotlb_init interface more useful") but the declaration was >> left in swiotlb.h. Tidy up by removing the declaration as well. >> >> Signed-off-by: Steven Price > > I just applied an identical patch from Dongli Zhang a few hours ago. Ah, I missed that. Sorry for the noise! Steve ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] swiotlb: Remove redundant swiotlb_force
On Wed, Jun 22, 2022 at 03:29:52PM +0100, Steven Price wrote: > The variable (and enum) was removed in commit c6af2aa9ffc9 ("swiotlb: > make the swiotlb_init interface more useful") but the declaration was > left in swiotlb.h. Tidy up by removing the declaration as well. > > Signed-off-by: Steven Price I just applied an identical patch from Dongli Zhang a few hours ago. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] swiotlb: Remove redundant swiotlb_force
The variable (and enum) was removed in commit c6af2aa9ffc9 ("swiotlb: make the swiotlb_init interface more useful") but the declaration was left in swiotlb.h. Tidy up by removing the declaration as well. Signed-off-by: Steven Price --- include/linux/swiotlb.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 7ed35dd3de6e..b1f5ace37502 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -60,8 +60,6 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys, size_t size, enum dma_data_direction dir, unsigned long attrs); #ifdef CONFIG_SWIOTLB -extern enum swiotlb_force swiotlb_force; - /** * struct io_tlb_mem - IO TLB Memory Pool Descriptor * -- 2.25.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/1] iommu/vt-d: Fix RID2PASID setup failure
On 2022/6/22 17:09, Ethan Zhao wrote: 在 2022/6/22 12:41, Lu Baolu 写道: The IOMMU driver shares the pasid table for PCI alias devices. When the RID2PASID entry of the shared pasid table has been filled by the first device, the subsequent devices will encounter the "DMAR: Setup RID2PASID failed" failure as the pasid entry has already been marked as present. As the result, the IOMMU probing process will be aborted. This fixes it by skipping RID2PASID setting if the pasid entry has been populated. This works because the IOMMU core ensures that only the same IOMMU domain can be attached to all PCI alias devices at the same time. Therefore the subsequent devices just try to setup the RID2PASID entry with the same domain, which is negligible. This also adds domain validity checks for more confidence anyway. Fixes: ef848b7e5a6a0 ("iommu/vt-d: Setup pasid entry for RID2PASID support") Reported-by: Chenyi Qiang Cc: sta...@vger.kernel.org Signed-off-by: Lu Baolu --- drivers/iommu/intel/pasid.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) Change log: v2: - Add domain validity check in RID2PASID entry setup. diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index cb4c1d0cf25c..4f3525f3346f 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -575,6 +575,19 @@ static inline int pasid_enable_wpe(struct pasid_entry *pte) return 0; }; +/* + * Return true if @pasid is RID2PASID and the domain @did has already + * been setup to the @pte. Otherwise, return false. PCI alias devices + * probably share the single RID2PASID pasid entry in the shared pasid + * table. It's reasonable that those devices try to set a share domain + * in their probe paths. + */ I am thinking about the counter-part, the intel_pasid_tear_down_entry(), Multi devices share the same PASID entry, then one was detached from the domain, so the entry doesn't exist anymore, while another devices don't know about the change, and they are using the mapping, is it possible case ?shared thing, no refer-counter, am I missing something ? No. You are right. When any alias device is hot-removed from the system, the shared RID2PASID will be cleared without any notification to other devices. Hence any DMAs from those devices are blocked. We still have a lot to do for sharing pasid table among alias devices. Before we arrive there, let's remove it for now. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 0/5] mtk_iommu: Specify phandles to infracfg and pericfg
Hi Joerg, On 22/06/2022 15:44, Joerg Roedel wrote: On Thu, Jun 16, 2022 at 01:08:25PM +0200, AngeloGioacchino Del Regno wrote: AngeloGioacchino Del Regno (5): dt-bindings: iommu: mediatek: Add mediatek,infracfg phandle iommu/mediatek: Lookup phandle to retrieve syscon to infracfg arm64: dts: mediatek: mt8173: Add mediatek,infracfg phandle for IOMMU arm64: dts: mediatek: mt2712e: Add mediatek,infracfg phandle for IOMMU iommu/mediatek: Cleanup pericfg lookup flow .../bindings/iommu/mediatek,iommu.yaml| 17 +++ arch/arm64/boot/dts/mediatek/mt2712e.dtsi | 2 + arch/arm64/boot/dts/mediatek/mt8173.dtsi | 1 + drivers/iommu/mtk_iommu.c | 50 +++ 4 files changed, 49 insertions(+), 21 deletions(-) Applied, thanks. I wanted to check if you took also 3 and 4, as these should go through my tree. Unfortunately you haven't pushed your tree (yet). In case you took the whole series, can you please drop the dts patches. I'll apply them now on my v5.19-next/dts64 branch. Regards. Matthias ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 0/5] mtk_iommu: Specify phandles to infracfg and pericfg
Hi Joerg, On 22/06/2022 15:44, Joerg Roedel wrote: On Thu, Jun 16, 2022 at 01:08:25PM +0200, AngeloGioacchino Del Regno wrote: AngeloGioacchino Del Regno (5): dt-bindings: iommu: mediatek: Add mediatek,infracfg phandle iommu/mediatek: Lookup phandle to retrieve syscon to infracfg arm64: dts: mediatek: mt8173: Add mediatek,infracfg phandle for IOMMU arm64: dts: mediatek: mt2712e: Add mediatek,infracfg phandle for IOMMU iommu/mediatek: Cleanup pericfg lookup flow .../bindings/iommu/mediatek,iommu.yaml| 17 +++ arch/arm64/boot/dts/mediatek/mt2712e.dtsi | 2 + arch/arm64/boot/dts/mediatek/mt8173.dtsi | 1 + drivers/iommu/mtk_iommu.c | 50 +++ 4 files changed, 49 insertions(+), 21 deletions(-) Applied, thanks. I wanted to check if you took also 3 and 4, as these should go through my tree. Unfortunately you haven't pushed your tree (yet). In case you took the whole series, can you please drop the dts patches. I'll apply them now on my v5.19-next/dts64 branch. Regards. Matthias ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/ipmmu-vmsa: Fix compatible for rcar-gen4
On Fri, Jun 17, 2022 at 10:01:07AM +0900, Yoshihiro Shimoda wrote: > Fix compatible string for R-Car Gen4. > > Fixes: ae684caf465b ("iommu/ipmmu-vmsa: Add support for R-Car Gen4") > Signed-off-by: Yoshihiro Shimoda > --- > drivers/iommu/ipmmu-vmsa.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Applied for 5.19, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 0/5] mtk_iommu: Specify phandles to infracfg and pericfg
On Thu, Jun 16, 2022 at 01:08:25PM +0200, AngeloGioacchino Del Regno wrote: > AngeloGioacchino Del Regno (5): > dt-bindings: iommu: mediatek: Add mediatek,infracfg phandle > iommu/mediatek: Lookup phandle to retrieve syscon to infracfg > arm64: dts: mediatek: mt8173: Add mediatek,infracfg phandle for IOMMU > arm64: dts: mediatek: mt2712e: Add mediatek,infracfg phandle for IOMMU > iommu/mediatek: Cleanup pericfg lookup flow > > .../bindings/iommu/mediatek,iommu.yaml| 17 +++ > arch/arm64/boot/dts/mediatek/mt2712e.dtsi | 2 + > arch/arm64/boot/dts/mediatek/mt8173.dtsi | 1 + > drivers/iommu/mtk_iommu.c | 50 +++ > 4 files changed, 49 insertions(+), 21 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Remove usage of the deprecated ida_simple_xxx API
On Thu, Jun 16, 2022 at 10:05:55PM -0400, Bo Liu wrote: > Use ida_alloc_xxx()/ida_free() instead of > ida_simple_get()/ida_simple_remove(). > The latter is deprecated and more verbose. > > Signed-off-by: Bo Liu > --- > drivers/iommu/iommu.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) Sorry, just applied this similar change: https://lore.kernel.org/r/20220608021655.1538087-1-liuk...@huawei.com ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/dma: Fix race condition during iova_domain initialization
On Wed, Jun 22, 2022 at 02:27:57PM +0100, Robin Murphy wrote: > Apologies, I did spot this before, I've just been tied up with other things > and dropping everything non-critical on the floor, so didn't get round to > replying before it slipped my mind again. > > In summary, I hate it, but mostly because the whole situation of calling > iommu_probe_device off the back of driver probe is fundamentally broken. I'm > still a few steps away from fixing that properly, at which point I can just > as well rip all these little bodges out again. If it really does need > mitigating in the meantime (i.e. this is real-world async probe, not just > some contrived testcase), then I can't easily think of any cleaner hack, so, > > Acked-by: Robin Murphy Alright, applied this now. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/dma: Fix race condition during iova_domain initialization
On 2022-06-22 13:46, Joerg Roedel wrote: Please re-send with Robin Murphy in Cc. Apologies, I did spot this before, I've just been tied up with other things and dropping everything non-critical on the floor, so didn't get round to replying before it slipped my mind again. In summary, I hate it, but mostly because the whole situation of calling iommu_probe_device off the back of driver probe is fundamentally broken. I'm still a few steps away from fixing that properly, at which point I can just as well rip all these little bodges out again. If it really does need mitigating in the meantime (i.e. this is real-world async probe, not just some contrived testcase), then I can't easily think of any cleaner hack, so, Acked-by: Robin Murphy (somewhat reluctantly) Cheers, Robin. On Mon, May 30, 2022 at 08:07:45PM +0800, yf.w...@mediatek.com wrote: From: Yunfei Wang When many devices share the same iova domain, iommu_dma_init_domain() may be called at the same time. The checking of iovad->start_pfn will all get false in iommu_dma_init_domain() and both enter init_iova_domain() to do iovad initialization. Fix this by protecting init_iova_domain() with iommu_dma_cookie->mutex. Exception backtrace: rb_insert_color(param1=0xFF80CD2BDB40, param3=1) + 64 init_iova_domain() + 180 iommu_setup_dma_ops() + 260 arch_setup_dma_ops() + 132 of_dma_configure_id() + 468 platform_dma_configure() + 32 really_probe() + 1168 driver_probe_device() + 268 __device_attach_driver() + 524 __device_attach() + 524 bus_probe_device() + 64 deferred_probe_work_func() + 260 process_one_work() + 580 worker_thread() + 1076 kthread() + 332 ret_from_fork() + 16 Signed-off-by: Ning Li Signed-off-by: Yunfei Wang --- drivers/iommu/dma-iommu.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 09f6e1c0f9c0..b38c5041eeab 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -63,6 +63,7 @@ struct iommu_dma_cookie { /* Domain for flush queue callback; NULL if flush queue not in use */ struct iommu_domain *fq_domain; + struct mutexmutex; }; static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled); @@ -309,6 +310,7 @@ int iommu_get_dma_cookie(struct iommu_domain *domain) if (!domain->iova_cookie) return -ENOMEM; + mutex_init(>iova_cookie->mutex); return 0; } @@ -549,26 +551,33 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, } /* start_pfn is always nonzero for an already-initialised domain */ + mutex_lock(>mutex); if (iovad->start_pfn) { if (1UL << order != iovad->granule || base_pfn != iovad->start_pfn) { pr_warn("Incompatible range for DMA domain\n"); - return -EFAULT; + ret = -EFAULT; + goto done_unlock; } - return 0; + ret = 0; + goto done_unlock; } init_iova_domain(iovad, 1UL << order, base_pfn); ret = iova_domain_init_rcaches(iovad); if (ret) - return ret; + goto done_unlock; /* If the FQ fails we can simply fall back to strict mode */ if (domain->type == IOMMU_DOMAIN_DMA_FQ && iommu_dma_init_fq(domain)) domain->type = IOMMU_DOMAIN_DMA; - return iova_reserve_iommu_regions(dev, domain); + ret = iova_reserve_iommu_regions(dev, domain); + +done_unlock: + mutex_unlock(>mutex); + return ret; } /** -- 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
On 2022-06-22 13:59, Joerg Roedel wrote: On Thu, Jun 09, 2022 at 04:12:10PM +0100, Robin Murphy wrote: firmware bindings by now. Let's be brave and default it to off in the I don't have an overall good feeling about this, but as you said, let's be brave. This is applied now to the core branch. If it causes too much trouble we can still revert it (and re-revert it later, ...) Even easier, we can just bring back "default X86", or "default y", if too many folks object to configuring it manually :) Thanks for your bravery! Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
On Thu, Jun 09, 2022 at 04:12:10PM +0100, Robin Murphy wrote: > firmware bindings by now. Let's be brave and default it to off in the I don't have an overall good feeling about this, but as you said, let's be brave. This is applied now to the core branch. If it causes too much trouble we can still revert it (and re-revert it later, ...) Thanks, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH V4 1/1] swiotlb: Split up single swiotlb lock
I will build the next RFC version of 64-bit swiotlb on top of this patch (or next version of this patch), so that it will render a more finalized view of 32-bt/64-bit plus multiple area. Thank you very much! Dongli Zhang On 6/22/22 3:54 AM, Christoph Hellwig wrote: > Thanks, > > this looks pretty good to me. A few comments below: > > On Fri, Jun 17, 2022 at 10:47:41AM -0400, Tianyu Lan wrote: >> +/** >> + * struct io_tlb_area - IO TLB memory area descriptor >> + * >> + * This is a single area with a single lock. >> + * >> + * @used: The number of used IO TLB block. >> + * @index: The slot index to start searching in this area for next round. >> + * @lock: The lock to protect the above data structures in the map and >> + * unmap calls. >> + */ >> +struct io_tlb_area { >> +unsigned long used; >> +unsigned int index; >> +spinlock_t lock; >> +}; > > This can go into swiotlb.c. > >> +void __init swiotlb_adjust_nareas(unsigned int nareas); > > And this should be marked static. > >> +#define DEFAULT_NUM_AREAS 1 > > I'd drop this define, the magic 1 and a > 1 comparism seems to > convey how it is used much better as the checks aren't about default > or not, but about larger than one. > > I also think that we want some good way to size the default, e.g. > by number of CPUs or memory size. > >> +void __init swiotlb_adjust_nareas(unsigned int nareas) >> +{ >> +if (!is_power_of_2(nareas)) { >> +pr_err("swiotlb: Invalid areas parameter %d.\n", nareas); >> +return; >> +} >> + >> +default_nareas = nareas; >> + >> +pr_info("area num %d.\n", nareas); >> +/* Round up number of slabs to the next power of 2. >> + * The last area is going be smaller than the rest if >> + * default_nslabs is not power of two. >> + */ > > Please follow the normal kernel comment style with a /* on its own line. > > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/iommu__;!!ACWV5N9M2RV99hQ!Jd_DYgd6_uOF8IPr8h1tratEG51zFXtwVpaPa_OW3AEJlWe8gOnmA_fGOdaFUfsVcj1sT5oYw2j4vacY$ > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu: Directly use ida_alloc()/free()
On Wed, Jun 08, 2022 at 02:16:55AM +, Ke Liu wrote: > Use ida_alloc()/ida_free() instead of deprecated > ida_simple_get()/ida_simple_remove(). > > Signed-off-by: Ke Liu > Reviewed-by: Jason Gunthorpe > Signed-off-by: Michael S. Tsirkin > --- > v2 subject change to iommu > --- > drivers/iommu/iommu.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) Applied, thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/dma: Fix race condition during iova_domain initialization
Please re-send with Robin Murphy in Cc. On Mon, May 30, 2022 at 08:07:45PM +0800, yf.w...@mediatek.com wrote: > From: Yunfei Wang > > When many devices share the same iova domain, iommu_dma_init_domain() > may be called at the same time. The checking of iovad->start_pfn will > all get false in iommu_dma_init_domain() and both enter init_iova_domain() > to do iovad initialization. > > Fix this by protecting init_iova_domain() with iommu_dma_cookie->mutex. > > Exception backtrace: > rb_insert_color(param1=0xFF80CD2BDB40, param3=1) + 64 > init_iova_domain() + 180 > iommu_setup_dma_ops() + 260 > arch_setup_dma_ops() + 132 > of_dma_configure_id() + 468 > platform_dma_configure() + 32 > really_probe() + 1168 > driver_probe_device() + 268 > __device_attach_driver() + 524 > __device_attach() + 524 > bus_probe_device() + 64 > deferred_probe_work_func() + 260 > process_one_work() + 580 > worker_thread() + 1076 > kthread() + 332 > ret_from_fork() + 16 > > Signed-off-by: Ning Li > Signed-off-by: Yunfei Wang > --- > drivers/iommu/dma-iommu.c | 17 + > 1 file changed, 13 insertions(+), 4 deletions(-) > > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c > index 09f6e1c0f9c0..b38c5041eeab 100644 > --- a/drivers/iommu/dma-iommu.c > +++ b/drivers/iommu/dma-iommu.c > @@ -63,6 +63,7 @@ struct iommu_dma_cookie { > > /* Domain for flush queue callback; NULL if flush queue not in use */ > struct iommu_domain *fq_domain; > + struct mutexmutex; > }; > > static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled); > @@ -309,6 +310,7 @@ int iommu_get_dma_cookie(struct iommu_domain *domain) > if (!domain->iova_cookie) > return -ENOMEM; > > + mutex_init(>iova_cookie->mutex); > return 0; > } > > @@ -549,26 +551,33 @@ static int iommu_dma_init_domain(struct iommu_domain > *domain, dma_addr_t base, > } > > /* start_pfn is always nonzero for an already-initialised domain */ > + mutex_lock(>mutex); > if (iovad->start_pfn) { > if (1UL << order != iovad->granule || > base_pfn != iovad->start_pfn) { > pr_warn("Incompatible range for DMA domain\n"); > - return -EFAULT; > + ret = -EFAULT; > + goto done_unlock; > } > > - return 0; > + ret = 0; > + goto done_unlock; > } > > init_iova_domain(iovad, 1UL << order, base_pfn); > ret = iova_domain_init_rcaches(iovad); > if (ret) > - return ret; > + goto done_unlock; > > /* If the FQ fails we can simply fall back to strict mode */ > if (domain->type == IOMMU_DOMAIN_DMA_FQ && iommu_dma_init_fq(domain)) > domain->type = IOMMU_DOMAIN_DMA; > > - return iova_reserve_iommu_regions(dev, domain); > + ret = iova_reserve_iommu_regions(dev, domain); > + > +done_unlock: > + mutex_unlock(>mutex); > + return ret; > } > > /** > -- > 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 2/2] vfio: Use device_iommu_capable()
Use the new interface to check the capabilities for our device specifically. Signed-off-by: Robin Murphy --- drivers/vfio/vfio.c | 2 +- drivers/vfio/vfio_iommu_type1.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 73bab04880d0..765d68192c88 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -621,7 +621,7 @@ int vfio_register_group_dev(struct vfio_device *device) * VFIO always sets IOMMU_CACHE because we offer no way for userspace to * restore cache coherency. */ - if (!iommu_capable(device->dev->bus, IOMMU_CAP_CACHE_COHERENCY)) + if (!device_iommu_capable(device->dev, IOMMU_CAP_CACHE_COHERENCY)) return -EINVAL; return __vfio_register_dev(device, diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index e38b8bfde677..2107e95eb743 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -2247,7 +2247,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, list_add(>next, >group_list); msi_remap = irq_domain_check_msi_remap() || - iommu_capable(iommu_api_dev->dev->bus, IOMMU_CAP_INTR_REMAP); + device_iommu_capable(iommu_api_dev->dev, IOMMU_CAP_INTR_REMAP); if (!allow_unsafe_interrupts && !msi_remap) { pr_warn("%s: No interrupt remapping support. Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n", -- 2.36.1.dirty ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 1/2] vfio/type1: Simplify bus_type determination
Since IOMMU groups are mandatory for drivers to support, it stands to reason that any device which has been successfully be added to a group must be on a bus supported by that IOMMU driver, and therefore a domain viable for any device in the group must be viable for all devices in the group. This already has to be the case for the IOMMU API's internal default domain, for instance. Thus even if the group contains devices on different buses, that can only mean that the IOMMU driver actually supports such an odd topology, and so without loss of generality we can expect the bus type of any device in a group to be suitable for IOMMU API calls. Replace vfio_bus_type() with a simple call to resolve an appropriate member device from which to then derive a bus type. This is also a step towards removing the vague bus-based interfaces from the IOMMU API, when we can subsequently switch to using this device directly. Furthermore, scrutiny reveals a lack of protection for the bus being removed while vfio_iommu_type1_attach_group() is using it; the reference that VFIO holds on the iommu_group ensures that data remains valid, but does not prevent the group's membership changing underfoot. Holding the vfio_device for as long as we need here also neatly solves this. Signed-off-by: Robin Murphy --- After sleeping on it, I decided to type up the helper function approach to see how it looked in practice, and in doing so realised that with one more tweak it could also subsume the locking out of the common paths as well, so end up being a self-contained way for type1 to take care of its own concern, which I rather like. drivers/vfio/vfio.c | 18 +- drivers/vfio/vfio.h | 3 +++ drivers/vfio/vfio_iommu_type1.c | 30 +++--- 3 files changed, 31 insertions(+), 20 deletions(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 61e71c1154be..73bab04880d0 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -448,7 +448,7 @@ static void vfio_group_get(struct vfio_group *group) * Device objects - create, release, get, put, search */ /* Device reference always implies a group reference */ -static void vfio_device_put(struct vfio_device *device) +void vfio_device_put(struct vfio_device *device) { if (refcount_dec_and_test(>refcount)) complete(>comp); @@ -475,6 +475,22 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group, return NULL; } +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group) +{ + struct vfio_group *group = vfio_group_get_from_iommu(iommu_group); + struct vfio_device *device; + + mutex_lock(>device_lock); + list_for_each_entry(device, >device_list, group_next) { + if (vfio_device_try_get(device)) { + mutex_unlock(>device_lock); + return device; + } + } + mutex_unlock(>device_lock); + return NULL; +} + /* * VFIO driver API */ diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index a67130221151..e8f21e64541b 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -70,3 +70,6 @@ struct vfio_iommu_driver_ops { int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops); void vfio_unregister_iommu_driver(const struct vfio_iommu_driver_ops *ops); + +struct vfio_device *vfio_device_get_from_iommu(struct iommu_group *iommu_group); +void vfio_device_put(struct vfio_device *device); diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index c13b9290e357..e38b8bfde677 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -1679,18 +1679,6 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu, return ret; } -static int vfio_bus_type(struct device *dev, void *data) -{ - struct bus_type **bus = data; - - if (*bus && *bus != dev->bus) - return -EINVAL; - - *bus = dev->bus; - - return 0; -} - static int vfio_iommu_replay(struct vfio_iommu *iommu, struct vfio_domain *domain) { @@ -2159,7 +2147,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, struct vfio_iommu *iommu = iommu_data; struct vfio_iommu_group *group; struct vfio_domain *domain, *d; - struct bus_type *bus = NULL; + struct vfio_device *iommu_api_dev; bool resv_msi, msi_remap; phys_addr_t resv_msi_base = 0; struct iommu_domain_geometry *geo; @@ -2192,18 +2180,19 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, goto out_unlock; } - /* Determine bus_type in order to allocate a domain */ - ret = iommu_group_for_each_dev(iommu_group, , vfio_bus_type); - if (ret) + /* Resolve the group back to a member device for IOMMU API ops */ + ret = -ENODEV; + iommu_api_dev =
Re: [PATCH 3/3] iommu: Clean up release_device checks
On 2022/6/22 15:17, Robin Murphy wrote: On 2022-06-22 02:36, Baolu Lu wrote: On 2022/6/21 23:14, Robin Murphy wrote: Since .release_device is now called through per-device ops, any call which gets as far as a driver definitely*is* for that driver, for a device which has successfully passed .probe_device, so all the checks to that effect are now redundant and can be removed. In the same vein we can also skip freeing fwspecs which are now managed by core code. Does this depend on any other series? I didn't see iommu_fwspec_free() called in the core code. Or I missed anything? dev_iommu_free() cleans up param->fwspec directly (see b54240ad4943). FWIW the plan is that iommu_fwspec_free() should eventually go away - of the remaining uses after this, two are in fact similarly redundant already, since there's also a dev_iommu_free() in the probe failure path, and the other two should disappear in part 2 of fixing the bus probing mess (wherein the of_xlate step gets pulled into iommu_probe_device as well, and finally works correctly again). Yes, it is. Thanks for the explanation. Reviewed-by: Lu Baolu Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/9] driver core: Set fw_devlink.strict=1 by default
On Wed, Jun 22, 2022 at 12:52:02PM +0200, Andy Shevchenko wrote: > On Wed, Jun 22, 2022 at 10:44 AM Linus Walleij > wrote: > > On Wed, Jun 22, 2022 at 9:48 AM Sascha Hauer wrote: > > ... > > > > This patch has the effect that console UART devices which have "dmas" > > > properties specified in the device tree get deferred for 10 to 20 > > > seconds. This happens on i.MX and likely on other SoCs as well. On i.MX > > > the dma channel is only requested at UART startup time and not at probe > > > time. dma is not used for the console. Nevertheless with this driver probe > > > defers until the dma engine driver is available. > > > > > > It shouldn't go in as-is. > > > > This affects all machines with the PL011 UART and DMAs specified as > > well. > > > > It would be best if the console subsystem could be treated special and > > not require DMA devlink to be satisfied before probing. > > In 8250 we force disable DMA and PM on kernel consoles, because it's > so-o PITA and has a lot of corner cases we may never chase down. On i.MX this is done as well, but it doesn't help here. The driver is not even probed when the device node contains a "dmas" property. Sascha -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] dma-direct: take dma-ranges/offsets into account in resource mapping
I'd really like to hear something from the driver maintainers. The cod change itself looks fine, we just need to make sure it does not break any driver assumptions. And I think at least for the PCIe P2P and NTB cases I fear it might break them. The proper logic for those is in the p2p helpers, but it seems like not everyone is using them. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 4/7] iommu/amd: Introduce function to check and enable SNP
On 6/22/2022 3:35 PM, Robin Murphy wrote: Overall though, this is way nicer than v1, and it's definitely the right name in the right place now, thanks! FWIW, with those nits picked one way or another: Reviewed-by: Robin Murphy Cheers, Robin. Thanks for your review. I'll send out v3 w/ your suggestions and reviewed-by tag. Suravee ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH V4 1/1] swiotlb: Split up single swiotlb lock
Thanks, this looks pretty good to me. A few comments below: On Fri, Jun 17, 2022 at 10:47:41AM -0400, Tianyu Lan wrote: > +/** > + * struct io_tlb_area - IO TLB memory area descriptor > + * > + * This is a single area with a single lock. > + * > + * @used:The number of used IO TLB block. > + * @index: The slot index to start searching in this area for next round. > + * @lock:The lock to protect the above data structures in the map and > + * unmap calls. > + */ > +struct io_tlb_area { > + unsigned long used; > + unsigned int index; > + spinlock_t lock; > +}; This can go into swiotlb.c. > +void __init swiotlb_adjust_nareas(unsigned int nareas); And this should be marked static. > +#define DEFAULT_NUM_AREAS 1 I'd drop this define, the magic 1 and a > 1 comparism seems to convey how it is used much better as the checks aren't about default or not, but about larger than one. I also think that we want some good way to size the default, e.g. by number of CPUs or memory size. > +void __init swiotlb_adjust_nareas(unsigned int nareas) > +{ > + if (!is_power_of_2(nareas)) { > + pr_err("swiotlb: Invalid areas parameter %d.\n", nareas); > + return; > + } > + > + default_nareas = nareas; > + > + pr_info("area num %d.\n", nareas); > + /* Round up number of slabs to the next power of 2. > + * The last area is going be smaller than the rest if > + * default_nslabs is not power of two. > + */ Please follow the normal kernel comment style with a /* on its own line. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/9] driver core: Set fw_devlink.strict=1 by default
On Wed, Jun 22, 2022 at 10:44 AM Linus Walleij wrote: > On Wed, Jun 22, 2022 at 9:48 AM Sascha Hauer wrote: ... > > This patch has the effect that console UART devices which have "dmas" > > properties specified in the device tree get deferred for 10 to 20 > > seconds. This happens on i.MX and likely on other SoCs as well. On i.MX > > the dma channel is only requested at UART startup time and not at probe > > time. dma is not used for the console. Nevertheless with this driver probe > > defers until the dma engine driver is available. > > > > It shouldn't go in as-is. > > This affects all machines with the PL011 UART and DMAs specified as > well. > > It would be best if the console subsystem could be treated special and > not require DMA devlink to be satisfied before probing. In 8250 we force disable DMA and PM on kernel consoles, because it's so-o PITA and has a lot of corner cases we may never chase down. 089b6d365491 serial: 8250_port: Disable DMA operations for kernel console bedb404e91bb serial: 8250_port: Don't use power management for kernel console > It seems devlink is not quite aware of the concept of resources that are > necessary to probe vs resources that are nice to have and might be > added after probe. We need a strong devlink for the first category > and maybe a weak devlink for the latter category. > > I don't know if this is a generic hardware property for all operating > systems so it could be a DT property such as dma-weak-dependency? > Or maybe compromize and add a linux,dma-weak-dependency; > property? -- With Best Regards, Andy Shevchenko ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 0/4] swiotlb: some cleanup
Thanks, I've applied all 4 to the dma-mapping tree for Linux 5.20. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 00/35] iommu/amd: Add multiple PCI segments support
Hi Joerg, On 6/7/2022 4:17 PM, Vasant Hegde wrote: > Hello Joerg, > > > On 5/20/2022 5:42 PM, Vasant Hegde wrote: >> Joerg, >> >> >> On 5/20/2022 3:33 PM, Joerg Roedel wrote: >>> Hi Vasant, >>> >>> On Fri, May 20, 2022 at 03:25:38PM +0530, Vasant Hegde wrote: Ping. Did you get a chance to look into this series? >>> >>> Sorry, too late for this round. The changes are pretty invasive and >>> merging them at -rc7 stage would not give them enough testing before >>> being merged. Please send me a reminder after the next merge window. >> >> Sure. I will remind you after v5.19 merge window closes. > > Ping. Can you please take a look of this series? > Do you want me to rebase patchset on to of v5.19-rc1 -OR- latest iommu/next > branch? > Ping? -Vasant ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 2/3] iommu/samsung: Introduce Exynos sysmmu-v8 driver
On 22.06.2022 11:14, Robin Murphy wrote: > On 2022-06-21 20:57, Sam Protsenko wrote: >> Hi Marek, >> >> On Fri, 21 Jan 2022 at 14:31, Marek Szyprowski >> wrote: >> >> [snip] >> >>> >>> Well, for starting point the existing exynos-iommu driver is really >>> enough. I've played a bit with newer Exyos SoCs some time ago. If I >>> remember right, if you limit the iommu functionality to the essential >>> things like mapping pages to IO-virtual space, the hardware differences >>> between SYSMMU v5 (already supported by the exynos-iommu driver) and v7 >>> are just a matter of changing a one register during the initialization >>> and different bits the page fault reason decoding. You must of course >>> rely on the DMA-mapping framework and its implementation based on >>> mainline DMA-IOMMU helpers. All the code for custom iommu group(s) >>> handling or extended fault management are not needed for the initial >>> version. >>> >> >> Thanks for the advice! Just implemented some testing driver, which >> uses "Emulated Translation" registers available on SysMMU v7. That's >> one way to verify the IOMMU driver with no actual users of it. It >> works fine with vendor SysMMU driver I ported to mainline earlier, and >> now I'm trying to use it with exynos-sysmmu driver (existing >> upstream). If you're curious -- I can share the testing driver >> somewhere on GitHub. >> >> I believe the register you mentioned is PT_BASE one, so I used >> REG_V7_FLPT_BASE_VM = 0x800C instead of REG_V5_PT_BASE_PFN. But I >> didn't manage to get that far, unfortunately, as >> exynos_iommu_domain_alloc() function fails in my case, with BUG_ON() >> at this line: >> >> /* For mapping page table entries we rely on dma == phys */ >> BUG_ON(handle != virt_to_phys(domain->pgtable)); >> >> One possible explanation for this BUG is that "dma-ranges" property is >> not provided in DTS (which seems to be the case right now for all >> users of "samsung,exynos-sysmmu" driver). Because of that the SWIOTLB >> is used for dma_map_single() call (in exynos_iommu_domain_alloc() >> function), which in turn leads to that BUG. At least that's what >> happens in my case. The call chain looks like this: >> >> exynos_iommu_domain_alloc() >> v >> dma_map_single() >> v >> dma_map_single_attrs() >> v >> dma_map_page_attrs() >> v >> dma_direct_map_page() // dma_capable() == false >> v >> swiotlb_map() >> v >> swiotlb_tbl_map_single() >> >> And the last call of course always returns the address different than >> the address for allocated pgtable. E.g. in my case I see this: >> >> handle = 0xfbfff000 >> virt_to_phys(domain->pgtable) = 0x000880d0c000 >> >> Do you know what might be the reason for that? I just wonder how the >> SysMMU driver work for all existing Exynos platforms right now. I feel >> I might be missing something, like some DMA option should be enabled >> so that SWIOTLB is not used, or something like that. Please let me >> know if you have any idea on possible cause. The vendor's SysMMU >> driver is kinda different in that regard, as it doesn't use >> dma_map_single(), so I don't see such issue there. > > If this SysMMU version is capable of addressing more than 32 bits, > then exynos_iommu_probe_device() should set its DMA masks appropriately. Well, Sam's question was about the Exynos SYSMMU own platform device, not the device for which Exynos SYSMMU manages the IO virtual address space. Simply add something like dma_set_mask(dev, DMA_BIT_MASK(36)); to the beginning of the exynos_sysmmu_probe(). This will disable SWIOTLB and switch to DMA-direct for the Exynos SYSMMU platform device. > (as a side note since I looked, the use of PAGE_SIZE/PAGE_SHIFT in the > driver looks wrong, since I can't imagine that the hardware knows > whether Linux is using 4KB, 16KB or 64KB and adjusts itself > accordingly...) Right, this has to be cleaned up. This driver was never used on systems with kernel configuration for non-4KB pages. Best regards -- Marek Szyprowski, PhD Samsung R Institute Poland ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 2/3] iommu/samsung: Introduce Exynos sysmmu-v8 driver
On 2022-06-21 20:57, Sam Protsenko wrote: Hi Marek, On Fri, 21 Jan 2022 at 14:31, Marek Szyprowski wrote: [snip] Well, for starting point the existing exynos-iommu driver is really enough. I've played a bit with newer Exyos SoCs some time ago. If I remember right, if you limit the iommu functionality to the essential things like mapping pages to IO-virtual space, the hardware differences between SYSMMU v5 (already supported by the exynos-iommu driver) and v7 are just a matter of changing a one register during the initialization and different bits the page fault reason decoding. You must of course rely on the DMA-mapping framework and its implementation based on mainline DMA-IOMMU helpers. All the code for custom iommu group(s) handling or extended fault management are not needed for the initial version. Thanks for the advice! Just implemented some testing driver, which uses "Emulated Translation" registers available on SysMMU v7. That's one way to verify the IOMMU driver with no actual users of it. It works fine with vendor SysMMU driver I ported to mainline earlier, and now I'm trying to use it with exynos-sysmmu driver (existing upstream). If you're curious -- I can share the testing driver somewhere on GitHub. I believe the register you mentioned is PT_BASE one, so I used REG_V7_FLPT_BASE_VM = 0x800C instead of REG_V5_PT_BASE_PFN. But I didn't manage to get that far, unfortunately, as exynos_iommu_domain_alloc() function fails in my case, with BUG_ON() at this line: /* For mapping page table entries we rely on dma == phys */ BUG_ON(handle != virt_to_phys(domain->pgtable)); One possible explanation for this BUG is that "dma-ranges" property is not provided in DTS (which seems to be the case right now for all users of "samsung,exynos-sysmmu" driver). Because of that the SWIOTLB is used for dma_map_single() call (in exynos_iommu_domain_alloc() function), which in turn leads to that BUG. At least that's what happens in my case. The call chain looks like this: exynos_iommu_domain_alloc() v dma_map_single() v dma_map_single_attrs() v dma_map_page_attrs() v dma_direct_map_page() // dma_capable() == false v swiotlb_map() v swiotlb_tbl_map_single() And the last call of course always returns the address different than the address for allocated pgtable. E.g. in my case I see this: handle = 0xfbfff000 virt_to_phys(domain->pgtable) = 0x000880d0c000 Do you know what might be the reason for that? I just wonder how the SysMMU driver work for all existing Exynos platforms right now. I feel I might be missing something, like some DMA option should be enabled so that SWIOTLB is not used, or something like that. Please let me know if you have any idea on possible cause. The vendor's SysMMU driver is kinda different in that regard, as it doesn't use dma_map_single(), so I don't see such issue there. If this SysMMU version is capable of addressing more than 32 bits, then exynos_iommu_probe_device() should set its DMA masks appropriately. (as a side note since I looked, the use of PAGE_SIZE/PAGE_SHIFT in the driver looks wrong, since I can't imagine that the hardware knows whether Linux is using 4KB, 16KB or 64KB and adjusts itself accordingly...) Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/1] iommu/vt-d: Fix RID2PASID setup failure
Hi, 在 2022/6/22 12:41, Lu Baolu 写道: The IOMMU driver shares the pasid table for PCI alias devices. When the RID2PASID entry of the shared pasid table has been filled by the first device, the subsequent devices will encounter the "DMAR: Setup RID2PASID failed" failure as the pasid entry has already been marked as present. As the result, the IOMMU probing process will be aborted. This fixes it by skipping RID2PASID setting if the pasid entry has been populated. This works because the IOMMU core ensures that only the same IOMMU domain can be attached to all PCI alias devices at the same time. Therefore the subsequent devices just try to setup the RID2PASID entry with the same domain, which is negligible. This also adds domain validity checks for more confidence anyway. Fixes: ef848b7e5a6a0 ("iommu/vt-d: Setup pasid entry for RID2PASID support") Reported-by: Chenyi Qiang Cc: sta...@vger.kernel.org Signed-off-by: Lu Baolu --- drivers/iommu/intel/pasid.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) Change log: v2: - Add domain validity check in RID2PASID entry setup. diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index cb4c1d0cf25c..4f3525f3346f 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -575,6 +575,19 @@ static inline int pasid_enable_wpe(struct pasid_entry *pte) return 0; }; +/* + * Return true if @pasid is RID2PASID and the domain @did has already + * been setup to the @pte. Otherwise, return false. PCI alias devices + * probably share the single RID2PASID pasid entry in the shared pasid + * table. It's reasonable that those devices try to set a share domain + * in their probe paths. + */ I am thinking about the counter-part, the intel_pasid_tear_down_entry(), Multi devices share the same PASID entry, then one was detached from the domain, so the entry doesn't exist anymore, while another devices don't know about the change, and they are using the mapping, is it possible case ?shared thing, no refer-counter, am I missing something ? Thanks, Ethan +static inline bool +rid2pasid_domain_valid(struct pasid_entry *pte, u32 pasid, u16 did) +{ + return pasid == PASID_RID2PASID && pasid_get_domain_id(pte) == did; +} + /* * Set up the scalable mode pasid table entry for first only * translation type. @@ -595,9 +608,8 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu, if (WARN_ON(!pte)) return -EINVAL; - /* Caller must ensure PASID entry is not in use. */ if (pasid_pte_is_present(pte)) - return -EBUSY; + return rid2pasid_domain_valid(pte, pasid, did) ? 0 : -EBUSY; pasid_clear_entry(pte); @@ -698,9 +710,8 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu, return -ENODEV; } - /* Caller must ensure PASID entry is not in use. */ if (pasid_pte_is_present(pte)) - return -EBUSY; + return rid2pasid_domain_valid(pte, pasid, did) ? 0 : -EBUSY; pasid_clear_entry(pte); pasid_set_domain_id(pte, did); @@ -738,9 +749,8 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu, return -ENODEV; } - /* Caller must ensure PASID entry is not in use. */ if (pasid_pte_is_present(pte)) - return -EBUSY; + return rid2pasid_domain_valid(pte, pasid, did) ? 0 : -EBUSY; pasid_clear_entry(pte); pasid_set_domain_id(pte, did); -- AFAIK = As Far As I Know AKA = Also Known As ASAP = As Soon As Possible ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/9] driver core: Set fw_devlink.strict=1 by default
On Wed, Jun 22, 2022 at 9:48 AM Sascha Hauer wrote: > This patch has the effect that console UART devices which have "dmas" > properties specified in the device tree get deferred for 10 to 20 > seconds. This happens on i.MX and likely on other SoCs as well. On i.MX > the dma channel is only requested at UART startup time and not at probe > time. dma is not used for the console. Nevertheless with this driver probe > defers until the dma engine driver is available. > > It shouldn't go in as-is. This affects all machines with the PL011 UART and DMAs specified as well. It would be best if the console subsystem could be treated special and not require DMA devlink to be satisfied before probing. It seems devlink is not quite aware of the concept of resources that are necessary to probe vs resources that are nice to have and might be added after probe. We need a strong devlink for the first category and maybe a weak devlink for the latter category. I don't know if this is a generic hardware property for all operating systems so it could be a DT property such as dma-weak-dependency? Or maybe compromize and add a linux,dma-weak-dependency; property? Yours, Linus Walleij ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 6/7] iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
On 2022-06-16 02:55, Suravee Suthikulpanit wrote: Once SNP is enabled (by executing SNP_INIT command), IOMMU can no longer support the passthrough domain (i.e. IOMMU_DOMAIN_IDENTITY). The SNP_INIT command is called early in the boot process, and would fail if the kernel is configure to default to passthrough mode. After the system is already booted, users can try to change IOMMU domain type of a particular IOMMU group. In this case, the IOMMU driver needs to check the SNP-enable status and return failure when requesting to change domain type to identity. Therefore, return failure when trying to allocate identity domain. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd/iommu.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 4f4571d3ff61..d8a6df423b90 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -2119,6 +2119,15 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned type) { struct protection_domain *domain; + /* +* Since DTE[Mode]=0 is prohibited on SNP-enabled system, +* default to use IOMMU_DOMAIN_DMA[_FQ]. +*/ + if (amd_iommu_snp_en && (type == IOMMU_DOMAIN_IDENTITY)) { + pr_warn("Cannot allocate identity domain due to SNP\n"); Maybe pr_warn_once? Although on the other hand, perhaps anyone with the privilege to be messing with the sysfs interface at all could be trusted not to flood their own logs :/ Either way, Reviewed-by: Robin Murphy + return NULL; + } + domain = protection_domain_alloc(type); if (!domain) return NULL; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 4/7] iommu/amd: Introduce function to check and enable SNP
On 2022-06-16 02:55, Suravee Suthikulpanit wrote: From: Brijesh Singh To support SNP, IOMMU needs to be enabled, and prohibits IOMMU configurations where DTE[Mode]=0, which means it cannot be supported with IOMMU passthrough domain (a.k.a IOMMU_DOMAIN_IDENTITY), and when AMD IOMMU driver is configured to not use the IOMMU host (v1) page table. Otherwise, RMP table initialization could cause the system to crash. The request to enable SNP support in IOMMU must be done before PCI initialization state of the IOMMU driver because enabling SNP affects how IOMMU driver sets up IOMMU data structures (i.e. DTE). Unlike other IOMMU features, SNP feature does not have an enable bit in the IOMMU control register. Instead, the IOMMU driver introduces an amd_iommu_snp_en variable to track enabling state of SNP. Introduce amd_iommu_snp_enable() for other drivers to request enabling the SNP support in IOMMU, which checks all prerequisites and determines if the feature can be safely enabled. Please see the IOMMU spec section 2.12 for further details. Co-developed-by: Suravee Suthikulpanit Signed-off-by: Suravee Suthikulpanit Signed-off-by: Brijesh Singh --- drivers/iommu/amd/amd_iommu_types.h | 5 drivers/iommu/amd/init.c| 45 +++-- drivers/iommu/amd/iommu.c | 4 +-- include/linux/amd-iommu.h | 6 4 files changed, 56 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 73b729be7410..ce4db2835b36 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -463,6 +463,9 @@ extern bool amd_iommu_irq_remap; /* kmem_cache to get tables with 128 byte alignement */ extern struct kmem_cache *amd_iommu_irq_cache; +/* SNP is enabled on the system? */ +extern bool amd_iommu_snp_en; + #define PCI_SBDF_TO_SEGID(sbdf) (((sbdf) >> 16) & 0x) #define PCI_SBDF_TO_DEVID(sbdf) ((sbdf) & 0x) #define PCI_SEG_DEVID_TO_SBDF(seg, devid) u32)(seg) & 0x) << 16) | \ @@ -1013,4 +1016,6 @@ extern struct amd_irte_ops irte_32_ops; extern struct amd_irte_ops irte_128_ops; #endif +extern struct iommu_ops amd_iommu_ops; + #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */ diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 013c55e3c2f2..b5d3de327a5f 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -95,8 +95,6 @@ * out of it. */ -extern const struct iommu_ops amd_iommu_ops; - /* * structure describing one IOMMU in the ACPI table. Typically followed by one * or more ivhd_entrys. @@ -168,6 +166,9 @@ static int amd_iommu_target_ivhd_type; static bool amd_iommu_snp_sup; +bool amd_iommu_snp_en; +EXPORT_SYMBOL(amd_iommu_snp_en); + LIST_HEAD(amd_iommu_pci_seg_list);/* list of all PCI segments */ LIST_HEAD(amd_iommu_list);/* list of all AMD IOMMUs in the system */ @@ -3549,3 +3550,43 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64 return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true); } + +#ifdef CONFIG_AMD_MEM_ENCRYPT +int amd_iommu_snp_enable(void) +{ + /* +* The SNP support requires that IOMMU must be enabled, and is +* not configured in the passthrough mode. +*/ + if (no_iommu || iommu_default_passthrough()) { + pr_err("SNP: IOMMU is either disabled or configured in passthrough mode.\n"); I agree that clarifying what the actual implication is would be a good idea. + return -EINVAL; + } + + /* +* Prevent enabling SNP after IOMMU_ENABLED state because this process +* affect how IOMMU driver sets up data structures and configures +* IOMMU hardware. +*/ + if (init_state > IOMMU_ENABLED) { + pr_err("SNP: Too late to enable SNP for IOMMU.\n"); + return -EINVAL; + } + + amd_iommu_snp_en = amd_iommu_snp_sup; + if (!amd_iommu_snp_en) + return -EINVAL; + + pr_info("SNP enabled\n"); + + /* Enforce IOMMU v1 pagetable when SNP is enabled. */ + if (amd_iommu_pgtable != AMD_IOMMU_V1) { + pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n"); + amd_iommu_pgtable = AMD_IOMMU_V1; + amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES; + } + + return 0; +} +EXPORT_SYMBOL_GPL(amd_iommu_snp_enable); Once again this export seems dubious - surely the IOMMU is going to be enabled well before there's even a chance to load any modules? +#endif diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 86045dc50a0f..0792cd618dba 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -71,7 +71,7 @@ LIST_HEAD(acpihid_map); * Domain for untranslated devices - only
[PATCH] MAINTAINERS: Add new IOMMU development mailing list
From: Joerg Roedel The IOMMU mailing list will move from lists.linux-foundation.org to lists.linux.dev. The hard switch of the archive will happen on July 5th, but add the new list now already so that people start using the list when sending patches. Signed-off-by: Joerg Roedel --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 1fc9ead83d2a..b5bac297d63d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10354,6 +10354,7 @@ IOMMU DRIVERS M: Joerg Roedel M: Will Deacon L: iommu@lists.linux-foundation.org +L: io...@lists.linux.dev S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git F: Documentation/devicetree/bindings/iommu/ -- 2.36.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove
On Wed, Jun 22, 2022 at 04:14:45PM +0800, Zhangfei Gao wrote: > Hi, Greg > > On 2022/6/21 下午3:44, Greg Kroah-Hartman wrote: > > On Tue, Jun 21, 2022 at 03:37:31PM +0800, Zhangfei Gao wrote: > > > > > > On 2022/6/20 下午9:36, Greg Kroah-Hartman wrote: > > > > On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote: > > > > > On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote: > > > > > > > The refcount only ensures that the uacce_device object is not > > > > > > > freed as > > > > > > > long as there are open fds. But uacce_remove() can run while > > > > > > > there are > > > > > > > open fds, or fds in the process of being opened. And atfer > > > > > > > uacce_remove() > > > > > > > runs, the uacce_device object still exists but is mostly > > > > > > > unusable. For > > > > > > > example once the module is freed, uacce->ops is not valid > > > > > > > anymore. But > > > > > > > currently uacce_fops_open() may dereference the ops in this case: > > > > > > > > > > > > > > uacce_fops_open() > > > > > > >if (!uacce->parent->driver) > > > > > > >/* Still valid, keep going */ > > > > > > >...rmmod > > > > > > >uacce_remove() > > > > > > >... free_module() > > > > > > >uacce->ops->get_queue() /* BUG */ > > > > > > uacce_remove should wait for uacce->queues_lock, until fops_open > > > > > > release the > > > > > > lock. > > > > > > If open happen just after the uacce_remove: unlock, > > > > > > uacce_bind_queue in open > > > > > > should fail. > > > > > Ah yes sorry, I lost sight of what this patch was adding. But we could > > > > > have the same issue with the patch, just in a different order, no? > > > > > > > > > > uacce_fops_open() > > > > >uacce = xa_load() > > > > >...rmmod > > > > Um, how is rmmod called if the file descriptor is open? > > > > > > > > That should not be possible if the owner of the file descriptor is > > > > properly set. Please fix that up. > > > Thanks Greg > > > > > > Set cdev owner or use module_get/put can block rmmod once fops_open. > > > - uacce->cdev->owner = THIS_MODULE; > > > + uacce->cdev->owner = uacce->parent->driver->owner; > > > > > > However, still not find good method to block removing parent pci device. > > > > > > $ echo 1 > /sys/bus/pci/devices/:00:02.0/remove & > > > > > > [ 32.563350] uacce_remove+0x6c/0x148 > > > [ 32.563353] hisi_qm_uninit+0x12c/0x178 > > > [ 32.563356] hisi_zip_remove+0xa0/0xd0 [hisi_zip] > > > [ 32.563361] pci_device_remove+0x44/0xd8 > > > [ 32.563364] device_remove+0x54/0x88 > > > [ 32.563367] device_release_driver_internal+0xec/0x1a0 > > > [ 32.563370] device_release_driver+0x20/0x30 > > > [ 32.563372] pci_stop_bus_device+0x8c/0xe0 > > > [ 32.563375] pci_stop_and_remove_bus_device_locked+0x28/0x60 > > > [ 32.563378] remove_store+0x9c/0xb0 > > > [ 32.563379] dev_attr_store+0x20/0x38 > > Removing the parent pci device does not remove the module code, it > > removes the device itself. Don't confuse code vs. data here. > > Do you mean even parent pci device is removed immediately, the code has to > wait, like dma etc? No, reads will fail, as will DMA transfers, all PCI drivers need to handle surprise removal like this as we have had PCI hotplug systems for decades now. > Currently parent driver has to ensure all dma stopped then call > uacce_remove, > ie, after uacce_fops_open succeed, parent driver need wait fops_release, > then uacce_remove can be called. remove can be called before the file close can happen all the time, you need to handle that properly. > For example: > drivers/crypto/hisilicon/zip/zip_main.c: > hisi_qm_wait_task_finish > > If remove this wait , there may other issue, > Unable to handle kernel paging request at virtual address 8b700204 > pc : hisi_qm_cache_wb.part.0+0x2c/0xa0 > > So uacce only need serialize uacce_fops_open and uacce_remove. That's not going to help much. > After uacce_fops_open, we can assume uacce_remove only happen after > uacce_fops_release? Nope, again, device remove can happen at any point in time and is independent of userspace open/close of file descriptors on the char device. This is a common problem/pattern that drivers need to handle, and unfortunatly they all need to handle it on their own. We have discussed ways of making it easier (see the ksummit discuss list archives from last year), but no one has stepped up and done the work yet :( > Then it would be much simpler. Sorry. If you treat the structures as independant, and properly grab some reference counts or a lock, you should be ok. greg k-h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove
Hi, Greg On 2022/6/21 下午3:44, Greg Kroah-Hartman wrote: On Tue, Jun 21, 2022 at 03:37:31PM +0800, Zhangfei Gao wrote: On 2022/6/20 下午9:36, Greg Kroah-Hartman wrote: On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote: On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote: The refcount only ensures that the uacce_device object is not freed as long as there are open fds. But uacce_remove() can run while there are open fds, or fds in the process of being opened. And atfer uacce_remove() runs, the uacce_device object still exists but is mostly unusable. For example once the module is freed, uacce->ops is not valid anymore. But currently uacce_fops_open() may dereference the ops in this case: uacce_fops_open() if (!uacce->parent->driver) /* Still valid, keep going */ ...rmmod uacce_remove() ... free_module() uacce->ops->get_queue() /* BUG */ uacce_remove should wait for uacce->queues_lock, until fops_open release the lock. If open happen just after the uacce_remove: unlock, uacce_bind_queue in open should fail. Ah yes sorry, I lost sight of what this patch was adding. But we could have the same issue with the patch, just in a different order, no? uacce_fops_open() uacce = xa_load() ...rmmod Um, how is rmmod called if the file descriptor is open? That should not be possible if the owner of the file descriptor is properly set. Please fix that up. Thanks Greg Set cdev owner or use module_get/put can block rmmod once fops_open. - uacce->cdev->owner = THIS_MODULE; + uacce->cdev->owner = uacce->parent->driver->owner; However, still not find good method to block removing parent pci device. $ echo 1 > /sys/bus/pci/devices/:00:02.0/remove & [ 32.563350] uacce_remove+0x6c/0x148 [ 32.563353] hisi_qm_uninit+0x12c/0x178 [ 32.563356] hisi_zip_remove+0xa0/0xd0 [hisi_zip] [ 32.563361] pci_device_remove+0x44/0xd8 [ 32.563364] device_remove+0x54/0x88 [ 32.563367] device_release_driver_internal+0xec/0x1a0 [ 32.563370] device_release_driver+0x20/0x30 [ 32.563372] pci_stop_bus_device+0x8c/0xe0 [ 32.563375] pci_stop_and_remove_bus_device_locked+0x28/0x60 [ 32.563378] remove_store+0x9c/0xb0 [ 32.563379] dev_attr_store+0x20/0x38 Removing the parent pci device does not remove the module code, it removes the device itself. Don't confuse code vs. data here. Do you mean even parent pci device is removed immediately, the code has to wait, like dma etc? Currently parent driver has to ensure all dma stopped then call uacce_remove, ie, after uacce_fops_open succeed, parent driver need wait fops_release, then uacce_remove can be called. For example: drivers/crypto/hisilicon/zip/zip_main.c: hisi_qm_wait_task_finish If remove this wait , there may other issue, Unable to handle kernel paging request at virtual address 8b700204 pc : hisi_qm_cache_wb.part.0+0x2c/0xa0 So uacce only need serialize uacce_fops_open and uacce_remove. After uacce_fops_open, we can assume uacce_remove only happen after uacce_fops_release? Then it would be much simpler. Thanks ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 4/7] iommu/amd: Introduce function to check and enable SNP
Recap discussion on the other thread. https://lore.kernel.org/linux-mm/camkat6qorwbaxapacasm0sc9o2uq9zqzb6s1kbkvav2d4tk...@mail.gmail.com/#t On 6/16/2022 8:55 AM, Suravee Suthikulpanit wrote: +int amd_iommu_snp_enable(void) +{ + /* +* The SNP support requires that IOMMU must be enabled, and is +* not configured in the passthrough mode. +*/ + if (no_iommu || iommu_default_passthrough()) { + pr_err("SNP: IOMMU is either disabled or configured in passthrough mode.\n"); + return -EINVAL; + } Peter has suggested rewording to something more descriptive such as: "SNP: IOMMU is either disabled or configured in passthrough mode, SNP cannot be supported". Thank you, Suravee + /* +* Prevent enabling SNP after IOMMU_ENABLED state because this process +* affect how IOMMU driver sets up data structures and configures +* IOMMU hardware. +*/ + if (init_state > IOMMU_ENABLED) { + pr_err("SNP: Too late to enable SNP for IOMMU.\n"); + return -EINVAL; + } + + amd_iommu_snp_en = amd_iommu_snp_sup; + if (!amd_iommu_snp_en) + return -EINVAL; + + pr_info("SNP enabled\n"); + + /* Enforce IOMMU v1 pagetable when SNP is enabled. */ + if (amd_iommu_pgtable != AMD_IOMMU_V1) { + pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n"); + amd_iommu_pgtable = AMD_IOMMU_V1; + amd_iommu_ops.pgsize_bitmap = AMD_IOMMU_PGSIZES; + } + + return 0; +} +EXPORT_SYMBOL_GPL(amd_iommu_snp_enable); ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 3/5] vfio/iommu_type1: Remove the domain->ops comparison
On 2022-06-16 23:23, Nicolin Chen wrote: On Thu, Jun 16, 2022 at 06:40:14AM +, Tian, Kevin wrote: The domain->ops validation was added, as a precaution, for mixed-driver systems. However, at this moment only one iommu driver is possible. So remove it. It's true on a physical platform. But I'm not sure whether a virtual platform is allowed to include multiple e.g. one virtio-iommu alongside a virtual VT-d or a virtual smmu. It might be clearer to claim that (as Robin pointed out) there is plenty more significant problems than this to solve instead of simply saying that only one iommu driver is possible if we don't have explicit code to reject such configuration. Will edit this part. Thanks! Oh, physical platforms with mixed IOMMUs definitely exist already. The main point is that while bus_set_iommu still exists, the core code effectively *does* prevent multiple drivers from registering - even in emulated cases like the example above, virtio-iommu and VT-d would both try to bus_set_iommu(_bus_type), and one of them will lose. The aspect which might warrant clarification is that there's no combination of supported drivers which claim non-overlapping buses *and* could appear in the same system - even if you tried to contrive something by emulating, say, VT-d (PCI) alongside rockchip-iommu (platform), you could still only describe one or the other due to ACPI vs. Devicetree. Thanks, Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 7/9] driver core: Set fw_devlink.strict=1 by default
On Wed, Jun 01, 2022 at 12:07:03AM -0700, Saravana Kannan wrote: > Now that deferred_probe_timeout is non-zero by default, fw_devlink will > never permanently block the probing of devices. It'll try its best to > probe the devices in the right order and then finally let devices probe > even if their suppliers don't have any drivers. > > Signed-off-by: Saravana Kannan > --- > drivers/base/core.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) As mentioned here: https://lore.kernel.org/lkml/20220622062027.994614-1-peng@oss.nxp.com/ This patch has the effect that console UART devices which have "dmas" properties specified in the device tree get deferred for 10 to 20 seconds. This happens on i.MX and likely on other SoCs as well. On i.MX the dma channel is only requested at UART startup time and not at probe time. dma is not used for the console. Nevertheless with this driver probe defers until the dma engine driver is available. It shouldn't go in as-is. Sascha > > diff --git a/drivers/base/core.c b/drivers/base/core.c > index 61fdfe99b348..977b379a495b 100644 > --- a/drivers/base/core.c > +++ b/drivers/base/core.c > @@ -1613,7 +1613,7 @@ static int __init fw_devlink_setup(char *arg) > } > early_param("fw_devlink", fw_devlink_setup); > > -static bool fw_devlink_strict; > +static bool fw_devlink_strict = true; > static int __init fw_devlink_strict_setup(char *arg) > { > return strtobool(arg, _devlink_strict); > -- > 2.36.1.255.ge46751e96f-goog > > -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 04/10] gpu: host1x: Add context device management code
Hi Mikko, Thank you for the patch! Yet something to improve: [auto build test ERROR on drm/drm-next] [also build test ERROR on tegra/for-next linus/master v5.19-rc3] [cannot apply to tegra-drm/drm/tegra/for-next next-20220621] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/intel-lab-lkp/linux/commits/Mikko-Perttunen/Host1x-context-isolation-support/20220621-231339 base: git://anongit.freedesktop.org/drm/drm drm-next config: arm64-randconfig-r021-20220622 (https://download.01.org/0day-ci/archive/20220622/202206221557.laes8ynq-...@intel.com/config) compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 8b8d126598ce7bd5243da7f94f69fa1104288bee) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm64 cross compiling tool for clang build # apt-get install binutils-aarch64-linux-gnu # https://github.com/intel-lab-lkp/linux/commit/2501beeae7469b805f9f624049fd56643cf6e18e git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Mikko-Perttunen/Host1x-context-isolation-support/20220621-231339 git checkout 2501beeae7469b805f9f624049fd56643cf6e18e # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/gpu/host1x/ If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot All errors (new ones prefixed by >>): >> drivers/gpu/host1x/context.c:80:28: error: no member named 'ids' in 'struct >> iommu_fwspec' ctx->stream_id = fwspec->ids[0] & 0x; ~~ ^ 1 error generated. vim +80 drivers/gpu/host1x/context.c 15 16 int host1x_memory_context_list_init(struct host1x *host1x) 17 { 18 struct host1x_memory_context_list *cdl = >context_list; 19 struct device_node *node = host1x->dev->of_node; 20 struct host1x_memory_context *ctx; 21 unsigned int i; 22 int err; 23 24 cdl->devs = NULL; 25 cdl->len = 0; 26 mutex_init(>lock); 27 28 err = of_property_count_u32_elems(node, "iommu-map"); 29 if (err < 0) 30 return 0; 31 32 cdl->devs = kcalloc(err, sizeof(*cdl->devs), GFP_KERNEL); 33 if (!cdl->devs) 34 return -ENOMEM; 35 cdl->len = err / 4; 36 37 for (i = 0; i < cdl->len; i++) { 38 struct iommu_fwspec *fwspec; 39 40 ctx = >devs[i]; 41 42 ctx->host = host1x; 43 44 device_initialize(>dev); 45 46 /* 47 * Due to an issue with T194 NVENC, only 38 bits can be used. 48 * Anyway, 256GiB of IOVA ought to be enough for anyone. 49 */ 50 ctx->dma_mask = DMA_BIT_MASK(38); 51 ctx->dev.dma_mask = >dma_mask; 52 ctx->dev.coherent_dma_mask = ctx->dma_mask; 53 dev_set_name(>dev, "host1x-ctx.%d", i); 54 ctx->dev.bus = _context_device_bus_type; 55 ctx->dev.parent = host1x->dev; 56 57 dma_set_max_seg_size(>dev, UINT_MAX); 58 59 err = device_add(>dev); 60 if (err) { 61 dev_err(host1x->dev, "could not add context device %d: %d\n", i, err); 62 goto del_devices; 63 } 64 65 err = of_dma_configure_id(>dev, node, true, ); 66 if (err) { 67 dev_err(host1x->dev, "IOMMU configuration failed for context device %d: %d\n", 68 i, err); 69 device_del(>dev); 70 goto del_devices; 71 } 72 73 fwspec = dev_iommu_fwspec_get(>dev); 74 if (!fwspec) { 75 dev_err(host1x->dev, "Context device %d has no IOMMU!\n", i); 76 device_del(>dev); 77 goto del_devices; 78 }
Re: [PATCH 3/3] iommu: Clean up release_device checks
On 2022-06-22 02:36, Baolu Lu wrote: On 2022/6/21 23:14, Robin Murphy wrote: Since .release_device is now called through per-device ops, any call which gets as far as a driver definitely*is* for that driver, for a device which has successfully passed .probe_device, so all the checks to that effect are now redundant and can be removed. In the same vein we can also skip freeing fwspecs which are now managed by core code. Does this depend on any other series? I didn't see iommu_fwspec_free() called in the core code. Or I missed anything? dev_iommu_free() cleans up param->fwspec directly (see b54240ad4943). FWIW the plan is that iommu_fwspec_free() should eventually go away - of the remaining uses after this, two are in fact similarly redundant already, since there's also a dev_iommu_free() in the probe failure path, and the other two should disappear in part 2 of fixing the bus probing mess (wherein the of_xlate step gets pulled into iommu_probe_device as well, and finally works correctly again). Cheers, Robin. Signed-off-by: Robin Murphy --- drivers/iommu/apple-dart.c | 3 --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 +--- drivers/iommu/arm/arm-smmu/arm-smmu.c | 14 +++--- drivers/iommu/arm/arm-smmu/qcom_iommu.c | 11 --- drivers/iommu/exynos-iommu.c | 3 --- drivers/iommu/mtk_iommu.c | 5 - drivers/iommu/mtk_iommu_v1.c | 5 - drivers/iommu/sprd-iommu.c | 11 --- drivers/iommu/virtio-iommu.c | 8 +--- 9 files changed, 5 insertions(+), 63 deletions(-) Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu