Re: [PATCH 10/10] dmapool: improve scalability of dma_pool_free
On Tue, May 31, 2022 at 02:23:44PM -0400, Tony Battersby wrote:
> dma_pool_free() scales poorly when the pool contains many pages because
> pool_find_page() does a linear scan of all allocated pages. Improve its
> scalability by replacing the linear scan with a red-black tree lookup.
> In big O notation, this improves the algorithm from O(n^2) to O(n * log n).

The improvement should say O(n) to O(log n), right?

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
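The complexity point in the reply can be sketched in user space: a single free does one lookup, so the per-lookup cost is O(n) vs O(log n), and freeing all n pages totals O(n^2) vs O(n log n) — which is presumably what the patch description was counting. Below, a sorted array with binary search stands in for the kernel's red-black tree; all names here are illustrative, not the dmapool API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_page: each pool page covers
 * the address range [vaddr, vaddr + DEMO_PAGE_LEN). */
struct demo_page {
	unsigned long vaddr;
};

#define DEMO_PAGE_LEN 4096UL

/* O(n): what pool_find_page() did - scan every page in the pool. */
static struct demo_page *find_linear(struct demo_page *pages, size_t n,
				     unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= pages[i].vaddr &&
		    addr < pages[i].vaddr + DEMO_PAGE_LEN)
			return &pages[i];
	return NULL;
}

/* O(log n): binary search over pages sorted by vaddr, standing in for
 * the red-black tree walk (rb_node comparisons have the same shape). */
static struct demo_page *find_tree(struct demo_page *pages, size_t n,
				   unsigned long addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < pages[mid].vaddr)
			hi = mid;
		else if (addr >= pages[mid].vaddr + DEMO_PAGE_LEN)
			lo = mid + 1;
		else
			return &pages[mid];
	}
	return NULL;
}
```

Both return the same page for any address; only the number of comparisons differs as the pool grows.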
Re: [PATCH 08/10] dmapool: cleanup dma_pool_destroy
On Tue, May 31, 2022 at 02:22:21PM -0400, Tony Battersby wrote:
> +static void pool_free_page(struct dma_pool *pool,
> +			   struct dma_page *page,
> +			   bool destroying_pool)

'destroying_pool' is always true, so I don't think you need it.
Re: nvme: IO_PAGE_FAULT logged with Intel SSDPEKKF512G8
On Tue, Jan 18, 2022 at 03:32:45PM +0100, Paul Menzel wrote:
> On a Dell OptiPlex 5055 with an Intel SSDPEKKF512G8, Linux 5.10.82 reported
> an IO_PAGE_FAULT error. This is the first and only time this has happened.
>
> $ dmesg --level=err
> [4.194306] nvme :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
> domain=0x000c address=0xc080 flags=0x0050]
> [4.206970] nvme :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
> domain=0x000c address=0xc000 flags=0x0050]
> [7.327820] kfd kfd: VERDE not supported in kfd
> $ lspci -nn -s 01:00.0
> 01:00.0 Non-Volatile memory controller [0108]: Intel Corporation SSD Pro
> 7600p/760p/E 6100p Series [8086:f1a6] (rev 03)

I think it's a bug with the iommu implementation. If it causes problems,
you can typically work around it with the kernel parameter "iommu=soft".
Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.
On Mon, Feb 01, 2021 at 10:30:17AM -0800, Jianxiong Gao wrote:
> @@ -868,12 +871,24 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
>  	if (!iod->nents)
>  		goto out_free_sg;
>
> +	offset_ret = dma_set_min_align_mask(dev->dev, NVME_CTRL_PAGE_SIZE - 1);
> +	if (offset_ret) {
> +		dev_warn(dev->dev, "dma_set_min_align_mask failed to set offset\n");
> +		goto out_free_sg;
> +	}
> +
>  	if (is_pci_p2pdma_page(sg_page(iod->sg)))
>  		nr_mapped = pci_p2pdma_map_sg_attrs(dev->dev, iod->sg,
>  				iod->nents, rq_dma_dir(req), DMA_ATTR_NO_WARN);
>  	else
>  		nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
>  				rq_dma_dir(req), DMA_ATTR_NO_WARN);
> +
> +	offset_ret = dma_set_min_align_mask(dev->dev, 0);
> +	if (offset_ret) {
> +		dev_warn(dev->dev, "dma_set_min_align_mask failed to reset offset\n");
> +		goto out_free_sg;
> +	}
>  	if (!nr_mapped)
>  		goto out_free_sg;

Why is this setting being done and undone on each IO? Wouldn't it be more
efficient to set it once during device initialization?

And more importantly, this isn't thread safe: one CPU may be setting the
device's dma alignment mask to 0 while another CPU is expecting it to be
NVME_CTRL_PAGE_SIZE - 1.
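The race described in the reply can be modeled deterministically in user space (all names here are hypothetical stand-ins, not the real DMA API): the alignment mask is per-device shared state, so toggling it around every mapping call lets one CPU's reset clobber another CPU's in-flight expectation, whereas setting it once at probe time leaves nothing to race on.

```c
#include <assert.h>

/* Toy model of a device's shared DMA alignment state. */
struct demo_dev {
	unsigned long min_align_mask;
};

#define DEMO_CTRL_PAGE_SIZE 4096UL

/* Stand-in for a swiotlb-style bounce mapping: with a nonzero mask the
 * mapped address preserves the low bits (the page offset) of the
 * physical address; with a zero mask the offset is discarded. */
static unsigned long demo_map(struct demo_dev *dev, unsigned long phys)
{
	if (dev->min_align_mask)
		return 0x80000000UL | (phys & dev->min_align_mask);
	return 0x80000000UL; /* offset lost */
}
```

With set-once-at-init, demo_map() always preserves the offset; interleaving a second CPU's "reset to 0" between another CPU's set and map silently breaks the mapping, which is the thread-safety objection above.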
Re: [PATCH 2/3] Add swiotlb offset preserving mapping when dma_dma_parameters->page_offset_mask is non zero.
On Thu, Jan 28, 2021 at 12:15:28PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 27, 2021 at 04:38:28PM -0800, Jianxiong Gao wrote:
> > For devices that need to preserve address offset on mapping through
> > swiotlb, this patch adds offset preserving based on page_offset_mask
> > and keeps the offset if the mask is non zero. This is needed for
> > device drivers like NVMe.
>
> Didn't you send this patch like a month ago and someone pointed
> out that the right fix would be in the NVMe driver?
>
> Is there an issue with fixing the NVMe driver?

You got it backwards. The initial "fix" used a flag specific to the nvme
driver, and it was pointed out that it should just be the generic behaviour.
Re: [RFC PATCH 15/15] nvme-pci: Allow mmaping the CMB in userspace
On Fri, Nov 06, 2020 at 10:00:36AM -0700, Logan Gunthorpe wrote:
> Allow userspace to obtain CMB memory by mmaping the controller's
> char device. The mmap call allocates and returns a hunk of CMB memory,
> (the offset is ignored) so userspace does not have control over the
> address within the CMB.
>
> A VMA allocated in this way will only be usable by drivers that set
> FOLL_PCI_P2PDMA when calling GUP. And inter-device support will be
> checked the first time the pages are mapped for DMA.
>
> Currently this is only supported by O_DIRECT to an PCI NVMe device
> or through the NVMe passthrough IOCTL.

Rather than make this be specific to nvme, could pci p2pdma create an
mmap'able file for any resource registered with it?
Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;
On Wed, Sep 09, 2020 at 01:06:39PM -0700, Joe Perches wrote:
> diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
> index eea0f453cfb6..8aac5bc60f4c 100644
> --- a/crypto/tcrypt.c
> +++ b/crypto/tcrypt.c
> @@ -2464,7 +2464,7 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
>  		test_hash_speed("streebog512", sec, generic_hash_speed_template);
>  		if (mode > 300 && mode < 400) break;
> -		fallthrough;
> +		break;
>  	case 399:
>  		break;

Just imho, this change makes the preceding 'if' look even more pointless.
Maybe the fallthrough was a deliberate choice? Not that my opinion matters
here as I don't know this module, but it looked a bit odd to me.
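The "even more pointless" observation reduces to this: once the fallthrough becomes a break, the guarded break and the unconditional break after it do the same thing, so the whole `if` is dead weight. A minimal sketch (illustrative case bodies, not the real tcrypt dispatch):

```c
/* With the patch applied: both paths out of case 318 just break. */
static int with_guard(int mode)
{
	switch (mode) {
	case 318:
		if (mode > 300 && mode < 400)
			break;
		break;	/* was: fallthrough; */
	case 399:
		return 399;
	}
	return mode;
}

/* Behaviourally identical once the guarded break can no longer select
 * between breaking and falling through. */
static int without_guard(int mode)
{
	switch (mode) {
	case 318:
		break;
	case 399:
		return 399;
	}
	return mode;
}
```

Before the patch, the `if` chose between breaking out and falling through to the next case, so it carried weight; after, it chooses between break and break.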
Re: [PATCH v5 6/7] PCI: vmd: Stop overriding dma_map_ops
On Tue, Jan 21, 2020 at 06:37:50AM -0700, Jon Derrick wrote:
> Devices on the VMD domain use the VMD endpoint's requester ID and have
> been relying on the VMD endpoint's DMA operations. The problem with this
> was that VMD domain devices would use the VMD endpoint's attributes when
> doing DMA and IOMMU mapping. We can be smarter about this by only using
> the VMD endpoint when mapping and providing the correct child device's
> attributes during DMA operations.
>
> This patch removes the dma_map_ops redirect.
>
> Signed-off-by: Jon Derrick

Looks good.

Reviewed-by: Keith Busch
Re: [PATCH 4/5] PCI/vmd: Stop overriding dma_map_ops
On Wed, Aug 28, 2019 at 07:14:42AM -0700, Christoph Hellwig wrote:
> With a little tweak to the intel-iommu code we should be able to work
> around the VMD mess for the requester IDs without having to create giant
> amounts of boilerplate DMA ops wrapping code. The other advantage of
> this scheme is that we can respect the real DMA masks for the actual
> devices, and I bet it will only be a matter of time until we'll see the
> first DMA challeneged NVMe devices.

This tests out fine on VMD hardware, but it's quite different than the
previous patch. In v1, the original dev was used in iommu_need_mapping(),
but this time it's the vmd device. Is this still using the actual device's
DMA mask then?

> Signed-off-by: Christoph Hellwig
> ---
>  drivers/iommu/intel-iommu.c    |  25 ++
>  drivers/pci/controller/Kconfig |   1 -
>  drivers/pci/controller/vmd.c   | 150 -
>  3 files changed, 25 insertions(+), 151 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 12d094d08c0a..aaa35ac73956 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -373,6 +373,23 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
>  static DEFINE_SPINLOCK(device_domain_lock);
>  static LIST_HEAD(device_domain_list);
>
> +/*
> + * For VMD we need to use the VMD devices for mapping requests instead of the
> + * actual device to get the proper PCIe requester ID.
> + */
> +static inline struct device *vmd_real_dev(struct device *dev)
> +{
> +#if IS_ENABLED(CONFIG_VMD)
> +	if (dev_is_pci(dev)) {
> +		struct pci_sysdata *sd = to_pci_dev(dev)->bus->sysdata;
> +
> +		if (sd->vmd_dev)
> +			return sd->vmd_dev;
> +	}
> +#endif
> +	return dev;
> +}
> +
>  /*
>   * Iterate over elements in device_domain_list and call the specified
>   * callback @fn against each element.
> @@ -3520,6 +3537,7 @@ static dma_addr_t intel_map_page(struct device *dev, struct page *page,
>  			   enum dma_data_direction dir,
>  			   unsigned long attrs)
>  {
> +	dev = vmd_real_dev(dev);
>  	if (iommu_need_mapping(dev))
>  		return __intel_map_single(dev, page_to_phys(page) + offset,
>  				size, dir, *dev->dma_mask);
> @@ -3530,6 +3548,7 @@ static dma_addr_t intel_map_resource(struct device *dev, phys_addr_t phys_addr,
>  			   size_t size, enum dma_data_direction dir,
>  			   unsigned long attrs)
>  {
> +	dev = vmd_real_dev(dev);
>  	if (iommu_need_mapping(dev))
>  		return __intel_map_single(dev, phys_addr, size, dir,
>  				*dev->dma_mask);
> @@ -3585,6 +3604,7 @@ static void intel_unmap_page(struct device *dev, dma_addr_t dev_addr,
>  			   size_t size, enum dma_data_direction dir,
>  			   unsigned long attrs)
>  {
> +	dev = vmd_real_dev(dev);
>  	if (iommu_need_mapping(dev))
>  		intel_unmap(dev, dev_addr, size);
>  	else
> @@ -3594,6 +3614,7 @@ static void intel_unmap_resource(struct device *dev, dma_addr_t dev_addr,
>  		size_t size, enum dma_data_direction dir, unsigned long attrs)
>  {
> +	dev = vmd_real_dev(dev);
>  	if (iommu_need_mapping(dev))
>  		intel_unmap(dev, dev_addr, size);
>  }
> @@ -3605,6 +3626,7 @@ static void *intel_alloc_coherent(struct device *dev, size_t size,
>  	struct page *page = NULL;
>  	int order;
>
> +	dev = vmd_real_dev(dev);
>  	if (!iommu_need_mapping(dev))
>  		return dma_direct_alloc(dev, size, dma_handle, flags, attrs);
>
> @@ -3641,6 +3663,7 @@ static void intel_free_coherent(struct device *dev, size_t size, void *vaddr,
>  	int order;
>  	struct page *page = virt_to_page(vaddr);
>
> +	dev = vmd_real_dev(dev);
>  	if (!iommu_need_mapping(dev))
>  		return dma_direct_free(dev, size, vaddr, dma_handle, attrs);
>
> @@ -3661,6 +3684,7 @@ static void intel_unmap_sg(struct device *dev, struct scatterlist *sglist,
>  	struct scatterlist *sg;
>  	int i;
>
> +	dev = vmd_real_dev(dev);
>  	if (!iommu_need_mapping(dev))
>  		return dma_direct_unmap_sg(dev, sglist, nelems, dir, attrs);
>
> @@ -3685,6 +3709,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>  	struct intel_iommu *iommu;
>
>  	BUG_ON(dir == DMA_NONE);
> +	dev = vmd_real_dev(dev);
>  	if (!iommu_need_mapping(dev))
>  		return dma_direct_map_sg(dev, sglist, nelems, dir, attrs);
>
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index fe9f9f13ce11..920546cb84e2 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@
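The redirect the patch builds can be modeled in isolation with local stand-in types (these are not the kernel structures): every DMA entry point first swaps the child device for the VMD endpoint when one is recorded in the bus sysdata, which is why `*dev->dma_mask` afterwards refers to the VMD endpoint's mask rather than the child's — the point the question above is probing.

```c
#include <stddef.h>

/* Toy stand-ins for struct device / struct pci_sysdata. */
struct demo_device {
	const char *name;
	unsigned long long dma_mask;
};

struct demo_sysdata {
	struct demo_device *vmd_dev; /* set for devices behind a VMD bridge */
};

struct demo_pci_dev {
	struct demo_device dev;
	struct demo_sysdata *sysdata;
};

/* Same shape as the quoted vmd_real_dev(): redirect to the VMD endpoint
 * when present, otherwise use the device itself. */
static struct demo_device *demo_real_dev(struct demo_pci_dev *pdev)
{
	if (pdev->sysdata && pdev->sysdata->vmd_dev)
		return pdev->sysdata->vmd_dev;
	return &pdev->dev;
}
```

Once a mapping routine operates on the returned pointer, any mask it reads is the endpoint's, so a child with a narrower mask than the VMD device would not be honored under this scheme.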
Re: [PATCH] vmd: Stop overriding dma_map_ops
On Tue, Aug 27, 2019 at 11:24:05AM -0700, Derrick, Jonathan wrote:
> On Mon, 2019-08-26 at 17:06 +0200, Christoph Hellwig wrote:
> > With a little tweak to the intel-iommu code we should be able to work
> > around the VMD mess for the requester IDs without having to create giant
> > amounts of boilerplate DMA ops wrapping code. The other advantage of
> > this scheme is that we can respect the real DMA masks for the actual
> > devices, and I bet it will only be a matter of time until we'll see the
> > first DMA challeneged NVMe devices.
> >
> > The only downside is that we can't offer vmd as a module given that
> > intel-iommu calls into it. But the driver only has about 700 lines
> > of code, so this should not be a major issue.
>
> If we're going to remove its ability to be a module, and given its
> small size, could we make this default =y?
>
> Otherwise we risk breaking platforms which have it enabled with OSVs
> who miss enabling it

Can we keep this as a module if we stick the remapping struct device in
pci_sysdata instead of going through the vmd driver to get it?

Otherwise, very happy to see this dma wrapping go away.
Re: [PATCH 1/3] MAINTAINERS: Add Jonathan Derrick as VMD maintainer
On Mon, Aug 07, 2017 at 01:57:11PM -0600, Jon Derrick wrote:
> Add myself as VMD maintainer
>
> Signed-off-by: Jon Derrick <jonathan.derr...@intel.com>

Thanks for adding.

Acked-by: Keith Busch <keith.bu...@intel.com>

> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f66488d..3ec39df 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -10090,6 +10090,7 @@ F:	drivers/pci/dwc/*imx6*
>
>  PCI DRIVER FOR INTEL VOLUME MANAGEMENT DEVICE (VMD)
>  M:	Keith Busch <keith.bu...@intel.com>
> +M:	Jonathan Derrick <jonathan.derr...@intel.com>
>  L:	linux-...@vger.kernel.org
>  S:	Supported
>  F:	drivers/pci/host/vmd.c
> --
> 2.9.4