Re: [PATCH] iommu/amd: Fix amd_iommu_detect() (does not fix any issues).
On Tue, Oct 27, 2015 at 09:47:48AM +0900, Jerome Glisse wrote:
> On Mon, Oct 26, 2015 at 12:07:17PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Mon, Aug 31, 2015 at 06:13:03PM -0400, j.gli...@gmail.com wrote:
> > > From: Jérôme Glisse
> > >
> > > Fix amd_iommu_detect() to return positive value on success, like
> > > intended, and not zero. This will not change anything in the end
> > > as AMD IOMMU disable swiotlb and properly associate itself with
> >
> > Not sure how it disables SWIOTLB? The AMD Vi does not seem to
> > change 'swiotlb'. While 'gart_iommu_init' does. Did you mean
> > the AMD GART code?
>
> So this is convoluted and painful; each time I look back at it, it
> takes me a while to figure out how things happen. Basically,
> amd_iommu_init_dma_ops() will replace dma_ops with no_mmu unless
> passthrough is used, and when the AMD IOMMU associates itself with
> each device it sets archdata.dma_ops again. This unbinds the default
> swiotlb, which is initialized before the hardware IOMMU.
>
> > > devices even if detect() doesn't return a positive value.
> >
> > Returning positive will mean that the pci_iommu_alloc will stop
> > processing _all_ other IOMMUs.
> >
> > While returning 0 will let it detect the other IOMMUs.
>
> No, see the IOMMU_FINISH_IF_DETECTED flag in pci_iommu_alloc(), which
> is not set for AMD; hence my patch should not change anything (AFAICT
> and from testing, but I do not have all the AMD hardware that ever
> existed).
>
> So I am just making the detect function do what the API doc says it
> should do. See lines 72 to 80 of arch/x86/include/asm/iommu_table.h
>
> > Granted on an AMD machine there can be two 'IOMMU's - the GART
> > and the AMD Vi. The detection is always to call gart_iommu_hole_init
> > first, then amd_iommu_detect.
> >
> > I presume if there was one more type on AMD we would run into trouble.
>
> No, because of the IOMMU_FINISH_IF_DETECTED flag.
>
> Hope this clarifies this spaghetti mix :)

OK, my bad: AMD actually uses IOMMU_INIT_FINISH(), so detection finishes
before trying any other IOMMU. That makes sense for AMD, as the AMD
driver will call the GART init, gart_iommu_init(), if it fails to
initialize.

If we ever end up with a platform with multiple IOMMUs besides AMD, then
we will need to switch to IOMMU_INIT() instead of the finish variant.

Cheers,
Jérôme
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
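To make the interaction between the detect() return value and the finish flag concrete, here is a small userspace model of the table walk discussed in this message. It is a sketch, not the kernel's pci_iommu_alloc(): the struct layout, the flag values, and the helper names (walk_iommu_table, detect_returns_zero, detect_returns_one) are assumptions made up for illustration. Only the IOMMU_FINISH_IF_DETECTED name and the "detect() returns a positive value" convention come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values are illustrative; only the first name appears in the thread. */
#define IOMMU_FINISH_IF_DETECTED (1 << 0)
#define IOMMU_DETECTED           (1 << 1)

/* Hypothetical, simplified table entry for this sketch. */
struct iommu_entry {
	const char *name;
	int (*detect)(void);	/* > 0 means "hardware found" */
	int flags;
};

/* Hypothetical detect routines: the old (0) and patched (1) return values. */
int detect_returns_zero(void) { return 0; }
int detect_returns_one(void)  { return 1; }

/*
 * Walk the table: an entry counts as detected only when detect() > 0,
 * and the walk stops early only if that entry also carries
 * IOMMU_FINISH_IF_DETECTED. Returns the number of entries visited.
 */
size_t walk_iommu_table(struct iommu_entry *tbl, size_t n)
{
	size_t visited = 0;

	assert(tbl != NULL);
	for (size_t i = 0; i < n; i++) {
		visited++;
		if (tbl[i].detect && tbl[i].detect() > 0) {
			tbl[i].flags |= IOMMU_DETECTED;
			if (tbl[i].flags & IOMMU_FINISH_IF_DETECTED)
				break;
		}
	}
	return visited;
}
```

In this model, an entry that carries the finish flag lets the walk continue when its detect() returns 0 and stops it when detect() returns 1. For an entry without the flag, the return value only decides whether the entry is marked as detected; the walk continues either way.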
Re: [PATCH] iommu/amd: Fix amd_iommu_detect() (does not fix any issues).
On Mon, Oct 26, 2015 at 12:07:17PM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Aug 31, 2015 at 06:13:03PM -0400, j.gli...@gmail.com wrote:
> > From: Jérôme Glisse
> >
> > Fix amd_iommu_detect() to return positive value on success, like
> > intended, and not zero. This will not change anything in the end
> > as AMD IOMMU disable swiotlb and properly associate itself with
>
> Not sure how it disables SWIOTLB? The AMD Vi does not seem to
> change 'swiotlb'. While 'gart_iommu_init' does. Did you mean
> the AMD GART code?

So this is convoluted and painful; each time I look back at it, it
takes me a while to figure out how things happen. Basically,
amd_iommu_init_dma_ops() will replace dma_ops with no_mmu unless
passthrough is used, and when the AMD IOMMU associates itself with
each device it sets archdata.dma_ops again. This unbinds the default
swiotlb, which is initialized before the hardware IOMMU.

> > devices even if detect() doesn't return a positive value.
>
> Returning positive will mean that the pci_iommu_alloc will stop
> processing _all_ other IOMMUs.
>
> While returning 0 will let it detect the other IOMMUs.

No, see the IOMMU_FINISH_IF_DETECTED flag in pci_iommu_alloc(), which
is not set for AMD; hence my patch should not change anything (AFAICT
and from testing, but I do not have all the AMD hardware that ever
existed).

So I am just making the detect function do what the API doc says it
should do. See lines 72 to 80 of arch/x86/include/asm/iommu_table.h

> Granted on an AMD machine there can be two 'IOMMU's - the GART
> and the AMD Vi. The detection is always to call gart_iommu_hole_init
> first, then amd_iommu_detect.
>
> I presume if there was one more type on AMD we would run into trouble.

No, because of the IOMMU_FINISH_IF_DETECTED flag.

Hope this clarifies this spaghetti mix :)

Cheers,
Jérôme
Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping
On 26/10/15 13:44, Yong Wu wrote:
> On Thu, 2015-10-01 at 20:13 +0100, Robin Murphy wrote:
> [...]
> > +/*
> > + * The DMA API client is passing in a scatterlist which could describe
> > + * any old buffer layout, but the IOMMU API requires everything to be
> > + * aligned to IOMMU pages. Hence the need for this complicated bit of
> > + * impedance-matching, to be able to hand off a suitably-aligned list,
> > + * but still preserve the original offsets and sizes for the caller.
> > + */
> > +int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> > +		int nents, int prot)
> > +{
> > +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> > +	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova *iova;
> > +	struct scatterlist *s, *prev = NULL;
> > +	dma_addr_t dma_addr;
> > +	size_t iova_len = 0;
> > +	int i;
> > +
> > +	/*
> > +	 * Work out how much IOVA space we need, and align the segments to
> > +	 * IOVA granules for the IOMMU driver to handle. With some clever
> > +	 * trickery we can modify the list in-place, but reversibly, by
> > +	 * hiding the original data in the as-yet-unused DMA fields.
> > +	 */
> > +	for_each_sg(sg, s, nents, i) {
> > +		size_t s_offset = iova_offset(iovad, s->offset);
> > +		size_t s_length = s->length;
> > +
> > +		sg_dma_address(s) = s->offset;
> > +		sg_dma_len(s) = s_length;
> > +		s->offset -= s_offset;
> > +		s_length = iova_align(iovad, s_length + s_offset);
> > +		s->length = s_length;
> > +
> > +		/*
> > +		 * The simple way to avoid the rare case of a segment
> > +		 * crossing the boundary mask is to pad the previous one
> > +		 * to end at a naturally-aligned IOVA for this one's size,
> > +		 * at the cost of potentially over-allocating a little.
> > +		 */
> > +		if (prev) {
> > +			size_t pad_len = roundup_pow_of_two(s_length);
> > +
> > +			pad_len = (pad_len - iova_len) & (pad_len - 1);
> > +			prev->length += pad_len;
>
> Hi Robin,
>
> During our v4l2 testing, it seems we hit a problem here. Here we
> update prev->length again; do we need to update sg_dma_len(prev) too?
>
> Some functions, like vb2_dc_get_contiguous_size[1], always use
> sg_dma_len(s) for the comparison instead of s->length, so they may
> break unexpectedly when sg_dma_len(s) is not the same as s->length.

This is just tweaking the faked-up length that we hand off to
iommu_map_sg() (see also the iova_align() above), to trick it into
bumping this segment up to a suitable starting IOVA. The real length at
this point is stashed in sg_dma_len(s), and will be copied back into
s->length in __finalise_sg(), so both will hold the same true length
once we return to the caller.

Yes, it does mean that if you have a list where the segment lengths are
page aligned but not monotonically decreasing, e.g. {64k, 16k, 64k},
then you'll still end up with a gap between the second and third
segments, but that's fine because the DMA API offers no guarantees about
what the resulting DMA addresses will be (consider the no-IOMMU case
where they would each just be "mapped" to their physical address). If
that breaks v4l, then it's probably v4l's DMA API use that needs looking
at (again).

Robin.

> [1]: http://lxr.free-electrons.com/source/drivers/media/v4l2-core/videobuf2-dma-contig.c#L70
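The {64k, 16k, 64k} case can be checked by replaying just the padding arithmetic from the quoted loop in userspace. This is a sketch under assumptions: roundup_pow2() is a simple stand-in for the kernel's roundup_pow_of_two(), pad_segments() is a hypothetical helper that does not exist in the patch, and the segment lengths are assumed to be aligned to the IOVA granule already.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's roundup_pow_of_two(); n must be > 0. */
size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Replays only the padding arithmetic from the quoted loop.
 * len[] : granule-aligned segment lengths, padded in place
 * pad[] : out, padding inserted in front of segment i
 * Returns the total IOVA length that would be requested.
 */
size_t pad_segments(size_t *len, size_t *pad, size_t nents)
{
	size_t iova_len = 0;

	for (size_t i = 0; i < nents; i++) {
		pad[i] = 0;
		if (i > 0) {
			size_t pad_len = roundup_pow2(len[i]);

			/* Distance from iova_len up to the next multiple of pad_len. */
			pad_len = (pad_len - iova_len) & (pad_len - 1);
			len[i - 1] += pad_len;	/* previous segment absorbs the pad */
			pad[i] = pad_len;
			iova_len += pad_len;
		}
		iova_len += len[i];
	}
	return iova_len;
}
```

For lengths {64k, 16k, 64k} this gives no padding in front of the second segment and 48k in front of the third, so 192k of IOVA space is requested for 144k of data. For a decreasing list such as {64k, 16k, 4k}, no padding is added at all.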
Re: [PATCH] iommu/amd: Fix amd_iommu_detect() (does not fix any issues).
On Mon, Aug 31, 2015 at 06:13:03PM -0400, j.gli...@gmail.com wrote:
> From: Jérôme Glisse
>
> Fix amd_iommu_detect() to return positive value on success, like
> intended, and not zero. This will not change anything in the end
> as AMD IOMMU disable swiotlb and properly associate itself with

Not sure how it disables SWIOTLB? The AMD Vi does not seem to
change 'swiotlb'. While 'gart_iommu_init' does. Did you mean
the AMD GART code?

> devices even if detect() doesn't return a positive value.

Returning positive will mean that the pci_iommu_alloc will stop
processing _all_ other IOMMUs.

While returning 0 will let it detect the other IOMMUs.

Granted on an AMD machine there can be two 'IOMMU's - the GART
and the AMD Vi. The detection is always to call gart_iommu_hole_init
first, then amd_iommu_detect.

I presume if there was one more type on AMD we would run into trouble.

>
> Signed-off-by: Jérôme Glisse
> Cc: Joerg Roedel
> Cc: iommu@lists.linux-foundation.org
> ---
>  drivers/iommu/amd_iommu_init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index a24495e..360a451 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2198,7 +2198,7 @@ int __init amd_iommu_detect(void)
>  	iommu_detected = 1;
>  	x86_init.iommu.iommu_init = amd_iommu_init;
>
> -	return 0;
> +	return 1;
>  }
>
>  /
> --
> 1.8.3.1
Re: [PATCH v6 1/3] iommu: Implement common IOMMU ops for DMA mapping
On Thu, 2015-10-01 at 20:13 +0100, Robin Murphy wrote:
[...]
> +/*
> + * The DMA API client is passing in a scatterlist which could describe
> + * any old buffer layout, but the IOMMU API requires everything to be
> + * aligned to IOMMU pages. Hence the need for this complicated bit of
> + * impedance-matching, to be able to hand off a suitably-aligned list,
> + * but still preserve the original offsets and sizes for the caller.
> + */
> +int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
> +		int nents, int prot)
> +{
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova *iova;
> +	struct scatterlist *s, *prev = NULL;
> +	dma_addr_t dma_addr;
> +	size_t iova_len = 0;
> +	int i;
> +
> +	/*
> +	 * Work out how much IOVA space we need, and align the segments to
> +	 * IOVA granules for the IOMMU driver to handle. With some clever
> +	 * trickery we can modify the list in-place, but reversibly, by
> +	 * hiding the original data in the as-yet-unused DMA fields.
> +	 */
> +	for_each_sg(sg, s, nents, i) {
> +		size_t s_offset = iova_offset(iovad, s->offset);
> +		size_t s_length = s->length;
> +
> +		sg_dma_address(s) = s->offset;
> +		sg_dma_len(s) = s_length;
> +		s->offset -= s_offset;
> +		s_length = iova_align(iovad, s_length + s_offset);
> +		s->length = s_length;
> +
> +		/*
> +		 * The simple way to avoid the rare case of a segment
> +		 * crossing the boundary mask is to pad the previous one
> +		 * to end at a naturally-aligned IOVA for this one's size,
> +		 * at the cost of potentially over-allocating a little.
> +		 */
> +		if (prev) {
> +			size_t pad_len = roundup_pow_of_two(s_length);
> +
> +			pad_len = (pad_len - iova_len) & (pad_len - 1);
> +			prev->length += pad_len;

Hi Robin,

During our v4l2 testing, it seems we hit a problem here. Here we
update prev->length again; do we need to update sg_dma_len(prev) too?

Some functions, like vb2_dc_get_contiguous_size[1], always use
sg_dma_len(s) for the comparison instead of s->length, so they may
break unexpectedly when sg_dma_len(s) is not the same as s->length.

[1]: http://lxr.free-electrons.com/source/drivers/media/v4l2-core/videobuf2-dma-contig.c#L70

> +			iova_len += pad_len;
> +		}
> +
> +		iova_len += s_length;
> +		prev = s;
> +	}
> +
> +	iova = __alloc_iova(iovad, iova_len, dma_get_mask(dev));
> +	if (!iova)
> +		goto out_restore_sg;
> +
> +	/*
> +	 * We'll leave any physical concatenation to the IOMMU driver's
> +	 * implementation - it knows better than we do.
> +	 */
> +	dma_addr = iova_dma_addr(iovad, iova);
> +	if (iommu_map_sg(domain, dma_addr, sg, nents, prot) < iova_len)
> +		goto out_free_iova;
> +
> +	return __finalise_sg(dev, sg, nents, dma_addr);
> +
> +out_free_iova:
> +	__free_iova(iovad, iova);
> +out_restore_sg:
> +	__invalidate_sg(sg, nents);
> +	return 0;
> +}
> +
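The concern raised here can be illustrated with a userspace model of the kind of check the message refers to: walk the mapped list and sum sg_dma_len() for as long as each segment's DMA address follows directly on from the previous one. This is a sketch written from the description in this message, not a copy of the videobuf2 source; struct dma_seg and contiguous_size() are hypothetical names, and any addresses fed to it are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the DMA half of a scatterlist entry. */
struct dma_seg {
	uint64_t dma_address;	/* models sg_dma_address(s) */
	size_t   dma_len;	/* models sg_dma_len(s) */
};

/*
 * Sum dma_len over the leading run of segments whose DMA addresses
 * follow one another without a hole; stop at the first gap.
 */
size_t contiguous_size(const struct dma_seg *seg, size_t nents)
{
	uint64_t expected;
	size_t size = 0;

	assert(seg != NULL && nents > 0);
	expected = seg[0].dma_address;
	for (size_t i = 0; i < nents; i++) {
		if (seg[i].dma_address != expected)
			break;
		expected = seg[i].dma_address + seg[i].dma_len;
		size += seg[i].dma_len;
	}
	return size;
}
```

For three segments of 64k, 16k and 64k mapped back to back, the function reports 144k. If the third segment starts 48k later than the second one ends, the walk stops at the hole and reports 80k, which is how a check of this shape would see a buffer as shorter than it really is.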
[git pull] IOMMU Fixes for Linux v4.3-rc7
Hi Linus,

The following changes since commit 5adad9915472e180712030d730cdc476c6f8a60b:

  iommu/amd: Fix NULL pointer deref on device detach (2015-10-09 17:59:33 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-fixes-v4.3-rc7

for you to fetch changes up to cbf3ccd09d683abf1cacd36e3640872ee912d99b:

  iommu/amd: Don't clear DTE flags when modifying it (2015-10-21 11:29:06 +0200)

IOMMU Fixes for Linux v4.3-rc7

Two late fixes for the AMD IOMMU driver:

	* The first adds an additional check to the I/O page-fault
	  handler to avoid a BUG_ON being hit in handle_mm_fault().

	* The second fixes a problem where devices writing to the
	  system management area were blocked by the IOMMU because
	  the driver wrongly cleared the DTE flags allowing that
	  access.

Jay Cornwall (1):
      iommu/amd: Fix BUG when faulting a PROT_NONE VMA

Joerg Roedel (1):
      iommu/amd: Don't clear DTE flags when modifying it

 drivers/iommu/amd_iommu.c       | 4 ++--
 drivers/iommu/amd_iommu_types.h | 1 +
 drivers/iommu/amd_iommu_v2.c    | 7 +++
 3 files changed, 10 insertions(+), 2 deletions(-)

Please pull.

Thanks,

	Joerg