Re: [PATCH v2 1/5] drm/rockchip: fix fb references in async update
On 2019-03-15 11:25 a.m., Boris Brezillon wrote: > On Fri, 15 Mar 2019 11:11:36 +0100 > Michel Dänzer wrote: > >> On 2019-03-14 6:51 p.m., Helen Koike wrote: >>> On 3/14/19 6:15 AM, Michel Dänzer wrote: >>>> On 2019-03-13 7:08 p.m., Helen Koike wrote: >>>>> On 3/13/19 6:58 AM, Michel Dänzer wrote: >>>>>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote: >>>>>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon >>>>>>> wrote: >>>>>>>> On Tue, 12 Mar 2019 12:34:45 -0300 >>>>>>>> Helen Koike wrote: >>>>>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote: >>>>>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300 >>>>>>>>>> Helen Koike wrote: >>>>>>>>>> >>>>>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>>>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>>>>>>>>> @@ -912,30 +912,31 @@ static void >>>>>>>>>>> vop_plane_atomic_async_update(struct drm_plane *plane, >>>>>>>>>>> struct drm_plane_state >>>>>>>>>>> *new_state) >>>>>>>>>>> { >>>>>>>>>>>struct vop *vop = to_vop(plane->state->crtc); >>>>>>>>>>> - struct drm_plane_state *plane_state; >>>>>>>>>>> + struct drm_framebuffer *old_fb = plane->state->fb; >>>>>>>>>>> >>>>>>>>>>> - plane_state = plane->funcs->atomic_duplicate_state(plane); >>>>>>>>>>> - plane_state->crtc_x = new_state->crtc_x; >>>>>>>>>>> - plane_state->crtc_y = new_state->crtc_y; >>>>>>>>>>> - plane_state->crtc_h = new_state->crtc_h; >>>>>>>>>>> - plane_state->crtc_w = new_state->crtc_w; >>>>>>>>>>> - plane_state->src_x = new_state->src_x; >>>>>>>>>>> - plane_state->src_y = new_state->src_y; >>>>>>>>>>> - plane_state->src_h = new_state->src_h; >>>>>>>>>>> - plane_state->src_w = new_state->src_w; >>>>>>>>>>> - >>>>>>>>>>> - if (plane_state->fb != new_state->fb) >>>>>>>>>>> - drm_atomic_set_fb_for_plane(plane_state, new_state->fb); >>>>>>>>>>> - >>>>>>>>>>> - swap(plane_state, plane->state); >>>>>>>>>>> - >>>>>>>>>>> - if (plane->state->fb && plane->state->fb != new_state->fb) { >>>>>>>>>>> + /* >>>>>>>>>>> + * A scanout can still be occurring, so we can't 
drop the >>>>>>>>>>> reference to >>>>>>>>>>> + * the old framebuffer. To solve this we get a reference to >>>>>>>>>>> old_fb and >>>>>>>>>>> + * set a worker to release it later. >>>>>>>>>> >>>>>>>>>> Hm, doesn't look like an async update to me if we have to wait for >>>>>>>>>> the >>>>>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we >>>>>>>>>> should reject async updates when old_fb != new_fb in the rk >>>>>>>>>> ->async_check() hook. >>>>>>>>> >>>>>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a >>>>>>>>> reference to the fb in case it is being still used by the hw, so it >>>>>>>>> doesn't get released prematurely. >>>>>>>> >>>>>>>> I was just reacting to the comment that says the new FB should stay >>>>>>>> around until the next VBLANK event happens. If the FB must stay around >>>>>>>> that probably means the HW is still using, which made me wonder if this >>>>>>>> HW actually supports async update (where async means "update now and >&
Re: [PATCH v2 1/5] drm/rockchip: fix fb references in async update
On 2019-03-14 6:51 p.m., Helen Koike wrote: > On 3/14/19 6:15 AM, Michel Dänzer wrote: >> On 2019-03-13 7:08 p.m., Helen Koike wrote: >>> On 3/13/19 6:58 AM, Michel Dänzer wrote: >>>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote: >>>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon >>>>> wrote: >>>>>> On Tue, 12 Mar 2019 12:34:45 -0300 >>>>>> Helen Koike wrote: >>>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote: >>>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300 >>>>>>>> Helen Koike wrote: >>>>>>>> >>>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>>>>>>> @@ -912,30 +912,31 @@ static void >>>>>>>>> vop_plane_atomic_async_update(struct drm_plane *plane, >>>>>>>>> struct drm_plane_state >>>>>>>>> *new_state) >>>>>>>>> { >>>>>>>>>struct vop *vop = to_vop(plane->state->crtc); >>>>>>>>> - struct drm_plane_state *plane_state; >>>>>>>>> + struct drm_framebuffer *old_fb = plane->state->fb; >>>>>>>>> >>>>>>>>> - plane_state = plane->funcs->atomic_duplicate_state(plane); >>>>>>>>> - plane_state->crtc_x = new_state->crtc_x; >>>>>>>>> - plane_state->crtc_y = new_state->crtc_y; >>>>>>>>> - plane_state->crtc_h = new_state->crtc_h; >>>>>>>>> - plane_state->crtc_w = new_state->crtc_w; >>>>>>>>> - plane_state->src_x = new_state->src_x; >>>>>>>>> - plane_state->src_y = new_state->src_y; >>>>>>>>> - plane_state->src_h = new_state->src_h; >>>>>>>>> - plane_state->src_w = new_state->src_w; >>>>>>>>> - >>>>>>>>> - if (plane_state->fb != new_state->fb) >>>>>>>>> - drm_atomic_set_fb_for_plane(plane_state, new_state->fb); >>>>>>>>> - >>>>>>>>> - swap(plane_state, plane->state); >>>>>>>>> - >>>>>>>>> - if (plane->state->fb && plane->state->fb != new_state->fb) { >>>>>>>>> + /* >>>>>>>>> + * A scanout can still be occurring, so we can't drop the >>>>>>>>> reference to >>>>>>>>> + * the old framebuffer. To solve this we get a reference to old_fb >>>>>>>>> and >>>>>>>>> + * set a worker to release it later. 
>>>>>>>> >>>>>>>> Hm, doesn't look like an async update to me if we have to wait for the >>>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we >>>>>>>> should reject async updates when old_fb != new_fb in the rk >>>>>>>> ->async_check() hook. >>>>>>> >>>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a >>>>>>> reference to the fb in case it is being still used by the hw, so it >>>>>>> doesn't get released prematurely. >>>>>> >>>>>> I was just reacting to the comment that says the new FB should stay >>>>>> around until the next VBLANK event happens. If the FB must stay around >>>>>> that probably means the HW is still using, which made me wonder if this >>>>>> HW actually supports async update (where async means "update now and >>>>>> don't care about about tearing"). Or maybe it takes some time to switch >>>>>> to the new FB and waiting for the next VBLANK to release the old FB was >>>>>> an easy solution to not wait for the flip to actually happen in >>>>>> ->async_update() (which is kind of a combination of async+non-blocking). >>>>> >>>>> The hardware switches framebuffers on vblank, so whatever framebuffer >>>>> is currently being scanned out from needs to stay there until the >>>>> hardware switches to the new one in shadow registers. If that do
Re: [PATCH v2 1/5] drm/rockchip: fix fb references in async update
On 2019-03-13 7:08 p.m., Helen Koike wrote: > On 3/13/19 6:58 AM, Michel Dänzer wrote: >> On 2019-03-13 4:42 a.m., Tomasz Figa wrote: >>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon >>> wrote: >>>> On Tue, 12 Mar 2019 12:34:45 -0300 >>>> Helen Koike wrote: >>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote: >>>>>> On Mon, 11 Mar 2019 23:21:59 -0300 >>>>>> Helen Koike wrote: >>>>>> >>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c >>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct >>>>>>> drm_plane *plane, >>>>>>> struct drm_plane_state *new_state) >>>>>>> { >>>>>>>struct vop *vop = to_vop(plane->state->crtc); >>>>>>> - struct drm_plane_state *plane_state; >>>>>>> + struct drm_framebuffer *old_fb = plane->state->fb; >>>>>>> >>>>>>> - plane_state = plane->funcs->atomic_duplicate_state(plane); >>>>>>> - plane_state->crtc_x = new_state->crtc_x; >>>>>>> - plane_state->crtc_y = new_state->crtc_y; >>>>>>> - plane_state->crtc_h = new_state->crtc_h; >>>>>>> - plane_state->crtc_w = new_state->crtc_w; >>>>>>> - plane_state->src_x = new_state->src_x; >>>>>>> - plane_state->src_y = new_state->src_y; >>>>>>> - plane_state->src_h = new_state->src_h; >>>>>>> - plane_state->src_w = new_state->src_w; >>>>>>> - >>>>>>> - if (plane_state->fb != new_state->fb) >>>>>>> - drm_atomic_set_fb_for_plane(plane_state, new_state->fb); >>>>>>> - >>>>>>> - swap(plane_state, plane->state); >>>>>>> - >>>>>>> - if (plane->state->fb && plane->state->fb != new_state->fb) { >>>>>>> + /* >>>>>>> + * A scanout can still be occurring, so we can't drop the reference >>>>>>> to >>>>>>> + * the old framebuffer. To solve this we get a reference to old_fb >>>>>>> and >>>>>>> + * set a worker to release it later. >>>>>> >>>>>> Hm, doesn't look like an async update to me if we have to wait for the >>>>>> next VBLANK to happen to get the new content on the screen. 
Maybe we >>>>>> should reject async updates when old_fb != new_fb in the rk >>>>>> ->async_check() hook. >>>>> >>>>> Unless I am misunderstanding this, we don't wait here, we just grab a >>>>> reference to the fb in case it is being still used by the hw, so it >>>>> doesn't get released prematurely. >>>> >>>> I was just reacting to the comment that says the new FB should stay >>>> around until the next VBLANK event happens. If the FB must stay around >>>> that probably means the HW is still using, which made me wonder if this >>>> HW actually supports async update (where async means "update now and >>>> don't care about about tearing"). Or maybe it takes some time to switch >>>> to the new FB and waiting for the next VBLANK to release the old FB was >>>> an easy solution to not wait for the flip to actually happen in >>>> ->async_update() (which is kind of a combination of async+non-blocking). >>> >>> The hardware switches framebuffers on vblank, so whatever framebuffer >>> is currently being scanned out from needs to stay there until the >>> hardware switches to the new one in shadow registers. If that doesn't >>> happen, you get IOMMU faults and the display controller stops working >>> since we don't have any fault handling currently, just printing a >>> message. >> >> Sounds like your hardware doesn't actually support async flips. It's >> probably better for the driver not to pretend otherwise. > > I think wee need to clarify the meaning of the async_update callback > (and we should clarify it in the docs). > > The way I understand what the async_update callback should do is: don't > block (i.e. don't wait for the next vblank), Note that those are two separate things. "
Re: [PATCH v2 1/5] drm/rockchip: fix fb references in async update
> + * set a worker to release it later. >>>> >>>> Hm, doesn't look like an async update to me if we have to wait for the >>>> next VBLANK to happen to get the new content on the screen. Maybe we >>>> should reject async updates when old_fb != new_fb in the rk >>>> ->async_check() hook. >>> >>> Unless I am misunderstanding this, we don't wait here, we just grab a >>> reference to the fb in case it is being still used by the hw, so it >>> doesn't get released prematurely. >> >> I was just reacting to the comment that says the new FB should stay >> around until the next VBLANK event happens. If the FB must stay around >> that probably means the HW is still using it, which made me wonder if this >> HW actually supports async update (where async means "update now and >> don't care about tearing"). Or maybe it takes some time to switch >> to the new FB and waiting for the next VBLANK to release the old FB was >> an easy solution to not wait for the flip to actually happen in >> ->async_update() (which is kind of a combination of async+non-blocking). > > The hardware switches framebuffers on vblank, so whatever framebuffer > is currently being scanned out from needs to stay there until the > hardware switches to the new one in shadow registers. If that doesn't > happen, you get IOMMU faults and the display controller stops working > since we don't have any fault handling currently, just printing a > message. Sounds like your hardware doesn't actually support async flips. It's probably better for the driver not to pretend otherwise. -- Earthling Michel Dänzer | https://www.amd.com Libre software enthusiast | Mesa and X developer ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
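The core of the patch under discussion is a reference-counting move: take an extra reference on the old framebuffer before the state swap, and drop it later from a worker once the hardware has flipped. As a toy userspace model of that lifetime rule (all names here are hypothetical stand-ins, not the DRM API):

```c
#include <assert.h>

/* Minimal refcount stand-in for struct drm_framebuffer. */
struct fb_model {
    int refcount;
    int freed;
};

static void fb_get(struct fb_model *fb)
{
    fb->refcount++;
}

static void fb_put(struct fb_model *fb)
{
    if (--fb->refcount == 0)
        fb->freed = 1;   /* stands in for freeing the framebuffer */
}

/*
 * Async update: take an extra reference on the old fb before the state
 * swap, so that when the generic atomic code drops its reference the
 * buffer still survives any scanout that may be in flight.
 */
static void async_update_model(struct fb_model *old_fb)
{
    fb_get(old_fb);   /* keep old fb alive for the ongoing scanout */
    fb_put(old_fb);   /* generic code dropping its reference on swap */
}

/* The deferred worker releases the extra reference after the flip. */
static void vblank_worker_model(struct fb_model *old_fb)
{
    fb_put(old_fb);
}
```

Without the extra fb_get(), the fb_put() at swap time would free the buffer while the hardware is still scanning it out, which is exactly the IOMMU-fault scenario described in the thread.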
Re: amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code
On 2019-01-10 3:48 p.m., Christoph Hellwig wrote: > On Thu, Jan 10, 2019 at 03:00:31PM +0100, Christian König wrote: >>> From the trace it looks like we hit the case where swiotlb tries >>> to copy back data from a bounce buffer, but hits a dangling or NULL >>> pointer. So a couple questions for the submitter: >>> >>> - does the system have more than 4GB memory and thus use swiotlb? >>> (check /proc/meminfo, and if something SWIOTLB appears in dmesg) >>> - does the device this happens on have a DMA mask smaller than >>> the available memory, that is should swiotlb be used here to start >>> with? >> >> Rather unlikely. The device is an AMD GPU, so we can address memory up to >> 1TB. > > So we probably somehow got a false positive. > > For now I'd like the reporter to confirm that the dma_direct_unmap_page+0x92 > backtrace really is in the swiotlb code (I can't think of anything else, > but I'd rather be sure). > > Second it would be great to print what the contents of io_tlb_start > and io_tlb_end are, e.g. by doing a printk_once in is_swiotlb_buffer, > maybe that gives a clue why we are hitting the swiotlb code here. Any progress? https://bugzilla.kernel.org/show_bug.cgi?id=202261 was also filed about this. I hope everyone's clear that this needs to be resolved one way or another by 5.0 final (though the sooner, the better :). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
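The false positive Christoph suspects would come from the address-range test that decides whether an address belongs to the bounce-buffer pool. A toy userspace model of that check (window values made up for illustration; the real io_tlb_start/io_tlb_end are set at swiotlb init time, and this is not the kernel source):

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long long paddr_model_t; /* stand-in for phys_addr_t */

/* Illustrative bounce-buffer window. */
static paddr_model_t io_tlb_start = 0x100000ULL;
static paddr_model_t io_tlb_end   = 0x500000ULL;

/*
 * Simplified model of is_swiotlb_buffer(): an address is treated as a
 * bounce buffer iff it falls inside [io_tlb_start, io_tlb_end). A false
 * positive here sends a non-swiotlb address down the copy-back path,
 * which is the failure mode suspected in the trace above.
 */
static bool is_swiotlb_buffer_model(paddr_model_t paddr)
{
    return paddr >= io_tlb_start && paddr < io_tlb_end;
}
```

Printing the window bounds once from this check, as suggested above, would show whether the faulting address genuinely falls inside it.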
amdgpu/TTM oopses since merging swiotlb_dma_ops into the dma_direct code
Hi Christoph, https://bugs.freedesktop.org/109234 (please ignore comments #6-#9) was bisected to your commit 55897af63091 "dma-direct: merge swiotlb_dma_ops into the dma_direct code". Any ideas? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [PATCH] swiotlb: Fix inversed DMA_ATTR_NO_WARN test
On 2018-05-25 10:41 AM, Christoph Hellwig wrote: > On Tue, May 22, 2018 at 03:13:58PM +0200, Christian König wrote: >> Am 02.05.2018 um 18:59 schrieb Michel Dänzer: >>> On 2018-05-02 06:21 PM, Christoph Hellwig wrote: >>>> On Wed, May 02, 2018 at 04:31:09PM +0200, Michel Dänzer wrote: >>>>>> No. __GFP_NOWARN (and gfp_t flags in general) are the wrong interface >>>>>> for dma allocations and just cause problems. I actually plan to >>>>>> get rid of the gfp_t argument in dma_alloc_attrs sooner, and only >>>>>> allow either GFP_KERNEL or GFP_DMA passed in dma_alloc_coherent. >>>>> How about GFP_TRANSHUGE_LIGHT? TTM uses that to opportunistically >>>>> allocate huge pages (GFP_TRANSHUGE can result in unacceptably long >>>>> delays with memory pressure). >>>> Well, that is exactly what I don't want drivers to do - same for >>>> __GFP_COMP in some drm code. This very much assumes the page allocator >>>> is used to back dma allocations, which very often it actually isn't, and >>>> any use of magic gfp flags creates a tight coupling of consumers with a >>>> specific implementation. >>>> >>>> In general I can't think of a good reason not to actually use >>>> GFP_TRANSHUGE_LIGHT by default in the dma allocator unless >>>> DMA_ATTR_ALLOC_SINGLE_PAGES is set. Can you prepare a patch for that? >>> I'm afraid I'll have to leave that to somebody else. >> >> Coming back to this topic once more, sorry for the delay but busy as usual >> :) >> >> What exactly do you mean with "dma allocator" here? The TTM allocator using >> the dma_alloc_coherent calls? Or the swiotlb implementation of the calls? > > dma allocatr in this case: backends for dma_alloc_coherent/ > dma_alloc_attrs. Most importantly dma_direct_alloc. > > But while we're at it I can't actually see any GFP_TRANSHUGE_LIGHT > usage in TTM, just plain old GFP_TRANSHUGE. See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=da291320baec914f0bb4e65a9dccb86bd6c728f2 . 
-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [PATCH] swiotlb: Fix inversed DMA_ATTR_NO_WARN test
On 2018-05-02 06:21 PM, Christoph Hellwig wrote: > On Wed, May 02, 2018 at 04:31:09PM +0200, Michel Dänzer wrote: >>> No. __GFP_NOWARN (and gfp_t flags in general) are the wrong interface >>> for dma allocations and just cause problems. I actually plan to >>> get rid of the gfp_t argument in dma_alloc_attrs sooner, and only >>> allow either GFP_KERNEL or GFP_DMA passed in dma_alloc_coherent. >> >> How about GFP_TRANSHUGE_LIGHT? TTM uses that to opportunistically >> allocate huge pages (GFP_TRANSHUGE can result in unacceptably long >> delays with memory pressure). > > Well, that is exactly what I don't want drivers to do - same for > __GFP_COMP in some drm code. This very much assumes the page allocator > is used to back dma allocations, which very often it actually isn't, and > any use of magic gfp flags creates a tight coupling of consumers with a > specific implementation. > > In general I can't think of a good reason not to actually use > GFP_TRANSHUGE_LIGHT by default in the dma allocator unless > DMA_ATTR_ALLOC_SINGLE_PAGES is set. Can you prepare a patch for that? I'm afraid I'll have to leave that to somebody else. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [PATCH] swiotlb: Fix inversed DMA_ATTR_NO_WARN test
On 2018-05-02 02:41 PM, Christoph Hellwig wrote: > On Wed, May 02, 2018 at 02:18:56PM +0200, Daniel Vetter wrote: >> Other dma-api backends like cma just shut up when __GFP_NOWARN is >> passed. And afaiui Christoph Hellwig has plans to nuke the DMA_ATTR >> stuff (or at least clean it up) - should we just remove >> DMA_ATTR_NO_WARN and instead only look at __GFP_NOWARN? > > No. __GFP_NOWARN (and gfp_t flags in general) are the wrong interface > for dma allocations and just cause problems. I actually plan to > get rid of the gfp_t argument in dma_alloc_attrs sooner, and only > allow either GFP_KERNEL or GFP_DMA passed in dma_alloc_coherent. How about GFP_TRANSHUGE_LIGHT? TTM uses that to opportunistically allocate huge pages (GFP_TRANSHUGE can result in unacceptably long delays with memory pressure). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
[PATCH] swiotlb: Fix inversed DMA_ATTR_NO_WARN test
From: Michel Dänzer <michel.daen...@amd.com> The result was printing the warning only when we were explicitly asked not to. Cc: sta...@vger.kernel.org Fixes: 0176adb004065d6815a8e67946752df4cd947c5b "swiotlb: refactor coherent buffer allocation" Signed-off-by: Michel Dänzer <michel.daen...@amd.com> --- lib/swiotlb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index c43ec2271469..e9ac21540628 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -750,7 +750,7 @@ swiotlb_alloc_buffer(struct device *dev, size_t size, dma_addr_t *dma_handle, swiotlb_tbl_unmap_single(dev, phys_addr, size, DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC); out_warn: - if ((attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) { + if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) { dev_warn(dev, "swiotlb: coherent allocation failed, size=%zu\n", size); -- 2.17.0
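The patch inverts a single flag test: DMA_ATTR_NO_WARN asks the DMA layer not to warn, so the warning must fire only when the flag is absent. A standalone illustration of the before/after logic (the flag value and helper names below are for illustration only, not the kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag value; DMA_ATTR_NO_WARN suppresses the
 * allocation-failure warning. */
#define DMA_ATTR_NO_WARN (1UL << 8)

/* Pre-patch logic: warns exactly when we were asked not to. */
static bool should_warn_buggy(unsigned long attrs)
{
    return (attrs & DMA_ATTR_NO_WARN) != 0;
}

/* Post-patch logic: warn unless DMA_ATTR_NO_WARN is set. */
static bool should_warn_fixed(unsigned long attrs)
{
    return (attrs & DMA_ATTR_NO_WARN) == 0;
}
```

The buggy variant silences every caller that did not opt out of warnings, and warns the one caller that did, which is how the inversion went unnoticed until a warning was expected and missing.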
Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v2)
On 19/05/17 07:02 PM, arindam.n...@amd.com wrote: > From: Arindam Nath <arindam.n...@amd.com> > > Change History > -- > > v2: changes suggested by Joerg > - add flush flag to improve efficiency of flush operation > > v1: > - The idea behind flush queues is to defer the IOTLB flushing > for domains for which the mappings are no longer valid. We > add such domains in queue_add(), and when the queue size > reaches FLUSH_QUEUE_SIZE, we perform __queue_flush(). > > Since we have already taken lock before __queue_flush() > is called, we need to make sure the IOTLB flushing is > performed as quickly as possible. > > In the current implementation, we perform IOTLB flushing > for all domains irrespective of which ones were actually > added in the flush queue initially. This can be quite > expensive especially for domains for which unmapping is > not required at this point of time. > > This patch makes use of domain information in > 'struct flush_queue_entry' to make sure we only flush > IOTLBs for domains who need it, skipping others. > > Suggested-by: Joerg Roedel <j...@8bytes.org> > Signed-off-by: Arindam Nath <arindam.n...@amd.com> Please add these tags: Fixes: b1516a14657a ("iommu/amd: Implement flush queue") Cc: sta...@vger.kernel.org -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
On 07/04/17 07:20 PM, Joerg Roedel wrote: > On Mon, Mar 27, 2017 at 11:47:07AM +0530, arindam.n...@amd.com wrote: >> From: Arindam Nath <arindam.n...@amd.com> >> >> The idea behind flush queues is to defer the IOTLB flushing >> for domains for which the mappings are no longer valid. We >> add such domains in queue_add(), and when the queue size >> reaches FLUSH_QUEUE_SIZE, we perform __queue_flush(). >> >> Since we have already taken lock before __queue_flush() >> is called, we need to make sure the IOTLB flushing is >> performed as quickly as possible. >> >> In the current implementation, we perform IOTLB flushing >> for all domains irrespective of which ones were actually >> added in the flush queue initially. This can be quite >> expensive especially for domains for which unmapping is >> not required at this point of time. >> >> This patch makes use of domain information in >> 'struct flush_queue_entry' to make sure we only flush >> IOTLBs for domains who need it, skipping others. >> >> Signed-off-by: Arindam Nath <arindam.n...@amd.com> >> --- >> drivers/iommu/amd_iommu.c | 15 ++++++++------- >> 1 file changed, 8 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c >> index 98940d1..6a9a048 100644 >> --- a/drivers/iommu/amd_iommu.c >> +++ b/drivers/iommu/amd_iommu.c >> @@ -2227,15 +2227,16 @@ static struct iommu_group >> *amd_iommu_device_group(struct device *dev) >> >> static void __queue_flush(struct flush_queue *queue) >> { >> -struct protection_domain *domain; >> -unsigned long flags; >> int idx; >> >> -/* First flush TLB of all known domains */ >> -spin_lock_irqsave(&amd_iommu_pd_lock, flags); >> -list_for_each_entry(domain, &amd_iommu_pd_list, list) >> -domain_flush_tlb(domain); >> -spin_unlock_irqrestore(&amd_iommu_pd_lock, flags); >> +/* First flush TLB of all domains which were added to flush queue */ >> +for (idx = 0; idx < queue->next; ++idx) { >> +struct flush_queue_entry *entry; >> +
>> +entry = queue->entries + idx; >> +domain_flush_tlb(&entry->dma_dom->domain); >> +} > > With this we will flush a domain every time we find one of its > iova-addresses in the flush queue, so potentially we flush a domain > multiple times per __queue_flush() call. > > It's better to either add a flush-flag to the domains and evaluate that > in __queue_flush or keep a list of domains to flush to make the flushing > really more efficient. Arindam, can you incorporate Joerg's feedback? FWIW, looks like Carrizo systems are affected by this as well (see e.g. https://bugs.freedesktop.org/show_bug.cgi?id=101029#c21), so it would be good to land this fix in some form ASAP. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
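Joerg's flush-flag suggestion amounts to deduplicating flushes per domain rather than per queue entry. A toy userspace sketch of the idea (all types and names are hypothetical stand-ins, not the amd_iommu structures):

```c
#include <assert.h>
#include <stdbool.h>

struct domain_model {
    bool needs_flush;   /* the per-domain flush flag */
    int flush_count;    /* counts calls standing in for domain_flush_tlb() */
};

struct flush_entry_model {
    struct domain_model *domain;
};

/*
 * Mark every domain that has at least one queued entry, then flush each
 * marked domain exactly once, instead of flushing a domain once per
 * queue entry that points at it.
 */
static void queue_flush_model(struct flush_entry_model *entries, int n)
{
    int i;

    /* Pass 1: flag every domain referenced by the queue. */
    for (i = 0; i < n; i++)
        entries[i].domain->needs_flush = true;

    /* Pass 2: flush each flagged domain once, clearing its flag. */
    for (i = 0; i < n; i++) {
        struct domain_model *d = entries[i].domain;

        if (d->needs_flush) {
            d->flush_count++;   /* would be domain_flush_tlb(d) */
            d->needs_flush = false;
        }
    }
}
```

With this shape, a queue holding many entries for the same domain costs one flush for that domain, which is the efficiency Joerg asks for while still skipping domains that never appear in the queue.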