Re: [PATCH] of/device: add blacklist for iommu dma_ops
On 6/3/2019 11:50 AM, Tomasz Figa wrote:
On Mon, Jun 3, 2019 at 4:40 AM Rob Clark wrote:
On Fri, May 10, 2019 at 7:35 AM Rob Clark wrote:
On Tue, Dec 4, 2018 at 2:29 PM Rob Herring wrote:
On Sat, Dec 1, 2018 at 10:54 AM Rob Clark wrote:

This solves a problem we see with drm/msm, caused by getting iommu_dma_ops while we attach our own domain and manage it directly at the iommu API level:

   [0038] user address but active_mm is swapper
   Internal error: Oops: 9605 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90
   Hardware name: xxx (DT)
   Workqueue: events deferred_probe_work_func
   pstate: 80c9 (Nzcv daif +PAN +UAO)
   pc : iommu_dma_map_sg+0x7c/0x2c8
   lr : iommu_dma_map_sg+0x40/0x2c8
   sp : ff80095eb4f0
   x29: ff80095eb4f0 x28:
   x27: ffc0f9431578 x26:
   x25: x24: 0003
   x23: 0001 x22: ffc0fa9ac010
   x21: x20: ffc0fab40980
   x19: ffc0fab40980 x18: 0003
   x17: 01c4 x16: 0007
   x15: 000e x14:
   x13: x12: 0028
   x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
   x9 : x8 : ffc0fab409a0
   x7 : x6 : 0002
   x5 : 0001 x4 :
   x3 : 0001 x2 : 0002
   x1 : ffc0f9431578 x0 :
   Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
   Call trace:
    iommu_dma_map_sg+0x7c/0x2c8
    __iommu_map_sg_attrs+0x70/0x84
    get_pages+0x170/0x1e8
    msm_gem_get_iova+0x8c/0x128
    _msm_gem_kernel_new+0x6c/0xc8
    msm_gem_kernel_new+0x4c/0x58
    dsi_tx_buf_alloc_6g+0x4c/0x8c
    msm_dsi_host_modeset_init+0xc8/0x108
    msm_dsi_modeset_init+0x54/0x18c
    _dpu_kms_drm_obj_init+0x430/0x474
    dpu_kms_hw_init+0x5f8/0x6b4
    msm_drm_bind+0x360/0x6c8
    try_to_bring_up_master.part.7+0x28/0x70
    component_master_add_with_match+0xe8/0x124
    msm_pdev_probe+0x294/0x2b4
    platform_drv_probe+0x58/0xa4
    really_probe+0x150/0x294
    driver_probe_device+0xac/0xe8
    __device_attach_driver+0xa4/0xb4
    bus_for_each_drv+0x98/0xc8
    __device_attach+0xac/0x12c
    device_initial_probe+0x24/0x30
    bus_probe_device+0x38/0x98
    deferred_probe_work_func+0x78/0xa4
    process_one_work+0x24c/0x3dc
    worker_thread+0x280/0x360
    kthread+0x134/0x13c
    ret_from_fork+0x10/0x18
   Code: d284 91000725 6b17039f 5400048a (f9401f40)
   ---[ end trace f22dda57f3648e2c ]---
   Kernel panic - not syncing: Fatal exception
   SMP: stopping secondary CPUs
   Kernel Offset: disabled
   CPU features: 0x0,22802a18
   Memory Limit: none

The problem is that when drm/msm does its own iommu_attach_device(), now the domain returned by iommu_get_domain_for_dev() is drm/msm's domain, and it doesn't have domain->iova_cookie.

We kind of avoided this problem prior to sdm845/dpu because the iommu was attached to the mdp node in dt, which is a child of the toplevel mdss node (which corresponds to the dev passed in dma_map_sg()). But with sdm845, now the iommu is attached at the mdss level so we hit the iommu_dma_ops in dma_map_sg().

But auto allocating/attaching a domain before the driver is probed was already a blocking problem for enabling per-context pagetables for the GPU. This problem is also now solved with this patch.

Fixes: 97890ba9289c ("dma-mapping: detect and configure IOMMU in of_dma_configure")
Tested-by: Douglas Anderson
Signed-off-by: Rob Clark
---
This is an alternative/replacement for [1]. What it lacks in elegance it makes up for in practicality ;-)

[1] https://patchwork.freedesktop.org/patch/264930/

 drivers/of/device.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index 5957cd4fa262..15ffee00fb22 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -72,6 +72,14 @@ int of_device_add(struct platform_device *ofdev)
 	return device_add(&ofdev->dev);
 }

+static const struct of_device_id iommu_blacklist[] = {
+	{ .compatible = "qcom,mdp4" },
+	{ .compatible = "qcom,mdss" },
+	{ .compatible = "qcom,sdm845-mdss" },
+	{ .compatible = "qcom,adreno" },
+	{}
+};

Not completely clear whether this is still needed or not, but this really won't scale. Why can't the driver for these devices override whatever has been setup by default?

fwiw, at the moment it is not needed, but it will become needed again to implement per-context pagetables (although I suppose for this we only need to blacklist qcom,adreno and not also the display nodes).

So, another case I've come across, on the display side.. I'm working on handling the case where bootloader enables display (and takes iommu out of reset).. as soon as DMA domain gets attached we get iommu faults, because bootloader has already configured display for scanout. Unfortunately this all happens before actual driver is probed and has a chance to intervene.

It's rather unfortunate that we tried to be clever rather than just making drivers call some function to opt-in to the hookup of dma iommu ops :-(
Re: [PATCH -next] intel-iommu: fix a variable set but not used
On Fri, May 31, 2019 at 04:16:02PM -0400, Qian Cai wrote:
> The commit "iommu/vt-d: Delegate the dma domain to upper layer" left an
> unused variable,
>
> drivers/iommu/intel-iommu.c: In function 'disable_dmar_iommu':
> drivers/iommu/intel-iommu.c:1652:23: warning: variable 'domain' set but
> not used [-Wunused-but-set-variable]
>
> Signed-off-by: Qian Cai
> ---
> drivers/iommu/intel-iommu.c | 4
> 1 file changed, 4 deletions(-)

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 0/6] iommu/ipmmu-vmsa: Suspend/resume support and assorted cleanups
On Mon, May 27, 2019 at 01:52:47PM +0200, Geert Uytterhoeven wrote:
> Geert Uytterhoeven (6):
>   iommu/ipmmu-vmsa: Link IOMMUs and devices in sysfs
>   iommu/ipmmu-vmsa: Prepare to handle 40-bit error addresses
>   iommu/ipmmu-vmsa: Make IPMMU_CTX_MAX unsigned
>   iommu/ipmmu-vmsa: Move num_utlbs to SoC-specific features
>   iommu/ipmmu-vmsa: Extract hardware context initialization
>   iommu/ipmmu-vmsa: Add suspend/resume support

Applied, thanks Geert.
Re: [PATCH] iommu/dma: Fix condition check in iommu_dma_unmap_sg
On Wed, May 29, 2019 at 01:15:32AM -0700, Nathan Chancellor wrote:
> Fixes: 06d60728ff5c ("iommu/dma: move the arm64 wrappers to common code")
> Link: https://github.com/ClangBuiltLinux/linux/issues/497
> Signed-off-by: Nathan Chancellor
> ---
> drivers/iommu/dma-iommu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks.
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 3, 2019 at 12:57 AM Vivek Gautam wrote:
> [... full quote of the original patch and crash trace snipped ...]
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa wrote:
> On Mon, Jun 3, 2019 at 4:40 AM Rob Clark wrote:
> > [... full quote of the original patch and crash trace snipped ...]
Re: [PATCH v3 0/4] iommu/amd: Convert the AMD iommu driver to the dma-iommu api
Hi Tom,

On Mon, May 06, 2019 at 07:52:02PM +0100, Tom Murphy wrote:
> Convert the AMD iommu driver to the dma-iommu api. Remove the iova
> handling and reserve region code from the AMD iommu driver.

Thank you for your work on this! I appreciate that much, but I am not sure we are ready to make that move for the AMD and Intel IOMMU drivers yet.

My main concern right now is that these changes will add a per-page table lock into the fast-path for dma-mapping operations. There has been much work in the past to remove all locking from these code-paths and make it scalable on x86.

The dma-ops implementations in the x86 IOMMU drivers have the benefit that they can call their page-table manipulation functions directly and without locks, because they can make the necessary assumptions. The IOMMU-API mapping/unmapping path can't make these assumptions because it is also used for non-DMA-API use-cases.

So before we can move the AMD and Intel drivers to the generic DMA-API implementation we need to solve this problem to not introduce new scalability regressions.

Regards,

Joerg
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 3, 2019 at 4:14 PM Rob Clark wrote:
> On Mon, Jun 3, 2019 at 12:57 AM Vivek Gautam wrote:
> > [... full quote of the original patch and crash trace snipped ...]
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On 03/06/2019 11:47, Rob Clark wrote:
[... full quote of the original patch, crash trace and earlier replies snipped ...]
Re: [PATCH] of/device: add blacklist for iommu dma_ops
If you (and a few other actors in the thread) want people to actually read what you wrote, please follow proper mailing list etiquette. I've given up on reading all the recent mails after scrolling through two pages of full quotes.
Re: [PATCH v3 0/4] iommu/amd: Convert the AMD iommu driver to the dma-iommu api
On Mon, Jun 3, 2019 at 11:52 AM Joerg Roedel wrote:
>
> Hi Tom,
>
> On Mon, May 06, 2019 at 07:52:02PM +0100, Tom Murphy wrote:
> > Convert the AMD iommu driver to the dma-iommu api. Remove the iova
> > handling and reserve region code from the AMD iommu driver.
>
> Thank you for your work on this! I appreciate that much, but I am not
> sure we are ready to make that move for the AMD and Intel IOMMU drivers
> yet.
>
> My main concern right now is that these changes will add a per-page
> table lock into the fast-path for dma-mapping operations. There has been
> much work in the past to remove all locking from these code-paths and
> make it scalable on x86.

Where is the locking introduced? Intel doesn't use a lock in its iommu_map function:
https://github.com/torvalds/linux/blob/f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a/drivers/iommu/intel-iommu.c#L5302
because it cleverly uses cmpxchg64 to avoid using locks:
https://github.com/torvalds/linux/blob/f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a/drivers/iommu/intel-iommu.c#L900

And the locking in AMD's iommu_map function can be removed (and I have removed it in my patch set) because it does the same thing as Intel:
https://github.com/torvalds/linux/blob/f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a/drivers/iommu/amd_iommu.c#L1486

Is there something I'm missing?

> The dma-ops implementations in the x86 IOMMU drivers have the benefit
> that they can call their page-table manipulation functions directly and
> without locks, because they can make the necessary assumptions. The
> IOMMU-API mapping/unmapping path can't make these assumptions because it
> is also used for non-DMA-API use-cases.
>
> So before we can move the AMD and Intel drivers to the generic DMA-API
> implementation we need to solve this problem to not introduce new
> scalability regressions.
>
> Regards,
>
> Joerg
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 3, 2019 at 4:47 PM Christoph Hellwig wrote:
>
> If you (and a few other actors in the thread) want people to actually
> read what you wrote, please follow proper mailing list etiquette. I've
> given up on reading all the recent mails after scrolling through two
> pages of full quotes.

Apologies for not cutting down the quoted text. I will be more careful next time onwards.

Regards
Vivek
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v8 3/7] s390/pci: add support for IOMMU default DMA mode build options
On Thu, 30 May 2019, Zhen Lei wrote:
> The default DMA mode is LAZY on s390; this patch makes it possible to
> set STRICT at build time. It can be overridden by boot option.
>
> There is no functional change.
>
> Signed-off-by: Zhen Lei

Acked-by: Sebastian Ott
[PATCH v3] iommu/arm-smmu: Avoid constant zero in TLBI writes
From: Robin Murphy

Apparently, some Qualcomm arm64 platforms which appear to expose their SMMU global register space are still, in fact, using a hypervisor to mediate it by trapping and emulating register accesses. Sadly, some deployed versions of said trapping code have bugs wherein they go horribly wrong for stores using r31 (i.e. XZR/WZR) as the source register.

While this can be mitigated for GCC today by tweaking the constraints for the implementation of writel_relaxed(), to avoid any potential arms race with future compilers more aggressively optimising register allocation, the simple way is to just remove all the problematic constant zeros. For the write-only TLB operations, the actual value is irrelevant anyway and any old nearby variable will provide a suitable GPR to encode. The one point at which we really do need a zero to clear a context bank happens before any of the TLB maintenance where crashes have been reported, so is apparently not a problem... :/

Reported-by: AngeloGioacchino Del Regno
Tested-by: Marc Gonzalez
Signed-off-by: Robin Murphy
Signed-off-by: Marc Gonzalez
---
Changes from v2:
- Define and use QCOM_DUMMY_VAL for the 3 problematic mmio writes
- Drop previous Reviewed-by and Tested-by tags (rationale: we are now writing a different value)

Boot log:
REMAP: PA=0168 VA=ff80111b SIZE=1
arm-smmu 168.arm,smmu: probing hardware configuration...
arm-smmu 168.arm,smmu: SMMUv2 with: arm-smmu 168.arm,smmu: stage 1 translation arm-smmu 168.arm,smmu: address translation ops arm-smmu 168.arm,smmu: non-coherent table walk arm-smmu 168.arm,smmu: (IDR0.CTTW overridden by FW configuration) arm-smmu 168.arm,smmu: stream matching with 16 register groups arm-smmu 168.arm,smmu: 6 context banks (0 stage-2 only) arm-smmu 168.arm,smmu: Supported page sizes: 0x63315000 arm-smmu 168.arm,smmu: Stage-1: 36-bit VA -> 36-bit IPA [SMMU + 48] = [SMMU + 000c00] = 0002 [SMMU + 000800] = [SMMU + 000c04] = 0002 [SMMU + 000804] = [SMMU + 000c08] = 0002 [SMMU + 000808] = [SMMU + 000c0c] = 0002 [SMMU + 00080c] = [SMMU + 000c10] = 0002 [SMMU + 000810] = [SMMU + 000c14] = 0002 [SMMU + 000814] = [SMMU + 000c18] = 0002 [SMMU + 000818] = [SMMU + 000c1c] = 0002 [SMMU + 00081c] = [SMMU + 000c20] = 0002 [SMMU + 000820] = [SMMU + 000c24] = 0002 [SMMU + 000824] = [SMMU + 000c28] = 0002 [SMMU + 000828] = [SMMU + 000c2c] = 0002 [SMMU + 00082c] = [SMMU + 000c30] = 0002 [SMMU + 000830] = [SMMU + 000c34] = 0002 [SMMU + 000834] = [SMMU + 000c38] = 0002 [SMMU + 000838] = [SMMU + 000c3c] = 0002 [SMMU + 00083c] = [SMMU + 008000] = [SMMU + 008058] = c1fe [SMMU + 009000] = [SMMU + 009058] = c1fe [SMMU + 00a000] = [SMMU + 00a058] = c1fe [SMMU + 00b000] = [SMMU + 00b058] = c1fe [SMMU + 00c000] = [SMMU + 00c058] = c1fe [SMMU + 00d000] = [SMMU + 00d058] = c1fe [SMMU + 6c] = [SMMU + 68] = [SMMU + 70] = [SMMU + 00] = 00201e36 [SMMU + 000800] = 1fff [SMMU + 000800] = 1fff [SMMU + 001800] = 0001 [SMMU + 001000] = 0001f300 [SMMU + 008010] = 00038011 [SMMU + 008030] = 0080351c [SMMU + 008020] = 785d5000 [SMMU + 008028] = [SMMU + 008038] = 0004ff44 [SMMU + 00803c] = [SMMU + 008000] = 1067 [SMMU + 000c00] = atl1c :01:00.0: Adding to iommu group 0 [SMMU + 000c00] = [SMMU + 000800] = 80001480 --- drivers/iommu/arm-smmu.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 
930c07635956..9435e4a7759f 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -59,6 +59,15 @@ #include "arm-smmu-regs.h" +/* + * Apparently, some Qualcomm arm64 platforms which appear to expose their SMMU + * global register space are still, in fact, using a hypervisor to mediate it + * by trapping and emulating register accesses. Sadly, some deployed versions + * of said trapping code have bugs wherein they go horribly wrong for stores + * using r31 (i.e. XZR/WZR) as the source register. + */ +#define QCOM_DUMMY_VAL -1 + #define ARM_MMU500_ACTLR_CPRE (1
[PATCH] iommu: replace single-char identifiers in macros
There are a few macros in the IOMMU code with single-char identifiers that make the code hard to read and debug. Replace them with meaningful names.

Suggested-by: Andrew Morton
Signed-off-by: Qian Cai
---
 include/linux/dmar.h | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index f8af1d770520..eb634912f475 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -104,12 +104,14 @@ static inline bool dmar_rcu_check(void)

 #define	dmar_rcu_dereference(p)	rcu_dereference_check((p), dmar_rcu_check())

-#define	for_each_dev_scope(a, c, p, d)	\
-	for ((p) = 0; ((d) = (p) < (c) ? dmar_rcu_dereference((a)[(p)].dev) : \
-			NULL, (p) < (c)); (p)++)
-
-#define	for_each_active_dev_scope(a, c, p, d)	\
-	for_each_dev_scope((a), (c), (p), (d))	if (!(d)) { continue; } else
+#define for_each_dev_scope(devs, cnt, i, tmp)				\
+	for ((i) = 0; ((tmp) = (i) < (cnt) ?				\
+	    dmar_rcu_dereference((devs)[(i)].dev) : NULL, (i) < (cnt));	\
+	    (i)++)
+
+#define for_each_active_dev_scope(devs, cnt, i, tmp)			\
+	for_each_dev_scope((devs), (cnt), (i), (tmp))			\
+		if (!(tmp)) { continue; } else

 extern int dmar_table_init(void);
 extern int dmar_dev_scope_init(void);
--
1.8.3.1
Re: [PATCH] iommu: replace single-char identifiers in macros
On 03/06/2019 13:59, Qian Cai wrote:
> There are a few macros in the IOMMU code whose single-character identifiers
> make the code hard to read and debug. Replace them with meaningful names.
>
> Suggested-by: Andrew Morton
> Signed-off-by: Qian Cai
> ---
>  include/linux/dmar.h | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/dmar.h b/include/linux/dmar.h
> index f8af1d770520..eb634912f475 100644
> --- a/include/linux/dmar.h
> +++ b/include/linux/dmar.h
> @@ -104,12 +104,14 @@ static inline bool dmar_rcu_check(void)
>
>  #define dmar_rcu_dereference(p) rcu_dereference_check((p), dmar_rcu_check())
>
> -#define for_each_dev_scope(a, c, p, d) \
> -	for ((p) = 0; ((d) = (p) < (c) ? dmar_rcu_dereference((a)[(p)].dev) : \
> -			NULL, (p) < (c)); (p)++)
> -
> -#define for_each_active_dev_scope(a, c, p, d) \
> -	for_each_dev_scope((a), (c), (p), (d)) if (!(d)) { continue; } else
> +#define for_each_dev_scope(devs, cnt, i, tmp) \
> +	for ((i) = 0; ((tmp) = (i) < (cnt) ? \

Given that "tmp" actually appears to be some sort of device cursor, I'm
not sure that that naming really achieves the stated goal of clarity :/

Robin.

> +	    dmar_rcu_dereference((devs)[(i)].dev) : NULL, (i) < (cnt)); \
> +	    (i)++)
> +
> +#define for_each_active_dev_scope(devs, cnt, i, tmp) \
> +	for_each_dev_scope((devs), (cnt), (i), (tmp)) \
> +		if (!(tmp)) { continue; } else
>
>  extern int dmar_table_init(void);
>  extern int dmar_dev_scope_init(void);
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 3, 2019 at 4:14 AM Robin Murphy wrote: > > On 03/06/2019 11:47, Rob Clark wrote: > > On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa wrote: > >> > >> On Mon, Jun 3, 2019 at 4:40 AM Rob Clark wrote: > >>> > >>> So, another case I've come across, on the display side.. I'm working > >>> on handling the case where bootloader enables display (and takes iommu > >>> out of reset).. as soon as DMA domain gets attached we get iommu > >>> faults, because bootloader has already configured display for scanout. > >>> Unfortunately this all happens before actual driver is probed and has > >>> a chance to intervene. > >>> > >>> It's rather unfortunate that we tried to be clever rather than just > >>> making drivers call some function to opt-in to the hookup of dma iommu > >>> ops :-( > >> > >> I think it still works for the 90% of cases and if 10% needs some > >> explicit work in the drivers, that's better than requiring 100% of the > >> drivers to do things manually. > > Right, it's not about "being clever", it's about not adding opt-in code > to the hundreds and hundreds and hundreds of drivers which *might* ever > find themselves used on a system where they would need the IOMMU's help > for DMA. Well, I mean, at one point we didn't do the automatic iommu hookup, we could have just stuck with that and added a helper so drivers could request the hookup. Things wouldn't have been any more broken than before, and when people bring up systems where iommu is required, they could have added the necessary dma_iommu_configure() call. But that is water under the bridge now. > >> Adding Marek who had the same problem on Exynos. > > > > I do wonder how many drivers need to iommu_map in their ->probe()? > > I'm thinking moving the auto-hookup to after a successful probe(), > > with some function a driver could call if they need mapping in probe, > > might be a way to eventually get rid of the blacklist. 
> > But I've no
> > idea how to find the subset of drivers that would be broken without a
> > dma_setup_iommu_stuff() call in their probe.
>
> Wouldn't help much. That particular issue is nothing to do with DMA ops
> really, it's about IOMMU initialisation. On something like SMMUv3 there
> is literally no way to turn the thing on without blocking unknown
> traffic - it *has* to have stream table entries programmed before it can
> even allow stuff to bypass.

fwiw, on these sdm850 laptops (and I think sdm845 boards like mtp and
db845c) the SMMU (v2) is taken out of bypass by the bootloader. Bjorn
has some patches for arm-smmu to read back the state at boot.

(Although older devices were booting with display enabled but SMMU in
bypass.)

> The answer is either to pay attention to the "Quiesce all DMA
> capable devices" part of the boot protocol (which has been there since
> pretty much forever), or to come up with some robust way of
> communicating "live" boot-time mappings to IOMMU drivers so that they
> can program themselves appropriately during probe.

Unfortunately a display lit up by the bootloader is basically ubiquitous.
Every single android phone does it. All of the windows-arm laptops do
it. Basically 99.9% of things that have a display do it. It's a
tough problem to solve involving clks, gdsc, regulators, etc, in
addition to the display driver.. and sadly now smmu. And devices
where we can make changes to and update the firmware are rather rare.
So there is really no option to ignore this problem.

I guess if we had some early-quirks mechanism like x86 does, we could
mash the display off early in boot. That would be an easy solution.
Although I'd prefer a proper solution so that android phones aren't
carrying around enormous stacks of hack patches to achieve a smooth
flicker-free boot.

I suppose arm-smmu could realize that the context bank is already
taken out of bypass..
although I'm not entirely sure if we can assume that the CPU would be able to read back the pagetable setup by the bootloader. Maybe Vivek has an idea about that? BR, -R ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: replace single-char identifiers in macros
On Mon, 2019-06-03 at 14:07 +0100, Robin Murphy wrote:
> On 03/06/2019 13:59, Qian Cai wrote:
> > There are a few macros in the IOMMU code whose single-character
> > identifiers make the code hard to read and debug. Replace them with
> > meaningful names.
> >
> > Suggested-by: Andrew Morton
> > Signed-off-by: Qian Cai
> > ---
> >  include/linux/dmar.h | 14 ++++++++------
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/dmar.h b/include/linux/dmar.h
> > index f8af1d770520..eb634912f475 100644
> > --- a/include/linux/dmar.h
> > +++ b/include/linux/dmar.h
> > @@ -104,12 +104,14 @@ static inline bool dmar_rcu_check(void)
> >
> >  #define dmar_rcu_dereference(p) rcu_dereference_check((p), dmar_rcu_check())
> >
> > -#define for_each_dev_scope(a, c, p, d) \
> > -	for ((p) = 0; ((d) = (p) < (c) ? dmar_rcu_dereference((a)[(p)].dev) : \
> > -			NULL, (p) < (c)); (p)++)
> > -
> > -#define for_each_active_dev_scope(a, c, p, d) \
> > -	for_each_dev_scope((a), (c), (p), (d)) if (!(d)) { continue; } else
> > +#define for_each_dev_scope(devs, cnt, i, tmp) \
> > +	for ((i) = 0; ((tmp) = (i) < (cnt) ? \
>
> Given that "tmp" actually appears to be some sort of device cursor, I'm
> not sure that that naming really achieves the stated goal of clarity :/

"tmp" is used in the callers everywhere though, although I suppose
something like "tmp_dev" can be used if you prefer.
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 03, 2019 at 12:14:27PM +0100, Robin Murphy wrote: > On 03/06/2019 11:47, Rob Clark wrote: > > On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa wrote: > > > > > > On Mon, Jun 3, 2019 at 4:40 AM Rob Clark wrote: > > > > > > > > On Fri, May 10, 2019 at 7:35 AM Rob Clark wrote: > > > > > > > > > > On Tue, Dec 4, 2018 at 2:29 PM Rob Herring wrote: > > > > > > > > > > > > On Sat, Dec 1, 2018 at 10:54 AM Rob Clark > > > > > > wrote: > > > > > > > > > > > > > > This solves a problem we see with drm/msm, caused by getting > > > > > > > iommu_dma_ops while we attach our own domain and manage it > > > > > > > directly at > > > > > > > the iommu API level: > > > > > > > > > > > > > >[0038] user address but active_mm is swapper > > > > > > >Internal error: Oops: 9605 [#1] PREEMPT SMP > > > > > > >Modules linked in: > > > > > > >CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW > > > > > > > 4.19.3 #90 > > > > > > >Hardware name: xxx (DT) > > > > > > >Workqueue: events deferred_probe_work_func > > > > > > >pstate: 80c9 (Nzcv daif +PAN +UAO) > > > > > > >pc : iommu_dma_map_sg+0x7c/0x2c8 > > > > > > >lr : iommu_dma_map_sg+0x40/0x2c8 > > > > > > >sp : ff80095eb4f0 > > > > > > >x29: ff80095eb4f0 x28: > > > > > > >x27: ffc0f9431578 x26: > > > > > > >x25: x24: 0003 > > > > > > >x23: 0001 x22: ffc0fa9ac010 > > > > > > >x21: x20: ffc0fab40980 > > > > > > >x19: ffc0fab40980 x18: 0003 > > > > > > >x17: 01c4 x16: 0007 > > > > > > >x15: 000e x14: > > > > > > >x13: x12: 0028 > > > > > > >x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f > > > > > > >x9 : x8 : ffc0fab409a0 > > > > > > >x7 : x6 : 0002 > > > > > > >x5 : 0001 x4 : > > > > > > >x3 : 0001 x2 : 0002 > > > > > > >x1 : ffc0f9431578 x0 : > > > > > > >Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb) > > > > > > >Call trace: > > > > > > > iommu_dma_map_sg+0x7c/0x2c8 > > > > > > > __iommu_map_sg_attrs+0x70/0x84 > > > > > > > get_pages+0x170/0x1e8 > > > > > > > msm_gem_get_iova+0x8c/0x128 > > > > > > > _msm_gem_kernel_new+0x6c/0xc8 > > 
> > > > > msm_gem_kernel_new+0x4c/0x58 > > > > > > > dsi_tx_buf_alloc_6g+0x4c/0x8c > > > > > > > msm_dsi_host_modeset_init+0xc8/0x108 > > > > > > > msm_dsi_modeset_init+0x54/0x18c > > > > > > > _dpu_kms_drm_obj_init+0x430/0x474 > > > > > > > dpu_kms_hw_init+0x5f8/0x6b4 > > > > > > > msm_drm_bind+0x360/0x6c8 > > > > > > > try_to_bring_up_master.part.7+0x28/0x70 > > > > > > > component_master_add_with_match+0xe8/0x124 > > > > > > > msm_pdev_probe+0x294/0x2b4 > > > > > > > platform_drv_probe+0x58/0xa4 > > > > > > > really_probe+0x150/0x294 > > > > > > > driver_probe_device+0xac/0xe8 > > > > > > > __device_attach_driver+0xa4/0xb4 > > > > > > > bus_for_each_drv+0x98/0xc8 > > > > > > > __device_attach+0xac/0x12c > > > > > > > device_initial_probe+0x24/0x30 > > > > > > > bus_probe_device+0x38/0x98 > > > > > > > deferred_probe_work_func+0x78/0xa4 > > > > > > > process_one_work+0x24c/0x3dc > > > > > > > worker_thread+0x280/0x360 > > > > > > > kthread+0x134/0x13c > > > > > > > ret_from_fork+0x10/0x18 > > > > > > >Code: d284 91000725 6b17039f 5400048a (f9401f40) > > > > > > >---[ end trace f22dda57f3648e2c ]--- > > > > > > >Kernel panic - not syncing: Fatal exception > > > > > > >SMP: stopping secondary CPUs > > > > > > >Kernel Offset: disabled > > > > > > >CPU features: 0x0,22802a18 > > > > > > >Memory Limit: none > > > > > > > > > > > > > > The problem is that when drm/msm does it's own > > > > > > > iommu_attach_device(), > > > > > > > now the domain returned by iommu_get_domain_for_dev() is drm/msm's > > > > > > > domain, and it doesn't have domain->iova_cookie. > > > > > > > > > > > > > > We kind of avoided this problem prior to sdm845/dpu because the > > > > > > > iommu > > > > > > > was attached to the mdp node in dt, which is a child of the > > > > > > > toplevel > > > > > > > mdss node (which corresponds to the dev passed in dma_map_sg()). 
> > > > > > > But > > > > > > > with sdm845, now the iommu is attached at the mdss level so we > > > > > > > hit the > > > > > > > iommu_dma_ops in dma_map_sg(). > > > > > > > > > > > > > > But auto allocating/attaching a domain before the driver is > > > > > > > probed was > > > > > > > already a blocking problem for enabling per-context pagetables > > > > > > > for the > > > > > > > GPU. This problem is also now solved with this patch. > > > > > > > > > > > > > > Fi
Re: [PATCH] iommu: replace single-char identifiers in macros
On 03/06/2019 14:29, Qian Cai wrote:
> On Mon, 2019-06-03 at 14:07 +0100, Robin Murphy wrote:
> > On 03/06/2019 13:59, Qian Cai wrote:
> > > There are a few macros in the IOMMU code whose single-character
> > > identifiers make the code hard to read and debug. Replace them with
> > > meaningful names.
> > >
> > > Suggested-by: Andrew Morton
> > > Signed-off-by: Qian Cai
> > > ---
> > >  include/linux/dmar.h | 14 ++++++++------
> > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/include/linux/dmar.h b/include/linux/dmar.h
> > > index f8af1d770520..eb634912f475 100644
> > > --- a/include/linux/dmar.h
> > > +++ b/include/linux/dmar.h
> > > @@ -104,12 +104,14 @@ static inline bool dmar_rcu_check(void)
> > >
> > >  #define dmar_rcu_dereference(p) rcu_dereference_check((p), dmar_rcu_check())
> > >
> > > -#define for_each_dev_scope(a, c, p, d) \
> > > -	for ((p) = 0; ((d) = (p) < (c) ? dmar_rcu_dereference((a)[(p)].dev) : \
> > > -			NULL, (p) < (c)); (p)++)
> > > -
> > > -#define for_each_active_dev_scope(a, c, p, d) \
> > > -	for_each_dev_scope((a), (c), (p), (d)) if (!(d)) { continue; } else
> > > +#define for_each_dev_scope(devs, cnt, i, tmp) \
> > > +	for ((i) = 0; ((tmp) = (i) < (cnt) ? \
> >
> > Given that "tmp" actually appears to be some sort of device cursor, I'm
> > not sure that that naming really achieves the stated goal of clarity :/
>
> "tmp" is used in the callers everywhere though, although I suppose
> something like "tmp_dev" can be used if you prefer.

I don't have any preference, I'm just questioning the assertion in the
commit message - as a reader not intimately familiar with this code,
"tmp" is honestly no more meaningful than "d" was.

Robin.
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 03, 2019 at 06:20:57AM -0700, Rob Clark wrote: > On Mon, Jun 3, 2019 at 4:14 AM Robin Murphy wrote: > > > > On 03/06/2019 11:47, Rob Clark wrote: > > > On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa wrote: > > >> > > >> On Mon, Jun 3, 2019 at 4:40 AM Rob Clark wrote: > > >>> > > >>> So, another case I've come across, on the display side.. I'm working > > >>> on handling the case where bootloader enables display (and takes iommu > > >>> out of reset).. as soon as DMA domain gets attached we get iommu > > >>> faults, because bootloader has already configured display for scanout. > > >>> Unfortunately this all happens before actual driver is probed and has > > >>> a chance to intervene. > > >>> > > >>> It's rather unfortunate that we tried to be clever rather than just > > >>> making drivers call some function to opt-in to the hookup of dma iommu > > >>> ops :-( > > >> > > >> I think it still works for the 90% of cases and if 10% needs some > > >> explicit work in the drivers, that's better than requiring 100% of the > > >> drivers to do things manually. > > > > Right, it's not about "being clever", it's about not adding opt-in code > > to the hundreds and hundreds and hundreds of drivers which *might* ever > > find themselves used on a system where they would need the IOMMU's help > > for DMA. > > Well, I mean, at one point we didn't do the automatic iommu hookup, we > could have just stuck with that and added a helper so drivers could > request the hookup. Things wouldn't have been any more broken than > before, and when people bring up systems where iommu is required, they > could have added the necessary dma_iommu_configure() call. But that > is water under the bridge now. > > > >> Adding Marek who had the same problem on Exynos. > > > > > > I do wonder how many drivers need to iommu_map in their ->probe()? 
> > > I'm thinking moving the auto-hookup to after a successful probe(), > > > with some function a driver could call if they need mapping in probe, > > > might be a way to eventually get rid of the blacklist. But I've no > > > idea how to find the subset of drivers that would be broken without a > > > dma_setup_iommu_stuff() call in their probe. > > > > Wouldn't help much. That particular issue is nothing to do with DMA ops > > really, it's about IOMMU initialisation. On something like SMMUv3 there > > is literally no way to turn the thing on without blocking unknown > > traffic - it *has* to have stream table entries programmed before it can > > even allow stuff to bypass. > > fwiw, on these sdm850 laptops (and I think sdm845 boards like mtp and > db845c) the SMMU (v2) is taken out of bypass by the bootloader. Bjorn > has some patches for arm-smmu to read-back the state at boot. > > (Although older devices were booting with display enabled but SMMU in bypass.) > > > The answer is either to either pay attention to the "Quiesce all DMA > > capable devices" part of the boot protocol (which has been there since > > pretty much forever), or to come up with some robust way of > > communicating "live" boot-time mappings to IOMMU drivers so that they > > can program themselves appropriately during probe. > > Unfortunately display lit up by bootloader is basically ubiquitous. > Every single android phone does it. All of the windows-arm laptops do > it. Basically 99.9% of things that have a display do it. It's a > tough problem to solve involving clks, gdsc, regulators, etc, in > addition to the display driver.. and sadly now smmu. And devices > where we can make changes to and update the firmware are rather rare. > So there is really no option to ignore this problem. I think this is going to require at least some degree of cooperation from the bootloader. See my other thread on that. 
Unfortunately I think this is an area where everyone has kind of been doing their own thing even if standard bindings for this have been around for quite a while (actually 5 years by now). I suspect that most bootloaders that run today are not that old, but as always downstream doesn't follow closely where upstream is guiding. > I guess if we had some early-quirks mechanism like x86 does, we could > mash the display off early in boot. That would be an easy solution. > Although I'd prefer a proper solution so that android phones aren't > carrying around enormous stacks of hack patches to achieve a smooth > flicker-free boot. The proper solution, I think, is for bootloader and kernel to work together. Unfortunately that means we'll just have to bite the bullet and get things fixed across the stack. I think this is just the latest manifestation of the catch-up that upstream has been playing. Only now that we're starting to enable all of these features upstream are we running into interoperability issues. If upstream had been further along we would've caught these issues way ahead of time and could've influenced the designs of bootloader much earlier. Now, unless we get all the vendors to go back and mo
The dma-mapping for-next tree has been rebased
I want to pull in the cache maintenance fix for the dma-iommu conversion. I hope it didn't cause any inconvenience.
[PATCH -next] iommu/intel: silence a variable set but not used
The commit "iommu/vt-d: Probe DMA-capable ACPI name space devices" introduced a
compilation warning, because the "iommu" variable in for_each_active_iommu() is
set on each iteration but never read (only "drhd->iommu" is used):

drivers/iommu/intel-iommu.c: In function 'probe_acpi_namespace_devices':
drivers/iommu/intel-iommu.c:4639:22: warning: variable 'iommu' set but not
used [-Wunused-but-set-variable]
  struct intel_iommu *iommu;

Silence the warning the same way as in commit d3ed71e5cc50
("drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used").

Signed-off-by: Qian Cai
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index b431cc6f6ba4..2897354a341a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4636,7 +4636,8 @@ static int __init platform_optin_force_iommu(void)
 static int __init probe_acpi_namespace_devices(void)
 {
 	struct dmar_drhd_unit *drhd;
-	struct intel_iommu *iommu;
+	/* To avoid a -Wunused-but-set-variable warning. */
+	struct intel_iommu *iommu __maybe_unused;
 	struct device *dev;
 	int i, ret = 0;
--
1.8.3.1
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 3, 2019 at 6:54 AM Thierry Reding wrote: > > On Mon, Jun 03, 2019 at 06:20:57AM -0700, Rob Clark wrote: > > On Mon, Jun 3, 2019 at 4:14 AM Robin Murphy wrote: > > > > > > On 03/06/2019 11:47, Rob Clark wrote: > > > > On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa wrote: > > > >> > > > >> On Mon, Jun 3, 2019 at 4:40 AM Rob Clark wrote: > > > >>> > > > >>> So, another case I've come across, on the display side.. I'm working > > > >>> on handling the case where bootloader enables display (and takes iommu > > > >>> out of reset).. as soon as DMA domain gets attached we get iommu > > > >>> faults, because bootloader has already configured display for scanout. > > > >>> Unfortunately this all happens before actual driver is probed and has > > > >>> a chance to intervene. > > > >>> > > > >>> It's rather unfortunate that we tried to be clever rather than just > > > >>> making drivers call some function to opt-in to the hookup of dma iommu > > > >>> ops :-( > > > >> > > > >> I think it still works for the 90% of cases and if 10% needs some > > > >> explicit work in the drivers, that's better than requiring 100% of the > > > >> drivers to do things manually. > > > > > > Right, it's not about "being clever", it's about not adding opt-in code > > > to the hundreds and hundreds and hundreds of drivers which *might* ever > > > find themselves used on a system where they would need the IOMMU's help > > > for DMA. > > > > Well, I mean, at one point we didn't do the automatic iommu hookup, we > > could have just stuck with that and added a helper so drivers could > > request the hookup. Things wouldn't have been any more broken than > > before, and when people bring up systems where iommu is required, they > > could have added the necessary dma_iommu_configure() call. But that > > is water under the bridge now. > > > > > >> Adding Marek who had the same problem on Exynos. > > > > > > > > I do wonder how many drivers need to iommu_map in their ->probe()? 
> > > > I'm thinking moving the auto-hookup to after a successful probe(), > > > > with some function a driver could call if they need mapping in probe, > > > > might be a way to eventually get rid of the blacklist. But I've no > > > > idea how to find the subset of drivers that would be broken without a > > > > dma_setup_iommu_stuff() call in their probe. > > > > > > Wouldn't help much. That particular issue is nothing to do with DMA ops > > > really, it's about IOMMU initialisation. On something like SMMUv3 there > > > is literally no way to turn the thing on without blocking unknown > > > traffic - it *has* to have stream table entries programmed before it can > > > even allow stuff to bypass. > > > > fwiw, on these sdm850 laptops (and I think sdm845 boards like mtp and > > db845c) the SMMU (v2) is taken out of bypass by the bootloader. Bjorn > > has some patches for arm-smmu to read-back the state at boot. > > > > (Although older devices were booting with display enabled but SMMU in > > bypass.) > > > > > The answer is either to either pay attention to the "Quiesce all DMA > > > capable devices" part of the boot protocol (which has been there since > > > pretty much forever), or to come up with some robust way of > > > communicating "live" boot-time mappings to IOMMU drivers so that they > > > can program themselves appropriately during probe. > > > > Unfortunately display lit up by bootloader is basically ubiquitous. > > Every single android phone does it. All of the windows-arm laptops do > > it. Basically 99.9% of things that have a display do it. It's a > > tough problem to solve involving clks, gdsc, regulators, etc, in > > addition to the display driver.. and sadly now smmu. And devices > > where we can make changes to and update the firmware are rather rare. > > So there is really no option to ignore this problem. > > I think this is going to require at least some degree of cooperation > from the bootloader. See my other thread on that. 
Unfortunately I think > this is an area where everyone has kind of been doing their own thing > even if standard bindings for this have been around for quite a while > (actually 5 years by now). I suspect that most bootloaders that run > today are not that old, but as always downstream doesn't follow closely > where upstream is guiding. > > > I guess if we had some early-quirks mechanism like x86 does, we could > > mash the display off early in boot. That would be an easy solution. > > Although I'd prefer a proper solution so that android phones aren't > > carrying around enormous stacks of hack patches to achieve a smooth > > flicker-free boot. > > The proper solution, I think, is for bootloader and kernel to work > together. Unfortunately that means we'll just have to bite the bullet > and get things fixed across the stack. I think this is just the latest > manifestation of the catch-up that upstream has been playing. Only now > that we're starting to enable all of these features upstream are we > running
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Jun 03, 2019 at 07:20:14AM -0700, Rob Clark wrote: > On Mon, Jun 3, 2019 at 6:54 AM Thierry Reding > wrote: > > > > On Mon, Jun 03, 2019 at 06:20:57AM -0700, Rob Clark wrote: > > > On Mon, Jun 3, 2019 at 4:14 AM Robin Murphy wrote: > > > > > > > > On 03/06/2019 11:47, Rob Clark wrote: > > > > > On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa > > > > > wrote: > > > > >> > > > > >> On Mon, Jun 3, 2019 at 4:40 AM Rob Clark wrote: > > > > >>> > > > > >>> So, another case I've come across, on the display side.. I'm working > > > > >>> on handling the case where bootloader enables display (and takes > > > > >>> iommu > > > > >>> out of reset).. as soon as DMA domain gets attached we get iommu > > > > >>> faults, because bootloader has already configured display for > > > > >>> scanout. > > > > >>> Unfortunately this all happens before actual driver is probed and > > > > >>> has > > > > >>> a chance to intervene. > > > > >>> > > > > >>> It's rather unfortunate that we tried to be clever rather than just > > > > >>> making drivers call some function to opt-in to the hookup of dma > > > > >>> iommu > > > > >>> ops :-( > > > > >> > > > > >> I think it still works for the 90% of cases and if 10% needs some > > > > >> explicit work in the drivers, that's better than requiring 100% of > > > > >> the > > > > >> drivers to do things manually. > > > > > > > > Right, it's not about "being clever", it's about not adding opt-in code > > > > to the hundreds and hundreds and hundreds of drivers which *might* ever > > > > find themselves used on a system where they would need the IOMMU's help > > > > for DMA. > > > > > > Well, I mean, at one point we didn't do the automatic iommu hookup, we > > > could have just stuck with that and added a helper so drivers could > > > request the hookup. Things wouldn't have been any more broken than > > > before, and when people bring up systems where iommu is required, they > > > could have added the necessary dma_iommu_configure() call. 
But that > > > is water under the bridge now. > > > > > > > >> Adding Marek who had the same problem on Exynos. > > > > > > > > > > I do wonder how many drivers need to iommu_map in their ->probe()? > > > > > I'm thinking moving the auto-hookup to after a successful probe(), > > > > > with some function a driver could call if they need mapping in probe, > > > > > might be a way to eventually get rid of the blacklist. But I've no > > > > > idea how to find the subset of drivers that would be broken without a > > > > > dma_setup_iommu_stuff() call in their probe. > > > > > > > > Wouldn't help much. That particular issue is nothing to do with DMA ops > > > > really, it's about IOMMU initialisation. On something like SMMUv3 there > > > > is literally no way to turn the thing on without blocking unknown > > > > traffic - it *has* to have stream table entries programmed before it can > > > > even allow stuff to bypass. > > > > > > fwiw, on these sdm850 laptops (and I think sdm845 boards like mtp and > > > db845c) the SMMU (v2) is taken out of bypass by the bootloader. Bjorn > > > has some patches for arm-smmu to read-back the state at boot. > > > > > > (Although older devices were booting with display enabled but SMMU in > > > bypass.) > > > > > > > The answer is either to either pay attention to the "Quiesce all DMA > > > > capable devices" part of the boot protocol (which has been there since > > > > pretty much forever), or to come up with some robust way of > > > > communicating "live" boot-time mappings to IOMMU drivers so that they > > > > can program themselves appropriately during probe. > > > > > > Unfortunately display lit up by bootloader is basically ubiquitous. > > > Every single android phone does it. All of the windows-arm laptops do > > > it. Basically 99.9% of things that have a display do it. It's a > > > tough problem to solve involving clks, gdsc, regulators, etc, in > > > addition to the display driver.. and sadly now smmu. 
And devices > > > where we can make changes to and update the firmware are rather rare. > > > So there is really no option to ignore this problem. > > > > I think this is going to require at least some degree of cooperation > > from the bootloader. See my other thread on that. Unfortunately I think > > this is an area where everyone has kind of been doing their own thing > > even if standard bindings for this have been around for quite a while > > (actually 5 years by now). I suspect that most bootloaders that run > > today are not that old, but as always downstream doesn't follow closely > > where upstream is guiding. > > > > > I guess if we had some early-quirks mechanism like x86 does, we could > > > mash the display off early in boot. That would be an easy solution. > > > Although I'd prefer a proper solution so that android phones aren't > > > carrying around enormous stacks of hack patches to achieve a smooth > > > flicker-free boot. > > > > The proper solution, I think, is for bootloader and
Re: [PATCH] of/device: add blacklist for iommu dma_ops
> It shouldn't be a problem to hook something else up to the IOMMU
> subsystem. Hopefully it's something that people are going to standardize
> on.
>
> > 3) The automatic attach of DMA domain is also causing a different
> >    problem for us on the GPU side, preventing us from supporting per-
> >    context pagetables (since we end up with a disagreement about
> >    which context bank is used between arm-smmu and the firmware).
>
> I'm not sure I understand this issue. Is the context bank hard-coded in
> the firmware somehow? Or is it possible to rewrite which one is going to
> be used at runtime? Do you switch out the actual page tables rather than
> the IOMMU domains for context switching?

We have a rather long history on this, but the tl;dr is that the GPU
microcode switches the pagetables by rewriting TTBR0 on the fly (since
this is arm-smmu-v2 we have no better option) and yes, unfortunately it
is hard-coded to use context bank 0.

[1] is the current patchset to support all this, including my own take
on avoiding the dma-domain (all the cool kids have one).

Jordan

[1] https://patchwork.freedesktop.org/series/57441/

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
[PATCH v2 0/4] iommu: Add device fault reporting API
Allow device drivers and VFIO to get notified of IOMMU translation faults, and
to handle recoverable faults (PCI PRI). Several series require this API (Intel
VT-d and Arm SMMUv3 nested support, as well as the generic host SVA
implementation).

Changes since v1 [1]:
* Allocate iommu_param earlier, in iommu_probe_device().
* Pass struct iommu_fault to fault handlers, instead of the
  iommu_fault_event wrapper.
* Removed unused iommu_fault_event::iommu_private.
* Removed unnecessary iommu_page_response::addr.
* Added iommu_page_response::version, which would allow introducing a new,
  incompatible iommu_page_response structure (as opposed to just adding a
  flag + field).

[1] [PATCH 0/4] iommu: Add device fault reporting API
    https://lore.kernel.org/lkml/20190523180613.55049-1-jean-philippe.bruc...@arm.com/

Jacob Pan (3):
  driver core: Add per device iommu param
  iommu: Introduce device fault data
  iommu: Introduce device fault report API

Jean-Philippe Brucker (1):
  iommu: Add recoverable fault reporting

 drivers/iommu/iommu.c      | 236 ++++++++++++++++++++++++++++++++++-
 include/linux/device.h     |   3 +
 include/linux/iommu.h      |  87 ++++++++++++
 include/uapi/linux/iommu.h | 153 ++++++++++++++++++++++
 4 files changed, 476 insertions(+), 3 deletions(-)
 create mode 100644 include/uapi/linux/iommu.h

--
2.21.0
[PATCH v2 2/4] iommu: Introduce device fault data
From: Jacob Pan Device faults detected by IOMMU can be reported outside the IOMMU subsystem for further processing. This patch introduces a generic device fault data structure. The fault can be either an unrecoverable fault or a page request, also referred to as a recoverable fault. We only care about non internal faults that are likely to be reported to an external subsystem. Signed-off-by: Jacob Pan Signed-off-by: Jean-Philippe Brucker Signed-off-by: Liu, Yi L Signed-off-by: Ashok Raj Signed-off-by: Eric Auger --- include/linux/iommu.h | 39 include/uapi/linux/iommu.h | 118 + 2 files changed, 157 insertions(+) create mode 100644 include/uapi/linux/iommu.h diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a815cf6f6f47..2b05056d5fa7 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -25,6 +25,7 @@ #include #include #include +#include #define IOMMU_READ (1 << 0) #define IOMMU_WRITE(1 << 1) @@ -49,6 +50,7 @@ struct device; struct iommu_domain; struct notifier_block; struct iommu_sva; +struct iommu_fault_event; /* iommu fault flags */ #define IOMMU_FAULT_READ 0x0 @@ -58,6 +60,7 @@ typedef int (*iommu_fault_handler_t)(struct iommu_domain *, struct device *, unsigned long, int, void *); typedef int (*iommu_mm_exit_handler_t)(struct device *dev, struct iommu_sva *, void *); +typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault *, void *); struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped*/ @@ -301,6 +304,41 @@ struct iommu_device { struct device *dev; }; +/** + * struct iommu_fault_event - Generic fault event + * + * Can represent recoverable faults such as a page requests or + * unrecoverable faults such as DMA or IRQ remapping faults. 
+ * + * @fault: fault descriptor + */ +struct iommu_fault_event { + struct iommu_fault fault; +}; + +/** + * struct iommu_fault_param - per-device IOMMU fault data + * @handler: Callback function to handle IOMMU faults at device level + * @data: handler private data + */ +struct iommu_fault_param { + iommu_dev_fault_handler_t handler; + void *data; +}; + +/** + * struct iommu_param - collection of per-device IOMMU data + * + * @fault_param: IOMMU detected device fault reporting data + * + * TODO: migrate other per device data pointers under iommu_dev_data, e.g. + * struct iommu_group *iommu_group; + * struct iommu_fwspec *iommu_fwspec; + */ +struct iommu_param { + struct iommu_fault_param *fault_param; +}; + int iommu_device_register(struct iommu_device *iommu); void iommu_device_unregister(struct iommu_device *iommu); int iommu_device_sysfs_add(struct iommu_device *iommu, @@ -504,6 +542,7 @@ struct iommu_ops {}; struct iommu_group {}; struct iommu_fwspec {}; struct iommu_device {}; +struct iommu_fault_param {}; static inline bool iommu_present(struct bus_type *bus) { diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h new file mode 100644 index ..796402174d6c --- /dev/null +++ b/include/uapi/linux/iommu.h @@ -0,0 +1,118 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * IOMMU user API definitions + */ + +#ifndef _UAPI_IOMMU_H +#define _UAPI_IOMMU_H + +#include + +#define IOMMU_FAULT_PERM_READ (1 << 0) /* read */ +#define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */ +#define IOMMU_FAULT_PERM_EXEC (1 << 2) /* exec */ +#define IOMMU_FAULT_PERM_PRIV (1 << 3) /* privileged */ + +/* Generic fault types, can be expanded IRQ remapping fault */ +enum iommu_fault_type { + IOMMU_FAULT_DMA_UNRECOV = 1,/* unrecoverable fault */ + IOMMU_FAULT_PAGE_REQ, /* page request fault */ +}; + +enum iommu_fault_reason { + IOMMU_FAULT_REASON_UNKNOWN = 0, + + /* Could not access the PASID table (fetch caused external abort) */ + 
IOMMU_FAULT_REASON_PASID_FETCH, + + /* PASID entry is invalid or has configuration errors */ + IOMMU_FAULT_REASON_BAD_PASID_ENTRY, + + /* +* PASID is out of range (e.g. exceeds the maximum PASID +* supported by the IOMMU) or disabled. +*/ + IOMMU_FAULT_REASON_PASID_INVALID, + + /* +* An external abort occurred fetching (or updating) a translation +* table descriptor +*/ + IOMMU_FAULT_REASON_WALK_EABT, + + /* +* Could not access the page table entry (Bad address), +* actual translation fault +*/ + IOMMU_FAULT_REASON_PTE_FETCH, + + /* Protection flag check failed */ + IOMMU_FAULT_REASON_PERMISSION, + + /* access flag check failed */ + IOMMU_FAULT_REASON_ACCESS, + + /* Output address of a translation stage caused Address Size fault */ + IOMMU_FAULT_REASON_OOR_ADDRESS, +}; + +/** + * struct iommu_fault_unrecoverabl
[PATCH v2 3/4] iommu: Introduce device fault report API
From: Jacob Pan Traditionally, device specific faults are detected and handled within their own device drivers. When IOMMU is enabled, faults such as DMA related transactions are detected by IOMMU. There is no generic reporting mechanism to report faults back to the in-kernel device driver or the guest OS in case of assigned devices. This patch introduces a registration API for device specific fault handlers. This differs from the existing iommu_set_fault_handler/ report_iommu_fault infrastructures in several ways: - it allows to report more sophisticated fault events (both unrecoverable faults and page request faults) due to the nature of the iommu_fault struct - it is device specific and not domain specific. The current iommu_report_device_fault() implementation only handles the "shoot and forget" unrecoverable fault case. Handling of page request faults or stalled faults will come later. Signed-off-by: Jacob Pan Signed-off-by: Ashok Raj Signed-off-by: Jean-Philippe Brucker Signed-off-by: Eric Auger --- drivers/iommu/iommu.c | 146 +- include/linux/iommu.h | 29 + 2 files changed, 172 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 67ee6623f9b2..8037a3f07f07 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -107,15 +107,43 @@ void iommu_device_unregister(struct iommu_device *iommu) spin_unlock(&iommu_device_lock); } +static struct iommu_param *iommu_get_dev_param(struct device *dev) +{ + struct iommu_param *param = dev->iommu_param; + + if (param) + return param; + + param = kzalloc(sizeof(*param), GFP_KERNEL); + if (!param) + return NULL; + + mutex_init(¶m->lock); + dev->iommu_param = param; + return param; +} + +static void iommu_free_dev_param(struct device *dev) +{ + kfree(dev->iommu_param); + dev->iommu_param = NULL; +} + int iommu_probe_device(struct device *dev) { const struct iommu_ops *ops = dev->bus->iommu_ops; - int ret = -EINVAL; + int ret; WARN_ON(dev->iommu_group); + if (!ops) + return 
-EINVAL; - if (ops) - ret = ops->add_device(dev); + if (!iommu_get_dev_param(dev)) + return -ENOMEM; + + ret = ops->add_device(dev); + if (ret) + iommu_free_dev_param(dev); return ret; } @@ -126,6 +154,8 @@ void iommu_release_device(struct device *dev) if (dev->iommu_group) ops->remove_device(dev); + + iommu_free_dev_param(dev); } static struct iommu_domain *__iommu_domain_alloc(struct bus_type *bus, @@ -854,6 +884,116 @@ int iommu_group_unregister_notifier(struct iommu_group *group, } EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier); +/** + * iommu_register_device_fault_handler() - Register a device fault handler + * @dev: the device + * @handler: the fault handler + * @data: private data passed as argument to the handler + * + * When an IOMMU fault event is received, this handler gets called with the + * fault event and data as argument. The handler should return 0 on success. + * + * Return 0 if the fault handler was installed successfully, or an error. + */ +int iommu_register_device_fault_handler(struct device *dev, + iommu_dev_fault_handler_t handler, + void *data) +{ + struct iommu_param *param = dev->iommu_param; + int ret = 0; + + if (!param) + return -EINVAL; + + mutex_lock(¶m->lock); + /* Only allow one fault handler registered for each device */ + if (param->fault_param) { + ret = -EBUSY; + goto done_unlock; + } + + get_device(dev); + param->fault_param = kzalloc(sizeof(*param->fault_param), GFP_KERNEL); + if (!param->fault_param) { + put_device(dev); + ret = -ENOMEM; + goto done_unlock; + } + param->fault_param->handler = handler; + param->fault_param->data = data; + +done_unlock: + mutex_unlock(¶m->lock); + + return ret; +} +EXPORT_SYMBOL_GPL(iommu_register_device_fault_handler); + +/** + * iommu_unregister_device_fault_handler() - Unregister the device fault handler + * @dev: the device + * + * Remove the device fault handler installed with + * iommu_register_device_fault_handler(). + * + * Return 0 on success, or an error. 
+ */ +int iommu_unregister_device_fault_handler(struct device *dev) +{ + struct iommu_param *param = dev->iommu_param; + int ret = 0; + + if (!param) + return -EINVAL; + + mutex_lock(¶m->lock); + + if (!param->fault_param) + goto unlock; + + kfree(param->fault_param); + param->fault_param = NULL; + put_device(dev); +unlock: + mutex_unlock(¶m->lock); + + re
[PATCH v2 4/4] iommu: Add recoverable fault reporting
Some IOMMU hardware features, for example PCI PRI and Arm SMMU Stall, enable recoverable I/O page faults. Allow IOMMU drivers to report PRI Page Requests and Stall events through the new fault reporting API. The consumer of the fault can be either an I/O page fault handler in the host, or a guest OS. Once handled, the fault must be completed by sending a page response back to the IOMMU. Add an iommu_page_response() function to complete a page fault. There are two ways to extend the userspace API: * Add a field to iommu_page_response and a flag to iommu_page_response::flags describing the validity of this field. * Introduce a new iommu_page_response_X structure with a different version number. The kernel must then support both versions. Signed-off-by: Jacob Pan Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/iommu.c | 94 +- include/linux/iommu.h | 19 include/uapi/linux/iommu.h | 35 ++ 3 files changed, 146 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 8037a3f07f07..956a80364efd 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -891,7 +891,14 @@ EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier); * @data: private data passed as argument to the handler * * When an IOMMU fault event is received, this handler gets called with the - * fault event and data as argument. The handler should return 0 on success. + * fault event and data as argument. The handler should return 0 on success. If + * the fault is recoverable (IOMMU_FAULT_PAGE_REQ), the consumer should also + * complete the fault by calling iommu_page_response() with one of the following + * response code: + * - IOMMU_PAGE_RESP_SUCCESS: retry the translation + * - IOMMU_PAGE_RESP_INVALID: terminate the fault + * - IOMMU_PAGE_RESP_FAILURE: terminate the fault and stop reporting + * page faults if possible. * * Return 0 if the fault handler was installed successfully, or an error. 
*/ @@ -921,6 +928,8 @@ int iommu_register_device_fault_handler(struct device *dev, } param->fault_param->handler = handler; param->fault_param->data = data; + mutex_init(¶m->fault_param->lock); + INIT_LIST_HEAD(¶m->fault_param->faults); done_unlock: mutex_unlock(¶m->lock); @@ -951,6 +960,12 @@ int iommu_unregister_device_fault_handler(struct device *dev) if (!param->fault_param) goto unlock; + /* we cannot unregister handler if there are pending faults */ + if (!list_empty(¶m->fault_param->faults)) { + ret = -EBUSY; + goto unlock; + } + kfree(param->fault_param); param->fault_param = NULL; put_device(dev); @@ -967,13 +982,15 @@ EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler); * @evt: fault event data * * Called by IOMMU drivers when a fault is detected, typically in a threaded IRQ - * handler. + * handler. When this function fails and the fault is recoverable, it is the + * caller's responsibility to complete the fault. * * Return 0 on success, or an error. */ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) { struct iommu_param *param = dev->iommu_param; + struct iommu_fault_event *evt_pending = NULL; struct iommu_fault_param *fparam; int ret = 0; @@ -987,13 +1004,86 @@ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt) ret = -EINVAL; goto done_unlock; } + + if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && + (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { + evt_pending = kmemdup(evt, sizeof(struct iommu_fault_event), + GFP_KERNEL); + if (!evt_pending) { + ret = -ENOMEM; + goto done_unlock; + } + mutex_lock(&fparam->lock); + list_add_tail(&evt_pending->list, &fparam->faults); + mutex_unlock(&fparam->lock); + } + ret = fparam->handler(&evt->fault, fparam->data); + if (ret && evt_pending) { + mutex_lock(&fparam->lock); + list_del(&evt_pending->list); + mutex_unlock(&fparam->lock); + kfree(evt_pending); + } done_unlock: mutex_unlock(¶m->lock); return ret; } 
EXPORT_SYMBOL_GPL(iommu_report_device_fault); +int iommu_page_response(struct device *dev, + struct iommu_page_response *msg) +{ + bool pasid_valid; + int ret = -EINVAL; + struct iommu_fault_event *evt; + struct iommu_fault_page_request *prm; + struct iommu_param *param = dev->iommu_param; + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); + + if (!domain || !domain->ops->page_response) + re
[PATCH v2 1/4] driver core: Add per device iommu param
From: Jacob Pan DMA faults can be detected by IOMMU at device level. Adding a pointer to struct device allows IOMMU subsystem to report relevant faults back to the device driver for further handling. For direct assigned device (or user space drivers), guest OS holds responsibility to handle and respond per device IOMMU fault. Therefore we need fault reporting mechanism to propagate faults beyond IOMMU subsystem. There are two other IOMMU data pointers under struct device today, here we introduce iommu_param as a parent pointer such that all device IOMMU data can be consolidated here. The idea was suggested here by Greg KH and Joerg. The name iommu_param is chosen here since iommu_data has been used. Suggested-by: Greg Kroah-Hartman Reviewed-by: Greg Kroah-Hartman Signed-off-by: Jacob Pan Link: https://lkml.org/lkml/2017/10/6/81 --- include/linux/device.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/linux/device.h b/include/linux/device.h index e85264fb6616..f0a975abd6e9 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -42,6 +42,7 @@ struct iommu_ops; struct iommu_group; struct iommu_fwspec; struct dev_pin_info; +struct iommu_param; struct bus_attribute { struct attributeattr; @@ -959,6 +960,7 @@ struct dev_links_info { * device (i.e. the bus driver that discovered the device). * @iommu_group: IOMMU group the device belongs to. * @iommu_fwspec: IOMMU-specific properties supplied by firmware. + * @iommu_param: Per device generic IOMMU runtime data * * @offline_disabled: If set, the device is permanently online. * @offline: Set after successful invocation of bus type's .offline(). @@ -1052,6 +1054,7 @@ struct device { void(*release)(struct device *dev); struct iommu_group *iommu_group; struct iommu_fwspec *iommu_fwspec; + struct iommu_param *iommu_param; booloffline_disabled:1; booloffline:1; -- 2.21.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 0/6] Prerequisites for NXP LS104xA SMMU enablement
On 31.05.2019 19:32, Laurentiu Tudor wrote:
>> -----Original Message-----
>> From: Andreas Färber
>> Sent: Friday, May 31, 2019 8:04 PM
>>
>> Hello Laurentiu,
>>
>> On 31.05.2019 18:46, Laurentiu Tudor wrote:
-----Original Message-----
From: Andreas Färber
Sent: Friday, May 31, 2019 7:15 PM

Hi Laurentiu,

On 30.05.2019 16:19, laurentiu.tu...@nxp.com wrote:
> This patch series contains several fixes in preparation for SMMU
> support on NXP LS1043A and LS1046A chips. Once these get picked up,
> I'll submit the actual SMMU enablement patches consisting in the
> required device tree changes.

Have you thought through what will happen if this patch ordering is not
preserved? In particular, a user installing a future U-Boot update with
the DTB bits but booting a stable kernel without this patch series -
wouldn't that regress dpaa then for our customers?
>>>
>>> These are fixes for issues that popped out after enabling SMMU.
>>> I do not expect them to break anything.
>>
>> That was not my question! You're missing my point: All your patches are
>> lacking a Fixes header in their commit message, for backporting them, to
>> avoid _your DT patches_ breaking the driver on stable branches!
>
> It does appear that I'm missing your point. For sure, the DT updates
> solely will break the kernel without these fixes but I'm not sure I
> understand how this could happen.

In short, customers rarely run the master branch. Kindly have your
colleagues explain stable branches to you in detail.

With the Fixes header I was referring to the syntax explained here:
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes

> My plan was to share the kernel dts patches sometime after this series
> makes it through.

That's fine.
What I'm warning you is that seemingly your DT patches, once in one of
your LSDK U-Boot releases, will cause a regression for distros like our
SLES 15 SP1 unless these prereq kernel patches get applied on the
respective stable branches. Which will not happen automatically unless
you as patch author take the appropriate action before they get merged.

Thanks,
Andreas

--
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
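For readers unfamiliar with the tag Andreas is asking for: a Fixes header in the commit message names the commit that introduced the problem, which lets the stable maintainers automatically pick the fix up for every branch that carries the broken commit. A sketch of the format follows; the SHA, subject, and names are placeholders, not the actual dpaa commits:

```text
dpaa_eth: fix DMA mapping of scatter-gather frames

Map the SG table through the actual device so that the mapping is
visible once the SMMU is enabled.

Fixes: 123456789abc ("dpaa_eth: add support for SG frames")
Cc: stable@vger.kernel.org
Signed-off-by: Jane Developer <jane@example.com>
```

The `Fixes:` line is what the stable tooling matches on, so it must reference the exact upstream commit ID and subject.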
Re: [PATCH v6 0/6] Allwinner H6 Mali GPU support
Hi Maxime, Joerg, On Wed, 22 May 2019 at 21:27, Rob Herring wrote: > > On Tue, May 21, 2019 at 11:11 AM Clément Péron wrote: > > > > Hi, > > > > The Allwinner H6 has a Mali-T720 MP2 which should be supported by > > the new panfrost driver. This series fix two issues and introduce the > > dt-bindings but a simple benchmark show that it's still NOT WORKING. > > > > I'm pushing it in case someone want to continue the work. > > > > This has been tested with Mesa3D 19.1.0-RC2 and a GPU bitness patch[1]. > > > > One patch is from Icenowy Zheng where I changed the order as required > > by Rob Herring[2]. > > > > Thanks, > > Clement > > > > [1] https://gitlab.freedesktop.org/kszaq/mesa/tree/panfrost_64_32 > > [2] https://patchwork.kernel.org/patch/10699829/ > > > > > > [ 345.204813] panfrost 180.gpu: mmu irq status=1 > > [ 345.209617] panfrost 180.gpu: Unhandled Page fault in AS0 at VA > > 0x02400400 > > [ 345.209617] Reason: TODO > > [ 345.209617] raw fault status: 0x82C1 > > [ 345.209617] decoded fault status: SLAVE FAULT > > [ 345.209617] exception type 0xC1: TRANSLATION_FAULT_LEVEL1 > > [ 345.209617] access type 0x2: READ > > [ 345.209617] source id 0x8000 > > [ 345.729957] panfrost 180.gpu: gpu sched timeout, js=0, > > status=0x8, head=0x2400400, tail=0x2400400, sched_job=9e204de9 > > [ 346.055876] panfrost 180.gpu: mmu irq status=1 > > [ 346.060680] panfrost 180.gpu: Unhandled Page fault in AS0 at VA > > 0x02C00A00 > > [ 346.060680] Reason: TODO > > [ 346.060680] raw fault status: 0x810002C1 > > [ 346.060680] decoded fault status: SLAVE FAULT > > [ 346.060680] exception type 0xC1: TRANSLATION_FAULT_LEVEL1 > > [ 346.060680] access type 0x2: READ > > [ 346.060680] source id 0x8100 > > [ 346.561955] panfrost 180.gpu: gpu sched timeout, js=1, > > status=0x8, head=0x2c00a00, tail=0x2c00a00, sched_job=b55a9a85 > > [ 346.573913] panfrost 180.gpu: mmu irq status=1 > > [ 346.578707] panfrost 180.gpu: Unhandled Page fault in AS0 at VA > > 0x02C00B80 > > > > Change in v5: > > - 
Remove fix indent
> >
> > Changes in v4:
> > - Add bus_clock probe
> > - Fix sanity check in io-pgtable
> > - Add vramp-delay
> > - Merge all boards into one patch
> > - Remove upstreamed Neil A. patch
> >
> > Change in v3 (Thanks to Maxime Ripard):
> > - Reauthor Icenowy for her patch
> >
> > Changes in v2 (Thanks to Maxime Ripard):
> > - Drop GPU OPP Table
> > - Add clocks and clock-names in required
> >
> > Clément Péron (5):
> >   drm: panfrost: add optional bus_clock
> >   iommu: io-pgtable: fix sanity check for non 48-bit mali iommu
> >   dt-bindings: gpu: mali-midgard: Add H6 mali gpu compatible
> >   arm64: dts: allwinner: Add ARM Mali GPU node for H6
> >   arm64: dts: allwinner: Add mali GPU supply for H6 boards
> >
> > Icenowy Zheng (1):
> >   dt-bindings: gpu: add bus clock for Mali Midgard GPUs
>
> I've applied patches 1 and 3 to drm-misc. I was going to do patch 4
> too, but it doesn't apply.
>
> Patch 2 can go in via the iommu tree and the rest via the allwinner tree.

Is this OK for you to pick up this series?

Thanks,
Clément

> Rob
Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code
On 29/04/2019 13:56, Robin Murphy wrote:
> On 22/04/2019 18:59, Christoph Hellwig wrote:
>> There is nothing really arm64 specific in the iommu_dma_ops
>> implementation, so move it to dma-iommu.c and keep a lot of symbols
>> self-contained. Note the implementation does depend on the
>> DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
>> DMA_IOMMU support depend on it, but this will be relaxed soon.
>
> Nothing looks objectionable, and boot testing with this much of the
> series merged has my coherent and non-coherent IOMMU-backed devices
> appearing to still work OK, so:
>
> Acked-by: Robin Murphy

Since next-20190529 one of our tests for MMC has started failing, where
the symptom is that the data written to the MMC does not match the
source. Bisecting this is pointing to this commit. Unfortunately, I am
not able to cleanly revert this on top of -next, but wanted to report
this in case you have any ideas.

Cheers
Jon

--
nvpublic
Re: [PATCH v2 0/4] iommu: Add device fault reporting API
On Mon, 3 Jun 2019 15:57:45 +0100
Jean-Philippe Brucker wrote:

> Allow device drivers and VFIO to get notified on IOMMU translation
> fault, and handle recoverable faults (PCI PRI). Several series require
> this API (Intel VT-d and Arm SMMUv3 nested support, as well as the
> generic host SVA implementation).
>
> Changes since v1 [1]:
> * Allocate iommu_param earlier, in iommu_probe_device().
> * Pass struct iommu_fault to fault handlers, instead of the
>   iommu_fault_event wrapper.
> * Removed unused iommu_fault_event::iommu_private.
> * Removed unnecessary iommu_page_response::addr.
> * Added iommu_page_response::version, which would allow introducing a
>   new incompatible iommu_page_response structure (as opposed to just
>   adding a flag + field).
>
> [1] [PATCH 0/4] iommu: Add device fault reporting API
> https://lore.kernel.org/lkml/20190523180613.55049-1-jean-philippe.bruc...@arm.com/
>
> Jacob Pan (3):
>   driver core: Add per device iommu param
>   iommu: Introduce device fault data
>   iommu: Introduce device fault report API
>
> Jean-Philippe Brucker (1):
>   iommu: Add recoverable fault reporting

This interface meets the need for vt-d; just one more comment on 2/4.
Do you want to add Co-developed-by for yourself on the three patches
from me?

Thanks,
Jacob

> drivers/iommu/iommu.c      | 236 -
> include/linux/device.h     |   3 +
> include/linux/iommu.h      |  87 ++
> include/uapi/linux/iommu.h | 153
> 4 files changed, 476 insertions(+), 3 deletions(-)
> create mode 100644 include/uapi/linux/iommu.h
Re: [PATCH v2 2/4] iommu: Introduce device fault data
On Mon, 3 Jun 2019 15:57:47 +0100
Jean-Philippe Brucker wrote:

> +/**
> + * struct iommu_fault_page_request - Page Request data
> + * @flags: encodes whether the corresponding fields are valid and whether
> + *         this is the last page in group (IOMMU_FAULT_PAGE_REQUEST_* values)
> + * @pasid: Process Address Space ID
> + * @grpid: Page Request Group Index
> + * @perm: requested page permissions (IOMMU_FAULT_PERM_* values)
> + * @addr: page address
> + * @private_data: device-specific private information
> + */
> +struct iommu_fault_page_request {
> +#define IOMMU_FAULT_PAGE_REQUEST_PASID_VALID	(1 << 0)
> +#define IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE	(1 << 1)
> +#define IOMMU_FAULT_PAGE_REQUEST_PRIV_DATA	(1 << 2)
> +	__u32	flags;
> +	__u32	pasid;
> +	__u32	grpid;
> +	__u32	perm;
> +	__u64	addr;
> +	__u64	private_data[2];
> +};
> +

Just a thought, for non-identity G-H PASID management: we could pass on
the guest PASID in the PRQ to save a lookup in QEMU. In this case, QEMU
allocates a GPASID for the vIOMMU, then a host PASID for the pIOMMU, so
QEMU has a G->H lookup. When a PRQ comes in to the pIOMMU with the
HPASID, the IOMMU driver can retrieve the GPASID from the bind data and
then report it to the guest via VFIO. In that case QEMU does not need to
do a H->G PASID lookup.

Should we add a gpasid field here? Or we can add a flag and field later,
up to you.

Thanks,
Jacob
Re: [PATCH v8 28/29] vfio-pci: Add VFIO_PCI_DMA_FAULT_IRQ_INDEX
On Sun, 26 May 2019 18:10:03 +0200 Eric Auger wrote: > Add a new VFIO_PCI_DMA_FAULT_IRQ_INDEX index. This allows to > set/unset an eventfd that will be triggered when DMA translation > faults are detected at physical level when the nested mode is used. > > Signed-off-by: Eric Auger > --- > drivers/vfio/pci/vfio_pci.c | 3 +++ > drivers/vfio/pci/vfio_pci_intrs.c | 19 +++ > include/uapi/linux/vfio.h | 1 + > 3 files changed, 23 insertions(+) Note that I suggested to Intel folks trying to add a GVT-g page flipping eventfd to convert to device specific interrupts the same way we added device specific regions: https://patchwork.kernel.org/patch/10962337/ I'd probably suggest the same here so we can optionally expose it when supported. Thanks, Alex > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > index a9c8af2a774a..65a1e6814f5c 100644 > --- a/drivers/vfio/pci/vfio_pci.c > +++ b/drivers/vfio/pci/vfio_pci.c > @@ -746,6 +746,8 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device > *vdev, int irq_type) > return 1; > } else if (irq_type == VFIO_PCI_REQ_IRQ_INDEX) { > return 1; > + } else if (irq_type == VFIO_PCI_DMA_FAULT_IRQ_INDEX) { > + return 1; > } > > return 0; > @@ -1082,6 +1084,7 @@ static long vfio_pci_ioctl(void *device_data, > switch (info.index) { > case VFIO_PCI_INTX_IRQ_INDEX ... 
VFIO_PCI_MSIX_IRQ_INDEX: > case VFIO_PCI_REQ_IRQ_INDEX: > + case VFIO_PCI_DMA_FAULT_IRQ_INDEX: > break; > case VFIO_PCI_ERR_IRQ_INDEX: > if (pci_is_pcie(vdev->pdev)) > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c > b/drivers/vfio/pci/vfio_pci_intrs.c > index 1c46045b0e7f..28a96117daf3 100644 > --- a/drivers/vfio/pci/vfio_pci_intrs.c > +++ b/drivers/vfio/pci/vfio_pci_intrs.c > @@ -622,6 +622,18 @@ static int vfio_pci_set_req_trigger(struct > vfio_pci_device *vdev, > count, flags, data); > } > > +static int vfio_pci_set_dma_fault_trigger(struct vfio_pci_device *vdev, > + unsigned index, unsigned start, > + unsigned count, uint32_t flags, > + void *data) > +{ > + if (index != VFIO_PCI_DMA_FAULT_IRQ_INDEX || start != 0 || count > 1) > + return -EINVAL; > + > + return vfio_pci_set_ctx_trigger_single(&vdev->dma_fault_trigger, > +count, flags, data); > +} > + > int vfio_pci_set_irqs_ioctl(struct vfio_pci_device *vdev, uint32_t flags, > unsigned index, unsigned start, unsigned count, > void *data) > @@ -671,6 +683,13 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_device > *vdev, uint32_t flags, > break; > } > break; > + case VFIO_PCI_DMA_FAULT_IRQ_INDEX: > + switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) { > + case VFIO_IRQ_SET_ACTION_TRIGGER: > + func = vfio_pci_set_dma_fault_trigger; > + break; > + } > + break; > } > > if (!func) > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > index 13e041b84d48..66b6b08c4a38 100644 > --- a/include/uapi/linux/vfio.h > +++ b/include/uapi/linux/vfio.h > @@ -559,6 +559,7 @@ enum { > VFIO_PCI_MSIX_IRQ_INDEX, > VFIO_PCI_ERR_IRQ_INDEX, > VFIO_PCI_REQ_IRQ_INDEX, > + VFIO_PCI_DMA_FAULT_IRQ_INDEX, > VFIO_PCI_NUM_IRQS > }; >
Re: [PATCH v8 04/29] iommu: Add recoverable fault reporting
On Sun, 26 May 2019 18:09:39 +0200 Eric Auger wrote: > From: Jean-Philippe Brucker > > Some IOMMU hardware features, for example PCI's PRI and Arm SMMU's Stall, > enable recoverable I/O page faults. Allow IOMMU drivers to report PRI Page > Requests and Stall events through the new fault reporting API. The > consumer of the fault can be either an I/O page fault handler in the host, > or a guest OS. > > Once handled, the fault must be completed by sending a page response back > to the IOMMU. Add an iommu_page_response() function to complete a page > fault. > > Signed-off-by: Jacob Pan > Signed-off-by: Jean-Philippe Brucker > --- > drivers/iommu/iommu.c | 77 ++- > include/linux/iommu.h | 51 > 2 files changed, 127 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 795518445a3a..13b301cfb10f 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -869,7 +869,14 @@ EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier); > * @data: private data passed as argument to the handler > * > * When an IOMMU fault event is received, this handler gets called with the > - * fault event and data as argument. > + * fault event and data as argument. The handler should return 0 on success. > If > + * the fault is recoverable (IOMMU_FAULT_PAGE_REQ), the handler should also > + * complete the fault by calling iommu_page_response() with one of the > following > + * response code: > + * - IOMMU_PAGE_RESP_SUCCESS: retry the translation > + * - IOMMU_PAGE_RESP_INVALID: terminate the fault > + * - IOMMU_PAGE_RESP_FAILURE: terminate the fault and stop reporting > + * page faults if possible. > * > * Return 0 if the fault handler was installed successfully, or an error. 
> */ > @@ -904,6 +911,8 @@ int iommu_register_device_fault_handler(struct device > *dev, > } > param->fault_param->handler = handler; > param->fault_param->data = data; > + mutex_init(¶m->fault_param->lock); > + INIT_LIST_HEAD(¶m->fault_param->faults); > > done_unlock: > mutex_unlock(¶m->lock); > @@ -934,6 +943,12 @@ int iommu_unregister_device_fault_handler(struct device > *dev) > if (!param->fault_param) > goto unlock; > > + /* we cannot unregister handler if there are pending faults */ > + if (!list_empty(¶m->fault_param->faults)) { > + ret = -EBUSY; > + goto unlock; > + } Why? Attempting to unregister a fault handler suggests the handler doesn't care about outstanding faults. Can't we go ahead and dispatch them as failed? Otherwise we need to be careful that we don't introduce an environment where the registered fault handler is blocked trying to shutdown and release the device due to a flood of errors. Thanks, Alex > + > kfree(param->fault_param); > param->fault_param = NULL; > put_device(dev); > @@ -958,6 +973,7 @@ EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler); > int iommu_report_device_fault(struct device *dev, struct iommu_fault_event > *evt) > { > struct iommu_param *param = dev->iommu_param; > + struct iommu_fault_event *evt_pending; > struct iommu_fault_param *fparam; > int ret = 0; > > @@ -972,6 +988,20 @@ int iommu_report_device_fault(struct device *dev, struct > iommu_fault_event *evt) > ret = -EINVAL; > goto done_unlock; > } > + > + if (evt->fault.type == IOMMU_FAULT_PAGE_REQ && > + (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) { > + evt_pending = kmemdup(evt, sizeof(struct iommu_fault_event), > + GFP_KERNEL); > + if (!evt_pending) { > + ret = -ENOMEM; > + goto done_unlock; > + } > + mutex_lock(&fparam->lock); > + list_add_tail(&evt_pending->list, &fparam->faults); > + mutex_unlock(&fparam->lock); > + } > + > ret = fparam->handler(evt, fparam->data); > done_unlock: > mutex_unlock(¶m->lock); > @@ -1513,6 +1543,51 @@ int 
iommu_attach_device(struct iommu_domain *domain, > struct device *dev) > } > EXPORT_SYMBOL_GPL(iommu_attach_device); > > +int iommu_page_response(struct device *dev, > + struct page_response_msg *msg) > +{ > + struct iommu_param *param = dev->iommu_param; > + int ret = -EINVAL; > + struct iommu_fault_event *evt; > + struct iommu_domain *domain = iommu_get_domain_for_dev(dev); > + > + if (!domain || !domain->ops->page_response) > + return -ENODEV; > + > + /* > + * Device iommu_param should have been allocated when device is > + * added to its iommu_group. > + */ > + if (!param || !param->fault_param) > + return -EINVAL; > + > + /* Only send response if there is a fault report pending */ > + mutex_lock(&param->fault_par
Re: [PATCH v8 05/29] iommu: Add a timeout parameter for PRQ response
On Sun, 26 May 2019 18:09:40 +0200 Eric Auger wrote: > From: Jacob Pan > > When an IO page request is processed outside IOMMU subsystem, response > can be delayed or lost. Add a tunable setup parameter such that user can > choose the timeout for IOMMU to track pending page requests. > > This timeout mechanism is a basic safety net which can be implemented in > conjunction with credit based or device level page response exception > handling. > > Signed-off-by: Jacob Pan > --- > .../admin-guide/kernel-parameters.txt | 8 + > drivers/iommu/iommu.c | 29 +++ > 2 files changed, 37 insertions(+) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt > b/Documentation/admin-guide/kernel-parameters.txt > index 138f6664b2e2..b43f0893d252 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -1813,6 +1813,14 @@ > 1 - Bypass the IOMMU for DMA. > unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH. > > + iommu.prq_timeout= > + Timeout in seconds to wait for page response > + of a pending page request. > + Format: > + Default: 10 > + 0 - no timeout tracking > + 1 to 100 - allowed range > + > io7=[HW] IO7 for Marvel based alpha systems > See comment before marvel_specify_io7 in > arch/alpha/kernel/core_marvel.c. > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 13b301cfb10f..64e87d56f471 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -45,6 +45,19 @@ static unsigned int iommu_def_domain_type = > IOMMU_DOMAIN_DMA; > #endif > static bool iommu_dma_strict __read_mostly = true; > > +/* > + * Timeout to wait for page response of a pending page request. This is > + * intended as a basic safty net in case a pending page request is not > + * responded for an exceptionally long time. Device may also implement > + * its own protection mechanism against this exception. > + * Units are in jiffies with a range between 1 - 100 seconds equivalent. 
> + * Default to 10 seconds. > + * Setting 0 means no timeout tracking. > + */ > +#define IOMMU_PAGE_RESPONSE_MAX_TIMEOUT (HZ * 100) > +#define IOMMU_PAGE_RESPONSE_DEF_TIMEOUT (HZ * 10) > +static unsigned long prq_timeout = IOMMU_PAGE_RESPONSE_DEF_TIMEOUT; > + > struct iommu_group { > struct kobject kobj; > struct kobject *devices_kobj; > @@ -157,6 +170,22 @@ static int __init iommu_dma_setup(char *str) > } > early_param("iommu.strict", iommu_dma_setup); > > +static int __init iommu_set_prq_timeout(char *str) > +{ > + unsigned long timeout; > + > + if (!str) > + return -EINVAL; > + timeout = simple_strtoul(str, NULL, 0); > + timeout = timeout * HZ; > + if (timeout > IOMMU_PAGE_RESPONSE_MAX_TIMEOUT) > + return -EINVAL; > + prq_timeout = timeout; > + > + return 0; > +} > +early_param("iommu.prq_timeout", iommu_set_prq_timeout); > + > static ssize_t iommu_group_attr_show(struct kobject *kobj, >struct attribute *__attr, char *buf) > { It doesn't seem to make much sense to include this patch without also including "iommu: handle page response timeout". Was that one lost? Dropped? Lives elsewhere? Thanks, Alex
Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler
On Sun, 26 May 2019 18:10:01 +0200 Eric Auger wrote: > This patch registers a fault handler which records faults in > a circular buffer and then signals an eventfd. This buffer is > exposed within the fault region. > > Signed-off-by: Eric Auger > > --- > > v3 -> v4: > - move iommu_unregister_device_fault_handler to vfio_pci_release > --- > drivers/vfio/pci/vfio_pci.c | 49 + > drivers/vfio/pci/vfio_pci_private.h | 1 + > 2 files changed, 50 insertions(+) > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > index f75f61127277..52094ba8 100644 > --- a/drivers/vfio/pci/vfio_pci.c > +++ b/drivers/vfio/pci/vfio_pci.c > @@ -30,6 +30,7 @@ > #include > #include > #include > +#include > > #include "vfio_pci_private.h" > > @@ -296,6 +297,46 @@ static const struct vfio_pci_regops > vfio_pci_fault_prod_regops = { > .add_capability = vfio_pci_fault_prod_add_capability, > }; > > +int vfio_pci_iommu_dev_fault_handler(struct iommu_fault_event *evt, void > *data) > +{ > + struct vfio_pci_device *vdev = (struct vfio_pci_device *) data; > + struct vfio_region_fault_prod *prod_region = > + (struct vfio_region_fault_prod *)vdev->fault_pages; > + struct vfio_region_fault_cons *cons_region = > + (struct vfio_region_fault_cons *)(vdev->fault_pages + 2 * > PAGE_SIZE); > + struct iommu_fault *new = > + (struct iommu_fault *)(vdev->fault_pages + prod_region->offset + > + prod_region->prod * prod_region->entry_size); > + int prod, cons, size; > + > + mutex_lock(&vdev->fault_queue_lock); > + > + if (!vdev->fault_abi) > + goto unlock; > + > + prod = prod_region->prod; > + cons = cons_region->cons; > + size = prod_region->nb_entries; > + > + if (CIRC_SPACE(prod, cons, size) < 1) > + goto unlock; > + > + *new = evt->fault; > + prod = (prod + 1) % size; > + prod_region->prod = prod; > + mutex_unlock(&vdev->fault_queue_lock); > + > + mutex_lock(&vdev->igate); > + if (vdev->dma_fault_trigger) > + eventfd_signal(vdev->dma_fault_trigger, 1); > + mutex_unlock(&vdev->igate); > + 
return 0; > + > +unlock: > + mutex_unlock(&vdev->fault_queue_lock); > + return -EINVAL; > +} > + > static int vfio_pci_init_fault_region(struct vfio_pci_device *vdev) > { > struct vfio_region_fault_prod *header; > @@ -328,6 +369,13 @@ static int vfio_pci_init_fault_region(struct > vfio_pci_device *vdev) > header = (struct vfio_region_fault_prod *)vdev->fault_pages; > header->version = -1; > header->offset = PAGE_SIZE; > + > + ret = iommu_register_device_fault_handler(&vdev->pdev->dev, > + vfio_pci_iommu_dev_fault_handler, > + vdev); > + if (ret) > + goto out; > + > return 0; > out: > kfree(vdev->fault_pages); > @@ -570,6 +618,7 @@ static void vfio_pci_release(void *device_data) > if (!(--vdev->refcnt)) { > vfio_spapr_pci_eeh_release(vdev->pdev); > vfio_pci_disable(vdev); > + iommu_unregister_device_fault_handler(&vdev->pdev->dev); But this can fail if there are pending faults which leaves a device reference and then the system is broken :( > } > > mutex_unlock(&vdev->reflck->lock); > diff --git a/drivers/vfio/pci/vfio_pci_private.h > b/drivers/vfio/pci/vfio_pci_private.h > index 8e0a55682d3f..a9276926f008 100644 > --- a/drivers/vfio/pci/vfio_pci_private.h > +++ b/drivers/vfio/pci/vfio_pci_private.h > @@ -122,6 +122,7 @@ struct vfio_pci_device { > int ioeventfds_nr; > struct eventfd_ctx *err_trigger; > struct eventfd_ctx *req_trigger; > + struct eventfd_ctx *dma_fault_trigger; > struct mutexfault_queue_lock; > int fault_abi; > struct list_headdummy_resources_list;
Re: [PATCH v8 22/29] vfio: VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE
On Sun, 26 May 2019 18:09:57 +0200 Eric Auger wrote: > From: "Liu, Yi L" > > This patch adds VFIO_IOMMU_ATTACH/DETACH_PASID_TABLE ioctl > which aims to pass/withdraw the virtual iommu guest configuration > to/from the VFIO driver downto to the iommu subsystem. > > Signed-off-by: Jacob Pan > Signed-off-by: Liu, Yi L > Signed-off-by: Eric Auger > > --- > v6 -> v7: > - add a comment related to VFIO_IOMMU_DETACH_PASID_TABLE > > v3 -> v4: > - restore ATTACH/DETACH > - add unwind on failure > > v2 -> v3: > - s/BIND_PASID_TABLE/SET_PASID_TABLE > > v1 -> v2: > - s/BIND_GUEST_STAGE/BIND_PASID_TABLE > - remove the struct device arg > --- > drivers/vfio/vfio_iommu_type1.c | 53 + > include/uapi/linux/vfio.h | 22 ++ > 2 files changed, 75 insertions(+) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index 3ddc375e7063..b2d609d6fe83 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -1758,6 +1758,43 @@ static int vfio_domains_have_iommu_cache(struct > vfio_iommu *iommu) > return ret; > } > > +static void > +vfio_detach_pasid_table(struct vfio_iommu *iommu) > +{ > + struct vfio_domain *d; > + > + mutex_lock(&iommu->lock); > + > + list_for_each_entry(d, &iommu->domain_list, next) { > + iommu_detach_pasid_table(d->domain); > + } > + mutex_unlock(&iommu->lock); > +} > + > +static int > +vfio_attach_pasid_table(struct vfio_iommu *iommu, > + struct vfio_iommu_type1_attach_pasid_table *ustruct) > +{ > + struct vfio_domain *d; > + int ret = 0; > + > + mutex_lock(&iommu->lock); > + > + list_for_each_entry(d, &iommu->domain_list, next) { > + ret = iommu_attach_pasid_table(d->domain, &ustruct->config); > + if (ret) > + goto unwind; > + } > + goto unlock; > +unwind: > + list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) { > + iommu_detach_pasid_table(d->domain); > + } > +unlock: > + mutex_unlock(&iommu->lock); > + return ret; > +} > + > static long vfio_iommu_type1_ioctl(void *iommu_data, > unsigned 
int cmd, unsigned long arg) > { > @@ -1828,6 +1865,22 @@ static long vfio_iommu_type1_ioctl(void *iommu_data, > > return copy_to_user((void __user *)arg, &unmap, minsz) ? > -EFAULT : 0; > + } else if (cmd == VFIO_IOMMU_ATTACH_PASID_TABLE) { > + struct vfio_iommu_type1_attach_pasid_table ustruct; > + > + minsz = offsetofend(struct vfio_iommu_type1_attach_pasid_table, > + config); > + > + if (copy_from_user(&ustruct, (void __user *)arg, minsz)) > + return -EFAULT; > + > + if (ustruct.argsz < minsz || ustruct.flags) > + return -EINVAL; > + > + return vfio_attach_pasid_table(iommu, &ustruct); > + } else if (cmd == VFIO_IOMMU_DETACH_PASID_TABLE) { > + vfio_detach_pasid_table(iommu); > + return 0; > } > > return -ENOTTY; > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > index 8f10748dac79..4316dd8cb5b5 100644 > --- a/include/uapi/linux/vfio.h > +++ b/include/uapi/linux/vfio.h > @@ -14,6 +14,7 @@ > > #include > #include > +#include > > #define VFIO_API_VERSION 0 > > @@ -763,6 +764,27 @@ struct vfio_iommu_type1_dma_unmap { > #define VFIO_IOMMU_ENABLE_IO(VFIO_TYPE, VFIO_BASE + 15) > #define VFIO_IOMMU_DISABLE _IO(VFIO_TYPE, VFIO_BASE + 16) > > +/** > + * VFIO_IOMMU_ATTACH_PASID_TABLE - _IOWR(VFIO_TYPE, VFIO_BASE + 22, > + * struct vfio_iommu_type1_attach_pasid_table) > + * > + * Passes the PASID table to the host. Calling ATTACH_PASID_TABLE > + * while a table is already installed is allowed: it replaces the old > + * table. DETACH does a comprehensive tear down of the nested mode. 
> + */ > +struct vfio_iommu_type1_attach_pasid_table { > + __u32 argsz; > + __u32 flags; > + struct iommu_pasid_table_config config; > +}; > +#define VFIO_IOMMU_ATTACH_PASID_TABLE_IO(VFIO_TYPE, VFIO_BASE + 22) > + > +/** > + * VFIO_IOMMU_DETACH_PASID_TABLE - - _IOWR(VFIO_TYPE, VFIO_BASE + 23) > + * Detaches the PASID table > + */ > +#define VFIO_IOMMU_DETACH_PASID_TABLE_IO(VFIO_TYPE, VFIO_BASE + 23) > + > /* Additional API for SPAPR TCE (Server POWERPC) IOMMU */ > > /* I'm tempted to suggest a "SET" rather than ATTACH/DETACH interface so this could be done in one ioctl and make use of the flags provided.
Re: [PATCH v8 25/29] vfio-pci: Add a new VFIO_REGION_TYPE_NESTED region type
On Sun, 26 May 2019 18:10:00 +0200 Eric Auger wrote: > This patch adds two new regions aiming to handle nested mode > translation faults. > > The first region (two host kernel pages) is read-only from the > user-space perspective. The first page contains a header > that provides information about the circular buffer located in the > second page. The circular buffer is kept in a separate page so that > it can later be made mmappable. > > The max user API version supported by the kernel is returned > through a dedicated fault region capability. > > The prod header contains > - the user API version in use (potentially lower than the one > returned in the capability), > - the offset of the queue within the region, > - the producer index relative to the start of the queue, > - the max number of fault records, > - the size of each record. > > The second region is write-only from the user perspective. It > contains the version of the requested fault ABI and the consumer > index, which userspace updates each time it has > consumed fault records. > > The natural order of operation for the userspace is: > - retrieve the highest supported fault ABI version > - set the requested fault ABI version in the consumer region > > Until the ABI version is set by the userspace, the kernel > cannot return a comprehensive set of information inside the > prod header (entry size and number of entries in the fault queue). It's not clear to me why two regions are required for this. If the first page is not mmap capable, why does it need to be read-only? If it were not read-only, couldn't the fields of the second region also fit within this first page? If you wanted to deal with an mmap-capable writeable region, it could just be yet a 3rd page in the first region. 
> > Signed-off-by: Eric Auger > > --- > > v4 -> v5 > - check cons is not null in vfio_pci_check_cons_fault > > v3 -> v4: > - use 2 separate regions, respectively in read and write modes > - add the version capability > --- > drivers/vfio/pci/vfio_pci.c | 105 > drivers/vfio/pci/vfio_pci_private.h | 17 + > drivers/vfio/pci/vfio_pci_rdwr.c| 73 +++ > include/uapi/linux/vfio.h | 42 +++ > 4 files changed, 237 insertions(+) > > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c > index cab71da46f4a..f75f61127277 100644 > --- a/drivers/vfio/pci/vfio_pci.c > +++ b/drivers/vfio/pci/vfio_pci.c > @@ -261,6 +261,106 @@ int vfio_pci_set_power_state(struct vfio_pci_device > *vdev, pci_power_t state) > return ret; > } > > +void vfio_pci_fault_release(struct vfio_pci_device *vdev, > + struct vfio_pci_region *region) > +{ > +} > + > +static const struct vfio_pci_fault_abi fault_abi_versions[] = { > + [0] = { > + .entry_size = sizeof(struct iommu_fault), > + }, > +}; > + > +#define NR_FAULT_ABIS ARRAY_SIZE(fault_abi_versions) This looks like it's leading to some dangerous complicated code to support multiple user selected ABIs. How many ABIs do we plan to support? The region capability also exposes a type, sub-type, and version. How much of this could be exposed that way? ie. if we need to support multiple versions, expose multiple regions. 
> + > +static int vfio_pci_fault_prod_add_capability(struct vfio_pci_device *vdev, > + struct vfio_pci_region *region, struct vfio_info_cap *caps) > +{ > + struct vfio_region_info_cap_fault cap = { > + .header.id = VFIO_REGION_INFO_CAP_PRODUCER_FAULT, > + .header.version = 1, > + .version = NR_FAULT_ABIS, > + }; > + return vfio_info_add_capability(caps, &cap.header, sizeof(cap)); > +} > + > +static const struct vfio_pci_regops vfio_pci_fault_cons_regops = { > + .rw = vfio_pci_fault_cons_rw, > + .release= vfio_pci_fault_release, > +}; > + > +static const struct vfio_pci_regops vfio_pci_fault_prod_regops = { > + .rw = vfio_pci_fault_prod_rw, > + .release= vfio_pci_fault_release, > + .add_capability = vfio_pci_fault_prod_add_capability, > +}; > + > +static int vfio_pci_init_fault_region(struct vfio_pci_device *vdev) > +{ > + struct vfio_region_fault_prod *header; > + int ret; > + > + mutex_init(&vdev->fault_queue_lock); > + > + vdev->fault_pages = kzalloc(3 * PAGE_SIZE, GFP_KERNEL); > + if (!vdev->fault_pages) > + return -ENOMEM; > + > + ret = vfio_pci_register_dev_region(vdev, > + VFIO_REGION_TYPE_NESTED, > + VFIO_REGION_SUBTYPE_NESTED_FAULT_PROD, > + &vfio_pci_fault_prod_regops, 2 * PAGE_SIZE, > + VFIO_REGION_INFO_FLAG_READ, vdev->fault_pages); If mmap isn't supported yet, why are we pushing the queue out to the 2nd page? We're just wasting space. vfio_region_fault_prod.offset allows us to relocate it when/if it is mmap capable. > + if (ret) > +
Re: [PATCH v8 24/29] vfio: VFIO_IOMMU_BIND/UNBIND_MSI
On Sun, 26 May 2019 18:09:59 +0200 Eric Auger wrote: > This patch adds the VFIO_IOMMU_BIND/UNBIND_MSI ioctl which aim > to pass/withdraw the guest MSI binding to/from the host. > > Signed-off-by: Eric Auger > > --- > v6 -> v7: > - removed the dev arg > > v3 -> v4: > - add UNBIND > - unwind on BIND error > > v2 -> v3: > - adapt to new proto of bind_guest_msi > - directly use vfio_iommu_for_each_dev > > v1 -> v2: > - s/vfio_iommu_type1_guest_msi_binding/vfio_iommu_type1_bind_guest_msi > --- > drivers/vfio/vfio_iommu_type1.c | 64 + > include/uapi/linux/vfio.h | 29 +++ > 2 files changed, 93 insertions(+) > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c > index 6fda4fbc9bfa..18142cb078a3 100644 > --- a/drivers/vfio/vfio_iommu_type1.c > +++ b/drivers/vfio/vfio_iommu_type1.c > @@ -1832,6 +1832,42 @@ static int vfio_cache_inv_fn(struct device *dev, void > *data) > return iommu_cache_invalidate(dc->domain, dev, &ustruct->info); > } > > +static int > +vfio_bind_msi(struct vfio_iommu *iommu, > + dma_addr_t giova, phys_addr_t gpa, size_t size) > +{ > + struct vfio_domain *d; > + int ret = 0; > + > + mutex_lock(&iommu->lock); > + > + list_for_each_entry(d, &iommu->domain_list, next) { > + ret = iommu_bind_guest_msi(d->domain, giova, gpa, size); > + if (ret) > + goto unwind; > + } > + goto unlock; > +unwind: > + list_for_each_entry_continue_reverse(d, &iommu->domain_list, next) { > + iommu_unbind_guest_msi(d->domain, giova); > + } > +unlock: > + mutex_unlock(&iommu->lock); > + return ret; > +} > + > +static void > +vfio_unbind_msi(struct vfio_iommu *iommu, dma_addr_t giova) > +{ > + struct vfio_domain *d; > + > + mutex_lock(&iommu->lock); > + list_for_each_entry(d, &iommu->domain_list, next) { > + iommu_unbind_guest_msi(d->domain, giova); > + } > + mutex_unlock(&iommu->lock); > +} > + > static long vfio_iommu_type1_ioctl(void *iommu_data, > unsigned int cmd, unsigned long arg) > { > @@ -1936,6 +1972,34 @@ static long vfio_iommu_type1_ioctl(void 
*iommu_data, > &ustruct); > mutex_unlock(&iommu->lock); > return ret; > + } else if (cmd == VFIO_IOMMU_BIND_MSI) { > + struct vfio_iommu_type1_bind_msi ustruct; > + > + minsz = offsetofend(struct vfio_iommu_type1_bind_msi, > + size); > + > + if (copy_from_user(&ustruct, (void __user *)arg, minsz)) > + return -EFAULT; > + > + if (ustruct.argsz < minsz || ustruct.flags) > + return -EINVAL; > + > + return vfio_bind_msi(iommu, ustruct.iova, ustruct.gpa, > + ustruct.size); > + } else if (cmd == VFIO_IOMMU_UNBIND_MSI) { > + struct vfio_iommu_type1_unbind_msi ustruct; > + > + minsz = offsetofend(struct vfio_iommu_type1_unbind_msi, > + iova); > + > + if (copy_from_user(&ustruct, (void __user *)arg, minsz)) > + return -EFAULT; > + > + if (ustruct.argsz < minsz || ustruct.flags) > + return -EINVAL; > + > + vfio_unbind_msi(iommu, ustruct.iova); > + return 0; > } > > return -ENOTTY; > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > index 055aa9b9745a..2774a1ab37ae 100644 > --- a/include/uapi/linux/vfio.h > +++ b/include/uapi/linux/vfio.h > @@ -798,6 +798,35 @@ struct vfio_iommu_type1_cache_invalidate { > }; > #define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, VFIO_BASE + 24) > > +/** > + * VFIO_IOMMU_BIND_MSI - _IOWR(VFIO_TYPE, VFIO_BASE + 25, > + * struct vfio_iommu_type1_bind_msi) > + * > + * Pass a stage 1 MSI doorbell mapping to the host so that this > + * latter can build a nested stage2 mapping > + */ > +struct vfio_iommu_type1_bind_msi { > + __u32 argsz; > + __u32 flags; > + __u64 iova; > + __u64 gpa; > + __u64 size; > +}; > +#define VFIO_IOMMU_BIND_MSI _IO(VFIO_TYPE, VFIO_BASE + 25) > + > +/** > + * VFIO_IOMMU_UNBIND_MSI - _IOWR(VFIO_TYPE, VFIO_BASE + 26, > + * struct vfio_iommu_type1_unbind_msi) > + * > + * Unregister an MSI mapping > + */ > +struct vfio_iommu_type1_unbind_msi { > + __u32 argsz; > + __u32 flags; > + __u64 iova; > +}; > +#define VFIO_IOMMU_UNBIND_MSI _IO(VFIO_TYPE, VFIO_BASE + 26) > + > /* Additional API for SPAPR TCE (Server 
POWERPC) IOMMU */ > > /* And another pair of ioctls. Maybe think about how we can reduce the ioctl bloat of this series. I don't want to impose an awkward interface for the sake of fewer ioctls, but I also don't want us casually bur
[PATCH] iommu/dma: Apply dma_{alloc,free}_contiguous functions
This patch replaces dma_{alloc,release}_from_contiguous() with dma_{alloc,free}_contiguous() to simplify those function calls. Signed-off-by: Nicolin Chen --- drivers/iommu/dma-iommu.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 0cd49c2d3770..cc3d39dbbe1a 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -951,8 +951,8 @@ static void __iommu_dma_free(struct device *dev, size_t size, void *cpu_addr) if (pages) __iommu_dma_free_pages(pages, count); - if (page && !dma_release_from_contiguous(dev, page, count)) - __free_pages(page, get_order(alloc_size)); + if (page) + dma_free_contiguous(dev, page, alloc_size); } static void iommu_dma_free(struct device *dev, size_t size, void *cpu_addr, @@ -970,12 +970,7 @@ static void *iommu_dma_alloc_pages(struct device *dev, size_t size, struct page *page = NULL; void *cpu_addr; - if (gfpflags_allow_blocking(gfp)) - page = dma_alloc_from_contiguous(dev, alloc_size >> PAGE_SHIFT, -get_order(alloc_size), -gfp & __GFP_NOWARN); - if (!page) - page = alloc_pages(gfp, get_order(alloc_size)); + page = dma_alloc_contiguous(dev, alloc_size, gfp); if (!page) return NULL; @@ -997,8 +992,7 @@ static void *iommu_dma_alloc_pages(struct device *dev, size_t size, memset(cpu_addr, 0, alloc_size); return cpu_addr; out_free_pages: - if (!dma_release_from_contiguous(dev, page, alloc_size >> PAGE_SHIFT)) - __free_pages(page, get_order(alloc_size)); + dma_free_contiguous(dev, page, alloc_size); return NULL; } -- 2.17.1
Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
Michael S. Tsirkin writes: > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: >> I rephrased it in terms of address translation. What do you think of >> this version? The flag name is slightly different too: >> >> >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, >> with the exception that address translation is guaranteed to be >> unnecessary when accessing memory addresses supplied to the device >> by the driver. Which is to say, the device will always use physical >> addresses matching addresses used by the driver (typically meaning >> physical addresses used by the CPU) and not translated further. This >> flag should be set by the guest if offered, but to allow for >> backward-compatibility device implementations allow for it to be >> left unset by the guest. It is an error to set both this flag and >> VIRTIO_F_ACCESS_PLATFORM. > > > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged > drivers. This is why devices fail when it's not negotiated. Just to clarify, what do you mean by unprivileged drivers? Is it drivers implemented in guest userspace such as with VFIO? Or unprivileged in some other sense such as needing to use bounce buffers for some reason? > This confuses me. > If driver is unpriveledged then what happens with this flag? > It can supply any address it wants. Will that corrupt kernel > memory? Not needing address translation doesn't necessarily mean that there's no IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's always an IOMMU present. And we also support VFIO drivers. The VFIO API for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls to program the IOMMU. For our use case, we don't need address translation because we set up an identity mapping in the IOMMU so that the device can use guest physical addresses. 
If the guest kernel is concerned that an unprivileged driver could jeopardize its integrity, it should not negotiate this feature flag. Perhaps there should be a note about this in the flag definition? This concern is platform-dependent though. I don't believe it's an issue in pseries. -- Thiago Jung Bauermann IBM Linux Technology Center
Re: [RFC PATCH] virtio_ring: Use DMA API if guest memory is encrypted
On Mon, Jun 03, 2019 at 10:13:59PM -0300, Thiago Jung Bauermann wrote: > > > Michael S. Tsirkin writes: > > > On Wed, Apr 17, 2019 at 06:42:00PM -0300, Thiago Jung Bauermann wrote: > >> I rephrased it in terms of address translation. What do you think of > >> this version? The flag name is slightly different too: > >> > >> > >> VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION This feature has the same > >> meaning as VIRTIO_F_ACCESS_PLATFORM both when set and when not set, > >> with the exception that address translation is guaranteed to be > >> unnecessary when accessing memory addresses supplied to the device > >> by the driver. Which is to say, the device will always use physical > >> addresses matching addresses used by the driver (typically meaning > >> physical addresses used by the CPU) and not translated further. This > >> flag should be set by the guest if offered, but to allow for > >> backward-compatibility device implementations allow for it to be > >> left unset by the guest. It is an error to set both this flag and > >> VIRTIO_F_ACCESS_PLATFORM. > > > > > > OK so VIRTIO_F_ACCESS_PLATFORM is designed to allow unpriveledged > > drivers. This is why devices fail when it's not negotiated. > > Just to clarify, what do you mean by unprivileged drivers? Is it drivers > implemented in guest userspace such as with VFIO? Or unprivileged in > some other sense such as needing to use bounce buffers for some reason? I had drivers in guest userspace in mind. > > This confuses me. > > If driver is unpriveledged then what happens with this flag? > > It can supply any address it wants. Will that corrupt kernel > > memory? > > Not needing address translation doesn't necessarily mean that there's no > IOMMU. On powerpc we don't use VIRTIO_F_ACCESS_PLATFORM but there's > always an IOMMU present. And we also support VFIO drivers. The VFIO API > for pseries (sPAPR section in Documentation/vfio.txt) has extra ioctls > to program the IOMMU. 
> > For our use case, we don't need address translation because we set up an > identity mapping in the IOMMU so that the device can use guest physical > addresses. And can it access any guest physical address? > If the guest kernel is concerned that an unprivileged driver could > jeopardize its integrity it should not negotiate this feature flag. Unfortunately flag negotiation is done through config space and so can be overwritten by the driver. > Perhaps there should be a note about this in the flag definition? This > concern is platform-dependant though. I don't believe it's an issue in > pseries. Again ACCESS_PLATFORM has a pretty open definition. It does actually say it's all up to the platform. Specifically how will VIRTIO_F_ACCESS_PLATFORM_NO_TRANSLATION be implemented portably? virtio has no portable way to know whether DMA API bypasses translation. > -- > Thiago Jung Bauermann > IBM Linux Technology Center
Re: [PATCH 07/26] iommu/dma: move the arm64 wrappers to common code
On Mon, Jun 03, 2019 at 08:47:57PM +0100, Jon Hunter wrote: > Since next-20190529 one of our tests for MMC has started failing, where > the symptom is that the data written to the MMC does not match the > source. Bisecting this is pointing to this commit. Unfortunately, I am > not able to cleanly revert this on top of -next, but wanted to report > this if case you have any ideas. Does this fix your problem? https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=generic-dma-ops&id=1b961423158caaae49d3900b7c9c37477bbfa9b3
cleanup vmap usage in the dma-mapping layer
Hi all, the common DMA remapping code uses the vmalloc/vmap code to create page table entries for DMA mappings. This series lifts the currently arm specific VM_* flag for that into common code, and also exposes it to userspace in procfs to better understand the mappings, and cleans up a couple helpers in this area.
[PATCH 1/3] vmalloc: lift the arm flag for coherent mappings to common code
The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA coherent
remapping for a while.  Lift this flag to common code so that we can use
it generically.  We also check it in the only place VM_USERMAP is directly
checked so that we can entirely replace that flag as well (although I'm not
even sure why we'd want to allow remapping DMA mappings, but I'd rather not
change behavior).

Signed-off-by: Christoph Hellwig
---
 arch/arm/mm/dma-mapping.c | 22 +++---
 arch/arm/mm/mm.h          |  3 ---
 include/linux/vmalloc.h   |  2 ++
 mm/vmalloc.c              |  5 -
 4 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0a75058c11f3..e197b028e341 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -360,19 +360,13 @@ static void *
 __dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot,
 	const void *caller)
 {
-	/*
-	 * DMA allocation can be mapped to user space, so lets
-	 * set VM_USERMAP flags too.
-	 */
-	return dma_common_contiguous_remap(page, size,
-			VM_ARM_DMA_CONSISTENT | VM_USERMAP,
+	return dma_common_contiguous_remap(page, size, VM_DMA_COHERENT,
 			prot, caller);
 }
 
 static void __dma_free_remap(void *cpu_addr, size_t size)
 {
-	dma_common_free_remap(cpu_addr, size,
-			VM_ARM_DMA_CONSISTENT | VM_USERMAP);
+	dma_common_free_remap(cpu_addr, size, VM_DMA_COHERENT);
 }
 
 #define DEFAULT_DMA_COHERENT_POOL_SIZE	SZ_256K
@@ -1387,8 +1381,8 @@ static void *
 __iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot,
 	const void *caller)
 {
-	return dma_common_pages_remap(pages, size,
-			VM_ARM_DMA_CONSISTENT | VM_USERMAP, prot, caller);
+	return dma_common_pages_remap(pages, size, VM_DMA_COHERENT, prot,
+			caller);
 }
 
 /*
@@ -1472,7 +1466,7 @@ static struct page **__iommu_get_pages(void *cpu_addr, unsigned long attrs)
 		return cpu_addr;
 
 	area = find_vm_area(cpu_addr);
-	if (area && (area->flags & VM_ARM_DMA_CONSISTENT))
+	if (area && (area->flags & VM_DMA_COHERENT))
 		return area->pages;
 	return NULL;
 }
@@ -1630,10 +1624,8 @@ void __arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 		return;
 	}
 
-	if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0) {
-		dma_common_free_remap(cpu_addr, size,
-				VM_ARM_DMA_CONSISTENT | VM_USERMAP);
-	}
+	if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
+		dma_common_free_remap(cpu_addr, size, VM_DMA_COHERENT);
 
 	__iommu_remove_mapping(dev, handle, size);
 	__iommu_free_buffer(dev, pages, size, attrs);
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index 6b045c6653ea..6ec48188ef9f 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -68,9 +68,6 @@ extern void __flush_dcache_page(struct address_space *mapping, struct page *page
 #define VM_ARM_MTYPE(mt)	((mt) << 20)
 #define VM_ARM_MTYPE_MASK	(0x1f << 20)
 
-/* consistent regions used by dma_alloc_attrs() */
-#define VM_ARM_DMA_CONSISTENT	0x2000
-
 struct static_vm {
 	struct vm_struct vm;
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 51e131245379..500fa4fb06f0 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -18,6 +18,7 @@ struct notifier_block;		/* in notifier.h */
 #define VM_ALLOC		0x0002	/* vmalloc() */
 #define VM_MAP			0x0004	/* vmap()ed pages */
 #define VM_USERMAP		0x0008	/* suitable for remap_vmalloc_range */
+#define VM_DMA_COHERENT		0x0010	/* dma_alloc_coherent */
 #define VM_UNINITIALIZED	0x0020	/* vm_struct is not fully initialized */
 #define VM_NO_GUARD		0x0040	/* don't add guard page */
 #define VM_KASAN		0x0080	/* has allocated kasan shadow memory */
@@ -26,6 +27,7 @@ struct notifier_block;		/* in notifier.h */
  * vfree_atomic().
  */
 #define VM_FLUSH_RESET_PERMS	0x0100	/* Reset direct map and flush TLB on unmap */
+
 /* bits [20..32] reserved for arch specific ioremap internals */
 
 /*
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 233af6936c93..c4b5784bccc1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2948,7 +2948,7 @@ int remap_vmalloc_range_partial(struct vm_area_struct *vma, unsigned long uaddr,
 	if (!area)
 		return -EINVAL;
 
-	if (!(area->flags & VM_USERMAP))
+	if (!(area->flags & (VM_USERMAP | VM_DMA_COHERENT)))
 		return -EINVAL;
 
 	if (kaddr + size > area->addr + get_vm_area_size(area))
@@ -3438,6 +3438,9 @@ static int s_show(struct seq_file *m, void *p)
[PATCH 3/3] dma-mapping: introduce a dma_common_find_pages helper
A helper to find the backing page array based on a virtual address.
This also ensures we do the same vm_flags check everywhere instead
of slightly different or missing ones in a few places.

Signed-off-by: Christoph Hellwig
---
 arch/arm/mm/dma-mapping.c   |  7 +--
 drivers/iommu/dma-iommu.c   | 15 +++
 include/linux/dma-mapping.h |  1 +
 kernel/dma/remap.c          | 13 +++--
 4 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 647fd25d2aba..7620d4f55e92 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1434,18 +1434,13 @@ static struct page **__atomic_get_pages(void *addr)
 
 static struct page **__iommu_get_pages(void *cpu_addr, unsigned long attrs)
 {
-	struct vm_struct *area;
-
 	if (__in_atomic_pool(cpu_addr, PAGE_SIZE))
 		return __atomic_get_pages(cpu_addr);
 
 	if (attrs & DMA_ATTR_NO_KERNEL_MAPPING)
 		return cpu_addr;
 
-	area = find_vm_area(cpu_addr);
-	if (area && (area->flags & VM_DMA_COHERENT))
-		return area->pages;
-	return NULL;
+	return dma_common_find_pages(cpu_addr);
 }
 
 static void *__iommu_alloc_simple(struct device *dev, size_t size, gfp_t gfp,
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index cea561897086..002d3bb6254a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -543,15 +543,6 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
 	return pages;
 }
 
-static struct page **__iommu_dma_get_pages(void *cpu_addr)
-{
-	struct vm_struct *area = find_vm_area(cpu_addr);
-
-	if (!area || !area->pages)
-		return NULL;
-	return area->pages;
-}
-
 /**
  * iommu_dma_alloc_remap - Allocate and map a buffer contiguous in IOVA space
  * @dev: Device to allocate memory for. Must be a real device
@@ -940,7 +931,7 @@ static void __iommu_dma_free(struct device *dev, size_t size, void *cpu_addr)
 	 * If it the address is remapped, then it's either non-coherent
 	 * or highmem CMA, or an iommu_dma_alloc_remap() construction.
 	 */
-	pages = __iommu_dma_get_pages(cpu_addr);
+	pages = dma_common_find_pages(cpu_addr);
 	if (!pages)
 		page = vmalloc_to_page(cpu_addr);
 	dma_common_free_remap(cpu_addr, alloc_size);
@@ -1050,7 +1041,7 @@ static int iommu_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 		return -ENXIO;
 
 	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
-		struct page **pages = __iommu_dma_get_pages(cpu_addr);
+		struct page **pages = dma_common_find_pages(cpu_addr);
 
 		if (pages)
 			return __iommu_dma_mmap(pages, size, vma);
@@ -1072,7 +1063,7 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
 	int ret;
 
 	if (IS_ENABLED(CONFIG_DMA_REMAP) && is_vmalloc_addr(cpu_addr)) {
-		struct page **pages = __iommu_dma_get_pages(cpu_addr);
+		struct page **pages = dma_common_find_pages(cpu_addr);
 
 		if (pages) {
 			return sg_alloc_table_from_pages(sgt, pages,
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index ac320b7cacfd..cb07d1388d66 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -615,6 +615,7 @@ extern int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs);
 
+struct page **dma_common_find_pages(void *cpu_addr);
 void *dma_common_contiguous_remap(struct page *page, size_t size,
 			pgprot_t prot, const void *caller);
 
diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
index 51958d21c810..52cdca386de0 100644
--- a/kernel/dma/remap.c
+++ b/kernel/dma/remap.c
@@ -11,6 +11,15 @@
 #include
 #include
 
+struct page **dma_common_find_pages(void *cpu_addr)
+{
+	struct vm_struct *area = find_vm_area(cpu_addr);
+
+	if (!area || area->flags != VM_DMA_COHERENT)
+		return NULL;
+	return area->pages;
+}
+
 static struct vm_struct *__dma_common_pages_remap(struct page **pages,
 			size_t size, pgprot_t prot, const void *caller)
 {
@@ -78,9 +87,9 @@ void *dma_common_contiguous_remap(struct page *page, size_t size,
  */
 void dma_common_free_remap(void *cpu_addr, size_t size)
 {
-	struct vm_struct *area = find_vm_area(cpu_addr);
+	struct page **pages = dma_common_find_pages(cpu_addr);
 
-	if (!area || area->flags != VM_DMA_COHERENT) {
+	if (!pages) {
 		WARN(1, "trying to free invalid coherent area: %p\n", cpu_addr);
 		return;
 	}
-- 
2.20.1
[PATCH 2/3] dma-mapping: always use VM_DMA_COHERENT for generic DMA remap
Currently the generic dma remap allocator gets a vm_flags passed by the
caller that is a little confusing.  We just introduced a generic
vmalloc-level flag to identify the dma coherent allocations, so use that
everywhere and remove the now pointless argument.

Signed-off-by: Christoph Hellwig
---
 arch/arm/mm/dma-mapping.c    | 37 +++-
 arch/xtensa/kernel/pci-dma.c |  4 ++--
 drivers/iommu/dma-iommu.c    |  6 +++---
 include/linux/dma-mapping.h  |  6 ++
 kernel/dma/remap.c           | 25 +++-
 5 files changed, 25 insertions(+), 53 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index e197b028e341..647fd25d2aba 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -356,19 +356,6 @@ static void *__alloc_remap_buffer(struct device *dev, size_t size, gfp_t gfp,
 			pgprot_t prot, struct page **ret_page,
 			const void *caller, bool want_vaddr);
 
-static void *
-__dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot,
-	const void *caller)
-{
-	return dma_common_contiguous_remap(page, size, VM_DMA_COHERENT,
-			prot, caller);
-}
-
-static void __dma_free_remap(void *cpu_addr, size_t size)
-{
-	dma_common_free_remap(cpu_addr, size, VM_DMA_COHERENT);
-}
-
 #define DEFAULT_DMA_COHERENT_POOL_SIZE	SZ_256K
 
 static struct gen_pool *atomic_pool __ro_after_init;
@@ -525,7 +512,7 @@ static void *__alloc_remap_buffer(struct device *dev, size_t size, gfp_t gfp,
 	if (!want_vaddr)
 		goto out;
 
-	ptr = __dma_alloc_remap(page, size, gfp, prot, caller);
+	ptr = dma_common_contiguous_remap(page, size, gfp, prot, caller);
 	if (!ptr) {
 		__dma_free_buffer(page, size);
 		return NULL;
@@ -592,7 +579,8 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size,
 		goto out;
 
 	if (PageHighMem(page)) {
-		ptr = __dma_alloc_remap(page, size, GFP_KERNEL, prot, caller);
+		ptr = dma_common_contiguous_remap(page, size, GFP_KERNEL, prot,
+				caller);
 		if (!ptr) {
 			dma_release_from_contiguous(dev, page, count);
 			return NULL;
@@ -612,7 +600,7 @@ static void __free_from_contiguous(struct device *dev, struct page *page,
 {
 	if (want_vaddr) {
 		if (PageHighMem(page))
-			__dma_free_remap(cpu_addr, size);
+			dma_common_free_remap(cpu_addr, size);
 		else
 			__dma_remap(page, size, PAGE_KERNEL);
 	}
@@ -704,7 +692,7 @@ static void *remap_allocator_alloc(struct arm_dma_alloc_args *args,
 static void remap_allocator_free(struct arm_dma_free_args *args)
 {
 	if (args->want_vaddr)
-		__dma_free_remap(args->cpu_addr, args->size);
+		dma_common_free_remap(args->cpu_addr, args->size);
 
 	__dma_free_buffer(args->page, args->size);
 }
@@ -1374,17 +1362,6 @@ static int __iommu_free_buffer(struct device *dev, struct page **pages,
 	return 0;
 }
 
-/*
- * Create a CPU mapping for a specified pages
- */
-static void *
-__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot,
-	const void *caller)
-{
-	return dma_common_pages_remap(pages, size, VM_DMA_COHERENT, prot,
-			caller);
-}
-
 /*
  * Create a mapping in device IO address space for specified pages
  */
@@ -1541,7 +1518,7 @@ static void *__arm_iommu_alloc_attrs(struct device *dev, size_t size,
 	if (attrs & DMA_ATTR_NO_KERNEL_MAPPING)
 		return pages;
 
-	addr = __iommu_alloc_remap(pages, size, gfp, prot,
+	addr = dma_common_pages_remap(pages, size, gfp, prot,
 			__builtin_return_address(0));
 	if (!addr)
 		goto err_mapping;
@@ -1625,7 +1602,7 @@ void __arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 	}
 
 	if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
-		dma_common_free_remap(cpu_addr, size, VM_DMA_COHERENT);
+		dma_common_free_remap(cpu_addr, size);
 
 	__iommu_remove_mapping(dev, handle, size);
 	__iommu_free_buffer(dev, pages, size, attrs);
diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index 9171bff76fc4..81c1def6b5a2 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -175,7 +175,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 	if (PageHighMem(page)) {
 		void *p;
 
-		p = dma_common_contiguous_remap(page, size, VM_MAP,
+		p = dma_common_contiguous_remap(page, size,
 				pgprot_noncached(PAGE_KERNEL),