Re: use generic DMA mapping code in powerpc V4
Hi All,

Could you please test Christoph's kernel on your PASEMI and NXP boards?

Download:
'git clone git://git.infradead.org/users/hch/misc.git -b powerpc-dma.5 a'

Thanks,
Christian
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 0/8] vfio/mdev: IOMMU aware mediated device
Hi,

Is this solution trying to support general user space processes that work directly on devices?

Yes, it is.

Okay. But I have another question. As I am writing a crypto driver, could I call 'mdev_register_device'? In other words, is 'mdev_register_device' acceptable for crypto drivers?

+cc: Herbert Xu

Thanks,
Zaibo

On 2018/11/5 15:34, Lu Baolu wrote:

Hi,

The Mediated Device framework enables fine-grained physical device sharing across isolated domains. Currently the mdev framework is designed to be independent of platform IOMMU support. As a result, DMA isolation relies on the mdev parent device in a vendor-specific way.

There are several cases where a mediated device can be protected and isolated by the platform IOMMU. For example, Intel VT-d rev3.0 [1] introduces a new translation mode called 'scalable mode', which enables PASID-granular translations. The VT-d scalable mode is the key ingredient of Scalable I/O Virtualization [2] [3], which allows sharing a device at the minimal possible granularity (ADI - Assignable Device Interface).

A mediated device backed by an ADI can be protected and isolated by the IOMMU since 1) the parent device supports tagging a unique PASID to all DMA traffic out of the mediated device; and 2) the DMA translation unit (IOMMU) supports PASID-granular translation. We can apply IOMMU protection and isolation to this kind of device just as we do with an assignable PCI device.

In order to distinguish IOMMU-capable mediated devices from those which still need to rely on their parent devices, this patch set adds two new members to struct mdev_device:

* iommu_device - If set, this indicates that the mediated device can be fully isolated and protected by the IOMMU via attaching an iommu domain to this device. If empty, vendor-defined isolation is used.

* iommu_domain - This is a placeholder for an iommu domain.
A domain can be stored here for later use once it has been attached to the iommu_device of this mdev.

The helpers below are added to set and get the above iommu device and iommu domain pointers in the mdev core implementation:

* mdev_set/get_iommu_device(dev, iommu_device) - Set or get the iommu device which represents this mdev in the IOMMU's device scope. Drivers don't need to set the iommu device if they use vendor-defined isolation.

* mdev_set/get_iommu_domain(domain) - An iommu domain which has been attached to the iommu device in order to protect and isolate the mediated device will be kept in the mdev data structure and can be retrieved later.

The mdev parent device driver can opt in so that the mdev is fully isolated and protected by the IOMMU when the mdev is being created, by invoking mdev_set_iommu_device() in its @create() callback. In vfio_iommu_type1_attach_group(), a domain allocated through iommu_domain_alloc() will be attached to the mdev iommu device if an iommu device has been set. Otherwise, the dummy external domain will be used and all DMA isolation and protection is routed to the parent driver as a result.

On the IOMMU side, a basic requirement is to allow attaching multiple domains to a PCI device if the device advertises the capability and the IOMMU hardware supports finer-granularity translations than the normal PCI Source ID based translation. As a result, a PCI device can work in two modes: normal mode and auxiliary mode. In normal mode, a PCI device is isolated at Source ID granularity; the PCI device itself can be assigned to a user application by attaching a single domain to it. In auxiliary mode, a PCI device is isolated at a finer granularity, hence subsets of the device can be assigned to different user-level applications by attaching a different domain to each subset.
The device driver is able to switch between the above two modes with the following interfaces:

* iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY) - Reports the ability to support multiple domains per device.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE) - Enable the multiple-domains capability for the device referenced by @dev.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE) - Disable the multiple-domains capability for the device referenced by @dev.

* iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID) - Return the ID used for finer-granularity DMA translation.

* iommu_attach_device_aux(domain, dev) - Attach a domain to the device in auxiliary mode.

* iommu_detach_device_aux(domain, dev) - Detach the aux domain from the device.

For ease of discussion, we sometimes say 'a domain in auxiliary mode' or simply 'an auxiliary domain' when a domain is attached to a device for finer-granularity translations. But keep in mind that this doesn't mean there is a different domain type: the same domain can be bound to one device for Source ID based translation and bound to another device for finer-granularity translation at the same time.
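Since the interfaces above are proposed in this series rather than an existing mainline API, the following is an illustrative sketch only (names and signatures as posted here, error handling elided) of how a driver might carve a device into PASID-tagged subsets:

	/* Sketch against the proposed API: check and enable aux-domain support. */
	if (iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY))
		return -ENODEV;
	iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE);

	/* One domain per assignable subset, attached in auxiliary mode. */
	domain = iommu_domain_alloc(dev->bus);
	iommu_attach_device_aux(domain, dev);

	/* The aux ID (e.g. a PASID) tags the subset's DMA transactions. */
	iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID, &aux_id);

	/* Teardown. */
	iommu_detach_device_aux(domain, dev);
	iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE);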
Re: [PATCH v2 1/9] mm: Introduce new vm_insert_range API
On Mon, Dec 3, 2018 at 11:52 AM Mike Rapoport wrote:
>
> On Mon, Dec 03, 2018 at 09:51:45AM +0530, Souptick Joarder wrote:
> > Hi Mike,
> >
> > On Sun, Dec 2, 2018 at 4:43 PM Mike Rapoport wrote:
> > >
> > > On Sun, Dec 02, 2018 at 11:49:44AM +0530, Souptick Joarder wrote:
> > > > Previously, drivers had their own way of mapping a range of
> > > > kernel pages/memory into a user vma, done by invoking
> > > > vm_insert_page() within a loop.
> > > >
> > > > As this pattern is common across different drivers, it can
> > > > be generalized by creating a new function and using it across
> > > > the drivers.
> > > >
> > > > vm_insert_range is the new API which will be used to map a
> > > > range of kernel memory/pages to a user vma.
> > > >
> > > > This API is tested by Heiko for the Rockchip drm driver, on rk3188,
> > > > rk3288, rk3328 and rk3399 with graphics.
> > > >
> > > > Signed-off-by: Souptick Joarder
> > > > Reviewed-by: Matthew Wilcox
> > > > Tested-by: Heiko Stuebner
> > > > ---
> > > >  include/linux/mm_types.h |  3 +++
> > > >  mm/memory.c              | 38 ++
> > > >  mm/nommu.c               |  7 +++
> > > >  3 files changed, 48 insertions(+)
> > > >
> > > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > > > index 5ed8f62..15ae24f 100644
> > > > --- a/include/linux/mm_types.h
> > > > +++ b/include/linux/mm_types.h
> > > > @@ -523,6 +523,9 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> > > >  extern void tlb_finish_mmu(struct mmu_gather *tlb,
> > > >  		unsigned long start, unsigned long end);
> > > >
> > > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > > +		struct page **pages, unsigned long page_count);
> > > > +
> > >
> > > This seems to belong in include/linux/mm.h, near vm_insert_page()
> >
> > Ok, I will change it. Apart from this change, does it look good?
>
> With this change you can add
>
> Reviewed-by: Mike Rapoport

Thanks Mike.
> > > >
> > > >  static inline void init_tlb_flush_pending(struct mm_struct *mm)
> > > >  {
> > > >  	atomic_set(&mm->tlb_flush_pending, 0);
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 15c417e..84ea46c 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -1478,6 +1478,44 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
> > > >  }
> > > >
> > > >  /**
> > > > + * vm_insert_range - insert range of kernel pages into user vma
> > > > + * @vma: user vma to map to
> > > > + * @addr: target user address of this page
> > > > + * @pages: pointer to array of source kernel pages
> > > > + * @page_count: number of pages need to insert into user vma
> > > > + *
> > > > + * This allows drivers to insert range of kernel pages they've allocated
> > > > + * into a user vma. This is a generic function which drivers can use
> > > > + * rather than using their own way of mapping range of kernel pages into
> > > > + * user vma.
> > > > + *
> > > > + * If we fail to insert any page into the vma, the function will return
> > > > + * immediately leaving any previously-inserted pages present. Callers
> > > > + * from the mmap handler may immediately return the error as their caller
> > > > + * will destroy the vma, removing any successfully-inserted pages. Other
> > > > + * callers should make their own arrangements for calling unmap_region().
> > > > + *
> > > > + * Context: Process context. Called by mmap handlers.
> > > > + * Return: 0 on success and error code otherwise
> > > > + */
> > > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > > +		struct page **pages, unsigned long page_count)
> > > > +{
> > > > +	unsigned long uaddr = addr;
> > > > +	int ret = 0, i;
> > > > +
> > > > +	for (i = 0; i < page_count; i++) {
> > > > +		ret = vm_insert_page(vma, uaddr, pages[i]);
> > > > +		if (ret < 0)
> > > > +			return ret;
> > > > +		uaddr += PAGE_SIZE;
> > > > +	}
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL(vm_insert_range);
> > > > +
> > > > +/**
> > > >  * vm_insert_page - insert single page into user vma
> > > >  * @vma: user vma to map to
> > > >  * @addr: target user address of this page
> > > >
> > > > diff --git a/mm/nommu.c b/mm/nommu.c
> > > > index 749276b..d6ef5c7 100644
> > > > --- a/mm/nommu.c
> > > > +++ b/mm/nommu.c
> > > > @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
> > > >  }
> > > >  EXPORT_SYMBOL(vm_insert_page);
> > > >
> > > > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > > > +		struct page **pages, unsigned long page_count)
> > > > +{
> > > > +	return -EINVAL;
> > > > +}
> > > > +EXPORT_SYMBOL(vm_insert_range);
> > > > +
> > > >  /*
> >
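For illustration (a hypothetical driver, not part of this patch), an mmap handler built on the new API collapses the old vm_insert_page() loop into a single call:

	/* Sketch: 'struct my_buf' and its fields are made up for this example. */
	static int my_drv_mmap(struct file *file, struct vm_area_struct *vma)
	{
		struct my_buf *buf = file->private_data;
		unsigned long npages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;

		if (npages > buf->page_count)
			return -EINVAL;

		/* On failure we can simply return the error: the mmap caller
		 * destroys the vma, removing any pages already inserted. */
		return vm_insert_range(vma, vma->vm_start, buf->pages, npages);
	}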
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Dec 3, 2018 at 7:56 PM Rob Clark wrote:
>
> On Mon, Dec 3, 2018 at 7:45 AM Robin Murphy wrote:
> >
> > Hi Rob,
> >
> > On 01/12/2018 16:53, Rob Clark wrote:
> > > This solves a problem we see with drm/msm, caused by getting
> > > iommu_dma_ops while we attach our own domain and manage it directly
> > > at the iommu API level:
> > >
> > >    [0038] user address but active_mm is swapper
> > >    Internal error: Oops: 9605 [#1] PREEMPT SMP
> > >    Modules linked in:
> > >    CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: G W 4.19.3 #90
> > >    Hardware name: xxx (DT)
> > >    Workqueue: events deferred_probe_work_func
> > >    pstate: 80c9 (Nzcv daif +PAN +UAO)
> > >    pc : iommu_dma_map_sg+0x7c/0x2c8
> > >    lr : iommu_dma_map_sg+0x40/0x2c8
> > >    sp : ff80095eb4f0
> > >    x29: ff80095eb4f0 x28:
> > >    x27: ffc0f9431578 x26:
> > >    x25: x24: 0003
> > >    x23: 0001 x22: ffc0fa9ac010
> > >    x21: x20: ffc0fab40980
> > >    x19: ffc0fab40980 x18: 0003
> > >    x17: 01c4 x16: 0007
> > >    x15: 000e x14:
> > >    x13: x12: 0028
> > >    x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> > >    x9 : x8 : ffc0fab409a0
> > >    x7 : x6 : 0002
> > >    x5 : 0001 x4 :
> > >    x3 : 0001 x2 : 0002
> > >    x1 : ffc0f9431578 x0 :
> > >    Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
> > >    Call trace:
> > >     iommu_dma_map_sg+0x7c/0x2c8
> > >     __iommu_map_sg_attrs+0x70/0x84
> > >     get_pages+0x170/0x1e8
> > >     msm_gem_get_iova+0x8c/0x128
> > >     _msm_gem_kernel_new+0x6c/0xc8
> > >     msm_gem_kernel_new+0x4c/0x58
> > >     dsi_tx_buf_alloc_6g+0x4c/0x8c
> > >     msm_dsi_host_modeset_init+0xc8/0x108
> > >     msm_dsi_modeset_init+0x54/0x18c
> > >     _dpu_kms_drm_obj_init+0x430/0x474
> > >     dpu_kms_hw_init+0x5f8/0x6b4
> > >     msm_drm_bind+0x360/0x6c8
> > >     try_to_bring_up_master.part.7+0x28/0x70
> > >     component_master_add_with_match+0xe8/0x124
> > >     msm_pdev_probe+0x294/0x2b4
> > >     platform_drv_probe+0x58/0xa4
> > >     really_probe+0x150/0x294
> > >     driver_probe_device+0xac/0xe8
> > >     __device_attach_driver+0xa4/0xb4
> > >     bus_for_each_drv+0x98/0xc8
> > >     __device_attach+0xac/0x12c
> > >     device_initial_probe+0x24/0x30
> > >     bus_probe_device+0x38/0x98
> > >     deferred_probe_work_func+0x78/0xa4
> > >     process_one_work+0x24c/0x3dc
> > >     worker_thread+0x280/0x360
> > >     kthread+0x134/0x13c
> > >     ret_from_fork+0x10/0x18
> > >    Code: d284 91000725 6b17039f 5400048a (f9401f40)
> > >    ---[ end trace f22dda57f3648e2c ]---
> > >    Kernel panic - not syncing: Fatal exception
> > >    SMP: stopping secondary CPUs
> > >    Kernel Offset: disabled
> > >    CPU features: 0x0,22802a18
> > >    Memory Limit: none
> > >
> > > The problem is that when drm/msm does its own iommu_attach_device(),
> > > the domain returned by iommu_get_domain_for_dev() is now drm/msm's
> > > domain, and it doesn't have domain->iova_cookie.
> >
> > Does this crash still happen with 4.20-rc? Because as of 6af588fed391
> > it really shouldn't.
> >
> > > We kind of avoided this problem prior to sdm845/dpu because the iommu
> > > was attached to the mdp node in dt, which is a child of the toplevel
> > > mdss node (which corresponds to the dev passed in dma_map_sg()). But
> > > with sdm845, the iommu is now attached at the mdss level, so we hit
> > > the iommu_dma_ops in dma_map_sg().
> > >
> > > But auto allocating/attaching a domain before the driver is probed
> > > was already a blocking problem for enabling per-context pagetables
> > > for the GPU. This problem is also now solved with this patch.
> >
> > s/solved/worked around/
> >
> > If you want a guarantee of actually getting a specific hardware
> > context allocated for a given domain, there needs to be code in the
> > IOMMU driver to understand and honour that. Implicitly depending on
> > whatever happens to fall out of current driver behaviour on current
> > systems is not a real solution.
> >
> > > Fixes: 97890ba9289c ("dma-mapping: detect and configure IOMMU in
> > > of_dma_configure")
> >
> > That's rather misleading, since the crash described above depends on
> > at least two other major changes which came long after that commit.
> >
> > It's not that I don't understand exactly what you want here - just
> > that this commit message isn't a coherent justification for that ;)
> >
> > > Tested-by: Douglas Anderson
> > > Signed-off-by: Rob Clark
> > > ---
> > > This is an alternative/replacement for [1]. What it lacks in
> > > elegance it makes up for in practicality ;-)
> > >
> > > [1] https://patchwork.freedesktop.org/patch
[PATCH v19 4/5] dt-bindings: arm-smmu: Add bindings for qcom,smmu-v2
Add bindings doc for Qcom's smmu-v2 implementation.

Signed-off-by: Vivek Gautam
Reviewed-by: Tomasz Figa
Tested-by: Srinivas Kandagatla
Reviewed-by: Rob Herring
Reviewed-by: Robin Murphy
---
Changes since v18: None.

 .../devicetree/bindings/iommu/arm,smmu.txt | 39 ++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 8a6ffce12af5..a6504b37cc21 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -17,10 +17,16 @@ conditions.
                         "arm,mmu-401"
                         "arm,mmu-500"
                         "cavium,smmu-v2"
+                        "qcom,smmu-v2"

                         depending on the particular implementation and/or the
                         version of the architecture implemented.

+                        Qcom SoCs must contain, as below, SoC-specific compatibles
+                        along with "qcom,smmu-v2":
+                        "qcom,msm8996-smmu-v2", "qcom,smmu-v2",
+                        "qcom,sdm845-smmu-v2", "qcom,smmu-v2".
+
 - reg : Base address and size of the SMMU.

 - #global-interrupts : The number of global interrupts exposed by the
@@ -71,6 +77,22 @@ conditions.
                   or using stream matching with #iommu-cells = <2>, and
                   may be ignored if present in such cases.

+- clock-names:    List of the names of clocks input to the device. The
+                  required list depends on particular implementation and
+                  is as follows:
+                  - for "qcom,smmu-v2":
+                    - "bus": clock required for downstream bus access and
+                             for the smmu ptw,
+                    - "iface": clock required to access smmu's registers
+                               through the TCU's programming interface.
+                  - unspecified for other implementations.
+
+- clocks:         Specifiers for all clocks listed in the clock-names property,
+                  as per generic clock bindings.
+
+- power-domains:  Specifiers for power domains required to be powered on for
+                  the SMMU to operate, as per generic power domain bindings.
+
 ** Deprecated properties:

 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
@@ -137,3 +159,20 @@ conditions.
                 iommu-map = <0 &smmu3 0 0x400>;
                 ...
         };
+
+        /* Qcom's arm,smmu-v2 implementation */
+        smmu4: iommu@d0 {
+                compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2";
+                reg = <0xd0 0x1>;
+
+                #global-interrupts = <1>;
+                interrupts = ,
+                             ,
+                             ;
+                #iommu-cells = <1>;
+                power-domains = <&mmcc MDSS_GDSC>;
+
+                clocks = <&mmcc SMMU_MDP_AXI_CLK>,
+                         <&mmcc SMMU_MDP_AHB_CLK>;
+                clock-names = "bus", "iface";
+        };
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v4 0/8] vfio/mdev: IOMMU aware mediated device
Hi,

On 12/4/18 11:46 AM, Xu Zaibo wrote:
> Hi,
>
> Is this solution trying to support general user space processes who are
> directly working on devices?

Yes, it is.

Best regards,
Lu Baolu
[PATCH v19 3/5] iommu/arm-smmu: Add the device_link between masters and smmu
From: Sricharan R

Finally add the device link between the master device and the smmu, so
that the smmu gets runtime enabled/disabled only when the master needs
it. This is done from the add_device callback, which gets called once
when the master is added to the smmu.

Signed-off-by: Sricharan R
Signed-off-by: Vivek Gautam
Reviewed-by: Tomasz Figa
Tested-by: Srinivas Kandagatla
Reviewed-by: Robin Murphy
---
Changes since v18: None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 1917d214c4d9..b6b11642b3a9 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1500,6 +1500,9 @@ static int arm_smmu_add_device(struct device *dev)

 	iommu_device_link(&smmu->iommu, dev);

+	device_link_add(dev, smmu->dev,
+			DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER);
+
 	return 0;

 out_cfg_free:
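To illustrate what the link buys us (hypothetical master driver, not part of this patch): with smmu->dev registered as a PM-runtime supplier of the master, the master's own runtime-PM calls transparently power the SMMU:

	/* In a master (e.g. GPU) driver. Because of the device link created
	 * in arm_smmu_add_device(), this resumes the supplier (the SMMU)
	 * before the master, and allows it to idle again after the put. */
	ret = pm_runtime_get_sync(master_dev);
	if (ret < 0)
		return ret;

	/* ... issue DMA through the now-powered SMMU ... */

	pm_runtime_put(master_dev);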
[PATCH v19 5/5] iommu/arm-smmu: Add support for qcom,smmu-v2 variant
qcom,smmu-v2 is an arm,smmu-v2 implementation with specific clock and
power requirements. On msm8996, multiple cores, viz. mdss, video, etc.
use this smmu. On sdm845, this smmu is used with the gpu. Add bindings
for the same.

Signed-off-by: Vivek Gautam
Reviewed-by: Rob Herring
Reviewed-by: Tomasz Figa
Tested-by: Srinivas Kandagatla
Reviewed-by: Robin Murphy
---
Changes since v18: None.

 drivers/iommu/arm-smmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b6b11642b3a9..ba18d89d4732 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -120,6 +120,7 @@ enum arm_smmu_implementation {
 	GENERIC_SMMU,
 	ARM_MMU500,
 	CAVIUM_SMMUV2,
+	QCOM_SMMUV2,
 };

 struct arm_smmu_s2cr {
@@ -2030,6 +2031,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
 ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
+ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);

 static const struct of_device_id arm_smmu_of_match[] = {
 	{ .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
@@ -2038,6 +2040,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
 	{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
 	{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
 	{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+	{ .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
[PATCH v19 2/5] iommu/arm-smmu: Invoke pm_runtime across the driver
From: Sricharan R

Enable pm-runtime on devices that implement a pm domain. Then, add pm
runtime hooks to several iommu_ops to power cycle the smmu device for
explicit TLB invalidation requests, register space accesses, etc. We
need these hooks when the smmu, linked to its master through device
links, has to be powered up without the master device being in context.

Signed-off-by: Sricharan R
[vivek: Cleanup pm runtime calls]
Signed-off-by: Vivek Gautam
Reviewed-by: Tomasz Figa
Tested-by: Srinivas Kandagatla
Reviewed-by: Robin Murphy
---
Changes since v18: None.

 drivers/iommu/arm-smmu.c | 108 ++-
 1 file changed, 98 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 602b67d4f2d6..1917d214c4d9 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -270,6 +270,20 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ 0, NULL},
 };

+static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
+{
+	if (pm_runtime_enabled(smmu->dev))
+		return pm_runtime_get_sync(smmu->dev);
+
+	return 0;
+}
+
+static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
+{
+	if (pm_runtime_enabled(smmu->dev))
+		pm_runtime_put(smmu->dev);
+}
+
 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 {
 	return container_of(dom, struct arm_smmu_domain, domain);
@@ -929,11 +943,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	int irq;
+	int ret, irq;

 	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
 		return;

+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		return;
+
 	/*
 	 * Disable the context bank and free the page tables before freeing
 	 * it.
@@ -948,6 +966,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)

 	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
 	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
+
+	arm_smmu_rpm_put(smmu);
 }

 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
@@ -1229,10 +1249,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -ENODEV;

 	smmu = fwspec_smmu(fwspec);
+
+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		return ret;
+
 	/* Ensure that the domain is finalised */
 	ret = arm_smmu_init_domain_context(domain, smmu);
 	if (ret < 0)
-		return ret;
+		goto rpm_put;

 	/*
 	 * Sanity check the domain. We don't support domains across
@@ -1242,49 +1267,74 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		dev_err(dev,
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
 			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
-		return -EINVAL;
+		ret = -EINVAL;
+		goto rpm_put;
 	}

 	/* Looks ok, so add the device to the domain */
-	return arm_smmu_domain_add_master(smmu_domain, fwspec);
+	ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
+
+rpm_put:
+	arm_smmu_rpm_put(smmu);
+	return ret;
 }

 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
 	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+	int ret;

 	if (!ops)
 		return -ENODEV;

-	return ops->map(ops, iova, paddr, size, prot);
+	arm_smmu_rpm_get(smmu);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	arm_smmu_rpm_put(smmu);
+
+	return ret;
 }

 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size)
 {
 	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
+	size_t ret;

 	if (!ops)
 		return 0;

-	return ops->unmap(ops, iova, size);
+	arm_smmu_rpm_get(smmu);
+	ret =
 ops->unmap(ops, iova, size);
+	arm_smmu_rpm_put(smmu);
+
+	return ret;
 }

 static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;

-	if (smmu_domain->tlb_ops)
+	if (smmu_domain->tlb_ops) {
+		arm_smmu_rpm_get(smmu);
 		smmu_domain-
[PATCH v19 0/5] iommu/arm-smmu: Add runtime pm/sleep support
Changes since v18:
 - Addressing Stephen's comment [5]: Replaced the entire clock bulk data
   filling and handling with devm_clk_bulk_get_all().

Changes since v17:
 - Addressing Will's comment to embed Thor's change [2] for pulling clocks
   information from the device tree. This is done by squashing Thor's
   change [2] into v17's 1/5 patch [3].
 - Another minor change is the addition of runtime pm hooks to
   arm_smmu_iova_to_phys_hard().

The previous version of this patch series is at [1]. Also refer to [4]
for the change logs of previous versions.

[1] https://lore.kernel.org/patchwork/cover/1017699/
[2] https://lore.kernel.org/patchwork/patch/996143/
[3] https://lore.kernel.org/patchwork/patch/1013167/
[4] https://lore.kernel.org/patchwork/cover/979429/
[5] https://lore.kernel.org/patchwork/patch/1017700/

Sricharan R (3):
  iommu/arm-smmu: Add pm_runtime/sleep ops
  iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device
  iommu/arm-smmu: Add the device_link between masters and smmu

Vivek Gautam (2):
  dt-bindings: arm-smmu: Add bindings for qcom,smmu-v2
  iommu/arm-smmu: Add support for qcom,smmu-v2 variant

 .../devicetree/bindings/iommu/arm,smmu.txt | 39 +
 drivers/iommu/arm-smmu.c                   | 170 +++--
 2 files changed, 197 insertions(+), 12 deletions(-)
[PATCH v19 1/5] iommu/arm-smmu: Add pm_runtime/sleep ops
From: Sricharan R

The smmu needs to be functional only when the respective masters using
it are active. The device_link feature helps to track such functional
dependencies, so that the iommu gets powered when the master device
enables itself using pm_runtime. So by adapting the smmu driver for
runtime pm, the above said dependency can be addressed.

This patch adds the pm runtime/sleep callbacks to the driver and the
corresponding bulk clock handling for all the clocks needed by the smmu.

Also, while we enable runtime pm, add a pm sleep suspend callback that
pushes devices to a low power state by turning the clocks off during
system sleep. Add the corresponding clock enable path in the resume
callback as well.

Signed-off-by: Sricharan R
Signed-off-by: Archit Taneja
[Thor: Rework to get clocks from device tree]
Signed-off-by: Thor Thayer
[vivek: rework for clock and pm ops]
Signed-off-by: Vivek Gautam
Reviewed-by: Tomasz Figa
Tested-by: Srinivas Kandagatla
Reviewed-by: Robin Murphy
Tested-by: Thor Thayer
---
Changes since v18:
 - Replaced the entire clock bulk data filling and handling with
   devm_clk_bulk_get_all().
drivers/iommu/arm-smmu.c | 58 +--- 1 file changed, 55 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 5a28ae892504..602b67d4f2d6 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -48,6 +48,7 @@ #include #include #include +#include #include #include @@ -206,6 +207,8 @@ struct arm_smmu_device { u32 num_global_irqs; u32 num_context_irqs; unsigned int*irqs; + struct clk_bulk_data*clks; + int num_clks; u32 cavium_id_base; /* Specific to Cavium */ @@ -1947,7 +1950,7 @@ struct arm_smmu_match_data { }; #define ARM_SMMU_MATCH_DATA(name, ver, imp)\ -static struct arm_smmu_match_data name = { .version = ver, .model = imp } +static const struct arm_smmu_match_data name = { .version = ver, .model = imp } ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU); ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU); @@ -2150,6 +2153,17 @@ static int arm_smmu_device_probe(struct platform_device *pdev) smmu->irqs[i] = irq; } + err = devm_clk_bulk_get_all(dev, &smmu->clks); + if (err < 0) { + dev_err(dev, "failed to get clocks %d\n", err); + return err; + } + smmu->num_clks = err; + + err = clk_bulk_prepare_enable(smmu->num_clks, smmu->clks); + if (err) + return err; + err = arm_smmu_device_cfg_probe(smmu); if (err) return err; @@ -2236,6 +2250,9 @@ static int arm_smmu_device_remove(struct platform_device *pdev) /* Turn the thing off */ writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); + + clk_bulk_disable_unprepare(smmu->num_clks, smmu->clks); + return 0; } @@ -2244,15 +2261,50 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev) arm_smmu_device_remove(pdev); } -static int __maybe_unused arm_smmu_pm_resume(struct device *dev) +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev) { struct arm_smmu_device *smmu = dev_get_drvdata(dev); + int ret; + + ret = clk_bulk_enable(smmu->num_clks, smmu->clks); + if (ret) + return ret; 
arm_smmu_device_reset(smmu); + return 0; } -static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume); +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev) +{ + struct arm_smmu_device *smmu = dev_get_drvdata(dev); + + clk_bulk_disable(smmu->num_clks, smmu->clks); + + return 0; +} + +static int __maybe_unused arm_smmu_pm_resume(struct device *dev) +{ + if (pm_runtime_suspended(dev)) + return 0; + + return arm_smmu_runtime_resume(dev); +} + +static int __maybe_unused arm_smmu_pm_suspend(struct device *dev) +{ + if (pm_runtime_suspended(dev)) + return 0; + + return arm_smmu_runtime_suspend(dev); +} + +static const struct dev_pm_ops arm_smmu_pm_ops = { + SET_SYSTEM_SLEEP_PM_OPS(arm_smmu_pm_suspend, arm_smmu_pm_resume) + SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend, + arm_smmu_runtime_resume, NULL) +}; static struct platform_driver arm_smmu_driver = { .driver = { -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support
Hi,

On 12/4/18 1:23 AM, Liu, Yi L wrote:
Hi Joerg,

From: Joerg Roedel [mailto:j...@8bytes.org]
Sent: Monday, December 3, 2018 5:49 AM
To: Lu Baolu
Subject: Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:

-	desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, 0);
+	/*
+	 * Need two pages to accommodate 256 descriptors of 256 bits each
+	 * if the remapping hardware supports scalable mode translation.
+	 */
+	desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
+				     !!ecap_smts(iommu->ecap));

Same here, does the allocation really need GFP_ATOMIC?

I still leave this one to Baolu.

The existing code uses GFP_ATOMIC; this patch only changes the size of the
allocated desc_page. I don't think we really need GFP_ATOMIC here (and
also in some other places). I will clean them up in a separate patch.

struct q_inval {
	raw_spinlock_t	q_lock;
-	struct qi_desc	*desc;	/* invalidation queue */
+	void		*desc;	/* invalidation queue */
	int		*desc_status;	/* desc status */
	int		free_head;	/* first free entry */
	int		free_tail;	/* last free entry */

Why do you switch the pointer to void*?

In this patch, there is some code like the code below. It calculates the
destination address of memcpy() from qi->desc. If it were still a struct
qi_desc pointer, the calculation result would be wrong.

+	memcpy(desc, qi->desc + (wait_index << shift),
+	       1 << shift);

The change of the calculation method is to support both 128-bit and
256-bit invalidation descriptors in this unified code logic. The earlier
conversation between Baolu and me may also help:
https://lore.kernel.org/patchwork/patch/1006756/

Yes. We need to support different descriptor sizes.

Best regards,
Lu Baolu
Re: [PATCH v5 02/12] iommu/vt-d: Manage scalable mode PASID tables
Hi,

On 12/4/18 1:23 AM, Liu, Yi L wrote:
Hi Joerg,

From: Joerg Roedel [mailto:j...@8bytes.org]
Sent: Monday, December 3, 2018 5:44 AM
To: Lu Baolu
Subject: Re: [PATCH v5 02/12] iommu/vt-d: Manage scalable mode PASID tables

Hi Baolu,

On Wed, Nov 28, 2018 at 11:54:39AM +0800, Lu Baolu wrote:

@@ -2482,12 +2482,13 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
 	if (dev)
 		dev->archdata.iommu = info;

-	if (dev && dev_is_pci(dev) && info->pasid_supported) {
+	/* PASID table is mandatory for a PCI device in scalable mode. */
+	if (dev && dev_is_pci(dev) && sm_supported(iommu)) {

This will also allocate a PASID table if the device does not support
PASIDs, right? Will the table not be used in that case, or will the device
just use the fallback PASID? Isn't it better in that case to have no PASID
table?

We need to allocate the PASID table in scalable mode, for the following
reason:

In VT-d scalable mode, all address translation is done at PASID
granularity. For requests-with-PASID, the address translation is subjected
to the PASID entry specified by the PASID value in the DMA request.
However, for requests-without-PASID, there is no PASID in the DMA request.
To fulfil the translation logic, the VT-d 3.0 spec introduces the
RID2PASID field in the sm-context-entry, so that such DMA requests are
subjected to the PASID entry specified by the PASID value in the RID2PASID
field of the sm-context-entry.

So for a device without PASID support, we still need at least one PASID
entry so that its DMA requests (without PASID) can be translated. Thus a
PASID table is needed for such devices.
@@ -143,18 +143,20 @@ int intel_pasid_alloc_table(struct device *dev)
 		return -ENOMEM;

 	INIT_LIST_HEAD(&pasid_table->dev);

-	size = sizeof(struct pasid_entry);
-	count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id);
-	order = get_order(size * count);
+	if (info->pasid_supported)
+		max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)),
+				  intel_pasid_max_id);
+
+	size = max_pasid >> (PASID_PDE_SHIFT - 3);
+	order = size ? get_order(size) : 0;
 	pages = alloc_pages_node(info->iommu->node,
-				 GFP_ATOMIC | __GFP_ZERO,
-				 order);
+				 GFP_ATOMIC | __GFP_ZERO, order);

This is a simple data structure allocation path, does it need GFP_ATOMIC?

This function is called in an unsleepable context:

spin_lock(&lock)
[...]
if (pasid_table_is_necessary)
	allocate_pasid_table(dev)
[...]
spin_unlock(&lock)

We can move it out of the locked range. How about:

if (pasid_table_is_necessary)
	pasid_table = allocate_pasid_table(dev)

spin_lock(&lock)
[...]
if (pasid_table_is_necessary)
	set_up_pasid_table(pasid_table)
[...]
spin_unlock(&lock)

?

Best regards,
Lu Baolu
Re: [PATCH v4 0/8] vfio/mdev: IOMMU aware mediated device
Hi,

Is this solution trying to support general user space processes that work
directly on devices?

Thanks,
Zaibo

On 2018/11/5 15:34, Lu Baolu wrote:
Hi,

The Mediated Device framework enables fine-grained physical device sharing
across isolated domains. Currently the mdev framework is designed to be
independent of the platform IOMMU support. As a result, DMA isolation
relies on the mdev parent device in a vendor-specific way.

There are several cases where a mediated device could be protected and
isolated by the platform IOMMU. For example, Intel VT-d rev3.0 [1]
introduces a new translation mode called 'scalable mode', which enables
PASID-granular translations. VT-d scalable mode is the key ingredient for
Scalable I/O Virtualization [2] [3], which allows sharing a device at the
minimal possible granularity (ADI - Assignable Device Interface).

A mediated device backed by an ADI could be protected and isolated by the
IOMMU since 1) the parent device supports tagging a unique PASID to all
DMA traffic out of the mediated device; and 2) the DMA translation unit
(IOMMU) supports PASID-granular translation. We can apply IOMMU protection
and isolation to such devices just as we do with an assignable PCI device.

In order to distinguish the IOMMU-capable mediated devices from those
which still need to rely on parent devices, this patch set adds two new
members to struct mdev_device.

* iommu_device - This, if set, indicates that the mediated device could be
fully isolated and protected by the IOMMU via attaching an iommu domain to
this device. If empty, it indicates using vendor-defined isolation.

* iommu_domain - This is a placeholder for an iommu domain. A domain could
be stored here for later use once it has been attached to the iommu_device
of this mdev.

The helpers below are added to set and get the above iommu device and
iommu domain pointers in the mdev core implementation.
* mdev_set/get_iommu_device(dev, iommu_device) - Set or get the iommu
device which represents this mdev in the IOMMU's device scope. Drivers
don't need to set the iommu device if they use vendor-defined isolation.

* mdev_set/get_iommu_domain(domain) - An iommu domain which has been
attached to the iommu device in order to protect and isolate the mediated
device will be kept in the mdev data structure and can be retrieved later.

The mdev parent device driver can opt in to the mdev being fully isolated
and protected by the IOMMU when the mdev is being created, by invoking
mdev_set_iommu_device() in its @create(). In
vfio_iommu_type1_attach_group(), a domain allocated through
iommu_domain_alloc() will be attached to the mdev iommu device if an iommu
device has been set. Otherwise, the dummy external domain will be used and
all DMA isolation and protection are routed to the parent driver as a
result.

On the IOMMU side, a basic requirement is to allow attaching multiple
domains to a PCI device if the device advertises the capability and the
IOMMU hardware supports finer-granularity translations than the normal PCI
Source ID based translation. As a result, a PCI device could work in two
modes: normal mode and auxiliary mode. In normal mode, a PCI device is
isolated at Source ID granularity; the PCI device itself could be assigned
to a user application by attaching a single domain to it. In auxiliary
mode, a PCI device is isolated at finer granularity, hence subsets of the
device could be assigned to different user-level applications by attaching
a different domain to each subset.

The device driver is able to switch between the above two modes with the
interfaces below:

* iommu_get_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_CAPABILITY) - Represents the
ability of supporting multiple domains per device.

* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_ENABLE) - Enable the
multiple domains capability for the device referenced by @dev.
* iommu_set_dev_attr(dev, IOMMU_DEV_ATTR_AUXD_DISABLE) - Disable the
multiple domains capability for the device referenced by @dev.

* iommu_domain_get_attr(domain, DOMAIN_ATTR_AUXD_ID) - Return the ID used
for finer-granularity DMA translation.

* iommu_attach_device_aux(domain, dev) - Attach a domain to the device in
auxiliary mode.

* iommu_detach_device_aux(domain, dev) - Detach the aux domain from the
device.

For ease of discussion, we sometimes say 'a domain in auxiliary mode', or
simply 'an auxiliary domain', when a domain is attached to a device for
finer-granularity translations. But we need to keep in mind that this
doesn't mean there is a different domain type: the same domain could be
bound to one device for Source ID based translation and bound to another
device for finer-granularity translation at the same time.

This patch series extends both IOMMU and vfio components to support mdev
device passing through when it could be isolated and protected by
[PATCH v3 5/6] arm64: defconfig: Enable ARM_SMMU_TEGRA
Enable CONFIG_ARM_SMMU_TEGRA, which is used by the Tegra194 SoC.

Signed-off-by: Krishna Reddy
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 5c2b1f6..e5ea13b 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -625,6 +625,7 @@ CONFIG_QCOM_APCS_IPC=y
 CONFIG_ROCKCHIP_IOMMU=y
 CONFIG_TEGRA_IOMMU_SMMU=y
 CONFIG_ARM_SMMU=y
+CONFIG_ARM_SMMU_TEGRA=y
 CONFIG_ARM_SMMU_V3=y
 CONFIG_QCOM_IOMMU=y
 CONFIG_REMOTEPROC=m
--
2.1.4
[PATCH v3 2/6] iommu/arm-smmu: Add support to program multiple ARM SMMU's identically
Add support to program multiple ARM SMMUs identically as one SMMU device.
Tegra194 uses two ARM SMMUs as one SMMU device, and both need to be
programmed identically.

Signed-off-by: Krishna Reddy
---
 drivers/iommu/lib-arm-smmu.c | 191 ---
 1 file changed, 144 insertions(+), 47 deletions(-)

diff --git a/drivers/iommu/lib-arm-smmu.c b/drivers/iommu/lib-arm-smmu.c
index 6aba5db..7036763 100644
--- a/drivers/iommu/lib-arm-smmu.c
+++ b/drivers/iommu/lib-arm-smmu.c
@@ -69,9 +69,9 @@
  * therefore this actually makes more sense than it might first appear.
  */
 #ifdef CONFIG_64BIT
-#define smmu_write_atomic_lq writeq_relaxed
+#define smmu_write_atomic_lq writeq_all_relaxed
 #else
-#define smmu_write_atomic_lq writel_relaxed
+#define smmu_write_atomic_lq writel_all_relaxed
 #endif

 /* Translation context bank */
@@ -135,6 +135,48 @@ struct arm_smmu_domain {
 	struct iommu_domain domain;
 };

+#define to_smmu_instance(smmu, inst, addr) \
+	(addr - smmu->bases[0] + smmu->bases[inst])
+
+static void writel_all(struct arm_smmu_device *smmu,
+		       u32 value, void __iomem *addr)
+{
+	int i;
+
+	writel(value, addr);
+	for (i = 1; i < smmu->num_smmus; i++) {
+		void __iomem *reg_addr = to_smmu_instance(smmu, i, addr);
+
+		writel(value, reg_addr);
+	}
+}
+
+static void writel_all_relaxed(struct arm_smmu_device *smmu,
+			       u32 value, void __iomem *addr)
+{
+	int i;
+
+	writel_relaxed(value, addr);
+	for (i = 1; i < smmu->num_smmus; i++) {
+		void __iomem *reg_addr = to_smmu_instance(smmu, i, addr);
+
+		writel_relaxed(value, reg_addr);
+	}
+}
+
+static void writeq_all_relaxed(struct arm_smmu_device *smmu,
+			       u64 value, void __iomem *addr)
+{
+	int i;
+
+	writeq_relaxed(value, addr);
+	for (i = 1; i < smmu->num_smmus; i++) {
+		void __iomem *reg_addr = to_smmu_instance(smmu, i, addr);
+
+		writeq_relaxed(value, reg_addr);
+	}
+}
+
 static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 {
 	return container_of(dom, struct arm_smmu_domain, domain);
@@ -179,25 +221,37 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
 static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu)
 {
-	void __iomem *base = ARM_SMMU_GR0(smmu);
+	int i;
 	unsigned long flags;

 	spin_lock_irqsave(&smmu->global_sync_lock, flags);
-	__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC,
-			    base + ARM_SMMU_GR0_sTLBGSTATUS);
+	for (i = 0; i < smmu->num_smmus; i++) {
+		void __iomem *base = ARM_SMMU_GR0(smmu);
+
+		if (i > 0)
+			base = to_smmu_instance(smmu, i, base);
+		__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC,
+				    base + ARM_SMMU_GR0_sTLBGSTATUS);
+	}
 	spin_unlock_irqrestore(&smmu->global_sync_lock, flags);
 }

 static void arm_smmu_tlb_sync_context(void *cookie)
 {
+	int i;
 	struct arm_smmu_domain *smmu_domain = cookie;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
-	void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
 	unsigned long flags;

 	spin_lock_irqsave(&smmu_domain->cb_lock, flags);
-	__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
-			    base + ARM_SMMU_CB_TLBSTATUS);
+	for (i = 0; i < smmu->num_smmus; i++) {
+		void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
+
+		if (i > 0)
+			base = to_smmu_instance(smmu, i, base);
+		__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
+				    base + ARM_SMMU_CB_TLBSTATUS);
+	}
 	spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
 }

@@ -212,13 +266,14 @@ static void arm_smmu_tlb_inv_context_s1(void *cookie)
 {
 	struct arm_smmu_domain *smmu_domain = cookie;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	void __iomem *base = ARM_SMMU_CB(smmu, cfg->cbndx);

 	/*
 	 * NOTE: this is not a relaxed write; it needs to guarantee that PTEs
 	 * cleared by the current CPU are visible to the SMMU before the TLBI.
*/ - writel(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); + writel_all(smmu, cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); arm_smmu_tlb_sync_context(cookie); } @@ -229,7 +284,7 @@ static void arm_smmu_tlb_inv_context_s2(void *cookie) void __iomem *base = ARM_SMMU_GR0(smmu); /* NOTE: see above */ - writel(smmu_domain
[PATCH v3 6/6] arm64: tegra: Add SMMU nodes to Tegra194 device tree
Add SMMU nodes and dma-ranges to Tegra194 device tree. Tegra194 has three ARM SMMU Instances. Two of them are used together to access IOVA interleaved. The third one is used as regular ARM SMMU. Signed-off-by: Krishna Reddy --- arch/arm64/boot/dts/nvidia/tegra194.dtsi | 148 +++ 1 file changed, 148 insertions(+) diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi index f274562..9a3e08c 100644 --- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi +++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi @@ -12,6 +12,7 @@ interrupt-parent = <&gic>; #address-cells = <2>; #size-cells = <2>; + dma-ranges = <0x0 0x0 0x0 0x0 0x8 0x0>; /* control backbone */ cbb { @@ -957,4 +958,151 @@ (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>; interrupt-parent = <&gic>; }; + + dualsmmu: iommu@1200 { + compatible = "tegra194,arm,mmu-500"; + reg = <0 0x1200 0 0x80>, + <0 0x1100 0 0x80>; + interrupts = , +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +; + stream-match-mask = <0x7f80>; + #global-interrupts = <1>; + #iommu-cells = <1>; + }; + + smmu: iommu@1000 { + compatible = "arm,mmu-500"; + reg = <0 0x1000 0 0x80>; + interrupts = , +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +
[PATCH v3 1/6] iommu/arm-smmu: create library for ARM SMMU
Create library routines to share ARM SMMU programming and common IOMMU API implementation for ARM SMMU v1 and v2 based architecture Implementations. Signed-off-by: Krishna Reddy --- drivers/iommu/Makefile |1 + drivers/iommu/lib-arm-smmu.c | 1671 ++ drivers/iommu/lib-arm-smmu.h | 161 3 files changed, 1833 insertions(+) create mode 100644 drivers/iommu/lib-arm-smmu.c create mode 100644 drivers/iommu/lib-arm-smmu.h diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index a158a68..ea87cae 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -14,6 +14,7 @@ obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o obj-$(CONFIG_ARM_SMMU) += arm-smmu.o +obj-$(CONFIG_ARM_SMMU) += lib-arm-smmu.o obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o obj-$(CONFIG_DMAR_TABLE) += dmar.o obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o diff --git a/drivers/iommu/lib-arm-smmu.c b/drivers/iommu/lib-arm-smmu.c new file mode 100644 index 000..6aba5db --- /dev/null +++ b/drivers/iommu/lib-arm-smmu.c @@ -0,0 +1,1671 @@ +/* + * Copyright (c) 2018, NVIDIA Corporation + * Author: Krishna Reddy + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * Library for ARM architected v1 and v2 SMMU implementations. + * This library is created by reusing the code from arm-smmu.c which is + * authored by Will Deacon. 
+ */ + +#define pr_fmt(fmt) "lib-arm-smmu: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "io-pgtable.h" +#include "arm-smmu-regs.h" +#include "lib-arm-smmu.h" + +#define ARM_MMU500_ACTLR_CPRE (1 << 1) + +#define ARM_MMU500_ACR_CACHE_LOCK (1 << 26) +#define ARM_MMU500_ACR_S2CRB_TLBEN (1 << 10) +#define ARM_MMU500_ACR_SMTNMB_TLBEN(1 << 8) + +#define TLB_LOOP_TIMEOUT 100 /* 1s! */ +#define TLB_SPIN_COUNT 10 + +/* SMMU global address space */ +#define ARM_SMMU_GR0(smmu) ((smmu)->base) +#define ARM_SMMU_GR1(smmu) ((smmu)->base + (1 << (smmu)->pgshift)) + +/* + * SMMU global address space with conditional offset to access secure + * aliases of non-secure registers (e.g. nsCR0: 0x400, nsGFSR: 0x448, + * nsGFSYNR0: 0x450) + */ +#define ARM_SMMU_GR0_NS(smmu) \ + ((smmu)->base + \ + ((smmu->options & ARM_SMMU_OPT_SECURE_CFG_ACCESS) \ + ? 0x400 : 0)) + +/* + * Some 64-bit registers only make sense to write atomically, but in such + * cases all the data relevant to AArch32 formats lies within the lower word, + * therefore this actually makes more sense than it might first appear. + */ +#ifdef CONFIG_64BIT +#define smmu_write_atomic_lq writeq_relaxed +#else +#define smmu_write_atomic_lq writel_relaxed +#endif + +/* Translation context bank */ +#define ARM_SMMU_CB(smmu, n) ((smmu)->cb_base + ((n) << (smmu)->pgshift)) + +#define MSI_IOVA_BASE 0x800 +#define MSI_IOVA_LENGTH0x10 + +#define s2cr_init_val (struct arm_smmu_s2cr){ \ + .type = smmu->disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS, \ +} + +struct arm_smmu_master_cfg { + struct arm_smmu_device *smmu; + s16 smendx[]; +}; +#define INVALID_SMENDX -1 +#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv) +#define fwspec_smmu(fw) (__fwspec_cfg(fw)->smmu) +#define fwspec_smendx(fw, i) \ + (i >= fw->num_ids ? 
INVALID_SMENDX : __fwspec_cfg(fw)->smendx[i]) +#define for_each_cfg_sme(fw, i, idx) \ + for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i) + +enum arm_smmu_context_fmt { + ARM_SMMU_CTX_FMT_NONE, + ARM_SMMU_CTX_FMT_AARCH64, + ARM_SMMU_CTX_FMT_AARCH32_L, + ARM_SMMU_CTX_FMT_AARCH32_S, +}; + +struct arm_smmu_cfg { + u8 cbndx; + u8 irptndx; + union { + u16 asid; + u16 vmid; + }; + u32 cbar; + enum arm_smmu_context_fmt fmt; +}; +#define INVALID_IRPTNDX0xff + +enum arm_smmu_domain
[PATCH v3 3/6] iommu/arm-smmu: update arm-smmu.c to use ARM SMMU library
Update ARM SMMU driver to use the ARM SMMU library routines. Signed-off-by: Krishna Reddy --- drivers/iommu/arm-smmu.c | 1709 +- 1 file changed, 11 insertions(+), 1698 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 5a28ae8..453159b 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -1,5 +1,5 @@ /* - * IOMMU API for ARM architected SMMU implementations. + * ARM SMMU driver for ARM architected V1 and V2 SMMU implementations. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -39,7 +39,6 @@ #include #include #include -#include #include #include #include @@ -56,49 +55,7 @@ #include "io-pgtable.h" #include "arm-smmu-regs.h" - -#define ARM_MMU500_ACTLR_CPRE (1 << 1) - -#define ARM_MMU500_ACR_CACHE_LOCK (1 << 26) -#define ARM_MMU500_ACR_S2CRB_TLBEN (1 << 10) -#define ARM_MMU500_ACR_SMTNMB_TLBEN(1 << 8) - -#define TLB_LOOP_TIMEOUT 100 /* 1s! */ -#define TLB_SPIN_COUNT 10 - -/* Maximum number of context banks per SMMU */ -#define ARM_SMMU_MAX_CBS 128 - -/* SMMU global address space */ -#define ARM_SMMU_GR0(smmu) ((smmu)->base) -#define ARM_SMMU_GR1(smmu) ((smmu)->base + (1 << (smmu)->pgshift)) - -/* - * SMMU global address space with conditional offset to access secure - * aliases of non-secure registers (e.g. nsCR0: 0x400, nsGFSR: 0x448, - * nsGFSYNR0: 0x450) - */ -#define ARM_SMMU_GR0_NS(smmu) \ - ((smmu)->base + \ - ((smmu->options & ARM_SMMU_OPT_SECURE_CFG_ACCESS) \ - ? 0x400 : 0)) - -/* - * Some 64-bit registers only make sense to write atomically, but in such - * cases all the data relevant to AArch32 formats lies within the lower word, - * therefore this actually makes more sense than it might first appear. 
- */ -#ifdef CONFIG_64BIT -#define smmu_write_atomic_lq writeq_relaxed -#else -#define smmu_write_atomic_lq writel_relaxed -#endif - -/* Translation context bank */ -#define ARM_SMMU_CB(smmu, n) ((smmu)->cb_base + ((n) << (smmu)->pgshift)) - -#define MSI_IOVA_BASE 0x800 -#define MSI_IOVA_LENGTH0x10 +#include "lib-arm-smmu.h" static int force_stage; module_param(force_stage, int, S_IRUGO); @@ -109,150 +66,6 @@ module_param(disable_bypass, bool, S_IRUGO); MODULE_PARM_DESC(disable_bypass, "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); -enum arm_smmu_arch_version { - ARM_SMMU_V1, - ARM_SMMU_V1_64K, - ARM_SMMU_V2, -}; - -enum arm_smmu_implementation { - GENERIC_SMMU, - ARM_MMU500, - CAVIUM_SMMUV2, -}; - -struct arm_smmu_s2cr { - struct iommu_group *group; - int count; - enum arm_smmu_s2cr_type type; - enum arm_smmu_s2cr_privcfg privcfg; - u8 cbndx; -}; - -#define s2cr_init_val (struct arm_smmu_s2cr){ \ - .type = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS,\ -} - -struct arm_smmu_smr { - u16 mask; - u16 id; - boolvalid; -}; - -struct arm_smmu_cb { - u64 ttbr[2]; - u32 tcr[2]; - u32 mair[2]; - struct arm_smmu_cfg *cfg; -}; - -struct arm_smmu_master_cfg { - struct arm_smmu_device *smmu; - s16 smendx[]; -}; -#define INVALID_SMENDX -1 -#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv) -#define fwspec_smmu(fw) (__fwspec_cfg(fw)->smmu) -#define fwspec_smendx(fw, i) \ - (i >= fw->num_ids ? 
INVALID_SMENDX : __fwspec_cfg(fw)->smendx[i]) -#define for_each_cfg_sme(fw, i, idx) \ - for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i) - -struct arm_smmu_device { - struct device *dev; - - void __iomem*base; - void __iomem*cb_base; - unsigned long pgshift; - -#define ARM_SMMU_FEAT_COHERENT_WALK(1 << 0) -#define ARM_SMMU_FEAT_STREAM_MATCH (1 << 1) -#define ARM_SMMU_FEAT_TRANS_S1 (1 << 2) -#define ARM_SMMU_FEAT_TRANS_S2 (1 << 3) -#define ARM_SMMU_FEAT_TRANS_NESTED (1 << 4) -#define ARM_SMMU_FEAT_TRANS_OPS(1 << 5) -#define ARM_SMMU_FEAT_VMID16 (1 << 6) -#define ARM_SMMU_FEAT_FMT_A
[PATCH v3 4/6] iommu/tegra194_smmu: Add Tegra194 SMMU driver
Tegra194 SMMU driver supports Dual ARM SMMU configuration necessary for Tegra194 SOC. The IOVA accesses from HW devices are interleaved across two ARM SMMU devices transparently. Signed-off-by: Krishna Reddy --- drivers/iommu/Kconfig | 11 ++ drivers/iommu/Makefile| 1 + drivers/iommu/tegra194-smmu.c | 394 ++ 3 files changed, 406 insertions(+) create mode 100644 drivers/iommu/tegra194-smmu.c diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index d9a2571..2dc08c7 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -357,6 +357,17 @@ config ARM_SMMU Say Y here if your SoC includes an IOMMU device implementing the ARM SMMU architecture. +config ARM_SMMU_TEGRA + bool "Dual ARM SMMU(MMU-500) support on Tegra" + depends on ARM64 && MMU && ARCH_TEGRA + select IOMMU_API + select IOMMU_IO_PGTABLE_LPAE + help + Support for implementation of Dual ARM SMMU Instances as single + SMMU Device on Tegra194. IOVA accesses from devices are interleaved + across these two ARM SMMU Instances. Number of ARM SMMU Instances used + in the SMMU device is not known to HW devices. + config ARM_SMMU_V3 bool "ARM Ltd. 
System MMU Version 3 (SMMUv3) Support" depends on ARM64 diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index ea87cae..f1587bb 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -15,6 +15,7 @@ obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o obj-$(CONFIG_ARM_SMMU) += arm-smmu.o obj-$(CONFIG_ARM_SMMU) += lib-arm-smmu.o +obj-$(CONFIG_ARM_SMMU_TEGRA) += tegra194-smmu.o obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o obj-$(CONFIG_DMAR_TABLE) += dmar.o obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o diff --git a/drivers/iommu/tegra194-smmu.c b/drivers/iommu/tegra194-smmu.c new file mode 100644 index 000..7b21670 --- /dev/null +++ b/drivers/iommu/tegra194-smmu.c @@ -0,0 +1,394 @@ +/* + * Copyright (c) 2018, NVIDIA Corporation + * Author: Krishna Reddy + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#define pr_fmt(fmt) "tegra194-smmu: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "io-pgtable.h" +#include "arm-smmu-regs.h" +#include "lib-arm-smmu.h" + +static int force_stage; +module_param(force_stage, int, S_IRUGO); +MODULE_PARM_DESC(force_stage, + "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). 
Note that selecting a specific stage will disable support for nested translation."); +static bool disable_bypass; +module_param(disable_bypass, bool, S_IRUGO); +MODULE_PARM_DESC(disable_bypass, + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); + +static struct platform_driver tegra194_smmu_driver; +static struct iommu_ops tegra194_smmu_ops; + +static struct iommu_domain *tegra194_smmu_domain_alloc(unsigned int type) +{ + return arm_smmu_domain_alloc_common(type, true); +} + +static int tegra194_smmu_attach_dev(struct iommu_domain *domain, + struct device *dev) +{ + return arm_smmu_attach_dev_common(domain, dev, &tegra194_smmu_ops); +} + +static int tegra194_smmu_match_node(struct device *dev, void *data) +{ + return dev->fwnode == data; +} + +static +struct arm_smmu_device *tegra194_smmu_get_by_fwnode(struct fwnode_handle *fwnode) +{ + struct device *dev = driver_find_device(&tegra194_smmu_driver.driver, + NULL, fwnode, tegra194_smmu_match_node); + put_device(dev); + return dev ? dev_get_drvdata(dev) : NULL; +} + +static int tegra194_smmu_add_device(struct device *dev) +{ + struct arm_smmu_device *smmu; + struct iommu_fwspec *fwspec = dev->iommu_fwspec; + + if (fwspec && fwspec->ops == &tegra194_smmu_ops) + smmu = tegra194_smmu_get_by_fwnode(fwspec->iommu_fwnode); + else + return -ENODEV; + + return arm_smmu_add_device_common(dev, smmu); +} + +static void tegra194_smmu_remov
[PATCH v3 0/5] Add Tegra194 Dual ARM SMMU driver
NVIDIA's Xavier (Tegra194) SOC has two ARM SMMU (MMU-500) instances, which are used as one SMMU device in HW. The IOVA accesses from HW devices are interleaved across these two SMMU instances and therefore need to be programmed identically. The existing ARM SMMU driver can't be used in its current form for programming the two SMMU instances identically. However, most of the code can be shared between the ARM SMMU driver and the Tegra194 SMMU driver. To allow sharing the code, a library was created based on the current ARM SMMU driver, and support was added for programming multiple ARM SMMU instances identically. The current ARM SMMU driver and the Tegra194 SMMU driver were updated to use the library functions. Please review the patches and provide feedback. Changes in v2: * Added CONFIG_ARM_SMMU_TEGRA to protect Tegra194 SMMU driver compilation * Enabled CONFIG_ARM_SMMU_TEGRA in defconfig * Added SMMU nodes in Tegra194 device tree Changes in v3: * Created library for ARM SMMU based on arm-smmu.c * Added support to program multiple ARM SMMU instances identically * Updated arm-smmu.c/tegra194-smmu.c to use ARM SMMU library functions Krishna Reddy (6): iommu/arm-smmu: create library for ARM SMMU iommu/arm-smmu: Add support to program multiple ARM SMMU's identically iommu/arm-smmu: update arm-smmu.c to use ARM SMMU library iommu/tegra194_smmu: Add Tegra194 SMMU driver arm64: defconfig: Enable ARM_SMMU_TEGRA arm64: tegra: Add SMMU nodes to Tegra194 device tree arch/arm64/boot/dts/nvidia/tegra194.dtsi | 148 +++ arch/arm64/configs/defconfig |1 + drivers/iommu/Kconfig| 11 + drivers/iommu/Makefile |2 + drivers/iommu/arm-smmu.c | 1709 + drivers/iommu/lib-arm-smmu.c | 1768 ++ drivers/iommu/lib-arm-smmu.h | 161 +++ drivers/iommu/tegra194-smmu.c| 394 +++ 8 files changed, 2496 insertions(+), 1698 deletions(-) create mode 100644 drivers/iommu/lib-arm-smmu.c create mode 100644 drivers/iommu/lib-arm-smmu.h create mode 100644 drivers/iommu/tegra194-smmu.c -- 2.1.4 ___ iommu mailing list iommu@lists.linux-foundation.org 
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/4] PCI / ACPI: Identify untrusted PCI devices
On Thu, Nov 29, 2018 at 06:51:50PM +0300, Mika Westerberg wrote: > A malicious PCI device may use DMA to attack the system. An external > Thunderbolt port is a convenient point to attach such a device. The OS > may use IOMMU to defend against DMA attacks. > > Recent BIOSes with Thunderbolt ports mark these externally facing root > ports with this ACPI _DSD [1]: I'm not 100% comfortable with the "Recent BIOSes" wording because that suggests that we can rely on the fact that *all* BIOSes newer than some date X mark these ports. Since this _DSD usage is Microsoft-specific and not required by either PCIe or ACPI specs, we can't rely on it. A BIOS that doesn't implement it may not be Windows-certified, but it's perfectly spec-compliant otherwise and we have to keep in mind the possibility that ports without this _DSD may still be externally visible and may still be attack vectors. > Name (_DSD, Package () { > ToUUID ("efcc06cc-73ac-4bc3-bff0-76143807c389"), > Package () { > Package () {"ExternalFacingPort", 1}, > Package () {"UID", 0 } > } > }) > > If we find such a root port, mark it and all its children as untrusted. > The rest of the OS may use this information to enable DMA protection > against malicious devices. For instance the device may be put behind an > IOMMU to keep it from accessing memory outside of what the driver has > allocated for it. > > While at it, add a comment on top of prp_guids array explaining the > possible caveat resulting when these GUIDs are treated equivalent. 
> > [1] > https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports > > Signed-off-by: Mika Westerberg Acked-by: Bjorn Helgaas > --- > drivers/acpi/property.c | 11 +++ > drivers/pci/pci-acpi.c | 19 +++ > drivers/pci/probe.c | 15 +++ > include/linux/pci.h | 8 > 4 files changed, 53 insertions(+) > > diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c > index 8c7c4583b52d..77abe0ec4043 100644 > --- a/drivers/acpi/property.c > +++ b/drivers/acpi/property.c > @@ -24,6 +24,14 @@ static int acpi_data_get_property_array(const struct > acpi_device_data *data, > acpi_object_type type, > const union acpi_object **obj); > > +/* > + * The GUIDs here are made equivalent to each other in order to avoid extra > + * complexity in the properties handling code, with the caveat that the > + * kernel will accept certain combinations of GUID and properties that are > + * not defined without a warning. For instance if any of the properties > + * from different GUID appear in a property list of another, it will be > + * accepted by the kernel. Firmware validation tools should catch these. 
> + */ > static const guid_t prp_guids[] = { > /* ACPI _DSD device properties GUID: > daffd814-6eba-4d8c-8a91-bc9bbf4aa301 */ > GUID_INIT(0xdaffd814, 0x6eba, 0x4d8c, > @@ -31,6 +39,9 @@ static const guid_t prp_guids[] = { > /* Hotplug in D3 GUID: 6211e2c0-58a3-4af3-90e1-927a4e0c55a4 */ > GUID_INIT(0x6211e2c0, 0x58a3, 0x4af3, > 0x90, 0xe1, 0x92, 0x7a, 0x4e, 0x0c, 0x55, 0xa4), > + /* External facing port GUID: efcc06cc-73ac-4bc3-bff0-76143807c389 */ > + GUID_INIT(0xefcc06cc, 0x73ac, 0x4bc3, > + 0xbf, 0xf0, 0x76, 0x14, 0x38, 0x07, 0xc3, 0x89), > }; > > static const guid_t ads_guid = > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c > index 921db6f80340..e1949f7efd9c 100644 > --- a/drivers/pci/pci-acpi.c > +++ b/drivers/pci/pci-acpi.c > @@ -789,6 +789,24 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev, > ACPI_FREE(obj); > } > > +static void pci_acpi_set_untrusted(struct pci_dev *dev) > +{ > + u8 val; > + > + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT) > + return; > + if (device_property_read_u8(&dev->dev, "ExternalFacingPort", &val)) > + return; > + > + /* > + * These root ports expose PCIe (including DMA) outside of the > + * system so make sure we treat them and everything behind as > + * untrusted. > + */ > + if (val) > + dev->untrusted = 1; > +} > + > static void pci_acpi_setup(struct device *dev) > { > struct pci_dev *pci_dev = to_pci_dev(dev); > @@ -798,6 +816,7 @@ static void pci_acpi_setup(struct device *dev) > return; > > pci_acpi_optimize_delay(pci_dev, adev->handle); > + pci_acpi_set_untrusted(pci_dev); > > pci_acpi_add_pm_notifier(adev, pci_dev); > if (!adev->wakeup.flags.valid) > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index b1c05b5054a0..257b9f6f2ebb 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -1378,6 +1378,19 @@ static void set_pcie_thunderbolt(struct pci_dev *dev) > } > } > > +static void set_pcie_untr
Re: [PATCH v18 1/5] iommu/arm-smmu: Add pm_runtime/sleep ops
Quoting Vivek Gautam (2018-12-02 22:43:38) > On Fri, Nov 30, 2018 at 11:45 PM Will Deacon wrote: > > > > On Thu, Nov 29, 2018 at 08:25:20PM +0530, Vivek Gautam wrote: > > > clk_bulk_get_all() seems like going only the OF way. > > > Is there another way here to have something common between ACPI > > > and OF, and then do the clk_bulk_get? > > > > I'd say just go with clk_bulk_get_all() and if somebody really wants to > > mess with the SMMU clocks on a system booted via ACPI, then it's their > > problem to solve. My understanding is that the design of IORT makes this > > next to impossible to solve anyway, because a static table is used and > > therefore we're unable to run whatever ASL methods need to be invoked to > > mess with the clocks. > > Sure then. I will respin this patch-series. > Right. The idea is to add non-OF support to clk_bulk_get_all() if/when we get the requirement. Sounds like we can keep waiting a little longer for that to happen. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH V2] mm: Replace all open encodings for NUMA_NO_NODE
On Mon, 2018-11-26 at 17:56 +0530, Anshuman Khandual wrote: > At present there are multiple places where an invalid node number is encoded > as -1. Even though implicitly understood, it is always better to have macros > for this. Replace these open encodings for an invalid node number with the > global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like > 'invalid node' from various places, redirecting them to a common definition. > > Signed-off-by: Anshuman Khandual > --- > Changes in V2: > > - Added inclusion of 'numa.h' header at various places per Andrew > - Updated 'dev_to_node' to use NUMA_NO_NODE instead per Vinod > > Changes in V1: (https://lkml.org/lkml/2018/11/23/485) > > - Dropped OCFS2 changes per Joseph > - Dropped media/video drivers changes per Hans > > RFC - https://patchwork.kernel.org/patch/10678035/ > > Build tested this with multiple cross compiler options like alpha, sparc, > arm64, x86, powerpc, powerpc64le etc with their default config, which might > not have compile-tested all driver-related changes. I will appreciate > folks giving this a test in their respective build environment. > > All these places for replacement were found by running the following grep > patterns on the entire kernel code. Please let me know if this might have > missed some instances. This might also have replaced some false positives. > I will appreciate suggestions, inputs and review. > > 1. git grep "nid == -1" > 2. git grep "node == -1" > 3. git grep "nid = -1" > 4. git grep "node = -1" > > drivers/infiniband/hw/hfi1/affinity.c | 3 ++- > drivers/infiniband/hw/hfi1/init.c | 3 ++- For the drivers/infiniband changes, Acked-by: Doug Ledford -- Doug Ledford GPG KeyID: B826A3330E572FDD Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 3/4] dma-debug: Dynamically expand the dma_debug_entry pool
On 03/12/2018 17:28, Robin Murphy wrote: Certain drivers such as large multi-queue network adapters can use pools of mapped DMA buffers larger than the default dma_debug_entry pool of 65536 entries, with the result that merely probing such a device can cause DMA debug to disable itself during boot unless explicitly given an appropriate "dma_debug_entries=..." option. Developers trying to debug some other driver on such a system may not be immediately aware of this, and at worst it can hide bugs if they fail to realise that dma-debug has already disabled itself unexpectedly by the time the code of interest gets to run. Even once they do realise, it can be a bit of a pain to empirically determine a suitable number of preallocated entries to configure without massively over-allocating. There's really no need for such a static limit, though, since we can quite easily expand the pool at runtime in those rare cases that the preallocated entries are insufficient, which is arguably the least surprising and most useful behaviour. Hi Robin, Do you have an idea on shrinking the pool again when the culprit driver is removed, i.e. when we have so many unused debug entries available? 
Thanks, John Signed-off-by: Robin Murphy --- kernel/dma/debug.c | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index de5db800dbfc..46cc075aec99 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -47,6 +47,9 @@ #ifndef PREALLOC_DMA_DEBUG_ENTRIES #define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16) #endif +/* If the pool runs out, try this many times to allocate this many new entries */ +#define DMA_DEBUG_DYNAMIC_ENTRIES 256 +#define DMA_DEBUG_DYNAMIC_RETRIES 2 enum { dma_debug_single, @@ -702,12 +705,21 @@ static struct dma_debug_entry *dma_entry_alloc(void) { struct dma_debug_entry *entry; unsigned long flags; + int retry_count; - spin_lock_irqsave(&free_entries_lock, flags); + for (retry_count = 0; ; retry_count++) { + spin_lock_irqsave(&free_entries_lock, flags); + + if (num_free_entries > 0) + break; - if (list_empty(&free_entries)) { - global_disable = true; spin_unlock_irqrestore(&free_entries_lock, flags); + + if (retry_count < DMA_DEBUG_DYNAMIC_RETRIES && + !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES)) + continue; + + global_disable = true; pr_err("debugging out of memory - disabling\n"); return NULL; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] dma-debug: Kconfig for PREALLOC_DMA_DEBUG_ENTRIES
On Mon, Dec 03, 2018 at 11:56:11AM +, John Garry wrote: > On 01/12/2018 16:36, Christoph Hellwig wrote: >> On Fri, Nov 30, 2018 at 07:39:50PM +, Robin Murphy wrote: >>> I was assuming the point was to also add something like >>> >>> default 131072 if HNS_ENET >>> >>> so that DMA debug doesn't require too much thought from the user. If they >>> still have to notice the overflow message and empirically figure out a >>> value that does work, rebuilding the kernel each time is far less >>> convenient than simply adding "dma_debug_entries=..." to their kernel >>> command line and rebooting, which they can do today. If they do already >>> know up-front that the default will need overriding and what the >>> appropriate value is, then the command line still seems just as >>> convenient. >> >> I'm not so fond of random drivers changing the defaults. My idea >> was rather to have the config option so that the defconfig files for >> the Hisilicon SOCs with this hardware could select a larger number >> without making a total mess of the kernel configuration. >> >> If we really have to we could do different defaults, but I'd still >> much rather do this on an arch/platform basis than for specific drivers. > > As I understand, some drivers could even use much more than this (131072), > to such a point where I can't imagine that we would want to set an arch > default to support them. For this HNS_ENET case, it is arm64 specific so it > would be an arch defconfig. But I'm not sure we could always do the right thing for everyone. I think we might be better off trying to just dynamically allocate entries when we run out of them instead of coming up with a perfect number. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/4] dma-debug: implement dynamic entry allocation
On Mon, Dec 03, 2018 at 05:28:05PM +, Robin Murphy wrote: > The HNS_ENET discussion got me thinking, why not just make DMA debug > cleverer so that (in terms of basic functionality at least) we don't > need to worry about driver- or arch-specific configuration at all? > > Patches #2 and #3 are the real meat here - #1 is just some preparatory > cleanup motivated by moving printks around, and could be split out if > desired; I kept #4 separate as a possible nice-to-have depending on > what people think. Hah, I just thought of this and suggested it a few seconds ago. So obviously I think it is a good idea, but let me look at the details. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/4] dma-debug: Use pr_fmt()
Use pr_fmt() to generate the "DMA-API: " prefix consistently. This results in it being added to a couple of pr_*() messages which were missing it before, and for the err_printk() calls moves it to the actual start of the message instead of somewhere in the middle. Signed-off-by: Robin Murphy --- I chose not to refactor the existing split strings for minimal churn here. kernel/dma/debug.c | 74 -- 1 file changed, 38 insertions(+), 36 deletions(-) diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index 231ca4628062..91b84140e4a5 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -17,6 +17,8 @@ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */ +#define pr_fmt(fmt)"DMA-API: " fmt + #include #include #include @@ -234,7 +236,7 @@ static bool driver_filter(struct device *dev) error_count += 1; \ if (driver_filter(dev) && \ (show_all_errors || show_num_errors > 0)) { \ - WARN(1, "%s %s: " format, \ + WARN(1, pr_fmt("%s %s: ") format, \ dev ? dev_driver_string(dev) : "NULL", \ dev ? dev_name(dev) : "NULL", ## arg); \ dump_entry_trace(entry);\ @@ -519,7 +521,7 @@ static void active_cacheline_inc_overlap(phys_addr_t cln) * prematurely. 
*/ WARN_ONCE(overlap > ACTIVE_CACHELINE_MAX_OVERLAP, - "DMA-API: exceeded %d overlapping mappings of cacheline %pa\n", + pr_fmt("exceeded %d overlapping mappings of cacheline %pa\n"), ACTIVE_CACHELINE_MAX_OVERLAP, &cln); } @@ -614,7 +616,7 @@ void debug_dma_assert_idle(struct page *page) cln = to_cacheline_number(entry); err_printk(entry->dev, entry, - "DMA-API: cpu touching an active dma mapped cacheline [cln=%pa]\n", + "cpu touching an active dma mapped cacheline [cln=%pa]\n", &cln); } @@ -634,7 +636,7 @@ static void add_dma_entry(struct dma_debug_entry *entry) rc = active_cacheline_insert(entry); if (rc == -ENOMEM) { - pr_err("DMA-API: cacheline tracking ENOMEM, dma-debug disabled\n"); + pr_err("cacheline tracking ENOMEM, dma-debug disabled\n"); global_disable = true; } @@ -673,7 +675,7 @@ static struct dma_debug_entry *dma_entry_alloc(void) if (list_empty(&free_entries)) { global_disable = true; spin_unlock_irqrestore(&free_entries_lock, flags); - pr_err("DMA-API: debugging out of memory - disabling\n"); + pr_err("debugging out of memory - disabling\n"); return NULL; } @@ -777,7 +779,7 @@ static int prealloc_memory(u32 num_entries) num_free_entries = num_entries; min_free_entries = num_entries; - pr_info("DMA-API: preallocated %d debug entries\n", num_entries); + pr_info("preallocated %d debug entries\n", num_entries); return 0; @@ -850,7 +852,7 @@ static ssize_t filter_write(struct file *file, const char __user *userbuf, * switched off. 
*/ if (current_driver_name[0]) - pr_info("DMA-API: switching off dma-debug driver filter\n"); + pr_info("switching off dma-debug driver filter\n"); current_driver_name[0] = 0; current_driver = NULL; goto out_unlock; @@ -868,7 +870,7 @@ static ssize_t filter_write(struct file *file, const char __user *userbuf, current_driver_name[i] = 0; current_driver = NULL; - pr_info("DMA-API: enable driver filter for driver [%s]\n", + pr_info("enable driver filter for driver [%s]\n", current_driver_name); out_unlock: @@ -887,7 +889,7 @@ static int dma_debug_fs_init(void) { dma_debug_dent = debugfs_create_dir("dma-api", NULL); if (!dma_debug_dent) { - pr_err("DMA-API: can not create debugfs directory\n"); + pr_err("can not create debugfs directory\n"); return -ENOMEM; } @@ -973,7 +975,7 @@ static int dma_debug_device_change(struct notifier_block *nb, unsigned long acti count = device_dma_allocations(dev, &entry); if (count == 0) break; - err_printk(dev, entry, "DMA-API: device driver has pending " + err_printk(dev, entry, "device driver has pending " "DMA allocations while released from device " "[count=%d]\n" "One of leaked entries details: " @@ -1
[PATCH 0/4] dma-debug: implement dynamic entry allocation
The HNS_ENET discussion got me thinking, why not just make DMA debug cleverer so that (in terms of basic functionality at least) we don't need to worry about driver- or arch-specific configuration at all? Patches #2 and #3 are the real meat here - #1 is just some preparatory cleanup motivated by moving printks around, and could be split out if desired; I kept #4 separate as a possible nice-to-have depending on what people think. Robin. Robin Murphy (4): dma-debug: Use pr_fmt() dma-debug: Refactor dma_debug_entry allocation dma-debug: Dynamically expand the dma_debug_entry pool dma-debug: Make leak-like behaviour apparent kernel/dma/debug.c | 198 - 1 file changed, 105 insertions(+), 93 deletions(-) -- 2.19.1.dirty ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/4] dma-debug: Refactor dma_debug_entry allocation
Make prealloc_memory() a little more general and robust so that it serves for runtime reallocations too. The first thing we can do with that is clean up dma_debug_resize_entries() quite a bit. Signed-off-by: Robin Murphy --- kernel/dma/debug.c | 95 +++--- 1 file changed, 40 insertions(+), 55 deletions(-) diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index 91b84140e4a5..de5db800dbfc 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -645,6 +645,39 @@ static void add_dma_entry(struct dma_debug_entry *entry) */ } +static int prealloc_memory(u32 num_entries) +{ + struct dma_debug_entry *entry, *next_entry; + unsigned long flags; + LIST_HEAD(tmp); + int i; + + for (i = 0; i < num_entries; ++i) { + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) + goto out_err; + + list_add_tail(&entry->list, &tmp); + } + + spin_lock_irqsave(&free_entries_lock, flags); + list_splice(&tmp, &free_entries); + num_free_entries += num_entries; + nr_total_entries += num_entries; + spin_unlock_irqrestore(&free_entries_lock, flags); + + return 0; + +out_err: + + list_for_each_entry_safe(entry, next_entry, &tmp, list) { + list_del(&entry->list); + kfree(entry); + } + + return -ENOMEM; +} + static struct dma_debug_entry *__dma_entry_alloc(void) { struct dma_debug_entry *entry; @@ -714,44 +747,25 @@ int dma_debug_resize_entries(u32 num_entries) int i, delta, ret = 0; unsigned long flags; struct dma_debug_entry *entry; - LIST_HEAD(tmp); - - spin_lock_irqsave(&free_entries_lock, flags); if (nr_total_entries < num_entries) { delta = num_entries - nr_total_entries; - spin_unlock_irqrestore(&free_entries_lock, flags); - - for (i = 0; i < delta; i++) { - entry = kzalloc(sizeof(*entry), GFP_KERNEL); - if (!entry) - break; - - list_add_tail(&entry->list, &tmp); - } - - spin_lock_irqsave(&free_entries_lock, flags); - - list_splice(&tmp, &free_entries); - nr_total_entries += i; - num_free_entries += i; + ret = prealloc_memory(delta); } else { delta = nr_total_entries - num_entries; 
+ spin_lock_irqsave(&free_entries_lock, flags); for (i = 0; i < delta && !list_empty(&free_entries); i++) { entry = __dma_entry_alloc(); kfree(entry); } + spin_unlock_irqrestore(&free_entries_lock, flags); nr_total_entries -= i; + if (nr_total_entries != num_entries) + ret = -EBUSY; } - - if (nr_total_entries != num_entries) - ret = 1; - - spin_unlock_irqrestore(&free_entries_lock, flags); - return ret; } @@ -763,36 +777,6 @@ int dma_debug_resize_entries(u32 num_entries) * 2. Preallocate a given number of dma_debug_entry structs */ -static int prealloc_memory(u32 num_entries) -{ - struct dma_debug_entry *entry, *next_entry; - int i; - - for (i = 0; i < num_entries; ++i) { - entry = kzalloc(sizeof(*entry), GFP_KERNEL); - if (!entry) - goto out_err; - - list_add_tail(&entry->list, &free_entries); - } - - num_free_entries = num_entries; - min_free_entries = num_entries; - - pr_info("preallocated %d debug entries\n", num_entries); - - return 0; - -out_err: - - list_for_each_entry_safe(entry, next_entry, &free_entries, list) { - list_del(&entry->list); - kfree(entry); - } - - return -ENOMEM; -} - static ssize_t filter_read(struct file *file, char __user *user_buf, size_t count, loff_t *ppos) { @@ -1038,7 +1022,8 @@ static int dma_debug_init(void) return 0; } - nr_total_entries = num_free_entries; + min_free_entries = num_free_entries; + pr_info("preallocated %d debug entries\n", nr_total_entries); dma_debug_initialized = true; -- 2.19.1.dirty ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 3/4] dma-debug: Dynamically expand the dma_debug_entry pool
Certain drivers such as large multi-queue network adapters can use pools of mapped DMA buffers larger than the default dma_debug_entry pool of 65536 entries, with the result that merely probing such a device can cause DMA debug to disable itself during boot unless explicitly given an appropriate "dma_debug_entries=..." option. Developers trying to debug some other driver on such a system may not be immediately aware of this, and at worst it can hide bugs if they fail to realise that dma-debug has already disabled itself unexpectedly by the time the code of interest gets to run. Even once they do realise, it can be a bit of a pain to empirically determine a suitable number of preallocated entries to configure without massively over-allocating. There's really no need for such a static limit, though, since we can quite easily expand the pool at runtime in those rare cases that the preallocated entries are insufficient, which is arguably the least surprising and most useful behaviour. Signed-off-by: Robin Murphy --- kernel/dma/debug.c | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index de5db800dbfc..46cc075aec99 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -47,6 +47,9 @@ #ifndef PREALLOC_DMA_DEBUG_ENTRIES #define PREALLOC_DMA_DEBUG_ENTRIES (1 << 16) #endif +/* If the pool runs out, try this many times to allocate this many new entries */ +#define DMA_DEBUG_DYNAMIC_ENTRIES 256 +#define DMA_DEBUG_DYNAMIC_RETRIES 2 enum { dma_debug_single, @@ -702,12 +705,21 @@ static struct dma_debug_entry *dma_entry_alloc(void) { struct dma_debug_entry *entry; unsigned long flags; + int retry_count; - spin_lock_irqsave(&free_entries_lock, flags); + for (retry_count = 0; ; retry_count++) { + spin_lock_irqsave(&free_entries_lock, flags); + + if (num_free_entries > 0) + break; - if (list_empty(&free_entries)) { - global_disable = true; spin_unlock_irqrestore(&free_entries_lock, flags); + + if (retry_count < 
DMA_DEBUG_DYNAMIC_RETRIES && + !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES)) + continue; + + global_disable = true; pr_err("debugging out of memory - disabling\n"); return NULL; } -- 2.19.1.dirty ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC 4/4] dma-debug: Make leak-like behaviour apparent
Now that we can dynamically allocate DMA debug entries to cope with drivers maintaining excessively large numbers of live mappings, a driver which *does* actually have a bug leaking mappings (and is not unloaded) will no longer trigger the "DMA-API: debugging out of memory - disabling" message until it gets to actual kernel OOM conditions, which means it could go unnoticed for a while. To that end, let's inform the user each time the pool has grown to a multiple of its initial size, which should make it apparent that they either have a leak or might want to increase the preallocation size. Signed-off-by: Robin Murphy --- Tagging this one as RFC since people might think it's silly. kernel/dma/debug.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index 46cc075aec99..c4759dab0f8c 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -696,6 +696,17 @@ static struct dma_debug_entry *__dma_entry_alloc(void) return entry; } +void __dma_entry_alloc_check_leak(void) +{ + u32 tmp = nr_total_entries % nr_prealloc_entries; + + /* Shout each time we tick over some multiple of the initial pool */ + if (tmp < DMA_DEBUG_DYNAMIC_ENTRIES) { + pr_info("dma_debug_entry pool grown to %u00%% - possible mapping leak?\n", + (nr_total_entries / nr_prealloc_entries)); + } +} + /* struct dma_entry allocator * * The next two functions implement the allocator for @@ -716,8 +727,10 @@ static struct dma_debug_entry *dma_entry_alloc(void) spin_unlock_irqrestore(&free_entries_lock, flags); if (retry_count < DMA_DEBUG_DYNAMIC_RETRIES && - !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES)) + !prealloc_memory(DMA_DEBUG_DYNAMIC_ENTRIES)) { + __dma_entry_alloc_check_leak(); continue; + } global_disable = true; pr_err("debugging out of memory - disabling\n"); -- 2.19.1.dirty ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: remove the ->mapping_error method from dma_map_ops V3
Does anyone but Linus and Russell have comments on this series? I'd like to pull it in fairly quickly as I have a fair amount of work on top of it that I'd like to get into 4.21 as well. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v5 02/12] iommu/vt-d: Manage scalalble mode PASID tables
Hi Joerg, > From: Joerg Roedel [mailto:j...@8bytes.org] > Sent: Monday, December 3, 2018 5:44 AM > To: Lu Baolu > Subject: Re: [PATCH v5 02/12] iommu/vt-d: Manage scalalble mode PASID tables > > Hi Baolu, > > On Wed, Nov 28, 2018 at 11:54:39AM +0800, Lu Baolu wrote: > > @@ -2482,12 +2482,13 @@ static struct dmar_domain > *dmar_insert_one_dev_info(struct intel_iommu *iommu, > > if (dev) > > dev->archdata.iommu = info; > > > > - if (dev && dev_is_pci(dev) && info->pasid_supported) { > > + /* PASID table is mandatory for a PCI device in scalable mode. */ > > + if (dev && dev_is_pci(dev) && sm_supported(iommu)) { > > This will also allocate a PASID table if the device does not support > PASIDs, right? Will the table not be used in that case or will the > device just use the fallback PASID? Isn't it better in that case to have > no PASID table? We need to allocate the PASID table in scalable mode; the reason is as follows. In VT-d scalable mode, all address translation is done at PASID granularity. For requests-with-PASID, the address translation is subjected to the PASID entry specified by the PASID value in the DMA request. However, for requests-without-PASID, there is no PASID in the DMA request. To complete the translation logic, the VT-d 3.0 spec introduces the RID2PASID field in the scalable-mode context entry, so that such DMA requests are subjected to the PASID entry specified by the PASID value in the RID2PASID field of the scalable-mode context entry. So even for a device without PASID support, we need at least one PASID entry so that its DMA requests (without PASID) can be translated. Thus a PASID table is needed for such devices. 
> > > @@ -143,18 +143,20 @@ int intel_pasid_alloc_table(struct device *dev) > > return -ENOMEM; > > INIT_LIST_HEAD(&pasid_table->dev); > > > > - size = sizeof(struct pasid_entry); > > - count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id); > > - order = get_order(size * count); > > + if (info->pasid_supported) > > + max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)), > > + intel_pasid_max_id); > > + > > + size = max_pasid >> (PASID_PDE_SHIFT - 3); > > + order = size ? get_order(size) : 0; > > pages = alloc_pages_node(info->iommu->node, > > -GFP_ATOMIC | __GFP_ZERO, > > -order); > > +GFP_ATOMIC | __GFP_ZERO, order); > > This is a simple data structure allocation path, does it need > GFP_ATOMIC? will leave it to Baolu. > > Joerg Thanks, Yi Liu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support
Hi Joerg, > From: Joerg Roedel [mailto:j...@8bytes.org] > Sent: Monday, December 3, 2018 5:49 AM > To: Lu Baolu > Subject: Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor > support > > On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote: > > - > > - desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, > 0); > > + /* > > +* Need two pages to accommodate 256 descriptors of 256 bits each > > +* if the remapping hardware supports scalable mode translation. > > +*/ > > + desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, > > +!!ecap_smts(iommu->ecap)); > > > Same here, does the allocation really need GFP_ATOMIC? I'll still leave that to Baolu. > > > struct q_inval { > > raw_spinlock_t q_lock; > > - struct qi_desc *desc; /* invalidation queue */ > > + void*desc; /* invalidation queue */ > > int *desc_status; /* desc status */ > > int free_head; /* first free entry */ > > int free_tail; /* last free entry */ > > Why do you switch the pointer to void* ? In this patch, there is code like the snippet below, which calculates the destination address of a memcpy() from qi->desc. If it were still a struct qi_desc pointer, the pointer arithmetic would scale by the struct size and the calculated address would be wrong. + memcpy(desc, qi->desc + (wait_index << shift), + 1 << shift); The change in the calculation method is to support both 128-bit and 256-bit invalidation descriptors in this unified code logic. Also, the conversation between Baolu and me may help. https://lore.kernel.org/patchwork/patch/1006756/ > > Joerg Thanks, Yi Liu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 6/8] vfio/mdev: Add iommu place holders in mdev_device
On 11/21/2018 2:15 PM, Christoph Hellwig wrote: > On Wed, Nov 21, 2018 at 02:22:08AM +0530, Kirti Wankhede wrote: >> It is about how mdev framework can be used by existing drivers. These >> symbols don't use any other exported symbols. > > That is an unfortunate accident of history, but doesn't extend to new > ones. It also is another indicator those drivers probably are derived > works of the Linux kernel and might be in legal trouble one way or > another. > These symbols are just to associate iommu properties of a physical device with a mdev device, and don't include low-level information. Thanks, Kirti
Re: [PATCH] of/device: add blacklist for iommu dma_ops
On Mon, Dec 3, 2018 at 7:45 AM Robin Murphy wrote: > > Hi Rob, > > On 01/12/2018 16:53, Rob Clark wrote: > > This solves a problem we see with drm/msm, caused by getting > > iommu_dma_ops while we attach our own domain and manage it directly at > > the iommu API level: > > > >[0038] user address but active_mm is swapper > >Internal error: Oops: 9605 [#1] PREEMPT SMP > >Modules linked in: > >CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90 > >Hardware name: xxx (DT) > >Workqueue: events deferred_probe_work_func > >pstate: 80c9 (Nzcv daif +PAN +UAO) > >pc : iommu_dma_map_sg+0x7c/0x2c8 > >lr : iommu_dma_map_sg+0x40/0x2c8 > >sp : ff80095eb4f0 > >x29: ff80095eb4f0 x28: > >x27: ffc0f9431578 x26: > >x25: x24: 0003 > >x23: 0001 x22: ffc0fa9ac010 > >x21: x20: ffc0fab40980 > >x19: ffc0fab40980 x18: 0003 > >x17: 01c4 x16: 0007 > >x15: 000e x14: > >x13: x12: 0028 > >x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f > >x9 : x8 : ffc0fab409a0 > >x7 : x6 : 0002 > >x5 : 0001 x4 : > >x3 : 0001 x2 : 0002 > >x1 : ffc0f9431578 x0 : > >Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb) > >Call trace: > > iommu_dma_map_sg+0x7c/0x2c8 > > __iommu_map_sg_attrs+0x70/0x84 > > get_pages+0x170/0x1e8 > > msm_gem_get_iova+0x8c/0x128 > > _msm_gem_kernel_new+0x6c/0xc8 > > msm_gem_kernel_new+0x4c/0x58 > > dsi_tx_buf_alloc_6g+0x4c/0x8c > > msm_dsi_host_modeset_init+0xc8/0x108 > > msm_dsi_modeset_init+0x54/0x18c > > _dpu_kms_drm_obj_init+0x430/0x474 > > dpu_kms_hw_init+0x5f8/0x6b4 > > msm_drm_bind+0x360/0x6c8 > > try_to_bring_up_master.part.7+0x28/0x70 > > component_master_add_with_match+0xe8/0x124 > > msm_pdev_probe+0x294/0x2b4 > > platform_drv_probe+0x58/0xa4 > > really_probe+0x150/0x294 > > driver_probe_device+0xac/0xe8 > > __device_attach_driver+0xa4/0xb4 > > bus_for_each_drv+0x98/0xc8 > > __device_attach+0xac/0x12c > > device_initial_probe+0x24/0x30 > > bus_probe_device+0x38/0x98 > > deferred_probe_work_func+0x78/0xa4 > > process_one_work+0x24c/0x3dc > > worker_thread+0x280/0x360 > > 
kthread+0x134/0x13c > > ret_from_fork+0x10/0x18 > >Code: d284 91000725 6b17039f 5400048a (f9401f40) > >---[ end trace f22dda57f3648e2c ]--- > >Kernel panic - not syncing: Fatal exception > >SMP: stopping secondary CPUs > >Kernel Offset: disabled > >CPU features: 0x0,22802a18 > >Memory Limit: none > > > > The problem is that when drm/msm does it's own iommu_attach_device(), > > now the domain returned by iommu_get_domain_for_dev() is drm/msm's > > domain, and it doesn't have domain->iova_cookie. > > Does this crash still happen with 4.20-rc? Because as of 6af588fed391 it > really shouldn't. for this hw, I'm still on 4.19, although that does look like it would avoid the issue. > > We kind of avoided this problem prior to sdm845/dpu because the iommu > > was attached to the mdp node in dt, which is a child of the toplevel > > mdss node (which corresponds to the dev passed in dma_map_sg()). But > > with sdm845, now the iommu is attached at the mdss level so we hit the > > iommu_dma_ops in dma_map_sg(). > > > > But auto allocating/attaching a domain before the driver is probed was > > already a blocking problem for enabling per-context pagetables for the > > GPU. This problem is also now solved with this patch. > > s/solved/worked around/ > > If you want a guarantee of actually getting a specific hardware context > allocated for a given domain, there needs to be code in the IOMMU driver > to understand and honour that. Implicitly depending on whatever happens > to fall out of current driver behaviour on current systems is not a real > solution. ok, fair.. but I'll settle for "works" in the absence of better options.. At some level, it would be nice to be able to optionally specify a context-bank in the iommu bindings. But not sure how to make that fit w/ cb allocated per domain. And I assume I'm the only one who has this problem? 
> > Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in > > of_dma_configure > > That's rather misleading, since the crash described above depends on at > least two other major changes which came long after that commit. Fair, when I realized it was the difference in where the iommu attaches between dpu1 (sdm845) and everything coming before, I should have removed the tag. BR, -R > It's not that I don't understand exactly what you want here - just that > this commit message isn't a coherent justification for that ;)
Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support
On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote: > - > - desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, 0); > + /* > + * Need two pages to accommodate 256 descriptors of 256 bits each > + * if the remapping hardware supports scalable mode translation. > + */ > + desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, > + !!ecap_smts(iommu->ecap)); Same here, does the allocation really need GFP_ATOMIC? > struct q_inval { > raw_spinlock_t q_lock; > - struct qi_desc *desc; /* invalidation queue */ > + void *desc; /* invalidation queue */ > int *desc_status; /* desc status */ > int free_head; /* first free entry */ > int free_tail; /* last free entry */ Why do you switch the pointer to void* ? Joerg
Re: [PATCH v5 02/12] iommu/vt-d: Manage scalable mode PASID tables
Hi Baolu, On Wed, Nov 28, 2018 at 11:54:39AM +0800, Lu Baolu wrote: > @@ -2482,12 +2482,13 @@ static struct dmar_domain > *dmar_insert_one_dev_info(struct intel_iommu *iommu, > if (dev) > dev->archdata.iommu = info; > > - if (dev && dev_is_pci(dev) && info->pasid_supported) { > + /* PASID table is mandatory for a PCI device in scalable mode. */ > + if (dev && dev_is_pci(dev) && sm_supported(iommu)) { This will also allocate a PASID table if the device does not support PASIDs, right? Will the table not be used in that case or will the device just use the fallback PASID? Isn't it better in that case to have no PASID table? > @@ -143,18 +143,20 @@ int intel_pasid_alloc_table(struct device *dev) > return -ENOMEM; > INIT_LIST_HEAD(&pasid_table->dev); > > - size = sizeof(struct pasid_entry); > - count = min_t(int, pci_max_pasids(to_pci_dev(dev)), intel_pasid_max_id); > - order = get_order(size * count); > + if (info->pasid_supported) > + max_pasid = min_t(int, pci_max_pasids(to_pci_dev(dev)), > + intel_pasid_max_id); > + > + size = max_pasid >> (PASID_PDE_SHIFT - 3); > + order = size ? get_order(size) : 0; > pages = alloc_pages_node(info->iommu->node, > - GFP_ATOMIC | __GFP_ZERO, > - order); > + GFP_ATOMIC | __GFP_ZERO, order); This is a simple data structure allocation path, does it need GFP_ATOMIC? Joerg
Re: [PATCH v2 0/9] iommu: clean up/remove modular stuff from non-modules.
On Sat, Dec 01, 2018 at 02:19:08PM -0500, Paul Gortmaker wrote: > Paul Gortmaker (9): > iommu: audit and remove any unnecessary uses of module.h > iommu/rockchip: Make it explicitly non-modular > iommu/msm: Make it explicitly non-modular > iommu/mediatek: Make it explicitly non-modular > iommu/ipmmu-vmsa: Make it explicitly non-modular > iommu/qcom: Make it explicitly non-modular > iommu/tegra: Make it explicitly non-modular > iommu/arm-smmu: Make arm-smmu explicitly non-modular > iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular Applied all, thanks Paul.
Re: [PATCH v5 0/7] Add virtio-iommu driver
Hi Michael, On 11/27/18 6:16 PM, Michael S. Tsirkin wrote: > On Tue, Nov 27, 2018 at 06:09:25PM +0100, Auger Eric wrote: >> Hi Michael, >> >> On 11/27/18 5:53 PM, Michael S. Tsirkin wrote: >>> On Thu, Nov 22, 2018 at 07:37:54PM +, Jean-Philippe Brucker wrote: Implement the virtio-iommu driver, following specification v0.9 [1]. Since v4 [2] I fixed the issues reported by Eric, and added Reviewed-by from Eric and Rob. Thanks! I changed the specification to fix one inconsistency discussed in v4. That the device fills the probe buffer with zeroes is now a "SHOULD" instead of a "MAY", since it's the only way for the driver to know if the device wrote the status. Existing devices already do this. In addition the device now needs to fill the three padding bytes at the tail with zeroes. You can find Linux driver and kvmtool device on branches virtio-iommu/v0.9 [3]. I also lightly tested with Eric's latest QEMU device [4]. >>> >>> I tried to get this to work on my x86 box but without >>> success. Any hints? Does this have to do with the IORT table? >>> I think we really should just reserve our own table ID >>> and avoid the pain of trying to add things to the IORT spec. >>> I'm reluctant to merge lots of code that I can't easily test. >>> Again, if we found a way to push more configuration into >>> virtio config space the problem space would be smaller. >> >> You can at least test it with QEMU ARM virt in TCG mode. > > It's slow enough that I generally just focus on KVM. > >> Then I have >> worked on the IORT integration in PC/Q35 but this is not yet functional. >> I need to debug what's wrong on guest ACPI probing. I plan to work on >> this this week. >> >> Thanks >> >> Eric > > Sounds good. Did you need to make changes to IORT? I don't remember. If > yes it would be very easy to just have a virtio specific ACPI table - > I am assuming ARM guys will be just as hostile to virt changes > to IORT as they were to minor changes to ARM vIOMMU. 
I eventually succeeded to prototype the IORT integration on x86. Please find below the branches to use for testing: - QEMU: https://github.com/eauger/qemu.git v3.1.0-rc3-virtio-iommu-v0.9-x86 On top of "[RFC v9 00/17] VIRTIO-IOMMU device", I added the IORT build for pc machine - LINUX GUEST: https://github.com/eauger/linux.git virtio-iommu-v0.9-iort-x86 Jean's virtio-iommu/devel branch + 2 hacks (to be discussed separately) Make sure the CONFIG_VIRTIO_IOMMU config is set Used qemu cmd line featuring -device virtio-iommu-pci,addr=0xa: ./x86_64-softmmu/qemu-system-x86_64 -M q35,accel=kvm,usb=off,sata=off,dump-guest-core=off -cpu Haswell,-hle,-rtm -smp 4,sockets=4,cores=1,threads=1 -m 4G -display none --enable-kvm -serial tcp:localhost:,server -device virtio-iommu-pci,addr=0xa -trace events=/home/augere/UPSTREAM/qemu/hw-virtio-iommu -qmp unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -realtime mlock=off -no-hpet -no-shutdown -device virtio-blk-pci,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,iommu_platform,disable-modern=off,disable-legacy=on -drive file=/home/augere/VM/IMAGES/vm0.qcow2,format=qcow2,if=none,id=drv0 -device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on -netdev tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown -kernel /home/augere/VM/BOOT/vmlinuz-4.20.0-rc3-guest-defconfig-virtio-iommu-0.9-x86+ -initrd /home/augere/VM/BOOT/initrd.img-4.20.0-rc3-guest-defconfig-virtio-iommu-0.9-x86+ -append 'root=/dev/mapper/rhel_dhcp156--238-root ro crashkernel=auto rd.lvm.lv=rhel_dhcp156-238/root rd.lvm.lv=rhel_dhcp156-238/swap rw ignore_loglevel console=tty0 console=ttyS0,115200n8 serial acpi=force' -net none -d guest_errors I put sata=off otherwise I have early untranslated transactions that prevent the guest from booting. 
Please let me know if you encounter any issue reproducing the environment on your side. Thanks Eric > > >>> [1] Virtio-iommu specification v0.9, sources, pdf and diff from v0.8 git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9 http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.8-v0.9.pdf [2] [PATCH v4 0/7] Add virtio-iommu driver https://lists.linuxfoundation.org/pipermail/iommu/2018-November/031074.html [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9 git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9 [4] [RFC v9 00/17] VIRTIO-IOMMU device https://www.mail-archive.com/qemu-devel@nongnu.org/msg575578.html Jean-Philippe Brucker (7): dt-bindings: virtio-mmio: Add IOMMU description dt-bindings: virtio: Add virtio-pci-iommu node of: Allow the iom
Re: [PATCH] irq_remapping: Remove unused header files
On Fri, Nov 30, 2018 at 09:35:21AM -0500, Yangtao Li wrote: > seq_file.h does not need to be included, so remove it. > > Signed-off-by: Yangtao Li > --- > drivers/iommu/irq_remapping.c | 1 - > 1 file changed, 1 deletion(-) Applied, thanks.
Re: [PATCH 0/2] iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist()
On Wed, Nov 28, 2018 at 09:23:35AM +0000, Yoshihiro Shimoda wrote: > Yoshihiro Shimoda (2): > iommu/ipmmu-vmsa: Modify ipmmu_slave_whitelist() to check SoC > revisions > iommu/ipmmu-vmsa: add an array of slave devices whitelist Applied, thanks.
Re: [PATCH] of/device: add blacklist for iommu dma_ops
Hi Rob, On 01/12/2018 16:53, Rob Clark wrote: This solves a problem we see with drm/msm, caused by getting iommu_dma_ops while we attach our own domain and manage it directly at the iommu API level: [0038] user address but active_mm is swapper Internal error: Oops: 9605 [#1] PREEMPT SMP Modules linked in: CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90 Hardware name: xxx (DT) Workqueue: events deferred_probe_work_func pstate: 80c9 (Nzcv daif +PAN +UAO) pc : iommu_dma_map_sg+0x7c/0x2c8 lr : iommu_dma_map_sg+0x40/0x2c8 sp : ff80095eb4f0 x29: ff80095eb4f0 x28: x27: ffc0f9431578 x26: x25: x24: 0003 x23: 0001 x22: ffc0fa9ac010 x21: x20: ffc0fab40980 x19: ffc0fab40980 x18: 0003 x17: 01c4 x16: 0007 x15: 000e x14: x13: x12: 0028 x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f x9 : x8 : ffc0fab409a0 x7 : x6 : 0002 x5 : 0001 x4 : x3 : 0001 x2 : 0002 x1 : ffc0f9431578 x0 : Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb) Call trace: iommu_dma_map_sg+0x7c/0x2c8 __iommu_map_sg_attrs+0x70/0x84 get_pages+0x170/0x1e8 msm_gem_get_iova+0x8c/0x128 _msm_gem_kernel_new+0x6c/0xc8 msm_gem_kernel_new+0x4c/0x58 dsi_tx_buf_alloc_6g+0x4c/0x8c msm_dsi_host_modeset_init+0xc8/0x108 msm_dsi_modeset_init+0x54/0x18c _dpu_kms_drm_obj_init+0x430/0x474 dpu_kms_hw_init+0x5f8/0x6b4 msm_drm_bind+0x360/0x6c8 try_to_bring_up_master.part.7+0x28/0x70 component_master_add_with_match+0xe8/0x124 msm_pdev_probe+0x294/0x2b4 platform_drv_probe+0x58/0xa4 really_probe+0x150/0x294 driver_probe_device+0xac/0xe8 __device_attach_driver+0xa4/0xb4 bus_for_each_drv+0x98/0xc8 __device_attach+0xac/0x12c device_initial_probe+0x24/0x30 bus_probe_device+0x38/0x98 deferred_probe_work_func+0x78/0xa4 process_one_work+0x24c/0x3dc worker_thread+0x280/0x360 kthread+0x134/0x13c ret_from_fork+0x10/0x18 Code: d284 91000725 6b17039f 5400048a (f9401f40) ---[ end trace f22dda57f3648e2c ]--- Kernel panic - not syncing: Fatal exception SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x0,22802a18 Memory Limit: 
none The problem is that when drm/msm does it's own iommu_attach_device(), now the domain returned by iommu_get_domain_for_dev() is drm/msm's domain, and it doesn't have domain->iova_cookie. Does this crash still happen with 4.20-rc? Because as of 6af588fed391 it really shouldn't. We kind of avoided this problem prior to sdm845/dpu because the iommu was attached to the mdp node in dt, which is a child of the toplevel mdss node (which corresponds to the dev passed in dma_map_sg()). But with sdm845, now the iommu is attached at the mdss level so we hit the iommu_dma_ops in dma_map_sg(). But auto allocating/attaching a domain before the driver is probed was already a blocking problem for enabling per-context pagetables for the GPU. This problem is also now solved with this patch. s/solved/worked around/ If you want a guarantee of actually getting a specific hardware context allocated for a given domain, there needs to be code in the IOMMU driver to understand and honour that. Implicitly depending on whatever happens to fall out of current driver behaviour on current systems is not a real solution. Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in of_dma_configure That's rather misleading, since the crash described above depends on at least two other major changes which came long after that commit. It's not that I don't understand exactly what you want here - just that this commit message isn't a coherent justification for that ;) Tested-by: Douglas Anderson Signed-off-by: Rob Clark --- This is an alternative/replacement for [1]. 
What it lacks in elegance it makes up for in practicality ;-) [1] https://patchwork.freedesktop.org/patch/264930/ drivers/of/device.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/drivers/of/device.c b/drivers/of/device.c index 5957cd4fa262..15ffee00fb22 100644 --- a/drivers/of/device.c +++ b/drivers/of/device.c @@ -72,6 +72,14 @@ int of_device_add(struct platform_device *ofdev) return device_add(&ofdev->dev); } +static const struct of_device_id iommu_blacklist[] = { + { .compatible = "qcom,mdp4" }, + { .compatible = "qcom,mdss" }, + { .compatible = "qcom,sdm845-mdss" }, + { .compatible = "qcom,adreno" }, + {} +}; + /** * of_dma_configure - Setup DMA configuration * @dev: Device to apply DMA configurati
Re: [PATCH] dma-debug: Kconfig for PREALLOC_DMA_DEBUG_ENTRIES
On 01/12/2018 16:36, Christoph Hellwig wrote: On Fri, Nov 30, 2018 at 07:39:50PM +0000, Robin Murphy wrote: I was assuming the point was to also add something like default 131072 if HNS_ENET so that DMA debug doesn't require too much thought from the user. If they still have to notice the overflow message and empirically figure out a value that does work, rebuilding the kernel each time is far less convenient than simply adding "dma_debug_entries=..." to their kernel command line and rebooting, which they can do today. If they do already know up-front that the default will need overriding and what the appropriate value is, then the command line still seems just as convenient. I'm not so fond of random drivers changing the defaults. My idea was rather to have the config option so that the defconfig files for the Hisilicon SOCs with this hardware could select a larger number without making a total mess of the kernel configuration. If we really have to we could do different defaults, but I'd still much rather do this on an arch/platform basis than specific drivers. As I understand, some drivers could even use much more than this (131072), to such a point where I can't imagine that we would want to set an arch default to support them. For this HNS_ENET case, it is arm64 specific so it would be an arch defconfig. Thanks, John
Re: PCI passthrough with multiple devices in same iommu group
Thanks for the info. Yes, we need to do this as our architecture manages everything from one super-privileged guest. The rest of the guests don't have access to these devices. Let me check with the hardware vendor to see if we can segregate the SMBus device into a separate iommu group. -Gokul On Fri, Nov 30, 2018 at 9:35 PM Alex Williamson wrote: > On Fri, 30 Nov 2018 12:04:26 +0530 > gokul cg wrote: > > > Thanks for info > > > > See inline > > On Wed, Nov 28, 2018 at 8:56 PM Alex Williamson < > alex.william...@redhat.com> > > wrote: > > > > > On Wed, 28 Nov 2018 20:21:02 +0530 > > > gokul cg wrote: > > > > > > > Hi Folks, > > > > > > > > Please excuse me , I just writing to you as i could see you had made > > > > changes regarding iommu and I thought you could give help me here. > > > > > > > > We are testing virtualization with QEMU-2.7.0 + Linux-4.8+, we are > facing > > > > issue while configuring pass through PCI devices in QEMU to pass it > to > > > > guest OS. > > > > We are using following QEMU argument to configure PCI device as > > > passthrough > > > > device to guest, which was working with Linux 3.10 distro > (hypervisor: > > > QEMU > > > > 1.7.2, Linux: 3.10+). > > > > > > > > /usr/bin/qemu-system-x86_64 -name guest_1 -S -machine > > > > pc-i440fx-1.7,accel=kvm,usb=off -m 49152\ > > > > -device kvm-pci-assign,host=:00:1f.3 -device > > > > kvm-pci-assign,host=:09:0e.0 .. > > > > > > Legacy KVM device assignment (aka kvm-pci-assign) has been deprecated > > > for some time now, you'll find it doesn't even exist in newer kernels > > > and QEMU. > > > > > > Understood . > > > > > > Here is the error message that we will get when we try to configure > PCI > > > > devices as passthrough using kvm pci assignment method in Linux 4.8+ > > > > (hypervisor: QEMU 2.7.0, Linux: 4.8+). > > > > > > > > which is shown below. 
> > > > > > > > ---log--- > > > > > > > > From QEMU: > > > > qemu-system-x86_64: -device kvm-pci-assign,host=:00:1f.3: Failed > to > > > > assign device "(null)": Invalid argument > > > > > > > > From dmesg: > > > > pci-stub :00:1f.3: kvm assign device failed ret -22 > > > > > > > > ---end > > > > > > > > Info about devices (CPU: Intel(R) Xeon(R) CPU E5-2608L): > > > > > > > > root@shining-node:~# lspci -k -s 00:1f.3 > > > > 00:1f.3 SMBus: Intel Corporation C610/X99 series chipset SMBus > Controller > > > > (rev 05) > > > > Subsystem: Intel Corporation C610/X99 series chipset SMBus > > > > Controller > > > > Kernel driver in use: pci-stub > > > > Kernel modules: i2c_i801 > > > > > > Why are you trying to assign an SMBus controller to a VM? > > > > > > We want guest of to read out eprom and sensor devices and manage our > > chassis . > > Our i2c devices are connected to Intel SMbus controller > > It seems fundamentally unsafe to allow a user driver (where QEMU is > just a userspace driver exposing the device into a VM) to manage the > host chassis. Additionally since this set of multifunction devices do > not expose access control services, we must assume that untranslated > DMA between the function is possible and thus group them together. It > would be up to the hardware vendor whether the devices are in fact DMA > isolated to provide an ACS quirk to split this grouping. Otherwise > you'll need to provide that isolation as a downstream assumption. > > > > root@shining-node:~# > > > > root@shining-node:~# lspci -k -s 09:0e.0 > > > > 09:0e.0 Network controller: Broadcom Limited Device b680 (rev 12) > > > > root@shining-node:~# > > > > > > > > From the web search i could see that it is because there are multiple > > > > devices in same iommu_group that the passthrough device belongs to. > > > > When i check iommu_group , i have multiple devices in same group but > all > > > > those are intel devices under Wellsburg PCH. 
> > > > > > Nope, kvm-pci-assign doesn't make use of IOMMU groups, more likely just > > > some state of disrepair as it's deprecated and replaced by vfio-pci, > > > which does use iommu groups. So iommu groups will be a problem, but > > > not in the configuration you're attempting to use above. > > > > > > The error i had pasted above "< pci-stub :00:1f.3: kvm assign > device > > failed ret -22 >" is comming as iommu attach device > > fails because of following check . > > 1094 int iommu_attach_device(struct iommu_domain *domain, struct device > > *dev) > > 1095 { > > 1096 struct iommu_group *group; > > 1097 int ret; > > 1098 > > 1099 group = iommu_group_get(dev); > > 1100 /* FIXME: Remove this when groups a mandatory for iommu > > drivers */ > > 1101 if (group == NULL) > > 1102 return __iommu_attach_device(domain, dev); > > 1103 > > Ah yes, I forget about this since legacy KVM device assignment has been > deprecated for so long. > > > > > root@shining-node:~# ls > > > > /sys/bus/pci/devices/\:00\:1f.3/iommu_group/devices/ > > > > :00:1f.0 :00:1f
Re: [PATCH] of/device: add blacklist for iommu dma_ops
Hi Tomasz, On 2018-12-03 01:10, Tomasz Figa wrote: > On Sat, Dec 1, 2018 at 8:54 AM Rob Clark wrote: >> This solves a problem we see with drm/msm, caused by getting >> iommu_dma_ops while we attach our own domain and manage it directly at >> the iommu API level: >> >> [0038] user address but active_mm is swapper >> Internal error: Oops: 9605 [#1] PREEMPT SMP >> Modules linked in: >> CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 #90 >> Hardware name: xxx (DT) >> Workqueue: events deferred_probe_work_func >> pstate: 80c9 (Nzcv daif +PAN +UAO) >> pc : iommu_dma_map_sg+0x7c/0x2c8 >> lr : iommu_dma_map_sg+0x40/0x2c8 >> sp : ff80095eb4f0 >> x29: ff80095eb4f0 x28: >> x27: ffc0f9431578 x26: >> x25: x24: 0003 >> x23: 0001 x22: ffc0fa9ac010 >> x21: x20: ffc0fab40980 >> x19: ffc0fab40980 x18: 0003 >> x17: 01c4 x16: 0007 >> x15: 000e x14: >> x13: x12: 0028 >> x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f >> x9 : x8 : ffc0fab409a0 >> x7 : x6 : 0002 >> x5 : 0001 x4 : >> x3 : 0001 x2 : 0002 >> x1 : ffc0f9431578 x0 : >> Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb) >> Call trace: >>iommu_dma_map_sg+0x7c/0x2c8 >>__iommu_map_sg_attrs+0x70/0x84 >>get_pages+0x170/0x1e8 >>msm_gem_get_iova+0x8c/0x128 >>_msm_gem_kernel_new+0x6c/0xc8 >>msm_gem_kernel_new+0x4c/0x58 >>dsi_tx_buf_alloc_6g+0x4c/0x8c >>msm_dsi_host_modeset_init+0xc8/0x108 >>msm_dsi_modeset_init+0x54/0x18c >>_dpu_kms_drm_obj_init+0x430/0x474 >>dpu_kms_hw_init+0x5f8/0x6b4 >>msm_drm_bind+0x360/0x6c8 >>try_to_bring_up_master.part.7+0x28/0x70 >>component_master_add_with_match+0xe8/0x124 >>msm_pdev_probe+0x294/0x2b4 >>platform_drv_probe+0x58/0xa4 >>really_probe+0x150/0x294 >>driver_probe_device+0xac/0xe8 >>__device_attach_driver+0xa4/0xb4 >>bus_for_each_drv+0x98/0xc8 >>__device_attach+0xac/0x12c >>device_initial_probe+0x24/0x30 >>bus_probe_device+0x38/0x98 >>deferred_probe_work_func+0x78/0xa4 >>process_one_work+0x24c/0x3dc >>worker_thread+0x280/0x360 >>kthread+0x134/0x13c >>ret_from_fork+0x10/0x18 >> Code: d284 
91000725 6b17039f 5400048a (f9401f40) >> ---[ end trace f22dda57f3648e2c ]--- >> Kernel panic - not syncing: Fatal exception >> SMP: stopping secondary CPUs >> Kernel Offset: disabled >> CPU features: 0x0,22802a18 >> Memory Limit: none >> >> The problem is that when drm/msm does it's own iommu_attach_device(), >> now the domain returned by iommu_get_domain_for_dev() is drm/msm's >> domain, and it doesn't have domain->iova_cookie. >> >> We kind of avoided this problem prior to sdm845/dpu because the iommu >> was attached to the mdp node in dt, which is a child of the toplevel >> mdss node (which corresponds to the dev passed in dma_map_sg()). But >> with sdm845, now the iommu is attached at the mdss level so we hit the >> iommu_dma_ops in dma_map_sg(). >> >> But auto allocating/attaching a domain before the driver is probed was >> already a blocking problem for enabling per-context pagetables for the >> GPU. This problem is also now solved with this patch. >> >> Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in >> of_dma_configure >> Tested-by: Douglas Anderson >> Signed-off-by: Rob Clark >> --- >> This is an alternative/replacement for [1]. 
>> What it lacks in elegance it makes up for in practicality ;-)
>>
>> [1] https://patchwork.freedesktop.org/patch/264930/
>>
>>  drivers/of/device.c | 22 ++++++++++++++++++++++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/drivers/of/device.c b/drivers/of/device.c
>> index 5957cd4fa262..15ffee00fb22 100644
>> --- a/drivers/of/device.c
>> +++ b/drivers/of/device.c
>> @@ -72,6 +72,14 @@ int of_device_add(struct platform_device *ofdev)
>>  	return device_add(&ofdev->dev);
>>  }
>>
>> +static const struct of_device_id iommu_blacklist[] = {
>> +	{ .compatible = "qcom,mdp4" },
>> +	{ .compatible = "qcom,mdss" },
>> +	{ .compatible = "qcom,sdm845-mdss" },
>> +	{ .compatible = "qcom,adreno" },
>> +	{}
>> +};
>> +
>>  /**
>>   * of_dma_configure - Setup DMA configuration
>>   * @dev: Device to apply DMA configuration
>> @@ -164,6 +172,20 @@ int of_dma_configure(struct device *dev, struct device_node *np, bool force_dma)
>>  	dev_dbg(dev, "device is%sbehind an iommu\n",
>>  		iommu ? " " : " not ");
>>
>> +	/*
>> +	 * There is at least one case where the driver wants to directly
>> +	 * manage the IOMMU, but if we end up with iommu dma_ops, that
>> +	 * int
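The core of the proposed workaround is just a compatible-string lookup: if the device node matches the blacklist, of_dma_configure() leaves the default dma_ops in place instead of installing iommu_dma_ops. Outside the kernel, that matching logic can be sketched in plain C (the table contents come from the patch above; the function name is a hypothetical stand-in for the kernel's of_match_node()):

```c
#include <string.h>

/* Minimal stand-in for the kernel's struct of_device_id. */
struct of_id {
	const char *compatible;
};

/* The compatible strings from the patch's iommu_blacklist[] table. */
static const struct of_id iommu_blacklist[] = {
	{ "qcom,mdp4" },
	{ "qcom,mdss" },
	{ "qcom,sdm845-mdss" },
	{ "qcom,adreno" },
	{ NULL }
};

/* Returns 1 if a node with this compatible string should skip iommu dma_ops. */
int iommu_blacklisted(const char *compatible)
{
	const struct of_id *id;

	for (id = iommu_blacklist; id->compatible; id++)
		if (strcmp(id->compatible, compatible) == 0)
			return 1;
	return 0;
}
```

In the real patch the lookup is done with of_match_node() against the device_node, which also handles nodes carrying multiple compatible strings.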
[PATCH v2] kernel/dma: Fix panic caused by passing swiotlb to command line
From: He Zhe

setup_io_tlb_npages does not check its input argument before passing it
to isdigit. The argument will be a NULL pointer if "swiotlb", without a
value, is set on the command line, which causes the following panic.

PANIC: early exception 0xe3 IP 10:bb9b8e9f error 0 cr2 0x0
[0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-yocto-standard+ #9
[0.00] RIP: 0010:setup_io_tlb_npages+0xf/0x95
...
[0.00] Call Trace:
[0.00]  do_early_param+0x57/0x8e
[0.00]  parse_args+0x208/0x320
[0.00]  ? rdinit_setup+0x30/0x30
[0.00]  parse_early_options+0x29/0x2d
[0.00]  ? rdinit_setup+0x30/0x30
[0.00]  parse_early_param+0x36/0x4d
[0.00]  setup_arch+0x336/0x99e
[0.00]  start_kernel+0x6f/0x4e6
[0.00]  x86_64_start_reservations+0x24/0x26
[0.00]  x86_64_start_kernel+0x6f/0x72
[0.00]  secondary_startup_64+0xa4/0xb0

This patch adds a check to prevent the panic and falls back to a 4 MB
swiotlb by default.

Signed-off-by: He Zhe
Cc: sta...@vger.kernel.org
Cc: konrad.w...@oracle.com
Cc: h...@lst.de
Cc: m.szyprow...@samsung.com
Cc: robin.mur...@arm.com
---
v2: Set the swiotlb size to 4 MB by default

 kernel/dma/swiotlb.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 045930e..0e18cd4 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -58,6 +58,9 @@
  */
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 
+#define IO_TLB_DEFAULT_MB 4
+#define IO_TLB_DEFAULT_SLABS ((IO_TLB_DEFAULT_MB<<20) >> IO_TLB_SHIFT)
+
 enum swiotlb_force swiotlb_force;
 
 /*
@@ -103,6 +106,14 @@ static int late_alloc;
 static int __init
 setup_io_tlb_npages(char *str)
 {
+	if (!str) {
+		pr_err("No value provided for swiotlb, using the default of %d slabs (%d MB)\n",
+		       IO_TLB_DEFAULT_SLABS, IO_TLB_DEFAULT_MB);
+		io_tlb_nslabs = IO_TLB_DEFAULT_SLABS;
+		return 0;
+	}
+
 	if (isdigit(*str)) {
 		io_tlb_nslabs = simple_strtoul(str, &str, 0);
 		/* avoid tail segment of size < IO_TLB_SEGSIZE */
-- 
2.7.4
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
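The crash and the fix both hinge on one detail of early_param handlers: when the option is given without "=value", the handler receives a NULL string, so the pointer must be checked before isdigit() dereferences it. A userspace sketch of the v2 parsing logic (the constants mirror the patch; the function name is hypothetical):

```c
#include <ctype.h>
#include <stdlib.h>

#define IO_TLB_SHIFT		11	/* each swiotlb slab is 2 KB */
#define IO_TLB_DEFAULT_MB	4
#define IO_TLB_DEFAULT_SLABS	((IO_TLB_DEFAULT_MB << 20) >> IO_TLB_SHIFT)

/*
 * Parse the value of "swiotlb=<nslabs>". A bare "swiotlb" on the command
 * line hands the handler a NULL pointer -- the case that panicked -- so
 * v2 falls back to the 4 MB default instead of dereferencing it.
 */
unsigned long parse_swiotlb_nslabs(const char *str)
{
	if (!str)				/* no value: fall back to 4 MB */
		return IO_TLB_DEFAULT_SLABS;
	if (isdigit((unsigned char)*str))
		return strtoul(str, NULL, 0);
	return 0;				/* unrecognized value: leave unset */
}
```

With a 2 KB slab size, the 4 MB default works out to (4 << 20) >> 11 = 2048 slabs, which is what the pr_err in the patch reports.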
Re: [PATCH] kernel/dma: Fix panic caused by passing swiotlb to command line
On 2018/12/2 20:30, Konrad Rzeszutek Wilk wrote:
> On Tue, Oct 30, 2018 at 07:06:10PM +0800, He Zhe wrote:
>>
>> On 2018/10/23 19:14, He Zhe wrote:
>>> On 2018/10/23 03:29, Konrad Rzeszutek Wilk wrote:
>>>> On Sat, Sep 22, 2018 at 08:56:58PM +0800, He Zhe wrote:
>>>>> May I have your input?
>>>> Alternatively, would it make more sense for it to assume some default value?
>>> Maybe, but the original code has no default value and I have no idea
>>> what default value is proper here.
>> Can anyone give some suggestions? Though I'd prefer to do nothing when
>> no option is provided.
> One provided the parameter for a reason. I would just do a small one, say
> 4MB.

OK. Thanks, I'll send v2.

Zhe

>> Thanks,
>> Zhe
>>
>>> Zhe
>>>
>>>>> Thanks,
>>>>> Zhe
>>>>>
>>>>> On 2018-09-17 11:27, zhe...@windriver.com wrote:
>> From: He Zhe
>>
>> setup_io_tlb_npages does not check its input argument before passing it
>> to isdigit. The argument will be a NULL pointer if "swiotlb", without
>> a value, is set on the command line, which causes the following panic.
>>
>> PANIC: early exception 0xe3 IP 10:bb9b8e9f error 0 cr2 0x0
>> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-yocto-standard+ #9
>> [0.00] RIP: 0010:setup_io_tlb_npages+0xf/0x95
>> ...
>> [0.00] Call Trace:
>> [0.00]  do_early_param+0x57/0x8e
>> [0.00]  parse_args+0x208/0x320
>> [0.00]  ? rdinit_setup+0x30/0x30
>> [0.00]  parse_early_options+0x29/0x2d
>> [0.00]  ? rdinit_setup+0x30/0x30
>> [0.00]  parse_early_param+0x36/0x4d
>> [0.00]  setup_arch+0x336/0x99e
>> [0.00]  start_kernel+0x6f/0x4e6
>> [0.00]  x86_64_start_reservations+0x24/0x26
>> [0.00]  x86_64_start_kernel+0x6f/0x72
>> [0.00]  secondary_startup_64+0xa4/0xb0
>>
>> This patch adds a check to prevent the panic.
>>
>> Signed-off-by: He Zhe
>> Cc: sta...@vger.kernel.org
>> Cc: konrad.w...@oracle.com
>> Cc: h...@lst.de
>> Cc: m.szyprow...@samsung.com
>> Cc: robin.mur...@arm.com
>> ---
>>  kernel/dma/swiotlb.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>> index 4f8a6db..46fc34e 100644
>> --- a/kernel/dma/swiotlb.c
>> +++ b/kernel/dma/swiotlb.c
>> @@ -109,6 +109,11 @@ static int late_alloc;
>>  static int __init
>>  setup_io_tlb_npages(char *str)
>>  {
>> +	if (!str) {
>> +		pr_err("Config string not provided\n");
>> +		return -EINVAL;
>> +	}
>> +
>>  	if (isdigit(*str)) {
>>  		io_tlb_nslabs = simple_strtoul(str, &str, 0);
>>  		/* avoid tail segment of size < IO_TLB_SEGSIZE */
Re: [PATCH v3 06/15] iommu/mediatek: Add mt8183 IOMMU support
Hi Matthias,

Thanks very much for your review.

On Mon, 2018-12-03 at 00:56 +0100, Matthias Brugger wrote:
>
> On 17/11/2018 03:35, Yong Wu wrote:
> > The M4U IP blocks in mt8183 are MediaTek's generation-2 M4U, which
> > uses the ARM Short-descriptor format like mt8173, and most of the HW
> > registers are the same.
> >
> > Here are the main changes in mt8183:
> > 1) mt8183 has only one M4U HW, like mt8173.
>
> That's a change?

I guess I should use "difference" rather than "change". (mt8173 and
mt8183 have only one M4U HW while mt2712 has two.)

> > 2) mt8183 doesn't have its own "bclk" clock; the M4U uses the EMI
> > clock, which has already been enabled before the kernel starts.
> > 3) mt8183 can support DRAM over 4GB, but it doesn't call this "4GB
> > mode".
> > 4) The mt8183 pgtable base register (0x0) extends bit[1:0], which
> > represent bit[33:32] of the physical address of the pgtable base.
> > But the standard ttbr0[1] means the S bit, which is enabled by
> > default. Hence we add a mask.
> > 5) The mt8183 HW has GALS modules; the SMI should add "gals" clock
> > support.
> > 6) The larb-id has been remapped in the smi-common. This means the
> > larb-id reported in mtk_iommu_isr is not the real larb-id; here is
> > the remapping relationship for mt8183:
>
> This whole list is an indicator that you will need several enhancements to the
> existing code before you can add mt8183 support.
> Please do so in independent patches (e.g. gals modules, remap logic)

OK. I will separate them into 2 new preparing patches.

> Regarding making bclk optional, I think it would be better to set it to NULL
> for mt8183, otherwise the driver will work with a broken device tree on
> systems that actually need the bclk. Same holds for gals0 and gals1.

OK. Following your comment below, I will add a "has_bclk" in the
platform data and change the code like this:

	if (data->plat_data->has_bclk) {
		data->bclk = devm_clk_get();
		if (IS_ERR(data->bclk))
			return PTR_ERR(data->bclk);
	} else {
		data->bclk = NULL;
	}

> >               M4U
> >                |
> >        ---------------
> >        | SMI common  |
> >        -0-7-5-6-1-2-3-4- <- Id remapped
> >         |   |   |   |   |   |   |   |
> >      larb0 larb1 IPU0 IPU1 larb4 larb5 larb6 CCU
> >       disp  vdec  img  cam  venc  img   cam
> > As above, larb0 connects with id 0 in the smi-common,
> > larb1 connects with id 7 in the smi-common,
> > ...
> > For example, if the larb-id reported in mtk_iommu_isr is 7, it is
> > actually larb1 (vdec).
> >
> > Signed-off-by: Yong Wu
> > ---
> >  drivers/iommu/mtk_iommu.c | 38 ++++++++++++++++++++++++----------
> >  drivers/iommu/mtk_iommu.h |  5 +++++
> >  drivers/memory/mtk-smi.c  | 47 +++++++++++++++++++++++++++++++
> >  3 files changed, 80 insertions(+), 10 deletions(-)
> >
> > [...]
> >
> > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > index a243047..e5fd8ee 100644
> > --- a/drivers/iommu/mtk_iommu.h
> > +++ b/drivers/iommu/mtk_iommu.h
> > @@ -39,11 +39,16 @@ enum mtk_iommu_plat {
> >  	M4U_MT2701,
> >  	M4U_MT2712,
> >  	M4U_MT8173,
> > +	M4U_MT8183,
> >  };
> >
> >  struct mtk_iommu_plat_data {
> >  	enum mtk_iommu_plat m4u_plat;
> >  	bool has_4gb_mode;
> > +
> > +	/* The larb-id may be remapped in the smi-common. */
> > +	bool larbid_remap_enable;
> > +	unsigned int larbid_remapped[MTK_LARB_NR_MAX];
>
> Please add new features to the driver as single patches. Don't add this kind
> of things in the patch where you enable a new SoC.

OK.
> > };
> >
> >  struct mtk_iommu_domain;
> >
> > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > index e37e54b..979153b 100644
> > --- a/drivers/memory/mtk-smi.c
> > +++ b/drivers/memory/mtk-smi.c
> > @@ -59,6 +59,7 @@ struct mtk_smi_larb_gen {
> >  struct mtk_smi {
> >  	struct device *dev;
> >  	struct clk *clk_apb, *clk_smi;
> > +	struct clk *clk_gals0, *clk_gals1;
> >  	struct clk *clk_async; /* only needed by mt2701 */
> >  	void __iomem *smi_ao_base;
> >  };
> > @@ -93,8 +94,20 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
> >  	if (ret)
> >  		goto err_disable_apb;
> >
> > +	ret = clk_prepare_enable(smi->clk_gals0);
> > +	if (ret)
> > +		goto err_disable_smi;
> > +
> > +	ret = clk_prepare_enable(smi->clk_gals1);
> > +	if (ret)
> > +		goto err_disable_gals0;
> > +
> >  	return 0;
> >
> > +err_disable_gals0:
> > +	clk_disable_unprepare(smi->clk_gals0);
> > +err_disable_smi:
> > +	clk_disable_unprepare(smi->clk_smi);
> >  err_disable_apb:
> >  	clk_disable_unprepare(smi->clk_apb);
> >  err_put_pm:
> > @@ -104,6 +117,8 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
> >
> >  static vo
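Stepping back from the diff, the "has_bclk" idea agreed on in this thread boils down to: consult a per-SoC flag before touching the clock, and record an explicit NULL when the SoC has no bclk, so a device tree that wrongly omits bclk on an SoC that does need it still fails loudly. A userspace sketch of that control flow (the struct and function names are hypothetical stand-ins for the kernel ones):

```c
#include <stddef.h>

/* Hypothetical mirror of the proposed per-SoC platform data. */
struct plat_data {
	int has_bclk;
};

/*
 * Stand-in for the devm_clk_get() step: SoCs without a bclk (e.g. mt8183,
 * whose M4U runs from the EMI clock) get an explicit NULL handle; only
 * SoCs with has_bclk set ever look the clock up.
 */
const char *get_bclk(const struct plat_data *plat)
{
	if (!plat->has_bclk)
		return NULL;		/* no bclk on this SoC by design */
	return "bclk";			/* pretend clock handle */
}
```

The design point is that "clock intentionally absent" (NULL) is distinguishable from "clock lookup failed" (an error pointer in the kernel version), which is exactly what Matthias asked for.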
Re: [PATCH v3 13/15] memory: mtk-smi: Get rid of need_larbid
On Mon, 2018-12-03 at 00:04 +0100, Matthias Brugger wrote:
>
> On 17/11/2018 03:35, Yong Wu wrote:
> > The "mediatek,larb-id" has already been parsed in the MTK IOMMU
> > driver. There is no need to parse it again in the SMI driver. This
> > only cleans up some code and is fine for all the current SoCs:
> > mt2701, mt2712, mt7623, mt8173 and mt8183.
>
> I'm trying to understand why we need the mediatek,larb-id at all. From what I
> understand, as long as the mediatek larbs described in the iommu are ordered
> (larb0, larb1, larb2, etc.) we actually get the same value as defined by
> mediatek,larb-id. At least this holds for all present implementations.
>
> On the other hand I don't see where the mtk_iommu_v1 driver actually parses
> the larb-id, so can you please help me understand:
>
> 1) why we need the larb-id at all

Actually only mt2712, which has two M4U HWs, needs "mediatek,larb-id".
These are the larbs in the m4u0/1 dtsi nodes of mt2712:

	m4u0 { mediatek,larbs = <&larb0 &larb1 &larb2 &larb3 &larb6>; };
	m4u1 { mediatek,larbs = <&larb4 &larb5 &larb7>; };

Since the IDs don't increase one by one, the M4U has to get the larb-id
with the help of "mediatek,larb-id". (The m4u/smi dtsi patch of mt2712
will be sent together with some other modules, maybe this week.)

> 2) how this will work if we do not parse the larb-id for v1 iommu at all

As you said above, and as the "mediatek,larbs" description in the
dt-binding also states, the larb-ids "must sort according to the local
arbiter index". All the M4Us except mt2712 can ignore
"mediatek,larb-id", so the v1 iommu should also be OK.

I'm not sure whether we should change [1]; if we restrict it to mt2712
there, we would also have to change the dtsi files of mt2701 and
mt7623, or we keep it as is.

[1] https://elixir.bootlin.com/linux/v4.20-rc1/source/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt#L20

> Thanks a lot,
> Matthias
>
> [snip]
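The mt8183 remapping discussed across these two threads can be captured as a small lookup table derived from the smi-common diagram in the mt8183 patch: port order 0-7-5-6-1-2-3-4 connects to larb0, larb1, IPU0, IPU1, larb4, larb5, larb6, CCU. Inverting that gives a reported-id to real-master table (a sketch only; in the kernel patch this lives in the larbid_remapped[] platform data as integers, and the names here are illustrative):

```c
#include <stddef.h>
#include <string.h>

/*
 * Indexed by the larb-id reported in mtk_iommu_isr(); the value is the
 * real master behind that smi-common port on mt8183. Inverted from the
 * diagram: ports 0,7,5,6,1,2,3,4 <-> larb0,larb1,IPU0,IPU1,larb4,larb5,
 * larb6,CCU.
 */
static const char *const mt8183_real_master[8] = {
	"larb0",	/* reported 0 */
	"larb4",	/* reported 1 */
	"larb5",	/* reported 2 */
	"larb6",	/* reported 3 */
	"CCU",		/* reported 4 */
	"IPU0",		/* reported 5 */
	"IPU1",		/* reported 6 */
	"larb1",	/* reported 7: the vdec example from the thread */
};

/* Translate a reported larb-id into its real master; NULL if out of range. */
const char *mt8183_remap(unsigned int reported)
{
	return reported < 8 ? mt8183_real_master[reported] : NULL;
}
```

This matches the worked example in the mt8183 patch: a reported larb-id of 7 is actually larb1 (vdec).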