Re: [PATCH v3 1/3] iommu/vt-d: Allow 32bit devices to uses DMA domain
On Thu, Apr 16, 2020 at 03:40:38PM +0800, Lu Baolu wrote: >> description. I'd need to look at the final code, but it seems like >> this will still cause bounce buffering instead of using dynamic >> mapping, which still seems like an awful idea. > > Yes. If the user chooses to use identity domain by default through > kernel command, identity domain will be applied for all devices. For > those devices with limited addressing capability, bounce buffering will > be used when they try to access the memory beyond their address > capability. This won't cause any kernel regression as far as I can see. > > Switching domain during runtime with drivers loaded will cause real > problems as I said in the commit message. That's the reason why I am > proposing to remove it. If we want to keep it, we have to make sure that > switching domain for one device should not impact other devices which > share the same domain with it. Furthermore, it's better to implement it > in the generic layer to keep device driver behavior consistent on all > architectures. I don't disagree with the technical points. What I pointed out is that a) the actual technical change is not in the commit log, which it should be b) that I still think taking away the ability to dynamically map devices in the identify domain after all the time we allowed for that is going to cause nasty regressions. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v3] dt-bindings: iommu: renesas,ipmmu-vmsa: convert to json-schema
Hi all, > From: Yoshihiro Shimoda, Sent: Friday, April 17, 2020 1:35 PM > +properties: > + compatible: > +oneOf: > + - items: > + - enum: > + - renesas,ipmmu-r8a73a4 # R-Mobile APE6 > + - renesas,ipmmu-r8a7743 # RZ/G1M > + - renesas,ipmmu-r8a7744 # RZ/G1N > + - renesas,ipmmu-r8a7745 # RZ/G1E > + - renesas,ipmmu-r8a7790 # R-Car H2 > + - renesas,ipmmu-r8a7791 # R-Car M2-W > + - renesas,ipmmu-r8a7793 # R-Car M2-N > + - renesas,ipmmu-r8a7794 # R-Car E2 > + - renesas,ipmmu-r8a7795 # R-Car H3 I'm afraid but this renesas,ipmmu-r8a7795 should be the below section. So, I'll fix it. Best regards, Yoshihiro Shimoda ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
Hi Alex, > From: Alex Williamson > Sent: Thursday, April 16, 2020 10:41 PM > To: Liu, Yi L > Subject: Re: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE > > On Thu, 16 Apr 2020 10:40:03 + > "Liu, Yi L" wrote: > > > Hi Alex, > > Still have a direction question with you. Better get agreement with you > > before heading forward. > > > > > From: Alex Williamson > > > Sent: Friday, April 3, 2020 11:35 PM > > [...] > > > > > > + * > > > > > > + * returns: 0 on success, -errno on failure. > > > > > > + */ > > > > > > +struct vfio_iommu_type1_cache_invalidate { > > > > > > + __u32 argsz; > > > > > > + __u32 flags; > > > > > > + struct iommu_cache_invalidate_info cache_info; > > > > > > +}; > > > > > > +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > > > VFIO_BASE > > > > > + 24) > > > > > > > > > > The future extension capabilities of this ioctl worry me, I wonder if > > > > > we should do another data[] with flag defining that data as > > > > > CACHE_INFO. > > > > > > > > Can you elaborate? Does it mean with this way we don't rely on iommu > > > > driver to provide version_to_size conversion and instead we just pass > > > > data[] to iommu driver for further audit? > > > > > > No, my concern is that this ioctl has a single function, strictly tied > > > to the iommu uapi. If we replace cache_info with data[] then we can > > > define a flag to specify that data[] is struct > > > iommu_cache_invalidate_info, and if we need to, a different flag to > > > identify data[] as something else. For example if we get stuck > > > expanding cache_info to meet new demands and develop a new uapi to > > > solve that, how would we expand this ioctl to support it rather than > > > also create a new ioctl? There's also a trade-off in making the ioctl > > > usage more difficult for the user. I'd still expect the vfio layer to > > > check the flag and interpret data[] as indicated by the flag rather > > > than just passing a blob of opaque data to the iommu layer though. > > > Thanks, > > > > Based on your comments about defining a single ioctl and a unified > > vfio structure (with a @data[] field) for pasid_alloc/free, bind/ > > unbind_gpasid, cache_inv. After some offline trying, I think it would > > be good for bind/unbind_gpasid and cache_inv as both of them use the > > iommu uapi definition. While the pasid alloc/free operation doesn't. > > It would be weird to put all of them together. So pasid alloc/free > > may have a separate ioctl. It would look as below. Does this direction > > look good per your opinion? > > > > ioctl #22: VFIO_IOMMU_PASID_REQUEST > > /** > > * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID > > * specify a pasid to be freed when flags == FREE_PASID > > * @range: specify the allocation range when flags == ALLOC_PASID > > */ > > struct vfio_iommu_pasid_request { > > __u32 argsz; > > #define VFIO_IOMMU_ALLOC_PASID (1 << 0) > > #define VFIO_IOMMU_FREE_PASID (1 << 1) > > __u32 flags; > > __u32 pasid; > > struct { > > __u32 min; > > __u32 max; > > } range; > > }; > > Can't the ioctl return the pasid valid on alloc (like GET_DEVICE_FD)? Yep, I think you mentioned before. At that time, I believed it would be better to return the result via a __u32 buffer so that make full use of the 32 bits. But looks like it doesn't make much difference. I'll follow your suggestion. > Would it be useful to support freeing a range of pasids? If so then we > could simply use range for both, ie. allocate a pasid from this range > and return it, or free all pasids in this range? vfio already needs to > track pasids to free them on release, so presumably this is something > we could support easily. yes, I think it is a nice thing. then I can remove the @pasid field. will do it. > > ioctl #23: VFIO_IOMMU_NESTING_OP > > struct vfio_iommu_type1_nesting_op { > > __u32 argsz; > > __u32 flags; > > __u32 op; > > __u8data[]; > > }; > > data only has 4-byte alignment, I think we really want it at an 8-byte > alignment. This is why I embedded the "op" into the flag for > DEVICE_FEATURE. Thanks, got it. I may also merge the op into flags (maybe the lower 16 bits for op). Thanks, Yi Liu > Alex > > > > > /* Nesting Ops */ > > #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 > > #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 > > #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 > > > > Thanks, > > Yi Liu > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v3] dt-bindings: iommu: renesas, ipmmu-vmsa: convert to json-schema
Convert Renesas VMSA-Compatible IOMMU bindings documentation to json-schema. Note that original documentation doesn't mention renesas,ipmmu-vmsa for R-Mobile APE6. But, R-Mobile APE6 is similar to the R-Car Gen2. So, renesas,ipmmu-r8a73a4 belongs the renesas,ipmmu-vmsa section. Signed-off-by: Yoshihiro Shimoda --- Changes from v2: - Add a description for R-Mobile APE6 on the commit log. - Change renesas,ipmmu-r8a73a4 section on the compatible. - Add items on the interrupts. - Add power-domains to required. - Add oneOf for interrupts and renesas,ipmmu-main https://patchwork.kernel.org/patch/11490581/ Changes from v1: - Fix typo in the subject. - Add a description on #iommu-cells. https://patchwork.kernel.org/patch/11485415/ .../bindings/iommu/renesas,ipmmu-vmsa.txt | 73 --- .../bindings/iommu/renesas,ipmmu-vmsa.yaml | 101 + 2 files changed, 101 insertions(+), 73 deletions(-) delete mode 100644 Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt create mode 100644 Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml diff --git a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt deleted file mode 100644 index 020d6f2.. --- a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.txt +++ /dev/null @@ -1,73 +0,0 @@ -* Renesas VMSA-Compatible IOMMU - -The IPMMU is an IOMMU implementation compatible with the ARM VMSA page tables. -It provides address translation for bus masters outside of the CPU, each -connected to the IPMMU through a port called micro-TLB. - - -Required Properties: - - - compatible: Must contain SoC-specific and generic entry below in case -the device is compatible with the R-Car Gen2 VMSA-compatible IPMMU. - -- "renesas,ipmmu-r8a73a4" for the R8A73A4 (R-Mobile APE6) IPMMU. -- "renesas,ipmmu-r8a7743" for the R8A7743 (RZ/G1M) IPMMU. -- "renesas,ipmmu-r8a7744" for the R8A7744 (RZ/G1N) IPMMU. -- "renesas,ipmmu-r8a7745" for the R8A7745 (RZ/G1E) IPMMU. -- "renesas,ipmmu-r8a774a1" for the R8A774A1 (RZ/G2M) IPMMU. -- "renesas,ipmmu-r8a774b1" for the R8A774B1 (RZ/G2N) IPMMU. -- "renesas,ipmmu-r8a774c0" for the R8A774C0 (RZ/G2E) IPMMU. -- "renesas,ipmmu-r8a7790" for the R8A7790 (R-Car H2) IPMMU. -- "renesas,ipmmu-r8a7791" for the R8A7791 (R-Car M2-W) IPMMU. -- "renesas,ipmmu-r8a7793" for the R8A7793 (R-Car M2-N) IPMMU. -- "renesas,ipmmu-r8a7794" for the R8A7794 (R-Car E2) IPMMU. -- "renesas,ipmmu-r8a7795" for the R8A7795 (R-Car H3) IPMMU. -- "renesas,ipmmu-r8a7796" for the R8A7796 (R-Car M3-W) IPMMU. -- "renesas,ipmmu-r8a77965" for the R8A77965 (R-Car M3-N) IPMMU. -- "renesas,ipmmu-r8a77970" for the R8A77970 (R-Car V3M) IPMMU. -- "renesas,ipmmu-r8a77980" for the R8A77980 (R-Car V3H) IPMMU. -- "renesas,ipmmu-r8a77990" for the R8A77990 (R-Car E3) IPMMU. -- "renesas,ipmmu-r8a77995" for the R8A77995 (R-Car D3) IPMMU. -- "renesas,ipmmu-vmsa" for generic R-Car Gen2 or RZ/G1 VMSA-compatible - IPMMU. - - - reg: Base address and size of the IPMMU registers. - - interrupts: Specifiers for the MMU fault interrupts. For instances that -support secure mode two interrupts must be specified, for non-secure and -secure mode, in that order. For instances that don't support secure mode a -single interrupt must be specified. Not required for cache IPMMUs. - - - #iommu-cells: Must be 1. - -Optional properties: - - - renesas,ipmmu-main: reference to the main IPMMU instance in two cells. -The first cell is a phandle to the main IPMMU and the second cell is -the interrupt bit number associated with the particular cache IPMMU device. -The interrupt bit number needs to match the main IPMMU IMSSTR register. -Only used by cache IPMMU instances. - - -Each bus master connected to an IPMMU must reference the IPMMU in its device -node with the following property: - - - iommus: A reference to the IPMMU in two cells. The first cell is a phandle -to the IPMMU and the second cell the number of the micro-TLB that the -device is connected to. - - -Example: R8A7791 IPMMU-MX and VSP1-D0 bus master - - ipmmu_mx: mmu@fe951000 { - compatible = "renasas,ipmmu-r8a7791", "renasas,ipmmu-vmsa"; - reg = <0 0xfe951000 0 0x1000>; - interrupts = <0 222 IRQ_TYPE_LEVEL_HIGH>, -<0 221 IRQ_TYPE_LEVEL_HIGH>; - #iommu-cells = <1>; - }; - - vsp@fe928000 { - ... - iommus = <&ipmmu_mx 13>; - ... - }; diff --git a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml new file mode 100644 index ..8ade89d --- /dev/null +++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml @@ -0,0 +1,101 @@ +
Re: [PATCH v2 5/7] iommu/vt-d: Save prq descriptors in an internal list
Hi Kevin, On 2020/4/16 9:46, Lu Baolu wrote: On 2020/4/15 17:30, Tian, Kevin wrote: From: Lu Baolu Sent: Wednesday, April 15, 2020 1:26 PM Currently, the page request interrupt thread handles the page requests in the queue in this way: - Clear PPR bit to ensure new interrupt could come in; - Read and record the head and tail registers; - Handle all descriptors between head and tail; - Write tail to head register. This might cause some descriptors to be handles multiple times. An example sequence: - Thread A got scheduled with PRQ_1 and PRQ_2 in the queue; - Thread A clear the PPR bit and record the head and tail; - A new PRQ_3 comes and Thread B gets scheduled; - Thread B record the head and tail which includes PRQ_1 and PRQ_2. I may overlook something but isn't the prq interrupt thread per iommu then why would two prq threads contend here? The prq interrupt could be masked by the PPR (Pending Page Request) bit in Page Request Status Register. In the interrupt handling thread once this bit is clear, new prq interrupts are allowed to be generated. So, if a page request is in process and the PPR bit is cleared, another page request from any devices under the same iommu could trigger another interrupt thread. Rechecked the code. You are right. As long as the interrupt thread is per iommu, there will only single prq thread scheduled. I will change this accordingly in the new version. Thank you for pointing this out. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v11 05/10] iommu/vt-d: Add bind guest PASID support
> From: Auger Eric > Sent: Thursday, April 16, 2020 6:43 PM > [...] > >>> + if (svm) { > >>> + /* > >>> + * If we found svm for the PASID, there must be at > >>> + * least one device bond, otherwise svm should be > >>> freed. > >>> + */ > >>> + if (WARN_ON(list_empty(&svm->devs))) { > >>> + ret = -EINVAL; > >>> + goto out; > >>> + } > >>> + > >>> + for_each_svm_dev(sdev, svm, dev) { > >>> + /* In case of multiple sub-devices of the > >>> same pdev > >>> + * assigned, we should allow multiple bind > >>> calls with > >>> + * the same PASID and pdev. > >>> + */ > >>> + sdev->users++; > >> What if this is not an mdev device. Is it also allowed? > > Yes. IOMMU and VT-d driver is not mdev aware. Here mdev is just an > > example of normal use case. You can bind the same PCI device (PF or > > SRIOV VF) more than once to the same PASID. Just need to unbind also. > > I don't get the point of binding a non mdev device several times with > the same PASID. Do you intend to allow that at userspace level or > prevent this from happening in VFIO? I feel it's better to prevent this from happening, otherwise VFIO also needs to track the bind count and do multiple unbinds at mm_exit. But it's not necessary to prevent it in VFIO. We can check here upon whether aux_domain is valid, and if not return -EBUSY. > > Besides, the comment is a bit misleading as it gives the impression it > is only true for mdev and there is no associated check. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 6/7] iommu/vt-d: Add page request draining support
> From: Lu Baolu > Sent: Thursday, April 16, 2020 4:38 PM > > Hi Kevin, > > On 2020/4/15 19:10, Tian, Kevin wrote: > > the completion of above sequence ensures that previous queued > > page group responses are sent out and received by the endpoint > > and vice versa all in-fly page requests from the endpoint are queued > > in iommu page request queue. Then comes a problem - you didn't > > wait for completion of those newly-queued requests and their > > responses. > > I thought about this again. > > We do page request draining after PASID table entry gets torn down and > the devTLB gets invalidated. At this point, if any new page request for > this pasid comes in, IOMMU will generate an unrecoverable fault and > response the device with IR (Invalid Request). IOMMU won't put this page > request into the queue. [VT-d spec 7.4.1] Non-coverable fault implies severe errors, so I don't see why we should allow such thing happen when handling a normal situation. if you look at the start of chapter 7: -- Non-recoverable Faults: Requests that encounter non-recoverable address translation faults are aborted by the remapping hardware, and typically require a reset of the device (such as through a function- level-reset) to recover and re-initialize the device to put it back into service. -- > > Hence, we don't need to worry about the newly-queued requests here. > > Best regards, > Baolu Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH] iommu/amd: fix a race in fetch_pte()
> On Apr 13, 2020, at 9:36 PM, Qian Cai wrote: > > > >> On Apr 8, 2020, at 10:19 AM, Joerg Roedel wrote: >> >> Hi Qian, >> >> On Tue, Apr 07, 2020 at 11:36:05AM -0400, Qian Cai wrote: >>> After further testing, the change along is insufficient. What I am chasing >>> right >>> now is the swap device will go offline after heavy memory pressure below. >>> The >>> symptom is similar to what we have in the commit, >>> >>> 754265bcab78 (“iommu/amd: Fix race in increase_address_space()”) >>> >>> Apparently, it is no possible to take the domain->lock in fetch_pte() >>> because it >>> could sleep. >> >> Thanks a lot for finding and tracking down another race in the AMD IOMMU >> page-table code. The domain->lock is a spin-lock and taking it can't >> sleep. But fetch_pte() is a fast-path and must not take any locks. >> >> I think the best fix is to update the pt_root and mode of the domain >> atomically by storing the mode in the lower 12 bits of pt_root. This way >> they are stored together and can be read/write atomically. > > Like this? So, this is still not enough that would still trigger storage driver offline under memory pressure for a bit longer. It looks to me that in fetch_pte() there are could still racy? level = domain->mode - 1; pte= &domain->pt_root[PM_LEVEL_INDEX(level, address)]; <— increase_address_space(); *page_size = PTE_LEVEL_PAGE_SIZE(level); while (level > 0) { *page_size = PTE_LEVEL_PAGE_SIZE(level); Then in iommu_unmap_page(), while (unmapped < page_size) { pte = fetch_pte(dom, bus_addr, &unmap_size); … bus_addr = (bus_addr & ~(unmap_size - 1)) + unmap_size; bus_addr would be unsync with dom->mode when it enter fetch_pte() again. Could that be a problem? > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > index 20cce366e951..b36c6b07cbfd 100644 > --- a/drivers/iommu/amd_iommu.c > +++ b/drivers/iommu/amd_iommu.c > @@ -1397,13 +1397,13 @@ static struct page *free_sub_pt(unsigned long root, > int mode, > > static void free_pagetable(struct protection_domain *domain) > { > - unsigned long root = (unsigned long)domain->pt_root; > + int level = iommu_get_mode(domain->pt_root); > + unsigned long root = iommu_get_root(domain->pt_root); > struct page *freelist = NULL; > > - BUG_ON(domain->mode < PAGE_MODE_NONE || > -domain->mode > PAGE_MODE_6_LEVEL); > + BUG_ON(level < PAGE_MODE_NONE || level > PAGE_MODE_6_LEVEL); > > - freelist = free_sub_pt(root, domain->mode, freelist); > + freelist = free_sub_pt(root, level, freelist); > > free_page_list(freelist); > } > @@ -1417,24 +1417,27 @@ static bool increase_address_space(struct > protection_domain *domain, > unsigned long address, > gfp_t gfp) > { > + int level; > unsigned long flags; > bool ret = false; > u64 *pte; > > spin_lock_irqsave(&domain->lock, flags); > > - if (address <= PM_LEVEL_SIZE(domain->mode) || > - WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL)) > + level = iommu_get_mode(domain->pt_root); > + > + if (address <= PM_LEVEL_SIZE(level) || > + WARN_ON_ONCE(level == PAGE_MODE_6_LEVEL)) > goto out; > > pte = (void *)get_zeroed_page(gfp); > if (!pte) > goto out; > > - *pte = PM_LEVEL_PDE(domain->mode, > - iommu_virt_to_phys(domain->pt_root)); > - domain->pt_root = pte; > - domain->mode+= 1; > + *pte = PM_LEVEL_PDE(level, > + iommu_virt_to_phys((void *)iommu_get_root(domain->pt_root))); > + > + WRITE_ONCE(domain->pt_root, (unsigned long)pte + level + 1); > > ret = true; > > @@ -1452,15 +1455,17 @@ static u64 *alloc_pte(struct protection_domain > *domain, > bool *updated) > { > int level, end_lvl; > - u64 *pte, *page; > + u64 *pte, *page, *pt_root, *root; > > BUG_ON(!is_power_of_2(page_size)); > > - while (address > PM_LEVEL_SIZE(domain->mode)) > + while (address > PM_LEVEL_SIZE(iommu_get_mode(domain->pt_root))) > *updated = increase_address_space(domain, address, gfp) || > *updated; > > - level = domain->mode - 1; > - pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; > + pt_root = READ_ONCE(domain->pt_root); > + root= (void *)iommu_get_root(pt_root); > + level = iommu_get_mode(pt_root) - 1; > + pte = &root[PM_LEVEL_INDEX(level, address)]; > address = PAGE_SIZE_ALIGN(address, page_size); > end_lvl = PAGE_SIZE_LEVEL(page_size); > > @@ -1536,16 +1541,18 @@ static u64 *fetch_pte(struct protection_domain > *domain, > unsigned long address, > unsign
Re: [PATCH v1 0/2] vfio/pci: expose device's PASID capability to VMs
On Fri, Apr 17, 2020 at 06:33:54AM +0800, Raj, Ashok wrote: > Hi Zhao > > > On Thu, Apr 16, 2020 at 06:12:26PM -0400, Yan Zhao wrote: > > On Tue, Mar 31, 2020 at 03:08:25PM +0800, Lu, Baolu wrote: > > > On 2020/3/31 14:35, Tian, Kevin wrote: > > > >> From: Liu, Yi L > > > >> Sent: Sunday, March 22, 2020 8:33 PM > > > >> > > > >> From: Liu Yi L > > > >> > > > >> Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on > > > >> Intel platforms allows address space sharing between device DMA and > > > >> applications. SVA can reduce programming complexity and enhance > > > >> security. > > > >> > > > >> To enable SVA, device needs to have PASID capability, which is a key > > > >> capability for SVA. This patchset exposes the device's PASID capability > > > >> to guest instead of hiding it from guest. > > > >> > > > >> The second patch emulates PASID capability for VFs (Virtual Function) > > > >> since > > > >> VFs don't implement such capability per PCIe spec. This patch emulates > > > >> such > > > >> capability and expose to VM if the capability is enabled in PF > > > >> (Physical > > > >> Function). > > > >> > > > >> However, there is an open for PASID emulation. If PF driver disables > > > >> PASID > > > >> capability at runtime, then it may be an issue. e.g. PF should not > > > >> disable > > > >> PASID capability if there is guest using this capability on any VF > > > >> related > > > >> to this PF. To solve it, may need to introduce a generic communication > > > >> framework between vfio-pci driver and PF drivers. Please feel free to > > > >> give > > > >> your suggestions on it. > > > > I'm not sure how this is addressed on bate metal today, i.e. between > > > > normal > > > > kernel PF and VF drivers. I look at pasid enable/disable code in > > > > intel-iommu.c. > > > > There is no check on PF/VF dependency so far. The cap is toggled when > > > > attaching/detaching the PF to its domain. Let's see how IOMMU guys > > > > respond, and if there is a way for VF driver to block PF driver from > > > > disabling > > > > the pasid cap when it's being actively used by VF driver, then we may > > > > leverage the same trick in VFIO when emulation is provided to guest. > > > > > > IOMMU subsystem doesn't expose any APIs for pasid enabling/disabling. > > > The PCI subsystem does. It handles VF/PF like below. > > > > > > /** > > > * pci_enable_pasid - Enable the PASID capability > > > * @pdev: PCI device structure > > > * @features: Features to enable > > > * > > > * Returns 0 on success, negative value on error. This function checks > > > * whether the features are actually supported by the device and returns > > > * an error if not. > > > */ > > > int pci_enable_pasid(struct pci_dev *pdev, int features) > > > { > > > u16 control, supported; > > > int pasid = pdev->pasid_cap; > > > > > > /* > > > * VFs must not implement the PASID Capability, but if a PF > > > * supports PASID, its VFs share the PF PASID configuration. > > > */ > > > if (pdev->is_virtfn) { > > > if (pci_physfn(pdev)->pasid_enabled) > > > return 0; > > > return -EINVAL; > > > } > > > > > > /** > > > * pci_disable_pasid - Disable the PASID capability > > > * @pdev: PCI device structure > > > */ > > > void pci_disable_pasid(struct pci_dev *pdev) > > > { > > > u16 control = 0; > > > int pasid = pdev->pasid_cap; > > > > > > /* VFs share the PF PASID configuration */ > > > if (pdev->is_virtfn) > > > return; > > > > > > > > > It doesn't block disabling PASID on PF even VFs are possibly using it. > > > > > hi > > I'm not sure, but is it possible for pci_enable_pasid() and > > pci_disable_pasid() to do the same thing as pdev->driver->sriov_configure, > > e.g. pci_sriov_configure_simple() below. > > > > It checks whether there are VFs are assigned in pci_vfs_assigned(dev). > > and we can set the VF in assigned status if vfio_pci_open() is performed > > on the VF. > > But you can still unbind the PF driver that magically causes the VF's to be > removed from the guest image too correct? > > Only the IOMMU mucks with pasid_enable/disable. And it doesn't look like > we have a path to disable without tearing down the PF binding. > > We originally had some refcounts and such and would do the real disable only > when the refcount drops to 0, but we found it wasn't actually necessary > to protect these resources like that. > right. now unbinding PF driver would cause VFs unplugged from guest. if we modify vfio_pci and set VFs to be assigned, then VFs could remain appearing in guest but it cannot function well as PF driver has been unbound. thanks for explanation :) > > > > > > int pci_sriov_configure_simple(struct pci_dev *dev, int nr_virtfn) > > { > > int rc; > > > > might_sleep(); > > > >
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
Hi Daniel, On Fri, 2020-04-17 at 09:03 +0800, Daniel Drake wrote: > Hi Joerg, > > > Hi, > > > > here is the second version of this patch-set. The first version with > > some more introductory text can be found here: > > > > https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/ > > Thanks for the continued improvements in this area! > > I may have spotted a problem with setups like VMD. > > The core PCI bus is set up during early boot. > Then, for the PCI bus, we reach iommu_bus_init() -> bus_iommu_probe(). > In there, we call probe_iommu_group() -> dev_iommu_get() for each PCI > device, which allocates dev->iommu in each case. So far so good. > > The problem is that this is the last time that we'll call dev_iommu_get(). > If any PCI bus devices get added after this point, they do not get passed > to dev_iommu_get(). > > So when the vmd module gets loaded later, and creates more PCI devices, > we end up in iommu_bus_notifier() -> iommu_probe_device() > -> __iommu_probe_device() which does: > > dev->iommu->iommu_dev = iommu_dev; > > dev->iommu-> is a NULL dereference because dev_iommu_get() was never > called for this new device. > > Daniel > I should have CCed you on this, but it should temporarily resolve that issue: https://lists.linuxfoundation.org/pipermail/iommu/2020-April/043253.html ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
Hi Joerg, > Hi, > > here is the second version of this patch-set. The first version with > some more introductory text can be found here: > > https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/ Thanks for the continued improvements in this area! I may have spotted a problem with setups like VMD. The core PCI bus is set up during early boot. Then, for the PCI bus, we reach iommu_bus_init() -> bus_iommu_probe(). In there, we call probe_iommu_group() -> dev_iommu_get() for each PCI device, which allocates dev->iommu in each case. So far so good. The problem is that this is the last time that we'll call dev_iommu_get(). If any PCI bus devices get added after this point, they do not get passed to dev_iommu_get(). So when the vmd module gets loaded later, and creates more PCI devices, we end up in iommu_bus_notifier() -> iommu_probe_device() -> __iommu_probe_device() which does: dev->iommu->iommu_dev = iommu_dev; dev->iommu-> is a NULL dereference because dev_iommu_get() was never called for this new device. Daniel ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 0/2] vfio/pci: expose device's PASID capability to VMs
On Tue, Mar 31, 2020 at 03:08:25PM +0800, Lu, Baolu wrote: > On 2020/3/31 14:35, Tian, Kevin wrote: > >> From: Liu, Yi L > >> Sent: Sunday, March 22, 2020 8:33 PM > >> > >> From: Liu Yi L > >> > >> Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on > >> Intel platforms allows address space sharing between device DMA and > >> applications. SVA can reduce programming complexity and enhance security. > >> > >> To enable SVA, device needs to have PASID capability, which is a key > >> capability for SVA. This patchset exposes the device's PASID capability > >> to guest instead of hiding it from guest. > >> > >> The second patch emulates PASID capability for VFs (Virtual Function) since > >> VFs don't implement such capability per PCIe spec. This patch emulates such > >> capability and expose to VM if the capability is enabled in PF (Physical > >> Function). > >> > >> However, there is an open for PASID emulation. If PF driver disables PASID > >> capability at runtime, then it may be an issue. e.g. PF should not disable > >> PASID capability if there is guest using this capability on any VF related > >> to this PF. To solve it, may need to introduce a generic communication > >> framework between vfio-pci driver and PF drivers. Please feel free to give > >> your suggestions on it. > > I'm not sure how this is addressed on bate metal today, i.e. between normal > > kernel PF and VF drivers. I look at pasid enable/disable code in > > intel-iommu.c. > > There is no check on PF/VF dependency so far. The cap is toggled when > > attaching/detaching the PF to its domain. Let's see how IOMMU guys > > respond, and if there is a way for VF driver to block PF driver from > > disabling > > the pasid cap when it's being actively used by VF driver, then we may > > leverage the same trick in VFIO when emulation is provided to guest. > > IOMMU subsystem doesn't expose any APIs for pasid enabling/disabling. > The PCI subsystem does. It handles VF/PF like below. > > /** > * pci_enable_pasid - Enable the PASID capability > * @pdev: PCI device structure > * @features: Features to enable > * > * Returns 0 on success, negative value on error. This function checks > * whether the features are actually supported by the device and returns > * an error if not. > */ > int pci_enable_pasid(struct pci_dev *pdev, int features) > { > u16 control, supported; > int pasid = pdev->pasid_cap; > > /* > * VFs must not implement the PASID Capability, but if a PF > * supports PASID, its VFs share the PF PASID configuration. > */ > if (pdev->is_virtfn) { > if (pci_physfn(pdev)->pasid_enabled) > return 0; > return -EINVAL; > } > > /** > * pci_disable_pasid - Disable the PASID capability > * @pdev: PCI device structure > */ > void pci_disable_pasid(struct pci_dev *pdev) > { > u16 control = 0; > int pasid = pdev->pasid_cap; > > /* VFs share the PF PASID configuration */ > if (pdev->is_virtfn) > return; > > > It doesn't block disabling PASID on PF even VFs are possibly using it. > hi I'm not sure, but is it possible for pci_enable_pasid() and pci_disable_pasid() to do the same thing as pdev->driver->sriov_configure, e.g. pci_sriov_configure_simple() below. It checks whether there are VFs are assigned in pci_vfs_assigned(dev). and we can set the VF in assigned status if vfio_pci_open() is performed on the VF. int pci_sriov_configure_simple(struct pci_dev *dev, int nr_virtfn) { int rc; might_sleep(); if (!dev->is_physfn) return -ENODEV; if (pci_vfs_assigned(dev)) { pci_warn(dev, "Cannot modify SR-IOV while VFs are assigned\n"); return -EPERM; } if (nr_virtfn == 0) { sriov_disable(dev); return 0; } rc = sriov_enable(dev, nr_virtfn); if (rc < 0) return rc; return nr_virtfn; } Thanks Yan ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 0/2] vfio/pci: expose device's PASID capability to VMs
Hi Zhao On Thu, Apr 16, 2020 at 06:12:26PM -0400, Yan Zhao wrote: > On Tue, Mar 31, 2020 at 03:08:25PM +0800, Lu, Baolu wrote: > > On 2020/3/31 14:35, Tian, Kevin wrote: > > >> From: Liu, Yi L > > >> Sent: Sunday, March 22, 2020 8:33 PM > > >> > > >> From: Liu Yi L > > >> > > >> Shared Virtual Addressing (SVA), a.k.a, Shared Virtual Memory (SVM) on > > >> Intel platforms allows address space sharing between device DMA and > > >> applications. SVA can reduce programming complexity and enhance security. > > >> > > >> To enable SVA, device needs to have PASID capability, which is a key > > >> capability for SVA. This patchset exposes the device's PASID capability > > >> to guest instead of hiding it from guest. > > >> > > >> The second patch emulates PASID capability for VFs (Virtual Function) > > >> since > > >> VFs don't implement such capability per PCIe spec. This patch emulates > > >> such > > >> capability and expose to VM if the capability is enabled in PF (Physical > > >> Function). > > >> > > >> However, there is an open for PASID emulation. If PF driver disables > > >> PASID > > >> capability at runtime, then it may be an issue. e.g. PF should not > > >> disable > > >> PASID capability if there is guest using this capability on any VF > > >> related > > >> to this PF. To solve it, may need to introduce a generic communication > > >> framework between vfio-pci driver and PF drivers. Please feel free to > > >> give > > >> your suggestions on it. > > > I'm not sure how this is addressed on bate metal today, i.e. between > > > normal > > > kernel PF and VF drivers. I look at pasid enable/disable code in > > > intel-iommu.c. > > > There is no check on PF/VF dependency so far. The cap is toggled when > > > attaching/detaching the PF to its domain. Let's see how IOMMU guys > > > respond, and if there is a way for VF driver to block PF driver from > > > disabling > > > the pasid cap when it's being actively used by VF driver, then we may > > > leverage the same trick in VFIO when emulation is provided to guest. > > > > IOMMU subsystem doesn't expose any APIs for pasid enabling/disabling. > > The PCI subsystem does. It handles VF/PF like below. > > > > /** > > * pci_enable_pasid - Enable the PASID capability > > * @pdev: PCI device structure > > * @features: Features to enable > > * > > * Returns 0 on success, negative value on error. This function checks > > * whether the features are actually supported by the device and returns > > * an error if not. > > */ > > int pci_enable_pasid(struct pci_dev *pdev, int features) > > { > > u16 control, supported; > > int pasid = pdev->pasid_cap; > > > > /* > > * VFs must not implement the PASID Capability, but if a PF > > * supports PASID, its VFs share the PF PASID configuration. > > */ > > if (pdev->is_virtfn) { > > if (pci_physfn(pdev)->pasid_enabled) > > return 0; > > return -EINVAL; > > } > > > > /** > > * pci_disable_pasid - Disable the PASID capability > > * @pdev: PCI device structure > > */ > > void pci_disable_pasid(struct pci_dev *pdev) > > { > > u16 control = 0; > > int pasid = pdev->pasid_cap; > > > > /* VFs share the PF PASID configuration */ > > if (pdev->is_virtfn) > > return; > > > > > > It doesn't block disabling PASID on PF even VFs are possibly using it. > > > hi > I'm not sure, but is it possible for pci_enable_pasid() and > pci_disable_pasid() to do the same thing as pdev->driver->sriov_configure, > e.g. pci_sriov_configure_simple() below. > > It checks whether there are VFs are assigned in pci_vfs_assigned(dev). > and we can set the VF in assigned status if vfio_pci_open() is performed > on the VF. But you can still unbind the PF driver that magically causes the VF's to be removed from the guest image too correct? Only the IOMMU mucks with pasid_enable/disable. And it doesn't look like we have a path to disable without tearing down the PF binding. We originally had some refcounts and such and would do the real disable only when the refcount drops to 0, but we found it wasn't actually necessary to protect these resources like that. > > > int pci_sriov_configure_simple(struct pci_dev *dev, int nr_virtfn) > { > int rc; > > might_sleep(); > > if (!dev->is_physfn) > return -ENODEV; > > if (pci_vfs_assigned(dev)) { > pci_warn(dev, "Cannot modify SR-IOV while VFs are > assigned\n"); > return -EPERM; > } > > if (nr_virtfn == 0) { > sriov_disable(dev); > return 0; > } > > rc = sriov_enable(dev, nr_virtfn); > if (rc < 0) > return rc; > > return nr_virtfn; > } > > Thanks > Yan ___ iommu mailing list i
Re: [PATCH 0/2] iommu: Remove iommu_sva_ops::mm_exit()
On Wed, 15 Apr 2020 09:47:36 +0200 Jean-Philippe Brucker wrote: > On Fri, Apr 10, 2020 at 08:52:49AM -0700, Jacob Pan wrote: > > On Thu, 9 Apr 2020 16:50:58 +0200 > > Jean-Philippe Brucker wrote: > > > > > > So unbind is coming anyway, the difference in handling in mmu > > > > release notifier is whether we silently drop DMA fault vs. > > > > reporting fault? > > > > > > What I meant is, between mmu release notifier and unbind(), we > > > can't print any error from DMA fault on dmesg, because an mm exit > > > is easily triggered by userspace. Look at the lifetime of the > > > bond: > > > > > > bind() > > > | > > > : Here any DMA fault is handled by mm, and on error we don't > > > print : anything to dmesg. Userspace can easily trigger faults by > > > issuing DMA : on unmapped buffers. > > > | > > > mm exit -> clear pgd, invalidate IOTLBs > > > | > > > : Here the PASID descriptor doesn't have the pgd anymore, but we > > > don't : print out any error to dmesg either. DMA is likely still > > > running but : any fault has to be ignored. > > > : > > > : We also can't free the PASID yet, since transactions are still > > > coming : in with this PASID. > > > | > > > unbind() -> clear context descriptor, release PASID and mmu > > > notifier | > > > : Here the PASID descriptor is clear. If DMA is still running the > > > device : driver really messed up and we have to print out any > > > fault. > > > > > > For that middle state I had to introduce a new pasid descriptor > > > state in the SMMU driver, to avoid reporting errors between mm > > > exit and unbind(). > > I must have missed something, but why bother with a state when you > > can always check if the mm is dead by mmget_not_zero()? You would > > not handle IOPF if the mm is dead anyway, similarly for other DMA > > errors. > > In the SMMU a cleared PASID descriptor results in unrecoverable > faults, which do not go through the I/O page fault handler. I've been > thinking about injecting everything to the IOPF handler, recoverable > or not, but filtering down the stream is complicated. Most of the > time outside this small window, we really need to print out those > messages because they would indicate serious bugs. > VT-d also results in unrecoverable fault for a cleared PASID. I am assuming in the fault record, SMMU can also identify the PASID and source ID. So that should be able to find the matching mm. Then you can check if the mm is defunct? > > Also, since you are not freeing ioasid in mmu_notifier release > > anymore, does it mean the IOASID notifier chain can be non-atomic? > > Unfortunately not, ioasid_free() is called from > mmu_notifier_ops::free_notifier() in the RCU callback that results > from mmu_notifier_put(). > I agree. I looked at the code, it is much more clean with the mmu_notifier_get/put. I am thinking perhaps adding a reclaim mechanism such that IOASID not directly freed can stay in an in_active list (while waiting for its states get cleared) until it can be reclaimed. Do you see this is useful for SMMU? This is useful for VT-d, since we have more consumers for a given PASID, i.e. VMCS, VDCM, and IOMMU. Each consumer has its own PASID context to clean up. Thanks for the explanation! > Thanks, > Jean [Jacob Pan] ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 11/29] mm: only allow page table mappings for built-in zsmalloc
On Tue, Apr 14, 2020 at 03:13:30PM +0200, Christoph Hellwig wrote: > This allows to unexport map_vm_area and unmap_kernel_range, which are > rather deep internal and should not be available to modules, as they for > example allow fine grained control of mapping permissions, and also > allow splitting the setup of a vmalloc area and the actual mapping and > thus expose vmalloc internals. > > zsmalloc is typically built-in and continues to work (just like the > percpu-vm code using a similar patter), while modular zsmalloc also > continues to work, but must use copies. > > Signed-off-by: Christoph Hellwig > Acked-by: Peter Zijlstra (Intel) Acked-by: Minchan Kim Thanks! ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 10/28] mm: only allow page table mappings for built-in zsmalloc
Hi Christoph, Sorry for the late. On Sat, Apr 11, 2020 at 09:20:52AM +0200, Christoph Hellwig wrote: > Hi Minchan, > > On Fri, Apr 10, 2020 at 04:11:36PM -0700, Minchan Kim wrote: > > It doesn't mean we couldn't use zsmalloc as module any longer. It means > > we couldn't use zsmalloc as module with pgtable mapping whcih was little > > bit faster on microbenchmark in some architecutre(However, I usually temped > > to remove it since it had several problems). However, we could still use > > zsmalloc as module as copy way instead of pgtable mapping. Thus, if someone > > really want to rollback the feature, they should provide reasonable reason > > why it doesn't work for them. "A little fast" wouldn't be enough to exports > > deep internal to the module. > > do you have any data how much faster it is on arm (and does that include > arm64 as well)? Besides the exports which were my prime concern, https://github.com/sjenning/zsmapbench I need to recall the memory. IIRC, it was almost 30% faster at that time in ARM so was not trivial at that time. However, it was story from several years ago. > zsmalloc with pgtable mappings also is the only user of map_kernel_range > outside of vmalloc.c, if it really is another code base for tiny > improvements we could mark map_kernel_range or in fact remove it entirely > and open code it in the remaining callers. I alsh have temped to remove it. Let me have time to revist it in this chance. Thanks. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/arm-smmu: Allow client devices to select direct mapping
Hi Robin, On 2020-04-16 22:47, Robin Murphy wrote: On 2020-04-16 5:23 pm, Sai Prakash Ranjan wrote: Hi Robin, On 2020-04-16 19:28, Robin Murphy wrote: On 2020-01-22 11:48 am, Sai Prakash Ranjan wrote: From: Jordan Crouse Some client devices want to directly map the IOMMU themselves instead of using the DMA domain. Allow those devices to opt in to direct mapping by way of a list of compatible strings. Signed-off-by: Jordan Crouse Co-developed-by: Sai Prakash Ranjan Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm-smmu-qcom.c | 39 +++ drivers/iommu/arm-smmu.c | 3 +++ drivers/iommu/arm-smmu.h | 5 + 3 files changed, 47 insertions(+) diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c index 64a4ab270ab7..ff746acd1c81 100644 --- a/drivers/iommu/arm-smmu-qcom.c +++ b/drivers/iommu/arm-smmu-qcom.c @@ -3,6 +3,7 @@ * Copyright (c) 2019, The Linux Foundation. All rights reserved. */ +#include #include #include "arm-smmu.h" @@ -11,6 +12,43 @@ struct qcom_smmu { struct arm_smmu_device smmu; }; +static const struct arm_smmu_client_match_data qcom_adreno = { + .direct_mapping = true, +}; + +static const struct arm_smmu_client_match_data qcom_mdss = { + .direct_mapping = true, +}; Might it make sense to group these by the desired SMMU behaviour rather than (apparently) what kind of device the client happens to be, which seems like a completely arbitrary distinction from the SMMU driver's PoV? Sorry, I did not get the "grouping by the desired SMMU behaviour" thing. Could you please give some more details? I mean this pattern: device_a_data { .thing = this; } device_b_data { .thing = this; } device_c_data { .thing = that; } match[] = { { .compatible = "A", .data = &device_a_data }, { .compatible = "B", .data = &device_b_data }, { .compatible = "C", .data = &device_c_data }, }; ...vs. this pattern: do_this { .thing = this; } do_that { .thing = that; } match[] = { { .compatible = "A", .data = &do_this }, { .compatible = "B", .data = &do_this }, { .compatible = "C", .data = &do_that }, }; From the perspective of the thing doing the thing, grouping the data by device is meaningless if all that matters is whether to do this or that. The second pattern expresses that more naturally. Of course if every device turns out to need a unique combination of several behaviours, then there ends up being no practical difference except that it's probably easier to come up with nice names under the first pattern. I guess it's up to how you see this developing in future; whether you're likely to need fine-grained per-device control, or don't expect it to go much beyond domain type. Thanks for explaining *this thing* :) I will update the patch to follow the 2nd pattern as it makes more sense to do_this or do_that directly. I'm not expecting anything other than domain type atleast for now but hey we can always add the functionality if the need arises. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/arm-smmu: Allow client devices to select direct mapping
On 2020-04-16 5:23 pm, Sai Prakash Ranjan wrote: Hi Robin, On 2020-04-16 19:28, Robin Murphy wrote: On 2020-01-22 11:48 am, Sai Prakash Ranjan wrote: From: Jordan Crouse Some client devices want to directly map the IOMMU themselves instead of using the DMA domain. Allow those devices to opt in to direct mapping by way of a list of compatible strings. Signed-off-by: Jordan Crouse Co-developed-by: Sai Prakash Ranjan Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm-smmu-qcom.c | 39 +++ drivers/iommu/arm-smmu.c | 3 +++ drivers/iommu/arm-smmu.h | 5 + 3 files changed, 47 insertions(+) diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c index 64a4ab270ab7..ff746acd1c81 100644 --- a/drivers/iommu/arm-smmu-qcom.c +++ b/drivers/iommu/arm-smmu-qcom.c @@ -3,6 +3,7 @@ * Copyright (c) 2019, The Linux Foundation. All rights reserved. */ +#include #include #include "arm-smmu.h" @@ -11,6 +12,43 @@ struct qcom_smmu { struct arm_smmu_device smmu; }; +static const struct arm_smmu_client_match_data qcom_adreno = { + .direct_mapping = true, +}; + +static const struct arm_smmu_client_match_data qcom_mdss = { + .direct_mapping = true, +}; Might it make sense to group these by the desired SMMU behaviour rather than (apparently) what kind of device the client happens to be, which seems like a completely arbitrary distinction from the SMMU driver's PoV? Sorry, I did not get the "grouping by the desired SMMU behaviour" thing. Could you please give some more details? I mean this pattern: device_a_data { .thing = this; } device_b_data { .thing = this; } device_c_data { .thing = that; } match[] = { { .compatible = "A", .data = &device_a_data }, { .compatible = "B", .data = &device_b_data }, { .compatible = "C", .data = &device_c_data }, }; ...vs. this pattern: do_this { .thing = this; } do_that { .thing = that; } match[] = { { .compatible = "A", .data = &do_this }, { .compatible = "B", .data = &do_this }, { .compatible = "C", .data = &do_that }, }; From the perspective of the thing doing the thing, grouping the data by device is meaningless if all that matters is whether to do this or that. The second pattern expresses that more naturally. Of course if every device turns out to need a unique combination of several behaviours, then there ends up being no practical difference except that it's probably easier to come up with nice names under the first pattern. I guess it's up to how you see this developing in future; whether you're likely to need fine-grained per-device control, or don't expect it to go much beyond domain type. Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 2/2] iommu/arm-smmu: Allow client devices to select direct mapping
Hi Robin, On 2020-04-16 19:28, Robin Murphy wrote: On 2020-01-22 11:48 am, Sai Prakash Ranjan wrote: From: Jordan Crouse Some client devices want to directly map the IOMMU themselves instead of using the DMA domain. Allow those devices to opt in to direct mapping by way of a list of compatible strings. Signed-off-by: Jordan Crouse Co-developed-by: Sai Prakash Ranjan Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm-smmu-qcom.c | 39 +++ drivers/iommu/arm-smmu.c | 3 +++ drivers/iommu/arm-smmu.h | 5 + 3 files changed, 47 insertions(+) diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c index 64a4ab270ab7..ff746acd1c81 100644 --- a/drivers/iommu/arm-smmu-qcom.c +++ b/drivers/iommu/arm-smmu-qcom.c @@ -3,6 +3,7 @@ * Copyright (c) 2019, The Linux Foundation. All rights reserved. */ +#include #include #include "arm-smmu.h" @@ -11,6 +12,43 @@ struct qcom_smmu { struct arm_smmu_device smmu; }; +static const struct arm_smmu_client_match_data qcom_adreno = { + .direct_mapping = true, +}; + +static const struct arm_smmu_client_match_data qcom_mdss = { + .direct_mapping = true, +}; Might it make sense to group these by the desired SMMU behaviour rather than (apparently) what kind of device the client happens to be, which seems like a completely arbitrary distinction from the SMMU driver's PoV? Sorry, I did not get the "grouping by the desired SMMU behaviour" thing. Could you please give some more details? + +static const struct of_device_id qcom_smmu_client_of_match[] = { + { .compatible = "qcom,adreno", .data = &qcom_adreno }, + { .compatible = "qcom,mdp4", .data = &qcom_mdss }, + { .compatible = "qcom,mdss", .data = &qcom_mdss }, + { .compatible = "qcom,sc7180-mdss", .data = &qcom_mdss }, + { .compatible = "qcom,sdm845-mdss", .data = &qcom_mdss }, + {}, +}; + +static const struct arm_smmu_client_match_data * +qcom_smmu_client_data(struct device *dev) +{ + const struct of_device_id *match = + of_match_device(qcom_smmu_client_of_match, dev); + + return match ? match->data : NULL; of_device_get_match_data() is your friend. Ok will use it. +} + +static int qcom_smmu_request_domain(struct device *dev) +{ + const struct arm_smmu_client_match_data *client; + + client = qcom_smmu_client_data(dev); + if (client) + iommu_request_dm_for_dev(dev); + + return 0; +} + static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu) { int ret; @@ -41,6 +79,7 @@ static int qcom_smmu500_reset(struct arm_smmu_device *smmu) } static const struct arm_smmu_impl qcom_smmu_impl = { + .req_domain = qcom_smmu_request_domain, .reset = qcom_smmu500_reset, }; diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 16c4b87af42b..67dd9326247a 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -1448,6 +1448,9 @@ static int arm_smmu_add_device(struct device *dev) device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER); + if (smmu->impl && smmu->impl->req_domain) + return smmu->impl->req_domain(dev); + There are about 5 different patchsets flying around at the moment that all touch default domain allocation, so this is a fast-moving target, but I think where the dust should settle is with arm_smmu_ops forwarding .def_domain_type (or whatever it ends up as) calls to arm_smmu_impl as appropriate. I'll wait till the dust settles down and then post the next version. return 0; out_cfg_free: diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h index 8d1cd54d82a6..059dc9c39f64 100644 --- a/drivers/iommu/arm-smmu.h +++ b/drivers/iommu/arm-smmu.h @@ -244,6 +244,10 @@ enum arm_smmu_arch_version { ARM_SMMU_V2, }; +struct arm_smmu_client_match_data { + bool direct_mapping; +}; Does this need to be public? I don't see the other users... Will move this out. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
Hi Kevin, On 4/16/20 3:28 PM, Tian, Kevin wrote: >> From: Auger Eric >> Sent: Thursday, April 16, 2020 8:43 PM >> >> Hi Kevin, >> On 4/16/20 2:09 PM, Tian, Kevin wrote: From: Liu, Yi L Sent: Thursday, April 16, 2020 6:40 PM Hi Alex, Still have a direction question with you. Better get agreement with you before heading forward. > From: Alex Williamson > Sent: Friday, April 3, 2020 11:35 PM [...] + * + * returns: 0 on success, -errno on failure. + */ +struct vfio_iommu_type1_cache_invalidate { + __u32 argsz; + __u32 flags; + struct iommu_cache_invalidate_info cache_info; +}; +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > VFIO_BASE >>> + 24) >>> >>> The future extension capabilities of this ioctl worry me, I wonder if >>> we should do another data[] with flag defining that data as CACHE_INFO. >> >> Can you elaborate? Does it mean with this way we don't rely on iommu >> driver to provide version_to_size conversion and instead we just pass >> data[] to iommu driver for further audit? > > No, my concern is that this ioctl has a single function, strictly tied > to the iommu uapi. If we replace cache_info with data[] then we can > define a flag to specify that data[] is struct > iommu_cache_invalidate_info, and if we need to, a different flag to > identify data[] as something else. For example if we get stuck > expanding cache_info to meet new demands and develop a new uapi to > solve that, how would we expand this ioctl to support it rather than > also create a new ioctl? There's also a trade-off in making the ioctl > usage more difficult for the user. I'd still expect the vfio layer to > check the flag and interpret data[] as indicated by the flag rather > than just passing a blob of opaque data to the iommu layer though. > Thanks, Based on your comments about defining a single ioctl and a unified vfio structure (with a @data[] field) for pasid_alloc/free, bind/ unbind_gpasid, cache_inv. After some offline trying, I think it would be good for bind/unbind_gpasid and cache_inv as both of them use the iommu uapi definition. While the pasid alloc/free operation doesn't. It would be weird to put all of them together. So pasid alloc/free may have a separate ioctl. It would look as below. Does this direction look good per your opinion? ioctl #22: VFIO_IOMMU_PASID_REQUEST /** * @pasid: used to return the pasid alloc result when flags == >> ALLOC_PASID * specify a pasid to be freed when flags == FREE_PASID * @range: specify the allocation range when flags == ALLOC_PASID */ struct vfio_iommu_pasid_request { __u32 argsz; #define VFIO_IOMMU_ALLOC_PASID (1 << 0) #define VFIO_IOMMU_FREE_PASID (1 << 1) __u32 flags; __u32 pasid; struct { __u32 min; __u32 max; } range; }; ioctl #23: VFIO_IOMMU_NESTING_OP struct vfio_iommu_type1_nesting_op { __u32 argsz; __u32 flags; __u32 op; __u8data[]; }; /* Nesting Ops */ #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 >>> >>> Then why cannot we just put PASID into the header since the >>> majority of nested usage is associated with a pasid? >>> >>> ioctl #23: VFIO_IOMMU_NESTING_OP >>> struct vfio_iommu_type1_nesting_op { >>> __u32 argsz; >>> __u32 flags; >>> __u32 op; >>> __u32 pasid; >>> __u8data[]; >>> }; >>> >>> In case of SMMUv2 which supports nested w/o PASID, this field can >>> be ignored for that specific case. >> On my side I would prefer keeping the pasid in the data[]. This is not >> always used. >> >> For instance, in iommu_cache_invalidate_info/iommu_inv_pasid_info we >> devised flags to tell whether the PASID is used. >> > > But don't we include a PASID in both invalidate structures already? The pasid presence is indicated by the IOMMU_INV_ADDR_FLAGS_PASID flag. For instance for nested stage SMMUv3 I current performs an ARCHID (asid) based invalidation only. Eric > > struct iommu_inv_addr_info { > #define IOMMU_INV_ADDR_FLAGS_PASID (1 << 0) > #define IOMMU_INV_ADDR_FLAGS_ARCHID (1 << 1) > #define IOMMU_INV_ADDR_FLAGS_LEAF (1 << 2) > __u32 flags; > __u32 archid; > __u64 pasid; > __u64 addr; > __u64 granule_size; > __u64 nb_granules; > }; > > struct iommu_inv_pasid_info { > #define IOMMU_INV_PASID_FLAGS_PASID (1 << 0) > #define IOMMU_INV_PASID_FLAGS_ARCHID(1 << 1) > __u32 flags; > __u32 archid
Re: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
On Thu, 16 Apr 2020 08:40:31 -0600 Alex Williamson wrote: > On Thu, 16 Apr 2020 10:40:03 + > "Liu, Yi L" wrote: > > > Hi Alex, > > Still have a direction question with you. Better get agreement with you > > before heading forward. > > > > > From: Alex Williamson > > > Sent: Friday, April 3, 2020 11:35 PM > > [...] > > > > > > + * > > > > > > + * returns: 0 on success, -errno on failure. > > > > > > + */ > > > > > > +struct vfio_iommu_type1_cache_invalidate { > > > > > > + __u32 argsz; > > > > > > + __u32 flags; > > > > > > + struct iommu_cache_invalidate_info cache_info; > > > > > > +}; > > > > > > +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > > > VFIO_BASE > > > > > + 24) > > > > > > > > > > The future extension capabilities of this ioctl worry me, I wonder if > > > > > we should do another data[] with flag defining that data as > > > > > CACHE_INFO. > > > > > > > > Can you elaborate? Does it mean with this way we don't rely on iommu > > > > driver to provide version_to_size conversion and instead we just pass > > > > data[] to iommu driver for further audit? > > > > > > No, my concern is that this ioctl has a single function, strictly tied > > > to the iommu uapi. If we replace cache_info with data[] then we can > > > define a flag to specify that data[] is struct > > > iommu_cache_invalidate_info, and if we need to, a different flag to > > > identify data[] as something else. For example if we get stuck > > > expanding cache_info to meet new demands and develop a new uapi to > > > solve that, how would we expand this ioctl to support it rather than > > > also create a new ioctl? There's also a trade-off in making the ioctl > > > usage more difficult for the user. I'd still expect the vfio layer to > > > check the flag and interpret data[] as indicated by the flag rather > > > than just passing a blob of opaque data to the iommu layer though. > > > Thanks, > > > > Based on your comments about defining a single ioctl and a unified > > vfio structure (with a @data[] field) for pasid_alloc/free, bind/ > > unbind_gpasid, cache_inv. After some offline trying, I think it would > > be good for bind/unbind_gpasid and cache_inv as both of them use the > > iommu uapi definition. While the pasid alloc/free operation doesn't. > > It would be weird to put all of them together. So pasid alloc/free > > may have a separate ioctl. It would look as below. Does this direction > > look good per your opinion? > > > > ioctl #22: VFIO_IOMMU_PASID_REQUEST > > /** > > * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID > > * specify a pasid to be freed when flags == FREE_PASID > > * @range: specify the allocation range when flags == ALLOC_PASID > > */ > > struct vfio_iommu_pasid_request { > > __u32 argsz; > > #define VFIO_IOMMU_ALLOC_PASID (1 << 0) > > #define VFIO_IOMMU_FREE_PASID (1 << 1) > > __u32 flags; > > __u32 pasid; > > struct { > > __u32 min; > > __u32 max; > > } range; > > }; > > Can't the ioctl return the pasid valid on alloc (like GET_DEVICE_FD)? s/valid/value/ > Would it be useful to support freeing a range of pasids? If so then we > could simply use range for both, ie. allocate a pasid from this range > and return it, or free all pasids in this range? vfio already needs to > track pasids to free them on release, so presumably this is something > we could support easily. > > > ioctl #23: VFIO_IOMMU_NESTING_OP > > struct vfio_iommu_type1_nesting_op { > > __u32 argsz; > > __u32 flags; > > __u32 op; > > __u8data[]; > > }; > > data only has 4-byte alignment, I think we really want it at an 8-byte > alignment. This is why I embedded the "op" into the flag for > DEVICE_FEATURE. Thanks, > > Alex > > > > > /* Nesting Ops */ > > #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 > > #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 > > #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 > > > > Thanks, > > Yi Liu > > > > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
On Thu, 16 Apr 2020 10:40:03 + "Liu, Yi L" wrote: > Hi Alex, > Still have a direction question with you. Better get agreement with you > before heading forward. > > > From: Alex Williamson > > Sent: Friday, April 3, 2020 11:35 PM > [...] > > > > > + * > > > > > + * returns: 0 on success, -errno on failure. > > > > > + */ > > > > > +struct vfio_iommu_type1_cache_invalidate { > > > > > + __u32 argsz; > > > > > + __u32 flags; > > > > > + struct iommu_cache_invalidate_info cache_info; > > > > > +}; > > > > > +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > > VFIO_BASE > > > > + 24) > > > > > > > > The future extension capabilities of this ioctl worry me, I wonder if > > > > we should do another data[] with flag defining that data as CACHE_INFO. > > > > > > > > > > Can you elaborate? Does it mean with this way we don't rely on iommu > > > driver to provide version_to_size conversion and instead we just pass > > > data[] to iommu driver for further audit? > > > > No, my concern is that this ioctl has a single function, strictly tied > > to the iommu uapi. If we replace cache_info with data[] then we can > > define a flag to specify that data[] is struct > > iommu_cache_invalidate_info, and if we need to, a different flag to > > identify data[] as something else. For example if we get stuck > > expanding cache_info to meet new demands and develop a new uapi to > > solve that, how would we expand this ioctl to support it rather than > > also create a new ioctl? There's also a trade-off in making the ioctl > > usage more difficult for the user. I'd still expect the vfio layer to > > check the flag and interpret data[] as indicated by the flag rather > > than just passing a blob of opaque data to the iommu layer though. > > Thanks, > > Based on your comments about defining a single ioctl and a unified > vfio structure (with a @data[] field) for pasid_alloc/free, bind/ > unbind_gpasid, cache_inv. After some offline trying, I think it would > be good for bind/unbind_gpasid and cache_inv as both of them use the > iommu uapi definition. While the pasid alloc/free operation doesn't. > It would be weird to put all of them together. So pasid alloc/free > may have a separate ioctl. It would look as below. Does this direction > look good per your opinion? > > ioctl #22: VFIO_IOMMU_PASID_REQUEST > /** > * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID > * specify a pasid to be freed when flags == FREE_PASID > * @range: specify the allocation range when flags == ALLOC_PASID > */ > struct vfio_iommu_pasid_request { > __u32 argsz; > #define VFIO_IOMMU_ALLOC_PASID(1 << 0) > #define VFIO_IOMMU_FREE_PASID (1 << 1) > __u32 flags; > __u32 pasid; > struct { > __u32 min; > __u32 max; > } range; > }; Can't the ioctl return the pasid valid on alloc (like GET_DEVICE_FD)? Would it be useful to support freeing a range of pasids? If so then we could simply use range for both, ie. allocate a pasid from this range and return it, or free all pasids in this range? vfio already needs to track pasids to free them on release, so presumably this is something we could support easily. > ioctl #23: VFIO_IOMMU_NESTING_OP > struct vfio_iommu_type1_nesting_op { > __u32 argsz; > __u32 flags; > __u32 op; > __u8data[]; > }; data only has 4-byte alignment, I think we really want it at an 8-byte alignment. This is why I embedded the "op" into the flag for DEVICE_FEATURE. Thanks, Alex > > /* Nesting Ops */ > #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 > #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 > #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 > > Thanks, > Yi Liu > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH 1/4] bus: fsl-mc: add custom .dma_configure implementation
On 4/15/2020 7:04 PM, Lorenzo Pieralisi wrote: > On Wed, Apr 15, 2020 at 06:44:37PM +0300, Laurentiu Tudor wrote: >> >> >> On 4/14/2020 5:32 PM, Lorenzo Pieralisi wrote: >>> On Wed, Mar 25, 2020 at 06:48:55PM +0200, Laurentiu Tudor wrote: Hi Lorenzo, On 3/25/2020 2:51 PM, Lorenzo Pieralisi wrote: > On Thu, Feb 27, 2020 at 12:05:39PM +0200, laurentiu.tu...@nxp.com wrote: >> From: Laurentiu Tudor >> >> The devices on this bus are not discovered by way of device tree >> but by queries to the firmware. It makes little sense to trick the >> generic of layer into thinking that these devices are of related so >> that we can get our dma configuration. Instead of doing that, add >> our custom dma configuration implementation. >> >> Signed-off-by: Laurentiu Tudor >> --- >> drivers/bus/fsl-mc/fsl-mc-bus.c | 31 ++- >> 1 file changed, 30 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c >> b/drivers/bus/fsl-mc/fsl-mc-bus.c >> index 36eb25f82c8e..eafaa0e0b906 100644 >> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c >> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c >> @@ -132,11 +132,40 @@ static int fsl_mc_bus_uevent(struct device *dev, >> struct kobj_uevent_env *env) >> static int fsl_mc_dma_configure(struct device *dev) >> { >> struct device *dma_dev = dev; >> +struct iommu_fwspec *fwspec; >> +const struct iommu_ops *iommu_ops; >> +struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev); >> +int ret; >> +u32 icid; >> >> while (dev_is_fsl_mc(dma_dev)) >> dma_dev = dma_dev->parent; >> >> -return of_dma_configure(dev, dma_dev->of_node, 0); >> +fwspec = dev_iommu_fwspec_get(dma_dev); >> +if (!fwspec) >> +return -ENODEV; >> +iommu_ops = iommu_ops_from_fwnode(fwspec->iommu_fwnode); >> +if (!iommu_ops) >> +return -ENODEV; >> + >> +ret = iommu_fwspec_init(dev, fwspec->iommu_fwnode, iommu_ops); >> +if (ret) >> +return ret; >> + >> +icid = mc_dev->icid; >> +ret = iommu_fwspec_add_ids(dev, &icid, 1); > > I see. So with this patch we would use the MC named component only to > retrieve the iommu_ops Right. I'd also add that the implementation tries to follow the existing standard .dma_configure implementations, e.g. of_dma_configure + of_iommu_configure. I'd also note that similarly to the ACPI case, this MC FW device is probed as a platform device in the DT scenario, binding here [1]. A similar approach is used for the retrieval of the msi irq domain, see following patch. > - the streamid are injected directly here bypassing OF/IORT bindings > translations altogether. Actually I've submitted a v2 [2] that calls into .of_xlate() to allow the smmu driver to do some processing on the raw streamid coming from the firmware. I have not yet tested this with ACPI but expect it to work, however, it's debatable how valid is this approach in the context of ACPI. >>> >>> Actually, what I think you need is of_map_rid() (and an IORT >>> equivalent, that I am going to write - generalizing iort_msi_map_rid()). >>> >>> Would that be enough to enable IORT "normal" mappings in the MC bus >>> named components ? >>> >> >> At a first glance, looks like this could very well fix the ACPI >> scenario, but I have some unclarities on the approach: >> * are we going to rely in DT and ACPI generic layers even if these >> devices are not published / enumerated through DT or ACPI tables? >> * the firmware manages and provides discrete streamids for the devices >> it exposes so there's no translation involved. There's no >>requestor_id / input_id involved but it seems that we would still do >> some kind of translation relying for this on the DT/ACPI functions. >> * MC firmware has its own stream_id (e.g. on some chips 0x4000, others >> 0xf00, so outside the range of stream_ids used for the mc devices) >>while for the devices on this bus, MC allocates stream_ids from a >> range (e.g. 0x17 - 0x3f). Is it possible to describe this in the IORT table? >> * Regarding the of_map_rid() use you mentioned, I was planning to >> decouple the mc bus from the DT layer by dropping the use of >> of_map_rid(), see patch 4. >> I briefly glanced over the iort code and spotted this static function: >> iort_iommu_xlate(). Wouldn't it also help, of course after making it public? > > Guys I have lost you honestly. I don't understand what you really need > to do with DT and ACPI here. Are they needed to describe what you need > or not ? If the MC dma configure function does not need any DT/ACPI > bindings that's fine by me, I don't und
Re: [PATCH 2/2] iommu/arm-smmu: Allow client devices to select direct mapping
On 2020-01-22 11:48 am, Sai Prakash Ranjan wrote: From: Jordan Crouse Some client devices want to directly map the IOMMU themselves instead of using the DMA domain. Allow those devices to opt in to direct mapping by way of a list of compatible strings. Signed-off-by: Jordan Crouse Co-developed-by: Sai Prakash Ranjan Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm-smmu-qcom.c | 39 +++ drivers/iommu/arm-smmu.c | 3 +++ drivers/iommu/arm-smmu.h | 5 + 3 files changed, 47 insertions(+) diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c index 64a4ab270ab7..ff746acd1c81 100644 --- a/drivers/iommu/arm-smmu-qcom.c +++ b/drivers/iommu/arm-smmu-qcom.c @@ -3,6 +3,7 @@ * Copyright (c) 2019, The Linux Foundation. All rights reserved. */ +#include #include #include "arm-smmu.h" @@ -11,6 +12,43 @@ struct qcom_smmu { struct arm_smmu_device smmu; }; +static const struct arm_smmu_client_match_data qcom_adreno = { + .direct_mapping = true, +}; + +static const struct arm_smmu_client_match_data qcom_mdss = { + .direct_mapping = true, +}; Might it make sense to group these by the desired SMMU behaviour rather than (apparently) what kind of device the client happens to be, which seems like a completely arbitrary distinction from the SMMU driver's PoV? + +static const struct of_device_id qcom_smmu_client_of_match[] = { + { .compatible = "qcom,adreno", .data = &qcom_adreno }, + { .compatible = "qcom,mdp4", .data = &qcom_mdss }, + { .compatible = "qcom,mdss", .data = &qcom_mdss }, + { .compatible = "qcom,sc7180-mdss", .data = &qcom_mdss }, + { .compatible = "qcom,sdm845-mdss", .data = &qcom_mdss }, + {}, +}; + +static const struct arm_smmu_client_match_data * +qcom_smmu_client_data(struct device *dev) +{ + const struct of_device_id *match = + of_match_device(qcom_smmu_client_of_match, dev); + + return match ? match->data : NULL; of_device_get_match_data() is your friend. +} + +static int qcom_smmu_request_domain(struct device *dev) +{ + const struct arm_smmu_client_match_data *client; + + client = qcom_smmu_client_data(dev); + if (client) + iommu_request_dm_for_dev(dev); + + return 0; +} + static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu) { int ret; @@ -41,6 +79,7 @@ static int qcom_smmu500_reset(struct arm_smmu_device *smmu) } static const struct arm_smmu_impl qcom_smmu_impl = { + .req_domain = qcom_smmu_request_domain, .reset = qcom_smmu500_reset, }; diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 16c4b87af42b..67dd9326247a 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -1448,6 +1448,9 @@ static int arm_smmu_add_device(struct device *dev) device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_SUPPLIER); + if (smmu->impl && smmu->impl->req_domain) + return smmu->impl->req_domain(dev); + There are about 5 different patchsets flying around at the moment that all touch default domain allocation, so this is a fast-moving target, but I think where the dust should settle is with arm_smmu_ops forwarding .def_domain_type (or whatever it ends up as) calls to arm_smmu_impl as appropriate. return 0; out_cfg_free: diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h index 8d1cd54d82a6..059dc9c39f64 100644 --- a/drivers/iommu/arm-smmu.h +++ b/drivers/iommu/arm-smmu.h @@ -244,6 +244,10 @@ enum arm_smmu_arch_version { ARM_SMMU_V2, }; +struct arm_smmu_client_match_data { + bool direct_mapping; +}; Does this need to be public? I don't see the other users... Robin. + enum arm_smmu_implementation { GENERIC_SMMU, ARM_MMU500, @@ -386,6 +390,7 @@ struct arm_smmu_impl { int (*init_context)(struct arm_smmu_domain *smmu_domain); void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync, int status); + int (*req_domain)(struct device *dev); }; static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu: arm-smmu-impl: Convert to a generic reset implementation
On 2020-01-22 11:48 am, Sai Prakash Ranjan wrote: Currently the QCOM specific smmu reset implementation is very specific to SDM845 SoC and has a wait-for-safe logic which may not be required for other SoCs. So move the SDM845 specific logic to its specific reset function. Also add SC7180 SMMU compatible for calling into QCOM specific implementation. Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/arm-smmu-impl.c | 8 +--- drivers/iommu/arm-smmu-qcom.c | 16 +--- 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c index 74d97a886e93..c75b9d957b70 100644 --- a/drivers/iommu/arm-smmu-impl.c +++ b/drivers/iommu/arm-smmu-impl.c @@ -150,6 +150,8 @@ static const struct arm_smmu_impl arm_mmu500_impl = { struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu) { + const struct device_node *np = smmu->dev->of_node; + /* * We will inevitably have to combine model-specific implementation * quirks with platform-specific integration quirks, but everything @@ -166,11 +168,11 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu) break; } - if (of_property_read_bool(smmu->dev->of_node, - "calxeda,smmu-secure-config-access")) + if (of_property_read_bool(np, "calxeda,smmu-secure-config-access")) smmu->impl = &calxeda_impl; - if (of_device_is_compatible(smmu->dev->of_node, "qcom,sdm845-smmu-500")) + if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") || + of_device_is_compatible(np, "qcom,sc7180-smmu-500")) return qcom_smmu_impl_init(smmu); return smmu; diff --git a/drivers/iommu/arm-smmu-qcom.c b/drivers/iommu/arm-smmu-qcom.c index 24c071c1d8b0..64a4ab270ab7 100644 --- a/drivers/iommu/arm-smmu-qcom.c +++ b/drivers/iommu/arm-smmu-qcom.c @@ -15,8 +15,6 @@ static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu) { int ret; - arm_mmu500_reset(smmu); - /* * To address performance degradation in non-real time clients, * such as USB and UFS, turn off wait-for-safe on sdm845 based boards, @@ -30,8 +28,20 @@ static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu) return ret; } +static int qcom_smmu500_reset(struct arm_smmu_device *smmu) +{ + const struct device_node *np = smmu->dev->of_node; + + arm_mmu500_reset(smmu); + + if (of_device_is_compatible(np, "qcom,sdm845-smmu-500")) + return qcom_sdm845_smmu500_reset(smmu); + + return 0; +} + static const struct arm_smmu_impl qcom_smmu_impl = { - .reset = qcom_sdm845_smmu500_reset, + .reset = qcom_smmu500_reset, }; It might be logical to have a separate SDM845 impl rather than indirecting within the callback itself, but I'm not too concerned either way. For the arm-smmu-impl.c changes, Reviewed-by: Robin Murphy Thanks, Robin. struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
> From: Auger Eric > Sent: Thursday, April 16, 2020 8:43 PM > > Hi Kevin, > On 4/16/20 2:09 PM, Tian, Kevin wrote: > >> From: Liu, Yi L > >> Sent: Thursday, April 16, 2020 6:40 PM > >> > >> Hi Alex, > >> Still have a direction question with you. Better get agreement with you > >> before heading forward. > >> > >>> From: Alex Williamson > >>> Sent: Friday, April 3, 2020 11:35 PM > >> [...] > >> + * > >> + * returns: 0 on success, -errno on failure. > >> + */ > >> +struct vfio_iommu_type1_cache_invalidate { > >> + __u32 argsz; > >> + __u32 flags; > >> + struct iommu_cache_invalidate_info cache_info; > >> +}; > >> +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > >>> VFIO_BASE > > + 24) > > > > The future extension capabilities of this ioctl worry me, I wonder if > > we should do another data[] with flag defining that data as > >> CACHE_INFO. > > Can you elaborate? Does it mean with this way we don't rely on iommu > driver to provide version_to_size conversion and instead we just pass > data[] to iommu driver for further audit? > >>> > >>> No, my concern is that this ioctl has a single function, strictly tied > >>> to the iommu uapi. If we replace cache_info with data[] then we can > >>> define a flag to specify that data[] is struct > >>> iommu_cache_invalidate_info, and if we need to, a different flag to > >>> identify data[] as something else. For example if we get stuck > >>> expanding cache_info to meet new demands and develop a new uapi to > >>> solve that, how would we expand this ioctl to support it rather than > >>> also create a new ioctl? There's also a trade-off in making the ioctl > >>> usage more difficult for the user. I'd still expect the vfio layer to > >>> check the flag and interpret data[] as indicated by the flag rather > >>> than just passing a blob of opaque data to the iommu layer though. > >>> Thanks, > >> > >> Based on your comments about defining a single ioctl and a unified > >> vfio structure (with a @data[] field) for pasid_alloc/free, bind/ > >> unbind_gpasid, cache_inv. After some offline trying, I think it would > >> be good for bind/unbind_gpasid and cache_inv as both of them use the > >> iommu uapi definition. While the pasid alloc/free operation doesn't. > >> It would be weird to put all of them together. So pasid alloc/free > >> may have a separate ioctl. It would look as below. Does this direction > >> look good per your opinion? > >> > >> ioctl #22: VFIO_IOMMU_PASID_REQUEST > >> /** > >> * @pasid: used to return the pasid alloc result when flags == > ALLOC_PASID > >> * specify a pasid to be freed when flags == FREE_PASID > >> * @range: specify the allocation range when flags == ALLOC_PASID > >> */ > >> struct vfio_iommu_pasid_request { > >>__u32 argsz; > >> #define VFIO_IOMMU_ALLOC_PASID (1 << 0) > >> #define VFIO_IOMMU_FREE_PASID (1 << 1) > >>__u32 flags; > >>__u32 pasid; > >>struct { > >>__u32 min; > >>__u32 max; > >>} range; > >> }; > >> > >> ioctl #23: VFIO_IOMMU_NESTING_OP > >> struct vfio_iommu_type1_nesting_op { > >>__u32 argsz; > >>__u32 flags; > >>__u32 op; > >>__u8data[]; > >> }; > >> > >> /* Nesting Ops */ > >> #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 > >> #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 > >> #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 > >> > > > > Then why cannot we just put PASID into the header since the > > majority of nested usage is associated with a pasid? > > > > ioctl #23: VFIO_IOMMU_NESTING_OP > > struct vfio_iommu_type1_nesting_op { > > __u32 argsz; > > __u32 flags; > > __u32 op; > > __u32 pasid; > > __u8data[]; > > }; > > > > In case of SMMUv2 which supports nested w/o PASID, this field can > > be ignored for that specific case. > On my side I would prefer keeping the pasid in the data[]. This is not > always used. > > For instance, in iommu_cache_invalidate_info/iommu_inv_pasid_info we > devised flags to tell whether the PASID is used. > But don't we include a PASID in both invalidate structures already? struct iommu_inv_addr_info { #define IOMMU_INV_ADDR_FLAGS_PASID (1 << 0) #define IOMMU_INV_ADDR_FLAGS_ARCHID (1 << 1) #define IOMMU_INV_ADDR_FLAGS_LEAF (1 << 2) __u32 flags; __u32 archid; __u64 pasid; __u64 addr; __u64 granule_size; __u64 nb_granules; }; struct iommu_inv_pasid_info { #define IOMMU_INV_PASID_FLAGS_PASID (1 << 0) #define IOMMU_INV_PASID_FLAGS_ARCHID(1 << 1) __u32 flags; __u32 archid; __u64 pasid; }; then consolidating the pasid field into generic header doesn't hurt. the specific handler still rely on flags to tell whether it is used? Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation
Re: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
Hi Kevin, On 4/16/20 2:09 PM, Tian, Kevin wrote: >> From: Liu, Yi L >> Sent: Thursday, April 16, 2020 6:40 PM >> >> Hi Alex, >> Still have a direction question with you. Better get agreement with you >> before heading forward. >> >>> From: Alex Williamson >>> Sent: Friday, April 3, 2020 11:35 PM >> [...] >> + * >> + * returns: 0 on success, -errno on failure. >> + */ >> +struct vfio_iommu_type1_cache_invalidate { >> +__u32 argsz; >> +__u32 flags; >> +struct iommu_cache_invalidate_info cache_info; >> +}; >> +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, >>> VFIO_BASE > + 24) > > The future extension capabilities of this ioctl worry me, I wonder if > we should do another data[] with flag defining that data as >> CACHE_INFO. Can you elaborate? Does it mean with this way we don't rely on iommu driver to provide version_to_size conversion and instead we just pass data[] to iommu driver for further audit? >>> >>> No, my concern is that this ioctl has a single function, strictly tied >>> to the iommu uapi. If we replace cache_info with data[] then we can >>> define a flag to specify that data[] is struct >>> iommu_cache_invalidate_info, and if we need to, a different flag to >>> identify data[] as something else. For example if we get stuck >>> expanding cache_info to meet new demands and develop a new uapi to >>> solve that, how would we expand this ioctl to support it rather than >>> also create a new ioctl? There's also a trade-off in making the ioctl >>> usage more difficult for the user. I'd still expect the vfio layer to >>> check the flag and interpret data[] as indicated by the flag rather >>> than just passing a blob of opaque data to the iommu layer though. >>> Thanks, >> >> Based on your comments about defining a single ioctl and a unified >> vfio structure (with a @data[] field) for pasid_alloc/free, bind/ >> unbind_gpasid, cache_inv. After some offline trying, I think it would >> be good for bind/unbind_gpasid and cache_inv as both of them use the >> iommu uapi definition. While the pasid alloc/free operation doesn't. >> It would be weird to put all of them together. So pasid alloc/free >> may have a separate ioctl. It would look as below. Does this direction >> look good per your opinion? >> >> ioctl #22: VFIO_IOMMU_PASID_REQUEST >> /** >> * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID >> * specify a pasid to be freed when flags == FREE_PASID >> * @range: specify the allocation range when flags == ALLOC_PASID >> */ >> struct vfio_iommu_pasid_request { >> __u32 argsz; >> #define VFIO_IOMMU_ALLOC_PASID (1 << 0) >> #define VFIO_IOMMU_FREE_PASID(1 << 1) >> __u32 flags; >> __u32 pasid; >> struct { >> __u32 min; >> __u32 max; >> } range; >> }; >> >> ioctl #23: VFIO_IOMMU_NESTING_OP >> struct vfio_iommu_type1_nesting_op { >> __u32 argsz; >> __u32 flags; >> __u32 op; >> __u8data[]; >> }; >> >> /* Nesting Ops */ >> #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 >> #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 >> #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 >> > > Then why cannot we just put PASID into the header since the > majority of nested usage is associated with a pasid? > > ioctl #23: VFIO_IOMMU_NESTING_OP > struct vfio_iommu_type1_nesting_op { > __u32 argsz; > __u32 flags; > __u32 op; > __u32 pasid; > __u8data[]; > }; > > In case of SMMUv2 which supports nested w/o PASID, this field can > be ignored for that specific case. On my side I would prefer keeping the pasid in the data[]. This is not always used. For instance, in iommu_cache_invalidate_info/iommu_inv_pasid_info we devised flags to tell whether the PASID is used. Thanks Eric > > Thanks > Kevin > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 02/25] iommu/sva: Manage process address spaces
On Thu, Apr 16, 2020 at 10:54:02AM +0200, Jean-Philippe Brucker wrote: > On Thu, Apr 16, 2020 at 12:28:52AM -0700, Christoph Hellwig wrote: > > > + rcu_read_lock(); > > > + hlist_for_each_entry_rcu(bond, &io_mm->devices, mm_node) > > > + io_mm->ops->invalidate(bond->sva.dev, io_mm->pasid, io_mm->ctx, > > > +start, end - start); > > > + rcu_read_unlock(); > > > +} > > > > What is the reason that the devices don't register their own notifiers? > > This kinds of multiplexing is always rather messy, and you do it for > > all the methods. > > This sends TLB and ATC invalidations through the IOMMU, it doesn't go > through device drivers I don't think we mean the same thing, probably because of my rather imprecise use of the word device. What I mean is that the mmu_notifier should not be embedded into the io_mm structure (whch btw, seems to have a way to generic name, just like all other io_* prefixed names), but instead into the iommu_bond structure. That avoid the whole multiplexing layer. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
> From: Liu, Yi L > Sent: Thursday, April 16, 2020 6:40 PM > > Hi Alex, > Still have a direction question with you. Better get agreement with you > before heading forward. > > > From: Alex Williamson > > Sent: Friday, April 3, 2020 11:35 PM > [...] > > > > > + * > > > > > + * returns: 0 on success, -errno on failure. > > > > > + */ > > > > > +struct vfio_iommu_type1_cache_invalidate { > > > > > + __u32 argsz; > > > > > + __u32 flags; > > > > > + struct iommu_cache_invalidate_info cache_info; > > > > > +}; > > > > > +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > > VFIO_BASE > > > > + 24) > > > > > > > > The future extension capabilities of this ioctl worry me, I wonder if > > > > we should do another data[] with flag defining that data as > CACHE_INFO. > > > > > > Can you elaborate? Does it mean with this way we don't rely on iommu > > > driver to provide version_to_size conversion and instead we just pass > > > data[] to iommu driver for further audit? > > > > No, my concern is that this ioctl has a single function, strictly tied > > to the iommu uapi. If we replace cache_info with data[] then we can > > define a flag to specify that data[] is struct > > iommu_cache_invalidate_info, and if we need to, a different flag to > > identify data[] as something else. For example if we get stuck > > expanding cache_info to meet new demands and develop a new uapi to > > solve that, how would we expand this ioctl to support it rather than > > also create a new ioctl? There's also a trade-off in making the ioctl > > usage more difficult for the user. I'd still expect the vfio layer to > > check the flag and interpret data[] as indicated by the flag rather > > than just passing a blob of opaque data to the iommu layer though. > > Thanks, > > Based on your comments about defining a single ioctl and a unified > vfio structure (with a @data[] field) for pasid_alloc/free, bind/ > unbind_gpasid, cache_inv. After some offline trying, I think it would > be good for bind/unbind_gpasid and cache_inv as both of them use the > iommu uapi definition. While the pasid alloc/free operation doesn't. > It would be weird to put all of them together. So pasid alloc/free > may have a separate ioctl. It would look as below. Does this direction > look good per your opinion? > > ioctl #22: VFIO_IOMMU_PASID_REQUEST > /** > * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID > * specify a pasid to be freed when flags == FREE_PASID > * @range: specify the allocation range when flags == ALLOC_PASID > */ > struct vfio_iommu_pasid_request { > __u32 argsz; > #define VFIO_IOMMU_ALLOC_PASID(1 << 0) > #define VFIO_IOMMU_FREE_PASID (1 << 1) > __u32 flags; > __u32 pasid; > struct { > __u32 min; > __u32 max; > } range; > }; > > ioctl #23: VFIO_IOMMU_NESTING_OP > struct vfio_iommu_type1_nesting_op { > __u32 argsz; > __u32 flags; > __u32 op; > __u8data[]; > }; > > /* Nesting Ops */ > #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 > #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 > #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 > Then why cannot we just put PASID into the header since the majority of nested usage is associated with a pasid? ioctl #23: VFIO_IOMMU_NESTING_OP struct vfio_iommu_type1_nesting_op { __u32 argsz; __u32 flags; __u32 op; __u32 pasid; __u8data[]; }; In case of SMMUv2 which supports nested w/o PASID, this field can be ignored for that specific case. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code
Hi Joerg, On 14.04.2020 15:15, Joerg Roedel wrote: > here is the second version of this patch-set. The first version with > some more introductory text can be found here: > > https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/ > > Changes v1->v2: > > * Rebased to v5.7-rc1 > > * Re-wrote the arm-smmu changes as suggested by Robin Murphy > > * Re-worked the Exynos patches to hopefully not break the > driver anymore Thanks for this rework. This version is much better. Works fine on various Exynos-based boards (ARM and ARM64). Tested-by: Marek Szyprowski Acked-by: Marek Szyprowski (for Exynos and core changes) > * Fixed a missing mutex_unlock() reported by Marek Szyprowski, > thanks for that. > > There is also a git-branch available with these patches applied: > > > https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2 > > Please review. > > Thanks, > > Joerg > > Joerg Roedel (32): >iommu: Move default domain allocation to separate function >iommu/amd: Implement iommu_ops->def_domain_type call-back >iommu/vt-d: Wire up iommu_ops->def_domain_type >iommu/amd: Remove dma_mask check from check_device() >iommu/amd: Return -ENODEV in add_device when device is not handled by > IOMMU >iommu: Add probe_device() and remove_device() call-backs >iommu: Move default domain allocation to iommu_probe_device() >iommu: Keep a list of allocated groups in __iommu_probe_device() >iommu: Move new probe_device path to separate function >iommu: Split off default domain allocation from group assignment >iommu: Move iommu_group_create_direct_mappings() out of > iommu_group_add_device() >iommu: Export bus_iommu_probe() and make is safe for re-probing >iommu/amd: Remove dev_data->passthrough >iommu/amd: Convert to probe/release_device() call-backs >iommu/vt-d: Convert to probe/release_device() call-backs >iommu/arm-smmu: Convert to probe/release_device() call-backs >iommu/pamu: Convert to probe/release_device() call-backs >iommu/s390: Convert to probe/release_device() call-backs >iommu/virtio: Convert to probe/release_device() call-backs >iommu/msm: Convert to probe/release_device() call-backs >iommu/mediatek: Convert to probe/release_device() call-backs >iommu/mediatek-v1 Convert to probe/release_device() call-backs >iommu/qcom: Convert to probe/release_device() call-backs >iommu/rockchip: Convert to probe/release_device() call-backs >iommu/tegra: Convert to probe/release_device() call-backs >iommu/renesas: Convert to probe/release_device() call-backs >iommu/omap: Remove orphan_dev tracking >iommu/omap: Convert to probe/release_device() call-backs >iommu/exynos: Use first SYSMMU in controllers list for IOMMU core >iommu/exynos: Convert to probe/release_device() call-backs >iommu: Remove add_device()/remove_device() code-paths >iommu: Unexport iommu_group_get_for_dev() > > Sai Praneeth Prakhya (1): >iommu: Add def_domain_type() callback in iommu_ops > > drivers/iommu/amd_iommu.c | 97 > drivers/iommu/amd_iommu_types.h | 1 - > drivers/iommu/arm-smmu-v3.c | 38 +-- > drivers/iommu/arm-smmu.c| 39 ++-- > drivers/iommu/exynos-iommu.c| 24 +- > drivers/iommu/fsl_pamu_domain.c | 22 +- > drivers/iommu/intel-iommu.c | 68 +- > drivers/iommu/iommu.c | 393 +--- > drivers/iommu/ipmmu-vmsa.c | 60 ++--- > drivers/iommu/msm_iommu.c | 34 +-- > drivers/iommu/mtk_iommu.c | 24 +- > drivers/iommu/mtk_iommu_v1.c| 50 ++-- > drivers/iommu/omap-iommu.c | 99 ++-- > drivers/iommu/qcom_iommu.c | 24 +- > drivers/iommu/rockchip-iommu.c | 26 +-- > drivers/iommu/s390-iommu.c | 22 +- > drivers/iommu/tegra-gart.c | 24 +- > drivers/iommu/tegra-smmu.c | 31 +-- > drivers/iommu/virtio-iommu.c| 41 +--- > include/linux/iommu.h | 21 +- > 20 files changed, 533 insertions(+), 605 deletions(-) > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v11 07/10] iommu/vt-d: Add svm/sva invalidate function
Hi Jacob, On 4/10/20 11:56 PM, Jacob Pan wrote: > On Thu, 9 Apr 2020 10:50:34 +0200 > Auger Eric wrote: > >> Hi Jacob, >> >> On 4/3/20 8:42 PM, Jacob Pan wrote: >>> When Shared Virtual Address (SVA) is enabled for a guest OS via >>> vIOMMU, we need to provide invalidation support at IOMMU API and >>> driver level. This patch adds Intel VT-d specific function to >>> implement iommu passdown invalidate API for shared virtual address. >>> >>> The use case is for supporting caching structure invalidation >>> of assigned SVM capable devices. Emulated IOMMU exposes queue >>> invalidation capability and passes down all descriptors from the >>> guest to the physical IOMMU. >>> >>> The assumption is that guest to host device ID mapping should be >>> resolved prior to calling IOMMU driver. Based on the device handle, >>> host IOMMU driver can replace certain fields before submit to the >>> invalidation queue. >>> >>> --- >>> v11 - Removed 2D map array, use -EINVAL in granularity lookup array. >>> Fixed devTLB invalidation granularity mapping. Disregard G=1 >>> case and use address selective invalidation only. >>> >>> v7 review fixed in v10 >>> --- >>> >>> Signed-off-by: Jacob Pan >>> Signed-off-by: Ashok Raj >>> Signed-off-by: Liu, Yi L >>> --- >>> drivers/iommu/intel-iommu.c | 158 >>> 1 file changed, 158 >>> insertions(+) >>> >>> diff --git a/drivers/iommu/intel-iommu.c >>> b/drivers/iommu/intel-iommu.c index 94c7993dac6a..045c5c08d71d >>> 100644 --- a/drivers/iommu/intel-iommu.c >>> +++ b/drivers/iommu/intel-iommu.c >>> @@ -5594,6 +5594,163 @@ static void >>> intel_iommu_aux_detach_device(struct iommu_domain *domain, >>> aux_domain_remove_dev(to_dmar_domain(domain), dev); } >>> >>> +/* >>> + * 2D array for converting and sanitizing IOMMU generic TLB >>> granularity to >>> + * VT-d granularity. Invalidation is typically included in the >>> unmap operation >>> + * as a result of DMA or VFIO unmap. However, for assigned devices >>> guest >>> + * owns the first level page tables. Invalidations of translation >>> caches in the >>> + * guest are trapped and passed down to the host. >>> + * >>> + * vIOMMU in the guest will only expose first level page tables, >>> therefore >>> + * we do not support IOTLB granularity for request without PASID >>> (second level). >>> + * >>> + * For example, to find the VT-d granularity encoding for IOTLB >>> + * type and page selective granularity within PASID: >>> + * X: indexed by iommu cache type >>> + * Y: indexed by enum iommu_inv_granularity >>> + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] >>> + */ >>> + >>> +const static int >>> inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = >>> { >>> + /* >>> +* PASID based IOTLB invalidation: PASID selective (per >>> PASID), >>> +* page selective (address granularity) >>> +*/ >>> + {-EINVAL, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID}, >>> + /* PASID based dev TLBs */ >>> + {-EINVAL, -EINVAL, QI_DEV_IOTLB_GRAN_PASID_SEL}, >>> + /* PASID cache */ >>> + {-EINVAL, -EINVAL, -EINVAL} >>> +}; >>> + >>> +static inline int to_vtd_granularity(int type, int granu) >>> +{ >>> + return inv_type_granu_table[type][granu]; >>> +} >>> + >>> +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules) >>> +{ >>> + u64 nr_pages = (granu_size * nr_granules) >> >>> VTD_PAGE_SHIFT; + >>> + /* VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 >>> for 2MB, etc. >>> +* IOMMU cache invalidate API passes granu_size in bytes, >>> and number of >>> +* granu size in contiguous memory. >>> +*/ >>> + return order_base_2(nr_pages); >>> +} >>> + >>> +#ifdef CONFIG_INTEL_IOMMU_SVM >>> +static int intel_iommu_sva_invalidate(struct iommu_domain *domain, >>> + struct device *dev, struct >>> iommu_cache_invalidate_info *inv_info) +{ >>> + struct dmar_domain *dmar_domain = to_dmar_domain(domain); >>> + struct device_domain_info *info; >>> + struct intel_iommu *iommu; >>> + unsigned long flags; >>> + int cache_type; >>> + u8 bus, devfn; >>> + u16 did, sid; >>> + int ret = 0; >>> + u64 size = 0; >>> + >>> + if (!inv_info || !dmar_domain || >>> + inv_info->version != >>> IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) >>> + return -EINVAL; >>> + >>> + if (!dev || !dev_is_pci(dev)) >>> + return -ENODEV; >> >> Check (domain->flags & DOMAIN_FLAG_NESTING_MODE)? > Good point > >>> + >>> + iommu = device_to_iommu(dev, &bus, &devfn); >>> + if (!iommu) >>> + return -ENODEV; >>> + >>> + spin_lock_irqsave(&device_domain_lock, flags); >>> + spin_lock(&iommu->lock); >>> + info = iommu_support_dev_iotlb(dmar_domain, iommu, bus, >>> devfn); >>> + if (!info) { >>> + ret = -EINVAL; >>> + goto out_unlock; >>> + } >>> + did = dmar_domain->iommu_did[iommu->seq_id]; >>> + sid = PCI_DEVID(bus, devfn); >>> + >>> + /* Size is only valid in non-PASID selective invalidatio
Re: [PATCH v11 05/10] iommu/vt-d: Add bind guest PASID support
Hi Jacob, On 4/10/20 11:06 PM, Jacob Pan wrote: > Hi Eric, > > Missed a few things in the last reply. > > On Thu, 9 Apr 2020 09:41:32 +0200 > Auger Eric wrote: > >>> + intel_pasid_tear_down_entry(iommu, dev, >>> svm->pasid); >> intel_svm_unbind_mm() calls intel_flush_svm_range_dev(svm, sdev, 0, >> -1, 0); Don't we need to flush the (DEV-)IOTLBs as well? > Right, pasid tear down should always include (DEV-)IOTLB flush, I > initially thought it is taken care of by intel_pasid_tear_down_entry(). > >>> + /* TODO: Drain in flight PRQ for the PASID >>> since it >>> +* may get reused soon, we don't want to >>> +* confuse with its previous life. >>> +* intel_svm_drain_prq(dev, pasid); >>> +*/ >>> + kfree_rcu(sdev, rcu); >>> + >>> + if (list_empty(&svm->devs)) { >>> + /* >>> +* We do not free the IOASID here >>> in that >>> +* IOMMU driver did not allocate >>> it. >> s/in/as? > I meant to say "in that" as "for the reason that" ok sorry > >>> +* Unlike native SVM, IOASID for >>> guest use was >>> +* allocated prior to the bind >>> call.> + * In any case, if the free >>> call comes before >>> +* the unbind, IOMMU driver will >>> get notified >>> +* and perform cleanup. >>> +*/ >>> + ioasid_set_data(pasid, NULL); >>> + kfree(svm); >>> + } >> nit: you may use intel_svm_free_if_empty() > True, but I meant to insert ioasid_notifier under the list_empty > condition in the following ioasid patch. ok Thanks Eric > > > Thanks, > > Jacob > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v11 05/10] iommu/vt-d: Add bind guest PASID support
Hi Jacob, On 4/10/20 9:45 PM, Jacob Pan wrote: > Hi Eric, > > On Thu, 9 Apr 2020 09:41:32 +0200 > Auger Eric wrote: > >> Hi Jacob, >> >> On 4/3/20 8:42 PM, Jacob Pan wrote: >>> When supporting guest SVA with emulated IOMMU, the guest PASID >>> table is shadowed in VMM. Updates to guest vIOMMU PASID table >>> will result in PASID cache flush which will be passed down to >>> the host as bind guest PASID calls. >>> >>> For the SL page tables, it will be harvested from device's >>> default domain (request w/o PASID), or aux domain in case of >>> mediated device. >>> >>> .-. .---. >>> | vIOMMU| | Guest process CR3, FL only| >>> | | '---' >>> ./ >>> | PASID Entry |--- PASID cache flush - >>> '-' | >>> | | V >>> | |CR3 in GPA >>> '-' >>> Guest >>> --| Shadow |--| >>> vv v >>> Host >>> .-. .--. >>> | pIOMMU| | Bind FL for GVA-GPA | >>> | | '--' >>> ./ | >>> | PASID Entry | V (Nested xlate) >>> '\.--. >>> | | |SL for GPA-HPA, default domain| >>> | | '--' >>> '-' >>> Where: >>> - FL = First level/stage one page tables >>> - SL = Second level/stage two page tables >>> >>> --- >>> v11: Fixed locking, avoid duplicated paging mode check, added >>> helper to free svm if device list is empty. Use rate limited error >>> message since the bind gpasid call comes from user space. >>> --- >>> >>> Signed-off-by: Jacob Pan >>> Signed-off-by: Liu, Yi L >>> --- >>> drivers/iommu/intel-iommu.c | 4 + >>> drivers/iommu/intel-svm.c | 206 >>> >>> include/linux/intel-iommu.h | 8 +- include/linux/intel-svm.h | >>> 17 4 files changed, 234 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/iommu/intel-iommu.c >>> b/drivers/iommu/intel-iommu.c index c0dadec5a6b3..94c7993dac6a >>> 100644 --- a/drivers/iommu/intel-iommu.c >>> +++ b/drivers/iommu/intel-iommu.c >>> @@ -6178,6 +6178,10 @@ const struct iommu_ops intel_iommu_ops = { >>> .dev_disable_feat = intel_iommu_dev_disable_feat, >>> .is_attach_deferred = >>> intel_iommu_is_attach_deferred, .pgsize_bitmap = >>> INTEL_IOMMU_PGSIZES, +#ifdef CONFIG_INTEL_IOMMU_SVM >>> + .sva_bind_gpasid= intel_svm_bind_gpasid, >>> + .sva_unbind_gpasid = intel_svm_unbind_gpasid, >>> +#endif >>> }; >>> >>> static void quirk_iommu_igfx(struct pci_dev *dev) >>> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c >>> index d7f2a5358900..7cf711318b87 100644 >>> --- a/drivers/iommu/intel-svm.c >>> +++ b/drivers/iommu/intel-svm.c >>> @@ -226,6 +226,212 @@ static LIST_HEAD(global_svm_list); >>> list_for_each_entry((sdev), &(svm)->devs, list) \ >>> if ((d) != (sdev)->dev) {} else >>> >>> + >>> +static inline void intel_svm_free_if_empty(struct intel_svm *svm, >>> u64 pasid) +{ >>> + if (list_empty(&svm->devs)) { >>> + ioasid_set_data(pasid, NULL); >>> + kfree(svm); >>> + } >>> +} >>> + >>> +int intel_svm_bind_gpasid(struct iommu_domain *domain, >>> + struct device *dev, >>> + struct iommu_gpasid_bind_data *data) >>> +{ >>> + struct intel_iommu *iommu = intel_svm_device_to_iommu(dev); >>> + struct dmar_domain *dmar_domain; >>> + struct intel_svm_dev *sdev; >>> + struct intel_svm *svm; >>> + int ret = 0; >>> + >>> + if (WARN_ON(!iommu) || !data) >>> + return -EINVAL; >>> + >>> + if (data->version != IOMMU_GPASID_BIND_VERSION_1 || >>> + data->format != IOMMU_PASID_FORMAT_INTEL_VTD) >>> + return -EINVAL; >>> + >>> + if (dev_is_pci(dev)) { >>> + /* VT-d supports devices with full 20 bit PASIDs >>> only */ >>> + if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX) >>> + return -EINVAL; >>> + } else { >>> + return -ENOTSUPP; >>> + } >>> + >>> + /* >>> +* We only check host PASID range, we have no knowledge to >>> check >>> +* guest PASID range. >>> +*/ >>> + if (data->hpasid <= 0 || data->hpasid >= PASID_MAX) >>> + return -EINVAL; >>> + >>> + dmar_domain = to_dmar_domain(domain); >>> + >>> + mutex_lock(&pasid_mutex); >>> + svm = ioasid_find(NULL, data->hpasid, NULL); >>> + if (IS_ERR(svm)) { >>> + ret = PTR_ERR(svm); >>> + goto out; >>> + } >>> + >>> + if (svm) { >>> + /* >>> +* If we found svm for the PASID, there must be at >>> +* least one device bond, otherwise svm should be >>>
RE: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE
Hi Alex, Still have a direction question with you. Better get agreement with you before heading forward. > From: Alex Williamson > Sent: Friday, April 3, 2020 11:35 PM [...] > > > > + * > > > > + * returns: 0 on success, -errno on failure. > > > > + */ > > > > +struct vfio_iommu_type1_cache_invalidate { > > > > + __u32 argsz; > > > > + __u32 flags; > > > > + struct iommu_cache_invalidate_info cache_info; > > > > +}; > > > > +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > VFIO_BASE > > > + 24) > > > > > > The future extension capabilities of this ioctl worry me, I wonder if > > > we should do another data[] with flag defining that data as CACHE_INFO. > > > > Can you elaborate? Does it mean with this way we don't rely on iommu > > driver to provide version_to_size conversion and instead we just pass > > data[] to iommu driver for further audit? > > No, my concern is that this ioctl has a single function, strictly tied > to the iommu uapi. If we replace cache_info with data[] then we can > define a flag to specify that data[] is struct > iommu_cache_invalidate_info, and if we need to, a different flag to > identify data[] as something else. For example if we get stuck > expanding cache_info to meet new demands and develop a new uapi to > solve that, how would we expand this ioctl to support it rather than > also create a new ioctl? There's also a trade-off in making the ioctl > usage more difficult for the user. I'd still expect the vfio layer to > check the flag and interpret data[] as indicated by the flag rather > than just passing a blob of opaque data to the iommu layer though. > Thanks, Based on your comments about defining a single ioctl and a unified vfio structure (with a @data[] field) for pasid_alloc/free, bind/ unbind_gpasid, cache_inv. After some offline trying, I think it would be good for bind/unbind_gpasid and cache_inv as both of them use the iommu uapi definition. While the pasid alloc/free operation doesn't. It would be weird to put all of them together. So pasid alloc/free may have a separate ioctl. It would look as below. Does this direction look good per your opinion? ioctl #22: VFIO_IOMMU_PASID_REQUEST /** * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID * specify a pasid to be freed when flags == FREE_PASID * @range: specify the allocation range when flags == ALLOC_PASID */ struct vfio_iommu_pasid_request { __u32 argsz; #define VFIO_IOMMU_ALLOC_PASID (1 << 0) #define VFIO_IOMMU_FREE_PASID (1 << 1) __u32 flags; __u32 pasid; struct { __u32 min; __u32 max; } range; }; ioctl #23: VFIO_IOMMU_NESTING_OP struct vfio_iommu_type1_nesting_op { __u32 argsz; __u32 flags; __u32 op; __u8data[]; }; /* Nesting Ops */ #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL0 #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 Thanks, Yi Liu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH] dt-bndings: iommu: renesas,ipmmu-vmsa: convert to json-schema
Hi Geert-san, > From: Geert Uytterhoeven, Sent: Thursday, April 16, 2020 7:06 PM > > > > --- /dev/null > > > > +++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml > > > > > + renesas,ipmmu-main: > > > > +$ref: /schemas/types.yaml#/definitions/phandle-array > > > > +description: > > > > + Reference to the main IPMMU instance in two cells. The first > > > > cell is > > > > + a phandle to the main IPMMU and the second cell is the interrupt > > > > bit > > > > + number associated with the particular cache IPMMU device. The > > > > interrupt > > > > + bit number needs to match the main IPMMU IMSSTR register. Only > > > > used by > > > > + cache IPMMU instances. > > > > > > This property is not valid only on R-Car Gen2 and R-Mobile APE6. > > > > That sounds good. But, since ipmmu_mm on R-Car Gen3 also is not valid, > > we cannot use compatible property for detecting it. > > What do you mean by "ipmmu_mm is not valid"? I just wanted to say that ipmmu_mm node on R-Car Gen3 cannot have this renesas,ipmmu-main property. > > However, I realized renesas,ipmmu-main is only used by "cache IPMMU" > > and "cache IPMMU" doesn't have interrupts property. So, > > I described the following schema and then it seems success. > > -- > > if: > > properties: > > interrupts: false > > then: > > required: > > - renesas,ipmmu-main > > But all other nodes should have interrupts properties, right? Yes, that's right. > So they are mutually exclusive: > > oneOf: > - required: > - interrupts > - required: > - renesas,ipmmu-main Thank you! Now I understood it and such the schema is better than mine. I'll fix it. > > Especially, I could find missing renesas,ipmmu-main property on > > r8a77980.dtsi > > like below :) So, I'll make and submit a fixing r8a77980.dtsi patch later > > (maybe tomorrow). > > Good! ;-) :) Best regards, Yoshihiro Shimoda > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/qcom:fix local_base status check
On 2020/4/16 18:05, Robin Murphy wrote: On 2020-04-02 7:33 am, Tang Bin wrote: Release resources when exiting on error. Signed-off-by: Tang Bin --- drivers/iommu/qcom_iommu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c index 4328da0b0..c08aa9651 100644 --- a/drivers/iommu/qcom_iommu.c +++ b/drivers/iommu/qcom_iommu.c @@ -813,8 +813,11 @@ static int qcom_iommu_device_probe(struct platform_device *pdev) qcom_iommu->dev = dev; res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - if (res) + if (res) { qcom_iommu->local_base = devm_ioremap_resource(dev, res); + if (IS_ERR(qcom_iommu->local_base)) + return PTR_ERR(qcom_iommu->local_base); + } ...or just use devm_platform_ioremap_resource() to make the whole thing simpler. Yes, I was going to simplify the code here, but status check is still required. So I'm waiting. Thanks, Tang Bin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] dt-bndings: iommu: renesas, ipmmu-vmsa: convert to json-schema
Hi Shimoda-san, On Thu, Apr 16, 2020 at 11:56 AM Yoshihiro Shimoda wrote: > > From: Geert Uytterhoeven, Sent: Wednesday, April 15, 2020 11:21 PM > > On Tue, Apr 14, 2020 at 2:26 AM Yoshihiro Shimoda > > wrote: > > > Convert Renesas VMSA-Compatible IOMMU bindings documentation > > > to json-schema. > > > > > > Signed-off-by: Yoshihiro Shimoda > > > > > --- /dev/null > > > +++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml > > > + renesas,ipmmu-main: > > > +$ref: /schemas/types.yaml#/definitions/phandle-array > > > +description: > > > + Reference to the main IPMMU instance in two cells. The first cell > > > is > > > + a phandle to the main IPMMU and the second cell is the interrupt > > > bit > > > + number associated with the particular cache IPMMU device. The > > > interrupt > > > + bit number needs to match the main IPMMU IMSSTR register. Only > > > used by > > > + cache IPMMU instances. > > > > This property is not valid only on R-Car Gen2 and R-Mobile APE6. > > That sounds good. But, since ipmmu_mm on R-Car Gen3 also is not valid, > we cannot use compatible property for detecting it. What do you mean by "ipmmu_mm is not valid"? > However, I realized renesas,ipmmu-main is only used by "cache IPMMU" > and "cache IPMMU" doesn't have interrupts property. So, > I described the following schema and then it seems success. > -- > if: > properties: > interrupts: false > then: > required: > - renesas,ipmmu-main But all other nodes should have interrupts properties, right? So they are mutually exclusive: oneOf: - required: - interrupts - required: - renesas,ipmmu-main > Especially, I could find missing renesas,ipmmu-main property on r8a77980.dtsi > like below :) So, I'll make and submit a fixing r8a77980.dtsi patch later > (maybe tomorrow). Good! ;-) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/qcom:fix local_base status check
On 2020-04-02 7:33 am, Tang Bin wrote: Release resources when exiting on error. Signed-off-by: Tang Bin --- drivers/iommu/qcom_iommu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c index 4328da0b0..c08aa9651 100644 --- a/drivers/iommu/qcom_iommu.c +++ b/drivers/iommu/qcom_iommu.c @@ -813,8 +813,11 @@ static int qcom_iommu_device_probe(struct platform_device *pdev) qcom_iommu->dev = dev; res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - if (res) + if (res) { qcom_iommu->local_base = devm_ioremap_resource(dev, res); + if (IS_ERR(qcom_iommu->local_base)) + return PTR_ERR(qcom_iommu->local_base); + } ...or just use devm_platform_ioremap_resource() to make the whole thing simpler. Robin. qcom_iommu->iface_clk = devm_clk_get(dev, "iface"); if (IS_ERR(qcom_iommu->iface_clk)) { ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH] dt-bndings: iommu: renesas,ipmmu-vmsa: convert to json-schema
Hi Geert-san, Thank you for your review! > From: Geert Uytterhoeven, Sent: Wednesday, April 15, 2020 11:21 PM > > Hi Shimoda-san, > > On Tue, Apr 14, 2020 at 2:26 AM Yoshihiro Shimoda > wrote: > > Convert Renesas VMSA-Compatible IOMMU bindings documentation > > to json-schema. > > > > Signed-off-by: Yoshihiro Shimoda > > > --- /dev/null > > +++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml > > @@ -0,0 +1,90 @@ > > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) > > +%YAML 1.2 > > +--- > > +$id: http://devicetree.org/schemas/iommu/renesas,ipmmu-vmsa.yaml# > > +$schema: http://devicetree.org/meta-schemas/core.yaml# > > + > > +title: Renesas VMSA-Compatible IOMMU > > + > > +maintainers: > > + - Yoshihiro Shimoda > > + > > +description: > > + The IPMMU is an IOMMU implementation compatible with the ARM VMSA page > > tables. > > + It provides address translation for bus masters outside of the CPU, each > > + connected to the IPMMU through a port called micro-TLB. > > + > > +properties: > > + compatible: > > +oneOf: > > + - items: > > + - enum: > > + - renesas,ipmmu-r8a7743 # RZ/G1M > > + - renesas,ipmmu-r8a7744 # RZ/G1N > > + - renesas,ipmmu-r8a7745 # RZ/G1E > > + - renesas,ipmmu-r8a7790 # R-Car H2 > > + - renesas,ipmmu-r8a7791 # R-Car M2-W > > + - renesas,ipmmu-r8a7793 # R-Car M2-N > > + - renesas,ipmmu-r8a7794 # R-Car E2 > > + - renesas,ipmmu-r8a7795 # R-Car H3 > > + - const: renesas,ipmmu-vmsa # R-Car Gen2 or RZ/G1 > > + - items: > > + - enum: > > + - renesas,ipmmu-r8a73a4 # R-Mobile APE6 > > I believe the R-Mobile APE6 IPMMU is similar to the R-Car Gen2 IPMMU, > and thus belongs in the section above instead. I think so. Since the original document doesn't mention about R-Mobile APE6, I wrote this here. But, I also think R-Mobile APE6 IPMMU is similar to the R-Car Gen2 IPMMU. So, I'll change renesas,ipmmu-r8a73a4 to the above section and fix comment of renesas,ipmmu-vmsa > > + - renesas,ipmmu-r8a774a1 # RZ/G2M > > + - renesas,ipmmu-r8a774b1 # RZ/G2N > > + - renesas,ipmmu-r8a774c0 # RZ/G2E > > + - renesas,ipmmu-r8a7796 # R-Car M3-W > > + - renesas,ipmmu-r8a77965 # R-Car M3-N > > + - renesas,ipmmu-r8a77970 # R-Car V3M > > + - renesas,ipmmu-r8a77980 # R-Car V3H > > + - renesas,ipmmu-r8a77990 # R-Car E3 > > + - renesas,ipmmu-r8a77995 # R-Car D3 > > + > > + reg: > > +maxItems: 1 > > + > > + interrupts: > > +minItems: 1 > > +maxItems: 2 > > +description: > > + Specifiers for the MMU fault interrupts. For instances that support > > + secure mode two interrupts must be specified, for non-secure and > > secure > > + mode, in that order. For instances that don't support secure mode a > > + single interrupt must be specified. Not required for cache IPMMUs. > > items: > - description: > - description: I got it. I'll add items. > > + > > + '#iommu-cells': > > +const: 1 > > + > > + power-domains: > > +maxItems: 1 > > + > > + renesas,ipmmu-main: > > +$ref: /schemas/types.yaml#/definitions/phandle-array > > +description: > > + Reference to the main IPMMU instance in two cells. The first cell is > > + a phandle to the main IPMMU and the second cell is the interrupt bit > > + number associated with the particular cache IPMMU device. The > > interrupt > > + bit number needs to match the main IPMMU IMSSTR register. Only used > > by > > + cache IPMMU instances. > > This property is not valid only on R-Car Gen2 and R-Mobile APE6. That sounds good. But, since ipmmu_mm on R-Car Gen3 also is not valid, we cannot use compatible property for detecting it. However, I realized renesas,ipmmu-main is only used by "cache IPMMU" and "cache IPMMU" doesn't have interrupts property. So, I described the following schema and then it seems success. -- if: properties: interrupts: false then: required: - renesas,ipmmu-main -- Especially, I could find missing renesas,ipmmu-main property on r8a77980.dtsi like below :) So, I'll make and submit a fixing r8a77980.dtsi patch later (maybe tomorrow). CHECK arch/arm64/boot/dts/renesas/r8a77980-condor.dt.yaml /opt/home/shimoda/workdir/json-schema/arch/arm64/boot/dts/renesas/r8a77980-condor.dt.yaml: mmu@e7b0: 'renesas,ipmmu-main' is a required property /opt/home/shimoda/workdir/json-schema/arch/arm64/boot/dts/renesas/r8a77980-condor.dt.yaml: mmu@e796: 'renesas,ipmmu-main' is a required property Best regards, Yoshihiro Shimoda > (untested) > > oneOf: > - properties: > contains: > const: renesas,ipmmu-vmsa > - properties: > renesas,ipmmu-main: > $ref: /schemas/types.yaml#/definitions/phandle-array > descr
[PATCH v2] of_device: removed #include that caused a recursion in included headers
Both of_platform.h and of_device.h were included each other. In of_device.h, removed unneeded #include to of_platform.h and added include to of_platform.h in the files that needs it. Signed-off-by: Hadar Gat --- v2: add include to of_platform.h in more files. (reported due other builds) arch/sparc/mm/io-unit.c | 1 + arch/sparc/mm/iommu.c | 1 + drivers/base/platform.c | 1 + drivers/bus/imx-weim.c| 1 + drivers/bus/vexpress-config.c | 1 + drivers/clk/mediatek/clk-mt7622-aud.c | 1 + drivers/dma/at_hdmac.c| 1 + drivers/dma/stm32-dmamux.c| 1 + drivers/dma/ti/dma-crossbar.c | 1 + drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 1 + drivers/gpu/drm/msm/hdmi/hdmi.c | 1 + drivers/gpu/drm/msm/msm_drv.c | 1 + drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 1 + drivers/gpu/drm/sun4i/sun4i_tcon.c| 1 + drivers/iio/adc/stm32-adc-core.c | 1 + drivers/iio/adc/stm32-dfsdm-adc.c | 1 + drivers/iio/adc/stm32-dfsdm-core.c| 1 + drivers/iommu/tegra-smmu.c| 1 + drivers/memory/atmel-ebi.c| 1 + drivers/mfd/palmas.c | 1 + drivers/mfd/ssbi.c| 1 + drivers/mtd/nand/raw/omap2.c | 1 + drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c | 1 + drivers/net/ethernet/ti/cpsw.c| 1 + drivers/phy/tegra/xusb.c | 1 + drivers/pinctrl/freescale/pinctrl-imx1-core.c | 1 + drivers/pinctrl/nomadik/pinctrl-nomadik.c | 1 + drivers/soc/samsung/exynos-pmu.c | 1 + drivers/soc/sunxi/sunxi_sram.c| 1 + include/linux/of_device.h | 2 -- lib/genalloc.c| 1 + 31 files changed, 30 insertions(+), 2 deletions(-) diff --git a/arch/sparc/mm/io-unit.c b/arch/sparc/mm/io-unit.c index 289276b..5638399 100644 --- a/arch/sparc/mm/io-unit.c +++ b/arch/sparc/mm/io-unit.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include diff --git a/arch/sparc/mm/iommu.c b/arch/sparc/mm/iommu.c index b00dde1..9cbb2e7 100644 --- a/arch/sparc/mm/iommu.c +++ b/arch/sparc/mm/iommu.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include diff --git a/drivers/base/platform.c b/drivers/base/platform.c index 520..f549274b 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include diff --git a/drivers/bus/imx-weim.c b/drivers/bus/imx-weim.c index 28bb65a..8c786da 100644 --- a/drivers/bus/imx-weim.c +++ b/drivers/bus/imx-weim.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include diff --git a/drivers/bus/vexpress-config.c b/drivers/bus/vexpress-config.c index ff70575..12b8b0b 100644 --- a/drivers/bus/vexpress-config.c +++ b/drivers/bus/vexpress-config.c @@ -8,6 +8,7 @@ #include #include #include +#include #include diff --git a/drivers/clk/mediatek/clk-mt7622-aud.c b/drivers/clk/mediatek/clk-mt7622-aud.c index 2bd4295..8cbb68f 100644 --- a/drivers/clk/mediatek/clk-mt7622-aud.c +++ b/drivers/clk/mediatek/clk-mt7622-aud.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include "clk-mtk.h" diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c index 73a2078..388f8e10 100644 --- a/drivers/dma/at_hdmac.c +++ b/drivers/dma/at_hdmac.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include "at_hdmac_regs.h" diff --git a/drivers/dma/stm32-dmamux.c b/drivers/dma/stm32-dmamux.c index 12f7637..b704896 100644 --- a/drivers/dma/stm32-dmamux.c +++ b/drivers/dma/stm32-dmamux.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include diff --git a/drivers/dma/ti/dma-crossbar.c b/drivers/dma/ti/dma-crossbar.c index 4ba8fa5..2c0fd44 100644 --- a/drivers/dma/ti/dma-crossbar.c +++ b/drivers/dma/ti/dma-crossbar.c @@ -10,6 +10,7 @@ #include #include #include +#include #include #define TI_XBAR_DRA7 0 diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index c4e71ab..f523254 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include "a6xx_gpu.h" diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c index 737453b..5034d40 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi.c @@ -7,6 +7,7 @@ #include #include +#include #include #include "hd
Re: [PATCH v5 02/25] iommu/sva: Manage process address spaces
On Thu, Apr 16, 2020 at 12:28:52AM -0700, Christoph Hellwig wrote: > > + rcu_read_lock(); > > + hlist_for_each_entry_rcu(bond, &io_mm->devices, mm_node) > > + io_mm->ops->invalidate(bond->sva.dev, io_mm->pasid, io_mm->ctx, > > + start, end - start); > > + rcu_read_unlock(); > > +} > > What is the reason that the devices don't register their own notifiers? > This kinds of multiplexing is always rather messy, and you do it for > all the methods. This sends TLB and ATC invalidations through the IOMMU, it doesn't go through device drivers Thanks, Jean ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 6/7] iommu/vt-d: Add page request draining support
Hi Kevin, On 2020/4/15 19:10, Tian, Kevin wrote: the completion of above sequence ensures that previous queued page group responses are sent out and received by the endpoint and vice versa all in-fly page requests from the endpoint are queued in iommu page request queue. Then comes a problem - you didn't wait for completion of those newly-queued requests and their responses. I thought about this again. We do page request draining after PASID table entry gets torn down and the devTLB gets invalidated. At this point, if any new page request for this pasid comes in, IOMMU will generate an unrecoverable fault and response the device with IR (Invalid Request). IOMMU won't put this page request into the queue. [VT-d spec 7.4.1] Hence, we don't need to worry about the newly-queued requests here. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
Hi Zhangfei, On 4/16/20 6:25 AM, Zhangfei Gao wrote: > > > On 2020/4/14 下午11:05, Eric Auger wrote: >> This version fixes an issue observed by Shameer on an SMMU 3.2, >> when moving from dual stage config to stage 1 only config. >> The 2 high 64b of the STE now get reset. Otherwise, leaving the >> S2TTB set may cause a C_BAD_STE error. >> >> This series can be found at: >> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1 >> (including the VFIO part) >> The QEMU fellow series still can be found at: >> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6 >> >> Users have expressed interest in that work and tested v9/v10: >> - https://patchwork.kernel.org/cover/11039995/#23012381 >> - https://patchwork.kernel.org/cover/11039995/#23197235 >> >> Background: >> >> This series brings the IOMMU part of HW nested paging support >> in the SMMUv3. The VFIO part is submitted separately. >> >> The IOMMU API is extended to support 2 new API functionalities: >> 1) pass the guest stage 1 configuration >> 2) pass stage 1 MSI bindings >> >> Then those capabilities gets implemented in the SMMUv3 driver. >> >> The virtualizer passes information through the VFIO user API >> which cascades them to the iommu subsystem. This allows the guest >> to own stage 1 tables and context descriptors (so-called PASID >> table) while the host owns stage 2 tables and main configuration >> structures (STE). >> >> > > Thanks Eric > > Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator. > 1. no-sva works, where guest app directly use physical address via ioctl. Thank you for the testing. Glad it works for you. > 2. vSVA still not work, same as v10, Yes that's normal this series is not meant to support vSVM at this stage. I intend to add the missing pieces during the next weeks. Thanks Eric > 3. the v10 issue reported by Shameer has been solved, first start qemu > with iommu=smmuv3, then start qemu without iommu=smmuv3 > 4. no-sva also works without iommu=smmuv3 > > Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL > > Thanks > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/3] iommu/vt-d: Allow 32bit devices to uses DMA domain
Hi Christoph, On 2020/4/16 15:01, Christoph Hellwig wrote: On Thu, Apr 16, 2020 at 02:23:52PM +0800, Lu Baolu wrote: Currently, if a 32bit device initially uses an identity domain, Intel IOMMU driver will convert it forcibly to a DMA one if its address capability is not enough for the whole system memory. The motivation was to overcome the overhead caused by possible bounced buffer. Unfortunately, this improvement has led to many problems. For example, some 32bit devices are required to use an identity domain, forcing them to use DMA domain will cause the device not to work anymore. On the other hand, the VMD sub-devices share a domain but each sub-device might have different address capability. Forcing a VMD sub-device to use DMA domain blindly will impact the operation of other sub-devices without any notification. Further more, PCI aliased devices (PCI bridge and all devices beneath it, VMD devices and various devices quirked with pci_add_dma_alias()) must use the same domain. Forcing one device to switch to DMA domain during runtime will cause in-fligh DMAs for other devices to abort or target to other memory which might cause undefind system behavior. This commit log doesn't actually explain what you are chaning, and as far as I can tell it just removes the code to change the domain at run time, which seems to not actually match the subject or This removes the domain switching in iommu_need_mapping(). Another place where the private domain is used is intel_iommu_add_device(). Joerg's patch set has remove that. So with domain switching in iommu_need_mapping() removed, the private domain helpers could be removed now. Otherwise, the compiler will complain that some functions are defined but not used. description. I'd need to look at the final code, but it seems like this will still cause bounce buffering instead of using dynamic mapping, which still seems like an awful idea. Yes. If the user chooses to use identity domain by default through kernel command, identity domain will be applied for all devices. For those devices with limited addressing capability, bounce buffering will be used when they try to access the memory beyond their address capability. This won't cause any kernel regression as far as I can see. Switching domain during runtime with drivers loaded will cause real problems as I said in the commit message. That's the reason why I am proposing to remove it. If we want to keep it, we have to make sure that switching domain for one device should not impact other devices which share the same domain with it. Furthermore, it's better to implement it in the generic layer to keep device driver behavior consistent on all architectures. Also from a purely stylistic perspective a lot of the lines seem very short and not use up the whole 73 charaters allowed. Yes. I will try to use up the allowed characters. Best regards, baolu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v5 02/25] iommu/sva: Manage process address spaces
> + rcu_read_lock(); > + hlist_for_each_entry_rcu(bond, &io_mm->devices, mm_node) > + io_mm->ops->invalidate(bond->sva.dev, io_mm->pasid, io_mm->ctx, > +start, end - start); > + rcu_read_unlock(); > +} What is the reason that the devices don't register their own notifiers? This kinds of multiplexing is always rather messy, and you do it for all the methods. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/3] iommu/vt-d: Allow 32bit devices to uses DMA domain
On Thu, Apr 16, 2020 at 02:23:52PM +0800, Lu Baolu wrote: > Currently, if a 32bit device initially uses an identity domain, > Intel IOMMU driver will convert it forcibly to a DMA one if its > address capability is not enough for the whole system memory. > The motivation was to overcome the overhead caused by possible > bounced buffer. > > Unfortunately, this improvement has led to many problems. For > example, some 32bit devices are required to use an identity > domain, forcing them to use DMA domain will cause the device > not to work anymore. On the other hand, the VMD sub-devices > share a domain but each sub-device might have different address > capability. Forcing a VMD sub-device to use DMA domain blindly > will impact the operation of other sub-devices without any > notification. Further more, PCI aliased devices (PCI bridge > and all devices beneath it, VMD devices and various devices > quirked with pci_add_dma_alias()) must use the same domain. > Forcing one device to switch to DMA domain during runtime > will cause in-fligh DMAs for other devices to abort or target > to other memory which might cause undefind system behavior. This commit log doesn't actually explain what you are chaning, and as far as I can tell it just removes the code to change the domain at run time, which seems to not actually match the subject or description. I'd need to look at the final code, but it seems like this will still cause bounce buffering instead of using dynamic mapping, which still seems like an awful idea. Also from a purely stylistic perspective a lot of the lines seem very short and not use up the whole 73 charaters allowed. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu