Re: [PATCH v4 1/1] iommu/io-pgtable-arm: Add support to use system cache
Hi Robin,

On Mon, May 13, 2019 at 5:02 PM Robin Murphy wrote:
> On 13/05/2019 11:04, Vivek Gautam wrote:
> > A few Qualcomm platforms, such as sdm845, have an additional outer
> > cache called the System cache, aka Last Level Cache (LLC), that
> > allows non-coherent devices to upgrade to using caching.
> > This cache sits right before the DDR, and is tightly coupled
> > with the memory controller. The clients using this cache request
> > their slices from this system cache, make it active, and can then
> > start using it.
> >
> > There is a fundamental assumption that non-coherent devices can't
> > access caches. This change adds an exception where they *can* use
> > some level of cache despite still being non-coherent overall.
> > The coherent devices that use cacheable memory, and the CPU, make use
> > of this system cache by default.
> >
> > Looking at memory types, we have the following -
> > a) Normal uncached :- MAIR 0x44, inner non-cacheable,
> >    outer non-cacheable;
> > b) Normal cached :- MAIR 0xff, inner read write-back non-transient,
> >    outer read write-back non-transient;
> >    attribute setting for coherent I/O devices.
> > and, for non-coherent I/O devices that can allocate in system cache
> > another type gets added -
> > c) Normal sys-cached :- MAIR 0xf4, inner non-cacheable,
> >    outer read write-back non-transient
> >
> > Coherent I/O devices use the system cache by marking the memory as
> > normal cached.
> > Non-coherent I/O devices should mark the memory as normal
> > sys-cached in page tables to use the system cache.
> >
> > Signed-off-by: Vivek Gautam
> > ---
> >
> > V3 version of this patch and related series can be found at [1].
> >
> > This change is a realisation of the following changes from downstream msm-4.9:
> > iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT [2]
> >
> > Changes since v3:
> >  - Dropped support for caching I/O page tables in the system cache.
> >    Getting support for data buffers is the first step.
> >    Removed the io-pgtable quirk and the related change to add a domain
> >    attribute.
> >
> > Glmark2 numbers on the SDM845-based cheza board:
> >
> > S.No. | with LLC support  | without LLC support
> >       | for data buffers  |
> > ------+-------------------+--------------------
> >     1 | 4480; 72.3fps     | 4042; 65.2fps
> >     2 | 4500; 72.6fps     | 4039; 65.1fps
> >     3 | 4523; 72.9fps     | 4106; 66.2fps
> >     4 | 4489; 72.4fps     | 4104; 66.2fps
> >     5 | 4518; 72.9fps     | 4072; 65.7fps
> >
> > [1] https://patchwork.kernel.org/cover/10772629/
> > [2] https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> >
> >   drivers/iommu/io-pgtable-arm.c | 9 ++++++++-
> >   include/linux/iommu.h          | 1 +
> >   2 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index d3700ec15cbd..2dbafe697531 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -167,10 +167,12 @@
> >   #define ARM_LPAE_MAIR_ATTR_MASK		0xff
> >   #define ARM_LPAE_MAIR_ATTR_DEVICE		0x04
> >   #define ARM_LPAE_MAIR_ATTR_NC			0x44
> > +#define ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE	0xf4
> >   #define ARM_LPAE_MAIR_ATTR_WBRWA		0xff
> >   #define ARM_LPAE_MAIR_ATTR_IDX_NC		0
> >   #define ARM_LPAE_MAIR_ATTR_IDX_CACHE		1
> >   #define ARM_LPAE_MAIR_ATTR_IDX_DEV		2
> > +#define ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE	3
>
> Here at the implementation level, I'd rather just call these what they
> are, i.e. s/QCOM_SYS_CACHE/INC_OWBRWA/.

Thanks for the review. Sure, will change this as suggested.

> >   /* IOPTE accessors */
> >   #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
> > @@ -442,6 +444,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> >   	else if (prot & IOMMU_CACHE)
> >   		pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
> >   			<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
> > +	else if (prot & IOMMU_QCOM_SYS_CACHE)
>
> Where in the call stack is this going to be decided? (I don't recall the
> previous discussions ever really reaching a solid conclusion on how to
> separate responsibilities).

Based on the last discussion [1], I understood that we may not want to expose these cache protections to DMA APIs. So such control would lie with the masters that are creating the individual domains. An example [2] of this is graphics on sdm845. Please ignore the change in naming at [2]; IOMMU_UPSTREAM_HINT in [2] is the same as IOMMU_QCOM_SYS_CACHE here.

At that point [1] I also pointed to the fact that video, which uses the DMA API to handle buffers, also uses the system cache on sdm845. In this case shouldn't we expose the protection co
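As a minimal sketch of the master-driven flow described above: a driver that owns its own domain would request the sys-cached memory type at map time with the flag this patch adds. The function and its arguments here are hypothetical; iommu_map() is the existing API.

#include <linux/iommu.h>

/* Illustrative only: opt one buffer mapping into the LLC. */
static int example_map_with_llc_hint(struct iommu_domain *domain,
				     unsigned long iova, phys_addr_t paddr,
				     size_t size)
{
	int prot = IOMMU_READ | IOMMU_WRITE | IOMMU_QCOM_SYS_CACHE;

	return iommu_map(domain, iova, paddr, size, prot);
}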
Re: [RFC] iommu: arm-smmu: stall support
On Mon, May 13, 2019 at 11:37 AM Jean-Philippe Brucker wrote:
>
> Hi Rob,
>
> On 10/05/2019 19:23, Rob Clark wrote:
> > On Fri, Sep 22, 2017 at 2:58 AM Jean-Philippe Brucker wrote:
> >>
> >> On 22/09/17 10:02, Joerg Roedel wrote:
> >>> On Tue, Sep 19, 2017 at 10:23:43AM -0400, Rob Clark wrote:
> >>>> I would like to decide in the IRQ whether or not to queue work or not,
> >>>> because when we get a gpu fault, we tend to get 1000's of gpu faults
> >>>> all at once (and I really only need to handle the first one). I
> >>>> suppose that could also be achieved by having a special return value
> >>>> from the fault handler to say "call me again from a wq"..
> >>>>
> >>>> Note that in the drm driver I already have a suitable wq to queue the
> >>>> work, so it really doesn't buy me anything to have the iommu driver
> >>>> toss things off to a wq for me. Might be a different situation for
> >>>> other drivers (but I guess mostly other drivers are using iommu API
> >>>> indirectly via dma-mapping?)
> >>>
> >>> Okay, so since you are the only user for now, we don't need a
> >>> work-queue. But I still want the ->resume call-back to be hidden in the
> >>> iommu code and not be exposed to users.
> >>>
> >>> We already have per-domain fault-handlers, so the best solution for now
> >>> is to call ->resume from report_iommu_fault() when the fault-handler
> >>> returns a special value.
> >>
> >> The problem is that report_iommu_fault is called from IRQ context by the
> >> SMMU driver, so the device driver callback cannot sleep.
> >>
> >> So if the device driver needs to be able to sleep between fault report and
> >> resume, as I understand Rob needs for writing debugfs, we can either:
> >>
> >> * call report_iommu_fault from higher up, in a thread or workqueue.
> >> * split the fault reporting as this patch proposes. The exact same
> >> mechanism is needed for the vSVM work by Intel: in order to inject fault
> >> into the guest, they would like to have an atomic notifier registered by
> >> VFIO for passing down the Page Request, and a new function in the IOMMU
> >> API to resume/complete the fault.
> >>
> >
> > So I was thinking about this topic again.. I would still like to get
> > some sort of async resume so that I can wire up GPU cmdstream/state
> > logging on iommu fault (without locally resurrecting and rebasing this
> > patch and drm/msm side changes each time I need to debug iommu
> > faults)..
>
> We've been working on the new fault reporting API with Jacob and Eric,
> and I intend to send it out soon. It is supposed to be used for
> reporting faults to guests via VFIO, handling page faults via mm, and
> also reporting events directly to device drivers. Please let us know
> what works and what doesn't in your case.
>
> The most recent version of the patches is at
> http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/api
> (git://www.linux-arm.org/linux-jpb.git branch sva/api). Hopefully on the
> list sometime next week, I'll add you on Cc.
>
> In particular, see commits
>   iommu: Introduce device fault data
>   iommu: Introduce device fault report API
>   iommu: Add recoverable fault reporting
>
> The device driver calls iommu_register_device_fault_handler(dev, cb,
> data). To report a fault, the SMMU driver calls
> iommu_report_device_fault(dev, fault). This calls into the device driver
> directly, there isn't any workqueue. If the fault is recoverable (the
> SMMU driver set type IOMMU_FAULT_PAGE_REQ rather than
> IOMMU_FAULT_DMA_UNRECOV), the device driver calls iommu_page_response()
> once it has dealt with the fault (after sleeping if it needs to). This
> invokes the SMMU driver's resume callback.

Ok, this sounds at a high level similar to my earlier RFC, in that resume is split (and that was the main thing I was interested in). And it does solve one thing I was struggling with, namely that when the domain is created it doesn't know which iommu device it will be attached to (given that at least the original arm-smmu.c driver cannot support stall in all cases)..

For GPU translation faults, I also don't really need to know if the faulting translation is stalled until the callback (I mainly want to not bother to snapshot GPU state if the translation is not stalled, because in that case the data we snapshot is unlikely to be related to the fault).

> At the moment we use mutexes, so iommu_report_device_fault() can only be
> called from an IRQ thread, which is incompatible with the current SMMUv2
> driver. Either we need to switch the SMMUv2 driver to an IRQ thread, or
> rework the fault handler to be called from an IRQ handler. The reporting
> also has to be per device rather than per domain, and I'm not sure if
> the SMMUv2 driver can deal with this.

I'll take a closer look at the branch and try to formulate some plan to add v2 support for this. For my cases, the GPU always has its own iommu device, while display and other blocks s
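A rough sketch of the deferred-resume flow Rob describes, on top of the proposed API: all gpu_* names are hypothetical, and the struct layouts follow the sva/api branch mentioned above, so they may change before merging.

static int gpu_fault_cb(struct iommu_fault *fault, void *data)
{
	struct gpu_device *gpu = data;			/* hypothetical driver state */

	if (fault->type == IOMMU_FAULT_PAGE_REQ) {
		gpu->prm = fault->prm;			/* stash for the response */
		queue_work(gpu->wq, &gpu->fault_work);	/* snapshot in process context */
		return 0;
	}
	return -ENOSYS;					/* unrecoverable: nothing to resume */
}

static void gpu_fault_worker(struct work_struct *work)
{
	struct gpu_device *gpu = container_of(work, struct gpu_device, fault_work);
	struct iommu_page_response resp = {
		.pasid	= gpu->prm.pasid,
		.grpid	= gpu->prm.grpid,
		.code	= IOMMU_PAGE_RESP_INVALID,	/* terminate the faulting access */
	};

	gpu_snapshot_state(gpu);			/* hypothetical; may sleep */
	iommu_page_response(gpu->dev, &resp);		/* invokes the SMMU resume callback */
}

/* at probe: iommu_register_device_fault_handler(gpu->dev, gpu_fault_cb, gpu); */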
Re: [PATCH v3 02/16] iommu: Introduce cache_invalidate API
On Mon, 13 May 2019 18:09:48 +0100 Jean-Philippe Brucker wrote:

> On 13/05/2019 17:50, Auger Eric wrote:
> >> struct iommu_inv_pasid_info {
> >> #define IOMMU_INV_PASID_FLAGS_PASID	(1 << 0)
> >> #define IOMMU_INV_PASID_FLAGS_ARCHID	(1 << 1)
> >> 	__u32	flags;
> >> 	__u32	archid;
> >> 	__u64	pasid;
> >> };
> > I agree it does the job now. However it looks a bit strange to do a
> > PASID based invalidation in my case - SMMUv3 nested stage - where I
> > don't have any PASID involved.
> >
> > Couldn't we call it context based invalidation then? A context can
> > be tagged by a PASID or/and an ARCHID.
>
> I think calling it "context" would be confusing as well (I shouldn't
> have used it earlier), since VT-d uses that name for device table
> entries (=STE on Arm SMMU). Maybe "addr_space"?
>
I am still struggling to understand what ARCHID is after scanning through the SMMUv3.1 spec. It seems to be a constant for a given SMMU. Why do you need to pass it down every time? Could you point me to the document or explain a little more about ARCHID use cases?

We have three fields called pasid under struct iommu_cache_invalidate_info{}. Gets confusing :)

> Thanks,
> Jean
>
> > Domain invalidation would invalidate all the contexts belonging to
> > that domain.
> >
> > Thanks
> >
> > Eric

[Jacob Pan]
Re: [RFC] iommu: arm-smmu: stall support
Hi Rob,

On 10/05/2019 19:23, Rob Clark wrote:
> On Fri, Sep 22, 2017 at 2:58 AM Jean-Philippe Brucker wrote:
>>
>> On 22/09/17 10:02, Joerg Roedel wrote:
>>> On Tue, Sep 19, 2017 at 10:23:43AM -0400, Rob Clark wrote:
>>>> I would like to decide in the IRQ whether or not to queue work or not,
>>>> because when we get a gpu fault, we tend to get 1000's of gpu faults
>>>> all at once (and I really only need to handle the first one). I
>>>> suppose that could also be achieved by having a special return value
>>>> from the fault handler to say "call me again from a wq"..
>>>>
>>>> Note that in the drm driver I already have a suitable wq to queue the
>>>> work, so it really doesn't buy me anything to have the iommu driver
>>>> toss things off to a wq for me. Might be a different situation for
>>>> other drivers (but I guess mostly other drivers are using iommu API
>>>> indirectly via dma-mapping?)
>>>
>>> Okay, so since you are the only user for now, we don't need a
>>> work-queue. But I still want the ->resume call-back to be hidden in the
>>> iommu code and not be exposed to users.
>>>
>>> We already have per-domain fault-handlers, so the best solution for now
>>> is to call ->resume from report_iommu_fault() when the fault-handler
>>> returns a special value.
>>
>> The problem is that report_iommu_fault is called from IRQ context by the
>> SMMU driver, so the device driver callback cannot sleep.
>>
>> So if the device driver needs to be able to sleep between fault report and
>> resume, as I understand Rob needs for writing debugfs, we can either:
>>
>> * call report_iommu_fault from higher up, in a thread or workqueue.
>> * split the fault reporting as this patch proposes. The exact same
>> mechanism is needed for the vSVM work by Intel: in order to inject fault
>> into the guest, they would like to have an atomic notifier registered by
>> VFIO for passing down the Page Request, and a new function in the IOMMU
>> API to resume/complete the fault.
>>
>
> So I was thinking about this topic again.. I would still like to get
> some sort of async resume so that I can wire up GPU cmdstream/state
> logging on iommu fault (without locally resurrecting and rebasing this
> patch and drm/msm side changes each time I need to debug iommu
> faults)..

We've been working on the new fault reporting API with Jacob and Eric, and I intend to send it out soon. It is supposed to be used for reporting faults to guests via VFIO, handling page faults via mm, and also reporting events directly to device drivers. Please let us know what works and what doesn't in your case.

The most recent version of the patches is at http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/api (git://www.linux-arm.org/linux-jpb.git branch sva/api). Hopefully on the list sometime next week, I'll add you on Cc.

In particular, see commits
  iommu: Introduce device fault data
  iommu: Introduce device fault report API
  iommu: Add recoverable fault reporting

The device driver calls iommu_register_device_fault_handler(dev, cb, data). To report a fault, the SMMU driver calls iommu_report_device_fault(dev, fault). This calls into the device driver directly, there isn't any workqueue. If the fault is recoverable (the SMMU driver set type IOMMU_FAULT_PAGE_REQ rather than IOMMU_FAULT_DMA_UNRECOV), the device driver calls iommu_page_response() once it has dealt with the fault (after sleeping if it needs to). This invokes the SMMU driver's resume callback.

At the moment we use mutexes, so iommu_report_device_fault() can only be called from an IRQ thread, which is incompatible with the current SMMUv2 driver. Either we need to switch the SMMUv2 driver to an IRQ thread, or rework the fault handler to be called from an IRQ handler. The reporting also has to be per device rather than per domain, and I'm not sure if the SMMUv2 driver can deal with this.

> And I do still prefer the fault cb in irq (or not requiring it in
> wq).. but on thinking about it, the two ideas aren't entirely
> conflicting, ie. add some flags either when we register handler[1], or
> they could be handled thru domain_set_attr, like:
>
>   _EXPLICIT_RESUME - iommu API user calls iommu_domain_resume(),
>   potentially from wq/thread after fault handler returns
>   _HANDLER_SLEEPS  - iommu core handles the wq, and calls ops->resume()
>   internally
>
> In both cases, from the iommu driver PoV it just implements
> iommu_ops::resume().. in first case it is called via iommu user either
> from the fault handler or at some point later (ie. wq or thread).
>
> I don't particularly need the _HANDLER_SLEEPS case (unless I can't
> convince anyone that iommu_domain_resume() called from outside iommu
> core is a good idea).. so probably I wouldn't wire up the wq plumbing
> for the _HANDLER_SLEEPS case unless someone really wanted me to.
>
> Since there are more iommu drivers, than places that register fault
> handlers, I like the idea that in either case,
Re: [PATCH v3 02/16] iommu: Introduce cache_invalidate API
On 13/05/2019 17:50, Auger Eric wrote:
>> struct iommu_inv_pasid_info {
>> #define IOMMU_INV_PASID_FLAGS_PASID	(1 << 0)
>> #define IOMMU_INV_PASID_FLAGS_ARCHID	(1 << 1)
>> 	__u32	flags;
>> 	__u32	archid;
>> 	__u64	pasid;
>> };
> I agree it does the job now. However it looks a bit strange to do a
> PASID based invalidation in my case - SMMUv3 nested stage - where I
> don't have any PASID involved.
>
> Couldn't we call it context based invalidation then? A context can be
> tagged by a PASID or/and an ARCHID.

I think calling it "context" would be confusing as well (I shouldn't have used it earlier), since VT-d uses that name for device table entries (=STE on Arm SMMU). Maybe "addr_space"?

Thanks,
Jean

>
> Domain invalidation would invalidate all the contexts belonging to that
> domain.
>
> Thanks
>
> Eric
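For the SMMUv3 nested-stage case Eric describes, the fill-in would presumably look like this (illustrative only; asid is taken from the guest's CMD_TLBI_NH_ASID, with no PASID involved):

struct iommu_inv_pasid_info info = {
	.flags	= IOMMU_INV_PASID_FLAGS_ARCHID,	/* tag by ARCHID (the ASID) only */
	.archid	= asid,				/* guest ASID; .pasid left unused */
};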
Re: [PATCH 3/4] iommu/vt-d: Handle RMRR with PCI bridge device scopes
Hi Jacob,

On 5/13/19 6:41 PM, Jacob Pan wrote:
> On Mon, 13 May 2019 09:13:01 +0200
> Eric Auger wrote:
>
>> When reading the vtd specification and especially the
>> Reserved Memory Region Reporting Structure chapter,
>> it is not obvious a device scope element cannot be a
>> PCI-PCI bridge, in which case all downstream ports are
>> likely to access the reserved memory region. Let's handle
>> this case in device_has_rmrr.
>>
>> Fixes: ea2447f700ca ("intel-iommu: Prevent devices with RMRRs from
>> being placed into SI Domain")
>>
>> Signed-off-by: Eric Auger
>> ---
>>  drivers/iommu/intel-iommu.c | 32 +++-
>>  1 file changed, 23 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index e2134b13c9ae..89d82a1d50b1 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -736,12 +736,31 @@ static int iommu_dummy(struct device *dev)
>>  	return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
>>  }
>>
>> +static bool
>> +is_downstream_to_pci_bridge(struct device *deva, struct device *devb)
>> +{
>> +	struct pci_dev *pdeva, *pdevb;
>> +
> is there a more illustrative name for these. i guess deva is the
> bridge dev?

deva is the candidate PCI device likely to belong to the PCI sub-hierarchy of devb (the candidate bridge). My concern about the naming was that they are not necessarily a PCI device or a bridge. But at least I can add a comment or rename according to your suggestion ;-)

>> +	if (!dev_is_pci(deva) || !dev_is_pci(devb))
>> +		return false;
>> +
>> +	pdeva = to_pci_dev(deva);
>> +	pdevb = to_pci_dev(devb);
>> +
>> +	if (pdevb->subordinate &&
>> +	    pdevb->subordinate->number <= pdeva->bus->number &&
>> +	    pdevb->subordinate->busn_res.end >= pdeva->bus->number)
>> +		return true;
>> +
>> +	return false;
>> +}
>> +
> this seems to be a separate cleanup patch.

I can split into 2 patches: introduction of this helper and its usage in device_to_iommu, and a second patch using it in device_has_rmrr.

>>  static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
>>  {
>>  	struct dmar_drhd_unit *drhd = NULL;
>>  	struct intel_iommu *iommu;
>>  	struct device *tmp;
>> -	struct pci_dev *ptmp, *pdev = NULL;
>> +	struct pci_dev *pdev = NULL;
>>  	u16 segment = 0;
>>  	int i;
>>
>> @@ -787,13 +806,7 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devf
>>  			goto out;
>>  		}
>>
>> -		if (!pdev || !dev_is_pci(tmp))
>> -			continue;
>> -
>> -		ptmp = to_pci_dev(tmp);
>> -		if (ptmp->subordinate &&
>> -		    ptmp->subordinate->number <= pdev->bus->number &&
>> -		    ptmp->subordinate->busn_res.end >= pdev->bus->number)
>> +		if (is_downstream_to_pci_bridge(dev, tmp))
>>  			goto got_pdev;
>>  	}
>>
>> @@ -2886,7 +2899,8 @@ static bool device_has_rmrr(struct device *dev)
>>  	 */
>>  	for_each_active_dev_scope(rmrr->devices,
>>  				  rmrr->devices_cnt, i, tmp)
>> -		if (tmp == dev) {
>> +		if (tmp == dev ||
>> +		    is_downstream_to_pci_bridge(dev, tmp)) {
>>  			rcu_read_unlock();
>>  			return true;
>>  		}
>
> [Jacob Pan]

Thanks

Eric
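One possible shape of the renamed helper with the clarifying comment Eric mentions (purely illustrative, not the actual follow-up patch):

static bool
is_downstream_to_pci_bridge(struct device *dev, struct device *bridge)
{
	struct pci_dev *pdev, *pbridge;

	if (!dev_is_pci(dev) || !dev_is_pci(bridge))
		return false;

	pdev = to_pci_dev(dev);
	pbridge = to_pci_dev(bridge);

	/*
	 * A PCI-PCI bridge assigns a contiguous range of bus numbers to
	 * everything behind it: dev sits below bridge iff its bus number
	 * falls within the bridge's [secondary, subordinate] window.
	 */
	if (pbridge->subordinate &&
	    pbridge->subordinate->number <= pdev->bus->number &&
	    pbridge->subordinate->busn_res.end >= pdev->bus->number)
		return true;

	return false;
}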
Re: [PATCH v3 02/16] iommu: Introduce cache_invalidate API
Hi Jean-Philippe,

On 5/13/19 1:20 PM, Jean-Philippe Brucker wrote:
> Hi Eric,
>
> On 13/05/2019 10:14, Auger Eric wrote:
>> I noticed my qemu integration was currently incorrectly using PASID
>> invalidation for ASID based invalidation (SMMUV3 Stage1 CMD_TLBI_NH_ASID
>> invalidation command). So I think we also need ARCHID invalidation.
>> Sorry for the late notice.
>>>
>>> +/* defines the granularity of the invalidation */
>>> +enum iommu_inv_granularity {
>>> +	IOMMU_INV_GRANU_DOMAIN,	/* domain-selective invalidation */
>> 	IOMMU_INV_GRANU_ARCHID,	/* archid-selective invalidation */
>>> +	IOMMU_INV_GRANU_PASID,	/* pasid-selective invalidation */
>
> In terms of granularity, these values have the same meaning: invalidate
> the whole address space of a context. Then you can communicate two
> things using the same struct:
> * If ATS is enabled, an Arm host needs to invalidate all ATC entries
> using PASID.
> * If BTM isn't used by the guest, the host needs to invalidate all TLB
> entries using ARCHID.
>
> Rather than introducing a new granule here, could we just add an archid
> field to the struct associated with IOMMU_INV_GRANU_PASID? Something like...
>
>>> +	IOMMU_INV_GRANU_ADDR,	/* page-selective invalidation */
>>> +	IOMMU_INVAL_GRANU_NR,	/* number of invalidation granularities */
>>> +};
>>> +
>>> +/**
>>> + * Address Selective Invalidation Structure
>>> + *
>>> + * @flags indicates the granularity of the address-selective invalidation
>>> + * - if PASID bit is set, @pasid field is populated and the invalidation
>>> + *   relates to cache entries tagged with this PASID and matching the
>>> + *   address range.
>>> + * - if ARCHID bit is set, @archid is populated and the invalidation relates
>>> + *   to cache entries tagged with this architecture specific id and matching
>>> + *   the address range.
>>> + * - Both PASID and ARCHID can be set as they may tag different caches.
>>> + * - if neither PASID or ARCHID is set, global addr invalidation applies
>>> + * - LEAF flag indicates whether only the leaf PTE caching needs to be
>>> + *   invalidated and other paging structure caches can be preserved.
>>> + * @pasid: process address space id
>>> + * @archid: architecture-specific id
>>> + * @addr: first stage/level input address
>>> + * @granule_size: page/block size of the mapping in bytes
>>> + * @nb_granules: number of contiguous granules to be invalidated
>>> + */
>>> +struct iommu_inv_addr_info {
>>> +#define IOMMU_INV_ADDR_FLAGS_PASID	(1 << 0)
>>> +#define IOMMU_INV_ADDR_FLAGS_ARCHID	(1 << 1)
>>> +#define IOMMU_INV_ADDR_FLAGS_LEAF	(1 << 2)
>>> +	__u32	flags;
>>> +	__u32	archid;
>>> +	__u64	pasid;
>>> +	__u64	addr;
>>> +	__u64	granule_size;
>>> +	__u64	nb_granules;
>>> +};
>
> struct iommu_inv_pasid_info {
> #define IOMMU_INV_PASID_FLAGS_PASID	(1 << 0)
> #define IOMMU_INV_PASID_FLAGS_ARCHID	(1 << 1)
> 	__u32	flags;
> 	__u32	archid;
> 	__u64	pasid;
> };

I agree it does the job now. However it looks a bit strange to do a PASID based invalidation in my case - SMMUv3 nested stage - where I don't have any PASID involved.

Couldn't we call it context based invalidation then? A context can be tagged by a PASID or/and an ARCHID.

Domain invalidation would invalidate all the contexts belonging to that domain.

Thanks

Eric

>
>>> +
>>> +/**
>>> + * First level/stage invalidation information
>>> + * @cache: bitfield that allows to select which caches to invalidate
>>> + * @granularity: defines the lowest granularity used for the invalidation:
>>> + *     domain > pasid > addr
>>> + *
>>> + * Not all the combinations of cache/granularity make sense:
>>> + *
>>> + * type        |   DEV_IOTLB   |     IOTLB     |     PASID     |
>>> + * granularity |               |               |     cache     |
>>> + * ------------+---------------+---------------+---------------+
>>> + * DOMAIN      |      N/A      |       Y       |       Y       |
>> * ARCHID      |      N/A      |       Y       |      N/A      |
>>> + * PASID       |       Y       |       Y       |       Y       |
>>> + * ADDR        |       Y       |       Y       |      N/A      |
>>> + *
>>> + * Invalidations by %IOMMU_INV_GRANU_ADDR use field @addr_info.
>> * Invalidations by %IOMMU_INV_GRANU_ARCHID use field @archid.
>>> + * Invalidations by %IOMMU_INV_GRANU_PASID use field @pasid.
>>> + * Invalidations by %IOMMU_INV_GRANU_DOMAIN don't take any argument.
>>> + *
>>> + * If multiple cache types are invalidated simultaneously, they all
>>> + * must support the used granularity.
>>> + */
>>> +struct iommu_cache_invalidate_info {
>>> +#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
>>> +	__u32	version;
>>> +/* IOMMU paging structure cache */
>>> +#define IOMMU_CACHE_INV_TYPE_IOTLB	(1 << 0) /* IOMMU IOTLB */
>>> +#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB	(1 << 1) /* Device IOTLB */
>>> +#define IOMMU_CACHE_INV_TYPE_PASID	(1 << 2) /* PASID cache */
>>> +#
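Putting Eric's ARCHID case together with the address-selective path, a guest CMD_TLBI_NH_VA over one block might be encoded along these lines. Illustrative only: the spelling of the cache/granularity members is assumed, since the struct definition is cut off above, and asid/iova are placeholders.

struct iommu_cache_invalidate_info inv = {
	.version	= IOMMU_CACHE_INVALIDATE_INFO_VERSION_1,
	.cache		= IOMMU_CACHE_INV_TYPE_IOTLB,
	.granularity	= IOMMU_INV_GRANU_ADDR,
	.addr_info	= {
		.flags		= IOMMU_INV_ADDR_FLAGS_ARCHID,
		.archid		= asid,		/* guest ASID */
		.addr		= iova,
		.granule_size	= SZ_2M,	/* one 2MB block */
		.nb_granules	= 1,
	},
};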
Re: [PATCH 3/4] iommu/vt-d: Handle RMRR with PCI bridge device scopes
On Mon, 13 May 2019 09:13:01 +0200 Eric Auger wrote:

> When reading the vtd specification and especially the
> Reserved Memory Region Reporting Structure chapter,
> it is not obvious a device scope element cannot be a
> PCI-PCI bridge, in which case all downstream ports are
> likely to access the reserved memory region. Let's handle
> this case in device_has_rmrr.
>
> Fixes: ea2447f700ca ("intel-iommu: Prevent devices with RMRRs from
> being placed into SI Domain")
>
> Signed-off-by: Eric Auger
> ---
>  drivers/iommu/intel-iommu.c | 32 +++-
>  1 file changed, 23 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index e2134b13c9ae..89d82a1d50b1 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -736,12 +736,31 @@ static int iommu_dummy(struct device *dev)
>  	return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
>  }
>
> +static bool
> +is_downstream_to_pci_bridge(struct device *deva, struct device *devb)
> +{
> +	struct pci_dev *pdeva, *pdevb;
> +
is there a more illustrative name for these. i guess deva is the bridge dev?
> +	if (!dev_is_pci(deva) || !dev_is_pci(devb))
> +		return false;
> +
> +	pdeva = to_pci_dev(deva);
> +	pdevb = to_pci_dev(devb);
> +
> +	if (pdevb->subordinate &&
> +	    pdevb->subordinate->number <= pdeva->bus->number &&
> +	    pdevb->subordinate->busn_res.end >= pdeva->bus->number)
> +		return true;
> +
> +	return false;
> +}
> +
this seems to be a separate cleanup patch.
> static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
> {
>  	struct dmar_drhd_unit *drhd = NULL;
>  	struct intel_iommu *iommu;
>  	struct device *tmp;
> -	struct pci_dev *ptmp, *pdev = NULL;
> +	struct pci_dev *pdev = NULL;
>  	u16 segment = 0;
>  	int i;
>
> @@ -787,13 +806,7 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devf
>  			goto out;
>  		}
>
> -	if (!pdev || !dev_is_pci(tmp))
> -		continue;
> -
> -	ptmp = to_pci_dev(tmp);
> -	if (ptmp->subordinate &&
> -	    ptmp->subordinate->number <= pdev->bus->number &&
> -	    ptmp->subordinate->busn_res.end >= pdev->bus->number)
> +	if (is_downstream_to_pci_bridge(dev, tmp))
>  		goto got_pdev;
>  	}
>
> @@ -2886,7 +2899,8 @@ static bool device_has_rmrr(struct device *dev)
>  	 */
>  	for_each_active_dev_scope(rmrr->devices,
>  				  rmrr->devices_cnt, i, tmp)
> -		if (tmp == dev) {
> +		if (tmp == dev ||
> +		    is_downstream_to_pci_bridge(dev, tmp)) {
>  			rcu_read_unlock();
>  			return true;
>  		}

[Jacob Pan]
Re: [PATCH v7 12/23] iommu/smmuv3: Get prepared for nested stage support
Hi Robin,

On 5/13/19 1:43 PM, Robin Murphy wrote:
> On 10/05/2019 15:34, Auger Eric wrote:
>> Hi Robin,
>>
>> On 5/8/19 4:24 PM, Robin Murphy wrote:
>>> On 08/04/2019 13:19, Eric Auger wrote:

To allow nested stage support, we need to store both stage 1 and stage 2 configurations (and remove the former union).

A nested setup is characterized by both s1_cfg and s2_cfg set. We introduce a new ste.abort field that will be set upon guest stage1 configuration passing. If s1_cfg is NULL and ste.abort is set, traffic can't pass. If ste.abort is not set, S1 is bypassed.

arm_smmu_write_strtab_ent() is modified to write both stage fields in the STE and deal with the abort field. In nested mode, only stage 2 is "finalized" as the host does not own/configure the stage 1 context descriptor; the guest does.

Signed-off-by: Eric Auger
---
v4 -> v5:
- reset ste.abort on detach

v3 -> v4:
- s1_cfg.nested_abort and nested_bypass removed.
- s/ste.nested/ste.abort
- arm_smmu_write_strtab_ent modifications with introduction of local abort, bypass and translate local variables
- comment updated

v1 -> v2:
- invalidate the STE before moving from a live STE config to another
- add the nested_abort and nested_bypass fields
---
 drivers/iommu/arm-smmu-v3.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 21d027695181..e22e944ffc05 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -211,6 +211,7 @@
 #define STRTAB_STE_0_CFG_BYPASS		4
 #define STRTAB_STE_0_CFG_S1_TRANS	5
 #define STRTAB_STE_0_CFG_S2_TRANS	6
+#define STRTAB_STE_0_CFG_NESTED		7
 
 #define STRTAB_STE_0_S1FMT		GENMASK_ULL(5, 4)
 #define STRTAB_STE_0_S1FMT_LINEAR	0
@@ -514,6 +515,7 @@ struct arm_smmu_strtab_ent {
 	 * configured according to the domain type.
 	 */
 	bool				assigned;
+	bool				abort;
 	struct arm_smmu_s1_cfg		*s1_cfg;
 	struct arm_smmu_s2_cfg		*s2_cfg;
 };
@@ -628,10 +630,8 @@ struct arm_smmu_domain {
 	bool				non_strict;
 
 	enum arm_smmu_domain_stage	stage;
-	union {
-		struct arm_smmu_s1_cfg	s1_cfg;
-		struct arm_smmu_s2_cfg	s2_cfg;
-	};
+	struct arm_smmu_s1_cfg		s1_cfg;
+	struct arm_smmu_s2_cfg		s2_cfg;
 
 	struct iommu_domain		domain;
 
@@ -1108,12 +1108,13 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
 				      __le64 *dst, struct arm_smmu_strtab_ent *ste)
 {
 	/*
-	 * This is hideously complicated, but we only really care about
-	 * three cases at the moment:
+	 * We care about the following transitions:
 	 *
 	 * 1. Invalid (all zero) -> bypass/fault (init)
-	 * 2. Bypass/fault -> translation/bypass (attach)
-	 * 3. Translation/bypass -> bypass/fault (detach)
+	 * 2. Bypass/fault -> single stage translation/bypass (attach)
+	 * 3. single stage Translation/bypass -> bypass/fault (detach)
+	 * 4. S2 -> S1 + S2 (attach_pasid_table)
+	 * 5. S1 + S2 -> S2 (detach_pasid_table)
 	 *
 	 * Given that we can't update the STE atomically and the SMMU
 	 * doesn't read the thing in a defined order, that leaves us
@@ -1124,7 +1125,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
 	 * 3. Update Config, sync
 	 */
 	u64 val = le64_to_cpu(dst[0]);
-	bool ste_live = false;
+	bool abort, bypass, translate, ste_live = false;
 	struct arm_smmu_cmdq_ent prefetch_cmd = {
 		.opcode		= CMDQ_OP_PREFETCH_CFG,
 		.prefetch	= {
@@ -1138,11 +1139,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
 			break;
 		case STRTAB_STE_0_CFG_S1_TRANS:
 		case STRTAB_STE_0_CFG_S2_TRANS:
+		case STRTAB_STE_0_CFG_NESTED:
 			ste_live = true;
 			break;
 		case STRTAB_STE_0_CFG_ABORT:
-			if (disable_bypass)
-				break;
+			break;
 		default:
 			BUG(); /* STE corruption */
 		}
@@ -1152,8 +1153,13 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
 	val = STRTAB_STE_0_V;
 
 	/* Bypass/fault */
-	if (!ste->assigned || !
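For reference, the abort/bypass rules from the commit message distil to the following (the literal decision code is truncated above):

/*
 * s1_cfg   s2_cfg   ste.abort   resulting STE config
 * ------   ------   ---------   -----------------------------------
 * set      set      -           nested: S1 translate + S2 translate
 * NULL     set      0           stage 1 bypassed, S2 translate
 * NULL     set      1           abort: no traffic passes
 */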
Re: [PATCH v6 1/1] iommu: enhance IOMMU dma mode build options
On 2019/5/8 17:42, John Garry wrote:
> On 18/04/2019 14:57, Zhen Lei wrote:
>> First, add build option IOMMU_DEFAULT_{LAZY|STRICT}, so that we have the
>> opportunity to set {lazy|strict} mode as default at build time. Then put
>> the three config options in a choice, so that people can only choose one of
>> the three at a time.
>>
>> The default IOMMU dma modes on each ARCHs have no change.
>>
>> Signed-off-by: Zhen Lei
>> ---
>>  arch/ia64/kernel/pci-dma.c                | 2 +-
>>  arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
>>  arch/s390/pci/pci_dma.c                   | 2 +-
>>  arch/x86/kernel/pci-dma.c                 | 7 ++---
>>  drivers/iommu/Kconfig                     | 44 ++-
>>  drivers/iommu/amd_iommu_init.c            | 3 ++-
>>  drivers/iommu/intel-iommu.c               | 2 +-
>>  drivers/iommu/iommu.c                     | 3 ++-
>>  8 files changed, 48 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
>> index fe988c49f01ce6a..655511dbf3c3b34 100644
>> --- a/arch/ia64/kernel/pci-dma.c
>> +++ b/arch/ia64/kernel/pci-dma.c
>> @@ -22,7 +22,7 @@
>>  int force_iommu __read_mostly;
>>  #endif
>>
>> -int iommu_pass_through;
>> +int iommu_pass_through = IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH);
>>
>>  static int __init pci_iommu_init(void)
>>  {
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 3ead4c237ed0ec9..383e082a9bb985c 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -85,7 +85,8 @@ void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
>>  	va_end(args);
>>  }
>>
>> -static bool pnv_iommu_bypass_disabled __read_mostly;
>> +static bool pnv_iommu_bypass_disabled __read_mostly =
>> +	!IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH);
>>  static bool pci_reset_phbs __read_mostly;
>>
>>  static int __init iommu_setup(char *str)
>> diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
>> index 9e52d1527f71495..784ad1e0acecfb1 100644
>> --- a/arch/s390/pci/pci_dma.c
>> +++ b/arch/s390/pci/pci_dma.c
>> @@ -17,7 +17,7 @@
>>
>>  static struct kmem_cache *dma_region_table_cache;
>>  static struct kmem_cache *dma_page_table_cache;
>> -static int s390_iommu_strict;
>> +static int s390_iommu_strict = IS_ENABLED(CONFIG_IOMMU_DEFAULT_STRICT);
>>
>>  static int zpci_refresh_global(struct zpci_dev *zdev)
>>  {
>> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
>> index d460998ae828514..fb2bab42a0a3173 100644
>> --- a/arch/x86/kernel/pci-dma.c
>> +++ b/arch/x86/kernel/pci-dma.c
>> @@ -43,11 +43,8 @@
>>   * It is also possible to disable by default in kernel config, and enable with
>>   * iommu=nopt at boot time.
>>   */
>> -#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
>> -int iommu_pass_through __read_mostly = 1;
>> -#else
>> -int iommu_pass_through __read_mostly;
>> -#endif
>> +int iommu_pass_through __read_mostly =
>> +	IS_ENABLED(CONFIG_IOMMU_DEFAULT_PASSTHROUGH);
>>
>>  extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 6f07f3b21816c64..8a1f1793cde76b4 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -74,17 +74,47 @@ config IOMMU_DEBUGFS
>>  	  debug/iommu directory, and then populate a subdirectory with
>>  	  entries as required.
>>
>> -config IOMMU_DEFAULT_PASSTHROUGH
>> -	bool "IOMMU passthrough by default"
>> +choice
>> +	prompt "IOMMU dma mode"
>
> /s/dma/DMA/

OK

>
> And how about adding "default", as in "Default IOMMU DMA mode" or "IOMMU default DMA mode"?

Yes. I prefer "IOMMU default DMA mode".

>
>> 	depends on IOMMU_API
>> -	help
>> -	  Enable passthrough by default, removing the need to pass in
>> -	  iommu.passthrough=on or iommu=pt through command line. If this
>> -	  is enabled, you can still disable with iommu.passthrough=off
>> -	  or iommu=nopt depending on the architecture.
>> +	default IOMMU_DEFAULT_PASSTHROUGH if (PPC_POWERNV && PCI)
>> +	default IOMMU_DEFAULT_LAZY if (AMD_IOMMU || INTEL_IOMMU || S390_IOMMU)
>> +	default IOMMU_DEFAULT_STRICT
>> +	help
>> +	  This option allows IOMMU dma mode to be chosen at build time, to
>
> again, capitalize acronyms, i.e. /s/dma/DMA/ (more of these above and below)

OK, I will check it all. Thanks.

>
>> +	  override the default dma mode of each ARCHs, removing the need to
>> +	  pass in kernel parameters through command line. You can still use
>> +	  ARCHs specific boot options to override this option again.
>> +
>> +config IOMMU_DEFAULT_PASSTHROUGH
>
> I think that it may need to be indented, along with the other choices

There is no problem. I referred to mm/Kconfig.

>
>> +	bool "passthrough"
>> +	help
>> +	  In this mode, the dma access through IOMMU withou
Re: [PATCH v7 14/23] iommu/smmuv3: Implement cache_invalidate
Hi Robin,

On 5/13/19 4:01 PM, Robin Murphy wrote:
> On 13/05/2019 13:16, Auger Eric wrote:
>> Hi Robin,
>> On 5/8/19 5:01 PM, Robin Murphy wrote:
>>> On 08/04/2019 13:19, Eric Auger wrote:

Implement domain-selective and page-selective IOTLB invalidations.

Signed-off-by: Eric Auger
---
v6 -> v7
- check the uapi version

v3 -> v4:
- adapt to changes in the uapi
- add support for leaf parameter
- do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context anymore

v2 -> v3:
- replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync

v1 -> v2:
- properly pass the asid
---
 drivers/iommu/arm-smmu-v3.c | 60 +
 1 file changed, 60 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1486baf53425..4366921d8318 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2326,6 +2326,65 @@ static void arm_smmu_detach_pasid_table(struct iommu_domain *domain)
 	mutex_unlock(&smmu_domain->init_mutex);
 }
 
+static int
+arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
+			  struct iommu_cache_invalidate_info *inv_info)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+		return -EINVAL;
+
+	if (!smmu)
+		return -EINVAL;
+
+	if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+		return -EINVAL;
+
+	if (inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB) {
+		if (inv_info->granularity == IOMMU_INV_GRANU_PASID) {
+			struct arm_smmu_cmdq_ent cmd = {
+				.opcode = CMDQ_OP_TLBI_NH_ASID,
+				.tlbi = {
+					.vmid = smmu_domain->s2_cfg.vmid,
+					.asid = inv_info->pasid,
+				},
+			};
+
+			arm_smmu_cmdq_issue_cmd(smmu, &cmd);
+			arm_smmu_cmdq_issue_sync(smmu);

>>> I'd much rather make arm_smmu_tlb_inv_context() understand nested
>>> domains than open-code commands all over the place.

+
+		} else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR) {
+			struct iommu_inv_addr_info *info = &inv_info->addr_info;
+			size_t size = info->nb_granules * info->granule_size;
+			bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF;
+			struct arm_smmu_cmdq_ent cmd = {
+				.opcode = CMDQ_OP_TLBI_NH_VA,
+				.tlbi = {
+					.addr = info->addr,
+					.vmid = smmu_domain->s2_cfg.vmid,
+					.asid = info->pasid,
+					.leaf = leaf,
+				},
+			};
+
+			do {
+				arm_smmu_cmdq_issue_cmd(smmu, &cmd);
+				cmd.tlbi.addr += info->granule_size;
+			} while (size -= info->granule_size);
+			arm_smmu_cmdq_issue_sync(smmu);

>>> And this in particular I would really like to go all the way through
>>> io_pgtable_tlb_add_flush()/io_pgtable_sync() if at all possible. Hooking
>>> up range-based invalidations is going to be a massive headache if the
>>> abstraction isn't solid.
>>
>> The concern is the host does not "own" the s1 config asid
>> (smmu_domain->s1_cfg.cd.asid is not set, practically). In our case the
>> asid is only passed by userspace on the CACHE_INVALIDATE ioctl call.
>>
>> arm_smmu_tlb_inv_context and arm_smmu_tlb_inv_range_nosync use this field
>
> Right, but that's not exactly hard to solve. Even just something like
> the (untested, purely illustrative) refactoring below would be beneficial.

Sure I can go this way. Thank you for detailing.

Eric

> Robin.
>
> ->8-
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index d3880010c6cf..31ef703cf671 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -1423,11 +1423,9 @@ static void arm_smmu_tlb_inv_context(void *cookie)
>  	arm_smmu_cmdq_issue_sync(smmu);
>  }
>
> -static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
> -					  size_t granule, bool leaf, void *cookie)
> +static void __arm_smmu_tlb_inv_range(struct arm_smmu_domain *smmu_domain, u16 asid,
> +		unsigned long iova, size_t size, size_t granule, bool leaf)
>  {
> -	struct arm_smmu_domain *smmu_domain = cookie;
> -	struct arm_smmu_device *smmu = smmu_domain->smmu;
>  	struct arm_smmu_cmdq_ent cmd = {
>  		.tlbi = {
>  			.leaf	= leaf,
> @@ -1437,18 +1435,27 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
>
>  	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
>  		cmd.opcode	= CMDQ_OP_TLBI_NH_VA;
> -		cmd.tlbi.asid	= smmu_domain->s1_cfg.cd.asid;
> +		cmd.tlbi.asid	= asid;
>  	} else {
>  		cmd.opcode	= CMDQ_OP_TLBI_S2_IPA;
>  		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
>  	}
>
>  	do {
> -		arm_smmu_cmdq_issue_cmd(smmu, &cmd);
> +		arm_smmu_cmdq_issue_cmd(smmu_domain->s
Re: [PATCH v7 14/23] iommu/smmuv3: Implement cache_invalidate
On 13/05/2019 13:16, Auger Eric wrote:

Hi Robin,

On 5/8/19 5:01 PM, Robin Murphy wrote: On 08/04/2019 13:19, Eric Auger wrote: Implement domain-selective and page-selective IOTLB invalidations. Signed-off-by: Eric Auger --- v6 -> v7 - check the uapi version v3 -> v4: - adapt to changes in the uapi - add support for leaf parameter - do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context anymore v2 -> v3: - replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync v1 -> v2: - properly pass the asid --- drivers/iommu/arm-smmu-v3.c | 60 + 1 file changed, 60 insertions(+) [patch body as quoted in the reply above]

I'd much rather make arm_smmu_tlb_inv_context() understand nested domains than open-code commands all over the place.

And this in particular I would really like to go all the way through io_pgtable_tlb_add_flush()/io_pgtable_sync() if at all possible. Hooking up range-based invalidations is going to be a massive headache if the abstraction isn't solid.

The concern is the host does not "own" the s1 config asid (smmu_domain->s1_cfg.cd.asid is not set, practically). In our case the asid is only passed by userspace on the CACHE_INVALIDATE ioctl call.

arm_smmu_tlb_inv_context and arm_smmu_tlb_inv_range_nosync use this field

Right, but that's not exactly hard to solve. Even just something like the (untested, purely illustrative) refactoring below would be beneficial.

Robin.

->8-
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d3880010c6cf..31ef703cf671 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1423,11 +1423,9 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 	arm_smmu_cmdq_issue_sync(smmu);
 }
 
-static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
-					  size_t granule, bool leaf, void *cookie)
+static void __arm_smmu_tlb_inv_range(struct arm_smmu_domain *smmu_domain, u16 asid,
+		unsigned long iova, size_t size, size_t granule, bool leaf)
 {
-	struct arm_smmu_domain *smmu_domain = cookie;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cmdq_ent cmd = {
 		.tlbi = {
 			.leaf	= leaf,
@@ -1437,18 +1435,27 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 
 	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
 		cmd.opcode	= CMDQ_OP_TLBI_NH_VA;
-		cmd.tlbi.asid	= smmu_domain->s1_cfg.cd.asid;
+		cmd.tlbi.asid	= asid;
 	} else {
 		cmd.opcode	= CMDQ_OP_TLBI_S2_IPA;
 		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
 	}
 
 	do {
-		arm_smmu_cmdq_issue_cmd(smmu, &cmd);
+		arm_smmu_cmdq_issue_cmd(smmu_domain->s
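With that helper in place, the range path of arm_smmu_cache_invalidate() could presumably collapse to a single call. Illustrative only; info and the LEAF flag come from the cache_invalidate patch quoted earlier in the thread:

/* Pass the guest-supplied ASID instead of s1_cfg.cd.asid. */
__arm_smmu_tlb_inv_range(smmu_domain, info->pasid, info->addr,
			 info->nb_granules * info->granule_size,
			 info->granule_size,
			 info->flags & IOMMU_INV_ADDR_FLAGS_LEAF);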
Re: [PATCH v7 18/23] iommu/smmuv3: Report non recoverable faults
On 13/05/2019 13:32, Auger Eric wrote:

Hi Robin,

On 5/13/19 1:54 PM, Robin Murphy wrote:
> On 13/05/2019 08:46, Auger Eric wrote:
>> Hi Robin,
>>
>> On 5/8/19 7:20 PM, Robin Murphy wrote:
>>> On 08/04/2019 13:19, Eric Auger wrote:

When a stage 1 related fault event is read from the event queue, let's propagate it to potential external fault listeners, i.e. users who registered a fault handler.

Signed-off-by: Eric Auger
---
v4 -> v5:
- s/IOMMU_FAULT_PERM_INST/IOMMU_FAULT_PERM_EXEC
---
 drivers/iommu/arm-smmu-v3.c | 169 +---
 1 file changed, 158 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 805bc32a..1fd320788dcb 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -167,6 +167,26 @@
 #define ARM_SMMU_PRIQ_IRQ_CFG1		0xd8
 #define ARM_SMMU_PRIQ_IRQ_CFG2		0xdc
 
+/* Events */
+#define ARM_SMMU_EVT_F_UUT		0x01
+#define ARM_SMMU_EVT_C_BAD_STREAMID	0x02
+#define ARM_SMMU_EVT_F_STE_FETCH	0x03
+#define ARM_SMMU_EVT_C_BAD_STE		0x04
+#define ARM_SMMU_EVT_F_BAD_ATS_TREQ	0x05
+#define ARM_SMMU_EVT_F_STREAM_DISABLED	0x06
+#define ARM_SMMU_EVT_F_TRANSL_FORBIDDEN	0x07
+#define ARM_SMMU_EVT_C_BAD_SUBSTREAMID	0x08
+#define ARM_SMMU_EVT_F_CD_FETCH		0x09
+#define ARM_SMMU_EVT_C_BAD_CD		0x0a
+#define ARM_SMMU_EVT_F_WALK_EABT	0x0b
+#define ARM_SMMU_EVT_F_TRANSLATION	0x10
+#define ARM_SMMU_EVT_F_ADDR_SIZE	0x11
+#define ARM_SMMU_EVT_F_ACCESS		0x12
+#define ARM_SMMU_EVT_F_PERMISSION	0x13
+#define ARM_SMMU_EVT_F_TLB_CONFLICT	0x20
+#define ARM_SMMU_EVT_F_CFG_CONFLICT	0x21
+#define ARM_SMMU_EVT_E_PAGE_REQUEST	0x24
+
 /* Common MSI config fields */
 #define MSI_CFG0_ADDR_MASK		GENMASK_ULL(51, 2)
 #define MSI_CFG2_SH			GENMASK(5, 4)
@@ -332,6 +352,15 @@
 #define EVTQ_MAX_SZ_SHIFT		7
 
 #define EVTQ_0_ID			GENMASK_ULL(7, 0)
+#define EVTQ_0_SSV			GENMASK_ULL(11, 11)
+#define EVTQ_0_SUBSTREAMID		GENMASK_ULL(31, 12)
+#define EVTQ_0_STREAMID			GENMASK_ULL(63, 32)
+#define EVTQ_1_PNU			GENMASK_ULL(33, 33)
+#define EVTQ_1_IND			GENMASK_ULL(34, 34)
+#define EVTQ_1_RNW			GENMASK_ULL(35, 35)
+#define EVTQ_1_S2			GENMASK_ULL(39, 39)
+#define EVTQ_1_CLASS			GENMASK_ULL(40, 41)
+#define EVTQ_3_FETCH_ADDR		GENMASK_ULL(51, 3)
 
 /* PRI queue */
 #define PRIQ_ENT_DWORDS			2
@@ -639,6 +668,64 @@ struct arm_smmu_domain {
 	spinlock_t			devices_lock;
 };
 
+/* fault propagation */
+
+#define IOMMU_FAULT_F_FIELDS	(IOMMU_FAULT_UNRECOV_PASID_VALID | \
+				 IOMMU_FAULT_UNRECOV_PERM_VALID | \
+				 IOMMU_FAULT_UNRECOV_ADDR_VALID)
+
+struct arm_smmu_fault_propagation_data {
+	enum iommu_fault_reason reason;
+	bool s1_check;
+	u32 fields; /* IOMMU_FAULT_UNRECOV_*_VALID bits */
+};
+
+/*
+ * Describes how SMMU faults translate into generic IOMMU faults
+ * and if they need to be reported externally
+ */
+static const struct arm_smmu_fault_propagation_data fault_propagation[] = {
+[ARM_SMMU_EVT_F_UUT]			= { },
+[ARM_SMMU_EVT_C_BAD_STREAMID]		= { },
+[ARM_SMMU_EVT_F_STE_FETCH]		= { },
+[ARM_SMMU_EVT_C_BAD_STE]		= { },
+[ARM_SMMU_EVT_F_BAD_ATS_TREQ]		= { },
+[ARM_SMMU_EVT_F_STREAM_DISABLED]	= { },
+[ARM_SMMU_EVT_F_TRANSL_FORBIDDEN]	= { },
+[ARM_SMMU_EVT_C_BAD_SUBSTREAMID]	= {IOMMU_FAULT_REASON_PASID_INVALID,
+					   false,
+					   IOMMU_FAULT_UNRECOV_PASID_VALID
+					  },
+[ARM_SMMU_EVT_F_CD_FETCH]		= {IOMMU_FAULT_REASON_PASID_FETCH,
+					   false,
+					   IOMMU_FAULT_UNRECOV_PASID_VALID |

>>> It doesn't make sense to presume validity here, or in any of the faults
>>> below...

+					   IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID
+					  },
+[ARM_SMMU_EVT_C_BAD_CD]			= {IOMMU_FAULT_REASON_BAD_PASID_ENTRY,
+					   false,
+					   IOMMU_FAULT_UNRECOV_PASID_VALID
+					  },
+[ARM_SMMU_EVT_F_WALK_EABT]		= {IOMMU_FAULT_REASON_WALK_EABT, true,
+					   IOMMU_FAULT_F_FIELDS |
+					   IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID
+					  },
+[ARM_SMMU_EVT_F_TRANSLATION]		= {IOMMU_FAULT_REASON_PTE_FETCH, true,
+					   IOMMU_FAULT_F_FIELDS
+					  },
+[ARM_SMMU_EVT_F_ADDR_SIZE]		= {IOMMU_FAULT_REASON_OOR_ADDRESS, true,
+					   IOMMU_FAULT_F_FIELDS
+					  },
+[ARM_SMMU_EVT_F_ACCESS]			= {IOMMU_FAULT_REASON_ACCESS, true,
+					   IOMMU_FAULT_F_FIELDS
+					  },
+[ARM_SMMU_EVT_F_PERMISSION]		= {IOMMU_FAULT_REASON_PERMISSION, true,
+					   IOMMU_FAULT_F_FIELDS
+					  },
+[ARM_SMMU_EVT_F_TLB
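The consumer of this table is cut off above; a simplified sketch of the kind of transcoding function it would feed (not the literal patch code):

static bool arm_smmu_transcode_fault(u64 *evt, u8 type,
				     struct iommu_fault_unrecoverable *flt)
{
	const struct arm_smmu_fault_propagation_data *data;

	if (type >= ARRAY_SIZE(fault_propagation))
		return false;

	data = &fault_propagation[type];
	if (!data->fields)
		return false;	/* empty entry: not reported externally */

	flt->reason = data->reason;
	flt->flags = data->fields;
	if (data->fields & IOMMU_FAULT_UNRECOV_PASID_VALID)
		flt->pasid = FIELD_GET(EVTQ_0_SUBSTREAMID, evt[0]);
	return true;
}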
Re: [git pull] IOMMU Updates for Linux v5.2
The pull request you sent on Mon, 13 May 2019 13:53:34 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git > tags/iommu-updates-v5.2 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/a13f0655503a4a89df67fdc7cac6a7810795d4b3 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [ARM SMMU] Dynamic StreamID allocation
On 13/05/2019 13:42, Jean-Philippe Brucker wrote:
> On 13/05/2019 08:09, Pankaj Bansal wrote:
>> Hi Jean,
>>
>>> -----Original Message-----
>>> From: Jean-Philippe Brucker
>>> Sent: Friday, 10 May, 2019 07:07 PM
>>> To: Pankaj Bansal ; Will Deacon ; Robin Murphy ; Joerg Roedel
>>> Cc: iommu@lists.linux-foundation.org; Varun Sethi ; linux-arm-ker...@lists.infradead.org; Nipun Gupta
>>> Subject: Re: [ARM SMMU] Dynamic StreamID allocation
>>>
>>> On 10/05/2019 13:33, Pankaj Bansal wrote:
>>>> Hi Will/Robin/Joerg,
>>>>
>>>> I am s/w engineer from NXP India Pvt. Ltd.
>>>> We are using SMMU-V3 in one of NXP SOC.
>>>> I have a question about the SMMU Stream ID allocation in linux.
>>>>
>>>> Right now the Stream IDs allocated to a device are mapped via device tree to the device.
>>>> https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt#L39
>>>>
>>>> As the device tree is passed from bootloader to linux, we detect all the stream IDs needed by a device in bootloader and add their IDs in respective device nodes.
>>>> For each PCIE Endpoint (a unique BDF (Bus Device Function)) on PCIE bus, we are assigning a unique Stream ID in bootloader.
>>>>
>>>> However, this poses an issue with PCIE hot plug.
>>>> If we plug in a pcie device while linux is running, a unique BDF is assigned to the device, for which there is no stream ID in device tree.
>>>>
>>>> How can this problem be solved in linux?
>>>
>>> Assuming the streamID associated to a BDF is predictable (streamID = BDF + constant), using the iommu-map property should just work:
>>>
>>> https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/pci/pci-iommu.txt
>>>
>>> It describes the streamIDs of all possible BDFs, including hotplugged functions.
>>
>> You mean that we should increase the "length" parameter (in the (rid-base,iommu,iommu-base,length) tuple)?
>> This would cater to any *new* Bus Device Function being detected on PCIE bus?
>> Is that right?
>
> No, the iommu-map solution only works when you can predict at boot time which streamID will correspond to any BDF, for example if the PCIe controller or firmware automatically assigns streamID = BDF.
>
>> Right now when we make iommu-map in bootloader, we are giving one RID per BDF:
>> https://elixir.bootlin.com/u-boot/latest/source/drivers/pci/pcie_layerscape_fixup.c#L168
>
> Right, that won't work with hotplug. You can't derive a unique streamID from any BDF at boot, if the streamID range is limited to 16 values on this hardware.

Furthermore, does U-Boot enumerate all possible SR-IOV capabilities, or is this already broken regardless of hotplug?

> Maybe you could move this allocator to the Linux Layerscape driver, and call iommu_fwspec_add_ids() from there, rather than using iommu-map? Not sure how feasible that is, but it might still be the simplest.

Assuming there's still the same silly 7-bit limitation as LS2085, you could in theory carve out that entire space of ICIDs to allow a static mapping like this:

iommu-map-mask = <0x0f07>;
iommu-map = <0x000 &smmu 0x00 0x8>,
            <0x100 &smmu 0x08 0x8>,
            <0x200 &smmu 0x10 0x8>,
            <0x300 &smmu 0x18 0x8>,
            <0x400 &smmu 0x20 0x8>,
            <0x500 &smmu 0x28 0x8>,
            <0x600 &smmu 0x30 0x8>,
            <0x700 &smmu 0x38 0x8>,
            <0x800 &smmu 0x40 0x8>,
            <0x900 &smmu 0x48 0x8>,
            <0xa00 &smmu 0x50 0x8>,
            <0xb00 &smmu 0x58 0x8>,
            <0xc00 &smmu 0x60 0x8>,
            <0xd00 &smmu 0x68 0x8>,
            <0xe00 &smmu 0x70 0x8>,
            <0xf00 &smmu 0x78 0x8>;

That gives you 16 buses before IDs start aliasing awkwardly (any devices sharing a bus will alias per-function, but that's probably acceptable since they'd likely get grouped together anyway).

Either way the IOMMU layer itself is actually relatively easy-going these days. Even with the existing approach, if you could just update Linux's internal idea of the firmware data before the bus_add_device() call happens for the hotplugged device, then things should just work. You're also going to face the exact same problem for ITS Device IDs, though, and things may be less forgiving there.

Robin.
RE: [ARM SMMU] Dynamic StreamID allocation
Hi Pankaj,

> -----Original Message-----
> From: linux-arm-kernel On Behalf Of Pankaj Bansal
> Sent: Friday, May 10, 2019 3:34 PM
>
> Hi Will/Robin/Joerg,
>
> I am s/w engineer from NXP India Pvt. Ltd.
> We are using SMMU-V3 in one of NXP SOC.
> I have a question about the SMMU Stream ID allocation in linux.
>
> Right now the Stream IDs allocated to a device are mapped via device tree
> to the device.

[snip]

> As the device tree is passed from bootloader to linux, we detect all the
> stream IDs needed by a device in bootloader and add their IDs in
> respective device nodes.
> For each PCIE Endpoint (a unique BDF (Bus Device Function)) on PCIE bus,
> we are assigning a unique Stream ID in bootloader.
>
> However, this poses an issue with PCIE hot plug.
> If we plug in a pcie device while linux is running, a unique BDF is
> assigned to the device, for which there is no stream ID in device tree.
>
> How can this problem be solved in linux?
>
> Is there a way to assign (and revoke) stream IDs at run time?

I think that our main problem is that we enumerate the PCI EPs in the bootloader (u-boot) and allocate StreamIDs just for them, completely disregarding hotplug scenarios. One simple fix would be to not do this and simply allocate a decently sized, fixed range of StreamIDs per PCI controller.

---
Best Regards, Laurentiu
Re: [ARM SMMU] Dynamic StreamID allocation
On 13/05/2019 08:09, Pankaj Bansal wrote:
> Hi Jean,
>
>> -----Original Message-----
>> From: Jean-Philippe Brucker
>> Sent: Friday, 10 May, 2019 07:07 PM
>> To: Pankaj Bansal ; Will Deacon ; Robin Murphy ; Joerg Roedel
>> Cc: iommu@lists.linux-foundation.org; Varun Sethi ; linux-arm-ker...@lists.infradead.org; Nipun Gupta
>> Subject: Re: [ARM SMMU] Dynamic StreamID allocation
>>
>> On 10/05/2019 13:33, Pankaj Bansal wrote:
>>> Hi Will/Robin/Joerg,
>>>
>>> I am s/w engineer from NXP India Pvt. Ltd.
>>> We are using SMMU-V3 in one of NXP SOC.
>>> I have a question about the SMMU Stream ID allocation in linux.
>>>
>>> Right now the Stream IDs allocated to a device are mapped via device tree to the device.
>>> https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt#L39
>>>
>>> As the device tree is passed from bootloader to linux, we detect all the stream IDs needed by a device in bootloader and add their IDs in respective device nodes.
>>> For each PCIE Endpoint (a unique BDF (Bus Device Function)) on PCIE bus, we are assigning a unique Stream ID in bootloader.
>>>
>>> However, this poses an issue with PCIE hot plug.
>>> If we plug in a pcie device while linux is running, a unique BDF is assigned to the device, for which there is no stream ID in device tree.
>>>
>>> How can this problem be solved in linux?
>>
>> Assuming the streamID associated to a BDF is predictable (streamID = BDF + constant), using the iommu-map property should just work:
>>
>> https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/pci/pci-iommu.txt
>>
>> It describes the streamIDs of all possible BDFs, including hotplugged functions.
>
> You mean that we should increase the "length" parameter (in the (rid-base,iommu,iommu-base,length) tuple)?
> This would cater to any *new* Bus Device Function being detected on PCIE bus?
> Is that right?

No, the iommu-map solution only works when you can predict at boot time which streamID will correspond to any BDF, for example if the PCIe controller or firmware automatically assigns streamID = BDF.

> Right now when we make iommu-map in bootloader, we are giving one RID per BDF:
> https://elixir.bootlin.com/u-boot/latest/source/drivers/pci/pcie_layerscape_fixup.c#L168

Right, that won't work with hotplug. You can't derive a unique streamID from any BDF at boot, if the streamID range is limited to 16 values on this hardware. Maybe you could move this allocator to the Linux Layerscape driver, and call iommu_fwspec_add_ids() from there, rather than using iommu-map? Not sure how feasible that is, but it might still be the simplest.

Thanks,
Jean

> But isn't the better approach to make it dynamic in linux?
> i.e. as soon as a new device is detected, a "requester id" is allocated to it from the available pool.
> When the device is removed, return the "requester id" to the pool.
> Is there any h/w limitation which prevents it?
>
> Regards,
> Pankaj Bansal
>
>> Thanks,
>> Jean
>>
>>> Is there a way to assign (and revoke) stream IDs at run time?
>>>
>>> Regards,
>>> Pankaj Bansal
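Jean's suggestion here -- moving the StreamID allocator out of U-Boot into the Linux Layerscape PCIe driver and registering hot-plugged IDs with iommu_fwspec_add_ids() -- could look roughly like the sketch below. Only iommu_fwspec_add_ids() is an existing kernel API (and it assumes the device's iommu fwspec has already been initialised); the 16-entry pool, the ls_* names and the omitted LUT programming are assumptions for illustration.

#include <linux/bitmap.h>
#include <linux/iommu.h>
#include <linux/pci.h>
#include <linux/spinlock.h>

#define LS_NUM_STREAMIDS	16

static DECLARE_BITMAP(ls_sid_pool, LS_NUM_STREAMIDS);
static DEFINE_SPINLOCK(ls_sid_lock);
static u32 ls_sid_base;	/* first hardware StreamID, SoC-specific */

static int ls_pcie_assign_streamid(struct pci_dev *pdev)
{
	unsigned long bit;
	u32 sid;

	spin_lock(&ls_sid_lock);
	bit = find_first_zero_bit(ls_sid_pool, LS_NUM_STREAMIDS);
	if (bit >= LS_NUM_STREAMIDS) {
		spin_unlock(&ls_sid_lock);
		return -ENOSPC;	/* only 16 StreamIDs on this hardware */
	}
	set_bit(bit, ls_sid_pool);
	spin_unlock(&ls_sid_lock);

	sid = ls_sid_base + bit;
	/* program the RID -> StreamID LUT here, as U-Boot does today */

	/* tell the SMMU layer about the StreamID of the new function */
	return iommu_fwspec_add_ids(&pdev->dev, &sid, 1);
}

The remove path would clear the bit again, which is exactly Pankaj's "return the requester id to the pool" point.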
Re: [PATCH v7 18/23] iommu/smmuv3: Report non recoverable faults
Hi Robin, On 5/13/19 1:54 PM, Robin Murphy wrote: > On 13/05/2019 08:46, Auger Eric wrote: >> Hi Robin, >> >> On 5/8/19 7:20 PM, Robin Murphy wrote: >>> On 08/04/2019 13:19, Eric Auger wrote: When a stage 1 related fault event is read from the event queue, let's propagate it to potential external fault listeners, ie. users who registered a fault handler. Signed-off-by: Eric Auger --- v4 -> v5: - s/IOMMU_FAULT_PERM_INST/IOMMU_FAULT_PERM_EXEC --- drivers/iommu/arm-smmu-v3.c | 169 +--- 1 file changed, 158 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 805bc32a..1fd320788dcb 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -167,6 +167,26 @@ #define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8 #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc +/* Events */ +#define ARM_SMMU_EVT_F_UUT 0x01 +#define ARM_SMMU_EVT_C_BAD_STREAMID 0x02 +#define ARM_SMMU_EVT_F_STE_FETCH 0x03 +#define ARM_SMMU_EVT_C_BAD_STE 0x04 +#define ARM_SMMU_EVT_F_BAD_ATS_TREQ 0x05 +#define ARM_SMMU_EVT_F_STREAM_DISABLED 0x06 +#define ARM_SMMU_EVT_F_TRANSL_FORBIDDEN 0x07 +#define ARM_SMMU_EVT_C_BAD_SUBSTREAMID 0x08 +#define ARM_SMMU_EVT_F_CD_FETCH 0x09 +#define ARM_SMMU_EVT_C_BAD_CD 0x0a +#define ARM_SMMU_EVT_F_WALK_EABT 0x0b +#define ARM_SMMU_EVT_F_TRANSLATION 0x10 +#define ARM_SMMU_EVT_F_ADDR_SIZE 0x11 +#define ARM_SMMU_EVT_F_ACCESS 0x12 +#define ARM_SMMU_EVT_F_PERMISSION 0x13 +#define ARM_SMMU_EVT_F_TLB_CONFLICT 0x20 +#define ARM_SMMU_EVT_F_CFG_CONFLICT 0x21 +#define ARM_SMMU_EVT_E_PAGE_REQUEST 0x24 + /* Common MSI config fields */ #define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2) #define MSI_CFG2_SH GENMASK(5, 4) @@ -332,6 +352,15 @@ #define EVTQ_MAX_SZ_SHIFT 7 #define EVTQ_0_ID GENMASK_ULL(7, 0) +#define EVTQ_0_SSV GENMASK_ULL(11, 11) +#define EVTQ_0_SUBSTREAMID GENMASK_ULL(31, 12) +#define EVTQ_0_STREAMID GENMASK_ULL(63, 32) +#define EVTQ_1_PNU GENMASK_ULL(33, 33) +#define EVTQ_1_IND GENMASK_ULL(34, 34) +#define EVTQ_1_RNW GENMASK_ULL(35, 35) +#define EVTQ_1_S2 GENMASK_ULL(39, 39) +#define EVTQ_1_CLASS GENMASK_ULL(40, 41) +#define EVTQ_3_FETCH_ADDR GENMASK_ULL(51, 3) /* PRI queue */ #define PRIQ_ENT_DWORDS 2 @@ -639,6 +668,64 @@ struct arm_smmu_domain { spinlock_t devices_lock; }; +/* fault propagation */ + +#define IOMMU_FAULT_F_FIELDS (IOMMU_FAULT_UNRECOV_PASID_VALID | \ + IOMMU_FAULT_UNRECOV_PERM_VALID | \ + IOMMU_FAULT_UNRECOV_ADDR_VALID) + +struct arm_smmu_fault_propagation_data { + enum iommu_fault_reason reason; + bool s1_check; + u32 fields; /* IOMMU_FAULT_UNRECOV_*_VALID bits */ +}; + +/* + * Describes how SMMU faults translate into generic IOMMU faults + * and if they need to be reported externally + */ +static const struct arm_smmu_fault_propagation_data fault_propagation[] = { +[ARM_SMMU_EVT_F_UUT] = { }, +[ARM_SMMU_EVT_C_BAD_STREAMID] = { }, +[ARM_SMMU_EVT_F_STE_FETCH] = { }, +[ARM_SMMU_EVT_C_BAD_STE] = { }, +[ARM_SMMU_EVT_F_BAD_ATS_TREQ] = { }, +[ARM_SMMU_EVT_F_STREAM_DISABLED] = { }, +[ARM_SMMU_EVT_F_TRANSL_FORBIDDEN] = { }, +[ARM_SMMU_EVT_C_BAD_SUBSTREAMID] = {IOMMU_FAULT_REASON_PASID_INVALID, + false, + IOMMU_FAULT_UNRECOV_PASID_VALID + }, +[ARM_SMMU_EVT_F_CD_FETCH] = {IOMMU_FAULT_REASON_PASID_FETCH, + false, + IOMMU_FAULT_UNRECOV_PASID_VALID | >>> >>> It doesn't make sense to presume validity here, or in any of the faults >>> below... 
>> >> >>> + IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID + }, +[ARM_SMMU_EVT_C_BAD_CD] = {IOMMU_FAULT_REASON_BAD_PASID_ENTRY, + false, + IOMMU_FAULT_UNRECOV_PASID_VALID + }, +[ARM_SMMU_EVT_F_WALK_EABT] = {IOMMU_FAULT_REASON_WALK_EABT, true, + IOMMU_FAULT_F_FIELDS | + IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID + }, +[ARM_SMMU_EVT_F_TRANSLATION] = {IOMMU_FAULT_REASON_PTE_FETCH, true, +
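One way to address Robin's objection is to keep only the reason and the s1_check/fields mask in the static table, and derive the validity bits from the event record itself. A hedged sketch, using only the EVTQ_* fields defined in the quoted hunk and assuming the iommu_fault_unrecoverable layout from the fault API this series builds on (the helper name is illustrative):

#include <linux/bitfield.h>

static void arm_smmu_fill_unrecov_fault(u64 *evt, u32 table_fields,
					struct iommu_fault_unrecoverable *flt)
{
	u32 fields = table_fields;

	/* don't presume PASID validity: honour the SSV bit in the record */
	if (!FIELD_GET(EVTQ_0_SSV, evt[0]))
		fields &= ~IOMMU_FAULT_UNRECOV_PASID_VALID;

	flt->flags = fields;
	if (fields & IOMMU_FAULT_UNRECOV_PASID_VALID)
		flt->pasid = FIELD_GET(EVTQ_0_SUBSTREAMID, evt[0]);
	if (fields & IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID)
		/* FetchAddr holds bits [51:3] of the fetched address */
		flt->fetch_addr = FIELD_GET(EVTQ_3_FETCH_ADDR, evt[3]) << 3;
}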
Re: [PATCH v7 14/23] iommu/smmuv3: Implement cache_invalidate
Hi Robin,

On 5/8/19 5:01 PM, Robin Murphy wrote:
> On 08/04/2019 13:19, Eric Auger wrote:
>> Implement domain-selective and page-selective IOTLB invalidations. >> >> Signed-off-by: Eric Auger >> >> --- >> v6 -> v7 >> - check the uapi version >> >> v3 -> v4: >> - adapt to changes in the uapi >> - add support for leaf parameter >> - do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context anymore >> >> v2 -> v3: >> - replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync >> >> v1 -> v2: >> - properly pass the asid >> --- >> drivers/iommu/arm-smmu-v3.c | 60 + >> 1 file changed, 60 insertions(+) >> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c >> index 1486baf53425..4366921d8318 100644 >> --- a/drivers/iommu/arm-smmu-v3.c >> +++ b/drivers/iommu/arm-smmu-v3.c >> @@ -2326,6 +2326,65 @@ static void arm_smmu_detach_pasid_table(struct iommu_domain *domain) >> mutex_unlock(&smmu_domain->init_mutex); >> } >> +static int >> +arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev, >> + struct iommu_cache_invalidate_info *inv_info) >> +{ >> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >> + struct arm_smmu_device *smmu = smmu_domain->smmu; >> + >> + if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED) >> + return -EINVAL; >> + >> + if (!smmu) >> + return -EINVAL; >> + >> + if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) >> + return -EINVAL; >> + >> + if (inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB) { >> + if (inv_info->granularity == IOMMU_INV_GRANU_PASID) { >> + struct arm_smmu_cmdq_ent cmd = { >> + .opcode = CMDQ_OP_TLBI_NH_ASID, >> + .tlbi = { >> + .vmid = smmu_domain->s2_cfg.vmid, >> + .asid = inv_info->pasid, >> + }, >> + }; >> + >> + arm_smmu_cmdq_issue_cmd(smmu, &cmd); >> + arm_smmu_cmdq_issue_sync(smmu);

> I'd much rather make arm_smmu_tlb_inv_context() understand nested domains than open-code commands all over the place.

>> + >> + } else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR) { >> + struct iommu_inv_addr_info *info = &inv_info->addr_info; >> + size_t size = info->nb_granules * info->granule_size; >> + bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF; >> + struct arm_smmu_cmdq_ent cmd = { >> + .opcode = CMDQ_OP_TLBI_NH_VA, >> + .tlbi = { >> + .addr = info->addr, >> + .vmid = smmu_domain->s2_cfg.vmid, >> + .asid = info->pasid, >> + .leaf = leaf, >> + }, >> + }; >> + >> + do { >> + arm_smmu_cmdq_issue_cmd(smmu, &cmd); >> + cmd.tlbi.addr += info->granule_size; >> + } while (size -= info->granule_size); >> + arm_smmu_cmdq_issue_sync(smmu);

> And this in particular I would really like to go all the way through io_pgtable_tlb_add_flush()/io_pgtable_sync() if at all possible. Hooking up range-based invalidations is going to be a massive headache if the abstraction isn't solid.

The concern is that the host does not "own" the s1 config asid (smmu_domain->s1_cfg.cd.asid is not set, practically). In our case the asid is only passed by the userspace on the CACHE_INVALIDATE ioctl call. arm_smmu_tlb_inv_context and arm_smmu_tlb_inv_range_nosync use this field. Jean-Philippe highlighted that fact in https://lkml.org/lkml/2019/1/11/1062 and suggested to open-code commands instead, hence the current form.

Thanks

Eric

> Robin.
> >> + } else { >> + return -EINVAL; >> + } >> + } >> + if (inv_info->cache & IOMMU_CACHE_INV_TYPE_PASID || >> + inv_info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) { >> + return -ENOENT; >> + } >> + return 0; >> +} >> + >> static struct iommu_ops arm_smmu_ops = { >> .capable = arm_smmu_capable, >> .domain_alloc = arm_smmu_domain_alloc, >> @@ -2346,6 +2405,7 @@ static struct iommu_ops arm_smmu_ops = { >> .put_resv_regions = arm_smmu_put_resv_regions, >> .attach_pasid_table = arm_smmu_attach_pasid_table, >> .detach_pasid_table = arm_smmu_detach_pasid_table, >> + .cache_invalidate = arm_smmu_cache_invalidate, >> .pgsize_bitmap = -1UL, /* Restricted during device attach */ >> }; >>
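For reference, here is a hedged sketch of the caller side of this interface -- e.g. the VFIO path relaying a guest CMD_TLBI_NH_VA -- using only the uapi fields quoted in this thread; the function name and its arguments are illustrative. Note how, in this v7, the guest ASID travels in the pasid fields, which is exactly what the ARCHID remark later in this digest proposes to change:

static int invalidate_guest_tlb_range(struct iommu_domain *domain,
				      struct device *dev, u64 asid,
				      u64 iova, u64 granule, u64 nr_granules)
{
	struct iommu_cache_invalidate_info inv_info = {
		.version     = IOMMU_CACHE_INVALIDATE_INFO_VERSION_1,
		.cache       = IOMMU_CACHE_INV_TYPE_IOTLB,
		.granularity = IOMMU_INV_GRANU_ADDR,
		.addr_info   = {
			/* in v7 the guest ASID travels in the pasid field */
			.flags        = IOMMU_INV_ADDR_FLAGS_PASID,
			.pasid        = asid,
			.addr         = iova,
			.granule_size = granule,
			.nb_granules  = nr_granules,
		},
	};

	return iommu_cache_invalidate(domain, dev, &inv_info);
}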
Re: [PATCH v7 13/23] iommu/smmuv3: Implement attach/detach_pasid_table
On 10/05/2019 15:35, Auger Eric wrote: Hi Robin, On 5/8/19 4:38 PM, Robin Murphy wrote: On 08/04/2019 13:19, Eric Auger wrote: On attach_pasid_table() we program STE S1 related info set by the guest into the actual physical STEs. At minimum we need to program the context descriptor GPA and compute whether the stage1 is translated/bypassed or aborted. Signed-off-by: Eric Auger --- v6 -> v7: - check versions and comment the fact we don't need to take into account s1dss and s1fmt v3 -> v4: - adapt to changes in iommu_pasid_table_config - different programming convention at s1_cfg/s2_cfg/ste.abort v2 -> v3: - callback now is named set_pasid_table and struct fields are laid out differently. v1 -> v2: - invalidate the STE before changing them - hold init_mutex - handle new fields --- drivers/iommu/arm-smmu-v3.c | 121 1 file changed, 121 insertions(+) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index e22e944ffc05..1486baf53425 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -2207,6 +2207,125 @@ static void arm_smmu_put_resv_regions(struct device *dev, kfree(entry); } +static int arm_smmu_attach_pasid_table(struct iommu_domain *domain, + struct iommu_pasid_table_config *cfg) +{ + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + struct arm_smmu_master_data *entry; + struct arm_smmu_s1_cfg *s1_cfg; + struct arm_smmu_device *smmu; + unsigned long flags; + int ret = -EINVAL; + + if (cfg->format != IOMMU_PASID_FORMAT_SMMUV3) + return -EINVAL; + + if (cfg->version != PASID_TABLE_CFG_VERSION_1 || + cfg->smmuv3.version != PASID_TABLE_SMMUV3_CFG_VERSION_1) + return -EINVAL; + + mutex_lock(&smmu_domain->init_mutex); + + smmu = smmu_domain->smmu; + + if (!smmu) + goto out; + + if (!((smmu->features & ARM_SMMU_FEAT_TRANS_S1) && + (smmu->features & ARM_SMMU_FEAT_TRANS_S2))) { + dev_info(smmu_domain->smmu->dev, + "does not implement two stages\n"); + goto out; + } That check is redundant (and frankly looks a little bit spammy). If the one below is not enough, there is a problem elsewhere - if it's possible for smmu_domain->stage to ever get set to ARM_SMMU_DOMAIN_NESTED without both stages of translation present, we've already gone fundamentally wrong. Makes sense. Moved that check to arm_smmu_domain_finalise() instead and remove redundant ones. Urgh, I forgot exactly how the crazy domain-allocation dance worked, such that we're not in a position to refuse the domain_set_attr() call itself, but that does sound like the best compromise for now. Thanks, Robin.
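For completeness, the relocated check Eric describes would look something like this condensed sketch (only the quoted feature bits and domain stage are assumed; the rest of arm_smmu_domain_finalise() is elided):

static int arm_smmu_domain_finalise(struct iommu_domain *domain)
{
	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
	struct arm_smmu_device *smmu = smmu_domain->smmu;

	if (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED &&
	    (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1) ||
	     !(smmu->features & ARM_SMMU_FEAT_TRANS_S2))) {
		dev_info(smmu->dev, "does not implement two stages\n");
		return -EINVAL;
	}

	/* ... existing stage selection and page-table allocation ... */
	return 0;
}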
Re: [PATCH v7 18/23] iommu/smmuv3: Report non recoverable faults
On 13/05/2019 08:46, Auger Eric wrote: Hi Robin, On 5/8/19 7:20 PM, Robin Murphy wrote: On 08/04/2019 13:19, Eric Auger wrote: When a stage 1 related fault event is read from the event queue, let's propagate it to potential external fault listeners, ie. users who registered a fault handler. Signed-off-by: Eric Auger --- v4 -> v5: - s/IOMMU_FAULT_PERM_INST/IOMMU_FAULT_PERM_EXEC --- drivers/iommu/arm-smmu-v3.c | 169 +--- 1 file changed, 158 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 805bc32a..1fd320788dcb 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -167,6 +167,26 @@ #define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8 #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc +/* Events */ +#define ARM_SMMU_EVT_F_UUT 0x01 +#define ARM_SMMU_EVT_C_BAD_STREAMID 0x02 +#define ARM_SMMU_EVT_F_STE_FETCH 0x03 +#define ARM_SMMU_EVT_C_BAD_STE 0x04 +#define ARM_SMMU_EVT_F_BAD_ATS_TREQ 0x05 +#define ARM_SMMU_EVT_F_STREAM_DISABLED 0x06 +#define ARM_SMMU_EVT_F_TRANSL_FORBIDDEN 0x07 +#define ARM_SMMU_EVT_C_BAD_SUBSTREAMID 0x08 +#define ARM_SMMU_EVT_F_CD_FETCH 0x09 +#define ARM_SMMU_EVT_C_BAD_CD 0x0a +#define ARM_SMMU_EVT_F_WALK_EABT 0x0b +#define ARM_SMMU_EVT_F_TRANSLATION 0x10 +#define ARM_SMMU_EVT_F_ADDR_SIZE 0x11 +#define ARM_SMMU_EVT_F_ACCESS 0x12 +#define ARM_SMMU_EVT_F_PERMISSION 0x13 +#define ARM_SMMU_EVT_F_TLB_CONFLICT 0x20 +#define ARM_SMMU_EVT_F_CFG_CONFLICT 0x21 +#define ARM_SMMU_EVT_E_PAGE_REQUEST 0x24 + /* Common MSI config fields */ #define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2) #define MSI_CFG2_SH GENMASK(5, 4) @@ -332,6 +352,15 @@ #define EVTQ_MAX_SZ_SHIFT 7 #define EVTQ_0_ID GENMASK_ULL(7, 0) +#define EVTQ_0_SSV GENMASK_ULL(11, 11) +#define EVTQ_0_SUBSTREAMID GENMASK_ULL(31, 12) +#define EVTQ_0_STREAMID GENMASK_ULL(63, 32) +#define EVTQ_1_PNU GENMASK_ULL(33, 33) +#define EVTQ_1_IND GENMASK_ULL(34, 34) +#define EVTQ_1_RNW GENMASK_ULL(35, 35) +#define EVTQ_1_S2 GENMASK_ULL(39, 39) +#define EVTQ_1_CLASS GENMASK_ULL(40, 41) +#define EVTQ_3_FETCH_ADDR GENMASK_ULL(51, 3) /* PRI queue */ #define PRIQ_ENT_DWORDS 2 @@ -639,6 +668,64 @@ struct arm_smmu_domain { spinlock_t devices_lock; }; +/* fault propagation */ + +#define IOMMU_FAULT_F_FIELDS (IOMMU_FAULT_UNRECOV_PASID_VALID | \ + IOMMU_FAULT_UNRECOV_PERM_VALID | \ + IOMMU_FAULT_UNRECOV_ADDR_VALID) + +struct arm_smmu_fault_propagation_data { + enum iommu_fault_reason reason; + bool s1_check; + u32 fields; /* IOMMU_FAULT_UNRECOV_*_VALID bits */ +}; + +/* + * Describes how SMMU faults translate into generic IOMMU faults + * and if they need to be reported externally + */ +static const struct arm_smmu_fault_propagation_data fault_propagation[] = { +[ARM_SMMU_EVT_F_UUT] = { }, +[ARM_SMMU_EVT_C_BAD_STREAMID] = { }, +[ARM_SMMU_EVT_F_STE_FETCH] = { }, +[ARM_SMMU_EVT_C_BAD_STE] = { }, +[ARM_SMMU_EVT_F_BAD_ATS_TREQ] = { }, +[ARM_SMMU_EVT_F_STREAM_DISABLED] = { }, +[ARM_SMMU_EVT_F_TRANSL_FORBIDDEN] = { }, +[ARM_SMMU_EVT_C_BAD_SUBSTREAMID] = {IOMMU_FAULT_REASON_PASID_INVALID, + false, + IOMMU_FAULT_UNRECOV_PASID_VALID + }, +[ARM_SMMU_EVT_F_CD_FETCH] = {IOMMU_FAULT_REASON_PASID_FETCH, + false, + IOMMU_FAULT_UNRECOV_PASID_VALID | It doesn't make sense to presume validity here, or in any of the faults below... 
+ IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID + }, +[ARM_SMMU_EVT_C_BAD_CD] = {IOMMU_FAULT_REASON_BAD_PASID_ENTRY, + false, + IOMMU_FAULT_UNRECOV_PASID_VALID + }, +[ARM_SMMU_EVT_F_WALK_EABT] = {IOMMU_FAULT_REASON_WALK_EABT, true, + IOMMU_FAULT_F_FIELDS | + IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID + }, +[ARM_SMMU_EVT_F_TRANSLATION] = {IOMMU_FAULT_REASON_PTE_FETCH, true, + IOMMU_FAULT_F_FIELDS + }, +[ARM_SMMU_EVT_F_ADDR_SIZE] = {IOMMU_FAULT_REASON_OOR_ADDRESS, true, + IOMMU_FAULT_F_FIELDS + }, +[ARM_SMMU_EVT_F_ACCESS] = {IOMMU_FAULT_REASON_ACCESS, true, + IOMMU_FAULT_F_FIELDS + }, +[ARM_SMMU_EVT_F_PERMISSION] = {IOMMU_FAULT_REASON_PERMISSION, true, + IOMMU_FAULT_F_FIELDS + }, +[ARM_SMMU_EVT_F_TLB_CONFLICT] = { }, +[ARM_SMMU_EVT_F_CFG_CONFLICT] = { }, +[ARM_SMMU_EVT_E_PAGE_REQUEST]
[git pull] IOMMU Updates for Linux v5.2
Hi Linus, this pull-request includes two reverts which I had to do after the merge window started, because the reverted patches caused issues in linux-next. But the rest of this was ready before the merge window. With this in mind: The following changes since commit 37624b58542fb9f2d9a70e6ea006ef8a5f66c30b: Linux 5.1-rc7 (2019-04-28 17:04:13 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git tags/iommu-updates-v5.2 for you to fetch changes up to b5531563e8a0b8fcc5344a38d1fad9217e08e09b: Merge branches 'arm/tegra', 'arm/mediatek', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next (2019-05-07 09:40:12 +0200) IOMMU Updates for Linux v5.2 Including: - ATS support for ARM-SMMU-v3. - AUX domain support in the IOMMU-API and the Intel VT-d driver. This adds support for multiple DMA address spaces per (PCI-)device. The use-case is to multiplex devices between host and KVM guests in a more flexible way than supported by SR-IOV. - The Rest are smaller cleanups and fixes, two of which needed to be reverted after testing in linux-next. Andy Shevchenko (1): iommu/vt-d: Switch to bitmap_zalloc() Christoph Hellwig (4): iommu/amd: Remove the leftover of bypass support iommu/vt-d: Clean up iommu_no_mapping iommu/vt-d: Use dma_direct for bypass devices iommu/vt-d: Don't clear GFP_DMA and GFP_DMA32 flags Dmitry Osipenko (3): iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114 iommu/tegra-smmu: Properly release domain resources iommu/tegra-smmu: Respect IOMMU API read-write protections Douglas Anderson (1): iommu/arm-smmu: Break insecure users by disabling bypass by default Eric Auger (1): iommu/vt-d: Fix leak in intel_pasid_alloc_table on error path Gustavo A. R. Silva (1): iommu/vt-d: Use struct_size() helper Jean-Philippe Brucker (11): iommu: Bind process address spaces to devices iommu/amd: Use pci_prg_resp_pasid_required() PCI: Move ATS declarations outside of CONFIG_PCI PCI: Add a stub for pci_ats_disabled() ACPI/IORT: Check ATS capability in root complex nodes iommu/arm-smmu-v3: Rename arm_smmu_master_data to arm_smmu_master iommu/arm-smmu-v3: Store SteamIDs in master iommu/arm-smmu-v3: Add a master->domain pointer iommu/arm-smmu-v3: Link domains and devices iommu/arm-smmu-v3: Add support for PCI ATS iommu/arm-smmu-v3: Disable tagged pointers Jinyu Qi (1): iommu/iova: Separate atomic variables to improve performance Joerg Roedel (7): Merge branch 'api-features' into x86/vt-d iommu/amd: Remove amd_iommu_pd_list Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/.../will/linux into arm/smmu Merge branch 'api-features' into arm/smmu Revert "iommu/amd: Remove the leftover of bypass support" Revert "iommu/amd: Flush not present cache in iommu_map_page" Merge branches 'arm/tegra', 'arm/mediatek', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next Lu Baolu (15): iommu: Remove iommu_callback_data iommu: Add APIs for multiple domains per device iommu/vt-d: Make intel_iommu_enable_pasid() more generic iommu/vt-d: Add per-device IOMMU feature ops entries iommu/vt-d: Move common code out of iommu_attch_device() iommu/vt-d: Aux-domain specific domain attach/detach iommu/vt-d: Return ID associated with an auxiliary domain vfio/mdev: Add iommu related member in mdev_device vfio/type1: Add domain at(de)taching group helpers vfio/type1: Handle different mdev isolation type iommu/vt-d: Flush IOTLB for untrusted device in time iommu/vt-d: Don't request page request irq under dmar_global_lock iommu/vt-d: Cleanup: no spaces at the start of a 
line iommu/vt-d: Set intel_iommu_gfx_mapped correctly iommu/vt-d: Make kernel parameter igfx_off work with vIOMMU Tom Murphy (1): iommu/amd: Flush not present cache in iommu_map_page Vivek Gautam (1): iommu/arm-smmu: Log CBFRSYNRA register on context fault Wen Yang (1): iommu/mediatek: Fix leaked of_node references Will Deacon (1): iommu/arm-smmu-v3: Don't disable SMMU in kdump kernel drivers/acpi/arm64/iort.c | 11 + drivers/iommu/Kconfig | 25 ++ drivers/iommu/amd_iommu.c | 52 +--- drivers/iommu/amd_iommu_init.c | 8 - drivers/iommu/amd_iommu_types.h | 6 - drivers/iommu/arm-smmu-regs.h | 2 + drivers/iommu/arm-smmu-v3.c | 355 +- drivers/iommu/arm-smmu.c| 11 +- drivers/iommu/dmar.c| 2 +- drivers/iommu/intel-iommu.c | 584 drivers/
Re: [PATCH v7 12/23] iommu/smmuv3: Get prepared for nested stage support
On 10/05/2019 15:34, Auger Eric wrote: Hi Robin, On 5/8/19 4:24 PM, Robin Murphy wrote: On 08/04/2019 13:19, Eric Auger wrote: To allow nested stage support, we need to store both stage 1 and stage 2 configurations (and remove the former union). A nested setup is characterized by both s1_cfg and s2_cfg set. We introduce a new ste.abort field that will be set upon guest stage1 configuration passing. If s1_cfg is NULL and ste.abort is set, traffic can't pass. If ste.abort is not set, S1 is bypassed. arm_smmu_write_strtab_ent() is modified to write both stage fields in the STE and deal with the abort field. In nested mode, only stage 2 is "finalized" as the host does not own/configure the stage 1 context descriptor, guest does. Signed-off-by: Eric Auger --- v4 -> v5: - reset ste.abort on detach v3 -> v4: - s1_cfg.nested_abort and nested_bypass removed. - s/ste.nested/ste.abort - arm_smmu_write_strtab_ent modifications with introduction of local abort, bypass and translate local variables - comment updated v1 -> v2: - invalidate the STE before moving from a live STE config to another - add the nested_abort and nested_bypass fields --- drivers/iommu/arm-smmu-v3.c | 35 --- 1 file changed, 20 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 21d027695181..e22e944ffc05 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -211,6 +211,7 @@ #define STRTAB_STE_0_CFG_BYPASS 4 #define STRTAB_STE_0_CFG_S1_TRANS 5 #define STRTAB_STE_0_CFG_S2_TRANS 6 +#define STRTAB_STE_0_CFG_NESTED 7 #define STRTAB_STE_0_S1FMT GENMASK_ULL(5, 4) #define STRTAB_STE_0_S1FMT_LINEAR 0 @@ -514,6 +515,7 @@ struct arm_smmu_strtab_ent { * configured according to the domain type. */ bool assigned; + bool abort; struct arm_smmu_s1_cfg *s1_cfg; struct arm_smmu_s2_cfg *s2_cfg; }; @@ -628,10 +630,8 @@ struct arm_smmu_domain { bool non_strict; enum arm_smmu_domain_stage stage; - union { - struct arm_smmu_s1_cfg s1_cfg; - struct arm_smmu_s2_cfg s2_cfg; - }; + struct arm_smmu_s1_cfg s1_cfg; + struct arm_smmu_s2_cfg s2_cfg; struct iommu_domain domain; @@ -1108,12 +1108,13 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, __le64 *dst, struct arm_smmu_strtab_ent *ste) { /* - * This is hideously complicated, but we only really care about - * three cases at the moment: + * We care about the following transitions: * * 1. Invalid (all zero) -> bypass/fault (init) - * 2. Bypass/fault -> translation/bypass (attach) - * 3. Translation/bypass -> bypass/fault (detach) + * 2. Bypass/fault -> single stage translation/bypass (attach) + * 3. single stage Translation/bypass -> bypass/fault (detach) + * 4. S2 -> S1 + S2 (attach_pasid_table) + * 5. S1 + S2 -> S2 (detach_pasid_table) * * Given that we can't update the STE atomically and the SMMU * doesn't read the thing in a defined order, that leaves us @@ -1124,7 +1125,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, * 3. 
Update Config, sync */ u64 val = le64_to_cpu(dst[0]); - bool ste_live = false; + bool abort, bypass, translate, ste_live = false; struct arm_smmu_cmdq_ent prefetch_cmd = { .opcode = CMDQ_OP_PREFETCH_CFG, .prefetch = { @@ -1138,11 +1139,11 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, break; case STRTAB_STE_0_CFG_S1_TRANS: case STRTAB_STE_0_CFG_S2_TRANS: + case STRTAB_STE_0_CFG_NESTED: ste_live = true; break; case STRTAB_STE_0_CFG_ABORT: - if (disable_bypass) - break; + break; default: BUG(); /* STE corruption */ } @@ -1152,8 +1153,13 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, val = STRTAB_STE_0_V; /* Bypass/fault */ - if (!ste->assigned || !(ste->s1_cfg || ste->s2_cfg)) { - if (!ste->assigned && disable_bypass) + + abort = (!ste->assigned && disable_bypass) || ste->abort; + translate = ste->s1_cfg || ste->s2_cfg; + bypass = !abort && !translate; + + if (abort || bypass) { + if (abort) val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT); else val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_BYPASS); @@ -1172,7 +1178,6 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid, } if (ste->s1_cfg) { - BUG_ON(ste_live); Hmm, I'm a little uneasy ab
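For readers following the new logic, the abort/bypass/translate derivation in the hunk above resolves to the following STE configurations (read directly from the quoted code; note that abort takes precedence over any attached config):

ste->assigned | ste->abort | s1_cfg / s2_cfg | resulting STE
--------------+------------+-----------------+--------------------------------------
no            | -          | -               | ABORT if disable_bypass, else BYPASS
yes           | yes        | any             | ABORT (guest requested S1 abort)
yes           | no         | none            | BYPASS
yes           | no         | s1 only         | stage 1 translation
yes           | no         | s2 only         | stage 2 translation
yes           | no         | s1 + s2         | nested (STRTAB_STE_0_CFG_NESTED)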
Re: [PATCH v4 1/1] iommu/io-pgtable-arm: Add support to use system cache
On 13/05/2019 11:04, Vivek Gautam wrote: Few Qualcomm platforms such as, sdm845 have an additional outer cache called as System cache, aka. Last level cache (LLC) that allows non-coherent devices to upgrade to using caching. This cache sits right before the DDR, and is tightly coupled with the memory controller. The clients using this cache request their slices from this system cache, make it active, and can then start using it. There is a fundamental assumption that non-coherent devices can't access caches. This change adds an exception where they *can* use some level of cache despite still being non-coherent overall. The coherent devices that use cacheable memory, and CPU make use of this system cache by default. Looking at memory types, we have following - a) Normal uncached :- MAIR 0x44, inner non-cacheable, outer non-cacheable; b) Normal cached :- MAIR 0xff, inner read write-back non-transient, outer read write-back non-transient; attribute setting for coherenet I/O devices. and, for non-coherent i/o devices that can allocate in system cache another type gets added - c) Normal sys-cached :- MAIR 0xf4, inner non-cacheable, outer read write-back non-transient Coherent I/O devices use system cache by marking the memory as normal cached. Non-coherent I/O devices should mark the memory as normal sys-cached in page tables to use system cache. Signed-off-by: Vivek Gautam --- V3 version of this patch and related series can be found at [1]. This change is a realisation of following changes from downstream msm-4.9: iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[2] Changes since v3: - Dropping support to cache i/o page tables to system cache. Getting support for data buffers is the first step. Removed io-pgtable quirk and related change to add domain attribute. Glmark2 numbers on SDM845 based cheza board: S.No.| with LLC support |without LLC support | for data buffers | --- 1| 4480; 72.3fps |4042; 65.2fps 2| 4500; 72.6fps |4039; 65.1fps 3| 4523; 72.9fps |4106; 66.2fps 4| 4489; 72.4fps |4104; 66.2fps 5| 4518; 72.9fps |4072; 65.7fps [1] https://patchwork.kernel.org/cover/10772629/ [2] https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602 drivers/iommu/io-pgtable-arm.c | 9 - include/linux/iommu.h | 1 + 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index d3700ec15cbd..2dbafe697531 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -167,10 +167,12 @@ #define ARM_LPAE_MAIR_ATTR_MASK 0xff #define ARM_LPAE_MAIR_ATTR_DEVICE 0x04 #define ARM_LPAE_MAIR_ATTR_NC 0x44 +#define ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE 0xf4 #define ARM_LPAE_MAIR_ATTR_WBRWA 0xff #define ARM_LPAE_MAIR_ATTR_IDX_NC 0 #define ARM_LPAE_MAIR_ATTR_IDX_CACHE 1 #define ARM_LPAE_MAIR_ATTR_IDX_DEV2 +#define ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE 3 Here at the implementation level, I'd rather just call these what they are, i.e. s/QCOM_SYS_CACHE/INC_OWBRWA/. /* IOPTE accessors */ #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d)) @@ -442,6 +444,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, else if (prot & IOMMU_CACHE) pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE << ARM_LPAE_PTE_ATTRINDX_SHIFT); + else if (prot & IOMMU_QCOM_SYS_CACHE) Where in the call stack is this going to be decided? (I don't recall the previous discussions ever really reaching a solid conclusion on how to separate responsibilities). 
+ pte |= (ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE + << ARM_LPAE_PTE_ATTRINDX_SHIFT); } else { pte = ARM_LPAE_PTE_HAP_FAULT; if (prot & IOMMU_READ) @@ -841,7 +846,9 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) (ARM_LPAE_MAIR_ATTR_WBRWA << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_CACHE)) | (ARM_LPAE_MAIR_ATTR_DEVICE - << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV)); + << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV)) | + (ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE + << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE)); cfg->arm_lpae_s1_cfg.mair[0] = reg; cfg->arm_lpae_s1_cfg.mair[1] = 0; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a815cf6f6f47..29dd2c624348 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -31,6 +31,7
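Robin's suggested INC_OWBRWA name is easier to parse with the Armv8 MAIR encoding in mind: each attribute byte splits into Attr[7:4] for the outer and Attr[3:0] for the inner cacheability, so the three Normal-memory values in this patch decode as:

0x44 = outer 0x4 / inner 0x4 : Normal Non-cacheable at both levels (ARM_LPAE_MAIR_ATTR_NC)
0xff = outer 0xf / inner 0xf : Write-Back non-transient, Read/Write-Allocate at both levels (ARM_LPAE_MAIR_ATTR_WBRWA)
0xf4 = outer 0xf / inner 0x4 : Inner Non-Cacheable, Outer Write-Back Read/Write-Allocate -- hence "INC_OWBRWA"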
Re: [PATCH v3 02/16] iommu: Introduce cache_invalidate API
Hi Eric, On 13/05/2019 10:14, Auger Eric wrote: > I noticed my qemu integration was currently incorrectly using PASID > invalidation for ASID based invalidation (SMMUV3 Stage1 CMD_TLBI_NH_ASID > invalidation command). So I think we also need ARCHID invalidation. > Sorry for the late notice. >> >> +/* defines the granularity of the invalidation */ >> +enum iommu_inv_granularity { >> +IOMMU_INV_GRANU_DOMAIN, /* domain-selective invalidation */ > IOMMU_INV_GRANU_ARCHID, /* archid-selective invalidation */ >> +IOMMU_INV_GRANU_PASID, /* pasid-selective invalidation */ In terms of granularity, these values have the same meaning: invalidate the whole address space of a context. Then you can communicate two things using the same struct: * If ATS is enabled, an Arm host needs to invalidate all ATC entries using PASID. * If BTM isn't used by the guest, the host needs to invalidate all TLB entries using ARCHID. Rather than introducing a new granule here, could we just add an archid field to the struct associated with IOMMU_INV_GRANU_PASID? Something like... >> +IOMMU_INV_GRANU_ADDR, /* page-selective invalidation */ >> +IOMMU_INVAL_GRANU_NR, /* number of invalidation granularities */ >> +}; >> + >> +/** >> + * Address Selective Invalidation Structure >> + * >> + * @flags indicates the granularity of the address-selective invalidation >> + * - if PASID bit is set, @pasid field is populated and the invalidation >> + * relates to cache entries tagged with this PASID and matching the >> + * address range. >> + * - if ARCHID bit is set, @archid is populated and the invalidation relates >> + * to cache entries tagged with this architecture specific id and matching >> + * the address range. >> + * - Both PASID and ARCHID can be set as they may tag different caches. >> + * - if neither PASID or ARCHID is set, global addr invalidation applies >> + * - LEAF flag indicates whether only the leaf PTE caching needs to be >> + * invalidated and other paging structure caches can be preserved. >> + * @pasid: process address space id >> + * @archid: architecture-specific id >> + * @addr: first stage/level input address >> + * @granule_size: page/block size of the mapping in bytes >> + * @nb_granules: number of contiguous granules to be invalidated >> + */ >> +struct iommu_inv_addr_info { >> +#define IOMMU_INV_ADDR_FLAGS_PASID (1 << 0) >> +#define IOMMU_INV_ADDR_FLAGS_ARCHID (1 << 1) >> +#define IOMMU_INV_ADDR_FLAGS_LEAF (1 << 2) >> +__u32 flags; >> +__u32 archid; >> +__u64 pasid; >> +__u64 addr; >> +__u64 granule_size; >> +__u64 nb_granules; >> +}; struct iommu_inv_pasid_info { #define IOMMU_INV_PASID_FLAGS_PASID (1 << 0) #define IOMMU_INV_PASID_FLAGS_ARCHID (1 << 1) __u32 flags; __u32 archid; __u64 pasid; }; >> + >> +/** >> + * First level/stage invalidation information >> + * @cache: bitfield that allows to select which caches to invalidate >> + * @granularity: defines the lowest granularity used for the invalidation: >> + * domain > pasid > addr >> + * >> + * Not all the combinations of cache/granularity make sense: >> + * >> + * type | DEV_IOTLB | IOTLB | PASID| >> + * granularity | | | cache| >> + * -+---+---+---+ >> + * DOMAIN | N/A | Y | Y | > * ARCHID | N/A | Y | N/A | > >> + * PASID| Y | Y | Y | >> + * ADDR | Y | Y | N/A | >> + * >> + * Invalidations by %IOMMU_INV_GRANU_ADDR use field @addr_info. > * Invalidations by %IOMMU_INV_GRANU_ARCHID use field @archid. >> + * Invalidations by %IOMMU_INV_GRANU_PASID use field @pasid. >> + * Invalidations by %IOMMU_INV_GRANU_DOMAIN don't take any argument. >> + * >> + * If multiple cache types are invalidated simultaneously, they all >> + * must support the used granularity. >> + */ >> +struct iommu_cache_invalidate_info { >> +#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1 >> +__u32 version; >> +/* IOMMU paging structure cache */ >> +#define IOMMU_CACHE_INV_TYPE_IOTLB (1 << 0) /* IOMMU IOTLB */ >> +#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB (1 << 1) /* Device IOTLB */ >> +#define IOMMU_CACHE_INV_TYPE_PASID (1 << 2) /* PASID cache */ >> +#define IOMMU_CACHE_TYPE_NR (3) >> +__u8 cache; >> +__u8 granularity; >> +__u8 padding[2]; >> +union { >> +__u64 pasid; > __u32 archid; struct iommu_inv_pasid_info pasid_info; Thanks, Jean > > Thanks > > Eric >> +struct iommu_inv_addr_info addr_info; >> +}; >> +}; >> + >> + >> #endif /* _UAPI_IOMMU_H */
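To make the proposal concrete, here is a hedged sketch of how a vSMMU implementation would relay a guest CMD_TLBI_NH_ASID with the suggested layout. This assumes the union gains the proposed (not yet merged) pasid_info member and the IOMMU_INV_PASID_FLAGS_* flags from the message above; the helper name is illustrative:

static void build_guest_asid_inv(struct iommu_cache_invalidate_info *inv,
				 u32 asid)
{
	*inv = (struct iommu_cache_invalidate_info) {
		.version     = IOMMU_CACHE_INVALIDATE_INFO_VERSION_1,
		.cache       = IOMMU_CACHE_INV_TYPE_IOTLB,
		.granularity = IOMMU_INV_GRANU_PASID,
		.pasid_info  = {
			/* an SMMU ASID is an ARCHID, not a PASID */
			.flags  = IOMMU_INV_PASID_FLAGS_ARCHID,
			.archid = asid,
		},
	};
}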
[PATCH v4 1/1] iommu/io-pgtable-arm: Add support to use system cache
Few Qualcomm platforms such as sdm845 have an additional outer cache called as System cache, aka. Last level cache (LLC) that allows non-coherent devices to upgrade to using caching. This cache sits right before the DDR, and is tightly coupled with the memory controller. The clients using this cache request their slices from this system cache, make it active, and can then start using it.

There is a fundamental assumption that non-coherent devices can't access caches. This change adds an exception where they *can* use some level of cache despite still being non-coherent overall. The coherent devices that use cacheable memory, and CPU make use of this system cache by default.

Looking at memory types, we have following - a) Normal uncached :- MAIR 0x44, inner non-cacheable, outer non-cacheable; b) Normal cached :- MAIR 0xff, inner read write-back non-transient, outer read write-back non-transient; attribute setting for coherent I/O devices. and, for non-coherent i/o devices that can allocate in system cache another type gets added - c) Normal sys-cached :- MAIR 0xf4, inner non-cacheable, outer read write-back non-transient

Coherent I/O devices use system cache by marking the memory as normal cached. Non-coherent I/O devices should mark the memory as normal sys-cached in page tables to use system cache.

Signed-off-by: Vivek Gautam --- V3 version of this patch and related series can be found at [1]. This change is a realisation of following changes from downstream msm-4.9: iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[2]

Changes since v3: - Dropping support to cache i/o page tables to system cache. Getting support for data buffers is the first step. Removed io-pgtable quirk and related change to add domain attribute.

Glmark2 numbers on SDM845 based cheza board:

S.No. | with LLC support | without LLC support
      | for data buffers |
---------------------------------------------
1     | 4480; 72.3fps    | 4042; 65.2fps
2     | 4500; 72.6fps    | 4039; 65.1fps
3     | 4523; 72.9fps    | 4106; 66.2fps
4     | 4489; 72.4fps    | 4104; 66.2fps
5     | 4518; 72.9fps    | 4072; 65.7fps

[1] https://patchwork.kernel.org/cover/10772629/
[2] https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9&id=d4c72c413ea27c43f60825193d4de9cb8ffd9602

drivers/iommu/io-pgtable-arm.c | 9 - include/linux/iommu.h | 1 + 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index d3700ec15cbd..2dbafe697531 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -167,10 +167,12 @@ #define ARM_LPAE_MAIR_ATTR_MASK 0xff #define ARM_LPAE_MAIR_ATTR_DEVICE 0x04 #define ARM_LPAE_MAIR_ATTR_NC 0x44 +#define ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE 0xf4 #define ARM_LPAE_MAIR_ATTR_WBRWA 0xff #define ARM_LPAE_MAIR_ATTR_IDX_NC 0 #define ARM_LPAE_MAIR_ATTR_IDX_CACHE 1 #define ARM_LPAE_MAIR_ATTR_IDX_DEV 2 +#define ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE 3 /* IOPTE accessors */ #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d)) @@ -442,6 +444,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data, else if (prot & IOMMU_CACHE) pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE << ARM_LPAE_PTE_ATTRINDX_SHIFT); + else if (prot & IOMMU_QCOM_SYS_CACHE) + pte |= (ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE + << ARM_LPAE_PTE_ATTRINDX_SHIFT); } else { pte = ARM_LPAE_PTE_HAP_FAULT; if (prot & IOMMU_READ) @@ -841,7 +846,9 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) (ARM_LPAE_MAIR_ATTR_WBRWA << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_CACHE)) | (ARM_LPAE_MAIR_ATTR_DEVICE - << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV)); + << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV)) | + (ARM_LPAE_MAIR_ATTR_QCOM_SYS_CACHE + << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_QCOM_SYS_CACHE)); cfg->arm_lpae_s1_cfg.mair[0] = reg; cfg->arm_lpae_s1_cfg.mair[1] = 0; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index a815cf6f6f47..29dd2c624348 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -31,6 +31,7 @@ #define IOMMU_CACHE (1 << 2) /* DMA cache coherency */ #define IOMMU_NOEXEC (1 << 3) #define IOMMU_MMIO (1 << 4) /* e.g. things like MSI doorbells */ +#define IOMMU_QCOM_SYS_CACHE (1 << 6) /* * Where the bus hardware includes a privilege level as part of its access type * markings, and certain devices are capable of issuing t
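As discussed in the review above, the flag is meant to be applied by masters that manage their own domains (e.g. the sdm845 GPU driver). A hedged usage sketch -- only IOMMU_QCOM_SYS_CACHE comes from the patch, the surrounding function is illustrative:

#include <linux/iommu.h>

static int map_buffer_with_llc(struct iommu_domain *domain,
			       unsigned long iova, phys_addr_t paddr,
			       size_t size)
{
	int prot = IOMMU_READ | IOMMU_WRITE | IOMMU_QCOM_SYS_CACHE;

	/* resulting PTE uses MAIR index 3: inner NC, outer WB-RWA */
	return iommu_map(domain, iova, paddr, size, prot);
}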
Re: [PATCH v3 02/16] iommu: Introduce cache_invalidate API
Hi Jacob, Jean-Philippe, On 5/4/19 12:32 AM, Jacob Pan wrote: > From: "Liu, Yi L" > > In any virtualization use case, when the first translation stage > is "owned" by the guest OS, the host IOMMU driver has no knowledge > of caching structure updates unless the guest invalidation activities > are trapped by the virtualizer and passed down to the host. > > Since the invalidation data are obtained from user space and will be > written into physical IOMMU, we must allow security check at various > layers. Therefore, generic invalidation data format are proposed here, > model specific IOMMU drivers need to convert them into their own format. > > Signed-off-by: Liu, Yi L > Signed-off-by: Jean-Philippe Brucker > Signed-off-by: Jacob Pan > Signed-off-by: Ashok Raj > Signed-off-by: Eric Auger > > --- > v6 -> v7: > - detail which fields are used for each invalidation type > - add a comment about multiple cache invalidation > > v5 -> v6: > - fix merge issue > > v3 -> v4: > - full reshape of the API following Alex' comments > > v1 -> v2: > - add arch_id field > - renamed tlb_invalidate into cache_invalidate as this API allows > to invalidate context caches on top of IOTLBs > > v1: > renamed sva_invalidate into tlb_invalidate and add iommu_ prefix in > header. Commit message reworded. > --- > drivers/iommu/iommu.c | 14 > include/linux/iommu.h | 15 - > include/uapi/linux/iommu.h | 80 > ++ > 3 files changed, 108 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c > index 8df9d34..a2f6f3e 100644 > --- a/drivers/iommu/iommu.c > +++ b/drivers/iommu/iommu.c > @@ -1645,6 +1645,20 @@ void iommu_detach_pasid_table(struct iommu_domain > *domain) > } > EXPORT_SYMBOL_GPL(iommu_detach_pasid_table); > > +int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev, > +struct iommu_cache_invalidate_info *inv_info) > +{ > + int ret = 0; > + > + if (unlikely(!domain->ops->cache_invalidate)) > + return -ENODEV; > + > + ret = domain->ops->cache_invalidate(domain, dev, inv_info); > + > + return ret; > +} > +EXPORT_SYMBOL_GPL(iommu_cache_invalidate); > + > static void __iommu_detach_device(struct iommu_domain *domain, > struct device *dev) > { > diff --git a/include/linux/iommu.h b/include/linux/iommu.h > index ab4d922..d182525 100644 > --- a/include/linux/iommu.h > +++ b/include/linux/iommu.h > @@ -266,6 +266,7 @@ struct page_response_msg { > * @page_response: handle page request response > * @attach_pasid_table: attach a pasid table > * @detach_pasid_table: detach the pasid table > + * @cache_invalidate: invalidate translation caches > * @pgsize_bitmap: bitmap of all possible supported page sizes > */ > struct iommu_ops { > @@ -328,8 +329,9 @@ struct iommu_ops { > int (*attach_pasid_table)(struct iommu_domain *domain, > struct iommu_pasid_table_config *cfg); > void (*detach_pasid_table)(struct iommu_domain *domain); > - > int (*page_response)(struct device *dev, struct page_response_msg *msg); > + int (*cache_invalidate)(struct iommu_domain *domain, struct device *dev, > + struct iommu_cache_invalidate_info *inv_info); > > unsigned long pgsize_bitmap; > }; > @@ -442,6 +444,9 @@ extern void iommu_detach_device(struct iommu_domain > *domain, > extern int iommu_attach_pasid_table(struct iommu_domain *domain, > struct iommu_pasid_table_config *cfg); > extern void iommu_detach_pasid_table(struct iommu_domain *domain); > +extern int iommu_cache_invalidate(struct iommu_domain *domain, > + struct device *dev, > + struct iommu_cache_invalidate_info *inv_info); > extern struct 
iommu_domain *iommu_get_domain_for_dev(struct device *dev); > extern struct iommu_domain *iommu_get_dma_domain(struct device *dev); > extern int iommu_map(struct iommu_domain *domain, unsigned long iova, > @@ -982,6 +987,14 @@ static inline int iommu_sva_get_pasid(struct iommu_sva > *handle) > static inline > void iommu_detach_pasid_table(struct iommu_domain *domain) {} > > +static inline int > +iommu_cache_invalidate(struct iommu_domain *domain, > +struct device *dev, > +struct iommu_cache_invalidate_info *inv_info) > +{ > + return -ENODEV; > +} > + > #endif /* CONFIG_IOMMU_API */ > > #ifdef CONFIG_IOMMU_DEBUGFS > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h > index 8848514..fa96ecb 100644 > --- a/include/uapi/linux/iommu.h > +++ b/include/uapi/linux/iommu.h > @@ -162,4 +162,84 @@ struct iommu_pasid_table_config { > }; > }; I noticed my qemu integration was currently incorrectly using PASID invalidation for ASID based invalidation (SMMUV3 Stage1 CMD_TLBI_NH
Re: [PATCH v7 18/23] iommu/smmuv3: Report non recoverable faults
Hi Robin, On 5/8/19 7:20 PM, Robin Murphy wrote: > On 08/04/2019 13:19, Eric Auger wrote: >> When a stage 1 related fault event is read from the event queue, >> let's propagate it to potential external fault listeners, ie. users >> who registered a fault handler. >> >> Signed-off-by: Eric Auger >> >> --- >> v4 -> v5: >> - s/IOMMU_FAULT_PERM_INST/IOMMU_FAULT_PERM_EXEC >> --- >> drivers/iommu/arm-smmu-v3.c | 169 +--- >> 1 file changed, 158 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c >> index 805bc32a..1fd320788dcb 100644 >> --- a/drivers/iommu/arm-smmu-v3.c >> +++ b/drivers/iommu/arm-smmu-v3.c >> @@ -167,6 +167,26 @@ >> #define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8 >> #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc >> +/* Events */ >> +#define ARM_SMMU_EVT_F_UUT 0x01 >> +#define ARM_SMMU_EVT_C_BAD_STREAMID 0x02 >> +#define ARM_SMMU_EVT_F_STE_FETCH 0x03 >> +#define ARM_SMMU_EVT_C_BAD_STE 0x04 >> +#define ARM_SMMU_EVT_F_BAD_ATS_TREQ 0x05 >> +#define ARM_SMMU_EVT_F_STREAM_DISABLED 0x06 >> +#define ARM_SMMU_EVT_F_TRANSL_FORBIDDEN 0x07 >> +#define ARM_SMMU_EVT_C_BAD_SUBSTREAMID 0x08 >> +#define ARM_SMMU_EVT_F_CD_FETCH 0x09 >> +#define ARM_SMMU_EVT_C_BAD_CD 0x0a >> +#define ARM_SMMU_EVT_F_WALK_EABT 0x0b >> +#define ARM_SMMU_EVT_F_TRANSLATION 0x10 >> +#define ARM_SMMU_EVT_F_ADDR_SIZE 0x11 >> +#define ARM_SMMU_EVT_F_ACCESS 0x12 >> +#define ARM_SMMU_EVT_F_PERMISSION 0x13 >> +#define ARM_SMMU_EVT_F_TLB_CONFLICT 0x20 >> +#define ARM_SMMU_EVT_F_CFG_CONFLICT 0x21 >> +#define ARM_SMMU_EVT_E_PAGE_REQUEST 0x24 >> + >> /* Common MSI config fields */ >> #define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2) >> #define MSI_CFG2_SH GENMASK(5, 4) >> @@ -332,6 +352,15 @@ >> #define EVTQ_MAX_SZ_SHIFT 7 >> #define EVTQ_0_ID GENMASK_ULL(7, 0) >> +#define EVTQ_0_SSV GENMASK_ULL(11, 11) >> +#define EVTQ_0_SUBSTREAMID GENMASK_ULL(31, 12) >> +#define EVTQ_0_STREAMID GENMASK_ULL(63, 32) >> +#define EVTQ_1_PNU GENMASK_ULL(33, 33) >> +#define EVTQ_1_IND GENMASK_ULL(34, 34) >> +#define EVTQ_1_RNW GENMASK_ULL(35, 35) >> +#define EVTQ_1_S2 GENMASK_ULL(39, 39) >> +#define EVTQ_1_CLASS GENMASK_ULL(40, 41) >> +#define EVTQ_3_FETCH_ADDR GENMASK_ULL(51, 3) >> /* PRI queue */ >> #define PRIQ_ENT_DWORDS 2 >> @@ -639,6 +668,64 @@ struct arm_smmu_domain { >> spinlock_t devices_lock; >> }; >> +/* fault propagation */ >> + >> +#define IOMMU_FAULT_F_FIELDS (IOMMU_FAULT_UNRECOV_PASID_VALID | \ >> + IOMMU_FAULT_UNRECOV_PERM_VALID | \ >> + IOMMU_FAULT_UNRECOV_ADDR_VALID) >> + >> +struct arm_smmu_fault_propagation_data { >> + enum iommu_fault_reason reason; >> + bool s1_check; >> + u32 fields; /* IOMMU_FAULT_UNRECOV_*_VALID bits */ >> +}; >> + >> +/* >> + * Describes how SMMU faults translate into generic IOMMU faults >> + * and if they need to be reported externally >> + */ >> +static const struct arm_smmu_fault_propagation_data >> fault_propagation[] = { >> +[ARM_SMMU_EVT_F_UUT] = { }, >> +[ARM_SMMU_EVT_C_BAD_STREAMID] = { }, >> +[ARM_SMMU_EVT_F_STE_FETCH] = { }, >> +[ARM_SMMU_EVT_C_BAD_STE] = { }, >> +[ARM_SMMU_EVT_F_BAD_ATS_TREQ] = { }, >> +[ARM_SMMU_EVT_F_STREAM_DISABLED] = { }, >> +[ARM_SMMU_EVT_F_TRANSL_FORBIDDEN] = { }, >> +[ARM_SMMU_EVT_C_BAD_SUBSTREAMID] = {IOMMU_FAULT_REASON_PASID_INVALID, >> + false, >> + IOMMU_FAULT_UNRECOV_PASID_VALID >> + }, >> +[ARM_SMMU_EVT_F_CD_FETCH] = {IOMMU_FAULT_REASON_PASID_FETCH, >> + false, >> + IOMMU_FAULT_UNRECOV_PASID_VALID | > > It doesn't make sense to presume validity here, or in any of the faults > below... 
> >> + IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID >> + }, >> +[ARM_SMMU_EVT_C_BAD_CD] = >> {IOMMU_FAULT_REASON_BAD_PASID_ENTRY, >> + false, >> + IOMMU_FAULT_UNRECOV_PASID_VALID >> + }, >> +[ARM_SMMU_EVT_F_WALK_EABT] = {IOMMU_FAULT_REASON_WALK_EABT, true, >> + IOMMU_FAULT_F_FIELDS | >> + IOMMU_FAULT_UNRECOV_FETCH_ADDR_VALID >> + }, >> +[ARM_SMMU_EVT_F_TRANSLATION] = {IOMMU_FAULT_REASON_PTE_FETCH, >> true, >> + IOMMU_FAULT_F_FIELDS >> + }, >> +[ARM_SMMU_EVT_F_ADDR_SIZE] = {IOMMU_FAULT_REASON_OOR_ADDRESS, >> true, >> + IOMMU_FAULT_F_FIELDS >> + }, >> +[ARM_SMMU_EVT_F_ACCESS] = {IOMMU_FAULT_REASON_ACCESS, true, >> + IOMMU_FAULT_F_FIELDS >> +
[PATCH 3/4] iommu/vt-d: Handle RMRR with PCI bridge device scopes
When reading the vtd specification and especially the Reserved Memory Region Reporting Structure chapter, it is not obvious that a device scope element cannot be a PCI-PCI bridge, in which case all downstream ports are likely to access the reserved memory region. Let's handle this case in device_has_rmrr. Fixes: ea2447f700ca ("intel-iommu: Prevent devices with RMRRs from being placed into SI Domain") Signed-off-by: Eric Auger --- drivers/iommu/intel-iommu.c | 32 +++- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index e2134b13c9ae..89d82a1d50b1 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -736,12 +736,31 @@ static int iommu_dummy(struct device *dev) return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO; } +static bool +is_downstream_to_pci_bridge(struct device *deva, struct device *devb) +{ + struct pci_dev *pdeva, *pdevb; + + if (!dev_is_pci(deva) || !dev_is_pci(devb)) + return false; + + pdeva = to_pci_dev(deva); + pdevb = to_pci_dev(devb); + + if (pdevb->subordinate && + pdevb->subordinate->number <= pdeva->bus->number && + pdevb->subordinate->busn_res.end >= pdeva->bus->number) + return true; + + return false; +} + static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn) { struct dmar_drhd_unit *drhd = NULL; struct intel_iommu *iommu; struct device *tmp; - struct pci_dev *ptmp, *pdev = NULL; + struct pci_dev *pdev = NULL; u16 segment = 0; int i; @@ -787,13 +806,7 @@ static struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devf goto out; } - if (!pdev || !dev_is_pci(tmp)) - continue; - - ptmp = to_pci_dev(tmp); - if (ptmp->subordinate && - ptmp->subordinate->number <= pdev->bus->number && - ptmp->subordinate->busn_res.end >= pdev->bus->number) + if (is_downstream_to_pci_bridge(dev, tmp)) goto got_pdev; } @@ -2886,7 +2899,8 @@ static bool device_has_rmrr(struct device *dev) */ for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt, i, tmp) - if (tmp == dev || + is_downstream_to_pci_bridge(dev, tmp)) { rcu_read_unlock(); return true; } -- 2.20.1
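A concrete example of the new check: suppose the RMRR device scope names a bridge devb whose subordinate bus range starts at bus 3 (pdevb->subordinate->number == 3) and ends at bus 5 (busn_res.end == 5). A device deva at 04:00.0 sits on bus 4, so the test 3 <= 4 <= 5 holds, is_downstream_to_pci_bridge() returns true, and the RMRR now applies to deva as the spec seems to intend.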
RE: [PATCH v3 0/6] iommu/ipmmu-vmsa: Suspend/resume support and assorted cleanups
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Wednesday, April 24, 2019 10:55 PM
>
> Hi Jörg, Magnus,
>
> On R-Car Gen3 systems with PSCI, PSCI may power down the SoC during system suspend, thus losing all IOMMU state. Hence after s2ram, devices behind an IPMMU (e.g. SATA), and configured to use it, will fail to complete their I/O operations.
>
> This patch series adds suspend/resume support to the Renesas IPMMU-VMSA IOMMU driver, and performs some smaller cleanups and fixes during the process. Most patches are fairly independent, except for patch 6/6, which depends on patches 4/6 and 5/6.
>
> Changes compared to v2:
> - Fix sysfs path typo in patch description,
> - Add Reviewed-by.
>
> Changes compared to v1:
> - Dropped "iommu/ipmmu-vmsa: Call ipmmu_ctx_write_root() instead of open coding",
> - Add Reviewed-by,
> - Merge IMEAR/IMELAR,
> - s/ipmmu_context_init/ipmmu_domain_setup_context/,
> - Drop PSCI checks.
>
> This has been tested on Salvator-XS with R-Car H3 ES2.0, with IPMMU support for SATA enabled. To play safe, the resume operation has also been tested on R-Car M2-W.

Thank you for the patch! I reviewed this patch series and tested it on R-Car H3 ES3.0 with IPMMU support for USB3.0 host and SDHI. So,

Reviewed-by: Yoshihiro Shimoda
Tested-by: Yoshihiro Shimoda

Best regards,
Yoshihiro Shimoda
[PATCH 0/4] RMRR related fixes
Currently the Intel reserved region is attached to the RMRR unit and when building the list of RMRR seen by a device we link this unique reserved region without taking care of potential multiple usage of this reserved region by several devices. Also while reading the vtd spec it is unclear to me whether the RMRR device scope referenced by an RMRR ACPI struct could be a PCI-PCI bridge, in which case I think we also need to check the device belongs to the PCI sub-hierarchy of the device referenced in the scope. This would be true for device_has_rmrr() and intel_iommu_get_resv_regions(). Eric Auger (4): iommu: Pass a GFP flag parameter to iommu_alloc_resv_region() iommu/vt-d: Duplicate iommu_resv_region objects per device list iommu/vt-d: Handle RMRR with PCI bridge device scopes iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions drivers/acpi/arm64/iort.c | 3 +- drivers/iommu/amd_iommu.c | 7 ++-- drivers/iommu/arm-smmu-v3.c | 2 +- drivers/iommu/arm-smmu.c| 2 +- drivers/iommu/intel-iommu.c | 68 +++-- drivers/iommu/iommu.c | 7 ++-- include/linux/iommu.h | 2 +- 7 files changed, 55 insertions(+), 36 deletions(-) -- 2.20.1
[PATCH 2/4] iommu/vt-d: Duplicate iommu_resv_region objects per device list
intel_iommu_get_resv_regions() aims to return the list of reserved regions accessible by a given @device. However several devices can access the same reserved memory region and when building the list it is not safe to use a single iommu_resv_region object, whose container is the RMRR. This iommu_resv_region must be duplicated per device reserved region list. Let's remove the struct iommu_resv_region from the RMRR unit and allocate the iommu_resv_region directly in intel_iommu_get_resv_regions(). Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks") Signed-off-by: Eric Auger --- drivers/iommu/intel-iommu.c | 31 --- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 2075abdb174d..e2134b13c9ae 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -322,7 +322,6 @@ struct dmar_rmrr_unit { u64 end_address;/* reserved end address */ struct dmar_dev_scope *devices; /* target devices */ int devices_cnt;/* target device count */ - struct iommu_resv_region *resv; /* reserved region handle */ }; struct dmar_atsr_unit { @@ -4206,7 +4205,6 @@ static inline void init_iommu_pm_ops(void) {} int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg) { struct acpi_dmar_reserved_memory *rmrr; - int prot = DMA_PTE_READ|DMA_PTE_WRITE; struct dmar_rmrr_unit *rmrru; size_t length; @@ -4220,22 +4218,16 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg) rmrru->end_address = rmrr->end_address; length = rmrr->end_address - rmrr->base_address + 1; - rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot, - IOMMU_RESV_DIRECT, GFP_KERNEL); - if (!rmrru->resv) - goto free_rmrru; rmrru->devices = dmar_alloc_dev_scope((void *)(rmrr + 1), ((void *)rmrr) + rmrr->header.length, &rmrru->devices_cnt); if (rmrru->devices_cnt && rmrru->devices == NULL) - goto free_all; + goto free_rmrru; list_add(&rmrru->list, &dmar_rmrr_units); return 0; -free_all: - kfree(rmrru->resv); free_rmrru: kfree(rmrru); out: @@ -4453,7 +4445,6 @@ static void intel_iommu_free_dmars(void) list_for_each_entry_safe(rmrru, rmrr_n, &dmar_rmrr_units, list) { list_del(&rmrru->list); dmar_free_dev_scope(&rmrru->devices, &rmrru->devices_cnt); - kfree(rmrru->resv); kfree(rmrru); } @@ -5271,6 +5262,7 @@ static void intel_iommu_remove_device(struct device *dev) static void intel_iommu_get_resv_regions(struct device *device, struct list_head *head) { + int prot = DMA_PTE_READ|DMA_PTE_WRITE; struct iommu_resv_region *reg; struct dmar_rmrr_unit *rmrr; struct device *i_dev; @@ -5280,10 +5272,21 @@ static void intel_iommu_get_resv_regions(struct device *device, for_each_rmrr_units(rmrr) { for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt, i, i_dev) { + struct iommu_resv_region *resv; + size_t length; + if (i_dev != device) continue; - list_add_tail(&rmrr->resv->list, head); + length = rmrr->end_address - rmrr->base_address + 1; + resv = iommu_alloc_resv_region(rmrr->base_address, + length, prot, + IOMMU_RESV_DIRECT, + GFP_ATOMIC); + if (!resv) + break; + + list_add_tail(&resv->list, head); } } rcu_read_unlock(); @@ -5301,10 +5304,8 @@ static void intel_iommu_put_resv_regions(struct device *dev, { struct iommu_resv_region *entry, *next; - list_for_each_entry_safe(entry, next, head, list) { - if (entry->type == IOMMU_RESV_MSI) - kfree(entry); - } + list_for_each_entry_safe(entry, next, head, list) + kfree(entry); } #ifdef CONFIG_INTEL_IOMMU_SVM -- 2.20.1
[PATCH 1/4] iommu: Pass a GFP flag parameter to iommu_alloc_resv_region()
We plan to call iommu_alloc_resv_region in a non preemptible section. Pass a GFP flag to the function and update all the call sites. Signed-off-by: Eric Auger --- drivers/acpi/arm64/iort.c | 3 ++- drivers/iommu/amd_iommu.c | 7 --- drivers/iommu/arm-smmu-v3.c | 2 +- drivers/iommu/arm-smmu.c| 2 +- drivers/iommu/intel-iommu.c | 4 ++-- drivers/iommu/iommu.c | 7 --- include/linux/iommu.h | 2 +- 7 files changed, 15 insertions(+), 12 deletions(-) diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index adbf7cbedf80..20b56ae91513 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -868,7 +868,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head) struct iommu_resv_region *region; region = iommu_alloc_resv_region(base + SZ_64K, SZ_64K, -prot, IOMMU_RESV_MSI); +prot, IOMMU_RESV_MSI, +GFP_KERNEL); if (region) { list_add_tail(®ion->list, head); resv++; diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index f7cdd2ab7f11..a9aab13a9487 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -3186,7 +3186,8 @@ static void amd_iommu_get_resv_regions(struct device *dev, type = IOMMU_RESV_RESERVED; region = iommu_alloc_resv_region(entry->address_start, -length, prot, type); +length, prot, type, +GFP_KERNEL); if (!region) { dev_err(dev, "Out of memory allocating dm-regions\n"); return; @@ -3196,14 +3197,14 @@ static void amd_iommu_get_resv_regions(struct device *dev, region = iommu_alloc_resv_region(MSI_RANGE_START, MSI_RANGE_END - MSI_RANGE_START + 1, -0, IOMMU_RESV_MSI); +0, IOMMU_RESV_MSI, GFP_KERNEL); if (!region) return; list_add_tail(®ion->list, head); region = iommu_alloc_resv_region(HT_RANGE_START, HT_RANGE_END - HT_RANGE_START + 1, -0, IOMMU_RESV_RESERVED); +0, IOMMU_RESV_RESERVED, GFP_KERNEL); if (!region) return; list_add_tail(®ion->list, head); diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index d3880010c6cf..5aae50c811b3 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -2031,7 +2031,7 @@ static void arm_smmu_get_resv_regions(struct device *dev, int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO; region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH, -prot, IOMMU_RESV_SW_MSI); +prot, IOMMU_RESV_SW_MSI, GFP_KERNEL); if (!region) return; diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 045d93884164..3c28bc0555d4 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -1667,7 +1667,7 @@ static void arm_smmu_get_resv_regions(struct device *dev, int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO; region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH, -prot, IOMMU_RESV_SW_MSI); +prot, IOMMU_RESV_SW_MSI, GFP_KERNEL); if (!region) return; diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 28cb713d728c..2075abdb174d 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -4221,7 +4221,7 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg) length = rmrr->end_address - rmrr->base_address + 1; rmrru->resv = iommu_alloc_resv_region(rmrr->base_address, length, prot, - IOMMU_RESV_DIRECT); + IOMMU_RESV_DIRECT, GFP_KERNEL); if (!rmrru->resv) goto free_rmrru; @@ -5290,7 +5290,7 @@ static void intel_iommu_get_resv_regions(struct device *device, reg = iommu_alloc_resv_region(IOAPIC_RANGE_START, IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1, - 0, IOMMU_RESV_MSI); + 0, IOMMU_RESV_MSI, GFP_KERNEL); if (!reg) return; list_add_tai
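After this change the helper's prototype presumably becomes the following (the call sites in the hunks above all gain the new last argument; the non-preemptible section motivating it is the RCU-protected RMRR walk in patch 2/4 of this series, which passes GFP_ATOMIC, while the sleepable call sites keep GFP_KERNEL):

struct iommu_resv_region *iommu_alloc_resv_region(phys_addr_t start,
						  size_t length, int prot,
						  enum iommu_resv_type type,
						  gfp_t gfp);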
[PATCH 4/4] iommu/vt-d: Handle PCI bridge RMRR device scopes in intel_iommu_get_resv_regions
In the case the RMRR device scope is a PCI-PCI bridge, let's check that the device belongs to the PCI sub-hierarchy.

Fixes: 0659b8dc45a6 ("iommu/vt-d: Implement reserved region get/put callbacks")
Signed-off-by: Eric Auger
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 89d82a1d50b1..9c1a765eca8b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5289,7 +5289,8 @@ static void intel_iommu_get_resv_regions(struct device *device,
 			struct iommu_resv_region *resv;
 			size_t length;
 
-			if (i_dev != device)
+			if (i_dev != device &&
+			    !is_downstream_to_pci_bridge(device, i_dev))
 				continue;
 
 			length = rmrr->end_address - rmrr->base_address + 1;
-- 
2.20.1
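is_downstream_to_pci_bridge() comes from an earlier patch in this series and is not shown here; a rough sketch of the check, assuming it simply tests whether the device's bus number falls inside the bus range behind the bridge:

#include <linux/device.h>
#include <linux/pci.h>

/*
 * Sketch: dev is reachable through bridge if dev's bus number lies
 * within the secondary..subordinate bus range of the bridge.
 */
static bool is_downstream_to_pci_bridge(struct device *dev,
					struct device *bridge)
{
	struct pci_dev *pdev, *pbridge;

	if (!dev_is_pci(dev) || !dev_is_pci(bridge))
		return false;

	pdev = to_pci_dev(dev);
	pbridge = to_pci_dev(bridge);

	/* The bridge's subordinate bus range covers pdev's bus. */
	return pbridge->subordinate &&
	       pbridge->subordinate->number <= pdev->bus->number &&
	       pbridge->subordinate->busn_res.end >= pdev->bus->number;
}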
RE: [ARM SMMU] Dynamic StreamID allocation
Hi Jean,

> -----Original Message-----
> From: Jean-Philippe Brucker
> Sent: Friday, 10 May, 2019 07:07 PM
> To: Pankaj Bansal ; Will Deacon ; Robin Murphy ; Joerg Roedel
> Cc: iommu@lists.linux-foundation.org; Varun Sethi ; linux-arm-ker...@lists.infradead.org; Nipun Gupta
> Subject: Re: [ARM SMMU] Dynamic StreamID allocation
>
> On 10/05/2019 13:33, Pankaj Bansal wrote:
> > Hi Will/Robin/Joerg,
> >
> > I am a s/w engineer from NXP India Pvt. Ltd.
> > We are using SMMU-v3 in one of our NXP SoCs.
> > I have a question about SMMU stream ID allocation in Linux.
> >
> > Right now the stream IDs allocated to a device are mapped to the
> > device via the device tree:
> > https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt#L39
> >
> > As the device tree is passed from the bootloader to Linux, we detect
> > all the stream IDs needed by a device in the bootloader and add them
> > to the respective device nodes.
> > For each PCIe endpoint (a unique BDF (Bus Device Function)) on the
> > PCIe bus, we assign a unique stream ID in the bootloader.
> >
> > However, this poses an issue with PCIe hotplug.
> > If we plug in a PCIe device while Linux is running, a unique BDF is
> > assigned to the device, for which there is no stream ID in the device
> > tree.
> >
> > How can this problem be solved in Linux?
>
> Assuming the stream ID associated with a BDF is predictable
> (stream ID = BDF + constant), using the iommu-map property should just
> work:
>
> https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/pci/pci-iommu.txt
>
> It describes the stream IDs of all possible BDFs, including hotplugged
> functions.

You mean that we should increase the "length" parameter (in the
(rid-base, iommu, iommu-base, length) tuple)? That would cater to any
*new* Bus Device Function detected on the PCIe bus, is that right?

Right now, when we build the iommu-map in the bootloader, we assign one
RID per BDF:
https://elixir.bootlin.com/u-boot/latest/source/drivers/pci/pcie_layerscape_fixup.c#L168

But isn't the better approach to make this dynamic in Linux? i.e. as soon
as a new device is detected, a requester ID is allocated to it from an
available pool, and when the device is removed, the requester ID is
returned to the pool. Is there any h/w limitation that prevents this?

Regards,
Pankaj Bansal

> Thanks,
> Jean
>
> > Is there a way to assign (and revoke) stream IDs at run time?
> >
> > Regards,
> > Pankaj Bansal
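To illustrate the iommu-map approach under discussion: a single entry whose length spans the whole 16-bit requester ID space maps every possible BDF up front, including hotplugged ones. The node, phandle, and stream ID base below are illustrative, not taken from any real platform:

pcie {
	/*
	 * (rid-base iommu iommu-base length): map all 0x10000 possible
	 * RIDs on this root complex, so a hotplugged function with any
	 * BDF resolves to stream ID = 0x4000 + RID without regenerating
	 * the device tree.
	 */
	iommu-map = <0x0 &smmu 0x4000 0x10000>;
	iommu-map-mask = <0xffff>;
};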
Re: [PATCH v3 02/10] swiotlb: Factor out slot allocation and free
On Mon, May 06, 2019 at 09:54:30AM +0800, Lu Baolu wrote:
> Agreed. I will prepare the next version simply without the optimization, so
> the offset is not required.
>
> For your changes in swiotlb, will you formalize them in patches or want
> me to do this?

Please do it yourself, given that you still need the offset and thus a
rework of the patches anyway.