Re: [PATCH] iommu/arm-smmu-qcom: Fix TTBR0 read
On Mon 08 Nov 09:17 PST 2021, Rob Clark wrote:

> From: Rob Clark
>
> It is a 64-bit register; let's not lose the upper bits.
>
> Fixes: ab5df7b953d8 ("iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback
> to get pagefault info")
> Signed-off-by: Rob Clark

Reviewed-by: Bjorn Andersson

Regards,
Bjorn

> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index 55690af1b25d..c998960495b4 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -51,7 +51,7 @@ static void qcom_adreno_smmu_get_fault_info(const void *cookie,
>  	info->fsynr1 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR1);
>  	info->far = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_FAR);
>  	info->cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
> -	info->ttbr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
> +	info->ttbr0 = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
>  	info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_CONTEXTIDR);
>  }
>
> --
> 2.31.1
Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)
Hi Paul,

> On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
> almost five seconds to initialize PCI. According to the timestamps, 1.5 s
> are from assigning the PCI devices to the 142 IOMMU groups.
[...]
> Is there anything that could be done to reduce the time?

I am curious - why is this a problem? Are you power-cycling your servers
so often that the cumulative time spent enumerating PCI devices and
adding them to IOMMU groups becomes a problem?

I am simply wondering why you decided to single out the PCI enumeration
as slow in particular, especially given that large server hardware tends
to have (most of the time, in my experience) a rather long initialisation
time, whether starting from being powered off or after a power cycle. It
can take a while before the actual operating system itself starts.

We talked about this briefly with Bjorn, and there might be an option to
add some caching, as we suspect the culprit here is the PCI configuration
space read done for each device, which can be slow on some platforms.

However, we would need to profile this to get some quantitative data to
see whether doing anything about it would even be worthwhile. It would
definitely help us understand better where the bottlenecks really are
and of what magnitude.

I personally don't have access to hardware as large as what you have
access to, thus I was wondering whether you would have some time, and be
willing, to profile this for us on your hardware.

Let me know what you think.

	Krzysztof
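[Editor's note: as a rough illustration of the kind of measurement being discussed, here is a minimal userspace sketch that times repeated PCI config space reads through sysfs. The device path and iteration count are arbitrary assumptions, not from the thread, and the sysfs path adds syscall overhead on top of the raw config cycle, so treat the numbers as an upper bound.]

/*
 * Time a burst of PCI config space reads for one device via sysfs.
 * Point it at any device present on the system.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1]
				    : "/sys/bus/pci/devices/0000:00:00.0/config";
	int fd = open(path, O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	enum { READS = 1000 };
	uint32_t val;
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < READS; i++)
		pread(fd, &val, sizeof(val), 0); /* dword 0: vendor/device ID */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
	printf("%d config reads: %.0f ns avg\n", READS, ns / READS);
	close(fd);
	return 0;
}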
[PATCH] iommu/arm-smmu-qcom: Fix TTBR0 read
From: Rob Clark

It is a 64-bit register; let's not lose the upper bits.

Fixes: ab5df7b953d8 ("iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info")
Signed-off-by: Rob Clark
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 55690af1b25d..c998960495b4 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -51,7 +51,7 @@ static void qcom_adreno_smmu_get_fault_info(const void *cookie,
 	info->fsynr1 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR1);
 	info->far = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_FAR);
 	info->cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
-	info->ttbr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
+	info->ttbr0 = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
 	info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_CONTEXTIDR);
 }

--
2.31.1
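[Editor's note: to see what the one-character change preserves - TTBR0 is a 64-bit context bank register, and the driver keeps the ASID in its upper bits (the arm-smmu ARM_SMMU_TTBRn_ASID mask covers bits [63:48]), so a 32-bit read returns only the low word. A minimal standalone sketch of the truncation, with made-up values:]

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Hypothetical ASID 0x21 in bits [63:48], plus a table address. */
	uint64_t ttbr0 = (0x21ULL << 48) | 0x80001000ULL;

	uint32_t truncated = (uint32_t)ttbr0; /* what a 32-bit cb_read() yields */

	printf("cb_read : 0x%08" PRIx32 "  (ASID lost)\n", truncated);
	printf("cb_readq: 0x%016" PRIx64 "\n", ttbr0);
	return 0;
}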
Re: [PATCH v2 0/8] Host1x context isolation support
On 9/16/21 5:32 PM, Mikko Perttunen wrote:

Hi all,

*** New in v2:
- Added support for Tegra194
- Use standard iommu-map property instead of custom mechanism
***

This series adds support for Host1x 'context isolation'. Since, when
programming engines through Host1x, userspace can program in any
addresses it wants, we need some way to isolate the engines' memory
spaces. Traditionally this has either been done imperfectly with a
single shared IOMMU domain, or by copying and verifying the programming
command stream at submit time (Host1x firewall).

Since Tegra186 there is a privileged (only usable by the kernel) Host1x
opcode that allows setting the stream ID sent by the engine to the SMMU.
So, by allocating a number of context banks and stream IDs for this
purpose, and using this opcode at the beginning of each job, we can
implement isolation. Due to the limited number of context banks, each
process gets its own context, rather than each channel.

This feature also allows sharing engines among multiple VMs when used
with Host1x's hardware virtualization support - up to 8 VMs can be
configured with a subset of allowed stream IDs, enforced at the
hardware level.

To implement this, this series adds a new host1x context bus, which will
contain the 'struct device's corresponding to each context bank / stream
ID, changes to device tree and SMMU code to allow registering the
devices and using the bus, as well as the Host1x stream ID programming
code and support in TegraDRM.

Device tree bindings are not updated yet, pending consensus that the
proposed changes make sense.

Thanks,
Mikko

Mikko Perttunen (8):
  gpu: host1x: Add context bus
  gpu: host1x: Add context device management code
  gpu: host1x: Program context stream ID on submission
  iommu/arm-smmu: Attach to host1x context device bus
  arm64: tegra: Add Host1x context stream IDs on Tegra186+
  drm/tegra: falcon: Set DMACTX field on DMA transactions
  drm/tegra: vic: Implement get_streamid_offset
  drm/tegra: Support context isolation

 arch/arm64/boot/dts/nvidia/tegra186.dtsi  |  12 ++
 arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  12 ++
 drivers/gpu/Makefile                      |   3 +-
 drivers/gpu/drm/tegra/drm.h               |   2 +
 drivers/gpu/drm/tegra/falcon.c            |   8 +
 drivers/gpu/drm/tegra/falcon.h            |   1 +
 drivers/gpu/drm/tegra/submit.c            |  13 ++
 drivers/gpu/drm/tegra/uapi.c              |  34 +-
 drivers/gpu/drm/tegra/vic.c               |  38 +
 drivers/gpu/host1x/Kconfig                |   5 +
 drivers/gpu/host1x/Makefile               |   2 +
 drivers/gpu/host1x/context.c              | 174 ++
 drivers/gpu/host1x/context.h              |  27 ++
 drivers/gpu/host1x/context_bus.c          |  31 ++
 drivers/gpu/host1x/dev.c                  |  12 +-
 drivers/gpu/host1x/dev.h                  |   2 +
 drivers/gpu/host1x/hw/channel_hw.c        |  52 ++-
 drivers/gpu/host1x/hw/host1x06_hardware.h |  10 ++
 drivers/gpu/host1x/hw/host1x07_hardware.h |  10 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.c     |  13 ++
 include/linux/host1x.h                    |  21 ++
 include/linux/host1x_context_bus.h        |  15 ++
 22 files changed, 488 insertions(+), 9 deletions(-)
 create mode 100644 drivers/gpu/host1x/context.c
 create mode 100644 drivers/gpu/host1x/context.h
 create mode 100644 drivers/gpu/host1x/context_bus.c
 create mode 100644 include/linux/host1x_context_bus.h

IOMMU/DT folks, any thoughts about this approach? The patches that are
of interest outside of Host1x/TegraDRM specifics are patches 1, 2, 4,
and 5.

Thanks,
Mikko
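[Editor's note: for readers outside the series, the allocation policy the cover letter describes boils down to a small fixed pool of context banks / stream IDs, one handed to each process and refcounted across that process's jobs. The sketch below is an editorial illustration only - the names, pool size, and stream ID values are hypothetical, not taken from the patches:]

#include <stdio.h>

#define NUM_CONTEXTS 8 /* pool size limited by hardware, per the cover letter */

struct ctx_slot {
	unsigned int stream_id; /* stream ID the engine emits for this context */
	int pid;                /* owning process, -1 when free */
	int refcount;
};

static struct ctx_slot pool[NUM_CONTEXTS] = {
	/* stream ID values are placeholders */
	{ 0x56, -1, 0 }, { 0x57, -1, 0 }, { 0x58, -1, 0 }, { 0x59, -1, 0 },
	{ 0x5a, -1, 0 }, { 0x5b, -1, 0 }, { 0x5c, -1, 0 }, { 0x5d, -1, 0 },
};

/* Reuse the process's existing context if it has one, else grab a free slot. */
static struct ctx_slot *ctx_get(int pid)
{
	struct ctx_slot *free_slot = NULL;

	for (int i = 0; i < NUM_CONTEXTS; i++) {
		if (pool[i].pid == pid) {
			pool[i].refcount++;
			return &pool[i];
		}
		if (pool[i].pid < 0 && !free_slot)
			free_slot = &pool[i];
	}
	if (free_slot) {
		free_slot->pid = pid;
		free_slot->refcount = 1;
	}
	return free_slot; /* NULL when all context banks are taken */
}

static void ctx_put(struct ctx_slot *slot)
{
	if (slot && --slot->refcount == 0)
		slot->pid = -1; /* return the bank to the pool */
}

int main(void)
{
	struct ctx_slot *a = ctx_get(100); /* first job from pid 100 */
	struct ctx_slot *b = ctx_get(100); /* second job reuses the same context */

	printf("pid 100 -> stream ID 0x%x (shared: %s)\n",
	       a->stream_id, a == b ? "yes" : "no");
	ctx_put(b);
	ctx_put(a);
	return 0;
}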
[PATCH v2] memory: mtk-smi: Fix a null dereference for the ostd
We add the ostd setting for mt8195. It introduces a kernel exception (KE)
on previous SoCs which don't have an ostd setting. This is the log:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
  ...
  pc : mtk_smi_larb_config_port_gen2_general+0x64/0x130
  lr : mtk_smi_larb_resume+0x54/0x98
  ...
  Call trace:
   mtk_smi_larb_config_port_gen2_general+0x64/0x130
   pm_generic_runtime_resume+0x2c/0x48
   __genpd_runtime_resume+0x30/0xa8
   genpd_runtime_resume+0x94/0x2c8
   __rpm_callback+0x44/0x150
   rpm_callback+0x6c/0x78
   rpm_resume+0x310/0x558
   __pm_runtime_resume+0x3c/0x88

In the code: larbostd = larb->larb_gen->ostd[larb->larbid]. If
"larb->larb_gen->ostd" is NULL, then "larbostd" is just the offset
(e.g. 0x80 above), which is still a non-NULL value, so accessing
"larbostd[i]" in the "for" loop causes the KE above.

To avoid this issue, initialize "larbostd" to NULL when the SoC doesn't
have an ostd setting.

Signed-off-by: Yong Wu
---
change note:
Reword the commit message to show why it causes a KE, and update the
solution by explicitly initializing "larbostd" to NULL in the non-ostd
case.
---
 drivers/memory/mtk-smi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index b883dcc0bbfa..e201e5976f34 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -241,7 +241,7 @@ static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
 {
 	struct mtk_smi_larb *larb = dev_get_drvdata(dev);
 	u32 reg, flags_general = larb->larb_gen->flags_general;
-	const u8 *larbostd = larb->larb_gen->ostd[larb->larbid];
+	const u8 *larbostd = larb->larb_gen->ostd ? larb->larb_gen->ostd[larb->larbid] : NULL;
 	int i;

 	if (BIT(larb->larbid) & larb->larb_gen->larb_direct_to_common_mask)
--
2.18.0
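[Editor's note: the failure mode described above - a NULL table base turning into a small, "valid-looking" offset pointer - is easy to reproduce in isolation. A minimal userspace sketch; larbid is illustrative, and the 32-entry row size matches the driver's SMI_LARB_PORT_NR_MAX. Strictly, arithmetic on NULL is undefined behaviour in C; this shows what happens in practice:]

#include <stdio.h>

#define SMI_LARB_PORT_NR_MAX 32

int main(void)
{
	const unsigned char (*ostd)[SMI_LARB_PORT_NR_MAX] = NULL; /* SoC without ostd */
	int larbid = 4;

	/* No load happens here: array indexing just computes NULL + 4 * 32 = 0x80. */
	const unsigned char *larbostd = ostd[larbid];

	printf("larbostd = %p -- non-NULL, so a NULL check on the result never fires\n",
	       (void *)larbostd);
	/* Dereferencing larbostd[0] here would fault near 0x80, matching the oops,
	 * which is why the fix checks the base pointer ostd instead. */
	return 0;
}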