Re: [PATCH] iommu/arm-smmu-qcom: Fix TTBR0 read

2021-11-08 Thread Bjorn Andersson
On Mon 08 Nov 09:17 PST 2021, Rob Clark wrote:

> From: Rob Clark 
> 
> It is a 64b register, lets not lose the upper bits.
> 
> Fixes: ab5df7b953d8 ("iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info")
> Signed-off-by: Rob Clark 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index 55690af1b25d..c998960495b4 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -51,7 +51,7 @@ static void qcom_adreno_smmu_get_fault_info(const void *cookie,
>  	info->fsynr1 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR1);
>  	info->far = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_FAR);
>  	info->cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
> -	info->ttbr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
> +	info->ttbr0 = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
>  	info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_CONTEXTIDR);
>  }
>  
> -- 
> 2.31.1
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: How to reduce PCI initialization from 5 s (1.5 s adding them to IOMMU groups)

2021-11-08 Thread Krzysztof Wilczyński
Hi Paul,

> On a PowerEdge T440/021KCD, BIOS 2.11.2 04/22/2021, Linux 5.10.70 takes
> almost five seconds to initialize PCI. According to the timestamps, 1.5 s
> are from assigning the PCI devices to the 142 IOMMU groups.
[...]
> Is there anything that could be done to reduce the time?

I am curious - why is this a problem?  Are you power-cycling your servers
so often that the cumulative time spent enumerating PCI devices and later
adding them to IOMMU groups becomes a problem?

I am simply wondering why you decided to single out PCI enumeration as
slow in particular, especially given that large server hardware tends to
have (most of the time, in my experience) a rather long initialisation
time, either from being powered off or after being power cycled.  It can
take a while before the actual operating system itself starts.

We talked about this briefly with Bjorn, and there might be an option to
add some caching, as we suspect that the culprit here is doing a PCI
configuration space read for each device, which can be slow on some
platforms.

However, we would need to profile this to get some quantitative data to see
whether doing anything would even be worthwhile.  It would definitely help
us understand better where the bottlenecks really are and of what magnitude.

I personally don't have access to hardware as large as yours, so I was
wondering whether you would have some time, and be willing, to profile
this for us on the hardware you have.

Let me know what you think.

Krzysztof


[PATCH] iommu/arm-smmu-qcom: Fix TTBR0 read

2021-11-08 Thread Rob Clark
From: Rob Clark 

It is a 64b register, lets not lose the upper bits.

Fixes: ab5df7b953d8 ("iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info")
Signed-off-by: Rob Clark 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 55690af1b25d..c998960495b4 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -51,7 +51,7 @@ static void qcom_adreno_smmu_get_fault_info(const void *cookie,
 	info->fsynr1 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_FSYNR1);
 	info->far = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_FAR);
 	info->cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(cfg->cbndx));
-	info->ttbr0 = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
+	info->ttbr0 = arm_smmu_cb_readq(smmu, cfg->cbndx, ARM_SMMU_CB_TTBR0);
 	info->contextidr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_CONTEXTIDR);
 }
 
-- 
2.31.1
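
As an illustration of why the accessor width matters, here is a minimal
userspace sketch. It does not use the kernel's arm_smmu_cb_read() or
arm_smmu_cb_readq(); the demo_* helpers are hypothetical stand-ins that
model a 64-bit register as a plain uint64_t, and the register value in
the usage note is made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 32-bit register accessor: only the low
 * 32 bits of the modelled register are returned.
 */
static uint32_t demo_reg_read32(const uint64_t *reg)
{
	assert(reg != 0);
	return (uint32_t)*reg;
}

/* Hypothetical stand-in for a 64-bit register accessor. */
static uint64_t demo_reg_read64(const uint64_t *reg)
{
	assert(reg != 0);
	return *reg;
}

/* Storing the 32-bit read into a 64-bit field zero-extends it. */
static uint64_t demo_capture_narrow(const uint64_t *reg)
{
	return demo_reg_read32(reg);
}

/* The 64-bit read keeps the whole register value. */
static uint64_t demo_capture_wide(const uint64_t *reg)
{
	return demo_reg_read64(reg);
}
```

With a made-up register value of 0x0001000080004000, the narrow capture
would keep only 0x80004000, while the wide capture returns the value
unchanged.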



Re: [PATCH v2 0/8] Host1x context isolation support

2021-11-08 Thread Mikko Perttunen

On 9/16/21 5:32 PM, Mikko Perttunen wrote:

Hi all,

***
New in v2:

Added support for Tegra194
Use standard iommu-map property instead of custom mechanism
***

this series adds support for Host1x 'context isolation'. Since
userspace can program in any addresses it wants when programming
engines through Host1x, we need some way to isolate the engines'
memory spaces. Traditionally this has either been done imperfectly
with a single shared IOMMU domain, or by copying and verifying the
programming command stream at submit time (Host1x firewall).

Since Tegra186 there is a privileged (only usable by kernel)
Host1x opcode that allows setting the stream ID sent by the engine
to the SMMU. So, by allocating a number of context banks and stream
IDs for this purpose, and using this opcode at the beginning of
each job, we can implement isolation. Due to the limited number of
context banks, each process (rather than each channel) gets its
own context.
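
The per-process allocation described above can be sketched roughly as
follows. This is not the code from the series: every name here
(demo_context, demo_context_pool, the pool size of 4, integer owner IDs)
is a hypothetical illustration of handing out a fixed set of stream IDs,
one context per process, with reference counting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_CONTEXTS 4	/* hypothetical number of context banks */

struct demo_context {
	uint32_t stream_id;	/* stream ID tied to this context */
	int owner;		/* owning process, valid while refcount > 0 */
	unsigned int refcount;	/* users sharing this context */
};

struct demo_context_pool {
	struct demo_context ctx[DEMO_NUM_CONTEXTS];
};

/* Give every context a distinct stream ID and mark it free. */
static void demo_pool_init(struct demo_context_pool *pool, uint32_t first_id)
{
	assert(pool != NULL);
	for (size_t i = 0; i < DEMO_NUM_CONTEXTS; i++) {
		pool->ctx[i].stream_id = first_id + (uint32_t)i;
		pool->ctx[i].owner = 0;
		pool->ctx[i].refcount = 0;
	}
}

/*
 * Reuse the context already held by this owner, otherwise claim the
 * first free one. Returns NULL when all contexts are in use.
 */
static struct demo_context *demo_context_get(struct demo_context_pool *pool,
					     int owner)
{
	struct demo_context *free_ctx = NULL;

	assert(pool != NULL);
	for (size_t i = 0; i < DEMO_NUM_CONTEXTS; i++) {
		struct demo_context *c = &pool->ctx[i];

		if (c->refcount > 0 && c->owner == owner) {
			c->refcount++;
			return c;
		}
		if (c->refcount == 0 && free_ctx == NULL)
			free_ctx = c;
	}
	if (free_ctx != NULL) {
		free_ctx->owner = owner;
		free_ctx->refcount = 1;
	}
	return free_ctx;
}

/* Drop one reference; the context is free again once it reaches zero. */
static void demo_context_put(struct demo_context *c)
{
	assert(c != NULL && c->refcount > 0);
	c->refcount--;
}
```

In this sketch, two channels opened by the same process would share one
context and stream ID, while a second process gets a different one as
long as the pool is not exhausted.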

This feature also allows sharing engines among multiple VMs when
used with Host1x's hardware virtualization support - up to 8 VMs
can be configured with a subset of allowed stream IDs, enforced
at hardware level.

To implement this, this series adds a new host1x context bus, which
will contain the 'struct device's corresponding to each context
bank / stream ID, changes to device tree and SMMU code to allow
registering the devices and using the bus, as well as the Host1x
stream ID programming code and support in TegraDRM.

Device tree bindings are not updated yet pending consensus that the
proposed changes make sense.

Thanks,
Mikko

Mikko Perttunen (8):
   gpu: host1x: Add context bus
   gpu: host1x: Add context device management code
   gpu: host1x: Program context stream ID on submission
   iommu/arm-smmu: Attach to host1x context device bus
   arm64: tegra: Add Host1x context stream IDs on Tegra186+
   drm/tegra: falcon: Set DMACTX field on DMA transactions
   drm/tegra: vic: Implement get_streamid_offset
   drm/tegra: Support context isolation

  arch/arm64/boot/dts/nvidia/tegra186.dtsi  |  12 ++
  arch/arm64/boot/dts/nvidia/tegra194.dtsi  |  12 ++
  drivers/gpu/Makefile  |   3 +-
  drivers/gpu/drm/tegra/drm.h   |   2 +
  drivers/gpu/drm/tegra/falcon.c|   8 +
  drivers/gpu/drm/tegra/falcon.h|   1 +
  drivers/gpu/drm/tegra/submit.c|  13 ++
  drivers/gpu/drm/tegra/uapi.c  |  34 -
  drivers/gpu/drm/tegra/vic.c   |  38 +
  drivers/gpu/host1x/Kconfig|   5 +
  drivers/gpu/host1x/Makefile   |   2 +
  drivers/gpu/host1x/context.c  | 174 ++
  drivers/gpu/host1x/context.h  |  27 
  drivers/gpu/host1x/context_bus.c  |  31 
  drivers/gpu/host1x/dev.c  |  12 +-
  drivers/gpu/host1x/dev.h  |   2 +
  drivers/gpu/host1x/hw/channel_hw.c|  52 ++-
  drivers/gpu/host1x/hw/host1x06_hardware.h |  10 ++
  drivers/gpu/host1x/hw/host1x07_hardware.h |  10 ++
  drivers/iommu/arm/arm-smmu/arm-smmu.c |  13 ++
  include/linux/host1x.h|  21 +++
  include/linux/host1x_context_bus.h|  15 ++
  22 files changed, 488 insertions(+), 9 deletions(-)
  create mode 100644 drivers/gpu/host1x/context.c
  create mode 100644 drivers/gpu/host1x/context.h
  create mode 100644 drivers/gpu/host1x/context_bus.c
  create mode 100644 include/linux/host1x_context_bus.h



IOMMU/DT folks, any thoughts about this approach? The patches that are 
of interest outside of Host1x/TegraDRM specifics are patches 1, 2, 4, and 5.


Thanks,
Mikko


[PATCH v2] memory: mtk-smi: Fix a null dereference for the ostd

2021-11-08 Thread Yong Wu
We added the ostd setting for mt8195. This introduced a kernel exception
(KE) on earlier SoCs that don't have an ostd setting. This is the log:

Unable to handle kernel NULL pointer dereference at virtual address
0080
...
pc : mtk_smi_larb_config_port_gen2_general+0x64/0x130
lr : mtk_smi_larb_resume+0x54/0x98
...
Call trace:
 mtk_smi_larb_config_port_gen2_general+0x64/0x130
 pm_generic_runtime_resume+0x2c/0x48
 __genpd_runtime_resume+0x30/0xa8
 genpd_runtime_resume+0x94/0x2c8
 __rpm_callback+0x44/0x150
 rpm_callback+0x6c/0x78
 rpm_resume+0x310/0x558
 __pm_runtime_resume+0x3c/0x88

In the code "larbostd = larb->larb_gen->ostd[larb->larbid]", if
"larb->larb_gen->ostd" is NULL, "larbostd" ends up holding just the offset
(e.g. 0x80 above). That is a non-NULL value, so it passes as valid, and
accessing "larbostd[i]" in the "for" loop then causes the KE above. To
avoid this issue, initialize "larbostd" to NULL when the SoC doesn't have
an ostd setting.

Signed-off-by: Yong Wu 
---
change note: Reword the commit message to show why the KE occurs, and
update the solution to explicitly initialize "larbostd" to NULL in the
non-ostd case.
---
 drivers/memory/mtk-smi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index b883dcc0bbfa..e201e5976f34 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -241,7 +241,7 @@ static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
 {
 	struct mtk_smi_larb *larb = dev_get_drvdata(dev);
 	u32 reg, flags_general = larb->larb_gen->flags_general;
-	const u8 *larbostd = larb->larb_gen->ostd[larb->larbid];
+	const u8 *larbostd = larb->larb_gen->ostd ? larb->larb_gen->ostd[larb->larbid] : NULL;
 	int i;
 
 	if (BIT(larb->larbid) & larb->larb_gen->larb_direct_to_common_mask)
-- 
2.18.0
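
The guard in the patch above can be illustrated with a small standalone
sketch. The demo_* types and the row length of 32 are assumptions made
for illustration, not the driver's definitions. Under that assumed row
length, indexing a NULL table at row 4 would compute address 4 * 32 =
0x80, which is non-NULL and would therefore slip past a plain NULL check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PORT_NR_MAX 32	/* assumed number of one-byte entries per larb */

/* Hypothetical per-SoC description; ostd may be NULL on older SoCs. */
struct demo_larb_gen {
	const uint8_t (*ostd)[DEMO_PORT_NR_MAX];
};

/*
 * Return the ostd row for larbid, or NULL when the SoC has no table.
 * The table pointer is checked before it is indexed, so a missing
 * table never turns into a small bogus address.
 */
static const uint8_t *demo_larb_ostd(const struct demo_larb_gen *gen,
				     unsigned int larbid)
{
	assert(gen != NULL);
	return gen->ostd ? gen->ostd[larbid] : NULL;
}

/*
 * Count the leading nonzero entries of a row. The loop stops right away
 * when the row pointer is NULL, so callers without a table are safe.
 */
static unsigned int demo_count_ostd_ports(const uint8_t *larbostd)
{
	unsigned int i;

	for (i = 0; i < DEMO_PORT_NR_MAX && larbostd && larbostd[i]; i++)
		;
	return i;
}
```

Calling demo_count_ostd_ports(demo_larb_ostd(gen, id)) on a description
whose ostd is NULL walks zero entries instead of dereferencing a small
offset address.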
