Re: [PATCH v4 01/17] media: dt-binding: mtk-vcodec: Separating mtk-vcodec encode node.

2020-06-10 Thread Alexandre Courbot
On Wed, Jun 10, 2020 at 6:21 AM Rob Herring  wrote:
>
> On Sat, May 30, 2020 at 04:10:02PM +0800, Yong Wu wrote:
> > From: Maoguang Meng 
> >
> > Update binding document since the avc and vp8 hardware encoder in
> > mt8173 are now separated. Separate "mediatek,mt8173-vcodec-enc" to
> > "mediatek,mt8173-vcodec-vp8-enc" and "mediatek,mt8173-vcodec-avc-enc".
>
> The h/w suddenly split in 2? You are breaking compatibility. Up to the
> Mediatek maintainers to decide if that's okay, but you need to state you
> are breaking compatibility (here and in the driver) and why that is
> okay.

In my understanding there is no real hardware using the old bindings
at the moment, and the split is indeed a reflection of the actual
hardware layout. Tiffany, can you give your acked-by if this change is
ok with you?

>
> >
> > This is a preparing patch for smi cleaning up "mediatek,larb".
> >
> > Signed-off-by: Maoguang Meng 
> > Signed-off-by: Hsin-Yi Wang 
> > Signed-off-by: Irui Wang 
> > Signed-off-by: Yong Wu 
> > ---
> >  .../devicetree/bindings/media/mediatek-vcodec.txt  | 58 
> > --
> >  1 file changed, 31 insertions(+), 27 deletions(-)
> >
> > diff --git a/Documentation/devicetree/bindings/media/mediatek-vcodec.txt 
> > b/Documentation/devicetree/bindings/media/mediatek-vcodec.txt
> > index 8093335..1023740 100644
> > --- a/Documentation/devicetree/bindings/media/mediatek-vcodec.txt
> > +++ b/Documentation/devicetree/bindings/media/mediatek-vcodec.txt
> > @@ -4,7 +4,9 @@ Mediatek Video Codec is the video codec hw present in 
> > Mediatek SoCs which
> >  supports high resolution encoding and decoding functionalities.
> >
> >  Required properties:
> > -- compatible : "mediatek,mt8173-vcodec-enc" for MT8173 encoder
> > +- compatible : must be one of the following string:
> > +  "mediatek,mt8173-vcodec-vp8-enc" for mt8173 vp8 encoder.
> > +  "mediatek,mt8173-vcodec-avc-enc" for mt8173 avc encoder.
> >"mediatek,mt8183-vcodec-enc" for MT8183 encoder.
> >"mediatek,mt8173-vcodec-dec" for MT8173 decoder.
> >  - reg : Physical base address of the video codec registers and length of
> > @@ -13,10 +15,11 @@ Required properties:
> >  - mediatek,larb : must contain the local arbiters in the current Socs.
> >  - clocks : list of clock specifiers, corresponding to entries in
> >the clock-names property.
> > -- clock-names: encoder must contain "venc_sel_src", "venc_sel",,
> > -  "venc_lt_sel_src", "venc_lt_sel", decoder must contain "vcodecpll",
> > -  "univpll_d2", "clk_cci400_sel", "vdec_sel", "vdecpll", "vencpll",
> > -  "venc_lt_sel", "vdec_bus_clk_src".
> > +- clock-names:
> > +   avc venc must contain "venc_sel";
> > +   vp8 venc must contain "venc_lt_sel";
> > +   decoder  must contain "vcodecpll", "univpll_d2", "clk_cci400_sel",
> > +   "vdec_sel", "vdecpll", "vencpll", "venc_lt_sel", "vdec_bus_clk_src".
> >  - iommus : should point to the respective IOMMU block with master port as
> >argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
> >for details.
> > @@ -80,14 +83,10 @@ vcodec_dec: vcodec@1600 {
> >  assigned-clock-rates = <0>, <0>, <0>, <148200>, <8>;
> >};
> >
> > -  vcodec_enc: vcodec@18002000 {
> > -compatible = "mediatek,mt8173-vcodec-enc";
> > -reg = <0 0x18002000 0 0x1000>,/*VENC_SYS*/
> > -  <0 0x19002000 0 0x1000>;/*VENC_LT_SYS*/
> > -interrupts = ,
> > -  ;
> > -mediatek,larb = <>,
> > - <>;
> > +vcodec_enc: vcodec@18002000 {
> > +compatible = "mediatek,mt8173-vcodec-avc-enc";
> > +reg = <0 0x18002000 0 0x1000>;
> > +interrupts = ;
> >  iommus = < M4U_PORT_VENC_RCPU>,
> >   < M4U_PORT_VENC_REC>,
> >   < M4U_PORT_VENC_BSDMA>,
> > @@ -98,8 +97,20 @@ vcodec_dec: vcodec@1600 {
> >   < M4U_PORT_VENC_REF_LUMA>,
> >   < M4U_PORT_VENC_REF_CHROMA>,
> >   < M4U_PORT_VENC_NBM_RDMA>,
> > - < M4U_PORT_VENC_NBM_WDMA>,
> > - < M4U_PORT_VENC_RCPU_SET2>,
> > + < M4U_PORT_VENC_NBM_WDMA>;
> > +mediatek,larb = <>;
> > +mediatek,vpu = <>;
> > +clocks = < CLK_TOP_VENC_SEL>;
> > +clock-names = "venc_sel";
> > +assigned-clocks = < CLK_TOP_VENC_SEL>;
> > +assigned-clock-parents = < CLK_TOP_VCODECPLL>;
> > +  };
> > +
> > +vcodec_enc_lt: vcodec@19002000 {
> > +compatible = "mediatek,mt8173-vcodec-vp8-enc";
> > +reg =  <0 0x19002000 0 0x1000>;  /* VENC_LT_SYS */
> > +interrupts = ;
> > +iommus = < M4U_PORT_VENC_RCPU_SET2>,
> >   < M4U_PORT_VENC_REC_FRM_SET2>,
> >   < M4U_PORT_VENC_BSDMA_SET2>,
> >   < M4U_PORT_VENC_SV_COMA_SET2>,
> > @@ -108,17 +119,10 @@ vcodec_dec: vcodec@1600 {
> >   < M4U_PORT_VENC_CUR_CHROMA_SET2>,
> >   < M4U_PORT_VENC_REF_LUMA_SET2>,
> >   < M4U_PORT_VENC_REC_CHROMA_SET2>;
> > +mediatek,larb = <>;
> >  mediatek,vpu = <>;
> > -clocks 

Re: [PATCH v6 8/8] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2015-01-17 Thread Alexandre Courbot

On 01/16/2015 08:18 AM, Laurent Pinchart wrote:

On Thursday 15 January 2015 11:12:17 Will Deacon wrote:

On Thu, Jan 15, 2015 at 08:28:44AM +, Thierry Reding wrote:

On Wed, Jan 14, 2015 at 10:46:10AM +, Will Deacon wrote:

On Wed, Jan 14, 2015 at 09:00:24AM +, Alexandre Courbot wrote:

[...]


2) Say you want to use the IOMMU API in your driver, and have an iommu
property in your device's DT node. If by chance your IOMMU is
registered early, you will already have a mapping automatically
created even before your probe function is called. Can this be
avoided? Is it even safe?


Currently, I think you have to either teardown the ops manually or
return an error from of_xlate. Thierry was also looking at this sort of
thing, so it might be worth talking to him.


I already explained in earlier threads why I think this is a bad idea.
It's completely unnatural for any driver to manually tear down something
that it didn't want set up in the first place. It also means that you
have to carefully audit any users of these IOMMU APIs to make sure that
they do tear down. That doesn't sound like a good incremental approach,
as evidenced by the breakage that Alex and Heiko have encountered.


Well, perhaps we hide that behind a get_iommu API or something. We *do*
need this manual teardown step to support things like VFIO, so it makes
sense to reuse it for other users too imo.


The solution for me has been to completely side-step the issue and not
register the IOMMU with the new mechanism at all. That is, there's no
.of_xlate() implementation, which means that the ARM DMA API glue won't
try to be smart and use the IOMMU in ways it's not meant to be used.


That will break when someone will want to use the same IOMMU type for devices
that use the DMA mapping API to hide the IOMMU. That might not be the case for
your IOMMU today, but it's pretty fragile, we need to fix it.


This has several advantages, such as that I can also use the regular
driver model for suspend/resume of the IOMMU, and I get to enjoy the
benefits of devres in the IOMMU driver. Probe ordering is still a tiny
issue, but we can easily solve that using explicit initcall ordering
(which really isn't any worse than IOMMU_OF_DECLARE()).


That's a pity. I'd much rather extend what we currently have to satisfy
your use-case. Ho-hum.


Assuming we want the IOMMU to be handled transparently for the majority of
devices I only see two ways to fix this,

The first way is to create a default DMA mapping unconditionally and let
drivers that can't live with it tear it down. That's what is implemented
today.


I strongly support Thierry's point that drivers should not have to tear 
down things they don't need. The issue we are facing today is a very 
good illustration of why one should not have to do this.


Everybody hates to receive unsollicited email with a link that says to 
unsubscribe, click here. Let's not import that unpleasant culture into 
the kernel.


I am arriving late in this discussion, but what is wrong with asking 
drivers to explicitly state that they want the DMA API to be backed by 
the IOMMU instead of forcibly making it work that way?




The second way is to implement a mechanism to let drivers signal that they
want to handle DMA mappings themselves. As the mappings need in the general
case to be created before the probe function  is called


Sorry for being ignorant here, but why is that?

 we can't signal this by

calling a function in probe(). A new flag field for struct device_driver is a
possible solution. This would however require delaying the creation of DMA
mappings until right before probe time. Attaching to the IOMMU could be pushed
to right before probe() as well, which would have the added benefit of making
IOMMU driver implementable as real platform drivers.


Keeping the ability to write IOMMU drivers as platform drivers would be 
nice indeed.


The problem with the opt-out flag though is that one will need to check 
every single driver out there to see whether it stills behave correctly 
if its hardware is suddently put behind a IOMMU. Doing it the other way 
(a flag that enables IOMMU if available) sounds safer to me.


What we have right now is a mechanism that basically makes it impossible 
to use the DMA API on many ARM platforms if ARM_DMA_USE_IOMMU is set 
(and I suspect it would also make the IOMMU unusable as well, without 
any way to fix things). This is quite concerning.


Even more concerning is that -rc5 is about to be released and we have 
in-tree drivers (Rockchip DRM) that are not working as they should 
because of this patch. Will, what is your plan to fix this? Do we have 
stuff that absolutely depends on this patch? If not, can we just revert 
it until all these issues are solved?

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 8/8] arm: dma-mapping: plumb our iommu mapping ops into arch_setup_dma_ops

2015-01-14 Thread Alexandre Courbot

On 12/02/2014 01:57 AM, Will Deacon wrote:

This patch plumbs the existing ARM IOMMU DMA infrastructure (which isn't
actually called outside of a few drivers) into arch_setup_dma_ops, so
that we can use IOMMUs for DMA transfers in a more generic fashion.

Since this significantly complicates the arch_setup_dma_ops function,
it is moved out of line into dma-mapping.c. If CONFIG_ARM_DMA_USE_IOMMU
is not set, the iommu parameter is ignored and the normal ops are used
instead.


A series for IOMMU support with Tegra/Nouveau ceased to work after this 
patch. The Tegra IOMMU is not registered by the time the DT is parsed, 
and thus all devices end up without the proper DMA ops set up because 
the phandle to the IOMMU cannot be resolved. Subsequently calling 
arm_iommu_create_mapping() and arm_iommu_attach_device() from the driver 
(as I used to do until 3.18) does not help since the call to 
set_dma_ops() has been moved out of arm_iommu_attach_device(). Therefore 
there seems to be no way for a device to gets its correct DMA ops unless 
the IOMMU is ready by the time the DT is parsed.


Also potentially affected by this are the Rockchip DRM and OMAP3 ISP 
drivers, which follow the same pattern.


This raises the following questions:

1) Why are arm_iommu_create_mapping() and arm_iommu_attach_device() 
still public since they cannot set the DMA ops and thus seem to be 
useless outside of arch_setup_dma_ops()?


2) Say you want to use the IOMMU API in your driver, and have an iommu 
property in your device's DT node. If by chance your IOMMU is registered 
early, you will already have a mapping automatically created even before 
your probe function is called. Can this be avoided? Is it even safe?


The issue is workarounded (at least for me) with the following patch:

--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1961,6 +1961,7 @@ int arm_iommu_attach_device(struct device *dev,

kref_get(mapping-kref);
dev-archdata.mapping = mapping;
+   set_dma_ops(dev, iommu_ops);

pr_debug(Attached IOMMU controller to %s device.\n, 
dev_name(dev));

return 0;

This allows arm_iommu_create_mapping() and arm_iommu_attach_device() to 
set the correct DMA ops when called from the driver. But it doesn't look 
like mergeable material and I'd like to know whether there is a better 
way to handle this, or whether I should just use the IOMMU API directly. 
Knowing that even this might not be safe if ARM_DMA_USE_IOMMU is enabled 
because of point 2) above.


So basically I'm afraid I might not even be able to use the IOMMU safely 
after this. Or am I missing anything?


Thanks,
Alex.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] memory: Add NVIDIA SMMU suspend/resume support

2014-12-12 Thread Alexandre Courbot
Hi Mark,

On Mon, Dec 8, 2014 at 3:20 PM, Mark Zhang ma...@nvidia.com wrote:
 This patch adds suspend/resume support for NVIDIA SMMU.


 This patch is created on top of Thierry Reding's patch set:

 [PATCH v7 00/12] NVIDIA Tegra memory controller and IOMMU support

You should have this comment under the --- as we don't need it to
persist once this patch is merged.


 Signed-off-by: Mark Zhang ma...@nvidia.com
 ---
  drivers/iommu/tegra-smmu.c | 79 
 +-
  drivers/memory/tegra/mc.c  | 21 
  drivers/memory/tegra/mc.h  |  4 +++
  3 files changed, 103 insertions(+), 1 deletion(-)

 diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
 index 0909e0bae9ec..ab38805055a4 100644
 --- a/drivers/iommu/tegra-smmu.c
 +++ b/drivers/iommu/tegra-smmu.c
 @@ -17,6 +17,8 @@
  #include soc/tegra/ahb.h
  #include soc/tegra/mc.h

 +struct tegra_smmu_as;
 +
  struct tegra_smmu {
 void __iomem *regs;
 struct device *dev;
 @@ -25,9 +27,10 @@ struct tegra_smmu {
 const struct tegra_smmu_soc *soc;

 unsigned long *asids;
 +   struct tegra_smmu_as **as;
 struct mutex lock;

 -   struct list_head list;
 +   struct list_head swgroup_asid_list;
  };

  struct tegra_smmu_as {
 @@ -40,6 +43,12 @@ struct tegra_smmu_as {
 u32 attr;
  };

 +struct tegra_smmu_swgroup_asid {
 +   struct list_head list;
 +   unsigned swgroup_id;
 +   unsigned asid;
 +};
 +
  static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
unsigned long offset)
  {
 @@ -376,6 +385,7 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 as-smmu = smmu;
 as-use_count++;

 +   smmu-as[as-id] = as;
 return 0;
  }

 @@ -386,6 +396,7 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu 
 *smmu,
 return;

 tegra_smmu_free_asid(smmu, as-id);
 +   smmu-as[as-id] = NULL;
 as-smmu = NULL;
  }

 @@ -398,6 +409,7 @@ static int tegra_smmu_attach_dev(struct iommu_domain 
 *domain,
 struct of_phandle_args args;
 unsigned int index = 0;
 int err = 0;
 +   struct tegra_smmu_swgroup_asid *sa = NULL;

This initialization is unneeded. Actually this declaration would
probably be better placed in the while() loop below since its usage is
local to it.


 while (!of_parse_phandle_with_args(np, iommus, #iommu-cells, 
 index,
args)) {
 @@ -411,6 +423,14 @@ static int tegra_smmu_attach_dev(struct iommu_domain 
 *domain,
 return err;

 tegra_smmu_enable(smmu, swgroup, as-id);
 +
 +   sa = kzalloc(sizeof(struct tegra_smmu_swgroup_asid),
 +   GFP_KERNEL);
 +   INIT_LIST_HEAD(sa-list);

You don't need to call INIT_LIST_HEAD on this, list_add_tail() will
effectively overwrite any initialization done by this macro (see
include/linux/list.h).

 +   sa-swgroup_id = swgroup;
 +   sa-asid = as-id;
 +   list_add_tail(sa-list, smmu-swgroup_asid_list);
 +
 index++;
 }

 @@ -427,6 +447,7 @@ static void tegra_smmu_detach_dev(struct iommu_domain 
 *domain, struct device *de
 struct tegra_smmu *smmu = as-smmu;
 struct of_phandle_args args;
 unsigned int index = 0;
 +   struct tegra_smmu_swgroup_asid *sa = NULL;

Same here, move the declaration into the while() loop and remove the
initialization.


 while (of_parse_phandle_with_args(np, iommus, #iommu-cells, index,
   args)) {
 @@ -435,6 +456,13 @@ static void tegra_smmu_detach_dev(struct iommu_domain 
 *domain, struct device *de
 if (args.np != smmu-dev-of_node)
 continue;

 +   list_for_each_entry(sa, smmu-swgroup_asid_list, list) {
 +   if (sa-asid == as-id  sa-swgroup_id == swgroup)
 +   break;
 +   }
 +   list_del(sa-list);
 +   kfree(sa);
 +
 tegra_smmu_disable(smmu, swgroup, as-id);
 tegra_smmu_as_unprepare(smmu, as);
 index++;
 @@ -651,6 +679,48 @@ static void tegra_smmu_ahb_enable(void)
 }
  }

 +void tegra_smmu_resume(struct tegra_smmu *smmu)
 +{
 +   struct tegra_smmu_as *as = NULL;
 +   unsigned int bit;
 +   u32 value;
 +   struct tegra_smmu_swgroup_asid *sa = NULL;

Again no need to initialize to NULL here.

 +
 +   for_each_set_bit(bit, smmu-asids, smmu-soc-num_asids) {
 +   as = smmu-as[bit];
 +   smmu-soc-ops-flush_dcache(as-pd, 0, SMMU_SIZE_PD);
 +
 +   smmu_writel(smmu, as-id  0x7f, SMMU_PTB_ASID);
 +   value = SMMU_PTB_DATA_VALUE(as-pd, as-attr);
 +   smmu_writel(smmu, value, SMMU_PTB_DATA);
 +   }
 +
 +   

Re: [PATCH v6 00/12] NVIDIA Tegra memory controller and IOMMU support

2014-11-10 Thread Alexandre Courbot

On 11/08/2014 01:00 AM, Thierry Reding wrote:

From: Thierry Reding tred...@nvidia.com

This is the sixth installment in the Tegra IOMMU and memory controller
support series. This version addresses the final outstanding comments from
Olof about using proper Kconfig symbols to track the dependencies. It also
splits up the driver into one part that implements the memory controller
only and a second part that implements the SMMU. This plays nicely with
the new Kconfig options introduced.

Patch 1 is a preparatory patch that exposes the memory controller clock.

Patches 2 and 3 is a pair of precursory patches needed to make this all
work on 64-bit ARM in the future.

The device tree binding for the Tegra memory controller is added in patch
4 and patch 5 is the bulk of the series that move the existing memory
controller and IOMMU drivers into the new unified driver that supports
Tegra30, Tegra114 and Tegra124.

Patches 6, 7 and 8 add the DT nodes for the memory controller/IOMMU on
Tegra30, Tegra114 and Tegra124.

IOMMU support is enabled for the display controllers in patches 9, 10 and
11. This will allow the display controllers to have their memory accesses
translated by the SMMU, which will enable non-contiguous buffers to be
used for scan-out.

Finally patch 12 also adds support for Tegra132. It is kept separate because
none of the other Tegra132 patches have been merged yet, but I've included
it here for completeness.

Because the patches are rather intertwined, I'd like to merge them all via
the Tegra tree. For that I'll need Acked-bys from Mike, Russell and Joerg
on patches 1, 2 and 3, and 5, respectively.


FWIW,

Tested-by: Alexandre Courbot acour...@nvidia.com

Works nicely with both the display and GPU clients, which allows us to 
remove the need for CMA on Tegra.


Thanks,
Alex.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/12] memory: Add NVIDIA Tegra memory controller support

2014-11-03 Thread Alexandre Courbot
On Mon, Nov 3, 2014 at 5:22 PM, Thierry Reding thierry.red...@gmail.com wrote:
 On Sat, Nov 01, 2014 at 02:38:26PM +0900, Alexandre Courbot wrote:
 On Fri, Oct 31, 2014 at 10:27 PM, Thierry Reding
 thierry.red...@gmail.com wrote:
  On Thu, Oct 30, 2014 at 04:08:41PM +0100, Thierry Reding wrote:
  On Wed, Oct 15, 2014 at 03:09:30PM -0700, Olof Johansson wrote:
   Hi,
  
   Oh, a few more comments:
  
   On Mon, Oct 13, 2014 at 3:33 AM, Thierry Reding
   thierry.red...@gmail.com wrote:
  
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index c32d31981be3..1c932e7e7b8d 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -12,4 +12,5 @@ obj-$(CONFIG_FSL_CORENET_CF)  += fsl-corenet-cf.o
 obj-$(CONFIG_FSL_IFC)  += fsl_ifc.o
 obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
 obj-$(CONFIG_TEGRA20_MC)   += tegra20-mc.o
-obj-$(CONFIG_TEGRA30_MC)   += tegra30-mc.o
+
+obj-$(CONFIG_ARCH_TEGRA)   += tegra/
diff --git a/drivers/memory/tegra/Makefile 
b/drivers/memory/tegra/Makefile
new file mode 100644
index ..51b9e8fcde1b
--- /dev/null
+++ b/drivers/memory/tegra/Makefile
@@ -0,0 +1,5 @@
+obj-y   = tegra-mc.o
+obj-$(CONFIG_ARCH_TEGRA_3x_SOC)+= tegra30-mc.o
+obj-$(CONFIG_ARCH_TEGRA_114_SOC)   += tegra114-mc.o
+obj-$(CONFIG_ARCH_TEGRA_124_SOC)   += tegra124-mc.o
+obj-$(CONFIG_ARCH_TEGRA_132_SOC)   += tegra124-mc.o
  
   You'll need a Kconfig and not just a makefile -- there are definitely
   dependencies on this driver (IOMMU in particular).
 
  This is handled within the tegra-mc driver by only setting up the IOMMU
  when TEGRA_IOMMU_SMMU is enabled. That config option remains in place.
 
   Also, the problem of having a global enable bit that is only under
   control of TrustZone FW is a big problem -- if the bit is not set, the
   driver will not work (and the machine will crash).
  
   I think you'll need to come up with a way to detect that in the
   driver. I don't have a good idea of how it can be done though.
 
  I don't think I ever got back to you on this. We discussed this
  internally and it seems like there's no way to detect this properly, so
  the best suggestion so far was to make it a requirement on the secure
  firmware to enable IOMMU or not. Since there's no way for the kernel to
  detect whether IOMMU was enabled or not, I think the firmware would
  equally have to adjust the SMMU's device tree node's status property
  appropriately.
 
  The other option would be for the firmware not to touch the SMMU device
  tree node and the kernel simply assuming that if it's running in non-
  secure mode then there must be secure firmware and it has enabled the
  SMMU. Enabling the SMMU would become part of the contract between
  firmware and kernel, much like locking the VPR is required to get the
  GPU to work.
 
  Those are really the only two choices we have.

 We got the exact same problem with GPU and VPR registers, and it seems
 like the approach we will be taking here is to have the
 firmware/bootloader do whatever is needed to get the GPU working and
 enable the DT node once it did. IOW, the kernel will never touch
 protected registers, and will not freeze if the hardware is not
 properly set up.

 I think the situation is slightly different. For the SMMU we still have
 the option to enable translations when running in secure mode with code
 that's pretty trivial to have in the IOMMU driver (it's just a single
 bit that gets written to a register). And when the IOMMU driver does
 that everything will work just fine.

 So I'm thinking that a workable alternative to what we've done for VPR
 would be to just always enable translations in the SMMU driver (that
 operation will simply be discarded in non-secure mode) and assume that
 it'll work. So the contract would be that running in secure mode the
 kernel sets everything up and when running in non-secure mode the kernel
 will assume that firmware set everything up already.

 Requiring firmware to change the device node's status to okay seems
 rather restrictive since it would have to do that even if it boots the
 kernel in secure mode.

Sounds good to me, as long as the kernel can always run using the same
code in both modes.


 It would be nice for consistency if the same approach can be taken
 with the IOMMU. OTOH we will need to make sure that all these
 initialization contracts are clearly documented somewhere. Maybe a
 comment in the DTS to explain what is expected from the firmware to
 enable such nodes would be a good idea, too.

 The device tree binding seems like a good place to document this.

Indeed.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/12] memory: Add NVIDIA Tegra memory controller support

2014-11-02 Thread Alexandre Courbot
On Fri, Oct 31, 2014 at 10:27 PM, Thierry Reding
thierry.red...@gmail.com wrote:
 On Thu, Oct 30, 2014 at 04:08:41PM +0100, Thierry Reding wrote:
 On Wed, Oct 15, 2014 at 03:09:30PM -0700, Olof Johansson wrote:
  Hi,
 
  Oh, a few more comments:
 
  On Mon, Oct 13, 2014 at 3:33 AM, Thierry Reding
  thierry.red...@gmail.com wrote:
 
   diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
   index c32d31981be3..1c932e7e7b8d 100644
   --- a/drivers/memory/Makefile
   +++ b/drivers/memory/Makefile
   @@ -12,4 +12,5 @@ obj-$(CONFIG_FSL_CORENET_CF)  += fsl-corenet-cf.o
obj-$(CONFIG_FSL_IFC)  += fsl_ifc.o
obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
obj-$(CONFIG_TEGRA20_MC)   += tegra20-mc.o
   -obj-$(CONFIG_TEGRA30_MC)   += tegra30-mc.o
   +
   +obj-$(CONFIG_ARCH_TEGRA)   += tegra/
   diff --git a/drivers/memory/tegra/Makefile 
   b/drivers/memory/tegra/Makefile
   new file mode 100644
   index ..51b9e8fcde1b
   --- /dev/null
   +++ b/drivers/memory/tegra/Makefile
   @@ -0,0 +1,5 @@
   +obj-y   = tegra-mc.o
   +obj-$(CONFIG_ARCH_TEGRA_3x_SOC)+= tegra30-mc.o
   +obj-$(CONFIG_ARCH_TEGRA_114_SOC)   += tegra114-mc.o
   +obj-$(CONFIG_ARCH_TEGRA_124_SOC)   += tegra124-mc.o
   +obj-$(CONFIG_ARCH_TEGRA_132_SOC)   += tegra124-mc.o
 
  You'll need a Kconfig and not just a makefile -- there are definitely
  dependencies on this driver (IOMMU in particular).

 This is handled within the tegra-mc driver by only setting up the IOMMU
 when TEGRA_IOMMU_SMMU is enabled. That config option remains in place.

  Also, the problem of having a global enable bit that is only under
  control of TrustZone FW is a big problem -- if the bit is not set, the
  driver will not work (and the machine will crash).
 
  I think you'll need to come up with a way to detect that in the
  driver. I don't have a good idea of how it can be done though.

 I don't think I ever got back to you on this. We discussed this
 internally and it seems like there's no way to detect this properly, so
 the best suggestion so far was to make it a requirement on the secure
 firmware to enable IOMMU or not. Since there's no way for the kernel to
 detect whether IOMMU was enabled or not, I think the firmware would
 equally have to adjust the SMMU's device tree node's status property
 appropriately.

 The other option would be for the firmware not to touch the SMMU device
 tree node and the kernel simply assuming that if it's running in non-
 secure mode then there must be secure firmware and it has enabled the
 SMMU. Enabling the SMMU would become part of the contract between
 firmware and kernel, much like locking the VPR is required to get the
 GPU to work.

 Those are really the only two choices we have.

We got the exact same problem with GPU and VPR registers, and it seems
like the approach we will be taking here is to have the
firmware/bootloader do whatever is needed to get the GPU working and
enable the DT node once it did. IOW, the kernel will never touch
protected registers, and will not freeze if the hardware is not
properly set up.

It would be nice for consistency if the same approach can be taken
with the IOMMU. OTOH we will need to make sure that all these
initialization contracts are clearly documented somewhere. Maybe a
comment in the DTS to explain what is expected from the firmware to
enable such nodes would be a good idea, too.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/12] memory: Add NVIDIA Tegra memory controller support

2014-10-30 Thread Alexandre Courbot

diff --git a/drivers/memory/tegra/tegra124-mc.c 
b/drivers/memory/tegra/tegra124-mc.c

...

+static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
+   { .swgroup = TEGRA_SWGROUP_DC,.reg = 0x240 },
+   { .swgroup = TEGRA_SWGROUP_DCB,   .reg = 0x244 },
+   { .swgroup = TEGRA_SWGROUP_AFI,   .reg = 0x238 },
+   { .swgroup = TEGRA_SWGROUP_AVPC,  .reg = 0x23c },
+   { .swgroup = TEGRA_SWGROUP_HDA,   .reg = 0x254 },
+   { .swgroup = TEGRA_SWGROUP_HC,.reg = 0x250 },
+   { .swgroup = TEGRA_SWGROUP_MSENC, .reg = 0x264 },
+   { .swgroup = TEGRA_SWGROUP_PPCS,  .reg = 0x270 },
+   { .swgroup = TEGRA_SWGROUP_SATA,  .reg = 0x274 },
+   { .swgroup = TEGRA_SWGROUP_VDE,   .reg = 0x27c },
+   { .swgroup = TEGRA_SWGROUP_ISP2,  .reg = 0x258 },
+   { .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
+   { .swgroup = TEGRA_SWGROUP_XUSB_DEV,  .reg = 0x28c },
+   { .swgroup = TEGRA_SWGROUP_ISP2B, .reg = 0xaa4 },
+   { .swgroup = TEGRA_SWGROUP_TSEC,  .reg = 0x294 },
+   { .swgroup = TEGRA_SWGROUP_A9AVP, .reg = 0x290 },
+   { .swgroup = TEGRA_SWGROUP_GPU,   .reg = 0xaa8 },


I had to change the .reg of TEGRA_SWGROUP_GPU to 0xaac to get the IOMMU 
to work with GK20A. The reason is still not completely clear to me, but 
if you look at the TRM you see that 0xaa8 is basically constant, with 
the SMMU translation bit hardcoded to DISABLE (and the ASID field being 
meaningless in that case). However right after that register you have a 
functional one named GPUB instead of GPU, and this one is fully 
writeable (and has the expected effect).


I will try to get more information about the why of this, but for now 
this setting is what works for me.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/12] memory: Add NVIDIA Tegra memory controller support

2014-10-30 Thread Alexandre Courbot

On 10/30/2014 08:04 PM, Terje Bergström wrote:

On 30.10.2014 12:22, Alexandre Courbot wrote:

So should I understand that the GPU group is for addresses without bit
34 set (hence forcibly disabled) while GPUB is used when that bit is
set? Or is it something else?


That's exactly correct. And only GPUB can be programmed to be SMMU
translated.


Great, thanks for confirming!

Thierry, how do you want to address this? We could change the register 
for the GPU group, or (maybe preferable if we want to reflect the actual 
hardware state) add the GPUB group. I don't know if that would be easy 
though since we would have the problem of the gpusrd and gpuswr clients 
ownership (seems like they would belong to both groups?)

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 05/12] memory: Add NVIDIA Tegra memory controller support

2014-10-30 Thread Alexandre Courbot
On Thu, Oct 30, 2014 at 7:18 PM, Terje Bergström tbergst...@nvidia.com 
wrote:

 On 30.10.2014 12:03, Alexandre Courbot wrote:
 I had to change the .reg of TEGRA_SWGROUP_GPU to 0xaac to get the IOMMU
 to work with GK20A. The reason is still not completely clear to me, but
 if you look at the TRM you see that 0xaa8 is basically constant, with
 the SMMU translation bit hardcoded to DISABLE (and the ASID field being
 meaningless in that case). However right after that register you have a
 functional one named GPUB instead of GPU, and this one is fully
 writeable (and has the expected effect).

 GPU has two SW group IDs, because it accesses memory both with and
 without translation. The bit 34 in addresses in f.ex. PTE chooses
 between the two.

Indeed, to enable SMMU translation I have to program 0xaac correctly 
*and* set the bit 34 of every address that needs to go through the SMMU.


 GPU is hard-wired to disable translation. For GPUB translation can be
 enabled.

So should I understand that the GPU group is for addresses without bit 
34 set (hence forcibly disabled) while GPUB is used when that bit is 
set? Or is it something else?

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu