Re: [PATCH v2 0/5] Support for Adreno X1-85 GPU

2024-07-02 Thread Will Deacon
On Sat, 29 Jun 2024 07:19:33 +0530, Akhil P Oommen wrote:
> This series adds support for the Adreno X1-85 GPU found in Qualcomm's
> compute series chipset, Snapdragon X1 Elite (x1e80100). In this new
> naming scheme for Adreno GPU, 'X' stands for compute series, '1' denotes
> 1st generation and '8' & '5' denotes the tier and the SKU which it
> belongs.
> 
> X1-85 has major focus on doubling core clock frequency and bandwidth
> throughput. It has a dedicated collapsible Graphics MX rail (gmxc) to
> power the memories and double the number of data channels to improve
> bandwidth to DDR.
> 
> [...]

Applied SMMU bindings change to will (for-joerg/arm-smmu/bindings),
thanks!

[4/5] dt-bindings: arm-smmu: Add X1E80100 GPU SMMU
  https://git.kernel.org/will/c/d6c102881b30

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [PATCH v4 1/2] iommu/io-pgtable-arm: Add way to debug pgtable walk

2024-06-25 Thread Will Deacon
On Mon, Jun 24, 2024 at 08:37:26AM -0700, Rob Clark wrote:
> On Mon, Jun 24, 2024 at 8:14 AM Will Deacon  wrote:
> >
> > On Thu, May 23, 2024 at 10:52:21AM -0700, Rob Clark wrote:
> > > From: Rob Clark 
> > >
> > > Add an io-pgtable method to walk the pgtable returning the raw PTEs that
> > > would be traversed for a given iova access.
> > >
> > > Signed-off-by: Rob Clark 
> > > ---
> > >  drivers/iommu/io-pgtable-arm.c | 51 --
> > >  include/linux/io-pgtable.h |  4 +++
> > >  2 files changed, 46 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/iommu/io-pgtable-arm.c 
> > > b/drivers/iommu/io-pgtable-arm.c
> > > index f7828a7aad41..f47a0e64bb35 100644
> > > --- a/drivers/iommu/io-pgtable-arm.c
> > > +++ b/drivers/iommu/io-pgtable-arm.c
> > > @@ -693,17 +693,19 @@ static size_t arm_lpae_unmap_pages(struct 
> > > io_pgtable_ops *ops, unsigned long iov
> > >   data->start_level, ptep);
> > >  }
> > >
> > > -static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> > > -  unsigned long iova)
> > > +static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, unsigned 
> > > long iova,
> > > + int (*cb)(void *cb_data, void *pte, int level),
> > > + void *cb_data)
> > >  {
> > >   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> > >   arm_lpae_iopte pte, *ptep = data->pgd;
> > >   int lvl = data->start_level;
> > > + int ret;
> > >
> > >   do {
> > >   /* Valid IOPTE pointer? */
> > >   if (!ptep)
> > > - return 0;
> > > + return -EFAULT;
> >
> > nit: -ENOENT might be a little better, as we're only checking against a
> > NULL entry rather than strictly any faulting entry.
> >
> > >   /* Grab the IOPTE we're interested in */
> > >   ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
> > > @@ -711,22 +713,52 @@ static phys_addr_t arm_lpae_iova_to_phys(struct 
> > > io_pgtable_ops *ops,
> > >
> > >   /* Valid entry? */
> > >   if (!pte)
> > > - return 0;
> > > + return -EFAULT;
> >
> > Same here (and at the end of the function).
> >
> > > +
> > > + ret = cb(cb_data, , lvl);
> >
> > Since pte is on the stack, rather than pointing into the actual pgtable,
> > I think it would be clearer to pass it by value to the callback.
> 
> fwiw, I passed it as a void* to avoid the pte size.. although I guess
> it could be a union of all the possible pte types

Can you just get away with a u64?

Will


Re: [PATCH v4 1/2] iommu/io-pgtable-arm: Add way to debug pgtable walk

2024-06-24 Thread Will Deacon
On Thu, May 23, 2024 at 10:52:21AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> Add an io-pgtable method to walk the pgtable returning the raw PTEs that
> would be traversed for a given iova access.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/io-pgtable-arm.c | 51 --
>  include/linux/io-pgtable.h |  4 +++
>  2 files changed, 46 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index f7828a7aad41..f47a0e64bb35 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -693,17 +693,19 @@ static size_t arm_lpae_unmap_pages(struct 
> io_pgtable_ops *ops, unsigned long iov
>   data->start_level, ptep);
>  }
>  
> -static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> -  unsigned long iova)
> +static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, unsigned long 
> iova,
> + int (*cb)(void *cb_data, void *pte, int level),
> + void *cb_data)
>  {
>   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>   arm_lpae_iopte pte, *ptep = data->pgd;
>   int lvl = data->start_level;
> + int ret;
>  
>   do {
>   /* Valid IOPTE pointer? */
>   if (!ptep)
> - return 0;
> + return -EFAULT;

nit: -ENOENT might be a little better, as we're only checking against a
NULL entry rather than strictly any faulting entry.

>   /* Grab the IOPTE we're interested in */
>   ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
> @@ -711,22 +713,52 @@ static phys_addr_t arm_lpae_iova_to_phys(struct 
> io_pgtable_ops *ops,
>  
>   /* Valid entry? */
>   if (!pte)
> - return 0;
> + return -EFAULT;

Same here (and at the end of the function).

> +
> + ret = cb(cb_data, , lvl);

Since pte is on the stack, rather than pointing into the actual pgtable,
I think it would be clearer to pass it by value to the callback.

> + if (ret)
> + return ret;
>  
> - /* Leaf entry? */
> + /* Leaf entry?  If so, we've found the translation */
>   if (iopte_leaf(pte, lvl, data->iop.fmt))
> - goto found_translation;
> + return 0;
>  
>   /* Take it to the next level */
>   ptep = iopte_deref(pte, data);
>   } while (++lvl < ARM_LPAE_MAX_LEVELS);
>  
>   /* Ran out of page tables to walk */
> + return -EFAULT;
> +}
> +
> +struct iova_to_phys_walk_data {
> + arm_lpae_iopte pte;
> + int level;
> +};

Expanding a little on Robin's suggestion, why don't we drop this structure
in favour of something more generic:

struct arm_lpae_walk_data {
arm_lpae_iopte ptes[ARM_LPAE_MAX_LEVELS];
};

and then do something in the walker like:

if (cb && !cb(pte, lvl))
walk_data->ptes[lvl] = pte;

which could return the physical address at the end, if it reaches a leaf
entry. That way arm_lpae_iova_to_phys() is just passing a NULL callback
to the walker and your debug callback just needs to return 0 (i.e. the
callback is basically just saying whether or not to continue the walk).

Will


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-20 Thread Will Deacon
On Tue, Jun 18, 2024 at 09:41:58PM +0530, Akhil P Oommen wrote:
> On Tue, Jun 04, 2024 at 03:40:56PM +0100, Will Deacon wrote:
> > On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote:
> > > On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote:
> > > > On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> > > > > If I understand correctly, you don't need any memory barrier.
> > > > > writel()/readl()'s are ordered to the same endpoint. That goes for all
> > > > > the reordering/barrier comments mentioned below too.
> > > > > 
> > > > > device-io.rst:
> > > > > 
> > > > > The read and write functions are defined to be ordered. That is 
> > > > > the
> > > > > compiler is not permitted to reorder the I/O sequence. When the 
> > > > > ordering
> > > > > can be compiler optimised, you can use __readb() and friends to
> > > > > indicate the relaxed ordering. Use this with care.
> > > > > 
> > > > > memory-barriers.txt:
> > > > > 
> > > > >  (*) readX(), writeX():
> > > > > 
> > > > >   The readX() and writeX() MMIO accessors take a pointer to 
> > > > > the
> > > > >   peripheral being accessed as an __iomem * parameter. For 
> > > > > pointers
> > > > >   mapped with the default I/O attributes (e.g. those returned 
> > > > > by
> > > > >   ioremap()), the ordering guarantees are as follows:
> > > > > 
> > > > >   1. All readX() and writeX() accesses to the same peripheral 
> > > > > are ordered
> > > > >  with respect to each other. This ensures that MMIO 
> > > > > register accesses
> > > > >  by the same CPU thread to a particular device will 
> > > > > arrive in program
> > > > >  order.
> > > > > 
> > > > 
> > > > In arm64, a writel followed by readl translates to roughly the following
> > > > sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not
> > > > sure what is stopping compiler from reordering  __raw_writel() and 
> > > > __raw_readl()
> > > > above? I am assuming iomem cookie is ignored during compilation.
> > > 
> > > It seems to me that is due to some usage of volatile there in
> > > __raw_writel() etc, but to be honest after reading about volatile and
> > > some threads from gcc mailing lists, I don't have a confident answer :)
> > > 
> > > > 
> > > > Added Will to this thread if he can throw some light on this.
> > > 
> > > Hopefully Will can school us.
> > 
> > The ordering in this case is ensured by the memory attributes used for
> > ioremap(). When an MMIO region is mapped using Device-nGnRE attributes
> > (as it the case for ioremap()), the "nR" part means "no reordering", so
> > readX() and writeX() to that region are ordered wrt each other.
> 
> But that avoids only HW reordering, doesn't it? What about *compiler 
> reordering* in the
> case of a writel following by a readl which translates to:
>   1: dmb_wmb()
>   2: __raw_writel() -> roughly "asm volatile('str')
>   3: __raw_readl() -> roughly "asm volatile('ldr')
>   4: dmb_rmb()
> 
> Is the 'volatile' keyword sufficient to avoid reordering between (2) and (3)? 
> Or
> do we need a "memory" clobber to inhibit reordering?
> 
> This is still not clear to me even after going through some compiler 
> documentions.

I don't think the compiler should reorder volatile asm blocks wrt each
other.

Will


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-18 Thread Will Deacon
On Thu, Jun 06, 2024 at 02:03:24PM +0200, Konrad Dybcio wrote:
> On 4.06.2024 4:40 PM, Will Deacon wrote:
> > On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote:
> >> On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote:
> >>> On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> >>>> If I understand correctly, you don't need any memory barrier.
> >>>> writel()/readl()'s are ordered to the same endpoint. That goes for all
> >>>> the reordering/barrier comments mentioned below too.
> >>>>
> >>>> device-io.rst:
> >>>>
> >>>> The read and write functions are defined to be ordered. That is the
> >>>> compiler is not permitted to reorder the I/O sequence. When the 
> >>>> ordering
> >>>> can be compiler optimised, you can use __readb() and friends to
> >>>> indicate the relaxed ordering. Use this with care.
> >>>>
> >>>> memory-barriers.txt:
> >>>>
> >>>>  (*) readX(), writeX():
> >>>>
> >>>>  The readX() and writeX() MMIO accessors take a pointer to the
> >>>>  peripheral being accessed as an __iomem * parameter. For pointers
> >>>>  mapped with the default I/O attributes (e.g. those returned by
> >>>>  ioremap()), the ordering guarantees are as follows:
> >>>>
> >>>>  1. All readX() and writeX() accesses to the same peripheral are 
> >>>> ordered
> >>>> with respect to each other. This ensures that MMIO register 
> >>>> accesses
> >>>> by the same CPU thread to a particular device will arrive in 
> >>>> program
> >>>> order.
> >>>>
> >>>
> >>> In arm64, a writel followed by readl translates to roughly the following
> >>> sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not
> >>> sure what is stopping compiler from reordering  __raw_writel() and 
> >>> __raw_readl()
> >>> above? I am assuming iomem cookie is ignored during compilation.
> >>
> >> It seems to me that is due to some usage of volatile there in
> >> __raw_writel() etc, but to be honest after reading about volatile and
> >> some threads from gcc mailing lists, I don't have a confident answer :)
> >>
> >>>
> >>> Added Will to this thread if he can throw some light on this.
> >>
> >> Hopefully Will can school us.
> > 
> > The ordering in this case is ensured by the memory attributes used for
> > ioremap(). When an MMIO region is mapped using Device-nGnRE attributes
> > (as it the case for ioremap()), the "nR" part means "no reordering", so
> > readX() and writeX() to that region are ordered wrt each other.
> > 
> > Note that guarantee _doesn't_ apply to other flavours of ioremap(), so
> > e.g. ioremap_wc() won't give you the ordering.
> > 
> > Hope that helps,
> 
> Just to make sure I'm following, would mapping things as nGnRnE effectively
> get rid of write buffering, perhaps being a way of debugging whether that
> in particular is causing issues (at the cost of speed)?

I think the "nE" part is just a hint, so it will depend on how the
hardware has been built. On top of that, you'll still need something
like a DSB to force the CPU to wait for the write response.

Will


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-04 Thread Will Deacon
On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote:
> On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote:
> > On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> > > If I understand correctly, you don't need any memory barrier.
> > > writel()/readl()'s are ordered to the same endpoint. That goes for all
> > > the reordering/barrier comments mentioned below too.
> > > 
> > > device-io.rst:
> > > 
> > > The read and write functions are defined to be ordered. That is the
> > > compiler is not permitted to reorder the I/O sequence. When the 
> > > ordering
> > > can be compiler optimised, you can use __readb() and friends to
> > > indicate the relaxed ordering. Use this with care.
> > > 
> > > memory-barriers.txt:
> > > 
> > >  (*) readX(), writeX():
> > > 
> > >   The readX() and writeX() MMIO accessors take a pointer to the
> > >   peripheral being accessed as an __iomem * parameter. For pointers
> > >   mapped with the default I/O attributes (e.g. those returned by
> > >   ioremap()), the ordering guarantees are as follows:
> > > 
> > >   1. All readX() and writeX() accesses to the same peripheral are 
> > > ordered
> > >  with respect to each other. This ensures that MMIO register 
> > > accesses
> > >  by the same CPU thread to a particular device will arrive in 
> > > program
> > >  order.
> > > 
> > 
> > In arm64, a writel followed by readl translates to roughly the following
> > sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not
> > sure what is stopping compiler from reordering  __raw_writel() and 
> > __raw_readl()
> > above? I am assuming iomem cookie is ignored during compilation.
> 
> It seems to me that is due to some usage of volatile there in
> __raw_writel() etc, but to be honest after reading about volatile and
> some threads from gcc mailing lists, I don't have a confident answer :)
> 
> > 
> > Added Will to this thread if he can throw some light on this.
> 
> Hopefully Will can school us.

The ordering in this case is ensured by the memory attributes used for
ioremap(). When an MMIO region is mapped using Device-nGnRE attributes
(as it the case for ioremap()), the "nR" part means "no reordering", so
readX() and writeX() to that region are ordered wrt each other.

Note that guarantee _doesn't_ apply to other flavours of ioremap(), so
e.g. ioremap_wc() won't give you the ordering.

Hope that helps,

Will


Re: [PATCH v2 0/7] A702 support

2024-02-27 Thread Will Deacon
On Fri, Feb 23, 2024 at 10:21:36PM +0100, Konrad Dybcio wrote:
> Bit of a megaseries, bunched together for your testing convenience..
> Needs mesa!27665 [1] on the userland part, kmscube happily spins.
> 
> I'm feeling quite lukewarm about the memory barriers in patch 3..
> 
> Patch 1 for Will/smmu, 5-6 for drm/msm, rest for qcom

I'm guessing you don't really expect me to take the clock bindings?!

Will


Re: [PATCH v3 0/7] drm/msm: Add support for the A750 GPU found on the SM8650 platform

2024-02-22 Thread Will Deacon
On Fri, 16 Feb 2024 12:03:47 +0100, Neil Armstrong wrote:
> Unlike the the very close A740 GPU on the SM8550 SoC, the A750 GPU
> doesn't have an HWCFG block but a separate register set.
> 
> The missing registers are added in the a6xx.xml.h file that would
> require a subsequent sync and the non-existent hwcfg is handled
> in a6xx_set_hwcg().
> 
> [...]

Applied SMMU binding changes to will (for-joerg/arm-smmu/bindings), thanks!

[2/7] dt-bindings: arm-smmu: Fix SM8[45]50 GPU SMMU 'if' condition
  https://git.kernel.org/will/c/281ca9b6
[3/7] dt-bindings: arm-smmu: Document SM8650 GPU SMMU
  https://git.kernel.org/will/c/8a05f74d567a

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [PATCH 0/8] A702 support

2024-02-22 Thread Will Deacon
On Mon, 19 Feb 2024 14:35:45 +0100, Konrad Dybcio wrote:
> Bit of a megaseries, bunched together for your testing convenience..
> Needs mesa!27665 [1] on the userland part, kmscube happily spins.
> 
> I'm feeling quite lukewarm about the memory barriers in patch 3..
> 
> Patch 1 for Will/smmu, 5-6 for drm/msm, rest for qcom
> 
> [...]

Applied SMMU bindings patch to will (for-joerg/arm-smmu/bindings), thanks!

[1/8] dt-bindings: arm-smmu: Add QCM2290 GPU SMMU
  https://git.kernel.org/will/c/0eca305f8e0d

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [PATCH v2] iommu/arm-smmu-qcom: Add missing GMU entry to match table

2023-12-12 Thread Will Deacon
On Sun, 10 Dec 2023 10:06:53 -0800, Rob Clark wrote:
> From: Rob Clark 
> 
> In some cases the firmware expects cbndx 1 to be assigned to the GMU,
> so we also want the default domain for the GMU to be an identy domain.
> This way it does not get a context bank assigned.  Without this, both
> of_dma_configure() and drm/msm's iommu_domain_attach() will trigger
> allocating and configuring a context bank.  So GMU ends up attached to
> both cbndx 1 and later cbndx 2.  This arrangement seemingly confounds
> and surprises the firmware if the GPU later triggers a translation
> fault, resulting (on sc8280xp / lenovo x13s, at least) in the SMMU
> getting wedged and the GPU stuck without memory access.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-qcom: Add missing GMU entry to match table
  https://git.kernel.org/will/c/afc95681c306

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [PATCH v3 00/12] RB1/QCM2290 features

2023-12-12 Thread Will Deacon
On Wed, 29 Nov 2023 15:43:57 +0100, Konrad Dybcio wrote:
> This series brings:
> - interconnect plumbing
> - display setup
> 
> for QCM2290/QRB2210 and
> 
> - CAN bus controller
> - HDMI display
> - wifi fw variant name
> 
> [...]

Applied SMMU update to will (for-joerg/arm-smmu/updates), thanks!

[05/12] iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
https://git.kernel.org/will/c/28af105cb650

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [Freedreno] [PATCH] drm/msm/a6xx: don't set IO_PGTABLE_QUIRK_ARM_OUTER_WBWA with coherent SMMU

2023-10-02 Thread Will Deacon
On Fri, Sep 29, 2023 at 06:25:21PM +0100, Robin Murphy wrote:
> On 29/09/2023 4:45 pm, Will Deacon wrote:
> > On Mon, Sep 25, 2023 at 06:54:42PM +0100, Robin Murphy wrote:
> > > On 2023-04-10 19:52, Dmitry Baryshkov wrote:
> > > > If the Adreno SMMU is dma-coherent, allocation will fail unless we
> > > > disable IO_PGTABLE_QUIRK_ARM_OUTER_WBWA. Skip setting this quirk for the
> > > > coherent SMMUs (like we have on sm8350 platform).
> > > 
> > > Hmm, but is it right that it should fail in the first place? The fact is
> > > that if the SMMU is coherent then walks *will* be outer-WBWA, so I 
> > > honestly
> > > can't see why the io-pgtable code is going out of its way to explicitly
> > > reject a request to give them the same attribute it's already giving then
> > > anyway :/
> > > 
> > > Even if the original intent was for the quirk to have an over-specific
> > > implication of representing inner-NC as well, that hardly seems useful if
> > > what we've ended up with in practice is a nonsensical-looking check in one
> > > place and then a weird hacky bodge in another purely to work around it.
> > > 
> > > Does anyone know a good reason why this is the way it is?
> > 
> > I think it was mainly because the quick doesn't make sense for a coherent
> > page-table walker and we could in theory use that bit for something else
> > in that case.
> 
> Yuck, even if we did want some horrible notion of quirks being conditional
> on parts of the config rather than just the format, then the users would
> need to be testing for the same condition as the pagetable code itself (i.e.
> cfg->coherent_walk), rather than hoping some other property of something
> else indirectly reflects the right information - e.g. there'd be no hope of
> backporting this particular bodge before 5.19 where the old
> iommu_capable(IOMMU_CAP_CACHE_COHERENCY) always returned true, and in future
> we could conceivably support coherent SMMUs being configured for
> non-coherent walks on a per-domain basis.

That doesn't sound like an insurmountable problem to me. Either a bunch of
other stuff has to be backported as well, or the msm_iommu driver can fish
the pgtable configuration out of the SMMU, like it already does elsewhere.

> Furthermore, if we did overload a flag to have multiple meanings, then we'd
> have no way of knowing which one the caller was actually expecting, thus the
> illusion of being able to validate calls in the meantime isn't necessarily
> as helpful as it seems, particularly in a case where the "wrong"
> interpretation would be to have no effect anyway. Mostly though I'd hope
> that if we ever got anywhere near the point of running out of quirk bits
> we'd have already realised that it's time for a better interface :(

Although I agree that practically I can't see us reusing quirk bits, I do
much prefer that we reject quirks that don't make sense. Yes, in this case
it happens that the quirk is expressing something which is already true
for the coherent walker, but that feels like a special case to me rather
than something which is likely to be true in general, for example, the
system cache quirk proposed by Qualcomm to expose the unused
inner-NC-outer-WBWRA MAIR configuration.

Implicitly accepting quirks also makes it more difficult if we wanted to
change the default configuration in future; for example if we wanted to
adjust the default allocation hints.

So I'd prefer to leave the page-table code as-is.

Will


Re: [Freedreno] [PATCH] drm/msm/a6xx: don't set IO_PGTABLE_QUIRK_ARM_OUTER_WBWA with coherent SMMU

2023-09-29 Thread Will Deacon
On Mon, Sep 25, 2023 at 06:54:42PM +0100, Robin Murphy wrote:
> On 2023-04-10 19:52, Dmitry Baryshkov wrote:
> > If the Adreno SMMU is dma-coherent, allocation will fail unless we
> > disable IO_PGTABLE_QUIRK_ARM_OUTER_WBWA. Skip setting this quirk for the
> > coherent SMMUs (like we have on sm8350 platform).
> 
> Hmm, but is it right that it should fail in the first place? The fact is
> that if the SMMU is coherent then walks *will* be outer-WBWA, so I honestly
> can't see why the io-pgtable code is going out of its way to explicitly
> reject a request to give them the same attribute it's already giving then
> anyway :/
> 
> Even if the original intent was for the quirk to have an over-specific
> implication of representing inner-NC as well, that hardly seems useful if
> what we've ended up with in practice is a nonsensical-looking check in one
> place and then a weird hacky bodge in another purely to work around it.
> 
> Does anyone know a good reason why this is the way it is?

I think it was mainly because the quick doesn't make sense for a coherent
page-table walker and we could in theory use that bit for something else
in that case.

Will


Re: [Freedreno] [PATCH v6 12/12] iommu/arm-smmu-qcom: Add SM6350 DPU compatible

2023-06-08 Thread Will Deacon
On Tue, Jun 06, 2023 at 02:44:03PM +0200, Konrad Dybcio wrote:
> From: Konrad Dybcio 
> 
> Add the SM6350 DPU compatible to clients compatible list, as it also
> needs the workarounds.
> 
> Signed-off-by: Konrad Dybcio 
> Acked-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index cc574928c707..bdeb587552c0 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -253,6 +253,7 @@ static const struct of_device_id 
> qcom_smmu_client_of_match[] __maybe_unused = {
>   { .compatible = "qcom,sc8280xp-mdss" },
>   { .compatible = "qcom,sdm845-mdss" },
>   { .compatible = "qcom,sdm845-mss-pil" },
> + { .compatible = "qcom,sm6350-mdss" },
>   { .compatible = "qcom,sm6375-mdss" },
>   { .compatible = "qcom,sm8150-mdss" },
>   { .compatible = "qcom,sm8250-mdss" },

Acked-by: Will Deacon 

Will


Re: [Freedreno] [PATCH v6 11/12] iommu/arm-smmu-qcom: Add SM6375 DPU compatible

2023-06-08 Thread Will Deacon
On Tue, Jun 06, 2023 at 02:44:02PM +0200, Konrad Dybcio wrote:
> Add the SM6375 DPU compatible to clients compatible list, as it also
> needs the workarounds.
> 
> Acked-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index 3800ab478216..cc574928c707 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -253,6 +253,7 @@ static const struct of_device_id 
> qcom_smmu_client_of_match[] __maybe_unused = {
>   { .compatible = "qcom,sc8280xp-mdss" },
>   { .compatible = "qcom,sdm845-mdss" },
>   { .compatible = "qcom,sdm845-mss-pil" },
> + { .compatible = "qcom,sm6375-mdss" },
>   { .compatible = "qcom,sm8150-mdss" },
>   { .compatible = "qcom,sm8250-mdss" },
>   { }

Acked-by: Will Deacon 

Will


Re: [Freedreno] [PATCH v6 10/12] iommu/arm-smmu-qcom: Sort the compatible list alphabetically

2023-06-08 Thread Will Deacon
On Tue, Jun 06, 2023 at 02:44:01PM +0200, Konrad Dybcio wrote:
> It got broken at some point, fix it up.
> 
> Reviewed-by: Dmitry Baryshkov 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index c71afda79d64..3800ab478216 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -251,10 +251,10 @@ static const struct of_device_id 
> qcom_smmu_client_of_match[] __maybe_unused = {
>   { .compatible = "qcom,sc7280-mss-pil" },
>   { .compatible = "qcom,sc8180x-mdss" },
>   { .compatible = "qcom,sc8280xp-mdss" },
> - { .compatible = "qcom,sm8150-mdss" },
> - { .compatible = "qcom,sm8250-mdss" },
>   { .compatible = "qcom,sdm845-mdss" },
>   { .compatible = "qcom,sdm845-mss-pil" },
> + { .compatible = "qcom,sm8150-mdss" },
> + { .compatible = "qcom,sm8250-mdss" },
>   { }
>  };

Acked-by: Will Deacon 

Will


Re: [Freedreno] [PATCH v5 00/12] SM63(50|75) DPU support

2023-06-05 Thread Will Deacon
On Thu, Jun 01, 2023 at 03:16:52AM +0300, Dmitry Baryshkov wrote:
> On 23/05/2023 10:46, Konrad Dybcio wrote:
> 
> [skipped the changelog]
> 
> > ---
> > Konrad Dybcio (12):
> >dt-bindings: display/msm: dsi-controller-main: Add SM6350
> >dt-bindings: display/msm: dsi-controller-main: Add SM6375
> >dt-bindings: display/msm: sc7180-dpu: Describe SM6350 and SM6375
> >dt-bindings: display/msm: Add SM6350 MDSS
> >dt-bindings: display/msm: Add SM6375 MDSS
> >drm/msm/dpu: Add SM6350 support
> >drm/msm: mdss: Add SM6350 support
> >drm/msm/dpu: Add SM6375 support
> >drm/msm: mdss: Add SM6375 support
> >iommu/arm-smmu-qcom: Sort the compatible list alphabetically
> >iommu/arm-smmu-qcom: Add SM6375 DPU compatible
> >iommu/arm-smmu-qcom: Add SM6350 DPU compatible
> 
> As we are now nearly ready to merge this series, Will, Robin, what should be
> the merge strategy for these three patches? Would you take them through the
> arm-smmu/iommu tree?

I'm happy to take the three IOMMU changes, but the bulk of this series is
replated to display and GPU so I don't think it makes sense for me to take
those.

Will


Re: [Freedreno] [PATCH v3 1/2] iommu/arm-smmu-qcom: Fix missing adreno_smmu's

2023-05-16 Thread Will Deacon
On Thu, May 11, 2023 at 07:59:05AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> When the special handling of qcom,adreno-smmu was moved into
> qcom_smmu_create(), it was overlooked that we didn't have all the
> required entries in qcom_smmu_impl_of_match.  So we stopped getting
> adreno_smmu_priv on sc7180, breaking per-process pgtables.
> 
> Fixes: 30b912a03d91 ("iommu/arm-smmu-qcom: Move the qcom,adreno-smmu check 
> into qcom_smmu_create")
> Suggested-by: Lepton Wu 
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index d1b296b95c86..66e191773099 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -496,20 +496,21 @@ static const struct qcom_smmu_match_data 
> qcom_smmu_500_impl0_data = {
>  /*
>   * Do not add any more qcom,SOC-smmu-500 entries to this list, unless they 
> need
>   * special handling and can not be covered by the qcom,smmu-500 entry.
>   */
>  static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
>   { .compatible = "qcom,msm8996-smmu-v2", .data = _smmu_data },
>   { .compatible = "qcom,msm8998-smmu-v2", .data = _smmu_v2_data },
>   { .compatible = "qcom,qcm2290-smmu-500", .data = 
> _smmu_500_impl0_data },
>   { .compatible = "qcom,qdu1000-smmu-500", .data = 
> _smmu_500_impl0_data  },
>   { .compatible = "qcom,sc7180-smmu-500", .data = 
> _smmu_500_impl0_data },
> + { .compatible = "qcom,sc7180-smmu-v2", .data = _smmu_v2_data },
>   { .compatible = "qcom,sc7280-smmu-500", .data = 
> _smmu_500_impl0_data },
>   { .compatible = "qcom,sc8180x-smmu-500", .data = 
> _smmu_500_impl0_data },
>   { .compatible = "qcom,sc8280xp-smmu-500", .data = 
> _smmu_500_impl0_data },
>   { .compatible = "qcom,sdm630-smmu-v2", .data = _smmu_v2_data },
>   { .compatible = "qcom,sdm845-smmu-v2", .data = _smmu_v2_data },
>   { .compatible = "qcom,sdm845-smmu-500", .data = _smmu_500_data },
>   { .compatible = "qcom,sm6115-smmu-500", .data = 
> _smmu_500_impl0_data},
>   { .compatible = "qcom,sm6125-smmu-500", .data = 
> _smmu_500_impl0_data },
>   { .compatible = "qcom,sm6350-smmu-v2", .data = _smmu_v2_data },
>   { .compatible = "qcom,sm6350-smmu-500", .data = 
> _smmu_500_impl0_data },
> @@ -540,12 +541,18 @@ struct arm_smmu_device *qcom_smmu_impl_init(struct 
> arm_smmu_device *smmu)
>   /* Match platform for ACPI boot */
>   if (acpi_match_platform_list(qcom_acpi_platlist) >= 0)
>   return qcom_smmu_create(smmu, 
> _smmu_500_impl0_data);
>   }
>  #endif
>  
>   match = of_match_node(qcom_smmu_impl_of_match, np);
>   if (match)
>   return qcom_smmu_create(smmu, match->data);
>  
> + /* If you hit this WARN_ON() you are missing an entry in the
> +  * qcom_smmu_impl_of_match[] table, and GPU per-process page-
> +  * tables will be broken.
> +  */
> + WARN_ON(of_device_is_compatible(np, "qcom,adreno-smmu"));

Wouldn't it be better to print the information from the comment, rather
than force the user to diagnose a WARN_ON() back to the source?

Will


Re: [Freedreno] [PATCH v1 00/10] iommu/arm-smmu-qcom: Rework Qualcomm SMMU bindings and implementation

2022-11-18 Thread Will Deacon
On Fri, Nov 18, 2022 at 01:41:24PM +0100, Krzysztof Kozlowski wrote:
> On 14/11/2022 18:06, Dmitry Baryshkov wrote:
> > The main goal of this patchset is to define a generic qcom,smmu-500
> > binding to be used by newer Qualcomm platforms instead of defining each
> > and every SoC line with no actual differences between the compats.
> > 
> > While preparing this change it was required to cleanup the existing
> > bindings and to rework the way the arm-smmu-qcom implementation handles
> > binding to IOMMU devices.
> > 
> > Changes since RFC v2:
> >  - Dropped the dts patch, picked up by Bjorn
> >  - Fixed minor nits in commit messages and in-file comments (noted by
> >Krzysztof and Richard Acayan)
> > 
> > Changes since RFC v1:
> >  - Added the dts patch fixing order of clocks in msm8996.dtsi
> >  - Fixed the DT bot errors
> >  - Added separate clause for Google Cheza devices
> 
> Please continue the version numbering. RFC is also a patch and also a
> version. This is v3. Your next will be v4.

I queued this already, so hopefully there won't be a next version!

Will


Re: [Freedreno] [PATCH v1 00/10] iommu/arm-smmu-qcom: Rework Qualcomm SMMU bindings and implementation

2022-11-14 Thread Will Deacon
On Mon, 14 Nov 2022 20:06:25 +0300, Dmitry Baryshkov wrote:
> The main goal of this patchset is to define a generic qcom,smmu-500
> binding to be used by newer Qualcomm platforms instead of defining each
> and every SoC line with no actual differences between the compats.
> 
> While preparing this change it was required to cleanup the existing
> bindings and to rework the way the arm-smmu-qcom implementation handles
> binding to IOMMU devices.
> 
> [...]

Applied to arm64 (for-joerg/arm-smmu/bindings), thanks!

Note that I removed the 'qcom_smmu_data' structure completely as it was
no longer referenced after patch 9.

[01/10] dt-bindings: arm-smmu: Add missing Qualcomm SMMU compatibles
https://git.kernel.org/arm64/c/dbf88f743583
[02/10] dt-bindings: arm-smmu: fix clocks/clock-names schema
https://git.kernel.org/arm64/c/982295bfe369
[03/10] dt-bindings: arm-smmu: add special case for Google Cheza platform
https://git.kernel.org/arm64/c/3a12e8c06536
[04/10] dt-bindings: arm-smmu: Add generic qcom,smmu-500 bindings
https://git.kernel.org/arm64/c/6c84bbd103d8
[05/10] iommu/arm-smmu-qcom: Move implementation data into match data
https://git.kernel.org/arm64/c/4c1d0ad153f8
[06/10] iommu/arm-smmu-qcom: Move the qcom,adreno-smmu check into 
qcom_smmu_create
https://git.kernel.org/arm64/c/30b912a03d91
[07/10] iommu/arm-smmu-qcom: provide separate implementation for SDM845-smmu-500
https://git.kernel.org/arm64/c/417b76adcf1d
[08/10] iommu/arm-smmu-qcom: Merge table from arm-smmu-qcom-debug into match 
data
https://git.kernel.org/arm64/c/4172dda2b30a
[09/10] iommu/arm-smmu-qcom: Stop using mmu500 reset for v2 MMUs
https://git.kernel.org/arm64/c/b4c6ee515c42
[10/10] iommu/arm-smmu-qcom: Add generic qcom,smmu-500 match entry
https://git.kernel.org/arm64/c/80b71080720e

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [Freedreno] [RFC PATCH v2 00/11] iommu/arm-smmu-qcom: Rework Qualcomm SMMU bindings and implementation

2022-11-14 Thread Will Deacon
On Wed, Nov 02, 2022 at 09:44:09PM +0300, Dmitry Baryshkov wrote:
> The main goal of this patchset is to define a generic qcom,smmu-500
> binding to be used by newer Qualcomm platforms instead of defining each
> and every SoC line with no actual differences between the compats.

Thanks for doing this, I really like the cleanup and the possibility
that we can stop adding all these pointless strings every release!

It looks like Bjorn picked up patch 1, so could you please rebase the
rest of the series onto my SMMU bindings queue:

https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/bindings

and address the minor review comments you had so that I can pick this up?

Cheers,

Will


Re: [Freedreno] [PATCH 1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.

2022-07-06 Thread Will Deacon
On Tue, 14 Jun 2022 16:01:35 -0700, Emma Anholt wrote:
> Required for turning on per-process page tables for the GPU.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.
  https://git.kernel.org/will/c/3482c0b73073
[2/2] arm64: dts: qcom: sm8250: Enable per-process page tables.
  (no commit info)

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [Freedreno] [PATCH v2 1/3] iommu/io-pgtable-arm: Add way to debug pgtable walk

2021-12-14 Thread Will Deacon
On Tue, Oct 05, 2021 at 08:16:25AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> Add an io-pgtable method to retrieve the raw PTEs that would be
> traversed for a given iova access.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/io-pgtable-arm.c | 40 +++---
>  include/linux/io-pgtable.h |  9 
>  2 files changed, 41 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index dd9e47189d0d..c470fc0b3c2b 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -700,38 +700,61 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops 
> *ops, unsigned long iova,
>   return arm_lpae_unmap_pages(ops, iova, size, 1, gather);
>  }
>  
> -static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> -  unsigned long iova)
> +static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, unsigned long 
> iova,
> +  void *_ptes, int *num_ptes)
>  {
>   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>   arm_lpae_iopte pte, *ptep = data->pgd;
> + arm_lpae_iopte *ptes = _ptes;
> + int max_ptes = *num_ptes;
>   int lvl = data->start_level;
>  
> + *num_ptes = 0;
> +
>   do {
> + if (*num_ptes >= max_ptes)
> + return -ENOSPC;
> +
>   /* Valid IOPTE pointer? */
>   if (!ptep)
> - return 0;
> + return -EFAULT;
>  
>   /* Grab the IOPTE we're interested in */
>   ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
>   pte = READ_ONCE(*ptep);
>  
> + ptes[(*num_ptes)++] = pte;
> +
>   /* Valid entry? */
>   if (!pte)
> - return 0;
> + return -EFAULT;
>  
>   /* Leaf entry? */
>   if (iopte_leaf(pte, lvl, data->iop.fmt))
> - goto found_translation;
> + return 0;
>  
>   /* Take it to the next level */
>   ptep = iopte_deref(pte, data);
>   } while (++lvl < ARM_LPAE_MAX_LEVELS);
>  
> - /* Ran out of page tables to walk */
> - return 0;
> + return -EFAULT;
> +}
> +
> +static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> +  unsigned long iova)
> +{
> + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> + arm_lpae_iopte pte, ptes[ARM_LPAE_MAX_LEVELS];
> + int lvl, num_ptes = ARM_LPAE_MAX_LEVELS;
> + int ret;
> +
> + ret = arm_lpae_pgtable_walk(ops, iova, ptes, _ptes);
> + if (ret)
> + return 0;
> +
> + pte = ptes[num_ptes - 1];
> + lvl = num_ptes - 1 + data->start_level;
>  
> -found_translation:
>   iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1);
>   return iopte_to_paddr(pte, data) | iova;
>  }
> @@ -816,6 +839,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>   .unmap  = arm_lpae_unmap,
>   .unmap_pages= arm_lpae_unmap_pages,
>   .iova_to_phys   = arm_lpae_iova_to_phys,
> + .pgtable_walk   = arm_lpae_pgtable_walk,
>   };
>  
>   return data;
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 86af6f0a00a2..501f362a929c 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -148,6 +148,13 @@ struct io_pgtable_cfg {
>   * @unmap:Unmap a physically contiguous memory region.
>   * @unmap_pages:  Unmap a range of virtually contiguous pages of the same 
> size.
>   * @iova_to_phys: Translate iova to physical address.
> + * @pgtable_walk: Return details of a page table walk for a given iova.
> + *This returns the array of PTEs in a format that is
> + *specific to the page table format.  The number of
> + *PTEs can be format specific.  The num_ptes parameter
> + *on input specifies the size of the ptes array, and
> + *on output the number of PTEs filled in (which depends
> + *on the number of PTEs walked to resolve the iova)

I think this would be a fair bit cleaner if the interface instead took a
callback function to invoke at each page-table level. It would be invoked
with the pte value and the level. Depending on its return value the walk
could be terminated early. That would also potentially scale to walking
ranges of iovas as well if we ever need it and it may be more readily
implementable by other formats too.

>   *
>   * These functions map directly onto the iommu_ops member functions with
>   * the same names.

This bit of the comment is no longer true with your change.

Will


Re: [Freedreno] [PATCH] iommu/arm-smmu-qcom: Fix TTBR0 read

2021-12-14 Thread Will Deacon
On Mon, 8 Nov 2021 09:17:23 -0800, Rob Clark wrote:
> From: Rob Clark 
> 
> It is a 64b register, lets not lose the upper bits.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-qcom: Fix TTBR0 read
  https://git.kernel.org/will/c/c31112fbd407

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [Freedreno] [PATCH] [RFC] qcom_scm: hide Kconfig symbol

2021-09-29 Thread Will Deacon
On Mon, Sep 27, 2021 at 05:22:13PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Now that SCM can be a loadable module, we have to add another
> dependency to avoid link failures when ipa or adreno-gpu are
> built-in:
> 
> aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_probe':
> ipa_main.c:(.text+0xfc4): undefined reference to `qcom_scm_is_available'
> 
> ld.lld: error: undefined symbol: qcom_scm_is_available
> >>> referenced by adreno_gpu.c
> >>>   gpu/drm/msm/adreno/adreno_gpu.o:(adreno_zap_shader_load) in 
> >>> archive drivers/built-in.a
> 
> This can happen when CONFIG_ARCH_QCOM is disabled and we don't select
> QCOM_MDT_LOADER, but some other module selects QCOM_SCM. Ideally we'd
> use a similar dependency here to what we have for QCOM_RPROC_COMMON,
> but that causes dependency loops from other things selecting QCOM_SCM.
> 
> This appears to be an endless problem, so try something different this
> time:
> 
>  - CONFIG_QCOM_SCM becomes a hidden symbol that nothing 'depends on'
>but that is simply selected by all of its users
> 
>  - All the stubs in include/linux/qcom_scm.h can go away
> 
>  - arm-smccc.h needs to provide a stub for __arm_smccc_smc() to
>allow compile-testing QCOM_SCM on all architectures.
> 
>  - To avoid a circular dependency chain involving RESET_CONTROLLER
>and PINCTRL_SUNXI, change the 'depends on RESET_CONTROLLER' in
>the latter one to 'select'.
> 
> The last bit is rather annoying, as drivers should generally never
> 'select' another subsystem, and about half the users of the reset
> controller interface do this anyway.
> 
> Nevertheless, this version seems to pass all my randconfig tests
> and is more robust than any of the prior versions.
> 
> Comments?
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/firmware/Kconfig|  4 +-
>  drivers/gpu/drm/msm/Kconfig |  4 +-
>  drivers/iommu/Kconfig   |  2 +-
>  drivers/media/platform/Kconfig  |  2 +-
>  drivers/mmc/host/Kconfig|  2 +-
>  drivers/net/ipa/Kconfig |  1 +
>  drivers/net/wireless/ath/ath10k/Kconfig |  2 +-
>  drivers/pinctrl/qcom/Kconfig|  3 +-
>  drivers/pinctrl/sunxi/Kconfig   |  6 +--
>  include/linux/arm-smccc.h   | 10 
>  include/linux/qcom_scm.h| 71 -
>  11 files changed, 23 insertions(+), 84 deletions(-)
> 
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 220a58cf0a44..f7dd82ef0b9c 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -203,9 +203,7 @@ config INTEL_STRATIX10_RSU
> Say Y here if you want Intel RSU support.
>  
>  config QCOM_SCM
> - tristate "Qcom SCM driver"
> - depends on ARM || ARM64
> - depends on HAVE_ARM_SMCCC
> + tristate
>   select RESET_CONTROLLER
>  
>  config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
> diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
> index e9c6af78b1d7..3ddf739a6f9b 100644
> --- a/drivers/gpu/drm/msm/Kconfig
> +++ b/drivers/gpu/drm/msm/Kconfig
> @@ -17,7 +17,7 @@ config DRM_MSM
>   select DRM_SCHED
>   select SHMEM
>   select TMPFS
> - select QCOM_SCM if ARCH_QCOM
> + select QCOM_SCM
>   select WANT_DEV_COREDUMP
>   select SND_SOC_HDMI_CODEC if SND_SOC
>   select SYNC_FILE
> @@ -55,7 +55,7 @@ config DRM_MSM_GPU_SUDO
>  
>  config DRM_MSM_HDMI_HDCP
>   bool "Enable HDMI HDCP support in MSM DRM driver"
> - depends on DRM_MSM && QCOM_SCM
> + depends on DRM_MSM
>   default y
>   help
> Choose this option to enable HDCP state machine
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 124c41adeca1..989c83acbfee 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -308,7 +308,7 @@ config APPLE_DART
>  config ARM_SMMU
>   tristate "ARM Ltd. System MMU (SMMU) Support"
>   depends on ARM64 || ARM || (COMPILE_TEST && !GENERIC_ATOMIC64)
> - depends on QCOM_SCM || !QCOM_SCM #if QCOM_SCM=m this can't be =y
> + select QCOM_SCM
>   select IOMMU_API
>   select IOMMU_IO_PGTABLE_LPAE
>   select ARM_DMA_USE_IOMMU if ARM

I don't want to get in the way of this patch because I'm also tired of the
randconfig failures caused by QCOM_SCM. However, ARM_SMMU is applicable to
a wide variety of (non-qcom) SoCs and so it seems a shame to require the
QCOM_SCM code to be included for all of those when it's not strictly needed
at all.

Will


Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-10 Thread Will Deacon
On Mon, Aug 09, 2021 at 11:17:40PM +0530, Sai Prakash Ranjan wrote:
> On 2021-08-09 23:10, Will Deacon wrote:
> > On Mon, Aug 09, 2021 at 10:18:21AM -0700, Rob Clark wrote:
> > > On Mon, Aug 9, 2021 at 10:05 AM Will Deacon  wrote:
> > > > On Mon, Aug 09, 2021 at 09:57:08AM -0700, Rob Clark wrote:
> > > > > But I suppose we could call it instead IOMMU_QCOM_LLC or something
> > > > > like that to make it more clear that it is not necessarily something
> > > > > that would work with a different outer level cache implementation?
> > > >
> > > > ... or we could just deal with the problem so that other people can 
> > > > reuse
> > > > the code. I haven't really understood the reluctance to solve this 
> > > > properly.
> > > >
> > > > Am I missing some reason this isn't solvable?
> > > 
> > > Oh, was there another way to solve it (other than foregoing setting
> > > INC_OCACHE in the pgtables)?  Maybe I misunderstood, is there a
> > > corresponding setting on the MMU pgtables side of things?
> > 
> > Right -- we just need to program the CPU's MMU with the matching memory
> > attributes! It's a bit more fiddly if you're just using ioremap_wc()
> > though, as it's usually the DMA API which handles the attributes under
> > the
> > hood.
> > 
> > Anyway, sorry, I should've said that explicitly earlier on. We've done
> > this
> > sort of thing in the Android tree so I assumed Sai knew what needed to
> > be
> > done and then I didn't think to explain to you :(
> > 
> 
> Right I was aware of that but even in the android tree there is no user :)

I'm assuming there are vendor modules using it there, otherwise we wouldn't
have been asked to put it in. Since you work at Qualcomm, maybe you could
talk to your colleagues (Isaac and Patrick) directly?

> I think we can't have a new memory type without any user right in upstream
> like android tree?

Correct. But I don't think we should be adding IOMMU_* anything upstream
if we don't have a user.

Will


Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-09 Thread Will Deacon
On Mon, Aug 09, 2021 at 10:18:21AM -0700, Rob Clark wrote:
> On Mon, Aug 9, 2021 at 10:05 AM Will Deacon  wrote:
> >
> > On Mon, Aug 09, 2021 at 09:57:08AM -0700, Rob Clark wrote:
> > > On Mon, Aug 9, 2021 at 7:56 AM Will Deacon  wrote:
> > > > On Mon, Aug 02, 2021 at 06:36:04PM -0700, Rob Clark wrote:
> > > > > On Mon, Aug 2, 2021 at 8:14 AM Will Deacon  wrote:
> > > > > > On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> > > > > > > On Mon, Aug 2, 2021 at 3:55 AM Will Deacon  
> > > > > > > wrote:
> > > > > > > > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan 
> > > > > > > > wrote:
> > > > > > > > > On 2021-07-28 19:30, Georgi Djakov wrote:
> > > > > > > > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash 
> > > > > > > > > > Ranjan wrote:
> > > > > > > > > > > commit ecd7274fb4cd ("iommu: Remove unused 
> > > > > > > > > > > IOMMU_SYS_CACHE_ONLY flag")
> > > > > > > > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along 
> > > > > > > > > > > with it went
> > > > > > > > > > > the memory type setting required for the non-coherent 
> > > > > > > > > > > masters to use
> > > > > > > > > > > system cache. Now that system cache support for GPU is 
> > > > > > > > > > > added, we will
> > > > > > > > > > > need to set the right PTE attribute for GPU buffers to be 
> > > > > > > > > > > sys cached.
> > > > > > > > > > > Without this, the system cache lines are not allocated 
> > > > > > > > > > > for GPU.
> > > > > > > > > > >
> > > > > > > > > > > So the patches in this series introduces a new prot flag 
> > > > > > > > > > > IOMMU_LLC,
> > > > > > > > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to 
> > > > > > > > > > > IO_PGTABLE_QUIRK_PTW_LLC
> > > > > > > > > > > and makes GPU the user of this protection flag.
> > > > > > > > > >
> > > > > > > > > > Thank you for the patchset! Are you planning to refresh it, 
> > > > > > > > > > as it does
> > > > > > > > > > not apply anymore?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I was waiting on Will's reply [1]. If there are no changes 
> > > > > > > > > needed, then
> > > > > > > > > I can repost the patch.
> > > > > > > >
> > > > > > > > I still think you need to handle the mismatched alias, no? 
> > > > > > > > You're adding
> > > > > > > > a new memory type to the SMMU which doesn't exist on the CPU 
> > > > > > > > side. That
> > > > > > > > can't be right.
> > > > > > > >
> > > > > > >
> > > > > > > Just curious, and maybe this is a dumb question, but what is your
> > > > > > > concern about mismatched aliases?  I mean the cache hierarchy on 
> > > > > > > the
> > > > > > > GPU device side (anything beyond the LLC) is pretty different and
> > > > > > > doesn't really care about the smmu pgtable attributes..
> > > > > >
> > > > > > If the CPU accesses a shared buffer with different attributes to 
> > > > > > those which
> > > > > > the device is using then you fall into the "mismatched memory 
> > > > > > attributes"
> > > > > > part of the Arm architecture. It's reasonably unforgiving (you 
> > > > > > should go and
> > > > > > read it) and in some cases can apply to speculative accesses as 
> > > > > > well, but
> > > > > > the end result is typically loss of coherency.
> > > > >
> > > > > Ok, I might have a few other sections to read first to decipher the
> > > > > terminology..
> 

Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-09 Thread Will Deacon
On Mon, Aug 09, 2021 at 09:57:08AM -0700, Rob Clark wrote:
> On Mon, Aug 9, 2021 at 7:56 AM Will Deacon  wrote:
> > On Mon, Aug 02, 2021 at 06:36:04PM -0700, Rob Clark wrote:
> > > On Mon, Aug 2, 2021 at 8:14 AM Will Deacon  wrote:
> > > > On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> > > > > On Mon, Aug 2, 2021 at 3:55 AM Will Deacon  wrote:
> > > > > > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan wrote:
> > > > > > > On 2021-07-28 19:30, Georgi Djakov wrote:
> > > > > > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash Ranjan 
> > > > > > > > wrote:
> > > > > > > > > commit ecd7274fb4cd ("iommu: Remove unused 
> > > > > > > > > IOMMU_SYS_CACHE_ONLY flag")
> > > > > > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with 
> > > > > > > > > it went
> > > > > > > > > the memory type setting required for the non-coherent masters 
> > > > > > > > > to use
> > > > > > > > > system cache. Now that system cache support for GPU is added, 
> > > > > > > > > we will
> > > > > > > > > need to set the right PTE attribute for GPU buffers to be sys 
> > > > > > > > > cached.
> > > > > > > > > Without this, the system cache lines are not allocated for 
> > > > > > > > > GPU.
> > > > > > > > >
> > > > > > > > > So the patches in this series introduces a new prot flag 
> > > > > > > > > IOMMU_LLC,
> > > > > > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to 
> > > > > > > > > IO_PGTABLE_QUIRK_PTW_LLC
> > > > > > > > > and makes GPU the user of this protection flag.
> > > > > > > >
> > > > > > > > Thank you for the patchset! Are you planning to refresh it, as 
> > > > > > > > it does
> > > > > > > > not apply anymore?
> > > > > > > >
> > > > > > >
> > > > > > > I was waiting on Will's reply [1]. If there are no changes 
> > > > > > > needed, then
> > > > > > > I can repost the patch.
> > > > > >
> > > > > > I still think you need to handle the mismatched alias, no? You're 
> > > > > > adding
> > > > > > a new memory type to the SMMU which doesn't exist on the CPU side. 
> > > > > > That
> > > > > > can't be right.
> > > > > >
> > > > >
> > > > > Just curious, and maybe this is a dumb question, but what is your
> > > > > concern about mismatched aliases?  I mean the cache hierarchy on the
> > > > > GPU device side (anything beyond the LLC) is pretty different and
> > > > > doesn't really care about the smmu pgtable attributes..
> > > >
> > > > If the CPU accesses a shared buffer with different attributes to those 
> > > > which
> > > > the device is using then you fall into the "mismatched memory 
> > > > attributes"
> > > > part of the Arm architecture. It's reasonably unforgiving (you should 
> > > > go and
> > > > read it) and in some cases can apply to speculative accesses as well, 
> > > > but
> > > > the end result is typically loss of coherency.
> > >
> > > Ok, I might have a few other sections to read first to decipher the
> > > terminology..
> > >
> > > But my understanding of LLC is that it looks just like system memory
> > > to the CPU and GPU (I think that would make it "the point of
> > > coherence" between the GPU and CPU?)  If that is true, shouldn't it be
> > > invisible from the point of view of different CPU mapping options?
> >
> > You could certainly build a system where mismatched attributes don't cause
> > loss of coherence, but as it's not guaranteed by the architecture and the
> > changes proposed here affect APIs which are exposed across SoCs, then I
> > don't think it helps much.
> >
> 
> Hmm, the description of the new mapping flag is that it applies only
> to transparent outer level cache:
> 
> +/*
> + * Non-coherent masters can use this page protection flag to set cacheable
> + * memory attributes for only a transparent outer level of cache, also known 
> as
> + * the last-level or system cache.
> + */
> +#define IOMMU_LLC  (1 << 6)
> 
> But I suppose we could call it instead IOMMU_QCOM_LLC or something
> like that to make it more clear that it is not necessarily something
> that would work with a different outer level cache implementation?

... or we could just deal with the problem so that other people can reuse
the code. I haven't really understood the reluctance to solve this properly.

Am I missing some reason this isn't solvable?

Will


Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-09 Thread Will Deacon
On Mon, Aug 02, 2021 at 06:36:04PM -0700, Rob Clark wrote:
> On Mon, Aug 2, 2021 at 8:14 AM Will Deacon  wrote:
> >
> > On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> > > On Mon, Aug 2, 2021 at 3:55 AM Will Deacon  wrote:
> > > >
> > > > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan wrote:
> > > > > On 2021-07-28 19:30, Georgi Djakov wrote:
> > > > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash Ranjan wrote:
> > > > > > > commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY 
> > > > > > > flag")
> > > > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it 
> > > > > > > went
> > > > > > > the memory type setting required for the non-coherent masters to 
> > > > > > > use
> > > > > > > system cache. Now that system cache support for GPU is added, we 
> > > > > > > will
> > > > > > > need to set the right PTE attribute for GPU buffers to be sys 
> > > > > > > cached.
> > > > > > > Without this, the system cache lines are not allocated for GPU.
> > > > > > >
> > > > > > > So the patches in this series introduces a new prot flag 
> > > > > > > IOMMU_LLC,
> > > > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to 
> > > > > > > IO_PGTABLE_QUIRK_PTW_LLC
> > > > > > > and makes GPU the user of this protection flag.
> > > > > >
> > > > > > Thank you for the patchset! Are you planning to refresh it, as it 
> > > > > > does
> > > > > > not apply anymore?
> > > > > >
> > > > >
> > > > > I was waiting on Will's reply [1]. If there are no changes needed, 
> > > > > then
> > > > > I can repost the patch.
> > > >
> > > > I still think you need to handle the mismatched alias, no? You're adding
> > > > a new memory type to the SMMU which doesn't exist on the CPU side. That
> > > > can't be right.
> > > >
> > >
> > > Just curious, and maybe this is a dumb question, but what is your
> > > concern about mismatched aliases?  I mean the cache hierarchy on the
> > > GPU device side (anything beyond the LLC) is pretty different and
> > > doesn't really care about the smmu pgtable attributes..
> >
> > If the CPU accesses a shared buffer with different attributes to those which
> > the device is using then you fall into the "mismatched memory attributes"
> > part of the Arm architecture. It's reasonably unforgiving (you should go and
> > read it) and in some cases can apply to speculative accesses as well, but
> > the end result is typically loss of coherency.
> 
> Ok, I might have a few other sections to read first to decipher the
> terminology..
> 
> But my understanding of LLC is that it looks just like system memory
> to the CPU and GPU (I think that would make it "the point of
> coherence" between the GPU and CPU?)  If that is true, shouldn't it be
> invisible from the point of view of different CPU mapping options?

You could certainly build a system where mismatched attributes don't cause
loss of coherence, but as it's not guaranteed by the architecture and the
changes proposed here affect APIs which are exposed across SoCs, then I
don't think it helps much.

Will


Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-02 Thread Will Deacon
On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> On Mon, Aug 2, 2021 at 3:55 AM Will Deacon  wrote:
> >
> > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan wrote:
> > > On 2021-07-28 19:30, Georgi Djakov wrote:
> > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash Ranjan wrote:
> > > > > commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
> > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
> > > > > the memory type setting required for the non-coherent masters to use
> > > > > system cache. Now that system cache support for GPU is added, we will
> > > > > need to set the right PTE attribute for GPU buffers to be sys cached.
> > > > > Without this, the system cache lines are not allocated for GPU.
> > > > >
> > > > > So the patches in this series introduces a new prot flag IOMMU_LLC,
> > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to IO_PGTABLE_QUIRK_PTW_LLC
> > > > > and makes GPU the user of this protection flag.
> > > >
> > > > Thank you for the patchset! Are you planning to refresh it, as it does
> > > > not apply anymore?
> > > >
> > >
> > > I was waiting on Will's reply [1]. If there are no changes needed, then
> > > I can repost the patch.
> >
> > I still think you need to handle the mismatched alias, no? You're adding
> > a new memory type to the SMMU which doesn't exist on the CPU side. That
> > can't be right.
> >
> 
> Just curious, and maybe this is a dumb question, but what is your
> concern about mismatched aliases?  I mean the cache hierarchy on the
> GPU device side (anything beyond the LLC) is pretty different and
> doesn't really care about the smmu pgtable attributes..

If the CPU accesses a shared buffer with different attributes to those which
the device is using then you fall into the "mismatched memory attributes"
part of the Arm architecture. It's reasonably unforgiving (you should go and
read it) and in some cases can apply to speculative accesses as well, but
the end result is typically loss of coherency.

Will


Re: [Freedreno] [PATCH 1/2] iommu/arm-smmu-qcom: Skip the TTBR1 quirk for db820c.

2021-06-08 Thread Will Deacon
On Fri, 26 Mar 2021 16:13:02 -0700, Eric Anholt wrote:
> db820c wants to use the qcom smmu path to get HUPCF set (which keeps
> the GPU from wedging and then sometimes wedging the kernel after a
> page fault), but it doesn't have separate pagetables support yet in
> drm/msm so we can't go all the way to the TTBR1 path.

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu/arm-smmu-qcom: Skip the TTBR1 quirk for db820c.
  https://git.kernel.org/will/c/a242f4297cfe
[2/2] arm64: dts: msm8996: Mark the GPU's SMMU as an adreno one.
  https://git.kernel.org/will/c/19c07b91f85d

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v4 1/6] iommu/arm-smmu: Add support for driver IOMMU fault handlers

2021-06-08 Thread Will Deacon
On Wed, Jun 02, 2021 at 09:52:44AM -0700, Rob Clark wrote:
> From: Jordan Crouse 
> 
> Call report_iommu_fault() to allow upper-level drivers to register their
> own fault handlers.
> 
> Signed-off-by: Jordan Crouse 
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 6f72c4d208ca..b4b32d31fc06 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -408,6 +408,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void 
> *dev)
>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
>   int idx = smmu_domain->cfg.cbndx;
> + int ret;
>  
>   fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR);
>   if (!(fsr & ARM_SMMU_FSR_FAULT))
> @@ -417,8 +418,12 @@ static irqreturn_t arm_smmu_context_fault(int irq, void 
> *dev)
>   iova = arm_smmu_cb_readq(smmu, idx, ARM_SMMU_CB_FAR);
>   cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(idx));
>  
> - dev_err_ratelimited(smmu->dev,
> - "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
> cbfrsynra=0x%x, cb=%d\n",
> + ret = report_iommu_fault(domain, NULL, iova,
> + fsynr & ARM_SMMU_FSYNR0_WNR ? IOMMU_FAULT_WRITE : 
> IOMMU_FAULT_READ);
> +
> + if (ret == -ENOSYS)
> + dev_err_ratelimited(smmu->dev,
> + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, 
> cbfrsynra=0x%x, cb=%d\n",
>   fsr, iova, fsynr, cbfrsynra, idx);

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 16/18] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE

2021-04-01 Thread Will Deacon
On Thu, Apr 01, 2021 at 11:59:45AM +0200, Christoph Hellwig wrote:
> For now I'll just pass the iommu_domain to iommu_get_dma_strict,
> so that we can check for it.  We can do additional cleanups on top
> of that later.

Sounds good to me, cheers!

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 16/18] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE

2021-03-31 Thread Will Deacon
On Wed, Mar 31, 2021 at 02:09:37PM +0100, Robin Murphy wrote:
> On 2021-03-31 12:49, Will Deacon wrote:
> > On Tue, Mar 30, 2021 at 05:28:19PM +0100, Robin Murphy wrote:
> > > On 2021-03-30 14:58, Will Deacon wrote:
> > > > On Tue, Mar 30, 2021 at 02:19:38PM +0100, Robin Murphy wrote:
> > > > > On 2021-03-30 14:11, Will Deacon wrote:
> > > > > > On Tue, Mar 16, 2021 at 04:38:22PM +0100, Christoph Hellwig wrote:
> > > > > > > From: Robin Murphy 
> > > > > > > 
> > > > > > > Instead make the global iommu_dma_strict paramete in iommu.c 
> > > > > > > canonical by
> > > > > > > exporting helpers to get and set it and use those directly in the 
> > > > > > > drivers.
> > > > > > > 
> > > > > > > This make sure that the iommu.strict parameter also works for the 
> > > > > > > AMD and
> > > > > > > Intel IOMMU drivers on x86.  As those default to lazy flushing a 
> > > > > > > new
> > > > > > > IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to
> > > > > > > represent the default if not overriden by an explicit parameter.
> > > > > > > 
> > > > > > > Signed-off-by: Robin Murphy .
> > > > > > > [ported on top of the other iommu_attr changes and added a few 
> > > > > > > small
> > > > > > > missing bits]
> > > > > > > Signed-off-by: Christoph Hellwig 
> > > > > > > ---
> > > > > > > drivers/iommu/amd/iommu.c   | 23 +---
> > > > > > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 50 
> > > > > > > +---
> > > > > > > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
> > > > > > > drivers/iommu/arm/arm-smmu/arm-smmu.c   | 27 +
> > > > > > > drivers/iommu/dma-iommu.c   |  9 +--
> > > > > > > drivers/iommu/intel/iommu.c | 64 
> > > > > > > -
> > > > > > > drivers/iommu/iommu.c   | 27 ++---
> > > > > > > include/linux/iommu.h   |  4 +-
> > > > > > > 8 files changed, 40 insertions(+), 165 deletions(-)
> > > > > > 
> > > > > > I really like this cleanup, but I can't help wonder if it's going 
> > > > > > in the
> > > > > > wrong direction. With SoCs often having multiple IOMMU instances 
> > > > > > and a
> > > > > > distinction between "trusted" and "untrusted" devices, then having 
> > > > > > the
> > > > > > flush-queue enabled on a per-IOMMU or per-domain basis doesn't sound
> > > > > > unreasonable to me, but this change makes it a global property.
> > > > > 
> > > > > The intent here was just to streamline the existing behaviour of 
> > > > > stuffing a
> > > > > global property into a domain attribute then pulling it out again in 
> > > > > the
> > > > > illusion that it was in any way per-domain. We're still checking
> > > > > dev_is_untrusted() before making an actual decision, and it's not 
> > > > > like we
> > > > > can't add more factors at that point if we want to.
> > > > 
> > > > Like I say, the cleanup is great. I'm just wondering whether there's a
> > > > better way to express the complicated logic to decide whether or not to 
> > > > use
> > > > the flush queue than what we end up with:
> > > > 
> > > > if (!cookie->fq_domain && (!dev || !dev_is_untrusted(dev)) &&
> > > > domain->ops->flush_iotlb_all && !iommu_get_dma_strict())
> > > > 
> > > > which is mixing up globals, device properties and domain properties. The
> > > > result is that the driver code ends up just using the global to 
> > > > determine
> > > > whether or not to pass IO_PGTABLE_QUIRK_NON_STRICT to the page-table 
> > > > code,
> > > > which is a departure from the current way of doing things.
> > > 
> > > But previously, SMMU only ever saw the global policy piped through the
> &

Re: [Freedreno] [PATCH 16/18] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE

2021-03-31 Thread Will Deacon
On Tue, Mar 30, 2021 at 05:28:19PM +0100, Robin Murphy wrote:
> On 2021-03-30 14:58, Will Deacon wrote:
> > On Tue, Mar 30, 2021 at 02:19:38PM +0100, Robin Murphy wrote:
> > > On 2021-03-30 14:11, Will Deacon wrote:
> > > > On Tue, Mar 16, 2021 at 04:38:22PM +0100, Christoph Hellwig wrote:
> > > > > From: Robin Murphy 
> > > > > 
> > > > > Instead make the global iommu_dma_strict paramete in iommu.c 
> > > > > canonical by
> > > > > exporting helpers to get and set it and use those directly in the 
> > > > > drivers.
> > > > > 
> > > > > This make sure that the iommu.strict parameter also works for the AMD 
> > > > > and
> > > > > Intel IOMMU drivers on x86.  As those default to lazy flushing a new
> > > > > IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to
> > > > > represent the default if not overriden by an explicit parameter.
> > > > > 
> > > > > Signed-off-by: Robin Murphy .
> > > > > [ported on top of the other iommu_attr changes and added a few small
> > > > >missing bits]
> > > > > Signed-off-by: Christoph Hellwig 
> > > > > ---
> > > > >drivers/iommu/amd/iommu.c   | 23 +---
> > > > >drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 50 +---
> > > > >drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
> > > > >drivers/iommu/arm/arm-smmu/arm-smmu.c   | 27 +
> > > > >drivers/iommu/dma-iommu.c   |  9 +--
> > > > >drivers/iommu/intel/iommu.c | 64 
> > > > > -
> > > > >drivers/iommu/iommu.c   | 27 ++---
> > > > >include/linux/iommu.h   |  4 +-
> > > > >8 files changed, 40 insertions(+), 165 deletions(-)
> > > > 
> > > > I really like this cleanup, but I can't help wonder if it's going in the
> > > > wrong direction. With SoCs often having multiple IOMMU instances and a
> > > > distinction between "trusted" and "untrusted" devices, then having the
> > > > flush-queue enabled on a per-IOMMU or per-domain basis doesn't sound
> > > > unreasonable to me, but this change makes it a global property.
> > > 
> > > The intent here was just to streamline the existing behaviour of stuffing 
> > > a
> > > global property into a domain attribute then pulling it out again in the
> > > illusion that it was in any way per-domain. We're still checking
> > > dev_is_untrusted() before making an actual decision, and it's not like we
> > > can't add more factors at that point if we want to.
> > 
> > Like I say, the cleanup is great. I'm just wondering whether there's a
> > better way to express the complicated logic to decide whether or not to use
> > the flush queue than what we end up with:
> > 
> > if (!cookie->fq_domain && (!dev || !dev_is_untrusted(dev)) &&
> > domain->ops->flush_iotlb_all && !iommu_get_dma_strict())
> > 
> > which is mixing up globals, device properties and domain properties. The
> > result is that the driver code ends up just using the global to determine
> > whether or not to pass IO_PGTABLE_QUIRK_NON_STRICT to the page-table code,
> > which is a departure from the current way of doing things.
> 
> But previously, SMMU only ever saw the global policy piped through the
> domain attribute by iommu_group_alloc_default_domain(), so there's no
> functional change there.

For DMA domains sure, but I don't think that's the case for unmanaged
domains such as those used by VFIO.

> Obviously some of the above checks could be factored out into some kind of
> iommu_use_flush_queue() helper that IOMMU drivers can also call if they need
> to keep in sync. Or maybe we just allow iommu-dma to set
> IO_PGTABLE_QUIRK_NON_STRICT directly via iommu_set_pgtable_quirks() if we're
> treating that as a generic thing now.

I think a helper that takes a domain would be a good starting point.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 1/2] iommu/arm-smmu-qcom: Skip the TTBR1 quirk for db820c.

2021-03-30 Thread Will Deacon
On Tue, Mar 30, 2021 at 08:03:36AM -0700, Rob Clark wrote:
> On Tue, Mar 30, 2021 at 2:34 AM Will Deacon  wrote:
> >
> > On Mon, Mar 29, 2021 at 09:02:50PM -0700, Rob Clark wrote:
> > > On Mon, Mar 29, 2021 at 7:47 AM Will Deacon  wrote:
> > > >
> > > > On Fri, Mar 26, 2021 at 04:13:02PM -0700, Eric Anholt wrote:
> > > > > db820c wants to use the qcom smmu path to get HUPCF set (which keeps
> > > > > the GPU from wedging and then sometimes wedging the kernel after a
> > > > > page fault), but it doesn't have separate pagetables support yet in
> > > > > drm/msm so we can't go all the way to the TTBR1 path.
> > > >
> > > > What do you mean by "doesn't have separate pagetables support yet"? The
> > > > compatible string doesn't feel like the right way to determine this.
> > >
> > > the compatible string identifies what it is, not what the sw
> > > limitations are, so in that regard it seems right to me..
> >
> > Well it depends on what "doesn't have separate pagetables support yet"
> > means. I can't tell if it's a hardware issue, a firmware issue or a driver
> > issue.
> 
> Just a driver issue (and the fact that currently we don't have
> physical access to a device... debugging a5xx per-process-pgtables by
> pushing untested things to the CI farm is kind of a difficult way to
> work)

But then in that case, this is using the compatible string to identify a
driver issue, no?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 16/18] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE

2021-03-30 Thread Will Deacon
On Tue, Mar 30, 2021 at 02:19:38PM +0100, Robin Murphy wrote:
> On 2021-03-30 14:11, Will Deacon wrote:
> > On Tue, Mar 16, 2021 at 04:38:22PM +0100, Christoph Hellwig wrote:
> > > From: Robin Murphy 
> > > 
> > > Instead make the global iommu_dma_strict paramete in iommu.c canonical by
> > > exporting helpers to get and set it and use those directly in the drivers.
> > > 
> > > This make sure that the iommu.strict parameter also works for the AMD and
> > > Intel IOMMU drivers on x86.  As those default to lazy flushing a new
> > > IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to
> > > represent the default if not overriden by an explicit parameter.
> > > 
> > > Signed-off-by: Robin Murphy .
> > > [ported on top of the other iommu_attr changes and added a few small
> > >   missing bits]
> > > Signed-off-by: Christoph Hellwig 
> > > ---
> > >   drivers/iommu/amd/iommu.c   | 23 +---
> > >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 50 +---
> > >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
> > >   drivers/iommu/arm/arm-smmu/arm-smmu.c   | 27 +
> > >   drivers/iommu/dma-iommu.c   |  9 +--
> > >   drivers/iommu/intel/iommu.c | 64 -
> > >   drivers/iommu/iommu.c   | 27 ++---
> > >   include/linux/iommu.h   |  4 +-
> > >   8 files changed, 40 insertions(+), 165 deletions(-)
> > 
> > I really like this cleanup, but I can't help wonder if it's going in the
> > wrong direction. With SoCs often having multiple IOMMU instances and a
> > distinction between "trusted" and "untrusted" devices, then having the
> > flush-queue enabled on a per-IOMMU or per-domain basis doesn't sound
> > unreasonable to me, but this change makes it a global property.
> 
> The intent here was just to streamline the existing behaviour of stuffing a
> global property into a domain attribute then pulling it out again in the
> illusion that it was in any way per-domain. We're still checking
> dev_is_untrusted() before making an actual decision, and it's not like we
> can't add more factors at that point if we want to.

Like I say, the cleanup is great. I'm just wondering whether there's a
better way to express the complicated logic to decide whether or not to use
the flush queue than what we end up with:

if (!cookie->fq_domain && (!dev || !dev_is_untrusted(dev)) &&
domain->ops->flush_iotlb_all && !iommu_get_dma_strict())

which is mixing up globals, device properties and domain properties. The
result is that the driver code ends up just using the global to determine
whether or not to pass IO_PGTABLE_QUIRK_NON_STRICT to the page-table code,
which is a departure from the current way of doing things.

> > For example, see the recent patch from Lu Baolu:
> > 
> > https://lore.kernel.org/r/20210225061454.2864009-1-baolu...@linux.intel.com
> 
> Erm, this patch is based on that one, it's right there in the context :/

Ah, sorry, I didn't spot that! I was just trying to illustrate that this
is per-device.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 18/18] iommu: remove iommu_domain_{get, set}_attr

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:24PM +0100, Christoph Hellwig wrote:
> Remove the now unused iommu attr infrastructure.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/iommu.c | 26 --
>  include/linux/iommu.h | 36 
>  2 files changed, 62 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 17/18] iommu: remove DOMAIN_ATTR_IO_PGTABLE_CFG

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:23PM +0100, Christoph Hellwig wrote:
> Use an explicit set_pgtable_quirks method instead that just passes
> the actual quirk bitmask instead.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |  5 +-
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 64 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.h   |  2 +-
>  drivers/iommu/iommu.c   | 11 +
>  include/linux/io-pgtable.h  |  4 --
>  include/linux/iommu.h   | 12 -
>  6 files changed, 35 insertions(+), 63 deletions(-)

I'm fine with this for now, although there has been talk about passing
things other than boolean flags as page-table quirks. We can cross that
bridge when we get there, so:

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 16/18] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:22PM +0100, Christoph Hellwig wrote:
> From: Robin Murphy 
> 
> Instead make the global iommu_dma_strict paramete in iommu.c canonical by
> exporting helpers to get and set it and use those directly in the drivers.
> 
> This make sure that the iommu.strict parameter also works for the AMD and
> Intel IOMMU drivers on x86.  As those default to lazy flushing a new
> IOMMU_CMD_LINE_STRICT is used to turn the value into a tristate to
> represent the default if not overriden by an explicit parameter.
> 
> Signed-off-by: Robin Murphy .
> [ported on top of the other iommu_attr changes and added a few small
>  missing bits]
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/amd/iommu.c   | 23 +---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 50 +---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  1 -
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 27 +
>  drivers/iommu/dma-iommu.c   |  9 +--
>  drivers/iommu/intel/iommu.c | 64 -
>  drivers/iommu/iommu.c   | 27 ++---
>  include/linux/iommu.h   |  4 +-
>  8 files changed, 40 insertions(+), 165 deletions(-)

I really like this cleanup, but I can't help wonder if it's going in the
wrong direction. With SoCs often having multiple IOMMU instances and a
distinction between "trusted" and "untrusted" devices, then having the
flush-queue enabled on a per-IOMMU or per-domain basis doesn't sound
unreasonable to me, but this change makes it a global property.

For example, see the recent patch from Lu Baolu:

https://lore.kernel.org/r/20210225061454.2864009-1-baolu...@linux.intel.com

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 15/18] iommu: remove iommu_set_cmd_line_dma_api and iommu_cmd_line_dma_api

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:21PM +0100, Christoph Hellwig wrote:
> Don't obsfucate the trivial bit flag check.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/iommu.c | 23 +--
>  1 file changed, 5 insertions(+), 18 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 14/18] iommu: remove DOMAIN_ATTR_NESTING

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:20PM +0100, Christoph Hellwig wrote:
> Use an explicit enable_nesting method instead.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 43 -
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 30 +++---
>  drivers/iommu/intel/iommu.c | 31 +--
>  drivers/iommu/iommu.c   | 10 +
>  drivers/vfio/vfio_iommu_type1.c |  5 +--
>  include/linux/iommu.h   |  4 +-
>  6 files changed, 55 insertions(+), 68 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 13/18] iommu: remove DOMAIN_ATTR_GEOMETRY

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:19PM +0100, Christoph Hellwig wrote:
> The geometry information can be trivially queried from the iommu_domain
> struture.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/iommu.c   | 20 +++-
>  drivers/vfio/vfio_iommu_type1.c | 26 --
>  drivers/vhost/vdpa.c| 10 +++---
>  include/linux/iommu.h   |  1 -
>  4 files changed, 18 insertions(+), 39 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 12/18] iommu: remove DOMAIN_ATTR_PAGING

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:18PM +0100, Christoph Hellwig wrote:
> DOMAIN_ATTR_PAGING is never used.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/iommu.c | 5 -
>  include/linux/iommu.h | 1 -
>  2 files changed, 6 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 09/18] iommu/fsl_pamu: merge handle_attach_device into fsl_pamu_attach_device

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:15PM +0100, Christoph Hellwig wrote:
> No good reason to split this functionality over two functions.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/fsl_pamu_domain.c | 59 +++--
>  1 file changed, 20 insertions(+), 39 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 08/18] iommu/fsl_pamu: merge pamu_set_liodn and map_liodn

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:14PM +0100, Christoph Hellwig wrote:
> Merge the two fuctions that configure the ppaace into a single coherent
> function.  I somehow doubt we need the two pamu_config_ppaace calls,
> but keep the existing behavior just to be on the safe side.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/fsl_pamu_domain.c | 65 +
>  1 file changed, 17 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
> index 40eff4b7bc5d42..4a4944332674f7 100644
> --- a/drivers/iommu/fsl_pamu_domain.c
> +++ b/drivers/iommu/fsl_pamu_domain.c
> @@ -54,25 +54,6 @@ static int __init iommu_init_mempool(void)
>   return 0;
>  }
>  
> -/* Map the DMA window corresponding to the LIODN */
> -static int map_liodn(int liodn, struct fsl_dma_domain *dma_domain)
> -{
> - int ret;
> - struct iommu_domain_geometry *geom = _domain->iommu_domain.geometry;
> - unsigned long flags;
> -
> - spin_lock_irqsave(_lock, flags);
> - ret = pamu_config_ppaace(liodn, geom->aperture_start,
> -  geom->aperture_end - 1, ~(u32)0,
> -  0, dma_domain->snoop_id, dma_domain->stash_id,
> -  PAACE_AP_PERMS_QUERY | PAACE_AP_PERMS_UPDATE);
> - spin_unlock_irqrestore(_lock, flags);
> - if (ret)
> - pr_debug("PAACE configuration failed for liodn %d\n", liodn);
> -
> - return ret;
> -}
> -
>  static int update_liodn_stash(int liodn, struct fsl_dma_domain *dma_domain,
> u32 val)
>  {
> @@ -94,11 +75,11 @@ static int update_liodn_stash(int liodn, struct 
> fsl_dma_domain *dma_domain,
>  }
>  
>  /* Set the geometry parameters for a LIODN */
> -static int pamu_set_liodn(int liodn, struct device *dev,
> -   struct fsl_dma_domain *dma_domain,
> -   struct iommu_domain_geometry *geom_attr)
> +static int pamu_set_liodn(struct fsl_dma_domain *dma_domain, struct device 
> *dev,
> +   int liodn)
>  {
> - phys_addr_t window_addr, window_size;
> + struct iommu_domain *domain = _domain->iommu_domain;
> + struct iommu_domain_geometry *geom = >geometry;
>   u32 omi_index = ~(u32)0;
>   unsigned long flags;
>   int ret;
> @@ -110,22 +91,25 @@ static int pamu_set_liodn(int liodn, struct device *dev,
>*/
>   get_ome_index(_index, dev);
>  
> - window_addr = geom_attr->aperture_start;
> - window_size = geom_attr->aperture_end + 1;
> -
>   spin_lock_irqsave(_lock, flags);
>   ret = pamu_disable_liodn(liodn);
> - if (!ret)
> - ret = pamu_config_ppaace(liodn, window_addr, window_size, 
> omi_index,
> -  0, dma_domain->snoop_id,
> -  dma_domain->stash_id, 0);
> + if (ret)
> + goto out_unlock;
> + ret = pamu_config_ppaace(liodn, geom->aperture_start,
> +  geom->aperture_end - 1, omi_index, 0,
> +  dma_domain->snoop_id, dma_domain->stash_id, 0);
> + if (ret)
> + goto out_unlock;
> + ret = pamu_config_ppaace(liodn, geom->aperture_start,
> +  geom->aperture_end - 1, ~(u32)0,
> +  0, dma_domain->snoop_id, dma_domain->stash_id,
> +  PAACE_AP_PERMS_QUERY | PAACE_AP_PERMS_UPDATE);

There's more '+1' / '-1' confusion here with aperture_end which I'm not
managing to follow. What am I missing?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 07/18] iommu/fsl_pamu: replace DOMAIN_ATTR_FSL_PAMU_STASH with a direct call

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:13PM +0100, Christoph Hellwig wrote:
> Add a fsl_pamu_configure_l1_stash API that qman_portal can call directly
> instead of indirecting through the iommu attr API.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  arch/powerpc/include/asm/fsl_pamu_stash.h | 12 +++-
>  drivers/iommu/fsl_pamu_domain.c   | 16 +++-
>  drivers/iommu/fsl_pamu_domain.h   |  2 --
>  drivers/soc/fsl/qbman/qman_portal.c   | 18 +++---
>  include/linux/iommu.h |  1 -
>  5 files changed, 9 insertions(+), 40 deletions(-)

Heh, this thing is so over-engineered.

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 06/18] iommu/fsl_pamu: remove ->domain_window_enable

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:12PM +0100, Christoph Hellwig wrote:
> The only thing that fsl_pamu_window_enable does for the current caller
> is to fill in the prot value in the only dma_window structure, and to
> propagate a few values from the iommu_domain_geometry struture into the
> dma_window.  Remove the dma_window entirely, hardcode the prot value and
> otherwise use the iommu_domain_geometry structure instead.
> 
> Remove the now unused ->domain_window_enable iommu method.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/fsl_pamu_domain.c | 182 +++-
>  drivers/iommu/fsl_pamu_domain.h |  17 ---
>  drivers/iommu/iommu.c   |  11 --
>  drivers/soc/fsl/qbman/qman_portal.c |   7 --
>  include/linux/iommu.h   |  17 ---
>  5 files changed, 14 insertions(+), 220 deletions(-)
> 
> diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
> index e6bdd38fc18409..fd2bc88b690465 100644
> --- a/drivers/iommu/fsl_pamu_domain.c
> +++ b/drivers/iommu/fsl_pamu_domain.c
> @@ -54,34 +54,18 @@ static int __init iommu_init_mempool(void)
>   return 0;
>  }
>  
> -static phys_addr_t get_phys_addr(struct fsl_dma_domain *dma_domain, 
> dma_addr_t iova)
> -{
> - struct dma_window *win_ptr = _domain->win_arr[0];
> - struct iommu_domain_geometry *geom;
> -
> - geom = _domain->iommu_domain.geometry;
> -
> - if (win_ptr->valid)
> - return win_ptr->paddr + (iova & (win_ptr->size - 1));
> -
> - return 0;
> -}
> -
>  /* Map the DMA window corresponding to the LIODN */
>  static int map_liodn(int liodn, struct fsl_dma_domain *dma_domain)
>  {
>   int ret;
> - struct dma_window *wnd = _domain->win_arr[0];
> - phys_addr_t wnd_addr = dma_domain->iommu_domain.geometry.aperture_start;
> + struct iommu_domain_geometry *geom = _domain->iommu_domain.geometry;
>   unsigned long flags;
>  
>   spin_lock_irqsave(_lock, flags);
> - ret = pamu_config_ppaace(liodn, wnd_addr,
> -  wnd->size,
> -  ~(u32)0,
> -  wnd->paddr >> PAMU_PAGE_SHIFT,
> -  dma_domain->snoop_id, dma_domain->stash_id,
> -  wnd->prot);
> + ret = pamu_config_ppaace(liodn, geom->aperture_start,
> +  geom->aperture_end - 1, ~(u32)0,

You're passing 'geom->aperture_end - 1' as the size here, but the old code
seemed to _add_ 1:

> -static int fsl_pamu_window_enable(struct iommu_domain *domain, u32 wnd_nr,
> -   phys_addr_t paddr, u64 size, int prot)
> -{
> - struct fsl_dma_domain *dma_domain = to_fsl_dma_domain(domain);
> - struct dma_window *wnd;
> - int pamu_prot = 0;
> - int ret;
> - unsigned long flags;
> - u64 win_size;
> -
> - if (prot & IOMMU_READ)
> - pamu_prot |= PAACE_AP_PERMS_QUERY;
> - if (prot & IOMMU_WRITE)
> - pamu_prot |= PAACE_AP_PERMS_UPDATE;
> -
> - spin_lock_irqsave(_domain->domain_lock, flags);
> - if (wnd_nr > 0) {
> - pr_debug("Invalid window index\n");
> - spin_unlock_irqrestore(_domain->domain_lock, flags);
> - return -EINVAL;
> - }
> -
> - win_size = (domain->geometry.aperture_end + 1) >> ilog2(1);

here ^^ when calculating the exclusive upper bound. In other words, I think
'1ULL << 36' used to get passed to pamu_config_ppaace(). Is that an
intentional change?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 05/18] iommu/fsl_pamu: remove support for multiple windows

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:11PM +0100, Christoph Hellwig wrote:
> The only domains allocated forces use of a single window.  Remove all
> the code related to multiple window support, as well as the need for
> qman_portal to force a single window.
> 
> Remove the now unused DOMAIN_ATTR_WINDOWS iommu_attr.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/fsl_pamu.c| 264 +-
>  drivers/iommu/fsl_pamu.h|  10 +-
>  drivers/iommu/fsl_pamu_domain.c | 275 +---
>  drivers/iommu/fsl_pamu_domain.h |  12 +-
>  drivers/soc/fsl/qbman/qman_portal.c |   7 -
>  include/linux/iommu.h   |   1 -
>  6 files changed, 59 insertions(+), 510 deletions(-)

[...]

> + set_bf(ppaace->impl_attr, PAACE_IA_ATM, PAACE_ATM_WINDOW_XLATE);
> + ppaace->twbah = rpn >> 20;
> + set_bf(ppaace->win_bitfields, PAACE_WIN_TWBAL, rpn);
> + set_bf(ppaace->addr_bitfields, PAACE_AF_AP, prot);
> + set_bf(ppaace->impl_attr, PAACE_IA_WCE, 0);
> + set_bf(ppaace->addr_bitfields, PPAACE_AF_MW, 0);
>   mb();

(I wonder what on Earth that mb() is doing...)

> diff --git a/drivers/iommu/fsl_pamu_domain.h b/drivers/iommu/fsl_pamu_domain.h
> index 53d359d66fe577..b9236fb5a8f82e 100644
> --- a/drivers/iommu/fsl_pamu_domain.h
> +++ b/drivers/iommu/fsl_pamu_domain.h
> @@ -17,23 +17,13 @@ struct dma_window {
>  };
>  
>  struct fsl_dma_domain {
> - /*
> -  * Number of windows assocaited with this domain.
> -  * During domain initialization, it is set to the
> -  * the maximum number of subwindows allowed for a LIODN.
> -  * Minimum value for this is 1 indicating a single PAMU
> -  * window, without any sub windows. Value can be set/
> -  * queried by set_attr/get_attr API for DOMAIN_ATTR_WINDOWS.
> -  * Value can only be set once the geometry has been configured.
> -  */
> - u32 win_cnt;
>   /*
>* win_arr contains information of the configured
>* windows for a domain. This is allocated only
>* when the number of windows for the domain are
>* set.
>*/

The last part of this comment is now stale ^^

> - struct dma_window   *win_arr;
> + struct dma_window   win_arr[1];
>   /* list of devices associated with the domain */
>   struct list_headdevices;
>   /* dma_domain states:

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 04/18] iommu/fsl_pamu: merge iommu_alloc_dma_domain into fsl_pamu_domain_alloc

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:10PM +0100, Christoph Hellwig wrote:
> Keep the functionality to allocate the domain together.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/fsl_pamu_domain.c | 34 ++---
>  1 file changed, 10 insertions(+), 24 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 01/18] iommu: remove the unused domain_window_disable method

2021-03-30 Thread Will Deacon
On Tue, Mar 16, 2021 at 04:38:07PM +0100, Christoph Hellwig wrote:
> domain_window_disable is wired up by fsl_pamu, but never actually called.
> 
> Signed-off-by: Christoph Hellwig 
> Acked-by: Li Yang 
> ---
>  drivers/iommu/fsl_pamu_domain.c | 48 -
>  include/linux/iommu.h   |  2 --
>  2 files changed, 50 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 1/2] iommu/arm-smmu-qcom: Skip the TTBR1 quirk for db820c.

2021-03-29 Thread Will Deacon
On Fri, Mar 26, 2021 at 04:13:02PM -0700, Eric Anholt wrote:
> db820c wants to use the qcom smmu path to get HUPCF set (which keeps
> the GPU from wedging and then sometimes wedging the kernel after a
> page fault), but it doesn't have separate pagetables support yet in
> drm/msm so we can't go all the way to the TTBR1 path.

What do you mean by "doesn't have separate pagetables support yet"? The
compatible string doesn't feel like the right way to determine this.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-03-25 Thread Will Deacon
On Tue, Mar 09, 2021 at 12:10:44PM +0530, Sai Prakash Ranjan wrote:
> On 2021-02-05 17:38, Sai Prakash Ranjan wrote:
> > On 2021-02-04 03:16, Will Deacon wrote:
> > > On Tue, Feb 02, 2021 at 11:56:27AM +0530, Sai Prakash Ranjan wrote:
> > > > On 2021-02-01 23:50, Jordan Crouse wrote:
> > > > > On Mon, Feb 01, 2021 at 08:20:44AM -0800, Rob Clark wrote:
> > > > > > On Mon, Feb 1, 2021 at 3:16 AM Will Deacon  wrote:
> > > > > > > On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan 
> > > > > > > wrote:
> > > > > > > > On 2021-01-29 14:35, Will Deacon wrote:
> > > > > > > > > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan 
> > > > > > > > > wrote:
> > > > > > > > > > +#define IOMMU_LLC(1 << 6)
> > > > > > > > >
> > > > > > > > > On reflection, I'm a bit worried about exposing this because 
> > > > > > > > > I think it
> > > > > > > > > will
> > > > > > > > > introduce a mismatched virtual alias with the CPU (we don't 
> > > > > > > > > even have a
> > > > > > > > > MAIR
> > > > > > > > > set up for this memory type). Now, we also have that issue 
> > > > > > > > > for the PTW,
> > > > > > > > > but
> > > > > > > > > since we always use cache maintenance (i.e. the streaming 
> > > > > > > > > API) for
> > > > > > > > > publishing the page-tables to a non-coheren walker, it works 
> > > > > > > > > out.
> > > > > > > > > However,
> > > > > > > > > if somebody expects IOMMU_LLC to be coherent with a DMA API 
> > > > > > > > > coherent
> > > > > > > > > allocation, then they're potentially in for a nasty surprise 
> > > > > > > > > due to the
> > > > > > > > > mismatched outer-cacheability attributes.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Can't we add the syscached memory type similar to what is done 
> > > > > > > > on android?
> > > > > > >
> > > > > > > Maybe. How does the GPU driver map these things on the CPU side?
> > > > > >
> > > > > > Currently we use writecombine mappings for everything, although 
> > > > > > there
> > > > > > are some cases that we'd like to use cached (but have not merged
> > > > > > patches that would give userspace a way to flush/invalidate)
> > > > > >
> > > > >
> > > > > LLC/system cache doesn't have a relationship with the CPU cache.  Its
> > > > > just a
> > > > > little accelerator that sits on the connection from the GPU to DDR and
> > > > > caches
> > > > > accesses. The hint that Sai is suggesting is used to mark the buffers 
> > > > > as
> > > > > 'no-write-allocate' to prevent GPU write operations from being cached 
> > > > > in
> > > > > the LLC
> > > > > which a) isn't interesting and b) takes up cache space for read
> > > > > operations.
> > > > >
> > > > > Its easiest to think of the LLC as a bonus accelerator that has no 
> > > > > cost
> > > > > for
> > > > > us to use outside of the unfortunate per buffer hint.
> > > > >
> > > > > We do have to worry about the CPU cache w.r.t I/O coherency (which is 
> > > > > a
> > > > > different hint) and in that case we have all of concerns that Will
> > > > > identified.
> > > > >
> > > > 
> > > > For mismatched outer cacheability attributes which Will
> > > > mentioned, I was
> > > > referring to [1] in android kernel.
> > > 
> > > I've lost track of the conversation here :/
> > > 
> > > When the GPU has a buffer mapped with IOMMU_LLC, is the buffer also
> > > mapped
> > > into the CPU and with what attributes? Rob said "writecombine for
> > > everything" -- does that mean ioremap_wc() / MEMREMAP_WC?
> > > 
> > 
> > Rob answered this.
> > 
> > > Finally, we need to be careful when we use the word "hint" as
> > > "allocation
> > > hint" has a specific meaning in the architecture, and if we only
> > > mismatch on
> > > those then we're actually ok. But I think IOMMU_LLC is more than
> > > just a
> > > hint, since it actually drives eviction policy (i.e. it enables
> > > writeback).
> > > 
> > > Sorry for the pedantry, but I just want to make sure we're all talking
> > > about the same things!
> > > 
> > 
> > Sorry for the confusion which probably was caused by my mentioning of
> > android, NWA(no write allocate) is an allocation hint which we can
> > ignore
> > for now as it is not introduced yet in upstream.
> > 
> 
> Any chance of taking this forward? We do not want to miss out on small fps
> gain when the product gets released.

Do we have a solution to the mismatched virtual alias?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 16/17] iommu: remove DOMAIN_ATTR_IO_PGTABLE_CFG

2021-03-05 Thread Will Deacon
On Thu, Mar 04, 2021 at 03:11:08PM -0800, Rob Clark wrote:
> On Thu, Mar 4, 2021 at 7:48 AM Robin Murphy  wrote:
> >
> > On 2021-03-01 08:42, Christoph Hellwig wrote:
> > > Signed-off-by: Christoph Hellwig 
> >
> > Moreso than the previous patch, where the feature is at least relatively
> > generic (note that there's a bunch of in-flight development around
> > DOMAIN_ATTR_NESTING), I'm really not convinced that it's beneficial to
> > bloat the generic iommu_ops structure with private driver-specific
> > interfaces. The attribute interface is a great compromise for these
> > kinds of things, and you can easily add type-checked wrappers around it
> > for external callers (maybe even make the actual attributes internal
> > between the IOMMU core and drivers) if that's your concern.
> 
> I suppose if this is *just* for the GPU we could move it into 
> adreno_smmu_priv..
> 
> But one thing I'm not sure about is whether
> IO_PGTABLE_QUIRK_ARM_OUTER_WBWA is something that other devices
> *should* be using as well, but just haven't gotten around to yet.

The intention is certainly that this would be a place to collate per-domain
pgtable quirks, so I'd prefer not to tie that to the GPU.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-02-03 Thread Will Deacon
On Tue, Feb 02, 2021 at 11:56:27AM +0530, Sai Prakash Ranjan wrote:
> On 2021-02-01 23:50, Jordan Crouse wrote:
> > On Mon, Feb 01, 2021 at 08:20:44AM -0800, Rob Clark wrote:
> > > On Mon, Feb 1, 2021 at 3:16 AM Will Deacon  wrote:
> > > > On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan wrote:
> > > > > On 2021-01-29 14:35, Will Deacon wrote:
> > > > > > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> > > > > > > +#define IOMMU_LLC(1 << 6)
> > > > > >
> > > > > > On reflection, I'm a bit worried about exposing this because I 
> > > > > > think it
> > > > > > will
> > > > > > introduce a mismatched virtual alias with the CPU (we don't even 
> > > > > > have a
> > > > > > MAIR
> > > > > > set up for this memory type). Now, we also have that issue for the 
> > > > > > PTW,
> > > > > > but
> > > > > > since we always use cache maintenance (i.e. the streaming API) for
> > > > > > publishing the page-tables to a non-coheren walker, it works out.
> > > > > > However,
> > > > > > if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
> > > > > > allocation, then they're potentially in for a nasty surprise due to 
> > > > > > the
> > > > > > mismatched outer-cacheability attributes.
> > > > > >
> > > > >
> > > > > Can't we add the syscached memory type similar to what is done on 
> > > > > android?
> > > >
> > > > Maybe. How does the GPU driver map these things on the CPU side?
> > > 
> > > Currently we use writecombine mappings for everything, although there
> > > are some cases that we'd like to use cached (but have not merged
> > > patches that would give userspace a way to flush/invalidate)
> > > 
> > 
> > LLC/system cache doesn't have a relationship with the CPU cache.  Its
> > just a
> > little accelerator that sits on the connection from the GPU to DDR and
> > caches
> > accesses. The hint that Sai is suggesting is used to mark the buffers as
> > 'no-write-allocate' to prevent GPU write operations from being cached in
> > the LLC
> > which a) isn't interesting and b) takes up cache space for read
> > operations.
> > 
> > Its easiest to think of the LLC as a bonus accelerator that has no cost
> > for
> > us to use outside of the unfortunate per buffer hint.
> > 
> > We do have to worry about the CPU cache w.r.t I/O coherency (which is a
> > different hint) and in that case we have all of concerns that Will
> > identified.
> > 
> 
> For mismatched outer cacheability attributes which Will mentioned, I was
> referring to [1] in android kernel.

I've lost track of the conversation here :/

When the GPU has a buffer mapped with IOMMU_LLC, is the buffer also mapped
into the CPU and with what attributes? Rob said "writecombine for
everything" -- does that mean ioremap_wc() / MEMREMAP_WC?

Finally, we need to be careful when we use the word "hint" as "allocation
hint" has a specific meaning in the architecture, and if we only mismatch on
those then we're actually ok. But I think IOMMU_LLC is more than just a
hint, since it actually drives eviction policy (i.e. it enables writeback).

Sorry for the pedantry, but I just want to make sure we're all talking
about the same things!

Cheers,

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-02-01 Thread Will Deacon
On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan wrote:
> On 2021-01-29 14:35, Will Deacon wrote:
> > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> > > Add a new page protection flag IOMMU_LLC which can be used
> > > by non-coherent masters to set cacheable memory attributes
> > > for an outer level of cache called as last-level cache or
> > > system cache. Initial user of this page protection flag is
> > > the adreno gpu and then can later be used by other clients
> > > such as video where this can be used for per-buffer based
> > > mapping.
> > > 
> > > Signed-off-by: Sai Prakash Ranjan 
> > > ---
> > >  drivers/iommu/io-pgtable-arm.c | 3 +++
> > >  include/linux/iommu.h  | 6 ++
> > >  2 files changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/iommu/io-pgtable-arm.c
> > > b/drivers/iommu/io-pgtable-arm.c
> > > index 7439ee7fdcdb..ebe653ef601b 100644
> > > --- a/drivers/iommu/io-pgtable-arm.c
> > > +++ b/drivers/iommu/io-pgtable-arm.c
> > > @@ -415,6 +415,9 @@ static arm_lpae_iopte
> > > arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> > >   else if (prot & IOMMU_CACHE)
> > >   pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
> > >   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> > > + else if (prot & IOMMU_LLC)
> > > + pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
> > > + << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> > >   }
> > > 
> > >   if (prot & IOMMU_CACHE)
> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > index ffaa389ea128..1f82057df531 100644
> > > --- a/include/linux/iommu.h
> > > +++ b/include/linux/iommu.h
> > > @@ -31,6 +31,12 @@
> > >   * if the IOMMU page table format is equivalent.
> > >   */
> > >  #define IOMMU_PRIV   (1 << 5)
> > > +/*
> > > + * Non-coherent masters can use this page protection flag to set
> > > cacheable
> > > + * memory attributes for only a transparent outer level of cache,
> > > also known as
> > > + * the last-level or system cache.
> > > + */
> > > +#define IOMMU_LLC(1 << 6)
> > 
> > On reflection, I'm a bit worried about exposing this because I think it
> > will
> > introduce a mismatched virtual alias with the CPU (we don't even have a
> > MAIR
> > set up for this memory type). Now, we also have that issue for the PTW,
> > but
> > since we always use cache maintenance (i.e. the streaming API) for
> > publishing the page-tables to a non-coheren walker, it works out.
> > However,
> > if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
> > allocation, then they're potentially in for a nasty surprise due to the
> > mismatched outer-cacheability attributes.
> > 
> 
> Can't we add the syscached memory type similar to what is done on android?

Maybe. How does the GPU driver map these things on the CPU side?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-01-29 Thread Will Deacon
On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> Add a new page protection flag IOMMU_LLC which can be used
> by non-coherent masters to set cacheable memory attributes
> for an outer level of cache called as last-level cache or
> system cache. Initial user of this page protection flag is
> the adreno gpu and then can later be used by other clients
> such as video where this can be used for per-buffer based
> mapping.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/io-pgtable-arm.c | 3 +++
>  include/linux/iommu.h  | 6 ++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 7439ee7fdcdb..ebe653ef601b 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
> arm_lpae_io_pgtable *data,
>   else if (prot & IOMMU_CACHE)
>   pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> + else if (prot & IOMMU_LLC)
> + pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
> + << ARM_LPAE_PTE_ATTRINDX_SHIFT);
>   }
>  
>   if (prot & IOMMU_CACHE)
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index ffaa389ea128..1f82057df531 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -31,6 +31,12 @@
>   * if the IOMMU page table format is equivalent.
>   */
>  #define IOMMU_PRIV   (1 << 5)
> +/*
> + * Non-coherent masters can use this page protection flag to set cacheable
> + * memory attributes for only a transparent outer level of cache, also known 
> as
> + * the last-level or system cache.
> + */
> +#define IOMMU_LLC(1 << 6)

On reflection, I'm a bit worried about exposing this because I think it will
introduce a mismatched virtual alias with the CPU (we don't even have a MAIR
set up for this memory type). Now, we also have that issue for the PTW, but
since we always use cache maintenance (i.e. the streaming API) for
publishing the page-tables to a non-coheren walker, it works out. However,
if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
allocation, then they're potentially in for a nasty surprise due to the
mismatched outer-cacheability attributes.

So I can take patch (1) as a trivial rename, but unfortunately I think this
needs more thought before exposing it beyond the PTW.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v2 5/7] drm/msm: Add dependency on io-pgtable-arm format module

2021-01-18 Thread Will Deacon
On Mon, Jan 18, 2021 at 01:16:03PM -0800, Rob Clark wrote:
> On Mon, Dec 21, 2020 at 4:44 PM Isaac J. Manjarres
>  wrote:
> >
> > The MSM DRM driver depends on the availability of the ARM LPAE io-pgtable
> > format code to work properly. In preparation for having the io-pgtable
> > formats as modules, add a "pre" dependency with MODULE_SOFTDEP() to
> > ensure that the io-pgtable-arm format module is loaded before loading
> > the MSM DRM driver module.
> >
> > Signed-off-by: Isaac J. Manjarres 
> 
> Thanks, I've queued this up locally

I don't plan to make the io-pgtable code modular, so please drop this patch.

https://lore.kernel.org/r/20210106123428.GA1798@willie-the-truck

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH RESEND 0/7] iommu: Permit modular builds of io-pgtable drivers

2021-01-06 Thread Will Deacon
On Mon, Jan 04, 2021 at 11:36:38PM -0800, Isaac J. Manjarres wrote:
> The goal of the Generic Kernel Image (GKI) effort is to have a common
> kernel image that works across multiple Android devices. This involves
> generating a kernel image that has core features integrated into it,
> while SoC specific functionality can be added to the kernel for the
> device as a module.
> 
> Along with modularizing IOMMU drivers, this also means building the
> io-pgtable code as modules, which allows for SoC vendors to only include
> the io-pgtable implementations that they use. For example, GKI for arm64
> must include support for both the IOMMU ARM LPAE/V7S formats at the
> moment. Having the code for both formats as modules allows SoC vendors
> to only provide the page table format that they use, along with their
> IOMMU driver.

Why is this desirable for upstream? This code isn't especially large, and
the formats we support are largely architectural, meaning that they are
shared between different IOMMU drivers. I think that making this modular
just means that out-of-tree modifications are likely to generate page-tables
which are specific to a particular IOMMU, and lead to horrible problems
(crashes and data corruption) if another IOMMU driver tries to use them.

Please can you upstream whatever changes you want to make instead?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv10 0/9] System Cache support for GPU and required SMMU support

2020-11-25 Thread Will Deacon
On Wed, 25 Nov 2020 12:30:09 +0530, Sai Prakash Ranjan wrote:
> Some hardware variants contain a system cache or the last level
> cache(llc). This cache is typically a large block which is shared
> by multiple clients on the SOC. GPU uses the system cache to cache
> both the GPU data buffers(like textures) as well the SMMU pagetables.
> This helps with improved render performance as well as lower power
> consumption by reducing the bus traffic to the system memory.
> 
> [...]

Applied first two patches on a shared branch for Rob:

arm64 (for-next/iommu/io-pgtable-domain-attr), thanks!

[1/9] iommu/io-pgtable: Add a domain attribute for pagetable configuration
  https://git.kernel.org/arm64/c/a7656ecf825a
[2/9] iommu/io-pgtable-arm: Add support to use system cache
  https://git.kernel.org/arm64/c/e67890c97944

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv10 0/9] System Cache support for GPU and required SMMU support

2020-11-25 Thread Will Deacon
On Wed, 25 Nov 2020 12:30:09 +0530, Sai Prakash Ranjan wrote:
> Some hardware variants contain a system cache or the last level
> cache(llc). This cache is typically a large block which is shared
> by multiple clients on the SOC. GPU uses the system cache to cache
> both the GPU data buffers(like textures) as well the SMMU pagetables.
> This helps with improved render performance as well as lower power
> consumption by reducing the bus traffic to the system memory.
> 
> [...]

Applied the SMMU bits to arm64 (for-next/iommu/arm-smmu), thanks!

[3/9] iommu/arm-smmu: Add support for pagetable config domain attribute
  https://git.kernel.org/arm64/c/c99110a865a3
[4/9] iommu/arm-smmu: Move non-strict mode to use io_pgtable_domain_attr
  https://git.kernel.org/arm64/c/12bc36793fd6

[8/9] iommu: arm-smmu-impl: Use table to list QCOM implementations
  https://git.kernel.org/arm64/c/00597f9ff5ec
[9/9] iommu: arm-smmu-impl: Add a space before open parenthesis
  https://git.kernel.org/arm64/c/7f575a6087f4

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv8 0/8] System Cache support for GPU and required SMMU support

2020-11-24 Thread Will Deacon
On Tue, Nov 24, 2020 at 11:05:39AM -0800, Rob Clark wrote:
> On Tue, Nov 24, 2020 at 3:10 AM Will Deacon  wrote:
> > On Tue, Nov 24, 2020 at 09:32:54AM +0530, Sai Prakash Ranjan wrote:
> > > On 2020-11-24 00:52, Rob Clark wrote:
> > > > On Mon, Nov 23, 2020 at 9:01 AM Sai Prakash Ranjan
> > > >  wrote:
> > > > > On 2020-11-23 20:51, Will Deacon wrote:
> > > > > > Modulo some minor comments I've made, this looks good to me. What is
> > > > > > the
> > > > > > plan for merging it? I can take the IOMMU parts, but patches 4-6 
> > > > > > touch
> > > > > > the
> > > > > > MSM GPU driver and I'd like to avoid conflicts with that.
> > > > > >
> > > > >
> > > > > SMMU bits are pretty much independent and GPU relies on the domain
> > > > > attribute
> > > > > and the quirk exposed, so as long as SMMU changes go in first it
> > > > > should
> > > > > be good.
> > > > > Rob?
> > > >
> > > > I suppose one option would be to split out the patch that adds the
> > > > attribute into it's own patch, and merge that both thru drm and iommu?
> > > >
> > >
> > > Ok I can split out domain attr and quirk into its own patch if Will is
> > > fine with that approach.
> >
> > Why don't I just queue the first two patches on their own branch and we
> > both pull that?
> 
> Ok, that works for me.  I normally base msm-next on -rc1 but I guess
> as long as we base the branch on the older or our two -next branches,
> that should work out nicely

Turns out we're getting a v10 of Sai's stuff, so I've asked him to split
patch two up anyway. Then I'll make a branch based on -rc1 that we can
both pull.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv9 2/8] iommu/arm-smmu: Add domain attribute for pagetable configuration

2020-11-24 Thread Will Deacon
On Mon, Nov 23, 2020 at 10:35:55PM +0530, Sai Prakash Ranjan wrote:
> Add iommu domain attribute for pagetable configuration which
> initially will be used to set quirks like for system cache aka
> last level cache to be used by client drivers like GPU to set
> right attributes for caching the hardware pagetables into the
> system cache and later can be extended to include other page
> table configuration data.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 20 
>  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
>  include/linux/io-pgtable.h|  4 
>  include/linux/iommu.h |  1 +
>  4 files changed, 26 insertions(+)

Given that we're heading for a v10 to address my comments on patch 3,
then I guess you may as well split this into two patches so that I can
share just the atttibute with Rob rather than the driver parts.

Please keep it all as one series though, with the common parts at the
beginning, and I'll figure it out.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv9 3/8] iommu/arm-smmu: Move non-strict mode to use io_pgtable_domain_attr

2020-11-24 Thread Will Deacon
On Mon, Nov 23, 2020 at 10:35:56PM +0530, Sai Prakash Ranjan wrote:
> Now that we have a struct io_pgtable_domain_attr with quirks,
> use that for non_strict mode as well thereby removing the need
> for more members of arm_smmu_domain in the future.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 8 +++-
>  drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 -
>  2 files changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 4b9b10fe50ed..f56f266ebdf7 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -786,9 +786,6 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   goto out_clear_smmu;
>   }
>  
> - if (smmu_domain->non_strict)
> - pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> -
>   if (smmu_domain->pgtbl_cfg.quirks)
>   pgtbl_cfg.quirks |= smmu_domain->pgtbl_cfg.quirks;
>  
> @@ -1527,7 +1524,8 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>   case IOMMU_DOMAIN_DMA:
>   switch (attr) {
>   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> - *(int *)data = smmu_domain->non_strict;
> + if (smmu_domain->pgtbl_cfg.quirks & 
> IO_PGTABLE_QUIRK_NON_STRICT)
> + *(int *)data = smmu_domain->pgtbl_cfg.quirks;

I still don't think this is right :(
We need to set *data to 1 or 0 depending on whether or not the non-strict
quirk is set, i.e:

bool non_strict = smmu_domain->pgtbl_cfg.quirks & 
IO_PGTABLE_QUIRK_NON_STRICT;
*(int *)data = non_strict;

Your code above leaves *data uninitialised if non_strict is not set.

>   return 0;
>   default:
>   return -ENODEV;
> @@ -1578,7 +1576,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain 
> *domain,
>   case IOMMU_DOMAIN_DMA:
>   switch (attr) {
>   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> - smmu_domain->non_strict = *(int *)data;
> + smmu_domain->pgtbl_cfg.quirks |= 
> IO_PGTABLE_QUIRK_NON_STRICT;

And this is broken because if *data is 0, then you _set_ the quirk, which is
the opposite of what we should be doing.

In other words, although the implementation has changed, the semantics have
not.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv8 0/8] System Cache support for GPU and required SMMU support

2020-11-24 Thread Will Deacon
On Tue, Nov 24, 2020 at 09:32:54AM +0530, Sai Prakash Ranjan wrote:
> On 2020-11-24 00:52, Rob Clark wrote:
> > On Mon, Nov 23, 2020 at 9:01 AM Sai Prakash Ranjan
> >  wrote:
> > > 
> > > On 2020-11-23 20:51, Will Deacon wrote:
> > > > On Tue, Nov 17, 2020 at 08:00:39PM +0530, Sai Prakash Ranjan wrote:
> > > >> Some hardware variants contain a system cache or the last level
> > > >> cache(llc). This cache is typically a large block which is shared
> > > >> by multiple clients on the SOC. GPU uses the system cache to cache
> > > >> both the GPU data buffers(like textures) as well the SMMU pagetables.
> > > >> This helps with improved render performance as well as lower power
> > > >> consumption by reducing the bus traffic to the system memory.
> > > >>
> > > >> The system cache architecture allows the cache to be split into slices
> > > >> which then be used by multiple SOC clients. This patch series is an
> > > >> effort to enable and use two of those slices preallocated for the GPU,
> > > >> one for the GPU data buffers and another for the GPU SMMU hardware
> > > >> pagetables.
> > > >>
> > > >> Patch 1 - Patch 6 adds system cache support in SMMU and GPU driver.
> > > >> Patch 7 and 8 are minor cleanups for arm-smmu impl.
> > > >>
> > > >> Changes in v8:
> > > >>  * Introduce a generic domain attribute for pagetable config (Will)
> > > >>  * Rename quirk to more generic IO_PGTABLE_QUIRK_ARM_OUTER_WBWA (Will)
> > > >>  * Move non-strict mode to use new struct domain_attr_io_pgtbl_config
> > > >> (Will)
> > > >
> > > > Modulo some minor comments I've made, this looks good to me. What is
> > > > the
> > > > plan for merging it? I can take the IOMMU parts, but patches 4-6 touch
> > > > the
> > > > MSM GPU driver and I'd like to avoid conflicts with that.
> > > >
> > > 
> > > SMMU bits are pretty much independent and GPU relies on the domain
> > > attribute
> > > and the quirk exposed, so as long as SMMU changes go in first it
> > > should
> > > be good.
> > > Rob?
> > 
> > I suppose one option would be to split out the patch that adds the
> > attribute into it's own patch, and merge that both thru drm and iommu?
> > 
> 
> Ok I can split out domain attr and quirk into its own patch if Will is
> fine with that approach.

Why don't I just queue the first two patches on their own branch and we
both pull that?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv8 0/8] System Cache support for GPU and required SMMU support

2020-11-23 Thread Will Deacon
On Tue, Nov 17, 2020 at 08:00:39PM +0530, Sai Prakash Ranjan wrote:
> Some hardware variants contain a system cache or the last level
> cache(llc). This cache is typically a large block which is shared
> by multiple clients on the SOC. GPU uses the system cache to cache
> both the GPU data buffers(like textures) as well the SMMU pagetables.
> This helps with improved render performance as well as lower power
> consumption by reducing the bus traffic to the system memory.
> 
> The system cache architecture allows the cache to be split into slices
> which then be used by multiple SOC clients. This patch series is an
> effort to enable and use two of those slices preallocated for the GPU,
> one for the GPU data buffers and another for the GPU SMMU hardware
> pagetables.
> 
> Patch 1 - Patch 6 adds system cache support in SMMU and GPU driver.
> Patch 7 and 8 are minor cleanups for arm-smmu impl.
> 
> Changes in v8:
>  * Introduce a generic domain attribute for pagetable config (Will)
>  * Rename quirk to more generic IO_PGTABLE_QUIRK_ARM_OUTER_WBWA (Will)
>  * Move non-strict mode to use new struct domain_attr_io_pgtbl_config (Will)

Modulo some minor comments I've made, this looks good to me. What is the
plan for merging it? I can take the IOMMU parts, but patches 4-6 touch the
MSM GPU driver and I'd like to avoid conflicts with that.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv8 3/8] iommu/arm-smmu: Move non-strict mode to use domain_attr_io_pgtbl_cfg

2020-11-23 Thread Will Deacon
On Tue, Nov 17, 2020 at 08:00:42PM +0530, Sai Prakash Ranjan wrote:
> Now that we have a struct domain_attr_io_pgtbl_cfg with quirks,
> use that for non_strict mode as well thereby removing the need
> for more members of arm_smmu_domain in the future.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 7 ++-
>  drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 -
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 7b05782738e2..5f066a1b7221 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -786,9 +786,6 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   goto out_clear_smmu;
>   }
>  
> - if (smmu_domain->non_strict)
> - pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> -
>   if (smmu_domain->pgtbl_cfg.quirks)
>   pgtbl_cfg.quirks |= smmu_domain->pgtbl_cfg.quirks;
>  
> @@ -1527,7 +1524,7 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>   case IOMMU_DOMAIN_DMA:
>   switch (attr) {
>   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> - *(int *)data = smmu_domain->non_strict;
> + *(int *)data = smmu_domain->pgtbl_cfg.quirks;

Probably better to compare with IO_PGTABLE_QUIRK_NON_STRICT here even though
we only support this one quirk for DMA domains atm.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv8 2/8] iommu/arm-smmu: Add domain attribute for pagetable configuration

2020-11-23 Thread Will Deacon
On Tue, Nov 17, 2020 at 08:00:41PM +0530, Sai Prakash Ranjan wrote:
> Add iommu domain attribute for pagetable configuration which
> initially will be used to set quirks like for system cache aka
> last level cache to be used by client drivers like GPU to set
> right attributes for caching the hardware pagetables into the
> system cache and later can be extended to include other page
> table configuration data.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 25 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
>  include/linux/io-pgtable.h|  4 
>  include/linux/iommu.h |  1 +
>  4 files changed, 31 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 0f28a8614da3..7b05782738e2 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   if (smmu_domain->non_strict)
>   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>  
> + if (smmu_domain->pgtbl_cfg.quirks)
> + pgtbl_cfg.quirks |= smmu_domain->pgtbl_cfg.quirks;
> +
>   pgtbl_ops = alloc_io_pgtable_ops(fmt, _cfg, smmu_domain);
>   if (!pgtbl_ops) {
>   ret = -ENOMEM;
> @@ -1511,6 +1514,12 @@ static int arm_smmu_domain_get_attr(struct 
> iommu_domain *domain,
>   case DOMAIN_ATTR_NESTING:
>   *(int *)data = (smmu_domain->stage == 
> ARM_SMMU_DOMAIN_NESTED);
>   return 0;
> + case DOMAIN_ATTR_IO_PGTABLE_CFG: {
> + struct domain_attr_io_pgtbl_cfg *pgtbl_cfg = data;
> + *pgtbl_cfg = smmu_domain->pgtbl_cfg;
> +
> + return 0;
> + }
>   default:
>   return -ENODEV;
>   }
> @@ -1551,6 +1560,22 @@ static int arm_smmu_domain_set_attr(struct 
> iommu_domain *domain,
>   else
>   smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>   break;
> + case DOMAIN_ATTR_IO_PGTABLE_CFG: {
> + struct domain_attr_io_pgtbl_cfg *pgtbl_cfg = data;
> +
> + if (smmu_domain->smmu) {
> + ret = -EPERM;
> + goto out_unlock;
> + }
> +
> + if (!pgtbl_cfg) {

Do we really need to check this? If somebody passed us a NULL pointer then
they have a bug and we don't check this for other domain attributes afaict.

> + ret = -ENODEV;
> + goto out_unlock;
> + }
> +
> + smmu_domain->pgtbl_cfg = *pgtbl_cfg;
> + break;
> + }
>   default:
>   ret = -ENODEV;
>   }
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 04288b6fc619..18fbed376afb 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -364,6 +364,7 @@ enum arm_smmu_domain_stage {
>  struct arm_smmu_domain {
>   struct arm_smmu_device  *smmu;
>   struct io_pgtable_ops   *pgtbl_ops;
> + struct domain_attr_io_pgtbl_cfg pgtbl_cfg;
>   const struct iommu_flush_ops*flush_ops;
>   struct arm_smmu_cfg cfg;
>   enum arm_smmu_domain_stage  stage;
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index a9a2c59fab37..686b37d48743 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -212,6 +212,10 @@ struct io_pgtable {
>  
>  #define io_pgtable_ops_to_pgtable(x) container_of((x), struct io_pgtable, 
> ops)
>  
> +struct domain_attr_io_pgtbl_cfg {
> + unsigned long quirks;
> +};

nit: Can you rename this to 'struct io_pgtable_domain_attr' please?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv8 1/8] iommu/io-pgtable-arm: Add support to use system cache

2020-11-23 Thread Will Deacon
On Tue, Nov 17, 2020 at 08:00:40PM +0530, Sai Prakash Ranjan wrote:
> Add a quirk IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to override
> the attributes set in TCR for the page table walker when
> using system cache.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/io-pgtable-arm.c | 10 --
>  include/linux/io-pgtable.h |  4 
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index a7a9bc08dcd1..7c9ea9d7874a 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -761,7 +761,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
> void *cookie)
>  
>   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
>   IO_PGTABLE_QUIRK_NON_STRICT |
> - IO_PGTABLE_QUIRK_ARM_TTBR1))
> + IO_PGTABLE_QUIRK_ARM_TTBR1 |
> + IO_PGTABLE_QUIRK_ARM_OUTER_WBWA))
>   return NULL;
>  
>   data = arm_lpae_alloc_pgtable(cfg);
> @@ -773,10 +774,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg 
> *cfg, void *cookie)
>   tcr->sh = ARM_LPAE_TCR_SH_IS;
>   tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
>   tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA)
> + goto out_free_data;
>   } else {
>   tcr->sh = ARM_LPAE_TCR_SH_OS;
>   tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> - tcr->orgn = ARM_LPAE_TCR_RGN_NC;
> + if (!(cfg->quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA))
> + tcr->orgn = ARM_LPAE_TCR_RGN_NC;
> + else
> + tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
>   }
>  
>   tg1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1;
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 4cde111e425b..a9a2c59fab37 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -86,6 +86,9 @@ struct io_pgtable_cfg {
>*
>* IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
>*  for use in the upper half of a split address space.
> +  *
> +  * IO_PGTABLE_QUIRK_ARM_OUTER_WBWA: Override the attributes set in TCR 
> for
> +  *  the page table walker when using system cache.

Please can you reword this to say:

  "Override the outer-cacheability attributes set in the TCR for a non-coherent
   page-table walker."

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache

2020-11-12 Thread Will Deacon
On Wed, Nov 11, 2020 at 11:32:42AM +0530, Sai Prakash Ranjan wrote:
> On 2020-11-10 17:48, Will Deacon wrote:
> > On Fri, Oct 30, 2020 at 02:53:08PM +0530, Sai Prakash Ranjan wrote:
> > > Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
> > > attributes set in TCR for the page table walker when
> > > using system cache.
> > > 
> > > Signed-off-by: Sai Prakash Ranjan 
> > > ---
> > >  drivers/iommu/io-pgtable-arm.c | 7 ++-
> > >  include/linux/io-pgtable.h | 4 
> > >  2 files changed, 10 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/iommu/io-pgtable-arm.c
> > > b/drivers/iommu/io-pgtable-arm.c
> > > index a7a9bc08dcd1..a356caf1683a 100644
> > > --- a/drivers/iommu/io-pgtable-arm.c
> > > +++ b/drivers/iommu/io-pgtable-arm.c
> > > @@ -761,7 +761,8 @@ arm_64_lpae_alloc_pgtable_s1(struct
> > > io_pgtable_cfg *cfg, void *cookie)
> > > 
> > >   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
> > >   IO_PGTABLE_QUIRK_NON_STRICT |
> > > - IO_PGTABLE_QUIRK_ARM_TTBR1))
> > > + IO_PGTABLE_QUIRK_ARM_TTBR1 |
> > > + IO_PGTABLE_QUIRK_SYS_CACHE))
> > >   return NULL;
> > > 
> > >   data = arm_lpae_alloc_pgtable(cfg);
> > > @@ -773,6 +774,10 @@ arm_64_lpae_alloc_pgtable_s1(struct
> > > io_pgtable_cfg *cfg, void *cookie)
> > >   tcr->sh = ARM_LPAE_TCR_SH_IS;
> > >   tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
> > >   tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> > > + } else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
> > > + tcr->sh = ARM_LPAE_TCR_SH_OS;
> > > + tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> > > + tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> > 
> > Given that this only applies in the case where then page-table walker is
> > non-coherent, I think we'd be better off renaming the quirk to something
> > like IO_PGTABLE_QUIRK_ARM_OUTER_WBWA and then rejecting it in the
> > non-coherent case.
> > 
> 
> Do you mean like below?
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index a7a9bc08dcd1..94de1f71db42 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -776,7 +776,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg
> *cfg, void *cookie)
> } else {
> tcr->sh = ARM_LPAE_TCR_SH_OS;
> tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> -   tcr->orgn = ARM_LPAE_TCR_RGN_NC;
> +   if (!(cfg->quirks & IO_PGTABLE_QUIRK_ARM_OUTER_WBWA))
> +   tcr->orgn = ARM_LPAE_TCR_RGN_NC;
> +   else
> +   tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;

Yes, but rejecting the quirk if the walker is coherent (I accidentally said
"non-coherent" earlier on).

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache

2020-11-12 Thread Will Deacon
On Wed, Nov 11, 2020 at 12:10:50PM +0530, Sai Prakash Ranjan wrote:
> On 2020-11-10 17:48, Will Deacon wrote:
> > On Fri, Oct 30, 2020 at 02:53:09PM +0530, Sai Prakash Ranjan wrote:
> > > Add iommu domain attribute for using system cache aka last level
> > > cache by client drivers like GPU to set right attributes for caching
> > > the hardware pagetables into the system cache.
> > > 
> > > Signed-off-by: Sai Prakash Ranjan 
> > > ---
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
> > >  include/linux/iommu.h |  1 +
> > >  3 files changed, 19 insertions(+)
> > > 
> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > index b1cf8f0abc29..070d13f80c7e 100644
> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct
> > > iommu_domain *domain,
> > >   if (smmu_domain->non_strict)
> > >   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > > 
> > > + if (smmu_domain->sys_cache)
> > > + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
> > > +
> > >   pgtbl_ops = alloc_io_pgtable_ops(fmt, _cfg, smmu_domain);
> > >   if (!pgtbl_ops) {
> > >   ret = -ENOMEM;
> > > @@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct
> > > iommu_domain *domain,
> > >   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
> > >   *(int *)data = smmu_domain->non_strict;
> > >   return 0;
> > > + case DOMAIN_ATTR_SYS_CACHE:
> > > + *((int *)data) = smmu_domain->sys_cache;
> > > + return 0;
> > >   default:
> > >   return -ENODEV;
> > >   }
> > > @@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct
> > > iommu_domain *domain,
> > >   else
> > >   smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
> > >   break;
> > > + case DOMAIN_ATTR_SYS_CACHE:
> > > + if (smmu_domain->smmu) {
> > > + ret = -EPERM;
> > > + goto out_unlock;
> > > + }
> > > +
> > > + if (*((int *)data))
> > > + smmu_domain->sys_cache = true;
> > > + else
> > > + smmu_domain->sys_cache = false;
> > > + break;
> > >   default:
> > >   ret = -ENODEV;
> > >   }
> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > index 885840f3bec8..dfc44d806671 100644
> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> > > @@ -373,6 +373,7 @@ struct arm_smmu_domain {
> > >   struct mutexinit_mutex; /* Protects smmu pointer */
> > >   spinlock_t  cb_lock; /* Serialises ATS1* ops and 
> > > TLB syncs */
> > >   struct iommu_domain domain;
> > > + boolsys_cache;
> > >  };
> > > 
> > >  struct arm_smmu_master_cfg {
> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > index b95a6f8db6ff..4f4bb9c6f8f6 100644
> > > --- a/include/linux/iommu.h
> > > +++ b/include/linux/iommu.h
> > > @@ -118,6 +118,7 @@ enum iommu_attr {
> > >   DOMAIN_ATTR_FSL_PAMUV1,
> > >   DOMAIN_ATTR_NESTING,/* two stages of translation */
> > >   DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > + DOMAIN_ATTR_SYS_CACHE,
> > 
> > I think you're trying to make this look generic, but it's really not.
> > If we need to funnel io-pgtable quirks through domain attributes, then I
> > think we should be open about that and add something like
> > DOMAIN_ATTR_IO_PGTABLE_CFG which could take a struct of page-table
> > configuration data for the domain (this could just be quirks initially,
> > but maybe it's worth extending to take ias, oas and page size)
> > 
> 
> Actually the initial versions used DOMAIN_ATTR_QCOM_SYS_CACHE
> to make it QCOM specific and not generic, I don't see anyone else
> using this attribute, would that work?

No -- I'd prefer to have _one_ domain attribute for funneling all the
IP_PGTABLE_CFG data. Otherwise, we'll just end up with things like
DOMAIN_ATTR_QCOM_SYS_CACHE_EXT or DOMAIN_ATTR_QCOM_QUIRKS later on.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv7 1/7] iommu/io-pgtable-arm: Add support to use system cache

2020-11-10 Thread Will Deacon
On Fri, Oct 30, 2020 at 02:53:08PM +0530, Sai Prakash Ranjan wrote:
> Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
> attributes set in TCR for the page table walker when
> using system cache.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/io-pgtable-arm.c | 7 ++-
>  include/linux/io-pgtable.h | 4 
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index a7a9bc08dcd1..a356caf1683a 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -761,7 +761,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
> void *cookie)
>  
>   if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
>   IO_PGTABLE_QUIRK_NON_STRICT |
> - IO_PGTABLE_QUIRK_ARM_TTBR1))
> + IO_PGTABLE_QUIRK_ARM_TTBR1 |
> + IO_PGTABLE_QUIRK_SYS_CACHE))
>   return NULL;
>  
>   data = arm_lpae_alloc_pgtable(cfg);
> @@ -773,6 +774,10 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, 
> void *cookie)
>   tcr->sh = ARM_LPAE_TCR_SH_IS;
>   tcr->irgn = ARM_LPAE_TCR_RGN_WBWA;
>   tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;
> + } else if (cfg->quirks & IO_PGTABLE_QUIRK_SYS_CACHE) {
> + tcr->sh = ARM_LPAE_TCR_SH_OS;
> + tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> + tcr->orgn = ARM_LPAE_TCR_RGN_WBWA;

Given that this only applies in the case where then page-table walker is
non-coherent, I think we'd be better off renaming the quirk to something
like IO_PGTABLE_QUIRK_ARM_OUTER_WBWA and then rejecting it in the
non-coherent case.

>   } else {
>   tcr->sh = ARM_LPAE_TCR_SH_OS;
>   tcr->irgn = ARM_LPAE_TCR_RGN_NC;
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 4cde111e425b..86631f711e05 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -86,6 +86,9 @@ struct io_pgtable_cfg {
>*
>* IO_PGTABLE_QUIRK_ARM_TTBR1: (ARM LPAE format) Configure the table
>*  for use in the upper half of a split address space.
> +  *
> +  * IO_PGTABLE_QUIRK_SYS_CACHE: Override the attributes set in TCR for
> +  *  the page table walker when using system cache.

and then update this accordingly.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv7 2/7] iommu/arm-smmu: Add domain attribute for system cache

2020-11-10 Thread Will Deacon
On Fri, Oct 30, 2020 at 02:53:09PM +0530, Sai Prakash Ranjan wrote:
> Add iommu domain attribute for using system cache aka last level
> cache by client drivers like GPU to set right attributes for caching
> the hardware pagetables into the system cache.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
>  include/linux/iommu.h |  1 +
>  3 files changed, 19 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index b1cf8f0abc29..070d13f80c7e 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -789,6 +789,9 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   if (smmu_domain->non_strict)
>   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>  
> + if (smmu_domain->sys_cache)
> + pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_SYS_CACHE;
> +
>   pgtbl_ops = alloc_io_pgtable_ops(fmt, _cfg, smmu_domain);
>   if (!pgtbl_ops) {
>   ret = -ENOMEM;
> @@ -1520,6 +1523,9 @@ static int arm_smmu_domain_get_attr(struct iommu_domain 
> *domain,
>   case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
>   *(int *)data = smmu_domain->non_strict;
>   return 0;
> + case DOMAIN_ATTR_SYS_CACHE:
> + *((int *)data) = smmu_domain->sys_cache;
> + return 0;
>   default:
>   return -ENODEV;
>   }
> @@ -1551,6 +1557,17 @@ static int arm_smmu_domain_set_attr(struct 
> iommu_domain *domain,
>   else
>   smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>   break;
> + case DOMAIN_ATTR_SYS_CACHE:
> + if (smmu_domain->smmu) {
> + ret = -EPERM;
> + goto out_unlock;
> + }
> +
> + if (*((int *)data))
> + smmu_domain->sys_cache = true;
> + else
> + smmu_domain->sys_cache = false;
> + break;
>   default:
>   ret = -ENODEV;
>   }
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> index 885840f3bec8..dfc44d806671 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
> @@ -373,6 +373,7 @@ struct arm_smmu_domain {
>   struct mutexinit_mutex; /* Protects smmu pointer */
>   spinlock_t  cb_lock; /* Serialises ATS1* ops and 
> TLB syncs */
>   struct iommu_domain domain;
> + boolsys_cache;
>  };
>  
>  struct arm_smmu_master_cfg {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index b95a6f8db6ff..4f4bb9c6f8f6 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -118,6 +118,7 @@ enum iommu_attr {
>   DOMAIN_ATTR_FSL_PAMUV1,
>   DOMAIN_ATTR_NESTING,/* two stages of translation */
>   DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> + DOMAIN_ATTR_SYS_CACHE,

I think you're trying to make this look generic, but it's really not.
If we need to funnel io-pgtable quirks through domain attributes, then I
think we should be open about that and add something like
DOMAIN_ATTR_IO_PGTABLE_CFG which could take a struct of page-table
configuration data for the domain (this could just be quirks initially,
but maybe it's worth extending to take ias, oas and page size)

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv7 7/7] iommu: arm-smmu-impl: Add a space before open parenthesis

2020-11-10 Thread Will Deacon
On Fri, Oct 30, 2020 at 02:53:14PM +0530, Sai Prakash Ranjan wrote:
> Fix the checkpatch warning for space required before the open
> parenthesis.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index ffaf3f91ba52..f16da4a21270 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -12,7 +12,7 @@
>  
>  static int arm_smmu_gr0_ns(int offset)
>  {
> - switch(offset) {
> + switch (offset) {
>   case ARM_SMMU_GR0_sCR0:
>   case ARM_SMMU_GR0_sACR:
>   case ARM_SMMU_GR0_sGFSR:

Whatever...

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv7 6/7] iommu: arm-smmu-impl: Use table to list QCOM implementations

2020-11-10 Thread Will Deacon
On Fri, Oct 30, 2020 at 02:53:13PM +0530, Sai Prakash Ranjan wrote:
> Use table and of_match_node() to match qcom implementation
> instead of multiple of_device_compatible() calls for each
> QCOM SMMU implementation.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |  9 +
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 21 -
>  drivers/iommu/arm/arm-smmu/arm-smmu.h  |  1 -
>  3 files changed, 17 insertions(+), 14 deletions(-)

Acked-by: Will Deacon 

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv4 1/6] iommu/io-pgtable-arm: Add support to use system cache

2020-09-21 Thread Will Deacon
On Mon, Sep 21, 2020 at 11:03:49PM +0100, Robin Murphy wrote:
> On 2020-09-21 19:03, Will Deacon wrote:
> > On Fri, Sep 11, 2020 at 07:57:18PM +0530, Sai Prakash Ranjan wrote:
> > > Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
> > > attributes set in TCR for the page table walker when
> > > using system cache.
> > 
> > I wonder if the panfrost folks can reuse this for the issue discussed
> > over at:
> > 
> > https://lore.kernel.org/r/cover.1600213517.git.robin.mur...@arm.com
> 
> Isn't this all hinged around the outer cacheability attribute, rather than
> shareability (since these are nominally NC mappings and thus already
> properly Osh)? The Panfrost issue is just about shareability domains being a
> bit wonky; the cacheability attributes there are actually reasonably normal
> (other than not having a non-cacheable type at all, only a choice of
> allocation policies...)

Hmm, yes, this quirk _also_ changes the cacheability settings which isn't
what we need. It's a bit grotty having two different ways to configure these
TCR bits (i.e. a quirk and a format), but at least the mali format rejects
all of the quirks so I suppose it's not the end of the world.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v17 00/20] iommu/arm-smmu + drm/msm: per-process GPU pgtables

2020-09-21 Thread Will Deacon
On Sat, Sep 05, 2020 at 01:04:06PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> NOTE: I have re-ordered the series, and propose that we could merge this
>   series in the following order:
> 
>1) 01-11 - merge via drm / msm-next
>2) 12-15 - merge via iommu, no dependency on msm-next pull req

Thanks, I've queued 12-15 in the smmu tree.

https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/arm-smmu/updates

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv4 1/6] iommu/io-pgtable-arm: Add support to use system cache

2020-09-21 Thread Will Deacon
On Fri, Sep 11, 2020 at 07:57:18PM +0530, Sai Prakash Ranjan wrote:
> Add a quirk IO_PGTABLE_QUIRK_SYS_CACHE to override the
> attributes set in TCR for the page table walker when
> using system cache.

I wonder if the panfrost folks can reuse this for the issue discussed
over at:

https://lore.kernel.org/r/cover.1600213517.git.robin.mur...@arm.com

However, Sai, your email setup went wrong when you posted this so you
probably need to repost now that you have that fixed.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv4 6/6] iommu: arm-smmu-impl: Remove unwanted extra blank lines

2020-09-11 Thread Will Deacon
On Fri, Sep 11, 2020 at 05:03:06PM +0100, Robin Murphy wrote:
> BTW am I supposed to have received 3 copies of everything? Because I did...

Yeah, this seems to be happening for all of Sai's emails :/

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v12 04/13] iommu: Add a domain attribute to get/set a pagetable configuration

2020-08-13 Thread Will Deacon
On Thu, Aug 13, 2020 at 08:11:02AM -0700, Rob Clark wrote:
> On Thu, Aug 13, 2020 at 6:14 AM Will Deacon  wrote:
> >
> > On Mon, Aug 10, 2020 at 04:26:48PM -0600, Jordan Crouse wrote:
> > > Add domain attribute DOMAIN_ATTR_PGTABLE_CFG. This will be used by
> > > arm-smmu to share the current pagetable configuration with the
> > > leaf driver and to allow the leaf driver to set up a new pagetable
> > > configuration under certain circumstances.
> > >
> > > Signed-off-by: Jordan Crouse 
> > > ---
> > >
> > >  include/linux/iommu.h | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > index fee209efb756..995ab8c47ef2 100644
> > > --- a/include/linux/iommu.h
> > > +++ b/include/linux/iommu.h
> > > @@ -118,6 +118,7 @@ enum iommu_attr {
> > >   DOMAIN_ATTR_FSL_PAMUV1,
> > >   DOMAIN_ATTR_NESTING,/* two stages of translation */
> > >   DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > + DOMAIN_ATTR_PGTABLE_CFG,
> > >   DOMAIN_ATTR_MAX,
> > >  };
> >
> > Nobody other than the adreno gpu uses this, so can we avoid exposing it
> > in the IOMMU API, please? Given that you have a reference to the adreno
> > GPU device in the SMMU implementation code thanks to .alloc_context_bank(),
> > can you squirrel some function pointers away in the driver data (i.e. with
> > dev_set_drvdata()) instead?
> >
> 
> Hmm, we are already using drvdata on the gpu side, and it looks like
> arm-smmu is also using it.  Could we get away with stashing an extra
> 'void *' in iommu_domain itself?

What I meant was, expose the type of whatever you put in there on the GPU
side so that the SMMU impl can install its function pointers into a field of
that structure. As far as I'm concerned, the SMMU impl code and the GPU
driver are the same entity and we should keep their communication private,
rather than expose it up the stack. After all, the GPU writes to the SMMU
registers!

If you really don't want to expose all of your gubbins, I suppose you
could have a structure just for the SMMU view and container_of() out of
that on the GPU side.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v12 05/13] iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU

2020-08-13 Thread Will Deacon
On Mon, Aug 10, 2020 at 04:26:49PM -0600, Jordan Crouse wrote:
> Add a special implementation for the SMMU attached to most Adreno GPU
> target triggered from the qcom,adreno-smmu compatible string.
> 
> The new Adreno SMMU implementation will enable split pagetables
> (TTBR1) for the domain attached to the GPU device (SID 0) and
> hard code it context bank 0 so the GPU hardware can implement
> per-instance pagetables.
> 
> Signed-off-by: Jordan Crouse 
> ---
> 
>  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |   3 +
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 156 -
>  2 files changed, 157 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index 88f17cc33023..d199b4bff15d 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -223,6 +223,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> arm_smmu_device *smmu)
>   of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
>   return qcom_smmu_impl_init(smmu);
>  
> + if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
> + return qcom_adreno_smmu_impl_init(smmu);
> +
>   if (of_device_is_compatible(np, "marvell,ap806-smmu-500"))
>   smmu->impl = _mmu500_impl;
>  
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> index be4318044f96..3be10145bf57 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
> @@ -12,6 +12,138 @@ struct qcom_smmu {
>   struct arm_smmu_device smmu;
>  };
>  
> +#define QCOM_ADRENO_SMMU_GPU_SID 0
> +
> +static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
> +{
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> + struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> + int idx, i;
> +
> + /*
> +  * The GPU will always use SID 0 so that is a handy way to uniquely
> +  * identify it and configure it for per-instance pagetables
> +  */
> + for_each_cfg_sme(cfg, fwspec, i, idx) {
> + u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
> +
> + if (sid == QCOM_ADRENO_SMMU_GPU_SID)
> + return true;
> + }

Is for_each_cfg_sme() really what you want here? You're not using idx for
anything, so I guess it should really be a loop over the sids (e.g. a bog
standard for loop from 0 to fw->num_ids - 1)?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v12 00/13] iommu/arm-smmu: Add Adreno SMMU specific implementation

2020-08-13 Thread Will Deacon
On Mon, Aug 10, 2020 at 04:26:44PM -0600, Jordan Crouse wrote:
> This series adds an Adreno SMMU implementation to arm-smmu to allow GPU 
> hardware
> pagetable switching.
> 
> The Adreno GPU has built in capabilities to switch the TTBR0 pagetable during
> runtime to allow each individual instance or application to have its own
> pagetable.  In order to take advantage of the HW capabilities there are 
> certain
> requirements needed of the SMMU hardware.

"capabilities" is a polite way of putting it ;)

Anyway, modulo two design comments, I think this is about as nice as we're
going to get this. Thanks for persevering, and sorry that you have to deal
with such dreadful hardware.

Hopefully the next version will be the one, although I'd like Robin to take
a quick look as well if he gets a chance.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v12 04/13] iommu: Add a domain attribute to get/set a pagetable configuration

2020-08-13 Thread Will Deacon
On Mon, Aug 10, 2020 at 04:26:48PM -0600, Jordan Crouse wrote:
> Add domain attribute DOMAIN_ATTR_PGTABLE_CFG. This will be used by
> arm-smmu to share the current pagetable configuration with the
> leaf driver and to allow the leaf driver to set up a new pagetable
> configuration under certain circumstances.
> 
> Signed-off-by: Jordan Crouse 
> ---
> 
>  include/linux/iommu.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index fee209efb756..995ab8c47ef2 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -118,6 +118,7 @@ enum iommu_attr {
>   DOMAIN_ATTR_FSL_PAMUV1,
>   DOMAIN_ATTR_NESTING,/* two stages of translation */
>   DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> + DOMAIN_ATTR_PGTABLE_CFG,
>   DOMAIN_ATTR_MAX,
>  };

Nobody other than the adreno gpu uses this, so can we avoid exposing it
in the IOMMU API, please? Given that you have a reference to the adreno
GPU device in the SMMU implementation code thanks to .alloc_context_bank(),
can you squirrel some function pointers away in the driver data (i.e. with
dev_set_drvdata()) instead?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RFC v12 13/13] iommu/arm-smmu: Add a init_context_bank implementation hook

2020-08-13 Thread Will Deacon
On Mon, Aug 10, 2020 at 04:26:57PM -0600, Jordan Crouse wrote:
> Add a new implementation hook to allow the implementation specific code
> to tweek the context bank configuration just before it gets written.
> The first user will be the Adreno GPU implementation to turn on
> SCTLR.HUPCF to ensure that a page fault doesn't terminating pending
> transactions. Doing so could hang the GPU if one of the terminated
> transactions is a CP read.
> 
> Signed-off-by: Jordan Crouse 
> ---
> 
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 16 
>  drivers/iommu/arm/arm-smmu/arm-smmu.c  | 21 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.h  |  5 +
>  3 files changed, 34 insertions(+), 8 deletions(-)

We already have ->init_context(), so I'd prefer to use that instead of
adding another callback. Could we stick a couple of fields in
smmu_domain->cfg (e.g. sctlr_set, sctlr_clr) and handle those a bit like
we do for the asid/vmid on cavium implementations?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v9 4/7] iommu/arm-smmu: Add a pointer to the attached device to smmu_domain

2020-07-16 Thread Will Deacon
On Mon, Jul 13, 2020 at 11:19:17AM -0600, Jordan Crouse wrote:
> On Mon, Jul 13, 2020 at 04:09:02PM +0100, Will Deacon wrote:
> > On Fri, Jun 26, 2020 at 02:00:38PM -0600, Jordan Crouse wrote:
> > > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> > > index 5f2de20e883b..d33cfe26b2f5 100644
> > > --- a/drivers/iommu/arm-smmu.h
> > > +++ b/drivers/iommu/arm-smmu.h
> > > @@ -345,6 +345,7 @@ struct arm_smmu_domain {
> > >   struct mutexinit_mutex; /* Protects smmu pointer */
> > >   spinlock_t  cb_lock; /* Serialises ATS1* ops and 
> > > TLB syncs */
> > >   struct iommu_domain domain;
> > > + struct device   *dev;   /* Device attached to this 
> > > domain */
> > 
> > This really doesn't feel right to me -- you can generally have multiple
> > devices attached to a domain and they can come and go without the domain
> > being destroyed. Perhaps you could instead identify the GPU during
> > cfg_probe() and squirrel that information away somewhere?
> 
> I need some help here. The SMMU device (qcom,adreno-smmu) will have at least 
> two
> stream ids from two different platform devices (GPU and GMU) and I need to
> configure split-pagetable and stall/terminate differently on the two domains.

Hmm. How does the GPU driver know which context bank is assigned to the GPU
and which one is assigned to the GMU? I assume it needs this information so
that it can play its nasty tricks with the TTBR registers?

I ask because if we need to guarantee stability of the context-bank
assignment, then you could match on that in the ->init_context() callback,
but now I worry that it currently works by luck :/

Do we need to add an extra callback to allocate the context bank?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH] iommu/arm-smmu: Add a init_context_bank implementation hook

2020-07-13 Thread Will Deacon
On Mon, Jul 13, 2020 at 11:00:32AM -0600, Jordan Crouse wrote:
> On Mon, Jul 13, 2020 at 04:11:23PM +0100, Will Deacon wrote:
> > On Thu, Jun 11, 2020 at 04:36:56PM -0600, Jordan Crouse wrote:
> > > Add a new implementation hook to allow the implementation specific code
> > > to tweek the context bank configuration just before it gets written.
> > > The first user will be the Adreno GPU implementation to turn on
> > > SCTLR.HUPCF to ensure that a page fault doesn't terminating pending
> > > transactions. Doing so could hang the GPU if one of the terminated
> > > transactions is a CP read.
> > > 
> > > This depends on the arm-smmu adreno SMMU implementation [1].
> > > 
> > > [1] https://patchwork.kernel.org/patch/11600943/
> > > 
> > > Signed-off-by: Jordan Crouse 
> > > ---
> > > 
> > >  drivers/iommu/arm-smmu-qcom.c | 13 +
> > >  drivers/iommu/arm-smmu.c  | 28 +---
> > >  drivers/iommu/arm-smmu.h  | 11 +++
> > >  3 files changed, 37 insertions(+), 15 deletions(-)
> > 
> > This looks straightforward enough, but I don't want to merge this without
> > a user and Sai's series has open questions afaict.
> 
> Not sure what you mean by a user in this context?
> Are you referring to https://patchwork.kernel.org/patch/11628541/?

Right, this post was just a single patch in isolation, whereas it was
reposted over at:

https://lore.kernel.org/r/cdcc6a1c95a84e774790389dc8b3b7f490dc.1593344119.git.saiprakash.ran...@codeaurora.org

so I'll ignore this one. Sorry, I'm just really struggling to keep track
of what is targetting 5.9, and I don't have tonnes of time to sift through
the backlog of duplicate postings :(

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH] iommu/arm-smmu: Add a init_context_bank implementation hook

2020-07-13 Thread Will Deacon
On Thu, Jun 11, 2020 at 04:36:56PM -0600, Jordan Crouse wrote:
> Add a new implementation hook to allow the implementation specific code
> to tweek the context bank configuration just before it gets written.
> The first user will be the Adreno GPU implementation to turn on
> SCTLR.HUPCF to ensure that a page fault doesn't terminating pending
> transactions. Doing so could hang the GPU if one of the terminated
> transactions is a CP read.
> 
> This depends on the arm-smmu adreno SMMU implementation [1].
> 
> [1] https://patchwork.kernel.org/patch/11600943/
> 
> Signed-off-by: Jordan Crouse 
> ---
> 
>  drivers/iommu/arm-smmu-qcom.c | 13 +
>  drivers/iommu/arm-smmu.c  | 28 +---
>  drivers/iommu/arm-smmu.h  | 11 +++
>  3 files changed, 37 insertions(+), 15 deletions(-)

This looks straightforward enough, but I don't want to merge this without
a user and Sai's series has open questions afaict.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v9 4/7] iommu/arm-smmu: Add a pointer to the attached device to smmu_domain

2020-07-13 Thread Will Deacon
On Fri, Jun 26, 2020 at 02:00:38PM -0600, Jordan Crouse wrote:
> Add a link to the pointer to the struct device that is attached to a
> domain. This makes it easy to get the pointer if it is needed in the
> implementation specific code.
> 
> Signed-off-by: Jordan Crouse 
> ---
> 
>  drivers/iommu/arm-smmu.c | 6 --
>  drivers/iommu/arm-smmu.h | 1 +
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 048de2681670..060139452c54 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -668,7 +668,8 @@ static void arm_smmu_write_context_bank(struct 
> arm_smmu_device *smmu, int idx)
>  }
>  
>  static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> - struct arm_smmu_device *smmu)
> + struct arm_smmu_device *smmu,
> + struct device *dev)
>  {
>   int irq, start, ret = 0;
>   unsigned long ias, oas;
> @@ -801,6 +802,7 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   cfg->asid = cfg->cbndx;
>  
>   smmu_domain->smmu = smmu;
> + smmu_domain->dev = dev;
>  
>   pgtbl_cfg = (struct io_pgtable_cfg) {
>   .pgsize_bitmap  = smmu->pgsize_bitmap,
> @@ -1190,7 +1192,7 @@ static int arm_smmu_attach_dev(struct iommu_domain 
> *domain, struct device *dev)
>   return ret;
>  
>   /* Ensure that the domain is finalised */
> - ret = arm_smmu_init_domain_context(domain, smmu);
> + ret = arm_smmu_init_domain_context(domain, smmu, dev);
>   if (ret < 0)
>   goto rpm_put;
>  
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index 5f2de20e883b..d33cfe26b2f5 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -345,6 +345,7 @@ struct arm_smmu_domain {
>   struct mutexinit_mutex; /* Protects smmu pointer */
>   spinlock_t  cb_lock; /* Serialises ATS1* ops and 
> TLB syncs */
>   struct iommu_domain domain;
> + struct device   *dev;   /* Device attached to this 
> domain */

This really doesn't feel right to me -- you can generally have multiple
devices attached to a domain and they can come and go without the domain
being destroyed. Perhaps you could instead identify the GPU during
cfg_probe() and squirrel that information away somewhere?

The rest of the series looks ok to me.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)

2020-07-03 Thread Will Deacon
On Fri, Jul 03, 2020 at 08:23:07PM +0530, Sai Prakash Ranjan wrote:
> On 2020-07-03 19:07, Will Deacon wrote:
> > On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> > > diff --git a/drivers/gpu/drm/msm/msm_iommu.c
> > > b/drivers/gpu/drm/msm/msm_iommu.c
> > > index f455c597f76d..bd1d58229cc2 100644
> > > --- a/drivers/gpu/drm/msm/msm_iommu.c
> > > +++ b/drivers/gpu/drm/msm/msm_iommu.c
> > > @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu,
> > > uint64_t iova,
> > >   iova |= GENMASK_ULL(63, 49);
> > > 
> > > 
> > > + if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> > > + prot |= IOMMU_SYS_CACHE_ONLY;
> > 
> > Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then
> > it
> > looks like it should actually be a property on the domain because we
> > never
> > need to configure it on a per-mapping basis within a domain, and
> > therefore
> > it shouldn't be exposed by the IOMMU API as a prot flag.
> > 
> > Do you agree?
> > 
> 
> GPU being the only user is for now, but there are other clients which can
> use this.
> Plus how do we set the memory attributes if we do not expose this as prot
> flag?

I just don't understand the need for it to be per-map operation. Put another
way, if we extended the domain attribute to apply to cacheable mappings
on the domain and not just the table walk, what would break?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCHv3 7/7] drm/msm/a6xx: Add support for using system cache(LLC)

2020-07-03 Thread Will Deacon
On Mon, Jun 29, 2020 at 09:22:50PM +0530, Sai Prakash Ranjan wrote:
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index f455c597f76d..bd1d58229cc2 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -218,6 +218,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t 
> iova,
>   iova |= GENMASK_ULL(63, 49);
>  
>  
> + if (mmu->features & MMU_FEATURE_USE_SYSTEM_CACHE)
> + prot |= IOMMU_SYS_CACHE_ONLY;

Given that I think this is the only user of IOMMU_SYS_CACHE_ONLY, then it
looks like it should actually be a property on the domain because we never
need to configure it on a per-mapping basis within a domain, and therefore
it shouldn't be exposed by the IOMMU API as a prot flag.

Do you agree?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 2/2] dt-bindings: arm-smmu: Add sc7180 compatible string

2020-05-21 Thread Will Deacon
On Mon, May 18, 2020 at 01:59:49PM -0700, Doug Anderson wrote:
> On Mon, May 18, 2020 at 7:39 AM Will Deacon  wrote:
> > On Fri, May 15, 2020 at 12:05:39PM -0700, Doug Anderson wrote:
> > > On Fri, May 1, 2020 at 3:30 AM Sharat Masetty  
> > > wrote:
> > > >
> > > > This patch simply adds a new compatible string for SC7180 platform.
> > > >
> > > > Signed-off-by: Sharat Masetty 
> > > > ---
> > > >  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> > > > b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > > > index 6515dbe..986098b 100644
> > > > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > > > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > > > @@ -28,6 +28,7 @@ properties:
> > > >- enum:
> > > >- qcom,msm8996-smmu-v2
> > > >- qcom,msm8998-smmu-v2
> > > > +  - qcom,sc7180-smmu-v2
> > > >- qcom,sdm845-smmu-v2
> > > >- const: qcom,smmu-v2
> > >
> > > Is anything blocking this patch from landing now?
> >
> > I thought updates to the bindings usually went via Rob and the device-tree
> > tree, but neither of those are on cc.
> >
> > Perhaps resend with that fixed?
> 
> Ah, I guess I wasn't familiar with how things worked for this file, or
> maybe things have changed recently?  I'm used to most bindings going
> through the same tree as the drivers that use them.  Usually if things
> are at all complicated maintainers wait for an Ack from Rob (so he
> should have been CCed for sure) and then land.

Just to clear this up: I'm happy to take DT stuff like this, but preferably
with Rob's ack so that I know that (a) it's not a load of rubbish and (b) it
probably won't conflict with his tree. So having the DT folks omitted from
the CC list just rings alarm bells for me.

> In this case it actually looks like Bjorn landed it in the Qualcomm
> and I just didn't realize it.  That seems like it should be fine since
> it's in the middle of a clause that's all Qualcomm and the change
> shouldn't be controversial in any way.  :-)

Ok!

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 2/2] dt-bindings: arm-smmu: Add sc7180 compatible string

2020-05-18 Thread Will Deacon
On Fri, May 15, 2020 at 12:05:39PM -0700, Doug Anderson wrote:
> On Fri, May 1, 2020 at 3:30 AM Sharat Masetty  wrote:
> >
> > This patch simply adds a new compatible string for SC7180 platform.
> >
> > Signed-off-by: Sharat Masetty 
> > ---
> >  Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
> > b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > index 6515dbe..986098b 100644
> > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
> > @@ -28,6 +28,7 @@ properties:
> >- enum:
> >- qcom,msm8996-smmu-v2
> >- qcom,msm8998-smmu-v2
> > +  - qcom,sc7180-smmu-v2
> >- qcom,sdm845-smmu-v2
> >- const: qcom,smmu-v2
> 
> Is anything blocking this patch from landing now?

I thought updates to the bindings usually went via Rob and the device-tree
tree, but neither of those are on cc.

Perhaps resend with that fixed?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH] iommu/arm-smmu: avoid pathological RPM behaviour for unmaps

2019-11-01 Thread Will Deacon
On Thu, Oct 31, 2019 at 02:31:02PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> When games, browser, or anything using a lot of GPU buffers exits, there
> can be many hundreds or thousands of buffers to unmap and free.  If the
> GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> for each buffer, resulting 5-10 seconds worth of reprogramming the
> context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> To the user it would appear that the system just locked up.
> 
> A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> don't immediately suspend the SMMU device.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/arm-smmu.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 7c503a6bc585..5abc0d210d90 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -122,7 +122,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device 
> *smmu)
>  static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
>  {
>   if (pm_runtime_enabled(smmu->dev))
> - pm_runtime_put(smmu->dev);
> + pm_runtime_put_autosuspend(smmu->dev);
>  }
>  
>  static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> @@ -1154,6 +1154,20 @@ static int arm_smmu_attach_dev(struct iommu_domain 
> *domain, struct device *dev)
>   /* Looks ok, so add the device to the domain */
>   ret = arm_smmu_domain_add_master(smmu_domain, fwspec);
>  
> + /*
> +  * Setup an autosuspend delay to avoid bouncing runpm state.
> +  * Otherwise, if a driver for a suspendend consumer device

I fixed this typo and applied with Robin's reviewed-by from before.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

Re: [Freedreno] [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

2019-10-29 Thread Will Deacon
On Mon, Oct 28, 2019 at 10:51:53PM +, Robin Murphy wrote:
> On 2019-10-28 10:38 pm, Rob Clark wrote:
> > On Mon, Oct 28, 2019 at 3:20 PM Will Deacon  wrote:
> > > On Mon, Oct 07, 2019 at 01:49:06PM -0700, Rob Clark wrote:
> > > > From: Rob Clark 
> > > > 
> > > > When games, browser, or anything using a lot of GPU buffers exits, there
> > > > can be many hundreds or thousands of buffers to unmap and free.  If the
> > > > GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> > > > for each buffer, resulting 5-10 seconds worth of reprogramming the
> > > > context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> > > > To the user it would appear that the system just locked up.
> > > > 
> > > > A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> > > > don't immediately suspend the SMMU device.
> > > 
> > > Please can you reword the subject to be a bit more useful? The commit
> > > message is great, but the subject is a bit like "fix bug in code" to me.
> > 
> > yeah, not the best $subject, but I wasn't quite sure how to fit
> > something better in a reasonable # of chars.. maybe something like:
> > "iommu/arm-smmu: optimize unmap but avoiding toggling runpm state"?
> 
> FWIW, I'd be inclined to frame it as something like "avoid pathological RPM
> behaviour for unmaps".

LGTM!

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

Re: [Freedreno] [PATCH v2] iommu/arm-smmu: fix "hang" when games exit

2019-10-28 Thread Will Deacon
Hi Rob,

On Mon, Oct 07, 2019 at 01:49:06PM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> When games, browser, or anything using a lot of GPU buffers exits, there
> can be many hundreds or thousands of buffers to unmap and free.  If the
> GPU is otherwise suspended, this can cause arm-smmu to resume/suspend
> for each buffer, resulting 5-10 seconds worth of reprogramming the
> context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc).
> To the user it would appear that the system just locked up.
> 
> A simple solution is to use pm_runtime_put_autosuspend() instead, so we
> don't immediately suspend the SMMU device.

Please can you reword the subject to be a bit more useful? The commit
message is great, but the subject is a bit like "fix bug in code" to me.

> Signed-off-by: Rob Clark 
> ---
> v1: original
> v2: unconditionally use autosuspend, rather than deciding based on what
> consumer does
> 
>  drivers/iommu/arm-smmu.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 3f1d55fb43c4..b7b41f5001bc 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -289,7 +289,7 @@ static inline int arm_smmu_rpm_get(struct arm_smmu_device 
> *smmu)
>  static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
>  {
>   if (pm_runtime_enabled(smmu->dev))
> - pm_runtime_put(smmu->dev);
> + pm_runtime_put_autosuspend(smmu->dev);
>  }
>  
>  static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> @@ -1445,6 +1445,9 @@ static int arm_smmu_attach_dev(struct iommu_domain 
> *domain, struct device *dev)
>   /* Looks ok, so add the device to the domain */
>   ret = arm_smmu_domain_add_master(smmu_domain, fwspec);

Please can you put a comment here explaining what this is doing? An abridged
version of the commit message is fine.

> + pm_runtime_set_autosuspend_delay(smmu->dev, 20);
> + pm_runtime_use_autosuspend(smmu->dev);

Cheers,

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

Re: [Freedreno] [PATCH v18 1/5] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-11-30 Thread Will Deacon
On Thu, Nov 29, 2018 at 08:25:20PM +0530, Vivek Gautam wrote:
> On Wed, Nov 28, 2018 at 10:07 PM Robin Murphy  wrote:
> >
> > On 28/11/2018 16:24, Stephen Boyd wrote:
> > > Quoting Vivek Gautam (2018-11-27 02:11:41)
> > >> @@ -1966,6 +1970,23 @@ static const struct of_device_id 
> > >> arm_smmu_of_match[] = {
> > >>   };
> > >>   MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
> > >>
> > >> +static void arm_smmu_fill_clk_data(struct arm_smmu_device *smmu,
> > >> +  const char * const *clks)
> > >> +{
> > >> +   int i;
> > >> +
> > >> +   if (smmu->num_clks < 1)
> > >> +   return;
> > >> +
> > >> +   smmu->clks = devm_kcalloc(smmu->dev, smmu->num_clks,
> > >> + sizeof(*smmu->clks), GFP_KERNEL);
> > >> +   if (!smmu->clks)
> > >> +   return;
> > >> +
> > >> +   for (i = 0; i < smmu->num_clks; i++)
> > >> +   smmu->clks[i].id = clks[i];
> > >
> > > Is this clk_bulk_get_all()?
> 
> From what I remember, and now I could go back to v7 and check [1], we parked
> clk_bulk_get out of OF's sole purview as we also have
> arm_smmu_device_acpi_probe() besides arm_smmu_device_dt_probe().
> 
> arm_smmu_device_dt_probe() could get the clocks from dt and fill in
> the clock bulk data, and
> similarly, arm_smmu_device_acpi_probe() could fill the clock bulk data
> by getting it from ACPI.
> 
> clk_bulk_get_all() seems like going only the OF way.
> Is there another way here to have something common between ACPI
> and OF, and then do the clk_bulk_get?

I'd say just go with clk_bulk_get_all() and if somebody really wants to
mess with the SMMU clocks on a system booted via ACPI, then it's their
problem to solve. My understanding is that the design of IORT makes this
next to impossible to solve anyway, because a static table is used and
therefore we're unable to run whatever ASL methods need to be invoked to
mess with the clocks.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-11-26 Thread Will Deacon
On Mon, Nov 26, 2018 at 04:56:42PM +0530, Vivek Gautam wrote:
> On 11/26/2018 11:33 AM, Vivek Gautam wrote:
> >On 11/24/2018 12:06 AM, Will Deacon wrote:
> >>On Thu, Nov 22, 2018 at 05:32:24PM +0530, Vivek Gautam wrote:
> >>>On Wed, Nov 21, 2018 at 11:09 PM Will Deacon 
> >>>wrote:
> >>>>On Fri, Nov 16, 2018 at 04:54:27PM +0530, Vivek Gautam wrote:
> >>>>>From: Sricharan R 
> >>>>>
> >>>>>The smmu device probe/remove and add/remove master device callbacks
> >>>>>gets called when the smmu is not linked to its master, that is
> >>>>>without
> >>>>>the context of the master device. So calling runtime apis in those
> >>>>>places
> >>>>>separately.
> >>>>>Global locks are also initialized before enabling runtime pm as the
> >>>>>runtime_resume() calls device_reset() which does tlb_sync_global()
> >>>>>that ultimately requires locks to be initialized.
> >>>>>
> >>>>>Signed-off-by: Sricharan R 
> >>>>>[vivek: Cleanup pm runtime calls]
> >>>>>Signed-off-by: Vivek Gautam 
> >>>>>Reviewed-by: Tomasz Figa 
> >>>>>Tested-by: Srinivas Kandagatla 
> >>>>>Reviewed-by: Robin Murphy 
> >>>>>---
> >>>>>  drivers/iommu/arm-smmu.c | 101
> >>>>>++-
> >>>>>  1 file changed, 91 insertions(+), 10 deletions(-)
> >>>>Given that you're doing the get/put in the TLBI ops unconditionally:
> >>>>
> >>>>>  static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
> >>>>>  {
> >>>>>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >>>>>+ struct arm_smmu_device *smmu = smmu_domain->smmu;
> >>>>>
> >>>>>- if (smmu_domain->tlb_ops)
> >>>>>+ if (smmu_domain->tlb_ops) {
> >>>>>+ arm_smmu_rpm_get(smmu);
> >>>>>smmu_domain->tlb_ops->tlb_flush_all(smmu_domain);
> >>>>>+ arm_smmu_rpm_put(smmu);
> >>>>>+ }
> >>>>>  }
> >>>>>
> >>>>>  static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
> >>>>>  {
> >>>>>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >>>>>+ struct arm_smmu_device *smmu = smmu_domain->smmu;
> >>>>>
> >>>>>- if (smmu_domain->tlb_ops)
> >>>>>+ if (smmu_domain->tlb_ops) {
> >>>>>+ arm_smmu_rpm_get(smmu);
> >>>>>smmu_domain->tlb_ops->tlb_sync(smmu_domain);
> >>>>>+ arm_smmu_rpm_put(smmu);
> >>>>>+ }
> >>>>Why do you need them around the map/unmap calls as well?
> >>>We still have .tlb_add_flush path?
> >>Ok, so we could add the ops around that as well. Right now, we've got
> >>the runtime pm hooks crossing two parts of the API.
> >
> >Sure, will do that then, and remove the runtime pm hooks from map/unmap.
> 
> I missed this earlier -
> We are adding runtime pm hooks in the 'iommu_ops' callbacks and not really
> to
> 'tlb_ops'. So how the runtime pm hooks crossing the paths?
> '.map/.unmap' iommu_ops don't call '.flush_iotlb_all' or '.iotlb_sync'
> iommu_ops
> anywhere.
> 
> E.g., only callers to domain->ops->flush_iotlb_all() are:
> iommu_dma_flush_iotlb_all(), or iommu_flush_tlb_all() which are not in
> map/unmap paths.

Yes, sorry, I got confused here and completely misled you. In which case,
your original patch is ok because it intercepts the core IOMMU API via
iommu_ops. Apologies.

At that level, should we also annotate arm_smmu_iova_to_phys_hard()
for the iova_to_phys() implementation?

With that detail and clock bits sorted out, we should be able to get this
queued at last.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-11-23 Thread Will Deacon
On Thu, Nov 22, 2018 at 05:32:24PM +0530, Vivek Gautam wrote:
> Hi Will,
> 
> On Wed, Nov 21, 2018 at 11:09 PM Will Deacon  wrote:
> >
> > On Fri, Nov 16, 2018 at 04:54:27PM +0530, Vivek Gautam wrote:
> > > From: Sricharan R 
> > >
> > > The smmu device probe/remove and add/remove master device callbacks
> > > gets called when the smmu is not linked to its master, that is without
> > > the context of the master device. So calling runtime apis in those places
> > > separately.
> > > Global locks are also initialized before enabling runtime pm as the
> > > runtime_resume() calls device_reset() which does tlb_sync_global()
> > > that ultimately requires locks to be initialized.
> > >
> > > Signed-off-by: Sricharan R 
> > > [vivek: Cleanup pm runtime calls]
> > > Signed-off-by: Vivek Gautam 
> > > Reviewed-by: Tomasz Figa 
> > > Tested-by: Srinivas Kandagatla 
> > > Reviewed-by: Robin Murphy 
> > > ---
> > >  drivers/iommu/arm-smmu.c | 101 
> > > ++-
> > >  1 file changed, 91 insertions(+), 10 deletions(-)
> >
> > Given that you're doing the get/put in the TLBI ops unconditionally:
> >
> > >  static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
> > >  {
> > >   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > + struct arm_smmu_device *smmu = smmu_domain->smmu;
> > >
> > > - if (smmu_domain->tlb_ops)
> > > + if (smmu_domain->tlb_ops) {
> > > + arm_smmu_rpm_get(smmu);
> > >   smmu_domain->tlb_ops->tlb_flush_all(smmu_domain);
> > > + arm_smmu_rpm_put(smmu);
> > > + }
> > >  }
> > >
> > >  static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
> > >  {
> > >   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > + struct arm_smmu_device *smmu = smmu_domain->smmu;
> > >
> > > - if (smmu_domain->tlb_ops)
> > > + if (smmu_domain->tlb_ops) {
> > > + arm_smmu_rpm_get(smmu);
> > >   smmu_domain->tlb_ops->tlb_sync(smmu_domain);
> > > + arm_smmu_rpm_put(smmu);
> > > + }
> >
> > Why do you need them around the map/unmap calls as well?
> 
> We still have .tlb_add_flush path?

Ok, so we could add the ops around that as well. Right now, we've got
the runtime pm hooks crossing two parts of the API.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-11-23 Thread Will Deacon
On Fri, Nov 23, 2018 at 03:06:29PM +0530, Vivek Gautam wrote:
> On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa  wrote:
> > On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam
> >  wrote:
> > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon  wrote:
> > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote:
> > > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, 
> > > > > ARM_SMMU_V1_64K, GENERIC_SMMU);
> > > > >  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
> > > > >  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
> > > > >
> > > > > +static const char * const qcom_smmuv2_clks[] = {
> > > > > + "bus", "iface",
> > > > > +};
> > > > > +
> > > > > +static const struct arm_smmu_match_data qcom_smmuv2 = {
> > > > > + .version = ARM_SMMU_V2,
> > > > > + .model = QCOM_SMMUV2,
> > > > > + .clks = qcom_smmuv2_clks,
> > > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
> > > > > +};
> > > >
> > > > These seems redundant if we go down the route proposed by Thor, where we
> > > > just pull all of the clocks out of the device-tree. In which case, why
> > > > do we need this match_data at all?
> > >
> > > Which is better? Driver relying solely on the device tree to tell
> > > which all clocks
> > > are required to be enabled,
> > > or, the driver deciding itself based on the platform's match data,
> > > that it should
> > > have X, Y, & Z clocks that should be supplied from the device tree.
> >
> > The former would simplify the driver, but would also make it
> > impossible to spot mistakes in DT, which would ultimately surface out
> > as very hard to debug bugs (likely complete system lockups).
> 
> Thanks.
> Yea, this is how I understand things presently. Relying on device tree
> puts the things out of driver's control.

But it also has the undesirable effect of having to update the driver
code whenever we want to add support for a new SMMU implementation. If
we do this all in the DT, as Thor is trying to do, then older kernels
will work well with new hardware.

> Hi Will,
> Am I unable to understand the intentions here for Thor's clock-fetch
> design change?

I'm having trouble parsing your question, sorry. Please work with Thor
so that we have a single way to get the clock information. My preference
is to take it from the firmware, for the reason I stated above.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-11-21 Thread Will Deacon
[+Thor]

On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote:
> qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
> clock and power requirements.
> On msm8996, multiple cores, viz. mdss, video, etc. use this
> smmu. On sdm845, this smmu is used with gpu.
> Add bindings for the same.
> 
> Signed-off-by: Vivek Gautam 
> Reviewed-by: Rob Herring 
> Reviewed-by: Tomasz Figa 
> Tested-by: Srinivas Kandagatla 
> Reviewed-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 2098c3141f5f..d315ca637097 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -120,6 +120,7 @@ enum arm_smmu_implementation {
>   GENERIC_SMMU,
>   ARM_MMU500,
>   CAVIUM_SMMUV2,
> + QCOM_SMMUV2,
>  };
>  
>  struct arm_smmu_s2cr {
> @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, 
> GENERIC_SMMU);
>  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
>  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
>  
> +static const char * const qcom_smmuv2_clks[] = {
> + "bus", "iface",
> +};
> +
> +static const struct arm_smmu_match_data qcom_smmuv2 = {
> + .version = ARM_SMMU_V2,
> + .model = QCOM_SMMUV2,
> + .clks = qcom_smmuv2_clks,
> + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
> +};

These seems redundant if we go down the route proposed by Thor, where we
just pull all of the clocks out of the device-tree. In which case, why
do we need this match_data at all?

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


  1   2   >