Re: [PATCH 5/6] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu

2019-06-10 Thread Lu Baolu

Hi,

On 6/11/19 3:25 AM, Sai Praneeth Prakhya wrote:
> On Mon, 2019-06-10 at 11:45 -0700, Mehta, Sohil wrote:
>> On Sun, 2019-06-09 at 10:38 +0800, Lu Baolu wrote:
>>>   static int __init si_domain_init(int hw)
>>> @@ -3306,14 +3252,13 @@ static int __init init_dmars(void)
>>>  if (pasid_supported(iommu))
>>>  intel_svm_init(iommu);
>>>   #endif
>>> -   }
>>>
>>> -   /*
>>> -* Now that qi is enabled on all iommus, set the root entry and flush
>>> -* caches. This is required on some Intel X58 chipsets, otherwise the
>>> -* flush_context function will loop forever and the boot hangs.
>>> -*/
>>> -   for_each_active_iommu(iommu, drhd) {
>>> +   /*
>>> +* Now that qi is enabled on all iommus, set the root entry and
>>> +* flush caches. This is required on some Intel X58 chipsets,
>>> +* otherwise the flush_context function will loop forever and
>>> +* the boot hangs.
>>> +*/
>>>  iommu_flush_write_buffer(iommu);
>>>  iommu_set_root_entry(iommu);
>>>  iommu->flush.flush_context(iommu, 0, 0, 0, DMA_CCMD_GLOBAL_INVL);
>>
>> This changes the intent of the original code. As the comment says
>> enable QI on all IOMMUs, then flush the caches and set the root entry.
>> The order of setting the root entries has changed now.
>>
>> Refer:
>> Commit a4c34ff1c029 ('iommu/vt-d: Enable QI on all IOMMUs before
>> setting root entry')
>
> Thanks Sohil! for catching the bug.
> Will send a V2 to Lu Baolu fixing this.

Okay, I will submit a v2 of this series later.

> Regards,
> Sai


Best regards,
Baolu
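
For reference, the ordering issue Sohil points out can be illustrated with a small user-space model. This is only a sketch, not kernel code: struct toy_iommu and the toy_* helpers are hypothetical, and the rule that a flush completes only once QI is enabled on every unit is an assumption taken from the code comment quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one remapping unit; not the kernel's struct intel_iommu. */
struct toy_iommu {
	bool qi_enabled;
	bool root_set;
};

/*
 * Assumption from the quoted comment: on the affected chipsets a context
 * flush only completes once QI is enabled on every unit in the system.
 */
static bool toy_flush_completes(const struct toy_iommu *units, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!units[i].qi_enabled)
			return false;
	return true;
}

/* Original order: enable QI everywhere first, then set the root entries. */
static bool toy_init_two_pass(struct toy_iommu *units, size_t n)
{
	for (size_t i = 0; i < n; i++)
		units[i].qi_enabled = true;

	for (size_t i = 0; i < n; i++) {
		units[i].root_set = true;
		if (!toy_flush_completes(units, n))
			return false;	/* the real flush would spin here */
	}
	return true;
}

/* Merged loop: the first flush runs before the later units have QI. */
static bool toy_init_one_pass(struct toy_iommu *units, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		units[i].qi_enabled = true;
		units[i].root_set = true;
		if (!toy_flush_completes(units, n))
			return false;
	}
	return true;
}
```

With two or more units the merged loop fails on the first unit in this model, while the two-pass version succeeds, which is the behaviour the original second for_each_active_iommu() loop preserved.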


Re: [PATCH] dma-remap: Avoid de-referencing NULL atomic_pool

2019-06-10 Thread Christoph Hellwig
Looks good to me.  When did this start to show up?  Do we need
to push it to Linus this cycle and cc stable?


Re: Device specific pass through in host systems - discuss user interface

2019-06-10 Thread Raj, Ashok
On Mon, Jun 10, 2019 at 09:38:11PM -0700, Sai Praneeth Prakhya wrote:
> Hi All,
> 
> + Sohil and Rob Clark (as there are dropped from CC'list)
> 
> > > > Most iommu vendor drivers have switched from per-device to per-group
> > > > domain (a.k.a. default domain). So per-group pass-through mode makes
> > more sense?
> > > >
> > > > By the way, can we extend this to "per-group default domain type",
> > > > instead of only "per-group pass-through mode"? Currently we have
> > > > system level default domain type, if we have finer granularity of
> > > > default domain type, both iommu drivers and end users will benefit from 
> > > > it.
> > >
> > > Sure! Makes sense.. per-group default domain type sounds good.
> 
> I am planning to implement an RFC (supporting only runtime case for now) 
> which works as below
> 
> 1. User unbinds the driver by writing to sysfs
> 2. User puts a group in pass through mode by writing "1" to
> /sys/kernel/iommu_groups//pt

It might be better to read the current value of the default domain for that group:
/sys/kernel/iommu_groups//default_domain

Reading the above file shows the current setting.
Provide a different file, next_def_domain, and you can echo "pt" or "dma_domain"
to it to switch to pass-through or normal DMA isolation mode.

For devices that are automatically set to pass-through today, like graphics or
isoch audio, you can show "pt" as default_domain.
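
A rough sketch of how the store side of such a file could parse its input, written as plain user-space C. The enum and the toy_* function names are hypothetical; only the strings "pt" and "dma_domain" come from the suggestion above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical domain types for illustration only. */
enum toy_domain_type {
	TOY_DOMAIN_DMA,
	TOY_DOMAIN_PT,
	TOY_DOMAIN_INVALID,
};

/* Compare the first len bytes of buf against a complete keyword. */
static int toy_token_is(const char *buf, size_t len, const char *word)
{
	return len == strlen(word) && strncmp(buf, word, len) == 0;
}

/* Parse what a user would echo into the file; echo appends a newline. */
static enum toy_domain_type toy_parse_domain(const char *buf)
{
	size_t len = strcspn(buf, "\n");

	if (toy_token_is(buf, len, "pt"))
		return TOY_DOMAIN_PT;
	if (toy_token_is(buf, len, "dma_domain"))
		return TOY_DOMAIN_DMA;
	return TOY_DOMAIN_INVALID;
}

/* String shown when the current setting is read back. */
static const char *toy_domain_name(enum toy_domain_type type)
{
	switch (type) {
	case TOY_DOMAIN_PT:
		return "pt";
	case TOY_DOMAIN_DMA:
		return "dma_domain";
	default:
		return "invalid";
	}
}
```

Using names instead of "1"/"0" keeps the file extensible if more default domain types are added later.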
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: Device specific pass through in host systems - discuss user interface

2019-06-10 Thread Prakhya, Sai Praneeth
Hi All,

+ Sohil and Rob Clark (as they were dropped from the CC list)

> > > Most iommu vendor drivers have switched from per-device to per-group
> > > domain (a.k.a. default domain). So per-group pass-through mode makes
> more sense?
> > >
> > > By the way, can we extend this to "per-group default domain type",
> > > instead of only "per-group pass-through mode"? Currently we have
> > > system level default domain type, if we have finer granularity of
> > > default domain type, both iommu drivers and end users will benefit from 
> > > it.
> >
> > Sure! Makes sense.. per-group default domain type sounds good.

I am planning to implement an RFC (supporting only runtime case for now) which 
works as below

1. User unbinds the driver by writing to sysfs
2. User puts a group in pass through mode by writing "1" to
/sys/kernel/iommu_groups//pt
3. User re-binds the driver by writing to sysfs
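
The three steps above can be sketched as a tiny state model. This is only an illustration under stated assumptions: struct toy_group and toy_group_set_pt() are hypothetical, and refusing the switch with -EBUSY while a driver is bound is my guess at why the unbind comes first, not something the RFC has defined.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-group state; not the kernel's struct iommu_group. */
struct toy_group {
	bool driver_bound;
	bool pass_through;
};

/* Step 2: switching the mode is refused while a driver is still bound. */
static int toy_group_set_pt(struct toy_group *group, bool enable)
{
	if (group->driver_bound)
		return -EBUSY;
	group->pass_through = enable;
	return 0;
}

/* Steps 1 and 3: unbind and re-bind only flip the flag in this model. */
static void toy_group_unbind(struct toy_group *group)
{
	group->driver_bound = false;
}

static void toy_group_bind(struct toy_group *group)
{
	group->driver_bound = true;
}
```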

As suggested by Lu Baolu, I will look into implementing this using a "per-group
default domain type".

If anyone has suggestions/comments/concerns, please reply.

Regards,
Sai


Re: "iommu/vt-d: Delegate DMA domain to generic iommu" series breaks megaraid_sas

2019-06-10 Thread Lu Baolu

Ah, good catch!

The device failed to attach to a DMA domain. Can you please try the
attached fix patch?

[  101.885468] pci :06:00.0: DMAR: Device is ineligible for IOMMU
domain attach due to platform RMRR requirement.  Contact your platform
vendor.
[  101.900801] pci :06:00.0: Failed to add to iommu group 23: -1

Best regards,
Baolu

On 6/10/19 10:54 PM, Qian Cai wrote:
> On Mon, 2019-06-10 at 09:44 -0400, Qian Cai wrote:
>> On Sun, 2019-06-09 at 10:43 +0800, Lu Baolu wrote:
>>> Hi Qian,
>>>
>>> I just posted some fix patches. I cc'ed them in your email inbox as
>>> well. Can you please check whether they happen to fix your issue?
>>> If not, do you mind posting more debug messages?
>>
>> Unfortunately, it does not work. Here is the dmesg.
>>
>> https://raw.githubusercontent.com/cailca/tmp/master/dmesg?token=AMC35QKPIZBYUMFUQKLW4ZC47ZPIK
>
> This one should be good to view.
>
> https://cailca.github.io/files/dmesg.txt

From ff0b1ae0d8fde0655392fde3a1090b03a7a35394 Mon Sep 17 00:00:00 2001
From: Lu Baolu 
Date: Tue, 11 Jun 2019 09:29:16 +0800
Subject: [PATCH 1/1] iommu/vt-d: Allow DMA domain attaching to rmrr locked
 device

We don't allow a device to be assigned to user level when it is locked
by any RMRRs. Hence, intel_iommu_attach_device() returns an error if
a domain of type IOMMU_DOMAIN_UNMANAGED is about to be attached to a device
locked by an RMRR. But this doesn't apply to a domain of any other type.
Restrict the check to IOMMU_DOMAIN_UNMANAGED domains to fix this.

Fixes: fa954e6831789 ("iommu/vt-d: Delegate the dma domain to upper layer")
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1dcb6365ddc4..38232220f6ff 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5281,7 +5281,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 {
 	int ret;
 
-	if (device_is_rmrr_locked(dev)) {
+	if (domain->type == IOMMU_DOMAIN_UNMANAGED &&
+	device_is_rmrr_locked(dev)) {
 		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
 		return -EPERM;
 	}
-- 
2.17.1
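
As a quick illustration of the behaviour change in the patch above, here is a user-space model of the old and new checks. The enum and function names are made up for the example; only the condition itself and the -EPERM return value mirror the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for the kernel's domain types; names are illustrative. */
enum toy_domain_kind {
	TOY_DOMAIN_UNMANAGED,
	TOY_DOMAIN_DMA,
	TOY_DOMAIN_IDENTITY,
};

/* Old check: any attach to an RMRR-locked device is refused. */
static int toy_attach_old(enum toy_domain_kind kind, bool rmrr_locked)
{
	(void)kind;
	return rmrr_locked ? -EPERM : 0;
}

/* New check: only unmanaged (user-level) domains are refused. */
static int toy_attach_new(enum toy_domain_kind kind, bool rmrr_locked)
{
	if (kind == TOY_DOMAIN_UNMANAGED && rmrr_locked)
		return -EPERM;
	return 0;
}
```

In this model the DMA domain attach to an RMRR-locked device, which is the case the megaraid_sas log above appears to hit, succeeds with the new check and fails with the old one.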



[PATCH] dma-remap: Avoid de-referencing NULL atomic_pool

2019-06-10 Thread Florian Fainelli
With architectures allowing the kernel to be placed almost arbitrarily
in memory (e.g.: ARM64), it is possible to have the kernel reside at
physical addresses above 4GB, resulting in neither the default CMA area
nor the atomic pool being successfully allocated. This does not prevent
specific peripherals from working though; one example is XHCI, which
still operates correctly.

Trouble comes when the XHCI driver gets suspended and resumed, since we
can now trigger the following NULL pointer dereference (NPD):

[   12.664170] usb usb1: root hub lost power or was reset
[   12.669387] usb usb2: root hub lost power or was reset
[   12.674662] Unable to handle kernel NULL pointer dereference at virtual 
address 0008
[   12.682896] pgd = ffc1365a7000
[   12.686386] [0008] *pgd=00013653, *pud=00013653, 
*pmd=
[   12.694897] Internal error: Oops: 9606 [#1] SMP
[   12.699843] Modules linked in:
[   12.702980] CPU: 0 PID: 1499 Comm: pml Not tainted 4.9.135-1.13pre #51
[   12.709577] Hardware name: BCM97268DV (DT)
[   12.713736] task: ffc136bb6540 task.stack: ffc1366cc000
[   12.719740] PC is at addr_in_gen_pool+0x4/0x48
[   12.724253] LR is at __dma_free+0x64/0xbc
[   12.728325] pc : [] lr : [] pstate: 
6145
[   12.735825] sp : ffc1366cf990
[   12.739196] x29: ffc1366cf990 x28: ffc1366cc000
[   12.744608] x27:  x26: ffc13a8568c8
[   12.750020] x25:  x24: ff80098f9000
[   12.755433] x23: 00013a5ff000 x22: ff8009c57000
[   12.760844] x21: ffc13a856810 x20: 
[   12.766255] x19: 1000 x18: 000a
[   12.771667] x17: 007f917553e0 x16: 1002
[   12.777078] x15: 000a36cb x14: ff80898feb77
[   12.782490] x13:  x12: 0030
[   12.787899] x11: fffe x10: ff80098feb7f
[   12.793311] x9 : 05f5e0ff x8 : 65776f702074736f
[   12.798723] x7 : 6c2062756820746f x6 : ff80098febb1
[   12.804134] x5 : ff800809797c x4 : 
[   12.809545] x3 : 00013a5ff000 x2 : 0fff
[   12.814955] x1 : ff8009c57000 x0 : 
[   12.820363]
[   12.821907] Process pml (pid: 1499, stack limit = 0xffc1366cc020)
[   12.828421] Stack: (0xffc1366cf990 to 0xffc1366d)
[   12.834240] f980:   ffc1366cf9e0 
ff80086004d0
[   12.842186] f9a0: ffc13ab08238 0010 ff80097c2218 
ffc13a856810
[   12.850131] f9c0: ff8009c57000 00013a5ff000 0008 
00013a5ff000
[   12.858076] f9e0: ffc1366cfa50 ff80085f9250 ffc13ab08238 
0004
[   12.866021] fa00: ffc13ab08000 ff80097b6000 ffc13ab08130 
0001
[   12.873966] fa20: 0008 ffc13a8568c8  
ffc1366cc000
[   12.881911] fa40: ffc13ab08130 0001 ffc1366cfa90 
ff80085e3de8
[   12.889856] fa60: ffc13ab08238  ffc136b75b00 

[   12.897801] fa80: 0010 ff80089ccb92 ffc1366cfac0 
ff80084ad040
[   12.905746] faa0: ffc13a856810  ff80084ad004 
ff80084b91a8
[   12.913691] fac0: ffc1366cfae0 ff80084b91b4 ffc13a856810 
ff80080db5cc
[   12.921636] fae0: ffc1366cfb20 ff80084b96bc ffc13a856810 
0010
[   12.929581] fb00: ffc13a856870  ffc13a856810 
ff800984d2b8
[   12.937526] fb20: ffc1366cfb50 ff80084baa70 ff8009932ad0 
ff800984d260
[   12.945471] fb40: 0010 0002eff0a065 ffc1366cfbb0 
ff80084bafbc
[   12.953415] fb60: 0010 0003 ff80098fe000 

[   12.961360] fb80: ff80097b6000 ff80097b6dc8 ff80098c12b8 
ff80098c12f8
[   12.969306] fba0: ff8008842000 ff80097b6dc8 ffc1366cfbd0 
ff80080e0d88
[   12.977251] fbc0: fffb ff80080e10bc ffc1366cfc60 
ff80080e16a8
[   12.985196] fbe0:  0003 ff80097b6000 
ff80098fe9f0
[   12.993140] fc00: ff80097d4000 ff8008983802 0123 
0040
[   13.001085] fc20: ff8008842000 ffc1366cc000 ff80089803c2 

[   13.009029] fc40:   ffc1366cfc60 
00040987
[   13.016974] fc60: ffc1366cfcc0 ff80080dfd08 0003 
0004
[   13.024919] fc80: 0003 ff80098fea08 ffc136577ec0 
ff80089803c2
[   13.032864] fca0: 0123 0001 00050002 
00040987
[   13.040809] fcc0: ffc1366cfd00 ff80083a89d4 0004 
ffc136577ec0
[   13.048754] fce0: ffc136610cc0 ffea ffc1366cfeb0 
ffc136610cd8
[   13.056700] fd00: ffc1366cfd10 ff800822a614 ffc1366cfd40 
ff80082295d4
[   13.064645] fd20: 0004 ffc136577ec0 
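
The diff itself is not included in this archived copy, so the following is only a guess at the shape of the guard, based on the subject line and on the PC being in addr_in_gen_pool: treat a pool that was never allocated as containing no addresses. All toy_* names are hypothetical and this is user-space C, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for a pool of pre-allocated memory. */
struct toy_pool {
	uintptr_t start;
	size_t size;
};

/* Stays NULL when the pool could not be allocated at boot. */
static struct toy_pool *toy_atomic_pool;

/* Returns true only if [addr, addr + size) lies inside the pool. */
static bool toy_in_atomic_pool(const void *addr, size_t size)
{
	uintptr_t a = (uintptr_t)addr;

	/* The guard: an absent pool owns no addresses. */
	if (!toy_atomic_pool)
		return false;

	if (a < toy_atomic_pool->start || size > toy_atomic_pool->size)
		return false;
	return a - toy_atomic_pool->start <= toy_atomic_pool->size - size;
}
```

With the guard in place, a free path that first asks whether the buffer came from the atomic pool falls through to the regular path instead of dereferencing a NULL pool.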

Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler

2019-06-10 Thread Jacob Pan
On Mon, 10 Jun 2019 13:45:02 +0100
Jean-Philippe Brucker  wrote:

> On 07/06/2019 18:43, Jacob Pan wrote:
> >>> So it seems we agree on the following:
> >>> - iommu_unregister_device_fault_handler() will never fail
> >>> - iommu driver cleans up all pending faults when handler is
> >>> unregistered
> >>> - assume device driver or guest not sending more page response
> >>> _after_ handler is unregistered.
> >>> - system will tolerate rare spurious response
> >>>
> >>> Sounds right?
> >>
> >> Yes, I'll add that to the fault series  
> > Hold on a second please, I think we need more clarifications. Ashok
> > pointed out to me that the spurious response can be harmful to other
> > devices when it comes to mdev, where PRQ group id is not per PASID,
> > device may reuse the group number and receiving spurious page
> > response can confuse the entire PF.   
> 
> I don't understand how mdev differs from the non-mdev situation (but I
> also still don't fully get how mdev+PASID will be implemented). Is the
> following the case you're worried about?
> 
>   M#: mdev #
> 
> #   Dev   Host   mdev drv   VFIO/QEMU   Guest
> 
> 1   <- reg(handler)
> 2   PR1 G1 P1 -> M1 PR1 G1   inject -> M1 PR1 G1
> 3   <- unreg(handler)
> 4   <- PS1 G1 P1 (F)  |
> 5   unreg(handler)
> 6   <- reg(handler)
> 7   PR2 G1 P1 -> M2 PR2 G1   inject -> M2 PR2 G1
> 8   <- M1 PS1 G1
> 9   accept ??   <- PS1 G1 P1
> 10  <- M2 PS2 G1
> 11  accept   <- PS2 G1 P1
> 
Not really. I am not worried about PASID reuse or unbind. Just within
the same PASID bind lifetime of a single mdev, back to back
register/unregister fault handler.
After Step 4, device will think G1 is done. Device could reuse G1 for
the next PR, if we accept PS1 in step 9, device will terminate G1 before
the real G1 PS arrives in Step 11. The real G1 PS might have a
different response code. Then we just drop the PS in Step 11?

If the device does not reuse G1 immediately, the spurious response to
G1 will get dropped, no issue there.
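
To make the group reuse case concrete, here is a toy model of the pending-fault bookkeeping. It is a sketch only: struct toy_faults and the toy_* helpers are hypothetical, and the sequence number stands for the SW token idea mentioned further down, not for anything in the current fault series.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_GROUPS 8

/* Pending page request groups for one PASID; purely illustrative. */
struct toy_faults {
	bool pending[TOY_MAX_GROUPS];
	unsigned int seq[TOY_MAX_GROUPS];
	unsigned int next_seq;
};

/* Device reports a page request for group grp; returns its sequence #. */
static unsigned int toy_report(struct toy_faults *f, unsigned int grp)
{
	assert(grp < TOY_MAX_GROUPS);
	f->pending[grp] = true;
	f->seq[grp] = ++f->next_seq;
	return f->seq[grp];
}

/* Unregistering the handler: the host completes everything pending. */
static void toy_unregister(struct toy_faults *f)
{
	for (unsigned int i = 0; i < TOY_MAX_GROUPS; i++)
		f->pending[i] = false;
}

/* Matching on group ID only: a stale response looks like a valid one. */
static bool toy_respond_by_group(struct toy_faults *f, unsigned int grp)
{
	assert(grp < TOY_MAX_GROUPS);
	if (!f->pending[grp])
		return false;
	f->pending[grp] = false;
	return true;
}

/* Matching on group ID plus sequence #: the stale response is dropped. */
static bool toy_respond_by_seq(struct toy_faults *f, unsigned int grp,
			       unsigned int seq)
{
	assert(grp < TOY_MAX_GROUPS);
	if (!f->pending[grp] || f->seq[grp] != seq)
		return false;
	f->pending[grp] = false;
	return true;
}
```

In this model, after report, unregister, report on the same group, the first by-group response is accepted whether or not it is the stale one, and the second is then dropped. That is the premature termination described above.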

> 
> Step 2 injects PR1 for mdev#1. Step 4 auto-responds to PR1. Between
> steps 5 and 6, we re-allocate PASID #1 for mdev #2. At step 7, we
> inject PR2 for mdev #2. Step 8 is the spurious Page Response for PR1.
> 
> But I don't think step 9 is possible, because the mdev driver knows
> that mdev #1 isn't using PASID #1 anymore. If the configuration is
> valid at all (a page response channel still exists for mdev #1), then
> mdev #1 now has a different PASID, e.g. #2, and step 9 would be "<-
> PS1 G1 P2" which is rejected by iommu.c (no such pending page
> request). And step 11 will be accepted.
> 
> If PASIDs are allocated through VCMD, then the situation seems
> similar: at step 2 you inject "M1 PR1 G1 P1" into the guest, and at
> step 8 the spurious response is "M1 PS1 G1 P1". If mdev #1 doesn't
> have PASID #1 anymore, then the mdev driver can check that the PASID
> is invalid and can reject the page response.
> 
> > Having spurious page response is also not
> > abiding the PCIe spec. exactly.  
> 
> We are following the PCI spec though, in that we don't send page
> responses for PRGIs that aren't in flight.
> 
You are right, the worst case of the spurious PS is to terminate the
group prematurely. We need to know the scope of the HW damage in the mdev
case, where group IDs can be shared among mdevs belonging to the same PF.

> > We have two options here:
> > 1. unregister handler will get -EBUSY if outstanding fault exists.
> > -PROs: block offending device unbind only, eventually
> > timeout will clear.
> > -CONs: flooded faults can prevent clearing
> > 2. unregister handle will block until all faults are clear in the
> > host. Never fails unregistration  
> 
> Here the host completes the faults itself or wait for a response from
> the guest? I'm slightly confused by the word "blocking". I'd rather we
> don't introduce an uninterruptible sleep in the IOMMU core, since it's
> unlikely to ever finish if we rely on the guest to complete things.
> 
No uninterruptible sleep, I meant unregister_handler is a sync call.
But no wait for guest's response.
> > -PROs: simple flow for VFIO, no need to worry about device
> > holding reference.
> > -CONs: spurious page response may come from
> > misbehaving/malicious guest if guest does unregister and
> > register back to back.  
> 
> > It seems the only way to prevent spurious page response is to
> > introduce a SW token or sequence# for each PRQ that needs a
> > response. I still think option 2 is good.
> > 
> > Consider the following time line:
> > decoding
> >  PR#: page request
> >  G#:  group #
> >  P#:  PASID
> >  S#:  sequence #
> >  A#:  address
> >  PS#: page response
> 

Re: [PATCH 5/6] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu

2019-06-10 Thread Sai Praneeth Prakhya
On Mon, 2019-06-10 at 11:45 -0700, Mehta, Sohil wrote:
> On Sun, 2019-06-09 at 10:38 +0800, Lu Baolu wrote:
> >  static int __init si_domain_init(int hw)
> > @@ -3306,14 +3252,13 @@ static int __init init_dmars(void)
> > if (pasid_supported(iommu))
> > intel_svm_init(iommu);
> >  #endif
> > -   }
> >  
> > -   /*
> > -* Now that qi is enabled on all iommus, set the root entry
> > and flush
> > -* caches. This is required on some Intel X58 chipsets,
> > otherwise the
> > -* flush_context function will loop forever and the boot
> > hangs.
> > -*/
> > -   for_each_active_iommu(iommu, drhd) {
> > +   /*
> > +* Now that qi is enabled on all iommus, set the root
> > entry and
> > +* flush caches. This is required on some Intel X58
> > chipsets,
> > +* otherwise the flush_context function will loop
> > forever and
> > +* the boot hangs.
> > +*/
> > iommu_flush_write_buffer(iommu);
> > iommu_set_root_entry(iommu);
> > iommu->flush.flush_context(iommu, 0, 0, 0,
> > DMA_CCMD_GLOBAL_INVL);
> 
> This changes the intent of the original code. As the comment says
> enable QI on all IOMMUs, then flush the caches and set the root entry.
> The order of setting the root entries has changed now.
> 
> Refer: 
> Commit a4c34ff1c029 ('iommu/vt-d: Enable QI on all IOMMUs before
> setting root entry')

Thanks Sohil! for catching the bug.
Will send a V2 to Lu Baolu fixing this.

Regards,
Sai



[PATCH 7/8] iommu/arm-smmu-v3: Improve add_device() error handling

2019-06-10 Thread Jean-Philippe Brucker
Let add_device() clean up behind itself. The iommu_bus_init() function
does call remove_device() on error, but other sites (e.g. of_iommu) do
not.

Don't free level-2 stream tables because we'd have to track if we
allocated each of them or if they are used by other endpoints. It's not
worth the hassle since they are managed resources.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 633d829f246f..972bfb80f964 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2398,14 +2398,16 @@ static int arm_smmu_add_device(struct device *dev)
for (i = 0; i < master->num_sids; i++) {
u32 sid = master->sids[i];
 
-   if (!arm_smmu_sid_in_range(smmu, sid))
-   return -ERANGE;
+   if (!arm_smmu_sid_in_range(smmu, sid)) {
+   ret = -ERANGE;
+   goto err_free_master;
+   }
 
/* Ensure l2 strtab is initialised */
if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
ret = arm_smmu_init_l2_strtab(smmu, sid);
if (ret)
-   return ret;
+   goto err_free_master;
}
}
 
@@ -2419,13 +2421,25 @@ static int arm_smmu_add_device(struct device *dev)
if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB))
master->ssid_bits = min(master->ssid_bits, 10U);
 
+   ret = iommu_device_link(&smmu->iommu, dev);
+   if (ret)
+   goto err_free_master;
+
group = iommu_group_get_for_dev(dev);
-   if (!IS_ERR(group)) {
-   iommu_group_put(group);
-   iommu_device_link(&smmu->iommu, dev);
+   if (IS_ERR(group)) {
+   ret = PTR_ERR(group);
+   goto err_unlink;
}
 
-   return PTR_ERR_OR_ZERO(group);
+   iommu_group_put(group);
+   return 0;
+
+err_unlink:
+   iommu_device_unlink(&smmu->iommu, dev);
+err_free_master:
+   kfree(master);
+   fwspec->iommu_priv = NULL;
+   return ret;
 }
 
 static void arm_smmu_remove_device(struct device *dev)
-- 
2.21.0
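
The unwinding pattern the patch introduces can be tried out in user space with a small model. Everything here is illustrative: struct toy_dev, the failure flags and the error codes are assumptions, and only the label order (unlink, then free the master and clear the private pointer) follows the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_master {
	unsigned int ssid_bits;
};

struct toy_dev {
	struct toy_master *priv;	/* stands in for fwspec->iommu_priv */
	bool linked;			/* stands in for iommu_device_link() */
};

/* The two flags let the caller inject a failure at each step. */
static int toy_add_device(struct toy_dev *dev, bool link_fails,
			  bool group_fails)
{
	int ret;
	struct toy_master *master = calloc(1, sizeof(*master));

	if (!master)
		return -ENOMEM;
	dev->priv = master;

	if (link_fails) {
		ret = -ENODEV;
		goto err_free_master;
	}
	dev->linked = true;

	if (group_fails) {
		ret = -ENODEV;
		goto err_unlink;
	}
	return 0;

	/* Labels undo the steps in reverse order of setup. */
err_unlink:
	dev->linked = false;
err_free_master:
	free(master);
	dev->priv = NULL;
	return ret;
}

static void toy_remove_device(struct toy_dev *dev)
{
	dev->linked = false;
	free(dev->priv);
	dev->priv = NULL;
}
```

Whichever step fails, the device ends up with no link and no private data, so a caller that does not invoke remove_device() on error leaves nothing behind.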



[PATCH 8/8] iommu/arm-smmu-v3: Add support for PCI PASID

2019-06-10 Thread Jean-Philippe Brucker
Enable PASID for PCI devices that support it. Since the SSID tables are
allocated by arm_smmu_attach_dev(), PASID has to be enabled early enough.
arm_smmu_dev_feature_enable() would be too late, since by that time the
main DMA domain has already been attached. Do it in add_device() instead.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 51 -
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 972bfb80f964..a8a516d9ff10 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2197,6 +2197,49 @@ static void arm_smmu_disable_ats(struct arm_smmu_master *master)
master->ats_enabled = false;
 }
 
+static int arm_smmu_enable_pasid(struct arm_smmu_master *master)
+{
+   int ret;
+   int features;
+   int num_pasids;
+   struct pci_dev *pdev;
+
+   if (!dev_is_pci(master->dev))
+   return -ENOSYS;
+
+   pdev = to_pci_dev(master->dev);
+
+   features = pci_pasid_features(pdev);
+   if (features < 0)
+   return -ENOSYS;
+
+   num_pasids = pci_max_pasids(pdev);
+   if (num_pasids <= 0)
+   return -ENOSYS;
+
+   ret = pci_enable_pasid(pdev, features);
+   if (!ret)
+   master->ssid_bits = min_t(u8, ilog2(num_pasids),
+ master->smmu->ssid_bits);
+   return ret;
+}
+
+static void arm_smmu_disable_pasid(struct arm_smmu_master *master)
+{
+   struct pci_dev *pdev;
+
+   if (!dev_is_pci(master->dev))
+   return;
+
+   pdev = to_pci_dev(master->dev);
+
+   if (!pdev->pasid_enabled)
+   return;
+
+   pci_disable_pasid(pdev);
+   master->ssid_bits = 0;
+}
+
 static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 {
unsigned long flags;
@@ -2413,6 +2456,9 @@ static int arm_smmu_add_device(struct device *dev)
 
master->ssid_bits = min(smmu->ssid_bits, fwspec->num_pasid_bits);
 
+   /* Note that PASID must be enabled before, and disabled after ATS */
+   arm_smmu_enable_pasid(master);
+
/*
 * If the SMMU doesn't support 2-stage CD, limit the linear
 * tables to a reasonable number of contexts, let's say
@@ -2423,7 +2469,7 @@ static int arm_smmu_add_device(struct device *dev)
 
ret = iommu_device_link(&smmu->iommu, dev);
if (ret)
-   goto err_free_master;
+   goto err_disable_pasid;
 
group = iommu_group_get_for_dev(dev);
if (IS_ERR(group)) {
@@ -2436,6 +2482,8 @@ static int arm_smmu_add_device(struct device *dev)
 
 err_unlink:
iommu_device_unlink(&smmu->iommu, dev);
+err_disable_pasid:
+   arm_smmu_disable_pasid(master);
 err_free_master:
kfree(master);
fwspec->iommu_priv = NULL;
@@ -2456,6 +2504,7 @@ static void arm_smmu_remove_device(struct device *dev)
arm_smmu_detach_dev(master);
iommu_group_remove_device(dev);
iommu_device_unlink(&smmu->iommu, dev);
+   arm_smmu_disable_pasid(master);
kfree(master);
iommu_fwspec_free(dev);
 }
-- 
2.21.0
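
The ssid_bits computation in arm_smmu_enable_pasid() is just a log2 and a clamp. A user-space sketch follows, with toy_ilog2() standing in for the kernel's ilog2() and both function names being hypothetical.

```c
#include <assert.h>

/* Floor of log2 for v > 0; stand-in for the kernel's ilog2(). */
static unsigned int toy_ilog2(unsigned int v)
{
	unsigned int bits = 0;

	while (v >>= 1)
		bits++;
	return bits;
}

/*
 * num_pasids is the number of PASIDs the device reports, assumed > 0
 * here; smmu_ssid_bits is what the SMMU supports.
 */
static unsigned int toy_pasid_ssid_bits(unsigned int num_pasids,
					unsigned int smmu_ssid_bits)
{
	unsigned int bits;

	assert(num_pasids > 0);
	bits = toy_ilog2(num_pasids);
	return bits < smmu_ssid_bits ? bits : smmu_ssid_bits;
}
```

For instance, a device reporting 2^20 PASIDs behind an SMMU with 16 SSID bits ends up with 16 bits, and a device reporting 256 PASIDs ends up with 8.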



[PATCH 6/8] iommu/arm-smmu-v3: Support auxiliary domains

2019-06-10 Thread Jean-Philippe Brucker
In commit a3a195929d40 ("iommu: Add APIs for multiple domains per
device"), the IOMMU API gained the concept of auxiliary domains (AUXD),
which allows controlling the PASID-tagged address spaces of a device. With
AUXD the PASID address spaces are not shared with the CPU, but are instead
modified with iommu_map() and iommu_unmap() calls on auxiliary domains.

Add auxiliary domain support to the SMMUv3 driver. Device drivers allocate
an unmanaged IOMMU domain with iommu_domain_alloc(), and attach it to the
device with iommu_aux_attach_domain().

The AUXD API is fairly permissive, and allows attaching an IOMMU domain in
both normal and auxiliary mode at the same time - one device can be
attached to the domain normally, and another device can be attached
through one of its PASIDs. To avoid excessive complexity in the SMMU
implementation we impose some restrictions on supported AUXD usage:

* A domain is either in auxiliary mode or normal mode, and that state is
  sticky: once detached, the domain has to be re-attached in the same mode.

* An auxiliary domain can have a single parent domain. Two devices can be
  attached to the same auxiliary domain only if they are attached to the
  same parent domain.

In practice these shouldn't be problematic, since we have the same kind of
restriction on normal domains and users have been able to cope so far: at
the moment a domain cannot be attached to two devices behind different
SMMUs. When VFIO puts two such devices in the same container, it simply
falls back to allocating two separate IOMMU domains.
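
The two restrictions can be written down as a small state check. This is a sketch for illustration, not the driver code: the toy_* types and the choice of -EINVAL for rejected attaches are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum toy_mode {
	TOY_MODE_UNSET,		/* never attached */
	TOY_MODE_NORMAL,
	TOY_MODE_AUX,
};

struct toy_domain {
	enum toy_mode mode;		/* sticky once set */
	struct toy_domain *parent;	/* only used in aux mode */
};

/* Normal attach is refused once the domain has been used as aux. */
static int toy_attach_normal(struct toy_domain *d)
{
	if (d->mode == TOY_MODE_AUX)
		return -EINVAL;
	d->mode = TOY_MODE_NORMAL;
	return 0;
}

/* Aux attach requires aux (or unset) mode and a single parent. */
static int toy_attach_aux(struct toy_domain *d, struct toy_domain *parent)
{
	if (d->mode == TOY_MODE_NORMAL)
		return -EINVAL;
	if (d->parent && d->parent != parent)
		return -EINVAL;
	d->mode = TOY_MODE_AUX;
	d->parent = parent;
	return 0;
}

/* Detach deliberately leaves mode and parent untouched. */
static void toy_detach(struct toy_domain *d)
{
	(void)d;
}
```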

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/Kconfig   |   1 +
 drivers/iommu/arm-smmu-v3.c | 276 +---
 2 files changed, 260 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 9b45f70549a7..d326fef3d3a6 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -393,6 +393,7 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
 config ARM_SMMU_V3
bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
depends on ARM64
+   select IOASID
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
select GENERIC_MSI_IRQ_DOMAIN
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 326b71793336..633d829f246f 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -641,6 +642,7 @@ struct arm_smmu_master {
unsigned int num_sids;
unsigned int ssid_bits;
bool ats_enabled :1;
+   bool auxd_enabled:1;
 };
 
 /* SMMU private data for an IOMMU domain */
@@ -666,8 +668,14 @@ struct arm_smmu_domain {
 
struct iommu_domain domain;
 
+   /* Unused in aux domains */
struct list_head devices;
spinlock_t  devices_lock;
+
+   /* Auxiliary domain stuff */
+   struct arm_smmu_domain  *parent;
+   ioasid_t ssid;
+   unsigned long   aux_nr_devs;
 };
 
 struct arm_smmu_option_prop {
@@ -675,6 +683,8 @@ struct arm_smmu_option_prop {
const char *prop;
 };
 
+static DECLARE_IOASID_SET(private_ioasid);
+
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SKIP_PREFETCH, "hisilicon,broken-prefetch-cmd" },
{ ARM_SMMU_OPT_PAGE0_REGS_ONLY, "cavium,cn9900-broken-page1-regspace"},
@@ -696,6 +706,15 @@ static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
return container_of(dom, struct arm_smmu_domain, domain);
 }
 
+static struct arm_smmu_master *dev_to_master(struct device *dev)
+{
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+   if (!fwspec)
+   return NULL;
+   return fwspec->iommu_priv;
+}
+
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
int i = 0;
@@ -1776,13 +1795,19 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master,
 }
 
 static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
-  int ssid, unsigned long iova, size_t size)
+  unsigned long iova, size_t size)
 {
int ret = 0;
+   unsigned int ssid = 0;
unsigned long flags;
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_master *master;
 
+   if (smmu_domain->parent) {
+   ssid = smmu_domain->ssid;
+   smmu_domain = smmu_domain->parent;
+   }
+
if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_ATS))
return 0;
 
@@ -1935,10 +1960,12 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
struct arm_smmu_s1_cfg *cfg = 

[PATCH 3/8] iommu/arm-smmu-v3: Support platform SSID

2019-06-10 Thread Jean-Philippe Brucker
For platform devices that support SubstreamID (SSID), firmware provides
the number of supported SSID bits. Restrict it to what the SMMU supports
and cache it into master->ssid_bits.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 11 +++
 drivers/iommu/of_iommu.c|  6 +-
 include/linux/iommu.h   |  1 +
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 4d5a694f02c2..3254f473e681 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -604,6 +604,7 @@ struct arm_smmu_master {
struct list_head domain_head;
u32 *sids;
unsigned int num_sids;
+   unsigned int ssid_bits;
bool ats_enabled :1;
 };
 
@@ -2097,6 +2098,16 @@ static int arm_smmu_add_device(struct device *dev)
}
}
 
+   master->ssid_bits = min(smmu->ssid_bits, fwspec->num_pasid_bits);
+
+   /*
+* If the SMMU doesn't support 2-stage CD, limit the linear
+* tables to a reasonable number of contexts, let's say
+* 64kB / sizeof(ctx_desc) = 1024 = 2^10
+*/
+   if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB))
+   master->ssid_bits = min(master->ssid_bits, 10U);
+
group = iommu_group_get_for_dev(dev);
if (!IS_ERR(group)) {
iommu_group_put(group);
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index f04a6df65eb8..04f4f6b95d82 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -206,8 +206,12 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
if (err)
break;
}
-   }
 
+   fwspec = dev_iommu_fwspec_get(dev);
+   if (!err && fwspec)
+   of_property_read_u32(master_np, "pasid-num-bits",
+   &fwspec->num_pasid_bits);
+   }
 
/*
 * Two success conditions can be represented by non-negative err here:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 519e40fb23ce..b91df613385f 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -536,6 +536,7 @@ struct iommu_fwspec {
struct fwnode_handle *iommu_fwnode;
void *iommu_priv;
u32 flags;
+   u32 num_pasid_bits;
unsigned int num_ids;
u32 ids[1];
 };
-- 
2.21.0



[PATCH 2/8] dt-bindings: document PASID property for IOMMU masters

2019-06-10 Thread Jean-Philippe Brucker
On Arm systems, some platform devices behind an SMMU may support the PASID
feature, which offers multiple address spaces. Let the firmware tell us
when a device supports PASID.

Reviewed-by: Rob Herring 
Signed-off-by: Jean-Philippe Brucker 
---
Previous discussion on this patch last year:
https://patchwork.ozlabs.org/patch/872275/
I split PASID and stall definitions, keeping only PASID here.
---
 Documentation/devicetree/bindings/iommu/iommu.txt | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
index 5a8b4624defc..3c36334e4f94 100644
--- a/Documentation/devicetree/bindings/iommu/iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -86,6 +86,12 @@ have a means to turn off translation. But it is invalid in such cases to
 disable the IOMMU's device tree node in the first place because it would
 prevent any driver from properly setting up the translations.
 
+Optional properties:
+
+- pasid-num-bits: Some masters support multiple address spaces for DMA, by
+  tagging DMA transactions with an address space identifier. By default,
+  this is 0, which means that the device only has one address space.
+
 
 Notes:
 ==
-- 
2.21.0



[PATCH 5/8] iommu/arm-smmu-v3: Add second level of context descriptor table

2019-06-10 Thread Jean-Philippe Brucker
The SMMU can support up to 20 bits of SSID. Add a second level of page
tables to accommodate this. Devices that support more than 1024 SSIDs now
have a table of 1024 L1 entries (8kB), pointing to tables of 1024 context
descriptors (64kB), allocated on demand.
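
The index arithmetic behind the two-level layout is shown below as a user-space sketch. The toy_* names are hypothetical, but the split of 10 bits and the entry sizes (8 bytes per L1 descriptor, 64 bytes per context descriptor) follow the definitions in the diff.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SPLIT		10u
#define TOY_L2_ENTRIES		(1u << TOY_SPLIT)
#define TOY_L1_DESC_BYTES	8u	/* one 64-bit dword per L1 entry */
#define TOY_CD_BYTES		64u	/* eight dwords per context descriptor */

struct toy_cd_index {
	uint32_t l1;	/* which leaf table */
	uint32_t l2;	/* which descriptor inside that leaf table */
};

/* Split an SSID into its L1 and L2 indices. */
static struct toy_cd_index toy_split_ssid(uint32_t ssid)
{
	struct toy_cd_index idx = {
		.l1 = ssid >> TOY_SPLIT,
		.l2 = ssid & (TOY_L2_ENTRIES - 1),
	};

	return idx;
}
```

With a 20-bit SSID both indices range over 0..1023, which gives the 8kB L1 table and the 64kB leaf tables mentioned above.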

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 136 +---
 1 file changed, 128 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index d90eb604b65d..326b71793336 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -216,6 +216,8 @@
 
 #define STRTAB_STE_0_S1FMT GENMASK_ULL(5, 4)
 #define STRTAB_STE_0_S1FMT_LINEAR  0
+#define STRTAB_STE_0_S1FMT_4K_L2   1
+#define STRTAB_STE_0_S1FMT_64K_L2  2
 #define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(51, 6)
 #define STRTAB_STE_0_S1CDMAX   GENMASK_ULL(63, 59)
 
@@ -255,6 +257,18 @@
 
 #define STRTAB_STE_3_S2TTB_MASK GENMASK_ULL(51, 4)
 
+/*
+ * Linear: when fewer than 1024 SSIDs are supported
+ * 2lvl: at most 1024 L1 entries,
+ *  1024 lazy entries per table.
+ */
+#define CTXDESC_SPLIT  10
+#define CTXDESC_NUM_L2_ENTRIES (1 << CTXDESC_SPLIT)
+
+#define CTXDESC_L1_DESC_DWORD  1
+#define CTXDESC_L1_DESC_VALID  1
+#define CTXDESC_L1_DESC_L2PTR_MASK GENMASK_ULL(51, 12)
+
 /* Context descriptor (stage-1 only) */
 #define CTXDESC_CD_DWORDS  8
 #define CTXDESC_CD_0_TCR_T0SZ  GENMASK_ULL(5, 0)
@@ -530,7 +544,10 @@ struct arm_smmu_ctx_desc {
 struct arm_smmu_s1_cfg {
u8  s1fmt;
u8  s1cdmax;
-   struct arm_smmu_cd_table table;
+   struct arm_smmu_cd_table *tables;
+   size_t  num_tables;
+   __le64  *l1ptr;
+   dma_addr_t  l1ptr_dma;
 
/* Context descriptor 0, when substreams are disabled or s1dss = 0b10 */
struct arm_smmu_ctx_desc cd;
@@ -1118,12 +1135,51 @@ static void arm_smmu_free_cd_leaf_table(struct arm_smmu_device *smmu,
 {
size_t size = num_entries * (CTXDESC_CD_DWORDS << 3);
 
+   if (!table->ptr)
+   return;
dmam_free_coherent(smmu->dev, size, table->ptr, table->ptr_dma);
 }
 
-static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_s1_cfg *cfg, u32 ssid)
+static void arm_smmu_write_cd_l1_desc(__le64 *dst,
+ struct arm_smmu_cd_table *table)
 {
-   return cfg->table.ptr + ssid * CTXDESC_CD_DWORDS;
+   u64 val = (table->ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
+ CTXDESC_L1_DESC_VALID;
+
+   *dst = cpu_to_le64(val);
+}
+
+static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain,
+  u32 ssid)
+{
+   unsigned int idx;
+   struct arm_smmu_cd_table *table;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
+
+   if (cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR) {
+       table = &cfg->tables[0];
+       idx = ssid;
+   } else {
+       idx = ssid >> CTXDESC_SPLIT;
+       if (idx >= cfg->num_tables)
+           return NULL;
+
+       table = &cfg->tables[idx];
+       if (!table->ptr) {
+           __le64 *l1ptr = cfg->l1ptr + idx * CTXDESC_L1_DESC_DWORD;
+
+           if (arm_smmu_alloc_cd_leaf_table(smmu, table,
+                                            CTXDESC_NUM_L2_ENTRIES))
+               return NULL;
+
+           arm_smmu_write_cd_l1_desc(l1ptr, table);
+           /* An invalid L1 entry is allowed to be cached */
+           arm_smmu_sync_cd(smmu_domain, ssid, false);
+       }
+       idx = ssid & (CTXDESC_NUM_L2_ENTRIES - 1);
+   }
+   return table->ptr + idx * CTXDESC_CD_DWORDS;
 }
 
 static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr)
@@ -1149,7 +1205,7 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
u64 val;
bool cd_live;
struct arm_smmu_device *smmu = smmu_domain->smmu;
-   __le64 *cdptr = arm_smmu_get_cd_ptr(&smmu_domain->s1_cfg, ssid);
+   __le64 *cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
 
/*
 * This function handles the following cases:
@@ -1213,20 +1269,81 @@ static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain,
 static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain,
struct arm_smmu_master *master)
 {
+   int ret;
+   size_t size = 0;
+   size_t max_contexts, num_leaf_entries;
struct arm_smmu_device *smmu = smmu_domain->smmu;
struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg;
 
cfg->s1fmt = STRTAB_STE_0_S1FMT_LINEAR;
   

[PATCH 1/8] iommu: Add I/O ASID allocator

2019-06-10 Thread Jean-Philippe Brucker
Some devices might support multiple DMA address spaces, in particular
those that have the PCI PASID feature. PASID (Process Address Space ID)
allows sharing process address spaces with devices (SVA), partitioning a
device into VM-assignable entities (VFIO mdev) or simply providing
multiple DMA address spaces to kernel drivers. Add a global PASID
allocator usable by different drivers at the same time. Name it I/O ASID
to avoid confusion with ASIDs allocated by arch code, which are usually
a separate ID space.

The IOASID space is global. Each device can have its own PASID space,
but by convention the IOMMU ended up having a global PASID space, so
that with SVA, each mm_struct is associated with a single PASID.

The allocator is primarily used by the IOMMU subsystem, but on rare
occasions drivers would like to allocate PASIDs for devices that aren't
managed by an IOMMU, using the same ID space as the IOMMU.
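
To make the "one global ID space shared by several users" idea concrete, here is a toy user-space sketch. It is not the kernel implementation: every `toy_` name is hypothetical, the sentinel value is an assumption, and a fixed array stands in for the xarray used by the patch. Only the inclusive min/max semantics and the "invalid ID on failure" convention mirror the kernel-doc in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_ioasid_t;

#define TOY_INVALID_IOASID	((toy_ioasid_t)-1)	/* assumed sentinel */
#define TOY_IOASID_COUNT	64u			/* tiny space, for the sketch */

/* A set only serves as an identity token for its owner. */
struct toy_ioasid_set { int unused; };

/* Two hypothetical users sharing the same global ID space. */
static struct toy_ioasid_set toy_set_a, toy_set_b;

/* Global table: toy_owner[id] is NULL while the ID is free. */
static const struct toy_ioasid_set *toy_owner[TOY_IOASID_COUNT];

/* Allocate the lowest free ID in [min, max], both bounds inclusive. */
static toy_ioasid_t toy_ioasid_alloc(const struct toy_ioasid_set *set,
				     toy_ioasid_t min, toy_ioasid_t max)
{
	assert(set != NULL);
	if (max >= TOY_IOASID_COUNT)
		max = TOY_IOASID_COUNT - 1;
	for (toy_ioasid_t id = min; id <= max; id++) {
		if (toy_owner[id] == NULL) {
			toy_owner[id] = set;
			return id;
		}
	}
	return TOY_INVALID_IOASID;
}

/* Return an ID to the global space. */
static void toy_ioasid_free(toy_ioasid_t id)
{
	if (id < TOY_IOASID_COUNT)
		toy_owner[id] = NULL;
}
```

Because both sets draw from the same table, an ID handed to one user is never handed to the other until it is freed, which is the property the commit message relies on.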

Signed-off-by: Jean-Philippe Brucker 
Signed-off-by: Jacob Pan 
---
The most recent discussion on this patch was at:
https://lkml.kernel.org/lkml/1556922737-76313-4-git-send-email-jacob.jun@linux.intel.com/
I fixed it up a bit following comments in that series, and removed the
definitions for the custom allocator for now.

There also is a new version that includes the custom allocator into this
patch, but is currently missing the RCU fixes, at:
https://lore.kernel.org/lkml/1560087862-57608-13-git-send-email-jacob.jun@linux.intel.com/
---
 drivers/iommu/Kconfig  |   4 ++
 drivers/iommu/Makefile |   1 +
 drivers/iommu/ioasid.c | 150 +
 include/linux/ioasid.h |  49 ++
 4 files changed, 204 insertions(+)
 create mode 100644 drivers/iommu/ioasid.c
 create mode 100644 include/linux/ioasid.h

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 83664db5221d..9b45f70549a7 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -3,6 +3,10 @@
 config IOMMU_IOVA
tristate
 
+# The IOASID library may also be used by non-IOMMU_API users
+config IOASID
+   tristate
+
 # IOMMU_API always gets selected by whoever wants it.
 config IOMMU_API
bool
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 8c71a15e986b..0efac6f1ec73 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -7,6 +7,7 @@ obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
 obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
+obj-$(CONFIG_IOASID) += ioasid.o
 obj-$(CONFIG_IOMMU_IOVA) += iova.o
 obj-$(CONFIG_OF_IOMMU) += of_iommu.o
 obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
new file mode 100644
index 000000000000..bbb771214fa9
--- /dev/null
+++ b/drivers/iommu/ioasid.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * I/O Address Space ID allocator. There is one global IOASID space, split into
+ * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
+ * free IOASIDs with ioasid_alloc and ioasid_free.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct ioasid_data {
+   ioasid_t id;
+   struct ioasid_set *set;
+   void *private;
+   struct rcu_head rcu;
+};
+
+static DEFINE_XARRAY_ALLOC(ioasid_xa);
+
+/**
+ * ioasid_set_data - Set private data for an allocated ioasid
+ * @ioasid: the ID to set data
+ * @data:   the private data
+ *
+ * For IOASID that is already allocated, private data can be set
+ * via this API. Future lookup can be done via ioasid_find.
+ */
+int ioasid_set_data(ioasid_t ioasid, void *data)
+{
+   struct ioasid_data *ioasid_data;
+   int ret = 0;
+
+   xa_lock(&ioasid_xa);
+   ioasid_data = xa_load(&ioasid_xa, ioasid);
+   if (ioasid_data)
+       rcu_assign_pointer(ioasid_data->private, data);
+   else
+       ret = -ENOENT;
+   xa_unlock(&ioasid_xa);
+
+   /*
+* Wait for readers to stop accessing the old private data, so the
+* caller can free it.
+*/
+   if (!ret)
+   synchronize_rcu();
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(ioasid_set_data);
+
+/**
+ * ioasid_alloc - Allocate an IOASID
+ * @set: the IOASID set
+ * @min: the minimum ID (inclusive)
+ * @max: the maximum ID (inclusive)
+ * @private: data private to the caller
+ *
+ * Allocate an ID between @min and @max. The @private pointer is stored
+ * internally and can be retrieved with ioasid_find().
+ *
+ * Return: the allocated ID on success, or %INVALID_IOASID on failure.
+ */
+ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
+ void *private)
+{
+   u32 id = INVALID_IOASID;
+   struct ioasid_data *data;
+
+   data = kzalloc(sizeof(*data), GFP_KERNEL);
+   if (!data)
+   return INVALID_IOASID;
+
+   data->set = set;
+   data->private = private;
+
+   if (xa_alloc(&ioasid_xa, &id, data, 

[PATCH 4/8] iommu/arm-smmu-v3: Add support for Substream IDs

2019-06-10 Thread Jean-Philippe Brucker
At the moment, the SMMUv3 driver implements only one stage-1 or stage-2
page directory per device. However SMMUv3 allows more than one address
space for some devices, by providing multiple stage-1 page directories. In
addition to the Stream ID (SID), that identifies a device, we can now have
Substream IDs (SSID) identifying an address space. In PCIe, SID is called
Requester ID (RID) and SSID is called Process Address-Space ID (PASID).

Prepare the driver for SSID support, by adding context descriptor tables
in STEs (previously a single static context descriptor). A complete
stage-1 walk is now performed like this by the SMMU:

      Stream tables          Ctx. tables          Page tables
        +--------+   ,------->+-------+   ,------->+-------+
        :        :   |        :       :   |        :       :
        +--------+   |        +-------+   |        +-------+
   SID->|  STE   |---'  SSID->|  CD   |---'  IOVA->|  PTE  |--> IPA
        +--------+            +-------+            +-------+
        :        :            :       :            :       :
        +--------+            +-------+            +-------+

Implement a single level of context descriptor table for now, but as with
stream and page tables, an SSID can be split to index multiple levels of
tables.

In all stream table entries, we set S1DSS=SSID0 mode, making translations
without an SSID use context descriptor 0. Although it would be possible by
setting S1DSS=BYPASS, we don't currently support SSID when the user selects
iommu.passthrough.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/arm-smmu-v3.c | 238 +---
 1 file changed, 192 insertions(+), 46 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 3254f473e681..d90eb604b65d 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -219,6 +219,11 @@
 #define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(51, 6)
 #define STRTAB_STE_0_S1CDMAX   GENMASK_ULL(63, 59)
 
+#define STRTAB_STE_1_S1DSS GENMASK_ULL(1, 0)
+#define STRTAB_STE_1_S1DSS_TERMINATE   0x0
+#define STRTAB_STE_1_S1DSS_BYPASS  0x1
+#define STRTAB_STE_1_S1DSS_SSID0   0x2
+
 #define STRTAB_STE_1_S1C_CACHE_NC  0UL
 #define STRTAB_STE_1_S1C_CACHE_WBRA 1UL
 #define STRTAB_STE_1_S1C_CACHE_WT  2UL
@@ -305,6 +310,7 @@
 #define CMDQ_PREFETCH_1_SIZE   GENMASK_ULL(4, 0)
 #define CMDQ_PREFETCH_1_ADDR_MASK  GENMASK_ULL(63, 12)
 
+#define CMDQ_CFGI_0_SSID   GENMASK_ULL(31, 12)
 #define CMDQ_CFGI_0_SID GENMASK_ULL(63, 32)
 #define CMDQ_CFGI_1_LEAF   (1UL << 0)
 #define CMDQ_CFGI_1_RANGE  GENMASK_ULL(4, 0)
@@ -421,8 +427,11 @@ struct arm_smmu_cmdq_ent {
 
#define CMDQ_OP_CFGI_STE 0x3
#define CMDQ_OP_CFGI_ALL 0x4
+   #define CMDQ_OP_CFGI_CD 0x5
+   #define CMDQ_OP_CFGI_CD_ALL 0x6
struct {
u32 sid;
+   u32 ssid;
union {
bool leaf;
u8  span;
@@ -506,16 +515,25 @@ struct arm_smmu_strtab_l1_desc {
dma_addr_t  l2ptr_dma;
 };
 
+struct arm_smmu_cd_table {
+   __le64  *ptr;
+   dma_addr_t  ptr_dma;
+};
+
+struct arm_smmu_ctx_desc {
+   u16 asid;
+   u64 ttbr;
+   u64 tcr;
+   u64 mair;
+};
+
 struct arm_smmu_s1_cfg {
-   __le64  *cdptr;
-   dma_addr_t  cdptr_dma;
-
-   struct arm_smmu_ctx_desc {
-   u16 asid;
-   u64 ttbr;
-   u64 tcr;
-   u64 mair;
-   }   cd;
+   u8  s1fmt;
+   u8  s1cdmax;
+   struct arm_smmu_cd_table table;
+
+   /* Context descriptor 0, when substreams are disabled or s1dss = 0b10 */
+   struct arm_smmu_ctx_desc cd;
 };
 
 struct arm_smmu_s2_cfg {
@@ -811,10 +829,16 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
cmd[1] |= FIELD_PREP(CMDQ_PREFETCH_1_SIZE, ent->prefetch.size);
cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK;
break;
+   case CMDQ_OP_CFGI_CD:
+   cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SSID, ent->cfgi.ssid);
+   /* Fallthrough */
case CMDQ_OP_CFGI_STE:
cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, ent->cfgi.sid);
cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_LEAF, ent->cfgi.leaf);
break;
+   case CMDQ_OP_CFGI_CD_ALL:
+   cmd[0] 

[PATCH 0/8] iommu: Add auxiliary domain and PASID support to Arm SMMUv3

2019-06-10 Thread Jean-Philippe Brucker
Add substreams and PCI PASID support to the SMMUv3 driver. At the moment
the driver supports a single address space per device. PASID enables
multiple address spaces per device, up to a million in theory (1 << 20).

Two kernel features will make use of PASIDs: auxiliary domains (AUXD)
and Shared Virtual Addressing (SVA). Auxiliary domains allow programming
PASID contexts using IOMMU domains. SVA allows binding process address
spaces to device contexts and relieves device drivers of DMA management.

Since SVA support for SMMUv3 has a lot more dependencies (new fault API,
ASID pinning, generic bind, PRI or stall support, and so on),
introducing PASID support to the SMMUv3 driver is easier with auxiliary
domains.

The AUXD API allows device drivers to easily test PASID support of their
devices, although they need to allocate IOVA and pages themselves
because the DMA API doesn't support AUXD for the moment:

iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_AUX);
domain = iommu_domain_alloc(dev->bus);
iommu_aux_attach_device(domain, dev);
iommu_map(domain, iova, phys_addr, size, prot);
pasid = iommu_aux_get_pasid(domain);
/* Then launch DMA with the PASID and IOVA */

Auxiliary domains also allow splitting devices into multiple contexts
assignable to guests, with vfio-mdev.

Past discussions for these patches:
* Auxiliary domains (patch 6)
  [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
  https://www.spinics.net/lists/iommu/msg30637.html
* SSID support for the SMMU (patches 2, 3, 4, 5, 7 and 8)
  [PATCH v2 00/40] Shared Virtual Addressing for the IOMMU
  https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027595.html
* I/O ASID (patch 1)
  [PATCH v3 00/16] Shared virtual address IOMMU and VT-d support
  https://lkml.kernel.org/lkml/1556922737-76313-4-git-send-email-jacob.jun@linux.intel.com/

Jean-Philippe Brucker (8):
  iommu: Add I/O ASID allocator
  dt-bindings: document PASID property for IOMMU masters
  iommu/arm-smmu-v3: Support platform SSID
  iommu/arm-smmu-v3: Add support for Substream IDs
  iommu/arm-smmu-v3: Add second level of context descriptor table
  iommu/arm-smmu-v3: Support auxiliary domains
  iommu/arm-smmu-v3: Improve add_device() error handling
  iommu/arm-smmu-v3: Add support for PCI PASID

 .../devicetree/bindings/iommu/iommu.txt   |   6 +
 drivers/iommu/Kconfig |   5 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/arm-smmu-v3.c   | 714 --
 drivers/iommu/ioasid.c| 150 
 drivers/iommu/of_iommu.c  |   6 +-
 include/linux/ioasid.h|  49 ++
 include/linux/iommu.h |   1 +
 8 files changed, 865 insertions(+), 67 deletions(-)
 create mode 100644 drivers/iommu/ioasid.c
 create mode 100644 include/linux/ioasid.h

-- 
2.21.0



Re: How to resolve an issue in swiotlb environment?

2019-06-10 Thread Alan Stern
On Mon, 10 Jun 2019, Christoph Hellwig wrote:

> Hi Yoshihiro,
> 
> sorry for not taking care of this earlier, today is a public holiday
> here and thus I'm not working much over the long weekend.
> 
> On Mon, Jun 10, 2019 at 11:13:07AM +0000, Yoshihiro Shimoda wrote:
> > I have another way to avoid the issue. But it doesn't seem that a good way 
> > though...
> > According to the commit that adding blk_queue_virt_boundary() [3],
> > this is needed for vhci_hcd as a workaround so that if we avoid to call it
> > on xhci-hcd driver, the issue disappeared. What do you think?
> > JFYI, I pasted a tentative patch in the end of email [4].
> 
> Oh, I hadn't even look at why USB uses blk_queue_virt_boundary, and it
> seems like the usage is wrong, as it doesn't follow the same rules as
> all the others.  I think your patch goes in the right direction,
> but instead of comparing a hcd name it needs to be keyed of a flag
> set by the driver (I suspect there is one indicating native SG support,
> but I can't quickly find it), and we need an alternative solution
> for drivers that don't see like vhci.  I suspect just limiting the
> entire transfer size to something that works for a single packet
> for them would be fine.

Christoph:

In most of the different kinds of USB host controllers, the hardware is
not capable of assembling a packet out of multiple buffers at arbitrary
addresses.  As a matter of fact, xHCI is the only kind that _can_ do 
this.

In some cases, the hardware can assemble packets provided each buffer
other than the last ends at a page boundary and each buffer other than
the first starts at a page boundary (Intel would say the buffers are
"virtually contiguous"), but this is a rather complex rule and we don't
want to rely on it.  Plus, in other cases the hardware _can't_ do this.

Instead, we want the SG buffers to be set up so that each one (except 
the last) is an exact multiple of the maximum packet size.  That way, 
each packet can be assembled from the contents of a single buffer and 
there's no problem.

The maximum packet size depends on the type of USB connection.  
Typical values are 1024, 512, or 64.  It's always a power of two and
it's smaller than 4096.  Therefore we simplify the problem even further
by requiring that each SG buffer in a scatterlist (except the last one)
be a multiple of the page size.  (It doesn't need to be aligned on a 
page boundary, as far as I remember.)
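
The arithmetic behind that simplification can be sketched in user-space C. This is only an illustration under stated assumptions: the 4096-byte page size and both helper names are hypothetical and are not USB core or block layer APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * True if maxp is a power of two no larger than the page size. In that
 * case the page size is itself a multiple of maxp, so any multiple of the
 * page size is also a multiple of the maximum packet size.
 */
static bool page_covers_maxpacket(size_t maxp)
{
	return maxp != 0 && (maxp & (maxp - 1)) == 0 &&
	       maxp <= SKETCH_PAGE_SIZE;
}

/* True if every buffer except the last has a length that is a multiple of unit. */
static bool sg_lengths_ok(const size_t *len, size_t n, size_t unit)
{
	assert(unit != 0);
	for (size_t i = 0; i + 1 < n; i++)
		if (len[i] % unit != 0)
			return false;
	return true;
}
```

A list with lengths 4096, 8192, 100 passes the check for the page size and therefore also for 1024, 512 and 64, whereas 4096, 1000, 4096 fails for 512 because a packet would have to straddle two buffers.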

That's why the blk_queue_virt_boundary usage was added to the USB code.  
Perhaps it's not the right way of doing this; I'm not an expert on the
inner workings of the block layer.  If you can suggest a better way to
express our requirement, that would be great.

Alan Stern

PS: There _is_ a flag saying whether an HCD supports SG.  But what it
means is that the driver can handle an SG list that meets the
requirement above; it doesn't mean that the driver can reassemble the
data from an SG list into a series of bounce buffers in order to meet
the requirement.  We very much want not to do that, especially since
the block layer should already be capable of doing it for us.



Re: [PATCH 5/6] iommu/vt-d: Cleanup after delegating DMA domain to generic iommu

2019-06-10 Thread Mehta, Sohil
On Sun, 2019-06-09 at 10:38 +0800, Lu Baolu wrote:
>  static int __init si_domain_init(int hw)
> @@ -3306,14 +3252,13 @@ static int __init init_dmars(void)
> if (pasid_supported(iommu))
> intel_svm_init(iommu);
>  #endif
> -   }
>  
> -   /*
> -    * Now that qi is enabled on all iommus, set the root entry
> and flush
> -    * caches. This is required on some Intel X58 chipsets,
> otherwise the
> -    * flush_context function will loop forever and the boot
> hangs.
> -    */
> -   for_each_active_iommu(iommu, drhd) {
> +   /*
> +    * Now that qi is enabled on all iommus, set the root
> entry and
> +    * flush caches. This is required on some Intel X58
> chipsets,
> +    * otherwise the flush_context function will loop
> forever and
> +    * the boot hangs.
> +    */
> iommu_flush_write_buffer(iommu);
> iommu_set_root_entry(iommu);
> iommu->flush.flush_context(iommu, 0, 0, 0,
> DMA_CCMD_GLOBAL_INVL);


This changes the intent of the original code. As the comment says
enable QI on all IOMMUs, then flush the caches and set the root entry.
The order of setting the root entries has changed now.

Refer: 
Commit a4c34ff1c029 ('iommu/vt-d: Enable QI on all IOMMUs before
setting root entry')

--Sohil

Re: [PATCH v4 0/9] iommu: Bounce page for untrusted devices

2019-06-10 Thread Konrad Rzeszutek Wilk
On Mon, Jun 03, 2019 at 09:16:11AM +0800, Lu Baolu wrote:
> The Thunderbolt vulnerabilities are public and have a nice
> name as Thunderclap [1] [3] nowadays. This patch series aims
> to mitigate those concerns.

.. Forgot to ask but should the patches also include the CVE number?
Or at least the last one that enables this?

Thanks.


Re: [PATCH v4 7/9] iommu/vt-d: Add trace events for domain map/unmap

2019-06-10 Thread Konrad Rzeszutek Wilk
On Mon, Jun 03, 2019 at 09:16:18AM +0800, Lu Baolu wrote:
> This adds trace support for the Intel IOMMU driver. It
> also declares some events which could be used to trace
> the events when an IOVA is being mapped or unmapped in
> a domain.

Is that even needed considering SWIOTLB also has tracing events?


Re: [PATCH v4 6/9] iommu/vt-d: Check whether device requires bounce buffer

2019-06-10 Thread Konrad Rzeszutek Wilk
On Mon, Jun 03, 2019 at 09:16:17AM +0800, Lu Baolu wrote:
> This adds a helper to check whether a device needs to
> use bounce buffer. It also provides a boot time option
> to disable the bounce buffer. Users can use this to
> prevent the iommu driver from using the bounce buffer
> for performance gain.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Signed-off-by: Lu Baolu 
> Tested-by: Xu Pengfei 
> Tested-by: Mika Westerberg 
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 5 +
>  drivers/iommu/intel-iommu.c | 6 ++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index 138f6664b2e2..65685c6e53e4 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1728,6 +1728,11 @@
>   Note that using this option lowers the security
>   provided by tboot because it makes the system
>   vulnerable to DMA attacks.
> + nobounce [Default off]
> + Do not use the bounce buffer for untrusted devices like
> + the Thunderbolt devices. This will treat the untrusted

My brain sometimes has a hard time parsing 'Not' and 'un'. Could this be:

Disable bounce buffer for untrusted devices ..?


And perhaps call it 'noswiotlb' ? Not everyone knows that SWIOTLB = bounce 
buffer.

> + devices as the trusted ones, hence might expose security
> + risks of DMA attacks.
>  
>   intel_idle.max_cstate=  [KNL,HW,ACPI,X86]
>   0   disables intel_idle and fall back on acpi_idle.
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 235837c50719..41439647f75d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -371,6 +371,7 @@ static int dmar_forcedac;
>  static int intel_iommu_strict;
>  static int intel_iommu_superpage = 1;
>  static int iommu_identity_mapping;
> +static int intel_no_bounce;

intel_swiotlb_on = 1 ?

>  
>  #define IDENTMAP_ALL 1
>  #define IDENTMAP_GFX 2
> @@ -384,6 +385,8 @@ EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped);
>  static DEFINE_SPINLOCK(device_domain_lock);
>  static LIST_HEAD(device_domain_list);
>  
> +#define device_needs_bounce(d) (!intel_no_bounce && dev_is_untrusted(d))
> +
>  /*
>   * Iterate over elements in device_domain_list and call the specified
>   * callback @fn against each element.
> @@ -466,6 +469,9 @@ static int __init intel_iommu_setup(char *str)
>   printk(KERN_INFO
>   "Intel-IOMMU: not forcing on after tboot. This 
> could expose security risk for tboot\n");
>   intel_iommu_tboot_noforce = 1;
> + } else if (!strncmp(str, "nobounce", 8)) {
> + pr_info("Intel-IOMMU: No bounce buffer. This could 
> expose security risks of DMA attacks\n");

Again, 'Intel-IOMMU: No SWIOTLB. T.. blah blah'

Asking for this as doing 'dmesg | grep SWIOTLB' will expose nicely all
the SWIOTLB invocations..

> + intel_no_bounce = 1;
>   }
>  
>   str += strcspn(str, ",");
> -- 
> 2.17.1
> 


Re: [PATCH v4 4/9] iommu: Add bounce page APIs

2019-06-10 Thread Konrad Rzeszutek Wilk
On Mon, Jun 03, 2019 at 09:16:15AM +0800, Lu Baolu wrote:
> IOMMU hardware always use paging for DMA remapping.  The
> minimum mapped window is a page size. The device drivers
> may map buffers not filling whole IOMMU window. It allows
> device to access to possibly unrelated memory and various
> malicious devices can exploit this to perform DMA attack.
> 
> This introduces the bouce buffer mechanism for DMA buffers
> which doesn't fill a minimal IOMMU page. It could be used
> by various vendor specific IOMMU drivers as long as the
> DMA domain is managed by the generic IOMMU layer. Below
> APIs are added:
> 
> * iommu_bounce_map(dev, addr, paddr, size, dir, attrs)
>   - Map a buffer start at DMA address @addr in bounce page
> manner. For buffer parts that doesn't cross a whole
> minimal IOMMU page, the bounce page policy is applied.
> A bounce page mapped by swiotlb will be used as the DMA
> target in the IOMMU page table. Otherwise, the physical
> address @paddr is mapped instead.
> 
> * iommu_bounce_unmap(dev, addr, size, dir, attrs)
>   - Unmap the buffer mapped with iommu_bounce_map(). The bounce
> page will be torn down after the bounced data get synced.
> 
> * iommu_bounce_sync(dev, addr, size, dir, target)
>   - Synce the bounced data in case the bounce mapped buffer is
> reused.
> 
> The whole APIs are included within a kernel option IOMMU_BOUNCE_PAGE.
> It's useful for cases where bounce page doesn't needed, for example,
> embedded cases.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Alan Cox 
> Cc: Mika Westerberg 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/Kconfig |  14 +
>  drivers/iommu/iommu.c | 119 ++
>  include/linux/iommu.h |  35 +
>  3 files changed, 168 insertions(+)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 83664db5221d..d837ec3f359b 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -86,6 +86,20 @@ config IOMMU_DEFAULT_PASSTHROUGH
>  
> If unsure, say N here.
>  
> +config IOMMU_BOUNCE_PAGE
> + bool "Use bounce page for untrusted devices"
> + depends on IOMMU_API
> + select SWIOTLB

I think you want:
depends on IOMMU_API && SWIOTLB

As people may want to have IOMMU and SWIOTLB, and not IOMMU_BOUNCE_PAGE enabled.

> + help
> +   IOMMU hardware always use paging for DMA remapping. The minimum
> +   mapped window is a page size. The device drivers may map buffers
> +   not filling whole IOMMU window. This allows device to access to
> +   possibly unrelated memory and malicious device can exploit this
> +   to perform a DMA attack. Select this to use a bounce page for the
> +   buffer which doesn't fill a whole IOMU page.
> +
> +   If unsure, say N here.
> +
>  config OF_IOMMU
> def_bool y
> depends on OF && IOMMU_API
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 2a906386bb8e..fa44f681a82b 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2246,3 +2246,122 @@ int iommu_sva_get_pasid(struct iommu_sva *handle)
>   return ops->sva_get_pasid(handle);
>  }
>  EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> +
> +#ifdef CONFIG_IOMMU_BOUNCE_PAGE
> +
> +/*
> + * Bounce buffer support for external devices:
> + *
> + * IOMMU hardware always use paging for DMA remapping. The minimum mapped
> + * window is a page size. The device drivers may map buffers not filling
> + * whole IOMMU window. This allows device to access to possibly unrelated
> + * memory and malicious device can exploit this to perform a DMA attack.
> + * Use bounce pages for the buffer which doesn't fill whole IOMMU pages.
> + */
> +
> +static inline size_t
> +get_aligned_size(struct iommu_domain *domain, dma_addr_t addr, size_t size)
> +{
> + unsigned long page_size = 1 << __ffs(domain->pgsize_bitmap);
> + unsigned long offset = page_size - 1;
> +
> + return ALIGN((addr & offset) + size, page_size);
> +}
> +
> +dma_addr_t iommu_bounce_map(struct device *dev, dma_addr_t iova,
> + phys_addr_t paddr, size_t size,
> + enum dma_data_direction dir,
> + unsigned long attrs)
> +{
> + struct iommu_domain *domain;
> + unsigned int min_pagesz;
> + phys_addr_t tlb_addr;
> + size_t aligned_size;
> + int prot = 0;
> + int ret;
> +
> + domain = iommu_get_dma_domain(dev);
> + if (!domain)
> + return DMA_MAPPING_ERROR;
> +
> + if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)
> + prot |= IOMMU_READ;
> + if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)
> + prot |= IOMMU_WRITE;
> +
> + aligned_size = get_aligned_size(domain, paddr, size);
> + min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
> +
> + /*
> +  * If both the physical buffer start address and size are
> +  * page aligned, we don't 

Re: [PATCH v4 5/9] iommu/vt-d: Don't switch off swiotlb if use direct dma

2019-06-10 Thread Konrad Rzeszutek Wilk
On Mon, Jun 03, 2019 at 09:16:16AM +0800, Lu Baolu wrote:
> The direct dma implementation depends on swiotlb. Hence, don't
> switch of swiotlb since direct dma interfaces are used in this

s/of/off/

> driver.

But I think you really want to leave the code as is but change
the #ifdef to check for IOMMU_BOUNCE_PAGE and not CONFIG_SWIOTLB.

As one could disable IOMMU_BOUNCE_PAGE.
> 
> Cc: Ashok Raj 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Mika Westerberg 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-iommu.c | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index d5a6c8064c56..235837c50719 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -4625,9 +4625,6 @@ static int __init platform_optin_force_iommu(void)
>   iommu_identity_mapping |= IDENTMAP_ALL;
>  
>   dmar_disabled = 0;
> -#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
> - swiotlb = 0;
> -#endif
>   no_iommu = 0;
>  
>   return 1;
> @@ -4765,9 +4762,6 @@ int __init intel_iommu_init(void)
>   }
>   up_write(&dmar_global_lock);
>  
> -#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
> - swiotlb = 0;
> -#endif
>   dma_ops = &intel_dma_ops;
>  
>   init_iommu_pm_ops();
> -- 
> 2.17.1
> 


Re: [PATCH v4 3/9] swiotlb: Zero out bounce buffer for untrusted device

2019-06-10 Thread Konrad Rzeszutek Wilk
On Mon, Jun 03, 2019 at 09:16:14AM +0800, Lu Baolu wrote:
> This is necessary to avoid exposing valid kernel data to any
> milicious device.

malicious 

> 
> Suggested-by: Christoph Hellwig 
> Signed-off-by: Lu Baolu 
> ---
>  kernel/dma/swiotlb.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index f956f785645a..ed41eb7f6131 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #ifdef CONFIG_DEBUG_FS
>  #include 
>  #endif
> @@ -560,6 +561,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>*/
>   for (i = 0; i < nslots; i++)
>   io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT);
> +
> + /* Zero out the bounce buffer if the consumer is untrusted. */
> + if (dev_is_untrusted(hwdev))
> + memset(phys_to_virt(tlb_addr), 0, alloc_size);

What if the alloc_size is less than a PAGE? Should this at least have ALIGN or 
such?

> +
>   if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
>   (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
>   swiotlb_bounce(orig_addr, tlb_addr, mapping_size, 
> DMA_TO_DEVICE);
> -- 
> 2.17.1
> 


Re: [PATCH v4 0/9] iommu: Bounce page for untrusted devices

2019-06-10 Thread Konrad Rzeszutek Wilk
On Mon, Jun 03, 2019 at 09:16:11AM +0800, Lu Baolu wrote:
> The Thunderbolt vulnerabilities are public and have a nice
> name as Thunderclap [1] [3] nowadays. This patch series aims
> to mitigate those concerns.
> 
> An external PCI device is a PCI peripheral device connected
> to the system through an external bus, such as Thunderbolt.
> What makes it different is that it can't be trusted to the
> same degree as the devices build into the system. Generally,
> a trusted PCIe device will DMA into the designated buffers
> and not overrun or otherwise write outside the specified
> bounds. But it's different for an external device.
> 
> The minimum IOMMU mapping granularity is one page (4k), so
> for DMA transfers smaller than that a malicious PCIe device
> can access the whole page of memory even if it does not
> belong to the driver in question. This opens a possibility
> for DMA attack. For more information about DMA attacks
> imposed by an untrusted PCI/PCIe device, please refer to [2].
> 
> This implements bounce buffer for the untrusted external
> devices. The transfers should be limited in isolated pages
> so the IOMMU window does not cover memory outside of what
> the driver expects. Previously (v3 and before), we proposed
> an optimisation to only copy the head and tail of the buffer
> if it spans multiple pages, and directly map the ones in the
> middle. Figure 1 gives a big picture about this solution.
> 
>                                 swiotlb            System
>                 IOVA          bounce page          Memory
>              .---------.      .---------.        .---------.
>              |         |      |         |        |         |
>              |         |      |         |        |         |
> buffer_start .---------.      .---------.        .---------.
>              |         |----->|         |*******>|         |
>              |         |      |         | swiotlb|         |
>              |         |      |         | mapping|         |
>  IOMMU Page  '---------'      '---------'        '---------'
>   Boundary   |         |                         |         |
>              |         |                         |         |
>              |         |                         |         |
>              |         |------------------------>|         |
>              |         |      IOMMU mapping      |         |
>              |         |                         |         |
>  IOMMU Page  .---------.                         .---------.
>   Boundary   |         |                         |         |
>              |         |                         |         |
>              |         |------------------------>|         |
>              |         |      IOMMU mapping      |         |
>              |         |                         |         |
>              |         |                         |         |
>  IOMMU Page  .---------.      .---------.        .---------.
>   Boundary   |         |      |         |        |         |
>              |         |      |         |        |         |
>              |         |----->|         |*******>|         |
>   buffer_end '---------'      '---------' swiotlb'---------'
>              |         |      |         | mapping|         |
>              |         |      |         |        |         |
>              '---------'      '---------'        '---------'
>           Figure 1: A big view of iommu bounce page
> 
> As Robin Murphy pointed out, this ties us to using strict mode for
> TLB maintenance, which may not be an overall win depending on the
> balance between invalidation bandwidth vs. memcpy bandwidth. If we
> use standard SWIOTLB logic to always copy the whole thing, we should
> be able to release the bounce pages via the flush queue to allow
> 'safe' lazy unmaps. So since v4 we start to use the standard swiotlb
> logic.
> 
>                                 swiotlb             System
>                 IOVA          bounce page           Memory
> buffer_start .---------.      .---------.        .---------.
>              |         |      |         |        |         |
>              |         |      |         |        |         |
>              |         |      |         |        .---------.physical
>              |         |----->|         | ------>|         |_start
>              |         |iommu |         | swiotlb|         |
>              |         | map  |         |   map  |         |
>  IOMMU Page  .---------.      .---------.        '---------'

The prior picture had 'buffer_start' at an offset in the page. I am
assuming you meant that here as well?

Meaning it starts at the same offset as 'physical_start' in the right
side box?
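
To make the offset question concrete, here is a rough userspace sketch
(not code from the series). It assumes a 4 KiB IOMMU granule, and every
helper name in it is made up for illustration. It computes the in-page
offset of a buffer, how many bytes outside the buffer a page-granular
mapping would expose, and what a bounce address looks like if the
original in-page offset is kept, which is only an assumption about the
intended behaviour.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                        /* assumed 4 KiB granule */
#define TOY_PAGE_SIZE  (1ULL << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK  (~(TOY_PAGE_SIZE - 1))

/* Offset of an address inside its page. */
static inline uint64_t toy_page_offset(uint64_t addr)
{
	return addr & ~TOY_PAGE_MASK;
}

/* Bytes covered by a page-granular mapping of [addr, addr + len). */
static inline uint64_t toy_mapped_span(uint64_t addr, uint64_t len)
{
	uint64_t first = addr & TOY_PAGE_MASK;
	uint64_t last = (addr + len + TOY_PAGE_SIZE - 1) & TOY_PAGE_MASK;

	return last - first;
}

/* Bytes the device could reach that lie outside the driver's buffer. */
static inline uint64_t toy_exposed_bytes(uint64_t addr, uint64_t len)
{
	return toy_mapped_span(addr, len) - len;
}

/* Address in a bounce page that keeps the original in-page offset. */
static inline uint64_t toy_bounce_addr(uint64_t bounce_page, uint64_t addr)
{
	return (bounce_page & TOY_PAGE_MASK) | toy_page_offset(addr);
}
```

For a 0x200-byte buffer at 0x12345a00, the in-page offset is 0xa00 and a
direct page-granular mapping leaves 0xe00 bytes of unrelated memory
reachable by the device. Bouncing through an isolated page removes that
exposure whether or not the offset is preserved.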

>   Boundary   |         |      |         |        |         |
>              |         |      |         |        |         |
>              |         |----->|         |        |         |
>              |         |iommu |         |        |         |
>              |         | map  |

Re: "iommu/vt-d: Delegate DMA domain to generic iommu" series breaks megaraid_sas

2019-06-10 Thread Qian Cai
On Mon, 2019-06-10 at 09:44 -0400, Qian Cai wrote:
> On Sun, 2019-06-09 at 10:43 +0800, Lu Baolu wrote:
> > Hi Qian,
> > 
> > I just posted some fix patches. I cc'ed them in your email inbox as
> > well. Can you please check whether they happen to fix your issue?
> > If not, do you mind posting more debug messages?
> 
> Unfortunately, it does not work. Here is the dmesg.
> 
> https://raw.githubusercontent.com/cailca/tmp/master/dmesg?token=AMC35QKPIZBYUMFUQKLW4ZC47ZPIK

This one should be good to view.

https://cailca.github.io/files/dmesg.txt


Re: Device specific pass through in host systems - discuss user interface

2019-06-10 Thread Raj, Ashok
Hi Sai

On Sun, Jun 09, 2019 at 10:41:10PM -0700, Sai Praneeth Prakhya wrote:
> > > I am working on an IOMMU driver feature that allows a user to specify
> > > if the DMA from a device should be translated by IOMMU or not.
> > > Presently, we support only all devices or none mode i.e. if user
> > > specifies "iommu=pt" [X86] or "iommu.passthrough" [ARM64] through
> > > kernel command line, all the devices would be in pass through mode and
> > > we don't have per device granularity, but, we were requested by a
> > > customer to selectively put devices in pass through mode and not all.
> > 
> > Most iommu vendor drivers have switched from per-device to per-group domain
> > (a.k.a. default domain). So per-group pass-through mode makes more sense?
> > 
> > By the way, can we extend this to "per-group default domain type", instead
> > of only "per-group pass-through mode"? Currently we have system level default
> > domain type, if we have finer granularity of default domain type, both iommu
> > drivers and end users will benefit from it.
> 
> Sure! Makes sense.. per-group default domain type sounds good.
> 
> > > I am looking for a consensus on **how the kernel command line argument
> > > should look like and path for sysfs entry**. Also, please note that if
> > > a device is put in pass through mode it won't be available for the
> > > guest and that's ok.
> > 
> > Just out of curiosity, what's the limitation for a device using a
> > pass-through DMA domain to be assignable?
> 
> Sorry! I don't know about assignable devices. Probably, Ashok or Jacob could
> answer this question.

We don't switch the domain for assigned devices. Only the "type" of the
default domain is changed from dma-protected to passthrough type.

When assigning devices to user-space, there is no change in this proposal.

> 
> Regards,
> Sai


Re: "iommu/vt-d: Delegate DMA domain to generic iommu" series breaks megaraid_sas

2019-06-10 Thread Qian Cai
On Sun, 2019-06-09 at 10:43 +0800, Lu Baolu wrote:
> Hi Qian,
> 
> I just posted some fix patches. I cc'ed them in your email inbox as
> well. Can you please check whether they happen to fix your issue?
> If not, do you mind posting more debug messages?

Unfortunately, it does not work. Here is the dmesg.

https://raw.githubusercontent.com/cailca/tmp/master/dmesg?token=AMC35QKPIZBYUMFUQKLW4ZC47ZPIK



Re: [PATCH v6 0/6] Allwinner H6 Mali GPU support

2019-06-10 Thread Tomeu Vizoso
On Wed, 29 May 2019 at 19:38, Robin Murphy  wrote:
>
> On 29/05/2019 16:09, Tomeu Vizoso wrote:
> > On Tue, 21 May 2019 at 18:11, Clément Péron  wrote:
> >>
> > [snip]
> >> [  345.204813] panfrost 180.gpu: mmu irq status=1
> >> [  345.209617] panfrost 180.gpu: Unhandled Page fault in AS0 at VA
> >> 0x02400400
> >
> >  From what I can see here, 0x02400400 points to the first byte
> > of the first submitted job descriptor.
> >
> > So mapping buffers for the GPU doesn't seem to be working at all on
> > 64-bit T-760.
> >
> > Steven, Robin, do you have any idea of why this could be?
>
> I tried rolling back to the old panfrost/nondrm shim, and it works fine
> with kbase, and I also found that T-820 falls over in the exact same
> manner, so the fact that it seemed to be common to the smaller 33-bit
> designs rather than anything to do with the other
> job_descriptor_size/v4/v5 complication turned out to be telling.
>
> [ as an aside, are 64-bit jobs actually known not to work on v4 GPUs, or
> is it just that nobody's yet observed a 64-bit blob driving one? ]

Do you know if 64-bit descriptors work on v4 GPUs with our kernel
driver but with the DDK?

I wonder if there is something else to be fixed in the kernel for that scenario.

Thanks,

Tomeu

> Long story short, it appears that 'Mali LPAE' is also lacking the start
> level notion of VMSA, and expects a full 4-level table even for <40 bits
> when level 0 is effectively redundant. Thus walking the 3-level table that
> io-pgtable comes back with ends up going wildly wrong. The hack below
> seems to do the job for me; if Clément can confirm (on T-720 you'll
> still need the userspace hack to force 32-bit jobs as well) then I think
> I'll cook up a proper refactoring of the allocator to put things right.
>
> Robin.
>
>
> ----->8-----
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 546968d8a349..f29da6e8dc08 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -1023,12 +1023,14 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
> if (iop) {
> u64 mair, ttbr;
> +   struct arm_lpae_io_pgtable *data = 
> io_pgtable_ops_to_data(>ops);
>
> +   data->levels = 4;
> /* Copy values as union fields overlap */
> mair = cfg->arm_lpae_s1_cfg.mair[0];
> ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
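
For readers following along, the 3-level versus 4-level point in Robin's
mail can be checked with a little arithmetic. The sketch below is
userspace C, not the io-pgtable code itself; it assumes 8-byte
descriptors, so each level resolves (pg_shift - 3) address bits, and the
function name is made up for illustration.

```c
#define TOY_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Number of table levels needed to translate 'ias' input address bits
 * with a granule of (1 << pg_shift) bytes, assuming 8-byte descriptors.
 */
static inline unsigned int toy_lpae_levels(unsigned int ias,
					   unsigned int pg_shift)
{
	unsigned int bits_per_level = pg_shift - 3;

	return TOY_DIV_ROUND_UP(ias - pg_shift, bits_per_level);
}
```

With a 4 KiB granule this gives 3 levels for a 33-bit input size and 4
levels at 40 bits and above, which matches the description: an allocator
that trims unused top levels hands back a 3-level table for the 33-bit
parts, while the hardware is described as always walking 4.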

[PATCH v2 11/12] arm: dts: mediatek: Get rid of mediatek,larb for MM nodes

2019-06-10 Thread Yong Wu
After adding a device_link between the IOMMU consumer and smi,
the mediatek,larb property is no longer necessary.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 arch/arm/boot/dts/mt2701.dtsi | 1 -
 arch/arm/boot/dts/mt7623.dtsi | 1 -
 2 files changed, 2 deletions(-)

diff --git a/arch/arm/boot/dts/mt2701.dtsi b/arch/arm/boot/dts/mt2701.dtsi
index 51e1305..57b5de3 100644
--- a/arch/arm/boot/dts/mt2701.dtsi
+++ b/arch/arm/boot/dts/mt2701.dtsi
@@ -564,7 +564,6 @@
clock-names = "jpgdec-smi",
  "jpgdec";
power-domains = < MT2701_POWER_DOMAIN_ISP>;
-   mediatek,larb = <>;
iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>,
 < MT2701_M4U_PORT_JPGDEC_BSDMA>;
};
diff --git a/arch/arm/boot/dts/mt7623.dtsi b/arch/arm/boot/dts/mt7623.dtsi
index a79f0b6..cf22c58 100644
--- a/arch/arm/boot/dts/mt7623.dtsi
+++ b/arch/arm/boot/dts/mt7623.dtsi
@@ -783,7 +783,6 @@
clock-names = "jpgdec-smi",
  "jpgdec";
power-domains = < MT2701_POWER_DOMAIN_ISP>;
-   mediatek,larb = <>;
iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>,
 < MT2701_M4U_PORT_JPGDEC_BSDMA>;
};
-- 
1.9.1



[PATCH v2 12/12] arm64: dts: mediatek: Get rid of mediatek,larb for MM nodes

2019-06-10 Thread Yong Wu
After adding a device_link between the IOMMU consumer and smi,
the mediatek,larb property is no longer necessary.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 arch/arm64/boot/dts/mediatek/mt8173.dtsi | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
index 15f1842..06e2c09 100644
--- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
@@ -921,7 +921,6 @@
 < CLK_MM_MUTEX_32K>;
power-domains = < MT8173_POWER_DOMAIN_MM>;
iommus = < M4U_PORT_MDP_RDMA0>;
-   mediatek,larb = <>;
mediatek,vpu = <>;
};
 
@@ -932,7 +931,6 @@
 < CLK_MM_MUTEX_32K>;
power-domains = < MT8173_POWER_DOMAIN_MM>;
iommus = < M4U_PORT_MDP_RDMA1>;
-   mediatek,larb = <>;
};
 
mdp_rsz0: rsz@14003000 {
@@ -962,7 +960,6 @@
clocks = < CLK_MM_MDP_WDMA>;
power-domains = < MT8173_POWER_DOMAIN_MM>;
iommus = < M4U_PORT_MDP_WDMA>;
-   mediatek,larb = <>;
};
 
mdp_wrot0: wrot@14007000 {
@@ -971,7 +968,6 @@
clocks = < CLK_MM_MDP_WROT0>;
power-domains = < MT8173_POWER_DOMAIN_MM>;
iommus = < M4U_PORT_MDP_WROT0>;
-   mediatek,larb = <>;
};
 
mdp_wrot1: wrot@14008000 {
@@ -980,7 +976,6 @@
clocks = < CLK_MM_MDP_WROT1>;
power-domains = < MT8173_POWER_DOMAIN_MM>;
iommus = < M4U_PORT_MDP_WROT1>;
-   mediatek,larb = <>;
};
 
ovl0: ovl@1400c000 {
@@ -990,7 +985,6 @@
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_OVL0>;
iommus = < M4U_PORT_DISP_OVL0>;
-   mediatek,larb = <>;
};
 
ovl1: ovl@1400d000 {
@@ -1000,7 +994,6 @@
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_OVL1>;
iommus = < M4U_PORT_DISP_OVL1>;
-   mediatek,larb = <>;
};
 
rdma0: rdma@1400e000 {
@@ -1010,7 +1003,6 @@
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA0>;
iommus = < M4U_PORT_DISP_RDMA0>;
-   mediatek,larb = <>;
};
 
rdma1: rdma@1400f000 {
@@ -1020,7 +1012,6 @@
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA1>;
iommus = < M4U_PORT_DISP_RDMA1>;
-   mediatek,larb = <>;
};
 
rdma2: rdma@1401 {
@@ -1030,7 +1021,6 @@
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA2>;
iommus = < M4U_PORT_DISP_RDMA2>;
-   mediatek,larb = <>;
};
 
wdma0: wdma@14011000 {
@@ -1040,7 +1030,6 @@
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_WDMA0>;
iommus = < M4U_PORT_DISP_WDMA0>;
-   mediatek,larb = <>;
};
 
wdma1: wdma@14012000 {
@@ -1050,7 +1039,6 @@
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_WDMA1>;
iommus = < M4U_PORT_DISP_WDMA1>;
-   mediatek,larb = <>;
};
 
color0: color@14013000 {
@@ -1294,7 +1282,6 @@
  <0 0x16027800 0 0x800>,   /* VDEC_HWB */
  <0 0x16028400 0 0x400>;   /* VDEC_HWG */
interrupts = ;
-   mediatek,larb = <>;
iommus = < M4U_PORT_HW_VDEC_MC_EXT>,
 < M4U_PORT_HW_VDEC_PP_EXT>,
 < M4U_PORT_HW_VDEC_AVC_MV_EXT>,
@@ -1364,8 +1351,6 @@
  <0 0x19002000 0 0x1000>;  /* VENC_LT_SYS */
interrupts = ,
 ;
-   mediatek,larb = <>,
-   <>;
iommus = < M4U_PORT_VENC_RCPU>,
 < M4U_PORT_VENC_REC>,
 < 

[PATCH v2 10/12] iommu/mediatek: Use builtin_platform_driver

2019-06-10 Thread Yong Wu
The MediaTek IOMMU has to wait for smi-larb, which in turn has to wait
for the power domain (mtk-scpsys.c) and the multimedia ccf, both of which
are registered at module init. Thus, subsys_initcall for the MediaTek
IOMMU is not helpful. Switch to builtin_platform_driver.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c| 31 +--
 drivers/iommu/mtk_iommu_v1.c | 24 +---
 2 files changed, 2 insertions(+), 53 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 7b70574..8459597 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -712,22 +712,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return component_master_add_with_match(dev, _iommu_com_ops, match);
 }
 
-static int mtk_iommu_remove(struct platform_device *pdev)
-{
-   struct mtk_iommu_data *data = platform_get_drvdata(pdev);
-
-   iommu_device_sysfs_remove(>iommu);
-   iommu_device_unregister(>iommu);
-
-   if (iommu_present(_bus_type))
-   bus_set_iommu(_bus_type, NULL);
-
-   clk_disable_unprepare(data->bclk);
-   devm_free_irq(>dev, data->irq, data);
-   component_master_del(>dev, _iommu_com_ops);
-   return 0;
-}
-
 static int __maybe_unused mtk_iommu_suspend(struct device *dev)
 {
struct mtk_iommu_data *data = dev_get_drvdata(dev);
@@ -808,23 +792,10 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 
 static struct platform_driver mtk_iommu_driver = {
.probe  = mtk_iommu_probe,
-   .remove = mtk_iommu_remove,
.driver = {
.name = "mtk-iommu",
.of_match_table = of_match_ptr(mtk_iommu_of_ids),
.pm = _iommu_pm_ops,
}
 };
-
-static int __init mtk_iommu_init(void)
-{
-   int ret;
-
-   ret = platform_driver_register(_iommu_driver);
-   if (ret != 0)
-   pr_err("Failed to register MTK IOMMU driver\n");
-
-   return ret;
-}
-
-subsys_initcall(mtk_iommu_init)
+builtin_platform_driver(mtk_iommu_driver);
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 845e20b..1c0fb82 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -650,22 +650,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return component_master_add_with_match(dev, _iommu_com_ops, match);
 }
 
-static int mtk_iommu_remove(struct platform_device *pdev)
-{
-   struct mtk_iommu_data *data = platform_get_drvdata(pdev);
-
-   iommu_device_sysfs_remove(>iommu);
-   iommu_device_unregister(>iommu);
-
-   if (iommu_present(_bus_type))
-   bus_set_iommu(_bus_type, NULL);
-
-   clk_disable_unprepare(data->bclk);
-   devm_free_irq(>dev, data->irq, data);
-   component_master_del(>dev, _iommu_com_ops);
-   return 0;
-}
-
 static int __maybe_unused mtk_iommu_suspend(struct device *dev)
 {
struct mtk_iommu_data *data = dev_get_drvdata(dev);
@@ -702,16 +686,10 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 
 static struct platform_driver mtk_iommu_driver = {
.probe  = mtk_iommu_probe,
-   .remove = mtk_iommu_remove,
.driver = {
.name = "mtk-iommu-v1",
.of_match_table = mtk_iommu_of_ids,
.pm = _iommu_pm_ops,
}
 };
-
-static int __init m4u_init(void)
-{
-   return platform_driver_register(_iommu_driver);
-}
-subsys_initcall(m4u_init);
+builtin_platform_driver(mtk_iommu_driver);
-- 
1.9.1



[PATCH v2 09/12] memory: mtk-smi: Get rid of mtk_smi_larb_get/put

2019-06-10 Thread Yong Wu
After adding a device_link between the iommu consumer and smi-larb,
the pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically, so we can get rid of mtk_smi_larb_get/put.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/memory/mtk-smi.c   | 14 --
 include/soc/mediatek/smi.h | 20 
 2 files changed, 34 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 98b1180..11d99b7 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -123,20 +123,6 @@ static void mtk_smi_clk_disable(const struct mtk_smi *smi)
clk_disable_unprepare(smi->clk_apb);
 }
 
-int mtk_smi_larb_get(struct device *larbdev)
-{
-   int ret = pm_runtime_get_sync(larbdev);
-
-   return (ret < 0) ? ret : 0;
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
-
-void mtk_smi_larb_put(struct device *larbdev)
-{
-   pm_runtime_put_sync(larbdev);
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
-
 static int
 mtk_smi_larb_bind(struct device *dev, struct device *master, void *data)
 {
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 7a8d870..609397d 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -24,26 +24,6 @@ struct mtk_smi_iommu {
struct mtk_smi_larb_iommu larb_imu[MTK_LARB_NR_MAX];
 };
 
-/*
- * mtk_smi_larb_get: Enable the power domain and clocks for this local arbiter.
- *   It also initialize some basic setting(like iommu).
- * mtk_smi_larb_put: Disable the power domain and clocks for this local arbiter.
- * Both should be called in non-atomic context.
- *
- * Returns 0 if successful, negative on failure.
- */
-int mtk_smi_larb_get(struct device *larbdev);
-void mtk_smi_larb_put(struct device *larbdev);
-
-#else
-
-static inline int mtk_smi_larb_get(struct device *larbdev)
-{
-   return 0;
-}
-
-static inline void mtk_smi_larb_put(struct device *larbdev) { }
-
 #endif
 
 #endif
-- 
1.9.1



[PATCH v2 08/12] drm/mediatek: Get rid of mtk_smi_larb_get/put

2019-06-10 Thread Yong Wu
MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the drm device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: CK Hu 
CC: Philipp Zabel 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 11 ---
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 26 --
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 -
 3 files changed, 38 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index acad088..3a21a48 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -18,7 +18,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_drm_drv.h"
 #include "mtk_drm_crtc.h"
@@ -371,20 +370,12 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc,
   struct drm_crtc_state *old_state)
 {
struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
-   struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
int ret;
 
DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
 
-   ret = mtk_smi_larb_get(comp->larb_dev);
-   if (ret) {
-   DRM_ERROR("Failed to get larb: %d\n", ret);
-   return;
-   }
-
ret = mtk_crtc_ddp_hw_init(mtk_crtc);
if (ret) {
-   mtk_smi_larb_put(comp->larb_dev);
return;
}
 
@@ -396,7 +387,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,
struct drm_crtc_state *old_state)
 {
struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
-   struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
int i;
 
DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
@@ -419,7 +409,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,
 
drm_crtc_vblank_off(crtc);
mtk_crtc_ddp_hw_fini(mtk_crtc);
-   mtk_smi_larb_put(comp->larb_dev);
 
mtk_crtc->enabled = false;
 }
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 54ca794..ede15c9 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -265,8 +265,6 @@ int mtk_ddp_comp_init(struct device *dev, struct device_node *node,
  const struct mtk_ddp_comp_funcs *funcs)
 {
enum mtk_ddp_comp_type type;
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
 
if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX)
return -EINVAL;
@@ -296,30 +294,6 @@ int mtk_ddp_comp_init(struct device *dev, struct device_node *node,
if (IS_ERR(comp->clk))
return PTR_ERR(comp->clk);
 
-   /* Only DMA capable components need the LARB property */
-   comp->larb_dev = NULL;
-   if (type != MTK_DISP_OVL &&
-   type != MTK_DISP_RDMA &&
-   type != MTK_DISP_WDMA)
-   return 0;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev,
-   "Missing mediadek,larb phandle in %pOF node\n", node);
-   return -EINVAL;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   return -EPROBE_DEFER;
-   }
-   of_node_put(larb_node);
-
-   comp->larb_dev = _pdev->dev;
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
index 8399229..b8dc17e 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
@@ -91,7 +91,6 @@ struct mtk_ddp_comp {
struct clk *clk;
void __iomem *regs;
int irq;
-   struct device *larb_dev;
enum mtk_ddp_comp_id id;
const struct mtk_ddp_comp_funcs *funcs;
 };
-- 
1.9.1



[PATCH v2 07/12] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put

2019-06-10 Thread Yong Wu
MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the vcodec device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: Tiffany Lin 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 .../media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c  | 21 --
 drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h |  3 --
 drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c |  1 -
 .../media/platform/mtk-vcodec/mtk_vcodec_enc_pm.c  | 47 --
 4 files changed, 72 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 7884465..6caad39 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_vcodec_dec_pm.h"
 #include "mtk_vcodec_util.h"
@@ -24,7 +23,6 @@
 
 int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
 {
-   struct device_node *node;
struct platform_device *pdev;
struct mtk_vcodec_pm *pm;
struct mtk_vcodec_clk *dec_clk;
@@ -35,18 +33,6 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
pm = >pm;
pm->mtkdev = mtkdev;
dec_clk = >vdec_clk;
-   node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0);
-   if (!node) {
-   mtk_v4l2_err("of_parse_phandle mediatek,larb fail!");
-   return -1;
-   }
-
-   pdev = of_find_device_by_node(node);
-   if (WARN_ON(!pdev)) {
-   of_node_put(node);
-   return -1;
-   }
-   pm->larbvdec = >dev;
pdev = mtkdev->plat_dev;
pm->dev = >dev;
 
@@ -121,12 +107,6 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm)
goto error;
}
}
-
-   ret = mtk_smi_larb_get(pm->larbvdec);
-   if (ret) {
-   mtk_v4l2_err("mtk_smi_larb_get larbvdec fail %d", ret);
-   goto error;
-   }
return;
 
 error:
@@ -139,7 +119,6 @@ void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm)
struct mtk_vcodec_clk *dec_clk = >vdec_clk;
int i = 0;
 
-   mtk_smi_larb_put(pm->larbvdec);
for (i = dec_clk->clk_num - 1; i >= 0; i--)
clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
 }
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 662a84b..cf56b07 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -196,11 +196,8 @@ struct mtk_vcodec_clk {
  */
 struct mtk_vcodec_pm {
struct mtk_vcodec_clk   vdec_clk;
-   struct device   *larbvdec;
 
struct mtk_vcodec_clk   venc_clk;
-   struct device   *larbvenc;
-   struct device   *larbvenclt;
struct device   *dev;
struct mtk_vcodec_dev   *mtkdev;
 };
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
index 50351ad..80d1c4e 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_vcodec_drv.h"
 #include "mtk_vcodec_enc.h"
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_pm.c
index 39375b8..f61c65d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_enc_pm.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_vcodec_enc_pm.h"
 #include "mtk_vcodec_util.h"
@@ -25,49 +24,18 @@
 
 int mtk_vcodec_init_enc_pm(struct mtk_vcodec_dev *mtkdev)
 {
-   struct device_node *node;
struct platform_device *pdev;
struct mtk_vcodec_pm *pm;
struct mtk_vcodec_clk *enc_clk;
struct mtk_vcodec_clk_info *clk_info;
int ret = 0, i = 0;
-   struct device *dev;
 
pdev = mtkdev->plat_dev;
pm = >pm;
memset(pm, 0, sizeof(struct mtk_vcodec_pm));
pm->mtkdev = mtkdev;
pm->dev = >dev;
-   dev = >dev;
enc_clk = >venc_clk;
-
-   node = of_parse_phandle(dev->of_node, "mediatek,larb", 0);
-   if (!node) {
-   mtk_v4l2_err("no mediatek,larb found");
-   return -ENODEV;
-   }
-   pdev = of_find_device_by_node(node);
-   of_node_put(node);
-   if (!pdev) {
-   mtk_v4l2_err("no mediatek,larb device found");
-   return -ENODEV;
-   }
-   pm->larbvenc = >dev;
-
-   node = of_parse_phandle(dev->of_node, "mediatek,larb", 1);
-   if (!node) {
-   mtk_v4l2_err("no mediatek,larb found");
-   return -ENODEV;
-   }
-
-   

[PATCH v2 06/12] media: mtk-mdp: Get rid of mtk_smi_larb_get/put

2019-06-10 Thread Yong Wu
MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the mdp device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: Minghsiu Tsai 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.c | 38 ---
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.h |  2 --
 drivers/media/platform/mtk-mdp/mtk_mdp_core.c |  1 -
 3 files changed, 41 deletions(-)

diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
index 03aba03..4f7cbc4 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_mdp_comp.h"
 
@@ -66,14 +65,6 @@ void mtk_mdp_comp_clock_on(struct device *dev, struct mtk_mdp_comp *comp)
 {
int i, err;
 
-   if (comp->larb_dev) {
-   err = mtk_smi_larb_get(comp->larb_dev);
-   if (err)
-   dev_err(dev,
-   "failed to get larb, err %d. type:%d id:%d\n",
-   err, comp->type, comp->id);
-   }
-
for (i = 0; i < ARRAY_SIZE(comp->clk); i++) {
if (IS_ERR(comp->clk[i]))
continue;
@@ -94,16 +85,11 @@ void mtk_mdp_comp_clock_off(struct device *dev, struct mtk_mdp_comp *comp)
continue;
clk_disable_unprepare(comp->clk[i]);
}
-
-   if (comp->larb_dev)
-   mtk_smi_larb_put(comp->larb_dev);
 }
 
 int mtk_mdp_comp_init(struct device *dev, struct device_node *node,
  struct mtk_mdp_comp *comp, enum mtk_mdp_comp_id comp_id)
 {
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
int i;
 
if (comp_id < 0 || comp_id >= MTK_MDP_COMP_ID_MAX) {
@@ -124,30 +110,6 @@ int mtk_mdp_comp_init(struct device *dev, struct device_node *node,
break;
}
 
-   /* Only DMA capable components need the LARB property */
-   comp->larb_dev = NULL;
-   if (comp->type != MTK_MDP_RDMA &&
-   comp->type != MTK_MDP_WDMA &&
-   comp->type != MTK_MDP_WROT)
-   return 0;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev,
-   "Missing mediadek,larb phandle in %pOF node\n", node);
-   return -EINVAL;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   return -EPROBE_DEFER;
-   }
-   of_node_put(larb_node);
-
-   comp->larb_dev = _pdev->dev;
-
return 0;
 }
 
diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
index 63b3983..602d577 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
@@ -47,7 +47,6 @@ enum mtk_mdp_comp_id {
  * @dev_node:  component device node
  * @clk:   clocks required for component
  * @regs:  Mapped address of component registers.
- * @larb_dev:  SMI device required for component
  * @type:  component type
  * @id:component ID
  */
@@ -55,7 +54,6 @@ struct mtk_mdp_comp {
struct device_node  *dev_node;
struct clk  *clk[2];
void __iomem*regs;
-   struct device   *larb_dev;
enum mtk_mdp_comp_type  type;
enum mtk_mdp_comp_idid;
 };
diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
index bbb24fb..adb098d 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
@@ -25,7 +25,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_mdp_core.h"
 #include "mtk_mdp_m2m.h"
-- 
1.9.1



[PATCH v2 05/12] media: mtk-jpeg: Get rid of mtk_smi_larb_get/put

2019-06-10 Thread Yong Wu
MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the jpg device calls pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync is also called automatically.

CC: Rick Chang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c | 22 --
 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h |  2 --
 2 files changed, 24 deletions(-)

diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
index f761e4d..2f37538 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "mtk_jpeg_hw.h"
 #include "mtk_jpeg_core.h"
@@ -901,11 +900,6 @@ static int mtk_jpeg_queue_init(void *priv, struct vb2_queue *src_vq,
 
 static void mtk_jpeg_clk_on(struct mtk_jpeg_dev *jpeg)
 {
-   int ret;
-
-   ret = mtk_smi_larb_get(jpeg->larb);
-   if (ret)
-   dev_err(jpeg->dev, "mtk_smi_larb_get larbvdec fail %d\n", ret);
clk_prepare_enable(jpeg->clk_jdec_smi);
clk_prepare_enable(jpeg->clk_jdec);
 }
@@ -914,7 +908,6 @@ static void mtk_jpeg_clk_off(struct mtk_jpeg_dev *jpeg)
 {
clk_disable_unprepare(jpeg->clk_jdec);
clk_disable_unprepare(jpeg->clk_jdec_smi);
-   mtk_smi_larb_put(jpeg->larb);
 }
 
 static irqreturn_t mtk_jpeg_dec_irq(int irq, void *priv)
@@ -1059,21 +1052,6 @@ static int mtk_jpeg_release(struct file *file)
 
 static int mtk_jpeg_clk_init(struct mtk_jpeg_dev *jpeg)
 {
-   struct device_node *node;
-   struct platform_device *pdev;
-
-   node = of_parse_phandle(jpeg->dev->of_node, "mediatek,larb", 0);
-   if (!node)
-   return -EINVAL;
-   pdev = of_find_device_by_node(node);
-   if (WARN_ON(!pdev)) {
-   of_node_put(node);
-   return -EINVAL;
-   }
-   of_node_put(node);
-
-   jpeg->larb = >dev;
-
jpeg->clk_jdec = devm_clk_get(jpeg->dev, "jpgdec");
if (IS_ERR(jpeg->clk_jdec))
return PTR_ERR(jpeg->clk_jdec);
diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
index 1a6cdfd..e35fb79 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
@@ -55,7 +55,6 @@ enum mtk_jpeg_ctx_state {
  * @dec_reg_base:  JPEG registers mapping
  * @clk_jdec:  JPEG hw working clock
  * @clk_jdec_smi:  JPEG SMI bus clock
- * @larb:  SMI device
  */
 struct mtk_jpeg_dev {
struct mutexlock;
@@ -69,7 +68,6 @@ struct mtk_jpeg_dev {
void __iomem*dec_reg_base;
struct clk  *clk_jdec;
struct clk  *clk_jdec_smi;
-   struct device   *larb;
 };
 
 /**
-- 
1.9.1



[PATCH v2 04/12] memory: mtk-smi: Add device-link between smi-larb and smi-common

2019-06-10 Thread Yong Wu
Normally, if the smi-larb HW needs to work, we should enable the
smi-common HW power and clock first.
This patch adds a device-link between the smi-larb dev and the smi-common
dev. Then, on pm_runtime_get_sync(smi-larb-dev), pm_runtime_get_sync
(smi-common-dev) will be called automatically.

CC: Matthias Brugger 
Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 9688341..98b1180 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -271,6 +271,7 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct device_node *smi_node;
struct platform_device *smi_pdev;
+   struct device_link *link;
 
larb = devm_kzalloc(dev, sizeof(*larb), GFP_KERNEL);
if (!larb)
@@ -310,6 +311,13 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
if (!platform_get_drvdata(smi_pdev))
return -EPROBE_DEFER;
larb->smi_common_dev = &smi_pdev->dev;
+   link = device_link_add(dev, larb->smi_common_dev,
+  DL_FLAG_PM_RUNTIME |
+  DL_FLAG_AUTOREMOVE_CONSUMER);
+   if (!link) {
+   dev_err(dev, "Unable to link smi-common dev\n");
+   return -ENODEV;
+   }
} else {
dev_err(dev, "Failed to get the smi_common device\n");
return -EINVAL;
@@ -333,17 +341,9 @@ static int __maybe_unused mtk_smi_larb_resume(struct device *dev)
const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
int ret;
 
-   /* Power on smi-common. */
-   ret = pm_runtime_get_sync(larb->smi_common_dev);
-   if (ret < 0) {
-   dev_err(dev, "Failed to pm get for smi-common(%d).\n", ret);
-   return ret;
-   }
-
ret = mtk_smi_clk_enable(&larb->smi);
if (ret < 0) {
dev_err(dev, "Failed to enable clock(%d).\n", ret);
-   pm_runtime_put_sync(larb->smi_common_dev);
return ret;
}
 
@@ -358,7 +358,6 @@ static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
struct mtk_smi_larb *larb = dev_get_drvdata(dev);
 
mtk_smi_clk_disable(&larb->smi);
-   pm_runtime_put_sync(larb->smi_common_dev);
return 0;
 }
 
-- 
1.9.1



[PATCH v2 03/12] iommu/mediatek: Add device_link between the consumer and the larb devices

2019-06-10 Thread Yong Wu
The MediaTek IOMMU doesn't have its own power domain. All the consumers
connect to an smi-larb, which in turn connects to smi-common.

        M4U
         |
    smi-common
         |
  -------------
  |         |    ...
  |         |
larb1     larb2
  |         |
vdec       venc

When a consumer works, it must enable the smi-larb's power, which in
turn requires the smi-common's power to be enabled first.

Thus, first use a device link to connect the consumer to its smi-larb,
then add a device link between the smi-larb and smi-common.

This patch adds the device_link between the consumer and the larbs.

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c| 12 
 drivers/iommu/mtk_iommu_v1.c | 13 -
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index f7599d8..7b70574 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -440,6 +440,9 @@ static int mtk_iommu_add_device(struct device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
struct iommu_group *group;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
 
if (!fwspec || fwspec->ops != &mtk_iommu_ops)
return -ENODEV; /* Not a iommu client device */
@@ -451,6 +454,15 @@ static int mtk_iommu_add_device(struct device *dev)
if (IS_ERR(group))
return PTR_ERR(group);
 
+   /* Link the consumer device with the smi-larb device(supplier) */
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->smi_imu.larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME |
+  DL_FLAG_AUTOREMOVE_CONSUMER);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+
iommu_group_put(group);
return 0;
 }
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index c43c4a0..845e20b 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -423,7 +423,9 @@ static int mtk_iommu_add_device(struct device *dev)
struct of_phandle_iterator it;
struct mtk_iommu_data *data;
struct iommu_group *group;
-   int err;
+   struct device_link *link;
+   struct device *larbdev;
+   int err, larbid;
 
of_for_each_phandle(&it, err, dev->of_node, "iommus",
"#iommu-cells", 0) {
@@ -466,6 +468,15 @@ static int mtk_iommu_add_device(struct device *dev)
return err;
}
 
+   /* Link the consumer device with the smi-larb device(supplier) */
+   larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
+   larbdev = data->smi_imu.larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME |
+  DL_FLAG_AUTOREMOVE_CONSUMER);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+
return iommu_device_link(&data->iommu, dev);
 }
 
-- 
1.9.1



[PATCH v2 02/12] iommu/mediatek: Add probe_defer for smi-larb

2019-06-10 Thread Yong Wu
The iommu consumer should use device_link to connect with the
smi-larb (supplier), so the smi-larb must probe before the iommu
consumer. Here we defer the iommu driver probe until the smi driver is
ready, so that every iommu consumer probes after the smi driver.

Without this patch, if some consumer drivers probe before smi-larb, the
supplier link_status is DL_DEV_NO_DRIVER(0) in device_link_add, and
device_links_driver_bound will then WARN_ON because the supplier's
link_status is not right.

This is a preparatory patch for adding device_link.

Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c| 2 +-
 drivers/iommu/mtk_iommu_v1.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6fe3369..f7599d8 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -664,7 +664,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
id = i;
 
plarbdev = of_find_device_by_node(larbnode);
-   if (!plarbdev) {
+   if (!plarbdev || !plarbdev->dev.driver) {
of_node_put(larbnode);
return -EPROBE_DEFER;
}
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 0b0908c..c43c4a0 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -604,7 +604,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
plarbdev = of_platform_device_create(
larb_spec.np, NULL,
platform_bus_type.dev_root);
-   if (!plarbdev) {
+   if (!plarbdev || !plarbdev->dev.driver) {
of_node_put(larb_spec.np);
return -EPROBE_DEFER;
}
-- 
1.9.1



[PATCH v2 00/12] Clean up "mediatek,larb" after adding device_link

2019-06-10 Thread Yong Wu
The MediaTek IOMMU block diagram always looks like this:

        M4U
         |
    smi-common
         |
  -------------
  |         |  ...
  |         |
larb1     larb2
  |         |
vdec       venc

All the consumers connect to an smi-larb, which in turn connects to
smi-common.

The MediaTek IOMMU doesn't have its own power domain. When a consumer
works, it must enable the smi-larb's power, which in turn requires the
smi-common's power to be enabled first.

Thus, first use a device link to connect the consumer to its smi-larb,
then add a device link between the smi-larb and smi-common.

After adding the device_link, the "mediatek,larb" property can be removed.
The iommu consumer no longer needs to call mtk_smi_larb_get/put to enable
the power and clock of smi-larb and smi-common.
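
To illustrate the intended effect of the two links, here is a toy
user-space model, not kernel code. It assumes that a runtime-PM "get" on
a consumer first does a "get" on its supplier, which is the behaviour the
cover letter relies on. The names toy_dev, toy_pm_get and toy_pm_put are
made up for this sketch and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a device with one runtime-PM supplier link. */
struct toy_dev {
	const char *name;
	int usage_count;          /* runtime-PM style usage counter */
	struct toy_dev *supplier; /* linked supplier, or NULL */
};

/* Take a reference: the supplier is powered before the device itself. */
void toy_pm_get(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->supplier)
		toy_pm_get(dev->supplier);
	dev->usage_count++;
}

/* Drop a reference: the device is released before its supplier. */
void toy_pm_put(struct toy_dev *dev)
{
	assert(dev != NULL && dev->usage_count > 0);
	dev->usage_count--;
	if (dev->supplier)
		toy_pm_put(dev->supplier);
}
```

With vdec linked to larb1 and larb1 linked to smi-common, a single
toy_pm_get() on vdec raises all three counters, which is why the consumer
no longer has to call mtk_smi_larb_get/put itself.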

This patchset depends on "MT8183 IOMMU SUPPORT"[1].

[1] https://lists.linuxfoundation.org/pipermail/iommu/2019-June/036552.html

Change notes:
v2:
   1) rebase on v5.2-rc1.
   2) Move adding device_link between the consumer and smi-larb into
iommu_add_device from Robin.
   3) add DL_FLAG_AUTOREMOVE_CONSUMER even though the smi is built-in from Evan.
   4) Remove the shutdown callback in iommu.   

v1: https://lists.linuxfoundation.org/pipermail/iommu/2019-January/032387.html

Yong Wu (12):
  dt-binding: mediatek: Get rid of mediatek,larb for multimedia HW
  iommu/mediatek: Add probe_defer for smi-larb
  iommu/mediatek: Add device_link between the consumer and the larb
devices
  memory: mtk-smi: Add device-link between smi-larb and smi-common
  media: mtk-jpeg: Get rid of mtk_smi_larb_get/put
  media: mtk-mdp: Get rid of mtk_smi_larb_get/put
  media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
  drm/mediatek: Get rid of mtk_smi_larb_get/put
  memory: mtk-smi: Get rid of mtk_smi_larb_get/put
  iommu/mediatek: Use builtin_platform_driver
  arm: dts: mediatek: Get rid of mediatek,larb for MM nodes
  arm64: dts: mediatek: Get rid of mediatek,larb for MM nodes

 .../bindings/display/mediatek/mediatek,disp.txt|  9 -
 .../bindings/media/mediatek-jpeg-decoder.txt   |  4 --
 .../devicetree/bindings/media/mediatek-mdp.txt |  8 
 .../devicetree/bindings/media/mediatek-vcodec.txt  |  4 --
 arch/arm/boot/dts/mt2701.dtsi  |  1 -
 arch/arm/boot/dts/mt7623.dtsi  |  1 -
 arch/arm64/boot/dts/mediatek/mt8173.dtsi   | 15 ---
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c| 11 -
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c| 26 
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h|  1 -
 drivers/iommu/mtk_iommu.c  | 45 +++--
 drivers/iommu/mtk_iommu_v1.c   | 39 +++---
 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c| 22 --
 drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h|  2 -
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.c  | 38 -
 drivers/media/platform/mtk-mdp/mtk_mdp_comp.h  |  2 -
 drivers/media/platform/mtk-mdp/mtk_mdp_core.c  |  1 -
 .../media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c  | 21 --
 drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h |  3 --
 drivers/media/platform/mtk-vcodec/mtk_vcodec_enc.c |  1 -
 .../media/platform/mtk-vcodec/mtk_vcodec_enc_pm.c  | 47 --
 drivers/memory/mtk-smi.c   | 31 --
 include/soc/mediatek/smi.h | 20 -
 23 files changed, 36 insertions(+), 316 deletions(-)

-- 
1.9.1 



[PATCH v2 01/12] dt-binding: mediatek: Get rid of mediatek,larb for multimedia HW

2019-06-10 Thread Yong Wu
After adding the device_link between the consumer and the smi-larbs,
when the consumer calls its own pm_runtime_get(_sync), the
pm_runtime_get(_sync) of smi-larb and smi-common is called
automatically. Thus, the consumer doesn't need the property.

The IOMMU also knows which larb the consumer connects to from the
iommu id in the "iommus=" property.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
 .../devicetree/bindings/display/mediatek/mediatek,disp.txt   | 9 -
 .../devicetree/bindings/media/mediatek-jpeg-decoder.txt  | 4 
 Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 
 Documentation/devicetree/bindings/media/mediatek-vcodec.txt  | 4 
 4 files changed, 25 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
index 8469de5..464b92f 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
@@ -56,8 +56,6 @@ Required properties (DMA function blocks):
"mediatek,<chip>-disp-rdma"
"mediatek,<chip>-disp-wdma"
   the supported chips are mt2701 and mt8173.
-- larb: Should contain a phandle pointing to the local arbiter device as defined
-  in Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
 - iommus: Should point to the respective IOMMU block with master port as
   argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
   for details.
@@ -78,7 +76,6 @@ ovl0: ovl@1400c000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_OVL0>;
iommus = < M4U_PORT_DISP_OVL0>;
-   mediatek,larb = <>;
 };
 
 ovl1: ovl@1400d000 {
@@ -88,7 +85,6 @@ ovl1: ovl@1400d000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_OVL1>;
iommus = < M4U_PORT_DISP_OVL1>;
-   mediatek,larb = <>;
 };
 
 rdma0: rdma@1400e000 {
@@ -98,7 +94,6 @@ rdma0: rdma@1400e000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA0>;
iommus = < M4U_PORT_DISP_RDMA0>;
-   mediatek,larb = <>;
 };
 
 rdma1: rdma@1400f000 {
@@ -108,7 +103,6 @@ rdma1: rdma@1400f000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA1>;
iommus = < M4U_PORT_DISP_RDMA1>;
-   mediatek,larb = <>;
 };
 
 rdma2: rdma@1401 {
@@ -118,7 +112,6 @@ rdma2: rdma@1401 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA2>;
iommus = < M4U_PORT_DISP_RDMA2>;
-   mediatek,larb = <>;
 };
 
 wdma0: wdma@14011000 {
@@ -128,7 +121,6 @@ wdma0: wdma@14011000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_WDMA0>;
iommus = < M4U_PORT_DISP_WDMA0>;
-   mediatek,larb = <>;
 };
 
 wdma1: wdma@14012000 {
@@ -138,7 +130,6 @@ wdma1: wdma@14012000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_WDMA1>;
iommus = < M4U_PORT_DISP_WDMA1>;
-   mediatek,larb = <>;
 };
 
 color0: color@14013000 {
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.txt b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.txt
index 044b119..7978f21 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.txt
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.txt
@@ -15,9 +15,6 @@ Required properties:
 - clock-names: must contain "jpgdec-smi" and "jpgdec".
 - power-domains: a phandle to the power domain, see
   Documentation/devicetree/bindings/power/power_domain.txt for details.
-- mediatek,larb: must contain the local arbiters in the current Socs, see
-  Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.txt
-  for details.
 - iommus: should point to the respective IOMMU block with master port as
   argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
   for details.
@@ -32,7 +29,6 @@ Example:
clock-names = "jpgdec-smi",
  "jpgdec";
power-domains = < MT2701_POWER_DOMAIN_ISP>;
-   mediatek,larb = <>;
iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>,
 < MT2701_M4U_PORT_JPGDEC_BSDMA>;
};
diff --git a/Documentation/devicetree/bindings/media/mediatek-mdp.txt b/Documentation/devicetree/bindings/media/mediatek-mdp.txt
index 0d03e3a..df69c5a 100644
--- a/Documentation/devicetree/bindings/media/mediatek-mdp.txt
+++ b/Documentation/devicetree/bindings/media/mediatek-mdp.txt
@@ -27,9 +27,6 @@ Required properties (DMA function blocks, child node):
 - iommus: should point to the respective IOMMU block with master port as
   argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
   for details.
-- mediatek,larb: must contain the local 

Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler

2019-06-10 Thread Jean-Philippe Brucker
On 07/06/2019 18:43, Jacob Pan wrote:
>>> So it seems we agree on the following:
>>> - iommu_unregister_device_fault_handler() will never fail
>>> - iommu driver cleans up all pending faults when handler is
>>> unregistered
>>> - assume device driver or guest not sending more page response
>>> _after_ handler is unregistered.
>>> - system will tolerate rare spurious response
>>>
>>> Sounds right?  
>>
>> Yes, I'll add that to the fault series
> Hold on a second please, I think we need more clarifications. Ashok
> pointed out to me that the spurious response can be harmful to other
> devices when it comes to mdev, where PRQ group id is not per PASID,
> device may reuse the group number and receiving spurious page response
> can confuse the entire PF. 

I don't understand how mdev differs from the non-mdev situation (but I
also still don't fully get how mdev+PASID will be implemented). Is the
following the case you're worried about?

  M#: mdev #

# Dev  Host  mdev drv  VFIO/QEMU  Guest

1 <- reg(handler)
2 PR1 G1 P1 -> M1 PR1 G1  inject -> M1 PR1 G1
3 <- unreg(handler)
4 <- PS1 G1 P1 (F)  |
5 unreg(handler)
6 <- reg(handler)
7 PR2 G1 P1 -> M2 PR2 G1  inject -> M2 PR2 G1
8 <- M1 PS1 G1
9 accept ??  <- PS1 G1 P1
10 <- M2 PS2 G1
11 accept  <- PS2 G1 P1


Step 2 injects PR1 for mdev#1. Step 4 auto-responds to PR1. Between
steps 5 and 6, we re-allocate PASID #1 for mdev #2. At step 7, we inject
PR2 for mdev #2. Step 8 is the spurious Page Response for PR1.

But I don't think step 9 is possible, because the mdev driver knows that
mdev #1 isn't using PASID #1 anymore. If the configuration is valid at
all (a page response channel still exists for mdev #1), then mdev #1 now
has a different PASID, e.g. #2, and step 9 would be "<- PS1 G1 P2" which
is rejected by iommu.c (no such pending page request). And step 11 will
be accepted.

If PASIDs are allocated through VCMD, then the situation seems similar:
at step 2 you inject "M1 PR1 G1 P1" into the guest, and at step 8 the
spurious response is "M1 PS1 G1 P1". If mdev #1 doesn't have PASID #1
anymore, then the mdev driver can check that the PASID is invalid and
can reject the page response.
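
As a rough illustration of the "no such pending page request" check
mentioned above, here is a small user-space sketch. It assumes pending
requests are tracked by (PASID, group) and that unregistering the handler
completes everything outstanding. The fault_queue type and the
fault_queue_* helpers are hypothetical and are not the iommu.c
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* One outstanding page request, identified by PASID and group id. */
struct pending_fault {
	bool in_use;
	uint32_t pasid;
	uint32_t grpid;
};

struct fault_queue {
	struct pending_fault slots[MAX_PENDING];
};

/* Record a page request that still needs a response. */
bool fault_queue_add(struct fault_queue *q, uint32_t pasid, uint32_t grpid)
{
	assert(q != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (!q->slots[i].in_use) {
			q->slots[i] = (struct pending_fault){ true, pasid, grpid };
			return true;
		}
	}
	return false; /* queue full */
}

/* Accept a response only if a matching request is pending. */
bool fault_queue_respond(struct fault_queue *q, uint32_t pasid, uint32_t grpid)
{
	assert(q != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		struct pending_fault *f = &q->slots[i];

		if (f->in_use && f->pasid == pasid && f->grpid == grpid) {
			f->in_use = false;
			return true;
		}
	}
	return false; /* spurious response, rejected */
}

/* Models handler unregistration: everything pending is completed. */
void fault_queue_flush(struct fault_queue *q)
{
	assert(q != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++)
		q->slots[i].in_use = false;
}
```

In the scenario above, the late response for mdev #1 would carry a PASID
that has no pending entry, so fault_queue_respond() returns false, while
the response for PR2 matches and is accepted.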

> Having spurious page response is also not
> abiding the PCIe spec. exactly.

We are following the PCI spec though, in that we don't send page
responses for PRGIs that aren't in flight.

> We have two options here:
> 1. unregister handler will get -EBUSY if outstanding fault exists.
>   -PROs: block offending device unbind only, eventually timeout
>   will clear.
>   -CONs: flooded faults can prevent clearing
> 2. unregister handle will block until all faults are clear in the host.
>Never fails unregistration

Here, does the host complete the faults itself or wait for a response from
the guest? I'm slightly confused by the word "blocking". I'd rather we
don't introduce an uninterruptible sleep in the IOMMU core, since it's
unlikely to ever finish if we rely on the guest to complete things.

>   -PROs: simple flow for VFIO, no need to worry about device
>   holding reference.
>   -CONs: spurious page response may come from
>   misbehaving/malicious guest if guest does unregister and
>   register back to back.

> It seems the only way to prevent spurious page response is to introduce
> a SW token or sequence# for each PRQ that needs a response. I still
> think option 2 is good.
> 
> Consider the following time line:
> decoding
>  PR#: page request
>  G#:  group #
>  P#:  PASID
>  S#:  sequence #
>  A#:  address
>  PS#: page response
>  (F): Fail
>  (S): Success
> 
> # Dev  Host  VFIO/QEMU  Guest
> ===   
> 1 <-reg(handler)
> 2 PR1G1S1A1   ->  inject  ->  PR1G1S1A1
> 3 PR2G1S2A2   ->  inject  ->  PR2G1S2A2
> 4.<-unreg(handler)
> 5.<-PR1G1S1A1(F)  | 
> 6.<-PR2G1S2A2(F)  V
> 7.<-unreg(handler)
> 8.<-reg(handler)
> 9 PR3G1S3A1   ->  inject  ->  PR3G1S3A1
> 10.   <-PS1G1S1A1
> 11.   
> 11.<-PS3G1S3A1
> 12.PS3G1S3A1(S)
> 
> The spurious page response comes in at step 10 where the guest sends
> response for the request in step 1. But since the sequence # is 1, host
> IOMMU driver will reject it. At step 11, we accept page response for
> the matching sequence # then respond SUCCESS to the device.
> 
> So would it be OK to add this sequence# to iommu_fault and page

Re: How to resolve an issue in swiotlb environment?

2019-06-10 Thread Christoph Hellwig
Hi Yoshihiro,

sorry for not taking care of this earlier, today is a public holiday
here and thus I'm not working much over the long weekend.

On Mon, Jun 10, 2019 at 11:13:07AM +0000, Yoshihiro Shimoda wrote:
> I have another way to avoid the issue. But it doesn't seem that a good way 
> though...
> According to the commit that adding blk_queue_virt_boundary() [3],
> this is needed for vhci_hcd as a workaround so that if we avoid to call it
> on xhci-hcd driver, the issue disappeared. What do you think?
> JFYI, I pasted a tentative patch in the end of email [4].

Oh, I hadn't even looked at why USB uses blk_queue_virt_boundary, and it
seems like the usage is wrong, as it doesn't follow the same rules as
all the others.  I think your patch goes in the right direction,
but instead of comparing a hcd name it needs to be keyed off a flag
set by the driver (I suspect there is one indicating native SG support,
but I can't quickly find it), and we need an alternative solution
for drivers that don't see like vhci.  I suspect just limiting the
entire transfer size to something that works for a single packet
for them would be fine.


[PATCH v7 06/21] iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode

2019-06-10 Thread Yong Wu
MediaTek extends the ARM v7s descriptor to support DRAM over 4GB.

In mt2712 and mt8173, this is called "4GB mode": the physical address
ranges from 0x4000_0000 to 0x1_3fff_ffff, but from the EMI point of view
it is remapped to the high addresses 0x1_0000_0000 to 0x1_ffff_ffff, so
bit32 is always set. Thus, in the M4U, we always enable bit9 in all
PTEs, which means enabling bit32 of the physical address.

In mt8183, however, the M4U supports DRAM from 0x4000_0000 to
0x3_ffff_ffff, which isn't remapped. We extend the PTEs: bit9 represents
bit32 of the PA and bit4 represents bit33 of the PA. Meanwhile the iova
is still 32 bits.

In order to unify the code, in "4GB mode" we add bit32 to the
physical address manually in our driver.

Correspondingly, adding bit32 and bit33 to the PA in iova_to_phys
has to be moved into v7s.

Regarding whether the pagetable address can be over 4GB: mt8183
supports it while the previous mt8173 doesn't, thus keep it as is.
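
For reference, the bit9/bit4 encoding described above can be sketched in
plain user-space C. This is only an illustration and assumes a 4KB
small-page address mask. The sketch_* names and SMALL_PAGE_MASK are made
up here, and the quirk check done by the real driver is left out.

```c
#include <assert.h>
#include <stdint.h>

#define MTK_PA_BIT32    (1u << 9)   /* PTE bit 9 carries PA bit 32 */
#define MTK_PA_BIT33    (1u << 4)   /* PTE bit 4 carries PA bit 33 */
#define SMALL_PAGE_MASK 0xfffff000u /* assumed 4KB page address mask */

/* Fold a PA of up to 34 bits into a 32-bit PTE. */
uint32_t sketch_paddr_to_pte(uint64_t paddr)
{
	uint32_t pte = (uint32_t)paddr & SMALL_PAGE_MASK;

	assert((paddr >> 34) == 0); /* only PA bits [33:0] can be encoded */
	if (paddr & (1ull << 32))
		pte |= MTK_PA_BIT32;
	if (paddr & (1ull << 33))
		pte |= MTK_PA_BIT33;
	return pte;
}

/* Recover the page-aligned PA from the PTE, including bits 32 and 33. */
uint64_t sketch_pte_to_paddr(uint32_t pte)
{
	uint64_t paddr = pte & SMALL_PAGE_MASK;

	if (pte & MTK_PA_BIT32)
		paddr |= 1ull << 32;
	if (pte & MTK_PA_BIT33)
		paddr |= 1ull << 33;
	return paddr;
}
```

For example, under these assumptions the PA 0x1_2345_6000 is stored as
0x23456200 (address bits plus bit9), and decoding that PTE gives the
original PA back.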

Signed-off-by: Yong Wu 
Reviewed-by: Robin Murphy 
Reviewed-by: Evan Green 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 31 ---
 drivers/iommu/mtk_iommu.c  | 20 ++--
 drivers/iommu/mtk_iommu.h  |  1 +
 3 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 94c38db..4077822 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -123,7 +123,9 @@
 #define ARM_V7S_TEX_MASK   0x7
 #define ARM_V7S_ATTR_TEX(val)  (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
 
-#define ARM_V7S_ATTR_MTK_4GB   BIT(9) /* MTK extend it for 4GB mode */
+/* MediaTek extend the two bits below for over 4GB mode */
+#define ARM_V7S_ATTR_MTK_PA_BIT32  BIT(9)
+#define ARM_V7S_ATTR_MTK_PA_BIT33  BIT(4)
 
 /* *well, except for TEX on level 2 large pages, of course :( */
 #define ARM_V7S_CONT_PAGE_TEX_SHIFT6
@@ -190,13 +192,22 @@ static dma_addr_t __arm_v7s_dma_addr(void *pages)
 static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
struct io_pgtable_cfg *cfg)
 {
-   return paddr & ARM_V7S_LVL_MASK(lvl);
+   arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
+
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
+   if (paddr & BIT_ULL(32))
+   pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
+   if (paddr & BIT_ULL(33))
+   pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
+   }
+   return pte;
 }
 
 static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
  struct io_pgtable_cfg *cfg)
 {
arm_v7s_iopte mask;
+   phys_addr_t paddr;
 
if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
mask = ARM_V7S_TABLE_MASK;
@@ -205,7 +216,14 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
else
mask = ARM_V7S_LVL_MASK(lvl);
 
-   return pte & mask;
+   paddr = pte & mask;
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB) {
+   if (pte & ARM_V7S_ATTR_MTK_PA_BIT32)
+   paddr |= BIT_ULL(32);
+   if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
+   paddr |= BIT_ULL(33);
+   }
+   return paddr;
 }
 
 static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
@@ -326,9 +344,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
if (lvl == 1 && (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS))
pte |= ARM_V7S_ATTR_NS_SECTION;
 
-   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)
-   pte |= ARM_V7S_ATTR_MTK_4GB;
-
return pte;
 }
 
@@ -515,7 +530,9 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
return 0;
 
-   if (WARN_ON(upper_32_bits(iova) || upper_32_bits(paddr)))
+   if (WARN_ON(upper_32_bits(iova)) ||
+   WARN_ON(upper_32_bits(paddr) &&
+   !(iop->cfg.quirks & IO_PGTABLE_QUIRK_ARM_MTK_4GB)))
return -ERANGE;
 
ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd);
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 1ddb2b7..aff5004 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -271,7 +271,8 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom)
dom->cfg = (struct io_pgtable_cfg) {
.quirks = IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_NO_PERMS |
-   IO_PGTABLE_QUIRK_TLBI_ON_MAP,
+   IO_PGTABLE_QUIRK_TLBI_ON_MAP |
+   IO_PGTABLE_QUIRK_ARM_MTK_4GB,
.pgsize_bitmap = mtk_iommu_ops.pgsize_bitmap,
.ias = 32,
.oas = 32,
@@ -279,9 +280,6 @@ static int mtk_iommu_domain_finalise(struct 
mtk_iommu_domain 

[PATCH v7 07/21] iommu/mediatek: Add bclk can be supported optionally

2019-06-10 Thread Yong Wu
In some SoCs, the M4U doesn't have its own "bclk"; it uses the EMI
clock instead, which is always enabled by the time the kernel starts.

Currently mt2712 and mt8173 have this bclk while mt8183 doesn't.

This is also a preparatory patch for mt8183.

Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 10 +++---
 drivers/iommu/mtk_iommu.h |  3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index aff5004..264dda4 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -611,9 +611,11 @@ static int mtk_iommu_probe(struct platform_device *pdev)
if (data->irq < 0)
return data->irq;
 
-   data->bclk = devm_clk_get(dev, "bclk");
-   if (IS_ERR(data->bclk))
-   return PTR_ERR(data->bclk);
+   if (data->plat_data->has_bclk) {
+   data->bclk = devm_clk_get(dev, "bclk");
+   if (IS_ERR(data->bclk))
+   return PTR_ERR(data->bclk);
+   }
 
larb_nr = of_count_phandle_with_args(dev->of_node,
 "mediatek,larbs", NULL);
@@ -741,11 +743,13 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 static const struct mtk_iommu_plat_data mt2712_data = {
.m4u_plat = M4U_MT2712,
.has_4gb_mode = true,
+   .has_bclk = true,
 };
 
 static const struct mtk_iommu_plat_data mt8173_data = {
.m4u_plat = M4U_MT8173,
.has_4gb_mode = true,
+   .has_bclk = true,
 };
 
 static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index d7a001a..63e235e 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -43,6 +43,9 @@ enum mtk_iommu_plat {
 struct mtk_iommu_plat_data {
enum mtk_iommu_plat m4u_plat;
boolhas_4gb_mode;
+
+   /* HW will use the EMI clock if there isn't the "bclk". */
+   boolhas_bclk;
 };
 
 struct mtk_iommu_domain;
-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 03/21] memory: mtk-smi: Use a general config_port interface

2019-06-10 Thread Yong Wu
The config_port of mt2712 and mt8183 are the same. Use a general
config_port interface instead.

In addition, in mt2712, larb8 and larb9 are the bdpsys larbs, which
are not normal larbs: their register space differs from the normal
one, so we cannot call the general config_port for them. In mt8183,
IPU0/1 and CCU connect to smi-common directly; they are not normal
larbs either. Hence, we add a "larb_direct_to_common_mask" for these
larbs which connect to smi-common directly.

This is also a preparatory patch for adding mt8183 SMI support.
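
The mask test itself is simple. A stand-alone sketch of it, assuming
larb ids below 32, could look like the following. The struct
larb_gen_sketch and larb_skips_config_port() are illustrative names only,
not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Illustrative subset of the per-SoC larb description. */
struct larb_gen_sketch {
	uint32_t larb_direct_to_common_mask; /* one bit per larb id */
};

/* True if this larb is wired directly to smi-common and must be skipped. */
bool larb_skips_config_port(const struct larb_gen_sketch *gen,
			    unsigned int larbid)
{
	assert(gen != NULL);
	if (larbid >= 32) /* outside the 32-bit mask, treat as a normal larb */
		return false;
	return (BIT(larbid) & gen->larb_direct_to_common_mask) != 0;
}
```

With the mt2712 value BIT(8) | BIT(9) from this patch, larb8 and larb9
are skipped and every other larb goes through the general config_port
path.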

Signed-off-by: Yong Wu 
Reviewed-by: Matthias Brugger 
Reviewed-by: Evan Green 
---
 drivers/memory/mtk-smi.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 8f2d152..9fd6b3d 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -53,6 +53,7 @@ struct mtk_smi_larb_gen {
bool need_larbid;
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *);
+   unsigned int larb_direct_to_common_mask;
 };
 
 struct mtk_smi {
@@ -176,17 +177,13 @@ void mtk_smi_larb_put(struct device *larbdev)
return -ENODEV;
 }
 
-static void mtk_smi_larb_config_port_mt2712(struct device *dev)
+static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
 {
struct mtk_smi_larb *larb = dev_get_drvdata(dev);
u32 reg;
int i;
 
-   /*
-* larb 8/9 is the bdpsys larb, the iommu_en is enabled defaultly.
-* Don't need to set it again.
-*/
-   if (larb->larbid == 8 || larb->larbid == 9)
+   if (BIT(larb->larbid) & larb->larb_gen->larb_direct_to_common_mask)
return;
 
for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
@@ -261,7 +258,8 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
 
 static const struct mtk_smi_larb_gen mtk_smi_larb_mt2712 = {
.need_larbid = true,
-   .config_port = mtk_smi_larb_config_port_mt2712,
+   .config_port= mtk_smi_larb_config_port_gen2_general,
+   .larb_direct_to_common_mask = BIT(8) | BIT(9),  /* bdpsys */
 };
 
 static const struct of_device_id mtk_smi_larb_of_ids[] = {
-- 
1.9.1

https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 00/21] MT8183 IOMMU SUPPORT

2019-06-10 Thread Yong Wu
This patchset mainly adds support for mt8183 IOMMU and SMI.

mt8183 has only one M4U like mt8173 and is also MTK IOMMU gen2 which
uses ARM Short-Descriptor translation table format.

The mt8183 M4U-SMI HW diagram is as below:

                          EMI
                           |
                          M4U
                           |
                       ----------
                       |        |
                   gals0-rx   gals1-rx
                       |        |
                       |        |
                   gals0-tx   gals1-tx
                       |        |
                      ------------
                       SMI Common
                      ------------
                           |
  +-----+-----+--------+-----+-----+-------+-------+
  |     |     |        |     |     |       |       |
  |     |  gals-rx  gals-rx  |   gals-rx gals-rx gals-rx
  |     |     |        |     |     |       |       |
  |     |     |        |     |     |       |       |
  |     |  gals-tx  gals-tx  |   gals-tx gals-tx gals-tx
  |     |     |        |     |     |       |       |
larb0 larb1  IPU0    IPU1  larb4  larb5  larb6    CCU
disp  vdec   img     cam   venc   img    cam

All the connections are fixed in HW; SW can NOT adjust them.

Compared with mt8173, we add a GALS (Global Async Local Sync) module
between SMI-common and M4U, and additional GALS between larb2/3/5/6
and SMI-common. GALS helps synchronize modules running at different
clock frequencies; it can be seen as an "asynchronous fifo".

GALS only helps transfer the command/data and has no configuration
registers, thus it has the special "smi" clock and no "apb" clock.
From the diagram above, we add "gals0" and "gals1" clocks for smi-common
and a "gals" clock for smi-larb.

From the diagram above, IPU0/IPU1 (Image Processor Unit) and CCU (Camera
Control Unit) are connected to smi-common directly. We can treat them
as "larb2", "larb3" and "larb7"; their register spaces are
different from the normal larb.

This is the general purpose of each patch in this patchset:
the patch 1..13 add the iommu/smi support for mt8183;
the patch 14..16 add mmu1 support;
the last patches contain some minor changes:
   -patch 17 cleans up some smi code (delete need_larbid).
   -patch 18 fixes an issue (fix vld_pa_rng).
   -patch 19/20 improve the 4GB mode.
   -patch 21 switch to SPDX license.
The dtsi was sent at [1].

[1] https://lore.kernel.org/patchwork/patch/1054099/

Change notes:
v7:
   1) rebase on v5.2-rc1.
   2) Add fixed tags in patch 20.
   3) Remove shutdown patch. I will send it independently if necessary.

v6: https://lists.linuxfoundation.org/pipermail/iommu/2019-February/033685.html
1) rebase on v5.0-rc1.
2) About the register name (VLD_PA_RNG), Keep consistent in the patches.
3) In the 4GB mode, Always add MTK_4GB_quirk.
4) Reword some commit message helped from Evan. like common->smi_ao_base is
   completely different from common->base; STANDARD_AXI_MODE reg is completely
   different from CTRL_MISC; commit in the shutdown patch.
5) Add 2 new patches again:
   iommu/mediatek: Rename enable_4GB to dram_is_4gb
   iommu/mediatek: Fix iova_to_phys PA start for 4GB mode

v5: https://lists.linuxfoundation.org/pipermail/iommu/2019-January/032387.html
1) Remove this patch "iommu/mediatek: Constify iommu_ops" from here as it
   was applied for v5.0.
2) Again, add 3 preparing patches. Move two property into the plat_data.
   iommu/mediatek: Move vld_pa_rng into plat_data
   iommu/mediatek: Move reset_axi into plat_data
   iommu/mediatek: Refine protect memory definition
3) Add shutdown callback for mtk_iommu_v1 in patch[19/20].

v4: http://lists.infradead.org/pipermail/linux-mediatek/2018-December/016205.html
1) Add 3 preparing patches. Separate some minor meaningful code into
   a new patch according to Matthias's suggestion.
   memory: mtk-smi: Add gals support 
   iommu/mediatek: Add larb-id remapped support 
   iommu/mediatek: Add bclk can be supported optionally   
2) rebase on "iommu/mediatek: Make it explicitly non-modular"
   which was applied.
   https://lore.kernel.org/patchwork/patch/1020125/
3) add some comment about "mediatek,larb-id" in the commit message of
   the patch "mtk-smi: Get rid of need_larbid".
4) Fix bus_sel value.

v3: https://lists.linuxfoundation.org/pipermail/iommu/2018-November/031121.html
1) rebase on v4.20-rc1.
2) In the dt-binding, add a minor string "mt7623", which also uses gen1,
   since Matthias added it in v4.20.
3) About v7s:
   a) for paddr_to_pte, change the param from "arm_v7s_io_pgtable" to
  "arm_pgtable_cfg", according to Robin's suggestion.
   b) Don't use CONFIG_PHYS_ADDR_T_64BIT.
   c) add a short comment (the pgtable address still doesn't go over 4GB) in the
  commit message of the patch "Extend MediaTek 4GB Mode".
4) 

[PATCH v7 18/21] iommu/mediatek: Fix VLD_PA_RNG register backup when suspend

2019-06-10 Thread Yong Wu
The register VLD_PA_RNG(0x118) was not backed up when 4GB mode
support was added for mt2712. This patch adds the missing backup
in suspend and the restore in resume.

Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range for 4GB mode")
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 2 ++
 drivers/iommu/mtk_iommu.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 6053b8b..86158d8 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -719,6 +719,7 @@ static int __maybe_unused mtk_iommu_suspend(struct device *dev)
reg->int_control0 = readl_relaxed(base + REG_MMU_INT_CONTROL0);
reg->int_main_control = readl_relaxed(base + REG_MMU_INT_MAIN_CONTROL);
reg->ivrp_paddr = readl_relaxed(base + REG_MMU_IVRP_PADDR);
+   reg->vld_pa_rng = readl_relaxed(base + REG_MMU_VLD_PA_RNG);
clk_disable_unprepare(data->bclk);
return 0;
 }
@@ -743,6 +744,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
+   writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
if (m4u_dom)
writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
   base + REG_MMU_PT_BASE_ADDR);
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index c0b5c65..753266b 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -32,6 +32,7 @@ struct mtk_iommu_suspend_reg {
u32 int_control0;
u32 int_main_control;
u32 ivrp_paddr;
+   u32 vld_pa_rng;
 };
 
 enum mtk_iommu_plat {
-- 
1.9.1



[PATCH v7 19/21] iommu/mediatek: Rename enable_4GB to dram_is_4gb

2019-06-10 Thread Yong Wu
This patch only renames the variable enable_4GB to
dram_is_4gb for readability.

Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 10 +-
 drivers/iommu/mtk_iommu.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 86158d8..67cab2d 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -382,7 +382,7 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
int ret;
 
/* The "4GB mode" M4U physically can not use the lower remap of Dram. */
-   if (data->plat_data->has_4gb_mode && data->enable_4GB)
+   if (data->plat_data->has_4gb_mode && data->dram_is_4gb)
paddr |= BIT_ULL(32);
 
spin_lock_irqsave(&dom->pgtlock, flags);
@@ -554,13 +554,13 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
 
if (data->plat_data->m4u_plat == M4U_MT8173)
-   regval = (data->protect_base >> 1) | (data->enable_4GB << 31);
+   regval = (data->protect_base >> 1) | (data->dram_is_4gb << 31);
else
regval = lower_32_bits(data->protect_base) |
 upper_32_bits(data->protect_base);
writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
 
-   if (data->enable_4GB && data->plat_data->has_vld_pa_rng) {
+   if (data->dram_is_4gb && data->plat_data->has_vld_pa_rng) {
/*
 * If 4GB mode is enabled, the validate PA range is from
 * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
@@ -611,8 +611,8 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return -ENOMEM;
data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);
 
-   /* Whether the current dram is over 4GB */
-   data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
+   /* Whether the current dram is 4GB. */
+   data->dram_is_4gb = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
data->base = devm_ioremap_resource(dev, res);
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 753266b..e8114b2 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -65,7 +65,7 @@ struct mtk_iommu_data {
struct mtk_iommu_domain *m4u_dom;
struct iommu_group  *m4u_group;
struct mtk_smi_iommusmi_imu;  /* SMI larb iommu info */
-   boolenable_4GB;
+   booldram_is_4gb;
booltlb_flush_active;
 
struct iommu_device iommu;
-- 
1.9.1



[PATCH v7 20/21] iommu/mediatek: Fix iova_to_phys PA start for 4GB mode

2019-06-10 Thread Yong Wu
In the 4GB mode, the physical address is remapped.

Here is the detailed remap relationship:
CPU PA         ->    HW PA
0x4000_0000          0x1_4000_0000 (Add bit32)
0x8000_0000          0x1_8000_0000 ...
0xc000_0000          0x1_c000_0000 ...
0x1_0000_0000        0x1_0000_0000 (No change)

Thus, we always add bit32 to the PA when entering mtk_iommu_map.
But in iova_to_phys, the CPU doesn't need this bit32 if the
PA is from 0x1_4000_0000 to 0x1_ffff_ffff.
This patch discards bit32 in iova_to_phys in the 4GB mode.
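
The remap rule above can be sketched in plain C outside the driver. This is
only an illustration under assumptions: the helper names cpu_pa_to_hw_pa()
and hw_pa_to_cpu_pa() and the two macros are hypothetical and do not exist in
mtk_iommu.c; only the bit32 logic and the 0x1_4000_0000 threshold come from
the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the values follow the commit message. */
#define PA_BIT32        (UINT64_C(1) << 32)
#define PA_REMAP_START  UINT64_C(0x140000000)   /* 0x1_4000_0000 */

/* Map path: in 4GB mode bit32 is always set before the PA reaches the HW. */
static uint64_t cpu_pa_to_hw_pa(uint64_t cpu_pa, bool mode_4gb)
{
	return mode_4gb ? (cpu_pa | PA_BIT32) : cpu_pa;
}

/*
 * iova_to_phys path: drop bit32 again, but only for HW PAs at or above
 * 0x1_4000_0000. HW PAs below that keep bit32, which is the "No change"
 * row of the table.
 */
static uint64_t hw_pa_to_cpu_pa(uint64_t hw_pa, bool mode_4gb)
{
	if (mode_4gb && hw_pa >= PA_REMAP_START)
		hw_pa &= ~PA_BIT32;
	return hw_pa;
}
```

With these helpers, a CPU PA of 0x4000_0000 round-trips through
0x1_4000_0000 and back, while 0x1_0000_0000 is left untouched in both
directions.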

Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range for 4GB mode")
Signed-off-by: Yong Wu 
---
 drivers/iommu/mtk_iommu.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 67cab2d..34f2e40 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -119,6 +119,19 @@ struct mtk_iommu_domain {
 
 static const struct iommu_ops mtk_iommu_ops;
 
+/*
+ * In M4U 4GB mode, the physical address is remapped as below:
+ *  CPU PA         ->   M4U HW PA
+ *  0x4000_0000         0x1_4000_0000 (Add bit32)
+ *  0x8000_0000         0x1_8000_0000 ...
+ *  0xc000_0000         0x1_c000_0000 ...
+ *  0x1_0000_0000       0x1_0000_0000 (No change)
+ *
+ * Thus, We always add BIT32 in the iommu_map and disable BIT32 if PA is >=
+ * 0x1_4000_0000 in the iova_to_phys.
+ */
+#define MTK_IOMMU_4GB_MODE_PA_14000 0x140000000UL
+
 static LIST_HEAD(m4ulist); /* List all the M4U HWs */
 
 #define for_each_m4u(data) list_for_each_entry(data, &m4ulist, list)
@@ -415,6 +428,7 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
  dma_addr_t iova)
 {
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
+   struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
unsigned long flags;
phys_addr_t pa;
 
@@ -422,6 +436,10 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
pa = dom->iop->iova_to_phys(dom->iop, iova);
spin_unlock_irqrestore(&dom->pgtlock, flags);
 
+   if (data->plat_data->has_4gb_mode && data->dram_is_4gb &&
+   pa >= MTK_IOMMU_4GB_MODE_PA_14000)
+   pa &= ~BIT_ULL(32);
+
return pa;
 }
 
-- 
1.9.1



[PATCH v7 21/21] iommu/mediatek: Switch to SPDX license identifier

2019-06-10 Thread Yong Wu
Switch to SPDX license identifier for MediaTek iommu/smi and their
header files.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 10 +-
 drivers/iommu/mtk_iommu.h | 10 +-
 drivers/iommu/mtk_iommu_v1.c  | 10 +-
 drivers/memory/mtk-smi.c  | 10 +-
 include/dt-bindings/memory/mt2701-larb-port.h | 10 +-
 include/dt-bindings/memory/mt8173-larb-port.h | 10 +-
 include/soc/mediatek/smi.h| 10 +-
 7 files changed, 7 insertions(+), 63 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 34f2e40..6fe3369 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -1,15 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Copyright (c) 2015-2016 MediaTek Inc.
  * Author: Yong Wu 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
  */
 #include 
 #include 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index e8114b2..b24cfd3 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -1,15 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (c) 2015-2016 MediaTek Inc.
  * Author: Honghui Zhang 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
  */
 
 #ifndef _MTK_IOMMU_H_
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 73308ad..0b0908c 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * IOMMU API for MTK architected m4u v1 implementations
  *
@@ -5,15 +6,6 @@
  * Author: Honghui Zhang 
  *
  * Based on driver/iommu/mtk_iommu.c
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
  */
 #include 
 #include 
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 10e6493..9688341 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -1,15 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Copyright (c) 2015-2016 MediaTek Inc.
  * Author: Yong Wu 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
  */
 #include 
 #include 
diff --git a/include/dt-bindings/memory/mt2701-larb-port.h b/include/dt-bindings/memory/mt2701-larb-port.h
index 6764d74..c511f0f 100644
--- a/include/dt-bindings/memory/mt2701-larb-port.h
+++ b/include/dt-bindings/memory/mt2701-larb-port.h
@@ -1,15 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (c) 2015 MediaTek Inc.
  * Author: Honghui Zhang 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
  */
 
 #ifndef _MT2701_LARB_PORT_H_
diff --git a/include/dt-bindings/memory/mt8173-larb-port.h b/include/dt-bindings/memory/mt8173-larb-port.h
index 111b4b0..a62bfeb 100644
--- a/include/dt-bindings/memory/mt8173-larb-port.h
+++ b/include/dt-bindings/memory/mt8173-larb-port.h
@@ -1,15 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (c) 2015-2016 MediaTek Inc.
  * Author: Yong 

[PATCH v7 16/21] memory: mtk-smi: Add bus_sel for mt8183

2019-06-10 Thread Yong Wu
There are 2 mmu cells in a M4U HW. We can route some larbs to mmu0
and others to mmu1 to balance the bandwidth via the smi-common register
SMI_BUS_SEL(0x220) (each larb occupies 2 bits).

In mt8183, for better performance, we switch larb1/2/5/7 to
mmu1 while the others still enter mmu0.

In mt8173 and mt2712, we don't see the performance issue, so
keep the default value (0x0), which means all the larbs enter mmu0.

Note: smi gen1 (mt2701/mt7623) doesn't have this bus_sel.

Also, the base of smi-common is completely different from the smi_ao_base
of gen1, thus I add a new variable for it.
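
As a rough illustration of the 2-bits-per-larb layout, the register value can
be computed in standalone C. The two macros mirror the ones added by this
patch; the helpers smi_bus_sel_for_mmu1() and larb_uses_mmu1() are
hypothetical names used only for this sketch.

```c
#include <stdint.h>

/* Same encoding as in the patch: each larb owns 2 bits of SMI_BUS_SEL. */
#define SMI_BUS_LARB_SHIFT(larbid)  ((larbid) << 1)
#define F_MMU1_LARB(larbid)         (UINT32_C(1) << SMI_BUS_LARB_SHIFT(larbid))

/* Hypothetical helper: build SMI_BUS_SEL from the larbs routed to mmu1. */
static uint32_t smi_bus_sel_for_mmu1(const unsigned int *larbs, unsigned int n)
{
	uint32_t val = 0;
	unsigned int i;

	for (i = 0; i < n; i++)
		val |= F_MMU1_LARB(larbs[i]);
	return val;
}

/* Hypothetical helper: returns 1 if the larb's 2-bit field selects mmu1. */
static int larb_uses_mmu1(uint32_t bus_sel, unsigned int larbid)
{
	return ((bus_sel >> SMI_BUS_LARB_SHIFT(larbid)) & 0x3) == 0x1;
}
```

For the mt8183 choice of larb1/2/5/7 this gives 0x4 | 0x10 | 0x400 | 0x4000,
i.e. 0x4414, and a value of 0 leaves every larb on mmu0.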

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/memory/mtk-smi.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 9790801..08cf40d 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -49,6 +49,12 @@
 #define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
 #define F_MMU_EN   BIT(0)
 
+/* SMI COMMON */
+#define SMI_BUS_SEL0x220
+#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
+/* All are MMU0 defaultly. Only specialize mmu1 here. */
+#define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
+
 enum mtk_smi_gen {
MTK_SMI_GEN1,
MTK_SMI_GEN2
@@ -57,6 +63,7 @@ enum mtk_smi_gen {
 struct mtk_smi_common_plat {
enum mtk_smi_gen gen;
bool has_gals;
+   u32  bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
 };
 
 struct mtk_smi_larb_gen {
@@ -72,8 +79,8 @@ struct mtk_smi {
struct clk  *clk_apb, *clk_smi;
struct clk  *clk_gals0, *clk_gals1;
struct clk  *clk_async; /*only needed by mt2701*/
-   void __iomem*smi_ao_base;
-
+   void __iomem*smi_ao_base; /* only for gen1 */
+   void __iomem*base;/* only for gen2 */
const struct mtk_smi_common_plat *plat;
 };
 
@@ -410,6 +417,8 @@ static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
 static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
.gen  = MTK_SMI_GEN2,
.has_gals = true,
+   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
+   F_MMU1_LARB(7),
 };
 
 static const struct of_device_id mtk_smi_common_of_ids[] = {
@@ -482,6 +491,11 @@ static int mtk_smi_common_probe(struct platform_device *pdev)
ret = clk_prepare_enable(common->clk_async);
if (ret)
return ret;
+   } else {
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   common->base = devm_ioremap_resource(dev, res);
+   if (IS_ERR(common->base))
+   return PTR_ERR(common->base);
}
pm_runtime_enable(dev);
platform_set_drvdata(pdev, common);
@@ -497,6 +511,7 @@ static int mtk_smi_common_remove(struct platform_device *pdev)
 static int __maybe_unused mtk_smi_common_resume(struct device *dev)
 {
struct mtk_smi *common = dev_get_drvdata(dev);
+   u32 bus_sel = common->plat->bus_sel;
int ret;
 
ret = mtk_smi_clk_enable(common);
@@ -504,6 +519,9 @@ static int __maybe_unused mtk_smi_common_resume(struct device *dev)
dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
return ret;
}
+
+   if (common->plat->gen == MTK_SMI_GEN2 && bus_sel)
+   writel(bus_sel, common->base + SMI_BUS_SEL);
return 0;
 }
 
-- 
1.9.1



[PATCH v7 14/21] iommu/mediatek: Add mmu1 support

2019-06-10 Thread Yong Wu
Normally the M4U HW connects EMI with smi. The diagram is like below:

              EMI
               |
              M4U
               |
          smi-common
               |
      -------------------
      |     |     |     | ...
    larb0 larb1 larb2 larb3

Actually there are 2 mmu cells in the M4U HW, like this diagram:

              EMI
          -----------
          |         |
         mmu0      mmu1     <- M4U
          |         |
          -----------
               |
          smi-common
               |
      -------------------
      |     |     |     | ...
    larb0 larb1 larb2 larb3

This patch adds support for mmu1. In order to get better performance,
we can route some larbs to mmu1 while the others still go to
mmu0. This is controlled by a SMI COMMON register SMI_BUS_SEL(0x220).

mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. The default
value of that register is 0, which means all the larbs go to mmu0
by default.

This is a preparatory patch for adjusting SMI_BUS_SEL for mt8183.
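
The way the fault handler picks between the mmu0 and mmu1 register banks can
be sketched in standalone C. The offsets, masks and bit fields are copied
from the diff below; struct fault_regs and the helper names are hypothetical
and only serve this illustration.

```c
#include <stdint.h>

/* Fault status bits in REG_MMU_FAULT_ST1, values as in the patch. */
#define MMU0_FAULT_MASK    0x007fu   /* GENMASK(6, 0) */
#define MMU1_FAULT_MASK    0x3f80u   /* GENMASK(13, 7), implied by the else branch */

/* Register offsets as in the patch. */
#define REG_MMU0_FAULT_VA  0x13c
#define REG_MMU0_INVLD_PA  0x140
#define REG_MMU1_FAULT_VA  0x144
#define REG_MMU1_INVLD_PA  0x148
#define REG_MMU0_INT_ID    0x150
#define REG_MMU1_INT_ID    0x154

/* Hypothetical container for the three offsets the ISR reads. */
struct fault_regs {
	uint32_t int_id;
	uint32_t fault_va;
	uint32_t invld_pa;
};

/* Same decision as the ISR: mmu0 if any mmu0 bit is set, otherwise mmu1. */
static struct fault_regs pick_fault_regs(uint32_t int_state)
{
	struct fault_regs r;

	if (int_state & MMU0_FAULT_MASK) {
		r.int_id = REG_MMU0_INT_ID;
		r.fault_va = REG_MMU0_FAULT_VA;
		r.invld_pa = REG_MMU0_INVLD_PA;
	} else {
		r.int_id = REG_MMU1_INT_ID;
		r.fault_va = REG_MMU1_FAULT_VA;
		r.invld_pa = REG_MMU1_INVLD_PA;
	}
	return r;
}

/* The INT_ID layout is shared by both cells. */
static unsigned int int_id_to_larb(uint32_t regval)
{
	return (regval >> 7) & 0x7;
}

static unsigned int int_id_to_port(uint32_t regval)
{
	return (regval >> 2) & 0x1f;
}
```

Note that when both cells report a fault at once, this selection reads the
mmu0 bank only, which matches the if/else in the patch.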

Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 46 +-
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 3a14301..ec4ce74 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -72,26 +72,32 @@
 #define F_INT_CLR_BIT  BIT(12)
 
 #define REG_MMU_INT_MAIN_CONTROL   0x124
-#define F_INT_TRANSLATION_FAULTBIT(0)
-#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
-#define F_INT_INVALID_PA_FAULT BIT(2)
-#define F_INT_ENTRY_REPLACEMENT_FAULT  BIT(3)
-#define F_INT_TLB_MISS_FAULT   BIT(4)
-#define F_INT_MISS_TRANSACTION_FIFO_FAULT  BIT(5)
-#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   BIT(6)
+   /* mmu0 | mmu1 */
+#define F_INT_TRANSLATION_FAULT(BIT(0) | BIT(7))
+#define F_INT_MAIN_MULTI_HIT_FAULT (BIT(1) | BIT(8))
+#define F_INT_INVALID_PA_FAULT (BIT(2) | BIT(9))
+#define F_INT_ENTRY_REPLACEMENT_FAULT  (BIT(3) | BIT(10))
+#define F_INT_TLB_MISS_FAULT   (BIT(4) | BIT(11))
+#define F_INT_MISS_TRANSACTION_FIFO_FAULT  (BIT(5) | BIT(12))
+#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   (BIT(6) | BIT(13))
 
 #define REG_MMU_CPE_DONE   0x12C
 
 #define REG_MMU_FAULT_ST1  0x134
+#define F_REG_MMU0_FAULT_MASK  GENMASK(6, 0)
+#define F_REG_MMU1_FAULT_MASK  GENMASK(13, 7)
 
-#define REG_MMU_FAULT_VA   0x13c
+#define REG_MMU0_FAULT_VA  0x13c
 #define F_MMU_FAULT_VA_WRITE_BIT   BIT(1)
 #define F_MMU_FAULT_VA_LAYER_BIT   BIT(0)
 
-#define REG_MMU_INVLD_PA   0x140
-#define REG_MMU_INT_ID 0x150
-#define F_MMU0_INT_ID_LARB_ID(a)   (((a) >> 7) & 0x7)
-#define F_MMU0_INT_ID_PORT_ID(a)   (((a) >> 2) & 0x1f)
+#define REG_MMU0_INVLD_PA  0x140
+#define REG_MMU1_FAULT_VA  0x144
+#define REG_MMU1_INVLD_PA  0x148
+#define REG_MMU0_INT_ID0x150
+#define REG_MMU1_INT_ID0x154
+#define F_MMU_INT_ID_LARB_ID(a)(((a) >> 7) & 0x7)
+#define F_MMU_INT_ID_PORT_ID(a)(((a) >> 2) & 0x1f)
 
 #define MTK_PROTECT_PA_ALIGN   128
 
@@ -210,13 +216,19 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
 
/* Read error info from registers */
int_state = readl_relaxed(data->base + REG_MMU_FAULT_ST1);
-   fault_iova = readl_relaxed(data->base + REG_MMU_FAULT_VA);
+   if (int_state & F_REG_MMU0_FAULT_MASK) {
+   regval = readl_relaxed(data->base + REG_MMU0_INT_ID);
+   fault_iova = readl_relaxed(data->base + REG_MMU0_FAULT_VA);
+   fault_pa = readl_relaxed(data->base + REG_MMU0_INVLD_PA);
+   } else {
+   regval = readl_relaxed(data->base + REG_MMU1_INT_ID);
+   fault_iova = readl_relaxed(data->base + REG_MMU1_FAULT_VA);
+   fault_pa = readl_relaxed(data->base + REG_MMU1_INVLD_PA);
+   }
layer = fault_iova & F_MMU_FAULT_VA_LAYER_BIT;
write = fault_iova & F_MMU_FAULT_VA_WRITE_BIT;
-   fault_pa = readl_relaxed(data->base + REG_MMU_INVLD_PA);
-   regval = readl_relaxed(data->base + REG_MMU_INT_ID);
-   fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
-   fault_port = F_MMU0_INT_ID_PORT_ID(regval);
+   fault_larb = F_MMU_INT_ID_LARB_ID(regval);
+   fault_port = F_MMU_INT_ID_PORT_ID(regval);
 
fault_larb = data->plat_data->larbid_remap[fault_larb];
 
-- 
1.9.1



[PATCH v7 17/21] memory: mtk-smi: Get rid of need_larbid

2019-06-10 Thread Yong Wu
The "mediatek,larb-id" has already been parsed in the MTK IOMMU driver.
There is no need to parse it again in the SMI driver, so this patch only
cleans up some code. It applies to all the current SoCs: mt2701, mt2712,
mt7623, mt8173 and mt8183.

After this patch, the "mediatek,larb-id" is only needed for mt2712,
which has 2 M4Us. In the other SoCs, we can get the larb-id from the M4U,
in which the larbs in "mediatek,larbs" are always ordered.

Correspondingly, the larb_nr in the "struct mtk_smi_iommu" can also
be deleted.
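
The lookup that replaces the "mediatek,larb-id" parsing can be sketched in
standalone C: the larb id is simply the index of the slot whose device
pointer matches. struct larb_slot and find_larbid() are hypothetical names
for this illustration, and the -1 return value is an assumption rather than
the driver's actual error code.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry of the larb_imu[] table. */
struct larb_slot {
	const void *dev;   /* device that owns this slot */
	unsigned int mmu;  /* per-larb mmu enable bits */
};

/*
 * Walk every slot and return the index whose device matches; that index
 * is used as the larb id. Returns -1 if the device is not in the table.
 */
static int find_larbid(const struct larb_slot *slots, size_t nr,
		       const void *dev)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (slots[i].dev == dev)
			return (int)i;
	}
	return -1;
}
```

Because the loop no longer depends on a stored count, the table can be
scanned up to its fixed maximum size, which is why larb_nr becomes
unnecessary.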

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c|  1 -
 drivers/iommu/mtk_iommu_v1.c |  2 --
 drivers/memory/mtk-smi.c | 26 ++
 include/soc/mediatek/smi.h   |  1 -
 4 files changed, 2 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index ec4ce74..6053b8b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -634,7 +634,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
 "mediatek,larbs", NULL);
if (larb_nr < 0)
return larb_nr;
-   data->smi_imu.larb_nr = larb_nr;
 
for (i = 0; i < larb_nr; i++) {
struct device_node *larbnode;
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 52b01e3..73308ad 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -624,8 +624,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
larb_nr++;
}
 
-   data->smi_imu.larb_nr = larb_nr;
-
platform_set_drvdata(pdev, data);
 
ret = mtk_iommu_hw_init(data);
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 08cf40d..10e6493 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -67,7 +67,6 @@ struct mtk_smi_common_plat {
 };
 
 struct mtk_smi_larb_gen {
-   bool need_larbid;
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *);
unsigned int larb_direct_to_common_mask;
@@ -153,18 +152,9 @@ void mtk_smi_larb_put(struct device *larbdev)
struct mtk_smi_iommu *smi_iommu = data;
unsigned int i;
 
-   if (larb->larb_gen->need_larbid) {
-   larb->mmu = &smi_iommu->larb_imu[larb->larbid].mmu;
-   return 0;
-   }
-
-   /*
-* If there is no larbid property, Loop to find the corresponding
-* iommu information.
-*/
-   for (i = 0; i < smi_iommu->larb_nr; i++) {
+   for (i = 0; i < MTK_LARB_NR_MAX; i++) {
if (dev == smi_iommu->larb_imu[i].dev) {
-   /* The 'mmu' may be updated in iommu-attach/detach. */
+   larb->larbid = i;
larb->mmu = &smi_iommu->larb_imu[i].mmu;
return 0;
}
@@ -243,7 +233,6 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
 };
 
 static const struct mtk_smi_larb_gen mtk_smi_larb_mt2701 = {
-   .need_larbid = true,
.port_in_larb = {
LARB0_PORT_OFFSET, LARB1_PORT_OFFSET,
LARB2_PORT_OFFSET, LARB3_PORT_OFFSET
@@ -252,7 +241,6 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
 };
 
 static const struct mtk_smi_larb_gen mtk_smi_larb_mt2712 = {
-   .need_larbid = true,
.config_port= mtk_smi_larb_config_port_gen2_general,
.larb_direct_to_common_mask = BIT(8) | BIT(9),  /* bdpsys */
 };
@@ -291,7 +279,6 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct device_node *smi_node;
struct platform_device *smi_pdev;
-   int err;
 
larb = devm_kzalloc(dev, sizeof(*larb), GFP_KERNEL);
if (!larb)
@@ -321,15 +308,6 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
}
larb->smi.dev = dev;
 
-   if (larb->larb_gen->need_larbid) {
-   err = of_property_read_u32(dev->of_node, "mediatek,larb-id",
-  &larb->larbid);
-   if (err) {
-   dev_err(dev, "missing larbid property\n");
-   return err;
-   }
-   }
-
smi_node = of_parse_phandle(dev->of_node, "mediatek,smi", 0);
if (!smi_node)
return -EINVAL;
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 5201e90..a65324d 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -29,7 +29,6 @@ struct mtk_smi_larb_iommu {
 };
 
 struct mtk_smi_iommu {
-   unsigned int larb_nr;
struct mtk_smi_larb_iommu larb_imu[MTK_LARB_NR_MAX];
 };
 
-- 
1.9.1



[PATCH v7 15/21] memory: mtk-smi: Invoke pm runtime_callback to enable clocks

2019-06-10 Thread Yong Wu
This patch only moves the clk_prepare_enable and config_port into the
runtime suspend/resume callbacks. It doesn't change the code content
or sequence.

This is a preparatory patch for adjusting SMI_BUS_SEL for mt8183.
(SMI_BUS_SEL needs to be restored every time smi-common resumes.)
It also gives a chance to get rid of mtk_smi_larb_get/put, which could
be a follow-up topic.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/memory/mtk-smi.c | 113 ++-
 1 file changed, 72 insertions(+), 41 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index a430721..9790801 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -86,17 +86,13 @@ struct mtk_smi_larb { /* larb: local arbiter */
u32 *mmu;
 };
 
-static int mtk_smi_enable(const struct mtk_smi *smi)
+static int mtk_smi_clk_enable(const struct mtk_smi *smi)
 {
int ret;
 
-   ret = pm_runtime_get_sync(smi->dev);
-   if (ret < 0)
-   return ret;
-
ret = clk_prepare_enable(smi->clk_apb);
if (ret)
-   goto err_put_pm;
+   return ret;
 
ret = clk_prepare_enable(smi->clk_smi);
if (ret)
@@ -118,59 +114,28 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
clk_disable_unprepare(smi->clk_smi);
 err_disable_apb:
clk_disable_unprepare(smi->clk_apb);
-err_put_pm:
-   pm_runtime_put_sync(smi->dev);
return ret;
 }
 
-static void mtk_smi_disable(const struct mtk_smi *smi)
+static void mtk_smi_clk_disable(const struct mtk_smi *smi)
 {
clk_disable_unprepare(smi->clk_gals1);
clk_disable_unprepare(smi->clk_gals0);
clk_disable_unprepare(smi->clk_smi);
clk_disable_unprepare(smi->clk_apb);
-   pm_runtime_put_sync(smi->dev);
 }
 
 int mtk_smi_larb_get(struct device *larbdev)
 {
-   struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
-   const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
-   struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
-   int ret;
+   int ret = pm_runtime_get_sync(larbdev);
 
-   /* Enable the smi-common's power and clocks */
-   ret = mtk_smi_enable(common);
-   if (ret)
-   return ret;
-
-   /* Enable the larb's power and clocks */
-   ret = mtk_smi_enable(&larb->smi);
-   if (ret) {
-   mtk_smi_disable(common);
-   return ret;
-   }
-
-   /* Configure the iommu info for this larb */
-   larb_gen->config_port(larbdev);
-
-   return 0;
+   return (ret < 0) ? ret : 0;
 }
 EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
 
 void mtk_smi_larb_put(struct device *larbdev)
 {
-   struct mtk_smi_larb *larb = dev_get_drvdata(larbdev);
-   struct mtk_smi *common = dev_get_drvdata(larb->smi_common_dev);
-
-   /*
-* Don't de-configure the iommu info for this larb since there may be
-* several modules in this larb.
-* The iommu info will be reset after power off.
-*/
-
-   mtk_smi_disable(&larb->smi);
-   mtk_smi_disable(common);
+   pm_runtime_put_sync(larbdev);
 }
 EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
 
@@ -385,12 +350,52 @@ static int mtk_smi_larb_remove(struct platform_device *pdev)
return 0;
 }
 
+static int __maybe_unused mtk_smi_larb_resume(struct device *dev)
+{
+   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
+   const struct mtk_smi_larb_gen *larb_gen = larb->larb_gen;
+   int ret;
+
+   /* Power on smi-common. */
+   ret = pm_runtime_get_sync(larb->smi_common_dev);
+   if (ret < 0) {
+   dev_err(dev, "Failed to pm get for smi-common(%d).\n", ret);
+   return ret;
+   }
+
+   ret = mtk_smi_clk_enable(&larb->smi);
+   if (ret < 0) {
+   dev_err(dev, "Failed to enable clock(%d).\n", ret);
+   pm_runtime_put_sync(larb->smi_common_dev);
+   return ret;
+   }
+
+   /* Configure the basic setting for this larb */
+   larb_gen->config_port(dev);
+
+   return 0;
+}
+
+static int __maybe_unused mtk_smi_larb_suspend(struct device *dev)
+{
+   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
+
+   mtk_smi_clk_disable(&larb->smi);
+   pm_runtime_put_sync(larb->smi_common_dev);
+   return 0;
+}
+
+static const struct dev_pm_ops smi_larb_pm_ops = {
+   SET_RUNTIME_PM_OPS(mtk_smi_larb_suspend, mtk_smi_larb_resume, NULL)
+};
+
 static struct platform_driver mtk_smi_larb_driver = {
.probe  = mtk_smi_larb_probe,
.remove = mtk_smi_larb_remove,
.driver = {
.name = "mtk-smi-larb",
.of_match_table = mtk_smi_larb_of_ids,
+   .pm = &smi_larb_pm_ops,
}
 };
 
@@ -489,12 +494,38 @@ static int mtk_smi_common_remove(struct platform_device *pdev)
return 0;
 }
 
+static int __maybe_unused 

[PATCH v7 13/21] iommu/mediatek: Add mt8183 IOMMU support

2019-06-10 Thread Yong Wu
The M4U IP blocks in mt8183 are MediaTek's generation2 M4U, which uses
the ARM Short-descriptor format like mt8173, and most of the HW registers
are the same.

Here is a list of the main differences between mt8183 and mt8173/mt2712:
1) mt8183 has only one M4U HW like mt8173, while mt2712 has two.
2) mt8183 doesn't have the "bclk" clock; it uses the EMI clock instead.
3) mt8183 can support dram over 4GB, but it doesn't call this "4GB
mode".
4) In mt8183, bit[1:0] of the pgtable base register(0x0) is extended to
represent bit[33:32] of the physical address of the pgtable base. But the
standard ttbr0[1] is the S bit, which is enabled by default. Hence,
we add a mask.
5) mt8183 HW has GALS modules; SMI should enable "has_gals" support.
6) mt8183 needs reset_axi like mt8173.
7) the larb-id in smi-common is remapped. M4U should add its larbid_remap.
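
Points 4) and 7) can be illustrated with a small standalone C sketch. The
mask value and the remap table are copied from the diff below; the helper
names pt_base_reg_value() and remap_fault_larb() are hypothetical and not
part of the driver.

```c
#include <stdint.h>

/* GENMASK(31, 7): keep only the address bits of ttbr0, as in the patch. */
#define MMU_PT_ADDR_MASK  0xffffff80u

/* mt8183 larbid_remap table from the patch, indexed by the HW larb id. */
static const unsigned int mt8183_larbid_remap[8] = { 0, 4, 5, 6, 7, 2, 3, 1 };

/*
 * Hypothetical helper: value written to the pgtable base register.
 * The low attribute bits of ttbr0 (including the S bit in bit 1) are
 * cleared so they are not misread as PA bit[33:32].
 */
static uint32_t pt_base_reg_value(uint32_t ttbr0)
{
	return ttbr0 & MMU_PT_ADDR_MASK;
}

/* Hypothetical helper: translate the larb id reported by the HW. */
static unsigned int remap_fault_larb(unsigned int hw_larb)
{
	return mt8183_larbid_remap[hw_larb & 0x7];
}
```

For example, a ttbr0 of 0x1234567b is written as 0x12345600, and a fault
reported for HW larb 1 is looked up as entry 4 of the table.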

Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 15 ---
 drivers/iommu/mtk_iommu.h |  1 +
 drivers/memory/mtk-smi.c  | 20 
 3 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a535dcd..3a14301 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -36,6 +36,7 @@
 #include "mtk_iommu.h"
 
 #define REG_MMU_PT_BASE_ADDR   0x000
+#define MMU_PT_ADDR_MASK   GENMASK(31, 7)
 
 #define REG_MMU_INVALIDATE 0x020
 #define F_ALL_INVLD0x2
@@ -341,7 +342,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
/* Update the pgtable base address register of the M4U HW */
if (!data->m4u_dom) {
data->m4u_dom = dom;
-   writel(dom->cfg.arm_v7s_cfg.ttbr[0],
+   writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
   data->base + REG_MMU_PT_BASE_ADDR);
}
 
@@ -715,6 +716,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 {
struct mtk_iommu_data *data = dev_get_drvdata(dev);
struct mtk_iommu_suspend_reg *reg = &data->reg;
+   struct mtk_iommu_domain *m4u_dom = data->m4u_dom;
void __iomem *base = data->base;
int ret;
 
@@ -730,8 +732,8 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
writel_relaxed(reg->int_control0, base + REG_MMU_INT_CONTROL0);
writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
-   if (data->m4u_dom)
-   writel(data->m4u_dom->cfg.arm_v7s_cfg.ttbr[0],
+   if (m4u_dom)
+   writel(m4u_dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
   base + REG_MMU_PT_BASE_ADDR);
return 0;
 }
@@ -756,9 +758,16 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
.larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
 };
 
+static const struct mtk_iommu_plat_data mt8183_data = {
+   .m4u_plat = M4U_MT8183,
+   .reset_axi= true,
+   .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1},
+};
+
 static const struct of_device_id mtk_iommu_of_ids[] = {
{ .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
{ .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
+   { .compatible = "mediatek,mt8183-m4u", .data = &mt8183_data},
{}
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index e5c9dde..c0b5c65 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -38,6 +38,7 @@ enum mtk_iommu_plat {
M4U_MT2701,
M4U_MT2712,
M4U_MT8173,
+   M4U_MT8183,
 };
 
 struct mtk_iommu_plat_data {
diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 91634d7..a430721 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -285,6 +285,13 @@ static void mtk_smi_larb_config_port_gen1(struct device *dev)
.larb_direct_to_common_mask = BIT(8) | BIT(9),  /* bdpsys */
 };
 
+static const struct mtk_smi_larb_gen mtk_smi_larb_mt8183 = {
+   .has_gals   = true,
+   .config_port= mtk_smi_larb_config_port_gen2_general,
+   .larb_direct_to_common_mask = BIT(2) | BIT(3) | BIT(7),
+ /* IPU0 | IPU1 | CCU */
+};
+
 static const struct of_device_id mtk_smi_larb_of_ids[] = {
{
.compatible = "mediatek,mt8173-smi-larb",
@@ -298,6 +305,10 @@ static void mtk_smi_larb_config_port_gen1(struct device 
*dev)
.compatible = "mediatek,mt2712-smi-larb",
.data = &mtk_smi_larb_mt2712
},
+   {
+   .compatible = "mediatek,mt8183-smi-larb",
+   .data = &mtk_smi_larb_mt8183
+   },
{}
 };
 
@@ -391,6 +402,11 @@ static int mtk_smi_larb_remove(struct platform_device 
*pdev)
.gen = MTK_SMI_GEN2,
 };
 
+static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
+   

[PATCH v7 10/21] iommu/mediatek: Move reset_axi into plat_data

2019-06-10 Thread Yong Wu
On mt8173 and mt8183, offset 0x48 is REG_MMU_STANDARD_AXI_MODE, while on
the other SoCs it is REG_MMU_CTRL, whose bits have a completely different
meaning from those of REG_MMU_STANDARD_AXI_MODE.

This patch moves this property into plat_data. It is also a preparatory
patch for mt8183.
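
A minimal sketch of the idea, not the driver code: the SoC comparison
becomes a capability flag carried in per-SoC data. The "_sketch" names
and the helper are hypothetical; the flag values follow the mt8173 and
mt8183 entries in this series, and mt2712 leaves the flag unset.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct mtk_iommu_plat_data. */
struct plat_data_sketch {
	/* true when 0x48 is REG_MMU_STANDARD_AXI_MODE and must be cleared */
	bool reset_axi;
};

static const struct plat_data_sketch mt2712_sketch = { .reset_axi = false };
static const struct plat_data_sketch mt8173_sketch = { .reset_axi = true };
static const struct plat_data_sketch mt8183_sketch = { .reset_axi = true };

/* Hypothetical helper: the init path asks the data, not the SoC enum. */
static bool needs_axi_reset(const struct plat_data_sketch *plat)
{
	return plat->reset_axi;
}
```

With this shape, supporting another SoC means adding one more data
entry rather than another "m4u_plat == ..." comparison.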

Signed-off-by: Yong Wu 
Reviewed-by: Nicolas Boichat 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 4 ++--
 drivers/iommu/mtk_iommu.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d38dfa2..8ac7034 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -557,8 +557,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
}
writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
 
-   /* It's MISC control register whose default value is ok except mt8173.*/
-   if (data->plat_data->m4u_plat == M4U_MT8173)
+   if (data->plat_data->reset_axi)
writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
 
if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
@@ -752,6 +751,7 @@ static int __maybe_unused mtk_iommu_resume(struct device 
*dev)
.m4u_plat = M4U_MT8173,
.has_4gb_mode = true,
.has_bclk = true,
+   .reset_axi= true,
.larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 61fd5d6..55d73c1 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -46,7 +46,7 @@ struct mtk_iommu_plat_data {
 
/* HW will use the EMI clock if there isn't the "bclk". */
boolhas_bclk;
-
+   boolreset_axi;
unsigned char   larbid_remap[MTK_LARB_NR_MAX];
 };
 
-- 
1.9.1



[PATCH v7 11/21] iommu/mediatek: Move vld_pa_rng into plat_data

2019-06-10 Thread Yong Wu
Neither mt8173 nor mt8183 has the vld_pa_rng (valid physical address
range) register, while mt2712 does. Move it into the plat_data.

Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 3 ++-
 drivers/iommu/mtk_iommu.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 8ac7034..a535dcd 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -547,7 +547,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
 upper_32_bits(data->protect_base);
writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
 
-   if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
+   if (data->enable_4GB && data->plat_data->has_vld_pa_rng) {
/*
 * If 4GB mode is enabled, the validate PA range is from
 * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
@@ -744,6 +744,7 @@ static int __maybe_unused mtk_iommu_resume(struct device 
*dev)
.m4u_plat = M4U_MT2712,
.has_4gb_mode = true,
.has_bclk = true,
+   .has_vld_pa_rng   = true,
.larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 55d73c1..e5c9dde 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -47,6 +47,7 @@ struct mtk_iommu_plat_data {
/* HW will use the EMI clock if there isn't the "bclk". */
boolhas_bclk;
boolreset_axi;
+   boolhas_vld_pa_rng;
unsigned char   larbid_remap[MTK_LARB_NR_MAX];
 };
 
-- 
1.9.1



[PATCH v7 12/21] memory: mtk-smi: Add gals support

2019-06-10 Thread Yong Wu
In some SoCs like mt8183, SMI adds a GALS (Global Async Local Sync) module
which helps synchronize modules running at different clock frequencies.
It can be seen as an "asynchronous FIFO". This is an example diagram:

            M4U
             |
    ----------------
    |               |
    gals0-rx   gals1-rx
    |               |
    |               |
    gals0-tx   gals1-tx
    |               |
   ------------------------
         SMI Common
   ------------------------
             |
  +-----+--------+-----+- ...
  |     |        |     |
  |  gals-rx  gals-rx  |
  |     |        |     |
  |     |        |     |
  |  gals-tx  gals-tx  |
  |     |        |     |
larb1 larb2   larb3  larb4

GALS only helps transfer commands/data and has no configuration
registers, so it has the special "smi" clock but no "apb" clock.
From the diagram above, we add "gals0" and "gals1" clocks for
smi-common and a "gals" clock for smi-larb.

This patch adds GALS clock support to the SMI. Note that some larbs,
like larb1 and larb4 above, may still not have the "gals" clock.
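
The optional-clock handling described above can be sketched in plain C
as follows. This is an illustration under assumptions, not the driver
code: "struct clk_sketch", "struct clk_lookup" and
resolve_optional_clk() are hypothetical stand-ins for the kernel clk
API, and only the -ENOENT policy mirrors what the patch does for the
larb "gals" clock.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clk. */
struct clk_sketch {
	const char *name;
};

/* Hypothetical lookup result: a clock, or a negative errno in err. */
struct clk_lookup {
	struct clk_sketch *clk;
	int err;
};

/*
 * A missing "gals" clock (-ENOENT) is not an error: the larb simply
 * has no GALS, so the clock pointer is left NULL. Any other failure
 * is passed back to the caller.
 */
static int resolve_optional_clk(struct clk_lookup res,
				struct clk_sketch **out)
{
	if (res.err == -ENOENT) {
		*out = NULL;
		return 0;
	}
	if (res.err)
		return res.err;

	*out = res.clk;
	return 0;
}
```

Leaving the pointer NULL is what keeps the enable/disable paths free of
extra checks, since the kernel clk calls accept a NULL clock as a no-op.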

This is also a preparatory patch for mt8183, which has GALS.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/memory/mtk-smi.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 8a2f968..91634d7 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -56,6 +56,7 @@ enum mtk_smi_gen {
 
 struct mtk_smi_common_plat {
enum mtk_smi_gen gen;
+   bool has_gals;
 };
 
 struct mtk_smi_larb_gen {
@@ -63,11 +64,13 @@ struct mtk_smi_larb_gen {
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *);
unsigned int larb_direct_to_common_mask;
+   bool has_gals;
 };
 
 struct mtk_smi {
struct device   *dev;
struct clk  *clk_apb, *clk_smi;
+   struct clk  *clk_gals0, *clk_gals1;
struct clk  *clk_async; /*only needed by mt2701*/
void __iomem*smi_ao_base;
 
@@ -99,8 +102,20 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
if (ret)
goto err_disable_apb;
 
+   ret = clk_prepare_enable(smi->clk_gals0);
+   if (ret)
+   goto err_disable_smi;
+
+   ret = clk_prepare_enable(smi->clk_gals1);
+   if (ret)
+   goto err_disable_gals0;
+
return 0;
 
+err_disable_gals0:
+   clk_disable_unprepare(smi->clk_gals0);
+err_disable_smi:
+   clk_disable_unprepare(smi->clk_smi);
 err_disable_apb:
clk_disable_unprepare(smi->clk_apb);
 err_put_pm:
@@ -110,6 +125,8 @@ static int mtk_smi_enable(const struct mtk_smi *smi)
 
 static void mtk_smi_disable(const struct mtk_smi *smi)
 {
+   clk_disable_unprepare(smi->clk_gals1);
+   clk_disable_unprepare(smi->clk_gals0);
clk_disable_unprepare(smi->clk_smi);
clk_disable_unprepare(smi->clk_apb);
pm_runtime_put_sync(smi->dev);
@@ -310,6 +327,15 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
larb->smi.clk_smi = devm_clk_get(dev, "smi");
if (IS_ERR(larb->smi.clk_smi))
return PTR_ERR(larb->smi.clk_smi);
+
+   if (larb->larb_gen->has_gals) {
+   /* The larbs may still haven't gals even if the SoC support.*/
+   larb->smi.clk_gals0 = devm_clk_get(dev, "gals");
+   if (PTR_ERR(larb->smi.clk_gals0) == -ENOENT)
+   larb->smi.clk_gals0 = NULL;
+   else if (IS_ERR(larb->smi.clk_gals0))
+   return PTR_ERR(larb->smi.clk_gals0);
+   }
larb->smi.dev = dev;
 
if (larb->larb_gen->need_larbid) {
@@ -402,6 +428,16 @@ static int mtk_smi_common_probe(struct platform_device 
*pdev)
if (IS_ERR(common->clk_smi))
return PTR_ERR(common->clk_smi);
 
+   if (common->plat->has_gals) {
+   common->clk_gals0 = devm_clk_get(dev, "gals0");
+   if (IS_ERR(common->clk_gals0))
+   return PTR_ERR(common->clk_gals0);
+
+   common->clk_gals1 = devm_clk_get(dev, "gals1");
+   if (IS_ERR(common->clk_gals1))
+   return PTR_ERR(common->clk_gals1);
+   }
+
/*
 * for mtk smi gen 1, we need to get the ao(always on) base to config
 * m4u port, and we need to enable the aync clock for transform the smi
-- 
1.9.1



[PATCH v7 09/21] iommu/mediatek: Refine protect memory definition

2019-06-10 Thread Yong Wu
The protect memory setting differs slightly between SoCs. In the
register REG_MMU_CTRL_REG (0x110), the TF_PROT (translation fault
protect) field is normally shifted by 4 bits, while it is shifted by
5 bits only on mt8173. This patch deletes the complex macro and uses
a plain if-else instead.
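
As a worked example of the resulting values (a sketch only:
mmu_ctrl_regval() is a hypothetical helper, and the constants are
copied from this patch but written as unsigned):

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)	(1U << (n))

/* Field values as defined in this patch. */
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR		(2U << 4)
#define F_MMU_PREFETCH_RT_REPLACE_MOD		BIT(4)
#define F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173	(2U << 5)

/* Hypothetical helper: the value written to REG_MMU_CTRL_REG. */
static uint32_t mmu_ctrl_regval(bool is_mt8173)
{
	if (is_mt8173)
		return F_MMU_PREFETCH_RT_REPLACE_MOD |
		       F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;

	return F_MMU_TF_PROT_TO_PROGRAM_ADDR;
}
```

On mt8173 this gives 0x10 | 0x40 = 0x50; on the other SoCs it gives
0x20.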

Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index ad838b9..d38dfa2 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -52,12 +52,9 @@
 #define REG_MMU_DCM_DIS0x050
 
 #define REG_MMU_CTRL_REG   0x110
+#define F_MMU_TF_PROT_TO_PROGRAM_ADDR  (2 << 4)
 #define F_MMU_PREFETCH_RT_REPLACE_MOD  BIT(4)
-#define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
-   ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
-/* It's named by F_MMU_TF_PROT_SEL in mt2712. */
-#define F_MMU_TF_PROTECT_SEL(prot, data) \
-   (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
+#define F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173   (2 << 5)
 
 #define REG_MMU_IVRP_PADDR 0x114
 
@@ -519,9 +516,11 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
return ret;
}
 
-   regval = F_MMU_TF_PROTECT_SEL(2, data);
if (data->plat_data->m4u_plat == M4U_MT8173)
-   regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
+   regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
+F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;
+   else
+   regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR;
writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
 
regval = F_L2_MULIT_HIT_EN |
-- 
1.9.1



[PATCH v7 08/21] iommu/mediatek: Add larb-id remapped support

2019-06-10 Thread Yong Wu
The larb-id may be remapped in the smi-common, which means the
larb-id reported in mtk_iommu_isr isn't the real larb-id.

Take mt8183 as an example:
                   M4U
                    |
    ---------------------------------------------
    |               SMI common                  |
    -0-----7-----5-----6-----1-----2------3-----4- <- Id remapped
     |     |     |     |     |     |      |     |
    larb0 larb1 IPU0  IPU1 larb4 larb5  larb6  CCU
    disp  vdec  img   cam   venc  img    cam
As above, larb0 connects with the id 0 in smi-common.
  larb1 connects with the id 7 in smi-common.
  ...
If the larb-id reported in the isr is 7, it is actually larb1 (vdec).
In order to output the right larb-id in the isr, we add a larb-id
remapping relationship in this patch.

SoCs that have no such larb-id remapping use the linear mapping
array instead.
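
A small standalone sketch of the lookup, for illustration only. The
table values are the mt8183 larbid_remap entries from the mt8183 patch
in this series; the struct, the table name, real_larb_id() and the
value of MTK_LARB_NR_MAX are assumptions made for this example.

```c
#define MTK_LARB_NR_MAX	16	/* assumed value for this sketch */

/* Hypothetical, trimmed-down stand-in for struct mtk_iommu_plat_data. */
struct remap_sketch {
	unsigned char larbid_remap[MTK_LARB_NR_MAX];
};

/* Index: id reported by the hardware. Value: the real larb-id. */
static const struct remap_sketch remap_mt8183 = {
	.larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1},
};

/* Hypothetical helper mirroring the one-line lookup added to the isr. */
static unsigned int real_larb_id(const struct remap_sketch *plat,
				 unsigned int reported)
{
	return plat->larbid_remap[reported];
}
```

For example, a reported id of 7 maps to larb1 (vdec) and a reported id
of 5 maps to larb2 (IPU0), which matches the diagram.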

This is also a preparatory patch for mt8183.

Signed-off-by: Yong Wu 
Reviewed-by: Nicolas Boichat 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 4 
 drivers/iommu/mtk_iommu.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 264dda4..ad838b9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -220,6 +220,8 @@ static irqreturn_t mtk_iommu_isr(int irq, void *dev_id)
fault_larb = F_MMU0_INT_ID_LARB_ID(regval);
fault_port = F_MMU0_INT_ID_PORT_ID(regval);
 
+   fault_larb = data->plat_data->larbid_remap[fault_larb];
+
if (report_iommu_fault(&dom->domain, data->dev, fault_iova,
   write ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) {
dev_err_ratelimited(
@@ -744,12 +746,14 @@ static int __maybe_unused mtk_iommu_resume(struct device 
*dev)
.m4u_plat = M4U_MT2712,
.has_4gb_mode = true,
.has_bclk = true,
+   .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
 };
 
 static const struct mtk_iommu_plat_data mt8173_data = {
.m4u_plat = M4U_MT8173,
.has_4gb_mode = true,
.has_bclk = true,
+   .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
 };
 
 static const struct of_device_id mtk_iommu_of_ids[] = {
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 63e235e..61fd5d6 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -46,6 +46,8 @@ struct mtk_iommu_plat_data {
 
/* HW will use the EMI clock if there isn't the "bclk". */
boolhas_bclk;
+
+   unsigned char   larbid_remap[MTK_LARB_NR_MAX];
 };
 
 struct mtk_iommu_domain;
-- 
1.9.1



[PATCH v7 05/21] iommu/io-pgtable-arm-v7s: Add paddr_to_iopte and iopte_to_paddr helpers

2019-06-10 Thread Yong Wu
Add two helper functions: paddr_to_iopte and iopte_to_paddr.
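
A rough sketch of the leaf-entry mask arithmetic used by
iopte_to_paddr, under stated assumptions: the macro definitions below
are written for this example from the ARM short-descriptor layout
(level 1 entries cover 1MB, level 2 entries cover 4KB, a contiguous
run is 16 entries) and are not copied from the driver; leaf_mask() is
a hypothetical helper and the table-entry case is left out.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t iopte_sketch;

/* Assumed for this sketch: shift is 20 for level 1, 12 for level 2. */
#define LVL_SHIFT(lvl)	(32 - (4 + 8 * (lvl)))
#define LVL_MASK(lvl)	((iopte_sketch)(~0U << LVL_SHIFT(lvl)))
#define CONT_PAGES	16

/*
 * Address mask of a leaf entry. Multiplying the mask by 16 in 32-bit
 * arithmetic drops four more low bits, which widens the region to the
 * contiguous (supersection or large page) size.
 */
static iopte_sketch leaf_mask(int lvl, bool is_cont)
{
	iopte_sketch mask = LVL_MASK(lvl);

	if (is_cont)
		mask *= CONT_PAGES;

	return mask;
}
```

Under these assumptions a 4KB page keeps bits [31:12] and a 64KB large
page keeps bits [31:16].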

Signed-off-by: Yong Wu 
Reviewed-by: Robin Murphy 
Reviewed-by: Evan Green 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 45 --
 1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 9a8a887..94c38db 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -180,18 +180,38 @@ struct arm_v7s_io_pgtable {
spinlock_t  split_lock;
 };
 
+static bool arm_v7s_pte_is_cont(arm_v7s_iopte pte, int lvl);
+
 static dma_addr_t __arm_v7s_dma_addr(void *pages)
 {
return (dma_addr_t)virt_to_phys(pages);
 }
 
-static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl)
+static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
+   struct io_pgtable_cfg *cfg)
 {
+   return paddr & ARM_V7S_LVL_MASK(lvl);
+}
+
+static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
+ struct io_pgtable_cfg *cfg)
+{
+   arm_v7s_iopte mask;
+
if (ARM_V7S_PTE_IS_TABLE(pte, lvl))
-   pte &= ARM_V7S_TABLE_MASK;
+   mask = ARM_V7S_TABLE_MASK;
+   else if (arm_v7s_pte_is_cont(pte, lvl))
+   mask = ARM_V7S_LVL_MASK(lvl) * ARM_V7S_CONT_PAGES;
else
-   pte &= ARM_V7S_LVL_MASK(lvl);
-   return phys_to_virt(pte);
+   mask = ARM_V7S_LVL_MASK(lvl);
+
+   return pte & mask;
+}
+
+static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int lvl,
+ struct arm_v7s_io_pgtable *data)
+{
+   return phys_to_virt(iopte_to_paddr(pte, lvl, &data->iop.cfg));
 }
 
 static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
@@ -407,7 +427,7 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable *data,
if (num_entries > 1)
pte = arm_v7s_pte_to_cont(pte, lvl);
 
-   pte |= paddr & ARM_V7S_LVL_MASK(lvl);
+   pte |= paddr_to_iopte(paddr, lvl, cfg);
 
__arm_v7s_set_pte(ptep, pte, num_entries, cfg);
return 0;
@@ -473,7 +493,7 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, 
unsigned long iova,
}
 
if (ARM_V7S_PTE_IS_TABLE(pte, lvl)) {
-   cptep = iopte_deref(pte, lvl);
+   cptep = iopte_deref(pte, lvl, data);
} else if (pte) {
/* We require an unmap first */
WARN_ON(!selftest_running);
@@ -523,7 +543,8 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
arm_v7s_iopte pte = data->pgd[i];
 
if (ARM_V7S_PTE_IS_TABLE(pte, 1))
-   __arm_v7s_free_table(iopte_deref(pte, 1), 2, data);
+   __arm_v7s_free_table(iopte_deref(pte, 1, data),
+2, data);
}
__arm_v7s_free_table(data->pgd, 1, data);
kmem_cache_destroy(data->l2_tables);
@@ -593,7 +614,7 @@ static size_t arm_v7s_split_blk_unmap(struct 
arm_v7s_io_pgtable *data,
if (!ARM_V7S_PTE_IS_TABLE(pte, 1))
return 0;
 
-   tablep = iopte_deref(pte, 1);
+   tablep = iopte_deref(pte, 1, data);
return __arm_v7s_unmap(data, iova, size, 2, tablep);
}
 
@@ -652,7 +673,7 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable 
*data,
io_pgtable_tlb_add_flush(iop, iova, blk_size,
ARM_V7S_BLOCK_SIZE(lvl + 1), false);
io_pgtable_tlb_sync(iop);
-   ptep = iopte_deref(pte[i], lvl);
+   ptep = iopte_deref(pte[i], lvl, data);
__arm_v7s_free_table(ptep, lvl + 1, data);
} else if (iop->cfg.quirks & 
IO_PGTABLE_QUIRK_NON_STRICT) {
/*
@@ -677,7 +698,7 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable 
*data,
}
 
/* Keep on walkin' */
-   ptep = iopte_deref(pte[0], lvl);
+   ptep = iopte_deref(pte[0], lvl, data);
return __arm_v7s_unmap(data, iova, size, lvl + 1, ptep);
 }
 
@@ -703,7 +724,7 @@ static phys_addr_t arm_v7s_iova_to_phys(struct 
io_pgtable_ops *ops,
do {
ptep += ARM_V7S_LVL_IDX(iova, ++lvl);
pte = READ_ONCE(*ptep);
-   ptep = iopte_deref(pte, lvl);
+   ptep = iopte_deref(pte, lvl, data);
} while (ARM_V7S_PTE_IS_TABLE(pte, lvl));
 
if (!ARM_V7S_PTE_IS_VALID(pte))
@@ -712,7 +733,7 @@ static phys_addr_t arm_v7s_iova_to_phys(struct 
io_pgtable_ops *ops,
mask = ARM_V7S_LVL_MASK(lvl);
if (arm_v7s_pte_is_cont(pte, lvl))
mask *= ARM_V7S_CONT_PAGES;
-   return (pte & mask) | (iova & ~mask);
+   return iopte_to_paddr(pte, lvl, &data->iop.cfg) | (iova & 

[PATCH v7 04/21] memory: mtk-smi: Use a struct for the platform data for smi-common

2019-06-10 Thread Yong Wu
Use a struct for the platform-specific data instead of the enumeration.

There is also a minor change: the "enum mtk_smi_gen" definition is
moved up, because "struct mtk_smi_common_plat" uses it and has to be
defined before it is referenced.

This is a preparatory patch for mt8183.

Signed-off-by: Yong Wu 
Reviewed-by: Matthias Brugger 
Reviewed-by: Evan Green 
---
 drivers/memory/mtk-smi.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 9fd6b3d..8a2f968 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -49,6 +49,15 @@
 #define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
 #define F_MMU_EN   BIT(0)
 
+enum mtk_smi_gen {
+   MTK_SMI_GEN1,
+   MTK_SMI_GEN2
+};
+
+struct mtk_smi_common_plat {
+   enum mtk_smi_gen gen;
+};
+
 struct mtk_smi_larb_gen {
bool need_larbid;
int port_in_larb[MTK_LARB_NR_MAX + 1];
@@ -61,6 +70,8 @@ struct mtk_smi {
struct clk  *clk_apb, *clk_smi;
struct clk  *clk_async; /*only needed by mt2701*/
void __iomem*smi_ao_base;
+
+   const struct mtk_smi_common_plat *plat;
 };
 
 struct mtk_smi_larb { /* larb: local arbiter */
@@ -72,11 +83,6 @@ struct mtk_smi_larb { /* larb: local arbiter */
u32 *mmu;
 };
 
-enum mtk_smi_gen {
-   MTK_SMI_GEN1,
-   MTK_SMI_GEN2
-};
-
 static int mtk_smi_enable(const struct mtk_smi *smi)
 {
int ret;
@@ -351,18 +357,26 @@ static int mtk_smi_larb_remove(struct platform_device 
*pdev)
}
 };
 
+static const struct mtk_smi_common_plat mtk_smi_common_gen1 = {
+   .gen = MTK_SMI_GEN1,
+};
+
+static const struct mtk_smi_common_plat mtk_smi_common_gen2 = {
+   .gen = MTK_SMI_GEN2,
+};
+
 static const struct of_device_id mtk_smi_common_of_ids[] = {
{
.compatible = "mediatek,mt8173-smi-common",
-   .data = (void *)MTK_SMI_GEN2
+   .data = &mtk_smi_common_gen2,
},
{
.compatible = "mediatek,mt2701-smi-common",
-   .data = (void *)MTK_SMI_GEN1
+   .data = &mtk_smi_common_gen1,
},
{
.compatible = "mediatek,mt2712-smi-common",
-   .data = (void *)MTK_SMI_GEN2
+   .data = &mtk_smi_common_gen2,
},
{}
 };
@@ -372,13 +386,13 @@ static int mtk_smi_common_probe(struct platform_device 
*pdev)
struct device *dev = &pdev->dev;
struct mtk_smi *common;
struct resource *res;
-   enum mtk_smi_gen smi_gen;
int ret;
 
common = devm_kzalloc(dev, sizeof(*common), GFP_KERNEL);
if (!common)
return -ENOMEM;
common->dev = dev;
+   common->plat = of_device_get_match_data(dev);
 
common->clk_apb = devm_clk_get(dev, "apb");
if (IS_ERR(common->clk_apb))
@@ -394,8 +408,7 @@ static int mtk_smi_common_probe(struct platform_device 
*pdev)
 * clock into emi clock domain, but for mtk smi gen2, there's no smi ao
 * base.
 */
-   smi_gen = (enum mtk_smi_gen)of_device_get_match_data(dev);
-   if (smi_gen == MTK_SMI_GEN1) {
+   if (common->plat->gen == MTK_SMI_GEN1) {
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
common->smi_ao_base = devm_ioremap_resource(dev, res);
if (IS_ERR(common->smi_ao_base))
-- 
1.9.1



[PATCH v7 02/21] iommu/mediatek: Use a struct as the platform data

2019-06-10 Thread Yong Wu
Use a struct for the platform-specific data instead of the enumeration.
This is a preparatory patch for adding mt8183 iommu support.

Signed-off-by: Yong Wu 
Reviewed-by: Matthias Brugger 
Reviewed-by: Evan Green 
---
 drivers/iommu/mtk_iommu.c | 24 
 drivers/iommu/mtk_iommu.h |  6 +-
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index b66d11b..1ddb2b7 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -54,7 +54,7 @@
 #define REG_MMU_CTRL_REG   0x110
 #define F_MMU_PREFETCH_RT_REPLACE_MOD  BIT(4)
 #define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
-   ((data)->m4u_plat == M4U_MT2712 ? 4 : 5)
+   ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
 /* It's named by F_MMU_TF_PROT_SEL in mt2712. */
 #define F_MMU_TF_PROTECT_SEL(prot, data) \
(((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
@@ -520,7 +520,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
}
 
regval = F_MMU_TF_PROTECT_SEL(2, data);
-   if (data->m4u_plat == M4U_MT8173)
+   if (data->plat_data->m4u_plat == M4U_MT8173)
regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
 
@@ -541,14 +541,14 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
F_INT_PRETETCH_TRANSATION_FIFO_FAULT;
writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
 
-   if (data->m4u_plat == M4U_MT8173)
+   if (data->plat_data->m4u_plat == M4U_MT8173)
regval = (data->protect_base >> 1) | (data->enable_4GB << 31);
else
regval = lower_32_bits(data->protect_base) |
 upper_32_bits(data->protect_base);
writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
 
-   if (data->enable_4GB && data->m4u_plat != M4U_MT8173) {
+   if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
/*
 * If 4GB mode is enabled, the validate PA range is from
 * 0x1_0000_0000 to 0x1_ffff_ffff. here record bit[32:30].
@@ -559,7 +559,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
 
/* It's MISC control register whose default value is ok except mt8173.*/
-   if (data->m4u_plat == M4U_MT8173)
+   if (data->plat_data->m4u_plat == M4U_MT8173)
writel_relaxed(0, data->base + REG_MMU_STANDARD_AXI_MODE);
 
if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
@@ -592,7 +592,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
if (!data)
return -ENOMEM;
data->dev = dev;
-   data->m4u_plat = (enum mtk_iommu_plat)of_device_get_match_data(dev);
+   data->plat_data = of_device_get_match_data(dev);
 
/* Protect memory. HW will access here while translation fault.*/
protect = devm_kzalloc(dev, MTK_PROTECT_PA_ALIGN * 2, GFP_KERNEL);
@@ -740,9 +740,17 @@ static int __maybe_unused mtk_iommu_resume(struct device 
*dev)
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(mtk_iommu_suspend, mtk_iommu_resume)
 };
 
+static const struct mtk_iommu_plat_data mt2712_data = {
+   .m4u_plat = M4U_MT2712,
+};
+
+static const struct mtk_iommu_plat_data mt8173_data = {
+   .m4u_plat = M4U_MT8173,
+};
+
 static const struct of_device_id mtk_iommu_of_ids[] = {
-   { .compatible = "mediatek,mt2712-m4u", .data = (void *)M4U_MT2712},
-   { .compatible = "mediatek,mt8173-m4u", .data = (void *)M4U_MT8173},
+   { .compatible = "mediatek,mt2712-m4u", .data = &mt2712_data},
+   { .compatible = "mediatek,mt8173-m4u", .data = &mt8173_data},
{}
 };
 
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 62c2c3e..483d210 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -40,6 +40,10 @@ enum mtk_iommu_plat {
M4U_MT8173,
 };
 
+struct mtk_iommu_plat_data {
+   enum mtk_iommu_plat m4u_plat;
+};
+
 struct mtk_iommu_domain;
 
 struct mtk_iommu_data {
@@ -56,7 +60,7 @@ struct mtk_iommu_data {
booltlb_flush_active;
 
struct iommu_device iommu;
-   enum mtk_iommu_plat m4u_plat;
+   const struct mtk_iommu_plat_data *plat_data;
 
struct list_headlist;
 };
-- 
1.9.1



[PATCH v7 01/21] dt-bindings: mediatek: Add binding for mt8183 IOMMU and SMI

2019-06-10 Thread Yong Wu
This patch adds descriptions for mt8183 IOMMU and SMI.

mt8183 has only one M4U like mt8173 and is also MTK IOMMU gen2 which
uses ARM Short-Descriptor translation table format.

The mt8183 M4U-SMI HW diagram is as below:

                          EMI
                           |
                          M4U
                           |
                       ----------
                       |        |
                   gals0-rx   gals1-rx
                       |        |
                       |        |
                   gals0-tx   gals1-tx
                       |        |
                      ------------
                       SMI Common
                      ------------
                           |
  +-----+-----+--------+-----+-----+-------+-------+
  |     |     |        |     |     |       |       |
  |     |  gals-rx  gals-rx  |   gals-rx gals-rx gals-rx
  |     |     |        |     |     |       |       |
  |     |     |        |     |     |       |       |
  |     |  gals-tx  gals-tx  |   gals-tx gals-tx gals-tx
  |     |     |        |     |     |       |       |
larb0 larb1  IPU0    IPU1  larb4  larb5  larb6    CCU
disp  vdec   img     cam    venc   img    cam

All the connections are fixed in HW; SW can NOT adjust them.

Compared with mt8173, we add a GALS (Global Async Local Sync) module
between SMI-common and M4U, and additional GALS between larb2/3/5/6
and SMI-common. GALS helps synchronize modules running at different
clock frequencies; it can be seen as an "asynchronous FIFO".

GALS only helps transfer commands/data and has no configuration
registers, so it has the special "smi" clock but no "apb" clock.
From the diagram above, we add "gals0" and "gals1" clocks for
smi-common and a "gals" clock for smi-larb.

From the diagram above, IPU0/IPU1 (Image Processor Unit) and CCU (Camera
Control Unit) are connected with smi-common directly. We can treat them
as "larb2", "larb3" and "larb7", and their register spaces differ from
those of the normal larbs.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
 .../devicetree/bindings/iommu/mediatek,iommu.txt   |  30 -
 .../memory-controllers/mediatek,smi-common.txt |  12 +-
 .../memory-controllers/mediatek,smi-larb.txt   |   4 +
 include/dt-bindings/memory/mt8183-larb-port.h  | 130 +
 4 files changed, 170 insertions(+), 6 deletions(-)
 create mode 100644 include/dt-bindings/memory/mt8183-larb-port.h

diff --git a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt 
b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
index 6922db5..ce59a50 100644
--- a/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
+++ b/Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
@@ -11,10 +11,23 @@ ARM Short-Descriptor translation table format for address 
translation.
|
   m4u (Multimedia Memory Management Unit)
|
+  ++
+  ||
+  gals0-rx   gals1-rx(Global Async Local Sync rx)
+  ||
+  ||
+  gals0-tx   gals1-tx(Global Async Local Sync tx)
+  ||  Some SoCs may have GALS.
+  ++
+   |
SMI Common(Smart Multimedia Interface Common)
|
++---
||
+   | gals-rxThere may be GALS in some larbs.
+   ||
+   ||
+   | gals-tx
||
SMI larb0SMI larb1   ... SoCs have several SMI local arbiter(larb).
(display) (vdec)
@@ -36,6 +49,10 @@ each local arbiter.
 like display, video decode, and camera. And there are different ports
 in each larb. Take a example, There are many ports like MC, PP, VLD in the
 video decode local arbiter, all these ports are according to the video HW.
+  In some SoCs, there may be a GALS(Global Async Local Sync) module between
+smi-common and m4u, and additional GALS module between smi-larb and
+smi-common. GALS can been seen as a "asynchronous fifo" which could help
+synchronize for the modules in different clock frequency.
 
 Required properties:
 - compatible : must be one of the following string:
@@ -44,18 +61,25 @@ Required properties:
"mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
 generation one m4u HW.
"mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
+   "mediatek,mt8183-m4u" for mt8183 which uses generation two m4u HW.
 - reg : m4u register base and size.
 - interrupts : the interrupt of m4u.
 - clocks : must contain one entry for each clock-names.
-- clock-names : must be "bclk", It is the block clock of m4u.
+- clock-names : Only 1 optional clock:
+  - "bclk": the block clock of m4u.
+  Here is the list which require this "bclk":
+  - mt2701, 

RE: How to resolve an issue in swiotlb environment?

2019-06-10 Thread Yoshihiro Shimoda
Hi Christoph, Alan,
(add linux-usb ML on CC.)

> From: Yoshihiro Shimoda, Sent: Friday, June 7, 2019 9:00 PM
> 
> Hi Christoph,
> 
> I think we should continue to discuss on this email thread instead of the 
> fixed DMA-API.txt patch [1]
> 
> [1]
> https://marc.info/?t=15598941221=1=2
> 
> > From: Yoshihiro Shimoda, Sent: Monday, June 3, 2019 3:42 PM
> >
> > Hi linux-block and iommu mailing lists,
> >
> > I have an issue that a USB SSD with xHCI on R-Car H3 causes "swiotlb is 
> > full" like below.
> >
> > [   36.745286] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 
> > 524288 bytes), total 32768 (slots), used 1338
> (slots)
> >
> > I have investigated this issue by using git bisect, and then I found the 
> > following commit:
> >
> > ---
> > commit 09324d32d2a0843e66652a087da6f77924358e62
> > Author: Christoph Hellwig 
> > Date:   Tue May 21 09:01:41 2019 +0200
> >
> > block: force an unlimited segment size on queues with a virt boundary
> > ---
> 
> Thank you for your comment on other email thread [2] like below:
> ---
> Turns out it isn't as simple as I thought, as there doesn't seem to
> be an easy way to get to the struct device used for DMA mapping
> from USB drivers.  I'll need to think a bit more how to handle that
> best.
> ---
> 
> [2]
> https://marc.info/?l=linux-doc=155989651620473=2

I have another way to avoid the issue, though it doesn't seem like a
good way...
According to the commit that added blk_queue_virt_boundary() [3],
this is needed for vhci_hcd as a workaround, so if we avoid calling it
for the xhci-hcd driver, the issue disappears. What do you think?
JFYI, I pasted a tentative patch at the end of this email [4].

---
[3]
commit 747668dbc061b3e62bc1982767a3a1f9815fcf0e
Author: Alan Stern 
Date:   Mon Apr 15 13:19:25 2019 -0400

usb-storage: Set virt_boundary_mask to avoid SG overflows
---
[4]
diff --git a/drivers/usb/storage/scsiglue.c b/drivers/usb/storage/scsiglue.c
index 59190d8..277c6f7e 100644
--- a/drivers/usb/storage/scsiglue.c
+++ b/drivers/usb/storage/scsiglue.c
@@ -30,6 +30,8 @@
 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -65,6 +67,7 @@ static const char* host_info(struct Scsi_Host *host)
 static int slave_alloc (struct scsi_device *sdev)
 {
struct us_data *us = host_to_us(sdev->host);
+   struct usb_hcd *hcd = bus_to_hcd(us->pusb_dev->bus);
int maxp;
 
/*
@@ -80,8 +83,10 @@ static int slave_alloc (struct scsi_device *sdev)
 * Bulk maxpacket value.  Fortunately this value is always a
 * power of 2.  Inform the block layer about this requirement.
 */
-   maxp = usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0);
-   blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
+   if (!strcmp(hcd->driver->description, "vhci_hcd")) {
+   maxp = usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0);
+   blk_queue_virt_boundary(sdev->request_queue, maxp - 1);
+   }
 
/*
 * Some host controllers may have alignment requirements.
---
Best regards,
Yoshihiro Shimoda



Re: [PATCH v4 01/14] dt-bindings: Add binding for MT2712 MIPI-CSI2

2019-06-10 Thread Tomasz Figa
On Mon, Jun 10, 2019 at 4:51 PM CK Hu  wrote:
>
> Hi, Tomasz:
>
> On Mon, 2019-06-10 at 12:32 +0900, Tomasz Figa wrote:
> > Hi CK, Stu,
> >
> > On Mon, Jun 10, 2019 at 11:34 AM CK Hu  wrote:
> > >
> > > Hi, Stu:
> > >
> > > "mediatek,mt2712-mipicsi" and "mediatek,mt2712-mipicsi-common" have many
> > > common part with "mediatek,mt8183-seninf", and I've a discussion in [1],
> > > so I would like these two to be merged together.
> > >
> > > [1] https://patchwork.kernel.org/patch/10979131/
> > >
> >
> > Thanks CK for spotting this.
> >
> > I also noticed that the driver in fact handles two hardware blocks at
> > the same time - SenInf and CamSV. Unless the architecture is very
> > different from MT8183, I'd suggest splitting it.
> >
> > On a general note, the MT8183 SenInf driver has received several
> > rounds of review comments already, but I couldn't find any comments
> > posted for this one.
> >
> > Given the two aspects above and also based on my quick look at code
> > added by this series, I'd recommend adding MT2712 support on top of
> > the MT8183 series.
>
> In [1], "mediatek,mt8183-seninf" use one device to control multiple csi
> instance, so it duplicate many register definition. In [2], one
> "mediatek,mt2712-mipicsi" device control one csi instance, so there are
> multiple device and the register definition does not duplicate.

I guess we didn't catch that in the review yet. It should be fixed.

> You
> recommend adding MT2712 support on top of the MT8183 series, do you mean
> that "mediatek,mt2712-mipicsi" should use one device to control multiple
> csi instance and duplicate the register setting?

There are some aspects of MT8183 series that are done better than the
MT2712 series, but apparently there are also some better aspects in
MT2712. We should take the best aspects of both series. :)

Best regards,
Tomasz

>
> [1] https://patchwork.kernel.org/patch/10979121/
> [2] https://patchwork.kernel.org/patch/10974573/
>
> Regards,
> CK
>
> >
> > Best regards,
> > Tomasz
>
>


Re: [PATCH v4 01/14] dt-bindings: Add binding for MT2712 MIPI-CSI2

2019-06-10 Thread CK Hu
Hi, Tomasz:

On Mon, 2019-06-10 at 12:32 +0900, Tomasz Figa wrote:
> Hi CK, Stu,
> 
> On Mon, Jun 10, 2019 at 11:34 AM CK Hu  wrote:
> >
> > Hi, Stu:
> >
> > "mediatek,mt2712-mipicsi" and "mediatek,mt2712-mipicsi-common" have many
> > parts in common with "mediatek,mt8183-seninf", and I have a discussion
> > about this in [1], so I would like these two to be merged together.
> >
> > [1] https://patchwork.kernel.org/patch/10979131/
> >
> 
> Thanks CK for spotting this.
> 
> I also noticed that the driver in fact handles two hardware blocks at
> the same time - SenInf and CamSV. Unless the architecture is very
> different from MT8183, I'd suggest splitting it.
> 
> On a general note, the MT8183 SenInf driver has received several
> rounds of review comments already, but I couldn't find any comments
> posted for this one.
> 
> Given the two aspects above and also based on my quick look at code
> added by this series, I'd recommend adding MT2712 support on top of
> the MT8183 series.

In [1], "mediatek,mt8183-seninf" uses one device to control multiple CSI
instances, so it duplicates many register definitions. In [2], each
"mediatek,mt2712-mipicsi" device controls one CSI instance, so there are
multiple devices and the register definitions are not duplicated. You
recommend adding MT2712 support on top of the MT8183 series. Do you mean
that "mediatek,mt2712-mipicsi" should use one device to control multiple
CSI instances and duplicate the register settings?

[1] https://patchwork.kernel.org/patch/10979121/
[2] https://patchwork.kernel.org/patch/10974573/

Regards,
CK

> 
> Best regards,
> Tomasz




RE: How to resolve an issue in swiotlb environment?

2019-06-10 Thread Biju Das
Hi All,

Is there any update on the issue below? I am seeing a similar issue on an
RZ/G2M board with Linux version 5.2.0-rc3.

root@hihope-rz-g2m:~# [   35.414177] usb 2-1: new SuperSpeed Gen 1 USB device number 2 using xhci-hcd
[   35.449402] usb-storage 2-1:1.0: USB Mass Storage device detected
[   35.455915] scsi host0: usb-storage 2-1:1.0
[   36.482585] scsi 0:0:0:0: Direct-Access SanDisk  Extreme  0001 PQ: 0 ANSI: 6
[   36.491260] sd 0:0:0:0: [sda] 125045424 512-byte logical blocks: (64.0 GB/59.6 GiB)
[   36.499823] sd 0:0:0:0: [sda] Write Protect is off
[   36.505474] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   36.518074]  sda: sda1
[   36.523163] sd 0:0:0:0: [sda] Attached SCSI disk

root@hihope-rz-g2m:~# mkdir -p /tmp/rmnt/sda1
root@hihope-rz-g2m:~# mount -t auto /dev/sda1 /tmp/rmnt/sda1
root@hihope-rz-g2m:~# dd if=/dev/urandom of=/tmp/sda1-random bs=1024 count=10240
10240+0 records in
10240+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.187696 s, 55.9 MB/s
root@hihope-rz-g2m:~# cp /tmp/sda1-random /tmp/rmnt/sda1/sda1-random
root@hihope-rz-g2m:~# [  218.861212] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  218.871885] xhci-hcd ee00.usb: overflow 0x00067430b000+1003520 of DMA mask  bus mask 0
[  218.881233] WARNING: CPU: 0 PID: 258 at kernel/dma/direct.c:43 report_addr+0x38/0xa8
[  218.888974] Modules linked in: renesas_usb3 usb_dmac phy_rcar_gen3_usb3
[  218.895594] CPU: 0 PID: 258 Comm: usb-storage Not tainted 5.2.0-rc3-00017-gc80b083-dirty #5
[  218.903940] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
[  218.910291] pstate: 4005 (nZcv daif -PAN -UAO)
[  218.915078] pc : report_addr+0x38/0xa8
[  218.918821] lr : report_addr+0xa0/0xa8
[  218.922564] sp : 125fb970
[  218.925872] x29: 125fb970 x28: 
[  218.931180] x27:  x26: 1f020280
[  218.936487] x25: 8006394a82a8 x24: 
[  218.941794] x23: 0001 x22: 
[  218.947101] x21: 000f5000 x20: 11309000
[  218.952408] x19: 80063a600010 x18: 
[  218.957714] x17:  x16: 
[  218.963023] x15: 113096c8 x14: 4d4420666f203032
[  218.968331] x13: 35333030312b3030 x12: 3062303334373630
[  218.973638] x11: 3030303030307830 x10: 11309f20
[  218.978945] x9 : 112e3018 x8 : 0123
[  218.984252] x7 : 0005 x6 : 80063b578180
[  218.989559] x5 : 80063b578180 x4 : 
[  218.994865] x3 : 80063b57ef10 x2 : eed25f279b69f300
[  219.000172] x1 : eed25f279b69f300 x0 : 
[  219.005481] Call trace:
[  219.007923]  report_addr+0x38/0xa8
[  219.011321]  dma_direct_map_page+0x148/0x158
[  219.015586]  dma_direct_map_sg+0x78/0xe0
[  219.019510]  usb_hcd_map_urb_for_dma+0x2fc/0x468
[  219.024124]  xhci_map_urb_for_dma+0x54/0x68
[  219.028303]  usb_hcd_submit_urb+0x88/0x968
[  219.032394]  usb_submit_urb+0x3b0/0x570
[  219.036226]  usb_sg_wait+0x98/0x158
[  219.039711]  usb_stor_bulk_transfer_sglist.part.3+0x94/0x128
[  219.045366]  usb_stor_bulk_srb+0x48/0x88
[  219.049283]  usb_stor_Bulk_transport+0x10c/0x390
[  219.053896]  usb_stor_invoke_transport+0x3c/0x500
[  219.058595]  usb_stor_transparent_scsi_command+0xc/0x18
[  219.063816]  usb_stor_control_thread+0x1c4/0x260
[  219.068431]  kthread+0x124/0x128
[  219.071660]  ret_from_fork+0x10/0x18
[  219.075229] ---[ end trace dd9ef2a6b7fef860 ]---
[  219.080087] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.090810] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.101510] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.112209] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.122901] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.133591] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.144283] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.154973] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  219.165674] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  223.861717] swiotlb_tbl_map_single: 67451 callbacks suppressed
[  223.861721] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[  223.878249] xhci-hcd ee00.usb: swiotlb buffer is full (sz: 1003520 bytes), total 32768 (slots), used 1088 (slots)
[