Re: [PATCH 1/3] Docs: dt: add generic MSI bindings

2015-07-24 Thread Marc Zyngier
On Thu, 23 Jul 2015 19:26:11 +0100
David Daney  wrote:

> On 07/23/2015 09:52 AM, Mark Rutland wrote:
> [...]
> > +MSI clients
> > +===
> > +
> > +MSI clients are devices which generate MSIs. For each MSI they wish to
> > +generate, the doorbell and payload may be configured, though sideband
> > +information may not be configurable.
> > +
> > +Required properties:
> > +
> > +
> > +- msi-parent: A list of phandle + msi-specifier pairs, one for each MSI
> > +  controller which the device is capable of using.
> > +
> 
> We say here that "msi-parent" consists of pairs ...
> 
> > +  This property is unordered, and MSIs may be allocated from any 
> > combination of
> > +  MSI controllers listed in the msi-parent property.
> > +
> > +  If a device has restrictions on the allocation of MSIs, these 
> > restrictions
> > +  must be described with additional properties.
> > +
> > +  When #msi-cells is non-zero, busses with an msi-parent will require
> > +  additional properties to describe the relationship between devices on 
> > the bus
> > +  and the set of MSIs they can potentially generate.
> > +
> > +
> > +Example
> > +===
> > +
> > +/ {
> > +   #address-cells = <1>;
> > +   #size-cells = <1>;
> > +
> > +   msi_a: msi-controller@a {
> > +   reg = <0xa 0xf00>;
> > +   compatible = "vendor-a,some-controller";
> > +   msi-controller;
> > +   /* No sideband data, so #msi-cells omitted */
> > +   };
> > +
> > +   msi_b: msi-controller@b {
> > +   reg = <0xb 0xf00>;
> > +   compatible = "vendor-b,another-controller";
> > +   msi-controller;
> > +   /* Each device has some unique ID */
> > +   #msi-cells = <1>;
> > +   };
> > +
> > +   msi_c: msi-controller@c {
> > +   reg = <0xb 0xf00>;
> > +   compatible = "vendor-b,another-controller";
> > +   msi-controller;
> > +   /* Each device has some unique ID */
> > +   #msi-cells = <1>;
> > +   };
> > +
> > +   dev@0 {
> > +   reg = <0x0 0xf00>;
> > +   compatible = "vendor-c,some-device";
> > +
> > +   /* Can only generate MSIs to msi_a */
> > +   msi-parent = <&msi_a>;
> 
> 
> My device-tree syntax foo is a little rusty, but doesn't "msi-parent" 
> need a pair of elements?  This has only the phandle.

This is a pair in the sense of (phandle, msi-specifier).

msi_a doesn't have a #msi-cells, so its msi-specifier is of size 0. In
that case, the pair is (phandle, empty-set).

You could also have something like:

msi_d: msi-controller@d {
#msi-cells = <2>;
};

dev@f {
msi-parent = <&msi_d 0x1 0x2>;
};

and msi-parent in this case would still be a pair, with an
msi-specifier of size 2.

Hope this helps,

M.
-- 
Without deviation from the norm, progress is not possible.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/3] Docs: dt: add PCI IOMMU map bindings

2015-07-24 Thread Robin Murphy

Hi Mark,

This looks sane, and lets me describe the thing I have on my desk, so 
I'm happy. I have a couple of general thoughts below, but I don't intend 
that they should stand in the way of this proposal as-is.


On 23/07/15 17:52, Mark Rutland wrote:

The existing IOMMU bindings are able to specify the relationship between
masters and IOMMUs, but they are insufficient for describing the general
case of hotpluggable busses such as PCI where the set of masters is not
known until runtime, and the relationship between masters and IOMMUs is
a property of the integration of the system.

This patch adds a generic binding for mapping PCI devices to IOMMUs,
using a new iommu-map property (specific to PCI*) which may be used to
map devices (identified by their Requester ID) to sideband data for the
IOMMU which they master through.

Signed-off-by: Mark Rutland 
---
  .../devicetree/bindings/pci/pci-iommu.txt  | 171 +
  1 file changed, 171 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt

diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt 
b/Documentation/devicetree/bindings/pci/pci-iommu.txt
new file mode 100644
index 000..56c8296
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
@@ -0,0 +1,171 @@
+This document describes the generic device tree binding for describing the
+relationship between PCI(e) devices and IOMMU(s).
+
+Each PCI(e) device under a root complex is uniquely identified by its Requester
+ID (AKA RID). A Requester ID is a triplet of a Bus number, Device number, and
+Function number.
+
+For the purpose of this document, when treated as a numeric value, a RID is
+formatted such that:
+
+* Bits [15:8] are the Bus number.
+* Bits [7:3] are the Device number.
+* Bits [2:0] are the Function number.
+* Any other bits required for padding must be zero.
+
+IOMMUs may distinguish PCI devices through sideband data derived from the
+Requester ID. While a given PCI device can only master through one IOMMU, a
+root complex may split masters across a set of IOMMUs (e.g. with one IOMMU per
+bus).
+
+The generic 'iommus' property is insufficient to describe this relationship,
+and a mechanism is required to map from a PCI device to its IOMMU and sideband
+data.
+
+For generic IOMMU bindings, see
+Documentation/devicetree/bindings/iommu/iommu.txt.
+
+
+PCI root complex
+
+
+Optional properties
+---
+
+- iommu-map: Maps a Requester ID to an IOMMU and associated iommu-specifier
+  data.
+
+  The property is an arbitrary number of tuples of
+  (rid-base,iommu,iommu-base,length).
+
+  Any RID r in the interval [rid-base, rid-base + length) is associated with
+  the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).


Can we take as a guarantee that the system cannot present any ID at a 
given IOMMU that is not represented in an appropriate output range (in 
the sense that we may do things that could blow up horribly if spurious 
IDs appeared)?


Furthermore, would representing one-to-many mappings by having multiple 
matches for a given RID be legal? In the general case it's certainly 
feasible for the IOMMU to see different IDs for e.g. reads vs. writes, 
where the system munges extra bus lines into the sideband signals - 
whether anyone would actually integrate a PCI host controller that way 
is another matter, so I don't think it's something worth really worrying 
about without a definite need.



+
+- iommu-map-mask: A mask to be applied to each Requester ID prior to being
+  mapped to an iommu-specifier per the iommu-map property.


Am I right to assume a mask of 0 would be a valid way to represent 
"everything" (and if so, should rid-base and length just be ignored, or 
mandated to be 0 and 1 respectively)? It looks a bit off at first 
glance, but it does neatly address a genuine use-case.



+
+
+Example (1)
+===
+
+/ {
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   iommu: iommu@a {
+   reg = <0xa 0x1>;
+   compatible = "vendor,some-iommu";
+   #iommu-cells = <1>;


Troll question; what do we do when #iommu-cells > 1, where the IOMMU is 
expecting some extra data associated with each ID (say, memory attributes)?


[ I'm pretty sure the answer here should be "define some additional 
binding if and when anyone actually cares" ;) ]



Robin.


+   };
+
+   pci: pci@f {
+   reg = <0xf 0x1>;
+   compatible = "vendor,pcie-root-complex";
+   device_type = "pci";
+
+   /*
+* The sideband data provided to the IOMMU is the RID,
+* identity-mapped.
+*/
+   iommu-map = <0x0 &iommu 0x0 0x1>;
+   };
+};


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 3/4] iommu/arm-smmu: Add support for specifying clocks

2015-07-24 Thread Sricharan
Hi Will,

> -Original Message-
> From: linux-arm-msm-ow...@vger.kernel.org [mailto:linux-arm-msm-
> ow...@vger.kernel.org] On Behalf Of Will Deacon
> Sent: Tuesday, July 21, 2015 8:31 PM
> To: Sricharan R
> Cc: linux-arm-ker...@lists.infradead.org;
iommu@lists.linux-foundation.org;
> devicet...@vger.kernel.org; linux-arm-...@vger.kernel.org;
> mitch...@codeaurora.org
> Subject: Re: [RFC PATCH 3/4] iommu/arm-smmu: Add support for specifying
> clocks
> 
> On Fri, Jul 17, 2015 at 05:53:24PM +0100, Sricharan R wrote:
> > From: Mitchel Humpherys 
> >
> > On some platforms with tight power constraints it is polite to only
> > leave your clocks on for as long as you absolutely need them.
> > Currently we assume that all clocks necessary for SMMU register access
> > are always on.
> 
> You've borrowed this commit message from Mitch's previous version of this
> patch, but now you leave the clocks enabled most of the time so it doesn't
> make much sense anymore.
> 
 Sorry, I should have changed that to make it clear.

> Anyway, I'm OK with this kind of clock management in the driver, but I
think
> that anything more fine-grained needs to be designed into the IOMMU core.
> 
  Ok. Tried this to get the right direction.
 I will  check for the power savings and if there are not much then I would
use
 the above approach.

Regards,
  Sricharan

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 4/4] iommu/arm-smmu: Add support for specifying regulators

2015-07-24 Thread Sricharan
Hi Stephen,

> -Original Message-
> From: linux-arm-kernel [mailto:linux-arm-kernel-
> boun...@lists.infradead.org] On Behalf Of Stephen Boyd
> Sent: Tuesday, July 21, 2015 11:48 PM
> To: Sricharan R
> Cc: devicet...@vger.kernel.org; linux-arm-...@vger.kernel.org;
> will.dea...@arm.com; iommu@lists.linux-foundation.org;
> mitch...@codeaurora.org; linux-arm-ker...@lists.infradead.org
> Subject: Re: [RFC PATCH 4/4] iommu/arm-smmu: Add support for specifying
> regulators
> 
> On 07/17/2015 09:53 AM, Sricharan R wrote:
> > From: Mitchel Humpherys 
> >
> > This adds the support to turn on the regulators required for SMMUs. It
> > is turned on during the SMMU probe and remains 'on' till the device
> > exists.
> 
> The device always exists. Until the driver is removed perhaps?
> 
  Till the device exists.

> >
> > Signed-off-by: Sricharan R 
> 
> We won't be using regulators. We'll be using power domains instead so
> please rewrite this patch accordingly.
> 
 Ok. Thanks for the direction.

Regards,
 Sricharan

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/3] Docs: dt: add PCI IOMMU map bindings

2015-07-24 Thread Mark Rutland
Hi,

> This looks sane, and lets me describe the thing I have on my desk, so 
> I'm happy. I have a couple of general thoughts below, but I don't intend 
> that they should stand in the way of this proposal as-is.

Good to hear that this doesn't fall apart at the sight of a real system!

> On 23/07/15 17:52, Mark Rutland wrote:
> > The existing IOMMU bindings are able to specify the relationship between
> > masters and IOMMUs, but they are insufficient for describing the general
> > case of hotpluggable busses such as PCI where the set of masters is not
> > known until runtime, and the relationship between masters and IOMMUs is
> > a property of the integration of the system.
> >
> > This patch adds a generic binding for mapping PCI devices to IOMMUs,
> > using a new iommu-map property (specific to PCI*) which may be used to
> > map devices (identified by their Requester ID) to sideband data for the
> > IOMMU which they master through.
> >
> > Signed-off-by: Mark Rutland 
> > ---
> >   .../devicetree/bindings/pci/pci-iommu.txt  | 171 
> > +
> >   1 file changed, 171 insertions(+)
> >   create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt
> >
> > diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt 
> > b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> > new file mode 100644
> > index 000..56c8296
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> > @@ -0,0 +1,171 @@
> > +This document describes the generic device tree binding for describing the
> > +relationship between PCI(e) devices and IOMMU(s).
> > +
> > +Each PCI(e) device under a root complex is uniquely identified by its 
> > Requester
> > +ID (AKA RID). A Requester ID is a triplet of a Bus number, Device number, 
> > and
> > +Function number.
> > +
> > +For the purpose of this document, when treated as a numeric value, a RID is
> > +formatted such that:
> > +
> > +* Bits [15:8] are the Bus number.
> > +* Bits [7:3] are the Device number.
> > +* Bits [2:0] are the Function number.
> > +* Any other bits required for padding must be zero.
> > +
> > +IOMMUs may distinguish PCI devices through sideband data derived from the
> > +Requester ID. While a given PCI device can only master through one IOMMU, a
> > +root complex may split masters across a set of IOMMUs (e.g. with one IOMMU 
> > per
> > +bus).
> > +
> > +The generic 'iommus' property is insufficient to describe this 
> > relationship,
> > +and a mechanism is required to map from a PCI device to its IOMMU and 
> > sideband
> > +data.
> > +
> > +For generic IOMMU bindings, see
> > +Documentation/devicetree/bindings/iommu/iommu.txt.
> > +
> > +
> > +PCI root complex
> > +
> > +
> > +Optional properties
> > +---
> > +
> > +- iommu-map: Maps a Requester ID to an IOMMU and associated iommu-specifier
> > +  data.
> > +
> > +  The property is an arbitrary number of tuples of
> > +  (rid-base,iommu,iommu-base,length).
> > +
> > +  Any RID r in the interval [rid-base, rid-base + length) is associated 
> > with
> > +  the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).
> 
> Can we take as a guarantee that the system cannot present any ID at a 
> given IOMMU that is not represented in an appropriate output range (in 
> the sense that we may do things that could blow up horribly if spurious 
> IDs appeared)?

I would expect that for the root complex, the iommu-map should cover all
possible RIDs (and hence we would know their output IDs). In the case
that a possible RID was not translated by the property, I would hope
that we could either detect this at parse time, or prevent probing of
the device when it appeared.

The root complex could share IOMMUs with other masters, so in general
iommu-map alone would not be sufficient to detect all possible IDs.

> Furthermore, would representing one-to-many mappings by having multiple 
> matches for a given RID be legal? In the general case it's certainly 
> feasible for the IOMMU to see different IDs for e.g. reads vs. writes, 
> where the system munges extra bus lines into the sideband signals - 
> whether anyone would actually integrate a PCI host controller that way 
> is another matter, so I don't think it's something worth really worrying 
> about without a definite need.

I'd expect no-one would implement separate read and write IDs, given
that the IOMMU should distinguish reads and writes anyway.

Generally I don't think multiple matches for the same RID make sense.

> > +- iommu-map-mask: A mask to be applied to each Requester ID prior to being
> > +  mapped to an iommu-specifier per the iommu-map property.
> 
> Am I right to assume a mask of 0 would be a valid way to represent 
> "everything" (and if so, should rid-base and length just be ignored, or 
> mandated to be 0 and 1 respectively)? It looks a bit off at first 
> glance, but it does neatly address a genuine use-case.

I think that should be valid, yes.

Re: Since Linux 4.1: A lot of AMD-Vi IO_PAGE_FAULTs

2015-07-24 Thread Bjorn Helgaas
[+cc Tejun, linux-ide]

On Thu, Jul 23, 2015 at 11:22 PM, Andreas Hartmann
 wrote:
> On Tue, Jul 21, 2015 at 06:35PM +0200, Joerg Roedel wrote:
>> On Tue, Jul 21, 2015 at 06:20:23PM +0200, Andreas Hartmann wrote:
>>> [   48.193901] <6>[fglrx] Firegl kernel thread PID: 1840
>>> [   48.193985] <6>[fglrx] Firegl kernel thread PID: 1841
>>> [   48.194063] <6>[fglrx] Firegl kernel thread PID: 1842
>>> [   48.194172] <6>[fglrx] IRQ 28 Enabled
>>> [   48.261580] <6>[fglrx] Reserved FB block: Shared offset:0, size:100
>>> [   48.261586] <6>[fglrx] Reserved FB block: Unshared offset:f7b4000, 
>>> size:4000
>>> [   48.261587] <6>[fglrx] Reserved FB block: Unshared offset:f7b8000, 
>>> size:548000
>>> [   48.261588] <6>[fglrx] Reserved FB block: Unshared offset:3fff3000, 
>>> size:d000
>>
>> From a first glance it doesn't look like an IOMMU driver issue, because
>> the addresses where the faults happen are not from the AMD IOMMU driver.
>>
>> And you have proprietary closed-source drivers loaded, can you reproduce
>> the issue without fglrx?
>
> Yes. I attached this one.
>
> Meanwhile I tested with 4.0.9, too. I wasn't able to reproduce the
> problem with this kernel even after lots of reboots (the problem w/ 4.1
> usually comes up during boot process (but not only - it can be seen
> after boot process, too)).
>
> The problem always is, that there are errors w/ one of the sata discs
> and at the same time, IO_PAGE_FAULT errors are rising as described before:
>
> [  152.533708] ata3.00: failed command: READ FPDMA QUEUED
> [  152.538102] ata3.00: failed command: READ FPDMA QUEUED
> [  152.539862] ata3.00: failed command: READ FPDMA QUEUED
> [  152.541778] ata3.00: failed command: WRITE FPDMA QUEUED
> [  152.543861] ata3.00: failed command: WRITE FPDMA QUEUED
>
> [ 5818.068050] ata2.00: failed command: WRITE FPDMA QUEUED
> [ 5818.068059] ata2.00: failed command: WRITE FPDMA QUEUED
>
> I compared dmesg from 4.1 w/ 4.0 and I realized the following *missing*
> entries in 4.1:
>
> [0.00] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled)
> [0.00] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] enabled)
>
>
> What does this mean? Is there missing some part of the acpi initialization?
>
>
> Thanks for any hint as Linux 4.1 is completely unusable here with these
> errors.

This looks more like an AHCI problem than an IOMMU or PCI problem.
Seems like the device has the wrong idea about where its DMA buffers
are.  Maybe something scribbled on its command list?

>From your attachments:

# lspci -vvs 00:11.0
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40) (prog-if 01 [AHCI
1.0])

pci :00:11.0: [1002:4391] type 00 class 0x010601
ahci :00:11.0: version 3.0
ahci :00:11.0: AHCI 0001.0200 32 slots 6 ports 6 Gbps 0x3f impl SATA mode
ahci :00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x40eba32100618000 flags=0x0010]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x40eba32100618040 flags=0x0010]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x flags=0x]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x00c0 flags=0x]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x0040 flags=0x]
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:11.0 domain=0x0008
address=0x01c0 flags=0x]
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/2] iommu/omap: Add support for configuring dsp iommus on DRA7xx

2015-07-24 Thread Suman Anna
Hi Tony,

On 07/23/2015 11:30 PM, Tony Lindgren wrote:
> * Suman Anna  [150723 09:25]:
>> Hi Tony,
>>
>> On 07/23/2015 02:24 AM, Tony Lindgren wrote:
>>> * Suman Anna  [150722 09:25]:
 On 07/22/2015 12:26 AM, Tony Lindgren wrote:
>
> I don't like using syscon for tinkering directly with SoC registers.

 This is not a SoC-level register, but a register within a sub-module of
 the DSP processor sub-system. The DSP_SYSTEM sub-module in general is
 described in Section 5.3.3 of the TRM [1], and it implements different
 functionalities like the PRCM handshaking, wakeup logic and DSP
 subsystem top-level configuration. It is a module present within the DSP
 processor sub-system, so can only be accessed when the sub-system is
 clocked and the appropriate reset is released.
>>>
>>> OK so if it's specific to the DSP driver along the lines of sysc and
>>> syss registers.
>>
>> There will be those registers too within the MMU config register space,
>> even for DRA7xx MMUs. This is different, think of it like a register in
>> the Control module except that it is present within the DSP sub-system
>> instead of at the SoC level.
> 
> And what is taking care of pm_runtime_get here to ensure the module
> is powered and clocked?

pm_runtime_get_sync is indeed getting invoked, its just the current
patch doesn't include the code block that invokes it. The function that
invokes omap2_iommu_enable does so after a call to the
pm_runtime_get_sync() call.

> 
> I think you are missing a layer here and it's the Linux kernel side
> device driver for DSP that initializes things.

We already have separate drivers for MMUs (omap-iommu) and the processor
management (omap-rproc). The former manages all the low-level
programming sequences for the MMUs, while the latter manages the
low-level reset etc, and is a client user of the respective IOMMU
device. Both integrate into the respective frameworks. The IOMMU API
invocations are handled in the remoteproc core, with the OMAP remoteproc
driver publishing itself as having an MMU with the remoteproc core. The
IOMMU API invoke the appropriate iommu_ops.

You can lookup the functions rproc_enable_iommu()/rproc_disable_iommu()
in the remoteproc core. The IOMMU enabling sequences happen within the
iommu_attach_device() API. The call flow is
iommu_attach_device()->omap_iommu_attach_dev()->omap_iommu_attach()->
iommu_enable()->
   omap_device_deassert_hardreset, pm_runtime_get_sync, omap2_iommu_enable.

>  
>>> Typically we handle these registers by mapping them to the PM runtime
>>> functions for the interconnect so we can reset and idle the hardware
>>> modules even if there is no driver, see for example
>>> omap54xx_mmu_dsp_hwmod.
>>
>> I haven't yet submitted the DRA7xx hwmods, but they will look almost
>> exactly like the OMAP5 ones. The reset and idle on these are in general
>> not effective at boot time since these are also controlled by a
>> hard-reset line, so that's left to the drivers to deal with it through
>> the omap_device_deassert_hardreset API.
> 
> If the MMU configuration is one time init, it may make sense to add
> it to the hwmod reset function. However, if the Linux kernel side
> driver needs to configure things depending on the DSP firmware, it
> should be done in the kernel side device driver.

The MMU configuration comes into picture whenever the remoteproc driver
is being booted and shut down, and also during suspend (no support for
this yet in mainline on OMAP rproc). Today, the hwmod
_enable/_idle/reset functions alone are not enough to power on the OMAP
remoteproc/iommus. We need sequencing calls to both
omap_device_assert/_deassert_hardreset (done through pdata-quirks) and
pm_runtime API to achieve this.


>  
> We should use some Linux generic framework for configuring these
> bits to avoid nasty dependencies between various hardware modules
> on the SoC.
>
> What does DSP_SYS_MMU_CONFIG register do? It seems it's probably
> a regulator or a gate clock? If so, it should be set up as a
> regulator or a clock and then the omap-iommu driver can just
> use regulator or clcok framework to request the resource.

 No, its neither. It is a control bit that dictates whether the
 processor/EDMA addresses go through the respective MMU or not. The
 register currently has 4 bits (bit 0 in each nibble), one each for
 enabling each MMU and requesting an MMU abort on each. The MMU
 integration and enablement notes are detailed in Section 5.3.6 of the
 TRM [1], and the DSP_SYS_MMU_CONFIG register layout is in Table 5-28
 (Page 1641).
>>>
>>> OK yeah seems like it should be handled by the DSP driver during
>>> probe after doing pm_runtime_get. Then the driver can configure
>>> things like IOMMU and load firmware. Or am I missing something here?
>>
>> The DSP (remoteproc) driver uses the generic IOMMU API, and all the
>> IOMMU enabling sequence is transparent to the DSP driver. S

Re: [PATCH v3 3/6] iommu: add ARM short descriptor page table allocator.

2015-07-24 Thread Will Deacon
On Fri, Jul 24, 2015 at 06:24:26AM +0100, Yong Wu wrote:
> On Tue, 2015-07-21 at 18:11 +0100, Will Deacon wrote:
> > On Thu, Jul 16, 2015 at 10:04:32AM +0100, Yong Wu wrote:
> > > +/* level 2 pagetable */
> > > +#define ARM_SHORT_PTE_TYPE_LARGE   BIT(0)
> > > +#define ARM_SHORT_PTE_SMALL_XN BIT(0)
> > > +#define ARM_SHORT_PTE_TYPE_SMALL   BIT(1)
> > > +#define ARM_SHORT_PTE_BBIT(2)
> > > +#define ARM_SHORT_PTE_CBIT(3)
> > > +#define ARM_SHORT_PTE_SMALL_TEX0   BIT(6)
> > > +#define ARM_SHORT_PTE_IMPLEBIT(9)
> >
> > This is AP[2] for small pages.
> 
> Sorry, In our pagetable bit9 in PGD and PTE is PA[32] that is for  the
> dram size over 4G. I didn't care it is different in PTE of the standard
> spec.
> And I don't use the AP[2] currently, so I only delete this line in next
> time.

Is this related to the "special bit". What would be good is a comment
next to the #define for the quirk describing *exactly* that differs in
your implementation. Without that, it's very difficult to know what is
intentional and what is actually broken.

> > > +static arm_short_iopte
> > > +__arm_short_pte_prot(struct arm_short_io_pgtable *data, int prot, bool 
> > > large)
> > > +{
> > > +   arm_short_iopte pteprot;
> > > +
> > > +   pteprot = ARM_SHORT_PTE_S | ARM_SHORT_PTE_nG;
> > > +   pteprot |= large ? ARM_SHORT_PTE_TYPE_LARGE :
> > > +   ARM_SHORT_PTE_TYPE_SMALL;
> > > +   if (prot & IOMMU_CACHE)
> > > +   pteprot |=  ARM_SHORT_PTE_B | ARM_SHORT_PTE_C;
> > > +   if (prot & IOMMU_WRITE)
> > > +   pteprot |= large ? ARM_SHORT_PTE_LARGE_TEX0 :
> > > +   ARM_SHORT_PTE_SMALL_TEX0;
> >
> > This doesn't make any sense. TEX[2:0] is all about memory attributes, not
> > permissions, so you're making the mapping write-back, write-allocate but
> > that's not what the IOMMU_* values are about.
> 
>  I will delete it.

Well, can you not control mapping permissions with the AP bits? The idea
of the IOMMU flags are:

  IOMMU_CACHE : Install a normal, cacheable mapping (you've got this right)
  IOMMU_READ : Allow read access for the device
  IOMMU_WRITE : Allow write access for the device
  IOMMU_NOEXEC : Disallow execute access for the device

so the caller to iommu_map passes in a bitmap of these, which you need to
encode in the page-table entry.

> > > +static int
> > > +_arm_short_map(struct arm_short_io_pgtable *data,
> > > +  unsigned int iova, phys_addr_t paddr,
> > > +  arm_short_iopte pgdprot, arm_short_iopte pteprot,
> > > +  bool large)
> > > +{
> > > +   const struct iommu_gather_ops *tlb = data->iop.cfg.tlb;
> > > +   arm_short_iopte *pgd = data->pgd, *pte;
> > > +   void *cookie = data->iop.cookie, *pte_va;
> > > +   unsigned int ptenr = large ? 16 : 1;
> > > +   int i, quirk = data->iop.cfg.quirks;
> > > +   bool ptenew = false;
> > > +
> > > +   pgd += ARM_SHORT_PGD_IDX(iova);
> > > +
> > > +   if (!pteprot) { /* section or supersection */
> > > +   if (quirk & IO_PGTABLE_QUIRK_SHORT_MTK)
> > > +   pgdprot &= ~ARM_SHORT_PGD_SECTION_XN;
> > > +   pte = pgd;
> > > +   pteprot = pgdprot;
> > > +   } else {/* page or largepage */
> > > +   if (quirk & IO_PGTABLE_QUIRK_SHORT_MTK) {
> > > +   if (large) { /* special Bit */
> >
> > This definitely needs a better comment! What exactly are you doing here
> > and what is that quirk all about?
> 
> I use this quirk is for MTK Special Bit as we don't have the XN bit in
> pagetable.

I'm still not really clear about what this is.

> > > +   if (!(*pgd)) {
> > > +   pte_va = kmem_cache_zalloc(data->ptekmem, 
> > > GFP_ATOMIC);
> > > +   if (unlikely(!pte_va))
> > > +   return -ENOMEM;
> > > +   ptenew = true;
> > > +   *pgd = virt_to_phys(pte_va) | pgdprot;
> > > +   kmemleak_ignore(pte_va);
> > > +   tlb->flush_pgtable(pgd, sizeof(*pgd), cookie);
> >
> > I think you need to flush this before it becomes visible to the walker.
> 
> I have flushed pgtable here, Do you meaning flush tlb here?

No. afaict, you allocate the pte table using kmem_cache_zalloc but you never
flush it. However, you update the pgd to point at this table, so the walker
can potentially see garbage instead of the zeroed entries.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 5/6] iommu/mediatek: Add mt8173 IOMMU driver

2015-07-24 Thread Will Deacon
On Fri, Jul 24, 2015 at 06:43:13AM +0100, Yong Wu wrote:
> On Tue, 2015-07-21 at 15:59 +0100, Will Deacon wrote:
> > On Thu, Jul 16, 2015 at 10:04:34AM +0100, Yong Wu wrote:
> > > +static void mtk_iommu_tlb_flush_all(void *cookie)
> > > +{
> > > +   struct mtk_iommu_domain *domain = cookie;
> > > +   void __iomem *base;
> > > +
> > > +   base = domain->data->base;
> > > +   writel(F_INVLD_EN1 | F_INVLD_EN0, base + REG_MMU_INV_SEL);
> > > +   writel(F_ALL_INVLD, base + REG_MMU_INVALIDATE);
> > 
> > This needs to be synchronous, so you probably want to call
> > mtk_iommu_tlb_sync at the end.
> 
> From our spec, we have to wait until HW done after tlb flush range.
> But it don't need wait after tlb flush all.
> so It isn't necessary to add mtk_iommu_tlb_sync in tlb_flush_all here.

Okey doke, but I'm surprised you don't need a subsequent DSB or read-back.
What if the writel is buffered on the way to the IOMMU?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] of: base: Allow more args than MAX_PHANDLE_ARGS if required

2015-07-24 Thread Rob Herring
On Thu, Jul 16, 2015 at 6:09 AM, Joerg Roedel  wrote:
> Hi Will,
>
> On Thu, Jul 16, 2015 at 11:23:26AM +0100, Will Deacon wrote:
>> On Thu, Jul 16, 2015 at 09:30:43AM +0100, Joerg Roedel wrote:
>> > +struct of_phandle_args *of_alloc_phandle_args(int size)
>> > +{
>> > +   struct of_phandle_args *args;
>> > +   int e = max(0, size - MAX_PHANDLE_ARGS);
>> > +
>> > +   args =  kzalloc(sizeof(struct of_phandle_args) + e * sizeof(uint32_t),
>> > +   GFP_KERNEL);
>>
>> Should you also update args->args_count to reflect the extended array?
>
> The args_count member just tells us how many of the array elements are
> used and not how many there are. So it doesn't need to be updated here.
>
>> That said, extending the fixed-size array member like this feels a bit
>> fragile. Does GCC not complain about out-of-bounds accesses if you
>> statically address args->args[MAX_PHANDLE_ARGS]? Admittedly, I can't
>> think *why* this would be break (things like additional padding will be
>> harmless), but I'm not intimate with the C standard.
>
> Yeah, I agree, it is not the best possible solution. But this way I
> don't need to update all callers, and thus it works better with our
> development model.

Our development model is not to work-around kernel APIs last time I checked.

> But I am open for suggestions on how to solve this problem better. In
> fact, my main motivation in sending this was to get the discussion about
> an upstreamable solution started :)
>
> Lets see what the device-tree maintainers have to say.

A good number of callers and all iommu callers loop thru the list of 
phandles which are just open coded ATM. So we should do loop iterators 
here. With iterators, we can do the allocation within the iterators. 
This can be much more efficient as we don't iterate thru the list from 
the start every time. Normally, the list is not big enough to matter, 
but in your case it may be.

I'm thinking something like this untested and not yet compiling patch. It 
still has the abuse of adding onto the end of the of_phandle_args 
struct which I don't really like. We could do a new struct, but I'd like 
to keep this code common.

I also need to refactor the existing code to use 
__of_parse_one_phandle_with_args.

Rob

8<
diff --git a/drivers/of/base.c b/drivers/of/base.c
index 8b5a187..c1c5a43 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1442,6 +1442,96 @@ void of_print_phandle_args(const char *msg, const struct 
of_phandle_args *args)
printk("\n");
 }
 
+
+static int __of_parse_one_phandle_with_args(const __be32 **list,
+   const char *cells_name,
+   int cell_count,
+   struct of_phandle_args **out_args)
+{
+   struct device_node *node;
+   phandle phandle;
+   struct of_phandle_args *args = *out_args;
+   const __be32 *start_list = *list;
+   u32 i, count = 0;
+   int ret;
+
+   /*
+* If phandle is 0, then it is an empty entry with no
+* arguments.  Skip forward to the next entry.
+*/
+   phandle = be32_to_cpup(*list++);
+   if (phandle) {
+   /*
+* Find the provider node and parse the #*-cells
+* property to determine the argument length.
+*
+* This is not needed if the cell count is hard-coded
+* (i.e. cells_name not set, but cell_count is set),
+* except when we're going to return the found node
+* below.
+*/
+   node = of_find_node_by_phandle(phandle);
+   if (!node)
+   return -ENOENT;
+
+   if (cells_name) {
+   ret = of_property_read_u32(node, cells_name, &count);
+   if (ret) {
+   pr_err("could not get %s for %s\n",
+  cells_name, node->full_name);
+   return ret;
+   }
+   } else {
+   count = cell_count;
+   }
+   }
+
+   if (args && WARN_ON(count > MAX_PHANDLE_ARGS))
+   count = MAX_PHANDLE_ARGS;
+
+   if (!args) {
+   args = kzalloc(sizeof(*args) + (count * sizeof(uint32_t))), 
GFP_KERNEL);
+   *out_args = args;
+   }
+   if (!args)
+   return -ENOMEM;
+
+   if (!phandle) {
+   memset(args, 0, sizeof(*args));
+   return -ENOENT;
+   }
+
+   args->np = node;
+   args->val = start_list;
+   args->args_count = count;
+   for (i = 0; i < count; i++)
+   args->args[i] = be32_to_cpup(*list++);
+
+   return 0;
+}
+
+of_phandle_args *of_prop_next_phandle_args(struct property *prop,
+ 

Re: [PATCH 2/3] Docs: dt: Add PCI MSI map bindings

2015-07-24 Thread Chalamarla, Tirumalesh
looks good. possible to describe the chip we have.
> On Jul 23, 2015, at 9:52 AM, Mark Rutland  wrote:
> 
> Currently msi-parent is used by a few bindings to describe the
> relationship between a PCI root complex and a single MSI controller, but
> this property does not have a generic binding document.
> 
> Additionally, msi-parent is insufficient to describe more complex
> relationships between MSI controllers and devices under a root complex,
> where devices may be able to target multiple MSI controllers, or where
> MSI controllers use (non-probeable) sideband information to distinguish
> devices.
> 
> This patch adds a generic binding for mapping PCI devices to MSI
> controllers. This document covers msi-parent, and a new msi-map property
> (specific to PCI*) which may be used to map devices (identified by their
> Requester ID) to sideband data for each MSI controller that they may
> target.
> 
> Signed-off-by: Mark Rutland 
> ---
> Documentation/devicetree/bindings/pci/pci-msi.txt | 220 ++
> 1 file changed, 220 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/pci/pci-msi.txt
> 
> diff --git a/Documentation/devicetree/bindings/pci/pci-msi.txt 
> b/Documentation/devicetree/bindings/pci/pci-msi.txt
> new file mode 100644
> index 000..9b3cc81
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/pci/pci-msi.txt
> @@ -0,0 +1,220 @@
> +This document describes the generic device tree binding for describing the
> +relationship between PCI devices and MSI controllers.
> +
> +Each PCI device under a root complex is uniquely identified by its Requester 
> ID
> +(AKA RID). A Requester ID is a triplet of a Bus number, Device number, and
> +Function number.
> +
> +For the purpose of this document, when treated as a numeric value, a RID is
> +formatted such that:
> +
> +* Bits [15:8] are the Bus number.
> +* Bits [7:3] are the Device number.
> +* Bits [2:0] are the Function number.
> +* Any other bits required for padding must be zero.
> +
> +MSIs may be distinguished in part through the use of sideband data 
> accompanying
> +writes. In the case of PCI devices, this sideband data may be derived from 
> the
> +Requester ID. A mechanism is required to associate a device with both the MSI
> +controllers it can address, and the sideband data that will be associated 
> with
> +its writes to those controllers.
> +
> +For generic MSI bindings, see
> +Documentation/devicetree/bindings/interrupt-controller/msi.txt.
> +
> +
> +PCI root complex
> +
> +
> +Optional properties
> +---
> +
> +- msi-map: Maps a Requester ID to an MSI controller and associated
> +  msi-specifier data. The property is an arbitrary number of tuples of
> +  (rid-base,msi-controller,msi-base,length), where:
> +
> +  * rid-base is a single cell describing the first RID matched by the entry.
> +
> +  * msi-controller is a single phandle to an MSI controller
> +
> +  * msi-base is an msi-specifier describing the msi-specifier produced for 
> the
> +first RID matched by the entry.
> +
> +  * length is a single cell describing how many consecutive RIDs are matched
> +following the rid-base.
> +
> +  Any RID r in the interval [rid-base, rid-base + length) is associated with
> +  the listed msi-controller, with the msi-specifier (r - rid-base + 
> msi-base).
> +
> +- msi-map-mask: A mask to be applied to each Requester ID prior to being 
> mapped
> +  to an msi-specifier per the msi-map property.
> +
> +- msi-parent: Describes the MSI parent of the root complex itself. Where
> +  the root complex and MSI controller do not pass sideband data with MSI
> +  writes, this property may be used to describe the MSI controller(s)
> +  used by PCI devices under the root complex, if defined as such in the
> +  binding for the root complex.
> +
> +
> +Example (1)
> +===
> +
> +/ {
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + msi: msi-controller@a {
> + reg = <0xa 0x1>;
> + compatible = "vendor,some-controller";
> + msi-controller;
> + #msi-cells = <1>;
> + };
> +
> + pci: pci@f {
> + reg = <0xf 0x1>;
> + compatible = "vendor,pcie-root-complex";
> + device_type = "pci";
> +
> + /*
> +  * The sideband data provided to the MSI controller is
> +  * the RID, identity-mapped.
> +  */
> + msi-map = <0x0 &msi_a 0x0 0x1>,
> + };
> +};
> +
> +
> +Example (2)
> +===
> +
> +/ {
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + msi: msi-controller@a {
> + reg = <0xa 0x1>;
> + compatible = "vendor,some-controller";
> + msi-controller;
> + #msi-cells = <1>;
> + };
> +
> + pci: pci@f {
> + reg = <0xf 0x1>;
> + compatible = "vendor,pcie-root-complex";
> + device_type = "pci";
> +
> + /