subject:"Re\: \[PATCH\] devicetree\: Add generic IOMMU device tree bindings"

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding

On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
  On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
[...]
Multiple-master IOMMU:
--

iommu {
/* the specifier represents the ID of the master */
#address-cells = 1;
#size-cells = 0;
  
  How do we know the size of the input address to the IOMMU?  Do we
  get cases for example where the IOMMU only accepts a 32-bit input
  address, but some 64-bit capable masters are connected through it?
 
 I was stuck on this question for a while before, but then I realized
 that it doesn't matter at all: It's the IOMMU driver itself that
 manages the address space, and it doesn't matter if a slave can
 address a larger range than the IOMMU can accept. If the IOMMU
 needs to deal with the opposite case (64-bit input addresses
 but a 32-bit master), that limitation can be put into the specifier.

Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
master device's DMA mask to do the right thing here?

As such I don't think this information needs to be in device tree at
all. The DMA mask should typically be set by the driver in the first
place because it has knowledge about the capabilities of the device.
A different way of saying that is that the DMA mask is implied by the
device's compatible string.

  For determining dma masks, it is the output address that it
  important.  Santosh's code can probably be taught to handle this,
  if given an additional traversal rule for following iommus
  properties.  However, deploying an IOMMU whose output address size
  is smaller than the 
 
 Something seems to be missing here. I don't think we want to handle
 the case where the IOMMU output cannot the entire memory address
 space. If necessary, that would mean using both an IOMMU driver
 and swiotlb, but I think it's a reasonable assumption that hardware
 isn't /that/ crazy.

Similarily, should the IOMMU not be treated like any other device here?
Its DMA mask should determine what address range it can access.

Thierry


pgp8Hr_FCHEjv.pgp
Description: PGP signature

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Arnd Bergmann

On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
 On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
   On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
 [...]
 Multiple-master IOMMU:
 --
 
   iommu {
   /* the specifier represents the ID of the master */
   #address-cells = 1;
   #size-cells = 0;
   
   How do we know the size of the input address to the IOMMU?  Do we
   get cases for example where the IOMMU only accepts a 32-bit input
   address, but some 64-bit capable masters are connected through it?
  
  I was stuck on this question for a while before, but then I realized
  that it doesn't matter at all: It's the IOMMU driver itself that
  manages the address space, and it doesn't matter if a slave can
  address a larger range than the IOMMU can accept. If the IOMMU
  needs to deal with the opposite case (64-bit input addresses
  but a 32-bit master), that limitation can be put into the specifier.
 
 Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
 master device's DMA mask to do the right thing here?

Ah, yes. I guess that's the right way to do it.

   For determining dma masks, it is the output address that it
   important.  Santosh's code can probably be taught to handle this,
   if given an additional traversal rule for following iommus
   properties.  However, deploying an IOMMU whose output address size
   is smaller than the 
  
  Something seems to be missing here. I don't think we want to handle
  the case where the IOMMU output cannot the entire memory address
  space. If necessary, that would mean using both an IOMMU driver
  and swiotlb, but I think it's a reasonable assumption that hardware
  isn't /that/ crazy.
 
 Similarily, should the IOMMU not be treated like any other device here?
 Its DMA mask should determine what address range it can access.

Right. But for that we need a dma-ranges property in the parent of the
iommu, just so the mask can be set correctly and we don't have to
rely on the 32-bit fallback case.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Arnd Bergmann

On Wednesday 21 May 2014 10:16:09 Thierry Reding wrote:
 On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
   On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
 On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
 [...]
   Couldn't a single-master IOMMU be windowed?
  
  Ah, yes. That would actually be like an IBM pSeries, which has a 
  windowed
  IOMMU but uses one window per virtual machine. In that case, the 
  window could
  be a property of the iommu node though, rather than part of the 
  address
  in the link.
 
 Does that mean that the IOMMU has one statically configured window 
 which
 is the same for each virtual machine? That would require some other
 mechanism to assign separate address spaces to each virtual machine,
 wouldn't it? But I suspect that if the IOMMU allows that it could be
 allocated dynamically at runtime.

The way it works on pSeries is that upon VM creation, the guest is 
assigned
one 256MB window for use by assigned DMA capable devices. When the guest
creates a mapping, it uses a hypercall to associate a bus address in 
that
range with a guest physical address. The hypervisor checks that the bus
address is within the allowed range, and translates the guest physical
address into a host physical address, then enters both into the I/O page
table or I/O TLB.
   
   So when a VM is booted it is passed a device tree with that DMA window?
  
  Correct.
  
   Given what you describe above this seems to be more of a configuration
   option to restrict the IOMMU to a subset of the physical memory for
   purposes of virtualization. So I agree that this wouldn't be a good fit
   for what we're trying to achieve with iommus or dma-ranges in this
   binding.
  
  Thinking about it again now, I wonder if there are any other use cases
  for windowed IOMMUs. If this is the only one, there might be no use
  in the #address-cells model I suggested instead of your original
  #iommu-cells.
 
 So in this case virtualization is the reason why we need the DMA window.
 The reason for that is that the guest has no other way of knowing what
 other guests might be using, so it's essentially a mechanism for the
 host to manage the DMA region and allocate subregions for each guest. If
 virtualization isn't an issue then it seems to me that the need for DMA
 windows goes away because the operating system will track DMA regions
 anyway.
 
 The only reason I can think of why a windowed IOMMU would be useful is
 to prevent two or more devices from stepping on each others' toes. But
 that's a problem that the OS should already be handling during DMA
 buffer allocation, isn't it?

Right. As long as we always unmap the buffers from the IOMMU after they
have stopped being in use, it's very unlikely that even a broken device
driver causes a DMA into some bus address that happens to be mapped for
another device.

  I would like to add an explanation about dma-ranges to the binding:
  
  8
  The parent bus of the iommu must have a valid dma-ranges property
  describing how the physical address space of the IOMMU maps into
  memory.
 
 With physical address space you mean the addresses after translation,
 not the I/O virtual addresses, right? But even so, how will this work
 when there are multiple IOMMU devices? What determines which IOMMU is
 mapped via which entry?
 
 Perhaps having multiple IOMMUs implies that there will have to be some
 partitioning of the parent address space to make sure two IOMMUs don't
 translate to the same ranges?

These dma-ranges properties would almost always be for the entire RAM,
and we can treat anything else as a bug.
   
   Would it typically be a 1:1 mapping? In that case could we define an
   empty dma-ranges property to mean exactly that? That would make it
   consistent with the ranges property.
  
  Yes, I believe that is how it's already defined.
 
 I've gone through the proposal at [0] again, but couldn't find a mention
 of an empty dma-ranges property. But regardless I think that a 1:1
 mapping is the obvious meaning of an empty dma-ranges property.
 
 [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt
 
 One thing I'm not sure about is whether dma-ranges should be documented
 in this binding at all. Since there's an accepted standard proposal it
 would seem that it doesn't need to be specifically mentioned. One other
 option would be to link to the above proposal from the binding and then
 complement that with what an empty dma-ranges property means.
 
 Or we could possible document this in a file along with other standard
 properties. I don't think we

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding

On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
 On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
  On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
  [...]
  Multiple-master IOMMU:
  --
  
  iommu {
  /* the specifier represents the ID of the master */
  #address-cells = 1;
  #size-cells = 0;

How do we know the size of the input address to the IOMMU?  Do we
get cases for example where the IOMMU only accepts a 32-bit input
address, but some 64-bit capable masters are connected through it?
   
   I was stuck on this question for a while before, but then I realized
   that it doesn't matter at all: It's the IOMMU driver itself that
   manages the address space, and it doesn't matter if a slave can
   address a larger range than the IOMMU can accept. If the IOMMU
   needs to deal with the opposite case (64-bit input addresses
   but a 32-bit master), that limitation can be put into the specifier.
  
  Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
  master device's DMA mask to do the right thing here?
 
 Ah, yes. I guess that's the right way to do it.
 
For determining dma masks, it is the output address that it
important.  Santosh's code can probably be taught to handle this,
if given an additional traversal rule for following iommus
properties.  However, deploying an IOMMU whose output address size
is smaller than the 
   
   Something seems to be missing here. I don't think we want to handle
   the case where the IOMMU output cannot the entire memory address
   space. If necessary, that would mean using both an IOMMU driver
   and swiotlb, but I think it's a reasonable assumption that hardware
   isn't /that/ crazy.
  
  Similarily, should the IOMMU not be treated like any other device here?
  Its DMA mask should determine what address range it can access.
 
 Right. But for that we need a dma-ranges property in the parent of the
 iommu, just so the mask can be set correctly and we don't have to
 rely on the 32-bit fallback case.

Shouldn't the IOMMU driver be the one to set the DMA mask for the device
in exactly the same way that other drivers override the 32-bit default?

Thierry


pgpiCURdIdJW2.pgp
Description: PGP signature

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding

On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:
 On Wednesday 21 May 2014 10:16:09 Thierry Reding wrote:
  On Tue, May 20, 2014 at 10:31:29PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
  On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
  [...]
Couldn't a single-master IOMMU be windowed?
   
   Ah, yes. That would actually be like an IBM pSeries, which has a 
   windowed
   IOMMU but uses one window per virtual machine. In that case, the 
   window could
   be a property of the iommu node though, rather than part of the 
   address
   in the link.
  
  Does that mean that the IOMMU has one statically configured window 
  which
  is the same for each virtual machine? That would require some other
  mechanism to assign separate address spaces to each virtual machine,
  wouldn't it? But I suspect that if the IOMMU allows that it could be
  allocated dynamically at runtime.
 
 The way it works on pSeries is that upon VM creation, the guest is 
 assigned
 one 256MB window for use by assigned DMA capable devices. When the 
 guest
 creates a mapping, it uses a hypercall to associate a bus address in 
 that
 range with a guest physical address. The hypervisor checks that the 
 bus
 address is within the allowed range, and translates the guest physical
 address into a host physical address, then enters both into the I/O 
 page
 table or I/O TLB.

So when a VM is booted it is passed a device tree with that DMA window?
   
   Correct.
   
Given what you describe above this seems to be more of a configuration
option to restrict the IOMMU to a subset of the physical memory for
purposes of virtualization. So I agree that this wouldn't be a good fit
for what we're trying to achieve with iommus or dma-ranges in this
binding.
   
   Thinking about it again now, I wonder if there are any other use cases
   for windowed IOMMUs. If this is the only one, there might be no use
   in the #address-cells model I suggested instead of your original
   #iommu-cells.
  
  So in this case virtualization is the reason why we need the DMA window.
  The reason for that is that the guest has no other way of knowing what
  other guests might be using, so it's essentially a mechanism for the
  host to manage the DMA region and allocate subregions for each guest. If
  virtualization isn't an issue then it seems to me that the need for DMA
  windows goes away because the operating system will track DMA regions
  anyway.
  
  The only reason I can think of why a windowed IOMMU would be useful is
  to prevent two or more devices from stepping on each others' toes. But
  that's a problem that the OS should already be handling during DMA
  buffer allocation, isn't it?
 
 Right. As long as we always unmap the buffers from the IOMMU after they
 have stopped being in use, it's very unlikely that even a broken device
 driver causes a DMA into some bus address that happens to be mapped for
 another device.

I think that if buffers remain mapped in the IOMMU when they have been
deallocated that should be considered a bug.

Thierry


pgpngiXcEobEU.pgp
Description: PGP signature

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Arnd Bergmann

On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote:
 On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:

  Right. As long as we always unmap the buffers from the IOMMU after they
  have stopped being in use, it's very unlikely that even a broken device
  driver causes a DMA into some bus address that happens to be mapped for
  another device.
 
 I think that if buffers remain mapped in the IOMMU when they have been
 deallocated that should be considered a bug.

You could see it as a performance optimization in some cases, e.g. when you
cannot flush individual IOTLBs or doing so is expensive, and you just keep
assigning new addresses until every one has been used, then you do a global
IOTLB flush once. Obviously you have to maintain the IO page tables correctly.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Arnd Bergmann

On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote:
 On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
  On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:

 For determining dma masks, it is the output address that it
 important.  Santosh's code can probably be taught to handle this,
 if given an additional traversal rule for following iommus
 properties.  However, deploying an IOMMU whose output address size
 is smaller than the 

Something seems to be missing here. I don't think we want to handle
the case where the IOMMU output cannot the entire memory address
space. If necessary, that would mean using both an IOMMU driver
and swiotlb, but I think it's a reasonable assumption that hardware
isn't /that/ crazy.
   
   Similarily, should the IOMMU not be treated like any other device here?
   Its DMA mask should determine what address range it can access.
  
  Right. But for that we need a dma-ranges property in the parent of the
  iommu, just so the mask can be set correctly and we don't have to
  rely on the 32-bit fallback case.
 
 Shouldn't the IOMMU driver be the one to set the DMA mask for the device
 in exactly the same way that other drivers override the 32-bit default?

The IOMMU driver could /ask/ for an appropriate mask based on its internal
design, but if you have an IOMMU with a 64-bit output address connected
to a 32-bit bus, that should fail.

Note that it's not obvious what the IOMMU's DMA mask actually means.
It clearly has to be the mask that is used for allocating the IO page
tables, but it wouldn't normally be used in the path that allocates
pages on behalf of a DMA master attached to the IOMMU, because that
allocation is performed by the code that looks at the other device's
dma mask.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Thierry Reding

On Wed, May 21, 2014 at 11:36:32AM +0200, Arnd Bergmann wrote:
 On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote:
  On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
   On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
 
  For determining dma masks, it is the output address that it
  important.  Santosh's code can probably be taught to handle this,
  if given an additional traversal rule for following iommus
  properties.  However, deploying an IOMMU whose output address size
  is smaller than the 
 
 Something seems to be missing here. I don't think we want to handle
 the case where the IOMMU output cannot the entire memory address
 space. If necessary, that would mean using both an IOMMU driver
 and swiotlb, but I think it's a reasonable assumption that hardware
 isn't /that/ crazy.

Similarily, should the IOMMU not be treated like any other device here?
Its DMA mask should determine what address range it can access.
   
   Right. But for that we need a dma-ranges property in the parent of the
   iommu, just so the mask can be set correctly and we don't have to
   rely on the 32-bit fallback case.
  
  Shouldn't the IOMMU driver be the one to set the DMA mask for the device
  in exactly the same way that other drivers override the 32-bit default?
 
 The IOMMU driver could /ask/ for an appropriate mask based on its internal
 design, but if you have an IOMMU with a 64-bit output address connected
 to a 32-bit bus, that should fail.

Are there real use-cases where that really happens? I guess if we need
that the correct thing would be to bitwise AND both the DMA mask of the
IOMMU device (as set by the driver) with that derived from the IOMMU's
parent bus' dma-ranges property.

 Note that it's not obvious what the IOMMU's DMA mask actually means.
 It clearly has to be the mask that is used for allocating the IO page
 tables, but it wouldn't normally be used in the path that allocates
 pages on behalf of a DMA master attached to the IOMMU, because that
 allocation is performed by the code that looks at the other device's
 dma mask.

Interesting. If a DMA buffer is allocated using the master's DMA mask
wouldn't that cause breakage if the IOMMU and master's DMA masks don't
match. It seems to me like the right thing to do for buffer allocation
is to use the IOMMU's DMA mask if a device uses the IOMMU for
translation and use the device's DMA mask when determining to what I/O
virtual address to map that buffer.

Obviously if we always assume that IOMMU hardware is sane and can always
access at least the whole memory then this isn't an issue. But what if a
device can do DMA to a 64-bit address space, but the IOMMU can only
address 32 bits. If the device's DMA mask is used for allocations, then
buffers could reside beyond the 4 GiB boundary that the IOMMU can
address, so effectively the IOMMU wouldn't be able to write to those
buffers.

Thierry


pgpO1FWiesOeA.pgp
Description: PGP signature

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Arnd Bergmann

On Wednesday 21 May 2014 12:50:38 Thierry Reding wrote:
 On Wed, May 21, 2014 at 11:36:32AM +0200, Arnd Bergmann wrote:
  On Wednesday 21 May 2014 11:00:38 Thierry Reding wrote:
   On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
  
   For determining dma masks, it is the output address that it
   important.  Santosh's code can probably be taught to handle this,
   if given an additional traversal rule for following iommus
   properties.  However, deploying an IOMMU whose output address size
   is smaller than the 
  
  Something seems to be missing here. I don't think we want to handle
  the case where the IOMMU output cannot the entire memory address
  space. If necessary, that would mean using both an IOMMU driver
  and swiotlb, but I think it's a reasonable assumption that hardware
  isn't /that/ crazy.
 
 Similarily, should the IOMMU not be treated like any other device 
 here?
 Its DMA mask should determine what address range it can access.

Right. But for that we need a dma-ranges property in the parent of the
iommu, just so the mask can be set correctly and we don't have to
rely on the 32-bit fallback case.
   
   Shouldn't the IOMMU driver be the one to set the DMA mask for the device
   in exactly the same way that other drivers override the 32-bit default?
  
  The IOMMU driver could /ask/ for an appropriate mask based on its internal
  design, but if you have an IOMMU with a 64-bit output address connected
  to a 32-bit bus, that should fail.
 
 Are there real use-cases where that really happens? I guess if we need
 that the correct thing would be to bitwise AND both the DMA mask of the
 IOMMU device (as set by the driver) with that derived from the IOMMU's
 parent bus' dma-ranges property.

It would be unusual for an IOMMU to need this, but it's how the DMA
mask is supposed to work for normal devices. As mentioned before, I
would probably just error out if we ever encounter such an IOMMU.

  Note that it's not obvious what the IOMMU's DMA mask actually means.
  It clearly has to be the mask that is used for allocating the IO page
  tables, but it wouldn't normally be used in the path that allocates
  pages on behalf of a DMA master attached to the IOMMU, because that
  allocation is performed by the code that looks at the other device's
  dma mask.
 
 Interesting. If a DMA buffer is allocated using the master's DMA mask
 wouldn't that cause breakage if the IOMMU and master's DMA masks don't
 match. It seems to me like the right thing to do for buffer allocation
 is to use the IOMMU's DMA mask if a device uses the IOMMU for
 translation and use the device's DMA mask when determining to what I/O
 virtual address to map that buffer.

Unfortunately not all code agrees regarding how dma mask is actually
interpreted. The most important use is within the dma_map_ops, and
that is aware of the IOMMU. The dma_map_ops use that to decide what
IOVA (bus address) to generate that is usable for the device, normally
this would be a 32-bit range.

When driver code looks at the dma mask of the device itself to make
an allocation decision without taking the IOMMU or swiotlb into
account, things can indeed go wrong.

Russell has recently done a good cleanup of various issues around
dma masks, and I can't find any drivers that get this wrong.
However, there is an issue with the two or three subsystems using
PCI_DMA_BUS_IS_PHYS to decide how they should treat high buffers
coming from user space that get passed to hardware.

If the SCSI layer or the network layer find the horribly misnamed
PCI_DMA_BUS_IS_PHYS (which is hardcoded to 1 on ARM32), they
will create copy in low memory for any data that is above
the device dma_mask (SCSI) or above max_low_pfn (network).

This is not normally a bug, and won't hurt for the swiotlb case,
but will give us worse performance for the IOMMU case, and
we should probably change this code to calculate the boundary
per device by calling a function from dma_map_ops.

We also really need to implement swiotlb support on ARM32 to deal
with any other device (besides SCSI and network) that does not
have an IOMMU but wants to use the streaming DMA API on pages
outside of the dma_mask. We already have this case on shmobile.

 Obviously if we always assume that IOMMU hardware is sane and can always
 access at least the whole memory then this isn't an issue. But what if a
 device can do DMA to a 64-bit address space, but the IOMMU can only
 address 32 bits. If the device's DMA mask is used for allocations, then
 buffers could reside beyond the 4 GiB boundary that the IOMMU can
 address, so effectively the IOMMU wouldn't be able to write to those
 buffers.

The mask of the device is not even an issue here, it's more the general
case of passing a buffer outside of the IOMMU's upstream bus DMA mask
into a driver connected to the IOMMU.

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Grant Grundler

On Wed, May 21, 2014 at 2:32 AM, Arnd Bergmann a...@arndb.de wrote:
 On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote:
 On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:

  Right. As long as we always unmap the buffers from the IOMMU after they
  have stopped being in use, it's very unlikely that even a broken device
  driver causes a DMA into some bus address that happens to be mapped for
  another device.

 I think that if buffers remain mapped in the IOMMU when they have been
 deallocated that should be considered a bug.

There is currently no general requirement to tear down mappings immediately.
An option to enforce immediate tear down might be useful since I agree
Virtual Guests sharing host memory through shared physical devices
will want that. ie there should be no opportunity for a shared device
to be able to access another guests memory through stale (but live)
DMA mappings. I don't think that's the case but I'd ask someone like
Alex Williamson for a more certain answer.

 You could see it as a performance optimization in some cases, e.g. when you
 cannot flush individual IOTLBs or doing so is expensive, and you just keep
 assigning new addresses until every one has been used, then you do a global
 IOTLB flush once.

This is in fact exactly what several IOMMUs do - including the Intel
IOMMU driver.

IIRC, the Intel IOMMU driver collects 256 or so  entries to flush and
then flushes them all at once and updates the metadata. I don't if it
does this for ranges assigned to different virtual guests though or
just within a given range.

 Obviously you have to maintain the IO page tables correctly.

To be clear, Correctly in this case just means until the IOTLB is
flushed, the given IOMMU Pdir entries are marked in use even though
the driver has handed ownership back to the IOMMU driver.

cheers,
grant
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Arnd Bergmann

On Wednesday 21 May 2014 08:44:42 Grant Grundler wrote:
 On Wed, May 21, 2014 at 2:32 AM, Arnd Bergmann a...@arndb.de wrote:
  On Wednesday 21 May 2014 11:02:45 Thierry Reding wrote:
  On Wed, May 21, 2014 at 10:54:42AM +0200, Arnd Bergmann wrote:
 
   Right. As long as we always unmap the buffers from the IOMMU after they
   have stopped being in use, it's very unlikely that even a broken device
   driver causes a DMA into some bus address that happens to be mapped for
   another device.
 
  I think that if buffers remain mapped in the IOMMU when they have been
  deallocated that should be considered a bug.
 
 There is currently no general requirement to tear down mappings immediately.
 An option to enforce immediate tear down might be useful since I agree
 Virtual Guests sharing host memory through shared physical devices
 will want that. ie there should be no opportunity for a shared device
 to be able to access another guests memory through stale (but live)
 DMA mappings. I don't think that's the case but I'd ask someone like
 Alex Williamson for a more certain answer.

I believe powerpc has a boot-time option to enforce the immediate
IOTLB flush, which is very useful for device driver debugging when
something goes wrong with stale DMAs.

  Obviously you have to maintain the IO page tables correctly.
 
 To be clear, Correctly in this case just means until the IOTLB is
 flushed, the given IOMMU Pdir entries are marked in use even though
 the driver has handed ownership back to the IOMMU driver.

I don't know what a Pdir is, but I guess strictly speaking we have
to ensure that all IO page table entries that have been unmapped by
a driver are marked as invalid at least by the time the IOTLB is flushed,
plus we have to flush each entry before it gets reused for a different
page.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Dave Martin

On Wed, May 21, 2014 at 11:00:38AM +0200, Thierry Reding wrote:
 On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
  On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:
   On Tue, May 20, 2014 at 10:26:12PM +0200, Arnd Bergmann wrote:
On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
 On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
   [...]
   Multiple-master IOMMU:
   --
   
 iommu {
 /* the specifier represents the ID of the master */
 #address-cells = 1;
 #size-cells = 0;
 
 How do we know the size of the input address to the IOMMU?  Do we
 get cases for example where the IOMMU only accepts a 32-bit input
 address, but some 64-bit capable masters are connected through it?

I was stuck on this question for a while before, but then I realized
that it doesn't matter at all: It's the IOMMU driver itself that
manages the address space, and it doesn't matter if a slave can
address a larger range than the IOMMU can accept. If the IOMMU
needs to deal with the opposite case (64-bit input addresses
but a 32-bit master), that limitation can be put into the specifier.
   
   Isn't this what DMA masks are for? Couldn't the IOMMU simply use the
   master device's DMA mask to do the right thing here?
  
  Ah, yes. I guess that's the right way to do it.
  
 For determining dma masks, it is the output address that it
 important.  Santosh's code can probably be taught to handle this,
 if given an additional traversal rule for following iommus
 properties.  However, deploying an IOMMU whose output address size
 is smaller than the 

Something seems to be missing here. I don't think we want to handle
the case where the IOMMU output cannot the entire memory address
space. If necessary, that would mean using both an IOMMU driver
and swiotlb, but I think it's a reasonable assumption that hardware
isn't /that/ crazy.
   
   Similarily, should the IOMMU not be treated like any other device here?
   Its DMA mask should determine what address range it can access.
  
  Right. But for that we need a dma-ranges property in the parent of the
  iommu, just so the mask can be set correctly and we don't have to
  rely on the 32-bit fallback case.
 
 Shouldn't the IOMMU driver be the one to set the DMA mask for the device
 in exactly the same way that other drivers override the 32-bit default?
 

Are we confusing the next-hop DMA mask with the end-to-end DMA mask
here?  The device has a next-hop mask which may be non-trivial.  The
IOMMU also has a next-hop mask and/or remapping, but I think we agree
that in sensible systems that will be trivial.  There might be other
non-trivial remappings between the IOMMU and memory (but again, not
in the common case).

If we just use the same name for all these, we are liable to get
confused.

To answer the question what memory can the kernel allocate for DMA
with this device, it is the end-to-end transformation that is
important.

Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-21 Thread Arnd Bergmann

On Wednesday 21 May 2014 18:09:54 Dave Martin wrote:
 On Wed, May 21, 2014 at 11:00:38AM +0200, Thierry Reding wrote:
  On Wed, May 21, 2014 at 10:50:38AM +0200, Arnd Bergmann wrote:
   On Wednesday 21 May 2014 10:26:11 Thierry Reding wrote:

Similarily, should the IOMMU not be treated like any other device here?
Its DMA mask should determine what address range it can access.
   
   Right. But for that we need a dma-ranges property in the parent of the
   iommu, just so the mask can be set correctly and we don't have to
   rely on the 32-bit fallback case.
  
  Shouldn't the IOMMU driver be the one to set the DMA mask for the device
  in exactly the same way that other drivers override the 32-bit default?
  
 
 Are we confusing the next-hop DMA mask with the end-to-end DMA mask
 here?  The device has a next-hop mask which may be non-trivial.  The
 IOMMU also has a next-hop mask and/or remapping, but I think we agree
 that in sensible systems that will be trivial.  There might be other
 non-trivial remappings between the IOMMU and memory (but again, not
 in the common case).
 
 If we just use the same name for all these, we are liable to get
 confused.
 
 To answer the question what memory can the kernel allocate for DMA
 with this device, it is the end-to-end transformation that is
 important.

Yes, but that is not the same as the dma mask of the device. The DMA
mask gets set by the device according to its capabilities, and may
get limited by what the bus to either memory or to the iommu can do,
if one is in use.

Without an IOMMU, the mask is used for allocations, with an IOMMU,
The iommu code makes the decision what to allocate without taking
dev-dma_mask into account.

As mentioned, this is currently not handled well for SCSI and network,
where we use a smaller mask than necessary and can end up copying
data.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Monday 19 May 2014 22:32:33 Thierry Reding wrote:
 On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote:
  On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote:
 [...]
   My understanding here is mostly based on the OpenFirmware working group
   proposal for the dma-ranges property[0]. I'll give another example to
   try and clarify how I had imagined this to work:
   
   / {
   #address-cells = 2;
   #size-cells = 2;
   
   iommu {
   /*
* This is somewhat unusual (or maybe not) in that we
* need 2 cells to represent the size of an address
* space that is 32 bits long.
*/
   #address-cells = 1;
   #size-cells = 2;
   
   #iommu-cells = 1;
   };
   
   master {
   iommus = /iommu 42;
  
  Comparing this with the other discussion thread, we have a similar
  concept here, in that the iommu is made a logical parent, however...
  
  Firstly, there's an implicit assumption here that the only kind of
  thing the master could possibly be connected to is an IOMMU, with
  no non-trivial interconnect in between.  I'm not sure this is going
  to scale to complex SoCs.
 
 Here we go again. We're now in the very same bad spot that we've been in
 so many times before. There are at least two SoCs that we know do not
 require anything fancy, yet we're blocking adding support for those use
 cases because we think that at some point some IOMMU may require more
 than that. But at the same time it seems like we don't have enough data
 about what exactly that is, so we keep speculating. At this rate we're
 making it impossible to get a reasonable feature set supported upstream.
 
 That's very frustrating.

I agree. While I just commented that I want to think through how this
would look for other IOMMUs, the generic case of all masters that Dave
wants is going overboard with the complexity and we'd be better off
deferring this to whenever someone needs it, which I assume is never.

  If a range of Stream IDs may be issued (especially from something like
  a PCIe RC where the stream ID may be a many-bit value), describing
  the IDs individually may be impractical.
 
 The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has
 a need to represent the stream IDs as a range there's nothing keeping it
 from defining the specifier accordingly.

If we make the IOMMU address space look like the PCI address space, we
can have a simple representation for this in DT. For the code, I assume
that we will always have to treat PCI and platform devices differently.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Thierry Reding

On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
 On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
  On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
   On Monday 19 May 2014 14:53:37 Thierry Reding wrote:
On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
 On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
  From: Thierry Reding tred...@nvidia.com
[...]
 Finally, it makes no sense to use the dma-ranges property of the 
 master's
 parent bus, because that bus isn't actually involved in the 
 translation.

My understanding here is mostly based on the OpenFirmware working group
proposal for the dma-ranges property[0]. I'll give another example to
try and clarify how I had imagined this to work:

/ {
#address-cells = 2;
#size-cells = 2;

iommu {
/*
 * This is somewhat unusual (or maybe not) in 
that we
 * need 2 cells to represent the size of an 
address
 * space that is 32 bits long.
 */
#address-cells = 1;
#size-cells = 2;
   
   You should never need #size-cells  #address-cells
  
  That was always my impression as well. But how then do you represent the
  full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
  4 GiB - 1, which makes it 4 GiB large. That's:
  
  0 1 0
  
  With #address-cells = 1 and #size-cells = 1 the best you can do is:
  
  0 0x
  
  but that's not accurate.
 
 I think we've done both in the past, either extended #size-cells or
 taken 0x as a special token. Note that in your example,
 the iommu actually needs #address-cells = 2 anyway.

But it needs #address-cells = 2 only to encode an ID in addition to
the address. If this was a single-master IOMMU then there'd be no need
for the ID.

This really isn't specific to IOMMUs though. It really applies to all
cases where #address-cells and #size-cells are parsed. While it's way
too late to change the semantics of standard properties, perhaps for
properties that are introduced it would make more sense to encode this
as a start limit pair, both of length #address-cells, to avoid this
type of corner case.

On the other hand doing so would make it inconsistent with existing
bindings which may not be desirable either.

But since it seems like we're headed for something completely different
for IOMMUs, perhaps it would be worth considering to describe the IOMMU
range as start limit. Since it will likely use #iommu-cells rather
than #address-cells we have an opportunity to change the semantics.

   Let me do another example, with the address merged into the iommu
   references:
   
   / {
 #address-cells = 2; // 64-bit address
 #size-cells = 2;
   
 iommu@a {
 #address-cells = 2; // 1 cell ID, 1 cell address
 #size-cells = 1;
   
 // no need for #iommu-cells
 };
   
   
 master@b {
 iommus = /iommu@a // iommu
 0x23 // ID
 0x4000   // window start
 0x1000; //window size
 };
   };
   
   A disadvantage of this model would be that for all ARM SMMU users, we'd
   have to always list a 4GB address space, which is kind of redundant.
  
  Isn't that the equivalent of the whole address space default that I
  mentioned in the commit message? Could this be handled with
  #address-cells = 1 and #size-cells = 0 in the iommu node? That way
  the only cell that needs to be specified in iommus would be the ID and
  the redundant address space could be simply omitted from DT since it
  would be implied by the compatible string.
 
 Yes, that's right. After giving it some more thought, I agree that the
 #size-cells case makes sense as an indicator for cases where the master
 doesn't have to care about the address at all and no further translation
 with dma-ranges is possible.
 
 Back to my example with dma-ranges, the simplest case would an IOMMU
 that has an ID per bus, with multiple masters sharing that ID:
 
 / {
   #address-cells = 1;
   #size-cells = 1;
 
   iommu {
   #address-cells = 2; // ID, address
   #size-cells = 2;
   };
 
   master@a {
   iommus =  {/iommu} 0xa 0x0  0x1 0x0; // 4GB ID '0xa'
   }
 
   bus1 {
   #address-cells = 1;
   #size-cells = 1;
   ranges;
   iommus =  {/iommu} 0 0   0x100 0; // all IDs
   dma-ranges = 0  0xb 0  1 0; // child devices use ID '0xb'
 
   anothermaster {
   // no iommus link, implied by dma-ranges above
   };
   };
 };
 
 If you set #size-cells=0, you can't really do that but instead would

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
 On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
  On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
   On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
On Monday 19 May 2014 14:53:37 Thierry Reding wrote:
 On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
  On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
   From: Thierry Reding tred...@nvidia.com
 [...]
  Finally, it makes no sense to use the dma-ranges property of the 
  master's
  parent bus, because that bus isn't actually involved in the 
  translation.
 
 My understanding here is mostly based on the OpenFirmware working 
 group
 proposal for the dma-ranges property[0]. I'll give another example to
 try and clarify how I had imagined this to work:
 
   / {
   #address-cells = 2;
   #size-cells = 2;
 
   iommu {
   /*
* This is somewhat unusual (or maybe not) in 
 that we
* need 2 cells to represent the size of an 
 address
* space that is 32 bits long.
*/
   #address-cells = 1;
   #size-cells = 2;

You should never need #size-cells  #address-cells
   
   That was always my impression as well. But how then do you represent the
   full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
   4 GiB - 1, which makes it 4 GiB large. That's:
   
 0 1 0
   
   With #address-cells = 1 and #size-cells = 1 the best you can do is:
   
 0 0x
   
   but that's not accurate.
  
  I think we've done both in the past, either extended #size-cells or
  taken 0x as a special token. Note that in your example,
  the iommu actually needs #address-cells = 2 anyway.
 
 But it needs #address-cells = 2 only to encode an ID in addition to
 the address. If this was a single-master IOMMU then there'd be no need
 for the ID.

Right. But for a single-master IOMMU, there is no need to specify
any additional data, it could have #address-cells=0 if we take the
optimization you suggested.

 This really isn't specific to IOMMUs though. It really applies to all
 cases where #address-cells and #size-cells are parsed. While it's way
 too late to change the semantics of standard properties, perhaps for
 properties that are introduced it would make more sense to encode this
 as a start limit pair, both of length #address-cells, to avoid this
 type of corner case.
 
 On the other hand doing so would make it inconsistent with existing
 bindings which may not be desirable either.
 
 But since it seems like we're headed for something completely different
 for IOMMUs, perhaps it would be worth considering to describe the IOMMU
 range as start limit. Since it will likely use #iommu-cells rather
 than #address-cells we have an opportunity to change the semantics.

I'd still prefer #address-cells/#size-cells over #iommu-cells.

  / {
  #address-cells = 1;
  #size-cells = 1;
  
  iommu {
  #address-cells = 2; // ID, address
  #size-cells = 2;
  };
  
  master@a {
  iommus =  {/iommu} 0xa 0x0  0x1 0x0; // 4GB ID '0xa'
  }
  
  bus1 {
  #address-cells = 1;
  #size-cells = 1;
  ranges;
  iommus =  {/iommu} 0 0   0x100 0; // all IDs
  dma-ranges = 0  0xb 0  1 0; // child devices use ID '0xb'
  
  anothermaster {
  // no iommus link, implied by dma-ranges above
  };
  };
  };
  
  If you set #size-cells=0, you can't really do that but instead would
  require an iommus property in each master, which is not a big concern
  either.
 
 I'm not sure I understand the need for 0x100 (all IDs) entry above. If
 bus1's iommus property applies to all devices on the bus, why can't the
 ID 0xb be listed in the iommus property?
 
   bus1 {
   #address-cells = 1;
   #size-cells = 1;
   ranges;
   iommus = {/iommu} 0xb 0  0x1 0x0; // 4GB ID '0xb'
   dma-ranges = 0  0xb 0  0x1 0x0;
 
   anothermaster {
   ...
   };
   };

It depends on how the address is interpreted, but we could make this
a valid case too.

 In which case I guess dma-ranges would be redundant.

No, because the iommus property doesn't translate the address range, it
just creates a new address space. bus1 and iommu in the example have
different #address-cells, so you definitely need a non-empty ranges
property.

  The main advantage I think would be for IOMMUs that use the PCI b/d/f
  numbers as IDs. These can have #address-cells=3, #size-cells=2
  and have an empty dma-ranges property in the PCI host bridge node,
  and

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Thierry Reding

On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
  On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
   On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
[...]
 You should never need #size-cells  #address-cells

That was always my impression as well. But how then do you represent the
full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
4 GiB - 1, which makes it 4 GiB large. That's:

0 1 0

With #address-cells = 1 and #size-cells = 1 the best you can do is:

0 0x

but that's not accurate.
   
   I think we've done both in the past, either extended #size-cells or
   taken 0x as a special token. Note that in your example,
   the iommu actually needs #address-cells = 2 anyway.
  
  But it needs #address-cells = 2 only to encode an ID in addition to
  the address. If this was a single-master IOMMU then there'd be no need
  for the ID.
 
 Right. But for a single-master IOMMU, there is no need to specify
 any additional data, it could have #address-cells=0 if we take the
 optimization you suggested.

Couldn't a single-master IOMMU be windowed?

  I'm not sure I understand the need for 0x100 (all IDs) entry above. If
  bus1's iommus property applies to all devices on the bus, why can't the
  ID 0xb be listed in the iommus property?
  
  bus1 {
  #address-cells = 1;
  #size-cells = 1;
  ranges;
  iommus = {/iommu} 0xb 0  0x1 0x0; // 4GB ID '0xb'
  dma-ranges = 0  0xb 0  0x1 0x0;
  
  anothermaster {
  ...
  };
  };
 
 It depends on how the address is interpreted, but we could make this
 a valid case too.
 
  In which case I guess dma-ranges would be redundant.
 
 No, because the iommus property doesn't translate the address range, it
 just creates a new address space. bus1 and iommu in the example have
 different #address-cells, so you definitely need a non-empty ranges
 property.

Ah, now I get it.

   The main advantage I think would be for IOMMUs that use the PCI b/d/f
   numbers as IDs. These can have #address-cells=3, #size-cells=2
   and have an empty dma-ranges property in the PCI host bridge node,
   and interpret this as using the same encoding as the PCI BARs in
   the ranges property.
  
  I'm somewhat confused here, since you said earlier:
  
   After giving the ranges stuff some more thought, I have come to the
   conclusion that using #iommu-cells should work fine for almost
   all cases, including windowed iommus, because the window is not
   actually needed in the device, but only in the iommu, wihch is of course
   free to interpret the arguments as addresses.
  
  But now you seem to be saying that we should still be using the
  #address-cells and #size-cells properties in the IOMMU node to determine
  the length of the specifier.
 
 I probably wasn't clear. I think we can make it work either way, but
 my feeling is that using #address-cells/#size-cells gives us a nicer
 syntax for the more complex cases.

Okay, so in summary we'd have something like this for simple cases:

Required properties:

- #address-cells: The number of cells in an IOMMU specifier needed to encode
  an address.
- #size-cells: The number of cells in an IOMMU specifier needed to represent
  the length of an address range.

Typical values for the above include:
- #address-cells = 0, size-cells = 0: Single master IOMMU devices are not
  configurable and therefore no additional information needs to be encoded in
  the specifier. This may also apply to multiple master IOMMU devices that do
  not allow the association of masters to be configured.
- #address-cells = 1, size-cells = 0: Multiple master IOMMU devices may
  need to be configured in order to enable translation for a given master. In
  such cases the single address cell corresponds to the master device's ID.
- #address-cells = 2, size-cells = 2: Some IOMMU devices allow the DMA
  window for masters to be configured. The first cell of the address in this
  may contain the master device's ID for example, while the second cell could
  contain the start of the DMA window for the given device. The length of the
  DMA window is specified by two additional cells.

Examples:
=

Single-master IOMMU:


iommu {
#address-cells = 0;
#size-cells = 0;
};

master {
iommus = /iommu;
};

Multiple-master IOMMU with fixed associations:
--

/* multiple-master IOMMU */
iommu {
/*
 * Masters are statically associated with this IOMMU and
 * address translation is always enabled.
 */

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
 On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
   On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
 On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
 [...]
  You should never need #size-cells  #address-cells
 
 That was always my impression as well. But how then do you represent 
 the
 full 4 GiB address space in a 32-bit system? It starts at 0 and ends 
 at
 4 GiB - 1, which makes it 4 GiB large. That's:
 
   0 1 0
 
 With #address-cells = 1 and #size-cells = 1 the best you can do 
 is:
 
   0 0x
 
 but that's not accurate.

I think we've done both in the past, either extended #size-cells or
taken 0x as a special token. Note that in your example,
the iommu actually needs #address-cells = 2 anyway.
   
   But it needs #address-cells = 2 only to encode an ID in addition to
   the address. If this was a single-master IOMMU then there'd be no need
   for the ID.
  
  Right. But for a single-master IOMMU, there is no need to specify
  any additional data, it could have #address-cells=0 if we take the
  optimization you suggested.
 
 Couldn't a single-master IOMMU be windowed?

Ah, yes. That would actually be like an IBM pSeries, which has a windowed
IOMMU but uses one window per virtual machine. In that case, the window could
be a property of the iommu node though, rather than part of the address
in the link.

The main advantage I think would be for IOMMUs that use the PCI b/d/f
numbers as IDs. These can have #address-cells=3, #size-cells=2
and have an empty dma-ranges property in the PCI host bridge node,
and interpret this as using the same encoding as the PCI BARs in
the ranges property.
   
   I'm somewhat confused here, since you said earlier:
   
After giving the ranges stuff some more thought, I have come to the
conclusion that using #iommu-cells should work fine for almost
all cases, including windowed iommus, because the window is not
actually needed in the device, but only in the iommu, wihch is of course
free to interpret the arguments as addresses.
   
   But now you seem to be saying that we should still be using the
   #address-cells and #size-cells properties in the IOMMU node to determine
   the length of the specifier.
  
  I probably wasn't clear. I think we can make it work either way, but
  my feeling is that using #address-cells/#size-cells gives us a nicer
  syntax for the more complex cases.
 
 Okay, so in summary we'd have something like this for simple cases:
 
 Required properties:
 
 - #address-cells: The number of cells in an IOMMU specifier needed to encode
   an address.
 - #size-cells: The number of cells in an IOMMU specifier needed to represent
   the length of an address range.
 
 Typical values for the above include:
 - #address-cells = 0, size-cells = 0: Single master IOMMU devices are not
   configurable and therefore no additional information needs to be encoded in
   the specifier. This may also apply to multiple master IOMMU devices that do
   not allow the association of masters to be configured.
 - #address-cells = 1, size-cells = 0: Multiple master IOMMU devices may
   need to be configured in order to enable translation for a given master. In
   such cases the single address cell corresponds to the master device's ID.
 - #address-cells = 2, size-cells = 2: Some IOMMU devices allow the DMA
   window for masters to be configured. The first cell of the address in this
   may contain the master device's ID for example, while the second cell could
   contain the start of the DMA window for the given device. The length of the
   DMA window is specified by two additional cells.
 
 Examples:
 =
 
 Single-master IOMMU:
 
 
   iommu {
   #address-cells = 0;
   #size-cells = 0;
   };
 
   master {
   iommus = /iommu;
   };
 
 Multiple-master IOMMU with fixed associations:
 --
 
   /* multiple-master IOMMU */
   iommu {
   /*
* Masters are statically associated with this IOMMU and
* address translation is always enabled.
*/
   #iommu-cells = 0;
   };

copied wrong? I guess you mean #address-cells=0/#size-cells=0 here.

   /* static association with IOMMU */
   master@1 {
   reg = 1;
   iommus = /iommu;
   };
 
   /* static association with IOMMU */
   master@2 {
   reg = 2;
   iommus = /iommu;
   };
 
 Multiple-master IOMMU:
 --
 
   iommu {
   /* the specifier represents the ID of the master */

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Dave Martin

On Tue, May 20, 2014 at 12:08:44PM +0200, Arnd Bergmann wrote:
 On Monday 19 May 2014 22:32:33 Thierry Reding wrote:
  On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote:
   On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote:
  [...]
My understanding here is mostly based on the OpenFirmware working group
proposal for the dma-ranges property[0]. I'll give another example to
try and clarify how I had imagined this to work:

/ {
#address-cells = 2;
#size-cells = 2;

iommu {
/*
 * This is somewhat unusual (or maybe not) in that 
we
 * need 2 cells to represent the size of an address
 * space that is 32 bits long.
 */
#address-cells = 1;
#size-cells = 2;

#iommu-cells = 1;
};

master {
iommus = /iommu 42;
   
   Comparing this with the other discussion thread, we have a similar
   concept here, in that the iommu is made a logical parent, however...
   
   Firstly, there's an implicit assumption here that the only kind of
   thing the master could possibly be connected to is an IOMMU, with
   no non-trivial interconnect in between.  I'm not sure this is going
   to scale to complex SoCs.
  
  Here we go again. We're now in the very same bad spot that we've been in
  so many times before. There are at least two SoCs that we know do not
  require anything fancy, yet we're blocking adding support for those use
  cases because we think that at some point some IOMMU may require more
  than that. But at the same time it seems like we don't have enough data
  about what exactly that is, so we keep speculating. At this rate we're
  making it impossible to get a reasonable feature set supported upstream.
  
  That's very frustrating.
 
 I agree. While I just commented that I want to think through how this
 would look for other IOMMUs, the generic case of all masters that Dave
 wants is going overboard with the complexity and we'd be better off

I don't want that, and actually I agree with the conclusion overboard.

It was really about exploring the problem space, including things that
we can reasonably foresee.  Any bindings today should necessarily be
much simpler, and that's certainly the correct approach.

I've been neglecting this thread (apologies) -- but although I have to
think a bit about Thierry's most recent suggestions, I think there's
actually reasonable alignment now.

 deferring this to whenever someone needs it, which I assume is never.

Well, some of it might be.  But I'm not saying we should give in to
wild speculation (except to get a feel for what we're ruling out, and
the consequences of that).

 
   If a range of Stream IDs may be issued (especially from something like
   a PCIe RC where the stream ID may be a many-bit value), describing
   the IDs individually may be impractical.
  
  The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has
  a need to represent the stream IDs as a range there's nothing keeping it
  from defining the specifier accordingly.
 
 If we make the IOMMU address space look like the PCI address space, we
 can have a simple representation for this in DT. For the code, I assume
 that we will always have to treat PCI and platform devices differently.

Can you elaborate on that?

Master ID signals in ARM systems and how they are handled are rather
under-specified today.  Treating the master ID bits as some extra bits
in some kind of extended bus address could work for ARM IOMMUs etc.,
but I didn't have a complete answer yet.

Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Thierry Reding

On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
[...]
  Couldn't a single-master IOMMU be windowed?
 
 Ah, yes. That would actually be like an IBM pSeries, which has a windowed
 IOMMU but uses one window per virtual machine. In that case, the window could
 be a property of the iommu node though, rather than part of the address
 in the link.

Does that mean that the IOMMU has one statically configured window which
is the same for each virtual machine? That would require some other
mechanism to assign separate address spaces to each virtual machine,
wouldn't it? But I suspect that if the IOMMU allows that it could be
allocated dynamically at runtime.

  Multiple-master IOMMU with fixed associations:
  --
  
  /* multiple-master IOMMU */
  iommu {
  /*
   * Masters are statically associated with this IOMMU and
   * address translation is always enabled.
   */
  #iommu-cells = 0;
  };
 
 copied wrong? I guess you mean #address-cells=0/#size-cells=0 here.

Yes, I obviously wasn't careful when I copied this over.

  Multiple-master device:
  ---
  
  /* single-master IOMMU */
  iommu@1 {
  reg = 1;
  #address-cells = 0;
  #size-cells = 0;
  };
  
  /* multiple-master IOMMU */
  iommu@2 {
  reg = 2;
  #address-cells = 1;
  #size-cells = 0;
  };
  
  /* device with two master interfaces */
  master {
  iommus = /iommu@1,/* master of the single-master IOMMU */
   /iommu@2 42; /* ID 42 in multiple-master IOMMU */
  };
[...]
  Does that sound about right?
 
 Yes, sounds great. I would probably leave out the Multiple-master device
 from the examples, since that seems to be a rather obscure case.

Agreed. We can easily add such examples if/when such device start to
appear.

 I would like to add an explanation about dma-ranges to the binding:
 
 8
 The parent bus of the iommu must have a valid dma-ranges property
 describing how the physical address space of the IOMMU maps into
 memory.

With physical address space you mean the addresses after translation,
not the I/O virtual addresses, right? But even so, how will this work
when there are multiple IOMMU devices? What determines which IOMMU is
mapped via which entry?

Perhaps having multiple IOMMUs implies that there will have to be some
partitioning of the parent address space to make sure two IOMMUs don't
translate to the same ranges?

 A device with an iommus property will ignore the dma-ranges property
 of the parent node and rely on the IOMMU for translation instead.

Do we need to consider the case where an IOMMU listed in iommus isn't
enabled (status = disabled)? In that case presumably the device would
either not function or may optionally continue to master onto the parent
untranslated.

 Using an iommus property in bus device nodes with dma-ranges
 specifying how child devices relate to the IOMMU is a possible extension
 but is not recommended until this binding gets extended.

Just for my understanding, bus device nodes with iommus and dma-ranges
properties could be equivalently written by explicitly moving the iommus
properties into the child device nodes, right? In which case they should
be the same as the other examples. So that concept is a convenience
notation to reduce duplication, but doesn't fundamentally introduce any
new concept.

Thierry


pgpWOnAOo5SOg.pgp
Description: PGP signature

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Tuesday 20 May 2014 14:07:09 Dave Martin wrote:
 On Tue, May 20, 2014 at 12:08:44PM +0200, Arnd Bergmann wrote:
  On Monday 19 May 2014 22:32:33 Thierry Reding wrote:
   On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote:
On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote:
   [...]
 My understanding here is mostly based on the OpenFirmware working 
 group
 proposal for the dma-ranges property[0]. I'll give another example to
 try and clarify how I had imagined this to work:
 
 / {
 #address-cells = 2;
 #size-cells = 2;
 
 iommu {
 /*
  * This is somewhat unusual (or maybe not) in 
 that we
  * need 2 cells to represent the size of an 
 address
  * space that is 32 bits long.
  */
 #address-cells = 1;
 #size-cells = 2;
 
 #iommu-cells = 1;
 };
 
 master {
 iommus = /iommu 42;

Comparing this with the other discussion thread, we have a similar
concept here, in that the iommu is made a logical parent, however...

Firstly, there's an implicit assumption here that the only kind of
thing the master could possibly be connected to is an IOMMU, with
no non-trivial interconnect in between.  I'm not sure this is going
to scale to complex SoCs.
   
   Here we go again. We're now in the very same bad spot that we've been in
   so many times before. There are at least two SoCs that we know do not
   require anything fancy, yet we're blocking adding support for those use
   cases because we think that at some point some IOMMU may require more
   than that. But at the same time it seems like we don't have enough data
   about what exactly that is, so we keep speculating. At this rate we're
   making it impossible to get a reasonable feature set supported upstream.
   
   That's very frustrating.
  
  I agree. While I just commented that I want to think through how this
  would look for other IOMMUs, the generic case of all masters that Dave
  wants is going overboard with the complexity and we'd be better off
 
 I don't want that, and actually I agree with the conclusion overboard.
 
 It was really about exploring the problem space, including things that
 we can reasonably foresee.  Any bindings today should necessarily be
 much simpler, and that's certainly the correct approach.
 
 I've been neglecting this thread (apologies) -- but although I have to
 think a bit about Thierry's most recent suggestions, I think there's
 actually reasonable alignment now.
 
  deferring this to whenever someone needs it, which I assume is never.
 
 Well, some of it might be.  But I'm not saying we should give in to
 wild speculation (except to get a feel for what we're ruling out, and
 the consequences of that).


Ok, fair enough.

If a range of Stream IDs may be issued (especially from something like
a PCIe RC where the stream ID may be a many-bit value), describing
the IDs individually may be impractical.
   
   The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has
   a need to represent the stream IDs as a range there's nothing keeping it
   from defining the specifier accordingly.
  
  If we make the IOMMU address space look like the PCI address space, we
  can have a simple representation for this in DT. For the code, I assume
  that we will always have to treat PCI and platform devices differently.
 
 Can you elaborate on that?
 
 Master ID signals in ARM systems and how they are handled are rather
 under-specified today.  Treating the master ID bits as some extra bits
 in some kind of extended bus address could work for ARM IOMMUs etc.,
 but I didn't have a complete answer yet.

The PCI binding at http://www.openfirmware.org/1275/bindings/pci/pci2_1.pdf
defines a 96-bit address space for MMIO registers and other things, so
it can uniquely identify not just an address on the bus, but also 
any standard resource as seen by the device, quoting from there:

2.2.1.1. Numerical Representation
(The Numerical Representation of an address is the format that Open
 Firmware uses for storing an address within a property value and on
 the stack, as an argument to a package method.) The numerical
 representation of a PCI address consists of three cells, encoded as
 follows. For this purpose, the least-significant 32 bits of a cell is used;
 if the cell size is larger than 32 bits, any additional high-order bits
 are zero. Bit# 0 refers to the least-significant bit.

Bit#   3322  1100 
   10987654 32109876 54321098 76543210
phys.hi cell:  npt000ss  dfff 
phys.mid cell:    
phys.lo cell:

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
 On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
 [...]
   Couldn't a single-master IOMMU be windowed?
  
  Ah, yes. That would actually be like an IBM pSeries, which has a windowed
  IOMMU but uses one window per virtual machine. In that case, the window 
  could
  be a property of the iommu node though, rather than part of the address
  in the link.
 
 Does that mean that the IOMMU has one statically configured window which
 is the same for each virtual machine? That would require some other
 mechanism to assign separate address spaces to each virtual machine,
 wouldn't it? But I suspect that if the IOMMU allows that it could be
 allocated dynamically at runtime.

The way it works on pSeries is that upon VM creation, the guest is assigned
one 256MB window for use by assigned DMA capable devices. When the guest
creates a mapping, it uses a hypercall to associate a bus address in that
range with a guest physical address. The hypervisor checks that the bus
address is within the allowed range, and translates the guest physical
address into a host physical address, then enters both into the I/O page
table or I/O TLB.

  I would like to add an explanation about dma-ranges to the binding:
  
  8
  The parent bus of the iommu must have a valid dma-ranges property
  describing how the physical address space of the IOMMU maps into
  memory.
 
 With physical address space you mean the addresses after translation,
 not the I/O virtual addresses, right? But even so, how will this work
 when there are multiple IOMMU devices? What determines which IOMMU is
 mapped via which entry?
 
 Perhaps having multiple IOMMUs implies that there will have to be some
 partitioning of the parent address space to make sure two IOMMUs don't
 translate to the same ranges?

These dma-ranges properties would almost always be for the entire RAM,
and we can treat anything else as a bug.

The mapping between what goes into the IOMMU and what comes out of it
is not reflected in DT at all, since it only happens at runtime.
The dma-ranges property I mean above describes how what comes out of
the IOMMU maps into physical memory.

  A device with an iommus property will ignore the dma-ranges property
  of the parent node and rely on the IOMMU for translation instead.
 
 Do we need to consider the case where an IOMMU listed in iommus isn't
 enabled (status = disabled)? In that case presumably the device would
 either not function or may optionally continue to master onto the parent
 untranslated.

My reasoning was that the DT should specify whether we use the IOMMU
or not. Being able to just switch on or off the IOMMU sounds nice as
well, so we could change the text above to do that.

Another option would be to do this in the IOMMU code, basically
falling back to the IOMMU parent's dma-ranges property and using
linear dma_map_ops when that is disabled.

  Using an iommus property in bus device nodes with dma-ranges
  specifying how child devices relate to the IOMMU is a possible extension
  but is not recommended until this binding gets extended.
 
 Just for my understanding, bus device nodes with iommus and dma-ranges
 properties could be equivalently written by explicitly moving the iommus
 properties into the child device nodes, right? In which case they should
 be the same as the other examples. So that concept is a convenience
 notation to reduce duplication, but doesn't fundamentally introduce any
 new concept.

The one case  where that doesn't work is PCI, because we don't list the
PCI devices in DT normally, and the iommus property would only exist
at the PCI host controller node.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Thierry Reding

On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
  On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
  [...]
Couldn't a single-master IOMMU be windowed?
   
   Ah, yes. That would actually be like an IBM pSeries, which has a windowed
   IOMMU but uses one window per virtual machine. In that case, the window 
   could
   be a property of the iommu node though, rather than part of the address
   in the link.
  
  Does that mean that the IOMMU has one statically configured window which
  is the same for each virtual machine? That would require some other
  mechanism to assign separate address spaces to each virtual machine,
  wouldn't it? But I suspect that if the IOMMU allows that it could be
  allocated dynamically at runtime.
 
 The way it works on pSeries is that upon VM creation, the guest is assigned
 one 256MB window for use by assigned DMA capable devices. When the guest
 creates a mapping, it uses a hypercall to associate a bus address in that
 range with a guest physical address. The hypervisor checks that the bus
 address is within the allowed range, and translates the guest physical
 address into a host physical address, then enters both into the I/O page
 table or I/O TLB.

So when a VM is booted it is passed a device tree with that DMA window?
Given what you describe above this seems to be more of a configuration
option to restrict the IOMMU to a subset of the physical memory for
purposes of virtualization. So I agree that this wouldn't be a good fit
for what we're trying to achieve with iommus or dma-ranges in this
binding.

   I would like to add an explanation about dma-ranges to the binding:
   
   8
   The parent bus of the iommu must have a valid dma-ranges property
   describing how the physical address space of the IOMMU maps into
   memory.
  
  With physical address space you mean the addresses after translation,
  not the I/O virtual addresses, right? But even so, how will this work
  when there are multiple IOMMU devices? What determines which IOMMU is
  mapped via which entry?
  
  Perhaps having multiple IOMMUs implies that there will have to be some
  partitioning of the parent address space to make sure two IOMMUs don't
  translate to the same ranges?
 
 These dma-ranges properties would almost always be for the entire RAM,
 and we can treat anything else as a bug.

Would it typically be a 1:1 mapping? In that case could we define an
empty dma-ranges property to mean exactly that? That would make it
consistent with the ranges property.

 The mapping between what goes into the IOMMU and what comes out of it
 is not reflected in DT at all, since it only happens at runtime.
 The dma-ranges property I mean above describes how what comes out of
 the IOMMU maps into physical memory.

Understood. That makes sense.

   A device with an iommus property will ignore the dma-ranges property
   of the parent node and rely on the IOMMU for translation instead.
  
  Do we need to consider the case where an IOMMU listed in iommus isn't
  enabled (status = disabled)? In that case presumably the device would
  either not function or may optionally continue to master onto the parent
  untranslated.
 
 My reasoning was that the DT should specify whether we use the IOMMU
 or not. Being able to just switch on or off the IOMMU sounds nice as
 well, so we could change the text above to do that.
 
 Another option would be to do this in the IOMMU code, basically
 falling back to the IOMMU parent's dma-ranges property and using
 linear dma_map_ops when that is disabled.

Yes, it should be trivial for the IOMMU core code to take care of this
special case. Still I think it's worth mentioning it in the binding so
that it's clearly specified.

   Using an iommus property in bus device nodes with dma-ranges
   specifying how child devices relate to the IOMMU is a possible extension
   but is not recommended until this binding gets extended.
  
  Just for my understanding, bus device nodes with iommus and dma-ranges
  properties could be equivalently written by explicitly moving the iommus
  properties into the child device nodes, right? In which case they should
  be the same as the other examples. So that concept is a convenience
  notation to reduce duplication, but doesn't fundamentally introduce any
  new concept.
 
 The one case  where that doesn't work is PCI, because we don't list the
 PCI devices in DT normally, and the iommus property would only exist
 at the PCI host controller node.

But it could work in classic OpenFirmware where the device tree can be
populated with the tree of PCI devices enumerated by OF. There are also
devices that have a fixed configuration and where technically the PCI
devices can be listed in the device tree. This is somewhat important if
for example one PCI device is a GPIO controller and needs to be
referenced by phandle

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Dave Martin

On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
 On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
  On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
   On Tuesday 20 May 2014 13:05:37 Thierry Reding wrote:
On Tue, May 20, 2014 at 12:04:54PM +0200, Arnd Bergmann wrote:
 On Monday 19 May 2014 22:59:46 Thierry Reding wrote:
  On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
  [...]
   You should never need #size-cells  #address-cells
  
  That was always my impression as well. But how then do you 
  represent the
  full 4 GiB address space in a 32-bit system? It starts at 0 and 
  ends at
  4 GiB - 1, which makes it 4 GiB large. That's:
  
  0 1 0
  
  With #address-cells = 1 and #size-cells = 1 the best you can do 
  is:
  
  0 0x
  
  but that's not accurate.
 
 I think we've done both in the past, either extended #size-cells or
 taken 0x as a special token. Note that in your example,
 the iommu actually needs #address-cells = 2 anyway.

But it needs #address-cells = 2 only to encode an ID in addition to
the address. If this was a single-master IOMMU then there'd be no need
for the ID.
   
   Right. But for a single-master IOMMU, there is no need to specify
   any additional data, it could have #address-cells=0 if we take the
   optimization you suggested.
  
  Couldn't a single-master IOMMU be windowed?
 
 Ah, yes. That would actually be like an IBM pSeries, which has a windowed
 IOMMU but uses one window per virtual machine. In that case, the window could
 be a property of the iommu node though, rather than part of the address
 in the link.
 
 The main advantage I think would be for IOMMUs that use the PCI b/d/f
 numbers as IDs. These can have #address-cells=3, #size-cells=2
 and have an empty dma-ranges property in the PCI host bridge node,
 and interpret this as using the same encoding as the PCI BARs in
 the ranges property.

I'm somewhat confused here, since you said earlier:

 After giving the ranges stuff some more thought, I have come to the
 conclusion that using #iommu-cells should work fine for almost
 all cases, including windowed iommus, because the window is not
 actually needed in the device, but only in the iommu, wihch is of 
 course
 free to interpret the arguments as addresses.

But now you seem to be saying that we should still be using the
#address-cells and #size-cells properties in the IOMMU node to determine
the length of the specifier.
   
   I probably wasn't clear. I think we can make it work either way, but
   my feeling is that using #address-cells/#size-cells gives us a nicer
   syntax for the more complex cases.
  
  Okay, so in summary we'd have something like this for simple cases:
  
  Required properties:
  
  - #address-cells: The number of cells in an IOMMU specifier needed to encode
an address.
  - #size-cells: The number of cells in an IOMMU specifier needed to represent
the length of an address range.
  
  Typical values for the above include:
  - #address-cells = 0, size-cells = 0: Single master IOMMU devices are 
  not
configurable and therefore no additional information needs to be encoded 
  in
the specifier. This may also apply to multiple master IOMMU devices that 
  do
not allow the association of masters to be configured.
  - #address-cells = 1, size-cells = 0: Multiple master IOMMU devices may
need to be configured in order to enable translation for a given master. 
  In
such cases the single address cell corresponds to the master device's ID.
  - #address-cells = 2, size-cells = 2: Some IOMMU devices allow the DMA
window for masters to be configured. The first cell of the address in this
may contain the master device's ID for example, while the second cell 
  could
contain the start of the DMA window for the given device. The length of 
  the
DMA window is specified by two additional cells.

I was trying to figure out how to describe the different kinds of
transformation we could have on the address/ID input to the IOMMU.
Treating the whole thing as opaque gets us off the hook there.

IDs are probably not propagated, not remapped, or we simply don't care
about them; whereas the address transformation is software-controlled,
so we don't describe that anyway.

Delegating grokking the mapping to the iommu driver makes sense --
it's what it's there for, after all.


I'm not sure whether the windowed IOMMU case is special actually.

Since the address to program into the master is found by calling the
IOMMU driver to create some mappings, does anything except the IOMMU
driver need to understand that there is windowing?

  
  Examples:
  =
  
  Single-master IOMMU:
  
  
  iommu {
  #address-cells = 0;

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Will Deacon

On Tue, May 20, 2014 at 02:23:47PM +0100, Arnd Bergmann wrote:
   Bit#   3322  1100 
  10987654 32109876 54321098 76543210
 phys.hi cell:  npt000ss  dfff 
 phys.mid cell:    
 phys.lo cell:     
 
 where:
 n is 0 if the address is relocatable, 1 otherwise
 p is 1 if the addressable region is prefetchable, 0 otherwise
 t is 1 if the address is aliased (for non-relocatable I/O),
  below 1 MB (for Memory), or below 64 KB (for relocatable I/O).
 ss is the space code, denoting the address space
  is the 8-bit Bus Number
 d is the 5-bit Device Number
 fff is the 3-bit Function Number
  is the 8-bit Register Number
 hh...hh is a 32-bit unsigned number
 ll...ll is a 32-bit unsigned number
 
 We can ignore n, p, t and r here, and use the same format for a DMA
 address, then define an empty dma-ranges property. That would
 imply that using b/d/f is sufficient to identify each master at the
 iommu. Any device outside of the PCI host but connected to the same
 iommu can use the same notation to list the logical b/d/f that gets
 sent to the IOMMU in bus master transactions.
 
 Do you think this is sufficient for the ARM SMMU, or do we need
 something beyond that?

I think it can define the common-cases for the existing implementations,
yes. I anticipate Stream-IDs becoming  16-bit in the near future though,
so we'd need extra bits if we're describing other devices coming into the
SMMU.

Note that we already have a binding for the current SMMU driver, so I'm not
really in a position to shift over to a new binding until the next version of
the SMMU architecture comes along...

Will
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Dave Martin

On Tue, May 20, 2014 at 04:26:59PM +0100, Will Deacon wrote:
 On Tue, May 20, 2014 at 02:23:47PM +0100, Arnd Bergmann wrote:
  Bit#   3322  1100 
 10987654 32109876 54321098 76543210
  phys.hi cell:  npt000ss  dfff 
  phys.mid cell:    
  phys.lo cell:     
  
  where:
  n is 0 if the address is relocatable, 1 otherwise
  p is 1 if the addressable region is prefetchable, 0 otherwise
  t is 1 if the address is aliased (for non-relocatable I/O),
   below 1 MB (for Memory), or below 64 KB (for relocatable I/O).
  ss is the space code, denoting the address space
   is the 8-bit Bus Number
  d is the 5-bit Device Number
  fff is the 3-bit Function Number
   is the 8-bit Register Number
  hh...hh is a 32-bit unsigned number
  ll...ll is a 32-bit unsigned number
  
  We can ignore n, p, t and r here, and use the same format for a DMA
  address, then define an empty dma-ranges property. That would
  imply that using b/d/f is sufficient to identify each master at the
  iommu. Any device outside of the PCI host but connected to the same
  iommu can use the same notation to list the logical b/d/f that gets
  sent to the IOMMU in bus master transactions.
  
  Do you think this is sufficient for the ARM SMMU, or do we need
  something beyond that?
 
 I think it can define the common-cases for the existing implementations,
 yes. I anticipate Stream-IDs becoming  16-bit in the near future though,
 so we'd need extra bits if we're describing other devices coming into the
 SMMU.
 
 Note that we already have a binding for the current SMMU driver, so I'm not
 really in a position to shift over to a new binding until the next version of
 the SMMU architecture comes along...

How much code relies on the meaning of the nptsbdf bits?

Following the general model of concatenating the master ID with the
address bits and trasting that as a single input to the IOMMU seems
reasonable -- maybe we either don't care about the attribute bits and
can replace them with master ID bits, or instead we could extend with
more master ID bits at bit 96 and up.

npts looks like memory type information.  What memory type information
we need to describe for coherent masters on ARM is a bit of an open
question.  Perhaps we can come up with a good ARM-specific interpretation
of these bits, but I wonder whether they are enough.

If there is a real PCI bus in the system of course, then we are fine
once we reach it.  The only problem is for the stuff on AMBA buses etc.

In theory, we could have to describe memory types, cacheability
attributes and coherency domains, but we may be able to squeeze the
possibilities down to a smaller set of sane memory types.

Cheers
---Dave
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Tuesday 20 May 2014 16:24:59 Dave Martin wrote:
 On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
   On Tue, May 20, 2014 at 01:15:48PM +0200, Arnd Bergmann wrote:
   Typical values for the above include:
   - #address-cells = 0, size-cells = 0: Single master IOMMU devices are 
   not
 configurable and therefore no additional information needs to be 
   encoded in
 the specifier. This may also apply to multiple master IOMMU devices 
   that do
 not allow the association of masters to be configured.
   - #address-cells = 1, size-cells = 0: Multiple master IOMMU devices 
   may
 need to be configured in order to enable translation for a given 
   master. In
 such cases the single address cell corresponds to the master device's 
   ID.
   - #address-cells = 2, size-cells = 2: Some IOMMU devices allow the DMA
 window for masters to be configured. The first cell of the address in 
   this
 may contain the master device's ID for example, while the second cell 
   could
 contain the start of the DMA window for the given device. The length of 
   the
 DMA window is specified by two additional cells.
 
 I was trying to figure out how to describe the different kinds of
 transformation we could have on the address/ID input to the IOMMU.
 Treating the whole thing as opaque gets us off the hook there.
 
 IDs are probably not propagated, not remapped, or we simply don't care
 about them; whereas the address transformation is software-controlled,
 so we don't describe that anyway.
 
 Delegating grokking the mapping to the iommu driver makes sense --
 it's what it's there for, after all.
 
 
 I'm not sure whether the windowed IOMMU case is special actually.
 
 Since the address to program into the master is found by calling the
 IOMMU driver to create some mappings, does anything except the IOMMU
 driver need to understand that there is windowing?

No. I tried to explain that earlier today, and in my earlier mails
I hadn't thought that part through. Only the IOMMU driver needs to care
about the window.

   
   Examples:
   =
   
   Single-master IOMMU:
   
   
 iommu {
 #address-cells = 0;
 #size-cells = 0;
 };
   
 master {
 iommus = /iommu;
 };
   
   Multiple-master IOMMU with fixed associations:
   --
   
 /* multiple-master IOMMU */
 iommu {
 /*
  * Masters are statically associated with this IOMMU and
  * address translation is always enabled.
  */
 #iommu-cells = 0;
 };
  
  copied wrong? I guess you mean #address-cells=0/#size-cells=0 here.
  
 /* static association with IOMMU */
 master@1 {
 reg = 1;
 
 Just for clarification, reg just has its standard meaning here, and
 is nothing to do with the IOMMU?

correct

 iommus = /iommu;
 
 In effect, iommus is doing the same thing as my slaves property.
 
 The way #address-cells and #size-cells determine the address and range
 size for mastering into the IOMMU is also similar.  The main difference
 is that I didn't build the ID into the address.

Right. I think the difference is more about what we want to call
things: Calling it iommu means we want to specifically describe
the case of iommus that needs to get handled by all OSs in a particular
way, while the more generic slave connection doesn't correspond to
a specific concept in the OS.

 };
   
 /* static association with IOMMU */
 master@2 {
 reg = 2;
 iommus = /iommu;
 };
   
   Multiple-master IOMMU:
   --
   
 iommu {
 /* the specifier represents the ID of the master */
 #address-cells = 1;
 #size-cells = 0;
 
 How do we know the size of the input address to the IOMMU?  Do we
 get cases for example where the IOMMU only accepts a 32-bit input
 address, but some 64-bit capable masters are connected through it?

I was stuck on this question for a while before, but then I realized
that it doesn't matter at all: It's the IOMMU driver itself that
manages the address space, and it doesn't matter if a slave can
address a larger range than the IOMMU can accept. If the IOMMU
needs to deal with the opposite case (64-bit input addresses
but a 32-bit master), that limitation can be put into the specifier.

 The size of the output address from the IOMMU will be determined
 by its own mastering destination, which by default in ePAPR is the
 IOMMU node's parent.  I think that's what you intended, and what we
 expect in this case.

Rihgt.

 For determining dma masks, it is the output address that it
 important.  Santosh's code can probably be taught to handle this,
 if given an additional traversal rule for following iommus
 properties.  However, deploying an IOMMU whose output address size
 is

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Tuesday 20 May 2014 16:00:02 Thierry Reding wrote:
 On Tue, May 20, 2014 at 03:34:46PM +0200, Arnd Bergmann wrote:
  On Tuesday 20 May 2014 15:17:43 Thierry Reding wrote:
   On Tue, May 20, 2014 at 02:41:18PM +0200, Arnd Bergmann wrote:
On Tuesday 20 May 2014 14:02:43 Thierry Reding wrote:
   [...]
 Couldn't a single-master IOMMU be windowed?

Ah, yes. That would actually be like an IBM pSeries, which has a 
windowed
IOMMU but uses one window per virtual machine. In that case, the window 
could
be a property of the iommu node though, rather than part of the address
in the link.
   
   Does that mean that the IOMMU has one statically configured window which
   is the same for each virtual machine? That would require some other
   mechanism to assign separate address spaces to each virtual machine,
   wouldn't it? But I suspect that if the IOMMU allows that it could be
   allocated dynamically at runtime.
  
  The way it works on pSeries is that upon VM creation, the guest is assigned
  one 256MB window for use by assigned DMA capable devices. When the guest
  creates a mapping, it uses a hypercall to associate a bus address in that
  range with a guest physical address. The hypervisor checks that the bus
  address is within the allowed range, and translates the guest physical
  address into a host physical address, then enters both into the I/O page
  table or I/O TLB.
 
 So when a VM is booted it is passed a device tree with that DMA window?

Correct.

 Given what you describe above this seems to be more of a configuration
 option to restrict the IOMMU to a subset of the physical memory for
 purposes of virtualization. So I agree that this wouldn't be a good fit
 for what we're trying to achieve with iommus or dma-ranges in this
 binding.

Thinking about it again now, I wonder if there are any other use cases
for windowed IOMMUs. If this is the only one, there might be no use
in the #address-cells model I suggested instead of your original
#iommu-cells.

I would like to add an explanation about dma-ranges to the binding:

8
The parent bus of the iommu must have a valid dma-ranges property
describing how the physical address space of the IOMMU maps into
memory.
   
   With physical address space you mean the addresses after translation,
   not the I/O virtual addresses, right? But even so, how will this work
   when there are multiple IOMMU devices? What determines which IOMMU is
   mapped via which entry?
   
   Perhaps having multiple IOMMUs implies that there will have to be some
   partitioning of the parent address space to make sure two IOMMUs don't
   translate to the same ranges?
  
  These dma-ranges properties would almost always be for the entire RAM,
  and we can treat anything else as a bug.
 
 Would it typically be a 1:1 mapping? In that case could we define an
 empty dma-ranges property to mean exactly that? That would make it
 consistent with the ranges property.

Yes, I believe that is how it's already defined.

A device with an iommus property will ignore the dma-ranges property
of the parent node and rely on the IOMMU for translation instead.
   
   Do we need to consider the case where an IOMMU listed in iommus isn't
   enabled (status = disabled)? In that case presumably the device would
   either not function or may optionally continue to master onto the parent
   untranslated.
  
  My reasoning was that the DT should specify whether we use the IOMMU
  or not. Being able to just switch on or off the IOMMU sounds nice as
  well, so we could change the text above to do that.
  
  Another option would be to do this in the IOMMU code, basically
  falling back to the IOMMU parent's dma-ranges property and using
  linear dma_map_ops when that is disabled.
 
 Yes, it should be trivial for the IOMMU core code to take care of this
 special case. Still I think it's worth mentioning it in the binding so
 that it's clearly specified.

Agreed.

Using an iommus property in bus device nodes with dma-ranges
specifying how child devices relate to the IOMMU is a possible extension
but is not recommended until this binding gets extended.
   
   Just for my understanding, bus device nodes with iommus and dma-ranges
   properties could be equivalently written by explicitly moving the iommus
   properties into the child device nodes, right? In which case they should
   be the same as the other examples. So that concept is a convenience
   notation to reduce duplication, but doesn't fundamentally introduce any
   new concept.
  
  The one case  where that doesn't work is PCI, because we don't list the
  PCI devices in DT normally, and the iommus property would only exist
  at the PCI host controller node.
 
 But it could work in classic OpenFirmware where the device tree can be
 populated with the tree of PCI devices enumerated by OF. There are also
 devices that have a fixed configuration and where technically the

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-20 Thread Arnd Bergmann

On Tuesday 20 May 2014 17:39:12 Dave Martin wrote:
 On Tue, May 20, 2014 at 04:26:59PM +0100, Will Deacon wrote:
  On Tue, May 20, 2014 at 02:23:47PM +0100, Arnd Bergmann wrote:
 Bit#   3322  1100 
10987654 32109876 54321098 76543210
   phys.hi cell:  npt000ss  dfff 
   phys.mid cell:    
   phys.lo cell:     
   
   where:
   n is 0 if the address is relocatable, 1 otherwise
   p is 1 if the addressable region is prefetchable, 0 otherwise
   t is 1 if the address is aliased (for non-relocatable I/O),
below 1 MB (for Memory), or below 64 KB (for relocatable I/O).
   ss is the space code, denoting the address space
    is the 8-bit Bus Number
   d is the 5-bit Device Number
   fff is the 3-bit Function Number
    is the 8-bit Register Number
   hh...hh is a 32-bit unsigned number
   ll...ll is a 32-bit unsigned number
   
   We can ignore n, p, t and r here, and use the same format for a DMA
   address, then define an empty dma-ranges property. That would
   imply that using b/d/f is sufficient to identify each master at the
   iommu. Any device outside of the PCI host but connected to the same
   iommu can use the same notation to list the logical b/d/f that gets
   sent to the IOMMU in bus master transactions.
   
   Do you think this is sufficient for the ARM SMMU, or do we need
   something beyond that?
  
  I think it can define the common-cases for the existing implementations,
  yes. I anticipate Stream-IDs becoming  16-bit in the near future though,
  so we'd need extra bits if we're describing other devices coming into the
  SMMU.
  
  Note that we already have a binding for the current SMMU driver, so I'm not
  really in a position to shift over to a new binding until the next version 
  of
  the SMMU architecture comes along...
 
 How much code relies on the meaning of the nptsbdf bits?

Not much at all, I think this was defined mostly for open firmware
client interfaces, which we don't use with FDT.

n, t and r are probably only used in PCI functions that are listed
in DT, not in the host bridge.

b/d/f I think is mostly used for the interrupt-maps property, when
describing hardwired interrupts.

p and s are used when setting up the translation for inbound MMIO
and PIO.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Arnd Bergmann

On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
 From: Thierry Reding tred...@nvidia.com
 
 This commit introduces a generic device tree binding for IOMMU devices.
 Only a very minimal subset is described here, but it is enough to cover
 the requirements of both the Exynos System MMU and Tegra SMMU as
 discussed here:
 
 https://lkml.org/lkml/2014/4/27/346
 
 More advanced functionality such as the dma-ranges property can easily
 be added in a backwards-compatible way. In the absence of a dma-ranges
 property it should be safe to default to the whole address space.
 

The basic binding looks fine, but I'd like it to be more explicit
about dma-ranges. Most importantly, what does the whole address space
mean? A lot of IOMMUs have only 32-bit bus addresses when targetted
by a bus master, it would also be normal for some to be smaller and
some might even support 64-bit.

For the upstream side, I'd hope we always have access to the full
physical memory, but since this is a brand-new binding, it should
be straightforward to just ask for upstream dma-ranges properties
to be set all the way up to the root to confirm that.

For downstream, we don't actually have a good place to put the
dma-ranges property. We can't put it into the iommu node, because
that would imply translating to the iommu's parent bus, not the
iommu's own bus space. We also can't put it into the master, because
dma-ranges is supposed to be in the parent bus. Finally, it makes
no sense to use the dma-ranges property of the master's parent bus,
because that bus isn't actually involved in the translation.

My preferred option would be to always put the address range into
the iommu descriptor, using the iommu's #address-cells.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Thierry Reding

On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
 On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
  From: Thierry Reding tred...@nvidia.com
  
  This commit introduces a generic device tree binding for IOMMU devices.
  Only a very minimal subset is described here, but it is enough to cover
  the requirements of both the Exynos System MMU and Tegra SMMU as
  discussed here:
  
  https://lkml.org/lkml/2014/4/27/346
  
  More advanced functionality such as the dma-ranges property can easily
  be added in a backwards-compatible way. In the absence of a dma-ranges
  property it should be safe to default to the whole address space.
  
 
 The basic binding looks fine, but I'd like it to be more explicit
 about dma-ranges. Most importantly, what does the whole address space
 mean?

The whole point was to leave out any mention of dma-ranges from the
binding until we've figured out more of the puzzle.

So what I was trying to avoid was another lengthy discussion on the
topic of dma-ranges. Oh well... =)

 A lot of IOMMUs have only 32-bit bus addresses when targetted
 by a bus master, it would also be normal for some to be smaller and
 some might even support 64-bit.
 
 For the upstream side, I'd hope we always have access to the full
 physical memory, but since this is a brand-new binding, it should
 be straightforward to just ask for upstream dma-ranges properties
 to be set all the way up to the root to confirm that.
 
 For downstream, we don't actually have a good place to put the
 dma-ranges property.

I'm not sure I understand what you mean by upstream and downstream in
this context.

 We can't put it into the iommu node, because  that would imply translating
 to the iommu's parent bus, not the iommu's own bus space.

My understanding was that the purpose of dma-ranges was to define a
mapping from one bus to another. So the general form of

child-address  parent-address  child-size

Would be used to translate a region of size child-size from the
child-address (the I/O address space created by the IOMMU) to the
parent-address (physical address space).

 We also can't put it into the master, because dma-ranges is supposed to be
 in the parent bus.

I don't understand. From the above I would think that the master's node
is precisely where it belongs.

 Finally, it makes no sense to use the dma-ranges property of the master's
 parent bus, because that bus isn't actually involved in the translation.

My understanding here is mostly based on the OpenFirmware working group
proposal for the dma-ranges property[0]. I'll give another example to
try and clarify how I had imagined this to work:

/ {
#address-cells = 2;
#size-cells = 2;

iommu {
/*
 * This is somewhat unusual (or maybe not) in that we
 * need 2 cells to represent the size of an address
 * space that is 32 bits long.
 */
#address-cells = 1;
#size-cells = 2;

#iommu-cells = 1;
};

master {
iommus = /iommu 42;
/*
 * Map I/O addresses 0 - 4 GiB to physical addresses
 * 2 GiB - 6 GiB.
 */
dma-ranges = 0x 0 0x8000 1 0;
};
};

This is somewhat incompatible with [0] in that #address-cells used to
parse the child address must be taken from the iommu node rather than
the child node. But that seems to me to be the only reasonable thing
to do, because after all the IOMMU creates a completely new address
space for the master.

[0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt

 My preferred option would be to always put the address range into
 the iommu descriptor, using the iommu's #address-cells.

That could become impossible to parse. I'm not sure if such hardware
actually exists, but if for some reason we have to split the address
range into two, then there's no longer any way to determine the size
needed for the specifier.

On the other hand what you propose makes it easy to represent multiple
master interfaces on a device. With a separate dma-ranges property how
can you define which ranges apply to which of the master interfaces?

Then again if address ranges can't be broken up in the first place, then
dma-ranges could be considered to be one entry per IOMMU in the iommus
property.

Thierry


pgpK6j32S5cnu.pgp
Description: PGP signature

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Dave Martin

On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote:
 On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
  On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
   From: Thierry Reding tred...@nvidia.com
   
   This commit introduces a generic device tree binding for IOMMU devices.
   Only a very minimal subset is described here, but it is enough to cover
   the requirements of both the Exynos System MMU and Tegra SMMU as
   discussed here:
   
   https://lkml.org/lkml/2014/4/27/346
   
   More advanced functionality such as the dma-ranges property can easily
   be added in a backwards-compatible way. In the absence of a dma-ranges
   property it should be safe to default to the whole address space.
   
  
  The basic binding looks fine, but I'd like it to be more explicit
  about dma-ranges. Most importantly, what does the whole address space
  mean?
 
 The whole point was to leave out any mention of dma-ranges from the
 binding until we've figured out more of the puzzle.
 
 So what I was trying to avoid was another lengthy discussion on the
 topic of dma-ranges. Oh well... =)

Apologies for the silence on this topic...  I'm trying to do too many
things at the sime time as usual :/

 
  A lot of IOMMUs have only 32-bit bus addresses when targetted
  by a bus master, it would also be normal for some to be smaller and
  some might even support 64-bit.
  
  For the upstream side, I'd hope we always have access to the full
  physical memory, but since this is a brand-new binding, it should
  be straightforward to just ask for upstream dma-ranges properties
  to be set all the way up to the root to confirm that.
  
  For downstream, we don't actually have a good place to put the
  dma-ranges property.
 
 I'm not sure I understand what you mean by upstream and downstream in
 this context.
 
  We can't put it into the iommu node, because  that would imply translating
  to the iommu's parent bus, not the iommu's own bus space.
 
 My understanding was that the purpose of dma-ranges was to define a
 mapping from one bus to another. So the general form of
 
   child-address  parent-address  child-size
 
 Would be used to translate a region of size child-size from the
 child-address (the I/O address space created by the IOMMU) to the
 parent-address (physical address space).
 
  We also can't put it into the master, because dma-ranges is supposed to be
  in the parent bus.
 
 I don't understand. From the above I would think that the master's node
 is precisely where it belongs.
 
  Finally, it makes no sense to use the dma-ranges property of the master's
  parent bus, because that bus isn't actually involved in the translation.
 
 My understanding here is mostly based on the OpenFirmware working group
 proposal for the dma-ranges property[0]. I'll give another example to
 try and clarify how I had imagined this to work:
 
   / {
   #address-cells = 2;
   #size-cells = 2;
 
   iommu {
   /*
* This is somewhat unusual (or maybe not) in that we
* need 2 cells to represent the size of an address
* space that is 32 bits long.
*/
   #address-cells = 1;
   #size-cells = 2;
 
   #iommu-cells = 1;
   };
 
   master {
   iommus = /iommu 42;

Comparing this with the other discussion thread, we have a similar
concept here, in that the iommu is made a logical parent, however...

Firstly, there's an implicit assumption here that the only kind of
thing the master could possibly be connected to is an IOMMU, with
no non-trivial interconnect in between.  I'm not sure this is going
to scale to complex SoCs.

If a range of Stream IDs may be issued (especially from something like
a PCIe RC where the stream ID may be a many-bit value), describing
the IDs individually may be impractical.

   /*
* Map I/O addresses 0 - 4 GiB to physical addresses
* 2 GiB - 6 GiB.
*/
   dma-ranges = 0x 0 0x8000 1 0;

... I concur with Arnd dma-ranges is now in the wrong place with
respect to the (overridden) child-parent relationship, and that if
this device masters to multiple destinations there's no possibility
of describing different ranges mappings for each.

   };
   };
 
 This is somewhat incompatible with [0] in that #address-cells used to
 parse the child address must be taken from the iommu node rather than
 the child node. But that seems to me to be the only reasonable thing
 to do, because after all the IOMMU creates a completely new address
 space for the master.
 
 [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt
 
  My preferred option would be to always put the address range into
  the iommu

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Arnd Bergmann

On Monday 19 May 2014 14:53:37 Thierry Reding wrote:
 On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
  On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
   From: Thierry Reding tred...@nvidia.com
   
   This commit introduces a generic device tree binding for IOMMU devices.
   Only a very minimal subset is described here, but it is enough to cover
   the requirements of both the Exynos System MMU and Tegra SMMU as
   discussed here:
   
   https://lkml.org/lkml/2014/4/27/346
   
   More advanced functionality such as the dma-ranges property can easily
   be added in a backwards-compatible way. In the absence of a dma-ranges
   property it should be safe to default to the whole address space.
   
  
  The basic binding looks fine, but I'd like it to be more explicit
  about dma-ranges. Most importantly, what does the whole address space
  mean?
 
 The whole point was to leave out any mention of dma-ranges from the
 binding until we've figured out more of the puzzle.
 
 So what I was trying to avoid was another lengthy discussion on the
 topic of dma-ranges. Oh well... =)

I think that can't work, because we need a way to specify the
ranges for some iommu drivers. We have to make sure we at least
don't prevent it from working.

  A lot of IOMMUs have only 32-bit bus addresses when targetted
  by a bus master, it would also be normal for some to be smaller and
  some might even support 64-bit.
  
  For the upstream side, I'd hope we always have access to the full
  physical memory, but since this is a brand-new binding, it should
  be straightforward to just ask for upstream dma-ranges properties
  to be set all the way up to the root to confirm that.
  
  For downstream, we don't actually have a good place to put the
  dma-ranges property.
 
 I'm not sure I understand what you mean by upstream and downstream in
 this context.

Upstream - close to memory
Downstream - close to DMA master

  We can't put it into the iommu node, because  that would imply translating
  to the iommu's parent bus, not the iommu's own bus space.
 
 My understanding was that the purpose of dma-ranges was to define a
 mapping from one bus to another. So the general form of
 
   child-address  parent-address  child-size
 
 Would be used to translate a region of size child-size from the
 child-address (the I/O address space created by the IOMMU) to the
 parent-address (physical address space).

That's how it's defined for normal buses, we haven't defined it for
IOMMUs yet.

  We also can't put it into the master, because dma-ranges is supposed to be
  in the parent bus.
 
 I don't understand. From the above I would think that the master's node
 is precisely where it belongs.

The way that I read the binding, the way that dma-ranges belongs into the parent
of the master, according. On normal buses (e.g. classic PCI), all DMA masters
see the same address space. The bus defines how a transaction started on one
of its children gets translated to the address space of its parent. This
is similar to 'ranges', which defines how an MMIO address initiated on the
parent of a bus gets translated to an address understood by its children.

My interpretation at least matches the defintion of of_dma_get_range(),
but then again Santosh added that according to my comments.

  Finally, it makes no sense to use the dma-ranges property of the master's
  parent bus, because that bus isn't actually involved in the translation.
 
 My understanding here is mostly based on the OpenFirmware working group
 proposal for the dma-ranges property[0]. I'll give another example to
 try and clarify how I had imagined this to work:
 
   / {
   #address-cells = 2;
   #size-cells = 2;
 
   iommu {
   /*
* This is somewhat unusual (or maybe not) in that we
* need 2 cells to represent the size of an address
* space that is 32 bits long.
*/
   #address-cells = 1;
   #size-cells = 2;

You should never need #size-cells  #address-cells

   #iommu-cells = 1;
   };
 
   master {
   iommus = /iommu 42;
   /*
* Map I/O addresses 0 - 4 GiB to physical addresses
* 2 GiB - 6 GiB.
*/
   dma-ranges = 0x 0 0x8000 1 0;
   };
   };
 
 This is somewhat incompatible with [0] in that #address-cells used to
 parse the child address must be taken from the iommu node rather than
 the child node. But that seems to me to be the only reasonable thing
 to do, because after all the IOMMU creates a completely new address
 space for the master.
 
 [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt

I don't think you can have a dma-ranges without a

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Thierry Reding

On Mon, May 19, 2014 at 06:22:31PM +0100, Dave Martin wrote:
 On Mon, May 19, 2014 at 01:53:37PM +0100, Thierry Reding wrote:
[...]
  My understanding here is mostly based on the OpenFirmware working group
  proposal for the dma-ranges property[0]. I'll give another example to
  try and clarify how I had imagined this to work:
  
  / {
  #address-cells = 2;
  #size-cells = 2;
  
  iommu {
  /*
   * This is somewhat unusual (or maybe not) in that we
   * need 2 cells to represent the size of an address
   * space that is 32 bits long.
   */
  #address-cells = 1;
  #size-cells = 2;
  
  #iommu-cells = 1;
  };
  
  master {
  iommus = /iommu 42;
 
 Comparing this with the other discussion thread, we have a similar
 concept here, in that the iommu is made a logical parent, however...
 
 Firstly, there's an implicit assumption here that the only kind of
 thing the master could possibly be connected to is an IOMMU, with
 no non-trivial interconnect in between.  I'm not sure this is going
 to scale to complex SoCs.

Here we go again. We're now in the very same bad spot that we've been in
so many times before. There are at least two SoCs that we know do not
require anything fancy, yet we're blocking adding support for those use
cases because we think that at some point some IOMMU may require more
than that. But at the same time it seems like we don't have enough data
about what exactly that is, so we keep speculating. At this rate we're
making it impossible to get a reasonable feature set supported upstream.

That's very frustrating.

 If a range of Stream IDs may be issued (especially from something like
 a PCIe RC where the stream ID may be a many-bit value), describing
 the IDs individually may be impractical.

The IOMMU specifier is completely specific to the IOMMU. If an IOMMU has
a need to represent the stream IDs as a range there's nothing keeping it
from defining the specifier accordingly.

Thierry


pgpklrcFhszL7.pgp
Description: PGP signature

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-19 Thread Thierry Reding

On Mon, May 19, 2014 at 08:34:07PM +0200, Arnd Bergmann wrote:
 On Monday 19 May 2014 14:53:37 Thierry Reding wrote:
  On Mon, May 19, 2014 at 12:26:35PM +0200, Arnd Bergmann wrote:
   On Friday 16 May 2014 14:23:18 Thierry Reding wrote:
From: Thierry Reding tred...@nvidia.com

This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:

https://lkml.org/lkml/2014/4/27/346

More advanced functionality such as the dma-ranges property can easily
be added in a backwards-compatible way. In the absence of a dma-ranges
property it should be safe to default to the whole address space.

   
   The basic binding looks fine, but I'd like it to be more explicit
   about dma-ranges. Most importantly, what does the whole address space
   mean?
  
  The whole point was to leave out any mention of dma-ranges from the
  binding until we've figured out more of the puzzle.
  
  So what I was trying to avoid was another lengthy discussion on the
  topic of dma-ranges. Oh well... =)
 
 I think that can't work, because we need a way to specify the
 ranges for some iommu drivers. We have to make sure we at least
 don't prevent it from working.

That was precisely why I wanted to let this out of the binding for now
so that we can actually focus on making existing hardware work rather
than speculate about what we may or may not need.

If you think the current minimal binding will cause future use-cases to
break, then we should change it to avoid that. What you're proposing is
to make the binding more complex on the assumption that we'll get it
right.

Wouldn't it be safer to keep things to the bare minimum necessary to
represent hardware that we have access to and understand now, rather
than speculating about what may be necessary at some point. I'd much
prefer to add complexity on an as-needed basis and when we have a better
understand of where we're headed.

   Finally, it makes no sense to use the dma-ranges property of the master's
   parent bus, because that bus isn't actually involved in the translation.
  
  My understanding here is mostly based on the OpenFirmware working group
  proposal for the dma-ranges property[0]. I'll give another example to
  try and clarify how I had imagined this to work:
  
  / {
  #address-cells = 2;
  #size-cells = 2;
  
  iommu {
  /*
   * This is somewhat unusual (or maybe not) in that we
   * need 2 cells to represent the size of an address
   * space that is 32 bits long.
   */
  #address-cells = 1;
  #size-cells = 2;
 
 You should never need #size-cells  #address-cells

That was always my impression as well. But how then do you represent the
full 4 GiB address space in a 32-bit system? It starts at 0 and ends at
4 GiB - 1, which makes it 4 GiB large. That's:

0 1 0

With #address-cells = 1 and #size-cells = 1 the best you can do is:

0 0x

but that's not accurate.

  #iommu-cells = 1;
  };
  
  master {
  iommus = /iommu 42;
  /*
   * Map I/O addresses 0 - 4 GiB to physical addresses
   * 2 GiB - 6 GiB.
   */
  dma-ranges = 0x 0 0x8000 1 0;
  };
  };
  
  This is somewhat incompatible with [0] in that #address-cells used to
  parse the child address must be taken from the iommu node rather than
  the child node. But that seems to me to be the only reasonable thing
  to do, because after all the IOMMU creates a completely new address
  space for the master.
  
  [0]: http://www.openfirmware.org/ofwg/proposals/Closed/Accepted/410-it.txt
 
 I don't think you can have a dma-ranges without a #address-cells and
 #size-cells property in the same node. In your example, I'd also expect
 a child node below 'master' that then interprets the address space
 made up by dma-ranges.

Okay, so what Dave and you have been saying strongly indicates that
dma-ranges isn't the right thing to use here.

 As a comment on the numbers in your example, I don't expect to ever
 see a 4GB IOMMU address space that doesn't start at an offset. Instead
 I'd expect either addresses that encode a device ID, or those that
 are just a subset of the parent address space, with non-overlapping
 bus addresses for each master.

As I understand the Tegra SMMU allows each of the clients to be assigned
a separate address space (4 GiB on earlier generations and 16 GiB on new
generations) and all addresses in each address space can be mapped to
arbitrary physical addresses.

   My preferred option would be to always put the

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-17 Thread Cho KyongHo

On Fri, 16 May 2014 14:23:18 +0200, Thierry Reding wrote:
 From: Thierry Reding tred...@nvidia.com
 
 This commit introduces a generic device tree binding for IOMMU devices.
 Only a very minimal subset is described here, but it is enough to cover
 the requirements of both the Exynos System MMU and Tegra SMMU as
 discussed here:
 
 https://lkml.org/lkml/2014/4/27/346
 
 More advanced functionality such as the dma-ranges property can easily
 be added in a backwards-compatible way. In the absence of a dma-ranges
 property it should be safe to default to the whole address space.
 
 Signed-off-by: Thierry Reding tred...@nvidia.com
 ---
  Documentation/devicetree/bindings/iommu/iommu.txt | 109 
 ++
  1 file changed, 109 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
 
 diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt 
 b/Documentation/devicetree/bindings/iommu/iommu.txt
 new file mode 100644
 index ..2d67b52b656e
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/iommu/iommu.txt
 @@ -0,0 +1,109 @@
 +This document describes the generic device tree binding for IOMMUs and their
 +master(s).
 +
 +
 +IOMMU device node:
 +==
 +
 +An IOMMU can provide the following services:
 +
 +* Remap address space to allow devices to access physical memory ranges that
 +  they otherwise wouldn't be capable of accessing.
 +
 +  Example: 32-bit DMA to 64-bit physical addresses
 +
 +* Implement scatter-gather at page level granularity so that the device does
 +  not have to.
 +
 +* Provide system protection against rogue DMA by forcing all accesses to go
 +  through the IOMMU and faulting when encountering accesses to unmapped
 +  address regions.
 +
 +* Provide address space isolation between multiple contexts.
 +
 +  Example: Virtualization
 +
 +Device nodes compatible with this binding represent hardware with some of the
 +above capabilities.
 +
 +IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
 +typically have a fixed association to the master device, whereas multiple-
 +master IOMMU devices can translate accesses from more than one master.
 +
 +Required properties:
 +
 +- #iommu-cells: The number of cells in an IOMMU specifier. The meaning of the
 +  cells is defined by the binding for the IOMMU device.
 +
 +  Typical values include:
 +  * 0: Single-master IOMMU devices are often not configurable, therefore the
 +specifying doesn't need to encode any information and can be empty.
 +
 +  * 1: Multiple-master IOMMU devices need to know for which master they 
 should
 +enable translation. Typically the single cell in the specifier 
 corresponds
 +to the master device's ID.
 +
 +
 +IOMMU master node:
 +==
 +
 +Devices that access memory through an IOMMU are called masters. A device can
 +have multiple master interfaces (to one or more IOMMU devices).
 +
 +Required properties:
 +
 +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
 +  master interfaces of the device. One entry in the list describes one master
 +  interface of the device.
 +
 +Optional properties:
 +
 +- iommu-names: A list of names identifying each entry in the iommus property.
 +
 +
 +Examples:
 +=
 +
 +Single-master IOMMU:
 +
 +
 + iommu {
 + #iommu-cells = 0;
 + };
 +
 + master {
 + iommu = /iommu;
 + };
 +

Great work, Thierry.

One simple comment.

This should be also applicable to multi-master IOMMUs that the masters
of an IOMMU is not configurable with ID or something.
I think the title needs to be changed to cover such IOMMUs which always
translate master's transactions and unable to change the configuration
of the relationship between the masters and IOMMUs by S/W.

Regards,

KyongHo

 +Multi-master IOMMU:
 +---
 +
 + iommu {
 + /* the specifier represents the ID of the master */
 + #iommu-cells = 1;
 + };
 +
 + master {
 + /* device has master ID 42 in the IOMMU */
 + iommu = /iommu 42;
 + };
 +
 +Multi-master device:
 +
 +
 + /* single-master IOMMU */
 + iommu@1 {
 + #iommu-cells = 0;
 + };
 +
 + /* multi-master IOMMU */
 + iommu@2 {
 + /* the specifier represents the ID of the master */
 + #iommu-cells = 1;
 + };
 +
 + /* device with two master interfaces */
 + master {
 + iommus = /iommu@1,/* master of the single-master IOMMU */
 +  /iommu@2 42; /* ID 42 in multi-master IOMMU */
 + };
 -- 
 1.9.2
 
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] devicetree: Add generic IOMMU device tree bindings

2014-05-17 Thread Thierry Reding

On Sat, May 17, 2014 at 05:04:55PM +0900, Cho KyongHo wrote:
 On Fri, 16 May 2014 14:23:18 +0200, Thierry Reding wrote:
[...]
  +Examples:
  +=
  +
  +Single-master IOMMU:
  +
  +
  +   iommu {
  +   #iommu-cells = 0;
  +   };
  +
  +   master {
  +   iommu = /iommu;
  +   };
  +
 
 Great work, Thierry.
 
 One simple comment.
 
 This should be also applicable to multi-master IOMMUs that the masters
 of an IOMMU is not configurable with ID or something.
 I think the title needs to be changed to cover such IOMMUs which always
 translate master's transactions and unable to change the configuration
 of the relationship between the masters and IOMMUs by S/W.

Agreed, how about we add a separate example for that case:

Multiple-master IOMMU with fixed associations:
--

/* multiple-master IOMMU */
iommu {
/*
 * Masters are statically associated with this IOMMU and
 * address translation is always enabled.
 */
#iommu-cells = 0;
};

/* static association with IOMMU */
master@1 {
reg = 1;
iommus = /iommu;
};

/* static association with IOMMU */
master@2 {
reg = 2;
iommus = /iommu;
};

?

Thierry


pgp1zkNX_WZWh.pgp
Description: PGP signature

37 matches

Mail list logo