Re: [PATCH 2/2] iommu/s390: Add support for iommu_device handling

2017-04-28 Thread kbuild test robot
Hi Joerg,

[auto build test ERROR on s390/features]
[also build test ERROR on v4.11-rc8 next-20170428]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Joerg-Roedel/iommu-s390-Fix-iommu-groups-and-add-sysfs-support/20170429-003440
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
config: s390-allmodconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=s390 

All errors (new ones prefixed by >>):

   In file included from include/linux/pci.h:1627:0,
from include/trace/events/iommu.h:14,
from include/linux/iommu.h:27,
from drivers/s390/cio/vfio_ccw_cp.c:12:
>> arch/s390/include/asm/pci.h:127:22: error: field 'iommu_dev' has incomplete 
>> type
 struct iommu_device iommu_dev;  /* IOMMU core handle */
 ^

vim +/iommu_dev +127 arch/s390/include/asm/pci.h

   121  unsigned long   *iommu_bitmap;
   122  unsigned long   *lazy_bitmap;
   123  unsigned long   iommu_size;
   124  unsigned long   iommu_pages;
   125  unsigned intnext_bit;
   126  
 > 127  struct iommu_device iommu_dev;  /* IOMMU core handle */
   128  struct iommu_group *group;  /* IOMMU group for all devices 
behind this zdev */
   129  
   130  char res_name[16];

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH 0/2] iommu/s390: Fix iommu-groups and add sysfs support

2017-04-28 Thread Joerg Roedel
Hi Gerald,

On Fri, Apr 28, 2017 at 08:06:12PM +0200, Gerald Schaefer wrote:
> On Fri, 28 Apr 2017 16:55:13 +0200
> Joerg Roedel  wrote:

> Also, IIRC, add_device will get called before attach_dev. Currently we
> allow to attach more than one device (apparently from different buses) to
> one domain (one shared DMA table) in attach_dev. But then it would be too
> late to also add all devices to the same iommu-group. That would have had
> to be done earlier in add_device, but there we don't know yet if a shared
> DMA table would be set up later in attach_dev.

I think there is some misunderstanding here about what iommu-groups are.
An iommu-group is a group of devices that are not isolated from each
other wrt. DMA and/or IRQs. This means that the devices can influence
each other, e.g. directly DMA to each other without IOMMU control. So
the grouping relies on how the hardware is built.

Domains on the other side are a software controled concept. A domain is
basically an abstraction of a DMA address space. Multiple devices can
share on domain/address-space, just as multiple threads can share a
cpu address space.

The point of iommu-grouping is to make sure that we don't assign
different domains to devices in the same group, as that could break or
cause security issues without proper isolation.


Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/2] iommu/s390: Fix iommu-groups and add sysfs support

2017-04-28 Thread Joerg Roedel
On Fri, Apr 28, 2017 at 05:25:20PM +0200, Sebastian Ott wrote:
> On Fri, 28 Apr 2017, Joerg Roedel wrote:
> > That sounds special :) So will every function of a single device end up
> > as a seperate device on a seperate root-bus?
> 
> Yes. That's true even for multi-function and SRIOV.

Okay, so its truly special :) Thanks for your confirmation.


Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/vt-d: Don't print the failure message when booting non-kdump kernel

2017-04-28 Thread Joerg Roedel
On Fri, Apr 28, 2017 at 01:16:15AM +0800, Qiuxu Zhuo wrote:
> When booting a new non-kdump kernel, we have below failure message:
> 
> [0.004000] DMAR-IR: IRQ remapping was enabled on dmar2 but we are not in 
> kdump mode
> [0.004000] DMAR-IR: Failed to copy IR table for dmar2 from previous kernel
> [0.004000] DMAR-IR: IRQ remapping was enabled on dmar1 but we are not in 
> kdump mode
> [0.004000] DMAR-IR: Failed to copy IR table for dmar1 from previous kernel
> [0.004000] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in 
> kdump mode
> [0.004000] DMAR-IR: Failed to copy IR table for dmar0 from previous kernel
> [0.004000] DMAR-IR: IRQ remapping was enabled on dmar3 but we are not in 
> kdump mode
> [0.004000] DMAR-IR: Failed to copy IR table for dmar3 from previous kernel
> 
> For non-kdump case, we no need to copy IR table from previous kernel
> so it's nonthing actually failed. To be less alarming or misleading,
> do not print "DMAR-IR: Failed to copy IR table for dmar[0-9] from
> previous kernel" messages when booting non-kdump kernel.
> 
> Signed-off-by: Qiuxu Zhuo 
> ---
>  drivers/iommu/intel_irq_remapping.c | 15 ++-
>  1 file changed, 6 insertions(+), 9 deletions(-)

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/2] iommu/s390: Fix iommu-groups and add sysfs support

2017-04-28 Thread Gerald Schaefer
On Fri, 28 Apr 2017 16:55:13 +0200
Joerg Roedel  wrote:

> > I am however a bit confused now, about how we would have allowed group
> > sharing with the current s390 IOMMU code, or IOW in which scenario would
> > iommu_group_get() in the add_device callback find a shareable iommu-group?  
> 
> The usual way to do this is to use the iommu_group_get_for_dev()
> function, which invokes the iommu_ops->device_group call-back of the
> driver to find a matching group or allocating a new one.
> 
> There are ready-to-use functions for this call-back already:
> 
>   1) generic_device_group() - which just allocates a new group for
>  the device. This is usually used outside of PCI
> 
>   2) pci_device_group() - Which walks the PCI hierarchy to find
>  devices that are not isolated and uses the matching group for
>  its isolation domain.
> 
> A few drivers have their own versions of this call-back, but those are
> IOMMU drivers supporting multiple bus-types and need to find the right
> way to determine the group first.
> 
> > So, I guess we may have an issue with not sharing iommu-groups when
> > it could make sense to do so. But your patch would not fix this, as
> > we still would allocate separate iommu-groups for all functions.  
> 
> Yes, but the above approach won't help when each function ends up on a
> seperate bus because the code looks for different functions that are
> enumerated as such. Anyway, some more insight into how this enumeration
> works on s390 would be great :)

Since Sebastian confirmed this, it looks like we do not really have any
enumeration when there is a separate bus for each function.

Also, IIRC, add_device will get called before attach_dev. Currently we
allow to attach more than one device (apparently from different buses) to
one domain (one shared DMA table) in attach_dev. But then it would be too
late to also add all devices to the same iommu-group. That would have had
to be done earlier in add_device, but there we don't know yet if a shared
DMA table would be set up later in attach_dev.

So, it looks to me that we cannot provide correct iommu-group sharing
on s390, even though we allow iommu-domain sharing, which sounds odd. Since
this "shared domain / DMA table" option in attach_dev was only added
because at that time I thought that was a hard requirement for any arch-
specific IOMMU API implementation, maybe there was some misunderstanding.

It would make the code easier (and more consistent with the s390 hardware)
if I would just remove that option from attach_dev, and allow only one
device/function per iommu-domain. What do you think, could this be removed
for s390, or is there any common code requirement for providing that option
(and is it OK that we have separate iommu-groups in this case)?

Regards,
Gerald

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/s390: Fix IOMMU groups

2017-04-28 Thread kbuild test robot
Hi Joerg,

[auto build test ERROR on s390/features]
[also build test ERROR on v4.11-rc8 next-20170428]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Joerg-Roedel/iommu-s390-Fix-iommu-groups-and-add-sysfs-support/20170429-003440
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
config: s390-default_defconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget 
https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=s390 

All errors (new ones prefixed by >>):

   In file included from include/trace/define_trace.h:95:0,
from include/trace/events/iommu.h:167,
from include/linux/iommu.h:27,
from arch/s390/include/asm/pci.h:11,
from include/linux/pci.h:1627,
from include/trace/events/iommu.h:14,
from drivers/iommu/iommu-traces.c:12:
   include/trace/events/iommu.h: In function 
'ftrace_test_probe_add_device_to_group':
   include/trace/trace_events.h:716:2: error: implicit declaration of function 
'check_trace_callback_type_add_device_to_group' 
[-Werror=implicit-function-declaration]
 check_trace_callback_type_##call(trace_event_raw_event_##template); \
 ^
   include/trace/events/iommu.h:39:1: note: in expansion of macro 'DEFINE_EVENT'
DEFINE_EVENT(iommu_group_event, add_device_to_group,
^~~~
   include/trace/events/iommu.h: In function 
'ftrace_test_probe_remove_device_from_group':
   include/trace/trace_events.h:716:2: error: implicit declaration of function 
'check_trace_callback_type_remove_device_from_group' 
[-Werror=implicit-function-declaration]
 check_trace_callback_type_##call(trace_event_raw_event_##template); \
 ^
   include/trace/events/iommu.h:47:1: note: in expansion of macro 'DEFINE_EVENT'
DEFINE_EVENT(iommu_group_event, remove_device_from_group,
^~~~
   include/trace/events/iommu.h: In function 
'ftrace_test_probe_attach_device_to_domain':
   include/trace/trace_events.h:716:2: error: implicit declaration of function 
'check_trace_callback_type_attach_device_to_domain' 
[-Werror=implicit-function-declaration]
 check_trace_callback_type_##call(trace_event_raw_event_##template); \
 ^
   include/trace/events/iommu.h:72:1: note: in expansion of macro 'DEFINE_EVENT'
DEFINE_EVENT(iommu_device_event, attach_device_to_domain,
^~~~
   include/trace/events/iommu.h: In function 
'ftrace_test_probe_detach_device_from_domain':
   include/trace/trace_events.h:716:2: error: implicit declaration of function 
'check_trace_callback_type_detach_device_from_domain' 
[-Werror=implicit-function-declaration]
 check_trace_callback_type_##call(trace_event_raw_event_##template); \
 ^
   include/trace/events/iommu.h:79:1: note: in expansion of macro 'DEFINE_EVENT'
DEFINE_EVENT(iommu_device_event, detach_device_from_domain,
^~~~
   include/trace/events/iommu.h: In function 'ftrace_test_probe_map':
   include/trace/trace_events.h:716:2: error: implicit declaration of function 
'check_trace_callback_type_map' [-Werror=implicit-function-declaration]
 check_trace_callback_type_##call(trace_event_raw_event_##template); \
 ^
   include/trace/trace_events.h:66:2: note: in expansion of macro 'DEFINE_EVENT'
 DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
 ^~~~
   include/trace/events/iommu.h:86:1: note: in expansion of macro 'TRACE_EVENT'
TRACE_EVENT(map,
^~~
   include/trace/events/iommu.h: In function 'ftrace_test_probe_unmap':
   include/trace/trace_events.h:716:2: error: implicit declaration of function 
'check_trace_callback_type_unmap' [-Werror=implicit-function-declaration]
 check_trace_callback_type_##call(trace_event_raw_event_##template); \
 ^
   include/trace/trace_events.h:66:2: note: in expansion of macro 'DEFINE_EVENT'
 DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
 ^~~~
   include/trace/events/iommu.h:109:1: note: in expansion of macro 'TRACE_EVENT'
TRACE_EVENT(unmap,
^~~
   include/trace/events/iommu.h: In function 'ftrace_test_probe_io_page_fault':
   include/trace/trace_events.h:716:2: error: implicit declaration of function 
'check_trace_callback_type_io_page_fault' 
[-Werror=implicit-function-declaration]
 check_trace_callback_type_##call(trace_event_raw_event_##template); \
 ^
   include/trace/events/iommu.h:158:1: note: in expan

[PATCH] ACPI/IORT: Fix CONFIG_IOMMU_API dependency

2017-04-28 Thread Lorenzo Pieralisi
The IOMMU probe deferral IORT rework had to add code in
iort_iommu_configure() and iort_iommu_xlate() that requires
the IOMMU_API to be selected in order to compile and work.

Stub out the pieces of code that depend on CONFIG_IOMMU_API
to be selected to prevent compilation failures such as:

drivers/acpi/arm64/iort.c: In function 'iort_iommu_xlate':
drivers/acpi/arm64/iort.c:647:22: error: 'struct iommu_fwspec' has no
member named 'ops'

by wrapping the code in static inline functions that provide a NOP
implementation when CONFIG_IOMMU_API is not selected.

Signed-off-by: Lorenzo Pieralisi 
Reported-by: Arnd Bergmann 
Cc: Arnd Bergmann 
Cc: Will Deacon 
Cc: Catalin Marinas 
Cc: Robin Murphy 
Cc: Joerg Roedel 
Cc: Sricharan R 
---
Joerg, all,

as reported by Arnd:

https://lkml.org/lkml/2017/4/27/303

we have this issue in -next and I managed to write this patch in a
way that applies to Joerg's tree - branch:

arm/core

and does not conflict with ACPI IORT patches queued via arm64 tree.

I know it is late and it is not a major breakage, so either Joerg
applies this patch to his arm/core branch or I can send a fix for -rc1
when both the IOMMU and arm64 trees are merged, please let me know how
you prefer handling it, it is a bit rushed owing to where we are in
the cycle.

Thanks !
Lorenzo

 drivers/acpi/arm64/iort.c | 44 ++--
 1 file changed, 34 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e7b1940..a629e83 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -536,6 +536,33 @@ static inline bool iort_iommu_driver_enabled(u8 type)
}
 }
 
+#ifdef CONFIG_IOMMU_API
+static inline
+const struct iommu_ops *iort_fwspec_iommu_ops(struct iommu_fwspec *fwspec)
+{
+   return (fwspec && fwspec->ops) ? fwspec->ops : NULL;
+}
+
+static inline
+int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
+{
+   int err = 0;
+
+   if (!IS_ERR_OR_NULL(ops) && ops->add_device && dev->bus &&
+   !dev->iommu_group)
+   err = ops->add_device(dev);
+
+   return err;
+}
+#else
+static inline
+const struct iommu_ops *iort_fwspec_iommu_ops(struct iommu_fwspec *fwspec)
+{ return NULL; }
+static inline
+int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
+{ return 0; }
+#endif
+
 static const struct iommu_ops *iort_iommu_xlate(struct device *dev,
struct acpi_iort_node *node,
u32 streamid)
@@ -543,14 +570,14 @@ static const struct iommu_ops *iort_iommu_xlate(struct 
device *dev,
const struct iommu_ops *ops = NULL;
int ret = -ENODEV;
struct fwnode_handle *iort_fwnode;
-   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
 
/*
 * If we already translated the fwspec there
 * is nothing left to do, return the iommu_ops.
 */
-   if (fwspec && fwspec->ops)
-   return fwspec->ops;
+   ops = iort_fwspec_iommu_ops(dev->iommu_fwspec);
+   if (ops)
+   return ops;
 
if (node) {
iort_fwnode = iort_get_fwnode(node);
@@ -611,6 +638,7 @@ const struct iommu_ops *iort_iommu_configure(struct device 
*dev)
struct acpi_iort_node *node, *parent;
const struct iommu_ops *ops = NULL;
u32 streamid = 0;
+   int err;
 
if (dev_is_pci(dev)) {
struct pci_bus *bus = to_pci_dev(dev)->bus;
@@ -654,13 +682,9 @@ const struct iommu_ops *iort_iommu_configure(struct device 
*dev)
 * If we have reason to believe the IOMMU driver missed the initial
 * add_device callback for dev, replay it to get things in order.
 */
-   if (!IS_ERR_OR_NULL(ops) && ops->add_device &&
-   dev->bus && !dev->iommu_group) {
-   int err = ops->add_device(dev);
-
-   if (err)
-   ops = ERR_PTR(err);
-   }
+   err = iort_add_device_replay(ops, dev);
+   if (err)
+   ops = ERR_PTR(err);
 
return ops;
 }
-- 
2.10.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/2] iommu/s390: Fix iommu-groups and add sysfs support

2017-04-28 Thread Sebastian Ott
On Fri, 28 Apr 2017, Joerg Roedel wrote:
> On Fri, Apr 28, 2017 at 02:46:34PM +0200, Gerald Schaefer wrote:
> > On Thu, 27 Apr 2017 23:03:25 +0200
> > Joerg Roedel  wrote:
> > 
> > > > Well, there is a separate zpci_dev for each pci_dev on s390,
> > > > and each of those has its own separate dma-table (thus not shared).  
> > > 
> > > Is that true for all functions of a PCIe card, so does every function of
> > > a device has its own zpci_dev structure and thus its own DMA-table?
> > 
> > Yes, clp_add_pci_device() is called for every function, which in turn calls
> > zpci_create_device() with a freshly allocated zdev. zpci_enable_device()
> > then sets up a new DMA address space for each function.
> 
> That sounds special :) So will every function of a single device end up
> as a seperate device on a seperate root-bus?

Yes. That's true even for multi-function and SRIOV.

> > > My assumption came from the fact that the zpci_dev is read from
> > > pci_dev->sysdata, which is propagated there from the pci_bridge
> > > through the pci_root_bus structures.
> > 
> > The zdev gets there via zpci_create_device() -> zpci_scan_bus() ->
> > pci_scan_root_bus(), which is done for every single function.
> > 
> > Not sure if I understand this right, but it looks like we set up a new PCI
> > bus for each function.
> 
> Yeah, it sounds like this. Maybe Sebastian can confirm that?

Yes. Confirmed.

Regards,
Sebastian

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/2] iommu/s390: Fix iommu-groups and add sysfs support

2017-04-28 Thread Joerg Roedel
Hi Gerald,

On Fri, Apr 28, 2017 at 02:46:34PM +0200, Gerald Schaefer wrote:
> On Thu, 27 Apr 2017 23:03:25 +0200
> Joerg Roedel  wrote:
> 
> > > Well, there is a separate zpci_dev for each pci_dev on s390,
> > > and each of those has its own separate dma-table (thus not shared).  
> > 
> > Is that true for all functions of a PCIe card, so does every function of
> > a device has its own zpci_dev structure and thus its own DMA-table?
> 
> Yes, clp_add_pci_device() is called for every function, which in turn calls
> zpci_create_device() with a freshly allocated zdev. zpci_enable_device()
> then sets up a new DMA address space for each function.

That sounds special :) So will every function of a single device end up
as a seperate device on a seperate root-bus?

> > My assumption came from the fact that the zpci_dev is read from
> > pci_dev->sysdata, which is propagated there from the pci_bridge
> > through the pci_root_bus structures.
> 
> The zdev gets there via zpci_create_device() -> zpci_scan_bus() ->
> pci_scan_root_bus(), which is done for every single function.
> 
> Not sure if I understand this right, but it looks like we set up a new PCI
> bus for each function.

Yeah, it sounds like this. Maybe Sebastian can confirm that?

> I am however a bit confused now, about how we would have allowed group
> sharing with the current s390 IOMMU code, or IOW in which scenario would
> iommu_group_get() in the add_device callback find a shareable iommu-group?

The usual way to do this is to use the iommu_group_get_for_dev()
function, which invokes the iommu_ops->device_group call-back of the
driver to find a matching group or allocating a new one.

There are ready-to-use functions for this call-back already:

1) generic_device_group() - which just allocates a new group for
   the device. This is usually used outside of PCI

2) pci_device_group() - Which walks the PCI hierarchy to find
   devices that are not isolated and uses the matching group for
   its isolation domain.

A few drivers have their own versions of this call-back, but those are
IOMMU drivers supporting multiple bus-types and need to find the right
way to determine the group first.

> So, I guess we may have an issue with not sharing iommu-groups when
> it could make sense to do so. But your patch would not fix this, as
> we still would allocate separate iommu-groups for all functions.

Yes, but the above approach won't help when each function ends up on a
seperate bus because the code looks for different functions that are
enumerated as such. Anyway, some more insight into how this enumeration
works on s390 would be great :)


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/s390: Fix IOMMU groups

2017-04-28 Thread Joerg Roedel
On Fri, Apr 28, 2017 at 03:20:17PM +0200, Gerald Schaefer wrote:
> On Thu, 27 Apr 2017 23:12:32 +0200
> Joerg Roedel  wrote:

> > This is the way to free an iommu-group. It was missing before probably
> > because it was unclear whether the add_device function allocated a group
> > or not. So there was no way to know if it needs to be put again in the
> > remove_device function.
> 
> Hmm, for the reference count it should not matter whether a new group was
> allocated or an existing group found with iommu_group_get(). Our add_device
> callback always gets one reference either from iommu_group_get or _alloc,
> and then another one from iommu_group_add_device(), after which the first
> reference is put again. So there should always be one reference more after
> a successful add_device.

Right, my statement above is wrong. The current code is fine, it gets a
reference to the group with iommu_group_get/iommu_group_alloc, attaches
the device to the group (which takes a reference to the group of its
own), and in the end it drops its local reference.

When the device->group link is broken up in the remove_device function,
that reference is also dropped. So everything is fine. The additional
iommu_group_put() in my patch is wrong.


Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] drivers/of_iommu: ignore SMMU DT nodes with status 'disabled'

2017-04-28 Thread Ard Biesheuvel
On 28 April 2017 at 14:17, Will Deacon  wrote:
> On Fri, Apr 28, 2017 at 02:14:49PM +0100, Ard Biesheuvel wrote:
>> On 28 April 2017 at 14:11, Will Deacon  wrote:
>> > Hi Ard,
>> >
>> > [+ devicetree@]
>> >
>> > On Fri, Apr 14, 2017 at 01:43:15PM +0100, Ard Biesheuvel wrote:
>> >> DT nodes may have a status property, and if they do, such nodes should
>> >> only be considered present if the status property is set to 'okay'.
>> >>
>> >> Currently, we call the init function of IOMMUs described by the device
>> >> tree without taking this into account, which may result in the output
>> >> below on systems where some SMMUs may be legally disabled.
>> >>
>> >>  Failed to initialise IOMMU /smb/smmu@e020
>> >>  Failed to initialise IOMMU /smb/smmu@e0c0
>> >>  arm-smmu e0a0.smmu: probing hardware configuration...
>> >>  arm-smmu e0a0.smmu: SMMUv1 with:
>> >>  arm-smmu e0a0.smmu:  stage 2 translation
>> >>  arm-smmu e0a0.smmu:  coherent table walk
>> >>  arm-smmu e0a0.smmu:  stream matching with 32 register groups, mask 
>> >> 0x7fff
>> >>  arm-smmu e0a0.smmu:  8 context banks (8 stage-2 only)
>> >>  arm-smmu e0a0.smmu:  Supported page sizes: 0x60211000
>> >>  arm-smmu e0a0.smmu:  Stage-2: 40-bit IPA -> 40-bit PA
>> >>  Failed to initialise IOMMU /smb/smmu@e060
>> >>  Failed to initialise IOMMU /smb/smmu@e080
>> >>
>> >> Since this is not an error condition, only call the init function if
>> >> the device is enabled, which also inhibits the spurious error messages.
>> >>
>> >> Signed-off-by: Ard Biesheuvel 
>> >> ---
>> >>  drivers/iommu/of_iommu.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> >> index 2683e9fc0dcf..2dd1206e6c0d 100644
>> >> --- a/drivers/iommu/of_iommu.c
>> >> +++ b/drivers/iommu/of_iommu.c
>> >> @@ -183,7 +183,7 @@ static int __init of_iommu_init(void)
>> >>   for_each_matching_node_and_match(np, matches, &match) {
>> >>   const of_iommu_init_fn init_fn = match->data;
>> >>
>> >> - if (init_fn(np))
>> >> + if (of_device_is_available(np) && init_fn(np))
>> >>   pr_err("Failed to initialise IOMMU %s\n",
>> >>   of_node_full_name(np));
>> >>   }
>> >
>> > Is there a definition of what status = "disabled" is supposed to mean for 
>> > an
>> > IOMMU? For example, that could mean that the firmware has pre-programmed 
>> > the
>> > SMMU with particular translations or memory attributes (a bit like the
>> > CCA=1, CPM=1, DACS=0 case in ACPI IORT), or even disabled DMA traffic
>> > altogether.
>> >
>> > So I think we'd need an update to the generic IOMMU binding text to say
>> > exactly what the semantics are supposed to be here.
>> >
>>
>> I agree that it might make sense to describe the behavior of the IOMMU
>> when it is left in the state we found it in. But that is not the same
>> as status=disabled.
>>
>> The DTS subtree contains loads and loads of boilerplate
>> configurations, where only some pieces are enabled in the final image
>> by setting status=okay. So a node that has status 'disabled' should be
>> treated as 'not present', not as 'present but can be ignored under
>> assumptions such and such'
>>
>> In other words, I think we are talking about two different issues here.
>
> I'm not so sure... if we have a master device that has an iommus= property
> pointing to an IOMMU with status="disabled", I really don't know whether we
> should:
>
>   1. Assume the master can do DMA with a 1:1 mapping of memory and no
>  changes to memory attributes
>
>   2. Assume the master can do DMA with a 1:1 mapping of memory, but
>  potentially with changes to the attributes
>
>   3. Assume the master can do DMA, but with some pre-existing translation
>  (what?)
>
>   4. Assume the master can't do DMA
>
> and I also don't know whether the "dma-coherent" property remains valid.
>

Ah yes. Good point.

So indeed, there should be some IOMMU specific status property that
can convey all of the above, or 1. and 4. at the minimum
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/s390: Fix IOMMU groups

2017-04-28 Thread Gerald Schaefer
On Thu, 27 Apr 2017 23:12:32 +0200
Joerg Roedel  wrote:

> On Thu, Apr 27, 2017 at 08:11:42PM +0200, Gerald Schaefer wrote:
> > > +void zpci_destroy_iommu(struct zpci_dev *zdev)
> > > +{
> > > + iommu_group_put(zdev->group);
> > > + zdev->group = NULL;
> > > +}  
> > 
> > While the rest of this patch doesn't seem to make much of a difference to
> > the current behavior, I'm wondering where this extra iommu_group_put()
> > comes from. It either was erroneously missing before this patch, or it
> > is erroneously introduced by this patch.  
> 
> This is the way to free an iommu-group. It was missing before probably
> because it was unclear whether the add_device function allocated a group
> or not. So there was no way to know if it needs to be put again in the
> remove_device function.

Hmm, for the reference count it should not matter whether a new group was
allocated or an existing group found with iommu_group_get(). Our add_device
callback always gets one reference either from iommu_group_get or _alloc,
and then another one from iommu_group_add_device(), after which the first
reference is put again. So there should always be one reference more after
a successful add_device.

Now I'm wondering where this one reference is put again, and I thought
that happened in the remove_device callback, where we call
iommu_group_remove_device(). Is this not correct? Just want to make sure
that we don't have a refcount issue in the current code.

Regards,
Gerald

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] drivers/of_iommu: ignore SMMU DT nodes with status 'disabled'

2017-04-28 Thread Will Deacon
On Fri, Apr 28, 2017 at 02:14:49PM +0100, Ard Biesheuvel wrote:
> On 28 April 2017 at 14:11, Will Deacon  wrote:
> > Hi Ard,
> >
> > [+ devicetree@]
> >
> > On Fri, Apr 14, 2017 at 01:43:15PM +0100, Ard Biesheuvel wrote:
> >> DT nodes may have a status property, and if they do, such nodes should
> >> only be considered present if the status property is set to 'okay'.
> >>
> >> Currently, we call the init function of IOMMUs described by the device
> >> tree without taking this into account, which may result in the output
> >> below on systems where some SMMUs may be legally disabled.
> >>
> >>  Failed to initialise IOMMU /smb/smmu@e020
> >>  Failed to initialise IOMMU /smb/smmu@e0c0
> >>  arm-smmu e0a0.smmu: probing hardware configuration...
> >>  arm-smmu e0a0.smmu: SMMUv1 with:
> >>  arm-smmu e0a0.smmu:  stage 2 translation
> >>  arm-smmu e0a0.smmu:  coherent table walk
> >>  arm-smmu e0a0.smmu:  stream matching with 32 register groups, mask 
> >> 0x7fff
> >>  arm-smmu e0a0.smmu:  8 context banks (8 stage-2 only)
> >>  arm-smmu e0a0.smmu:  Supported page sizes: 0x60211000
> >>  arm-smmu e0a0.smmu:  Stage-2: 40-bit IPA -> 40-bit PA
> >>  Failed to initialise IOMMU /smb/smmu@e060
> >>  Failed to initialise IOMMU /smb/smmu@e080
> >>
> >> Since this is not an error condition, only call the init function if
> >> the device is enabled, which also inhibits the spurious error messages.
> >>
> >> Signed-off-by: Ard Biesheuvel 
> >> ---
> >>  drivers/iommu/of_iommu.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> >> index 2683e9fc0dcf..2dd1206e6c0d 100644
> >> --- a/drivers/iommu/of_iommu.c
> >> +++ b/drivers/iommu/of_iommu.c
> >> @@ -183,7 +183,7 @@ static int __init of_iommu_init(void)
> >>   for_each_matching_node_and_match(np, matches, &match) {
> >>   const of_iommu_init_fn init_fn = match->data;
> >>
> >> - if (init_fn(np))
> >> + if (of_device_is_available(np) && init_fn(np))
> >>   pr_err("Failed to initialise IOMMU %s\n",
> >>   of_node_full_name(np));
> >>   }
> >
> > Is there a definition of what status = "disabled" is supposed to mean for an
> > IOMMU? For example, that could mean that the firmware has pre-programmed the
> > SMMU with particular translations or memory attributes (a bit like the
> > CCA=1, CPM=1, DACS=0 case in ACPI IORT), or even disabled DMA traffic
> > altogether.
> >
> > So I think we'd need an update to the generic IOMMU binding text to say
> > exactly what the semantics are supposed to be here.
> >
> 
> I agree that it might make sense to describe the behavior of the IOMMU
> when it is left in the state we found it in. But that is not the same
> as status=disabled.
> 
> The DTS subtree contains loads and loads of boilerplate
> configurations, where only some pieces are enabled in the final image
> by setting status=okay. So a node that has status 'disabled' should be
> treated as 'not present', not as 'present but can be ignored under
> assumptions such and such'
> 
> In other words, I think we are talking about two different issues here.

I'm not so sure... if we have a master device that has an iommus= property
pointing to an IOMMU with status="disabled", I really don't know whether we
should:

  1. Assume the master can do DMA with a 1:1 mapping of memory and no
 changes to memory attributes

  2. Assume the master can do DMA with a 1:1 mapping of memory, but
 potentially with changes to the attributes

  3. Assume the master can do DMA, but with some pre-existing translation
 (what?)

  4. Assume the master can't do DMA

and I also don't know whether the "dma-coherent" property remains valid.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] drivers/of_iommu: ignore SMMU DT nodes with status 'disabled'

2017-04-28 Thread Ard Biesheuvel
On 28 April 2017 at 14:11, Will Deacon  wrote:
> Hi Ard,
>
> [+ devicetree@]
>
> On Fri, Apr 14, 2017 at 01:43:15PM +0100, Ard Biesheuvel wrote:
>> DT nodes may have a status property, and if they do, such nodes should
>> only be considered present if the status property is set to 'okay'.
>>
>> Currently, we call the init function of IOMMUs described by the device
>> tree without taking this into account, which may result in the output
>> below on systems where some SMMUs may be legally disabled.
>>
>>  Failed to initialise IOMMU /smb/smmu@e020
>>  Failed to initialise IOMMU /smb/smmu@e0c0
>>  arm-smmu e0a0.smmu: probing hardware configuration...
>>  arm-smmu e0a0.smmu: SMMUv1 with:
>>  arm-smmu e0a0.smmu:  stage 2 translation
>>  arm-smmu e0a0.smmu:  coherent table walk
>>  arm-smmu e0a0.smmu:  stream matching with 32 register groups, mask 
>> 0x7fff
>>  arm-smmu e0a0.smmu:  8 context banks (8 stage-2 only)
>>  arm-smmu e0a0.smmu:  Supported page sizes: 0x60211000
>>  arm-smmu e0a0.smmu:  Stage-2: 40-bit IPA -> 40-bit PA
>>  Failed to initialise IOMMU /smb/smmu@e060
>>  Failed to initialise IOMMU /smb/smmu@e080
>>
>> Since this is not an error condition, only call the init function if
>> the device is enabled, which also inhibits the spurious error messages.
>>
>> Signed-off-by: Ard Biesheuvel 
>> ---
>>  drivers/iommu/of_iommu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> index 2683e9fc0dcf..2dd1206e6c0d 100644
>> --- a/drivers/iommu/of_iommu.c
>> +++ b/drivers/iommu/of_iommu.c
>> @@ -183,7 +183,7 @@ static int __init of_iommu_init(void)
>>   for_each_matching_node_and_match(np, matches, &match) {
>>   const of_iommu_init_fn init_fn = match->data;
>>
>> - if (init_fn(np))
>> + if (of_device_is_available(np) && init_fn(np))
>>   pr_err("Failed to initialise IOMMU %s\n",
>>   of_node_full_name(np));
>>   }
>
> Is there a definition of what status = "disabled" is supposed to mean for an
> IOMMU? For example, that could mean that the firmware has pre-programmed the
> SMMU with particular translations or memory attributes (a bit like the
> CCA=1, CPM=1, DACS=0 case in ACPI IORT), or even disabled DMA traffic
> altogether.
>
> So I think we'd need an update to the generic IOMMU binding text to say
> exactly what the semantics are supposed to be here.
>

I agree that it might make sense to describe the behavior of the IOMMU
when it is left in the state we found it in. But that is not the same
as status=disabled.

The DTS subtree contains loads and loads of boilerplate
configurations, where only some pieces are enabled in the final image
by setting status=okay. So a node that has status 'disabled' should be
treated as 'not present', not as 'present but can be ignored under
assumptions such and such'

In other words, I think we are talking about two different issues here.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] drivers/of_iommu: ignore SMMU DT nodes with status 'disabled'

2017-04-28 Thread Will Deacon
Hi Ard,

[+ devicetree@]

On Fri, Apr 14, 2017 at 01:43:15PM +0100, Ard Biesheuvel wrote:
> DT nodes may have a status property, and if they do, such nodes should
> only be considered present if the status property is set to 'okay'.
> 
> Currently, we call the init function of IOMMUs described by the device
> tree without taking this into account, which may result in the output
> below on systems where some SMMUs may be legally disabled.
> 
>  Failed to initialise IOMMU /smb/smmu@e020
>  Failed to initialise IOMMU /smb/smmu@e0c0
>  arm-smmu e0a0.smmu: probing hardware configuration...
>  arm-smmu e0a0.smmu: SMMUv1 with:
>  arm-smmu e0a0.smmu:  stage 2 translation
>  arm-smmu e0a0.smmu:  coherent table walk
>  arm-smmu e0a0.smmu:  stream matching with 32 register groups, mask 0x7fff
>  arm-smmu e0a0.smmu:  8 context banks (8 stage-2 only)
>  arm-smmu e0a0.smmu:  Supported page sizes: 0x60211000
>  arm-smmu e0a0.smmu:  Stage-2: 40-bit IPA -> 40-bit PA
>  Failed to initialise IOMMU /smb/smmu@e060
>  Failed to initialise IOMMU /smb/smmu@e080
> 
> Since this is not an error condition, only call the init function if
> the device is enabled, which also inhibits the spurious error messages.
> 
> Signed-off-by: Ard Biesheuvel 
> ---
>  drivers/iommu/of_iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 2683e9fc0dcf..2dd1206e6c0d 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -183,7 +183,7 @@ static int __init of_iommu_init(void)
>   for_each_matching_node_and_match(np, matches, &match) {
>   const of_iommu_init_fn init_fn = match->data;
>  
> - if (init_fn(np))
> + if (of_device_is_available(np) && init_fn(np))
>   pr_err("Failed to initialise IOMMU %s\n",
>   of_node_full_name(np));
>   }

Is there a definition of what status = "disabled" is supposed to mean for an
IOMMU? For example, that could mean that the firmware has pre-programmed the
SMMU with particular translations or memory attributes (a bit like the
CCA=1, CPM=1, DACS=0 case in ACPI IORT), or even disabled DMA traffic
altogether.

So I think we'd need an update to the generic IOMMU binding text to say
exactly what the semantics are supposed to be here.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-04-28 Thread Jean-Philippe Brucker
On 28/04/17 10:04, Liu, Yi L wrote:
> On Wed, Apr 26, 2017 at 05:56:45PM +0100, Jean-Philippe Brucker wrote:
>> Hi Yi, Jacob,
>>
>> On 26/04/17 11:11, Liu, Yi L wrote:
>>> From: Jacob Pan 
>>>
>>> Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
>>> case in the guest:
>>> https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
>>>
>>> As part of the proposed architecture, when a SVM capable PCI
>>> device is assigned to a guest, nested mode is turned on. Guest owns the
>>> first level page tables (request with PASID) and performs GVA->GPA
>>> translation. Second level page tables are owned by the host for GPA->HPA
>>> translation for both request with and without PASID.
>>>
>>> A new IOMMU driver interface is therefore needed to perform tasks as
>>> follows:
>>> * Enable nested translation and appropriate translation type
>>> * Assign guest PASID table pointer (in GPA) and size to host IOMMU
>>>
>>> This patch introduces new functions called iommu_(un)bind_pasid_table()
>>> to IOMMU APIs. Architecture specific IOMMU function can be added later
>>> to perform the specific steps for binding pasid table of assigned devices.
>>>
>>> This patch also adds model definition in iommu.h. It would be used to
>>> check if the bind request is from a compatible entity. e.g. a bind
>>> request from an intel_iommu emulator may not be supported by an ARM SMMU
>>> driver.
>>>
>>> Signed-off-by: Jacob Pan 
>>> Signed-off-by: Liu, Yi L 
>>> ---
>>>  drivers/iommu/iommu.c | 19 +++
>>>  include/linux/iommu.h | 31 +++
>>>  2 files changed, 50 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index dbe7f65..f2da636 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain *domain, 
>>> struct device *dev)
>>>  }
>>>  EXPORT_SYMBOL_GPL(iommu_attach_device);
>>>  
>>> +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
>>> +   struct pasid_table_info *pasidt_binfo)
>>
>> I guess that domain can always be deduced from dev using
>> iommu_get_domain_for_dev, and doesn't need to be passed as argument?
>>
>> For the next version of my SVM series, I was thinking of passing group
>> instead of device to iommu_bind. Since all devices in a group are expected
>> to share the same mappings (whether they want it or not), users will have
> 
> Virtual address space is not tied to protection domain as I/O virtual address
> space does. Is it really necessary to affect all the devices in this group.
> Or it is just for consistence?

It's mostly about consistency, and also avoid hiding implicit behavior in
the IOMMU driver. I have the following example, described using group and
domain structures from the IOMMU API:
 
|IOMMU   |
|  |DOM  __ ||
|  ||GRP   ||| bind
|  ||A<-Task 1
|  ||B |||
|  ||__|||
|  | __ ||
|  ||GRP   |||
|  ||C |||
|  ||__|||
|  |||
|    |
|  |DOM  __ ||
|  ||GRP   |||
|  ||D |||
|  ||__|||
|  |||
||

Let's take PCI functions A, B, C, and D, all with PASID capabilities. Due
to some hardware limitation (in the bus, the device or the IOMMU), B can
see all DMA transactions issued by A. A and B are therefore in the same
IOMMU group. C and D can be isolated by the IOMMU, so they each have their
own group.

(As far as I know, in the SVM world at the moment, devices are neatly
integrated and there is no need for putting multiple devices in the same
IOMMU group, but I don't think we should expect all future SVM systems to
be well-behaved.)

So when a user binds Task 1 to device A, it is *implicitly* giving device
B access to Task 1 as well. Simply because the IOMMU is unable to isolate
A from B, PASID or not. B could access the same address space as A, even
if you don't call bind again to explicitly attach the PASID table to B.

If the bind is done with device as argument, maybe users will believe that
using PASIDs provides an additional level of isolation within a group,
when it really doesn't. That's why I'm inclined to have the whole bind API
be on groups rather than devices, if only for clarity.

But I don't know, maybe a comment explaining the above would be sufficient.

To be frank my comment about group versus device is partly to make sure
that I grasp the various concepts correctly and that we're on the same
page. Doing the bind on groups is less 

Re: [RFC PATCH 0/2] iommu/s390: Fix iommu-groups and add sysfs support

2017-04-28 Thread Gerald Schaefer
Hi Joerg,

I guess we are a bit special on s390 (again), see below. Sebastian is more
familiar with the base s390 PCI code, he may correct me if I'm wrong.

On Thu, 27 Apr 2017 23:03:25 +0200
Joerg Roedel  wrote:

> > Well, there is a separate zpci_dev for each pci_dev on s390,
> > and each of those has its own separate dma-table (thus not shared).  
> 
> Is that true for all functions of a PCIe card, so does every function of
> a device has its own zpci_dev structure and thus its own DMA-table?

Yes, clp_add_pci_device() is called for every function, which in turn calls
zpci_create_device() with a freshly allocated zdev. zpci_enable_device()
then sets up a new DMA address space for each function.

> My assumption came from the fact that the zpci_dev is read from
> pci_dev->sysdata, which is propagated there from the pci_bridge
> through the pci_root_bus structures.

The zdev gets there via zpci_create_device() -> zpci_scan_bus() ->
pci_scan_root_bus(), which is done for every single function.

Not sure if I understand this right, but it looks like we set up a new PCI
bus for each function.

> > Given this "separate zpci_dev for each pci_dev" situation, I don't
> > see what this update actually changes, compared to the previous code,
> > see also my comments to that patch.  
> 
> The add_device call-back is invoked for every function of a pci-device,
> because each function gets its own pci_dev structure. Also we usually
> group all functions of a PCI-device together into one iommu-group,
> because we don't trust that the device isolates its functions from each
> other.

OK, but similar to the add_device callback, zpci_create_device() is
also invoked for every function. So, allocating a new iommu-group in
zpci_create_device() will never lead to any group sharing.

I am however a bit confused now, about how we would have allowed group
sharing with the current s390 IOMMU code, or IOW in which scenario would
iommu_group_get() in the add_device callback find a shareable iommu-group?

In the attach_dev callback, we provide the option to "force" multiple
functions using the same iommu-domain / DMA address space, by de-registering
the per-function DMA address space and registering a common space. But
such functions would only be in the same iommu "domain" and not "group",
if I get this right.

So, I guess we may have an issue with not sharing iommu-groups when
it could make sense to do so. But your patch would not fix this, as
we still would allocate separate iommu-groups for all functions.

Regards,
Gerald

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 02/20] intel_iommu: exposed extended-context mode to guest

2017-04-28 Thread Liu, Yi L
On Fri, Apr 28, 2017 at 02:00:15PM +0800, Lan Tianyu wrote:
> On 2017年04月27日 18:32, Peter Xu wrote:
> > On Wed, Apr 26, 2017 at 06:06:32PM +0800, Liu, Yi L wrote:
> >> VT-d implementations reporting PASID or PRS fields as "Set", must also
> >> report ecap.ECS as "Set". Extended-Context is required for SVM.
> >>
> >> When ECS is reported, intel iommu driver would initiate extended root entry
> >> and extended context entry, and also PASID table if there is any SVM 
> >> capable
> >> device.
> >>
> >> Signed-off-by: Liu, Yi L 
> >> ---
> >>  hw/i386/intel_iommu.c  | 131 
> >> +++--
> >>  hw/i386/intel_iommu_internal.h |   9 +++
> >>  include/hw/i386/intel_iommu.h  |   2 +-
> >>  3 files changed, 97 insertions(+), 45 deletions(-)
> >>
> >> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> >> index 400d0d1..bf98fa5 100644
> >> --- a/hw/i386/intel_iommu.c
> >> +++ b/hw/i386/intel_iommu.c
> >> @@ -497,6 +497,11 @@ static inline bool 
> >> vtd_root_entry_present(VTDRootEntry *root)
> >>  return root->val & VTD_ROOT_ENTRY_P;
> >>  }
> >>  
> >> +static inline bool vtd_root_entry_upper_present(VTDRootEntry *root)
> >> +{
> >> +return root->rsvd & VTD_ROOT_ENTRY_P;
> >> +}
> >> +
> >>  static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t index,
> >>VTDRootEntry *re)
> >>  {
> >> @@ -509,6 +514,9 @@ static int vtd_get_root_entry(IntelIOMMUState *s, 
> >> uint8_t index,
> >>  return -VTD_FR_ROOT_TABLE_INV;
> >>  }
> >>  re->val = le64_to_cpu(re->val);
> >> +if (s->ecs) {
> >> +re->rsvd = le64_to_cpu(re->rsvd);
> >> +}
> > 
> > I feel it slightly hacky to play with re->rsvd. How about:
> > 
> > union VTDRootEntry {
> > struct {
> > uint64_t val;
> > uint64_t rsvd;
> > } base;
> > struct {
> > uint64_t ext_lo;
> > uint64_t ext_hi;
> > } extended;
> > };
> > 
> > (Or any better way that can get rid of rsvd...)
> > 
> > Even:
> > 
> > struct VTDRootEntry {
> > union {
> > struct {
> > uint64_t val;
> > uint64_t rsvd;
> > } base;
> > struct {
> > uint64_t ext_lo;
> > uint64_t ext_hi;
> > } extended;
> > } data;
> > bool extended;
> > };
> > 
> > Then we read the entry into data, and setup extended bit. A benefit of
> > it is that we may avoid passing around IntelIOMMUState everywhere to
> > know whether we are using extended context entries.
> > 
> >>  return 0;
> >>  }
> >>  
> >> @@ -517,19 +525,30 @@ static inline bool 
> >> vtd_context_entry_present(VTDContextEntry *context)
> >>  return context->lo & VTD_CONTEXT_ENTRY_P;
> >>  }
> >>  
> >> -static int vtd_get_context_entry_from_root(VTDRootEntry *root, uint8_t 
> >> index,
> >> -   VTDContextEntry *ce)
> >> +static int vtd_get_context_entry_from_root(IntelIOMMUState *s,
> >> + VTDRootEntry *root, uint8_t index, VTDContextEntry *ce)
> >>  {
> >> -dma_addr_t addr;
> >> +dma_addr_t addr, ce_size;
> >>  
> >>  /* we have checked that root entry is present */
> >> -addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
> >> -if (dma_memory_read(&address_space_memory, addr, ce, sizeof(*ce))) {
> >> +ce_size = (s->ecs) ? (2 * sizeof(*ce)) : (sizeof(*ce));
> >> +addr = (s->ecs && (index > 0x7f)) ?
> >> +   ((root->rsvd & VTD_ROOT_ENTRY_CTP) + (index - 0x80) * ce_size) 
> >> :
> >> +   ((root->val & VTD_ROOT_ENTRY_CTP) + index * ce_size);
> >> +
> >> +if (dma_memory_read(&address_space_memory, addr, ce, ce_size)) {
> >>  trace_vtd_re_invalid(root->rsvd, root->val);
> >>  return -VTD_FR_CONTEXT_TABLE_INV;
> >>  }
> >> -ce->lo = le64_to_cpu(ce->lo);
> >> -ce->hi = le64_to_cpu(ce->hi);
> >> +
> >> +ce[0].lo = le64_to_cpu(ce[0].lo);
> >> +ce[0].hi = le64_to_cpu(ce[0].hi);
> > 
> > Again, I feel this even hackier. :)
> > 
> > I would slightly prefer to play the same union trick to context
> > entries, just like what I proposed to the root entries above...
> > 
> >> +
> >> +if (s->ecs) {
> >> +ce[1].lo = le64_to_cpu(ce[1].lo);
> >> +ce[1].hi = le64_to_cpu(ce[1].hi);
> >> +}
> >> +
> >>  return 0;
> >>  }
> >>  
> >> @@ -595,9 +614,11 @@ static inline uint32_t 
> >> vtd_get_agaw_from_context_entry(VTDContextEntry *ce)
> >>  return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9;
> >>  }
> >>  
> >> -static inline uint32_t vtd_ce_get_type(VTDContextEntry *ce)
> >> +static inline uint32_t vtd_ce_get_type(IntelIOMMUState *s,
> >> +   VTDContextEntry *ce)
> >>  {
> >> -return ce->lo & VTD_CONTEXT_ENTRY_TT;
> >> +return s->ecs ? (ce->lo & VTD_CONTEXT_ENTRY_TT) :
> >> +(ce->lo & VTD_EXT_CONTEXT_ENTRY_TT);
> >>  }
> >>  
> >>  static inline uint64_t vtd_iova_limit(VT

Re: [Qemu-devel] [RFC PATCH 02/20] intel_iommu: exposed extended-context mode to guest

2017-04-28 Thread Liu, Yi L
On Thu, Apr 27, 2017 at 06:32:21PM +0800, Peter Xu wrote:
> On Wed, Apr 26, 2017 at 06:06:32PM +0800, Liu, Yi L wrote:
> > VT-d implementations reporting PASID or PRS fields as "Set", must also
> > report ecap.ECS as "Set". Extended-Context is required for SVM.
> > 
> > When ECS is reported, intel iommu driver would initiate extended root entry
> > and extended context entry, and also PASID table if there is any SVM capable
> > device.
> > 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  hw/i386/intel_iommu.c  | 131 
> > +++--
> >  hw/i386/intel_iommu_internal.h |   9 +++
> >  include/hw/i386/intel_iommu.h  |   2 +-
> >  3 files changed, 97 insertions(+), 45 deletions(-)
> > 
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 400d0d1..bf98fa5 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -497,6 +497,11 @@ static inline bool vtd_root_entry_present(VTDRootEntry 
> > *root)
> >  return root->val & VTD_ROOT_ENTRY_P;
> >  }
> >  
> > +static inline bool vtd_root_entry_upper_present(VTDRootEntry *root)
> > +{
> > +return root->rsvd & VTD_ROOT_ENTRY_P;
> > +}
> > +
> >  static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t index,
> >VTDRootEntry *re)
> >  {
> > @@ -509,6 +514,9 @@ static int vtd_get_root_entry(IntelIOMMUState *s, 
> > uint8_t index,
> >  return -VTD_FR_ROOT_TABLE_INV;
> >  }
> >  re->val = le64_to_cpu(re->val);
> > +if (s->ecs) {
> > +re->rsvd = le64_to_cpu(re->rsvd);
> > +}
> 
> I feel it slightly hacky to play with re->rsvd. How about:
> 
> union VTDRootEntry {
> struct {
> uint64_t val;
> uint64_t rsvd;
> } base;
> struct {
> uint64_t ext_lo;
> uint64_t ext_hi;
> } extended;
> };

Agree.
 
> (Or any better way that can get rid of rsvd...)
> 
> Even:
> 
> struct VTDRootEntry {
> union {
> struct {
> uint64_t val;
> uint64_t rsvd;
> } base;
> struct {
> uint64_t ext_lo;
> uint64_t ext_hi;
> } extended;
> } data;
> bool extended;
> };
> 
> Then we read the entry into data, and setup extended bit. A benefit of
> it is that we may avoid passing around IntelIOMMUState everywhere to
> know whether we are using extended context entries.

For this proposal, it's combining the s->ecs bit and root entry. But it
may mislead future maintainer as it uses VTDRootEntry. maybe name it
differently.

> >  return 0;
> >  }
> >  
> > @@ -517,19 +525,30 @@ static inline bool 
> > vtd_context_entry_present(VTDContextEntry *context)
> >  return context->lo & VTD_CONTEXT_ENTRY_P;
> >  }
> >  
> > -static int vtd_get_context_entry_from_root(VTDRootEntry *root, uint8_t 
> > index,
> > -   VTDContextEntry *ce)
> > +static int vtd_get_context_entry_from_root(IntelIOMMUState *s,
> > + VTDRootEntry *root, uint8_t index, VTDContextEntry *ce)
> >  {
> > -dma_addr_t addr;
> > +dma_addr_t addr, ce_size;
> >  
> >  /* we have checked that root entry is present */
> > -addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
> > -if (dma_memory_read(&address_space_memory, addr, ce, sizeof(*ce))) {
> > +ce_size = (s->ecs) ? (2 * sizeof(*ce)) : (sizeof(*ce));
> > +addr = (s->ecs && (index > 0x7f)) ?
> > +   ((root->rsvd & VTD_ROOT_ENTRY_CTP) + (index - 0x80) * ce_size) :
> > +   ((root->val & VTD_ROOT_ENTRY_CTP) + index * ce_size);
> > +
> > +if (dma_memory_read(&address_space_memory, addr, ce, ce_size)) {
> >  trace_vtd_re_invalid(root->rsvd, root->val);
> >  return -VTD_FR_CONTEXT_TABLE_INV;
> >  }
> > -ce->lo = le64_to_cpu(ce->lo);
> > -ce->hi = le64_to_cpu(ce->hi);
> > +
> > +ce[0].lo = le64_to_cpu(ce[0].lo);
> > +ce[0].hi = le64_to_cpu(ce[0].hi);
> 
> Again, I feel this even hackier. :)
> 
> I would slightly prefer to play the same union trick to context
> entries, just like what I proposed to the root entries above...

would think about it.

> > +
> > +if (s->ecs) {
> > +ce[1].lo = le64_to_cpu(ce[1].lo);
> > +ce[1].hi = le64_to_cpu(ce[1].hi);
> > +}
> > +
> >  return 0;
> >  }
> >  
> > @@ -595,9 +614,11 @@ static inline uint32_t 
> > vtd_get_agaw_from_context_entry(VTDContextEntry *ce)
> >  return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9;
> >  }
> >  
> > -static inline uint32_t vtd_ce_get_type(VTDContextEntry *ce)
> > +static inline uint32_t vtd_ce_get_type(IntelIOMMUState *s,
> > +   VTDContextEntry *ce)
> >  {
> > -return ce->lo & VTD_CONTEXT_ENTRY_TT;
> > +return s->ecs ? (ce->lo & VTD_CONTEXT_ENTRY_TT) :
> > +(ce->lo & VTD_EXT_CONTEXT_ENTRY_TT);
> >  }
> >  
> >  static inline uint64_t vtd_iova_limit(VTDContextEntry *ce)
> > @@ -842,16 +863,20 @@ static in

Re: [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-04-28 Thread Liu, Yi L
On Wed, Apr 26, 2017 at 05:56:45PM +0100, Jean-Philippe Brucker wrote:
> Hi Yi, Jacob,
> 
> On 26/04/17 11:11, Liu, Yi L wrote:
> > From: Jacob Pan 
> > 
> > Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> > case in the guest:
> > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> > 
> > As part of the proposed architecture, when a SVM capable PCI
> > device is assigned to a guest, nested mode is turned on. Guest owns the
> > first level page tables (request with PASID) and performs GVA->GPA
> > translation. Second level page tables are owned by the host for GPA->HPA
> > translation for both request with and without PASID.
> > 
> > A new IOMMU driver interface is therefore needed to perform tasks as
> > follows:
> > * Enable nested translation and appropriate translation type
> > * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> > 
> > This patch introduces new functions called iommu_(un)bind_pasid_table()
> > to IOMMU APIs. Architecture specific IOMMU function can be added later
> > to perform the specific steps for binding pasid table of assigned devices.
> > 
> > This patch also adds model definition in iommu.h. It would be used to
> > check if the bind request is from a compatible entity. e.g. a bind
> > request from an intel_iommu emulator may not be supported by an ARM SMMU
> > driver.
> > 
> > Signed-off-by: Jacob Pan 
> > Signed-off-by: Liu, Yi L 
> > ---
> >  drivers/iommu/iommu.c | 19 +++
> >  include/linux/iommu.h | 31 +++
> >  2 files changed, 50 insertions(+)
> > 
> > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > index dbe7f65..f2da636 100644
> > --- a/drivers/iommu/iommu.c
> > +++ b/drivers/iommu/iommu.c
> > @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain *domain, 
> > struct device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(iommu_attach_device);
> >  
> > +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device *dev,
> > +   struct pasid_table_info *pasidt_binfo)
> 
> I guess that domain can always be deduced from dev using
> iommu_get_domain_for_dev, and doesn't need to be passed as argument?
> 
> For the next version of my SVM series, I was thinking of passing group
> instead of device to iommu_bind. Since all devices in a group are expected
> to share the same mappings (whether they want it or not), users will have

Virtual address space is not tied to protection domain as I/O virtual address
space does. Is it really necessary to affect all the devices in this group.
Or it is just for consistence?

> to do iommu_group_for_each_dev anyway (as you do in patch 6/8). So it
> might be simpler to let the IOMMU core take the group lock and do
> group->domain->ops->bind_task(dev...) for each device. The question also
> holds for iommu_do_invalidate in patch 3/8.

In my understanding, it is moving the for_each_dev loop into iommu driver?
Is it?

> This way the prototypes would be:
> int iommu_bind...(struct iommu_group *group, struct ... *info)
> int iommu_unbind...(struct iommu_group *group, struct ...*info)
> int iommu_invalidate...(struct iommu_group *group, struct ...*info)

For PASID table binding from guest, I think it'd better to be per-device op
since the bind operation wants to modify the host context entry. But we may
still share the API and do things differently in iommu driver.

For invalidation, I think it'd better to be per-group. Actually, with guest
IOMMU exists, there is only one group in a domain on Intel platform. Do it for
each device is not expected. How about it on ARM?

> For PASID table binding it might not matter much, as VFIO will most likely
> be the only user. But task binding will be called by device drivers, which
> by now should be encouraged to do things at iommu_group granularity.
> Alternatively it could be done implicitly like in iommu_attach_device,
> with "iommu_bind_device_x" calling "iommu_bind_group_x".

Do you mean the bind task from userspace driver? I guess you're trying to do
different types of binding request in a single svm_bind API?

> 
> Extending this reasoning, since groups in a domain are also supposed to
> have the same mappings, then similarly to map/unmap,
> bind/unbind/invalidate should really be done with an iommu_domain (and
> nothing else) as target argument. However this requires the IOMMU core to
> keep a group list in each domain, which might complicate things a little
> too much.
> 
> But "all devices in a domain share the same PASID table" is the paradigm
> I'm currently using in the guts of arm-smmu-v3. And I wonder if, as with
> iommu_group, it should be made more explicit to users, so they don't
> assume that devices within a domain are isolated from each others with
> regard to PASID DMA.

Is the isolation you mentioned means forbidding to do PASID DMA to the same
virtual address space when the device comes from different domain?

Thanks,
Yi L
 

RE: Does a new booting kernel by "kexec -l" need to copy IR table from previous kernel?

2017-04-28 Thread Zhuo, Qiuxu
Hi Joerg Roedel,

> From: j...@8bytes.org [mailto:j...@8bytes.org] 
> Yes, that is true. But the messages are harmless and you are safe to ignore 
> them in your usecase. 
> We only care about the kdump case when copying the old IR and DMAR tables 
> into the new kernel, 
> because in the  kdump case the old kernel crashed and left devices in an 
> undefined state, so that 
> they might still send DMA and IRQ requests to the new kernel, corrupting data 
> or causing spurious/blocked irqs. 
> Blocked IRQs or DMA requests might even break devices so that the new kernel 
> can't initialize them anymore.
...
> That is why we want to make sure, that no spurious IRQ or DMA requests are 
> blocked when the new
>  kernel initializes the IOMMU. In the non-kdump case we can assume that the 
> old kernel put the devices
> into a defined state where they are not sending spurious requests.

Thanks so much for your detail and clear comments, get it now.  Send you a 
patch " Don't print the failure message
when booting non-kdump kernel" to make it less alarming or misleading. Please 
review it, thanks!

BR
qiuxu


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Qemu-devel] [RFC PATCH 1/8] iommu: Introduce bind_pasid_table API function

2017-04-28 Thread Liu, Yi L
On Thu, Apr 27, 2017 at 11:12:45AM +0100, Jean-Philippe Brucker wrote:
> On 27/04/17 07:36, Liu, Yi L wrote:
> > On Wed, Apr 26, 2017 at 05:56:45PM +0100, Jean-Philippe Brucker wrote:
> >> Hi Yi, Jacob,
> >>
> >> On 26/04/17 11:11, Liu, Yi L wrote:
> >>> From: Jacob Pan 
> >>>
> >>> Virtual IOMMU was proposed to support Shared Virtual Memory (SVM) use
> >>> case in the guest:
> >>> https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg05311.html
> >>>
> >>> As part of the proposed architecture, when a SVM capable PCI
> >>> device is assigned to a guest, nested mode is turned on. Guest owns the
> >>> first level page tables (request with PASID) and performs GVA->GPA
> >>> translation. Second level page tables are owned by the host for GPA->HPA
> >>> translation for both request with and without PASID.
> >>>
> >>> A new IOMMU driver interface is therefore needed to perform tasks as
> >>> follows:
> >>> * Enable nested translation and appropriate translation type
> >>> * Assign guest PASID table pointer (in GPA) and size to host IOMMU
> >>>
> >>> This patch introduces new functions called iommu_(un)bind_pasid_table()
> >>> to IOMMU APIs. Architecture specific IOMMU function can be added later
> >>> to perform the specific steps for binding pasid table of assigned devices.
> >>>
> >>> This patch also adds model definition in iommu.h. It would be used to
> >>> check if the bind request is from a compatible entity. e.g. a bind
> >>> request from an intel_iommu emulator may not be supported by an ARM SMMU
> >>> driver.
> >>>
> >>> Signed-off-by: Jacob Pan 
> >>> Signed-off-by: Liu, Yi L 
> >>> ---
> >>>  drivers/iommu/iommu.c | 19 +++
> >>>  include/linux/iommu.h | 31 +++
> >>>  2 files changed, 50 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >>> index dbe7f65..f2da636 100644
> >>> --- a/drivers/iommu/iommu.c
> >>> +++ b/drivers/iommu/iommu.c
> >>> @@ -1134,6 +1134,25 @@ int iommu_attach_device(struct iommu_domain 
> >>> *domain, struct device *dev)
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(iommu_attach_device);
> >>>  
> >>> +int iommu_bind_pasid_table(struct iommu_domain *domain, struct device 
> >>> *dev,
> >>> + struct pasid_table_info *pasidt_binfo)
> >>
> >> I guess that domain can always be deduced from dev using
> >> iommu_get_domain_for_dev, and doesn't need to be passed as argument?
> >>
> >> For the next version of my SVM series, I was thinking of passing group
> >> instead of device to iommu_bind. Since all devices in a group are expected
> >> to share the same mappings (whether they want it or not), users will have
> >> to do iommu_group_for_each_dev anyway (as you do in patch 6/8). So it
> >> might be simpler to let the IOMMU core take the group lock and do
> >> group->domain->ops->bind_task(dev...) for each device. The question also
> >> holds for iommu_do_invalidate in patch 3/8.
> >>
> >> This way the prototypes would be:
> >> int iommu_bind...(struct iommu_group *group, struct ... *info)
> >> int iommu_unbind...(struct iommu_group *group, struct ...*info)
> >> int iommu_invalidate...(struct iommu_group *group, struct ...*info)
> >>
> >> For PASID table binding it might not matter much, as VFIO will most likely
> >> be the only user. But task binding will be called by device drivers, which
> >> by now should be encouraged to do things at iommu_group granularity.
> >> Alternatively it could be done implicitly like in iommu_attach_device,
> >> with "iommu_bind_device_x" calling "iommu_bind_group_x".
> >>
> >>
> >> Extending this reasoning, since groups in a domain are also supposed to
> >> have the same mappings, then similarly to map/unmap,
> >> bind/unbind/invalidate should really be done with an iommu_domain (and
> >> nothing else) as target argument. However this requires the IOMMU core to
> >> keep a group list in each domain, which might complicate things a little
> >> too much.
> >>
> >> But "all devices in a domain share the same PASID table" is the paradigm
> >> I'm currently using in the guts of arm-smmu-v3. And I wonder if, as with
> >> iommu_group, it should be made more explicit to users, so they don't
> >> assume that devices within a domain are isolated from each others with
> >> regard to PASID DMA.
> >>
> >>> +{
> >>> + if (unlikely(!domain->ops->bind_pasid_table))
> >>> + return -EINVAL;
> >>> +
> >>> + return domain->ops->bind_pasid_table(domain, dev, pasidt_binfo);
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(iommu_bind_pasid_table);
> >>> +
> >>> +int iommu_unbind_pasid_table(struct iommu_domain *domain, struct device 
> >>> *dev)
> >>> +{
> >>> + if (unlikely(!domain->ops->unbind_pasid_table))
> >>> + return -EINVAL;
> >>> +
> >>> + return domain->ops->unbind_pasid_table(domain, dev);
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(iommu_unbind_pasid_table);
> >>> +
> >>>  static void __iommu_detach_device(struct iommu_domain *domain,
> >>> struct device *dev)
> 

[PATCH 1/1] iommu/vt-d: Don't print the failure message when booting non-kdump kernel

2017-04-28 Thread Qiuxu Zhuo
When booting a new non-kdump kernel, we have below failure message:

[0.004000] DMAR-IR: IRQ remapping was enabled on dmar2 but we are not in 
kdump mode
[0.004000] DMAR-IR: Failed to copy IR table for dmar2 from previous kernel
[0.004000] DMAR-IR: IRQ remapping was enabled on dmar1 but we are not in 
kdump mode
[0.004000] DMAR-IR: Failed to copy IR table for dmar1 from previous kernel
[0.004000] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in 
kdump mode
[0.004000] DMAR-IR: Failed to copy IR table for dmar0 from previous kernel
[0.004000] DMAR-IR: IRQ remapping was enabled on dmar3 but we are not in 
kdump mode
[0.004000] DMAR-IR: Failed to copy IR table for dmar3 from previous kernel

For non-kdump case, we no need to copy IR table from previous kernel
so it's nonthing actually failed. To be less alarming or misleading,
do not print "DMAR-IR: Failed to copy IR table for dmar[0-9] from
previous kernel" messages when booting non-kdump kernel.

Signed-off-by: Qiuxu Zhuo 
---
 drivers/iommu/intel_irq_remapping.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel_irq_remapping.c 
b/drivers/iommu/intel_irq_remapping.c
index e7af2a4..7f63d2a 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -408,14 +408,6 @@ static int iommu_load_old_irte(struct intel_iommu *iommu)
size_t size;
u64 irta;
 
-   if (!is_kdump_kernel()) {
-   pr_warn("IRQ remapping was enabled on %s but we are not in 
kdump mode\n",
-   iommu->name);
-   clear_ir_pre_enabled(iommu);
-   iommu_disable_irq_remapping(iommu);
-   return -EINVAL;
-   }
-
/* Check whether the old ir-table has the same size as ours */
irta = dmar_readq(iommu->reg + DMAR_IRTA_REG);
if ((irta & INTR_REMAP_TABLE_REG_SIZE_MASK)
@@ -567,7 +559,12 @@ static int intel_setup_irq_remapping(struct intel_iommu 
*iommu)
init_ir_status(iommu);
 
if (ir_pre_enabled(iommu)) {
-   if (iommu_load_old_irte(iommu))
+   if (!is_kdump_kernel()) {
+   pr_warn("IRQ remapping was enabled on %s but we are not 
in kdump mode\n",
+   iommu->name);
+   clear_ir_pre_enabled(iommu);
+   iommu_disable_irq_remapping(iommu);
+   } else if (iommu_load_old_irte(iommu))
pr_err("Failed to copy IR table for %s from previous 
kernel\n",
   iommu->name);
else
-- 
2.9.0.GIT

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 14/20] intel_iommu: add FOR_EACH_ASSIGN_DEVICE macro

2017-04-28 Thread Lan Tianyu
On 2017年04月26日 18:06, Liu, Yi L wrote:
> Add FOR_EACH_ASSIGN_DEVICE. It would be used to loop all assigned
> devices when processing guest pasid table linking and iommu cache
> invalidate propagation.
> 
> Signed-off-by: Liu, Yi L 
> ---
>  hw/i386/intel_iommu.c  | 32 
>  hw/i386/intel_iommu_internal.h | 11 +++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 0c412d2..f291995 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -55,6 +55,38 @@ static int vtd_dbgflags = VTD_DBGBIT(GENERAL) | 
> VTD_DBGBIT(CSR);
>  #define VTD_DPRINTF(what, fmt, ...) do {} while (0)
>  #endif
>  
> +#define FOR_EACH_ASSIGN_DEVICE(__notify_info_type, \
> +   __opaque_type, \
> +   __hook_info, \
> +   __hook_fn) \
> +do { \
> +IntelIOMMUNotifierNode *node; \
> +VTDNotifierIterator iterator; \
> +int ret = 0; \
> +__notify_info_type *notify_info; \
> +__opaque_type *opaq; \
> +int argsz; \
> +argsz = sizeof(*notify_info) + sizeof(*opaq); \
> +notify_info = g_malloc0(argsz); \
> +QLIST_FOREACH(node, &(s->notifiers_list), next) { \
> +VTDAddressSpace *vtd_as = node->vtd_as; \
> +VTDContextEntry ce[2]; \
> +iterator.bus = pci_bus_num(vtd_as->bus); \
> +ret = vtd_dev_to_context_entry(s, iterator.bus, \
> +   vtd_as->devfn, &ce[0]); \
> +if (ret != 0) { \
> +continue; \
> +} \
> +iterator.sid = vtd_make_source_id(iterator.bus, vtd_as->devfn); \
> +iterator.did =  VTD_CONTEXT_ENTRY_DID(ce[0].hi); \
> +iterator.host_sid = node->host_sid; \
> +iterator.vtd_as = vtd_as; \
> +iterator.ce = &ce[0]; \
> +__hook_fn(&iterator, __hook_info, notify_info); \
> +} \
> +g_free(notify_info); \
> +} while (0)
> +
>  static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
>  uint64_t wmask, uint64_t w1cmask)
>  {
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index f2a7d12..5178398 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -439,6 +439,17 @@ typedef struct VTDRootEntry VTDRootEntry;
>  #define VTD_EXT_CONTEXT_TT_NO_DEV_IOTLB   (4ULL << 2)
>  #define VTD_EXT_CONTEXT_TT_DEV_IOTLB  (5ULL << 2)
>  
> +struct VTDNotifierIterator {
> +VTDAddressSpace *vtd_as;
> +VTDContextEntry *ce;
> +uint16_t host_sid;
> +uint16_t sid;
> +uint16_t did;
> +uint8_t  bus;

The "bus" seems to be redundant.
It is already contained in the "sid", right?

> +};
> +
> +typedef struct VTDNotifierIterator VTDNotifierIterator;
> +
>  /* Paging Structure common */
>  #define VTD_SL_PT_PAGE_SIZE_MASK(1ULL << 7)
>  /* Bits to decide the offset for each level */
> 


-- 
Best regards
Tianyu Lan
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu