Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-21 Thread Pierre Morel

On 21/05/2019 16:59, Alex Williamson wrote:

On Tue, 21 May 2019 11:14:38 +0200
Pierre Morel  wrote:


On 20/05/2019 20:23, Alex Williamson wrote:

On Mon, 20 May 2019 18:31:08 +0200
Pierre Morel  wrote:
   

On 20/05/2019 16:27, Cornelia Huck wrote:

On Mon, 20 May 2019 13:19:23 +0200
Pierre Morel  wrote:
  

On 17/05/2019 20:04, Pierre Morel wrote:

On 17/05/2019 18:41, Alex Williamson wrote:

On Fri, 17 May 2019 18:16:50 +0200
Pierre Morel  wrote:


We implement the capability interface for VFIO_IOMMU_GET_INFO.

When calling the ioctl, the user must specify
VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
must check in the answer if capabilities are supported.

The iommu get_attr callback will be used to retrieve the specific
attributes and fill the capabilities.

Currently two Z-PCI specific capabilities will be queried and
filled by the underlying Z specific s390_iommu:
VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
and
VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.

Other architectures may add new capabilities in the same way
after enhancing the architecture specific IOMMU driver.

Signed-off-by: Pierre Morel 
---
     drivers/vfio/vfio_iommu_type1.c | 122
+++-
     1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c
b/drivers/vfio/vfio_iommu_type1.c
index d0f731c..9435647 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1658,6 +1658,97 @@ static int
vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
     return ret;
     }
+static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
+    struct vfio_info_cap *caps, size_t size)
+{
+    struct vfio_iommu_type1_info_pcifn *info_fn;
+    int ret;
+
+    info_fn = kzalloc(size, GFP_KERNEL);
+    if (!info_fn)
+    return -ENOMEM;
+
+    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
+    &info_fn->response);


What ensures that the 'struct clp_rsp_query_pci' returned from this
get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
Why does the latter contain so many reserved fields (beyond simply
alignment) for a user API?  What fields of these structures are
actually useful to userspace?  Should any fields not be exposed to the
user?  Aren't BAR sizes redundant to what's available through the vfio
PCI API?  I'm afraid that simply redefining an internal structure as
the API leaves a lot to be desired too.  Thanks,

Alex


Hi Alex,

I simply used the structure returned by the firmware to be sure to be
consistent with future evolutions and facilitate the copy from CLP and
to userland.

If you prefer, and I understand that this is the case, I can define a
specific VFIO_IOMMU structure with only the fields relevant to the user,
leaving future enhancement of the user's interface being implemented in
another kernel patch when the time has come.


TBH, I had no idea that CLP is an s390 firmware interface and this is
just dumping that to userspace.  The cover letter says:

Using the PCI VFIO interface allows userland, a.k.a. QEMU, to
retrieve ZPCI specific information without knowing Z specific
identifiers like the function ID or the function handle of the zPCI
function hidden behind the PCI interface.

But what does this allow userland to do and what specific pieces of
information do they need?  We do have a case already where Intel
graphics devices have a table (OpRegion) living in host system memory
that we expose via a vfio region, so it wouldn't be unprecedented to do
something like this, but as Connie suggests, if we knew what was being
consumed here and why, maybe we could generalize it into something
useful for others.


OK, sorry, let me try to explain better.

1) A short description of zPCI functions and groups

In Z, PCI cards live behind an adapter between subchannels and PCI.
We access PCI cards in two ways:
- dedicated PCI instructions (pci_load/pci_store/pci_store_block)
- DMA
We receive events through
- Adapter interrupts
- CHSC events

The adapter provides an IOMMU to protect the DMA,
and the interrupt handling goes through an MSI-X-like interface handled by
the adapter.

The architecture-specific PCI code provides the interface between the
standard PCI level and the zPCI function (PCI + DMA/IOMMU/Interrupt).

To handle the communication the "zPCI way", the CLP interface
provides instructions to retrieve information from the adapters.

There are different groups of functions sharing the same functionality.

clp_list gives us a list of zPCI functions
clp_query_pci_function returns information specific to a function
clp_query_group returns information on a function group


2) Why do we need it in the guest

We need to provide the guest with information on the adapters and zPCI
functions returned by the clp_query instruction so that the guest's
driver gets the right information on how the way to 

Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-21 Thread Alex Williamson
On Tue, 21 May 2019 11:14:38 +0200
Pierre Morel  wrote:

> On 20/05/2019 20:23, Alex Williamson wrote:
> > On Mon, 20 May 2019 18:31:08 +0200
> > Pierre Morel  wrote:
> >   
> >> On 20/05/2019 16:27, Cornelia Huck wrote:  
> >>> On Mon, 20 May 2019 13:19:23 +0200
> >>> Pierre Morel  wrote:
> >>>  
>  On 17/05/2019 20:04, Pierre Morel wrote:  
> > On 17/05/2019 18:41, Alex Williamson wrote:  
> >> On Fri, 17 May 2019 18:16:50 +0200
> >> Pierre Morel  wrote:
> >>
> >>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
> >>>
> >>> When calling the ioctl, the user must specify
> >>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
> >>> must check in the answer if capabilities are supported.
> >>>
> >>> The iommu get_attr callback will be used to retrieve the specific
> >>> attributes and fill the capabilities.
> >>>
> >>> Currently two Z-PCI specific capabilities will be queried and
> >>> filled by the underlying Z specific s390_iommu:
> >>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
> >>> and
> >>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
> >>>
> >>> Other architectures may add new capabilities in the same way
> >>> after enhancing the architecture specific IOMMU driver.
> >>>
> >>> Signed-off-by: Pierre Morel 
> >>> ---
> >>>     drivers/vfio/vfio_iommu_type1.c | 122
> >>> +++-
> >>>     1 file changed, 121 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/vfio/vfio_iommu_type1.c
> >>> b/drivers/vfio/vfio_iommu_type1.c
> >>> index d0f731c..9435647 100644
> >>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>> @@ -1658,6 +1658,97 @@ static int
> >>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>>     return ret;
> >>>     }
> >>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
> >>> +    struct vfio_info_cap *caps, size_t size)
> >>> +{
> >>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
> >>> +    int ret;
> >>> +
> >>> +    info_fn = kzalloc(size, GFP_KERNEL);
> >>> +    if (!info_fn)
> >>> +    return -ENOMEM;
> >>> +
> >>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
> >>> +    &info_fn->response);
> >>
> >> What ensures that the 'struct clp_rsp_query_pci' returned from this
> >> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
> >> Why does the latter contains so many reserved fields (beyond simply
> >> alignment) for a user API?  What fields of these structures are
> >> actually useful to userspace?  Should any fields not be exposed to the
> >> user?  Aren't BAR sizes redundant to what's available through the vfio
> >> PCI API?  I'm afraid that simply redefining an internal structure as
> >> the API leaves a lot to be desired too.  Thanks,
> >>
> >> Alex
> >>
> > Hi Alex,
> >
> > I simply used the structure returned by the firmware to be sure to be
> > consistent with future evolutions and facilitate the copy from CLP and
> > to userland.
> >
> > If you prefer, and I understand that this is the case, I can define a
> > specific VFIO_IOMMU structure with only the fields relevant to the user,
> > leaving future enhancement of the user's interface being implemented in
> > another kernel patch when the time has come.  
> > 
> > TBH, I had no idea that CLP is an s390 firmware interface and this is
> > just dumping that to userspace.  The cover letter says:
> > 
> >Using the PCI VFIO interface allows userland, a.k.a. QEMU, to
> >retrieve ZPCI specific information without knowing Z specific
> >identifiers like the function ID or the function handle of the zPCI
> >function hidden behind the PCI interface.
> > 
> > But what does this allow userland to do and what specific pieces of
> > information do they need?  We do have a case already where Intel
> > graphics devices have a table (OpRegion) living in host system memory
> > that we expose via a vfio region, so it wouldn't be unprecedented to do
> > something like this, but as Connie suggests, if we knew what was being
> > consumed here and why, maybe we could generalize it into something
> > useful for others.  
> 
> OK, sorry, let me try to explain better.
> 
> 1) A short description of zPCI functions and groups
> 
> In Z, PCI cards live behind an adapter between subchannels and PCI.
> We access PCI cards in two ways:
> - dedicated PCI instructions (pci_load/pci_store/pci_store_block)
> - DMA
> We receive events through
> - Adapter interrupts
> - CHSC events
> 
> The adapter propose an IOMMU to protect the DMA
> and the interrupt handling goes through a MSIX like interface handled by 

Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-21 Thread Pierre Morel

On 21/05/2019 13:11, Cornelia Huck wrote:

On Tue, 21 May 2019 11:14:38 +0200
Pierre Morel  wrote:


1) A short description of zPCI functions and groups

In Z, PCI cards live behind an adapter between subchannels and PCI.
We access PCI cards in two ways:
- dedicated PCI instructions (pci_load/pci_store/pci_store_block)
- DMA


Quick question: What about the new pci instructions? Anything that
needs to be considered there?


No and yes.

No, because they should be used when pci_{load,store,store_block} are
interpreted, AFAIU.

And currently we only use interception.

Yes, because the CLP part, used to set up the translations IIUC (do not
ask me for details now), will need to be re-issued by the kernel after
some modifications, and this will also need a path from QEMU S390 PCI down
to the zPCI driver.

That is the path I am trying to set up with this patch.

So the answer is: not now, but we should keep in mind that we will
definitely need a way down to the zPCI low level in the host.





We receive events through
- Adapter interrupts


Note for the non-s390 folks: These are (I/O) interrupts that are not
tied to a specific device. MSI-X is mapped to this.


- CHSC events


Another note for the non-s390 folks: This is a notification mechanism
that is using machine check interrupts; more information is retrieved
via a special instruction (chsc).



Thanks, yes, it is better explained this way :)



The adapter provides an IOMMU to protect the DMA,
and the interrupt handling goes through an MSI-X-like interface handled by
the adapter.

The architecture-specific PCI code provides the interface between the
standard PCI level and the zPCI function (PCI + DMA/IOMMU/Interrupt).

To handle the communication the "zPCI way", the CLP interface
provides instructions to retrieve information from the adapters.

There are different groups of functions sharing the same functionality.

clp_list gives us a list of zPCI functions
clp_query_pci_function returns information specific to a function
clp_query_group returns information on a function group


2) Why do we need it in the guest

We need to provide the guest with information on the adapters and zPCI
functions returned by the clp_query instruction so that the guest's
driver gets the right information on how the path to the zPCI function
has been set up in the host.


When a guest issues the CLP instructions we intercept the clp command in
QEMU and we need to feed the response with the right values for the guest.
The "right" values are not the raw CLP response values:

- some identifiers must be virtualized, like UID and FID,

- some values must match what the host received from the CLP response,
like the size of the transmitted blocks, the DMA Address Space Mask,
the number of interrupts, MSIA,

- some others must match what the host set up with the adapter and
function, such as the start and end of DMA,

- some must match what the host IOMMU driver supports (frame size).
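
The split described in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the two structures, their field names, and build_guest_rsp() are hypothetical and are neither the real CLP response layout nor QEMU code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side view; every field name here is illustrative. */
struct host_fn_info {
    uint32_t fid;        /* host function ID, not to be shown to the guest */
    uint32_t uid;        /* host user-defined ID, not to be shown either */
    uint64_t dasm;       /* DMA address space mask, from the host CLP response */
    uint64_t msia;       /* MSI address, from the host CLP response */
    uint16_t noi;        /* number of interrupts, from the host CLP response */
    uint16_t max_block;  /* size of the transmitted blocks */
    uint64_t sdma;       /* start of DMA as set up by the host */
    uint64_t edma;       /* end of DMA as set up by the host */
    uint32_t frame_size; /* what the host IOMMU driver supports */
};

/* Hypothetical response that is handed to the guest. */
struct guest_fn_rsp {
    uint32_t fid;
    uint32_t uid;
    uint64_t dasm;
    uint64_t msia;
    uint16_t noi;
    uint16_t max_block;
    uint64_t sdma;
    uint64_t edma;
    uint32_t frame_size;
};

void build_guest_rsp(struct guest_fn_rsp *rsp, const struct host_fn_info *host,
                     uint32_t guest_fid, uint32_t guest_uid)
{
    assert(rsp && host);
    memset(rsp, 0, sizeof(*rsp));

    /* Virtualized identifiers: the guest never sees the host values. */
    rsp->fid = guest_fid;
    rsp->uid = guest_uid;

    /* Values that must match what the host received from CLP. */
    rsp->dasm = host->dasm;
    rsp->msia = host->msia;
    rsp->noi = host->noi;
    rsp->max_block = host->max_block;

    /* Values that must match what the host set up or supports. */
    rsp->sdma = host->sdma;
    rsp->edma = host->edma;
    rsp->frame_size = host->frame_size;
}
```

The point of the sketch is only that the response is assembled field by field from different sources rather than copied wholesale from the raw CLP response.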



3) We have three different ways to get this information:

The PCI Linux interface is a standard PCI interface and some Z specific
information is available in sysfs.
Not all the information that needs to be returned inside the CLP response
is available there.
So we cannot use the sysfs interface to get all the information.

There is a CLP ioctl interface, but this interface is not secure in that
it returns the information for all adapters in the system.

The VFIO interface has the advantage of pointing to a single PCI
function, so it is more secure than the clp ioctl interface.
Coupled with the s390_iommu we get access to the zPCI CLP instruction
and to the values handled by the zPCI driver.


4) Until now we have filled the CLP response to the guest inside QEMU
with fixed values corresponding to the only PCI card we supported.
To support new cards we need to get the right values out of the kernel.


IIRC, the current code fills in values that make sense for one specific
type of card only, right?


yes, right


We also use the same values for emulated
cards (virtio); I assume that they are not completely weird for that
case...



No they are not.

For emulated cards, everything is done inside QEMU and we do not need
kernel access; the emulated cards get a specific emulation function and
group assigned with pre-defined values.


I sent a QEMU patch related to this.
Even if the kernel interface changes along with the kernel patch, the
emulation should continue to work this way.


Regards,
Pierre










--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany



Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-21 Thread Cornelia Huck
On Tue, 21 May 2019 11:14:38 +0200
Pierre Morel  wrote:

> 1) A short description of zPCI functions and groups
> 
> In Z, PCI cards live behind an adapter between subchannels and PCI.
> We access PCI cards in two ways:
> - dedicated PCI instructions (pci_load/pci_store/pci_store_block)
> - DMA

Quick question: What about the new pci instructions? Anything that
needs to be considered there?

> We receive events through
> - Adapter interrupts

Note for the non-s390 folks: These are (I/O) interrupts that are not
tied to a specific device. MSI-X is mapped to this.

> - CHSC events

Another note for the non-s390 folks: This is a notification mechanism
that is using machine check interrupts; more information is retrieved
via a special instruction (chsc).

> 
> The adapter provides an IOMMU to protect the DMA,
> and the interrupt handling goes through an MSI-X-like interface handled by
> the adapter.
> 
> The architecture-specific PCI code provides the interface between the
> standard PCI level and the zPCI function (PCI + DMA/IOMMU/Interrupt).
> 
> To handle the communication the "zPCI way", the CLP interface
> provides instructions to retrieve information from the adapters.
> 
> There are different groups of functions sharing the same functionality.
> 
> clp_list gives us a list of zPCI functions
> clp_query_pci_function returns information specific to a function
> clp_query_group returns information on a function group
> 
> 
> 2) Why do we need it in the guest
> 
> We need to provide the guest with information on the adapters and zPCI
> functions returned by the clp_query instruction so that the guest's
> driver gets the right information on how the path to the zPCI function
> has been set up in the host.
> 
> 
> When a guest issues the CLP instructions we intercept the clp command in 
> QEMU and we need to feed the response with the right values for the guest.
> The "right" values are not the raw CLP response values:
> 
> - some identifiers must be virtualized, like UID and FID,
> 
> - some values must match what the host received from the CLP response,
> like the size of the transmitted blocks, the DMA Address Space Mask,
> the number of interrupts, MSIA,
> 
> - some others must match what the host set up with the adapter and
> function, such as the start and end of DMA,
> 
> - some must match what the host IOMMU driver supports (frame size).
> 
> 
> 
> 3) We have three different ways to get this information:
> 
> The PCI Linux interface is a standard PCI interface and some Z specific
> information is available in sysfs.
> Not all the information that needs to be returned inside the CLP response
> is available there.
> So we cannot use the sysfs interface to get all the information.
> 
> There is a CLP ioctl interface, but this interface is not secure in that
> it returns the information for all adapters in the system.
> 
> The VFIO interface has the advantage of pointing to a single PCI
> function, so it is more secure than the clp ioctl interface.
> Coupled with the s390_iommu we get access to the zPCI CLP instruction
> and to the values handled by the zPCI driver.
> 
> 
> 4) Until now we have filled the CLP response to the guest inside QEMU
> with fixed values corresponding to the only PCI card we supported.
> To support new cards we need to get the right values out of the kernel.

IIRC, the current code fills in values that make sense for one specific
type of card only, right? We also use the same values for emulated
cards (virtio); I assume that they are not completely weird for that
case...
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-21 Thread Pierre Morel

On 20/05/2019 20:23, Alex Williamson wrote:

On Mon, 20 May 2019 18:31:08 +0200
Pierre Morel  wrote:


On 20/05/2019 16:27, Cornelia Huck wrote:

On Mon, 20 May 2019 13:19:23 +0200
Pierre Morel  wrote:
   

On 17/05/2019 20:04, Pierre Morel wrote:

On 17/05/2019 18:41, Alex Williamson wrote:

On Fri, 17 May 2019 18:16:50 +0200
Pierre Morel  wrote:
 

We implement the capability interface for VFIO_IOMMU_GET_INFO.

When calling the ioctl, the user must specify
VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
must check in the answer if capabilities are supported.

The iommu get_attr callback will be used to retrieve the specific
attributes and fill the capabilities.

Currently two Z-PCI specific capabilities will be queried and
filled by the underlying Z specific s390_iommu:
VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
and
VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.

Other architectures may add new capabilities in the same way
after enhancing the architecture specific IOMMU driver.

Signed-off-by: Pierre Morel 
---
    drivers/vfio/vfio_iommu_type1.c | 122
+++-
    1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c
b/drivers/vfio/vfio_iommu_type1.c
index d0f731c..9435647 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1658,6 +1658,97 @@ static int
vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
    return ret;
    }
+static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
+    struct vfio_info_cap *caps, size_t size)
+{
+    struct vfio_iommu_type1_info_pcifn *info_fn;
+    int ret;
+
+    info_fn = kzalloc(size, GFP_KERNEL);
+    if (!info_fn)
+    return -ENOMEM;
+
+    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
+    &info_fn->response);


What ensures that the 'struct clp_rsp_query_pci' returned from this
get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
Why does the latter contains so many reserved fields (beyond simply
alignment) for a user API?  What fields of these structures are
actually useful to userspace?  Should any fields not be exposed to the
user?  Aren't BAR sizes redundant to what's available through the vfio
PCI API?  I'm afraid that simply redefining an internal structure as
the API leaves a lot to be desired too.  Thanks,

Alex
 

Hi Alex,

I simply used the structure returned by the firmware to be sure to be
consistent with future evolutions and facilitate the copy from CLP and
to userland.

If you prefer, and I understand that this is the case, I can define a
specific VFIO_IOMMU structure with only the fields relevant to the user,
leaving future enhancement of the user's interface being implemented in
another kernel patch when the time has come.


TBH, I had no idea that CLP is an s390 firmware interface and this is
just dumping that to userspace.  The cover letter says:

   Using the PCI VFIO interface allows userland, a.k.a. QEMU, to
   retrieve ZPCI specific information without knowing Z specific
   identifiers like the function ID or the function handle of the zPCI
   function hidden behind the PCI interface.

But what does this allow userland to do and what specific pieces of
information do they need?  We do have a case already where Intel
graphics devices have a table (OpRegion) living in host system memory
that we expose via a vfio region, so it wouldn't be unprecedented to do
something like this, but as Connie suggests, if we knew what was being
consumed here and why, maybe we could generalize it into something
useful for others.


OK, sorry, let me try to explain better.

1) A short description of zPCI functions and groups

In Z, PCI cards live behind an adapter between subchannels and PCI.
We access PCI cards in two ways:
- dedicated PCI instructions (pci_load/pci_store/pci_store_block)
- DMA
We receive events through
- Adapter interrupts
- CHSC events

The adapter provides an IOMMU to protect the DMA,
and the interrupt handling goes through an MSI-X-like interface handled by
the adapter.


The architecture-specific PCI code provides the interface between the
standard PCI level and the zPCI function (PCI + DMA/IOMMU/Interrupt).


To handle the communication the "zPCI way", the CLP interface
provides instructions to retrieve information from the adapters.


There are different groups of functions sharing the same functionality.

clp_list gives us a list of zPCI functions
clp_query_pci_function returns information specific to a function
clp_query_group returns information on a function group


2) Why do we need it in the guest

We need to provide the guest with information on the adapters and zPCI
functions returned by the clp_query instruction so that the guest's
driver gets the right information on how the path to the zPCI function
has been set up in the host.



When a guest issues the CLP instructions we intercept the clp 

Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-20 Thread Pierre Morel

On 20/05/2019 16:27, Cornelia Huck wrote:

On Mon, 20 May 2019 13:19:23 +0200
Pierre Morel  wrote:


On 17/05/2019 20:04, Pierre Morel wrote:

On 17/05/2019 18:41, Alex Williamson wrote:

On Fri, 17 May 2019 18:16:50 +0200
Pierre Morel  wrote:
  

We implement the capability interface for VFIO_IOMMU_GET_INFO.

When calling the ioctl, the user must specify
VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
must check in the answer if capabilities are supported.

The iommu get_attr callback will be used to retrieve the specific
attributes and fill the capabilities.

Currently two Z-PCI specific capabilities will be queried and
filled by the underlying Z specific s390_iommu:
VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
and
VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.

Other architectures may add new capabilities in the same way
after enhancing the architecture specific IOMMU driver.

Signed-off-by: Pierre Morel 
---
   drivers/vfio/vfio_iommu_type1.c | 122
+++-
   1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c
b/drivers/vfio/vfio_iommu_type1.c
index d0f731c..9435647 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1658,6 +1658,97 @@ static int
vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
   return ret;
   }
+static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
+    struct vfio_info_cap *caps, size_t size)
+{
+    struct vfio_iommu_type1_info_pcifn *info_fn;
+    int ret;
+
+    info_fn = kzalloc(size, GFP_KERNEL);
+    if (!info_fn)
+    return -ENOMEM;
+
+    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
+    &info_fn->response);


What ensures that the 'struct clp_rsp_query_pci' returned from this
get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
Why does the latter contains so many reserved fields (beyond simply
alignment) for a user API?  What fields of these structures are
actually useful to userspace?  Should any fields not be exposed to the
user?  Aren't BAR sizes redundant to what's available through the vfio
PCI API?  I'm afraid that simply redefining an internal structure as
the API leaves a lot to be desired too.  Thanks,

Alex
  

Hi Alex,

I simply used the structure returned by the firmware to be sure to be
consistent with future evolutions and facilitate the copy from CLP and
to userland.

If you prefer, and I understand that this is the case, I can define a
specific VFIO_IOMMU structure with only the fields relevant to the user,
leaving future enhancement of the user's interface being implemented in
another kernel patch when the time has come.

In fact, the struct will have all defined fields I used but not the BAR
size and address (at least for now because there are special cases we do
not support yet with bars).
All the reserved fields can go away.

Is that closer to what you had in mind?

Also I have 2 interfaces:

s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland

Do you prefer:
- 2 different structures, no CLP raw structure
- the CLP raw structure for I1 and a VFIO specific structure for I2




IIUC, get_attr extracts various data points via clp, and we then make
it available to userspace. The clp interface needs to be abstracted
away at some point... one question from me: Is there a chance that
someone else may want to make use of the userspace interface (extra
information about a function)? If yes, I'd expect the get_attr to
obtain some kind of portable information already (basically your third
option, below).


Yes, that seems the most reasonable.
In this case I need to share the structure definition between:
userspace through vfio.h
vfio_iommu (this is obvious)
s390_iommu

It is this third include that made me hesitate.
But the way you put it, it looks like the most reasonable option because
it requires far fewer changes.


Thanks for the help. I will start this way, wait a day or two to see
whether any objection to this solution comes up, and then send the update.


Thanks,
Pierre





Hi Alex,

I am back on this again.
The solution above seems to me the best one, but it means I must
include S390-specific headers inside iommu_type1, which is AFAIU not
a good thing.
It seems that the powerpc architecture uses a solution with a dedicated
VFIO_IOMMU, the vfio_iommu_spapr_tce.

Wouldn't it be a solution for s390 too, to use vfio_iommu_type1 as a
basis for an s390-dedicated solution?
Then it becomes easier to have on one side the s390_iommu interface,
S390 specific, and on the other side a VFIO interface without a blind
copy of the firmware values.


If nobody else would want this exact interface, it might be a solution.
It would still be better not to encode clp data explicitly in the
userspace interface.
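
One way to picture "not encoding clp data explicitly" is an explicit translation step between the raw firmware response and whatever structure is shown to userspace. The sketch below is purely illustrative: both structures, their layouts and field names, and fn_info_from_raw() are hypothetical and do not correspond to the real 'struct clp_rsp_query_pci' or to any existing VFIO structure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a raw firmware response: defined fields
 * mixed with reserved areas and BAR data. */
struct raw_fw_rsp {
    uint8_t  reserved0[16];
    uint64_t sdma;
    uint64_t edma;
    uint32_t bar[6];
    uint8_t  reserved1[7];
    uint8_t  group_id;
};

/* Hypothetical user-facing structure: only the fields userspace
 * consumes, no reserved firmware fields, no BAR data. */
struct fn_info_sketch {
    uint64_t sdma;
    uint64_t edma;
    uint8_t  group_id;
    uint8_t  pad[7];   /* explicit padding, always zero */
};

void fn_info_from_raw(struct fn_info_sketch *out, const struct raw_fw_rsp *raw)
{
    assert(out && raw);

    /* Start from zero so padding never carries kernel or firmware data. */
    memset(out, 0, sizeof(*out));

    /* Copy field by field instead of memcpy()ing the raw layout. */
    out->sdma = raw->sdma;
    out->edma = raw->edma;
    out->group_id = raw->group_id;
}
```

With a step like this, the firmware layout can change without the user-visible structure changing, at the cost of one more definition to maintain.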



Do you think it is a viable solution?

Thanks,
Pierre




- the same VFIO structure for both I1 and I2





--
Pierre Morel
Linux/KVM/QEMU 

Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-20 Thread Cornelia Huck
On Mon, 20 May 2019 13:19:23 +0200
Pierre Morel  wrote:

> On 17/05/2019 20:04, Pierre Morel wrote:
> > On 17/05/2019 18:41, Alex Williamson wrote:  
> >> On Fri, 17 May 2019 18:16:50 +0200
> >> Pierre Morel  wrote:
> >>  
> >>> We implement the capability interface for VFIO_IOMMU_GET_INFO.
> >>>
> >>> When calling the ioctl, the user must specify
> >>> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
> >>> must check in the answer if capabilities are supported.
> >>>
> >>> The iommu get_attr callback will be used to retrieve the specific
> >>> attributes and fill the capabilities.
> >>>
> >>> Currently two Z-PCI specific capabilities will be queried and
> >>> filled by the underlying Z specific s390_iommu:
> >>> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
> >>> and
> >>> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
> >>>
> >>> Other architectures may add new capabilities in the same way
> >>> after enhancing the architecture specific IOMMU driver.
> >>>
> >>> Signed-off-by: Pierre Morel 
> >>> ---
> >>>   drivers/vfio/vfio_iommu_type1.c | 122 
> >>> +++-
> >>>   1 file changed, 121 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
> >>> b/drivers/vfio/vfio_iommu_type1.c
> >>> index d0f731c..9435647 100644
> >>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>> @@ -1658,6 +1658,97 @@ static int 
> >>> vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
> >>>   return ret;
> >>>   }
> >>> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
> >>> +    struct vfio_info_cap *caps, size_t size)
> >>> +{
> >>> +    struct vfio_iommu_type1_info_pcifn *info_fn;
> >>> +    int ret;
> >>> +
> >>> +    info_fn = kzalloc(size, GFP_KERNEL);
> >>> +    if (!info_fn)
> >>> +    return -ENOMEM;
> >>> +
> >>> +    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
> >>> +    &info_fn->response);
> >>
> >> What ensures that the 'struct clp_rsp_query_pci' returned from this
> >> get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
> >> Why does the latter contains so many reserved fields (beyond simply
> >> alignment) for a user API?  What fields of these structures are
> >> actually useful to userspace?  Should any fields not be exposed to the
> >> user?  Aren't BAR sizes redundant to what's available through the vfio
> >> PCI API?  I'm afraid that simply redefining an internal structure as
> >> the API leaves a lot to be desired too.  Thanks,
> >>
> >> Alex
> >>  
> > Hi Alex,
> > 
> > I simply used the structure returned by the firmware to be sure to be 
> > consistent with future evolutions and facilitate the copy from CLP and 
> > to userland.
> > 
> > If you prefer, and I understand that this is the case, I can define a 
> > specific VFIO_IOMMU structure with only the fields relevant to the user, 
> > leaving future enhancement of the user's interface being implemented in 
> > another kernel patch when the time has come.
> > 
> > In fact, the struct will have all defined fields I used but not the BAR 
> > size and address (at least for now because there are special cases we do 
> > not support yet with bars).
> > All the reserved fields can go away.
> > 
> > Is that closer to what you had in mind?
> > 
> > Also I have 2 interfaces:
> > 
> > s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland
> > 
> > Do you prefer:
> > - 2 different structures, no CLP raw structure
> > - the CLP raw structure for I1 and a VFIO specific structure for I2  



IIUC, get_attr extracts various data points via clp, and we then make
it available to userspace. The clp interface needs to be abstracted
away at some point... one question from me: Is there a chance that
someone else may want to make use of the userspace interface (extra
information about a function)? If yes, I'd expect the get_attr to
obtain some kind of portable information already (basically your third
option, below).

> 
> Hi Alex,
> 
> I am back again on this.
> The solution above seems to me the best one, but with it I must
> include an s390-specific header inside iommu_type1, which is AFAIU not
> a good thing.
> It seems that the powerpc architecture uses a solution with a dedicated
> VFIO_IOMMU, the vfio_iommu_spapr_tce.
> 
> Wouldn't it be a solution for s390 too, to use vfio_iommu_type1 as a
> basis for an s390-dedicated solution?
> Then it becomes easier to have on one side the s390_iommu interface,
> s390 specific, and on the other side a VFIO interface without a blind
> copy of the firmware values.

If nobody else would want this exact interface, it might be a solution.
It would still be better not to encode clp data explicitly in the
userspace interface.

> 
> Do you think it is a viable solution?
> 
> Thanks,
> Pierre
> 
> 
> 
> > - the same VFIO structure for both I1 and I2
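
For reference, the second option in the list above (raw CLP structure on I1, VFIO-specific structure on I2) could look roughly like the sketch below. All type, field and function names here (fw_query_fn_rsp, user_fn_info, fn_info_to_user) are hypothetical stand-ins and are not taken from the patch or from the s390 headers. The only point is that the user-facing structure is filled field by field, so reserved firmware fields and BAR data never reach userspace.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a raw firmware response (interface I1). */
struct fw_query_fn_rsp {
	uint16_t vfn;          /* virtual function number */
	uint8_t  reserved0[6]; /* firmware padding, of no use to userspace */
	uint64_t start_dma;    /* start of the DMA window */
	uint64_t end_dma;      /* end of the DMA window */
	uint32_t bar_size[6];  /* redundant with the vfio PCI API */
	uint8_t  pfgid;        /* function group id */
	uint8_t  reserved1[7];
};

/* Hypothetical curated structure exposed to userspace (interface I2). */
struct user_fn_info {
	uint64_t start_dma;
	uint64_t end_dma;
	uint16_t vfn;
	uint8_t  pfgid;
	uint8_t  pad[5];       /* explicit padding, always zero */
};

/* Copy only the fields userspace needs; everything else stays in the kernel. */
static void fn_info_to_user(const struct fw_query_fn_rsp *rsp,
			    struct user_fn_info *out)
{
	assert(rsp != NULL && out != NULL);

	memset(out, 0, sizeof(*out));
	out->start_dma = rsp->start_dma;
	out->end_dma = rsp->end_dma;
	out->vfn = rsp->vfn;
	out->pfgid = rsp->pfgid;
}
```

With this split, a change in the firmware layout only touches the conversion function, while the user structure can grow independently behind a new capability version.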

Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-20 Thread Pierre Morel

On 17/05/2019 20:04, Pierre Morel wrote:

On 17/05/2019 18:41, Alex Williamson wrote:

On Fri, 17 May 2019 18:16:50 +0200
Pierre Morel  wrote:


We implement the capability interface for VFIO_IOMMU_GET_INFO.

When calling the ioctl, the user must specify
VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
must check in the answer if capabilities are supported.

The iommu get_attr callback will be used to retrieve the specific
attributes and fill the capabilities.

Currently two Z-PCI specific capabilities will be queried and
filled by the underlying Z specific s390_iommu:
VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
and
VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.

Other architectures may add new capabilities in the same way
after enhancing the architecture specific IOMMU driver.

Signed-off-by: Pierre Morel 
---
  drivers/vfio/vfio_iommu_type1.c | 122 +++-
  1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d0f731c..9435647 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1658,6 +1658,97 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
  return ret;
  }
+static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
+    struct vfio_info_cap *caps, size_t size)
+{
+    struct vfio_iommu_type1_info_pcifn *info_fn;
+    int ret;
+
+    info_fn = kzalloc(size, GFP_KERNEL);
+    if (!info_fn)
+    return -ENOMEM;
+
+    ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
+    &info_fn->response);


What ensures that the 'struct clp_rsp_query_pci' returned from this
get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
Why does the latter contain so many reserved fields (beyond simply
alignment) for a user API?  What fields of these structures are
actually useful to userspace?  Should any fields not be exposed to the
user?  Aren't BAR sizes redundant to what's available through the vfio
PCI API?  I'm afraid that simply redefining an internal structure as
the API leaves a lot to be desired too.  Thanks,

Alex


Hi Alex,

I simply used the structure returned by the firmware to be sure to be 
consistent with future evolutions and facilitate the copy from CLP and 
to userland.


If you prefer, and I understand that this is the case, I can define a
specific VFIO_IOMMU structure with only the fields relevant to the user,
leaving future enhancements of the user interface to be implemented in
another kernel patch when the time comes.


In fact, the struct will have all defined fields I used but not the BAR 
size and address (at least for now because there are special cases we do 
not support yet with bars).

All the reserved fields can go away.

Does that conform better to your idea?

Also I have 2 interfaces:

s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland

Do you prefer:
- 2 different structures, no CLP raw structure
- the CLP raw structure for I1 and a VFIO specific structure for I2


Hi Alex,

I am back again on this.
The solution above seems to me the best one, but with it I must
include an s390-specific header inside iommu_type1, which is AFAIU not
a good thing.
It seems that the powerpc architecture uses a solution with a dedicated
VFIO_IOMMU, the vfio_iommu_spapr_tce.

Wouldn't it be a solution for s390 too, to use vfio_iommu_type1 as a
basis for an s390-dedicated solution?
Then it becomes easier to have on one side the s390_iommu interface,
s390 specific, and on the other side a VFIO interface without a blind
copy of the firmware values.


Do you think it is a viable solution?

Thanks,
Pierre




- the same VFIO structure for both I1 and I2

I would be grateful if you could give me a direction on this.

Thanks for the comments, and thanks a lot for answering so quickly.

Pierre










--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-19 Thread kbuild test robot
Hi Pierre,

I love your patch! Perhaps something to improve:

[auto build test WARNING on s390/features]
[also build test WARNING on v5.1 next-20190517]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Pierre-Morel/s390-pci-Exporting-access-to-CLP-PCI-function-and-PCI-group/20190520-025155
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 


sparse warnings: (new ones prefixed by >>)

>> drivers/vfio/vfio_iommu_type1.c:1707:5: sparse: sparse: symbol 
>> 'vfio_iommu_type1_caps' was not declared. Should it be static?

Please review and possibly fold the followup patch.
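
For context, sparse emits this warning when a function with external linkage is defined without a prior declaration. A minimal standalone sketch of the two usual fixes follows. The names caps_size_internal and caps_size_exported are made up for illustration and do not come from the patch; for vfio_iommu_type1_caps the first form would presumably apply if the function is only called from within vfio_iommu_type1.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fix 1: the function is only used in this file, so give it internal
 * linkage with 'static'. sparse no longer expects a declaration.
 */
static size_t caps_size_internal(size_t payload, size_t header)
{
	assert(payload <= SIZE_MAX - header);	/* guard against wraparound */
	return payload + header;
}

/*
 * Fix 2: the function is meant to be called from other files, so it is
 * declared before its definition (normally in a shared header).
 */
size_t caps_size_exported(size_t payload, size_t header);

size_t caps_size_exported(size_t payload, size_t header)
{
	return caps_size_internal(payload, header);
}
```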

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-17 Thread Pierre Morel

On 17/05/2019 18:41, Alex Williamson wrote:

On Fri, 17 May 2019 18:16:50 +0200
Pierre Morel  wrote:


We implement the capability interface for VFIO_IOMMU_GET_INFO.

When calling the ioctl, the user must specify
VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
must check in the answer if capabilities are supported.

The iommu get_attr callback will be used to retrieve the specific
attributes and fill the capabilities.

Currently two Z-PCI specific capabilities will be queried and
filled by the underlying Z specific s390_iommu:
VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
and
VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.

Other architectures may add new capabilities in the same way
after enhancing the architecture specific IOMMU driver.

Signed-off-by: Pierre Morel 
---
  drivers/vfio/vfio_iommu_type1.c | 122 +++-
  1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d0f731c..9435647 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1658,6 +1658,97 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
return ret;
  }
  
+static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
+   struct vfio_info_cap *caps, size_t size)
+{
+   struct vfio_iommu_type1_info_pcifn *info_fn;
+   int ret;
+
+   info_fn = kzalloc(size, GFP_KERNEL);
+   if (!info_fn)
+   return -ENOMEM;
+
+   ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
+   &info_fn->response);


What ensures that the 'struct clp_rsp_query_pci' returned from this
get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
Why does the latter contain so many reserved fields (beyond simply
alignment) for a user API?  What fields of these structures are
actually useful to userspace?  Should any fields not be exposed to the
user?  Aren't BAR sizes redundant to what's available through the vfio
PCI API?  I'm afraid that simply redefining an internal structure as
the API leaves a lot to be desired too.  Thanks,

Alex


Hi Alex,

I simply used the structure returned by the firmware to be sure to be 
consistent with future evolutions and facilitate the copy from CLP and 
to userland.


If you prefer, and I understand that this is the case, I can define a
specific VFIO_IOMMU structure with only the fields relevant to the user,
leaving future enhancements of the user interface to be implemented in
another kernel patch when the time comes.


In fact, the struct will have all defined fields I used but not the BAR 
size and address (at least for now because there are special cases we do 
not support yet with bars).

All the reserved fields can go away.

Does that conform better to your idea?

Also I have 2 interfaces:

s390_iommu.get_attr <-I1-> VFIO_IOMMU <-I2-> userland

Do you prefer:
- 2 different structures, no CLP raw structure
- the CLP raw structure for I1 and a VFIO specific structure for I2
- the same VFIO structure for both I1 and I2

I would be grateful if you could give me a direction on this.

Thanks for the comments, and thanks a lot for answering so quickly.

Pierre







--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany


Re: [PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-17 Thread Alex Williamson
On Fri, 17 May 2019 18:16:50 +0200
Pierre Morel  wrote:

> We implement the capability interface for VFIO_IOMMU_GET_INFO.
> 
> When calling the ioctl, the user must specify
> VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
> must check in the answer if capabilities are supported.
> 
> The iommu get_attr callback will be used to retrieve the specific
> attributes and fill the capabilities.
> 
> Currently two Z-PCI specific capabilities will be queried and
> filled by the underlying Z specific s390_iommu:
> VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
> and
> VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.
> 
> Other architectures may add new capabilities in the same way
> after enhancing the architecture specific IOMMU driver.
> 
> Signed-off-by: Pierre Morel 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 122 +++-
>  1 file changed, 121 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index d0f731c..9435647 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1658,6 +1658,97 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
>   return ret;
>  }
>  
> +static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
> + struct vfio_info_cap *caps, size_t size)
> +{
> + struct vfio_iommu_type1_info_pcifn *info_fn;
> + int ret;
> +
> + info_fn = kzalloc(size, GFP_KERNEL);
> + if (!info_fn)
> + return -ENOMEM;
> +
> + ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
> + &info_fn->response);

What ensures that the 'struct clp_rsp_query_pci' returned from this
get_attr remains consistent with a 'struct vfio_iommu_pci_function'?
Why does the latter contain so many reserved fields (beyond simply
alignment) for a user API?  What fields of these structures are
actually useful to userspace?  Should any fields not be exposed to the
user?  Aren't BAR sizes redundant to what's available through the vfio
PCI API?  I'm afraid that simply redefining an internal structure as
the API leaves a lot to be desired too.  Thanks,

Alex

> + if (ret < 0)
> + goto free_fn;
> +
> + info_fn->header.id = VFIO_IOMMU_INFO_CAP_QFN;
> + ret = vfio_info_add_capability(caps, &info_fn->header, size);
> +
> +free_fn:
> + kfree(info_fn);
> + return ret;
> +}
> +
> +static int vfio_iommu_type1_zpci_grp(struct iommu_domain *domain,
> +  struct vfio_info_cap *caps,
> +  size_t grp_size)
> +{
> + struct vfio_iommu_type1_info_pcifg *info_grp;
> + int ret;
> +
> + info_grp = kzalloc(grp_size, GFP_KERNEL);
> + if (!info_grp)
> + return -ENOMEM;
> +
> + ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_GRP,
> + (void *) &info_grp->response);
> + if (ret < 0)
> + goto free_grp;
> + info_grp->header.id = VFIO_IOMMU_INFO_CAP_QGRP;
> + ret = vfio_info_add_capability(caps, &info_grp->header, grp_size);
> +
> +free_grp:
> + kfree(info_grp);
> + return ret;
> +}
> +
> +int vfio_iommu_type1_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps,
> +   size_t size)
> +{
> + struct vfio_domain *d;
> + unsigned long total_size, fn_size, grp_size;
> + int ret;
> +
> + d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
> + if (!d)
> + return -ENODEV;
> +
> + /* First compute the size the user must provide */
> + total_size = 0;
> + fn_size = iommu_domain_get_attr(d->domain,
> + DOMAIN_ATTR_ZPCI_FN_SIZE, NULL);
> + if (fn_size > 0) {
> + fn_size +=  sizeof(struct vfio_info_cap_header);
> + total_size += fn_size;
> + }
> +
> + grp_size = iommu_domain_get_attr(d->domain,
> +  DOMAIN_ATTR_ZPCI_GRP_SIZE, NULL);
> + if (grp_size > 0) {
> + grp_size +=  sizeof(struct vfio_info_cap_header);
> + total_size += grp_size;
> + }
> +
> + if (total_size > size) {
> + /* Tell caller to call us with a greater buffer */
> + caps->size = total_size;
> + return 0;
> + }
> +
> + if (fn_size) {
> + ret = vfio_iommu_type1_zpci_fn(d->domain, caps, fn_size);
> + if (ret)
> + return ret;
> + }
> +
> + if (grp_size)
> + ret = vfio_iommu_type1_zpci_grp(d->domain, caps, grp_size);
> +
> + return ret;
> +}
> +
>  static long vfio_iommu_type1_ioctl(void *iommu_data,
>  unsigned int cmd, unsigned long arg)
>  {
> @@ -1679,6 +1770,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>   }
>   } else if (cmd == VFIO_IOMMU_GET_INFO) {
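
As an aside for readers following the size handling in vfio_iommu_type1_caps(): the "call again with a greater buffer" convention and the capability chain can be sketched in plain C as below. This is a simulation under stated assumptions, not the patch's code. struct cap_header mirrors the layout of struct vfio_info_cap_header from <linux/vfio.h> so that the sketch builds without kernel headers, while demo_info, demo_cap, demo_get_info(), find_cap() and the DEMO_CAP_* ids are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the layout of struct vfio_info_cap_header. */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;		/* offset of next capability, 0 ends the chain */
};

#define DEMO_CAP_QFN	1	/* hypothetical ids, not the patch's values */
#define DEMO_CAP_QGRP	2

struct demo_info {
	uint32_t argsz;		/* out: size required for a complete answer */
	uint32_t cap_offset;	/* out: offset of first capability, 0 if none */
};

struct demo_cap {
	struct cap_header header;
	uint64_t payload;
};

/* Stand-in for the ioctl: always reports the required size, and fills the
 * capability chain only when the caller's buffer is large enough. */
static void demo_get_info(void *buf, uint32_t argsz)
{
	const uint32_t need = sizeof(struct demo_info) + 2 * sizeof(struct demo_cap);
	struct demo_info info = { .argsz = need, .cap_offset = 0 };
	struct demo_cap caps[2];

	assert(argsz >= sizeof(info));
	if (argsz >= need) {
		memset(caps, 0, sizeof(caps));
		info.cap_offset = sizeof(info);
		caps[0].header.id = DEMO_CAP_QFN;
		caps[0].header.next = sizeof(info) + sizeof(struct demo_cap);
		caps[0].payload = 0x1000;
		caps[1].header.id = DEMO_CAP_QGRP;
		caps[1].header.next = 0;
		caps[1].payload = 0x2000;
		memcpy((char *)buf + sizeof(info), caps, sizeof(caps));
	}
	memcpy(buf, &info, sizeof(info));
}

/* Walk the chain with bounds checks; returns 0 and copies the entry if found. */
static int find_cap(const void *buf, uint32_t size, uint16_t id,
		    struct demo_cap *out)
{
	struct demo_info info;
	uint32_t off;

	memcpy(&info, buf, sizeof(info));
	for (off = info.cap_offset; off != 0; ) {
		struct demo_cap cap;

		if (off > size || size - off < sizeof(cap))
			return -1;	/* chain points outside the buffer */
		memcpy(&cap, (const char *)buf + off, sizeof(cap));
		if (cap.header.id == id) {
			*out = cap;
			return 0;
		}
		if (cap.header.next != 0 && cap.header.next <= off)
			return -1;	/* chain must move forward */
		off = cap.header.next;
	}
	return -1;
}
```

A caller would first pass a buffer holding only the fixed part, read back argsz, allocate that many bytes, call again, and then look up the capabilities it understands with find_cap().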

[PATCH v2 4/4] vfio: vfio_iommu_type1: implement VFIO_IOMMU_INFO_CAPABILITIES

2019-05-17 Thread Pierre Morel
We implement the capability interface for VFIO_IOMMU_GET_INFO.

When calling the ioctl, the user must specify
VFIO_IOMMU_INFO_CAPABILITIES to retrieve the capabilities and
must check in the answer if capabilities are supported.

The iommu get_attr callback will be used to retrieve the specific
attributes and fill the capabilities.

Currently two Z-PCI specific capabilities will be queried and
filled by the underlying Z specific s390_iommu:
VFIO_IOMMU_INFO_CAP_QFN for the PCI query function attributes
and
VFIO_IOMMU_INFO_CAP_QGRP for the PCI query function group.

Other architectures may add new capabilities in the same way
after enhancing the architecture specific IOMMU driver.

Signed-off-by: Pierre Morel 
---
 drivers/vfio/vfio_iommu_type1.c | 122 +++-
 1 file changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d0f731c..9435647 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1658,6 +1658,97 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu)
return ret;
 }
 
+static int vfio_iommu_type1_zpci_fn(struct iommu_domain *domain,
+   struct vfio_info_cap *caps, size_t size)
+{
+   struct vfio_iommu_type1_info_pcifn *info_fn;
+   int ret;
+
+   info_fn = kzalloc(size, GFP_KERNEL);
+   if (!info_fn)
+   return -ENOMEM;
+
+   ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_FN,
+   &info_fn->response);
+   if (ret < 0)
+   goto free_fn;
+
+   info_fn->header.id = VFIO_IOMMU_INFO_CAP_QFN;
+   ret = vfio_info_add_capability(caps, &info_fn->header, size);
+
+free_fn:
+   kfree(info_fn);
+   return ret;
+}
+
+static int vfio_iommu_type1_zpci_grp(struct iommu_domain *domain,
+struct vfio_info_cap *caps,
+size_t grp_size)
+{
+   struct vfio_iommu_type1_info_pcifg *info_grp;
+   int ret;
+
+   info_grp = kzalloc(grp_size, GFP_KERNEL);
+   if (!info_grp)
+   return -ENOMEM;
+
+   ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_ZPCI_GRP,
+   (void *) &info_grp->response);
+   if (ret < 0)
+   goto free_grp;
+   info_grp->header.id = VFIO_IOMMU_INFO_CAP_QGRP;
+   ret = vfio_info_add_capability(caps, &info_grp->header, grp_size);
+
+free_grp:
+   kfree(info_grp);
+   return ret;
+}
+
+int vfio_iommu_type1_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps,
+ size_t size)
+{
+   struct vfio_domain *d;
+   unsigned long total_size, fn_size, grp_size;
+   int ret;
+
+   d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
+   if (!d)
+   return -ENODEV;
+
+   /* First compute the size the user must provide */
+   total_size = 0;
+   fn_size = iommu_domain_get_attr(d->domain,
+   DOMAIN_ATTR_ZPCI_FN_SIZE, NULL);
+   if (fn_size > 0) {
+   fn_size +=  sizeof(struct vfio_info_cap_header);
+   total_size += fn_size;
+   }
+
+   grp_size = iommu_domain_get_attr(d->domain,
+DOMAIN_ATTR_ZPCI_GRP_SIZE, NULL);
+   if (grp_size > 0) {
+   grp_size +=  sizeof(struct vfio_info_cap_header);
+   total_size += grp_size;
+   }
+
+   if (total_size > size) {
+   /* Tell caller to call us with a greater buffer */
+   caps->size = total_size;
+   return 0;
+   }
+
+   if (fn_size) {
+   ret = vfio_iommu_type1_zpci_fn(d->domain, caps, fn_size);
+   if (ret)
+   return ret;
+   }
+
+   if (grp_size)
+   ret = vfio_iommu_type1_zpci_grp(d->domain, caps, grp_size);
+
+   return ret;
+}
+
 static long vfio_iommu_type1_ioctl(void *iommu_data,
   unsigned int cmd, unsigned long arg)
 {
@@ -1679,6 +1770,8 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
}
} else if (cmd == VFIO_IOMMU_GET_INFO) {
struct vfio_iommu_type1_info info;
+   struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
+   int ret;
 
minsz = offsetofend(struct vfio_iommu_type1_info, iova_pgsizes);
 
@@ -1688,7 +1781,34 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
if (info.argsz < minsz)
return -EINVAL;
 
-   info.flags = VFIO_IOMMU_INFO_PGSIZES;
+   if (info.flags & VFIO_IOMMU_INFO_CAPABILITIES) {
+   minsz = offsetofend(struct vfio_iommu_type1_info,
+   cap_offset);
+   if (info.argsz < minsz)
+