Re: [PATCH v2] vfio: add external user support

2013-06-27 Thread Alex Williamson
On Thu, 2013-06-27 at 17:14 +1000, Alexey Kardashevskiy wrote:
 VFIO is designed to be used via ioctls on file descriptors
 returned by VFIO.
 
 However in some situations support for an external user is required.
 The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to
 use the existing VFIO groups for exclusive access in real/virtual mode
 in the host kernel to avoid passing map/unmap requests to the user
 space which would made things pretty slow.
 
 The proposed protocol includes:
 
 1. do normal VFIO init stuff such as opening a new container, attaching
 group(s) to it, setting an IOMMU driver for a container. When IOMMU is
 set for a container, all groups in it are considered ready to use by
 an external user.
 
 2. pass a fd of the group we want to accelerate to KVM. KVM calls
 vfio_group_iommu_id_from_file() to verify if the group is initialized
 and IOMMU is set for it. The current TCE IOMMU driver marks the whole
 IOMMU table as busy when IOMMU is set for a container what this prevents
 other DMA users from allocating from it so it is safe to pass the group
 to the user space.
 
 3. KVM increases the container users counter via
 vfio_group_add_external_user(). This prevents the VFIO group from
 being disposed prior to exiting KVM.
 
 4. When KVM is finished and doing cleanup, it releases the group file
 and decrements the container users counter. Everything gets released.
 
 5. KVM also keeps the group file as otherwise its fd might have been
 closed at the moment of KVM finish so vfio_group_del_external_user()
 call will not be possible.

This is the wrong order in my mind.  An external user has no business
checking or maintaining any state of a group until it calls
add_external_user().  Only after that call is successful can the user
assume the filep to group relationship is static and get the iommu_id.
Any use of the external user API should start with add and end with
del.

 The vfio: Limit group opens patch is also required for the consistency.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 
 v1-v2: added definitions to vfio.h :)
 Should not compile but compiled. Hm.
 
 ---
  drivers/vfio/vfio.c  |   54 
 ++
  include/linux/vfio.h |7 +++
  2 files changed, 61 insertions(+)
 
 diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
 index c488da5..40875d2 100644
 --- a/drivers/vfio/vfio.c
 +++ b/drivers/vfio/vfio.c
 @@ -1370,6 +1370,60 @@ static const struct file_operations vfio_device_fops = 
 {
  };
  
  /**
 + * External user API, exported by symbols to be linked dynamically.
 + */
 +
 +/* Allows an external user (for example, KVM) to lock an IOMMU group */
 +int vfio_group_add_external_user(struct file *filep)
 +{
 + struct vfio_group *group = filep-private_data;
 +
 + if (filep-f_op != vfio_group_fops)
 + return -EINVAL;
 +
 + if (!atomic_inc_not_zero(group-container_users))
 + return -EINVAL;

This is the place where I was suggesting we need tests to match
get_device_fd.  It's not clear what the external user is holding if the
group has no iommu or is not viable here.


if (!group-container-iommu_driver || !vfio_group_viable(group)) {
vfio_group_try_dissolve_container(group);
return -EINVAL;
}

 +
 + return 0;
 +}
 +EXPORT_SYMBOL_GPL(vfio_group_add_external_user);
 +
 +/* Allows an external user (for example, KVM) to unlock an IOMMU group */
 +void vfio_group_del_external_user(struct file *filep)
 +{
 + struct vfio_group *group = filep-private_data;
 +
 + if (WARN_ON(filep-f_op != vfio_group_fops))
 + return;

How about we make this return int so we can return 0/-EINVAL and the
caller can decide the severity of the response?

 +
 + vfio_group_try_dissolve_container(group);
 +}
 +EXPORT_SYMBOL_GPL(vfio_group_del_external_user);
 +
 +/*
 + * Checks if a group for the specified file can be used by
 + * an external user and returns the IOMMU ID if external use is possible.
 + */
 +int vfio_group_iommu_id_from_file(struct file *filep)

Let's name this in a way that makes it clear that it's part of the
external_user API.  vfio_group_external_user_iommu_id?

 +{
 + int ret;
 + struct vfio_group *group = filep-private_data;
 +
 + if (WARN_ON(filep-f_op != vfio_group_fops))
 + return -EINVAL;

This one probably doesn't deserve a WARN_ON either, let the caller
blowup if it wants.

 +
 + if (0 == atomic_read(group-container_users) ||
 + !group-container-iommu_driver ||
 + !vfio_group_viable(group))
 + return -EINVAL;

The above test just becomes a weak test that the caller is  correctly
using the API since we should be enforcing these tests when the external
user is added.  It doesn't hurt to leave them, but it's not very
convincing that the caller is the one holding anything.

 + ret = iommu_group_id(group-iommu_group);

The 'ret' variable isn't needed.
 

Re: [PATCH v2] vfio: add external user support

2013-06-27 Thread Alexey Kardashevskiy
On 06/28/2013 01:44 AM, Alex Williamson wrote:
 On Thu, 2013-06-27 at 17:14 +1000, Alexey Kardashevskiy wrote:
 VFIO is designed to be used via ioctls on file descriptors
 returned by VFIO.

 However in some situations support for an external user is required.
 The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to
 use the existing VFIO groups for exclusive access in real/virtual mode
 in the host kernel to avoid passing map/unmap requests to the user
 space which would made things pretty slow.

 The proposed protocol includes:

 1. do normal VFIO init stuff such as opening a new container, attaching
 group(s) to it, setting an IOMMU driver for a container. When IOMMU is
 set for a container, all groups in it are considered ready to use by
 an external user.

 2. pass a fd of the group we want to accelerate to KVM. KVM calls
 vfio_group_iommu_id_from_file() to verify if the group is initialized
 and IOMMU is set for it. The current TCE IOMMU driver marks the whole
 IOMMU table as busy when IOMMU is set for a container what this prevents
 other DMA users from allocating from it so it is safe to pass the group
 to the user space.

 3. KVM increases the container users counter via
 vfio_group_add_external_user(). This prevents the VFIO group from
 being disposed prior to exiting KVM.

 4. When KVM is finished and doing cleanup, it releases the group file
 and decrements the container users counter. Everything gets released.

 5. KVM also keeps the group file as otherwise its fd might have been
 closed at the moment of KVM finish so vfio_group_del_external_user()
 call will not be possible.
 
 This is the wrong order in my mind.  An external user has no business
 checking or maintaining any state of a group until it calls
 add_external_user().  Only after that call is successful can the user
 assume the filep to group relationship is static and get the iommu_id.
 Any use of the external user API should start with add and end with
 del.

Yes, this is what I actually do, just wrong commit message, will fix.

 
 The vfio: Limit group opens patch is also required for the consistency.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---

 v1-v2: added definitions to vfio.h :)
 Should not compile but compiled. Hm.

 ---
  drivers/vfio/vfio.c  |   54 
 ++
  include/linux/vfio.h |7 +++
  2 files changed, 61 insertions(+)

 diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
 index c488da5..40875d2 100644
 --- a/drivers/vfio/vfio.c
 +++ b/drivers/vfio/vfio.c
 @@ -1370,6 +1370,60 @@ static const struct file_operations vfio_device_fops 
 = {
  };
  
  /**
 + * External user API, exported by symbols to be linked dynamically.
 + */
 +
 +/* Allows an external user (for example, KVM) to lock an IOMMU group */
 +int vfio_group_add_external_user(struct file *filep)
 +{
 +struct vfio_group *group = filep-private_data;
 +
 +if (filep-f_op != vfio_group_fops)
 +return -EINVAL;
 +
 +if (!atomic_inc_not_zero(group-container_users))
 +return -EINVAL;
 
 This is the place where I was suggesting we need tests to match
 get_device_fd.  It's not clear what the external user is holding if the
 group has no iommu or is not viable here.


In my mind this test must include test for iommu id so I would merge it
with vfio_group_iommu_id_from_file(). Till I check iommu id, I still cannot
use this group so where to put check for iommu/viable does not really
matter (for me).

 
 
 if (!group-container-iommu_driver || !vfio_group_viable(group)) {
   vfio_group_try_dissolve_container(group);
   return -EINVAL;
 }
 
 +
 +return 0;
 +}
 +EXPORT_SYMBOL_GPL(vfio_group_add_external_user);
 +
 +/* Allows an external user (for example, KVM) to unlock an IOMMU group */
 +void vfio_group_del_external_user(struct file *filep)
 +{
 +struct vfio_group *group = filep-private_data;
 +
 +if (WARN_ON(filep-f_op != vfio_group_fops))
 +return;
 
 How about we make this return int so we can return 0/-EINVAL and the
 caller can decide the severity of the response?

And what can the caller possibly do on !0?


-- 
Alexey
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] vfio: add external user support

2013-06-27 Thread Alex Williamson
On Fri, 2013-06-28 at 08:57 +1000, Alexey Kardashevskiy wrote:
 On 06/28/2013 01:44 AM, Alex Williamson wrote:
  On Thu, 2013-06-27 at 17:14 +1000, Alexey Kardashevskiy wrote:
  VFIO is designed to be used via ioctls on file descriptors
  returned by VFIO.
 
  However in some situations support for an external user is required.
  The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to
  use the existing VFIO groups for exclusive access in real/virtual mode
  in the host kernel to avoid passing map/unmap requests to the user
  space which would made things pretty slow.
 
  The proposed protocol includes:
 
  1. do normal VFIO init stuff such as opening a new container, attaching
  group(s) to it, setting an IOMMU driver for a container. When IOMMU is
  set for a container, all groups in it are considered ready to use by
  an external user.
 
  2. pass a fd of the group we want to accelerate to KVM. KVM calls
  vfio_group_iommu_id_from_file() to verify if the group is initialized
  and IOMMU is set for it. The current TCE IOMMU driver marks the whole
  IOMMU table as busy when IOMMU is set for a container what this prevents
  other DMA users from allocating from it so it is safe to pass the group
  to the user space.
 
  3. KVM increases the container users counter via
  vfio_group_add_external_user(). This prevents the VFIO group from
  being disposed prior to exiting KVM.
 
  4. When KVM is finished and doing cleanup, it releases the group file
  and decrements the container users counter. Everything gets released.
 
  5. KVM also keeps the group file as otherwise its fd might have been
  closed at the moment of KVM finish so vfio_group_del_external_user()
  call will not be possible.
  
  This is the wrong order in my mind.  An external user has no business
  checking or maintaining any state of a group until it calls
  add_external_user().  Only after that call is successful can the user
  assume the filep to group relationship is static and get the iommu_id.
  Any use of the external user API should start with add and end with
  del.
 
 Yes, this is what I actually do, just wrong commit message, will fix.
 
  
  The vfio: Limit group opens patch is also required for the consistency.
 
  Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
  ---
 
  v1-v2: added definitions to vfio.h :)
  Should not compile but compiled. Hm.
 
  ---
   drivers/vfio/vfio.c  |   54 
  ++
   include/linux/vfio.h |7 +++
   2 files changed, 61 insertions(+)
 
  diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
  index c488da5..40875d2 100644
  --- a/drivers/vfio/vfio.c
  +++ b/drivers/vfio/vfio.c
  @@ -1370,6 +1370,60 @@ static const struct file_operations 
  vfio_device_fops = {
   };
   
   /**
  + * External user API, exported by symbols to be linked dynamically.
  + */
  +
  +/* Allows an external user (for example, KVM) to lock an IOMMU group */
  +int vfio_group_add_external_user(struct file *filep)
  +{
  +  struct vfio_group *group = filep-private_data;
  +
  +  if (filep-f_op != vfio_group_fops)
  +  return -EINVAL;
  +
  +  if (!atomic_inc_not_zero(group-container_users))
  +  return -EINVAL;
  
  This is the place where I was suggesting we need tests to match
  get_device_fd.  It's not clear what the external user is holding if the
  group has no iommu or is not viable here.
 
 
 In my mind this test must include test for iommu id so I would merge it
 with vfio_group_iommu_id_from_file().

I'm not sure what that means.

 Till I check iommu id, I still cannot
 use this group so where to put check for iommu/viable does not really
 matter (for me).

The difference is that getting the group id may just be the first of
several external user API interfaces.  The idea of external user
interface is that from add-del the group is maintained in the same
state as if a device was opened.  If we disassemble that so that add
sets up some stuff and getting the group id does a little more, what
happens if we start adding more external user API callbacks?  A user of
the interface shouldn't need to know the internals to know which
interface allows what aspect of use.  Besides, I don't want to have to
worry about managing another state slightly different from that used by
the device fd.

  
  
  if (!group-container-iommu_driver || !vfio_group_viable(group)) {
  vfio_group_try_dissolve_container(group);
  return -EINVAL;
  }
  
  +
  +  return 0;
  +}
  +EXPORT_SYMBOL_GPL(vfio_group_add_external_user);
  +
  +/* Allows an external user (for example, KVM) to unlock an IOMMU group */
  +void vfio_group_del_external_user(struct file *filep)
  +{
  +  struct vfio_group *group = filep-private_data;
  +
  +  if (WARN_ON(filep-f_op != vfio_group_fops))
  +  return;
  
  How about we make this return int so we can return 0/-EINVAL and the
  caller can decide the severity of the response?
 
 And what can the caller possibly do on !0?


Re: [PATCH v2] vfio: add external user support

2013-06-27 Thread Alexey Kardashevskiy
On 06/28/2013 10:41 AM, Alex Williamson wrote:
 On Fri, 2013-06-28 at 08:57 +1000, Alexey Kardashevskiy wrote:
 On 06/28/2013 01:44 AM, Alex Williamson wrote:
 On Thu, 2013-06-27 at 17:14 +1000, Alexey Kardashevskiy wrote:
 VFIO is designed to be used via ioctls on file descriptors
 returned by VFIO.

 However in some situations support for an external user is required.
 The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to
 use the existing VFIO groups for exclusive access in real/virtual mode
 in the host kernel to avoid passing map/unmap requests to the user
 space which would made things pretty slow.

 The proposed protocol includes:

 1. do normal VFIO init stuff such as opening a new container, attaching
 group(s) to it, setting an IOMMU driver for a container. When IOMMU is
 set for a container, all groups in it are considered ready to use by
 an external user.

 2. pass a fd of the group we want to accelerate to KVM. KVM calls
 vfio_group_iommu_id_from_file() to verify if the group is initialized
 and IOMMU is set for it. The current TCE IOMMU driver marks the whole
 IOMMU table as busy when IOMMU is set for a container what this prevents
 other DMA users from allocating from it so it is safe to pass the group
 to the user space.

 3. KVM increases the container users counter via
 vfio_group_add_external_user(). This prevents the VFIO group from
 being disposed prior to exiting KVM.

 4. When KVM is finished and doing cleanup, it releases the group file
 and decrements the container users counter. Everything gets released.

 5. KVM also keeps the group file as otherwise its fd might have been
 closed at the moment of KVM finish so vfio_group_del_external_user()
 call will not be possible.

 This is the wrong order in my mind.  An external user has no business
 checking or maintaining any state of a group until it calls
 add_external_user().  Only after that call is successful can the user
 assume the filep to group relationship is static and get the iommu_id.
 Any use of the external user API should start with add and end with
 del.

 Yes, this is what I actually do, just wrong commit message, will fix.


 The vfio: Limit group opens patch is also required for the consistency.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---

 v1-v2: added definitions to vfio.h :)
 Should not compile but compiled. Hm.

 ---
  drivers/vfio/vfio.c  |   54 
 ++
  include/linux/vfio.h |7 +++
  2 files changed, 61 insertions(+)

 diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
 index c488da5..40875d2 100644
 --- a/drivers/vfio/vfio.c
 +++ b/drivers/vfio/vfio.c
 @@ -1370,6 +1370,60 @@ static const struct file_operations 
 vfio_device_fops = {
  };
  
  /**
 + * External user API, exported by symbols to be linked dynamically.
 + */
 +
 +/* Allows an external user (for example, KVM) to lock an IOMMU group */
 +int vfio_group_add_external_user(struct file *filep)
 +{
 +  struct vfio_group *group = filep-private_data;
 +
 +  if (filep-f_op != vfio_group_fops)
 +  return -EINVAL;
 +
 +  if (!atomic_inc_not_zero(group-container_users))
 +  return -EINVAL;

 This is the place where I was suggesting we need tests to match
 get_device_fd.  It's not clear what the external user is holding if the
 group has no iommu or is not viable here.


 In my mind this test must include test for iommu id so I would merge it
 with vfio_group_iommu_id_from_file().
 
 I'm not sure what that means.

Sorry. Still a mess in my head :( I'll to explain.

vfio_group_add_external_user() should tell if the group is viable and has
iommu (does not the latter include check for viable?).

vfio_group_iommu_id_from_file() tells the group id which has to be compared
by KVM with what KVM got from the userspace and KVM should reject if the
group id is wrong.

So there are 3 checks. KVM can continue if all three passed.

 Till I check iommu id, I still cannot
 use this group so where to put check for iommu/viable does not really
 matter (for me).
 
 The difference is that getting the group id may just be the first of
 several external user API interfaces.  The idea of external user
 interface is that from add-del the group is maintained in the same
 state as if a device was opened.

Good point.

 If we disassemble that so that add
 sets up some stuff and getting the group id does a little more, what
 happens if we start adding more external user API callbacks?  A user of
 the interface shouldn't need to know the internals to know which
 interface allows what aspect of use.  Besides, I don't want to have to
 worry about managing another state slightly different from that used by
 the device fd.





 if (!group-container-iommu_driver || !vfio_group_viable(group)) {
 vfio_group_try_dissolve_container(group);
 return -EINVAL;
 }

 +
 +  return 0;
 +}
 +EXPORT_SYMBOL_GPL(vfio_group_add_external_user);
 +
 +/* Allows an external user (for example, KVM) to 

Re: [PATCH v2] vfio: add external user support

2013-06-27 Thread Alex Williamson
On Fri, 2013-06-28 at 11:38 +1000, Alexey Kardashevskiy wrote:
 On 06/28/2013 10:41 AM, Alex Williamson wrote:
  On Fri, 2013-06-28 at 08:57 +1000, Alexey Kardashevskiy wrote:
  On 06/28/2013 01:44 AM, Alex Williamson wrote:
  On Thu, 2013-06-27 at 17:14 +1000, Alexey Kardashevskiy wrote:
  VFIO is designed to be used via ioctls on file descriptors
  returned by VFIO.
 
  However in some situations support for an external user is required.
  The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to
  use the existing VFIO groups for exclusive access in real/virtual mode
  in the host kernel to avoid passing map/unmap requests to the user
  space which would made things pretty slow.
 
  The proposed protocol includes:
 
  1. do normal VFIO init stuff such as opening a new container, attaching
  group(s) to it, setting an IOMMU driver for a container. When IOMMU is
  set for a container, all groups in it are considered ready to use by
  an external user.
 
  2. pass a fd of the group we want to accelerate to KVM. KVM calls
  vfio_group_iommu_id_from_file() to verify if the group is initialized
  and IOMMU is set for it. The current TCE IOMMU driver marks the whole
  IOMMU table as busy when IOMMU is set for a container what this prevents
  other DMA users from allocating from it so it is safe to pass the group
  to the user space.
 
  3. KVM increases the container users counter via
  vfio_group_add_external_user(). This prevents the VFIO group from
  being disposed prior to exiting KVM.
 
  4. When KVM is finished and doing cleanup, it releases the group file
  and decrements the container users counter. Everything gets released.
 
  5. KVM also keeps the group file as otherwise its fd might have been
  closed at the moment of KVM finish so vfio_group_del_external_user()
  call will not be possible.
 
  This is the wrong order in my mind.  An external user has no business
  checking or maintaining any state of a group until it calls
  add_external_user().  Only after that call is successful can the user
  assume the filep to group relationship is static and get the iommu_id.
  Any use of the external user API should start with add and end with
  del.
 
  Yes, this is what I actually do, just wrong commit message, will fix.
 
 
  The vfio: Limit group opens patch is also required for the consistency.
 
  Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
  ---
 
  v1-v2: added definitions to vfio.h :)
  Should not compile but compiled. Hm.
 
  ---
   drivers/vfio/vfio.c  |   54 
  ++
   include/linux/vfio.h |7 +++
   2 files changed, 61 insertions(+)
 
  diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
  index c488da5..40875d2 100644
  --- a/drivers/vfio/vfio.c
  +++ b/drivers/vfio/vfio.c
  @@ -1370,6 +1370,60 @@ static const struct file_operations 
  vfio_device_fops = {
   };
   
   /**
  + * External user API, exported by symbols to be linked dynamically.
  + */
  +
  +/* Allows an external user (for example, KVM) to lock an IOMMU group */
  +int vfio_group_add_external_user(struct file *filep)
  +{
  +struct vfio_group *group = filep-private_data;
  +
  +if (filep-f_op != vfio_group_fops)
  +return -EINVAL;
  +
  +if (!atomic_inc_not_zero(group-container_users))
  +return -EINVAL;
 
  This is the place where I was suggesting we need tests to match
  get_device_fd.  It's not clear what the external user is holding if the
  group has no iommu or is not viable here.
 
 
  In my mind this test must include test for iommu id so I would merge it
  with vfio_group_iommu_id_from_file().
  
  I'm not sure what that means.
 
 Sorry. Still a mess in my head :( I'll to explain.
 
 vfio_group_add_external_user() should tell if the group is viable and has
 iommu

Agreed

 (does not the latter include check for viable?).

Mostly paranoia

 vfio_group_iommu_id_from_file() tells the group id which has to be compared
 by KVM with what KVM got from the userspace and KVM should reject if the
 group id is wrong.
 
 So there are 3 checks. KVM can continue if all three passed.

That's KVM's business, but what does it prove for userspace to give KVM
both a vfio group file descriptor and a group id?  It seems redundant
since the group id from vfio needs to take precedence.  More paranoia?

  Till I check iommu id, I still cannot
  use this group so where to put check for iommu/viable does not really
  matter (for me).
  
  The difference is that getting the group id may just be the first of
  several external user API interfaces.  The idea of external user
  interface is that from add-del the group is maintained in the same
  state as if a device was opened.
 
 Good point.
 
  If we disassemble that so that add
  sets up some stuff and getting the group id does a little more, what
  happens if we start adding more external user API callbacks?  A user of
  the interface shouldn't need to know the 

Re: [PATCH v2] vfio: add external user support

2013-06-27 Thread Alexey Kardashevskiy
On 06/28/2013 12:37 PM, Alex Williamson wrote:
 On Fri, 2013-06-28 at 11:38 +1000, Alexey Kardashevskiy wrote:
 On 06/28/2013 10:41 AM, Alex Williamson wrote:
 On Fri, 2013-06-28 at 08:57 +1000, Alexey Kardashevskiy wrote:
 On 06/28/2013 01:44 AM, Alex Williamson wrote:
 On Thu, 2013-06-27 at 17:14 +1000, Alexey Kardashevskiy wrote:
 VFIO is designed to be used via ioctls on file descriptors
 returned by VFIO.

 However in some situations support for an external user is required.
 The first user is KVM on PPC64 (SPAPR TCE protocol) which is going to
 use the existing VFIO groups for exclusive access in real/virtual mode
 in the host kernel to avoid passing map/unmap requests to the user
 space which would made things pretty slow.

 The proposed protocol includes:

 1. do normal VFIO init stuff such as opening a new container, attaching
 group(s) to it, setting an IOMMU driver for a container. When IOMMU is
 set for a container, all groups in it are considered ready to use by
 an external user.

 2. pass a fd of the group we want to accelerate to KVM. KVM calls
 vfio_group_iommu_id_from_file() to verify if the group is initialized
 and IOMMU is set for it. The current TCE IOMMU driver marks the whole
 IOMMU table as busy when IOMMU is set for a container what this prevents
 other DMA users from allocating from it so it is safe to pass the group
 to the user space.

 3. KVM increases the container users counter via
 vfio_group_add_external_user(). This prevents the VFIO group from
 being disposed prior to exiting KVM.

 4. When KVM is finished and doing cleanup, it releases the group file
 and decrements the container users counter. Everything gets released.

 5. KVM also keeps the group file as otherwise its fd might have been
 closed at the moment of KVM finish so vfio_group_del_external_user()
 call will not be possible.

 This is the wrong order in my mind.  An external user has no business
 checking or maintaining any state of a group until it calls
 add_external_user().  Only after that call is successful can the user
 assume the filep to group relationship is static and get the iommu_id.
 Any use of the external user API should start with add and end with
 del.

 Yes, this is what I actually do, just wrong commit message, will fix.


 The vfio: Limit group opens patch is also required for the consistency.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---

 v1-v2: added definitions to vfio.h :)
 Should not compile but compiled. Hm.

 ---
  drivers/vfio/vfio.c  |   54 
 ++
  include/linux/vfio.h |7 +++
  2 files changed, 61 insertions(+)

 diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
 index c488da5..40875d2 100644
 --- a/drivers/vfio/vfio.c
 +++ b/drivers/vfio/vfio.c
 @@ -1370,6 +1370,60 @@ static const struct file_operations 
 vfio_device_fops = {
  };
  
  /**
 + * External user API, exported by symbols to be linked dynamically.
 + */
 +
 +/* Allows an external user (for example, KVM) to lock an IOMMU group */
 +int vfio_group_add_external_user(struct file *filep)
 +{
 +struct vfio_group *group = filep-private_data;
 +
 +if (filep-f_op != vfio_group_fops)
 +return -EINVAL;
 +
 +if (!atomic_inc_not_zero(group-container_users))
 +return -EINVAL;

 This is the place where I was suggesting we need tests to match
 get_device_fd.  It's not clear what the external user is holding if the
 group has no iommu or is not viable here.


 In my mind this test must include test for iommu id so I would merge it
 with vfio_group_iommu_id_from_file().

 I'm not sure what that means.

 Sorry. Still a mess in my head :( I'll to explain.

 vfio_group_add_external_user() should tell if the group is viable and has
 iommu
 
 Agreed
 
 (does not the latter include check for viable?).
 
 Mostly paranoia
 
 vfio_group_iommu_id_from_file() tells the group id which has to be compared
 by KVM with what KVM got from the userspace and KVM should reject if the
 group id is wrong.

 So there are 3 checks. KVM can continue if all three passed.
 
 That's KVM's business, but what does it prove for userspace to give KVM
 both a vfio group file descriptor and a group id?  It seems redundant
 since the group id from vfio needs to take precedence.  More paranoia?

Yep, that's her :) Without this check, the user space is allowed to mix up
PHB ID (liobn) and IOMMU group. It has the right to do so and it should not
break anything but nice to check, no?



 Till I check iommu id, I still cannot
 use this group so where to put check for iommu/viable does not really
 matter (for me).

 The difference is that getting the group id may just be the first of
 several external user API interfaces.  The idea of external user
 interface is that from add-del the group is maintained in the same
 state as if a device was opened.

 Good point.

 If we disassemble that so that add
 sets up some stuff and getting the group id does a