Re: [Intel-gfx] [PATCH v9 10/25] vfio: Make vfio_device_open() single open for device cdev path
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > VFIO group has historically allowed multi-open of the device FD. This > was made secure because the "open" was executed via an ioctl to the > group FD which is itself only single open. > > However, no known use of multiple device FDs today. It is kind of a > strange thing to do because new device FDs can naturally be created > via dup(). > > When we implement the new device uAPI (only used in cdev path) there is > no natural way to allow the device itself from being multi-opened in a > secure manner. Without the group FD we cannot prove the security context > of the opener. > > Thus, when moving to the new uAPI we block the ability of opening > a device multiple times. Given old group path still allows it we store > a vfio_group pointer in struct vfio_device_file to differentiate. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Terrence Xu > Tested-by: Nicolin Chen > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > drivers/vfio/group.c | 2 ++ > drivers/vfio/vfio.h | 2 ++ > drivers/vfio/vfio_main.c | 7 +++ > 3 files changed, 11 insertions(+) > > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c > index d55ce3ca44b7..1af4b9e012a7 100644 > --- a/drivers/vfio/group.c > +++ b/drivers/vfio/group.c > @@ -245,6 +245,8 @@ static struct file *vfio_device_open_file(struct > vfio_device *device) > goto err_out; > } > > + df->group = device->group; > + in previous patches df fields were protected with various locks. I refer to vfio_device_group_open() implementation. No need here? By the way since the group is set here, wrt [PATCH v9 06/25] kvm/vfio: Accept vfio device file from userspace you have a way to determine if a device was opened in the legacy way, no? > ret = vfio_device_group_open(df); > if (ret) > goto err_free; > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > index b2f20b78a707..f1a448f9d067 100644 > --- a/drivers/vfio/vfio.h > +++ b/drivers/vfio/vfio.h > @@ -18,6 +18,8 @@ struct vfio_container; > > struct vfio_device_file { > struct vfio_device *device; > + struct vfio_group *group; > + > bool access_granted; > spinlock_t kvm_ref_lock; /* protect kvm field */ > struct kvm *kvm; > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index 6d5d3c2180c8..c8721d5d05fa 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -477,6 +477,13 @@ int vfio_device_open(struct vfio_device_file *df) > > lockdep_assert_held(&device->dev_set->lock); > > + /* > + * Only the group path allows the device opened multiple times. allows the device to be opened multiple times > + * The device cdev path doesn't have a secure way for it. > + */ > + if (device->open_count != 0 && !df->group) > + return -EINVAL; > + > device->open_count++; > if (device->open_count == 1) { > ret = vfio_device_first_open(df); Thanks Eric
Re: [Intel-gfx] [PATCH v9 06/25] kvm/vfio: Accept vfio device file from userspace
Hi Yi, On 4/7/23 05:42, Liu, Yi L wrote: >> From: Alex Williamson >> Sent: Friday, April 7, 2023 2:58 AM You don't say anything about potential restriction, ie. what if the user calls KVM_DEV_VFIO_FILE with device fds while it has been using legacy >> container/group API? >>> legacy container/group path cannot do it as the below enhancement. >>> User needs to call KVM_DEV_VFIO_FILE before open devices, so this >>> should happen before _GET_DEVICE_FD. So the legacy path can never >>> pass device fds in KVM_DEV_VFIO_FILE. >>> >>> >> https://lore.kernel.org/kvm/20230327102059.333d6976.alex.william...@redhat.com >> /#t >> >> Wait, are you suggesting that a comment in the documentation suggesting >> a usage policy somehow provides enforcement of that ordering?? That's >> not how this works. Thanks, > I don't know if there is a good way to enforce this order in the code. The > vfio_device->kvm pointer is optional. If it is NULL, vfio just ignores it. > So vfio doesn't have a good way to tell if the order requirement is met or > not. Perhaps just trigger NULL pointer dereference when kvm pointer is used > in the device drivers like kvmgt if this order is not met. > > So that's why I come up to document it here. The applications uses kvm > should know this and follow this otherwise it may encounter error. > > Do you have other suggestions for it? This order should be a generic > requirement. is it? group path also needs to follow it to make the mdev > driver that refers kvm pointer to be workable. In the same way as kvm_vfio_file_is_valid() called in kvm_vfio_file_add() can't you have a kernel API that checks the fd consistence? Thanks Eric > > Thanks, > Yi Liu > > -The GROUP_ADD operation above should be invoked prior to accessing the > +The FILE/GROUP_ADD operation above should be invoked prior to accessing > the > device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support > drivers which require a kvm pointer to be set in their .open_device() > -callback. > +callback. It is the same for device file descriptor via character device > +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such > file > +descriptors, FILE_ADD should be invoked before >> VFIO_DEVICE_BIND_IOMMUFD > +to support the drivers mentioned in prior sentence as well. >>> just as here. This means device fds can only be passed with >>> KVM_DEV_VFIO_FILE >>> in the cdev path. >>> >>> Regards, >>> Yi Liu
Re: [Intel-gfx] [PATCH v9 10/25] vfio: Make vfio_device_open() single open for device cdev path
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > VFIO group has historically allowed multi-open of the device FD. This > was made secure because the "open" was executed via an ioctl to the > group FD which is itself only single open. > > However, no known use of multiple device FDs today. It is kind of a > strange thing to do because new device FDs can naturally be created > via dup(). > > When we implement the new device uAPI (only used in cdev path) there is > no natural way to allow the device itself from being multi-opened in a > secure manner. Without the group FD we cannot prove the security context > of the opener. > > Thus, when moving to the new uAPI we block the ability of opening > a device multiple times. Given old group path still allows it we store > a vfio_group pointer in struct vfio_device_file to differentiate. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Terrence Xu > Tested-by: Nicolin Chen > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu Reviewed-by: Eric Auger Thanks Eric > --- > drivers/vfio/group.c | 2 ++ > drivers/vfio/vfio.h | 2 ++ > drivers/vfio/vfio_main.c | 7 +++ > 3 files changed, 11 insertions(+) > > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c > index d55ce3ca44b7..1af4b9e012a7 100644 > --- a/drivers/vfio/group.c > +++ b/drivers/vfio/group.c > @@ -245,6 +245,8 @@ static struct file *vfio_device_open_file(struct > vfio_device *device) > goto err_out; > } > > + df->group = device->group; > + > ret = vfio_device_group_open(df); > if (ret) > goto err_free; > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > index b2f20b78a707..f1a448f9d067 100644 > --- a/drivers/vfio/vfio.h > +++ b/drivers/vfio/vfio.h > @@ -18,6 +18,8 @@ struct vfio_container; > > struct vfio_device_file { > struct vfio_device *device; > + struct vfio_group *group; > + > bool access_granted; > spinlock_t kvm_ref_lock; /* protect kvm field */ > struct kvm *kvm; > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index 6d5d3c2180c8..c8721d5d05fa 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -477,6 +477,13 @@ int vfio_device_open(struct vfio_device_file *df) > > lockdep_assert_held(&device->dev_set->lock); > > + /* > + * Only the group path allows the device opened multiple times. > + * The device cdev path doesn't have a secure way for it. > + */ > + if (device->open_count != 0 && !df->group) > + return -EINVAL; > + > device->open_count++; > if (device->open_count == 1) { > ret = vfio_device_first_open(df);
Re: [Intel-gfx] [PATCH v9 09/25] vfio: Add cdev_device_open_cnt to vfio_group
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > for counting the devices that are opened via the cdev path. This count > is increased and decreased by the cdev path. The group path checks it > to achieve exclusion with the cdev path. With this, only one path (group > path or cdev path) will claim DMA ownership. This avoids scenarios in > which devices within the same group may be opened via different paths. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Terrence Xu > Tested-by: Nicolin Chen > Tested-by: Matthew Rosato > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu Reviewed-by: Eric Auger Thanks Eric > --- > drivers/vfio/group.c | 33 + > drivers/vfio/vfio.h | 3 +++ > 2 files changed, 36 insertions(+) > > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c > index 71f0a9a4016e..d55ce3ca44b7 100644 > --- a/drivers/vfio/group.c > +++ b/drivers/vfio/group.c > @@ -383,6 +383,33 @@ static long vfio_group_fops_unl_ioctl(struct file *filep, > } > } > > +int vfio_device_block_group(struct vfio_device *device) > +{ > + struct vfio_group *group = device->group; > + int ret = 0; > + > + mutex_lock(&group->group_lock); > + if (group->opened_file) { > + ret = -EBUSY; > + goto out_unlock; > + } > + > + group->cdev_device_open_cnt++; > + > +out_unlock: > + mutex_unlock(&group->group_lock); > + return ret; > +} > + > +void vfio_device_unblock_group(struct vfio_device *device) > +{ > + struct vfio_group *group = device->group; > + > + mutex_lock(&group->group_lock); > + group->cdev_device_open_cnt--; > + mutex_unlock(&group->group_lock); > +} > + > static int vfio_group_fops_open(struct inode *inode, struct file *filep) > { > struct vfio_group *group = > @@ -405,6 +432,11 @@ static int vfio_group_fops_open(struct inode *inode, > struct file *filep) > goto out_unlock; > } > > + if (group->cdev_device_open_cnt) { > + ret = -EBUSY; > + goto out_unlock; > + } > + > /* >* Do we need multiple instances of the group open? Seems not. >*/ > @@ -479,6 +511,7 @@ static void vfio_group_release(struct device *dev) > mutex_destroy(&group->device_lock); > mutex_destroy(&group->group_lock); > WARN_ON(group->iommu_group); > + WARN_ON(group->cdev_device_open_cnt); > ida_free(&vfio.group_ida, MINOR(group->dev.devt)); > kfree(group); > } > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > index 854f2c97cb9a..b2f20b78a707 100644 > --- a/drivers/vfio/vfio.h > +++ b/drivers/vfio/vfio.h > @@ -83,8 +83,11 @@ struct vfio_group { > struct blocking_notifier_head notifier; > struct iommufd_ctx *iommufd; > spinlock_t kvm_ref_lock; > + unsigned intcdev_device_open_cnt; > }; > > +int vfio_device_block_group(struct vfio_device *device); > +void vfio_device_unblock_group(struct vfio_device *device); > int vfio_device_set_group(struct vfio_device *device, > enum vfio_group_type type); > void vfio_device_remove_group(struct vfio_device *device);
Re: [Intel-gfx] [PATCH v9 08/25] vfio: Block device access via device fd until device is opened
On 4/1/23 17:18, Yi Liu wrote: > Allow the vfio_device file to be in a state where the device FD is > opened but the device cannot be used by userspace (i.e. its .open_device() > hasn't been called). This inbetween state is not used when the device > FD is spawned from the group FD, however when we create the device FD > directly by opening a cdev it will be opened in the blocked state. > > The reason for the inbetween state is that userspace only gets a FD but > doesn't gain access permission until binding the FD to an iommufd. So in > the blocked state, only the bind operation is allowed. Completing bind > will allow user to further access the device. > > This is implemented by adding a flag in struct vfio_device_file to mark > the blocked state and using a simple smp_load_acquire() to obtain the > flag value and serialize all the device setup with the thread accessing > this device. > > Following this lockless scheme, it can safely handle the device FD > unbound->bound but it cannot handle bound->unbound. To allow this we'd > need to add a lock on all the vfio ioctls which seems costly. So once > device FD is bound, it remains bound until the FD is closed. > > Suggested-by: Jason Gunthorpe > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Terrence Xu > Tested-by: Nicolin Chen > Tested-by: Matthew Rosato > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu Reviewed-by: Eric Auger Eric > --- > drivers/vfio/group.c | 11 ++- > drivers/vfio/vfio.h | 1 + > drivers/vfio/vfio_main.c | 42 ++-- > 3 files changed, 47 insertions(+), 7 deletions(-) > > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c > index 9a7b2765eef6..71f0a9a4016e 100644 > --- a/drivers/vfio/group.c > +++ b/drivers/vfio/group.c > @@ -194,9 +194,18 @@ static int vfio_device_group_open(struct > vfio_device_file *df) > df->iommufd = device->group->iommufd; > > ret = vfio_device_open(df); > - if (ret) > + if (ret) { > df->iommufd = NULL; > + goto out_put_kvm; > + } > + > + /* > + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ > + * read/write/mmap and vfio_file_has_device_access() > + */ > + smp_store_release(&df->access_granted, true); > > +out_put_kvm: > if (device->open_count == 0) > vfio_device_put_kvm(device); > > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > index cffc08f5a6f1..854f2c97cb9a 100644 > --- a/drivers/vfio/vfio.h > +++ b/drivers/vfio/vfio.h > @@ -18,6 +18,7 @@ struct vfio_container; > > struct vfio_device_file { > struct vfio_device *device; > + bool access_granted; > spinlock_t kvm_ref_lock; /* protect kvm field */ > struct kvm *kvm; > struct iommufd_ctx *iommufd; /* protected by struct > vfio_device_set::lock */ > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index 2ea6cb6d03c7..6d5d3c2180c8 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -1114,6 +1114,10 @@ static long vfio_device_fops_unl_ioctl(struct file > *filep, > struct vfio_device *device = df->device; > int ret; > > + /* Paired with smp_store_release() following vfio_device_open() */ > + if (!smp_load_acquire(&df->access_granted)) > + return -EINVAL; > + > ret = vfio_device_pm_runtime_get(device); > if (ret) > return ret; > @@ -1141,6 +1145,10 @@ static ssize_t vfio_device_fops_read(struct file > *filep, char __user *buf, > struct vfio_device_file *df = filep->private_data; > struct vfio_device *device = df->device; > > + /* Paired with smp_store_release() following vfio_device_open() */ > + if (!smp_load_acquire(&df->access_granted)) > + return -EINVAL; > + > if (unlikely(!device->ops->read)) > return -EINVAL; > > @@ -1154,6 +1162,10 @@ static ssize_t vfio_device_fops_write(struct file > *filep, > struct vfio_device_file *df = filep->private_data; > struct vfio_device *device = df->device; > > + /* Paired with smp_store_release() following vfio_device_open() */ > + if (!smp_load_acquire(&df->access_granted)) > + return -EINVAL; > + > if (unlikely(!device->ops->write)) > return -EINVAL; > > @@ -1165,6 +1177,10 @@ static int vfio_device_fops_mmap(struct file *filep, > struct vm_area_struct *vma) > struct vfio_device_file *df = filep->private_data; > s
Re: [Intel-gfx] [PATCH v9 07/25] vfio: Pass struct vfio_device_file * to vfio_device_open/close()
atic int vfio_device_first_open(struct vfio_device *device, > - struct iommufd_ctx *iommufd) > +static int vfio_device_first_open(struct vfio_device_file *df) > { > + struct vfio_device *device = df->device; > + struct iommufd_ctx *iommufd = df->iommufd; > int ret; > > lockdep_assert_held(&device->dev_set->lock); > @@ -453,9 +454,11 @@ static int vfio_device_first_open(struct vfio_device > *device, > return ret; > } > > -static void vfio_device_last_close(struct vfio_device *device, > -struct iommufd_ctx *iommufd) > +static void vfio_device_last_close(struct vfio_device_file *df) > { > + struct vfio_device *device = df->device; > + struct iommufd_ctx *iommufd = df->iommufd; > + > lockdep_assert_held(&device->dev_set->lock); > > if (device->ops->close_device) > @@ -467,15 +470,16 @@ static void vfio_device_last_close(struct vfio_device > *device, > module_put(device->dev->driver->owner); > } > > -int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd) > +int vfio_device_open(struct vfio_device_file *df) > { > + struct vfio_device *device = df->device; > int ret = 0; > > lockdep_assert_held(&device->dev_set->lock); > > device->open_count++; > if (device->open_count == 1) { > - ret = vfio_device_first_open(device, iommufd); > + ret = vfio_device_first_open(df); > if (ret) > device->open_count--; > } > @@ -483,14 +487,15 @@ int vfio_device_open(struct vfio_device *device, struct > iommufd_ctx *iommufd) > return ret; > } > > -void vfio_device_close(struct vfio_device *device, > -struct iommufd_ctx *iommufd) > +void vfio_device_close(struct vfio_device_file *df) > { > + struct vfio_device *device = df->device; > + > lockdep_assert_held(&device->dev_set->lock); > > vfio_assert_device_open(device); > if (device->open_count == 1) > - vfio_device_last_close(device, iommufd); > + vfio_device_last_close(df); > device->open_count--; > } > > @@ -535,7 +540,7 @@ static int vfio_device_fops_release(struct inode *inode, > struct file *filep) > struct vfio_device_file *df = filep->private_data; > struct vfio_device *device = df->device; > > - vfio_device_group_close(device); > + vfio_device_group_close(df); > > vfio_device_put_registration(device); > Maybe it reduces the number of parameters but the diffstat shows it does not really simplify the code overall. I am not really sure it was worth and the df->iommufd pre and post-setting is not really nice looking to me. But well it others are OK ... Reviewed-by: Eric Auger Thanks Eric
Re: [Intel-gfx] [PATCH v9 06/25] kvm/vfio: Accept vfio device file from userspace
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*. > Old userspace uses KVM_DEV_VFIO_GROUP* works as well. > > Reviewed-by: Jason Gunthorpe > Reviewed-by: Kevin Tian > Tested-by: Terrence Xu > Tested-by: Nicolin Chen > Tested-by: Matthew Rosato > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > Documentation/virt/kvm/devices/vfio.rst | 53 + > include/uapi/linux/kvm.h| 16 ++-- > virt/kvm/vfio.c | 16 > 3 files changed, 56 insertions(+), 29 deletions(-) > > diff --git a/Documentation/virt/kvm/devices/vfio.rst > b/Documentation/virt/kvm/devices/vfio.rst > index 79b6811bb4f3..277d727ec1a2 100644 > --- a/Documentation/virt/kvm/devices/vfio.rst > +++ b/Documentation/virt/kvm/devices/vfio.rst > @@ -9,24 +9,38 @@ Device types supported: >- KVM_DEV_TYPE_VFIO > > Only one VFIO instance may be created per VM. The created device > -tracks VFIO groups in use by the VM and features of those groups > -important to the correctness and acceleration of the VM. As groups > -are enabled and disabled for use by the VM, KVM should be updated > -about their presence. When registered with KVM, a reference to the > -VFIO-group is held by KVM. > +tracks VFIO files (group or device) in use by the VM and features > +of those groups/devices important to the correctness and acceleration > +of the VM. As groups/devices are enabled and disabled for use by the > +VM, KVM should be updated about their presence. When registered with > +KVM, a reference to the VFIO file is held by KVM. > > Groups: > - KVM_DEV_VFIO_GROUP > - > -KVM_DEV_VFIO_GROUP attributes: > - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking > - kvm_device_attr.addr points to an int32_t file descriptor > - for the VFIO group. > - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking > - kvm_device_attr.addr points to an int32_t file descriptor > - for the VFIO group. > - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table > + KVM_DEV_VFIO_FILE > + alias: KVM_DEV_VFIO_GROUP > + > +KVM_DEV_VFIO_FILE attributes: > + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device > + tracking > + > + alias: KVM_DEV_VFIO_GROUP_ADD > + > + kvm_device_attr.addr points to an int32_t file descriptor for the > + VFIO file. > + > + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM > + device tracking > + > + alias: KVM_DEV_VFIO_GROUP_DEL > + > + kvm_device_attr.addr points to an int32_t file descriptor for the > + VFIO file. > + > + KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table > allocated by sPAPR KVM. > + > + alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE > + > kvm_device_attr.addr points to a struct:: > > struct kvm_vfio_spapr_tce { > @@ -40,9 +54,14 @@ KVM_DEV_VFIO_GROUP attributes: > - @tablefd is a file descriptor for a TCE table allocated via > KVM_CREATE_SPAPR_TCE. > > + only accepts vfio group file as SPAPR has no iommufd support So then what is the point of introducing KVM_DEV_VFIO_FILE_SET_SPAPR_TCE at this stage? I think would have separated the Groups: KVM_DEV_VFIO_FILE alias: KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE attributes: KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device tracking kvm_device_attr.addr points to an int32_t file descriptor for the VFIO file. KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM device tracking kvm_device_attr.addr points to an int32_t file descriptor for the VFIO file. KVM_DEV_VFIO_GROUP (legacy kvm device group restricted to the handling of VFIO group fd) KVM_DEV_VFIO_GROUP_ADD: same as KVM_DEV_VFIO_FILE_ADD for group fd only KVM_DEV_VFIO_GROUP_DEL: same as KVM_DEV_VFIO_FILE_DEL for group fd only KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table allocated by sPAPR KVM. kvm_device_attr.addr points to a struct:: struct kvm_vfio_spapr_tce { __s32 groupfd; __s32 tablefd; }; where: - @groupfd is a file descriptor for a VFIO group; - @tablefd is a file descriptor for a TCE table allocated via KVM_CREATE_SPAPR_TCE. You don't say anything about potential restriction, ie. what if the user calls KVM_DEV_VFIO_FILE with device fds while it has been using legacy container/group API? Thanks Eric > + > :: > > -The GROUP_ADD operation above should be invoked prior to accessing the > +The FILE/GROUP_ADD operation above should be invoked prior to accessing the > device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support > drivers which require a kvm pointer to be set in their .open_
Re: [Intel-gfx] [PATCH v9 04/25] vfio: Accept vfio device file in the KVM facing kAPI
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > This makes the vfio file kAPIs to accept vfio device files, also a > preparation for vfio device cdev support. > > For the kvm set with vfio device file, kvm pointer is stored in struct > vfio_device_file, and use kvm_ref_lock to protect kvm set and kvm > pointer usage within VFIO. This kvm pointer will be set to vfio_device > after device file is bound to iommufd in the cdev path. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Terrence Xu > Tested-by: Nicolin Chen > Tested-by: Matthew Rosato > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu Reviewed-by: Eric Auger Thanks Eric > --- > drivers/vfio/vfio.h | 2 ++ > drivers/vfio/vfio_main.c | 18 ++ > 2 files changed, 20 insertions(+) > > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > index 56ad127ac618..e4672d91a6f7 100644 > --- a/drivers/vfio/vfio.h > +++ b/drivers/vfio/vfio.h > @@ -18,6 +18,8 @@ struct vfio_container; > > struct vfio_device_file { > struct vfio_device *device; > + spinlock_t kvm_ref_lock; /* protect kvm field */ > + struct kvm *kvm; > }; > > void vfio_device_put_registration(struct vfio_device *device); > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index 748bde4d74d9..cb543791b28b 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -414,6 +414,7 @@ vfio_allocate_device_file(struct vfio_device *device) > return ERR_PTR(-ENOMEM); > > df->device = device; > + spin_lock_init(&df->kvm_ref_lock); > > return df; > } > @@ -1246,6 +1247,20 @@ bool vfio_file_enforced_coherent(struct file *file) > } > EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); > > +static void vfio_device_file_set_kvm(struct file *file, struct kvm *kvm) > +{ > + struct vfio_device_file *df = file->private_data; > + > + /* > + * The kvm is first recorded in the vfio_device_file, and will > + * be propagated to vfio_device::kvm when the file is bound to > + * iommufd successfully in the vfio device cdev path. > + */ > + spin_lock(&df->kvm_ref_lock); > + df->kvm = kvm; > + spin_unlock(&df->kvm_ref_lock); > +} > + > /** > * vfio_file_set_kvm - Link a kvm with VFIO drivers > * @file: VFIO group file or VFIO device file > @@ -1259,6 +1274,9 @@ void vfio_file_set_kvm(struct file *file, struct kvm > *kvm) > group = vfio_group_from_file(file); > if (group) > vfio_group_set_kvm(group, kvm); > + > + if (vfio_device_from_file(file)) > + vfio_device_file_set_kvm(file, kvm); > } > EXPORT_SYMBOL_GPL(vfio_file_set_kvm); >
Re: [Intel-gfx] [PATCH v3 12/12] vfio/pci: Report dev_id in VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
On 4/5/23 18:25, Alex Williamson wrote: > On Wed, 5 Apr 2023 14:04:51 + > "Liu, Yi L" wrote: > >> Hi Eric, >> >>> From: Eric Auger >>> Sent: Wednesday, April 5, 2023 8:20 PM >>> >>> Hi Yi, >>> On 4/1/23 16:44, Yi Liu wrote: >>>> for the users that accept device fds passed from management stacks to be >>>> able to figure out the host reset affected devices among the devices >>>> opened by the user. This is needed as such users do not have BDF (bus, >>>> devfn) knowledge about the devices it has opened, hence unable to use >>>> the information reported by existing VFIO_DEVICE_GET_PCI_HOT_RESET_INFO >>>> to figure out the affected devices. >>>> >>>> Signed-off-by: Yi Liu >>>> --- >>>> drivers/vfio/pci/vfio_pci_core.c | 58 >>>> include/uapi/linux/vfio.h| 24 - >>>> 2 files changed, 74 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/drivers/vfio/pci/vfio_pci_core.c >>>> b/drivers/vfio/pci/vfio_pci_core.c >>>> index 19f5b075d70a..a5a7e148dce1 100644 >>>> --- a/drivers/vfio/pci/vfio_pci_core.c >>>> +++ b/drivers/vfio/pci/vfio_pci_core.c >>>> @@ -30,6 +30,7 @@ >>>> #if IS_ENABLED(CONFIG_EEH) >>>> #include >>>> #endif >>>> +#include >>>> >>>> #include "vfio_pci_priv.h" >>>> >>>> @@ -767,6 +768,20 @@ static int vfio_pci_get_irq_count(struct >>> vfio_pci_core_device *vdev, int irq_typ >>>>return 0; >>>> } >>>> >>>> +static struct vfio_device * >>>> +vfio_pci_find_device_in_devset(struct vfio_device_set *dev_set, >>>> + struct pci_dev *pdev) >>>> +{ >>>> + struct vfio_device *cur; >>>> + >>>> + lockdep_assert_held(&dev_set->lock); >>>> + >>>> + list_for_each_entry(cur, &dev_set->device_list, dev_set_list) >>>> + if (cur->dev == &pdev->dev) >>>> + return cur; >>>> + return NULL; >>>> +} >>>> + >>>> static int vfio_pci_count_devs(struct pci_dev *pdev, void *data) >>>> { >>>>(*(int *)data)++; >>>> @@ -776,13 +791,20 @@ static int vfio_pci_count_devs(struct pci_dev *pdev, >>>> void >>> *data) >>>> struct vfio_pci_fill_info { >>>>int max; >>>>int cur; >>>> + bool require_devid; >>>> + struct iommufd_ctx *iommufd; >>>> + struct vfio_device_set *dev_set; >>>>struct vfio_pci_dependent_device *devices; >>>> }; >>>> >>>> static int vfio_pci_fill_devs(struct pci_dev *pdev, void *data) >>>> { >>>>struct vfio_pci_fill_info *fill = data; >>>> + struct vfio_device_set *dev_set = fill->dev_set; >>>>struct iommu_group *iommu_group; >>>> + struct vfio_device *vdev; >>>> + >>>> + lockdep_assert_held(&dev_set->lock); >>>> >>>>if (fill->cur == fill->max) >>>>return -EAGAIN; /* Something changed, try again */ >>>> @@ -791,7 +813,21 @@ static int vfio_pci_fill_devs(struct pci_dev *pdev, >>>> void >>> *data) >>>>if (!iommu_group) >>>>return -EPERM; /* Cannot reset non-isolated devices */ >>>> >>>> - fill->devices[fill->cur].group_id = iommu_group_id(iommu_group); >>>> + if (fill->require_devid) { >>>> + /* >>>> + * Report dev_id of the devices that are opened as cdev >>>> + * and have the same iommufd with the fill->iommufd. >>>> + * Otherwise, just fill IOMMUFD_INVALID_ID. >>>> + */ >>>> + vdev = vfio_pci_find_device_in_devset(dev_set, pdev); >>>> + if (vdev && vfio_device_cdev_opened(vdev) && >>>> + fill->iommufd == vfio_iommufd_physical_ictx(vdev)) >>>> + vfio_iommufd_physical_devid(vdev, &fill->devices[fill- >>>> cur].dev_id); >>>> + else >>>> + fill->devices[fill->cur].dev_id = IOMMUFD_INVALID_ID; >>>> + } else { >>>
Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset
Hi Jason, On 4/5/23 13:41, Jason Gunthorpe wrote: > On Tue, Apr 04, 2023 at 05:59:01PM +0200, Eric Auger wrote: > >>> but the hot reset shall fail as the group is not owned by the user. >> sure it shall but I fail to understand if the reset fails or the device >> plug is somehow delayed until the reset completes. > It is just racy today - vfio_pci_dev_set_resettable() doesn't hold any > locks across the pci_walk_bus() check to prevent hot plug in while it is > working on the reset. OK thanks Eric > > Jason >
Re: [Intel-gfx] [PATCH v9 25/25] docs: vfio: Add vfio device cdev description
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > This gives notes for userspace applications on device cdev usage. > > Reviewed-by: Kevin Tian > Signed-off-by: Yi Liu > --- > Documentation/driver-api/vfio.rst | 132 ++ > 1 file changed, 132 insertions(+) > > diff --git a/Documentation/driver-api/vfio.rst > b/Documentation/driver-api/vfio.rst > index 363e12c90b87..4f21be7bda8a 100644 > --- a/Documentation/driver-api/vfio.rst > +++ b/Documentation/driver-api/vfio.rst > @@ -239,6 +239,130 @@ group and can access them as follows:: > /* Gratuitous device reset and go... */ > ioctl(device, VFIO_DEVICE_RESET); > > +IOMMUFD and vfio_iommu_type1 > + > + > +IOMMUFD is the new user API to manage I/O page tables from userspace. > +It intends to be the portal of delivering advanced userspace DMA > +features (nested translation [5], PASID [6], etc.) while also providing > +a backwards compatibility interface for existing VFIO_TYPE1v2_IOMMU use > +cases. Eventually the vfio_iommu_type1 driver, as well as the legacy > +vfio container and group model is intended to be deprecated. > + > +The IOMMUFD backwards compatibility interface can be enabled two ways. > +In the first method, the kernel can be configured with > +CONFIG_IOMMUFD_VFIO_CONTAINER, in which case the IOMMUFD subsystem > +transparently provides the entire infrastructure for the VFIO > +container and IOMMU backend interfaces. The compatibility mode can > +also be accessed if the VFIO container interface, ie. /dev/vfio/vfio is > +simply symlink'd to /dev/iommu. Note that at the time of writing, the > +compatibility mode is not entirely feature complete relative to > +VFIO_TYPE1v2_IOMMU (ex. DMA mapping MMIO) and does not attempt to > +provide compatibility to the VFIO_SPAPR_TCE_IOMMU interface. Therefore > +it is not generally advisable at this time to switch from native VFIO > +implementations to the IOMMUFD compatibility interfaces. > + > +Long term, VFIO users should migrate to device access through the cdev > +interface described below, and native access through the IOMMUFD > +provided interfaces. > + > +VFIO Device cdev > + > + > +Traditionally user acquires a device fd via VFIO_GROUP_GET_DEVICE_FD > +in a VFIO group. > + > +With CONFIG_VFIO_DEVICE_CDEV=y the user can now acquire a device fd > +by directly opening a character device /dev/vfio/devices/vfioX where > +"X" is the number allocated uniquely by VFIO for registered devices. > +For noiommu devices, the character device would be named with "noiommu-" > +prefix. e.g. /dev/vfio/devices/noiommu-vfioX. > + > +The cdev only works with IOMMUFD. Both VFIO drivers and applications > +must adapt to the new cdev security model which requires using > +VFIO_DEVICE_BIND_IOMMUFD to claim DMA ownership before starting to > +actually use the device. Once BIND succeeds then a VFIO device can > +be fully accessed by the user. > + > +VFIO device cdev doesn't rely on VFIO group/container/iommu drivers. > +Hence those modules can be fully compiled out in an environment > +where no legacy VFIO application exists. > + > +So far SPAPR does not support IOMMUFD yet. So it cannot support device > +cdev neither. > + > +Device cdev Example > +--- > + > +Assume user wants to access PCI device :6a:01.0:: > + > + $ ls /sys/bus/pci/devices/:6a:01.0/vfio-dev/ > + vfio0 > + > +This device is therefore represented as vfio0. The user can verify > +its existence:: > + > + $ ls -l /dev/vfio/devices/vfio0 > + crw--- 1 root root 511, 0 Feb 16 01:22 /dev/vfio/devices/vfio0 > + $ cat /sys/bus/pci/devices/:6a:01.0/vfio-dev/vfio0/dev you mentionned in the pci hot reset series that the BDF couldn't be used if cdev is being used. According to the above, it could, no? > + 511:0 > + $ ls -l /dev/char/511\:0 > + lrwxrwxrwx 1 root root 21 Feb 16 01:22 /dev/char/511:0 -> > ../vfio/devices/vfio0 > + > +Then provide the user with access to the device if unprivileged > +operation is desired:: > + > + $ chown user:user /dev/vfio/devices/vfio0 > + > +Finally the user could get cdev fd by:: > + > + cdev_fd = open("/dev/vfio/devices/vfio0", O_RDWR); > + > +An opened cdev_fd doesn't give the user any permission of accessing > +the device except binding the cdev_fd to an iommufd. After that point > +then the device is fully accessible including attaching it to an > +IOMMUFD IOAS/HWPT to enable userspace DMA:: > + > + struct vfio_device_bind_iommufd bind = { > + .argsz = sizeof(bind), > + .flags = 0, > + }; > + struct iommu_ioas_alloc alloc_data = { > + .size = sizeof(alloc_data), > + .flags = 0, > + }; > + struct vfio_device_attach_iommufd_pt attach_data = { > + .argsz = sizeof(attach_data), > + .flags = 0, > + }; > + struct iommu_ioas_map map = { > + .size = sizeof(map), > +
Re: [Intel-gfx] [PATCH v9 03/25] vfio: Remove vfio_file_is_group()
Hi Yi, On 4/1/23 17:18, Yi Liu wrote: > since no user of vfio_file_is_group() now. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Terrence Xu > Tested-by: Nicolin Chen > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu Reviewed-by: Eric Auger Eric > --- > drivers/vfio/group.c | 10 -- > include/linux/vfio.h | 1 - > 2 files changed, 11 deletions(-) > > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c > index ede4723c5f72..4f937ebaf6f7 100644 > --- a/drivers/vfio/group.c > +++ b/drivers/vfio/group.c > @@ -792,16 +792,6 @@ struct iommu_group *vfio_file_iommu_group(struct file > *file) > } > EXPORT_SYMBOL_GPL(vfio_file_iommu_group); > > -/** > - * vfio_file_is_group - True if the file is a vfio group file > - * @file: VFIO group file > - */ > -bool vfio_file_is_group(struct file *file) > -{ > - return vfio_group_from_file(file); > -} > -EXPORT_SYMBOL_GPL(vfio_file_is_group); > - > bool vfio_group_enforced_coherent(struct vfio_group *group) > { > struct vfio_device *device; > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index d9a0770e5fc1..7519ae89fcd6 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -264,7 +264,6 @@ int vfio_mig_get_next_state(struct vfio_device *device, > * External user API > */ > struct iommu_group *vfio_file_iommu_group(struct file *file); > -bool vfio_file_is_group(struct file *file); > bool vfio_file_is_valid(struct file *file); > bool vfio_file_enforced_coherent(struct file *file); > void vfio_file_set_kvm(struct file *file, struct kvm *kvm);
Re: [Intel-gfx] [PATCH v3 12/12] vfio/pci: Report dev_id in VFIO_DEVICE_GET_PCI_HOT_RESET_INFO
Hi Yi, On 4/1/23 16:44, Yi Liu wrote: > for the users that accept device fds passed from management stacks to be > able to figure out the host reset affected devices among the devices > opened by the user. This is needed as such users do not have BDF (bus, > devfn) knowledge about the devices it has opened, hence unable to use > the information reported by existing VFIO_DEVICE_GET_PCI_HOT_RESET_INFO > to figure out the affected devices. > > Signed-off-by: Yi Liu > --- > drivers/vfio/pci/vfio_pci_core.c | 58 > include/uapi/linux/vfio.h| 24 - > 2 files changed, 74 insertions(+), 8 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index 19f5b075d70a..a5a7e148dce1 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -30,6 +30,7 @@ > #if IS_ENABLED(CONFIG_EEH) > #include > #endif > +#include > > #include "vfio_pci_priv.h" > > @@ -767,6 +768,20 @@ static int vfio_pci_get_irq_count(struct > vfio_pci_core_device *vdev, int irq_typ > return 0; > } > > +static struct vfio_device * > +vfio_pci_find_device_in_devset(struct vfio_device_set *dev_set, > +struct pci_dev *pdev) > +{ > + struct vfio_device *cur; > + > + lockdep_assert_held(&dev_set->lock); > + > + list_for_each_entry(cur, &dev_set->device_list, dev_set_list) > + if (cur->dev == &pdev->dev) > + return cur; > + return NULL; > +} > + > static int vfio_pci_count_devs(struct pci_dev *pdev, void *data) > { > (*(int *)data)++; > @@ -776,13 +791,20 @@ static int vfio_pci_count_devs(struct pci_dev *pdev, > void *data) > struct vfio_pci_fill_info { > int max; > int cur; > + bool require_devid; > + struct iommufd_ctx *iommufd; > + struct vfio_device_set *dev_set; > struct vfio_pci_dependent_device *devices; > }; > > static int vfio_pci_fill_devs(struct pci_dev *pdev, void *data) > { > struct vfio_pci_fill_info *fill = data; > + struct vfio_device_set *dev_set = fill->dev_set; > struct iommu_group *iommu_group; > + struct vfio_device *vdev; > + > + lockdep_assert_held(&dev_set->lock); > > if (fill->cur == fill->max) > return -EAGAIN; /* Something changed, try again */ > @@ -791,7 +813,21 @@ static int vfio_pci_fill_devs(struct pci_dev *pdev, void > *data) > if (!iommu_group) > return -EPERM; /* Cannot reset non-isolated devices */ > > - fill->devices[fill->cur].group_id = iommu_group_id(iommu_group); > + if (fill->require_devid) { > + /* > + * Report dev_id of the devices that are opened as cdev > + * and have the same iommufd with the fill->iommufd. > + * Otherwise, just fill IOMMUFD_INVALID_ID. > + */ > + vdev = vfio_pci_find_device_in_devset(dev_set, pdev); > + if (vdev && vfio_device_cdev_opened(vdev) && > + fill->iommufd == vfio_iommufd_physical_ictx(vdev)) > + vfio_iommufd_physical_devid(vdev, > &fill->devices[fill->cur].dev_id); > + else > + fill->devices[fill->cur].dev_id = IOMMUFD_INVALID_ID; > + } else { > + fill->devices[fill->cur].group_id = iommu_group_id(iommu_group); > + } > fill->devices[fill->cur].segment = pci_domain_nr(pdev->bus); > fill->devices[fill->cur].bus = pdev->bus->number; > fill->devices[fill->cur].devfn = pdev->devfn; > @@ -1230,17 +1266,27 @@ static int vfio_pci_ioctl_get_pci_hot_reset_info( > return -ENOMEM; > > fill.devices = devices; > + fill.dev_set = vdev->vdev.dev_set; > > + mutex_lock(&vdev->vdev.dev_set->lock); > + if (vfio_device_cdev_opened(&vdev->vdev)) { > + fill.require_devid = true; > + fill.iommufd = vfio_iommufd_physical_ictx(&vdev->vdev); > + } > ret = vfio_pci_for_each_slot_or_bus(vdev->pdev, vfio_pci_fill_devs, > &fill, slot); > + mutex_unlock(&vdev->vdev.dev_set->lock); > > /* >* If a device was removed between counting and filling, we may come up >* short of fill.max. If a device was added, we'll have a return of >* -EAGAIN above. >*/ > - if (!ret) > + if (!ret) { > hdr.count = fill.cur; > + if (fill.require_devid) > + hdr.flags = VFIO_PCI_HOT_RESET_FLAG_IOMMUFD_DEV_ID; > + } > > reset_info_exit: > if (copy_to_user(arg, &hdr, minsz)) > @@ -2346,12 +2392,10 @@ static bool vfio_dev_in_files(struct > vfio_pci_core_device *vdev, > static int vfio_pci_is_device_in_set(struct pci_dev *pdev, void *data) > { > struct vfio_device_set *dev_set = data; > - struct vfio_device *cur; > > - list_for_each_entry(cur, &dev_set->device_list, d
Re: [Intel-gfx] [PATCH v3 10/12] vfio: Mark cdev usage in vfio_device
On 4/1/23 16:44, Yi Liu wrote: > There are users that need to check if vfio_device is opened as cdev. > e.g. vfio-pci. This adds a flag in vfio_device, it will be set in the > cdev path when device is opened. This is not used at this moment, but > a preparation for vfio device cdev support. better to squash this patch with the patch setting cdev_opened then? Thanks Eric > > Signed-off-by: Yi Liu > --- > include/linux/vfio.h | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index f8fb9ab25188..d9a0770e5fc1 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -62,6 +62,7 @@ struct vfio_device { > struct iommufd_device *iommufd_device; > bool iommufd_attached; > #endif > + bool cdev_opened; > }; > > /** > @@ -151,6 +152,12 @@ vfio_iommufd_physical_devid(struct vfio_device *vdev, > u32 *id) > ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) > #endif > > +static inline bool vfio_device_cdev_opened(struct vfio_device *device) > +{ > + lockdep_assert_held(&device->dev_set->lock); > + return device->cdev_opened; > +} > + > /** > * @migration_set_state: Optional callback to change the migration state for > * devices that support migration. It's mandatory for
Re: [Intel-gfx] [PATCH v3 11/12] iommufd: Define IOMMUFD_INVALID_ID in uapi
Hi Yi On 4/1/23 16:44, Yi Liu wrote: > as there are IOMMUFD users that want to know check if an ID generated s/want to know check/ need to check which type of ID? > by IOMMUFD is valid or not. e.g. vfio-pci optionaly returns invalid optionally > dev_id to user in the VFIO_DEVICE_GET_PCI_HOT_RESET_INFO ioctl. User > needs to check if the ID is valid or not. so dev id ... > > IOMMUFD_INVALID_ID is defined as 0 since the IDs generated by IOMMUFD > starts from 0. from 1, same as below > > Signed-off-by: Yi Liu > --- > include/uapi/linux/iommufd.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h > index 98ebba80cfa1..aeae73a93833 100644 > --- a/include/uapi/linux/iommufd.h > +++ b/include/uapi/linux/iommufd.h > @@ -9,6 +9,9 @@ > > #define IOMMUFD_TYPE (';') > > +/* IDs allocated by IOMMUFD starts from 0 */ ditto > +#define IOMMUFD_INVALID_ID 0 > + > /** > * DOC: General ioctl format > * Eric
Re: [Intel-gfx] [PATCH v3 09/12] vfio/pci: Accept device fd in VFIO_DEVICE_PCI_HOT_RESET ioctl
On 4/1/23 16:44, Yi Liu wrote: > Now user can also provide an array of device fds as a 3rd method to verify > the reset ownership. It's not useful at this point when the device fds are > acquired via group fds. But it's necessary when moving to device cdev which > allows the user to directly acquire device fds by skipping group. In that > case this method can be used as a last resort when the preferred iommufd > verification doesn't work, e.g. in noiommu usages. > > Clarify it in uAPI. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu Reviewed-by: Eric Auger Eric > --- > drivers/vfio/pci/vfio_pci_core.c | 9 + > include/uapi/linux/vfio.h| 3 ++- > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index da6325008872..19f5b075d70a 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -1289,7 +1289,7 @@ vfio_pci_ioctl_pci_hot_reset_files(struct > vfio_pci_core_device *vdev, > return -ENOMEM; > } > > - if (copy_from_user(fds, arg->group_fds, > + if (copy_from_user(fds, arg->fds, > hdr->count * sizeof(*fds))) { > kfree(fds); > kfree(files); > @@ -1297,8 +1297,8 @@ vfio_pci_ioctl_pci_hot_reset_files(struct > vfio_pci_core_device *vdev, > } > > /* > - * Get the group file for each fd to ensure the group held across > - * the reset > + * Get the file for each fd to ensure the group/device file > + * is held across the reset >*/ > for (file_idx = 0; file_idx < hdr->count; file_idx++) { > struct file *file = fget(fds[file_idx]); > @@ -2469,7 +2469,8 @@ static int vfio_pci_dev_set_hot_reset(struct > vfio_device_set *dev_set, >* cannot race being opened by another user simultaneously. >* >* Otherwise all opened devices in the dev_set must be > - * contained by the set of groups provided by the user. > + * contained by the set of groups/devices provided by > + * the user. >* >* If user provides a zero-length array, then all the >* opened devices must be bound to a same iommufd_ctx. > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > index 17aa5d09db41..25432ef213ee 100644 > --- a/include/uapi/linux/vfio.h > +++ b/include/uapi/linux/vfio.h > @@ -681,6 +681,7 @@ struct vfio_pci_hot_reset_info { > * > * The ownership can be proved by: > * - An array of group fds > + * - An array of device fds > * - A zero-length array > * > * In the last case all affected devices which are opened by this user > @@ -694,7 +695,7 @@ struct vfio_pci_hot_reset { > __u32 argsz; > __u32 flags; > __u32 count; > - __s32 group_fds[]; > + __s32 fds[]; > }; > > #define VFIO_DEVICE_PCI_HOT_RESET_IO(VFIO_TYPE, VFIO_BASE + 13)
Re: [Intel-gfx] [PATCH v3 08/12] vfio/pci: Renaming for accepting device fd in hot reset path
On 4/1/23 16:44, Yi Liu wrote: > No functional change is intended. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu Reviewed-by: Eric Auger Eric > --- > drivers/vfio/pci/vfio_pci_core.c | 52 > 1 file changed, 26 insertions(+), 26 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index 2a510b71edcb..da6325008872 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -177,10 +177,10 @@ static void vfio_pci_probe_mmaps(struct > vfio_pci_core_device *vdev) > } > } > > -struct vfio_pci_group_info; > +struct vfio_pci_file_info; > static void vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set); > static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, > - struct vfio_pci_group_info *groups, > + struct vfio_pci_file_info *info, > struct iommufd_ctx *iommufd_ctx); > > /* > @@ -800,7 +800,7 @@ static int vfio_pci_fill_devs(struct pci_dev *pdev, void > *data) > return 0; > } > > -struct vfio_pci_group_info { > +struct vfio_pci_file_info { > int count; > struct file **files; > }; > @@ -1257,14 +1257,14 @@ static int vfio_pci_ioctl_get_pci_hot_reset_info( > } > > static int > -vfio_pci_ioctl_pci_hot_reset_groups(struct vfio_pci_core_device *vdev, > - struct vfio_pci_hot_reset *hdr, > - bool slot, > - struct vfio_pci_hot_reset __user *arg) > +vfio_pci_ioctl_pci_hot_reset_files(struct vfio_pci_core_device *vdev, > +struct vfio_pci_hot_reset *hdr, > +bool slot, > +struct vfio_pci_hot_reset __user *arg) > { > - int32_t *group_fds; > + int32_t *fds; > struct file **files; > - struct vfio_pci_group_info info; > + struct vfio_pci_file_info info; > int file_idx, count = 0, ret = 0; > > /* > @@ -1281,17 +1281,17 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct > vfio_pci_core_device *vdev, > if (hdr->count > count) > return -EINVAL; > > - group_fds = kcalloc(hdr->count, sizeof(*group_fds), GFP_KERNEL); > + fds = kcalloc(hdr->count, sizeof(*fds), GFP_KERNEL); > files = kcalloc(hdr->count, sizeof(*files), GFP_KERNEL); > - if (!group_fds || !files) { > - kfree(group_fds); > + if (!fds || !files) { > + kfree(fds); > kfree(files); > return -ENOMEM; > } > > - if (copy_from_user(group_fds, arg->group_fds, > -hdr->count * sizeof(*group_fds))) { > - kfree(group_fds); > + if (copy_from_user(fds, arg->group_fds, > +hdr->count * sizeof(*fds))) { > + kfree(fds); > kfree(files); > return -EFAULT; > } > @@ -1301,7 +1301,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct > vfio_pci_core_device *vdev, >* the reset >*/ > for (file_idx = 0; file_idx < hdr->count; file_idx++) { > - struct file *file = fget(group_fds[file_idx]); > + struct file *file = fget(fds[file_idx]); > > if (!file) { > ret = -EBADF; > @@ -1318,9 +1318,9 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct > vfio_pci_core_device *vdev, > files[file_idx] = file; > } > > - kfree(group_fds); > + kfree(fds); > > - /* release reference to groups on error */ > + /* release reference to fds on error */ > if (ret) > goto hot_reset_release; > > @@ -1358,7 +1358,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct > vfio_pci_core_device *vdev, > return -ENODEV; > > if (hdr.count) > - return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, > arg); > + return vfio_pci_ioctl_pci_hot_reset_files(vdev, &hdr, slot, > arg); > > iommufd = vfio_iommufd_physical_ictx(&vdev->vdev); > > @@ -2329,16 +2329,16 @@ const struct pci_error_handlers > vfio_pci_core_err_handlers = { > }; > EXPORT_SYMBOL_GPL(vfio_pci_core_err_handlers); > > -static bool vfio_dev_in_groups(struct vfio_pci_core_device *vdev, > -struct vfio_pci_group_info *groups) > +static boo
Re: [Intel-gfx] [PATCH v3 06/12] vfio: Refine vfio file kAPIs for vfio PCI hot reset
Hi Yi, On 4/1/23 16:44, Yi Liu wrote: > This prepares vfio core to accept vfio device file from the vfio PCI > hot reset path. vfio_file_is_group() is still kept for KVM usage. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > drivers/vfio/group.c | 32 ++-- > drivers/vfio/pci/vfio_pci_core.c | 4 ++-- > drivers/vfio/vfio.h | 2 ++ > drivers/vfio/vfio_main.c | 29 + > include/linux/vfio.h | 1 + > 5 files changed, 48 insertions(+), 20 deletions(-) > > diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c > index 27d5ba7cf9dc..d0c95d033605 100644 > --- a/drivers/vfio/group.c > +++ b/drivers/vfio/group.c > @@ -745,6 +745,15 @@ bool vfio_device_has_container(struct vfio_device > *device) > return device->group->container; > } > > +struct vfio_group *vfio_group_from_file(struct file *file) > +{ > + struct vfio_group *group = file->private_data; > + > + if (file->f_op != &vfio_group_fops) > + return NULL; > + return group; > +} > + > /** > * vfio_file_iommu_group - Return the struct iommu_group for the vfio group > file > * @file: VFIO group file > @@ -755,13 +764,13 @@ bool vfio_device_has_container(struct vfio_device > *device) > */ > struct iommu_group *vfio_file_iommu_group(struct file *file) > { > - struct vfio_group *group = file->private_data; > + struct vfio_group *group = vfio_group_from_file(file); > struct iommu_group *iommu_group = NULL; > > if (!IS_ENABLED(CONFIG_SPAPR_TCE_IOMMU)) > return NULL; > > - if (!vfio_file_is_group(file)) > + if (!group) > return NULL; > > mutex_lock(&group->group_lock); > @@ -775,12 +784,12 @@ struct iommu_group *vfio_file_iommu_group(struct file > *file) > EXPORT_SYMBOL_GPL(vfio_file_iommu_group); > > /** > - * vfio_file_is_group - True if the file is usable with VFIO aPIS > + * vfio_file_is_group - True if the file is a vfio group file > * @file: VFIO group file > */ > bool vfio_file_is_group(struct file *file) > { > - return file->f_op == &vfio_group_fops; > + return vfio_group_from_file(file); > } > EXPORT_SYMBOL_GPL(vfio_file_is_group); > > @@ -842,23 +851,10 @@ void vfio_file_set_kvm(struct file *file, struct kvm > *kvm) > } > EXPORT_SYMBOL_GPL(vfio_file_set_kvm); > > -/** > - * vfio_file_has_dev - True if the VFIO file is a handle for device > - * @file: VFIO file to check > - * @device: Device that must be part of the file > - * > - * Returns true if given file has permission to manipulate the given device. > - */ > -bool vfio_file_has_dev(struct file *file, struct vfio_device *device) > +bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device) > { > - struct vfio_group *group = file->private_data; > - > - if (!vfio_file_is_group(file)) > - return false; > - > return group == device->group; > } > -EXPORT_SYMBOL_GPL(vfio_file_has_dev); > > static char *vfio_devnode(const struct device *dev, umode_t *mode) > { > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index b68fcba67a4b..2a510b71edcb 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -1308,8 +1308,8 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct > vfio_pci_core_device *vdev, > break; > } > > - /* Ensure the FD is a vfio group FD.*/ > - if (!vfio_file_is_group(file)) { > + /* Ensure the FD is a vfio FD. vfio group or vfio device */ it is a bit strange to update the comment here and in the other places in this patch whereas file_is_valid still sticks to group file check By the way I would simply remove the comment which does not bring much > + if (!vfio_file_is_valid(file)) { > fput(file); > ret = -EINVAL; > break; > diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h > index 7b19c621e0e6..c0aeea24fbd6 100644 > --- a/drivers/vfio/vfio.h > +++ b/drivers/vfio/vfio.h > @@ -84,6 +84,8 @@ void vfio_device_group_unregister(struct vfio_device > *device); > int vfio_device_group_use_iommu(struct vfio_device *device); > void vfio_device_group_unuse_iommu(struct vfio_device *device); > void vfio_device_group_close(struct vfio_device *device); > +struct vfio_group *vfio_group_from_file(struct file *file); > +bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device > *device); > bool vfio_device_has_container(struct vfio_device *device); > int __init vfio_group_init(void); > void vfio_group_cleanup(void); > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index 89497c933490..fe7446805afd 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -1154,6 +1154,35 @@ c
Re: [Intel-gfx] [PATCH v3 07/12] vfio: Accpet device file from vfio PCI hot reset path
Hi Yi, On 4/1/23 16:44, Yi Liu wrote: > This extends both vfio_file_is_valid() and vfio_file_has_dev() to accept > device file from the vfio PCI hot reset. typo in the title s/Accpet/Accept > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > drivers/vfio/vfio_main.c | 23 +++ > 1 file changed, 19 insertions(+), 4 deletions(-) > > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index fe7446805afd..ebbb6b91a498 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -1154,13 +1154,23 @@ const struct file_operations vfio_device_fops = { > .mmap = vfio_device_fops_mmap, > }; > > +static struct vfio_device *vfio_device_from_file(struct file *file) > +{ > + struct vfio_device *device = file->private_data; > + > + if (file->f_op != &vfio_device_fops) > + return NULL; > + return device; > +} > + > /** > * vfio_file_is_valid - True if the file is valid vfio file > * @file: VFIO group file or VFIO device file > */ > bool vfio_file_is_valid(struct file *file) > { > - return vfio_group_from_file(file); > + return vfio_group_from_file(file) || > +vfio_device_from_file(file); > } > EXPORT_SYMBOL_GPL(vfio_file_is_valid); > > @@ -1174,12 +1184,17 @@ EXPORT_SYMBOL_GPL(vfio_file_is_valid); > bool vfio_file_has_dev(struct file *file, struct vfio_device *device) > { > struct vfio_group *group; > + struct vfio_device *vdev; > > group = vfio_group_from_file(file); > - if (!group) > - return false; > + if (group) > + return vfio_group_has_dev(group, device); > + > + vdev = vfio_device_from_file(file); > + if (vdev) > + return vdev == device; > > - return vfio_group_has_dev(group, device); > + return false; > } > EXPORT_SYMBOL_GPL(vfio_file_has_dev); > With Alex' suggestion Reviewed-by: Eric Auger Eric
Re: [Intel-gfx] [PATCH v3 05/12] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET
On 4/4/23 22:18, Alex Williamson wrote: > On Sat, 1 Apr 2023 07:44:22 -0700 > Yi Liu wrote: > >> as an alternative method for ownership check when iommufd is used. In >> this case all opened devices in the affected dev_set are verified to >> be bound to a same valid iommufd value to allow reset. It's simpler >> and faster as user does not need to pass a set of fds and kernel no >> need to search the device within the given fds. >> >> a device in noiommu mode doesn't have a valid iommufd, so this method >> should not be used in a dev_set which contains multiple devices and one >> of them is in noiommu. The only allowed noiommu scenario is that the >> calling device is noiommu and it's in a singleton dev_set. >> >> Suggested-by: Jason Gunthorpe >> Signed-off-by: Jason Gunthorpe >> Reviewed-by: Jason Gunthorpe >> Tested-by: Yanting Jiang >> Signed-off-by: Yi Liu >> --- >> drivers/vfio/pci/vfio_pci_core.c | 42 +++- >> include/uapi/linux/vfio.h| 9 ++- >> 2 files changed, 44 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/vfio/pci/vfio_pci_core.c >> b/drivers/vfio/pci/vfio_pci_core.c >> index 3696b8e58445..b68fcba67a4b 100644 >> --- a/drivers/vfio/pci/vfio_pci_core.c >> +++ b/drivers/vfio/pci/vfio_pci_core.c >> @@ -180,7 +180,8 @@ static void vfio_pci_probe_mmaps(struct >> vfio_pci_core_device *vdev) >> struct vfio_pci_group_info; >> static void vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set); >> static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, >> - struct vfio_pci_group_info *groups); >> + struct vfio_pci_group_info *groups, >> + struct iommufd_ctx *iommufd_ctx); >> >> /* >> * INTx masking requires the ability to disable INTx signaling via >> PCI_COMMAND >> @@ -1277,7 +1278,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct >> vfio_pci_core_device *vdev, >> return ret; >> >> /* Somewhere between 1 and count is OK */ >> -if (!hdr->count || hdr->count > count) >> +if (hdr->count > count) >> return -EINVAL; >> >> group_fds = kcalloc(hdr->count, sizeof(*group_fds), GFP_KERNEL); >> @@ -1326,7 +1327,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct >> vfio_pci_core_device *vdev, >> info.count = hdr->count; >> info.files = files; >> >> -ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info); >> +ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info, NULL); >> >> hot_reset_release: >> for (file_idx--; file_idx >= 0; file_idx--) >> @@ -1341,6 +1342,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct >> vfio_pci_core_device *vdev, >> { >> unsigned long minsz = offsetofend(struct vfio_pci_hot_reset, count); >> struct vfio_pci_hot_reset hdr; >> +struct iommufd_ctx *iommufd; >> bool slot = false; >> >> if (copy_from_user(&hdr, arg, minsz)) >> @@ -1355,7 +1357,12 @@ static int vfio_pci_ioctl_pci_hot_reset(struct >> vfio_pci_core_device *vdev, >> else if (pci_probe_reset_bus(vdev->pdev->bus)) >> return -ENODEV; >> >> -return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, arg); >> +if (hdr.count) >> +return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, >> arg); >> + >> +iommufd = vfio_iommufd_physical_ictx(&vdev->vdev); >> + >> +return vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, NULL, iommufd); >> } >> >> static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev, >> @@ -2327,6 +2334,9 @@ static bool vfio_dev_in_groups(struct >> vfio_pci_core_device *vdev, >> { >> unsigned int i; >> >> +if (!groups) >> +return false; >> + >> for (i = 0; i < groups->count; i++) >> if (vfio_file_has_dev(groups->files[i], &vdev->vdev)) >> return true; >> @@ -2402,13 +2412,25 @@ static int vfio_pci_dev_set_pm_runtime_get(struct >> vfio_device_set *dev_set) >> return ret; >> } >> >> +static bool vfio_dev_in_iommufd_ctx(struct vfio_pci_core_device *vdev, >> +struct iommufd_ctx *iommufd_ctx) >> +{ >> +struct iommufd_ctx *iommufd = vfio_iommufd_physical_ictx(&vdev->vdev); >> + >> +if (!iommufd) >> +return false; >> + >> +return iommufd == iommufd_ctx; >> +} >> + >> /* >> * We need to get memory_lock for each device, but devices can share >> mmap_lock, >> * therefore we need to zap and hold the vma_lock for each device, and only >> then >> * get each memory_lock. >> */ >> static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, >> - struct vfio_pci_group_info *groups) >> + struct vfio_pci_group_info *groups, >> + struct iommufd_ctx *iommufd_ctx) >> { >> struct vfio_pci_core_device *cur_mem; >> struc
Re: [Intel-gfx] [PATCH v3 05/12] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET
Hi Yi, On 4/1/23 16:44, Yi Liu wrote: > as an alternative method for ownership check when iommufd is used. In I don't understand the 1st sentence. > this case all opened devices in the affected dev_set are verified to > be bound to a same valid iommufd value to allow reset. It's simpler > and faster as user does not need to pass a set of fds and kernel no kernel does not need to search > need to search the device within the given fds. > > a device in noiommu mode doesn't have a valid iommufd, so this method > should not be used in a dev_set which contains multiple devices and one > of them is in noiommu. The only allowed noiommu scenario is that the > calling device is noiommu and it's in a singleton dev_set. > > Suggested-by: Jason Gunthorpe > Signed-off-by: Jason Gunthorpe > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > drivers/vfio/pci/vfio_pci_core.c | 42 +++- > include/uapi/linux/vfio.h| 9 ++- > 2 files changed, 44 insertions(+), 7 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index 3696b8e58445..b68fcba67a4b 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -180,7 +180,8 @@ static void vfio_pci_probe_mmaps(struct > vfio_pci_core_device *vdev) > struct vfio_pci_group_info; > static void vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set); > static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, > - struct vfio_pci_group_info *groups); > + struct vfio_pci_group_info *groups, > + struct iommufd_ctx *iommufd_ctx); > > /* > * INTx masking requires the ability to disable INTx signaling via > PCI_COMMAND > @@ -1277,7 +1278,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct > vfio_pci_core_device *vdev, > return ret; > > /* Somewhere between 1 and count is OK */ > - if (!hdr->count || hdr->count > count) > + if (hdr->count > count) then I would simply remove the above comment since !count check is done by the caller. > return -EINVAL; > > group_fds = kcalloc(hdr->count, sizeof(*group_fds), GFP_KERNEL); > @@ -1326,7 +1327,7 @@ vfio_pci_ioctl_pci_hot_reset_groups(struct > vfio_pci_core_device *vdev, > info.count = hdr->count; > info.files = files; > > - ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info); > + ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info, NULL); > > hot_reset_release: > for (file_idx--; file_idx >= 0; file_idx--) > @@ -1341,6 +1342,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct > vfio_pci_core_device *vdev, > { > unsigned long minsz = offsetofend(struct vfio_pci_hot_reset, count); > struct vfio_pci_hot_reset hdr; > + struct iommufd_ctx *iommufd; > bool slot = false; > > if (copy_from_user(&hdr, arg, minsz)) > @@ -1355,7 +1357,12 @@ static int vfio_pci_ioctl_pci_hot_reset(struct > vfio_pci_core_device *vdev, > else if (pci_probe_reset_bus(vdev->pdev->bus)) > return -ENODEV; > > - return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, arg); > + if (hdr.count) > + return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, > arg); > + > + iommufd = vfio_iommufd_physical_ictx(&vdev->vdev); > + > + return vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, NULL, iommufd); > } > > static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev, > @@ -2327,6 +2334,9 @@ static bool vfio_dev_in_groups(struct > vfio_pci_core_device *vdev, > { > unsigned int i; > > + if (!groups) > + return false; > + > for (i = 0; i < groups->count; i++) > if (vfio_file_has_dev(groups->files[i], &vdev->vdev)) > return true; > @@ -2402,13 +2412,25 @@ static int vfio_pci_dev_set_pm_runtime_get(struct > vfio_device_set *dev_set) > return ret; > } > > +static bool vfio_dev_in_iommufd_ctx(struct vfio_pci_core_device *vdev, > + struct iommufd_ctx *iommufd_ctx) > +{ > + struct iommufd_ctx *iommufd = vfio_iommufd_physical_ictx(&vdev->vdev); > + > + if (!iommufd) > + return false; > + > + return iommufd == iommufd_ctx; > +} > + > /* > * We need to get memory_lock for each device, but devices can share > mmap_lock, > * therefore we need to zap and hold the vma_lock for each device, and only > then > * get each memory_lock. > */ > static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, > - struct vfio_pci_group_info *groups) > + struct vfio_pci_group_info *groups, > + struct iommufd_ctx *iommufd_ctx) > { > struct vfio_pci_core_device *cur_mem;
Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset
On 4/4/23 17:29, Liu, Yi L wrote: >> From: Eric Auger >> Sent: Tuesday, April 4, 2023 11:19 PM >> >> Hi Yi, >> >> On 4/4/23 16:37, Liu, Yi L wrote: >>> Hi Eric, >>> >>>> From: Eric Auger >>>> Sent: Tuesday, April 4, 2023 10:00 PM >>>> >>>> Hi YI, >>>> >>>> On 4/1/23 16:44, Yi Liu wrote: >>>>> If the affected device is not opened by any user, it's safe to reset it >>>>> given it's not in use. >>>>> >>>>> Reviewed-by: Kevin Tian >>>>> Reviewed-by: Jason Gunthorpe >>>>> Tested-by: Yanting Jiang >>>>> Signed-off-by: Yi Liu >>>>> --- >>>>> drivers/vfio/pci/vfio_pci_core.c | 14 +++--- >>>>> include/uapi/linux/vfio.h| 8 >>>>> 2 files changed, 19 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/drivers/vfio/pci/vfio_pci_core.c >>>>> b/drivers/vfio/pci/vfio_pci_core.c >>>>> index 65bbef562268..5d745c9abf05 100644 >>>>> --- a/drivers/vfio/pci/vfio_pci_core.c >>>>> +++ b/drivers/vfio/pci/vfio_pci_core.c >>>>> @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct >>>> vfio_device_set *dev_set, >>>>> list_for_each_entry(cur_vma, &dev_set->device_list, vdev.dev_set_list) { >>>>> /* >>>>> - * Test whether all the affected devices are contained by the >>>>> - * set of groups provided by the user. >>>>> + * Test whether all the affected devices can be reset by the >>>>> + * user. >>>>> + * >>>>> + * Resetting an unused device (not opened) is safe, because >>>>> + * dev_set->lock is held in hot reset path so this device >>>>> + * cannot race being opened by another user simultaneously. >>>>> + * >>>>> + * Otherwise all opened devices in the dev_set must be >>>>> + * contained by the set of groups provided by the user. >>>>>*/ >>>>> - if (!vfio_dev_in_groups(cur_vma, groups)) { >>>>> + if (cur_vma->vdev.open_count && >>>>> + !vfio_dev_in_groups(cur_vma, groups)) { >>>>> ret = -EINVAL; >>>>> goto err_undo; >>>>> } >>>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h >>>>> index 0552e8dcf0cb..f96e5689cffc 100644 >>>>> --- a/include/uapi/linux/vfio.h >>>>> +++ b/include/uapi/linux/vfio.h >>>>> @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info { >>>>> * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13, >>>>> * struct vfio_pci_hot_reset) >>>>> * >>>>> + * Userspace requests hot reset for the devices it uses. Due to the >>>>> + * underlying topology, multiple devices can be affected in the reset >>>> by the reset >>>>> + * while some might be opened by another user. To avoid interference >>>> s/interference/hot reset failure? >>> I don’t think user can really avoid hot reset failure since there may >>> be new devices plugged into the affected slot. Even user has opened >> I don't know the legacy wrt that issue but this sounds a serious issue, >> meaning the reset of an assigned device could impact another device >> belonging to another group not not owned by the user? > but the hot reset shall fail as the group is not owned by the user. sure it shall but I fail to understand if the reset fails or the device plug is somehow delayed until the reset completes. > >>> all the groups/devices reported by VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, >>> the hot reset can fail if new device is plugged in and has not been >>> bound to vfio or opened by another user during the window of >>> _INFO and HOT_RESET. >> with respect to the latter isn't the dev_set lock held during the hot >> reset and sufficient to prevent any new opening to occur? > yes. new open needs to acquire the dev_set lock. So when hot reset > acquires the dev_set lock, then no new open can occur. > > Regards, > Yi Liu > >>> maybe the whole statement should be as below: >>> >>> To avoid interference, the hot reset can only be conducted when all >>> the affected devices are either opened by the calling user or not >>> opened yet at the moment of the hot reset attempt. >> OK >> >> Eric >>>>> + * the calling user must ensure all affected devices, if opened, are >>>>> + * owned by itself. >>>>> + * >>>>> + * The ownership is proved by an array of group fds. >>>>> + * >>>>> * Return: 0 on success, -errno on failure. >>>>> */ >>>>> struct vfio_pci_hot_reset { >>> Regards, >>> Yi Liu
Re: [Intel-gfx] [PATCH v3 04/12] vfio-iommufd: Add helper to retrieve iommufd_ctx and devid for vfio_device
Hi, On 4/1/23 16:44, Yi Liu wrote: > This is needed by the vfio-pci driver to report affected devices in the > hot reset for a given device. > > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > drivers/iommu/iommufd/device.c | 12 > drivers/vfio/iommufd.c | 14 ++ > include/linux/iommufd.h| 3 +++ > include/linux/vfio.h | 13 + > 4 files changed, 42 insertions(+) > > diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c > index 25115d401d8f..04a57aa1ae2c 100644 > --- a/drivers/iommu/iommufd/device.c > +++ b/drivers/iommu/iommufd/device.c > @@ -131,6 +131,18 @@ void iommufd_device_unbind(struct iommufd_device *idev) > } > EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD); > > +struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev) > +{ > + return idev->ictx; > +} > +EXPORT_SYMBOL_NS_GPL(iommufd_device_to_ictx, IOMMUFD); > + > +u32 iommufd_device_to_id(struct iommufd_device *idev) > +{ > + return idev->obj.id; > +} > +EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD); > + > static int iommufd_device_setup_msi(struct iommufd_device *idev, > struct iommufd_hw_pagetable *hwpt, > phys_addr_t sw_msi_start) > diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c > index 88b00c501015..809f2dd73b9e 100644 > --- a/drivers/vfio/iommufd.c > +++ b/drivers/vfio/iommufd.c > @@ -66,6 +66,20 @@ void vfio_iommufd_unbind(struct vfio_device *vdev) > vdev->ops->unbind_iommufd(vdev); > } > > +struct iommufd_ctx *vfio_iommufd_physical_ictx(struct vfio_device *vdev) > +{ > + if (!vdev->iommufd_device) > + return NULL; > + return iommufd_device_to_ictx(vdev->iommufd_device); > +} > +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_ictx); > + > +void vfio_iommufd_physical_devid(struct vfio_device *vdev, u32 *id) > +{ > + if (vdev->iommufd_device) > + *id = iommufd_device_to_id(vdev->iommufd_device); since there is no return value, may be worth to add at least a WARN_ON in case of !vdev->iommufd_device > +} > +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_devid); > /* > * The physical standard ops mean that the iommufd_device is bound to the > * physical device vdev->dev that was provided to vfio_init_group_dev(). > Drivers > diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h > index 1129a36a74c4..ac96df406833 100644 > --- a/include/linux/iommufd.h > +++ b/include/linux/iommufd.h > @@ -24,6 +24,9 @@ void iommufd_device_unbind(struct iommufd_device *idev); > int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id); > void iommufd_device_detach(struct iommufd_device *idev); > > +struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev); > +u32 iommufd_device_to_id(struct iommufd_device *idev); > + > struct iommufd_access_ops { > u8 needs_pin_pages : 1; > void (*unmap)(void *data, unsigned long iova, unsigned long length); > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index 3188d8a374bd..97a1174b922f 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -113,6 +113,8 @@ struct vfio_device_ops { > }; > > #if IS_ENABLED(CONFIG_IOMMUFD) > +struct iommufd_ctx *vfio_iommufd_physical_ictx(struct vfio_device *vdev); > +void vfio_iommufd_physical_devid(struct vfio_device *vdev, u32 *id); > int vfio_iommufd_physical_bind(struct vfio_device *vdev, > struct iommufd_ctx *ictx, u32 *out_device_id); > void vfio_iommufd_physical_unbind(struct vfio_device *vdev); > @@ -122,6 +124,17 @@ int vfio_iommufd_emulated_bind(struct vfio_device *vdev, > void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); > int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id); > #else > +static inline struct iommufd_ctx * > +vfio_iommufd_physical_ictx(struct vfio_device *vdev) > +{ > + return NULL; > +} > + > +static inline void > +vfio_iommufd_physical_devid(struct vfio_device *vdev, u32 *id) > +{ > +} > + > #define vfio_iommufd_physical_bind \ > ((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \ > u32 *out_device_id)) NULL) besides Reviewed-by: Eric Auger Eric
Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset
Hi Yi, On 4/4/23 16:37, Liu, Yi L wrote: > Hi Eric, > >> From: Eric Auger >> Sent: Tuesday, April 4, 2023 10:00 PM >> >> Hi YI, >> >> On 4/1/23 16:44, Yi Liu wrote: >>> If the affected device is not opened by any user, it's safe to reset it >>> given it's not in use. >>> >>> Reviewed-by: Kevin Tian >>> Reviewed-by: Jason Gunthorpe >>> Tested-by: Yanting Jiang >>> Signed-off-by: Yi Liu >>> --- >>> drivers/vfio/pci/vfio_pci_core.c | 14 +++--- >>> include/uapi/linux/vfio.h| 8 >>> 2 files changed, 19 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/vfio/pci/vfio_pci_core.c >>> b/drivers/vfio/pci/vfio_pci_core.c >>> index 65bbef562268..5d745c9abf05 100644 >>> --- a/drivers/vfio/pci/vfio_pci_core.c >>> +++ b/drivers/vfio/pci/vfio_pci_core.c >>> @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct >> vfio_device_set *dev_set, >>> list_for_each_entry(cur_vma, &dev_set->device_list, vdev.dev_set_list) { >>> /* >>> -* Test whether all the affected devices are contained by the >>> -* set of groups provided by the user. >>> +* Test whether all the affected devices can be reset by the >>> +* user. >>> +* >>> +* Resetting an unused device (not opened) is safe, because >>> +* dev_set->lock is held in hot reset path so this device >>> +* cannot race being opened by another user simultaneously. >>> +* >>> +* Otherwise all opened devices in the dev_set must be >>> +* contained by the set of groups provided by the user. >>> */ >>> - if (!vfio_dev_in_groups(cur_vma, groups)) { >>> + if (cur_vma->vdev.open_count && >>> + !vfio_dev_in_groups(cur_vma, groups)) { >>> ret = -EINVAL; >>> goto err_undo; >>> } >>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h >>> index 0552e8dcf0cb..f96e5689cffc 100644 >>> --- a/include/uapi/linux/vfio.h >>> +++ b/include/uapi/linux/vfio.h >>> @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info { >>> * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13, >>> * struct vfio_pci_hot_reset) >>> * >>> + * Userspace requests hot reset for the devices it uses. Due to the >>> + * underlying topology, multiple devices can be affected in the reset >> by the reset >>> + * while some might be opened by another user. To avoid interference >> s/interference/hot reset failure? > I don’t think user can really avoid hot reset failure since there may > be new devices plugged into the affected slot. Even user has opened I don't know the legacy wrt that issue but this sounds a serious issue, meaning the reset of an assigned device could impact another device belonging to another group not not owned by the user? > all the groups/devices reported by VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, > the hot reset can fail if new device is plugged in and has not been > bound to vfio or opened by another user during the window of > _INFO and HOT_RESET. with respect to the latter isn't the dev_set lock held during the hot reset and sufficient to prevent any new opening to occur? > > maybe the whole statement should be as below: > > To avoid interference, the hot reset can only be conducted when all > the affected devices are either opened by the calling user or not > opened yet at the moment of the hot reset attempt. OK Eric > >>> + * the calling user must ensure all affected devices, if opened, are >>> + * owned by itself. >>> + * >>> + * The ownership is proved by an array of group fds. >>> + * >>> * Return: 0 on success, -errno on failure. >>> */ >>> struct vfio_pci_hot_reset { > Regards, > Yi Liu
Re: [Intel-gfx] [PATCH v3 01/12] vfio/pci: Update comment around group_fd get in vfio_pci_ioctl_pci_hot_reset()
Hi Yi, On 4/1/23 16:44, Yi Liu wrote: > this suits more on what the code does. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > drivers/vfio/pci/vfio_pci_core.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index a5ab416cf476..65bbef562268 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -1308,9 +1308,8 @@ static int vfio_pci_ioctl_pci_hot_reset(struct > vfio_pci_core_device *vdev, > } > > /* > - * For each group_fd, get the group through the vfio external user > - * interface and store the group and iommu ID. This ensures the group > - * is held across the reset. > + * Get the group file for each fd to ensure the group held across to ensure the group is held Besides Reviewed-by: Eric Auger Eric > + * the reset >*/ > for (file_idx = 0; file_idx < hdr.count; file_idx++) { > struct file *file = fget(group_fds[file_idx]);
Re: [Intel-gfx] [PATCH v3 03/12] vfio/pci: Move the existing hot reset logic to be a helper
if (hdr.argsz < minsz || hdr.flags) > + return -EINVAL; > + > + /* Can we do a slot or bus reset or neither? */ > + if (!pci_probe_reset_slot(vdev->pdev->slot)) > + slot = true; > + else if (pci_probe_reset_bus(vdev->pdev->bus)) > + return -ENODEV; > + > + return vfio_pci_ioctl_pci_hot_reset_groups(vdev, &hdr, slot, arg); > +} > + > static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev, > struct vfio_device_ioeventfd __user *arg) > { Besides Reviewed-by: Eric Auger Thanks Eric
Re: [Intel-gfx] [PATCH v3 02/12] vfio/pci: Only check ownership of opened devices in hot reset
Hi YI, On 4/1/23 16:44, Yi Liu wrote: > If the affected device is not opened by any user, it's safe to reset it > given it's not in use. > > Reviewed-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > Tested-by: Yanting Jiang > Signed-off-by: Yi Liu > --- > drivers/vfio/pci/vfio_pci_core.c | 14 +++--- > include/uapi/linux/vfio.h| 8 > 2 files changed, 19 insertions(+), 3 deletions(-) > > diff --git a/drivers/vfio/pci/vfio_pci_core.c > b/drivers/vfio/pci/vfio_pci_core.c > index 65bbef562268..5d745c9abf05 100644 > --- a/drivers/vfio/pci/vfio_pci_core.c > +++ b/drivers/vfio/pci/vfio_pci_core.c > @@ -2429,10 +2429,18 @@ static int vfio_pci_dev_set_hot_reset(struct > vfio_device_set *dev_set, > > list_for_each_entry(cur_vma, &dev_set->device_list, vdev.dev_set_list) { > /* > - * Test whether all the affected devices are contained by the > - * set of groups provided by the user. > + * Test whether all the affected devices can be reset by the > + * user. > + * > + * Resetting an unused device (not opened) is safe, because > + * dev_set->lock is held in hot reset path so this device > + * cannot race being opened by another user simultaneously. > + * > + * Otherwise all opened devices in the dev_set must be > + * contained by the set of groups provided by the user. >*/ > - if (!vfio_dev_in_groups(cur_vma, groups)) { > + if (cur_vma->vdev.open_count && > + !vfio_dev_in_groups(cur_vma, groups)) { > ret = -EINVAL; > goto err_undo; > } > diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h > index 0552e8dcf0cb..f96e5689cffc 100644 > --- a/include/uapi/linux/vfio.h > +++ b/include/uapi/linux/vfio.h > @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info { > * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13, > * struct vfio_pci_hot_reset) > * > + * Userspace requests hot reset for the devices it uses. Due to the > + * underlying topology, multiple devices can be affected in the reset by the reset > + * while some might be opened by another user. To avoid interference s/interference/hot reset failure? > + * the calling user must ensure all affected devices, if opened, are > + * owned by itself. > + * > + * The ownership is proved by an array of group fds. > + * > * Return: 0 on success, -errno on failure. > */ > struct vfio_pci_hot_reset { Thanks Eric
Re: [Intel-gfx] [PATCH v2 15/15] vfio: Add struct device to vfio_device
On 9/8/22 11:17, Yi Liu wrote: > On 2022/9/8 17:06, Eric Auger wrote: >> Hi Kevin, >> >> On 9/1/22 16:37, Kevin Tian wrote: >>> From: Yi Liu >>> >>> and replace kref. With it a 'vfio-dev/vfioX' node is created under the >>> sysfs path of the parent, indicating the device is bound to a vfio >>> driver, e.g.: >>> >>> /sys/devices/pci\:6f/\:6f\:01.0/vfio-dev/vfio0 >>> >>> It is also a preparatory step toward adding cdev for supporting future >>> device-oriented uAPI. >>> >>> Add Documentation/ABI/testing/sysfs-devices-vfio-dev. >>> >>> Also take this chance to rename chardev 'vfio' to 'vfio-group' in >>> /proc/devices. >>> >>> Suggested-by: Jason Gunthorpe >>> Signed-off-by: Yi Liu >>> Signed-off-by: Kevin Tian >>> Reviewed-by: Jason Gunthorpe >>> --- >>> .../ABI/testing/sysfs-devices-vfio-dev | 8 +++ >>> drivers/vfio/vfio_main.c | 67 >>> +++ >>> include/linux/vfio.h | 6 +- >>> 3 files changed, 66 insertions(+), 15 deletions(-) >>> create mode 100644 Documentation/ABI/testing/sysfs-devices-vfio-dev >>> >>> diff --git a/Documentation/ABI/testing/sysfs-devices-vfio-dev >>> b/Documentation/ABI/testing/sysfs-devices-vfio-dev >>> new file mode 100644 >>> index ..e21424fd9666 >>> --- /dev/null >>> +++ b/Documentation/ABI/testing/sysfs-devices-vfio-dev >>> @@ -0,0 +1,8 @@ >>> +What: /sys/...//vfio-dev/vfioX/ >>> +Date: September 2022 >>> +Contact: Yi Liu >>> +Description: >>> + This directory is created when the device is bound to a >>> + vfio driver. The layout under this directory matches what >>> + exists for a standard 'struct device'. 'X' is a unique >>> + index marking this device in vfio. >>> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c >>> index bfa675d314ab..141f55c3faf5 100644 >>> --- a/drivers/vfio/vfio_main.c >>> +++ b/drivers/vfio/vfio_main.c >>> @@ -46,6 +46,8 @@ static struct vfio { >>> struct mutex group_lock; /* locks group_list */ >>> struct ida group_ida; >>> dev_t group_devt; >>> + struct class *device_class; >>> + struct ida device_ida; >>> } vfio; >>> struct vfio_iommu_driver { >>> @@ -483,12 +485,13 @@ static struct vfio_device >>> *vfio_group_get_device(struct vfio_group *group, >>> * VFIO driver API >>> */ >>> /* Release helper called by vfio_put_device() */ >>> -void vfio_device_release(struct kref *kref) >>> +static void vfio_device_release(struct device *dev) >>> { >>> struct vfio_device *device = >>> - container_of(kref, struct vfio_device, kref); >>> + container_of(dev, struct vfio_device, device); >>> vfio_release_device_set(device); >>> + ida_free(&vfio.device_ida, device->index); >>> /* >>> * kvfree() cannot be done here due to a life cycle mess in >>> @@ -498,7 +501,6 @@ void vfio_device_release(struct kref *kref) >>> */ >>> device->ops->release(device); >>> } >>> -EXPORT_SYMBOL_GPL(vfio_device_release); >>> /* >>> * Alloc and initialize vfio_device so it can be registered to vfio >>> @@ -546,6 +548,13 @@ int vfio_init_device(struct vfio_device >>> *device, struct device *dev, >>> { >>> int ret; >>> + ret = ida_alloc_max(&vfio.device_ida, MINORMASK, GFP_KERNEL); >>> + if (ret < 0) { >>> + dev_dbg(dev, "Error to alloc index\n"); >>> + return ret; >>> + } >>> + >>> + device->index = ret; >>> init_completion(&device->comp); >>> device->dev = dev; >>> device->ops = ops; >>> @@ -556,11 +565,15 @@ int vfio_init_device(struct vfio_device >>> *device, struct device *dev, >>> goto out_uninit; >>> } >>> - kref_init(&device->kref); >>> + device_initialize(&device->device); >>> + device->device.release = vfio_device_release; >>> + device->d
Re: [Intel-gfx] [PATCH v2 01/15] vfio: Add helpers for unifying vfio_device life cycle
Hi Kevin, On 9/8/22 08:19, Tian, Kevin wrote: >> From: Eric Auger >> Sent: Thursday, September 8, 2022 3:28 AM >>> +/* >>> + * Alloc and initialize vfio_device so it can be registered to vfio >>> + * core. >>> + * >>> + * Drivers should use the wrapper vfio_alloc_device() for allocation. >>> + * @size is the size of the structure to be allocated, including any >>> + * private data used by the driver. >>> + * >>> + * Driver may provide an @init callback to cover device private data. >> nit: this comment may rather relate to the vfio_init_device function > Yes but vfio_init_device() is used only by ccw and presumably will be > abandoned once ccw fixes its life cycle mess. Given that I prefer to leaving > the comment here to be noted by broader users. OK Eric > >> Besides >> >> Reviewed-by: Eric Auger >> > Thanks and other comments adopted.
Re: [Intel-gfx] [PATCH v2 15/15] vfio: Add struct device to vfio_device
Hi Kevin, On 9/1/22 16:37, Kevin Tian wrote: > From: Yi Liu > > and replace kref. With it a 'vfio-dev/vfioX' node is created under the > sysfs path of the parent, indicating the device is bound to a vfio > driver, e.g.: > > /sys/devices/pci\:6f/\:6f\:01.0/vfio-dev/vfio0 > > It is also a preparatory step toward adding cdev for supporting future > device-oriented uAPI. > > Add Documentation/ABI/testing/sysfs-devices-vfio-dev. > > Also take this chance to rename chardev 'vfio' to 'vfio-group' in > /proc/devices. > > Suggested-by: Jason Gunthorpe > Signed-off-by: Yi Liu > Signed-off-by: Kevin Tian > Reviewed-by: Jason Gunthorpe > --- > .../ABI/testing/sysfs-devices-vfio-dev| 8 +++ > drivers/vfio/vfio_main.c | 67 +++ > include/linux/vfio.h | 6 +- > 3 files changed, 66 insertions(+), 15 deletions(-) > create mode 100644 Documentation/ABI/testing/sysfs-devices-vfio-dev > > diff --git a/Documentation/ABI/testing/sysfs-devices-vfio-dev > b/Documentation/ABI/testing/sysfs-devices-vfio-dev > new file mode 100644 > index ..e21424fd9666 > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-devices-vfio-dev > @@ -0,0 +1,8 @@ > +What: /sys/...//vfio-dev/vfioX/ > +Date: September 2022 > +Contact: Yi Liu > +Description: > + This directory is created when the device is bound to a > + vfio driver. The layout under this directory matches what > + exists for a standard 'struct device'. 'X' is a unique > + index marking this device in vfio. > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index bfa675d314ab..141f55c3faf5 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -46,6 +46,8 @@ static struct vfio { > struct mutexgroup_lock; /* locks group_list */ > struct ida group_ida; > dev_t group_devt; > + struct class*device_class; > + struct ida device_ida; > } vfio; > > struct vfio_iommu_driver { > @@ -483,12 +485,13 @@ static struct vfio_device *vfio_group_get_device(struct > vfio_group *group, > * VFIO driver API > */ > /* Release helper called by vfio_put_device() */ > -void vfio_device_release(struct kref *kref) > +static void vfio_device_release(struct device *dev) > { > struct vfio_device *device = > - container_of(kref, struct vfio_device, kref); > + container_of(dev, struct vfio_device, device); > > vfio_release_device_set(device); > + ida_free(&vfio.device_ida, device->index); > > /* >* kvfree() cannot be done here due to a life cycle mess in > @@ -498,7 +501,6 @@ void vfio_device_release(struct kref *kref) >*/ > device->ops->release(device); > } > -EXPORT_SYMBOL_GPL(vfio_device_release); > > /* > * Alloc and initialize vfio_device so it can be registered to vfio > @@ -546,6 +548,13 @@ int vfio_init_device(struct vfio_device *device, struct > device *dev, > { > int ret; > > + ret = ida_alloc_max(&vfio.device_ida, MINORMASK, GFP_KERNEL); > + if (ret < 0) { > + dev_dbg(dev, "Error to alloc index\n"); > + return ret; > + } > + > + device->index = ret; > init_completion(&device->comp); > device->dev = dev; > device->ops = ops; > @@ -556,11 +565,15 @@ int vfio_init_device(struct vfio_device *device, struct > device *dev, > goto out_uninit; > } > > - kref_init(&device->kref); > + device_initialize(&device->device); > + device->device.release = vfio_device_release; > + device->device.class = vfio.device_class; > + device->device.parent = device->dev; > return 0; > > out_uninit: > vfio_release_device_set(device); > + ida_free(&vfio.device_ida, device->index); > return ret; > } > EXPORT_SYMBOL_GPL(vfio_init_device); > @@ -657,6 +670,7 @@ static int __vfio_register_dev(struct vfio_device *device, > struct vfio_group *group) > { > struct vfio_device *existing_device; > + int ret; > > if (IS_ERR(group)) > return PTR_ERR(group); > @@ -673,16 +687,21 @@ static int __vfio_register_dev(struct vfio_device > *device, > dev_WARN(device->dev, "Device already exists on group %d\n", >iommu_group_id(group->iommu_group)); > vfio_device_put_registration(existing_device); > - if (group->type == VFIO_NO_IOMMU || > - group->type == VFIO_EMULATED_IOMMU) > - iommu_group_remove_device(device->dev); > - vfio_group_put(group); > - return -EBUSY; > + ret = -EBUSY; > + goto err_out; > } > > /* Our reference on group is moved to the d
Re: [Intel-gfx] [PATCH v2 14/15] vfio: Rename vfio_device_put() and vfio_device_try_get()
On 9/1/22 16:37, Kevin Tian wrote: > With the addition of vfio_put_device() now the names become confusing. > > vfio_put_device() is clear from object life cycle p.o.v given kref. > > vfio_device_put()/vfio_device_try_get() are helpers for tracking > users on a registered device. > > Now rename them: > > - vfio_device_put() -> vfio_device_put_registration() > - vfio_device_try_get() -> vfio_device_try_get_registration() > > Signed-off-by: Kevin Tian > Reviewed-by: Jason Gunthorpe Reviewed-by: Eric Auger Eric > --- > drivers/vfio/vfio_main.c | 17 + > 1 file changed, 9 insertions(+), 8 deletions(-) > > diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c > index 957d9f286550..bfa675d314ab 100644 > --- a/drivers/vfio/vfio_main.c > +++ b/drivers/vfio/vfio_main.c > @@ -451,13 +451,13 @@ static void vfio_group_get(struct vfio_group *group) > * Device objects - create, release, get, put, search > */ > /* Device reference always implies a group reference */ > -static void vfio_device_put(struct vfio_device *device) > +static void vfio_device_put_registration(struct vfio_device *device) > { > if (refcount_dec_and_test(&device->refcount)) > complete(&device->comp); > } > > -static bool vfio_device_try_get(struct vfio_device *device) > +static bool vfio_device_try_get_registration(struct vfio_device *device) > { > return refcount_inc_not_zero(&device->refcount); > } > @@ -469,7 +469,8 @@ static struct vfio_device *vfio_group_get_device(struct > vfio_group *group, > > mutex_lock(&group->device_lock); > list_for_each_entry(device, &group->device_list, group_next) { > - if (device->dev == dev && vfio_device_try_get(device)) { > + if (device->dev == dev && > + vfio_device_try_get_registration(device)) { > mutex_unlock(&group->device_lock); > return device; > } > @@ -671,7 +672,7 @@ static int __vfio_register_dev(struct vfio_device *device, > if (existing_device) { > dev_WARN(device->dev, "Device already exists on group %d\n", >iommu_group_id(group->iommu_group)); > - vfio_device_put(existing_device); > + vfio_device_put_registration(existing_device); > if (group->type == VFIO_NO_IOMMU || > group->type == VFIO_EMULATED_IOMMU) > iommu_group_remove_device(device->dev); > @@ -730,7 +731,7 @@ static struct vfio_device > *vfio_device_get_from_name(struct vfio_group *group, > ret = !strcmp(dev_name(it->dev), buf); > } > > - if (ret && vfio_device_try_get(it)) { > + if (ret && vfio_device_try_get_registration(it)) { > device = it; > break; > } > @@ -750,7 +751,7 @@ void vfio_unregister_group_dev(struct vfio_device *device) > bool interrupted = false; > long rc; > > - vfio_device_put(device); > + vfio_device_put_registration(device); > rc = try_wait_for_completion(&device->comp); > while (rc <= 0) { > if (device->ops->request) > @@ -1286,7 +1287,7 @@ static int vfio_group_get_device_fd(struct vfio_group > *group, char *buf) > err_put_fdno: > put_unused_fd(fdno); > err_put_device: > - vfio_device_put(device); > + vfio_device_put_registration(device); > return ret; > } > > @@ -1461,7 +1462,7 @@ static int vfio_device_fops_release(struct inode > *inode, struct file *filep) > > vfio_device_unassign_container(device); > > - vfio_device_put(device); > + vfio_device_put_registration(device); > > return 0; > }
Re: [Intel-gfx] [PATCH v2 12/15] vfio/amba: Use the new device life cycle helpers
vfio_platform_mmap(struct vfio_device *core_vdev, > struct vm_area_struct *vma > } > EXPORT_SYMBOL_GPL(vfio_platform_mmap); > > -static const struct vfio_device_ops vfio_platform_ops = { > - .name = "vfio-platform", > - .open_device= vfio_platform_open_device, > - .close_device = vfio_platform_close_device, > - .ioctl = vfio_platform_ioctl, > - .read = vfio_platform_read, > - .write = vfio_platform_write, > - .mmap = vfio_platform_mmap, > -}; > - > static int vfio_platform_of_probe(struct vfio_platform_device *vdev, > struct device *dev) > { > @@ -674,56 +664,6 @@ void vfio_platform_release_common(struct > vfio_platform_device *vdev) > } > EXPORT_SYMBOL_GPL(vfio_platform_release_common); > > -int vfio_platform_probe_common(struct vfio_platform_device *vdev, > -struct device *dev) > -{ > - int ret; > - > - vfio_init_group_dev(&vdev->vdev, dev, &vfio_platform_ops); > - > - ret = vfio_platform_acpi_probe(vdev, dev); > - if (ret) > - ret = vfio_platform_of_probe(vdev, dev); > - > - if (ret) > - goto out_uninit; > - > - vdev->device = dev; > - > - ret = vfio_platform_get_reset(vdev); > - if (ret && vdev->reset_required) { > - dev_err(dev, "No reset function found for device %s\n", > - vdev->name); > - goto out_uninit; > - } > - > - ret = vfio_register_group_dev(&vdev->vdev); > - if (ret) > - goto put_reset; > - > - mutex_init(&vdev->igate); > - > - pm_runtime_enable(dev); > - return 0; > - > -put_reset: > - vfio_platform_put_reset(vdev); > -out_uninit: > - vfio_uninit_group_dev(&vdev->vdev); > - return ret; > -} > -EXPORT_SYMBOL_GPL(vfio_platform_probe_common); > - > -void vfio_platform_remove_common(struct vfio_platform_device *vdev) > -{ > - vfio_unregister_group_dev(&vdev->vdev); > - > - pm_runtime_disable(vdev->device); > - vfio_platform_put_reset(vdev); > - vfio_uninit_group_dev(&vdev->vdev); > -} > -EXPORT_SYMBOL_GPL(vfio_platform_remove_common); > - > void __vfio_platform_register_reset(struct vfio_platform_reset_node *node) > { > mutex_lock(&driver_lock); > diff --git a/drivers/vfio/platform/vfio_platform_private.h > b/drivers/vfio/platform/vfio_platform_private.h > index a769d649fb97..8d8fab516849 100644 > --- a/drivers/vfio/platform/vfio_platform_private.h > +++ b/drivers/vfio/platform/vfio_platform_private.h > @@ -78,9 +78,6 @@ struct vfio_platform_reset_node { > vfio_platform_reset_fn_t of_reset; > }; > > -int vfio_platform_probe_common(struct vfio_platform_device *vdev, > -struct device *dev); > -void vfio_platform_remove_common(struct vfio_platform_device *vdev); > int vfio_platform_init_common(struct vfio_platform_device *vdev); > void vfio_platform_release_common(struct vfio_platform_device *vdev); > Reviewed-by: Eric Auger Eric
Re: [Intel-gfx] [PATCH v2 01/15] vfio: Add helpers for unifying vfio_device life cycle
goto out_uninit; > + } > + > + kref_init(&device->kref); > + return 0; > + > +out_uninit: > + vfio_uninit_group_dev(device); > + return ret; > +} > +EXPORT_SYMBOL_GPL(vfio_init_device); > + > +/* > + * The helper called by driver @release callback to free the device > + * structure. Drivers which don't have private data to clean can > + * simply use this helper as its @release. > + */ > +void vfio_free_device(struct vfio_device *device) > +{ > + kvfree(device); > +} > +EXPORT_SYMBOL_GPL(vfio_free_device); > + > static struct vfio_group *vfio_noiommu_group_alloc(struct device *dev, > enum vfio_group_type type) > { > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index e05ddc6fe6a5..e1e9e8352903 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -45,7 +45,8 @@ struct vfio_device { > struct kvm *kvm; > > /* Members below here are private, not for driver use */ > - refcount_t refcount; > + struct kref kref; /* object life cycle */ > + refcount_t refcount;/* user count on registered device*/ > unsigned int open_count; > struct completion comp; > struct list_head group_next; > @@ -55,6 +56,8 @@ struct vfio_device { > /** > * struct vfio_device_ops - VFIO bus driver device callbacks > * > + * @init: initialize private fields in device structure > + * @release: Reclaim private fields in device structure > * @open_device: Called when the first file descriptor is opened for this > device > * @close_device: Opposite of open_device > * @read: Perform read(2) on device file descriptor > @@ -72,6 +75,8 @@ struct vfio_device { > */ > struct vfio_device_ops { > char*name; > + int (*init)(struct vfio_device *vdev); > + void(*release)(struct vfio_device *vdev); > int (*open_device)(struct vfio_device *vdev); > void(*close_device)(struct vfio_device *vdev); > ssize_t (*read)(struct vfio_device *vdev, char __user *buf, > @@ -137,6 +142,24 @@ static inline int vfio_check_feature(u32 flags, size_t > argsz, u32 supported_ops, > return 1; > } > > +struct vfio_device *_vfio_alloc_device(size_t size, struct device *dev, > +const struct vfio_device_ops *ops); > +#define vfio_alloc_device(dev_struct, member, dev, ops) > \ > + container_of(_vfio_alloc_device(sizeof(struct dev_struct) + > \ > + BUILD_BUG_ON_ZERO(offsetof( > \ > + struct dev_struct, member)), > \ > + dev, ops), > \ > + struct dev_struct, member) > + > +int vfio_init_device(struct vfio_device *device, struct device *dev, > + const struct vfio_device_ops *ops); > +void vfio_free_device(struct vfio_device *device); > +void vfio_device_release(struct kref *kref); > +static inline void vfio_put_device(struct vfio_device *device) > +{ > + kref_put(&device->kref, vfio_device_release); > +} > + > void vfio_init_group_dev(struct vfio_device *device, struct device *dev, >const struct vfio_device_ops *ops); > void vfio_uninit_group_dev(struct vfio_device *device); Besides Reviewed-by: Eric Auger Eric
Re: [Intel-gfx] [PATCH v2 11/15] vfio/platform: Use the new device life cycle helpers
obe(vdev, dev); > + if (ret) > + ret = vfio_platform_of_probe(vdev, dev); > + > + if (ret) > + return ret; > + > + vdev->device = dev; > + mutex_init(&vdev->igate); > + > + ret = vfio_platform_get_reset(vdev); > + if (ret && vdev->reset_required) > + dev_err(dev, "No reset function found for device %s\n", > + vdev->name); > + return ret; > +} > +EXPORT_SYMBOL_GPL(vfio_platform_init_common); > + > +void vfio_platform_release_common(struct vfio_platform_device *vdev) > +{ > + vfio_platform_put_reset(vdev); > +} > +EXPORT_SYMBOL_GPL(vfio_platform_release_common); > + > int vfio_platform_probe_common(struct vfio_platform_device *vdev, > struct device *dev) > { > diff --git a/drivers/vfio/platform/vfio_platform_private.h > b/drivers/vfio/platform/vfio_platform_private.h > index 691b43f4b2b2..a769d649fb97 100644 > --- a/drivers/vfio/platform/vfio_platform_private.h > +++ b/drivers/vfio/platform/vfio_platform_private.h > @@ -81,6 +81,21 @@ struct vfio_platform_reset_node { > int vfio_platform_probe_common(struct vfio_platform_device *vdev, > struct device *dev); > void vfio_platform_remove_common(struct vfio_platform_device *vdev); > +int vfio_platform_init_common(struct vfio_platform_device *vdev); > +void vfio_platform_release_common(struct vfio_platform_device *vdev); > + > +int vfio_platform_open_device(struct vfio_device *core_vdev); > +void vfio_platform_close_device(struct vfio_device *core_vdev); > +long vfio_platform_ioctl(struct vfio_device *core_vdev, > + unsigned int cmd, unsigned long arg); > +ssize_t vfio_platform_read(struct vfio_device *core_vdev, > +char __user *buf, size_t count, > +loff_t *ppos); > +ssize_t vfio_platform_write(struct vfio_device *core_vdev, > + const char __user *buf, > + size_t count, loff_t *ppos); > +int vfio_platform_mmap(struct vfio_device *core_vdev, > +struct vm_area_struct *vma); > > int vfio_platform_irq_init(struct vfio_platform_device *vdev); > void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev); Looks good to me. I also ran basic non regression testing Reviewed-by: Eric Auger Tested-by: Eric Auger Eric
Re: [Intel-gfx] [PATCH v3 07/14] vfio/platform: Use open_device() instead of open coding a refcnt scheme
Hi Jason, On 7/29/21 2:49 AM, Jason Gunthorpe wrote: > Platform simply wants to run some code when the device is first > opened/last closed. Use the core framework and locking for this. Aside > from removing a bit of code this narrows the locking scope from a global > lock. > > Signed-off-by: Jason Gunthorpe > Signed-off-by: Yishai Hadas > Reviewed-by: Cornelia Huck > Reviewed-by: Christoph Hellwig Reviewed-by: Eric Auger Thanks Eric > --- > drivers/vfio/platform/vfio_platform_common.c | 79 --- > drivers/vfio/platform/vfio_platform_private.h | 1 - > 2 files changed, 32 insertions(+), 48 deletions(-) > > diff --git a/drivers/vfio/platform/vfio_platform_common.c > b/drivers/vfio/platform/vfio_platform_common.c > index bdde8605178cd2..6af7ce7d619c25 100644 > --- a/drivers/vfio/platform/vfio_platform_common.c > +++ b/drivers/vfio/platform/vfio_platform_common.c > @@ -218,65 +218,52 @@ static int vfio_platform_call_reset(struct > vfio_platform_device *vdev, > return -EINVAL; > } > > -static void vfio_platform_release(struct vfio_device *core_vdev) > +static void vfio_platform_close_device(struct vfio_device *core_vdev) > { > struct vfio_platform_device *vdev = > container_of(core_vdev, struct vfio_platform_device, vdev); > + const char *extra_dbg = NULL; > + int ret; > > - mutex_lock(&driver_lock); > - > - if (!(--vdev->refcnt)) { > - const char *extra_dbg = NULL; > - int ret; > - > - ret = vfio_platform_call_reset(vdev, &extra_dbg); > - if (ret && vdev->reset_required) { > - dev_warn(vdev->device, "reset driver is required and > reset call failed in release (%d) %s\n", > - ret, extra_dbg ? extra_dbg : ""); > - WARN_ON(1); > - } > - pm_runtime_put(vdev->device); > - vfio_platform_regions_cleanup(vdev); > - vfio_platform_irq_cleanup(vdev); > + ret = vfio_platform_call_reset(vdev, &extra_dbg); > + if (WARN_ON(ret && vdev->reset_required)) { > + dev_warn( > + vdev->device, > + "reset driver is required and reset call failed in > release (%d) %s\n", > + ret, extra_dbg ? extra_dbg : ""); > } > - > - mutex_unlock(&driver_lock); > + pm_runtime_put(vdev->device); > + vfio_platform_regions_cleanup(vdev); > + vfio_platform_irq_cleanup(vdev); > } > > -static int vfio_platform_open(struct vfio_device *core_vdev) > +static int vfio_platform_open_device(struct vfio_device *core_vdev) > { > struct vfio_platform_device *vdev = > container_of(core_vdev, struct vfio_platform_device, vdev); > + const char *extra_dbg = NULL; > int ret; > > - mutex_lock(&driver_lock); > - > - if (!vdev->refcnt) { > - const char *extra_dbg = NULL; > - > - ret = vfio_platform_regions_init(vdev); > - if (ret) > - goto err_reg; > + ret = vfio_platform_regions_init(vdev); > + if (ret) > + return ret; > > - ret = vfio_platform_irq_init(vdev); > - if (ret) > - goto err_irq; > + ret = vfio_platform_irq_init(vdev); > + if (ret) > + goto err_irq; > > - ret = pm_runtime_get_sync(vdev->device); > - if (ret < 0) > - goto err_rst; > + ret = pm_runtime_get_sync(vdev->device); > + if (ret < 0) > + goto err_rst; > > - ret = vfio_platform_call_reset(vdev, &extra_dbg); > - if (ret && vdev->reset_required) { > - dev_warn(vdev->device, "reset driver is required and > reset call failed in open (%d) %s\n", > - ret, extra_dbg ? extra_dbg : ""); > - goto err_rst; > - } > + ret = vfio_platform_call_reset(vdev, &extra_dbg); > + if (ret && vdev->reset_required) { > + dev_warn( > + vdev->device, > + "reset driver is required and reset call failed in open > (%d) %s\n", > + ret, extra_dbg ? extra_dbg : ""); > + goto err_rst; > } > - > - vdev->refcnt++; > - > - mutex_unlock(&driver_lock); > return 0; >