Re: [RFC PATCH 0/1] iommu: Detach device from domain when removed from group

2015-08-03 Thread Joerg Roedel
On Tue, Jul 28, 2015 at 07:55:55PM +0200, Gerald Schaefer wrote:
> On s390, this eventually leads to a kernel panic when binding the device
> again to its non-vfio PCI driver, because of the missing arch-specific
> cleanup in detach_dev. On x86, the detach_dev callback will also not be
> called directly, but there is a notifier that will catch
> BUS_NOTIFY_REMOVED_DEVICE and eventually do the cleanup. Other
> architectures w/o the notifier probably have at least some kind of memory
> leak in this scenario, so a general fix would be nice.

This notifier is not arch-specific, but registered against the bus the
iommu-ops are set for. Why does it not run on s390?


Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/1] iommu: Detach device from domain when removed from group

2015-08-03 Thread Gerald Schaefer
On Mon, 3 Aug 2015 17:48:55 +0200
Joerg Roedel j...@8bytes.org wrote:

> On Tue, Jul 28, 2015 at 07:55:55PM +0200, Gerald Schaefer wrote:
> > On s390, this eventually leads to a kernel panic when binding the device
> > again to its non-vfio PCI driver, because of the missing arch-specific
> > cleanup in detach_dev. On x86, the detach_dev callback will also not be
> > called directly, but there is a notifier that will catch
> > BUS_NOTIFY_REMOVED_DEVICE and eventually do the cleanup. Other
> > architectures w/o the notifier probably have at least some kind of memory
> > leak in this scenario, so a general fix would be nice.
>
> This notifier is not arch-specific, but registered against the bus the
> iommu-ops are set for. Why does it not run on s390?

Adding the notifier would of course also work on s390 (and all other affected
architectures). However, it seems that the missing detach_dev issue in this
scenario is not fundamentally fixed by using this notifier, it just seems to
hide the symptom by chance.

Adding the otherwise unneeded notifier just to work around this issue somehow
doesn't seem right, also given that x86 is so far the only user of it. At
least I thought it would be cleaner to fix it in common code and for all
architectures. Not sure what's wrong with fixing the asymmetry as suggested
in my patch, but I guess there are good reasons for having this asymmetry.

For now, I'll just add the notifier to my s390 implementation and post it
soon.
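
To illustrate the notifier-based cleanup discussed above, here is a minimal
userspace sketch in C. It is only a toy model under stated assumptions, not
kernel code: every toy_* name is hypothetical and merely stands in for the
bus notifier mechanism and the BUS_NOTIFY_REMOVED_DEVICE event.

```c
/* Userspace toy model, NOT kernel code: all toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

enum toy_bus_event { TOY_ADD_DEVICE, TOY_REMOVED_DEVICE };

struct toy_dev {
	int attached;	/* 1 while arch-specific IOMMU state exists */
};

typedef void (*toy_notifier_fn)(enum toy_bus_event ev, struct toy_dev *dev);

#define TOY_MAX_NOTIFIERS 4

struct toy_bus {
	toy_notifier_fn notifiers[TOY_MAX_NOTIFIERS];
	size_t count;
};

/* Register a callback on the bus; returns 0 on success, -1 if full. */
static int toy_bus_register_notifier(struct toy_bus *bus, toy_notifier_fn fn)
{
	assert(fn != NULL);
	if (bus->count == TOY_MAX_NOTIFIERS)
		return -1;
	bus->notifiers[bus->count++] = fn;
	return 0;
}

/* Deliver an event to every callback registered on this bus. */
static void toy_bus_notify(const struct toy_bus *bus, enum toy_bus_event ev,
			   struct toy_dev *dev)
{
	for (size_t i = 0; i < bus->count; i++)
		bus->notifiers[i](ev, dev);
}

/* Cleanup hook: does the work a detach_dev call would otherwise have done. */
static void toy_iommu_notifier(enum toy_bus_event ev, struct toy_dev *dev)
{
	if (ev == TOY_REMOVED_DEVICE && dev->attached)
		dev->attached = 0;
}
```

In this model, a device removed from a bus that has toy_iommu_notifier
registered ends up with attached == 0, while the same removal on a bus
without the notifier leaves the stale state behind. The cleanup happens as a
side effect of the remove event rather than through a detach call, which is
the concern raised above.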

 
 


Re: [RFC PATCH 0/1] iommu: Detach device from domain when removed from group

2015-08-03 Thread Gerald Schaefer
On Tue, 28 Jul 2015 19:55:55 +0200
Gerald Schaefer gerald.schae...@de.ibm.com wrote:

> Hi,
>
> during IOMMU API function testing on s390 I hit the following scenario:
>
> After binding a device to vfio-pci, the user completes the VFIO_SET_IOMMU
> ioctl and stops, see the sample C program below. Now the device is manually
> removed via echo 1 > /sys/bus/pci/devices/.../remove.
>
> Although the SET_IOMMU ioctl triggered the attach_dev callback in the
> underlying IOMMU API, removing the device in this way won't trigger the
> detach_dev callback, neither during remove nor when the user program
> continues with closing group/container.
>
> On s390, this eventually leads to a kernel panic when binding the device
> again to its non-vfio PCI driver, because of the missing arch-specific
> cleanup in detach_dev. On x86, the detach_dev callback will also not be
> called directly, but there is a notifier that will catch
> BUS_NOTIFY_REMOVED_DEVICE and eventually do the cleanup. Other
> architectures w/o the notifier probably have at least some kind of memory
> leak in this scenario, so a general fix would be nice.
>
> My first approach was to try and fix this in VFIO code, but Alex Williamson
> pointed me to some asymmetry in the IOMMU code: iommu_group_add_device()
> will invoke the attach_dev callback, but iommu_group_remove_device() won't
> trigger detach_dev. Fixing this asymmetry would fix the issue for me, but
> is this the correct fix? Any thoughts?

Ping.

The suggested fix may be completely wrong, but not having detach_dev called
seems like a serious issue. Any feedback would be greatly appreciated.


 



[RFC PATCH 0/1] iommu: Detach device from domain when removed from group

2015-07-28 Thread Gerald Schaefer
Hi,

during IOMMU API function testing on s390 I hit the following scenario:

After binding a device to vfio-pci, the user completes the VFIO_SET_IOMMU
ioctl and stops, see the sample C program below. Now the device is manually
removed via echo 1 > /sys/bus/pci/devices/.../remove.

Although the SET_IOMMU ioctl triggered the attach_dev callback in the
underlying IOMMU API, removing the device in this way won't trigger the
detach_dev callback, neither during remove nor when the user program
continues with closing group/container.

On s390, this eventually leads to a kernel panic when binding the device
again to its non-vfio PCI driver, because of the missing arch-specific
cleanup in detach_dev. On x86, the detach_dev callback will also not be
called directly, but there is a notifier that will catch
BUS_NOTIFY_REMOVED_DEVICE and eventually do the cleanup. Other
architectures w/o the notifier probably have at least some kind of memory
leak in this scenario, so a general fix would be nice.

My first approach was to try and fix this in VFIO code, but Alex Williamson
pointed me to some asymmetry in the IOMMU code: iommu_group_add_device()
will invoke the attach_dev callback, but iommu_group_remove_device() won't
trigger detach_dev. Fixing this asymmetry would fix the issue for me, but
is this the correct fix? Any thoughts?
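
To make the asymmetry concrete, here is a minimal userspace sketch in C. It
is a toy model under stated assumptions, not the kernel implementation and
not the proposed patch: every toy_* name is hypothetical, and the two remove
variants only contrast "leave the group without telling the domain" with
"detach first, then leave".

```c
/* Userspace toy model, NOT kernel code: all toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

struct toy_domain {
	int attached;			/* devices currently attached */
};

struct toy_device {
	struct toy_domain *domain;	/* NULL when not attached */
};

/* Stand-in for a driver's attach_dev callback. */
static void toy_attach_dev(struct toy_domain *dom, struct toy_device *dev)
{
	assert(dev->domain == NULL);
	dev->domain = dom;
	dom->attached++;
}

/* Stand-in for a driver's detach_dev callback. */
static void toy_detach_dev(struct toy_domain *dom, struct toy_device *dev)
{
	assert(dev->domain == dom);
	dev->domain = NULL;
	dom->attached--;
}

/* Asymmetric removal: group bookkeeping only, the domain is never told. */
static void toy_group_remove_asymmetric(struct toy_device *dev)
{
	(void)dev;	/* no detach callback is invoked here */
}

/* Symmetric removal: detach from the domain first, if still attached. */
static void toy_group_remove_symmetric(struct toy_device *dev)
{
	if (dev->domain)
		toy_detach_dev(dev->domain, dev);
}
```

With the asymmetric variant, a removed device still points at its domain and
the domain still counts it as attached. That stale state is the toy
equivalent of the missing arch-specific cleanup described above. The
symmetric variant leaves both sides consistent.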

Regards,
Gerald


Here is the sample C program to trigger the ioctl:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

int main(void)
{
	int container, group, rc;

	container = open("/dev/vfio/vfio", O_RDWR);
	if (container < 0) {
		perror("open /dev/vfio/vfio");
		return -1;
	}

	group = open("/dev/vfio/0", O_RDWR);
	if (group < 0) {
		perror("open /dev/vfio/0");
		return -1;
	}

	rc = ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
	if (rc) {
		perror("ioctl VFIO_GROUP_SET_CONTAINER");
		return -1;
	}

	/* This ioctl triggers the attach_dev callback. */
	rc = ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);
	if (rc) {
		perror("ioctl VFIO_SET_IOMMU");
		return -1;
	}

	/* Wait here so the device can be removed via sysfs. */
	printf("Try device remove...\n");
	getchar();

	close(group);
	close(container);
	return 0;
}

Gerald Schaefer (1):
  iommu: Detach device from domain when removed from group

 drivers/iommu/iommu.c | 5 +
 1 file changed, 5 insertions(+)

-- 
2.3.8
