RE: RFC: vfio interface for platform devices (v2)
-----Original Message-----
From: Mario Smarduch [mailto:mario.smard...@huawei.com]
Sent: Thursday, July 04, 2013 9:45 AM
To: Yoder Stuart-B08248
Cc: Alex Williamson; Alexander Graf; Wood Scott-B07421; k...@vger.kernel.org list; Bhushan Bharat-R65777; kvm-ppc@vger.kernel.org; virtualizat...@lists.linux-foundation.org; Sethi Varun-B16395; kvm...@lists.cs.columbia.edu
Subject: Re: RFC: vfio interface for platform devices (v2)

> I'm having trouble understanding how this works where
> the Guest Device Model != Host. How do you inform the guest
> where the device is mapped in its physical address space,
> and handle GPA faults?

The vfio mechanisms just expose hardware to user space, and the user space app may or may not be QEMU. So there may be no 'guest' at all.

The intent of this RFC is to provide enough info to user space so an application can use the device or, in the case of QEMU, expose the device to a VM. Platform devices are typically exposed via the device tree, and that is how I envision them being presented to a guest.

Are there real cases you see where guest device model != host? I don't envision ever presenting a platform device as a PCI device or vice versa.

Stuart
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: RFC: vfio interface for platform devices (v2)
(sorry for the delayed response, but I've been on PTO)

>> 1. VFIO_GROUP_GET_DEVICE_FD
>>
>> User space knows by out-of-band means which device it is accessing
>> and will call VFIO_GROUP_GET_DEVICE_FD passing a specific sysfs path
>> to get the device information:
>>
>>    fd = ioctl(group, VFIO_GROUP_GET_DEVICE_FD,
>>               "/sys/bus/platform/devices/ffe21.usb");
>
> FWIW, I'm in favor of whichever way works out cleaner in the code for
> pre-pending /sys/bus or not. It sort of seems like it's unnecessary.
> It's also a little inconsistent that the returned path doesn't
> pre-pend /sys in the examples below.

Ok. For the returned path in the examples I have the actual device tree path, which is slightly different from the path in /sys. The device tree path is what user space would need to interpret /proc/device-tree.

>> 2. VFIO_DEVICE_GET_INFO
>>
>> The number of regions corresponds to the regions defined
>> in "reg" and "ranges" in the device tree.
>>
>> Two new flags are added to struct vfio_device_info:
>>
>>    #define VFIO_DEVICE_FLAGS_PLATFORM (1 << ?) /* A platform bus device */
>>    #define VFIO_DEVICE_FLAGS_DEVTREE  (1 << ?) /* device tree info available */
>>
>> It is possible that there could be platform bus devices
>> that are not in the device tree, so we use 2 flags to
>> allow for that.
>>
>> If just VFIO_DEVICE_FLAGS_PLATFORM is set, it means
>> that there are regions and IRQs but no device tree info
>> available.
>>
>> If just VFIO_DEVICE_FLAGS_DEVTREE is set, it means
>> there is device tree info available.
>
> But it would be invalid to only have DEVTREE w/o PLATFORM for now,
> right?

Right. The way I stated it is incorrect. DEVTREE would never be set by itself.

>> 3. VFIO_DEVICE_GET_REGION_INFO
>>
>> For platform devices with multiple regions, information
>> is needed to correlate the regions with the device
>> tree structure that drivers use to determine the meaning
>> of device resources.
>>
>> The VFIO_DEVICE_GET_REGION_INFO ioctl is extended to provide
>> device tree information.
>>
>> The following information is needed:
>>    -the device tree path to the node corresponding to the
>>     region
>>    -whether it corresponds to a "reg" or "ranges" property
>>    -there could be multiple sub-regions per "reg" or "ranges",
>>     and the sub-index within the reg/ranges is needed
>>
>> There are 5 new flags added to vfio_region_info:
>>
>>    struct vfio_region_info {
>>        __u32 argsz;
>>        __u32 flags;
>>    #define VFIO_REGION_INFO_FLAG_CACHEABLE     (1 << ?)
>>    #define VFIO_DEVTREE_REGION_INFO_FLAG_REG   (1 << ?)
>>    #define VFIO_DEVTREE_REGION_INFO_FLAG_RANGE (1 << ?)
>>    #define VFIO_DEVTREE_REGION_INFO_FLAG_INDEX (1 << ?)
>>    #define VFIO_DEVTREE_REGION_INFO_FLAG_PATH  (1 << ?)
>>        __u32 index;   /* Region index */
>>        __u32 resv;    /* Reserved for alignment */
>>        __u64 size;    /* Region size (bytes) */
>>        __u64 offset;  /* Region offset from start of device fd */
>>    };
>>
>> VFIO_REGION_INFO_FLAG_CACHEABLE
>>    -if set, indicates that the region must be mapped as cacheable
>>
>> VFIO_DEVTREE_REGION_INFO_FLAG_REG
>>    -if set, indicates that the region corresponds to a "reg" property
>>     in the device tree representation of the device
>>
>> VFIO_DEVTREE_REGION_INFO_FLAG_RANGE
>>    -if set, indicates that the region corresponds to a "ranges" property
>>     in the device tree representation of the device
>>
>> VFIO_DEVTREE_REGION_INFO_FLAG_INDEX
>>    -if set, indicates that there is a dword-aligned
>>     struct vfio_devtree_region_info_index appended to the
>>     end of vfio_region_info:
>>
>>        struct vfio_devtree_region_info_index {
>>            u32 index;
>>        }
>>
>>     A reg or ranges property may have multiple regions. The index
>>     specifies the index within the "reg" or "ranges"
>>     that this region corresponds to.
>>
>> VFIO_DEVTREE_REGION_INFO_FLAG_PATH
>>    -if set, indicates that there is a dword-aligned
>>     struct vfio_devtree_info_path appended to the
>>     end of vfio_region_info:
>>
>>        struct vfio_devtree_info_path {
>>            u32 len;
>>            u8 path[];
>>        }
>>
>>     The path is the full path to the corresponding device
>>     tree node. The len field specifies the length of the
>>     path string.
>>
>> If multiple flags are set that indicate that there is
>> an appended struct, the order of the flags indicates
>> the order of the structs.
>>
>> argsz is set by the kernel, specifying the total size of
>> struct vfio_region_info and all appended structs.
>>
>> Suggested usage:
>>    -call VFIO_DEVICE_GET_REGION_INFO with argsz =
>>     sizeof(struct vfio_region_info)
>>    -realloc the buffer
>>    -call VFIO_DEVICE_GET_REGION_INFO again, and the appended
>>     structs will be returned
>>
>> 4. VFIO_DEVICE_GET_IRQ_INFO
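To make the appended-struct layout concrete, here is a rough user-space sketch of how the buffer returned by the second VFIO_DEVICE_GET_REGION_INFO call might be walked. It is only an illustration under several assumptions: the EX_ names, the bit positions (the RFC leaves them as "?"), the helper ex_parse_region(), and the 128-byte path limit are all made up for this example, and it assumes "len" counts the path bytes without a terminating NUL. The INDEX struct is read before the PATH struct, following the order in which the flags are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit positions: the RFC leaves them as "(1 << ?)". */
#define EX_REGION_FLAG_INDEX (1u << 6)
#define EX_REGION_FLAG_PATH  (1u << 7)

/* Fixed part of the reply, with the fields listed in the RFC. */
struct ex_region_info {
    uint32_t argsz;   /* total size, including appended structs */
    uint32_t flags;
    uint32_t index;
    uint32_t resv;
    uint64_t size;
    uint64_t offset;
};

/* Result of walking the appended structs (illustrative type). */
struct ex_parsed {
    int      has_index;
    uint32_t sub_index;   /* index within "reg" or "ranges" */
    int      has_path;
    char     path[128];   /* NUL-terminated copy of the node path */
};

/* Walk the appended structs in flag order. Returns 0 on success and
 * -1 if the buffer is too short or inconsistent with argsz. */
int ex_parse_region(const uint8_t *buf, size_t buflen, struct ex_parsed *out)
{
    struct ex_region_info info;
    size_t pos = sizeof(info);

    assert(buf != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    if (buflen < sizeof(info))
        return -1;
    memcpy(&info, buf, sizeof(info));
    if (info.argsz < sizeof(info) || info.argsz > buflen)
        return -1;

    if (info.flags & EX_REGION_FLAG_INDEX) {
        if (info.argsz - pos < sizeof(uint32_t))
            return -1;
        memcpy(&out->sub_index, buf + pos, sizeof(uint32_t));
        out->has_index = 1;
        pos += sizeof(uint32_t);   /* one u32, already dword aligned */
    }

    if (info.flags & EX_REGION_FLAG_PATH) {
        uint32_t len;

        if (info.argsz - pos < sizeof(uint32_t))
            return -1;
        memcpy(&len, buf + pos, sizeof(uint32_t));
        pos += sizeof(uint32_t);
        if (len > info.argsz - pos || len >= sizeof(out->path))
            return -1;
        memcpy(out->path, buf + pos, len);
        out->path[len] = '\0';
        out->has_path = 1;
    }
    return 0;
}
```

In the suggested usage, the first call would return only argsz, the caller would realloc to that size, and a parser along these lines would run on the result of the second call.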
Re: RFC: vfio interface for platform devices (v2)
On 04.07.2013, at 16:44, Mario Smarduch wrote:

> I'm having trouble understanding how this works where
> the Guest Device Model != Host. How do you inform the guest
> where the device is mapped in its physical address space,
> and handle GPA faults?

The same way as you would for emulated devices.

Alex
Re: RFC: vfio interface for platform devices (v2)
I'm having trouble understanding how this works where the Guest Device Model != Host. How do you inform the guest where the device is mapped in its physical address space, and handle GPA faults?

- Mario
RFC: vfio interface for platform devices (v2)
Version 2
  -VFIO_GROUP_GET_DEVICE_FD: specified that the path is a sysfs path
  -VFIO_DEVICE_GET_INFO: defined 2 flags instead of 1
  -deleted VFIO_DEVICE_GET_DEVTREE_INFO ioctl
  -VFIO_DEVICE_GET_REGION_INFO: updated as per AlexW's suggestion,
   defined 5 new flags and associated structs
  -VFIO_DEVICE_GET_IRQ_INFO: updated as per AlexW's suggestion,
   defined 1 new flag and associated struct
  -removed redundant example

VFIO for Platform Devices

The existing kernel interface for vfio-pci is pretty close to what is needed for platform devices:
  -mechanism to create a container
  -add groups/devices to a container
  -set the IOMMU model
  -map DMA regions
  -get an fd for a specific device, which allows user space to determine
   info about device regions (e.g. registers) and interrupt info
  -support for mmapping device regions
  -mechanism to set how interrupts are signaled

Many platform devices are simple and consist of a single register region and a single interrupt. For these types of devices the existing vfio interfaces should be sufficient.

However, platform devices can get complicated, logically represented as a device tree hierarchy of nodes. For devices with multiple regions and interrupts, new mechanisms are needed in vfio to correlate the regions/interrupts with the device tree structure that drivers use to determine the meaning of device resources.

In some cases there are relationships between devices, and devices reference other devices using phandle links. The kernel won't expose relationships between devices, but just exposes mappable register regions and interrupts.

The changes needed for vfio are around some of the device tree related info that needs to be available with the device fd.

1. VFIO_GROUP_GET_DEVICE_FD

User space knows by out-of-band means which device it is accessing and will call VFIO_GROUP_GET_DEVICE_FD passing a specific sysfs path to get the device information:

   fd = ioctl(group, VFIO_GROUP_GET_DEVICE_FD,
              "/sys/bus/platform/devices/ffe21.usb");

2. VFIO_DEVICE_GET_INFO

The number of regions corresponds to the regions defined in "reg" and "ranges" in the device tree.

Two new flags are added to struct vfio_device_info:

   #define VFIO_DEVICE_FLAGS_PLATFORM (1 << ?) /* A platform bus device */
   #define VFIO_DEVICE_FLAGS_DEVTREE  (1 << ?) /* device tree info available */

It is possible that there could be platform bus devices that are not in the device tree, so we use 2 flags to allow for that.

If just VFIO_DEVICE_FLAGS_PLATFORM is set, it means that there are regions and IRQs but no device tree info available.

If just VFIO_DEVICE_FLAGS_DEVTREE is set, it means there is device tree info available.

3. VFIO_DEVICE_GET_REGION_INFO

For platform devices with multiple regions, information is needed to correlate the regions with the device tree structure that drivers use to determine the meaning of device resources.

The VFIO_DEVICE_GET_REGION_INFO ioctl is extended to provide device tree information.

The following information is needed:
   -the device tree path to the node corresponding to the region
   -whether it corresponds to a "reg" or "ranges" property
   -there could be multiple sub-regions per "reg" or "ranges", and
    the sub-index within the reg/ranges is needed

There are 5 new flags added to vfio_region_info:

   struct vfio_region_info {
       __u32 argsz;
       __u32 flags;
   #define VFIO_REGION_INFO_FLAG_CACHEABLE     (1 << ?)
   #define VFIO_DEVTREE_REGION_INFO_FLAG_REG   (1 << ?)
   #define VFIO_DEVTREE_REGION_INFO_FLAG_RANGE (1 << ?)
   #define VFIO_DEVTREE_REGION_INFO_FLAG_INDEX (1 << ?)
   #define VFIO_DEVTREE_REGION_INFO_FLAG_PATH  (1 << ?)
       __u32 index;   /* Region index */
       __u32 resv;    /* Reserved for alignment */
       __u64 size;    /* Region size (bytes) */
       __u64 offset;  /* Region offset from start of device fd */
   };

VFIO_REGION_INFO_FLAG_CACHEABLE
   -if set, indicates that the region must be mapped as cacheable

VFIO_DEVTREE_REGION_INFO_FLAG_REG
   -if set, indicates that the region corresponds to a "reg" property
    in the device tree representation of the device

VFIO_DEVTREE_REGION_INFO_FLAG_RANGE
   -if set, indicates that the region corresponds to a "ranges" property
    in the device tree representation of the device

VFIO_DEVTREE_REGION_INFO_FLAG_INDEX
   -if set, indicates that there is a dword-aligned
    struct vfio_devtree_region_info_index appended to the
    end of vfio_region_info:

       struct vfio_devtree_region_info_index {
           u32 index;
       }

    A reg or ranges property may have multiple regions. The index
    specifies the index within the "reg" or "ranges" that this region
    corresponds to.
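As a small illustration of how user space might interpret the proposed region flags, the sketch below decodes them into a description of where the region came from. The EX_ macros, their bit positions, and ex_describe_region() are invented for this example (the RFC leaves the bits as "?"), and treating REG and RANGE as mutually exclusive is an assumption, not something the RFC states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions: the RFC leaves them as "(1 << ?)". */
#define EX_REGION_FLAG_CACHEABLE (1u << 3)
#define EX_REGION_FLAG_REG       (1u << 4)
#define EX_REGION_FLAG_RANGE     (1u << 5)

/* Which device tree property a region corresponds to. */
enum ex_dt_property { EX_DT_NONE, EX_DT_REG, EX_DT_RANGES };

struct ex_region_desc {
    enum ex_dt_property property;
    int cacheable;   /* nonzero: region must be mapped cacheable */
};

/* Decode the flags word of a region info reply. A region with both
 * REG and RANGE set is reported as EX_DT_NONE (assumed invalid). */
void ex_describe_region(uint32_t flags, struct ex_region_desc *out)
{
    int reg   = (flags & EX_REGION_FLAG_REG) != 0;
    int range = (flags & EX_REGION_FLAG_RANGE) != 0;

    assert(out != NULL);
    out->cacheable = (flags & EX_REGION_FLAG_CACHEABLE) != 0;

    if (reg && !range)
        out->property = EX_DT_REG;
    else if (range && !reg)
        out->property = EX_DT_RANGES;
    else
        out->property = EX_DT_NONE;
}
```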
Re: RFC: vfio interface for platform devices (v2)
On Wed, 2013-07-03 at 21:40, Yoder Stuart-B08248 wrote:
> 1. VFIO_GROUP_GET_DEVICE_FD
>
> User space knows by out-of-band means which device it is accessing
> and will call VFIO_GROUP_GET_DEVICE_FD passing a specific sysfs path
> to get the device information:
>
>    fd = ioctl(group, VFIO_GROUP_GET_DEVICE_FD,
>               "/sys/bus/platform/devices/ffe21.usb");

FWIW, I'm in favor of whichever way works out cleaner in the code for pre-pending /sys/bus or not. It sort of seems like it's unnecessary. It's also a little inconsistent that the returned path doesn't pre-pend /sys in the examples below.

> 2. VFIO_DEVICE_GET_INFO
>
> The number of regions corresponds to the regions defined
> in "reg" and "ranges" in the device tree.
>
> Two new flags are added to struct vfio_device_info:
>
>    #define VFIO_DEVICE_FLAGS_PLATFORM (1 << ?) /* A platform bus device */
>    #define VFIO_DEVICE_FLAGS_DEVTREE  (1 << ?) /* device tree info available */
>
> It is possible that there could be platform bus devices
> that are not in the device tree, so we use 2 flags to
> allow for that.
>
> If just VFIO_DEVICE_FLAGS_PLATFORM is set, it means
> that there are regions and IRQs but no device tree info
> available.
>
> If just VFIO_DEVICE_FLAGS_DEVTREE is set, it means
> there is device tree info available.

But it would be invalid to only have DEVTREE w/o PLATFORM for now, right?
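The flag combinations under discussion can be written down as a small check. This is a sketch only: the EX_ names, the bit positions, and ex_classify_device() are made up for illustration (the RFC gives the bits as "?"). It follows the reading that Stuart confirms in his reply elsewhere in this thread, namely that DEVTREE is never set without PLATFORM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions: the RFC leaves them as "(1 << ?)". */
#define EX_DEVICE_FLAGS_PLATFORM (1u << 2)
#define EX_DEVICE_FLAGS_DEVTREE  (1u << 3)

enum ex_dev_kind {
    EX_DEV_NOT_PLATFORM,          /* neither flag set */
    EX_DEV_PLATFORM_NO_DEVTREE,   /* regions and IRQs only */
    EX_DEV_PLATFORM_WITH_DEVTREE  /* device tree info can be queried */
};

/* Classify a device from the flags in its device info reply.
 * Returns 0 and stores the kind, or -1 when DEVTREE is set
 * without PLATFORM, which this sketch treats as invalid. */
int ex_classify_device(uint32_t flags, enum ex_dev_kind *kind)
{
    int platform = (flags & EX_DEVICE_FLAGS_PLATFORM) != 0;
    int devtree  = (flags & EX_DEVICE_FLAGS_DEVTREE) != 0;

    assert(kind != NULL);
    if (devtree && !platform)
        return -1;

    if (platform && devtree)
        *kind = EX_DEV_PLATFORM_WITH_DEVTREE;
    else if (platform)
        *kind = EX_DEV_PLATFORM_NO_DEVTREE;
    else
        *kind = EX_DEV_NOT_PLATFORM;
    return 0;
}
```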
Re: RFC: vfio interface for platform devices (v2)
On 07/03/2013 05:53:09 PM, Alex Williamson wrote:
> Seems like it should work. My only API concern with this model of
> appending structs is that a user needs to know the size of each struct
> even if they don't otherwise care about it in order to step over it.

In that case, it might be better to make the struct grow linearly rather than with options, and just have a version number on the struct indicating how far the caller thinks the struct has grown. The kernel could respond back with a lower version to reflect that it only filled in the fields it knows about. Flags could still be used to indicate which portions of the struct are relevant, but not the physical layout of the struct.

> In some cases, like the path, the size is variable and the user needs
> to look into it.

For things like path, maybe the caller should just pass in a string buffer that is separate from the struct buffer?

-Scott
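One way to picture the versioned, linearly growing struct is the sketch below. Everything in it is hypothetical: the struct ex_region_info_v layout, the dt_index field, the EX_FLAG_DT_INDEX bit, and ex_fill_region_info(), which stands in for the kernel side so the negotiation can be shown in plain user-space C. The caller sets version to the newest layout it understands, and the filler lowers it to the newest layout it actually populated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_FLAG_DT_INDEX (1u << 0)   /* dt_index field is meaningful */

/* Fields are only ever appended; "version" says how far the layout
 * extends. Version 1 ends at "offset", version 2 adds dt_index. */
struct ex_region_info_v {
    uint32_t version;   /* in: caller's newest layout; out: layout filled */
    uint32_t flags;
    uint64_t size;
    uint64_t offset;
    uint32_t dt_index;  /* added in version 2 */
    uint32_t resv;      /* added in version 2, keeps 8-byte alignment */
};

/* Simulated kernel side: fill only the fields that both the caller
 * and a kernel supporting "kernel_version" know about. Returns -1
 * if either side reports version 0. The size value is a placeholder. */
int ex_fill_region_info(struct ex_region_info_v *info, uint32_t kernel_version)
{
    uint32_t v;

    assert(info != NULL);
    v = info->version < kernel_version ? info->version : kernel_version;
    if (v == 0)
        return -1;

    info->flags  = 0;
    info->size   = 0x1000;
    info->offset = 0;

    if (v >= 2) {
        info->dt_index = 1;
        info->flags |= EX_FLAG_DT_INDEX;
    }

    info->version = v;   /* tell the caller how much was filled in */
    return 0;
}
```

With this shape a caller never has to step over structs it does not understand: fixed offsets are known from the version, and flags only say which fields carry meaning. A variable-length item such as the path would then travel in a separate caller-supplied buffer, as suggested above.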