Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's whewre these sorts of layout decisions belong. can the xl tools read pci conf space ? Yes, via sysfs (possibly abstracted via libpci) . Just like lspci and friends do. Using some xen hypercall or a xl-dom0 ioctl ? No, using normal pre-existing Linux functionality. If not then there is no otherway but xenpciback Also I need to introduce a hypercall which would tell toolkit the available holes for virtualBAR mapping. Much simpler is let xen allocate a virtualBAR and return to the caller. At init - sure. But when the guest is running and doing those sort of things. Unless you want guest - pciback - xenstore - libxl - hypercall - send ack on xenstore - pciback - guest. That would entail adding some pcibkack - user-space tickle mechanism and another back. 
Much simpler to do all of this in xenpciback I think? I agree. If xenpciback sends a hypercall whenever a BAR read access occurs, the mapping in Xen would already have been done, so Xen would simply be doing a PA-to-IPA lookup. No xenstore lookup is required. The xenstore read would happen once on device attach, at the same time you are reading the rest of the dev-NNN stuff relating to the just-attached device. Doing a xenstore transaction on every BAR read would indeed be silly, and doing a hypercall would not be much better. There is no need for either a xenstore read or a hypercall during the cfg space access itself; you just read the value from a pciback data structure. Add to that the fact that any new hypercall made from dom0 needs to be added as a stable interface, and I can't see any reason to go with such a model. Ian.
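For reference, a minimal sketch of how the toolstack side could read the physical BAR ranges from Linux sysfs (the same information lspci uses), with no new hypercall needed; the BDF string handling and error checking are simplified and the function name is only illustrative:

/* Read the physical BAR ranges of a device from the standard Linux sysfs
 * "resource" file, which contains one "start end flags" line per BAR/ROM. */
#include <stdio.h>
#include <inttypes.h>

static int read_physical_bars(const char *bdf)   /* e.g. "0000:01:00.0" */
{
    char path[128];
    FILE *f;
    uint64_t start, end, flags;
    int bar = 0;

    snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/resource", bdf);
    f = fopen(path, "r");
    if (!f)
        return -1;

    while (fscanf(f, "%" SCNx64 " %" SCNx64 " %" SCNx64,
                  &start, &end, &flags) == 3) {
        if (start || end)   /* unused BARs read back as all zeroes */
            printf("BAR%d: phys %#" PRIx64 "-%#" PRIx64 " flags %#" PRIx64 "\n",
                   bar, start, end, flags);
        bar++;
    }
    fclose(f);
    return 0;
}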
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Friday 26 June 2015 01:02 PM, Ian Campbell wrote: On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's whewre these sorts of layout decisions belong. can the xl tools read pci conf space ? Yes, via sysfs (possibly abstracted via libpci) . Just like lspci and friends do. Using some xen hypercall or a xl-dom0 ioctl ? No, using normal pre-existing Linux functionality. If not then there is no otherway but xenpciback Also I need to introduce a hypercall which would tell toolkit the available holes for virtualBAR mapping. Much simpler is let xen allocate a virtualBAR and return to the caller. At init - sure. But when the guest is running and doing those sort of things. Unless you want guest - pciback - xenstore - libxl - hypercall - send ack on xenstore - pciback - guest. 
That would entail adding some pciback-to-user-space tickle mechanism and another one back. Much simpler to do all of this in xenpciback I think? I agree. If xenpciback sends a hypercall whenever a BAR read access occurs, the mapping in Xen would already have been done, so Xen would simply be doing a PA-to-IPA lookup. No xenstore lookup is required. The xenstore read would happen once on device attach, at the same time you are reading the rest of the dev-NNN stuff relating to the just-attached device. Doing a xenstore transaction on every BAR read would indeed be silly, and doing a hypercall would not be much better. There is no need for either a xenstore read or a hypercall during the cfg space access itself; you just read the value from a pciback data structure. Add to that the fact that any new hypercall made from dom0 needs to be added as a stable interface, and I can't see any reason to go with such a model. I think you are overlooking a point, which is: from what region should the virtual BAR be allocated? One way is for Xen to keep a hole for domains where the BAR regions can be mapped. This is not there as of now. How would the tools know about this hole? Is a domctl required? For this reason I was suggesting a hypercall to Xen to map the physical BARs and return the virtual BARs. Ian.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Fri, Jun 26, 2015 at 02:47:38PM +0100, Ian Campbell wrote: On Fri, 2015-06-26 at 09:28 -0400, Konrad Rzeszutek Wilk wrote: I would advocate a more dynamic idea of GUEST_MMIO_* so that the toolstack can allocate those with headroom. Perhaps stash it at the far, far end of the guest-accessible memory? But that would require a 64-bit capable OS, which might not work for all use-cases (where you want a 32-bit OS). I'd be inclined to define two areas, one under 4GB but limited in size and the other above, and to decide which to use based on $factors-tbd +1
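A minimal sketch of what the two-area idea could look like as guest layout constants in the style of xen/include/public/arch-arm.h; the names and values below are purely hypothetical, not an agreed layout:

/* Hypothetical example only: one limited PCI MMIO window below 4GB for
 * 32-bit capable guests, and one large window above 4GB with headroom. */
#define GUEST_PCI_MMIO32_BASE   0x30000000ULL      /* below 4GB, limited size */
#define GUEST_PCI_MMIO32_SIZE   0x10000000ULL      /* 256MB */

#define GUEST_PCI_MMIO64_BASE   0x4000000000ULL    /* above 4GB */
#define GUEST_PCI_MMIO64_SIZE   0x4000000000ULL    /* 256GB of headroom */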
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Fri, 2015-06-26 at 14:20 +0530, Manish Jaggi wrote: On Friday 26 June 2015 01:02 PM, Ian Campbell wrote: On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's whewre these sorts of layout decisions belong. can the xl tools read pci conf space ? Yes, via sysfs (possibly abstracted via libpci) . Just like lspci and friends do. Using some xen hypercall or a xl-dom0 ioctl ? No, using normal pre-existing Linux functionality. If not then there is no otherway but xenpciback Also I need to introduce a hypercall which would tell toolkit the available holes for virtualBAR mapping. Much simpler is let xen allocate a virtualBAR and return to the caller. At init - sure. But when the guest is running and doing those sort of things. Unless you want guest - pciback - xenstore - libxl - hypercall - send ack on xenstore - pciback - guest. 
That would entail adding some pciback-to-user-space tickle mechanism and another one back. Much simpler to do all of this in xenpciback I think? I agree. If xenpciback sends a hypercall whenever a BAR read access occurs, the mapping in Xen would already have been done, so Xen would simply be doing a PA-to-IPA lookup. No xenstore lookup is required. The xenstore read would happen once on device attach, at the same time you are reading the rest of the dev-NNN stuff relating to the just-attached device. Doing a xenstore transaction on every BAR read would indeed be silly, and doing a hypercall would not be much better. There is no need for either a xenstore read or a hypercall during the cfg space access itself; you just read the value from a pciback data structure. Add to that the fact that any new hypercall made from dom0 needs to be added as a stable interface, and I can't see any reason to go with such a model. I think you are overlooking a point, which is: from what region should the virtual BAR be allocated? One way is for Xen to keep a hole for domains where the BAR regions can be mapped. This is not there as of now. How would the tools know about this hole? I think you've overlooked the point that _only_ the tools know enough about the overall guest address space layout to know about this hole.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Fri, 2015-06-26 at 09:28 -0400, Konrad Rzeszutek Wilk wrote: I would advocate a more dynamic idea of GUEST_MMIO_* so that the toolstack can allocate those with headroom. Perhaps stash it at the far, far end of the guest-accessible memory? But that would require a 64-bit capable OS, which might not work for all use-cases (where you want a 32-bit OS). I'd be inclined to define two areas, one under 4GB but limited in size and the other above, and to decide which to use based on $factors-tbd Ian.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Friday 26 June 2015 02:39 PM, Ian Campbell wrote: On Fri, 2015-06-26 at 14:20 +0530, Manish Jaggi wrote: On Friday 26 June 2015 01:02 PM, Ian Campbell wrote: On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's whewre these sorts of layout decisions belong. can the xl tools read pci conf space ? Yes, via sysfs (possibly abstracted via libpci) . Just like lspci and friends do. Will implement that. Using some xen hypercall or a xl-dom0 ioctl ? No, using normal pre-existing Linux functionality. If not then there is no otherway but xenpciback Also I need to introduce a hypercall which would tell toolkit the available holes for virtualBAR mapping. Much simpler is let xen allocate a virtualBAR and return to the caller. At init - sure. But when the guest is running and doing those sort of things. 
Unless you want guest - pciback - xenstore - libxl - hypercall - send ack on xenstore - pciback - guest. That would entail adding some pciback-to-user-space tickle mechanism and another one back. Much simpler to do all of this in xenpciback I think? I agree. If xenpciback sends a hypercall whenever a BAR read access occurs, the mapping in Xen would already have been done, so Xen would simply be doing a PA-to-IPA lookup. No xenstore lookup is required. The xenstore read would happen once on device attach, at the same time you are reading the rest of the dev-NNN stuff relating to the just-attached device. Doing a xenstore transaction on every BAR read would indeed be silly, and doing a hypercall would not be much better. There is no need for either a xenstore read or a hypercall during the cfg space access itself; you just read the value from a pciback data structure. Add to that the fact that any new hypercall made from dom0 needs to be added as a stable interface, and I can't see any reason to go with such a model. I think you are overlooking a point, which is: from what region should the virtual BAR be allocated? One way is for Xen to keep a hole for domains where the BAR regions can be mapped. This is not there as of now. How would the tools know about this hole? I think you've overlooked the point that _only_ the tools know enough about the overall guest address space layout to know about this hole. Xen has no need to know anything
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Fri, 2015-06-26 at 17:27 +0530, Manish Jaggi wrote: I still need to implement a domctl xc_domain_query_memory_mapping. If I reserve a memory region of GUEST_MMIO_HOLE_SIZE from GUEST_MMIO_HOLE_ADDR in arch-arm.h, the xl tools need some way to query which range is allocated and which is free in this hole. If it needs this info after the fact then it should stash it somewhere itself, no need for a hypercall to get it back from Xen. You may even be able to use the pciback xenstore nodes which you'll have added anyway, otherwise something specific under the /libxl namespace would seem like the way to go. Ian.
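A sketch of the "stash it somewhere itself" approach: the toolstack hands out virtual BAR space from the reserved hole with a trivial allocator and records each assignment (for example in the per-device xenstore nodes). The GUEST_MMIO_HOLE_ADDR/SIZE values and all names here are hypothetical:

#include <stdint.h>

#define GUEST_MMIO_HOLE_ADDR  0x10000000ULL   /* hypothetical value */
#define GUEST_MMIO_HOLE_SIZE  0x30000000ULL   /* hypothetical value */

struct vbar_allocator {
    uint64_t next;                /* next free guest physical address */
};

static void vbar_allocator_init(struct vbar_allocator *a)
{
    a->next = GUEST_MMIO_HOLE_ADDR;
}

/* BARs are power-of-two sized and naturally aligned, so align up to 'size'
 * (size must be a non-zero power of two). Returns 0 if the hole is full. */
static uint64_t vbar_alloc(struct vbar_allocator *a, uint64_t size)
{
    uint64_t addr = (a->next + size - 1) & ~(size - 1);

    if (addr + size > GUEST_MMIO_HOLE_ADDR + GUEST_MMIO_HOLE_SIZE)
        return 0;
    a->next = addr + size;
    return addr;                  /* becomes the virtual BAR address */
}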
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Fri, Jun 26, 2015 at 01:41:08PM +0100, Ian Campbell wrote: On Fri, 2015-06-26 at 17:27 +0530, Manish Jaggi wrote: I still need to implement a domctl xc_domain_query_memory_mapping. If I reserve a memory region of GUEST_MMIO_HOLE_SIZE from GUEST_MMIO_HOLE_ADDR in arch-arm.h, the xl tools need some way to query which range is allocated and which is free in this hole. If it needs this info after the fact then it should stash it somewhere itself, no need for a hypercall to get it back from Xen. You may even be able to use the pciback xenstore nodes which you'll have added anyway, otherwise something specific under the /libxl namespace would seem like the way to go. Xenstore keys would be the easiest in Xen pciback. Keep also in mind that PCI devices' MMIO regions can be huge. If you, say, pass in 64 NICs, each with a 128MB MMIO region, you get quite a large MMIO area (8GB in that example). I would advocate a more dynamic idea of GUEST_MMIO_* so that the toolstack can allocate those with headroom. Perhaps stash it at the far, far end of the guest-accessible memory? But that would require a 64-bit capable OS, which might not work for all use-cases (where you want a 32-bit OS). Ian.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. Moreover a pci driver would read BARs only once. c) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it As Julien has noted pciback already deals with this correctly, because sizing a BAR involves a write, it implementes a scheme which allows either the hardcoded virtual BAR to be written or all 1s (needed for size detection). d) when domU queries BAR address from pci-back the virtual BAR address is provided. Option2: b) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it c) when domU queries BAR address from pci-back the virtual BAR address is provided. d) domU sends a hypercall to map virtual BARs, e) xen pci code reads the BAR and maps { virtual BAR : physical BAR } for domU Which option is better I think Ian is for (2) and Stefano may be (1) In fact I'm now (after Julien pointed out the current behaviour of pciback) in favour of (1), although I'm not sure if Stefano is too. (I was never in favour of (2), FWIW, I previously was in favour of (3) which is like (2) except pciback makes the hypervcall to map the virtual bars to the guest, I'd still favour that over (2) but (1) is now my preference) Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
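A simplified sketch of the sizing scheme referred to above (modelled loosely on the behaviour of drivers/xen/xen-pciback/conf_space_header.c, not a copy of it): only the all-ones write used for size probing is honoured, and any other write is dropped so the fixed virtual BAR chosen by the toolstack stays in place. All names are illustrative:

#include <stdint.h>

struct vbar_reg {
    uint32_t vaddr;     /* virtual BAR value chosen by the toolstack */
    uint32_t size_mask; /* ~(size - 1) plus flag bits, as hardware returns */
    int      sizing;    /* guest is in the middle of a size probe */
};

static void vbar_cfg_write(struct vbar_reg *r, uint32_t val)
{
    if (val == 0xffffffff) {
        r->sizing = 1;          /* next read must return the size mask */
        return;
    }
    r->sizing = 0;              /* anything else: keep the fixed virtual BAR */
}

static uint32_t vbar_cfg_read(const struct vbar_reg *r)
{
    return r->sizing ? r->size_mask : r->vaddr;
}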
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. Xen would return the IPA for pci-back to return to the request to domU. Moreover a pci driver would read BARs only once. You can't assume that though, a driver can do whatever it likes, or the module might be unloaded and reloaded in the guest etc etc. Are you going to send out a second draft based on the discussion so far? yes, I was working on that only. I was traveling this week 24 hour flights jetlag... Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's whewre these sorts of layout decisions belong. Xen would return the IPA for pci-back to return to the request to domU. Moreover a pci driver would read BARs only once. You can't assume that though, a driver can do whatever it likes, or the module might be unloaded and reloaded in the guest etc etc. Are you going to send out a second draft based on the discussion so far? yes, I was working on that only. I was traveling this week 24 hour flights jetlag... Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's whewre these sorts of layout decisions belong. At init - sure. But when the guest is running and doing those sort of things. Unless you want guest - pciback - xenstore - libxl - hypercall - send ack on xenstore - pciback - guest. That would entail adding some pcibkack - user-space tickle mechanism and another back. Much simpler to do all of this in xenpciback I think? Xen would return the IPA for pci-back to return to the request to domU. Moreover a pci driver would read BARs only once. You can't assume that though, a driver can do whatever it likes, or the module might be unloaded and reloaded in the guest etc etc. Are you going to send out a second draft based on the discussion so far? yes, I was working on that only. I was traveling this week 24 hour flights jetlag... Ian. 
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's whewre these sorts of layout decisions belong. can the xl tools read pci conf space ? Using some xen hypercall or a xl-dom0 ioctl ? If not then there is no otherway but xenpciback Also I need to introduce a hypercall which would tell toolkit the available holes for virtualBAR mapping. Much simpler is let xen allocate a virtualBAR and return to the caller. At init - sure. But when the guest is running and doing those sort of things. Unless you want guest - pciback - xenstore - libxl - hypercall - send ack on xenstore - pciback - guest. That would entail adding some pcibkack - user-space tickle mechanism and another back. Much simpler to do all of this in xenpciback I think? I agree. 
If xenpciback sends a hypercall whenever a BAR read access occurs, the mapping in Xen would already have been done, so Xen would simply be doing a PA-to-IPA lookup. No xenstore lookup is required. Xen would return the IPA for pci-back to return in its response to domU. Moreover a PCI driver would read BARs only once. You can't assume that though, a driver can do whatever it likes, or the module might be unloaded and reloaded in the guest etc. Are you going to send out a second draft based on the discussion so far? Yes, I was working on that only. I was traveling this week, 24-hour flights, jetlag... Ian.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, pciback driver in dom0 can send the hypercall by to map the physical bar to virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Moreover a pci driver would read BARs only once. You can't assume that though, a driver can do whatever it likes, or the module might be unloaded and reloaded in the guest etc etc. Are you going to send out a second draft based on the discussion so far? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Tue, Jun 23, 2015 at 09:44:31AM +0100, Ian Campbell wrote: On Mon, 2015-06-22 at 14:33 -0400, Konrad Rzeszutek Wilk wrote: On Wed, Jun 17, 2015 at 03:35:02PM +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU c) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it As Julien has noted pciback already deals with this correctly, because sizing a BAR involves a write, it implementes a scheme which allows either the hardcoded virtual BAR to be written or all 1s (needed for size detection). d) when domU queries BAR address from pci-back the virtual BAR address is provided. Option2: b) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it c) when domU queries BAR address from pci-back the virtual BAR address is provided. d) domU sends a hypercall to map virtual BARs, e) xen pci code reads the BAR and maps { virtual BAR : physical BAR } for domU Which option is better I think Ian is for (2) and Stefano may be (1) In fact I'm now (after Julien pointed out the current behaviour of pciback) in favour of (1), although I'm not sure if Stefano is too. (I was never in favour of (2), FWIW, I previously was in favour of (3) which is like (2) except pciback makes the hypervcall to map the virtual bars to the guest, I'd still favour that over (2) but (1) is now my preference) OK, let's go with (1). Right, and as the maintainer of pciback that means I don't have to do anything right :-) I _think_ there will need to be some mechanism for the toolstack to inform pciback what the virtual BARs should contain, so it can correctly process read requests. But other than review that (hopefully) small bit of code, I think there is nothing for you to do. XenBus is the way to go for that. Ian. 
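One possible shape for those XenBus nodes, alongside the existing pciback backend entries; the vbar-* key names below are hypothetical and only illustrate the per-device BAR address/size information the toolstack would publish for pciback to serve on reads:

/local/domain/0/backend/pci/<domid>/0/
    num_devs      = "1"               (existing pciback node)
    dev-0         = "0000:01:00.0"    (existing pciback node)
    vbar-0-0      = "0x10000000"      (hypothetical: virtual BAR0 address chosen by the toolstack)
    vbar-0-0-size = "0x4000"          (hypothetical: BAR0 size)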
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Mon, 2015-06-22 at 14:33 -0400, Konrad Rzeszutek Wilk wrote: On Wed, Jun 17, 2015 at 03:35:02PM +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU c) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it As Julien has noted pciback already deals with this correctly, because sizing a BAR involves a write, it implementes a scheme which allows either the hardcoded virtual BAR to be written or all 1s (needed for size detection). d) when domU queries BAR address from pci-back the virtual BAR address is provided. Option2: b) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it c) when domU queries BAR address from pci-back the virtual BAR address is provided. d) domU sends a hypercall to map virtual BARs, e) xen pci code reads the BAR and maps { virtual BAR : physical BAR } for domU Which option is better I think Ian is for (2) and Stefano may be (1) In fact I'm now (after Julien pointed out the current behaviour of pciback) in favour of (1), although I'm not sure if Stefano is too. (I was never in favour of (2), FWIW, I previously was in favour of (3) which is like (2) except pciback makes the hypervcall to map the virtual bars to the guest, I'd still favour that over (2) but (1) is now my preference) OK, let's go with (1). Right, and as the maintainer of pciback that means I don't have to do anything right :-) I _think_ there will need to be some mechanism for the toolstack to inform pciback what the virtual BARs should contain, so it can correctly process read requests. But other than review that (hopefully) small bit of code, I think there is nothing for you to do. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, Jun 17, 2015 at 03:35:02PM +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU c) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it As Julien has noted pciback already deals with this correctly, because sizing a BAR involves a write, it implementes a scheme which allows either the hardcoded virtual BAR to be written or all 1s (needed for size detection). d) when domU queries BAR address from pci-back the virtual BAR address is provided. Option2: b) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it c) when domU queries BAR address from pci-back the virtual BAR address is provided. d) domU sends a hypercall to map virtual BARs, e) xen pci code reads the BAR and maps { virtual BAR : physical BAR } for domU Which option is better I think Ian is for (2) and Stefano may be (1) In fact I'm now (after Julien pointed out the current behaviour of pciback) in favour of (1), although I'm not sure if Stefano is too. (I was never in favour of (2), FWIW, I previously was in favour of (3) which is like (2) except pciback makes the hypervcall to map the virtual bars to the guest, I'd still favour that over (2) but (1) is now my preference) OK, let's go with (1). Right, and as the maintainer of pciback that means I don't have to do anything right :-) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, Jun 17, 2015 at 03:26:10PM +0100, Ian Campbell wrote: On Wed, 2015-06-17 at 15:18 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 14:40 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 13:53 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. Why would it be a security risk? No idea, it seemed like a vague possibility. I did say (security?) not SECURITY ;-) I don't think it is important that the interface between pcifront and pciback refrects the hardware interface. But it is important that the interface is documented and complies to the expected behaviour. I think that if we go with XENMEM_add_to_physmap_range, it would be OK for pcifront to both write to the vBAR (actually writing to the ring to pciback), then call XENMEM_add_to_physmap_range. On the other hand, if we go with XEN_DOMCTL_memory_mapping, I would make the virtual BAR read-only, that is less preferable but still acceptable, as long as we document it. Given that this appears to be how pv pci works today I think this turns out to be preferable and saves us having to argue the toss about who should do the p2m updates on a BAR update. Which would point towards the toolstack doing the layout and telling pciback somehow what the bars for each device are and that's it. (making most of the rest of your reply moot, if you think its really important to let guests move BARs about I can go through it again) I think is unimportant, in fact having the guest setup the mappings was my original suggestion. 
I can't reconcile the two halves of this statement; are you saying you are still in favour of the guest dealing with the mappings or not? Sorry! I do not think it is important to let guests move BARs. In fact having the toolstack do the layout was my original suggestion and I still think it can work correctly. Ah, so s/guest/toolstack/ on the statement above makes it consistent, phew! I don't have a strong feeling on whether it should be the toolstack or the guest doing the mapping. It can work either way. I would certainly avoid having Dom0 do the mapping. I think it should be the toolstack. Allowing the guest to do it means implementing writeable BARs, which is a much bigger yak to shave, and it doesn't seem to be necessary. True. I skipped
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-17 at 13:11 +0100, Julien Grall wrote: Hi Ian, On 17/06/15 11:08, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. By default, the OS doesn't setup the BARs, it expects the firmware to do it. _By_default_, yes. But it can decide to rewrite the BARs if it wants, Linux for example has command line options to do so. Maybe we have the option to exclude this possibility. It would certainly simplify things. (Which makes most of the rest of your reply moot). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Hi Ian, On 17/06/15 11:08, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. By default, the OS doesn't setup the BARs, it expects the firmware to do it. If I'm not mistaken the PV guest is using the same memory layout as the host, so the BAR is mapped 1:1 in the guest. The interesting part is bar_* in drivers/xen/xenpci-back/config_space_header.c So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). The guest has to write in the BAR in order to get the size of the BAR [1]. So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. I don't see any possible security issue to let the guest map the BAR itself. If the guest doesn't have the right to some physical region it won't be able to map it (see iomem permission). Actually, we already do similar things for interrupt. So I think updates to the BAR need to be reflected in the p2m too by whichever entity is emulating those reads/writes. QEMU (or another daemon) is not really an option IMHO. Well, by default the OS doesn't setup the BAR and expect the firmware to initialize the BAR. In our case, there is no firmware and Xen will act as it. It would be OK, to do it in the toolstack at boot time. Which leaves us with either Xen doing trap and emulate or pciback dealing with this via the pciif.h cfg space access stuff. I've a slight preference for the latter since I think both ARM and x86 PVH are planning to use pcifront/pciback already. pcifront/pciback is already emulate the access to the config space for us... It would be mad to use Xen for that as we would have to reinvent our own config space access way (ARM doesn't have ioport, hence cf8 is not present). 
Which leaves us with a requirement for the backend to be able to update the p2m, which in turn leads to a need for an interface available to the dom0 kernel (which the existing domctl is not). There will also need to be a mechanism to expose a suitable MMIO hole to pcifront. On x86 this would be via the machine memory map, I think, but on ARM we may need something else, perhaps a xenstore key associated with the pci bus entry in xenstore? That's for guests; for dom0, Xen also needs to be aware of changes to the BAR registers of the real physical devices since it needs to know the real hardware values in order to point guest p2m mappings at them. I don't understand why, in the dom0 case, Xen needs to be aware of changes to the physical BARs. pciback is already able to deal with the value of the physical BAR; you have nothing else to do (see drivers/xen/xen-pciback/conf_space_header.c). [1] http://wiki.osdev.org/PCI#Base_Address_Registers Regards, -- Julien Grall
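For readers unfamiliar with the sizing dance referred to in [1]: a BAR is sized by writing all 1s to it, reading back the resulting mask, and restoring the original value. A minimal sketch, assuming hypothetical pci_conf_read32()/pci_conf_write32() helpers for 32-bit config space accesses rather than any particular Xen interface:

    /* Illustrative only: the classic BAR sizing sequence from [1]. */
    #include <stdint.h>

    extern uint32_t pci_conf_read32(unsigned int offset);            /* hypothetical */
    extern void pci_conf_write32(unsigned int offset, uint32_t val); /* hypothetical */

    static uint64_t probe_bar_size(unsigned int bar_offset)
    {
        uint32_t orig, mask;

        orig = pci_conf_read32(bar_offset);         /* save the current BAR    */
        pci_conf_write32(bar_offset, 0xffffffff);   /* write all 1s to size it */
        mask = pci_conf_read32(bar_offset);         /* read back the size mask */
        pci_conf_write32(bar_offset, orig);         /* restore the original    */

        mask &= ~0xfu;                   /* memory BAR: drop the low flag bits */
        return (uint64_t)(~mask) + 1;    /* size is the two's complement       */
    }

This is why a purely read-only virtual BAR cannot work: the sizing write has to be observed (or at least tolerated) by whichever entity emulates the register.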
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. So I think updates to the BAR need to be reflected in the p2m too by whichever entity is emulating those reads/writes. QEMU (or another daemon) is not really an option IMHO. Which leaves us with either Xen doing trap and emulate or pciback dealing with this via the pciif.h cfg space access stuff. I've a slight preference for the latter since I think both ARM and x86 PVH are planning to use pcifront/pciback already. Which leaves us with a requirement for the backend to be able to update the p2m, which in turn leads to a need for an interface available to the dom0 kernel (which the existing domctl is not). There will also need to be a mechanism to expose a suitable MMIO hole to pcifront. On x86 this would be via the machine memory map, I think, but on ARM we may need something else, perhaps a xenstore key associated with the pci bus entry in xenstore? That's for guests, for dom0 Xen also need to be aware of changes to the BAR registers of the real physical devices since it needs to know the real hardware values in order to point guest p2m mappings at them. Wow, this is far more complicated than I imagined :-( Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
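For concreteness, the toolstack-driven variant being discussed maps onto existing libxc calls roughly as follows. This is only a sketch; map_bar_to_guest() and its gfn/mfn parameters are illustrative names chosen here, not an existing libxl interface, and the virtual BAR placement is assumed to come from the toolstack's own layout logic:

    /* Sketch: grant the domain access to the physical BAR frames, then
     * (for an auto-translated guest) install the stage-2 mapping at the
     * virtual BAR chosen by the toolstack. */
    #include <xenctrl.h>

    static int map_bar_to_guest(xc_interface *xch, uint32_t domid,
                                unsigned long bar_mfn,   /* physical BAR >> PAGE_SHIFT */
                                unsigned long vbar_gfn,  /* virtual BAR  >> PAGE_SHIFT */
                                unsigned long nr_frames)
    {
        /* Allow the domain to access the MMIO frames at all (also needed for PV). */
        int rc = xc_domain_iomem_permission(xch, domid, bar_mfn, nr_frames, 1);
        if (rc)
            return rc;

        /* For ARM and x86 PVH the p2m is managed by Xen, so also create the
         * stage-2 mapping vbar_gfn -> bar_mfn. */
        return xc_domain_memory_mapping(xch, domid, vbar_gfn, bar_mfn,
                                        nr_frames, DPCI_ADD_MAPPING);
    }

The open question in the thread is exactly the second step: XEN_DOMCTL_memory_mapping is a domctl, so it is available to the toolstack but not (as-is) to pciback in the dom0 kernel.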
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 17 Jun 2015, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. Why would it be a security risk? I don't think it is important that the interface between pcifront and pciback refrects the hardware interface. But it is important that the interface is documented and complies to the expected behaviour. I think that if we go with XENMEM_add_to_physmap_range, it would be OK for pcifront to both write to the vBAR (actually writing to the ring to pciback), then call XENMEM_add_to_physmap_range. On the other hand, if we go with XEN_DOMCTL_memory_mapping, I would make the virtual BAR read-only, that is less preferable but still acceptable, as long as we document it. So I think updates to the BAR need to be reflected in the p2m too by whichever entity is emulating those reads/writes. I wouldn't make this a requirement. QEMU (or another daemon) is not really an option IMHO. I agree! Which leaves us with either Xen doing trap and emulate or pciback dealing with this via the pciif.h cfg space access stuff. I've a slight preference for the latter since I think both ARM and x86 PVH are planning to use pcifront/pciback already. I also agree that between these two options, pciback is better. But XENMEM_add_to_physmap_range is still the simplest. Which leaves us with a requirement for the backend to be able to update the p2m, which in turn leads to a need for an interface available to the dom0 kernel (which the existing domctl is not). There will also need to be a mechanism to expose a suitable MMIO hole to pcifront. 
On x86 this would be via the machine memory map, I think, but on ARM we may need something else, perhaps a xenstore key associated with the pci bus entry in xenstore? Can't we just let the guest kernel find an address range large enough that is not normal memory? That's for guests; for dom0, Xen also needs to be aware of changes to the BAR registers of the real physical devices since it needs to know the real hardware values in order to point guest p2m mappings at them. I think that in the Dom0 case, we want to map them 1:1.
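A sketch of what the guest-driven option would look like from a Linux guest, assuming a new XENMAPSPACE-style space for device MMIO (XENMAPSPACE_dev_mmio is used purely as a placeholder name for the space proposed elsewhere in this thread). The guest picks the gpfns from a hole in its own memory map, as suggested above; error handling is elided:

    /* Guest-side sketch of the "guest requests the mapping" option: map
     * nr_frames of device MMIO (idxs = machine frame numbers) at guest
     * frame numbers chosen by the guest. */
    #include <xen/interface/memory.h>
    #include <asm/xen/hypercall.h>

    static int map_mmio_frames(xen_ulong_t *mfns, xen_pfn_t *gpfns,
                               int *errs, uint16_t nr_frames)
    {
        struct xen_add_to_physmap_range xatpr = {
            .domid = DOMID_SELF,
            .space = XENMAPSPACE_dev_mmio,   /* placeholder name, see above */
            .size  = nr_frames,
        };

        set_xen_guest_handle(xatpr.idxs, mfns);    /* physical BAR frames        */
        set_xen_guest_handle(xatpr.gpfns, gpfns);  /* where the guest wants them */
        set_xen_guest_handle(xatpr.errs, errs);

        return HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatpr);
    }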
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 13:53 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. Why would it be a security risk? No idea, it seemed like a vague possibility. I did say (security?) not SECURITY ;-) I don't think it is important that the interface between pcifront and pciback refrects the hardware interface. But it is important that the interface is documented and complies to the expected behaviour. I think that if we go with XENMEM_add_to_physmap_range, it would be OK for pcifront to both write to the vBAR (actually writing to the ring to pciback), then call XENMEM_add_to_physmap_range. On the other hand, if we go with XEN_DOMCTL_memory_mapping, I would make the virtual BAR read-only, that is less preferable but still acceptable, as long as we document it. Given that this appears to be how pv pci works today I think this turns out to be preferable and saves us having to argue the toss about who should do the p2m updates on a BAR update. Which would point towards the toolstack doing the layout and telling pciback somehow what the bars for each device are and that's it. (making most of the rest of your reply moot, if you think its really important to let guests move BARs about I can go through it again) I think is unimportant, in fact having the guest setup the mappings was my original suggestion. That's for guests, for dom0 Xen also need to be aware of changes to the BAR registers of the real physical devices since it needs to know the real hardware values in order to point guest p2m mappings at them. I think that in the Dom0 case, we want to map them 1:1. 
What I was getting at was that if dom0 can write the BARs of a device then maybe the 1:1 mapping in the p2m would need to change, but given we map the whole IO window to dom0 maybe not. I was also wondering if maybe Xen might need to know the BARs itself (e.g. for updating domU p2m somehow, or as part of an assignment call using SBDF not addresses), but I think that's probably not the case. Right, I think it would be best to avoid it.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-17 at 13:53 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. Why would it be a security risk? No idea, it seemed like a vague possibility. I did say (security?) not SECURITY ;-) I don't think it is important that the interface between pcifront and pciback refrects the hardware interface. But it is important that the interface is documented and complies to the expected behaviour. I think that if we go with XENMEM_add_to_physmap_range, it would be OK for pcifront to both write to the vBAR (actually writing to the ring to pciback), then call XENMEM_add_to_physmap_range. On the other hand, if we go with XEN_DOMCTL_memory_mapping, I would make the virtual BAR read-only, that is less preferable but still acceptable, as long as we document it. Given that this appears to be how pv pci works today I think this turns out to be preferable and saves us having to argue the toss about who should do the p2m updates on a BAR update. Which would point towards the toolstack doing the layout and telling pciback somehow what the bars for each device are and that's it. (making most of the rest of your reply moot, if you think its really important to let guests move BARs about I can go through it again) That's for guests, for dom0 Xen also need to be aware of changes to the BAR registers of the real physical devices since it needs to know the real hardware values in order to point guest p2m mappings at them. I think that in the Dom0 case, we want to map them 1:1. 
What I was getting at was that if dom0 can write the BARs of a device then maybe the 1:1 mapping in the p2m would need to change, but given we map the whole IO window to dom0 maybe not. I was also wondering if maybe Xen might need to know the BARs itself (e.g. for updating domU p2m somehow, or as part of an assignment call using SBDF not addresses), but I think that's probably not the case. Ian.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On 17/06/15 13:53, Stefano Stabellini wrote: That's for guests; for dom0, Xen also needs to be aware of changes to the BAR registers of the real physical devices since it needs to know the real hardware values in order to point guest p2m mappings at them. I think that in the Dom0 case, we want to map them 1:1. MMIO regions are always mapped 1:1 in dom0. This is actually already done by Xen. Regards, -- Julien Grall
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Could you please use plain text emails? See how bad it looks in my client. One comment below. On Tue, 16 Jun 2015, Manish Jaggi wrote: On Tuesday 16 June 2015 10:28 AM, Stefano Stabellini wrote: On Tue, 16 Jun 2015, Manish Jaggi wrote: On Tuesday 16 June 2015 09:21 AM, Roger Pau Monné wrote: El 16/06/15 a les 18.13, Stefano Stabellini ha escrit: On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. I think that the toolstack (libxl) will need to call xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH) and ARM guests. I'm not sure about this, AFAICT you are suggesting that the toolstack (or domain builder for Dom0) should setup the MMIO regions on behalf of the guest using the XEN_DOMCTL_memory_mapping hypercall. IMHO the toolstack should not setup MMIO regions and instead the guest should be in charge of setting them in the p2m by using a hypercall (or at least that was the plan on x86 PVH). Roger. There were couple of points discussed, a) There needs to be a hypercall issued from an entity to map the device MMIO space to domU. What that entity be i) Toolstack ii) domU kernel. b) Should the MMIO mapping be 1:1 For (a) I have implemented in domU kernel in the context of the notification received when a device is added on the pci-front bus. This was a logical point I thought this hypercall should be called. Keep in mind that I am still not aware how this works on x86. I think that is OK, but we would like to avoid a new hypercall. Roger's suggestion looks good. Roger, as per your comment if guest is charge of setting p2m, which existing hypercall to be used ? For (b) The BAR region is not updated AFAIK by the pci device driver running in domU. So once set the BARs by firmware or enumeration logic, are not changed, not in domU for sure. Then it is 1:1 always. Should the BAR region of the device be updated to make it not 1:1 ? I think the point Ian and Julien were trying to make is that we should not rely on the mapping being 1:1. It is OK for the guest not to change the BARs. 
But given that the memory layout of the guest is different from the one on the host, it is possible that the BAR might have an address that overlaps with a valid memory range in DomU. In that case the guest should map the MMIO region elsewhere in the guest physical space. It might also want to update the virtual BAR accordingly. (Alternatively the toolstack could come up with an appropriate placement of the virtual BARs and MMIO region mappings in the guest. should the pci-back driver return a virtual BAR in response to a pci conf space read from the domU. Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range.___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree; given that we've apparently survived for years with x86 PV guests not being able to write to the BARs, I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs, which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 mapping by default and allowing it to be changed would require modifying the guests to reassign. Easy on Linux, but I don't know about others, and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Ian.
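As an illustration only: the per-device xenstore directory Ian mentions already carries nodes such as num_devs, dev-N and vdev-N for the pcifront/pciback protocol; the vbar-* nodes below are a hypothetical naming for the new BAR address/size entries, not something defined anywhere in this thread:

    /local/domain/0/backend/pci/<domid>/0/num_devs     = "1"
    /local/domain/0/backend/pci/<domid>/0/dev-0        = "0000:00:01.0"
    /local/domain/0/backend/pci/<domid>/0/vdev-0       = "0000:00:00.0"
    /local/domain/0/backend/pci/<domid>/0/vbar-0-0     = "0x23000000"    (hypothetical)
    /local/domain/0/backend/pci/<domid>/0/vbar-0-0-len = "0x100000"      (hypothetical)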
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree; given that we've apparently survived for years with x86 PV guests not being able to write to the BARs, I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs, which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 mapping by default and allowing it to be changed would require modifying the guests to reassign. Easy on Linux, but I don't know about others, and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default.
Does the flow below capture all points?
a) When assigning a device to domU, the toolstack creates a node in the per-device directory with the virtual BAR address/size.
Option 1:
b) The toolstack, using some hypercall, asks Xen to create the p2m mapping { virtual BAR : physical BAR } for domU.
c) domU will not update the BARs at any time; if it does, it is a fault, until we decide how to handle it.
d) When domU queries the BAR address from pciback, the virtual BAR address is provided.
Option 2:
b) domU will not update the BARs at any time; if it does, it is a fault, until we decide how to handle it.
c) When domU queries the BAR address from pciback, the virtual BAR address is provided.
d) domU sends a hypercall to map the virtual BARs.
e) Xen PCI code reads the BAR and maps { virtual BAR : physical BAR } for domU.
Which option is better? I think Ian is for (2) and Stefano may be for (1). Ian.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 14:40 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 13:53 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. Why would it be a security risk? No idea, it seemed like a vague possibility. I did say (security?) not SECURITY ;-) I don't think it is important that the interface between pcifront and pciback refrects the hardware interface. But it is important that the interface is documented and complies to the expected behaviour. I think that if we go with XENMEM_add_to_physmap_range, it would be OK for pcifront to both write to the vBAR (actually writing to the ring to pciback), then call XENMEM_add_to_physmap_range. On the other hand, if we go with XEN_DOMCTL_memory_mapping, I would make the virtual BAR read-only, that is less preferable but still acceptable, as long as we document it. Given that this appears to be how pv pci works today I think this turns out to be preferable and saves us having to argue the toss about who should do the p2m updates on a BAR update. Which would point towards the toolstack doing the layout and telling pciback somehow what the bars for each device are and that's it. (making most of the rest of your reply moot, if you think its really important to let guests move BARs about I can go through it again) I think is unimportant, in fact having the guest setup the mappings was my original suggestion. I can't reconcile the two halves of this statement, are you saying you are still in favour of the guest dealing with the mappings or not? Sorry! 
I do not think it is important to let guests move BARs. In fact having the toolstack do the layout was my original suggestion and I still think it can work correctly. I don't have a strong feeling on whether it should be the toolstack or the guest doing the mapping. It can work either way. I would certainly avoid having Dom0 do the mapping.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree; given that we've apparently survived for years with x86 PV guests not being able to write to the BARs, I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs, which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 mapping by default and allowing it to be changed would require modifying the guests to reassign. Easy on Linux, but I don't know about others, and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default.
Does the flow below capture all points?
a) When assigning a device to domU, the toolstack creates a node in the per-device directory with the virtual BAR address/size.
Option 1:
b) The toolstack, using some hypercall, asks Xen to create the p2m mapping { virtual BAR : physical BAR } for domU.
c) domU will not update the BARs at any time; if it does, it is a fault, until we decide how to handle it.
As Julien has noted, pciback already deals with this correctly: because sizing a BAR involves a write, it implements a scheme which allows either the hardcoded virtual BAR value or all 1s (needed for size detection) to be written.
d) When domU queries the BAR address from pciback, the virtual BAR address is provided.
Option 2:
b) domU will not update the BARs at any time; if it does, it is a fault, until we decide how to handle it.
c) When domU queries the BAR address from pciback, the virtual BAR address is provided.
d) domU sends a hypercall to map the virtual BARs.
e) Xen PCI code reads the BAR and maps { virtual BAR : physical BAR } for domU.
Which option is better? I think Ian is for (2) and Stefano may be for (1).
In fact I'm now (after Julien pointed out the current behaviour of pciback) in favour of (1), although I'm not sure if Stefano is too. (I was never in favour of (2), FWIW; I previously was in favour of (3), which is like (2) except that pciback makes the hypercall to map the virtual BARs to the guest. I'd still favour that over (2), but (1) is now my preference.) Ian.
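To make the pciback behaviour Ian refers to concrete, here is a much-simplified model of a read-mostly virtual BAR register: ordinary writes are discarded, but the all-1s sizing write is honoured so that the next read returns the size mask. It is loosely modelled on, but is not, the real bar_write()/bar_read() handlers in drivers/xen/xen-pciback/conf_space_header.c; the struct and names below are simplifications:

    #include <stdint.h>
    #include <stdbool.h>

    struct vbar {
        uint32_t val;      /* virtual BAR value chosen by the toolstack    */
        uint32_t len_mask; /* size mask, i.e. ~(size - 1) plus flag bits   */
        bool sizing;       /* guest is in the middle of a sizing cycle     */
    };

    static void vbar_write(struct vbar *bar, uint32_t value)
    {
        if (value == 0xffffffff)
            bar->sizing = true;    /* sizing write: expose the mask on next read */
        else
            bar->sizing = false;   /* any other write is simply ignored          */
    }

    static uint32_t vbar_read(const struct vbar *bar)
    {
        return bar->sizing ? bar->len_mask : bar->val;
    }

With this model the guest can size the BAR normally and reads back the toolstack-chosen virtual address, without ever being able to move the BAR.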
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree, given that we've apparently survived for years with x86 PV guests not being able to right to the BARs I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reasssign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the Flow below captures all points a) When assigning a device to domU, toolstack creates a node in per device directory with virtual BAR address/size Option1: b) toolstack using some hypercall ask xen to create p2m mapping { virtual BAR : physical BAR } for domU c) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it As Julien has noted pciback already deals with this correctly, because sizing a BAR involves a write, it implementes a scheme which allows either the hardcoded virtual BAR to be written or all 1s (needed for size detection). d) when domU queries BAR address from pci-back the virtual BAR address is provided. Option2: b) domU will not anytime update the BARs, if it does then it is a fault, till we decide how to handle it c) when domU queries BAR address from pci-back the virtual BAR address is provided. d) domU sends a hypercall to map virtual BARs, e) xen pci code reads the BAR and maps { virtual BAR : physical BAR } for domU Which option is better I think Ian is for (2) and Stefano may be (1) In fact I'm now (after Julien pointed out the current behaviour of pciback) in favour of (1), although I'm not sure if Stefano is too. (I was never in favour of (2), FWIW, I previously was in favour of (3) which is like (2) except pciback makes the hypervcall to map the virtual bars to the guest, I'd still favour that over (2) but (1) is now my preference) OK, let's go with (1). ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-17 at 14:40 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 13:53 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. Why would it be a security risk? No idea, it seemed like a vague possibility. I did say (security?) not SECURITY ;-) I don't think it is important that the interface between pcifront and pciback refrects the hardware interface. But it is important that the interface is documented and complies to the expected behaviour. I think that if we go with XENMEM_add_to_physmap_range, it would be OK for pcifront to both write to the vBAR (actually writing to the ring to pciback), then call XENMEM_add_to_physmap_range. On the other hand, if we go with XEN_DOMCTL_memory_mapping, I would make the virtual BAR read-only, that is less preferable but still acceptable, as long as we document it. Given that this appears to be how pv pci works today I think this turns out to be preferable and saves us having to argue the toss about who should do the p2m updates on a BAR update. Which would point towards the toolstack doing the layout and telling pciback somehow what the bars for each device are and that's it. (making most of the rest of your reply moot, if you think its really important to let guests move BARs about I can go through it again) I think is unimportant, in fact having the guest setup the mappings was my original suggestion. I can't reconcile the two halves of this statement, are you saying you are still in favour of the guest dealing with the mappings or not? 
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-17 at 15:18 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 14:40 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-17 at 13:53 +0100, Stefano Stabellini wrote: On Wed, 17 Jun 2015, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. Why would it be a security risk? No idea, it seemed like a vague possibility. I did say (security?) not SECURITY ;-) I don't think it is important that the interface between pcifront and pciback refrects the hardware interface. But it is important that the interface is documented and complies to the expected behaviour. I think that if we go with XENMEM_add_to_physmap_range, it would be OK for pcifront to both write to the vBAR (actually writing to the ring to pciback), then call XENMEM_add_to_physmap_range. On the other hand, if we go with XEN_DOMCTL_memory_mapping, I would make the virtual BAR read-only, that is less preferable but still acceptable, as long as we document it. Given that this appears to be how pv pci works today I think this turns out to be preferable and saves us having to argue the toss about who should do the p2m updates on a BAR update. Which would point towards the toolstack doing the layout and telling pciback somehow what the bars for each device are and that's it. (making most of the rest of your reply moot, if you think its really important to let guests move BARs about I can go through it again) I think is unimportant, in fact having the guest setup the mappings was my original suggestion. 
I can't reconcile the two halves of this statement; are you saying you are still in favour of the guest dealing with the mappings or not? Sorry! I do not think it is important to let guests move BARs. In fact having the toolstack do the layout was my original suggestion and I still think it can work correctly. Ah, so s/guest/toolstack/ on the statement above makes it consistent, phew! I don't have a strong feeling on whether it should be the toolstack or the guest doing the mapping. It can work either way. I would certainly avoid having Dom0 do the mapping. I think it should be the toolstack. Allowing the guest to do it means implementing writeable BARs, which is a much bigger yak to shave, and it doesn't seem to be necessary. Ian
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
El 16/06/15 a les 18.13, Stefano Stabellini ha escrit: On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. I think that the toolstack (libxl) will need to call xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH) and ARM guests. I'm not sure about this, AFAICT you are suggesting that the toolstack (or domain builder for Dom0) should setup the MMIO regions on behalf of the guest using the XEN_DOMCTL_memory_mapping hypercall. IMHO the toolstack should not setup MMIO regions and instead the guest should be in charge of setting them in the p2m by using a hypercall (or at least that was the plan on x86 PVH). Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 11:05 +0200, Roger Pau Monné wrote: El 10/06/15 a les 21.21, Julien Grall ha escrit: If there is a reason for this restriction/trade off then it should be spelled out as part of the design document, as should other such design decisions (which would include explaining where this differs from how things work for x86 why they must differ). On x86, for HVM the MMIO mapping is done by QEMU. I know that Roger is working on PCI passthrough for PVH. PVH is very similar to ARM guest and I expect to see a similar needs for MMIO mapping. It would be good if we can come up with a common interface. I've kind of left that apart in favour of the new boot ABI that we are currently discussing, but IIRC the plan was to use XENMEM_add_to_physmap_range by adding a new phys_map_space and the physical MMIO pages would be specified in the idxs field. This sounds ok, and preferable to an entirely new hypercall. Indeed. What would be the advantage of using XENMEM_add_to_physmap_range over XEN_DOMCTL_memory_mapping? Would it be called by libxl or by the guest kernel? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Tue, 16 Jun 2015, Roger Pau Monné wrote: El 16/06/15 a les 18.13, Stefano Stabellini ha escrit: On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. I think that the toolstack (libxl) will need to call xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH) and ARM guests. I'm not sure about this, AFAICT you are suggesting that the toolstack (or domain builder for Dom0) should setup the MMIO regions on behalf of the guest using the XEN_DOMCTL_memory_mapping hypercall. IMHO the toolstack should not setup MMIO regions and instead the guest should be in charge of setting them in the p2m by using a hypercall (or at least that was the plan on x86 PVH). I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission.___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
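On the hypervisor side, the check Stefano describes amounts to validating a guest-requested mapping against the iomem ranges the toolstack granted earlier via xc_domain_iomem_permission. A hedged sketch only, not the real XENMEM_add_to_physmap_range handler; map_mmio_regions() is assumed here as the ARM stage-2 mapping primitive:

    /* Simplified: reject the request unless the toolstack previously
     * granted the domain access to these machine frames, then install
     * the stage-2 mapping gfn -> mfn for nr frames. */
    static int check_and_map_mmio(struct domain *d, unsigned long gfn,
                                  unsigned long mfn, unsigned long nr)
    {
        if ( !iomem_access_permitted(d, mfn, mfn + nr - 1) )
            return -EPERM;

        return map_mmio_regions(d, gfn, nr, mfn);
    }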
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Tuesday 16 June 2015 10:28 AM, Stefano Stabellini wrote: On Tue, 16 Jun 2015, Manish Jaggi wrote: On Tuesday 16 June 2015 09:21 AM, Roger Pau Monné wrote: El 16/06/15 a les 18.13, Stefano Stabellini ha escrit: On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. I think that the toolstack (libxl) will need to call xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH) and ARM guests. I'm not sure about this, AFAICT you are suggesting that the toolstack (or domain builder for Dom0) should setup the MMIO regions on behalf of the guest using the XEN_DOMCTL_memory_mapping hypercall. IMHO the toolstack should not setup MMIO regions and instead the guest should be in charge of setting them in the p2m by using a hypercall (or at least that was the plan on x86 PVH). Roger. There were couple of points discussed, a) There needs to be a hypercall issued from an entity to map the device MMIO space to domU. What that entity be i) Toolstack ii) domU kernel. b) Should the MMIO mapping be 1:1 For (a) I have implemented in domU kernel in the context of the notification received when a device is added on the pci-front bus. This was a logical point I thought this hypercall should be called. Keep in mind that I am still not aware how this works on x86. I think that is OK, but we would like to avoid a new hypercall. Roger's suggestion looks good. Roger, as per your comment if guest is charge of setting p2m, which existing hypercall to be used ? For (b) The BAR region is not updated AFAIK by the pci device driver running in domU. So once set the BARs by firmware or enumeration logic, are not changed, not in domU for sure. Then it is 1:1 always. Should the BAR region of the device be updated to make it not 1:1 ? I think the point Ian and Julien were trying to make is that we should not rely on the mapping being 1:1. It is OK for the guest not to change the BARs. But given that the memory layout of the guest is different from the one on the host, it is possible that the BAR might have an address that overlaps with a valid memory range in DomU. In that case the guest should map the MMIO region elsewhere in the guest physical space. 
It might also want to update the virtual BAR accordingly. (Alternatively the toolstack could come up with an appropriate placement of the virtual BARs and MMIO region mappings in the guest. Should the pci-back driver return a virtual BAR in response to a pci conf space read from the domU? I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall? This is what I was suggesting, but this solution is probably more complex than letting the guest choose.) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
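To make the pci-back option discussed above concrete, here is a minimal sketch of a conf-space read handler that would return a toolstack-chosen virtual BAR instead of the physical value. It follows the shape of the existing bar_read handlers in drivers/xen/xen-pciback/conf_space_header.c, but the helper xen_pcibk_get_virtual_bar() and the idea that the toolstack supplies the value are assumptions, not existing code:

/* Sketch only: return a virtual BAR value for a config-space read from the
 * guest.  xen_pcibk_get_virtual_bar() is hypothetical; it would look up the
 * guest address the toolstack picked for this BAR of this device. */
static int virtual_bar_read(struct pci_dev *dev, int offset, u32 *value,
                            void *data)
{
        u32 phys = *(u32 *)data;        /* cached physical BAR value */
        u64 vbar;

        if (xen_pcibk_get_virtual_bar(dev, offset, &vbar) == 0)
                /* keep the low flag bits (memory type, prefetchable) */
                *value = lower_32_bits(vbar) |
                         (phys & ~PCI_BASE_ADDRESS_MEM_MASK);
        else
                *value = phys;          /* fall back to the physical BAR */

        return 0;
}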
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Tuesday 16 June 2015 09:21 AM, Roger Pau Monné wrote: El 16/06/15 a les 18.13, Stefano Stabellini ha escrit: On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. I think that the toolstack (libxl) will need to call xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH) and ARM guests. I'm not sure about this, AFAICT you are suggesting that the toolstack (or domain builder for Dom0) should setup the MMIO regions on behalf of the guest using the XEN_DOMCTL_memory_mapping hypercall. IMHO the toolstack should not setup MMIO regions and instead the guest should be in charge of setting them in the p2m by using a hypercall (or at least that was the plan on x86 PVH). Roger. There were couple of points discussed, a) There needs to be a hypercall issued from an entity to map the device MMIO space to domU. What that entity be i) Toolstack ii) domU kernel. b) Should the MMIO mapping be 1:1 For (a) I have implemented in domU kernel in the context of the notification received when a device is added on the pci-front bus. This was a logical point I thought this hypercall should be called. Keep in mind that I am still not aware how this works on x86. For (b) The BAR region is not updated AFAIK by the pci device driver running in domU. So once set the BARs by firmware or enumeration logic, are not changed, not in domU for sure. Then it is 1:1 always. Should the BAR region of the device be updated to make it not 1:1 ? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Tue, 16 Jun 2015, Manish Jaggi wrote: On Tuesday 16 June 2015 09:21 AM, Roger Pau Monné wrote: El 16/06/15 a les 18.13, Stefano Stabellini ha escrit: On Thu, 11 Jun 2015, Ian Campbell wrote: On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. I think that the toolstack (libxl) will need to call xc_domain_memory_mapping (XEN_DOMCTL_memory_mapping), in addition to xc_domain_iomem_permission, for auto-translated PV guests on x86 (PVH) and ARM guests. I'm not sure about this, AFAICT you are suggesting that the toolstack (or domain builder for Dom0) should setup the MMIO regions on behalf of the guest using the XEN_DOMCTL_memory_mapping hypercall. IMHO the toolstack should not setup MMIO regions and instead the guest should be in charge of setting them in the p2m by using a hypercall (or at least that was the plan on x86 PVH). Roger. There were couple of points discussed, a) There needs to be a hypercall issued from an entity to map the device MMIO space to domU. What that entity be i) Toolstack ii) domU kernel. b) Should the MMIO mapping be 1:1 For (a) I have implemented in domU kernel in the context of the notification received when a device is added on the pci-front bus. This was a logical point I thought this hypercall should be called. Keep in mind that I am still not aware how this works on x86. I think that is OK, but we would like to avoid a new hypercall. Roger's suggestion looks good. For (b) The BAR region is not updated AFAIK by the pci device driver running in domU. So once set the BARs by firmware or enumeration logic, are not changed, not in domU for sure. Then it is 1:1 always. Should the BAR region of the device be updated to make it not 1:1 ? I think the point Ian and Julien were trying to make is that we should not rely on the mapping being 1:1. It is OK for the guest not to change the BARs. But given that the memory layout of the guest is different from the one on the host, it is possible that the BAR might have an address that overlaps with a valid memory range in DomU. In that case the guest should map the MMIO region elsewhere in the guest physical space. It might also want to update the virtual BAR accordingly. 
(Alternatively the toolstack could come up with an appropriate placement of the virtual BARs and MMIO region mappings in the guest. This is what I was suggesting, but this solution is probably more complex than letting the guest choose.) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
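A minimal sketch of that toolstack-side alternative, assuming libxl/libxc in dom0 picks the guest address for the MMIO region: grant the domain access to the machine frames, then (for an auto-translated guest) install the stage-2 mapping at the chosen guest frame. The function name and the way the address is chosen are illustrative; the two libxc calls exist today:

#include <xenctrl.h>

/* Map one BAR of `size` bytes at host physical address `maddr` into the
 * guest at `gaddr` (both assumed page-aligned here for simplicity). */
static int map_bar_to_guest(xc_interface *xch, uint32_t domid,
                            uint64_t maddr, uint64_t gaddr, uint64_t size)
{
    unsigned long mfn = maddr >> XC_PAGE_SHIFT;
    unsigned long gfn = gaddr >> XC_PAGE_SHIFT;
    unsigned long nr  = (size + XC_PAGE_SIZE - 1) >> XC_PAGE_SHIFT;
    int rc;

    /* Allow the domain to access these machine frames at all. */
    rc = xc_domain_iomem_permission(xch, domid, mfn, nr, 1);
    if (rc)
        return rc;

    /* For auto-translated guests (ARM, x86 PVH) also edit the p2m/stage-2;
     * classic x86 PV guests map the region themselves instead. */
    return xc_domain_memory_mapping(xch, domid, gfn, mfn, nr, 1 /* add */);
}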
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Hello, El 16/06/15 a les 7.42, Manish Jaggi ha escrit: [...] Beware that the 1:1 mapping doesn't fit with the current guest memory layout, which is pre-defined at Xen build time. So you would also have to make it dynamic or decide to use the same memory layout as the host. If the same layout as the host is used, would there be any issue? I'm not sure that a 1:1 mapping is any different to the host layout. But in any case, the host layout also doesn't match the guest layout, so it has the same issues. There is a domctl on x86, DOMCTL_iomem_permission, which looks like it does the same thing as the map_mmio hypercall which I am proposing. I am not sure if it does a 1:1 mapping. Do you have more info on it? No, this hypercall only sets the permissions of IO memory ranges, but AFAICT it doesn't touch the p2m at all. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On 16/06/2015 08:02, Roger Pau Monné wrote: Hello, El 16/06/15 a les 7.42, Manish Jaggi ha escrit: [...] Beware that the 1:1 mapping doesn't fit with the current guest memory layout, which is pre-defined at Xen build time. So you would also have to make it dynamic or decide to use the same memory layout as the host. If the same layout as the host is used, would there be any issue? I'm not sure that a 1:1 mapping is any different to the host layout. But in any case, the host layout also doesn't match the guest layout, so it has the same issues. There is a domctl on x86, DOMCTL_iomem_permission, which looks like it does the same thing as the map_mmio hypercall which I am proposing. I am not sure if it does a 1:1 mapping. Do you have more info on it? No, this hypercall only sets the permissions of IO memory ranges, but AFAICT it doesn't touch the p2m at all. Correct. -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
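For reference, the two domctls being contrasted here do different things. Roughly as declared in xen/include/public/domctl.h (check the header for the exact current layout), the first only records which I/O memory the domain may map, while the second actually edits the p2m of an auto-translated guest:

struct xen_domctl_iomem_permission {
    uint64_aligned_t first_mfn;   /* first machine frame number */
    uint64_aligned_t nr_mfns;     /* number of frames */
    uint8_t allow_access;         /* grant (1) or revoke (0) permission only */
};

struct xen_domctl_memory_mapping {
    uint64_aligned_t first_gfn;   /* guest frame where the mapping appears */
    uint64_aligned_t first_mfn;   /* machine frame being mapped */
    uint64_aligned_t nr_mfns;     /* number of frames */
    uint32_t add_mapping;         /* 1 = add to the p2m, 0 = remove */
    uint32_t padding;
};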
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Friday 12 June 2015 01:32 AM, Ian Campbell wrote: On Thu, 2015-06-11 at 14:38 -0700, Manish Jaggi wrote: On Wednesday 10 June 2015 12:21 PM, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. I think both the hypercalls are necessary a) the mapping of guest bdf to actual sbdf is required as domU accesses for GIC are trapped and not handled by pciback. A device say 1:0:0.3 is assigned in domU at 0:0:0.3. This is the bestway I could find that works. b) map_mmio call is issued just after the device is added on the pcu bus (in case of domU) The function register_xen_pci_notifier (drivers/xen/pci.c) is modified such that notification is received in domU and dom0. In which please please add to the document a discussion of the current interfaces and why they are not suitable. Beware that the 1:1 mapping doesn't fit with the current guest memory layout which is pre-defined at Xen build time. So you would also have to make it dynamically or decide to use the same memory layout as the host. If same layout as host used, would there be any issue? I'm not sure that a 1:1 mapping is any different to the host layout. But in any case, the host layout also doesn't match the guest layout, so it has the same issues. There is a domctl on x86 DOMCTL_iomem_permission which looks doing the same thing as the map_mmio hypercall which I am proposing. I am not sure if it does a 1:1. Do you have more info on it? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thu, 2015-06-11 at 14:38 -0700, Manish Jaggi wrote: On Wednesday 10 June 2015 12:21 PM, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. I think both the hypercalls are necessary a) the mapping of guest bdf to actual sbdf is required as domU accesses for GIC are trapped and not handled by pciback. A device say 1:0:0.3 is assigned in domU at 0:0:0.3. This is the bestway I could find that works. b) map_mmio call is issued just after the device is added on the pcu bus (in case of domU) The function register_xen_pci_notifier (drivers/xen/pci.c) is modified such that notification is received in domU and dom0. In which please please add to the document a discussion of the current interfaces and why they are not suitable. Beware that the 1:1 mapping doesn't fit with the current guest memory layout which is pre-defined at Xen build time. So you would also have to make it dynamically or decide to use the same memory layout as the host. If same layout as host used, would there be any issue? I'm not sure that a 1:1 mapping is any different to the host layout. But in any case, the host layout also doesn't match the guest layout, so it has the same issues. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Fri, 2015-06-12 at 07:41 -0400, Julien Grall wrote: I was suggesting to expose the host layout to the guest layout (similar to e820). We do this on x86 only as a workaround for broken hardware (essentially magic system devices which bypass the IOMMU). We shouldn't do this on ARM by default, but only as and when a similar requirement arises. Although, this is not my preferred way and a non 1:1 mapping would be the best. Yes, a non-1:1 mapping is the right answer. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On 12/06/2015 04:32, Ian Campbell wrote: On Thu, 2015-06-11 at 14:38 -0700, Manish Jaggi wrote: On Wednesday 10 June 2015 12:21 PM, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. I think both the hypercalls are necessary a) the mapping of guest bdf to actual sbdf is required as domU accesses for GIC are trapped and not handled by pciback. A device say 1:0:0.3 is assigned in domU at 0:0:0.3. This is the bestway I could find that works. b) map_mmio call is issued just after the device is added on the pcu bus (in case of domU) The function register_xen_pci_notifier (drivers/xen/pci.c) is modified such that notification is received in domU and dom0. In which please please add to the document a discussion of the current interfaces and why they are not suitable. Beware that the 1:1 mapping doesn't fit with the current guest memory layout which is pre-defined at Xen build time. So you would also have to make it dynamically or decide to use the same memory layout as the host. If same layout as host used, would there be any issue? I'm not sure that a 1:1 mapping is any different to the host layout. But in any case, the host layout also doesn't match the guest layout, so it has the same issues. I was suggesting to expose the host layout to the guest layout (similar to e820). Although, this is not my preferred way and a non 1:1 mapping would be the best. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. AFAIU, when the device is assigned to the guest, we don't know yet where the BAR will live in the guest memory. It will be assigned by the guest (I wasn't able to find if Linux is able to do it). As the config space will trap in pciback, we would need to map the physical memory to the guest from the kernel. A domain These sorts of considerations/assumptions should be part of the document IMHO. Xen adds the mmio space to the stage2 translation for domU. The restrction is that xen creates 1:1 mapping of the MMIO address. I don't think we need/want this restriction. We can define some region(s) of guest memory to be an MMIO hole (by adding them to to the memory map in public/arch-arm.h). Even if we decide to choose a 1:1 mapping, this should not be exposed in the hypervisor interface (see the suggested physdev_map_mmio) and let at the discretion of the toolstack domain. Beware that the 1:1 mapping doesn't fit with the current guest memory layout which is pre-defined at Xen build time. So you would also have to make it dynamically or decide to use the same memory layout as the host. I am fairly strongly against using a 1:1 mapping for passthrough MMIO devices to guests, with the knockon effects it implies without a very strong reason why it must be the case, which should be spelled out in detail in the document. If there is a reason for this restriction/trade off then it should be spelled out as part of the design document, as should other such design decisions (which would include explaining where this differs from how things work for x86 why they must differ). On x86, for HVM the MMIO mapping is done by QEMU. I know that Roger is working on PCI passthrough for PVH. PVH is very similar to ARM guest and I expect to see a similar needs for MMIO mapping. It would be good if we can come up with a common interface. Yes. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
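To make the "MMIO hole in the guest memory map" option concrete, the addition to xen/include/public/arch-arm.h could look roughly like the sketch below. The names and values are purely illustrative; a real layout would have to be chosen so it does not collide with the existing GUEST_* regions or guest RAM:

/* Hypothetical guest physical address window reserved for passed-through
 * PCI device MMIO; the toolstack (or the guest) would place virtual BARs
 * and stage-2 mappings inside this range. */
#define GUEST_PCI_MMIO_BASE   0x10000000ULL
#define GUEST_PCI_MMIO_SIZE   0x10000000ULL   /* 256MB, example only */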
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
El 10/06/15 a les 21.21, Julien Grall ha escrit: If there is a reason for this restriction/trade off then it should be spelled out as part of the design document, as should other such design decisions (which would include explaining where this differs from how things work for x86 why they must differ). On x86, for HVM the MMIO mapping is done by QEMU. I know that Roger is working on PCI passthrough for PVH. PVH is very similar to ARM guest and I expect to see a similar needs for MMIO mapping. It would be good if we can come up with a common interface. I've kind of left that apart in favour of the new boot ABI that we are currently discussing, but IIRC the plan was to use XENMEM_add_to_physmap_range by adding a new phys_map_space and the physical MMIO pages would be specified in the idxs field. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
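A rough guest-kernel-side sketch of that idea. XENMAPSPACE_dev_mmio is a stand-in for the proposed new mapping space (it does not exist yet); idxs would carry the machine frame numbers of the MMIO pages and gpfns the guest frames chosen for them, following struct xen_add_to_physmap_range in xen/include/public/memory.h:

/* Sketch: ask Xen to map `nr` MMIO machine frames (mfns[]) at the guest
 * frames gfns[] chosen by the guest.  Per-page errors come back in errs[]. */
static int map_dev_mmio(xen_ulong_t *mfns, xen_pfn_t *gfns, int *errs,
                        unsigned int nr)
{
        struct xen_add_to_physmap_range xatpr = {
                .domid = DOMID_SELF,
                .space = XENMAPSPACE_dev_mmio,   /* proposed, not yet defined */
                .size  = nr,
        };

        set_xen_guest_handle(xatpr.idxs, mfns);
        set_xen_guest_handle(xatpr.gpfns, gfns);
        set_xen_guest_handle(xatpr.errs, errs);

        return HYPERVISOR_memory_op(XENMEM_add_to_physmap_range, &xatpr);
}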
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thu, 2015-06-11 at 07:25 -0400, Julien Grall wrote: Hi Ian, On 11/06/2015 04:56, Ian Campbell wrote: On Wed, 2015-06-10 at 15:21 -0400, Julien Grall wrote: Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. What about for x86 PV? I think it is done by the toolstack there, I don't know what pciback does with accesses to BAR registers. XEN_DOMCTL_memory_mapping is only used to map memory in stage-2 page table. This is only used for auto-translated guest. In the case of x86 PV, the page-table is managed by the guest. The only things to do is to give the MMIO permission to the guest in order to the let him use them. This is done at boot time in the toolstack. Ah yes, makes sense. Manish, this sort of thing and the constraints etc should be discussed in the doc please. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thu, 2015-06-11 at 11:05 +0200, Roger Pau Monné wrote: El 10/06/15 a les 21.21, Julien Grall ha escrit: If there is a reason for this restriction/trade off then it should be spelled out as part of the design document, as should other such design decisions (which would include explaining where this differs from how things work for x86 why they must differ). On x86, for HVM the MMIO mapping is done by QEMU. I know that Roger is working on PCI passthrough for PVH. PVH is very similar to ARM guest and I expect to see a similar needs for MMIO mapping. It would be good if we can come up with a common interface. I've kind of left that apart in favour of the new boot ABI that we are currently discussing, but IIRC the plan was to use XENMEM_add_to_physmap_range by adding a new phys_map_space and the physical MMIO pages would be specified in the idxs field. This sounds ok, and preferable to an entirely new hypercall. One question is how to handle writes to the BAR registers in this scenario. If we allow changes (does PCI allow us to not allow them?) then the guest p2m would need updating too... Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
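If BAR writes were allowed, whoever traps them (pciback, the toolstack, or Xen itself) would conceptually have to move the mapping as sketched below. unmap_from_guest()/map_to_guest() are placeholders for whichever mechanism is finally chosen (XEN_DOMCTL_memory_mapping, XENMEM_add_to_physmap_range, ...), not real functions:

/* Conceptual only: the guest moved a BAR from old_gaddr to new_gaddr. */
static int handle_bar_write(uint32_t domid, uint64_t maddr, uint64_t size,
                            uint64_t old_gaddr, uint64_t new_gaddr)
{
    int rc;

    rc = unmap_from_guest(domid, old_gaddr, size);      /* drop old p2m entries */
    if (rc)
        return rc;
    return map_to_guest(domid, new_gaddr, maddr, size); /* install the new ones */
}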
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Mon, 2015-06-08 at 00:52 -0700, Manish Jaggi wrote: Thanks, the general shape of this is looking good. It'd be a lot easier to read if you could arrange not to mangle the whitespaced/wrapping when sending though. PCI Pass-through in Xen ARM -- Index 1. Background 2. Basic PCI Support in Xen ARM 2.1 pci_hostbridge and pci_hostbridge_ops 2.2 PHYSDEVOP_pci_host_bridge_add hypercall 3. Dom0 Access PCI devices 4. DomU assignment of PCI device 5. NUMA and PCI passthrough 6. DomU pci device attach flow 1. Background of PCI passthrough [...] 2. Basic PCI Support for ARM [...] 3. Dom0 access PCI device - As per the design of xen hypervisor, dom0 enumerates the PCI devices. For each device the MMIO space has to be mapped in the Stage2 translation for dom0. For dom0 xen maps the ranges in pci nodes in stage 2 translation. Currently this is done by mapping the entire PCI window to dom0, not just the regions referenced by a specific device BAR. This could be done by the host controller driver I think. I don't think we need to go to the effort of going into each device's PCI cfg space and reading its BARs etc, do we? This section deal with the routing of PCI INTx interrupts (mapped to SPIs) as well as talking about MSIs. 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. Xen adds the mmio space to the stage2 translation for domU. The restrction is that xen creates 1:1 mapping of the MMIO address. I don't think we need/want this restriction. We can define some region(s) of guest memory to be an MMIO hole (by adding them to to the memory map in public/arch-arm.h). If there is a reason for this restriction/trade off then it should be spelled out as part of the design document, as should other such design decisions (which would include explaining where this differs from how things work for x86 why they must differ). #define PHYSDEVOP_map_sbdf 43 Isn't this just XEN_DOMCTL_assign_device? Change in PCI ForntEnd - backend driver for MSI/X programming - On the Pci frontend bus a msi-parent as gicv3-its is added. As there is a single virtual its for a domU, as there is only a single virtual pci bus in domU. This ensures that the config_msi calls are handled by the gicv3 its driver in domU kernel and not utilizing frontend-backend communication between dom0-domU. OK. 5. NUMA domU and vITS - a) On NUMA systems domU still have a single its node. b) How can xen identify the ITS on which a device is connected. - Using segment number query using api which gives pci host controllers device node struct dt_device_node* pci_hostbridge_dt_node(uint32_t segno) c) Query the interrupt parent of the pci device node to find out the its. Yes, I think that can work. 6. DomU Bootup flow - a. DomU boots up without any pci devices assigned. A daemon listens to events from the xenstore. Which daemon? Where does it live? When a device is attached to domU, the frontend pci bus driver starts enumerating the devices.Front end driver communicates with backend driver in dom0 to read the pci config space. backend driver here == xen-pciback.ko or something else? Does it differ from the daemon referred to above? 
Does it use the existing pciif.h protocol (I hope so). We do not have to use xen-pciback for everything (e.g. ITS and interrupts generally seem like a reasonable place to differ) but for things which pciback does we should in general prefer to use it. I'd prefer to avoid the need for a separate daemon if possible. b. Device driver of the specific pci device invokes methods to configure the msi/x interrupt which are handled by the its driver in domU kernel. The read/writes by the its driver are trapped in xen. ITS driver finds out the actual sbdf based on the map_sbdf hypercall information. Don't forget to also consider PCI INTx interrupts. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
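On points 5(b)/(c) of the draft, the segment-to-host-bridge lookup can reuse the pci_hostbridge list from section 2.1; a sketch, assuming pci_hostbridge_list is the global list head already used by pcihb_conf_read in the draft:

/* Sketch: return the device tree node of the host bridge owning PCI segment
 * `segno`, so its interrupt-parent (the ITS) can then be looked up. */
struct dt_device_node *pci_hostbridge_dt_node(uint32_t segno)
{
    pci_hostbridge_t *pcihb;

    list_for_each_entry(pcihb, &pci_hostbridge_list, list)
    {
        if (pcihb->segno == segno)
            return pcihb->dt_node;
    }

    return NULL;
}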
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Hi, On 10/06/2015 08:45, Ian Campbell wrote: 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced I don't think we want/need new hypercalls here, the same existing hypercalls which are used on x86 should be suitable. That's XEN_DOMCTL_memory_mapping from the toolstack I think. XEN_DOMCTL_memory_mapping is done by QEMU for x86 HVM when the guest (i.e hvmloader?) is writing in the PCI BAR. AFAIU, when the device is assigned to the guest, we don't know yet where the BAR will live in the guest memory. It will be assigned by the guest (I wasn't able to find if Linux is able to do it). As the config space will trap in pciback, we would need to map the physical memory to the guest from the kernel. A domain Xen adds the mmio space to the stage2 translation for domU. The restrction is that xen creates 1:1 mapping of the MMIO address. I don't think we need/want this restriction. We can define some region(s) of guest memory to be an MMIO hole (by adding them to to the memory map in public/arch-arm.h). Even if we decide to choose a 1:1 mapping, this should not be exposed in the hypervisor interface (see the suggested physdev_map_mmio) and let at the discretion of the toolstack domain. Beware that the 1:1 mapping doesn't fit with the current guest memory layout which is pre-defined at Xen build time. So you would also have to make it dynamically or decide to use the same memory layout as the host. If there is a reason for this restriction/trade off then it should be spelled out as part of the design document, as should other such design decisions (which would include explaining where this differs from how things work for x86 why they must differ). On x86, for HVM the MMIO mapping is done by QEMU. I know that Roger is working on PCI passthrough for PVH. PVH is very similar to ARM guest and I expect to see a similar needs for MMIO mapping. It would be good if we can come up with a common interface. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Hi Ian/Stefano, As discussed in the call I have sent the design. Didn't got any feedback on this. Regards, Manish Jaggi From: xen-devel-boun...@lists.xen.org xen-devel-boun...@lists.xen.org on behalf of Manish Jaggi mja...@caviumnetworks.com Sent: Monday, June 8, 2015 12:52:55 AM To: xen-devel@lists.xen.org; Ian Campbell; Stefano Stabellini; Vijay Kilari; Kulkarni, Ganapatrao; Kumar, Vijaya; Kapoor, Prasun Subject: [Xen-devel] PCI Passthrough ARM Design : Draft1 PCI Pass-through in Xen ARM -- Index 1. Background 2. Basic PCI Support in Xen ARM 2.1 pci_hostbridge and pci_hostbridge_ops 2.2 PHYSDEVOP_pci_host_bridge_add hypercall 3. Dom0 Access PCI devices 4. DomU assignment of PCI device 5. NUMA and PCI passthrough 6. DomU pci device attach flow 1. Background of PCI passthrough Passthrough refers to assigning a pci device to a guest domain (domU) such that the guest has full control over the device.The MMIO space and interrupts are managed by the guest itself, close to how a bare kernel manages a device. Device's access to guest address space needs to be isolated and protected. SMMU (System MMU - IOMMU in ARM) is programmed by xen hypervisor to allow device access guest memory for data transfer and sending MSI/X interrupts. In case of MSI/X the device writes to GITS (ITS address space) Interrupt Translation Register. 2. Basic PCI Support for ARM The apis to read write from pci configuration space are based on segment:bdf. How the sbdf is mapped to a physical address is under the realm of the pci host controller. ARM PCI support in Xen, introduces pci host controller similar to what exists in Linux. Each drivers registers callbacks, which are invoked on matching the compatible property in pci device tree node. 2.1: The init function in the pci host driver calls to register hostbridge callbacks: int pci_hostbridge_register(pci_hostbridge_t *pcihb); struct pci_hostbridge_ops { u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes); void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes, u32 val); }; struct pci_hostbridge{ u32 segno; paddr_t cfg_base; paddr_t cfg_size; struct dt_device_node *dt_node; struct pci_hostbridge_ops ops; struct list_head list; }; A pci conf read function would internally be as follows: u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn,u32 reg, u32 bytes) { pci_hostbridge_t *pcihb; list_for_each_entry(pcihb, pci_hostbridge_list, list) { if(pcihb-segno == seg) return pcihb-ops.pci_conf_read(pcihb, bus, devfn, reg, bytes); } return -1; } 2.2 PHYSDEVOP_pci_host_bridge_add hypercall Xen code accesses PCI configuration space based on the sbdf received from the guest. The order in which the pci device tree node appear may not be the same order of device enumeration in dom0. Thus there needs to be a mechanism to bind the segment number assigned by dom0 to the pci host controller. The hypercall is introduced: #define PHYSDEVOP_pci_host_bridge_add44 struct physdev_pci_host_bridge_add { /* IN */ uint16_t seg; uint64_t cfg_base; uint64_t cfg_size; }; This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add hypercall. The handler code invokes to update segment number in pci_hostbridge: int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size); Subsequent calls to pci_conf_read/write are completed by the pci_hostbridge_ops of the respective pci_hostbridge. 3. Dom0 access PCI device - As per the design of xen hypervisor, dom0 enumerates the PCI devices. 
For each device the MMIO space has to be mapped in the Stage2 translation for dom0. For dom0 xen maps the ranges in pci nodes in stage 2 translation. GITS_ITRANSLATER space (4k( must be programmed in Stage2 translation so that MSI/X must work. This is done in vits intitialization in dom0/domU. 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced #define PHYSDEVOP_map_mmio 40 #define PHYSDEVOP_unmap_mmio41 struct physdev_map_mmio { /* IN */ uint64_t addr; uint64_t size; }; Xen adds the mmio space to the stage2 translation for domU. The restrction is that xen creates 1:1 mapping of the MMIO address. #define PHYSDEVOP_map_sbdf 43 struct physdev_map_sbdf { int domain_id; int sbdf_s; int sbdf_b; int sbdf_d; int sbdf_f; int gsbdf_s; int gsbdf_b
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Please give us a chance to respond, it's been only just over a day and we are all busy with lots of different things. On Tue, 2015-06-09 at 14:42 +, Jaggi, Manish wrote: Hi Ian/Stefano, As discussed in the call I have sent the design. Didn't got any feedback on this. Regards, Manish Jaggi From: xen-devel-boun...@lists.xen.org xen-devel-boun...@lists.xen.org on behalf of Manish Jaggi mja...@caviumnetworks.com Sent: Monday, June 8, 2015 12:52:55 AM To: xen-devel@lists.xen.org; Ian Campbell; Stefano Stabellini; Vijay Kilari; Kulkarni, Ganapatrao; Kumar, Vijaya; Kapoor, Prasun Subject: [Xen-devel] PCI Passthrough ARM Design : Draft1 PCI Pass-through in Xen ARM -- Index 1. Background 2. Basic PCI Support in Xen ARM 2.1 pci_hostbridge and pci_hostbridge_ops 2.2 PHYSDEVOP_pci_host_bridge_add hypercall 3. Dom0 Access PCI devices 4. DomU assignment of PCI device 5. NUMA and PCI passthrough 6. DomU pci device attach flow 1. Background of PCI passthrough Passthrough refers to assigning a pci device to a guest domain (domU) such that the guest has full control over the device.The MMIO space and interrupts are managed by the guest itself, close to how a bare kernel manages a device. Device's access to guest address space needs to be isolated and protected. SMMU (System MMU - IOMMU in ARM) is programmed by xen hypervisor to allow device access guest memory for data transfer and sending MSI/X interrupts. In case of MSI/X the device writes to GITS (ITS address space) Interrupt Translation Register. 2. Basic PCI Support for ARM The apis to read write from pci configuration space are based on segment:bdf. How the sbdf is mapped to a physical address is under the realm of the pci host controller. ARM PCI support in Xen, introduces pci host controller similar to what exists in Linux. Each drivers registers callbacks, which are invoked on matching the compatible property in pci device tree node. 2.1: The init function in the pci host driver calls to register hostbridge callbacks: int pci_hostbridge_register(pci_hostbridge_t *pcihb); struct pci_hostbridge_ops { u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes); void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes, u32 val); }; struct pci_hostbridge{ u32 segno; paddr_t cfg_base; paddr_t cfg_size; struct dt_device_node *dt_node; struct pci_hostbridge_ops ops; struct list_head list; }; A pci conf read function would internally be as follows: u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn,u32 reg, u32 bytes) { pci_hostbridge_t *pcihb; list_for_each_entry(pcihb, pci_hostbridge_list, list) { if(pcihb-segno == seg) return pcihb-ops.pci_conf_read(pcihb, bus, devfn, reg, bytes); } return -1; } 2.2 PHYSDEVOP_pci_host_bridge_add hypercall Xen code accesses PCI configuration space based on the sbdf received from the guest. The order in which the pci device tree node appear may not be the same order of device enumeration in dom0. Thus there needs to be a mechanism to bind the segment number assigned by dom0 to the pci host controller. The hypercall is introduced: #define PHYSDEVOP_pci_host_bridge_add44 struct physdev_pci_host_bridge_add { /* IN */ uint16_t seg; uint64_t cfg_base; uint64_t cfg_size; }; This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add hypercall. 
The handler code invokes to update segment number in pci_hostbridge: int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size); Subsequent calls to pci_conf_read/write are completed by the pci_hostbridge_ops of the respective pci_hostbridge. 3. Dom0 access PCI device - As per the design of xen hypervisor, dom0 enumerates the PCI devices. For each device the MMIO space has to be mapped in the Stage2 translation for dom0. For dom0 xen maps the ranges in pci nodes in stage 2 translation. GITS_ITRANSLATER space (4k( must be programmed in Stage2 translation so that MSI/X must work. This is done in vits intitialization in dom0/domU. 4. DomU access / assignment PCI device -- When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced #define PHYSDEVOP_map_mmio 40 #define PHYSDEVOP_unmap_mmio41 struct physdev_map_mmio { /* IN */ uint64_t addr; uint64_t size; }; Xen adds the mmio space
[Xen-devel] PCI Passthrough ARM Design : Draft1
PCI Pass-through in Xen ARM
--
Index
1. Background
2. Basic PCI Support in Xen ARM
 2.1 pci_hostbridge and pci_hostbridge_ops
 2.2 PHYSDEVOP_pci_host_bridge_add hypercall
3. Dom0 Access PCI devices
4. DomU assignment of PCI device
5. NUMA and PCI passthrough
6. DomU pci device attach flow

1. Background of PCI passthrough
Passthrough refers to assigning a pci device to a guest domain (domU) such that the guest has full control over the device. The MMIO space and interrupts are managed by the guest itself, close to how a bare kernel manages a device. The device's access to guest address space needs to be isolated and protected. The SMMU (System MMU - IOMMU in ARM) is programmed by the xen hypervisor to allow the device to access guest memory for data transfer and for sending MSI/X interrupts. In case of MSI/X the device writes to the GITS (ITS address space) Interrupt Translation Register.

2. Basic PCI Support for ARM
The APIs to read/write the pci configuration space are based on segment:bdf. How the sbdf is mapped to a physical address is under the realm of the pci host controller. ARM PCI support in Xen introduces pci host controllers similar to what exists in Linux. Each driver registers callbacks, which are invoked on matching the compatible property in the pci device tree node.

2.1: The init function in the pci host driver calls the following to register hostbridge callbacks:

int pci_hostbridge_register(pci_hostbridge_t *pcihb);

struct pci_hostbridge_ops {
    u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn,
                         u32 reg, u32 bytes);
    void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn,
                           u32 reg, u32 bytes, u32 val);
};

struct pci_hostbridge {
    u32 segno;
    paddr_t cfg_base;
    paddr_t cfg_size;
    struct dt_device_node *dt_node;
    struct pci_hostbridge_ops ops;
    struct list_head list;
};

A pci conf read function would internally be as follows:

u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn, u32 reg, u32 bytes)
{
    pci_hostbridge_t *pcihb;
    list_for_each_entry(pcihb, pci_hostbridge_list, list)
    {
        if (pcihb->segno == seg)
            return pcihb->ops.pci_conf_read(pcihb, bus, devfn, reg, bytes);
    }
    return -1;
}

2.2 PHYSDEVOP_pci_host_bridge_add hypercall
Xen code accesses PCI configuration space based on the sbdf received from the guest. The order in which the pci device tree nodes appear may not be the same as the order of device enumeration in dom0. Thus there needs to be a mechanism to bind the segment number assigned by dom0 to the pci host controller. The hypercall introduced is:

#define PHYSDEVOP_pci_host_bridge_add 44
struct physdev_pci_host_bridge_add {
    /* IN */
    uint16_t seg;
    uint64_t cfg_base;
    uint64_t cfg_size;
};

This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add hypercall. The handler code invokes the following to update the segment number in the pci_hostbridge:

int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size);

Subsequent calls to pci_conf_read/write are completed by the pci_hostbridge_ops of the respective pci_hostbridge.

3. Dom0 access PCI device
-
As per the design of the xen hypervisor, dom0 enumerates the PCI devices. For each device the MMIO space has to be mapped in the Stage2 translation for dom0. For dom0, xen maps the ranges in the pci nodes in stage 2 translation. The GITS_ITRANSLATER space (4k) must be programmed in the Stage2 translation so that MSI/X works. This is done in vits initialization in dom0/domU.
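As an illustration of how dom0 would use the section 2.2 hypercall, the host controller glue in the dom0 kernel could register each config window before reporting the devices behind it. A Linux-side sketch: the function name is made up, and PHYSDEVOP_pci_host_bridge_add plus the struct are the ones proposed above, which would still need to be added to the kernel headers:

/* Sketch: tell Xen about one host bridge (segment + config window) before
 * any PHYSDEVOP_pci_device_add calls for devices behind it. */
static int xen_register_pci_host_bridge(uint16_t seg, u64 cfg_base, u64 cfg_size)
{
        struct physdev_pci_host_bridge_add add = {
                .seg      = seg,
                .cfg_base = cfg_base,
                .cfg_size = cfg_size,
        };

        return HYPERVISOR_physdev_op(PHYSDEVOP_pci_host_bridge_add, &add);
}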
4. DomU access / assignment PCI device
--
When a device is attached to a domU, provision has to be made such that it can access the MMIO space of the device and xen is able to identify the mapping between guest bdf and system bdf. Two hypercalls are introduced:

#define PHYSDEVOP_map_mmio 40
#define PHYSDEVOP_unmap_mmio 41
struct physdev_map_mmio {
    /* IN */
    uint64_t addr;
    uint64_t size;
};

Xen adds the mmio space to the stage2 translation for domU. The restriction is that xen creates a 1:1 mapping of the MMIO address.

#define PHYSDEVOP_map_sbdf 43
struct physdev_map_sbdf {
    int domain_id;
    int sbdf_s;
    int sbdf_b;
    int sbdf_d;
    int sbdf_f;
    int gsbdf_s;
    int gsbdf_b;
    int gsbdf_d;
    int gsbdf_f;
};

Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs = guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf.

Change in PCI FrontEnd - backend driver for MSI/X programming
-
On the Pci