Re: [kvm-devel] [RFC] virtio-blk PCI backend
Arnd Bergmann wrote:
> Not sure if I'm following the reasoning here. Shouldn't the method be
> inherent to the virtio bus driver?
>
> When you use a PCI based virtio bus, the natural choice would be PIO
> in some way, but you could also have a different virtio implementation
> on PCI that uses hcalls. This choice is completely up to virtio-pci.
>
> On s390, you have a different virtio backend altogether, so you always
> use DIAG or hcall instead of whatever virtio-pci does.
>
> The virtio-blk and other high-level drivers don't need to care about
> what transport the bus uses in the first place.

If you look at HPA's virtio PCI bus in lguest, it uses PCI device
organization but no other PCI features. That's where we're heading,
because we don't want a different virtio backend.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [RFC] virtio-blk PCI backend
On Tuesday 20 November 2007, Avi Kivity wrote:
> > Sorry for being late in this thread.
> > We (s390) will need a hypercall as we do not have port I/O. I think
> > it should be possible to default to hypercall on s390 and use pio
> > everywhere else.
>
> Or be generic: advertise the methods available according to host
> (kvm/x86, qemu/x86, kvm/s390) and let the guest pick.

Not sure if I'm following the reasoning here. Shouldn't the method be
inherent to the virtio bus driver?

When you use a PCI based virtio bus, the natural choice would be PIO
in some way, but you could also have a different virtio implementation
on PCI that uses hcalls. This choice is completely up to virtio-pci.

On s390, you have a different virtio backend altogether, so you always
use DIAG or hcall instead of whatever virtio-pci does.

The virtio-blk and other high-level drivers don't need to care about
what transport the bus uses in the first place.

	Arnd <><
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Christian Borntraeger wrote:
> Am Freitag, 9. November 2007 schrieb Dor Laor:
>> I believe that the network interface will quickly go to the kernel
>> since copy takes most of the cpu time and qemu does not support
>> scatter/gather dma at the moment. Nevertheless using pio seems good
>> enough; Anthony's suggestion of notifying the kernel using ioctls is
>> logical. If we run into trouble further on, we can add a hypercall
>> capability and, if it exists, use hypercalls instead of pios.
>
> Sorry for being late in this thread.
> We (s390) will need a hypercall as we do not have port I/O. I think it
> should be possible to default to hypercall on s390 and use pio
> everywhere else.

Or be generic: advertise the methods available according to host
(kvm/x86, qemu/x86, kvm/s390) and let the guest pick.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Am Freitag, 9. November 2007 schrieb Dor Laor:
> I believe that the network interface will quickly go to the kernel
> since copy takes most of the cpu time and qemu does not support
> scatter/gather dma at the moment. Nevertheless using pio seems good
> enough; Anthony's suggestion of notifying the kernel using ioctls is
> logical. If we run into trouble further on, we can add a hypercall
> capability and, if it exists, use hypercalls instead of pios.

Sorry for being late in this thread.
We (s390) will need a hypercall as we do not have port I/O. I think it
should be possible to default to hypercall on s390 and use pio
everywhere else.

Christian
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Anthony Liguori wrote:
> Avi Kivity wrote:
>>> There's no reason that the PIO operations couldn't be handled in the
>>> kernel. You'll already need some level of cooperation in userspace
>>> unless you plan on implementing the PCI bus in kernel space too.
>>> It's easy enough in the pci_map function in QEMU to just notify the
>>> kernel that it should listen on a particular PIO range.
>>
>> This is a config space write, right? If so, the range is the regular
>> 0xcf8-0xcff and it has to be very specially handled.
>
> This is a per-device IO slot and as best as I can tell, the PCI device
> advertises the size of the region and the OS then identifies a range
> of PIO space to use and tells the PCI device about it. So we would
> just need to implement a generic userspace virtio PCI device in QEMU
> that did an ioctl to the kernel when this happened to tell the kernel
> what region to listen on for a particular device.

I'll just go and read the patches more carefully before making any more
stupid remarks about the code.

>>> vmcalls will certainly get faster but I doubt that the cost
>>> difference between vmcall and pio will ever be greater than a few
>>> hundred cycles. The only performance sensitive operation here would
>>> be the kick and I don't think a few hundred cycles in the kick path
>>> is ever going to be that significant for overall performance.
>>
>> Why do you think the difference will be a few hundred cycles?
>
> The only difference in hardware between a PIO exit and a vmcall is
> that you don't have to write out an exit reason in the VMC[SB]. So the
> performance difference between PIO/vmcall shouldn't be that great (and
> if it were, the difference would probably be obvious today). That's
> different from, say, a PF exit because with a PF, you also have to
> attempt to resolve it by walking the guest page table before
> determining that you do in fact need to exit.

You have to look at the pio bitmaps with pio. Point taken, though.

>>> So why introduce the extra complexity?
>>
>> Overall I think it reduces complexity if we have in-kernel devices.
>> Anyway we can add additional signalling methods later.
>
> In-kernel virtio backends add quite a lot of complexity. Just the
> mechanism to setup the device is complicated enough. I suspect that
> it'll be necessary down the road for performance but I certainly don't
> think it's a simplification.

I didn't mean that in-kernel devices simplify things (they don't), but
that using hypercalls is simpler for in-kernel than pio.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Dor Laor wrote:
> Anthony Liguori wrote:
>> This still needs quite a lot of work but I wanted to post it for
>> reference.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>> diff --git a/qemu/Makefile.target b/qemu/Makefile.target
> ...
> Why change Rusty's coding standard? It will be harder to track changes.

Because Linux kernel coding standards aren't QEMU coding standards.
Besides, this is supposed to be an ABI so it shouldn't be changing all
that much :-)

I posted the QEMU bits as soon as I got it working. I still have a lot
to do in it; however, I have addressed some of the things you brought
up already.

>> +    case VIRTIO_PCI_QUEUE_PFN:
>> +        pa = (ram_addr_t)val << TARGET_PAGE_BITS;
>> +        vdev->vq[vdev->queue_sel].pfn = val;
>
> Some validity checks are missing, you assume you have the queue_sel.

Yes, the code is carefully written so that queue_sel is always valid.

>> +        if (pa < (ram_size - TARGET_PAGE_SIZE))
>> +            vring_init(&vdev->vq[vdev->queue_sel], phys_ram_base + pa);
>> +        break;
>> +    case VIRTIO_PCI_QUEUE_SEL:
>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>> +            vdev->queue_sel = val;
>> +        break;
>> +    case VIRTIO_PCI_QUEUE_NOTIFY:
>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>> +            virtio_ring_kick(vdev, &vdev->vq[val]);
>> +        break;
>> +    case VIRTIO_PCI_STATUS:
>> +        vdev->status = val & 0xFF;
>
> We should keep another internal status and it will track the
> initialization of all the above fields (pfn, queue_sel, ...); the
> device will be active once all of them were initialized by the guest.

Hrm, I don't follow. The only thing that has to be written to by the
guest is the PFN, which also has the effect of activating the queue.

>> +        break;
>> +    default:
>> +        if (addr >= VIRTIO_PCI_CONFIG && vdev->set_config)
>> +            vdev->set_config(vdev->opaque, addr - VIRTIO_PCI_CONFIG, val);
>> +        break;
>> +    }
>> +}
>> +
>
> What about having block/net/9p... in separate files? It will grow
> over time.

Yup, already have that in my own queue.

The latest version queue for the kernel side is at
http://hg.codemonkey.ws/virtio-pci (based on the master branch of
Rusty's virtio tree). The latest queue for the QEMU side is at
http://hg.codemonkey.ws/qemu-virtio.

I have a functioning block and 9p transport. I'll continue cleaning up
tomorrow and will hopefully post another set of patches early next
week. Unfortunately, I uncovered a bug in the in-kernel APIC code
today, so you need to run with -no-kvm-irqchip if you want to use
multiple virtio devices at once :-/

Regards,

Anthony Liguori

>> +#include
>> +#include
>> +
>> +#define BLK_MAX_QUEUE_SIZE 127
>> +
>> +static bool virtio_blk_handle_request(BlockDriverState *bs,
>> +                                      VirtIODevice *vdev, VirtQueue *vq)
>> +{
>> +    struct iovec iov[vq->vring.num];
>> +    unsigned int head, out_num, in_num, wlen;
>> +    struct virtio_blk_inhdr *in;
>> +    struct virtio_blk_outhdr *out;
>> +
>
> Great job,
> Dor.
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Anthony Liguori wrote:
> This still needs quite a lot of work but I wanted to post it for
> reference.
>
> Regards,
>
> Anthony Liguori
>
> diff --git a/qemu/Makefile.target b/qemu/Makefile.target
...
Why change Rusty's coding standard? It will be harder to track changes.

> +typedef struct VRingDesc
> +{
> +    uint64_t addr;
> +    uint32_t len;
> +    uint16_t flags;
> +    uint16_t next;
> +} VRingDesc;
> +
> +typedef struct VRingAvail
> +{
> +    uint16_t flags;
> +    uint16_t idx;
> +    uint16_t ring[0];
> +} VRingAvail;
> +
> +typedef struct VRingUsedElem
> +{
> +    uint32_t id;
> +    uint32_t len;
> +} VRingUsedElem;
> +
> +typedef struct VRingUsed
> +{
> +    uint16_t flags;
> +    uint16_t idx;
> +    VRingUsedElem ring[0];
> +} VRingUsed;
> +
> +typedef struct VRing
> +{
> +    unsigned int num;
> +    VRingDesc *desc;
> +    VRingAvail *avail;
> +    VRingUsed *used;
> +} VRing;
> +
> +static void virtio_ioport_write(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    VirtIODevice *vdev = to_virtio_device(opaque);
> +    ram_addr_t pa;
> +
> +    addr -= vdev->addr;
> +
> +    switch (addr) {
> +    case VIRTIO_PCI_GUEST_FEATURES:
> +        if (vdev->set_features)
> +            vdev->set_features(vdev->opaque, val);
> +        vdev->features = val;
> +        break;
> +    case VIRTIO_PCI_QUEUE_PFN:
> +        pa = (ram_addr_t)val << TARGET_PAGE_BITS;
> +        vdev->vq[vdev->queue_sel].pfn = val;

Some validity checks are missing, you assume you have the queue_sel.

> +        if (pa < (ram_size - TARGET_PAGE_SIZE))
> +            vring_init(&vdev->vq[vdev->queue_sel], phys_ram_base + pa);
> +        break;
> +    case VIRTIO_PCI_QUEUE_SEL:
> +        if (val < VIRTIO_PCI_QUEUE_MAX)
> +            vdev->queue_sel = val;
> +        break;
> +    case VIRTIO_PCI_QUEUE_NOTIFY:
> +        if (val < VIRTIO_PCI_QUEUE_MAX)
> +            virtio_ring_kick(vdev, &vdev->vq[val]);
> +        break;
> +    case VIRTIO_PCI_STATUS:
> +        vdev->status = val & 0xFF;

We should keep another internal status and it will track the
initialization of all the above fields (pfn, queue_sel, ...); the
device will be active once all of them were initialized by the guest.

> +        break;
> +    default:
> +        if (addr >= VIRTIO_PCI_CONFIG && vdev->set_config)
> +            vdev->set_config(vdev->opaque, addr - VIRTIO_PCI_CONFIG, val);
> +        break;
> +    }
> +}
> +

What about having block/net/9p... in separate files? It will grow over
time.

> +#include
> +#include
> +
> +#define BLK_MAX_QUEUE_SIZE 127
> +
> +static bool virtio_blk_handle_request(BlockDriverState *bs,
> +                                      VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    struct iovec iov[vq->vring.num];
> +    unsigned int head, out_num, in_num, wlen;
> +    struct virtio_blk_inhdr *in;
> +    struct virtio_blk_outhdr *out;
> +

Great job,
Dor.
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Anthony Liguori wrote:
> Avi Kivity wrote:
>>> There's no reason that the PIO operations couldn't be handled in the
>>> kernel. You'll already need some level of cooperation in userspace
>>> unless you plan on implementing the PCI bus in kernel space too.
>>> It's easy enough in the pci_map function in QEMU to just notify the
>>> kernel that it should listen on a particular PIO range.
>>
>> This is a config space write, right? If so, the range is the regular
>> 0xcf8-0xcff and it has to be very specially handled.
>
> This is a per-device IO slot and as best as I can tell, the PCI device
> advertises the size of the region and the OS then identifies a range
> of PIO space to use and tells the PCI device about it. So we would
> just need to implement a generic userspace virtio PCI device in QEMU
> that did an ioctl to the kernel when this happened to tell the kernel
> what region to listen on for a particular device.
>
>>> vmcalls will certainly get faster but I doubt that the cost
>>> difference between vmcall and pio will ever be greater than a few
>>> hundred cycles. The only performance sensitive operation here would
>>> be the kick and I don't think a few hundred cycles in the kick path
>>> is ever going to be that significant for overall performance.
>>
>> Why do you think the difference will be a few hundred cycles?
>
> The only difference in hardware between a PIO exit and a vmcall is
> that you don't have to write out an exit reason in the VMC[SB]. So the
> performance difference between PIO/vmcall shouldn't be that great (and
> if it were, the difference would probably be obvious today). That's
> different from, say, a PF exit because with a PF, you also have to
> attempt to resolve it by walking the guest page table before
> determining that you do in fact need to exit.
>
>> And if you have a large number of devices, searching the list
>> becomes expensive too.
>
> The PIO address space is relatively small. You could do a radix tree
> or even a direct array lookup if you are concerned about performance.
>
>> So why introduce the extra complexity?
>>
>> Overall I think it reduces complexity if we have in-kernel devices.
>> Anyway we can add additional signalling methods later.
>
> In-kernel virtio backends add quite a lot of complexity. Just the
> mechanism to setup the device is complicated enough. I suspect that
> it'll be necessary down the road for performance but I certainly don't
> think it's a simplification.

I believe that the network interface will quickly go to the kernel
since copy takes most of the cpu time and qemu does not support
scatter/gather dma at the moment. Nevertheless using pio seems good
enough; Anthony's suggestion of notifying the kernel using ioctls is
logical. If we run into trouble further on, we can add a hypercall
capability and, if it exists, use hypercalls instead of pios.

> Regards,
>
> Anthony Liguori
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Avi Kivity wrote:
> Anthony Liguori wrote:
>>>>>> +    case VIRTIO_PCI_QUEUE_NOTIFY:
>>>>>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>>>>>> +            virtio_ring_kick(vdev, &vdev->vq[val]);
>>>>>> +        break;
>>>>>
>>>>> I see you're not using hypercalls for this, presumably for
>>>>> compatibility with -no-kvm.
>>>>
>>>> More than just that. By sticking to PIO, we are compatible with
>>>> just about any VMM. For instance, we get Xen support for free. If
>>>> we used hypercalls, even if we agreed on a way to determine which
>>>> number to use and how to make those calls, it would still be
>>>> difficult to implement in something like Xen.
>>>
>>> But pio through the config space basically means you're committed to
>>> handling it in qemu. We want a more flexible mechanism.
>>
>> There's no reason that the PIO operations couldn't be handled in the
>> kernel. You'll already need some level of cooperation in userspace
>> unless you plan on implementing the PCI bus in kernel space too.
>> It's easy enough in the pci_map function in QEMU to just notify the
>> kernel that it should listen on a particular PIO range.
>
> With my new understanding of what this is all about, I suggest each
> virtqueue having an ID filled in by the host. This ID is globally
> unique, and is used as an argument for kick. It would map into a Xen
> domain id + event channel number, a number to be written into a pio
> port for kvm-lite or non-hypercall kvm, the argument for a kick
> hypercall on kvm, or whatever.

Yeah, right now, I maintain a virtqueue "selector" within virtio-pci
and use that for notification. This index is also exposed in the
config->find_vq() within virtio. Changing that to an opaque ID would
require introducing another mechanism to enumerate the virtqueues,
since you couldn't just start from 0 and keep going until you hit an
invalid virtqueue.

I'm not sure I'm convinced that you couldn't just hide this "id" notion
in the virtio-xen implementation if you needed to.

Regards,

Anthony Liguori

> This is independent of virtio-pci, which is good.
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Avi Kivity wrote:
>> There's no reason that the PIO operations couldn't be handled in the
>> kernel. You'll already need some level of cooperation in userspace
>> unless you plan on implementing the PCI bus in kernel space too.
>> It's easy enough in the pci_map function in QEMU to just notify the
>> kernel that it should listen on a particular PIO range.
>
> This is a config space write, right? If so, the range is the regular
> 0xcf8-0xcff and it has to be very specially handled.

This is a per-device IO slot and as best as I can tell, the PCI device
advertises the size of the region and the OS then identifies a range of
PIO space to use and tells the PCI device about it. So we would just
need to implement a generic userspace virtio PCI device in QEMU that
did an ioctl to the kernel when this happened to tell the kernel what
region to listen on for a particular device.

>> vmcalls will certainly get faster but I doubt that the cost
>> difference between vmcall and pio will ever be greater than a few
>> hundred cycles. The only performance sensitive operation here would
>> be the kick and I don't think a few hundred cycles in the kick path
>> is ever going to be that significant for overall performance.
>
> Why do you think the difference will be a few hundred cycles?

The only difference in hardware between a PIO exit and a vmcall is that
you don't have to write out an exit reason in the VMC[SB]. So the
performance difference between PIO/vmcall shouldn't be that great (and
if it were, the difference would probably be obvious today). That's
different from, say, a PF exit because with a PF, you also have to
attempt to resolve it by walking the guest page table before
determining that you do in fact need to exit.

> And if you have a large number of devices, searching the list becomes
> expensive too.

The PIO address space is relatively small. You could do a radix tree or
even a direct array lookup if you are concerned about performance.

>> So why introduce the extra complexity?
>
> Overall I think it reduces complexity if we have in-kernel devices.
> Anyway we can add additional signalling methods later.

In-kernel virtio backends add quite a lot of complexity. Just the
mechanism to setup the device is complicated enough. I suspect that
it'll be necessary down the road for performance but I certainly don't
think it's a simplification.

Regards,

Anthony Liguori
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Anthony Liguori wrote:
>>>>> +    case VIRTIO_PCI_QUEUE_NOTIFY:
>>>>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>>>>> +            virtio_ring_kick(vdev, &vdev->vq[val]);
>>>>> +        break;
>>>>
>>>> I see you're not using hypercalls for this, presumably for
>>>> compatibility with -no-kvm.
>>>
>>> More than just that. By sticking to PIO, we are compatible with just
>>> about any VMM. For instance, we get Xen support for free. If we
>>> used hypercalls, even if we agreed on a way to determine which
>>> number to use and how to make those calls, it would still be
>>> difficult to implement in something like Xen.
>>
>> But pio through the config space basically means you're committed to
>> handling it in qemu. We want a more flexible mechanism.
>
> There's no reason that the PIO operations couldn't be handled in the
> kernel. You'll already need some level of cooperation in userspace
> unless you plan on implementing the PCI bus in kernel space too. It's
> easy enough in the pci_map function in QEMU to just notify the kernel
> that it should listen on a particular PIO range.

With my new understanding of what this is all about, I suggest each
virtqueue having an ID filled in by the host. This ID is globally
unique, and is used as an argument for kick. It would map into a Xen
domain id + event channel number, a number to be written into a pio
port for kvm-lite or non-hypercall kvm, the argument for a kick
hypercall on kvm, or whatever.

This is independent of virtio-pci, which is good.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>> Avi Kivity wrote:
>>>> Anthony Liguori wrote:
>>>>> +    case VIRTIO_PCI_QUEUE_NOTIFY:
>>>>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>>>>> +            virtio_ring_kick(vdev, &vdev->vq[val]);
>>>>> +        break;
>>>>
>>>> I see you're not using hypercalls for this, presumably for
>>>> compatibility with -no-kvm.
>>>
>>> More than just that. By sticking to PIO, we are compatible with just
>>> about any VMM. For instance, we get Xen support for free. If we
>>> used hypercalls, even if we agreed on a way to determine which
>>> number to use and how to make those calls, it would still be
>>> difficult to implement in something like Xen.
>>
>> But pio through the config space basically means you're committed to
>> handling it in qemu. We want a more flexible mechanism.
>
> There's no reason that the PIO operations couldn't be handled in the
> kernel. You'll already need some level of cooperation in userspace
> unless you plan on implementing the PCI bus in kernel space too. It's
> easy enough in the pci_map function in QEMU to just notify the kernel
> that it should listen on a particular PIO range.

This is a config space write, right? If so, the range is the regular
0xcf8-0xcff and it has to be very specially handled.

>> Detecting how to make hypercalls can be left to paravirt_ops.
>>
>> (for Xen you'd use an event channel; and for kvm the virtio kick
>> hypercall).
>>
>>>> Well I think I have a solution: advertise vmcall, vmmcall, pio to
>>>> some port, and int $some_vector as hypercall feature bits in cpuid
>>>> (for kvm, kvm, qemu, and kvm-lite respectively). Early setup code
>>>> could patch the instruction as appropriate (I hear code patching
>>>> is now taught in second grade).
>>>
>>> That ties our device to our particular hypercall implementation. If
>>> we were going to do this, I'd prefer to advertise it in the device,
>>> I think. I really would need to look at the performance, though, of
>>> vmcall and an edge triggered interrupt. It would have to be pretty
>>> compelling to warrant the additional complexity, I think.
>>
>> vmcall costs will go down, and we don't want to use different
>> mechanisms for high bandwidth and low bandwidth devices.
>
> vmcalls will certainly get faster but I doubt that the cost difference
> between vmcall and pio will ever be greater than a few hundred cycles.
> The only performance sensitive operation here would be the kick and I
> don't think a few hundred cycles in the kick path is ever going to be
> that significant for overall performance.

Why do you think the difference will be a few hundred cycles? And if
you have a large number of devices, searching the list becomes
expensive too.

> So why introduce the extra complexity?

Overall I think it reduces complexity if we have in-kernel devices.
Anyway we can add additional signalling methods later.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> Anthony Liguori wrote:
>>>> +    case VIRTIO_PCI_QUEUE_NOTIFY:
>>>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>>>> +            virtio_ring_kick(vdev, &vdev->vq[val]);
>>>> +        break;
>>>
>>> I see you're not using hypercalls for this, presumably for
>>> compatibility with -no-kvm.
>>
>> More than just that. By sticking to PIO, we are compatible with just
>> about any VMM. For instance, we get Xen support for free. If we used
>> hypercalls, even if we agreed on a way to determine which number to
>> use and how to make those calls, it would still be difficult to
>> implement in something like Xen.
>
> But pio through the config space basically means you're committed to
> handling it in qemu. We want a more flexible mechanism.

There's no reason that the PIO operations couldn't be handled in the
kernel. You'll already need some level of cooperation in userspace
unless you plan on implementing the PCI bus in kernel space too. It's
easy enough in the pci_map function in QEMU to just notify the kernel
that it should listen on a particular PIO range.

> Detecting how to make hypercalls can be left to paravirt_ops.
>
> (for Xen you'd use an event channel; and for kvm the virtio kick
> hypercall).
>
>>> Well I think I have a solution: advertise vmcall, vmmcall, pio to
>>> some port, and int $some_vector as hypercall feature bits in cpuid
>>> (for kvm, kvm, qemu, and kvm-lite respectively). Early setup code
>>> could patch the instruction as appropriate (I hear code patching is
>>> now taught in second grade).
>>
>> That ties our device to our particular hypercall implementation. If
>> we were going to do this, I'd prefer to advertise it in the device,
>> I think. I really would need to look at the performance, though, of
>> vmcall and an edge triggered interrupt. It would have to be pretty
>> compelling to warrant the additional complexity, I think.
>
> vmcall costs will go down, and we don't want to use different
> mechanisms for high bandwidth and low bandwidth devices.

vmcalls will certainly get faster but I doubt that the cost difference
between vmcall and pio will ever be greater than a few hundred cycles.
The only performance sensitive operation here would be the kick and I
don't think a few hundred cycles in the kick path is ever going to be
that significant for overall performance.

So why introduce the extra complexity?

Regards,

Anthony Liguori
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>> +    case VIRTIO_PCI_QUEUE_NOTIFY:
>>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>>> +            virtio_ring_kick(vdev, &vdev->vq[val]);
>>> +        break;
>>
>> I see you're not using hypercalls for this, presumably for
>> compatibility with -no-kvm.
>
> More than just that. By sticking to PIO, we are compatible with just
> about any VMM. For instance, we get Xen support for free. If we used
> hypercalls, even if we agreed on a way to determine which number to
> use and how to make those calls, it would still be difficult to
> implement in something like Xen.

But pio through the config space basically means you're committed to
handling it in qemu. We want a more flexible mechanism.

Detecting how to make hypercalls can be left to paravirt_ops.

(for Xen you'd use an event channel; and for kvm the virtio kick
hypercall).

>> Well I think I have a solution: advertise vmcall, vmmcall, pio to
>> some port, and int $some_vector as hypercall feature bits in cpuid
>> (for kvm, kvm, qemu, and kvm-lite respectively). Early setup code
>> could patch the instruction as appropriate (I hear code patching is
>> now taught in second grade).
>
> That ties our device to our particular hypercall implementation. If
> we were going to do this, I'd prefer to advertise it in the device,
> I think. I really would need to look at the performance, though, of
> vmcall and an edge triggered interrupt. It would have to be pretty
> compelling to warrant the additional complexity, I think.

vmcall costs will go down, and we don't want to use different
mechanisms for high bandwidth and low bandwidth devices.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Avi Kivity wrote:
> Anthony Liguori wrote:
>> +    case VIRTIO_PCI_QUEUE_NOTIFY:
>> +        if (val < VIRTIO_PCI_QUEUE_MAX)
>> +            virtio_ring_kick(vdev, &vdev->vq[val]);
>> +        break;
>
> I see you're not using hypercalls for this, presumably for
> compatibility with -no-kvm.

More than just that. By sticking to PIO, we are compatible with just
about any VMM. For instance, we get Xen support for free. If we used
hypercalls, even if we agreed on a way to determine which number to use
and how to make those calls, it would still be difficult to implement
in something like Xen.

> Well I think I have a solution: advertise vmcall, vmmcall, pio to
> some port, and int $some_vector as hypercall feature bits in cpuid
> (for kvm, kvm, qemu, and kvm-lite respectively). Early setup code
> could patch the instruction as appropriate (I hear code patching is
> now taught in second grade).

That ties our device to our particular hypercall implementation. If we
were going to do this, I'd prefer to advertise it in the device, I
think. I really would need to look at the performance, though, of
vmcall and an edge triggered interrupt. It would have to be pretty
compelling to warrant the additional complexity, I think.

Regards,

Anthony Liguori

> (kvm could advertise all four, or maybe just the first two)
Re: [kvm-devel] [RFC] virtio-blk PCI backend
Anthony Liguori wrote:
> +    case VIRTIO_PCI_QUEUE_NOTIFY:
> +        if (val < VIRTIO_PCI_QUEUE_MAX)
> +            virtio_ring_kick(vdev, &vdev->vq[val]);
> +        break;

I see you're not using hypercalls for this, presumably for
compatibility with -no-kvm. Well I think I have a solution: advertise
vmcall, vmmcall, pio to some port, and int $some_vector as hypercall
feature bits in cpuid (for kvm, kvm, qemu, and kvm-lite respectively).
Early setup code could patch the instruction as appropriate (I hear
code patching is now taught in second grade).

(kvm could advertise all four, or maybe just the first two)

-- 
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.