Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/23/2009 11:21 PM, Gregory Haskins wrote: That said, you are still incorrect. With what I proposed, the model will run as an in-kernel vbus device, and no longer run in userspace. It would therefore improve virtio-net as I stated, much in the same way vhost-net or venet-tap do today.

That can't work. virtio-net has its own ABI on top of virtio, for example it prepends a header for TSO information. Maybe if you disable all features it becomes compatible with venet, but that cripples it.
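For reference, the per-packet header Avi is referring to is the one virtio-net prepends to every buffer to carry checksum and GSO/TSO metadata. A rough sketch of its layout follows, using stdint types so it stands alone; the real definition lives in the kernel's include/linux/virtio_net.h and uses __u8/__u16.

#include <stdint.h>

/* Rough sketch of the virtio-net per-packet header (the extra ABI on
 * top of plain virtio rings that Avi mentions).  It precedes every
 * frame on the TX and RX virtqueues. */
struct virtio_net_hdr_sketch {
    uint8_t  flags;        /* e.g. "needs checksum" */
    uint8_t  gso_type;     /* none, TCPv4, UDP, TCPv6, ECN */
    uint16_t hdr_len;      /* length of the Ethernet/IP/TCP headers */
    uint16_t gso_size;     /* payload bytes per GSO segment */
    uint16_t csum_start;   /* where checksumming begins */
    uint16_t csum_offset;  /* where to store the computed checksum */
};

With every offload feature negotiated off these fields are simply zero, which is roughly the "cripples it" configuration Avi describes.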
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/27/09 4:15 AM, Avi Kivity wrote: On 12/23/2009 11:21 PM, Gregory Haskins wrote: That said, you are still incorrect. With what I proposed, the model will run as an in-kernel vbus device, and no longer run in userspace. It would therefore improve virtio-net as I stated, much in the same way vhost-net or venet-tap do today. That can't work. virtio-net has its own ABI on top of virtio, for example it prepends a header for TSO information. Maybe if you disable all features it becomes compatible with venet, but that cripples it.

You are confused. The backend would be virtio-net specific, and would therefore understand the virtio-net ABI. It would support any feature of virtio-net as long as it was implemented and negotiated by both sides of the link.

-Greg
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/27/2009 03:18 PM, Gregory Haskins wrote: On 12/27/09 4:15 AM, Avi Kivity wrote: On 12/23/2009 11:21 PM, Gregory Haskins wrote: That said, you are still incorrect. With what I proposed, the model will run as an in-kernel vbus device, and no longer run in userspace. It would therefore improve virtio-net as I stated, much in the same way vhost-net or venet-tap do today. That can't work. virtio-net has its own ABI on top of virtio, for example it prepends a header for TSO information. Maybe if you disable all features it becomes compatible with venet, but that cripples it. You are confused. The backend would be virtio-net specific, and would therefore understand the virtio-net ABI. It would support any feature of virtio-net as long as it was implemented and negotiated by both sides of the link.

Then we're back to square one. A nice demonstration of vbus flexibility, but no help for virtio.
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/27/09 8:27 AM, Avi Kivity wrote: On 12/27/2009 03:18 PM, Gregory Haskins wrote: On 12/27/09 4:15 AM, Avi Kivity wrote: On 12/23/2009 11:21 PM, Gregory Haskins wrote: That said, you are still incorrect. With what I proposed, the model will run as an in-kernel vbus device, and no longer run in userspace. It would therefore improve virtio-net as I stated, much in the same way vhost-net or venet-tap do today. That can't work. virtio-net has its own ABI on top of virtio, for example it prepends a header for TSO information. Maybe if you disable all features it becomes compatible with venet, but that cripples it. You are confused. The backend would be virtio-net specific, and would therefore understand the virtio-net ABI. It would support any feature of virtio-net as long as it was implemented and negotiated by both sides of the link. Then we're back to square one. A nice demonstration of vbus flexibility, but no help for virtio.

No, where we are is at the point where we demonstrate that your original statement that I did nothing to improve virtio was wrong.

-Greg
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/27/2009 03:39 PM, Gregory Haskins wrote: No, where we are is at the point where we demonstrate that your original statement that I did nothing to improve virtio was wrong.

I stand by it. virtio + your patch does nothing without a ton more work (more or less equivalent to vhost-net).
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/27/09 8:49 AM, Avi Kivity wrote: On 12/27/2009 03:39 PM, Gregory Haskins wrote: No, where we are is at the point where we demonstrate that your original statement that I did nothing to improve virtio was wrong. I stand by it. virtio + your patch does nothing without a ton more work (more or less equivalent to vhost-net).

Perhaps, but my work predates vhost-net by months and that has nothing to do with what we are talking about anyway. Since you snipped your original comment that started the thread, here it is again:

On 12/23/09 5:22 AM, Avi Kivity wrote: There was no attempt by Gregory to improve virtio-net.

That statement was, is, and will always be demonstrably false, so I'm sorry but you are still wrong.

-Greg
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/23/2009 10:52 PM, Kyle Moffett wrote: On Wed, Dec 23, 2009 at 17:58, Anthony Liguori anth...@codemonkey.ws wrote: Of course, the key feature of virtio is that it makes it possible for you to create your own enumeration mechanism if you're so inclined. See... the thing is... a lot of us random embedded board developers don't *want* to create our own enumeration mechanisms. I see a huge amount of value in vbus as a common zero-copy DMA-capable virtual-device interface, especially over miscellaneous non-PCI-bus interconnects. I mentioned my PCI-E boards earlier, but I would also personally be interested in using infiniband with RDMA as a virtual device bus.

I understand what you're saying, but is there really a practical argument here? Infiniband already supports things like IPoIB and SCSI over IB. Is it necessary to add another layer on top of it? That said, it's easy enough to create a common enumeration mechanism for people to use with virtio. I doubt it's really that interesting but it's certainly quite reasonable. In fact, a lot of code could be reused from virtio-s390 or virtio-lguest.

Basically, what it comes down to is vbus is practically useful as a generic way to provide a large number of hotpluggable virtual devices across an arbitrary interconnect. I agree that virtio works fine if you have some out-of-band enumeration and hotplug transport (like emulated PCI), but if you *don't* have that, it's pretty much faster to write your own set of paired network drivers than it is to write a whole enumeration and transport stack for virtio. On top of *that*, with the virtio approach I would need to write a whole bunch of tools to manage the set of virtual devices on my custom hardware. With vbus that management interface would be entirely common code across a potentially large number of virtualized physical transports.

This particular use case really has nothing to do with virtualization. You really want an infiniband replacement using the PCI-e bus. There's so much on the horizon in this space that's being standardized in PCI-sig, like MR-IOV.

If it were me, I'd take a much different approach. I would use a very simple device with a single transmit and receive queue. I'd create a standard header, and then implement a command protocol on top of it. You'll be able to support zero copy I/O (although you'll have a fixed number of outstanding requests). You would need a single large ring. That's basically about as much work as writing entirely new network and serial drivers over PCI. The beauty of vbus for me is that I could write a fairly simple logical-to-physical glue driver which lets vbus talk over my PCI-E or infiniband link and then I'm basically done.

Is this something you expect people to use or is this a one-off project?

I personally would love to see vbus merged, into staging at the very least. I would definitely spend some time trying to make it work across PCI-E on my *very* *real* embedded boards. Look at vbus not as another virtualization ABI, but as a multiprotocol high-level device abstraction API that already has one well-implemented and high-performance user.

If someone wants to advocate vbus for non-virtualized purposes, I have no problem with that. I just don't think it makes sense for KVM. virtio is not intended to be used for any possible purpose.

Regards, Anthony Liguori
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/23/2009 05:42 PM, Ira W. Snyder wrote: I've got a single PCI Host (master) with ~20 PCI slots. Physically, it is a backplane in a cPCI chassis, but the form factor is irrelevant. It is regular PCI from a software perspective. Into this backplane, I plug up to 20 PCI Agents (slaves). They are powerpc computers, almost identical to the Freescale MPC8349EMDS board. They're full-featured powerpc computers, with CPU, RAM, etc. They can run standalone. I want to use the PCI backplane as a data transport. Specifically, I want to transport ethernet over the backplane, so I can have the powerpc boards mount their rootfs via NFS, etc. Everyone knows how to write network daemons. It is a good and very well known way to transport data between systems. On the PCI bus, the powerpc systems expose 3 PCI BARs. The size is configurable, as is the memory location at which they point. What I cannot do is get notified when a read/write hits the BAR. There is a feature on the board which allows me to generate interrupts in either direction: agent->master (PCI INTX) and master->agent (via an MMIO register). The PCI vendor ID and device ID are not configurable. One thing I cannot assume is that the PCI master system is capable of performing DMA. In my system, it is a Pentium3 class x86 machine, which has no DMA engine. However, the PowerPC systems do have DMA engines. In virtio terms, it was suggested to make the powerpc systems the virtio hosts (running the backends) and make the x86 (PCI master) the virtio guest (running virtio-net, etc.).

IMHO, virtio and vbus are both the wrong model for what you're doing. The key reason why is that virtio and vbus are generally designed around the concept that there is shared cache coherent memory from which you can use lock-less ring queues to implement efficient I/O. In your architecture, you do not have cache coherent shared memory. Instead, you have two systems connected via a PCI backplane with non-coherent shared memory. You probably need to use the shared memory as a bounce buffer and implement a driver on top of that.

I'm not sure what you're suggesting in the paragraph above. I want to use virtio-net as the transport, I do not want to write my own virtual-network driver. Can you please clarify?

virtio-net and vbus are going to be overly painful for you to use because neither end can access arbitrary memory in the other end.

Hopefully that explains what I'm trying to do. I'd love someone to help guide me in the right direction here. I want something to fill this need in mainline.

If I were you, I would write a custom network driver. virtio-net is awfully small (just a few hundred lines). I'd use that as a basis but I would not tie into virtio or vbus. The paradigms don't match.

I've been contacted separately by 10+ people also looking for a similar solution. I hunch most of them end up doing what I did: write a quick-and-dirty network driver. I've been working on this for a year, just to give an idea.

The whole architecture of having multiple heterogeneous systems on a common high speed backplane is what IBM refers to as hybrid computing. It's a model that I think will become a lot more common in the future. I think there are typically two types of hybrid models depending on whether the memory sharing is cache coherent or not. If you have coherent shared memory, the problem looks an awful lot like virtualization. If you don't have coherent shared memory, then the shared memory basically becomes a pool to bounce into and out of.

PS - should I create a new thread on the two mailing lists mentioned above? I don't want to go too far off-topic in an alacrityvm thread. :)

Couldn't hurt.

Regards, Anthony Liguori
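To make the bounce-buffer model Anthony suggests in the post above a bit more concrete: one plausible shape for it is a descriptor ring living in one of the agent's BARs, with the master's write to the master->agent MMIO doorbell standing in for the missing read/write notification Ira mentions. The sketch below is only illustrative; every name is hypothetical and it is not code from either Ira's or Anthony's drivers.

#include <stdint.h>
#include <string.h>

/* Hypothetical ring of bounce-buffer descriptors placed in one of the
 * agent's PCI BARs.  The master (which has no DMA engine) copies the
 * frame into the BAR with the CPU; the agent's DMA engine then moves
 * it into local RAM once the doorbell fires. */
#define RING_ENTRIES 64
#define BUF_SIZE     2048
#define DESC_READY   0x1

struct ring_desc {
    uint32_t offset;   /* offset of this packet's bounce buffer in the BAR */
    uint32_t len;      /* valid bytes in the buffer */
    uint32_t flags;    /* DESC_READY is written last to publish the entry */
};

struct ring {
    uint32_t head;     /* producer index, owned by the master */
    uint32_t tail;     /* consumer index, owned by the agent */
    struct ring_desc desc[RING_ENTRIES];
};

/* Master-side transmit: stage the frame in the BAR, publish a
 * descriptor, and poke the agent's master->agent doorbell register so
 * it raises a local interrupt.  In a real driver the pointers would
 * come from pci_iomap(); here they are assumed already mapped. */
static int master_tx(volatile struct ring *ring, volatile uint8_t *bar_bufs,
                     volatile uint32_t *doorbell, const void *frame,
                     uint32_t len)
{
    uint32_t head = ring->head;

    if (len > BUF_SIZE || ((head + 1) % RING_ENTRIES) == ring->tail)
        return -1;                          /* ring full or frame too big */

    memcpy((void *)(bar_bufs + head * BUF_SIZE), frame, len);
    ring->desc[head].offset = head * BUF_SIZE;
    ring->desc[head].len    = len;
    ring->desc[head].flags  = DESC_READY;   /* publish after the data */
    ring->head = (head + 1) % RING_ENTRIES;

    *doorbell = 1;                          /* master->agent interrupt */
    return 0;
}

The agent side would drain DESC_READY entries with its DMA engine and raise a PCI INTX back to the master for the receive direction; memory barriers and byte-order handling are omitted for brevity.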
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On Thu, Dec 24, 2009 at 11:09:39AM -0600, Anthony Liguori wrote: On 12/23/2009 05:42 PM, Ira W. Snyder wrote: I've got a single PCI Host (master) with ~20 PCI slots. Physically, it is a backplane in a cPCI chassis, but the form factor is irrelevant. It is regular PCI from a software perspective. Into this backplane, I plug up to 20 PCI Agents (slaves). They are powerpc computers, almost identical to the Freescale MPC8349EMDS board. They're full-featured powerpc computers, with CPU, RAM, etc. They can run standalone. I want to use the PCI backplane as a data transport. Specifically, I want to transport ethernet over the backplane, so I can have the powerpc boards mount their rootfs via NFS, etc. Everyone knows how to write network daemons. It is a good and very well known way to transport data between systems. On the PCI bus, the powerpc systems expose 3 PCI BARs. The size is configurable, as is the memory location at which they point. What I cannot do is get notified when a read/write hits the BAR. There is a feature on the board which allows me to generate interrupts in either direction: agent->master (PCI INTX) and master->agent (via an MMIO register). The PCI vendor ID and device ID are not configurable. One thing I cannot assume is that the PCI master system is capable of performing DMA. In my system, it is a Pentium3 class x86 machine, which has no DMA engine. However, the PowerPC systems do have DMA engines. In virtio terms, it was suggested to make the powerpc systems the virtio hosts (running the backends) and make the x86 (PCI master) the virtio guest (running virtio-net, etc.). IMHO, virtio and vbus are both the wrong model for what you're doing. The key reason why is that virtio and vbus are generally designed around the concept that there is shared cache coherent memory from which you can use lock-less ring queues to implement efficient I/O. In your architecture, you do not have cache coherent shared memory. Instead, you have two systems connected via a PCI backplane with non-coherent shared memory. You probably need to use the shared memory as a bounce buffer and implement a driver on top of that. I'm not sure what you're suggesting in the paragraph above. I want to use virtio-net as the transport, I do not want to write my own virtual-network driver. Can you please clarify? virtio-net and vbus are going to be overly painful for you to use because neither end can access arbitrary memory in the other end.

The PCI Agents (powerpc's) can access the lowest 4GB of the PCI Master's memory. Not all at the same time, but I have a 1GB movable window into PCI address space. I hunch Kyle's setup is similar. I've proved that virtio can work via my crossed-wires driver, hooking two virtio-net's together. With a proper in-kernel backend, I think the issues would be gone, and things would work great.

Hopefully that explains what I'm trying to do. I'd love someone to help guide me in the right direction here. I want something to fill this need in mainline. If I were you, I would write a custom network driver. virtio-net is awfully small (just a few hundred lines). I'd use that as a basis but I would not tie into virtio or vbus. The paradigms don't match.

This is exactly what I did first. I proposed it for mainline, and David Miller shot it down, saying: you're creating your own virtualization scheme, use virtio instead. Arnd Bergmann is maintaining a driver out-of-tree for some IBM cell boards which is very similar, IIRC.

In my driver, I used the PCI Agent's PCI BARs to contain ring descriptors. The PCI Agent actually handles all data transfer (via the onboard DMA engine). It works great. I'll gladly post it if you'd like to see it. In my driver, I had to use 64K MTU to get acceptable performance. I'm not entirely sure how to implement a driver that can handle scatter/gather (fragmented skb's). It clearly isn't that easy to tune a network driver for good performance. For reference, my crossed-wires virtio drivers achieved excellent performance (10x better than my custom driver) with 1500 byte MTU.

I've been contacted separately by 10+ people also looking for a similar solution. I hunch most of them end up doing what I did: write a quick-and-dirty network driver. I've been working on this for a year, just to give an idea. The whole architecture of having multiple heterogeneous systems on a common high speed backplane is what IBM refers to as hybrid computing. It's a model that I think will become a lot more common in the future. I think there are typically two types of hybrid models depending on whether the memory sharing is cache coherent or not. If you have coherent shared memory, the problem looks an awful lot like virtualization. If you don't have coherent shared memory, then the shared memory basically becomes a pool to bounce into and out of.
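On the scatter/gather question Ira raises: a common approach is to let each ring slot carry a short list of fragment addresses rather than a single buffer, so the agent's DMA engine can gather a fragmented skb directly from the master's memory without the master linearizing it first. The layout below is a hypothetical illustration, not taken from Ira's driver, and assumes all fragments fall inside the agent's window into the master's RAM.

#include <stdint.h>

#define MAX_FRAGS 18   /* assumed worst case for a fragmented skb */

/* One DMA-able piece of a packet, expressed as a PCI bus address the
 * agent's DMA engine can read from the master's memory. */
struct frag_desc {
    uint64_t bus_addr;
    uint32_t len;
};

/* One packet occupies one ring slot but may reference several
 * fragments; the agent walks the list and gathers them into one
 * contiguous local buffer before handing the frame to its stack. */
struct sg_pkt_desc {
    uint32_t nr_frags;
    uint32_t total_len;
    struct frag_desc frag[MAX_FRAGS];
};

Something along these lines is one way to avoid the 64K-MTU workaround, at the cost of tracking per-fragment DMA mappings on the master side.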
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On Wed, 2009-12-23 at 13:14 +0100, Andi Kleen wrote: http://www.redhat.com/f/pdf/summit/cwright_11_open_source_virt.pdf See slide 32. This is without vhost-net. Thanks. Do you also have latency numbers? It seems like there's definitely still potential for improvement with messages < 4K. But for the large messages they indeed look rather good. It's unclear what message size the Alacrity numbers used, but I presume it was rather large.

No. It was 1500. Please see: http://developer.novell.com/wiki/index.php/AlacrityVM/Results

Best, -PWM
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/23/09 5:22 AM, Avi Kivity wrote: There was no attempt by Gregory to improve virtio-net.

If you truly do not understand why your statement is utterly wrong at this point in the discussion, I feel sorry for you. If you are trying to be purposely disingenuous, you should be ashamed of yourself. In any case, your statement is demonstrably bogus, but you should already know this given that we talked about it at least several times. To refresh your memory: http://patchwork.kernel.org/patch/17428/

In case it's not blatantly clear, which I would hope it would be to anyone that understands the problem space: What that patch would do is allow an unmodified virtio-net to bridge to a vbus based virtio-net backend. (Also note that this predates vhost-net by months (the date in that thread is 4/9/2009) in case you are next going to try to argue that it does nothing over vhost-net). This would mean that virtio-net would gain most of the benefits I have been advocating (fewer exits, cheaper exits, concurrent execution, etc). So this would very much improve virtio-net indeed, given how poorly the current backend was performing.

I tried to convince the team to help me build it out to completion on multiple occasions, but that request was answered with sorry, we are doing our own thing instead. You can say that you didn't like my approach, since that is a subjective opinion. But to say that I didn't attempt to improve it is flat out wrong, and I do not appreciate it.

-Greg
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/23/09 12:52 PM, Peter W. Morreale wrote: On Wed, 2009-12-23 at 13:14 +0100, Andi Kleen wrote: http://www.redhat.com/f/pdf/summit/cwright_11_open_source_virt.pdf See slide 32. This is without vhost-net. Thanks. Do you also have latency numbers? It seems like there's definitely still potential for improvement with messages < 4K. But for the large messages they indeed look rather good. It's unclear what message size the Alacrity numbers used, but I presume it was rather large. No. It was 1500. Please see: http://developer.novell.com/wiki/index.php/AlacrityVM/Results

Note: 1500 was the L2 MTU, not necessarily the L3/L4 size, which was probably much larger (though I do not recall what exactly atm).

-Greg
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
* Peter W. Morreale (pmorre...@novell.com) wrote: On Wed, 2009-12-23 at 13:14 +0100, Andi Kleen wrote: http://www.redhat.com/f/pdf/summit/cwright_11_open_source_virt.pdf See slide 32. This is without vhost-net. Thanks. Do you also have latency numbers? It seems like there's definitely still potential for improvement with messages < 4K. But for the large messages they indeed look rather good. It's unclear what message size the Alacrity numbers used, but I presume it was rather large. No. It was 1500. Please see: http://developer.novell.com/wiki/index.php/AlacrityVM/Results

That's just the MTU, not the message size. We can infer the message size from the bare metal results (reasonably large), but it is helpful to record that.

thanks, -chris
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On Wed, Dec 23, 2009 at 09:09:21AM -0600, Anthony Liguori wrote: On 12/23/2009 12:15 AM, Kyle Moffett wrote: This is actually something that is of particular interest to me. I have a few prototype boards right now with programmable PCI-E host/device links on them; one of my long-term plans is to finagle vbus into providing multiple virtual devices across that single PCI-E interface. Specifically, I want to be able to provide virtual NIC(s), serial ports and serial consoles, virtual block storage, and possibly other kinds of interfaces. My big problem with existing virtio right now (although I would be happy to be proven wrong) is that it seems to need some sort of out-of-band communication channel for setting up devices, not to mention it seems to need one PCI device per virtual device. We've been thinking about doing a virtio-over-IP mechanism such that you could remote the entire virtio bus to a separate physical machine. virtio-over-IB is probably more interesting since you can make use of RDMA. virtio-over-PCI-e would work just as well.

I didn't know you were interested in this as well. See my later reply to Kyle for a lot of code that I've written with this in mind.

virtio is a layered architecture. Device enumeration/discovery sits at a lower level than the actual device ABIs. The device ABIs are implemented on top of a bulk data transfer API. The reason for this layering is so that we can reuse PCI as an enumeration/discovery mechanism. This tremendously simplifies porting drivers to other OSes and lets us use PCI hotplug automatically. We get integration into all the fancy userspace hotplug support for free. But both virtio-lguest and virtio-s390 use in-band enumeration and discovery since they do not have support for PCI on either platform.

I'm interested in the same thing, just over PCI. The only PCI agent systems I've used are not capable of manipulating the PCI configuration space in such a way that virtio-pci is usable on them. This means creating your own enumeration mechanism. Which sucks. See my virtio-phys code (http://www.mmarray.org/~iws/virtio-phys/) for an example of how I did it. It was modeled on lguest. Help is appreciated.

Ira
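For readers wondering what "creating your own enumeration mechanism" amounts to in practice, the lguest-style approach Ira modeled virtio-phys on boils down to a device table at a known location in the shared window that the driver side scans at probe time. The sketch below is purely illustrative: the layout and names are hypothetical and it is not code from virtio-phys or lguest.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical in-band enumeration table placed at a fixed offset in
 * the shared memory window.  Each entry advertises one virtio device:
 * what it is, which feature bits the backend offers, and where its
 * virtqueues live inside the window. */
#define ENUM_MAGIC  0x74726976u   /* arbitrary marker */
#define MAX_DEVICES 16

struct enum_entry {
    uint32_t device_id;   /* virtio device IDs: 1 = net, 2 = block, ... */
    uint32_t features;    /* feature bits offered by the backend */
    uint32_t vq_offset;   /* offset of this device's virtqueue area */
    uint32_t vq_pages;    /* size of that area, in pages */
};

struct enum_table {
    uint32_t magic;
    uint32_t nr_devices;
    struct enum_entry dev[MAX_DEVICES];
};

/* Probe-time scan on the driver side: validate the marker and report
 * each advertised device so a bus driver could register it. */
static void scan_enum_table(const struct enum_table *tbl)
{
    uint32_t i;

    if (tbl->magic != ENUM_MAGIC)
        return;                    /* backend not initialized yet */

    for (i = 0; i < tbl->nr_devices && i < MAX_DEVICES; i++)
        printf("device %u: id=%u features=0x%08x vqs at +0x%x\n",
               (unsigned)i, (unsigned)tbl->dev[i].device_id,
               (unsigned)tbl->dev[i].features,
               (unsigned)tbl->dev[i].vq_offset);
}

Hotplug is the part PCI gives you for free and a static table like this does not: you would still need an interrupt or polling scheme to notice entries appearing and disappearing, which is much of what Kyle points at later in the thread.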
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/23/2009 08:15 PM, Gregory Haskins wrote: On 12/23/09 5:22 AM, Avi Kivity wrote: There was no attempt by Gregory to improve virtio-net. If you truly do not understand why your statement is utterly wrong at this point in the discussion, I feel sorry for you. If you are trying to be purposely disingenuous, you should be ashamed of yourself. In any case, your statement is demonstrably bogus, but you should already know this given that we talked about it at least several times.

There's no need to feel sorry for me, thanks. There's no reason for me to be ashamed, either. And there's no need to take the discussion to personal levels. Please keep it technical.

To refresh your memory: http://patchwork.kernel.org/patch/17428/

This is not an attempt to improve virtio-net, it's an attempt to push vbus. With this, virtio-net doesn't become any faster, since the greatest bottleneck is not removed; it remains in userspace. If you wanted to improve virtio-net, you would port venet-host to the virtio-net guest/host interface, and port any secret sauce in venet(-guest) to virtio-net. After that we could judge what vbus' contribution to the equation is.

In case it's not blatantly clear, which I would hope it would be to anyone that understands the problem space: What that patch would do is allow an unmodified virtio-net to bridge to a vbus based virtio-net backend. (Also note that this predates vhost-net by months (the date in that thread is 4/9/2009) in case you are next going to try to argue that it does nothing over vhost-net).

Without the backend, it is useless. It demonstrates vbus' flexibility quite well, but does nothing for virtio-net or its users, at least without a lot more work.

This would mean that virtio-net would gain most of the benefits I have been advocating (fewer exits, cheaper exits, concurrent execution, etc). So this would very much improve virtio-net indeed, given how poorly the current backend was performing. I tried to convince the team to help me build it out to completion on multiple occasions, but that request was answered with sorry, we are doing our own thing instead. You can say that you didn't like my approach, since that is a subjective opinion. But to say that I didn't attempt to improve it is flat out wrong, and I do not appreciate it.

Cutting down on the rhetoric is more important than cutting down exits at this point in time.
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
(Sorry for top post...on a mobile)

When someone repeatedly makes a claim you believe to be wrong and you correct them, you start to wonder if that person has a less than honorable agenda. In any case, I overreacted. For that, I apologize.

That said, you are still incorrect. With what I proposed, the model will run as an in-kernel vbus device, and no longer run in userspace. It would therefore improve virtio-net as I stated, much in the same way vhost-net or venet-tap do today.

FYI I am about to log out for the long holiday, so will be unresponsive for a bit.

Kind Regards, -Greg
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On 12/23/2009 01:54 PM, Ira W. Snyder wrote: On Wed, Dec 23, 2009 at 09:09:21AM -0600, Anthony Liguori wrote: I didn't know you were interested in this as well. See my later reply to Kyle for a lot of code that I've written with this in mind.

BTW, in the future, please CC me or CC virtualizat...@lists.linux-foundation.org. Or certainly k...@vger. I never looked at the virtio-over-pci patchset although I've heard it referenced before.

But both virtio-lguest and virtio-s390 use in-band enumeration and discovery since they do not have support for PCI on either platform. I'm interested in the same thing, just over PCI. The only PCI agent systems I've used are not capable of manipulating the PCI configuration space in such a way that virtio-pci is usable on them.

virtio-pci is the wrong place to start if you want to use a PCI *device* as the virtio bus. virtio-pci is meant to use the PCI bus as the virtio bus. That's a very important requirement for us because it maintains the relationship of each device looking like a normal PCI device.

This means creating your own enumeration mechanism. Which sucks.

I don't think it sucks. The idea is that we don't want to unnecessarily reinvent things. Of course, the key feature of virtio is that it makes it possible for you to create your own enumeration mechanism if you're so inclined.

See my virtio-phys code (http://www.mmarray.org/~iws/virtio-phys/) for an example of how I did it. It was modeled on lguest. Help is appreciated.

If it were me, I'd take a much different approach. I would use a very simple device with a single transmit and receive queue. I'd create a standard header, and then implement a command protocol on top of it. You'll be able to support zero copy I/O (although you'll have a fixed number of outstanding requests). You would need a single large ring. But then again, I have no idea what your requirements are. You could probably get far treating the thing as a network device and just doing ATAoE or something like that.

Regards, Anthony Liguori
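To illustrate the "standard header plus command protocol over a single pair of queues" idea Anthony sketches above, a minimal header might look like the following. Everything here is hypothetical: it shows the shape of such a protocol rather than any existing one.

#include <stdint.h>

/* Hypothetical command set: every message on the single TX or RX ring
 * begins with cmd_hdr, and the opcode says how to interpret the rest. */
enum cmd_opcode {
    CMD_NET_TX    = 1,   /* payload is an Ethernet frame */
    CMD_NET_RX    = 2,
    CMD_BLK_READ  = 3,   /* payload is a (sector, count) request */
    CMD_BLK_WRITE = 4,
    CMD_CONSOLE   = 5,   /* payload is raw console bytes */
};

struct cmd_hdr {
    uint32_t opcode;       /* one of enum cmd_opcode */
    uint32_t tag;          /* echoed back so completions can be matched
                              to requests */
    uint32_t payload_len;  /* bytes of inline payload after the header */
    uint32_t status;       /* filled in by the responder */
};

/* Zero-copy variant: instead of carrying the payload inline, the header
 * points at a pre-registered buffer the other side can DMA to or from.
 * The fixed pool of such buffers is what bounds the number of
 * outstanding requests, as Anthony notes. */
struct cmd_hdr_zc {
    struct cmd_hdr hdr;
    uint64_t buf_addr;     /* bus address of the registered buffer */
    uint32_t buf_len;
};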
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On Wed, Dec 23, 2009 at 04:58:37PM -0600, Anthony Liguori wrote: On 12/23/2009 01:54 PM, Ira W. Snyder wrote: On Wed, Dec 23, 2009 at 09:09:21AM -0600, Anthony Liguori wrote: I didn't know you were interested in this as well. See my later reply to Kyle for a lot of code that I've written with this in mind. BTW, in the future, please CC me or CC virtualizat...@lists.linux-foundation.org. Or certainly k...@vger. I never looked at the virtio-over-pci patchset although I've heard it referenced before.

Will do. I wouldn't think k...@vger would be on-topic. I'm not interested in KVM (though I do use it constantly, it is great). I'm only interested in using virtio as a transport between physical systems. Is it a place where discussing virtio by itself is on-topic?

But both virtio-lguest and virtio-s390 use in-band enumeration and discovery since they do not have support for PCI on either platform. I'm interested in the same thing, just over PCI. The only PCI agent systems I've used are not capable of manipulating the PCI configuration space in such a way that virtio-pci is usable on them. virtio-pci is the wrong place to start if you want to use a PCI *device* as the virtio bus. virtio-pci is meant to use the PCI bus as the virtio bus. That's a very important requirement for us because it maintains the relationship of each device looking like a normal PCI device. This means creating your own enumeration mechanism. Which sucks. I don't think it sucks. The idea is that we don't want to unnecessarily reinvent things. Of course, the key feature of virtio is that it makes it possible for you to create your own enumeration mechanism if you're so inclined. See my virtio-phys code (http://www.mmarray.org/~iws/virtio-phys/) for an example of how I did it. It was modeled on lguest. Help is appreciated. If it were me, I'd take a much different approach. I would use a very simple device with a single transmit and receive queue. I'd create a standard header, and then implement a command protocol on top of it. You'll be able to support zero copy I/O (although you'll have a fixed number of outstanding requests). You would need a single large ring. But then again, I have no idea what your requirements are. You could probably get far treating the thing as a network device and just doing ATAoE or something like that.

I've got a single PCI Host (master) with ~20 PCI slots. Physically, it is a backplane in a cPCI chassis, but the form factor is irrelevant. It is regular PCI from a software perspective. Into this backplane, I plug up to 20 PCI Agents (slaves). They are powerpc computers, almost identical to the Freescale MPC8349EMDS board. They're full-featured powerpc computers, with CPU, RAM, etc. They can run standalone.

I want to use the PCI backplane as a data transport. Specifically, I want to transport ethernet over the backplane, so I can have the powerpc boards mount their rootfs via NFS, etc. Everyone knows how to write network daemons. It is a good and very well known way to transport data between systems.

On the PCI bus, the powerpc systems expose 3 PCI BARs. The size is configurable, as is the memory location at which they point. What I cannot do is get notified when a read/write hits the BAR. There is a feature on the board which allows me to generate interrupts in either direction: agent->master (PCI INTX) and master->agent (via an MMIO register). The PCI vendor ID and device ID are not configurable.

One thing I cannot assume is that the PCI master system is capable of performing DMA. In my system, it is a Pentium3 class x86 machine, which has no DMA engine. However, the PowerPC systems do have DMA engines. In virtio terms, it was suggested to make the powerpc systems the virtio hosts (running the backends) and make the x86 (PCI master) the virtio guest (running virtio-net, etc.).

I'm not sure what you're suggesting in the paragraph above. I want to use virtio-net as the transport, I do not want to write my own virtual-network driver. Can you please clarify?

Hopefully that explains what I'm trying to do. I'd love someone to help guide me in the right direction here. I want something to fill this need in mainline. I've been contacted separately by 10+ people also looking for a similar solution. I hunch most of them end up doing what I did: write a quick-and-dirty network driver. I've been working on this for a year, just to give an idea.

PS - should I create a new thread on the two mailing lists mentioned above? I don't want to go too far off-topic in an alacrityvm thread. :)

Ira
Re: [Alacrityvm-devel] [GIT PULL] AlacrityVM guest drivers for 2.6.33
On Wed, Dec 23, 2009 at 17:58, Anthony Liguori anth...@codemonkey.ws wrote: On 12/23/2009 01:54 PM, Ira W. Snyder wrote: On Wed, Dec 23, 2009 at 09:09:21AM -0600, Anthony Liguori wrote: But both virtio-lguest and virtio-s390 use in-band enumeration and discovery since they do not have support for PCI on either platform. I'm interested in the same thing, just over PCI. The only PCI agent systems I've used are not capable of manipulating the PCI configuration space in such a way that virtio-pci is usable on them. virtio-pci is the wrong place to start if you want to use a PCI *device* as the virtio bus. virtio-pci is meant to use the PCI bus as the virtio bus. That's a very important requirement for us because it maintains the relationship of each device looking like a normal PCI device. This means creating your own enumeration mechanism. Which sucks. I don't think it sucks. The idea is that we don't want to unnecessarily reinvent things. Of course, the key feature of virtio is that it makes it possible for you to create your own enumeration mechanism if you're so inclined.

See... the thing is... a lot of us random embedded board developers don't *want* to create our own enumeration mechanisms. I see a huge amount of value in vbus as a common zero-copy DMA-capable virtual-device interface, especially over miscellaneous non-PCI-bus interconnects. I mentioned my PCI-E boards earlier, but I would also personally be interested in using infiniband with RDMA as a virtual device bus.

Basically, what it comes down to is vbus is practically useful as a generic way to provide a large number of hotpluggable virtual devices across an arbitrary interconnect. I agree that virtio works fine if you have some out-of-band enumeration and hotplug transport (like emulated PCI), but if you *don't* have that, it's pretty much faster to write your own set of paired network drivers than it is to write a whole enumeration and transport stack for virtio. On top of *that*, with the virtio approach I would need to write a whole bunch of tools to manage the set of virtual devices on my custom hardware. With vbus that management interface would be entirely common code across a potentially large number of virtualized physical transports.

If vbus actually gets merged I will most likely be able to spend the time to get the PCI-E crosslinks on my boards talking vbus, otherwise it's liable to get completely shelved as not worth the effort to write all the glue to make virtio work.

See my virtio-phys code (http://www.mmarray.org/~iws/virtio-phys/) for an example of how I did it. It was modeled on lguest. Help is appreciated. If it were me, I'd take a much different approach. I would use a very simple device with a single transmit and receive queue. I'd create a standard header, and then implement a command protocol on top of it. You'll be able to support zero copy I/O (although you'll have a fixed number of outstanding requests). You would need a single large ring.

That's basically about as much work as writing entirely new network and serial drivers over PCI. The beauty of vbus for me is that I could write a fairly simple logical-to-physical glue driver which lets vbus talk over my PCI-E or infiniband link and then I'm basically done. Not only that, but the tools for adding new virtual devices (ethernet, serial, block, etc) over vbus would be the same no matter what the underlying transport.

But then again, I have no idea what your requirements are. You could probably get far treating the thing as a network device and just doing ATAoE or something like that.

<sarcasm>Oh... yes... clearly the right solution is to forgo the whole zero-copy direct DMA of block writes and instead shuffle the whole thing into 16kB ATAoE packets. That would obviously be much faster on my little 1GHz PowerPC boards.</sarcasm>

Sorry for the rant, but I really do think vbus is a valuable technology and it's a damn shame to see Gregory Haskins being put through this whole hassle. While most everybody else was griping about problems he sat down and wrote some very nice clean maintainable code to do what he needed. Not only that, but he designed a good enough model that it could be ported to run over almost everything from a single PCI-E link to an infiniband network.

I personally would love to see vbus merged, into staging at the very least. I would definitely spend some time trying to make it work across PCI-E on my *very* *real* embedded boards. Look at vbus not as another virtualization ABI, but as a multiprotocol high-level device abstraction API that already has one well-implemented and high-performance user.

Cheers, Kyle Moffett