Re: Paravirtualization on VMware's Platform [VMI].
Alok Kataria wrote:
> > What do you suggest would be the right time ?
>
> Please note that the next major release of VMware's product will not
> have this supported. Also that, most of our customers will actually be
> running some distro's enterprise release, rather than running the
> cutting edge kernel. So IMO there is still a window of around 1-1.5
> years, until a customer actually sees a kernel which has dropped VMI
> support.

I would say it might make sense pulling it out around the end of 2010,
which would be about 6 kernel releases from now -- 2.6.37.

	-hpa

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: Paravirtualization on VMware's Platform [VMI].
On Tue, 2009-09-22 at 14:27 -0700, H. Peter Anvin wrote:
> Alok Kataria wrote:
> > Hi Ingo,
> >
> > On Sun, 2009-09-20 at 00:42 -0700, Ingo Molnar wrote:
> >
> >> The thing is, the overwhelming majority of vmware users don't benefit
> >> from hardware features like nested page tables yet. So this needs to be
> >> done _way_ more carefully, with a proper sunset period of a couple of
> >> kernel cycles.
> >
> > I am fine with that too. Below is a patch which adds notes in
> > feature-removal-schedule.txt; I have marked it for removal from 2.6.34.
> > Please consider this patch for 2.6.32.
>
> This seems way, way too early still.

What do you suggest would be the right time? Please note that the next
major release of VMware's product will not have this supported. Also,
most of our customers will actually be running some distro's enterprise
release, rather than running the cutting-edge kernel. So IMO there is
still a window of around 1-1.5 years until a customer actually sees a
kernel which has dropped VMI support.

Thanks,
Alok
Re: Paravirtualization on VMware's Platform [VMI].
Alok Kataria wrote:
> Hi Ingo,
>
> On Sun, 2009-09-20 at 00:42 -0700, Ingo Molnar wrote:
>
>> The thing is, the overwhelming majority of vmware users don't benefit
>> from hardware features like nested page tables yet. So this needs to be
>> done _way_ more carefully, with a proper sunset period of a couple of
>> kernel cycles.
>
> I am fine with that too. Below is a patch which adds notes in
> feature-removal-schedule.txt; I have marked it for removal from 2.6.34.
> Please consider this patch for 2.6.32.

This seems way, way too early still.

	-hpa
Re: Paravirtualization on VMware's Platform [VMI].
On 09/22/09 12:30, Alok Kataria wrote:
> We can certainly look at removing some paravirt hooks which are only
> used by VMI. Not sure if there are any but will take a look when we
> actually remove VMI.

There are a couple:

    * pte_update_defer
    * alloc_pmd_clone

lguest appears to still use pte_update(), but I suspect its two
callsites could be recast in the form of other existing pvops.

    J
Re: Paravirtualization on VMware's Platform [VMI].
Hi Ingo,

On Sun, 2009-09-20 at 00:42 -0700, Ingo Molnar wrote:
>
> The thing is, the overwhelming majority of vmware users don't benefit
> from hardware features like nested page tables yet. So this needs to be
> done _way_ more carefully, with a proper sunset period of a couple of
> kernel cycles.

I am fine with that too. Below is a patch which adds notes in
feature-removal-schedule.txt; I have marked it for removal from 2.6.34.
Please consider this patch for 2.6.32.

> If we were able to rip out all (or most) of paravirt from arch/x86 it
> would be tempting for other technical reasons - but the patch above is
> well localized.

We can certainly look at removing some paravirt hooks which are only
used by VMI. Not sure if there are any, but will take a look when we
actually remove VMI.

Thanks,
Alok

--

Mark VMI for deprecation in feature-removal-schedule.txt.

From: Alok N Kataria

Add text in feature-removal-schedule.txt and also modify Kconfig to
disable VMI by default. Patch on top of tip/master.

Details about VMware's plan about retiring VMI can be found here:
http://blogs.vmware.com/guestosguide/2009/09/vmi-retirement.html
---
 Documentation/feature-removal-schedule.txt |   24 ++++++++++++++++++++
 arch/x86/Kconfig                           |    8 ++++---
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index fa75220..b985328 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -459,3 +459,27 @@ Why:	OSS sound_core grabs all legacy minors (0-255) of SOUND_MAJOR
 	will also allow making ALSA OSS emulation independent of
 	sound_core.  The dependency will be broken then too.
 Who:	Tejun Heo
+
+
+
+What:	Support for VMware's guest paravirtualization technique [VMI] will be
+	dropped.
+When:	2.6.34
+Why:	With the recent innovations in CPU hardware acceleration technologies
+	from Intel and AMD, VMware ran a few experiments to compare these
+	techniques to the guest paravirtualization technique on VMware's
+	platform. These hardware assisted virtualization techniques have
+	outperformed the performance benefits provided by VMI in most of the
+	workloads. VMware expects that these hardware features will be
+	ubiquitous in a couple of years; as a result, VMware has started a
+	phased retirement of this feature from the hypervisor. We will be
+	removing this feature from the kernel too, in a couple of releases.
+	Please note that VMI has always been an optimization and non-VMI
+	kernels still work fine on VMware's platform.
+
+	For more details about VMI retirement take a look at this,
+	http://blogs.vmware.com/guestosguide/2009/09/vmi-retirement.html
+
+Who:	Alok N Kataria
+
+
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e214f45..1f3e156 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -485,14 +485,16 @@ if PARAVIRT_GUEST
 source "arch/x86/xen/Kconfig"
 
 config VMI
-	bool "VMI Guest support"
-	select PARAVIRT
-	depends on X86_32
+	bool "VMI Guest support [will be deprecated soon]"
+	default n
+	depends on X86_32 && PARAVIRT
 	---help---
 	  VMI provides a paravirtualized interface to the VMware ESX server
 	  (it could be used by other hypervisors in theory too, but is not
 	  at the moment), by linking the kernel to a GPL-ed ROM module
 	  provided by the hypervisor.
+	  VMware has started a phased retirement of this feature from their
+	  products. Please see feature-removal-schedule.txt for details.
 
 config KVM_CLOCK
 	bool "KVM paravirtualized clock"
Re: Paravirtualization on VMware's Platform [VMI].
On 09/22/09 12:04, Ingo Molnar wrote:
> Sorry for being dense, but what does that mean precisely? No available
> hardware? Xen doesn't run?

Nobody has implemented hybrid PV mode yet, so we haven't got anything to
measure. Also, I don't think there have been very many measurements of
Linux HVM (full virtualization) Xen guests, because Linux is typically
run paravirtualized and HVM support is primarily tuned for Windows
guests.

    J
Re: Paravirtualization on VMware's Platform [VMI].
* Jeremy Fitzhardinge wrote:

> On 09/22/09 11:02, Ingo Molnar wrote:
> > obviously they are workload dependent - that's why numbers were
> > posted in this thread with various workloads. Do you concur with
> > those conclusions that they are generally a speedup over paravirt?
> > If not, which are the workloads where paravirt offers significant
> > speedup over hardware acceleration?
>
> We're not in a position to do any useful measurements yet.

Sorry for being dense, but what does that mean precisely? No available
hardware? Xen doesn't run?

	Ingo
Re: [RFC] Virtual Machine Device Queues(VMDq) support on KVM
On Tuesday 22 September 2009, Stephen Hemminger wrote:
> > My idea for that was to open multiple file descriptors to the same
> > macvtap device and let the kernel figure out the right thing to
> > do with that. You can do the same with raw packet sockets in case
> > of vhost_net, but I wouldn't want to add more complexity to the
> > tun/tap driver for this.
>
> Or get tap out of the way entirely. The packets should not have
> to go out to user space at all (see veth)

How does veth relate to that, do you mean vhost_net? With vhost_net, you
could still open multiple sockets, only the access is in the kernel.

Obviously, once it all is in the kernel, that could be done under the
covers, but I think it would be cleaner to treat vhost_net purely as a
way to bypass the syscalls for user space, with as little as possible
visible impact otherwise.

	Arnd <><
Re: Paravirtualization on VMware's Platform [VMI].
On 09/22/09 11:02, Ingo Molnar wrote:
> obviously they are workload dependent - that's why numbers were posted
> in this thread with various workloads. Do you concur with those
> conclusions that they are generally a speedup over paravirt? If not,
> which are the workloads where paravirt offers significant speedup over
> hardware acceleration?

We're not in a position to do any useful measurements yet.

    J
Re: Paravirtualization on VMware's Platform [VMI].
* Jeremy Fitzhardinge wrote:

> On 09/22/09 01:09, Ingo Molnar wrote:
> >>> kvm will be removing the pvmmu support soon; and Xen is talking about
> >>> running paravirtualized guests in a vmx/svm container where they don't
> >>> need most of the hooks.
> >>
> >> We have no plans to drop support for non-vmx/svm capable processors,
> >> let alone require ept/npt.
> >
> > But, just to map out our plans for the future, do you concur with
> > the statements and numbers offered here by the VMware and KVM folks
> > that on sufficiently recent hardware, hardware-assisted
> > virtualization outperforms paravirt_ops in many (most?) workloads?
>
> Well, what Avi is referring to here is some discussions about a hybrid
> paravirtualized mode, in which Xen runs a normal Xen PV guest within a
> hardware container in order to get some immediate optimisations, and
> allow further optimisations like using hardware assisted paging
> extensions.
>
> For KVM and VMI, which always use a shadow pagetable scheme, hardware
> paging is now unambiguously better than shadow pagetables, but for Xen
> PV guests the picture is mixed since they don't use shadow pagetables.
> The NPT/EPT extensions make updating the pagetable more efficient, but
> actual access is more expensive because of the higher load on the TLB
> and the increased expense of a TLB miss, so the actual performance
> effects are very workload dependent.

obviously they are workload dependent - that's why numbers were posted
in this thread with various workloads. Do you concur with those
conclusions that they are generally a speedup over paravirt? If not,
which are the workloads where paravirt offers significant speedup over
hardware acceleration?

	Ingo
Re: Paravirtualization on VMware's Platform [VMI].
On 09/22/09 00:22, Rusty Russell wrote:
> When they're all gone, even I don't think lguest is sufficient excuse
> to keep CONFIG_PARAVIRT. Oh well. But that will probably be a while.

/Solidarność/!

    J
Re: Paravirtualization on VMware's Platform [VMI].
On 09/22/09 01:09, Ingo Molnar wrote:
>>> kvm will be removing the pvmmu support soon; and Xen is talking about
>>> running paravirtualized guests in a vmx/svm container where they don't
>>> need most of the hooks.
>>
>> We have no plans to drop support for non-vmx/svm capable processors,
>> let alone require ept/npt.
>
> But, just to map out our plans for the future, do you concur with the
> statements and numbers offered here by the VMware and KVM folks that
> on sufficiently recent hardware, hardware-assisted virtualization
> outperforms paravirt_ops in many (most?) workloads?

Well, what Avi is referring to here is some discussions about a hybrid
paravirtualized mode, in which Xen runs a normal Xen PV guest within a
hardware container in order to get some immediate optimisations, and
allow further optimisations like using hardware assisted paging
extensions.

For KVM and VMI, which always use a shadow pagetable scheme, hardware
paging is now unambiguously better than shadow pagetables, but for Xen
PV guests the picture is mixed since they don't use shadow pagetables.
The NPT/EPT extensions make updating the pagetable more efficient, but
actual access is more expensive because of the higher load on the TLB
and the increased expense of a TLB miss, so the actual performance
effects are very workload dependent.

> I.e. paravirt_ops becomes a legacy hardware thing, not a core component
> of the design of arch/x86/.
>
> (with a long obsoletion period, of course.)

I expect we'll eventually get to the point that the performance delta
and the installed userbase will no longer justify the effort in
maintaining the full set of pvops hooks. But I don't have a good
feeling for when that might be.

    J
Re: [RFC] Virtual Machine Device Queues(VMDq) support on KVM
On Tue, 22 Sep 2009 13:50:54 +0200
Arnd Bergmann wrote:

> On Tuesday 22 September 2009, Michael S. Tsirkin wrote:
> > > > More importantly, when virtualization is used with multi-queue
> > > > NICs the virtio-net NIC is a single CPU bottleneck. The virtio-net
> > > > NIC should preserve the parallelism (lock free) using multiple
> > > > receive/transmit queues. The number of queues should equal the
> > > > number of CPUs.
> > >
> > > Yup, multiqueue virtio is on todo list ;-)
> >
> > Note we'll need multiqueue tap for that to help.
>
> My idea for that was to open multiple file descriptors to the same
> macvtap device and let the kernel figure out the right thing to
> do with that. You can do the same with raw packet sockets in case
> of vhost_net, but I wouldn't want to add more complexity to the
> tun/tap driver for this.
>
> 	Arnd <><

Or get tap out of the way entirely. The packets should not have to go
out to user space at all (see veth).
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On 09/22/2009 06:25 PM, Ira W. Snyder wrote:
>> Yes. vbus is more finely layered so there is less code duplication.
>>
>> The virtio layering was more or less dictated by Xen which doesn't have
>> shared memory (it uses grant references instead). As a matter of fact
>> lguest, kvm/pci, and kvm/s390 all have shared memory, as you do, so that
>> part is duplicated. It's probably possible to add a virtio-shmem.ko
>> library that people who do have shared memory can reuse.
>
> Seems like a nice benefit of vbus.

Yes, it is. With some work virtio can gain that too (virtio-shmem.ko).

>>> I've given it some thought, and I think that running vhost-net (or
>>> similar) on the ppc boards, with virtio-net on the x86 crate server will
>>> work. The virtio-ring abstraction is almost good enough to work for this
>>> situation, but I had to re-invent it to work with my boards.
>>>
>>> I've exposed a 16K region of memory as PCI BAR1 from my ppc board.
>>> Remember that this is the "host" system. I used each 4K block as a
>>> "device descriptor" which contains:
>>>
>>> 1) the type of device, config space, etc. for virtio
>>> 2) the "desc" table (virtio memory descriptors, see virtio-ring)
>>> 3) the "avail" table (available entries in the desc table)
>>
>> Won't access from x86 be slow to this memory (on the other hand, if you
>> change it to main memory, access from ppc will be slow... really depends
>> on how your system is tuned).
>
> Writes across the bus are fast, reads across the bus are slow. These are
> just the descriptor tables for memory buffers, not the physical memory
> buffers themselves.
>
> These only need to be written by the guest (x86), and read by the host
> (ppc). The host never changes the tables, so we can cache a copy in the
> guest, for a fast detach_buf() implementation (see virtio-ring, which
> I'm copying the design from).
>
> The only accesses are writes across the PCI bus. There is never a need
> to do a read (except for slow-path configuration).

Okay, sounds like what you're doing is optimal then.

> In the spirit of "post early and often", I'm making my code available,
> that's all. I'm asking anyone interested for some review, before I have
> to re-code this for about the fifth time now. I'm trying to avoid
> Haskins' situation, where he's invented and debugged a lot of new code,
> and then been told to do it completely differently.
>
> Yes, the code I posted is only compile-tested, because quite a lot of
> code (kernel and userspace) must be working before anything works at
> all. I hate to design the whole thing, then be told that something
> fundamental about it is wrong, and have to completely re-write it.

Understood. Best to get a review from Rusty then.

-- 
error compiling committee.c: too many arguments to function
Re: [PATCH] virtio_console: Add support for multiple ports for generic guest and host communication
On (Tue) Sep 22 2009 [12:14:04], Rusty Russell wrote:
> On Sat, 12 Sep 2009 01:30:10 am Alan Cox wrote:
> > > The interface presented to guest userspace is of a simple char
> > > device, so it can be used like this:
> > >
> > >   fd = open("/dev/vcon2", O_RDWR);
> > >   ret = read(fd, buf, 100);
> > >   ret = write(fd, string, strlen(string));
> > >
> > > Each port is to be assigned a unique function, for example, the
> > > first 4 ports may be reserved for libvirt usage, the next 4 for
> > > generic streaming data and so on. This port-function mapping
> > > isn't finalised yet.
> >
> > Unless I am missing something this looks completely bonkers.
> >
> > Every time we have a table of numbers for functionality it ends in
> > tears. We have to keep tables up to date and managed, we have to
> > administer the magical number to name space.
>
> The number comes from the ABI; we need some identifier for the different
> ports. Amit started using names, and I said "just use numbers"; they have
> to be documented and agreed by all clients anyway.
>
> ie. the host says "here's a port id 7", which might be the cut & paste
> port or whatever.

Yeah; port 0 has to be reserved for a console (and then we might need to
do a bit more for multiple consoles -- hvc operates on a 'vtermno', so
we need to allocate them as well).

Also, a 'name' property can be attached to ports, as has been suggested:

  qemu ... -device virtconport,name=org.qemu.clipboard,port=3,...

spawns a port at id 3 and the guest will also place a file:

  /sys/class/virtio-console/vcon3/name

which has "org.qemu.clipboard" as contents, so udev scripts could create
a symlink:

  /dev/vcon/org.qemu.clipboard -> /dev/vcon3

Amit
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On Tue, Sep 22, 2009 at 12:43:36PM +0300, Avi Kivity wrote:
> On 09/22/2009 12:43 AM, Ira W. Snyder wrote:
> >> Sure, virtio-ira and he is on his own to make a bus-model under that, or
> >> virtio-vbus + vbus-ira-connector to use the vbus framework. Either
> >> model can work, I agree.
> >
> > Yes, I'm having to create my own bus model, a-la lguest, virtio-pci, and
> > virtio-s390. It isn't especially easy. I can steal lots of code from the
> > lguest bus model, but sometimes it is good to generalize, especially
> > after the fourth implementation or so. I think this is what GHaskins
> > tried to do.
>
> Yes. vbus is more finely layered so there is less code duplication.
>
> The virtio layering was more or less dictated by Xen which doesn't have
> shared memory (it uses grant references instead). As a matter of fact
> lguest, kvm/pci, and kvm/s390 all have shared memory, as you do, so that
> part is duplicated. It's probably possible to add a virtio-shmem.ko
> library that people who do have shared memory can reuse.

Seems like a nice benefit of vbus.

> > I've given it some thought, and I think that running vhost-net (or
> > similar) on the ppc boards, with virtio-net on the x86 crate server will
> > work. The virtio-ring abstraction is almost good enough to work for this
> > situation, but I had to re-invent it to work with my boards.
> >
> > I've exposed a 16K region of memory as PCI BAR1 from my ppc board.
> > Remember that this is the "host" system. I used each 4K block as a
> > "device descriptor" which contains:
> >
> > 1) the type of device, config space, etc. for virtio
> > 2) the "desc" table (virtio memory descriptors, see virtio-ring)
> > 3) the "avail" table (available entries in the desc table)
>
> Won't access from x86 be slow to this memory (on the other hand, if you
> change it to main memory, access from ppc will be slow... really depends
> on how your system is tuned).

Writes across the bus are fast, reads across the bus are slow. These are
just the descriptor tables for memory buffers, not the physical memory
buffers themselves.

These only need to be written by the guest (x86), and read by the host
(ppc). The host never changes the tables, so we can cache a copy in the
guest, for a fast detach_buf() implementation (see virtio-ring, which
I'm copying the design from).

The only accesses are writes across the PCI bus. There is never a need
to do a read (except for slow-path configuration).

> > Parts 2 and 3 are repeated three times, to allow for a maximum of three
> > virtqueues per device. This is good enough for all current drivers.
>
> The plan is to switch to multiqueue soon. Will not affect you if your
> boards are uniprocessor or small smp.

Everything I have is UP. I don't need extreme performance, either.
40MB/sec is the minimum I need to reach, though I'd like to have some
headroom. For reference, using the CPU to handle data transfers, I get
~2MB/sec transfers. Using the DMA engine, I've hit about 60MB/sec with
my "crossed-wires" virtio-net.

> > I've gotten plenty of email about this from lots of interested
> > developers. There are people who would like this kind of system to just
> > work, while having to write just some glue for their device, just like a
> > network driver. I hunch most people have created some proprietary mess
> > that basically works, and left it at that.
>
> So long as you keep the system-dependent features hookable or
> configurable, it should work.
>
> > So, here is a desperate cry for help. I'd like to make this work, and
> > I'd really like to see it in mainline. I'm trying to give back to the
> > community from which I've taken plenty.
>
> Not sure who you're crying for help to. Once you get this working, post
> patches. If the patches are reasonably clean and don't impact
> performance for the main use case, and if you can show the need, I
> expect they'll be merged.

In the spirit of "post early and often", I'm making my code available,
that's all. I'm asking anyone interested for some review, before I have
to re-code this for about the fifth time now. I'm trying to avoid
Haskins' situation, where he's invented and debugged a lot of new code,
and then been told to do it completely differently.

Yes, the code I posted is only compile-tested, because quite a lot of
code (kernel and userspace) must be working before anything works at
all. I hate to design the whole thing, then be told that something
fundamental about it is wrong, and have to completely re-write it.

Thanks for the comments,
Ira

> --
> error compiling committee.c: too many arguments to function
Re: [RFC] Virtual Machine Device Queues(VMDq) support on KVM
On Tuesday 22 September 2009, Michael S. Tsirkin wrote:
> > > More importantly, when virtualization is used with multi-queue
> > > NICs the virtio-net NIC is a single CPU bottleneck. The virtio-net
> > > NIC should preserve the parallelism (lock free) using multiple
> > > receive/transmit queues. The number of queues should equal the
> > > number of CPUs.
> >
> > Yup, multiqueue virtio is on todo list ;-)
>
> Note we'll need multiqueue tap for that to help.

My idea for that was to open multiple file descriptors to the same
macvtap device and let the kernel figure out the right thing to
do with that. You can do the same with raw packet sockets in case
of vhost_net, but I wouldn't want to add more complexity to the
tun/tap driver for this.

	Arnd <><
Re: [RFC] Virtual Machine Device Queues(VMDq) support on KVM
On Mon, Sep 21, 2009 at 09:27:18AM -0700, Chris Wright wrote:
> * Stephen Hemminger (shemmin...@vyatta.com) wrote:
> > On Mon, 21 Sep 2009 16:37:22 +0930
> > Rusty Russell wrote:
> >
> > > > > Actually this framework can apply to traditional network adapters
> > > > > which have just one tx/rx queue pair. And applications using the
> > > > > same user/kernel interface can utilize this framework to
> > > > > send/receive network traffic directly thru a tx/rx queue pair in
> > > > > a network adapter.
> >
> > More importantly, when virtualization is used with multi-queue
> > NICs the virtio-net NIC is a single CPU bottleneck. The virtio-net
> > NIC should preserve the parallelism (lock free) using multiple
> > receive/transmit queues. The number of queues should equal the
> > number of CPUs.
>
> Yup, multiqueue virtio is on todo list ;-)
>
> thanks,
> -chris

Note we'll need multiqueue tap for that to help.

-- 
MST
Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
On 09/22/2009 12:43 AM, Ira W. Snyder wrote:
>> Sure, virtio-ira and he is on his own to make a bus-model under that, or
>> virtio-vbus + vbus-ira-connector to use the vbus framework. Either
>> model can work, I agree.
>
> Yes, I'm having to create my own bus model, a-la lguest, virtio-pci, and
> virtio-s390. It isn't especially easy. I can steal lots of code from the
> lguest bus model, but sometimes it is good to generalize, especially
> after the fourth implementation or so. I think this is what GHaskins
> tried to do.

Yes. vbus is more finely layered so there is less code duplication.

The virtio layering was more or less dictated by Xen which doesn't have
shared memory (it uses grant references instead). As a matter of fact
lguest, kvm/pci, and kvm/s390 all have shared memory, as you do, so that
part is duplicated. It's probably possible to add a virtio-shmem.ko
library that people who do have shared memory can reuse.

> I've given it some thought, and I think that running vhost-net (or
> similar) on the ppc boards, with virtio-net on the x86 crate server will
> work. The virtio-ring abstraction is almost good enough to work for this
> situation, but I had to re-invent it to work with my boards.
>
> I've exposed a 16K region of memory as PCI BAR1 from my ppc board.
> Remember that this is the "host" system. I used each 4K block as a
> "device descriptor" which contains:
>
> 1) the type of device, config space, etc. for virtio
> 2) the "desc" table (virtio memory descriptors, see virtio-ring)
> 3) the "avail" table (available entries in the desc table)

Won't access from x86 be slow to this memory (on the other hand, if you
change it to main memory, access from ppc will be slow... really depends
on how your system is tuned).

> Parts 2 and 3 are repeated three times, to allow for a maximum of three
> virtqueues per device. This is good enough for all current drivers.

The plan is to switch to multiqueue soon. Will not affect you if your
boards are uniprocessor or small smp.

> I've gotten plenty of email about this from lots of interested
> developers. There are people who would like this kind of system to just
> work, while having to write just some glue for their device, just like a
> network driver. I hunch most people have created some proprietary mess
> that basically works, and left it at that.

So long as you keep the system-dependent features hookable or
configurable, it should work.

> So, here is a desperate cry for help. I'd like to make this work, and
> I'd really like to see it in mainline. I'm trying to give back to the
> community from which I've taken plenty.

Not sure who you're crying for help to. Once you get this working, post
patches. If the patches are reasonably clean and don't impact
performance for the main use case, and if you can show the need, I
expect they'll be merged.

-- 
error compiling committee.c: too many arguments to function
Re: Paravirtualization on VMware's Platform [VMI].
* Jeremy Fitzhardinge wrote:

> On 09/20/09 02:00, Avi Kivity wrote:
> > On 09/20/2009 10:52 AM, Arjan van de Ven wrote:
> >> On Sun, 20 Sep 2009 09:42:47 +0200
> >> Ingo Molnar wrote:
> >>
> >>> If we were able to rip out all (or most) of paravirt from arch/x86 it
> >>> would be tempting for other technical reasons - but the patch above
> >>> is well localized.
> >>
> >> interesting question is if this would allow us to remove a few of the
> >> paravirt hooks
> >
> > kvm will be removing the pvmmu support soon; and Xen is talking about
> > running paravirtualized guests in a vmx/svm container where they don't
> > need most of the hooks.
>
> We have no plans to drop support for non-vmx/svm capable processors,
> let alone require ept/npt.

But, just to map out our plans for the future, do you concur with the
statements and numbers offered here by the VMware and KVM folks that
on sufficiently recent hardware, hardware-assisted virtualization
outperforms paravirt_ops in many (most?) workloads?

I.e. paravirt_ops becomes a legacy hardware thing, not a core component
of the design of arch/x86/.

(with a long obsoletion period, of course.)

Thanks,

	Ingo
Re: Paravirtualization on VMware's Platform [VMI].
On Sun, 20 Sep 2009 06:30:21 pm Avi Kivity wrote:
> On 09/20/2009 10:52 AM, Arjan van de Ven wrote:
> > On Sun, 20 Sep 2009 09:42:47 +0200
> > Ingo Molnar wrote:
> >
> >> If we were able to rip out all (or most) of paravirt from arch/x86 it
> >> would be tempting for other technical reasons - but the patch above
> >> is well localized.
> >
> > interesting question is if this would allow us to remove a few of the
> > paravirt hooks
>
> kvm will be removing the pvmmu support soon; and Xen is talking about
> running paravirtualized guests in a vmx/svm container where they don't
> need most of the hooks.

When they're all gone, even I don't think lguest is sufficient excuse
to keep CONFIG_PARAVIRT. Oh well. But that will probably be a while.

Cheers,
Rusty.