Re: [kvm-devel] [PATCH 3/3] virtio PCI device
Avi Kivity wrote:

rx and tx are closely related. You rarely have one without the other. In fact, a tuned implementation should have zero kicks or interrupts for bulk transfers. The rx interrupt on the host will process new tx descriptors and fill the guest's rx queue; the guest's transmit function can also check the receive queue. I don't know if that's achievable for Linux guests currently, but we should aim to make it possible.

ATM, the net driver does a pretty good job of disabling kicks/interrupts unless they are needed. Checking for rx on tx and vice versa is a good idea and could further help there. I'll give it a try this week.

Another point is that virtio still has a lot of leading zeros in its mileage counter. We need to keep things flexible and learn from others as much as possible, especially when talking about the ABI.

Yes, after thinking about it over the holiday, I agree that we should at least introduce a virtio-pci feature bitmask. I'm not inclined to attempt to define a hypercall ABI or anything like that right now, but having the feature bitmask will at least make it possible to do such a thing in the future.

I'm wary of introducing the notion of hypercalls to this device because it makes the device VMM specific. Maybe we could have the device provide an option ROM that was treated as the device "BIOS" that we could use for kicking and interrupt acking? Any idea of how that would map to Windows? Are there real PCI devices that use the option ROM space to provide what's essentially firmware? Unfortunately, I don't think an option ROM BIOS would map well to other architectures.

The BIOS wouldn't work even on x86 because it isn't mapped to the guest address space (at least not consistently), and doesn't know the guest's programming model (16, 32, or 64 bits? segmented or flat?). Xen uses a hypercall page to abstract these details out. However, I'm not proposing that.
Simply indicate that we support hypercalls, and use some layer below to actually send them. It is the responsibility of this layer to detect if hypercalls are present and how to call them. Hey, I think the best place for it is in paravirt_ops. We can even patch the hypercall instruction inline, and the driver doesn't need to know about it.

Yes, paravirt_ops is attractive for abstracting the hypercall calling mechanism, but it's still necessary to figure out how hypercalls would be identified. I think it would be necessary to define a virtio-specific hypercall space and use the virtio device ID to claim subspaces. For instance, the hypercall number could be (virtio_devid << 16) | (call number). How that translates into a hypercall would then be part of the paravirt_ops abstraction. In KVM, we may have a single virtio hypercall where we pass the virtio hypercall number as one of the arguments or something like that.

Not much of an argument, I know.

wrt. number of queues, 8 queues will consume 32 bytes of pci space if all you store is the ring pfn. You also at least need a num argument which takes you to 48 or 64 depending on whether you care about strange formatting. 8 queues may not be enough either. Eric and I have discussed whether the 9p virtio device should support multiple mounts per virtio device and, if so, whether each one should have its own queue. Any device that supports this sort of multiplexing will very quickly start using a lot of queues.

Make it appear as a pci function? (though my feeling is that multiple mounts should be different devices; we can then hotplug mountpoints).

We may run out of PCI slots though :-/

Then we can start selling virtio extension chassis. :-)

Do you know if there is a hard limit on the number of devices on a PCI bus? My concern was that it was limited by something stupid like an 8-bit identifier.
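The hypercall numbering suggested above, (virtio_devid << 16) | (call number), can be sketched as follows. This is only an illustration of the bit layout; the function names, the 16-bit limit on the call number, and the example IDs are assumptions, not part of any posted patch or of the paravirt_ops interface.

```python
CALL_BITS = 16                    # low 16 bits carry the per-device call number
CALL_MASK = (1 << CALL_BITS) - 1


def virtio_hypercall_nr(virtio_devid, call):
    """Pack a device ID and a call number as (virtio_devid << 16) | call."""
    if virtio_devid < 0:
        raise ValueError("device ID must be non-negative")
    if not 0 <= call <= CALL_MASK:
        raise ValueError("call number must fit in 16 bits")
    return (virtio_devid << CALL_BITS) | call


def split_hypercall_nr(nr):
    """Recover (virtio_devid, call) from a packed hypercall number."""
    return nr >> CALL_BITS, nr & CALL_MASK


# Hypothetical example: device ID 1, call number 3.
nr = virtio_hypercall_nr(1, 3)
print(hex(nr), split_hypercall_nr(nr))
```

With this layout each device ID owns a 65536-entry subspace, so a backend such as KVM could pass the packed number as a single argument to one generic virtio hypercall, as the message suggests.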
Regards,

Anthony Liguori
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/virtualization
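The 32/48/64-byte figures in the message above can be reproduced with a small calculation. The field widths here (a 32-bit ring pfn and a 16-bit num) are assumptions inferred from those figures; they are not taken from the patch itself.

```python
# Assumed field widths, inferred from the 32/48/64 figures in the message.
PFN_BYTES = 4   # 32-bit ring page frame number
NUM_BYTES = 2   # 16-bit ring size ("num")


def queue_space(n_queues, with_num=False, pad_to=1):
    """Bytes of PCI register space needed for n_queues per-queue entries."""
    entry = PFN_BYTES + (NUM_BYTES if with_num else 0)
    # Round each entry up to a multiple of pad_to (1 means tightly packed).
    entry = -(-entry // pad_to) * pad_to
    return n_queues * entry


print(queue_space(8))                            # pfn only
print(queue_space(8, with_num=True))             # pfn + num, packed
print(queue_space(8, with_num=True, pad_to=4))   # pfn + num, 4-byte aligned
```

Under these assumptions the three calls give 32, 48, and 64 bytes, which matches the "48 or 64 depending on whether you care about strange formatting" remark.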
Re: [Xen-devel] Re: Next steps with pv_ops for Xen
Juan Quintela wrote:
> Hi,
>
> your console works great, but rest of patches are assuming:
>
> arch/x86/boot/compressed/notes-xen.c
> arch/x86/xen/early.c
>

Yes, those are leftovers from a somewhat unsuccessful attempt at getting ELF-in-bzImage booting working. I need to go back and make bzImage booting work properly. I posted those patches as a source of possibly useful code snippets/summary of things I've looked at so far, rather than something that can be directly used.

> at least. It looks as if there is missing another patch, could you
> take a look, please?
> Otherwise, I will take a look at what is missing.
>
> It breaks with:
>
> Intel machine check architecture supported.
> (XEN) traps.c:1734:d0 Domain attempted WRMSR 0404 from :0001
> to
> :.
> Intel machine check reporting enabled on CPU#0.
> general protection fault: [#1] SMP
> Modules linked in:
>

Hm. Looks like Xen is getting upset about dom0 trying to disable caching. No, wait: 0x:? That's strange; I wonder if it's just misreporting the value, because the code doesn't look like it's trying to write that.

Either way, the fix is to implement xen_write_cr0, and mask off any bits that Xen won't want us to set/clear (or if it doesn't allow dom0 to change cr0, just ignore all updates).

J
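The masking approach described above can be sketched in a few lines. This only models the bit filtering, not a real xen_write_cr0; which bits Xen actually refuses is not stated in the thread, so treating the cache-control bits CD and NW as hypervisor-owned is purely an assumption for illustration, as are the function name and the example values.

```python
X86_CR0_NW = 1 << 29   # Not Write-through
X86_CR0_CD = 1 << 30   # Cache Disable

# Hypothetical policy: bits the guest is not allowed to change.
HYPERVISOR_OWNED = X86_CR0_CD | X86_CR0_NW


def filter_cr0_write(current, requested, owned=HYPERVISOR_OWNED):
    """Return the CR0 value to apply for a guest write.

    Guest-controllable bits are taken from `requested`; bits in `owned`
    keep whatever state they have in `current`.
    """
    return (requested & ~owned) | (current & owned)


current = 0x8005003B               # example value with CD clear
requested = current | X86_CR0_CD   # guest tries to disable caching
print(hex(filter_cr0_write(current, requested)))

# "Just ignore all updates": every bit is hypervisor-owned.
print(hex(filter_cr0_write(current, requested, owned=0xFFFFFFFF)))
```

In both calls the result equals the current value: the first drops only the CD change, the second drops the whole update.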
Re: [Xen-devel] Re: Next steps with pv_ops for Xen
Hi,

your console works great, but rest of patches are assuming:

arch/x86/boot/compressed/notes-xen.c
arch/x86/xen/early.c

at least. It looks as if there is missing another patch, could you take a look, please? Otherwise, I will take a look at what is missing.

It breaks with:

Intel machine check architecture supported.
(XEN) traps.c:1734:d0 Domain attempted WRMSR 0404 from :0001 to :.
Intel machine check reporting enabled on CPU#0.
general protection fault: [#1] SMP
Modules linked in:

Pid: 1, comm: swapper Not tainted (2.6.24-rc3-q2 #10)
EIP: 0061:[] EFLAGS: 00010082 CPU: 0
EIP is at native_write_cr0+0x0/0x4
EAX: c005003b EBX: c03902a0 ECX: ed03f288 EDX: 0005
ESI: c1c10c80 EDI: ed054200 EBP: 0001 ESP: ed027eb8
DS: 007b ES: 007b FS: 00d8 GS: SS: e021
Process swapper (pid: 1, ti=ed027000 task=ed03ebb0 task.ti=ed027000)
Stack: c01125e9 c03902a0 c1c10c80 ed054200 c01128c6 c03900a0 0008 c010e0aa
       c037b48d ed00efa0 ed027f24 000a c035215c c01e20a7 c1c10c80 8008
       06f4 00020800 c0143563 ed03ebb0 017fe000 c03902a0
Call Trace:
 [] prepare_set+0x20/0x86
 [] generic_set_all+0x28/0x34a
 [] identify_cpu+0x525/0x52d
 [] kvasprintf+0x3f/0x48
 [] trace_hardirqs_off+0x28/0xa1
 [] mtrr_ap_init+0x33/0x5d
 [] smp_store_cpu_info+0x32/0xb9
 [] xen_cpu_up+0x22c/0x3b4
 [] _cpu_up+0xab/0x120
 [] cpu_up+0x4e/0x61
 [] kernel_init+0x9e/0x2c6
 [] restore_nocheck+0x12/0x15
 [] kernel_init+0x0/0x2c6
 [] kernel_init+0x0/0x2c6
 [] kernel_thread_helper+0x7/0x10
===
Code: 53 89 cb 83 ec 08 89 14 24 89 da 8b 04 24 89 4c 24 04 89 f9 0f 30 31 c0 5a 59 5b 5e 5f c3 0f 31 c3 0f 33 c3 0f 06 c3 0f 20 c0 c3 <0f> 22 c0 c3 0f 20 e0 c3 31 c0 0f 20 e0 c3 0f 09 c3 0f 01 00 c3
EIP: [] native_write_cr0+0x0/0x4 SS:ESP e021:ed027eb8
Kernel panic - not syncing: Attempted to kill init!

Later, Juan.

On Nov 22, 2007 12:12 AM, Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
> Stephen C. Tweedie wrote:
> > I've been looking at the next steps to try to get Xen running fully on
> > top of pv_ops.
> > To that end, I've (just) started looking at one of the
> > next major jobs --- i686 dom0 on pv_ops.
> >
>
> Great!
>
> > There are still a number of things needing done to reach parity with
> > xen-unstable:
> >
> > x86_64 xen on pv_ops
> >
>
> I think once pvops has been unified, Xen support should be fairly
> straightforward. I wrote most of the existing code with 64-bit in mind,
> so I'm hoping I got it right...
>
> > Paravirt framebuffer/keyboard
> > CPU hotplug
> > Balloon
> >
>
> I've done some preliminary work on balloon and hotplug. I think balloon
> should make more use of memory hotplug, but a straight port would be a
> good first step.
>
> > kexec
> > driver domains
> >
> > but it looks like these can largely proceed in parallel if desired.
> >
> > My short-term goal with this is simply to come up with a first-pass
> > merge of the linux-2.6.18-xen.hg dom0 support into the current
> > kernel.org tree's pv_ops support. No major refactoring in the first
> > pass, but absolutely no *-xen.c code copying.
> >
>
> Yes. #ifdefs are the way to go here.
>
> > I'm just starting this, but at least with the version magic check (see
> >
> > http://lists.xensource.com/archives/html/xen-devel/2007-11/msg00601.html
> >
>
> I was just about to post a fix for this.
>
> > ) out of the way, an SMP dom0 running pv_ops gets all the way through
> > start_kernel() and into rest_init() before dying with an unsupported cr0
> > write. (I'm using direct console hypercalls for printk for now, full
> > xencons is not working yet.)
> >
>
> I have some early dom0 patches already, though they're a few months old
> now. Not much there, but I did do an early console implementation.
>
> > I'm happy to put up a git tree for this once it gets anywhere. We'd
> > need to decide which tree to track for that purpose --- Linus's, or
> > perhaps the tglx or mingo x86 merge tree might make more sense.
> >
>
> Yes, I think the x86 tree is where we need to be, since there's a lot of
> activity there.
>
> I'll attach my dom0 patches for whatever use you can make of them. They
> definitely won't apply to anything, not least because of the arch merge
> (though it looks like they did get converted by script), but also
> because they're based on some defunct experimental booting-from-bzImage
> patches. But perhaps there's some useful stuff in there.
>
> I've also attached my xen-balloon and hotplug patches as-is. They don't
> work completely, but they should be closer to applying.
>
> J
>
> ___
> Xen-devel mailing list
> [EMAIL PROTECTED]
> http://lists.xensource.com/xen-devel
>
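As a reading aid for the oops earlier in this message: the marked bytes in the Code line, <0f> 22 c0, encode "mov %eax,%cr0", so EAX (c005003b) is the value the kernel tried to load into CR0. The sketch below decodes that value using the architectural CR0 bit positions; the table and function name are illustrative only and are not kernel code.

```python
# Architecturally defined CR0 flag positions (bit number -> flag name).
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}


def decode_cr0(value):
    """List the names of the CR0 flags set in value, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items())
            if value & (1 << bit)]


# EAX from the register dump in the oops.
print(decode_cr0(0xC005003B))
# → ['PE', 'MP', 'TS', 'ET', 'NE', 'WP', 'AM', 'CD', 'PG']
```

CD (cache disable) is set in the attempted value, which is consistent with the write coming from the MTRR setup path shown in the call trace (prepare_set) and with the suggestion elsewhere in the thread that Xen is objecting to dom0 disabling caching.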