Re: [kvm-devel] [patch 00/13] RFC: split the global mutex
Marcelo Tosatti wrote: On Sun, Apr 20, 2008 at 02:16:52PM +0300, Avi Kivity wrote: The iperf numbers are pretty good. Performance of UP guests increase slightly but SMP is quite significant. I expect you're seeing contention induced by memcpy()s and inefficient emulation. With the dma api, I expect the benefit will drop. You still have to memcpy() with the dma api. Even with vringfd the kernel-user copy has to be performed under the global mutex protection, difference being that several packets can be copied per-syscall instead of only one. Block does the copy outside the mutex protection, so net can be adapted to do the same. It does mean we will need to block all I/O temporarily during memory hotplug. For pure cpu emulation, there is a ton of work to be done: protecting the translator as well as making the translated code smp safe. I now believe there is a lot of work (which was not clear before). Not particularly interested in getting real emulation to be multithreaded. Anyways, the lack of multithreading in qemu emulation should not be a blocker for these patches to get in, since these are infrastructural changes. Getting this into qemu upstream is essential as this is far more intrusive than anything else we've done. But again, I believe there are many other fruit hanging from lower branches. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations
Jamie Lokier wrote: Avi Kivity wrote:

Does that mean for the majority of deployments, the slow version is sufficient. The few that care about performance can use Linux AIO?

In essence, yes. s/slow/slower/ and s/performance/ultimate block device performance/. Many deployments don't care at all about block device performance; they care mostly about networking performance.

That's interesting. I'd have expected block device performance to be important for most things, for the same reason that disk performance is (well, reasonably) important for non-virtual machines.

Seek time is important. Bandwidth is somewhat important. But for one- and two-spindle workloads (the majority), the cpu utilization induced by getting requests to the disk is not important, and that's what we're optimizing here.

Disks work at around 300 Hz. Processors at around 3 GHz. That's seven orders of magnitude difference. Even if you spent 100 usec calculating what's the next best seek, even if it saves you only 10% of seeks it's a win. And of course modern processors spend a few microseconds at most getting a request out.

You really need 50+ disks or a large write-back cache to make microoptimizations around the submission path felt.

But as you say next:

I'm under the impression that the entire and only point of Linux AIO is that it's faster than POSIX AIO on Linux.

It is. I estimate posix aio adds a few microseconds above linux aio per I/O request, when using O_DIRECT. Assuming 10 microseconds, you will need 10,000 I/O requests per second per vcpu to have a 10% performance difference. That's definitely rare.

Oh, I didn't realise the difference was so small. At such a tiny difference, I'm wondering why Linux-AIO exists at all, as it complicates the kernel rather a lot. I can see the theoretical appeal, but if performance is so marginal, I'm surprised it's in there.

Linux aio exists, but that's all that can be said for it. It works mostly for raw disks, doesn't integrate with networking, and doesn't advance at the same pace as the rest of the kernel. I believe only databases use it (and a userspace filesystem I wrote some time ago).

I'm also surprised the Glibc implementation of AIO using ordinary threads is so close to it.

Why are you surprised? Actually the glibc implementation could be improved from what I've heard. My estimates are for a thread pool implementation, but there is no reason why glibc couldn't achieve exactly the same performance.

And then, I'm wondering why use AIO at all: it suggests QEMU would run about as fast doing synchronous I/O in a few dedicated I/O threads.

Posix aio is the unix API for this, why not use it?

Also, I'd presume that those that need 10K IOPS and above will not place their high throughput images on a filesystem; rather on a separate SAN LUN.

Does the separate LUN make any difference? I thought O_DIRECT on a filesystem was meant to be pretty close to block device performance.

On a good extent-based filesystem like XFS you will get good performance (though more cpu overhead due to needing to go through additional mapping layers). Old clunkers like ext3 will require additional seeks or a ton of cache (1 GB per 1 TB).

I base this on messages here and there which say swapping to a file is about as fast as swapping to a block device, nowadays.

Swapping to a file preloads the block mapping into memory, so the filesystem is not involved at all in the I/O path.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
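The 10% figure quoted in this message is plain arithmetic and can be checked with a small sketch. The function names below are illustrative only (they are not part of QEMU or any AIO library), and the 10 usec per-request cost is the assumption stated in the message, not a measurement.

```c
#include <stdio.h>

/*
 * Fraction of one CPU consumed by the extra per-request cost.
 *   overhead_usec: assumed extra cost of POSIX AIO over Linux AIO, per request
 *   iops:          I/O requests per second issued by one vcpu
 */
static double aio_overhead_fraction(double overhead_usec, double iops)
{
    return overhead_usec * 1e-6 * iops;
}

/* Hypothetical helper that prints the overhead for a given request rate. */
static void print_aio_overhead(double overhead_usec, double iops)
{
    printf("%.0f IOPS at %.0f usec each: %.1f%% of a CPU\n",
           iops, overhead_usec,
           100.0 * aio_overhead_fraction(overhead_usec, iops));
}
```

With 10 usec and 10,000 requests per second the fraction comes out at roughly 0.1, which is the 10% mentioned above; at a few hundred requests per second, typical of one or two spindles, it is well under 1%.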
Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations
Javier Guerra Giraldez wrote: On Sunday 20 April 2008, Avi Kivity wrote:

Also, I'd presume that those that need 10K IOPS and above will not place their high throughput images on a filesystem; rather on a separate SAN LUN.

i think that too; but still that LUN would be accessed by the VM's via one of these IO emulation layers, right?

Yes. Hopefully Linux aio.

or maybe you're advocating using the SAN initiator in the VM instead of the host?

That works too, especially for iSCSI, but that's not what I'm advocating.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
[kvm-devel] 32-bit binaries failing in 64 bit guests after using vmport
Esteemed kvm developers!

I've been trying to debug this bug https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/219165

It originally revealed itself by failing to run grub (which is a 32 bit binary) when installing Ubuntu from our live cd. It turned out to be a more general problem of 32 bit binaries failing to run. The server install worked like a charm.

I eventually discovered that loading the vmmouse driver triggered it and narrowed it down to the call to kvm_load_registers in vmport_ioport_read. We're releasing on Thursday, and I needed a quick fix, so I reverted the calls to kvm_{save,load}_registers in vmport_ioport_read to the old code that simply saved the eax, ebx, ecx, edx, esi, and edi registers, but I'm supposing kvm_{load,save}_registers really should work here.

I dug a bit further into the code and tried disabling various pieces of the kvm_load_registers until it finally worked again. The problem seems to only arise when the lstar msr is loaded. I've looked at the code, but seeing as three days ago I didn't know there was such a thing as an lstar msr, I'm finding myself getting stuck. :)

Any pointers in the right direction would be lovely.

-- 
Soren Hansen | Virtualisation specialist | Ubuntu Server Team
Canonical Ltd. | http://www.ubuntu.com/
Re: [kvm-devel] paravirt clock stil causing hangs in kvm-65
Marcelo Tosatti wrote:

From what me and marcelo discussed, I think there's a possibility that it has marginally something to do with precision of clock calculation. Gerd's patches address those issues. Can somebody test this with those patches (both guest and host), while I'm off?

Haven't seen Gerd's guest patches?

I'm still busy cooking them up. I've mentioned them in a mail, but they haven't gone over the list (yet). Stay tuned ;)

cheers,
  Gerd

-- 
http://kraxel.fedorapeople.org/xenner/
Re: [kvm-devel] pv clock: kvm is incompatible with xen :-(
Jeremy Fitzhardinge wrote: Gerd Hoffmann wrote:

I'm looking at the guest side of the issue right now, trying to identify common code, and while doing so noticed that xen does the version-check-loop in both get_time_values_from_xen(void) and xen_clocksource_read(void), and I can't see any obvious reason for that. The loop in xen_clocksource_read(void) is not needed IMHO. Can I drop it?

No. The get_nsec_offset() needs to be atomic with respect to the get_time_values() parameters.

Hmm, I somehow fail to see a case where it could be non-atomic ... get_time_values() copies a consistent snapshot, thus xen_clocksource_read() doesn't race against xen updating the fields. The snapshot is in a per-cpu variable, thus it doesn't race against other guest vcpus running get_time_values() at the same time.

There could be a loopless __get_time_values() for use in this case, but given that it almost never loops, I don't think it's worthwhile.

"in this case" ??? I'm confused. There is only a single user of get_nsec_offset(), which is xen_clocksource_read() ...

cheers,
  Gerd

-- 
http://kraxel.fedorapeople.org/xenner/
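For readers unfamiliar with the version-check loop under discussion, here is a minimal user-space sketch of the idea: the writer makes the version odd while it updates the fields, and the reader retries until it has copied the fields under one stable, even version. The structure layout and the names `time_info`, `time_snapshot` and `snapshot_time_values` are assumptions for illustration; they are not the actual Xen or KVM definitions, and real kernel code also needs read barriers where the comments indicate.

```c
#include <stdint.h>

/* Hypothetical shared area, written by the hypervisor. */
struct time_info {
    volatile uint32_t version;        /* odd while an update is in progress */
    volatile uint64_t tsc_timestamp;
    volatile uint64_t system_time;    /* nanoseconds */
};

/* Private (for example per-cpu) copy used by the reader. */
struct time_snapshot {
    uint32_t version;
    uint64_t tsc_timestamp;
    uint64_t system_time;
};

/* Copy a consistent snapshot; returns the number of attempts it took. */
static unsigned snapshot_time_values(const struct time_info *src,
                                     struct time_snapshot *dst)
{
    unsigned attempts = 0;

    do {
        attempts++;
        dst->version = src->version;
        /* a read barrier belongs here in real code */
        dst->tsc_timestamp = src->tsc_timestamp;
        dst->system_time = src->system_time;
        /* ... and another read barrier here */
    } while ((dst->version & 1) || dst->version != src->version);

    return attempts;
}
```

Once the snapshot is consistent, later arithmetic on `dst` does not touch the shared area, which is the argument made above for why a second loop around the offset calculation may be unnecessary.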
[kvm-devel] KVM Test result, kernel 6cf5973.., userspace 4320192.. -- One Issue Fixed
Hi All,

This is today's KVM test result against kvm.git 6cf59734fc9bc89954d0157524eea156c2f9a5ab and kvm-userspace.git 43201923a67647913b67da255ca60f0269a3e34a.

One Issue Fixed:
1. Can't boot smp guests on ia32e host
   https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1944629&group_id=180599

Three Old Issues:
1. Booting four guests likely fails
   https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1919354&group_id=180599
2. booting smp windows guests has 30% chance of hang
   https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1910923&group_id=180599
3. Cannot boot guests with hugetlbfs
   https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1941302&group_id=180599

Test environment
Platform     Woodcrest
CPU          4
Memory size  8G

Details

IA32-pae:
 1. boot guest with 256M memory                        PASS
 2. boot two windows xp guest                          PASS
 3. boot 4 same guest in parallel                      PASS
 4. boot linux and windows guest in parallel           PASS
 5. boot guest with 1500M memory                       PASS
 6. boot windows 2003 with ACPI enabled                PASS
 7. boot Windows xp with ACPI enabled                  PASS
 8. boot Windows 2000 without ACPI                     PASS
 9. kernel build on SMP linux guest                    PASS
10. LTP on linux guest                                 PASS
11. boot base kernel linux                             PASS
12. save/restore 32-bit HVM guests                     PASS
13. live migration 32-bit HVM guests                   PASS
14. boot SMP Windows xp with ACPI enabled              PASS
15. boot SMP Windows 2003 with ACPI enabled            PASS
16. boot SMP Windows 2000 with ACPI enabled            PASS

IA32e:
 1. boot four 32-bit guest in parallel                 PASS
 2. boot four 64-bit guest in parallel                 PASS
 3. boot 4G 64-bit guest                               PASS
 4. boot 4G pae guest                                  PASS
 5. boot 32-bit linux and 32 bit windows guest in parallel  PASS
 6. boot 32-bit guest with 1500M memory                PASS
 7. boot 64-bit guest with 1500M memory                PASS
 8. boot 32-bit guest with 256M memory                 PASS
 9. boot 64-bit guest with 256M memory                 PASS
10. boot two 32-bit windows xp in parallel             PASS
11. boot four 32-bit different guest in para           PASS
12. save/restore 64-bit linux guests                   PASS
13. save/restore 32-bit linux guests                   PASS
14. boot 32-bit SMP windows 2003 with ACPI enabled     PASS
15. boot 32-bit SMP Windows 2000 with ACPI enabled     PASS
16. boot 32-bit SMP Windows xp with ACPI enabled       PASS
17. boot 32-bit Windows 2000 without ACPI              PASS
18. boot 64-bit Windows xp with ACPI enabled           PASS
19. boot 32-bit Windows xp without ACPI                PASS
20. boot 64-bit UP vista                               PASS
21. boot 64-bit SMP vista                              PASS
22. kernel build in 32-bit linux guest OS              PASS
23. kernel build in 64-bit linux guest OS              PASS
24. LTP on 32-bit linux guest OS                       PASS
25. LTP on 64-bit linux guest OS                       PASS
26. boot 64-bit guests with ACPI enabled               PASS
27. boot 32-bit x-server                               PASS
28. boot 64-bit SMP windows XP with ACPI enabled       PASS
29. boot 64-bit SMP windows 2003 with ACPI enabled     PASS
30. live migration 64bit linux guests                  PASS
31. live migration 32bit linux guests                  PASS
32. reboot 32bit windows xp guest                      PASS
33. reboot 32bit windows xp guest                      PASS

Report Summary on IA32-pae

Summary Test Report of Last Session
=====================================================
                Total   Pass   Fail   NoResult   Crash
=====================================================
control_panel   8       5      3      0
Re: [kvm-devel] [ RfC / patch ] kvmclock fixes
Gerd Hoffmann wrote:

* Host: make kvm pv clock really compatible with xen pv clock.
* Guest/xen: factor out some xen clock code into a separate source file (pvclock.[ch]), so kvm can reuse it.
* Guest/kvm: make kvm clock compatible with xen clock by using the common code bits.

I guess saving on code duplication is good...

+cycle_t pvclock_clocksource_read(struct kvm_vcpu_time_info *src)
+{
+	struct pvclock_shadow_time *shadow = &get_cpu_var(shadow_time);
+	cycle_t ret;
+
+	pvclock_get_time_values(shadow, src);
+	ret = shadow->system_timestamp + pvclock_get_nsec_offset(shadow);

You need to put this in a loop in case the system clock parameters change between the pvclock_get_time_values() and pvclock_get_nsec_offset().

How does kvm deal with suspend/resume with respect to time? Is the system timestamp guaranteed to remain monotonic? For Xen, I think we'll need to maintain an offset between the initial system timestamp and whatever it is after resuming.

    J
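The suspend/resume question at the end of this message can be illustrated with a small sketch of the "maintain an offset" idea. This is only one possible reading of the suggestion; the `mono_clock` structure and both functions are hypothetical and do not correspond to any existing Xen or KVM code.

```c
#include <stdint.h>

/* Hypothetical guest-side state for keeping the clock monotonic. */
struct mono_clock {
    uint64_t offset;  /* added to the raw system timestamp */
    uint64_t last;    /* last value handed out to callers */
};

/* Return a timestamp that never goes backwards. */
static uint64_t mono_read(struct mono_clock *c, uint64_t raw_ns)
{
    uint64_t now = raw_ns + c->offset;

    if (now < c->last)
        now = c->last;
    c->last = now;
    return now;
}

/*
 * Called once after resume with the first raw timestamp seen on the new
 * host state: choose the offset so that time continues from c->last.
 * Unsigned wrap-around makes this work even when raw_ns < c->last.
 */
static void mono_resume(struct mono_clock *c, uint64_t raw_ns_after_resume)
{
    c->offset = c->last - raw_ns_after_resume;
}
```

For example, if the clock last returned 1000 ns and the raw timestamp restarts at 50 ns after resume, the offset becomes 950 and subsequent reads continue from 1000.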
[kvm-devel] Booting with lilo after kvm upgrade to kvm-64
Hello everybody After I update to KVM-66 (from 65), I have problem to boot guests with lilo installed. Boot sequence always stop with LIL output. With kvm-65 everythink works great. I have also windows XP guest, which boot without problem. With -no-kvm guests boot ok. Processor: AMD Opteron 2210 KVM: kvm-64 Host: gentoo-sources-2.6.25-r1 Arch: x86_64 Guests: gentoo, 2.6.25, x86_64 dmesg: Apr 21 09:53:37 kvm BUG: unable to handle kernel NULL pointer dereference at Apr 21 09:53:37 kvm IP: [88029c85] :kvm:x86_emulate_insn+0x3a47/0x468f Apr 21 09:53:37 kvm PGD 11e989067 PUD 11ec0a067 PMD 0 Apr 21 09:53:37 kvm Oops: 0002 [2] SMP Apr 21 09:53:37 kvm CPU 2 Apr 21 09:53:37 kvm Modules linked in: w83627hf hwmon_vid xt_tcpudp xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables tun kvm_amd kvm shpchp pci_hotplug k8temp i2c_nforce2 i2c_core Apr 21 09:53:37 kvm Pid: 7130, comm: kvm Tainted: G D 2.6.25-gentoo-r1 #1 Apr 21 09:53:37 kvm RIP: 0010:[88029c85] [88029c85] :kvm:x86_emulate_insn+0x3a47/0x468f Apr 21 09:53:37 kvm RSP: 0018:81011e3e5738 EFLAGS: 00010246 Apr 21 09:53:37 kvm RAX: 0010 RBX: RCX: 81011e3e7378 Apr 21 09:53:37 kvm RDX: RSI: RDI: 81011e3e6000 Apr 21 09:53:37 kvm RBP: 81011e3e7378 R08: R09: Apr 21 09:53:37 kvm R10: 88041988 R11: 81011e3e7378 R12: 81011e3e7330 Apr 21 09:53:37 kvm R13: R14: 88033a20 R15: 0ce3 Apr 21 09:53:37 kvm FS: 41e68950(0063) GS:81011ff1b000() knlGS: Apr 21 09:53:37 kvm CS: 0010 DS: ES: CR0: 8005003b Apr 21 09:53:37 kvm CR2: CR3: 00011e1af000 CR4: 06e0 Apr 21 09:53:37 kvm DR0: 805b34f8 DR1: DR2: Apr 21 09:53:37 kvm DR3: DR6: 0ff1 DR7: 0701 Apr 21 09:53:37 kvm Process kvm (pid: 7130, threadinfo 81011e3e4000, task 81011e56) Apr 21 09:53:37 kvm Stack: 8021a311 000f fff7 8021a49b Apr 21 09:53:37 kvm 81011ed41d00 c20001926000 Apr 21 09:53:37 kvm 8021a311 802347a0 81011ed41d00 880419e0 Apr 21 09:53:37 kvm Call Trace: Apr 21 09:53:37 kvm [8021a311] ? 
do_flush_tlb_all+0x0/0x2f Apr 21 09:53:37 kvm [8021a49b] ? smp_call_function_mask+0x47/0x55 Apr 21 09:53:37 kvm [8021a311] ? do_flush_tlb_all+0x0/0x2f Apr 21 09:53:37 kvm [802347a0] ? on_each_cpu+0x19/0x25 Apr 21 09:53:37 kvm [880419e0] Apr 21 09:53:37 kvm [88020501] ? :kvm:kvm_get_cs_db_l_bits+0x9/0x2f Apr 21 09:53:37 kvm [8801f101] ? :kvm:emulate_instruction+0x1ef/0x3a5 Apr 21 09:53:37 kvm [8801f101] ? :kvm:emulate_instruction+0x1ef/0x3a5 Apr 21 09:53:37 kvm [88041fbc] Apr 21 09:53:37 kvm [88020148] ? :kvm:kvm_arch_vcpu_ioctl_run+0x44a/0x5b8 Apr 21 09:53:37 kvm [8801bf23] ? :kvm:kvm_resched+0x1b4/0x9b7 Apr 21 09:53:37 kvm [8802ad63] ? :kvm:kvm_pic_set_irq+0x21/0x6b Apr 21 09:53:37 kvm [8801e81b] ? :kvm:kvm_arch_vm_ioctl+0x38e/0x5e6 Apr 21 09:53:37 kvm [8026217b] ? zone_statistics+0x41/0x94 Apr 21 09:53:37 kvm [8025bc16] ? get_page_from_freelist+0x457/0x5af Apr 21 09:53:37 kvm [8025bdc0] ? __alloc_pages+0x52/0x2ee Apr 21 09:53:37 kvm [80225e50] ? source_load+0x25/0x41 Apr 21 09:53:37 kvm [802286f1] ? find_busiest_group+0x268/0x742 Apr 21 09:53:37 kvm [80225552] ? hrtick_set+0x99/0x107 Apr 21 09:53:37 kvm [805b3aae] ? thread_return+0x64/0xa5 Apr 21 09:53:37 kvm [80249099] ? get_futex_key+0x76/0x14d Apr 21 09:53:37 kvm [80249816] ? unqueue_me+0x6b/0x73 Apr 21 09:53:37 kvm [80249bc1] ? futex_wait+0x290/0x327 Apr 21 09:53:37 kvm [80227c36] ? try_to_wake_up+0xfa/0x10c Apr 21 09:53:37 kvm [80229752] ? __wake_up_common+0x49/0x74 Apr 21 09:53:37 kvm [80268c29] ? find_extend_vma+0x16/0x61 Apr 21 09:53:37 kvm [80249099] ? get_futex_key+0x76/0x14d Apr 21 09:53:37 kvm [803c1439] ? __up_read+0x10/0x8a Apr 21 09:53:37 kvm [8024955e] ? futex_wake+0xfa/0x10c Apr 21 09:53:37 kvm [80242e5a] ? ktime_get_ts+0x56/0x5d Apr 21 09:53:37 kvm [8801c3cb] ? :kvm:kvm_resched+0x65c/0x9b7 Apr 21 09:53:37 kvm [80225552] ? hrtick_set+0x99/0x107 Apr 21 09:53:37 kvm [8028a311] ? vfs_ioctl+0x29/0x6f Apr 21 09:53:37 kvm [8028a5a4] ? do_vfs_ioctl+0x24d/0x25c Apr 21 09:53:37 kvm [8028a5ef] ? 
sys_ioctl+0x3c/0x61 Apr 21 09:53:37 kvm [8020b09b] ? system_call_after_swapgs+0x7b/0x80 Apr 21 09:53:37 kvm Apr 21 09:53:37 kvm Apr 21 09:53:37 kvm Code: 02 74 20 77 06 ff c8 74 0e eb 78 83 f8 04 74 20 83 f8 08 74 27 eb 6c 48
[kvm-devel] [PATCH 17/31] KVM: x86 emulator: fix smsw and lmsw with a memory operand
lmsw and smsw were implemented only with a register operand.  Extend them
to support a memory operand as well.  Fixes Windows running some display
compatibility test on AMD hosts.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |   29 +++++++++++++++++------------
 1 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 8e1b32f..46ef78f 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -275,12 +275,15 @@ static u16 group_table[] = {
 	SrcMem | ModRM, 0, SrcMem | ModRM | Stack, 0,
 	[Group7*8] =
 	0, 0, ModRM | SrcMem, ModRM | SrcMem,
-	SrcNone | ModRM | DstMem, 0, SrcMem | ModRM, SrcMem | ModRM | ByteOp,
+	SrcNone | ModRM | DstMem | Mov, 0,
+	SrcMem16 | ModRM | Mov, SrcMem | ModRM | ByteOp,
 };
 
 static u16 group2_table[] = {
 	[Group7*8] =
-	SrcNone | ModRM, 0, 0, 0, SrcNone | ModRM | DstMem, 0, SrcMem | ModRM, 0,
+	SrcNone | ModRM, 0, 0, 0,
+	SrcNone | ModRM | DstMem | Mov, 0,
+	SrcMem16 | ModRM | Mov, 0,
 };
 
 /* EFLAGS bit definitions. */
@@ -1722,6 +1725,8 @@ twobyte_insn:
 				goto done;
 
 			kvm_emulate_hypercall(ctxt->vcpu);
+			/* Disable writeback. */
+			c->dst.type = OP_NONE;
 			break;
 		case 2: /* lgdt */
 			rc = read_descriptor(ctxt, ops, c->src.ptr,
@@ -1729,6 +1734,8 @@ twobyte_insn:
 			if (rc)
 				goto done;
 			realmode_lgdt(ctxt->vcpu, size, address);
+			/* Disable writeback. */
+			c->dst.type = OP_NONE;
 			break;
 		case 3: /* lidt/vmmcall */
 			if (c->modrm_mod == 3 && c->modrm_rm == 1) {
@@ -1744,27 +1751,25 @@ twobyte_insn:
 					goto done;
 				realmode_lidt(ctxt->vcpu, size, address);
 			}
+			/* Disable writeback. */
+			c->dst.type = OP_NONE;
 			break;
 		case 4: /* smsw */
-			if (c->modrm_mod != 3)
-				goto cannot_emulate;
-			*(u16 *)&c->regs[c->modrm_rm]
-				= realmode_get_cr(ctxt->vcpu, 0);
+			c->dst.bytes = 2;
+			c->dst.val = realmode_get_cr(ctxt->vcpu, 0);
 			break;
 		case 6: /* lmsw */
-			if (c->modrm_mod != 3)
-				goto cannot_emulate;
-			realmode_lmsw(ctxt->vcpu, (u16)c->modrm_val,
-						  &ctxt->eflags);
+			realmode_lmsw(ctxt->vcpu, (u16)c->src.val,
+				      &ctxt->eflags);
 			break;
 		case 7: /* invlpg*/
 			emulate_invlpg(ctxt->vcpu, memop);
+			/* Disable writeback. */
+			c->dst.type = OP_NONE;
 			break;
 		default:
 			goto cannot_emulate;
 		}
-		/* Disable writeback. */
-		c->dst.type = OP_NONE;
 		break;
 	case 0x06:
 		emulate_clts(ctxt->vcpu);
-- 
1.5.5
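As background for the patch above, the architectural effect of the two instructions can be sketched in plain C. This models only the register arithmetic (SMSW stores the low 16 bits of CR0; LMSW loads CR0 bits 0-3 and can set but not clear PE); the helper names are made up for illustration and this is not the emulator's code.

```c
#include <stdint.h>

#define CR0_PE 0x1ULL   /* protection enable, CR0 bit 0 */

/* smsw: the machine status word is the low 16 bits of CR0. */
static uint16_t smsw_value(uint64_t cr0)
{
    return (uint16_t)cr0;
}

/*
 * lmsw: only CR0 bits 0-3 (PE, MP, EM, TS) are taken from the operand,
 * and PE cannot be cleared once it is set.
 */
static uint64_t lmsw_result(uint64_t cr0, uint16_t msw)
{
    uint64_t new_cr0 = (cr0 & ~0xfULL) | (msw & 0xfu);

    new_cr0 |= cr0 & CR0_PE;   /* keep PE if it was already set */
    return new_cr0;
}
```

Whether the 16-bit operand comes from a register or from memory does not change this arithmetic, which is why the patch can route both forms through the emulator's generic source and destination operand handling.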
[kvm-devel] [PATCH 30/31] KVM: ppc: Kconfig fixes
From: Hollis Blanchard [EMAIL PROTECTED]

Don't allow building as a module (asm-offsets dependencies).

Also, automatically select KVM_BOOKE_HOST until we better separate the guest
and host layers.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/powerpc/kvm/Kconfig |   11 +++++------
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 989ee82..6f73edd 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -15,10 +15,12 @@ menuconfig VIRTUALIZATION
 if VIRTUALIZATION
 
 config KVM
-	tristate "Kernel-based Virtual Machine (KVM) support"
-	depends on EXPERIMENTAL
+	bool "Kernel-based Virtual Machine (KVM) support"
+	depends on 44x && EXPERIMENTAL
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
+	# We can only run on Book E hosts so far
+	select KVM_BOOKE_HOST
 	---help---
 	  Support hosting virtualized guest machines. You will also
 	  need to select one or more of the processor modules below.
@@ -26,13 +28,10 @@ config KVM
 	  This module provides access to the hardware capabilities through
 	  a character device node named /dev/kvm.
 
-	  To compile this as a module, choose M here: the module
-	  will be called kvm.
-
 	  If unsure, say N.
 
 config KVM_BOOKE_HOST
-	tristate "KVM host support for Book E PowerPC processors"
+	bool "KVM host support for Book E PowerPC processors"
 	depends on KVM && 44x
 	---help---
 	  Provides host support for KVM on Book E PowerPC processors. Currently
-- 
1.5.5
[kvm-devel] [PATCH 19/31] KVM: Rename debugfs_dir to kvm_debugfs_dir
From: Hollis Blanchard [EMAIL PROTECTED]

It's a globally exported symbol now.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 include/linux/kvm_host.h |    2 +-
 virt/kvm/kvm_main.c      |    8 ++++----
 virt/kvm/kvm_trace.c     |    4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 81d4c33..4e16682 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -315,7 +315,7 @@ struct kvm_stats_debugfs_item {
 	struct dentry *dentry;
 };
 extern struct kvm_stats_debugfs_item debugfs_entries[];
-extern struct dentry *debugfs_dir;
+extern struct dentry *kvm_debugfs_dir;
 
 #ifdef CONFIG_KVM_TRACE
 int kvm_trace_ioctl(unsigned int ioctl, unsigned long arg);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0998455..d3cb4cc 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -60,7 +60,7 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_cache);
 
 static __read_mostly struct preempt_ops kvm_preempt_ops;
 
-struct dentry *debugfs_dir;
+struct dentry *kvm_debugfs_dir;
 
 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
@@ -1392,9 +1392,9 @@ static void kvm_init_debug(void)
 {
 	struct kvm_stats_debugfs_item *p;
 
-	debugfs_dir = debugfs_create_dir("kvm", NULL);
+	kvm_debugfs_dir = debugfs_create_dir("kvm", NULL);
 	for (p = debugfs_entries; p->name; ++p)
-		p->dentry = debugfs_create_file(p->name, 0444, debugfs_dir,
+		p->dentry = debugfs_create_file(p->name, 0444, kvm_debugfs_dir,
 						(void *)(long)p->offset,
 						stat_fops[p->kind]);
 }
@@ -1405,7 +1405,7 @@ static void kvm_exit_debug(void)
 
 	for (p = debugfs_entries; p->name; ++p)
 		debugfs_remove(p->dentry);
-	debugfs_remove(debugfs_dir);
+	debugfs_remove(kvm_debugfs_dir);
 }
 
 static int kvm_suspend(struct sys_device *dev, pm_message_t state)
diff --git a/virt/kvm/kvm_trace.c b/virt/kvm/kvm_trace.c
index 5425440..0e49547 100644
--- a/virt/kvm/kvm_trace.c
+++ b/virt/kvm/kvm_trace.c
@@ -159,12 +159,12 @@ static int do_kvm_trace_enable(struct kvm_user_trace_setup *kuts)
 
 	r = -EIO;
 	atomic_set(&kt->lost_records, 0);
-	kt->lost_file = debugfs_create_file("lost_records", 0444, debugfs_dir,
+	kt->lost_file = debugfs_create_file("lost_records", 0444, kvm_debugfs_dir,
 					    kt, &kvm_trace_lost_ops);
 	if (!kt->lost_file)
 		goto err;
 
-	kt->rchan = relay_open("trace", debugfs_dir, kuts->buf_size,
+	kt->rchan = relay_open("trace", kvm_debugfs_dir, kuts->buf_size,
 				kuts->buf_nr, &kvm_relay_callbacks, kt);
 	if (!kt->rchan)
 		goto err;
-- 
1.5.5
[kvm-devel] [PATCH 24/31] KVM: SVM: sync TPR value to V_TPR field in the VMCB
From: Joerg Roedel [EMAIL PROTECTED]

This patch adds syncing of the lapic.tpr field to the V_TPR field of the VMCB.
With this change we can safely remove the CR8 read intercept.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 3379e13..f8ce36e 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -486,8 +486,7 @@ static void init_vmcb(struct vcpu_svm *svm)
 
 	control->intercept_cr_read = 	INTERCEPT_CR0_MASK |
 					INTERCEPT_CR3_MASK |
-					INTERCEPT_CR4_MASK |
-					INTERCEPT_CR8_MASK;
+					INTERCEPT_CR4_MASK;
 
 	control->intercept_cr_write = 	INTERCEPT_CR0_MASK |
 					INTERCEPT_CR3_MASK |
@@ -1621,6 +1620,19 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u64 cr8;
+
+	if (!irqchip_in_kernel(vcpu->kvm))
+		return;
+
+	cr8 = kvm_get_cr8(vcpu);
+	svm->vmcb->control.int_ctl &= ~V_TPR_MASK;
+	svm->vmcb->control.int_ctl |= cr8 & V_TPR_MASK;
+}
+
 static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1630,6 +1642,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	pre_svm_run(svm);
 
+	sync_lapic_to_cr8(vcpu);
+
 	save_host_msrs(vcpu);
 	fs_selector = read_fs();
 	gs_selector = read_gs();
-- 
1.5.5
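The two bit operations at the heart of sync_lapic_to_cr8() can be tried in isolation. The sketch below is a user-space model only: `sync_tpr` is a made-up name, and the `V_TPR_MASK` value of 0x0f (the low four bits of int_ctl) is an assumption taken from the kernel's SVM header rather than something stated in this patch.

```c
#include <stdint.h>

/* Assumed value: V_TPR occupies the low four bits of int_ctl. */
#define V_TPR_MASK 0x0fu

/*
 * Replace the V_TPR field of int_ctl with the low bits of cr8,
 * leaving all other int_ctl bits untouched.
 */
static uint32_t sync_tpr(uint32_t int_ctl, uint64_t cr8)
{
    int_ctl &= ~V_TPR_MASK;                   /* clear the old priority */
    int_ctl |= (uint32_t)(cr8 & V_TPR_MASK);  /* insert the new one */
    return int_ctl;
}
```

Because the field is refreshed before every guest entry, a guest read of CR8 can be satisfied from V_TPR by the hardware, which is why the read intercept is no longer needed.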
Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)
David S. Ahern wrote:

I added the traces and captured data over another apparent lockup of the guest. This seems to be representative of the sequence (pid/vcpu removed).

(+4776) VMEXIT [ exitcode = 0x, rip = 0x c016127c ]
(+ 0) PAGE_FAULT [ errorcode = 0x0003, virt = 0x c0009db4 ]
(+3632) VMENTRY
(+4552) VMEXIT [ exitcode = 0x, rip = 0x c016104a ]
(+ 0) PAGE_FAULT [ errorcode = 0x000b, virt = 0x fffb61c8 ]
(+ 54928) VMENTRY

Can you oprofile the host to see where the 54K cycles are spent?

(+4568) VMEXIT [ exitcode = 0x, rip = 0x c01610e7 ]
(+ 0) PAGE_FAULT [ errorcode = 0x0003, virt = 0x c0009db4 ]
(+ 0) PTE_WRITE [ gpa = 0x 9db4 gpte = 0x 41c5d363 ]
(+8432) VMENTRY
(+3936) VMEXIT [ exitcode = 0x, rip = 0x c01610ee ]
(+ 0) PAGE_FAULT [ errorcode = 0x0003, virt = 0x c0009db0 ]
(+ 0) PTE_WRITE [ gpa = 0x 9db0 gpte = 0x ]
(+ 13832) VMENTRY
(+5768) VMEXIT [ exitcode = 0x, rip = 0x c016127c ]
(+ 0) PAGE_FAULT [ errorcode = 0x0003, virt = 0x c0009db4 ]
(+3712) VMENTRY
(+4576) VMEXIT [ exitcode = 0x, rip = 0x c016104a ]
(+ 0) PAGE_FAULT [ errorcode = 0x000b, virt = 0x fffb61d0 ]
(+ 0) PTE_WRITE [ gpa = 0x 3d5981d0 gpte = 0x 3d55d047 ]

This indeed has the accessed bit clear.

(+ 65216) VMENTRY
(+4232) VMEXIT [ exitcode = 0x, rip = 0x c01610e7 ]
(+ 0) PAGE_FAULT [ errorcode = 0x0003, virt = 0x c0009db4 ]
(+ 0) PTE_WRITE [ gpa = 0x 9db4 gpte = 0x 3d598363 ]

This has the accessed bit set and the user bit clear, and the pte pointing at the previous pte_write gpa. Looks like a kmap_atomic().

(+8640) VMENTRY
(+3936) VMEXIT [ exitcode = 0x, rip = 0x c01610ee ]
(+ 0) PAGE_FAULT [ errorcode = 0x0003, virt = 0x c0009db0 ]
(+ 0) PTE_WRITE [ gpa = 0x 9db0 gpte = 0x ]
(+ 14160) VMENTRY

I can forward a more complete time snippet if you'd like. vcpu0 + corresponding vcpu1 files have 85000 total lines and compressed the files total ~500k. I did not see the FLOODED trace come out during this sample though I did bump the count from 3 to 4 as you suggested.

Bumping the count was supposed to remove the flooding...

Correlating rip addresses to the 2.4 kernel:

c0160d00-c0161290 = page_referenced

It looks like the event is kscand running through the pages. I suspected this some time ago, and tried tweaking the kscand_work_percent sysctl variable. It appeared to lower the peak of the spikes, but maybe I imagined it. I believe lowering that value makes kscand wake up more often but do less work (page scanning) each time it is awakened.

What does 'top' in the guest show (perhaps sorted by total cpu time rather than instantaneous usage)? What host kernel are you running? How many host cpus?

-- 
error compiling committee.c: too many arguments to function
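The remarks about the accessed and user bits in the traced gpte values can be reproduced by decoding the low flag bits of an x86 page table entry. The sketch below uses the standard x86 bit positions; the macro and function names are illustrative and are not taken from the KVM sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard x86 page table entry flag bits. */
#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PTE_USER      (1ULL << 2)
#define PTE_ACCESSED  (1ULL << 5)
#define PTE_DIRTY     (1ULL << 6)

static bool pte_accessed(uint64_t pte) { return (pte & PTE_ACCESSED) != 0; }
static bool pte_user(uint64_t pte)     { return (pte & PTE_USER) != 0; }

/* Physical address of the 4K page the entry points at (flags masked off). */
static uint64_t pte_page_addr(uint64_t pte)
{
    return pte & ~0xfffULL;
}
```

Applied to the trace: 0x3d55d047 has bit 5 clear (not accessed), while 0x3d598363 has bit 5 set and bit 2 clear (accessed, supervisor only), and its page address 0x3d598000 is the page containing the earlier PTE_WRITE gpa 0x3d5981d0.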
[kvm-devel] [PATCH 29/31] KVM: SVM: remove selective CR0 comment
From: Joerg Roedel [EMAIL PROTECTED]

There is no selective cr0 intercept bug. The code in the comment sets the CR0.PG bit. But KVM always sets the CR0.PG bit for SVM to implement the paged real mode. So the 'mov %eax,%cr0' instruction does not change the CR0.PG bit. Selective CR0 intercepts only occur when a bit is actually changed. So it's the right behavior that there is no intercept on this instruction.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   11 -----------
 1 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d643605..89e0be2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -513,17 +513,6 @@ static void init_vmcb(struct vcpu_svm *svm)
 	control->intercept = 	(1ULL << INTERCEPT_INTR) |
 				(1ULL << INTERCEPT_NMI) |
 				(1ULL << INTERCEPT_SMI) |
-		/*
-		 * selective cr0 intercept bug?
-		 *	0:   0f 22 d8                mov    %eax,%cr3
-		 *	3:   0f 20 c0                mov    %cr0,%eax
-		 *	6:   0d 00 00 00 80          or     $0x80000000,%eax
-		 *	b:   0f 22 c0                mov    %eax,%cr0
-		 * set cr3 ->interception
-		 * get cr0 ->interception
-		 * set cr0 -> no interception
-		 */
-		/*              (1ULL << INTERCEPT_SELECTIVE_CR0) | */
 				(1ULL << INTERCEPT_CPUID) |
 				(1ULL << INTERCEPT_INVD) |
 				(1ULL << INTERCEPT_HLT) |
-- 
1.5.5
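A small sketch of the rule the commit message relies on, for illustration only. It assumes, per my reading of the AMD manual, that a selective CR0 write intercept fires only when the write changes some bit other than CR0.TS or CR0.MP; the function name is hypothetical and this is a model, not KVM code.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PE (1u << 0)
#define X86_CR0_MP (1u << 1)
#define X86_CR0_TS (1u << 3)
#define X86_CR0_PG (1u << 31)

/*
 * Hypothetical model: a selective CR0 write intercept triggers only if
 * the new value differs from the old one in a bit other than TS or MP
 * (assumption stated in the lead-in).
 */
static bool selective_cr0_write_intercepted(uint32_t old_cr0, uint32_t new_cr0)
{
	uint32_t ignored = X86_CR0_TS | X86_CR0_MP;

	return ((old_cr0 ^ new_cr0) & ~ignored) != 0;
}
```

If CR0.PG is already set in the value the hardware sees, the guest's `or $0x80000000,%eax; mov %eax,%cr0` sequence leaves every bit unchanged, so under this model no intercept is expected.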
[kvm-devel] [PATCH 01/31] KVM: MMU: Don't assume struct page for x86
From: Anthony Liguori [EMAIL PROTECTED]

This patch introduces a gfn_to_pfn() function and corresponding functions like kvm_release_pfn_dirty(). Using these new functions, we can modify the x86 MMU to no longer assume that it can always get a struct page for any given gfn.

We don't want to eliminate gfn_to_page() entirely because a number of places assume they can do gfn_to_page() and then kmap() the results. When we support IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will succeed.

This does not implement support for avoiding reference counting for reserved RAM or for IO memory. However, it should make those things pretty straightforward.

Since we're only introducing new common symbols, I don't think it will break the non-x86 architectures, but I haven't tested those. I've tested Intel, AMD, NPT, and hugetlbfs with Windows and Linux guests.

[avi: fix overflow when shifting left pfns by adding casts]

Signed-off-by: Anthony Liguori [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c         |   89 +--
 arch/x86/kvm/paging_tmpl.h |   26 ++--
 include/asm-x86/kvm_host.h |    4 +-
 include/linux/kvm_host.h   |   12 ++
 include/linux/kvm_types.h  |    2 +
 virt/kvm/kvm_main.c        |   68 ++---
 6 files changed, 133 insertions(+), 68 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c89bf23..078a7f1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -240,11 +240,9 @@ static int is_rmap_pte(u64 pte)
 	return is_shadow_present_pte(pte);
 }
 
-static struct page *spte_to_page(u64 pte)
+static pfn_t spte_to_pfn(u64 pte)
 {
-	hfn_t hfn = (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
-
-	return pfn_to_page(hfn);
+	return (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
 }
 
 static gfn_t pse36_gfn_delta(u32 gpte)
@@ -541,20 +539,20 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	struct kvm_rmap_desc *desc;
 	struct kvm_rmap_desc *prev_desc;
 	struct kvm_mmu_page *sp;
-	struct page *page;
+	pfn_t pfn;
 	unsigned long *rmapp;
 	int i;
 
 	if (!is_rmap_pte(*spte))
 		return;
 	sp = page_header(__pa(spte));
-	page = spte_to_page(*spte);
+	pfn = spte_to_pfn(*spte);
 	if (*spte & PT_ACCESSED_MASK)
-		mark_page_accessed(page);
+		kvm_set_pfn_accessed(pfn);
 	if (is_writeble_pte(*spte))
-		kvm_release_page_dirty(page);
+		kvm_release_pfn_dirty(pfn);
 	else
-		kvm_release_page_clean(page);
+		kvm_release_pfn_clean(pfn);
 	rmapp = gfn_to_rmap(kvm, sp->gfns[spte - sp->spt], is_large_pte(*spte));
 	if (!*rmapp) {
 		printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
@@ -635,11 +633,11 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 		spte = rmap_next(kvm, rmapp, spte);
 	}
 	if (write_protected) {
-		struct page *page;
+		pfn_t pfn;
 
 		spte = rmap_next(kvm, rmapp, NULL);
-		page = spte_to_page(*spte);
-		SetPageDirty(page);
+		pfn = spte_to_pfn(*spte);
+		kvm_set_pfn_dirty(pfn);
 	}
 
 	/* check for huge page mappings */
@@ -1036,7 +1034,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 			 unsigned pt_access, unsigned pte_access,
 			 int user_fault, int write_fault, int dirty,
 			 int *ptwrite, int largepage, gfn_t gfn,
-			 struct page *page, bool speculative)
+			 pfn_t pfn, bool speculative)
 {
 	u64 spte;
 	int was_rmapped = 0;
@@ -1058,10 +1056,9 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 
 			child = page_header(pte & PT64_BASE_ADDR_MASK);
 			mmu_page_remove_parent_pte(child, shadow_pte);
-		} else if (page != spte_to_page(*shadow_pte)) {
+		} else if (pfn != spte_to_pfn(*shadow_pte)) {
 			pgprintk("hfn old %lx new %lx\n",
-				 page_to_pfn(spte_to_page(*shadow_pte)),
-				 page_to_pfn(page));
+				 spte_to_pfn(*shadow_pte), pfn);
 			rmap_remove(vcpu->kvm, shadow_pte);
 		} else {
 			if (largepage)
@@ -1090,7 +1087,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 	if (largepage)
 		spte |= PT_PAGE_SIZE_MASK;
 
-	spte |= page_to_phys(page);
+	spte |= (u64)pfn << PAGE_SHIFT;
 
 	if ((pte_access & ACC_WRITE_MASK)
 	    || (write_fault && !is_write_protection(vcpu) && !user_fault)) {
@@ -1135,12 +1132,12 @@ unshadowed:
 	if (!was_rmapped) {
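As an aside on the "[avi: fix overflow when shifting left pfns by adding casts]" note, the following sketch shows why the cast matters. It assumes a host where the pfn type is 32 bits wide (modelled here with a hypothetical `pfn32_t`); the function names are illustrative and not taken from the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical stand-in for a 32-bit-wide pfn type on a 32-bit host. */
typedef uint32_t pfn32_t;

/* Shift done in 32 bits: the upper address bits are lost before widening. */
static uint64_t pfn_to_phys_truncating(pfn32_t pfn)
{
	return pfn << PAGE_SHIFT;
}

/* Widen first, then shift: the full physical address is preserved. */
static uint64_t pfn_to_phys(pfn32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT;
}
```

For pfn 0x100000 (the page at 4 GiB), the first variant yields 0 while the second yields 0x100000000, which is the behaviour the `(u64)pfn << PAGE_SHIFT` form in the diff is after.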
[kvm-devel] [PATCH 18/31] KVM: x86 emulator: fix lea to really get the effective address
We never hit this, since there is currently no reason to emulate lea.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 46ef78f..2ca0838 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -1512,7 +1512,7 @@ special_insn:
 	case 0x88 ... 0x8b:	/* mov */
 		goto mov;
 	case 0x8d: /* lea r16/r32, m */
-		c->dst.val = c->modrm_val;
+		c->dst.val = c->modrm_ea;
 		break;
 	case 0x8f:		/* pop (sole member of Grp1a) */
 		rc = emulate_grp1a(ctxt, ops);
-- 
1.5.5
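To illustrate the distinction the one-line fix depends on, here is a toy model (not the KVM decoder): lea must return the computed effective address, whereas a load returns the value stored at that address. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded memory operand, loosely mirroring modrm_ea/modrm_val. */
struct decoded_operand {
	uint64_t ea;	/* effective address computed from base + displacement */
	uint64_t val;	/* value read from memory at that address */
};

/* Compute the effective address and fetch the value stored there. */
static struct decoded_operand decode_mem(const uint8_t *mem, uint64_t base,
					 uint64_t disp)
{
	struct decoded_operand op;

	op.ea = base + disp;
	memcpy(&op.val, mem + op.ea, sizeof(op.val));
	return op;
}

/* lea: the destination receives the address itself. */
static uint64_t emulate_lea(const struct decoded_operand *op)
{
	return op->ea;
}

/* mov (load): the destination receives the memory contents. */
static uint64_t emulate_mov_load(const struct decoded_operand *op)
{
	return op->val;
}
```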
[kvm-devel] [PATCH 11/31] KVM: hlt emulation should take in-kernel APIC/PIT timers into account
From: Marcelo Tosatti [EMAIL PROTECTED]

Timers that fire between guest hlt and vcpu_block's add_wait_queue() are ignored, possibly resulting in hangs.

Also make sure that atomic_inc and waitqueue_active tests happen in the specified order, otherwise the following race is open:

CPU0                                        CPU1
                                            if (waitqueue_active(wq))
add_wait_queue()
if (!atomic_read(pit_timer->pending))
    schedule()
                                            atomic_inc(pit_timer->pending)

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/ia64/kvm/kvm-ia64.c  |    5 +++++
 arch/s390/kvm/interrupt.c |    5 +++++
 arch/x86/kvm/i8254.c      |   11 +++++++++++
 arch/x86/kvm/irq.c        |   15 +++++++++++++++
 arch/x86/kvm/irq.h        |    3 +++
 arch/x86/kvm/lapic.c      |   10 ++++++++++
 include/linux/kvm_host.h  |    1 +
 virt/kvm/kvm_main.c       |    1 +
 8 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 9c56b64..ca1cfb1 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1778,6 +1778,11 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn;
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index f62588c..fcd1ed8 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -325,6 +325,11 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
 	return rc;
 }
 
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
 {
 	u64 now, sltime;
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 06a241a..abb4b16 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -199,6 +199,7 @@ int __pit_timer_fn(struct kvm_kpit_state *ps)
 	struct kvm_kpit_timer *pt = &ps->pit_timer;
 
 	atomic_inc(&pt->pending);
+	smp_mb__after_atomic_inc();
 	if (vcpu0 && waitqueue_active(&vcpu0->wq)) {
 		vcpu0->arch.mp_state = VCPU_MP_STATE_RUNNABLE;
 		wake_up_interruptible(&vcpu0->wq);
@@ -210,6 +211,16 @@ int __pit_timer_fn(struct kvm_kpit_state *ps)
 	return (pt->period == 0 ? 0 : 1);
 }
 
+int pit_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pit *pit = vcpu->kvm->arch.vpit;
+
+	if (pit && vcpu->vcpu_id == 0)
+		return atomic_read(&pit->pit_state.pit_timer.pending);
+
+	return 0;
+}
+
 static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
 {
 	struct kvm_kpit_state *ps;
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index dbfe21c..ce1f583 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -26,6 +26,21 @@
 #include "i8254.h"
 
 /*
+ * check if there are pending timer events
+ * to be processed.
+ */
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	int ret;
+
+	ret = pit_has_pending_timer(vcpu);
+	ret |= apic_has_pending_timer(vcpu);
+
+	return ret;
+}
+EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
+
+/*
  * check if there is pending interrupt without
  * intack.
  */
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index fa5ed5d..1802134 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -85,4 +85,7 @@ void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
 
+int pit_has_pending_timer(struct kvm_vcpu *vcpu);
+int apic_has_pending_timer(struct kvm_vcpu *vcpu);
+
 #endif
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 31280df..debf582 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -952,6 +952,16 @@ static int __apic_timer_fn(struct kvm_lapic *apic)
 	return result;
 }
 
+int apic_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *lapic = vcpu->arch.apic;
+
+	if (lapic)
+		return atomic_read(&lapic->timer.pending);
+
+	return 0;
+}
+
 static int __inject_apic_timer_irq(struct kvm_lapic *apic)
 {
 	int vector;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bd0c2d2..0bc4003 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -269,6 +269,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm);
 
 int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
 int kvm_cpu_has_interrupt(struct kvm_vcpu *v);
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
 
 static inline void kvm_guest_enter(void)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d5911d9..47cbc6e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -765,6 +765,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	 * We will block until either an
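The ordering requirement described in the commit message can be sketched in userspace with C11 atomics. This is a simplified model under stated assumptions, not the kernel code: `struct timer_model` and both functions are hypothetical, and sequentially consistent atomics stand in for atomic_inc() followed by smp_mb__after_atomic_inc().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the shared state between the timer and a vcpu. */
struct timer_model {
	atomic_int pending;		/* like pit_timer->pending */
	atomic_bool waiter_queued;	/* like "the wait queue is active" */
};

static void timer_model_init(struct timer_model *t)
{
	atomic_init(&t->pending, 0);
	atomic_init(&t->waiter_queued, false);
}

/*
 * Timer side: publish the event first, then check for a sleeper.
 * Returns true if a wakeup must be issued.
 */
static bool timer_fire(struct timer_model *t)
{
	atomic_fetch_add(&t->pending, 1);	/* seq_cst, acts as a full barrier */
	return atomic_load(&t->waiter_queued);
}

/*
 * vcpu side: queue on the wait queue first, then re-check pending.
 * Returns true only if it is safe to go to sleep.
 */
static bool vcpu_should_sleep(struct timer_model *t)
{
	atomic_store(&t->waiter_queued, true);
	if (atomic_load(&t->pending) > 0) {
		atomic_store(&t->waiter_queued, false);
		return false;
	}
	return true;
}
```

With both sides ordered this way, either the timer observes the queued waiter and wakes it, or the vcpu observes the pending count and declines to sleep; the event cannot fall between the two checks.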
[kvm-devel] [PATCH 06/31] KVM: Add trace markers
From: Feng (Eric) Liu [EMAIL PROTECTED]

Trace markers allow userspace to trace execution of a virtual machine in order to monitor its performance.

Signed-off-by: Feng (Eric) Liu [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/vmx.c         |   35 ++-
 arch/x86/kvm/x86.c         |   26 +++
 include/asm-x86/kvm.h      |   20 ++
 include/asm-x86/kvm_host.h |   19 +
 include/linux/kvm.h        |   49 +++-
 5 files changed, 147 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6249810..8e5d664 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1843,6 +1843,8 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu, int irq)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	KVMTRACE_1D(INJ_VIRQ, vcpu, (u32)irq, handler);
+
 	if (vcpu->arch.rmode.active) {
 		vmx->rmode.irq.pending = true;
 		vmx->rmode.irq.vector = irq;
@@ -1993,6 +1995,8 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
 	if (is_page_fault(intr_info)) {
 		cr2 = vmcs_readl(EXIT_QUALIFICATION);
+		KVMTRACE_3D(PAGE_FAULT, vcpu, error_code, (u32)cr2,
+			    (u32)((u64)cr2 >> 32), handler);
 		return kvm_mmu_page_fault(vcpu, cr2, error_code);
 	}
 
@@ -2021,6 +2025,7 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu,
 				     struct kvm_run *kvm_run)
 {
 	++vcpu->stat.irq_exits;
+	KVMTRACE_1D(INTR, vcpu, vmcs_read32(VM_EXIT_INTR_INFO), handler);
 	return 1;
 }
 
@@ -2078,6 +2083,8 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	reg = (exit_qualification >> 8) & 15;
 	switch ((exit_qualification >> 4) & 3) {
 	case 0: /* mov to cr */
+		KVMTRACE_3D(CR_WRITE, vcpu, (u32)cr, (u32)vcpu->arch.regs[reg],
+			    (u32)((u64)vcpu->arch.regs[reg] >> 32), handler);
 		switch (cr) {
 		case 0:
 			vcpu_load_rsp_rip(vcpu);
@@ -2110,6 +2117,7 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		vcpu->arch.cr0 &= ~X86_CR0_TS;
 		vmcs_writel(CR0_READ_SHADOW, vcpu->arch.cr0);
 		vmx_fpu_activate(vcpu);
+		KVMTRACE_0D(CLTS, vcpu, handler);
 		skip_emulated_instruction(vcpu);
 		return 1;
 	case 1: /*mov from cr*/
@@ -2118,12 +2126,18 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 			vcpu_load_rsp_rip(vcpu);
 			vcpu->arch.regs[reg] = vcpu->arch.cr3;
 			vcpu_put_rsp_rip(vcpu);
+			KVMTRACE_3D(CR_READ, vcpu, (u32)cr,
+				    (u32)vcpu->arch.regs[reg],
+				    (u32)((u64)vcpu->arch.regs[reg] >> 32),
+				    handler);
 			skip_emulated_instruction(vcpu);
 			return 1;
 		case 8:
 			vcpu_load_rsp_rip(vcpu);
 			vcpu->arch.regs[reg] = kvm_get_cr8(vcpu);
 			vcpu_put_rsp_rip(vcpu);
+			KVMTRACE_2D(CR_READ, vcpu, (u32)cr,
+				    (u32)vcpu->arch.regs[reg], handler);
 			skip_emulated_instruction(vcpu);
 			return 1;
 		}
@@ -2169,6 +2183,7 @@ static int handle_dr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 			val = 0;
 		}
 		vcpu->arch.regs[reg] = val;
+		KVMTRACE_2D(DR_READ, vcpu, (u32)dr, (u32)val, handler);
 	} else {
 		/* mov to dr */
 	}
@@ -2193,6 +2208,9 @@ static int handle_rdmsr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		return 1;
 	}
 
+	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
+		    handler);
+
 	/* FIXME: handling of bits 32:63 of rax, rdx */
 	vcpu->arch.regs[VCPU_REGS_RAX] = data & -1u;
 	vcpu->arch.regs[VCPU_REGS_RDX] = (data >> 32) & -1u;
@@ -2206,6 +2224,9 @@ static int handle_wrmsr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	u64 data = (vcpu->arch.regs[VCPU_REGS_RAX] & -1u)
 		| ((u64)(vcpu->arch.regs[VCPU_REGS_RDX] & -1u) << 32);
 
+	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
+		    handler);
+
 	if (vmx_set_msr(vcpu, ecx, data) != 0) {
 		kvm_inject_gp(vcpu, 0);
 		return 1;
@@ -2230,6 +2251,9 @@ static int handle_interrupt_window(struct kvm_vcpu *vcpu,
 	cpu_based_vm_exec_control = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
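Several of the trace calls in this patch pass a 64-bit quantity (cr2, a register, an MSR value) as two 32-bit data words, low half first. A minimal sketch of that split, and of how a trace consumer might rejoin the halves, follows; the helper names are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Low 32 bits, as in (u32)data. */
static uint32_t lo32(uint64_t v)
{
	return (uint32_t)v;
}

/* High 32 bits, as in (u32)(data >> 32). */
static uint32_t hi32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Hypothetical consumer-side helper: rebuild the original 64-bit value. */
static uint64_t join64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```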
[kvm-devel] [PATCH 22/31] KVM: Add MAINTAINERS entry for PowerPC KVM
From: Hollis Blanchard [EMAIL PROTECTED]

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Acked-by: Paul Mackerras [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 MAINTAINERS |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index a18aac1..6072f2f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2293,6 +2293,13 @@ L:	kvm-devel@lists.sourceforge.net
 W:	kvm.sourceforge.net
 S:	Supported
 
+KERNEL VIRTUAL MACHINE (KVM) FOR POWERPC
+P:	Hollis Blanchard
+M:	[EMAIL PROTECTED]
+L:	[EMAIL PROTECTED]
+W:	kvm.sourceforge.net
+S:	Supported
+
 KERNEL VIRTUAL MACHINE For Itanium(KVM/IA64)
 P:	Anthony Xu
 M:	[EMAIL PROTECTED]
-- 
1.5.5
[kvm-devel] [PATCH 31/31] KVM: MMU: kvm_pv_mmu_op should not take mmap_sem
From: Marcelo Tosatti [EMAIL PROTECTED]

kvm_pv_mmu_op should not take mmap_sem. All gfn_to_page() callers down in the MMU processing will take it if necessary, so as it is it can deadlock.

Apparently a leftover from the days before slots_lock.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 078a7f1..2ad6f54 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2173,8 +2173,6 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
 	int r;
 	struct kvm_pv_mmu_op_buffer buffer;
 
-	down_read(&current->mm->mmap_sem);
-
 	buffer.ptr = buffer.buf;
 	buffer.len = min_t(unsigned long, bytes, sizeof buffer.buf);
 	buffer.processed = 0;
@@ -2194,7 +2192,6 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
 	r = 1;
 out:
 	*ret = buffer.processed;
-	up_read(&current->mm->mmap_sem);
 	return r;
 }
 
-- 
1.5.5
[kvm-devel] [PATCH 07/31] KVM: s390: Stub out kvmtrace
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/s390/kvm/Kconfig |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index 2489b34..1761b74 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -36,6 +36,9 @@ config KVM
 
 	  If unsure, say N.
 
+config KVM_TRACE
+	bool
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/virtio/Kconfig
-- 
1.5.5
[kvm-devel] [PATCH 26/31] KVM: SVM: sync V_TPR with LAPIC.TPR if CR8 write intercept is disabled
From: Joerg Roedel [EMAIL PROTECTED]

If the CR8 write intercept is disabled, the V_TPR field of the VMCB needs to be synced with the TPR field in the local APIC.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f8ce36e..ee2ee83 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1620,6 +1620,16 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
 {
 }
 
+static inline void sync_cr8_to_lapic(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	if (!(svm->vmcb->control.intercept_cr_write & INTERCEPT_CR8_MASK)) {
+		int cr8 = svm->vmcb->control.int_ctl & V_TPR_MASK;
+		kvm_lapic_set_tpr(vcpu, cr8);
+	}
+}
+
 static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1791,6 +1801,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 
 	stgi();
 
+	sync_cr8_to_lapic(vcpu);
+
 	svm->next_rip = 0;
 }
 
-- 
1.5.5
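The sync step can be modelled outside the kernel as follows. This is only a sketch: the two structs are hypothetical stand-ins for the VMCB control area and the local APIC, and the mask values (V_TPR in the low four bits of int_ctl, CR8 as bit 8 of the CR-write intercept word) are assumptions on my part rather than something this patch shows. The architectural relation CR8[3:0] = TPR[7:4] is what the shift expresses.

```c
#include <stdint.h>

#define V_TPR_MASK		0x0fu		/* assumed: low 4 bits of int_ctl */
#define INTERCEPT_CR8_MASK	(1u << 8)	/* assumed: CR8 bit in CR-write intercepts */

/* Hypothetical stand-ins for the VMCB control area and the local APIC. */
struct vmcb_model {
	uint32_t intercept_cr_write;
	uint32_t int_ctl;
};

struct lapic_model {
	uint8_t tpr;
};

/*
 * If CR8 writes are not intercepted, the guest may have changed V_TPR
 * without the hypervisor noticing, so copy it back into the APIC's TPR.
 */
static void sync_cr8_to_lapic(const struct vmcb_model *vmcb,
			      struct lapic_model *lapic)
{
	if (!(vmcb->intercept_cr_write & INTERCEPT_CR8_MASK)) {
		uint32_t cr8 = vmcb->int_ctl & V_TPR_MASK;

		lapic->tpr = (uint8_t)(cr8 << 4);	/* CR8[3:0] maps to TPR[7:4] */
	}
}
```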
[kvm-devel] [PATCH 05/31] KVM: SVM: add intercept for machine check exception
From: Joerg Roedel [EMAIL PROTECTED]

To properly forward an MCE that occurred while the guest is running to the host, we have to intercept this exception and call the host handler by hand. This patch implements that.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c         |   17 ++++++++++++++++-
 include/asm-x86/kvm_host.h |    1 +
 2 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8af463b..da3ddef 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -507,7 +507,8 @@ static void init_vmcb(struct vcpu_svm *svm)
 					INTERCEPT_DR7_MASK;
 
 	control->intercept_exceptions = (1 << PF_VECTOR) |
-					(1 << UD_VECTOR);
+					(1 << UD_VECTOR) |
+					(1 << MC_VECTOR);
 
 
 	control->intercept = 	(1ULL << INTERCEPT_INTR) |
@@ -1044,6 +1045,19 @@ static int nm_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 	return 1;
 }
 
+static int mc_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
+{
+	/*
+	 * On an #MC intercept the MCE handler is not called automatically in
+	 * the host. So do it by hand here.
+	 */
+	asm volatile (
+		"int $0x12\n");
+	/* not sure if we ever come back to this point */
+
+	return 1;
+}
+
 static int shutdown_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
 	/*
@@ -1367,6 +1381,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm,
 	[SVM_EXIT_EXCP_BASE + UD_VECTOR]	= ud_interception,
 	[SVM_EXIT_EXCP_BASE + PF_VECTOR]	= pf_interception,
 	[SVM_EXIT_EXCP_BASE + NM_VECTOR]	= nm_interception,
+	[SVM_EXIT_EXCP_BASE + MC_VECTOR]	= mc_interception,
 	[SVM_EXIT_INTR]				= nop_on_interception,
 	[SVM_EXIT_NMI]				= nop_on_interception,
 	[SVM_EXIT_SMI]				= nop_on_interception,
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index de3eccf..2861178 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -62,6 +62,7 @@
 #define SS_VECTOR 12
 #define GP_VECTOR 13
 #define PF_VECTOR 14
+#define MC_VECTOR 18
 
 #define SELECTOR_TI_MASK (1 << 2)
 #define SELECTOR_RPL_MASK 0x03
-- 
1.5.5
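As a quick illustration of the exception intercept bitmap this patch extends, the sketch below builds the same mask in plain C. The vector numbers are the architectural ones (#UD is 6, #PF is 14, #MC is 18, which is also the 0x12 used by the `int` instruction); the two function names are hypothetical.

```c
#include <stdint.h>

#define UD_VECTOR 6
#define PF_VECTOR 14
#define MC_VECTOR 18	/* 0x12, the vector raised by "int $0x12" */

/* One bit per exception vector, as in control->intercept_exceptions. */
static uint32_t exception_intercepts(void)
{
	return (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR);
}

/* Hypothetical helper: is the given vector intercepted by this bitmap? */
static int exception_intercepted(uint32_t bitmap, unsigned int vector)
{
	return (bitmap >> vector) & 1u;
}
```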
[kvm-devel] [PATCH 13/31] KVM: add ioctls to save/store mpstate
From: Marcelo Tosatti [EMAIL PROTECTED]

So userspace can save/restore the mpstate during migration.

[avi: export the #define constants describing the value]
[christian: add s390 stubs]
[avi: ditto for ia64]

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Signed-off-by: Christian Borntraeger [EMAIL PROTECTED]
Signed-off-by: Carsten Otte [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]

KVM: ia64: provide get/set_mp_state stubs to fix compile error

Since

commit ded6fb24fb694bcc5f308a02ec504d45fbc8aaa6
Author: Marcelo Tosatti [EMAIL PROTECTED]
Date:   Fri Apr 11 13:24:45 2008 -0300

    KVM: add ioctls to save/store mpstate

kvm does not compile on ia64. This patch provides ioctl stubs for ia64 to make kvm.git compile again.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/ia64/kvm/kvm-ia64.c   |   12 ++++++++++++
 arch/s390/kvm/kvm-s390.c   |   12 ++++++++++++
 arch/x86/kvm/x86.c         |   19 +++++++++++++++++++
 include/asm-x86/kvm_host.h |    5 -----
 include/linux/kvm.h        |   15 +++++++++++++++
 include/linux/kvm_host.h   |    4 ++++
 virt/kvm/kvm_main.c        |   24 ++++++++++++++++++++++++
 7 files changed, 86 insertions(+), 5 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index f7589db..6df0732 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1792,3 +1792,15 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE;
 }
+
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL;
+}
+
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL;
+}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d966137..98d1e73 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -414,6 +414,18 @@ int kvm_arch_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
 	return -EINVAL; /* not implemented yet */
 }
 
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL; /* not implemented yet */
+}
+
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL; /* not implemented yet */
+}
+
 static void __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b364d19..5c3c9d3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -817,6 +817,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_CLOCKSOURCE:
 	case KVM_CAP_PIT:
 	case KVM_CAP_NOP_IO_DELAY:
+	case KVM_CAP_MP_STATE:
 		r = 1;
 		break;
 	case KVM_CAP_VAPIC:
@@ -3083,6 +3084,24 @@ int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	vcpu_load(vcpu);
+	mp_state->mp_state = vcpu->arch.mp_state;
+	vcpu_put(vcpu);
+	return 0;
+}
+
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	vcpu_load(vcpu);
+	vcpu->arch.mp_state = mp_state->mp_state;
+	vcpu_put(vcpu);
+	return 0;
+}
+
 static void set_segment(struct kvm_vcpu *vcpu,
 			struct kvm_segment *var, int seg)
 {
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index f35a6ad..9d963cd 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -227,11 +227,6 @@ struct kvm_vcpu_arch {
 	u64 shadow_efer;
 	u64 apic_base;
 	struct kvm_lapic *apic;    /* kernel irqchip context */
-#define KVM_MP_STATE_RUNNABLE          0
-#define KVM_MP_STATE_UNINITIALIZED     1
-#define KVM_MP_STATE_INIT_RECEIVED     2
-#define KVM_MP_STATE_SIPI_RECEIVED     3
-#define KVM_MP_STATE_HALTED            4
 	int mp_state;
 	int sipi_vector;
 	u64 ia32_misc_enable_msr;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d302d63..f8e211d 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -228,6 +228,18 @@ struct kvm_vapic_addr {
 	__u64 vapic_addr;
 };
 
+/* for KVM_SET_MPSTATE */
+
+#define KVM_MP_STATE_RUNNABLE          0
+#define KVM_MP_STATE_UNINITIALIZED     1
+#define KVM_MP_STATE_INIT_RECEIVED     2
+#define KVM_MP_STATE_HALTED            3
+#define KVM_MP_STATE_SIPI_RECEIVED     4
+
+struct kvm_mp_state {
+	__u32 mp_state;
+};
+
 struct kvm_s390_psw {
 	__u64 mask;
 	__u64 addr;
@@ -326,6 +338,7 @@ struct kvm_trace_rec {
 #define KVM_CAP_PIT 11
 #define KVM_CAP_NOP_IO_DELAY 12
 #define
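To show the intended save/restore flow without needing a running VM, here is a self-contained model. The constants are copied from the linux/kvm.h hunk in this patch; the structs and functions are hypothetical stand-ins for the vcpu and for the get/set handlers, so treat this as a sketch of the idea rather than the userspace API.

```c
#include <stdint.h>

/* Values as exported by this patch in include/linux/kvm.h. */
#define KVM_MP_STATE_RUNNABLE		0
#define KVM_MP_STATE_UNINITIALIZED	1
#define KVM_MP_STATE_INIT_RECEIVED	2
#define KVM_MP_STATE_HALTED		3
#define KVM_MP_STATE_SIPI_RECEIVED	4

/* Hypothetical stand-ins for struct kvm_mp_state and the vcpu. */
struct mp_state_model {
	uint32_t mp_state;
};

struct vcpu_model {
	int mp_state;
};

/* Mirrors the shape of the x86 get handler: copy state out. */
static int get_mpstate(const struct vcpu_model *vcpu, struct mp_state_model *out)
{
	out->mp_state = (uint32_t)vcpu->mp_state;
	return 0;
}

/* Mirrors the shape of the x86 set handler: copy state in. */
static int set_mpstate(struct vcpu_model *vcpu, const struct mp_state_model *in)
{
	vcpu->mp_state = (int)in->mp_state;
	return 0;
}

/* Migration in miniature: read from the source vcpu, write to the target. */
static int migrate_mpstate(const struct vcpu_model *src, struct vcpu_model *dst)
{
	struct mp_state_model saved;
	int r = get_mpstate(src, &saved);

	return r ? r : set_mpstate(dst, &saved);
}
```

Without this round trip, a vcpu that was halted on the source would presumably come up in whatever default state the destination assigns, which is the situation the commit message wants to avoid.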
[kvm-devel] [PATCH 03/31] KVM: SVM: indent svm_set_cr4 with tabs instead of spaces
From: Joerg Roedel [EMAIL PROTECTED]

The svm_set_cr4 function is indented with spaces. This patch replaces them with tabs.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ad27346..d7439ce 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -878,10 +878,10 @@ set:
 
 static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
-       vcpu->arch.cr4 = cr4;
-       if (!npt_enabled)
-          cr4 |= X86_CR4_PAE;
-       to_svm(vcpu)->vmcb->save.cr4 = cr4;
+	vcpu->arch.cr4 = cr4;
+	if (!npt_enabled)
+		cr4 |= X86_CR4_PAE;
+	to_svm(vcpu)->vmcb->save.cr4 = cr4;
 }
 
 static void svm_set_segment(struct kvm_vcpu *vcpu,
-- 
1.5.5
[kvm-devel] [PATCH 10/31] KVM: SVM: do not intercept task switch with NPT
From: Joerg Roedel [EMAIL PROTECTED]

When KVM uses NPT there is no reason to intercept task switches. This patch removes the intercept for it in that case.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index da3ddef..8d04aed 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -591,6 +591,7 @@ static void init_vmcb(struct vcpu_svm *svm)
 	if (npt_enabled) {
 		/* Setup VMCB for Nested Paging */
 		control->nested_ctl = 1;
+		control->intercept &= ~(1ULL << INTERCEPT_TASK_SWITCH);
 		control->intercept_exceptions &= ~(1 << PF_VECTOR);
 		control->intercept_cr_read &= ~(INTERCEPT_CR0_MASK|
 						INTERCEPT_CR3_MASK);
-- 
1.5.5
[kvm-devel] [PATCH 28/31] KVM: SVM: remove now obsolete FIXME comment
From: Joerg Roedel [EMAIL PROTECTED]

With the usage of the V_TPR field this comment is now obsolete.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |    7 -------
 1 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 61bb2cb..d643605 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -916,13 +916,6 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
 
 }
 
-/* FIXME:
-
-	svm(vcpu)->vmcb->control.int_ctl &= ~V_TPR_MASK;
-	svm(vcpu)->vmcb->control.int_ctl |= (sregs->cr8 & V_TPR_MASK);
-
-*/
-
 static int svm_guest_debug(struct kvm_vcpu *vcpu, struct kvm_debug_guest *dbg)
 {
 	return -EOPNOTSUPP;
-- 
1.5.5
[kvm-devel] [PATCH 09/31] KVM: Add kvm trace userspace interface
From: Feng(Eric) Liu [EMAIL PROTECTED]

This interface allows a userspace application to read the trace of kvm-related events through relayfs.

Signed-off-by: Feng (Eric) Liu [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/Kconfig     |   11 ++
 arch/x86/kvm/Makefile    |    3 +
 include/linux/kvm_host.h |   14 +++
 virt/kvm/kvm_main.c      |    8 +-
 virt/kvm/kvm_trace.c     |  276 ++
 5 files changed, 311 insertions(+), 1 deletions(-)
 create mode 100644 virt/kvm/kvm_trace.c

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 76c70ab..8d45fab 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -50,6 +50,17 @@ config KVM_AMD
 	  Provides support for KVM on AMD processors equipped with the AMD-V
 	  (SVM) extensions.
 
+config KVM_TRACE
+	bool "KVM trace support"
+	depends on KVM && MARKERS && SYSFS
+	select RELAY
+	select DEBUG_FS
+	default n
+	---help---
+	  This option allows reading a trace of kvm-related events through
+	  relayfs.  Note the ABI is not considered stable and will be
+	  modified in future updates.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/lguest/Kconfig
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 4d0c22e..c97d35c 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -3,6 +3,9 @@
 #
 
 common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o)
+ifeq ($(CONFIG_KVM_TRACE),y)
+common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
+endif
 
 EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 578c363..bd0c2d2 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -15,6 +15,7 @@
 #include <linux/sched.h>
 #include <linux/mm.h>
 #include <linux/preempt.h>
+#include <linux/marker.h>
 #include <asm/signal.h>
 
 #include <linux/kvm.h>
@@ -309,5 +310,18 @@ struct kvm_stats_debugfs_item {
 	struct dentry *dentry;
 };
 extern struct kvm_stats_debugfs_item debugfs_entries[];
+extern struct dentry *debugfs_dir;
+
+#ifdef CONFIG_KVM_TRACE
+int kvm_trace_ioctl(unsigned int ioctl, unsigned long arg);
+void kvm_trace_cleanup(void);
+#else
+static inline
+int kvm_trace_ioctl(unsigned int ioctl, unsigned long arg)
+{
+	return -EINVAL;
+}
+#define kvm_trace_cleanup() ((void)0)
+#endif
 
 #endif
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6a52c08..d5911d9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -60,7 +60,7 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_cache);
 
 static __read_mostly struct preempt_ops kvm_preempt_ops;
 
-static struct dentry *debugfs_dir;
+struct dentry *debugfs_dir;
 
 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
@@ -1191,6 +1191,11 @@ static long kvm_dev_ioctl(struct file *filp,
 		r += PAGE_SIZE;    /* pio data page */
 #endif
 		break;
+	case KVM_TRACE_ENABLE:
+	case KVM_TRACE_PAUSE:
+	case KVM_TRACE_DISABLE:
+		r = kvm_trace_ioctl(ioctl, arg);
+		break;
 	default:
 		return kvm_arch_dev_ioctl(filp, ioctl, arg);
 	}
@@ -1519,6 +1524,7 @@ EXPORT_SYMBOL_GPL(kvm_init);
 
 void kvm_exit(void)
 {
+	kvm_trace_cleanup();
 	misc_deregister(&kvm_dev);
 	kmem_cache_destroy(kvm_vcpu_cache);
 	sysdev_unregister(&kvm_sysdev);
diff --git a/virt/kvm/kvm_trace.c b/virt/kvm/kvm_trace.c
new file mode 100644
index 0000000..5425440
--- /dev/null
+++ b/virt/kvm/kvm_trace.c
@@ -0,0 +1,276 @@
+/*
+ * kvm trace
+ *
+ * It is designed to allow debugging traces of kvm to be generated
+ * on UP / SMP machines.  Each trace entry can be timestamped so that
+ * it's possible to reconstruct a chronological record of trace events.
+ * The implementation refers to blktrace kernel support.
+ *
+ * Copyright (c) 2008 Intel Corporation
+ * Copyright (C) 2006 Jens Axboe [EMAIL PROTECTED]
+ *
+ * Authors: Feng(Eric) Liu, [EMAIL PROTECTED]
+ *
+ * Date:    Feb 2008
+ */
+
+#include <linux/module.h>
+#include <linux/relay.h>
+#include <linux/debugfs.h>
+
+#include <linux/kvm_host.h>
+
+#define KVM_TRACE_STATE_RUNNING 	(1 << 0)
+#define KVM_TRACE_STATE_PAUSE 		(1 << 1)
+#define KVM_TRACE_STATE_CLEARUP 	(1 << 2)
+
+struct kvm_trace {
+	int trace_state;
+	struct rchan *rchan;
+	struct dentry *lost_file;
+	atomic_t lost_records;
+};
+static struct kvm_trace *kvm_trace;
+
+struct kvm_trace_probe {
+	const char *name;
+	const char *format;
+	u32 cycle_in;
+	marker_probe_func *probe_func;
+};
+
+static inline int calc_rec_size(int cycle, int extra)
+{
+	int rec_size = KVM_TRC_HEAD_SIZE;
+
+	rec_size += extra;
+	return cycle ? rec_size += KVM_TRC_CYCLE_SIZE : rec_size;
+}
+
+static void
[kvm-devel] How to draw up a contract correctly
For accountants: contract work in an organization, legal foundations and tax aspects

7 May 2008, Moscow

Seminar program

1. How to draw up a contract correctly; mandatory and additional contract terms. When the simple written form of a contract is deemed to have been observed. When a contract requires state registration or notarization. Framework contracts. Offer (a unilateral proposal to conclude a transaction). Signing the contract.

2. Guarantee terms in contracts: pledge, deposit, penalty (tax advantages of guarantees compared with advance payments under a contract). Regulation, in contracts and "by default", of compensation for damage caused by non-performance of a contract. Lost profit. Nullity of unlawful transaction terms and its tax consequences.

3. Resolving contract disputes. Claims work. Recognizing a debt as doubtful or uncollectible; writing off debt. Limitation periods.

4. Price in the contract: ways of stating it (whether a definite price must be specified), units of measurement (rubles, foreign currency, conventional units), substantiating the market price. Discounts, one-off and cumulative: how they are granted and accounted for.

5. Settlement procedure under a contract; cash and non-cash payments, taking into account the changes to cash settlement rules under Central Bank Directive No. 1843-U of 20.06.2007, "On the maximum amount of cash settlements and the spending of cash received by the cash desk of a legal entity or of an individual entrepreneur". Dates for recognizing income and expenses under contracts. Settlement documents in rubles, foreign currency, and conventional units. Accompanying and settlement documents in electronic form. Acts: whether drawing up an act is mandatory, whether to wait for the end of the contract or draw up acts in stages, the form of the act, the Ministry of Finance's position on how acts are filled in and how detailed they must be.

6. Contracts between legal entities: drafting, accounting and taxation. How the tax burden depends on the type and content of the contract. Contracts and profit tax. Contracts and VAT.
- sale and purchase contract (subject matter, mandatory terms, documenting the return of goods and its consequences),
- barter contract (exchange and barter: how they differ, market price of the transaction),
- lease contract (registration of contracts, vehicle lease, office lease),
- insurance contract (personal and property insurance, liability insurance, tax relief),
- loan contracts (recognition of expenses, interest-free loans),
- contracts of gratuitous transfer and gratuitous use (restrictions on their scope and on recognition of expenses),
- intermediary contracts (specifics of commission, agency, and mandate contracts),
- simple partnership contract (participants, tax advantages, a participant's share and the distribution of expenses and income),
- contract for paid services (essential terms, types of service contracts, advantages over a work contract).

7. Contracts between an organization and individuals: collective, employment, and civil-law contracts, contracts with individual entrepreneurs; accounting and taxation, payments under such contracts. The possibility of managing the enterprise's tax burden by means of such contracts.

Duration: 10:00 to 17:00 (with a lunch break and a coffee break).
Venue: Moscow, 5 minutes' walk from Akademicheskaya metro station.
Cost: 4900 rubles (including VAT). (The price includes handouts, a coffee break, and lunch in a restaurant.)

If you are unable to attend the seminar, we offer its video version on DVD/CD or video cassette (the author's handouts are included). The video course costs 3500 rubles, including VAT.

To register for the seminar, fax us: your organization's details, the seminar topic and date, the participants' full names, and a contact telephone and fax number.

To order the video course, fax us: your organization's details, the video course topic, the medium (DVD or CD), telephone, fax, contact person, and exact delivery address.

For more information and to register, telephone/fax: ( Ч 9 5 ) 5 ЧЗ = 8 8 = Ч 6
[kvm-devel] [PATCH 20/31] ppc: Export tlb_44x_hwater for KVM
From: Hollis Blanchard [EMAIL PROTECTED]

PowerPC 440 KVM needs to know how many TLB entries are used for the host
kernel linear mapping (it does not modify these mappings when switching
between guest and host execution).

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Acked-by: Josh Boyer [EMAIL PROTECTED]
Acked-by: Paul Mackerras [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 include/asm-powerpc/mmu-44x.h |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/asm-powerpc/mmu-44x.h b/include/asm-powerpc/mmu-44x.h
index 62772ae..b6953e8 100644
--- a/include/asm-powerpc/mmu-44x.h
+++ b/include/asm-powerpc/mmu-44x.h
@@ -53,6 +53,8 @@

 #ifndef __ASSEMBLY__

+extern unsigned int tlb_44x_hwater;
+
 typedef unsigned long long phys_addr_t;

 typedef struct {
--
1.5.5
[kvm-devel] [PATCH 16/31] KVM: x86 emulator: initialize src.val and dst.val for register operands
This lets us treat the case where mod == 3 in the same manner as other cases.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index f59ed93..8e1b32f 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -1001,6 +1001,7 @@ done_prefixes:
 		 */
 		if ((c->d & ModRM) && c->modrm_mod == 3) {
 			c->src.type = OP_REG;
+			c->src.val = c->modrm_val;
 			break;
 		}
 		c->src.type = OP_MEM;
@@ -1044,6 +1045,7 @@ done_prefixes:
 	case DstMem:
 		if ((c->d & ModRM) && c->modrm_mod == 3) {
 			c->dst.type = OP_REG;
+			c->dst.val = c->dst.orig_val = c->modrm_val;
 			break;
 		}
 		c->dst.type = OP_MEM;
--
1.5.5
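For readers unfamiliar with the ModRM byte, here is a minimal sketch of why mod == 3 is the register case. It is not the emulator's code: the toy_* names are hypothetical, REX prefixes and byte-sized operands are ignored, and the memory value is passed in rather than fetched.

```c
#include <stdint.h>

/* Hypothetical toy types; the real emulator uses struct operand. */
enum toy_op_type { TOY_OP_REG, TOY_OP_MEM };

struct toy_operand {
	enum toy_op_type type;
	unsigned long val;
};

/*
 * Decode the r/m operand of a ModRM byte.
 * Bits 7:6 are "mod", bits 2:0 are "rm".  mod == 3 means the operand
 * is the register selected by rm; any other mod means a memory operand.
 * mem_val stands in for whatever a real memory read would return.
 */
static struct toy_operand toy_decode_rm(uint8_t modrm,
					const unsigned long regs[8],
					unsigned long mem_val)
{
	struct toy_operand op;
	unsigned int mod = (modrm >> 6) & 3u;
	unsigned int rm = modrm & 7u;

	if (mod == 3) {
		op.type = TOY_OP_REG;
		op.val = regs[rm];	/* value filled in at decode time */
	} else {
		op.type = TOY_OP_MEM;
		op.val = mem_val;
	}
	return op;
}
```

The point of the patch, as the description reads, is that the register case now carries a valid value after decode, just as the memory case does, so later stages need no special handling.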
[kvm-devel] [PATCH 12/31] KVM: Rename VCPU_MP_STATE_* to KVM_MP_STATE_*
We wish to export it to userspace, so move it into the kvm namespace.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/ia64/kvm/kvm-ia64.c    |   26 +-
 arch/x86/kvm/i8254.c        |    2 +-
 arch/x86/kvm/lapic.c        |   16
 arch/x86/kvm/x86.c          |   18 +-
 include/asm-ia64/kvm_host.h |    8
 include/asm-x86/kvm_host.h  |   10 +-
 6 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index ca1cfb1..f7589db 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -340,7 +340,7 @@ static int handle_ipi(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		regs->cr_iip = vcpu->kvm->arch.rdv_sal_data.boot_ip;
 		regs->r1 = vcpu->kvm->arch.rdv_sal_data.boot_gp;

-		target_vcpu->arch.mp_state = VCPU_MP_STATE_RUNNABLE;
+		target_vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		if (waitqueue_active(&target_vcpu->wq))
 			wake_up_interruptible(&target_vcpu->wq);
 	} else {
@@ -386,7 +386,7 @@ static int handle_global_purge(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	for (i = 0; i < KVM_MAX_VCPUS; i++) {
 		if (!kvm->vcpus[i] || kvm->vcpus[i]->arch.mp_state ==
-						VCPU_MP_STATE_UNINITIALIZED ||
+						KVM_MP_STATE_UNINITIALIZED ||
 					vcpu == kvm->vcpus[i])
 			continue;

@@ -437,12 +437,12 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 	hrtimer_start(p_ht, kt, HRTIMER_MODE_ABS);

 	if (irqchip_in_kernel(vcpu->kvm)) {
-		vcpu->arch.mp_state = VCPU_MP_STATE_HALTED;
+		vcpu->arch.mp_state = KVM_MP_STATE_HALTED;
 		kvm_vcpu_block(vcpu);
 		hrtimer_cancel(p_ht);
 		vcpu->arch.ht_active = 0;

-		if (vcpu->arch.mp_state != VCPU_MP_STATE_RUNNABLE)
+		if (vcpu->arch.mp_state != KVM_MP_STATE_RUNNABLE)
 			return -EINTR;
 		return 1;
 	} else {
@@ -668,7 +668,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)

 	vcpu_load(vcpu);

-	if (unlikely(vcpu->arch.mp_state == VCPU_MP_STATE_UNINITIALIZED)) {
+	if (unlikely(vcpu->arch.mp_state == KVM_MP_STATE_UNINITIALIZED)) {
 		kvm_vcpu_block(vcpu);
 		vcpu_put(vcpu);
 		return -EAGAIN;
@@ -1127,12 +1127,12 @@ static enum hrtimer_restart hlt_timer_fn(struct hrtimer *data)
 	wait_queue_head_t *q;

 	vcpu  = container_of(data, struct kvm_vcpu, arch.hlt_timer);
-	if (vcpu->arch.mp_state != VCPU_MP_STATE_HALTED)
+	if (vcpu->arch.mp_state != KVM_MP_STATE_HALTED)
 		goto out;

 	q = &vcpu->wq;
 	if (waitqueue_active(q)) {
-		vcpu->arch.mp_state = VCPU_MP_STATE_RUNNABLE;
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		wake_up_interruptible(q);
 	}
 out:
@@ -1159,7 +1159,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 		return PTR_ERR(vmm_vcpu);

 	if (vcpu->vcpu_id == 0) {
-		vcpu->arch.mp_state = VCPU_MP_STATE_RUNNABLE;
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;

 		/*Set entry address for first run.*/
 		regs->cr_iip = PALE_RESET_ENTRY;
@@ -1172,7 +1172,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 			v->arch.last_itc = 0;
 		}
 	} else
-		vcpu->arch.mp_state = VCPU_MP_STATE_UNINITIALIZED;
+		vcpu->arch.mp_state = KVM_MP_STATE_UNINITIALIZED;

 	r = -ENOMEM;
 	vcpu->arch.apic = kzalloc(sizeof(struct kvm_lapic), GFP_KERNEL);
@@ -1704,10 +1704,10 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, u8 vec, u8 trig)

 	if (!test_and_set_bit(vec, &vpd->irr[0])) {
 		vcpu->arch.irq_new_pending = 1;
-		 if (vcpu->arch.mp_state == VCPU_MP_STATE_RUNNABLE)
+		 if (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE)
 			kvm_vcpu_kick(vcpu);
-		else if (vcpu->arch.mp_state == VCPU_MP_STATE_HALTED) {
-			vcpu->arch.mp_state = VCPU_MP_STATE_RUNNABLE;
+		else if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED) {
+			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 			if (waitqueue_active(&vcpu->wq))
 				wake_up_interruptible(&vcpu->wq);
 		}
@@ -1790,5 +1790,5 @@ gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)

 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.mp_state == VCPU_MP_STATE_RUNNABLE;
+	return vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE;
 }
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index abb4b16..2852dd1 100644
---
[kvm-devel] [PATCH 21/31] KVM: ppc: Add DCR access information to struct kvm_run
From: Hollis Blanchard [EMAIL PROTECTED]

Device Control Registers are essentially another address space found on
PowerPC 4xx processors, analogous to PIO on x86. DCRs are always 32 bits,
and can be identified by a 32-bit number. We forward most DCR accesses to
userspace for emulation (with the exception of CPR0 registers, which can
be read directly for simplicity in timebase frequency determination).

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 include/linux/kvm.h |    7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index f8e211d..a281afe 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -82,6 +82,7 @@ struct kvm_irqchip {
 #define KVM_EXIT_TPR_ACCESS       12
 #define KVM_EXIT_S390_SIEIC       13
 #define KVM_EXIT_S390_RESET       14
+#define KVM_EXIT_DCR              15

 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */
 struct kvm_run {
@@ -161,6 +162,12 @@ struct kvm_run {
 #define KVM_S390_RESET_CPU_INIT  8
 #define KVM_S390_RESET_IPL       16
 		__u64 s390_reset_flags;
+		/* KVM_EXIT_DCR */
+		struct {
+			__u32 dcrn;
+			__u32 data;
+			__u8  is_write;
+		} dcr;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
--
1.5.5
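As a rough illustration of how userspace might consume these fields, the sketch below services one DCR exit against a tiny in-memory register file. Everything prefixed toy_ is hypothetical, as is the 16-entry register space; struct toy_dcr_exit only mirrors the three fields added by the patch and is not the real struct kvm_run.

```c
#include <stdint.h>

/* Hypothetical mirror of the dcr member added to struct kvm_run. */
struct toy_dcr_exit {
	uint32_t dcrn;		/* which DCR the guest accessed */
	uint32_t data;		/* value written, or slot for the value read */
	uint8_t  is_write;	/* nonzero for a write, zero for a read */
};

#define TOY_DCR_COUNT 16u	/* assumption: a very small emulated DCR space */

struct toy_dcr_bus {
	uint32_t regs[TOY_DCR_COUNT];
};

/*
 * Service one DCR access.  On a read, the result is stored back into
 * exit->data, where the kernel side would presumably pick it up when
 * the vcpu is resumed.  Returns 0 on success, -1 for an unknown DCR.
 */
static int toy_handle_dcr_exit(struct toy_dcr_bus *bus,
			       struct toy_dcr_exit *exit)
{
	if (exit->dcrn >= TOY_DCR_COUNT)
		return -1;

	if (exit->is_write)
		bus->regs[exit->dcrn] = exit->data;
	else
		exit->data = bus->regs[exit->dcrn];
	return 0;
}
```

A real device model would dispatch on dcrn to per-device handlers rather than index a flat array; the flat array just keeps the read/write direction handling visible.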
[kvm-devel] [PATCH 04/31] KVM: SVM: align shadow CR4.MCE with host
From: Joerg Roedel [EMAIL PROTECTED]

This patch aligns the host version of the CR4.MCE bit with the CR4 active
in the guest. This is necessary to get MCE exceptions when the guest is
running.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d7439ce..8af463b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -878,9 +878,12 @@ set:

 static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
+	unsigned long host_cr4_mce = read_cr4() & X86_CR4_MCE;
+
 	vcpu->arch.cr4 = cr4;
 	if (!npt_enabled)
 		cr4 |= X86_CR4_PAE;
+	cr4 |= host_cr4_mce;
 	to_svm(vcpu)->vmcb->save.cr4 = cr4;
 }

--
1.5.5
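The bit manipulation can be shown in isolation. This is a sketch only: toy_shadow_cr4 is a hypothetical pure function that takes the host CR4 as a parameter instead of calling read_cr4(), and it models just the value that would be written to the VMCB.

```c
#include <stdbool.h>

#define TOY_CR4_PAE (1ul << 5)	/* CR4.PAE is bit 5 */
#define TOY_CR4_MCE (1ul << 6)	/* CR4.MCE is bit 6 */

/*
 * Compute the CR4 value that is active while the guest runs:
 *  - without nested paging, PAE is forced on for the shadow page tables;
 *  - the host's MCE bit is OR-ed in so machine checks are still
 *    delivered while the guest is executing.
 */
static unsigned long toy_shadow_cr4(unsigned long guest_cr4,
				    unsigned long host_cr4,
				    bool npt_enabled)
{
	unsigned long cr4 = guest_cr4;

	if (!npt_enabled)
		cr4 |= TOY_CR4_PAE;
	cr4 |= host_cr4 & TOY_CR4_MCE;
	return cr4;
}
```

Note that the guest's own view (vcpu->arch.cr4 in the patch) is stored before these bits are added, so the guest does not observe the forced bits when it reads CR4 back.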
[kvm-devel] [PATCH 15/31] KVM: SVM: force a new asid when initializing the vmcb
Shutdown interception clears the vmcb, leaving the asid at zero (which is
illegal), so force a new asid on vmcb initialization.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8d04aed..3379e13 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -603,7 +603,7 @@ static void init_vmcb(struct vcpu_svm *svm)
 		save->cr3 = 0;
 		save->cr4 = 0;
 	}
-
+	force_new_asid(&svm->vcpu);
 }

 static int svm_vcpu_reset(struct kvm_vcpu *vcpu)
--
1.5.5
[kvm-devel] [PATCH 25/31] KVM: export kvm_lapic_set_tpr() to modules
From: Joerg Roedel [EMAIL PROTECTED]

This patch exports the kvm_lapic_set_tpr() function from the lapic code to
modules. It is required in the kvm-amd module to optimize CR8 intercepts.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/lapic.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2ccf994..57ac4e4 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -822,6 +822,7 @@ void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
 	apic_set_tpr(apic, ((cr8 & 0x0f) << 4)
 		     | (apic_get_reg(apic, APIC_TASKPRI) & 4));
 }
+EXPORT_SYMBOL_GPL(kvm_lapic_set_tpr);

 u64 kvm_lapic_get_cr8(struct kvm_vcpu *vcpu)
 {
--
1.5.5
[kvm-devel] [PATCH 27/31] KVM: SVM: disable CR8 intercept when tpr is not masking interrupts
From: Joerg Roedel [EMAIL PROTECTED]

This patch disables the intercept of CR8 writes if the TPR is not masking
interrupts. With Windows 64 bit guests, this reduces the total number of
CR8 intercepts to below 1 percent of what we have without this patch.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |   31 +++
 1 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ee2ee83..61bb2cb 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1502,6 +1502,27 @@ static void svm_set_irq(struct kvm_vcpu *vcpu, int irq)
 	svm_inject_irq(svm, irq);
 }

+static void update_cr8_intercept(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	struct vmcb *vmcb = svm->vmcb;
+	int max_irr, tpr;
+
+	if (!irqchip_in_kernel(vcpu->kvm) || vcpu->arch.apic->vapic_addr)
+		return;
+
+	vmcb->control.intercept_cr_write &= ~INTERCEPT_CR8_MASK;
+
+	max_irr = kvm_lapic_find_highest_irr(vcpu);
+	if (max_irr == -1)
+		return;
+
+	tpr = kvm_lapic_get_cr8(vcpu) << 4;
+
+	if (tpr >= (max_irr & 0xf0))
+		vmcb->control.intercept_cr_write |= INTERCEPT_CR8_MASK;
+}
+
 static void svm_intr_assist(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1514,14 +1535,14 @@ static void svm_intr_assist(struct kvm_vcpu *vcpu)
 						SVM_EVTINJ_VEC_MASK;
 		vmcb->control.exit_int_info = 0;
 		svm_inject_irq(svm, intr_vector);
-		return;
+		goto out;
 	}

 	if (vmcb->control.int_ctl & V_IRQ_MASK)
-		return;
+		goto out;

 	if (!kvm_cpu_has_interrupt(vcpu))
-		return;
+		goto out;

 	if (!(vmcb->save.rflags & X86_EFLAGS_IF) ||
 	    (vmcb->control.int_state & SVM_INTERRUPT_SHADOW_MASK) ||
@@ -1529,12 +1550,14 @@ static void svm_intr_assist(struct kvm_vcpu *vcpu)
 		/* unable to deliver irq, set pending irq */
 		vmcb->control.intercept |= (1ULL << INTERCEPT_VINTR);
 		svm_inject_irq(svm, 0x0);
-		return;
+		goto out;
 	}
 	/* Okay, we can deliver the interrupt: grab it and update PIC state. */
 	intr_vector = kvm_cpu_get_interrupt(vcpu);
 	svm_inject_irq(svm, intr_vector);
 	kvm_timer_intr_post(vcpu, intr_vector);
+out:
+	update_cr8_intercept(vcpu);
 }

 static void kvm_reput_irq(struct vcpu_svm *svm)
--
1.5.5
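The decision inside update_cr8_intercept() boils down to one comparison of priority classes. The sketch below isolates it as a pure function; toy_need_cr8_write_intercept is a hypothetical name, and the explicit 4-bit mask on cr8 is an assumption added for safety (the patch shifts the value returned by kvm_lapic_get_cr8() directly).

```c
#include <stdbool.h>

/*
 * cr8:     guest CR8 value, i.e. the task-priority class (0..15).
 * max_irr: highest pending interrupt vector in the IRR, or -1 if none.
 *
 * CR8 writes only need to be intercepted while the TPR is masking a
 * pending interrupt, because only then can a write to CR8 unmask
 * something that must be injected.
 */
static bool toy_need_cr8_write_intercept(unsigned int cr8, int max_irr)
{
	int tpr;

	if (max_irr == -1)
		return false;		/* nothing pending, nothing masked */

	tpr = (int)((cr8 & 0x0fu) << 4);	/* class in bits 7:4, like a vector */
	return tpr >= (max_irr & 0xf0);
}
```

For example, with vector 0x31 pending (class 3), a guest CR8 of 2 leaves the intercept off, while a CR8 of 3 or higher turns it on.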
Re: [kvm-devel] pv clock: kvm is incompatible with xen :-(
Gerd Hoffmann wrote:

Hmm, I somehow fail to see a case where it could be non-atomic ... get_time_values() copies a consistent snapshot, thus xen_clocksource_read() doesn't race against xen updating the fields. The snapshot is in a per-cpu variable, thus it doesn't race against other guest vcpus running get_time_values() at the same time.

Xen could change the parameters in the instant after get_time_values(). That change could be as a result of suspend-resume, so the parameters and the tsc could be wildly different. It's definitely an edge-case, but it's easy enough to deal with.

There could be a loopless __get_time_values() for use in this case, but given that it almost never loops, I don't think it's worthwhile.

in this case ??? I'm confused. There is only a single user of get_nsec_offset(), which is xen_clocksource_read() ...

Sure, but get_time_values() has several other callers. If xen_clocksource_read() had its own loop to make sure the read_tsc is atomic with respect to get_time_values, then get_time_values itself needn't loop. But, given that it only loops in the very rare case that it races with Xen updating those parameters, it doesn't seem to make much difference either way.

    J
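The race described here, and the loop that closes it, can be sketched in plain C. This is a single-threaded toy, not the Xen or kvm code: all toy_* names are hypothetical, one TSC tick is assumed to equal one nanosecond so no scaling is needed, and the memory barriers real code requires are only marked by comments. The hypervisor update is simulated by a TSC hook that rewrites the parameters the first time it is called.

```c
#include <stdint.h>

/* Hypothetical stand-in for the shared per-vcpu time parameters. */
struct toy_time_info {
	uint32_t version;	/* odd while an update is in progress */
	uint64_t tsc_timestamp;	/* TSC value when system_time was sampled */
	uint64_t system_time;	/* nanoseconds at tsc_timestamp */
};

typedef uint64_t (*toy_tsc_fn)(void *ctx);

/*
 * Read the clock.  The TSC read sits inside the version-checked loop,
 * so the parameters and the TSC value are known to belong together:
 * if the parameters changed at any point, the whole read is redone.
 */
static uint64_t toy_clock_read(const volatile struct toy_time_info *src,
			       toy_tsc_fn read_tsc, void *ctx)
{
	uint32_t version;
	uint64_t stamp, base, now;

	do {
		version = src->version;
		/* real code: read barrier here */
		stamp = src->tsc_timestamp;
		base = src->system_time;
		now = read_tsc(ctx);
		/* real code: read barrier here */
	} while ((version & 1) || version != src->version);

	return base + (now - stamp);	/* assumption: 1 tick == 1 ns */
}

/* Simulation helper: the first TSC read races with a parameter update. */
struct toy_ctx {
	struct toy_time_info *info;
	int calls;
};

static uint64_t toy_tsc_with_update(void *p)
{
	struct toy_ctx *c = p;

	if (c->calls++ == 0) {
		/* e.g. after a resume: everything jumps at once */
		c->info->version += 2;
		c->info->tsc_timestamp = 1000;
		c->info->system_time = 5000;
	}
	return 1010;
}
```

Starting from version 2, tsc_timestamp 100 and system_time 500, a loopless read would combine the old parameters with the new TSC value and return 500 + (1010 - 100) = 1410. With the loop, the first pass is discarded and the second pass returns 5000 + (1010 - 1000) = 5010.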
[kvm-devel] [PATCH/trivial] kvm: remove long -> void *user -> long cast
Avi,

kvm_dev_ioctl casts the arg value to void __user *, just to recast it
again to long. This seems unnecessary. According to objdump the binary
code on x86 is unchanged by this patch.

Signed-off-by: Christian Borntraeger [EMAIL PROTECTED]
---
 virt/kvm/kvm_main.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: kvm/virt/kvm/kvm_main.c
===
--- kvm.orig/virt/kvm/kvm_main.c
+++ kvm/virt/kvm/kvm_main.c
@@ -1188,7 +1188,6 @@ static int kvm_dev_ioctl_create_vm(void)
 static long kvm_dev_ioctl(struct file *filp,
 			  unsigned int ioctl, unsigned long arg)
 {
-	void __user *argp = (void __user *)arg;
 	long r = -EINVAL;

 	switch (ioctl) {
@@ -1205,7 +1204,7 @@ static long kvm_dev_ioctl(struct file *f
 		r = kvm_dev_ioctl_create_vm();
 		break;
 	case KVM_CHECK_EXTENSION:
-		r = kvm_dev_ioctl_check_extension((long)argp);
+		r = kvm_dev_ioctl_check_extension(arg);
 		break;
 	case KVM_GET_VCPU_MMAP_SIZE:
 		r = -EINVAL;
Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations
Avi Kivity wrote:

At such a tiny difference, I'm wondering why Linux-AIO exists at all, as it complicates the kernel rather a lot. I can see the theoretical appeal, but if performance is so marginal, I'm surprised it's in there.

Linux aio exists, but that's all that can be said for it. It works mostly for raw disks, doesn't integrate with networking, and doesn't advance at the same pace as the rest of the kernel. I believe only databases use it (and a userspace filesystem I wrote some time ago).

And video streaming on some embedded devices with no MMU! (Due to the page cache heuristics working poorly with no MMU, sustained reliable streaming is managed with O_DIRECT and the app managing cache itself (like a database), and that needs AIO to keep the request queue busy. At least, that's the theory.)

I'm also surprised the Glibc implementation of AIO using ordinary threads is so close to it.

Why are you surprised?

Because I've read that Glibc AIO (which uses a thread pool) is a relatively poor performer as AIO implementations go, and is only there for API compatibility, not suggested for performance. But I read that quite a while ago, perhaps it's changed.

Actually the glibc implementation could be improved from what I've heard. My estimates are for a thread pool implementation, but there is no reason why glibc couldn't achieve exactly the same performance.

Erm... I thought you said it _does_ achieve nearly the same performance, not that it _could_. Do you mean it could achieve exactly the same performance by using Linux AIO when possible?

And then, I'm wondering why use AIO at all: it suggests QEMU would run about as fast doing synchronous I/O in a few dedicated I/O threads.

Posix aio is the unix API for this, why not use it?

Because far more host platforms have threads than have POSIX AIO. (I suspect both options will end up supported in the end, as dedicated I/O threads were already suggested for other things.)

Also, I'd presume that those that need 10K IOPS and above will not place their high throughput images on a filesystem; rather on a separate SAN LUN.

Does the separate LUN make any difference? I thought O_DIRECT on a filesystem was meant to be pretty close to block device performance.

On a good extent-based filesystem like XFS you will get good performance (though more cpu overhead due to needing to go through additional mapping layers). Old clunkers like ext3 will require additional seeks or a ton of cache (1 GB per 1 TB).

Hmm. Thanks. I may consider switching to XFS now

-- Jamie
Re: [kvm-devel] pv clock: kvm is incompatible with xen :-(
Jeremy Fitzhardinge wrote:

Xen could change the parameters in the instant after get_time_values(). That change could be as a result of suspend-resume, so the parameters and the tsc could be wildly different.

Ah, ok, forgot the rdtsc in the picture. With that in mind I fully agree that the loop is needed. I think kvm guests can even hit that one with the vcpu migrating to a different physical cpu, so we better handle it correctly ;)

Sure, but get_time_values() has several other callers.

Not really. There are only two calls, one in clocksource_read() and one in the init path. The latter is superfluous I think because clocksource_read() is the only user of the shadowed time info.

cheers,
  Gerd

--
http://kraxel.fedorapeople.org/xenner/
Re: [kvm-devel] [ RfC / patch ] kvmclock fixes
Jeremy Fitzhardinge wrote:

Gerd Hoffmann wrote:
+cycle_t pvclock_clocksource_read(struct kvm_vcpu_time_info *src)
+{
+	struct pvclock_shadow_time *shadow = &get_cpu_var(shadow_time);
+	cycle_t ret;
+
+	pvclock_get_time_values(shadow, src);
+	ret = shadow->system_timestamp + pvclock_get_nsec_offset(shadow);

You need to put this in a loop in case the system clock parameters change between the pvclock_get_time_values() and pvclock_get_nsec_offset().

Fixed, new patch attached.

How does kvm deal with suspend/resume with respect to time? Is the system timestamp guaranteed to remain monotonic? For Xen, I think we'll need to maintain an offset between the initial system timestamp and whatever it is after resuming.

Haven't looked at it yet.

cheers,
  Gerd

--
http://kraxel.fedorapeople.org/xenner/

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 1cc9d42..688df87 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -79,7 +79,7 @@ obj-$(CONFIG_DEBUG_NX_TEST)	+= test_nx.o
 obj-$(CONFIG_VMI)		+= vmi_32.o vmiclock_32.o
 obj-$(CONFIG_KVM_GUEST)		+= kvm.o
 obj-$(CONFIG_KVM_CLOCK)		+= kvmclock.o
-obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirt_patch_$(BITS).o
+obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirt_patch_$(BITS).o pvclock.o

 ifdef CONFIG_INPUT_PCSPKR
 obj-y				+= pcspeaker.o
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ddee040..476b7c7 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -18,6 +18,7 @@

 #include <linux/clocksource.h>
 #include <linux/kvm_para.h>
+#include <asm/pvclock.h>
 #include <asm/arch_hooks.h>
 #include <asm/msr.h>
 #include <asm/apic.h>
@@ -37,17 +38,9 @@ early_param("no-kvmclock", parse_no_kvmclock);

 /* The hypervisor will put information about time periodically here */
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct kvm_vcpu_time_info, hv_clock);
-#define get_clock(cpu, field) per_cpu(hv_clock, cpu).field
-
-static inline u64 kvm_get_delta(u64 last_tsc)
-{
-	int cpu = smp_processor_id();
-	u64 delta = native_read_tsc() - last_tsc;
-	return (delta * get_clock(cpu, tsc_to_system_mul)) >> KVM_SCALE;
-}

 static struct kvm_wall_clock wall_clock;
-static cycle_t kvm_clock_read(void);
+
 /*
  * The wallclock is the time of day when we booted. Since then, some time may
  * have elapsed since the hypervisor wrote the data. So we try to account for
@@ -55,35 +48,19 @@ static cycle_t kvm_clock_read(void);
  */
 unsigned long kvm_get_wallclock(void)
 {
-	u32 wc_sec, wc_nsec;
-	u64 delta;
+	struct kvm_vcpu_time_info *vcpu_time;
 	struct timespec ts;
-	int version, nsec;
 	int low, high;

 	low = (int)__pa(&wall_clock);
 	high = ((u64)__pa(&wall_clock) >> 32);
-
-	delta = kvm_clock_read();
-
 	native_write_msr(MSR_KVM_WALL_CLOCK, low, high);
-	do {
-		version = wall_clock.wc_version;
-		rmb();
-		wc_sec = wall_clock.wc_sec;
-		wc_nsec = wall_clock.wc_nsec;
-		rmb();
-	} while ((wall_clock.wc_version != version) || (version & 1));
-
-	delta = kvm_clock_read() - delta;
-	delta += wc_nsec;
-	nsec = do_div(delta, NSEC_PER_SEC);
-	set_normalized_timespec(&ts, wc_sec + delta, nsec);
-	/*
-	 * Of all mechanisms of time adjustment I've tested, this one
-	 * was the champion!
-	 */
-	return ts.tv_sec + 1;
+
+	vcpu_time = &get_cpu_var(hv_clock);
+	pvclock_read_wallclock(&wall_clock, vcpu_time, &ts);
+	put_cpu_var(hv_clock);
+
+	return ts.tv_sec;
 }

 int kvm_set_wallclock(unsigned long now)
@@ -91,28 +68,17 @@ int kvm_set_wallclock(unsigned long now)
 	return 0;
 }

-/*
- * This is our read_clock function. The host puts an tsc timestamp each time
- * it updates a new time. Without the tsc adjustment, we can have a situation
- * in which a vcpu starts to run earlier (smaller system_time), but probes
- * time later (compared to another vcpu), leading to backwards time
- */
 static cycle_t kvm_clock_read(void)
 {
-	u64 last_tsc, now;
-	int cpu;
+	struct kvm_vcpu_time_info *src;
+	cycle_t ret;

-	preempt_disable();
-	cpu = smp_processor_id();
-
-	last_tsc = get_clock(cpu, tsc_timestamp);
-	now = get_clock(cpu, system_time);
-
-	now += kvm_get_delta(last_tsc);
-	preempt_enable();
-
-	return now;
+	src = &get_cpu_var(hv_clock);
+	ret = pvclock_clocksource_read(src);
+	put_cpu_var(hv_clock);
+	return ret;
 }
+
 static struct clocksource kvm_clock = {
 	.name = "kvm-clock",
 	.read = kvm_clock_read,
diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
new file mode 100644
index 000..6e7dae0
--- /dev/null
+++ b/arch/x86/kernel/pvclock.c
@@ -0,0 +1,150 @@
+/*  paravirtual clock -- common code used by kvm/xen
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied
Re: [kvm-devel] pv clock: kvm is incompatible with xen :-(
Gerd Hoffmann wrote:

Jeremy Fitzhardinge wrote:

Xen could change the parameters in the instant after get_time_values(). That change could be as a result of suspend-resume, so the parameters and the tsc could be wildly different.

Ah, ok, forgot the rdtsc in the picture. With that in mind I fully agree that the loop is needed. I think kvm guests can even hit that one with the vcpu migrating to a different physical cpu, so we better handle it correctly ;)

Yes, same with Xen.

Sure, but get_time_values() has several other callers.

Not really. There are only two calls, one in clocksource_read() and one in the init path. The latter is superfluous I think because clocksource_read() is the only user of the shadowed time info.

Hm. It doesn't look like shadow_time needs to be a static percpu at all. It could just be a local to clocksource_read, I think.

    J
Re: [kvm-devel] [ RfC / patch ] kvmclock fixes
Gerd Hoffmann wrote:
> +cycle_t pvclock_clocksource_read(struct kvm_vcpu_time_info *src)
> +{
> +    struct pvclock_shadow_time *shadow;
> +    cycle_t ret;
> +    unsigned version;
> +
> +    shadow = &get_cpu_var(shadow_time);
> +    do {
> +        version = pvclock_get_time_values(shadow, src);
> +        barrier();
> +        ret = shadow->system_timestamp + pvclock_get_nsec_offset(shadow);
> +        barrier();

Is barrier() strong enough? Does kvm guarantee that the per-cpu time parameters are only ever updated by that cpu? I'm pretty sure Xen does, so that's OK.

J
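The retry loop discussed in this thread can be sketched in user space. The following is only an illustration under stated assumptions, not the pvclock code: the struct layout, the convention that an odd version means an update is in progress, the names `time_info`, `read_clock` and `fake_rdtsc`, and the 1:1 scaling from TSC ticks to nanoseconds are all hypothetical. The point it tries to show is the one made above: the TSC read has to happen inside the loop, so that a parameter change between the snapshot and the TSC read forces a retry. The barrier used here is a compiler-only barrier (GCC/Clang syntax), which is sufficient only if the parameters are never updated from another CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-cpu time parameters published by the hypervisor. */
struct time_info {
    volatile uint32_t version;        /* assumed: odd while being updated */
    volatile uint64_t tsc_timestamp;  /* TSC value when system_time was set */
    volatile uint64_t system_time;    /* nanoseconds at tsc_timestamp */
};

/* Compiler-only barrier; it does not order accesses between CPUs. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Stand-in for rdtsc so the sketch can run anywhere. */
static uint64_t fake_tsc_value;

static uint64_t fake_rdtsc(void)
{
    return fake_tsc_value;
}

/*
 * Read the clock, retrying if the parameters changed while we were
 * using them. The TSC read sits inside the loop on purpose.
 */
static uint64_t read_clock(const struct time_info *src)
{
    uint32_t version;
    uint64_t ret;

    do {
        version = src->version;
        compiler_barrier();
        /* Assumption: one TSC tick equals one nanosecond. */
        ret = src->system_time + (fake_rdtsc() - src->tsc_timestamp);
        compiler_barrier();
    } while ((version & 1) || version != src->version);

    return ret;
}
```

If the parameters could be written from a different CPU than the reader, the compiler barriers would presumably have to become real read barriers, which is the question raised above.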
Re: [kvm-devel] 32-bit binaries failing in 64 bit guests after using vmport
Soren Hansen wrote:
> Esteemed kvm developers!
>
> I've been trying to debug this bug
> https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/219165
>
> It originally revealed itself by failing to run grub (which is a 32 bit
> binary) when installing Ubuntu from our live cd. It turned out to be a
> more general problem of 32 bit binaries failing to run. The server
> install worked like a charm. I eventually discovered that loading the
> vmmouse driver triggered it and narrowed it down to the call to
> kvm_load_registers in vmport_ioport_read.
>
> We're releasing on Thursday, and I needed a quick fix, so I reverted
> the calls to kvm_{save,load}_registers in vmport_ioport_read to the old
> code that simply saved the eax, ebx, ecx, edx, esi, and edi registers,
> but I'm supposing kvm_{load,save}_registers really should work here.

Ah, you may have missed the fix that updated the KVM load/save functions to deal with the in-kernel APIC. It turns out, vmmouse was horribly broken with SMP guests too. See

commit 9949bd84ac4dfdfc60b2974557819637b8719911
Author: Anthony Liguori [EMAIL PROTECTED]
Date: Thu Apr 3 18:37:16 2008 -0500

commit 5208ce19dca268f84a2b9441c2fbb6129161e44c
Author: Marcelo Tosatti [EMAIL PROTECTED]
Date: Thu Apr 3 20:24:37 2008 -0300

commit 85a67aa2a1b942ddccfcbd625d280869367edc95
Author: Marcelo Tosatti [EMAIL PROTECTED]
Date: Fri Apr 11 13:24:41 2008 -0300

Regards,
Anthony Liguori

> I dug a bit further into the code and tried disabling various pieces of
> the kvm_load_registers until it finally worked again. The problem seems
> to only arise when the lstar msr is loaded. I've looked at the code,
> but seeing as three days ago I didn't know there was such a thing as an
> lstar msr, I'm finding myself getting stuck. :)
>
> Any pointers in the right direction would be lovely.
Re: [kvm-devel] pv clock: kvm is incompatible with xen :-(
Jeremy Fitzhardinge wrote:
> Gerd Hoffmann wrote:
>> Not really. There are only two calls, one in clocksource_read() and
>> one in the init path. The latter is superfluous I think because
>> clocksource_read() is the only user of the shadowed time info.
>
> Hm. It doesn't look like shadow_time needs to be a static percpu at
> all. It could just be a local to clocksource_read, I think.

Good point, one more cleanup.

thanks,
Gerd

--
http://kraxel.fedorapeople.org/xenner/
[kvm-devel] Using -kernel .. with -drive ...
If I try

$ qemu -kernel minimal-kernel -drive file=jeos-devel.img,if=virtio

I get the following error:

A disk image must be given for 'hda' when booting a Linux kernel

Is this necessary?

--
damjan | дамјан
This is my jabber ID -- [EMAIL PROTECTED] -- not my mail address, it's a Jabber ID --^ :)
[kvm-devel] Some FAQ questions
I have some questions for the FAQ, about the configuration of Linux guests:

a) is swap needed in the guest (I'd say no, but..)
b) what filesystem is best for a guest
c) what io scheduler in the guest (noop? or cfq)
d) are there any runtime kernel tweaks for the guest (/proc/sys)?
e) suggested linux kernel source configuration (.config)?

--
damjan | дамјан
This is my jabber ID -- [EMAIL PROTECTED] -- not my mail address, it's a Jabber ID --^ :)
Re: [kvm-devel] [PATCH] gfxboot VMX workaround v2
On Fri, 18 Apr 2008 10:25:15 -0500 Anthony Liguori [EMAIL PROTECTED] wrote:
> I'd prefer you not do an emulate_instruction loop at all. Just emulate
> one instruction on vmentry failure and let VT tell you what
> instructions you need to emulate.
>
> It's only four instructions so I don't think the performance is going
> to matter. Take a look at the patch I posted previously.

You were right, I did not update eip correctly. It is fixed now with the following code:

    case 0xea: /* jmp (far, absolute) */ {
        struct kvm_segment kvm_seg;
        uint16_t eip;
        uint16_t sel;
        int ret;

        eip = insn_fetch(u16, 2, c->eip);
        sel = insn_fetch(u16, 2, c->eip);
        kvm_x86_ops->get_segment(ctxt->vcpu, &kvm_seg, VCPU_SREG_CS);
        kvm_seg.selector = sel;
        ret = load_segment_descriptor(ctxt->vcpu, kvm_seg.selector, 9,
                                      VCPU_SREG_CS);
        if (ret < 0) {
            printk(KERN_INFO "%s: Failed to load CS selector\n",
                   __FUNCTION__);
            goto cannot_emulate;
        }
        c->eip = eip;
        break;

I print the instruction to be emulated and it seems ok. I have the following outputs:

[24203.663324] vmentry_failure: emulation at (46e53) rip 6e13: ea 18 6e 18
[24203.664668] vmentry_failure: emulation at (46e58) rip 6e18: 66 b8 20 00
[24203.668650] vmentry_failure: emulation failed (vmentry failure) rip 6e18 66 b8 20 00

So the emulation that failed is mov $0x20, %ax. It needs to be emulated. As you said Anthony, it's only four instructions that need to be emulated, so it shouldn't be a big issue.

Best regards,
Guillaume
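For readers following the opcode bytes in the log, here is a small stand-alone sketch of how a 16-bit far jump (opcode 0xea followed by a 16-bit offset and a 16-bit selector, both little-endian) decodes. It is not the emulator code from the patch; `decode_jmp_far16` and `struct far_ptr16` are hypothetical names. The log prints only four bytes, so the fifth byte used in the usage note (0x00, the high byte of the selector) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Target of a "jmp ptr16:16" instruction. */
struct far_ptr16 {
    uint16_t offset;    /* new instruction pointer */
    uint16_t selector;  /* new code segment selector */
};

/*
 * Decode a 5-byte far jump: ea <off lo> <off hi> <sel lo> <sel hi>.
 * Returns 0 on success, -1 if the bytes are not a complete far jump.
 */
static int decode_jmp_far16(const uint8_t *insn, size_t len,
                            struct far_ptr16 *out)
{
    if (len < 5 || insn[0] != 0xea)
        return -1;

    /* The offset comes first in the byte stream, then the selector. */
    out->offset = (uint16_t)(insn[1] | (insn[2] << 8));
    out->selector = (uint16_t)(insn[3] | (insn[4] << 8));
    return 0;
}
```

With the bytes `ea 18 6e 18 00` this yields offset 0x6e18, which matches the rip reported for the next instruction in the log, and selector 0x0018 under the assumption about the fifth byte.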
Re: [kvm-devel] 32-bit binaries failing in 64 bit guests after using vmport
On Mon, Apr 21, 2008 at 08:51:17AM -0500, Anthony Liguori wrote:
>> We're releasing on Thursday, and I needed a quick fix, so I reverted
>> the calls to kvm_{save,load}_registers in vmport_ioport_read to the
>> old code that simply saved the eax, ebx, ecx, edx, esi, and edi
>> registers, but I'm supposing kvm_{load,save}_registers really should
>> work here.

/me sighs very deeply

Ok, first chance I get, I'm signing up for Patch management 101. :(

I got some tests mixed around, so it failing is actually dependent on whether EIP (not LSTAR as I originally thought) is restored or not. I have a patch that fixes it, but I need to work a few things out first before I submit it.

> Ah, you may have missed the fix that updated the KVM load/save
> functions to deal with the in-kernel APIC.

Indeed.

> It turns out, vmmouse was horribly broken with SMP guests too. See
>
> commit 9949bd84ac4dfdfc60b2974557819637b8719911
> Author: Anthony Liguori [EMAIL PROTECTED]
> Date: Thu Apr 3 18:37:16 2008 -0500
>
> commit 5208ce19dca268f84a2b9441c2fbb6129161e44c
> Author: Marcelo Tosatti [EMAIL PROTECTED]
> Date: Thu Apr 3 20:24:37 2008 -0300

I did my tests using kvm-65 userland and kernel, so these two should already be included, afaics.

> commit 85a67aa2a1b942ddccfcbd625d280869367edc95
> Author: Marcelo Tosatti [EMAIL PROTECTED]
> Date: Fri Apr 11 13:24:41 2008 -0300

This did not change anything for me.

--
Soren Hansen | Virtualisation specialist | Ubuntu Server Team
Canonical Ltd. | http://www.ubuntu.com/
[kvm-devel] [patch] qemu/ia64 include prototype for qemu_mallocz
Hi,

This one fixes a segfault problem I am seeing on ia64 due to the malloc'ed address being truncated to 32 bit.

Cheers,
Jes

Include qemu-common.h for the prototype for qemu_mallocz to avoid the returned address being truncated to 32 bit.

Signed-off-by: Jes Sorensen [EMAIL PROTECTED]
---
 target-ia64/op_helper.c |    1 +
 1 file changed, 1 insertion(+)

Index: qemu/target-ia64/op_helper.c
===================================================================
--- qemu.orig/target-ia64/op_helper.c
+++ qemu/target-ia64/op_helper.c
@@ -25,6 +25,7 @@
 #include "exec-all.h"
 
 #include "qemu-kvm.h"
+#include "qemu-common.h"
 
 CPUState *cpu_ia64_init(char *cpu_model){
     CPUState *env;
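To make the failure mode concrete: without a prototype in scope, a C compiler of that era assumes the function returns int, so on a 64-bit target only the low 32 bits of the returned pointer survive and are then sign-extended. Calling through the wrong type is undefined behaviour, so the sketch below only simulates the arithmetic on plain integers. The helper name `as_seen_through_int` and the example addresses are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simulate what a caller ends up with when a 64-bit pointer is returned
 * through an implicit "int" declaration: the low 32 bits are kept and
 * then sign-extended back to pointer width.
 */
static uint64_t as_seen_through_int(uint64_t addr)
{
    uint64_t low = addr & 0xffffffffu;

    /* Sign-extend bit 31, as a signed 32-bit int would be. */
    if (low & 0x80000000u)
        low |= 0xffffffff00000000ull;
    return low;
}
```

An address that fits in 31 bits passes through unchanged, which is presumably why a missing prototype like this can go unnoticed on some systems, while an address with high bits set (as the report suggests happens on ia64) comes back pointing somewhere else entirely.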
Re: [kvm-devel] Using -kernel .. with -drive ...
>> If I try
>> $ qemu -kernel minimal-kernel -drive file=jeos-devel.img,if=virtio
>> I get the following error:
>> A disk image must be given for 'hda' when booting a Linux kernel
>>
>> is this neccesseary?
>
> Hi,
> i don't know if it is always needed (I saw some patches to avoid that)
> but you can use:
>
> qemu -kernel minimal-kernel -drive file=jeos-devel.img,if=virtio -hda /dev/zero

cool that works, thanks

--
damjan | дамјан
This is my jabber ID -- [EMAIL PROTECTED] -- not my mail address, it's a Jabber ID --^ :)
Re: [kvm-devel] KVM console dying
Marcelo Tosatti wrote:
> From: Alan Pevec [EMAIL PROTECTED]
>
> - add serial console, workaround for F9 livecd KVM guest dying with
>   standard console only. VNC console will go blank but node will
>   continue to boot
>
> With only console=tty qemu-kvm dies when, AFAICT from udevdebug output,
> start_udev is processing console rules. Here is backtrace:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1208993168 (LWP 16163)]
> 0x0095337d in memmove () from /lib/libc.so.6
> Missing separate debuginfos, use: debuginfo-install SDL.i386
> alsa-lib.i386 glibc.i686 gnutls.i386 libgcrypt.i386 libgpg-error.i386
> zlib.i386

The info that's needed is what the size of the vnc frame buffer is.

My suspicion is that we're racing. A VCPU IO operation will trigger the bit blit which will cause this vnc_copy path to hit. However, the VNC server is not thread safe so if the IO thread is running simultaneously, very bad things could happen.

Is this with standard KVM or your lock break-up patches?

Regards,
Anthony Liguori
Re: [kvm-devel] KVM console dying
On Mon, Apr 21, 2008 at 01:24:00PM -0500, Anthony Liguori wrote:
> Marcelo Tosatti wrote:
>> From: Alan Pevec [EMAIL PROTECTED]
>>
>> - add serial console, workaround for F9 livecd KVM guest dying with
>>   standard console only. VNC console will go blank but node will
>>   continue to boot
>>
>> With only console=tty qemu-kvm dies when, AFAICT from udevdebug
>> output, start_udev is processing console rules. Here is backtrace:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread -1208993168 (LWP 16163)]
>> 0x0095337d in memmove () from /lib/libc.so.6
>> Missing separate debuginfos, use: debuginfo-install SDL.i386
>> alsa-lib.i386 glibc.i686 gnutls.i386 libgcrypt.i386 libgpg-error.i386
>> zlib.i386
>
> The info that's needed is what the size of the vnc frame buffer is.
>
> My suspicion is that we're racing. A VCPU IO operation will trigger the
> bit blit which will cause this vnc_copy path to hit. However, the VNC
> server is not thread safe so if the IO thread is running
> simultaneously, very bad things could happen.
>
> Is this with standard KVM or your lock break-up patches?

Standard KVM.
[kvm-devel] WARNING: at /usr/src/modules/kvm/mmu.c:390 account_shadowed()
Hi,

I am running kvm-66 on top of a debian sid host with 2.6.24 (intel 32bit host). Got the following in my logs today:

Apr 21 17:55:01 buffy kernel: WARNING: at /usr/src/modules/kvm/mmu.c:390 account_shadowed()
Apr 21 17:55:01 buffy kernel: Pid: 21416, comm: kvm Tainted: P 2.6.24-1-686 #1
Apr 21 17:55:01 buffy kernel: [<f8d07a36>] kvm_mmu_get_page+0x42d/0x447 [kvm]
Apr 21 17:55:01 buffy kernel: [<f8d08cca>] kvm_mmu_load+0xdf/0x15c [kvm]
Apr 21 17:55:01 buffy kernel: [<f8affe41>] vmx_queue_exception+0x0/0x33 [kvm_intel]
Apr 21 17:55:01 buffy kernel: [<f8d05521>] kvm_arch_vcpu_ioctl_run+0x233/0x5a9 [kvm]
Apr 21 17:55:01 buffy kernel: [<f8d013aa>] kvm_vcpu_ioctl+0xe4/0x34c [kvm]
Apr 21 17:55:01 buffy kernel: [<c0159078>] delayacct_end+0x70/0x77
Apr 21 17:55:01 buffy kernel: [<c015aa19>] sync_page+0x0/0x3b
Apr 21 17:55:01 buffy kernel: [<c0159388>] __delayacct_blkio_end+0x5b/0x5f
Apr 21 17:55:01 buffy kernel: [<c02bcaab>] io_schedule+0x64/0x80
Apr 21 17:55:01 buffy kernel: [<c011e07d>] enqueue_entity+0x2b/0x3d
Apr 21 17:55:01 buffy kernel: [<c0115343>] apic_wait_icr_idle+0xe/0x15
Apr 21 17:55:01 buffy kernel: [<c011e0a5>] enqueue_task_fair+0x16/0x24
Apr 21 17:55:01 buffy kernel: [<c011d643>] enqueue_task+0x52/0x5d
Apr 21 17:55:01 buffy kernel: [<c011de9e>] resched_task+0x52/0x54
Apr 21 17:55:01 buffy kernel: [<c011f459>] try_to_wake_up+0x2b8/0x2c2
Apr 21 17:55:01 buffy kernel: [<c011d47e>] __wake_up_common+0x32/0x5c
Apr 21 17:55:01 buffy kernel: [<c011eecc>] __wake_up+0x32/0x42
Apr 21 17:55:01 buffy kernel: [<c013e25c>] wake_futex+0x3b/0x45
Apr 21 17:55:01 buffy kernel: [<c013e4de>] futex_wake+0x81/0xb0
Apr 21 17:55:01 buffy kernel: [<c013f097>] do_futex+0x77/0x983
Apr 21 17:55:01 buffy kernel: [<c011d9ca>] update_curr+0x62/0xef
Apr 21 17:55:01 buffy kernel: [<c0103044>] __switch_to+0x9d/0x11d
Apr 21 17:55:01 buffy kernel: [<f8d012c6>] kvm_vcpu_ioctl+0x0/0x34c [kvm]
Apr 21 17:55:01 buffy kernel: [<c018285b>] do_ioctl+0x1f/0x62
Apr 21 17:55:01 buffy kernel: [<c0182ad5>] vfs_ioctl+0x237/0x249
Apr 21 17:55:01 buffy kernel: [<c0182b2c>] sys_ioctl+0x45/0x5d
Apr 21 17:55:01 buffy kernel: [<c0103e5e>] sysenter_past_esp+0x6b/0xa1

Regards,
Thomas.
[kvm-devel] What kernel options do I need to properly enable virtio net driver
virtio net device does not appear to show itself in the guest. I'm curious of what options I may be missing. Here is my config # # Automatically generated make config: don't edit # Linux kernel version: 2.6.25-rc9 # Mon Apr 21 15:52:50 2008 # # CONFIG_PPC64 is not set # # Processor support # # CONFIG_6xx is not set # CONFIG_PPC_85xx is not set # CONFIG_PPC_8xx is not set # CONFIG_40x is not set CONFIG_44x=y # CONFIG_E200 is not set CONFIG_PPC_FPU=y CONFIG_4xx=y CONFIG_BOOKE=y CONFIG_PTE_64BIT=y CONFIG_PHYS_64BIT=y # CONFIG_PPC_MM_SLICES is not set CONFIG_NOT_COHERENT_CACHE=y CONFIG_PPC32=y CONFIG_WORD_SIZE=32 CONFIG_PPC_MERGE=y CONFIG_MMU=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_HARDIRQS=y # CONFIG_HAVE_SETUP_PER_CPU_AREA is not set CONFIG_IRQ_PER_CPU=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_ARCH_HAS_ILOG2_U32=y CONFIG_GENERIC_HWEIGHT=y CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_GENERIC_FIND_NEXT_BIT=y # CONFIG_ARCH_NO_VIRT_TO_BUS is not set CONFIG_PPC=y CONFIG_EARLY_PRINTK=y CONFIG_GENERIC_NVRAM=y CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_PPC_OF=y CONFIG_OF=y CONFIG_PPC_UDBG_16550=y # CONFIG_GENERIC_TBSYNC is not set CONFIG_AUDIT_ARCH=y CONFIG_GENERIC_BUG=y # CONFIG_DEFAULT_UIMAGE is not set CONFIG_PPC_DCR_NATIVE=y # CONFIG_PPC_DCR_MMIO is not set CONFIG_PPC_DCR=y CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_BROKEN_ON_SMP=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION= CONFIG_LOCALVERSION_AUTO=y CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_AUDIT is not set # CONFIG_IKCONFIG is not set CONFIG_LOG_BUF_SHIFT=14 # CONFIG_CGROUPS is not set CONFIG_GROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y # CONFIG_RT_GROUP_SCHED is not set CONFIG_USER_SCHED=y # CONFIG_CGROUP_SCHED is not set 
CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y CONFIG_RELAY=y # CONFIG_NAMESPACES is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE=/home/jerone/tmp/LINUX_RAM_DISK/ CONFIG_INITRAMFS_ROOT_UID=0 CONFIG_INITRAMFS_ROOT_GID=0 # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y CONFIG_EMBEDDED=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_COMPAT_BRK=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLUB_DEBUG=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set # CONFIG_PROFILING is not set # CONFIG_MARKERS is not set CONFIG_HAVE_OPROFILE=y # CONFIG_KPROBES is not set CONFIG_HAVE_KPROBES=y CONFIG_HAVE_KRETPROBES=y CONFIG_PROC_PAGE_MONITOR=y CONFIG_SLABINFO=y CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y # CONFIG_MODULE_FORCE_UNLOAD is not set # CONFIG_MODVERSIONS is not set # CONFIG_MODULE_SRCVERSION_ALL is not set CONFIG_KMOD=y CONFIG_BLOCK=y CONFIG_LBD=y # CONFIG_BLK_DEV_IO_TRACE is not set # CONFIG_LSF is not set # CONFIG_BLK_DEV_BSG is not set # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_IOSCHED_CFQ=y CONFIG_DEFAULT_AS=y # CONFIG_DEFAULT_DEADLINE is not set # CONFIG_DEFAULT_CFQ is not set # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED=anticipatory CONFIG_PREEMPT_NOTIFIERS=y CONFIG_CLASSIC_RCU=y # CONFIG_PPC4xx_PCI_EXPRESS is not set # # Platform support # # CONFIG_PPC_MPC512x is not set # CONFIG_PPC_MPC5121 is not set # CONFIG_PPC_CELL is not set # CONFIG_PPC_CELL_NATIVE is not set # CONFIG_PQ2ADS is not set CONFIG_BAMBOO=y # CONFIG_EBONY is not set # CONFIG_SEQUOIA is not set # CONFIG_TAISHAN is not set # CONFIG_KATMAI is not set # CONFIG_RAINIER is 
not set # CONFIG_WARP is not set CONFIG_440EP=y CONFIG_IBM440EP_ERR42=y # CONFIG_IPIC is not set # CONFIG_MPIC is not set # CONFIG_MPIC_WEIRD is not set # CONFIG_PPC_I8259 is not set # CONFIG_PPC_RTAS is not set # CONFIG_MMIO_NVRAM is not set # CONFIG_PPC_MPC106 is not set # CONFIG_PPC_970_NAP is not set # CONFIG_PPC_INDIRECT_IO is not set # CONFIG_GENERIC_IOMAP is not set # CONFIG_CPU_FREQ is not set # CONFIG_FSL_ULI1575 is not set # # Kernel options # # CONFIG_HIGHMEM is not set # CONFIG_TICK_ONESHOT is not set # CONFIG_NO_HZ is not set # CONFIG_HIGH_RES_TIMERS is not set CONFIG_GENERIC_CLOCKEVENTS_BUILD=y # CONFIG_HZ_100 is not set CONFIG_HZ_250=y # CONFIG_HZ_300 is not set # CONFIG_HZ_1000 is not set CONFIG_HZ=250 # CONFIG_SCHED_HRTICK is not set CONFIG_PREEMPT_NONE=y # CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is not set CONFIG_BINFMT_ELF=y # CONFIG_BINFMT_MISC is not set # CONFIG_MATH_EMULATION is not set # CONFIG_IOMMU_HELPER is not set
[kvm-devel] [patch 1/2] [PATCH] pci nic: pci_register_device can fail
The pci_register_device() call in PCI nic initialization routines can fail. Handle this failure and propagate a meaningful error message to the user instead of generating a SEGV.

Cc: Marcelo Tosatti [EMAIL PROTECTED]
Signed-off-by: Chris Wright [EMAIL PROTECTED]
---
 qemu/hw/e1000.c      |    3 +++
 qemu/hw/eepro100.c   |    2 ++
 qemu/hw/ne2000.c     |    3 +++
 qemu/hw/pci.c        |    6 ++++++
 qemu/hw/pcnet.c      |    2 ++
 qemu/hw/rtl8139.c    |    3 +++
 qemu/hw/virtio-net.c |    2 ++
 qemu/hw/virtio.c     |    3 +++
 8 files changed, 24 insertions(+)

--- a/qemu/hw/e1000.c
+++ b/qemu/hw/e1000.c
@@ -963,6 +963,9 @@ pci_e1000_init(PCIBus *bus, NICInfo *nd,
     d = (E1000State *)pci_register_device(bus, "e1000",
                 sizeof(E1000State), devfn, NULL, NULL);
 
+    if (!d)
+        return NULL;
+
     pci_conf = d->dev.config;
     memset(pci_conf, 0, 256);
 
--- a/qemu/hw/eepro100.c
+++ b/qemu/hw/eepro100.c
@@ -1753,6 +1753,8 @@ static PCIDevice *nic_init(PCIBus * bus,
     d = (PCIEEPRO100State *) pci_register_device(bus, name,
                                                  sizeof(PCIEEPRO100State), -1,
                                                  NULL, NULL);
+    if (!d)
+        return NULL;
 
     s = &d->eepro100;
     s->device = device;
--- a/qemu/hw/ne2000.c
+++ b/qemu/hw/ne2000.c
@@ -796,6 +796,9 @@ PCIDevice *pci_ne2000_init(PCIBus *bus,
                                               "NE2000", sizeof(PCINE2000State),
                                               devfn,
                                               NULL, NULL);
+    if (!d)
+        return NULL;
+
     pci_conf = d->dev.config;
     pci_conf[0x00] = 0xec; // Realtek 8029
     pci_conf[0x01] = 0x10;
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -696,6 +696,12 @@ PCIDevice *pci_nic_init(PCIBus *bus, NIC
         fprintf(stderr, "qemu: Unsupported NIC: %s\n", nd->model);
         return NULL;
     }
+
+    if (!pci_dev) {
+        fprintf(stderr, "qemu: Unable to initialze NIC: %s\n", nd->model);
+        return NULL;
+    }
+
     nd->devfn = pci_dev->devfn;
     return pci_dev;
 }
--- a/qemu/hw/pcnet.c
+++ b/qemu/hw/pcnet.c
@@ -1970,6 +1970,8 @@ PCIDevice *pci_pcnet_init(PCIBus *bus, N
 
     d = (PCNetState *)pci_register_device(bus, "PCNet", sizeof(PCNetState),
                                           devfn, NULL, NULL);
+    if (!d)
+        return NULL;
 
     pci_conf = d->dev.config;
 
--- a/qemu/hw/rtl8139.c
+++ b/qemu/hw/rtl8139.c
@@ -3411,6 +3411,9 @@ PCIDevice *pci_rtl8139_init(PCIBus *bus,
                                               "RTL8139", sizeof(PCIRTL8139State),
                                               devfn,
                                               NULL, NULL);
+    if (!d)
+        return NULL;
+
     pci_conf = d->dev.config;
     pci_conf[0x00] = 0xec; /* Realtek 8139 */
     pci_conf[0x01] = 0x10;
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -292,6 +292,8 @@ PCIDevice *virtio_net_init(PCIBus *bus,
                                      0, VIRTIO_ID_NET,
                                      0x02, 0x00, 0x00,
                                      6, sizeof(VirtIONet));
+    if (!n)
+        return NULL;
 
     n->vdev.update_config = virtio_net_update_config;
     n->vdev.get_features = virtio_net_get_features;
--- a/qemu/hw/virtio.c
+++ b/qemu/hw/virtio.c
@@ -408,6 +408,9 @@ VirtIODevice *virtio_init_pci(PCIBus *bu
 
     pci_dev = pci_register_device(bus, name, struct_size,
                                   -1, NULL, NULL);
+    if (!pci_dev)
+        return NULL;
+
     vdev = to_virtio_device(pci_dev);
 
     vdev->status = 0;
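The pattern this patch applies can be shown with a toy model: a registration routine that returns NULL once the bus has no free slot, and an init routine that checks the result before dereferencing it. Everything here (`toy_bus`, `toy_register_device`, `toy_nic_init`, the 32-slot limit) is invented for illustration and is not qemu's API; the only detail taken from PCI itself is the usual devfn encoding, slot << 3 | function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BUS_SLOTS 32

struct toy_device {
    int devfn;  /* slot << 3 | function, function is always 0 here */
};

struct toy_bus {
    struct toy_device *slots[TOY_BUS_SLOTS];
};

/* Returns a new device in the first free slot, or NULL if none is left. */
static struct toy_device *toy_register_device(struct toy_bus *bus)
{
    for (int slot = 0; slot < TOY_BUS_SLOTS; slot++) {
        if (bus->slots[slot] == NULL) {
            struct toy_device *dev = calloc(1, sizeof(*dev));

            if (dev == NULL)
                return NULL;
            dev->devfn = slot << 3;
            bus->slots[slot] = dev;
            return dev;
        }
    }
    return NULL;  /* bus is full */
}

/* Returns the new device's devfn, or -1 if registration failed. */
static int toy_nic_init(struct toy_bus *bus)
{
    struct toy_device *dev = toy_register_device(bus);

    /* Without this check a full bus leads to a NULL dereference below. */
    if (dev == NULL)
        return -1;

    return dev->devfn;
}
```

Filling all 32 slots makes the last successful call return devfn 248 (slot 31, function 0), and the next call reports failure instead of crashing.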
[kvm-devel] [patch 0/2] pci_register_device can fail
The pci hotadd patches make it easy to trigger segfaults when adding more devices than a single PCI bus can handle. The following 2 patches fix the pci nic devices and virtio-blk device. Now the following:

OK bus 0, slot 31, function 0 (devfn 248)
(qemu) pci_add 0 nic model=virtio
Segmentation fault

OK bus 0, slot 31, function 0 (devfn 248)
(qemu) pci_add 0 storage file=/mnt/disk1,if=virtio
Segmentation fault

become:

OK bus 0, slot 31, function 0 (devfn 248)
(qemu) pci_add 0 nic model=virtio
qemu: Unable to initialze NIC: virtio
failed to add model=virtio

OK bus 0, slot 31, function 0 (devfn 248)
(qemu) pci_add 0 storage file=/mnt/disk1,if=virtio
failed to add file=/mnt/disk1,if=virtio

thanks,
-chris
Re: [kvm-devel] [kvm-ppc-devel] [PATCH 1/5]Add some trace markers and exposeinterfaces in kernel for tracing
Hollis Blanchard wrote:
> On Sunday 20 April 2008 00:38:32 Liu, Eric E wrote:
>> Christian Ehrhardt wrote:
>>> Liu, Eric E wrote:
>>>> Hollis Blanchard wrote:
>>>>> On Wednesday 16 April 2008 01:45:34 Liu, Eric E wrote:
>>>>> [...]
>>>>> Actually... we could have kvmtrace itself insert the metadata, so
>>>>> there would be no chance of it being overwritten in the kernel
>>>>> buffers. The header could be written in tip_open_output(), and
>>>>> update fs_size accordingly.
>>>>
>>>> Yes, let kvmtrace insert the metadata is more reasonable.
>>>
>>> I wanted to note that the kvmtrace tool should, but not need to know
>>> everything about the data format. I think of e.g. changing kernel
>>> implementations that change endianness or even flags we don't yet
>>> know, but we might need in the future.
>>>
>>> What about adding another debugfs entry the kernel can use to expose
>>> the kvmtrace-metadata defined by the kernel implementation. The
>>> kvmtrace tool could then use that to build up the record by using
>>> one entry for kernel defined metadata and another to add any
>>> metadata that would be defined by kvmtrace tool itself.
>>>
>>> what about that one:
>>>
>>> struct metadata {
>>>     u32 kmagic; /* stores kernel defined metadata read from debugfs entry */
>>>     u32 umagic; /* stores userspace tool defined metadata */
>>>     u32 extra;  /* it is redundant, only use to fit into record. */
>>> }
>>>
>>> That should give us the flexibility to keep the format if we get
>>> more metadata requirements in the future.
>>
>> Yes, maybe we need metadata to indicate the changing kernel
>> implementations in the future, but adding debugfs entry seems not a
>> good approach. What about defining a similar metadata in kernel
>> rather than in userland and write it in rchan at the first time we
>> add trace data. Then we don't need kvmtrace tool to insert the
>> metadata again.
>>
>> like this:
>>
>> struct kvm_trace_metadata {
>>     u32 kmagic; /* stores kernel defined metadata */
>>     u64 extra;  /* use to fit into record. */
>> }
>
> So you've gone back to the idea about the kernel inserting a special
> trace record? How do you handle the case where this record is
> overwritten before the logging app gets a chance to extract it?
>
> This issue is why I would prefer Christian's idea of a separate
> debugfs file (*not* relay channel) to export kernel flags.

Yes, seems my original idea not a good one, I agree with you and Christian.

> At that point, kvmtrace can insert the flags in any way it wants. It
> doesn't need to appear as a trace record at all; it should simply be a
> header at the beginning of the trace file.
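As a rough illustration of where the thread ends up (metadata as a plain header at the start of the trace output rather than as a trace record), the sketch below encodes and decodes such a header in a byte buffer. It borrows the three field names from the struct proposed above; the function names, the byte-for-byte memcpy encoding, and the magic values used in any example are assumptions, and how kvmtrace would actually obtain `kmagic` from debugfs is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field names follow the struct proposed in the thread. */
struct trace_file_header {
    uint32_t kmagic;  /* kernel-defined flags, e.g. read from debugfs */
    uint32_t umagic;  /* flags defined by the userspace tool */
    uint32_t extra;   /* padding, unused */
};

/*
 * Copy the header to the start of an output buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t encode_trace_header(uint8_t *buf, size_t len,
                                  uint32_t kmagic, uint32_t umagic)
{
    struct trace_file_header hdr = { kmagic, umagic, 0 };

    if (len < sizeof(hdr))
        return 0;
    memcpy(buf, &hdr, sizeof(hdr));
    return sizeof(hdr);
}

/* Read the header back; returns 0 on success, -1 if the input is short. */
static int decode_trace_header(const uint8_t *buf, size_t len,
                               struct trace_file_header *hdr)
{
    if (len < sizeof(*hdr))
        return -1;
    memcpy(hdr, buf, sizeof(*hdr));
    return 0;
}
```

Because the header is written once by the tool and sits outside the relay buffers, it cannot be overwritten by later trace records, which is the concern raised above.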
Re: [kvm-devel] Using -kernel .. with -drive ...
On Tue, Apr 22, 2008 at 12:59 AM, Damjan [EMAIL PROTECTED] wrote:
>>> If I try
>>> $ qemu -kernel minimal-kernel -drive file=jeos-devel.img,if=virtio
>>> I get the following error:
>>> A disk image must be given for 'hda' when booting a Linux kernel
>>>
>>> is this neccesseary?
>>
>> Hi,
>> i don't know if it is always needed (I saw some patches to avoid
>> that) but you can use:
>>
>> qemu -kernel minimal-kernel -drive file=jeos-devel.img,if=virtio -hda /dev/zero
>
> cool that works, thanks

Try the linuxboot option ROM, and you might not need the trick -hda /dev/zero

Thanks,
Q
[kvm-devel] [RFC PATCH] virtio: change config to guest endian.
[Christian, Hollis, how much is this ABI breakage going to hurt you?]

A recently proposed feature addition to the virtio block driver revealed some flaws in the API, in particular how easy it is to break big-endian machines.

The virtio config space was originally chosen to be little-endian, because we thought the config might be part of the PCI config space for virtio_pci. It's actually a separate mmio region, so that argument holds little water; as only x86 is currently using the virtio mechanism, we can change this (but must do so now, before the impending s390 and ppc merges).

API changes:
- __virtio_config_val() just becomes a straight vdev->config->get() call.

Signed-off-by: Rusty Russell [EMAIL PROTECTED]
---
 drivers/block/virtio_blk.c      |    4 +--
 drivers/virtio/virtio_balloon.c |    6 ++---
 include/linux/virtio_config.h   |   47 +---
 3 files changed, 21 insertions(+), 36 deletions(-)

diff -r a098f19a6da5 drivers/block/virtio_blk.c
--- a/drivers/block/virtio_blk.c	Sun Apr 20 14:41:02 2008 +1000
+++ b/drivers/block/virtio_blk.c	Sun Apr 20 15:07:45 2008 +1000
@@ -246,8 +246,8 @@ static int virtblk_probe(struct virtio_d
 	blk_queue_ordered(vblk->disk->queue, QUEUE_ORDERED_TAG, NULL);

 	/* Host must always specify the capacity. */
-	__virtio_config_val(vdev, offsetof(struct virtio_blk_config, capacity),
-			    &cap);
+	vdev->config->get(vdev, offsetof(struct virtio_blk_config, capacity),
+			  &cap, sizeof(cap));

 	/* If capacity is too big, truncate with warning. */
 	if ((sector_t)cap != cap) {
diff -r a098f19a6da5 drivers/virtio/virtio_balloon.c
--- a/drivers/virtio/virtio_balloon.c	Sun Apr 20 14:41:02 2008 +1000
+++ b/drivers/virtio/virtio_balloon.c	Sun Apr 20 15:07:45 2008 +1000
@@ -155,9 +155,9 @@ static inline s64 towards_target(struct
 static inline s64 towards_target(struct virtio_balloon *vb)
 {
 	u32 v;
-	__virtio_config_val(vb->vdev,
-			    offsetof(struct virtio_balloon_config, num_pages),
-			    &v);
+	vb->vdev->config->get(vb->vdev,
+			      offsetof(struct virtio_balloon_config, num_pages),
+			      &v);
 	return v - vb->num_pages;
 }

diff -r a098f19a6da5 include/linux/virtio_config.h
--- a/include/linux/virtio_config.h	Sun Apr 20 14:41:02 2008 +1000
+++ b/include/linux/virtio_config.h	Sun Apr 20 15:07:45 2008 +1000
@@ -16,7 +16,7 @@
 #define VIRTIO_CONFIG_S_FAILED		0x80

 #ifdef __KERNEL__
-struct virtio_device;
+#include <linux/virtio.h>

 /**
  * virtio_config_ops - operations for configuring a virtio device
@@ -30,13 +30,11 @@ struct virtio_device;
  *	offset: the offset of the configuration field
  *	buf: the buffer to write the field value into.
  *	len: the length of the buffer
- *	Note that contents are conventionally little-endian.
  * @set: write the value of a configuration field
  *	vdev: the virtio_device
  *	offset: the offset of the configuration field
  *	buf: the buffer to read the field value from.
  *	len: the length of the buffer
- *	Note that contents are conventionally little-endian.
  * @get_status: read the status byte
  *	vdev: the virtio_device
  *	Returns the status byte
@@ -70,40 +68,27 @@ struct virtio_config_ops
 };

 /**
- * virtio_config_val - look for a feature and get a single virtio config.
+ * virtio_config_val - look for a feature and get a virtio config entry.
  * @vdev: the virtio device
  * @fbit: the feature bit
  * @offset: the type to search for.
  * @val: a pointer to the value to fill in.
  *
  * The return value is -ENOENT if the feature doesn't exist.  Otherwise
- * the value is endian-corrected and returned in v. */
-#define virtio_config_val(vdev, fbit, offset, v) ({ \
-	int _err; \
-	if ((vdev)->config->feature((vdev), (fbit))) { \
-		__virtio_config_val((vdev), (offset), (v)); \
-		_err = 0; \
-	} else \
-		_err = -ENOENT; \
-	_err; \
-})
+ * the config value is copied into whatever is pointed to by v. */
+#define virtio_config_val(vdev, fbit, offset, v) \
+	virtio_config_buf((vdev), (fbit), (offset), (v), sizeof(v))

-/**
- * __virtio_config_val - get a single virtio config without feature check.
- * @vdev: the virtio device
- * @offset: the type to search for.
- * @val: a pointer to the value to fill in.
- *
- * The value is endian-corrected and returned in v. */
-#define __virtio_config_val(vdev, offset, v) do { \
-
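To make the endianness point concrete, here is a rough userspace sketch, not kernel code, contrasting a guest-endian get() (a plain byte copy, as the patch proposes) with explicit little-endian decoding (the convention being dropped). Every name here (demo_device, demo_blk_config, demo_config_get and so on) is hypothetical and only stands in for the virtio structures; the config space is modelled as an ordinary byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical config layout, standing in for struct virtio_blk_config. */
struct demo_blk_config {
    uint64_t capacity;
};

/* Hypothetical device: the config space is modelled as plain bytes. */
struct demo_device {
    unsigned char config[sizeof(struct demo_blk_config)];
};

/* Host side: store a field in native (guest) byte order. */
void demo_config_set(struct demo_device *dev, size_t offset,
                     const void *buf, size_t len)
{
    assert(offset + len <= sizeof(dev->config));
    memcpy(dev->config + offset, buf, len);
}

/* Guest-endian get(): a raw copy with no byte swapping, so the same
 * driver code behaves identically on big- and little-endian machines. */
void demo_config_get(const struct demo_device *dev, size_t offset,
                     void *buf, size_t len)
{
    assert(offset + len <= sizeof(dev->config));
    memcpy(buf, dev->config + offset, len);
}

/* For contrast: decoding a field that is defined to be little-endian.
 * Every reader has to remember to do this, which is easy to forget on
 * an x86-only code base where a raw copy happens to give the same result. */
uint64_t demo_config_get_le64(const struct demo_device *dev, size_t offset)
{
    uint64_t v = 0;
    int i;

    assert(offset + 8 <= sizeof(dev->config));
    for (i = 7; i >= 0; i--)
        v = (v << 8) | dev->config[offset + i];
    return v;
}
```

With the guest-endian convention a driver only needs demo_config_get() with offsetof() and sizeof(), mirroring the vdev->config->get() calls in the patch; the little-endian helper is shown only to illustrate the conversion step that callers would otherwise have to get right everywhere.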
Re: [kvm-devel] [patch 2/2] [PATCH] virtio-blk: virtio_pci_init can fail
* Marcelo Tosatti ([EMAIL PROTECTED]) wrote:
> Looks good. Does SCSI handle pci_register_device() failure too?

Yeah, but it missed actually checking the return value from lsi_scsi_init. Patch to follow.

thanks,
-chris
[kvm-devel] [patch 3/2] hotadd: lsi_scsi_init can fail
During hotadd of SCSI devices lsi_scsi_init() handles a failed pci_register_device(), but qemu_system_hot_add_storage() will try to attach a drive anyway. Handle this error case rather than generating a SEGV.

Cc: Marcelo Tosatti [EMAIL PROTECTED]
Signed-off-by: Chris Wright [EMAIL PROTECTED]
---
 qemu/hw/device-hotplug.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/qemu/hw/device-hotplug.c
+++ b/qemu/hw/device-hotplug.c
@@ -125,7 +125,7 @@ static PCIDevice *qemu_system_hot_add_st
     switch (type) {
     case IF_SCSI:
         opaque = lsi_scsi_init (pci_bus, -1);
-        if (drive_idx >= 0)
+        if (opaque && drive_idx >= 0)
             lsi_scsi_attach (opaque, drives_table[drive_idx].bdrv,
                              drives_table[drive_idx].unit);
         break;
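The failure mode fixed above can be sketched outside of qemu as follows. This is only an illustration under assumptions: demo_controller, demo_controller_init(), demo_controller_attach() and demo_hot_add() are hypothetical stand-ins, and the sketch assumes the init routine reports failure by returning NULL, which is what the added `opaque &&` check implies for lsi_scsi_init().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque controller state. */
struct demo_controller {
    int attached_units;
};

/* Hypothetical init routine: returns NULL when device registration
 * fails (simulated here by a flag), otherwise a fresh controller. */
struct demo_controller *demo_controller_init(int simulate_failure)
{
    if (simulate_failure)
        return NULL;
    return calloc(1, sizeof(struct demo_controller));
}

/* Attaching dereferences the controller, so a NULL pointer here would
 * crash, which is the SEGV the patch avoids. */
void demo_controller_attach(struct demo_controller *c)
{
    assert(c != NULL);
    c->attached_units++;
}

/* Hot-add path: attach a drive only if init succeeded AND a drive was
 * actually requested (drive_idx >= 0), mirroring the patched condition. */
struct demo_controller *demo_hot_add(int simulate_failure, int drive_idx)
{
    struct demo_controller *c = demo_controller_init(simulate_failure);

    if (c && drive_idx >= 0)
        demo_controller_attach(c);
    return c;
}
```

The caller still sees the NULL return from demo_hot_add() and can report the failure; the guard just makes sure the attach step is never reached with an invalid controller.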