Re: win8 installation iso cannot boot on rhel6.2 kernel?
Gleb:

Would you please tell me where to find the RHEL 6.4 kernel, as the current latest official release is 6.3? And could you figure out the root cause that produces the problem?

Thanks.

Regards.

Suya

2013/2/5 Gleb Natapov :
> On Tue, Feb 05, 2013 at 02:55:07PM +0800, ya su wrote:
>> I use the following cmd on the rhel6.2 kernel 2.6.32-220.17.1:
>> x86_64-softmmu/qemu-system-x86_64 -hda win8.img -cdrom
>> window_8_pro.iso -m 2048 -L pc-bios -cpu host
>> It displays the following error:
>> Your PC needs to restart.
>> Please hold down the power button.
>> Error Code: 0x005D
>> Parameters:
>> 0x03100A00
>> 0x68747541
>> 0x69746E65
>> 0x444D4163
>>
>> I also tried the newest rhel6 kernel version, 2.6.32-279.19.1; it
>> brings out the same result.
>>
>> If I try the standard kernel 2.6.32 version, it can boot normally.
>>
>> Any suggestions? Thanks.
>>
> You need the latest RHEL 6.4 kernel.
>
> --
> Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
win8 installation iso cannot boot on rhel6.2 kernel?
I use the following cmd on the rhel6.2 kernel 2.6.32-220.17.1:

x86_64-softmmu/qemu-system-x86_64 -hda win8.img -cdrom window_8_pro.iso -m 2048 -L pc-bios -cpu host

It displays the following error:

Your PC needs to restart.
Please hold down the power button.
Error Code: 0x005D
Parameters:
0x03100A00
0x68747541
0x69746E65
0x444D4163

I also tried the newest rhel6 kernel version, 2.6.32-279.19.1; it brings out the same result. If I try the standard kernel 2.6.32 version, it can boot normally.

Any suggestions? Thanks.

Suya.
Re: Has anyone got kvm-kmod 3.3 working on the rhel 6.2 kernel successfully?
Jan:

Sorry for the late response to your suggestion.

I have found the patch which produces this problem; it comes from this commit: 7850ac5420803996e2960d15b924021f28e0dffc. I changed it as follows, and it works fine.

diff -ur -i kvm-kmod-3.4/x86/kvm_main.c kvm-kmod-3.4-fix/x86/kvm_main.c
--- kvm-kmod-3.4/x86/kvm_main.c 2012-05-21 23:43:02.0 +0800
+++ kvm-kmod-3.4-fix/x86/kvm_main.c 2012-06-05 12:19:37.780136969 +0800
@@ -1525,8 +1525,8 @@
 	if (memslot && memslot->dirty_bitmap) {
 		unsigned long rel_gfn = gfn - memslot->base_gfn;

-		if (!test_and_set_bit_le(rel_gfn, memslot->dirty_bitmap))
-			memslot->nr_dirty_pages++;
+		__set_bit_le(rel_gfn, memslot->dirty_bitmap);
+		memslot->nr_dirty_pages++;
 	}
 }

I think the root cause may be that clearing dirty_bitmap is not synchronized with resetting nr_dirty_pages to 0, but I don't understand why it works fine in the newer kernel.

Regards.

Suya.

2012/4/16 Jan Kiszka :
> On 2012-04-16 16:34, ya su wrote:
>> I first noticed the 3.3 release notes, which say it can compile against
>> 2.6.32-40, so I thought it could work with 2.6.32; then I tried it with
>> the rhel 2.6.32 kernel.
>
> The problem is that the RHEL 2.6.32 kernel has nothing to do with a
> standard 2.6.32 as too many features were ported back. So the version
> number based feature checks fail as you noticed.
>
> We could adapt kvm-kmod to detect that it is a RHEL kernel (there is
> surely some define), but it requires going through all the relevant
> features carefully.
>
>> I re-applied the original kvm-kmod 3.3 against the rhel 2.6.32 kernel,
>> changing only what was needed to fix compile-time redefinition errors,
>> but the problem remains the same. The patch is attached.
>>
>> I didn't go through the git commits, as there are so many changes from
>> 2.6.32 to 3.3 in the kernel.
>>
>> I think the problem may come from memory change notification.
>
> The approach to resolve this could be to identify backported features
> based on the build breakage or runtime anomalies, then analyze the
> kvm-kmod history for changes that wrapped those features, and finally
> adjust all affected code blocks. I'm open for patches and willing to
> support you on questions, but I can't work on this myself.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
Re: Has anyone got kvm-kmod 3.3 working on the rhel 6.2 kernel successfully?
I first noticed the 3.3 release notes, which say it can compile against 2.6.32-40, so I thought it could work with 2.6.32; then I tried it with the rhel 2.6.32 kernel.

I re-applied the original kvm-kmod 3.3 against the rhel 2.6.32 kernel, changing only what was needed to fix compile-time redefinition errors, but the problem remains the same. The patch is attached.

I didn't go through the git commits, as there are so many changes from 2.6.32 to 3.3 in the kernel.

I think the problem may come from memory change notification.

2012/4/16, Jan Kiszka :
> On 2012-04-16 14:23, ya su wrote:
>> kvm-kmod 3.3 patch attached.
>>
>> I also changed the kernel to export __get_user_pages_fast.
>
> Ugh, that's huge. How did you select which feature to enable? Based on
> compile tests? The risk would then be to miss some bits that are
> additionally required. Or did you step through the git commits to
> identify corresponding parts?
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

kvm-kmod-min.patch
Description: Binary data
Re: Has anyone got kvm-kmod 3.3 working on the rhel 6.2 kernel successfully?
kvm-kmod 3.3 patch attached.

I also changed the kernel to export __get_user_pages_fast.

Regards.

Suya.

2012/4/16, Jan Kiszka :
> On 2012-04-16 12:12, ya su wrote:
>> Hi, all:
>>
>> I tried to compile kvm-kmod 3.3 against the redhat 2.6.32-220.7.1
>> kernel. After changing some macros in external-module-compat-comm.h,
>> external-module-compat.h, and some C files, I can finally compile
>> and run qemu-kvm (0.12, rhel release) with the 3.3 module. Everything
>> looks fine except that the screen display does not refresh correctly;
>> it looks like the display-card memory does not get updated in time
>> when it changes.
>>
>> Can anyone give me some clues? Thanks.
>
> Maybe you can share your adaptations to make it easier to assess to what
> degree that kernel is different from a real 2.6.32 kernel.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

kvm-kmod.patch
Description: Binary data
Has anyone got kvm-kmod 3.3 working on the rhel 6.2 kernel successfully?
Hi, all:

I tried to compile kvm-kmod 3.3 against the redhat 2.6.32-220.7.1 kernel. After changing some macros in external-module-compat-comm.h, external-module-compat.h, and some C files, I can finally compile and run qemu-kvm (0.12, rhel release) with the 3.3 module. Everything looks fine except that the screen display does not refresh correctly; it looks like the display-card memory does not get updated in time when it changes.

Can anyone give me some clues? Thanks.

Regards.

Suya.
Re: about NPIV with qemu-kvm.
hi, hannes

I really appreciate your clearing up my confusion.

To get a VM's storage I/O performance close to the hardware's, it seems the only way is something like SR-IOV on the HBA card; NPIV cannot achieve this goal. I remember that LSI released some kind of SAS controller (IR 2008?) which supports SR-IOV, but there is no document that describes the configuration steps. I wonder if you have any clues to help? Thanks.

Regards.

Suya.

2011/10/26, Hannes Reinecke :
> On 10/26/2011 06:40 AM, ya su wrote:
>> hi, hannes:
>>
>> I want to use NPIV with qemu-kvm. I issued the following command:
>>
>> echo ':' > /sys/class/fc_host/host0/vport_create
>>
>> and it produces a new host6 and one vport successfully, but it
>> does not create any virtual hba pci device, so I don't know how to
>> assign the virtual host to qemu-kvm.
>>
> Well, you can't. There is no mechanism for it. When using NPIV you need
> to pass in the individual LUNs via e.g. virtio-blk.
>
>> From your mail, does the array first need to assign a lun to
>> this vport? And through this newly created disk, like device /dev/sdf,
>> I then add qemu-kvm with -drive file=/dev/sdf,if=virtio... arguments?
>>
> Yes. That's what you need to do.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke zSeries & Storage
> h...@suse.de +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)
about NPIV with qemu-kvm.
hi, hannes:

I want to use NPIV with qemu-kvm. I issued the following command:

echo ':' > /sys/class/fc_host/host0/vport_create

and it produces a new host6 and one vport successfully, but it does not create any virtual hba pci device, so I don't know how to assign the virtual host to qemu-kvm.

From your mail, does the array first need to assign a lun to this vport? And through this newly created disk, like device /dev/sdf, I then add qemu-kvm with -drive file=/dev/sdf,if=virtio... arguments?

Regards.

Suya.

2011/6/29, Hannes Reinecke :
> On 06/29/2011 12:07 PM, Christoph Hellwig wrote:
>> On Wed, Jun 29, 2011 at 10:39:42AM +0100, Stefan Hajnoczi wrote:
>>> I think we're missing a level of addressing. We need the ability to
>>> talk to multiple target ports in order for "list target ports" to make
>>> sense. Right now there is one implicit target that handles all
>>> commands. That means there is one fixed I_T Nexus.
>>>
>>> If we introduce "list target ports" we also need a way to say "This
>>> CDB is destined for target port #0". Then it is possible to enumerate
>>> target ports and address targets independently of the LUN field in the
>>> CDB.
>>>
>>> I'm pretty sure this is also how SAS and other transports work. In
>>> their framing they include the target port.
>>
>> Yes, exactly. Hierarchical LUNs are a nasty fringe feature that we
>> should avoid as much as possible, that is for everything but IBM vSCSI
>> which is braindead enough to force them.
>>
> Yep.
>
>>> The question is whether we really need to support multiple targets on
>>> a virtio-scsi adapter or not. If you are selectively mapping LUNs
>>> that the guest may access, then multiple targets are not necessary.
>>> If we want to do pass-through of the entire SCSI bus then we need
>>> multiple targets but I'm not sure if there are other challenges like
>>> dependencies on the transport (Fibre Channel, SAS, etc) which make it
>>> impossible to pass through bus-level access?
>>
>> I don't think bus-level pass-through is either easily possible nor
>> desirable. What multiple targets are useful for is allowing more
>> virtual disks than we have virtual PCI slots. We could do this by
>> supporting multiple LUNs, but given that many SCSI resources are
>> target-based, doing multiple targets most likely is the more scalable
>> and more logical variant. E.g. we could much more easily have one
>> virtqueue per target than per LUN.
>>
> The general idea here is that we can support NPIV.
> With NPIV we'll have several scsi_hosts, each of which is assigned a
> different set of LUNs by the array.
> With virtio we need to be able to react on LUN remapping on the array
> side, i.e. we need to be able to issue a 'REPORT LUNS' command and
> add/remove LUNs on the fly. This means we have to expose the
> scsi_host in some way via virtio.
>
> This is impossible with a one-to-one mapping between targets and
> LUNs. The actual bus-level pass-through will be just on the SCSI
> layer, i.e. 'REPORT LUNS' should be possible. If and how we do a LUN
> remapping internally on the host is a totally different matter.
> Same goes for the transport details; I doubt we will expose all the
> dingy details of the various transports, but rather restrict
> ourselves to an abstract transport.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke zSeries & Storage
> h...@suse.de +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
Greatly different disk-write performance of winxp vm compared with linux vm, and with different storage controllers.
hi, all

I run a winxp vm on a Dell R710 server with an LSI SAS 1068E controller connected to one FUJITSU MBA3147RC disk (SAS, 15k rpm, no raid setting); virtio-blk and aio=native are used. If I copy a big file in the VM, iotop shows about 3-5 MB/s.

If I run dd on the host, it can reach about 90 MB/s (dd if=/dev/zero of=1.test bs=1M count=1024 oflag=direct); if I run the same dd command in an FC14 vm, it can reach about 70 MB/s.

I then attached a SATA disk (ST31000528AS) on the IDE/SATA controller (Intel 82801IB); the winxp vm can reach about 30 MB/s while copying a file, viewed from iotop.

The problem is: why does the winxp vm achieve such slow disk-write performance on the faster storage device? The reasons seem to come from winxp's way of copying a file and from the different storage devices. Is there any tool or way to dig deeper into this, so that the winxp vm can copy files normally?

BTW: I have tried qemu-kvm 0.15 and kvm-kmod-3.0b; the result remains the same.

Thanks.

Suya.
Re: [PATCH] KVM: APIC: avoid instruction emulation for EOI writes
hi, Kevin:

I applied the patch on 2.6.39-rc7+ and ran a winxp VM; there are many ICR write emulations. The trace-cmd output is as follows:

600281: kvm_entry: vcpu 0
600284: kvm_exit: reason APIC_ACCESS rip 0x806e7b85 info 3
600285: kvm_emulate_insn: 0:806e7b85: f7 05 00 03 fe ff 00 10 00 0
600286: kvm_apic: apic_read APIC_ICR = 0x40041
600286: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
600287: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
600287: kvm_apic: apic_write APIC_ICR = 0x40041
600287: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
600288: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge)
600288: kvm_entry: vcpu 0
600289: kvm_exit: reason APIC_ACCESS rip 0x806e7b91 info 1
600290: kvm_emulate_insn: 0:806e7b91: 89 0d 00 03 fe ff
600291: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
600291: kvm_apic: apic_write APIC_ICR = 0x40041
600291: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
600291: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge) (coalesced)
600292: kvm_entry: vcpu 0
600293: kvm_exit: reason APIC_ACCESS rip 0x806e7b97 info 3
600294: kvm_emulate_insn: 0:806e7b97: f7 05 00 03 fe ff 00 10 00 0
600294: kvm_apic: apic_read APIC_ICR = 0x40041
600295: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
600295: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
600295: kvm_apic: apic_write APIC_ICR = 0x40041
600296: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
600296: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge) (coalesced)

The asm code at addr 0x80637b85 is:

0x80637b85: testl $0x1000, 0xfffe0300
0x80637b8f: jne 0x80637b85
0x80637b91: mov %ecx, 0xfffe0300
0x80637b97: testl $0x1000, 0xfffe0300
0x80637ba1: jne 0x80637b97

I wonder why the testl operation also causes an ICR write. From the asm code, only one IPI should be issued, but from trace-cmd three IPIs were issued. Is there something wrong?
Is it also possible to optimize ICR write emulation? From this result, a winxp VM produces a lot of ICR writes.

Regards!

Suya.

2011/8/29 Avi Kivity :
> On 08/29/2011 04:55 PM, Jan Kiszka wrote:
>> On 2011-08-29 13:11, Avi Kivity wrote:
>>> On 08/29/2011 02:03 PM, Jan Kiszka wrote:
>>>>> Just reading the first byte requires a guest page table walk. This
>>>>> is probably the highest cost in emulation (which also requires a
>>>>> walk for the data access).
>>>>
>>>> And what about caching the result of the first walk? Usually, a "sane
>>>> guest" won't have many code pages that issue the EOI.
>>>
>>> There's no way to know when to invalidate the cache.
>>
>> Set the affected code page read-only?
>
> The virt-phys mapping could change too. And please, don't think of new
> reasons to write protect pages, they break up my lovely 2M maps.
>
>>> We could go a bit further, and cache the whole thing. On the first
>>> exit, do the entire emulation, and remember %rip. On the second exit,
>>> if %rip matches, skip directly to kvm_lapic_eoi().
>>>
>>> But I don't think it's worth it. This also has failure modes, and
>>> really, no guest will ever write to EOI with stosl.
>>
>> ...or add/sub/and/or etc.
>
> Argh, yes, flags can be updated.
>
> Actually, this might work - if we get a read access first as part of the
> RMW, we'll emulate the instruction. No idea what the hardware does in
> this case.
>
>> Well, we've done other crazy things in the past just to keep even the
>> unlikely case correct. I was just wondering if that policy changed.
>
> I can't answer yes to that question. But I see no way to make it work
> both fast and correct.
>
>> However, I just realized that user space is able to avoid this
>> inaccuracy for potentially insane guests by not using in-kernel
>> irqchips. So we have at least a knob.
>
> Could/should have a flag to disable this in the kernel as well.
>
> --
> error compiling committee.c: too many arguments to function
Re: windows workload: many ept_violation and mmio exits
hi, Avi:

I met the same problem: tons of hpet vm_exits (vector 209; the fault address is in the guest VM's hpet MMIO range). Even if I disable the hpet device in the win7 guest VM, it still produces a large amount of vm_exits under trace-cmd. I added -no-hpet when starting the VM, but the VM still has an HPET device inside. Does that mean the HPET device in the VM does not depend on the emulated hpet device in qemu-kvm? Is there any way to disable the VM's HPET device to prevent so many vm_exits? Thanks.

Regards.

Suya.

2009/12/3 Avi Kivity :
> On 12/03/2009 03:46 PM, Andrew Theurer wrote:
>> I am running a windows workload which has 26 windows VMs running many
>> instances of a J2EE workload. There are 13 pairs of an application
>> server VM and database server VM. There seem to be quite a lot of
>> vm_exits, and it looks like over a third of them are mmio_exit:
>>
>>> efer_relo 0
>>> exits 337139
>>> fpu_reloa 247321
>>> halt_exit 19092
>>> halt_wake 18611
>>> host_stat 247332
>>> hypercall 0
>>> insn_emul 184265
>>> insn_emul 184265
>>> invlpg 0
>>> io_exits 69184
>>> irq_exits 52953
>>> irq_injec 48115
>>> irq_windo 2411
>>> largepage 19
>>> mmio_exit 123554
>>
>> I collected a kvmtrace, and below is a very small portion of that. Is
>> there a way I can figure out what device the mmio's are for?
>
> We want 'info physical_address_space' in the monitor.
>
>> Also, is it normal to have lots of ept_violations? This is a 2 socket
>> Nehalem system with SMT on.
>
> So long as pf_fixed is low, these are all mmio or apic accesses.
>>> qemu-system-x86-19673 [014] 213577.939624: kvm_page_fault: address fed000f0 error_code 181
>>> qemu-system-x86-19673 [014] 213577.939627: kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
>>> qemu-system-x86-19673 [014] 213577.939629: kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0xfb8f214d
>
> hpet
>
>>> qemu-system-x86-19673 [014] 213577.939631: kvm_entry: vcpu 0
>>> qemu-system-x86-19673 [014] 213577.939633: kvm_exit: reason ept_violation rip 0xf8000160ef8e
>>> qemu-system-x86-19673 [014] 213577.939634: kvm_page_fault: address fed000f0 error_code 181
>
> hpet - was this the same exit? we ought to skip over the emulated
> instruction.
>
>>> qemu-system-x86-19673 [014] 213577.939693: kvm_page_fault: address fed000f0 error_code 181
>>> qemu-system-x86-19673 [014] 213577.939696: kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
>
> hpet
>
>>> qemu-system-x86-19332 [008] 213577.939699: kvm_exit: reason ept_violation rip 0xf80001b3af8e
>>> qemu-system-x86-19332 [008] 213577.939700: kvm_page_fault: address fed000f0 error_code 181
>>> qemu-system-x86-19673 [014] 213577.939702: kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0xfb8f3da6
>
> hpet
>
>>> qemu-system-x86-19332 [008] 213577.939706: kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
>>> qemu-system-x86-19563 [010] 213577.939707: kvm_ioapic_set_irq: pin 11 dst 1 vec=130 (LowPrio|logical|level)
>>> qemu-system-x86-19332 [008] 213577.939713: kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0x29a105de
>
> hpet ...
>
>>> qemu-system-x86-19673 [014] 213577.939908: kvm_ioapic_set_irq: pin 11 dst 1 vec=130 (LowPrio|logical|level)
>>> qemu-system-x86-19673 [014] 213577.939910: kvm_entry: vcpu 0
>>> qemu-system-x86-19673 [014] 213577.939912: kvm_exit: reason apic_access rip 0xf800016a050c
>>> qemu-system-x86-19673 [014] 213577.939914: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
>
> apic eoi
>
>>> qemu-system-x86-19332 [008] 213577.939958: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
>>> qemu-system-x86-19673 [014] 213577.939958: kvm_pic_set_irq: chip 1 pin 3 (level|masked)
>>> qemu-system-x86-19332 [008] 213577.939958: kvm_apic: apic_write APIC_EOI = 0x0
>
> apic eoi
>
>>> qemu-system-x86-19673 [014] 213577.940010: kvm_exit: reason cr_access rip 0xf800016ee2b2
>>> qemu-system-x86-19673 [014] 213577.940011: kvm_cr: cr_write 4 = 0x678
>>> qemu-system-x86-19673 [014] 213577.940017: kvm_entry: vcpu 0
>>> qemu-system-x86-19673 [014] 213577.940019: kvm_exit: reason cr_access rip 0xf800016ee2b5
>>> qemu-system-x86-19673 [014] 213577.940019: kvm_cr: cr_write 4 = 0x6f8
>
> toggling global pages, we can avoid that with CR4_GUEST_HOST_MASK.
>
> So, tons of hpet and eois. We can accelerate both by using the hyper-V
> accelerations; we already have some (unmerged) code for eoi, so this
> should be improved soon.
>
>> Here is oprofile:
>>
>>> 4117817 62.2029 kvm-intel.ko kvm-intel.ko vmx_vcpu_run
>>> 338198 5.1087 qemu-system-x86_64 qemu-system-x86_64 /usr/local/qemu/48bb360cc687b89b74dfb1cac0f6e8812b64841c/bin/qemu-system-x86_64
>>> 62449 0.9433 kvm.ko kvm.ko kvm_arch_vcpu_ioctl_run
>>> 56512 0.8537 vmlinux-2.6.32-rc7-5e8cb552cb8b48244b6d07bff984b3c4080d4bc9-autokern1 vmlinux-2.6.32-rc7-5e8cb552cb8b48
Re: Large number of NMI_INTERRUPT exits degrades winxp VM performance badly.
Hi, Avi:

Your guess is right: the fast server is AMD with NPT; this slow server is Intel's 7430 with no EPT. I now understand the reserved bit comes from kvm's virtual soft-mmu.

But there is still one confusing problem: why does an FC14 VM have much better storage IO performance on the same host? I always check the IO on the host with iotop when copying files or running fio in the VM. When running the FC14 VM guest, it can reach 30 MB/s, while copying a file or running fio in the winxp VM guest gives about 2-3 MB/s.

FC14's trace-cmd output is as follows:

qemu-kvm-7636 [006] 897.452208: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.452213: kvm_exit: reason EXCEPTION_NMI rip 0x8100b5fa
qemu-kvm-7636 [006] 897.452217: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.452408: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x81009ddd
qemu-kvm-7636 [006] 897.452411: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.452437: kvm_exit: reason CR_ACCESS rip 0x8103fadd
qemu-kvm-7636 [006] 897.452437: kvm_cr: cr_write 3 = 0x7a709000
qemu-kvm-7636 [006] 897.452442: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.453113: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8103d12e
qemu-kvm-7636 [006] 897.453116: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge)
qemu-kvm-7636 [006] 897.453120: kvm_inj_virq: irq 239
qemu-kvm-7636 [006] 897.453121: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.453126: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636 [006] 897.453134: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
qemu-kvm-7636 [006] 897.453135: kvm_apic: apic_write APIC_EOI = 0x0
qemu-kvm-7636 [006] 897.453137: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.453155: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636 [006] 897.453159: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0xe6f5
qemu-kvm-7636 [006] 897.453160: kvm_apic: apic_write APIC_TMICT = 0xe6f5
qemu-kvm-7636 [006] 897.453164: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.453373: kvm_exit: reason IO_INSTRUCTION rip 0x812243e4
qemu-kvm-7636 [006] 897.453378: kvm_pio: pio_write at 0xc050 size 2 count 1
qemu-kvm-7636 [006] 897.453625: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.453984: kvm_exit: reason IO_INSTRUCTION rip 0x812243e4
qemu-kvm-7636 [006] 897.453984: kvm_pio: pio_write at 0xc050 size 2 count 1
qemu-kvm-7636 [006] 897.454198: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge)
qemu-kvm-7636 [006] 897.454201: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.454206: kvm_exit: reason PENDING_INTERRUPT rip 0x81201c95
qemu-kvm-7636 [006] 897.454209: kvm_inj_virq: irq 239
qemu-kvm-7636 [006] 897.454209: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.454212: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636 [006] 897.454220: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
qemu-kvm-7636 [006] 897.454222: kvm_apic: apic_write APIC_EOI = 0x0
qemu-kvm-7636 [006] 897.454225: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.454238: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636 [006] 897.454243: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0xd29f
qemu-kvm-7636 [006] 897.454243: kvm_apic: apic_write APIC_TMICT = 0xd29f
qemu-kvm-7636 [006] 897.454247: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.454405: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8113e91f
qemu-kvm-7636 [006] 897.454410: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.454547: kvm_exit: reason IO_INSTRUCTION rip 0x812243e4
qemu-kvm-7636 [006] 897.454548: kvm_pio: pio_write at 0xc050 size 2 count 1
qemu-kvm-7636 [006] 897.454690: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.454714: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636 [006] 897.454720: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0x7e
qemu-kvm-7636 [006] 897.454721: kvm_apic: apic_write APIC_TMICT = 0x7e
qemu-kvm-7636 [006] 897.454725: kvm_entry: vcpu 0
qemu-kvm-7636 [006] 897.454730: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636 [006] 897.454733: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0x1c040d
qemu-kvm-7636 [006] 897.454735: kvm_apic: apic_write APIC_TMICT = 0x1c040d
qemu-kvm-7636 [006] 897.454737: kvm_entry: vcpu 0
Re: Large number of NMI_INTERRUPT exits degrades winxp VM performance badly.
To rule out guest settings, I ran the same winxp image on another server with the same kernel/qemu-kvm/command; the copy is fast. So I think this problem relates only to some special hardware of the host. The fast server's trace-cmd output is as follows:

qemu-system-x86-7681 [001] 20054.604841: kvm_entry: vcpu 0
qemu-system-x86-7681 [001] 20054.604842: kvm_exit: reason UNKNOWN rip 0x806e7d33
qemu-system-x86-7681 [001] 20054.604842: kvm_page_fault: address fee000b0 error_code 6
qemu-system-x86-7681 [001] 20054.604843: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
qemu-system-x86-7681 [001] 20054.604843: kvm_apic: apic_write APIC_EOI = 0x0
qemu-system-x86-7681 [001] 20054.604844: kvm_entry: vcpu 0
qemu-system-x86-7681 [001] 20054.604917: kvm_exit: reason UNKNOWN rip 0xbff63b14
qemu-system-x86-7681 [001] 20054.604917: kvm_page_fault: address b8040 error_code 4
qemu-system-x86-7681 [001] 20054.604920: kvm_mmio: mmio unsatisfied-read len 1 gpa 0xb8040 val 0x0
qemu-system-x86-7681 [001] 20054.604923: kvm_mmio: mmio read len 1 gpa 0xb8040 val 0x0
qemu-system-x86-7681 [001] 20054.604924: kvm_mmio: mmio write len 1 gpa 0xb8040 val 0x0
qemu-system-x86-7681 [001] 20054.604925: kvm_entry: vcpu 0
qemu-system-x86-7681 [001] 20054.604926: kvm_exit: reason UNKNOWN rip 0xbff63b1a
qemu-system-x86-7681 [001] 20054.604927: kvm_page_fault: address b801a error_code 6
qemu-system-x86-7681 [001] 20054.604928: kvm_mmio: mmio write len 1 gpa 0xb801a val 0xd
qemu-system-x86-7681 [001] 20054.604928: kvm_entry: vcpu 0
qemu-system-x86-7681 [001] 20054.604929: kvm_exit: reason UNKNOWN rip 0xbff63b23
qemu-system-x86-7681 [001] 20054.604929: kvm_page_fault: address b8014 error_code 6
qemu-system-x86-7681 [001] 20054.604930: kvm_mmio: mmio write len 4 gpa 0xb8014 val 0x15f900

According to Tian's suggestion, the NMI is produced by the guest's write to a reserved page. Is there any way to find out why the slow-copy server reserves the memory page?

I checked the server's memory: there is a large amount of free space, and no swap is used. I have also tested the server with kernel 2.6.39; the problem remains.

Regards.

Suya.

2011/8/11 Tian, Kevin :
>> From: ya su
>> Sent: Thursday, August 11, 2011 11:57 AM
>>
>> When I run a winxp guest on one server, copying one file of about 4G
>> takes 40-50 min; if I run an FC14 guest, it takes about 2-3 min.
>>
>> I copied and ran the winxp image on another server; it works well,
>> taking about 3 min.
>>
>> I ran trace-cmd while copying files; the main difference between the
>> two outputs is that the slow one has many NMI_INTERRUPT vm_exits,
>> while the fast output has no such vm_exit. Both servers have NMI
>> enabled by default. The slow one's output is as follows:
>> qemu-system-x86-4454 [004] 549.958147: kvm_entry: vcpu 0
>> qemu-system-x86-4454 [004] 549.958172: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>> qemu-system-x86-4454 [004] 549.958172: kvm_page_fault: address c8f8a000 error_code b
>> qemu-system-x86-4454 [004] 549.958177: kvm_entry: vcpu 0
>> qemu-system-x86-4454 [004] 549.958202: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>> qemu-system-x86-4454 [004] 549.958204: kvm_page_fault: address c8f8b000 error_code b
>> qemu-system-x86-4454 [004] 549.958209: kvm_entry: vcpu 0
>> qemu-system-x86-4454 [004] 549.958234: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>> qemu-system-x86-4454 [004] 549.958234: kvm_page_fault: address c8f8c000 error_code b
>> qemu-system-x86-4454 [004] 549.958239: kvm_entry: vcpu 0
>> qemu-system-x86-4454 [004] 549.958264: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>> qemu-system-x86-4454 [004] 549.958264: kvm_page_fault: address c8f8d000 error_code b
>> qemu-system-x86-4454 [004] 549.958267: kvm_entry: vcpu 0
>> qemu-system-x86-4454 [004] 549.958292: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>> qemu-system-x86-4454 [004] 549.958294: kvm_page_fault: address c8f8e000 error_code b
>> qemu-system-x86-4454 [004] 549.958299: kvm_entry: vcpu 0
>> qemu-system-x86-4454 [004] 549.958324: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>> qemu-system-x86-4454 [004] 549.958324: kvm_page_fault: address c8f8f000 error_code b
>> qemu-system-x86-4454 [004] 549.958329: kvm_entry: vcpu 0
>> qemu-system-x86-4454 [004] 549.958447: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8054
Large number of NMI_INTERRUPT exits degrades winxp VM performance greatly.
When I run a winxp guest on one server, copying a file of about 4G takes 40-50 minutes; if I run an FC14 guest, it takes about 2-3 minutes. If I copy the winxp image to another server and run it there, it works well and takes about 3 minutes.

I ran trace-cmd while copying files. The main difference between the two outputs is that the slow one contains many NMI_INTERRUPT vm exits, while the fast one has no such exits. Both servers have NMI enabled by default. The slow one's output is as follows:

qemu-system-x86-4454 [004] 549.958147: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958172: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958172: kvm_page_fault: address c8f8a000 error_code b
qemu-system-x86-4454 [004] 549.958177: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958202: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958204: kvm_page_fault: address c8f8b000 error_code b
qemu-system-x86-4454 [004] 549.958209: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958234: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958234: kvm_page_fault: address c8f8c000 error_code b
qemu-system-x86-4454 [004] 549.958239: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958264: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958264: kvm_page_fault: address c8f8d000 error_code b
qemu-system-x86-4454 [004] 549.958267: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958292: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958294: kvm_page_fault: address c8f8e000 error_code b
qemu-system-x86-4454 [004] 549.958299: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958324: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958324: kvm_page_fault: address c8f8f000 error_code b
qemu-system-x86-4454 [004] 549.958329: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958447: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x80547ac8
qemu-system-x86-4454 [004] 549.958450: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958461: kvm_exit: reason CR_ACCESS rip 0x8054428c
qemu-system-x86-4454 [004] 549.958461: kvm_cr: cr_write 0 = 0x80010031
qemu-system-x86-4454 [004] 549.958541: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958573: kvm_exit: reason CR_ACCESS rip 0x80546beb
qemu-system-x86-4454 [004] 549.958575: kvm_cr: cr_write 0 = 0x8001003b
qemu-system-x86-4454 [004] 549.958585: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958610: kvm_exit: reason CR_ACCESS rip 0x80546b6c
qemu-system-x86-4454 [004] 549.958610: kvm_cr: cr_write 3 = 0x6e00020
qemu-system-x86-4454 [004] 549.958621: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958645: kvm_exit: reason EXCEPTION_NMI rip 0x8051d7f4
qemu-system-x86-4454 [004] 549.958645: kvm_page_fault: address c0648200 error_code 3
qemu-system-x86-4454 [004] 549.958653: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958725: kvm_exit: reason EXCEPTION_NMI rip 0x8050a26a
qemu-system-x86-4454 [004] 549.958726: kvm_page_fault: address c0796994 error_code 3
qemu-system-x86-4454 [004] 549.958738: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958750: kvm_exit: reason IO_INSTRUCTION rip 0x806edad0
qemu-system-x86-4454 [004] 549.958750: kvm_pio: pio_write at 0xc050 size 2 count 1
qemu-system-x86-4454 [004] 549.958838: kvm_entry: vcpu 0
qemu-system-x86-4454 [004] 549.958844: kvm_exit: reason APIC_ACCESS rip 0x806e7b85
qemu-system-x86-4454 [004] 549.958852: kvm_apic: apic_read APIC_ICR = 0x40041
qemu-system-x86-4454 [004] 549.958855: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
qemu-system-x86-4454 [004] 549.958857: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
qemu-system-x86-4454 [004] 549.958858: kvm_apic: apic_write APIC_ICR = 0x40041
qemu-system-x86-4454 [004] 549.958860: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
qemu-system-x86-4454 [004] 549.958860: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge)

Even if I disable the NMI watchdog by booting the kernel with nmi_watchdog=0, the trace-cmd output still shows many NMI_INTERRUPT exits, and I find that in /proc/interrupts the NMI count is 0. Does this mean the NMIs are produced inside the winxp guest OS, or that this setting cannot stop kvm from catching NMI interrupts? I think the difference between FC14 and winxp is that FC14 processes the NMI interrupt correctly but winxp cannot; is that right?

I run qemu-kvm version 0.14.0 on kernel 2.6.32-131.6.4; changing kvm-kmod to 2.6.32-27 produces the same result. Any suggestions? thanks.

Regards.

Suya.
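Not part of the original mail, but useful for this kind of diagnosis: a small Python sketch that tallies vm-exit reasons in trace-cmd's text report, so the share of EXCEPTION_NMI exits in a trace like the one above can be quantified. The function name and the sample lines are illustrative; only the line format is taken from the output shown.

```python
import re
from collections import Counter

# Matches trace-cmd report lines like:
#   qemu-system-x86-4454 [004] 549.958172: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
EXIT_RE = re.compile(r"kvm_exit:\s+reason\s+(\S+)")

def count_exit_reasons(report_text):
    """Tally vm-exit reasons appearing in trace-cmd report output."""
    matches = (EXIT_RE.search(line) for line in report_text.splitlines())
    return Counter(m.group(1) for m in matches if m)

sample = """\
qemu-system-x86-4454 [004] 549.958172: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958234: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
qemu-system-x86-4454 [004] 549.958447: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x80547ac8
"""
print(count_exit_reasons(sample))
```

Feeding it the full report of the slow guest versus the fast one makes the difference in exit mix immediately visible.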
Re: PowerPoint performance degrades greatly when logged on to winxp through rdesktop while Lotus Notes is running.
...kvm_exit: reason UNKNOWN rip 0x806e7b91
qemu-system-x86-4239 [001] 7369.308539: kvm_page_fault: address fee00300 error_code 6
qemu-system-x86-4239 [001] 7369.308540: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
qemu-system-x86-4239 [001] 7369.308540: kvm_apic: apic_write APIC_ICR = 0x40041
qemu-system-x86-4239 [001] 7369.308540: kvm_apic_ipi: dst 0 vec 65 (Fixed|physical|de-assert|edge|self)
qemu-system-x86-4239 [001] 7369.308540: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge) (coalesced)
qemu-system-x86-4239 [001] 7369.308541: kvm_entry: vcpu 0
qemu-system-x86-4239 [001] 7369.308542: kvm_exit: reason UNKNOWN rip 0x806e7b97
qemu-system-x86-4239 [001] 7369.308542: kvm_page_fault: address fee00300 error_code 4
qemu-system-x86-4239 [001] 7369.308542: kvm_apic: apic_read APIC_ICR = 0x40041
qemu-system-x86-4239 [001] 7369.308542: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
qemu-system-x86-4239 [001] 7369.308543: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
qemu-system-x86-4239 [001] 7369.308543: kvm_apic: apic_write APIC_ICR = 0x40041
qemu-system-x86-4239 [001] 7369.308543: kvm_apic_ipi: dst 0 vec 65 (Fixed|physical|de-assert|edge|self)
qemu-system-x86-4239 [001] 7369.308543: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge) (coalesced)

There are multiple writes of 0x40041 to the APIC's APIC_ICR, and each time the vcpu exits again right after entry; is this okay?
qemu-system-x86-4239 [001] 7369.308543: kvm_entry: vcpu 0
qemu-system-x86-4239 [001] 7369.308545: kvm_exit: reason UNKNOWN rip 0x806e7f18
qemu-system-x86-4239 [001] 7369.308545: kvm_page_fault: address fee000b0 error_code 6
qemu-system-x86-4239 [001] 7369.308545: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
qemu-system-x86-4239 [001] 7369.308545: kvm_apic: apic_write APIC_EOI = 0x0
qemu-system-x86-4239 [001] 7369.308546: kvm_ack_irq: irqchip IOAPIC pin 11
qemu-system-x86-4239 [001] 7369.308546: kvm_entry: vcpu 0
qemu-system-x86-4239 [001] 7369.308560: kvm_exit: reason UNKNOWN rip 0x800ca22e
qemu-system-x86-4239 [001] 7369.308560: kvm_hypercall: nr 0x1 a0 0x41 a1 0x0 a2 0x0 a3 0x806e7410
qemu-system-x86-4239 [001] 7369.308561: kvm_entry: vcpu 0
qemu-system-x86-4239 [001] 7369.308562: kvm_exit: reason UNKNOWN rip 0x806e7d33
qemu-system-x86-4239 [001] 7369.308562: kvm_page_fault: address fee000b0 error_code 6
qemu-system-x86-4239 [001] 7369.308563: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
qemu-system-x86-4239 [001] 7369.308563: kvm_apic: apic_write APIC_EOI = 0x0
qemu-system-x86-4239 [001] 7369.308564: kvm_entry: vcpu 0
qemu-system-x86-4239 [001] 7369.308569: kvm_exit: reason UNKNOWN rip 0xf77ffd3d

Again, APIC_EOI is written twice.

qemu-system-x86-4227 [000] 7369.310965: kvm_set_irq: gsi 11 level 1 source 0

I once thought it might come from network speed, so I assigned a PCI network card to the VM, but the problem remains. So I now think the problem may come from the Windows RDP display driver's drag-and-drop implementation, especially since it can be influenced by Lotus Notes, but I cannot see how this would hinder VM scheduling. I also increased the VM memory; it has no effect. My kernel is 2.6.32-131.4.1.

Any suggestions? thanks.

Regards!

Green.
2011/7/5 Avi Kivity : > On 07/05/2011 12:40 PM, ya su wrote: >> >> I am using qemu-kvm, cli as the following: >> >> qemu-system-x86_64 -drive >> file=test-notes.img,if=virtio,cache=none,boot=on -net >> nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3 >> >> I open powerpoint 2007, and drag a rectangel, it moves very >> slowly. it must meet the following conditions to produce the same >> result: >> (1) lotus notes is running. >> (2) logon through rdestktop. >> >> if I connect through vnc, it will not happen; if I don't run >> louts notes, it will not happen. if I change to 2 vcpus as the >> following cli, it will respond much better. >> >> qemu-system-x86_64 -drive >> file=test-notes.img,if=virtio,cache=none,boot=on -net >> nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3 >> -smp 2 >> >> I first doubt that maybe it's from windows internal problem, so >> I tested on a uni-processor PC, but It looks good. >> >> I also run qemu-kvm with -no-kvm, it produce the same results. >> >> I run kvm-stat when dragging a rectangle, the output is as the following: >> >> exits 4650520 24645 >> insn_emulation 3508180 15158 >> host_state_reload 1273409 13999 >> io_exits
PowerPoint performance degrades greatly when logged on to winxp through rdesktop while Lotus Notes is running.
I am using qemu-kvm; the command line is as follows:

qemu-system-x86_64 -drive file=test-notes.img,if=virtio,cache=none,boot=on -net nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3

I open PowerPoint 2007 and drag a rectangle; it moves very slowly. The following conditions must both hold to produce this result:
(1) Lotus Notes is running.
(2) I am logged on through rdesktop.

If I connect through VNC, it does not happen; if I don't run Lotus Notes, it does not happen. If I change to 2 vcpus with the following command line, it responds much better:

qemu-system-x86_64 -drive file=test-notes.img,if=virtio,cache=none,boot=on -net nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3 -smp 2

I first suspected it might be a Windows-internal problem, so I tested on a uni-processor PC, but it looks good there. I also ran qemu-kvm with -no-kvm; it produces the same result.

I ran kvm_stat while dragging a rectangle; the output is as follows:

exits 4650520 24645
insn_emulation 3508180 15158
host_state_reload 1273409 13999
io_exits 1031465 13504
irq_injections 1791042629
hypercalls 1314812084
halt_wakeup 33589 495
halt_exits 33584 495
irq_exits 105020 237
pf_fixed 449879 106
fpu_reload 16852 54
mmio_exits 46426 1
mmu_cache_miss 9985 0
mmu_shadow_zapped 11736 0
signal_exits 2145 0
remote_tlb_flush 251 0

It seems that qemu-kvm is emulating some instructions which take much CPU resource, but I don't know how to find the emulated instructions. Any suggestions? thanks.

Regards.

Green.
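One way to answer "how to find the emulated instructions" (not from the original thread): record the kvm tracepoints, e.g. with `trace-cmd record -e kvm`, and tally the guest RIPs reported by the instruction-emulation event. A Python sketch follows; the `kvm_emulate_insn` line layout assumed here (`vcpu:rip: bytes (mode)`) may differ by kernel version, so the regex is an assumption to adjust against your own report.

```python
import re
from collections import Counter

# Assumed trace-cmd report layout for the emulation tracepoint, e.g.:
#   ... kvm_emulate_insn: 0:806e7b85: 89 08 (prot32)
# (vcpu:rip: insn bytes). Adjust the regex if your kernel's format differs.
INSN_RE = re.compile(r"kvm_emulate_insn:\s*\d+:([0-9a-f]+):")

def top_emulated_rips(report_text, n=5):
    """Return the n most frequently emulated guest RIPs in the report."""
    hits = Counter(m.group(1) for m in
                   (INSN_RE.search(line) for line in report_text.splitlines())
                   if m)
    return hits.most_common(n)

sample = """\
qemu-system-x86-4239 [001] 7369.1: kvm_emulate_insn: 0:806e7b85: 89 08 (prot32)
qemu-system-x86-4239 [001] 7369.2: kvm_emulate_insn: 0:806e7b85: 89 08 (prot32)
qemu-system-x86-4239 [001] 7369.3: kvm_emulate_insn: 0:806e7f18: 8b 08 (prot32)
"""
print(top_emulated_rips(sample))
```

The hot RIPs can then be looked up in the guest (they will typically land in the display or HAL driver for a workload like this).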
Re: USB EHCI patch for 0.14.0?
David:

I have applied the patch to 0.14.0, and there is a bug if I add an Optiarc CRRWDVD CRX890A USB device on Windows XP. I first commented out the following code in usb-linux.c:

if (is_halted(s, p->devep)) {
    ret = ioctl(s->fd, USBDEVFS_CLEAR_HALT, &urb->endpoint);
#if 0
    if (ret < 0) {
        DPRINTF("husb: failed to clear halt. ep 0x%x errno %d\n",
                urb->endpoint, errno);
        return USB_RET_NAK;
    }
#endif
    clear_halt(s, p->devep);
}

Then it can continue to run on Linux, but it still stalls on Windows XP and Win7. I turned on debugging; part of the output is as follows:

husb: async cancel. aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -2 alen 0
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x2 index 0 len 0
husb: ctrl set addr 2
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 64 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 4
invoking packet_complete. plen = 4
husb: data submit. ep 0x81 len 13 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -32 alen 0
invoking packet_complete. plen = -3
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x1 index 0 len 0
husb: ctrl set addr 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 64 aurb 0x1616cd0
[Thread 0x74f75710 (LWP 3317) exited]
husb: async completed. aurb 0x1616cd0 status 0 alen 4
invoking packet_complete. plen = 4
husb: data submit. ep 0x81 len 13 aurb 0x1616cd0
husb: async cancel. aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -2 alen 0
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x2 index 0 len 0
husb: ctrl set addr 2
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len
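A quick way to see the enumeration loop in a log like the one above is to count how often the device is reset and re-addressed. A Python sketch (the function name and sample are illustrative; the event strings are taken from the debug output shown):

```python
from collections import Counter

def husb_event_counts(log_text):
    """Count a few husb: debug events in qemu's USB pass-through output.

    A healthy enumeration resets the device a couple of times; a device
    stuck in a reset/re-address loop shows the counts climbing together.
    """
    events = ("reset device", "ctrl set addr", "async cancel", "data submit")
    counts = Counter()
    for line in log_text.splitlines():
        for ev in events:
            if "husb: " + ev in line:
                counts[ev] += 1
    return counts

sample = """\
husb: async cancel. aurb 0x1616cd0
husb: reset device 6.8
husb: ctrl set addr 2
husb: reset device 6.8
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
"""
print(husb_event_counts(sample))
```

Run against the full log, a high and repeating `reset device` count confirms the guest keeps restarting enumeration after the stalled ep 0x81 transfer.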
Re: [COMMIT] [WIN-GUEST-DRIVERS] Balloon - remove WMI usage. Remove wmi.c.
Yan:

I have tested the newest balloon driver (from 1.1.16) on Windows Server 2003; balloon.sys cannot be installed successfully and returns error code 10. Have you tested this, or are there any updates? thanks.

Regards.

Green.

2010/2/15 Yan Vugenfirer :
> repository: C:/dev/kvm-guest-drivers-windows
> branch: master
> commit 7ab588f373eda9d08a497e969739019d2075a6d2
> Author: Yan Vugenfirer
> Date: Mon Feb 15 15:01:36 2010 +0200
>
> [WIN-GUEST-DRIVERS] Balloon - remove WMI usage. Remove wmi.c.
>
> Signed-off-by: Vadim Rozenfeld
>
> diff --git a/Balloon/BalloonWDF/wmi.c b/Balloon/BalloonWDF/wmi.c
> deleted file mode 100644
> index 70a9270..000
> --- a/Balloon/BalloonWDF/wmi.c
> +++ /dev/null
> @@ -1,90 +0,0 @@
> -/**
> - * Copyright (c) 2009 Red Hat, Inc.
> - *
> - * File: device.c
> - *
> - * Author(s):
> - *
> - * This file contains WMI support routines
> - *
> - * This work is licensed under the terms of the GNU GPL, version 2. See
> - * the COPYING file in the top-level directory.
> - *
> -**/
> -#include "precomp.h"
> -
> -#if defined(EVENT_TRACING)
> -#include "wmi.tmh"
> -#endif
> -
> -
> -#define MOFRESOURCENAME L"MofResourceName"
> -
> -#ifdef ALLOC_PRAGMA
> -#pragma alloc_text(PAGE, WmiRegistration)
> -#pragma alloc_text(PAGE, EvtWmiDeviceInfoQueryInstance)
> -#endif
> -
> -NTSTATUS
> -WmiRegistration(
> -    WDFDEVICE Device
> -    )
> -{
> -    WDF_WMI_PROVIDER_CONFIG providerConfig;
> -    WDF_WMI_INSTANCE_CONFIG instanceConfig;
> -    NTSTATUS status;
> -    DECLARE_CONST_UNICODE_STRING(mofRsrcName, MOFRESOURCENAME);
> -
> -    PAGED_CODE();
> -
> -    TraceEvents(TRACE_LEVEL_INFORMATION, DBG_PNP, "--> WmiRegistration\n");
> -
> -    status = WdfDeviceAssignMofResourceName(Device, &mofRsrcName);
> -    if (!NT_SUCCESS(status)) {
> -        TraceEvents(TRACE_LEVEL_ERROR, DBG_PNP,
> -            "WdfDeviceAssignMofResourceName failed 0x%x", status);
> -        return status;
> -    }
> -
> -    WDF_WMI_PROVIDER_CONFIG_INIT(&providerConfig, &GUID_DEV_WMI_BALLOON);
> -    providerConfig.MinInstanceBufferSize = sizeof(ULONGLONG);
> -
> -    WDF_WMI_INSTANCE_CONFIG_INIT_PROVIDER_CONFIG(&instanceConfig, &providerConfig);
> -    instanceConfig.Register = TRUE;
> -    instanceConfig.EvtWmiInstanceQueryInstance = EvtWmiDeviceInfoQueryInstance;
> -
> -    status = WdfWmiInstanceCreate(Device,
> -        &instanceConfig,
> -        WDF_NO_OBJECT_ATTRIBUTES,
> -        WDF_NO_HANDLE);
> -    if (!NT_SUCCESS(status)) {
> -        TraceEvents(TRACE_LEVEL_ERROR, DBG_PNP,
> -            "WdfWmiInstanceCreate failed 0x%x", status);
> -        return status;
> -    }
> -
> -    TraceEvents(TRACE_LEVEL_INFORMATION, DBG_PNP, "<-- WmiRegistration\n");
> -    return status;
> -}
> -
> -NTSTATUS
> -EvtWmiDeviceInfoQueryInstance(
> -    __in WDFWMIINSTANCE WmiInstance,
> -    __in ULONG OutBufferSize,
> -    __out_bcount_part(OutBufferSize, *BufferUsed) PVOID OutBuffer,
> -    __out PULONG BufferUsed
> -    )
> -{
> -    PDRIVER_CONTEXT drvCxt = GetDriverContext(WdfGetDriver());
> -
> -    PAGED_CODE();
> -
> -    TraceEvents(TRACE_LEVEL_VERBOSE, DBG_WMI, "--> EvtWmiDeviceInfoQueryInstance\n");
> -
> -    RtlZeroMemory(OutBuffer, sizeof(ULONGLONG));
> -    *(ULONGLONG*) OutBuffer = (ULONGLONG)drvCxt->num_pages;
> -    *BufferUsed = sizeof(ULONGLONG);
> -
> -    TraceEvents(TRACE_LEVEL_VERBOSE, DBG_WMI, "<-- EvtWmiDeviceInfoQueryInstance\n");
> -    return STATUS_SUCCESS;
> -}
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-commits" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
SR-IOV of LSI MegaRAID storage controller?
hi, all:

I noticed news that KVM can use the SR-IOV function of the LSI MegaRAID storage controller, shown at 2009 IDT. Has anyone succeeded in testing this function, and how should the kernel and qemu be configured? thanks.

Regards.

Green.
Re: [PATCH 09/18] Introduce event-tap.
Yoshi:

I met one problem: if I kill an FT source VM, the destination FT VM returns errors like the following:

qemu-system-x86_64: fill buffer failed, Resource temporarily unavailable
qemu-system-x86_64: recv header failed

The problem is that the destination VM cannot continue to run, because it was interrupted in the middle of a transaction: some of the RAM pages have been updated, but others have not. Do you have any plan for rolling back, to cancel the interrupted transaction? thanks.

Green.

2011/3/9 Yoshiaki Tamura :
> ya su wrote:
>>
>> Yoshi:
>>
>> I think event-tap is a great idea, it remove the reading from disk
>> which will increase ft effiency much better as your plan in later
>> series.
>>
>> one question: IO read/write may dirty rams, but it is difficute to
>> differ them from other dirty pages like caused by running of
>> softwares, whether that means you need change all the emulated device
>> realization? actually I think it will not send too much rams caused
>> by IO Read/Write in ram_save_live, but if It can event-tap IO
>> read/write and replay on the other side, Does that means we don't need
>> call qemu_savevm_state_full in ft transactoins?
>
> I'm not expecting to remove qemu_savevm_state_full in the transaction. Just
> reduce the number of pages to be transfered as a result.
>
> Thanks,
>
> Yoshi
>
>>
>> Green.
>>
>>
>> 2011/3/9 Yoshiaki Tamura:
>>>
>>> ya su wrote:
>>>>
>>>> 2011/3/8 Yoshiaki Tamura:
>>>>>
>>>>> ya su wrote:
>>>>>>
>>>>>> Yokshiaki:
>>>>>>
>>>>>> event-tap record block and io wirte events, and replay these on
>>>>>> the other side, so block_save_live is useless during the latter ft
>>>>>> phase, right? if so, I think it need to process the following code in
>>>>>> block_save_live function:
>>>>>
>>>>> Actually no. It just replays the last events only. We do have patches
>>>>> that
>>>>> enable block replication without using block live migration, like the
>>>>> way
>>>>> you described above.
In that case, we disable block live migration >>>>> when >>>>> we >>>>> go into ft mode. We're thinking to propose it after this series get >>>>> settled. >>>> >>>> so event-tap's objective is to initial a ft transaction, to start the >>>> sync. of ram/block/device states? if so, it need not change >>>> bdrv_aio_writev/bdrv_aio_flush normal process, on the other side it >>>> need not invokde bdrv_aio_writev either, right? >>> >>> Mostly yes, but because event-tap is queuing requests from block/net, it >>> needs to flush queued requests after the transaction on the primary side. >>> On the secondary, it currently doesn't have to invoke bdrv_aio_writev as >>> you mentioned. But will change soon to enable block replication with >>> event-tap. >>> >>>> >>>>> >>>>>> >>>>>> if (stage == 1) { >>>>>> init_blk_migration(mon, f); >>>>>> >>>>>> /* start track dirty blocks */ >>>>>> set_dirty_tracking(1); >>>>>> } >>>>>> -- >>>>>> the following code will send block to the other side, as this will >>>>>> also be done by event-tap replay. I think it should placed in stage 3, >>>>>> before the assert line. (this may affect some stage 2 rate-limit >>>>>> then, so this can be placed in stage 2, though it looks ugly), another >>>>>> choice is to avoid the invocation of block_save_live, right? >>>>>> --- >>>>>> flush_blks(f); >>>>>> >>>>>> if (qemu_file_has_error(f)) { >>>>>> blk_mig_cleanup(mon); >>>>>> return 0; >>>>>> } >>>>>> >>>>>> blk_mig_reset_dirty_cursor(); >>>>>> >>>>>> if (stage == 2) { >>>>>> >>>>>> >>>>>> another question is: since you event-tap io write(I think IO READ >>>>>> should also be event-tapped, as read may cause io chip state to >&g
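Not from the patch series: one way to get the rollback behavior asked about at the top of this message is for the receiver to buffer a transaction's RAM updates and apply them only on commit, so an interrupted transaction leaves the last committed state intact. A minimal Python model (all names hypothetical):

```python
class RamReceiver:
    """Buffer per-transaction page updates; apply them atomically on commit."""

    def __init__(self):
        self.ram = {}      # committed state: page address -> bytes
        self.pending = {}  # updates belonging to the in-flight transaction

    def recv_page(self, addr, data):
        self.pending[addr] = data

    def commit(self):
        self.ram.update(self.pending)
        self.pending = {}

    def abort(self):
        # Connection died mid-transaction ("recv header failed"): drop the
        # partial updates so the guest image stays consistent with the
        # last committed epoch.
        self.pending = {}

rx = RamReceiver()
rx.recv_page(0x1000, b"old")
rx.commit()                    # epoch 1 fully received
rx.recv_page(0x1000, b"new")
rx.abort()                     # epoch 2 interrupted: roll back
print(rx.ram[0x1000])
```

The cost of this scheme is holding the transaction's pages twice until commit; a copy-on-write snapshot of the touched pages is the usual space-saving variant.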
Re: [PATCH 07/18] Introduce fault tolerant VM transaction QEMUFile and ft_mode.
Juan:

It's especially important for FT to run as a standalone thread, as network problems may otherwise block the monitor. What's your schedule? Maybe I can help some.

Yoshi: in the following code:

+
+    s->file = qemu_fopen_ops(s, ft_trans_put_buffer, ft_trans_get_buffer,
+                             ft_trans_close, ft_trans_rate_limit,
+                             ft_trans_set_rate_limit, NULL);
+
+    return s->file;
+}

I think you should register an ft_trans_get_rate_limit function; otherwise it will not transfer any block data at stage 2 in the block_save_live function:

    if (stage == 2) {
        /* control the rate of transfer */
        while ((block_mig_state.submitted +
                block_mig_state.read_done) * BLOCK_SIZE <
               qemu_file_get_rate_limit(f)) {

qemu_file_get_rate_limit will return 0, and then it will not proceed to copy dirty block data. FYI.

Green.

2011/2/24 Yoshiaki Tamura : > 2011/2/24 Juan Quintela : >> >> [ trimming cc to kvm & qemu lists] >> >> Yoshiaki Tamura wrote: >>> Juan Quintela wrote: Yoshiaki Tamura wrote: > This code implements VM transaction protocol. Like buffered_file, it > sits between savevm and migration layer. With this architecture, VM > transaction protocol is implemented mostly independent from other > existing code. Could you explain what is the difference with buffered_file.c? I am fixing problems on buffered_file, and having something that copies lot of code from there makes me nervous. >>> >>> The objective is different: >>> >>> buffered_file buffers data for transmission control. >>> ft_trans_file adds headers to the stream, and controls the transaction >>> between sender and receiver. >>> >>> Although ft_trans_file sometimes buffers date, but it's not the main >>> objective. >>> If you're fixing the problems on buffered_file, I'll keep eyes on them. >>> > +typedef ssize_t (FtTransPutBufferFunc)(void *opaque, const void *data, > size_t size); Can we get some sharing here?
typedef ssize_t (BufferedPutFunc)(void *opaque, const void *data, size_t size); There are not so much types for a write function that the 1st element is one opaque :p >>> >>> You're right, but I want to keep ft_trans_file independent of >>> buffered_file at this point. Once Kemari gets merged, I'm happy to >>> work with you to fix the problems on buffered_file and ft_trans_file, >>> and refactoring them. >> >> My goal is getting its own thread for migration on 0.15, that >> basically means that we can do rm buffered_file.c. I guess that >> something similar could happen for kemari. > > That means both gets initiated by it's own thread, not like > current poll based. I'm still skeptical whether Anthony agrees, > but I'll keep it in my mind. > >> But for now, this is just the start + handwaving, once I start doing the >> work I will told you. > > Yes, please. > > Yoshi > >> >> Later, Juan. >> -- >> To unsubscribe from this list: send the line "unsubscribe kvm" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
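To illustrate the rate-limit point raised in this thread (a Python model of the logic, not QEMU code): if the QEMUFile has no get_rate_limit callback and the generic accessor therefore returns 0, the stage-2 loop in block_save_live never admits a single block.

```python
BLOCK_SIZE = 1 << 20  # 1 MiB, the chunk size used by qemu's block migration

class QEMUFileModel:
    """Tiny stand-in for QEMUFile's rate-limit plumbing."""
    def __init__(self, get_rate_limit=None):
        self.get_rate_limit = get_rate_limit

    def rate_limit(self):
        # Mirrors qemu_file_get_rate_limit: 0 when no callback is registered.
        return self.get_rate_limit() if self.get_rate_limit else 0

def blocks_transferable(f, submitted=0, read_done=0):
    """Model of block_save_live's stage-2 loop: how many blocks go out."""
    sent = 0
    while (submitted + read_done + sent) * BLOCK_SIZE < f.rate_limit():
        sent += 1  # stand-in for submitting one more block read
    return sent

print(blocks_transferable(QEMUFileModel()))                        # no callback
print(blocks_transferable(QEMUFileModel(lambda: 8 * BLOCK_SIZE)))  # callback set
```

With the getter left NULL the budget is always 0 and the loop body never runs, which is exactly the "no block data at stage 2" symptom described above.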
Re: [PATCH 09/18] Introduce event-tap.
Yoshi:

I think event-tap is a great idea; it removes the reading from disk, which will improve FT efficiency much, as you plan in later series.

One question: I/O reads/writes may dirty RAM, but it is difficult to distinguish those pages from other dirty pages, such as ones caused by running software. Does that mean you need to change all the emulated device implementations? Actually I think ram_save_live will not send too many RAM pages caused by I/O reads/writes, but if event-tap can record I/O reads/writes and replay them on the other side, does that mean we don't need to call qemu_savevm_state_full in FT transactions?

Green.

2011/3/9 Yoshiaki Tamura :
> ya su wrote:
>>
>> 2011/3/8 Yoshiaki Tamura:
>>>
>>> ya su wrote:
>>>>
>>>> Yokshiaki:
>>>>
>>>> event-tap record block and io wirte events, and replay these on
>>>> the other side, so block_save_live is useless during the latter ft
>>>> phase, right? if so, I think it need to process the following code in
>>>> block_save_live function:
>>>
>>> Actually no. It just replays the last events only. We do have patches
>>> that
>>> enable block replication without using block live migration, like the way
>>> you described above. In that case, we disable block live migration when
>>> we
>>> go into ft mode. We're thinking to propose it after this series get
>>> settled.
>>
>> so event-tap's objective is to initial a ft transaction, to start the
>> sync. of ram/block/device states? if so, it need not change
>> bdrv_aio_writev/bdrv_aio_flush normal process, on the other side it
>> need not invokde bdrv_aio_writev either, right?
>
> Mostly yes, but because event-tap is queuing requests from block/net, it
> needs to flush queued requests after the transaction on the primary side.
> On the secondary, it currently doesn't have to invoke bdrv_aio_writev as
> you mentioned. But will change soon to enable block replication with
> event-tap.
> >> >>> >>>> >>>> if (stage == 1) { >>>> init_blk_migration(mon, f); >>>> >>>> /* start track dirty blocks */ >>>> set_dirty_tracking(1); >>>> } >>>> -- >>>> the following code will send block to the other side, as this will >>>> also be done by event-tap replay. I think it should placed in stage 3, >>>> before the assert line. (this may affect some stage 2 rate-limit >>>> then, so this can be placed in stage 2, though it looks ugly), another >>>> choice is to avoid the invocation of block_save_live, right? >>>> --- >>>> flush_blks(f); >>>> >>>> if (qemu_file_has_error(f)) { >>>> blk_mig_cleanup(mon); >>>> return 0; >>>> } >>>> >>>> blk_mig_reset_dirty_cursor(); >>>> >>>> if (stage == 2) { >>>> >>>> >>>> another question is: since you event-tap io write(I think IO READ >>>> should also be event-tapped, as read may cause io chip state to >>>> change), you then need not invoke qemu_savevm_state_full in >>>> qemu_savevm_trans_complete, right? thanks. >>> >>> It's not necessary to tap IO READ, but you can if you like. We also have >>> experimental patches for this to reduce rams to be transfered. But I >>> don't >>> understand why we don't have to invoke qemu_savevm_state_full although I >>> think we may reduce number of rams by replaying IO READ on the secondary. >>> >> >> I first think the objective of io-Write event-tap is to reproduce the >> same device state on the other side, though I doubt this, so I think >> IO-Read also should be recorded and replayed. since event-tap is only >> to initial a ft transaction, the sync. of states still depend on >> qemu_save_vm_live/full, I understand the design now, thanks. >> >> but I don't understand why io-write event-tap can reduce transfered >> rams as you mentioned, the amount of rams only depend on dirty pages, >> IO write don't change the normal process unlike block write, right? 
> > The point is, if we can assure that IO read retrieves the same data on both > sides, instead of dirtying the ram by read, meaning we have to transfer in > the transaction, just replay the operation and get the same data on the > otherside. Anyway
Re: [PATCH 09/18] Introduce event-tap.
2011/3/8 Yoshiaki Tamura :
> ya su wrote:
>>
>> Yokshiaki:
>>
>> event-tap record block and io wirte events, and replay these on
>> the other side, so block_save_live is useless during the latter ft
>> phase, right? if so, I think it need to process the following code in
>> block_save_live function:
>
> Actually no. It just replays the last events only. We do have patches that
> enable block replication without using block live migration, like the way
> you described above. In that case, we disable block live migration when we
> go into ft mode. We're thinking to propose it after this series get
> settled.

So event-tap's objective is to initiate an FT transaction, to start the synchronization of RAM/block/device states? If so, it need not change the normal bdrv_aio_writev/bdrv_aio_flush process, and on the other side it need not invoke bdrv_aio_writev either, right?

>
>>
>> if (stage == 1) {
>> init_blk_migration(mon, f);
>>
>> /* start track dirty blocks */
>> set_dirty_tracking(1);
>> }
>> --
>> the following code will send block to the other side, as this will
>> also be done by event-tap replay. I think it should placed in stage 3,
>> before the assert line. (this may affect some stage 2 rate-limit
>> then, so this can be placed in stage 2, though it looks ugly), another
>> choice is to avoid the invocation of block_save_live, right?
>> ---
>> flush_blks(f);
>>
>> if (qemu_file_has_error(f)) {
>> blk_mig_cleanup(mon);
>> return 0;
>> }
>>
>> blk_mig_reset_dirty_cursor();
>>
>> if (stage == 2) {
>>
>>
>> another question is: since you event-tap io write(I think IO READ
>> should also be event-tapped, as read may cause io chip state to
>> change), you then need not invoke qemu_savevm_state_full in
>> qemu_savevm_trans_complete, right? thanks.
>
> It's not necessary to tap IO READ, but you can if you like. We also have
> experimental patches for this to reduce rams to be transfered.
But I don't > understand why we don't have to invoke qemu_savevm_state_full, although I > think we may reduce the number of RAM pages by replaying IO READ on the secondary. > I first thought the objective of the IO-write event-tap was to reproduce the same device state on the other side, though I doubted this, so I thought IO-read should also be recorded and replayed. since event-tap only initiates an FT transaction, and the sync. of states still depends on qemu_save_vm_live/full, I understand the design now, thanks. but I don't understand why the io-write event-tap can reduce transferred RAM as you mentioned; the amount of RAM only depends on dirty pages, and an IO write doesn't change the normal process, unlike a block write, right? > Thanks, > > Yoshi > >> >> >> Green. >> >> >> >> 2011/2/24 Yoshiaki Tamura: >>> >>> event-tap controls when to start an FT transaction, and provides proxy >>> functions to be called from net/block devices. During an FT transaction, it >>> queues up net/block requests, and flushes them when the transaction gets >>> completed. 
>>> >>> Signed-off-by: Yoshiaki Tamura >>> Signed-off-by: OHMURA Kei >>> --- >>> Makefile.target | 1 + >>> event-tap.c | 940 >>> +++ >>> event-tap.h | 44 +++ >>> qemu-tool.c | 28 ++ >>> trace-events | 10 + >>> 5 files changed, 1023 insertions(+), 0 deletions(-) >>> create mode 100644 event-tap.c >>> create mode 100644 event-tap.h >>> >>> diff --git a/Makefile.target b/Makefile.target >>> index 220589e..da57efe 100644 >>> --- a/Makefile.target >>> +++ b/Makefile.target >>> @@ -199,6 +199,7 @@ obj-y += rwhandler.o >>> obj-$(CONFIG_KVM) += kvm.o kvm-all.o >>> obj-$(CONFIG_NO_KVM) += kvm-stub.o >>> LIBS+=-lz >>> +obj-y += event-tap.o >>> >>> QEMU_CFLAGS += $(VNC_TLS_CFLAGS) >>> QEMU_CFLAGS += $(VNC_SASL_CFLAGS) >>> diff --git a/event-tap.c b/event-tap.c >>> new file mode 100644 >>> index 000..95c147a >>> --- /dev/null >>> +++ b/event-tap.c >>> @@ -0,0 +1,940 @@ >>> +/* >>> + * Event Tap functions for QEMU >>> + * >>> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation. >>> + * >>> + * This work is lic
Re: [PATCH 09/18] Introduce event-tap.
Yoshiaki: event-tap records block and io write events, and replays these on the other side, so block_save_live is useless during the latter ft phase, right? if so, I think it needs to handle the following code in the block_save_live function: if (stage == 1) { init_blk_migration(mon, f); /* start track dirty blocks */ set_dirty_tracking(1); } -- the following code will send blocks to the other side, as this will also be done by event-tap replay. I think it should be placed in stage 3, before the assert line. (this may affect some stage 2 rate-limit then, so this can be placed in stage 2, though it looks ugly), another choice is to avoid the invocation of block_save_live, right? --- flush_blks(f); if (qemu_file_has_error(f)) { blk_mig_cleanup(mon); return 0; } blk_mig_reset_dirty_cursor(); if (stage == 2) { another question is: since you event-tap io write (I think IO READ should also be event-tapped, as a read may cause io chip state to change), you then need not invoke qemu_savevm_state_full in qemu_savevm_trans_complete, right? thanks. Green. 2011/2/24 Yoshiaki Tamura : > event-tap controls when to start an FT transaction, and provides proxy > functions to be called from net/block devices. During an FT transaction, it > queues up net/block requests, and flushes them when the transaction gets > completed. 
> > Signed-off-by: Yoshiaki Tamura > Signed-off-by: OHMURA Kei > --- > Makefile.target | 1 + > event-tap.c | 940 > +++ > event-tap.h | 44 +++ > qemu-tool.c | 28 ++ > trace-events | 10 + > 5 files changed, 1023 insertions(+), 0 deletions(-) > create mode 100644 event-tap.c > create mode 100644 event-tap.h > > diff --git a/Makefile.target b/Makefile.target > index 220589e..da57efe 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -199,6 +199,7 @@ obj-y += rwhandler.o > obj-$(CONFIG_KVM) += kvm.o kvm-all.o > obj-$(CONFIG_NO_KVM) += kvm-stub.o > LIBS+=-lz > +obj-y += event-tap.o > > QEMU_CFLAGS += $(VNC_TLS_CFLAGS) > QEMU_CFLAGS += $(VNC_SASL_CFLAGS) > diff --git a/event-tap.c b/event-tap.c > new file mode 100644 > index 000..95c147a > --- /dev/null > +++ b/event-tap.c > @@ -0,0 +1,940 @@ > +/* > + * Event Tap functions for QEMU > + * > + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation. > + * > + * This work is licensed under the terms of the GNU GPL, version 2. See > + * the COPYING file in the top-level directory. 
> + */ > + > +#include "qemu-common.h" > +#include "qemu-error.h" > +#include "block.h" > +#include "block_int.h" > +#include "ioport.h" > +#include "osdep.h" > +#include "sysemu.h" > +#include "hw/hw.h" > +#include "net.h" > +#include "event-tap.h" > +#include "trace.h" > + > +enum EVENT_TAP_STATE { > + EVENT_TAP_OFF, > + EVENT_TAP_ON, > + EVENT_TAP_SUSPEND, > + EVENT_TAP_FLUSH, > + EVENT_TAP_LOAD, > + EVENT_TAP_REPLAY, > +}; > + > +static enum EVENT_TAP_STATE event_tap_state = EVENT_TAP_OFF; > + > +typedef struct EventTapIOport { > + uint32_t address; > + uint32_t data; > + int index; > +} EventTapIOport; > + > +#define MMIO_BUF_SIZE 8 > + > +typedef struct EventTapMMIO { > + uint64_t address; > + uint8_t buf[MMIO_BUF_SIZE]; > + int len; > +} EventTapMMIO; > + > +typedef struct EventTapNetReq { > + char *device_name; > + int iovcnt; > + int vlan_id; > + bool vlan_needed; > + bool async; > + struct iovec *iov; > + NetPacketSent *sent_cb; > +} EventTapNetReq; > + > +#define MAX_BLOCK_REQUEST 32 > + > +typedef struct EventTapAIOCB EventTapAIOCB; > + > +typedef struct EventTapBlkReq { > + char *device_name; > + int num_reqs; > + int num_cbs; > + bool is_flush; > + BlockRequest reqs[MAX_BLOCK_REQUEST]; > + EventTapAIOCB *acb[MAX_BLOCK_REQUEST]; > +} EventTapBlkReq; > + > +#define EVENT_TAP_IOPORT (1 << 0) > +#define EVENT_TAP_MMIO (1 << 1) > +#define EVENT_TAP_NET (1 << 2) > +#define EVENT_TAP_BLK (1 << 3) > + > +#define EVENT_TAP_TYPE_MASK (EVENT_TAP_NET - 1) > + > +typedef struct EventTapLog { > + int mode; > + union { > + EventTapIOport ioport; > + EventTapMMIO mmio; > + }; > + union { > + EventTapNetReq net_req; > + EventTapBlkReq blk_req; > + }; > + QTAILQ_ENTRY(EventTapLog) node; > +} EventTapLog; > + > +struct EventTapAIOCB { > + BlockDriverAIOCB common; > + BlockDriverAIOCB *acb; > + bool is_canceled; > +}; > + > +static EventTapLog *last_event_tap; > + > +static QTAILQ_HEAD(, EventTapLog) event_list; > +static QTAILQ_HEAD(, EventTapLog) event_pool; > + > 
+static int (*event_tap_cb)(void); > +static QEMUBH *event_tap_bh; > +static VMChangeStateEntry *vmstate; > + > +static void event_tap_bh_cb(void *p) > +{ > + if (event_tap_cb) { > + eve
Re: problem about blocked monitor when disk image on NFS can not be reached.
hi,all: io_thread bt as the following: #0 0x7f3086eaa034 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x7f3086ea5345 in _L_lock_870 () from /lib64/libpthread.so.0 #2 0x7f3086ea5217 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00436018 in kvm_mutex_lock () at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1730 #4 qemu_mutex_lock_iothread () at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1744 #5 0x0041ca67 in main_loop_wait (nonblocking=) at /root/rpmbuild/BUILD/qemu-kvm-0.14/vl.c:1377 #6 0x004363e7 in kvm_main_loop () at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1589 #7 0x0041dc3a in main_loop (argc=, argv=, envp=) at /root/rpmbuild/BUILD/qemu-kvm-0.14/vl.c:1429 #8 main (argc=, argv=, envp=) at /root/rpmbuild/BUILD/qemu-kvm-0.14/vl.c:3201 cpu thread as the following: #0 0x7f3084dff093 in select () from /lib64/libc.so.6 #1 0x004453ea in qemu_aio_wait () at aio.c:193 #2 0x00444175 in bdrv_write_em (bs=0x1ec3090, sector_num=2009871, buf=0x7f3087532800 "F\b\200u\022\366F$\004u\fPV\350\226\367\377\377\003Ft\353\fPV\350\212\367\377\377\353\003\213Ft^]\302\b", nb_sectors=16) at block.c:2577 #3 0x0059ca13 in ide_sector_write (s=0x215f508) at /root/rpmbuild/BUILD/qemu-kvm-0.14/hw/ide/core.c:574 #4 0x00438ced in kvm_handle_io (env=0x202ef60) at /root/rpmbuild/BUILD/qemu-kvm-0.14/kvm-all.c:821 #5 kvm_run (env=0x202ef60) at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:617 #6 0x00438e09 in kvm_cpu_exec (env=) at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1233 #7 0x0043a0f7 in kvm_main_loop_cpu (_env=0x202ef60) at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1419 #8 ap_main_loop (_env=0x202ef60) at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1466 #9 0x7f3086ea37e1 in start_thread () from /lib64/libpthread.so.0 #10 0x7f3084e0653d in clone () from /lib64/libc.so.6 aio_thread bt as the following: #0 0x7f3086eaae83 in pwrite64 () from /lib64/libpthread.so.0 #1 0x00447501 in handle_aiocb_rw_linear (aiocb=0x21cff10, buf=0x7f3087532800 
"F\b\200u\022\366F$\004u\fPV\350\226\367\377\377\003Ft\353\fPV\350\212\367\377\377\353\003\213Ft^]\302\b") at posix-aio-compat.c:212 #2 0x00447d48 in handle_aiocb_rw (unused=) at posix-aio-compat.c:247 #3 aio_thread (unused=) at posix-aio-compat.c:341 #4 0x7f3086ea37e1 in start_thread () from /lib64/libpthread.so.0 #5 0x7f3084e0653d in clone () from /lib64/libc.so.6 I think io_thread is blocked by the cpu thread, which takes the qemu_mutex first; the cpu thread is waiting for the aio_thread's result in the qemu_aio_wait function, and the aio_thread spends a long time in pwrite64, about 5-10s, before returning an error (it seems like a non-blocking timeout call). after that, the io thread has a chance to receive monitor input, so the monitor seems to block frequently. in this situation, if I stop the vm, the monitor responds faster. the problem is caused by the unavailability of the block layer; the block layer processes the io error in the normal way, reporting the error to the ide device, where it is handled in ide_sector_write. the root cause is: the monitor's input and the io operation (the pwrite function) must execute serialized (under the qemu_mutex semaphore), so a long pwrite block will hinder monitor input. as Stefan says, it seems difficult to take monitor input out of the protection, so currently I will stop the vm if the disk image can not be reached. 2011/3/1 Avi Kivity : > On 03/01/2011 05:01 PM, Stefan Hajnoczi wrote: >> >> On Tue, Mar 1, 2011 at 12:39 PM, ya su wrote: >> > how about moving kvm_handle_io/handle_mmio in the kvm_run function >> > into kvm_main_loop, as these operations belong to io? this >> > will remove the qemu_mutex contention between the 2 threads. is this a >> > reasonable thought? >> > >> > In order to keep the monitor responding to the user quicker in >> > this situation, an easier way is to take monitor io out of the qemu_mutex >> > protection. 
this includes the vnc/serial/telnet io related to the monitor, >> > as this io will not affect the running of the vm itself, it need not be under >> > such strict protection. >> >> The qemu_mutex protects all QEMU global state. The monitor does some >> I/O and parsing which is not necessarily global state but once it >> begins actually performing the command you sent, access to global >> state will be required (pretty much any monitor command will operate >> on global state). >> >> I think there are two options for handling NFS hangs: >> 1. Ensure that QEMU is never put to sleep by NFS for disk images. The >> guest continues executing, may time out and notice that storage is >> unavailable. > > That's the NFS soft mount option. > >> 2.
Re: problem about blocked monitor when disk image on NFS can not be reached.
first, sorry for the same mail being sent more than once; I didn't know it would take so long to come through. hi, stefan: thanks for your explanation. how about moving kvm_handle_io/handle_mmio in the kvm_run function into kvm_main_loop, as these operations belong to io? this will remove the qemu_mutex contention between the 2 threads. is this a reasonable thought? In order to keep the monitor responding to the user quicker in this situation, an easier way is to take monitor io out of the qemu_mutex protection. this includes the vnc/serial/telnet io related to the monitor; as this io will not affect the running of the vm itself, it need not be under such strict protection. Any suggestions? thanks. Green. 2011/3/1 Stefan Hajnoczi : > On Tue, Mar 1, 2011 at 5:01 AM, ya su wrote: >> kvm starts with a disk image on an nfs server; when the nfs server can not be >> reached, the monitor will be blocked. I changed io_thread to the SCHED_RR >> policy; it works haltingly, waiting for the disk read/write timeout. > > There are some synchronous disk image reads that can put qemu-kvm to > sleep until NFS responds or errors. For example, when starting > hw/virtio-blk.c calls bdrv_guess_geometry() which may invoke > bdrv_read(). > > Once the VM is running and you're using virtio-blk then disk I/O > should be asynchronous. There are some synchronous cases to do with > migration, snapshotting, etc where we wait for outstanding aio > requests. Again this can block qemu-kvm. > > So in short, there's no easy way to avoid blocking the VM in all cases > today. You should find, however, that normal read/write operation to > a running VM does not cause qemu-kvm to sleep. > > Stefan > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
problem about blocked monitor when disk image on NFS can not be reached.
hi all: kvm starts with a disk image on an nfs server; when the nfs server can not be reached, the monitor will be blocked. I changed io_thread to the SCHED_RR policy; it works haltingly, waiting for the disk read/write timeout. I have tested a standalone thread to process kvm_handle_io; it can not start up correctly, as it may need qemu_mutex protection. as io_thread processes different io tasks, is it possible to move the kvm_handle_io and handle_mmio functions into this thread? but the problem will still remain: the monitor will still be blocked by disk read/write requests. does anyone have a good suggestion? thanks. Green.
Fwd: problem about blocked monitor when disk image on NFS can not be reached.
I have tested a standalone thread to process kvm_handle_io; it can not start up correctly, as this function may need qemu_mutex protection. as io_thread processes different io tasks, is it possible to move the kvm_handle_io and handle_mmio functions into this thread? but the problem will still remain: the monitor will still be blocked by disk read/write requests. does anyone have a good suggestion? thanks. Green. -- Forwarded message -- From: ya su Date: 2011/2/28 Subject: problem about blocked monitor when disk image on NFS can not be reached. To: kvm@vger.kernel.org hi: kvm starts with a disk image on an nfs server; when the nfs server can not be reached, the monitor will be blocked. I changed io_thread to the SCHED_RR policy; it works haltingly, waiting for the disk read/write timeout. I think one solution is to run kvm_handle_io in a separate thread: I will put kvm_handle_io in a newly spawned thread, with all io requests passed in a queue between io_thread and the new thread; it needs to copy run->io.size*run->io.count bytes from address (uint8_t *)run + run->io.data_offset. Is this the right direction? any suggestion is welcome, thanks! Green.
problem about blocked monitor when disk image on NFS can not be reached.
hi: kvm starts with a disk image on an nfs server; when the nfs server can not be reached, the monitor will be blocked. I changed io_thread to the SCHED_RR policy; it works haltingly, waiting for the disk read/write timeout. I think one solution is to run kvm_handle_io in a separate thread: I will put kvm_handle_io in a newly spawned thread, with all io requests passed in a queue between io_thread and the new thread; it needs to copy run->io.size*run->io.count bytes from address (uint8_t *)run + run->io.data_offset. Is this the right direction? any suggestion is welcome, thanks! Green.