Re: win8 installation iso can not boot on rhel6.2 kernel?

2013-02-05 Thread ya su
Gleb:

Would you please tell me where to find the RHEL 6.4 kernel? The current
latest official release is 6.3.

And have you figured out the root cause of the problem?

thanks.


Regards.

Suya

2013/2/5 ya su :
> Gleb:
>
> Would you please tell me where to find the RHEL 6.4 kernel? The current
> latest official release is 6.3.
>
> And have you figured out the root cause of the problem?
>
> thanks.
>
> Regards.
>
> Suya
>
> 2013/2/5 Gleb Natapov :
>> On Tue, Feb 05, 2013 at 02:55:07PM +0800, ya su wrote:
>>> I use the following cmd on rhel6.2 kernel 2.6.32-220.17.1:
>>> x86_64-softmmu/qemu-system-x86_64 -hda win8.img -cdrom
>>> window_8_pro.iso -m 2048 -L pc-bios -cpu host, it will display the
>>> following error:
>>> Your PC needs to restart.
>>> Please hold down the power button.
>>> Error Code: 0x005D
>>> Parameters:
>>> 0x03100A00
>>> 0x68747541
>>> 0x69746E65
>>> 0x444D4163
>>>
>>> I also tried the newest rhel6 kernel version, 2.6.32-279.19.1; it
>>> produces the same result.
>>>
>>> If I try the standard 2.6.32 kernel, it boots normally.
>>>
>>> Any suggestions? thanks.
>>>
>> You need latest RHEL6.4 kernel.
>>
>> --
>> Gleb.


win8 installation iso can not boot on rhel6.2 kernel?

2013-02-04 Thread ya su
I use the following cmd on rhel6.2 kernel 2.6.32-220.17.1:
x86_64-softmmu/qemu-system-x86_64 -hda win8.img -cdrom
window_8_pro.iso -m 2048 -L pc-bios -cpu host, it will display the
following error:
Your PC needs to restart.
Please hold down the power button.
Error Code: 0x005D
Parameters:
0x03100A00
0x68747541
0x69746E65
0x444D4163

I also tried the newest rhel6 kernel version, 2.6.32-279.19.1; it
produces the same result.

If I try the standard 2.6.32 kernel, it boots normally.

Any suggestions? Thanks.

Suya.


Re: Has any work 3.3 kvm-kmod for rhel 6.2 kernel successfully?

2012-06-04 Thread ya su
Jan:

Sorry for the late response to your suggestion.

I have found the patch which produces this problem; it is commit
7850ac5420803996e2960d15b924021f28e0dffc.

I changed it as follows, and it works fine.

diff -ur -i kvm-kmod-3.4/x86/kvm_main.c kvm-kmod-3.4-fix/x86/kvm_main.c
--- kvm-kmod-3.4/x86/kvm_main.c 2012-05-21 23:43:02.0 +0800
+++ kvm-kmod-3.4-fix/x86/kvm_main.c 2012-06-05 12:19:37.780136969 +0800
@@ -1525,8 +1525,8 @@
if (memslot && memslot->dirty_bitmap) {
unsigned long rel_gfn = gfn - memslot->base_gfn;

-   if (!test_and_set_bit_le(rel_gfn, memslot->dirty_bitmap))
-   memslot->nr_dirty_pages++;
+   __set_bit_le(rel_gfn, memslot->dirty_bitmap);
+   memslot->nr_dirty_pages++;
}
 }


I think the root cause may be that clearing dirty_bitmap is not
synchronized with resetting nr_dirty_pages to 0.

But I don't understand why it works fine on newer kernels.
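
A minimal sketch of the race I mean (illustrative C only, not the
kvm-kmod source; the helpers are simplified and non-atomic):

#include <string.h>

#define NPAGES 1024
#define BITS_PER_LONG (8 * sizeof(unsigned long))
static unsigned long dirty_bitmap[NPAGES / BITS_PER_LONG];
static unsigned long nr_dirty_pages;

static int test_and_set(unsigned long nr, unsigned long *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_LONG);
    unsigned long *p = addr + nr / BITS_PER_LONG;
    int old = (*p & mask) != 0;
    *p |= mask;                     /* simplified: not atomic */
    return old;
}

/* vcpu path: count a page only on its 0->1 transition */
void mark_page_dirty(unsigned long rel_gfn)
{
    if (!test_and_set(rel_gfn, dirty_bitmap))
        nr_dirty_pages++;
}

/* harvest path: copy the log out, then clear bitmap and counter */
void get_dirty_log(unsigned long *dest)
{
    memcpy(dest, dirty_bitmap, sizeof(dirty_bitmap));
    memset(dirty_bitmap, 0, sizeof(dirty_bitmap));  /* (1) clear bits */
    /* If mark_page_dirty() runs between (1) and (2), its increment is
     * wiped out at (2): the fresh bitmap has a bit set while the
     * counter says zero, and the two never re-converge. */
    nr_dirty_pages = 0;                             /* (2) reset count */
}

My change above sidesteps this by setting the bit and bumping the
counter unconditionally, so the counter becomes a harmless upper bound.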

Regards.

Suya.


2012/4/16 Jan Kiszka :
> On 2012-04-16 16:34, ya su wrote:
>> I first noticed in the 3.3 release notes that it can compile against
>> 2.6.32-40, so I thought it could work with 2.6.32; then I tried it with
>> the rhel 2.6.32 kernel.
>
> The problem is that the RHEL 2.6.32 kernel has nothing to do with a
> standard 2.6.32 as too many features were ported back. So the version
> number based feature checks fail as you noticed.
>
> We could adapt kvm-kmod to detect that it is a RHEL kernel (there is
> surely some define), but it requires going through all the relevant
> features carefully.
>
>>
>> I just re-applied my changes to the original kvm-kmod 3.3 against the
>> rhel 2.6.32 kernel, fixing only the compile-time redefinition errors,
>> but the problem remains the same. The patch is attached.
>>
>> I didn't go through the git commits, as there are so many kernel
>> changes from 2.6.32 to 3.3.
>>
>> I think the problem may come from the memory change notification.
>
> The approach to resolve this could be to identify backported features
> based on the build breakage or runtime anomalies, then analyze the
> kvm-kmod history for changes that wrapped those features, and finally
> adjust all affected code blocks. I'm open for patches and willing to
> support you on questions, but I can't work on this myself.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux


Re: Has any work 3.3 kvm-kmod for rhel 6.2 kernel successfully?

2012-04-16 Thread ya su
I first noticed in the 3.3 release notes that it can compile against
2.6.32-40, so I thought it could work with 2.6.32; then I tried it with
the rhel 2.6.32 kernel.

I just re-applied my changes to the original kvm-kmod 3.3 against the
rhel 2.6.32 kernel, fixing only the compile-time redefinition errors,
but the problem remains the same. The patch is attached.

I didn't go through the git commits, as there are so many kernel
changes from 2.6.32 to 3.3.

I think the problem may come from the memory change notification.

2012/4/16, Jan Kiszka :
> On 2012-04-16 14:23, ya su wrote:
>> kvm-kmod 3.3 patch attached.
>>
>> I also change kernel to export __get_user_pages_fast.
>
> Ugh, that's huge. How did you select which feature to enable? Based on
> compile tests? The risk would then be to miss some bits that are
> additionally required. Or did you step through the git commits to
> identify corresponding parts?
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
>


kvm-kmod-min.patch
Description: Binary data


Re: Has any work 3.3 kvm-kmod for rhel 6.2 kernel successfully?

2012-04-16 Thread ya su
kvm-kmod 3.3 patch attached.

I also change kernel to export __get_user_pages_fast.

Regards.

Suya.


2012/4/16, Jan Kiszka :
> On 2012-04-16 12:12, ya su wrote:
>> Hi,all:
>>
>> I tried to compile kvm-kmod 3.3 against Red Hat's 2.6.32-220.7.1. After
>> changing some macros in external-module-compat-comm.h,
>> external-module-compat.h, and some C files, I could finally compile and
>> run qemu-kvm (0.12 from the rhel release) with the 3.3 module. Everything
>> looks fine except that the screen does not refresh correctly; it looks
>> like the display-card memory does not get updated in time when it
>> changes.
>>
>>    Can anyone give me some clues? Thanks.
>
> Maybe you can share your adaptations to make it easier to assess to what
> degree that kernel is different from a real 2.6.32 kernel.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
>


kvm-kmod.patch
Description: Binary data


Has any work 3.3 kvm-kmod for rhel 6.2 kernel successfully?

2012-04-16 Thread ya su
Hi,all:

I tried to compile kvm-kmod 3.3 against Red Hat's 2.6.32-220.7.1. After
changing some macros in external-module-compat-comm.h,
external-module-compat.h, and some C files, I could finally compile and
run qemu-kvm (0.12 from the rhel release) with the 3.3 module. Everything
looks fine except that the screen does not refresh correctly; it looks
like the display-card memory does not get updated in time when it
changes.

   Can anyone give me some clues? Thanks.

Regards.

Suya.


Re: about NPIV with qemu-kvm.

2011-10-27 Thread ya su
hi, hannes

  I really appreciate your clearing up my confusion.

  As for getting a vm's storage io performance near the hardware's, it
seems the only way is something like sr-iov support in the hba card;
NPIV cannot achieve this goal.

  I remember that LSI released some kind of SAS controller (IR 2008?)
which supports sr-iov, but there is no document describing the
configuration steps. I wonder if you have any clues to help? Thanks.

Regards.

Suya.

2011/10/26, Hannes Reinecke :
> On 10/26/2011 06:40 AM, ya su wrote:
>> hi, hannes:
>>
>> I want to use NPIV with qemu-kvm, so I issued the following command:
>>
>> echo ':' >
>> /sys/class/fc_host/host0/vport_create
>>
>> and it produces a new host6 and one vport successfully, but it does
>> not create any virtual hba pci device, so I don't know how to assign
>> the virtual host to qemu-kvm.
>>
> Well, you can't. There is no mechanism for that. When using NPIV you need
> to pass in the individual LUNs via e.g. virtio-blk.
>
>> From your mail, does the array first need to assign a lun to this
>> vport? And then, through the newly created disk (a device like
>> /dev/sdf), I add it to qemu-kvm with -drive file=/dev/sdf,if=virtio...
>> arguments?
>>
> Yes. That's what you need to do.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke  zSeries & Storage
> h...@suse.de  +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)
>


about NPIV with qemu-kvm.

2011-10-25 Thread ya su
hi, hannes:

I want to use NPIV with qemu-kvm, so I issued the following command:

echo ':' >
/sys/class/fc_host/host0/vport_create

and it produces a new host6 and one vport successfully, but it does not
create any virtual hba pci device, so I don't know how to assign the
virtual host to qemu-kvm.

From your mail, does the array first need to assign a lun to this
vport? And then, through the newly created disk (a device like
/dev/sdf), I add it to qemu-kvm with -drive file=/dev/sdf,if=virtio...
arguments?
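
For concreteness, the flow I have in mind would be roughly the
following (the WWPN:WWNN value and /dev/sdf are placeholders, not real
values):

# create the vport; the argument is 'wwpn:wwnn' in hex
echo '2001001b32a90001:2000001b32a90001' > /sys/class/fc_host/host0/vport_create
# after the array maps a LUN to the new vport and the host rescans,
# a new disk such as /dev/sdf appears; hand that to the guest:
qemu-system-x86_64 -drive file=/dev/sdf,if=virtio,cache=none ...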


Regards.

Suya.

2011/6/29, Hannes Reinecke :
> On 06/29/2011 12:07 PM, Christoph Hellwig wrote:
>> On Wed, Jun 29, 2011 at 10:39:42AM +0100, Stefan Hajnoczi wrote:
>>> I think we're missing a level of addressing.  We need the ability to
>>> talk to multiple target ports in order for "list target ports" to make
>>> sense.  Right now there is one implicit target that handles all
>>> commands.  That means there is one fixed I_T Nexus.
>>>
>>> If we introduce "list target ports" we also need a way to say "This
>>> CDB is destined for target port #0".  Then it is possible to enumerate
>>> target ports and address targets independently of the LUN field in the
>>> CDB.
>>>
>>> I'm pretty sure this is also how SAS and other transports work.  In
>>> their framing they include the target port.
>>
>> Yes, exactly.  Hierarchical LUNs are a nasty fringe feature that we should
>> avoid as much as possible, that is for everything but IBM vSCSI which is
>> braindead enough to force them.
>>
> Yep.
>
>>> The question is whether we really need to support multiple targets on
>>> a virtio-scsi adapter or not.  If you are selectively mapping LUNs
>>> that the guest may access, then multiple targets are not necessary.
>>> If we want to do pass-through of the entire SCSI bus then we need
>>> multiple targets but I'm not sure if there are other challenges like
>>> dependencies on the transport (Fibre Channel, SAS, etc) which make it
>>> impossible to pass through bus-level access?
>>
>> I don't think bus-level pass through is either easily possible nor
>> desirable.  What multiple targets are useful for is allowing more
>> virtual disks than we have virtual PCI slots.  We could do this by
>> supporting multiple LUNs, but given that many SCSI resources are
>> target-based, doing multiple targets most likely is the more scalable
>> and more logical variant.  E.g. we could much more easily have one
>> virtqueue per target than per LUN.
>>
> The general idea here is that we can support NPIV.
> With NPIV we'll have several scsi_hosts, each of which is assigned a
> different set of LUNs by the array.
> With virtio we need to be able to react to LUN remapping on the array
> side, ie we need to be able to issue a 'REPORT LUNS' command and
> add/remove LUNs on the fly. This means we have to expose the
> scsi_host in some way via virtio.
>
> This is impossible with a one-to-one mapping between targets and
> LUNs. The actual bus-level pass-through will be just on the SCSI
> layer, ie 'REPORT LUNS' should be possible. If and how we do a LUN
> remapping internally on the host is a totally different matter.
> Same goes for the transport details; I doubt we will expose all the
> dingy details of the various transports, but rather restrict
> ourselves to an abstract transport.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke zSeries & Storage
> h...@suse.de+49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Greatly different disk-write performance of winxp vm compared with linux vm, and with different storage controllers.

2011-09-18 Thread ya su
hi,all

 I run a winxp vm on a Dell R710 server with an LSI SAS 1068E
controller connected to one FUJITSU MBA3147RC disk (SAS, 15k rpm, no
raid setting); virtio-blk and aio=native are used. If I copy a big file
in the VM, iotop shows about 3-5MB/s.

 I run dd on the host, and it can reach about 90MB/s (dd if=/dev/zero
of=1.test bs=1M count=1024 oflag=direct);

if I run dd in a FC14 vm with the same dd command, it can reach about 70MB/s.

I then attached a SATA disk (ST31000528AS) on an IDE/SATA controller
(Intel 82801IB); the winxp vm can reach about 30MB/s while copying a
file, as viewed from iotop.

The problem is: why does the winxp vm achieve such slow disk-write
performance on the faster storage device? It seems the reasons come
from winxp's model of copying a file, and from the different storage
devices.

Is there any tool or way to dig deeper into the problem, so that the
winxp vm can copy files normally?

BTW: I have tried qemu-kvm 0.15 and kvm-kmod-3.0b; the result remains
the same.

thanks.

Suya.


Re: [PATCH] KVM: APIC: avoid instruction emulation for EOI writes

2011-09-10 Thread ya su
hi,Kevin:

I applied the patch on 2.6.39-rc7+ and ran a winxp vm; there are many
ICR write emulations. The trace-cmd output is as follows:

600281: kvm_entry: vcpu 0
600284: kvm_exit: reason APIC_ACCESS rip 0x806e7b85 info 3
600285: kvm_emulate_insn: 0:806e7b85: f7 05 00 03 fe ff 00 10 00 0
600286: kvm_apic: apic_read APIC_ICR = 0x40041
600286: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
600287: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
600287: kvm_apic: apic_write APIC_ICR = 0x40041
600287: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
600288: kvm_apic_accept_irq:  apicid 0 vec 65 (Fixed|edge)

600288: kvm_entry: vcpu 0
600289: kvm_exit: reason APIC_ACCESS rip 0x806e7b91 info 1
600290: kvm_emulate_insn: 0:806e7b91: 89 0d 00 03 fe ff
600291: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
600291: kvm_apic: apic_write APIC_ICR = 0x40041
600291: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
600291: kvm_apic_accept_irq:  apicid 0 vec 65 (Fixed|edge) (coalesced)

600292: kvm_entry: vcpu 0
600293: kvm_exit: reason APIC_ACCESS rip 0x806e7b97 info 3
600294: kvm_emulate_insn: 0:806e7b97: f7 05 00 03 fe ff 00 10 00 0
600294: kvm_apic: apic_read APIC_ICR = 0x40041
600295: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
600295: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
600295: kvm_apic: apic_write APIC_ICR = 0x40041
600296: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
600296: kvm_apic_accept_irq:  apicid 0 vec 65 (Fixed|edge) (coalesced)

The asm code at addr 0x80637b85 is:

0x80637b85:  testl $0x1000, 0xfffe0300
0x80637b8f:   jne 0x80637b85
0x80637b91:  mov %ecx, 0xfffe0300
0x80637b97:  testl $0x1000, 0xfffe0300
0x80637ba1:  jne 0x80637b97

I wonder why the testl operation also causes an ICR write; from the
asm code, only one IPI should be issued, but from trace-cmd, it issued
3 IPIs. Is there something wrong?

   Is it also possible to optimize ICR write emulation? From the
result, a winxp vm produces a lot of ICR writes.
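
For reference, my understanding of the EOI fast path this patch adds (a
rough sketch with approximate names, not the exact source):

/* On an APIC_ACCESS exit, a plain MOV write to the EOI register is
 * completed directly instead of going through the x86 emulator. */
static int handle_apic_access(struct kvm_vcpu *vcpu)
{
        unsigned long exit_qual = vmcs_readl(EXIT_QUALIFICATION);
        int access_type = exit_qual & APIC_ACCESS_TYPE;
        int offset = exit_qual & APIC_ACCESS_OFFSET;

        if (access_type == TYPE_LINEAR_APIC_INST_WRITE &&
            offset == APIC_EOI) {
                kvm_lapic_set_eoi(vcpu);          /* fast path */
                skip_emulated_instruction(vcpu);
                return 1;
        }
        return emulate_instruction(vcpu, 0) == EMULATE_DONE; /* slow path */
}

The ICR accesses in the trace above still take the emulate_instruction()
path, which is why each one shows a full kvm_emulate_insn event.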

Regards!

Suya.


2011/8/29 Avi Kivity :
> On 08/29/2011 04:55 PM, Jan Kiszka wrote:
>>
>> On 2011-08-29 13:11, Avi Kivity wrote:
>> >  On 08/29/2011 02:03 PM, Jan Kiszka wrote:
>> >>>
>> >>>   Just reading the first byte requires a guest page table walk.  This is
>> >>>   probably the highest cost in emulation (which also requires a walk for
>> >>>   the data access).
>> >>
>> >>  And what about caching the result of the first walk? Usually, a "sane
>> >>  guest" won't have many code pages that issue the EOI.
>> >>
>> >
>> >  There's no way to know when to invalidate the cache.
>>
>> Set the affected code page read-only?
>
> The virt-phys mapping could change too.  And please, don't think of new
> reasons to write protect pages, they break up my lovely 2M maps.
>
>> >
>> >  We could go a bit further, and cache the whole thing.  On the first
>> >  exit, do the entire emulation, and remember %rip.  On the second exit,
>> >  if %rip matches, skip directly to kvm_lapic_eoi().
>> >
>> >  But I don't think it's worth it.  This also has failure modes, and
>> >  really, no guest will ever write to EOI with stosl.
>>
>> ...or add/sub/and/or etc.
>
> Argh, yes, flags can be updated.
>
> Actually, this might work - if we get a read access first as part of the
> RMW, we'll emulate the instruction.  No idea what the hardware does in this
> case.
>
>>  Well, we've done other crazy things in the
>> past just to keep even the unlikely case correct. I was just wondering
>> if that policy changed.
>
> I can't answer yes to that question.  But I see no way to make it work both
> fast and correct.
>
>>
>> However, I just realized that user space is able to avoid this
>> inaccuracy for potentially insane guests by not using in-kernel
>> irqchips. So we have at least a knob.
>
> Could/should have a flag to disable this in the kernel as well.
>
> --
> error compiling committee.c: too many arguments to function
>


Re: windows workload: many ept_violation and mmio exits

2011-08-25 Thread ya su
hi,Avi:

I met the same problem: tons of hpet vm_exits (vector 209; the fault
address is in the guest vm's hpet mmio range). Even if I disable the
hpet device in the win7 guest vm, it still produces a large amount of
vm_exits under trace-cmd; I added -no-hpet when starting the vm, but it
still has an HPET device inside the VM.

Does that mean the HPET device in the VM does not depend on the
emulated hpet device in qemu-kvm? Is there any way to disable the VM's
HPET device to prevent so many vm_exits? Thanks.
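
One way to count the hpet mmio exits directly would be the following
(assuming the standard HPET mmio base of 0xfed00000, as seen in the
traces below):

trace-cmd record -e kvm:kvm_mmio sleep 10
trace-cmd report | grep -c 'gpa 0xfed00'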

Regards.

Suya.

2009/12/3 Avi Kivity :
> On 12/03/2009 03:46 PM, Andrew Theurer wrote:
>>
>> I am running a windows workload which has 26 windows VMs running many
>> instances of a J2EE workload.  There are 13 pairs of an application server
>> VM and a database server VM.  There seem to be quite a lot of vm_exits, and it
>> looks like over a third of them are mmio_exit:
>>
>>> efer_relo  0
>>> exits      337139
>>> fpu_reloa  247321
>>> halt_exit  19092
>>> halt_wake  18611
>>> host_stat  247332
>>> hypercall  0
>>> insn_emul  184265
>>> insn_emul  184265
>>> invlpg     0
>>> io_exits   69184
>>> irq_exits  52953
>>> irq_injec  48115
>>> irq_windo  2411
>>> largepage  19
>>> mmio_exit  123554
>>
>> I collected a kvmtrace, and below is a very small portion of that.  Is
>> there a way I can figure out what device the mmio's are for?
>
> We want 'info physical_address_space' in the monitor.
>
>> Also, is it normal to have lots of ept_violations?  This is a 2 socket
>> Nehalem system with SMT on.
>
> So long as pf_fixed is low, these are all mmio or apic accesses.
>
>>
>>
>>> qemu-system-x86-19673 [014] 213577.939624: kvm_page_fault: address fed000f0 error_code 181
>>>  qemu-system-x86-19673 [014] 213577.939627: kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
>>>  qemu-system-x86-19673 [014] 213577.939629: kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0xfb8f214d
>
> hpet
>
>>>  qemu-system-x86-19673 [014] 213577.939631: kvm_entry: vcpu 0
>>>  qemu-system-x86-19673 [014] 213577.939633: kvm_exit: reason ept_violation rip 0xf8000160ef8e
>>>  qemu-system-x86-19673 [014] 213577.939634: kvm_page_fault: address fed000f0 error_code 181
>
> hpet - was this the same exit? we ought to skip over the emulated
> instruction.
>
>>>  qemu-system-x86-19673 [014] 213577.939693: kvm_page_fault: address fed000f0 error_code 181
>>>  qemu-system-x86-19673 [014] 213577.939696: kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
>
> hpet
>
>>>  qemu-system-x86-19332 [008] 213577.939699: kvm_exit: reason ept_violation rip 0xf80001b3af8e
>>>  qemu-system-x86-19332 [008] 213577.939700: kvm_page_fault: address fed000f0 error_code 181
>>>  qemu-system-x86-19673 [014] 213577.939702: kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0xfb8f3da6
>
> hpet
>
>>>  qemu-system-x86-19332 [008] 213577.939706: kvm_mmio: mmio unsatisfied-read len 4 gpa 0xfed000f0 val 0x0
>>>  qemu-system-x86-19563 [010] 213577.939707: kvm_ioapic_set_irq: pin 11 dst 1 vec=130 (LowPrio|logical|level)
>>>  qemu-system-x86-19332 [008] 213577.939713: kvm_mmio: mmio read len 4 gpa 0xfed000f0 val 0x29a105de
>
> hpet ...
>
>>>  qemu-system-x86-19673 [014] 213577.939908: kvm_ioapic_set_irq: pin 11 dst 1 vec=130 (LowPrio|logical|level)
>>>  qemu-system-x86-19673 [014] 213577.939910: kvm_entry: vcpu 0
>>>  qemu-system-x86-19673 [014] 213577.939912: kvm_exit: reason apic_access rip 0xf800016a050c
>>>  qemu-system-x86-19673 [014] 213577.939914: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
>
> apic eoi
>
>>>  qemu-system-x86-19332 [008] 213577.939958: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
>>>  qemu-system-x86-19673 [014] 213577.939958: kvm_pic_set_irq: chip 1 pin 3 (level|masked)
>>>  qemu-system-x86-19332 [008] 213577.939958: kvm_apic: apic_write APIC_EOI = 0x0
>
> apic eoi
>
>>>  qemu-system-x86-19673 [014] 213577.940010: kvm_exit: reason cr_access rip 0xf800016ee2b2
>>>  qemu-system-x86-19673 [014] 213577.940011: kvm_cr: cr_write 4 = 0x678
>>>  qemu-system-x86-19673 [014] 213577.940017: kvm_entry: vcpu 0
>>>  qemu-system-x86-19673 [014] 213577.940019: kvm_exit: reason cr_access rip 0xf800016ee2b5
>>>  qemu-system-x86-19673 [014] 213577.940019: kvm_cr: cr_write 4 = 0x6f8
>
> toggling global pages, we can avoid that with CR4_GUEST_HOST_MASK.
>
> So, tons of hpet and eois.  We can accelerate both by using the hyper-V
> accelerations; we already have some (unmerged) code for eoi, so this should
> be improved soon.
>
>>
>> Here is oprofile:
>>
>>> 4117817  62.2029  kvm-intel.ko             kvm-intel.ko             vmx_vcpu_run
>>> 338198    5.1087  qemu-system-x86_64       qemu-system-x86_64       /usr/local/qemu/48bb360cc687b89b74dfb1cac0f6e8812b64841c/bin/qemu-system-x86_64
>>> 62449     0.9433  kvm.ko                   kvm.ko                   kvm_arch_vcpu_ioctl_run
>>> 56512     0.8537  vmlinux-2.6.32-rc7-5e8cb552cb8b48244b6d07bff984b3c4080d4bc9-autokern1  vmlinux-2.6.32-rc7-5e8cb552cb8b48

Re: large amount of NMI_INTERRUPT exits degrades winxp VM performance much.

2011-08-11 Thread ya su
Hi, Avi:

Your guess is right: the fast server is AMD with NPT, and the slow
server is Intel's 7430 with no EPT. I now understand the reserved bit
comes from kvm's soft mmu.

But there is still one confusing problem: why does a FC14 VM have much
better storage IO performance on the same host?

I always check the IO on the host with iotop when copying files or
running fio in the VM: when running an FC14 guest, it can reach
30MB/s, while copying a file or running fio in the winxp guest gives
about 2-3MB/s.

FC14's trace-cmd output is as follows:

qemu-kvm-7636  [006]   897.452208: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.452213: kvm_exit: reason EXCEPTION_NMI rip 0x8100b5fa
qemu-kvm-7636  [006]   897.452217: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.452408: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x81009ddd
qemu-kvm-7636  [006]   897.452411: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.452437: kvm_exit: reason CR_ACCESS rip 0x8103fadd
qemu-kvm-7636  [006]   897.452437: kvm_cr: cr_write 3 = 0x7a709000
qemu-kvm-7636  [006]   897.452442: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.453113: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8103d12e
qemu-kvm-7636  [006]   897.453116: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge)
qemu-kvm-7636  [006]   897.453120: kvm_inj_virq: irq 239
qemu-kvm-7636  [006]   897.453121: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.453126: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636  [006]   897.453134: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
qemu-kvm-7636  [006]   897.453135: kvm_apic: apic_write APIC_EOI = 0x0
qemu-kvm-7636  [006]   897.453137: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.453155: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636  [006]   897.453159: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0xe6f5
qemu-kvm-7636  [006]   897.453160: kvm_apic: apic_write APIC_TMICT = 0xe6f5
qemu-kvm-7636  [006]   897.453164: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.453373: kvm_exit: reason IO_INSTRUCTION rip 0x812243e4
qemu-kvm-7636  [006]   897.453378: kvm_pio: pio_write at 0xc050 size 2 count 1
qemu-kvm-7636  [006]   897.453625: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.453984: kvm_exit: reason IO_INSTRUCTION rip 0x812243e4
qemu-kvm-7636  [006]   897.453984: kvm_pio: pio_write at 0xc050 size 2 count 1
qemu-kvm-7636  [006]   897.454198: kvm_apic_accept_irq: apicid 0 vec 239 (Fixed|edge)
qemu-kvm-7636  [006]   897.454201: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.454206: kvm_exit: reason PENDING_INTERRUPT rip 0x81201c95
qemu-kvm-7636  [006]   897.454209: kvm_inj_virq: irq 239
qemu-kvm-7636  [006]   897.454209: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.454212: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636  [006]   897.454220: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
qemu-kvm-7636  [006]   897.454222: kvm_apic: apic_write APIC_EOI = 0x0
qemu-kvm-7636  [006]   897.454225: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.454238: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636  [006]   897.454243: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0xd29f
qemu-kvm-7636  [006]   897.454243: kvm_apic: apic_write APIC_TMICT = 0xd29f
qemu-kvm-7636  [006]   897.454247: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.454405: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8113e91f
qemu-kvm-7636  [006]   897.454410: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.454547: kvm_exit: reason IO_INSTRUCTION rip 0x812243e4
qemu-kvm-7636  [006]   897.454548: kvm_pio: pio_write at 0xc050 size 2 count 1
qemu-kvm-7636  [006]   897.454690: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.454714: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636  [006]   897.454720: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0x7e
qemu-kvm-7636  [006]   897.454721: kvm_apic: apic_write APIC_TMICT = 0x7e
qemu-kvm-7636  [006]   897.454725: kvm_entry: vcpu 0
qemu-kvm-7636  [006]   897.454730: kvm_exit: reason APIC_ACCESS rip 0x81026239
qemu-kvm-7636  [006]   897.454733: kvm_mmio: mmio write len 4 gpa 0xfee00380 val 0x1c040d
qemu-kvm-7636  [006]   897.454735: kvm_apic: apic_write APIC_TMICT = 0x1c040d
qemu-kvm-7636  [006]   897.454737: kvm_entry: vcpu 0

Re: large amount of NMI_INTERRUPT exits degrades winxp VM performance much.

2011-08-10 Thread ya su
To rule out guest settings, I ran the same winxp image on another
server with the same kernel/qemu-kvm/command; the copy is fast. So I
think this problem relates only to some special hardware on the host.
The fast server's trace-cmd output is as follows:

 qemu-system-x86-7681  [001] 20054.604841: kvm_entry: vcpu 0
 qemu-system-x86-7681  [001] 20054.604842: kvm_exit: reason UNKNOWN rip 0x806e7d33
 qemu-system-x86-7681  [001] 20054.604842: kvm_page_fault: address fee000b0 error_code 6
 qemu-system-x86-7681  [001] 20054.604843: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
 qemu-system-x86-7681  [001] 20054.604843: kvm_apic: apic_write APIC_EOI = 0x0
 qemu-system-x86-7681  [001] 20054.604844: kvm_entry: vcpu 0
 qemu-system-x86-7681  [001] 20054.604917: kvm_exit: reason UNKNOWN rip 0xbff63b14
 qemu-system-x86-7681  [001] 20054.604917: kvm_page_fault: address b8040 error_code 4
 qemu-system-x86-7681  [001] 20054.604920: kvm_mmio: mmio unsatisfied-read len 1 gpa 0xb8040 val 0x0
 qemu-system-x86-7681  [001] 20054.604923: kvm_mmio: mmio read len 1 gpa 0xb8040 val 0x0
 qemu-system-x86-7681  [001] 20054.604924: kvm_mmio: mmio write len 1 gpa 0xb8040 val 0x0
 qemu-system-x86-7681  [001] 20054.604925: kvm_entry: vcpu 0
 qemu-system-x86-7681  [001] 20054.604926: kvm_exit: reason UNKNOWN rip 0xbff63b1a
 qemu-system-x86-7681  [001] 20054.604927: kvm_page_fault: address b801a error_code 6
 qemu-system-x86-7681  [001] 20054.604928: kvm_mmio: mmio write len 1 gpa 0xb801a val 0xd
 qemu-system-x86-7681  [001] 20054.604928: kvm_entry: vcpu 0
 qemu-system-x86-7681  [001] 20054.604929: kvm_exit: reason UNKNOWN rip 0xbff63b23
 qemu-system-x86-7681  [001] 20054.604929: kvm_page_fault: address b8014 error_code 6
 qemu-system-x86-7681  [001] 20054.604930: kvm_mmio: mmio write len 4 gpa 0xb8014 val 0x15f900

   According to Kevin Tian's suggestion, the NMI exit is produced by
the guest's write to a reserved page. Is there any way to find out why
the slow-copy server reserves the memory page?
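
For reference, the page-fault error codes in these traces decode with
the standard x86 #PF error-code bits as follows (a sketch for
illustration, not kvm source):

/* x86 #PF error-code bits */
#define PF_PRESENT  (1 << 0)   /* page was present               */
#define PF_WRITE    (1 << 1)   /* fault was a write              */
#define PF_USER     (1 << 2)   /* fault came from user mode      */
#define PF_RSVD     (1 << 3)   /* reserved PTE bit was set       */
#define PF_FETCH    (1 << 4)   /* fault on an instruction fetch  */

/* error_code b = PF_PRESENT | PF_WRITE | PF_RSVD: a write hitting a
 * PTE with reserved bits set, which fits Kevin's explanation of the
 * soft mmu trapping the access. */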

   I checked the server's memory: there is plenty of free space, and no
swap is used.

   I have also tested the server with kernel 2.6.39; the problem remains.


Regards.

Suya.

2011/8/11 Tian, Kevin :
>> From: ya su
>> Sent: Thursday, August 11, 2011 11:57 AM
>>
>>  When I run a winxp guest on one server, copying a file of about 4G
>> takes 40-50 min; if I run a FC14 guest, it takes about 2-3 min.
>>
>>  I copied and ran the winxp image on another server; it works well,
>> taking about 3 min.
>>
>>  I ran trace-cmd while copying files; the main difference between the
>> two outputs is that the slow one has many NMI_INTERRUPT vm_exits,
>> while the fast output has no such vm_exits. Both servers have NMI
>> enabled by default. The slow one's output is as follows:
>>  qemu-system-x86-4454  [004]   549.958147: kvm_entry: vcpu 0
>>  qemu-system-x86-4454  [004]   549.958172: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958172: kvm_page_fault: address c8f8a000 error_code b
>>  qemu-system-x86-4454  [004]   549.958177: kvm_entry: vcpu 0
>>  qemu-system-x86-4454  [004]   549.958202: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958204: kvm_page_fault: address c8f8b000 error_code b
>>  qemu-system-x86-4454  [004]   549.958209: kvm_entry: vcpu 0
>>  qemu-system-x86-4454  [004]   549.958234: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958234: kvm_page_fault: address c8f8c000 error_code b
>>  qemu-system-x86-4454  [004]   549.958239: kvm_entry: vcpu 0
>>  qemu-system-x86-4454  [004]   549.958264: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958264: kvm_page_fault: address c8f8d000 error_code b
>>  qemu-system-x86-4454  [004]   549.958267: kvm_entry: vcpu 0
>>  qemu-system-x86-4454  [004]   549.958292: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958294: kvm_page_fault: address c8f8e000 error_code b
>>  qemu-system-x86-4454  [004]   549.958299: kvm_entry: vcpu 0
>>  qemu-system-x86-4454  [004]   549.958324: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958324: kvm_page_fault: address c8f8f000 error_code b
>>  qemu-system-x86-4454  [004]   549.958329: kvm_entry: vcpu 0
>>  qemu-system-x86-4454  [004]   549.958447: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x8054

large amount of NMI_INTERRUPT exits degrades winxp VM performance much.

2011-08-10 Thread ya su
 When I run a winxp guest on one server, copying a file of about 4G
takes 40-50 min; if I run a FC14 guest, it takes about 2-3 min.

 I copied and ran the winxp image on another server; it works well,
taking about 3 min.

 I ran trace-cmd while copying files; the main difference between the
two outputs is that the slow one has many NMI_INTERRUPT vm_exits, while
the fast output has no such vm_exits. Both servers have NMI enabled by
default. The slow one's output is as follows:
 qemu-system-x86-4454  [004]   549.958147: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958172: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958172: kvm_page_fault: address c8f8a000 error_code b
 qemu-system-x86-4454  [004]   549.958177: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958202: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958204: kvm_page_fault: address c8f8b000 error_code b
 qemu-system-x86-4454  [004]   549.958209: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958234: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958234: kvm_page_fault: address c8f8c000 error_code b
 qemu-system-x86-4454  [004]   549.958239: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958264: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958264: kvm_page_fault: address c8f8d000 error_code b
 qemu-system-x86-4454  [004]   549.958267: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958292: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958294: kvm_page_fault: address c8f8e000 error_code b
 qemu-system-x86-4454  [004]   549.958299: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958324: kvm_exit: reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958324: kvm_page_fault: address c8f8f000 error_code b
 qemu-system-x86-4454  [004]   549.958329: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958447: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x80547ac8
 qemu-system-x86-4454  [004]   549.958450: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958461: kvm_exit: reason CR_ACCESS rip 0x8054428c
 qemu-system-x86-4454  [004]   549.958461: kvm_cr: cr_write 0 = 0x80010031
 qemu-system-x86-4454  [004]   549.958541: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958573: kvm_exit: reason CR_ACCESS rip 0x80546beb
 qemu-system-x86-4454  [004]   549.958575: kvm_cr: cr_write 0 = 0x8001003b
 qemu-system-x86-4454  [004]   549.958585: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958610: kvm_exit: reason CR_ACCESS rip 0x80546b6c
 qemu-system-x86-4454  [004]   549.958610: kvm_cr: cr_write 3 = 0x6e00020
 qemu-system-x86-4454  [004]   549.958621: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958645: kvm_exit: reason EXCEPTION_NMI rip 0x8051d7f4
 qemu-system-x86-4454  [004]   549.958645: kvm_page_fault: address c0648200 error_code 3
 qemu-system-x86-4454  [004]   549.958653: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958725: kvm_exit: reason EXCEPTION_NMI rip 0x8050a26a
 qemu-system-x86-4454  [004]   549.958726: kvm_page_fault: address c0796994 error_code 3
 qemu-system-x86-4454  [004]   549.958738: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958750: kvm_exit: reason IO_INSTRUCTION rip 0x806edad0
 qemu-system-x86-4454  [004]   549.958750: kvm_pio: pio_write at 0xc050 size 2 count 1
 qemu-system-x86-4454  [004]   549.958838: kvm_entry: vcpu 0
 qemu-system-x86-4454  [004]   549.958844: kvm_exit: reason APIC_ACCESS rip 0x806e7b85
 qemu-system-x86-4454  [004]   549.958852: kvm_apic: apic_read APIC_ICR = 0x40041
 qemu-system-x86-4454  [004]   549.958855: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
 qemu-system-x86-4454  [004]   549.958857: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
 qemu-system-x86-4454  [004]   549.958858: kvm_apic: apic_write APIC_ICR = 0x40041
 qemu-system-x86-4454  [004]   549.958860: kvm_apic_ipi: dst 1 vec 65 (Fixed|physical|de-assert|edge|self)
 qemu-system-x86-4454  [004]   549.958860: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge)

 Even if I disable the NMI watchdog by booting the kernel with
nmi_watchdog=0, the trace-cmd output still shows many NMI_INTERRUPT
exits. I find that in /proc/interrupts the NMI count is 0. Does this
mean the NMI is produced inside the winxp guest OS, or that this
setting cannot prevent kvm from catching NMI interrupts?

  I think the difference between FC14 and winxp is that FC14 processes
the NMI interrupt correctly but winxp cannot. Is this right?

  I run qemu-kvm version 0.14.0 on kernel 2.6.32-131.6.4. I changed
kvm-kmod to 2.6.32-27; it produces the same result.

  Any suggestions? Thanks.

Regards.

Suya.

Re: PowerPoint performance degrades greatly when logging on through rdesktop to winxp, and when lotus notes is running.

2011-07-05 Thread ya su
[...] kvm_exit: reason UNKNOWN rip 0x806e7b91
 qemu-system-x86-4239  [001]  7369.308539: kvm_page_fault: address fee00300 error_code 6
 qemu-system-x86-4239  [001]  7369.308540: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
 qemu-system-x86-4239  [001]  7369.308540: kvm_apic: apic_write APIC_ICR = 0x40041
 qemu-system-x86-4239  [001]  7369.308540: kvm_apic_ipi: dst 0 vec 65 (Fixed|physical|de-assert|edge|self)
 qemu-system-x86-4239  [001]  7369.308540: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge) (coalesced)
 qemu-system-x86-4239  [001]  7369.308541: kvm_entry: vcpu 0
 qemu-system-x86-4239  [001]  7369.308542: kvm_exit: reason UNKNOWN rip 0x806e7b97
 qemu-system-x86-4239  [001]  7369.308542: kvm_page_fault: address fee00300 error_code 4
 qemu-system-x86-4239  [001]  7369.308542: kvm_apic: apic_read APIC_ICR = 0x40041
 qemu-system-x86-4239  [001]  7369.308542: kvm_mmio: mmio read len 4 gpa 0xfee00300 val 0x40041
 qemu-system-x86-4239  [001]  7369.308543: kvm_mmio: mmio write len 4 gpa 0xfee00300 val 0x40041
 qemu-system-x86-4239  [001]  7369.308543: kvm_apic: apic_write APIC_ICR = 0x40041
 qemu-system-x86-4239  [001]  7369.308543: kvm_apic_ipi: dst 0 vec 65 (Fixed|physical|de-assert|edge|self)
 qemu-system-x86-4239  [001]  7369.308543: kvm_apic_accept_irq: apicid 0 vec 65 (Fixed|edge) (coalesced)

There are multiple writes to the apic's APIC_ICR with 0x40041, and
every time the vcpu exits quickly after entering; is this okay?

 qemu-system-x86-4239  [001]  7369.308543: kvm_entry: vcpu 0
 qemu-system-x86-4239  [001]  7369.308545: kvm_exit: reason UNKNOWN rip 0x806e7f18
 qemu-system-x86-4239  [001]  7369.308545: kvm_page_fault: address fee000b0 error_code 6
 qemu-system-x86-4239  [001]  7369.308545: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
 qemu-system-x86-4239  [001]  7369.308545: kvm_apic: apic_write APIC_EOI = 0x0
 qemu-system-x86-4239  [001]  7369.308546: kvm_ack_irq: irqchip IOAPIC pin 11
 qemu-system-x86-4239  [001]  7369.308546: kvm_entry: vcpu 0
 qemu-system-x86-4239  [001]  7369.308560: kvm_exit: reason UNKNOWN rip 0x800ca22e
 qemu-system-x86-4239  [001]  7369.308560: kvm_hypercall: nr 0x1 a0 0x41 a1 0x0 a2 0x0 a3 0x806e7410
 qemu-system-x86-4239  [001]  7369.308561: kvm_entry: vcpu 0
 qemu-system-x86-4239  [001]  7369.308562: kvm_exit: reason UNKNOWN rip 0x806e7d33
 qemu-system-x86-4239  [001]  7369.308562: kvm_page_fault: address fee000b0 error_code 6
 qemu-system-x86-4239  [001]  7369.308563: kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0
 qemu-system-x86-4239  [001]  7369.308563: kvm_apic: apic_write APIC_EOI = 0x0
 qemu-system-x86-4239  [001]  7369.308564: kvm_entry: vcpu 0
 qemu-system-x86-4239  [001]  7369.308569: kvm_exit: reason UNKNOWN rip 0xf77ffd3d

Again, APIC_EOI is written twice.

 qemu-system-x86-4227  [000]  7369.310965: kvm_set_irq: gsi 11 level 1 source 0

  I once thought it might come from network speed, so I assigned a pci
network card to the vm, but the problem remains. So I think the problem
may come from the windows rdp display driver's drag & drop
implementation, especially since it can be influenced by lotus notes,
but I cannot see how this hinders vm scheduling.

 I also increased the vm memory; it has no effect. My kernel is
2.6.32-131.4.1.

 Any suggestions? Thanks.

Regards!

Green.


2011/7/5 Avi Kivity :
> On 07/05/2011 12:40 PM, ya su wrote:
>>
>>      I am using qemu-kvm, cli as the following:
>>
>> qemu-system-x86_64 -drive
>> file=test-notes.img,if=virtio,cache=none,boot=on -net
>> nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3
>>
>>      I open powerpoint 2007 and drag a rectangle, and it moves very
>> slowly.  It must meet the following conditions to produce the same
>> result:
>>      (1) lotus notes is running.
>>      (2) logon is through rdesktop.
>>
>>      If I connect through vnc, it will not happen; if I don't run
>> lotus notes, it will not happen. If I change to 2 vcpus as the
>> following cli, it will respond much better.
>>
>> qemu-system-x86_64 -drive
>> file=test-notes.img,if=virtio,cache=none,boot=on -net
>> nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3
>> -smp 2
>>
>>       I first suspected it might be an internal windows problem, so
>> I tested on a uni-processor PC, but it looks good there.
>>
>>       I also ran qemu-kvm with -no-kvm; it produces the same results.
>>
>> I ran kvm_stat while dragging a rectangle; the output is as follows:
>>
>> exits                                      4650520   24645
>>  insn_emulation                             3508180   15158
>>  host_state_reload                          1273409   13999
>>  io_exits    

PowerPoint performance degrades greatly when logging on through rdesktop to winxp, and when lotus notes is running.

2011-07-05 Thread ya su
 I am using qemu-kvm, cli as the following:

qemu-system-x86_64 -drive
file=test-notes.img,if=virtio,cache=none,boot=on -net
nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3

 I open powerpoint 2007 and drag a rectangle, and it moves very
slowly.  It must meet the following conditions to produce the same
result:
 (1) lotus notes is running.
 (2) logon is through rdesktop.

 If I connect through vnc, it will not happen; if I don't run lotus
notes, it will not happen. If I change to 2 vcpus as the following cli,
it will respond much better.

qemu-system-x86_64 -drive
file=test-notes.img,if=virtio,cache=none,boot=on -net
nic,macaddr=00:00:00:11:22:88,model=virtio -net tap -m 1024 -vnc :3
-smp 2

  I first suspected it might be an internal windows problem, so I
tested on a uni-processor PC, but it looks good there.

  I also ran qemu-kvm with -no-kvm; it produces the same results.

I ran kvm_stat while dragging a rectangle; the output is as follows:

exits  4650520   24645
 insn_emulation 3508180   15158
 host_state_reload  1273409   13999
 io_exits   1031465   13504
 irq_injections  1791042629
 hypercalls  1314812084
 halt_wakeup  33589 495
 halt_exits   33584 495
 irq_exits   105020 237
 pf_fixed449879 106
 fpu_reload   16852  54
 mmio_exits   46426   1
 mmu_cache_miss9985   0
 mmu_shadow_zapped11736   0
 signal_exits  2145   0
 remote_tlb_flush   251   0

 It seems that qemu-kvm is emulating some instructions which take much
cpu resource, but I don't know how to find the emulated instructions.
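
One way to find them would be to record the kvm tracepoints and inspect
the emulated instructions directly (a hedged example; it assumes the
kernel exposes the kvm_emulate_insn tracepoint, as in the newer traces
elsewhere in this archive):

trace-cmd record -e kvm sleep 10                  # record all kvm events for 10s
trace-cmd report | grep kvm_emulate_insn | head   # shows rip + opcode bytes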

 Any suggestions? Thanks.

Regards.

Green.


Re: USB EHCI patch for 0.14.0?

2011-04-11 Thread ya su
David:

I have applied the patch to 0.14.0, and there is a bug if I add an
Optiarc CRRWDVD CRX890A usb device on windows xp. I first commented out
the following code in usb-linux.c:

    if (is_halted(s, p->devep)) {
        ret = ioctl(s->fd, USBDEVFS_CLEAR_HALT, &urb->endpoint);
#if 0   /* disabled: a failed CLEAR_HALT no longer NAKs the packet */
        if (ret < 0) {
            DPRINTF("husb: failed to clear halt. ep 0x%x errno %d\n",
                    urb->endpoint, errno);
            return USB_RET_NAK;
        }
#endif
        clear_halt(s, p->devep);
    }

 Then it can continue to run on linux, but it still stalls on windows
xp and win7. I turned on debugging; part of the output is as follows:

husb: async cancel. aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -2 alen 0
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x2 index 0 len 0
husb: ctrl set addr 2
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 64 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 4
invoking packet_complete. plen = 4
husb: data submit. ep 0x81 len 13 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -32 alen 0
invoking packet_complete. plen = -3
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x1 index 0 len 0
husb: ctrl set addr 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 64 aurb 0x1616cd0
[Thread 0x74f75710 (LWP 3317) exited]
husb: async completed. aurb 0x1616cd0 status 0 alen 4
invoking packet_complete. plen = 4
husb: data submit. ep 0x81 len 13 aurb 0x1616cd0
husb: async cancel. aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -2 alen 0
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x2 index 0 len 0
husb: ctrl set addr 2
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 

Re: [COMMIT] [WIN-GUEST-DRIVERS] Balloon - remove WMI usage. Remove wmi.c.

2011-03-18 Thread ya su
Yan:

 I have tested the newest balloon driver (from 1.1.16) on windows
server 2003; balloon.sys cannot be installed successfully and returns
error code 10. Have you tested this, or are there any updates? Thanks.

Regards.

Green.


2010/2/15 Yan Vugenfirer :
> repository: C:/dev/kvm-guest-drivers-windows
> branch: master
> commit 7ab588f373eda9d08a497e969739019d2075a6d2
> Author: Yan Vugenfirer 
> Date:   Mon Feb 15 15:01:36 2010 +0200
>
>    [WIN-GUEST-DRIVERS] Balloon - remove WMI usage. Remove wmi.c.
>
>        Signed-off-by: Vadim Rozenfeld
>
> diff --git a/Balloon/BalloonWDF/wmi.c b/Balloon/BalloonWDF/wmi.c
> deleted file mode 100644
> index 70a9270..000
> --- a/Balloon/BalloonWDF/wmi.c
> +++ /dev/null
> @@ -1,90 +0,0 @@
> -/**
> - * Copyright (c) 2009  Red Hat, Inc.
> - *
> - * File: device.c
> - *
> - * Author(s):
> - *
> - * This file contains WMI support routines
> - *
> - * This work is licensed under the terms of the GNU GPL, version 2.  See
> - * the COPYING file in the top-level directory.
> - *
> -**/
> -#include "precomp.h"
> -
> -#if defined(EVENT_TRACING)
> -#include "wmi.tmh"
> -#endif
> -
> -
> -#define MOFRESOURCENAME L"MofResourceName"
> -
> -#ifdef ALLOC_PRAGMA
> -#pragma alloc_text(PAGE, WmiRegistration)
> -#pragma alloc_text(PAGE, EvtWmiDeviceInfoQueryInstance)
> -#endif
> -
> -NTSTATUS
> -WmiRegistration(
> -    WDFDEVICE      Device
> -    )
> -{
> -    WDF_WMI_PROVIDER_CONFIG providerConfig;
> -    WDF_WMI_INSTANCE_CONFIG instanceConfig;
> -    NTSTATUS        status;
> -    DECLARE_CONST_UNICODE_STRING(mofRsrcName, MOFRESOURCENAME);
> -
> -    PAGED_CODE();
> -
> -    TraceEvents(TRACE_LEVEL_INFORMATION, DBG_PNP, "--> WmiRegistration\n");
> -
> -    status = WdfDeviceAssignMofResourceName(Device, &mofRsrcName);
> -    if (!NT_SUCCESS(status)) {
> -        TraceEvents(TRACE_LEVEL_ERROR, DBG_PNP,
> -                     "WdfDeviceAssignMofResourceName failed 0x%x", status);
> -        return status;
> -    }
> -
> -    WDF_WMI_PROVIDER_CONFIG_INIT(&providerConfig, &GUID_DEV_WMI_BALLOON);
> -    providerConfig.MinInstanceBufferSize = sizeof(ULONGLONG);
> -
> -    WDF_WMI_INSTANCE_CONFIG_INIT_PROVIDER_CONFIG(&instanceConfig, &providerConfig);
> -    instanceConfig.Register = TRUE;
> -    instanceConfig.EvtWmiInstanceQueryInstance = EvtWmiDeviceInfoQueryInstance;
> -
> -    status = WdfWmiInstanceCreate(Device,
> -                                  &instanceConfig,
> -                                  WDF_NO_OBJECT_ATTRIBUTES,
> -                                  WDF_NO_HANDLE);
> -    if (!NT_SUCCESS(status)) {
> -        TraceEvents(TRACE_LEVEL_ERROR, DBG_PNP,
> -                     "WdfWmiInstanceCreate failed 0x%x", status);
> -        return status;
> -    }
> -
> -    TraceEvents(TRACE_LEVEL_INFORMATION, DBG_PNP, "<-- WmiRegistration\n");
> -    return status;
> -}
> -
> -NTSTATUS
> -EvtWmiDeviceInfoQueryInstance(
> -    __in  WDFWMIINSTANCE WmiInstance,
> -    __in  ULONG OutBufferSize,
> -    __out_bcount_part(OutBufferSize, *BufferUsed) PVOID OutBuffer,
> -    __out PULONG BufferUsed
> -    )
> -{
> -    PDRIVER_CONTEXT drvCxt = GetDriverContext(WdfGetDriver());
> -
> -    PAGED_CODE();
> -
> -    TraceEvents(TRACE_LEVEL_VERBOSE, DBG_WMI, "--> EvtWmiDeviceInfoQueryInstance\n");
> -
> -    RtlZeroMemory(OutBuffer, sizeof(ULONGLONG));
> -    *(ULONGLONG*) OutBuffer = (ULONGLONG)drvCxt->num_pages;
> -    *BufferUsed = sizeof(ULONGLONG);
> -
> -    TraceEvents(TRACE_LEVEL_VERBOSE, DBG_WMI, "<-- EvtWmiDeviceInfoQueryInstance\n");
> -    return STATUS_SUCCESS;
> -}


SR-IOV of LSI MegaRAID storage controller?

2011-03-17 Thread ya su
hi,all:

I noticed news that kvm can use the SR-IOV function of the LSI
MegaRAID storage controller, shown at 2009 IDT. Has anyone succeeded
in testing this function, and how do you configure the kernel and
qemu? Thanks.

Regards.

Green.


Re: [PATCH 09/18] Introduce event-tap.

2011-03-09 Thread ya su
Yoshi:

I met one problem: if I kill the ft source VM, the dest ft VM returns
errors as follows:

qemu-system-x86_64: fill buffer failed, Resource temporarily unavailable
qemu-system-x86_64: recv header failed

The problem is that the dest VM cannot continue to run, as it was
interrupted in the middle of a transaction: some of the ram has been
updated, but the rest has not. Do you have any plan for rolling back to
cancel the interrupted transaction? Thanks.


Green.



2011/3/9 Yoshiaki Tamura :
> ya su wrote:
>>
>> Yoshi:
>>
>>     I think event-tap is a great idea; it removes the reading from
>> disk, which will improve ft efficiency much more, as in your plan for
>> later series.
>>
>>     One question: IO read/write may dirty ram, but it is difficult to
>> distinguish those pages from other dirty pages, such as ones dirtied
>> by running software; does that mean you need to change all the
>> emulated device implementations?  Actually I think not too much ram
>> dirtied by IO read/write will be sent in ram_save_live, but if it can
>> event-tap IO read/write and replay it on the other side, does that
>> mean we don't need to call qemu_savevm_state_full in ft transactions?
>
> I'm not expecting to remove qemu_savevm_state_full in the transaction.  Just
> reduce the number of pages to be transfered as a result.
>
> Thanks,
>
> Yoshi
>
>>
>> Green.
>>
>>
>> 2011/3/9 Yoshiaki Tamura:
>>>
>>> ya su wrote:
>>>>
>>>> 2011/3/8 Yoshiaki Tamura:
>>>>>
>>>>> ya su wrote:
>>>>>>
>>>>>> Yokshiaki:
>>>>>>
>>>>>>     event-tap records block and io write events and replays them on
>>>>>> the other side, so block_save_live is useless during the latter ft
>>>>>> phase, right? If so, I think it needs to process the following code in
>>>>>> block_save_live function:
>>>>>
>>>>> Actually no.  It just replays the last events only.  We do have patches
>>>>> that
>>>>> enable block replication without using block live migration, like the
>>>>> way
>>>>> you described above.  In that case, we disable block live migration
>>>>> when
>>>>>  we
>>>>> go into ft mode.  We're thinking to propose it after this series get
>>>>> settled.
>>>>
>>>> so event-tap's objective is to initial a ft transaction, to start the
>>>> sync. of ram/block/device states? if so, it need not change
>>>> bdrv_aio_writev/bdrv_aio_flush normal process, on the other side it
>>>> need not invokde bdrv_aio_writev either, right?
>>>
>>> Mostly yes, but because event-tap is queuing requests from block/net, it
>>> needs to flush queued requests after the transaction on the primary side.
>>>  On the secondary, it currently doesn't have to invoke bdrv_aio_writev as
>>> you mentioned.  But will change soon to enable block replication with
>>> event-tap.
>>>
>>>>
>>>>>
>>>>>>
>>>>>>     if (stage == 1) {
>>>>>>         init_blk_migration(mon, f);
>>>>>>
>>>>>>         /* start track dirty blocks */
>>>>>>         set_dirty_tracking(1);
>>>>>>     }
>>>>>> --
>>>>>> the following code will send blocks to the other side, but this will
>>>>>> also be done by event-tap replay. I think it should be placed in stage 3,
>>>>>> before the assert line (this may affect the stage 2 rate limit,
>>>>>> so it could stay in stage 2, though that looks ugly); another
>>>>>> choice is to avoid the invocation of block_save_live, right?
>>>>>> ---
>>>>>>     flush_blks(f);
>>>>>>
>>>>>>     if (qemu_file_has_error(f)) {
>>>>>>         blk_mig_cleanup(mon);
>>>>>>         return 0;
>>>>>>     }
>>>>>>
>>>>>>     blk_mig_reset_dirty_cursor();
>>>>>> 
>>>>>>     if (stage == 2) {
>>>>>>
>>>>>>
>>>>>>     another question: since you event-tap I/O writes (I think I/O reads
>>>>>> should also be event-tapped, as a read may cause I/O chip state to
>>

Re: [PATCH 07/18] Introduce fault tolerant VM transaction QEMUFile and ft_mode.

2011-03-09 Thread ya su
Juan:

 It's especially important for FT to run in a standalone thread, as
network problems may otherwise block the monitor. What's your schedule?
Maybe I can help some.

Yoshi:

 in the following code:

+
+s->file = qemu_fopen_ops(s, ft_trans_put_buffer, ft_trans_get_buffer,
+ ft_trans_close, ft_trans_rate_limit,
+ ft_trans_set_rate_limit, NULL);
+
+return s->file;
+}

I think you should register a ft_trans_get_rate_limit function;
otherwise it will not transfer any block data at stage 2 in the
block_save_live function:

if (stage == 2) {
/* control the rate of transfer */
while ((block_mig_state.submitted +
block_mig_state.read_done) * BLOCK_SIZE <
   qemu_file_get_rate_limit(f)) {

 qemu_file_get_rate_limit will return 0, so it will never proceed
to copy dirty block data.
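
A minimal sketch of the fix I mean (the callback name follows the other
ft_trans_* handlers; the struct and field names are my assumption, and
the real ones in the series may differ):

/* assumed: QEMUFileFtTrans keeps its limit in s->xfer_limit, the way
 * buffered_file does */
static int64_t ft_trans_get_rate_limit(void *opaque)
{
    QEMUFileFtTrans *s = opaque;
    return s->xfer_limit;
}

    /* and pass it instead of the trailing NULL: */
    s->file = qemu_fopen_ops(s, ft_trans_put_buffer, ft_trans_get_buffer,
                             ft_trans_close, ft_trans_rate_limit,
                             ft_trans_set_rate_limit,
                             ft_trans_get_rate_limit);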

 FYI.

Green.


2011/2/24 Yoshiaki Tamura :
> 2011/2/24 Juan Quintela :
>>
>> [ trimming cc to kvm & qemu lists]
>>
>> Yoshiaki Tamura  wrote:
>>> Juan Quintela wrote:
 Yoshiaki Tamura  wrote:
> This code implements VM transaction protocol.  Like buffered_file, it
> sits between savevm and migration layer.  With this architecture, VM
> transaction protocol is implemented mostly independent from other
> existing code.

 Could you explain what is the difference with buffered_file.c?
 I am fixing problems on buffered_file, and having something that copies
 a lot of code from there makes me nervous.
>>>
>>> The objective is different:
>>>
>>> buffered_file buffers data for transmission control.
>>> ft_trans_file adds headers to the stream, and controls the transaction
>>> between sender and receiver.
>>>
>>> Although ft_trans_file sometimes buffers data, it's not the main
>>> objective.
>>> If you're fixing the problems on buffered_file, I'll keep eyes on them.
>>>
> +typedef ssize_t (FtTransPutBufferFunc)(void *opaque, const void *data, 
> size_t size);

 Can we get some sharing here?
 typedef ssize_t (BufferedPutFunc)(void *opaque, const void *data, size_t 
 size);

 There are not so many types for a write function whose 1st element is
 one opaque :p
>>>
>>> You're right, but I want to keep ft_trans_file independent of
>>> buffered_file at this point.  Once Kemari gets merged, I'm happy to
>>> work with you to fix the problems on buffered_file and ft_trans_file,
>>> and refactoring them.
>>
>> My goal is getting its own thread for migration on 0.15, that
>> basically means that we can do rm buffered_file.c.  I guess that
>> something similar could happen for kemari.
>
> That means both get initiated by their own thread, not like
> the current poll-based approach.  I'm still skeptical whether Anthony agrees,
> but I'll keep it in my mind.
>
>> But for now, this is just the start + handwaving, once I start doing the
> work I will tell you.
>
> Yes, please.
>
> Yoshi
>
>>
>> Later, Juan.


Re: [PATCH 09/18] Introduce event-tap.

2011-03-08 Thread ya su
Yoshi:

I think event-tap is a great idea; it removes the reading from disk,
which will improve FT efficiency much more, as you plan in a later
series.

One question: I/O reads/writes may dirty RAM, but it is difficult to
distinguish those pages from pages dirtied by other causes, such as
running software. Does that mean you need to change all the emulated
device implementations? Actually I think ram_save_live will not send too
many pages dirtied by I/O reads/writes, but if event-tap can record I/O
reads/writes and replay them on the other side, does that mean we don't
need to call qemu_savevm_state_full in FT transactions?

Green.


2011/3/9 Yoshiaki Tamura :
> ya su wrote:
>>
>> 2011/3/8 Yoshiaki Tamura:
>>>
>>> ya su wrote:
>>>>
>>>> Yoshiaki:
>>>>
>>>> event-tap records block and I/O write events, and replays these on
>>>> the other side, so block_save_live is useless during the later FT
>>>> phase, right? if so, I think it needs to process the following code in
>>>> the block_save_live function:
>>>
>>> Actually no.  It just replays the last events only.  We do have patches
>>> that
>>> enable block replication without using block live migration, like the way
>>> you described above.  In that case, we disable block live migration when
>>>  we
>>> go into ft mode.  We're thinking to propose it after this series gets
>>> settled.
>>
>> so event-tap's objective is to initiate an FT transaction, to start the
>> sync of ram/block/device states? if so, it need not change the normal
>> bdrv_aio_writev/bdrv_aio_flush process, and on the other side it
>> need not invoke bdrv_aio_writev either, right?
>
> Mostly yes, but because event-tap is queuing requests from block/net, it
> needs to flush queued requests after the transaction on the primary side.
>  On the secondary, it currently doesn't have to invoke bdrv_aio_writev as
> you mentioned.  But that will change soon, to enable block replication with
> event-tap.
>
>>
>>>
>>>>
>>>>     if (stage == 1) {
>>>>         init_blk_migration(mon, f);
>>>>
>>>>         /* start track dirty blocks */
>>>>         set_dirty_tracking(1);
>>>>     }
>>>> --
>>>> the following code will send blocks to the other side, but this will
>>>> also be done by event-tap replay. I think it should be placed in stage 3,
>>>> before the assert line (this may affect the stage 2 rate limit,
>>>> so it could stay in stage 2, though that looks ugly); another
>>>> choice is to avoid the invocation of block_save_live, right?
>>>> ---
>>>>     flush_blks(f);
>>>>
>>>>     if (qemu_file_has_error(f)) {
>>>>         blk_mig_cleanup(mon);
>>>>         return 0;
>>>>     }
>>>>
>>>>     blk_mig_reset_dirty_cursor();
>>>> 
>>>>     if (stage == 2) {
>>>>
>>>>
>>>>     another question: since you event-tap I/O writes (I think I/O reads
>>>> should also be event-tapped, as a read may cause I/O chip state to
>>>> change), you then need not invoke qemu_savevm_state_full in
>>>> qemu_savevm_trans_complete, right? Thanks.
>>>
>>> It's not necessary to tap I/O reads, but you can if you like.  We also have
>>> experimental patches for this to reduce the RAM to be transferred.  But I
>>> don't understand why we don't have to invoke qemu_savevm_state_full,
>>> although I think we may reduce the number of RAM pages by replaying I/O
>>> reads on the secondary.
>>>
>>
>> I first thought the objective of the I/O-write event-tap was to reproduce the
>> same device state on the other side, though I doubted this, so I thought
>> I/O reads should also be recorded and replayed. Since event-tap only
>> initiates an FT transaction, and the sync of states still depends on
>> qemu_save_vm_live/full, I understand the design now, thanks.
>>
>> But I don't understand why the I/O-write event-tap can reduce the
>> transferred RAM as you mentioned; the amount of RAM depends only on dirty
>> pages, and I/O writes don't change the normal process, unlike block writes,
>> right?
>
> The point is, if we can assure that an I/O read retrieves the same data on
> both sides, then instead of dirtying the ram by the read (meaning we have to
> transfer it in the transaction), we can just replay the operation and get the
> same data on the other side. Anyway

Re: [PATCH 09/18] Introduce event-tap.

2011-03-08 Thread ya su
2011/3/8 Yoshiaki Tamura :
> ya su wrote:
>>
>> Yoshiaki:
>>
>> event-tap records block and I/O write events, and replays these on
>> the other side, so block_save_live is useless during the later FT
>> phase, right? if so, I think it needs to process the following code in
>> the block_save_live function:
>
> Actually no.  It just replays the last events only.  We do have patches that
> enable block replication without using block live migration, like the way
> you described above.  In that case, we disable block live migration when  we
> go into ft mode.  We're thinking to propose it after this series gets
> settled.

so event-tap's objective is to initiate an FT transaction, to start the
sync of ram/block/device states? If so, it need not change the normal
bdrv_aio_writev/bdrv_aio_flush process, and on the other side it need
not invoke bdrv_aio_writev either, right?

>
>>
>>     if (stage == 1) {
>>         init_blk_migration(mon, f);
>>
>>         /* start track dirty blocks */
>>         set_dirty_tracking(1);
>>     }
>> --
>> the following code will send blocks to the other side, but this will
>> also be done by event-tap replay. I think it should be placed in stage 3,
>> before the assert line (this may affect the stage 2 rate limit,
>> so it could stay in stage 2, though that looks ugly); another
>> choice is to avoid the invocation of block_save_live, right?
>> ---
>>     flush_blks(f);
>>
>>     if (qemu_file_has_error(f)) {
>>         blk_mig_cleanup(mon);
>>         return 0;
>>     }
>>
>>     blk_mig_reset_dirty_cursor();
>> 
>>     if (stage == 2) {
>>
>>
>>     another question: since you event-tap I/O writes (I think I/O reads
>> should also be event-tapped, as a read may cause I/O chip state to
>> change), you then need not invoke qemu_savevm_state_full in
>> qemu_savevm_trans_complete, right? Thanks.
>
> It's not necessary to tap I/O reads, but you can if you like.  We also have
> experimental patches for this to reduce the RAM to be transferred.  But I don't
> understand why we don't have to invoke qemu_savevm_state_full, although I
> think we may reduce the number of RAM pages by replaying I/O reads on the secondary.
>

I first thought the objective of the I/O-write event-tap was to reproduce
the same device state on the other side, though I doubted this, so I thought
I/O reads should also be recorded and replayed. Since event-tap only
initiates an FT transaction, and the sync of states still depends on
qemu_save_vm_live/full, I understand the design now, thanks.

But I don't understand why the I/O-write event-tap can reduce the
transferred RAM as you mentioned; the amount of RAM depends only on dirty
pages, and I/O writes don't change the normal process, unlike block writes,
right?

> Thanks,
>
> Yoshi
>
>>
>>
>> Green.
>>
>>
>>
>> 2011/2/24 Yoshiaki Tamura:
>>>
>>> event-tap controls when to start an FT transaction, and provides proxy
>>> functions to be called from net/block devices.  While in an FT transaction,
>>> it queues up net/block requests, and flushes them when the transaction gets
>>> completed.
>>>
>>> Signed-off-by: Yoshiaki Tamura
>>> Signed-off-by: OHMURA Kei
>>> ---
>>>  Makefile.target |    1 +
>>>  event-tap.c     |  940 +++
>>>  event-tap.h     |   44 +++
>>>  qemu-tool.c     |   28 ++
>>>  trace-events    |   10 +
>>>  5 files changed, 1023 insertions(+), 0 deletions(-)
>>>  create mode 100644 event-tap.c
>>>  create mode 100644 event-tap.h
>>>
>>> diff --git a/Makefile.target b/Makefile.target
>>> index 220589e..da57efe 100644
>>> --- a/Makefile.target
>>> +++ b/Makefile.target
>>> @@ -199,6 +199,7 @@ obj-y += rwhandler.o
>>>  obj-$(CONFIG_KVM) += kvm.o kvm-all.o
>>>  obj-$(CONFIG_NO_KVM) += kvm-stub.o
>>>  LIBS+=-lz
>>> +obj-y += event-tap.o
>>>
>>>  QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
>>>  QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
>>> diff --git a/event-tap.c b/event-tap.c
>>> new file mode 100644
>>> index 000..95c147a
>>> --- /dev/null
>>> +++ b/event-tap.c
>>> @@ -0,0 +1,940 @@
>>> +/*
>>> + * Event Tap functions for QEMU
>>> + *
>>> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
>>> + *
>>> + * This work is lic

Re: [PATCH 09/18] Introduce event-tap.

2011-03-03 Thread ya su
Yoshiaki:

event-tap records block and I/O write events, and replays these on
the other side, so block_save_live is useless during the later FT
phase, right? If so, I think it needs to process the following code in
the block_save_live function:

if (stage == 1) {
init_blk_migration(mon, f);

/* start track dirty blocks */
set_dirty_tracking(1);
}
--
the following code will send blocks to the other side, but this will
also be done by event-tap replay. I think it should be placed in stage 3,
before the assert line (this may affect the stage 2 rate limit, so it
could stay in stage 2, though that looks ugly); another choice is to
avoid the invocation of block_save_live entirely, right? A rough sketch
follows after the quoted code below.
---
flush_blks(f);

if (qemu_file_has_error(f)) {
blk_mig_cleanup(mon);
return 0;
}

blk_mig_reset_dirty_cursor();

if (stage == 2) {
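
Roughly what I have in mind (an untested sketch against the code quoted
above, not a tested patch; the assert is the one mentioned earlier):

    if (stage == 3) {
        /* moved here from stage 2: one final flush of the queued
         * blocks, right before the assert on submitted requests */
        flush_blks(f);

        if (qemu_file_has_error(f)) {
            blk_mig_cleanup(mon);
            return 0;
        }

        assert(block_mig_state.submitted == 0);
        /* ... rest of stage 3 unchanged ... */
    }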


another question: since you event-tap I/O writes (I think I/O reads
should also be event-tapped, as a read may cause I/O chip state to
change), you then need not invoke qemu_savevm_state_full in
qemu_savevm_trans_complete, right? Thanks.


Green.



2011/2/24 Yoshiaki Tamura :
> event-tap controls when to start an FT transaction, and provides proxy
> functions to be called from net/block devices.  While in an FT transaction,
> it queues up net/block requests, and flushes them when the transaction gets
> completed.
>
> Signed-off-by: Yoshiaki Tamura 
> Signed-off-by: OHMURA Kei 
> ---
>  Makefile.target |    1 +
>  event-tap.c     |  940 +++
>  event-tap.h     |   44 +++
>  qemu-tool.c     |   28 ++
>  trace-events    |   10 +
>  5 files changed, 1023 insertions(+), 0 deletions(-)
>  create mode 100644 event-tap.c
>  create mode 100644 event-tap.h
>
> diff --git a/Makefile.target b/Makefile.target
> index 220589e..da57efe 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -199,6 +199,7 @@ obj-y += rwhandler.o
>  obj-$(CONFIG_KVM) += kvm.o kvm-all.o
>  obj-$(CONFIG_NO_KVM) += kvm-stub.o
>  LIBS+=-lz
> +obj-y += event-tap.o
>
>  QEMU_CFLAGS += $(VNC_TLS_CFLAGS)
>  QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
> diff --git a/event-tap.c b/event-tap.c
> new file mode 100644
> index 000..95c147a
> --- /dev/null
> +++ b/event-tap.c
> @@ -0,0 +1,940 @@
> +/*
> + * Event Tap functions for QEMU
> + *
> + * Copyright (c) 2010 Nippon Telegraph and Telephone Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu-common.h"
> +#include "qemu-error.h"
> +#include "block.h"
> +#include "block_int.h"
> +#include "ioport.h"
> +#include "osdep.h"
> +#include "sysemu.h"
> +#include "hw/hw.h"
> +#include "net.h"
> +#include "event-tap.h"
> +#include "trace.h"
> +
> +enum EVENT_TAP_STATE {
> +    EVENT_TAP_OFF,
> +    EVENT_TAP_ON,
> +    EVENT_TAP_SUSPEND,
> +    EVENT_TAP_FLUSH,
> +    EVENT_TAP_LOAD,
> +    EVENT_TAP_REPLAY,
> +};
> +
> +static enum EVENT_TAP_STATE event_tap_state = EVENT_TAP_OFF;
> +
> +typedef struct EventTapIOport {
> +    uint32_t address;
> +    uint32_t data;
> +    int      index;
> +} EventTapIOport;
> +
> +#define MMIO_BUF_SIZE 8
> +
> +typedef struct EventTapMMIO {
> +    uint64_t address;
> +    uint8_t  buf[MMIO_BUF_SIZE];
> +    int      len;
> +} EventTapMMIO;
> +
> +typedef struct EventTapNetReq {
> +    char *device_name;
> +    int iovcnt;
> +    int vlan_id;
> +    bool vlan_needed;
> +    bool async;
> +    struct iovec *iov;
> +    NetPacketSent *sent_cb;
> +} EventTapNetReq;
> +
> +#define MAX_BLOCK_REQUEST 32
> +
> +typedef struct EventTapAIOCB EventTapAIOCB;
> +
> +typedef struct EventTapBlkReq {
> +    char *device_name;
> +    int num_reqs;
> +    int num_cbs;
> +    bool is_flush;
> +    BlockRequest reqs[MAX_BLOCK_REQUEST];
> +    EventTapAIOCB *acb[MAX_BLOCK_REQUEST];
> +} EventTapBlkReq;
> +
> +#define EVENT_TAP_IOPORT (1 << 0)
> +#define EVENT_TAP_MMIO   (1 << 1)
> +#define EVENT_TAP_NET    (1 << 2)
> +#define EVENT_TAP_BLK    (1 << 3)
> +
> +#define EVENT_TAP_TYPE_MASK (EVENT_TAP_NET - 1)
> +
> +typedef struct EventTapLog {
> +    int mode;
> +    union {
> +        EventTapIOport ioport;
> +        EventTapMMIO mmio;
> +    };
> +    union {
> +        EventTapNetReq net_req;
> +        EventTapBlkReq blk_req;
> +    };
> +    QTAILQ_ENTRY(EventTapLog) node;
> +} EventTapLog;
> +
> +struct EventTapAIOCB {
> +    BlockDriverAIOCB common;
> +    BlockDriverAIOCB *acb;
> +    bool is_canceled;
> +};
> +
> +static EventTapLog *last_event_tap;
> +
> +static QTAILQ_HEAD(, EventTapLog) event_list;
> +static QTAILQ_HEAD(, EventTapLog) event_pool;
> +
> +static int (*event_tap_cb)(void);
> +static QEMUBH *event_tap_bh;
> +static VMChangeStateEntry *vmstate;
> +
> +static void event_tap_bh_cb(void *p)
> +{
> +    if (event_tap_cb) {
> +        eve

Re: problem about blocked monitor when disk image on NFS can not be reached.

2011-03-02 Thread ya su
Hi, all:

The io_thread backtrace is as follows:
#0  0x7f3086eaa034 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7f3086ea5345 in _L_lock_870 () from /lib64/libpthread.so.0
#2  0x7f3086ea5217 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00436018 in kvm_mutex_lock () at
/root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1730
#4  qemu_mutex_lock_iothread () at
/root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1744
#5  0x0041ca67 in main_loop_wait (nonblocking=)
at /root/rpmbuild/BUILD/qemu-kvm-0.14/vl.c:1377
#6  0x004363e7 in kvm_main_loop () at
/root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1589
#7  0x0041dc3a in main_loop (argc=,
argv=,
envp=) at /root/rpmbuild/BUILD/qemu-kvm-0.14/vl.c:1429
#8  main (argc=, argv=,
envp=)
at /root/rpmbuild/BUILD/qemu-kvm-0.14/vl.c:3201

The cpu thread backtrace:
#0  0x7f3084dff093 in select () from /lib64/libc.so.6
#1  0x004453ea in qemu_aio_wait () at aio.c:193
#2  0x00444175 in bdrv_write_em (bs=0x1ec3090, sector_num=2009871,
buf=0x7f3087532800
"F\b\200u\022\366F$\004u\fPV\350\226\367\377\377\003Ft\353\fPV\350\212\367\377\377\353\003\213Ft^]\302\b",
nb_sectors=16) at block.c:2577
#3  0x0059ca13 in ide_sector_write (s=0x215f508) at
/root/rpmbuild/BUILD/qemu-kvm-0.14/hw/ide/core.c:574
#4  0x00438ced in kvm_handle_io (env=0x202ef60) at
/root/rpmbuild/BUILD/qemu-kvm-0.14/kvm-all.c:821
#5  kvm_run (env=0x202ef60) at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:617
#6  0x00438e09 in kvm_cpu_exec (env=)
at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1233
#7  0x0043a0f7 in kvm_main_loop_cpu (_env=0x202ef60)
at /root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1419
#8  ap_main_loop (_env=0x202ef60) at
/root/rpmbuild/BUILD/qemu-kvm-0.14/qemu-kvm.c:1466
#9  0x7f3086ea37e1 in start_thread () from /lib64/libpthread.so.0
#10 0x7f3084e0653d in clone () from /lib64/libc.so.6

The aio_thread backtrace:
#0  0x7f3086eaae83 in pwrite64 () from /lib64/libpthread.so.0
#1  0x00447501 in handle_aiocb_rw_linear (aiocb=0x21cff10,
buf=0x7f3087532800
"F\b\200u\022\366F$\004u\fPV\350\226\367\377\377\003Ft\353\fPV\350\212\367\377\377\353\003\213Ft^]\302\b")
at posix-aio-compat.c:212
#2  0x00447d48 in handle_aiocb_rw (unused=) at posix-aio-compat.c:247
#3  aio_thread (unused=) at posix-aio-compat.c:341
#4  0x7f3086ea37e1 in start_thread () from /lib64/libpthread.so.0
#5  0x7f3084e0653d in clone () from /lib64/libc.so.6

I think the io_thread is blocked by the cpu thread, which takes
qemu_mutex first; the cpu thread is waiting for the aio_thread's result
in the qemu_aio_wait function, and the aio_thread spends a long time in
pwrite64, about 5-10s, before returning an error (it seems like a
non-blocking timeout call). Only after that does the io_thread get a
chance to receive monitor input, so the monitor appears blocked
frequently. In this situation, if I stop the VM, the monitor responds
faster.

The problem is caused by the unavailability of the storage behind the
block layer; the block layer processes the I/O error in the normal way,
reporting the error to the IDE device, where it is handled in
ide_sector_write. The root cause is that the monitor's input and the I/O
operation (the pwrite call) must execute serialized (under the
qemu_mutex lock), so a long blocking pwrite stalls monitor input.
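
To illustrate the serialization (a self-contained toy, not QEMU code;
all names below are invented), the pattern is simply two threads
contending for one global lock while the holder sleeps in slow I/O:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* stands in for the cpu thread: holds the global lock across a
 * synchronous emulated disk write (the pwrite64 stuck on NFS above) */
static void *vcpu_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&global_lock);
    fprintf(stderr, "vcpu: synchronous 'disk write' in progress...\n");
    sleep(5);                      /* the 5-10s pwrite64 stall */
    pthread_mutex_unlock(&global_lock);
    return NULL;
}

/* stands in for the io_thread: cannot dispatch monitor input until
 * the lock is released */
static void *monitor_thread(void *arg)
{
    (void)arg;
    fprintf(stderr, "monitor: waiting for the global lock...\n");
    pthread_mutex_lock(&global_lock);
    fprintf(stderr, "monitor: input handled only now\n");
    pthread_mutex_unlock(&global_lock);
    return NULL;
}

int main(void)
{
    pthread_t vcpu, mon;
    pthread_create(&vcpu, NULL, vcpu_thread, NULL);
    sleep(1);                      /* let the vcpu take the lock first */
    pthread_create(&mon, NULL, monitor_thread, NULL);
    pthread_join(vcpu, NULL);
    pthread_join(mon, NULL);
    return 0;
}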

As Stefan says, it seems difficult to take monitor input out of that
protection; for now I will stop the VM if the disk image cannot be
reached.


2011/3/1 Avi Kivity :
> On 03/01/2011 05:01 PM, Stefan Hajnoczi wrote:
>>
>> On Tue, Mar 1, 2011 at 12:39 PM, ya su  wrote:
>> >      How about moving kvm_handle_io/handle_mmio out of the kvm_run
>> >  function and into kvm_main_loop? As these are I/O operations, this
>> >  would remove the qemu_mutex contention between the 2 threads. Is this
>> >  a reasonable thought?
>> >
>> >      In order to keep the monitor responding to the user quickly in
>> >  this situation, an easier way is to take monitor I/O out of qemu_mutex
>> >  protection. This includes the vnc/serial/telnet I/O related to the
>> >  monitor; as this I/O does not affect the running of the VM itself, it
>> >  does not need such strict protection.
>>
>> The qemu_mutex protects all QEMU global state.  The monitor does some
>> I/O and parsing which is not necessarily global state but once it
>> begins actually performing the command you sent, access to global
>> state will be required (pretty much any monitor command will operate
>> on global state).
>>
>> I think there are two options for handling NFS hangs:
>> 1. Ensure that QEMU is never put to sleep by NFS for disk images.  The
>> guest continues executing, may time out and notice that storage is
>> unavailable.
>
> That's the NFS soft mount option.
>
>> 2.

Re: problem about blocked monitor when disk image on NFS can not be reached.

2011-03-01 Thread ya su
First, sorry for the same mail being sent more than once; I didn't
know it would take so long to come back.

Hi, Stefan:

Thank you for your explanation.

How about moving kvm_handle_io/handle_mmio out of the kvm_run
function and into kvm_main_loop? As these are I/O operations, this
would remove the qemu_mutex contention between the 2 threads. Is this a
reasonable thought?

In order to keep the monitor responding to the user quickly in
this situation, an easier way is to take monitor I/O out of qemu_mutex
protection. This includes the vnc/serial/telnet I/O related to the
monitor; as this I/O does not affect the running of the VM itself, it
does not need such strict protection.

Any suggestions? thanks.

Green.


2011/3/1 Stefan Hajnoczi :
> On Tue, Mar 1, 2011 at 5:01 AM, ya su  wrote:
>>   KVM starts with its disk image on an NFS server; when the NFS server
>> cannot be reached, the monitor is blocked. I changed the io_thread to the
>> SCHED_RR policy; it still runs haltingly, waiting for disk read/write timeouts.
>
> There are some synchronous disk image reads that can put qemu-kvm to
> sleep until NFS responds or errors.  For example, when starting
> hw/virtio-blk.c calls bdrv_guess_geometry() which may invoke
> bdrv_read().
>
> Once the VM is running and you're using virtio-blk then disk I/O
> should be asynchronous.  There are some synchronous cases to do with
> migration, snapshotting, etc where we wait for outstanding aio
> requests.  Again this can block qemu-kvm.
>
> So in short, there's no easy way to avoid blocking the VM in all cases
> today.  You should find, however, that normal read/write operation to
> a running VM does not cause qemu-kvm to sleep.
>
> Stefan
>


problem about blocked monitor when disk image on NFS can not be reached.

2011-02-28 Thread ya su
Hi, all:

   KVM starts with its disk image on an NFS server; when the NFS server
cannot be reached, the monitor is blocked. I changed the io_thread to
the SCHED_RR policy; it still runs haltingly, waiting for disk
read/write timeouts.

I have tested a standalone thread to process kvm_handle_io; it cannot
start up correctly, as it may need qemu_mutex protection.

As the io_thread processes various I/O tasks, is it possible to move
the kvm_handle_io and handle_mmio functions into this thread? But the
problem would still remain: the monitor would still be blocked by disk
read/write requests.

Does anyone have a good suggestion? Thanks.

Green.


Fwd: problem about blocked monitor when disk image on NFS can not be reached.

2011-02-28 Thread ya su
I have tested a standalone thread to process kvm_handle_io; it cannot
start up correctly, as this function may need qemu_mutex protection.

As the io_thread processes various I/O tasks, is it possible to move
the kvm_handle_io and handle_mmio functions into this thread? But the
problem would still remain: the monitor would still be blocked by disk
read/write requests.

Does anyone have a good suggestion? Thanks.

Green.


-- Forwarded message --
From: ya su 
Date: 2011/2/28
Subject: problem about blocked monitor when disk image on NFS can not
be reached.
To: kvm@vger.kernel.org


Hi:

   KVM starts with its disk image on an NFS server; when the NFS server
cannot be reached, the monitor is blocked. I changed the io_thread to
the SCHED_RR policy; it still runs haltingly, waiting for disk
read/write timeouts.

  I think one solution is to run kvm_handle_io in a separate thread: I
would put kvm_handle_io in a newly spawned thread, with all I/O requests
passed through a queue between the io_thread and the new thread. This
requires copying run->io.size * run->io.count bytes from address
(uint8_t *)run + run->io.data_offset.

  Is this the right direction? Any suggestion is welcome, thanks!

Green.


problem about blocked monitor when disk image on NFS can not be reached.

2011-02-28 Thread ya su
Hi:

   KVM starts with its disk image on an NFS server; when the NFS server
cannot be reached, the monitor is blocked. I changed the io_thread to
the SCHED_RR policy; it still runs haltingly, waiting for disk
read/write timeouts.

  I think one solution is to run kvm_handle_io in a separate thread: I
would put kvm_handle_io in a newly spawned thread, with all I/O requests
passed through a queue between the io_thread and the new thread. This
requires copying run->io.size * run->io.count bytes from address
(uint8_t *)run + run->io.data_offset; a sketch of this copy follows
below.
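
A sketch of the copy I mean (it assumes only <linux/kvm.h>; the queue
entry type and helper name are invented for illustration):

#include <linux/kvm.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical queue entry handed from the io_thread to the new thread */
struct io_req {
    struct io_req *next;
    uint8_t  direction;            /* KVM_EXIT_IO_IN / KVM_EXIT_IO_OUT */
    uint8_t  size;                 /* bytes per access: 1, 2 or 4 */
    uint16_t port;
    uint32_t count;
    uint8_t  data[];               /* size * count bytes from the run area */
};

static struct io_req *io_req_from_run(struct kvm_run *run)
{
    size_t len = (size_t)run->io.size * run->io.count;
    struct io_req *r = malloc(sizeof(*r) + len);

    if (!r)
        return NULL;
    r->next      = NULL;
    r->direction = run->io.direction;
    r->size      = run->io.size;
    r->port      = run->io.port;
    r->count     = run->io.count;
    /* the data lives inside the mmap'ed kvm_run page, at data_offset */
    memcpy(r->data, (uint8_t *)run + run->io.data_offset, len);
    return r;
}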

  Is this the right direction? Any suggestion is welcome, thanks!

Green.