[PATCH v2] KVM: x86 emulator: emulate RETF imm

2013-09-09 Thread Bruce Rogers
Opcode CA

This gets used by a DOS based NetWare guest.

Signed-off-by: Bruce Rogers 
---
 arch/x86/kvm/emulate.c |   14 +-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 2bc1e81..ddc3f3d 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2025,6 +2025,17 @@ static int em_ret_far(struct x86_emulate_ctxt *ctxt)
 	return rc;
 }
 
+static int em_ret_far_imm(struct x86_emulate_ctxt *ctxt)
+{
+	int rc;
+
+	rc = em_ret_far(ctxt);
+	if (rc != X86EMUL_CONTINUE)
+		return rc;
+	rsp_increment(ctxt, ctxt->src.val);
+	return X86EMUL_CONTINUE;
+}
+
 static int em_cmpxchg(struct x86_emulate_ctxt *ctxt)
 {
 	/* Save real source value, then compare EAX against destination. */
@@ -3763,7 +3774,8 @@ static const struct opcode opcode_table[256] = {
 	G(ByteOp, group11), G(0, group11),
 	/* 0xC8 - 0xCF */
 	I(Stack | SrcImmU16 | Src2ImmByte, em_enter), I(Stack, em_leave),
-	N, I(ImplicitOps | Stack, em_ret_far),
+	I(ImplicitOps | Stack | SrcImmU16, em_ret_far_imm),
+	I(ImplicitOps | Stack, em_ret_far),
 	D(ImplicitOps), DI(SrcImmByte, intn),
 	D(ImplicitOps | No64), II(ImplicitOps, em_iret, iret),
 	/* 0xD0 - 0xD7 */
-- 
1.7.7

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
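
For readers unfamiliar with opcode CA, the following is a minimal sketch of
what RETF imm16 does with a 16-bit operand size in real mode: it performs a
plain far return (pop IP, then CS) and then releases imm16 extra bytes of
stack. The toy_* names and the tiny stack model are assumptions made for
illustration only; they are not the KVM emulator's structures or helpers.

```c
#include <stdint.h>
#include <string.h>

/* Toy model of a real-mode CPU; NOT the KVM emulator's data structures. */
struct toy_cpu {
	uint16_t ip, cs, sp;
	uint8_t stack[64];	/* stack segment contents, indexed by SP */
};

/* Push one 16-bit little-endian word. */
static void toy_push16(struct toy_cpu *cpu, uint16_t v)
{
	cpu->sp -= 2;
	cpu->stack[cpu->sp] = (uint8_t)(v & 0xff);
	cpu->stack[cpu->sp + 1] = (uint8_t)(v >> 8);
}

/* Pop one 16-bit little-endian word. */
static uint16_t toy_pop16(struct toy_cpu *cpu)
{
	uint16_t v = (uint16_t)(cpu->stack[cpu->sp] |
				(cpu->stack[cpu->sp + 1] << 8));
	cpu->sp += 2;
	return v;
}

/* RETF (opcode CB): pop the return IP, then the return CS. */
static void toy_retf(struct toy_cpu *cpu)
{
	cpu->ip = toy_pop16(cpu);
	cpu->cs = toy_pop16(cpu);
}

/* RETF imm16 (opcode CA): plain far return, then drop imm bytes of stack. */
static void toy_retf_imm(struct toy_cpu *cpu, uint16_t imm)
{
	toy_retf(cpu);
	cpu->sp += imm;
}

/*
 * Reserve 10 bytes, simulate a far call (push CS, then IP), and return
 * with RETF 10. Returns the final SP, which should equal the starting SP.
 */
uint16_t toy_demo(void)
{
	struct toy_cpu cpu;

	memset(&cpu, 0, sizeof(cpu));
	cpu.sp = 32;
	cpu.sp -= 10;			/* sub $10, %sp */
	toy_push16(&cpu, 0x0000);	/* far call pushes CS ... */
	toy_push16(&cpu, 0x1234);	/* ... then the return IP */
	toy_retf_imm(&cpu, 10);
	return cpu.sp;
}
```

The structure mirrors the v2 patch: the immediate form is the existing far
return followed by a single stack-pointer adjustment.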


Re: [PATCH] KVM: x86 emulator: emulate RETF imm

2013-09-09 Thread Bruce Rogers
 >>> On 9/9/2013 at 07:10 AM, Gleb Natapov  wrote: 
> On Mon, Sep 09, 2013 at 07:09:15AM -0600, Bruce Rogers wrote:
>>  >>> On 9/8/2013 at 07:13 AM, Gleb Natapov  wrote: 
>> > On Tue, Sep 03, 2013 at 01:42:09PM -0600, Bruce Rogers wrote:
>> >> Opcode CA
>> >> 
>> >> This gets used by a DOS based NetWare guest.
>> >> 
>> >> Signed-off-by: Bruce Rogers 
>> >> ---
>> >>  arch/x86/kvm/emulate.c |   23 ++-
>> >>  1 files changed, 22 insertions(+), 1 deletions(-)
>> >> 
>> >> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> >> index 2bc1e81..aee238a 100644
>> >> --- a/arch/x86/kvm/emulate.c
>> >> +++ b/arch/x86/kvm/emulate.c
>> >> @@ -2025,6 +2025,26 @@ static int em_ret_far(struct x86_emulate_ctxt *ctxt)
>> >>  	return rc;
>> >>  }
>> >>  
>> >> +static int em_ret_far_imm(struct x86_emulate_ctxt *ctxt)
>> >> +{
>> >> +	int rc;
>> >> +	unsigned long cs;
>> >> +
>> >> +	rc = emulate_pop(ctxt, &ctxt->_eip, ctxt->op_bytes);
>> >> +	if (rc != X86EMUL_CONTINUE)
>> >> +		return rc;
>> >> +	if (ctxt->op_bytes == 4)
>> >> +		ctxt->_eip = (u32)ctxt->_eip;
>> >> +	rc = emulate_pop(ctxt, &cs, ctxt->op_bytes);
>> >> +	if (rc != X86EMUL_CONTINUE)
>> >> +		return rc;
>> >> +	rc = load_segment_descriptor(ctxt, (u16)cs, VCPU_SREG_CS);
>> >> +	if (rc != X86EMUL_CONTINUE)
>> >> +		return rc;
>> >> +	rsp_increment(ctxt, ctxt->src.val);
>> >> +	return X86EMUL_CONTINUE;
>> >> +}
>> >> +
>> > Why not:
>> > 
>> > static int em_ret_far_imm(struct x86_emulate_ctxt *ctxt)
>> > {
>> > 	int rc;
>> > 	rc = em_ret_far(ctxt);
>> > 	if (rc != X86EMUL_CONTINUE)
>> > 		return rc;
>> > 	rsp_increment(ctxt, ctxt->src.val);
>> > 	return X86EMUL_CONTINUE;
>> > }
>> > 
>> > --
>> >Gleb.
>> 
>> Yes, that does seem better. Ack.
>> 
> Somebody still needs to write a proper patch :) Can you do it please?

Sure, will do.

Bruce




Re: [PATCH] KVM: x86 emulator: emulate RETF imm

2013-09-09 Thread Bruce Rogers
 >>> On 9/8/2013 at 07:13 AM, Gleb Natapov  wrote: 
> On Tue, Sep 03, 2013 at 01:42:09PM -0600, Bruce Rogers wrote:
>> Opcode CA
>> 
>> This gets used by a DOS based NetWare guest.
>> 
>> Signed-off-by: Bruce Rogers 
>> ---
>>  arch/x86/kvm/emulate.c |   23 ++-
>>  1 files changed, 22 insertions(+), 1 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 2bc1e81..aee238a 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -2025,6 +2025,26 @@ static int em_ret_far(struct x86_emulate_ctxt *ctxt)
>>  	return rc;
>>  }
>>  
>> +static int em_ret_far_imm(struct x86_emulate_ctxt *ctxt)
>> +{
>> +	int rc;
>> +	unsigned long cs;
>> +
>> +	rc = emulate_pop(ctxt, &ctxt->_eip, ctxt->op_bytes);
>> +	if (rc != X86EMUL_CONTINUE)
>> +		return rc;
>> +	if (ctxt->op_bytes == 4)
>> +		ctxt->_eip = (u32)ctxt->_eip;
>> +	rc = emulate_pop(ctxt, &cs, ctxt->op_bytes);
>> +	if (rc != X86EMUL_CONTINUE)
>> +		return rc;
>> +	rc = load_segment_descriptor(ctxt, (u16)cs, VCPU_SREG_CS);
>> +	if (rc != X86EMUL_CONTINUE)
>> +		return rc;
>> +	rsp_increment(ctxt, ctxt->src.val);
>> +	return X86EMUL_CONTINUE;
>> +}
>> +
> Why not:
> 
> static int em_ret_far_imm(struct x86_emulate_ctxt *ctxt)
> {
> 	int rc;
> 	rc = em_ret_far(ctxt);
> 	if (rc != X86EMUL_CONTINUE)
> 		return rc;
> 	rsp_increment(ctxt, ctxt->src.val);
> 	return X86EMUL_CONTINUE;
> }
> 
> --
>   Gleb.

Yes, that does seem better. Ack.

Bruce



[PATCH kvm-unit-tests] realmode: test RETF imm

2013-09-04 Thread Bruce Rogers
Signed-off-by: Bruce Rogers 
---
 x86/realmode.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/x86/realmode.c b/x86/realmode.c
index 3546771..c57e033 100644
--- a/x86/realmode.c
+++ b/x86/realmode.c
@@ -481,6 +481,9 @@ void test_io(void)
 asm ("retf: lretw");
 extern void retf();
 
+asm ("retf_imm: lretw $10");
+extern void retf_imm();
+
 void test_call(void)
 {
 	u32 esp[16];
@@ -503,6 +506,7 @@ void test_call(void)
 	MK_INSN(call_far1,  "lcallw *(%ebx)\n\t");
 	MK_INSN(call_far2,  "lcallw $0, $retf\n\t");
 	MK_INSN(ret_imm,    "sub $10, %sp; jmp 2f; 1: retw $10; 2: callw 1b");
+	MK_INSN(retf_imm,   "sub $10, %sp; lcallw $0, $retf_imm");
 
 	exec_in_big_real_mode(&insn_call1);
 	report("call 1", R_AX, outregs.eax == 0x1234);
@@ -523,6 +527,9 @@ void test_call(void)
 
 	exec_in_big_real_mode(&insn_ret_imm);
 	report("ret imm 1", 0, 1);
+
+	exec_in_big_real_mode(&insn_retf_imm);
+	report("retf imm 1", 0, 1);
 }
 
 void test_jcc_short(void)
-- 
1.7.7



[PATCH] KVM: x86 emulator: emulate RETF imm

2013-09-03 Thread Bruce Rogers
Opcode CA

This gets used by a DOS based NetWare guest.

Signed-off-by: Bruce Rogers 
---
 arch/x86/kvm/emulate.c |   23 ++-
 1 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 2bc1e81..aee238a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2025,6 +2025,26 @@ static int em_ret_far(struct x86_emulate_ctxt *ctxt)
 	return rc;
 }
 
+static int em_ret_far_imm(struct x86_emulate_ctxt *ctxt)
+{
+	int rc;
+	unsigned long cs;
+
+	rc = emulate_pop(ctxt, &ctxt->_eip, ctxt->op_bytes);
+	if (rc != X86EMUL_CONTINUE)
+		return rc;
+	if (ctxt->op_bytes == 4)
+		ctxt->_eip = (u32)ctxt->_eip;
+	rc = emulate_pop(ctxt, &cs, ctxt->op_bytes);
+	if (rc != X86EMUL_CONTINUE)
+		return rc;
+	rc = load_segment_descriptor(ctxt, (u16)cs, VCPU_SREG_CS);
+	if (rc != X86EMUL_CONTINUE)
+		return rc;
+	rsp_increment(ctxt, ctxt->src.val);
+	return X86EMUL_CONTINUE;
+}
+
 static int em_cmpxchg(struct x86_emulate_ctxt *ctxt)
 {
 	/* Save real source value, then compare EAX against destination. */
@@ -3763,7 +3783,8 @@ static const struct opcode opcode_table[256] = {
 	G(ByteOp, group11), G(0, group11),
 	/* 0xC8 - 0xCF */
 	I(Stack | SrcImmU16 | Src2ImmByte, em_enter), I(Stack, em_leave),
-	N, I(ImplicitOps | Stack, em_ret_far),
+	I(ImplicitOps | Stack | SrcImmU16, em_ret_far_imm),
+	I(ImplicitOps | Stack, em_ret_far),
 	D(ImplicitOps), DI(SrcImmByte, intn),
 	D(ImplicitOps | No64), II(ImplicitOps, em_iret, iret),
 	/* 0xD0 - 0xD7 */
-- 
1.7.7



Re: [Qemu-devel] vm performance degradation after kvm live migration or save-restore with EPT enabled

2013-07-11 Thread Bruce Rogers
 >>> On 7/11/2013 at 03:36 AM, "Zhanghaoyu (A)"  wrote: 
> hi all,
> 
> I hit a problem similar to the ones reported below while running live
> migration or save-restore tests on a KVM platform (qemu: 1.4.0, host: suse11sp2,
> guest: suse11sp2) with a telecommunication software suite running in the guest:
> https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
> http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
> http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
> https://bugzilla.kernel.org/show_bug.cgi?id=58771
> 
> After live migration or virsh restore [savefile], one process's CPU
> utilization went up by about 30%, which degraded the throughput of this
> process.
> oprofile report on this process in guest,
> pre live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples  %        app name         symbol name
> 248      12.3016  no-vmlinux       (no symbols)
> 78        3.8690  libc.so.6        memset
> 68        3.3730  libc.so.6        memcpy
> 30        1.4881  cscf.scu         SipMmBufMemAlloc
> 29        1.4385  libpthread.so.0  pthread_mutex_lock
> 26        1.2897  cscf.scu         SipApiGetNextIe
> 25        1.2401  cscf.scu         DBFI_DATA_Search
> 20        0.9921  libpthread.so.0  __pthread_mutex_unlock_usercnt
> 16        0.7937  cscf.scu         DLM_FreeSlice
> 16        0.7937  cscf.scu         receivemessage
> 15        0.7440  cscf.scu         SipSmCopyString
> 14        0.6944  cscf.scu         DLM_AllocSlice
> 
> post live migration:
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples  %        app name         symbol name
> 1586     42.2370  libc.so.6        memcpy
> 271       7.2170  no-vmlinux       (no symbols)
> 83        2.2104  libc.so.6        memset
> 41        1.0919  libpthread.so.0  __pthread_mutex_unlock_usercnt
> 35        0.9321  cscf.scu         SipMmBufMemAlloc
> 29        0.7723  cscf.scu         DLM_AllocSlice
> 28        0.7457  libpthread.so.0  pthread_mutex_lock
> 23        0.6125  cscf.scu         SipApiGetNextIe
> 17        0.4527  cscf.scu         SipSmCopyString
> 16        0.4261  cscf.scu         receivemessage
> 15        0.3995  cscf.scu         SipcMsgStatHandle
> 14        0.3728  cscf.scu         Urilex
> 12        0.3196  cscf.scu         DBFI_DATA_Search
> 12        0.3196  cscf.scu         SipDsmGetHdrBitValInner
> 12        0.3196  cscf.scu         SipSmGetDataFromRefString
> 
> So memcpy costs many more CPU cycles after live migration. After I restarted
> the process, the problem disappeared. Save-restore shows a similar problem.
> 
> perf report on vcpu thread in host,
> pre live migration:
> Performance counter stats for thread id '21082':
> 
>              0 page-faults
>              0 minor-faults
>              0 major-faults
>          31616 cs
>            506 migrations
>              0 alignment-faults
>              0 emulation-faults
>     5075957539 L1-dcache-loads                                            [21.32%]
>      324685106 L1-dcache-load-misses     #  6.40% of all L1-dcache hits   [21.85%]
>     3681777120 L1-dcache-stores                                           [21.65%]
>       65251823 L1-dcache-store-misses    #  1.77%                         [22.78%]
>              0 L1-dcache-prefetches                                       [22.84%]
>              0 L1-dcache-prefetch-misses                                  [22.32%]
>     9321652613 L1-icache-loads                                            [22.60%]
>     1353418869 L1-icache-load-misses     # 14.52% of all L1-icache hits   [21.92%]
>      169126969 LLC-loads                                                  [21.87%]
>       12583605 LLC-load-misses           #  7.44% of all LL-cache hits    [ 5.84%]
>      132853447 LLC-stores                                                 [ 6.61%]
>       10601171 LLC-store-misses          #  7.9%                          [ 5.01%]
>       25309497 LLC-prefetches            #  30%                           [ 4.96%]
>        7723198 LLC-prefetch-misses                                        [ 6.04%]
>     4954075817 dTLB-loads                                                 [11.56%]
>       26753106 dTLB-load-misses          #  0.54% of all dTLB cache hits  [16.80%]
>     3553702874 dTLB-stores                                                [22.37%]
>        4720313 dTLB-store-misses         #  0.13%                         [21

Re: [Qemu-devel] qemu-kvm: remove "boot=on|off" drive parameter compatibility

2012-10-01 Thread Bruce Rogers
 >>> On 10/1/2012 at 07:19 AM, Anthony Liguori  wrote: 
> Jan Kiszka  writes:
> 
>> On 2012-10-01 11:31, Marcelo Tosatti wrote:
>>
>> It's not just about default configs. We need to validate if the
>> migration formats are truly compatible (qemu-kvm -> QEMU, the other way
>> around definitely not). For the command line switches, we could provide
>> a wrapper script that translates them into upstream format or simply
>> ignores them. That should be harmless to carry upstream.
> 
> qemu-kvm has:
> 
>  -no-kvm
>  -no-kvm-irqchip
>  -no-kvm-pit
>  -no-kvm-pit-reinjection
>  -tdf <- does nothing
> 
> There are replacements for all of the above.  If we need to add them to
> qemu.git, it's no big deal to add them.
> 
>  -drive ...,boot= <- this is ignored
> 
> cpu_set command for CPU hotplug which is known broken in qemu-kvm.
> 
> testdev which is nice but only used for development
> 
> Default nic is rtl8139 vs. e1000.
> 
> Some logic to change the default VGA RAM size to 16 MB for pc-1.2
> (QEMU uses 16 MB by default now too).
> 
> I think at this point, none of this matters but I added the various
> distro maintainers to the thread.
> 
> I think it's time for the distros to drop qemu-kvm and just ship
> qemu.git.  Is there anything else that needs to happen to make that
> switch?

We are seriously considering moving to qemu.git for our SP3 release of
SUSE SLES 11. There are just a handful of patches that provide the backwards
compatibility we need to maintain (default to kvm, default nic model,
vga ram size), so assuming there is a 100% commitment to fully supporting
kvm in qemu going forward (which I don't doubt) I think this is a good time
for us to make that switch.

Bruce



[PATCH] handle device help before accelerator set up

2012-08-08 Thread Bruce Rogers
A command line device probe using just -device "?" gets processed
after qemu-kvm initializes the accelerator. If /dev/kvm is not
present, the accelerator check will fail (kvm is defaulted to on),
which causes libvirt to not be set up to handle qemu guests.

Moving the device help handling before the accelerator set up allows
the device probe to work in this configuration and libvirt succeeds
in setting up for a qemu hypervisor mode.

Signed-off-by: Bruce Rogers 
---
 vl.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/vl.c b/vl.c
index 1a46d2d..5b75cf9 100644
--- a/vl.c
+++ b/vl.c
@@ -3380,6 +3380,9 @@ int main(int argc, char **argv, char **envp)
         ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
     }
 
+    if (qemu_opts_foreach(qemu_find_opts("device"), device_help_func, NULL, 0) != 0)
+        exit(0);
+
     configure_accelerator();
 
     qemu_init_cpu_loop();
@@ -3535,9 +3538,6 @@ int main(int argc, char **argv, char **envp)
     }
     select_vgahw(vga_model);
 
-    if (qemu_opts_foreach(qemu_find_opts("device"), device_help_func, NULL, 0) != 0)
-        exit(0);
-
     if (watchdog) {
         i = select_watchdog(watchdog);
         if (i > 0)
-- 
1.7.7
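
The ordering issue described above can be sketched with a toy startup
sequence. This is only an illustration under stated assumptions: the toy_*
functions, the result enum, and the boolean standing in for the presence of
/dev/kvm are hypothetical and are not QEMU's real startup code.

```c
#include <stdbool.h>
#include <string.h>

enum toy_result { TOY_HELP_PRINTED, TOY_ACCEL_FAILED, TOY_STARTED };

/* Hypothetical accelerator probe: fails when /dev/kvm is not available. */
static bool toy_configure_accelerator(bool kvm_available)
{
	return kvm_available;
}

/* True if the command line contains the device probe `-device ?`. */
static bool toy_wants_device_help(int argc, char **argv)
{
	for (int i = 1; i + 1 < argc; i++)
		if (strcmp(argv[i], "-device") == 0 &&
		    strcmp(argv[i + 1], "?") == 0)
			return true;
	return false;
}

/* help_first selects the patched order; otherwise the original order. */
static enum toy_result toy_startup(int argc, char **argv,
				   bool kvm_available, bool help_first)
{
	if (help_first && toy_wants_device_help(argc, argv))
		return TOY_HELP_PRINTED;	/* probe answered early */
	if (!toy_configure_accelerator(kvm_available))
		return TOY_ACCEL_FAILED;	/* probe never reached */
	if (!help_first && toy_wants_device_help(argc, argv))
		return TOY_HELP_PRINTED;
	return TOY_STARTED;
}

/* Convenience wrapper: runs the probe command line `qemu -device ?`. */
enum toy_result toy_probe(bool kvm_available, bool help_first)
{
	char *argv[] = { "qemu", "-device", "?" };

	return toy_startup(3, argv, kvm_available, help_first);
}
```

With the help check first, the probe succeeds whether or not the accelerator
can be set up; with the original order it fails as soon as the accelerator
check does.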



Re: [PATCH 1/2] kvm: kvmclock: apply kvmclock offset to guest wall clock time

2012-08-01 Thread Bruce Rogers
 >>> On 8/1/2012 at 02:21 PM, Marcelo Tosatti  wrote: 
> On Mon, Jul 23, 2012 at 09:44:54PM -0300, Marcelo Tosatti wrote:
>> On Fri, Jul 20, 2012 at 10:44:24AM -0600, Bruce Rogers wrote:
>> > When a guest migrates to a new host, the system time difference from the
>> > previous host is used in the updates to the kvmclock system time visible
>> > to the guest, resulting in a continuation of correct kvmclock based guest
>> > timekeeping.
>> > 
>> > The wall clock component of the kvmclock provided time is currently not
>> > updated with this same time offset. Since the Linux guest caches the
>> > wall clock based time, this discrepancy is not noticed until the guest is
>> > rebooted. After reboot the guest's time calculations are off.
>> > 
>> > This patch adjusts the wall clock by the kvmclock_offset, resulting in
>> > correct guest time after a reboot.
>> > 
>> > Cc: Glauber Costa 
>> > Cc: Zachary Amsden 
>> > Signed-off-by: Bruce Rogers 
>> > ---
>> >  arch/x86/kvm/x86.c |4 
>> >  1 files changed, 4 insertions(+), 0 deletions(-)
>> > 
>> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> > index be6d549..14c290d 100644
>> > --- a/arch/x86/kvm/x86.c
>> > +++ b/arch/x86/kvm/x86.c
>> > @@ -907,6 +907,10 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
>> > 	 */
>> > 	getboottime(&boot);
>> >  
>> > +	if (kvm->arch.kvmclock_offset) {
>> > +		struct timespec ts = ns_to_timespec(kvm->arch.kvmclock_offset);
>> > +		boot = timespec_sub(boot, ts);
>> > +	}
>> 
>> kvmclock_offset is signed (both directions). Must check the sign and use
>> _sub and _add_safe accordingly.
> 
> Your patch is correct, sorry (applied to master).
> 
> Patch 2 still makes no sense.

I'm fine with dropping the second patch.

Thanks

Bruce
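
The sign question raised in the quoted review can be illustrated with a small
userspace model. It is a sketch that assumes the kernel helpers normalize
their results the way the toy_* stand-ins below do (tv_nsec always in
[0, 1e9)); the toy_* names and struct are hypothetical, not kernel API.

```c
#include <stdint.h>

#define TOY_NSEC_PER_SEC 1000000000LL

/* Userspace stand-in for struct timespec; nsec is kept in [0, 1e9). */
struct toy_timespec {
	int64_t sec;
	int64_t nsec;
};

/* Convert signed nanoseconds to a normalized sec/nsec pair. */
static struct toy_timespec toy_ns_to_timespec(int64_t ns)
{
	struct toy_timespec ts;

	ts.sec = ns / TOY_NSEC_PER_SEC;		/* truncates toward zero */
	ts.nsec = ns % TOY_NSEC_PER_SEC;	/* may be negative here */
	if (ts.nsec < 0) {
		ts.sec--;
		ts.nsec += TOY_NSEC_PER_SEC;
	}
	return ts;
}

/* a - b, with the nanosecond field renormalized after the subtraction. */
static struct toy_timespec toy_timespec_sub(struct toy_timespec a,
					    struct toy_timespec b)
{
	struct toy_timespec d = { a.sec - b.sec, a.nsec - b.nsec };

	if (d.nsec < 0) {
		d.sec--;
		d.nsec += TOY_NSEC_PER_SEC;
	}
	return d;
}

/* Same shape as the patch: adjust boot time by a signed offset. */
struct toy_timespec toy_adjust_boot(struct toy_timespec boot,
				    int64_t kvmclock_offset_ns)
{
	if (kvmclock_offset_ns)
		boot = toy_timespec_sub(boot,
					toy_ns_to_timespec(kvmclock_offset_ns));
	return boot;
}
```

Under that assumption a single subtraction covers both directions: a boot
time of 1000.5 s becomes 998.25 s for an offset of +2.25 s and 1002.75 s for
an offset of -2.25 s, with no separate branch on the sign.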



[PATCH 0/2] kvm: kvmclock: fix kvmclock reboot after migrate issues

2012-07-20 Thread Bruce Rogers
When a Linux guest live migrates to a new host and subsequently
reboots, the guest no longer has the correct time. This is due
to a failure to apply the kvmclock offset to the wall clock time.

The first patch addresses this failure directly, while the second
patch detects when the offset is no longer needed, and zeroes the
offset as a matter of cleaning up migration state which is no longer
relevant. Both patches address the issue, but in different ways. 


Bruce Rogers (2):
  kvm: kvmclock: apply kvmclock offset to guest wall clock time
  kvm: kvmclock: eliminate kvmclock offset when time page count goes to
zero

 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/x86.c  |9 -
 2 files changed, 9 insertions(+), 1 deletions(-)

-- 
1.7.7



[PATCH 2/2] kvm: kvmclock: eliminate kvmclock offset when time page count goes to zero

2012-07-20 Thread Bruce Rogers
When a guest is migrated, a time offset is generated in order to
maintain the correct kvmclock based time for the guest. Detect when
all kvmclock time pages are deleted so that the kvmclock offset can
be safely reset to zero.

Cc: Glauber Costa 
Cc: Zachary Amsden 
Signed-off-by: Bruce Rogers 
---
 arch/x86/include/asm/kvm_host.h |1 +
 arch/x86/kvm/x86.c  |5 -
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index db7c1f2..112415c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -524,6 +524,7 @@ struct kvm_arch {
 
 	unsigned long irq_sources_bitmap;
 	s64 kvmclock_offset;
+	unsigned int n_time_pages;
 	raw_spinlock_t tsc_write_lock;
 	u64 last_tsc_nsec;
 	u64 last_tsc_write;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 14c290d..350c51b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1511,6 +1511,8 @@ static void kvmclock_reset(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.time_page) {
 		kvm_release_page_dirty(vcpu->arch.time_page);
 		vcpu->arch.time_page = NULL;
+		if (--vcpu->kvm->arch.n_time_pages == 0)
+			vcpu->kvm->arch.kvmclock_offset = 0;
 	}
 }
 
@@ -1624,7 +1626,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		if (is_error_page(vcpu->arch.time_page)) {
 			kvm_release_page_clean(vcpu->arch.time_page);
 			vcpu->arch.time_page = NULL;
-		}
+		} else
+			vcpu->kvm->arch.n_time_pages++;
 		break;
 	}
 	case MSR_KVM_ASYNC_PF_EN:
-- 
1.7.7
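
The counting scheme in this patch can be sketched in isolation as follows.
This is a simplified model, not the kernel code: the toy_* structs and
functions are hypothetical, and the real registration path in
kvm_set_msr_common() has more steps than shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the per-VM and per-vCPU state touched by the patch. */
struct toy_vm {
	int64_t kvmclock_offset;	/* offset carried over from migration */
	unsigned int n_time_pages;	/* vCPUs with a registered time page */
};

struct toy_vcpu {
	struct toy_vm *vm;
	bool has_time_page;
};

/* Guest registers a kvmclock time page for this vCPU. */
static void toy_register_time_page(struct toy_vcpu *vcpu)
{
	if (!vcpu->has_time_page) {
		vcpu->has_time_page = true;
		vcpu->vm->n_time_pages++;
	}
}

/* vCPU drops its time page; the last one out clears the offset. */
static void toy_release_time_page(struct toy_vcpu *vcpu)
{
	if (vcpu->has_time_page) {
		vcpu->has_time_page = false;
		if (--vcpu->vm->n_time_pages == 0)
			vcpu->vm->kvmclock_offset = 0;
	}
}

/*
 * Two vCPUs register time pages, then `releases` of them (0..2) release
 * their page again. Returns the offset that remains afterwards.
 */
int64_t toy_offset_after_releases(int releases)
{
	struct toy_vm vm = { .kvmclock_offset = 5000, .n_time_pages = 0 };
	struct toy_vcpu vcpus[2] = { { &vm, false }, { &vm, false } };

	for (int i = 0; i < 2; i++)
		toy_register_time_page(&vcpus[i]);
	for (int i = 0; i < releases && i < 2; i++)
		toy_release_time_page(&vcpus[i]);
	return vm.kvmclock_offset;
}
```

The offset survives as long as any vCPU still has a time page and is cleared
only when the count reaches zero.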



[PATCH 1/2] kvm: kvmclock: apply kvmclock offset to guest wall clock time

2012-07-20 Thread Bruce Rogers
When a guest migrates to a new host, the system time difference from the
previous host is used in the updates to the kvmclock system time visible
to the guest, resulting in a continuation of correct kvmclock based guest
timekeeping.

The wall clock component of the kvmclock provided time is currently not
updated with this same time offset. Since the Linux guest caches the
wall clock based time, this discrepancy is not noticed until the guest is
rebooted. After reboot the guest's time calculations are off.

This patch adjusts the wall clock by the kvmclock_offset, resulting in
correct guest time after a reboot.

Cc: Glauber Costa 
Cc: Zachary Amsden 
Signed-off-by: Bruce Rogers 
---
 arch/x86/kvm/x86.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index be6d549..14c290d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -907,6 +907,10 @@ static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock)
 	 */
 	getboottime(&boot);
 
+	if (kvm->arch.kvmclock_offset) {
+		struct timespec ts = ns_to_timespec(kvm->arch.kvmclock_offset);
+		boot = timespec_sub(boot, ts);
+	}
 	wc.sec = boot.tv_sec;
 	wc.nsec = boot.tv_nsec;
 	wc.version = version;
-- 
1.7.7



Re: [stable] [PATCH 3/3][STABLE] KVM: add schedule check to napi_enable call

2010-06-03 Thread Bruce Rogers
 >>> On 6/3/2010 at 04:51 PM, Greg KH  wrote: 
> On Thu, Jun 03, 2010 at 04:17:34PM -0600, Bruce Rogers wrote:
>>  >>> On 6/3/2010 at 03:03 PM, Greg KH  wrote: 
>> > On Thu, Jun 03, 2010 at 01:38:31PM -0600, Bruce Rogers wrote:
>> >> virtio_net: Add schedule check to napi_enable call
>> >> Under harsh testing conditions, including low memory, the guest would
>> >> stop receiving packets. With this patch applied we no longer see any
>> >> problems in the driver while performing these tests for extended 
> periods
>> >> of time.
>> >> 
>> >> Make sure napi is scheduled subsequent to each napi_enable.
>> >> 
>> >> Signed-off-by: Bruce Rogers 
>> >> Signed-off-by: Olaf Kirch 
>> > 
>> > I need a git commit id for this one as well.
>> > 
>> 
>> This one is not upstream.
> 
> Then I can't include it in the -stable tree, so why are you sending it
> to me?  :)
> 
> thanks,
> 
> greg k-h

Good point!
Sorry about the confusion.
Bruce



Re: [stable] [PATCH 3/3][STABLE] KVM: add schedule check to napi_enable call

2010-06-03 Thread Bruce Rogers
 >>> On 6/3/2010 at 03:03 PM, Greg KH  wrote: 
> On Thu, Jun 03, 2010 at 01:38:31PM -0600, Bruce Rogers wrote:
>> virtio_net: Add schedule check to napi_enable call
>> Under harsh testing conditions, including low memory, the guest would
>> stop receiving packets. With this patch applied we no longer see any
>> problems in the driver while performing these tests for extended periods
>> of time.
>> 
>> Make sure napi is scheduled subsequent to each napi_enable.
>> 
>> Signed-off-by: Bruce Rogers 
>> Signed-off-by: Olaf Kirch 
> 
> I need a git commit id for this one as well.
> 

This one is not upstream.

Bruce



Re: [stable] [PATCH 2/3][STABLE] KVM: indicate oom if add_buf fails

2010-06-03 Thread Bruce Rogers
 >>> On 6/3/2010 at 03:02 PM, Greg KH  wrote: 

> 
> What is the git commit id of the upstream patch?
> 

9ab86bbcf8be755256f0a5e994e0b38af6b4d399
I grabbed this from:
git://git.kernel.org/pub/scm/virt/kvm/kvm.git

> I need that for all stable patches to be accepted, thanks.
> 
> Also, all KVM stuff needs to get acked by Avi, I can't take them until
> he says they are ok.

Understood.

> 
> Oh, and what -stable trees do you want these patches in?  .27, .32, .33,
> or .34?  I have a bunch of them going at the moment...

All 3 in 2.6.32, only #2 and #3 in 2.6.33, and only #3 in 2.6.34

Thanks,
Bruce



[PATCH 1/3][STABLE] KVM: fix delayed refill checking

2010-06-03 Thread Bruce Rogers
Please consider this for stable:

commit 39d321577405e8e269fd238b278aaf2425fa788a
Author: Herbert Xu 
Date:   Mon Jan 25 15:51:01 2010 -0800

virtio_net: Make delayed refill more reliable

I have seen RX stalls on a machine that experienced a suspected
OOM.  After the stall, the RX buffer is empty on the guest side
and there are exactly 16 entries available on the host side.  As
the number of entries is less than that required by a maximal
skb, the host cannot proceed.

The guest did not have a refill job scheduled.

My diagnosis is that an OOM had occurred, with the delayed refill
job scheduled.  The job was able to allocate at least one skb, but
not enough to overcome the minimum required by the host to proceed.

As the refill job would only reschedule itself if it failed completely
to allocate any skbs, this would lead to an RX stall.

The following patch removes this stall possibility by always
rescheduling the refill job until the ring is totally refilled.

Testing has shown that the RX stall no longer occurs whereas
previously it would occur within a day.

Signed-off-by: Herbert Xu 
Acked-by: Rusty Russell 
Signed-off-by: David S. Miller 

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c708ecc..9ead30b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -395,8 +395,7 @@ static void refill_work(struct work_struct *work)

vi = container_of(work, struct virtnet_info, refill.work);
napi_disable(&vi->napi);
-   try_fill_recv(vi, GFP_KERNEL);
-   still_empty = (vi->num == 0);
+   still_empty = !try_fill_recv(vi, GFP_KERNEL);
napi_enable(&vi->napi);

/* In theory, this can happen: if we don't get any buffers in
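
The rescheduling rule in the commit message can be modelled outside the
kernel. The sketch below is a plain C simulation, not driver code:
`struct ring`, `try_fill()` and `refill_passes()` are hypothetical names,
and the per-pass allocation budget is an assumption standing in for memory
pressure. It counts how many refill passes are needed when every pass that
leaves the ring short is run again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a receive ring; not the virtio_net structures. */
struct ring {
    int num;   /* buffers currently posted */
    int max;   /* buffers needed for a full ring */
};

/*
 * Post up to `budget` buffers, where `budget` models how many
 * allocations succeed under memory pressure in this pass.
 * Returns true only when the ring was completely refilled.
 */
static bool try_fill(struct ring *r, int budget)
{
    assert(r->num <= r->max);
    while (r->num < r->max) {
        if (budget == 0)
            return false;   /* allocation failed: ring is still short */
        budget--;
        r->num++;
    }
    return true;
}

/*
 * Repeat refill passes until the ring is full, the way a work item that
 * reschedules itself on any shortfall would. Returns the number of
 * passes used, or -1 if the ring is still short after `max_passes`.
 */
static int refill_passes(struct ring *r, int budget_per_pass, int max_passes)
{
    for (int pass = 1; pass <= max_passes; pass++) {
        if (try_fill(r, budget_per_pass))
            return pass;
    }
    return -1;
}
```

With a 64-entry ring and 16 successful allocations per pass, the model
needs four passes. Under the rule the patch removes (reschedule only when
`vi->num == 0`), the first pass would have posted 16 buffers and no further
pass would have been scheduled.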




[PATCH 0/3][STABLE] KVM: Various issues in virtio_net

2010-06-03 Thread Bruce Rogers
These are patches which we have found useful for our 2.6.32 based SLES 11 SP1 
release. 

The first patch is already upstream, but should be included in stable.

The second patch is a subset of another upstream patch. Again, stable material.

The third patch solves the last remaining issue we saw when testing kvm 
configurations with the SUSE certification test suite. Under heavy load, we 
observed rx stalls (first two patches applied), and this third patch was 
crafted to address the issue. Please apply to stable.
I assume this last problem also exists in more recent kernels than 2.6.32, but 
I haven't validated that.

With these 3 patches applied we no longer see any issues with virtio networking 
using our certification test suite.

Signed-off-by: Bruce Rogers 




[PATCH 3/3][STABLE] KVM: add schedule check to napi_enable call

2010-06-03 Thread Bruce Rogers
virtio_net: Add schedule check to napi_enable call
Under harsh testing conditions, including low memory, the guest would
stop receiving packets. With this patch applied we no longer see any
problems in the driver while performing these tests for extended periods
of time.

Make sure napi is scheduled subsequent to each napi_enable.

Signed-off-by: Bruce Rogers 
Signed-off-by: Olaf Kirch 

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -388,6 +388,20 @@ static void skb_recv_done(struct virtque
}
 }

+static void virtnet_napi_enable(struct virtnet_info *vi)
+{
+   napi_enable(&vi->napi);
+
+   /* If all buffers were filled by other side before we napi_enabled, we
+* won't get another interrupt, so process any outstanding packets
+* now.  virtnet_poll wants re-enable the queue, so we disable here.
+* We synchronize against interrupts via NAPI_STATE_SCHED */
+   if (napi_schedule_prep(&vi->napi)) {
+   vi->rvq->vq_ops->disable_cb(vi->rvq);
+   __napi_schedule(&vi->napi);
+   }
+}
+
 static void refill_work(struct work_struct *work)
 {
struct virtnet_info *vi;
@@ -397,7 +411,7 @@ static void refill_work(struct work_stru
napi_disable(&vi->napi);
try_fill_recv(vi, GFP_KERNEL);
still_empty = (vi->num == 0);
-   napi_enable(&vi->napi);
+   virtnet_napi_enable(vi);

/* In theory, this can happen: if we don't get any buffers in
 * we will *never* try to fill again. */
@@ -589,16 +603,7 @@ static int virtnet_open(struct net_devic
 {
struct virtnet_info *vi = netdev_priv(dev);

-   napi_enable(&vi->napi);
-
-   /* If all buffers were filled by other side before we napi_enabled, we
-* won't get another interrupt, so process any outstanding packets
-* now.  virtnet_poll wants re-enable the queue, so we disable here.
-* We synchronize against interrupts via NAPI_STATE_SCHED */
-   if (napi_schedule_prep(&vi->napi)) {
-   vi->rvq->vq_ops->disable_cb(vi->rvq);
-   __napi_schedule(&vi->napi);
-   }
+   virtnet_napi_enable(vi);
return 0;
 }
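
The reason for scheduling a poll right after enabling can be shown with a
small model. This is a sketch under stated assumptions, not the kernel NAPI
API: `struct poller` and the three functions are hypothetical, and the
comments only note which kernel call each step is meant to mimic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a poller; not the kernel NAPI structures. */
struct poller {
    bool enabled;      /* polling is allowed */
    bool scheduled;    /* a poll run is queued */
    int  pending;      /* packets that arrived while polling was disabled */
    bool irq_enabled;  /* the device may raise notifications */
};

/* Enable only: pending packets wait for an interrupt that may never come. */
static void enable_only(struct poller *p)
{
    p->enabled = true;
}

/* Enable, then queue a poll run if nothing else has claimed the poller. */
static void enable_and_kick(struct poller *p)
{
    p->enabled = true;
    if (!p->scheduled) {         /* mimics napi_schedule_prep() */
        p->irq_enabled = false;  /* mimics disable_cb(); the poll re-enables */
        p->scheduled = true;     /* mimics __napi_schedule() */
    }
}

/* Run one poll if it is enabled and queued; returns packets processed. */
static int run_poll(struct poller *p)
{
    assert(p->pending >= 0);
    if (!p->enabled || !p->scheduled)
        return 0;
    int done = p->pending;
    p->pending = 0;
    p->scheduled = false;
    p->irq_enabled = true;       /* poll finished: notifications back on */
    return done;
}
```

In this model, three packets that arrived while polling was disabled stay
unprocessed after `enable_only()`, and are drained by the first poll after
`enable_and_kick()`.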





[PATCH 2/3][STABLE] KVM: indicate oom if add_buf fails

2010-06-03 Thread Bruce Rogers
This patch is a subset of an already upstream patch, but this portion is useful 
in earlier releases.
Please consider for stable.

If the add_buf operation fails, indicate failure to the caller.
Signed-off-by: Bruce Rogers 

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c

@@ -318,6 +318,7 @@ static bool try_fill_recv_maxbufs(struct
skb_unlink(skb, &vi->recv);
trim_pages(vi, skb);
kfree_skb(skb);
+   oom = true;
break;
}
vi->num++;
@@ -368,6 +369,7 @@ static bool try_fill_recv(struct virtnet
if (err < 0) {
skb_unlink(skb, &vi->recv);
kfree_skb(skb);
+   oom = true;
break;
}
vi->num++;
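
The effect of the two added lines on the caller can be sketched as follows.
This is a model, not the driver: `enum post_result` and `fill_ring()` are
hypothetical, and it assumes the fill function reports success as the
negation of an `oom` flag, which is what the hunks above and patch 1/3
suggest. The `report_add_failure` parameter switches between the behaviour
before and after the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of one attempt to post a buffer (hypothetical model). */
enum post_result { POST_OK, POST_ALLOC_FAILED, POST_ADD_FAILED };

/*
 * Walk a scripted list of outcomes, counting posted buffers in *posted.
 * Returns true when the caller may treat the fill as complete.
 */
static bool fill_ring(const enum post_result *results, size_t n,
                      bool report_add_failure, int *posted)
{
    bool oom = false;

    assert(posted);
    for (size_t i = 0; i < n; i++) {
        if (results[i] == POST_ALLOC_FAILED) {
            oom = true;          /* assumed to be reported already */
            break;
        }
        if (results[i] == POST_ADD_FAILED) {
            if (report_add_failure)
                oom = true;      /* the line the patch adds */
            break;               /* the buffer is dropped either way */
        }
        (*posted)++;
    }
    return !oom;
}
```

With `report_add_failure` set to false, a failed add still returns true, so
a caller that reschedules only on failure would not retry even though the
ring is short.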




[PATCH] document boot option to -drive parameter

2010-04-16 Thread Bruce Rogers
The boot option is missing from the documentation for the -drive parameter.

If there is a better way to describe it, I'm all ears.

Signed-off-by: Bruce Rogers


[PATCH 1/2] [RESEND] make help output be a little more self-consistent

2010-01-07 Thread Bruce Rogers
This is the part which applies to the base qemu.
(btw: it was sent to qemu-de...@nongnu.org yesterday.)

Signed-off-by: Bruce Rogers 
---
 qemu-options.hx |   39 ---
 1 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index ecd50eb..20b696d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -42,7 +42,7 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
 "-smp n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
 "set the number of CPUs to 'n' [default=1]\n"
 "maxcpus= maximum number of total cpus, including\n"
-"  offline CPUs for hotplug etc.\n"
+"offline CPUs for hotplug, etc\n"
 "cores= number of CPU cores on one socket\n"
 "threads= number of threads on one CPU core\n"
 "sockets= number of discrete sockets in the system\n")
@@ -405,8 +405,9 @@ ETEXI
 DEF("device", HAS_ARG, QEMU_OPTION_device,
 "-device driver[,options]  add device\n")
 DEF("name", HAS_ARG, QEMU_OPTION_name,
-"-name string1[,process=string2]set the name of the guest\n"
-"string1 sets the window title and string2 the process name 
(on Linux)\n")
+"-name string1[,process=string2]\n"
+"set the name of the guest\n"
+"string1 sets the window title and string2 the process 
name (on Linux)\n")
 STEXI
 @item -name @var{name}
 Sets the @var{name} of the guest.
@@ -483,7 +484,7 @@ ETEXI
 
 #ifdef CONFIG_SDL
 DEF("ctrl-grab", 0, QEMU_OPTION_ctrl_grab,
-"-ctrl-grab   use Right-Ctrl to grab mouse (instead of Ctrl-Alt)\n")
+"-ctrl-grab  use Right-Ctrl to grab mouse (instead of Ctrl-Alt)\n")
 #endif
 STEXI
 @item -ctrl-grab
@@ -756,12 +757,12 @@ ETEXI
 #ifdef TARGET_I386
 DEF("smbios", HAS_ARG, QEMU_OPTION_smbios,
 "-smbios file=binary\n"
-"Load SMBIOS entry from binary file\n"
+"load SMBIOS entry from binary file\n"
 "-smbios type=0[,vendor=str][,version=str][,date=str][,release=%%d.%%d]\n"
-"Specify SMBIOS type 0 fields\n"
+"specify SMBIOS type 0 fields\n"
 "-smbios 
type=1[,manufacturer=str][,product=str][,version=str][,serial=str]\n"
 "  [,uuid=uuid][,sku=str][,family=str]\n"
-"Specify SMBIOS type 1 fields\n")
+"specify SMBIOS type 1 fields\n")
 #endif
 STEXI
 @item -smbios fi...@var{binary}
@@ -816,13 +817,13 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
 "-net 
tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile][,sndbuf=nbytes][,vnet_hdr=on|off]\n"
 "connect the host TAP network interface to VLAN 'n' and 
use the\n"
 "network scripts 'file' (default=%s)\n"
-"and 'dfile' (default=%s);\n"
-"use '[down]script=no' to disable script execution;\n"
+"and 'dfile' (default=%s)\n"
+"use '[down]script=no' to disable script execution\n"
 "use 'fd=h' to connect to an already opened TAP 
interface\n"
-"use 'sndbuf=nbytes' to limit the size of the send buffer; 
the\n"
-"default of 'sndbuf=1048576' can be disabled using 
'sndbuf=0'\n"
-"use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap 
flag; use\n"
-"vnet_hdr=on to make the lack of IFF_VNET_HDR support an 
error condition\n"
+"use 'sndbuf=nbytes' to limit the size of the send buffer 
(the\n"
+"default of 'sndbuf=1048576' can be disabled using 
'sndbuf=0')\n"
+"use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap 
flag\n"
+"use vnet_hdr=on to make the lack of IFF_VNET_HDR support 
an error condition\n"
 #endif
 "-net 
socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
 "connect the vlan 'n' to another VLAN using a socket 
connection\n"
@@ -837,7 +838,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
 #endif
 "-net dump[,vlan=n][,file=f][,len=n]\n"
 "dump traffic on vlan 'n' to

[PATCH 2/2] make help output be a little more self-consistent

2010-01-07 Thread Bruce Rogers

This is the part which applies to qemu-kvm. 

Signed-off-by: Bruce Rogers  
---
 qemu-options.hx |   19 ++-
 1 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 788d849..fdd5884 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1938,7 +1938,7 @@ DEF("readconfig", HAS_ARG, QEMU_OPTION_readconfig,
 "-readconfig \n")
 DEF("writeconfig", HAS_ARG, QEMU_OPTION_writeconfig,
 "-writeconfig \n"
-"read/write config file")
+"read/write config file\n")
 
 DEF("no-kvm", 0, QEMU_OPTION_no_kvm,
 "-no-kvm disable KVM hardware virtualization\n")
@@ -1947,26 +1947,27 @@ DEF("no-kvm-irqchip", 0, QEMU_OPTION_no_kvm_irqchip,
 DEF("no-kvm-pit", 0, QEMU_OPTION_no_kvm_pit,
 "-no-kvm-pit disable KVM kernel mode PIT\n")
 DEF("no-kvm-pit-reinjection", 0, QEMU_OPTION_no_kvm_pit_reinjection,
-"-no-kvm-pit-reinjection disable KVM kernel mode PIT interrupt 
reinjection\n")
+"-no-kvm-pit-reinjection\n"
+"disable KVM kernel mode PIT interrupt reinjection\n")
 #if defined(TARGET_I386) || defined(TARGET_X86_64) || defined(TARGET_IA64) || 
defined(__linux__)
 DEF("pcidevice", HAS_ARG, QEMU_OPTION_pcidevice,
 "-pcidevice host=bus:dev.func[,dma=none][,name=string]\n"
-"expose a PCI device to the guest OS.\n"
+"expose a PCI device to the guest OS\n"
 "dma=none: don't perform any dma translations (default is 
to use an iommu)\n"
-"'string' is used in log output.\n")
+"'string' is used in log output\n")
 #endif
 DEF("enable-nesting", 0, QEMU_OPTION_enable_nesting,
 "-enable-nesting enable support for running a VM inside the VM (AMD 
only)\n")
 DEF("nvram", HAS_ARG, QEMU_OPTION_nvram,
-"-nvram FILE  provide ia64 nvram contents\n")
+"-nvram FILE provide ia64 nvram contents\n")
 DEF("tdf", 0, QEMU_OPTION_tdf,
-"-tdf enable guest time drift compensation\n")
+"-tdfenable guest time drift compensation\n")
 DEF("kvm-shadow-memory", HAS_ARG, QEMU_OPTION_kvm_shadow_memory,
 "-kvm-shadow-memory MEGABYTES\n"
-" allocate MEGABYTES for kvm mmu shadowing\n")
+"allocate MEGABYTES for kvm mmu shadowing\n")
 DEF("mem-path", HAS_ARG, QEMU_OPTION_mempath,
-"-mem-path FILE   provide backing storage for guest RAM\n")
+"-mem-path FILE  provide backing storage for guest RAM\n")
 #ifdef MAP_POPULATE
 DEF("mem-prealloc", 0, QEMU_OPTION_mem_prealloc,
-"-mem-preallocpreallocate guest memory (use with -mempath)\n")
+"-mem-prealloc   preallocate guest memory (use with -mempath)\n")
 #endif




[PATCH] make help output be a little more self-consistent

2010-01-06 Thread Bruce Rogers
Signed-off-by: Bruce Rogers 
---
 qemu-options.hx |   58 --
 1 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 812d067..fdd5884 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -42,7 +42,7 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp,
 "-smp n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
 "set the number of CPUs to 'n' [default=1]\n"
 "maxcpus= maximum number of total cpus, including\n"
-"  offline CPUs for hotplug etc.\n"
+"offline CPUs for hotplug, etc\n"
 "cores= number of CPU cores on one socket\n"
 "threads= number of threads on one CPU core\n"
 "sockets= number of discrete sockets in the system\n")
@@ -406,8 +406,9 @@ ETEXI
 DEF("device", HAS_ARG, QEMU_OPTION_device,
 "-device driver[,options]  add device\n")
 DEF("name", HAS_ARG, QEMU_OPTION_name,
-"-name string1[,process=string2]set the name of the guest\n"
-"string1 sets the window title and string2 the process name 
(on Linux)\n")
+"-name string1[,process=string2]\n"
+"set the name of the guest\n"
+"string1 sets the window title and string2 the process 
name (on Linux)\n")
 STEXI
 @item -name @var{name}
 Sets the @var{name} of the guest.
@@ -484,7 +485,7 @@ ETEXI
 
 #ifdef CONFIG_SDL
 DEF("ctrl-grab", 0, QEMU_OPTION_ctrl_grab,
-"-ctrl-grab   use Right-Ctrl to grab mouse (instead of Ctrl-Alt)\n")
+"-ctrl-grab  use Right-Ctrl to grab mouse (instead of Ctrl-Alt)\n")
 #endif
 STEXI
 @item -ctrl-grab
@@ -757,12 +758,12 @@ ETEXI
 #ifdef TARGET_I386
 DEF("smbios", HAS_ARG, QEMU_OPTION_smbios,
 "-smbios file=binary\n"
-"Load SMBIOS entry from binary file\n"
+"load SMBIOS entry from binary file\n"
 "-smbios type=0[,vendor=str][,version=str][,date=str][,release=%%d.%%d]\n"
-"Specify SMBIOS type 0 fields\n"
+"specify SMBIOS type 0 fields\n"
 "-smbios 
type=1[,manufacturer=str][,product=str][,version=str][,serial=str]\n"
 "  [,uuid=uuid][,sku=str][,family=str]\n"
-"Specify SMBIOS type 1 fields\n")
+"specify SMBIOS type 1 fields\n")
 #endif
 STEXI
 @item -smbios fi...@var{binary}
@@ -817,13 +818,13 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
 "-net 
tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile][,sndbuf=nbytes][,vnet_hdr=on|off]\n"
 "connect the host TAP network interface to VLAN 'n' and 
use the\n"
 "network scripts 'file' (default=%s)\n"
-"and 'dfile' (default=%s);\n"
-"use '[down]script=no' to disable script execution;\n"
+"and 'dfile' (default=%s)\n"
+"use '[down]script=no' to disable script execution\n"
 "use 'fd=h' to connect to an already opened TAP 
interface\n"
-"use 'sndbuf=nbytes' to limit the size of the send buffer; 
the\n"
-"default of 'sndbuf=1048576' can be disabled using 
'sndbuf=0'\n"
-"use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap 
flag; use\n"
-"vnet_hdr=on to make the lack of IFF_VNET_HDR support an 
error condition\n"
+"use 'sndbuf=nbytes' to limit the size of the send buffer 
(the\n"
+"default of 'sndbuf=1048576' can be disabled using 
'sndbuf=0')\n"
+"use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap 
flag\n"
+"use vnet_hdr=on to make the lack of IFF_VNET_HDR support 
an error condition\n"
 #endif
 "-net 
socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
 "connect the vlan 'n' to another VLAN using a socket 
connection\n"
@@ -838,7 +839,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
 #endif
 "-net dump[,vlan=n][,file=f][,len=n]\n"
 "dump traffic on vlan 'n' to file 'f' (max n bytes per 
packet)\n"
-"-net none   use it alone to

[PATCH] kvm: allocate correct size for dirty bitmap

2009-09-23 Thread Bruce Rogers
The dirty bitmap copied out to userspace is stored in a long array, and gets 
copied out to userspace accordingly.  This patch accounts for that correctly.  
Currently I'm seeing kvm crashing due to writing beyond the end of the alloc'd 
dirty bitmap memory, because the buffer has the wrong size.

Signed-off-by: Bruce Rogers 
---
 qemu-kvm.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 6511cb6..ee5db76 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -702,7 +702,7 @@ int kvm_get_dirty_pages_range(kvm_context_t kvm, unsigned long phys_addr,
 for (i = 0; i < KVM_MAX_NUM_MEM_REGIONS; ++i) {
 if ((slots[i].len && (uint64_t) slots[i].phys_addr >= phys_addr)
 && ((uint64_t) slots[i].phys_addr + slots[i].len <= end_addr)) {
-buf = qemu_malloc((slots[i].len / 4096 + 7) / 8 + 2);
+buf = qemu_malloc(BITMAP_SIZE(slots[i].len));
 r = kvm_get_map(kvm, KVM_GET_DIRTY_LOG, i, buf);
 if (r) {
 qemu_free(buf);
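
As a worked check of the sizing problem described above, the sketch below
compares the removed expression with a size rounded up to whole longs. It
assumes the copy-out is done in units of `long`, as the description says,
and 4096-byte pages. `long_aligned_bitmap_bytes()` is a hypothetical helper
for illustration; it is not the definition of `BITMAP_SIZE`, which the
patch does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* page size used by the old expression */

/* Buffer size computed by the expression the patch removes. */
static size_t old_bitmap_bytes(uint64_t len)
{
    return (size_t)((len / PAGE_SIZE_BYTES + 7) / 8 + 2);
}

/*
 * One bit per page, rounded up to a whole number of longs.
 * `long_bits` is the width of a host long in bits (32 or 64).
 */
static size_t long_aligned_bitmap_bytes(uint64_t len, unsigned long_bits)
{
    assert(long_bits >= 8 && long_bits % 8 == 0);
    uint64_t pages = len / PAGE_SIZE_BYTES;
    uint64_t words = (pages + long_bits - 1) / long_bits;
    return (size_t)(words * (long_bits / 8));
}
```

For a one-page slot on a host with 64-bit longs, the old expression gives
3 bytes while a single long needs 8, so a long-granular copy would run
5 bytes past the buffer. For a 64-page slot the old expression gives 10
bytes against 8 needed, which is why the overflow only shows up for some
slot sizes.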




Re: kvm scaling question

2009-09-14 Thread Bruce Rogers
 On 9/11/2009 at 5:02 PM, Andre Przywara  wrote:
> Marcelo Tosatti wrote:
>> On Fri, Sep 11, 2009 at 09:36:10AM -0600, Bruce Rogers wrote:
>>> I am wondering if anyone has investigated how well kvm scales when 
> supporting many guests, or many vcpus or both.
>>>
>>> I'll do some investigations into the per vm memory overhead and
>>> play with bumping the max vcpu limit way beyond 16, but hopefully
>>> someone can comment on issues such as locking problems that are known
>>> to exist and need to be addressed to increase parallelism,
>>> general overhead percentages which can help provide consolidation
>>> expectations, etc.
>> 
>> I suppose it depends on the guest and workload. With an EPT host and
>> 16-way Linux guest doing kernel compilations, on recent kernel, i see:
>  > ...
>> 
>>> Also, when I did a simple experiment with vcpu overcommitment, I was
>>> surprised how quickly performance suffered (just bringing a Linux vm
>>> up), since I would have assumed the additional vcpus would have been
>>> halted the vast majority of the time. On a 2 proc box, overcommitment
>>> to 8 vcpus in a guest (I know this isn't a good usage scenario, but
>>> does provide some insights) caused the boot time to increase to almost
>>> exponential levels. At 16 vcpus, it took hours to just reach the gui
>>> login prompt.
>> 
>> One probable reason for that are vcpus which hold spinlocks in the guest
>> are scheduled out in favour of vcpus which spin on that same lock.
> We have encountered this issue some time ago in Xen. Ticket spinlocks 
> make this even worse. More detailed info can be found here:
> http://www.amd64.org/research/virtualization.html#Lock_holder_preemption 
> 
> Have you tried using paravirtualized spinlock in the guest kernel?
> http://lkml.indiana.edu/hypermail/linux/kernel/0807.0/2808.html 


I'll give that a try.  Thanks for the tips.

Bruce




Re: kvm scaling question

2009-09-14 Thread Bruce Rogers
 On 9/11/2009 at 3:53 PM, Marcelo Tosatti  wrote:
> On Fri, Sep 11, 2009 at 09:36:10AM -0600, Bruce Rogers wrote:
>> I am wondering if anyone has investigated how well kvm scales when 
> supporting many guests, or many vcpus or both.
>> 
>> I'll do some investigations into the per vm memory overhead and
>> play with bumping the max vcpu limit way beyond 16, but hopefully
>> someone can comment on issues such as locking problems that are known
>> to exist and need to be addressed to increase parallelism,
>> general overhead percentages which can help provide consolidation
>> expectations, etc.
> 
> I suppose it depends on the guest and workload. With an EPT host and
> 16-way Linux guest doing kernel compilations, on recent kernel, i see:
> 
> # Samples: 98703304
> #
> # Overhead  Command  Shared Object  Symbol
> # ........  .......  .............  ......
> #
>     97.15%       sh  [kernel]       [k] vmx_vcpu_run
>      0.27%       sh  [kernel]       [k] kvm_arch_vcpu_ioctl_
>      0.12%       sh  [kernel]       [k] default_send_IPI_mas
>      0.09%       sh  [kernel]       [k] _spin_lock_irq
> 
> Which is pretty good. Without EPT/NPT the mmu_lock seems to be the major
> bottleneck to parallelism.
> 
>> Also, when I did a simple experiment with vcpu overcommitment, I was
>> surprised how quickly performance suffered (just bringing a Linux vm
>> up), since I would have assumed the additional vcpus would have been
>> halted the vast majority of the time. On a 2 proc box, overcommitment
>> to 8 vcpus in a guest (I know this isn't a good usage scenario, but
>> does provide some insights) caused the boot time to increase to almost
>> exponential levels. At 16 vcpus, it took hours to just reach the gui
>> login prompt.
> 
> One probable reason for that are vcpus which hold spinlocks in the guest
> are scheduled out in favour of vcpus which spin on that same lock.

I suspected it might be a whole lot of spinning happening. That does seem most 
likely. I was just surprised how bad the behavior was.

Bruce



Re: kvm scaling question

2009-09-14 Thread Bruce Rogers
 On 9/11/2009 at 9:46 AM, Javier Guerra  wrote:
> On Fri, Sep 11, 2009 at 10:36 AM, Bruce Rogers  wrote:
>> Also, when I did a simple experiment with vcpu overcommitment, I was 
> surprised how quickly performance suffered (just bringing a Linux vm up), 
> since I would have assumed the additional vcpus would have been halted the 
> vast majority of the time.  On a 2 proc box, overcommitment to 8 vcpus in a 
> guest (I know this isn't a good usage scenario, but does provide some 
> insights) caused the boot time to increase to almost exponential levels. At 
> 16 vcpus, it took hours to just reach the gui login prompt.
> 
> I'd guess (and hope!) that having many 1- or 2-cpu guests won't kill
> performance as sharply as having a single guest with more vcpus than
> the physical cpus available.  have you tested that?
> 
> -- 
> Javier

Yes, but not empirically.  I'll certainly be doing that, but wanted to see what 
perspective there was on the results I was seeing.
And I've gotten the response that explains why overcommitment is performing so 
poorly in another email.

Bruce





kvm scaling question

2009-09-11 Thread Bruce Rogers
I am wondering if anyone has investigated how well kvm scales when supporting 
many guests, or many vcpus or both.

I'll do some investigations into the per vm memory overhead and play with 
bumping the max vcpu limit way beyond 16, but hopefully someone can comment on 
issues such as locking problems that are known to exist and need to be 
addressed to increase parallelism, general overhead percentages which can 
help provide consolidation expectations, etc.

Also, when I did a simple experiment with vcpu overcommitment, I was surprised 
how quickly performance suffered (just bringing a Linux vm up), since I would 
have assumed the additional vcpus would have been halted the vast majority of 
the time.  On a 2 proc box, overcommitment to 8 vcpus in a guest (I know this 
isn't a good usage scenario, but does provide some insights) caused the boot 
time to increase to almost exponential levels. At 16 vcpus, it took hours to 
just reach the gui login prompt.

Any perspective you can offer would be appreciated.

Bruce



[PATCH] kvm: Add COPYING file

2009-08-07 Thread Bruce Rogers
kvm-kmod sources have no COPYING file.  This patch adds it.

Signed-off-by: Bruce Rogers 
---
 COPYING |  339 
 1 file changed, 339 insertions(+)

diff --git a/COPYING b/COPYING
new file mode 100644
index 000..00ccfbb
--- /dev/null
+++ b/COPYING
@@ -0,0 +1,339 @@
+GNU GENERAL PUBLIC LICENSE
+   Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warra

[PATCH] handle -smp > 16 more cleanly

2009-04-09 Thread Bruce Rogers
The x86 kvm kernel module limits guest cpu count to 16, but the userspace pc 
definition says 255 still, so kvm_create_vcpu will fail for that reason with 
-smp > 16 specified.  This patch causes qemu-kvm to exit in that case.  Without 
this patch other errors get reported down the road and finally a segfault 
occurs.

Bruce

Signed-off-by: Bruce Rogers 

diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index ed76367..b6d6d5e 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -417,12 +417,18 @@ static void *ap_main_loop(void *_env)
 CPUState *env = _env;
 sigset_t signals;
 struct ioperm_data *data = NULL;
+int r;

 current_env = env;
 env->thread_id = kvm_get_thread_id();
 sigfillset(&signals);
 sigprocmask(SIG_BLOCK, &signals, NULL);
-kvm_create_vcpu(kvm_context, env->cpu_index);
+r = kvm_create_vcpu(kvm_context, env->cpu_index);
+if (r)
+{
+fprintf(stderr, "error creating vcpu: %d\n", r);
+exit(1);
+}
 kvm_qemu_init_env(env);

 #ifdef USE_KVM_DEVICE_ASSIGNMENT
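
The fail-fast behaviour the patch introduces can be sketched in isolation.
This is a model with hypothetical names: `create_vcpu()` stands in for the
real creation call, the limit is passed in rather than read from the kernel,
and the use of `-EINVAL` as the error value is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for vcpu creation: fails at or above `limit`. */
static int create_vcpu(int index, int limit)
{
    return (index >= 0 && index < limit) ? 0 : -EINVAL;
}

/*
 * Create `count` vcpus, stopping at the first error instead of carrying on.
 * Returns 0 on success; otherwise returns the error and stores the index
 * of the vcpu that could not be created in *failed_index.
 */
static int create_all_vcpus(int count, int limit, int *failed_index)
{
    assert(failed_index);
    for (int i = 0; i < count; i++) {
        int r = create_vcpu(i, limit);
        if (r) {
            *failed_index = i;
            return r;
        }
    }
    return 0;
}
```

With a limit of 16, asking for 17 vcpus stops at index 16 with an error, so
the caller can report it and exit before any later code runs against a vcpu
that does not exist.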

