Re: [PATCHv7] add mergeable buffers support to vhost_net

2010-04-30 Thread David Stevens
Michael S. Tsirkin m...@redhat.com wrote on 04/29/2010 06:45:15 AM:

 On Wed, Apr 28, 2010 at 01:57:12PM -0700, David L Stevens wrote:
  This patch adds mergeable receive buffer support to vhost_net.
  
  Signed-off-by: David L Stevens dlstev...@us.ibm.com
 
 I have applied this, thanks very much!
 I have also applied some tweaks on top,
 please take a look.
 
 Thanks,
 MSt
 

Looks fine to me.

Acked-by: David L Stevens dlstev...@us.ibm.com

 commit 2809e94f5f26d89dc5232aaec753ffda95c4d95e
 Author: Michael S. Tsirkin m...@redhat.com
 Date:   Thu Apr 29 16:18:08 2010 +0300
 
 vhost-net: minor tweaks in mergeable buffer code
 
 Applies the following tweaks on top of mergeable buffers patch:
 1. vhost_get_desc_n assumes that all descriptors are 'in' only.
It's also unlikely to be useful for any vhost frontend
besides vhost_net, so just move it to net.c, and rename
get_rx_bufs for brevity.
 
 2. for rx, we called iov_length within vhost_get_desc_n
(now get_rx_bufs) already, so we don't
need an extra call to iov_length to avoid overflow anymore.
Accordingly, copy_iovec_hdr can return void now.
 
 3. for rx, do some further code tweaks:
do not assign len = err as we check that err == len
handle data length in a way similar to how we handle
header length: datalen -> sock_len, len -> vhost_len.
add sock_hlen as a local variable, for symmetry with vhost_hlen.
 
 4. add some likely/unlikely annotations
 
 Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index d61d945..85519b4 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -74,9 +74,9 @@ static int move_iovec_hdr(struct iovec *from, struct iovec *to,
 }
 return seg;
  }
 -/* Copy iovec entries for len bytes from iovec. Return segments used. */
 -static int copy_iovec_hdr(const struct iovec *from, struct iovec *to,
 -   size_t len, int iovcount)
 +/* Copy iovec entries for len bytes from iovec. */
 +static void copy_iovec_hdr(const struct iovec *from, struct iovec *to,
 +size_t len, int iovcount)
  {
 int seg = 0;
 size_t size;
 @@ -89,7 +89,6 @@ static int copy_iovec_hdr(const struct iovec *from, struct iovec *to,
++to;
++seg;
 }
 -   return seg;
  }
 
  /* Caller must have TX VQ lock */
 @@ -204,7 +203,7 @@ static void handle_tx(struct vhost_net *net)
 unuse_mm(net->dev.mm);
  }
 
 -static int vhost_head_len(struct vhost_virtqueue *vq, struct sock *sk)
 +static int peek_head_len(struct vhost_virtqueue *vq, struct sock *sk)
  {
 struct sk_buff *head;
 int len = 0;
 @@ -212,17 +211,70 @@ static int vhost_head_len(struct vhost_virtqueue *vq, struct sock *sk)
 lock_sock(sk);
 head = skb_peek(&sk->sk_receive_queue);
 if (head)
 -  len = head->len + vq->sock_hlen;
 +  len = head->len;
 release_sock(sk);
 return len;
  }
 
 +/* This is a multi-buffer version of vhost_get_desc, that works if
 + *   vq has read descriptors only.
 + * @vq  - the relevant virtqueue
 + * @datalen   - data length we'll be reading
 + * @iovcount   - returned count of io vectors we fill
 + * @log  - vhost log
 + * @log_num   - log offset
 + *   returns number of buffer heads allocated, negative on error
 + */
 +static int get_rx_bufs(struct vhost_virtqueue *vq,
 + struct vring_used_elem *heads,
 + int datalen,
 + unsigned *iovcount,
 + struct vhost_log *log,
 + unsigned *log_num)
 +{
 +   unsigned int out, in;
 +   int seg = 0;
 +   int headcount = 0;
 +   unsigned d;
 +   int r;
 +
 +   while (datalen > 0) {
 +  if (unlikely(headcount >= VHOST_NET_MAX_SG)) {
 + r = -ENOBUFS;
 + goto err;
 +  }
 +  d = vhost_get_desc(vq->dev, vq, vq->iov + seg,
 +   ARRAY_SIZE(vq->iov) - seg, &out,
 +   &in, log, log_num);
 +  if (d == vq->num) {
 + r = 0;
 + goto err;
 +  }
 +  if (unlikely(out || in <= 0)) {
 + vq_err(vq, "unexpected descriptor format for RX: "
 +"out %d, in %d\n", out, in);
 + r = -EINVAL;
 + goto err;
 +  }
 +  heads[headcount].id = d;
 +  heads[headcount].len = iov_length(vq->iov + seg, in);
 +  datalen -= heads[headcount].len;
 +  ++headcount;
 +  seg += in;
 +   }
 +   *iovcount = seg;
 +   return headcount;
 +err:
 +   vhost_discard_desc(vq, headcount);
 +   return r;
 +}
 +
  /* Expects to be always run from workqueue - which acts as
   * read-size critical section for our kind of RCU. */
  static void handle_rx(struct vhost_net *net)
  {
 struct vhost_virtqueue *vq = net->dev.vqs[VHOST_NET_VQ_RX];
 -   unsigned in, log, s;
 +   unsigned uninitialized_var(in), log;
 struct vhost_log *vq_log;
 struct msghdr msg = {
.msg_name = NULL,
 @@ -238,9 +290,10 @@ static void handle_rx(struct vhost_net *net)
   

Fedora 13 Beta - Cannot Read HD on WinXP 2003 64bit Installs

2010-04-30 Thread Jonathan Hoover
Hello,

I came across a message someone posted elsewhere at
http://article.gmane.org/gmane.comp.emulators.qemu/66135 while trying to
determine why Windows XP and 2003 guest machines are not installing
under Fedora 13 Beta. I would think others have run into this problem,
but I see no discussion in the forums. I use the script below to do my
installs (mainly because Virt Manager never does the cache=writeback
part, which really speeds up the guest). I am getting it to format and
copy files, but after the reboot of the guest, I get "A disk read error
occurred" after it says "Booting from Hard Disk". Other guest types
(Linux CentOS or Fedora, for example) install fine.

virt-install -n xp64_1 -r 1024 --vcpus=1 --os-type=windows
--os-variant=winxp64 --sound --hvm --virt-type=kvm -c
/var/lib/libvirt/images/WINXPPRO64.iso --disk
path=/var/lib/libvirt/images/xp1_1.img,size=30,sparse=false,cache=writeback
--network network:default

I am running the following:

CPU:Intel(R) Core(TM)2 Quad  CPU   Q9300  @ 2.50GHz
KVM:QEMU PC emulator version 0.12.3 (qemu-kvm-0.12.3)
LIBVIRTD:   libvirtd (libvirt) 0.7.7
KERNEL: 2.6.33.2-41.fc13.x86_64
GUEST:  Windows XP 64 bit & Windows Server 2003 64 bit

Any thoughts?
Jonathan Hoover 


Re: qemu-kvm.0.12.2 aborts on linux

2010-04-30 Thread Gleb Natapov
On Wed, Apr 28, 2010 at 10:19:37AM -0700, K D wrote:
 I am using Yahoo mail, and my mails to this list get rejected every time, 
 saying the message has HTML content etc. Should I use some other mail tool? Below 
 is my issue.
 
 I am trying to get KVM/qemu running on linux. I compiled 2.6.27.10 by 
 enabling KVM, KVM for intel options at configure time. My box is running 
 with this KVM enabled inside kernel. I also built qemu-kvm-0.12.2 using above 
 kernel headers etc. I enabled virtualization in BIOS. I didn't try to install 
 any guest from CD etc. I made a hard disk image, installed grub on it, copied 
 kernel, initrd onto it. Now when I try to create a VM as below, it crashes with 
 the following backtrace. What could be going wrong?
 
 Also, when I try to use -m 256, malloc (or posix_memalign) fails with ENOMEM. 
 So right now -m 128, which is the default, works. Why is that? I have 4G RAM in 
 my setup and my native linux is using less than 1G. Are there some rlimits for 
 qemu that I need to raise?
 
 Sounds like I'm doing some basic stuff wrong. I'm using bios, vapic, pxe-rtl 
 bin straight from the qemu-kvm dir.
 
 If I don't do '-nographic' it runs into some malloc failure inside some 
 vga routine. I pasted that backtrace below too.
 
What's your Linux distribution? The first trace below shows that pthread_create()
failed, which is strange. What does ulimit -a show?
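
For reference, a minimal C sketch (standard POSIX calls only, nothing
qemu-specific; a hypothetical helper, not from this thread) that prints the
limits "ulimit -a" reports and which can make malloc/posix_memalign fail
with ENOMEM:

#include <stdio.h>
#include <sys/resource.h>

static void show(const char *name, int resource)
{
	struct rlimit rl;

	if (getrlimit(resource, &rl) == 0)
		printf("%-15s soft=%lld hard=%lld\n", name,
		       (long long)rl.rlim_cur, (long long)rl.rlim_max);
}

int main(void)
{
	/* RLIM_INFINITY prints as -1; anything else is a cap in bytes */
	show("RLIMIT_AS", RLIMIT_AS);           /* total address space */
	show("RLIMIT_DATA", RLIMIT_DATA);       /* data segment size   */
	show("RLIMIT_MEMLOCK", RLIMIT_MEMLOCK); /* locked memory       */
	return 0;
}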


 Appreciate your help.
 thanks
 
 qemu-system-x86_64 -hda /dev/shm/vmhd.img -bios ./bios.bin --option-rom 
 ./vapic.bin -curses -nographic -vga none -option-rom ./pxe-rtl8139.bin
 
 #0 0x414875a6 in raise () from /lib/libc.so.6
 (gdb) bt
 #0 0x414875a6 in raise () from /lib/libc.so.6
 #1 0x4148ad18 in abort () from /lib/libc.so.6
 #2 0x080b4cb3 in die2 (err=<value optimized out>,
 what=0x81f2662 "pthread_create") at posix-aio-compat.c:80
 #3 0x080b5682 in thread_create (arg=<value optimized out>,
 start_routine=<value optimized out>, attr=<value optimized out>,
 thread=<value optimized out>) at posix-aio-compat.c:118
 #4 spawn_thread () at posix-aio-compat.c:379
 #5 qemu_paio_submit (aiocb=0x846b550) at posix-aio-compat.c:390
 #6 0x080b57cb in paio_submit (bs=0x843c008, fd=5, sector_num=0,
 qiov=0x84cefb8, nb_sectors=512, cb=0x81cb950 <dma_bdrv_cb>,
 opaque=0x84cef80, type=1) at posix-aio-compat.c:584
 #7 0x080cc7b8 in raw_aio_submit (type=<value optimized out>,
 opaque=<value optimized out>, cb=<value optimized out>,
 nb_sectors=<value optimized out>, qiov=<value optimized out>,
 sector_num=<value optimized out>, bs=<value optimized out>)
 at block/raw-posix.c:562
 #8 raw_aio_readv (bs=0x843c008, sector_num=0, qiov=0x84cefb8, nb_sectors=1,
 cb=0x81cb950 <dma_bdrv_cb>, opaque=0x84cef80) at block/raw-posix.c:570
 #9 0x080b0593 in bdrv_aio_readv (bs=0x843c008, sector_num=0, qiov=0x84cefb8,
 nb_sectors=1, cb=0x81cb950 <dma_bdrv_cb>, opaque=0x84cef80)
 at block.c:1548
 #10 0x081cbb26 in dma_bdrv_cb (opaque=0x84cef80, ret=0)
 at /ws/pkoya-sjc/temp/qemu-kvm-0.12.2/dma-helpers.c:123
 #11 0x081cbcde in dma_bdrv_io (bs=0x843c008, sg=0x846861c, sector_num=0,
 cb=0x8074320 <ide_read_dma_cb>, opaque=0x8468f1c, is_write=0)
 at /ws/pkoya-sjc/temp/qemu-kvm-0.12.2/dma-helpers.c:167
 #12 0x0807441b in ide_read_dma_cb (opaque=0x8468f1c, ret=0)
 at /ws/pkoya-sjc/temp/qemu-kvm-0.12.2/hw/ide/core.c:597
 #13 0x080760ec in bmdma_cmd_writeb (opaque=0x8468f1c, addr=49152, val=9)
 at /ws/pkoya-sjc/temp/qemu-kvm-0.12.2/hw/ide/pci.c:51
 #14 0x080d9d5f in ioport_write (data=<value optimized out>,
 address=<value optimized out>, index=<value optimized out>) at ioport.c:80
 #15 cpu_outb (addr=6587, val=<value optimized out>) at ioport.c:198
 #16 0xb60a3bc9 in ?? ()
 #17 0xc000 in ?? ()
 #18 0x0009 in ?? ()
 #19 0x in ?? ()
 (gdb) q
 
 backtrace without -nographic
 
 (gdb) bt
 #0 0x414875a6 in raise () from /lib/libc.so.6
 #1 0x4148ad18 in abort () from /lib/libc.so.6
 #2 0x080b4c3c in qemu_memalign (alignment=4096, size=16777216) at osdep.c:96
 #3 0x080b4c5a in qemu_vmalloc (size=16777216) at osdep.c:110
 #4 0x08119995 in qemu_ram_alloc (size=16777216)
 at /devel/temp/qemu-kvm-0.12.2/exec.c:2550
 #5 0x0807ffd0 in vga_common_init (s=0x84be7e4, vga_ram_size=16777216)
 at /devel/temp/qemu-kvm-0.12.2/hw/vga.c:2291
 #6 0x080a1c4b in pci_cirrus_vga_initfn (dev=0x84be618)
 at /devel/temp/qemu-kvm-0.12.2/hw/cirrus_vga.c:3209
 #7 0x0805e61e in pci_qdev_init (qdev=0x84be618, base=0x8229700)
 at /devel/temp/qemu-kvm-0.12.2/hw/pci.c:1482
 #8 0x080fa7ee in qdev_init (dev=0x84be618)
 at /devel/temp/qemu-kvm-0.12.2/hw/qdev.c:242
 #9 0x080fa885 in qdev_init_nofail (dev=0x84be618)
 at /devel/temp/qemu-kvm-0.12.2/hw/qdev.c:285
 #10 0x0805d8ca in pci_create_simple (bus=0x845ab58, devfn=-1,
 name=0x81ce0f8 "cirrus-vga")
 at /devel/temp/qemu-kvm-0.12.2/hw/pci.c:1533
 #11 0x080a2c71 in pci_cirrus_vga_init (bus=0x845ab58)
 at /devel/temp/qemu-kvm-0.12.2/hw/cirrus_vga.c:3235
 #12 0x0808abc3 in pc_init1 (ram_size=<value optimized out>,
 boot_device=0xbf9fea17 "cad", kernel_filename=0x0,
 

Re: [PATCH 0/5] Fix EFER.NX=0 with EPT

2010-04-30 Thread Marcelo Tosatti
On Wed, Apr 28, 2010 at 04:47:14PM +0300, Avi Kivity wrote:
 Currently we run with EFER.NX=1 on the guest even if the guest value is 0.
 This is fine with shadow, since we check bit 63 when instantiating a page
 table, and fault if bit 63 is set while EFER.NX is clear.
 
 This doesn't work with EPT, since we no longer get the chance to check guest
 ptes.  So we need to run with EFER.NX=0.
 
 This is complicated by the fact that if we switch EFER.NX on the host, we'll
 trap immediately, since some host pages are mapped with the NX bit set.  As
 a result, we need to switch the MSR atomically during guest entry and exit.
 
 This patchset implements the complications described above.
 
 Avi Kivity (5):
   KVM: Let vcpu structure alignment be determined at runtime
   KVM: VMX: Add definition for msr autoload entry
   KVM: VMX: Add definitions for guest and host EFER autoswitch vmcs
 entries
   KVM: VMX: Add facility to atomically switch MSRs on guest entry/exit
    KVM: VMX: Atomically switch efer if EPT && !EFER.NX

Applied, thanks.

Out of curiosity, did you measure the vmentry/vmexit overhead?


Re: 32-bit color graphic on KVM virtual machines

2010-04-30 Thread Andy Lutomirski

shacky wrote:

Hi.
Is it possible to have 32-bit color graphic on KVM virtual machines?
I installed a Windows virtual machine, but it allows me to configure
only 24-bit color display and it does not have any display driver
installed.


24-bit means 8 bits per RGB channel.  32-bit means 8 bits per RGB 
channel plus 8 bits of alpha, which isn't very useful on the display.  So I 
wouldn't worry about it.  (If you had an 8bpp display, that would be a 
different story, but those aren't very common.)


Of course, lots of programs use 32 bit offscreen surfaces, but that's a 
different story.
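
To make the packing concrete, here is a minimal C sketch (purely
illustrative of the bit layout, not qemu/kvm code):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint8_t r = 0x12, g = 0x34, b = 0x56;

	/* 32-bit pixel: 8 bits alpha/padding plus 8 bits per R/G/B channel */
	uint32_t argb8888 = (0xFFu << 24) | ((uint32_t)r << 16) | (g << 8) | b;

	/* 24-bit pixel: the same 8 bits per channel, just no alpha byte */
	uint8_t rgb888[3] = { r, g, b };

	/* identical channel values either way; only the packing differs */
	printf("32bpp: 0x%08X  24bpp: 0x%02X%02X%02X\n",
	       argb8888, rgb888[0], rgb888[1], rgb888[2]);
	return 0;
}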


--Andy



Is there a way to solve this problem?

Thank you very much!
Bye.





[PATCH 1/1] KVM: X86: add the support of XSAVE/XRSTOR to guest

2010-04-30 Thread Dexuan Cui
When the host enables XSAVE/XRSTOR, the patch exposes the XSAVE/XRSTOR
related CPUID leaves to the guest by fixing up kvm_emulate_cpuid(), and
allows the guest to set CR4.OSXSAVE to enable XSAVE.
The patch adds a per-vcpu host/guest xstate image/mask and replaces the
current FXSAVE/FXRSTOR with the new XSAVE/XRSTOR for the host xstate
(FPU/SSE/YMM) switch.

Signed-off-by: Dexuan Cui dexuan@intel.com
---
 arch/x86/include/asm/kvm_host.h |   15 +--
 arch/x86/include/asm/vmx.h  |1 +
 arch/x86/include/asm/xsave.h|3 +
 arch/x86/kvm/vmx.c  |   24 +
 arch/x86/kvm/x86.c  |  217 +--
 5 files changed, 242 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3f0007b..60be1a7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -303,6 +303,11 @@ struct kvm_vcpu_arch {
struct i387_fxsave_struct host_fx_image;
struct i387_fxsave_struct guest_fx_image;
 
+   struct xsave_struct *host_xstate_image;
+   struct xsave_struct *guest_xstate_image;
+   uint64_t host_xstate_mask;
+   uint64_t guest_xstate_mask;
+
gva_t mmio_fault_cr2;
struct kvm_pio_request pio;
void *pio_data;
@@ -718,16 +723,6 @@ static inline unsigned long read_msr(unsigned long msr)
 }
 #endif
 
-static inline void kvm_fx_save(struct i387_fxsave_struct *image)
-{
-   asm("fxsave (%0)" : : "r" (image));
-}
-
-static inline void kvm_fx_restore(struct i387_fxsave_struct *image)
-{
-   asm("fxrstor (%0)" : : "r" (image));
-}
-
 static inline void kvm_fx_finit(void)
 {
asm("finit");
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index fb9a080..842286b 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -260,6 +260,7 @@ enum vmcs_field {
 #define EXIT_REASON_EPT_VIOLATION   48
 #define EXIT_REASON_EPT_MISCONFIG   49
 #define EXIT_REASON_WBINVD 54
+#define EXIT_REASON_XSETBV 55
 
 /*
  * Interruption-information format
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index ddc04cc..ada81a2 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -13,6 +13,9 @@
 
 #define FXSAVE_SIZE512
 
+#define XSTATE_YMM_SIZE 256
+#define XSTATE_YMM_OFFSET (512 + 64)
+
 /*
  * These are the features that the OS can handle currently.
  */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 875b785..a72d024 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -35,6 +35,8 @@
 #include <asm/vmx.h>
 #include <asm/virtext.h>
 #include <asm/mce.h>
+#include <asm/i387.h>
+#include <asm/xcr.h>
 
 #include "trace.h"
 
@@ -2517,6 +2519,8 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
vmx->vcpu.arch.cr4_guest_owned_bits = KVM_CR4_GUEST_OWNED_BITS;
if (enable_ept)
vmx->vcpu.arch.cr4_guest_owned_bits |= X86_CR4_PGE;
+   if (cpu_has_xsave)
+   vmx->vcpu.arch.cr4_guest_owned_bits |= X86_CR4_OSXSAVE;
vmcs_writel(CR4_GUEST_HOST_MASK, ~vmx->vcpu.arch.cr4_guest_owned_bits);
 
tsc_base = vmx->vcpu.kvm->arch.vm_init_tsc;
@@ -3258,6 +3262,25 @@ static int handle_wbinvd(struct kvm_vcpu *vcpu)
return 1;
 }
 
+static int handle_xsetbv(struct kvm_vcpu *vcpu)
+{
+   u64 new_bv = ((u64)kvm_register_read(vcpu, VCPU_REGS_RDX)) |
+   kvm_register_read(vcpu, VCPU_REGS_RAX);
+   u64 host_bv = vcpu->arch.host_xstate_mask;
+
+   if (((new_bv ^ host_bv) & ~host_bv) || !(new_bv & 1))
+   goto err;
+   if ((host_bv & XSTATE_YMM & new_bv) && !(new_bv & XSTATE_SSE))
+   goto err;
+   vcpu->arch.guest_xstate_mask = new_bv;
+   xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.guest_xstate_mask);
+   skip_emulated_instruction(vcpu);
+   return 1;
+err:
+   kvm_inject_gp(vcpu, 0);
+   return 1;
+}
+
 static int handle_apic_access(struct kvm_vcpu *vcpu)
 {
unsigned long exit_qualification;
@@ -3556,6 +3579,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold,
[EXIT_REASON_APIC_ACCESS] = handle_apic_access,
[EXIT_REASON_WBINVD]  = handle_wbinvd,
+   [EXIT_REASON_XSETBV]  = handle_xsetbv,
[EXIT_REASON_TASK_SWITCH] = handle_task_switch,
[EXIT_REASON_MCE_DURING_VMENTRY]  = handle_machine_check,
[EXIT_REASON_EPT_VIOLATION]   = handle_ept_violation,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b2ce1d..2af3fbe 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -52,6 +52,8 @@
 #include <asm/desc.h>
 #include <asm/mtrr.h>
 #include <asm/mce.h>
+#include <asm/i387.h>
+#include <asm/xcr.h>
 
 #define MAX_IO_MSRS 256
 #define CR0_RESERVED_BITS  \
@@ -62,6 +64,7 @@

Re: [PATCH] kvm mmu: reduce 50% memory usage

2010-04-30 Thread Avi Kivity

On 04/30/2010 05:25 AM, Lai Jiangshan wrote:



It's unrelated to TDP, same issue with shadow.  I think the calculation
is correct.  For example the 4th spte for a level=2 page will yield
gfn=4*512.
 

Avi, Marcelo
Thank you very much.

The calculation I used is correct.
   


Yes.  btw, can you update the patch to also correct mmu.txt?


The fault comes from FNAME(fetch) when using shadow pages.

I have fixed it in the patch
[RFC PATCH] kvm: calculate correct gfn for small host pages which emulates large 
guest pages

(It seems that the mailing lists are unreachable, did you receive it?)
   


The lists are down at the moment.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 1/4] KVM MMU: fix race in invlpg code

2010-04-30 Thread Avi Kivity

On 04/30/2010 12:00 PM, Xiao Guangrong wrote:

There is a race in the invlpg code, in a sequence like this:

A: hold mmu_lock and get 'sp'
B: release mmu_lock and do other things
C: hold mmu_lock and continue to use 'sp'

If another path freed 'sp' in stage B, the kernel will crash.

This patch checks whether 'sp' is still live before using it in stage C.
   



Signed-off-by: Xiao Guangrongxiaoguangr...@cn.fujitsu.com
---
  arch/x86/kvm/paging_tmpl.h |   18 +-
  1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 624b38f..641d844 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -462,11 +462,15 @@ out_unlock:

  static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
  {
-   struct kvm_mmu_page *sp = NULL;
+   struct kvm_mmu_page *sp = NULL, *s;
struct kvm_shadow_walk_iterator iterator;
+   struct hlist_head *bucket;
+   struct hlist_node *node, *tmp;
gfn_t gfn = -1;
u64 *sptep = NULL, gentry;
int invlpg_counter, level, offset = 0, need_flush = 0;
+   unsigned index;
+   bool live = false;

spin_lock(&vcpu->kvm->mmu_lock);

@@ -519,10 +523,22 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)

mmu_guess_page_from_pte_write(vcpu, gfn_to_gpa(gfn) + offset, gentry);
spin_lock(&vcpu->kvm->mmu_lock);
+   index = kvm_page_table_hashfn(gfn);
+   bucket = &vcpu->kvm->arch.mmu_page_hash[index];
+   hlist_for_each_entry_safe(s, node, tmp, bucket, hash_link)
+   if (s == sp) {
   


At this point, sp might have been freed and re-allocated, now pointing 
at something completely different.  So need to check role etc.


Alternatively, increase root_count.  Then sp is guaranteed to be live 
(though it may have role.invalid set).



+   live = true;
+   break;
+   }
+
+   if (!live)
+   goto unlock_exit;
+
if (atomic_read(&vcpu->kvm->arch.invlpg_counter) == invlpg_counter) {
++vcpu->kvm->stat.mmu_pte_updated;
FNAME(update_pte)(vcpu, sp, sptep, &gentry);
}
+unlock_exit:
spin_unlock(&vcpu->kvm->mmu_lock);
mmu_release_page_from_pte_write(vcpu);
  }
   



--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



kvm io-performance with smp

2010-04-30 Thread Udo Lembke

Hello,
I'm new on this list and hope this topic hasn't been discussed several times before.
I use kvm with Proxmox VE and kvm works very well, but the IO performance 
with Windows guests is only good with one (guest) CPU.

The tests are based on kvm 0.12.3 (with the same result on 0.11.1).

Here is my posting, which I posted in the Proxmox forum, but nobody there 
knows a solution.


Hi,
after a lot more tests (on another node, with a faster raid, and the raid 
used only for the vm)...
It seems that the IO performance under Windows with SMP depends on 
luck. If only the IO process runs, the values are sometimes not so bad, 
but if there is another process, like the task manager showing the CPU 
usage, the IO performance gets worse, up to much worse. I have a notion 
that the performance drops if the IO process changes CPU (often).
It happens with the 2.6.32 and also with the 2.6.24 kernel (I haven't 
tested 2.6.18 yet).


Here are the test results (h2benchw -p -w 2cp_th_2.6.32_v1 2; profile 
install) on a virtio disk (virtio driver 4.3.0.17241):

all values MB/s - different runs separated with |

http://forums.meulie.net/viewtopic.php?f=43&t=6136&sid=79fe2173580718169824cb2e454a3efa
||1 CPU 2.6.32: 488||

||1 CPU aio=threads 2.6.32: 517 | 573 | 569||
||2 CPU aio=threads 2.6.32: 333 |  78 |  28||
||2 CPU aio=native  2.6.32: 101 | 128 |  53||
||2 CPU aio=threads 2.6.32: 215 |  66 | 103 | 179 |  26 |  58 | 109||
||2 CPU aio=native  2.6.32:  70 |  39 |  3.7|  14||
||2 CPU aio=threads 2.6.24: 298 |  47 |  27 | 121 | 120 |  82 | 104||
|2 CPU cache=none  2.6.24:  55 |  92 | 102 | 114 | open task-manager: 67|

Perhaps there are other switches for kvm to solve this problem?
My test kvm command line:
/usr/bin/kvm -monitor unix:/var/run/qemu-server/126.mon,server,nowait 
-vnc unix:/var/run/qemu-server/126.vnc,password -pidfile 
/var/run/qemu-server/126.pid -daemonize -usbdevice tablet -name knecht2 
-smp sockets=2,cores=1 -nodefaults -boot menu=on,order=c -vga cirrus 
-tdf -localtime -rtc-td-hack -k de -drive 
file=/var/lib/vz/template/iso/vm-tools.iso,if=ide,index=1,media=cdrom 
-drive 
file=/var/lib/vz/images/126/vm-126-disk-1.raw,if=ide,index=0,boot=on 
-drive file=/var/lib/vz/images/126/vm-126-disk-2.raw,if=ide,index=2 
-drive 
file=/var/lib/vz/images/126/vm-126-disk-3.raw,if=virtio,index=0,aio=threads 
-m 1024 -net 
tap,vlan=0,ifname=vmtab126i0,script=/var/lib/qemu-server/bridge-vlan 
-net nic,vlan=0,model=e1000,macaddr=F6:E1:E2:E4:93:4E|


I will be quite happy, if someone has a hint for me.

Best regards

Udo


Re: [PATCH 3/4] KVM MMU: allow shadow page become unsync at creating time

2010-04-30 Thread Avi Kivity

On 04/30/2010 12:03 PM, Xiao Guangrong wrote:

Allow a new shadow page to become unsync when it is created; then we don't need
to write-protect 'sp->gfn'. This idea is from Avi:

|Another interesting case is to create new shadow pages in the unsync
|state. That can help when the guest starts a short lived process: we
|can avoid write protecting its pagetables completely

   


Any idea how this improves performance?  A kernel build has a lot of 
short lived processes, might see an improvement there.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] kvm mmu: reduce 50% memory usage

2010-04-30 Thread Avi Kivity

On 04/30/2010 11:54 AM, Lai Jiangshan wrote:

Avi Kivity wrote:
   

On 04/30/2010 05:25 AM, Lai Jiangshan wrote:
 
   

It's unrelated to TDP, same issue with shadow.  I think the calculation
is correct.  For example the 4th spte for a level=2 page will yield
gfn=4*512.

 

Avi, Marcelo
Thank you very much.

The calculation I used is correct.

   

Yes.  btw, can you update the patch to also correct mmu.txt?
 

The corresponding content in mmu.txt are:
   role.direct:
 If set, leaf sptes reachable from this page are for a linear range.
 Examples include real mode translation, large guest pages backed by small
 host pages, and gpa-hpa translations when NPT or EPT is active.
 The linear range starts at (gfn << PAGE_SHIFT) and its size is determined
 by role.level (2MB for first level, 1GB for second level, 0.5TB for third
 level, 256TB for fourth level)
 If clear, this page corresponds to a guest page table denoted by the gfn
 field.

   gfn:
 Either the guest page table containing the translations shadowed by this
 page, or the base page frame for linear translations.  See role.direct.

These are correct. My patch is fully based on this document.
I think it does not need to be fixed.

Did I miss something?

   


sp->gfns can now be NULL, so the documentation of this field needs to be 
updated.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCHv2 00/23] next round of emulator cleanups

2010-04-30 Thread Avi Kivity

On 04/28/2010 07:15 PM, Gleb Natapov wrote:

This is the next round of emulator cleanups, making it even more detached
from kvm. The first patch introduces an IO read cache, which is needed to
correctly emulate instructions that require more than one IO read exit
during emulation.
   


Reviewed-by: Avi Kivity a...@redhat.com

--
error compiling committee.c: too many arguments to function



Re: [PATCH 15/23] KVM: do not inject #PF in (read|write)_emulated() callbacks

2010-04-30 Thread Avi Kivity

On 04/28/2010 12:21 PM, Gleb Natapov wrote:

On Wed, Apr 28, 2010 at 12:11:41PM +0300, Avi Kivity wrote:
   

On 04/27/2010 03:15 PM, Gleb Natapov wrote:
 

Return an error to the x86 emulator instead of injecting the exception behind its back.

Signed-off-by: Gleb Natapov g...@redhat.com
---
  arch/x86/include/asm/kvm_emulate.h |3 +++
  arch/x86/kvm/emulate.c |   12 +++-
  arch/x86/kvm/x86.c |   28 ++--
  3 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index ae4af86..b977ccf 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -94,6 +94,7 @@ struct x86_emulate_ops {
int (*read_emulated)(unsigned long addr,
 void *val,
 unsigned int bytes,
+unsigned int *error,
 struct kvm_vcpu *vcpu);

   

The fault may be at a different address than addr, if we cross a
page boundary.  Need a struct here.

 

Correct and you can find couple of FIXME in emulator.c that say so.
I'll fix that later. This is not the problem that was introduced by this
patch set.


OK.


  What do you mean we need struct here? Struct with
error code + error location?

   


Yes.


--
error compiling committee.c: too many arguments to function



Re: [PATCH V3] drivers/uio/uio_pci_generic.c: allow access for non-privileged processes

2010-04-30 Thread Michael S. Tsirkin
On Thu, Apr 29, 2010 at 12:29:40PM -0700, Tom Lyon wrote:
 Michael, et al - sorry for the delay, but I've been digesting the comments 
 and researching new approaches.
 
 I think the plan for V4 will be to take things entirely out of the UIO 
 framework, and instead have a driver which supports user mode use of 
 well-behaved PCI devices. I would like to use read and write to support 
 access to memory regions, IO regions,  or PCI config space. Config space is a 
 bitch because not everything is safe to read or write, but I've come up with 
 a table driven approach which can be run-time extended for non-compliant 
 devices (under root control) which could then enable non-privileged users. 
 For instance, OHCI 1394 devices use a dword in config space which is not 
 formatted as a PCI capability, root can use sysfs to enable access:
   echo offset readbits writebits > /sys/dev/pci/devices/:xx:xx.x/yyy/config_permit
 
 
 A well-behaved PCI device must have memory BARs >= 4K for mmapping, must 
 have separate memory space for MSI-X that does not need mmapping
 by the user driver, must support the PCI 2.3 interrupt masking, and must not 
 go totally crazy with PCI config space (tg3 is real ugly, e1000 is fine).

e1000 has a good driver in kernel, though.

 
 Again, my primary usage model is for direct user-level access to network 
 devices, not for virtualization, but I think both will work.

I suspect that without mmap and (to a lesser extent) write-combining,
this would be pretty useless for virtualization.

 So, I will go outside UIO because:
 1 - it doesn't allow reads and writes to sub-drivers, just irqcontrol
 2 - it doesn't have ioctls
 3 - it has its own interrupt model which doesn't use eventfds
 4 - it's ugly doing the new stuff and maintaining backwards compat.
 
 I hereby solicit comments on the name and location for the new driver.


FYI: WinKVM (Windows kernel-based virtual machine)

2010-04-30 Thread kazushi takahashi
Hi there,

I have ported Linux KVM to Microsoft Windows XP and have already succeeded in
running a Linux guest OS under the QEMU that is attached to it. I have named this
virtual machine WinKVM.

I took a daring and original approach to developing WinKVM. More specifically,
I implemented a software abstraction layer that can translate from
the Linux API into
the WinKVM native API in the Microsoft Windows kernel. The KVM source code,
which is not patched, is linked with my software abstraction layer when building
WinKVM. So I did not modify KVM itself; I only prepared the software abstraction
layer to build WinKVM.

This is because KVM development moves so fast. If I simply read the KVM source
code and only regenerated a Windows driver similar to KVM, I could not keep up
with the KVM developers. It is hard to reprogram KVM every time it is updated.

So I decided to implement an abstraction layer that emulates the Linux kernel
on Microsoft Windows.

I have not yet prepared the amenities related to WinKVM, such as
documentation, a website, and so on. But I have a github repository:
http://github.com/ddk50/winkvm/

I would appreciate your feedback.

Regards,
Kazushi Takahashi.


Re: [PATCH 5/5] KVM: VMX: Atomically switch efer if EPT && !EFER.NX

2010-04-30 Thread Marcelo Tosatti

Avi,

This patch breaks WinVista.64 install.

On Wed, Apr 28, 2010 at 04:47:19PM +0300, Avi Kivity wrote:
 When EPT is enabled, we cannot emulate EFER.NX=0 through the shadow page
 tables.  This causes accesses through ptes with bit 63 set to succeed instead
 of failing a reserved bit check.
 
 Signed-off-by: Avi Kivity a...@redhat.com
 ---
  arch/x86/kvm/vmx.c |   11 +++
  1 files changed, 11 insertions(+), 0 deletions(-)
 
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index ae22dcf..cc78fee 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -678,6 +678,17 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
   guest_efer |= host_efer & ignore_bits;
   vmx->guest_msrs[efer_offset].data = guest_efer;
   vmx->guest_msrs[efer_offset].mask = ~ignore_bits;
 +
 + clear_atomic_switch_msr(vmx, MSR_EFER);
 + /* On ept, can't emulate nx, and must switch nx atomically */
 + if (enable_ept && ((vmx->vcpu.arch.efer ^ host_efer) & EFER_NX)) {
 + guest_efer = vmx->vcpu.arch.efer;
 + if (!(guest_efer & EFER_LMA))
 + guest_efer &= ~EFER_LME;
 + add_atomic_switch_msr(vmx, MSR_EFER, guest_efer, host_efer);
 + return false;
 + }
 +
   return true;
  }
  
 -- 
 1.7.0.4


Re: [PATCH 5/5] KVM: VMX: Atomically switch efer if EPT && !EFER.NX

2010-04-30 Thread Avi Kivity

On 04/30/2010 08:37 PM, Marcelo Tosatti wrote:

Avi,

This patch breaks WinVista.64 install.
   


Please back it out then (the entire patchset).

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCHv7] add mergeable buffers support to vhost_net

2010-04-30 Thread Michael S. Tsirkin
On Wed, Apr 28, 2010 at 01:57:12PM -0700, David L Stevens wrote:
 This patch adds mergeable receive buffer support to vhost_net.
 
 Signed-off-by: David L Stevens dlstev...@us.ibm.com

I have applied this, thanks very much!
I have also applied some tweaks on top,
please take a look.

Thanks,
MSt

commit 2809e94f5f26d89dc5232aaec753ffda95c4d95e
Author: Michael S. Tsirkin m...@redhat.com
Date:   Thu Apr 29 16:18:08 2010 +0300

vhost-net: minor tweaks in mergeable buffer code

Applies the following tweaks on top of mergeable buffers patch:
1. vhost_get_desc_n assumes that all descriptors are 'in' only.
   It's also unlikely to be useful for any vhost frontend
   besides vhost_net, so just move it to net.c, and rename
   get_rx_bufs for brevity.

2. for rx, we called iov_length within vhost_get_desc_n
   (now get_rx_bufs) already, so we don't
   need an extra call to iov_length to avoid overflow anymore.
   Accordingly, copy_iovec_hdr can return void now.

3. for rx, do some further code tweaks:
   do not assign len = err as we check that err == len
   handle data length in a way similar to how we handle
   header length: datalen -> sock_len, len -> vhost_len.
   add sock_hlen as a local variable, for symmetry with vhost_hlen.

4. add some likely/unlikely annotations

Signed-off-by: Michael S. Tsirkin m...@redhat.com

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index d61d945..85519b4 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -74,9 +74,9 @@ static int move_iovec_hdr(struct iovec *from, struct iovec *to,
}
return seg;
 }
-/* Copy iovec entries for len bytes from iovec. Return segments used. */
-static int copy_iovec_hdr(const struct iovec *from, struct iovec *to,
- size_t len, int iovcount)
+/* Copy iovec entries for len bytes from iovec. */
+static void copy_iovec_hdr(const struct iovec *from, struct iovec *to,
+  size_t len, int iovcount)
 {
int seg = 0;
size_t size;
@@ -89,7 +89,6 @@ static int copy_iovec_hdr(const struct iovec *from, struct iovec *to,
++to;
++seg;
}
-   return seg;
 }
 
 /* Caller must have TX VQ lock */
@@ -204,7 +203,7 @@ static void handle_tx(struct vhost_net *net)
unuse_mm(net->dev.mm);
 }
 
-static int vhost_head_len(struct vhost_virtqueue *vq, struct sock *sk)
+static int peek_head_len(struct vhost_virtqueue *vq, struct sock *sk)
 {
struct sk_buff *head;
int len = 0;
@@ -212,17 +211,70 @@ static int vhost_head_len(struct vhost_virtqueue *vq, struct sock *sk)
lock_sock(sk);
head = skb_peek(&sk->sk_receive_queue);
if (head)
-   len = head->len + vq->sock_hlen;
+   len = head->len;
release_sock(sk);
return len;
 }
 
+/* This is a multi-buffer version of vhost_get_desc, that works if
+ * vq has read descriptors only.
+ * @vq - the relevant virtqueue
+ * @datalen- data length we'll be reading
+ * @iovcount   - returned count of io vectors we fill
+ * @log- vhost log
+ * @log_num- log offset
+ * returns number of buffer heads allocated, negative on error
+ */
+static int get_rx_bufs(struct vhost_virtqueue *vq,
+  struct vring_used_elem *heads,
+  int datalen,
+  unsigned *iovcount,
+  struct vhost_log *log,
+  unsigned *log_num)
+{
+   unsigned int out, in;
+   int seg = 0;
+   int headcount = 0;
+   unsigned d;
+   int r;
+
+   while (datalen > 0) {
+   if (unlikely(headcount >= VHOST_NET_MAX_SG)) {
+   r = -ENOBUFS;
+   goto err;
+   }
+   d = vhost_get_desc(vq->dev, vq, vq->iov + seg,
+  ARRAY_SIZE(vq->iov) - seg, &out,
+  &in, log, log_num);
+   if (d == vq->num) {
+   r = 0;
+   goto err;
+   }
+   if (unlikely(out || in <= 0)) {
+   vq_err(vq, "unexpected descriptor format for RX: "
+   "out %d, in %d\n", out, in);
+   r = -EINVAL;
+   goto err;
+   }
+   heads[headcount].id = d;
+   heads[headcount].len = iov_length(vq->iov + seg, in);
+   datalen -= heads[headcount].len;
+   ++headcount;
+   seg += in;
+   }
+   *iovcount = seg;
+   return headcount;
+err:
+   vhost_discard_desc(vq, headcount);
+   return r;
+}
+
 /* Expects to be always run from workqueue - which acts as
  * read-size critical section for our kind of RCU. */
 static void handle_rx(struct vhost_net *net)
 {
struct vhost_virtqueue *vq = 

Re: [PATCH] kvm mmu: reduce 50% memory usage

2010-04-30 Thread Marcelo Tosatti
On Wed, Apr 28, 2010 at 07:57:01PM +0800, Lai Jiangshan wrote:
 
 I think users will enable tdp when their hardware supports ept or npt.
 This patch can reduce kvm mmu memory usage by about 50% for them.
 
 This simple patch uses the fact that:
 
 When sp->role.direct is set, sp->gfns does not contain any essential
 information; leaf sptes reachable from this sp are for a contiguous
 guest physical memory range (a linear range).
 So sp->gfns[i] (if it was set) equals sp->gfn + i. (PT_PAGE_TABLE_LEVEL)
 Obviously, it is not essential information; we can calculate it when needed.
 
 It means we don't need sp->gfns when sp->role.direct=1,
 and thus we can save one page for every kvm_mmu_page.
 
 Note:
 Access to sp->gfns must be wrapped by kvm_mmu_page_get_gfn()
 or kvm_mmu_page_set_gfn().
 It is only exposed in FNAME(sync_page).

Lai,

You missed quadrant on 4mb large page emulation with shadow (see updated
patch below). Also for some reason i can't understand the assumption
does not hold for large sptes with TDP, so reverted for now.

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 3266d73..a9edfdb 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -393,6 +393,27 @@ static void mmu_free_rmap_desc(struct kvm_rmap_desc *rd)
kfree(rd);
 }
 
+static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
+{
+   gfn_t gfn;
+
+   if (!sp->role.direct)
+   return sp->gfns[index];
+
+   gfn = sp->gfn + index * (1 << (sp->role.level - 1) * PT64_LEVEL_BITS);
+   gfn += sp->role.quadrant << PT64_LEVEL_BITS;
+
+   return gfn;
+}
+
+static void kvm_mmu_page_set_gfn(struct kvm_mmu_page *sp, int index, gfn_t gfn)
+{
+   if (sp->role.direct)
+   BUG_ON(gfn != kvm_mmu_page_get_gfn(sp, index));
+   else
+   sp->gfns[index] = gfn;
+}
+
 /*
  * Return the pointer to the largepage write count for a given
  * gfn, handling slots that are not large page aligned.
@@ -543,7 +564,7 @@ static int rmap_add(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn)
return count;
gfn = unalias_gfn(vcpu->kvm, gfn);
sp = page_header(__pa(spte));
-   sp->gfns[spte - sp->spt] = gfn;
+   kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn);
rmapp = gfn_to_rmap(vcpu->kvm, gfn, sp->role.level);
if (!*rmapp) {
rmap_printk("rmap_add: %p %llx 0->1\n", spte, *spte);
@@ -601,6 +622,7 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
struct kvm_rmap_desc *prev_desc;
struct kvm_mmu_page *sp;
pfn_t pfn;
+   gfn_t gfn;
unsigned long *rmapp;
int i;
 
@@ -612,7 +634,8 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
kvm_set_pfn_accessed(pfn);
if (is_writable_pte(*spte))
kvm_set_pfn_dirty(pfn);
-   rmapp = gfn_to_rmap(kvm, sp->gfns[spte - sp->spt], sp->role.level);
+   gfn = kvm_mmu_page_get_gfn(sp, spte - sp->spt);
+   rmapp = gfn_to_rmap(kvm, gfn, sp->role.level);
if (!*rmapp) {
printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
BUG();
@@ -896,7 +919,8 @@ static void kvm_mmu_free_page(struct kvm *kvm, struct kvm_mmu_page *sp)
ASSERT(is_empty_shadow_page(sp->spt));
list_del(&sp->link);
__free_page(virt_to_page(sp->spt));
-   __free_page(virt_to_page(sp->gfns));
+   if (!sp->role.direct)
+   __free_page(virt_to_page(sp->gfns));
kfree(sp);
++kvm->arch.n_free_mmu_pages;
 }
@@ -907,13 +931,15 @@ static unsigned kvm_page_table_hashfn(gfn_t gfn)
 }
 
 static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
-  u64 *parent_pte)
+  u64 *parent_pte, int direct)
 {
struct kvm_mmu_page *sp;
 
sp = mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache, sizeof *sp);
sp->spt = mmu_memory_cache_alloc(&vcpu->arch.mmu_page_cache, PAGE_SIZE);
-   sp->gfns = mmu_memory_cache_alloc(&vcpu->arch.mmu_page_cache, PAGE_SIZE);
+   if (!direct)
+   sp->gfns = mmu_memory_cache_alloc(&vcpu->arch.mmu_page_cache,
+ PAGE_SIZE);
set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
bitmap_zero(sp->slot_bitmap, KVM_MEMORY_SLOTS + KVM_PRIVATE_MEM_SLOTS);
@@ -1352,7 +1378,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
if (role.direct)
role.cr4_pae = 0;
role.access = access;
-   if (vcpu->arch.mmu.root_level <= PT32_ROOT_LEVEL) {
quadrant = gaddr >> (PAGE_SHIFT + (PT64_PT_BITS * level));
quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
quadrant = (1  ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1;
role.quadrant = quadrant;
@@ -1379,7 +1405,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct 

Re: [PATCH] kvm mmu: reduce 50% memory usage

2010-04-30 Thread Avi Kivity

On 04/29/2010 09:09 PM, Marcelo Tosatti wrote:


You missed quadrant on 4mb large page emulation with shadow (see updated
patch below).


Good catch.


Also, for some reason I can't understand, the assumption
does not hold for large sptes with TDP, so reverted for now.
   


It's unrelated to TDP, same issue with shadow.  I think the calculation 
is correct.  For example the 4th spte for a level=2 page will yield 
gfn=4*512.
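
A minimal sketch checking that arithmetic (assuming PT64_LEVEL_BITS = 9,
as on x86; illustrative code, not the kernel's):

#include <assert.h>

#define PT64_LEVEL_BITS 9

/* first gfn of the range mapped by spte 'index' of a direct sp */
static unsigned long direct_gfn(unsigned long base_gfn, int level, int index)
{
	return base_gfn + index * (1UL << ((level - 1) * PT64_LEVEL_BITS));
}

int main(void)
{
	/* 4th spte of a level-2 page: each spte covers 2^9 = 512 frames */
	assert(direct_gfn(0, 2, 4) == 4 * 512);
	return 0;
}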



@@ -393,6 +393,27 @@ static void mmu_free_rmap_desc(struct kvm_rmap_desc *rd)
kfree(rd);
  }

+static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
+{
+   gfn_t gfn;
+
+   if (!sp->role.direct)
+   return sp->gfns[index];
+
+   gfn = sp->gfn + index * (1 << (sp->role.level - 1) * PT64_LEVEL_BITS);
+   gfn += sp->role.quadrant << PT64_LEVEL_BITS;
   


PT64_LEVEL_BITS * level


+
+   return gfn;
+}
+

   



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



VT-d passing a Raid Controller in KVM

2010-04-30 Thread Yushu Yao
Hi Experts, 

I have successfully used VT-d PCI passthrough for PCI-E based NICs in KVM.

I'm trying to pass a PCI-E x8 MegaRaid 9240 SAS raid card to a KVM guest. 

However, the device driver in the guest can't initialize the card. It seems that
the driver is trying to RESET the card; however, the card is not in a state in which
it can be reset (maybe because the host driver has already done that?). After some
waiting, the guest driver marks the device DEAD. I need to restart the host to
wake it up. 

I'm just wondering can these kind of devices be passed to the guest? 

Also, at boot of the host, there is a POST screen showing that the MegaRaid
card is looking for drives, etc. Naively thinking, should we see the same POST
screen in the guest?

Thank a lot!

-Yushu





Re: [PATCHv7] add mergeable buffers support to vhost_net

2010-04-30 Thread Michael S. Tsirkin
On Wed, Apr 28, 2010 at 01:57:12PM -0700, David L Stevens wrote:
 This patch adds mergeable receive buffer support to vhost_net.
 
 Signed-off-by: David L Stevens dlstev...@us.ibm.com

You can find the latest version on the following net-next based tree:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost

-- 
MST


Re: [PATCH 4/4] KVM MMU: do not intercept invlpg if 'oos_shadow' is disabled

2010-04-30 Thread Avi Kivity

On 04/30/2010 12:05 PM, Xiao Guangrong wrote:

If 'oos_shadow' == 0, intercepting the invlpg instruction is really
unnecessary.

It is also useful for comparing the performance with 'oos_shadow'
enabled and disabled.

@@ -74,8 +74,9 @@ static int dbg = 0;
  module_param(dbg, bool, 0644);
  #endif

-static int oos_shadow = 1;
+int __read_mostly oos_shadow = 1;
  module_param(oos_shadow, bool, 0644);
+EXPORT_SYMBOL_GPL(oos_shadow);
   


Please rename to kvm_oos_shadow to reduce potential for conflict with 
other global names.


But really, this is a debug option, I don't expect people to run with 
oos_shadow=0, so there's not much motivation to optimize it.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] kvm mmu: reduce 50% memory usage

2010-04-30 Thread Marcelo Tosatti
On Thu, Apr 29, 2010 at 09:43:40PM +0300, Avi Kivity wrote:
 On 04/29/2010 09:09 PM, Marcelo Tosatti wrote:
 
 You missed quadrant on 4mb large page emulation with shadow (see updated
 patch below).
 
 Good catch.
 
 Also, for some reason I can't understand, the assumption
 does not hold for large sptes with TDP, so reverted for now.
 
 It's unrelated to TDP, same issue with shadow.  I think the
 calculation is correct.  For example the 4th spte for a level=2 page
 will yield gfn=4*512.

Under testing I see an sp at level 2, with sp->gfn == 4096, and mmu_set_spte
setting index 8 to gfn 4096 (whereas kvm_mmu_page_get_gfn returns 4096 +
8*512).

Lai, can you please take a look at it? You should see the
kvm_mmu_page_set_gfn BUG_ON by using -mem-path on hugetlbfs.

 @@ -393,6 +393,27 @@ static void mmu_free_rmap_desc(struct kvm_rmap_desc *rd)
  kfree(rd);
   }
 
 +static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index)
 +{
 +gfn_t gfn;
 +
 +if (!sp->role.direct)
 +return sp->gfns[index];
 +
 +gfn = sp->gfn + index * (1 << (sp->role.level - 1) * PT64_LEVEL_BITS);
 +gfn += sp->role.quadrant << PT64_LEVEL_BITS;
 
 PT64_LEVEL_BITS * level
 
 +
 +return gfn;
 +}
 +
 
 
 
 -- 
 I have a truly marvellous patch that fixes the bug which this
 signature is too narrow to contain.


Re: Regression in vga performance between 0.11.1 and 0.12.1.1

2010-04-30 Thread Avi Kivity

On 04/28/2010 10:33 PM, Adam Greenblatt wrote:

Hi,
  I noticed that certain guests (for example, Ubuntu 9.04, Ubuntu 9.10,
and the Ubuntu 10.04 release candidate) show dramatically (~100x) slower
graphical output when running under qemu-kvm-0.12.1.1 than under 
qemu-kvm-0.11.1.

Other guests, notably Windows XP and Windows Vista, run fine under both
versions of qemu.  The regression is still present in qemu-kvm-0.12.3.



Please post kvm_stat output for the fast and slow cases (preferably 
running the same workload, perhaps a web page displaying an animation).


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



KVM hook for code integrity checking

2010-04-30 Thread Suen Chun Hui
Dear KVM developers,

I'm currently working on an open source security patch to use KVM to
implement code verification on a guest VM in runtime. Thus, it would be
very helpful if someone can point to me the right function or place to
look at for adding 2 hooks into the KVM paging code to:

1. Detect a new guest page (which I assume will imply a new pte and
imply a new spte).
Currently, I'm considering putting a hook in the function
mmu_set_spte(), but maybe there is a better place.
This hook will be used as the main entry point into the code
verification function.

2. Detect a write fault to a read-only spte (eg. for the case of
updating the dirty bit back to the guest pte)
Unfortunately, I'm unable to find an appropriate place where this
actually takes place after reading the code many times.
This hook will be used to prevent a secondary peek page from modifying
an existing verified code page.

I will gladly contribute my work (my PhD thesis) to the list once it is
workable, if anyone finds it useful.
Thank you in advance.

Regards,
Chun Hui


Re: [PATCH 0/5] Fix EFER.NX=0 with EPT

2010-04-30 Thread Avi Kivity

On 04/29/2010 02:22 AM, Marcelo Tosatti wrote:

On Wed, Apr 28, 2010 at 04:47:14PM +0300, Avi Kivity wrote:
   

Currently we run with EFER.NX=1 on the guest even if the guest value is 0.
This is fine with shadow, since we check bit 63 when instantiating a page
table, and fault if bit 63 is set while EFER.NX is clear.

This doesn't work with EPT, since we no longer get the chance to check guest
ptes.  So we need to run with EFER.NX=0.

This is complicated by the fact that if we switch EFER.NX on the host, we'll
trap immediately, since some host pages are mapped with the NX bit set.  As
a result, we need to switch the MSR atomically during guest entry and exit.

 

Applied, thanks.

Out of curiosity, did you measure the vmentry/vmexit overhead?
   


Just did now.

  nx=0: 3384
  nx=1: 2203

Perhaps using the dedicated vmcs field is faster (but since it's 
optional, we have to support the generic autoload area).


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



[PATCH] KVM: Document KVM_SET_BOOT_CPU_ID

2010-04-30 Thread Avi Kivity
Signed-off-by: Avi Kivity a...@redhat.com
---
 Documentation/kvm/api.txt |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
index 0f96e52..159b4ef 100644
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -910,6 +910,18 @@ This ioctl is required on Intel-based hosts.  This is needed on Intel hardware
 because of a quirk in the virtualization implementation (see the internals
 documentation when it pops into existence).
 
+4.40 KVM_SET_BOOT_CPU_ID
+
+Capability: KVM_CAP_SET_BOOT_CPU_ID
+Architectures: x86, ia64
+Type: vm ioctl
+Parameters: unsigned long vcpu_id
+Returns: 0 on success, -1 on error
+
+Define which vcpu is the Bootstrap Processor (BSP).  Values are the same
+as the vcpu id in KVM_CREATE_VCPU.  If this ioctl is not called, the default
+is vcpu 0.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
-- 
1.7.0.4
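
A minimal userspace sketch of using this ioctl (assumptions here: the
capability is checked on the /dev/kvm fd, and the call should precede vcpu
creation; error handling trimmed):

#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, 0);

	if (ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_SET_BOOT_CPU_ID) > 0) {
		/* vcpu_id is passed by value, same ids as KVM_CREATE_VCPU */
		if (ioctl(vm, KVM_SET_BOOT_CPU_ID, 1ul) < 0)
			perror("KVM_SET_BOOT_CPU_ID");
	}
	return 0;
}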



Re: Can I simulate a virtual Dual-Head Graphiccard?

2010-04-30 Thread Avi Kivity

On 04/28/2010 11:08 PM, Axel Kittenberger wrote:

Hello,

This is a question I was not able to answer with a search. I've been 
using kvm quite successfully as a server-side solution. Now I want 
to use it on a particular desktop to have a Windows 7 guest on a 
native Linux system. Well, this desktop has two screens, and I'm sure 
it's expected to have the guest also on both screens.


Supposedly I could just simulate a very wide screen and have the host 
split it (either SDL or VNC). However, this is not quite the same, as 
the guest will think exactly that: one wide screen. Meaning it will put 
all the message boxes exactly in the middle between the two screens, 
have the taskbar span both screens, and what not. So two screens 
are handled a tad differently than one wide one.


Is this possible with kvm?


The spice project supports multiple monitors, so when that is merged, 
you'll be able to use multiple displays.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: Potential thread synchronization issue in qcow2.c and qcow2-cluster.c

2010-04-30 Thread Stefan Hajnoczi
 I profiled all executions of
 qemu_mutex_lock_iothread(), and found that
 it only protects the vl.c:main_loop_wait() thread but does NOT protect
 the qemu-kvm.c:kvm_cpu_exec() thread. Did I miss something or is this
 a defect?

Hi again, I took another look at qemu-kvm 0.12.3 and here is how I read it:

The mutex which is supposed to protect IO emulation is qemu-kvm.c:qemu_mutex.

The cpu thread will unlock qemu_mutex in pre_kvm_run() before
ioctl(fd, KVM_RUN, 0).  Then it will lock qemu_mutex again in
post_kvm_run().

The io thread will unlock qemu_mutex via
qemu-kvm.c:qemu_mutex_unlock_iothread() before waiting in select().
Then it will lock qemu_mutex again in
qemu-kvm.c:qemu_mutex_lock_iothread().

I believe this *does* protect IO emulation correctly.  The code is
confusing because there are multiple definitions of the same functions
and #ifdefs, maybe I made a mistake.
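
As a minimal runnable sketch of that protocol (hypothetical code, not
qemu's; it only shows the hold-while-emulating / drop-while-blocked
pattern, with sleeps standing in for ioctl(KVM_RUN) and select()):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER;

static void *cpu_thread(void *arg)
{
	pthread_mutex_lock(&qemu_mutex);
	for (int i = 0; i < 3; i++) {
		pthread_mutex_unlock(&qemu_mutex); /* like pre_kvm_run()  */
		usleep(1000);                      /* stands in for KVM_RUN */
		pthread_mutex_lock(&qemu_mutex);   /* like post_kvm_run() */
		printf("cpu: emulating exit under qemu_mutex\n");
	}
	pthread_mutex_unlock(&qemu_mutex);
	return NULL;
}

static void *io_thread(void *arg)
{
	pthread_mutex_lock(&qemu_mutex);
	for (int i = 0; i < 3; i++) {
		pthread_mutex_unlock(&qemu_mutex); /* unlock_iothread()   */
		usleep(1000);                      /* stands in for select() */
		pthread_mutex_lock(&qemu_mutex);   /* lock_iothread()     */
		printf("io: running handlers under qemu_mutex\n");
	}
	pthread_mutex_unlock(&qemu_mutex);
	return NULL;
}

int main(void)
{
	pthread_t cpu, io;

	pthread_create(&cpu, NULL, cpu_thread, NULL);
	pthread_create(&io, NULL, io_thread, NULL);
	pthread_join(cpu, NULL);
	pthread_join(io, NULL);
	return 0;
}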

 Here is the trace showing that
 qemu_mutex_lock_iothread() does not protect the thread
 that executes. kvm_cpu_exec()-...-qcow_aio_write_cb().

 home/ctang/kvm/qemu-kvm-0.12.3/qemu-kvm.c : 2530    thread: b7e056d0
       /home/ctang/kvm/bin/qemu-system-x86_64(qemu_mutex_unlock_iothread+0x1a)
 [0x8092242]
       /home/ctang/kvm/bin/qemu-system-x86_64(main_loop_wait+0x221) [0x806edef]
       /home/ctang/kvm/bin/qemu-system-x86_64(kvm_main_loop+0x1ff) [0x80916a1]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x806f5c2]
       /home/ctang/kvm/bin/qemu-system-x86_64(main+0x2e2c) [0x80736d1]
       /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb7e33775]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x8068bb1]

 block/qcow2-cluster.c : 721    thread: b7dc2b90
       /home/ctang/kvm/bin/qemu-system-x86_64(qcow2_alloc_cluster_offset+0x3c)
 [0x81175fa]
       /home/ctang/kvm/bin/qemu-system-x86_64(qcow_aio_write_cb+0x158)
 [0x8111d73]
       /home/ctang/kvm/bin/qemu-system-x86_64(qcow_aio_writev+0x94) [0x8112054]
       /home/ctang/kvm/bin/qemu-system-x86_64(bdrv_aio_writev+0xe1) [0x80fa8e9]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x81f4a96]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x81f4c04]
       /home/ctang/kvm/bin/qemu-system-x86_64(dma_bdrv_write+0x48) [0x81f4cbf]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x80a437c]
       /home/ctang/kvm/bin/qemu-system-x86_64(bmdma_cmd_writeb+0x73)
 [0x80a9503]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x812b1eb]
       /home/ctang/kvm/bin/qemu-system-x86_64(cpu_outb+0x27) [0x812b4e6]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x808d267]
       /home/ctang/kvm/bin/qemu-system-x86_64(kvm_run+0x2f4) [0x808f4b8]
       /home/ctang/kvm/bin/qemu-system-x86_64(kvm_cpu_exec+0x56) [0x80907b2]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x8090f4d]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x8091098]
       /lib/tls/i686/cmov/libpthread.so.0 [0xb7fd24ff]
       /lib/tls/i686/cmov/libc.so.6(clone+0x5e) [0xb7f0149e]

 /home/ctang/kvm/qemu-kvm-0.12.3/qemu-kvm.c : 2537    thread: b7e056d0
       /home/ctang/kvm/bin/qemu-system-x86_64(qemu_mutex_lock_iothread+0x1a)
 [0x809229d]
       /home/ctang/kvm/bin/qemu-system-x86_64(main_loop_wait+0x25c) [0x806ee2a]
       /home/ctang/kvm/bin/qemu-system-x86_64(kvm_main_loop+0x1ff) [0x80916a1]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x806f5c2]
       /home/ctang/kvm/bin/qemu-system-x86_64(main+0x2e2c) [0x80736d1]
       /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5) [0xb7e33775]
       /home/ctang/kvm/bin/qemu-system-x86_64 [0x8068bb1]

kvm_cpu_exec() never calls qemu_mutex_lock_iothread(), but it does lock
the underlying mutex via post_kvm_run().  It's just confusing because
vl.c calls it the "iothread mutex" whereas qemu-kvm.c calls it the "qemu
mutex", and there are wrapper functions.

Does this help?

Stefan


Re: [PATCH V3] drivers/uio/uio_pci_generic.c: allow access for non-privileged processes

2010-04-30 Thread Tom Lyon
Michael, et al - sorry for the delay, but I've been digesting the comments and 
researching new approaches.

I think the plan for V4 will be to take things entirely out of the UIO 
framework, and instead have a driver which supports user-mode use of 
well-behaved PCI devices. I would like to use read and write to support 
access to memory regions, IO regions, or PCI config space. Config space is a 
bitch because not everything is safe to read or write, but I've come up with a 
table-driven approach which can be run-time extended for non-compliant devices 
(under root control), which could then enable non-privileged users. For 
instance, OHCI 1394 devices use a dword in config space which is not formatted 
as a PCI capability; root can use sysfs to enable access:
echo offset readbits writebits > /sys/dev/pci/devices/:xx:xx.x/yyy/config_permit
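
To sketch the table-driven idea (a rough illustration only; the struct and
function names below are hypothetical, not from any posted code):

#include <linux/types.h>

/* one entry per config-space dword a non-privileged user may touch */
struct config_permit {
	u16 offset;	/* dword-aligned offset into PCI config space */
	u32 readbits;	/* bits the user may read; the rest read back as 0 */
	u32 writebits;	/* bits the user may write; the rest are preserved */
};

/* merge a user-requested write with the current value per the table */
static u32 permitted_write(const struct config_permit *p, u32 cur, u32 val)
{
	return (cur & ~p->writebits) | (val & p->writebits);
}

Reads would be masked with readbits the same way, and root could append
entries at run time for non-compliant devices like the OHCI 1394 case above.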


A well-behaved PCI device must have memory BARs >= 4K for mmapping, must have 
separate memory space for MSI-X that does not need mmapping
by the user driver, must support PCI 2.3 interrupt masking, and must not go 
totally crazy with PCI config space (tg3 is real ugly, e1000 is fine).

Again, my primary usage model is for direct user-level access to network 
devices, not for virtualization, but I think both will work.

So, I will go outside UIO because:
1 - it doesn't allow reads and writes to sub-drivers, just irqcontrol
2 - it doesn't have ioctls
3 - it has its own interrupt model which doesn't use eventfds
4 - it's ugly doing the new stuff and maintaining backwards compat.

I hereby solicit comments on the name and location for the new driver.

Michael - some of your comments below imply you didn't look at the companion 
changes to uio.c, which had the eventfd interrupts and effectively the same 
iommu handling - but see my inline comments below.

On Wednesday 21 April 2010 02:38:49 am Michael S. Tsirkin wrote:
 On Mon, Apr 19, 2010 at 03:05:35PM -0700, Tom Lyon wrote:
  
  These are changes to uio_pci_generic.c to allow better use of the driver by
  non-privileged processes.
  1. Add back old code which allowed interrupt re-enablement through uio fd.
  2. Translate PCI BARs to uio mmap regions, to allow mmap through uio fd.
 
 Since it's common for drivers to need configuration cycles
 for device control, the above 2 are not sufficient for generic devices.
 And if you fix the above, you won't need irqcontrol,
 which IMO we are better off saving for stuff like eventfd mapping.
I will handle config access for well-behaved devices.

 
 Also - this modifies a kernel/userspace interface in a way
 that makes an operation that was always safe previously
 potentially unsafe.
Not sure what you meant, but it's probably irrelevant in the new scheme.
 
 Also, BAR regions could be less than 1 page in size;
 mapping these to an unprivileged process is a security problem.
Agreed, no mmapping, just r/w.

 Also, for a generic driver, we likely need write combining
 support in the interface.
Given that many system platforms don't have it, doesn't seem like a big deal.
But I'll look into it.

 Also, io space often can not be mmaped. We need read/write
 for that.
Agreed.

 
  3. Allow devices which support MSI or MSI-X, but not IRQ.
 
 If the device supports MSI or MSI-X, it can perform
 PCI writes upstream, and MSI-X vectors are controlled
 through memory. So with MSI-X + mmap to an unprivileged
 process you can easily cause the device to modify any memory.
Yes, will virtualize this in the driver. User level will not be allowed to
mmap the real MSI-X region (if MSI-X is in use); MSI config writes will be
intercepted.

 With MSI it's hard to be sure, maybe some devices might make guarantees
 not to do writes except for MSI, but there's no generic way to declare
 that: bus master needs to be enabled for MSI to work, and once bus
 master is enabled, nothing seems to prevent the device from corrupting
 host memory.
The code already requires iommu protection for masters; I will make sure this
includes MSI and MSI-X devices.

As an aside, the IOMMU is supposed to be able to do interrupt translation also,
but the format for vectors changes, so it doesn't really help with 
virtualization.

 So the patch doesn't look like generic enough or safe enough
 for users I have in mind (virtualization). What users/devices
 do you have in mind?
Non-virt, just new user level drivers for special cases.

 For virtualization, I've been thinking about unprivileged access and
 msi as well, and here's a plan I thought might work:
 
 - add a uio_iommu character device that controls an iommu domain
 - uio_iommu would make sure iommu is programmed in a safe way
 - use irqcontrol to bind pci device to iommu domain
 - allow config cycles through uio fd, but
   force bus master to 0 unless device is bound to a domain
 - for sub-page regions, and io, we can't allow mmap to an unprivileged
   process. Extend irqcontrol to allow read/write and range-check the
   operations
 - for msi/msix, drivers use multiple 

Re: Potential thread synchronization issue in qcow2.c and qcow2-cluster.c

2010-04-30 Thread Chunqiang (CQ) Tang
You are absolutely right, and this solved the puzzle. I also did
profiling to confirm your observation. Thank you for all the help!

 Hi again, I took another look at qemu-kvm 0.12.3 and here is how I read it:

 The mutex which is supposed to protect IO emulation is qemu-kvm.c:qemu_mutex.

 The cpu thread will unlock qemu_mutex in pre_kvm_run() before
 ioctl(fd, KVM_RUN, 0).  Then it will lock qemu_mutex again in
 post_kvm_run().

 The io thread will unlock qemu_mutex via
 qemu-kvm.c:qemu_mutex_unlock_iothread() before waiting in select().
 Then it will lock qemu_mutex again in
 qemu-kvm.c:qemu_mutex_lock_iothread().

 I believe this *does* protect IO emulation correctly.  The code is
 confusing because there are multiple definitions of the same functions
 and #ifdefs, maybe I made a mistake.

 Regards,
CQ Tang


[PATCH] intel_txt: enable VMXON check with SMX in KVM

2010-04-30 Thread Shane Wang
Per the documentation, for the feature control MSR:
Bit 1 enables VMXON in SMX operation. If the bit is clear, execution of VMXON 
in SMX operation causes a general-protection exception.
Bit 2 enables VMXON outside SMX operation. If the bit is clear, execution of 
VMXON outside SMX operation causes a general-protection exception.

This patch enables this kind of check, with SMX taken into account, for VMXON in KVM.

Signed-off-by: Shane Wang shane.w...@intel.com
---
 arch/x86/include/asm/msr-index.h |    5 ++--
 arch/x86/kernel/tboot.c          |    1 +
 arch/x86/kvm/vmx.c               |   32 +++++++++++++++++++++-----------
 include/linux/tboot.h            |    1 +
 4 files changed, 26 insertions(+), 13 deletions(-)

diff -r a96602743dbd arch/x86/include/asm/msr-index.h
--- a/arch/x86/include/asm/msr-index.h  Thu Apr 29 11:49:08 2010 -0400
+++ b/arch/x86/include/asm/msr-index.h  Thu Apr 29 11:49:40 2010 -0400
@@ -202,8 +202,9 @@
 #define MSR_IA32_EBL_CR_POWERON		0x0000002a
 #define MSR_IA32_FEATURE_CONTROL	0x0000003a
 
-#define FEATURE_CONTROL_LOCKED (1<<0)
-#define FEATURE_CONTROL_VMXON_ENABLED  (1<<2)
+#define FEATURE_CONTROL_LOCKED (1<<0)
+#define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX   (1<<1)
+#define FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX  (1<<2)
 
 #define MSR_IA32_APICBASE		0x0000001b
 #define MSR_IA32_APICBASE_BSP		(1<<8)
diff -r a96602743dbd arch/x86/kernel/tboot.c
--- a/arch/x86/kernel/tboot.c   Thu Apr 29 11:49:08 2010 -0400
+++ b/arch/x86/kernel/tboot.c   Thu Apr 29 11:49:40 2010 -0400
@@ -46,6 +46,7 @@
 
 /* Global pointer to shared data; NULL means no measured launch. */
 struct tboot *tboot __read_mostly;
+EXPORT_SYMBOL(tboot);
 
 /* timeout for APs (in secs) to enter wait-for-SIPI state during shutdown */
 #define AP_WAIT_TIMEOUT		1
diff -r a96602743dbd arch/x86/kvm/vmx.c
--- a/arch/x86/kvm/vmx.c	Thu Apr 29 11:49:08 2010 -0400
+++ b/arch/x86/kvm/vmx.c	Thu Apr 29 11:49:40 2010 -0400
@@ -27,6 +27,7 @@
 #include <linux/moduleparam.h>
 #include <linux/ftrace_event.h>
 #include <linux/slab.h>
+#include <linux/tboot.h>
 #include "kvm_cache_regs.h"
 #include "x86.h"
 
@@ -1176,9 +1177,16 @@ static __init int vmx_disabled_by_bios(v
u64 msr;
 
rdmsrl(MSR_IA32_FEATURE_CONTROL, msr);
-   return (msr & (FEATURE_CONTROL_LOCKED |
-  FEATURE_CONTROL_VMXON_ENABLED))
-   == FEATURE_CONTROL_LOCKED;
+   if (!!(msr & FEATURE_CONTROL_LOCKED)) {
+   if (!(msr & FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX)
+    && tboot_enabled())
+   return 1;
+   if (!(msr & FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX)
+    && !tboot_enabled())
+   return 1;
+   }
+
+   return 0;
/* locked but not enabled */
 }
 
@@ -1186,21 +1194,23 @@ static int hardware_enable(void *garbage
 {
int cpu = raw_smp_processor_id();
u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
-   u64 old;
+   u64 old, test_bits;
 
 if (read_cr4() & X86_CR4_VMXE)
return -EBUSY;
 
 INIT_LIST_HEAD(&per_cpu(vcpus_on_cpu, cpu));
rdmsrl(MSR_IA32_FEATURE_CONTROL, old);
-   if ((old & (FEATURE_CONTROL_LOCKED |
-   FEATURE_CONTROL_VMXON_ENABLED))
-   != (FEATURE_CONTROL_LOCKED |
-   FEATURE_CONTROL_VMXON_ENABLED))
+
+   test_bits = FEATURE_CONTROL_LOCKED;
+   test_bits |= FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
+   if (tboot_enabled())
+   test_bits |= FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX;
+
+   if ((old & test_bits) != test_bits) {
/* enable and lock */
-   wrmsrl(MSR_IA32_FEATURE_CONTROL, old |
-  FEATURE_CONTROL_LOCKED |
-  FEATURE_CONTROL_VMXON_ENABLED);
+   wrmsrl(MSR_IA32_FEATURE_CONTROL, old | test_bits);
+   }
write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug safe */
 asm volatile (ASM_VMX_VMXON_RAX
   : : "a"(phys_addr), "m"(phys_addr)
diff -r a96602743dbd include/linux/tboot.h
--- a/include/linux/tboot.h Thu Apr 29 11:49:08 2010 -0400
+++ b/include/linux/tboot.h Thu Apr 29 11:49:40 2010 -0400
@@ -150,6 +150,7 @@ extern int tboot_force_iommu(void);
 
 #else
 
+#define tboot_enabled()			0
 #define tboot_probe()  do { } while (0)
 #define tboot_shutdown(shutdown_type)  do { } while (0)
 #define tboot_sleep(sleep_state, pm1a_control, pm1b_control)   \


--no-kvm support is still broken - Will this ever be supported again?

2010-04-30 Thread Trevor Orsztynowicz
Hi everyone,

KVM-0.12.3 and the latest version from the git repository no longer
support the '--no-kvm' option. This seems to be related to virtio and
the boot=on flag. The last version in which this worked, to my knowledge,
was KVM-0.11.0. Many distributions don't include plain QEMU builds
any more (only KVM ones), so this flag is still necessary.
This feature should really be supported.

Issue:
The latest versions of KVM have had the --no-kvm option broken for some time.

Description:
VMs that work fine with KVM enabled do not work with the -no-kvm flag.
If I remove the boot=on from the hard drive, I am able to boot an ISO
image but if I re-add boot=on the system hangs either in loading GRUB
or ISOLINUX if I boot on an Ubuntu install ISO. This is happening on
multiple machines (all x86_64). On machines without virtualization
extensions this happens regardless of the -no-kvm flag.

Original bug report:
http://sourceforge.net/tracker/?func=detail&aid=2972734&group_id=180599&atid=893831


Re: [PATCH v3 7/10] KVM MMU: allow more page become unsync at gfn mapping time

2010-04-30 Thread Marcelo Tosatti
On Wed, Apr 28, 2010 at 11:55:49AM +0800, Xiao Guangrong wrote:
 In the current code, a shadow page can become unsync only if it is the
 only shadow page for a gfn. This rule is too strict; in fact, we can
 let every last-level mapping page (i.e., the pte page) become unsync,
 and sync them at invlpg or TLB flush time.
 
 This patch allows more pages to become unsync at gfn mapping time.
 
 Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com

Xiao,

This patch breaks the Fedora 8 32-bit install. Reverted patches 5-10.



Re: Can I simulate a virtual Dual-Head Graphiccard?

2010-04-30 Thread FinnTux
 Is this possible with kvm?
 Either simulate a dual head mapping to one wide SDL/VNC display. Or
 having two SDL/VNC displays?


 No. It was brought up before on the qemu list I believe. I think the gist was
 that qemu didn't support more than one vga card.


Spice (www.spice-space.org) can do this. I just tested it and it works
well. Too bad spice isn't included in qemu-kvm yet.


What changed since kvm-72 resulting in winNT to fail to boot (STOP 0x0000001E) ?

2010-04-30 Thread Michael Tokarev

I've a bugreport handy, see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=575439
about the apparent problem booting winNT 4 in kvm 0.12.
At least 2 people were hit by this issue.  In short, when
booting winNT 4.0, it BSODs with error code 0x0000001E,
which means inaccessible boot device.

Note that it is when upgrading from -72 to 0.12.  But the
second person on the bug page also tried to reinstall the
OS, and that failed as well, now with a different error
message.

I'll try to find an nt4 cdrom here to try it, but I can't
promise anything

Thanks!

/mjt


Re: [qemu-kvm tests PATCH v2 0/3] qemu-kvm tests cleanup

2010-04-30 Thread Marcelo Tosatti
On Tue, Apr 27, 2010 at 03:57:42PM +0300, Naphtali Sprei wrote:
 changes v1 -> v2
 single trailing whitespace cleanup
 
 Cleanup, mostly x86 oriented.
 Patches against 'next' branch.
 
 Naphtali Sprei (3):
   qemu-kvm tests cleanup
   qemu-kvm tests cleanup: adapt stringio test to kernel-mode run
   qemu-kvm tests cleanup: Added printing for passing tests Also typo
 fix

Applied, thanks.



Re: What changed since kvm-72 resulting in winNT to fail to boot (STOP 0x0000001E) ?

2010-04-30 Thread Michael Tokarev

01.05.2010 00:06, Michael Tokarev wrote:

I've a bugreport handy, see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=575439
about the apparent problem booting winNT 4 in kvm 0.12.
At least 2 people were hit by this issue. In short, when
booting winNT 4.0, it BSODs with error code 0x0000001E,
which means inaccessible boot device.

Note that it is when upgrading from -72 to 0.12. But the
second person on the bug page also tried to reinstall the
OS, and that failed as well, now with a different error
message.

I'll try to find an nt4 cdrom here to try it, but I can't
promise anything


I found an old winNT-4.0 install CD-ROM and tried that.  But
it stops right at the "Inspecting your hardware configuration"
stage, with the error: STOP: 0x0000003E (BSOD).  According
to MS -- http://msdn.microsoft.com/en-us/library/ms819006.aspx --
this means "multiprocessor configuration is not supported. For
example, not all processors are at the same level or of the same
type. There might also be mismatched coprocessor support."
Obviously I'm running it with -smp 1, here's the complete
kvm command line:
 kvm -hda winNT.raw -m 512 -cdrom Win_NT_4_Enterprise.iso -localtime -vga std
(with default vga - cirrus - it displays garbage)

And the complete BSOD:
  http://www.corpit.ru/mjt/winnt4_1.gif


Thanks!

/mjt




Booting/installing WindowsNT

2010-04-30 Thread Michael Tokarev

Apparently with current kvm stable (0.12.3)
Windows NT 4.0 does not install anymore.

With default -cpu, it boots, displays the
"Inspecting your hardware configuration"
message and BSODs with a STOP: 0x0000003E
error as shown here:
 http://www.corpit.ru/mjt/winnt4_1.gif
With -cpu pentium the situation is a
bit better, it displays:

 Microsoft (R) Windows NT (TM) Version 4.0 (Build 1381).
 1 System Processor [512 MB Memory]  Multiprocessor Kernel

and stops there with 100% CPU usage, never
going any further.

Kvm command line is trivial, with -hda
and -cdrom and -vga std (with -vga cirrus
it displays garbage here).  The only parameters
of interest are:

 -no-acpi - this one has no visible effect
 -cpu pentium - tried that with some change
   but no success anyway.

Anyone know what's going on here?

Thanks!

/mjt


Re: High CPU load on target host after migration

2010-04-30 Thread Thomas Beinicke
I forgot to mention that I am using qemu-kvm-0.12.3 on all of my machines.
I am also using bridged networking on all hosts.
Is anyone else on the list doing live migration and facing the same
problems I mentioned below?

I would really like to debug this further but I need some assistance on how to 
proceed.

Thanks,

Thomas


On Wednesday 28 April 2010 23:14:40 you wrote:
 Hi,
 
 I have been toying around with kvm / libvirt / virt-manager and it's
 migration feature.
 Both host machines are running a 2.6.33 Kernel.
 
 One host is a Dual Quad Core Intel Xeon E5520  @ 2.27GHz and the other is a
 Dual Quad Core Intel E5420  @ 2.50GHz.
 
 Migrating Linux machines works great but Windows XP SP3 is giving me a
 headache.
 The migration process finishes without any crash/error but the CPU load on
 the target host is ~50% on two cores. There is no CPU intensive task
 running inside the VM though.
 Removing the network card from the VM and migrating it between the two
 machines doesn't seem to trigger the high CPU load.
 As network driver I tried the realtek and Red Hat virtio driver but it
 doesn't seem to make a difference.
 
 Any insights on what could cause this situation and how to best debug it?
 
 Cheers,
 
 Thomas


[PATCH 1/2] IOzone test: Introduce postprocessing module

2010-04-30 Thread Lucas Meneghel Rodrigues
This module contains code to postprocess IOzone data
in a convenient way so we can generate performance graphs
and condensed data. The graph generation part depends
on gnuplot, but if the utility is not present,
functionality will gracefully degrade.

The reasons why this was created as a separate module are:
 * It doesn't pollute the main test class.
 * It allows us to use the postprocessing module as a stand-alone program
   that can even do performance comparisons between 2 IOzone runs.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/iozone/postprocessing.py |  487 +
 1 files changed, 487 insertions(+), 0 deletions(-)
 create mode 100755 client/tests/iozone/postprocessing.py

diff --git a/client/tests/iozone/postprocessing.py 
b/client/tests/iozone/postprocessing.py
new file mode 100755
index 000..b495502
--- /dev/null
+++ b/client/tests/iozone/postprocessing.py
@@ -0,0 +1,487 @@
+#!/usr/bin/python
+"""
+Postprocessing module for IOzone. It is capable to pick results from an
+IOzone run, calculate the geometric mean for all throughput results for
+a given file size or record size, and then generate a series of 2D and 3D
+graphs. The graph generation functionality depends on gnuplot, and if it
+is not present, functionality degrades gracefully.
+
+@copyright: Red Hat 2010
+"""
+import os, sys, optparse, logging, math, time
+import common
+from autotest_lib.client.common_lib import logging_config, logging_manager
+from autotest_lib.client.common_lib import error
+from autotest_lib.client.bin import utils, os_dep
+
+
+_LABELS = ('file_size', 'record_size', 'write', 'rewrite', 'read', 'reread',
+           'randread', 'randwrite', 'bkwdread', 'recordrewrite', 'strideread',
+           'fwrite', 'frewrite', 'fread', 'freread')
+
+
+def unique(list):
+    """
+    Return a list of the elements in list, but without duplicates.
+
+    @param list: List with values.
+    @return: List with non duplicate elements.
+    """
+    n = len(list)
+    if n == 0:
+        return []
+    u = {}
+    try:
+        for x in list:
+            u[x] = 1
+    except TypeError:
+        return None
+    else:
+        return u.keys()
+
+
+def geometric_mean(values):
+    """
+    Evaluates the geometric mean for a list of numeric values.
+
+    @param values: List with values.
+    @return: Single value representing the geometric mean for the list values.
+    @see: http://en.wikipedia.org/wiki/Geometric_mean
+    """
+    try:
+        values = [int(value) for value in values]
+    except ValueError:
+        return None
+    product = 1
+    n = len(values)
+    if n == 0:
+        return None
+    return math.exp(sum([math.log(x) for x in values])/n)
+
+
+def compare_matrices(matrix1, matrix2, treshold=0.05):
+    """
+    Compare 2 matrices nxm and return a matrix nxm with comparison data
+
+    @param matrix1: Reference Matrix with numeric data
+    @param matrix2: Matrix that will be compared
+    @param treshold: Any difference bigger than this percent treshold will be
+            reported.
+    """
+    improvements = 0
+    regressions = 0
+    same = 0
+    comparison_matrix = []
+
+    new_matrix = []
+    for line1, line2 in zip(matrix1, matrix2):
+        new_line = []
+        for element1, element2 in zip(line1, line2):
+            ratio = float(element2) / float(element1)
+            if ratio < (1 - treshold):
+                regressions += 1
+                new_line.append((100 * ratio - 1) - 100)
+            elif ratio > (1 + treshold):
+                improvements += 1
+                new_line.append("+" + str((100 * ratio - 1) - 100))
+            else:
+                same += 1
+                if line1.index(element1) == 0:
+                    new_line.append(element1)
+                else:
+                    new_line.append(".")
+        new_matrix.append(new_line)
+
+    total = improvements + regressions + same
+
+    return (new_matrix, improvements, regressions, total)
+
+
+class IOzoneAnalyzer(object):
+    """
+    Analyze an unprocessed IOzone file, and generate the following types of
+    report:
+
+    * Summary of throughput for all file and record sizes combined
+    * Summary of throughput for all file sizes
+    * Summary of throughput for all record sizes
+
+    If more than one file is provided to the analyzer object, a comparison
+    between the two runs is made, searching for regressions in performance.
+    """
+    def __init__(self, list_files, output_dir):
+        self.list_files = list_files
+        if not os.path.isdir(output_dir):
+            os.makedirs(output_dir)
+        self.output_dir = output_dir
+        logging.info("Results will be stored in %s", output_dir)
+
+
+    def average_performance(self, results, size=None):
+        """
+        Flattens a list containing performance results.
+
+        @param results: List of n lists containing data from performance runs.
+        @param size: Numerical value of a size (say, file_size) that
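
As a quick illustration of what compare_matrices() returns (an example of
mine, not part of the patch; the numbers are made up), with the default 5%
treshold:

# Hypothetical input: the first column is a size label, the rest are results.
baseline = [[64, 1000, 2000],
            [128, 1100, 2100]]
current  = [[64, 1200, 1800],   # one +20% improvement, one -10% regression
            [128, 1100, 2100]]  # all within the treshold

matrix, improvements, regressions, total = compare_matrices(baseline, current)
print matrix                            # [[64, '+19.0', -11.0], [128, '.', '.']]
print improvements, regressions, total  # 1 1 6

Unchanged cells collapse to '.' except in the first column, which keeps the
size label, so the comparison table stays readable.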

[PATCH 2/2] IOzone test: Introduce additional results postprocessing

2010-04-30 Thread Lucas Meneghel Rodrigues
Using the postprocessing module introduced in the previous
patch, make the test analyze results and write performance
graphs and tables.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/iozone/iozone.py |   24 +++-
 1 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/client/tests/iozone/iozone.py b/client/tests/iozone/iozone.py
index fa3fba4..4977b3c 100755
--- a/client/tests/iozone/iozone.py
+++ b/client/tests/iozone/iozone.py
@@ -1,5 +1,6 @@
 import os, re
 from autotest_lib.client.bin import test, utils
+import postprocessing
 
 
 class iozone(test.test):
@@ -63,17 +64,19 @@ class iozone(test.test):
         self.results = utils.system_output('%s %s' % (cmd, args))
         self.auto_mode = ("-a" in args)
 
-        path = os.path.join(self.resultsdir, 'raw_output_%s' % self.iteration)
-        raw_output_file = open(path, 'w')
-        raw_output_file.write(self.results)
-        raw_output_file.close()
+        self.results_path = os.path.join(self.resultsdir,
+                                         'raw_output_%s' % self.iteration)
+        self.analysisdir = os.path.join(self.resultsdir,
+                                        'analysis_%s' % self.iteration)
+
+        utils.open_write_close(self.results_path, self.results)
 
 
     def __get_section_name(self, desc):
         return desc.strip().replace(' ', '_')
 
 
-    def postprocess_iteration(self):
+    def generate_keyval(self):
         keylist = {}
 
         if self.auto_mode:
@@ -150,3 +153,14 @@ class iozone(test.test):
                 keylist[key_name] = result
 
         self.write_perf_keyval(keylist)
+
+
+    def postprocess_iteration(self):
+        self.generate_keyval()
+        a = postprocessing.IOzoneAnalyzer(list_files=[self.results_path],
+                                          output_dir=self.analysisdir)
+        a.analyze()
+        p = postprocessing.IOzonePlotter(results_file=self.results_path,
+                                         output_dir=self.analysisdir)
+        p.plot_all()
+
-- 
1.7.0.1



Re: [Autotest] [PATCH 1/2] IOzone test: Introduce postprocessing module

2010-04-30 Thread Martin Bligh
I'm slightly surprised this isn't called from postprocess
in the test? Any downside to doing that?

On Fri, Apr 30, 2010 at 2:20 PM, Lucas Meneghel Rodrigues
l...@redhat.com wrote:
 This module contains code to postprocess IOzone data
 in a convenient way so we can generate performance graphs
 and condensed data. The graph generation part depends
 on gnuplot, but if the utility is not present,
 functionality will gracefully degrade.

 The reasons why this was created as a separate module are:
  * It doesn't pollute the main test class.
  * It allows us to use the postprocessing module as a stand-alone program
    that can even do performance comparisons between 2 IOzone runs.

 Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
 ---
  client/tests/iozone/postprocessing.py |  487 +
  1 files changed, 487 insertions(+), 0 deletions(-)
  create mode 100755 client/tests/iozone/postprocessing.py


Re: [Autotest] [PATCH 1/2] IOzone test: Introduce postprocessing module

2010-04-30 Thread Lucas Meneghel Rodrigues
On Fri, 2010-04-30 at 14:23 -0700, Martin Bligh wrote:
 I'm slightly surprised this isn't called from postprocess
 in the test? Any downside to doing that?

In the second patch I make the test use the
postprocessing module.

 On Fri, Apr 30, 2010 at 2:20 PM, Lucas Meneghel Rodrigues
 l...@redhat.com wrote:
  This module contains code to postprocess IOzone data
  in a convenient way so we can generate performance graphs
  and condensed data. The graph generation part depends
  on gnuplot, but if the utility is not present,
  functionality will gracefully degrade.
 
  The reasons why this was created as a separate module are:
   * It doesn't pollute the main test class.
   * It allows us to use the postprocessing module as a stand-alone program
     that can even do performance comparisons between 2 IOzone runs.
 
  Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
  ---
   client/tests/iozone/postprocessing.py |  487 +
   1 files changed, 487 insertions(+), 0 deletions(-)
   create mode 100755 client/tests/iozone/postprocessing.py
 

Re: Booting/installing WindowsNT

2010-04-30 Thread Michael Tokarev

01.05.2010 00:59, Michael Tokarev wrote:

Apparently with current kvm stable (0.12.3)
Windows NT 4.0 does not install anymore.

With default -cpu, it boots, displays the
"Inspecting your hardware configuration"
message and BSODs with a STOP: 0x0000003E
error as shown here:
http://www.corpit.ru/mjt/winnt4_1.gif
With -cpu pentium the situation is a
 bit better, it displays:

Microsoft (R) Windows NT (TM) Version 4.0 (Build 1381).
1 System Processor [512 MB Memory] Multiprocessor Kernel

and stops there with 100% CPU usage, never
going any further.

Kvm command line is trivial, with -hda
and -cdrom and -vga std (with -vga cirrus
it displays garbage here). The only parameters
of interest are:

-no-acpi - this one has no visible effect
-cpu pentium - tried that with some change
but no success anyway.


I was able to boot and install it just fine
using -cpu host (without -no-acpi or any
other option).

 Microsoft(R) Windows NT(R) version 4.0 (Build 1381: Service Pack 1)
 (C) 1981-1996

where my host cpu is Athlon X2-64 4850e (2 cores).

I tried a few other -cpu values, but no luck - it
either BSODs with the 0x3E error, or stops after the
first kernel message.


Anyone know what's going on here?


Thanks!

/mjt



Re: [Autotest] [PATCH 1/2] IOzone test: Introduce postprocessing module

2010-04-30 Thread Martin Bligh
On Fri, Apr 30, 2010 at 2:37 PM, Lucas Meneghel Rodrigues
l...@redhat.com wrote:
 On Fri, 2010-04-30 at 14:23 -0700, Martin Bligh wrote:
 I'm slightly surprised this isn't called from postprocess
 in the test? Any downside to doing that?

 In the second patch I make the test use the
 postprocessing module.

Ah, OK, missed that. Will go look. This one looks good.


 On Fri, Apr 30, 2010 at 2:20 PM, Lucas Meneghel Rodrigues
 l...@redhat.com wrote:
  This module contains code to postprocess IOzone data
  in a convenient way so we can generate performance graphs
  and condensed data. The graph generation part depends
  on gnuplot, but if the utility is not present,
  functionality will gracefully degrade.
 
  The reasons why this was created as a separate module are:
   * It doesn't pollute the main test class.
   * It allows us to use the postprocessing module as a stand-alone program
     that can even do performance comparisons between 2 IOzone runs.
 
  Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
  ---
   client/tests/iozone/postprocessing.py |  487 +
   1 files changed, 487 insertions(+), 0 deletions(-)
   create mode 100755 client/tests/iozone/postprocessing.py
 

[ kvm-Bugs-2980197 ] qemu-kvm.h:845: error: expected ‘)’ before ‘start_addr’

2010-04-30 Thread SourceForge.net
Bugs item #2980197, was opened at 2010-03-31 19:45
Message generated for change (Settings changed) made by sf-robot
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2980197&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: Jeff Kowalczyk (jfkw)
Assigned to: Nobody/Anonymous (nobody)
Summary: qemu-kvm.h:845: error: expected ')' before 'start_addr'

Initial Comment:
On Gentoo Linux, compiling qemu-kvm from the repository gives the following 
compile error:

  CC    arm-softmmu/gdbstub.o
In file included from 
/var/tmp/portage/app-emulation/qemu-kvm-/work/qemu-kvm-/gdbstub.c:38:
/var/tmp/portage/app-emulation/qemu-kvm-/work/qemu-kvm-/qemu-kvm.h:845: 
error: expected ')' before 'start_addr'
/var/tmp/portage/app-emulation/qemu-kvm-/work/qemu-kvm-/qemu-kvm.h:847: 
error: expected ')' before 'start_addr'
/var/tmp/portage/app-emulation/qemu-kvm-/work/qemu-kvm-/qemu-kvm.h:850: 
error: expected ')' before 'start_addr'
/var/tmp/portage/app-emulation/qemu-kvm-/work/qemu-kvm-/qemu-kvm.h:852: 
error: expected ')' before 'start'
In file included from 
/var/tmp/portage/app-emulation/qemu-kvm-/work/qemu-kvm-/gdbstub.c:38:
/var/tmp/portage/app-emulation/qemu-kvm-/work/qemu-kvm-/qemu-kvm.h:931: 
error: expected ')' before 'start_addr'
make[1]: *** [gdbstub.o] Error 1
make: *** [subdir-i386-linux-user] Error 2
make: *** Waiting for unfinished jobs....

The repository HEAD is at 79fdb980fa4fe40c3fba9391b83b655045af030f for the 
above example.

The file qemu-kvm.h is not affected by any Gentoo-specific patch in the 
packaging.

Thanks.

--

Comment By: SourceForge Robot (sf-robot)
Date: 2010-05-01 02:20

Message:
This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).

--

Comment By: Brian Jackson (iggy_cav)
Date: 2010-04-16 18:53

Message:
Same as your other compile bug.

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2980197&group_id=180599