[PATCH] KVM: x86 emulator: Use opcode::execute for INS/OUTS

2011-11-22 Thread Takuya Yoshikawa
From: Takuya Yoshikawa 

INSB   : 6C
INSW/INSD  : 6D
OUTSB  : 6E
OUTSW/OUTSD: 6F

The I/O port address is read from the DX register when the operand is
decoded, because the SrcDX/DstDX flag is set, so the explicit loads from
VCPU_REGS_RDX in the special_insn switch are no longer needed.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/emulate.c |   14 ++------------
 1 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 4cd3313..ac8e5ed 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3321,8 +3321,8 @@ static struct opcode opcode_table[256] = {
I(DstReg | SrcMem | ModRM | Src2Imm, em_imul_3op),
I(SrcImmByte | Mov | Stack, em_push),
I(DstReg | SrcMem | ModRM | Src2ImmByte, em_imul_3op),
-   D2bvIP(DstDI | SrcDX | Mov | String, ins, check_perm_in), /* insb, insw/insd */
-   D2bvIP(SrcSI | DstDX | String, outs, check_perm_out), /* outsb, outsw/outsd */
+   I2bvIP(DstDI | SrcDX | Mov | String, em_in, ins, check_perm_in), /* insb, insw/insd */
+   I2bvIP(SrcSI | DstDX | String, em_out, outs, check_perm_out), /* outsb, outsw/outsd */
/* 0x70 - 0x7F */
X16(D(SrcImmByte)),
/* 0x80 - 0x87 */
@@ -4027,16 +4027,6 @@ special_insn:
goto cannot_emulate;
ctxt->dst.val = (s32) ctxt->src.val;
break;
-   case 0x6c:  /* insb */
-   case 0x6d:  /* insw/insd */
-   ctxt->src.val = ctxt->regs[VCPU_REGS_RDX];
-   rc = em_in(ctxt);
-   break;
-   case 0x6e:  /* outsb */
-   case 0x6f:  /* outsw/outsd */
-   ctxt->dst.val = ctxt->regs[VCPU_REGS_RDX];
-   rc = em_out(ctxt);
-   break;
case 0x70 ... 0x7f: /* jcc (short) */
if (test_cc(ctxt->b, ctxt->eflags))
jmp_rel(ctxt, ctxt->src.val);
-- 
1.7.5.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
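
A minimal sketch of the table-driven dispatch the patch above moves INS/OUTS onto. Everything here is a simplified, hypothetical stand-in (struct ctxt, struct opcode, the SRC_DX/DST_DX flags, emulate()), not the real emulator code. It only illustrates the idea in the commit message: the decode step fills the operand from DX when the flag is set, so the handler needs no per-opcode setup. The 16-bit mask is an assumption of the sketch, motivated by x86 I/O ports being 16 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operand flags standing in for SrcDX / DstDX. */
enum { SRC_DX = 1u << 0, DST_DX = 1u << 1 };

struct ctxt {
    uint64_t rdx;        /* guest DX/EDX/RDX register */
    uint64_t src_val;    /* decoded source operand */
    uint64_t dst_val;    /* decoded destination operand */
    uint16_t last_port;  /* port the handler saw, kept for illustration */
};

struct opcode {
    uint32_t flags;
    int (*execute)(struct ctxt *c);
};

/* Handlers only consume operands that decode already filled in. */
static int em_in(struct ctxt *c)  { c->last_port = (uint16_t)c->src_val; return 0; }
static int em_out(struct ctxt *c) { c->last_port = (uint16_t)c->dst_val; return 0; }

static const struct opcode table[256] = {
    [0x6c] = { SRC_DX, em_in  },   /* insb */
    [0x6d] = { SRC_DX, em_in  },   /* insw/insd */
    [0x6e] = { DST_DX, em_out },   /* outsb */
    [0x6f] = { DST_DX, em_out },   /* outsw/outsd */
};

/* Decode operands from the flags, then dispatch through the table. */
static int emulate(struct ctxt *c, uint8_t op)
{
    const struct opcode *o = &table[op];

    assert(c != NULL);
    if (o->flags & SRC_DX)
        c->src_val = c->rdx & 0xffff;
    if (o->flags & DST_DX)
        c->dst_val = c->rdx & 0xffff;
    if (!o->execute)
        return -1;   /* no callback: a real emulator would fall back to its switch */
    return o->execute(c);
}
```

With this shape, moving an instruction out of the big switch amounts to filling in its table entry, which is what the two changed table lines in the diff do.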


Re: [Qemu-devel] KVM call agenda for Novemeber 22

2011-11-22 Thread 王永博
Thank you !

2011/11/23 Alex Jia :
> Hi Yongbo,
> As I understand it, VMsafe covers three main areas for securing the
> virtual environment: memory, disk, and network. As far as I know, KVM
> has similar security and resource-management features, for instance:
>
> 1. Host network isolation: configure one network interface for the host
> and a separate network interface for the guest operating systems.
>
> 2. SELinux automatically stores and protects images on the host.
>
> 3. Secure remote management with libvirt, such as SSH tunnels,
> SASL authentication and encryption, and TLS for remote access.
>
> 4. Using sVirt to isolate virtual machines.
>
> 5. With cgroups in RHEL6, you can restrict a set of tasks to a set of
> resources, prevent denial-of-service situations in KVM environments,
> and monitor resource use.
>
> 6. Disk-image encryption, a technique aimed at protecting data at rest.
>
> 7. Auditing the KVM virtualization host and guests.
>
> In addition, libvirt includes a pluggable framework for lock managers,
> which hypervisor drivers can use to ensure safety for guest domain disks,
> and potentially other resources.
>
> Of course, I'm not a developer; I believe the virt developers can show
> you more security techniques and features for virtualization.
>
> Regards,
> Alex
>
>
> - Original Message -
> From: "王永博" 
> To: quint...@redhat.com
> Cc: "Developers qemu-devel" , "KVM devel mailing list" 
> 
> Sent: Wednesday, November 23, 2011 9:44:39 AM
> Subject: Re: [Qemu-devel] KVM call agenda for Novemeber 22
>
> Does KVM have an API like VMsafe to help partners protect their products?
>
> 2011/11/22 Juan Quintela :
>>
>> Hi
>>
>> Please send in any agenda items you are interested in covering.
>>
>> Later, Juan.
>>
>
>


Re: [Qemu-devel] KVM call agenda for Novemeber 22

2011-11-22 Thread Alex Jia
Hi Yongbo,
As I understand it, VMsafe covers three main areas for securing the
virtual environment: memory, disk, and network. As far as I know, KVM
has similar security and resource-management features, for instance:

1. Host network isolation: configure one network interface for the host
and a separate network interface for the guest operating systems.

2. SELinux automatically stores and protects images on the host.

3. Secure remote management with libvirt, such as SSH tunnels,
SASL authentication and encryption, and TLS for remote access.

4. Using sVirt to isolate virtual machines.

5. With cgroups in RHEL6, you can restrict a set of tasks to a set of
resources, prevent denial-of-service situations in KVM environments,
and monitor resource use.

6. Disk-image encryption, a technique aimed at protecting data at rest.

7. Auditing the KVM virtualization host and guests.

In addition, libvirt includes a pluggable framework for lock managers,
which hypervisor drivers can use to ensure safety for guest domain disks,
and potentially other resources.

Of course, I'm not a developer; I believe the virt developers can show
you more security techniques and features for virtualization.

Regards,
Alex


- Original Message -
From: "王永博" 
To: quint...@redhat.com
Cc: "Developers qemu-devel" , "KVM devel mailing list" 

Sent: Wednesday, November 23, 2011 9:44:39 AM
Subject: Re: [Qemu-devel] KVM call agenda for Novemeber 22

Does KVM have an API like VMsafe to help partners protect their products?

2011/11/22 Juan Quintela :
>
> Hi
>
> Please send in any agenda items you are interested in covering.
>
> Later, Juan.
>



Re: [PATCHv4 0/3] acpi: DSDT/SSDT runtime patching

2011-11-22 Thread Kevin O'Connor
On Mon, Nov 21, 2011 at 10:22:22PM +0200, Michael S. Tsirkin wrote:
> On Sun, Nov 20, 2011 at 04:08:59PM -0500, Kevin O'Connor wrote:
> > On Sun, Nov 20, 2011 at 07:56:43PM +0200, Michael S. Tsirkin wrote:
> > > Here's an updated revision of acpi runtime patching patchset.
> > > Lightly tested.
> > 
> > It looks good to me.
> > 
> > -Kevin
> 
> Ran some Linux and Windows tests; things seem to work
> smoothly. Please apply.

I committed this series.

Thanks,
-Kevin


Re: [PATCHv3 RFC] virtio-pci: flexible configuration layout

2011-11-22 Thread Rusty Russell
On Tue, 22 Nov 2011 20:36:22 +0200, "Michael S. Tsirkin" wrote:
> Here's an updated version.
> I'm alternating between updating the spec and the driver,
> spec update to follow.

Don't touch the spec yet, we have a long way to go :(

I want the ability for driver to set the ring size, and the device to
set the alignment.  That's a bigger change than you have here.  I
imagine it almost rips the driver into two completely different drivers.

This is the kind of thing I had in mind, for the header.  Want me to
code up the rest?

(I've avoided adding the constants for the new layout: a struct is more
compact and more descriptive).

Cheers,
Rusty.
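
To illustrate the remark that a struct is more compact and more descriptive than a list of offset constants, here is a minimal sketch that expresses the legacy register offsets from the header as a struct. The struct name, field names, and the queue_addr_to_pfn() helper are hypothetical and are not the layout proposed in the patch; fixed-width stdint types stand in for the kernel's __u32/__u16/__u8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct mirroring the legacy offsets 0..19. */
struct virtio_pci_legacy_hdr_sketch {
    uint32_t host_features;   /* offset 0:  r/o feature bitmask */
    uint32_t guest_features;  /* offset 4:  r/w feature bitmask */
    uint32_t queue_pfn;       /* offset 8:  r/w PFN of the selected queue */
    uint16_t queue_num;       /* offset 12: r/o size of the selected queue */
    uint16_t queue_sel;       /* offset 14: r/w queue selector */
    uint16_t queue_notify;    /* offset 16: r/w queue notifier */
    uint8_t  status;          /* offset 18: device status */
    uint8_t  isr;             /* offset 19: r/o interrupt status */
};

/* The compiler checks what the #define list states by hand. */
_Static_assert(offsetof(struct virtio_pci_legacy_hdr_sketch, queue_pfn) == 8, "QUEUE_PFN");
_Static_assert(offsetof(struct virtio_pci_legacy_hdr_sketch, queue_notify) == 16, "QUEUE_NOTIFY");
_Static_assert(sizeof(struct virtio_pci_legacy_hdr_sketch) == 20, "header size");

#define SKETCH_QUEUE_ADDR_SHIFT 12   /* same value as the legacy shift */

/* Convert a page-aligned physical queue address to the value written to QUEUE_PFN. */
static uint32_t queue_addr_to_pfn(uint64_t phys)
{
    assert((phys & ((1u << SKETCH_QUEUE_ADDR_SHIFT) - 1)) == 0);
    return (uint32_t)(phys >> SKETCH_QUEUE_ADDR_SHIFT);
}
```

The field widths and ordering carry the same information as the comments above each #define, and a mismatch becomes a compile-time error rather than a silent wrong offset.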

diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h
--- a/include/linux/virtio_pci.h
+++ b/include/linux/virtio_pci.h
@@ -42,56 +42,74 @@
 #include 
 
 /* A 32-bit r/o bitmask of the features supported by the host */
-#define VIRTIO_PCI_HOST_FEATURES   0
+#define VIRTIO_PCI_LEGACY_HOST_FEATURES 0
 
 /* A 32-bit r/w bitmask of features activated by the guest */
-#define VIRTIO_PCI_GUEST_FEATURES  4
+#define VIRTIO_PCI_LEGACY_GUEST_FEATURES   4
 
 /* A 32-bit r/w PFN for the currently selected queue */
-#define VIRTIO_PCI_QUEUE_PFN   8
+#define VIRTIO_PCI_LEGACY_QUEUE_PFN 8
 
 /* A 16-bit r/o queue size for the currently selected queue */
-#define VIRTIO_PCI_QUEUE_NUM   12
+#define VIRTIO_PCI_LEGACY_QUEUE_NUM 12
 
 /* A 16-bit r/w queue selector */
-#define VIRTIO_PCI_QUEUE_SEL   14
+#define VIRTIO_PCI_LEGACY_QUEUE_SEL 14
 
 /* A 16-bit r/w queue notifier */
-#define VIRTIO_PCI_QUEUE_NOTIFY 16
+#define VIRTIO_PCI_LEGACY_QUEUE_NOTIFY 16
 
 /* An 8-bit device status register.  */
-#define VIRTIO_PCI_STATUS  18
+#define VIRTIO_PCI_LEGACY_STATUS   18
 
 /* An 8-bit r/o interrupt status register.  Reading the value will return the
  * current contents of the ISR and will also clear it.  This is effectively
  * a read-and-acknowledge. */
-#define VIRTIO_PCI_ISR 19
-
-/* The bit of the ISR which indicates a device configuration change. */
-#define VIRTIO_PCI_ISR_CONFIG  0x2
+#define VIRTIO_PCI_LEGACY_ISR  19
 
 /* MSI-X registers: only enabled if MSI-X is enabled. */
 /* A 16-bit vector for configuration changes. */
-#define VIRTIO_MSI_CONFIG_VECTOR 20
+#define VIRTIO_MSI_LEGACY_CONFIG_VECTOR 20
 /* A 16-bit vector for selected queue notifications. */
-#define VIRTIO_MSI_QUEUE_VECTOR 22
-/* Vector value used to disable MSI for queue */
-#define VIRTIO_MSI_NO_VECTOR 0x
+#define VIRTIO_MSI_LEGACY_QUEUE_VECTOR 22
 
 /* The remaining space is defined by each driver as the per-driver
  * configuration space */
-#define VIRTIO_PCI_CONFIG(dev) ((dev)->msix_enabled ? 24 : 20)
+#define VIRTIO_PCI_LEGACY_CONFIG(dev)  ((dev)->msix_enabled ? 24 : 20)
+
+/* How many bits to shift physical queue address written to QUEUE_PFN.
+ * 12 is historical, and due to x86 page size. */
+#define VIRTIO_PCI_LEGACY_QUEUE_ADDR_SHIFT 12
+
+/* The alignment to use between consumer and producer parts of vring.
+ * x86 pagesize again. */
+#define VIRTIO_PCI_LEGACY_VRING_ALIGN  4096
+
+#ifndef __KERNEL__
+/* Don't break compile of old userspace code.  These will go away. */
+#define VIRTIO_PCI_HOST_FEATURES VIRTIO_PCI_LEGACY_HOST_FEATURES
+#define VIRTIO_PCI_GUEST_FEATURES VIRTIO_PCI_LEGACY_GUEST_FEATURES
+#define VIRTIO_PCI_QUEUE_PFN VIRTIO_PCI_LEGACY_QUEUE_PFN
+#define VIRTIO_PCI_QUEUE_NUM VIRTIO_PCI_LEGACY_QUEUE_NUM
+#define VIRTIO_PCI_QUEUE_SEL VIRTIO_PCI_LEGACY_QUEUE_SEL
+#define VIRTIO_PCI_QUEUE_NOTIFY VIRTIO_PCI_LEGACY_QUEUE_NOTIFY
+#define VIRTIO_PCI_STATUS VIRTIO_PCI_LEGACY_STATUS
+#define VIRTIO_PCI_ISR VIRTIO_PCI_LEGACY_ISR
+#define VIRTIO_MSI_CONFIG_VECTOR VIRTIO_MSI_LEGACY_CONFIG_VECTOR
+#define VIRTIO_MSI_QUEUE_VECTOR VIRTIO_MSI_LEGACY_QUEUE_VECTOR
+#define VIRTIO_PCI_CONFIG(dev) VIRTIO_PCI_LEGACY_CONFIG(dev)
+#define VIRTIO_PCI_QUEUE_ADDR_SHIFT VIRTIO_PCI_LEGACY_QUEUE_ADDR_SHIFT
+#define VIRTIO_PCI_VRING_ALIGN VIRTIO_PCI_LEGACY_VRING_ALIGN
+#endif /* ...!KERNEL */
 
 /* Virtio ABI version, this must match exactly */
 #define VIRTIO_PCI_ABI_VERSION 0
 
-/* How many bits to shift physical queue address written to QUEUE_PFN.
- * 12 is historical, and due to x86 page size. */
-#define VIRTIO_PCI_QUEUE_ADDR_SHIFT 12
+/* Vector value used to disable MSI for queue */
+#define VIRTIO_MSI_NO_VECTOR 0x
 
-/* The alignment to use between consumer and producer parts of vring.
- * x86 pagesize again. */
-#define VIRTIO_PCI_VRING_ALIGN 4096
+/* The bit of the ISR which indicates a device configuration change. */
+#define VIRTIO_PCI_ISR_CONFIG  0x2
 
 /*
  * Layout for Virtio PCI vendor specific capability (little-endian):
@@ -133,4 +151,20 @@
 #define VIRTIO_PCI_CAP_CFG_OFF_MASK 0x
 #define VIRTIO_P

Re: [PATCH 5 of 5] virtio: expose added descriptors immediately

2011-11-22 Thread Rusty Russell
On Tue, 22 Nov 2011 08:29:08 +0200, "Michael S. Tsirkin" wrote:
> On Tue, Nov 22, 2011 at 11:03:04AM +1030, Rusty Russell wrote:
> > -   /* If you haven't kicked in this long, you're probably doing something
> > -* wrong. */
> > -   WARN_ON(vq->num_added > vq->vring.num);
> > +   /* This is very unlikely, but theoretically possible.  Kick
> > +* just in case. */
> > +   if (unlikely(vq->num_added == 65535))
> 
> This is 0xffff but why use the decimal notation?

Interesting.  Why use hex?  Feels more like binary?

But I've changed it to "(1 << 16) - 1" to be clear.

> > +   virtqueue_kick(_vq);
> >  
> > pr_debug("Added buffer head %i to %p\n", head, vq);
> > END_USE(vq);
> 
> We also still need to reset vq->num_added, right?

virtqueue_kick does that for us.

Cheers,
Rusty.
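
A minimal sketch of the counter behaviour discussed in this exchange, assuming a 16-bit num_added that the kick resets. The names vq_sketch, add_buf() and kick() are hypothetical stand-ins, and the real virtqueue code does far more; the point is only that (1 << 16) - 1, 65535 and 0xffff are the same limit and that kicking at the limit also clears the counter.

```c
#include <assert.h>
#include <stdint.h>

struct vq_sketch {
    uint16_t num_added;   /* buffers added since the last kick */
    unsigned kicks;       /* number of kicks issued, for illustration */
};

/* Notify the other side; as in the thread, kicking resets num_added. */
static void kick(struct vq_sketch *vq)
{
    assert(vq != 0);
    vq->num_added = 0;
    vq->kicks++;
}

/* Account for one added buffer and kick before the counter can wrap. */
static void add_buf(struct vq_sketch *vq)
{
    vq->num_added++;
    if (vq->num_added == (1 << 16) - 1)   /* 65535, i.e. 0xffff */
        kick(vq);
}
```

Because kick() clears the counter, the caller does not need a separate reset after the forced kick.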


Re: KVM call agenda for Novemeber 22

2011-11-22 Thread 王永博
Does KVM have an API like VMsafe to help partners protect their products?

2011/11/22 Juan Quintela :
>
> Hi
>
> Please send in any agenda items you are interested in covering.
>
> Later, Juan.
>


Re: [Qemu-devel] [RFC PATCH] vfio: VFIO Driver core framework

2011-11-22 Thread Alex Williamson
On Tue, 2011-11-22 at 14:00 -0600, Scott Wood wrote:
> On 11/22/2011 01:16 PM, Alex Williamson wrote:
> > On Fri, Nov 18, 2011 at 2:09 PM, Scott Wood  wrote:
> >> On Fri, Nov 18, 2011 at 01:32:56PM -0700, Alex Williamson wrote:
> >>> Ugh, I suppose you're thinking of an ILP64 platform with ILP32 compat
> >>> mode.
> >>
> >> Does Linux support ILP64?  There are "int" ioctls all over the place, and
> >> I don't think we do compat wrappers for them.  In fact, some of the
> >> ioctls in linux/fs.h use "int" for the compatible version of ioctls
> >> originally defined as "long".
> >>
> >> It's cleaner to always use the fixed types, though.
> > 
> > I've updated anything that passes data to use a structure 
> 
> That's a bit extreme...

Ok, I lied, it's not everything.  I have consolidated some GET_FLAGS and
GET_NUM_* calls into generic GET_INFO ioctls so we have more
flexibility.  I think the structures make sense there.  I'm not as
convinced on the eventfd and irq unmask structures, but who knows, they
might save us some day.

Here's where I stand on the API definitions, maybe we can get some
agreement on this before diving into semantics of the documentation or
implementation, though it still includes the merge interface.
Thanks,

Alex

/*
 * VFIO API definition
 *
 * Copyright (C) 2011 Red Hat, Inc.  All rights reserved.
 *  Author: Alex Williamson 
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
#ifndef VFIO_H
#define VFIO_H

#include 

#ifdef __KERNEL__   /* Internal VFIO-core/bus driver API */

/**
 * struct vfio_device_ops - VFIO bus driver device callbacks
 *
 * @match: Return true if buf describes the device
 * @open: Called when userspace receives file descriptor for device
 * @release: Called when userspace releases file descriptor for device
 * @read: Perform read(2) on device file descriptor
 * @write: Perform write(2) on device file descriptor
 * @ioctl: Perform ioctl(2) on device file descriptor, supporting VFIO_DEVICE_*
 * operations documented below
 * @mmap: Perform mmap(2) on a region of the device file descriptor
 */
struct vfio_device_ops {
	bool	(*match)(struct device *dev, const char *buf);
	int	(*open)(void *device_data);
	void	(*release)(void *device_data);
	ssize_t	(*read)(void *device_data, char __user *buf,
			size_t count, loff_t *ppos);
	ssize_t	(*write)(void *device_data, const char __user *buf,
			 size_t count, loff_t *size);
	long	(*ioctl)(void *device_data, unsigned int cmd,
			 unsigned long arg);
	int	(*mmap)(void *device_data, struct vm_area_struct *vma);
};

/**
 * vfio_group_add_dev() - Add a device to the vfio-core
 *
 * @dev: Device to add
 * @ops: VFIO bus driver callbacks for device
 *
 * This registration makes the VFIO core aware of the device, creates
 * group objects as required and exposes chardevs under /dev/vfio.
 *
 * Return 0 on success, errno on failure.
 */
extern int vfio_group_add_dev(struct device *dev,
  const struct vfio_device_ops *ops);

/**
 * vfio_group_del_dev() - Remove a device from the vfio-core
 *
 * @dev: Device to remove
 *
 * Remove a device previously added to the VFIO core, removing groups
 * and chardevs as necessary.
 */
extern void vfio_group_del_dev(struct device *dev);

/**
 * vfio_bind_dev() - Indicate device is bound to the VFIO bus driver and
 *   register private data structure for ops callbacks.
 *
 * @dev: Device being bound
 * @device_data: VFIO bus driver private data
 *
 * This registration indicates that a device previously registered with
 * vfio_group_add_dev() is now available for use by the VFIO core.  When
 * all devices within a group are available, the group is viable and may
 * be used by userspace drivers.  Typically called from the VFIO bus
 * driver probe function.
 *
 * Return 0 on success, errno on failure
 */
extern int vfio_bind_dev(struct device *dev, void *device_data);

/**
 * vfio_unbind_dev() - Indicate device is unbinding from VFIO bus driver
 *
 * @dev: Device being unbound
 *
 * De-registration of the device previously registered with vfio_bind_dev()
 * from VFIO.  Upon completion, the device is no longer available for use by
 * the VFIO core.  Typically called from the VFIO bus driver remove function.
 * The VFIO core will attempt to release the device from users and may take
 * measures to free the device and/or block as necessary.
 *
 * Returns pointer to private device_data structure registered with
 * vfio_bind_dev().
 */
extern void *vfio_unbind_dev(struct device *dev);

#endif /* __KERNEL__ */

/* Kernel & User level defines for VFIO IOCTLs. */

/*
 * The IOCTL interface is designed for extensibility by embedding the
 * structure length (argsz) and flags into structur

Re: Current kernel fails to compile with KVM on PowerPC

2011-11-22 Thread Alexander Graf

On 22.11.2011, at 21:04, Jörg Sommer wrote:

> Hi,
> 
> Jörg Sommer wrote on Mon 07. Nov, 20:48 (+0100):
>> I'm trying to build the kernel with the git commit-id
>> 31555213f03bca37d2c02e10946296052f4ecfcd, but it fails
>> 
>>  CHK include/linux/version.h
>>  HOSTCC  scripts/mod/modpost.o
>>  CHK include/generated/utsrelease.h
>>  UPD include/generated/utsrelease.h
>>  HOSTLD  scripts/mod/modpost
>>  GEN include/generated/bounds.h
>>  CC  arch/powerpc/kernel/asm-offsets.s
>> In file included from arch/powerpc/kernel/asm-offsets.c:59:0:
>> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h: In function ‘compute_tlbie_rb’:
>> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:393:10: error: ‘HPTE_V_SECONDARY’ undeclared (first use in this function)
>> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:393:10: note: each undeclared identifier is reported only once for each function it appears in
>> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:396:12: error: ‘HPTE_V_1TB_SEG’ undeclared (first use in this function)
>> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:401:10: error: ‘HPTE_V_LARGE’ undeclared (first use in this function)
>> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:415:2: warning: right shift count >= width of type [enabled by default]
>> make[3]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1
>> make[2]: *** [prepare0] Error 2
>> make[1]: *** [deb-pkg] Error 2
>> make: *** [deb-pkg] Error 2
> 
> I'm still having this problem. I can't build
> 6fe4c6d466e95d31164f14b1ac4aefb51f0f4f82. Are there any patches that
> make the kernel build and not oops [1] on PowerPC?

The failures above should be fixed by now.

> [1] »kernel BUG at include/linux/kvm_host.h:603!«
>  http://www.mail-archive.com/kvm@vger.kernel.org/msg61433.html

This is unfortunately still there. It's because of preemption being enabled. 
Please just use CONFIG_PREEMPT_NONE for the time being - it's on my todo list :)


Alex



Re: Current kernel fails to compile with KVM on PowerPC

2011-11-22 Thread Jörg Sommer
Hi,

Jörg Sommer wrote on Mon 07. Nov, 20:48 (+0100):
> I'm trying to build the kernel with the git commit-id
> 31555213f03bca37d2c02e10946296052f4ecfcd, but it fails
> 
>   CHK include/linux/version.h
>   HOSTCC  scripts/mod/modpost.o
>   CHK include/generated/utsrelease.h
>   UPD include/generated/utsrelease.h
>   HOSTLD  scripts/mod/modpost
>   GEN include/generated/bounds.h
>   CC  arch/powerpc/kernel/asm-offsets.s
> In file included from arch/powerpc/kernel/asm-offsets.c:59:0:
> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h: In function ‘compute_tlbie_rb’:
> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:393:10: error: ‘HPTE_V_SECONDARY’ undeclared (first use in this function)
> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:393:10: note: each undeclared identifier is reported only once for each function it appears in
> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:396:12: error: ‘HPTE_V_1TB_SEG’ undeclared (first use in this function)
> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:401:10: error: ‘HPTE_V_LARGE’ undeclared (first use in this function)
> /home/joerg/git/linux/arch/powerpc/include/asm/kvm_book3s.h:415:2: warning: right shift count >= width of type [enabled by default]
> make[3]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1
> make[2]: *** [prepare0] Error 2
> make[1]: *** [deb-pkg] Error 2
> make: *** [deb-pkg] Error 2

I'm still having this problem. I can't build
6fe4c6d466e95d31164f14b1ac4aefb51f0f4f82. Are there any patches that
make the kernel build and not oops [1] on PowerPC?

[1] »kernel BUG at include/linux/kvm_host.h:603!«
  http://www.mail-archive.com/kvm@vger.kernel.org/msg61433.html

Bye, Jörg.
-- 
The right to change one's opinion is one of the most important
human privileges.
(Robert Peel)




Re: [Qemu-devel] [RFC PATCH] vfio: VFIO Driver core framework

2011-11-22 Thread Scott Wood
On 11/22/2011 01:16 PM, Alex Williamson wrote:
> On Fri, Nov 18, 2011 at 2:09 PM, Scott Wood  wrote:
>> On Fri, Nov 18, 2011 at 01:32:56PM -0700, Alex Williamson wrote:
>>> Ugh, I suppose you're thinking of an ILP64 platform with ILP32 compat
>>> mode.
>>
>> Does Linux support ILP64?  There are "int" ioctls all over the place, and
>> I don't think we do compat wrappers for them.  In fact, some of the
>> ioctls in linux/fs.h use "int" for the compatible version of ioctls
>> originally defined as "long".
>>
>> It's cleaner to always use the fixed types, though.
> 
> I've updated anything that passes data to use a structure 

That's a bit extreme...

> and will make use of __s32 in place of ints.  If there ever exists an ILP64
> system, we can use a flag bit of the structure to indicate 64bit file
> descriptor support.

If we end up supporting an ABI where compatibility between user and
kernel is broken even when we use fixed-size types and are careful about
alignment, we'll need a compat wrapper, and we'll know what ABI
userspace is supposed to be using.  I'm not sure how a flag would help.

>>> The point of the group is to provide a unit of ownership.  We can't let
>>> $userA open $groupid and fetch a device, then have $userB do the same,
>>> grabbing a different device.  The mappings will step on each other and
>>> the devices have no isolation.  We can't restrict that purely by file
>>> permissions or we'll have the same problem with sudo.
>>
>> What is the problem with sudo?  If you're running processes as the same
>> user, regardless of how, they're going to be able to mess with each
>> other.
> 
> Just trying to indicate that file permissions are easy to bypass and
> privileged users can inadvertently do stupid stuff.

Preventing stupid stuff can also prevent useful stuff.  Security and
accident-avoidance are different things.  "We can't let" is the domain
of the former.

> Kind of like request_region() in the kernel.  Kernel drivers are
> privileged, but we still want to enforce an owner of that region.  VFIO
> extends the ownership of a device to a single entity in userspace.  How
> do we identify that entity and keep others out?

That's fine as long as it's an optional safeguard that can be turned off
if needed.  Maybe require userspace to set a flag via some mechanism to
indicate it's opening the device in shared mode.

>> It would be nice if this limitation weren't excessively integrated into
>> the design -- in the embedded space we've got unusual partitioning
>> setups, including failover arrangements where partitions share devices.
>> The device may be configured with the IOMMU pointing only at regions that
>> are shared by both mms, or the non-shared regions may be reconfigured as
>> active ownership of the device gets handed around.
>>
>> It would be up to userspace code to make sure that the mappings don't
>> "step on each other".  The mapping could be done with whichever mm issued
>> the map call for a given region.
>>
>> For this use case, there is unlikely to be an issue with ownership
>> because there will not be separate privilege domains creating partitions
>> -- other use cases could refrain from enabling multiple-mm support unless
>> ownership issues are resolved.
>>
>> This doesn't need to be supported initially, but we should try to avoid
>> letting the assumption permeate the code.
> 
> So I'm hearing "we want to use this driver you're developing that's
> centered around using the iommu to securely provide access to a device
> from userspace, but can we do it without the iommu and can we loosen
> up the security a bit?"  Is that about right?  ;)  Thanks,

We have a variety of use cases for userspace and KVM-guest access to
devices.  Some of those involve an iommu, some don't.  Some involve
shared ownership (which isn't necessarily a loosening of security --
there's still an iommu, and access control on the vfio group), some
don't.  Some don't involve DMA at all.  I see no reason to have entirely
separate kernel mechanisms for these use cases.

I'm not asking you to implement any of this, just hoping you'll keep
such flexibility in mind when deciding on fundamental assumptions that
the code and API are to make.

-Scott



Re: [Qemu-devel] [RFC PATCH] vfio: VFIO Driver core framework

2011-11-22 Thread Alex Williamson
On Fri, Nov 18, 2011 at 2:09 PM, Scott Wood  wrote:
> On Fri, Nov 18, 2011 at 01:32:56PM -0700, Alex Williamson wrote:
>> Hmm, that might be cleaner than eliminating the size with just using
>> _IO().  So we might have something like:
>>
>> #define VFIO_IOMMU_MAP_DMA              _IOWR(';', 106, struct vfio_dma_map)
>> #define VFIO_IOMMU_MAP_DMA_V2           _IOWR(';', 106, struct vfio_dma_map_v2)
>>
>> For which the driver might do:
>>
>> case VFIO_IOMMU_MAP_DMA:
>> case VFIO_IOMMU_MAP_DMA_V2:
>> {
>>       struct vfio_dma_map map;
>>
>>       /* We don't care about the extra v2 bits */
>>       if (copy_from_user(&map, (void __user *)arg, sizeof map))
>>               return -EFAULT;
>
> That won't work if you have an old kernel that doesn't know about v2, and
> a new user that uses v2.  To make this work you'd have to strip out the
> size from the ioctl number before switching (but still use it when
> considering whether to access the v2 fields).  Simpler to just leave it
> out of the ioctl number and put it in the struct field as currently
> planned.

Ok, _IO for all ioctls passing structs then.
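
A minimal sketch of the approach settled on here: the size travels in a struct field (argsz) rather than in the ioctl number, so a handler that only knows the older layout can still accept a larger struct from newer userspace. The struct names, the extra v2 field, and handle_map() are hypothetical; memcpy() stands in for copy_from_user(), and -EINVAL is an assumed error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical v1 layout known to an "old kernel". */
struct dma_map_v1_sketch {
    uint32_t argsz;   /* size of the struct as userspace sees it */
    uint32_t flags;
    uint64_t vaddr;
    uint64_t iova;
    uint64_t size;
};

/* Hypothetical v2 layout: v1 plus one trailing field. */
struct dma_map_v2_sketch {
    uint32_t argsz;
    uint32_t flags;
    uint64_t vaddr;
    uint64_t iova;
    uint64_t size;
    uint64_t extra;   /* unknown to a v1-only handler */
};

/* v1-only handler: reads argsz first, then copies only what it understands. */
static int handle_map(const void *user_arg, struct dma_map_v1_sketch *out)
{
    uint32_t argsz;

    assert(user_arg != 0 && out != 0);
    memcpy(&argsz, user_arg, sizeof argsz);
    if (argsz < sizeof *out)
        return -EINVAL;                   /* caller's struct is too short */
    memcpy(out, user_arg, sizeof *out);   /* trailing v2 bytes are ignored */
    return 0;
}
```

Since the ioctl number no longer encodes sizeof(struct), both layouts reach the same case label, which avoids the old-kernel/new-user mismatch described in the quoted reply.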

>> > > I think all our structure sizes are independent of host width.  If I'm
>> > > missing something, let me know.
>> >
>> > Ah, for structures, that might be true.  I was seeing the bunch of
>> > ioctl()s that take ints.
>>
>> Ugh, I suppose you're thinking of an ILP64 platform with ILP32 compat
>> mode.
>
> Does Linux support ILP64?  There are "int" ioctls all over the place, and
> I don't think we do compat wrappers for them.  In fact, some of the
> ioctls in linux/fs.h use "int" for the compatible version of ioctls
> originally defined as "long".
>
> It's cleaner to always use the fixed types, though.

I've updated anything that passes data to use a structure and will
make use of __s32 in place of ints.  If there ever exists an ILP64
system, we can use a flag bit of the structure to indicate 64bit file
descriptor support.

>> Darn it, guess we need to make everything 64bit, including file
>> descriptors.
>
> What's wrong with __u32/__s32 (or uint32_t/int32_t)?
>
> I really do not see Linux supporting an ABI that has no 32-bit type at
> all, especially in a situation where userspace compatibility is needed.
> If that does happen, the ABI breakage will go well beyond VFIO.

Yep, I think the structs fix this and still leave room for the impossible.
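
A minimal sketch of the fixed-width idea, assuming a hypothetical eventfd argument struct: with 32-bit fields only, the layout does not depend on the width of int or long, and a flag bit is left free for a future wider descriptor. The struct, SKETCH_FLAG_FD64, and arg_is_supported() are illustrative names, and the stdint types stand in for __u32/__s32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FLAG_FD64 (1u << 0)   /* hypothetical "64-bit fd" flag bit */

struct eventfd_arg_sketch {
    uint32_t argsz;      /* size of this struct as seen by the caller */
    uint32_t flags;
    int32_t  fd;         /* fixed 32-bit descriptor instead of plain int */
    uint32_t reserved;   /* keeps the struct a multiple of 8 bytes */
};

/* The layout is the same whatever the ABI's int/long widths are. */
_Static_assert(sizeof(struct eventfd_arg_sketch) == 16, "fixed layout");

/* A handler that only understands 32-bit descriptors rejects the flag. */
static int arg_is_supported(const struct eventfd_arg_sketch *a)
{
    assert(a != NULL);
    return a->argsz >= sizeof *a && !(a->flags & SKETCH_FLAG_FD64);
}
```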

>> The point of the group is to provide a unit of ownership.  We can't let
>> $userA open $groupid and fetch a device, then have $userB do the same,
>> grabbing a different device.  The mappings will step on each other and
>> the devices have no isolation.  We can't restrict that purely by file
>> permissions or we'll have the same problem with sudo.
>
> What is the problem with sudo?  If you're running processes as the same
> user, regardless of how, they're going to be able to mess with each
> other.

Just trying to indicate that file permissions are easy to bypass and
privileged users can inadvertently do stupid stuff.  Kind of like
request_region() in the kernel.   Kernel drivers are privileged, but
we still want to enforce an owner of that region.  VFIO extends the
ownership of a device to a single entity in userspace.  How do we
identify that entity and keep others out?

> Is it possible to expose access to only specific groups via an
> individually-permissionable /dev/device, so only the entity handing out
> access to devices needs access to everything?

Yes, that's fundamental to vfio.  vfio-bus drivers enumerate devices
to the vfio-core.  Privileged users bind devices to the vfio-bus
driver creating viable groups.  Groups are represented as chardevs
under /dev/vfio.  If a user has permission to access the chardev, they
have the ability to use the devices.  Once they get a device or iommu
descriptor the group is tied to them via the struct mm and only they
are permitted to access the other devices in the group.
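
A minimal sketch of the ownership rule described above, under the assumption that the first mm to obtain a descriptor becomes the group's owner and other mms are refused until the last reference is dropped. The types, group_get()/group_put(), and the -EBUSY return value are hypothetical and only model the policy, not the actual VFIO code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mm_sketch { int id; };          /* stand-in for struct mm_struct */

struct group_sketch {
    const struct mm_sketch *owner;     /* NULL while the group is unowned */
    unsigned users;                    /* descriptors held by the owner */
};

/* Take a device or iommu descriptor on behalf of mm. */
static int group_get(struct group_sketch *g, const struct mm_sketch *mm)
{
    if (g->owner && g->owner != mm)
        return -EBUSY;                 /* another mm already owns the group */
    g->owner = mm;
    g->users++;
    return 0;
}

/* Release a descriptor; the group becomes free when the last one goes. */
static void group_put(struct group_sketch *g)
{
    assert(g->users > 0);
    if (--g->users == 0)
        g->owner = NULL;
}
```

File permissions on the chardev decide who may try to open the group; a check of this kind is what then keeps a second user out once the group is in use.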

>> At one point we discussed a single open instance, but that
>> unnecessarily limits the user, so we settled on the mm.  Thanks,
>
> It would be nice if this limitation weren't excessively integrated into
> the design -- in the embedded space we've got unusual partitioning
> setups, including failover arrangements where partitions share devices.
> The device may be configured with the IOMMU pointing only at regions that
> are shared by both mms, or the non-shared regions may be reconfigured as
> active ownership of the device gets handed around.
>
> It would be up to userspace code to make sure that the mappings don't
> "step on each other".  The mapping could be done with whichever mm issued
> the map call for a given region.
>
> For this use case, there is unlikely to be an issue with ownership
> because there will not be separate privilege domains creating partitions
> -- other use cases could refrain from enabling multiple-mm support unless
> ownership issues are resolved.
>
> This doesn't need to be supported initially, but we should tr

[PATCHv3 RFC] virtio-pci: flexible configuration layout

2011-11-22 Thread Michael S. Tsirkin
Here's an updated version.
I'm alternating between updating the spec and the driver,
spec update to follow.

Compiled only.  Posting here for early feedback, and to allow Sasha to
proceed with his "kvm tool" work.

Changes from v2:
address comments by Rusty
bugfixes by Sasha
Changes from v1:
Updated to match v3 of the spec, see:

todo:
split core changes out

Signed-off-by: Michael S. Tsirkin 
---
 drivers/virtio/Kconfig  |   22 +
 drivers/virtio/virtio_pci.c |  203 ---
 include/asm-generic/io.h|4 +
 include/asm-generic/iomap.h |   11 +++
 include/linux/pci_regs.h|4 +
 include/linux/virtio_pci.h  |   41 +
 lib/iomap.c |   41 -
 7 files changed, 307 insertions(+), 19 deletions(-)

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 816ed08..465245e 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -25,6 +25,28 @@ config VIRTIO_PCI
 
  If unsure, say M.
 
+config VIRTIO_PCI_LEGACY
+   bool "Compatibility with Legacy PIO"
+   default y
+   depends on VIRTIO_PCI
+   ---help---
+ Look out into your driveway.  Do you have a flying car?  If
+ so, you can happily disable this option and virtio will not
+ break.  Otherwise, leave it set.  Unless you're testing what
+ life will be like in The Future.
+
+ In other words:
+
+ Support compatibility with legacy PIO mapping in hypervisors.
+ As of Nov 2011, this is required by all hypervisors without
+ exception, so for now, disabling this option is only useful for
+ testing.  In future hypervisors, it will be possible to disable
+ this option and get a slightly smaller kernel.
+ You know that your hypervisor is new enough if you disable this
+ option and the device initialization passes.
+
+ If unsure, say Y.
+
 config VIRTIO_BALLOON
tristate "Virtio balloon driver (EXPERIMENTAL)"
select VIRTIO
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 3d1bf41..681347b 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -37,8 +37,14 @@ struct virtio_pci_device
struct virtio_device vdev;
struct pci_dev *pci_dev;
 
-   /* the IO mapping for the PCI config space */
+   /* the IO address for the common PCI config space */
void __iomem *ioaddr;
+   /* the IO address for device specific config */
+   void __iomem *ioaddr_device;
+   /* the IO address to use for notifications operations */
+   void __iomem *ioaddr_notify;
+   /* the IO address to use for reading ISR */
+   void __iomem *ioaddr_isr;
 
/* a list of queues so we can dispatch IRQs */
spinlock_t lock;
@@ -57,8 +63,175 @@ struct virtio_pci_device
unsigned msix_used_vectors;
/* Whether we have vector per vq */
bool per_vq_vectors;
+
+   /* Various IO mappings: used for resource tracking only. */
+
+#ifdef CONFIG_VIRTIO_PCI_LEGACY
+   /* Legacy BAR0: typically PIO. */
+   void __iomem *legacy_map;
+#endif
+
+   /* Mappings specified by device capabilities: typically in MMIO */
+   void __iomem *isr_map;
+   void __iomem *notify_map;
+   void __iomem *common_map;
+   void __iomem *device_map;
 };
 
+#ifdef CONFIG_VIRTIO_PCI_LEGACY
+static void __iomem *virtio_pci_set_legacy_map(struct virtio_pci_device *vp_dev)
+{
+   return vp_dev->legacy_map = pci_iomap(vp_dev->pci_dev, 0, 256);
+}
+
+static void __iomem *virtio_pci_legacy_map(struct virtio_pci_device *vp_dev)
+{
+   return vp_dev->legacy_map;
+}
+#else
+static void __iomem *virtio_pci_set_legacy_map(struct virtio_pci_device *vp_dev)
+{
+   return NULL;
+}
+
+static void __iomem *virtio_pci_legacy_map(struct virtio_pci_device *vp_dev)
+{
+   return NULL;
+}
+#endif
+
+/*
+ * With PIO, device-specific config moves as MSI-X is enabled/disabled.
+ * Use this accessor to keep pointer to that config in sync.
+ */
+static void virtio_pci_set_msix_enabled(struct virtio_pci_device *vp_dev, int enabled)
+{
+   void __iomem* m;
+   vp_dev->msix_enabled = enabled;
+   m = virtio_pci_legacy_map(vp_dev);
+   if (m)
+   vp_dev->ioaddr_device = m + VIRTIO_PCI_CONFIG(vp_dev);
+   else
+   vp_dev->ioaddr_device = vp_dev->device_map;
+}
+
+static void __iomem *virtio_pci_map_cfg(struct virtio_pci_device *vp_dev, u8 cap_id,
+   u32 align)
+{
+u32 size;
+u32 offset;
+u8 bir;
+u8 cap_len;
+   int pos;
+   struct pci_dev *dev = vp_dev->pci_dev;
+   void __iomem *p;
+
+   for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
+pos > 0;
+pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
+   u8 id;
+   pci_read_config_byte

Re: [RFC PATCH] vfio: VFIO Driver core framework

2011-11-22 Thread Alex Williamson
On Mon, 2011-11-21 at 13:47 +1100, David Gibson wrote:
> On Fri, Nov 18, 2011 at 01:32:56PM -0700, Alex Williamson wrote:
> > On Thu, 2011-11-17 at 11:02 +1100, David Gibson wrote:
> > > On Tue, Nov 15, 2011 at 11:01:28AM -0700, Alex Williamson wrote:
> > > > On Tue, 2011-11-15 at 17:34 +1100, David Gibson wrote:
> > > > > On Thu, Nov 03, 2011 at 02:12:24PM -0600, Alex Williamson wrote:
 
> > > > As we've discussed previously, configfs provides part of this, but has
> > > > no ioctl support.  It doesn't make sense to me to go play with groups in
> > > > configfs, but then still interact with them via a char dev.
> > > 
> > > Why not?  You configure, say, loopback devices with losetup, then use
> > > them as a block device.  Similar with nbd.  You can configure serial
> > > devices with setserial, then use them as a char dev.
> > > 
> > > >  It also
> > > > splits the ownership model 
> > > 
> > > I'm not even sure what that means.
> > > 
> > > > and makes it harder to enforce who gets to
> > > > interact with the devices vs who gets to manipulate groups.
> > > 
> > > How so.
> > 
> > Let's map out what a configfs interface would look like, maybe I'll
> > convince myself it's on the table.  We'd probably start with
> 
> Hrm, assuming we used configfs, which is not the only option.

I'm not writing vfiofs, configfs seems most like what we'd need.  If
there are others we should consider, please note them.

> > /config/vfio/$bus_type.name/
> > 
> > That would probably be pre-populated with a bunch of $groupid files,
> > matching /dev/vfio/$bus_type.name/$groupid char dev files (assuming
> > configfs can pre-populate files).  To make a user defined group, we
> > might then do:
> > 
> > mkdir /config/vfio/$bus_type.name/my_group
> > 
> > That would generate a /dev/vfio/$bus_type.name/my_group char dev.  To
> > add groups to the new my_group "super group", we'd need to do something
> > like:
> > 
> > ln -s /config/vfio/$bus_type.name/$groupidA 
> > /config/vfio/$bus_type.name/my_group/nic_group
> > 
> > I might then add a second group as:
> > 
> > ln -s /config/vfio/$bus_type.name/$groupidB 
> > /config/vfio/$bus_type.name/my_group/hba_group
> > 
> > Either link could fail if the target group is not viable,
> 
> The link op shouldn't fail because the subgroup isn't viable.
> Instead, the supergroup just won't be viable until all devices in all
> subgroups are bound to vfio.

The supergroup may already be in use if it's a hotplug.  What does it
mean to have an incompatible group linked into the supergroup?  When
does the subgroup actually become part of the supergroup?  Does the
userspace driver using the supergroup get notified somehow?  Does the
vfio driver get notified independently?  This example continues to show
what an administration nightmare it becomes when we split management
from usage.

> > the group is
> > already in use, or the second link could fail if the iommu domains were
> > incompatible.
> > 
> > Do these links cause /dev/vfio/$bus_type.name/{$groupidA,$groupidB} to
> > disappear?  If not, do we allow them to be opened?  Linking would also
> > have to fail if we later tried to link one of these groupids to a
> > different super group.
> 
> Again, I think some confusion is coming in here from calling both the
> hardware determined thing and the admin determined thing a "group".
> So for now I'm going to call the first a "group" and the second a
> "predomain" (because once it's viable and the right conditions are set
> up it will become an iommu domain).
> 
> So another option is that "groups" *only* participate in the merging
> interface; getting iommu and device handles occurs only on a
> predomain.  Therefore there would be no /dev/vfio/$group, you would
> have to configure a predomain with at least one group before you had a
> device file.

I think this actually leads to a more complicated, more difficult to use
interface that interposes an unnecessary administration layer into a
driver's decisions about how to manage the iommu.

> > Now we want to give my_group to a user, so we have to go back to /dev
> > and
> > 
> > chown $user /dev/vfio/$bus_type.name/my_group
> > 
> > At this point my_group would have the existing set of group ioctls sans
> > {UN}MERGE, of course.
> > 
> > So $user can use the super group, but not manipulate its members.  Do
> > we then allow:
> > 
> > chown $user /config/vfio/$bus_type.name/my_group
> > 
> > If so, what does it imply about the user then doing:
> > 
> > ln -s /config/vfio/$bus_type.name/$groupidC 
> > /config/vfio/$bus_type.name/my_group/stolen_group
> > 
> > Would we instead need to chown the configfs groups as well as the super
> > group?
> > 
> > chown $user /config/vfio/$bus_type.name/my_group
> > chown $user /config/vfio/$bus_type.name/$groupidA
> > chown $user /config/vfio/$bus_type.name/$groupidB
> > 
> > ie:
> > 
> > # chown $user:$user /config/vfio/$bus_type.name/$groupC
> > $ ln -s /config/vfio/$bus_type.name/$groupidC 
> > /config/vfio/$bus_ty

Re: [RFC] kvm tools: Implement multiple VQ for virtio-net

2011-11-22 Thread Stephen Hemminger
I have been playing with userspace-rcu which has a number of neat
lockless routines for queuing and hashing. But there aren't kernel versions
and several of them may require cmpxchg to work.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


nested virtualization on Intel and needed cpu flags to pass

2011-11-22 Thread Gianluca Cecchi
Hello,
I'm going to test nested virtualization on Intel with Fedora 16 host.
I have used it successfully with Amd, where it is enabled by default
in its kvm-amd module.
Based on
https://github.com/torvalds/linux/blob/master/Documentation/virtual/kvm/nested-vmx.txt
and F16 having now
[root@f16 ~]# uname -r
3.1.1-2.fc16.x86_64
and the same confirmations in its kernel-doc
/usr/share/doc/kernel-doc-3.1.1/Documentation/virtual/kvm/nested-vmx.txt file

In F15:
# uname -r
2.6.40.6-0.fc15.x86_64
# modinfo kvm-intel
...
parm:   bypass_guest_pf:bool
parm:   vpid:bool
parm:   flexpriority:bool
parm:   ept:bool
parm:   unrestricted_guest:bool
parm:   emulate_invalid_guest_state:bool
parm:   vmm_exclusive:bool
parm:   yield_on_hlt:bool
parm:   ple_gap:int
parm:   ple_window:int

In F16 indeed we have now:
# modinfo kvm-intel
...
vermagic:   3.1.1-2.fc16.x86_64 SMP mod_unload
parm:   vpid:bool
parm:   flexpriority:bool
parm:   ept:bool
parm:   unrestricted_guest:bool
parm:   emulate_invalid_guest_state:bool
parm:   vmm_exclusive:bool
parm:   yield_on_hlt:bool
parm:   nested:bool
parm:   ple_gap:int
parm:   ple_window:int


Based on docs, the "nested" option is disabled by default on Intel
(and probably not changed in Fedora)
Just made some tests with these F16 packages some days ago:
kernel-3.1.0-7.fc16.x86_64
qemu-kvm-0.15.1-1.fc16.x86_64
virt-manager-0.9.0-7.fc16.noarch

host is Asus u36sd laptop with 8Gb of ram and an ssd disk
its cpu is:
model name  : Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

on host:
$ sudo systool -m kvm_intel -v|grep nested
[sudo] password for g.cecchi:
   nested  = "Y"

Preliminary results are not so good.
I created an F16 guest (f16vm), with default options
I then put its virtio disk in cache mode = none and I/O=native
I then selected "copy to guest" as cpu and it was put to "nehalem".

Inside the guest I create a windows xp with default values proposed by
virt-manager
(cd iso is winxp sp3)
So far I have not been able to complete the installation.
Results are better if I choose "core2duo" as the cpu of f16vm,
but the installation blocks at different points, never reaching the end
of the first "copy files" stage of the windows xp install.
This is because f16vm freezes (no more network, no more console...)
and I have to power it off.

I set up f16vm with serial console to see if something appears.. but
nothing appears when f16vm freezes

The host runs its F16 guest (f16vm) with this command:
# ps -ef|grep qemu
qemu 18638 1 85 15:39 ?00:03:45 /usr/bin/qemu-kvm -S
-M pc-0.14 -cpu
core2duo,+rdtscp,+x2apic,+xtpr,+tm2,+est,+vmx,+ds_cpl,+pbe,+tm,+ht,+ss,+acpi,+ds
-enable-kvm -m 2048 -smp 1,sockets=1,cores=1,threads=1 -name f16 -uuid
690251ac-b691-f320-1eba-9f7f61c2e0d3 -nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/f16.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive
file=/var/lib/libvirt/images/f16.img,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native
-device 
virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=23,id=hostnet0 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:62:12:4a,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
spicevmc,id=charchannel0,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
-usb -device usb-tablet,id=input0 -spice
port=5900,addr=127.0.0.1,disable-ticketing -vga qxl -global
qxl-vga.vram_size=67108864 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5


F16 guest (f16vm) runs its windows xp install guest with this command:

qemu  1355 1 92 15:40 ?00:03:38 /usr/bin/qemu-kvm -S
-M pc-0.14 -cpu
core2duo,+lahf_lm,+rdtscp,+hypervisor,+x2apic,+cx16,+vmx,+ss,-monitor
-enable-kvm -m 768 -smp 1,sockets=1,cores=1,threads=1 -name winxp
-uuid 9e7ed89e-1bab-b354-cccb-b545d0cc2c29 -nodefconfig -nodefaults
-chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/winxp.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime
-drive 
file=/var/lib/libvirt/images/winxp.img,if=none,id=drive-ide0-0-0,format=raw
-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2
-drive 
file=/var/lib/libvirt/images/winxp_sp3.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1
-netdev tap,fd=24,id=hostnet0 -device
rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:91:6b:e6,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -usb -device
usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga std -device
intel-hda,

Re: [Qemu-devel] Memory sync algorithm during migration

2011-11-22 Thread Pierre Riteau
On 22 nov. 2011, at 14:04, Oliver Hookins wrote:

> On Tue, Nov 22, 2011 at 10:31:58AM +0100, ext Juan Quintela wrote:
>> Oliver Hookins  wrote:
>>> On Tue, Nov 15, 2011 at 11:47:58AM +0100, ext Juan Quintela wrote:
>>>> Takuya Yoshikawa  wrote:
>>>>> Adding qemu-devel ML to CC.
>>>>>
>>>>> Your question should have been sent to qemu-devel ML because the logic
>>>>> is implemented in QEMU, not KVM.
>>>>>
>>>>> (2011/11/11 1:35), Oliver Hookins wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am performing some benchmarks on KVM migration on two different types
>>>>>> of VM.
>>>>>> One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM takes
>>>>>> about 20
>>>>>> seconds to migrate on our hardware while the 32GB VM takes about a
>>>>>> minute.
>>>>>>
>>>>>> With a reasonable amount of memory activity going on (in the hundreds of
>>>>>> MB per
>>>>>> second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM never
>>>>>> completes. Intuitively this tells me there is some watermarking of dirty
>>>>>> pages
>>>>>> going on that is not particularly efficient when the dirty pages ratio
>>>>>> is high
>>>>>> compared to total memory, but I may be completely incorrect.
>>>>
>>>>> You can change the ratio IIRC.
>>>>> Hopefully, someone who knows well about QEMU will tell you better ways.
>>>>>
>>>>>   Takuya
>>>>>
>>>>>>
>>>>>> Could anybody fill me in on what might be going on here? We're using
>>>>>> libvirt
>>>>>> 0.8.2 and kvm-83-224.el5.centos.1
>>>>
>>>> This is pretty old qemu/kvm code base.
>>>> In principle, it makes no sense that with 32GB RAM migration finishes,
>>>> and with 4GB RAM it is unable (intuitively it should be, if ever, the
>>>> other way around).
>>>>
>>>> Do you have an easy test that makes the problem easily reproducible?
>>>> Have you tried upstream qemu.git? (some improvements on that department).
>>> 
>>> I've just tried the qemu-kvm 0.14.1 tag which seems to be the latest that 
>>> builds
>>> on my platform. For some strange reason migrations always seem to fail in 
>>> one
>>> direction with "Unknown savevm section or instance 'hpet' 0" messages.
>> 
>> What is your platform?  This seems like you are running with hpet in one
>> side, but without it in the other.  What command line are you using?
> 
> Yes, my mistake. We were also testing later kernels and my test machines 
> managed
> to get out of sync. One had support for hpet clocksource but the other one
> didn't.
> 
>> 
>>> This seems to point to different migration protocols on either end but they 
>>> are
>>> both running the same version of qemu-kvm I built. Does this ring any bells 
>>> for
>>> anyone?
>> 
>> Command line mismatch.  But, what is your platform?
> 
> CentOS5.6. Now running the VMs through qemu-kvm 0.14.1, unloaded migrations 
> take
> about half the time but with memory I/O load now both VMs never complete the
> migration. In practical terms I'm writing about 50MB/s into memory and we 
> have a
> 10Gbps network (and I've seen real speeds up to 8-9Gbps on the wire) so there
> should be enough capacity to sync up the dirty pages.
> 
> So now the 32GB and 4GB VMs have matching behaviour (which makes more sense) 
> but
> I'm not any closer to figuring out what is going on.

You say you write 50 MB/s in memory, but this does not provide enough 
information to analyze the problem.
How distributed in memory are these writes? If your writes are not restricted 
to a small memory region, they could dirty many pages. In this case, live 
migration would have to transfer much more than 50 MB/s of pages to the 
destination.
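
To make the point above concrete, here is a rough worst-case estimate in C. It is only a sketch under stated assumptions: a 4 KiB guest page size, every write landing on a different page, and a hypothetical helper name. Real workloads fall somewhere between this bound and the raw write rate.

```c
#include <stdint.h>

#define GUEST_PAGE_SIZE 4096ULL  /* assumed guest page size in bytes */

/* Worst case for live migration: every write touches a different page,
 * so each write of write_size bytes forces a whole page to be resent.
 * Returns the resulting dirty rate in bytes per second. */
static uint64_t worst_case_dirty_bytes_per_sec(uint64_t write_bytes_per_sec,
                                               uint64_t write_size)
{
    uint64_t writes_per_sec = write_bytes_per_sec / write_size;

    return writes_per_sec * GUEST_PAGE_SIZE;
}
```

With 50 MB/s of scattered 64-byte writes this gives 781250 writes per second and 3.2 GB/s of dirtied pages, well above the roughly 1.25 GB/s a 10 Gbps link can carry. With page-sized sequential writes the dirty rate stays close to the 50 MB/s write rate.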

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/



Re: [Qemu-devel] Memory sync algorithm during migration

2011-11-22 Thread Pierre Riteau
On 22 nov. 2011, at 14:04, Oliver Hookins wrote:

> On Tue, Nov 22, 2011 at 10:31:58AM +0100, ext Juan Quintela wrote:
>> Oliver Hookins  wrote:
>>> On Tue, Nov 15, 2011 at 11:47:58AM +0100, ext Juan Quintela wrote:
>>>> Takuya Yoshikawa  wrote:
>>>>> Adding qemu-devel ML to CC.
>>>>>
>>>>> Your question should have been sent to qemu-devel ML because the logic
>>>>> is implemented in QEMU, not KVM.
>>>>>
>>>>> (2011/11/11 1:35), Oliver Hookins wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am performing some benchmarks on KVM migration on two different types
>>>>>> of VM.
>>>>>> One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM takes
>>>>>> about 20
>>>>>> seconds to migrate on our hardware while the 32GB VM takes about a
>>>>>> minute.
>>>>>>
>>>>>> With a reasonable amount of memory activity going on (in the hundreds of
>>>>>> MB per
>>>>>> second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM never
>>>>>> completes. Intuitively this tells me there is some watermarking of dirty
>>>>>> pages
>>>>>> going on that is not particularly efficient when the dirty pages ratio
>>>>>> is high
>>>>>> compared to total memory, but I may be completely incorrect.
>>>>
>>>>> You can change the ratio IIRC.
>>>>> Hopefully, someone who knows well about QEMU will tell you better ways.
>>>>>
>>>>>   Takuya
>>>>>
>>>>>>
>>>>>> Could anybody fill me in on what might be going on here? We're using
>>>>>> libvirt
>>>>>> 0.8.2 and kvm-83-224.el5.centos.1
>>>>
>>>> This is pretty old qemu/kvm code base.
>>>> In principle, it makes no sense that with 32GB RAM migration finishes,
>>>> and with 4GB RAM it is unable (intuitively it should be, if ever, the
>>>> other way around).
>>>>
>>>> Do you have an easy test that makes the problem easily reproducible?
>>>> Have you tried upstream qemu.git? (some improvements on that department).
>>> 
>>> I've just tried the qemu-kvm 0.14.1 tag which seems to be the latest that 
>>> builds
>>> on my platform. For some strange reason migrations always seem to fail in 
>>> one
>>> direction with "Unknown savevm section or instance 'hpet' 0" messages.
>> 
>> What is your platform?  This seems like you are running with hpet in one
>> side, but without it in the other.  What command line are you using?
> 
> Yes, my mistake. We were also testing later kernels and my test machines 
> managed
> to get out of sync. One had support for hpet clocksource but the other one
> didn't.
> 
>> 
>>> This seems to point to different migration protocols on either end but they 
>>> are
>>> both running the same version of qemu-kvm I built. Does this ring any bells 
>>> for
>>> anyone?
>> 
>> Command line mismatch.  But, what is your platform?
> 
> CentOS5.6. Now running the VMs through qemu-kvm 0.14.1, unloaded migrations 
> take
> about half the time but with memory I/O load now both VMs never complete the
> migration. In practical terms I'm writing about 50MB/s into memory and we 
> have a
> 10Gbps network (and I've seen real speeds up to 8-9Gbps on the wire) so there
> should be enough capacity to sync up the dirty pages.
> 
> So now the 32GB and 4GB VMs have matching behaviour (which makes more sense) 
> but
> I'm not any closer to figuring out what is going on.

Did you modify the max downtime? The default is 30 ms. At 8 Gbps, this only 
allows about 30 MB of data to be sent on the wire.
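
The arithmetic behind that figure, as a small C sketch. The helper names are hypothetical and the convergence test is a deliberately simplified model (remaining dirty data must fit in what the link can carry within the allowed downtime); it is not a copy of QEMU's logic.

```c
#include <stdint.h>

/* Bytes that can cross the wire during the final stop-and-copy phase.
 * bandwidth_bps is in bits per second, downtime_ms in milliseconds. */
static uint64_t downtime_budget_bytes(uint64_t bandwidth_bps,
                                      uint64_t downtime_ms)
{
    return bandwidth_bps / 8 * downtime_ms / 1000;
}

/* Simplified model: migration can only finish once the memory that is
 * still dirty fits into the downtime budget. Returns 1 if it fits. */
static int can_converge(uint64_t dirty_bytes, uint64_t bandwidth_bps,
                        uint64_t downtime_ms)
{
    return dirty_bytes <= downtime_budget_bytes(bandwidth_bps, downtime_ms);
}
```

At 8 Gbps and 30 ms the budget is 30,000,000 bytes, so under this model a guest that keeps more than about 30 MB dirty between passes never reaches the final phase unless the max downtime is raised.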

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/




[PATCH] Allow aligned byte and word writes to IOAPIC registers.

2011-11-22 Thread Julian Stecklina
This fixes byte accesses to IOAPIC_REG_SELECT as mandated by at least the
ICH10 and Intel Series 5 chipset specs. It also makes ioapic_mmio_write
consistent with ioapic_mmio_read, which also allows byte and word accesses.

Signed-off-by: Julian Stecklina 
---
 virt/kvm/ioapic.c |   17 +
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 3eed61e..e94ef6ba 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -332,9 +332,18 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
 (void*)addr, len, val);
ASSERT(!(addr & 0xf));  /* check alignment */
 
-   if (len == 4 || len == 8)
-   data = *(u32 *) val;
-   else {
+switch (len) {
+case 8:
+case 4:
+data = *(u32 *) val;
+break;
+case 2:
+data = *(u16 *) val;
+break;
+case 1:
+data = *(u8  *) val;
+break;
+default:
printk(KERN_WARNING "ioapic: Unsupported size %d\n", len);
return 0;
}
@@ -343,7 +352,7 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
spin_lock(&ioapic->lock);
switch (addr) {
case IOAPIC_REG_SELECT:
-   ioapic->ioregsel = data;
+   ioapic->ioregsel = data & 0xFF; /* 8-bit register */
break;
 
case IOAPIC_REG_WINDOW:
-- 
1.7.7.3
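
For readers who want to try the width handling outside the kernel, the idea of the patch can be sketched as standalone C. This is an illustration only: mmio_write_value() and ioregsel_from() are hypothetical names, and memcpy() is used instead of pointer casts to avoid alignment and aliasing problems in userspace.

```c
#include <stdint.h>
#include <string.h>

/* Read an MMIO write payload of len bytes and zero-extend it to 32 bits.
 * Returns 0 on success or -1 for an unsupported access width. */
static int mmio_write_value(const void *val, int len, uint32_t *data)
{
    uint16_t v16;
    uint8_t v8;

    switch (len) {
    case 8:
    case 4:
        /* First four bytes only: the low 32 bits on little-endian x86. */
        memcpy(data, val, sizeof(*data));
        return 0;
    case 2:
        memcpy(&v16, val, sizeof(v16));
        *data = v16;
        return 0;
    case 1:
        memcpy(&v8, val, sizeof(v8));
        *data = v8;
        return 0;
    default:
        return -1;
    }
}

/* The register select is 8 bits wide, so only the low byte is kept. */
static uint8_t ioregsel_from(uint32_t data)
{
    return (uint8_t)(data & 0xFF);
}
```

Masking in ioregsel_from() means a wider write to the select register cannot leave stray upper bits in the index.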




Re: [Qemu-devel] KVM call agenda for Novemeber 22

2011-11-22 Thread Stefan Hajnoczi
On Tue, Nov 22, 2011 at 3:00 PM, Stefan Hajnoczi  wrote:
> On Tue, Nov 22, 2011 at 2:39 PM, Juan Quintela  wrote:
>> Juan Quintela  wrote:
>>> Hi
>>>
>>> Please send in any agenda items you are interested in covering.
>>
>> As there is no topic for today, and Anthony just gave us some reading, we
>> will cancel today's call.
>>
>> Happy hacking (and reading), Juan.
>
> I think Anthony was trying to provide some background for a QEMU/KVM
> sandboxing discussion in today's call.  Let's see who is joining
> because they may want to discuss sandboxing.

It's 10 minutes past now.  Please ignore my request :).

Stefan


Re: IOAPIC doesn't handle byte writes

2011-11-22 Thread Julian Stecklina
Hello,

Avi Kivity wrote:
> Care to post a patch instead?

Sure. Never hacked KVM, though. Is there a particular reason why the
void *val argument to ioapic_mmio_read/_write is only dereferenced when
ioapic->lock is not held?

Regards, Julian



Re: [Qemu-devel] KVM call agenda for Novemeber 22

2011-11-22 Thread Stefan Hajnoczi
On Tue, Nov 22, 2011 at 2:39 PM, Juan Quintela  wrote:
> Juan Quintela  wrote:
>> Hi
>>
>> Please send in any agenda items you are interested in covering.
>
> As there is no topic for today, and Anthony just gave us some reading, we
> will cancel today's call.
>
> Happy hacking (and reading), Juan.

I think Anthony was trying to provide some background for a QEMU/KVM
sandboxing discussion in today's call.  Let's see who is joining
because they may want to discuss sandboxing.

Stefan


Re: KVM call agenda for Novemeber 22

2011-11-22 Thread Juan Quintela
Juan Quintela  wrote:
> Hi
>
> Please send in any agenda items you are interested in covering.

As there is no topic for today, and Anthony just gave us some reading, we
will cancel today's call.

Happy hacking (and reading), Juan.


Re: IOAPIC doesn't handle byte writes

2011-11-22 Thread Avi Kivity
On 11/22/2011 03:13 PM, Julian Stecklina wrote:
> Hello,
>
> KVM emulates an IOAPIC that doesn't handle byte writes to its
> IOAPIC_REG_SELECT register, although for example the ICH10 spec[1]
> clearly states that this is an 8-bit register. See
> http://www.intel.com/content/dam/doc/datasheet/io-controller-hub-10-family-datasheet.pdf
>  Table 13-4 on page 433.
>
> The code in question is:
>
> http://git.kernel.org/?p=virt/kvm/kvm.git;a=blob;f=virt/kvm/ioapic.c;h=3eed61eb48675a63dd1f31b0095217ab6bc5f646;hb=HEAD#l323
>
> This breaks IOAPIC code in OSes that adhere to the spec.

Agree, the code is broken.

> I would file a bug, but the kernel bugzilla seems to be down at the
> moment.

Care to post a patch instead?

-- 
error compiling committee.c: too many arguments to function



IOAPIC doesn't handle byte writes

2011-11-22 Thread Julian Stecklina
Hello,

KVM emulates an IOAPIC that doesn't handle byte writes to its
IOAPIC_REG_SELECT register, although for example the ICH10 spec[1]
clearly states that this is an 8-bit register. See
http://www.intel.com/content/dam/doc/datasheet/io-controller-hub-10-family-datasheet.pdf
 Table 13-4 on page 433.

The code in question is:

http://git.kernel.org/?p=virt/kvm/kvm.git;a=blob;f=virt/kvm/ioapic.c;h=3eed61eb48675a63dd1f31b0095217ab6bc5f646;hb=HEAD#l323

This breaks IOAPIC code in OSes that adhere to the spec.

I've created a small testcase[1]:

$ qemu-kvm -serial stdio -kernel ioapic
[26303.961804] ioapic: Unsupported size 1
IOAPIC ID  
[26303.970466] ioapic: Unsupported size 1
IOAPIC VER 
Done
qemu: terminating on signal 2
$ qemu-kvm  -no-kvm-irqchip -serial stdio -kernel ioapic 
IOAPIC ID  
IOAPIC VER 00170011
Done
qemu: terminating on signal 2

Expected behavior is that the IOAPIC register is not read as zero with
KVM irqchip emulation.

I would file a bug, but the kernel bugzilla seems to be down at the
moment.

Regards, Julian

[1] http://os.inf.tu-dresden.de/~jsteckli/tmp/ioapic





Re: Memory sync algorithm during migration

2011-11-22 Thread Oliver Hookins
On Tue, Nov 22, 2011 at 10:31:58AM +0100, ext Juan Quintela wrote:
> Oliver Hookins  wrote:
> > On Tue, Nov 15, 2011 at 11:47:58AM +0100, ext Juan Quintela wrote:
> >> Takuya Yoshikawa  wrote:
> >> > Adding qemu-devel ML to CC.
> >> >
> >> > Your question should have been sent to qemu-devel ML because the logic
> >> > is implemented in QEMU, not KVM.
> >> >
> >> > (2011/11/11 1:35), Oliver Hookins wrote:
> >> >> Hi,
> >> >>
> >> >> I am performing some benchmarks on KVM migration on two different types 
> >> >> of VM.
> >> >> One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM takes 
> >> >> about 20
> >> >> seconds to migrate on our hardware while the 32GB VM takes about a 
> >> >> minute.
> >> >>
> >> >> With a reasonable amount of memory activity going on (in the hundreds 
> >> >> of MB per
> >> >> second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM never
> >> >> completes. Intuitively this tells me there is some watermarking of 
> >> >> dirty pages
> >> >> going on that is not particularly efficient when the dirty pages ratio 
> >> >> is high
> >> >> compared to total memory, but I may be completely incorrect.
> >> 
> >> > You can change the ratio IIRC.
> >> > Hopefully, someone who knows well about QEMU will tell you better ways.
> >> >
> >> >  Takuya
> >> >
> >> >>
> >> >> Could anybody fill me in on what might be going on here? We're using 
> >> >> libvirt
> >> >> 0.8.2 and kvm-83-224.el5.centos.1
> >> 
> >> This is pretty old qemu/kvm code base.
> >> In principle, it makes no sense that with 32GB RAM migration finishes,
> >> and with 4GB RAM it is unable (intuitively it should be, if ever, the
> >> other way around).
> >> 
> >> Do you have an easy test that makes the problem easily reproducible?
> >> Have you tried upstream qemu.git? (some improvements on that department).
> >
> > I've just tried the qemu-kvm 0.14.1 tag which seems to be the latest that 
> > builds
> > on my platform. For some strange reason migrations always seem to fail in 
> > one
> > direction with "Unknown savevm section or instance 'hpet' 0" messages.
> 
> What is your platform?  This seems like you are running with hpet in one
> side, but without it in the other.  What command line are you using?

Yes, my mistake. We were also testing later kernels and my test machines managed
to get out of sync. One had support for hpet clocksource but the other one
didn't.

> 
> > This seems to point to different migration protocols on either end but they 
> > are
> > both running the same version of qemu-kvm I built. Does this ring any bells 
> > for
> > anyone?
> 
> Command line mismatch.  But, what is your platform?

CentOS 5.6. Now that the VMs run through qemu-kvm 0.14.1, unloaded migrations take
about half the time, but under memory I/O load neither VM ever completes the
migration. In practical terms I'm writing about 50MB/s into memory and we have a
10Gbps network (and I've seen real speeds up to 8-9Gbps on the wire) so there
should be enough capacity to sync up the dirty pages.

So now the 32GB and 4GB VMs have matching behaviour (which makes more sense) but
I'm not any closer to figuring out what is going on.
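
For what it's worth, the expectation above can be checked with back-of-the-envelope
arithmetic. The sketch below is a deliberately simplified pre-copy model, not QEMU's
actual algorithm: every pass resends whatever is currently dirty, and the guest keeps
dirtying memory at a constant rate while that pass is on the wire. The function name,
its parameters, the 1MB stop threshold and the 1000MB/s figure (roughly 8Gbps) are all
assumptions made for illustration.

```c
#include <assert.h>

/*
 * Simplified pre-copy model (an assumption, not QEMU's real code).
 * Sizes are in MB, rates in MB/s. Returns the number of passes needed
 * until the remaining dirty data is <= threshold_mb, or -1 if that does
 * not happen within max_passes.
 */
int precopy_passes(double ram_mb, double dirty_mb_s, double link_mb_s,
                   double threshold_mb, int max_passes)
{
    double remaining = ram_mb;      /* first pass sends all of RAM */

    assert(link_mb_s > 0.0);
    for (int pass = 1; pass <= max_passes; pass++) {
        double seconds = remaining / link_mb_s;    /* time on the wire */
        double redirtied = dirty_mb_s * seconds;   /* dirtied meanwhile */

        if (redirtied > ram_mb)     /* cannot dirty more than all of RAM */
            redirtied = ram_mb;
        remaining = redirtied;
        if (remaining <= threshold_mb)
            return pass;
    }
    return -1;
}
```

Under this model precopy_passes(4096, 50, 1000, 1, 30) returns 3, so a 50MB/s dirty
rate should converge quickly at wire speed. If the effective transfer rate is lower
than the dirty rate, for example precopy_passes(4096, 50, 32, 1, 30), the model never
converges and returns -1.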
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/7] KVM: x86 emulator: Use opcode::execute for IN/OUT

2011-11-22 Thread Avi Kivity
On 11/22/2011 12:49 PM, Avi Kivity wrote:
> On 11/22/2011 08:16 AM, Takuya Yoshikawa wrote:
> > IN : E4, E5, EC, ED
> > OUT: E6, E7, EE, EF
> >
> > @@ -3867,11 +3888,12 @@ special_insn:
> > case 0x6c:  /* insb */
> > case 0x6d:  /* insw/insd */
> > ctxt->src.val = ctxt->regs[VCPU_REGS_RDX];
> > -   goto do_io_in;
> > +   rc = em_in(ctxt);
> > +   break;
> > case 0x6e:  /* outsb */
> > case 0x6f:  /* outsw/outsd */
> > ctxt->dst.val = ctxt->regs[VCPU_REGS_RDX];
> > -   goto do_io_out;
> > +   rc = em_out(ctxt);
> > break;
> > case 0x70 ... 0x7f: /* jcc (short) */
> > if (test_cc(ctxt->b, ctxt->eflags))
> >
>
> We have SrcDX/DstDX for these.
>

Everything else looks good; no need to regenerate this, it can be done
as a follow up patch if you like.

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH 1/7] KVM: x86 emulator: Use opcode::execute for IN/OUT

2011-11-22 Thread Avi Kivity
On 11/22/2011 08:16 AM, Takuya Yoshikawa wrote:
> IN : E4, E5, EC, ED
> OUT: E6, E7, EE, EF
>
> @@ -3867,11 +3888,12 @@ special_insn:
>   case 0x6c:  /* insb */
>   case 0x6d:  /* insw/insd */
>   ctxt->src.val = ctxt->regs[VCPU_REGS_RDX];
> - goto do_io_in;
> + rc = em_in(ctxt);
> + break;
>   case 0x6e:  /* outsb */
>   case 0x6f:  /* outsw/outsd */
>   ctxt->dst.val = ctxt->regs[VCPU_REGS_RDX];
> - goto do_io_out;
> + rc = em_out(ctxt);
>   break;
>   case 0x70 ... 0x7f: /* jcc (short) */
>   if (test_cc(ctxt->b, ctxt->eflags))
>

We have SrcDX/DstDX for these.
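
To illustrate the idea, here is a toy model of table-driven operand decode. It is a
sketch only, not the code in arch/x86/kvm/emulate.c: the toy_* names, the flag values
and the register index are hypothetical. The point is that once the table entry
carries a "port comes from DX" flag, the decode step fills in the operand and the
per-opcode case that copies RDX by hand is no longer needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_REG_RDX = 2 };            /* hypothetical register index */
#define TOY_SRC_DX (1u << 0)         /* source operand is the port in DX */
#define TOY_DST_DX (1u << 1)         /* destination operand is the port in DX */

struct toy_ctxt {
    uint64_t regs[16];
    uint64_t src_val, dst_val;
    uint16_t port_read, port_written;   /* recorded by the toy handlers */
};

struct toy_opcode {
    uint32_t flags;
    int (*execute)(struct toy_ctxt *c);
};

/* Toy handlers: only record which port they were asked to access. */
static int toy_in(struct toy_ctxt *c)  { c->port_read = (uint16_t)c->src_val; return 0; }
static int toy_out(struct toy_ctxt *c) { c->port_written = (uint16_t)c->dst_val; return 0; }

static const struct toy_opcode toy_table[256] = {
    [0x6c] = { TOY_SRC_DX, toy_in  },   /* insb */
    [0x6d] = { TOY_SRC_DX, toy_in  },   /* insw/insd */
    [0x6e] = { TOY_DST_DX, toy_out },   /* outsb */
    [0x6f] = { TOY_DST_DX, toy_out },   /* outsw/outsd */
};

/* Decode operands from the flags, then dispatch through the table. */
int toy_emulate(struct toy_ctxt *c, uint8_t b)
{
    const struct toy_opcode *op = &toy_table[b];

    assert(c != NULL);
    if (op->execute == NULL)
        return -1;                      /* opcode not handled by this toy */
    if (op->flags & TOY_SRC_DX)         /* I/O port numbers are 16 bits wide */
        c->src_val = c->regs[TOY_REG_RDX] & 0xffff;
    if (op->flags & TOY_DST_DX)
        c->dst_val = c->regs[TOY_REG_RDX] & 0xffff;
    return op->execute(c);
}
```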

-- 
error compiling committee.c: too many arguments to function



Re: [FYI] Need to do a full rebuild if you are on Linux x86 host

2011-11-22 Thread Avi Kivity
On 11/22/2011 12:01 PM, Paolo Bonzini wrote:
> On 11/22/2011 10:41 AM, Gerd Hoffmann wrote:
>> On 11/22/11 01:25, Anthony Liguori wrote:
>>> Due to this commit:
>>>
>>> commit 40d6444e91c6ab17e5e8ab01d4eece90cbc4afed
>>> Author: Avi Kivity
>>> Date:   Tue Nov 15 20:12:17 2011 +0200
>>>
>>>  configure: build position independent executables on x86-Linux hosts
>>>
>>> PIE binaries cannot be linked with non-PIE binaries and make is not
>>> smart enough to rebuild when the CFLAGS have changed.
>>
>> Breaks build on RHEL-5 and probably also other not-so-recent linux
>> distros.
>>
>> [ ... ]
>>   CC    i386-softmmu/exec.o
>> [ ... ]
>>   LINK  i386-softmmu/qemu-system-i386
>> /usr/bin/ld: exec.o: relocation R_X86_64_TPOFF32 against
>> `tls__cpu_single_env' can not be used when making a shared object;
>> recompile with -fPIC
>> exec.o: could not read symbols: Bad value
>> collect2: ld returned 1 exit status
>> make[1]: *** [qemu-system-i386] Error 1
>> make: *** [subdir-i386-softmmu] Error 2
>
> It can be worked around by replacing "-fpie" with "-fpic" or (to avoid
> a rather bad performance degradation) "-fpic -ftls-model=initial-exec"
> but it's a bug in the linker and it should be fixed in RHEL:
>
> http://sourceware.org/bugzilla/show_bug.cgi?id=10434 (upstream BZ)
> https://bugzilla.redhat.com/show_bug.cgi?id=755872 (RHEL BZ)

I'll extend the configure build test to include a tls variable.  If
anyone's interested in tweaking it for older distros, that's for 1.1.
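
A probe along these lines might look like the sketch below. The actual test in QEMU's
configure is not shown in this thread, so the names here are hypothetical, and
__thread is the GCC/Clang spelling of thread-local storage. The idea is to build it
with -fPIE and link it with -pie; a toolchain with the linker bug would be expected
to fail at link time, and configure could then fall back to a non-PIE build.

```c
#include <assert.h>
#include <limits.h>

/*
 * Non-static thread-local variable: accessing it makes the compiler emit
 * a TLS relocation, the kind of relocation the old linker rejected.
 */
__thread int tls_probe_counter;

/* Bump and return the calling thread's counter. */
int tls_probe_bump(void)
{
    assert(tls_probe_counter < INT_MAX);   /* guard against signed overflow */
    return ++tls_probe_counter;
}
```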

-- 
error compiling committee.c: too many arguments to function



Re: [FYI] Need to do a full rebuild if you are on Linux x86 host

2011-11-22 Thread Paolo Bonzini

On 11/22/2011 10:41 AM, Gerd Hoffmann wrote:
> On 11/22/11 01:25, Anthony Liguori wrote:
>> Due to this commit:
>>
>> commit 40d6444e91c6ab17e5e8ab01d4eece90cbc4afed
>> Author: Avi Kivity
>> Date:   Tue Nov 15 20:12:17 2011 +0200
>>
>>  configure: build position independent executables on x86-Linux hosts
>>
>> PIE binaries cannot be linked with non-PIE binaries and make is not
>> smart enough to rebuild when the CFLAGS have changed.
>
> Breaks build on RHEL-5 and probably also other not-so-recent linux distros.
>
> [ ... ]
>   CC    i386-softmmu/exec.o
> [ ... ]
>   LINK  i386-softmmu/qemu-system-i386
> /usr/bin/ld: exec.o: relocation R_X86_64_TPOFF32 against
> `tls__cpu_single_env' can not be used when making a shared object;
> recompile with -fPIC
> exec.o: could not read symbols: Bad value
> collect2: ld returned 1 exit status
> make[1]: *** [qemu-system-i386] Error 1
> make: *** [subdir-i386-softmmu] Error 2


It can be worked around by replacing "-fpie" with "-fpic" or (to avoid a 
rather bad performance degradation) "-fpic -ftls-model=initial-exec" but 
it's a bug in the linker and it should be fixed in RHEL:


http://sourceware.org/bugzilla/show_bug.cgi?id=10434 (upstream BZ)
https://bugzilla.redhat.com/show_bug.cgi?id=755872 (RHEL BZ)

Paolo


[PATCH] kvm-tpr-opt: Fix instruction_is_ok() for push tpr

2011-11-22 Thread Markus Armbruster
Missing break spotted by Coverity.

Signed-off-by: Markus Armbruster 
---
 kvm-tpr-opt.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kvm-tpr-opt.c b/kvm-tpr-opt.c
index 4b2bd47..14c4b36 100644
--- a/kvm-tpr-opt.c
+++ b/kvm-tpr-opt.c
@@ -152,6 +152,7 @@ static int instruction_is_ok(CPUState *env, uint64_t rip, int is_write)
         if (modrm_reg(b2) != 6 || !is_abs_modrm(b2))
             return 0;
         addr_offset = 2;
+        break;
     default:
         return 0;
     }
-- 
1.7.6.4
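
For readers who have not hit this class of bug, here is a minimal standalone sketch
of what a missing break does. It is not the code from kvm-tpr-opt.c: the function
names and the single opcode case are hypothetical. Without the break, the accepted
case runs on into default and is rejected like any unknown instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy variant: returns 0 (reject) even for the opcode it means to accept. */
int toy_insn_ok_buggy(uint8_t opcode)
{
    int addr_offset = 0;

    switch (opcode) {
    case 0xff:
        addr_offset = 2;
        /* BUG: no break here, so control continues into default */
        /* fall through */
    default:
        return 0;
    }
    return addr_offset;   /* never reached in this variant */
}

/* Fixed variant: the break lets the accepted case leave the switch. */
int toy_insn_ok_fixed(uint8_t opcode)
{
    int addr_offset = 0;

    switch (opcode) {
    case 0xff:
        addr_offset = 2;
        break;
    default:
        return 0;
    }
    assert(addr_offset > 0);   /* every accepted case must set an offset */
    return addr_offset;
}
```

With these definitions toy_insn_ok_buggy(0xff) returns 0 while
toy_insn_ok_fixed(0xff) returns 2; both return 0 for any other opcode.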



Re: [FYI] Need to do a full rebuild if you are on Linux x86 host

2011-11-22 Thread Gerd Hoffmann
On 11/22/11 01:25, Anthony Liguori wrote:
> Due to this commit:
> 
> commit 40d6444e91c6ab17e5e8ab01d4eece90cbc4afed
> Author: Avi Kivity 
> Date:   Tue Nov 15 20:12:17 2011 +0200
> 
> configure: build position independent executables on x86-Linux hosts
> 
> PIE binaries cannot be linked with non-PIE binaries and make is not
> smart enough to rebuild when the CFLAGS have changed.

Breaks build on RHEL-5 and probably also other not-so-recent linux distros.

[ ... ]
  CC    i386-softmmu/exec.o
[ ... ]
  LINK  i386-softmmu/qemu-system-i386
/usr/bin/ld: exec.o: relocation R_X86_64_TPOFF32 against
`tls__cpu_single_env' can not be used when making a shared object;
recompile with -fPIC
exec.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [qemu-system-i386] Error 1
make: *** [subdir-i386-softmmu] Error 2

cheers,
  Gerd

PS: for those with ipv6 connectivity the full log is available at
http://spunk.home.kraxel.org/bb/builders/rhel5-default/builds/98/steps/compile/logs/stdio


Re: Memory sync algorithm during migration

2011-11-22 Thread Juan Quintela
Oliver Hookins  wrote:
> On Tue, Nov 15, 2011 at 11:47:58AM +0100, ext Juan Quintela wrote:
>> Takuya Yoshikawa  wrote:
>> > Adding qemu-devel ML to CC.
>> >
>> > Your question should have been sent to qemu-devel ML because the logic
>> > is implemented in QEMU, not KVM.
>> >
>> > (2011/11/11 1:35), Oliver Hookins wrote:
>> >> Hi,
>> >>
>> >> I am performing some benchmarks on KVM migration on two different types of VM.
>> >> One has 4GB RAM and the other 32GB. More or less idle, the 4GB VM takes about 20
>> >> seconds to migrate on our hardware while the 32GB VM takes about a minute.
>> >>
>> >> With a reasonable amount of memory activity going on (in the hundreds of MB per
>> >> second) the 32GB VM takes 3.5 minutes to migrate, but the 4GB VM never
>> >> completes. Intuitively this tells me there is some watermarking of dirty pages
>> >> going on that is not particularly efficient when the dirty pages ratio is high
>> >> compared to total memory, but I may be completely incorrect.
>> 
>> > You can change the ratio IIRC.
>> > Hopefully, someone who knows well about QEMU will tell you better ways.
>> >
>> >Takuya
>> >
>> >>
>> >> Could anybody fill me in on what might be going on here? We're using libvirt
>> >> 0.8.2 and kvm-83-224.el5.centos.1
>> 
>> This is pretty old qemu/kvm code base.
>> In principle, it makes no sense that with 32GB RAM migration finishes,
>> and with 4GB RAM it is unable (intuitively it should be, if ever, the
>> other way around).
>> 
>> Do you have an easy test that makes the problem easily reproducible?
>> Have you tried upstream qemu.git? (some improvements in that department).
>
> I've just tried the qemu-kvm 0.14.1 tag which seems to be the latest that builds
> on my platform. For some strange reason migrations always seem to fail in one
> direction with "Unknown savevm section or instance 'hpet' 0" messages.

What is your platform?  This seems like you are running with hpet in one
side, but without it in the other.  What command line are you using?

> This seems to point to different migration protocols on either end but they are
> both running the same version of qemu-kvm I built. Does this ring any bells for
> anyone?

Command line mismatch.  But, what is your platform?

Later, Juan.
