Re: [PATCH: kvm 4/5] Fix hotremove of CPUs for KVM.

2009-09-27 Thread Zachary Amsden

On 09/26/2009 10:54 PM, Avi Kivity wrote:


First, I'm not sure per_cpu works for possible but not actual cpus.  
Second, we now eagerly allocate but lazily free, leading to lots of 
ifs and buts.  I think the code can be cleaner by eagerly allocating 
and eagerly freeing.


Eager freeing requires a hotplug remove notification to the arch layer.  
I had done that originally, but I'm not sure.


How does per_cpu() work when defined in a module anyway?  The linker 
magic going on here evades a simple one-minute analysis.


Zach
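
For reference, a minimal sketch of the eager-allocate/eager-free scheme
discussed above, assuming the 2.6.31-era CPU hotplug notifier API;
struct foo_cpu_data and its fields are placeholders, not the actual
SVM/VMX per-cpu structures:

#include <linux/cpu.h>
#include <linux/notifier.h>
#include <linux/percpu.h>
#include <linux/slab.h>

/* Placeholder for the real per-cpu state (svm_data / vmxarea). */
struct foo_cpu_data {
	void *save_area;
};

static DEFINE_PER_CPU(struct foo_cpu_data *, foo_cpu_data);

static int __cpuinit foo_cpu_callback(struct notifier_block *nb,
				      unsigned long action, void *hcpu)
{
	int cpu = (long)hcpu;

	switch (action) {
	case CPU_UP_PREPARE:
	case CPU_UP_PREPARE_FROZEN:
		/* Eager allocation: the cpu is not online yet, so this
		 * runs in a context that may sleep. */
		per_cpu(foo_cpu_data, cpu) =
			kzalloc(sizeof(struct foo_cpu_data), GFP_KERNEL);
		if (!per_cpu(foo_cpu_data, cpu))
			return NOTIFY_BAD;
		break;
	case CPU_UP_CANCELED:
	case CPU_UP_CANCELED_FROZEN:
	case CPU_DEAD:
	case CPU_DEAD_FROZEN:
		/* Eager free: drop the state as soon as the cpu is gone
		 * (or never came up), so nothing lingers for offline cpus. */
		kfree(per_cpu(foo_cpu_data, cpu));
		per_cpu(foo_cpu_data, cpu) = NULL;
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block foo_cpu_notifier __cpuinitdata = {
	.notifier_call = foo_cpu_callback,
};

Registering this with register_cpu_notifier() at module init (and
unregistering on exit) completes the picture; static per-cpu variables
are set up for every possible cpu at boot, so per_cpu() on an offline
cpu should be safe for a pointer like this.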


Re: [PATCH: kvm 3/5] Fix hotadd of CPUs for KVM.

2009-09-27 Thread Zachary Amsden

On 09/26/2009 10:52 PM, Avi Kivity wrote:

On 09/25/2009 03:47 AM, Zachary Amsden wrote:


--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1716,9 +1716,6 @@ static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
  void *v)
  {
  int cpu = (long)v;

-if (!kvm_usage_count)
-return NOTIFY_OK;
-


Why?  You'll now do hardware_enable() even if kvm is not in use.


Because otherwise you'll never allocate and hardware_enable_all will fail:

Switch to broadcast mode on CPU1
svm_hardware_enable: svm_data is NULL on 1
kvm: enabling virtualization on CPU1 failed
qemu-system-x86[8698]: segfault at 20 ip 004db22f sp 
7fff0a3b4560 error 6 in qemu-system-x86_64[40+21f000]


Perhaps I can make this work better by putting the allocation within 
hardware_enable_all.


Zach
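
One possible shape for that, as a sketch only: do the per-cpu
allocation up front in hardware_enable_all(), before enabling hardware
on each cpu.  kvm_arch_hardware_alloc() is a hypothetical arch hook;
the other names follow kvm_main.c as of this series:

static int hardware_enable_all(void)
{
	int cpu, r = 0;

	/*
	 * Allocate per-cpu state before taking kvm_lock, so the arch
	 * hook is free to use GFP_KERNEL.  kvm_arch_hardware_alloc()
	 * is hypothetical.
	 */
	for_each_online_cpu(cpu) {
		r = kvm_arch_hardware_alloc(cpu);
		if (r)
			return r;
	}

	spin_lock(&kvm_lock);

	kvm_usage_count++;
	if (kvm_usage_count == 1) {
		atomic_set(&hardware_enable_failed, 0);
		on_each_cpu(hardware_enable, NULL, 1);

		if (atomic_read(&hardware_enable_failed)) {
			/* roll back: disable on all cpus and drop the count */
			on_each_cpu(hardware_disable, NULL, 1);
			kvm_usage_count--;
			r = -EBUSY;
		}
	}

	spin_unlock(&kvm_lock);

	return r;
}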


buildbot failure in qemu-kvm on default_x86_64_debian_5_0

2009-09-27 Thread qemu-kvm
The Buildbot has detected a new failure of default_x86_64_debian_5_0 on 
qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_debian_5_0/builds/87

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_1

Build Reason: The Nightly scheduler named 'nightly_default' triggered this build
Build Source Stamp: [branch master] HEAD
Blamelist: 

BUILD FAILED: failed git

sincerely,
 -The Buildbot



trivial patch: echo -e in ./configure

2009-09-27 Thread Michael Tokarev

The following one-liner eliminates an annoying "-e" in the output of a
./configure run if your /bin/sh is not bash or ksh:

$ ./configure
...
IO thread no
Install blobs yes
-e KVM support   yes <===
KVM trace support no
fdt support   no
preadv supportno
$ _

(I don't know if it's a qemu or a kvm thing.)
Thanks!

/mjt

---
--- qemu-kvm-0.11.0/configure.sav   2009-09-23 11:30:02.0 +0400
+++ qemu-kvm-0.11.0/configure   2009-09-27 20:04:03.230408438 +0400
@@ -1591 +1591 @@
-echo -e "KVM support   $kvm"
+echo "KVM support   $kvm"


Unix domain socket device

2009-09-27 Thread Giuseppe Coviello
Hi all, as I read on the website, one item on the kvm TODO list is 
"Add a Unix domain socket device. With this, the guest can talk to a pci 
device which is connected to a Unix domain socket on the host."  It is 
classified as a smaller-scale task that can be done by someone wishing 
to get involved.


Since the Unix domain socket device is exactly what I need for my degree 
thesis, I can (in fact, I have to) develop this device, but I'm a little lost in 
the kvm sources and documentation, so I need someone to point me to 
the right place (documentation and source code) to start from.


I have a fair knowledge of programming, especially in C, and a "not so 
bad" knowledge of the linux sources, since I supervised the porting of linux 
to the Sam440ep board[1].


Regards, Giuseppe

[1] http://en.wikipedia.org/wiki/Sam440ep


Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-27 Thread Joerg Roedel
On Sun, Sep 27, 2009 at 04:18:00PM +0200, Avi Kivity wrote:
> On 09/27/2009 04:07 PM, Joerg Roedel wrote:
>> On Sun, Sep 27, 2009 at 03:47:55PM +0200, Avi Kivity wrote:
>>
>>> On 09/27/2009 03:46 PM, Joerg Roedel wrote:
>>>  

> We can't find exactly which vcpu, but we can:
>
> - rule out threads that are not vcpus for this guest
> - rule out threads that are already running
>
> A major problem with sleep() is that it effectively reduces the vm
> priority relative to guests that don't have spinlock contention.  By
> selecting a random nonrunnable vcpu belonging to this guest, we at least
> preserve the guest's timeslice.
>
>  
 Ok, that makes sense. But before trying that we should probably try to
 call just yield() instead of schedule()? I remember someone from our
 team here at AMD did this for Xen a while ago and already had pretty
 good results with that. Xen has a completely different scheduler, but maybe
 it's worth trying?


>>> yield() is a no-op in CFS.
>>>  
>> Hmm, true. At least when kernel.sched_compat_yield == 0, which it is on my
>> distro.
>> If the scheduler would give us something like a real_yield() function
>> which assumes kernel.sched_compat_yield = 1, that might help. At least it's
>> better than sleeping for some random amount of time.
>>
>>
>
> Depends.  If it's a global yield(), yes.  If it's a local yield() that  
> doesn't rebalance the runqueues we might be left with the spinning task  
> re-running.

Only one runnable task on each cpu is unlikely in a situation of high
vcpu overcommit (where pause filtering matters).

> Also, if yield means "give up the remainder of our timeslice", then we  
> potentially end up sleeping a much longer random amount of time.  If we  
> yield to another vcpu in the same guest we might not care, but if we  
> yield to some other guest we're seriously penalizing ourselves.

I agree that a directed yield with a possible rebalance would be good to
have, but it is very intrusive to the scheduler code, and I think we
should at least try whether this simpler approach already gives us good
results.

Joerg



Re: [PATCH 5/5] Notify nested hypervisor of lost event injections

2009-09-27 Thread Joerg Roedel
Hi Avi,

can you please apply this patch (only 5/5) directly, before Alex does a
repost? It is pretty independent from the others, contains an
important bugfix for nested svm, and should go in as soon as possible.

Joerg

On Fri, Sep 18, 2009 at 03:00:32PM +0200, Alexander Graf wrote:
> Normally when event_inj is valid the host CPU would write the contents to
> exit_int_info, so the hypervisor knows that the event wasn't injected.
> 
> We failed to do so so far, so let's model closer to the CPU.
> 
> Signed-off-by: Alexander Graf 
> ---
>  arch/x86/kvm/svm.c |   16 
>  1 files changed, 16 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 12ec8ee..75e3d75 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1643,6 +1643,22 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
>   nested_vmcb->control.exit_info_2   = vmcb->control.exit_info_2;
>   nested_vmcb->control.exit_int_info = vmcb->control.exit_int_info;
>   nested_vmcb->control.exit_int_info_err = 
> vmcb->control.exit_int_info_err;
> +
> + /*
> +  * If we emulate a VMRUN/#VMEXIT in the same host #vmexit cycle we have
> +  * to make sure that we do not lose injected events. So check event_inj
> +  * here and copy it to exit_int_info if it is valid.
> +  * exit_int_info and event_inj can't be both valid because the below
> +  * case only happens on a VMRUN instruction intercept which has not
> +  * valid exit_int_info set.
> +  */
> + if (vmcb->control.event_inj & SVM_EVTINJ_VALID) {
> + struct vmcb_control_area *nc = &nested_vmcb->control;
> +
> + nc->exit_int_info = vmcb->control.event_inj;
> + nc->exit_int_info_err = vmcb->control.event_inj_err;
> + }
> +
>   nested_vmcb->control.tlb_ctl   = 0;
>   nested_vmcb->control.event_inj = 0;
>   nested_vmcb->control.event_inj_err = 0;
> -- 
> 1.6.0.2
> 


Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-27 Thread Avi Kivity

On 09/27/2009 04:07 PM, Joerg Roedel wrote:

On Sun, Sep 27, 2009 at 03:47:55PM +0200, Avi Kivity wrote:
   

On 09/27/2009 03:46 PM, Joerg Roedel wrote:
 
   

We can't find exactly which vcpu, but we can:

- rule out threads that are not vcpus for this guest
- rule out threads that are already running

A major problem with sleep() is that it effectively reduces the vm
priority relative to guests that don't have spinlock contention.  By
selecting a random nonrunnable vcpu belonging to this guest, we at least
preserve the guest's timeslice.

 

Ok, that makes sense. But before trying that we should probably try to
call just yield() instead of schedule()? I remember someone from our
team here at AMD did this for Xen a while ago and already had pretty
good results with that. Xen has a completely different scheduler, but maybe
it's worth trying?

   

yield() is a no-op in CFS.
 

Hmm, true. At least when kernel.sched_compat_yield == 0, which it is on my
distro.
If the scheduler would give us something like a real_yield() function
which assumes kernel.sched_compat_yield = 1, that might help. At least it's
better than sleeping for some random amount of time.

   


Depends.  If it's a global yield(), yes.  If it's a local yield() that 
doesn't rebalance the runqueues we might be left with the spinning task 
re-running.


Also, if yield means "give up the remainder of our timeslice", then we 
potentially end up sleeping a much longer random amount of time.  If we 
yield to another vcpu in the same guest we might not care, but if we 
yield to some other guest we're seriously penalizing ourselves.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-27 Thread Joerg Roedel
On Sun, Sep 27, 2009 at 03:47:55PM +0200, Avi Kivity wrote:
> On 09/27/2009 03:46 PM, Joerg Roedel wrote:
>>
>>> We can't find exactly which vcpu, but we can:
>>>
>>> - rule out threads that are not vcpus for this guest
>>> - rule out threads that are already running
>>>
>>> A major problem with sleep() is that it effectively reduces the vm
>>> priority relative to guests that don't have spinlock contention.  By
>>> selecting a random nonrunnable vcpu belonging to this guest, we at least
>>> preserve the guest's timeslice.
>>>  
>> Ok, that makes sense. But before trying that we should probably try to
>> call just yield() instead of schedule()? I remember someone from our
>> team here at AMD did this for Xen a while ago and already had pretty
>> good results with that. Xen has a completely different scheduler, but maybe
>> it's worth trying?
>>
>
> yield() is a no-op in CFS.

Hmm, true. At least when kernel.sched_compat_yield == 0, which it is on my
distro.
If the scheduler would give us something like a real_yield() function
>> which assumes kernel.sched_compat_yield = 1, that might help. At least it's
better than sleeping for some random amount of time.

Joerg



Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-27 Thread Avi Kivity

On 09/27/2009 03:46 PM, Joerg Roedel wrote:



We can't find exactly which vcpu, but we can:

- rule out threads that are not vcpus for this guest
- rule out threads that are already running

A major problem with sleep() is that it effectively reduces the vm
priority relative to guests that don't have spinlock contention.  By
selecting a random nonrunnable vcpu belonging to this guest, we at least
preserve the guest's timeslice.
 

Ok, that makes sense. But before trying that we should probably try to
call just yield() instead of schedule()? I remember someone from our
team here at AMD did this for Xen a while ago and already had pretty
good results with that. Xen has a completely different scheduler, but maybe
it's worth trying?
   


yield() is a no-op in CFS.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-27 Thread Joerg Roedel
On Sun, Sep 27, 2009 at 10:31:21AM +0200, Avi Kivity wrote:
> On 09/25/2009 11:43 PM, Joerg Roedel wrote:
>> On Wed, Sep 23, 2009 at 05:09:38PM +0300, Avi Kivity wrote:
>>
>>> We haven't sorted out what is the correct thing to do here.  I think we
>>> should go for a directed yield, but until we have it, you can use
>>> hrtimers to sleep for 100 microseconds and hope the holding vcpu will
>>> get scheduled.  Even if it doesn't, we're only wasting a few percent cpu
>>> time instead of spinning.
>>>  
>> How do you plan to find out to which vcpu thread the current thread
>> should yield?
>>
>
> We can't find exactly which vcpu, but we can:
>
> - rule out threads that are not vcpus for this guest
> - rule out threads that are already running
>
> A major problem with sleep() is that it effectively reduces the vm  
> priority relative to guests that don't have spinlock contention.  By  
> selecting a random nonrunnable vcpu belonging to this guest, we at least  
> preserve the guest's timeslice.

Ok, that makes sense. But before trying that we should probably try to
call just yield() instead of schedule()? I remember someone from our
team here at AMD did this for Xen a while ago and already had pretty
good results with that. Xen has a completely different scheduler, but maybe
it's worth trying?

Joerg



Re: sync guest calls made async on host - SQLite performance

2009-09-27 Thread Matthew Tippett

I have created a launchpad bug against qemu-kvm in Ubuntu.

https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/437473

Just re-iterating, my concern isn't so much performance as the integrity 
of stock KVM configurations with server or other workloads that expect 
synchronous file I/O requests to be honored and synchronous to the underlying 
physical disk.


(That and ensuring that sanity reigns where a benchmark doesn't show a 
guest operating 10 times faster than a host for the same test :).


Regards,

Matthew
 Original Message  
Subject: Re: sync guest calls made async on host - SQLite performance
From: Avi Kivity 
To: RW 
Cc: kvm@vger.kernel.org
Date: 09/27/2009 07:37 AM


On 09/25/2009 10:00 AM, RW wrote:

I think ext3 with "data=writeback" in a KVM and KVM started
with "if=virtio,cache=none" is a little bit crazy. I don't know
if this is the case with the current Ubuntu Alpha, but it looks
that way.
   


It's not crazy; qemu bypasses the host page cache with cache=none, so the ext3 
data= setting is immaterial.






Re: sync guest calls made async on host - SQLite performance

2009-09-27 Thread Avi Kivity

On 09/25/2009 10:00 AM, RW wrote:

I think ext3 with "data=writeback" in a KVM and KVM started
with "if=virtio,cache=none" is a little bit crazy. I don't know
if this is the case with the current Ubuntu Alpha, but it looks
that way.
   


It's not crazy; qemu bypasses the host page cache with cache=none, so the ext3 
data= setting is immaterial.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



[PATCH] Change log message of VM login

2009-09-27 Thread Yolkfull Chow
We may use the function 'wait_for_login' several times in a test case, but
only on the first call does "Waiting for guest to be up" really apply.

Signed-off-by: Yolkfull Chow 
---
 client/tests/kvm/kvm_test_utils.py |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/kvm_test_utils.py 
b/client/tests/kvm/kvm_test_utils.py
index 601b350..aa3f2ee 100644
--- a/client/tests/kvm/kvm_test_utils.py
+++ b/client/tests/kvm/kvm_test_utils.py
@@ -52,7 +52,7 @@ def wait_for_login(vm, nic_index=0, timeout=240):
 @param timeout: Time to wait before giving up.
 @return: A shell session object.
 """
-logging.info("Waiting for guest '%s' to be up..." % vm.name)
+logging.info("Try to login to guest '%s'..." % vm.name)
 session = kvm_utils.wait_for(lambda: vm.remote_login(nic_index=nic_index),
  timeout, 0, 2)
 if not session:
-- 
1.6.2.5



Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

2009-09-27 Thread Avi Kivity

On 09/26/2009 12:32 AM, Gregory Haskins wrote:


I realize in retrospect that my choice of words above implies vbus _is_
complete, but this is not what I was saying.  What I was trying to
convey is that vbus is _more_ complete.  Yes, in either case some kind
of glue needs to be written.  The difference is that vbus implements
more of the glue generally, and leaves less required to be customized
for each iteration.

   


No argument there.  Since you care about non-virt scenarios and virtio
doesn't, naturally vbus is a better fit for them as the code stands.
 

Thanks for finally starting to acknowledge there's a benefit, at least.
   


I think I've mentioned vbus' finer grained layers as helpful here, 
though I doubt the value of this.  Hypervisors are added rarely, while 
devices and drivers are added (and modified) much more often.  I don't 
buy the anything-to-anything promise.



To be more precise, IMO virtio is designed to be a performance oriented
ring-based driver interface that supports all types of hypervisors (e.g.
shmem based kvm, and non-shmem based Xen).  vbus is designed to be a
high-performance generic shared-memory interconnect (for rings or
otherwise) framework for environments where linux is the underpinning
"host" (physical or virtual).  They are distinctly different, but
complementary (the former addresses the part of the front-end, and
latter addresses the back-end, and a different part of the front-end).
   


They're not truly complementary since they're incompatible.  A 2.6.27 
guest, or Windows guest with the existing virtio drivers, won't work 
over vbus.  Further, non-shmem virtio can't work over vbus.  Since 
virtio is guest-oriented and host-agnostic, it can't ignore 
non-shared-memory hosts (even though it's unlikely virtio will be 
adopted there).



In addition, the kvm-connector used in AlacrityVM's design strives to
add value and improve performance via other mechanisms, such as dynamic
  allocation, interrupt coalescing (thus reducing exit-ratio, which is a
serious issue in KVM)


Do you have measurements of inter-interrupt coalescing rates (excluding 
intra-interrupt coalescing)?



and prioritizable/nestable signals.
   


That doesn't belong in a bus.


Today there is a large performance disparity between what a KVM guest
sees and what a native linux application sees on that same host.  Just
take a look at some of my graphs between "virtio", and "native", for
example:

http://developer.novell.com/wiki/images/b/b7/31-rc4_throughput.png
   


That's a red herring.  The problem is not with virtio as an ABI, but 
with its implementation in userspace.  vhost-net should offer equivalent 
performance to vbus.



A dominant vbus design principle is to try to achieve the same IO
performance for all "linux applications" whether they be literally
userspace applications, or things like KVM vcpus or Ira's physical
boards.  It also aims to solve problems not previously expressible with
current technologies (even virtio), like nested real-time.

And even though you repeatedly insist otherwise, the neat thing here is
that the two technologies mesh (at least under certain circumstances,
like when virtio is deployed on a shared-memory friendly linux backend
like KVM).  I hope that my stack diagram below depicts that clearly.
   


Right, when you ignore the points where they don't fit, it's a perfect mesh.


But that's not a strong argument for vbus; instead of adding vbus you
could make virtio more friendly to non-virt
 

Actually, it _is_ a strong argument then because adding vbus is what
helps makes virtio friendly to non-virt, at least for when performance
matters.
   


As vhost-net shows, you can do that without vbus and without breaking 
compatibility.





Right.  virtio assumes that it's in a virt scenario and that the guest
architecture already has enumeration and hotplug mechanisms which it
would prefer to use.  That happens to be the case for kvm/x86.
 

No, virtio doesn't assume that.  Its stack provides the "virtio-bus"
abstraction and what it does assume is that it will be wired up to
something underneath. Kvm/x86 conveniently has pci, so the virtio-pci
adapter was created to reuse much of that facility.  For other things
like lguest and s390, something new had to be created underneath to make
up for the lack of pci-like support.
   


Right, I was wrong there.  But it does allow you to have a 1:1 mapping 
between native devices and virtio devices.




So to answer your question, the difference is that the part that has to
be customized in vbus should be a fraction of what needs to be
customized with vhost because it defines more of the stack.
   

But if you want to use the native mechanisms, vbus doesn't have any
added value.
 

First of all, that's incorrect.  If you want to use the "native"
mechanisms (via the way the vbus-connector is implemented, for instance)
you at least still have the benefit that the backend design is more
broadly re-u

[PATCH] Add a kvm test guest_s4 which supports both Linux and Windows platform

2009-09-27 Thread Yolkfull Chow
For this case, Ken Cao wrote the Linux part previously, and I did extensive
modifications to add Windows platform support.

Signed-off-by: Ken Cao 
Signed-off-by: Yolkfull Chow 
---
 client/tests/kvm/kvm_tests.cfg.sample |   14 +++
 client/tests/kvm/tests/guest_s4.py|   66 +
 2 files changed, 80 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/kvm/tests/guest_s4.py

diff --git a/client/tests/kvm/kvm_tests.cfg.sample 
b/client/tests/kvm/kvm_tests.cfg.sample
index 285a38f..f9ecb61 100644
--- a/client/tests/kvm/kvm_tests.cfg.sample
+++ b/client/tests/kvm/kvm_tests.cfg.sample
@@ -94,6 +94,14 @@ variants:
 - linux_s3: install setup
 type = linux_s3
 
+- guest_s4:
+type = guest_s4
+check_s4_support_cmd = grep -q disk /sys/power/state
+test_s4_cmd = "cd /tmp/;nohup tcpdump -q -t ip host localhost"
+check_s4_cmd = pgrep tcpdump
+set_s4_cmd = echo disk > /sys/power/state
+kill_test_s4_cmd = pkill tcpdump
+
 - timedrift:install setup
 type = timedrift
 extra_params += " -rtc-td-hack"
@@ -382,6 +390,12 @@ variants:
 # Alternative host load:
 #host_load_command = "dd if=/dev/urandom of=/dev/null"
 host_load_instances = 8
+guest_s4:
+check_s4_support_cmd = powercfg /hibernate on
+test_s4_cmd = start /B ping -n 3000 localhost
+check_s4_cmd = tasklist | find /I "ping"
+set_s4_cmd = rundll32.exe PowrProf.dll, SetSuspendState
+kill_test_s4_cmd = taskkill /IM ping.exe /F
 
 variants:
 - Win2000:
diff --git a/client/tests/kvm/tests/guest_s4.py 
b/client/tests/kvm/tests/guest_s4.py
new file mode 100644
index 000..5d8fbdf
--- /dev/null
+++ b/client/tests/kvm/tests/guest_s4.py
@@ -0,0 +1,66 @@
+import logging, time
+from autotest_lib.client.common_lib import error
+import kvm_test_utils, kvm_utils
+
+
+def run_guest_s4(test, params, env):
+"""
+Suspend guest to disk,supports both Linux & Windows OSes.
+
+@param test: kvm test object.
+@param params: Dictionary with test parameters.
+@param env: Dictionary with the test environment.
+"""
+vm = kvm_test_utils.get_living_vm(env, params.get("main_vm"))
+session = kvm_test_utils.wait_for_login(vm)
+
+logging.info("Checking whether VM supports S4")
+status = session.get_command_status(params.get("check_s4_support_cmd"))
+if status is None:
+logging.error("Failed to check if S4 exists")
+elif status != 0:
+raise error.TestFail("Guest does not support S4")
+
+logging.info("Waiting for a while for X to start...")
+time.sleep(10)
+
+# Start up a program(tcpdump for linux OS & ping for M$ OS), as a flag.
+# If the program died after suspend, then fails this testcase.
+test_s4_cmd = params.get("test_s4_cmd")
+session.sendline(test_s4_cmd)
+
+# Get the second session to start S4
+session2 = kvm_test_utils.wait_for_login(vm)
+
+check_s4_cmd = params.get("check_s4_cmd")
+if session2.get_command_status(check_s4_cmd):
+raise error.TestError("Failed to launch %s background" % test_s4_cmd)
+logging.info("Launched command background in guest: %s" % test_s4_cmd)
+
+# Implement S4
+logging.info("Start suspend to disk now...")
+session2.sendline(params.get("set_s4_cmd"))
+
+if not kvm_utils.wait_for(vm.is_dead, 360, 30, 2):
+raise error.TestFail("VM refuse to go down,suspend failed")
+logging.info("VM suspended successfully.")
+
+logging.info("VM suspended to disk. sleep 10 seconds to have a break...")
+time.sleep(10)
+
+# Start vm, and check whether the program is still running
+logging.info("Restart VM now...")
+
+if not vm.create():
+raise error.TestError("failed to start the vm again.")
+if not vm.is_alive():
+raise error.TestError("VM seems to be dead; Test requires a live VM.")
+
+# Check whether test command still alive
+if session2.get_command_status(check_s4_cmd):
+raise error.TestFail("%s died, indicating that S4 failed" % 
test_s4_cmd)
+
+logging.info("VM resumed after S4")
+session2.sendline(params.get("kill_test_s4_cmd"))
+session.close()
+session2.close()
-- 
1.6.2.5



Re: [PATCH: kvm 4/5] Fix hotremove of CPUs for KVM.

2009-09-27 Thread Avi Kivity

On 09/25/2009 03:47 AM, Zachary Amsden wrote:

In the process of bringing down CPUs, the SVM / VMX structures associated
with those CPUs are not freed.  This may cause leaks when unloading and
reloading the KVM module, as only the structures associated with online
CPUs are cleaned up.  So, clean up all possible CPUs, not just online ones.

Signed-off-by: Zachary Amsden
---
  arch/x86/kvm/svm.c |2 +-
  arch/x86/kvm/vmx.c |7 +--
  2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8f99d0c..13ca268 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -525,7 +525,7 @@ static __exit void svm_hardware_unsetup(void)
  {
int cpu;

-   for_each_online_cpu(cpu)
+   for_each_possible_cpu(cpu)
svm_cpu_uninit(cpu);

	__free_pages(pfn_to_page(iopm_base >> PAGE_SHIFT), IOPM_ALLOC_ORDER);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b8a8428..603bde3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1350,8 +1350,11 @@ static void free_kvm_area(void)
  {
int cpu;

-   for_each_online_cpu(cpu)
-   free_vmcs(per_cpu(vmxarea, cpu));
+   for_each_possible_cpu(cpu)
+   if (per_cpu(vmxarea, cpu)) {
+   free_vmcs(per_cpu(vmxarea, cpu));
+   per_cpu(vmxarea, cpu) = NULL;
+   }
  }

  static __init int alloc_kvm_area(void)
   


First, I'm not sure per_cpu works for possible but not actual cpus.  
Second, we now eagerly allocate but lazily free, leading to lots of ifs 
and buts.  I think the code can be cleaner by eagerly allocating and 
eagerly freeing.



--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH: kvm 3/5] Fix hotadd of CPUs for KVM.

2009-09-27 Thread Avi Kivity

On 09/25/2009 03:47 AM, Zachary Amsden wrote:

Both VMX and SVM require per-cpu memory allocation, which is done at module
init time, for only online cpus.  When bringing a new CPU online, we must
also allocate this structure.  The method chosen to implement this is to
make the CPU online notifier available via a call to the arch code.  This
allows memory allocation to be done smoothly, without any need to allocate
extra structures.

Note: CPU up notifiers may call the KVM callback before calling cpufreq callbacks.
This would cause the CPU frequency not to be detected (and it is not always
clear on non-constant TSC platforms what the bringup TSC rate will be, so the
guess of using tsc_khz could be wrong).  So, we clear the rate to zero in such
a case and add logic to query it upon entry.

Signed-off-by: Zachary Amsden
---
  arch/x86/include/asm/kvm_host.h |2 ++
  arch/x86/kvm/svm.c  |   15 +--
  arch/x86/kvm/vmx.c  |   17 +
  arch/x86/kvm/x86.c  |   13 +
  include/linux/kvm_host.h|6 ++
  virt/kvm/kvm_main.c |6 ++
  6 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 299cc1b..b7dd14b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -459,6 +459,7 @@ struct descriptor_table {
  struct kvm_x86_ops {
int (*cpu_has_kvm_support)(void);  /* __init */
int (*disabled_by_bios)(void); /* __init */
+   int (*cpu_hotadd)(int cpu);
int (*hardware_enable)(void *dummy);
void (*hardware_disable)(void *dummy);
void (*check_processor_compatibility)(void *rtn);
@@ -791,6 +792,7 @@ asmlinkage void kvm_handle_fault_on_reboot(void);
_ASM_PTR " 666b, 667b \n\t" \
".popsection"

+#define KVM_ARCH_WANT_HOTPLUG_NOTIFIER
  #define KVM_ARCH_WANT_MMU_NOTIFIER
  int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
  int kvm_age_hva(struct kvm *kvm, unsigned long hva);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9a4daca..8f99d0c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -330,13 +330,13 @@ static int svm_hardware_enable(void *garbage)
return -EBUSY;

if (!has_svm()) {
-   printk(KERN_ERR "svm_cpu_init: err EOPNOTSUPP on %d\n", me);
+   printk(KERN_ERR "svm_hardware_enable: err EOPNOTSUPP on %d\n", 
me);
return -EINVAL;
}
svm_data = per_cpu(svm_data, me);

if (!svm_data) {
-   printk(KERN_ERR "svm_cpu_init: svm_data is NULL on %d\n",
+   printk(KERN_ERR "svm_hardware_enable: svm_data is NULL on %d\n",
   me);
return -EINVAL;
}
@@ -394,6 +394,16 @@ err_1:

  }

+static __cpuinit int svm_cpu_hotadd(int cpu)
+{
+   struct svm_cpu_data *svm_data = per_cpu(svm_data, cpu);
+
+   if (svm_data)
+   return 0;
+
+   return svm_cpu_init(cpu);
+}
+
  static void set_msr_interception(u32 *msrpm, unsigned msr,
 int read, int write)
  {
@@ -2858,6 +2868,7 @@ static struct kvm_x86_ops svm_x86_ops = {
.hardware_setup = svm_hardware_setup,
.hardware_unsetup = svm_hardware_unsetup,
.check_processor_compatibility = svm_check_processor_compat,
+   .cpu_hotadd = svm_cpu_hotadd,
.hardware_enable = svm_hardware_enable,
.hardware_disable = svm_hardware_disable,
.cpu_has_accelerated_tpr = svm_cpu_has_accelerated_tpr,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3fe0d42..b8a8428 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1408,6 +1408,22 @@ static __exit void hardware_unsetup(void)
free_kvm_area();
  }

+static __cpuinit int vmx_cpu_hotadd(int cpu)
+{
+   struct vmcs *vmcs;
+
+   if (per_cpu(vmxarea, cpu))
+   return 0;
+
+   vmcs = alloc_vmcs_cpu(cpu);
+   if (!vmcs)
+   return -ENOMEM;
+
+   per_cpu(vmxarea, cpu) = vmcs;
+
+   return 0;
+}
+
  static void fix_pmode_dataseg(int seg, struct kvm_save_segment *save)
  {
	struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
@@ -3925,6 +3941,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
.hardware_setup = hardware_setup,
.hardware_unsetup = hardware_unsetup,
.check_processor_compatibility = vmx_check_processor_compat,
+   .cpu_hotadd = vmx_cpu_hotadd,
.hardware_enable = hardware_enable,
.hardware_disable = hardware_disable,
.cpu_has_accelerated_tpr = report_flexpriority,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c18e2fc..66c6bb9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1326,6 +1326,8 @@ out:
  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
  {
kvm_x86_ops->vcpu_load(vcpu, cpu);
+   if (unlikely(per_cpu(cpu_
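
The quoted hunk is cut off above.  As a rough sketch, the "query it
upon entry" logic the changelog describes could look like the
following, assuming the cpu_tsc_khz per-cpu variable and
cpufreq_quick_get() already used by x86.c:

void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
	kvm_x86_ops->vcpu_load(vcpu, cpu);
	if (unlikely(per_cpu(cpu_tsc_khz, cpu) == 0)) {
		/* Bringup raced with (or preceded) the cpufreq notifier:
		 * ask cpufreq directly, falling back to tsc_khz. */
		unsigned long khz = cpufreq_quick_get(cpu);

		if (!khz)
			khz = tsc_khz;
		per_cpu(cpu_tsc_khz, cpu) = khz;
	}
	/* ... rest of the function unchanged ... */
}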

Re: [PATCH: kvm 3/6] Fix hotadd of CPUs for KVM.

2009-09-27 Thread Avi Kivity

On 09/24/2009 11:32 PM, Zachary Amsden wrote:

On 09/24/2009 05:52 AM, Marcelo Tosatti wrote:



+static __cpuinit int vmx_cpu_hotadd(int cpu)
+{
+struct vmcs *vmcs;
+
+if (per_cpu(vmxarea, cpu))
+return 0;
+
+vmcs = alloc_vmcs_cpu(cpu);
+if (!vmcs)
+return -ENOMEM;
+
+per_cpu(vmxarea, cpu) = vmcs;
+
+return 0;
+}

Have to free in __cpuexit?

Is it too wasteful to allocate statically with 
DEFINE_PER_CPU_PAGE_ALIGNED?


Unfortunately, I think it is.  The VMX / SVM structures are quite 
large, and we can have a lot of potential CPUs.


I think percpu is only allocated when the cpu is online (it would still 
be wasteful if the modules were loaded but unused).


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH 1/1] KVM: fix lock imbalance

2009-09-27 Thread Avi Kivity

On 09/25/2009 10:33 AM, Jiri Slaby wrote:

Stanse found 2 lock imbalances in kvm_request_irq_source_id and
kvm_free_irq_source_id. They omit to unlock kvm->irq_lock on fail paths.

Fix that by adding unlock labels at the end of the functions and jump
there from the fail paths.
   


Applied, thanks.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH 1/9] x86: Pick up local arch trace headers

2009-09-27 Thread Avi Kivity

On 09/25/2009 07:18 PM, Jan Kiszka wrote:

This unbreaks 2.6.31 builds but also ensures that we always use the most
recent ones.

Signed-off-by: Jan Kiszka
---

  include/arch/x86/kvm |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
  create mode 120000 include/arch/x86/kvm

diff --git a/include/arch/x86/kvm b/include/arch/x86/kvm
new file mode 120000
index 000..c635817
--- /dev/null
+++ b/include/arch/x86/kvm
@@ -0,0 +1 @@
+../../../x86
\ No newline at end of file

   


Shouldn't it be asm-x86?

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] qemu-kvm: Fix segfault on -no-kvm startup

2009-09-27 Thread Avi Kivity

On 09/25/2009 10:03 PM, Jan Kiszka wrote:

Gleb Natapov wrote:
   

On Fri, Sep 25, 2009 at 06:05:49PM +0200, Jan Kiszka wrote:
 

The check for in-kernel irqchip must be protected by kvm_enabled, and we
have a different wrapper for it.

   

Why not move kvm_enabled() into kvm_irqchip_in_kernel()? It will return
false if !kvm_enabled().
 

Yes, possible. But I'm not sure if it's worth refactoring at this level.
   


In any case, fix bugs first, refactor later.


I think the whole irqchip interface has to go through some broader
refactoring when pushing it upstream. The result should either be a
specific in-kernel-irqchip apic device or generic wrapper services that
cover all cases, are easily compiled away in the absence of KVM, and avoid
#ifdefs like the one below.
   


s/when/before/
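
For illustration only, the wrapper Gleb suggests might look roughly
like this; kvm_enabled() and kvm_irqchip_in_kernel() are the existing
qemu-kvm helpers, but the field consulted on kvm_state is an
assumption, not the real layout:

/* Sketch: fold the kvm_enabled() test into the irqchip query so callers
 * don't need their own guard on -no-kvm runs.  The field name below is
 * an assumption, not necessarily the real kvm_state layout. */
static inline int kvm_irqchip_in_kernel(void)
{
    return kvm_enabled() && kvm_state->irqchip_in_kernel;
}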

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] qemu-kvm: Fix segfault on -no-kvm startup

2009-09-27 Thread Avi Kivity

On 09/25/2009 07:05 PM, Jan Kiszka wrote:

The check for in-kernel irqchip must be protected by kvm_enabled, and we
have a different wrapper for it.

   


Applied, thanks.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-27 Thread Avi Kivity

On 09/25/2009 11:43 PM, Joerg Roedel wrote:

On Wed, Sep 23, 2009 at 05:09:38PM +0300, Avi Kivity wrote:
   

We haven't sorted out what is the correct thing to do here.  I think we
should go for a directed yield, but until we have it, you can use
hrtimers to sleep for 100 microseconds and hope the holding vcpu will
get scheduled.  Even if it doesn't, we're only wasting a few percent cpu
time instead of spinning.
 

How do you plan to find out to which vcpu thread the current thread
should yield?
   


We can't find exactly which vcpu, but we can:

- rule out threads that are not vcpus for this guest
- rule out threads that are already running

A major problem with sleep() is that it effectively reduces the vm 
priority relative to guests that don't have spinlock contention.  By 
selecting a random nonrunnable vcpu belonging to this guest, we at least 
preserve the guest's timeslice.
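
A minimal sketch of that selection, assuming the kvm->vcpus[] array of
this era; vcpu_is_running() stands in for whatever runnability test the
scheduler would have to export:

#include <linux/kvm_host.h>
#include <linux/random.h>

/* Pick a random vcpu of the same guest that is not currently running,
 * as a candidate for a directed yield.  Sketch only. */
static struct kvm_vcpu *pick_yield_candidate(struct kvm *kvm,
					     struct kvm_vcpu *self)
{
	struct kvm_vcpu *candidate = NULL;
	int seen = 0;
	int i;

	for (i = 0; i < KVM_MAX_VCPUS; i++) {
		struct kvm_vcpu *vcpu = kvm->vcpus[i];

		if (!vcpu || vcpu == self)
			continue;		/* not a vcpu of this guest's interest */
		if (vcpu_is_running(vcpu))	/* placeholder test */
			continue;		/* already running, rule it out */
		/* reservoir sampling: keep each eligible vcpu with
		 * probability 1/seen, so the final pick is uniform */
		if (random32() % ++seen == 0)
			candidate = vcpu;
	}
	return candidate;
}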


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-27 Thread Avi Kivity

On 09/25/2009 04:11 AM, Zhai, Edwin wrote:

Avi,

hrtimer is used for the sleep in the attached patch, which has a similar perf 
gain to the previous one. Maybe we can check in this patch first, and 
turn to directed yield in the future, as you suggested.


+/*
+ * These 2 parameters are used to config the controls for Pause-Loop Exiting:
+ * ple_gap:upper bound on the amount of time between two successive
+ * executions of PAUSE in a loop. Also indicate if ple enabled.
+ * According to test, this time is usually small than 41 cycles.
+ * ple_window: upper bound on the amount of time a guest is allowed to execute
+ * in a PAUSE loop. Tests indicate that most spinlocks are held for
+ * less than 2^12 cycles
+ * Time is measured based on a counter that runs at the same rate as the TSC,
+ * refer SDM volume 3b section 21.6.13 & 22.1.3.
+ */
+#define KVM_VMX_DEFAULT_PLE_GAP    41
+#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
+static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
+module_param(ple_gap, int, S_IRUGO);
+
+static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
+module_param(ple_window, int, S_IRUGO);
   


Shouldn't be __read_mostly since they're read very rarely (__read_mostly 
should be for variables that are very often read, and rarely written).


I'm not even sure they should be parameters.


  /*
+ * Indicate a busy-waiting vcpu in spinlock. We do not enable the PAUSE
+ * exiting, so only get here on cpu with PAUSE-Loop-Exiting.
+ */
+static int handle_pause(struct kvm_vcpu *vcpu,
+   struct kvm_run *kvm_run)
+{
+   ktime_t expires;
+   skip_emulated_instruction(vcpu);
+
+   /* Sleep for 1 msec, and hope lock-holder got scheduled */
+   expires = ktime_add_ns(ktime_get(), 1000000UL);
   


I think this should be much lower, 50-100us.  Maybe this should be a 
parameter.  With 1ms we're losing significant cpu time if the congestion 
clears.



+   set_current_state(TASK_INTERRUPTIBLE);
+   schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
+
   


Please add a tracepoint for this (since it can cause significant change 
in behaviour), and move the logic to kvm_main.c.  It will be reused by 
the AMD implementation, possibly my software spinlock detector, 
paravirtualized spinlocks, and hopefully other architectures.
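
A sketch of what that shared helper in virt/kvm/kvm_main.c could look
like; the tracepoint name and the 100us figure are assumptions
following the comments above:

/* Sketch only: generic "vcpu is busy-waiting" handler, callable from the
 * VMX PLE exit, SVM pause filtering, or a software spinlock detector. */
void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu)
{
	ktime_t expires;

	trace_kvm_vcpu_on_spin(vcpu->vcpu_id);	/* assumed tracepoint */

	/* Sleep briefly and hope the lock holder gets to run. */
	expires = ktime_add_ns(ktime_get(), 100 * 1000UL);	/* 100 us */
	set_current_state(TASK_INTERRUPTIBLE);
	schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
}
EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);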



+   return 1;
+}
+
+/*
   


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

2009-09-27 Thread Michael S. Tsirkin
On Fri, Sep 25, 2009 at 10:01:58AM -0700, Ira W. Snyder wrote:
> > +   case VHOST_SET_VRING_KICK:
> > +   r = copy_from_user(&f, argp, sizeof f);
> > +   if (r < 0)
> > +   break;
> > +   eventfp = f.fd == -1 ? NULL : eventfd_fget(f.fd);
> > +   if (IS_ERR(eventfp))
> > +   return PTR_ERR(eventfp);
> > +   if (eventfp != vq->kick) {
> > +   pollstop = filep = vq->kick;
> > +   pollstart = vq->kick = eventfp;
> > +   } else
> > +   filep = eventfp;
> > +   break;
> > +   case VHOST_SET_VRING_CALL:
> > +   r = copy_from_user(&f, argp, sizeof f);
> > +   if (r < 0)
> > +   break;
> > +   eventfp = f.fd == -1 ? NULL : eventfd_fget(f.fd);
> > +   if (IS_ERR(eventfp))
> > +   return PTR_ERR(eventfp);
> > +   if (eventfp != vq->call) {
> > +   filep = vq->call;
> > +   ctx = vq->call_ctx;
> > +   vq->call = eventfp;
> > +   vq->call_ctx = eventfp ?
> > +   eventfd_ctx_fileget(eventfp) : NULL;
> > +   } else
> > +   filep = eventfp;
> > +   break;
> > +   case VHOST_SET_VRING_ERR:
> > +   r = copy_from_user(&f, argp, sizeof f);
> > +   if (r < 0)
> > +   break;
> > +   eventfp = f.fd == -1 ? NULL : eventfd_fget(f.fd);
> > +   if (IS_ERR(eventfp))
> > +   return PTR_ERR(eventfp);
> > +   if (eventfp != vq->error) {
> > +   filep = vq->error;
> > +   vq->error = eventfp;
> > +   ctx = vq->error_ctx;
> > +   vq->error_ctx = eventfp ?
> > +   eventfd_ctx_fileget(eventfp) : NULL;
> > +   } else
> > +   filep = eventfp;
> > +   break;
> 
> I'm not sure how these eventfd's save a trip to userspace.
> 
> AFAICT, eventfd's cannot be used to signal another part of the kernel,
> they can only be used to wake up userspace.

Yes, they can.  See irqfd code in virt/kvm/eventfd.c.

> In my system, when an IRQ for kick() comes in, I have an eventfd which
> gets signalled to notify userspace. When I want to send a call(), I have
> to use a special ioctl(), just like lguest does.
> 
> Doesn't this mean that for call(), vhost is just going to signal an
> eventfd to wake up userspace, which is then going to call ioctl(), and
> then we're back in kernelspace. Seems like a wasted userspace
> round-trip.
> 
> Or am I mis-reading this code?

Yes. The kernel can poll an eventfd and deliver an interrupt directly
without involving userspace.
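
To make that concrete, a sketch of both directions in kernel code,
loosely following the style of the irqfd and vhost poll helpers;
everything here is illustrative, with locking and error handling
trimmed:

#include <linux/eventfd.h>
#include <linux/fs.h>
#include <linux/kernel.h>
#include <linux/poll.h>
#include <linux/wait.h>

/* "call" direction: raise the guest interrupt from the kernel by
 * signalling the eventfd userspace handed us (cf. irqfd). */
static void example_signal_call(struct eventfd_ctx *call_ctx)
{
	eventfd_signal(call_ctx, 1);
}

/* "kick" direction: hook the eventfd's wait queue so a signal on the
 * kick fd runs a kernel callback directly -- no syscall round trip. */
struct example_poll {
	poll_table table;
	wait_queue_head_t *wqh;
	wait_queue_t wait;
};

static int example_wakeup(wait_queue_t *wait, unsigned mode, int sync,
			  void *key)
{
	/* runs in the signaller's context: process the virtqueue,
	 * queue work, inject an interrupt, ... */
	return 0;
}

static void example_ptable_queue(struct file *file, wait_queue_head_t *wqh,
				 poll_table *pt)
{
	struct example_poll *poll = container_of(pt, struct example_poll, table);

	poll->wqh = wqh;
	add_wait_queue(wqh, &poll->wait);
}

static void example_attach_kick(struct example_poll *poll, struct file *kick)
{
	init_waitqueue_func_entry(&poll->wait, example_wakeup);
	init_poll_funcptr(&poll->table, example_ptable_queue);
	kick->f_op->poll(kick, &poll->table);
}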

> PS - you can see my current code at:
> http://www.mmarray.org/~iws/virtio-phys/
> 
> Thanks,
> Ira
> 
> > +   default:
> > +   r = -ENOIOCTLCMD;
> > +   }
> > +
> > +   if (pollstop && vq->handle_kick)
> > +   vhost_poll_stop(&vq->poll);
> > +
> > +   if (ctx)
> > +   eventfd_ctx_put(ctx);
> > +   if (filep)
> > +   fput(filep);
> > +
> > +   if (pollstart && vq->handle_kick)
> > +   vhost_poll_start(&vq->poll, vq->kick);
> > +
> > +   mutex_unlock(&vq->mutex);
> > +
> > +   if (pollstop && vq->handle_kick)
> > +   vhost_poll_flush(&vq->poll);
> > +   return 0;
> > +}
> > +
> > +long vhost_dev_ioctl(struct vhost_dev *d, unsigned int ioctl, unsigned 
> > long arg)
> > +{
> > +   void __user *argp = (void __user *)arg;
> > +   long r;
> > +
> > +   mutex_lock(&d->mutex);
> > +   /* If you are not the owner, you can become one */
> > +   if (ioctl == VHOST_SET_OWNER) {
> > +   r = vhost_dev_set_owner(d);
> > +   goto done;
> > +   }
> > +
> > +   /* You must be the owner to do anything else */
> > +   r = vhost_dev_check_owner(d);
> > +   if (r)
> > +   goto done;
> > +
> > +   switch (ioctl) {
> > +   case VHOST_SET_MEM_TABLE:
> > +   r = vhost_set_memory(d, argp);
> > +   break;
> > +   default:
> > +   r = vhost_set_vring(d, ioctl, argp);
> > +   break;
> > +   }
> > +done:
> > +   mutex_unlock(&d->mutex);
> > +   return r;
> > +}
> > +
> > +static const struct vhost_memory_region *find_region(struct vhost_memory 
> > *mem,
> > +__u64 addr, __u32 len)
> > +{
> > +   struct vhost_memory_region *reg;
> > +   int i;
> > +   /* linear search is not brilliant, but we really have on the order of 6
> > +* regions in practice */
> > +   for (i = 0; i < mem->nregions; ++i) {
> > +   reg = mem->regions + i;
> > +   if (reg->guest_phys_addr <= addr &&
> > +   reg->guest_phys_addr + reg->memory_size - 1 >= addr)
> > +   return reg;
> > +   }
> > +   return NULL;
> > +}
> > +
> > +int translate_desc(struct vhost_dev *dev, u64 addr, u32 len,
> > +  struct iovec iov[], int iov_size)
> > +{
> > +   const s

[ANNOUNCE] qemu-kvm-0.11.0 released

2009-09-27 Thread Avi Kivity
qemu-kvm-0.11.0 is now available.  This release is based on the 
upstream qemu 0.11.0, plus kvm-specific enhancements.


Changes from the qemu-kvm-0.10 series:
- merge qemu 0.11.0
  - qdev device model
  - qemu-io
  - i386: multiboot support for -kernel
  - gdbstub: vCont support
  - i386: control over boot menu
  - i386: pc-0.10 compatibility machine type
  - qcow2: use cache=writethrough by default
  - i386: MCE emulation
  - i386: host cpuid support
  - slirp: host network config
  - virtio: MSI-x support
  - pci: allow devices to specify bus address
  - migration: allow down time based threshold
  - virtio-net: filtering support
  - http block device support
  - i386: expose numa topology to guests
  - native preadv/pwritev support
  - kvm: guest debugging support
  - vnc: support for acls and gssapi
  - monitor: allow multiple monitors
- device assignment: MSI-X support (Sheng Yang)
- device assignment: SR/IOV support (Sheng Yang)
- irqfd support (Gregory Haskins)
- drop libkvm, use some of the upstream kvm support (Glauber Costa)
- device assignment: option ROM support (Alex Williamson)
- x2apic support (Gleb Natapov)
- kvm/msi integration (Michael S. Tsirkin)
- hpet/kvm integration (Beth Kon)
- mce/kvm integration (Huang Ying)

http://www.linux-kvm.org

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
