Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On Sun, 2010-03-21 at 22:20 +0100, Ingo Molnar wrote:
> * Avi Kivity wrote:
>>> Well, for what it's worth, I rarely ever use anything else. My virtual
>>> disks are raw so I can loop mount them easily, and I can also switch my
>>> guest kernels from outside... without ever needing to mount those disks.
>>
>> Curious, what do you use them for?
>>
>> btw, if you build your kernel outside the guest, then you already have
>> access to all its symbols, without needing anything further.
>
> There are two errors with your argument:
>
> 1) you are assuming that it's only about kernel symbols
>
> Look at this 'perf report' output:
>
>  # Samples: 7127509216
>  #
>  # Overhead  Command      Shared Object  Symbol
>  # ........  .......  .................  ......
>  #
>      19.14%      git  git                [.] lookup_object
>      15.16%     perf  git                [.] lookup_object
>       4.74%     perf  libz.so.1.2.3      [.] inflate
>       4.52%      git  libz.so.1.2.3      [.] inflate
>       4.21%     perf  libz.so.1.2.3      [.] inflate_table
>       3.94%      git  libz.so.1.2.3      [.] inflate_table
>       3.29%      git  git                [.] find_pack_entry_one
>       3.24%      git  libz.so.1.2.3      [.] inflate_fast
>       2.96%     perf  libz.so.1.2.3      [.] inflate_fast
>       2.96%      git  git                [.] decode_tree_entry
>       2.80%     perf  libc-2.11.90.so    [.] __strlen_sse42
>       2.56%      git  libc-2.11.90.so    [.] __strlen_sse42
>       1.98%     perf  libc-2.11.90.so    [.] __GI_memcpy
>       1.71%     perf  git                [.] decode_tree_entry
>       1.53%      git  libc-2.11.90.so    [.] __GI_memcpy
>       1.48%      git  git                [.] lookup_blob
>       1.30%      git  git                [.] process_tree
>       1.30%     perf  git                [.] process_tree
>       0.90%     perf  git                [.] tree_entry
>       0.82%     perf  git                [.] lookup_blob
>       0.78%      git  [kernel.kallsyms]  [k] kstat_irqs_cpu
>
> kernel symbols are only a small portion of the symbols. (a single line in
> this case)

The above example shows that perf can summarize both kernel and application
hot functions. If we collect guest OS statistics from the host side, we
can't summarize detailed guest application info, because we can't get the
guest's application process IDs from the host side. So we could only get
detailed kernel info, plus the total utilization percentage of guest
application processes.

> To get to those other symbols we have to read the ELF symbols of those
> binaries in the guest filesystem, in the post-processing/reporting phase.
> This is both complex to do and relatively slow, so we don't want to (and
> cannot) do this at sample time from IRQ context or NMI context...
>
> Also, many aspects of reporting are interactive so it's done lazily or
> on-demand. So we need ready access to the guest filesystem - for those
> guests which decide to integrate with the host for this.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
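[Editorial aside: to make the post-processing step described above concrete, resolving a sample once the per-binary ELF symbol tables have been read is just an ordered lookup, which is why it can be done lazily at report time rather than in NMI context. A minimal sketch; the symbol names and addresses are invented for illustration and this is not perf's actual code:]

```python
import bisect

# Hypothetical symbol table, as a reporting tool might read it from an ELF
# binary's .symtab section: (start_address, name) pairs sorted by address.
SYMTAB = [
    (0x1000, "lookup_object"),
    (0x1800, "find_pack_entry_one"),
    (0x2200, "decode_tree_entry"),
]
ADDRS = [addr for addr, _ in SYMTAB]

def resolve(sample_addr):
    """Map a sampled instruction pointer to the enclosing function name."""
    # Find the last symbol whose start address is <= the sample address.
    i = bisect.bisect_right(ADDRS, sample_addr) - 1
    return SYMTAB[i][1] if i >= 0 else "[unknown]"
```

The binary search makes each lookup cheap, but building SYMTAB requires opening and parsing the guest's binaries — exactly the part that needs filesystem access.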
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 11:52 PM, Ingo Molnar wrote:
> * Avi Kivity wrote:
>>> I.e. you are arguing for microkernel Linux, while you see me as arguing
>>> for a monolithic kernel.
>>
>> No. I'm arguing for reducing bloat wherever possible. Kernel code is
>> more expensive than userspace code in every metric possible.
>
> 1) One of the primary design arguments of the micro-kernel design as well
> was to push as much into user-space as possible without impacting
> performance too much, so you very much seem to be arguing for a
> micro-kernel design for the kernel. I think history has given us the
> answer for that fight between microkernels and monolithic kernels.

I am not arguing for a microkernel. Again: reduce bloat where possible;
kernel code is more expensive than userspace code.

> Furthermore, to not engage in hypotheticals about microkernels: by your
> argument the Oprofile design was perfect (it was minimalistic
> kernel-space, with all the complexity in user-space), while perf was
> over-complex (it does many things in the kernel that could have been done
> in user-space). Practical results suggest the exact opposite happened -
> Oprofile is being replaced by perf. How do you explain that?

I did not say that the amount of kernel and userspace code is the only
factor deciding the quality of software. If that were so, microkernels
would have won out long ago. It may be that perf has too much kernel code,
and won against oprofile despite that because it was better in other areas.
Or it may be that perf has exactly the right user/kernel division. Or
maybe perf needs some of the code moved from userspace to the kernel. I
don't know; I haven't examined the code.

The user/kernel boundary is only one metric for code quality. Nor is it
always in favour of pushing things to userspace: narrowing or simplifying
an interface is often an argument in favour of pushing things into the
kernel.

IMO the reason perf is more usable than oprofile has less to do with the
kernel/userspace boundary and more to do with the effort and attention
spent on the userspace/user boundary.

> 2) In your analysis you again ignore the package boundary costs and
> artifacts, as if they didn't exist. That was my main argument, and that
> is what we saw with oprofile and perf: while maintaining more kernel code
> may be more expensive, it sure pays off for getting us a much better
> solution in the end.

Package costs are real. We need to bear them. I don't think that because
maintaining another package (and the interface between two packages) is
more difficult, the kernel size should increase.

> And getting a 'much better solution' to users is the goal of all this,
> isn't it? I don't mind what you call 'bloat' per se if it's for a purpose
> that users find like a good deal. I have quite a bit of RAM in most of my
> systems; having 50K more or less included in the kernel image is far less
> important than having a healthy and vibrant development model and having
> satisfied users ...

I'm not worried about the 50K or so; I'm worried about a bug in those 50K
taking down the guest.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:37 PM, Ingo Molnar wrote:
>> That includes the guest kernel. If you can deploy a new kernel in the
>> guest, presumably you can deploy a userspace package.
>
> Note that with perf we can instrument the guest with zero guest-kernel
> modifications as well. We try to reduce the guest impact to a bare
> minimum, as the difficulties in deployment are a function of the cross
> section surface to the guest.
>
> Also, note that the kernel is special with regard to instrumentation:
> since this is the kernel project, we are doing kernel-space changes, as
> we are doing them _anyway_. So adding symbol resolution capabilities
> would be a minimal addition to that - while adding a whole new guest
> package for the daemon would significantly increase the cross section
> surface.

It's true that for us, changing the kernel is easier than changing the
rest of the guest. IMO we should still resist the temptation to take the
easy path, and do the right thing (I understand we disagree about what the
right thing is).

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 11:20 PM, Ingo Molnar wrote:
> * Avi Kivity wrote:
>>> Well, for what it's worth, I rarely ever use anything else. My virtual
>>> disks are raw so I can loop mount them easily, and I can also switch my
>>> guest kernels from outside... without ever needing to mount those disks.
>>
>> Curious, what do you use them for?
>>
>> btw, if you build your kernel outside the guest, then you already have
>> access to all its symbols, without needing anything further.
>
> There are two errors with your argument:
>
> 1) you are assuming that it's only about kernel symbols
>
> Look at this 'perf report' output:
>
>  # Samples: 7127509216
>  #
>  # Overhead  Command      Shared Object  Symbol
>  #
>      19.14%      git  git                [.] lookup_object
>  [... same 'perf report' output as quoted earlier in the thread ...]
>       0.78%      git  [kernel.kallsyms]  [k] kstat_irqs_cpu
>
> kernel symbols are only a small portion of the symbols. (a single line in
> this case)
>
> To get to those other symbols we have to read the ELF symbols of those
> binaries in the guest filesystem, in the post-processing/reporting phase.
> This is both complex to do and relatively slow, so we don't want to (and
> cannot) do this at sample time from IRQ context or NMI context...

Okay. So a symbol server is necessary. Still, I don't think -kernel is a
good reason for including the symbol server in the kernel itself. If
someone uses it extensively together with perf, _and_ they can't put the
symbol server in the guest for some reason, let them patch mkinitrd to
include it.

> Also, many aspects of reporting are interactive so it's done lazily or
> on-demand. So we need ready access to the guest filesystem - for those
> guests which decide to integrate with the host for this.
>
> 2) the 'SystemTap mistake'
>
> You are assuming that the symbols of the kernel when it got built got
> saved properly and are discoverable easily. In reality those symbols can
> be erased by a make clean, can be modified by a new build, can be
> misplaced, and can generally be hard to find because each distro puts
> them in a different installation path. My 10+ years of experience with
> kernel instrumentation solutions is that kernel-driven, self-sufficient,
> robust, trustable, well-enumerated sources of information work far better
> in practice.

What about line number information? And the source? Into the kernel with
them as well?

> The thing is, in this thread I'm forced to repeat the same basic facts
> again and again. Could you _PLEASE_, pretty please, when it comes to
> instrumentation details, at least _read the mails_ of the guys who
> actually ... write and maintain Linux instrumentation code? This is
> getting ridiculous really.

I've read every one of your emails. If I misunderstood or overlooked
something, I apologize. The thread is very long and at times antagonistic,
so it's hard to keep all the details straight.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
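[Editorial aside: the "SystemTap mistake" described above — symbols erased by a make clean, modified by a rebuild, or misplaced by the distro — is the problem that content-addressed symbol caches solve: debug artifacts are filed under an identity derived from the binary's contents rather than its path. A rough sketch of the idea, not perf's actual implementation; the file-content hash stands in for reading the ELF .note.gnu.build-id section:]

```python
import hashlib
import os
import shutil

def build_id_of(path):
    # Stand-in for the ELF build-id note: hashing the contents gives the
    # same "identity survives rename/move, changes on rebuild" property.
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

def stash_symbols(binary_path, cache_dir):
    """File a copy of the binary under its content identity, so a later
    report still finds matching symbols even if the original was rebuilt,
    'make clean'-ed, or installed somewhere unexpected."""
    bid = build_id_of(binary_path)
    dest = os.path.join(cache_dir, bid[:2], bid[2:])
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.copyfile(binary_path, dest)
    return dest
```

The two-level directory layout (first two hex digits, then the rest) simply keeps any single directory from growing huge.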
Re: [PATCH 0/2] qemu-kvm: Introduce wrapper functions to access phys_ram_dirty, and replace existing direct accesses to it.
Marcelo Tosatti wrote:
> On Wed, Mar 17, 2010 at 02:51:46PM +0900, Yoshiaki Tamura wrote:
>> Before replacing the byte-based dirty bitmap with a bit-based dirty
>> bitmap, clearing out direct accesses to the bitmap first seems to be a
>> good point to start with.
>>
>> This patch set is based on the following discussion.
>> http://www.mail-archive.com/kvm@vger.kernel.org/msg30724.html
>>
>> Thanks,
>> Yoshi
>
> Looks fine to me. This is qemu upstream material, though.

Thanks for your comment. I should have removed qemu-kvm from the title.
Should I rebase the patch to qemu.git and repost?
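[Editorial aside: the point of routing all accesses through wrapper functions, as this patch set does, is that the bitmap's representation can later flip from one byte per page to one bit per page (an 8x memory saving) by changing only the helpers. The real qemu code is C; this is an illustrative Python model of the bit-based accessors, with the page size assumed to be the usual 4 KB:]

```python
TARGET_PAGE_BITS = 12  # 4 KB pages, as on x86

class DirtyBitmap:
    """One bit per guest page; callers never touch self.bits directly."""

    def __init__(self, ram_size):
        npages = ram_size >> TARGET_PAGE_BITS
        self.bits = bytearray((npages + 7) // 8)

    def set_dirty(self, addr):
        page = addr >> TARGET_PAGE_BITS
        self.bits[page >> 3] |= 1 << (page & 7)

    def get_dirty(self, addr):
        page = addr >> TARGET_PAGE_BITS
        return bool(self.bits[page >> 3] & (1 << (page & 7)))

    def clear_dirty(self, addr):
        page = addr >> TARGET_PAGE_BITS
        self.bits[page >> 3] &= ~(1 << (page & 7)) & 0xFF
```

With a byte-based map the same 1 MB of guest RAM would need 256 bytes of bookkeeping; the bit-based map needs 32.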
Re: Streaming Audio from Virtual Machine
On 03/21/2010 01:12 PM, Gus Zernial wrote:
> I'm using Kubuntu 9.10 32-bit on a quad-core Phenom II with Gigabit
> ethernet. I want to stream audio from MLB.com from a WinXP client thru a
> Linksys WMB54G wireless music bridge. Note that there are drivers for
> the WMB54G only for WinXP and Vista.
>
> If I stream the audio thru a native WinXP box thru the WMB54G, all is
> well and the audio sounds fine. When I try to stream thru a WinXP
> virtual machine on Kubuntu 9.10, the audio is poor quality and subject
> to gaps and dropping the stream altogether. So far I've tried KVM/QEMU
> and VirtualBox, same result.
>
> Regarding KVM/QEMU, I note AMD-V is activated in the BIOS, and I have a
> custom 2.6.32.7 kernel, and QEMU 0.11.0. The kvm and kvm_amd modules are
> compiled and loaded. I've been using bridged networking; I think it's
> set up correctly, but I confess I'm no networking expert. My start
> command for the WinXP virtual machine is:
>
> sudo /usr/bin/qemu -m 1024 -boot c \
>   -net nic,vlan=0,macaddr=00:d0:13:b0:2d:32,model=rtl8139 \
>   -net tap,vlan=0,ifname=tap0,script=/etc/qemu-ifup \
>   -localtime -soundhw ac97 -smp 4 -fda /dev/fd0 -vga std -usb \
>   /home/rbroman/windows.img
>
> I also tried model=virtio but that didn't help.
>
> I suspect this is a virtual machine networking problem but I'm not sure.
> So my questions are:
>
> - What's the best/fastest networking option and how do I set it up?
>   Pointers to step-by-step instructions appreciated.
>
> - Is it possible I have a problem other than networking? A configuration
>   problem with KVM/QEMU? Or could there be a problem with the WMB54G
>   driver when used thru a virtual machine?
>
> - Is there a better virtual machine solution than KVM/QEMU for what I'm
>   trying to do?

[dsa] I have been able to stream audio and video in a KVM-hosted WinXP VM,
and I have even watched a Netflix-based movie. My laptop has a Core 2 Duo
CPU, T9550, with 4 GB of RAM. Networking at home is through a wireless-N
router, and I use bridged networking and NAT for VMs.

Host activity definitely has an impact. When streaming I make sure I am
not doing any heavy activity in the host layer, and if I notice jitter the
first thing I do is up the priority of the VM threads using chrt.

David

> Recommendations appreciated - Gus
[KVM-AUTOTEST PATCH 3/5] KVM test: kvm_utils.load_env(): do not fail if env file is corrupted
- Include the unpickling code in the 'try' block, so that an exception
  raised during unpickling will not fail the test.
- Change the default env (returned by load_env() when the file is missing
  or corrupt) to {}.

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_utils.py |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index d386456..cc39b5d 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -22,7 +22,7 @@ def dump_env(obj, filename):
     file.close()
 
 
-def load_env(filename, default=None):
+def load_env(filename, default={}):
     """
     Load KVM test environment from an environment file.
 
@@ -30,11 +30,13 @@
     """
     try:
         file = open(filename, "r")
+        obj = cPickle.load(file)
+        file.close()
+        return obj
+    # Almost any exception can be raised during unpickling, so let's catch
+    # them all
     except:
         return default
-    obj = cPickle.load(file)
-    file.close()
-    return obj
 
 
 def get_sub_dict(dict, name):
-- 
1.5.4.1
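[Editorial aside: the behavioral change in the patch above can be seen in a few lines. This sketch has the same shape as the patched function but is modernized to Python 3's pickle module (the original uses Python 2's cPickle); any failure — missing file, truncated or corrupt pickle stream — falls back to the default instead of propagating and failing the test:]

```python
import pickle

def load_env(filename, default={}):
    # Mirrors the patched load_env(): the unpickling happens inside the
    # 'try' block, so a corrupt env file yields `default` rather than an
    # exception. Almost any exception can be raised during unpickling,
    # hence the broad catch.
    try:
        with open(filename, "rb") as f:
            return pickle.load(f)
    except Exception:
        return default
```

One caveat worth noting: a mutable default argument (`{}`) is created once and shared across calls, so callers that mutate the returned dict without reassigning it will see the mutation persist into later calls with a missing file.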
[KVM-AUTOTEST PATCH 5/5] KVM test: take frequent screendumps during all tests
Screendumps are taken regularly and converted to JPEG format. They are
stored in .../debug/screendumps_/. Requires python-imaging.

- Enabled by 'take_regular_screendumps = yes' (naming suggestions welcome).
- Delay between screendumps is controlled by 'screendump_delay' (default 5).
- Compression quality is controlled by 'screendump_quality' (default 30).
- It's probably a good idea to dump them to /dev/shm before converting
  them, in order to minimize disk use. This can be enabled by
  'screendump_temp_dir = /dev/shm' (commented out by default because I'm
  not sure /dev/shm is available on all machines.)
- Screendumps are removed unless 'keep_screendumps'['_on_error'] is 'yes'.
  The recommended setting when submitting jobs from autoserv is
  'keep_screendumps_on_error = yes', which means screendumps are kept only
  if the test fails. Keeping all screendumps may use up all of the
  server's storage space.

This patch sets reasonable defaults in tests_base.cfg.sample. (It also
makes sure post_command is executed last in the postprocessing procedure --
otherwise a post_command failure can prevent other postprocessing steps
(like removing the screendump dirs) from taking place.)

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_preprocessing.py  |   85 ++++++++++++++++++++++++---
 client/tests/kvm/tests_base.cfg.sample |   13 ++++-
 2 files changed, 89 insertions(+), 9 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index e3a5501..0e4ce87 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -1,4 +1,4 @@
-import sys, os, time, commands, re, logging, signal, glob
+import sys, os, time, commands, re, logging, signal, glob, threading, shutil
 from autotest_lib.client.bin import test, utils
 from autotest_lib.client.common_lib import error
 import kvm_vm, kvm_utils, kvm_subprocess, ppm_utils
@@ -11,6 +11,10 @@ except ImportError:
            'distro.')
 
 
+_screendump_thread = None
+_screendump_thread_termination_event = None
+
+
 def preprocess_image(test, params):
     """
     Preprocess a single QEMU image according to the instructions in params.
@@ -254,6 +258,14 @@ def preprocess(test, params, env):
     # Preprocess all VMs and images
     process(test, params, env, preprocess_image, preprocess_vm)
 
+    # Start the screendump thread
+    if params.get("take_regular_screendumps") == "yes":
+        global _screendump_thread, _screendump_thread_termination_event
+        _screendump_thread_termination_event = threading.Event()
+        _screendump_thread = threading.Thread(target=_take_screendumps,
+                                              args=(test, params, env))
+        _screendump_thread.start()
+
 
 def postprocess(test, params, env):
     """
@@ -263,8 +275,15 @@ def postprocess(test, params, env):
     @param params: Dict containing all VM and image parameters.
     @param env: The environment (a dict-like object).
     """
+    # Postprocess all VMs and images
     process(test, params, env, postprocess_image, postprocess_vm)
 
+    # Terminate the screendump thread
+    global _screendump_thread, _screendump_thread_termination_event
+    if _screendump_thread:
+        _screendump_thread_termination_event.set()
+        _screendump_thread.join(10)
+
     # Warn about corrupt PPM files
     for f in glob.glob(os.path.join(test.debugdir, "*.ppm")):
         if not ppm_utils.image_verify_ppm_file(f):
@@ -290,11 +309,13 @@ def postprocess(test, params, env):
         for f in glob.glob(os.path.join(test.debugdir, '*.ppm')):
             os.unlink(f)
 
-    # Execute any post_commands
-    if params.get("post_command"):
-        process_command(test, params, env, params.get("post_command"),
-                        int(params.get("post_command_timeout", "600")),
-                        params.get("post_command_noncritical") == "yes")
+    # Should we keep the screendump dirs?
+    if params.get("keep_screendumps") != "yes":
+        logging.debug("'keep_screendumps' not specified; removing "
+                      "screendump dirs...")
+        for d in glob.glob(os.path.join(test.debugdir, "screendumps_*")):
+            if os.path.isdir(d) and not os.path.islink(d):
+                shutil.rmtree(d, ignore_errors=True)
 
     # Kill all unresponsive VMs
     if params.get("kill_unresponsive_vms") == "yes":
@@ -318,6 +339,12 @@ def postprocess(test, params, env):
         env["tcpdump"].close()
         del env["tcpdump"]
 
+    # Execute any post_commands
+    if params.get("post_command"):
+        process_command(test, params, env, params.get("post_command"),
+                        int(params.get("post_command_timeout", "600")),
+                        params.get("post_command_noncritical") == "yes")
+
 
 def postprocess_on_error(test, params, env):
     """
@@ -343,3 +370,49 @@
 def _update_address_cache(address_cache, line):
     mac_addre
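[Editorial aside: the start/stop protocol in the patch above — a worker thread paired with a termination Event, set and joined with a timeout in postprocess() — reduces to a small reusable pattern. A simplified sketch with a generic action standing in for the actual screendump code:]

```python
import threading

def periodic(action, delay, stop_event):
    """Run action() every `delay` seconds until stop_event is set.

    Using Event.wait() as the sleep means the thread wakes up as soon as
    postprocess() requests termination, instead of sleeping out the full
    delay; wait() returns True when the event was set."""
    while True:
        action()
        if stop_event.wait(delay):
            break
```

Usage mirrors the patch: create the Event and Thread at preprocess time, then `event.set(); thread.join(10)` at postprocess time so a stuck worker cannot hang the test forever.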
[KVM-AUTOTEST PATCH 4/5] KVM test: make kvm_stat usage optional
Relying on the test tag is not cool. Use a dedicated parameter instead.
By default, all tests except build tests will use kvm_stat.

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_utils.py          |    8 ++++----
 client/tests/kvm/tests_base.cfg.sample |    3 +++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index cc39b5d..5834539 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -845,8 +845,8 @@ def run_tests(test_list, job):
     @return: True, if all tests ran passed, False if any of them failed.
     """
     status_dict = {}
-
     failed = False
+
     for dict in test_list:
         if dict.get("skip") == "yes":
             continue
@@ -863,12 +863,12 @@ def run_tests(test_list, job):
         test_tag = dict.get("shortname")
         # Setting up kvm_stat profiling during test execution.
         # We don't need kvm_stat profiling on the build tests.
-        if "build" in test_tag:
+        if dict.get("run_kvm_stat") == "yes":
+            profile = True
+        else:
             # None because it's the default value on the base_test class
             # and the value None is specifically checked there.
             profile = None
-        else:
-            profile = True
 
         if profile:
             job.profilers.add('kvm_stat')
diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
index 9963a44..b13aec4 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -40,6 +40,9 @@
 nic_mode = user
 nic_script = scripts/qemu-ifup
 address_index = 0
 
+# Misc
+run_kvm_stat = yes
+
 # Tests variants:
-- 
1.5.4.1
[KVM-AUTOTEST PATCH 2/5] KVM test: kvm.py: make sure all dump_env() calls are inside 'finally' blocks
Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm.py |   29 +++++++++++++++++++----------
 1 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/client/tests/kvm/kvm.py b/client/tests/kvm/kvm.py
index 9b8a10c..c6e146d 100644
--- a/client/tests/kvm/kvm.py
+++ b/client/tests/kvm/kvm.py
@@ -21,6 +21,7 @@ class kvm(test.test):
            (Online doc - Getting started with KVM testing)
     """
     version = 1
+
     def run_once(self, params):
         # Report the parameters we've received and write them as keyvals
         logging.debug("Test parameters:")
@@ -33,7 +34,7 @@ class kvm(test.test):
         # Open the environment file
         env_filename = os.path.join(self.bindir, params.get("env", "env"))
         env = kvm_utils.load_env(env_filename, {})
-        logging.debug("Contents of environment: %s" % str(env))
+        logging.debug("Contents of environment: %s", str(env))
 
         try:
             try:
@@ -50,22 +51,30 @@ class kvm(test.test):
                 f.close()
 
                 # Preprocess
-                kvm_preprocessing.preprocess(self, params, env)
-                kvm_utils.dump_env(env, env_filename)
+                try:
+                    kvm_preprocessing.preprocess(self, params, env)
+                finally:
+                    kvm_utils.dump_env(env, env_filename)
 
                 # Run the test function
                 run_func = getattr(test_module, "run_%s" % t_type)
-                run_func(self, params, env)
-                kvm_utils.dump_env(env, env_filename)
+                try:
+                    run_func(self, params, env)
+                finally:
+                    kvm_utils.dump_env(env, env_filename)
 
             except Exception, e:
                 logging.error("Test failed: %s", e)
                 logging.debug("Postprocessing on error...")
-                kvm_preprocessing.postprocess_on_error(self, params, env)
-                kvm_utils.dump_env(env, env_filename)
+                try:
+                    kvm_preprocessing.postprocess_on_error(self, params, env)
+                finally:
+                    kvm_utils.dump_env(env, env_filename)
                 raise
 
         finally:
             # Postprocess
-            kvm_preprocessing.postprocess(self, params, env)
-            logging.debug("Contents of environment: %s", str(env))
-            kvm_utils.dump_env(env, env_filename)
+            try:
+                kvm_preprocessing.postprocess(self, params, env)
+            finally:
+                kvm_utils.dump_env(env, env_filename)
+                logging.debug("Contents of environment: %s", str(env))
-- 
1.5.4.1
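[Editorial aside: the pattern this patch applies to each phase — preprocess, run, error postprocess, final postprocess — is worth stating on its own: whether or not a step raises, the (possibly modified) environment is persisted before the exception propagates, so the next run never sees stale state. A generic sketch:]

```python
def run_step(step, env, save):
    # Equivalent to the patched shape:
    #     try:
    #         kvm_preprocessing.preprocess(self, params, env)
    #     finally:
    #         kvm_utils.dump_env(env, env_filename)
    # The 'finally' guarantees save() runs on both the success and the
    # failure path; an exception from step() still propagates afterwards.
    try:
        step(env)
    finally:
        save(env)
```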
[KVM-AUTOTEST PATCH 1/5] KVM test: kvm_preprocessing.py: minor style corrections
Also, fetch the KVM version before setting up the VMs.

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_preprocessing.py |   58 ++++++++++++++--------------
 1 files changed, 27 insertions(+), 31 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index e91d1da..e3a5501 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -58,8 +58,8 @@ def preprocess_vm(test, params, env, name):
     for_migration = False
 
     if params.get("start_vm_for_migration") == "yes":
-        logging.debug("'start_vm_for_migration' specified; (re)starting VM with"
-                      " -incoming option...")
+        logging.debug("'start_vm_for_migration' specified; (re)starting VM "
+                      "with -incoming option...")
         start_vm = True
         for_migration = True
     elif params.get("restart_vm") == "yes":
@@ -187,12 +187,12 @@ def preprocess(test, params, env):
     @param env: The environment (a dict-like object).
     """
     # Start tcpdump if it isn't already running
-    if not env.has_key("address_cache"):
+    if "address_cache" not in env:
         env["address_cache"] = {}
-    if env.has_key("tcpdump") and not env["tcpdump"].is_alive():
+    if "tcpdump" in env and not env["tcpdump"].is_alive():
         env["tcpdump"].close()
         del env["tcpdump"]
-    if not env.has_key("tcpdump"):
+    if "tcpdump" not in env:
         command = "/usr/sbin/tcpdump -npvi any 'dst port 68'"
         logging.debug("Starting tcpdump (%s)...", command)
         env["tcpdump"] = kvm_subprocess.kvm_tail(
@@ -208,35 +208,23 @@ def preprocess(test, params, env):
     # Destroy and remove VMs that are no longer needed in the environment
     requested_vms = kvm_utils.get_sub_dict_names(params, "vms")
-    for key in env.keys():
+    for key in env:
         vm = env[key]
         if not kvm_utils.is_vm(vm):
             continue
         if not vm.name in requested_vms:
-            logging.debug("VM '%s' found in environment but not required for"
-                          " test; removing it..." % vm.name)
+            logging.debug("VM '%s' found in environment but not required "
+                          "for test; removing it..." % vm.name)
             vm.destroy()
             del env[key]
 
-    # Execute any pre_commands
-    if params.get("pre_command"):
-        process_command(test, params, env, params.get("pre_command"),
-                        int(params.get("pre_command_timeout", "600")),
-                        params.get("pre_command_noncritical") == "yes")
-
-    # Preprocess all VMs and images
-    process(test, params, env, preprocess_image, preprocess_vm)
-
     # Get the KVM kernel module version and write it as a keyval
     logging.debug("Fetching KVM module version...")
     if os.path.exists("/dev/kvm"):
-        kvm_version = os.uname()[2]
         try:
-            file = open("/sys/module/kvm/version", "r")
-            kvm_version = file.read().strip()
-            file.close()
+            kvm_version = open("/sys/module/kvm/version").read().strip()
         except:
-            pass
+            kvm_version = os.uname()[2]
     else:
         kvm_version = "Unknown"
         logging.debug("KVM module not loaded")
@@ -248,16 +236,24 @@ def preprocess(test, params, env):
     qemu_path = kvm_utils.get_path(test.bindir,
                                    params.get("qemu_binary", "qemu"))
     version_line = commands.getoutput("%s -help | head -n 1" % qemu_path)
-    exp = re.compile("[Vv]ersion .*?,")
-    match = exp.search(version_line)
-    if match:
-        kvm_userspace_version = " ".join(match.group().split()[1:]).strip(",")
+    matches = re.findall("[Vv]ersion .*?,", version_line)
+    if matches:
+        kvm_userspace_version = " ".join(matches[0].split()[1:]).strip(",")
     else:
        kvm_userspace_version = "Unknown"
        logging.debug("Could not fetch KVM userspace version")
     logging.debug("KVM userspace version: %s" % kvm_userspace_version)
     test.write_test_keyval({"kvm_userspace_version": kvm_userspace_version})
 
+    # Execute any pre_commands
+    if params.get("pre_command"):
+        process_command(test, params, env, params.get("pre_command"),
+                        int(params.get("pre_command_timeout", "600")),
+                        params.get("pre_command_noncritical") == "yes")
+
+    # Preprocess all VMs and images
+    process(test, params, env, preprocess_image, preprocess_vm)
+
 
 def postprocess(test, params, env):
     """
@@ -276,8 +272,8 @@ def postprocess(test, params, env):
     # Should we convert PPM files to PNG format?
     if params.get("convert_ppm_files_to_png") == "yes":
-        logging.debug("'convert_ppm_files_to_png' specified; converting PPM"
-                      " files to PNG format...")
+        logging.debug("'convert_ppm_files_to_png' specified; converting
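[Editorial aside: the userspace-version detection rewritten in the patch above boils down to a small regex routine. A self-contained sketch of the patched logic; the sample help line below is a made-up but representative qemu-kvm 0.12-era format, not output captured from a real binary:]

```python
import re

def parse_qemu_version(version_line):
    # Mirrors the patched code: grab the first "[Vv]ersion ...," chunk from
    # `qemu -help | head -n 1`, then drop the word "version" and the
    # trailing comma.
    matches = re.findall(r"[Vv]ersion .*?,", version_line)
    if not matches:
        return "Unknown"
    return " ".join(matches[0].split()[1:]).strip(",")
```

The non-greedy `.*?,` stops at the first comma, which is what keeps the copyright notice out of the extracted version string.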
Re: [Autotest] [PATCH] KVM-Test: Add kvm userspace unit test
OK, I approve of your suggestion. - "Lucas Meneghel Rodrigues" 写道: > I have an update about this test after talking to Naphtali Sprei: > > This patch does the unit testing using the old way of invoking it, > and > Avi superseded it with a new -kernel option. Naphtali is working in > making the new way of doing the test work, so I will wait until we > can > merge both ways of doing this test, OK? > > On Thu, Mar 18, 2010 at 12:16 AM, Lucas Meneghel Rodrigues > wrote: > > Hi Shuxi, sorry that it took so long before I could give you return > on this one. > > > > The general idea is just fine, but there is one gotcha that will > need > > more thought: This is dependent of having the KVM source code for > > testing (ie, it depends on the build test *and* the build mode has > to > > involve source code, such as git builds, things like koji install > will > > also not work). Since by default we are not making the tests > depending > > directly on build, so we have to figure out a way to have this > > integrated without breaking things for users who are not interested > to > > run the build test. > > > > Today I was reviewing the qemu-img functional test, so it occurred > to > > me that all those tests that do not depend on guests and different > > qemu command line options, we can make them all dependent on the > build > > test. This way we'd have the separation that we need, still not > > breaking anything for users that do not care about build and other > > types of test. > > > > Michael, what do you think? Should we put the config of tests like > > this one and qemu_img on build.cfg, making them depend on build? > > > > Oh Shuxi, on the code below I have some small comments to make: > > > > On Fri, Mar 5, 2010 at 3:22 AM, sshang wrote: > >> The test use kvm test harness kvmctl load binary test case file to > test various function of kvm kernel module. 
> >> > >> Signed-off-by: sshang > >> --- > >> client/tests/kvm/tests/unit_test.py | 29 > + > >> client/tests/kvm/tests_base.cfg.sample | 7 +++ > >> 2 files changed, 36 insertions(+), 0 deletions(-) > >> create mode 100644 client/tests/kvm/tests/unit_test.py > >> > >> diff --git a/client/tests/kvm/tests/unit_test.py > b/client/tests/kvm/tests/unit_test.py > >> new file mode 100644 > >> index 000..9bc7441 > >> --- /dev/null > >> +++ b/client/tests/kvm/tests/unit_test.py > >> @@ -0,0 +1,29 @@ > >> +import os > >> +from autotest_lib.client.bin import utils > >> +from autotest_lib.client.common_lib import error > >> + > >> +def run_unit_test(test, params, env): > >> + """ > >> + This is kvm userspace unit test, use kvm test harness kvmctl > load binary > >> + test case file to test various function of kvm kernel module. > >> + The output of all unit test can be found in the test result > dir. > >> + """ > >> + > >> + case_list = params.get("case_list","access apic emulator > hypercall irq"\ > >> + " port80 realmode sieve smptest tsc stringio > vmexit").split() > >> + srcdir = params.get("srcdir",test.srcdir) > >> + user_dir = os.path.join(srcdir,"kvm_userspace/kvm/user") > >> + os.chdir(user_dir) > >> + test_fail_list = [] > >> + > >> + for i in case_list: > >> + result_file = test.outputdir + "/" + i > >> + testfile = i + ".flat" > >> + results = utils.system("./kvmctl test/x86/bootstrap > test/x86/" + \ > >> + testfile + " > " + > result_file,ignore_status=True) > > > > About the above statement: In general you should not use shell > > redirection to write the output of your program to the log files. > > Please take advantage of the fact utils.run allow you to connect > > stdout and stderr pipes to the result file. Also, utils.run return > a > > CmdResult object, hat has a list of useful properties out of it. 
> >> +        if results != 0:
> >> +            test_fail_list.append(i)
> >> +
> >> +    if test_fail_list:
> >> +        raise error.TestFail("< " + " ".join(test_fail_list) + \
> >> +                " >")
>
> In the above, you could just have used
>
>     raise error.TestFail("KVM module unit test failed. Test cases
>     failed: %s" % test_fail_list)
>
> IMHO it's easier to understand.
>
> >> diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
> >> index 040d0c3..0918c26 100644
> >> --- a/client/tests/kvm/tests_base.cfg.sample
> >> +++ b/client/tests/kvm/tests_base.cfg.sample
> >> @@ -300,6 +300,13 @@ variants:
> >>         shutdown_method = shell
> >>         kill_vm = yes
> >>         kill_vm_gracefully = no
> >> +
> >> +    - unit_test:
> >> +        type = unit_test
> >> +        case_list = access apic emulator hypercall msr port80 realmode sieve smptest tsc stringio vmexit
> >> +        # srcdir should be same as build.cfg
> >> +        srcdir =
> >> +        vms = ''
> >>  # Do not define test variants below shutdown
>
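Lucas's two suggestions above — capture the program's output through the API rather than a shell `>` redirection, and raise TestFail with a formatted message — can be sketched in plain Python. This is a stand-alone illustration, not autotest code: subprocess.run and the local TestFail class stand in for autotest's utils.run and error.TestFail, and the kvmctl command is parameterized rather than assumed.

```python
import os
import subprocess

class TestFail(Exception):
    """Stand-in for autotest's error.TestFail."""

def run_cases(case_list, outputdir, kvmctl):
    """Run each unit test case, writing its output to a per-case
    result file via a real pipe instead of shell redirection."""
    failed = []
    for name in case_list:
        result_file = os.path.join(outputdir, name)
        with open(result_file, "w") as out:
            # stdout and stderr both go to the result file, which is
            # what the reviewer suggests utils.run should be doing.
            result = subprocess.run(kvmctl + [name + ".flat"],
                                    stdout=out, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            failed.append(name)
    if failed:
        # The formatted-message style the reviewer prefers:
        raise TestFail("KVM module unit test failed. "
                       "Test cases failed: %s" % failed)
```

The point of the CmdResult-style interface is that the exit status and captured output travel together as one object, instead of being scattered between a shell return code and a redirected file.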
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 05:00 PM, Ingo Molnar wrote: If that is the theory then it has failed to trickle through in practice. As you know i have reported a long list of usability problems with hardly a look. That list could be created by pretty much anyone spending a few minutes of getting a first impression with qemu-kvm. Can you transfer your list to the following wiki page: http://wiki.qemu.org/Features/Usability This thread is so large that I can't find your note that contained the initial list. I want to make sure this input doesn't die once this thread settles down. Regards, Anthony Liguori -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 04:54 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 10:55 PM, Ingo Molnar wrote: Of course you could say the following: ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not able to add this to the v2.6.35 kernel queue anymore as the ongoing usability work already takes up all of the project's maintainer and testing bandwidth. If you want the feature to be merged sooner than that then please help us cut down on the TODO and BUGS list that can be found at XYZ. There's quite a few low hanging fruits there. ' That would be shooting at my own foot as well as the contributor's since I badly want that RCU stuff, and while a GUI would be nice, that itch isn't on my back.

I think this sums up the root cause of all the problems I see with KVM pretty well.

A good maintainer has to strike a balance between asking more of people than what they initially volunteer and getting people to implement the less fun things that are nonetheless required. The kernel can take this to an extreme because at the end of the day, it's the only game in town and there is an unending number of potential volunteers. Most other projects are not quite as fortunate.

When someone submits a patch set to QEMU implementing a new network backend for raw sockets, we can push back on how it fits into the entire stack wrt security, usability, etc. Ultimately, we can arrive at a different, more user-friendly solution (networking helpers) and, along with some time investment on my part, create something much nicer. Still command line based, though.

Responding to such a patch set with "replace the SDL front end with a GTK one that lets you graphically configure networking" is not reasonable, and the result would be one less QEMU contributor in the long run.

Over time, we can, and are, pushing people to focus more on usability. But that doesn't get you a first-class GTK GUI overnight.
The only way you're going to get that is by having a contributor be specifically interested in building such a thing. We simply haven't had that in the past 5 years that I've been involved in the project. If someone stepped up to build this, I'd certainly support it in every way possible, and there are probably some steps we could take to further encourage this.

Regards,

Anthony Liguori
Re: Tracking KVM development
I've looked at libvirt a bit, and I fail to see the attraction. I think I will stay with plain qemu-kvm, unless there are some very compelling reasons for going down the libvirt route.

Virsh (which uses libvirt) is almost irreplaceable for us... How do you start and stop virtual machines easily, or get a list of the running ones? How do you ensure a virtual machine is never started twice? (That would obviously have disastrous results on the filesystem.) How do you connect on demand to the graphics of the VM from your laptop, with good security so that only the system administrator can do that? (virt-viewer provides very easy support for this, tunnelling VNC graphics over SSH; you connect by specifying the name of the host and the name of the VM... just great!)

If there is another way, I'm interested. In fact libvirt also brings problems for us, mainly because it takes a while to support the latest KVM features, and because installing libvirt from source and configuring it properly on the host for the first time is much more difficult than for the KVM sources.

Thank you
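On the "never started twice" point: the core of what a management layer provides there is an exclusive lock per VM, held for as long as the VM's process lives. A minimal sketch of that pattern (the lock-file path and function name are made up for illustration; libvirt's real lock manager is considerably more elaborate):

```python
import fcntl
import os

def acquire_vm_lock(name, lockdir="/tmp"):
    """Return an open, exclusively locked file for this VM name,
    or None if the lock is already held elsewhere. Keeping the
    file open for the VM's lifetime is what prevents a second
    start; closing it releases the lock automatically."""
    path = os.path.join(lockdir, "%s.lock" % name)
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except OSError:
        f.close()
        return None
```

A launcher would take the lock before exec'ing qemu-kvm and keep the file descriptor open until the guest exits; a second start attempt gets None and can refuse to run, which is what protects the disk image from a double start.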
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 05:00 PM, Ingo Molnar wrote:

If that is the theory then it has failed to trickle through in practice. As you know I have reported a long list of usability problems with hardly a look. That list could be created by pretty much anyone spending a few minutes getting a first impression with qemu-kvm.

I think the point you're missing is that your list was from the perspective of someone looking at a desktop virtualization solution that was graphically oriented. As Avi has repeatedly mentioned, so far, that has not been the target audience of QEMU. The target audience tends to be: 1) people looking to do server virtualization and 2) people looking to do command line based development. Usually, both (1) and (2) are working on machines that are remotely located.

What's important to these users is that VMs be easily launchable from the command line, that there is a lot of flexibility in defining machine types, and that there is a programmatic way to interact with a given instance of QEMU. Those are the things that we've been focusing on recently.

The reason we don't have better desktop virtualization support is simple. No one is volunteering to do it and no company is funding development for it. When you look at something like VirtualBox, what you're looking at is a long-ago-forked version of QEMU with a GUI added, focusing on desktop virtualization. There is no magic behind adding a better, more usable GUI to QEMU. It just takes resources.

I understand that you're trying to make the point that without catering to the desktop virtualization use case, we won't get as many developers as we could. Personally, I don't think that argument is accurate. If you look at VirtualBox, its performance is terrible. Having a nice GUI hasn't gotten them the type of developers that can improve their performance. No one is arguing that we wouldn't like to have a nicer UI. I'd love to merge any patch like that.
Regards,

Anthony Liguori
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 04:00 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 09:59 PM, Ingo Molnar wrote: Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. It's only clearly inferior if you ignore every consideration against it. It's definitely not a deployment non-starter, see the tons of daemons that come with any Linux system. [...] Avi, please don't put arguments into my mouth that I never made. My (clearly expressed) argument was that: _a new guest-side daemon is a transparent instrumentation non-starter_

FWIW, there's no reason you couldn't consume a vmchannel port from within the kernel. I don't think the code needs to be in the kernel, and from a security PoV, that suggests that it should be in userspace IMHO. But if you want to make a kernel thread, knock yourself out. I have no objection to that from a qemu perspective. I can't see why Avi would mind either.

I think it's papering over another problem (the kernel should control initrds IMHO) but that's a different topic.

Regards,

Anthony Liguori
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 02:17 PM, Ingo Molnar wrote:

If you want to improve this, you need to do the following: 1) Add a userspace daemon that uses vmchannel that runs in the guest and can fetch kallsyms and arbitrary modules. If that daemon lives in tools/perf, that's fine. Adding any new daemon to an existing guest is a deployment and usability nightmare. The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated. That was one of the fundamental usability mistakes of Oprofile. There is no 'perf' daemon - all the perf functionality is _built in_, and for very good reasons. It is one of the main reasons for perf's success as well.

The solution should be a long-lived piece of code that runs without kernel privileges. How the code is delivered to the user is a separate problem. If you want to argue that the kernel should build an initramfs that contains some things that always should be shipped with the kernel but don't need to be within the kernel, I think that's something that's long overdue.

We could make it a kernel thread, but what's the point? It's much safer for it to be a userspace thread, and it doesn't need to interact with the kernel in an intimate way.

Regards,

Anthony Liguori
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> > Consider the _other_ examples that are a lot more clear:
> >
> > ' If you expose paravirt spinlocks via KVM please also make sure the KVM
> >   tooling can make use of it, has an option for it to configure it, and
> >   that it has sufficient efficiency statistics displayed in the tool for
> >   admins to monitor.'
> >
> > ' If you create this new paravirt driver then please also make sure it can
> >   be configured in the tooling. '
> >
> > ' Please also add a testcase for this bug to tools/kvm/testcases/ so we don't
> >   repeat this same mistake in the future. '
>
> All three happen quite commonly in qemu/kvm development. Of course someone
> who develops a feature also develops a patch that exposes it in qemu. There
> are several test cases in qemu-kvm.git/kvm/user/test.

If that is the theory then it has failed to trickle through in practice. As you know I have reported a long list of usability problems with hardly a look. That list could be created by pretty much anyone spending a few minutes getting a first impression with qemu-kvm.

So something is seriously wrong in KVM land, to pretty much anyone trying it for the first time. I have explained how I see the root cause of that, while you seem to suggest that there's nothing wrong to begin with. I guess we'll have to agree to disagree on that.

Thanks,

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> On 03/21/2010 10:55 PM, Ingo Molnar wrote:
> >
> > Of course you could say the following:
> >
> >  ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not
> >    able to add this to the v2.6.35 kernel queue anymore as the ongoing
> >    usability work already takes up all of the project's maintainer and
> >    testing bandwidth. If you want the feature to be merged sooner than that
> >    then please help us cut down on the TODO and BUGS list that can be found
> >    at XYZ. There's quite a few low hanging fruits there. '
>
> That would be shooting at my own foot as well as the contributor's since I
> badly want that RCU stuff, and while a GUI would be nice, that itch isn't on
> my back.

I think this sums up the root cause of all the problems I see with KVM pretty well.

Thanks,

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> > I.e. you are arguing for microkernel Linux, while you see me as arguing
> > for a monolithic kernel.
>
> No. I'm arguing for reducing bloat wherever possible. Kernel code is more
> expensive than userspace code in every metric possible.

1) One of the primary design arguments of the microkernel design was likewise to push as much into user-space as possible without impacting performance too much, so you very much seem to be arguing for a microkernel design for the kernel. I think history has given us the answer for that fight between microkernels and monolithic kernels.

Furthermore, to not engage in hypotheticals about microkernels: by your argument the Oprofile design was perfect (it was minimalistic kernel-space, with all the complexity in user-space), while perf was over-complex (it does many things in the kernel that could have been done in user-space). Practical results suggest the exact opposite happened - Oprofile is being replaced by perf. How do you explain that?

2) In your analysis you again ignore the package boundary costs and artifacts as if they didn't exist. That was my main argument, and that is what we saw with oprofile and perf: while maintaining more kernel code may be more expensive, it sure pays off by getting us a much better solution in the end.

And getting a 'much better solution' to users is the goal of all this, isn't it? I don't mind what you call 'bloat' per se if it's for a purpose that users consider a good deal. I have quite a bit of RAM in most of my systems; having 50K more or less included in the kernel image is far less important than having a healthy and vibrant development model and having satisfied users ...

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 11:00 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 09:59 PM, Ingo Molnar wrote: Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. It's only clearly inferior if you ignore every consideration against it. It's definitely not a deployment non-starter, see the tons of daemons that come with any Linux system. [...] Avi, please don't put arguments into my mouth that I never made.

Sorry, that was not the intent. I meant that putting things into the kernel has disadvantages that must be considered.

My (clearly expressed) argument was that: _a new guest-side daemon is a transparent instrumentation non-starter_ What is so hard to understand about that simple concept? Instrumentation is good if it's as transparent as possible. Of course lots of other features can be done via a new user-space package ...

I believe you can deploy this daemon via a (default) package, without any hassle to users.

-- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:55 PM, Ingo Molnar wrote:

Of course you could say the following: ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not able to add this to the v2.6.35 kernel queue anymore as the ongoing usability work already takes up all of the project's maintainer and testing bandwidth. If you want the feature to be merged sooner than that then please help us cut down on the TODO and BUGS list that can be found at XYZ. There's quite a few low hanging fruits there. '

That would be shooting at my own foot as well as the contributor's since I badly want that RCU stuff, and while a GUI would be nice, that itch isn't on my back.

You're asking a developer and a maintainer to put off the work they're interested in, in order to work on something someone else is interested in, but not contributing the work.

Although this RCU example is the 'worst' possible example, as it's a pure speedup change with no functionality effect. Consider the _other_ examples that are a lot more clear: ' If you expose paravirt spinlocks via KVM please also make sure the KVM tooling can make use of it, has an option for it to configure it, and that it has sufficient efficiency statistics displayed in the tool for admins to monitor.' ' If you create this new paravirt driver then please also make sure it can be configured in the tooling. ' ' Please also add a testcase for this bug to tools/kvm/testcases/ so we don't repeat this same mistake in the future. '

All three happen quite commonly in qemu/kvm development. Of course someone who develops a feature also develops a patch that exposes it in qemu. There are several test cases in qemu-kvm.git/kvm/user/test.

I'd say most of the high-level feature work in KVM has tooling impact.

Usually, pretty low. Plumbing down a feature is usually trivial. There are exceptions, of course - smp is only supported in qemu-kvm.git, not in upstream qemu.git, for example.
In any case of course the work is done in both qemu and kvm - do you think people develop features to see them bitrot?

And note the important argument that the 'eject button' thing would not occur naturally in a project that is well designed and has a good quality balance. It would only occur in the transitional period if a big lump of lower-quality code is unified with higher-quality code. Then indeed a lot of pressure gets created on the people working on the high-quality portion to go over and fix the low-quality portion.

It's a matter of priorities.

Which, btw., is an unconditionally good thing ... But even an RCU speedup can be fairly linked/ordered to more pressing needs of a project.

Pressing to whom?

Really, the unification of two tightly related pieces of code has numerous clear advantages. Please give it some thought before rejecting it.

I'm not blind to the advantages. Dropping tcg would be the biggest of them by far (much more than moving the repository, IMO). But there are disadvantages as well. Around two years ago I seriously considered forking qemu; at this time I do not think it is a good idea.

-- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:31 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. [...] Only if you apply it as a totalitarian rule. Furthermore, the logical conclusion of _your_ line of argument (applied in a totalitarian manner) is that 'nothing should be built into the kernel'.

I'm certainly a minimalist, but that doesn't follow. Things that require privileged access, or access to the page cache, or that can't be made to perform otherwise should certainly be in the kernel. That's why I submitted kvm for inclusion in the first place. If it's something that can work just as well in userspace but we can't be bothered to fix any 'deployment nightmares', then it shouldn't be in the kernel. Examples include lvm2 and mdadm (which truly are 'deployment nightmares' - you need to start them before you have access to your filesystem - yet they work somehow).

I.e. you are arguing for microkernel Linux, while you see me as arguing for a monolithic kernel.

No. I'm arguing for reducing bloat wherever possible. Kernel code is more expensive than userspace code in every metric possible.

Reality is that we are somewhere in between; we are neither black nor white: it's shades of grey. If we want to do a good job with all this then we observe subsystems, we see how they relate to the physical world and decide how to shape them. We identify long-term changes and re-design modularization boundaries in hindsight - when we got them wrong initially. We don't try to rationalize the status quo.

I'm not for the status quo either - I'm for reducing the kernel code footprint wherever it doesn't impact performance or break clean interfaces.

Let's see one example of that thought process in action: Oprofile.
We saw that the modularization of oprofile was a total nightmare: a separate kernel-space and a separate user-space component, which were in constant version friction. The ABI between them was stifling: it was hard to change (you needed to trickle changes through the tool as well, which was on a different release schedule, etc., etc.). The result was sucky usability that never went beyond some basic 'you can do profiling' threshold. The subsystem worked well within that design box, and it was worked on by highly competent people - but it was still far, far away from the potential it could have achieved.

So we observed those problems and decided to do something about it:

- We unified the two parts into a single maintenance domain. There's the kernel side in kernel/perf_event.c and arch/*/*/perf_event.c, plus the user side in tools/perf/. The two are connected by a very flexible, forwards and backwards compatible ABI.

That's useful because perf is still small. If it were a full-fledged 350KLOC GUI, then most of the development would concentrate on the GUI and very little (relatively) would have to do with the kernel. Qemu is in that state today. Please, please look at the recent commits and check how many actually have anything to do with kvm, and how many with everything else.

- We moved much more code into the kernel, realizing that transparent and robust instrumentation should be offered instead of punting abstractions into user-space (which is in a disadvantaged position to implement system-wide abstractions).

No argument. I have a similar experience with kvm. The user/kernel break is at the cpu virtualization level - that is, kvm is solely responsible for emulating a cpu and userspace is responsible for emulating devices. An exception was made for the PIC/IOAPIC/PIT due to performance considerations - they are emulated in the kernel as well. A common FAQ is why we do not emulate real-mode instructions in qemu.
The answer is that the interface to kvm would be insane - it would emulate a partial cpu. All other users of that interface would have to implement an emulator (there is also a practical argument - the qemu emulator does not implement atomics correctly wrt other threads).

- We created a no-bullsh*t approach to usability. perf is by no means perfect, but it's written by developers for developers, and if you report a bug to us we'll act on it before anything else. Furthermore the kernel developers do the user-space coding as well, so there's no Chinese wall separating them. Kernel-space becomes aware of the intricacies of user-space, and user-space developers become aware of the difficulties of kernel-space as well. It's a good mix in our experience.

Excellent. However qemu is written by developers for their users, and their users are not worried about an eject button in the qemu SDL interface, or about running the qemu command line by hand. They have
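The "flexible, forwards and backwards compatible ABI" mentioned above rests largely on a size field at the front of the attribute structure: userspace records the sizeof() it was compiled against, and the kernel copies min(caller size, kernel size) bytes and zero-fills the rest, so old binaries keep working against new kernels and vice versa. Here is a sketch of that convention (the three-field layout is purely illustrative, not the real perf_event_attr):

```python
import struct

# The "kernel side" struct: size first, then the payload fields,
# including one added in a later version of the ABI.
KERNEL_FMT = "<IIQ"                        # u32 size, u32 type, u64 new_field
KERNEL_SIZE = struct.calcsize(KERNEL_FMT)  # 16 bytes

def read_attr(blob):
    """Accept an attr blob from a caller built against any version:
    copy min(caller size, kernel size) bytes, zero-fill the rest."""
    caller_size = struct.unpack_from("<I", blob)[0]
    n = min(caller_size, KERNEL_SIZE)
    padded = blob[:n] + b"\0" * (KERNEL_SIZE - n)
    _, type_, new_field = struct.unpack(KERNEL_FMT, padded)
    return {"type": type_, "new_field": new_field}
```

An "old" caller packs only the fields it knows, e.g. `struct.pack("<II", 8, 7)`, and the field it never heard of simply reads back as zero on the kernel side; a "new" caller passes the full 16 bytes.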
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> > Well, for what it's worth, I rarely ever use anything else. My virtual
> > disks are raw so I can loop mount them easily, and I can also switch my
> > guest kernels from outside... without ever needing to mount those disks.
>
> Curious, what do you use them for?
>
> btw, if you build your kernel outside the guest, then you already have
> access to all its symbols, without needing anything further.

There are two errors in your argument:

1) you are assuming that it's only about kernel symbols

Look at this 'perf report' output:

# Samples: 7127509216
#
# Overhead   Command  Shared Object      Symbol
# ........  ........  .................  ......
#
    19.14%       git  git                [.] lookup_object
    15.16%      perf  git                [.] lookup_object
     4.74%      perf  libz.so.1.2.3      [.] inflate
     4.52%       git  libz.so.1.2.3      [.] inflate
     4.21%      perf  libz.so.1.2.3      [.] inflate_table
     3.94%       git  libz.so.1.2.3      [.] inflate_table
     3.29%       git  git                [.] find_pack_entry_one
     3.24%       git  libz.so.1.2.3      [.] inflate_fast
     2.96%      perf  libz.so.1.2.3      [.] inflate_fast
     2.96%       git  git                [.] decode_tree_entry
     2.80%      perf  libc-2.11.90.so    [.] __strlen_sse42
     2.56%       git  libc-2.11.90.so    [.] __strlen_sse42
     1.98%      perf  libc-2.11.90.so    [.] __GI_memcpy
     1.71%      perf  git                [.] decode_tree_entry
     1.53%       git  libc-2.11.90.so    [.] __GI_memcpy
     1.48%       git  git                [.] lookup_blob
     1.30%       git  git                [.] process_tree
     1.30%      perf  git                [.] process_tree
     0.90%      perf  git                [.] tree_entry
     0.82%      perf  git                [.] lookup_blob
     0.78%       git  [kernel.kallsyms]  [k] kstat_irqs_cpu

kernel symbols are only a small portion of the symbols (a single line in this case).

To get to those other symbols we have to read the ELF symbols of those binaries in the guest filesystem, in the post-processing/reporting phase. This is both complex to do and relatively slow, so we don't want to (and cannot) do this at sample time from IRQ context or NMI context ...

Also, many aspects of reporting are interactive so it's done lazily or on-demand. So we need ready access to the guest filesystem - for those guests which decide to integrate with the host for this.
2) the 'SystemTap mistake'

You are assuming that the symbols of the kernel when it got built got saved properly and are discoverable easily. In reality those symbols can be erased by a make clean, can be modified by a new build, can be misplaced, and can generally be hard to find because each distro puts them in a different installation path.

My 10+ years of experience with kernel instrumentation solutions is that kernel-driven, self-sufficient, robust, trustable, well-enumerated sources of information work far better in practice.

The thing is, in this thread I'm forced to repeat the same basic facts again and again. Could you _PLEASE_, pretty please, when it comes to instrumentation details, at least _read the mails_ of the guys who actually ... write and maintain Linux instrumentation code? This is getting ridiculous, really.

Thanks,

	Ingo
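The lazy, on-demand model Ingo describes — don't touch a binary's ELF symbols until reporting actually needs them, then answer address lookups from a sorted table — can be sketched as follows (load_fn is a stand-in for a real ELF symbol reader, e.g. something built on pyelftools; none of this is perf's actual code):

```python
import bisect

class SymbolResolver:
    """Resolve sample addresses to symbol names, parsing each
    shared object's symbol table only on first use."""
    def __init__(self, load_fn):
        self.load_fn = load_fn      # path -> [(start_addr, name), ...]
        self.cache = {}             # path -> (sorted addrs, names)

    def resolve(self, path, addr):
        if path not in self.cache:
            syms = sorted(self.load_fn(path))
            self.cache[path] = ([a for a, _ in syms],
                                [n for _, n in syms])
        addrs, names = self.cache[path]
        # Find the last symbol starting at or below addr.
        i = bisect.bisect_right(addrs, addr) - 1
        return names[i] if i >= 0 else "[unknown]"
```

Samples themselves only need to record (binary, address) pairs cheaply at interrupt time; the expensive symbol-table parsing happens once per binary, at report time, exactly as the mail argues.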
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:31 PM, Antoine Martin wrote: On 03/22/2010 03:24 AM, Avi Kivity wrote: On 03/21/2010 10:18 PM, Antoine Martin wrote:

That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package.

That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc)

There is a matching -initrd argument that you can use to launch a daemon.

I thought this discussion was about making it easy to deploy... and generating a custom initrd isn't easy by any means, and it requires access to the guest filesystem (and its mkinitrd tools).

That's true. You need to run mkinitrd anyway, though, unless your guest is non-modular and non-lvm. I believe that -kernel use will be rare, though. It's a lot easier to keep everything in one filesystem.

Well, for what it's worth, I rarely ever use anything else. My virtual disks are raw so I can loop mount them easily, and I can also switch my guest kernels from outside... without ever needing to mount those disks.

Curious, what do you use them for?

btw, if you build your kernel outside the guest, then you already have access to all its symbols, without needing anything further.

-- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> On 03/21/2010 09:59 PM, Ingo Molnar wrote:
> >
> > Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony
> > suggesting such a clearly inferior "add a daemon to the guest space" solution.
> > It's a usability and deployment non-starter.
>
> It's only clearly inferior if you ignore every consideration against it.
> It's definitely not a deployment non-starter, see the tons of daemons that
> come with any Linux system. [...]

Avi, please don't put arguments into my mouth that I never made. My (clearly expressed) argument was that:

  _a new guest-side daemon is a transparent instrumentation non-starter_

What is so hard to understand about that simple concept? Instrumentation is good if it's as transparent as possible. Of course lots of other features can be done via a new user-space package ...

Thanks,

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> On 03/21/2010 09:06 PM, Ingo Molnar wrote:
> > * Avi Kivity wrote:
> > > [...] Second, from my point of view all contributors are volunteers
> > > (perhaps their employer volunteered them, but there's no difference from
> > > my perspective). Asking them to repaint my apartment as a condition to
> > > get a patch applied is abuse. If a patch is good, it gets applied.
> >
> > This is one of the weirdest arguments I've seen in this thread. Almost all
> > the time we do make contributions conditional on the general shape of the
> > project. Developers don't get to do just the fun stuff.
>
> So, do you think a reply to a patch along the lines of
>
>   NAK. Improving scalability is pointless while we don't have a decent
>   GUI. I'll review your RCU patches _after_ you've contributed a usable GUI.
>
> ?
>
> > What does this have to do with RCU?
>
> The example was rcu-ifying kvm, which took place a bit ago. Sorry, it wasn't
> clear.
>
> > I'm talking about KVM, which is a Linux kernel feature that is useless
> > without a proper, KVM-specific app making use of it.
> >
> > RCU is a general kernel performance feature that works across the board.
> > It helps KVM indirectly, and it helps many other kernel subsystems as
> > well. It needs no user-space tool to be useful.
>
> Correct. So should I tell someone that has sent a patch that rcu-ified kvm
> in order to scale it, that I won't accept the patch unless they do some
> usability userspace work? Say, implementing an eject button. That's what I
> understood you to mean.

Of course you could say the following:

 ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not
   able to add this to the v2.6.35 kernel queue anymore as the ongoing
   usability work already takes up all of the project's maintainer and
   testing bandwidth. If you want the feature to be merged sooner than that
   then please help us cut down on the TODO and BUGS list that can be found
   at XYZ.
There are quite a few low-hanging fruits there. ' Although this RCU example is the 'worst' possible example, as it's a pure speedup change with no functionality effect. Consider the _other_ examples that are a lot more clear: ' If you expose paravirt spinlocks via KVM please also make sure the KVM tooling can make use of it, has an option to configure it, and that it has sufficient efficiency statistics displayed in the tool for admins to monitor. ' ' If you create this new paravirt driver then please also make sure it can be configured in the tooling. ' ' Please also add a testcase for this bug to tools/kvm/testcases/ so we don't repeat this same mistake in the future. ' I'd say most of the high-level feature work in KVM has tooling impact. And note the important argument that the 'eject button' thing would not occur naturally in a project that is well designed and has a good quality balance. It would only occur in the transitional period if a big lump of lower-quality code is unified with higher-quality code. Then indeed a lot of pressure gets created on the people working on the high-quality portion to go over and fix the low-quality portion. Which, btw., is an unconditionally good thing ... But even an RCU speedup can be fairly linked/ordered to more pressing needs of a project. Really, the unification of two tightly related pieces of code has numerous clear advantages. Please give it some thought before rejecting it. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tracking KVM development
On 21.3.2010 12:21, Thomas Løcke wrote: > Any and all suggestions to keeping a healthy and stable KVM setup > running is more than welcome. Hi, I compile stable qemu-kvm releases from source and install under /opt/qemu-kvm-${version}. With this setup I can run/test multiple versions without messing up any distro. HTH, Z.
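Z.'s scheme generalizes naturally. Here is a minimal sketch of selecting one of several parallel installs, each built with its own `--prefix` (the `qemu_switch` helper and the `QEMU_OPT_ROOT` override are hypothetical, invented for illustration; they are not part of qemu-kvm):

```shell
# Sketch only: parallel qemu-kvm installs under /opt, one per version.
# Each release would have been built along the lines of:
#   ./configure --prefix=/opt/qemu-kvm-0.12.3 && make && make install
# The helper below merely selects one install by prepending its bin/
# directory to PATH. QEMU_OPT_ROOT is a hypothetical override so the
# layout can live somewhere other than /opt (e.g. in tests).
qemu_switch() {
    ver="$1"
    root="${QEMU_OPT_ROOT:-/opt}"
    dir="$root/qemu-kvm-$ver"
    if [ ! -d "$dir/bin" ]; then
        echo "no such install: $dir" >&2
        return 1
    fi
    PATH="$dir/bin:$PATH"
    export PATH
}
```

Usage would be e.g. `qemu_switch 0.12.3` before launching a guest, leaving the distro packages untouched.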
Re: Tracking KVM development
On Sun, Mar 21, 2010 at 9:19 PM, Andre Przywara wrote: > Please think twice about that. Every time I wanted to go away from Slackware > because of missing packages I ended up with accepting the involved hassle > with self-compiling because I could stay with the simplicity and clean > design of Slackware. Same here. > I usually compile my own kernels anyway and use the Slackware kernels only > for testing and installation. So I usually do "make oldconfig" on a stable > 2.6.xx.>=3 kernel, and am happy with that. QEMU(-kvm) is not a problem at > all, the dependencies are very small and with Slackware[64] 13.0 it compiles > out of the box with almost all features. I can send you a reasonably > configured package (or build-script) if you like. I also use the config provided by Slackware as a foundation for newer kernels, and I always compile my own. I would very much like to see the build-script you mention. > Currently both qemu-kvm-0.12.3 and Linux 2.6.33 work together very well, > although I usually do only testing and development with KVM and actually > "use" it very rarely. So if you need more upper level management tools (like > libvirt) I cannot help you on this. I've looked at libvirt a bit, and I fail to see the attraction. I think I will stay with plain qemu-kvm, unless there are some very compelling reasons for going down the libvirt route. :o) /Thomas
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote: > On 03/21/2010 10:08 PM, Olivier Galibert wrote: > >On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: > >>On 03/21/2010 09:17 PM, Ingo Molnar wrote: > >>>Adding any new daemon to an existing guest is a deployment and usability > >>>nightmare. > >>> > >>The logical conclusion of that is that everything should be built into > >>the kernel. Where a failure brings the system down or worse. Where you > >>have to bear the memory footprint whether you ever use the functionality > >>or not. Where to update the functionality you need to deploy a new > >>kernel (possibly introducing unrelated bugs) and reboot. > >> > >>If userspace daemons are such a deployment and usability nightmare, > >>maybe we should fix that instead. > >Which userspace? Deploying *anything* in the guest can be a > >nightmare, including paravirt drivers if you don't have a natively > >supported in the OS virtual hardware backoff. > > That includes the guest kernel. If you can deploy a new kernel in the > guest, presumably you can deploy a userspace package. Note that with perf we can instrument the guest with zero guest-kernel modifications as well. We try to reduce the guest impact to a bare minimum, as the difficulties in deployment are a function of the cross-section surface to the guest. Also, note that the kernel is special with regards to instrumentation: since this is the kernel project, we are doing kernel-space changes, as we are doing them _anyway_. So adding symbol resolution capabilities would be a minimal addition to that - while adding a whole new guest package for the daemon would significantly increase the cross-section surface. Ingo
Re: CONFIG_HAVE_KVM=n impossible?
Thanks, I had seen some boot message about KVM being active and thought that I still saw it after disabling all KVM config options - but apparently it was my fault and I mixed things up. Indeed, KVM is OFF, even with HAVE_KVM=y. So, sorry for the noise. At least it appears that this config option is confusing other people, too - see http://communities.vmware.com/message/1498691 regards roland >devz...@web.de wrote: >> Hello, >> >> does anybody know why it seems that it's not possible to build a kernel with >> "CONFIG_HAVE_KVM=n" ? >> >> It always switches back to "y" with every kernel build and I have no clue, >> why. > >It's an internal config symbol which is not visible in the menu >system and is always set up unconditionally based on the platform. >Just like "CONFIG_HAVE_MMU". > >You want a different symbol, CONFIG_KVM. > >/mjt ___ WEB.DE DSL: Internet, Telefon und Entertainment für nur 19,99 EUR/mtl.! http://produkte.web.de/go/02/
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/22/2010 03:24 AM, Avi Kivity wrote: On 03/21/2010 10:18 PM, Antoine Martin wrote: That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc) There is a matching -initrd argument that you can use to launch a daemon. I thought this discussion was about making it easy to deploy... and generating a custom initrd isn't easy by any means, and it requires access to the guest filesystem (and its mkinitrd tools). I believe that -kernel use will be rare, though. It's a lot easier to keep everything in one filesystem. Well, for what it's worth, I rarely ever use anything else. My virtual disks are raw so I can loop mount them easily, and I can also switch my guest kernels from outside... without ever needing to mount those disks.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote: > On 03/21/2010 09:17 PM, Ingo Molnar wrote: > > > > Adding any new daemon to an existing guest is a deployment and usability > > nightmare. > > The logical conclusion of that is that everything should be built into the > kernel. [...] Only if you apply it as a totalitarian rule. Furthermore, the logical conclusion of _your_ line of argument (applied in a totalitarian manner) is that 'nothing should be built into the kernel'. I.e. you are arguing for microkernel Linux, while you see me as arguing for a monolithic kernel. Reality is that we are somewhere in between, we are neither black nor white: it's shades of grey. If we want to do a good job with all this then we observe subsystems, we see how they relate to the physical world and decide about how to shape them. We identify long-term changes and re-design modularization boundaries in hindsight - when we got them wrong initially. We don't try to rationalize the status quo. Let's see one example of that thought process in action: Oprofile. We saw that the modularization of Oprofile was a total nightmare: a separate kernel-space and a separate user-space component, which were in constant version friction. The ABI between them was stifling: it was hard to change it (you needed to trickle changes through the tool as well, which was on a different release schedule, etc. etc.). The result was sucky usability that never went beyond some basic 'you can do profiling' threshold. The subsystem worked well within that design box, and it was worked on by highly competent people - but it was still far, far away from the potential it could have achieved. So we observed those problems and decided to do something about it: - We unified the two parts into a single maintenance domain. There's the kernel side in kernel/perf_event.c and arch/*/*/perf_event.c, plus the user side in tools/perf/. The two are connected by a very flexible, forwards- and backwards-compatible ABI.
- We moved much more code into the kernel, realizing that transparent and robust instrumentation should be offered instead of punting abstractions into user-space (which is in a disadvantaged position to implement system-wide abstractions). - We created a no-bullsh*t approach to usability. perf is by no means perfect, but it's written by developers for developers and if you report a bug to us we'll act on it before anything else. Furthermore the kernel developers do the user-space coding as well, so there's no Chinese wall separating them. Kernel-space becomes aware of the intricacies of user-space and user-space developers become aware of the difficulties of kernel-space as well. It's a good mix in our experience. The thing is (and I doubt you are surprised that I say that), I see a similar situation with KVM. The basic parameters are comparable to Oprofile: it has a kernel-space component and a KVM-specific user-space. By all practical means the two are one and the same, but are maintained as different projects. I have followed KVM since its inception with great interest. I saw its good initial design, I tried it early on and even wrote various patches for it. So I care more about KVM than a random observer would, but this preference and passion for KVM's good technical sides does not cloud my judgement when it comes to its weaknesses. In fact the weaknesses are far more important to identify and express publicly, so I tend to concentrate on them. Don't take this as me blasting KVM, we both know the many good aspects of KVM. So, as I explained earlier in greater detail, the modularization of KVM into a separate kernel-space and user-space component is one of its worst current weaknesses, and it has become the main stifling force in the way of a better KVM experience for users.
That, IMO, is the 'weakest link' of KVM today and no matter how well the rest of KVM gets improved, those nice bits all get unfairly ignored when the user cannot have a usable and good desktop experience and thinks that KVM is crappy. I think you should think outside the initial design box you created four years ago, you should consider iterating the model, and you should consider the alternative I suggested: move (or create) KVM tooling to tools/kvm/ and treat it as a single project from there on. Thanks, Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:18 PM, Antoine Martin wrote: That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc) There is a matching -initrd argument that you can use to launch a daemon. I believe that -kernel use will be rare, though. It's a lot easier to keep everything in one filesystem. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 09:06 PM, Ingo Molnar wrote: * Avi Kivity wrote: [...] Second, from my point of view all contributors are volunteers (perhaps their employer volunteered them, but there's no difference from my perspective). Asking them to repaint my apartment as a condition to get a patch applied is abuse. If a patch is good, it gets applied. This is one of the weirdest arguments i've seen in this thread. Almost all the time do we make contributions conditional on the general shape of the project. Developers don't get to do just the fun stuff. So, do you think a reply to a patch along the lines of NAK. Improving scalability is pointless while we don't have a decent GUI. I'll review your RCU patches _after_ you've contributed a usable GUI. ? What does this have to do with RCU? The example was RCU-ifying kvm which took place a while ago. Sorry, it wasn't clear. I'm talking about KVM, which is a Linux kernel feature that is useless without a proper, KVM-specific app making use of it. RCU is a general kernel performance feature that works across the board. It helps KVM indirectly, and it helps many other kernel subsystems as well. It needs no user-space tool to be useful. Correct. So should I tell someone that has sent a patch that RCU-ified kvm in order to scale it, that I won't accept the patch unless they do some usability userspace work? Say, implementing an eject button. That's what I understood you to mean. KVM on the other hand is useless without a user-space tool. [ Theoretically you might have a fair point if it were a critical feature of RCU for it to have a GUI, and if the main tool that made use of it sucked. But it isn't and you should know that. ] Had you suggested the following 'NAK', applied to a different, relevant subsystem: | NAK. Improving scalability is pointless while we don't have a usable | tool. I'll review your perf patches _after_ you've contributed a usable | tool.
That might hold, but the tool is usable at least for some people - it runs in production. The people running it won't benefit from an eject button or any usability improvement since they run it through a centralized management tool that hides everything. They will benefit from the scalability patches. Should I still make those patches conditional on usability work that is of no interest to the submitter? This is a basic quid pro quo: new features introduce risks and create additional workload not just for the originating developer but for the rest of the community as well. You should check how Linus has pulled new features in the past 15 years: he very much requires the existing code to first be top-notch before he accepts new features for a given area of functionality. For a given area, yes. [...] That is my precise point. KVM is a specific subsystem or "area" that makes no sense without the user-space tooling it relates to. You seem to argue that you have no 'right' to insist on good quality of that tooling - and IMO you are fundamentally wrong with that. kvm contains many sub-areas. I'm not going to tie unrelated things together like the GUI and scalability, configuration file format and emulator correctness, nested virtualization and qcow2 asynchrony, or other crazy combinations. People either leave en masse or become frustrated if they can't work on what interests them. I do reject patches touching a sub-area that I think needs to be done in userspace, for example. That's not to say kvm development is random. We have a weekly conference call where regular contributors and maintainers of both qemu and kvm participate and where we decide where to focus. Sadly the issue of a qemu GUI is not raised often. Perhaps you can participate and voice your concerns. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. 
Re: Tracking KVM development
Thomas Løcke wrote: On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: Tracking git repositories and stable setups are mutually exclusive. If you are interested in something stable I recommend staying with the distribution provided setup (and picking a distribution that has an emphasis on kvm). If you want to track upstream, use qemu-kvm-0.12.x stable releases and kernel.org 2.6.x.y stable releases. If you want to track git repositories, use qemu-kvm.git and kvm.git for the kernel and kvm. Thanks Avi. I will stay with the stable qemu-kvm releases and stable kernel.org kernel releases from now on. I've never heard of any KVM specific distributions. Are you aware of any? My primary reason for going with Slackware, is because I already know it. But if there are better choices for a KVM virtualization host, then I'm willing to switch. Please think twice about that. Every time I wanted to go away from Slackware because of missing packages I ended up with accepting the involved hassle with self-compiling because I could stay with the simplicity and clean design of Slackware. I usually compile my own kernels anyway and use the Slackware kernels only for testing and installation. So I usually do "make oldconfig" on a stable 2.6.xx.>=3 kernel, and am happy with that. QEMU(-kvm) is not a problem at all, the dependencies are very small and with Slackware[64] 13.0 it compiles out of the box with almost all features. I can send you a reasonably configured package (or build-script) if you like. Currently both qemu-kvm-0.12.3 and Linux 2.6.33 work together very well, although I usually do only testing and development with KVM and actually "use" it very rarely. So if you need more upper level management tools (like libvirt) I cannot help you on this. Regards, Andre. 
-- Andre Przywara AMD-Operating System Research Center (OSRC), Dresden, Germany Tel: +49 351 488-3567-12
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/22/2010 03:11 AM, Avi Kivity wrote: On 03/21/2010 10:08 PM, Olivier Galibert wrote: On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. Where a failure brings the system down or worse. Where you have to bear the memory footprint whether you ever use the functionality or not. Where to update the functionality you need to deploy a new kernel (possibly introducing unrelated bugs) and reboot. If userspace daemons are such a deployment and usability nightmare, maybe we should fix that instead. Which userspace? Deploying *anything* in the guest can be a nightmare, including paravirt drivers if you don't have a natively supported in the OS virtual hardware backoff. That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc) Antoine Deploying things in the host OTOH is business as usual. True. And you're smart enough to know that. Thanks.
Re: CONFIG_HAVE_KVM=n impossible?
devz...@web.de wrote: > Hello, > > does anybody know why it seems that it's not possible to build a kernel with > "CONFIG_HAVE_KVM=n" ? > > It always switches back to "y" with every kernel build and I have no clue, > why. It's an internal config symbol which is not visible in the menu system and is always set up unconditionally based on the platform. Just like "CONFIG_HAVE_MMU". You want a different symbol, CONFIG_KVM. /mjt
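mjt's explanation can be illustrated with a Kconfig sketch (paraphrased from memory of the 2.6.33-era files, so the exact lines may differ): `HAVE_KVM` has no prompt text, is selected unconditionally by the architecture, and only gates the user-visible `KVM` option.

```
# virt/kvm/Kconfig (sketch): HAVE_KVM has no prompt string, so it never
# appears in menuconfig and cannot be toggled by the user.
config HAVE_KVM
        bool

# arch/x86/Kconfig (sketch): the architecture selects it unconditionally,
# which is why it "switches back to y" on every build.
config X86
        def_bool y
        select HAVE_KVM

# The option the user actually wants to disable is KVM itself:
config KVM
        tristate "Kernel-based Virtual Machine (KVM) support"
        depends on HAVE_KVM
```

Setting `CONFIG_KVM=n` (and `CONFIG_KVM_INTEL`/`CONFIG_KVM_AMD` accordingly) is the supported way to build a kernel without KVM.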
Re: Tracking KVM development
Avi Kivity wrote: [...] > The only kvm-specific distribution I know of is RHEV-H, but that's > probably not what you're looking for. I'm talking about distributions > that have an active kvm package maintainer, update the packages > regularly, have bug trackers that someone looks into, etc. At least > Fedora and Ubuntu do this, perhaps openSuSE as well (though the latter > has a stronger Xen emphasis). Debian is a lot better on this front than it used to be a year ago. At least I'm trying to look through the bug reports on a regular basis ;) /mjt
CONFIG_HAVE_KVM=n impossible?
Hello, does anybody know why it seems that it's not possible to build a kernel with "CONFIG_HAVE_KVM=n"? It always switches back to "y" with every kernel build and I have no clue why. I'm using 2.6.33 vanilla. regards Roland
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:08 PM, Olivier Galibert wrote: On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. Where a failure brings the system down or worse. Where you have to bear the memory footprint whether you ever use the functionality or not. Where to update the functionality you need to deploy a new kernel (possibly introducing unrelated bugs) and reboot. If userspace daemons are such a deployment and usability nightmare, maybe we should fix that instead. Which userspace? Deploying *anything* in the guest can be a nightmare, including paravirt drivers if you don't have a natively supported in the OS virtual hardware backoff. That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. Deploying things in the host OTOH is business as usual. True. And you're smart enough to know that. Thanks. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 09:59 PM, Ingo Molnar wrote: Frankly, i was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. It's only clearly inferior if you ignore every consideration against it. It's definitely not a deployment non-starter, see the tons of daemons that come with any Linux system. The basic ones are installed and enabled automatically during system installation. Furthermore, allowing a guest to integrate/mount its files into the host VFS space (which was my suggestion) has many other uses and advantages as well, beyond the instrumentation/symbol-lookup purpose. Yes. I'm just not sure about the auto-enabling part. So can we please have some resolution here and move on: the KVM maintainers should either suggest a different transparent approach, or should retract the NAK for the solution we suggested. So long as you define 'transparent' as in 'only the guest kernel is involved' or even 'only the guest and host kernels are involved' we aren't going to make a lot of progress. I oppose shoving random bits of functionality into the kernel, especially things that aren't in daily use. While we developers do and will use profiling extensively, it doesn't need to sit in every guest's non-swappable .text. We very much want to make progress and want to write code, but obviously we cannot code against a maintainer NAK, nor can we code up an inferior solution either. You haven't heard any NAKs, only objections. If we discuss things perhaps we can achieve something that works for everyone. If we keep turning the flames higher that's unlikely. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: Tracking KVM development
On 21.03.2010, at 17:42, Avi Kivity wrote: > On 03/21/2010 06:37 PM, Thomas Løcke wrote: >> On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: >> >>> Tracking git repositories and stable setups are mutually exclusive. If you >>> are interested in something stable I recommend staying with the distribution >>> provided setup (and picking a distribution that has an emphasis on kvm). If >>> you want to track upstream, use qemu-kvm-0.12.x stable releases and >>> kernel.org 2.6.x.y stable releases. If you want to track git repositories, >>> use qemu-kvm.git and kvm.git for the kernel and kvm. >>> >> Thanks Avi. >> >> I will stay with the stable qemu-kvm releases and stable kernel.org >> kernel releases from now on. >> >> I've never heard of any KVM specific distributions. Are you aware of >> any? My primary reason for going with Slackware, is because I already >> know it. But if there are better choices for a KVM virtualization >> host, then I'm willing to switch. >> > > The only kvm-specific distribution I know of is RHEV-H, but that's probably > not what you're looking for. I'm talking about distributions that have an > active kvm package maintainer, update the packages regularly, have bug > trackers that someone looks into, etc. At least Fedora and Ubuntu do this, > perhaps openSuSE as well (though the latter has a stronger Xen emphasis). Yes, we do. Though openSUSE 11.2 isn't exactly where I want it to be. Expect 11.3 to be a lot better there. Alex
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: > On 03/21/2010 09:17 PM, Ingo Molnar wrote: > > > >Adding any new daemon to an existing guest is a deployment and usability > >nightmare. > > > > The logical conclusion of that is that everything should be built into > the kernel. Where a failure brings the system down or worse. Where you > have to bear the memory footprint whether you ever use the functionality > or not. Where to update the functionality you need to deploy a new > kernel (possibly introducing unrelated bugs) and reboot. > > If userspace daemons are such a deployment and usability nightmare, > maybe we should fix that instead. Which userspace? Deploying *anything* in the guest can be a nightmare, including paravirt drivers if you don't have a natively supported in the OS virtual hardware backoff. Deploying things in the host OTOH is business as usual. And you're smart enough to know that. OG.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. Where a failure brings the system down or worse. Where you have to bear the memory footprint whether you ever use the functionality or not. Where to update the functionality you need to deploy a new kernel (possibly introducing unrelated bugs) and reboot. If userspace daemons are such a deployment and usability nightmare, maybe we should fix that instead. The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated. You have to modify the guest anyway by deploying a new kernel. Please try to think with the heads of our users and developers and don't suggest some weird ivory-tower design that is totally impractical ... An inetd.d style 'drop a listener config here and it will be executed on connection' should work. The listener could come with the kernel package, though I don't think it's a good idea. module-init-tools doesn't and people have survived somehow. And no, you have to code none of this, we'll do all the coding. The only thing we are asking is for you to not stand in the way of good usability ... Thanks. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
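Avi's inetd-style suggestion could look roughly like the following entry (purely illustrative: the service name, port number, and the `perf-guestd` helper path are invented here, not anything actually proposed in the thread):

```
# /etc/xinetd.d/perf-guest (hypothetical): the guest runs no permanent
# daemon; xinetd execs a short-lived helper only when the host connects.
service perf-guest
{
        type            = UNLISTED
        port            = 8192
        socket_type     = stream
        protocol        = tcp
        wait            = no
        user            = root
        server          = /usr/sbin/perf-guestd
        disable         = no
}
```

The point of the design is that nothing sits resident in guest memory between profiling sessions; the helper lives only for the duration of a host connection.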
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Antoine Martin wrote: > On 03/22/2010 02:17 AM, Ingo Molnar wrote: > >* Anthony Liguori wrote: > >>On 03/19/2010 03:53 AM, Ingo Molnar wrote: > >>>* Avi Kivity wrote: > >There were two negative reactions immediately, both showed a fundamental > >server versus desktop bias: > > > > - you did not accept that the most important usecase is when there is a > >single guest running. > Well, it isn't. > >>>Erm, my usability points are _doubly_ true when there are multiple guests > >>>... > >>> > >>>The inconvenience of having to type: > >>> > >>> perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \ > >>> --guestmodules=/home/ymzhang/guest/modules top > >>> > >>>is very obvious even with a single guest. Now multiply that by more guests > >>>... > >>If you want to improve this, you need to do the following: > >> > >>1) Add a userspace daemon that uses vmchannel that runs in the guest and can > >>fetch kallsyms and arbitrary modules. If that daemon lives in > >>tools/perf, that's fine. > > > > Adding any new daemon to an existing guest is a deployment and usability > > nightmare. > > Absolutely. In most cases it is not desirable, and you'll find that in a lot > of cases it is not even possible - for non-technical reasons. > > One of the main benefits of virtualization is the ability to manage and see > things from the outside. > > > The basic rule of good instrumentation is to be transparent. The moment we > > have to modify the user-space of a guest just to monitor it, the purpose > > of transparent instrumentation is defeated. > > Not to mention Heisenbugs and interference. Correct. Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. 
Furthermore, allowing a guest to integrate/mount its files into the host VFS space (which was my suggestion) has many other uses and advantages as well, beyond the instrumentation/symbol-lookup purpose. So can we please have some resolution here and move on: the KVM maintainers should either suggest a different transparent approach, or should retract the NAK for the solution we suggested. We very much want to make progress and want to write code, but obviously we cannot code against a maintainer NAK, nor can we code up an inferior solution either. Thanks, Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/22/2010 02:17 AM, Ingo Molnar wrote: * Anthony Liguori wrote: On 03/19/2010 03:53 AM, Ingo Molnar wrote: * Avi Kivity wrote: There were two negative reactions immediately, both showed a fundamental server versus desktop bias: - you did not accept that the most important usecase is when there is a single guest running. Well, it isn't. Erm, my usability points are _doubly_ true when there are multiple guests ... The inconvenience of having to type: perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \ --guestmodules=/home/ymzhang/guest/modules top is very obvious even with a single guest. Now multiply that by more guests ... If you want to improve this, you need to do the following: 1) Add a userspace daemon that uses vmchannel that runs in the guest and can fetch kallsyms and arbitrary modules. If that daemon lives in tools/perf, that's fine. Adding any new daemon to an existing guest is a deployment and usability nightmare. Absolutely. In most cases it is not desirable, and you'll find that in a lot of cases it is not even possible - for non-technical reasons. One of the main benefits of virtualization is the ability to manage and see things from the outside. The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated. Not to mention Heisenbugs and interference. Cheers Antoine That was one of the fundamental usability mistakes of Oprofile. There is no 'perf' daemon - all the perf functionality is _built in_, and for very good reasons. It is one of the main reasons for perf's success as well. Now Qemu is trying to repeat that stupid mistake ... So please either suggest a different transparent solution that is technically better than the one i suggested, or you should concede the point really. 
Please try to think with the heads of our users and developers and don't suggest some weird ivory-tower design that is totally impractical ... And no, you have to code none of this, we'll do all the coding. The only thing we are asking is for you to not stand in the way of good usability ... Thanks, Ingo
Streaming Audio from Virtual Machine
I'm using Kubuntu 9.10 32-bit on a quad-core Phenom II with Gigabit ethernet. I want to stream audio from MLB.com from a WinXP client thru a Linksys WMB54G wireless music bridge. Note that there are drivers for the WMB54G only for WinXP and Vista. If I stream the audio thru a native WinXP box thru the WMB54G, all is well and the audio sounds fine. When I try to stream thru a WinXP virtual machine on Kubuntu 9.10, the audio is poor quality and subject to gaps and dropping the stream altogether. So far I've tried KVM/QEMU and VirtualBox, same result. Regarding KVM/QEMU, I note AMD-V is activated in the BIOS, and I have a custom 2.6.32.7 kernel and QEMU 0.11.0. The kvm and kvm_amd modules are compiled and loaded. I've been using bridged networking. I think it's set up correctly but I confess I'm no networking expert. My start command for the WinXP virtual machine is:

sudo /usr/bin/qemu -m 1024 -boot c -net nic,vlan=0,macaddr=00:d0:13:b0:2d:32,model=rtl8139 -net tap,vlan=0,ifname=tap0,script=/etc/qemu-ifup -localtime -soundhw ac97 -smp 4 -fda /dev/fd0 -vga std -usb /home/rbroman/windows.img

I also tried model=virtio but that didn't help. I suspect this is a virtual machine networking problem but I'm not sure. So my questions are:

- What's the best/fastest networking option and how do I set it up? Pointers to step-by-step instructions appreciated.
- Is it possible I have a problem other than networking? Configuration problem with KVM/QEMU? Or could there be a problem with the WMB54G driver when used thru a virtual machine?
- Is there a better virtual machine solution than KVM/QEMU for what I'm trying to do? Recommendations appreciated.

- Gus
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Anthony Liguori wrote:

> On 03/19/2010 03:53 AM, Ingo Molnar wrote:
> > * Avi Kivity wrote:
> >
> >>> There were two negative reactions immediately, both showed a fundamental
> >>> server versus desktop bias:
> >>>
> >>>  - you did not accept that the most important usecase is when there is a
> >>>    single guest running.
> >>
> >> Well, it isn't.
> >
> > Erm, my usability points are _doubly_ true when there are multiple guests ...
> >
> > The inconvenience of having to type:
> >
> >   perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \
> >     --guestmodules=/home/ymzhang/guest/modules top
> >
> > is very obvious even with a single guest. Now multiply that by more guests
> > ...
>
> If you want to improve this, you need to do the following:
>
> 1) Add a userspace daemon that uses vmchannel that runs in the guest and can
>    fetch kallsyms and arbitrary modules. If that daemon lives in
>    tools/perf, that's fine.

Adding any new daemon to an existing guest is a deployment and usability nightmare.

The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated.

That was one of the fundamental usability mistakes of Oprofile. There is no 'perf' daemon - all the perf functionality is _built in_, and for very good reasons. It is one of the main reasons for perf's success as well. Now Qemu is trying to repeat that stupid mistake ...

So please either suggest a different transparent solution that is technically better than the one I suggested, or you should concede the point really.

Please try to think with the heads of our users and developers and don't suggest some weird ivory-tower design that is totally impractical ...

And no, you have to code none of this, we'll do all the coding. The only thing we are asking is for you to not stand in the way of good usability ...
Thanks, Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> >> [...] Second, from my point of view all contributors are volunteers
> >> (perhaps their employer volunteered them, but there's no difference from
> >> my perspective). Asking them to repaint my apartment as a condition to
> >> get a patch applied is abuse. If a patch is good, it gets applied.
> >
> > This is one of the weirdest arguments I've seen in this thread. Almost all
> > the time do we make contributions conditional on the general shape of the
> > project. Developers don't get to do just the fun stuff.
>
> So, do you think a reply to a patch along the lines of
>
>   NAK. Improving scalability is pointless while we don't have a decent GUI.
>   I'll review your RCU patches _after_ you've contributed a usable GUI.
>
> ?

What does this have to do with RCU? I'm talking about KVM, which is a Linux kernel feature that is useless without a proper, KVM-specific app making use of it.

RCU is a general kernel performance feature that works across the board. It helps KVM indirectly, and it helps many other kernel subsystems as well. It needs no user-space tool to be useful. KVM on the other hand is useless without a user-space tool.

[ Theoretically you might have a fair point if it were a critical feature of RCU for it to have a GUI, and if the main tool that made use of it sucked. But it isn't and you should know that. ]

Had you suggested the following 'NAK', applied to a different, relevant subsystem:

| NAK. Improving scalability is pointless while we don't have a usable
| tool. I'll review your perf patches _after_ you've contributed a usable
| tool.

you would have a fair point. In fact, we are doing that; we are living by that. It makes absolutely zero sense to improve the scalability of perf if its usability sucks.

So where you are trying to point out an inconsistency in my argument, there is none.
> > This is a basic quid pro quo: new features introduce risks and create
> > additional workload not just on the originating developer but on the rest
> > of the community as well. You should check how Linus has pulled new
> > features in the past 15 years: he very much requires the existing code to
> > first be top-notch before he accepts new features for a given area of
> > functionality.
>
> For a given area, yes. [...]

That is my precise point. KVM is a specific subsystem or "area" that makes no sense without the user-space tooling it relates to. You seem to argue that you have no 'right' to insist on good quality of that tooling - and IMO you are fundamentally wrong about that. Thanks, Ingo
Re: [PATCH] Enhance perf to collect KVM guest os statistics from host side
* Joerg Roedel wrote:

> On Fri, Mar 19, 2010 at 09:21:22AM +0100, Ingo Molnar wrote:
> > Unfortunately, in a previous thread the Qemu maintainer has indicated that
> > he will essentially NAK any attempt to enhance Qemu to provide an easily
> > discoverable, self-contained, transparent guest mount on the host side.
> >
> > No technical justification was given for that NAK, despite my repeated
> > requests to articulate the exact security problems that such an approach
> > would cause.
> >
> > If that NAK does not stand in that form then I'd like to know about it - it
> > makes no sense for us to try to code up a solution against a standing
> > maintainer NAK ...
>
> I still think it is the best and most generic way to let the guest do the
> symbol resolution. [...]

Not really.

> [...] This has several advantages:
>
> 1. The guest knows best about its symbol space. So this would be
>    extensible to other guest operating systems. A brave
>    developer may even implement symbol passing for Windows or
>    the BSDs ;-)

Having access to the actual executable files that include the symbols achieves precisely that - with the additional robustness that all this functionality is concentrated in the host, while the guest side is kept minimal (and transparent).

> 2. The guest can decide on its own if it wants to pass this
>    information to the host-perf. No security issues at all.

It can decide whether it exposes the files. Nor are there any "security issues" to begin with.

> 3. The guest can also pass us the call-chain and we don't need
>    to care about the complicated fetching from the guest
>    ourselves.

You need to be aware of the fact that symbol resolution is a separate step from call-chain generation. I.e. call-chains are an (entirely) separate issue, and could reasonably be done in the guest or in the host. It has no bearing on this symbol resolution question.

> 4. This way it is extensible to nested virtualization too.
Nested virtualization is actually already taken care of by the filesystem solution via an existing method called 'subdirectories'. If the guest offers sub-guests then those symbols will be exposed in a similar way via its own 'guest files' directory hierarchy. I.e. if we have 'Guest-2' nested inside the 'Guest-Fedora-1' instance, we get:

  /guests/
  /guests/Guest-Fedora-1/etc/
  /guests/Guest-Fedora-1/usr/

we'd also have:

  /guests/Guest-Fedora-1/guests/Guest-2/

So this is taken care of automatically. I.e. none of the four 'advantages' listed here are actually advantages over my proposed solution, so your conclusion is subsequently flawed as well.

> How we speak to the guest was already discussed in this thread. My personal
> opinion is that going through qemu is an unnecessary step and we can solve
> that more cleverly and transparently for perf.

Meaning exactly what? Thanks, Ingo
Re: unexpected exit_ini_info when nesting svm
Hello Olivier,

On Thu, Mar 18, 2010 at 08:43:53PM +0100, Olivier Berghmans wrote:
> I tried nesting kvm in kvm on an AMD processor with support for svm
> and npt (the dmesg told me both were in use). I managed to install the
> nested kvm and when starting the L2 guest in order to install an
> operating system, I got following messages in the L1 guest:
>
> [ 2016.712047] handle_exit: unexpected exit_ini_info 0x8008 exit_code 0x60
> [ 2031.432032] handle_exit: unexpected exit_ini_info 0x8008 exit_code 0x60
> [ 2034.468058] handle_exit: unexpected exit_ini_info 0x8008 exit_code 0x60

These messages result from a difference between real hardware SVM and the emulated SVM in KVM. Hardware SVM always injects an exception first before it does a #vmexit(0x60), while the SVM emulation immediately does the #vmexit again. I have a patch to fix this but it needs more testing. The patch implements detection of the above situation and sends a self-IPI in this case.

Joerg
Re: Tracking KVM development
On 03/21/2010 06:37 PM, Thomas Løcke wrote: On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: Tracking git repositories and stable setups are mutually exclusive. If you are interested in something stable I recommend staying with the distribution provided setup (and picking a distribution that has an emphasis on kvm). If you want to track upstream, use qemu-kvm-0.12.x stable releases and kernel.org 2.6.x.y stable releases. If you want to track git repositories, use qemu-kvm.git and kvm.git for the kernel and kvm. Thanks Avi. I will stay with the stable qemu-kvm releases and stable kernel.org kernel releases from now on. I've never heard of any KVM specific distributions. Are you aware of any? My primary reason for going with Slackware, is because I already know it. But if there are better choices for a KVM virtualization host, then I'm willing to switch. The only kvm-specific distribution I know of is RHEV-H, but that's probably not what you're looking for. I'm talking about distributions that have an active kvm package maintainer, update the packages regularly, have bug trackers that someone looks into, etc. At least Fedora and Ubuntu do this, perhaps openSuSE as well (though the latter has a stronger Xen emphasis). -- error compiling committee.c: too many arguments to function
Re: Tracking KVM development
On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: > Tracking git repositories and stable setups are mutually exclusive. If you > are interested in something stable I recommend staying with the distribution > provided setup (and picking a distribution that has an emphasis on kvm). If > you want to track upstream, use qemu-kvm-0.12.x stable releases and > kernel.org 2.6.x.y stable releases. If you want to track git repositories, > use qemu-kvm.git and kvm.git for the kernel and kvm. Thanks Avi. I will stay with the stable qemu-kvm releases and stable kernel.org kernel releases from now on. I've never heard of any KVM specific distributions. Are you aware of any? My primary reason for going with Slackware, is because I already know it. But if there are better choices for a KVM virtualization host, then I'm willing to switch. :o) /Thomas
Re: Strange CPU usage pattern in SMP guest
On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>>> 12:46:02  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
>>>> 12:46:03  all  0,20  11,35  10,96    8,96   0,40   2,99    0,00    0,00  65,14
>>>> 12:46:03    0  1,00  11,00   7,00   15,00   0,00   1,00    0,00    0,00  65,00
>>>> 12:46:03    1  0,00   7,14   2,04    6,12   1,02  11,22    0,00    0,00  72,45
>>>> 12:46:03    2  0,00  15,00   1,00   12,00   0,00   1,00    0,00    0,00  71,00
>>>> 12:46:03    3  0,00  11,00  23,00    8,00   0,00   0,00    0,00    0,00  58,00
>>>> 12:46:03    4  0,00   0,00  50,00    0,00   0,00   0,00    0,00    0,00  50,00
>>>> 12:46:03    5  0,00  13,00  20,00    4,00   0,00   1,00    0,00    0,00  62,00
>>>>
>>>> So it is only CPU4 that is showing this strange behaviour.
>>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
>>> /proc/interrupts' from that guest.
>>>
>>> Most likely the timer interrupt for cpu4 died.
>>
>> I've added two keys +/- to your irqtop to focus up and down
>> in the row of available CPUs.
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> irqtop for cpu 4
>>
>> eth0                          188
>> Rescheduling interrupts       162
>> Local timer interrupts          6
>> ata_piix                        3
>> TLB shootdowns                  1
>> Spurious interrupts             0
>> Machine check exceptions        0
>>
>> irqtop for cpu 5
>>
>> eth0                          257
>> Local timer interrupts        251
>> Rescheduling interrupts       237
>> Spurious interrupts             0
>> Machine check exceptions        0
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?

The host is a dual quad-core E5520 with hyperthreading enabled, so we see 2x4x2 = 16 CPUs on the host. The guest is started with 6 CPUs.

> Can you confirm the other vcpus are ticking at 250 Hz?

The irqtop shows different numbers for local timer interrupts on the other CPUs.
The total number (summed up over all CPUs) varies between something like 700 and 1400. Any CPU can be down to 10 and next update up to 260. Only CPU4 stays at the 6 local timer interrupts. > > What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a > last-used-cpu field in the display. The processes are not bound to a particular CPU, so the picture varies. Here are two shots: take1: 15 root RT -5 000 S0 0.0 0:01.70 4 migration/4 16 root 15 -5 000 S0 0.0 0:00.08 4 ksoftirqd/4 17 root RT -5 000 S0 0.0 0:00.00 4 watchdog/4 25 root 15 -5 000 S0 0.0 0:00.01 4 events/4 35 root 15 -5 000 S0 0.0 0:00.00 4 kintegrityd/4 41 root 15 -5 000 S0 0.0 0:00.03 4 kblockd/4 50 root 15 -5 000 S0 0.0 0:00.90 4 ata/4 55 root 15 -5 000 S0 0.0 0:00.00 4 kseriod 66 root 15 -5 000 S0 0.0 0:00.00 4 aio/4 73 root 15 -5 000 S0 0.0 0:00.00 4 crypto/4 80 root 15 -5 000 S0 0.0 2:11.71 4 scsi_eh_1 87 root 15 -5 000 S0 0.0 0:00.00 4 kmpathd/4 95 root 15 -5 000 S0 0.0 0:00.00 4 kondemand/4 101 root 15 -5 000 S0 0.0 0:00.00 4 kconservative/4 103 root 10 -10 000 S0 0.0 0:00.00 4 krfcommd 681 root 15 -5 000 S0 0.0 0:00.00 4 kdmflush 686 root 15 -5 000 S0 0.0 0:00.00 4 kdmflush 691 root 15 -5 000 S0 0.0 0:00.00 4 kdmflush 737 root 15 -5 000 S0 0.0 0:00.71 4 kjournald 826 root 16 -4 2100 452 312 S0 0.0 0:00.14 4 udevd 1350 root 15 -5 000 S0 0.0 0:00.00 4 kpsmoused 1444 root 15 -5 000 S0 0.0 0:00.00 4 kgameportd 1718 root 15 -5 000 S0 0.0 0:14.62 4 kjournald 2108 statd 20 0 2252 1152 760 S0 0.0 0:02.66 4 rpc.statd 2117 root 15 -5 000 S0 0.0 0:00.36 4 rpciod/4 2123 root 15 -5 00
Re: Strange CPU usage pattern in SMP guest
On 03/21/2010 04:55 PM, Sebastian Hetze wrote: On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote: On 03/21/2010 02:02 PM, Sebastian Hetze wrote: 12:46:02 CPU%usr %nice%sys %iowait%irq %soft %steal %guest %idle 12:46:03 all0,20 11,35 10,968,960,402,990,00 0,00 65,14 12:46:03 01,00 11,007,00 15,000,001,000,00 0,00 65,00 12:46:03 10,007,142,046,121,02 11,220,00 0,00 72,45 12:46:03 20,00 15,001,00 12,000,001,000,00 0,00 71,00 12:46:03 30,00 11,00 23,008,000,000,000,00 0,00 58,00 12:46:03 40,000,00 50,000,000,000,000,00 0,00 50,00 12:46:03 50,00 13,00 20,004,000,001,000,00 0,00 62,00 So it is only CPU4 that is showing this strange behaviour. Can you adjust irqtop to only count cpu4? or even just post a few 'cat /proc/interrupts' from that guest. Most likely the timer interrupt for cpu4 died. I've added two keys +/- to your irqtop to focus up and down in the row of available CPUs. The irqtop for CPU4 shows a constant number of 6 local timer interrupts per update, while the other CPUs show various higher values: irqtop for cpu 4 eth0 188 Rescheduling interrupts 162 Local timer interrupts 6 ata_piix3 TLB shootdowns 1 Spurious interrupts 0 Machine check exceptions0 irqtop for cpu 5 eth0 257 Local timer interrupts251 Rescheduling interrupts 237 Spurious interrupts 0 Machine check exceptions0 So the timer interrupt for cpu4 is not completely dead but somehow broken. That is incredibly weird. What can cause this problem? Any way to speed it up again? The host has 8 cpus and is only running this 6 vcpu guest, yes? Can you confirm the other vcpus are ticking at 250 Hz? What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a last-used-cpu field in the display. Marcelo, any ideas? -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: x86 emulator: fix unlocked CMPXCHG8B emulation.
When CMPXCHG8B is executed without a LOCK prefix it is racy. Preserve this behaviour in the emulator too.

Signed-off-by: Gleb Natapov

---
This patch goes on top of my previous "KVM: x86 emulator: add decoding of CMPXCHG8B dst operand." patch.

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 904351e..e2bbb9c 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1724,7 +1724,6 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt,
 			(u32) c->regs[VCPU_REGS_RBX];
 
 		ctxt->eflags |= EFLG_ZF;
-		c->lock_prefix = 1;
 	}
 	return X86EMUL_CONTINUE;
 }

--
Gleb.
Re: Strange CPU usage pattern in SMP guest
On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>
>> 12:46:02  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
>> 12:46:03  all  0,20  11,35  10,96    8,96   0,40   2,99    0,00    0,00  65,14
>> 12:46:03    0  1,00  11,00   7,00   15,00   0,00   1,00    0,00    0,00  65,00
>> 12:46:03    1  0,00   7,14   2,04    6,12   1,02  11,22    0,00    0,00  72,45
>> 12:46:03    2  0,00  15,00   1,00   12,00   0,00   1,00    0,00    0,00  71,00
>> 12:46:03    3  0,00  11,00  23,00    8,00   0,00   0,00    0,00    0,00  58,00
>> 12:46:03    4  0,00   0,00  50,00    0,00   0,00   0,00    0,00    0,00  50,00
>> 12:46:03    5  0,00  13,00  20,00    4,00   0,00   1,00    0,00    0,00  62,00
>>
>> So it is only CPU4 that is showing this strange behaviour.
>
> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
> /proc/interrupts' from that guest.
>
> Most likely the timer interrupt for cpu4 died.

I've added two keys +/- to your irqtop to focus up and down in the row of available CPUs. The irqtop for CPU4 shows a constant number of 6 local timer interrupts per update, while the other CPUs show various higher values:

irqtop for cpu 4

eth0                          188
Rescheduling interrupts       162
Local timer interrupts          6
ata_piix                        3
TLB shootdowns                  1
Spurious interrupts             0
Machine check exceptions        0

irqtop for cpu 5

eth0                          257
Local timer interrupts        251
Rescheduling interrupts       237
Spurious interrupts             0
Machine check exceptions        0

So the timer interrupt for cpu4 is not completely dead but somehow broken. What can cause this problem? Any way to speed it up again?
#!/usr/bin/python
import curses
import sys, os, time, optparse

def read_interrupts():
    global target
    irq = {}
    proc = file('/proc/interrupts')
    nrcpu = len(proc.readline().split())
    if target < 0:
        target = 0
    if target > nrcpu:
        target = nrcpu
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        if target == 0:
            count = sum([int(x) for x in counts])
        else:
            count = int(counts[target-1])
        try:
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

target = 0
label_width = 35
number_width = 10

def tui(screen):
    curses.use_default_colors()
    global target
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        if target > 0:
            title = "irqtop for cpu %d" % (target-1)
        else:
            title = "irqtop sum for all cpu's"
        screen.addstr(0, 0, title)
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()
    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
            if c == '+':
                target = target + 1
            if c == '-':
                target = target - 1
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)
Re: [PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
On 03/21/2010 04:44 PM, Gleb Natapov wrote: On Sun, Mar 21, 2010 at 04:41:24PM +0200, Avi Kivity wrote: On 03/21/2010 01:08 PM, Gleb Natapov wrote: Decode CMPXCHG8B destination operand in decoding stage. Fixes regression introduced by "If LOCK prefix is used dest arg should be memory" commit. This commit relies on dst operand be decoded at the beginning of an instruction emulation. @@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, c->regs[VCPU_REGS_RAX] = (u32) (old>> 0); c->regs[VCPU_REGS_RDX] = (u32) (old>> 32); ctxt->eflags&= ~EFLG_ZF; - } else { - new = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | + c->dst.val = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | (u32) c->regs[VCPU_REGS_RBX]; - rc = ops->cmpxchg_emulated(c->modrm_ea,&old,&new, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; ctxt->eflags |= EFLG_ZF; + c->lock_prefix = 1; Why is this bit needed? cmpxchg64b without lock is valid and racy, but the guest may know it is safe. Agree. Before this patch cmpxchg8b emulation always called cmpxchg_emulated(), so to be extra careful I wanted to preserve old behaviour. Resend the patch without this line? Better a 3/2 that removes it. So we have a large patch that just transforms code, and a small patch that corrects an earlier bug. May help a bisector one day. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
On Sun, Mar 21, 2010 at 04:41:24PM +0200, Avi Kivity wrote: > On 03/21/2010 01:08 PM, Gleb Natapov wrote: > >Decode CMPXCHG8B destination operand in decoding stage. Fixes regression > >introduced by "If LOCK prefix is used dest arg should be memory" commit. > >This commit relies on dst operand be decoded at the beginning of an > >instruction emulation. > > >@@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct > >x86_emulate_ctxt *ctxt, > > c->regs[VCPU_REGS_RAX] = (u32) (old>> 0); > > c->regs[VCPU_REGS_RDX] = (u32) (old>> 32); > > ctxt->eflags&= ~EFLG_ZF; > >- > > } else { > >-new = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | > >+c->dst.val = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | > >(u32) c->regs[VCPU_REGS_RBX]; > > > >-rc = ops->cmpxchg_emulated(c->modrm_ea,&old,&new, 8, > >ctxt->vcpu); > >-if (rc != X86EMUL_CONTINUE) > >-return rc; > > ctxt->eflags |= EFLG_ZF; > >+c->lock_prefix = 1; > > Why is this bit needed? cmpxchg64b without lock is valid and racy, > but the guest may know it is safe. > Agree. Before this patch cmpxchg8b emulation always called cmpxchg_emulated(), so to be extra careful I wanted to preserve old behaviour. Resend the patch without this line? -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
On 03/21/2010 01:08 PM, Gleb Natapov wrote: Decode CMPXCHG8B destination operand in decoding stage. Fixes regression introduced by "If LOCK prefix is used dest arg should be memory" commit. This commit relies on dst operand be decoded at the beginning of an instruction emulation. @@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, c->regs[VCPU_REGS_RAX] = (u32) (old>> 0); c->regs[VCPU_REGS_RDX] = (u32) (old>> 32); ctxt->eflags&= ~EFLG_ZF; - } else { - new = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | + c->dst.val = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | (u32) c->regs[VCPU_REGS_RBX]; - rc = ops->cmpxchg_emulated(c->modrm_ea,&old,&new, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; ctxt->eflags |= EFLG_ZF; + c->lock_prefix = 1; Why is this bit needed? cmpxchg64b without lock is valid and racy, but the guest may know it is safe. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
On 03/21/2010 04:35 PM, Gleb Natapov wrote: On Sun, Mar 21, 2010 at 04:32:42PM +0200, Avi Kivity wrote: On 03/21/2010 01:09 PM, Gleb Natapov wrote: Wrong To: header. Ignore please. See sendemail.aliasesfile in 'git help send-email'. I use aliasesfile, but unfortunately if an alias is not found there git does not complain, just passes it as-is to sendmail, and sendmail adds the part after @ by itself. Ah. Then don't use sendmail. -- error compiling committee.c: too many arguments to function
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
On Sun, Mar 21, 2010 at 04:32:42PM +0200, Avi Kivity wrote:
> On 03/21/2010 01:09 PM, Gleb Natapov wrote:
> > Wrong To: header. Ignore please.
>
> See sendemail.aliasesfile in 'git help send-email'.

I use aliasesfile, but unfortunately if an alias is not found there git does not complain, just passes it as-is to sendmail, and sendmail adds the part after @ by itself.

--
Gleb.
Re: [PATCH] KVM: Fix a build error
On 03/20/2010 07:17 PM, Amos Kong wrote: arch/x86/kvm/x86.c: In function ‘emulator_cmpxchg_emulated’: arch/x86/kvm/x86.c:3367: error: ‘u’ undeclared (first use in this function) arch/x86/kvm/x86.c:3367: error: (Each undeclared identifier is reported only once arch/x86/kvm/x86.c:3367: error: for each function it appears in.) arch/x86/kvm/x86.c:3367: error: expected expression before ‘)’ token Thanks, just applied the same patch from Jan. -- error compiling committee.c: too many arguments to function
Re: [PATCH] KVM: x86: Fix 32-bit build breakage due to typo
On 03/20/2010 11:14 AM, Jan Kiszka wrote: Obviously, the 64-bit case is considered stable now and 32 bit remained untested (not included in autotest?). We don't autotest on 32-bit hosts these days. So here is the build fix: Thanks, applied. Should have done it myself. -- error compiling committee.c: too many arguments to function
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
On 03/21/2010 01:09 PM, Gleb Natapov wrote: Wrong To: header. Ignore please. See sendemail.aliasesfile in 'git help send-email'. -- error compiling committee.c: too many arguments to function
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On Thu, Mar 18, 2010 at 05:13:10PM +0100, Ingo Molnar wrote: > > Why does Linux AIO still suck? Why do we not have a proper interface in > > userspace for doing asynchronous file system operations? > > Good that you mention it, i think it's an excellent example. > > The suckage of kernel async IO is for similar reasons: there's an ugly package > separation problem between the kernel and between glibc - and between the apps > that would make use of it. No, kernel async IO sucks because it still does not play well with buffered I/O. Last time I checked (about a year ago or so), AIO syscall latencies were much worse when buffered I/O was used compared to direct I/O. Unfortunately, to achieve good performance with direct I/O, you need a HW RAID card with lots of on-board cache. Gabor
Re: [PATCH] KVM: Drop KVM_REQ_PENDING_TIMER
On 03/20/2010 05:20 AM, Xiao Wang wrote: The pending timer is not detected through KVM_REQ_PENDING_TIMER now. It does, see the commit message of 06e056456. Marcelo, IIRC this is the second time we get this patch... we need either a comment in the code, or better, a fix that doesn't involve an atomic in the fast path. -- error compiling committee.c: too many arguments to function
Re: Tracking KVM development
On 03/21/2010 01:21 PM, Thomas Løcke wrote: Hey all, I've recently started testing KVM as a possible virtualization solution for a bunch of servers, and so far things are going pretty well. My OS of choice is Slackware, and I usually just go with whatever kernel Slackware comes with. But with KVM I feel I might need to pay a bit more attention to that part of Slackware, as it appears to be a project in rapid development, so my questions concern how best to track and keep KVM up-to-date. Currently I upgrade to the latest stable kernel almost as soon as it's been released by Linus, and I track qemu-kvm using this Git repository: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git But should I perhaps also track the KVM modules, and if so, from where? Any and all suggestions for keeping a healthy and stable KVM setup running are more than welcome. Tracking git repositories and stable setups are mutually exclusive. If you are interested in something stable I recommend staying with the distribution-provided setup (and picking a distribution that has an emphasis on kvm). If you want to track upstream, use qemu-kvm-0.12.x stable releases and kernel.org 2.6.x.y stable releases. If you want to track git repositories, use qemu-kvm.git and kvm.git for the kernel and kvm. -- error compiling committee.c: too many arguments to function
Re: Strange CPU usage pattern in SMP guest
On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
12:46:02 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:03 all    0,20  11,35  10,96    8,96   0,40   2,99   0,00   0,00  65,14
12:46:03   0    1,00  11,00   7,00   15,00   0,00   1,00   0,00   0,00  65,00
12:46:03   1    0,00   7,14   2,04    6,12   1,02  11,22   0,00   0,00  72,45
12:46:03   2    0,00  15,00   1,00   12,00   0,00   1,00   0,00   0,00  71,00
12:46:03   3    0,00  11,00  23,00    8,00   0,00   0,00   0,00   0,00  58,00
12:46:03   4    0,00   0,00  50,00    0,00   0,00   0,00   0,00   0,00  50,00
12:46:03   5    0,00  13,00  20,00    4,00   0,00   1,00   0,00   0,00  62,00
So it is only CPU4 that is showing this strange behaviour. Can you adjust irqtop to only count cpu4? Or even just post a few 'cat /proc/interrupts' from that guest. Most likely the timer interrupt for cpu4 died. -- error compiling committee.c: too many arguments to function
hi, may I ask some help on the paravirtualization of KVM?
I want to set up virtio-net for the guest OS on KVM. These are my steps:
1. Compile kvm-88, then make and make install.
2. Compile the guest OS (Red Hat) with kernel version 2.6.27.45 (with virtio support). The required options are all selected:
o CONFIG_VIRTIO_PCI=y (Virtualization -> PCI driver for virtio devices)
o CONFIG_VIRTIO_BALLOON=y (Virtualization -> Virtio balloon driver)
o CONFIG_VIRTIO_BLK=y (Device Drivers -> Block -> Virtio block driver)
o CONFIG_VIRTIO_NET=y (Device Drivers -> Network device support -> Virtio network driver)
o CONFIG_VIRTIO=y (automatically selected)
o CONFIG_VIRTIO_RING=y (automatically selected)
3. Then start the guest OS with this command: x86_64-softmmu/qemu-system-x86_64 -m 1024 /root/redhat.img -net nic,model=virtio -net tap,script=/etc/kvm/qemu-ifup
4. The result is this:
* The guest OS starts up.
* But the network does not; no ethX device is found.
* lsmod | grep virtio shows no virtio modules.
So why does virtio_net not show up in the guest OS? Is anything wrong with my steps, or is some setting missing? I have referred to the page http://www.linux-kvm.org/page/Virtio, but did not find any special requirement. Does anyone have some tips? Thanks in advance. -- Best regards, YangLiang _ Department of Computer Science . School of Electronics Engineering & Computer Science . _
Re: Strange CPU usage pattern in SMP guest
On Sun, Mar 21, 2010 at 12:09:00PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
>> Hi *,
>>
>> in a 6-CPU SMP guest running on a host with 2 quad-core
>> Intel Xeon E5520 with hyperthreading enabled
>> we see one or more guest CPUs working in a very strange
>> pattern. It looks like all or nothing. We can easily identify
>> the affected CPU with xosview. Here is the mpstat output
>> compared to one regular working CPU:
>>
>> mpstat -P 4 1
>> Linux 2.6.31-16-generic-pae (guest) 21.03.2010 _i686_ (6 CPU)
>> 00:45:19 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
>> 00:45:20   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:21   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:22   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:23   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:24   4    0,00  66,67   0,00    0,00   0,00  33,33   0,00   0,00   0,00
>> 00:45:25   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:26   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>
> Looks like the guest is only receiving 3-4 timer interrupts per second, so time becomes quantized.
>
> Please run the attached irqtop in the affected guest and report the results.
>
> Is the host overly busy? What host kernel, kvm, and qemu are you running? Is the guest running an I/O workload? If so, how are the disks
The host is not busy at all. In fact, currently it is running only one guest. The host is running an Ubuntu 2.6.31-14-server kernel. qemu-kvm is 0.12.2-0ubuntu6. The kvm module has srcversion: 82D6B673524596F9CF3E84C as stated by modinfo. The guest occasionally runs an I/O workload. However, the effect is visible all the time. And it affects only one out of the 6 CPUs of the very same guest.
This is the output on the guest for all CPUs: mpstat -P ALL 1
12:45:59 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:00 all    0,40   9,74   2,39    5,37   0,80   3,98   0,00   0,00  77,34
12:46:00   0    1,00   5,00   6,00    3,00   1,00   9,00   0,00   0,00  75,00
12:46:00   1    0,00  23,00   2,00   10,00   0,00   0,00   0,00   0,00  65,00
12:46:00   2    0,00   5,94   0,99    6,93   0,00   1,98   0,00   0,00  84,16
12:46:00   3    0,00   8,00   2,00    5,00   2,00   9,00   0,00   0,00  74,00
12:46:00   4    0,00  33,33   0,00    0,00   0,00   0,00   0,00   0,00  66,67
12:46:00   5    0,00   5,94   0,00    3,96   0,00   0,99   0,00   0,00  89,11
12:46:00 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:01 all    0,60   5,81   3,21   24,45   0,40   3,61   0,00   0,00  61,92
12:46:01   0    1,01   4,04   7,07   31,31   1,01   6,06   0,00   0,00  49,49
12:46:01   1    0,00   5,00   2,00   19,00   0,00   2,00   0,00   0,00  72,00
12:46:01   2    0,99   7,92   1,98   35,64   0,00   2,97   0,00   0,00  50,50
12:46:01   3    1,98   4,95   2,97   13,86   0,00   6,93   0,00   0,00  69,31
12:46:01   4    0,00  33,33   0,00    0,00   0,00   0,00   0,00   0,00  66,67
12:46:01   5    0,00   8,08   3,03   22,22   0,00   1,01   0,00   0,00  65,66
12:46:01 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:02 all    2,38  12,70  17,06   14,68   0,60   1,98   0,00   0,00  50,60
12:46:02   0    3,96  15,84   9,90   13,86   0,00   2,97   0,00   0,00  53,47
12:46:02   1    2,97   6,93   5,94   19,80   2,97   2,97   0,00   0,00  58,42
12:46:02   2    2,02  17,17   8,08   18,18   2,02   1,01   0,00   0,00  51,52
12:46:02   3    2,02  10,10   8,08   14,14   0,00   2,02   0,00   0,00  63,64
12:46:02   4    0,00   0,00   0,00    0,00   0,00   0,00   0,00   0,00 100,00
12:46:02   5    0,00  13,00  55,00    6,00   0,00   1,00   0,00   0,00  25,00
12:46:02 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:03 all    0,20  11,35  10,96    8,96   0,40   2,99   0,00   0,00  65,14
12:46:03   0    1,00  11,00   7,00   15,00   0,00   1,00   0,00   0,00  65,00
12:46:03   1    0,00   7,14   2,04    6,12   1,02  11,22   0,00   0,00  72,45
12:46:03   2    0,00  15,00   1,00   12,00   0,00   1,00   0,00   0,00  71,00
12:46:03   3    0,00  11,00  23,00    8,00   0,00   0,00   0,00   0,00  58,00
12:46:03   4    0,00   0,00  50,00    0,00   0,00   0,00   0,00   0,00  50,00
12:46:03   5    0,00  13,00  20,00    4,00   0,00   1,00   0,00   0,00  62,00
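Avi's irqtop tool and the suggested 'cat /proc/interrupts' both boil down to reading per-CPU interrupt counters. As a rough illustration (this is a hypothetical helper, not the actual irqtop script), the per-CPU column of /proc/interrupts can be extracted like this:

```python
def irq_counts_for_cpu(proc_interrupts: str, cpu: int):
    """Return {irq_name: count} for a single CPU, parsed from the text
    of /proc/interrupts. Sampling this twice and diffing the values for
    the timer line would show whether cpu4's timer interrupt is dead."""
    lines = proc_interrupts.strip().splitlines()
    header = lines[0].split()            # e.g. ['CPU0', 'CPU1', ...]
    col = header.index('CPU%d' % cpu)
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(':'):
            continue
        name = fields[0].rstrip(':')
        try:
            counts[name] = int(fields[1 + col])
        except (IndexError, ValueError):
            continue  # summary rows (ERR:, MIS:) lack per-CPU columns
    return counts
```

Run against two snapshots taken a second apart, a healthy CPU shows the timer count advancing by roughly HZ, while the misbehaving CPU 4 in the trace above would advance by only 3-4.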
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 01:34 PM, Michael S. Tsirkin wrote: On Sun, Mar 21, 2010 at 12:29:31PM +0200, Avi Kivity wrote: On 03/21/2010 12:15 PM, Michael S. Tsirkin wrote: Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. Increase it to 200, then. OK. I think we'll also need a smarter allocator than bus->dev_count++ than we now have. Right? No, why? We'll run into problems if devices are created/removed in random order, won't we? unregister_dev() takes care of it. Eventually we'll want faster scanning than the linear search we employ now, though. Yes I suspect with 200 entries we will :). Let's just make it 16 for now? Let's make it 200 and fix the performance problems later. Making it 16 is just asking for trouble. -- error compiling committee.c: too many arguments to function
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Sun, Mar 21, 2010 at 12:29:31PM +0200, Avi Kivity wrote: > On 03/21/2010 12:15 PM, Michael S. Tsirkin wrote: Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. >>> Increase it to 200, then. >>> >> OK. I think we'll also need a smarter allocator >> than bus->dev_count++ than we now have. Right? >> > > No, why? We'll run into problems if devices are created/removed in random order, won't we? > Eventually we'll want faster scanning than the linear search we employ > now, though. Yes I suspect with 200 entries we will :). Let's just make it 16 for now? >>> Is the limit visible to userspace? If not, we need to expose it. >>> >> I don't think it's visible: it seems to be used in a single >> place in kvm. Let's add an ioctl? Note that qemu doesn't >> need it now ... >> > > We usually expose limits via KVM_CHECK_EXTENSION(KVM_CAP_BLAH). We can > expose it via KVM_CAP_IOEVENTFD (and need to reserve iodev entries for > those). > > -- > error compiling committee.c: too many arguments to function
Time and KVM - best practices
Hey, What is considered "best practice" when running a KVM host with a mixture of Linux and Windows guests? Currently I have ntpd running on the host, and I start my guests using "-rtc base=localtime,clock=host", with an extra "-tdf" added for Windows guests, just to keep their clock from drifting madly under load. But with this setup, all my guests are constantly 1-2 seconds behind the host. I can live with that for the Windows guests, as they are not running anything that depends heavily on the time being set perfectly, but for some of the Linux guests it's an issue. Would I be better off using ntpd and "-rtc base=localtime,clock=vm" for all the Linux guests, or is there some other magic way of ensuring that the clock is perfectly in sync with the host? Perhaps there is some kernel configuration I can do to optimize the host for KVM? I'm currently using QEMU PC emulator version 0.12.50 (qemu-kvm-devel) because version 0.12.30 did not work well at all with Windows guests, and the kernel in both host and Linux guests is 2.6.33.1 :o) /Thomas
Tracking KVM development
Hey all, I've recently started testing KVM as a possible virtualization solution for a bunch of servers, and so far things are going pretty well. My OS of choice is Slackware, and I usually just go with whatever kernel Slackware comes with. But with KVM I feel I might need to pay a bit more attention to that part of Slackware, as it appears to be a project in rapid development, so my questions concern how best to track and keep KVM up-to-date. Currently I upgrade to the latest stable kernel almost as soon as it's been released by Linus, and I track qemu-kvm using this Git repository: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git But should I perhaps also track the KVM modules, and if so, from where? Any and all suggestions for keeping a healthy and stable KVM setup running are more than welcome. :o) /Thomas
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
Wrong To: header. Ignore please. On Sun, Mar 21, 2010 at 01:06:02PM +0200, Gleb Natapov wrote: > Make sure that rflags is committed only after successful instruction > emulation. > > Signed-off-by: Gleb Natapov > --- > arch/x86/include/asm/kvm_emulate.h |1 + > arch/x86/kvm/emulate.c |1 + > arch/x86/kvm/x86.c |8 ++-- > 3 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_emulate.h > b/arch/x86/include/asm/kvm_emulate.h > index b5e12c5..a1319c8 100644 > --- a/arch/x86/include/asm/kvm_emulate.h > +++ b/arch/x86/include/asm/kvm_emulate.h > @@ -136,6 +136,7 @@ struct x86_emulate_ops { > ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu); > void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu); > int (*cpl)(struct kvm_vcpu *vcpu); > + void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags); > }; > > /* Type, address-of, and value of an instruction's operand. */ > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > index 266576c..c1aa983 100644 > --- a/arch/x86/kvm/emulate.c > +++ b/arch/x86/kvm/emulate.c > @@ -2968,6 +2968,7 @@ writeback: > /* Commit shadow register state. */ > memcpy(ctxt->vcpu->arch.regs, c->regs, sizeof c->regs); > kvm_rip_write(ctxt->vcpu, c->eip); > + ops->set_rflags(ctxt->vcpu, ctxt->eflags); > > done: > return (rc == X86EMUL_UNHANDLEABLE) ? 
-1 : 0; > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index bb9a24a..3fa70b3 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3643,6 +3643,11 @@ static void emulator_set_segment_selector(u16 sel, int > seg, > kvm_set_segment(vcpu, &kvm_seg, seg); > } > > +static void emulator_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags) > +{ > + kvm_x86_ops->set_rflags(vcpu, rflags); > +} > + > static struct x86_emulate_ops emulate_ops = { > .read_std= kvm_read_guest_virt_system, > .write_std = kvm_write_guest_virt_system, > @@ -3660,6 +3665,7 @@ static struct x86_emulate_ops emulate_ops = { > .get_cr = emulator_get_cr, > .set_cr = emulator_set_cr, > .cpl = emulator_get_cpl, > + .set_rflags = emulator_set_rflags, > }; > > static void cache_all_regs(struct kvm_vcpu *vcpu) > @@ -3780,8 +3786,6 @@ restart: > return EMULATE_DO_MMIO; > } > > - kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags); > - > if (vcpu->mmio_is_write) { > vcpu->mmio_needed = 0; > return EMULATE_DO_MMIO; > -- > 1.6.5 > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
Decode CMPXCHG8B destination operand in decoding stage. Fixes regression introduced by "If LOCK prefix is used dest arg should be memory" commit. This commit relies on dst operand be decoded at the beginning of an instruction emulation. Signed-off-by: Gleb Natapov --- arch/x86/kvm/emulate.c | 24 ++-- 1 files changed, 10 insertions(+), 14 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index c1aa983..904351e 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -52,6 +52,7 @@ #define DstMem (3<<1) /* Memory operand. */ #define DstAcc (4<<1) /* Destination Accumulator */ #define DstDI (5<<1) /* Destination is in ES:(E)DI */ +#define DstMem64(6<<1) /* 64bit memory operand */ #define DstMask (7<<1) /* Source operand type. */ #define SrcNone (0<<4) /* No source operand. */ @@ -360,7 +361,7 @@ static u32 group_table[] = { DstMem | SrcImmByte | ModRM, DstMem | SrcImmByte | ModRM | Lock, DstMem | SrcImmByte | ModRM | Lock, DstMem | SrcImmByte | ModRM | Lock, [Group9*8] = - 0, ImplicitOps | ModRM | Lock, 0, 0, 0, 0, 0, 0, + 0, DstMem64 | ModRM | Lock, 0, 0, 0, 0, 0, 0, }; static u32 group2_table[] = { @@ -1205,6 +1206,7 @@ done_prefixes: c->twobyte && (c->b == 0xb6 || c->b == 0xb7)); break; case DstMem: + case DstMem64: if ((c->d & ModRM) && c->modrm_mod == 3) { c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; c->dst.type = OP_REG; @@ -1214,7 +1216,10 @@ done_prefixes: } c->dst.type = OP_MEM; c->dst.ptr = (unsigned long *)c->modrm_ea; - c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; + if ((c->d & DstMask) == DstMem64) + c->dst.bytes = 8; + else + c->dst.bytes = (c->d & ByteOp) ? 
1 : c->op_bytes; c->dst.val = 0; if (c->d & BitOp) { unsigned long mask = ~(c->dst.bytes * 8 - 1); @@ -1706,12 +1711,7 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops) { struct decode_cache *c = &ctxt->decode; - u64 old, new; - int rc; - - rc = ops->read_emulated(c->modrm_ea, &old, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; + u64 old = c->dst.orig_val; if (((u32) (old >> 0) != (u32) c->regs[VCPU_REGS_RAX]) || ((u32) (old >> 32) != (u32) c->regs[VCPU_REGS_RDX])) { @@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, c->regs[VCPU_REGS_RAX] = (u32) (old >> 0); c->regs[VCPU_REGS_RDX] = (u32) (old >> 32); ctxt->eflags &= ~EFLG_ZF; - } else { - new = ((u64)c->regs[VCPU_REGS_RCX] << 32) | + c->dst.val = ((u64)c->regs[VCPU_REGS_RCX] << 32) | (u32) c->regs[VCPU_REGS_RBX]; - rc = ops->cmpxchg_emulated(c->modrm_ea, &old, &new, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; ctxt->eflags |= EFLG_ZF; + c->lock_prefix = 1; } return X86EMUL_CONTINUE; } @@ -3241,7 +3238,6 @@ twobyte_insn: rc = emulate_grp9(ctxt, ops); if (rc != X86EMUL_CONTINUE) goto done; - c->dst.type = OP_NONE; break; } goto writeback; -- 1.6.5 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
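The architectural behaviour that emulate_grp9() implements can be sketched outside the kernel. The following Python model (hypothetical helper names, not the kvm code) captures the CMPXCHG8B semantics the patch preserves: compare EDX:EAX with the 64-bit memory operand; on a match store ECX:EBX there and set ZF, otherwise load the old value into EDX:EAX and clear ZF.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def cmpxchg8b(mem, eax, edx, ecx, ebx):
    """Model of CMPXCHG8B m64: returns (new_mem, eax, edx, zf).

    mem is the 64-bit destination operand; the registers are 32-bit.
    """
    old = mem & MASK64
    expected = (edx << 32) | eax
    if old == expected:
        # Match: store ECX:EBX into the destination, set ZF.
        return ((ecx << 32) | ebx, eax, edx, 1)
    # No match: load the old value into EDX:EAX, clear ZF.
    return (old, old & 0xFFFFFFFF, old >> 32, 0)
```

In the success path the destination is written back, which is why the patch routes the new value through c->dst.val and lets the common writeback code perform the (possibly locked) store, instead of calling cmpxchg_emulated directly.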
[PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
Make sure that rflags is committed only after successful instruction emulation. Signed-off-by: Gleb Natapov --- arch/x86/include/asm/kvm_emulate.h |1 + arch/x86/kvm/emulate.c |1 + arch/x86/kvm/x86.c |8 ++-- 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index b5e12c5..a1319c8 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -136,6 +136,7 @@ struct x86_emulate_ops { ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu); void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu); int (*cpl)(struct kvm_vcpu *vcpu); + void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags); }; /* Type, address-of, and value of an instruction's operand. */ diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 266576c..c1aa983 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2968,6 +2968,7 @@ writeback: /* Commit shadow register state. */ memcpy(ctxt->vcpu->arch.regs, c->regs, sizeof c->regs); kvm_rip_write(ctxt->vcpu, c->eip); + ops->set_rflags(ctxt->vcpu, ctxt->eflags); done: return (rc == X86EMUL_UNHANDLEABLE) ? 
-1 : 0; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index bb9a24a..3fa70b3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3643,6 +3643,11 @@ static void emulator_set_segment_selector(u16 sel, int seg, kvm_set_segment(vcpu, &kvm_seg, seg); } +static void emulator_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags) +{ + kvm_x86_ops->set_rflags(vcpu, rflags); +} + static struct x86_emulate_ops emulate_ops = { .read_std= kvm_read_guest_virt_system, .write_std = kvm_write_guest_virt_system, @@ -3660,6 +3665,7 @@ static struct x86_emulate_ops emulate_ops = { .get_cr = emulator_get_cr, .set_cr = emulator_set_cr, .cpl = emulator_get_cpl, + .set_rflags = emulator_set_rflags, }; static void cache_all_regs(struct kvm_vcpu *vcpu) @@ -3780,8 +3786,6 @@ restart: return EMULATE_DO_MMIO; } - kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags); - if (vcpu->mmio_is_write) { vcpu->mmio_needed = 0; return EMULATE_DO_MMIO; -- 1.6.5 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
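The point of the patch above is ordering: rflags must be committed together with the other shadow registers, and only after emulation has succeeded, so a failed instruction leaves architectural state untouched. A small Python sketch of that commit discipline (the dict-based context is illustrative only, not the kvm structures):

```python
def emulate_and_commit(ctxt, execute):
    """Run execute() on shadow copies; commit regs and rflags only on success."""
    shadow_regs = dict(ctxt['regs'])     # shadow register state
    shadow_flags = ctxt['eflags']
    try:
        shadow_regs, shadow_flags = execute(shadow_regs, shadow_flags)
    except Exception:
        return -1    # emulation failed: nothing is written back
    # Writeback: registers, rip, and rflags are committed as one unit.
    ctxt['regs'] = shadow_regs
    ctxt['eflags'] = shadow_flags
    return 0
```

Before the patch, set_rflags ran on the x86.c side even on paths that bailed out of emulation; moving it into the emulator's writeback section gives the all-or-nothing behaviour modelled here.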
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 12:21 PM, Gleb Natapov wrote: On Sun, Mar 21, 2010 at 12:11:33PM +0200, Avi Kivity wrote: On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote: On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote: When creating a guest with 2 virtio-net interfaces, i am running into an issue causing the 2nd i/f to fall back to userspace virtio even when vhost is enabled. After some debugging, it turned out that the KVM_IOEVENTFD ioctl() call in qemu is failing with ENOSPC. This is because of the NR_IOBUS_DEVS(6) limit in the kvm_io_bus_register_dev() routine in the host kernel. I think we need to increase this limit if we want to support multiple network interfaces using vhost-net. Is there an alternate solution? Thanks Sridhar Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. Increase it to 200, then. Currently on each device read/write we iterate over all registered devices. This is not scalable. Yeah. We need first to drop the callback-based matching and replace it with explicit ranges, then to replace the search with a hash table for small ranges (keeping a linear search for large ranges, which can happen for coalesced mmio). -- error compiling committee.c: too many arguments to function
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 12:15 PM, Michael S. Tsirkin wrote: Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. Increase it to 200, then. OK. I think we'll also need a smarter allocator than bus->dev_count++ than we now have. Right? No, why? Eventually we'll want faster scanning than the linear search we employ now, though. Is the limit visible to userspace? If not, we need to expose it. I don't think it's visible: it seems to be used in a single place in kvm. Let's add an ioctl? Note that qemu doesn't need it now ... We usually expose limits via KVM_CHECK_EXTENSION(KVM_CAP_BLAH). We can expose it via KVM_CAP_IOEVENTFD (and need to reserve iodev entries for those). -- error compiling committee.c: too many arguments to function
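The failure mode Sridhar hit and the scalability worry raised in this thread can be shown with a toy model. This Python sketch (illustrative names only, not the kvm code) registers address ranges on a fixed-capacity bus and looks devices up with the same linear scan the kernel used at the time; registering past the capacity fails with ENOSPC, which is exactly what qemu's KVM_IOEVENTFD call reported:

```python
import errno

class IOBus:
    """Toy model of an I/O bus with a fixed device limit and linear lookup."""
    def __init__(self, capacity=200):
        self.capacity = capacity
        self.devs = []                     # list of (start, length, device)

    def register(self, start, length, dev):
        if len(self.devs) >= self.capacity:
            # Mirrors kvm_io_bus_register_dev() rejecting the 7th device
            # when NR_IOBUS_DEVS was 6.
            raise OSError(errno.ENOSPC, "io bus is full")
        self.devs.append((start, length, dev))

    def find(self, addr):
        # O(n) scan on every access: the cost Gleb points out grows
        # linearly with the limit, hence the call for a better index.
        for start, length, dev in self.devs:
            if start <= addr < start + length:
                return dev
        return None
```

With capacity 6 and two ioeventfds per virtio-net device, a second interface exhausts the bus; raising the capacity to 200 removes the ENOSPC but makes the linear find() the next thing to fix, which is the hash-table direction Avi outlines.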
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Sun, Mar 21, 2010 at 12:11:33PM +0200, Avi Kivity wrote:
> On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote:
>> On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
>>> When creating a guest with 2 virtio-net interfaces, I am running
>>> into an issue causing the 2nd i/f to fall back to userspace virtio
>>> even when vhost is enabled.
>>>
>>> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
>>> call in qemu is failing with ENOSPC. This is because of the
>>> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
>>> in the host kernel.
>>>
>>> I think we need to increase this limit if we want to support
>>> multiple network interfaces using vhost-net. Is there an alternate
>>> solution?
>>>
>>> Thanks
>>> Sridhar
>>
>> Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
>> any objections to increasing the limit to, say, 16? That would give
>> us 5 more devices on top of the current limit of 6 per guest.
>
> Increase it to 200, then.

Currently on each device read/write we iterate over all registered
devices. This is not scalable.

> Is the limit visible to userspace? If not, we need to expose it.
>
> --
> error compiling committee.c: too many arguments to function

--
Gleb.
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Sun, Mar 21, 2010 at 12:11:33PM +0200, Avi Kivity wrote:
> On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote:
>> On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
>>> When creating a guest with 2 virtio-net interfaces, I am running
>>> into an issue causing the 2nd i/f to fall back to userspace virtio
>>> even when vhost is enabled.
>>>
>>> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
>>> call in qemu is failing with ENOSPC. This is because of the
>>> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
>>> in the host kernel.
>>>
>>> I think we need to increase this limit if we want to support
>>> multiple network interfaces using vhost-net. Is there an alternate
>>> solution?
>>>
>>> Thanks
>>> Sridhar
>>
>> Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
>> any objections to increasing the limit to, say, 16? That would give
>> us 5 more devices on top of the current limit of 6 per guest.
>
> Increase it to 200, then.

OK. I think we'll also need a smarter allocator than the
bus->dev_count++ we have now. Right?

> Is the limit visible to userspace? If not, we need to expose it.

I don't think it's visible: it seems to be used in a single place in
kvm. Let's add an ioctl? Note that qemu doesn't need it now ...

> --
> error compiling committee.c: too many arguments to function
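For what it's worth, the bus->dev_count++ allocation MST asks about is just a
bump of a tail index, and Avi's "No, why?" holds even once unregistration is
considered. A hypothetical sketch (invented names, not the KVM code) using
the usual swap-with-last trick to keep the array dense without any free-list:

```python
class IoBusAlloc:
    """Toy model: bump allocation on register, swap-with-last on unregister."""

    LIMIT = 200  # stands in for the proposed NR_IOBUS_DEVS value

    def __init__(self):
        self.devs = []

    def register(self, dev):
        # Equivalent of bus->dev_count++: append at the tail, ENOSPC when full.
        if len(self.devs) >= self.LIMIT:
            raise OSError("ENOSPC")
        self.devs.append(dev)

    def unregister(self, dev):
        # Move the last entry into the vacated slot; the array stays dense,
        # so no smarter allocator is ever needed.
        i = self.devs.index(dev)
        self.devs[i] = self.devs[-1]
        self.devs.pop()

bus = IoBusAlloc()
for n in range(3):
    bus.register("dev%d" % n)
bus.unregister("dev0")
assert sorted(bus.devs) == ["dev1", "dev2"]
```

Since lookups scan (or hash) the whole array anyway, entry order does not
matter, which is what makes the swap legal.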
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote:
> On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
>> When creating a guest with 2 virtio-net interfaces, I am running
>> into an issue causing the 2nd i/f to fall back to userspace virtio
>> even when vhost is enabled.
>>
>> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
>> call in qemu is failing with ENOSPC. This is because of the
>> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
>> in the host kernel.
>>
>> I think we need to increase this limit if we want to support
>> multiple network interfaces using vhost-net. Is there an alternate
>> solution?
>>
>> Thanks
>> Sridhar
>
> Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
> any objections to increasing the limit to, say, 16? That would give
> us 5 more devices on top of the current limit of 6 per guest.

Increase it to 200, then.

Is the limit visible to userspace? If not, we need to expose it.

--
error compiling committee.c: too many arguments to function
Re: Strange CPU usage pattern in SMP guest
On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
> Hi *,
>
> in a 6-CPU SMP guest running on a host with two quad-core Intel Xeon
> E5520 CPUs with hyperthreading enabled, we see one or more guest CPUs
> working in a very strange pattern. It looks like all or nothing. We
> can easily identify the affected CPU with xosview. Here is the mpstat
> output compared to one regularly working CPU:
>
> mpstat -P 4 1
> Linux 2.6.31-16-generic-pae (guest)  21.03.2010  _i686_  (6 CPU)
>
> 00:45:19  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
> 00:45:20    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:21    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:22    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:23    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:24    4   0,00  66,67   0,00    0,00   0,00  33,33   0,00   0,00   0,00
> 00:45:25    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:26    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00

Looks like the guest is only receiving 3-4 timer interrupts per second,
so time becomes quantized. Please run the attached irqtop in the
affected guest and report the results.

Is the host overly busy? What host kernel, kvm, and qemu are you running?

Is the guest running an I/O workload? If so, how are the disks configured?
--
error compiling committee.c: too many arguments to function

#!/usr/bin/python

import curses
import sys, os, time, optparse

def read_interrupts():
    irq = {}
    proc = file('/proc/interrupts')
    nrcpu = len(proc.readline().split())
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        count = sum([int(x) for x in counts])
        try:
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

label_width = 30
number_width = 10

def tui(screen):
    curses.use_default_colors()
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        screen.addstr(0, 0, 'irqtop')
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()
    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)
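Avi's diagnosis (only 3-4 timer ticks per second reaching the guest) explains
the all-or-nothing mpstat pattern directly: CPU time is accounted in whole
ticks, credited to whatever state is running when a tick fires, so with about
three ticks per one-second sample the percentages can only come out as
multiples of one third. A toy illustration of the arithmetic, hypothetical
Python rather than mpstat's actual implementation:

```python
def sample_percentages(tick_states):
    """mpstat-style percentages for one sample interval.

    Each element of tick_states is the accounting state ('usr', 'nice',
    'soft', ...) that was charged for one timer tick in the interval.
    """
    total = len(tick_states)
    counts = {}
    for s in tick_states:
        counts[s] = counts.get(s, 0) + 1
    # With only `total` ticks, every percentage is a multiple of 100/total.
    return {s: round(100.0 * n / total, 2) for s, n in counts.items()}

# Three ticks, all charged to %nice: the 100,00 rows in the report.
assert sample_percentages(["nice", "nice", "nice"]) == {"nice": 100.0}

# Three ticks split 2/1: exactly the 66,67 / 33,33 row at 00:45:24.
assert sample_percentages(["nice", "nice", "soft"]) == {"nice": 66.67,
                                                        "soft": 33.33}
```

At a normal tick rate (hundreds of ticks per second) the same arithmetic
yields smooth percentages, which is why the coarse 100/66.67/33.33 values are
a strong hint that ticks are being lost or coalesced.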
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/20/2010 04:59 PM, Andrea Arcangeli wrote:
> On Fri, Mar 19, 2010 at 09:21:49AM +0200, Avi Kivity wrote:
>> On 03/19/2010 12:44 AM, Ingo Molnar wrote:
>>> Too bad - there was heavy initial opposition to the arch/x86
>>> unification as well [and heavy opposition to tools/perf/ as well],
>>> still both worked out extremely well :-)
>>
>> Did you forget that arch/x86 was a merging of a code fork that
>> happened several years previously? Maybe that fork shouldn't have
>> been done to begin with.
>
> We discussed and probably timidly tried to share the sharable
> initially, but we realized it was too time wasteful. In addition to
> having to adapt the code to 64-bit, we would also have had to
> constantly solve another problem on top of it (see the various _32/_64
> splits; those take time to achieve, maybe not huge time, but still
> definitely some time and effort). Even in retrospect I am quite sure
> the way x86-64 happened was optimal, and if we could go back we would
> do it again the exact same way, even if the final objective was to
> have a common arch/x86 (and thankfully Linus is flexible and smart
> enough to realize that code that doesn't risk destabilizing anything
> shouldn't be forced out just because it hasn't reached a totally
> theoretical-perfect-nitpicking-clean state yet). It's still a lot of
> work to do the unification later as a separate task, but it's not as
> if doing it immediately would have been much less work. It's about the
> same amount of effort, and we were able to defer it and decrease the
> time to market, which surely contributed to the success of x86-64.

In hindsight decisions are much easier. I agree it was less risky to
fork than to share. But if another instruction set forks out a 64-bit
not-exactly-compatible variant, I'm sure we'll start out shared and not
fork it, especially if the platform remains the same.

> The problem of qemu is not some lack of GUI or that it's not included
> in the linux kernel git tree; the definitive problem is how to merge
> qemu-kvm/kvm and qxl into it. If you (Avi) were the qemu maintainer I
> am sure there wouldn't be two trees, so as a developer I would totally
> love it, and I am sure that with you as maintainer it would have a
> chance to move forward with qxl on desktop virtualization without
> proposals to extend vnc instead to achieve a "similar" result (imagine
> if btrfs were published on a website and people started to discuss
> whether it should ever be merged because reinventing some part of
> btrfs inside ext5 might achieve "similar" results).

The qemu/qemu-kvm fork is definitely hurting. Some history: when kvm
started out I pulled qemu for fast hacking and, much like arch/x86_64, I
couldn't destabilize qemu for something that was completely experimental
(and closed source at the time). Moreover, it wasn't clear whether the
qemu community would be interested. The qemu-kvm fork was designed for
minimal intrusion so I could merge upstream qemu regularly. This
resulted in kvm integration that was fairly ugly. Later Anthony merged a
well-integrated alternative implementation (in retrospect this was a
mistake IMO - we were left with a well-tested, high-performing, ugly
implementation and a clean, slow, untested, and unfeatured one, and no
one who wants to merge the two). So now it is pretty confusing to read
the code, which has the two alternate implementations sometimes sharing
code and sometimes diverging.

> As for a GUI for KVM to use on desktop distributions, that is an
> irrelevant concern compared to the lack of a protocol more efficient
> than rdesktop/rdp/vnc for desktop virtualization. I have people asking
> me to migrate hundreds of desktops to desktop virtualization on KVM in
> their organizations, and I tell them to use spice because I believe
> it's the most efficient option available (at least as far as we stick
> to open source open protocols); there are universities using spice on
> thousands of student desktops, and I think we need paravirt graphics
> to happen ASAP in the main qemu tree too.

That effort will have to wait for the spice project to mature.

> In short: running KVM on the desktop is irrelevant compared to running
> the desktop on KVM, so I suggest focusing on what is more important
> first ;).

Anyone can focus on what interests them; if someone has an interest in a
good desktop-on-desktop experience, they should start hacking and
sending patches.

--
error compiling committee.c: too many arguments to function
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
> When creating a guest with 2 virtio-net interfaces, I am running
> into an issue causing the 2nd i/f to fall back to userspace virtio
> even when vhost is enabled.
>
> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
> call in qemu is failing with ENOSPC. This is because of the
> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
> in the host kernel.
>
> I think we need to increase this limit if we want to support
> multiple network interfaces using vhost-net. Is there an alternate
> solution?
>
> Thanks
> Sridhar

Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
any objections to increasing the limit to, say, 16? That would give us
5 more devices on top of the current limit of 6 per guest.

--
MST