Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On Sun, 2010-03-21 at 22:20 +0100, Ingo Molnar wrote:
> * Avi Kivity wrote:
>>> Well, for what it's worth, I rarely ever use anything else. My virtual
>>> disks are raw so I can loop mount them easily, and I can also switch my
>>> guest kernels from outside... without ever needing to mount those disks.
>>
>> Curious, what do you use them for?
>>
>> btw, if you build your kernel outside the guest, then you already have
>> access to all its symbols, without needing anything further.
>
> There are two errors with your argument:
>
> 1) you are assuming that it's only about kernel symbols
>
> Look at this 'perf report' output:
>
>  # Samples: 7127509216
>  #
>  # Overhead  Command      Shared Object  Symbol
>  # ........  .......  .................  ......
>  #
>      19.14%      git  git                [.] lookup_object
>      15.16%     perf  git                [.] lookup_object
>       4.74%     perf  libz.so.1.2.3      [.] inflate
>       4.52%      git  libz.so.1.2.3      [.] inflate
>       4.21%     perf  libz.so.1.2.3      [.] inflate_table
>       3.94%      git  libz.so.1.2.3      [.] inflate_table
>       3.29%      git  git                [.] find_pack_entry_one
>       3.24%      git  libz.so.1.2.3      [.] inflate_fast
>       2.96%     perf  libz.so.1.2.3      [.] inflate_fast
>       2.96%      git  git                [.] decode_tree_entry
>       2.80%     perf  libc-2.11.90.so    [.] __strlen_sse42
>       2.56%      git  libc-2.11.90.so    [.] __strlen_sse42
>       1.98%     perf  libc-2.11.90.so    [.] __GI_memcpy
>       1.71%     perf  git                [.] decode_tree_entry
>       1.53%      git  libc-2.11.90.so    [.] __GI_memcpy
>       1.48%      git  git                [.] lookup_blob
>       1.30%      git  git                [.] process_tree
>       1.30%     perf  git                [.] process_tree
>       0.90%     perf  git                [.] tree_entry
>       0.82%     perf  git                [.] lookup_blob
>       0.78%      git  [kernel.kallsyms]  [k] kstat_irqs_cpu
>
> kernel symbols are only a small portion of the symbols. (a single line in
> this case)

The above example shows that perf can summarize both kernel and application
hot functions. If we collect guest OS statistics from the host side, we
can't summarize detailed guest application info, because we can't get the
guest's application process IDs from the host side. So we could only get
detailed kernel info, plus the total utilization percentage of guest
application processes.

> To get to those other symbols we have to read the ELF symbols of those
> binaries in the guest filesystem, in the post-processing/reporting phase.
> This is both complex to do and relatively slow, so we don't want to (and
> cannot) do this at sample time from IRQ context or NMI context...
>
> Also, many aspects of reporting are interactive so it's done lazily or
> on-demand. So we need ready access to the guest filesystem - for those
> guests which decide to integrate with the host for this.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
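[Editorial aside: to make the post-processing step described above concrete, resolving a sample once the per-binary ELF symbol tables have been read is just an ordered lookup, which is why it can be done lazily at report time rather than in NMI context. A minimal sketch; the symbol names and addresses are invented for illustration and this is not perf's actual code:]

```python
import bisect

# Hypothetical symbol table, as a reporting tool might read it from an ELF
# binary's .symtab section: (start_address, name) pairs sorted by address.
SYMTAB = [
    (0x1000, "lookup_object"),
    (0x1800, "find_pack_entry_one"),
    (0x2200, "decode_tree_entry"),
]
ADDRS = [addr for addr, _ in SYMTAB]

def resolve(sample_addr):
    """Map a sampled instruction pointer to the enclosing function name."""
    # Find the last symbol whose start address is <= the sample address.
    i = bisect.bisect_right(ADDRS, sample_addr) - 1
    return SYMTAB[i][1] if i >= 0 else "[unknown]"
```

The binary search makes each lookup cheap, but building SYMTAB requires opening and parsing the guest's binaries — exactly the part that needs filesystem access.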
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 11:52 PM, Ingo Molnar wrote:
> * Avi Kivity wrote:
>>> I.e. you are arguing for microkernel Linux, while you see me as arguing
>>> for a monolithic kernel.
>>
>> No. I'm arguing for reducing bloat wherever possible. Kernel code is
>> more expensive than userspace code in every metric possible.
>
> 1) One of the primary design arguments of the micro-kernel design as well
> was to push as much into user-space as possible without impacting
> performance too much, so you very much seem to be arguing for a
> micro-kernel design for the kernel. I think history has given us the
> answer for that fight between microkernels and monolithic kernels.

I am not arguing for a microkernel. Again: reduce bloat where possible;
kernel code is more expensive than userspace code.

> Furthermore, to not engage in hypotheticals about microkernels: by your
> argument the Oprofile design was perfect (it was minimalistic
> kernel-space, with all the complexity in user-space), while perf was
> over-complex (it does many things in the kernel that could have been done
> in user-space). Practical results suggest the exact opposite happened -
> Oprofile is being replaced by perf. How do you explain that?

I did not say that the amount of kernel and userspace code is the only
factor deciding the quality of software. If that were so, microkernels
would have won out long ago. It may be that perf has too much kernel code,
and won against oprofile despite that because it was better in other areas.
Or it may be that perf has exactly the right user/kernel division. Or
maybe perf needs some of the code moved from userspace to the kernel. I
don't know; I haven't examined the code.

The user/kernel boundary is only one metric for code quality. Nor is it
always in favour of pushing things to userspace: narrowing or simplifying
an interface is often an argument in favour of pushing things into the
kernel.

IMO the reason perf is more usable than oprofile has less to do with the
kernel/userspace boundary and more to do with the effort and attention
spent on the userspace/user boundary.

> 2) In your analysis you again ignore the package boundary costs and
> artifacts, as if they didn't exist. That was my main argument, and that
> is what we saw with oprofile and perf: while maintaining more kernel code
> may be more expensive, it sure pays off for getting us a much better
> solution in the end.

Package costs are real. We need to bear them. I don't think that because
maintaining another package (and the interface between two packages) is
more difficult, the kernel size should increase.

> And getting a 'much better solution' to users is the goal of all this,
> isn't it? I don't mind what you call 'bloat' per se if it's for a purpose
> that users find like a good deal. I have quite a bit of RAM in most of my
> systems; having 50K more or less included in the kernel image is far less
> important than having a healthy and vibrant development model and having
> satisfied users ...

I'm not worried about the 50K or so; I'm worried about a bug in those 50K
taking down the guest.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:37 PM, Ingo Molnar wrote:
>> That includes the guest kernel. If you can deploy a new kernel in the
>> guest, presumably you can deploy a userspace package.
>
> Note that with perf we can instrument the guest with zero guest-kernel
> modifications as well. We try to reduce the guest impact to a bare
> minimum, as the difficulties in deployment are a function of the cross
> section surface to the guest.
>
> Also, note that the kernel is special with regard to instrumentation:
> since this is the kernel project, we are doing kernel-space changes, as
> we are doing them _anyway_. So adding symbol resolution capabilities
> would be a minimal addition to that - while adding a whole new guest
> package for the daemon would significantly increase the cross section
> surface.

It's true that for us, changing the kernel is easier than changing the
rest of the guest. IMO we should still resist the temptation to take the
easy path, and do the right thing (I understand we disagree about what the
right thing is).

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 11:20 PM, Ingo Molnar wrote:
> * Avi Kivity wrote:
>>> Well, for what it's worth, I rarely ever use anything else. My virtual
>>> disks are raw so I can loop mount them easily, and I can also switch my
>>> guest kernels from outside... without ever needing to mount those disks.
>>
>> Curious, what do you use them for?
>>
>> btw, if you build your kernel outside the guest, then you already have
>> access to all its symbols, without needing anything further.
>
> There are two errors with your argument:
>
> 1) you are assuming that it's only about kernel symbols
>
> Look at this 'perf report' output:
>
>  # Samples: 7127509216
>  #
>  # Overhead  Command      Shared Object  Symbol
>  #
>      19.14%      git  git                [.] lookup_object
>  [... same 'perf report' output as quoted earlier in the thread ...]
>       0.78%      git  [kernel.kallsyms]  [k] kstat_irqs_cpu
>
> kernel symbols are only a small portion of the symbols. (a single line in
> this case)
>
> To get to those other symbols we have to read the ELF symbols of those
> binaries in the guest filesystem, in the post-processing/reporting phase.
> This is both complex to do and relatively slow, so we don't want to (and
> cannot) do this at sample time from IRQ context or NMI context...

Okay. So a symbol server is necessary. Still, I don't think -kernel is a
good reason for including the symbol server in the kernel itself. If
someone uses it extensively together with perf, _and_ they can't put the
symbol server in the guest for some reason, let them patch mkinitrd to
include it.

> Also, many aspects of reporting are interactive so it's done lazily or
> on-demand. So we need ready access to the guest filesystem - for those
> guests which decide to integrate with the host for this.
>
> 2) the 'SystemTap mistake'
>
> You are assuming that the symbols of the kernel when it got built got
> saved properly and are discoverable easily. In reality those symbols can
> be erased by a make clean, can be modified by a new build, can be
> misplaced, and can generally be hard to find because each distro puts
> them in a different installation path. My 10+ years of experience with
> kernel instrumentation solutions is that kernel-driven, self-sufficient,
> robust, trustable, well-enumerated sources of information work far better
> in practice.

What about line number information? And the source? Into the kernel with
them as well?

> The thing is, in this thread I'm forced to repeat the same basic facts
> again and again. Could you _PLEASE_, pretty please, when it comes to
> instrumentation details, at least _read the mails_ of the guys who
> actually ... write and maintain Linux instrumentation code? This is
> getting ridiculous really.

I've read every one of your emails. If I misunderstood or overlooked
something, I apologize. The thread is very long and at times antagonistic,
so it's hard to keep all the details straight.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.
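[Editorial aside: the "SystemTap mistake" described above — symbols erased by a make clean, modified by a rebuild, or misplaced by the distro — is the problem that content-addressed symbol caches solve: debug artifacts are filed under an identity derived from the binary's contents rather than its path. A rough sketch of the idea, not perf's actual implementation; the file-content hash stands in for reading the ELF .note.gnu.build-id section:]

```python
import hashlib
import os
import shutil

def build_id_of(path):
    # Stand-in for the ELF build-id note: hashing the contents gives the
    # same "identity survives rename/move, changes on rebuild" property.
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

def stash_symbols(binary_path, cache_dir):
    """File a copy of the binary under its content identity, so a later
    report still finds matching symbols even if the original was rebuilt,
    'make clean'-ed, or installed somewhere unexpected."""
    bid = build_id_of(binary_path)
    dest = os.path.join(cache_dir, bid[:2], bid[2:])
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.copyfile(binary_path, dest)
    return dest
```

The two-level directory layout (first two hex digits, then the rest) simply keeps any single directory from growing huge.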
Re: [PATCH 0/2] qemu-kvm: Introduce wrapper functions to access phys_ram_dirty, and replace existing direct accesses to it.
Marcelo Tosatti wrote:
> On Wed, Mar 17, 2010 at 02:51:46PM +0900, Yoshiaki Tamura wrote:
>> Before replacing the byte-based dirty bitmap with a bit-based dirty
>> bitmap, clearing out direct accesses to the bitmap first seems to be a
>> good point to start with.
>>
>> This patch set is based on the following discussion.
>> http://www.mail-archive.com/kvm@vger.kernel.org/msg30724.html
>>
>> Thanks,
>> Yoshi
>
> Looks fine to me. This is qemu upstream material, though.

Thanks for your comment. I should have removed qemu-kvm from the title.
Should I rebase the patch to qemu.git and repost?
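[Editorial aside: the point of routing all accesses through wrapper functions, as this patch set does, is that the bitmap's representation can later flip from one byte per page to one bit per page (an 8x memory saving) by changing only the helpers. The real qemu code is C; this is an illustrative Python model of the bit-based accessors, with the page size assumed to be the usual 4 KB:]

```python
TARGET_PAGE_BITS = 12  # 4 KB pages, as on x86

class DirtyBitmap:
    """One bit per guest page; callers never touch self.bits directly."""

    def __init__(self, ram_size):
        npages = ram_size >> TARGET_PAGE_BITS
        self.bits = bytearray((npages + 7) // 8)

    def set_dirty(self, addr):
        page = addr >> TARGET_PAGE_BITS
        self.bits[page >> 3] |= 1 << (page & 7)

    def get_dirty(self, addr):
        page = addr >> TARGET_PAGE_BITS
        return bool(self.bits[page >> 3] & (1 << (page & 7)))

    def clear_dirty(self, addr):
        page = addr >> TARGET_PAGE_BITS
        self.bits[page >> 3] &= ~(1 << (page & 7)) & 0xFF
```

With a byte-based map the same 1 MB of guest RAM would need 256 bytes of bookkeeping; the bit-based map needs 32.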
Re: Streaming Audio from Virtual Machine
On 03/21/2010 01:12 PM, Gus Zernial wrote:
> I'm using Kubuntu 9.10 32-bit on a quad-core Phenom II with Gigabit
> ethernet. I want to stream audio from MLB.com from a WinXP client thru a
> Linksys WMB54G wireless music bridge. Note that there are drivers for
> the WMB54G only for WinXP and Vista.
>
> If I stream the audio thru a native WinXP box thru the WMB54G, all is
> well and the audio sounds fine. When I try to stream thru a WinXP
> virtual machine on Kubuntu 9.10, the audio is poor quality and subject
> to gaps and dropping the stream altogether. So far I've tried KVM/QEMU
> and VirtualBox, same result.
>
> Regarding KVM/QEMU, I note AMD-V is activated in the BIOS, and I have a
> custom 2.6.32.7 kernel, and QEMU 0.11.0. The kvm and kvm_amd modules are
> compiled and loaded. I've been using bridged networking; I think it's
> set up correctly, but I confess I'm no networking expert. My start
> command for the WinXP virtual machine is:
>
> sudo /usr/bin/qemu -m 1024 -boot c \
>   -net nic,vlan=0,macaddr=00:d0:13:b0:2d:32,model=rtl8139 \
>   -net tap,vlan=0,ifname=tap0,script=/etc/qemu-ifup \
>   -localtime -soundhw ac97 -smp 4 -fda /dev/fd0 -vga std -usb \
>   /home/rbroman/windows.img
>
> I also tried model=virtio but that didn't help.
>
> I suspect this is a virtual machine networking problem but I'm not sure.
> So my questions are:
>
> - What's the best/fastest networking option and how do I set it up?
>   Pointers to step-by-step instructions appreciated.
>
> - Is it possible I have a problem other than networking? A configuration
>   problem with KVM/QEMU? Or could there be a problem with the WMB54G
>   driver when used thru a virtual machine?
>
> - Is there a better virtual machine solution than KVM/QEMU for what I'm
>   trying to do?

[dsa] I have been able to stream audio and video in a KVM-hosted WinXP VM,
and I have even watched a Netflix-based movie. My laptop has a Core 2 Duo
CPU, T9550, with 4 GB of RAM. Networking at home is through a wireless-N
router, and I use bridged networking and NAT for VMs.

Host activity definitely has an impact. When streaming I make sure I am
not doing any heavy activity in the host layer, and if I notice jitter the
first thing I do is up the priority of the VM threads using chrt.

David

> Recommendations appreciated - Gus
[KVM-AUTOTEST PATCH 3/5] KVM test: kvm_utils.load_env(): do not fail if env file is corrupted
- Include the unpickling code in the 'try' block, so that an exception
  raised during unpickling will not fail the test.
- Change the default env (returned by load_env() when the file is missing
  or corrupt) to {}.

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_utils.py |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index d386456..cc39b5d 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -22,7 +22,7 @@ def dump_env(obj, filename):
     file.close()
 
 
-def load_env(filename, default=None):
+def load_env(filename, default={}):
     """
     Load KVM test environment from an environment file.
 
@@ -30,11 +30,13 @@
     """
     try:
         file = open(filename, "r")
+        obj = cPickle.load(file)
+        file.close()
+        return obj
+    # Almost any exception can be raised during unpickling, so let's catch
+    # them all
     except:
         return default
-    obj = cPickle.load(file)
-    file.close()
-    return obj
 
 
 def get_sub_dict(dict, name):
-- 
1.5.4.1
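[Editorial aside: the behavioral change in the patch above can be seen in a few lines. This sketch has the same shape as the patched function but is modernized to Python 3's pickle module (the original uses Python 2's cPickle); any failure — missing file, truncated or corrupt pickle stream — falls back to the default instead of propagating and failing the test:]

```python
import pickle

def load_env(filename, default={}):
    # Mirrors the patched load_env(): the unpickling happens inside the
    # 'try' block, so a corrupt env file yields `default` rather than an
    # exception. Almost any exception can be raised during unpickling,
    # hence the broad catch.
    try:
        with open(filename, "rb") as f:
            return pickle.load(f)
    except Exception:
        return default
```

One caveat worth noting: a mutable default argument (`{}`) is created once and shared across calls, so callers that mutate the returned dict without reassigning it will see the mutation persist into later calls with a missing file.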
[KVM-AUTOTEST PATCH 5/5] KVM test: take frequent screendumps during all tests
Screendumps are taken regularly and converted to JPEG format. They are
stored in .../debug/screendumps_/. Requires python-imaging.

- Enabled by 'take_regular_screendumps = yes' (naming suggestions welcome).
- Delay between screendumps is controlled by 'screendump_delay' (default 5).
- Compression quality is controlled by 'screendump_quality' (default 30).
- It's probably a good idea to dump them to /dev/shm before converting
  them, in order to minimize disk use. This can be enabled by
  'screendump_temp_dir = /dev/shm' (commented out by default because I'm
  not sure /dev/shm is available on all machines.)
- Screendumps are removed unless 'keep_screendumps'['_on_error'] is 'yes'.
  The recommended setting when submitting jobs from autoserv is
  'keep_screendumps_on_error = yes', which means screendumps are kept only
  if the test fails. Keeping all screendumps may use up all of the
  server's storage space.

This patch sets reasonable defaults in tests_base.cfg.sample. (It also
makes sure post_command is executed last in the postprocessing procedure --
otherwise a post_command failure can prevent other postprocessing steps
(like removing the screendump dirs) from taking place.)

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_preprocessing.py  |   85 ++++++++++++++++++++++++---
 client/tests/kvm/tests_base.cfg.sample |   13 ++++-
 2 files changed, 89 insertions(+), 9 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index e3a5501..0e4ce87 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -1,4 +1,4 @@
-import sys, os, time, commands, re, logging, signal, glob
+import sys, os, time, commands, re, logging, signal, glob, threading, shutil
 from autotest_lib.client.bin import test, utils
 from autotest_lib.client.common_lib import error
 import kvm_vm, kvm_utils, kvm_subprocess, ppm_utils
@@ -11,6 +11,10 @@ except ImportError:
            'distro.')
 
 
+_screendump_thread = None
+_screendump_thread_termination_event = None
+
+
 def preprocess_image(test, params):
     """
     Preprocess a single QEMU image according to the instructions in params.
@@ -254,6 +258,14 @@ def preprocess(test, params, env):
     # Preprocess all VMs and images
     process(test, params, env, preprocess_image, preprocess_vm)
 
+    # Start the screendump thread
+    if params.get("take_regular_screendumps") == "yes":
+        global _screendump_thread, _screendump_thread_termination_event
+        _screendump_thread_termination_event = threading.Event()
+        _screendump_thread = threading.Thread(target=_take_screendumps,
+                                              args=(test, params, env))
+        _screendump_thread.start()
+
 
 def postprocess(test, params, env):
     """
@@ -263,8 +275,15 @@ def postprocess(test, params, env):
     @param params: Dict containing all VM and image parameters.
     @param env: The environment (a dict-like object).
     """
+    # Postprocess all VMs and images
     process(test, params, env, postprocess_image, postprocess_vm)
 
+    # Terminate the screendump thread
+    global _screendump_thread, _screendump_thread_termination_event
+    if _screendump_thread:
+        _screendump_thread_termination_event.set()
+        _screendump_thread.join(10)
+
     # Warn about corrupt PPM files
     for f in glob.glob(os.path.join(test.debugdir, "*.ppm")):
         if not ppm_utils.image_verify_ppm_file(f):
@@ -290,11 +309,13 @@ def postprocess(test, params, env):
         for f in glob.glob(os.path.join(test.debugdir, '*.ppm')):
             os.unlink(f)
 
-    # Execute any post_commands
-    if params.get("post_command"):
-        process_command(test, params, env, params.get("post_command"),
-                        int(params.get("post_command_timeout", "600")),
-                        params.get("post_command_noncritical") == "yes")
+    # Should we keep the screendump dirs?
+    if params.get("keep_screendumps") != "yes":
+        logging.debug("'keep_screendumps' not specified; removing "
+                      "screendump dirs...")
+        for d in glob.glob(os.path.join(test.debugdir, "screendumps_*")):
+            if os.path.isdir(d) and not os.path.islink(d):
+                shutil.rmtree(d, ignore_errors=True)
 
     # Kill all unresponsive VMs
     if params.get("kill_unresponsive_vms") == "yes":
@@ -318,6 +339,12 @@ def postprocess(test, params, env):
         env["tcpdump"].close()
         del env["tcpdump"]
 
+    # Execute any post_commands
+    if params.get("post_command"):
+        process_command(test, params, env, params.get("post_command"),
+                        int(params.get("post_command_timeout", "600")),
+                        params.get("post_command_noncritical") == "yes")
+
 
 def postprocess_on_error(test, params, env):
     """
@@ -343,3 +370,49 @@
 def _update_address_cache(address_cache, line):
     mac_addre
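[Editorial aside: the start/stop protocol in the patch above — a worker thread paired with a termination Event, set and joined with a timeout in postprocess() — reduces to a small reusable pattern. A simplified sketch with a generic action standing in for the actual screendump code:]

```python
import threading

def periodic(action, delay, stop_event):
    """Run action() every `delay` seconds until stop_event is set.

    Using Event.wait() as the sleep means the thread wakes up as soon as
    postprocess() requests termination, instead of sleeping out the full
    delay; wait() returns True when the event was set."""
    while True:
        action()
        if stop_event.wait(delay):
            break
```

Usage mirrors the patch: create the Event and Thread at preprocess time, then `event.set(); thread.join(10)` at postprocess time so a stuck worker cannot hang the test forever.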
[KVM-AUTOTEST PATCH 4/5] KVM test: make kvm_stat usage optional
Relying on the test tag is not cool. Use a dedicated parameter instead.
By default, all tests except build tests will use kvm_stat.

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_utils.py          |    8 ++++----
 client/tests/kvm/tests_base.cfg.sample |    3 +++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index cc39b5d..5834539 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -845,8 +845,8 @@ def run_tests(test_list, job):
     @return: True, if all tests ran passed, False if any of them failed.
     """
     status_dict = {}
-
     failed = False
+
     for dict in test_list:
         if dict.get("skip") == "yes":
             continue
@@ -863,12 +863,12 @@ def run_tests(test_list, job):
         test_tag = dict.get("shortname")
         # Setting up kvm_stat profiling during test execution.
         # We don't need kvm_stat profiling on the build tests.
-        if "build" in test_tag:
+        if dict.get("run_kvm_stat") == "yes":
+            profile = True
+        else:
             # None because it's the default value on the base_test class
             # and the value None is specifically checked there.
             profile = None
-        else:
-            profile = True
 
         if profile:
             job.profilers.add('kvm_stat')
diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
index 9963a44..b13aec4 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -40,6 +40,9 @@
 nic_mode = user
 nic_script = scripts/qemu-ifup
 address_index = 0
 
+# Misc
+run_kvm_stat = yes
+
 # Tests variants:
-- 
1.5.4.1
[KVM-AUTOTEST PATCH 2/5] KVM test: kvm.py: make sure all dump_env() calls are inside 'finally' blocks
Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm.py |   29 +++++++++++++++++++----------
 1 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/client/tests/kvm/kvm.py b/client/tests/kvm/kvm.py
index 9b8a10c..c6e146d 100644
--- a/client/tests/kvm/kvm.py
+++ b/client/tests/kvm/kvm.py
@@ -21,6 +21,7 @@ class kvm(test.test):
            (Online doc - Getting started with KVM testing)
     """
     version = 1
+
     def run_once(self, params):
         # Report the parameters we've received and write them as keyvals
         logging.debug("Test parameters:")
@@ -33,7 +34,7 @@ class kvm(test.test):
         # Open the environment file
         env_filename = os.path.join(self.bindir, params.get("env", "env"))
         env = kvm_utils.load_env(env_filename, {})
-        logging.debug("Contents of environment: %s" % str(env))
+        logging.debug("Contents of environment: %s", str(env))
 
         try:
             try:
@@ -50,22 +51,30 @@ class kvm(test.test):
                 f.close()
 
                 # Preprocess
-                kvm_preprocessing.preprocess(self, params, env)
-                kvm_utils.dump_env(env, env_filename)
+                try:
+                    kvm_preprocessing.preprocess(self, params, env)
+                finally:
+                    kvm_utils.dump_env(env, env_filename)
 
                 # Run the test function
                 run_func = getattr(test_module, "run_%s" % t_type)
-                run_func(self, params, env)
-                kvm_utils.dump_env(env, env_filename)
+                try:
+                    run_func(self, params, env)
+                finally:
+                    kvm_utils.dump_env(env, env_filename)
 
             except Exception, e:
                 logging.error("Test failed: %s", e)
                 logging.debug("Postprocessing on error...")
-                kvm_preprocessing.postprocess_on_error(self, params, env)
-                kvm_utils.dump_env(env, env_filename)
+                try:
+                    kvm_preprocessing.postprocess_on_error(self, params, env)
+                finally:
+                    kvm_utils.dump_env(env, env_filename)
                 raise
 
         finally:
             # Postprocess
-            kvm_preprocessing.postprocess(self, params, env)
-            logging.debug("Contents of environment: %s", str(env))
-            kvm_utils.dump_env(env, env_filename)
+            try:
+                kvm_preprocessing.postprocess(self, params, env)
+            finally:
+                kvm_utils.dump_env(env, env_filename)
+                logging.debug("Contents of environment: %s", str(env))
-- 
1.5.4.1
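[Editorial aside: the pattern this patch applies to each phase — preprocess, run, error postprocess, final postprocess — is worth stating on its own: whether or not a step raises, the (possibly modified) environment is persisted before the exception propagates, so the next run never sees stale state. A generic sketch:]

```python
def run_step(step, env, save):
    # Equivalent to the patched shape:
    #     try:
    #         kvm_preprocessing.preprocess(self, params, env)
    #     finally:
    #         kvm_utils.dump_env(env, env_filename)
    # The 'finally' guarantees save() runs on both the success and the
    # failure path; an exception from step() still propagates afterwards.
    try:
        step(env)
    finally:
        save(env)
```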
[KVM-AUTOTEST PATCH 1/5] KVM test: kvm_preprocessing.py: minor style corrections
Also, fetch the KVM version before setting up the VMs.

Signed-off-by: Michael Goldish
---
 client/tests/kvm/kvm_preprocessing.py |   58 ++++++++++++++--------------
 1 files changed, 27 insertions(+), 31 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index e91d1da..e3a5501 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -58,8 +58,8 @@ def preprocess_vm(test, params, env, name):
     for_migration = False
 
     if params.get("start_vm_for_migration") == "yes":
-        logging.debug("'start_vm_for_migration' specified; (re)starting VM with"
-                      " -incoming option...")
+        logging.debug("'start_vm_for_migration' specified; (re)starting VM "
+                      "with -incoming option...")
         start_vm = True
         for_migration = True
     elif params.get("restart_vm") == "yes":
@@ -187,12 +187,12 @@ def preprocess(test, params, env):
     @param env: The environment (a dict-like object).
     """
     # Start tcpdump if it isn't already running
-    if not env.has_key("address_cache"):
+    if "address_cache" not in env:
         env["address_cache"] = {}
-    if env.has_key("tcpdump") and not env["tcpdump"].is_alive():
+    if "tcpdump" in env and not env["tcpdump"].is_alive():
         env["tcpdump"].close()
         del env["tcpdump"]
-    if not env.has_key("tcpdump"):
+    if "tcpdump" not in env:
         command = "/usr/sbin/tcpdump -npvi any 'dst port 68'"
         logging.debug("Starting tcpdump (%s)...", command)
         env["tcpdump"] = kvm_subprocess.kvm_tail(
@@ -208,35 +208,23 @@ def preprocess(test, params, env):
     # Destroy and remove VMs that are no longer needed in the environment
     requested_vms = kvm_utils.get_sub_dict_names(params, "vms")
-    for key in env.keys():
+    for key in env:
         vm = env[key]
         if not kvm_utils.is_vm(vm):
             continue
         if not vm.name in requested_vms:
-            logging.debug("VM '%s' found in environment but not required for"
-                          " test; removing it..." % vm.name)
+            logging.debug("VM '%s' found in environment but not required "
+                          "for test; removing it..." % vm.name)
             vm.destroy()
             del env[key]
 
-    # Execute any pre_commands
-    if params.get("pre_command"):
-        process_command(test, params, env, params.get("pre_command"),
-                        int(params.get("pre_command_timeout", "600")),
-                        params.get("pre_command_noncritical") == "yes")
-
-    # Preprocess all VMs and images
-    process(test, params, env, preprocess_image, preprocess_vm)
-
     # Get the KVM kernel module version and write it as a keyval
     logging.debug("Fetching KVM module version...")
     if os.path.exists("/dev/kvm"):
-        kvm_version = os.uname()[2]
         try:
-            file = open("/sys/module/kvm/version", "r")
-            kvm_version = file.read().strip()
-            file.close()
+            kvm_version = open("/sys/module/kvm/version").read().strip()
         except:
-            pass
+            kvm_version = os.uname()[2]
     else:
         kvm_version = "Unknown"
         logging.debug("KVM module not loaded")
@@ -248,16 +236,24 @@ def preprocess(test, params, env):
     qemu_path = kvm_utils.get_path(test.bindir,
                                    params.get("qemu_binary", "qemu"))
     version_line = commands.getoutput("%s -help | head -n 1" % qemu_path)
-    exp = re.compile("[Vv]ersion .*?,")
-    match = exp.search(version_line)
-    if match:
-        kvm_userspace_version = " ".join(match.group().split()[1:]).strip(",")
+    matches = re.findall("[Vv]ersion .*?,", version_line)
+    if matches:
+        kvm_userspace_version = " ".join(matches[0].split()[1:]).strip(",")
     else:
        kvm_userspace_version = "Unknown"
        logging.debug("Could not fetch KVM userspace version")
     logging.debug("KVM userspace version: %s" % kvm_userspace_version)
     test.write_test_keyval({"kvm_userspace_version": kvm_userspace_version})
 
+    # Execute any pre_commands
+    if params.get("pre_command"):
+        process_command(test, params, env, params.get("pre_command"),
+                        int(params.get("pre_command_timeout", "600")),
+                        params.get("pre_command_noncritical") == "yes")
+
+    # Preprocess all VMs and images
+    process(test, params, env, preprocess_image, preprocess_vm)
+
 
 def postprocess(test, params, env):
     """
@@ -276,8 +272,8 @@ def postprocess(test, params, env):
     # Should we convert PPM files to PNG format?
     if params.get("convert_ppm_files_to_png") == "yes":
-        logging.debug("'convert_ppm_files_to_png' specified; converting PPM"
-                      " files to PNG format...")
+        logging.debug("'convert_ppm_files_to_png' specified; converting
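[Editorial aside: the userspace-version detection rewritten in the patch above boils down to a small regex routine. A self-contained sketch of the patched logic; the sample help line below is a made-up but representative qemu-kvm 0.12-era format, not output captured from a real binary:]

```python
import re

def parse_qemu_version(version_line):
    # Mirrors the patched code: grab the first "[Vv]ersion ...," chunk from
    # `qemu -help | head -n 1`, then drop the word "version" and the
    # trailing comma.
    matches = re.findall(r"[Vv]ersion .*?,", version_line)
    if not matches:
        return "Unknown"
    return " ".join(matches[0].split()[1:]).strip(",")
```

The non-greedy `.*?,` stops at the first comma, which is what keeps the copyright notice out of the extracted version string.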
Re: [Autotest] [PATCH] KVM-Test: Add kvm userspace unit test
OK, I approve of your suggestion. - "Lucas Meneghel Rodrigues" 写道: > I have an update about this test after talking to Naphtali Sprei: > > This patch does the unit testing using the old way of invoking it, > and > Avi superseded it with a new -kernel option. Naphtali is working in > making the new way of doing the test work, so I will wait until we > can > merge both ways of doing this test, OK? > > On Thu, Mar 18, 2010 at 12:16 AM, Lucas Meneghel Rodrigues > wrote: > > Hi Shuxi, sorry that it took so long before I could give you return > on this one. > > > > The general idea is just fine, but there is one gotcha that will > need > > more thought: This is dependent of having the KVM source code for > > testing (ie, it depends on the build test *and* the build mode has > to > > involve source code, such as git builds, things like koji install > will > > also not work). Since by default we are not making the tests > depending > > directly on build, so we have to figure out a way to have this > > integrated without breaking things for users who are not interested > to > > run the build test. > > > > Today I was reviewing the qemu-img functional test, so it occurred > to > > me that all those tests that do not depend on guests and different > > qemu command line options, we can make them all dependent on the > build > > test. This way we'd have the separation that we need, still not > > breaking anything for users that do not care about build and other > > types of test. > > > > Michael, what do you think? Should we put the config of tests like > > this one and qemu_img on build.cfg, making them depend on build? > > > > Oh Shuxi, on the code below I have some small comments to make: > > > > On Fri, Mar 5, 2010 at 3:22 AM, sshang wrote: > >> The test use kvm test harness kvmctl load binary test case file to > test various function of kvm kernel module. 
> >> > >> Signed-off-by: sshang > >> --- > >> client/tests/kvm/tests/unit_test.py | 29 > + > >> client/tests/kvm/tests_base.cfg.sample | 7 +++ > >> 2 files changed, 36 insertions(+), 0 deletions(-) > >> create mode 100644 client/tests/kvm/tests/unit_test.py > >> > >> diff --git a/client/tests/kvm/tests/unit_test.py > b/client/tests/kvm/tests/unit_test.py > >> new file mode 100644 > >> index 000..9bc7441 > >> --- /dev/null > >> +++ b/client/tests/kvm/tests/unit_test.py > >> @@ -0,0 +1,29 @@ > >> +import os > >> +from autotest_lib.client.bin import utils > >> +from autotest_lib.client.common_lib import error > >> + > >> +def run_unit_test(test, params, env): > >> + """ > >> + This is kvm userspace unit test, use kvm test harness kvmctl > load binary > >> + test case file to test various function of kvm kernel module. > >> + The output of all unit test can be found in the test result > dir. > >> + """ > >> + > >> + case_list = params.get("case_list","access apic emulator > hypercall irq"\ > >> + " port80 realmode sieve smptest tsc stringio > vmexit").split() > >> + srcdir = params.get("srcdir",test.srcdir) > >> + user_dir = os.path.join(srcdir,"kvm_userspace/kvm/user") > >> + os.chdir(user_dir) > >> + test_fail_list = [] > >> + > >> + for i in case_list: > >> + result_file = test.outputdir + "/" + i > >> + testfile = i + ".flat" > >> + results = utils.system("./kvmctl test/x86/bootstrap > test/x86/" + \ > >> + testfile + " > " + > result_file,ignore_status=True) > > > > About the above statement: In general you should not use shell > > redirection to write the output of your program to the log files. > > Please take advantage of the fact utils.run allow you to connect > > stdout and stderr pipes to the result file. Also, utils.run return > a > > CmdResult object, hat has a list of useful properties out of it. 
> >> +        if results != 0:
> >> +            test_fail_list.append(i)
> >> +
> >> +    if test_fail_list:
> >> +        raise error.TestFail("< " + " ".join(test_fail_list) + \
> >> +                " >")
>
> In the above, you could just have used
>
>     raise error.TestFail("KVM module unit test failed. Test cases
>     failed: %s" % test_fail_list)
>
> IMHO it's easier to understand.
>
> >> diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
> >> index 040d0c3..0918c26 100644
> >> --- a/client/tests/kvm/tests_base.cfg.sample
> >> +++ b/client/tests/kvm/tests_base.cfg.sample
> >> @@ -300,6 +300,13 @@ variants:
> >>         shutdown_method = shell
> >>         kill_vm = yes
> >>         kill_vm_gracefully = no
> >> +
> >> +    - unit_test:
> >> +        type = unit_test
> >> +        case_list = access apic emulator hypercall msr port80 realmode sieve smptest tsc stringio vmexit
> >> +        # srcdir should be same as build.cfg
> >> +        srcdir =
> >> +        vms = ''
> >>  # Do not define test variants below shutdown
>
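Lucas's two suggestions above — capture the program's output through the API rather than a shell `>` redirection, and raise TestFail with a formatted message — can be sketched in plain Python. This is a stand-alone illustration, not autotest code: subprocess.run and the local TestFail class stand in for autotest's utils.run and error.TestFail, and the kvmctl command is parameterized rather than assumed.

```python
import os
import subprocess

class TestFail(Exception):
    """Stand-in for autotest's error.TestFail."""

def run_cases(case_list, outputdir, kvmctl):
    """Run each unit test case, writing its output to a per-case
    result file via a real pipe instead of shell redirection."""
    failed = []
    for name in case_list:
        result_file = os.path.join(outputdir, name)
        with open(result_file, "w") as out:
            # stdout and stderr both go to the result file, which is
            # what the reviewer suggests utils.run should be doing.
            result = subprocess.run(kvmctl + [name + ".flat"],
                                    stdout=out, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            failed.append(name)
    if failed:
        # The formatted-message style the reviewer prefers:
        raise TestFail("KVM module unit test failed. "
                       "Test cases failed: %s" % failed)
```

The point of the CmdResult-style interface is that the exit status and captured output travel together as one object, instead of being scattered between a shell return code and a redirected file.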
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 05:00 PM, Ingo Molnar wrote: If that is the theory then it has failed to trickle through in practice. As you know i have reported a long list of usability problems with hardly a look. That list could be created by pretty much anyone spending a few minutes of getting a first impression with qemu-kvm. Can you transfer your list to the following wiki page: http://wiki.qemu.org/Features/Usability This thread is so large that I can't find your note that contained the initial list. I want to make sure this input doesn't die once this thread settles down. Regards, Anthony Liguori -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 04:54 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 10:55 PM, Ingo Molnar wrote: Of course you could say the following: ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not able to add this to the v2.6.35 kernel queue anymore as the ongoing usability work already takes up all of the project's maintainer and testing bandwidth. If you want the feature to be merged sooner than that then please help us cut down on the TODO and BUGS list that can be found at XYZ. There's quite a few low hanging fruits there. ' That would be shooting at my own foot as well as the contributor's since I badly want that RCU stuff, and while a GUI would be nice, that itch isn't on my back.

I think this sums up the root cause of all the problems I see with KVM pretty well.

A good maintainer has to strike a balance between asking more of people than what they initially volunteer and getting people to implement the less fun things that are nonetheless required. The kernel can take this to an extreme because at the end of the day, it's the only game in town and there is an unending number of potential volunteers. Most other projects are not quite as fortunate.

When someone submits a patch set to QEMU implementing a new network backend for raw sockets, we can push back on how it fits into the entire stack wrt security, usability, etc. Ultimately, we can arrive at a different, more user-friendly solution (networking helpers) and, along with some time investment on my part, create something much nicer. Still command line based, though.

Responding to such a patch set with "replace the SDL front end with a GTK one that lets you graphically configure networking" is not reasonable, and the result would be one less QEMU contributor in the long run.

Over time, we can, and are, pushing people to focus more on usability. But that doesn't get you a first-class GTK GUI overnight.
The only way you're going to get that is by having a contributor be specifically interested in building such a thing. We simply haven't had that in the past 5 years that I've been involved in the project. If someone stepped up to build this, I'd certainly support it in every way possible, and there are probably some steps we could take to further encourage this.

Regards,

Anthony Liguori
Re: Tracking KVM development
I've looked at libvirt a bit, and I fail to see the attraction. I think I will stay with plain qemu-kvm, unless there are some very compelling reasons for going down the libvirt route.

Virsh (which uses libvirt) is almost irreplaceable for us... How do you start and stop virtual machines easily, or get a list of the running ones? How do you ensure a virtual machine is never started twice? (That would obviously have disastrous results on the filesystem.) How do you connect on demand to the graphics of the VM from your laptop, with good security so that only the system administrator can do that? (virt-viewer provides very easy support for this, tunnelling VNC graphics over SSH; you connect by specifying the name of the host and the name of the VM... just great!)

If there is another way, I'm interested. In fact libvirt also brings problems for us, mainly because it takes a while to support the latest KVM features, and because installing libvirt from source and configuring it properly on the host for the first time is much more difficult than for the KVM sources.

Thank you
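On the "never started twice" point: the core of what a management layer provides there is an exclusive lock per VM, held for as long as the VM's process lives. A minimal sketch of that pattern (the lock-file path and function name are made up for illustration; libvirt's real lock manager is considerably more elaborate):

```python
import fcntl
import os

def acquire_vm_lock(name, lockdir="/tmp"):
    """Return an open, exclusively locked file for this VM name,
    or None if the lock is already held elsewhere. Keeping the
    file open for the VM's lifetime is what prevents a second
    start; closing it releases the lock automatically."""
    path = os.path.join(lockdir, "%s.lock" % name)
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except OSError:
        f.close()
        return None
```

A launcher would take the lock before exec'ing qemu-kvm and keep the file descriptor open until the guest exits; a second start attempt gets None and can refuse to run, which is what protects the disk image from a double start.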
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 05:00 PM, Ingo Molnar wrote:

If that is the theory then it has failed to trickle through in practice. As you know I have reported a long list of usability problems with hardly a look. That list could be created by pretty much anyone spending a few minutes getting a first impression with qemu-kvm.

I think the point you're missing is that your list was from the perspective of someone looking at a desktop virtualization solution that was graphically oriented. As Avi has repeatedly mentioned, so far, that has not been the target audience of QEMU. The target audience tends to be: 1) people looking to do server virtualization and 2) people looking to do command line based development. Usually, both (1) and (2) are working on machines that are remotely located.

What's important to these users is that VMs be easily launchable from the command line, that there is a lot of flexibility in defining machine types, and that there is a programmatic way to interact with a given instance of QEMU. Those are the things that we've been focusing on recently.

The reason we don't have better desktop virtualization support is simple. No one is volunteering to do it and no company is funding development for it. When you look at something like VirtualBox, what you're looking at is a long-ago-forked version of QEMU with a GUI added, focusing on desktop virtualization. There is no magic behind adding a better, more usable GUI to QEMU. It just takes resources.

I understand that you're trying to make the point that without catering to the desktop virtualization use case, we won't get as many developers as we could. Personally, I don't think that argument is accurate. If you look at VirtualBox, its performance is terrible. Having a nice GUI hasn't gotten them the type of developers that can improve their performance. No one is arguing that we wouldn't like to have a nicer UI. I'd love to merge any patch like that.
Regards,

Anthony Liguori
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 04:00 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 09:59 PM, Ingo Molnar wrote: Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. It's only clearly inferior if you ignore every consideration against it. It's definitely not a deployment non-starter, see the tons of daemons that come with any Linux system. [...] Avi, please don't put arguments into my mouth that I never made. My (clearly expressed) argument was that: _a new guest-side daemon is a transparent instrumentation non-starter_

FWIW, there's no reason you couldn't consume a vmchannel port from within the kernel. I don't think the code needs to be in the kernel, and from a security PoV, that suggests that it should be in userspace IMHO. But if you want to make a kernel thread, knock yourself out. I have no objection to that from a qemu perspective. I can't see why Avi would mind either.

I think it's papering over another problem (the kernel should control initrds IMHO) but that's a different topic.

Regards,

Anthony Liguori
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 02:17 PM, Ingo Molnar wrote:

If you want to improve this, you need to do the following: 1) Add a userspace daemon that uses vmchannel that runs in the guest and can fetch kallsyms and arbitrary modules. If that daemon lives in tools/perf, that's fine. Adding any new daemon to an existing guest is a deployment and usability nightmare. The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated. That was one of the fundamental usability mistakes of Oprofile. There is no 'perf' daemon - all the perf functionality is _built in_, and for very good reasons. It is one of the main reasons for perf's success as well.

The solution should be a long-lived piece of code that runs without kernel privileges. How the code is delivered to the user is a separate problem. If you want to argue that the kernel should build an initramfs that contains some things that always should be shipped with the kernel but don't need to be within the kernel, I think that's something that's long overdue.

We could make it a kernel thread, but what's the point? It's much safer for it to be a userspace thread, and it doesn't need to interact with the kernel in an intimate way.

Regards,

Anthony Liguori
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> > Consider the _other_ examples that are a lot more clear:
> >
> > ' If you expose paravirt spinlocks via KVM please also make sure the KVM
> >   tooling can make use of it, has an option for it to configure it, and
> >   that it has sufficient efficiency statistics displayed in the tool for
> >   admins to monitor.'
> >
> > ' If you create this new paravirt driver then please also make sure it can
> >   be configured in the tooling. '
> >
> > ' Please also add a testcase for this bug to tools/kvm/testcases/ so we don't
> >   repeat this same mistake in the future. '
>
> All three happen quite commonly in qemu/kvm development. Of course someone
> who develops a feature also develops a patch that exposes it in qemu. There
> are several test cases in qemu-kvm.git/kvm/user/test.

If that is the theory then it has failed to trickle through in practice. As you know I have reported a long list of usability problems with hardly a look. That list could be created by pretty much anyone spending a few minutes getting a first impression with qemu-kvm.

So something is seriously wrong in KVM land, to pretty much anyone trying it for the first time. I have explained how I see the root cause of that, while you seem to suggest that there's nothing wrong to begin with. I guess we'll have to agree to disagree on that.

Thanks,

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> On 03/21/2010 10:55 PM, Ingo Molnar wrote:
> >
> > Of course you could say the following:
> >
> >  ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not
> >    able to add this to the v2.6.35 kernel queue anymore as the ongoing
> >    usability work already takes up all of the project's maintainer and
> >    testing bandwidth. If you want the feature to be merged sooner than that
> >    then please help us cut down on the TODO and BUGS list that can be found
> >    at XYZ. There's quite a few low hanging fruits there. '
>
> That would be shooting at my own foot as well as the contributor's since I
> badly want that RCU stuff, and while a GUI would be nice, that itch isn't on
> my back.

I think this sums up the root cause of all the problems I see with KVM pretty well.

Thanks,

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> > I.e. you are arguing for microkernel Linux, while you see me as arguing
> > for a monolithic kernel.
>
> No. I'm arguing for reducing bloat wherever possible. Kernel code is more
> expensive than userspace code in every metric possible.

1) One of the primary design arguments of the microkernel design was likewise to push as much into user-space as possible without impacting performance too much, so you very much seem to be arguing for a microkernel design for the kernel. I think history has given us the answer for that fight between microkernels and monolithic kernels.

Furthermore, to not engage in hypotheticals about microkernels: by your argument the Oprofile design was perfect (it was minimalistic kernel-space, with all the complexity in user-space), while perf was over-complex (it does many things in the kernel that could have been done in user-space). Practical results suggest the exact opposite happened - Oprofile is being replaced by perf. How do you explain that?

2) In your analysis you again ignore the package boundary costs and artifacts as if they didn't exist. That was my main argument, and that is what we saw with oprofile and perf: while maintaining more kernel code may be more expensive, it sure pays off by getting us a much better solution in the end.

And getting a 'much better solution' to users is the goal of all this, isn't it? I don't mind what you call 'bloat' per se if it's for a purpose that users consider a good deal. I have quite a bit of RAM in most of my systems; having 50K more or less included in the kernel image is far less important than having a healthy and vibrant development model and having satisfied users ...

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 11:00 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 09:59 PM, Ingo Molnar wrote: Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. It's only clearly inferior if you ignore every consideration against it. It's definitely not a deployment non-starter, see the tons of daemons that come with any Linux system. [...] Avi, please don't put arguments into my mouth that I never made.

Sorry, that was not the intent. I meant that putting things into the kernel has disadvantages that must be considered.

My (clearly expressed) argument was that: _a new guest-side daemon is a transparent instrumentation non-starter_ What is so hard to understand about that simple concept? Instrumentation is good if it's as transparent as possible. Of course lots of other features can be done via a new user-space package ...

I believe you can deploy this daemon via a (default) package, without any hassle to users.

-- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:55 PM, Ingo Molnar wrote:

Of course you could say the following: ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not able to add this to the v2.6.35 kernel queue anymore as the ongoing usability work already takes up all of the project's maintainer and testing bandwidth. If you want the feature to be merged sooner than that then please help us cut down on the TODO and BUGS list that can be found at XYZ. There's quite a few low hanging fruits there. '

That would be shooting at my own foot as well as the contributor's since I badly want that RCU stuff, and while a GUI would be nice, that itch isn't on my back.

You're asking a developer and a maintainer to put off the work they're interested in, in order to work on something someone else is interested in, but not contributing the work.

Although this RCU example is the 'worst' possible example, as it's a pure speedup change with no functionality effect. Consider the _other_ examples that are a lot more clear: ' If you expose paravirt spinlocks via KVM please also make sure the KVM tooling can make use of it, has an option for it to configure it, and that it has sufficient efficiency statistics displayed in the tool for admins to monitor.' ' If you create this new paravirt driver then please also make sure it can be configured in the tooling. ' ' Please also add a testcase for this bug to tools/kvm/testcases/ so we don't repeat this same mistake in the future. '

All three happen quite commonly in qemu/kvm development. Of course someone who develops a feature also develops a patch that exposes it in qemu. There are several test cases in qemu-kvm.git/kvm/user/test.

I'd say most of the high-level feature work in KVM has tooling impact.

Usually, pretty low. Plumbing down a feature is usually trivial. There are exceptions, of course - smp is only supported in qemu-kvm.git, not in upstream qemu.git, for example.
In any case of course the work is done in both qemu and kvm - do you think people develop features to see them bitrot?

And note the important argument that the 'eject button' thing would not occur naturally in a project that is well designed and has a good quality balance. It would only occur in the transitional period if a big lump of lower-quality code is unified with higher-quality code. Then indeed a lot of pressure gets created on the people working on the high-quality portion to go over and fix the low-quality portion.

It's a matter of priorities.

Which, btw., is an unconditionally good thing ... But even an RCU speedup can be fairly linked/ordered to more pressing needs of a project.

Pressing to whom?

Really, the unification of two tightly related pieces of code has numerous clear advantages. Please give it some thought before rejecting it.

I'm not blind to the advantages. Dropping tcg would be the biggest of them by far (much more than moving the repository, IMO). But there are disadvantages as well. Around two years ago I seriously considered forking qemu; at this time I do not think it is a good idea.

-- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:31 PM, Ingo Molnar wrote:

* Avi Kivity wrote: On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. [...] Only if you apply it as a totalitarian rule. Furthermore, the logical conclusion of _your_ line of argument (applied in a totalitarian manner) is that 'nothing should be built into the kernel'.

I'm certainly a minimalist, but that doesn't follow. Things that require privileged access, or access to the page cache, or that can't be made to perform otherwise should certainly be in the kernel. That's why I submitted kvm for inclusion in the first place. If it's something that can work just as well in userspace but we can't be bothered to fix any 'deployment nightmares', then it shouldn't be in the kernel. Examples include lvm2 and mdadm (which truly are 'deployment nightmares' - you need to start them before you have access to your filesystem - yet they work somehow).

I.e. you are arguing for microkernel Linux, while you see me as arguing for a monolithic kernel.

No. I'm arguing for reducing bloat wherever possible. Kernel code is more expensive than userspace code in every metric possible.

Reality is that we are somewhere in between; we are neither black nor white: it's shades of grey. If we want to do a good job with all this then we observe subsystems, we see how they relate to the physical world and decide how to shape them. We identify long-term changes and re-design modularization boundaries in hindsight - when we got them wrong initially. We don't try to rationalize the status quo.

I'm not for the status quo either - I'm for reducing the kernel code footprint wherever it doesn't impact performance or break clean interfaces.

Let's see one example of that thought process in action: Oprofile.
We saw that the modularization of oprofile was a total nightmare: a separate kernel-space and a separate user-space component, which were in constant version friction. The ABI between them was stifling: it was hard to change (you needed to trickle changes through the tool as well, which was on a different release schedule, etc., etc.). The result was sucky usability that never went beyond some basic 'you can do profiling' threshold. The subsystem worked well within that design box, and it was worked on by highly competent people - but it was still far, far away from the potential it could have achieved.

So we observed those problems and decided to do something about it:

- We unified the two parts into a single maintenance domain. There's the kernel side in kernel/perf_event.c and arch/*/*/perf_event.c, plus the user side in tools/perf/. The two are connected by a very flexible, forwards and backwards compatible ABI.

That's useful because perf is still small. If it were a full-fledged 350KLOC GUI, then most of the development would concentrate on the GUI and very little (relatively) would have to do with the kernel. Qemu is in that state today. Please, please look at the recent commits and check how many actually have anything to do with kvm, and how many with everything else.

- We moved much more code into the kernel, realizing that transparent and robust instrumentation should be offered instead of punting abstractions into user-space (which is in a disadvantaged position to implement system-wide abstractions).

No argument. I have a similar experience with kvm. The user/kernel break is at the cpu virtualization level - that is, kvm is solely responsible for emulating a cpu and userspace is responsible for emulating devices. An exception was made for the PIC/IOAPIC/PIT due to performance considerations - they are emulated in the kernel as well. A common FAQ is why we do not emulate real-mode instructions in qemu.
The answer is that the interface to kvm would be insane - it would emulate a partial cpu. All other users of that interface would have to implement an emulator (there is also a practical argument - the qemu emulator does not implement atomics correctly wrt other threads).

- We created a no-bullsh*t approach to usability. perf is by no means perfect, but it's written by developers for developers, and if you report a bug to us we'll act on it before anything else. Furthermore the kernel developers do the user-space coding as well, so there's no Chinese wall separating them. Kernel-space becomes aware of the intricacies of user-space, and user-space developers become aware of the difficulties of kernel-space as well. It's a good mix in our experience.

Excellent. However qemu is written by developers for their users, and their users are not worried about an eject button in the qemu SDL interface, or about running the qemu command line by hand. They have
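The "flexible, forwards and backwards compatible ABI" mentioned above rests largely on a size field at the front of the attribute structure: userspace records the sizeof() it was compiled against, and the kernel copies min(caller size, kernel size) bytes and zero-fills the rest, so old binaries keep working against new kernels and vice versa. Here is a sketch of that convention (the three-field layout is purely illustrative, not the real perf_event_attr):

```python
import struct

# The "kernel side" struct: size first, then the payload fields,
# including one added in a later version of the ABI.
KERNEL_FMT = "<IIQ"                        # u32 size, u32 type, u64 new_field
KERNEL_SIZE = struct.calcsize(KERNEL_FMT)  # 16 bytes

def read_attr(blob):
    """Accept an attr blob from a caller built against any version:
    copy min(caller size, kernel size) bytes, zero-fill the rest."""
    caller_size = struct.unpack_from("<I", blob)[0]
    n = min(caller_size, KERNEL_SIZE)
    padded = blob[:n] + b"\0" * (KERNEL_SIZE - n)
    _, type_, new_field = struct.unpack(KERNEL_FMT, padded)
    return {"type": type_, "new_field": new_field}
```

An "old" caller packs only the fields it knows, e.g. `struct.pack("<II", 8, 7)`, and the field it never heard of simply reads back as zero on the kernel side; a "new" caller passes the full 16 bytes.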
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> > Well, for what it's worth, I rarely ever use anything else. My virtual
> > disks are raw so I can loop mount them easily, and I can also switch my
> > guest kernels from outside... without ever needing to mount those disks.
>
> Curious, what do you use them for?
>
> btw, if you build your kernel outside the guest, then you already have
> access to all its symbols, without needing anything further.

There are two errors in your argument:

1) you are assuming that it's only about kernel symbols

Look at this 'perf report' output:

# Samples: 7127509216
#
# Overhead   Command  Shared Object      Symbol
# ........  ........  .................  ......
#
    19.14%       git  git                [.] lookup_object
    15.16%      perf  git                [.] lookup_object
     4.74%      perf  libz.so.1.2.3      [.] inflate
     4.52%       git  libz.so.1.2.3      [.] inflate
     4.21%      perf  libz.so.1.2.3      [.] inflate_table
     3.94%       git  libz.so.1.2.3      [.] inflate_table
     3.29%       git  git                [.] find_pack_entry_one
     3.24%       git  libz.so.1.2.3      [.] inflate_fast
     2.96%      perf  libz.so.1.2.3      [.] inflate_fast
     2.96%       git  git                [.] decode_tree_entry
     2.80%      perf  libc-2.11.90.so    [.] __strlen_sse42
     2.56%       git  libc-2.11.90.so    [.] __strlen_sse42
     1.98%      perf  libc-2.11.90.so    [.] __GI_memcpy
     1.71%      perf  git                [.] decode_tree_entry
     1.53%       git  libc-2.11.90.so    [.] __GI_memcpy
     1.48%       git  git                [.] lookup_blob
     1.30%       git  git                [.] process_tree
     1.30%      perf  git                [.] process_tree
     0.90%      perf  git                [.] tree_entry
     0.82%      perf  git                [.] lookup_blob
     0.78%       git  [kernel.kallsyms]  [k] kstat_irqs_cpu

kernel symbols are only a small portion of the symbols (a single line in this case).

To get to those other symbols we have to read the ELF symbols of those binaries in the guest filesystem, in the post-processing/reporting phase. This is both complex to do and relatively slow, so we don't want to (and cannot) do this at sample time from IRQ context or NMI context ...

Also, many aspects of reporting are interactive so it's done lazily or on-demand. So we need ready access to the guest filesystem - for those guests which decide to integrate with the host for this.
2) the 'SystemTap mistake'

You are assuming that the symbols of the kernel when it got built got saved properly and are discoverable easily. In reality those symbols can be erased by a make clean, can be modified by a new build, can be misplaced, and can generally be hard to find because each distro puts them in a different installation path.

My 10+ years of experience with kernel instrumentation solutions is that kernel-driven, self-sufficient, robust, trustable, well-enumerated sources of information work far better in practice.

The thing is, in this thread I'm forced to repeat the same basic facts again and again. Could you _PLEASE_, pretty please, when it comes to instrumentation details, at least _read the mails_ of the guys who actually ... write and maintain Linux instrumentation code? This is getting ridiculous, really.

Thanks,

	Ingo
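The lazy, on-demand model Ingo describes — don't touch a binary's ELF symbols until reporting actually needs them, then answer address lookups from a sorted table — can be sketched as follows (load_fn is a stand-in for a real ELF symbol reader, e.g. something built on pyelftools; none of this is perf's actual code):

```python
import bisect

class SymbolResolver:
    """Resolve sample addresses to symbol names, parsing each
    shared object's symbol table only on first use."""
    def __init__(self, load_fn):
        self.load_fn = load_fn      # path -> [(start_addr, name), ...]
        self.cache = {}             # path -> (sorted addrs, names)

    def resolve(self, path, addr):
        if path not in self.cache:
            syms = sorted(self.load_fn(path))
            self.cache[path] = ([a for a, _ in syms],
                                [n for _, n in syms])
        addrs, names = self.cache[path]
        # Find the last symbol starting at or below addr.
        i = bisect.bisect_right(addrs, addr) - 1
        return names[i] if i >= 0 else "[unknown]"
```

Samples themselves only need to record (binary, address) pairs cheaply at interrupt time; the expensive symbol-table parsing happens once per binary, at report time, exactly as the mail argues.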
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:31 PM, Antoine Martin wrote: On 03/22/2010 03:24 AM, Avi Kivity wrote: On 03/21/2010 10:18 PM, Antoine Martin wrote:

That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package.

That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc)

There is a matching -initrd argument that you can use to launch a daemon.

I thought this discussion was about making it easy to deploy... and generating a custom initrd isn't easy by any means, and it requires access to the guest filesystem (and its mkinitrd tools).

That's true. You need to run mkinitrd anyway, though, unless your guest is non-modular and non-lvm. I believe that -kernel use will be rare, though. It's a lot easier to keep everything in one filesystem.

Well, for what it's worth, I rarely ever use anything else. My virtual disks are raw so I can loop mount them easily, and I can also switch my guest kernels from outside... without ever needing to mount those disks.

Curious, what do you use them for?

btw, if you build your kernel outside the guest, then you already have access to all its symbols, without needing anything further.

-- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> On 03/21/2010 09:59 PM, Ingo Molnar wrote:
> >
> > Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony
> > suggesting such a clearly inferior "add a daemon to the guest space" solution.
> > It's a usability and deployment non-starter.
>
> It's only clearly inferior if you ignore every consideration against it.
> It's definitely not a deployment non-starter, see the tons of daemons that
> come with any Linux system. [...]

Avi, please don't put arguments into my mouth that I never made. My (clearly expressed) argument was that:

  _a new guest-side daemon is a transparent instrumentation non-starter_

What is so hard to understand about that simple concept? Instrumentation is good if it's as transparent as possible. Of course lots of other features can be done via a new user-space package ...

Thanks,

	Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> On 03/21/2010 09:06 PM, Ingo Molnar wrote:
> > * Avi Kivity wrote:
> > > [...] Second, from my point of view all contributors are volunteers
> > > (perhaps their employer volunteered them, but there's no difference from
> > > my perspective). Asking them to repaint my apartment as a condition to
> > > get a patch applied is abuse. If a patch is good, it gets applied.
> >
> > This is one of the weirdest arguments I've seen in this thread. Almost all
> > the time we do make contributions conditional on the general shape of the
> > project. Developers don't get to do just the fun stuff.
>
> So, do you think a reply to a patch along the lines of
>
>   NAK. Improving scalability is pointless while we don't have a decent
>   GUI. I'll review your RCU patches _after_ you've contributed a usable GUI.
>
> ?
>
> > What does this have to do with RCU?
>
> The example was rcu-ifying kvm, which took place a bit ago. Sorry, it wasn't
> clear.
>
> > I'm talking about KVM, which is a Linux kernel feature that is useless
> > without a proper, KVM-specific app making use of it.
> >
> > RCU is a general kernel performance feature that works across the board.
> > It helps KVM indirectly, and it helps many other kernel subsystems as
> > well. It needs no user-space tool to be useful.
>
> Correct. So should I tell someone that has sent a patch that rcu-ified kvm
> in order to scale it, that I won't accept the patch unless they do some
> usability userspace work? Say, implementing an eject button. That's what I
> understood you to mean.

Of course you could say the following:

 ' Thanks, I'll mark this for v2.6.36 integration. Note that we are not
   able to add this to the v2.6.35 kernel queue anymore as the ongoing
   usability work already takes up all of the project's maintainer and
   testing bandwidth. If you want the feature to be merged sooner than that
   then please help us cut down on the TODO and BUGS list that can be found
   at XYZ.
There are quite a few low-hanging fruits there. ' Although this RCU example is the 'worst' possible example, as it's a pure speedup change with no functionality effect. Consider the _other_ examples that are a lot more clear: ' If you expose paravirt spinlocks via KVM please also make sure the KVM tooling can make use of it, has an option to configure it, and that it has sufficient efficiency statistics displayed in the tool for admins to monitor. ' ' If you create this new paravirt driver then please also make sure it can be configured in the tooling. ' ' Please also add a testcase for this bug to tools/kvm/testcases/ so we don't repeat this same mistake in the future. ' I'd say most of the high-level feature work in KVM has tooling impact. And note the important argument that the 'eject button' thing would not occur naturally in a project that is well designed and has a good quality balance. It would only occur in the transitional period if a big lump of lower-quality code is unified with higher-quality code. Then indeed a lot of pressure gets created on the people working on the high-quality portion to go over and fix the low-quality portion. Which, btw., is an unconditionally good thing ... But even an RCU speedup can be fairly linked/ordered to more pressing needs of a project. Really, the unification of two tightly related pieces of code has numerous clear advantages. Please give it some thought before rejecting it. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Tracking KVM development
On 21.3.2010 12:21, Thomas Løcke wrote: > Any and all suggestions to keeping a healthy and stable KVM setup > running is more than welcome. Hi, I compile stable qemu-kvm releases from source and install under /opt/qemu-kvm-${version}. With this setup I can run/test multiple versions without messing up any distro. HTH, Z.
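Z.'s scheme generalizes naturally. Here is a minimal sketch of selecting one of several parallel installs, each built with its own `--prefix` (the `qemu_switch` helper and the `QEMU_OPT_ROOT` override are hypothetical, invented for illustration; they are not part of qemu-kvm):

```shell
# Sketch only: parallel qemu-kvm installs under /opt, one per version.
# Each release would have been built along the lines of:
#   ./configure --prefix=/opt/qemu-kvm-0.12.3 && make && make install
# The helper below merely selects one install by prepending its bin/
# directory to PATH. QEMU_OPT_ROOT is a hypothetical override so the
# layout can live somewhere other than /opt (e.g. in tests).
qemu_switch() {
    ver="$1"
    root="${QEMU_OPT_ROOT:-/opt}"
    dir="$root/qemu-kvm-$ver"
    if [ ! -d "$dir/bin" ]; then
        echo "no such install: $dir" >&2
        return 1
    fi
    PATH="$dir/bin:$PATH"
    export PATH
}
```

Usage would be e.g. `qemu_switch 0.12.3` before launching a guest, leaving the distro packages untouched.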
Re: Tracking KVM development
On Sun, Mar 21, 2010 at 9:19 PM, Andre Przywara wrote: > Please think twice about that. Every time I wanted to go away from Slackware > because of missing packages I ended up with accepting the involved hassle > with self-compiling because I could stay with the simplicity and clean > design of Slackware. Same here. > I usually compile my own kernels anyway and use the Slackware kernels only > for testing and installation. So I usually do "make oldconfig" on a stable > 2.6.xx.>=3 kernel, and am happy with that. QEMU(-kvm) is not a problem at > all, the dependencies are very small and with Slackware[64] 13.0 it compiles > out of the box with almost all features. I can send you a reasonably > configured package (or build-script) if you like. I also use the config provided by Slackware as a foundation for newer kernels, and I always compile my own. I would very much like to see the build-script you mention. > Currently both qemu-kvm-0.12.3 and Linux 2.6.33 work together very well, > although I usually do only testing and development with KVM and actually > "use" it very rarely. So if you need more upper level management tools (like > libvirt) I cannot help you on this. I've looked at libvirt a bit, and I fail to see the attraction. I think I will stay with plain qemu-kvm, unless there are some very compelling reasons for going down the libvirt route. :o) /Thomas
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote: > On 03/21/2010 10:08 PM, Olivier Galibert wrote: > >On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: > >>On 03/21/2010 09:17 PM, Ingo Molnar wrote: > >>>Adding any new daemon to an existing guest is a deployment and usability > >>>nightmare. > >>> > >>The logical conclusion of that is that everything should be built into > >>the kernel. Where a failure brings the system down or worse. Where you > >>have to bear the memory footprint whether you ever use the functionality > >>or not. Where to update the functionality you need to deploy a new > >>kernel (possibly introducing unrelated bugs) and reboot. > >> > >>If userspace daemons are such a deployment and usability nightmare, > >>maybe we should fix that instead. > >Which userspace? Deploying *anything* in the guest can be a > >nightmare, including paravirt drivers if you don't have a natively > >supported in the OS virtual hardware backoff. > > That includes the guest kernel. If you can deploy a new kernel in the > guest, presumably you can deploy a userspace package. Note that with perf we can instrument the guest with zero guest-kernel modifications as well. We try to reduce the guest impact to a bare minimum, as the difficulties in deployment are a function of the cross-section surface to the guest. Also, note that the kernel is special with regards to instrumentation: since this is the kernel project, we are doing kernel-space changes, as we are doing them _anyway_. So adding symbol resolution capabilities would be a minimal addition to that - while adding a whole new guest package for the daemon would significantly increase the cross-section surface. Ingo
Re: CONFIG_HAVE_KVM=n impossible?
Thanks, I had seen some boot message about KVM being active and thought that I still saw it after disabling all KVM config options - but apparently it was my fault and I mixed things up. Indeed, KVM is OFF, even with HAVE_KVM=y. So, sorry for the noise. At least it appears that this config option is confusing other people, too - see http://communities.vmware.com/message/1498691 regards roland >devz...@web.de wrote: >> Hello, >> >> does anybody know why it seems that it's not possible to build a kernel with >> "CONFIG_HAVE_KVM=n" ? >> >> It always switches back to "y" with every kernel build and I have no clue, >> why. > >It's an internal config symbol which is not visible in the menu >system and is always set up unconditionally based on the platform. >Just like "CONFIG_HAVE_MMU". > >You want a different symbol, CONFIG_KVM. > >/mjt ___ WEB.DE DSL: Internet, Telefon und Entertainment für nur 19,99 EUR/mtl.! http://produkte.web.de/go/02/
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/22/2010 03:24 AM, Avi Kivity wrote: On 03/21/2010 10:18 PM, Antoine Martin wrote: That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc) There is a matching -initrd argument that you can use to launch a daemon. I thought this discussion was about making it easy to deploy... and generating a custom initrd isn't easy by any means, and it requires access to the guest filesystem (and its mkinitrd tools). I believe that -kernel use will be rare, though. It's a lot easier to keep everything in one filesystem. Well, for what it's worth, I rarely ever use anything else. My virtual disks are raw so I can loop mount them easily, and I can also switch my guest kernels from outside... without ever needing to mount those disks.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote: > On 03/21/2010 09:17 PM, Ingo Molnar wrote: > > > > Adding any new daemon to an existing guest is a deployment and usability > > nightmare. > > The logical conclusion of that is that everything should be built into the > kernel. [...] Only if you apply it as a totalitarian rule. Furthermore, the logical conclusion of _your_ line of argument (applied in a totalitarian manner) is that 'nothing should be built into the kernel'. I.e. you are arguing for microkernel Linux, while you see me as arguing for a monolithic kernel. Reality is that we are somewhere in between, we are neither black nor white: it's shades of grey. If we want to do a good job with all this then we observe subsystems, we see how they relate to the physical world and decide about how to shape them. We identify long-term changes and re-design modularization boundaries in hindsight - when we got them wrong initially. We don't try to rationalize the status quo. Let's see one example of that thought process in action: Oprofile. We saw that the modularization of Oprofile was a total nightmare: a separate kernel-space and a separate user-space component, which were in constant version friction. The ABI between them was stifling: it was hard to change it (you needed to trickle changes through the tool as well, which was on a different release schedule, etc. etc.). The result was sucky usability that never went beyond some basic 'you can do profiling' threshold. The subsystem worked well within that design box, and it was worked on by highly competent people - but it was still far, far away from the potential it could have achieved. So we observed those problems and decided to do something about it: - We unified the two parts into a single maintenance domain. There's the kernel side in kernel/perf_event.c and arch/*/*/perf_event.c, plus the user side in tools/perf/. The two are connected by a very flexible, forwards- and backwards-compatible ABI.
- We moved much more code into the kernel, realizing that transparent and robust instrumentation should be offered instead of punting abstractions into user-space (which is in a disadvantaged position to implement system-wide abstractions). - We created a no-bullsh*t approach to usability. perf is by no means perfect, but it's written by developers for developers and if you report a bug to us we'll act on it before anything else. Furthermore the kernel developers do the user-space coding as well, so there's no Chinese wall separating them. Kernel-space becomes aware of the intricacies of user-space and user-space developers become aware of the difficulties of kernel-space as well. It's a good mix in our experience. The thing is (and I doubt you are surprised that I say that), I see a similar situation with KVM. The basic parameters are comparable to Oprofile: it has a kernel-space component and a KVM-specific user-space. By all practical means the two are one and the same, but are maintained as different projects. I have followed KVM since its inception with great interest. I saw its good initial design, I tried it early on and even wrote various patches for it. So I care more about KVM than a random observer would, but this preference and passion for KVM's good technical sides does not cloud my judgement when it comes to its weaknesses. In fact the weaknesses are far more important to identify and express publicly, so I tend to concentrate on them. Don't take this as me blasting KVM, we both know the many good aspects of KVM. So, as I explained earlier in greater detail, the modularization of KVM into a separate kernel-space and user-space component is one of its worst current weaknesses, and it has become the main stifling force in the way of a better KVM experience for users.
That, IMO, is the 'weakest link' of KVM today and no matter how well the rest of KVM gets improved, those nice bits all get unfairly ignored when the user cannot have a usable and good desktop experience and thinks that KVM is crappy. I think you should think outside the initial design box you created four years ago, you should consider iterating the model, and you should consider the alternative I suggested: move (or create) KVM tooling to tools/kvm/ and treat it as a single project from there on. Thanks, Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:18 PM, Antoine Martin wrote: That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc) There is a matching -initrd argument that you can use to launch a daemon. I believe that -kernel use will be rare, though. It's a lot easier to keep everything in one filesystem. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 09:06 PM, Ingo Molnar wrote: * Avi Kivity wrote: [...] Second, from my point of view all contributors are volunteers (perhaps their employer volunteered them, but there's no difference from my perspective). Asking them to repaint my apartment as a condition to get a patch applied is abuse. If a patch is good, it gets applied. This is one of the weirdest arguments i've seen in this thread. Almost all the time do we make contributions conditional on the general shape of the project. Developers don't get to do just the fun stuff. So, do you think a reply to a patch along the lines of NAK. Improving scalability is pointless while we don't have a decent GUI. I'll review your RCU patches _after_ you've contributed a usable GUI. ? What does this have to do with RCU? The example was RCU-ifying kvm which took place a while ago. Sorry, it wasn't clear. I'm talking about KVM, which is a Linux kernel feature that is useless without a proper, KVM-specific app making use of it. RCU is a general kernel performance feature that works across the board. It helps KVM indirectly, and it helps many other kernel subsystems as well. It needs no user-space tool to be useful. Correct. So should I tell someone that has sent a patch that RCU-ified kvm in order to scale it, that I won't accept the patch unless they do some usability userspace work? Say, implementing an eject button. That's what I understood you to mean. KVM on the other hand is useless without a user-space tool. [ Theoretically you might have a fair point if it were a critical feature of RCU for it to have a GUI, and if the main tool that made use of it sucked. But it isn't and you should know that. ] Had you suggested the following 'NAK', applied to a different, relevant subsystem: | NAK. Improving scalability is pointless while we don't have a usable | tool. I'll review your perf patches _after_ you've contributed a usable | tool.
That might hold, but the tool is usable at least for some people - it runs in production. The people running it won't benefit from an eject button or any usability improvement since they run it through a centralized management tool that hides everything. They will benefit from the scalability patches. Should I still make those patches conditional on usability work that is of no interest to the submitter? This is a basic quid pro quo: new features introduce risks and create additional workload not just for the originating developer but for the rest of the community as well. You should check how Linus has pulled new features in the past 15 years: he very much requires the existing code to first be top-notch before he accepts new features for a given area of functionality. For a given area, yes. [...] That is my precise point. KVM is a specific subsystem or "area" that makes no sense without the user-space tooling it relates to. You seem to argue that you have no 'right' to insist on good quality of that tooling - and IMO you are fundamentally wrong with that. kvm contains many sub-areas. I'm not going to tie unrelated things together like the GUI and scalability, configuration file format and emulator correctness, nested virtualization and qcow2 asynchrony, or other crazy combinations. People either leave en masse or become frustrated if they can't work on what interests them. I do reject patches touching a sub-area that I think needs to be done in userspace, for example. That's not to say kvm development is random. We have a weekly conference call where regular contributors and maintainers of both qemu and kvm participate and where we decide where to focus. Sadly the issue of a qemu GUI is not raised often. Perhaps you can participate and voice your concerns. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. 
Re: Tracking KVM development
Thomas Løcke wrote: On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: Tracking git repositories and stable setups are mutually exclusive. If you are interested in something stable I recommend staying with the distribution provided setup (and picking a distribution that has an emphasis on kvm). If you want to track upstream, use qemu-kvm-0.12.x stable releases and kernel.org 2.6.x.y stable releases. If you want to track git repositories, use qemu-kvm.git and kvm.git for the kernel and kvm. Thanks Avi. I will stay with the stable qemu-kvm releases and stable kernel.org kernel releases from now on. I've never heard of any KVM specific distributions. Are you aware of any? My primary reason for going with Slackware, is because I already know it. But if there are better choices for a KVM virtualization host, then I'm willing to switch. Please think twice about that. Every time I wanted to go away from Slackware because of missing packages I ended up with accepting the involved hassle with self-compiling because I could stay with the simplicity and clean design of Slackware. I usually compile my own kernels anyway and use the Slackware kernels only for testing and installation. So I usually do "make oldconfig" on a stable 2.6.xx.>=3 kernel, and am happy with that. QEMU(-kvm) is not a problem at all, the dependencies are very small and with Slackware[64] 13.0 it compiles out of the box with almost all features. I can send you a reasonably configured package (or build-script) if you like. Currently both qemu-kvm-0.12.3 and Linux 2.6.33 work together very well, although I usually do only testing and development with KVM and actually "use" it very rarely. So if you need more upper level management tools (like libvirt) I cannot help you on this. Regards, Andre. 
-- Andre Przywara AMD-Operating System Research Center (OSRC), Dresden, Germany Tel: +49 351 488-3567-12
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/22/2010 03:11 AM, Avi Kivity wrote: On 03/21/2010 10:08 PM, Olivier Galibert wrote: On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. Where a failure brings the system down or worse. Where you have to bear the memory footprint whether you ever use the functionality or not. Where to update the functionality you need to deploy a new kernel (possibly introducing unrelated bugs) and reboot. If userspace daemons are such a deployment and usability nightmare, maybe we should fix that instead. Which userspace? Deploying *anything* in the guest can be a nightmare, including paravirt drivers if you don't have a natively supported in the OS virtual hardware backoff. That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. That's not always true. The host admin can control the guest kernel via "kvm -kernel" easily enough, but he may or may not have access to the disk that is used in the guest. (think encrypted disks, service agreements, etc) Antoine Deploying things in the host OTOH is business as usual. True. And you're smart enough to know that. Thanks.
Re: CONFIG_HAVE_KVM=n impossible?
devz...@web.de wrote: > Hello, > > does anybody know why it seems that it's not possible to build a kernel with > "CONFIG_HAVE_KVM=n" ? > > It always switches back to "y" with every kernel build and I have no clue, > why. It's an internal config symbol which is not visible in the menu system and is always set up unconditionally based on the platform. Just like "CONFIG_HAVE_MMU". You want a different symbol, CONFIG_KVM. /mjt
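mjt's explanation can be illustrated with a Kconfig sketch (paraphrased from memory of the 2.6.33-era files, so the exact lines may differ): `HAVE_KVM` has no prompt text, is selected unconditionally by the architecture, and only gates the user-visible `KVM` option.

```
# virt/kvm/Kconfig (sketch): HAVE_KVM has no prompt string, so it never
# appears in menuconfig and cannot be toggled by the user.
config HAVE_KVM
        bool

# arch/x86/Kconfig (sketch): the architecture selects it unconditionally,
# which is why it "switches back to y" on every build.
config X86
        def_bool y
        select HAVE_KVM

# The option the user actually wants to disable is KVM itself:
config KVM
        tristate "Kernel-based Virtual Machine (KVM) support"
        depends on HAVE_KVM
```

Setting `CONFIG_KVM=n` (and `CONFIG_KVM_INTEL`/`CONFIG_KVM_AMD` accordingly) is the supported way to build a kernel without KVM.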
Re: Tracking KVM development
Avi Kivity wrote: [...] > The only kvm-specific distribution I know of is RHEV-H, but that's > probably not what you're looking for. I'm talking about distributions > that have an active kvm package maintainer, update the packages > regularly, have bug trackers that someone looks into, etc. At least > Fedora and Ubuntu do this, perhaps openSuSE as well (though the latter > has a stronger Xen emphasis). Debian is a lot better on this front than it used to be a year ago. At least I'm trying to look through the bug reports on a regular basis ;) /mjt
CONFIG_HAVE_KVM=n impossible?
Hello, does anybody know why it seems that it's not possible to build a kernel with "CONFIG_HAVE_KVM=n"? It always switches back to "y" with every kernel build and I have no clue why. I'm using 2.6.33 vanilla. regards Roland
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 10:08 PM, Olivier Galibert wrote: On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. Where a failure brings the system down or worse. Where you have to bear the memory footprint whether you ever use the functionality or not. Where to update the functionality you need to deploy a new kernel (possibly introducing unrelated bugs) and reboot. If userspace daemons are such a deployment and usability nightmare, maybe we should fix that instead. Which userspace? Deploying *anything* in the guest can be a nightmare, including paravirt drivers if you don't have a natively supported in the OS virtual hardware backoff. That includes the guest kernel. If you can deploy a new kernel in the guest, presumably you can deploy a userspace package. Deploying things in the host OTOH is business as usual. True. And you're smart enough to know that. Thanks. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 09:59 PM, Ingo Molnar wrote: Frankly, i was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. It's only clearly inferior if you ignore every consideration against it. It's definitely not a deployment non-starter, see the tons of daemons that come with any Linux system. The basic ones are installed and enabled automatically during system installation. Furthermore, allowing a guest to integrate/mount its files into the host VFS space (which was my suggestion) has many other uses and advantages as well, beyond the instrumentation/symbol-lookup purpose. Yes. I'm just not sure about the auto-enabling part. So can we please have some resolution here and move on: the KVM maintainers should either suggest a different transparent approach, or should retract the NAK for the solution we suggested. So long as you define 'transparent' as in 'only the guest kernel is involved' or even 'only the guest and host kernels are involved' we aren't going to make a lot of progress. I oppose shoving random bits of functionality into the kernel, especially things that aren't in daily use. While we developers do and will use profiling extensively, it doesn't need to sit in every guest's non-swappable .text. We very much want to make progress and want to write code, but obviously we cannot code against a maintainer NAK, nor can we code up an inferior solution either. You haven't heard any NAKs, only objections. If we discuss things perhaps we can achieve something that works for everyone. If we keep turning the flames higher that's unlikely. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: Tracking KVM development
On 21.03.2010, at 17:42, Avi Kivity wrote: > On 03/21/2010 06:37 PM, Thomas Løcke wrote: >> On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: >> >>> Tracking git repositories and stable setups are mutually exclusive. If you >>> are interested in something stable I recommend staying with the distribution >>> provided setup (and picking a distribution that has an emphasis on kvm). If >>> you want to track upstream, use qemu-kvm-0.12.x stable releases and >>> kernel.org 2.6.x.y stable releases. If you want to track git repositories, >>> use qemu-kvm.git and kvm.git for the kernel and kvm. >>> >> Thanks Avi. >> >> I will stay with the stable qemu-kvm releases and stable kernel.org >> kernel releases from now on. >> >> I've never heard of any KVM specific distributions. Are you aware of >> any? My primary reason for going with Slackware, is because I already >> know it. But if there are better choices for a KVM virtualization >> host, then I'm willing to switch. >> > > The only kvm-specific distribution I know of is RHEV-H, but that's probably > not what you're looking for. I'm talking about distributions that have an > active kvm package maintainer, update the packages regularly, have bug > trackers that someone looks into, etc. At least Fedora and Ubuntu do this, > perhaps openSuSE as well (though the latter has a stronger Xen emphasis). Yes, we do. Though openSUSE 11.2 isn't exactly where I want it to be. Expect 11.3 to be a lot better there. Alex
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On Sun, Mar 21, 2010 at 10:01:51PM +0200, Avi Kivity wrote: > On 03/21/2010 09:17 PM, Ingo Molnar wrote: > > > >Adding any new daemon to an existing guest is a deployment and usability > >nightmare. > > > > The logical conclusion of that is that everything should be built into > the kernel. Where a failure brings the system down or worse. Where you > have to bear the memory footprint whether you ever use the functionality > or not. Where to update the functionality you need to deploy a new > kernel (possibly introducing unrelated bugs) and reboot. > > If userspace daemons are such a deployment and usability nightmare, > maybe we should fix that instead. Which userspace? Deploying *anything* in the guest can be a nightmare, including paravirt drivers if you don't have a natively supported in the OS virtual hardware backoff. Deploying things in the host OTOH is business as usual. And you're smart enough to know that. OG.
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/21/2010 09:17 PM, Ingo Molnar wrote: Adding any new daemon to an existing guest is a deployment and usability nightmare. The logical conclusion of that is that everything should be built into the kernel. Where a failure brings the system down or worse. Where you have to bear the memory footprint whether you ever use the functionality or not. Where to update the functionality you need to deploy a new kernel (possibly introducing unrelated bugs) and reboot. If userspace daemons are such a deployment and usability nightmare, maybe we should fix that instead. The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated. You have to modify the guest anyway by deploying a new kernel. Please try to think with the heads of our users and developers and don't suggest some weird ivory-tower design that is totally impractical ... An inetd.d style 'drop a listener config here and it will be executed on connection' should work. The listener could come with the kernel package, though I don't think it's a good idea. module-init-tools doesn't and people have survived somehow. And no, you have to code none of this, we'll do all the coding. The only thing we are asking is for you to not stand in the way of good usability ... Thanks. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.
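Avi's inetd-style suggestion could look roughly like the following entry (purely illustrative: the service name, port number, and the `perf-guestd` helper path are invented here, not anything actually proposed in the thread):

```
# /etc/xinetd.d/perf-guest (hypothetical): the guest runs no permanent
# daemon; xinetd execs a short-lived helper only when the host connects.
service perf-guest
{
        type            = UNLISTED
        port            = 8192
        socket_type     = stream
        protocol        = tcp
        wait            = no
        user            = root
        server          = /usr/sbin/perf-guestd
        disable         = no
}
```

The point of the design is that nothing sits resident in guest memory between profiling sessions; the helper lives only for the duration of a host connection.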
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Antoine Martin wrote: > On 03/22/2010 02:17 AM, Ingo Molnar wrote: > >* Anthony Liguori wrote: > >>On 03/19/2010 03:53 AM, Ingo Molnar wrote: > >>>* Avi Kivity wrote: > >There were two negative reactions immediately, both showed a fundamental > >server versus desktop bias: > > > > - you did not accept that the most important usecase is when there is a > >single guest running. > Well, it isn't. > >>>Erm, my usability points are _doubly_ true when there are multiple guests > >>>... > >>> > >>>The inconvenience of having to type: > >>> > >>> perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \ > >>> --guestmodules=/home/ymzhang/guest/modules top > >>> > >>>is very obvious even with a single guest. Now multiply that by more guests > >>>... > >>If you want to improve this, you need to do the following: > >> > >>1) Add a userspace daemon that uses vmchannel that runs in the guest and can > >>fetch kallsyms and arbitrary modules. If that daemon lives in > >>tools/perf, that's fine. > > > > Adding any new daemon to an existing guest is a deployment and usability > > nightmare. > > Absolutely. In most cases it is not desirable, and you'll find that in a lot > of cases it is not even possible - for non-technical reasons. > > One of the main benefits of virtualization is the ability to manage and see > things from the outside. > > > The basic rule of good instrumentation is to be transparent. The moment we > > have to modify the user-space of a guest just to monitor it, the purpose > > of transparent instrumentation is defeated. > > Not to mention Heisenbugs and interference. Correct. Frankly, I was surprised (and taken slightly off base) by both Avi and Anthony suggesting such a clearly inferior "add a daemon to the guest space" solution. It's a usability and deployment non-starter. 
Furthermore, allowing a guest to integrate/mount its files into the host VFS space (which was my suggestion) has many other uses and advantages as well, beyond the instrumentation/symbol-lookup purpose. So can we please have some resolution here and move on: the KVM maintainers should either suggest a different transparent approach, or should retract the NAK for the solution we suggested. We very much want to make progress and want to write code, but obviously we cannot code against a maintainer NAK, nor can we code up an inferior solution either. Thanks, Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/22/2010 02:17 AM, Ingo Molnar wrote: * Anthony Liguori wrote: On 03/19/2010 03:53 AM, Ingo Molnar wrote: * Avi Kivity wrote: There were two negative reactions immediately, both showed a fundamental server versus desktop bias: - you did not accept that the most important usecase is when there is a single guest running. Well, it isn't. Erm, my usability points are _doubly_ true when there are multiple guests ... The inconvenience of having to type: perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \ --guestmodules=/home/ymzhang/guest/modules top is very obvious even with a single guest. Now multiply that by more guests ... If you want to improve this, you need to do the following: 1) Add a userspace daemon that uses vmchannel that runs in the guest and can fetch kallsyms and arbitrary modules. If that daemon lives in tools/perf, that's fine. Adding any new daemon to an existing guest is a deployment and usability nightmare. Absolutely. In most cases it is not desirable, and you'll find that in a lot of cases it is not even possible - for non-technical reasons. One of the main benefits of virtualization is the ability to manage and see things from the outside. The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated. Not to mention Heisenbugs and interference. Cheers Antoine That was one of the fundamental usability mistakes of Oprofile. There is no 'perf' daemon - all the perf functionality is _built in_, and for very good reasons. It is one of the main reasons for perf's success as well. Now Qemu is trying to repeat that stupid mistake ... So please either suggest a different transparent solution that is technically better than the one i suggested, or you should concede the point really. 
Please try to think with the heads of our users and developers and don't suggest some weird ivory-tower design that is totally impractical ... And no, you have to code none of this, we'll do all the coding. The only thing we are asking is for you to not stand in the way of good usability ... Thanks, Ingo
Streaming Audio from Virtual Machine
I'm using Kubuntu 9.10 32-bit on a quad-core Phenom II with Gigabit ethernet. I want to stream audio from MLB.com from a WinXP client thru a Linksys WMB54G wireless music bridge. Note that there are drivers for the WMB54G only for WinXP and Vista. If I stream the audio thru a native WinXP box thru the WMB54G, all is well and the audio sounds fine. When I try to stream thru a WinXP virtual machine on Kubuntu 9.10, the audio is poor quality and subject to gaps and dropping the stream altogether. So far I've tried KVM/QEMU and VirtualBox, same result. Regarding KVM/QEMU, I note AMD-V is activated in the BIOS, and I have a custom 2.6.32.7 kernel and QEMU 0.11.0. The kvm and kvm_amd modules are compiled and loaded. I've been using bridged networking. I think it's set up correctly but I confess I'm no networking expert. My start command for the WinXP virtual machine is:

sudo /usr/bin/qemu -m 1024 -boot c -net nic,vlan=0,macaddr=00:d0:13:b0:2d:32,model=rtl8139 -net tap,vlan=0,ifname=tap0,script=/etc/qemu-ifup -localtime -soundhw ac97 -smp 4 -fda /dev/fd0 -vga std -usb /home/rbroman/windows.img

I also tried model=virtio but that didn't help. I suspect this is a virtual machine networking problem but I'm not sure. So my questions are:

- What's the best/fastest networking option and how do I set it up? Pointers to step-by-step instructions appreciated.
- Is it possible I have a problem other than networking? Configuration problem with KVM/QEMU? Or could there be a problem with the WMB54G driver when used thru a virtual machine?
- Is there a better virtual machine solution than KVM/QEMU for what I'm trying to do? Recommendations appreciated.

- Gus
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Anthony Liguori wrote:

> On 03/19/2010 03:53 AM, Ingo Molnar wrote:
> > * Avi Kivity wrote:
> >
> >>> There were two negative reactions immediately, both showed a fundamental
> >>> server versus desktop bias:
> >>>
> >>>  - you did not accept that the most important usecase is when there is a
> >>>    single guest running.
> >>
> >> Well, it isn't.
> >
> > Erm, my usability points are _doubly_ true when there are multiple guests ...
> >
> > The inconvenience of having to type:
> >
> >   perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \
> >     --guestmodules=/home/ymzhang/guest/modules top
> >
> > is very obvious even with a single guest. Now multiply that by more guests
> > ...
>
> If you want to improve this, you need to do the following:
>
> 1) Add a userspace daemon that uses vmchannel that runs in the guest and can
>    fetch kallsyms and arbitrary modules. If that daemon lives in
>    tools/perf, that's fine.

Adding any new daemon to an existing guest is a deployment and usability nightmare.

The basic rule of good instrumentation is to be transparent. The moment we have to modify the user-space of a guest just to monitor it, the purpose of transparent instrumentation is defeated.

That was one of the fundamental usability mistakes of Oprofile. There is no 'perf' daemon - all the perf functionality is _built in_, and for very good reasons. It is one of the main reasons for perf's success as well. Now Qemu is trying to repeat that stupid mistake ...

So please either suggest a different transparent solution that is technically better than the one I suggested, or you should concede the point really.

Please try to think with the heads of our users and developers and don't suggest some weird ivory-tower design that is totally impractical ...

And no, you have to code none of this, we'll do all the coding. The only thing we are asking is for you to not stand in the way of good usability ...
Thanks, Ingo
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
* Avi Kivity wrote:

> >> [...] Second, from my point of view all contributors are volunteers
> >> (perhaps their employer volunteered them, but there's no difference from
> >> my perspective). Asking them to repaint my apartment as a condition to
> >> get a patch applied is abuse. If a patch is good, it gets applied.
> >
> > This is one of the weirdest arguments I've seen in this thread. Almost all
> > the time do we make contributions conditional on the general shape of the
> > project. Developers don't get to do just the fun stuff.
>
> So, do you think a reply to a patch along the lines of
>
>   NAK. Improving scalability is pointless while we don't have a decent GUI.
>   I'll review your RCU patches _after_ you've contributed a usable GUI.
>
> ?

What does this have to do with RCU? I'm talking about KVM, which is a Linux kernel feature that is useless without a proper, KVM-specific app making use of it.

RCU is a general kernel performance feature that works across the board. It helps KVM indirectly, and it helps many other kernel subsystems as well. It needs no user-space tool to be useful. KVM on the other hand is useless without a user-space tool.

[ Theoretically you might have a fair point if it were a critical feature of RCU for it to have a GUI, and if the main tool that made use of it sucked. But it isn't and you should know that. ]

Had you suggested the following 'NAK', applied to a different, relevant subsystem:

| NAK. Improving scalability is pointless while we don't have a usable
| tool. I'll review your perf patches _after_ you've contributed a usable
| tool.

you would have a fair point. In fact, we are doing that; we are living by that. It makes absolutely zero sense to improve the scalability of perf if its usability sucks.

So where you are trying to point out an inconsistency in my argument, there is none.
> > This is a basic quid pro quo: new features introduce risks and create
> > additional workload not just on the originating developer but on the rest
> > of the community as well. You should check how Linus has pulled new
> > features in the past 15 years: he very much requires the existing code to
> > first be top-notch before he accepts new features for a given area of
> > functionality.
>
> For a given area, yes. [...]

That is my precise point. KVM is a specific subsystem or "area" that makes no sense without the user-space tooling it relates to. You seem to argue that you have no 'right' to insist on good quality of that tooling - and IMO you are fundamentally wrong about that. Thanks, Ingo
Re: [PATCH] Enhance perf to collect KVM guest os statistics from host side
* Joerg Roedel wrote:

> On Fri, Mar 19, 2010 at 09:21:22AM +0100, Ingo Molnar wrote:
> > Unfortunately, in a previous thread the Qemu maintainer has indicated that
> > he will essentially NAK any attempt to enhance Qemu to provide an easily
> > discoverable, self-contained, transparent guest mount on the host side.
> >
> > No technical justification was given for that NAK, despite my repeated
> > requests to articulate the exact security problems that such an approach
> > would cause.
> >
> > If that NAK does not stand in that form then I'd like to know about it - it
> > makes no sense for us to try to code up a solution against a standing
> > maintainer NAK ...
>
> I still think it is the best and most generic way to let the guest do the
> symbol resolution. [...]

Not really.

> [...] This has several advantages:
>
> 1. The guest knows best about its symbol space. So this would be
>    extensible to other guest operating systems. A brave
>    developer may even implement symbol passing for Windows or
>    the BSDs ;-)

Having access to the actual executable files that include the symbols achieves precisely that - with the additional robustness that all this functionality is concentrated in the host, while the guest side is kept minimal (and transparent).

> 2. The guest can decide on its own if it wants to pass this
>    information to the host-perf. No security issues at all.

It can decide whether it exposes the files. Nor are there any "security issues" to begin with.

> 3. The guest can also pass us the call-chain and we don't need
>    to care about the complicated fetching from the guest
>    ourselves.

You need to be aware of the fact that symbol resolution is a separate step from call-chain generation. I.e. call-chains are an (entirely) separate issue, and could reasonably be done in the guest or in the host. It has no bearing on this symbol resolution question.

> 4. This way it is extensible to nested virtualization too.
Nested virtualization is actually already taken care of by the filesystem solution via an existing method called 'subdirectories'. If the guest offers sub-guests then those symbols will be exposed in a similar way via its own 'guest files' directory hierarchy. I.e. if we have 'Guest-2' nested inside the 'Guest-Fedora-1' instance, we get:

  /guests/
  /guests/Guest-Fedora-1/etc/
  /guests/Guest-Fedora-1/usr/

we'd also have:

  /guests/Guest-Fedora-1/guests/Guest-2/

So this is taken care of automatically. I.e. none of the four 'advantages' listed here are actually advantages over my proposed solution, so your conclusion is subsequently flawed as well.

> How we speak to the guest was already discussed in this thread. My personal
> opinion is that going through qemu is an unnecessary step and we can solve
> that more cleverly and transparently for perf.

Meaning exactly what? Thanks, Ingo
Re: unexpected exit_ini_info when nesting svm
Hello Olivier,

On Thu, Mar 18, 2010 at 08:43:53PM +0100, Olivier Berghmans wrote:
> I tried nesting kvm in kvm on an AMD processor with support for svm
> and npt (the dmesg told me both were in use). I managed to install the
> nested kvm and when starting the L2 guest in order to install an
> operating system, I got following messages in the L1 guest:
>
> [ 2016.712047] handle_exit: unexpected exit_ini_info 0x8008 exit_code 0x60
> [ 2031.432032] handle_exit: unexpected exit_ini_info 0x8008 exit_code 0x60
> [ 2034.468058] handle_exit: unexpected exit_ini_info 0x8008 exit_code 0x60

These messages result from a difference between real hardware SVM and the emulated SVM in KVM. Hardware SVM always injects an exception first before it does a #vmexit(0x60), while the SVM emulation immediately does the #vmexit again. I have a patch to fix this but it needs more testing. The patch implements detection of the above situation and sends a self-IPI in this case.

Joerg
Re: Tracking KVM development
On 03/21/2010 06:37 PM, Thomas Løcke wrote: On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: Tracking git repositories and stable setups are mutually exclusive. If you are interested in something stable I recommend staying with the distribution provided setup (and picking a distribution that has an emphasis on kvm). If you want to track upstream, use qemu-kvm-0.12.x stable releases and kernel.org 2.6.x.y stable releases. If you want to track git repositories, use qemu-kvm.git and kvm.git for the kernel and kvm. Thanks Avi. I will stay with the stable qemu-kvm releases and stable kernel.org kernel releases from now on. I've never heard of any KVM specific distributions. Are you aware of any? My primary reason for going with Slackware, is because I already know it. But if there are better choices for a KVM virtualization host, then I'm willing to switch. The only kvm-specific distribution I know of is RHEV-H, but that's probably not what you're looking for. I'm talking about distributions that have an active kvm package maintainer, update the packages regularly, have bug trackers that someone looks into, etc. At least Fedora and Ubuntu do this, perhaps openSuSE as well (though the latter has a stronger Xen emphasis). -- error compiling committee.c: too many arguments to function
Re: Tracking KVM development
On Sun, Mar 21, 2010 at 1:23 PM, Avi Kivity wrote: > Tracking git repositories and stable setups are mutually exclusive. If you > are interested in something stable I recommend staying with the distribution > provided setup (and picking a distribution that has an emphasis on kvm). If > you want to track upstream, use qemu-kvm-0.12.x stable releases and > kernel.org 2.6.x.y stable releases. If you want to track git repositories, > use qemu-kvm.git and kvm.git for the kernel and kvm. Thanks Avi. I will stay with the stable qemu-kvm releases and stable kernel.org kernel releases from now on. I've never heard of any KVM specific distributions. Are you aware of any? My primary reason for going with Slackware, is because I already know it. But if there are better choices for a KVM virtualization host, then I'm willing to switch. :o) /Thomas
Re: Strange CPU usage pattern in SMP guest
On Sun, Mar 21, 2010 at 05:17:38PM +0200, Avi Kivity wrote:
> On 03/21/2010 04:55 PM, Sebastian Hetze wrote:
>> On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
>>> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>>> 12:46:02  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
>>>> 12:46:03  all  0,20  11,35  10,96    8,96   0,40   2,99    0,00    0,00  65,14
>>>> 12:46:03    0  1,00  11,00   7,00   15,00   0,00   1,00    0,00    0,00  65,00
>>>> 12:46:03    1  0,00   7,14   2,04    6,12   1,02  11,22    0,00    0,00  72,45
>>>> 12:46:03    2  0,00  15,00   1,00   12,00   0,00   1,00    0,00    0,00  71,00
>>>> 12:46:03    3  0,00  11,00  23,00    8,00   0,00   0,00    0,00    0,00  58,00
>>>> 12:46:03    4  0,00   0,00  50,00    0,00   0,00   0,00    0,00    0,00  50,00
>>>> 12:46:03    5  0,00  13,00  20,00    4,00   0,00   1,00    0,00    0,00  62,00
>>>>
>>>> So it is only CPU4 that is showing this strange behaviour.
>>> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
>>> /proc/interrupts' from that guest.
>>>
>>> Most likely the timer interrupt for cpu4 died.
>>
>> I've added two keys +/- to your irqtop to focus up and down
>> in the row of available CPUs.
>> The irqtop for CPU4 shows a constant number of 6 local timer interrupts
>> per update, while the other CPUs show various higher values:
>>
>> irqtop for cpu 4
>>
>> eth0                          188
>> Rescheduling interrupts       162
>> Local timer interrupts          6
>> ata_piix                        3
>> TLB shootdowns                  1
>> Spurious interrupts             0
>> Machine check exceptions        0
>>
>> irqtop for cpu 5
>>
>> eth0                          257
>> Local timer interrupts        251
>> Rescheduling interrupts       237
>> Spurious interrupts             0
>> Machine check exceptions        0
>>
>> So the timer interrupt for cpu4 is not completely dead but somehow
>> broken.
>
> That is incredibly weird.
>
>> What can cause this problem? Any way to speed it up again?
>
> The host has 8 cpus and is only running this 6 vcpu guest, yes?

The host is a dual quad-core E5520 with hyperthreading enabled, so we see 2x4x2 = 16 CPUs on the host. The guest is started with 6 CPUs.

> Can you confirm the other vcpus are ticking at 250 Hz?

The irqtop shows different numbers for local timer interrupts on the other CPUs.
The total number (summed up over all CPUs) varies between something like 700 and 1400. Any CPU can be down to 10 and next update up to 260. Only CPU4 stays at the 6 local timer interrupts. > > What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a > last-used-cpu field in the display. The processes are not bound to a particular CPU, so the picture varies. Here are two shots: take1: 15 root RT -5 000 S0 0.0 0:01.70 4 migration/4 16 root 15 -5 000 S0 0.0 0:00.08 4 ksoftirqd/4 17 root RT -5 000 S0 0.0 0:00.00 4 watchdog/4 25 root 15 -5 000 S0 0.0 0:00.01 4 events/4 35 root 15 -5 000 S0 0.0 0:00.00 4 kintegrityd/4 41 root 15 -5 000 S0 0.0 0:00.03 4 kblockd/4 50 root 15 -5 000 S0 0.0 0:00.90 4 ata/4 55 root 15 -5 000 S0 0.0 0:00.00 4 kseriod 66 root 15 -5 000 S0 0.0 0:00.00 4 aio/4 73 root 15 -5 000 S0 0.0 0:00.00 4 crypto/4 80 root 15 -5 000 S0 0.0 2:11.71 4 scsi_eh_1 87 root 15 -5 000 S0 0.0 0:00.00 4 kmpathd/4 95 root 15 -5 000 S0 0.0 0:00.00 4 kondemand/4 101 root 15 -5 000 S0 0.0 0:00.00 4 kconservative/4 103 root 10 -10 000 S0 0.0 0:00.00 4 krfcommd 681 root 15 -5 000 S0 0.0 0:00.00 4 kdmflush 686 root 15 -5 000 S0 0.0 0:00.00 4 kdmflush 691 root 15 -5 000 S0 0.0 0:00.00 4 kdmflush 737 root 15 -5 000 S0 0.0 0:00.71 4 kjournald 826 root 16 -4 2100 452 312 S0 0.0 0:00.14 4 udevd 1350 root 15 -5 000 S0 0.0 0:00.00 4 kpsmoused 1444 root 15 -5 000 S0 0.0 0:00.00 4 kgameportd 1718 root 15 -5 000 S0 0.0 0:14.62 4 kjournald 2108 statd 20 0 2252 1152 760 S0 0.0 0:02.66 4 rpc.statd 2117 root 15 -5 000 S0 0.0 0:00.36 4 rpciod/4 2123 root 15 -5 00
Re: Strange CPU usage pattern in SMP guest
On 03/21/2010 04:55 PM, Sebastian Hetze wrote: On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote: On 03/21/2010 02:02 PM, Sebastian Hetze wrote: 12:46:02 CPU%usr %nice%sys %iowait%irq %soft %steal %guest %idle 12:46:03 all0,20 11,35 10,968,960,402,990,00 0,00 65,14 12:46:03 01,00 11,007,00 15,000,001,000,00 0,00 65,00 12:46:03 10,007,142,046,121,02 11,220,00 0,00 72,45 12:46:03 20,00 15,001,00 12,000,001,000,00 0,00 71,00 12:46:03 30,00 11,00 23,008,000,000,000,00 0,00 58,00 12:46:03 40,000,00 50,000,000,000,000,00 0,00 50,00 12:46:03 50,00 13,00 20,004,000,001,000,00 0,00 62,00 So it is only CPU4 that is showing this strange behaviour. Can you adjust irqtop to only count cpu4? or even just post a few 'cat /proc/interrupts' from that guest. Most likely the timer interrupt for cpu4 died. I've added two keys +/- to your irqtop to focus up and down in the row of available CPUs. The irqtop for CPU4 shows a constant number of 6 local timer interrupts per update, while the other CPUs show various higher values: irqtop for cpu 4 eth0 188 Rescheduling interrupts 162 Local timer interrupts 6 ata_piix3 TLB shootdowns 1 Spurious interrupts 0 Machine check exceptions0 irqtop for cpu 5 eth0 257 Local timer interrupts251 Rescheduling interrupts 237 Spurious interrupts 0 Machine check exceptions0 So the timer interrupt for cpu4 is not completely dead but somehow broken. That is incredibly weird. What can cause this problem? Any way to speed it up again? The host has 8 cpus and is only running this 6 vcpu guest, yes? Can you confirm the other vcpus are ticking at 250 Hz? What does 'top' show running on cpu 4? Pressing 'f' 'j' will add a last-used-cpu field in the display. Marcelo, any ideas? -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: x86 emulator: fix unlocked CMPXCHG8B emulation.
When CMPXCHG8B is executed without a LOCK prefix it is racy. Preserve this behaviour in the emulator too.

Signed-off-by: Gleb Natapov

---
This patch goes on top of my previous "KVM: x86 emulator: add decoding of CMPXCHG8B dst operand." patch.

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 904351e..e2bbb9c 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1724,7 +1724,6 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt,
 			(u32) c->regs[VCPU_REGS_RBX];
 
 		ctxt->eflags |= EFLG_ZF;
-		c->lock_prefix = 1;
 	}
 	return X86EMUL_CONTINUE;
 }

--
Gleb.
Re: Strange CPU usage pattern in SMP guest
On Sun, Mar 21, 2010 at 02:19:40PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
>>
>> 12:46:02  CPU  %usr  %nice   %sys  %iowait  %irq  %soft  %steal  %guest  %idle
>> 12:46:03  all  0,20  11,35  10,96    8,96   0,40   2,99    0,00    0,00  65,14
>> 12:46:03    0  1,00  11,00   7,00   15,00   0,00   1,00    0,00    0,00  65,00
>> 12:46:03    1  0,00   7,14   2,04    6,12   1,02  11,22    0,00    0,00  72,45
>> 12:46:03    2  0,00  15,00   1,00   12,00   0,00   1,00    0,00    0,00  71,00
>> 12:46:03    3  0,00  11,00  23,00    8,00   0,00   0,00    0,00    0,00  58,00
>> 12:46:03    4  0,00   0,00  50,00    0,00   0,00   0,00    0,00    0,00  50,00
>> 12:46:03    5  0,00  13,00  20,00    4,00   0,00   1,00    0,00    0,00  62,00
>>
>> So it is only CPU4 that is showing this strange behaviour.
>
> Can you adjust irqtop to only count cpu4? or even just post a few 'cat
> /proc/interrupts' from that guest.
>
> Most likely the timer interrupt for cpu4 died.

I've added two keys +/- to your irqtop to focus up and down in the row of available CPUs. The irqtop for CPU4 shows a constant number of 6 local timer interrupts per update, while the other CPUs show various higher values:

irqtop for cpu 4

eth0                          188
Rescheduling interrupts       162
Local timer interrupts          6
ata_piix                        3
TLB shootdowns                  1
Spurious interrupts             0
Machine check exceptions        0

irqtop for cpu 5

eth0                          257
Local timer interrupts        251
Rescheduling interrupts       237
Spurious interrupts             0
Machine check exceptions        0

So the timer interrupt for cpu4 is not completely dead but somehow broken. What can cause this problem? Any way to speed it up again?
#!/usr/bin/python
import curses
import sys, os, time, optparse

def read_interrupts():
    global target
    irq = {}
    proc = file('/proc/interrupts')
    nrcpu = len(proc.readline().split())
    if target < 0:
        target = 0
    if target > nrcpu:
        target = nrcpu
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        if target == 0:
            count = sum([int(x) for x in counts])
        else:
            count = int(counts[target-1])
        try:
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

target = 0
label_width = 35
number_width = 10

def tui(screen):
    curses.use_default_colors()
    global target
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        if target > 0:
            title = "irqtop for cpu %d" % (target-1)
        else:
            title = "irqtop sum for all cpu's"
        screen.addstr(0, 0, title)
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()
    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
            if c == '+':
                target = target + 1
            if c == '-':
                target = target - 1
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)
Re: [PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
On 03/21/2010 04:44 PM, Gleb Natapov wrote: On Sun, Mar 21, 2010 at 04:41:24PM +0200, Avi Kivity wrote: On 03/21/2010 01:08 PM, Gleb Natapov wrote: Decode CMPXCHG8B destination operand in decoding stage. Fixes regression introduced by "If LOCK prefix is used dest arg should be memory" commit. This commit relies on dst operand be decoded at the beginning of an instruction emulation. @@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, c->regs[VCPU_REGS_RAX] = (u32) (old>> 0); c->regs[VCPU_REGS_RDX] = (u32) (old>> 32); ctxt->eflags&= ~EFLG_ZF; - } else { - new = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | + c->dst.val = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | (u32) c->regs[VCPU_REGS_RBX]; - rc = ops->cmpxchg_emulated(c->modrm_ea,&old,&new, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; ctxt->eflags |= EFLG_ZF; + c->lock_prefix = 1; Why is this bit needed? cmpxchg64b without lock is valid and racy, but the guest may know it is safe. Agree. Before this patch cmpxchg8b emulation always called cmpxchg_emulated(), so to be extra careful I wanted to preserve old behaviour. Resend the patch without this line? Better a 3/2 that removes it. So we have a large patch that just transforms code, and a small patch that corrects an earlier bug. May help a bisector one day. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
On Sun, Mar 21, 2010 at 04:41:24PM +0200, Avi Kivity wrote: > On 03/21/2010 01:08 PM, Gleb Natapov wrote: > >Decode CMPXCHG8B destination operand in decoding stage. Fixes regression > >introduced by "If LOCK prefix is used dest arg should be memory" commit. > >This commit relies on dst operand be decoded at the beginning of an > >instruction emulation. > > >@@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct > >x86_emulate_ctxt *ctxt, > > c->regs[VCPU_REGS_RAX] = (u32) (old>> 0); > > c->regs[VCPU_REGS_RDX] = (u32) (old>> 32); > > ctxt->eflags&= ~EFLG_ZF; > >- > > } else { > >-new = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | > >+c->dst.val = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | > >(u32) c->regs[VCPU_REGS_RBX]; > > > >-rc = ops->cmpxchg_emulated(c->modrm_ea,&old,&new, 8, > >ctxt->vcpu); > >-if (rc != X86EMUL_CONTINUE) > >-return rc; > > ctxt->eflags |= EFLG_ZF; > >+c->lock_prefix = 1; > > Why is this bit needed? cmpxchg64b without lock is valid and racy, > but the guest may know it is safe. > Agree. Before this patch cmpxchg8b emulation always called cmpxchg_emulated(), so to be extra careful I wanted to preserve old behaviour. Resend the patch without this line? -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
On 03/21/2010 01:08 PM, Gleb Natapov wrote: Decode CMPXCHG8B destination operand in decoding stage. Fixes regression introduced by "If LOCK prefix is used dest arg should be memory" commit. This commit relies on dst operand be decoded at the beginning of an instruction emulation. @@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, c->regs[VCPU_REGS_RAX] = (u32) (old>> 0); c->regs[VCPU_REGS_RDX] = (u32) (old>> 32); ctxt->eflags&= ~EFLG_ZF; - } else { - new = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | + c->dst.val = ((u64)c->regs[VCPU_REGS_RCX]<< 32) | (u32) c->regs[VCPU_REGS_RBX]; - rc = ops->cmpxchg_emulated(c->modrm_ea,&old,&new, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; ctxt->eflags |= EFLG_ZF; + c->lock_prefix = 1; Why is this bit needed? cmpxchg64b without lock is valid and racy, but the guest may know it is safe. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
On 03/21/2010 04:35 PM, Gleb Natapov wrote: On Sun, Mar 21, 2010 at 04:32:42PM +0200, Avi Kivity wrote: On 03/21/2010 01:09 PM, Gleb Natapov wrote: Wrong To: header. Ignore please. See sendemail.aliasesfile in 'git help send-email'. I use aliasesfile, but unfortunately if an alias is not found there git does not complain, just passes it as-is to sendmail, and sendmail adds the part after @ by itself. Ah. Then don't use sendmail. -- error compiling committee.c: too many arguments to function
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
On Sun, Mar 21, 2010 at 04:32:42PM +0200, Avi Kivity wrote:
> On 03/21/2010 01:09 PM, Gleb Natapov wrote:
> > Wrong To: header. Ignore please.
>
> See sendemail.aliasesfile in 'git help send-email'.

I use aliasesfile, but unfortunately if an alias is not found there git does not complain, just passes it as-is to sendmail, and sendmail adds the part after @ by itself.

--
Gleb.
Re: [PATCH] KVM: Fix a build error
On 03/20/2010 07:17 PM, Amos Kong wrote: arch/x86/kvm/x86.c: In function ‘emulator_cmpxchg_emulated’: arch/x86/kvm/x86.c:3367: error: ‘u’ undeclared (first use in this function) arch/x86/kvm/x86.c:3367: error: (Each undeclared identifier is reported only once arch/x86/kvm/x86.c:3367: error: for each function it appears in.) arch/x86/kvm/x86.c:3367: error: expected expression before ‘)’ token Thanks, just applied the same patch from Jan. -- error compiling committee.c: too many arguments to function
Re: [PATCH] KVM: x86: Fix 32-bit build breakage due to typo
On 03/20/2010 11:14 AM, Jan Kiszka wrote: Obviously, the 64-bit case is considered stable now and 32 bit remained untested (not included in autotest?). We don't autotest on 32-bit hosts these days. So here is the build fix: Thanks, applied. Should have done it myself. -- error compiling committee.c: too many arguments to function
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
On 03/21/2010 01:09 PM, Gleb Natapov wrote: Wrong To: header. Ignore please. See sendemail.aliasesfile in 'git help send-email'. -- error compiling committee.c: too many arguments to function
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On Thu, Mar 18, 2010 at 05:13:10PM +0100, Ingo Molnar wrote: > > Why does Linux AIO still suck? Why do we not have a proper interface in > > userspace for doing asynchronous file system operations? > > Good that you mention it, i think it's an excellent example. > > The suckage of kernel async IO is for similar reasons: there's an ugly package > separation problem between the kernel and between glibc - and between the apps > that would make use of it. No, kernel async IO sucks because it still does not play well with buffered I/O. Last time I checked (about a year ago or so), AIO syscall latencies were much worse when buffered I/O was used compared to direct I/O. Unfortunately, to achieve good performance with direct I/O, you need a HW RAID card with lots of on-board cache. Gabor
Re: [PATCH] KVM: Drop KVM_REQ_PENDING_TIMER
On 03/20/2010 05:20 AM, Xiao Wang wrote: The pending timer is not detected through KVM_REQ_PENDING_TIMER now. It does, see the commit message of 06e056456. Marcelo, IIRC this is the second time we get this patch... we need either a comment in the code, or better, a fix that doesn't involve an atomic in the fast path. -- error compiling committee.c: too many arguments to function
Re: Tracking KVM development
On 03/21/2010 01:21 PM, Thomas Løcke wrote: Hey all, I've recently started testing KVM as a possible virtualization solution for a bunch of servers, and so far things are going pretty well. My OS of choice is Slackware, and I usually just go with whatever kernel Slackware comes with. But with KVM I feel I might need to pay a bit more attention to that part of Slackware, as it appears to be a project in rapid development, so my questions concern how best to track and keep KVM up-to-date. Currently I upgrade to the latest stable kernel almost as soon as it's been released by Linus, and I track qemu-kvm using this Git repository: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git But should I perhaps also track the KVM modules, and if so, from where? Any and all suggestions for keeping a healthy and stable KVM setup running are more than welcome. Tracking git repositories and stable setups are mutually exclusive. If you are interested in something stable I recommend staying with the distribution-provided setup (and picking a distribution that has an emphasis on kvm). If you want to track upstream, use qemu-kvm-0.12.x stable releases and kernel.org 2.6.x.y stable releases. If you want to track git repositories, use qemu-kvm.git and kvm.git for the kernel and kvm. -- error compiling committee.c: too many arguments to function
Re: Strange CPU usage pattern in SMP guest
On 03/21/2010 02:02 PM, Sebastian Hetze wrote:
12:46:02 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:03 all    0,20  11,35  10,96    8,96   0,40   2,99   0,00   0,00  65,14
12:46:03   0    1,00  11,00   7,00   15,00   0,00   1,00   0,00   0,00  65,00
12:46:03   1    0,00   7,14   2,04    6,12   1,02  11,22   0,00   0,00  72,45
12:46:03   2    0,00  15,00   1,00   12,00   0,00   1,00   0,00   0,00  71,00
12:46:03   3    0,00  11,00  23,00    8,00   0,00   0,00   0,00   0,00  58,00
12:46:03   4    0,00   0,00  50,00    0,00   0,00   0,00   0,00   0,00  50,00
12:46:03   5    0,00  13,00  20,00    4,00   0,00   1,00   0,00   0,00  62,00
So it is only CPU4 that is showing this strange behaviour. Can you adjust irqtop to only count cpu4? Or even just post a few 'cat /proc/interrupts' from that guest. Most likely the timer interrupt for cpu4 died. -- error compiling committee.c: too many arguments to function
hi, may I ask some help on the paravirtualization of KVM?
I want to set up virtio-net for the guest OS on KVM. These are my steps:
1. Compile kvm-88, then make and make install.
2. Compile the guest OS (Red Hat) with kernel version 2.6.27.45 (with virtio support). The required options are all selected:
o CONFIG_VIRTIO_PCI=y (Virtualization -> PCI driver for virtio devices)
o CONFIG_VIRTIO_BALLOON=y (Virtualization -> Virtio balloon driver)
o CONFIG_VIRTIO_BLK=y (Device Drivers -> Block -> Virtio block driver)
o CONFIG_VIRTIO_NET=y (Device Drivers -> Network device support -> Virtio network driver)
o CONFIG_VIRTIO=y (automatically selected)
o CONFIG_VIRTIO_RING=y (automatically selected)
3. Then start the guest OS with this command: x86_64-softmmu/qemu-system-x86_64 -m 1024 /root/redhat.img -net nic,model=virtio -net tap,script=/etc/kvm/qemu-ifup
4. The result is this:
* The guest OS starts up.
* But the network does not; no ethX device is found.
* lsmod | grep virtio shows no virtio modules.
So why does virtio_net not show up in the guest OS? Is anything wrong with my steps, or is some setting missing? I have referred to the page http://www.linux-kvm.org/page/Virtio, but did not find any special requirement. Does anyone have some tips? Thanks in advance. -- Best regards, YangLiang _ Department of Computer Science . School of Electronics Engineering & Computer Science . _
Re: Strange CPU usage pattern in SMP guest
On Sun, Mar 21, 2010 at 12:09:00PM +0200, Avi Kivity wrote:
> On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
>> Hi *,
>>
>> in a 6-CPU SMP guest running on a host with 2 quad-core
>> Intel Xeon E5520 with hyperthreading enabled
>> we see one or more guest CPUs working in a very strange
>> pattern. It looks like all or nothing. We can easily identify
>> the affected CPU with xosview. Here is the mpstat output
>> compared to one regular working CPU:
>>
>> mpstat -P 4 1
>> Linux 2.6.31-16-generic-pae (guest) 21.03.2010 _i686_ (6 CPU)
>> 00:45:19 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
>> 00:45:20   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:21   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:22   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:23   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:24   4    0,00  66,67   0,00    0,00   0,00  33,33   0,00   0,00   0,00
>> 00:45:25   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>> 00:45:26   4    0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
>
> Looks like the guest is only receiving 3-4 timer interrupts per second, so time becomes quantized.
>
> Please run the attached irqtop in the affected guest and report the results.
>
> Is the host overly busy? What host kernel, kvm, and qemu are you running? Is the guest running an I/O workload? If so, how are the disks
The host is not busy at all. In fact, currently it is running only one guest. The host is running an Ubuntu 2.6.31-14-server kernel. qemu-kvm is 0.12.2-0ubuntu6. The kvm module has srcversion: 82D6B673524596F9CF3E84C as stated by modinfo. The guest occasionally runs an I/O workload. However, the effect is visible all the time. And it affects only one out of the 6 CPUs of the very same guest.
This is the output on the guest for all CPUs: mpstat -P ALL 1
12:45:59 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:00 all    0,40   9,74   2,39    5,37   0,80   3,98   0,00   0,00  77,34
12:46:00   0    1,00   5,00   6,00    3,00   1,00   9,00   0,00   0,00  75,00
12:46:00   1    0,00  23,00   2,00   10,00   0,00   0,00   0,00   0,00  65,00
12:46:00   2    0,00   5,94   0,99    6,93   0,00   1,98   0,00   0,00  84,16
12:46:00   3    0,00   8,00   2,00    5,00   2,00   9,00   0,00   0,00  74,00
12:46:00   4    0,00  33,33   0,00    0,00   0,00   0,00   0,00   0,00  66,67
12:46:00   5    0,00   5,94   0,00    3,96   0,00   0,99   0,00   0,00  89,11
12:46:00 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:01 all    0,60   5,81   3,21   24,45   0,40   3,61   0,00   0,00  61,92
12:46:01   0    1,01   4,04   7,07   31,31   1,01   6,06   0,00   0,00  49,49
12:46:01   1    0,00   5,00   2,00   19,00   0,00   2,00   0,00   0,00  72,00
12:46:01   2    0,99   7,92   1,98   35,64   0,00   2,97   0,00   0,00  50,50
12:46:01   3    1,98   4,95   2,97   13,86   0,00   6,93   0,00   0,00  69,31
12:46:01   4    0,00  33,33   0,00    0,00   0,00   0,00   0,00   0,00  66,67
12:46:01   5    0,00   8,08   3,03   22,22   0,00   1,01   0,00   0,00  65,66
12:46:01 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:02 all    2,38  12,70  17,06   14,68   0,60   1,98   0,00   0,00  50,60
12:46:02   0    3,96  15,84   9,90   13,86   0,00   2,97   0,00   0,00  53,47
12:46:02   1    2,97   6,93   5,94   19,80   2,97   2,97   0,00   0,00  58,42
12:46:02   2    2,02  17,17   8,08   18,18   2,02   1,01   0,00   0,00  51,52
12:46:02   3    2,02  10,10   8,08   14,14   0,00   2,02   0,00   0,00  63,64
12:46:02   4    0,00   0,00   0,00    0,00   0,00   0,00   0,00   0,00 100,00
12:46:02   5    0,00  13,00  55,00    6,00   0,00   1,00   0,00   0,00  25,00
12:46:02 CPU    %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
12:46:03 all    0,20  11,35  10,96    8,96   0,40   2,99   0,00   0,00  65,14
12:46:03   0    1,00  11,00   7,00   15,00   0,00   1,00   0,00   0,00  65,00
12:46:03   1    0,00   7,14   2,04    6,12   1,02  11,22   0,00   0,00  72,45
12:46:03   2    0,00  15,00   1,00   12,00   0,00   1,00   0,00   0,00  71,00
12:46:03   3    0,00  11,00  23,00    8,00   0,00   0,00   0,00   0,00  58,00
12:46:03   4    0,00   0,00  50,00    0,00   0,00   0,00   0,00   0,00  50,00
12:46:03   5    0,00  13,00  20,00    4,00   0,00   1,00   0,00   0,00  62,00
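Avi's irqtop tool and the suggested 'cat /proc/interrupts' both boil down to reading per-CPU interrupt counters. As a rough illustration (this is a hypothetical helper, not the actual irqtop script), the per-CPU column of /proc/interrupts can be extracted like this:

```python
def irq_counts_for_cpu(proc_interrupts: str, cpu: int):
    """Return {irq_name: count} for a single CPU, parsed from the text
    of /proc/interrupts. Sampling this twice and diffing the values for
    the timer line would show whether cpu4's timer interrupt is dead."""
    lines = proc_interrupts.strip().splitlines()
    header = lines[0].split()            # e.g. ['CPU0', 'CPU1', ...]
    col = header.index('CPU%d' % cpu)
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(':'):
            continue
        name = fields[0].rstrip(':')
        try:
            counts[name] = int(fields[1 + col])
        except (IndexError, ValueError):
            continue  # summary rows (ERR:, MIS:) lack per-CPU columns
    return counts
```

Run against two snapshots taken a second apart, a healthy CPU shows the timer count advancing by roughly HZ, while the misbehaving CPU 4 in the trace above would advance by only 3-4.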
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 01:34 PM, Michael S. Tsirkin wrote: On Sun, Mar 21, 2010 at 12:29:31PM +0200, Avi Kivity wrote: On 03/21/2010 12:15 PM, Michael S. Tsirkin wrote: Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. Increase it to 200, then. OK. I think we'll also need a smarter allocator than bus->dev_count++ than we now have. Right? No, why? We'll run into problems if devices are created/removed in random order, won't we? unregister_dev() takes care of it. Eventually we'll want faster scanning than the linear search we employ now, though. Yes I suspect with 200 entries we will :). Let's just make it 16 for now? Let's make it 200 and fix the performance problems later. Making it 16 is just asking for trouble. -- error compiling committee.c: too many arguments to function
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Sun, Mar 21, 2010 at 12:29:31PM +0200, Avi Kivity wrote: > On 03/21/2010 12:15 PM, Michael S. Tsirkin wrote: Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. >>> Increase it to 200, then. >>> >> OK. I think we'll also need a smarter allocator >> than bus->dev_count++ than we now have. Right? >> > > No, why? We'll run into problems if devices are created/removed in random order, won't we? > Eventually we'll want faster scanning than the linear search we employ > now, though. Yes I suspect with 200 entries we will :). Let's just make it 16 for now? >>> Is the limit visible to userspace? If not, we need to expose it. >>> >> I don't think it's visible: it seems to be used in a single >> place in kvm. Let's add an ioctl? Note that qemu doesn't >> need it now ... >> > > We usually expose limits via KVM_CHECK_EXTENSION(KVM_CAP_BLAH). We can > expose it via KVM_CAP_IOEVENTFD (and need to reserve iodev entries for > those). > > -- > error compiling committee.c: too many arguments to function
Time and KVM - best practices
Hey, What is considered "best practice" when running a KVM host with a mixture of Linux and Windows guests? Currently I have ntpd running on the host, and I start my guests using "-rtc base=localtime,clock=host", with an extra "-tdf" added for Windows guests, just to keep their clock from drifting madly under load. But with this setup, all my guests are constantly 1-2 seconds behind the host. I can live with that for the Windows guests, as they are not running anything that depends heavily on the time being set perfectly, but for some of the Linux guests it's an issue. Would I be better off using ntpd and "-rtc base=localtime,clock=vm" for all the Linux guests, or is there some other magic way of ensuring that the clock is perfectly in sync with the host? Perhaps there is some kernel configuration I can do to optimize the host for KVM? I'm currently using QEMU PC emulator version 0.12.50 (qemu-kvm-devel) because version 0.12.30 did not work well at all with Windows guests, and the kernel in both host and Linux guests is 2.6.33.1 :o) /Thomas
Tracking KVM development
Hey all, I've recently started testing KVM as a possible virtualization solution for a bunch of servers, and so far things are going pretty well. My OS of choice is Slackware, and I usually just go with whatever kernel Slackware comes with. But with KVM I feel I might need to pay a bit more attention to that part of Slackware, as it appears to be a project in rapid development, so my questions concern how best to track and keep KVM up-to-date. Currently I upgrade to the latest stable kernel almost as soon as it's been released by Linus, and I track qemu-kvm using this Git repository: git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git But should I perhaps also track the KVM modules, and if so, from where? Any and all suggestions for keeping a healthy and stable KVM setup running are more than welcome. :o) /Thomas
Re: [PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
Wrong To: header. Ignore please. On Sun, Mar 21, 2010 at 01:06:02PM +0200, Gleb Natapov wrote: > Make sure that rflags is committed only after successful instruction > emulation. > > Signed-off-by: Gleb Natapov > --- > arch/x86/include/asm/kvm_emulate.h |1 + > arch/x86/kvm/emulate.c |1 + > arch/x86/kvm/x86.c |8 ++-- > 3 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_emulate.h > b/arch/x86/include/asm/kvm_emulate.h > index b5e12c5..a1319c8 100644 > --- a/arch/x86/include/asm/kvm_emulate.h > +++ b/arch/x86/include/asm/kvm_emulate.h > @@ -136,6 +136,7 @@ struct x86_emulate_ops { > ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu); > void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu); > int (*cpl)(struct kvm_vcpu *vcpu); > + void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags); > }; > > /* Type, address-of, and value of an instruction's operand. */ > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > index 266576c..c1aa983 100644 > --- a/arch/x86/kvm/emulate.c > +++ b/arch/x86/kvm/emulate.c > @@ -2968,6 +2968,7 @@ writeback: > /* Commit shadow register state. */ > memcpy(ctxt->vcpu->arch.regs, c->regs, sizeof c->regs); > kvm_rip_write(ctxt->vcpu, c->eip); > + ops->set_rflags(ctxt->vcpu, ctxt->eflags); > > done: > return (rc == X86EMUL_UNHANDLEABLE) ? 
-1 : 0; > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index bb9a24a..3fa70b3 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3643,6 +3643,11 @@ static void emulator_set_segment_selector(u16 sel, int > seg, > kvm_set_segment(vcpu, &kvm_seg, seg); > } > > +static void emulator_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags) > +{ > + kvm_x86_ops->set_rflags(vcpu, rflags); > +} > + > static struct x86_emulate_ops emulate_ops = { > .read_std= kvm_read_guest_virt_system, > .write_std = kvm_write_guest_virt_system, > @@ -3660,6 +3665,7 @@ static struct x86_emulate_ops emulate_ops = { > .get_cr = emulator_get_cr, > .set_cr = emulator_set_cr, > .cpl = emulator_get_cpl, > + .set_rflags = emulator_set_rflags, > }; > > static void cache_all_regs(struct kvm_vcpu *vcpu) > @@ -3780,8 +3786,6 @@ restart: > return EMULATE_DO_MMIO; > } > > - kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags); > - > if (vcpu->mmio_is_write) { > vcpu->mmio_needed = 0; > return EMULATE_DO_MMIO; > -- > 1.6.5 > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] KVM: x86 emulator: add decoding of CMPXCHG8B dst operand.
Decode CMPXCHG8B destination operand in decoding stage. Fixes regression introduced by "If LOCK prefix is used dest arg should be memory" commit. This commit relies on dst operand be decoded at the beginning of an instruction emulation. Signed-off-by: Gleb Natapov --- arch/x86/kvm/emulate.c | 24 ++-- 1 files changed, 10 insertions(+), 14 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index c1aa983..904351e 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -52,6 +52,7 @@ #define DstMem (3<<1) /* Memory operand. */ #define DstAcc (4<<1) /* Destination Accumulator */ #define DstDI (5<<1) /* Destination is in ES:(E)DI */ +#define DstMem64(6<<1) /* 64bit memory operand */ #define DstMask (7<<1) /* Source operand type. */ #define SrcNone (0<<4) /* No source operand. */ @@ -360,7 +361,7 @@ static u32 group_table[] = { DstMem | SrcImmByte | ModRM, DstMem | SrcImmByte | ModRM | Lock, DstMem | SrcImmByte | ModRM | Lock, DstMem | SrcImmByte | ModRM | Lock, [Group9*8] = - 0, ImplicitOps | ModRM | Lock, 0, 0, 0, 0, 0, 0, + 0, DstMem64 | ModRM | Lock, 0, 0, 0, 0, 0, 0, }; static u32 group2_table[] = { @@ -1205,6 +1206,7 @@ done_prefixes: c->twobyte && (c->b == 0xb6 || c->b == 0xb7)); break; case DstMem: + case DstMem64: if ((c->d & ModRM) && c->modrm_mod == 3) { c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; c->dst.type = OP_REG; @@ -1214,7 +1216,10 @@ done_prefixes: } c->dst.type = OP_MEM; c->dst.ptr = (unsigned long *)c->modrm_ea; - c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; + if ((c->d & DstMask) == DstMem64) + c->dst.bytes = 8; + else + c->dst.bytes = (c->d & ByteOp) ? 
1 : c->op_bytes; c->dst.val = 0; if (c->d & BitOp) { unsigned long mask = ~(c->dst.bytes * 8 - 1); @@ -1706,12 +1711,7 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops) { struct decode_cache *c = &ctxt->decode; - u64 old, new; - int rc; - - rc = ops->read_emulated(c->modrm_ea, &old, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; + u64 old = c->dst.orig_val; if (((u32) (old >> 0) != (u32) c->regs[VCPU_REGS_RAX]) || ((u32) (old >> 32) != (u32) c->regs[VCPU_REGS_RDX])) { @@ -1719,15 +1719,12 @@ static inline int emulate_grp9(struct x86_emulate_ctxt *ctxt, c->regs[VCPU_REGS_RAX] = (u32) (old >> 0); c->regs[VCPU_REGS_RDX] = (u32) (old >> 32); ctxt->eflags &= ~EFLG_ZF; - } else { - new = ((u64)c->regs[VCPU_REGS_RCX] << 32) | + c->dst.val = ((u64)c->regs[VCPU_REGS_RCX] << 32) | (u32) c->regs[VCPU_REGS_RBX]; - rc = ops->cmpxchg_emulated(c->modrm_ea, &old, &new, 8, ctxt->vcpu); - if (rc != X86EMUL_CONTINUE) - return rc; ctxt->eflags |= EFLG_ZF; + c->lock_prefix = 1; } return X86EMUL_CONTINUE; } @@ -3241,7 +3238,6 @@ twobyte_insn: rc = emulate_grp9(ctxt, ops); if (rc != X86EMUL_CONTINUE) goto done; - c->dst.type = OP_NONE; break; } goto writeback; -- 1.6.5 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
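The architectural behaviour that emulate_grp9() implements can be sketched outside the kernel. The following Python model (hypothetical helper names, not the kvm code) captures the CMPXCHG8B semantics the patch preserves: compare EDX:EAX with the 64-bit memory operand; on a match store ECX:EBX there and set ZF, otherwise load the old value into EDX:EAX and clear ZF.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def cmpxchg8b(mem, eax, edx, ecx, ebx):
    """Model of CMPXCHG8B m64: returns (new_mem, eax, edx, zf).

    mem is the 64-bit destination operand; the registers are 32-bit.
    """
    old = mem & MASK64
    expected = (edx << 32) | eax
    if old == expected:
        # Match: store ECX:EBX into the destination, set ZF.
        return ((ecx << 32) | ebx, eax, edx, 1)
    # No match: load the old value into EDX:EAX, clear ZF.
    return (old, old & 0xFFFFFFFF, old >> 32, 0)
```

In the success path the destination is written back, which is why the patch routes the new value through c->dst.val and lets the common writeback code perform the (possibly locked) store, instead of calling cmpxchg_emulated directly.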
[PATCH 1/2] KVM: x86 emulator: commit rflags as part of registers commit.
Make sure that rflags is committed only after successful instruction emulation. Signed-off-by: Gleb Natapov --- arch/x86/include/asm/kvm_emulate.h |1 + arch/x86/kvm/emulate.c |1 + arch/x86/kvm/x86.c |8 ++-- 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index b5e12c5..a1319c8 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -136,6 +136,7 @@ struct x86_emulate_ops { ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu); void (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu); int (*cpl)(struct kvm_vcpu *vcpu); + void (*set_rflags)(struct kvm_vcpu *vcpu, unsigned long rflags); }; /* Type, address-of, and value of an instruction's operand. */ diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 266576c..c1aa983 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -2968,6 +2968,7 @@ writeback: /* Commit shadow register state. */ memcpy(ctxt->vcpu->arch.regs, c->regs, sizeof c->regs); kvm_rip_write(ctxt->vcpu, c->eip); + ops->set_rflags(ctxt->vcpu, ctxt->eflags); done: return (rc == X86EMUL_UNHANDLEABLE) ? 
-1 : 0; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index bb9a24a..3fa70b3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3643,6 +3643,11 @@ static void emulator_set_segment_selector(u16 sel, int seg, kvm_set_segment(vcpu, &kvm_seg, seg); } +static void emulator_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags) +{ + kvm_x86_ops->set_rflags(vcpu, rflags); +} + static struct x86_emulate_ops emulate_ops = { .read_std= kvm_read_guest_virt_system, .write_std = kvm_write_guest_virt_system, @@ -3660,6 +3665,7 @@ static struct x86_emulate_ops emulate_ops = { .get_cr = emulator_get_cr, .set_cr = emulator_set_cr, .cpl = emulator_get_cpl, + .set_rflags = emulator_set_rflags, }; static void cache_all_regs(struct kvm_vcpu *vcpu) @@ -3780,8 +3786,6 @@ restart: return EMULATE_DO_MMIO; } - kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags); - if (vcpu->mmio_is_write) { vcpu->mmio_needed = 0; return EMULATE_DO_MMIO; -- 1.6.5 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
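The point of the patch above is ordering: rflags must be committed together with the other shadow registers, and only after emulation has succeeded, so a failed instruction leaves architectural state untouched. A small Python sketch of that commit discipline (the dict-based context is illustrative only, not the kvm structures):

```python
def emulate_and_commit(ctxt, execute):
    """Run execute() on shadow copies; commit regs and rflags only on success."""
    shadow_regs = dict(ctxt['regs'])     # shadow register state
    shadow_flags = ctxt['eflags']
    try:
        shadow_regs, shadow_flags = execute(shadow_regs, shadow_flags)
    except Exception:
        return -1    # emulation failed: nothing is written back
    # Writeback: registers, rip, and rflags are committed as one unit.
    ctxt['regs'] = shadow_regs
    ctxt['eflags'] = shadow_flags
    return 0
```

Before the patch, set_rflags ran on the x86.c side even on paths that bailed out of emulation; moving it into the emulator's writeback section gives the all-or-nothing behaviour modelled here.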
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 12:21 PM, Gleb Natapov wrote: On Sun, Mar 21, 2010 at 12:11:33PM +0200, Avi Kivity wrote: On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote: On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote: When creating a guest with 2 virtio-net interfaces, i am running into an issue causing the 2nd i/f to fall back to userspace virtio even when vhost is enabled. After some debugging, it turned out that the KVM_IOEVENTFD ioctl() call in qemu is failing with ENOSPC. This is because of the NR_IOBUS_DEVS(6) limit in the kvm_io_bus_register_dev() routine in the host kernel. I think we need to increase this limit if we want to support multiple network interfaces using vhost-net. Is there an alternate solution? Thanks Sridhar Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. Increase it to 200, then. Currently on each device read/write we iterate over all registered devices. This is not scalable. Yeah. We need first to drop the callback-based matching and replace it with explicit ranges, then to replace the search with a hash table for small ranges (keeping a linear search for large ranges, which can happen for coalesced mmio). -- error compiling committee.c: too many arguments to function
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 12:15 PM, Michael S. Tsirkin wrote: Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb, any objections to increasing the limit to say 16? That would give us 5 more devices to the limit of 6 per guest. Increase it to 200, then. OK. I think we'll also need a smarter allocator than bus->dev_count++ than we now have. Right? No, why? Eventually we'll want faster scanning than the linear search we employ now, though. Is the limit visible to userspace? If not, we need to expose it. I don't think it's visible: it seems to be used in a single place in kvm. Let's add an ioctl? Note that qemu doesn't need it now ... We usually expose limits via KVM_CHECK_EXTENSION(KVM_CAP_BLAH). We can expose it via KVM_CAP_IOEVENTFD (and need to reserve iodev entries for those). -- error compiling committee.c: too many arguments to function
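The failure mode Sridhar hit and the scalability worry raised in this thread can be shown with a toy model. This Python sketch (illustrative names only, not the kvm code) registers address ranges on a fixed-capacity bus and looks devices up with the same linear scan the kernel used at the time; registering past the capacity fails with ENOSPC, which is exactly what qemu's KVM_IOEVENTFD call reported:

```python
import errno

class IOBus:
    """Toy model of an I/O bus with a fixed device limit and linear lookup."""
    def __init__(self, capacity=200):
        self.capacity = capacity
        self.devs = []                     # list of (start, length, device)

    def register(self, start, length, dev):
        if len(self.devs) >= self.capacity:
            # Mirrors kvm_io_bus_register_dev() rejecting the 7th device
            # when NR_IOBUS_DEVS was 6.
            raise OSError(errno.ENOSPC, "io bus is full")
        self.devs.append((start, length, dev))

    def find(self, addr):
        # O(n) scan on every access: the cost Gleb points out grows
        # linearly with the limit, hence the call for a better index.
        for start, length, dev in self.devs:
            if start <= addr < start + length:
                return dev
        return None
```

With capacity 6 and two ioeventfds per virtio-net device, a second interface exhausts the bus; raising the capacity to 200 removes the ENOSPC but makes the linear find() the next thing to fix, which is the hash-table direction Avi outlines.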
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Sun, Mar 21, 2010 at 12:11:33PM +0200, Avi Kivity wrote:
> On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote:
>> On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
>>> When creating a guest with 2 virtio-net interfaces, I am running
>>> into an issue causing the 2nd i/f to fall back to userspace virtio
>>> even when vhost is enabled.
>>>
>>> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
>>> call in qemu is failing with ENOSPC. This is because of the
>>> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
>>> in the host kernel.
>>>
>>> I think we need to increase this limit if we want to support
>>> multiple network interfaces using vhost-net. Is there an alternate
>>> solution?
>>>
>>> Thanks
>>> Sridhar
>>
>> Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
>> any objections to increasing the limit to, say, 16? That would give
>> us 5 more devices on top of the current limit of 6 per guest.
>
> Increase it to 200, then.

Currently on each device read/write we iterate over all registered
devices. This is not scalable.

> Is the limit visible to userspace? If not, we need to expose it.
>
> --
> error compiling committee.c: too many arguments to function

--
Gleb.
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Sun, Mar 21, 2010 at 12:11:33PM +0200, Avi Kivity wrote:
> On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote:
>> On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
>>> When creating a guest with 2 virtio-net interfaces, I am running
>>> into an issue causing the 2nd i/f to fall back to userspace virtio
>>> even when vhost is enabled.
>>>
>>> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
>>> call in qemu is failing with ENOSPC. This is because of the
>>> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
>>> in the host kernel.
>>>
>>> I think we need to increase this limit if we want to support
>>> multiple network interfaces using vhost-net. Is there an alternate
>>> solution?
>>>
>>> Thanks
>>> Sridhar
>>
>> Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
>> any objections to increasing the limit to, say, 16? That would give
>> us 5 more devices on top of the current limit of 6 per guest.
>
> Increase it to 200, then.

OK. I think we'll also need a smarter allocator than the
bus->dev_count++ we have now. Right?

> Is the limit visible to userspace? If not, we need to expose it.

I don't think it's visible: it seems to be used in a single place in
kvm. Let's add an ioctl? Note that qemu doesn't need it now ...

> --
> error compiling committee.c: too many arguments to function
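For what it's worth, the bus->dev_count++ allocation MST asks about is just a
bump of a tail index, and Avi's "No, why?" holds even once unregistration is
considered. A hypothetical sketch (invented names, not the KVM code) using
the usual swap-with-last trick to keep the array dense without any free-list:

```python
class IoBusAlloc:
    """Toy model: bump allocation on register, swap-with-last on unregister."""

    LIMIT = 200  # stands in for the proposed NR_IOBUS_DEVS value

    def __init__(self):
        self.devs = []

    def register(self, dev):
        # Equivalent of bus->dev_count++: append at the tail, ENOSPC when full.
        if len(self.devs) >= self.LIMIT:
            raise OSError("ENOSPC")
        self.devs.append(dev)

    def unregister(self, dev):
        # Move the last entry into the vacated slot; the array stays dense,
        # so no smarter allocator is ever needed.
        i = self.devs.index(dev)
        self.devs[i] = self.devs[-1]
        self.devs.pop()

bus = IoBusAlloc()
for n in range(3):
    bus.register("dev%d" % n)
bus.unregister("dev0")
assert sorted(bus.devs) == ["dev1", "dev2"]
```

Since lookups scan (or hash) the whole array anyway, entry order does not
matter, which is what makes the swap legal.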
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On 03/21/2010 11:55 AM, Michael S. Tsirkin wrote:
> On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
>> When creating a guest with 2 virtio-net interfaces, I am running
>> into an issue causing the 2nd i/f to fall back to userspace virtio
>> even when vhost is enabled.
>>
>> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
>> call in qemu is failing with ENOSPC. This is because of the
>> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
>> in the host kernel.
>>
>> I think we need to increase this limit if we want to support
>> multiple network interfaces using vhost-net. Is there an alternate
>> solution?
>>
>> Thanks
>> Sridhar
>
> Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
> any objections to increasing the limit to, say, 16? That would give
> us 5 more devices on top of the current limit of 6 per guest.

Increase it to 200, then.

Is the limit visible to userspace? If not, we need to expose it.

--
error compiling committee.c: too many arguments to function
Re: Strange CPU usage pattern in SMP guest
On 03/21/2010 02:13 AM, Sebastian Hetze wrote:
> Hi *,
>
> in a 6-CPU SMP guest running on a host with two quad-core Intel Xeon
> E5520 CPUs with hyperthreading enabled, we see one or more guest CPUs
> working in a very strange pattern. It looks like all or nothing. We
> can easily identify the affected CPU with xosview. Here is the mpstat
> output compared to one regularly working CPU:
>
> mpstat -P 4 1
> Linux 2.6.31-16-generic-pae (guest)  21.03.2010  _i686_  (6 CPU)
>
> 00:45:19  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest  %idle
> 00:45:20    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:21    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:22    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:23    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:24    4   0,00  66,67   0,00    0,00   0,00  33,33   0,00   0,00   0,00
> 00:45:25    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00
> 00:45:26    4   0,00 100,00   0,00    0,00   0,00   0,00   0,00   0,00   0,00

Looks like the guest is only receiving 3-4 timer interrupts per second,
so time becomes quantized. Please run the attached irqtop in the
affected guest and report the results.

Is the host overly busy? What host kernel, kvm, and qemu are you running?

Is the guest running an I/O workload? If so, how are the disks configured?
--
error compiling committee.c: too many arguments to function

#!/usr/bin/python

import curses
import sys, os, time, optparse

def read_interrupts():
    irq = {}
    proc = file('/proc/interrupts')
    nrcpu = len(proc.readline().split())
    for line in proc.readlines():
        vec, data = line.strip().split(':', 1)
        if vec in ('ERR', 'MIS'):
            continue
        counts = data.split(None, nrcpu)
        counts, rest = (counts[:-1], counts[-1])
        count = sum([int(x) for x in counts])
        try:
            v = int(vec)
            name = rest.split(None, 1)[1]
        except:
            name = rest
        irq[name] = count
    return irq

def delta_interrupts():
    old = read_interrupts()
    while True:
        irq = read_interrupts()
        delta = {}
        for key in irq.keys():
            delta[key] = irq[key] - old[key]
        yield delta
        old = irq

label_width = 30
number_width = 10

def tui(screen):
    curses.use_default_colors()
    curses.noecho()
    def getcount(x):
        return x[1]
    def refresh(irq):
        screen.erase()
        screen.addstr(0, 0, 'irqtop')
        row = 2
        for name, count in sorted(irq.items(), key = getcount, reverse = True):
            if row >= screen.getmaxyx()[0]:
                break
            col = 1
            screen.addstr(row, col, name)
            col += label_width
            screen.addstr(row, col, '%10d' % (count,))
            row += 1
        screen.refresh()
    for irqs in delta_interrupts():
        refresh(irqs)
        curses.halfdelay(10)
        try:
            c = screen.getkey()
            if c == 'q':
                break
        except KeyboardInterrupt:
            break
        except curses.error:
            continue

import curses.wrapper
curses.wrapper(tui)
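Avi's diagnosis (only 3-4 timer ticks per second reaching the guest) explains
the all-or-nothing mpstat pattern directly: CPU time is accounted in whole
ticks, credited to whatever state is running when a tick fires, so with about
three ticks per one-second sample the percentages can only come out as
multiples of one third. A toy illustration of the arithmetic, hypothetical
Python rather than mpstat's actual implementation:

```python
def sample_percentages(tick_states):
    """mpstat-style percentages for one sample interval.

    Each element of tick_states is the accounting state ('usr', 'nice',
    'soft', ...) that was charged for one timer tick in the interval.
    """
    total = len(tick_states)
    counts = {}
    for s in tick_states:
        counts[s] = counts.get(s, 0) + 1
    # With only `total` ticks, every percentage is a multiple of 100/total.
    return {s: round(100.0 * n / total, 2) for s, n in counts.items()}

# Three ticks, all charged to %nice: the 100,00 rows in the report.
assert sample_percentages(["nice", "nice", "nice"]) == {"nice": 100.0}

# Three ticks split 2/1: exactly the 66,67 / 33,33 row at 00:45:24.
assert sample_percentages(["nice", "nice", "soft"]) == {"nice": 66.67,
                                                        "soft": 33.33}
```

At a normal tick rate (hundreds of ticks per second) the same arithmetic
yields smooth percentages, which is why the coarse 100/66.67/33.33 values are
a strong hint that ticks are being lost or coalesced.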
Re: [RFC] Unify KVM kernel-space and user-space code into a single project
On 03/20/2010 04:59 PM, Andrea Arcangeli wrote:
> On Fri, Mar 19, 2010 at 09:21:49AM +0200, Avi Kivity wrote:
>> On 03/19/2010 12:44 AM, Ingo Molnar wrote:
>>> Too bad - there was heavy initial opposition to the arch/x86
>>> unification as well [and heavy opposition to tools/perf/ as well],
>>> still both worked out extremely well :-)
>>
>> Did you forget that arch/x86 was a merging of a code fork that
>> happened several years previously? Maybe that fork shouldn't have
>> been done to begin with.
>
> We discussed and probably timidly tried to share the sharable
> initially, but we realized it was too time wasteful. In addition to
> having to adapt the code to 64-bit, we would also have had to
> constantly solve another problem on top of it (see the various _32/_64
> splits; those take time to achieve, maybe not huge time, but still
> definitely some time and effort). Even in retrospect I am quite sure
> the way x86-64 happened was optimal, and if we could go back we would
> do it again the exact same way, even if the final objective was to
> have a common arch/x86 (and thankfully Linus is flexible and smart
> enough to realize that code that doesn't risk destabilizing anything
> shouldn't be forced out just because it hasn't reached a totally
> theoretical-perfect-nitpicking-clean state yet). It's still a lot of
> work to do the unification later as a separate task, but it's not as
> if doing it immediately would have been much less work. It's about the
> same amount of effort, and we were able to defer it and decrease the
> time to market, which surely contributed to the success of x86-64.

In hindsight decisions are much easier. I agree it was less risky to
fork than to share. But if another instruction set forks out a 64-bit
not-exactly-compatible variant, I'm sure we'll start out shared and not
fork it, especially if the platform remains the same.

> The problem of qemu is not some lack of GUI or that it's not included
> in the linux kernel git tree; the definitive problem is how to merge
> qemu-kvm/kvm and qxl into it. If you (Avi) were the qemu maintainer I
> am sure there wouldn't be two trees, so as a developer I would totally
> love it, and I am sure that with you as maintainer it would have a
> chance to move forward with qxl on desktop virtualization without
> proposals to extend vnc instead to achieve a "similar" result (imagine
> if btrfs were published on a website and people started to discuss
> whether it should ever be merged because reinventing some part of
> btrfs inside ext5 might achieve "similar" results).

The qemu/qemu-kvm fork is definitely hurting. Some history: when kvm
started out I pulled qemu for fast hacking and, much like arch/x86_64, I
couldn't destabilize qemu for something that was completely experimental
(and closed source at the time). Moreover, it wasn't clear whether the
qemu community would be interested. The qemu-kvm fork was designed for
minimal intrusion so I could merge upstream qemu regularly. This
resulted in kvm integration that was fairly ugly. Later Anthony merged a
well-integrated alternative implementation (in retrospect this was a
mistake IMO - we were left with a well-tested, high-performing, ugly
implementation and a clean, slow, untested, and unfeatured one, and no
one who wants to merge the two). So now it is pretty confusing to read
the code, which has the two alternate implementations sometimes sharing
code and sometimes diverging.

> As for a GUI for KVM to use on desktop distributions, that is an
> irrelevant concern compared to the lack of a protocol more efficient
> than rdesktop/rdp/vnc for desktop virtualization. I have people asking
> me to migrate hundreds of desktops to desktop virtualization on KVM in
> their organizations, and I tell them to use spice because I believe
> it's the most efficient option available (at least as far as we stick
> to open source open protocols); there are universities using spice on
> thousands of student desktops, and I think we need paravirt graphics
> to happen ASAP in the main qemu tree too.

That effort will have to wait for the spice project to mature.

> In short: running KVM on the desktop is irrelevant compared to running
> the desktop on KVM, so I suggest focusing on what is more important
> first ;).

Anyone can focus on what interests them; if someone has an interest in a
good desktop-on-desktop experience, they should start hacking and
sending patches.

--
error compiling committee.c: too many arguments to function
Re: Unable to create more than 1 guest virtio-net device using vhost-net backend
On Fri, Mar 19, 2010 at 03:19:27PM -0700, Sridhar Samudrala wrote:
> When creating a guest with 2 virtio-net interfaces, I am running
> into an issue causing the 2nd i/f to fall back to userspace virtio
> even when vhost is enabled.
>
> After some debugging, it turned out that the KVM_IOEVENTFD ioctl()
> call in qemu is failing with ENOSPC. This is because of the
> NR_IOBUS_DEVS (6) limit in the kvm_io_bus_register_dev() routine
> in the host kernel.
>
> I think we need to increase this limit if we want to support
> multiple network interfaces using vhost-net. Is there an alternate
> solution?
>
> Thanks
> Sridhar

Nothing easy that I can see. Each device needs 2 of these. Avi, Gleb,
any objections to increasing the limit to, say, 16? That would give us
5 more devices on top of the current limit of 6 per guest.

--
MST