Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock
On Fri, Jan 21, 2011 at 09:48:29AM -0500, Rik van Riel wrote:
> >> Why? If a VCPU can't make progress because it's waiting for some
> >> resource, then why not schedule something else instead?
> >
> > In the process, "something else" can get more share of cpu resource than
> > it's entitled to, and that's where I was a bit concerned. I guess one
> > could employ hard-limits to cap "something else's" bandwidth where it is
> > of real concern (like clouds).
>
> I'd like to think I fixed those things in my yield_task_fair +
> yield_to + kvm_vcpu_on_spin patch series from yesterday.

Speaking of the spinlock-in-virtualized-environment problem as a whole, IMHO I don't think the kvm_vcpu_on_spin + yield changes will provide the best results, especially where ticketlocks are involved and are paravirtualized in the manner being discussed in this thread.

An important focus of pv-ticketlocks is to reduce lock _acquisition_ time by ensuring that the next-in-line vcpu gets to run as soon as possible when a ticket lock is released. With the way kvm_vcpu_on_spin + yield_to is implemented, I don't see how we can provide the best lock acquisition times for threads.

It would be nice, though, to compare the two approaches (the kvm_vcpu_on_spin optimization and the pv-ticketlock scheme) to get some real-world numbers. Unfortunately, I don't have access to the PLE-capable hardware required to test your kvm_vcpu_on_spin changes.

It may also be possible for pv-ticketlocks to track the owning vcpu and make use of a yield-to interface as a further optimization to avoid the "others-get-more-time" problem, but PeterZ rightly pointed out that PI would be a better solution there than yield-to.

So overall, IMO kvm_vcpu_on_spin + yield_to could be the best solution for unmodified guests, while paravirtualized ticketlocks plus some sort of PI would be a better solution where we have the luxury of modifying guest sources!
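To make the fairness argument above concrete, here is a toy Python model of a ticket lock — illustrative only, not the kernel's C implementation: tickets are handed out in strict order, so when the lock is released only the one next-in-line waiter can make progress, which is exactly why the hypervisor must get that specific vcpu running quickly.

```python
import threading

class TicketLock:
    """Toy model of a ticket spinlock (illustration only, not kernel code)."""

    def __init__(self):
        self.next_ticket = 0    # next ticket number to hand out
        self.now_serving = 0    # ticket currently allowed to hold the lock
        self._mutex = threading.Lock()  # models the atomic ticket fetch

    def acquire(self):
        # Atomically take a ticket, establishing a strict FIFO order.
        with self._mutex:
            my_ticket = self.next_ticket
            self.next_ticket += 1
        # Spin until our ticket comes up. A pv-ticketlock would, after a
        # while, hypercall into the host to block instead of burning CPU,
        # and the releasing side would ask the host to wake exactly the
        # vcpu holding ticket `now_serving`.
        while self.now_serving != my_ticket:
            pass
        return my_ticket

    def release(self):
        # Only the single waiter holding this ticket can now proceed.
        self.now_serving += 1
```

In a single thread the spin never triggers, which makes the ordering easy to see: tickets come back as 0, 1, 2, ... in acquisition order.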
- vatsa
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] KVM test: tests_base.cfg: Fixing tabs instead of whitespace
Those were introduced by previous netperf fixes.

Signed-off-by: Lucas Meneghel Rodrigues
---
 client/tests/kvm/tests_base.cfg.sample |   14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
index bd2f720..cdfb3ad 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -687,13 +687,13 @@ variants:
 packet_size = 1500
 setup_cmd = "cd %s && tar xvfj netperf-2.4.5.tar.bz2 && cd netperf-2.4.5 && patch -p0 < ../wait_before_data.patch && ./configure && make"
 netserver_cmd = %s/netperf-2.4.5/src/netserver
- variants:
- - stream:
- netperf_cmd = %s/netperf-2.4.5/src/netperf -t %s -H %s -l 60 -- -m %s
- protocols = "TCP_STREAM TCP_MAERTS TCP_SENDFILE UDP_STREAM"
- - rr:
- netperf_cmd = %s/netperf-2.4.5/src/netperf -t %s -H %s -l 60 -- -r %s
- protocols = "TCP_RR TCP_CRR UDP_RR"
+variants:
+- stream:
+netperf_cmd = %s/netperf-2.4.5/src/netperf -t %s -H %s -l 60 -- -m %s
+protocols = "TCP_STREAM TCP_MAERTS TCP_SENDFILE UDP_STREAM"
+- rr:
+netperf_cmd = %s/netperf-2.4.5/src/netperf -t %s -H %s -l 60 -- -r %s
+protocols = "TCP_RR TCP_CRR UDP_RR"
 - ethtool:
 install setup unattended_install.cdrom
 type = ethtool
--
1.7.3.4
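As an aside, mixups like the one fixed above (tabs where the config format expects spaces) are easy to catch mechanically. This small checker is my own sketch, not part of the kvm test framework — it flags any line whose leading indentation contains a tab:

```python
def find_tab_indented_lines(text):
    """Return (line_number, line) pairs whose indentation contains a tab.

    Hypothetical helper for linting space-indented config files such as
    tests_base.cfg.sample; not part of the kvm autotest framework.
    """
    bad = []
    for number, line in enumerate(text.splitlines(), 1):
        # Leading whitespace is everything stripped off by lstrip().
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append((number, line))
    return bad
```

Running it over a config file before committing would have flagged the seven offending lines immediately.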
[PATCH 2/2] KVM test: Fix wrong parameter name for migrate_background
On unattended_install with background ping-pong migration. It was my mistake when modifying Jason's original patchset.

Signed-off-by: Lucas Meneghel Rodrigues
---
 client/tests/kvm/tests_base.cfg.sample |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample
index cdfb3ad..b82d1dc 100644
--- a/client/tests/kvm/tests_base.cfg.sample
+++ b/client/tests/kvm/tests_base.cfg.sample
@@ -103,7 +103,7 @@ variants:
 initrd = initrd.img
 nic_mode = tap
 # uncomment the following line to test the migration in parallel
-# migrate_with_background = yes
+# migrate_background = yes
 variants:
 # Install guest from cdrom
--
1.7.3.4
[PATCH 4/4] KVM test: Rename virtio_guest.py to virtio_console_guest.py
Signed-off-by: Lucas Meneghel Rodrigues --- client/tests/kvm/scripts/virtio_console_guest.py | 715 ++ client/tests/kvm/scripts/virtio_guest.py | 715 -- client/tests/kvm/tests/virtio_console.py | 20 +- 3 files changed, 725 insertions(+), 725 deletions(-) create mode 100755 client/tests/kvm/scripts/virtio_console_guest.py delete mode 100755 client/tests/kvm/scripts/virtio_guest.py diff --git a/client/tests/kvm/scripts/virtio_console_guest.py b/client/tests/kvm/scripts/virtio_console_guest.py new file mode 100755 index 000..35efb7d --- /dev/null +++ b/client/tests/kvm/scripts/virtio_console_guest.py @@ -0,0 +1,715 @@ +#!/usr/bin/python +# -*- coding: utf-8 -*- +""" +Auxiliary script used to send data between ports on guests. + +@copyright: 2010 Red Hat, Inc. +@author: Jiri Zupka (jzu...@redhat.com) +@author: Lukas Doktor (ldok...@redhat.com) +""" +import threading +from threading import Thread +import os, time, select, re, random, sys, array +import fcntl, subprocess, traceback, signal + +DEBUGPATH = "/sys/kernel/debug" +SYSFSPATH = "/sys/class/virtio-ports/" + +exiting = False + +class VirtioGuest: +""" +Test tools of virtio_ports. +""" +LOOP_NONE = 0 +LOOP_POLL = 1 +LOOP_SELECT = 2 + +def __init__(self): +self.files = {} +self.exit_thread = threading.Event() +self.threads = [] +self.ports = {} +self.poll_fds = {} +self.catch_signal = None +self.use_config = threading.Event() + + +def _readfile(self, name): +""" +Read file and return content as string + +@param name: Name of file +@return: Content of file as string +""" +out = "" +try: +f = open(name, "r") +out = f.read() +f.close() +except: +print "FAIL: Cannot open file %s" % (name) + +return out + + +def _get_port_status(self): +""" +Get info about ports from kernel debugfs. 
+ +@return: Ports dictionary of port properties +""" +ports = {} +not_present_msg = "FAIL: There's no virtio-ports dir in debugfs" +if (not os.path.ismount(DEBUGPATH)): +os.system('mount -t debugfs none %s' % (DEBUGPATH)) +try: +if not os.path.isdir('%s/virtio-ports' % (DEBUGPATH)): +print not_present_msg +except: +print not_present_msg +else: +viop_names = os.listdir('%s/virtio-ports' % (DEBUGPATH)) +for name in viop_names: +open_db_file = "%s/virtio-ports/%s" % (DEBUGPATH, name) +f = open(open_db_file, 'r') +port = {} +file = [] +for line in iter(f): +file.append(line) +try: +for line in file: +m = re.match("(\S+): (\S+)", line) +port[m.group(1)] = m.group(2) + +if (port['is_console'] == "yes"): +port["path"] = "/dev/hvc%s" % (port["console_vtermno"]) +# Console works like a serialport +else: +port["path"] = "/dev/%s" % name + +if (not os.path.exists(port['path'])): +print "FAIL: %s not exist" % port['path'] + +sysfspath = SYSFSPATH + name +if (not os.path.isdir(sysfspath)): +print "FAIL: %s not exist" % (sysfspath) + +info_name = sysfspath + "/name" +port_name = self._readfile(info_name).strip() +if (port_name != port["name"]): +print ("FAIL: Port info not match \n%s - %s\n%s - %s" % + (info_name , port_name, +"%s/virtio-ports/%s" % (DEBUGPATH, name), +port["name"])) +except AttributeError: +print ("In file " + open_db_file + + " are bad data\n"+ "".join(file).strip()) +print ("FAIL: Fail file data.") +return + +ports[port['name']] = port +f.close() + +return ports + + +def init(self, in_files): +""" +Init and check port properties. +""" +self.ports = self._get_port_status() + +if self.ports == None: +return +for item in in_files: +if (item[1] != self.ports[item[0]]["is_console"]): +print self.ports +print "FAIL: Host console is not like console on guest side\n" +print "PASS: Init and check virtioconsole files in system." + + +class Switch(Thread): +""" +Thread that sends data between ports. +""" +
[PATCH 1/4] KVM test: Renaming script bonding_setup.py to nic_bonding_guest.py
We'll establish a convention (not an extremely strict one, of course) for scripts run in the guest: we call them [test_name]_guest.py. Let's start by converting bonding_setup to this convention.

Signed-off-by: Lucas Meneghel Rodrigues
---
 client/tests/kvm/scripts/bonding_setup.py     |   37 -
 client/tests/kvm/scripts/nic_bonding_guest.py |   37 +
 client/tests/kvm/tests/nic_bonding.py         |    8 +++---
 3 files changed, 41 insertions(+), 41 deletions(-)
 delete mode 100644 client/tests/kvm/scripts/bonding_setup.py
 create mode 100644 client/tests/kvm/scripts/nic_bonding_guest.py

diff --git a/client/tests/kvm/scripts/bonding_setup.py b/client/tests/kvm/scripts/bonding_setup.py
deleted file mode 100644
index f2d4be9..000
--- a/client/tests/kvm/scripts/bonding_setup.py
+++ /dev/null
@@ -1,37 +0,0 @@
-import os, re, commands, sys
-"""This script is used to setup bonding, macaddr of bond0 should be assigned by
-argv1"""
-
-if len(sys.argv) != 2:
-    sys.exit(1)
-mac = sys.argv[1]
-eth_nums = 0
-ifconfig_output = commands.getoutput("ifconfig")
-re_eth = "eth[0-9]*"
-for ename in re.findall(re_eth, ifconfig_output):
-    eth_config_file = "/etc/sysconfig/network-scripts/ifcfg-%s" % ename
-    eth_config = """DEVICE=%s
-USERCTL=no
-ONBOOT=yes
-MASTER=bond0
-SLAVE=yes
-BOOTPROTO=none
-""" % ename
-    f = file(eth_config_file, 'w')
-    f.write(eth_config)
-    f.close()
-
-bonding_config_file = "/etc/sysconfig/network-scripts/ifcfg-bond0"
-bond_config = """DEVICE=bond0
-BOOTPROTO=dhcp
-NETWORKING_IPV6=no
-ONBOOT=yes
-USERCTL=no
-MACADDR=%s
-""" % mac
-f = file(bonding_config_file, "w")
-f.write(bond_config)
-f.close()
-os.system("modprobe bonding")
-os.system("service NetworkManager stop")
-os.system("service network restart")
diff --git a/client/tests/kvm/scripts/nic_bonding_guest.py b/client/tests/kvm/scripts/nic_bonding_guest.py
new file mode 100644
index 000..f2d4be9
--- /dev/null
+++ b/client/tests/kvm/scripts/nic_bonding_guest.py
@@ -0,0 +1,37 @@
+import os, re, commands, sys
+"""This script is used to setup bonding, macaddr of bond0 should be assigned by
+argv1"""
+
+if len(sys.argv) != 2:
+    sys.exit(1)
+mac = sys.argv[1]
+eth_nums = 0
+ifconfig_output = commands.getoutput("ifconfig")
+re_eth = "eth[0-9]*"
+for ename in re.findall(re_eth, ifconfig_output):
+    eth_config_file = "/etc/sysconfig/network-scripts/ifcfg-%s" % ename
+    eth_config = """DEVICE=%s
+USERCTL=no
+ONBOOT=yes
+MASTER=bond0
+SLAVE=yes
+BOOTPROTO=none
+""" % ename
+    f = file(eth_config_file, 'w')
+    f.write(eth_config)
+    f.close()
+
+bonding_config_file = "/etc/sysconfig/network-scripts/ifcfg-bond0"
+bond_config = """DEVICE=bond0
+BOOTPROTO=dhcp
+NETWORKING_IPV6=no
+ONBOOT=yes
+USERCTL=no
+MACADDR=%s
+""" % mac
+f = file(bonding_config_file, "w")
+f.write(bond_config)
+f.close()
+os.system("modprobe bonding")
+os.system("service NetworkManager stop")
+os.system("service network restart")
diff --git a/client/tests/kvm/tests/nic_bonding.py b/client/tests/kvm/tests/nic_bonding.py
index ca9d70a..52ce0ae 100644
--- a/client/tests/kvm/tests/nic_bonding.py
+++ b/client/tests/kvm/tests/nic_bonding.py
@@ -8,7 +8,7 @@ def run_nic_bonding(test, params, env):
     Nic bonding test in guest.
 
     1) Start guest with four nic models.
-    2) Setup bond0 in guest by script bonding_setup.py.
+    2) Setup bond0 in guest by script nic_bonding_guest.py.
     3) Execute file transfer test between guest and host.
     4) Repeatedly put down/up interfaces by set_link
     5) Execute file transfer test between guest and host.
@@ -34,9 +34,9 @@ def run_nic_bonding(test, params, env):
     vm = env.get_vm(params["main_vm"])
     vm.verify_alive()
     session_serial = vm.wait_for_serial_login(timeout=timeout)
-    script_path = kvm_utils.get_path(test.bindir, "scripts/bonding_setup.py")
-    vm.copy_files_to(script_path, "/tmp/bonding_setup.py")
-    cmd = "python /tmp/bonding_setup.py %s" % vm.get_mac_address()
+    script_path = kvm_utils.get_path(test.bindir, "scripts/nic_bonding_guest.py")
+    vm.copy_files_to(script_path, "/tmp/nic_bonding_guest.py")
+    cmd = "python /tmp/nic_bonding_guest.py %s" % vm.get_mac_address()
     session_serial.cmd(cmd)
 
     termination_event = threading.Event()
--
1.7.3.4
[PATCH 2/4] KVM test: Renaming join_mcast.py to multicast_guest.py
Signed-off-by: Lucas Meneghel Rodrigues --- client/tests/kvm/scripts/join_mcast.py | 37 --- client/tests/kvm/scripts/multicast_guest.py | 37 +++ client/tests/kvm/tests/multicast.py |4 +- 3 files changed, 39 insertions(+), 39 deletions(-) delete mode 100755 client/tests/kvm/scripts/join_mcast.py create mode 100755 client/tests/kvm/scripts/multicast_guest.py diff --git a/client/tests/kvm/scripts/join_mcast.py b/client/tests/kvm/scripts/join_mcast.py deleted file mode 100755 index 350cd5f..000 --- a/client/tests/kvm/scripts/join_mcast.py +++ /dev/null @@ -1,37 +0,0 @@ -#!/usr/bin/python -import socket, struct, os, signal, sys -# -*- coding: utf-8 -*- - -""" -Script used to join machine into multicast groups. - -@author Amos Kong -""" - -if __name__ == "__main__": -if len(sys.argv) < 4: -print """%s [mgroup_count] [prefix] [suffix] -mgroup_count: count of multicast addresses -prefix: multicast address prefix -suffix: multicast address suffix""" % sys.argv[0] -sys.exit() - -mgroup_count = int(sys.argv[1]) -prefix = sys.argv[2] -suffix = int(sys.argv[3]) - -s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) -for i in range(mgroup_count): -mcast = prefix + "." + str(suffix + i) -try: -mreq = struct.pack("4sl", socket.inet_aton(mcast), - socket.INADDR_ANY) -s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq) -except: -s.close() -print "Could not join multicast: %s" % mcast -raise - -print "join_mcast_pid:%s" % os.getpid() -os.kill(os.getpid(), signal.SIGSTOP) -s.close() diff --git a/client/tests/kvm/scripts/multicast_guest.py b/client/tests/kvm/scripts/multicast_guest.py new file mode 100755 index 000..350cd5f --- /dev/null +++ b/client/tests/kvm/scripts/multicast_guest.py @@ -0,0 +1,37 @@ +#!/usr/bin/python +import socket, struct, os, signal, sys +# -*- coding: utf-8 -*- + +""" +Script used to join machine into multicast groups. 
+ +@author Amos Kong +""" + +if __name__ == "__main__": +if len(sys.argv) < 4: +print """%s [mgroup_count] [prefix] [suffix] +mgroup_count: count of multicast addresses +prefix: multicast address prefix +suffix: multicast address suffix""" % sys.argv[0] +sys.exit() + +mgroup_count = int(sys.argv[1]) +prefix = sys.argv[2] +suffix = int(sys.argv[3]) + +s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) +for i in range(mgroup_count): +mcast = prefix + "." + str(suffix + i) +try: +mreq = struct.pack("4sl", socket.inet_aton(mcast), + socket.INADDR_ANY) +s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq) +except: +s.close() +print "Could not join multicast: %s" % mcast +raise + +print "join_mcast_pid:%s" % os.getpid() +os.kill(os.getpid(), signal.SIGSTOP) +s.close() diff --git a/client/tests/kvm/tests/multicast.py b/client/tests/kvm/tests/multicast.py index ddb7807..5dfecbc 100644 --- a/client/tests/kvm/tests/multicast.py +++ b/client/tests/kvm/tests/multicast.py @@ -53,9 +53,9 @@ def run_multicast(test, params, env): prefix = re.findall("\d+.\d+.\d+", mcast)[0] suffix = int(re.findall("\d+", mcast)[-1]) # copy python script to guest for joining guest to multicast groups -mcast_path = os.path.join(test.bindir, "scripts/join_mcast.py") +mcast_path = os.path.join(test.bindir, "scripts/multicast_guest.py") vm.copy_files_to(mcast_path, "/tmp") -output = session.cmd_output("python /tmp/join_mcast.py %d %s %d" % +output = session.cmd_output("python /tmp/multicast_guest.py %d %s %d" % (mgroup_count, prefix, suffix)) # if success to join multicast, the process will be paused, and return PID. -- 1.7.3.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/4] KVM test: renaming allocator.py to ksm_overcommit_guest.py
Signed-off-by: Lucas Meneghel Rodrigues --- client/tests/kvm/scripts/allocator.py| 237 -- client/tests/kvm/scripts/ksm_overcommit_guest.py | 237 ++ client/tests/kvm/tests/ksm_overcommit.py | 40 ++-- 3 files changed, 258 insertions(+), 256 deletions(-) delete mode 100755 client/tests/kvm/scripts/allocator.py create mode 100755 client/tests/kvm/scripts/ksm_overcommit_guest.py diff --git a/client/tests/kvm/scripts/allocator.py b/client/tests/kvm/scripts/allocator.py deleted file mode 100755 index 09dc004..000 --- a/client/tests/kvm/scripts/allocator.py +++ /dev/null @@ -1,237 +0,0 @@ -#!/usr/bin/python -# -*- coding: utf-8 -*- -""" -Auxiliary script used to allocate memory on guests. - -@copyright: 2008-2009 Red Hat Inc. -@author: Jiri Zupka (jzu...@redhat.com) -""" - - -import os, array, sys, struct, random, copy, inspect, tempfile, datetime, math - -PAGE_SIZE = 4096 # machine page size - -TMPFS_OVERHEAD = 0.0022 # overhead on 1MB of write data - - -class MemFill(object): -""" -Fills guest memory according to certain patterns. -""" -def __init__(self, mem, static_value, random_key): -""" -Constructor of MemFill class. - -@param mem: Amount of test memory in MB. -@param random_key: Seed of random series used for fill up memory. -@param static_value: Value used to fill all memory. 
-""" -if (static_value < 0 or static_value > 255): -print ("FAIL: Initialization static value" - "can be only in range (0..255)") -return - -self.tmpdp = tempfile.mkdtemp() -ret_code = os.system("mount -o size=%dM tmpfs %s -t tmpfs" % - ((mem+math.ceil(mem*TMPFS_OVERHEAD)), - self.tmpdp)) -if ret_code != 0: -if os.getuid() != 0: -print ("FAIL: Unable to mount tmpfs " - "(likely cause: you are not root)") -else: -print "FAIL: Unable to mount tmpfs" -else: -self.f = tempfile.TemporaryFile(prefix='mem', dir=self.tmpdp) -self.allocate_by = 'L' -self.npages = ((mem * 1024 * 1024) / PAGE_SIZE) -self.random_key = random_key -self.static_value = static_value -print "PASS: Initialization" - - -def __del__(self): -if os.path.ismount(self.tmpdp): -self.f.close() -os.system("umount %s" % (self.tmpdp)) - - -def compare_page(self, original, inmem): -""" -Compare pages of memory and print the differences found. - -@param original: Data that was expected to be in memory. -@param inmem: Data in memory. -""" -for ip in range(PAGE_SIZE / original.itemsize): -if (not original[ip] == inmem[ip]): # find which item is wrong -originalp = array.array("B") -inmemp = array.array("B") -originalp.fromstring(original[ip:ip+1].tostring()) -inmemp.fromstring(inmem[ip:ip+1].tostring()) -for ib in range(len(originalp)): # find wrong byte in item -if not (originalp[ib] == inmemp[ib]): -position = (self.f.tell() - PAGE_SIZE + ip * -original.itemsize + ib) -print ("Mem error on position %d wanted 0x%Lx and is " - "0x%Lx" % (position, originalp[ib], inmemp[ib])) - - -def value_page(self, value): -""" -Create page filled by value. - -@param value: String we want to fill the page with. -@return: return array of bytes size PAGE_SIZE. -""" -a = array.array("B") -for i in range((PAGE_SIZE / a.itemsize)): -try: -a.append(value) -except: -print "FAIL: Value can be only in range (0..255)" -return a - - -def random_page(self, seed): -""" -Create page filled by static random series. 
- -@param seed: Seed of random series. -@return: Static random array series. -""" -random.seed(seed) -a = array.array(self.allocate_by) -for i in range(PAGE_SIZE / a.itemsize): -a.append(random.randrange(0, sys.maxint)) -return a - - -def value_fill(self, value=None): -""" -Fill memory page by page, with value generated with value_page. - -@param value: Parameter to be passed to value_page. None to just use -what's on the attribute static_value. -""" -self.f.seek(0) -if value is None: -value = self.static_value -page = self.value_page(value) -for pages in range(self.npages): -page.tofile(self.f) -print "PASS: Mem value fill" - - -def value_check(se
[PATCH 0/4] Renaming scripts that we run on guests
For the sake of clarity, we establish a convention: scripts copied to guests and executed there by tests will be called [test_name]_guest.py. This patchset takes care of renaming the scripts.

Lucas Meneghel Rodrigues (4):
  KVM test: Renaming script bonding_setup.py to nic_bonding_guest.py
  KVM test: Renaming join_mcast.py to multicast_guest.py
  KVM test: renaming allocator.py to ksm_overcommit_guest.py
  KVM test: Rename virtio_guest.py to virtio_console_guest.py

 client/tests/kvm/scripts/allocator.py            |  237 ---
 client/tests/kvm/scripts/bonding_setup.py        |   37 --
 client/tests/kvm/scripts/join_mcast.py           |   37 --
 client/tests/kvm/scripts/ksm_overcommit_guest.py |  237 +++
 client/tests/kvm/scripts/multicast_guest.py      |   37 ++
 client/tests/kvm/scripts/nic_bonding_guest.py    |   37 ++
 client/tests/kvm/scripts/virtio_console_guest.py |  715 ++
 client/tests/kvm/scripts/virtio_guest.py         |  715 --
 client/tests/kvm/tests/ksm_overcommit.py         |   40 +-
 client/tests/kvm/tests/multicast.py              |    4 +-
 client/tests/kvm/tests/nic_bonding.py            |    8 +-
 client/tests/kvm/tests/virtio_console.py         |   20 +-
 12 files changed, 1063 insertions(+), 1061 deletions(-)
 delete mode 100755 client/tests/kvm/scripts/allocator.py
 delete mode 100644 client/tests/kvm/scripts/bonding_setup.py
 delete mode 100755 client/tests/kvm/scripts/join_mcast.py
 create mode 100755 client/tests/kvm/scripts/ksm_overcommit_guest.py
 create mode 100755 client/tests/kvm/scripts/multicast_guest.py
 create mode 100644 client/tests/kvm/scripts/nic_bonding_guest.py
 create mode 100755 client/tests/kvm/scripts/virtio_console_guest.py
 delete mode 100755 client/tests/kvm/scripts/virtio_guest.py
--
1.7.3.4
[PATCH 6/6] KVM test: Removing enospc pre and post scripts
As their functionality has been reimplemented as framework functionality Signed-off-by: Lucas Meneghel Rodrigues --- client/tests/kvm/scripts/enospc-post.py | 77 --- client/tests/kvm/scripts/enospc-pre.py | 73 - 2 files changed, 0 insertions(+), 150 deletions(-) delete mode 100755 client/tests/kvm/scripts/enospc-post.py delete mode 100755 client/tests/kvm/scripts/enospc-pre.py diff --git a/client/tests/kvm/scripts/enospc-post.py b/client/tests/kvm/scripts/enospc-post.py deleted file mode 100755 index c6714f2..000 --- a/client/tests/kvm/scripts/enospc-post.py +++ /dev/null @@ -1,77 +0,0 @@ -#!/usr/bin/python -""" -Simple script to setup enospc test environment -""" -import os, commands, sys - -SCRIPT_DIR = os.path.dirname(sys.modules[__name__].__file__) -KVM_TEST_DIR = os.path.abspath(os.path.join(SCRIPT_DIR, "..")) - -class SetupError(Exception): -""" -Simple wrapper for the builtin Exception class. -""" -pass - - -def find_command(cmd): -""" -Searches for a command on common paths, error if it can't find it. - -@param cmd: Command to be found. -""" -if os.path.exists(cmd): -return cmd -for dir in ["/usr/local/sbin", "/usr/local/bin", -"/usr/sbin", "/usr/bin", "/sbin", "/bin"]: -file = os.path.join(dir, cmd) -if os.path.exists(file): -return file -raise ValueError('Missing command: %s' % cmd) - - -def run(cmd, info=None): -""" -Run a command and throw an exception if it fails. -Optionally, you can provide additional contextual info. - -@param cmd: Command string. -@param reason: Optional string that explains the context of the failure. - -@raise: SetupError if command fails. 
-""" -print "Running '%s'" % cmd -cmd_name = cmd.split(' ')[0] -find_command(cmd_name) -status, output = commands.getstatusoutput(cmd) -if status: -e_msg = ('Command %s failed.\nStatus:%s\nOutput:%s' % - (cmd, status, output)) -if info is not None: -e_msg += '\nAdditional Info:%s' % info -raise SetupError(e_msg) - -return (status, output) - - -if __name__ == "__main__": -qemu_img_binary = os.environ['KVM_TEST_qemu_img_binary'] -if not os.path.isabs(qemu_img_binary): -qemu_img_binary = os.path.join(KVM_TEST_DIR, qemu_img_binary) -if not os.path.exists(qemu_img_binary): -raise SetupError('The qemu-img binary that is supposed to be used ' - '(%s) does not exist. Please verify your ' - 'configuration' % qemu_img_binary) - -run("lvremove -f vgtest") -status, output = run("losetup -a") -loopback_device = None -if output: -for line in output.splitlines(): -device = line.split(":")[0] -if "/tmp/enospc.raw" in line: -loopback_device = device -break -if loopback_device is not None: -run("losetup -d %s" % loopback_device) -run("rm -rf /tmp/enospc.raw /tmp/kvm_autotest_root/images/enospc.qcow2") diff --git a/client/tests/kvm/scripts/enospc-pre.py b/client/tests/kvm/scripts/enospc-pre.py deleted file mode 100755 index 1313de3..000 --- a/client/tests/kvm/scripts/enospc-pre.py +++ /dev/null @@ -1,73 +0,0 @@ -#!/usr/bin/python -""" -Simple script to setup enospc test environment -""" -import os, commands, sys - -SCRIPT_DIR = os.path.dirname(sys.modules[__name__].__file__) -KVM_TEST_DIR = os.path.abspath(os.path.join(SCRIPT_DIR, "..")) - -class SetupError(Exception): -""" -Simple wrapper for the builtin Exception class. -""" -pass - - -def find_command(cmd): -""" -Searches for a command on common paths, error if it can't find it. - -@param cmd: Command to be found. 
-""" -if os.path.exists(cmd): -return cmd -for dir in ["/usr/local/sbin", "/usr/local/bin", -"/usr/sbin", "/usr/bin", "/sbin", "/bin"]: -file = os.path.join(dir, cmd) -if os.path.exists(file): -return file -raise ValueError('Missing command: %s' % cmd) - - -def run(cmd, info=None): -""" -Run a command and throw an exception if it fails. -Optionally, you can provide additional contextual info. - -@param cmd: Command string. -@param reason: Optional string that explains the context of the failure. - -@raise: SetupError if command fails. -""" -print "Running '%s'" % cmd -cmd_name = cmd.split(' ')[0] -find_command(cmd_name) -status, output = commands.getstatusoutput(cmd) -if status: -e_msg = ('Command %s failed.\nStatus:%s\nOutput:%s' % - (cmd, status, output)) -if info is not None: -e_msg += '\nAdditional Info:%s' % info -raise SetupError(e_msg) - -return (status, output.strip()) - - -if __name__ == "__main__": -qemu_img_binary = os.environ['KVM_TEST_q
[PATCH 3/6] KVM test: Removing scripts/unattended.py
Now that its functionality was implemented as part of the framework. Signed-off-by: Lucas Meneghel Rodrigues --- client/tests/kvm/scripts/unattended.py | 543 client/tests/kvm/tests_base.cfg.sample |2 - 2 files changed, 0 insertions(+), 545 deletions(-) delete mode 100755 client/tests/kvm/scripts/unattended.py diff --git a/client/tests/kvm/scripts/unattended.py b/client/tests/kvm/scripts/unattended.py deleted file mode 100755 index e65fe46..000 --- a/client/tests/kvm/scripts/unattended.py +++ /dev/null @@ -1,543 +0,0 @@ -#!/usr/bin/python -""" -Simple script to setup unattended installs on KVM guests. -""" -# -*- coding: utf-8 -*- -import os, sys, shutil, tempfile, re, ConfigParser, glob, inspect, commands -import common - - -SCRIPT_DIR = os.path.dirname(sys.modules[__name__].__file__) -KVM_TEST_DIR = os.path.abspath(os.path.join(SCRIPT_DIR, "..")) - - -class SetupError(Exception): -""" -Simple wrapper for the builtin Exception class. -""" -pass - - -def find_command(cmd): -""" -Searches for a command on common paths, error if it can't find it. - -@param cmd: Command to be found. -""" -if os.path.exists(cmd): -return cmd -for dir in ["/usr/local/sbin", "/usr/local/bin", -"/usr/sbin", "/usr/bin", "/sbin", "/bin"]: -file = os.path.join(dir, cmd) -if os.path.exists(file): -return file -raise ValueError('Missing command: %s' % cmd) - - -def run(cmd, info=None): -""" -Run a command and throw an exception if it fails. -Optionally, you can provide additional contextual info. - -@param cmd: Command string. -@param reason: Optional string that explains the context of the failure. - -@raise: SetupError if command fails. 
-""" -print "Running '%s'" % cmd -cmd_name = cmd.split(' ')[0] -find_command(cmd_name) -status, output = commands.getstatusoutput(cmd) -if status: -e_msg = ('Command %s failed.\nStatus:%s\nOutput:%s' % - (cmd, status, output)) -if info is not None: -e_msg += '\nAdditional Info:%s' % info -raise SetupError(e_msg) - -return (status, output.strip()) - - -def cleanup(dir): -""" -If dir is a mountpoint, do what is possible to unmount it. Afterwards, -try to remove it. - -@param dir: Directory to be cleaned up. -""" -print "Cleaning up directory %s" % dir -if os.path.ismount(dir): -os.system('fuser -k %s' % dir) -run('umount %s' % dir, info='Could not unmount %s' % dir) -if os.path.isdir(dir): -shutil.rmtree(dir) - - -def clean_old_image(image): -""" -Clean a leftover image file from previous processes. If it contains a -mounted file system, do the proper cleanup procedures. - -@param image: Path to image to be cleaned up. -""" -if os.path.exists(image): -mtab = open('/etc/mtab', 'r') -mtab_contents = mtab.read() -mtab.close() -if image in mtab_contents: -os.system('fuser -k %s' % image) -os.system('umount %s' % image) -os.remove(image) - - -class Disk(object): -""" -Abstract class for Disk objects, with the common methods implemented. -""" -def __init__(self): -self.path = None - - -def setup_answer_file(self, filename, contents): -answer_file = open(os.path.join(self.mount, filename), 'w') -answer_file.write(contents) -answer_file.close() - - -def copy_to(self, src): -dst = os.path.join(self.mount, os.path.basename(src)) -if os.path.isdir(src): -shutil.copytree(src, dst) -elif os.path.isfile(src): -shutil.copyfile(src, dst) - - -def close(self): -os.chmod(self.path, 0755) -cleanup(self.mount) -print "Disk %s successfuly set" % self.path - - -class FloppyDisk(Disk): -""" -Represents a 1.44 MB floppy disk. We can copy files to it, and setup it in -convenient ways. 
-""" -def __init__(self, path): -print "Creating floppy unattended image %s" % path -qemu_img_binary = os.environ['KVM_TEST_qemu_img_binary'] -if not os.path.isabs(qemu_img_binary): -qemu_img_binary = os.path.join(KVM_TEST_DIR, qemu_img_binary) -if not os.path.exists(qemu_img_binary): -raise SetupError('The qemu-img binary that is supposed to be used ' - '(%s) does not exist. Please verify your ' - 'configuration' % qemu_img_binary) - -self.mount = tempfile.mkdtemp(prefix='floppy_', dir='/tmp') -self.virtio_mount = None -self.path = path -clean_old_image(path) -if not os.path.isdir(os.path.dirname(path)): -os.makedirs(os.path.dirname(path)) - -try: -c_cmd = '%s create -f raw %s 1440k' % (qemu_img_binary, path) -run(c_cm
[PATCH 5/6] KVM test: Turn enospc test pre/post actions into infrastructure
So we can get rid of the pre/post scripts. With the rearrangement we were able to achieve several advantages:

 - A more rigorous and paranoid cleanup phase
 - Better identification of the lvm devices, making conflicts with other
   devices in the host less likely
 - Use of the shared autotest API, avoiding code duplication

Signed-off-by: Lucas Meneghel Rodrigues
---
 client/tests/kvm/kvm_preprocessing.py  |    8 ++
 client/tests/kvm/test_setup.py         |  116 +--
 client/tests/kvm/tests/enospc.py       |    6 ++-
 client/tests/kvm/tests_base.cfg.sample |    5 +-
 4 files changed, 124 insertions(+), 11 deletions(-)

diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py
index 081a13f..2713805 100644
--- a/client/tests/kvm/kvm_preprocessing.py
+++ b/client/tests/kvm/kvm_preprocessing.py
@@ -262,6 +262,10 @@ def preprocess(test, params, env):
         u = test_setup.UnattendedInstallConfig(test, params)
         u.setup()
 
+    if params.get("type") == "enospc":
+        e = test_setup.EnospcConfig(test, params)
+        e.setup()
+
     # Execute any pre_commands
     if params.get("pre_command"):
         process_command(test, params, env, params.get("pre_command"),
@@ -362,6 +366,10 @@ def postprocess(test, params, env):
         h = kvm_utils.HugePageConfig(params)
         h.cleanup()
 
+    if params.get("type") == "enospc":
+        e = test_setup.EnospcConfig(test, params)
+        e.cleanup()
+
     # Execute any post_commands
     if params.get("post_command"):
         process_command(test, params, env, params.get("post_command"),
diff --git a/client/tests/kvm/test_setup.py b/client/tests/kvm/test_setup.py
index b17c473..e906e18 100644
--- a/client/tests/kvm/test_setup.py
+++ b/client/tests/kvm/test_setup.py
@@ -2,7 +2,7 @@
 Library to perform pre/post test setup for KVM autotest.
""" import os, sys, shutil, tempfile, re, ConfigParser, glob, inspect, commands -import logging +import logging, time from autotest_lib.client.common_lib import error from autotest_lib.client.bin import utils @@ -42,6 +42,19 @@ def clean_old_image(image): os.remove(image) +def display_attributes(instance): +""" +Inspects a given class instance attributes and displays them, convenient +for debugging. +""" +logging.debug("Attributes set:") +for member in inspect.getmembers(instance): +name, value = member +attribute = getattr(instance, name) +if not (name.startswith("__") or callable(attribute) or not value): +logging.debug("%s: %s", name, value) + + class Disk(object): """ Abstract class for Disk objects, with the common methods implemented. @@ -472,13 +485,7 @@ class UnattendedInstallConfig(object): Uses an appropriate strategy according to each install model. """ logging.info("Starting unattended install setup") - -logging.debug("Variables set:") -for member in inspect.getmembers(self): -name, value = member -attribute = getattr(self, name) -if not (name.startswith("__") or callable(attribute) or not value): -logging.debug("%s: %s", name, value) +display_attributes(self) if self.unattended_file and (self.floppy or self.cdrom_unattended): self.setup_boot_disk() @@ -593,3 +600,96 @@ class HugePageConfig(object): return utils.system("echo 0 > %s" % self.kernel_hp_file) logging.debug("Hugepage memory successfuly dealocated") + + +class EnospcConfig(object): +""" +Performs setup for the test enospc. This is a borg class, similar to a +singleton. The idea is to keep state in memory for when we call cleanup() +on postprocessing. 
+""" +__shared_state = {} +def __init__(self, test, params): +self.__dict__ = self.__shared_state +root_dir = test.bindir +self.tmpdir = test.tmpdir +self.qemu_img_binary = params.get('qemu_img_binary') +if not os.path.isfile(self.qemu_img_binary): +self.qemu_img_binary = os.path.join(root_dir, +self.qemu_img_binary) +self.raw_file_path = os.path.join(self.tmpdir, 'enospc.raw') +# Here we're trying to choose fairly explanatory names so it's less +# likely that we run in conflict with other devices in the system +self.vgtest_name = params.get("vgtest_name") +self.lvtest_name = params.get("lvtest_name") +self.lvtest_device = "/dev/%s/%s" % (self.vgtest_name, self.lvtest_name) +image_dir = os.path.dirname(params.get("image_name")) +self.qcow_file_path = os.path.join(image_dir, 'enospc.qcow2') +try: +getattr(self, 'loopback') +except AttributeError: +self.loopback = '' + + +@error.context_aware +def setup(self): +logging.debug("Starting enospc setup") +
[PATCH 1/6] KVM test: Introducing test_setup library
In order to concentrate setup classes for the KVM autotest tests, create test_setup.py. This library will contain code used to perform actions prior to the actual test execution, putting some hooks on the test postprocessing code. The first class in there is the UnattendedInstallConfig class, which prepares the environment for unattended installs.

Advantages of doing this in framework code:

 - Setup errors are easier to figure out than having a 'pre command failed' error reason.
 - We can use test.tmpdir to store temp dirs, which makes things even cleaner and less intrusive in the system.
 - Less code duplication.

Signed-off-by: Lucas Meneghel Rodrigues
---
 client/tests/kvm/test_setup.py |  494
 1 files changed, 494 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/kvm/test_setup.py

diff --git a/client/tests/kvm/test_setup.py b/client/tests/kvm/test_setup.py
new file mode 100644
index 000..7b7ef14
--- /dev/null
+++ b/client/tests/kvm/test_setup.py
@@ -0,0 +1,494 @@
+"""
+Library to perform pre/post test setup for KVM autotest.
+"""
+import os, sys, shutil, tempfile, re, ConfigParser, glob, inspect, commands
+import logging
+from autotest_lib.client.common_lib import error
+from autotest_lib.client.bin import utils
+
+
+@error.context_aware
+def cleanup(dir):
+    """
+    If dir is a mountpoint, do what is possible to unmount it. Afterwards,
+    try to remove it.
+
+    @param dir: Directory to be cleaned up.
+    """
+    error.context("cleaning up unattended install directory %s" % dir)
+    if os.path.ismount(dir):
+        utils.run('fuser -k %s' % dir, ignore_status=True)
+        utils.run('umount %s' % dir)
+    if os.path.isdir(dir):
+        shutil.rmtree(dir)
+
+
+@error.context_aware
+def clean_old_image(image):
+    """
+    Clean a leftover image file from previous processes. If it contains a
+    mounted file system, do the proper cleanup procedures.
+
+    @param image: Path to image to be cleaned up.
+""" +error.context("cleaning up old leftover image %s" % image) +if os.path.exists(image): +mtab = open('/etc/mtab', 'r') +mtab_contents = mtab.read() +mtab.close() +if image in mtab_contents: +utils.run('fuser -k %s' % image, ignore_status=True) +utils.run('umount %s' % image) +os.remove(image) + + +class Disk(object): +""" +Abstract class for Disk objects, with the common methods implemented. +""" +def __init__(self): +self.path = None + + +def setup_answer_file(self, filename, contents): +utils.open_write_close(os.path.join(self.mount, filename), contents) + + +def copy_to(self, src): +dst = os.path.join(self.mount, os.path.basename(src)) +if os.path.isdir(src): +shutil.copytree(src, dst) +elif os.path.isfile(src): +shutil.copyfile(src, dst) + + +def close(self): +os.chmod(self.path, 0755) +cleanup(self.mount) +logging.debug("Disk %s successfuly set", self.path) + + +class FloppyDisk(Disk): +""" +Represents a 1.44 MB floppy disk. We can copy files to it, and setup it in +convenient ways. +""" +@error.context_aware +def __init__(self, path, qemu_img_binary, tmpdir): +error.context("Creating unattended install floppy image %s" % path) +self.tmpdir = tmpdir +self.mount = tempfile.mkdtemp(prefix='floppy_', dir=self.tmpdir) +self.virtio_mount = None +self.path = path +clean_old_image(path) +if not os.path.isdir(os.path.dirname(path)): +os.makedirs(os.path.dirname(path)) + +try: +c_cmd = '%s create -f raw %s 1440k' % (qemu_img_binary, path) +utils.run(c_cmd) +f_cmd = 'mkfs.msdos -s 1 %s' % path +utils.run(f_cmd) +m_cmd = 'mount -o loop,rw %s %s' % (path, self.mount) +utils.run(m_cmd) +except error.CmdError, e: +cleanup(self.mount) +raise + + +def _copy_virtio_drivers(self, virtio_floppy): +""" +Copy the virtio drivers on the virtio floppy to the install floppy. 
+ +1) Mount the floppy containing the viostor drivers +2) Copy its contents to the root of the install floppy +""" +virtio_mount = tempfile.mkdtemp(prefix='virtio_floppy_', +dir=self.tmpdir) + +pwd = os.getcwd() +try: +m_cmd = 'mount -o loop %s %s' % (virtio_floppy, virtio_mount) +utils.run(m_cmd) +os.chdir(virtio_mount) +path_list = glob.glob('*') +for path in path_list: +self.copy_to(path) +finally: +os.chdir(pwd) +cleanup(virtio_mount) + + +def setup_virtio_win2003(self, virtio_floppy, virtio_oemsetup_id): +""" +Setup the install floppy with the virtio s
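For reference, the floppy bootstrap performed by FloppyDisk.__init__ above boils down to three shell commands run in sequence. A sketch that merely assembles those commands (the paths and the helper name are hypothetical; the real code executes each one via utils.run):

```python
def floppy_setup_cmds(qemu_img_binary, path, mount_point):
    """Assemble the shell steps FloppyDisk.__init__ runs: create a raw
    1.44 MB image, lay down a FAT filesystem on it, and loop-mount it
    read-write so files can be copied in."""
    return [
        '%s create -f raw %s 1440k' % (qemu_img_binary, path),
        'mkfs.msdos -s 1 %s' % path,
        'mount -o loop,rw %s %s' % (path, mount_point),
    ]

cmds = floppy_setup_cmds('qemu-img', '/tmp/floppy.img', '/tmp/floppy_mnt')
```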
[PATCH 2/6] KVM test: Make unattended _install use the new pre script
Also, get rid of references to the old unattended install script. Signed-off-by: Lucas Meneghel Rodrigues --- client/tests/kvm/kvm_preprocessing.py |6 +- client/tests/kvm/tests_base.cfg.sample |2 -- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py index 41455cf..12adb6a 100644 --- a/client/tests/kvm/kvm_preprocessing.py +++ b/client/tests/kvm/kvm_preprocessing.py @@ -1,7 +1,7 @@ import sys, os, time, commands, re, logging, signal, glob, threading, shutil from autotest_lib.client.bin import test, utils from autotest_lib.client.common_lib import error -import kvm_vm, kvm_utils, kvm_subprocess, kvm_monitor, ppm_utils +import kvm_vm, kvm_utils, kvm_subprocess, kvm_monitor, ppm_utils, test_setup try: import PIL.Image except ImportError: @@ -258,6 +258,10 @@ def preprocess(test, params, env): h = kvm_utils.HugePageConfig(params) h.setup() +if params.get("type") == "unattended_install": +u = test_setup.UnattendedInstallConfig(test, params) +u.setup() + # Execute any pre_commands if params.get("pre_command"): process_command(test, params, env, params.get("pre_command"), diff --git a/client/tests/kvm/tests_base.cfg.sample b/client/tests/kvm/tests_base.cfg.sample index 184a582..c727c32 100644 --- a/client/tests/kvm/tests_base.cfg.sample +++ b/client/tests/kvm/tests_base.cfg.sample @@ -97,7 +97,6 @@ variants: kill_vm_gracefully = yes kill_vm_on_error = yes force_create_image = yes -pre_command += " scripts/unattended.py;" extra_params += " -boot d" guest_port_unattended_install = 12323 kernel = vmlinuz @@ -381,7 +380,6 @@ variants: # The support VM is identical to the tested VM in every way # except for the image name which ends with '-supportvm'. 
type = unattended_install -pre_command += " scripts/unattended.py;" extra_params += " -boot d" force_create_image = yes kill_vm = yes -- 1.7.3.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/6] KVM config: Move HugePageConfig() to test_setup
So we concentrate the setup classes together. Signed-off-by: Lucas Meneghel Rodrigues --- client/tests/kvm/kvm_preprocessing.py |2 +- client/tests/kvm/kvm_utils.py | 101 - client/tests/kvm/test_setup.py| 101 + 3 files changed, 102 insertions(+), 102 deletions(-) diff --git a/client/tests/kvm/kvm_preprocessing.py b/client/tests/kvm/kvm_preprocessing.py index 12adb6a..081a13f 100644 --- a/client/tests/kvm/kvm_preprocessing.py +++ b/client/tests/kvm/kvm_preprocessing.py @@ -255,7 +255,7 @@ def preprocess(test, params, env): test.write_test_keyval({"kvm_userspace_version": kvm_userspace_version}) if params.get("setup_hugepages") == "yes": -h = kvm_utils.HugePageConfig(params) +h = test_setup.HugePageConfig(params) h.setup() if params.get("type") == "unattended_install": diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py index 632badb..78c9f25 100644 --- a/client/tests/kvm/kvm_utils.py +++ b/client/tests/kvm/kvm_utils.py @@ -1265,107 +1265,6 @@ class KvmLoggingConfig(logging_config.LoggingConfig): verbose=verbose) -class HugePageConfig: -def __init__(self, params): -""" -Gets environment variable values and calculates the target number -of huge memory pages. - -@param params: Dict like object containing parameters for the test. -""" -self.vms = len(params.objects("vms")) -self.mem = int(params.get("mem")) -self.max_vms = int(params.get("max_vms", 0)) -self.hugepage_path = '/mnt/kvm_hugepage' -self.hugepage_size = self.get_hugepage_size() -self.target_hugepages = self.get_target_hugepages() -self.kernel_hp_file = '/proc/sys/vm/nr_hugepages' - - -def get_hugepage_size(self): -""" -Get the current system setting for huge memory page size. 
-""" -meminfo = open('/proc/meminfo', 'r').readlines() -huge_line_list = [h for h in meminfo if h.startswith("Hugepagesize")] -try: -return int(huge_line_list[0].split()[1]) -except ValueError, e: -raise ValueError("Could not get huge page size setting from " - "/proc/meminfo: %s" % e) - - -def get_target_hugepages(self): -""" -Calculate the target number of hugepages for testing purposes. -""" -if self.vms < self.max_vms: -self.vms = self.max_vms -# memory of all VMs plus qemu overhead of 64MB per guest -vmsm = (self.vms * self.mem) + (self.vms * 64) -return int(vmsm * 1024 / self.hugepage_size) - - -@error.context_aware -def set_hugepages(self): -""" -Sets the hugepage limit to the target hugepage value calculated. -""" -error.context("setting hugepages limit to %s" % self.target_hugepages) -hugepage_cfg = open(self.kernel_hp_file, "r+") -hp = hugepage_cfg.readline() -while int(hp) < self.target_hugepages: -loop_hp = hp -hugepage_cfg.write(str(self.target_hugepages)) -hugepage_cfg.flush() -hugepage_cfg.seek(0) -hp = int(hugepage_cfg.readline()) -if loop_hp == hp: -raise ValueError("Cannot set the kernel hugepage setting " - "to the target value of %d hugepages." % - self.target_hugepages) -hugepage_cfg.close() -logging.debug("Successfuly set %s large memory pages on host ", - self.target_hugepages) - - -@error.context_aware -def mount_hugepage_fs(self): -""" -Verify if there's a hugetlbfs mount set. If there's none, will set up -a hugetlbfs mount using the class attribute that defines the mount -point. 
-""" -error.context("mounting hugepages path") -if not os.path.ismount(self.hugepage_path): -if not os.path.isdir(self.hugepage_path): -os.makedirs(self.hugepage_path) -cmd = "mount -t hugetlbfs none %s" % self.hugepage_path -utils.system(cmd) - - -def setup(self): -logging.debug("Number of VMs this test will use: %d", self.vms) -logging.debug("Amount of memory used by each vm: %s", self.mem) -logging.debug("System setting for large memory page size: %s", - self.hugepage_size) -logging.debug("Number of large memory pages needed for this test: %s", - self.target_hugepages) -self.set_hugepages() -self.mount_hugepage_fs() - - -@error.context_aware -def cleanup(self): -error.context("trying to dealocate hugepage memory") -try: -utils.system("umount %s" % self.hugepage_path) -except error.C
[RFC PATCH 2/2] device-assignment: Count required kvm memory slots
Each MMIO PCI BAR of an assigned device is directly mapped via a KVM memory slot to avoid bouncing reads and writes through qemu. KVM only provides a (small) fixed number of these slots, and attempting to exceed the unadvertised limit results in an abort. We can't reserve slots, but let's at least attempt to check whether there are enough available before adding a device.

The non-hotplug case is troublesome here because we have no visibility into what else might make use of these slots but hasn't yet been mapped. We used to limit the number of devices that could be specified on the command line using the -pcidevice option. The heuristic here seems to work and provides a similar limit.

We can also avoid using these memory slots by allowing devices to bounce mmio access through qemu. This is trivially accomplished by adding a force_slow=on option to pci-assign.

Signed-off-by: Alex Williamson
---
 hw/device-assignment.c |   59 +++-
 hw/device-assignment.h |    3 ++
 2 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index e97f565..0063a11 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -546,7 +546,9 @@ static int assigned_dev_register_regions(PCIRegion *io_regions,
                 ? PCI_BASE_ADDRESS_MEM_PREFETCH
                 : PCI_BASE_ADDRESS_SPACE_MEMORY;
 
-        if (cur_region->size & 0xFFF) {
+        if (pci_dev->features & ASSIGNED_DEVICE_FORCE_SLOW_MASK) {
+            slow_map = 1;
+        } else if (cur_region->size & 0xFFF) {
             fprintf(stderr, "PCI region %d at address 0x%llx "
                     "has size 0x%x, which is not a multiple of 4K. 
" "You might experience some performance hit " @@ -556,6 +558,10 @@ static int assigned_dev_register_regions(PCIRegion *io_regions, slow_map = 1; } +if (!slow_map) { +pci_dev->slots_needed++; +} + /* map physical memory */ pci_dev->v_addrs[i].e_physbase = cur_region->base_addr; pci_dev->v_addrs[i].u.r_virtbase = mmap(NULL, cur_region->size, @@ -1666,6 +1672,30 @@ static CPUReadMemoryFunc *msix_mmio_read[] = { static int assigned_dev_register_msix_mmio(AssignedDevice *dev) { +int i; +PCIRegion *pci_region = dev->real_device.regions; + +/* Determine if the MSI-X table splits a BAR, requiring the use of + * two memory slots, one to map each remaining part. */ +if (!(dev->features & ASSIGNED_DEVICE_FORCE_SLOW_MASK)) { +for (i = 0; i < dev->real_device.region_number; i++, pci_region++) { +if (!pci_region->valid) { +continue; +} + +if (ranges_overlap(pci_region->base_addr, pci_region->size, + dev->msix_table_addr, 0x1000)) { +target_phys_addr_t offset; + +offset = dev->msix_table_addr - pci_region->base_addr; +if (offset && pci_region->size > offset + 0x1000) { +dev->slots_needed++; +} +break; +} +} +} + dev->msix_table_page = mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); @@ -1768,6 +1798,31 @@ static int assigned_initfn(struct PCIDevice *pci_dev) if (assigned_dev_register_msix_mmio(dev)) goto assigned_out; +if (!(dev->features & ASSIGNED_DEVICE_FORCE_SLOW_MASK)) { +int free_slots = kvm_free_slots(); +int total_slots = dev->slots_needed; + +if (!dev->dev.qdev.hotplugged) { +AssignedDevice *adev; + +QLIST_FOREACH(adev, &devs, next) { +total_slots += adev->slots_needed; +} + +/* This seems to work, but it's completely heuristically + * determined. Any number of things might make use of kvm + * memory slots before the guest starts mapping memory BARs. + * This is really just a guess. 
*/ +free_slots -= 13; +} + +if (total_slots > free_slots) { +error_report("pci-assign: Out of memory slots, need %d, have %d\n", + total_slots, free_slots); +goto assigned_out; +} +} + assigned_dev_load_option_rom(dev); QLIST_INSERT_HEAD(&devs, dev, next); @@ -1837,6 +1892,8 @@ static PCIDeviceInfo assign_info = { ASSIGNED_DEVICE_USE_IOMMU_BIT, true), DEFINE_PROP_BIT("prefer_msi", AssignedDevice, features, ASSIGNED_DEVICE_PREFER_MSI_BIT, true), +DEFINE_PROP_BIT("force_slow", AssignedDevice, features, +ASSIGNED_DEVICE_FORCE_SLOW_BIT, false), DEFINE_PROP_STRING("configfd", AssignedDevice
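The BAR-splitting test in assigned_dev_register_msix_mmio() above can be modeled compactly. A sketch (Python for brevity; the helper names are hypothetical, mirroring qemu's ranges_overlap() and the offset check in the patch):

```python
def ranges_overlap(start1, size1, start2, size2):
    """Sketch of qemu's ranges_overlap() helper: half-open intervals."""
    return start1 < start2 + size2 and start2 < start1 + size1

def extra_slots_for_msix(bar_base, bar_size, msix_table_addr):
    """If the 4K MSI-X table falls strictly inside a BAR, the BAR is
    split into two directly-mapped pieces, costing one extra slot."""
    if not ranges_overlap(bar_base, bar_size, msix_table_addr, 0x1000):
        return 0
    offset = msix_table_addr - bar_base
    if offset and bar_size > offset + 0x1000:
        return 1        # a mapped piece remains on each side of the table
    return 0            # table sits at an edge: only one piece remains
```

A table at the start or end of a BAR leaves a single remaining region, so only a table in the middle forces the second slot.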
[RFC PATCH 1/2] kvm: Allow querying free slots
KVM memory slots are used any place we want a guest to have direct access to a chunk of memory. Unfortunately, there's only a small, fixed number of them, and accidentally going over the limit causes an abort. Add a trivial interface so that callers can at least guess if they have a chance to successfully map memory. Signed-off-by: Alex Williamson --- kvm-all.c | 16 kvm.h |2 ++ 2 files changed, 18 insertions(+), 0 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 2f203dd..4fe3631 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -96,6 +96,22 @@ static KVMSlot *kvm_alloc_slot(KVMState *s) abort(); } +int kvm_free_slots(void) +{ +KVMState *s = kvm_state; +int i, j; + +for (i = 0, j = 0; i < ARRAY_SIZE(s->slots); i++) { +/* KVM private memory slots and used slots */ +if ((i >= 8 && i < 12) || s->slots[i].memory_size) { +continue; +} +j++; +} + +return j; +} + static KVMSlot *kvm_lookup_matching_slot(KVMState *s, target_phys_addr_t start_addr, target_phys_addr_t end_addr) diff --git a/kvm.h b/kvm.h index 02280a6..93da155 100644 --- a/kvm.h +++ b/kvm.h @@ -221,4 +221,6 @@ int kvm_irqchip_in_kernel(void); int kvm_set_irq(int irq, int level, int *status); +int kvm_free_slots(void); + #endif -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
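A toy model of the counting loop in the proposed kvm_free_slots(): the slot array is represented here as a dict of index -> memory_size, and the 32-slot total and private window 8-11 mirror the hard-coded values in the patch (both are assumptions about the build, not stable ABI):

```python
def kvm_free_slots(slot_sizes, nr_slots=32, private=range(8, 12)):
    """Count slots that are neither reserved for KVM-internal use
    (indices 8-11 here) nor already populated (nonzero memory_size)."""
    free = 0
    for i in range(nr_slots):
        if i in private or slot_sizes.get(i, 0):
            continue
        free += 1
    return free

# with 3 user slots occupied: 32 total - 4 private - 3 used = 25 free
n = kvm_free_slots({0: 1 << 20, 1: 4096, 2: 4096})
```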
[RFC PATCH 0/2] Expose available KVM free memory slot count to help avoid aborts
When doing device assignment, we use cpu_register_physical_memory() to directly map the qemu mmap of the device resource into the address space of the guest. The unadvertised feature of the register physical memory code path on kvm, at least for this type of mapping, is that it needs to allocate an index from a small, fixed array of memory slots. Even better, if it can't get an index, the code aborts deep in the kvm specific bits, preventing the caller from having a chance to recover. It's really easy to hit this by hot adding too many assigned devices to a guest (pretty easy to hit with too many devices at instantiation time too, but the abort is slightly more bearable there). I'm assuming it's pretty difficult to make the memory slot array dynamically sized. If that's not the case, please let me know as that would be a much better solution. I'm not terribly happy with the solution in this series, it doesn't provide any guarantees whether a cpu_register_physical_memory() will succeed, only slightly better educated guesses. Are there better ideas how we could solve this? Thanks, Alex --- Alex Williamson (2): device-assignment: Count required kvm memory slots kvm: Allow querying free slots hw/device-assignment.c | 59 +++- hw/device-assignment.h |3 ++ kvm-all.c | 16 + kvm.h |2 ++ 4 files changed, 79 insertions(+), 1 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Flow Control and Port Mirroring Revisited
On Fri, Jan 21, 2011 at 11:59:30AM +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 05:38:33PM +0900, Simon Horman wrote:
> > [ Trimmed Eric from CC list as vger was complaining that it is too long ]
> >
> > On Tue, Jan 18, 2011 at 11:41:22AM -0800, Rick Jones wrote:
> > > >So it won't be all that simple to implement well, and before we try,
> > > >I'd like to know whether there are applications that are helped
> > > >by it. For example, we could try to measure latency at various
> > > >pps and see whether the backpressure helps. netperf has -b, -w
> > > >flags which might help these measurements.
> > >
> > > Those options are enabled when one adds --enable-burst to the
> > > pre-compilation ./configure of netperf (one doesn't have to
> > > recompile netserver). However, if one is also looking at latency
> > > statistics via the -j option in the top-of-trunk, or simply at the
> > > histogram with --enable-histogram on the ./configure and a verbosity
> > > level of 2 (global -v 2) then one wants the very top of trunk
> > > netperf from:
> >
> > Hi,
> >
> > I have constructed a test where I run an un-paced UDP_STREAM test in
> > one guest and a paced omni rr test in another guest at the same time.
>
> Hmm, what is this supposed to measure? Basically each time you run an
> un-paced UDP_STREAM you get some random load on the network.
> You can't tell what it was exactly, only that it was between
> the send and receive throughput.

Rick mentioned in another email that I messed up my test parameters a bit, so I will re-run the tests, incorporating his suggestions.

What I was attempting to measure was the effect of an unpaced UDP_STREAM on the latency of more moderated traffic, because I am interested in what effect an abusive guest has on other guests and how that may be mitigated.

Could you suggest some tests that you feel are more appropriate? 
[PATCH 01/18] kvm: x86: Swallow KVM_EXIT_SET_TPR
From: Jan Kiszka This exit only triggers activity in the common exit path, but we should accept it in order to be able to detect unknown exit types. Signed-off-by: Jan Kiszka --- target-i386/kvm.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index fda07d2..0aeb079 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1534,6 +1534,9 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run) DPRINTF("handle_hlt\n"); ret = kvm_handle_halt(env); break; +case KVM_EXIT_SET_TPR: +ret = 1; +break; } return ret; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/18] kvm: Improve reporting of fatal errors
From: Jan Kiszka Report KVM_EXIT_UNKNOWN, KVM_EXIT_FAIL_ENTRY, and KVM_EXIT_EXCEPTION with more details to stderr. The latter two are so far x86-only, so move them into the arch-specific handler. Integrate the Intel real mode warning on KVM_EXIT_FAIL_ENTRY that qemu-kvm carries, but actually restrict it to Intel CPUs. Moreover, always dump the CPU state in case we fail. Signed-off-by: Jan Kiszka --- kvm-all.c | 22 -- target-i386/cpu.h |2 ++ target-i386/cpuid.c |5 ++--- target-i386/kvm.c | 33 + 4 files changed, 45 insertions(+), 17 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index eaf9272..10e1194 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -817,22 +817,22 @@ static int kvm_handle_io(uint16_t port, void *data, int direction, int size, #ifdef KVM_CAP_INTERNAL_ERROR_DATA static int kvm_handle_internal_error(CPUState *env, struct kvm_run *run) { - +fprintf(stderr, "KVM internal error."); if (kvm_check_extension(kvm_state, KVM_CAP_INTERNAL_ERROR_DATA)) { int i; -fprintf(stderr, "KVM internal error. 
Suberror: %d\n", -run->internal.suberror); - +fprintf(stderr, " Suberror: %d\n", run->internal.suberror); for (i = 0; i < run->internal.ndata; ++i) { fprintf(stderr, "extra data[%d]: %"PRIx64"\n", i, (uint64_t)run->internal.data[i]); } +} else { +fprintf(stderr, "\n"); } -cpu_dump_state(env, stderr, fprintf, 0); if (run->internal.suberror == KVM_INTERNAL_ERROR_EMULATION) { fprintf(stderr, "emulation failure\n"); if (!kvm_arch_stop_on_emulation_error(env)) { +cpu_dump_state(env, stderr, fprintf, 0); return 0; } } @@ -966,15 +966,8 @@ int kvm_cpu_exec(CPUState *env) ret = 1; break; case KVM_EXIT_UNKNOWN: -DPRINTF("kvm_exit_unknown\n"); -ret = -1; -break; -case KVM_EXIT_FAIL_ENTRY: -DPRINTF("kvm_exit_fail_entry\n"); -ret = -1; -break; -case KVM_EXIT_EXCEPTION: -DPRINTF("kvm_exit_exception\n"); +fprintf(stderr, "KVM: unknown exit, hardware reason %" PRIx64 "\n", +(uint64_t)run->hw.hardware_exit_reason); ret = -1; break; #ifdef KVM_CAP_INTERNAL_ERROR_DATA @@ -1001,6 +994,7 @@ int kvm_cpu_exec(CPUState *env) } while (ret > 0); if (ret < 0) { +cpu_dump_state(env, stderr, fprintf, 0); vm_stop(0); env->exit_request = 1; } diff --git a/target-i386/cpu.h b/target-i386/cpu.h index dddcd74..a457423 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -874,6 +874,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, uint32_t *ecx, uint32_t *edx); int cpu_x86_register (CPUX86State *env, const char *cpu_model); void cpu_clear_apic_feature(CPUX86State *env); +void host_cpuid(uint32_t function, uint32_t count, +uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx); /* helper.c */ int cpu_x86_handle_mmu_fault(CPUX86State *env, target_ulong addr, diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c index 165045e..5382a28 100644 --- a/target-i386/cpuid.c +++ b/target-i386/cpuid.c @@ -103,9 +103,8 @@ typedef struct model_features_t { int check_cpuid = 0; int enforce_cpuid = 0; -static void host_cpuid(uint32_t function, uint32_t count, - uint32_t *eax, 
uint32_t *ebx, - uint32_t *ecx, uint32_t *edx) +void host_cpuid(uint32_t function, uint32_t count, +uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx) { #if defined(CONFIG_KVM) uint32_t vec[4]; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 6b4abaa..0ba13fc 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1525,8 +1525,19 @@ static int kvm_handle_halt(CPUState *env) return 1; } +static bool host_supports_vmx(void) +{ +uint32_t ecx, unused; + +host_cpuid(1, 0, &unused, &unused, &ecx, &unused); +return ecx & CPUID_EXT_VMX; +} + +#define VMX_INVALID_GUEST_STATE 0x8021 + int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run) { +uint64_t code; int ret = 0; switch (run->exit_reason) { @@ -1537,6 +1548,28 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run) case KVM_EXIT_SET_TPR: ret = 1; break; +case KVM_EXIT_FAIL_ENTRY: +code = run->fail_entry.hardware_entry_failure_reason; +fprintf(stderr, "KVM: entry failed, hardware error 0x%" PRIx64 "\n", +code); +if (host_supports_vmx() && code == VMX_INVALID_GUEST_STATE) { +fprintf(stderr, +"\nIf you're runnning a guest on an Intel machine without " +"unrestricted mode\n" +"supp
[PATCH 11/18] kvm: x86: Fix !CONFIG_KVM_PARA build
From: Jan Kiszka If we lack kvm_para.h, MSR_KVM_ASYNC_PF_EN is not defined. The change in kvm_arch_init_vcpu is just for consistency reasons. Signed-off-by: Jan Kiszka --- target-i386/kvm.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 825af42..feaf33d 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -319,7 +319,7 @@ int kvm_arch_init_vcpu(CPUState *env) uint32_t limit, i, j, cpuid_i; uint32_t unused; struct kvm_cpuid_entry2 *c; -#ifdef KVM_CPUID_SIGNATURE +#ifdef CONFIG_KVM_PARA uint32_t signature[3]; #endif @@ -855,7 +855,7 @@ static int kvm_put_msrs(CPUState *env, int level) kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); -#ifdef KVM_CAP_ASYNC_PF +#if defined(CONFIG_KVM_PARA) && defined(KVM_CAP_ASYNC_PF) kvm_msr_entry_set(&msrs[n++], MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr); #endif } @@ -1091,7 +1091,7 @@ static int kvm_get_msrs(CPUState *env) #endif msrs[n++].index = MSR_KVM_SYSTEM_TIME; msrs[n++].index = MSR_KVM_WALL_CLOCK; -#ifdef KVM_CAP_ASYNC_PF +#if defined(CONFIG_KVM_PARA) && defined(KVM_CAP_ASYNC_PF) msrs[n++].index = MSR_KVM_ASYNC_PF_EN; #endif @@ -1167,7 +1167,7 @@ static int kvm_get_msrs(CPUState *env) } #endif break; -#ifdef KVM_CAP_ASYNC_PF +#if defined(CONFIG_KVM_PARA) && defined(KVM_CAP_ASYNC_PF) case MSR_KVM_ASYNC_PF_EN: env->async_pf_en_msr = msrs[i].data; break; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 09/18] kvm: x86: Refactor msr_star/hsave_pa setup and checks
From: Jan Kiszka Simplify kvm_has_msr_star/hsave_pa to booleans and push their one-time initialization into kvm_arch_init. Also handle potential errors of that setup procedure. Signed-off-by: Jan Kiszka --- target-i386/kvm.c | 47 +++ 1 files changed, 19 insertions(+), 28 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index c4a22dd..454ddb1 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -54,6 +54,8 @@ #define BUS_MCEERR_AO 5 #endif +static bool has_msr_star; +static bool has_msr_hsave_pa; static int lm_capable_kernel; #ifdef KVM_CAP_EXT_CPUID @@ -459,13 +461,10 @@ void kvm_arch_reset_vcpu(CPUState *env) } } -int has_msr_star; -int has_msr_hsave_pa; - -static void kvm_supported_msrs(CPUState *env) +static int kvm_get_supported_msrs(KVMState *s) { static int kvm_supported_msrs; -int ret; +int ret = 0; /* first time */ if (kvm_supported_msrs == 0) { @@ -476,9 +475,9 @@ static void kvm_supported_msrs(CPUState *env) /* Obtain MSR list from KVM. These are the MSRs that we must * save/restore */ msr_list.nmsrs = 0; -ret = kvm_ioctl(env->kvm_state, KVM_GET_MSR_INDEX_LIST, &msr_list); +ret = kvm_ioctl(s, KVM_GET_MSR_INDEX_LIST, &msr_list); if (ret < 0 && ret != -E2BIG) { -return; +return ret; } /* Old kernel modules had a bug and could write beyond the provided memory. Allocate at least a safe amount of 1K. 
*/ @@ -487,17 +486,17 @@ static void kvm_supported_msrs(CPUState *env) sizeof(msr_list.indices[0]))); kvm_msr_list->nmsrs = msr_list.nmsrs; -ret = kvm_ioctl(env->kvm_state, KVM_GET_MSR_INDEX_LIST, kvm_msr_list); +ret = kvm_ioctl(s, KVM_GET_MSR_INDEX_LIST, kvm_msr_list); if (ret >= 0) { int i; for (i = 0; i < kvm_msr_list->nmsrs; i++) { if (kvm_msr_list->indices[i] == MSR_STAR) { -has_msr_star = 1; +has_msr_star = true; continue; } if (kvm_msr_list->indices[i] == MSR_VM_HSAVE_PA) { -has_msr_hsave_pa = 1; +has_msr_hsave_pa = true; continue; } } @@ -506,19 +505,7 @@ static void kvm_supported_msrs(CPUState *env) free(kvm_msr_list); } -return; -} - -static int kvm_has_msr_hsave_pa(CPUState *env) -{ -kvm_supported_msrs(env); -return has_msr_hsave_pa; -} - -static int kvm_has_msr_star(CPUState *env) -{ -kvm_supported_msrs(env); -return has_msr_star; +return ret; } static int kvm_init_identity_map_page(KVMState *s) @@ -543,9 +530,13 @@ static int kvm_init_identity_map_page(KVMState *s) int kvm_arch_init(KVMState *s, int smp_cpus) { int ret; - struct utsname utsname; +ret = kvm_get_supported_msrs(s); +if (ret < 0) { +return ret; +} + uname(&utsname); lm_capable_kernel = strcmp(utsname.machine, "x86_64") == 0; @@ -830,10 +821,10 @@ static int kvm_put_msrs(CPUState *env, int level) kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_CS, env->sysenter_cs); kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_ESP, env->sysenter_esp); kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_EIP, env->sysenter_eip); -if (kvm_has_msr_star(env)) { +if (has_msr_star) { kvm_msr_entry_set(&msrs[n++], MSR_STAR, env->star); } -if (kvm_has_msr_hsave_pa(env)) { +if (has_msr_hsave_pa) { kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave); } #ifdef TARGET_X86_64 @@ -1076,10 +1067,10 @@ static int kvm_get_msrs(CPUState *env) msrs[n++].index = MSR_IA32_SYSENTER_CS; msrs[n++].index = MSR_IA32_SYSENTER_ESP; msrs[n++].index = MSR_IA32_SYSENTER_EIP; -if (kvm_has_msr_star(env)) { +if (has_msr_star) { 
msrs[n++].index = MSR_STAR; } -if (kvm_has_msr_hsave_pa(env)) { +if (has_msr_hsave_pa) { msrs[n++].index = MSR_VM_HSAVE_PA; } msrs[n++].index = MSR_IA32_TSC; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
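The refactoring pattern in this patch — probe the kernel's MSR index list once during `kvm_arch_init` and cache the result in file-scope booleans — can be sketched roughly as follows. This is a simplified illustration, not the actual QEMU code: the `probe_supported_msrs` helper and the way the index list is passed in are stand-ins, and the MSR constants are shown only for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative MSR indices; the real values come from the kernel headers. */
#define MSR_STAR        0xc0000081u
#define MSR_VM_HSAVE_PA 0xc0010117u

static bool has_msr_star;
static bool has_msr_hsave_pa;

/* One-time scan of the kernel-reported MSR index list, in the spirit of
 * kvm_get_supported_msrs(): latch the cached flags, return 0 on success. */
static int probe_supported_msrs(const unsigned *indices, size_t nmsrs)
{
    for (size_t i = 0; i < nmsrs; i++) {
        if (indices[i] == MSR_STAR) {
            has_msr_star = true;
        } else if (indices[i] == MSR_VM_HSAVE_PA) {
            has_msr_hsave_pa = true;
        }
    }
    return 0;
}
```

Callers such as `kvm_put_msrs()` then test the plain booleans instead of re-running detection on every register sync, and any probe failure can be reported once from `kvm_arch_init`.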
[PATCH 00/18] [uq/master] Rebased patch queue, part I
In order to make progress with flushing my kvm-upstream queue without overloading the channels (38 further patches are pending), here comes part I against updated uq/master. Changes in this part compared to last postings: - Dropped "kvm: Drop return value of kvm_cpu_exec", we will actually need it later on. - Additional patch to swallow KVM_EXIT_SET_TPR (required now that we watch out for unknown exits). - Postponed MCE bits, they will follow later as part of a complete rework. CC: Glauber Costa Jan Kiszka (18): kvm: x86: Swallow KVM_EXIT_SET_TPR kvm: Stop on all fatal exit reasons kvm: Improve reporting of fatal errors x86: Optionally dump code bytes on cpu_dump_state kvm: x86: Align kvm_arch_put_registers code with comment kvm: x86: Prepare kvm_get_mp_state for in-kernel irqchip kvm: x86: Remove redundant mp_state initialization kvm: x86: Fix xcr0 reset mismerge kvm: x86: Refactor msr_star/hsave_pa setup and checks kvm: x86: Reset paravirtual MSRs kvm: x86: Fix !CONFIG_KVM_PARA build kvm: Drop smp_cpus argument from init functions kvm: Consolidate must-have capability checks kvm: x86: Rework identity map and TSS setup for larger BIOS sizes kvm: Flush coalesced mmio buffer on IO window exits kvm: Do not use qemu_fair_mutex kvm: x86: Implicitly clear nmi_injected/pending on reset kvm: x86: Only read/write MSR_KVM_ASYNC_PF_EN if supported configure| 39 ++--- cpu-all.h|2 + cpus.c |2 - kvm-all.c| 108 +++- kvm-stub.c |2 +- kvm.h| 14 +++- target-i386/cpu.h|8 ++- target-i386/cpuid.c |5 +- target-i386/helper.c | 21 + target-i386/kvm.c| 227 +++--- target-ppc/kvm.c | 10 ++- target-s390x/kvm.c |6 +- vl.c |2 +- 13 files changed, 256 insertions(+), 190 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 12/18] kvm: Drop smp_cpus argument from init functions
From: Jan Kiszka No longer used. Signed-off-by: Jan Kiszka --- kvm-all.c |4 ++-- kvm-stub.c |2 +- kvm.h |4 ++-- target-i386/kvm.c |2 +- target-ppc/kvm.c |2 +- target-s390x/kvm.c |2 +- vl.c |2 +- 7 files changed, 9 insertions(+), 9 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 41decde..8053f92 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -636,7 +636,7 @@ static CPUPhysMemoryClient kvm_cpu_phys_memory_client = { .migration_log = kvm_client_migration_log, }; -int kvm_init(int smp_cpus) +int kvm_init(void) { static const char upgrade_note[] = "Please upgrade to at least kernel 2.6.29 or recent kvm-kmod\n" @@ -749,7 +749,7 @@ int kvm_init(int smp_cpus) s->xcrs = kvm_check_extension(s, KVM_CAP_XCRS); #endif -ret = kvm_arch_init(s, smp_cpus); +ret = kvm_arch_init(s); if (ret < 0) { goto err; } diff --git a/kvm-stub.c b/kvm-stub.c index 33d4476..88682f2 100644 --- a/kvm-stub.c +++ b/kvm-stub.c @@ -58,7 +58,7 @@ int kvm_check_extension(KVMState *s, unsigned int extension) return 0; } -int kvm_init(int smp_cpus) +int kvm_init(void) { return -ENOSYS; } diff --git a/kvm.h b/kvm.h index ce08d42..a971752 100644 --- a/kvm.h +++ b/kvm.h @@ -34,7 +34,7 @@ struct kvm_run; /* external API */ -int kvm_init(int smp_cpus); +int kvm_init(void); int kvm_has_sync_mmu(void); int kvm_has_vcpu_events(void); @@ -105,7 +105,7 @@ int kvm_arch_get_registers(CPUState *env); int kvm_arch_put_registers(CPUState *env, int level); -int kvm_arch_init(KVMState *s, int smp_cpus); +int kvm_arch_init(KVMState *s); int kvm_arch_init_vcpu(CPUState *env); diff --git a/target-i386/kvm.c b/target-i386/kvm.c index feaf33d..016b67d 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -527,7 +527,7 @@ static int kvm_init_identity_map_page(KVMState *s) return 0; } -int kvm_arch_init(KVMState *s, int smp_cpus) +int kvm_arch_init(KVMState *s) { int ret; struct utsname utsname; diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 849b404..3c05630 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ 
-56,7 +56,7 @@ static void kvm_kick_env(void *env) qemu_cpu_kick(env); } -int kvm_arch_init(KVMState *s, int smp_cpus) +int kvm_arch_init(KVMState *s) { #ifdef KVM_CAP_PPC_UNSET_IRQ cap_interrupt_unset = kvm_check_extension(s, KVM_CAP_PPC_UNSET_IRQ); diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c index adf4a9e..b177e10 100644 --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -70,7 +70,7 @@ #define SCLP_CMDW_READ_SCP_INFO 0x00020001 #define SCLP_CMDW_READ_SCP_INFO_FORCED 0x00120001 -int kvm_arch_init(KVMState *s, int smp_cpus) +int kvm_arch_init(KVMState *s) { return 0; } diff --git a/vl.c b/vl.c index 0292184..33f844f 100644 --- a/vl.c +++ b/vl.c @@ -2836,7 +2836,7 @@ int main(int argc, char **argv, char **envp) } if (kvm_allowed) { -int ret = kvm_init(smp_cpus); +int ret = kvm_init(); if (ret < 0) { if (!kvm_available()) { printf("KVM not supported for this target\n"); -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 08/18] kvm: x86: Fix xcr0 reset mismerge
From: Jan Kiszka For unknown reasons, xcr0 reset ended up in kvm_arch_update_guest_debug on upstream merge. Fix this and also remove the misleading comment (1 is THE reset value). Signed-off-by: Jan Kiszka --- target-i386/kvm.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 07c75c0..c4a22dd 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -450,6 +450,7 @@ void kvm_arch_reset_vcpu(CPUState *env) env->interrupt_injected = -1; env->nmi_injected = 0; env->nmi_pending = 0; +env->xcr0 = 1; if (kvm_irqchip_in_kernel()) { env->mp_state = cpu_is_bsp(env) ? KVM_MP_STATE_RUNNABLE : KVM_MP_STATE_UNINITIALIZED; @@ -1759,8 +1760,6 @@ void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg) ((uint32_t)len_code[hw_breakpoint[n].len] << (18 + n*4)); } } -/* Legal xcr0 for loading */ -env->xcr0 = 1; } #endif /* KVM_CAP_SET_GUEST_DEBUG */ -- 1.7.1
[PATCH 14/18] kvm: x86: Rework identity map and TSS setup for larger BIOS sizes
From: Jan Kiszka In order to support loading BIOSes > 256K, reorder the code, adjusting the base if the kernel supports moving the identity map. Signed-off-by: Jan Kiszka --- target-i386/kvm.c | 63 +--- 1 files changed, 30 insertions(+), 33 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 1db8227..72f9fdf 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -493,27 +493,9 @@ static int kvm_get_supported_msrs(KVMState *s) return ret; } -static int kvm_init_identity_map_page(KVMState *s) -{ -#ifdef KVM_CAP_SET_IDENTITY_MAP_ADDR -int ret; -uint64_t addr = 0xfffbc000; - -if (!kvm_check_extension(s, KVM_CAP_SET_IDENTITY_MAP_ADDR)) { -return 0; -} - -ret = kvm_vm_ioctl(s, KVM_SET_IDENTITY_MAP_ADDR, &addr); -if (ret < 0) { -fprintf(stderr, "kvm_set_identity_map_addr: %s\n", strerror(ret)); -return ret; -} -#endif -return 0; -} - int kvm_arch_init(KVMState *s) { +uint64_t identity_base = 0xfffbc000; int ret; struct utsname utsname; @@ -525,27 +507,42 @@ int kvm_arch_init(KVMState *s) uname(&utsname); lm_capable_kernel = strcmp(utsname.machine, "x86_64") == 0; -/* create vm86 tss. KVM uses vm86 mode to emulate 16-bit code - * directly. In order to use vm86 mode, a TSS is needed. Since this - * must be part of guest physical memory, we need to allocate it. */ - -/* this address is 3 pages before the bios, and the bios should present - * as unavaible memory. FIXME, need to ensure the e820 map deals with - * this? - */ /* - * Tell fw_cfg to notify the BIOS to reserve the range. + * On older Intel CPUs, KVM uses vm86 mode to emulate 16-bit code directly. + * In order to use vm86 mode, an EPT identity map and a TSS are needed. + * Since these must be part of guest physical memory, we need to allocate + * them, both by setting their start addresses in the kernel and by + * creating a corresponding e820 entry. We need 4 pages before the BIOS. + * + * Older KVM versions may not support setting the identity map base. 
In + * that case we need to stick with the default, i.e. a 256K maximum BIOS + * size. */ -if (e820_add_entry(0xfffbc000, 0x4000, E820_RESERVED) < 0) { -perror("e820_add_entry() table is full"); -exit(1); +#ifdef KVM_CAP_SET_IDENTITY_MAP_ADDR +if (kvm_check_extension(s, KVM_CAP_SET_IDENTITY_MAP_ADDR)) { +/* Allows up to 16M BIOSes. */ +identity_base = 0xfeffc000; + +ret = kvm_vm_ioctl(s, KVM_SET_IDENTITY_MAP_ADDR, &identity_base); +if (ret < 0) { +return ret; +} } -ret = kvm_vm_ioctl(s, KVM_SET_TSS_ADDR, 0xfffbd000); +#endif +/* Set TSS base one page after EPT identity map. */ +ret = kvm_vm_ioctl(s, KVM_SET_TSS_ADDR, identity_base + 0x1000); +if (ret < 0) { +return ret; +} + +/* Tell fw_cfg to notify the BIOS to reserve the range. */ +ret = e820_add_entry(identity_base, 0x4000, E820_RESERVED); if (ret < 0) { +fprintf(stderr, "e820_add_entry() table is full\n"); return ret; } -return kvm_init_identity_map_page(s); +return 0; } static void set_v8086_seg(struct kvm_segment *lhs, const SegmentCache *rhs) -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/18] x86: Optionally dump code bytes on cpu_dump_state
From: Jan Kiszka Introduce the cpu_dump_state flag CPU_DUMP_CODE and implement it for x86. This writes out the code bytes around the current instruction pointer. Make use of this feature in KVM to help debugging fatal vm exits. Signed-off-by: Jan Kiszka --- cpu-all.h|2 ++ kvm-all.c|4 ++-- target-i386/helper.c | 21 + 3 files changed, 25 insertions(+), 2 deletions(-) diff --git a/cpu-all.h b/cpu-all.h index 4ce4e83..ffbd6a4 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -765,6 +765,8 @@ int page_check_range(target_ulong start, target_ulong len, int flags); CPUState *cpu_copy(CPUState *env); CPUState *qemu_get_cpu(int cpu); +#define CPU_DUMP_CODE 0x0001 + void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf, int flags); void cpu_dump_statistics(CPUState *env, FILE *f, fprintf_function cpu_fprintf, diff --git a/kvm-all.c b/kvm-all.c index 10e1194..41decde 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -832,7 +832,7 @@ static int kvm_handle_internal_error(CPUState *env, struct kvm_run *run) if (run->internal.suberror == KVM_INTERNAL_ERROR_EMULATION) { fprintf(stderr, "emulation failure\n"); if (!kvm_arch_stop_on_emulation_error(env)) { -cpu_dump_state(env, stderr, fprintf, 0); +cpu_dump_state(env, stderr, fprintf, CPU_DUMP_CODE); return 0; } } @@ -994,7 +994,7 @@ int kvm_cpu_exec(CPUState *env) } while (ret > 0); if (ret < 0) { -cpu_dump_state(env, stderr, fprintf, 0); +cpu_dump_state(env, stderr, fprintf, CPU_DUMP_CODE); vm_stop(0); env->exit_request = 1; } diff --git a/target-i386/helper.c b/target-i386/helper.c index 6dfa27d..1217452 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -249,6 +249,9 @@ done: cpu_fprintf(f, "\n"); } +#define DUMP_CODE_BYTES_TOTAL50 +#define DUMP_CODE_BYTES_BACKWARD 20 + void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf, int flags) { @@ -434,6 +437,24 @@ void cpu_dump_state(CPUState *env, FILE *f, fprintf_function cpu_fprintf, cpu_fprintf(f, " "); } } +if (flags & CPU_DUMP_CODE) { 
+target_ulong base = env->segs[R_CS].base + env->eip; +target_ulong offs = MIN(env->eip, DUMP_CODE_BYTES_BACKWARD); +uint8_t code; +char codestr[3]; + +cpu_fprintf(f, "Code="); +for (i = 0; i < DUMP_CODE_BYTES_TOTAL; i++) { +if (cpu_memory_rw_debug(env, base - offs + i, &code, 1, 0) == 0) { +snprintf(codestr, sizeof(codestr), "%02x", code); +} else { +snprintf(codestr, sizeof(codestr), "??"); +} +cpu_fprintf(f, "%s%s%s%s", i > 0 ? " " : "", +i == offs ? "<" : "", codestr, i == offs ? ">" : ""); +} +cpu_fprintf(f, "\n"); +} } /***/ -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
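The dump format produced by the new CPU_DUMP_CODE path can be illustrated with a reduced sketch: the constants are shortened from 50/20 to keep the example small, guest memory is faked with a plain array, and `dump_code` is an invented stand-in for the loop in `cpu_dump_state()` (the real code reads via `cpu_memory_rw_debug` and prints `??` for unreadable bytes).

```c
#include <stdio.h>
#include <string.h>

#define DUMP_CODE_BYTES_TOTAL    8  /* shortened from 50 for the sketch */
#define DUMP_CODE_BYTES_BACKWARD 4  /* shortened from 20 */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Format a hex dump around 'eip' the way the patch does: up to
 * DUMP_CODE_BYTES_BACKWARD bytes before the instruction pointer, the
 * rest after, with the byte at EIP wrapped in <>. */
static void dump_code(char *out, const unsigned char *mem, unsigned eip)
{
    unsigned offs = MIN(eip, (unsigned)DUMP_CODE_BYTES_BACKWARD);

    out[0] = '\0';
    for (unsigned i = 0; i < DUMP_CODE_BYTES_TOTAL; i++) {
        char byte[8];

        snprintf(byte, sizeof(byte), "%s%s%02x%s", i ? " " : "",
                 i == offs ? "<" : "", mem[eip - offs + i],
                 i == offs ? ">" : "");
        strcat(out, byte);
    }
}
```

Clamping the backward offset to `eip` keeps the dump from reading below address zero of the code segment, which is why `offs` rather than a fixed window marks the `<>` position.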
[PATCH 02/18] kvm: Stop on all fatal exit reasons
From: Jan Kiszka Ensure that we stop the guest whenever we face a fatal or unknown exit reason. If we stop, we also have to enforce a cpu loop exit. Signed-off-by: Jan Kiszka --- kvm-all.c | 15 +++ target-i386/kvm.c |4 target-ppc/kvm.c |4 3 files changed, 19 insertions(+), 4 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 86ddbd6..eaf9272 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -815,7 +815,7 @@ static int kvm_handle_io(uint16_t port, void *data, int direction, int size, } #ifdef KVM_CAP_INTERNAL_ERROR_DATA -static void kvm_handle_internal_error(CPUState *env, struct kvm_run *run) +static int kvm_handle_internal_error(CPUState *env, struct kvm_run *run) { if (kvm_check_extension(kvm_state, KVM_CAP_INTERNAL_ERROR_DATA)) { @@ -833,13 +833,13 @@ static void kvm_handle_internal_error(CPUState *env, struct kvm_run *run) if (run->internal.suberror == KVM_INTERNAL_ERROR_EMULATION) { fprintf(stderr, "emulation failure\n"); if (!kvm_arch_stop_on_emulation_error(env)) { -return; +return 0; } } /* FIXME: Should trigger a qmp message to let management know * something went wrong. 
*/ -vm_stop(0); +return -1; } #endif @@ -967,16 +967,19 @@ int kvm_cpu_exec(CPUState *env) break; case KVM_EXIT_UNKNOWN: DPRINTF("kvm_exit_unknown\n"); +ret = -1; break; case KVM_EXIT_FAIL_ENTRY: DPRINTF("kvm_exit_fail_entry\n"); +ret = -1; break; case KVM_EXIT_EXCEPTION: DPRINTF("kvm_exit_exception\n"); +ret = -1; break; #ifdef KVM_CAP_INTERNAL_ERROR_DATA case KVM_EXIT_INTERNAL_ERROR: -kvm_handle_internal_error(env, run); +ret = kvm_handle_internal_error(env, run); break; #endif case KVM_EXIT_DEBUG: @@ -997,6 +1000,10 @@ int kvm_cpu_exec(CPUState *env) } } while (ret > 0); +if (ret < 0) { +vm_stop(0); +env->exit_request = 1; +} if (env->exit_request) { env->exit_request = 0; env->exception_index = EXCP_INTERRUPT; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 0aeb079..6b4abaa 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1537,6 +1537,10 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run) case KVM_EXIT_SET_TPR: ret = 1; break; +default: +fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason); +ret = -1; +break; } return ret; diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c index 5caa07c..849b404 100644 --- a/target-ppc/kvm.c +++ b/target-ppc/kvm.c @@ -307,6 +307,10 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run) dprintf("handle halt\n"); ret = kvmppc_handle_halt(env); break; +default: +fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason); +ret = -1; +break; } return ret; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
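The control flow this patch introduces — handlers return a negative value for fatal or unknown exit reasons, and the outer loop reacts with `vm_stop()` plus a forced cpu loop exit — can be sketched as a plain dispatch function. The exit-reason names and values here are illustrative, not the real `KVM_EXIT_*` constants.

```c
/* Simplified dispatch in the spirit of the patch: ret > 0 means "handled,
 * keep running", ret < 0 means "fatal, stop the guest". */
enum { EXIT_IO = 0, EXIT_UNKNOWN = 1, EXIT_FAIL_ENTRY = 2 };

static int handle_exit(int exit_reason)
{
    switch (exit_reason) {
    case EXIT_IO:
        return 1;   /* handled, continue the run loop */
    case EXIT_UNKNOWN:
    case EXIT_FAIL_ENTRY:
        return -1;  /* fatal: caller stops the VM */
    default:
        return -1;  /* unknown exit reasons are treated as fatal too */
    }
}
```

The important property is that nothing falls through with `ret == 0` by accident: every reason is either explicitly handled or explicitly fatal.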
[PATCH 07/18] kvm: x86: Remove redundant mp_state initialization
From: Jan Kiszka kvm_arch_reset_vcpu initializes mp_state, and that function is invoked right after kvm_arch_init_vcpu. Signed-off-by: Jan Kiszka --- target-i386/kvm.c |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 531b69e..07c75c0 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -321,8 +321,6 @@ int kvm_arch_init_vcpu(CPUState *env) uint32_t signature[3]; #endif -env->mp_state = KVM_MP_STATE_RUNNABLE; - env->cpuid_features &= kvm_arch_get_supported_cpuid(env, 1, 0, R_EDX); i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR; -- 1.7.1
[PATCH 05/18] kvm: x86: Align kvm_arch_put_registers code with comment
From: Jan Kiszka The ordering doesn't matter in this case, but it is better to keep it consistent. Signed-off-by: Jan Kiszka --- target-i386/kvm.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 0ba13fc..9bb34ab 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1388,12 +1388,12 @@ int kvm_arch_put_registers(CPUState *env, int level) if (ret < 0) { return ret; } -/* must be last */ -ret = kvm_guest_debug_workarounds(env); +ret = kvm_put_debugregs(env); if (ret < 0) { return ret; } -ret = kvm_put_debugregs(env); +/* must be last */ +ret = kvm_guest_debug_workarounds(env); if (ret < 0) { return ret; } -- 1.7.1
[PATCH 10/18] kvm: x86: Reset paravirtual MSRs
From: Jan Kiszka Make sure to write the cleared MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, and MSR_KVM_ASYNC_PF_EN to the kernel state so that a freshly booted guest cannot be disturbed by old values. Signed-off-by: Jan Kiszka CC: Glauber Costa --- target-i386/kvm.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 454ddb1..825af42 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -845,6 +845,13 @@ static int kvm_put_msrs(CPUState *env, int level) if (smp_cpus == 1 || env->tsc != 0) { kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc); } +} +/* + * The following paravirtual MSRs have side effects on the guest or are + * too heavy for normal writeback. Limit them to reset or full state + * updates. + */ +if (level >= KVM_PUT_RESET_STATE) { kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); -- 1.7.1
[PATCH 06/18] kvm: x86: Prepare kvm_get_mp_state for in-kernel irqchip
From: Jan Kiszka This code path will not yet be taken as we still lack in-kernel irqchip support. But qemu-kvm can already make use of it and drop its own mp_state access services. Signed-off-by: Jan Kiszka --- target-i386/kvm.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 9bb34ab..531b69e 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1198,6 +1198,9 @@ static int kvm_get_mp_state(CPUState *env) return ret; } env->mp_state = mp_state.mp_state; +if (kvm_irqchip_in_kernel()) { +env->halted = (mp_state.mp_state == KVM_MP_STATE_HALTED); +} return 0; } -- 1.7.1
[PATCH 17/18] kvm: x86: Implicitly clear nmi_injected/pending on reset
From: Jan Kiszka All CPUX86State variables before CPU_COMMON are automatically cleared on reset. Reorder nmi_injected and nmi_pending to avoid having to touch them explicitly. Signed-off-by: Jan Kiszka --- target-i386/cpu.h |6 -- target-i386/kvm.c |2 -- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/target-i386/cpu.h b/target-i386/cpu.h index a457423..af701a4 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -699,6 +699,10 @@ typedef struct CPUX86State { uint32_t smbase; int old_exception; /* exception in flight */ +/* KVM states, automatically cleared on reset */ +uint8_t nmi_injected; +uint8_t nmi_pending; + CPU_COMMON /* processor features (e.g. for CPUID insn) */ @@ -726,8 +730,6 @@ typedef struct CPUX86State { int32_t exception_injected; int32_t interrupt_injected; uint8_t soft_interrupt; -uint8_t nmi_injected; -uint8_t nmi_pending; uint8_t has_error_code; uint32_t sipi_vector; uint32_t cpuid_kvm_features; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 72f9fdf..b2c5ee0 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -435,8 +435,6 @@ void kvm_arch_reset_vcpu(CPUState *env) { env->exception_injected = -1; env->interrupt_injected = -1; -env->nmi_injected = 0; -env->nmi_pending = 0; env->xcr0 = 1; if (kvm_irqchip_in_kernel()) { env->mp_state = cpu_is_bsp(env) ? KVM_MP_STATE_RUNNABLE : -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 13/18] kvm: Consolidate must-have capability checks
From: Jan Kiszka Instead of splattering the code with #ifdefs and runtime checks for capabilities we cannot work without anyway, provide central test infrastructure for verifying their availability both at build and runtime. Signed-off-by: Jan Kiszka --- configure | 39 -- kvm-all.c | 67 +--- kvm.h | 10 +++ target-i386/kvm.c | 39 ++ target-ppc/kvm.c |4 +++ target-s390x/kvm.c |4 +++ 6 files changed, 79 insertions(+), 84 deletions(-) diff --git a/configure b/configure index 9a02d1f..4673bf0 100755 --- a/configure +++ b/configure @@ -1662,18 +1662,31 @@ if test "$kvm" != "no" ; then #if !defined(KVM_API_VERSION) || KVM_API_VERSION < 12 || KVM_API_VERSION > 12 #error Invalid KVM version #endif -#if !defined(KVM_CAP_USER_MEMORY) -#error Missing KVM capability KVM_CAP_USER_MEMORY -#endif -#if !defined(KVM_CAP_SET_TSS_ADDR) -#error Missing KVM capability KVM_CAP_SET_TSS_ADDR -#endif -#if !defined(KVM_CAP_DESTROY_MEMORY_REGION_WORKS) -#error Missing KVM capability KVM_CAP_DESTROY_MEMORY_REGION_WORKS -#endif -#if !defined(KVM_CAP_USER_NMI) -#error Missing KVM capability KVM_CAP_USER_NMI +EOF +must_have_caps="KVM_CAP_USER_MEMORY \ +KVM_CAP_DESTROY_MEMORY_REGION_WORKS \ +KVM_CAP_COALESCED_MMIO \ +KVM_CAP_SYNC_MMU \ + " +if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) ; then + must_have_caps="$caps \ + KVM_CAP_SET_TSS_ADDR \ + KVM_CAP_EXT_CPUID \ + KVM_CAP_CLOCKSOURCE \ + KVM_CAP_NOP_IO_DELAY \ + KVM_CAP_PV_MMU \ + KVM_CAP_MP_STATE \ + KVM_CAP_USER_NMI \ + " +fi +for c in $must_have_caps ; do + cat >> $TMPC <> $TMPC <1) printf(", "); printf("%s",$2);}'` if test "$kvmerr" != "" ; then echo -e "${kvmerr}\n\ - NOTE: To enable KVM support, update your kernel to 2.6.29+ or install \ - recent kvm-kmod from http://sourceforge.net/projects/kvm."; +NOTE: To enable KVM support, update your kernel to 2.6.29+ or install \ +recent kvm-kmod from http://sourceforge.net/projects/kvm."; fi fi feature_not_found "kvm" diff --git a/kvm-all.c b/kvm-all.c index 8053f92..3a1f63b 100644 --- 
a/kvm-all.c +++ b/kvm-all.c @@ -63,9 +63,7 @@ struct KVMState int fd; int vmfd; int coalesced_mmio; -#ifdef KVM_CAP_COALESCED_MMIO struct kvm_coalesced_mmio_ring *coalesced_mmio_ring; -#endif int broken_set_mem_region; int migration_log; int vcpu_events; @@ -82,6 +80,12 @@ struct KVMState static KVMState *kvm_state; +static const KVMCapabilityInfo kvm_required_capabilites[] = { +KVM_CAP_INFO(USER_MEMORY), +KVM_CAP_INFO(DESTROY_MEMORY_REGION_WORKS), +KVM_CAP_LAST_INFO +}; + static KVMSlot *kvm_alloc_slot(KVMState *s) { int i; @@ -227,12 +231,10 @@ int kvm_init_vcpu(CPUState *env) goto err; } -#ifdef KVM_CAP_COALESCED_MMIO if (s->coalesced_mmio && !s->coalesced_mmio_ring) { s->coalesced_mmio_ring = (void *)env->kvm_run + s->coalesced_mmio * PAGE_SIZE; } -#endif ret = kvm_arch_init_vcpu(env); if (ret == 0) { @@ -401,7 +403,6 @@ static int kvm_physical_sync_dirty_bitmap(target_phys_addr_t start_addr, int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size) { int ret = -ENOSYS; -#ifdef KVM_CAP_COALESCED_MMIO KVMState *s = kvm_state; if (s->coalesced_mmio) { @@ -412,7 +413,6 @@ int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size) ret = kvm_vm_ioctl(s, KVM_REGISTER_COALESCED_MMIO, &zone); } -#endif return ret; } @@ -420,7 +420,6 @@ int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size) int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size) { int ret = -ENOSYS; -#ifdef KVM_CAP_COALESCED_MMIO KVMState *s = kvm_state; if (s->coalesced_mmio) { @@ -431,7 +430,6 @@ int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size) ret = kvm_vm_ioctl(s, KVM_UNREGISTER_COALESCED_MMIO, &zone); } -#endif return ret; } @@ -481,6 +479,18 @@ static int kvm_check_many_ioeventfds(void) #endif } +static const KVMCapabilityInfo * +kvm_check_extension_list(KVMState *s, const KVMCapabilityInfo *list) +{ +while (list->name) { +if (!kvm_check_extension(s, list->value)) { +return list; +} +list++; +} +return NULL; +} 
+ static void kvm_set_phys_mem(target_phys_addr_t start_addr, ram_addr_t size, ram_addr_t phys_offset) { @@ -642,6 +652,7 @@ int kvm_init(void) "Please upgrade to at least kernel 2.6.29 or recent kvm-kmod\n"
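The table-driven capability check added by this patch can be sketched in isolation. `CapInfo` mirrors the spirit of the `KVMCapabilityInfo` table (a NULL-named sentinel terminates the list), and `fake_kernel_has` is an invented stand-in for `kvm_check_extension()` so the sketch is self-contained.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;
    int value;
} CapInfo;

/* Stand-in for kvm_check_extension(): pretend capability 42 is missing. */
static bool fake_kernel_has(int cap)
{
    return cap != 42;
}

/* Walk the sentinel-terminated table and return the first capability the
 * kernel lacks, or NULL if all required capabilities are present. */
static const CapInfo *check_extension_list(const CapInfo *list)
{
    while (list->name) {
        if (!fake_kernel_has(list->value)) {
            return list;
        }
        list++;
    }
    return NULL;
}
```

Returning the failing entry (rather than a bare error code) lets the caller print the capability's name in the error message, which is what replaces the scattered `#ifdef`/`#error` checks in configure and the runtime checks in kvm-all.c.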
[PATCH 18/18] kvm: x86: Only read/write MSR_KVM_ASYNC_PF_EN if supported
From: Jan Kiszka If the kernel does not support KVM_CAP_ASYNC_PF, it also does not know about the related MSR. So skip it during state synchronization in that case. Fixes annoying kernel warnings. Signed-off-by: Jan Kiszka --- target-i386/kvm.c | 13 +++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index b2c5ee0..8e8880a 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -63,6 +63,9 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = { static bool has_msr_star; static bool has_msr_hsave_pa; +#if defined(CONFIG_KVM_PARA) && defined(KVM_CAP_ASYNC_PF) +static bool has_msr_async_pf_en; +#endif static int lm_capable_kernel; static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max) @@ -164,6 +167,7 @@ static int get_para_features(CPUState *env) features |= (1 << para_features[i].feature); } } +has_msr_async_pf_en = features & (1 << KVM_FEATURE_ASYNC_PF); return features; } #endif @@ -828,7 +832,10 @@ static int kvm_put_msrs(CPUState *env, int level) env->system_time_msr); kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); #if defined(CONFIG_KVM_PARA) && defined(KVM_CAP_ASYNC_PF) -kvm_msr_entry_set(&msrs[n++], MSR_KVM_ASYNC_PF_EN, env->async_pf_en_msr); +if (has_msr_async_pf_en) { +kvm_msr_entry_set(&msrs[n++], MSR_KVM_ASYNC_PF_EN, + env->async_pf_en_msr); +} #endif } #ifdef KVM_CAP_MCE @@ -1064,7 +1071,9 @@ static int kvm_get_msrs(CPUState *env) msrs[n++].index = MSR_KVM_SYSTEM_TIME; msrs[n++].index = MSR_KVM_WALL_CLOCK; #if defined(CONFIG_KVM_PARA) && defined(KVM_CAP_ASYNC_PF) -msrs[n++].index = MSR_KVM_ASYNC_PF_EN; +if (has_msr_async_pf_en) { +msrs[n++].index = MSR_KVM_ASYNC_PF_EN; +} #endif #ifdef KVM_CAP_MCE -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 15/18] kvm: Flush coalesced mmio buffer on IO window exits
From: Jan Kiszka We must flush pending mmio writes if we leave kvm_cpu_exec for an IO window. Otherwise we risk losing those requests when migrating to a different host during that window. Signed-off-by: Jan Kiszka --- kvm-all.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kvm-all.c b/kvm-all.c index 3a1f63b..9976762 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -918,6 +918,8 @@ int kvm_cpu_exec(CPUState *env) cpu_single_env = env; kvm_arch_post_run(env, run); +kvm_flush_coalesced_mmio_buffer(); + if (ret == -EINTR || ret == -EAGAIN) { cpu_exit(env); DPRINTF("io window exit\n"); @@ -930,8 +932,6 @@ int kvm_cpu_exec(CPUState *env) abort(); } -kvm_flush_coalesced_mmio_buffer(); - ret = 0; /* exit loop */ switch (run->exit_reason) { case KVM_EXIT_IO: -- 1.7.1
[PATCH 16/18] kvm: Do not use qemu_fair_mutex
From: Jan Kiszka The imbalance in the hold time of qemu_global_mutex only exists in TCG mode. In contrast to TCG VCPUs, KVM drops the global lock during guest execution. We already avoid touching the fairness lock from the IO-thread in KVM mode, so also stop using it from the VCPU threads. Signed-off-by: Jan Kiszka --- cpus.c |2 -- 1 files changed, 0 insertions(+), 2 deletions(-) diff --git a/cpus.c b/cpus.c index 0309189..4c9928e 100644 --- a/cpus.c +++ b/cpus.c @@ -735,9 +735,7 @@ static sigset_t block_io_signals(void) void qemu_mutex_lock_iothread(void) { if (kvm_enabled()) { -qemu_mutex_lock(&qemu_fair_mutex); qemu_mutex_lock(&qemu_global_mutex); -qemu_mutex_unlock(&qemu_fair_mutex); } else { qemu_mutex_lock(&qemu_fair_mutex); if (qemu_mutex_trylock(&qemu_global_mutex)) { -- 1.7.1
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On Fri, Jan 21, 2011 at 6:17 PM, Jan Kiszka wrote: > On 2011-01-21 19:04, Blue Swirl wrote: >> On Fri, Jan 21, 2011 at 5:21 PM, Jan Kiszka wrote: >>> On 2011-01-21 17:37, Blue Swirl wrote: On Fri, Jan 21, 2011 at 8:46 AM, Gerd Hoffmann wrote: > Hi, > >> By the way, we don't have a QEMUState but instead use globals. > > /me wants to underline this. > > IMO it is absolutely pointless to worry about ways to pass around > kvm_state. > There never ever will be a serious need for that. > > We can stick with the current model of keeping global state in global > variables. And just do the same with kvm_state. > > Or we can move to have all state in a QEMUState struct which we'll pass > around basically everywhere. Then we can simply embed or reference > kvm_state there. > > I'd tend to stick with the global variables as I don't see the point in > having a QEMUstate. I doubt we'll ever see two virtual machines driven > by a > single qemu process. YMMV. Global variables are signs of a poor design. >>> >>> s/are/can be/. >>> QEMUState would not help that, instead more specific structures should be designed, much like what I've proposed for KVMState. Some of these new structures should be even passed around when it makes sense. But I'd not start kvm_state redesign around global variables or QEMUState. >>> >>> We do not need to move individual fields yet, but we need to define >>> classes of fields and strategies how to deal with them long-term. Then >>> we can move forward, and that already in the right direction. >> >> Excellent plan. >> >>> Obvious classes are >>> - static host capabilities and means for the KVM core to query them >> >> OK. There could be other host capabilities here in the future too, >> like Xen. I don't think there are any Xen capabilities ATM though but >> IIRC some recently sent patches had something like those. >> >>> - per-VM fields >> >> What is per-VM which is not machine or CPU architecture specific? 
> > I think it would suffice for a first step to consider all per-VM fields > as independent of CPU architecture or machine type. I'm afraid that would not be progress. >>> - fields related to memory management >> >> OK. >> >> I'd add fourth possible class: >> - device, CPU and machine configuration, like nographic, >> win2k_install_hack, no_hpet, smp_cpus etc. Maybe also >> irqchip_in_kernel could fit here, though it obviously depends on a >> host capability too. > > I would count everything that cannot be assigned to a concrete device > upfront to the dynamic state of a machine, thus class 2. The point is, > (potentially) every device of that machine requires access to it, just > like (indirectly, via the KVM core services) to some KVM VM state bits. The machine class should not be a catch-all, it would be like QEMUState or KVMState then. Perhaps each field or variable should be listed and given more thought. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-21 19:04, Blue Swirl wrote:
> On Fri, Jan 21, 2011 at 5:21 PM, Jan Kiszka wrote:
>> On 2011-01-21 17:37, Blue Swirl wrote:
>>> On Fri, Jan 21, 2011 at 8:46 AM, Gerd Hoffmann wrote:
>>>> Hi,
>>>>
>>>>> By the way, we don't have a QEMUState but instead use globals.
>>>>
>>>> /me wants to underline this.
>>>>
>>>> IMO it is absolutely pointless to worry about ways to pass around
>>>> kvm_state. There never ever will be a serious need for that.
>>>>
>>>> We can stick with the current model of keeping global state in global
>>>> variables. And just do the same with kvm_state.
>>>>
>>>> Or we can move to have all state in a QEMUState struct which we'll pass
>>>> around basically everywhere. Then we can simply embed or reference
>>>> kvm_state there.
>>>>
>>>> I'd tend to stick with the global variables as I don't see the point in
>>>> having a QEMUstate. I doubt we'll ever see two virtual machines driven
>>>> by a single qemu process. YMMV.
>>>
>>> Global variables are signs of a poor design.
>>
>> s/are/can be/.
>>
>>> QEMUState would not help that, instead more specific structures should
>>> be designed, much like what I've proposed for KVMState. Some of these
>>> new structures should be even passed around when it makes sense.
>>>
>>> But I'd not start kvm_state redesign around global variables or QEMUState.
>>
>> We do not need to move individual fields yet, but we need to define
>> classes of fields and strategies how to deal with them long-term. Then
>> we can move forward, and that already in the right direction.
>
> Excellent plan.
>
>> Obvious classes are
>> - static host capabilities and means for the KVM core to query them
>
> OK. There could be other host capabilities here in the future too,
> like Xen. I don't think there are any Xen capabilities ATM though but
> IIRC some recently sent patches had something like those.
>
>> - per-VM fields
>
> What is per-VM which is not machine or CPU architecture specific?

I think it would suffice for a first step to consider all per-VM fields
as independent of CPU architecture or machine type.
>> - fields related to memory management
>
> OK.
>
> I'd add fourth possible class:
> - device, CPU and machine configuration, like nographic,
>   win2k_install_hack, no_hpet, smp_cpus etc. Maybe also
>   irqchip_in_kernel could fit here, though it obviously depends on a
>   host capability too.

I would count everything that cannot be assigned to a concrete device
upfront to the dynamic state of a machine, thus class 2. The point is,
(potentially) every device of that machine requires access to it, just
like (indirectly, via the KVM core services) to some KVM VM state bits.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: Flow Control and Port Mirroring Revisited
>> I have constructed a test where I run an un-paced UDP_STREAM test in
>> one guest and a paced omni rr test in another guest at the same time.
>
> Hmm, what is this supposed to measure? Basically each time you run an
> un-paced UDP_STREAM you get some random load on the network.

Well, if the netperf is (effectively) pinned to a given CPU, presumably
it would be trying to generate UDP datagrams at the same rate each time.
Indeed though, no guarantee that rate would consistently get through each
time. But then, that is where one can use the confidence intervals options
to get an idea by how much the rate varied.

rick jones
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On Fri, Jan 21, 2011 at 5:21 PM, Jan Kiszka wrote:
> On 2011-01-21 17:37, Blue Swirl wrote:
>> On Fri, Jan 21, 2011 at 8:46 AM, Gerd Hoffmann wrote:
>>> Hi,
>>>
>>>> By the way, we don't have a QEMUState but instead use globals.
>>>
>>> /me wants to underline this.
>>>
>>> IMO it is absolutely pointless to worry about ways to pass around
>>> kvm_state. There never ever will be a serious need for that.
>>>
>>> We can stick with the current model of keeping global state in global
>>> variables. And just do the same with kvm_state.
>>>
>>> Or we can move to have all state in a QEMUState struct which we'll pass
>>> around basically everywhere. Then we can simply embed or reference
>>> kvm_state there.
>>>
>>> I'd tend to stick with the global variables as I don't see the point in
>>> having a QEMUstate. I doubt we'll ever see two virtual machines driven
>>> by a single qemu process. YMMV.
>>
>> Global variables are signs of a poor design.
>
> s/are/can be/.
>
>> QEMUState would not help that, instead more specific structures should
>> be designed, much like what I've proposed for KVMState. Some of these
>> new structures should be even passed around when it makes sense.
>>
>> But I'd not start kvm_state redesign around global variables or QEMUState.
>
> We do not need to move individual fields yet, but we need to define
> classes of fields and strategies how to deal with them long-term. Then
> we can move forward, and that already in the right direction.

Excellent plan.

> Obvious classes are
> - static host capabilities and means for the KVM core to query them

OK. There could be other host capabilities here in the future too,
like Xen. I don't think there are any Xen capabilities ATM though but
IIRC some recently sent patches had something like those.

> - per-VM fields

What is per-VM which is not machine or CPU architecture specific?

> - fields related to memory management

OK.
I'd add fourth possible class:
- device, CPU and machine configuration, like nographic,
  win2k_install_hack, no_hpet, smp_cpus etc. Maybe also
  irqchip_in_kernel could fit here, though it obviously depends on a
  host capability too.
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-21 17:37, Blue Swirl wrote:
> On Fri, Jan 21, 2011 at 8:46 AM, Gerd Hoffmann wrote:
>> Hi,
>>
>>> By the way, we don't have a QEMUState but instead use globals.
>>
>> /me wants to underline this.
>>
>> IMO it is absolutely pointless to worry about ways to pass around
>> kvm_state. There never ever will be a serious need for that.
>>
>> We can stick with the current model of keeping global state in global
>> variables. And just do the same with kvm_state.
>>
>> Or we can move to have all state in a QEMUState struct which we'll pass
>> around basically everywhere. Then we can simply embed or reference
>> kvm_state there.
>>
>> I'd tend to stick with the global variables as I don't see the point in
>> having a QEMUstate. I doubt we'll ever see two virtual machines driven
>> by a single qemu process. YMMV.
>
> Global variables are signs of a poor design.

s/are/can be/.

> QEMUState would not help
> that, instead more specific structures should be designed, much like
> what I've proposed for KVMState. Some of these new structures should
> be even passed around when it makes sense.
>
> But I'd not start kvm_state redesign around global variables or QEMUState.

We do not need to move individual fields yet, but we need to define
classes of fields and strategies how to deal with them long-term. Then
we can move forward, and that already in the right direction.

Obvious classes are
- static host capabilities and means for the KVM core to query them
- per-VM fields
- fields related to memory management

And we now need at least a plan for the second class to proceed with
the actual job.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
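To make the proposed split concrete, here is a rough model of the three field classes as separate structures. This is purely illustrative: none of these type or field names exist in QEMU, and the handful of capability fields are invented examples; the sketch only shows "host caps / per-VM / memory" kept apart rather than lumped into one global KVMState.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class KVMHostCaps:
    """Class 1: static host capabilities, queried once and then read-only."""
    max_vcpus: int
    has_irqfd: bool
    has_pit_state2: bool

@dataclass
class KVMVMState:
    """Class 2: per-VM fields (VM fd, in-kernel irqchip flag, ...)."""
    vm_fd: int = -1
    irqchip_in_kernel: bool = False

@dataclass
class KVMMemoryState:
    """Class 3: fields related to memory management (slot bookkeeping)."""
    slots: dict = field(default_factory=dict)  # slot id -> (gpa, size)

# Devices would receive only the structure they actually need, instead
# of reaching into one global blob:
caps = KVMHostCaps(max_vcpus=64, has_irqfd=True, has_pit_state2=True)
vm = KVMVMState(vm_fd=7, irqchip_in_kernel=True)
mem = KVMMemoryState()
mem.slots[0] = (0x0, 512 << 20)  # one 512 MiB slot at GPA 0
```

The frozen host-caps structure mirrors the "static, query-only" nature of class 1, while classes 2 and 3 stay mutable.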
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On Fri, Jan 21, 2011 at 8:46 AM, Gerd Hoffmann wrote:
> Hi,
>
>> By the way, we don't have a QEMUState but instead use globals.
>
> /me wants to underline this.
>
> IMO it is absolutely pointless to worry about ways to pass around
> kvm_state. There never ever will be a serious need for that.
>
> We can stick with the current model of keeping global state in global
> variables. And just do the same with kvm_state.
>
> Or we can move to have all state in a QEMUState struct which we'll pass
> around basically everywhere. Then we can simply embed or reference
> kvm_state there.
>
> I'd tend to stick with the global variables as I don't see the point in
> having a QEMUstate. I doubt we'll ever see two virtual machines driven
> by a single qemu process. YMMV.

Global variables are signs of a poor design. QEMUState would not help
that, instead more specific structures should be designed, much like
what I've proposed for KVMState. Some of these new structures should
be even passed around when it makes sense.

But I'd not start kvm_state redesign around global variables or QEMUState.
Re: [REPOST] [PATCH 3/3] Provide control over unmapped pages (v3)
On Fri, 21 Jan 2011, Balbir Singh wrote:

> * Christoph Lameter [2011-01-20 09:00:09]:
>
>> On Thu, 20 Jan 2011, Balbir Singh wrote:
>>
>>> + unmapped_page_control
>>> + [KNL] Available if CONFIG_UNMAPPED_PAGECACHE_CONTROL
>>> + is enabled. It controls the amount of unmapped memory
>>> + that is present in the system. This boot option plus
>>> + vm.min_unmapped_ratio (sysctl) provide granular control
>>
>> min_unmapped_ratio is there to guarantee that zone reclaim does not
>> reclaim all unmapped pages.
>>
>> What you want here is a max_unmapped_ratio.
>
> I thought about that, the logic for reusing min_unmapped_ratio was to
> keep a limit beyond which unmapped page cache shrinking should stop.

Right. That is the role of it. It's a minimum to leave. You want a
maximum size of the page cache.

> I think you are suggesting max_unmapped_ratio as the point at which
> shrinking should begin, right?

The role of min_unmapped_ratio is to never reclaim more pagecache if we
reach that ratio, even if we have to go off node for an allocation.
AFAICT what you propose is a maximum size of the page cache. If the
number of page cache pages goes beyond that then you trim the page cache
in background reclaim.

>>> + reclaim_unmapped_pages(priority, zone, &sc);
>>> +
>>> if (!zone_watermark_ok_safe(zone, order,
>>
>> Okay that means background reclaim does it. If so then we also want
>> zone reclaim to be able to work in the background I think.
>
> Anything specific you had in mind, works for me in testing, but is
> there anything specific that stands out in your mind that needs to be
> done?

Hmmm. So this would also work in a NUMA configuration, right. Limiting
the size of the page cache would avoid zone reclaim through these
limits. Page cache size would be limited by the max_unmapped_ratio.
zone_reclaim only would come into play if other allocations make the
memory on the node so tight that we would have to evict more page cache
pages in direct reclaim. Then zone_reclaim could go down to shrink the
page cache size to min_unmapped_ratio.
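The min/max distinction being discussed can be put in a little model. This is illustrative pseudo-logic, not kernel code: the ratios are expressed as percentages of a zone's pages, min_unmapped_ratio is the floor that reclaim never goes below, and the proposed max_unmapped_ratio is the cap above which background reclaim starts trimming.

```python
def unmapped_pages_to_reclaim(unmapped, zone_pages,
                              min_unmapped_ratio=1, max_unmapped_ratio=16):
    """Return how many unmapped page-cache pages to trim (illustrative).

    min_unmapped_ratio: percent of the zone that unmapped page cache may
    always keep -- reclaim never shrinks below this floor.
    max_unmapped_ratio: percent above which background reclaim kicks in
    and trims the cache back down to the cap.
    """
    min_pages = zone_pages * min_unmapped_ratio // 100
    max_pages = zone_pages * max_unmapped_ratio // 100
    if unmapped <= max_pages:
        return 0  # under the cap: background reclaim has nothing to do
    # Trim back to the cap, but never below the guaranteed minimum.
    return unmapped - max(max_pages, min_pages)

# A 100000-page zone holding 20000 unmapped pages with a 16% cap would
# have 4000 pages trimmed by background reclaim.
pages = unmapped_pages_to_reclaim(20000, 100000)
```

Direct zone_reclaim, as the message above notes, could still go further, down to the min_pages floor, when node memory is tight.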
Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock
On 01/21/2011 09:02 AM, Srivatsa Vaddagiri wrote:
> On Thu, Jan 20, 2011 at 09:56:27AM -0800, Jeremy Fitzhardinge wrote:
>>> The key here is not to sleep when waiting for locks (as implemented by
>>> current patch-series, which can put other VMs at an advantage by giving
>>> them more time than they are entitled to)
>>
>> Why? If a VCPU can't make progress because its waiting for some
>> resource, then why not schedule something else instead?
>
> In the process, "something else" can get more share of cpu resource than
> its entitled to and that's where I was bit concerned. I guess one could
> employ hard-limits to cap "something else's" bandwidth where it is of
> real concern (like clouds).

I'd like to think I fixed those things in my yield_task_fair +
yield_to + kvm_vcpu_on_spin patch series from yesterday.

https://lkml.org/lkml/2011/1/20/403

--
All rights reversed
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On 01/21/2011 03:55 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
>> On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
>>> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
>>>> On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
>>>>> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
>>>>>> When MSI is off, each interrupt needs to be bounced through the io
>>>>>> thread when it's set/cleared, so vhost-net causes more context
>>>>>> switches and higher CPU utilization than userspace virtio which
>>>>>> handles networking in the same thread.
>>>>>>
>>>>>> We'll need to fix this by adding level irq support in kvm irqfd,
>>>>>> for now disable vhost-net in these configurations.
>>>>>>
>>>>>> Signed-off-by: Michael S. Tsirkin
>>>>>
>>>>> I actually think this should be a terminal error. The user asks for
>>>>> vhost-net, if we cannot enable it, we should exit.
>>>>>
>>>>> Or we should warn the user that they should expect bad performance.
>>>>> Silently doing something that the user has explicitly asked us not
>>>>> to do is not a good behavior.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Anthony Liguori
>>>>
>>>> The issue is that user has no control of the guest, and can not know
>>>> whether the guest enables MSI. So what you ask for will just make some
>>>> guests fail, and others fail sometimes. The user also has no way to
>>>> know that version X of kvm does not expose a way to inject level
>>>> interrupts with irqfd.
>>>>
>>>> We could have *another* flag that says "use vhost where it helps" but
>>>> then I think this is what everyone wants to do, anyway, and libvirt
>>>> already sets vhost=on so I prefer redefining the meaning of an
>>>> existing flag.
>>>
>>> In the very least, there needs to be a vhost=force.
>>>
>>> Having some sort of friendly default policy is fine but we need to
>>> provide a mechanism for a user to have the final say. If you want to
>>> redefine vhost=on to really mean, use the friendly default, that's fine
>>> by me, but only if the vhost=force option exists.
>>>
>>> I actually would think libvirt would want to use vhost=force.
>>> Debugging with vhost=on is going to be a royal pain in the ass if a
>>> user reports bad performance. Given the libvirt XML, you can't actually
>>> tell from the guest and the XML whether or not vhost was actually in
>>> use or not.
>>
>> If we add a force option, let's please distinguish hotplug from VM
>> creation time. The latter can abort. Hotplug should print an error and
>> fail the initfn.
>
> It can't abort at init - MSI is disabled at init, it needs to be enabled
> by the guest later. And aborting the guest in the middle of the run is a
> very bad idea.

What vhostforce=true will do is force vhost backend to be used even if
it is slower:

vhost=on,vhostforce=false   use vhost if we think it will improve performance
vhost=on,vhostforce=true    always use vhost
vhost=off,vhostforce=*      do not use vhost

Regards,

Anthony Liguori

> Thanks,
> Alex
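The vhost/vhostforce semantics proposed above can be captured as a tiny decision function. This is a model of the proposal for clarity, not QEMU code; the `guest_uses_msix` input stands in for "vhost will actually help here", since without MSI-X each interrupt bounces through the io thread and vhost is slower.

```python
def use_vhost(vhost_on: bool, vhost_force: bool, guest_uses_msix: bool) -> bool:
    """Model of the proposed vhost/vhostforce decision table."""
    if not vhost_on:
        return False            # vhost=off: never use vhost
    if vhost_force:
        return True             # vhostforce=true: always use vhost, even if slower
    # vhost=on,vhostforce=false: use vhost only where it helps, i.e. when
    # the guest has MSI-X enabled (level irqs can't go through irqfd yet)
    return guest_uses_msix
```

Note how this encodes the debugging concern raised earlier: with `vhost_on=True, vhost_force=False`, whether vhost is actually in use depends on guest behavior, not just on the configuration.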
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On 01/21/2011 03:48 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:23:36PM -0600, Anthony Liguori wrote:
>> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
>>> On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
>>>> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
>>>>> When MSI is off, each interrupt needs to be bounced through the io
>>>>> thread when it's set/cleared, so vhost-net causes more context
>>>>> switches and higher CPU utilization than userspace virtio which
>>>>> handles networking in the same thread.
>>>>>
>>>>> We'll need to fix this by adding level irq support in kvm irqfd,
>>>>> for now disable vhost-net in these configurations.
>>>>>
>>>>> Signed-off-by: Michael S. Tsirkin
>>>>
>>>> I actually think this should be a terminal error. The user asks for
>>>> vhost-net, if we cannot enable it, we should exit.
>>>>
>>>> Or we should warn the user that they should expect bad performance.
>>>> Silently doing something that the user has explicitly asked us not
>>>> to do is not a good behavior.
>>>>
>>>> Regards,
>>>>
>>>> Anthony Liguori
>>>
>>> The issue is that user has no control of the guest, and can not know
>>> whether the guest enables MSI. So what you ask for will just make some
>>> guests fail, and others fail sometimes. The user also has no way to
>>> know that version X of kvm does not expose a way to inject level
>>> interrupts with irqfd.
>>>
>>> We could have *another* flag that says "use vhost where it helps" but
>>> then I think this is what everyone wants to do, anyway, and libvirt
>>> already sets vhost=on so I prefer redefining the meaning of an
>>> existing flag.
>>
>> In the very least, there needs to be a vhost=force.
>>
>> Having some sort of friendly default policy is fine but we need to
>> provide a mechanism for a user to have the final say. If you want to
>> redefine vhost=on to really mean, use the friendly default, that's fine
>> by me, but only if the vhost=force option exists.
>
> OK, I will add that, probably as a separate flag as vhost is a boolean.
> This will get worse performance but it will be what the user asked for.
>
>> I actually would think libvirt would want to use vhost=force.
>> Debugging with vhost=on is going to be a royal pain in the ass if a
>> user reports bad performance. Given the libvirt XML, you can't actually
>> tell from the guest and the XML whether or not vhost was actually in
>> use or not.
>
> Yes you can: check MSI enabled in the guest, if it is - check vhost
> enabled in the XML. Not that bad at all, is it?

Until you automatically detect level triggered interrupt support for
irqfd. This means it's also dependent on a kernel feature too. Is there
any way to tell in QEMU that vhost was silently disabled?

Regards,

Anthony Liguori

>> Regards,
>>
>> Anthony Liguori
>
> We get worse performance without MSI anyway, how is this different?
> Maybe this is best handled by a documentation update? We always said:
> "use vhost=on to enable experimental in kernel accelerator\n"
> note 'enable' not 'require'. This is similar to how we specify nvectors:
> you can not make guest use the feature.
>
> How about this:
>
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 898561d..3c937c1 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
>      "use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
>      "use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
>      "use vhost=on to enable experimental in kernel accelerator\n"
> +    "(note: vhost=on has no effect unless guest uses MSI-X)\n"
>      "use 'vhostfd=h' to connect to an already opened vhost net device\n"
>  #endif
>      "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock
On Thu, Jan 20, 2011 at 09:56:27AM -0800, Jeremy Fitzhardinge wrote:
> > The key here is not to sleep when waiting for locks (as implemented by
> > current patch-series, which can put other VMs at an advantage by giving
> > them more time than they are entitled to)
>
> Why? If a VCPU can't make progress because its waiting for some
> resource, then why not schedule something else instead?

In the process, "something else" can get more share of cpu resource than
it's entitled to and that's where I was a bit concerned. I guess one
could employ hard-limits to cap "something else's" bandwidth where it is
of real concern (like clouds).

> Presumably when the VCPU does become runnable, the scheduler will credit
> its previous blocked state and let it run in preference to something else.

which may not be sufficient for it to gain back bandwidth lost while
blocked (speaking of the mainline scheduler at least).

> > Is there a way we can dynamically expand the size of lock only upon
> > contention to include additional information like owning vcpu? Have the
> > lock point to a per-cpu area upon contention where additional details
> > can be stored perhaps?
>
> As soon as you add a pointer to the lock, you're increasing its size.

I didn't really mean to expand size statically. Rather have some bits of
the lock word store a pointer to a per-cpu area when there is contention
(somewhat similar to how bits of rt_mutex.owner are used). I haven't
thought through this in detail to see if that is possible though.

- vatsa
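The "steal bits from the lock word on contention" idea can be sketched roughly as follows. Everything here is invented for illustration: the bit layout, the choice to store a vcpu id rather than a real pointer, and the assumption that the ticket counters leave the upper bits free, which is exactly the open question in the message above.

```python
# Hypothetical 64-bit lock-word layout (not any real kernel layout):
CONTENDED = 1 << 63              # set once a waiter shows up
VCPU_SHIFT = 48                  # owning-vcpu hint lives in bits 48-62
VCPU_MASK = 0x7FFF

def mark_contended(lock_word: int, owner_vcpu: int) -> int:
    """On first contention, record the owning vcpu id in the upper bits."""
    assert lock_word < (1 << VCPU_SHIFT)  # ticket counters must fit below
    return lock_word | CONTENDED | ((owner_vcpu & VCPU_MASK) << VCPU_SHIFT)

def owner_hint(lock_word: int):
    """Return the owning-vcpu hint, or None if the lock is uncontended."""
    if not (lock_word & CONTENDED):
        return None
    return (lock_word >> VCPU_SHIFT) & VCPU_MASK

# Fast path stays a plain ticket word; the hint appears only on contention.
w = mark_contended(0x00010001, owner_vcpu=5)  # head/tail tickets + hint
```

A hypervisor-side yield-to (or PI) mechanism could then consult the hint to boost the lock holder, while uncontended acquire/release never pays for the extra state.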
Re: EPT: Misconfiguration
On Thu, Jan 20, 2011 at 12:48:00PM +0100, Ruben Kerkhof wrote:
> I'm suddenly getting lots of the following errors on a server running
> 2.36.7, but I have no idea what it means:
>
> 2011-01-20T12:41:18.358603+01:00 phy005 kernel: EPT: Misconfiguration.
> 2011-01-20T12:41:18.358621+01:00 phy005 kernel: EPT: GPA: 0x3dbff6b0
> 2011-01-20T12:41:18.358624+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x50743e007 level 4
> 2011-01-20T12:41:18.358627+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x523de2007 level 3
> 2011-01-20T12:41:18.358629+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x62336f007 level 2
> 2011-01-20T12:41:18.360109+01:00 phy005 kernel: ept_misconfig_inspect_spte: spte 0x1603a0730500d277 level 1
> 2011-01-20T12:41:18.360137+01:00 phy005 kernel: ept_misconfig_inspect_spte: rsvd_bits = 0x3a000
> 2011-01-20T12:41:18.360151+01:00 phy005 kernel: [ cut here ]

A shadow pagetable entry in memory has bits 45-49 set, which is not
allowed. It's probably bad memory if these errors were not present
before with the same workload and host software. Would be useful to see
what memtest86 says.
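The diagnosis can be checked against the logged values: bits of an EPT entry above the host's supported physical-address width must be zero, and extracting bits 45-49 of the level-1 spte from the log reproduces the pattern in the rsvd_bits line. This is a quick arithmetic check, not the kernel's actual inspection code.

```python
spte = 0x1603A0730500D277  # level-1 SPTE from the log above

# Pull out the offending reserved-bit region (bits 45-49 of the entry):
bad = (spte >> 45) & 0x1F
# bad == 0b11101, i.e. bits 45, 47, 48 and 49 of the entry are set,
# which is what makes this an EPT misconfiguration.

# The same pattern, viewed as an offset inside the high 32-bit word of
# the entry (bits 13-17 there), matches the logged "rsvd_bits = 0x3a000":
rsvd_bits = (spte >> 32) & 0x3E000
```

Note how the healthy level 2-4 sptes in the log (e.g. `0x62336f007`) have nothing set in that region, which is consistent with a single flipped-bits corruption of the leaf entry, hence the bad-memory suspicion.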
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Fri, Jan 21, 2011 at 06:19:13AM -0700, Alex Williamson wrote:
> On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote:
> > On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> > > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > > > >>> When MSI is off, each interrupt needs to be bounced through the io
> > > > >>> thread when it's set/cleared, so vhost-net causes more context
> > > > >>> switches and higher CPU utilization than userspace virtio which
> > > > >>> handles networking in the same thread.
> > > > >>>
> > > > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > > > >>> for now disable vhost-net in these configurations.
> > > > >>>
> > > > >>> Signed-off-by: Michael S. Tsirkin
> > > > >>
> > > > >> I actually think this should be a terminal error. The user asks for
> > > > >> vhost-net, if we cannot enable it, we should exit.
> > > > >>
> > > > >> Or we should warn the user that they should expect bad performance.
> > > > >> Silently doing something that the user has explicitly asked us not
> > > > >> to do is not a good behavior.
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Anthony Liguori
> > > > >
> > > > > The issue is that user has no control of the guest, and can not know
> > > > > whether the guest enables MSI. So what you ask for will just make
> > > > > some guests fail, and others fail sometimes.
> > > > > The user also has no way to know that version X of kvm does not
> > > > > expose a way to inject level interrupts with irqfd.
> > > > >
> > > > > We could have *another* flag that says "use vhost where it helps" but
> > > > > then I think this is what everyone wants to do, anyway, and libvirt
> > > > > already sets vhost=on so I prefer redefining the meaning of an
> > > > > existing flag.
> > > >
> > > > In the very least, there needs to be a vhost=force.
> > > >
> > > > Having some sort of friendly default policy is fine but we need to
> > > > provide a mechanism for a user to have the final say. If you want to
> > > > redefine vhost=on to really mean, use the friendly default, that's fine
> > > > by me, but only if the vhost=force option exists.
> > > >
> > > > I actually would think libvirt would want to use vhost=force. Debugging
> > > > with vhost=on is going to be a royal pain in the ass if a user reports
> > > > bad performance. Given the libvirt XML, you can't actually tell from
> > > > the guest and the XML whether or not vhost was actually in use or not.
> > >
> > > If we add a force option, let's please distinguish hotplug from VM
> > > creation time. The latter can abort. Hotplug should print an error and
> > > fail the initfn.
> >
> > It can't abort at init - MSI is disabled at init, it needs to be enabled
> > by the guest later. And aborting the guest in the middle of the run
> > is a very bad idea.
>
> Yeah, I was thinking about the ordering of device being added vs guest
> enabling MSI this morning. Waiting until the guest decides to try to
> start using the device to NAK it with an abort is very undesirable.
> What if when we have vhost=on,force, the device doesn't advertise an
> INTx (PCI_INTERRUPT_PIN = 0)?
>
> Alex

Then we break backward compatibility with old guests. I don't see what
the issue is really: It is trivial to check that the guest uses MSIX.

--
MST
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Fri, 2011-01-21 at 11:55 +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> > On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > > >>> When MSI is off, each interrupt needs to be bounced through the io
> > > >>> thread when it's set/cleared, so vhost-net causes more context
> > > >>> switches and higher CPU utilization than userspace virtio which
> > > >>> handles networking in the same thread.
> > > >>>
> > > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > > >>> for now disable vhost-net in these configurations.
> > > >>>
> > > >>> Signed-off-by: Michael S. Tsirkin
> > > >>
> > > >> I actually think this should be a terminal error. The user asks for
> > > >> vhost-net, if we cannot enable it, we should exit.
> > > >>
> > > >> Or we should warn the user that they should expect bad performance.
> > > >> Silently doing something that the user has explicitly asked us not
> > > >> to do is not a good behavior.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Anthony Liguori
> > > >
> > > > The issue is that user has no control of the guest, and can not know
> > > > whether the guest enables MSI. So what you ask for will just make
> > > > some guests fail, and others fail sometimes.
> > > > The user also has no way to know that version X of kvm does not
> > > > expose a way to inject level interrupts with irqfd.
> > > >
> > > > We could have *another* flag that says "use vhost where it helps" but
> > > > then I think this is what everyone wants to do, anyway, and libvirt
> > > > already sets vhost=on so I prefer redefining the meaning of an
> > > > existing flag.
> > >
> > > In the very least, there needs to be a vhost=force.
> > >
> > > Having some sort of friendly default policy is fine but we need to
> > > provide a mechanism for a user to have the final say. If you want to
> > > redefine vhost=on to really mean, use the friendly default, that's fine
> > > by me, but only if the vhost=force option exists.
> > >
> > > I actually would think libvirt would want to use vhost=force. Debugging
> > > with vhost=on is going to be a royal pain in the ass if a user reports
> > > bad performance. Given the libvirt XML, you can't actually tell from
> > > the guest and the XML whether or not vhost was actually in use or not.
> >
> > If we add a force option, let's please distinguish hotplug from VM
> > creation time. The latter can abort. Hotplug should print an error and
> > fail the initfn.
>
> It can't abort at init - MSI is disabled at init, it needs to be enabled
> by the guest later. And aborting the guest in the middle of the run
> is a very bad idea.

Yeah, I was thinking about the ordering of device being added vs guest
enabling MSI this morning. Waiting until the guest decides to try to
start using the device to NAK it with an abort is very undesirable.
What if when we have vhost=on,force, the device doesn't advertise an
INTx (PCI_INTERRUPT_PIN = 0)?

Alex
Re: Flow Control and Port Mirroring Revisited
On Thu, Jan 20, 2011 at 05:38:33PM +0900, Simon Horman wrote:
> [ Trimmed Eric from CC list as vger was complaining that it is too long ]
>
> On Tue, Jan 18, 2011 at 11:41:22AM -0800, Rick Jones wrote:
> > > So it won't be all that simple to implement well, and before we try,
> > > I'd like to know whether there are applications that are helped
> > > by it. For example, we could try to measure latency at various
> > > pps and see whether the backpressure helps. netperf has -b, -w
> > > flags which might help these measurements.
> >
> > Those options are enabled when one adds --enable-burst to the
> > pre-compilation ./configure of netperf (one doesn't have to
> > recompile netserver). However, if one is also looking at latency
> > statistics via the -j option in the top-of-trunk, or simply at the
> > histogram with --enable-histogram on the ./configure and a verbosity
> > level of 2 (global -v 2) then one wants the very top of trunk
> > netperf from:
>
> Hi,
>
> I have constructed a test where I run an un-paced UDP_STREAM test in
> one guest and a paced omni rr test in another guest at the same time.

Hmm, what is this supposed to measure? Basically each time you run an
un-paced UDP_STREAM you get some random load on the network. You can't
tell what it was exactly, only that it was between the send and receive
throughput.

> Briefly I get the following results from the omni test..
>
> 1. Omni test only:        MEAN_LATENCY=272.00
> 2. Omni and stream test:  MEAN_LATENCY=3423.00
> 3. cpu and net_cls group: MEAN_LATENCY=493.00
>    As per 2 plus cgroups are created for each guest
>    and guest tasks added to the groups
> 4. 100Mbit/s class:       MEAN_LATENCY=273.00
>    As per 3 plus the net_cls groups each have a 100Mbit/s HTB class
> 5. cpu.shares=128:        MEAN_LATENCY=652.00
>    As per 4 plus the cpu groups have cpu.shares set to 128
> 6.
Busy CPUS: MEAN_LATENCY=15126.00 >As per 5 but the CPUs are made busy using a simple shell while loop > > There is a bit of noise in the results as the two netperf invocations > aren't started at exactly the same moment > > For reference, my netperf invocations are: > netperf -c -C -t UDP_STREAM -H 172.17.60.216 -l 12 > netperf.omni -p 12866 -D -c -C -H 172.17.60.216 -t omni -j -v 2 -- -r 1 -d rr > -k foo -b 1 -w 200 -m 200 > > foo contains > PROTOCOL > THROUGHPUT,THROUGHPUT_UNITS > LOCAL_SEND_THROUGHPUT > LOCAL_RECV_THROUGHPUT > REMOTE_SEND_THROUGHPUT > REMOTE_RECV_THROUGHPUT > RT_LATENCY,MIN_LATENCY,MEAN_LATENCY,MAX_LATENCY > P50_LATENCY,P90_LATENCY,P99_LATENCY,STDDEV_LATENCY > LOCAL_CPU_UTIL,REMOTE_CPU_UTIL -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
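The resource-control steps behind results 3-5 can be sketched roughly as below. This is a dry-run sketch, not Simon's actual setup: the cgroup mount point, interface name, group name, and class IDs are all assumptions.

```shell
# Dry-run sketch of the cgroup/HTB setup in results 3-5; commands are
# echoed, not executed. Set RUN= (empty) to run for real, as root.
RUN=echo
DEV=eth0            # assumed host-side interface carrying guest traffic
CG=/sys/fs/cgroup   # assumed cgroup mount point
RATE=100mbit        # the 100Mbit/s HTB class from result 4
SHARES=128          # the cpu.shares value from result 5

# Result 3: a net_cls and cpu group per guest (guest tasks get added to
# them); classid 0x10001 maps to tc class 1:1 below.
$RUN mkdir -p $CG/net_cls/guest1 $CG/cpu/guest1
$RUN sh -c "echo 0x10001 > $CG/net_cls/guest1/net_cls.classid"

# Result 4: HTB class matching the net_cls classid, capped at $RATE
$RUN tc qdisc add dev $DEV root handle 1: htb
$RUN tc class add dev $DEV parent 1: classid 1:1 htb rate $RATE

# Result 5: reduce the guests' CPU weight
$RUN sh -c "echo $SHARES > $CG/cpu/guest1/cpu.shares"
```

With RUN=echo the script only prints the commands, which makes it safe to paste and adapt before running it for real.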
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
Gerd Hoffmann writes:
>   Hi,
>
>> By the way, we don't have a QEMUState but instead use globals.
>
> /me wants to underline this.
>
> IMO it is absolutely pointless to worry about ways to pass around
> kvm_state. There never ever will be a serious need for that.
>
> We can stick with the current model of keeping global state in global
> variables. And just do the same with kvm_state.
>
> Or we can move to have all state in a QEMUState struct which we'll
> pass around basically everywhere. Then we can simply embed or
> reference kvm_state there.
>
> I'd tend to stick with the global variables as I don't see the point
> in having a QEMUState. I doubt we'll ever see two virtual machines
> driven by a single qemu process. YMMV.

/me grabs the fat magic marker and underlines some more.
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
Gerd Hoffmann writes:
> On 01/20/11 20:39, Anthony Liguori wrote:
>> On 01/20/2011 02:44 AM, Gerd Hoffmann wrote:
>>>   Hi,
>>>
>>>> For (2), you cannot use bus=X,addr=Y because it makes assumptions
>>>> about the PCI topology which may change in newer -M pc's.
>>>
>>> Why should the PCI topology for 'pc' ever change?
>>>
>>> We'll probably get q35 support some day, but when this lands I expect
>>> we'll see a new machine type 'q35', so '-m q35' will pick the ich9
>>> chipset (which will have a different pci topology of course) and '-m
>>> pc' will pick the existing piix chipset (which will continue to look
>>> like it looks today).
>>
>> But then what's the default machine type? When I say -M pc, I really
>> mean the default machine.
>
> I'd tend to leave pc as default for a release cycle or two so we can
> hash out issues with q35, then flip the default once it got broader
> testing and runs stable.
>
>> At some point, "qemu-system-x86_64 -device virtio-net-pci,addr=2.0"
>>
>> is not going to be a reliable way to invoke qemu because there's no way
>> we can guarantee that slot 2 isn't occupied by a chipset device or some
>> other default device.
>
> Indeed. But qemu -M pc should continue to work though. 'pc' would be
> better named 'piix3', but renaming it now is probably not worth the
> trouble.

We mustn't change pc-0.14 & friends. We routinely change pc, but whether
an upgrade to q35 qualifies as a routine change is debatable.

If you don't want the PCI topology (and more) to change across QEMU
updates, consider using the versioned machine types.
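To make the last point concrete, a dry-run sketch of what pinning a versioned machine type looks like on the command line (the disk image name is a placeholder, and the exact set of machine types depends on the build; `-M ?` asks the binary to list them):

```shell
# Dry run: commands are echoed, not executed (set RUN= to run for real).
RUN=echo

# Ask the qemu binary which machine types it provides; a 0.14-era build
# would list both the floating 'pc' alias and frozen types like 'pc-0.14'.
$RUN qemu-system-x86_64 -M ?

# 'pc' tracks the latest board; 'pc-0.14' is frozen, so its PCI topology
# (and therefore addr=2.0 below) stays valid across QEMU upgrades.
$RUN qemu-system-x86_64 -M pc-0.14 -device virtio-net-pci,addr=2.0 disk.img
```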
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Thu, Jan 20, 2011 at 06:35:46PM -0700, Alex Williamson wrote:
> On Thu, 2011-01-20 at 18:23 -0600, Anthony Liguori wrote:
> > On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> > >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> > >>> When MSI is off, each interrupt needs to be bounced through the io
> > >>> thread when it's set/cleared, so vhost-net causes more context
> > >>> switches and higher CPU utilization than userspace virtio which
> > >>> handles networking in the same thread.
> > >>>
> > >>> We'll need to fix this by adding level irq support in kvm irqfd,
> > >>> for now disable vhost-net in these configurations.
> > >>>
> > >>> Signed-off-by: Michael S. Tsirkin
> > >>
> > >> I actually think this should be a terminal error. The user asks for
> > >> vhost-net, if we cannot enable it, we should exit.
> > >>
> > >> Or we should warn the user that they should expect bad performance.
> > >> Silently doing something that the user has explicitly asked us not
> > >> to do is not a good behavior.
> > >>
> > >> Regards,
> > >>
> > >> Anthony Liguori
> > >
> > > The issue is that user has no control of the guest, and can not know
> > > whether the guest enables MSI. So what you ask for will just make
> > > some guests fail, and others fail sometimes.
> > > The user also has no way to know that version X of kvm does not
> > > expose a way to inject level interrupts with irqfd.
> > >
> > > We could have *another* flag that says "use vhost where it helps" but
> > > then I think this is what everyone wants to do, anyway, and libvirt
> > > already sets vhost=on so I prefer redefining the meaning of an
> > > existing flag.
> >
> > In the very least, there needs to be a vhost=force.
> >
> > Having some sort of friendly default policy is fine but we need to
> > provide a mechanism for a user to have the final say. If you want to
> > redefine vhost=on to really mean, use the friendly default, that's fine
> > by me, but only if the vhost=force option exists.
> >
> > I actually would think libvirt would want to use vhost=force. Debugging
> > with vhost=on is going to be a royal pain in the ass if a user reports
> > bad performance. Given the libvirt XML, you can't actually tell from
> > the guest and the XML whether or not vhost was actually in use or not.
>
> If we add a force option, let's please distinguish hotplug from VM
> creation time. The latter can abort. Hotplug should print an error and
> fail the initfn.

It can't abort at init - MSI is disabled at init, it needs to be enabled
by the guest later. And aborting the guest in the middle of the run
is a very bad idea.

What vhostforce=true will do is force the vhost backend to be used even
if it is slower.

> Thanks,
>
> Alex
Re: [Qemu-devel] [PATCH] vhost: force vhost off for non-MSI guests
On Thu, Jan 20, 2011 at 06:23:36PM -0600, Anthony Liguori wrote:
> On 01/20/2011 10:07 AM, Michael S. Tsirkin wrote:
> > On Thu, Jan 20, 2011 at 09:43:57AM -0600, Anthony Liguori wrote:
> >> On 01/20/2011 09:35 AM, Michael S. Tsirkin wrote:
> >>> When MSI is off, each interrupt needs to be bounced through the io
> >>> thread when it's set/cleared, so vhost-net causes more context
> >>> switches and higher CPU utilization than userspace virtio which
> >>> handles networking in the same thread.
> >>>
> >>> We'll need to fix this by adding level irq support in kvm irqfd,
> >>> for now disable vhost-net in these configurations.
> >>>
> >>> Signed-off-by: Michael S. Tsirkin
> >>
> >> I actually think this should be a terminal error. The user asks for
> >> vhost-net, if we cannot enable it, we should exit.
> >>
> >> Or we should warn the user that they should expect bad performance.
> >> Silently doing something that the user has explicitly asked us not
> >> to do is not a good behavior.
> >>
> >> Regards,
> >>
> >> Anthony Liguori
> >
> > The issue is that user has no control of the guest, and can not know
> > whether the guest enables MSI. So what you ask for will just make
> > some guests fail, and others fail sometimes.
> > The user also has no way to know that version X of kvm does not expose
> > a way to inject level interrupts with irqfd.
> >
> > We could have *another* flag that says "use vhost where it helps" but
> > then I think this is what everyone wants to do, anyway, and libvirt
> > already sets vhost=on so I prefer redefining the meaning of an existing
> > flag.
>
> In the very least, there needs to be a vhost=force.
>
> Having some sort of friendly default policy is fine but we need to
> provide a mechanism for a user to have the final say. If you want
> to redefine vhost=on to really mean, use the friendly default,
> that's fine by me, but only if the vhost=force option exists.

OK, I will add that, probably as a separate flag as vhost is a boolean.
This will get worse performance but it will be what the user asked for.

> I actually would think libvirt would want to use vhost=force.
> Debugging with vhost=on is going to be a royal pain in the ass if a
> user reports bad performance. Given the libvirt XML, you can't
> actually tell from the guest and the XML whether or not vhost was
> actually in use or not.

Yes you can: check MSI enabled in the guest, if it is - check vhost
enabled in the XML. Not that bad at all, is it?

We get worse performance without MSI anyway, how is this different?

> Regards,
>
> Anthony Liguori

> > Maybe this is best handled by a documentation update?
> >
> > We always said:
> >     "use vhost=on to enable experimental in kernel accelerator\n"
> >
> > note 'enable' not 'require'. This is similar to how we specify
> > nvectors: you can not make guest use the feature.
> >
> > How about this:
> >
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 898561d..3c937c1 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -1061,6 +1061,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
> >      "use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag\n"
> >      "use vnet_hdr=on to make the lack of IFF_VNET_HDR support an error condition\n"
> >      "use vhost=on to enable experimental in kernel accelerator\n"
> > +    "(note: vhost=on has no effect unless guest uses MSI-X)\n"
> >      "use 'vhostfd=h' to connect to an already opened vhost net device\n"
> >  #endif
> >      "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
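On the command line, the two behaviours being settled here might look as follows once the separate flag exists. A dry-run sketch: the netdev id, tap details, and image name are placeholders, and the flag spelling follows the vhostforce name mentioned earlier in the thread.

```shell
# Dry run: echo the invocations instead of executing them.
RUN=echo

# Friendly default: vhost is used only when it helps, i.e. once the
# guest enables MSI-X; otherwise qemu quietly falls back to userspace.
$RUN qemu-system-x86_64 -netdev tap,id=net0,vhost=on \
    -device virtio-net-pci,netdev=net0 disk.img

# Forced: the vhost backend is used even for non-MSI guests, trading
# extra io-thread bounces for "what the user asked for".
$RUN qemu-system-x86_64 -netdev tap,id=net0,vhost=on,vhostforce=on \
    -device virtio-net-pci,netdev=net0 disk.img
```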
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
  Hi,

> By the way, we don't have a QEMUState but instead use globals.

/me wants to underline this.

IMO it is absolutely pointless to worry about ways to pass around
kvm_state. There never ever will be a serious need for that.

We can stick with the current model of keeping global state in global
variables. And just do the same with kvm_state.

Or we can move to have all state in a QEMUState struct which we'll
pass around basically everywhere. Then we can simply embed or
reference kvm_state there.

I'd tend to stick with the global variables as I don't see the point
in having a QEMUState. I doubt we'll ever see two virtual machines
driven by a single qemu process. YMMV.

cheers,
  Gerd
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/20/11 20:39, Anthony Liguori wrote:
> On 01/20/2011 02:44 AM, Gerd Hoffmann wrote:
>>   Hi,
>>
>>> For (2), you cannot use bus=X,addr=Y because it makes assumptions
>>> about the PCI topology which may change in newer -M pc's.
>>
>> Why should the PCI topology for 'pc' ever change?
>>
>> We'll probably get q35 support some day, but when this lands I expect
>> we'll see a new machine type 'q35', so '-m q35' will pick the ich9
>> chipset (which will have a different pci topology of course) and '-m
>> pc' will pick the existing piix chipset (which will continue to look
>> like it looks today).
>
> But then what's the default machine type? When I say -M pc, I really
> mean the default machine.

I'd tend to leave pc as default for a release cycle or two so we can
hash out issues with q35, then flip the default once it got broader
testing and runs stable.

> At some point, "qemu-system-x86_64 -device virtio-net-pci,addr=2.0"
>
> is not going to be a reliable way to invoke qemu because there's no way
> we can guarantee that slot 2 isn't occupied by a chipset device or some
> other default device.

Indeed. But qemu -M pc should continue to work though. 'pc' would be
better named 'piix3', but renaming it now is probably not worth the
trouble.

cheers,
  Gerd
[Bug 26872] qemu stop responding if using kvm with usb passthru
https://bugzilla.kernel.org/show_bug.cgi?id=26872

alien.vi...@gmail.com changed:

           What      |Removed |Added
 ----------------------------------------------------------
           Status    |NEW     |RESOLVED
           Resolution|        |PATCH_ALREADY_AVAILABLE

--- Comment #1 from alien.vi...@gmail.com 2011-01-21 08:02:24 ---
user must enable MMU in kernel command line

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.