[Qemu-devel] [Bug 712416] Re: kvm_intel kernel module crash with via nano vmx
** Tags removed: kvm

https://bugs.launchpad.net/bugs/712416

Title: kvm_intel kernel module crash with via nano vmx

Status in QEMU: New
Status in “kvm” package in Ubuntu: Incomplete
Status in “kvm” package in Debian: New

Bug description:
The kvm module for hardware virtualisation does not work properly on VIA Nano processors. Tested with processor: VIA Nano processor U2250.

Processor flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng rng_en ace ace_en ace2 phe phe_en lahf_lm

With kernel 2.6.32: kvm does not work and dmesg contains many lines like:
handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

With kernel 2.6.35: the whole system crashes. Nothing is visible in the logs.
Re: [Qemu-devel] OSX build issues
On 14.03.2011, at 22:21, François Revol wrote: The OSX build has been broken for some time now... * qemu-thread-posix.c: both qemu_mutex_timedlock and qemu_cond_timedwait make use of clock_gettime() and CLOCK_REALTIME, which OSX doesn't have. It seems like both functions are nowhere found. Can they be removed then ? * cpus.c: qemu_kvm_eat_signals refers to sigbus_reraise which is defined conditionally on CONFIG_LINUX... And OSX doesn't have sigtimedwait... Any maintainer around who can fix it ? Andreas is your man :). Alex
[Qemu-devel] Re: KVM call minutes for Mar 8
On 2011-03-15 04:38, Marcelo Tosatti wrote: On Tue, Mar 08, 2011 at 06:21:07PM +0100, Jan Kiszka wrote: On 2011-03-08 18:15, Paolo Bonzini wrote: On 03/08/2011 06:10 PM, Jan Kiszka wrote: The qemu.git bit seen with my win32 patch series should also be a regression from qemu-kvm.git to qemu.git, no? Can't follow. What do you mean? I didn't understand very well Avi and Marcelo's exchange, but this test definitely 1) fails with qemu iothread, 2) works with qemu non-iothread. What I didn't understand is whether it works with qemu-kvm. Are we talking about the failing Fedora installations? Or something else? Jan The failing Fedora installation problem is gone. You lost it :), or was it solved? Anyway, will now repost part V of upstream patches later today. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH -V3 2/8] hw/9pfs: Add file descriptor reclaim support
On Mon, 14 Mar 2011 10:13:59 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Sun, Mar 13, 2011 at 6:57 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
On Sun, 13 Mar 2011 16:08:29 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

@@ -107,7 +108,12 @@ static int v9fs_do_closedir(V9fsState *s, DIR *dir)
 static int v9fs_do_open(V9fsState *s, V9fsString *path, int flags)
 {
-    return s->ops->open(s->ctx, path->data, flags);
+    int fd;
+    fd = s->ops->open(s->ctx, path->data, flags);
+    if (fd > P9_FD_RECLAIM_THRES) {
+        v9fs_reclaim_fd(s);
+    }

I think the threshold should depend on the file descriptor ulimit. The hardcoded constant doesn't work if the ulimit is set to 1000 or less (it would cause other users in QEMU to hit EMFILE errors).

Yes. That is supposed to be a follow-up patch. I had that set to 100 for all the early testing.

Using getrlimit(2) to choose a good threshold at runtime shouldn't be a lot of code. Please add it to this patch so the threshold isn't arbitrary and possibly ineffective due to ulimit.

ok.

@@ -2719,7 +2806,11 @@ static void v9fs_remove(V9fsState *s, V9fsPDU *pdu)
         err = -EINVAL;
         goto out;
     }
-
+    /*
+     * IF the file is unlinked, we cannot reopen
+     * the file later. So don't reclaim fd
+     */
+    v9fs_mark_fids_unreclaim(s, &vs->fidp->fsmap.path);

This poses a problem for the case where guest and host are both accessing the file system. If the fd is reclaimed and the host deletes the file, then the guest cannot access its open file anymore. The same issue also affects rename and has not been covered by this patch.

Currently virtFS doesn't handle rename/unlink on the host. That is, we walk a name to get the fid and then use the fid to open the file. If the file gets removed or renamed in between, we will get an EINVAL. All that will go away once we switch to handle-based open.

Can you explain this more? Will multiple entities be able to safely use the file system (e.g. host and guest)?

Handles are stable across renames. So even if the host renames the file, qemu will be able to access it. But we still won't be able to handle unlink on the host. That is true even with other file servers, though: they get ESTALE in that case.

-aneesh
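The getrlimit(2) suggestion in this thread could look roughly like the sketch below. It is not taken from the patch: the helper names, the "half of the soft limit" policy and the RECLAIM_CAP value are all assumptions made for illustration.

```c
#include <assert.h>
#include <sys/resource.h>

#define RECLAIM_CAP 4096L   /* assumed upper bound, chosen only for this sketch */

/* Hypothetical helper: derive a reclaim threshold from a soft fd limit.
 * Half of the limit is used so that other fd users keep some headroom. */
static long reclaim_threshold_for(rlim_t soft_limit)
{
    assert(soft_limit > 0);
    if (soft_limit == RLIM_INFINITY || soft_limit / 2 > (rlim_t)RECLAIM_CAP) {
        return RECLAIM_CAP;
    }
    return (long)(soft_limit / 2);
}

/* Query RLIMIT_NOFILE at runtime; returns -1 if the limit is unusable. */
static long reclaim_threshold(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0 || rl.rlim_cur == 0) {
        return -1;
    }
    return reclaim_threshold_for(rl.rlim_cur);
}
```

With a ulimit of 1000 this yields 500 instead of a fixed constant, which is the case the review comment worries about.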
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Sun, 13 Mar 2011 20:57:12 +, Stefan Hajnoczi stefa...@gmail.com wrote: On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote: On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi stefa...@gmail.com wrote: On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: cache=none implies the file are opened in the host with O_SYNC open flag O_SYNC does not bypass the host page cache. It ensures that writes only complete once data has been written to the disk. O_DIRECT is a hint to bypass the host page cache when possible. A boolean on|off option would be nicer than an option that takes the special string none. For example, direct=on|off. It also makes the code nicer by using bools instead of strdup strings that get leaked. What i wanted is the O_SYNC behavior. Well the comment should be updated. I want to make sure that we don't have dirty data in host page cache after a write. It is always good to make read hit the page cache I have sent a patch to clean up the -virtfs option parsing, you are CCed. I think it will make it easier to add a new sync=on|off option. Absolutely. So what i will do is i will carry the patch in the series and later will drop the same once your changes get pushed upstream. -aneesh
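To make the O_SYNC versus O_DIRECT distinction concrete, here is a minimal sketch of how a boolean sync=on|off option might be turned into host open(2) flags. The function name and the option itself are hypothetical; nothing here is taken from the posted series.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical helper: build host open(2) flags for a sync=on|off option.
 * O_SYNC makes a write complete only once the data is on stable storage.
 * It does not bypass the host page cache (that is what O_DIRECT hints at),
 * so reads can still be served from the cache. */
static int export_open_flags(int guest_flags, bool sync)
{
    assert(guest_flags >= 0);
    return sync ? (guest_flags | O_SYNC) : guest_flags;
}
```

A bool parameter also avoids the strdup'd option string that the review points out is leaked.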
[Qemu-devel] [PATCH 0/2] char, virtio_console: Allow chardevs to be re-used
This series does two things:
- prevents a single chardev from being used by multiple devices at the same time
- virtio-console/serial ports don't close a chardev; instead they free it for later use by other devices (or a new hot-plugged virtio serial port).

Please apply.

Amit Shah (2):
  virtio-console: Keep chardev open for other users after hot-unplug
  char: Prevent multiple devices opening same chardev

 hw/qdev-properties.c |    7 ++++++-
 hw/virtio-console.c  |    6 +++++-
 qemu-char.c          |    4 ++++
 qemu-char.h          |    1 +
 4 files changed, 16 insertions(+), 2 deletions(-)

--
1.7.4
[Qemu-devel] [PATCH 2/2] char: Prevent multiple devices opening same chardev
Prevent:

-chardev socket,path=/tmp/foo,server,nowait,id=c0 \
-device virtserialport,chardev=c0,id=vs0 \
-device virtserialport,chardev=c0,id=vs1

Reported-by: Mike Cao b...@redhat.com
Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/qdev-properties.c |    7 ++++++-
 qemu-char.c          |    4 ++++
 qemu-char.h          |    1 +
 3 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index a45b61e..1088a26 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -351,8 +351,13 @@ static int parse_chr(DeviceState *dev, Property *prop, const char *str)
     CharDriverState **ptr = qdev_get_prop_ptr(dev, prop);

     *ptr = qemu_chr_find(str);
-    if (*ptr == NULL)
+    if (*ptr == NULL) {
         return -ENOENT;
+    }
+    if ((*ptr)->assigned) {
+        return -EEXIST;
+    }
+    (*ptr)->assigned = 1;
     return 0;
 }

diff --git a/qemu-char.c b/qemu-char.c
index bd4e944..524cdc1 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -197,6 +197,10 @@ void qemu_chr_add_handlers(CharDriverState *s,
                            IOEventHandler *fd_event,
                            void *opaque)
 {
+    if (!opaque) {
+        /* chr driver being released. */
+        s->assigned = 0;
+    }
     s->chr_can_read = fd_can_read;
     s->chr_read = fd_read;
     s->chr_event = fd_event;
diff --git a/qemu-char.h b/qemu-char.h
index 56d9954..fb96eef 100644
--- a/qemu-char.h
+++ b/qemu-char.h
@@ -70,6 +70,7 @@ struct CharDriverState {
     char *label;
     char *filename;
     int opened;
+    int assigned; /* chardev assigned to a device */
     QTAILQ_ENTRY(CharDriverState) next;
 };

--
1.7.4
[Qemu-devel] [PATCH 1/2] virtio-console: Keep chardev open for other users after hot-unplug
After a hot-unplug operation, the previous behaviour was to close the chardev. That meant the chardev couldn't be re-used. Also, since chardev hot-plug isn't possible so far, this means virtio-console hot-plug isn't feasible as well.

With this change, the chardev is kept around. A new virtio-console channel can then be hot-plugged with the same chardev and things will continue to work.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-console.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-console.c b/hw/virtio-console.c
index c235b27..84ed572 100644
--- a/hw/virtio-console.c
+++ b/hw/virtio-console.c
@@ -82,7 +82,11 @@ static int virtconsole_exitfn(VirtIOSerialPort *port)

     if (vcon->chr) {
         port->info->have_data = NULL;
-        qemu_chr_close(vcon->chr);
+        /*
+         * Instead of closing the chardev, free it so it can be used
+         * for other purposes.
+         */
+        qemu_chr_add_handlers(vcon->chr, NULL, NULL, NULL, NULL);
     }

     return 0;
--
1.7.4
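Taken together, the two patches in this series amount to a claim/release protocol on the chardev. The stand-alone sketch below models only that protocol; the Chardev type and both function names are stand-ins invented for illustration, not QEMU's CharDriverState API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for CharDriverState, reduced to the field the series adds. */
typedef struct Chardev {
    const char *label;
    int assigned;           /* non-zero while a device owns the chardev */
} Chardev;

/* Mirrors the checks added to parse_chr(): missing chardev gives -ENOENT,
 * a chardev that already has an owner gives -EEXIST. */
static int chardev_claim(Chardev *chr)
{
    if (chr == NULL) {
        return -ENOENT;
    }
    if (chr->assigned) {
        return -EEXIST;
    }
    chr->assigned = 1;
    return 0;
}

/* Mirrors hot-unplug: the chardev stays open but becomes claimable again. */
static void chardev_release(Chardev *chr)
{
    assert(chr != NULL);
    chr->assigned = 0;
}
```

A second device naming the same chardev fails with -EEXIST until the first one is unplugged, after which the same chardev can be claimed by a newly hot-plugged port.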
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Mon, 14 Mar 2011 10:20:57 +, Stefan Hajnoczi stefa...@gmail.com wrote: On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote: On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi stefa...@gmail.com wrote: On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: cache=none implies the file are opened in the host with O_SYNC open flag O_SYNC does not bypass the host page cache. It ensures that writes only complete once data has been written to the disk. O_DIRECT is a hint to bypass the host page cache when possible. A boolean on|off option would be nicer than an option that takes the special string none. For example, direct=on|off. It also makes the code nicer by using bools instead of strdup strings that get leaked. What i wanted is the O_SYNC behavior. Well the comment should be updated. I want to make sure that we don't have dirty data in host page cache after a write. It is always good to make read hit the page cache Why silently enforce O_SYNC on the server side? The client does not know whether or not O_SYNC is in effect, cannot take advantage of that knowledge, and cannot control it. I think a more useful solution is a 9p client mount option called sync that caused the client to always add O_SYNC and skip syncfs. The whole stack becomes aware of O_SYNC and clients are in control over whether or not they need O_SYNC semantics. The cache=none specifically enables us to ignore the tsyncfs request on host. tsyncfs on host can be really slow in certain setup. -aneesh
Re: [Qemu-devel] [PATCH -V3 1/8] hw/9pfs: Add V9fsfidmap in preparation for adding fd reclaim
On Sun, 13 Mar 2011 20:53:41 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Sun, Mar 13, 2011 at 7:06 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
On Sun, 13 Mar 2011 15:46:29 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

@@ -185,17 +188,22 @@ typedef struct V9fsXattr
     int flags;
 } V9fsXattr;

+typedef struct V9fsfidmap {

V9fsFidMap (naming convention)

+    union {
+        int fd;
+        DIR *dir;
+        V9fsXattr xattr;
+    } fs;

The name fs is not meaningful.

+    int fid_type;
+    V9fsString path;
+    int flags;
+} V9fsFidMap;
+
 struct V9fsFidState {
-    int fid_type;
     int32_t fid;
-    V9fsString path;
-    union {
-        int fd;
-        DIR *dir;
-        V9fsXattr xattr;
-    } fs;
     uid_t uid;
+    V9fsFidMap fsmap;

This name is confusing. A map is usually a container that stores key/value pairs. V9fsFidMapEntry would be clearer. But then I thought that is what V9fsFidState is?

I am bad at naming. I wanted to indicate something that can be shared across multiple fids and also indicate the local file system mapping/data. I will take any suggestion.

Where does sharing happen? I didn't notice any code that shares fds between fids.

That patch is not yet there. We can only share an fd if the open flags match. Hence we make sure to open files on the host with a limited set of open flags, which enables much better sharing.

-aneesh
Re: [Qemu-devel] [PATCH v2] Do not delete BlockDriverState when deleting the drive
Sorry for the long delay, I was out of action for a week.

Ryan Harper ry...@us.ibm.com writes:

When removing a drive from the host-side via drive_del we currently have the following path:

drive_del
qemu_aio_flush()
bdrv_close()
drive_uninit()
bdrv_delete()

When we bdrv_delete() we end up qemu_free()ing the BlockDriverState pointer; however, the block devices retain a copy of this pointer, see hw/virtio-blk.c:virtio_blk_init() where we s->bs = conf->bs. We now have a use-after-free situation. If the guest continues to issue IO against the device, and we've reallocated the memory that the BlockDriverState pointed at, then we will fail the bs->drv checks in the various bdrv_ methods.

"we will fail the bs->drv checks" is misleading, in my opinion. Here's what happens:

1. bdrv_close(bs) zaps bs->drv, which makes any subsequent I/O get dropped. Works as designed.

2. drive_uninit() frees the bs. Since the device is still connected to bs, any subsequent I/O is a use-after-free. The value of bs->drv becomes unpredictable on free. As long as it remains null, I/O still gets dropped. I/O crashes or worse once that changes. Could be right on free, could be much later.

If you respin anyway, please clarify your description.

To resolve this issue as simply as possible, we can choose to not actually delete the BlockDriverState pointer. Since bdrv_close() handles setting the drv pointer to NULL, we just need to remove the BlockDriverState from the QLIST that is used to enumerate the block devices. This is currently handled within bdrv_delete, so move this into its own function, bdrv_remove().

Why do we remove the BlockDriverState from bdrv_states? Because we want drive_del to make its *name* go away.

Begs the question: is the code prepared for a BlockDriverState object that isn't on bdrv_states? Turns out we're in luck: bdrv_new() already creates such objects when the device_name is empty. This is used for internal BlockDriverStates such as COW backing files. Your code makes device_name empty when taking the object off bdrv_states, so we're good.

Begs yet another question: how does the behavior of a BlockDriverState change when it's taken off bdrv_states, and is that the behavior we want? Changes:

* bdrv_delete() no longer takes it off bdrv_states. Good.

* bdrv_close_all(), bdrv_commit_all() and bdrv_flush_all() no longer cover it. Okay, because bdrv_close(), bdrv_commit() and bdrv_flush() do nothing anyway for closed BlockDriverStates.

* info block and info blockstats no longer show it, because bdrv_info() and bdrv_info_stats() no longer see it. Okay.

* bdrv_find(), bdrv_next(), bdrv_iterate() no longer see it. Impact? Please check their uses and report.

The result is that we can now invoke drive_del, this closes the file descriptors and sets BlockDriverState->drv to NULL which prevents further IO to the device, and since we do not free BlockDriverState, we don't have to worry about the copy retained in the block devices.

Yep. But there's one more question: is the BlockDriverState freed when the device using it gets destroyed?

qdev_free() runs prop->info->free() for all properties. The drive property's free() is free_drive():

static void free_drive(DeviceState *dev, Property *prop)
{
    BlockDriverState **ptr = qdev_get_prop_ptr(dev, prop);

    if (*ptr) {
        bdrv_detach(*ptr, dev);
        blockdev_auto_del(*ptr);
    }
}

This should indeed delete the drive. But only if the property still points to it. See below.

Reported-by: Marcus Armbruster arm...@redhat.com
Signed-off-by: Ryan Harper ry...@us.ibm.com
---
v1->v2
- NULL bs->device_name after removing from list to prevent second removal.

 block.c    |   12 +++++++++---
 block.h    |    1 +
 blockdev.c |    2 +-
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 1544d81..0df9942 100644
--- a/block.c
+++ b/block.c
@@ -697,14 +697,20 @@ void bdrv_close_all(void)
     }
 }

+void bdrv_remove(BlockDriverState *bs)
+{
+    if (bs->device_name[0] != '\0') {
+        QTAILQ_REMOVE(&bdrv_states, bs, list);
+    }
+    bs->device_name[0] = '\0';
+}
+

I don't like this name. What's the difference between delete and remove? The function zaps the device name. bdrv_make_anon()?

 void bdrv_delete(BlockDriverState *bs)
 {
     assert(!bs->peer);

     /* remove from list, if necessary */
-    if (bs->device_name[0] != '\0') {
-        QTAILQ_REMOVE(&bdrv_states, bs, list);
-    }
+    bdrv_remove(bs);

     bdrv_close(bs);
     if (bs->file != NULL) {
diff --git a/block.h b/block.h
index 5d78fc0..8447397 100644
--- a/block.h
+++ b/block.h
@@ -66,6 +66,7 @@ int bdrv_create(BlockDriver *drv, const char* filename,
     QEMUOptionParameter *options);
 int bdrv_create_file(const char* filename, QEMUOptionParameter *options);
 BlockDriverState *bdrv_new(const char *device_name);
+void
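The "make anonymous instead of freeing" idea discussed in this review can be modelled outside QEMU with a plain tail queue. Everything in the sketch below (the Bds type and the bds_* functions) is a hypothetical stand-in built on <sys/queue.h>; it only illustrates why clearing the name makes a second removal harmless.

```c
#include <assert.h>
#include <string.h>
#include <sys/queue.h>

/* Stand-in for BlockDriverState: a name plus the list linkage. */
typedef struct Bds {
    char device_name[32];
    TAILQ_ENTRY(Bds) list;
} Bds;

static TAILQ_HEAD(, Bds) bds_states = TAILQ_HEAD_INITIALIZER(bds_states);

/* Put a named object on the global list. */
static void bds_register(Bds *bs, const char *name)
{
    assert(name[0] != '\0');
    strncpy(bs->device_name, name, sizeof(bs->device_name) - 1);
    bs->device_name[sizeof(bs->device_name) - 1] = '\0';
    TAILQ_INSERT_TAIL(&bds_states, bs, list);
}

/* Take the object off the list and zap its name. The empty name doubles
 * as the "not on the list" marker, so calling this twice is safe. */
static void bds_make_anon(Bds *bs)
{
    if (bs->device_name[0] != '\0') {
        TAILQ_REMOVE(&bds_states, bs, list);
    }
    bs->device_name[0] = '\0';
}

/* Lookup by name only sees objects that are still on the list. */
static Bds *bds_find(const char *name)
{
    Bds *bs;

    TAILQ_FOREACH(bs, &bds_states, list) {
        if (strcmp(bs->device_name, name) == 0) {
            return bs;
        }
    }
    return NULL;
}
```

The object itself stays allocated after bds_make_anon(), so a device that still holds a pointer to it does not end up with a dangling reference; only the name lookup stops finding it.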
Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
Am 14.03.2011 18:48, schrieb Anthony Liguori: As I've been waiting for QAPI review, I've been working on the design of a new mechanism to replace our current command line option handling (QemuOpts) with something that reuses the QAPI infrastructure. The 'QemuOpts' syntax is just a way to encode complex data structures. 'nic,model=virtio,macaddress=00:01:02:03:04:05' can be mapped directly to a C data structure. This is exactly what QCFG does using the same JSON schema mechanism that QMP uses. The effect is that you describe a command line argument in JSON like so: { 'type': 'VncConfig', 'data': { 'address': 'str', '*password': 'bool', '*reverse': 'bool', '*no-lock-key-sync': 'bool', '*sasl': 'bool', '*tls': 'bool', '*x509': 'str', '*x509verify': 'str', '*acl': 'bool', '*lossy': 'bool' } } You then just implement a C function that gets called for each -vnc option specified: void qcfg_handle_vnc(VncConfig *option, Error **errp) { } And that's it. You can squirrel away the option such that they all can be processed later, you can perform additional validation and return an error, or you can implement the appropriate logic. The VncConfig structure is a proper C data structure. The advantages of this approach compared to QemuOpts are similar to QAPI: 1) Strong typing means less bugs with lack of command line validation. In many cases, a bad command line results in a SEGV today. 2) Every option is formally specified and documented in a way that is both rigorous and machine readable. This means we can generate high quality documentation in a variety of formats. 3) The command line parameters support full introspection. This should provide the same functionality as Dan's earlier introspection patches. 4) The 'VncConfig' structure also has JSON marshallers and the qcfg_handle_vnc() function can be trivially bridged to QMP. This means command line oriented interfaces (like device_add) are better integrated with QMP. 5) Very complex data types can be implemented. 
We had some discussion of supporting nested structures with -blockdev. This wouldn't work with QemuOpts but I've already implemented it with QCFG (blockdev syntax is my test case right now). The syntax I'm currently using is -blockdev cache=none,id=foo,format.qcow.protocol.nbd.hostname=localhost where '.' is used to reference sub structures. Do you have an example from your implementation for this? I think the tricky part is that the valid fields depend on the block driver. qcow2 wants another BlockDriverState as its image file; file wants a file name; vvfat wants a directory name, FAT type and disk type; and NBD wants a host name and a port, except if it uses a UNIX socket. This is probably the most complex thing you can get, so I think it would make a better example than a VNC configuration. Kevin
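For readers unfamiliar with the dotted syntax, the sketch below shows one way a key such as format.qcow.protocol.nbd.hostname=localhost could be split into its sub-structure path and value. It is not QCFG code: the OptPath type, MAX_DEPTH and opt_path_parse() are assumptions made up for this example, and it does no schema validation at all.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 8     /* assumed nesting limit for this sketch */

typedef struct OptPath {
    char buf[128];                  /* private copy of the argument, split in place */
    const char *part[MAX_DEPTH];    /* path components, outermost first */
    int depth;
    const char *value;
} OptPath;

/* Split "a.b.c=value" into components. Returns 0 on success, -1 if the
 * argument has no '=', is too long, nests too deep or has an empty component. */
static int opt_path_parse(OptPath *out, const char *arg)
{
    size_t len;
    char *eq, *p;

    assert(out != NULL && arg != NULL);
    len = strlen(arg);
    if (len >= sizeof(out->buf)) {
        return -1;
    }
    memcpy(out->buf, arg, len + 1);

    eq = strchr(out->buf, '=');
    if (eq == NULL) {
        return -1;
    }
    *eq = '\0';                     /* dots in the value are left alone */
    out->value = eq + 1;

    out->depth = 0;
    p = out->buf;
    for (;;) {
        char *dot = strchr(p, '.');

        if (out->depth == MAX_DEPTH) {
            return -1;
        }
        if (dot != NULL) {
            *dot = '\0';
        }
        if (*p == '\0') {
            return -1;
        }
        out->part[out->depth++] = p;
        if (dot == NULL) {
            break;
        }
        p = dot + 1;
    }
    return 0;
}
```

Kevin's point still stands with such a parser: knowing the path is the easy part, while deciding which fields are valid below format.qcow or protocol.nbd depends on the block driver.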
Re: [Qemu-devel] Re: KVM call agenda for Jan 25
Am 14.03.2011 16:13, schrieb Dushyant Bansal: Nice that qemu-img convert isn't that far out by default on raw :). About Google Summer of Code, I have posted my take on applying and want to share that with you and qemu-devel: http://blog.vmsplice.net/2011/03/advice-for-students-applying-to-google.html Thanks for sharing your experiences. After reading about qcow2 and qed and how they organize data (thanks to the newly added qcow2 doc and discussions on the mailing list), this is what I understand. So, the main difference between qed and qcow2 is the absence of reference count structure in qed(means less meta data). It improves performance due to: 1. For write operations, less or no metadata to update. 2. Data write and metadata write can be in any order This also means these features are no longer supported: 1. Internal snapshots, 2. CPU/device state snapshots, 3. Compression, 4. Encryption Now, coming to qed--qcow2 conversion, I want to clarify some things. 1. header_size: variable in qed, equals to cluster size in qcow2: When will it be larger than 1 cluster in qed? So, what will happen to that extra data on qed-qcow2 conversion. If you have an feature that is used in the original image, but cannot be represented in the new format, I think you should just get an error. 2. L2 table size: equals to L1 table size in qed, equals to cluster size in qcow2: we need to take it into account during conversion. Right. I think we'll have to rewrite all of the metadata. I wonder if we can manage to have a nice block driver interface for in-place image conversions so that we don't only get a qed-qcow2 converter, but also can implement the interface in e.g. VMDK and get VMDK-qcow2 and VMDK-qed as well. 3. refcount table: qcow2-qed:we do not keep refcount structure qed-qcow2: initialize refcount structure Yes, refcounts can be rebuilt after the mapping has been converted. 
So, a qcow2-qed-qcow2 conversion means if earlier, qcow2 image was using additional features{1-4}, all information related to that will be lost. We shouldn't lose anything but just abort if a conversion isn't possible. The user can still use qemu-img convert for the more complicated cases, for example for removing encryption or compression. Kevin
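The "abort rather than lose data" rule can be expressed as a small feature check before any metadata is rewritten. The sketch below is only an illustration: the FEAT_* flags and check_convertible() are hypothetical names, and the feature list is the one given earlier in this thread (internal snapshots, CPU/device state snapshots, compression, encryption).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical feature bits for an image format. */
enum {
    FEAT_INTERNAL_SNAPSHOTS = 1 << 0,
    FEAT_VM_STATE           = 1 << 1,
    FEAT_COMPRESSION        = 1 << 2,
    FEAT_ENCRYPTION         = 1 << 3,
    FEAT_ALL                = (1 << 4) - 1
};

/* Return 0 if every feature the source image uses can be represented in
 * the target format, -ENOTSUP otherwise. Nothing is modified on failure. */
static int check_convertible(unsigned used, unsigned supported)
{
    assert((used & ~(unsigned)FEAT_ALL) == 0);
    return (used & ~supported) ? -ENOTSUP : 0;
}
```

For a qcow2 to qed in-place conversion the target would support none of these bits, so any image using one of them is refused and the user falls back to qemu-img convert.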
Re: [Qemu-devel] OSX build issues
On Tue, Mar 15, 2011 at 6:41 AM, Alexander Graf ag...@suse.de wrote: On 14.03.2011, at 22:21, François Revol wrote: The OSX build has been broken for some time now... * qemu-thread-posix.c: both qemu_mutex_timedlock and qemu_cond_timedwait make use of clock_gettime() and CLOCK_REALTIME, which OSX doesn't have. It seems like both functions are nowhere found. Can they be removed then ? * cpus.c: qemu_kvm_eat_signals refers to sigbus_reraise which is defined conditionally on CONFIG_LINUX... And OSX doesn't have sigtimedwait... Any maintainer around who can fix it ? Andreas is your man :). Daniel and Christian recently added a buildbot setup that periodically builds qemu.git and checks that builds are successful. OSX is currently not covered but maybe you'd like to help here. Here is the buildbot, if you're curious what it looks like: http://buildbot.b1-systems.de/qemu/one_box_per_builder If you have an OSX box available to run periodic builds you can help by contributing it as a buildslave. Check out this wiki page for more info on contributing buildslaves: http://wiki.qemu.org/ContinuousIntegration Stefan
[Qemu-devel] [v1 PATCH 0/3]: Use GLib threadpool in 9pfs.
Hi,

This patchset enables the use of GLib threadpool infrastructure in 9pfs. It contains the following patches:

1/3 - Move the paio_signal_handler to a generic location.
2/3 - Helper routines to use GLib threadpool infrastructure in 9pfs.
3/3 - Convert v9fs_stat to threaded model.

As a prerequisite, before these patches are applied, we need to apply Anthony's patch "Add support for glib based threading and convert qemu thread to use it", found at http://www.mail-archive.com/qemu-devel@nongnu.org/msg52791.html

Testing carried out: this patchset has been tested by running the following autotest suites successfully:

* Dbench
* Fsstress
* Connectathon
* Tuxera POSIX

Please let me know your comments.

-arun
[Qemu-devel] [v1 PATCH 1/3]: Move the paio_signal_handler to a generic location.
* Arun R Bharadwaj a...@linux.vnet.ibm.com [2011-03-15 16:04:53]:

Author: Arun R Bharadwaj a...@linux.vnet.ibm.com
Date: Thu Mar 10 14:45:25 2011 +0530

    Move the paio_signal_handler to a generic location.

    The paio subsystem uses the signal, SIGUSR2. So move the signal handler
    to a more generic place such that other subsystems like 9pfs can also
    use it.

    TODO: I have moved the signal handler code to qemu-thread.c, which is NOT
    the right place. I need suggestions as to where is the right place to put it.

    Signed-off-by: Arun R Bharadwaj a...@linux.vnet.ibm.com
    Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index fa5494d..df001d6 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -28,6 +28,7 @@
 #include "qemu-common.h"
 #include "trace.h"
 #include "block_int.h"
+#include "qemu-thread.h"

 #include "block/raw-posix-aio.h"

@@ -302,6 +303,8 @@ static ssize_t handle_aiocb_rw(struct qemu_paiocb *aiocb)
     return nbytes;
 }

+static PosixAioState *posix_aio_state;
+
 static void *aio_thread(void *unused)
 {
     pid_t pid;
@@ -356,6 +359,15 @@ static void *aio_thread(void *unused)
         idle_threads++;
         mutex_unlock(&lock);

+        if (posix_aio_state) {
+            char byte = 0;
+            ssize_t ret;
+
+            ret = write(posix_aio_state->wfd, &byte, sizeof(byte));
+            if (ret < 0 && errno != EAGAIN)
+                die("write()");
+        }
+
         if (kill(pid, aiocb->ev_signo)) die("kill failed");
     }
@@ -497,22 +509,6 @@ static int posix_aio_flush(void *opaque)
     return !!s->first_aio;
 }

-static PosixAioState *posix_aio_state;
-
-static void aio_signal_handler(int signum)
-{
-    if (posix_aio_state) {
-        char byte = 0;
-        ssize_t ret;
-
-        ret = write(posix_aio_state->wfd, &byte, sizeof(byte));
-        if (ret < 0 && errno != EAGAIN)
-            die("write()");
-    }
-
-    qemu_service_io();
-}
-
 static void paio_remove(struct qemu_paiocb *acb)
 {
     struct qemu_paiocb **pacb;
@@ -616,7 +612,6 @@ BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd,

 int paio_init(void)
 {
-    struct sigaction act;
     PosixAioState *s;
     int fds[2];
     int ret;
@@ -626,11 +621,6 @@ int paio_init(void)

     s = qemu_malloc(sizeof(PosixAioState));

-    sigfillset(&act.sa_mask);
-    act.sa_flags = 0; /* do not restart syscalls to interrupt select() */
-    act.sa_handler = aio_signal_handler;
-    sigaction(SIGUSR2, &act, NULL);
-
     s->first_aio = NULL;
     if (qemu_pipe(fds) == -1) {
         fprintf(stderr, "failed to create pipe\n");
diff --git a/qemu-thread.c b/qemu-thread.c
index 2c521ab..bed1b60 100644
--- a/qemu-thread.c
+++ b/qemu-thread.c
@@ -149,3 +149,18 @@ void qemu_thread_exit(void *retval)
 {
     g_thread_exit(retval);
 }
+
+void sigusr2_signal_handler(int signum)
+{
+    qemu_service_io();
+}
+
+void register_sigusr2_signal_handler(void)
+{
+    struct sigaction act;
+
+    sigfillset(&act.sa_mask);
+    act.sa_flags = 0; /* do not restart syscalls to interrupt select() */
+    act.sa_handler = sigusr2_signal_handler;
+    sigaction(SIGUSR2, &act, NULL);
+}
diff --git a/qemu-thread.h b/qemu-thread.h
index dc22a60..0f1cbe8 100644
--- a/qemu-thread.h
+++ b/qemu-thread.h
@@ -41,5 +41,7 @@ void qemu_thread_signal(QemuThread *thread, int sig);
 void qemu_thread_self(QemuThread *thread);
 int qemu_thread_equal(QemuThread *thread1, QemuThread *thread2);
 void qemu_thread_exit(void *retval);
+void sigusr2_signal_handler(int signum);
+void register_sigusr2_signal_handler(void);

 #endif
diff --git a/vl.c b/vl.c
index 3225b1d..a8dc4fc 100644
--- a/vl.c
+++ b/vl.c
@@ -148,6 +148,7 @@ int main(int argc, char **argv)
 #include "qemu-config.h"
 #include "qemu-objects.h"
 #include "qemu-options.h"
+#include "qemu-thread.h"
 #ifdef CONFIG_VIRTFS
 #include "fsdev/qemu-fsdev.h"
 #endif
@@ -3003,6 +3004,8 @@ int main(int argc, char **argv, char **envp)
         }
     }

+    register_sigusr2_signal_handler();
+
     if (qemu_opts_foreach(qemu_find_opts("mon"), mon_init_func, NULL, 1) != 0) {
         exit(1);
     }
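The pipe write that this patch moves from the signal handler into aio_thread() is an instance of the usual "self-pipe" wakeup. A stripped-down, POSIX-only sketch of that mechanism follows; the Notifier type and the notifier_* names are invented for illustration and do not exist in QEMU.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical wrapper around the two ends of a non-blocking pipe. */
typedef struct Notifier {
    int rfd;    /* polled by the main loop */
    int wfd;    /* written by worker threads */
} Notifier;

static int notifier_init(Notifier *n)
{
    int fds[2];

    assert(n != NULL);
    if (pipe(fds) == -1) {
        return -1;
    }
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    fcntl(fds[1], F_SETFL, O_NONBLOCK);
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/* Worker side: queue one wakeup byte. EAGAIN means the pipe is already
 * full, so a wakeup is pending anyway and the error can be ignored. */
static int notifier_signal(Notifier *n)
{
    char byte = 0;
    ssize_t ret;

    do {
        ret = write(n->wfd, &byte, sizeof(byte));
    } while (ret == -1 && errno == EINTR);

    return (ret < 0 && errno != EAGAIN) ? -1 : 0;
}

/* Main-loop side: drain every pending byte and return how many were read. */
static int notifier_drain(Notifier *n)
{
    char buf[64];
    int total = 0;

    for (;;) {
        ssize_t len = read(n->rfd, buf, sizeof(buf));

        if (len > 0) {
            total += (int)len;
            continue;
        }
        if (len == -1 && errno == EINTR) {
            continue;
        }
        break;  /* pipe empty (EAGAIN), closed, or a real error */
    }
    return total;
}
```

Because the byte is written from a normal thread rather than from a signal handler, the handler left behind in the patch only has to call qemu_service_io().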
Re: [Qemu-devel] [PATCH -V3 1/8] hw/9pfs: Add V9fsfidmap in preparation for adding fd reclaim
On Tue, Mar 15, 2011 at 9:20 AM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
On Sun, 13 Mar 2011 20:53:41 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Sun, Mar 13, 2011 at 7:06 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
On Sun, 13 Mar 2011 15:46:29 +, Stefan Hajnoczi stefa...@gmail.com wrote:
On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

@@ -185,17 +188,22 @@ typedef struct V9fsXattr
     int flags;
 } V9fsXattr;

+typedef struct V9fsfidmap {

V9fsFidMap (naming convention)

+    union {
+        int fd;
+        DIR *dir;
+        V9fsXattr xattr;
+    } fs;

The name fs is not meaningful.

+    int fid_type;
+    V9fsString path;
+    int flags;
+} V9fsFidMap;
+
 struct V9fsFidState {
-    int fid_type;
     int32_t fid;
-    V9fsString path;
-    union {
-        int fd;
-        DIR *dir;
-        V9fsXattr xattr;
-    } fs;
     uid_t uid;
+    V9fsFidMap fsmap;

This name is confusing. A map is usually a container that stores key/value pairs. V9fsFidMapEntry would be clearer. But then I thought that is what V9fsFidState is?

I am bad at naming. I wanted to indicate something that can be shared across multiple fids and also indicate the local file system mapping/data. I will take any suggestion.

Where does sharing happen? I didn't notice any code that shares fds between fids.

That patch is not yet there. We can only share an fd if the open flags match. Hence we make sure to open files on the host with a limited set of open flags, which enables much better sharing.

Tracking open flags is fine and is needed for fd reclaim. But splitting V9fsFidState into the V9fsFidMap structure, and all the churn that causes to the code, isn't necessary yet. Please wait with that until you submit patches for fd sharing.

The reason I make this suggestion is that everyone reading or working on the code until fd sharing is added now needs to deal with the V9fsFidMap structure, which (currently) serves no purpose.

Stefan
[Qemu-devel] [v1 PATCH 2/3]: Helper routines to use GLib threadpool infrastructure in 9pfs.
* Arun R Bharadwaj a...@linux.vnet.ibm.com [2011-03-15 16:04:53]:

Author: Arun R Bharadwaj a...@linux.vnet.ibm.com
Date: Thu Mar 10 15:11:49 2011 +0530

    Helper routines to use GLib threadpool infrastructure in 9pfs.

    This patch creates helper routines to make use of the
    threadpool infrastructure provided by GLib. This is based
    on the prototype patch by Anthony which does a similar thing
    for posix-aio-compat.c

    An example use case is provided in the next patch where one
    of the syscalls in 9pfs is converted into the threaded model
    using these helper routines.

    Signed-off-by: Arun R Bharadwaj a...@linux.vnet.ibm.com
    Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index dceefd5..cf61345 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -18,6 +18,8 @@
 #include "fsdev/qemu-fsdev.h"
 #include "virtio-9p-debug.h"
 #include "virtio-9p-xattr.h"
+#include <signal.h>
+#include "qemu-thread.h"

 int debug_9p_pdu;
 static void v9fs_reclaim_fd(V9fsState *s);
@@ -36,6 +38,89 @@ enum {
     Oappend = 0x80,
 };

+typedef struct V9fsPool {
+    GThreadPool *pool;
+    GList *requests;
+    int rfd;
+    int wfd;
+} V9fsPool;
+
+static V9fsPool v9fs_pool;
+
+static void v9fs_qemu_submit_request(V9fsRequest *req)
+{
+    V9fsPool *p = &v9fs_pool;
+
+    p->requests = g_list_append(p->requests, req);
+    g_thread_pool_push(v9fs_pool.pool, req, NULL);
+}
+
+static void die2(int err, const char *what)
+{
+    fprintf(stderr, "%s failed: %s\n", what, strerror(err));
+    abort();
+}
+
+static void die(const char *what)
+{
+    die2(errno, what);
+}
+
+static void v9fs_qemu_process_post_ops(void *arg)
+{
+    struct V9fsPool *p = &v9fs_pool;
+    struct V9fsPostOp *post_op;
+    char byte;
+    ssize_t len;
+    GList *cur_req, *next_req;
+
+    do {
+        len = read(p->rfd, &byte, sizeof(byte));
+    } while (len == -1 && errno == EINTR);
+
+    for (cur_req = p->requests; cur_req != NULL; cur_req = next_req) {
+        V9fsRequest *req = cur_req->data;
+        next_req = g_list_next(cur_req);
+
+        if (!req->done) {
+            continue;
+        }
+
+        post_op = &req->post_op;
+        post_op->func(post_op->arg);
+        p->requests = g_list_remove_link(p->requests, cur_req);
+        g_list_free(p->requests);
+    }
+}
+
+static inline void v9fs_thread_signal(void)
+{
+    struct V9fsPool *p = &v9fs_pool;
+    char byte = 0;
+    ssize_t ret;
+
+    do {
+        ret = write(p->wfd, &byte, sizeof(byte));
+    } while (ret == -1 && errno == EINTR);
+
+    if (ret < 0 && errno != EAGAIN) {
+        die("write() in v9fs");
+    }
+
+    if (kill(getpid(), SIGUSR2)) {
+        die("kill failed");
+    }
+}
+
+static void v9fs_thread_routine(gpointer data, gpointer user_data)
+{
+    V9fsRequest *req = data;
+
+    req->func(req);
+    v9fs_thread_signal();
+    req->done = 1;
+}
+
 static int omode_to_uflags(int8_t mode)
 {
     int ret = 0;
@@ -3850,7 +3935,8 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf)
     int i, len;
     struct stat stat;
     FsTypeEntry *fse;
-
+    int fds[2];
+    V9fsPool *p = &v9fs_pool;

     s = (V9fsState *)virtio_common_init("virtio-9p",
                                         VIRTIO_ID_9P,
@@ -3939,5 +4025,21 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf)
                         s->tag_len;
     s->vdev.get_config = virtio_9p_get_config;

+    if (qemu_pipe(fds) == -1) {
+        fprintf(stderr, "failed to create fd's for virtio-9p\n");
+        exit(1);
+    }
+
+    p->pool = g_thread_pool_new(v9fs_thread_routine, p, 8, FALSE, NULL);
+    p->rfd = fds[0];
+    p->wfd = fds[1];
+
+    fcntl(p->rfd, F_SETFL, O_NONBLOCK);
+    fcntl(p->wfd, F_SETFL, O_NONBLOCK);
+
+    qemu_set_fd_handler(p->rfd, v9fs_qemu_process_post_ops, NULL, NULL);
+
+    (void) v9fs_qemu_submit_request;
+
     return &s->vdev;
 }
diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index 10809ba..e7d2326 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -124,6 +124,20 @@ struct V9fsPDU
     QLIST_ENTRY(V9fsPDU) next;
 };

+typedef struct V9fsPostOp {
+    /* Post Operation routine to execute after executing syscall */
+    void (*func)(void *arg);
+    void *arg;
+} V9fsPostOp;
+
+typedef struct V9fsRequest {
+    void (*func)(struct V9fsRequest *req);
+
+    /* Flag to indicate that request is satisfied, ready for post-processing */
+    int done;
+
+    V9fsPostOp post_op;
+} V9fsRequest;

 /* FIXME
* 1) change user needs to set groups and stuff
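For readers who have not met the notification scheme used by the helper routines above: a worker thread writes one byte into a non-blocking pipe, and the main loop reads the pipe when its read end becomes ready. The following is only a minimal standalone sketch of that idea using plain POSIX calls. The names Notifier, notifier_init, notifier_signal and notifier_drain are made up for illustration and are not part of the patch or of QEMU.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical notifier: one pipe, both ends switched to non-blocking. */
typedef struct Notifier {
    int rfd;    /* read end, watched by the main loop */
    int wfd;    /* write end, used by worker threads */
} Notifier;

/* Add O_NONBLOCK to a descriptor while keeping its other status flags. */
static int set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pipe; returns 0 on success, -1 on failure. */
int notifier_init(Notifier *n)
{
    int fds[2];

    assert(n != NULL);
    if (pipe(fds) == -1) {
        return -1;
    }
    if (set_nonblock(fds[0]) == -1 || set_nonblock(fds[1]) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/*
 * Worker side: write one byte, retrying on EINTR. A full pipe (EAGAIN)
 * is treated as success because a wakeup is already pending.
 */
int notifier_signal(Notifier *n)
{
    char byte = 0;
    ssize_t ret;

    do {
        ret = write(n->wfd, &byte, sizeof(byte));
    } while (ret == -1 && errno == EINTR);

    if (ret < 0 && errno != EAGAIN) {
        return -1;
    }
    return 0;
}

/* Main-loop side: consume everything queued; returns the byte count. */
int notifier_drain(Notifier *n)
{
    char buf[64];
    int total = 0;

    for (;;) {
        ssize_t len = read(n->rfd, buf, sizeof(buf));

        if (len > 0) {
            total += (int)len;
            continue;
        }
        if (len == -1 && errno == EINTR) {
            continue;
        }
        break;  /* EAGAIN (pipe empty), EOF, or another error */
    }
    return total;
}
```

Unlike the patch, this sketch drains the whole pipe on each wakeup rather than reading a single byte. That is a design choice of the sketch, not a statement about what the patch should do.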
[Qemu-devel] [v1 PATCH 3/3]: Convert v9fs_stat to threaded model.
* Arun R Bharadwaj a...@linux.vnet.ibm.com [2011-03-15 16:04:53]:

Author: Harsh Prateek Bora ha...@linux.vnet.ibm.com
Date:   Mon Mar 14 13:55:37 2011 +0530

    Convert v9fs_stat to threaded model.

    This patch converts the v9fs_stat syscall of 9pfs to the threaded
    model by making use of the helper routines created by the
    previous patch.

    Signed-off-by: Harsh Prateek Bora ha...@linux.vnet.ibm.com
    Signed-off-by: Arun R Bharadwaj a...@linux.vnet.ibm.com

diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
index cf61345..a328e97 100644
--- a/hw/9pfs/virtio-9p.c
+++ b/hw/9pfs/virtio-9p.c
@@ -1458,26 +1458,35 @@ out:
     v9fs_string_free(&aname);
 }

-static void v9fs_stat_post_lstat(V9fsState *s, V9fsStatState *vs, int err)
+static void v9fs_stat_post_lstat(void *opaque)
 {
-    if (err == -1) {
-        err = -errno;
+    V9fsStatState *vs = (V9fsStatState *)opaque;
+    if (vs->err == -1) {
+        vs->err = -(vs->v9fs_errno);
         goto out;
     }

-    err = stat_to_v9stat(s, &vs->fidp->fsmap.path, &vs->stbuf, &vs->v9stat);
-    if (err) {
+    vs->err = stat_to_v9stat(vs->s, &vs->fidp->fsmap.path, &vs->stbuf, &vs->v9stat);
+    if (vs->err) {
         goto out;
     }
     vs->offset += pdu_marshal(vs->pdu, vs->offset, "wS", 0, &vs->v9stat);
-    err = vs->offset;
+    vs->err = vs->offset;

 out:
-    complete_pdu(s, vs->pdu, err);
+    complete_pdu(vs->s, vs->pdu, vs->err);
     v9fs_stat_free(&vs->v9stat);
     qemu_free(vs);
 }

+static void v9fs_stat_do_lstat(V9fsRequest *request)
+{
+    V9fsStatState *vs = container_of(request, V9fsStatState, request);
+
+    vs->err = v9fs_do_lstat(vs->s, &vs->fidp->fsmap.path, &vs->stbuf);
+    vs->v9fs_errno = errno;
+}
+
 static void v9fs_stat(V9fsState *s, V9fsPDU *pdu)
 {
     int32_t fid;
@@ -1487,6 +1496,10 @@ static void v9fs_stat(V9fsState *s, V9fsPDU *pdu)
     vs = qemu_malloc(sizeof(*vs));
     vs->pdu = pdu;
     vs->offset = 7;
+    vs->s = s;
+    vs->request.func = v9fs_stat_do_lstat;
+    vs->request.post_op.func = v9fs_stat_post_lstat;
+    vs->request.post_op.arg = vs;

     memset(&vs->v9stat, 0, sizeof(vs->v9stat));

@@ -1498,8 +1511,11 @@ static void v9fs_stat(V9fsState *s, V9fsPDU *pdu)
         goto out;
     }

+    /*
     err = v9fs_do_lstat(s, &vs->fidp->fsmap.path, &vs->stbuf);
     v9fs_stat_post_lstat(s, vs, err);
+    */
+    v9fs_qemu_submit_request(&vs->request);
     return;

 out:
diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
index e7d2326..1d6c17c 100644
--- a/hw/9pfs/virtio-9p.h
+++ b/hw/9pfs/virtio-9p.h
@@ -271,6 +271,10 @@ typedef struct V9fsStatState {
     V9fsStat v9stat;
     V9fsFidState *fidp;
     struct stat stbuf;
+    V9fsState *s;
+    int err;
+    int v9fs_errno;
+    V9fsRequest request;
 } V9fsStatState;

 typedef struct V9fsStatDotl {
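The conversion above relies on embedding the request inside the per-operation state and recovering the outer structure with container_of. A minimal standalone sketch of that pattern follows. Request, StatState, stat_worker and run_stat_example are hypothetical stand-ins invented for this illustration, and the container_of macro is the usual offsetof-based definition, which is assumed here rather than taken from QEMU's headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Conventional container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic request handed to a worker thread. */
typedef struct Request {
    void (*func)(struct Request *req);
    int done;
} Request;

/* Hypothetical per-operation state with the request embedded by value. */
typedef struct StatState {
    int err;
    int saved_errno;
    Request request;
} StatState;

/* Worker callback: only sees the Request, recovers the StatState around it. */
static void stat_worker(Request *req)
{
    StatState *vs = container_of(req, StatState, request);

    vs->err = -1;               /* pretend the syscall failed ...        */
    vs->saved_errno = ENOENT;   /* ... and capture errno in this thread  */
    req->done = 1;
}

/* Runs the worker inline and applies the same post-processing rule. */
int run_stat_example(void)
{
    StatState vs = { 0 };

    vs.request.func = stat_worker;
    vs.request.func(&vs.request);
    assert(vs.request.done == 1);

    return vs.err == -1 ? -vs.saved_errno : vs.err;
}
```

Saving errno inside the worker matters because errno is per-thread: the post-processing step runs in a different thread and would otherwise read an unrelated value.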
[Qemu-devel] Re: Write cache enable from guest at runtime
On Mon, Mar 14, 2011 at 07:15:14PM +, Stefan Hajnoczi wrote:
> Sounds like a good idea. Feel free to post the patches RFC and I or
> someone else can debug and polish them if you don't have time.

By looking at your document and doing what you recommend against, I think I got a much simpler solution than what I had before. I didn't know we could actually write to the config space. The synchronous vcpu behaviour is not a problem as it's done seldom, and we can trivially check for errors by reading the value back.

I've just started building this variant, and it seems surprisingly simple.
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Tue, Mar 15, 2011 at 9:19 AM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
> On Mon, 14 Mar 2011 10:20:57 +, Stefan Hajnoczi stefa...@gmail.com wrote:
>> On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V aneesh.ku...@linux.vnet.ibm.com wrote:
>>> On Sun, 13 Mar 2011 17:23:50 +, Stefan Hajnoczi stefa...@gmail.com wrote:
>>>> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:
>>>>> cache=none implies the file are opened in the host with O_SYNC open flag
>>>>
>>>> O_SYNC does not bypass the host page cache. It ensures that writes
>>>> only complete once data has been written to the disk.
>>>>
>>>> O_DIRECT is a hint to bypass the host page cache when possible.
>>>>
>>>> A boolean on|off option would be nicer than an option that takes the
>>>> special string none. For example, direct=on|off. It also makes the
>>>> code nicer by using bools instead of strdup strings that get leaked.
>>>
>>> What i wanted is the O_SYNC behavior. Well the comment should be
>>> updated. I want to make sure that we don't have dirty data in host
>>> page cache after a write. It is always good to make read hit the page
>>> cache
>>
>> Why silently enforce O_SYNC on the server side? The client does not
>> know whether or not O_SYNC is in effect, cannot take advantage of that
>> knowledge, and cannot control it.
>>
>> I think a more useful solution is a 9p client mount option called sync
>> that caused the client to always add O_SYNC and skip syncfs. The whole
>> stack becomes aware of O_SYNC and clients are in control over whether
>> or not they need O_SYNC semantics.
>
> The cache=none specifically enables us to ignore the tsyncfs request on
> host. tsyncfs on host can be really slow in certain setup.

If I'm a client with the sync mount option all my fids are O_SYNC and I do not need to send TSYNCFS requests to the server because my fids are already stable.

Stefan
[Qemu-devel] Re: [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
On 14.03.2011 18:48, Anthony Liguori wrote:
> I've got a spec written up at http://wiki.qemu.org/Features/QCFG.
> Initial code is in my QAPI tree.

One question about a small detail on this wiki page:

    typedef struct BlockdevConfig {
        char * file;
        struct BlockdevConfig * backing_file;
        struct BlockdevConfig * next;
    } BlockdevConfig;

What is the 'next' pointer used for, are you going to store a list of all -blockdev options used? And why isn't it a QLIST or something?

Kevin
[Qemu-devel] Re: Write cache enable from guest at runtime
On Tue, Mar 15, 2011 at 10:50 AM, Christoph Hellwig h...@lst.de wrote:
> On Mon, Mar 14, 2011 at 07:15:14PM +, Stefan Hajnoczi wrote:
>> Sounds like a good idea. Feel free to post the patches RFC and I or
>> someone else can debug and polish them if you don't have time.
>
> By looking at your document and doing what you recommend against, I
> think I got a much simpler solution than what I had before. I didn't
> know we could actually write to the config space. The synchronous vcpu
> behaviour is not a problem as it's done seldom, and we can trivially
> check for errors by reading the value back.
>
> I've just started building this variant, and it seems surprisingly
> simple.

I didn't think about just reading the value back.

What do you think about reopening the file via /proc/$pid/fd/$old_fd? I wrote a test and had a look at the proc fd code. It seems to work fine and doesn't require an O_SYNC runtime change kernel patch. It allows us to fall back to the old fd if the new file cannot be opened and it works even when the old file has been deleted.

Stefan
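To make the /proc reopen idea concrete, here is a rough, Linux-only sketch of how such a helper could look. It is not the test mentioned above and not code from any posted patch. The function name reopen_with_flags is invented for illustration, and /proc/self/fd is used in place of /proc/$pid/fd on the assumption that the process reopens its own descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Reopen old_fd with a new set of open(2) flags via /proc/self/fd.
 * On success the old descriptor is closed and the new one is returned.
 * On failure old_fd is left open and returned unchanged, so the caller
 * can keep running with the previous flags.
 */
int reopen_with_flags(int old_fd, int flags)
{
    char path[64];
    int new_fd;

    assert(old_fd >= 0);
    snprintf(path, sizeof(path), "/proc/self/fd/%d", old_fd);

    new_fd = open(path, flags);
    if (new_fd == -1) {
        return old_fd;      /* fall back to the old descriptor */
    }
    close(old_fd);
    return new_fd;
}
```

A caller that wants to turn the write cache off at runtime might call reopen_with_flags(fd, O_RDWR | O_SYNC) and then compare the result with the old value to see whether the switch took effect.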
[Qemu-devel] [PATCH v2 01/20] Implement qemu_kvm_eat_signals only for CONFIG_LINUX
qemu_kvm_eat_signals requires POSIX support with realtime extensions for
sigtimedwait. Not all our target platforms provide this. Moreover,
undefined sigbus_reraise was referenced on non-Linux as well.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
CC: Andreas Färber andreas.faer...@web.de
---
 cpus.c | 94
 1 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/cpus.c b/cpus.c
index 077729c..26e5bba 100644
--- a/cpus.c
+++ b/cpus.c
@@ -245,11 +245,58 @@ static void qemu_init_sigbus(void)
     prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
 }

+static void qemu_kvm_eat_signals(CPUState *env)
+{
+    struct timespec ts = { 0, 0 };
+    siginfo_t siginfo;
+    sigset_t waitset;
+    sigset_t chkset;
+    int r;
+
+    sigemptyset(&waitset);
+    sigaddset(&waitset, SIG_IPI);
+    sigaddset(&waitset, SIGBUS);
+
+    do {
+        r = sigtimedwait(&waitset, &siginfo, &ts);
+        if (r == -1 && !(errno == EAGAIN || errno == EINTR)) {
+            perror("sigtimedwait");
+            exit(1);
+        }
+
+        switch (r) {
+        case SIGBUS:
+            if (kvm_on_sigbus_vcpu(env, siginfo.si_code, siginfo.si_addr)) {
+                sigbus_reraise();
+            }
+            break;
+        default:
+            break;
+        }
+
+        r = sigpending(&chkset);
+        if (r == -1) {
+            perror("sigpending");
+            exit(1);
+        }
+    } while (sigismember(&chkset, SIG_IPI) || sigismember(&chkset, SIGBUS));
+
+#ifndef CONFIG_IOTHREAD
+    if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) {
+        qemu_notify_event();
+    }
+#endif
+}
+
 #else /* !CONFIG_LINUX */

 static void qemu_init_sigbus(void)
 {
 }
+
+static void qemu_kvm_eat_signals(CPUState *env)
+{
+}
 #endif /* !CONFIG_LINUX */

 #ifndef _WIN32
@@ -455,49 +502,6 @@ static void qemu_tcg_init_cpu_signals(void)
 #endif
 }

-static void qemu_kvm_eat_signals(CPUState *env)
-{
-    struct timespec ts = { 0, 0 };
-    siginfo_t siginfo;
-    sigset_t waitset;
-    sigset_t chkset;
-    int r;
-
-    sigemptyset(&waitset);
-    sigaddset(&waitset, SIG_IPI);
-    sigaddset(&waitset, SIGBUS);
-
-    do {
-        r = sigtimedwait(&waitset, &siginfo, &ts);
-        if (r == -1 && !(errno == EAGAIN || errno == EINTR)) {
-            perror("sigtimedwait");
-            exit(1);
-        }
-
-        switch (r) {
-        case SIGBUS:
-            if (kvm_on_sigbus_vcpu(env, siginfo.si_code, siginfo.si_addr)) {
-                sigbus_reraise();
-            }
-            break;
-        default:
-            break;
-        }
-
-        r = sigpending(&chkset);
-        if (r == -1) {
-            perror("sigpending");
-            exit(1);
-        }
-    } while (sigismember(&chkset, SIG_IPI) || sigismember(&chkset, SIGBUS));
-
-#ifndef CONFIG_IOTHREAD
-    if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) {
-        qemu_notify_event();
-    }
-#endif
-}
-
 #else /* _WIN32 */

 HANDLE qemu_event_handle;
@@ -526,10 +530,6 @@ static void qemu_event_increment(void)
     }
 }

-static void qemu_kvm_eat_signals(CPUState *env)
-{
-}
-
 static int qemu_signal_init(void)
 {
     return 0;
--
1.7.1
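As background for the function being moved here: the loop polls for blocked, pending signals with sigtimedwait and a zero timeout, then uses sigpending to decide whether to go around again. A reduced, standalone sketch of that technique for a single signal is below. The helper names block_signal and eat_pending are invented for this example, and the code assumes a platform that provides sigtimedwait (which, as the commit message notes, not every target does).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <time.h>

/* Block signo in the caller so it stays pending instead of being delivered. */
int block_signal(int signo)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/*
 * Consume pending instances of signo without sleeping.
 * Returns how many were consumed, or -1 on an unexpected error.
 */
int eat_pending(int signo)
{
    struct timespec ts = { 0, 0 };  /* zero timeout: poll only */
    sigset_t waitset, chkset;
    siginfo_t info;
    int eaten = 0;

    assert(signo > 0);
    sigemptyset(&waitset);
    sigaddset(&waitset, signo);

    do {
        int r = sigtimedwait(&waitset, &info, &ts);

        /* EAGAIN means nothing was pending; EINTR means try again. */
        if (r == -1 && !(errno == EAGAIN || errno == EINTR)) {
            return -1;
        }
        if (r == signo) {
            eaten++;
        }
        if (sigpending(&chkset) == -1) {
            return -1;
        }
    } while (sigismember(&chkset, signo));

    return eaten;
}
```

Standard (non-realtime) signals do not queue, so raising the same blocked signal several times before polling still yields a single pending instance.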
[Qemu-devel] [PATCH v2 08/20] kvm: x86: Do not leave halt if interrupts are disabled
When an external interrupt is pending but IF is cleared, we must not
leave the halt state prematurely.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 target-i386/kvm.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index f7995bd..3a07fce 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1590,7 +1590,9 @@ int kvm_arch_process_async_events(CPUState *env)
         return 0;
     }

-    if (env->interrupt_request & (CPU_INTERRUPT_HARD | CPU_INTERRUPT_NMI)) {
+    if (((env->interrupt_request & CPU_INTERRUPT_HARD) &&
+         (env->eflags & IF_MASK)) ||
+        (env->interrupt_request & CPU_INTERRUPT_NMI)) {
         env->halted = 0;
     }
     if (env->interrupt_request & CPU_INTERRUPT_INIT) {
--
1.7.1
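The rule in this patch can be read as a small truth table: a maskable interrupt wakes a halted CPU only when IF is set, while an NMI always does. A standalone sketch of that predicate follows. The bit values INT_HARD, INT_NMI and FLAG_IF are placeholders chosen for the example and are not claimed to match QEMU's definitions, and should_leave_halt is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit assignments, assumed for illustration only. */
#define INT_HARD 0x0002u    /* external (maskable) interrupt pending */
#define INT_NMI  0x0200u    /* non-maskable interrupt pending        */
#define FLAG_IF  0x0200u    /* interrupt-enable flag in eflags       */

/* True when a halted CPU should resume execution. */
bool should_leave_halt(uint32_t interrupt_request, uint32_t eflags)
{
    bool hard_deliverable =
        (interrupt_request & INT_HARD) && (eflags & FLAG_IF);
    bool nmi_pending = (interrupt_request & INT_NMI) != 0;

    return hard_deliverable || nmi_pending;
}
```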
[Qemu-devel] [PATCH v2 04/20] Break up user and system cpu_interrupt implementations
Both have only two lines in common, and we will convert the system
service into a callback which is of no use for user mode operation.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
CC: Riku Voipio riku.voi...@iki.fi
---
 exec.c | 14 ++
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index c5358c3..12ea582 100644
--- a/exec.c
+++ b/exec.c
@@ -1627,6 +1627,7 @@ static void cpu_unlink_tb(CPUState *env)
     spin_unlock(&interrupt_lock);
 }

+#ifndef CONFIG_USER_ONLY
 /* mask must never be zero, except for A20 change call */
 void cpu_interrupt(CPUState *env, int mask)
 {
@@ -1635,7 +1636,6 @@ void cpu_interrupt(CPUState *env, int mask)
     old_mask = env->interrupt_request;
     env->interrupt_request |= mask;

-#ifndef CONFIG_USER_ONLY
     /*
      * If called from iothread context, wake the target cpu in
      * case its halted.
@@ -1644,21 +1644,27 @@ void cpu_interrupt(CPUState *env, int mask)
         qemu_cpu_kick(env);
         return;
     }
-#endif

     if (use_icount) {
         env->icount_decr.u16.high = 0xffff;
-#ifndef CONFIG_USER_ONLY
         if (!can_do_io(env)
             && (mask & ~old_mask) != 0) {
             cpu_abort(env, "Raised interrupt while not in I/O function");
         }
-#endif
     } else {
         cpu_unlink_tb(env);
     }
 }

+#else /* CONFIG_USER_ONLY */
+
+void cpu_interrupt(CPUState *env, int mask)
+{
+    env->interrupt_request |= mask;
+    cpu_unlink_tb(env);
+}
+#endif /* CONFIG_USER_ONLY */
+
 void cpu_reset_interrupt(CPUState *env, int mask)
 {
     env->interrupt_request &= ~mask;
--
1.7.1
[Qemu-devel] [PATCH v2 12/20] kvm: x86: Synchronize PAT MSR with the kernel
Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 target-i386/kvm.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 3a07fce..032bc3e 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -861,6 +861,7 @@ static int kvm_put_msrs(CPUState *env, int level)
     kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_CS, env->sysenter_cs);
     kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_ESP, env->sysenter_esp);
     kvm_msr_entry_set(&msrs[n++], MSR_IA32_SYSENTER_EIP, env->sysenter_eip);
+    kvm_msr_entry_set(&msrs[n++], MSR_PAT, env->pat);
     if (has_msr_star) {
         kvm_msr_entry_set(&msrs[n++], MSR_STAR, env->star);
     }
@@ -1113,6 +1114,7 @@ static int kvm_get_msrs(CPUState *env)
     msrs[n++].index = MSR_IA32_SYSENTER_CS;
     msrs[n++].index = MSR_IA32_SYSENTER_ESP;
     msrs[n++].index = MSR_IA32_SYSENTER_EIP;
+    msrs[n++].index = MSR_PAT;
     if (has_msr_star) {
         msrs[n++].index = MSR_STAR;
     }
@@ -1168,6 +1170,9 @@ static int kvm_get_msrs(CPUState *env)
         case MSR_IA32_SYSENTER_EIP:
             env->sysenter_eip = msrs[i].data;
             break;
+        case MSR_PAT:
+            env->pat = msrs[i].data;
+            break;
         case MSR_STAR:
             env->star = msrs[i].data;
             break;
--
1.7.1
[Qemu-devel] [PATCH v2 05/20] Redirect cpu_interrupt to callback handler
This allows to override the interrupt handling of QEMU in system mode.
KVM will make use of it to set optimized handlers.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cpu-all.h | 14 +-
 exec.c|4 +++-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 4f4631d..5835cfa 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -790,7 +790,19 @@ extern CPUState *cpu_single_env;
 #define CPU_INTERRUPT_SIPI   0x800    /* SIPI pending. */
 #define CPU_INTERRUPT_MCE    0x1000   /* (x86 only) MCE pending. */

-void cpu_interrupt(CPUState *s, int mask);
+#ifndef CONFIG_USER_ONLY
+typedef void (*CPUInterruptHandler)(CPUState *, int);
+
+extern CPUInterruptHandler cpu_interrupt_handler;
+
+static inline void cpu_interrupt(CPUState *s, int mask)
+{
+    cpu_interrupt_handler(s, mask);
+}
+#else /* USER_ONLY */
+void cpu_interrupt(CPUState *env, int mask);
+#endif /* USER_ONLY */
+
 void cpu_reset_interrupt(CPUState *env, int mask);

 void cpu_exit(CPUState *s);
diff --git a/exec.c b/exec.c
index 12ea582..b59f7ff 100644
--- a/exec.c
+++ b/exec.c
@@ -1629,7 +1629,7 @@ static void cpu_unlink_tb(CPUState *env)

 #ifndef CONFIG_USER_ONLY
 /* mask must never be zero, except for A20 change call */
-void cpu_interrupt(CPUState *env, int mask)
+static void tcg_handle_interrupt(CPUState *env, int mask)
 {
     int old_mask;

@@ -1656,6 +1656,8 @@ void cpu_interrupt(CPUState *env, int mask)
     }
 }

+CPUInterruptHandler cpu_interrupt_handler = tcg_handle_interrupt;
+
 #else /* CONFIG_USER_ONLY */

 void cpu_interrupt(CPUState *env, int mask)
--
1.7.1
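The mechanism here is ordinary function-pointer dispatch: a global handler pointer with a default implementation, an inline wrapper that calls through it, and an accelerator that swaps in its own handler at init time. A self-contained sketch of the same shape is below. Cpu, raise_interrupt, install_accel_handler and the two handlers are hypothetical names for this illustration and deliberately do not reuse the QEMU identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced CPU state. */
typedef struct Cpu {
    int interrupt_request;
    int kicked;
} Cpu;

typedef void (*InterruptHandler)(Cpu *, int);

/* Default handler: only records the request. */
static void default_handle_interrupt(Cpu *cpu, int mask)
{
    cpu->interrupt_request |= mask;
}

/* Alternative handler: records the request and notes that the CPU was kicked. */
static void accel_handle_interrupt(Cpu *cpu, int mask)
{
    cpu->interrupt_request |= mask;
    cpu->kicked = 1;
}

/* Global dispatch pointer, initialised to the default implementation. */
InterruptHandler interrupt_handler = default_handle_interrupt;

/* Callers use this wrapper and never name a concrete handler. */
static inline void raise_interrupt(Cpu *cpu, int mask)
{
    assert(interrupt_handler != NULL);
    interrupt_handler(cpu, mask);
}

/* An accelerator would call this once during its own initialisation. */
void install_accel_handler(void)
{
    interrupt_handler = accel_handle_interrupt;
}
```

Call sites stay unchanged when the handler is replaced, which is the point of routing everything through the wrapper.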
[Qemu-devel] [PATCH v2 02/20] x86: Unbreak TCG support for hardware breakpoints
Commit 83f338f73e broke x86 hardware breakpoint emulation by moving the
debug exception handling out of cpu_exec. Fix this by moving all TCG
related bits back, only leaving the generic guest debugging parts in
cpus.c.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
CC: TeLeMan gele...@gmail.com
---
 cpu-exec.c | 27 +++
 cpus.c | 27 +++
 2 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 34eaedc..5cc9379 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -196,6 +196,30 @@ static inline TranslationBlock *tb_find_fast(void)
     return tb;
 }

+static CPUDebugExcpHandler *debug_excp_handler;
+
+CPUDebugExcpHandler *cpu_set_debug_excp_handler(CPUDebugExcpHandler *handler)
+{
+    CPUDebugExcpHandler *old_handler = debug_excp_handler;
+
+    debug_excp_handler = handler;
+    return old_handler;
+}
+
+static void cpu_handle_debug_exception(CPUState *env)
+{
+    CPUWatchpoint *wp;
+
+    if (!env->watchpoint_hit) {
+        QTAILQ_FOREACH(wp, &env->watchpoints, entry) {
+            wp->flags &= ~BP_WATCHPOINT_HIT;
+        }
+    }
+    if (debug_excp_handler) {
+        debug_excp_handler(env);
+    }
+}
+
 /* main execution loop */

 volatile sig_atomic_t exit_request;
@@ -269,6 +293,9 @@ int cpu_exec(CPUState *env1)
                 if (env->exception_index >= EXCP_INTERRUPT) {
                     /* exit request from the cpu execution loop */
                     ret = env->exception_index;
+                    if (ret == EXCP_DEBUG) {
+                        cpu_handle_debug_exception(env);
+                    }
                     break;
                 } else {
 #if defined(CONFIG_USER_ONLY)
diff --git a/cpus.c b/cpus.c
index 26e5bba..975a6ce 100644
--- a/cpus.c
+++ b/cpus.c
@@ -166,29 +166,8 @@ static bool all_cpu_threads_idle(void)
     return true;
 }

-static CPUDebugExcpHandler *debug_excp_handler;
-
-CPUDebugExcpHandler *cpu_set_debug_excp_handler(CPUDebugExcpHandler *handler)
-{
-    CPUDebugExcpHandler *old_handler = debug_excp_handler;
-
-    debug_excp_handler = handler;
-    return old_handler;
-}
-
-static void cpu_handle_debug_exception(CPUState *env)
+static void cpu_handle_guest_debug(CPUState *env)
 {
-    CPUWatchpoint *wp;
-
-    if (!env->watchpoint_hit) {
-        QTAILQ_FOREACH(wp, &env->watchpoints, entry) {
-            wp->flags &= ~BP_WATCHPOINT_HIT;
-        }
-    }
-    if (debug_excp_handler) {
-        debug_excp_handler(env);
-    }
-
     gdb_set_stop_cpu(env);
     qemu_system_debug_request();
 #ifdef CONFIG_IOTHREAD
@@ -818,7 +797,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
         if (cpu_can_run(env)) {
             r = kvm_cpu_exec(env);
             if (r == EXCP_DEBUG) {
-                cpu_handle_debug_exception(env);
+                cpu_handle_guest_debug(env);
             }
         }
         qemu_kvm_wait_io_event(env);
@@ -1110,7 +1089,7 @@ bool cpu_exec_all(void)
                 r = tcg_cpu_exec(env);
             }
             if (r == EXCP_DEBUG) {
-                cpu_handle_debug_exception(env);
+                cpu_handle_guest_debug(env);
                 break;
             }
         } else if (env->stop || env->stopped) {
--
1.7.1
[Qemu-devel] [PATCH v2 06/20] kvm: Install optimized interrupt handler
KVM only requires to set the raised IRQ in CPUState and to kick the
receiving vcpu if it is remote.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c | 11 +++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 226843c..25ab545 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -650,6 +650,15 @@ static CPUPhysMemoryClient kvm_cpu_phys_memory_client = {
     .log_stop = kvm_log_stop,
 };

+static void kvm_handle_interrupt(CPUState *env, int mask)
+{
+    env->interrupt_request |= mask;
+
+    if (!qemu_cpu_is_self(env)) {
+        qemu_cpu_kick(env);
+    }
+}
+
 int kvm_init(void)
 {
     static const char upgrade_note[] =
@@ -758,6 +767,8 @@ int kvm_init(void)

     s->many_ioeventfds = kvm_check_many_ioeventfds();

+    cpu_interrupt_handler = kvm_handle_interrupt;
+
     return 0;

 err:
--
1.7.1
[Qemu-devel] [PATCH v2 17/20] kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
Make the return code of kvm_arch_handle_exit directly usable for
kvm_cpu_exec. This is straightforward for x86 and ppc, just s390
would require more work. Avoid this for now by pushing the return code
translation logic into s390's kvm_arch_handle_exit.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
CC: Alexander Graf ag...@suse.de
---
 kvm-all.c |5 -
 target-i386/kvm.c |8
 target-ppc/kvm.c |8
 target-s390x/kvm.c |5 +
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index e6ff95c..78e4fbf 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1000,11 +1000,6 @@ int kvm_cpu_exec(CPUState *env)
         default:
             DPRINTF("kvm_arch_handle_exit\n");
             ret = kvm_arch_handle_exit(env, run);
-            if (ret == 0) {
-                ret = EXCP_INTERRUPT;
-            } else if (ret > 0) {
-                ret = 0;
-            }
             break;
         }
     } while (ret == 0);
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 032bc3e..6f84610 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1618,10 +1618,10 @@ static int kvm_handle_halt(CPUState *env)
           (env->eflags & IF_MASK)) &&
         !(env->interrupt_request & CPU_INTERRUPT_NMI)) {
         env->halted = 1;
-        return 0;
+        return EXCP_HLT;
     }

-    return 1;
+    return 0;
 }

 static bool host_supports_vmx(void)
@@ -1637,7 +1637,7 @@ static bool host_supports_vmx(void)
 int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
 {
     uint64_t code;
-    int ret = 0;
+    int ret;

     switch (run->exit_reason) {
     case KVM_EXIT_HLT:
@@ -1645,7 +1645,7 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
         ret = kvm_handle_halt(env);
         break;
     case KVM_EXIT_SET_TPR:
-        ret = 1;
+        ret = 0;
         break;
     case KVM_EXIT_FAIL_ENTRY:
         code = run->fail_entry.hardware_entry_failure_reason;
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 6c99a16..593eb98 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -271,7 +271,7 @@ static int kvmppc_handle_halt(CPUState *env)
         env->exception_index = EXCP_HLT;
     }

-    return 1;
+    return 0;
 }

 /* map dcr access to existing qemu dcr emulation */
@@ -280,7 +280,7 @@ static int kvmppc_handle_dcr_read(CPUState *env, uint32_t dcrn, uint32_t *data)
     if (ppc_dcr_read(env->dcr_env, dcrn, data) < 0)
         fprintf(stderr, "Read to unhandled DCR (0x%x)\n", dcrn);

-    return 1;
+    return 0;
 }

 static int kvmppc_handle_dcr_write(CPUState *env, uint32_t dcrn, uint32_t data)
@@ -288,12 +288,12 @@ static int kvmppc_handle_dcr_write(CPUState *env, uint32_t dcrn, uint32_t data)
     if (ppc_dcr_write(env->dcr_env, dcrn, data) < 0)
         fprintf(stderr, "Write to unhandled DCR (0x%x)\n", dcrn);

-    return 1;
+    return 0;
 }

 int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
 {
-    int ret = 0;
+    int ret;

     switch (run->exit_reason) {
     case KVM_EXIT_DCR:
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index a85ae0f..9123203 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -497,6 +497,11 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
             break;
     }

+    if (ret == 0) {
+        ret = EXCP_INTERRUPT;
+    } else if (ret > 0) {
+        ret = 0;
+    }
     return ret;
 }

--
1.7.1
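The translation step that this patch moves into the s390 code is small enough to show in isolation. Under the old convention 0 meant "stop and return to the caller" and a positive value meant "handled, re-enter the guest". Under the new one 0 means re-enter and a positive EXCP_* code means leave the loop. The sketch below restates that mapping as a free function. EXCP_INTERRUPT_SKETCH is a placeholder constant whose value is assumed for the example, and translate_legacy_exit_code is an invented name.

```c
#include <assert.h>

/* Placeholder for the real EXCP_INTERRUPT; the value is an assumption. */
#define EXCP_INTERRUPT_SKETCH 0x10000

/*
 * Map an old-style arch exit code onto the new convention:
 *   old 0  (stop)             -> EXCP_INTERRUPT (leave the run loop)
 *   old >0 (handled)          -> 0              (re-enter the guest)
 *   old <0 (error)            -> unchanged
 */
int translate_legacy_exit_code(int legacy)
{
    assert(EXCP_INTERRUPT_SKETCH > 0);
    if (legacy == 0) {
        return EXCP_INTERRUPT_SKETCH;
    }
    if (legacy > 0) {
        return 0;
    }
    return legacy;
}
```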
[Qemu-devel] [PATCH v2 10/20] x86: Properly reset PAT MSR
Conforming to the Intel spec, set the power-on value of PAT also on
reset, but save it across INIT.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 target-i386/cpu.h|4 ++--
 target-i386/cpuid.c |1 -
 target-i386/helper.c |5 +
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index d0eae75..c7047d5 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -685,8 +685,6 @@ typedef struct CPUX86State {

     uint64_t tsc;

-    uint64_t pat;
-
     uint64_t mcg_status;

     /* exception/interrupt handling */
@@ -707,6 +705,8 @@ typedef struct CPUX86State {

     CPU_COMMON

+    uint64_t pat;
+
     /* processor features (e.g. for CPUID insn) */
     uint32_t cpuid_level;
     uint32_t cpuid_vendor1;
diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 5382a28..814d13e 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -847,7 +847,6 @@ int cpu_x86_register (CPUX86State *env, const char *cpu_model)
     env->cpuid_version |= ((def->model & 0xf) << 4) | ((def->model >> 4) << 16);
     env->cpuid_version |= def->stepping;
     env->cpuid_features = def->features;
-    env->pat = 0x0007040600070406ULL;
     env->cpuid_ext_features = def->ext_features;
     env->cpuid_ext2_features = def->ext2_features;
     env->cpuid_ext3_features = def->ext3_features;
diff --git a/target-i386/helper.c b/target-i386/helper.c
index a08309f..d15fca5 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -99,6 +99,8 @@ void cpu_reset(CPUX86State *env)

     env->mxcsr = 0x1f80;

+    env->pat = 0x0007040600070406ULL;
+
     memset(env->dr, 0, sizeof(env->dr));
     env->dr[6] = DR6_FIXED_1;
     env->dr[7] = DR7_FIXED_1;
@@ -1280,8 +1282,11 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
 void do_cpu_init(CPUState *env)
 {
     int sipi = env->interrupt_request & CPU_INTERRUPT_SIPI;
+    uint64_t pat = env->pat;
+
     cpu_reset(env);
     env->interrupt_request = sipi;
+    env->pat = pat;
     apic_init_reset(env->apic_state);
     env->halted = !cpu_is_bsp(env);
 }
--
1.7.1
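The difference between the two paths is easy to lose in the diff: a full reset loads the power-on PAT value, whereas INIT goes through the same reset code but puts the previous PAT back afterwards. A toy model of just that behaviour is sketched below. The Cpu struct and the functions cpu_reset_sketch and cpu_init_sketch are simplified stand-ins invented for this note. Only the constant is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Power-on PAT value as used in the patch. */
#define PAT_POWER_ON 0x0007040600070406ULL

/* Simplified CPU model with two registers that are touched by reset. */
typedef struct Cpu {
    uint64_t pat;
    uint32_t mxcsr;
} Cpu;

/* Full reset: every modelled register returns to its power-on value. */
void cpu_reset_sketch(Cpu *c)
{
    c->mxcsr = 0x1f80;
    c->pat = PAT_POWER_ON;
}

/* INIT: reuse the reset path, then restore the state INIT must preserve. */
void cpu_init_sketch(Cpu *c)
{
    uint64_t pat = c->pat;

    assert(c != 0);
    cpu_reset_sketch(c);
    c->pat = pat;
}
```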
[Qemu-devel] [PATCH v2 14/20] kvm: Keep KVM_RUN return value in separate variable
Avoid using 'ret' both for the return value of KVM_RUN as well as the
code kvm_cpu_exec is supposed to return. Both have no direct relation.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c | 10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 982e5cc..99abe82 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -901,7 +901,7 @@ void kvm_cpu_synchronize_post_init(CPUState *env)
 int kvm_cpu_exec(CPUState *env)
 {
     struct kvm_run *run = env->kvm_run;
-    int ret;
+    int ret, run_ret;

     DPRINTF("kvm_cpu_exec()\n");

@@ -931,7 +931,7 @@ int kvm_cpu_exec(CPUState *env)
         cpu_single_env = NULL;
         qemu_mutex_unlock_iothread();

-        ret = kvm_vcpu_ioctl(env, KVM_RUN, 0);
+        run_ret = kvm_vcpu_ioctl(env, KVM_RUN, 0);

         qemu_mutex_lock_iothread();
         cpu_single_env = env;
@@ -939,14 +939,14 @@ int kvm_cpu_exec(CPUState *env)

         kvm_flush_coalesced_mmio_buffer();

-        if (ret == -EINTR || ret == -EAGAIN) {
+        if (run_ret == -EINTR || run_ret == -EAGAIN) {
             DPRINTF("io window exit\n");
             ret = 0;
             break;
         }

-        if (ret < 0) {
-            DPRINTF("kvm run failed %s\n", strerror(-ret));
+        if (run_ret < 0) {
+            DPRINTF("kvm run failed %s\n", strerror(-run_ret));
             abort();
         }

--
1.7.1
[Qemu-devel] [PATCH v2 20/20] Expose thread_id in info cpus
Based on patch by Glauber Costa:

To allow management applications like libvirt to apply CPU affinities to
the VCPU threads, expose their ID via info cpus. This patch provides the
pre-existing and used interface from qemu-kvm.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cpu-defs.h |1 +
 cpus.c |2 ++
 exec.c |3 +++
 monitor.c |4
 os-posix.c | 10 ++
 os-win32.c |5 +
 osdep.h |1 +
 qmp-commands.hx |3 +++
 8 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/cpu-defs.h b/cpu-defs.h
index 2b59fa6..db48a7a 100644
--- a/cpu-defs.h
+++ b/cpu-defs.h
@@ -203,6 +203,7 @@ typedef struct CPUWatchpoint {
     int nr_cores;  /* number of cores within this CPU package */        \
     int nr_threads;/* number of threads within this CPU */              \
     int running; /* Nonzero if cpu is currently running(usermode). */   \
+    int thread_id;                                                      \
     /* user data */                                                     \
     void *opaque;                                                       \
                                                                         \
diff --git a/cpus.c b/cpus.c
index d310b7e..28c2da2 100644
--- a/cpus.c
+++ b/cpus.c
@@ -776,6 +776,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)

     qemu_mutex_lock(&qemu_global_mutex);
     qemu_thread_get_self(env->thread);
+    env->thread_id = qemu_get_thread_id();

     r = kvm_init_vcpu(env);
     if (r < 0) {
@@ -817,6 +818,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
     /* signal CPU creation */
     qemu_mutex_lock(&qemu_global_mutex);
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
+        env->thread_id = qemu_get_thread_id();
         env->created = 1;
     }
     qemu_cond_signal(&qemu_cpu_cond);
diff --git a/exec.c b/exec.c
index b59f7ff..0c80f84 100644
--- a/exec.c
+++ b/exec.c
@@ -638,6 +638,9 @@ void cpu_exec_init(CPUState *env)
     env->numa_node = 0;
     QTAILQ_INIT(&env->breakpoints);
     QTAILQ_INIT(&env->watchpoints);
+#ifndef CONFIG_USER_ONLY
+    env->thread_id = qemu_get_thread_id();
+#endif
     *penv = env;
 #if defined(CONFIG_USER_ONLY)
     cpu_list_unlock();
diff --git a/monitor.c b/monitor.c
index ae20927..481572d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -897,6 +897,9 @@ static void print_cpu_iter(QObject *obj, void *opaque)
         monitor_printf(mon, " (halted)");
     }

+    monitor_printf(mon, " thread_id=%" PRId64 " ",
+                   qdict_get_int(cpu, "thread_id"));
+
     monitor_printf(mon, "\n");
 }

@@ -941,6 +944,7 @@ static void do_info_cpus(Monitor *mon, QObject **ret_data)
 #elif defined(TARGET_MIPS)
         qdict_put(cpu, "PC", qint_from_int(env->active_tc.PC));
 #endif
+        qdict_put(cpu, "thread_id", qint_from_int(env->thread_id));

         qlist_append(cpu_list, cpu);
     }
diff --git a/os-posix.c b/os-posix.c
index 38c29d1..7971f86 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -41,6 +41,7 @@

 #ifdef CONFIG_LINUX
 #include <sys/prctl.h>
+#include <sys/syscall.h>
 #endif

 #ifdef CONFIG_EVENTFD
@@ -382,3 +383,12 @@ int qemu_create_pidfile(const char *filename)

     return 0;
 }
+
+int qemu_get_thread_id(void)
+{
+#if defined (__linux__)
+    return syscall(SYS_gettid);
+#else
+    return getpid();
+#endif
+}
diff --git a/os-win32.c b/os-win32.c
index c971d92..d6d54c6 100644
--- a/os-win32.c
+++ b/os-win32.c
@@ -266,3 +266,8 @@ int qemu_create_pidfile(const char *filename)
     }
     return 0;
 }
+
+int qemu_get_thread_id(void)
+{
+    return GetCurrentThreadId();
+}
diff --git a/osdep.h b/osdep.h
index 27eedcf..748df54 100644
--- a/osdep.h
+++ b/osdep.h
@@ -130,5 +130,6 @@ void qemu_vfree(void *ptr);
 int qemu_madvise(void *addr, size_t len, int advice);

 int qemu_create_pidfile(const char *filename);
+int qemu_get_thread_id(void);

 #endif
diff --git a/qmp-commands.hx b/qmp-commands.hx
index df40a3d..1f72a8d 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1194,6 +1194,7 @@ Return a json-array. Each CPU is represented by a json-object, which contains:
      "nip": PPC (json-int)
      "pc" and "npc": sparc (json-int)
      "PC": mips (json-int)
+- "thread_id": ID of the underlying host thread (json-int)

 Example:

@@ -1205,12 +1206,14 @@ Example:
             "current":true,
             "halted":false,
             "pc":3227107138
+            "thread_id":3134
          },
          {
             "CPU":1,
             "current":false,
             "halted":true,
             "pc":7108165
+            "thread_id":3135
          }
       ]
    }
--
1.7.1
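For anyone wanting to try the POSIX side of this outside of QEMU, the helper boils down to the raw gettid system call on Linux with a getpid fallback elsewhere. The standalone version below mirrors that logic under a different, made-up name (get_thread_id_sketch) so it is not mistaken for the function added by the patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>
#endif

/*
 * Return an identifier for the calling thread that the host tools
 * understand: the kernel thread ID on Linux, the process ID elsewhere.
 */
int get_thread_id_sketch(void)
{
#if defined(__linux__)
    long tid = syscall(SYS_gettid);

    assert(tid > 0);
    return (int)tid;
#else
    return (int)getpid();
#endif
}
```

On Linux the value returned in the initial thread equals the process ID, while threads created later get their own IDs. A management tool could hand such an ID to sched_setaffinity or taskset.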
[Qemu-devel] [PATCH v2 07/20] kvm: Add in-kernel irqchip awareness to cpu_thread_is_idle
With in-kernel irqchip support enabled, the vcpu threads sleep in kernel
space while halted. Account for this difference in cpu_thread_is_idle.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 cpus.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/cpus.c b/cpus.c
index 975a6ce..d310b7e 100644
--- a/cpus.c
+++ b/cpus.c
@@ -148,7 +148,8 @@ static bool cpu_thread_is_idle(CPUState *env)
     if (env->stopped || !vm_running) {
         return true;
     }
-    if (!env->halted || qemu_cpu_has_work(env)) {
+    if (!env->halted || qemu_cpu_has_work(env) ||
+        (kvm_enabled() && kvm_irqchip_in_kernel())) {
         return false;
     }
     return true;
--
1.7.1
[Qemu-devel] [PATCH v2 15/20] kvm: Reorder error handling of KVM_RUN
Test for general errors first as this is the slower path.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 kvm-all.c | 11 +--
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 99abe82..59276cd 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -939,13 +939,12 @@ int kvm_cpu_exec(CPUState *env)

         kvm_flush_coalesced_mmio_buffer();

-        if (run_ret == -EINTR || run_ret == -EAGAIN) {
-            DPRINTF("io window exit\n");
-            ret = 0;
-            break;
-        }
-
         if (run_ret < 0) {
+            if (run_ret == -EINTR || run_ret == -EAGAIN) {
+                DPRINTF("io window exit\n");
+                ret = 0;
+                break;
+            }
             DPRINTF("kvm run failed %s\n", strerror(-run_ret));
             abort();
         }
--
1.7.1
[Qemu-devel] [PATCH v2 03/20] s390: Detect invalid invocations of qemu_ram_free/remap
This both detects invalid invocations of qemu_ram_free and
qemu_ram_remap when mem_path is non-NULL and fixes a build error on s390
('area' may be used uninitialized in this function).

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
CC: Alexander Graf ag...@suse.de
---
 exec.c |4
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index 723ace4..c5358c3 100644
--- a/exec.c
+++ b/exec.c
@@ -2931,6 +2931,8 @@ void qemu_ram_free(ram_addr_t addr)
                 } else {
                     qemu_vfree(block->host);
                 }
+#else
+                abort();
 #endif
             } else {
 #if defined(TARGET_S390X) && defined(CONFIG_KVM)
@@ -2979,6 +2981,8 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
                     area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
                                 flags, -1, 0);
                 }
+#else
+                abort();
 #endif
             } else {
 #if defined(TARGET_S390X) && defined(CONFIG_KVM)
--
1.7.1
[Qemu-devel] [PATCH v2 13/20] kvm: Consider EXIT_DEBUG unknown without CAP_SET_GUEST_DEBUG
Without KVM_CAP_SET_GUEST_DEBUG, we neither motivate the kernel to
report KVM_EXIT_DEBUG nor do we expect such exits. So fall through to
the arch code which will simply report an unknown exit reason.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 kvm-all.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 36553fe..982e5cc 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -986,17 +986,17 @@ int kvm_cpu_exec(CPUState *env)
             ret = kvm_handle_internal_error(env, run);
             break;
 #endif
+#ifdef KVM_CAP_SET_GUEST_DEBUG
         case KVM_EXIT_DEBUG:
             DPRINTF("kvm_exit_debug\n");
-#ifdef KVM_CAP_SET_GUEST_DEBUG
             if (kvm_arch_debug(&run->debug.arch)) {
                 ret = EXCP_DEBUG;
                 goto out;
             }
             /* re-enter, this exception was guest-internal */
             ret = 1;
-#endif /* KVM_CAP_SET_GUEST_DEBUG */
             break;
+#endif /* KVM_CAP_SET_GUEST_DEBUG */
         default:
             DPRINTF("kvm_arch_handle_exit\n");
             ret = kvm_arch_handle_exit(env, run);
-- 
1.7.1
[Qemu-devel] [PATCH v2 09/20] kvm: Mark VCPU state dirty on creation
This avoids that early cpu_synchronize_state calls try to retrieve an
uninitialized state from the kernel. That even causes a deadlock if
io-thread is enabled.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 kvm-all.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 25ab545..36553fe 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -211,6 +211,7 @@ int kvm_init_vcpu(CPUState *env)
 
     env->kvm_fd = ret;
     env->kvm_state = s;
+    env->kvm_vcpu_dirty = 1;
 
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
-- 
1.7.1
[Qemu-devel] [PATCH v2 16/20] kvm: Rework inner loop of kvm_cpu_exec
Let kvm_cpu_exec return EXCP_* values consistently and generate those
codes already inside its inner loop. This means we will now re-enter the
kernel while ret == 0.

Update kvm_handle_internal_error accordingly, but keep
kvm_arch_handle_exit untouched, it will be converted in a separate step.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 kvm-all.c |   26 ++++++++++++++------------
 1 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 59276cd..e6ff95c 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -842,7 +842,7 @@ static int kvm_handle_internal_error(CPUState *env, struct kvm_run *run)
         fprintf(stderr, "emulation failure\n");
         if (!kvm_arch_stop_on_emulation_error(env)) {
             cpu_dump_state(env, stderr, fprintf, CPU_DUMP_CODE);
-            return 0;
+            return EXCP_INTERRUPT;
         }
     }
     /* FIXME: Should trigger a qmp message to let management know
@@ -942,14 +942,13 @@ int kvm_cpu_exec(CPUState *env)
         if (run_ret < 0) {
             if (run_ret == -EINTR || run_ret == -EAGAIN) {
                 DPRINTF("io window exit\n");
-                ret = 0;
+                ret = EXCP_INTERRUPT;
                 break;
             }
             DPRINTF("kvm run failed %s\n", strerror(-run_ret));
             abort();
         }
 
-        ret = 0; /* exit loop */
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
@@ -958,7 +957,7 @@ int kvm_cpu_exec(CPUState *env)
                           run->io.direction,
                           run->io.size,
                           run->io.count);
-            ret = 1;
+            ret = 0;
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
@@ -966,14 +965,16 @@ int kvm_cpu_exec(CPUState *env)
                                    run->mmio.data,
                                    run->mmio.len,
                                    run->mmio.is_write);
-            ret = 1;
+            ret = 0;
             break;
         case KVM_EXIT_IRQ_WINDOW_OPEN:
             DPRINTF("irq_window_open\n");
+            ret = EXCP_INTERRUPT;
             break;
         case KVM_EXIT_SHUTDOWN:
             DPRINTF("shutdown\n");
             qemu_system_reset_request();
+            ret = EXCP_INTERRUPT;
             break;
         case KVM_EXIT_UNKNOWN:
             fprintf(stderr, "KVM: unknown exit, hardware reason %" PRIx64 "\n",
@@ -990,28 +991,29 @@ int kvm_cpu_exec(CPUState *env)
             DPRINTF("kvm_exit_debug\n");
             if (kvm_arch_debug(&run->debug.arch)) {
                 ret = EXCP_DEBUG;
-                goto out;
+                break;
             }
             /* re-enter, this exception was guest-internal */
-            ret = 1;
+            ret = 0;
             break;
 #endif /* KVM_CAP_SET_GUEST_DEBUG */
         default:
             DPRINTF("kvm_arch_handle_exit\n");
             ret = kvm_arch_handle_exit(env, run);
+            if (ret == 0) {
+                ret = EXCP_INTERRUPT;
+            } else if (ret > 0) {
+                ret = 0;
+            }
             break;
         }
-    } while (ret > 0);
+    } while (ret == 0);
 
     if (ret < 0) {
         cpu_dump_state(env, stderr, fprintf, CPU_DUMP_CODE);
         vm_stop(VMSTOP_PANIC);
     }
-    ret = EXCP_INTERRUPT;
 
-#ifdef KVM_CAP_SET_GUEST_DEBUG
-out:
-#endif
     env->exit_request = 0;
     cpu_single_env = NULL;
     return ret;
-- 
1.7.1
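The return-code convention this patch introduces can be illustrated with a small standalone sketch. It is not QEMU code: the `sketch_` names and the constant values are assumptions made for illustration, and the "arch handler" is replaced by a scripted array of results. It only shows how the legacy convention (> 0 re-enter, 0 leave, < 0 error) maps onto the new inner-loop convention (0 re-enter, EXCP_* leave, < 0 error).

```c
#include <stdio.h>

/* Stand-in values for illustration; QEMU defines its own EXCP_* constants. */
enum { SKETCH_EXCP_INTERRUPT = 0x10000, SKETCH_EXCP_DEBUG = 0x10002 };

/*
 * Translate the legacy arch-handler convention
 *   > 0  re-enter the kernel,  0  leave the loop,  < 0  error
 * into the inner-loop convention described in the commit message
 *   0  re-enter the kernel,  EXCP_*  leave the loop,  < 0  error.
 */
static int sketch_map_arch_ret(int arch_ret)
{
    if (arch_ret == 0) {
        return SKETCH_EXCP_INTERRUPT;
    } else if (arch_ret > 0) {
        return 0;
    }
    return arch_ret;
}

/*
 * Drive the loop over a scripted list of handler results (n must be >= 1).
 * Returns the code that ended the loop and stores the iteration count.
 */
static int sketch_run_loop(const int *arch_rets, int n, int *iterations)
{
    int ret;
    int i = 0;

    do {
        ret = sketch_map_arch_ret(arch_rets[i]);
        i++;
    } while (ret == 0 && i < n);

    *iterations = i;
    return ret;
}
```

With a script of `{1, 1, 0, 1}` the loop re-enters twice and leaves on the third result with the interrupt code; a negative result ends the loop immediately and is passed through unchanged.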
[Qemu-devel] [PATCH v3] Add qcow2 documentation
This adds a description of the qcow2 file format to the docs/ directory.
Besides documenting what's there, which is never wrong, the document
should provide a good basis for the discussion of format extensions
(called qcow3 in previous discussions)

Signed-off-by: Kevin Wolf <kw...@redhat.com>
---
 docs/specs/qcow2.txt |  260 ++
 1 files changed, 260 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/qcow2.txt

v2:
- Added limits to cluster_bits

v3:
- Added semantics for unallocated clusters
- Fixed cluster_offset calculation
- Added header extensions

diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
new file mode 100644
index 0000000..8fc3cb2
--- /dev/null
+++ b/docs/specs/qcow2.txt
@@ -0,0 +1,260 @@
+== General ==
+
+A qcow2 image file is organized in units of constant size, which are called
+(host) clusters. A cluster is the unit in which all allocations are done,
+both for actual guest data and for image metadata.
+
+Likewise, the virtual disk as seen by the guest is divided into (guest)
+clusters of the same size.
+
+All numbers in qcow2 are stored in Big Endian byte order.
+
+
+== Header ==
+
+The first cluster of a qcow2 image contains the file header:
+
+    Byte  0 -  3:   magic
+                    QCOW magic string ("QFI\xfb")
+
+          4 -  7:   version
+                    Version number (only valid value is 2)
+
+          8 - 15:   backing_file_offset
+                    Offset into the image file at which the backing file name
+                    is stored (NB: The string is not null terminated). 0 if the
+                    image doesn't have a backing file.
+
+         16 - 19:   backing_file_size
+                    Length of the backing file name in bytes. Must not be
+                    longer than 1023 bytes. Undefined if the image doesn't have
+                    a backing file.
+
+         20 - 23:   cluster_bits
+                    Number of bits that are used for addressing an offset
+                    within a cluster (1 << cluster_bits is the cluster size).
+                    Must not be less than 9 (i.e. 512 byte clusters).
+
+                    Note: qemu as of today has an implementation limit of 2 MB
+                    as the maximum cluster size and won't be able to open images
+                    with larger cluster sizes.
+
+         24 - 31:   size
+                    Virtual disk size in bytes
+
+         32 - 35:   crypt_method
+                    0 for no encryption
+                    1 for AES encryption
+
+         36 - 39:   l1_size
+                    Number of entries in the active L1 table
+
+         40 - 47:   l1_table_offset
+                    Offset into the image file at which the active L1 table
+                    starts. Must be aligned to a cluster boundary.
+
+         48 - 55:   refcount_table_offset
+                    Offset into the image file at which the refcount table
+                    starts. Must be aligned to a cluster boundary.
+
+         56 - 59:   refcount_table_clusters
+                    Number of clusters that the refcount table occupies
+
+         60 - 63:   nb_snapshots
+                    Number of snapshots contained in the image
+
+         64 - 71:   snapshots_offset
+                    Offset into the image file at which the snapshot table
+                    starts. Must be aligned to a cluster boundary.
+
+Directly after the image header, optional sections called header extensions can
+be stored. Each extension has a structure like the following:
+
+    Byte  0 -  3:   Header extension type:
+                        0x00000000 - End of the header extension area
+                        0xE2792ACA - Backing file format name
+                        other      - Unknown header extension, can be safely
+                                     ignored
+
+          4 -  7:   Length of the header extension data
+
+          8 -  n:   Header extension data
+
+          n -  m:   Padding to round up the header extension size to the next
+                    multiple of 8.
+
+The remaining space between the end of the header extension area and the end of
+the first cluster can be used for other data. Usually, the backing file name is
+stored there.
+
+
+== Host cluster management ==
+
+qcow2 manages the allocation of host clusters by maintaining a reference count
+for each host cluster. A refcount of 0 means that the cluster is free, 1 means
+that it is used, and >= 2 means that it is used and any write access must
+perform a COW (copy on write) operation.
+
+The refcounts are managed in a two-level table. The first level is called
+refcount table and has a variable size (which is stored in the header). The
+refcount table can cover multiple clusters, however it needs to be contiguous
+in the image file.
+
+It contains pointers to the second level structures which are called refcount
+blocks and are exactly one cluster in size.
+
+Given a
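The header layout described in the qcow2 text above can be sketched as a small parser. This is a minimal illustration and not the QEMU implementation: the struct and every `sketch_` name are assumptions introduced here, and only the field offsets, the big-endian byte order, the magic, the version check and the lower bound on cluster_bits come from the text. The upper bound of 63 on cluster_bits is a guard added by this sketch to keep the shift well defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_QCOW2_HEADER_LEN 72  /* bytes 0 - 71 as listed in the text */

/* Hypothetical in-memory form of the header fields, in host byte order. */
struct sketch_qcow2_header {
    uint32_t version;
    uint64_t backing_file_offset;
    uint32_t backing_file_size;
    uint32_t cluster_bits;
    uint64_t size;
    uint32_t crypt_method;
    uint32_t l1_size;
    uint64_t l1_table_offset;
    uint64_t refcount_table_offset;
    uint32_t refcount_table_clusters;
    uint32_t nb_snapshots;
    uint64_t snapshots_offset;
};

/* All numbers are stored big-endian, so assemble them byte by byte. */
static uint32_t sketch_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t sketch_be64(const uint8_t *p)
{
    return ((uint64_t)sketch_be32(p) << 32) | sketch_be32(p + 4);
}

/* Returns 0 on success, -1 if the buffer does not hold a valid header. */
static int sketch_parse_qcow2_header(const uint8_t *buf, size_t len,
                                     struct sketch_qcow2_header *h)
{
    static const uint8_t magic[4] = { 'Q', 'F', 'I', 0xfb };

    if (len < SKETCH_QCOW2_HEADER_LEN || memcmp(buf, magic, 4) != 0) {
        return -1;
    }
    h->version = sketch_be32(buf + 4);
    h->backing_file_offset = sketch_be64(buf + 8);
    h->backing_file_size = sketch_be32(buf + 16);
    h->cluster_bits = sketch_be32(buf + 20);
    h->size = sketch_be64(buf + 24);
    h->crypt_method = sketch_be32(buf + 32);
    h->l1_size = sketch_be32(buf + 36);
    h->l1_table_offset = sketch_be64(buf + 40);
    h->refcount_table_offset = sketch_be64(buf + 48);
    h->refcount_table_clusters = sketch_be32(buf + 56);
    h->nb_snapshots = sketch_be32(buf + 60);
    h->snapshots_offset = sketch_be64(buf + 64);

    /* Version must be 2; cluster_bits below 9 is invalid per the text.
     * The upper bound is this sketch's own guard for the shift below. */
    if (h->version != 2 || h->cluster_bits < 9 || h->cluster_bits > 63) {
        return -1;
    }
    return 0;
}

/* Cluster size is 1 << cluster_bits. */
static uint64_t sketch_cluster_size(const struct sketch_qcow2_header *h)
{
    return (uint64_t)1 << h->cluster_bits;
}
```

Reading the fields individually rather than casting the buffer to a packed struct avoids depending on host endianness or compiler padding.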
[Qemu-devel] [PATCH v2 11/20] x86: Save/restore PAT MSR
Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 target-i386/machine.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/target-i386/machine.c b/target-i386/machine.c
index d78eceb..6384f54 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -491,6 +491,8 @@ static const VMStateDescription vmstate_cpu = {
         VMSTATE_UINT64_V(xcr0, CPUState, 12),
         VMSTATE_UINT64_V(xstate_bv, CPUState, 12),
         VMSTATE_YMMH_REGS_VARS(ymmh_regs, CPUState, CPU_NB_REGS, 12),
+
+        VMSTATE_UINT64_V(pat, CPUState, 13),
         VMSTATE_END_OF_LIST()
         /* The above list is not sorted /wrt version numbers, watch out! */
     },
-- 
1.7.1
[Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal
Currently, when a rogue script kills the QEMU process (using a
TERM/INT/HUP signal), it looks indistinguishable from a system shutdown.
Let's report that QEMU was killed and leave some clues about the killer's
identity.

Signed-off-by: Gleb Natapov <g...@redhat.com>
---
 v1->v2:
  - print message from a main loop instead of signal handler
 v2->v3
  - use pid_t to store pid instead of int

diff --git a/os-posix.c b/os-posix.c
index 38c29d1..36d937c 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -61,9 +61,9 @@ void os_setup_early_signal_handling(void)
     sigaction(SIGPIPE, &act, NULL);
 }
 
-static void termsig_handler(int signal)
+static void termsig_handler(int signal, siginfo_t *info, void *c)
 {
-    qemu_system_shutdown_request();
+    qemu_system_killed(info->si_signo, info->si_pid);
 }
 
 static void sigchld_handler(int signal)
@@ -76,7 +76,8 @@ void os_setup_signal_handling(void)
     struct sigaction act;
 
     memset(&act, 0, sizeof(act));
-    act.sa_handler = termsig_handler;
+    act.sa_sigaction = termsig_handler;
+    act.sa_flags = SA_SIGINFO;
     sigaction(SIGINT,  &act, NULL);
     sigaction(SIGHUP,  &act, NULL);
     sigaction(SIGTERM, &act, NULL);
diff --git a/sysemu.h b/sysemu.h
index 0a83ab9..f868500 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -66,6 +66,8 @@ void qemu_system_vmstop_request(int reason);
 int qemu_shutdown_requested(void);
 int qemu_reset_requested(void);
 int qemu_powerdown_requested(void);
+void qemu_system_killed(int signal, pid_t pid);
+void qemu_kill_report(void);
 extern qemu_irq qemu_system_powerdown;
 void qemu_system_reset(void);
 
diff --git a/vl.c b/vl.c
index 5e007a7..000c845 100644
--- a/vl.c
+++ b/vl.c
@@ -1213,7 +1213,8 @@ typedef struct QEMUResetEntry {
 static QTAILQ_HEAD(reset_handlers, QEMUResetEntry) reset_handlers =
     QTAILQ_HEAD_INITIALIZER(reset_handlers);
 static int reset_requested;
-static int shutdown_requested;
+static int shutdown_requested, shutdown_signal = -1;
+static pid_t shutdown_pid;
 static int powerdown_requested;
 static int debug_requested;
 static int vmstop_requested;
@@ -1225,6 +1226,15 @@ int qemu_shutdown_requested(void)
     return r;
 }
 
+void qemu_kill_report(void)
+{
+    if (shutdown_signal != -1) {
+        fprintf(stderr, "Got signal %d from pid %d\n",
+                         shutdown_signal, shutdown_pid);
+        shutdown_signal = -1;
+    }
+}
+
 int qemu_reset_requested(void)
 {
     int r = reset_requested;
@@ -1298,6 +1308,13 @@ void qemu_system_reset_request(void)
     qemu_notify_event();
 }
 
+void qemu_system_killed(int signal, pid_t pid)
+{
+    shutdown_signal = signal;
+    shutdown_pid = pid;
+    qemu_system_shutdown_request();
+}
+
 void qemu_system_shutdown_request(void)
 {
     shutdown_requested = 1;
@@ -1441,6 +1458,7 @@ static void main_loop(void)
             vm_stop(VMSTOP_DEBUG);
         }
         if (qemu_shutdown_requested()) {
+            qemu_kill_report();
             monitor_protocol_event(QEVENT_SHUTDOWN, NULL);
             if (no_shutdown) {
                 vm_stop(VMSTOP_SHUTDOWN);
--
			Gleb.
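The mechanism the patch relies on, an SA_SIGINFO handler that records the sender and a report printed later from normal context, can be tried outside QEMU with a sketch like the following. All `sketch_` names are hypothetical; only `sigaction`, `SA_SIGINFO`, `si_signo` and `si_pid` are the POSIX facilities the patch itself uses. Storing a `pid_t` from a handler mirrors what the patch does and is kept here for fidelity, not as a recommendation.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* State written by the handler and consumed from normal context. */
static volatile sig_atomic_t sketch_shutdown_signal = -1;
static volatile pid_t sketch_shutdown_pid;

/* Three-argument handler form, selected by SA_SIGINFO below. */
static void sketch_termsig_handler(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    /* Only record facts here; printing is deferred to the main loop. */
    sketch_shutdown_pid = info->si_pid;
    sketch_shutdown_signal = info->si_signo;
}

/* Install the handler for one signal; returns sigaction()'s result. */
static int sketch_install_handler(int signo)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    sigemptyset(&act.sa_mask);
    act.sa_sigaction = sketch_termsig_handler;
    act.sa_flags = SA_SIGINFO;
    return sigaction(signo, &act, NULL);
}

/* Called from the main loop; returns 1 if a report was printed. */
static int sketch_kill_report(void)
{
    if (sketch_shutdown_signal != -1) {
        fprintf(stderr, "Got signal %d from pid %ld\n",
                (int)sketch_shutdown_signal, (long)sketch_shutdown_pid);
        sketch_shutdown_signal = -1;
        return 1;
    }
    return 0;
}
```

Deferring the `fprintf` to the main loop corresponds to the v1->v2 change noted in the patch: stdio functions are not async-signal-safe, so the handler only stores values.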
[Qemu-devel] [PATCH v2 00/20] [uq/master] Patch queue, part V (the rest)
This series catches all the rest to prepare QEMU's KVM support for
merging with qemu-kvm. IOW, once these bits here are applied, qemu-kvm
can switch its infrastructure to upstream and is effectively only adding
own bits for in-kernel irqchip and device assignment support.

Topics of this series are:
 - support for optimized interrupt handling by hooking cpu_interrupt
 - another preparational step for in-kernel irqchip support
 - x86: Do not leave halt if interrupts are disabled
 - mark VCPU state dirty on creation (fixed deadlock on early hw_error)
 - complete KVM support for PAT MSR, some related improvements for TCG
 - further consolidation of inner kvm_cpu_exec loop
 - expose VCPU host thread ID via "info cpus" and "query-cpus"

Changes in v2:
 - Rebased over current uq/master
 - Build fix for MAC OS (regression of previous round)
 - Fix for x86 hardware breakpoints in TCG mode (regression of previous
   round)
 - Build fix for s390 (regression of previous round)
 - Removed premature optimization from "Install optimized interrupt
   handlers"
 - Keep KVM_RUN return value in separate variable (cleanup)
 - Reorder error handling of KVM_RUN (micro-optimization)

CC: Alexander Graf <ag...@suse.de>
CC: Andreas Färber <andreas.faer...@web.de>
CC: Riku Voipio <riku.voi...@iki.fi>
CC: TeLeMan <gele...@gmail.com>

Jan Kiszka (20):
  Implement qemu_kvm_eat_signals only for CONFIG_LINUX
  x86: Unbreak TCG support for hardware breakpoints
  s390: Detect invalid invocations of qemu_ram_free/remap
  Break up user and system cpu_interrupt implementations
  Redirect cpu_interrupt to callback handler
  kvm: Install optimized interrupt handler
  kvm: Add in-kernel irqchip awareness to cpu_thread_is_idle
  kvm: x86: Do not leave halt if interrupts are disabled
  kvm: Mark VCPU state dirty on creation
  x86: Properly reset PAT MSR
  x86: Save/restore PAT MSR
  kvm: x86: Synchronize PAT MSR with the kernel
  kvm: Consider EXIT_DEBUG unknown without CAP_SET_GUEST_DEBUG
  kvm: Keep KVM_RUN return value in separate variable
  kvm: Reorder error handling of KVM_RUN
  kvm: Rework inner loop of kvm_cpu_exec
  kvm: Align kvm_arch_handle_exit to kvm_cpu_exec changes
  kvm: x86: Reorder functions in kvm.c
  kvm: x86: Push kvm_arch_debug to kvm_arch_handle_exit
  Expose thread_id in info cpus

 cpu-all.h             |   14 -
 cpu-defs.h            |    1 +
 cpu-exec.c            |   27 +
 cpus.c                |  126 ++
 exec.c                |   25 +++--
 kvm-all.c             |   57 +--
 kvm.h                 |    2 -
 monitor.c             |    4 +
 os-posix.c            |   10 +++
 os-win32.c            |    5 ++
 osdep.h               |    1 +
 qmp-commands.hx       |    3 +
 target-i386/cpu.h     |    4 +-
 target-i386/cpuid.c   |    1 -
 target-i386/helper.c  |    5 ++
 target-i386/kvm.c     |  146 +++--
 target-i386/machine.c |    2 +
 target-ppc/kvm.c      |    8 +-
 target-s390x/kvm.c    |    5 ++
 19 files changed, 263 insertions(+), 183 deletions(-)
[Qemu-devel] [PATCH v2 19/20] kvm: x86: Push kvm_arch_debug to kvm_arch_handle_exit
There are no generic bits remaining in the handling of KVM_EXIT_DEBUG.
So push its logic completely into arch hands, i.e. only x86 so far.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 kvm-all.c         |   11 -----------
 kvm.h             |    2 --
 target-i386/kvm.c |   25 ++++++++++++++++---------
 3 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 78e4fbf..fd1fbfe 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -986,17 +986,6 @@ int kvm_cpu_exec(CPUState *env)
             ret = kvm_handle_internal_error(env, run);
             break;
 #endif
-#ifdef KVM_CAP_SET_GUEST_DEBUG
-        case KVM_EXIT_DEBUG:
-            DPRINTF("kvm_exit_debug\n");
-            if (kvm_arch_debug(&run->debug.arch)) {
-                ret = EXCP_DEBUG;
-                break;
-            }
-            /* re-enter, this exception was guest-internal */
-            ret = 0;
-            break;
-#endif /* KVM_CAP_SET_GUEST_DEBUG */
         default:
             DPRINTF("kvm_arch_handle_exit\n");
             ret = kvm_arch_handle_exit(env, run);
diff --git a/kvm.h b/kvm.h
index 7bc04e0..d565dba 100644
--- a/kvm.h
+++ b/kvm.h
@@ -136,8 +136,6 @@ struct kvm_sw_breakpoint {
 
 QTAILQ_HEAD(kvm_sw_breakpoint_head, kvm_sw_breakpoint);
 
-int kvm_arch_debug(struct kvm_debug_exit_arch *arch_info);
-
 struct kvm_sw_breakpoint *kvm_find_sw_breakpoint(CPUState *env,
                                                  target_ulong pc);
 
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 3920444..a13599d 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1731,31 +1731,31 @@ void kvm_arch_remove_all_hw_breakpoints(void)
 
 static CPUWatchpoint hw_watchpoint;
 
-int kvm_arch_debug(struct kvm_debug_exit_arch *arch_info)
+static int kvm_handle_debug(struct kvm_debug_exit_arch *arch_info)
 {
-    int handle = 0;
+    int ret = 0;
     int n;
 
     if (arch_info->exception == 1) {
         if (arch_info->dr6 & (1 << 14)) {
             if (cpu_single_env->singlestep_enabled) {
-                handle = 1;
+                ret = EXCP_DEBUG;
             }
         } else {
             for (n = 0; n < 4; n++) {
                 if (arch_info->dr6 & (1 << n)) {
                     switch ((arch_info->dr7 >> (16 + n*4)) & 0x3) {
                     case 0x0:
-                        handle = 1;
+                        ret = EXCP_DEBUG;
                         break;
                     case 0x1:
-                        handle = 1;
+                        ret = EXCP_DEBUG;
                         cpu_single_env->watchpoint_hit = &hw_watchpoint;
                         hw_watchpoint.vaddr = hw_breakpoint[n].addr;
                         hw_watchpoint.flags = BP_MEM_WRITE;
                         break;
                     case 0x3:
-                        handle = 1;
+                        ret = EXCP_DEBUG;
                         cpu_single_env->watchpoint_hit = &hw_watchpoint;
                         hw_watchpoint.vaddr = hw_breakpoint[n].addr;
                         hw_watchpoint.flags = BP_MEM_ACCESS;
@@ -1765,17 +1765,18 @@ int kvm_arch_debug(struct kvm_debug_exit_arch *arch_info)
             }
         }
     } else if (kvm_find_sw_breakpoint(cpu_single_env, arch_info->pc)) {
-        handle = 1;
+        ret = EXCP_DEBUG;
     }
-    if (!handle) {
+    if (ret == 0) {
         cpu_synchronize_state(cpu_single_env);
         assert(cpu_single_env->exception_injected == -1);
 
+        /* pass to guest */
         cpu_single_env->exception_injected = arch_info->exception;
         cpu_single_env->has_error_code = 0;
     }
 
-    return handle;
+    return ret;
 }
 
 void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg)
@@ -1851,6 +1852,12 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
                 run->ex.exception, run->ex.error_code);
         ret = -1;
         break;
+#ifdef KVM_CAP_SET_GUEST_DEBUG
+    case KVM_EXIT_DEBUG:
+        DPRINTF("kvm_exit_debug\n");
+        ret = kvm_handle_debug(&run->debug.arch);
+        break;
+#endif /* KVM_CAP_SET_GUEST_DEBUG */
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
         ret = -1;
-- 
1.7.1
[Qemu-devel] [PATCH v2 18/20] kvm: x86: Reorder functions in kvm.c
Required for next patch which will access guest debug services from
kvm_arch_handle_exit. No functional changes.

Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
---
 target-i386/kvm.c |  108 ++--
 1 files changed, 54 insertions(+), 54 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 6f84610..3920444 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1624,60 +1624,6 @@ static int kvm_handle_halt(CPUState *env)
     return 0;
 }
 
-static bool host_supports_vmx(void)
-{
-    uint32_t ecx, unused;
-
-    host_cpuid(1, 0, &unused, &unused, &ecx, &unused);
-    return ecx & CPUID_EXT_VMX;
-}
-
-#define VMX_INVALID_GUEST_STATE 0x80000021
-
-int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
-{
-    uint64_t code;
-    int ret;
-
-    switch (run->exit_reason) {
-    case KVM_EXIT_HLT:
-        DPRINTF("handle_hlt\n");
-        ret = kvm_handle_halt(env);
-        break;
-    case KVM_EXIT_SET_TPR:
-        ret = 0;
-        break;
-    case KVM_EXIT_FAIL_ENTRY:
-        code = run->fail_entry.hardware_entry_failure_reason;
-        fprintf(stderr, "KVM: entry failed, hardware error 0x%" PRIx64 "\n",
-                code);
-        if (host_supports_vmx() && code == VMX_INVALID_GUEST_STATE) {
-            fprintf(stderr,
-                    "\nIf you're runnning a guest on an Intel machine without "
-                        "unrestricted mode\n"
-                    "support, the failure can be most likely due to the guest "
-                        "entering an invalid\n"
-                    "state for Intel VT. For example, the guest maybe running "
-                        "in big real mode\n"
-                    "which is not supported on less recent Intel processors."
-                        "\n\n");
-        }
-        ret = -1;
-        break;
-    case KVM_EXIT_EXCEPTION:
-        fprintf(stderr, "KVM: exception %d exit (error code 0x%x)\n",
-                run->ex.exception, run->ex.error_code);
-        ret = -1;
-        break;
-    default:
-        fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
-        ret = -1;
-        break;
-    }
-
-    return ret;
-}
-
 #ifdef KVM_CAP_SET_GUEST_DEBUG
 int kvm_arch_insert_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint *bp)
 {
@@ -1860,6 +1806,60 @@ void kvm_arch_update_guest_debug(CPUState *env, struct kvm_guest_debug *dbg)
 }
 #endif /* KVM_CAP_SET_GUEST_DEBUG */
 
+static bool host_supports_vmx(void)
+{
+    uint32_t ecx, unused;
+
+    host_cpuid(1, 0, &unused, &unused, &ecx, &unused);
+    return ecx & CPUID_EXT_VMX;
+}
+
+#define VMX_INVALID_GUEST_STATE 0x80000021
+
+int kvm_arch_handle_exit(CPUState *env, struct kvm_run *run)
+{
+    uint64_t code;
+    int ret;
+
+    switch (run->exit_reason) {
+    case KVM_EXIT_HLT:
+        DPRINTF("handle_hlt\n");
+        ret = kvm_handle_halt(env);
+        break;
+    case KVM_EXIT_SET_TPR:
+        ret = 0;
+        break;
+    case KVM_EXIT_FAIL_ENTRY:
+        code = run->fail_entry.hardware_entry_failure_reason;
+        fprintf(stderr, "KVM: entry failed, hardware error 0x%" PRIx64 "\n",
+                code);
+        if (host_supports_vmx() && code == VMX_INVALID_GUEST_STATE) {
+            fprintf(stderr,
+                    "\nIf you're runnning a guest on an Intel machine without "
+                        "unrestricted mode\n"
+                    "support, the failure can be most likely due to the guest "
+                        "entering an invalid\n"
+                    "state for Intel VT. For example, the guest maybe running "
+                        "in big real mode\n"
+                    "which is not supported on less recent Intel processors."
+                        "\n\n");
+        }
+        ret = -1;
+        break;
+    case KVM_EXIT_EXCEPTION:
+        fprintf(stderr, "KVM: exception %d exit (error code 0x%x)\n",
+                run->ex.exception, run->ex.error_code);
+        ret = -1;
+        break;
+    default:
+        fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
+        ret = -1;
+        break;
+    }
+
+    return ret;
+}
+
 bool kvm_arch_stop_on_emulation_error(CPUState *env)
 {
     return !(env->cr[0] & CR0_PE_MASK) ||
-- 
1.7.1
[Qemu-devel] Re: [v1 PATCH 1/3]: Move the paio_signal_handler to a generic location.
On Tue, Mar 15, 2011 at 10:36 AM, Arun R Bharadwaj
<a...@linux.vnet.ibm.com> wrote:
> * Arun R Bharadwaj <a...@linux.vnet.ibm.com> [2011-03-15 16:04:53]:
>
> Author: Arun R Bharadwaj <a...@linux.vnet.ibm.com>
> Date:   Thu Mar 10 14:45:25 2011 +0530
>
>     Move the paio_signal_handler to a generic location.
>
>     The paio subsystem uses the signal, SIGUSR2. So move
>     the signal handler to a more generic place such that
>     other subsystems like 9pfs can also use it.
>
>     TODO: I have moved the signal handler code to
>     qemu-thread.c, which is NOT the right place. I need
>     suggestions as to where is the right place to put it.

I think os-posix.c would be appropriate.

Please check how this affects Windows host and linux-user builds.

> @@ -356,6 +359,15 @@ static void *aio_thread(void *unused)
>          idle_threads++;
>          mutex_unlock(&lock);
>
> +        if (posix_aio_state) {

If we get here posix_aio_state must be non-NULL. Please remove the check.

> +void sigusr2_signal_handler(int signum)

static void sigusr2_signal_handler(int signum)

Stefan
Re: [Qemu-devel] [v1 PATCH 2/3]: Helper routines to use GLib threadpool infrastructure in 9pfs.
On 03/15/2011 04:08 PM, Arun R Bharadwaj wrote:
> * Arun R Bharadwaj <a...@linux.vnet.ibm.com> [2011-03-15 16:04:53]:
>
> Author: Arun R Bharadwaj <a...@linux.vnet.ibm.com>
> Date:   Thu Mar 10 15:11:49 2011 +0530
>
>     Helper routines to use GLib threadpool infrastructure in 9pfs.
>
>     This patch creates helper routines to make use of the
>     threadpool infrastructure provided by GLib. This is based
>     on the prototype patch by Anthony which does a similar thing
>     for posix-aio-compat.c
>
>     An example use case is provided in the next patch where one
>     of the syscalls in 9pfs is converted into the threaded model
>     using these helper routines.
>
>     Signed-off-by: Arun R Bharadwaj <a...@linux.vnet.ibm.com>
>     Reviewed-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
>
> diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c
> index dceefd5..cf61345 100644
> --- a/hw/9pfs/virtio-9p.c
> +++ b/hw/9pfs/virtio-9p.c
> @@ -18,6 +18,8 @@
>  #include "fsdev/qemu-fsdev.h"
>  #include "virtio-9p-debug.h"
>  #include "virtio-9p-xattr.h"
> +#include <signal.h>
> +#include "qemu-thread.h"
>
>  int debug_9p_pdu;
>  static void v9fs_reclaim_fd(V9fsState *s);
> @@ -36,6 +38,89 @@ enum {
>      Oappend = 0x80,
>  };
>
> +typedef struct V9fsPool {
> +    GThreadPool *pool;
> +    GList *requests;
> +    int rfd;
> +    int wfd;
> +} V9fsPool;
> +
> +static V9fsPool v9fs_pool;
> +
> +static void v9fs_qemu_submit_request(V9fsRequest *req)
> +{
> +    V9fsPool *p = &v9fs_pool;
> +
> +    p->requests = g_list_append(p->requests, req);
> +    g_thread_pool_push(v9fs_pool.pool, req, NULL);
> +}
> +
> +static void die2(int err, const char *what)
> +{
> +    fprintf(stderr, "%s failed: %s\n", what, strerror(err));
> +    abort();
> +}
> +
> +static void die(const char *what)
> +{
> +    die2(errno, what);
> +}
> +
> +static void v9fs_qemu_process_post_ops(void *arg)
> +{
> +    struct V9fsPool *p = &v9fs_pool;
> +    struct V9fsPostOp *post_op;
> +    char byte;
> +    ssize_t len;
> +    GList *cur_req, *next_req;
> +
> +    do {
> +        len = read(p->rfd, &byte, sizeof(byte));
> +    } while (len == -1 && errno == EINTR);
> +
> +    for (cur_req = p->requests; cur_req != NULL; cur_req = next_req) {
> +        V9fsRequest *req = cur_req->data;
> +        next_req = g_list_next(cur_req);
> +
> +        if (!req->done) {
> +            continue;
> +        }
> +
> +        post_op = &req->post_op;
> +        post_op->func(post_op->arg);
> +        p->requests = g_list_remove_link(p->requests, cur_req);
> +        g_list_free(p->requests);
> +    }
> +}
> +
> +static inline void v9fs_thread_signal(void)
> +{
> +    struct V9fsPool *p = &v9fs_pool;
> +    char byte = 0;
> +    ssize_t ret;
> +
> +    do {
> +        ret = write(p->wfd, &byte, sizeof(byte));
> +    } while (ret == -1 && errno == EINTR);
> +
> +    if (ret < 0 && errno != EAGAIN) {
> +        die("write() in v9fs");
> +    }
> +
> +    if (kill(getpid(), SIGUSR2)) {

Not sure whether using pthread_kill or qemu_thread_signal is better to
go with?

> +        die("kill failed");
> +    }
> +}
> +
> +static void v9fs_thread_routine(gpointer data, gpointer user_data)
> +{
> +    V9fsRequest *req = data;
> +
> +    req->func(req);
> +    v9fs_thread_signal();
> +    req->done = 1;

Shouldn't it be in reverse order, setting flag first and then signal:
    req->done = 1;
    v9fs_thread_signal();

> +}
> +
>  static int omode_to_uflags(int8_t mode)
>  {
>      int ret = 0;
> @@ -3850,7 +3935,8 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf)
>      int i, len;
>      struct stat stat;
>      FsTypeEntry *fse;
> -
> +    int fds[2];
> +    V9fsPool *p = &v9fs_pool;
>
>      s = (V9fsState *)virtio_common_init("virtio-9p",
>                                      VIRTIO_ID_9P,
> @@ -3939,5 +4025,21 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf)
>                          s->tag_len;
>      s->vdev.get_config = virtio_9p_get_config;
>
> +    if (qemu_pipe(fds) == -1) {
> +        fprintf(stderr, "failed to create fd's for virtio-9p\n");
> +        exit(1);
> +    }
> +
> +    p->pool = g_thread_pool_new(v9fs_thread_routine, p, 8, FALSE, NULL);
> +    p->rfd = fds[0];
> +    p->wfd = fds[1];
> +
> +    fcntl(p->rfd, F_SETFL, O_NONBLOCK);
> +    fcntl(p->wfd, F_SETFL, O_NONBLOCK);
> +
> +    qemu_set_fd_handler(p->rfd, v9fs_qemu_process_post_ops, NULL, NULL);
> +
> +    (void) v9fs_qemu_submit_request;

Do we really need it ^ ?

- Harsh

> +
>      return &s->vdev;
>  }
> diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
> index 10809ba..e7d2326 100644
> --- a/hw/9pfs/virtio-9p.h
> +++ b/hw/9pfs/virtio-9p.h
> @@ -124,6 +124,20 @@ struct V9fsPDU
>      QLIST_ENTRY(V9fsPDU) next;
>  };
>
> +typedef struct V9fsPostOp {
> +    /* Post Operation routine to execute after executing syscall */
> +    void (*func)(void *arg);
> +    void *arg;
> +} V9fsPostOp;
> +
> +typedef struct V9fsRequest {
> +    void (*func)(struct V9fsRequest *req);
> +
> +    /* Flag to indicate that request is satisfied, ready for post-processing */
> +    int done;
> +
> +    V9fsPostOp post_op;
> +} V9fsRequest;
>
>  /* FIXME
>   * 1) change user needs to set groups and stuff
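The ordering question raised in the review (mark the request done first, then wake the main loop) can be shown with a small self-pipe sketch. This is an illustration under stated assumptions, not the 9pfs code: the `sketch_` names are hypothetical, the GLib thread pool and the SIGUSR2 signalling are left out, and C11 atomics stand in for whatever synchronization the real code would use.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical request with only the completion flag. */
struct sketch_request {
    atomic_int done;
};

/* Add O_NONBLOCK while preserving the descriptor's other status flags. */
static int sketch_set_nonblock(int fd)
{
    int fl = fcntl(fd, F_GETFL);

    if (fl == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/*
 * Wake the main loop by writing one byte, retrying on EINTR. A full pipe
 * (EAGAIN) is not an error: a wakeup is already pending in that case.
 */
static int sketch_notify(int wfd)
{
    char byte = 0;
    ssize_t ret;

    do {
        ret = write(wfd, &byte, sizeof(byte));
    } while (ret == -1 && errno == EINTR);

    if (ret == -1 && errno != EAGAIN) {
        return -1;
    }
    return 0;
}

/*
 * Worker side: publish completion first, then notify. If the order were
 * reversed, the main loop could wake up, scan the request list, see
 * done == 0 and go back to sleep without a further wakeup arriving.
 */
static void sketch_complete(struct sketch_request *req, int wfd)
{
    atomic_store(&req->done, 1);
    (void)sketch_notify(wfd);
}

/* Main-loop side: consume pending wakeup bytes; returns how many. */
static int sketch_drain(int rfd)
{
    char buf[16];
    int total = 0;

    for (;;) {
        ssize_t len = read(rfd, buf, sizeof(buf));

        if (len > 0) {
            total += (int)len;
            continue;
        }
        if (len == -1 && errno == EINTR) {
            continue;
        }
        break;  /* EAGAIN (pipe empty), EOF, or another error */
    }
    return total;
}
```

After `sketch_drain` returns, the main loop would walk its request list and run the post-operation for every request whose `done` flag is set.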
Re: [Qemu-devel] [PATCH -V3 1/8] hw/9pfs: Add V9fsfidmap in preparation for adding fd reclaim
On Tue, 15 Mar 2011 10:38:31 +0000, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> On Tue, Mar 15, 2011 at 9:20 AM, Aneesh Kumar K. V
> <aneesh.ku...@linux.vnet.ibm.com> wrote:
> > On Sun, 13 Mar 2011 20:53:41 +0000, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> >> On Sun, Mar 13, 2011 at 7:06 PM, Aneesh Kumar K. V
> >> <aneesh.ku...@linux.vnet.ibm.com> wrote:
> >> > On Sun, 13 Mar 2011 15:46:29 +0000, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> >> >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V
> >> >> <aneesh.ku...@linux.vnet.ibm.com> wrote:
> >> >> > @@ -185,17 +188,22 @@ typedef struct V9fsXattr
> >> >> >     int flags;
> >> >> >  } V9fsXattr;
> >> >> >
> >> >> > +typedef struct V9fsfidmap {
> >> >>
> >> >> V9fsFidMap (naming convention)
> >> >>
> >> >> > +    union {
> >> >> > +        int fd;
> >> >> > +        DIR *dir;
> >> >> > +        V9fsXattr xattr;
> >> >> > +    } fs;
> >> >>
> >> >> The name "fs" is not meaningful.
> >> >>
> >> >> > +    int fid_type;
> >> >> > +    V9fsString path;
> >> >> > +    int flags;
> >> >> > +} V9fsFidMap;
> >> >> > +
> >> >> >  struct V9fsFidState
> >> >> >  {
> >> >> > -    int fid_type;
> >> >> >     int32_t fid;
> >> >> > -    V9fsString path;
> >> >> > -    union {
> >> >> > -        int fd;
> >> >> > -        DIR *dir;
> >> >> > -        V9fsXattr xattr;
> >> >> > -    } fs;
> >> >> >     uid_t uid;
> >> >> > +    V9fsFidMap fsmap;
> >> >>
> >> >> This name is confusing.  A map is usually a container that stores
> >> >> key/value pairs.  V9fsFidMapEntry would be clearer.  But then I
> >> >> thought that is what V9fsFidState is?
> >> >
> >> > I am bad at naming. I wanted to indicate something that can be shared
> >> > across multiple fids and also indicate the local file system
> >> > mapping/data. I will take any suggestion.
> >>
> >> Where does sharing happen, I didn't notice any code that shares fds
> >> between fids?
> >
> > That patch is not yet there. We can only share fd if they open flags
> > match. Hence making sure we open files on host with limited set of open
> > flags which enables us much better sharing.
>
> Tracking open flags is fine and is needed for fd reclaim.  But
> splitting V9fsFidState into the V9fsFidMap structure and all the churn
> that causes to the code isn't necessary yet.  Please wait with that
> until you submit patches fd sharing.
>
> The reason I make this suggestion is that everyone reading or working
> on the code until fd sharing is added now needs to deal with the
> V9fsFidMap structure which (currently) serves no purpose.

taken. will remove the patch from the series.

-aneesh
Re: [Qemu-devel] [PATCH -V3 7/8] hw/9pfs: Add new virtfs option cache=none to skip host page cache
On Tue, 15 Mar 2011 11:11:46 +0000, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> On Tue, Mar 15, 2011 at 9:19 AM, Aneesh Kumar K. V
> <aneesh.ku...@linux.vnet.ibm.com> wrote:
> > On Mon, 14 Mar 2011 10:20:57 +0000, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> >> On Sun, Mar 13, 2011 at 7:04 PM, Aneesh Kumar K. V
> >> <aneesh.ku...@linux.vnet.ibm.com> wrote:
> >> > On Sun, 13 Mar 2011 17:23:50 +0000, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> >> >> On Sat, Mar 5, 2011 at 5:52 PM, Aneesh Kumar K.V
> >> >> <aneesh.ku...@linux.vnet.ibm.com> wrote:
> >> >> > cache=none implies the file are opened in the host with O_SYNC open flag
> >> >>
> >> >> O_SYNC does not bypass the host page cache.  It ensures that writes
> >> >> only complete once data has been written to the disk.
> >> >>
> >> >> O_DIRECT is a hint to bypass the host page cache when possible.
> >> >>
> >> >> A boolean on|off option would be nicer than an option that takes the
> >> >> special string "none".  For example, direct=on|off.  It also makes the
> >> >> code nicer by using bools instead of strdup strings that get leaked.
> >> >
> >> > What i wanted is the O_SYNC behavior. Well the comment should be
> >> > updated. I want to make sure that we don't have dirty data in host
> >> > page cache after a write. It is always good to make read hit the
> >> > page cache
> >>
> >> Why silently enforce O_SYNC on the server side?  The client does not
> >> know whether or not O_SYNC is in effect, cannot take advantage of that
> >> knowledge, and cannot control it.
> >>
> >> I think a more useful solution is a 9p client mount option called
> >> "sync" that caused the client to always add O_SYNC and skip syncfs.
> >> The whole stack becomes aware of O_SYNC and clients are in control
> >> over whether or not they need O_SYNC semantics.
> >
> > The cache=none specifically enables us to ignore the tsyncfs request on
> > host. tsyncfs on host can be really slow in certain setup.
>
> If I'm a client with the "sync" mount option all my fids are O_SYNC
> and I do not need to send TSYNCFS requests to the server because my
> fids are already stable.

Having sync mount option is useful. In fact, for dotu we already default
to O_SYNC on the client side because we don't have tsyncfs. But being
able to avoid the tsyncfs flush from the server point of view also is
nice. Consider a setup where one doesn't have control on the guest mount
option but can control the qemu export options.

-aneesh
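The distinction discussed in this thread, O_SYNC making writes stable versus O_DIRECT hinting that the page cache be bypassed, can be written down as a small flag-mapping sketch. The enum and function names are hypothetical and do not correspond to any virtfs option; only `O_SYNC` and `O_DIRECT` are real open(2) flags, and `O_DIRECT` is guarded because it is only exposed on some platforms and feature levels.

```c
#include <fcntl.h>

/* Hypothetical export policy names, for illustration only. */
enum sketch_cache_mode {
    SKETCH_CACHE_DEFAULT,  /* writes may sit dirty in the host page cache */
    SKETCH_CACHE_SYNC,     /* O_SYNC: write() completes once data is stable */
    SKETCH_CACHE_DIRECT    /* O_DIRECT: hint to bypass the host page cache */
};

/* Return the client-requested open flags adjusted for the export policy. */
static int sketch_host_open_flags(int client_flags, enum sketch_cache_mode mode)
{
    switch (mode) {
    case SKETCH_CACHE_SYNC:
        return client_flags | O_SYNC;
    case SKETCH_CACHE_DIRECT:
#ifdef O_DIRECT
        return client_flags | O_DIRECT;
#else
        return client_flags;  /* flag not exposed in this build */
#endif
    case SKETCH_CACHE_DEFAULT:
    default:
        return client_flags;
    }
}

/*
 * As argued in the thread: if every fid is opened with O_SYNC, an explicit
 * flush request has nothing left to write back. O_DIRECT alone gives no
 * such guarantee, so a flush is still needed in that mode.
 */
static int sketch_flush_needed(enum sketch_cache_mode mode)
{
    return mode != SKETCH_CACHE_SYNC;
}
```

Whether the policy is chosen by the client mount option or by the server export option is exactly the open question in the thread; the sketch takes no side and only shows the flag arithmetic.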
[Qemu-devel] [PATCH v2 0/7] Introduce -display and make VNC optional
From: Jes Sorensen <jes.soren...@redhat.com>

Hi,

This is the second version of the -display patches and the option to
make VNC optional. It introduces a new -display argument to consolidate
the current -sdl/-curses/-nographic/-vnc arguments and I included the
patch I posted last week to consolidate the DisplaySurface code.

New in this series is support for sub-parameters for SDL, and -display
vnc. I have added documentation to the SDL side, but I am not quite
comfortable with the bits needed to convert the VNC bits, so I have not
included that for now. Would be good to get someone who knows docbook
(or whatever the format is we use) to make sure it is done correctly.

Basically the -display=vnc option supports sub-arguments as Anthony
suggested yesterday.

Cheers,
Jes

Jes Sorensen (7):
  Consolidate DisplaySurface allocation in qemu_alloc_display()
  Introduce -display argument
  Introduce -display none
  Add support for -display vnc
  error message if user specifies SDL cmd line option when SDL is
    disabled
  error message if user specifies curses on cmd line when curses is
    disabled
  Make VNC support optional

 Makefile.objs   |   19
 configure       |   37 +++-
 console.c       |   45 +++-
 console.h       |   29 +++-
 monitor.c       |   22 -
 qemu-options.hx |   48 +++-
 qerror.h        |    3 +
 sysemu.h        |    1 +
 ui/sdl.c        |   21 +++--
 ui/vnc.c        |   14 --
 vl.c            |  129 ---
 11 files changed, 278 insertions(+), 90 deletions(-)

-- 
1.7.4
[Qemu-devel] [PATCH 2/7] Introduce -display argument
From: Jes Sorensen jes.soren...@redhat.com

This patch introduces a -display argument which consolidates the setting of the display mode. Valid options are: sdl/curses/default/serial (serial is equivalent to -nographic)

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 qemu-options.hx |   27 +++
 vl.c            |   77 +++
 2 files changed, 104 insertions(+), 0 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index badb730..f08ffb1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -590,6 +590,33 @@ STEXI
 @table @option
 ETEXI
 
+DEF("display", HAS_ARG, QEMU_OPTION_display,
+    "-display sdl[,frame=on|off][,alt_grab=on|off][,ctrl_grab=on|off]\n"
+    "            [,window_close=on|off]|curses|serial\n"
+    "                select display type\n", QEMU_ARCH_ALL)
+STEXI
+@item -display @var{type}
+@findex -display
+Select type of display to use. This option is a replacement for the
+old style -sdl/-curses/... options. Valid values for @var{type} are
+@table @option
+@item sdl
+Pick the SDL display option.
+@item curses
+Pick the curses display option. Normally, QEMU uses SDL to display the
+VGA output. With this option, QEMU can display the VGA output when in
+text mode using a curses/ncurses interface. Nothing is displayed in
+graphical mode.
+@item serial
+Normally, QEMU uses SDL to display the VGA output. With this option,
+you can totally disable graphical output so that QEMU is a simple
+command line application. The emulated serial port is redirected on
+the console. Therefore, you can still use QEMU to debug a Linux kernel
+with a serial console. This option is equivalent to the old -nographic
+argument.
+@end table
+ETEXI
+
 DEF("nographic", 0, QEMU_OPTION_nographic,
     "-nographic      disable graphical output and redirect serial I/Os to console\n",
     QEMU_ARCH_ALL)
diff --git a/vl.c b/vl.c
index 5e007a7..c88ee58 100644
--- a/vl.c
+++ b/vl.c
@@ -1554,6 +1554,80 @@ static void select_vgahw (const char *p)
     }
 }
 
+static DisplayType select_display(const char *p)
+{
+    const char *opts;
+    DisplayType display = DT_DEFAULT;
+
+    if (strstart(p, "sdl", &opts)) {
+#ifdef CONFIG_SDL
+        display = DT_SDL;
+        while (*opts) {
+            const char *nextopt;
+
+            if (strstart(opts, ",frame=", &nextopt)) {
+                opts = nextopt;
+                if (strstart(opts, "on", &nextopt)) {
+                    no_frame = 0;
+                } else if (strstart(opts, "off", &nextopt)) {
+                    no_frame = 1;
+                } else {
+                    goto invalid_display;
+                }
+            } else if (strstart(opts, ",alt_grab=", &nextopt)) {
+                opts = nextopt;
+                if (strstart(opts, "on", &nextopt)) {
+                    alt_grab = 1;
+                } else if (strstart(opts, "off", &nextopt)) {
+                    alt_grab = 0;
+                } else {
+                    goto invalid_display;
+                }
+            } else if (strstart(opts, ",ctrl_grab=", &nextopt)) {
+                opts = nextopt;
+                if (strstart(opts, "on", &nextopt)) {
+                    ctrl_grab = 1;
+                } else if (strstart(opts, "off", &nextopt)) {
+                    ctrl_grab = 0;
+                } else {
+                    goto invalid_display;
+                }
+            } else if (strstart(opts, ",window_close=", &nextopt)) {
+                opts = nextopt;
+                if (strstart(opts, "on", &nextopt)) {
+                    no_quit = 0;
+                } else if (strstart(opts, "off", &nextopt)) {
+                    no_quit = 1;
+                } else {
+                    goto invalid_display;
+                }
+            } else {
+                goto invalid_display;
+            }
+            opts = nextopt;
+        }
+#else
+        fprintf(stderr, "SDL support is disabled\n");
+        exit(1);
+#endif
+    } else if (strstart(p, "curses", &opts)) {
+#ifdef CONFIG_CURSES
+        display = DT_CURSES;
+#else
+        fprintf(stderr, "Curses support is disabled\n");
+        exit(1);
+#endif
+    } else if (strstart(p, "serial", &opts)) {
+        display = DT_NOGRAPHIC;
+    } else {
+invalid_display:
+        fprintf(stderr, "Unknown display type: %s\n", p);
+        exit(1);
+    }
+
+    return display;
+}
+
 static int balloon_parse(const char *arg)
 {
     QemuOpts *opts;
@@ -2152,6 +2226,9 @@ int main(int argc, char **argv, char **envp)
                 }
                 numa_add(optarg);
                 break;
+            case QEMU_OPTION_display:
+                display_type = select_display(optarg);
+                break;
             case QEMU_OPTION_nographic:
                 display_type = DT_NOGRAPHIC;
                 break;
--
1.7.4
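The sub-option handling in select_display() above repeats the same on/off branch four times. Purely as an illustration, and not as part of this patch, the same prefix-matching idea can be written table-driven. Everything named here is an assumption made for the sketch: `sketch_strstart` is a local stand-in with the same contract as QEMU's strstart(), and `struct sdl_opts` and `parse_sdl_subopts` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for QEMU's strstart(): if str begins with prefix, store a
 * pointer to the remainder of str in *rest and return true. */
bool sketch_strstart(const char *str, const char *prefix, const char **rest)
{
    size_t n = strlen(prefix);

    if (strncmp(str, prefix, n) != 0) {
        return false;
    }
    *rest = str + n;
    return true;
}

/* Hypothetical container for the four SDL toggles. */
struct sdl_opts {
    bool frame, alt_grab, ctrl_grab, window_close;
};

/* Parse a string of ",name=on|off" sub-options into *out.
 * Returns false on an unknown name or a value other than on/off. */
bool parse_sdl_subopts(const char *opts, struct sdl_opts *out)
{
    assert(opts != NULL && out != NULL);

    const struct { const char *key; bool *val; } table[] = {
        { ",frame=",        &out->frame },
        { ",alt_grab=",     &out->alt_grab },
        { ",ctrl_grab=",    &out->ctrl_grab },
        { ",window_close=", &out->window_close },
    };
    const size_t count = sizeof(table) / sizeof(table[0]);

    while (*opts) {
        const char *rest = NULL;
        size_t i;

        /* Find the table entry whose key prefixes the remaining input. */
        for (i = 0; i < count; i++) {
            if (sketch_strstart(opts, table[i].key, &rest)) {
                break;
            }
        }
        if (i == count) {
            return false;               /* unknown sub-option */
        }
        /* On a match, opts advances past the consumed value. */
        if (sketch_strstart(rest, "on", &opts)) {
            *table[i].val = true;
        } else if (sketch_strstart(rest, "off", &opts)) {
            *table[i].val = false;
        } else {
            return false;               /* value is neither on nor off */
        }
    }
    return true;
}
```

Because only prefixes are matched, trailing garbage such as ",frame=onx" is caught on the next loop iteration, when "x" matches no key. The sketch reports errors through its return value instead of calling exit(1), which is a deliberate difference from the patch.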
[Qemu-devel] [PATCH 4/7] Add support for -display vnc
From: Jes Sorensen jes.soren...@redhat.com

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 qemu-options.hx |    5 -
 vl.c            |   14 ++
 2 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 80506e7..e2a31bc 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -592,7 +592,8 @@ ETEXI
 
 DEF("display", HAS_ARG, QEMU_OPTION_display,
     "-display sdl[,frame=on|off][,alt_grab=on|off][,ctrl_grab=on|off]\n"
-    "            [,window_close=on|off]|curses|none|serial\n"
+    "            [,window_close=on|off]|curses|none|serial|\n"
+    "            vnc=display[,optargs]\n"
     "                select display type\n", QEMU_ARCH_ALL)
 STEXI
 @item -display @var{type}
@@ -620,6 +621,8 @@ command line application. The emulated serial port is redirected on
 the console. Therefore, you can still use QEMU to debug a Linux kernel
 with a serial console. This option is equivalent to the old -nographic
 argument.
+@item vnc
+Start a VNC server on display arg
 @end table
 ETEXI
 
diff --git a/vl.c b/vl.c
index d12ac96..e58958d 100644
--- a/vl.c
+++ b/vl.c
@@ -1610,6 +1610,20 @@ static DisplayType select_display(const char *p)
         fprintf(stderr, "SDL support is disabled\n");
         exit(1);
 #endif
+    } else if (strstart(p, "vnc", &opts)) {
+        display_remote++;
+
+        if (*opts) {
+            const char *nextopt;
+
+            if (strstart(opts, "=", &nextopt)) {
+                vnc_display = nextopt;
+            }
+        }
+        if (!vnc_display) {
+            fprintf(stderr, "VNC requires a display argument vnc=display\n");
+            exit(1);
+        }
     } else if (strstart(p, "curses", &opts)) {
 #ifdef CONFIG_CURSES
         display = DT_CURSES;
--
1.7.4
[Qemu-devel] [PATCH 1/7] Consolidate DisplaySurface allocation in qemu_alloc_display()
From: Jes Sorensen jes.soren...@redhat.com

This removes various code duplication from console.c and sdl.c

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 console.c |   45 +
 console.h |    3 +++
 ui/sdl.c  |   21 -
 3 files changed, 36 insertions(+), 33 deletions(-)

diff --git a/console.c b/console.c
index 57d6eb5..4939a72 100644
--- a/console.c
+++ b/console.c
@@ -1278,35 +1278,40 @@ static DisplaySurface* defaultallocator_create_displaysurface(int width, int hei
 {
     DisplaySurface *surface = (DisplaySurface*) qemu_mallocz(sizeof(DisplaySurface));
 
-    surface->width = width;
-    surface->height = height;
-    surface->linesize = width * 4;
-    surface->pf = qemu_default_pixelformat(32);
-#ifdef HOST_WORDS_BIGENDIAN
-    surface->flags = QEMU_ALLOCATED_FLAG | QEMU_BIG_ENDIAN_FLAG;
-#else
-    surface->flags = QEMU_ALLOCATED_FLAG;
-#endif
-    surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
-
+    int linesize = width * 4;
+    surface = qemu_alloc_display(surface, width, height, linesize,
+                                 qemu_default_pixelformat(32), 0);
     return surface;
 }
 
 static DisplaySurface* defaultallocator_resize_displaysurface(DisplaySurface *surface,
                                           int width, int height)
 {
+    int linesize = width * 4;
+    surface = qemu_alloc_display(surface, width, height, linesize,
+                                 qemu_default_pixelformat(32), 0);
+    return surface;
+}
+
+DisplaySurface*
+qemu_alloc_display(DisplaySurface *surface, int width, int height,
+                   int linesize, PixelFormat pf, int newflags)
+{
+    void *data;
     surface->width = width;
     surface->height = height;
-    surface->linesize = width * 4;
-    surface->pf = qemu_default_pixelformat(32);
-    if (surface->flags & QEMU_ALLOCATED_FLAG)
-        surface->data = (uint8_t*) qemu_realloc(surface->data, surface->linesize * surface->height);
-    else
-        surface->data = (uint8_t*) qemu_malloc(surface->linesize * surface->height);
+    surface->linesize = linesize;
+    surface->pf = pf;
+    if (surface->flags & QEMU_ALLOCATED_FLAG) {
+        data = qemu_realloc(surface->data,
+                            surface->linesize * surface->height);
+    } else {
+        data = qemu_malloc(surface->linesize * surface->height);
+    }
+    surface->data = (uint8_t *)data;
+    surface->flags = newflags | QEMU_ALLOCATED_FLAG;
 #ifdef HOST_WORDS_BIGENDIAN
-    surface->flags = QEMU_ALLOCATED_FLAG | QEMU_BIG_ENDIAN_FLAG;
-#else
-    surface->flags = QEMU_ALLOCATED_FLAG;
+    surface->flags |= QEMU_BIG_ENDIAN_FLAG;
 #endif
 
     return surface;
diff --git a/console.h b/console.h
index f4e4741..dec9a76 100644
--- a/console.h
+++ b/console.h
@@ -189,6 +189,9 @@ void register_displaystate(DisplayState *ds);
 DisplayState *get_displaystate(void);
 DisplaySurface* qemu_create_displaysurface_from(int width, int height, int bpp,
                                                 int linesize, uint8_t *data);
+DisplaySurface* qemu_alloc_display(DisplaySurface *surface, int width,
+                                   int height, int linesize,
+                                   PixelFormat pf, int newflags);
 PixelFormat qemu_different_endianness_pixelformat(int bpp);
 PixelFormat qemu_default_pixelformat(int bpp);
 
diff --git a/ui/sdl.c b/ui/sdl.c
index 47ac49c..6c10ea6 100644
--- a/ui/sdl.c
+++ b/ui/sdl.c
@@ -176,23 +176,18 @@ static DisplaySurface* sdl_create_displaysurface(int width, int height)
 
     surface->width = width;
     surface->height = height;
-
+
     if (scaling_active) {
+        int linesize;
+        PixelFormat pf;
         if (host_format.BytesPerPixel != 2 && host_format.BytesPerPixel != 4) {
-            surface->linesize = width * 4;
-            surface->pf = qemu_default_pixelformat(32);
+            linesize = width * 4;
+            pf = qemu_default_pixelformat(32);
         } else {
-            surface->linesize = width * host_format.BytesPerPixel;
-            surface->pf = sdl_to_qemu_pixelformat(host_format);
+            linesize = width * host_format.BytesPerPixel;
+            pf = sdl_to_qemu_pixelformat(host_format);
         }
-#ifdef HOST_WORDS_BIGENDIAN
-        surface->flags = QEMU_ALLOCATED_FLAG | QEMU_BIG_ENDIAN_FLAG;
-#else
-        surface->flags = QEMU_ALLOCATED_FLAG;
-#endif
-        surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
-
-        return surface;
+        return qemu_alloc_display(surface, width, height, linesize, pf, 0);
     }
 
     if (host_format.BitsPerPixel == 16)
--
1.7.4
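The consolidation in this patch boils down to one routine that sets the geometry and then either allocates or reallocates the pixel buffer, depending on whether the surface already owns one. A standalone sketch of that pattern follows. It is not the patch's code: `struct sketch_surface`, `SKETCH_ALLOCATED_FLAG` and `sketch_alloc_display` are hypothetical names, plain realloc() stands in for qemu_malloc()/qemu_realloc(), and failures are returned to the caller rather than aborting.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALLOCATED_FLAG 0x01  /* stand-in for QEMU_ALLOCATED_FLAG */

/* Hypothetical, trimmed-down surface; the real DisplaySurface has more fields. */
struct sketch_surface {
    int flags;
    int width, height, linesize;
    uint8_t *data;
};

/* Set the geometry and (re)allocate the pixel buffer in one place.
 * Returns 0 on success, -1 on a bad size or an allocation failure. */
int sketch_alloc_display(struct sketch_surface *s, int width, int height,
                         int linesize, int newflags)
{
    size_t size;
    uint8_t *old, *data;

    assert(s != NULL);
    if (height <= 0 || linesize <= 0) {
        return -1;
    }
    size = (size_t)linesize * (size_t)height;

    /* Only a buffer this code allocated may be passed to realloc();
     * realloc(NULL, size) behaves like malloc(size). */
    old = (s->flags & SKETCH_ALLOCATED_FLAG) ? s->data : NULL;
    data = realloc(old, size);
    if (data == NULL) {
        return -1;
    }

    s->width = width;
    s->height = height;
    s->linesize = linesize;
    s->data = data;
    s->flags = newflags | SKETCH_ALLOCATED_FLAG;
    return 0;
}
```

Note that, as in the patch, a freshly allocated buffer is not zero-filled here, whereas the code being removed used qemu_mallocz() for the initial allocation.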
[Qemu-devel] [PATCH 5/7] error message if user specifies SDL cmd line option when SDL is disabled
From: Jes Sorensen jes.soren...@redhat.com

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 qemu-options.hx |   10 --
 vl.c            |    8
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index e2a31bc..ee7e1d7 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -652,11 +652,9 @@ QEMU can display the VGA output when in text mode using a
 curses/ncurses interface. Nothing is displayed in graphical mode.
 ETEXI
 
-#ifdef CONFIG_SDL
 DEF("no-frame", 0, QEMU_OPTION_no_frame,
     "-no-frame       open SDL window without a frame and window decorations\n",
     QEMU_ARCH_ALL)
-#endif
 STEXI
 @item -no-frame
 @findex -no-frame
@@ -665,42 +663,34 @@ available screen space. This makes the using QEMU in a dedicated desktop
 workspace more convenient.
 ETEXI
 
-#ifdef CONFIG_SDL
 DEF("alt-grab", 0, QEMU_OPTION_alt_grab,
     "-alt-grab       use Ctrl-Alt-Shift to grab mouse (instead of Ctrl-Alt)\n",
     QEMU_ARCH_ALL)
-#endif
 STEXI
 @item -alt-grab
 @findex -alt-grab
 Use Ctrl-Alt-Shift to grab mouse (instead of Ctrl-Alt).
 ETEXI
 
-#ifdef CONFIG_SDL
 DEF("ctrl-grab", 0, QEMU_OPTION_ctrl_grab,
     "-ctrl-grab      use Right-Ctrl to grab mouse (instead of Ctrl-Alt)\n",
     QEMU_ARCH_ALL)
-#endif
 STEXI
 @item -ctrl-grab
 @findex -ctrl-grab
 Use Right-Ctrl to grab mouse (instead of Ctrl-Alt).
 ETEXI
 
-#ifdef CONFIG_SDL
 DEF("no-quit", 0, QEMU_OPTION_no_quit,
     "-no-quit        disable SDL window close capability\n", QEMU_ARCH_ALL)
-#endif
 STEXI
 @item -no-quit
 @findex -no-quit
 Disable SDL window close capability.
 ETEXI
 
-#ifdef CONFIG_SDL
 DEF("sdl", 0, QEMU_OPTION_sdl,
     "-sdl            enable SDL\n", QEMU_ARCH_ALL)
-#endif
 STEXI
 @item -sdl
 @findex -sdl
diff --git a/vl.c b/vl.c
index e58958d..4bc81cf 100644
--- a/vl.c
+++ b/vl.c
@@ -2626,6 +2626,14 @@ int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_sdl:
                 display_type = DT_SDL;
                 break;
+#else
+            case QEMU_OPTION_no_frame:
+            case QEMU_OPTION_alt_grab:
+            case QEMU_OPTION_ctrl_grab:
+            case QEMU_OPTION_no_quit:
+            case QEMU_OPTION_sdl:
+                fprintf(stderr, "SDL support is disabled\n");
+                exit(1);
 #endif
             case QEMU_OPTION_pidfile:
                 pid_file = optarg;
--
1.7.4
[Qemu-devel] [PATCH 6/7] error message if user specifies curses on cmd line when curses is disabled
From: Jes Sorensen jes.soren...@redhat.com

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 qemu-options.hx |    2 --
 vl.c            |    7 +--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index ee7e1d7..b6b125c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -639,11 +639,9 @@ the console. Therefore, you can still use QEMU to debug a Linux kernel
 with a serial console.
 ETEXI
 
-#ifdef CONFIG_CURSES
 DEF("curses", 0, QEMU_OPTION_curses,
     "-curses         use a curses/ncurses interface instead of SDL\n",
     QEMU_ARCH_ALL)
-#endif
 STEXI
 @item -curses
 @findex curses
diff --git a/vl.c b/vl.c
index 4bc81cf..baa267a 100644
--- a/vl.c
+++ b/vl.c
@@ -2248,11 +2248,14 @@ int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_nographic:
                 display_type = DT_NOGRAPHIC;
                 break;
-#ifdef CONFIG_CURSES
             case QEMU_OPTION_curses:
+#ifdef CONFIG_CURSES
                 display_type = DT_CURSES;
-                break;
+#else
+                fprintf(stderr, "Curses support is disabled\n");
+                exit(1);
 #endif
+                break;
             case QEMU_OPTION_portrait:
                 graphic_rotate = 1;
                 break;
--
1.7.4
[Qemu-devel] [PATCH 7/7] Make VNC support optional
From: Jes Sorensen jes.soren...@redhat.com Per default VNC is enabled. Signed-off-by: Jes Sorensen jes.soren...@redhat.com --- Makefile.objs | 19 ++- configure | 37 + console.h | 26 -- monitor.c | 22 ++ qerror.h |3 +++ ui/vnc.c | 14 ++ vl.c | 21 + 7 files changed, 99 insertions(+), 43 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index a52f42f..9796d12 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -127,19 +127,20 @@ common-obj-y += $(addprefix audio/, $(audio-obj-y)) ui-obj-y += keymaps.o ui-obj-$(CONFIG_SDL) += sdl.o sdl_zoom.o x_keymap.o ui-obj-$(CONFIG_CURSES) += curses.o -ui-obj-y += vnc.o d3des.o -ui-obj-y += vnc-enc-zlib.o vnc-enc-hextile.o -ui-obj-y += vnc-enc-tight.o vnc-palette.o -ui-obj-y += vnc-enc-zrle.o -ui-obj-$(CONFIG_VNC_TLS) += vnc-tls.o vnc-auth-vencrypt.o -ui-obj-$(CONFIG_VNC_SASL) += vnc-auth-sasl.o -ui-obj-$(CONFIG_COCOA) += cocoa.o +vnc-obj-y += vnc.o d3des.o +vnc-obj-y += vnc-enc-zlib.o vnc-enc-hextile.o +vnc-obj-y += vnc-enc-tight.o vnc-palette.o +vnc-obj-y += vnc-enc-zrle.o +vnc-obj-$(CONFIG_VNC_TLS) += vnc-tls.o vnc-auth-vencrypt.o +vnc-obj-$(CONFIG_VNC_SASL) += vnc-auth-sasl.o +vnc-obj-$(CONFIG_COCOA) += cocoa.o ifdef CONFIG_VNC_THREAD -ui-obj-y += vnc-jobs-async.o +vnc-obj-y += vnc-jobs-async.o else -ui-obj-y += vnc-jobs-sync.o +vnc-obj-y += vnc-jobs-sync.o endif common-obj-y += $(addprefix ui/, $(ui-obj-y)) +common-obj-$(CONFIG_VNC) += $(addprefix ui/, $(vnc-obj-y)) common-obj-y += iov.o acl.o common-obj-$(CONFIG_POSIX) += qemu-thread-posix.o compatfd.o diff --git a/configure b/configure index a166de0..abd3317 100755 --- a/configure +++ b/configure @@ -117,6 +117,7 @@ kvm= kvm_para= nptl= sdl= +vnc=yes sparse=no uuid= vde= @@ -539,6 +540,10 @@ for opt do ;; --enable-sdl) sdl=yes ;; + --disable-vnc) vnc=no + ;; + --enable-vnc) vnc=yes + ;; --fmod-lib=*) fmod_lib=$optarg ;; --fmod-inc=*) fmod_inc=$optarg @@ -836,6 +841,8 @@ echo --disable-strip disable stripping binaries echo --disable-werror disable compilation abort on 
warning echo --disable-sdldisable SDL echo --enable-sdl enable SDL +echo --disable-vncdisable VNC +echo --enable-vnc enable VNC echo --enable-cocoa enable COCOA (Mac OS X only) echo --audio-drv-list=LISTset audio drivers list: echoAvailable drivers: $audio_possible_drivers @@ -1273,7 +1280,7 @@ fi ## # VNC TLS detection -if test $vnc_tls != no ; then +if test $vnc = yes -a $vnc_tls != no ; then cat $TMPC EOF #include gnutls/gnutls.h int main(void) { gnutls_session_t s; gnutls_init(s, GNUTLS_SERVER); return 0; } @@ -1293,7 +1300,7 @@ fi ## # VNC SASL detection -if test $vnc_sasl != no ; then +if test $vnc = yes -a $vnc_sasl != no ; then cat $TMPC EOF #include sasl/sasl.h #include stdio.h @@ -1315,7 +1322,7 @@ fi ## # VNC JPEG detection -if test $vnc_jpeg != no ; then +if test $vnc = yes -a $vnc_jpeg != no ; then cat $TMPC EOF #include stdio.h #include jpeglib.h @@ -1336,7 +1343,7 @@ fi ## # VNC PNG detection -if test $vnc_png != no ; then +if test $vnc = yes -a $vnc_png != no ; then cat $TMPC EOF //#include stdio.h #include png.h @@ -2495,11 +2502,14 @@ echo Audio drivers $audio_drv_list echo Extra audio cards $audio_card_list echo Block whitelist $block_drv_whitelist echo Mixer emulation $mixemu -echo VNC TLS support $vnc_tls -echo VNC SASL support $vnc_sasl -echo VNC JPEG support $vnc_jpeg -echo VNC PNG support $vnc_png -echo VNC thread$vnc_thread +echo VNC support $vnc +if test $vnc = yes ; then +echo VNC TLS support $vnc_tls +echo VNC SASL support $vnc_sasl +echo VNC JPEG support $vnc_jpeg +echo VNC PNG support $vnc_png +echo VNC thread$vnc_thread +fi if test -n $sparc_cpu; then echo Target Sparc Arch $sparc_cpu fi @@ -2649,6 +2659,9 @@ echo CONFIG_BDRV_WHITELIST=$block_drv_whitelist $config_host_mak if test $mixemu = yes ; then echo CONFIG_MIXEMU=y $config_host_mak fi +if test $vnc = yes ; then + echo CONFIG_VNC=y $config_host_mak +fi if test $vnc_tls = yes ; then echo CONFIG_VNC_TLS=y $config_host_mak echo VNC_TLS_CFLAGS=$vnc_tls_cflags $config_host_mak @@ 
-2657,15 +2670,15 @@ if test $vnc_sasl = yes ; then echo CONFIG_VNC_SASL=y $config_host_mak echo VNC_SASL_CFLAGS=$vnc_sasl_cflags $config_host_mak fi -if test $vnc_jpeg != no ; then +if test $vnc_jpeg = yes ; then echo CONFIG_VNC_JPEG=y $config_host_mak echo VNC_JPEG_CFLAGS=$vnc_jpeg_cflags $config_host_mak fi -if test $vnc_png != no ; then +if
[Qemu-devel] [PATCH 3/7] Introduce -display none
From: Jes Sorensen jes.soren...@redhat.com

New option -display none. This option differs from -display nographic by not trying to take control of stdio etc. but instead behaves as if a graphics display is enabled, except that it doesn't show one.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 qemu-options.hx |    8 +++-
 sysemu.h        |    1 +
 vl.c            |    2 ++
 3 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index f08ffb1..80506e7 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -592,7 +592,7 @@ ETEXI
 
 DEF("display", HAS_ARG, QEMU_OPTION_display,
     "-display sdl[,frame=on|off][,alt_grab=on|off][,ctrl_grab=on|off]\n"
-    "            [,window_close=on|off]|curses|serial\n"
+    "            [,window_close=on|off]|curses|none|serial\n"
     "                select display type\n", QEMU_ARCH_ALL)
 STEXI
 @item -display @var{type}
@@ -607,6 +607,12 @@ Pick the curses display option. Normally, QEMU uses SDL to display the
 VGA output. With this option, QEMU can display the VGA output when in
 text mode using a curses/ncurses interface. Nothing is displayed in
 graphical mode.
+@item none
+Pick the none display option. This option will still run with an
+emulated graphics card, but none will be displayed to the QEMU
+user. This options differs from the -nographic option in that QEMU
+will behave like if one of the display options had been picked, it
+will not change the control on the command line.
 @item serial
 Normally, QEMU uses SDL to display the VGA output. With this option,
 you can totally disable graphical output so that QEMU is a simple
diff --git a/sysemu.h b/sysemu.h
index 0a83ab9..c43c7af 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -110,6 +110,7 @@ typedef enum DisplayType
     DT_CURSES,
     DT_SDL,
     DT_NOGRAPHIC,
+    DT_NONE,
 } DisplayType;
 
 extern int autostart;
diff --git a/vl.c b/vl.c
index c88ee58..d12ac96 100644
--- a/vl.c
+++ b/vl.c
@@ -1619,6 +1619,8 @@ static DisplayType select_display(const char *p)
 #endif
     } else if (strstart(p, "serial", &opts)) {
         display = DT_NOGRAPHIC;
+    } else if (strstart(p, "none", &opts)) {
+        display = DT_NONE;
     } else {
 invalid_display:
         fprintf(stderr, "Unknown display type: %s\n", p);
--
1.7.4
Re: [Qemu-devel] segmentation fault in qemu-kvm-0.14.0
On 09.03.2011 08:26, Stefan Weil wrote: Am 08.03.2011 23:53, schrieb Peter Lieven: Hi, during testing of qemu-kvm-0.14.0 i can reproduce the following segfault. i have seen similar crash already in 0.13.0, but had no time to debug. my guess is that this segfault is related to the threaded vnc server which was introduced in qemu 0.13.0. the bug is only triggerable if a vnc client is attached. it might also be connected to a resolution change in the guest. i have a backtrace attached. the debugger is still running if someone needs more output Reading symbols from /usr/local/bin/qemu-system-x86_64...done. (gdb) r -net tap,vlan=141,script=no,downscript=no,ifname=tap0 -net nic,vlan=141,model=rtl8139,macaddr=52:54:00:ff:00:93 -drive format=host_device,file=/dev/mapper/iqn.2001-05.co m.equallogic:0-8a0906-e6b70e107-e87000e7acf4d4e5-lieven-winxp-r17453,if=ide,boot=on,cache=none,aio=native -m 1024 -monitor tcp:0:4001,server,nowait -vnc :1 -name 'lieven-winxp-te st' -boot order=c,menu=on -k de -pidfile /var/run/qemu/vm-265.pid -mem-path /hugepages -mem-prealloc -cpu qemu64,model_id='Intel(R) Xeon(R) CPU E5640 @ 2.67GHz',-n x -rtc base=localtime,clock=vm -vga cirrus -usb -usbdevice tablet Starting program: /usr/local/bin/qemu-system-x86_64 -net tap,vlan=141,script=no,downscript=no,ifname=tap0 -net nic,vlan=141,model=rtl8139,macaddr=52:54:00:ff:00:93 -drive format =host_device,file=/dev/mapper/iqn.2001-05.com.equallogic:0-8a0906-e6b70e107-e87000e7acf4d4e5-lieven-winxp-r17453,if=ide,boot=on,cache=none,aio=native -m 1024 -monitor tcp:0:4001, server,nowait -vnc :1 -name 'lieven-winxp-test' -boot order=c,menu=on -k de -pidfile /var/run/qemu/vm-265.pid -mem-path /hugepages -mem-prealloc -cpu qemu64,model_id='Intel(R ) Xeon(R) CPU E5640 @ 2.67GHz',-nx -rtc base=localtime,clock=vm -vga cirrus -usb -usbdevice tablet [Thread debugging using libthread_db enabled] [New Thread 0x7694e700 (LWP 29042)] [New Thread 0x76020700 (LWP 29043)] [New Thread 0x7581f700 (LWP 29074)] [Thread 
0x7581f700 (LWP 29074) exited] [New Thread 0x7581f700 (LWP 29124)] [Thread 0x7581f700 (LWP 29124) exited] [New Thread 0x7581f700 (LWP 29170)] [Thread 0x7581f700 (LWP 29170) exited] [New Thread 0x7581f700 (LWP 29246)] [Thread 0x7581f700 (LWP 29246) exited] [New Thread 0x7581f700 (LWP 29303)] [Thread 0x7581f700 (LWP 29303) exited] [New Thread 0x7581f700 (LWP 29349)] [Thread 0x7581f700 (LWP 29349) exited] [New Thread 0x7581f700 (LWP 29399)] [Thread 0x7581f700 (LWP 29399) exited] [New Thread 0x7581f700 (LWP 29471)] [Thread 0x7581f700 (LWP 29471) exited] [New Thread 0x7581f700 (LWP 29521)] [Thread 0x7581f700 (LWP 29521) exited] [New Thread 0x7581f700 (LWP 29593)] [Thread 0x7581f700 (LWP 29593) exited] [New Thread 0x7581f700 (LWP 29703)] [Thread 0x7581f700 (LWP 29703) exited] Program received signal SIGSEGV, Segmentation fault. 0x in ?? () (gdb) (gdb) thread apply all bt full Thread 3 (Thread 0x76020700 (LWP 29043)): #0 0x779c385c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0 No symbol table info available. 
#1 0x004d3ae1 in qemu_cond_wait (cond=0x1612d50, mutex=0x1612d80) at qemu-thread.c:133 err = 0 __func__ = qemu_cond_wait #2 0x004d2b39 in vnc_worker_thread_loop (queue=0x1612d50) at ui/vnc-jobs-async.c:198 job = 0x7058cd20 entry = 0x0 tmp = 0x0 vs = {csock = -1, ds = 0x15cb380, dirty = {{0, 0, 0, 0, 0} repeats 2048 times}, vd = 0x1607ff0, need_update = 0, force_update = 0, features = 243, absolute = 0, last_x = 0, last_y = 0, client_width = 0, client_height = 0, vnc_encoding = 7, major = 0, minor = 0, challenge = '\000' repeats 15 times, info = 0x0, output = {capacity = 3194, offset = 2723, buffer = 0x1fbbfd0 }, input = {capacity = 0, offset = 0, buffer = 0x0}, write_pixels = 0x4c4bc9 vnc_write_pixels_generic, clientds = {flags = 0 '\000', width = 720, height = 400, linesize = 2880, data = 0x76021010 Address 0x76021010 out of bounds, pf = {bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 0, gmask = 0, bmask = 0, amask = 0, rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\000', ashift = 24 '\030', rmax = 255 '\377', gmax = 255 '\377', bmax = 255 '\377', amax = 255 '\377', rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 8 '\b'}}, audio_cap = 0x0, as = {freq = 0, nchannels = 0, fmt = AUD_FMT_U8, endianness = 0}, read_handler = 0, read_handler_expect = 0, modifiers_state = '\000' repeats 255 times, led = 0x0, abort = false, output_mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' repeats 39 times, __align = 0}}, tight = { type = 7, quality = 255 '\377', compression = 9 '\t', pixel24 = 1 '\001', tight = {capacity = 3146708, offset = 1376, buffer = 0x74684010 }, tmp = {capacity = 3194, offset = 2672, buffer =
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On 03/14/11 17:40, Alon Levy wrote: On Mon, Mar 14, 2011 at 04:20:22PM +0100, Jes Sorensen wrote: ok, here is a note where I kinda ignored my own wishes but I want to be very clear on them: libcacard should not be part of qemu. it is here because I once thought it would speed things up. So I'm not taking it out or anything - it's fine with me that it goes into qemu, just as long as it's understood that I'm now maintaining another copy of it for usage outside of qemu, in the spice client (or any other client for that matter - it will be the same when we do vnc support for this). Hi Alon, This bit is somewhat problematic. If QEMU is maintaining a copy of libcacard, then that has to comply with the QEMU way of doing things. QEMU cannot rely on various portions in the tree behaving in different ways. Otherwise it really should be an external library requirement pulled in by the build. I am not sure what is the best way, if it stays in QEMU people will eventually start making modifications to it, without looking at the other copy that is being maintained. Alternatively the external apps that build against it should be taught to link with the QEMU version. Cheers, Jes
[Qemu-devel] [Bug 735454] [NEW] live kvm migration with non-shared storage corrupts file system
Public bug reported:

Description of problem:
Migrating a kvm guest on non-shared lvm-storage using block migration (-b flag) results in a corrupted file system if that guest is under considerable I/O load.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.3
linux-kernel-2.6.32
lvm2-2.02.54

How reproducible:
The error can be reproduced consistently.

Steps to Reproduce:
1. create a guest using lvm-based storage
2. create an LV on the destination node for the guest to be migrated to
3. place the attached scripts somewhere on the guest's system
4. run 'runlots'
5. migrate the guest using the -b flag
6. if the migration doesn't complete in an appropriate amount of time (45 minutes for our 100GB image), it will be necessary to stop the test scripts: type 'killall python'
7. attempt to shut down the guest, forcing it off if necessary
8. access the partitions of the LV on the node: 'partprobe /dev/mapper/volume-name'
9. run fsck: 'fsck -n -f /dev/mapper/volume-namep1'

Actual results:
You should see a big mess of errors, that go beyond what can be accounted for by an unclean shutdown.

Expected results:
Expected is a clean bill of health from fsck.

Additional information:
I suspect that there is some sort of race condition in the live synchronization algorithm for dirty blocks used for block migration.

Workaround:
The only safe way to migrate guests in this scenario is by suspending them just prior to the migration. That way they are first suspended, then everything is transferred, and finally resumed on the target node. When the I/O load is low, the migration works live, as well. However, this is too risky to use on production systems because there is no way to tell when the I/O load is too high for a successful live migration. Using this workaround is very dissatisfying because for a guest with a 100GB filesystem, the migration takes 45 minutes on our systems, meaning that we have a downtime of 45 minutes.
Having migrated other guests with 0 downtime got us hooked. The attached scripts to simulate high I/O load are somewhat artificial in nature. However, the bug is motivated by a real-world scenario: We migrated a productive mail-server that subsequently became buggy, finally crashed and corrupted several of our customers' e-mails. Unfortunately a bug of this nature can't be tested on non-productive systems, because they don't reach the necessary load levels. The scripts reliably reproduce the failure experienced by our mail-server.

** Affects: qemu
Importance: Undecided
Status: New

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/735454

Title: live kvm migration with non-shared storage corrupts file system
Status in QEMU: New
[Qemu-devel] [Bug 735454] Re: live kvm migration with non-shared storage corrupts file system
** Attachment added: A set of scripts to exercise the file system https://bugs.launchpad.net/bugs/735454/+attachment/1910025/+files/uglyfstest.tbz2 -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/735454 Title: live kvm migration with non-shared storage corrupts file system Status in QEMU: New Bug description: Description of problem: Migrating a kvm guest on non-shared lvm-storage using block migration (-b flag) results in a corrupted file system if that guest is under considerable I/O load. Version-Release number of selected component (if applicable): qemu-kvm-0.12.3 linux-kernel-2.6.32 lvm2-2.02.54 How reproducible: The error can be reproduced consistently. Steps to Reproduce: 1. create a guest using lvm-based storage 2. create an LV on the destination node for the guest to be migrated to 3. place the attached scripts somewhere on the guest's system 4. run 'runlots' 5. migrate the guest using the -b flag 6. if the migration doesn't complete in an appropriate amount of time (45 minutes for our 100GB image), it will be necessary to stop the test scripts: type 'killall python' 7. attempt to shut down the guest, forcing it off if necessary 8. access the partitions of the LV on the node: 'partprobe /dev/mapper /volume-name' 9. run fsck: 'fsck -n -f /dev/mapper/volume-namep1' Actual results: You should see a big mess of errors, that go beyond what can be accounted for by an unclean shutdown. Expected results: Expected is a clean bill of health from fsck. Additional information: I suspect that there is some sort of race condition in the live synchronization algorithm for dirty blocks used for block migration. Workaround: The only safe way to migrate guests in this scenario is by suspending them just prior to the migration. That way they are first suspended, then everything is transferred, and finally resumed on the target node. 
When the I/O load is low, the migration also works live. However, this is too risky to use on production systems because there is no way to tell when the I/O load is too high for a successful live migration. Using this workaround is very unsatisfying because for a guest with a 100GB filesystem, the migration takes 45 minutes on our systems, meaning that we have a downtime of 45 minutes. Having migrated other guests with zero downtime got us hooked. The attached scripts to simulate high I/O load are somewhat artificial in nature. However, the bug is motivated by a real-world scenario: we migrated a production mail server that subsequently became buggy, finally crashed, and corrupted several of our customers' e-mails. Unfortunately, a bug of this nature can't be tested on non-production systems, because they don't reach the necessary load levels. The scripts reliably reproduce the failure experienced by our mail server.
[Qemu-devel] Re: [v1 PATCH 2/3]: Helper routines to use GLib threadpool infrastructure in 9pfs.
On 03/15/2011 05:38 AM, Arun R Bharadwaj wrote: * Arun R Bharadwaja...@linux.vnet.ibm.com [2011-03-15 16:04:53]: Author: Arun R Bharadwaja...@linux.vnet.ibm.com Date: Thu Mar 10 15:11:49 2011 +0530 Helper routines to use GLib threadpool infrastructure in 9pfs. This patch creates helper routines to make use of the threadpool infrastructure provided by GLib. This is based on the prototype patch by Anthony which does a similar thing for posix-aio-compat.c An example use case is provided in the next patch where one of the syscalls in 9pfs is converted into the threaded model using these helper routines. Signed-off-by: Arun R Bharadwaja...@linux.vnet.ibm.com Reviewed-by: Aneesh Kumar K.Vaneesh.ku...@linux.vnet.ibm.com Why even bothering signaling for completion with the virtio-9p threadpool? There's no sane guest that's going to poll on virtio-9p completion with interrupts disabled and no timer. Once we enable the I/O thread by default, it won't even be necessary for the paio layer. Regards, Anthony Liguori diff --git a/hw/9pfs/virtio-9p.c b/hw/9pfs/virtio-9p.c index dceefd5..cf61345 100644 --- a/hw/9pfs/virtio-9p.c +++ b/hw/9pfs/virtio-9p.c @@ -18,6 +18,8 @@ #include fsdev/qemu-fsdev.h #include virtio-9p-debug.h #include virtio-9p-xattr.h +#include signal.h +#include qemu-thread.h int debug_9p_pdu; static void v9fs_reclaim_fd(V9fsState *s); @@ -36,6 +38,89 @@ enum { Oappend = 0x80, }; +typedef struct V9fsPool { +GThreadPool *pool; +GList *requests; +int rfd; +int wfd; +} V9fsPool; + +static V9fsPool v9fs_pool; + +static void v9fs_qemu_submit_request(V9fsRequest *req) +{ +V9fsPool *p =v9fs_pool; + +p-requests = g_list_append(p-requests, req); +g_thread_pool_push(v9fs_pool.pool, req, NULL); +} + +static void die2(int err, const char *what) +{ +fprintf(stderr, %s failed: %s\n, what, strerror(err)); +abort(); +} + +static void die(const char *what) +{ +die2(errno, what); +} + +static void v9fs_qemu_process_post_ops(void *arg) +{ +struct V9fsPool *p =v9fs_pool; +struct 
V9fsPostOp *post_op; +char byte; +ssize_t len; +GList *cur_req, *next_req; + +do { +len = read(p-rfd,byte, sizeof(byte)); +} while (len == -1 errno == EINTR); + +for (cur_req = p-requests; cur_req != NULL; cur_req = next_req) { +V9fsRequest *req = cur_req-data; +next_req = g_list_next(cur_req); + +if (!req-done) { +continue; +} + +post_op =req-post_op; +post_op-func(post_op-arg); +p-requests = g_list_remove_link(p-requests, cur_req); +g_list_free(p-requests); +} +} + +static inline void v9fs_thread_signal(void) +{ +struct V9fsPool *p =v9fs_pool; +char byte = 0; +ssize_t ret; + +do { +ret = write(p-wfd,byte, sizeof(byte)); +} while (ret == -1 errno == EINTR); + +if (ret 0 errno != EAGAIN) { +die(write() in v9fs); +} + +if (kill(getpid(), SIGUSR2)) { +die(kill failed); +} +} + +static void v9fs_thread_routine(gpointer data, gpointer user_data) +{ +V9fsRequest *req = data; + +req-func(req); +v9fs_thread_signal(); +req-done = 1; +} + static int omode_to_uflags(int8_t mode) { int ret = 0; @@ -3850,7 +3935,8 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf) int i, len; struct stat stat; FsTypeEntry *fse; - +int fds[2]; +V9fsPool *p =v9fs_pool; s = (V9fsState *)virtio_common_init(virtio-9p, VIRTIO_ID_9P, @@ -3939,5 +4025,21 @@ VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf) s-tag_len; s-vdev.get_config = virtio_9p_get_config; +if (qemu_pipe(fds) == -1) { +fprintf(stderr, failed to create fd's for virtio-9p\n); +exit(1); +} + +p-pool = g_thread_pool_new(v9fs_thread_routine, p, 8, FALSE, NULL); +p-rfd = fds[0]; +p-wfd = fds[1]; + +fcntl(p-rfd, F_SETFL, O_NONBLOCK); +fcntl(p-wfd, F_SETFL, O_NONBLOCK); + +qemu_set_fd_handler(p-rfd, v9fs_qemu_process_post_ops, NULL, NULL); + +(void) v9fs_qemu_submit_request; + returns-vdev; } diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h index 10809ba..e7d2326 100644 --- a/hw/9pfs/virtio-9p.h +++ b/hw/9pfs/virtio-9p.h @@ -124,6 +124,20 @@ struct V9fsPDU QLIST_ENTRY(V9fsPDU) next; }; +typedef struct 
V9fsPostOp { +/* Post Operation routine to execute after executing syscall */ +void (*func)(void *arg); +void *arg; +} V9fsPostOp; + +typedef struct V9fsRequest { +void (*func)(struct V9fsRequest *req); + +/* Flag to indicate that request is satisfied, ready for post-processing */ +int done; + +V9fsPostOp post_op; +} V9fsRequest; /* FIXME * 1) change user needs to set groups and stuff
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On Tue, Mar 15, 2011 at 01:42:56PM +0100, Jes Sorensen wrote: On 03/14/11 17:40, Alon Levy wrote: On Mon, Mar 14, 2011 at 04:20:22PM +0100, Jes Sorensen wrote: ok, here is a note where I kinda ignored my own wishes but I want to be very clear on them: libcacard should not be part of qemu. it is here because I once thought it would speed things up. So I'm not taking it out or anything - it's fine with me that it goes into qemu, just as long as it's understood that I'm now maintaining another copy of it for usage outside of qemu, in the spice client (or any other client for that matter - it will be the same when we do vnc support for this). Hi Alon, This bit is somewhat problematic. If QEMU is maintaining a copy of libcacard, then that has to comply with the QEMU way of doing things. QEMU cannot rely on various portions in the tree behaving in different ways. Otherwise it really should be an external library requirement pulled in by the build. I am not sure what is the best way, if it stays in QEMU people will eventually start making modifications to it, without looking at the other copy that is being maintained. Yeah, I've already decided (actually minutes after sending this email) to go the route of keeping a copy of qemu-thread* in the external library, since the api is basically just a mirror of pthreads it was nothing but a few renames. Alternatively the external apps that build against it should be taught to link with the QEMU version. That would require me to teach qemu's configure to build libcacard, possibly only libcacard (even though qemu doesn't need a lot of packages by itself, I still wouldn't want apt-get install spice-client to drag in qemu-kvm). Cheers, Jes
[Qemu-devel] [PATCH] Autodetect clock_gettime
Some POSIX OSes (such as Darwin) doesn't have clock_gettime. This patch falls back on gettimeofday if clock_gettime is not available. Signed-off-by: Tristan Gingold ging...@adacore.com --- configure | 11 --- qemu-thread-posix.c | 17 +++-- 2 files changed, 23 insertions(+), 5 deletions(-) diff --git a/configure b/configure index c18f571..6e6cd35 100755 --- a/configure +++ b/configure @@ -2236,17 +2236,18 @@ if compile_prog ; then fi ## -# Do we need librt +# Do we need clock_gettime + librt +clock_gettime=no cat $TMPC EOF -#include signal.h #include time.h int main(void) { clockid_t id; return clock_gettime(id, NULL); } EOF if compile_prog ; then - : + clock_gettime=yes elif compile_prog -lrt ; then LIBS=-lrt $LIBS + clock_gettime=yes fi if test $darwin != yes -a $mingw32 != yes -a $solaris != yes -a \ @@ -2530,6 +2531,7 @@ echo preadv support$preadv echo fdatasync $fdatasync echo madvise $madvise echo posix_madvise $posix_madvise +echo clock_gettime $clock_gettime echo uuid support $uuid echo vhost-net support $vhost_net echo Trace backend $trace_backend @@ -2679,6 +2681,9 @@ fi if test $fnmatch = yes ; then echo CONFIG_FNMATCH=y $config_host_mak fi +if test $clock_gettime = yes ; then + echo CONFIG_CLOCK_GETTIME=y $config_host_mak +fi if test $uuid = yes ; then echo CONFIG_UUID=y $config_host_mak fi diff --git a/qemu-thread-posix.c b/qemu-thread-posix.c index 87c1a9f..dbe14c3 100644 --- a/qemu-thread-posix.c +++ b/qemu-thread-posix.c @@ -61,6 +61,19 @@ int qemu_mutex_trylock(QemuMutex *mutex) return pthread_mutex_trylock(mutex-lock); } +static void qemu_gettime(struct timespec *ts) +{ +#ifdef CONFIG_CLOCK_GETTIME +clock_gettime(CLOCK_REALTIME, ts); +#else +struct timeval tv; + +gettimeofday(tv, NULL); +ts-tv_sec = tv.tv_sec; +ts-tv_nsec = tv.tv_usec * 1000; +#endif +} + static void timespec_add_ms(struct timespec *ts, uint64_t msecs) { ts-tv_sec = ts-tv_sec + (long)(msecs / 1000); @@ -76,7 +89,7 @@ int qemu_mutex_timedlock(QemuMutex *mutex, uint64_t msecs) int 
err; struct timespec ts; -clock_gettime(CLOCK_REALTIME, ts); +qemu_gettime(ts); timespec_add_ms(ts, msecs); err = pthread_mutex_timedlock(mutex-lock, ts); @@ -144,7 +157,7 @@ int qemu_cond_timedwait(QemuCond *cond, QemuMutex *mutex, uint64_t msecs) struct timespec ts; int err; -clock_gettime(CLOCK_REALTIME, ts); +qemu_gettime(ts); timespec_add_ms(ts, msecs); err = pthread_cond_timedwait(cond-cond, mutex-lock, ts); -- 1.7.3.GIT
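The fallback in the patch above can be sketched in isolation as follows. This is a minimal sketch, not the patch itself: the names gettime_fallback and timespec_add_ms_sketch are made up for illustration, and the normalisation of tv_nsec is an assumption about what the timeout helper needs to do, since only its first line is visible in the diff context.

```c
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: wall-clock "now" as a timespec via gettimeofday(). */
static void gettime_fallback(struct timespec *ts)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    ts->tv_sec = tv.tv_sec;
    ts->tv_nsec = tv.tv_usec * 1000L;   /* microseconds -> nanoseconds */
}

/* Hypothetical helper: add a millisecond timeout, keeping tv_nsec below 1e9. */
static void timespec_add_ms_sketch(struct timespec *ts, uint64_t msecs)
{
    ts->tv_sec += (time_t)(msecs / 1000);
    ts->tv_nsec += (long)(msecs % 1000) * 1000000L;
    if (ts->tv_nsec >= 1000000000L) {
        ts->tv_sec++;
        ts->tv_nsec -= 1000000000L;
    }
}
```

The resulting timespec is an absolute wall-clock deadline, which is the form the pthread timed lock and timed wait calls in the patch expect.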
[Qemu-devel] [PATCH] cocoa: do not create a spurious window for -version
When invoked with -version, qemu will exit just after displaying the version,
so there is no need to create a window.
Also handles --XXX options.

Signed-off-by: Tristan Gingold ging...@adacore.com
---
 ui/cocoa.m |   15 ++++++++++++---
 1 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/ui/cocoa.m b/ui/cocoa.m
index 20f91bc..1ff1ac6 100644
--- a/ui/cocoa.m
+++ b/ui/cocoa.m
@@ -865,10 +865,19 @@ int main (int argc, const char * argv[]) {
 
     /* In case we don't need to display a window, let's not do that */
     for (i = 1; i < argc; i++) {
-        if (!strcmp(argv[i], "-vnc") ||
-            !strcmp(argv[i], "-nographic") ||
-            !strcmp(argv[i], "-curses")) {
+        const char *opt = argv[i];
+
+        if (opt[0] == '-') {
+            /* Treat --foo the same as -foo.  */
+            if (opt[1] == '-') {
+                opt++;
+            }
+            if (!strcmp(opt, "-vnc") ||
+                !strcmp(opt, "-nographic") ||
+                !strcmp(opt, "-version") ||
+                !strcmp(opt, "-curses")) {
                 return qemu_main(gArgc, gArgv);
+            }
         }
     }
 
-- 
1.7.3.GIT
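The option check in this patch can be tried outside of Cocoa with a small stand-alone function. The following is only a sketch of the same idea: the function name runs_without_window and the table-driven lookup are illustrative choices, not what ui/cocoa.m does.

```c
#include <string.h>

/* Returns 1 when the command line selects a mode that needs no window. */
static int runs_without_window(int argc, const char *argv[])
{
    static const char *const headless[] = {
        "-vnc", "-nographic", "-version", "-curses",
    };
    int i;
    size_t j;

    for (i = 1; i < argc; i++) {
        const char *opt = argv[i];

        if (opt[0] != '-') {
            continue;
        }
        if (opt[1] == '-') {
            opt++;              /* treat --foo the same as -foo */
        }
        for (j = 0; j < sizeof(headless) / sizeof(headless[0]); j++) {
            if (!strcmp(opt, headless[j])) {
                return 1;
            }
        }
    }
    return 0;
}
```

Like the patch, this looks at every argument, so an option value that happens to be spelled like one of the listed options would also match.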
[Qemu-devel] [PATCH] Fix net_check_clients warnings: make it per vlan.
Signed-off-by: Tristan Gingold ging...@adacore.com
---
 net.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/net.c b/net.c
index ddcca97..b2dfaa8 100644
--- a/net.c
+++ b/net.c
@@ -1305,9 +1305,10 @@ void net_check_clients(void)
 {
     VLANState *vlan;
     VLANClientState *vc;
-    int has_nic = 0, has_host_dev = 0;
 
     QTAILQ_FOREACH(vlan, &vlans, next) {
+        int has_nic = 0, has_host_dev = 0;
+
         QTAILQ_FOREACH(vc, &vlan->clients, next) {
             switch (vc->info->type) {
             case NET_CLIENT_TYPE_NIC:
-- 
1.7.3.GIT
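Why the declaration has to move inside the loop can be shown with a reduced model. This is a sketch under assumptions: struct vlan_sketch, enum client_type and count_unbalanced_vlans are invented stand-ins for the real VLANState machinery, and a "warning" is modelled as a vlan that has a NIC but no host device, or the reverse.

```c
#include <stddef.h>

enum client_type { CLIENT_NIC, CLIENT_HOST };

/* Hypothetical, simplified vlan: just an array of client types. */
struct vlan_sketch {
    const enum client_type *clients;
    size_t n_clients;
};

/* Counts the vlans that would trigger a warning. */
static int count_unbalanced_vlans(const struct vlan_sketch *vlans, size_t n)
{
    int warnings = 0;
    size_t i, j;

    for (i = 0; i < n; i++) {
        /* Declared inside the loop so that every vlan starts from zero. */
        int has_nic = 0, has_host_dev = 0;

        for (j = 0; j < vlans[i].n_clients; j++) {
            if (vlans[i].clients[j] == CLIENT_NIC) {
                has_nic = 1;
            } else {
                has_host_dev = 1;
            }
        }
        if (has_nic != has_host_dev) {
            warnings++;
        }
    }
    return warnings;
}
```

With the flags declared outside the loop, a vlan holding only a NIC would inherit has_host_dev from an earlier, well-formed vlan and no warning would be issued for it.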
Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
On 03/15/2011 05:09 AM, Kevin Wolf wrote: 5) Very complex data types can be implemented. We had some discussion of supporting nested structures with -blockdev. This wouldn't work with QemuOpts but I've already implemented it with QCFG (blockdev syntax is my test case right now). The syntax I'm currently using is -blockdev cache=none,id=foo,format.qcow.protocol.nbd.hostname=localhost where '.' is used to reference sub structures. Do you have an example from your implementation for this? It's not exhaustive as I'm only using this for testing but here's what I've been working with: { 'type': 'ProbeProtocol', 'data': { 'unsafe': 'bool', 'filename': 'str' } } { 'type': 'FileProtocol', 'data': { 'filename': 'str' } } { 'type': 'HostDeviceProtocol', 'data': { 'device': 'str' } } { 'type': 'NbdProtocol', 'data': { 'hostname': 'str', 'port': 'int' } } { 'union': 'BlockdevProtocol', 'data': { 'probe': 'ProbeProtocol', 'file': 'FileProtocol', 'host-dev': 'HostDeviceProtocol', 'nbd': 'NbdProtocol' } } { 'type': 'ProbeFormat', 'data': { '*unsafe': 'bool', 'protocol': 'BlockdevProtocol' } } { 'type': 'RawFormat', 'data': { 'protocol': 'BlockdevProtocol' } } { 'type': 'Qcow2Format', 'data': { 'protocol': 'BlockdevProtocol', '*backing-file': 'BlockdevFormat' } } { 'type': 'QedFormat', 'data': { 'protocol': 'BlockdevProtocol', '*backing-file': 'BlockdevFormat', '*copy-on-read': 'bool' } } { 'union': 'BlockdevFormat', 'data': { 'probe': 'ProbeFormat', 'raw': 'RawFormat', 'qcow2': 'Qcow2Format', 'qed': 'QedFormat' } } { 'enum': 'BlockdevCacheSetting', 'data': [ 'none', 'writethrough', 'writeback' ] } { 'type': 'BlockdevConfig', 'data': { 'id': 'str', 'format': 'BlockdevFormat', '*cache': 'BlockdevCacheSetting', '*device': 'str' } } { 'option': 'blockdev', 'data': 'BlockdevConfig', 'implicit': 'id' } Choosing a union is implicit in selecting the union value. This was done to simplify the command line. 
Here are some examples: # create a blockdev using probing -blockdev my-image.qcow2,id=ide0-hd0 # create a blockdev using probing without relying on implicit keys and allowing unsafe probing -blockdev format.probe.unsafe=on,format.probe.protocol.file.filename=my-image.qcow2,id=ide0-hd0 # create a blockdev using qcow2 over NBD with a qed backing file -blockdev format.qcow2.protocol.nbd={hostname=localhost,port=1025},\ format.qcow2.backing-file.format.qed.protocol.nbd={hostname=localhost,port=1026},\ id=ide0-hd0 It looks less awkward in config file format: [blockdev] id = ide0-hd0 format.qcow2.protocol.nbd.hostname = localhost format.qcow2.protocol.nbd.port = 1025 format.qcow2.backing-file.format.qed.protocol.nbd.hostname = localhost format.qcow2.backing-file.format.qed.protocol.nbd.port = 1026 And with a syntax this complex, errors are important. Here are some examples of Error messages: #./test-qcfg format.qcow2.file.filename=image.img,id=ide0-hd0 -blockdev: Parameter 'format.qcow2.protocol' is missing # ./test-qcfg format.qcow2.protocol.file.filename=image.img,format.qcow3.backing-file.format.qcow2.protocol.file.filename=foo.img,id=ide0-hd0 -blockdev: Invalid parameter 'format.qcow3.backing-file.format.qcow2.protocol.file.filename' #./test-qcfg format.qcow2.protocol.file.filename=image.img,id=ide0-hd0,cache=no-thank-you -blockdev: Enum 'cache' with value 'no-thank-you' is invalid for type 'BlockdevCacheSetting' I think the tricky part is that the valid fields depend on the block driver. qcow2 wants another BlockDriverState as its image file; file wants a file name; vvfat wants a directory name, FAT type and disk type; and NBD wants a host name and a port, except if it uses a UNIX socket. Yes, it's all handled with a new union type. This is probably the most complex thing you can get, so I think it would make a better example than a VNC configuration. Yup, that's been what I've been using to prototype all of this. 
I didn't include it in the mail because it's rather complex :-) Regards, Anthony Liguori Kevin
[Qemu-devel] Re: [PATCH] Autodetect clock_gettime
On 03/15/2011 02:16 PM, Tristan Gingold wrote:
> Some POSIX OSes (such as Darwin) don't have clock_gettime. This patch falls back on gettimeofday if clock_gettime is not available.

This may be okay as a stopgap measure, but any sane porting target for QEMU should have a monotonic clock. In fact, Darwin has it.

http://www.wand.net.nz/~smr26/wordpress/2009/01/19/monotonic-time-in-mac-os-x/ hints that code such as the following should work and return nanoseconds:

    #import <mach/mach_time.h>

    uint64_t t = mach_absolute_time();
    static mach_timebase_info_data_t info;
    if (info.denom == 0) {
        mach_timebase_info(&info);
    }
    return muldiv64(t, info.numer, info.denom);

Paolo
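The muldiv64() call in the snippet above scales a tick count by numer/denom without overflowing the intermediate product. A portable way to do that with only 64-bit arithmetic might look like the sketch below; the name muldiv64_sketch is hypothetical and this is not claimed to be QEMU's implementation. It assumes c is non-zero and that the quotient fits in 64 bits.

```c
#include <stdint.h>

/* Computes a * b / c using a 96-bit intermediate product. */
static uint64_t muldiv64_sketch(uint64_t a, uint32_t b, uint32_t c)
{
    uint64_t hi = (a >> 32) * b;            /* upper half of a, times b */
    uint64_t lo = (a & 0xffffffffu) * b;    /* lower half of a, times b */
    uint64_t q_hi, q_lo, rem;

    hi += lo >> 32;                         /* carry into the upper part */
    lo &= 0xffffffffu;

    /* Long division of the value hi * 2^32 + lo by c. */
    q_hi = hi / c;
    rem = hi % c;
    q_lo = ((rem << 32) | lo) / c;

    return (q_hi << 32) | q_lo;
}
```

Splitting the multiplication this way matters because a tick counter multiplied by a timebase numerator can exceed 64 bits long before the final nanosecond value does.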
[Qemu-devel] [Bug 584143] Re: qemu fails to set hdd serial number
Please test the packages uploaded in comment #6 (or, if you're on maverick, comment #7) and comment if they work for you. Once verified we can merge the linked bzr trees. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/584143 Title: qemu fails to set hdd serial number Status in QEMU: Fix Released Status in “qemu-kvm” package in Ubuntu: Fix Released Status in “qemu-kvm” source package in Lucid: In Progress Status in “qemu-kvm” source package in Maverick: In Progress Status in “qemu-kvm” package in Debian: Unknown Bug description: The -drive ...,serial=xyz option is broken, at least in 0.12. See Debian bug#573439, http://bugs.debian.org/cgi- bin/bugreport.cgi?bug=573439 for details. The proposed fix from the original reporter: --- qemu-kvm-0.12.3+dfsg/vl.c 2010-02-26 11:34:00.0 +0900 +++ qemu-kvm-0.12.3+dfsg.old/vl.c 2010-03-11 02:26:00.134217787 +0900 @@ -2397,7 +2397,7 @@ dinfo-on_write_error = on_write_error; dinfo-opts = opts; if (serial) -strncpy(dinfo-serial, serial, sizeof(serial)); +strncpy(dinfo-serial, serial, sizeof(dinfo-serial)); QTAILQ_INSERT_TAIL(drives, dinfo, next); if (is_extboot) { extboot_drive = dinfo;
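The one-line fix quoted above comes down to which object sizeof is applied to. The sketch below reproduces the pattern with a made-up structure: DriveInfoSketch, its 21-byte field and set_serial are assumptions for illustration, and the explicit NUL termination is an extra safety step that is not part of the proposed patch.

```c
#include <string.h>

/* Hypothetical stand-in for the drive info structure. */
typedef struct DriveInfoSketch {
    char serial[21];
} DriveInfoSketch;

static void set_serial(DriveInfoSketch *dinfo, const char *serial)
{
    /*
     * sizeof(serial) is the size of a pointer (typically 4 or 8 bytes),
     * so using it as the bound copies only the first few characters.
     * sizeof(dinfo->serial) is the size of the destination array.
     */
    strncpy(dinfo->serial, serial, sizeof(dinfo->serial) - 1);
    dinfo->serial[sizeof(dinfo->serial) - 1] = '\0';
}
```

With the buggy bound, a serial number longer than the pointer size would be cut short and could be left without a terminating NUL.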
Re: [Qemu-devel] Re: [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
On 03/15/2011 06:21 AM, Kevin Wolf wrote: Am 14.03.2011 18:48, schrieb Anthony Liguori: I've got a spec written up at http://wiki.qemu.org/Features/QCFG. Initial code is in my QAPI tree. One question about a small detail on this wiki page: typedef struct BlockdevConfig { char * file; struct BlockdevConfig * backing_file; struct BlockdevConfig * next; } BlockdevConfig; What is the 'next' pointer used for, This is a standard part of QAPI. All types get a next pointer added such that we can support lists of complex types. are you going to store a list of all -blockdev options used? And why isn't it a QLIST or something? Two reasons. QLIST requires another type for the head of the list which would complicate things overall. Second is that these types are part of the libqmp interface and I didn't want to force qemu-queue on any consumer of libqmp. Regards, Anthony Liguori Kevin
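The embedded next pointer described in this message gives a plain singly linked list whose head is just a pointer to the first element, with no separate head type. A minimal sketch, using the struct from the wiki excerpt and two hypothetical helpers (blockdev_list_prepend and blockdev_list_length are not part of QAPI):

```c
#include <stddef.h>

typedef struct BlockdevConfig {
    char *file;
    struct BlockdevConfig *backing_file;
    struct BlockdevConfig *next;
} BlockdevConfig;

/* Prepends item and returns the new first element of the list. */
static BlockdevConfig *blockdev_list_prepend(BlockdevConfig *head,
                                             BlockdevConfig *item)
{
    item->next = head;
    return item;
}

/* Walks the next pointers until the terminating NULL. */
static size_t blockdev_list_length(const BlockdevConfig *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}
```

Since there is only one next field per object, an object can be a member of at most one such list at a time.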
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On 03/15/11 14:14, Alon Levy wrote: On Tue, Mar 15, 2011 at 01:42:56PM +0100, Jes Sorensen wrote: Alternatively the external apps that build against it should be taught to link with the QEMU version. That would require me to teach qemu's configure to build libcacard, possibly only libcacard (even though qemu doesn't need a lot of packages by itself, I still wouldn't want apt-get install spice-client to drag in qemu-kvm). Hi Alon, I am a little confused as to what the library really does. Is it a library to manage iso7816 cards, or is it an emulation library? If it is hw emulation the library really should be part of qemu.git, but there is nothing that prevents us to expanding the qemu Makefile to build the library and then have a separate RPM called qemu-libs or something that can be installed without the main qemu RPM being installed. Can you elaborate a bit on how spice uses libcacard? I can understand it relying on a library to access/manage smartcards, but the emulation bit puzzles me? If libcacard does both card management and emulation, my next question is whether it wouldn't make more sense to split the two into two separate packages? Cheers, Jes
Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
Am 15.03.2011 14:27, schrieb Anthony Liguori: On 03/15/2011 05:09 AM, Kevin Wolf wrote: 5) Very complex data types can be implemented. We had some discussion of supporting nested structures with -blockdev. This wouldn't work with QemuOpts but I've already implemented it with QCFG (blockdev syntax is my test case right now). The syntax I'm currently using is -blockdev cache=none,id=foo,format.qcow.protocol.nbd.hostname=localhost where '.' is used to reference sub structures. Do you have an example from your implementation for this? It's not exhaustive as I'm only using this for testing but here's what I've been working with: { 'type': 'ProbeProtocol', 'data': { 'unsafe': 'bool', 'filename': 'str' } } { 'type': 'FileProtocol', 'data': { 'filename': 'str' } } { 'type': 'HostDeviceProtocol', 'data': { 'device': 'str' } } { 'type': 'NbdProtocol', 'data': { 'hostname': 'str', 'port': 'int' } } { 'union': 'BlockdevProtocol', 'data': { 'probe': 'ProbeProtocol', 'file': 'FileProtocol', 'host-dev': 'HostDeviceProtocol', 'nbd': 'NbdProtocol' } } What would this look like in the generated C code? A union of differently typed pointers? Are format drivers still contained in a single C file in block/ that is enabled just by compiling it in or does the block layer now have to know about all available drivers and the options they provide? This is probably the most complex thing you can get, so I think it would make a better example than a VNC configuration. Yup, that's been what I've been using to prototype all of this. I didn't it in the mail because it's rather complex :-) This is exactly what makes it interesting. :-) Kevin
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On 03/15/2011 07:42 AM, Jes Sorensen wrote: On 03/14/11 17:40, Alon Levy wrote: On Mon, Mar 14, 2011 at 04:20:22PM +0100, Jes Sorensen wrote: ok, here is a note where I kinda ignored my own wishes but I want to be very clear on them: libcacard should not be part of qemu. it is here because I once thought it would speed things up. So I'm not taking it out or anything - it's fine with me that it goes into qemu, just as long as it's understood that I'm now maintaining another copy of it for usage outside of qemu, in the spice client (or any other client for that matter - it will be the same when we do vnc support for this). Hi Alon, This bit is somewhat problematic. If QEMU is maintaining a copy of libcacard, then that has to comply with the QEMU way of doing things. QEMU cannot rely on various portions in the tree behaving in different ways. Otherwise it really should be an external library requirement pulled in by the build. I am not sure what is the best way, if it stays in QEMU people will eventually start making modifications to it, without looking at the other copy that is being maintained. Two copies is not really practical. QEMU should be the place that owns it and things should be consuming a .so from QEMU. Regards, Anthony Liguori Alternatively the external apps that build against it should be taught to link with the QEMU version. Cheers, Jes
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On 03/15/2011 08:14 AM, Alon Levy wrote: Alternatively the external apps that build against it should be taught to link with the QEMU version. That would require me to teach qemu's configure to build libcacard, possibly only libcacard (even though qemu doesn't need a lot of packages by itself, I still wouldn't want apt-get install spice-client to drag in qemu-kvm). Any reasonable packaging system can generate multiple binary packages from a single source package without the binary packages being implicitly dependent on each other. Regards, Anthony Liguori Cheers, Jes
[Qemu-devel] [Bug 584143] Re: qemu fails to set hdd serial number
** Description changed: + = + SRU Justification: + 1. Impact: 'qemu -drive ...,serial=xyz' does not work + 2. How addressed: a patch from upstream fixes bug that sizeof was called on the wrong thing. + 3. patch: is in the description + 4. to reproduce: use '-drive ...,serial=xyz' option to qemu + 5. regression potential: this only changes one line which called sizeof on the wrong thing, so should not impact any other code. + = + The -drive ...,serial=xyz option is broken, at least in 0.12. See Debian bug#573439, http://bugs.debian.org/cgi- bin/bugreport.cgi?bug=573439 for details. The proposed fix from the original reporter: --- qemu-kvm-0.12.3+dfsg/vl.c 2010-02-26 11:34:00.0 +0900 +++ qemu-kvm-0.12.3+dfsg.old/vl.c 2010-03-11 02:26:00.134217787 +0900 @@ -2397,7 +2397,7 @@ - dinfo-on_write_error = on_write_error; - dinfo-opts = opts; - if (serial) + dinfo-on_write_error = on_write_error; + dinfo-opts = opts; + if (serial) -strncpy(dinfo-serial, serial, sizeof(serial)); +strncpy(dinfo-serial, serial, sizeof(dinfo-serial)); - QTAILQ_INSERT_TAIL(drives, dinfo, next); - if (is_extboot) { - extboot_drive = dinfo; + QTAILQ_INSERT_TAIL(drives, dinfo, next); + if (is_extboot) { + extboot_drive = dinfo; -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/584143 Title: qemu fails to set hdd serial number Status in QEMU: Fix Released Status in “qemu-kvm” package in Ubuntu: Fix Released Status in “qemu-kvm” source package in Lucid: In Progress Status in “qemu-kvm” source package in Maverick: In Progress Status in “qemu-kvm” package in Debian: Unknown Bug description: = SRU Justification: 1. Impact: 'qemu -drive ...,serial=xyz' does not work 2. How addressed: a patch from upstream fixes bug that sizeof was called on the wrong thing. 3. patch: is in the description 4. to reproduce: use '-drive ...,serial=xyz' option to qemu 5. 
regression potential: this only changes one line which called sizeof on the wrong thing, so should not impact any other code. = The -drive ...,serial=xyz option is broken, at least in 0.12. See Debian bug#573439, http://bugs.debian.org/cgi- bin/bugreport.cgi?bug=573439 for details. The proposed fix from the original reporter: --- qemu-kvm-0.12.3+dfsg/vl.c 2010-02-26 11:34:00.0 +0900 +++ qemu-kvm-0.12.3+dfsg.old/vl.c 2010-03-11 02:26:00.134217787 +0900 @@ -2397,7 +2397,7 @@ dinfo-on_write_error = on_write_error; dinfo-opts = opts; if (serial) -strncpy(dinfo-serial, serial, sizeof(serial)); +strncpy(dinfo-serial, serial, sizeof(dinfo-serial)); QTAILQ_INSERT_TAIL(drives, dinfo, next); if (is_extboot) { extboot_drive = dinfo;
Re: [Qemu-devel] Re: [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
Am 15.03.2011 14:37, schrieb Anthony Liguori: On 03/15/2011 06:21 AM, Kevin Wolf wrote: Am 14.03.2011 18:48, schrieb Anthony Liguori: I've got a spec written up at http://wiki.qemu.org/Features/QCFG. Initial code is in my QAPI tree. One question about a small detail on this wiki page: typedef struct BlockdevConfig { char * file; struct BlockdevConfig * backing_file; struct BlockdevConfig * next; } BlockdevConfig; What is the 'next' pointer used for, This is a standard part of QAPI. All types get a next pointer added such that we can support lists of complex types. Only a single list for each object. are you going to store a list of all -blockdev options used? And why isn't it a QLIST or something? Two reasons. QLIST requires another type for the head of the list which would complicate things overall. Second is that these types are part of the libqmp interface and I didn't want to force qemu-queue on any consumer of libqmp. And now you force existing qemu code to go back to open coded lists, which is arguably a step backwards. I don't think this is any better than forcing the (non-existent) users of libqmp to include one additional header file. Kevin
[Qemu-devel] Re: KVM call agenda for Mars 14th
On 03/14/2011 01:14 PM, Juan Quintela wrote: Please send any agenda items you are interested in covering. Switching from gPXE to iPXE. Fedora could lead the switch if it gets some commitment from QEMU to switch in 0.15 too (assuming the switch is successful). See also https://bugzilla.redhat.com/show_bug.cgi?id=684792 The only downside I see is that iPXE doesn't have a stable release. Paolo
[Qemu-devel] Re: [PATCH] Autodetect clock_gettime
On Mar 15, 2011, at 2:34 PM, Paolo Bonzini wrote: On 03/15/2011 02:16 PM, Tristan Gingold wrote: Some POSIX OSes (such as Darwin) doesn't have clock_gettime. This patch falls back on gettimeofday if clock_gettime is not available. This may be okay as a stopgap measure, but any sane porting target for QEMU should have a monotonic clock. In fact, Darwin has it. Yes mach primitives could be used. But why isn't a monotonic clock used on Linux ? According to man, CLOCK_MONOTONIC is monotonic while CLOCK_REALTIME isn't. Tristan.
Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling
On 03/15/2011 08:45 AM, Kevin Wolf wrote: Am 15.03.2011 14:27, schrieb Anthony Liguori: On 03/15/2011 05:09 AM, Kevin Wolf wrote: 5) Very complex data types can be implemented. We had some discussion of supporting nested structures with -blockdev. This wouldn't work with QemuOpts but I've already implemented it with QCFG (blockdev syntax is my test case right now). The syntax I'm currently using is -blockdev cache=none,id=foo,format.qcow.protocol.nbd.hostname=localhost where '.' is used to reference sub structures. Do you have an example from your implementation for this? It's not exhaustive as I'm only using this for testing but here's what I've been working with: { 'type': 'ProbeProtocol', 'data': { 'unsafe': 'bool', 'filename': 'str' } } { 'type': 'FileProtocol', 'data': { 'filename': 'str' } } { 'type': 'HostDeviceProtocol', 'data': { 'device': 'str' } } { 'type': 'NbdProtocol', 'data': { 'hostname': 'str', 'port': 'int' } } { 'union': 'BlockdevProtocol', 'data': { 'probe': 'ProbeProtocol', 'file': 'FileProtocol', 'host-dev': 'HostDeviceProtocol', 'nbd': 'NbdProtocol' } } What would this look like in the generated C code? A union of differently typed pointers? Yes: typedef enum BlockdevFormatKind { BFK_PROBE = 0, BFK_RAW = 1, BFK_QCOW2 = 2, BFK_QED = 3, } BlockdevFormatKind; typedef struct BlockdevFormat { BlockdevFormatKind kind; union { struct ProbeFormat * probe; struct RawFormat * raw; struct Qcow2Format * qcow2; struct QedFormat * qed; }; struct BlockdevFormat * next; } BlockdevFormat; Are format drivers still contained in a single C file in block/ that is enabled just by compiling it in or does the block layer now have to know about all available drivers and the options they provide? Yes, everything is contained within a single file. In terms of build dependencies, it's really just a call about what matters to you. You can have the block open take a BlockdevFormat which means the block layer doesn't need to know about specific formats. 
Regards, Anthony Liguori This is probably the most complex thing you can get, so I think it would make a better example than a VNC configuration. Yup, that's been what I've been using to prototype all of this. I didn't it in the mail because it's rather complex :-) This is exactly what makes it interesting. :-) Kevin
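Code consuming the generated tagged union shown in this message would typically switch on the kind field before touching the matching union member. The sketch below assumes member layouts for Qcow2Format and QedFormat (a backing_file pointer, per the optional 'backing-file' key in the schema); those layouts and the backing_chain_depth helper are illustrative guesses, not generator output. The anonymous union needs C11 or a compiler extension.

```c
#include <stddef.h>

typedef enum BlockdevFormatKind {
    BFK_PROBE = 0,
    BFK_RAW = 1,
    BFK_QCOW2 = 2,
    BFK_QED = 3,
} BlockdevFormatKind;

struct BlockdevFormat;

/* Assumed layouts, reduced to the field this sketch needs. */
struct ProbeFormat { int unsafe; };
struct RawFormat { int unused; };
struct Qcow2Format { struct BlockdevFormat *backing_file; };
struct QedFormat { struct BlockdevFormat *backing_file; int copy_on_read; };

typedef struct BlockdevFormat {
    BlockdevFormatKind kind;
    union {
        struct ProbeFormat *probe;
        struct RawFormat *raw;
        struct Qcow2Format *qcow2;
        struct QedFormat *qed;
    };
    struct BlockdevFormat *next;
} BlockdevFormat;

/* Counts the images in a backing chain by following the active member. */
static int backing_chain_depth(const BlockdevFormat *fmt)
{
    int depth = 0;

    while (fmt != NULL) {
        depth++;
        switch (fmt->kind) {
        case BFK_QCOW2:
            fmt = fmt->qcow2->backing_file;
            break;
        case BFK_QED:
            fmt = fmt->qed->backing_file;
            break;
        default:
            fmt = NULL;     /* probe and raw have no backing file here */
            break;
        }
    }
    return depth;
}
```

Only the member selected by kind is valid at any time, which is why the generic code never has to know more about a format than its tag.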
[Qemu-devel] Re: [PATCH] Autodetect clock_gettime
On 03/15/2011 02:47 PM, Tristan Gingold wrote: On Mar 15, 2011, at 2:34 PM, Paolo Bonzini wrote: On 03/15/2011 02:16 PM, Tristan Gingold wrote: Some POSIX OSes (such as Darwin) doesn't have clock_gettime. This patch falls back on gettimeofday if clock_gettime is not available. This may be okay as a stopgap measure, but any sane porting target for QEMU should have a monotonic clock. In fact, Darwin has it. Yes mach primitives could be used. But why isn't a monotonic clock used on Linux ? According to man, CLOCK_MONOTONIC is monotonic while CLOCK_REALTIME isn't. /me rereads the patch Unfortunately, pthread timed wait/lock functions are documented to use the realtime clock by default. Using pthread_condattr_setclock is probably not portable enough, and anyway there is no such function for mutexes so we're stuck with CLOCK_REALTIME. What you're patching is fine, but those functions might actually go away soon as they're not supported on Win32. So, in addition to what you've done, you should probably use those Mach primitives in qemu-timer.h. Paolo
[Qemu-devel] Re: [PATCH 2/7] Introduce -display argument
On 15 March 2011 12:36, jes.soren...@redhat.com wrote: From: Jes Sorensen jes.soren...@redhat.com This patch introduces a -display argument which consolidates the setting of the display mode. Valid options are: sdl/curses/default/serial (serial is equivalent to -nographic) So I still think that we should not be including any new -display subargument which mirrors the behaviour of -nographic. -display represents an opportunity to provide a set of orthogonal command line options which affect the handling of particular devices; it ought to mean "what happens to VGA/video output?", and should not change the behaviour of any other devices. -nographic is effectively a convenience shortcut which changes the behaviour of several different devices (display, serial, parallel, at least). It doesn't belong under '-display' from an orthogonality argument, and people who want it because it is a shortcut will be better served by the existing '-nographic' because it's less typing than '-display serial' anyway. (Ideally we should document '-nographic' by saying that it is equivalent to some set of other options including -display none -serial stdio and whatever else it does.) -- PMM
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On Tue, Mar 15, 2011 at 02:40:04PM +0100, Jes Sorensen wrote: On 03/15/11 14:14, Alon Levy wrote: On Tue, Mar 15, 2011 at 01:42:56PM +0100, Jes Sorensen wrote: Alternatively the external apps that build against it should be taught to link with the QEMU version. That would require me to teach qemu's configure to build libcacard, possibly only libcacard (even though qemu doesn't need a lot of packages by itself, I still wouldn't want apt-get install spice-client to drag in qemu-kvm). Hi Alon, I am a little confused as to what the library really does. Is it a library to manage iso7816 cards, or is it an emulation library? If it is emulation library. hw emulation the library really should be part of qemu.git, but there is nothing that prevents us from expanding the qemu Makefile to build the library and then have a separate RPM called qemu-libs or something that can be installed without the main qemu RPM being installed. Yes, that's what I was thinking about. Of course we can do it downstream (in fedora/rhel), but I'd rather have an upstream make target / configure option == solution.. Can you elaborate a bit on how spice uses libcacard? I can understand it relying on a library to access/manage smartcards, but the emulation bit puzzles me? If no emulation was required in the middle we would have just done usb forwarding. The fact is we need the client and the guest to access the card at the same time, potentially the client and a few guests. Because there is no locking in the smartcard protocol, no idea of multiple outstanding requests, this requires giving each guest its own card state, that is emulating a card. libcacard emulates a CAC, that is a Common Access Card. So the second option. The reader emulation is naturally part of the pc emulation, so qemu is the right place. There are two locations to do the card emulation, currently both are implemented: * in the pc emulator: ccid-card-emulated. 
This links with the libcacard files (well, the way we do linking it links with all the world, but it uses that code, those symbols). * in the client: that's what spice uses. On the VM side we have ccid-card-passthru, over the wire we get the APDUs (application protocol data units for the 7816 standard, which the CAC standard uses), and the card emulation itself is done in the client, via linking with libcacard (the standalone one). Obviously it would have been simpler if we decided from the start to do what Anthony wanted, that is to emulate in the host/pc. But we/I didn't, it seemed easier to emulate in the client, and also I thought more performant. The performance part really depends on which latency is more important, and no benchmarks have been done. So right now contents wise (I mean, what's in this patchset) I think we are over the question of which devices will be accepted in qemu, we are just down to the question of what color the code should be, and I'll be sending v21 once I fix the review concerns. If libcacard does both card management and emulation, my next question is whether it wouldn't make more sense to split the two into two separate packages? Cheers, Jes
[Qemu-devel] [PATCH, RFC 0/4] allow guest control of the volatile write cache
This series adds support for the guest to control the volatile write cache setting on disks exported by qemu for ide and virtio. For ide it just wires up the existing SETFEATURES calls, and for virtio it adds a new writeable config space field. SCSI is not supported at this point, as the convoluted callback mess in the SCSI stack doesn't allow commands except for plain WRITEs to read data from guest memory. The backend is based on the code that Prerna posted a while ago, and not Stefan's /proc based variant. I'm open to either one, but the problem with the /proc based one is that it's Linux-specific.
[Qemu-devel] [PATCH 4/4] virtio-blk: add runtime cache control
Add a new writeable features config space field, which allows the guest to communicate features it wants enabled/disabled at runtime. The only feature defined so far is the status of the volatile write cache. Also rename the virtio_blk_update_config to virtio_blk_get_config to fit the method naming scheme.

Signed-off-by: Christoph Hellwig h...@lst.de

Index: qemu/hw/virtio-blk.c
===
--- qemu.orig/hw/virtio-blk.c  2011-03-15 13:07:10.093633862 +0100
+++ qemu/hw/virtio-blk.c  2011-03-15 13:07:54.108135875 +0100
@@ -434,7 +434,7 @@ static void virtio_blk_reset(VirtIODevic

 /* coalesce internal state, copy to pci i/o region 0
  */
-static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
+static void virtio_blk_get_config(VirtIODevice *vdev, uint8_t *config)
 {
     VirtIOBlock *s = to_virtio_blk(vdev);
     struct virtio_blk_config blkcfg;
@@ -455,9 +455,26 @@ static void virtio_blk_update_config(Vir
     blkcfg.alignment_offset = 0;
     blkcfg.min_io_size = s->conf->min_io_size / blkcfg.blk_size;
     blkcfg.opt_io_size = s->conf->opt_io_size / blkcfg.blk_size;
+    if (bdrv_enable_write_cache(s->bs))
+        blkcfg.features |= VIRTIO_BLK_RT_WCE;
     memcpy(config, &blkcfg, sizeof(struct virtio_blk_config));
 }

+static void virtio_blk_set_config(VirtIODevice *vdev, const uint8_t *config)
+{
+    VirtIOBlock *s = to_virtio_blk(vdev);
+    struct virtio_blk_config blkcfg;
+    bool enable = false;
+
+    memcpy(&blkcfg, config, sizeof(blkcfg));
+
+    if (blkcfg.features & VIRTIO_BLK_RT_WCE)
+        enable = true;
+
+    /* no error reporting, needs to be checked by a config re-read */
+    bdrv_change_cache(s->bs, enable);
+}
+
 static uint32_t virtio_blk_get_features(VirtIODevice *vdev, uint32_t features)
 {
     VirtIOBlock *s = to_virtio_blk(vdev);
@@ -466,6 +483,7 @@ static uint32_t virtio_blk_get_features(
     features |= (1 << VIRTIO_BLK_F_GEOMETRY);
     features |= (1 << VIRTIO_BLK_F_TOPOLOGY);
     features |= (1 << VIRTIO_BLK_F_BLK_SIZE);
+    features |= (1 << VIRTIO_BLK_F_DYNAMIC);

     if (bdrv_enable_write_cache(s->bs))
         features |= (1 << VIRTIO_BLK_F_WCACHE);
@@ -543,7 +561,8 @@ VirtIODevice *virtio_blk_init(DeviceStat
                                           sizeof(struct virtio_blk_config),
                                           sizeof(VirtIOBlock));

-    s->vdev.get_config = virtio_blk_update_config;
+    s->vdev.get_config = virtio_blk_get_config;
+    s->vdev.set_config = virtio_blk_set_config;
     s->vdev.get_features = virtio_blk_get_features;
     s->vdev.reset = virtio_blk_reset;
     s->bs = conf->bs;
Index: qemu/hw/virtio-blk.h
===
--- qemu.orig/hw/virtio-blk.h  2011-03-15 13:07:10.109636192 +0100
+++ qemu/hw/virtio-blk.h  2011-03-15 13:07:54.112135546 +0100
@@ -33,6 +33,7 @@
 /* #define VIRTIO_BLK_F_IDENTIFY 8 ATA IDENTIFY supported, DEPRECATED */
 #define VIRTIO_BLK_F_WCACHE 9 /* write cache enabled */
 #define VIRTIO_BLK_F_TOPOLOGY 10 /* Topology information is available */
+#define VIRTIO_BLK_F_DYNAMIC 11 /* Dynamic features field */

 struct virtio_blk_config
 {
@@ -47,6 +48,8 @@ struct virtio_blk_config
     uint8_t alignment_offset;
     uint16_t min_io_size;
     uint32_t opt_io_size;
+    uint32_t features;
+#define VIRTIO_BLK_RT_WCE (1 << 0)
 } __attribute__((packed));

 /* These two define direction. */
[Qemu-devel] [PATCH 2/4] block: add a helper to change writeback mode on the fly
Add a new bdrv_change_cache that can set/clear the writeback flag at runtime by stopping all I/O and closing/reopening the image file. All code is based on a patch from Prerna Saxena pre...@linux.vnet.ibm.com with minimal refactoring.

Signed-off-by: Christoph Hellwig h...@lst.de

Index: qemu/block.c
===
--- qemu.orig/block.c  2011-03-15 11:47:31.285634626 +0100
+++ qemu/block.c  2011-03-15 14:57:03.680633093 +0100
@@ -441,6 +441,8 @@ static int bdrv_open_common(BlockDriverS

     if (flags & BDRV_O_CACHE_WB)
         bs->enable_write_cache = 1;
+    else
+        bs->enable_write_cache = 0;

     /*
      * Clear flags that are internal to the block layer before opening the
@@ -651,6 +653,44 @@ unlink_and_fail:
     return ret;
 }

+static int bdrv_reopen(BlockDriverState *bs, int bdrv_flags)
+{
+    BlockDriver *drv = bs->drv;
+    int ret;
+
+    if (bdrv_flags == bs->open_flags) {
+        return 0;
+    }
+
+    /* Quiesce IO for the given block device */
+    qemu_aio_flush();
+    bdrv_flush(bs);
+
+    bdrv_close(bs);
+    ret = bdrv_open(bs, bs->filename, bdrv_flags, drv);
+
+    /*
+     * A failed attempt to reopen the image file must lead to 'abort()'
+     */
+    if (ret != 0) {
+        abort();
+    }
+
+    return ret;
+}
+
+int bdrv_change_cache(BlockDriverState *bs, bool enable)
+{
+    int bdrv_flags = 0;
+
+    bdrv_flags = bs->open_flags & ~BDRV_O_CACHE_WB;
+    if (enable) {
+        bdrv_flags |= BDRV_O_CACHE_WB;
+    }
+
+    return bdrv_reopen(bs, bdrv_flags);
+}
+
 void bdrv_close(BlockDriverState *bs)
 {
     if (bs->drv) {
Index: qemu/block.h
===
--- qemu.orig/block.h  2011-03-15 11:47:18.664136441 +0100
+++ qemu/block.h  2011-03-15 11:47:31.813634525 +0100
@@ -87,6 +87,7 @@ int bdrv_pwrite_sync(BlockDriverState *b
 int bdrv_write_sync(BlockDriverState *bs, int64_t sector_num,
     const uint8_t *buf, int nb_sectors);
 int bdrv_truncate(BlockDriverState *bs, int64_t offset);
+int bdrv_change_cache(BlockDriverState *bs, bool enable);
 int64_t bdrv_getlength(BlockDriverState *bs);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
 void bdrv_guess_geometry(BlockDriverState *bs, int *pcyls, int *pheads, int *psecs);
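The flag handling in bdrv_change_cache can be shown in isolation with a small stand-alone sketch. Everything prefixed EX_ or ex_ is hypothetical: the flag values are illustrative rather than QEMU's real BDRV_O_* constants, and the reopen is simulated by a counter instead of actually closing and reopening an image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; QEMU's real BDRV_O_* constants may differ. */
#define EX_O_NOCACHE  0x0020
#define EX_O_CACHE_WB 0x0040

/* Stand-in for a block device: just its flags and a reopen counter. */
typedef struct ExDisk {
    int open_flags;
    int reopen_count;
} ExDisk;

/* Set or clear the writeback bit, "reopening" only if the flags change. */
static int ex_change_cache(ExDisk *d, bool enable)
{
    assert(d != NULL);
    int flags = d->open_flags & ~EX_O_CACHE_WB;   /* clear the bit first */
    if (enable) {
        flags |= EX_O_CACHE_WB;
    }
    if (flags == d->open_flags) {
        return 0;                                  /* nothing to do */
    }
    d->open_flags = flags;                         /* simulated close+reopen */
    d->reopen_count++;
    return 0;
}
```

Clearing the bit before conditionally setting it keeps every other open flag intact, and the early return mirrors the equality check at the top of bdrv_reopen in the patch.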
[Qemu-devel] [PATCH 1/4] block: clarify the meaning of BDRV_O_NOCACHE
Change BDRV_O_NOCACHE to only imply bypassing the host OS file cache, but no writeback semantics. All existing callers are changed to also specify BDRV_O_CACHE_WB to give them writeback semantics.

Signed-off-by: Christoph Hellwig h...@lst.de

Index: qemu/block.c
===
--- qemu.orig/block.c  2011-03-08 20:11:08.188219978 +0100
+++ qemu/block.c  2011-03-08 20:12:44.971718742 +0100
@@ -439,13 +439,7 @@ static int bdrv_open_common(BlockDriverS
     bs->drv = drv;
     bs->opaque = qemu_mallocz(drv->instance_size);

-    /*
-     * Yes, BDRV_O_NOCACHE aka O_DIRECT means we have to present a
-     * write cache to the guest. We do need the fdatasync to flush
-     * out transactions for block allocations, and we maybe have a
-     * volatile write cache in our backing device to deal with.
-     */
-    if (flags & (BDRV_O_CACHE_WB|BDRV_O_NOCACHE))
+    if (flags & BDRV_O_CACHE_WB)
         bs->enable_write_cache = 1;

     /*
Index: qemu/block/raw-posix.c
===
--- qemu.orig/block/raw-posix.c  2011-03-08 20:11:08.200220692 +0100
+++ qemu/block/raw-posix.c  2011-03-08 20:11:22.229218596 +0100
@@ -154,7 +154,7 @@ static int raw_open_common(BlockDriverSt
      * and O_DIRECT for no caching.
      */
     if ((bdrv_flags & BDRV_O_NOCACHE))
         s->open_flags |= O_DIRECT;
-    else if (!(bdrv_flags & BDRV_O_CACHE_WB))
+    if (!(bdrv_flags & BDRV_O_CACHE_WB))
         s->open_flags |= O_DSYNC;

     s->fd = -1;
Index: qemu/block/raw-win32.c
===
--- qemu.orig/block/raw-win32.c  2011-03-08 20:11:08.212218227 +0100
+++ qemu/block/raw-win32.c  2011-03-08 20:11:22.237218180 +0100
@@ -88,9 +88,9 @@ static int raw_open(BlockDriverState *bs
     }

     overlapped = FILE_ATTRIBUTE_NORMAL;
-    if ((flags & BDRV_O_NOCACHE))
-        overlapped |= FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH;
-    else if (!(flags & BDRV_O_CACHE_WB))
+    if (flags & BDRV_O_NOCACHE)
+        overlapped |= FILE_FLAG_NO_BUFFERING;
+    if (!(flags & BDRV_O_CACHE_WB))
         overlapped |= FILE_FLAG_WRITE_THROUGH;
     s->hfile = CreateFile(filename, access_flags,
                           FILE_SHARE_READ, NULL,
@@ -349,9 +349,9 @@ static int hdev_open(BlockDriverState *b
     create_flags = OPEN_EXISTING;

     overlapped = FILE_ATTRIBUTE_NORMAL;
-    if ((flags & BDRV_O_NOCACHE))
-        overlapped |= FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH;
-    else if (!(flags & BDRV_O_CACHE_WB))
+    if (flags & BDRV_O_NOCACHE)
+        overlapped |= FILE_FLAG_NO_BUFFERING;
+    if (!(flags & BDRV_O_CACHE_WB))
         overlapped |= FILE_FLAG_WRITE_THROUGH;
     s->hfile = CreateFile(filename, access_flags,
                           FILE_SHARE_READ, NULL,
Index: qemu/blockdev.c
===
--- qemu.orig/blockdev.c  2011-03-08 20:11:08.220217606 +0100
+++ qemu/blockdev.c  2011-03-08 20:11:22.237218180 +0100
@@ -326,7 +326,7 @@ DriveInfo *drive_init(QemuOpts *opts, in

     if ((buf = qemu_opt_get(opts, "cache")) != NULL) {
         if (!strcmp(buf, "off") || !strcmp(buf, "none")) {
-            bdrv_flags |= BDRV_O_NOCACHE;
+            bdrv_flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
         } else if (!strcmp(buf, "writeback")) {
             bdrv_flags |= BDRV_O_CACHE_WB;
         } else if (!strcmp(buf, "unsafe")) {
Index: qemu/qemu-io.c
===
--- qemu.orig/qemu-io.c  2011-03-08 20:11:08.232220650 +0100
+++ qemu/qemu-io.c  2011-03-08 20:11:22.245232398 +0100
@@ -1655,7 +1655,7 @@ open_f(int argc, char **argv)
             flags |= BDRV_O_SNAPSHOT;
             break;
         case 'n':
-            flags |= BDRV_O_NOCACHE;
+            flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
             break;
         case 'r':
             readonly = 1;
@@ -1751,7 +1751,7 @@ int main(int argc, char **argv)
             flags |= BDRV_O_SNAPSHOT;
             break;
         case 'n':
-            flags |= BDRV_O_NOCACHE;
+            flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
             break;
         case 'c':
             add_user_command(optarg);
Index: qemu/qemu-nbd.c
===
--- qemu.orig/qemu-nbd.c  2011-03-08 20:11:08.244217894 +0100
+++ qemu/qemu-nbd.c  2011-03-08 20:11:22.253267426 +0100
@@ -238,7 +238,7 @@ int main(int argc, char **argv)
             flags |= BDRV_O_SNAPSHOT;
             break;
         case 'n':
-            flags |= BDRV_O_NOCACHE;
+            flags |= BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
             break;
         case 'b':
             bindto = optarg;
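To make the semantics after this patch easier to see, here is a small sketch of the resulting mapping from block-layer flags to host open flags, following the raw-posix.c hunk. The EX_ constants are placeholders invented for this example (EX_HOST_DIRECT and EX_HOST_DSYNC stand in for O_DIRECT and O_DSYNC), so the code runs on hosts that lack those flags.

```c
#include <assert.h>

/* Illustrative values only; not QEMU's real BDRV_O_* constants. */
#define EX_O_NOCACHE   0x0020
#define EX_O_CACHE_WB  0x0040

/* Placeholders for the host's O_DIRECT and O_DSYNC. */
#define EX_HOST_DIRECT 0x1
#define EX_HOST_DSYNC  0x2

/* With the patch applied the two decisions are independent of each other. */
static int ex_host_open_flags(int bdrv_flags)
{
    /* This sketch only understands the two cache-related flags. */
    assert((bdrv_flags & ~(EX_O_NOCACHE | EX_O_CACHE_WB)) == 0);

    int host = 0;
    if (bdrv_flags & EX_O_NOCACHE) {
        host |= EX_HOST_DIRECT;      /* bypass the host page cache */
    }
    if (!(bdrv_flags & EX_O_CACHE_WB)) {
        host |= EX_HOST_DSYNC;       /* no writeback: write through */
    }
    return host;
}
```

Under these assumptions, cache=none (both flags set, as drive_init now does) maps to direct I/O without O_DSYNC, while NOCACHE alone would give direct I/O with write-through semantics, a combination the old else-if could not express.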
[Qemu-devel] [PATCH 3/4] ide: wire up setfeatures cache control
Wire up the ATA SETFEATURES subcalls that control the volatile write cache to the new bdrv_change_cache helper.

Signed-off-by: Christoph Hellwig h...@lst.de

Index: qemu/hw/ide/core.c
===
--- qemu.orig/hw/ide/core.c  2011-03-15 11:47:18.569636140 +0100
+++ qemu/hw/ide/core.c  2011-03-15 13:07:21.464634347 +0100
@@ -1700,6 +1700,19 @@ void ide_ioport_write(void *opaque, uint
     }
 }

+static void ide_setcache(IDEState *s, bool enable)
+{
+    if (bdrv_change_cache(s->bs, enable)) {
+        ide_abort_command(s);
+        ide_set_irq(s->bus);
+        return;
+    }
+
+    s->identify_set = 0;
+
+    s->status = READY_STAT | SEEK_STAT;
+    ide_set_irq(s->bus);
+}

 void ide_exec_cmd(IDEBus *bus, uint32_t val)
 {
@@ -1855,7 +1868,11 @@ void ide_exec_cmd(IDEBus *bus, uint32_t
         case 0xcc: /* reverting to power-on defaults enable */
         case 0x66: /* reverting to power-on defaults disable */
         case 0x02: /* write cache enable */
+            ide_setcache(s, true);
+            break;
         case 0x82: /* write cache disable */
+            ide_setcache(s, false);
+            break;
         case 0xaa: /* read look-ahead enable */
         case 0x55: /* read look-ahead disable */
         case 0x05: /* set advanced power management mode */
[Qemu-devel] qemu-kvm 0.14.0 and clocksource=acpi_pm
Hi, I'm currently testing qemu-kvm 0.14.0 in conjunction with Linux 2.6.38 on the host system. As there are some old kernels out there that support kvm_clock but not reliably, we used to run some of them with clocksource=acpi_pm. However, with this new combination of qemu-kvm and Linux kernel I see the following message in the guests: Override clocksource acpi_pm is not HRT compatible. Cannot switch while in HRT/NOHZ mode This used to work with qemu-kvm 0.12.5 and Linux 2.6.34. The guest is Ubuntu LTS 10.04.2 64-bit. Does anyone have a clue? Additionally, it would be great if someone who knows for certain could say from which kernel version on clocksource=kvm_clock is stable (especially in conjunction with live migration). BR, Peter
[Qemu-devel] [PATCH, RFC] virtio_blk: add cache control support
Add support for the new dynamic features config space field to allow en/disabling the write cache at runtime. The userspace interface is a SCSI-compatible sysfs attribute. Signed-off-by: Christoph Hellwig h...@lst.de Index: linux-2.6/drivers/block/virtio_blk.c === --- linux-2.6.orig/drivers/block/virtio_blk.c 2011-03-15 12:16:29.156133695 +0100 +++ linux-2.6/drivers/block/virtio_blk.c2011-03-15 13:17:30.160634723 +0100 @@ -291,6 +291,73 @@ static ssize_t virtblk_serial_show(struc } DEVICE_ATTR(serial, S_IRUGO, virtblk_serial_show, NULL); +static bool virtblk_has_wb_cache(struct virtio_blk *vblk) +{ + struct virtio_device *vdev = vblk-vdev; + u32 features; + + if (!virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH)) + return false; + if (!virtio_has_feature(vdev, VIRTIO_BLK_F_DYNAMIC)) + return true; + + vdev-config-get(vdev, offsetof(struct virtio_blk_config, features), + features, sizeof(features)); + + if (features VIRTIO_BLK_RT_WCE) + return true; + return false; +} + +static ssize_t virtblk_cache_type_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct gendisk *disk = dev_to_disk(dev); + + if (virtblk_has_wb_cache(disk-private_data)) + return sprintf(buf, write back\n); + else + return sprintf(buf, write through\n); +} + +static ssize_t virtblk_cache_type_store(struct device *dev, + struct device_attribute *attr, const char *buf, size_t count) +{ + struct gendisk *disk = dev_to_disk(dev); + struct virtio_blk *vblk = disk-private_data; + struct virtio_device *vdev = vblk-vdev; + u32 features, features2 = 0; + + if (!virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH) || + !virtio_has_feature(vdev, VIRTIO_BLK_F_DYNAMIC)) + return -ENXIO; + + if (strncmp(buf, write through, sizeof(write through) - 1) == 0) { + ; + } else if (strncmp(buf, write back, sizeof(write back) - 1) == 0) { + blk_queue_flush(disk-queue, REQ_FLUSH); + features |= VIRTIO_BLK_RT_WCE; + } else { + return -EINVAL; + } + + vdev-config-set(vdev, offsetof(struct virtio_blk_config, 
features), + features, sizeof(features)); + + vdev-config-get(vdev, offsetof(struct virtio_blk_config, features), + features2, sizeof(features2)); + + if ((features VIRTIO_BLK_RT_WCE) != + (features2 VIRTIO_BLK_RT_WCE)) + return -EIO; + + if (!(features2 VIRTIO_BLK_RT_WCE)) + blk_queue_flush(disk-queue, 0); + return count; +} +static DEVICE_ATTR(cache_type, S_IRUGO|S_IWUSR, + virtblk_cache_type_show, virtblk_cache_type_store); + static int __devinit virtblk_probe(struct virtio_device *vdev) { struct virtio_blk *vblk; @@ -377,7 +444,7 @@ static int __devinit virtblk_probe(struc index++; /* configure queue flush support */ - if (virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH)) + if (virtblk_has_wb_cache(vblk)) blk_queue_flush(q, REQ_FLUSH); /* If disk is read-only in the host, the guest should obey */ @@ -456,6 +523,10 @@ static int __devinit virtblk_probe(struc if (err) goto out_del_disk; + err = device_create_file(disk_to_dev(vblk-disk), dev_attr_cache_type); + if (err) + goto out_del_disk; + return 0; out_del_disk: @@ -499,7 +570,7 @@ static const struct virtio_device_id id_ static unsigned int features[] = { VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY, VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_SCSI, - VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY + VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_DYNAMIC }; /* Index: linux-2.6/include/linux/virtio_blk.h === --- linux-2.6.orig/include/linux/virtio_blk.h 2011-03-15 12:14:28.261632780 +0100 +++ linux-2.6/include/linux/virtio_blk.h2011-03-15 12:25:56.308634399 +0100 @@ -16,6 +16,7 @@ #define VIRTIO_BLK_F_SCSI 7 /* Supports scsi command passthru */ #define VIRTIO_BLK_F_FLUSH 9 /* Cache flush command support */ #define VIRTIO_BLK_F_TOPOLOGY 10 /* Topology information is available */ +#define VIRTIO_BLK_F_DYNAMIC 11 /* Dynamic features field */ #define VIRTIO_BLK_ID_BYTES20 /* ID string length */ @@ -45,6 +46,9 @@ struct virtio_blk_config { __u16 min_io_size; /* optimal sustained I/O 
size in logical blocks. */ __u32 opt_io_size; + /* runtime controllable features */ + __u32 features; +#define VIRTIO_BLK_RT_WCE (1 0)
[Qemu-devel] Re: Re: [PATCH] Autodetect clock_gettime
On 15 March 2011 at 14:58, Paolo Bonzini pbonz...@redhat.com wrote: Some POSIX OSes (such as Darwin) don't have clock_gettime. This patch falls back on gettimeofday if clock_gettime is not available. Some code I've seen uses #ifdef CLOCK_REALTIME but this doesn't seem right http://pubs.opengroup.org/onlinepubs/009695399/basedefs/time.h.html This may be okay as a stopgap measure, but any sane porting target for QEMU should have a monotonic clock. In fact, Darwin has it. Yes, Mach primitives could be used. But why isn't a monotonic clock used on Linux? cf. http://www.virtualbox.org/browser/trunk/src/VBox/Runtime/r3/darwin/time-darwin.cpp You might want to check this as well: http://www.wand.net.nz/~smr26/wordpress/2009/01/19/monotonic-time-in-mac-os-x/ According to man, CLOCK_MONOTONIC is monotonic while CLOCK_REALTIME isn't. Yes, that's the reason both exist actually... Unfortunately, pthread timed wait/lock functions are documented to use the realtime clock by default. Using pthread_condattr_setclock is probably not portable enough, and anyway there is no such function for mutexes so we're stuck with CLOCK_REALTIME. What you're patching is fine, but those functions might actually go away soon as they're not supported on Win32. If you're talking about qemu_mutex_timedlock and qemu_cond_timedwait, I failed to grep them anywhere else than at their own definition. So, in addition to what you've done, you should probably use those Mach primitives in qemu-timer.h. Likely. (though I'd like to get QEMU to just work again here first as I need it running for Friday...) François.
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On Tue, Mar 15, 2011 at 08:45:29AM -0500, Anthony Liguori wrote: On 03/15/2011 08:14 AM, Alon Levy wrote: Alternatively the external apps that build against it should be taught to link with the QEMU version. That would require me to teach qemu's configure to build libcacard, possibly only libcacard (even though qemu doesn't need a lot of packages by itself, I still wouldn't want apt-get install spice-client to drag in qemu-kvm). Any reasonable packaging system can generate multiple binary packages from a single source package without the binary packages being implicitly dependent on each other. Of course, I was just saying it would be easier if qemu's configure system allowed this. Regards, Anthony Liguori Cheers, Jes
[Qemu-devel] Re: [PATCH] Autodetect clock_gettime
On Mar 15, 2011, at 2:58 PM, Paolo Bonzini wrote: On 03/15/2011 02:47 PM, Tristan Gingold wrote: On Mar 15, 2011, at 2:34 PM, Paolo Bonzini wrote: On 03/15/2011 02:16 PM, Tristan Gingold wrote: Some POSIX OSes (such as Darwin) don't have clock_gettime. This patch falls back on gettimeofday if clock_gettime is not available. This may be okay as a stopgap measure, but any sane porting target for QEMU should have a monotonic clock. In fact, Darwin has it. Yes, Mach primitives could be used. But why isn't a monotonic clock used on Linux? According to man, CLOCK_MONOTONIC is monotonic while CLOCK_REALTIME isn't. /me rereads the patch Unfortunately, pthread timed wait/lock functions are documented to use the realtime clock by default. Using pthread_condattr_setclock is probably not portable enough, and anyway there is no such function for mutexes so we're stuck with CLOCK_REALTIME. What you're patching is fine, but those functions might actually go away soon as they're not supported on Win32. Fine. So, in addition to what you've done, you should probably use those Mach primitives in qemu-timer.h. Yes. But note that the first aim of this patch is to make qemu compile again on Darwin. Tristan.
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On Tue, Mar 15, 2011 at 08:44:27AM -0500, Anthony Liguori wrote: On 03/15/2011 07:42 AM, Jes Sorensen wrote: On 03/14/11 17:40, Alon Levy wrote: On Mon, Mar 14, 2011 at 04:20:22PM +0100, Jes Sorensen wrote: ok, here is a note where I kinda ignored my own wishes but I want to be very clear on them: libcacard should not be part of qemu. it is here because I once thought it would speed things up. So I'm not taking it out or anything - it's fine with me that it goes into qemu, just as long as it's understood that I'm now maintaining another copy of it for usage outside of qemu, in the spice client (or any other client for that matter - it will be the same when we do vnc support for this). Hi Alon, This bit is somewhat problematic. If QEMU is maintaining a copy of libcacard, then that has to comply with the QEMU way of doing things. QEMU cannot rely on various portions in the tree behaving in different ways. Otherwise it really should be an external library requirement pulled in by the build. I am not sure what is the best way, if it stays in QEMU people will eventually start making modifications to it, without looking at the other copy that is being maintained. Two copies is not really practical. QEMU should be the place that owns it and things should be consuming a .so from QEMU. My bad - I thought you didn't want this. I can do a patch to make qemu build an .so file if configure gets a --libs, how does that sound? right now that would build just libcacard, I guess libqmp too later? or perhaps have a separate Makefile (Makefile.libs)? Have you given this any thought? Regards, Anthony Liguori Alternatively the external apps that build against it should be taught to link with the QEMU version. Cheers, Jes
[Qemu-devel] Re: [PATCH 1/7] Consolidate DisplaySurface allocation in qemu_alloc_display()
On 03/15/2011 07:36 AM, jes.soren...@redhat.com wrote: From: Jes Sorensenjes.soren...@redhat.com This removes various code duplication from console.e and sdl.c Signed-off-by: Jes Sorensenjes.soren...@redhat.com --- console.c | 45 + console.h |3 +++ ui/sdl.c | 21 - 3 files changed, 36 insertions(+), 33 deletions(-) diff --git a/console.c b/console.c index 57d6eb5..4939a72 100644 --- a/console.c +++ b/console.c @@ -1278,35 +1278,40 @@ static DisplaySurface* defaultallocator_create_displaysurface(int width, int hei { DisplaySurface *surface = (DisplaySurface*) qemu_mallocz(sizeof(DisplaySurface)); -surface-width = width; -surface-height = height; -surface-linesize = width * 4; -surface-pf = qemu_default_pixelformat(32); -#ifdef HOST_WORDS_BIGENDIAN -surface-flags = QEMU_ALLOCATED_FLAG | QEMU_BIG_ENDIAN_FLAG; -#else -surface-flags = QEMU_ALLOCATED_FLAG; -#endif -surface-data = (uint8_t*) qemu_mallocz(surface-linesize * surface-height); - +int linesize = width * 4; +surface = qemu_alloc_display(surface, width, height, linesize, + qemu_default_pixelformat(32), 0); return surface; } static DisplaySurface* defaultallocator_resize_displaysurface(DisplaySurface *surface, int width, int height) { +int linesize = width * 4; +surface = qemu_alloc_display(surface, width, height, linesize, + qemu_default_pixelformat(32), 0); +return surface; +} + +DisplaySurface* +qemu_alloc_display(DisplaySurface *surface, int width, int height, + int linesize, PixelFormat pf, int newflags) +{ +void *data; surface-width = width; surface-height = height; -surface-linesize = width * 4; -surface-pf = qemu_default_pixelformat(32); -if (surface-flags QEMU_ALLOCATED_FLAG) -surface-data = (uint8_t*) qemu_realloc(surface-data, surface-linesize * surface-height); -else -surface-data = (uint8_t*) qemu_malloc(surface-linesize * surface-height); +surface-linesize = linesize; +surface-pf = pf; +if (surface-flags QEMU_ALLOCATED_FLAG) { +data = qemu_realloc(surface-data, +surface-linesize * surface-height); +} 
else { +data = qemu_malloc(surface-linesize * surface-height); +} +surface-data = (uint8_t *)data; +surface-flags = newflags | QEMU_ALLOCATED_FLAG; #ifdef HOST_WORDS_BIGENDIAN -surface-flags = QEMU_ALLOCATED_FLAG | QEMU_BIG_ENDIAN_FLAG; -#else -surface-flags = QEMU_ALLOCATED_FLAG; +surface-flags |= QEMU_BIG_ENDIAN_FLAG; #endif return surface; diff --git a/console.h b/console.h index f4e4741..dec9a76 100644 --- a/console.h +++ b/console.h @@ -189,6 +189,9 @@ void register_displaystate(DisplayState *ds); DisplayState *get_displaystate(void); DisplaySurface* qemu_create_displaysurface_from(int width, int height, int bpp, int linesize, uint8_t *data); +DisplaySurface* qemu_alloc_display(DisplaySurface *surface, int width, + int height, int linesize, + PixelFormat pf, int newflags); Is it really useful at all to return DisplaySurface? When I see a return value of 'DisplaySurface *' and an alloc in the function name, I assume this function allocates a display surface but it's really allocating the framebuffer within a display surface. Regards, Anthony Liguori
[Qemu-devel] Re: [PATCH 2/7] Introduce -display argument
On 03/15/2011 07:36 AM, jes.soren...@redhat.com wrote:
> From: Jes Sorensen <jes.soren...@redhat.com>
>
> This patch introduces a -display argument which consolidates the
> setting of the display mode. Valid options are:
> sdl/curses/default/serial (serial is equivalent to -nographic)
>
> Signed-off-by: Jes Sorensen <jes.soren...@redhat.com>
> ---
>  qemu-options.hx |   27 +++
>  vl.c            |   77 +++
>  2 files changed, 104 insertions(+), 0 deletions(-)
>
> diff --git a/qemu-options.hx b/qemu-options.hx
> index badb730..f08ffb1 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -590,6 +590,33 @@ STEXI
>  @table @option
>  ETEXI
>
> +DEF("display", HAS_ARG, QEMU_OPTION_display,
> +    "-display sdl[,frame=on|off][,alt_grab=on|off][,ctrl_grab=on|off]\n"
> +    "[,window_close=on|off]|curses|serial\n"
> +    "select display type\n", QEMU_ARCH_ALL)
> +STEXI
> +@item -display @var{type}
> +@findex -display
> +Select type of display to use. This option is a replacement for the
> +old style -sdl/-curses/... options. Valid values for @var{type} are
> +@table @option
> +@item sdl
> +Pick the SDL display option.
> +@item curses
> +Pick the curses display option. Normally, QEMU uses SDL to display the
> +VGA output. With this option, QEMU can display the VGA output when in
> +text mode using a curses/ncurses interface. Nothing is displayed in
> +graphical mode.
> +@item serial
> +Normally, QEMU uses SDL to display the VGA output. With this option,
> +you can totally disable graphical output so that QEMU is a simple
> +command line application. The emulated serial port is redirected on
> +the console. Therefore, you can still use QEMU to debug a Linux kernel
> +with a serial console. This option is equivalent to the old -nographic
> +argument.
> +@end table
> +ETEXI
> +
>  DEF("nographic", 0, QEMU_OPTION_nographic,
>      "-nographic disable graphical output and redirect serial I/Os to console\n",
>      QEMU_ARCH_ALL)
> diff --git a/vl.c b/vl.c
> index 5e007a7..c88ee58 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1554,6 +1554,80 @@ static void select_vgahw (const char *p)
>      }
>  }
>
> +static DisplayType select_display(const char *p)
> +{
> +    const char *opts;
> +    DisplayType display = DT_DEFAULT;
> +
> +    if (strstart(p, "sdl", &opts)) {
> +#ifdef CONFIG_SDL
> +        display = DT_SDL;
> +        while (*opts) {
> +            const char *nextopt;
> +
> +            if (strstart(opts, ",frame=", &nextopt)) {
> +                opts = nextopt;
> +                if (strstart(opts, "on", &nextopt)) {
> +                    no_frame = 0;
> +                } else if (strstart(opts, "off", &nextopt)) {
> +                    no_frame = 1;
> +                } else {
> +                    goto invalid_display;
> +                }
> +            } else if (strstart(opts, ",alt_grab=", &nextopt)) {
> +                opts = nextopt;
> +                if (strstart(opts, "on", &nextopt)) {
> +                    alt_grab = 1;
> +                } else if (strstart(opts, "off", &nextopt)) {
> +                    alt_grab = 0;
> +                } else {
> +                    goto invalid_display;
> +                }
> +            } else if (strstart(opts, ",ctrl_grab=", &nextopt)) {
> +                opts = nextopt;
> +                if (strstart(opts, "on", &nextopt)) {
> +                    ctrl_grab = 1;
> +                } else if (strstart(opts, "off", &nextopt)) {
> +                    ctrl_grab = 0;
> +                } else {
> +                    goto invalid_display;
> +                }
> +            } else if (strstart(opts, ",window_close=", &nextopt)) {
> +                opts = nextopt;
> +                if (strstart(opts, "on", &nextopt)) {
> +                    no_quit = 0;
> +                } else if (strstart(opts, "off", &nextopt)) {
> +                    no_quit = 1;
> +                } else {
> +                    goto invalid_display;
> +                }
> +            } else {
> +                goto invalid_display;
> +            }
> +            opts = nextopt;
> +        }

So the natural reaction here is going to be, just use QemuOpts. But this
is harder than it seems. The problem is that the VNC options invert the
meaning of booleans, making conversion of VNC to use QemuOpts much
harder than it would appear. Doing it this way lets us pass the vnc
option string directly to vnc_display_open().

I don't like it much, but I don't mind it as an interim step.

Regards,

Anthony Liguori

> +#else
> +        fprintf(stderr, "SDL support is disabled\n");
> +        exit(1);
> +#endif
> +    } else if (strstart(p, "curses", &opts)) {
> +#ifdef CONFIG_CURSES
> +        display = DT_CURSES;
> +#else
> +        fprintf(stderr, "Curses support is disabled\n");
> +        exit(1);
> +#endif
> +    } else if (strstart(p, "serial", &opts)) {
> +        display = DT_NOGRAPHIC;
> +    } else {
> +    invalid_display:
> +        fprintf(stderr, "Unknown display type: %s\n", p);
> +        exit(1);
> +    }
> +
> +    return display;
> +}
> +
>  static int balloon_parse(const char *arg)
>  {
>      QemuOpts *opts;
> @@ -2152,6 +2226,9 @@ int main(int argc, char **argv, char **envp)
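A minimal sketch of one way to shorten the hand-written on/off parsing in the quoted select_display(), assuming a table-driven loop. This is not QemuOpts and not code from the patch: struct sdl_flags, starts_with(), sdl_subopts and parse_sdl_subopts() are hypothetical names, and starts_with() only imitates the behaviour that the strstart() calls in the patch appear to rely on. The inverted flags (no_frame, no_quit) are handled by recording which value "on" stores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the globals the patch assigns to. */
struct sdl_flags {
    int no_frame;
    int alt_grab;
    int ctrl_grab;
    int no_quit;
};

/* Assumed strstart()-like helper: if str begins with prefix, store the
 * remainder in *rest and return 1; otherwise return 0. */
static int starts_with(const char *str, const char *prefix, const char **rest)
{
    size_t n = strlen(prefix);

    if (strncmp(str, prefix, n) != 0) {
        return 0;
    }
    *rest = str + n;
    return 1;
}

/* One row per ",name=" suboption: which flag it touches and which value
 * "on" stores (0 for the inverted flags, 1 otherwise). */
struct sdl_subopt {
    const char *prefix;
    size_t offset;          /* offsetof() into struct sdl_flags */
    int value_when_on;
};

static const struct sdl_subopt sdl_subopts[] = {
    { ",frame=",        offsetof(struct sdl_flags, no_frame),  0 },
    { ",alt_grab=",     offsetof(struct sdl_flags, alt_grab),  1 },
    { ",ctrl_grab=",    offsetof(struct sdl_flags, ctrl_grab), 1 },
    { ",window_close=", offsetof(struct sdl_flags, no_quit),   0 },
};

/* Parse the text that follows "sdl". Returns 0 on success and -1 on an
 * unknown suboption or a value other than on/off. */
static int parse_sdl_subopts(const char *opts, struct sdl_flags *flags)
{
    const size_t count = sizeof(sdl_subopts) / sizeof(sdl_subopts[0]);

    assert(opts != NULL && flags != NULL);

    while (*opts) {
        const char *value = NULL;
        const char *after = NULL;
        size_t i;
        int *flag;

        /* Find the table row whose ",name=" prefix matches. */
        for (i = 0; i < count; i++) {
            if (starts_with(opts, sdl_subopts[i].prefix, &value)) {
                break;
            }
        }
        if (i == count) {
            return -1;
        }

        flag = (int *)((char *)flags + sdl_subopts[i].offset);
        if (starts_with(value, "on", &after)) {
            *flag = sdl_subopts[i].value_when_on;
        } else if (starts_with(value, "off", &after)) {
            *flag = !sdl_subopts[i].value_when_on;
        } else {
            return -1;
        }
        opts = after;
    }
    return 0;
}
```

Calling parse_sdl_subopts(",frame=off,alt_grab=on", &flags) on a zeroed struct sets no_frame and alt_grab to 1 and leaves the other two fields at 0. The table keeps each suboption on one line, but it does not address the VNC boolean problem described in the review comment.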
Re: [Qemu-devel] [PATCH 4/7] libcacard: initial commit
On 03/15/11 15:25, Alon Levy wrote:
>> I am not sure what is the best way, if it stays in QEMU people will
>> eventually start making modifications to it, without looking at the
>> other copy that is being maintained. Two copies is not really
>> practical. QEMU should be the place that owns it and things should be
>> consuming a .so from QEMU.
>
> My bad - I thought you didn't want this. I can do a patch to make qemu
> build an .so file if configure gets a --libs, how does that sound?
> right now that would build just libcacard, I guess libqmp too later? or
> perhaps have a separate Makefile (Makefile.libs)? Have you given this
> any thought?

I think the libs should be built by default as part of the build process
and get installed as part of the regular install. I.e. it becomes part of
the QEMU build process, so it requires a QEMU build to build the support
libraries, but they can be packaged into separate RPMs/debs by the
distro people.

I don't think we want a --libs option that turns the build process into
only producing the libs.

Cheers,
Jes
[Qemu-devel] [PATCH] Fix migration uint8 arrays handling
commit 82fa39b75181b730d6d4d09f443bd26bcfcd045c only contains half of
the fix. It forgot the save state fix for UINT8 indexes.

Anthony, please apply; without this, migration using hpet is broken
(hpet is the only current user).

Signed-off-by: Juan Quintela <quint...@redhat.com>
---
 savevm.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/savevm.c b/savevm.c
index 60d2f2a..67459a7 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1395,6 +1395,8 @@ void vmstate_save_state(QEMUFile *f, const VMStateDescription *vmsd,
                 n_elems = *(int32_t *)(opaque+field->num_offset);
             } else if (field->flags & VMS_VARRAY_UINT16) {
                 n_elems = *(uint16_t *)(opaque+field->num_offset);
+            } else if (field->flags & VMS_VARRAY_UINT8) {
+                n_elems = *(uint8_t *)(opaque+field->num_offset);
             }
             if (field->flags & VMS_POINTER) {
                 base_addr = *(void **)base_addr + field->start;
-- 
1.7.4
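For illustration, the element-count lookup that this patch completes can be sketched as a standalone function. Assumptions: the SK_* flag values, struct sk_field and sk_n_elems() are hypothetical stand-ins rather than QEMU's definitions, and the default count of 1 is assumed. The point is that, without the UINT8 branch, a field whose count is stored in a uint8_t falls through to the default, so the save side would presumably write a different number of elements than the load side expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; QEMU's real VMS_* values are not reproduced. */
enum {
    SK_VARRAY_INT32  = 1 << 0,
    SK_VARRAY_UINT16 = 1 << 1,
    SK_VARRAY_UINT8  = 1 << 2
};

/* Hypothetical, reduced field descriptor: only what the lookup needs. */
struct sk_field {
    int flags;          /* which width the element count is stored in */
    size_t num_offset;  /* offset of the count inside the device state */
};

/* Return how many array elements the field holds, reading the count
 * from the device state at the width named by the flags. */
static int sk_n_elems(const void *opaque, const struct sk_field *field)
{
    const unsigned char *base = opaque;
    int n_elems = 1;    /* assumed default for a non-array field */

    assert(opaque != NULL && field != NULL);

    if (field->flags & SK_VARRAY_INT32) {
        int32_t n;
        memcpy(&n, base + field->num_offset, sizeof(n));
        n_elems = n;
    } else if (field->flags & SK_VARRAY_UINT16) {
        uint16_t n;
        memcpy(&n, base + field->num_offset, sizeof(n));
        n_elems = n;
    } else if (field->flags & SK_VARRAY_UINT8) {
        /* The branch the patch adds on the save side. */
        uint8_t n;
        memcpy(&n, base + field->num_offset, sizeof(n));
        n_elems = n;
    }
    return n_elems;
}
```

The sketch uses memcpy() instead of the pointer casts in the patch only to stay independent of alignment; the branch structure is the same.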
[Qemu-devel] KVM call minutes for Mar 15
QAPI -- http://wiki.qemu.org/Features/QAPI
- please review!
- Anthony would like to see feedback and plans to commit in a week
  (assuming agreement and no major issues in review)
- some concern about the maintainability of code generation
- but still nothing concrete on the list, need to review and discuss on
  the list
- some concern that implementation details may change the wire protocol
- introduces a new mechanism for new signals (mask by default and
  enabled explicitly)
- disagreement over when/how to introduce new extensions
- libvirt feedback?
- no protocol level changes
- old and new versions are testable with the test suite, which proves
  this
- c library implementation is critical to have unit tests and test
  driven development
- thread safe?
- no shared state, no statics.
- threading model requires lock for the qmp session
- licensing?
- LGPL
- forwards/backwards compat?
- designed with that in mind, see wiki:
  http://wiki.qemu.org/Features/QAPI

QCFG -- http://wiki.qemu.org/Features/QCFG
- command line args translation to objects is complex and buggy
- schema + code generator to formalize this
- formally describe each command line option and generate code to build
  and validate objects
- provides systematic way to document command line options
- automatically
- device_add does multiple conversions to go from qmp to qemuopts to
  objects
- move to basic c structures, and autogenerated marshalling code
- no plan to do this work soon, late in 0.15 cycle
- same as qapi, fork a tree, do mass conversion and merge for 0.16 cycle
- qmp server mode to take all configuration commands before actually
  starting the guest
- can provide a config file
- qdev...
- could just bridge to setting and getting qdev properties
- OR get to point where device objects go directly to qdev device init
- why not move command line to qmp instead of new schema?
- single schema
- considerations for -M (didn't capture all of these)
- for all the details: http://wiki.qemu.org/Features/QCFG

Merging big changes
- in the past, evolving in tree has not worked well, leaving partial
  conversions
- QAPI/QCFG method of doing changes in external tree hopes to set new
  precedent
- preserve patch/review on list
- do full conversion
- provide strong testing to show it works

Kemari merge plans
- just needs some ACKs
- Juan, Anthony, anybody else who is familiar with migration to review?

switch from gpxe to ipxe
- possible 0.15 release w/ ipxe (Alex looking into it)
- Michael Brown has been helpful in fixing bugs, so compat
- Alex will send out mail soon on the details
- ipxe releases? not yet, there are plans for it, should be coming RSN
- Stefan volunteers to help test