Re: [Qemu-devel] [PATCH 1/3] Introduce threadlets
* Arun R B a...@linux.vnet.ibm.com [2010-10-21 17:40:45]: From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

This patch creates a generic asynchronous-task-offloading infrastructure named threadlets. The core idea has been borrowed from the threading framework that is being used by paio. The reason for creating this generic infrastructure is so that other subsystems, such as virtio-9p, could make use of it for offloading tasks that could block. The patch creates a global queue onto which subsystems can queue their tasks to be executed asynchronously. The patch also provides APIs that allow a subsystem to create a private queue with an associated pool of threads.

[...@in.ibm.com: Facelift of the code, Documentation, cancel_threadlet and other helpers]

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Gautham R Shenoy gautham.she...@gmail.com
Signed-off-by: Sripathi Kodi sripat...@in.ibm.com
Signed-off-by: Arun R Bharadwaj a...@linux.vnet.ibm.com

Acked-by: Balbir Singh bal...@linux.vnet.ibm.com

-- Three Cheers, Balbir
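The queueing half of the idea described above (a FIFO onto which tasks are submitted, plus the ability to cancel a task that has not started) can be sketched as follows. This is only an illustration under stated assumptions: every name here (WorkItem, WorkQueue, queue_submit, queue_take, queue_cancel) is hypothetical and not taken from the patch, and the worker threads and the locking that a real implementation would need are left out.

```c
#include <stddef.h>

/* Hypothetical types; not the API introduced by the patch. */
typedef struct WorkItem {
    void (*func)(struct WorkItem *item); /* task body, run by a worker */
    struct WorkItem *next;               /* FIFO link */
    int queued;                          /* 1 while waiting in a queue */
} WorkItem;

typedef struct WorkQueue {
    WorkItem *head; /* oldest item */
    WorkItem *tail; /* newest item */
} WorkQueue;

/* Append an item at the tail so tasks are taken in submission order. */
static void queue_submit(WorkQueue *q, WorkItem *item)
{
    item->next = NULL;
    item->queued = 1;
    if (q->tail) {
        q->tail->next = item;
    } else {
        q->head = item;
    }
    q->tail = item;
}

/* Remove and return the oldest item, or NULL if the queue is empty. */
static WorkItem *queue_take(WorkQueue *q)
{
    WorkItem *item = q->head;

    if (!item) {
        return NULL;
    }
    q->head = item->next;
    if (!q->head) {
        q->tail = NULL;
    }
    item->queued = 0;
    return item;
}

/* Cancel an item that no worker has taken yet.
 * Returns 0 on success, -1 if the item is no longer queued. */
static int queue_cancel(WorkQueue *q, WorkItem *item)
{
    WorkItem *prev = NULL;
    WorkItem *cur;

    if (!item->queued) {
        return -1;
    }
    for (cur = q->head; cur; prev = cur, cur = cur->next) {
        if (cur == item) {
            if (prev) {
                prev->next = cur->next;
            } else {
                q->head = cur->next;
            }
            if (q->tail == cur) {
                q->tail = prev;
            }
            cur->queued = 0;
            return 0;
        }
    }
    return -1;
}
```

In a threaded version, all three functions would run with the queue's mutex held, and cancellation can only succeed for work that is still waiting.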
[Qemu-devel] [Bug 663713] Re: Mouse frozen under an emulated ubuntu
Well, looks like the patch was added on 20 October. http://git.qemu.org/qemu.git/commit/?id=cd496926155afcb3b6323e70dd720dc118b3a255 With a little luck, it will be in Qemu 0.13.1. Thanks for the URL and the patch. Have a good day.

-- Mouse frozen under an emulated ubuntu https://bugs.launchpad.net/bugs/663713

Status in QEMU: Incomplete

Bug description: Qemu 0.13.0. Command line used: qemu-system-x86_64 --enable-kvm -localtime -soundhw all -k fr -m 1500 -net user -net nic,model=rtl8139 -hda disk.img -cdrom ubuntu-10.10-desktop-amd64.iso -boot d

When I try to move the mouse cursor in qemu, the pointer is frozen. Nothing is moving. It was working perfectly with Qemu 0.12.5.
Re: [Qemu-devel] [PATCH v9 0/4] Introduce strtosz and make use of it
On 10/22/10 09:59, Markus Armbruster wrote: jes.soren...@redhat.com writes: From: Jes Sorensen jes.soren...@redhat.com

This patch introduces cutils.c: strtosz() and gets rid of the multiple custom hacks for parsing byte sizes. In addition it adds support for specifying human style sizes such as 1.5G. Last it eliminates the horrible abuse of a float to store the byte size for migrate_set_speed in the monitor. Note, this is tested on Linux and build tested for win32 using mingw32. v9: I worked through a couple of revisions directly with Markus and I think I got it right finally.

I'd prefer to have strtosz() match strtol() and friends and not restrict suffixes. But this code does what it claims to do, as far as I can see, so: ACK series

Thanks! I thought about this a fair bit and I believe doing the full test in the function is the most valuable. It's a personal preference obviously. I think it's a win to do it here since it simplifies the caller code. Would be great to get this applied so I can get it off my plate :) Cheers, Jes
Re: [Qemu-devel] [PATCH 2/2] v2 Fix Block Hotplug race with drive_unplug()
On Thu, Oct 21, 2010 at 04:37:46PM -0500, Ryan Harper wrote: * Daniel P. Berrange berra...@redhat.com [2010-10-21 08:29]: On Tue, Oct 19, 2010 at 09:32:29AM -0500, Ryan Harper wrote:

Block hot unplug is racy since the guest is required to acknowledge the ACPI unplug event; this may not happen synchronously with the device removal command. This series aims to close a gap whereby mgmt applications that assume the block resource has been removed without confirming that the guest has acknowledged the removal may re-assign the underlying device to a second guest, leading to data leakage. This series introduces a new monitor command to decouple asynchronous device removal from restricting guest access to a block device. We do this by creating a new monitor command drive_unplug which maps to a bdrv_unplug() command which does a qemu_aio_flush; bdrv_flush() and bdrv_close(). Once complete, subsequent IO is rejected from the device and the guest will get IO errors but continue to function. A subsequent device removal command can be issued to remove the device, to which the guest may or may not respond, but as long as the unplugged bit is set, no IO will be submitted.

The name 'drive_unplug' suggests to me that the drive object is not being deleted/free()d? Is that correct understanding, and if so, what is responsible for finally free()ing the drive backend?

It's technically the BlockDriverState Driver that we're closing. To fully release the remaining resources, a device_del is required (which of course requires guest participation with the current interface). Once QEMU issues the removal request, the guest responds and the piix4 acpi handler for pciej_write invokes qdev_free() on the target device. qdev_free() on the pci device will make its way to the qdev exit handler registered for virtio-blk devices, virtio_blk_exit_pci(). virtio_blk_exit_pci() marks the drive structure for deletion. When qdev calls the properties handler, it invokes free_drive() on the disk and that calls blockdev_auto_del() which will do a bdrv_delete() which nukes the remaining objects (the actual BlockDriverState). I think I got the whole path in there.

Ok, thanks, that makes sense to me. Sounds like we do still need a separate drive_del in the future to handle the different case of drive_add followed by a device_add attempt which fails.

Regards, Daniel
Re: [Qemu-devel] [PATCH v9 0/4] Introduce strtosz and make use of it
jes.soren...@redhat.com writes: From: Jes Sorensen jes.soren...@redhat.com

This patch introduces cutils.c: strtosz() and gets rid of the multiple custom hacks for parsing byte sizes. In addition it adds support for specifying human style sizes such as 1.5G. Last it eliminates the horrible abuse of a float to store the byte size for migrate_set_speed in the monitor. Note, this is tested on Linux and build tested for win32 using mingw32. v9: I worked through a couple of revisions directly with Markus and I think I got it right finally.

I'd prefer to have strtosz() match strtol() and friends and not restrict suffixes. But this code does what it claims to do, as far as I can see, so: ACK series
[Qemu-devel] Re: VM with two interfaces
On Thu, Oct 21, 2010 at 11:49 PM, Nirmal Guhan vavat...@gmail.com wrote: On Thu, Oct 21, 2010 at 9:23 AM, Nirmal Guhan vavat...@gmail.com wrote: Hi, Am trying to create a VM using qemu-kvm with two interfaces(fedora12 is the host and vm) and running into an issue. Given below is the command : qemu-kvm -net nic,macaddr=$macaddress,model=pcnet -net tap,script=/etc/qemu-ifup -net nic,model=pcnet -net tap,script=/etc/qemu-ifup -m 1024 -hda ./vdisk.img -kernel ./bzImage-1019 -append ip=x.y.z.w:a.b.c.d:p.q.r.s:a.b.c.d ip=x.y.z.u:a.b.c.d:p.q.r.s:a.b.c.d root=/dev/nfs rw nfsroot=x.y.z.v:/blahblahblah On boot, both eth0 and eth1 come up but the vm tries to send dhcp and rarp requests instead of using the command line IP addresses. DHCP would fail in my case. With just one interface, dhcp is not attempted and nfs mount of root works fine. Any clue on what could be wrong here? Thanks, Nirmal Can someone help please? Hard pressed on time... sorry Try the #qemu or #kvm IRC channels on chat.freenode.net. Often people will respond and debug interactively with you there. Your problem does not seem QEMU/KVM related. You'll need to debug the guest's boot process and network configuration just like a physical machine. In fact, I bet with an identical setup on a physical machine you'd see the same problem. Double-check the kernel parameters documentation. See if you have a network configuration in the initramfs or the root filesystem that would cause it to DHCP and how to pull the pre-configured settings from the kernel. Stefan
Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
On 21.10.2010 21:32, Laurent Vivier wrote: On Thursday, 21 October 2010 at 10:07 -0500, Anthony Liguori wrote: On 10/21/2010 09:07 AM, Kevin Wolf wrote:

Hi all, I'm currently looking into adding a return value to qemu's bdrv_flush function and I noticed that your block drivers (nbd, rbd and sheepdog) don't implement bdrv_flush at all. bdrv_flush is going to return -ENOTSUP for any block driver not implementing this, effectively breaking these three drivers for anything but cache=unsafe. Is there a specific reason why your drivers don't implement this?

NBD doesn't have a notion of flush. Only read/write and the block-nbd implementation doesn't do write-caching so flush would be a nop. I'm not sure what the right semantics would be for QEMU. My guess is a nop flush.

I agree.

Of course, as Laurent said a while ago, there is no specification for NBD, so it's hard to say what the intended semantics is. However, I did have a look at the nbdserver code and it looks as if it implements something similar to writethrough (namely fsync after each write) only if configured this way on the server side. qemu-nbd defaults to writethrough, but can be configured to use cache=none. So with either server, qemu as a client can't tell whether the data is safe on disk or not. In my book this is a strong argument for refusing to open nbd connections with anything but cache=unsafe.

Kevin
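The two behaviours discussed in this thread (a generic layer that reports -ENOTSUP when a driver has no flush callback, and a driver that supplies a no-op flush because its writes are already stable when they complete) can be sketched roughly as below. The SketchDriver type and the sketch_* functions are hypothetical stand-ins, not QEMU's block layer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver table; a stand-in, not QEMU's BlockDriver. */
typedef struct SketchDriver {
    const char *name;
    int (*flush)(void *opaque); /* NULL if the driver has no flush */
} SketchDriver;

/* Generic layer: a driver without a flush callback yields -ENOTSUP,
 * so callers can tell that durability cannot be guaranteed. */
static int sketch_flush(const SketchDriver *drv, void *opaque)
{
    if (!drv->flush) {
        return -ENOTSUP;
    }
    return drv->flush(opaque);
}

/* No-op flush for a driver whose completed writes are already on
 * stable storage (write-through semantics on the server side). */
static int sketch_flush_nop(void *opaque)
{
    (void)opaque;
    return 0;
}
```

A no-op flush is only correct if the backend really is write-through; as noted above, an NBD client cannot verify that from its side.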
Re: Fwd: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
[ Adding qemu-devel to CC again ]

On 21.10.2010 20:59, Sage Weil wrote: On Thu, 21 Oct 2010, Christian Brunner wrote:

Hi, is there a flush operation in librados? - I guess the only way to handle this would be waiting until all aio requests are finished?

That's not the semantics of bdrv_flush, you don't need to wait for running requests. You just need to make sure that all completed requests are safe on disk so that they would persist even in case of a crash/power failure.

There is no flush currently. But librados does no caching, so in this case at least silently upgrading to cache=writethrough should work.

You're making sure that the data can't be cached in the server's page cache or volatile disk cache either, e.g. by using O_SYNC for the image file? If so, upgrading would be safe.

If that's a problem, we can implement a flush. Just let us know.

Presumably providing a writeback mode with explicit flushes could improve performance. Upgrading to writethrough is not a correctness problem, though, so it's your decision if you want to implement it.

Kevin

-- Forwarded message -- From: Kevin Wolf kw...@redhat.com Date: 2010/10/21 Subject: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog To: Christian Brunner c...@muc.de, Laurent Vivier laur...@vivier.eu, MORITA Kazutaka morita.kazut...@lab.ntt.co.jp Cc: Qemu-devel@nongnu.org

Hi all, I'm currently looking into adding a return value to qemu's bdrv_flush function and I noticed that your block drivers (nbd, rbd and sheepdog) don't implement bdrv_flush at all. bdrv_flush is going to return -ENOTSUP for any block driver not implementing this, effectively breaking these three drivers for anything but cache=unsafe. Is there a specific reason why your drivers don't implement this? I think I remember that one of the drivers always provides cache=writethrough semantics. It would be okay to silently upgrade to cache=writethrough, so in this case I'd just need to add an empty bdrv_flush implementation. Otherwise, we really cannot allow any option except cache=unsafe because that's the semantics provided by the driver. In any case, I think it would be a good idea to implement a real bdrv_flush function to allow the write-back cache modes cache=off and cache=writeback in order to improve performance over writethrough. Is this possible with your protocols, or can the protocol be changed to consider this? Any hints on how to proceed?

Kevin
Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
On 22.10.2010 07:43, MORITA Kazutaka wrote: At Thu, 21 Oct 2010 16:07:28 +0200, Kevin Wolf wrote:

Hi all, I'm currently looking into adding a return value to qemu's bdrv_flush function and I noticed that your block drivers (nbd, rbd and sheepdog) don't implement bdrv_flush at all. bdrv_flush is going to return -ENOTSUP for any block driver not implementing this, effectively breaking these three drivers for anything but cache=unsafe. Is there a specific reason why your drivers don't implement this? I think I remember that one of the drivers always provides cache=writethrough semantics. It would be okay to silently upgrade to cache=writethrough, so in this case I'd just need to add an empty bdrv_flush implementation. Otherwise, we really cannot allow any option except cache=unsafe because that's the semantics provided by the driver. In any case, I think it would be a good idea to implement a real bdrv_flush function to allow the write-back cache modes cache=off and cache=writeback in order to improve performance over writethrough. Is this possible with your protocols, or can the protocol be changed to consider this? Any hints on how to proceed?

It is a bit difficult to implement an effective bdrv_flush in the sheepdog block driver. Sheepdog virtual disks are split and distributed to all cluster servers, so the block driver needs to send flush requests to all of them. I'm not sure this could improve performance more than writethrough semantics.

It could probably be optimized so that you only send flush requests to servers that have actually received write requests since the last flush. But yes, that's probably a valid point. I guess there's only one way to find out how it performs: Trying it out.

So I think it is better to support only writethrough semantics currently (I'll modify the sheepdog server code to open stored objects with O_SYNC or O_DIRECT) and leave write-back semantics as future work.

I agree, that makes sense. Note that O_DIRECT does not provide write-through semantics. It bypasses the page cache, but it doesn't flush other caches like a volatile disk write cache. If you want to use it, you still need explicit flushes or O_DIRECT | O_SYNC.

Kevin
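The distinction drawn above between write-through and write-back with explicit flushes can be illustrated with plain POSIX calls. This is a minimal sketch, not sheepdog or QEMU code: the function names are hypothetical, O_DIRECT is deliberately left out because it additionally requires aligned buffers, and a short write is simply treated as a failure.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write-through: with O_SYNC, write() returns only after the data has
 * been pushed to stable storage, so no separate flush is needed. */
static int write_through(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600);
    ssize_t n;

    if (fd < 0) {
        return -1;
    }
    n = write(fd, buf, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/* Write-back plus explicit flush: write() may complete from cache;
 * the data is only guaranteed durable once fsync() has returned. */
static int write_then_flush(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    ssize_t n;
    int ret;

    if (fd < 0) {
        return -1;
    }
    n = write(fd, buf, len);
    ret = (n == (ssize_t)len) ? fsync(fd) : -1;
    close(fd);
    return ret;
}
```

The second form is what makes a write-back cache mode possible: many writes can be batched behind a single flush.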
Re: [Qemu-devel] [PATCH v9 0/4] Introduce strtosz and make use of it
Jes Sorensen jes.soren...@redhat.com writes: On 10/22/10 09:59, Markus Armbruster wrote: jes.soren...@redhat.com writes: From: Jes Sorensen jes.soren...@redhat.com

This patch introduces cutils.c: strtosz() and gets rid of the multiple custom hacks for parsing byte sizes. In addition it adds support for specifying human style sizes such as 1.5G. Last it eliminates the horrible abuse of a float to store the byte size for migrate_set_speed in the monitor. Note, this is tested on Linux and build tested for win32 using mingw32. v9: I worked through a couple of revisions directly with Markus and I think I got it right finally.

I'd prefer to have strtosz() match strtol() and friends and not restrict suffixes. But this code does what it claims to do, as far as I can see, so: ACK series

Thanks! I thought about this a fair bit and I believe doing the full test in the function is the most valuable. It's a personal preference obviously. I think it's a win to do it here since it simplifies the caller code.

But strtosz() doesn't know what suffixes the caller wants accepted. It accepting suffixes starting with whitespace, '\0' and ',' is just a guess. Three cases:

1. The guess is exactly right. strtosz()'s checking saves the caller the check.
2. It's too loose, caller wants to accept fewer suffixes, e.g. just '\0'. strtosz()'s checking doesn't save the caller anything.
3. It's too tight, caller wants to accept more suffixes. Caller needs to make a copy without the suffix. In the general case, this requires duplicating strtosz()'s parsing logic. It's easier not to use it then.

The uses of strtosz() added by this series all fall into case 2. The callers lack the check, which means we accept trailing garbage. Fine with me, because it's no worse than before.

Would be great to get this applied so I can get it off my plate :)

Agreed.
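The strtol()-style alternative argued for above, where the parser reports where it stopped and each caller decides which trailing characters are acceptable, might look like the sketch below. It is not the strtosz() from the series: parse_size_sketch is a hypothetical name, and treating a number without a unit as plain bytes is an assumption made only for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper in the style of strtol(): parses a size such as
 * "1.5G" and stores the position after the parsed text in *end.
 * Returns the size in bytes, or -1 on error. */
static int64_t parse_size_sketch(const char *s, char **end)
{
    char *p;
    double val;
    double bytes;
    int64_t mul = 1;

    val = strtod(s, &p);
    if (p == s || !(val >= 0)) { /* nothing parsed, negative, or NaN */
        if (end) {
            *end = (char *)s;
        }
        return -1;
    }
    switch (*p) {
    case 'k': case 'K': mul = 1LL << 10; p++; break;
    case 'M':           mul = 1LL << 20; p++; break;
    case 'G':           mul = 1LL << 30; p++; break;
    case 'T':           mul = 1LL << 40; p++; break;
    default:            break; /* no unit: plain bytes (assumption) */
    }
    bytes = val * (double)mul;
    if (!(bytes < 9223372036854775808.0)) { /* would overflow int64_t */
        if (end) {
            *end = (char *)s;
        }
        return -1;
    }
    if (end) {
        *end = p;
    }
    return (int64_t)bytes;
}
```

A caller in case 2 then rejects anything but a terminating '\0' by checking `*end` itself, while a caller that parses a comma-separated option string can accept ',' as well.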
[Qemu-devel] Re: [PATCH v6 08/12] pcie/aer: helper functions for pcie aer capability
On Fri, Oct 22, 2010 at 08:53:15AM +0900, Isaku Yamahata wrote: On Thu, Oct 21, 2010 at 10:07:01AM +0200, Michael S. Tsirkin wrote: On Thu, Oct 21, 2010 at 02:15:24PM +0900, Isaku Yamahata wrote: Thank you for detailed review. On Wed, Oct 20, 2010 at 11:56:16AM +0200, Michael S. Tsirkin wrote:

+static uint32_t aer_log_del(PCIEAERLog *aer_log)
+{
+    uint32_t i = aer_log->consumer;
+    aer_log->consumer = aer_log_next(aer_log->consumer, aer_log->log_max);
+    return i;
+}

Please just use 'int' or 'unsigned' instead of uint32_t if you simply want to say 'a number'. Using specific length makes it impossible to say where you *really* want a value.

PCIEAERLog is saved/loaded. So explicit sized number is chosen.

I didn't notice. But consumer/producer are not guest visible, are they? If yes I think it's a mistake to save/load them, tying us to a specific implementation: just calculate and save the # of valid entries in the log.

PCIEAERLog and PCIEAERErr themselves are the implementation specific internal states which are not visible to guest. So there is no point of arguing that consumer/producer are specific to implementation.

Well the errors can be read out by the guest, so they are potentially guest visible. Thus I think the only thing we want to migrate is the list of errors logged.

Also I put the comment here but it is a general one: pls go over the code, and just take a look at what types you use all over and think whether size really matters. In most places it does not, it just must be big enough, so use int or unsigned there. It will be much harder for others to do so later.

Will do.

-- yamahata
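To make the producer/consumer discussion concrete, here is a small ring-buffer sketch in which the number of valid entries is derived from the two indices. That count, together with the entries themselves, is what the review suggests saving instead of the raw indices. All names and the fixed capacity are hypothetical and do not come from the patch.

```c
#include <stdint.h>

#define SKETCH_LOG_MAX 4 /* hypothetical capacity; one slot stays unused */

typedef struct SketchLog {
    uint32_t producer;           /* next slot to write */
    uint32_t consumer;           /* next slot to read */
    int entries[SKETCH_LOG_MAX]; /* stand-in for logged errors */
} SketchLog;

static uint32_t sketch_next(uint32_t i)
{
    return (i + 1) % SKETCH_LOG_MAX;
}

static int sketch_empty(const SketchLog *l)
{
    return l->producer == l->consumer;
}

static int sketch_full(const SketchLog *l)
{
    return sketch_next(l->producer) == l->consumer;
}

/* Number of valid entries, independent of where the indices sit. */
static uint32_t sketch_count(const SketchLog *l)
{
    return (l->producer + SKETCH_LOG_MAX - l->consumer) % SKETCH_LOG_MAX;
}

/* Add an entry; returns -1 if the log is full. */
static int sketch_add(SketchLog *l, int err)
{
    if (sketch_full(l)) {
        return -1;
    }
    l->entries[l->producer] = err;
    l->producer = sketch_next(l->producer);
    return 0;
}

/* Remove the oldest entry into *err; returns -1 if the log is empty. */
static int sketch_del(SketchLog *l, int *err)
{
    if (sketch_empty(l)) {
        return -1;
    }
    *err = l->entries[l->consumer];
    l->consumer = sketch_next(l->consumer);
    return 0;
}
```

Saving only the count and the entries in order would let the destination rebuild the log with consumer set to 0, without depending on this particular index layout.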
Re: [Qemu-devel] v6: [PATCH 0/3]: Threadlets: A generic task offloading framework
On (Wed) Oct 20 2010 [08:18:51], Anthony Liguori wrote: On 10/20/2010 07:05 AM, Stefan Hajnoczi wrote: On Wed, Oct 20, 2010 at 12:57 PM, Amit Shah amit.s...@redhat.com wrote: On (Tue) Oct 19 2010 [23:12:20], Arun R Bharadwaj wrote:

Hi, This is the v6 of the patch-series to have a generic asynchronous task offloading framework (called threadlets) within qemu. Request to consider pulling this series as discussed during the Qemu-devel call.

I tried this out with virtio-serial (patch below). Have a couple of things to note:

- Guests get a SIGUSR2 on startup sometimes. This doesn't happen with qemu.git, so looks like it's introduced by this patchset.

- After running some tests, I get an abort. I still have to look at what's causing it, but doesn't look like it's related to virtio-serial code.

Program received signal SIGABRT, Aborted.
0x003dc76329a5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install SDL-1.2.14-8.fc13.x86_64 glibc-2.12.1-2.x86_64 libX11-1.3.1-3.fc13.x86_64 libXau-1.0.5-1.fc12.x86_64 libpng-1.2.44-1.fc13.x86_64 libxcb-1.5-1.fc13.x86_64 ncurses-libs-5.7-7.20100130.fc13.x86_64 zlib-1.2.3-23.fc12.x86_64
(gdb) bt
#0 0x003dc76329a5 in raise () from /lib64/libc.so.6
#1 0x003dc7634185 in abort () from /lib64/libc.so.6
#2 0x004bf829 in qemu_get_ram_ptr (addr=<value optimized out>) at /home/amit/src/qemu/exec.c:2936
#3 0x004bf9a7 in lduw_phys (addr=<value optimized out>) at /home/amit/src/qemu/exec.c:3836
#4 0x00557c90 in vring_avail_idx (vq=0x17b9320, idx=1333) at /home/amit/src/qemu/hw/virtio.c:133
#5 virtqueue_num_heads (vq=0x17b9320, idx=1333) at /home/amit/src/qemu/hw/virtio.c:252
#6 0x00557e5e in virtqueue_avail_bytes (vq=0x17b9320, in_bytes=4096, out_bytes=0) at /home/amit/src/qemu/hw/virtio.c:311

- I'm using a threadlet to queue up several work items which are to be processed in a fifo order. There's no cancel function for a threadlet that either processes all work and then quits the thread or just cancels all pending work and quits.

Amit

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 74ba5ec..caaafbe 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -51,6 +51,14 @@ struct VirtIOSerial {
     struct virtio_console_config config;
 };
+typedef struct VirtIOSerialWork {
+    ThreadletWork work;
+    VirtIOSerialPort *port;
+    VirtQueue *vq;
+    VirtIODevice *vdev;
+    int discard;
+} VirtIOSerialWork;
+
 static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id)
 {
     VirtIOSerialPort *port;
@@ -113,10 +121,20 @@ static size_t write_to_port(VirtIOSerialPort *port,
     return offset;
 }
-static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
-                                 VirtIODevice *vdev, bool discard)
+static void async_flush_queued_data(ThreadletWork *work)
 {
+    VirtIOSerialPort *port;
+    VirtIOSerialWork *vs_work;
+    VirtQueue *vq;
+    VirtIODevice *vdev;
     VirtQueueElement elem;
+    int discard;
+
+    vs_work = DO_UPCAST(VirtIOSerialWork, work, work);
+    port = vs_work->port;
+    vq = vs_work->vq;
+    vdev = vs_work->vdev;
+    discard = vs_work->discard;
     assert(port || discard);
     assert(virtio_queue_ready(vq));

You cannot access guest memory using QEMU RAM functions (or use the virtqueue_pop() function which uses them) from a thread without taking the QEMU global mutex. The abort stack trace is a result of accessing guest RAM from two threads simultaneously. In general it is not safe to use QEMU functions from a thread unless they are explicitly written to work outside the QEMU global mutex. Most functions assume the global mutex, which serializes I/O thread and vcpu changes to global state, is held.

Yes, threadlets are only meant to be used to make synchronous system calls asynchronous. They are not meant to add parallelism to QEMU (yet).

Yes -- I realised that. (But I don't see why the virtio rings get modified elsewhere other than the only function I push/pop from above.)

Anyway, just one question as I've still not read the code: does a running work item in a threadlet block migration? Do the remaining work items in a threadlet get migrated fine?

Amit
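The patch above embeds a generic work item in a device-specific structure and recovers the outer structure inside the callback with DO_UPCAST. The same pattern can be sketched in standalone C with offsetof. The macro and type names below are hypothetical; this is not QEMU's DO_UPCAST, only an illustration of the idea.

```c
#include <stddef.h>

/* Hypothetical generic work item, standing in for ThreadletWork. */
typedef struct SketchWork {
    void (*func)(struct SketchWork *work);
} SketchWork;

/* Device-specific request that embeds the generic item. */
typedef struct SketchPortWork {
    SketchWork work; /* embedded generic item */
    int port_id;
    int discard;
    int handled;     /* set by the callback, for illustration */
} SketchPortWork;

/* Recover a pointer to the containing structure from a pointer to
 * one of its members. */
#define SKETCH_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Callback: receives only the generic item and climbs back to the
 * device-specific request that contains it. */
static void sketch_flush_port(SketchWork *work)
{
    SketchPortWork *pw = SKETCH_CONTAINER_OF(work, SketchPortWork, work);

    pw->handled = 1;
}
```

As the replies point out, what such a callback does matters more than how it finds its data: work running outside the global mutex should be limited to blocking system calls and must not touch guest memory or virtqueues.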
[Qemu-devel] Re: [PATCH 2/2] Replace remaining gcc format attributes by macro GCC_FMT_ATTR (format checking)
On 13.10.2010 20:54, Stefan Weil wrote:

Replace the remaining format attribute printf by macro GCC_FMT_ATTR which uses gnu_printf (if supported).

v2
* Removal of dyngen specific code is now done in a separate patch.
* Handle attribute in new ui/spice-display.c, too.

Cc: Blue Swirl blauwir...@gmail.com
Signed-off-by: Stefan Weil w...@mail.berlios.de
---
cpu-all.h |2 +-
ui/spice-display.c |3 +--
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 67a3266..11edddc 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -773,7 +773,7 @@ void cpu_dump_statistics (CPUState *env, FILE *f, int flags);
 void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
-__attribute__ ((__format__ (__printf__, 2, 3)));
+GCC_FMT_ATTR(2, 3);
 extern CPUState *first_cpu;
 extern CPUState *cpu_single_env;
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 6702dfd..7b4f5c1 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -29,8 +29,7 @@
 static int debug = 0;
-static void __attribute__((format(printf,2,3)))
-dprint(int level, const char *fmt, ...)
+static void GCC_FMT_ATTR(2, 3) dprint(int level, const char *fmt, ...)
 {
 va_list args;

Can these two patches be applied to qemu master, or is there still something missing? They are needed for additional format checks.

Regards, Stefan
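For readers unfamiliar with the attribute being wrapped, the sketch below shows the general technique: a macro that expands to GCC's format attribute when available, applied to a variadic function so that the compiler checks arguments against the format string. SKETCH_FMT_ATTR and sketch_format are hypothetical names, and the sketch uses the plain printf archetype rather than the gnu_printf variant mentioned in the patch description.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper macro; expands to nothing on compilers that do
 * not understand the GNU attribute syntax. */
#if defined(__GNUC__)
# define SKETCH_FMT_ATTR(n, m) __attribute__((format(printf, n, m)))
#else
# define SKETCH_FMT_ATTR(n, m)
#endif

/* Argument 3 is the format string, variadic arguments start at 4. */
static int sketch_format(char *buf, size_t len, const char *fmt, ...)
    SKETCH_FMT_ATTR(3, 4);

static int sketch_format(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, len, fmt, args);
    va_end(args);
    return n;
}
```

With the attribute in place, a call such as `sketch_format(buf, sizeof(buf), "%s", 42)` draws a compiler warning when format checking is enabled, which is the kind of additional check the patch aims to enable.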
Re: [Qemu-devel] [PATCH 2/2] v2 Fix Block Hotplug race with drive_unplug()
On 21.10.2010 23:37, Ryan Harper wrote: * Daniel P. Berrange berra...@redhat.com [2010-10-21 08:29]: On Tue, Oct 19, 2010 at 09:32:29AM -0500, Ryan Harper wrote:

Block hot unplug is racy since the guest is required to acknowledge the ACPI unplug event; this may not happen synchronously with the device removal command. This series aims to close a gap whereby mgmt applications that assume the block resource has been removed without confirming that the guest has acknowledged the removal may re-assign the underlying device to a second guest, leading to data leakage. This series introduces a new monitor command to decouple asynchronous device removal from restricting guest access to a block device. We do this by creating a new monitor command drive_unplug which maps to a bdrv_unplug() command which does a qemu_aio_flush; bdrv_flush() and bdrv_close(). Once complete, subsequent IO is rejected from the device and the guest will get IO errors but continue to function. A subsequent device removal command can be issued to remove the device, to which the guest may or may not respond, but as long as the unplugged bit is set, no IO will be submitted.

The name 'drive_unplug' suggests to me that the drive object is not being deleted/free()d? Is that correct understanding, and if so, what is responsible for finally free()ing the drive backend?

It's technically the BlockDriverState Driver that we're closing. To fully release the remaining resources, a device_del is required (which of course requires guest participation with the current interface).

So is this basically what blockdev_del is supposed to become one day? Copying Markus to have a look at this. I'm sure he has some thoughts on it as he was planning to implement blockdev_add/del.

Kevin

Once QEMU issues the removal request, the guest responds and the piix4 acpi handler for pciej_write invokes qdev_free() on the target device. qdev_free() on the pci device will make its way to the qdev exit handler registered for virtio-blk devices, virtio_blk_exit_pci(). virtio_blk_exit_pci() marks the drive structure for deletion. When qdev calls the properties handler, it invokes free_drive() on the disk and that calls blockdev_auto_del() which will do a bdrv_delete() which nukes the remaining objects (the actual BlockDriverState). I think I got the whole path in there.
Re: [Qemu-devel] Re: [PATCH v6 08/12] pcie/aer: helper functions for pcie aer capability
Michael S. Tsirkin m...@redhat.com writes: On Wed, Oct 20, 2010 at 05:18:57PM +0900, Isaku Yamahata wrote:

This patch implements helper functions for pcie aer capability which will be used later. Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp

Some style comments and a couple of minor bugs.

[...]

+
+static int aer_log_add_err(PCIEAERLog *aer_log, const PCIEAERErr *err)
+{
+    uint32_t i;
+    if (aer_log_full(aer_log)) {
+        return -1;
+    }
+    i = aer_log_add(aer_log);
+    memcpy(&aer_log->log[i], err, sizeof(*err));

sizeof is not a function, you don't need ()

Matter of taste. Yes, it's an operator, but it's as function-like as operators get. For what it's worth, grep says roughly 98% of our sizeof uses are followed by '('. I doubt the parentheses are required for grouping that often.

[...]
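For reference, the language rule behind this exchange: sizeof applied to an expression does not need parentheses, while sizeof applied to a type name does. A tiny sketch with hypothetical names:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, used only for this illustration. */
typedef struct SketchErr {
    unsigned status;
    unsigned source_id;
} SketchErr;

static void sketch_copy_err(SketchErr *dst, const SketchErr *src)
{
    /* Operand is an expression: the parentheses are optional. */
    memcpy(dst, src, sizeof *src);
}

static size_t sketch_err_size(void)
{
    /* Operand is a type name: the parentheses are required. */
    return sizeof(SketchErr);
}
```

Both spellings yield the same value for the same type, so the choice in the patch is purely a matter of style, as the reply says.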
Re: [Qemu-devel] [PATCH]: VNC: set listener socket to non-blocking mode
On Thu, 2010-10-21 at 17:44 +0100, Anthony Liguori wrote: On 10/21/2010 11:18 AM, Gianni Tedesco wrote:

This prevents qemu from hanging waiting for a client to connect. I have reproduced this when doing a loadvm but it may be a more general problem in that poll/accept may race if a client aborts the connection with a RST before the accept has completed. In either case the fix seems harmless.

Signed-off-by: Gianni Tedesco gianni.tede...@citrix.com

I'd feel a little better with explicit error handling in the accept() path and a comment explaining why this was necessary. But otherwise, nice catch!

Thanks! Not sure what you mean about the explicit error handling in the accept() path though? Accept can fail for a few reasons, the most worrying is EMFILE which easily leads to livelock. Poll reports listener as ready to accept but accept fails because we maxed out the fd table so go back to poll and the listener is still ready to accept etc... It's not always clear how to handle specific accept() errors. I think only calling vnc_connect() when we have an fd is the best we can hope for right now. Unless I misunderstood?

-8-8-8-

diff --git a/ui/vnc.c b/ui/vnc.c
index 864342e..65dc55c 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2736,5 +2736,9 @@ int vnc_display_open(DisplayState *ds, const char *display)
             vs->display = dpy;
         }
     }
+    /* necessary to prevent accept() hanging indefinitely if a clients
+     * connection-reset wins the race between poll() and accept()
+     */
+    socket_set_nonblock(vs->lsock);
     return qemu_set_fd_handler2(vs->lsock, NULL, vnc_listen_read, NULL, vs);
 }
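The two pieces discussed here, putting the listening socket into non-blocking mode and only acting when accept() actually returns a descriptor, can be sketched with plain POSIX calls. The helper names and the -1/-2 return convention are hypothetical; this is not the QEMU socket_set_nonblock() or the VNC listener code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode. Returns 0 or -1. */
static int sketch_set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try to accept one connection on a non-blocking listener.
 * Returns the new fd, -1 if there is simply nothing to accept right
 * now (including a client that reset before we got here), or -2 for
 * any other error such as EMFILE. */
static int sketch_try_accept(int lsock)
{
    int fd = accept(lsock, NULL, NULL);

    if (fd >= 0) {
        return fd;
    }
    switch (errno) {
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case ECONNABORTED:
    case EINTR:
        return -1; /* transient: go back to poll() */
    default:
        return -2; /* caller should not spin on this */
    }
}
```

The -2 case covers the EMFILE livelock mentioned above: the listener stays readable while accept() keeps failing, so a caller would need to back off or stop polling the listener for a while rather than retry immediately.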
Re: [Qemu-devel] Re: [PATCH v6 07/12] pcie/hotplug: introduce pushing attention button command
Michael S. Tsirkin m...@redhat.com writes: On Wed, Oct 20, 2010 at 05:18:56PM +0900, Isaku Yamahata wrote:

glue pcie_push_attention_button command. Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp

So as a high level command, I think we need to think about how to tie this into pci_add/pci_del. Right?

[...]

Do we have consensus on what our set of commands for hot plug should look like? We talked about it, but did we reach consensus? If yes, did we write it down somewhere?
[Qemu-devel] Re: [RESEND PATCH] monitor: properly handle illegal fd/vhostfd from command line
On Thu, 21 Oct 2010 19:29:02 +0800 Jason Wang jasow...@redhat.com wrote:

When handling fd/vhostfd from the command line through net_handle_fd_param(), we need to check mon and the return value of strtol(); otherwise we could get a segmentation fault or an invalid fd when the user types an illegal fd/vhostfd. This patch is based on the suggestions from Luiz Capitulino lcapitul...@redhat.com.

Signed-off-by: Jason Wang jasow...@redhat.com
---
net.c | 12 +---
1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/net.c b/net.c
index ed74c7f..ab9c3bb 100644
--- a/net.c
+++ b/net.c
@@ -774,8 +774,8 @@ int qemu_find_nic_model(NICInfo *nd, const char * const *models,
 int net_handle_fd_param(Monitor *mon, const char *param)
 {
-    if (!qemu_isdigit(param[0])) {
-        int fd;
+    int fd;

Better to add a space here.

+    if (!qemu_isdigit(param[0]) && mon) {
         fd = monitor_get_fd(mon, param);
         if (fd == -1) {
@@ -785,7 +785,13 @@ int net_handle_fd_param(Monitor *mon, const char *param)
         return fd;
     } else {
-        return strtol(param, NULL, 0);
+        char *endptr = NULL;
+
+        fd = strtol(param, &endptr, 10);
+        if (*endptr || (fd == 0 && param == endptr)) {
+            return -1;
+        }
+        return fd;
     }

You can put 'return fd' here and drop the two above, also this is not a monitor patch, it's a networking one. Otherwise looks ok to me.

 }
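The checks the patch adds around strtol() follow a common pattern: pass an end pointer, reject input where nothing was consumed or where characters are left over, and reject values that do not fit. A standalone sketch of that pattern, with a hypothetical function name and stricter range checks than the patch itself performs:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal file descriptor number.
 * Returns the fd, or -1 if the string is not a valid non-negative
 * integer that fits in an int. */
static int sketch_parse_fd(const char *param)
{
    char *endptr = NULL;
    long val;

    errno = 0;
    val = strtol(param, &endptr, 10);
    if (endptr == param || *endptr != '\0') {
        return -1; /* no digits at all, or trailing garbage */
    }
    if (errno == ERANGE || val < 0 || val > INT_MAX) {
        return -1; /* out of range for a descriptor */
    }
    return (int)val;
}
```

Checking `endptr == param` is what distinguishes an empty or non-numeric string from a genuine "0", which is the case the patch handles with `fd == 0 && param == endptr`.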
Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)
On Thu, 21 Oct 2010 15:27:23 +0200 Alon Levy al...@redhat.com wrote: On Thu, Oct 21, 2010 at 08:13:19AM -0500, Anthony Liguori wrote: On 10/21/2010 08:03 AM, Gerd Hoffmann wrote: On 10/21/10 08:36, Alon Levy wrote:

v2->v3 changes:
* add configure parameter
* fix docs

v2 message: This patchset uses id like device_del for attaching/detaching usb devices. The first two patches ready the way:
1. makes qdev_find_recursive non static and in qdev.h
2. adds a usb_device_by_id which goes over the usb buses calling qdev_find_recursive
3. adds the commands that use usb_device_by_id

Alon Levy (3):
qdev: make qdev_find_recursive public
usb: add public usb_device_by_id
monitor: add usb_attach and usb_detach (v2)

Acked-by: Gerd Hoffmann kra...@redhat.com

Okay, I am still confused about the use-case for this and I don't see any further explanation in the commit messages. I've seen debugging but can you be a bit more specific about which cases it's needed for?

To elaborate a little more, when using a certificates based card there is no hardware event (i.e. removing/inserting the physical card) that causes a usb_detach/attach to the card (both in passthru and emulated), but otoh certificates is good for testing since it decouples it from NSS/tcp. So I needed some way to emulate an insert/remove, and I saw usb_del, which was pretty close, and voila. This is not the same as card remove/reinsert, but it is exactly what will happen to the guest when spicec connects/disconnects, since I detach devices on disconnect and attach on connect.

Looks reasonable to me, especially because this will be protected by #ifdef DEBUG. I don't see a big deal in merging this. Objections, Anthony?

This is just adding a HMP command. Is that the right approach or was that an unintentional consequence of rebasing post-HMP/QMP split?

I don't think this should be available under QMP, it's more a debugging command for USB developers.

Regards, Anthony Liguori

cheers, Gerd
Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)
On Fri, 22 Oct 2010 07:55:02 -0500 Anthony Liguori anth...@codemonkey.ws wrote: On 10/22/2010 07:48 AM, Luiz Capitulino wrote: On Thu, 21 Oct 2010 15:27:23 +0200 Alon Levyal...@redhat.com wrote: On Thu, Oct 21, 2010 at 08:13:19AM -0500, Anthony Liguori wrote: On 10/21/2010 08:03 AM, Gerd Hoffmann wrote: On 10/21/10 08:36, Alon Levy wrote: v2-v3 changes: * add configure parameter * fix docs v2 message: This patchset uses id like device_del for attaching/detaching usb devices. The first two patches ready the way: 1. makes qdev_find_recursive non static and in qdev.h 2. adds a usb_device_by_id which goes over the usb buses calling qdev_find_recursive 3. adds the commands that use usb_device_by_id Alon Levy (3): qdev: make qdev_find_recursive public usb: add public usb_device_by_id monitor: add usb_attach and usb_detach (v2) Acked-by: Gerd Hoffmannkra...@redhat.com Okay, I am still confused about the use-case for this and I don't see any further explanation in the commit messages. I've seen debugging but can you be a bit more specific about which cases it's needed for? To elaborate a little more, when using a certificates based card there is no hardware event (i.e. removing/inserting the physical card) that causes a usb_detach/attach to the card (both in passthru and emulated), but otoh certificates is good for testing since it decouples it from NSS/tcp. So I needed some way to emulate an insert/remove, and I saw usb_del, which was pretty close, and voila. This is not the same as card remove/reinsert, but it is exactly what will happen to the guest when spicec connects/disconnects, since I detach devices on disconnect and attach on connect. Looks reasonable to me, specially because this will be protected by #ifdef DEBUG. I don't see a big deal in merging this. I'd just like to see better documentation. A command isn't useful for debugging if noone knows how to use it. Guarding with an #ifdef isn't necessary. 
It should be unconditionally enabled otherwise it will bit rot. You and Gerd asked about the purpose of this command, turns out that it's only useful for developing new USB devices for QEMU, so I thought it would be better to restrict it, so that people don't start using this the wrong way or worse, we can't drop/break it because some tool is now using it. Anyway, maybe a good doc will do, this was just a small suggestion. Objections, Anthony? Not with better docs. Regards, Anthony Liguori This is just adding a HMP command. Is that the right approach or was that an unintentional consequence of rebasing post-HMP/QMP split? I don't think this should be available under QMP, it's more a debugging command for USB developers. Regards, Anthony Liguori cheers, Gerd
Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)
On 10/22/2010 07:48 AM, Luiz Capitulino wrote: On Thu, 21 Oct 2010 15:27:23 +0200 Alon Levyal...@redhat.com wrote: On Thu, Oct 21, 2010 at 08:13:19AM -0500, Anthony Liguori wrote: On 10/21/2010 08:03 AM, Gerd Hoffmann wrote: On 10/21/10 08:36, Alon Levy wrote: v2-v3 changes: * add configure parameter * fix docs v2 message: This patchset uses id like device_del for attaching/detaching usb devices. The first two patches ready the way: 1. makes qdev_find_recursive non static and in qdev.h 2. adds a usb_device_by_id which goes over the usb buses calling qdev_find_recursive 3. adds the commands that use usb_device_by_id Alon Levy (3): qdev: make qdev_find_recursive public usb: add public usb_device_by_id monitor: add usb_attach and usb_detach (v2) Acked-by: Gerd Hoffmannkra...@redhat.com Okay, I am still confused about the use-case for this and I don't see any further explanation in the commit messages. I've seen debugging but can you be a bit more specific about which cases it's needed for? To elaborate a little more, when using a certificates based card there is no hardware event (i.e. removing/inserting the physical card) that causes a usb_detach/attach to the card (both in passthru and emulated), but otoh certificates is good for testing since it decouples it from NSS/tcp. So I needed some way to emulate an insert/remove, and I saw usb_del, which was pretty close, and voila. This is not the same as card remove/reinsert, but it is exactly what will happen to the guest when spicec connects/disconnects, since I detach devices on disconnect and attach on connect. Looks reasonable to me, specially because this will be protected by #ifdef DEBUG. I don't see a big deal in merging this. I'd just like to see better documentation. A command isn't useful for debugging if noone knows how to use it. Guarding with an #ifdef isn't necessary. It should be unconditionally enabled otherwise it will bit rot. Objections, Anthony? Not with better docs. 
Regards, Anthony Liguori This is just adding a HMP command. Is that the right approach or was that an unintentional consequence of rebasing post-HMP/QMP split? I don't think this should be available under QMP, it's more a debugging command for USB developers. Regards, Anthony Liguori cheers, Gerd
Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
On 10/22/2010 03:29 AM, Kevin Wolf wrote:
> I agree. Of course, as Laurent said a while ago, there is no
> specification for NBD, so it's hard to say what the intended semantics
> is.
>
> However, I did have a look at the nbdserver code and it looks as if it
> implements something similar to writethrough (namely fsync after each
> write) only if configured this way on the server side. qemu-nbd defaults
> to writethrough, but can be configured to use cache=none. So with either
> server qemu as a client can't tell whether the data is safe on disk or
> not.
>
> In my book this is a strong argument for refusing to open nbd
> connections with anything but cache=unsafe.

On a physical system, if you don't have a battery backed disk and you
enable the WC on your disk, then even with cache=writethrough we're
unsafe.

Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe.

QEMU can't guarantee safety. The underlying storage needs to be
configured correctly. As long as we're not introducing caching within
QEMU, I don't think we should assume we're unsafe.

Do we have any place where we can add docs on a per-block format basis?
It would be good to at least mention for each block device how the
backing storage needed to be configured for safety.

Regards,

Anthony Liguori

> Kevin
Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/
On Thu, 21 Oct 2010 10:18:34 +0200
Paolo Bonzini pbonz...@redhat.com wrote:

> This is a small start from the autoconfy configure series. Actually,
> this part is almost completely new. It cleans up the tests Makefile
> so that the i386-linux-user testsuite's Makefile targets are more
> easily extensible and can work wherever a compiler for i386 is
> installed. In the future this could be extended to ARM and MIPS
> targets.

This is unrelated to this series, but something that would be very good
to have would be to move all check- programs to tests/ (or check/) and
make them part of 'make test' or introduce a new 'make check'.

> Paolo Bonzini (6):
>   unbreak make from tests directory
>   unbreak make from vpath-built tests directory
>   disable test_enter on i386, it is broken
>   make runcom compile on recent distributions
>   fix test_path
>   rewrite i386 tests Makefile
>
>  configure         |    5 ++
>  rules.mak         |    9
>  tests/Makefile    |  123 ++--
>  tests/runcom.c    |   11 ++---
>  tests/test-i386.c |    5 ++
>  tests/test_path.c |   13 +-
>  6 files changed, 114 insertions(+), 52 deletions(-)
Re: [Qemu-devel] [PATCH 0/7] [PULL] qemu-kvm.git uq/master queue
On 10/21/2010 01:25 PM, Marcelo Tosatti wrote:
> The following changes since commit 633aa0acfe2c4d3e56acfe28c912796bf54de6d3:
>
>   Fix pci hotplug to generate level triggered interrupt. (2010-10-20 17:23:28 -0500)

Pulled. Thanks.

Regards,

Anthony Liguori

> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master
>
> Hidetoshi Seto (3):
>   x86, mce: ignore SRAO only when MCG_SER_P is available
>   x86, mce: broadcast mce depending on the cpu version
>   Fix build on !KVM_CAP_MCE
>
> Marcelo Tosatti (4):
>   kvm: add save/restore of MSR_VM_HSAVE_PA
>   kvm: factor out kvm_has_msr_star
>   kvm: writeback SMP TSCs on migration only
>   kvm: save/restore x86-64 MSRs on x86-64 kernels
>
>  target-i386/kvm.c |  132 +++-
>  1 files changed, 99 insertions(+), 33 deletions(-)
Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
On 22.10.2010 14:58, Anthony Liguori wrote:
> On 10/22/2010 03:29 AM, Kevin Wolf wrote:
>> I agree. Of course, as Laurent said a while ago, there is no
>> specification for NBD, so it's hard to say what the intended semantics
>> is.
>>
>> However, I did have a look at the nbdserver code and it looks as if it
>> implements something similar to writethrough (namely fsync after each
>> write) only if configured this way on the server side. qemu-nbd
>> defaults to writethrough, but can be configured to use cache=none. So
>> with either server qemu as a client can't tell whether the data is safe
>> on disk or not.
>>
>> In my book this is a strong argument for refusing to open nbd
>> connections with anything but cache=unsafe.
>
> On a physical system, if you don't have a battery backed disk and you
> enable the WC on your disk, then even with cache=writethrough we're
> unsafe.

I don't think that's right. O_SYNC should guarantee that the volatile
disk cache is flushed.

> Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe.

Yeah, if you do something equivalent to cache=unsafe on a lower layer,
then qemu can't do much about it. Maybe you can apply the same argument
to NBD, even though it's unsafe by default.

> QEMU can't guarantee safety. The underlying storage needs to be
> configured correctly. As long as we're not introducing caching within
> QEMU, I don't think we should assume we're unsafe.
>
> Do we have any place where we can add docs on a per-block format basis?
> It would be good to at least mention for each block device how the
> backing storage needed to be configured for safety.

docs/block-protocols.txt?

Kevin
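For readers following the O_SYNC point in this message, here is a minimal POSIX sketch (not QEMU code) of the two ways an application can ask for written data to be pushed towards stable storage: an explicit fsync() after a buffered write, and opening the file with O_SYNC. The helper names are made up for illustration. Whether either call actually empties a drive's volatile write cache depends on the kernel, the filesystem and its mount options (for example barrier=0), and the device, which is exactly what is being debated in this thread.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: buffered write followed by an explicit flush request. */
int demo_write_then_fsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }
    /* Data may sit in the page cache until fsync() asks for it to be flushed. */
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Hypothetical helper: with O_SYNC each write() returns only after the
 * kernel reports the data as written through. */
int demo_write_osync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);
    if (fd < 0) {
        return -1;
    }
    if (write(fd, buf, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Both helpers return 0 on success and -1 on any failure; neither can tell the caller what a remote NBD server does with the data afterwards.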
Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/
On 10/22/2010 03:26 PM, Luiz Capitulino wrote:
> On Thu, 21 Oct 2010 10:18:34 +0200
> Paolo Bonzini pbonz...@redhat.com wrote:
>
>> This is a small start from the autoconfy configure series. Actually,
>> this part is almost completely new. It cleans up the tests Makefile
>> so that the i386-linux-user testsuite's Makefile targets are more
>> easily extensible and can work wherever a compiler for i386 is
>> installed. In the future this could be extended to ARM and MIPS
>> targets.
>
> This is unrelated to this series, but something that would be very good
> to have would be to move all check- programs to tests/ (or check/) and
> make them part of 'make test' or introduce a new 'make check'.

Right, I thought about that too. The main hurdle is that:

1) right now almost every test in make test is failing;

2) the testsuite run by make test is very noisy, does not create logs,
etc. If we could use autotest... :)

We could also have check-i386, check-qmp, etc. and check would simply
call all of them.

Paolo
[Qemu-devel] [PULL 00/11] Block patches
The following changes since commit d03703c81a202cea156811e5dbc8e88627c19986:

  curses: Fix contro...@[\]^_} and ESC (2010-10-21 18:31:28 +0200)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git for-anthony

Christoph Hellwig (1):
      ide: set WCACHE supported in IDENTIFY data

Kevin Wolf (7):
      qcow2: Simplify image creation
      qcow2: Remove old image creation function
      qemu-io: New command map
      qemu-img: Fix qemu-img convert -obacking_file
      ide: Factor ide_flush_cache out
      ide: Handle flush failure
      virtio-blk: Respect werror option for flushes

Stefan Hajnoczi (1):
      qcow2: Support exact L1 table growth

Stefan Weil (1):
      block: Use GCC_FMT_ATTR and fix a format error

edison (1):
      Copy snapshots out of QCOW2 disk

 block.c                |   16 +++
 block.h                |    2 +
 block/blkverify.c      |    5 +-
 block/qcow2-cluster.c  |   25 +++--
 block/qcow2-snapshot.c |   33 ++-
 block/qcow2.c          |  278
 block/qcow2.h          |    3 +-
 block_int.h            |    2 +
 hw/ide/core.c          |   28 --
 hw/ide/internal.h      |    3 +-
 hw/virtio-blk.c        |    8 ++-
 qemu-img-cmds.hx       |    4 +-
 qemu-img.c             |   26 +-
 qemu-img.texi          |    4 +-
 qemu-io.c              |   39 +++
 15 files changed, 264 insertions(+), 212 deletions(-)
[Qemu-devel] [PATCH 02/11] qcow2: Simplify image creation
Instead of doing lots of magic for setting up initial refcount blocks and
stuff create a minimal (inconsistent) image, open it and initialize the rest
with regular qcow2 functions.

This is a complete rewrite of the image creation function. The old
implementation is #ifdef'd out and will be removed by the next patch (removing
it here would have made the diff unreadable because diff tries to find
similarities when it's really a rewrite)

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |  133 -
 1 files changed, 132 insertions(+), 1 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 345d2e4..3b2e38c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -27,6 +27,7 @@
 #include <zlib.h>
 #include "aes.h"
 #include "block/qcow2.h"
+#include "qemu-error.h"
 
 /*
   Differences with QCOW:
@@ -794,6 +795,7 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
     return qcow2_update_ext_header(bs, backing_file, backing_fmt);
 }
 
+#if 0
 static int get_bits_from_size(size_t size)
 {
     int res = 0;
@@ -814,6 +816,7 @@ static int get_bits_from_size(size_t size)
 
     return res;
 }
+#endif
 
 
 static int preallocate(BlockDriverState *bs)
@@ -869,6 +872,7 @@ static int preallocate(BlockDriverState *bs)
     return 0;
 }
 
+#if 0
 static int qcow_create2(const char *filename, int64_t total_size,
                         const char *backing_file, const char *backing_format,
                         int flags, size_t cluster_size, int prealloc)
@@ -1066,6 +1070,133 @@ exit:
 
     return ret;
 }
+#else
+static int qcow_create2(const char *filename, int64_t total_size,
+                        const char *backing_file, const char *backing_format,
+                        int flags, size_t cluster_size, int prealloc,
+                        QEMUOptionParameter *options)
+{
+    /* Calulate cluster_bits */
+    int cluster_bits;
+    cluster_bits = ffs(cluster_size) - 1;
+    if (cluster_bits < MIN_CLUSTER_BITS || cluster_bits > MAX_CLUSTER_BITS ||
+        (1 << cluster_bits) != cluster_size)
+    {
+        error_report(
+            "Cluster size must be a power of two between %d and %dk\n",
+            1 << MIN_CLUSTER_BITS, 1 << (MAX_CLUSTER_BITS - 10));
+        return -EINVAL;
+    }
+
+    /*
+     * Open the image file and write a minimal qcow2 header.
+     *
+     * We keep things simple and start with a zero-sized image. We also
+     * do without refcount blocks or a L1 table for now. We'll fix the
+     * inconsistency later.
+     *
+     * We do need a refcount table because growing the refcount table means
+     * allocating two new refcount blocks - the seconds of which would be at
+     * 2 GB for 64k clusters, and we don't want to have a 2 GB initial file
+     * size for any qcow2 image.
+     */
+    BlockDriverState* bs;
+    QCowHeader header;
+    uint8_t* refcount_table;
+    int ret;
+
+    ret = bdrv_create_file(filename, options);
+    if (ret < 0) {
+        return ret;
+    }
+
+    ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR);
+    if (ret < 0) {
+        return ret;
+    }
+
+    /* Write the header */
+    memset(&header, 0, sizeof(header));
+    header.magic = cpu_to_be32(QCOW_MAGIC);
+    header.version = cpu_to_be32(QCOW_VERSION);
+    header.cluster_bits = cpu_to_be32(cluster_bits);
+    header.size = cpu_to_be64(0);
+    header.l1_table_offset = cpu_to_be64(0);
+    header.l1_size = cpu_to_be32(0);
+    header.refcount_table_offset = cpu_to_be64(cluster_size);
+    header.refcount_table_clusters = cpu_to_be32(1);
+
+    if (flags & BLOCK_FLAG_ENCRYPT) {
+        header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES);
+    } else {
+        header.crypt_method = cpu_to_be32(QCOW_CRYPT_NONE);
+    }
+
+    ret = bdrv_pwrite(bs, 0, &header, sizeof(header));
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* Write an empty refcount table */
+    refcount_table = qemu_mallocz(cluster_size);
+    ret = bdrv_pwrite(bs, cluster_size, refcount_table, cluster_size);
+    qemu_free(refcount_table);
+
+    if (ret < 0) {
+        goto out;
+    }
+
+    bdrv_close(bs);
+
+    /*
+     * And now open the image and make it consistent first (i.e. increase the
+     * refcount of the cluster that is occupied by the header and the refcount
+     * table)
+     */
+    BlockDriver* drv = bdrv_find_format("qcow2");
+    assert(drv != NULL);
+    ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv);
+    if (ret < 0) {
+        goto out;
+    }
+
+    ret = qcow2_alloc_clusters(bs, 2 * cluster_size);
+    if (ret < 0) {
+        goto out;
+
+    } else if (ret != 0) {
+        error_report("Huh, first cluster in empty image is already in use?");
+        abort();
+    }
+
+    /* Okay, now that we have a valid image, let's give it the right size */
+    ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
+    if (ret < 0) {
+        goto out;
+    }
+
+    /* Want a backing file? There you go.*/
+    if (backing_file) {
+
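The cluster-size check at the top of the new qcow_create2() can be tried in isolation. The following is a standalone sketch, not the QEMU function: the name demo_cluster_bits and the two limit macros are made up for illustration, and the limit values 9 and 21 are assumptions because MIN_CLUSTER_BITS and MAX_CLUSTER_BITS are defined outside this excerpt. It shows why the patch combines ffs() with a shift-back comparison: ffs() alone only finds the lowest set bit, so a size such as 1536 would otherwise pass as 512.

```c
#include <stddef.h>
#include <strings.h>   /* ffs() */

/* Assumed limits for this sketch only; the real values live in qcow2.h. */
#define DEMO_MIN_CLUSTER_BITS 9
#define DEMO_MAX_CLUSTER_BITS 21

/* Returns log2(cluster_size) if it is a power of two within the assumed
 * limits, or -1 otherwise. */
int demo_cluster_bits(size_t cluster_size)
{
    int bits;

    /* Reject out-of-range values before narrowing to int for ffs(). */
    if (cluster_size == 0 ||
        cluster_size > ((size_t)1 << DEMO_MAX_CLUSTER_BITS)) {
        return -1;
    }

    /* ffs() returns the 1-based index of the lowest set bit. */
    bits = ffs((int)cluster_size) - 1;

    /* Shifting back must reproduce the input, otherwise more than one
     * bit was set and the size is not a power of two. */
    if (bits < DEMO_MIN_CLUSTER_BITS || bits > DEMO_MAX_CLUSTER_BITS ||
        ((size_t)1 << bits) != cluster_size) {
        return -1;
    }
    return bits;
}
```

With these assumed limits a 64k cluster size yields 16, while 1536 and 256 are both rejected.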
[Qemu-devel] [PATCH 03/11] qcow2: Remove old image creation function
They have been #ifdef'd out by the previous patch. Signed-off-by: Kevin Wolf kw...@redhat.com --- block/qcow2.c | 224 - 1 files changed, 0 insertions(+), 224 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 3b2e38c..45ed073 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -795,30 +795,6 @@ static int qcow2_change_backing_file(BlockDriverState *bs, return qcow2_update_ext_header(bs, backing_file, backing_fmt); } -#if 0 -static int get_bits_from_size(size_t size) -{ -int res = 0; - -if (size == 0) { -return -1; -} - -while (size != 1) { -/* Not a power of two */ -if (size 1) { -return -1; -} - -size = 1; -res++; -} - -return res; -} -#endif - - static int preallocate(BlockDriverState *bs) { uint64_t nb_sectors; @@ -872,205 +848,6 @@ static int preallocate(BlockDriverState *bs) return 0; } -#if 0 -static int qcow_create2(const char *filename, int64_t total_size, -const char *backing_file, const char *backing_format, -int flags, size_t cluster_size, int prealloc) -{ - -int fd, header_size, backing_filename_len, l1_size, i, shift, l2_bits; -int ref_clusters, reftable_clusters, backing_format_len = 0; -int rounded_ext_bf_len = 0; -QCowHeader header; -uint64_t tmp, offset; -uint64_t old_ref_clusters; -QCowCreateState s1, *s = s1; -QCowExtension ext_bf = {0, 0}; -int ret; - -memset(s, 0, sizeof(*s)); - -fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644); -if (fd 0) -return -errno; -memset(header, 0, sizeof(header)); -header.magic = cpu_to_be32(QCOW_MAGIC); -header.version = cpu_to_be32(QCOW_VERSION); -header.size = cpu_to_be64(total_size * 512); -header_size = sizeof(header); -backing_filename_len = 0; -if (backing_file) { -if (backing_format) { -ext_bf.magic = QCOW_EXT_MAGIC_BACKING_FORMAT; -backing_format_len = strlen(backing_format); -ext_bf.len = backing_format_len; -rounded_ext_bf_len = (sizeof(ext_bf) + ext_bf.len + 7) ~7; -header_size += rounded_ext_bf_len; -} -header.backing_file_offset = cpu_to_be64(header_size); -backing_filename_len 
= strlen(backing_file); -header.backing_file_size = cpu_to_be32(backing_filename_len); -header_size += backing_filename_len; -} - -/* Cluster size */ -s-cluster_bits = get_bits_from_size(cluster_size); -if (s-cluster_bits MIN_CLUSTER_BITS || -s-cluster_bits MAX_CLUSTER_BITS) -{ -fprintf(stderr, Cluster size must be a power of two between -%d and %dk\n, -1 MIN_CLUSTER_BITS, -1 (MAX_CLUSTER_BITS - 10)); -return -EINVAL; -} -s-cluster_size = 1 s-cluster_bits; - -header.cluster_bits = cpu_to_be32(s-cluster_bits); -header_size = (header_size + 7) ~7; -if (flags BLOCK_FLAG_ENCRYPT) { -header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES); -} else { -header.crypt_method = cpu_to_be32(QCOW_CRYPT_NONE); -} -l2_bits = s-cluster_bits - 3; -shift = s-cluster_bits + l2_bits; -l1_size = (((total_size * 512) + (1LL shift) - 1) shift); -offset = align_offset(header_size, s-cluster_size); -s-l1_table_offset = offset; -header.l1_table_offset = cpu_to_be64(s-l1_table_offset); -header.l1_size = cpu_to_be32(l1_size); -offset += align_offset(l1_size * sizeof(uint64_t), s-cluster_size); - -/* count how many refcount blocks needed */ - -#define NUM_CLUSTERS(bytes) \ -(((bytes) + (s-cluster_size) - 1) / (s-cluster_size)) - -ref_clusters = NUM_CLUSTERS(NUM_CLUSTERS(offset) * sizeof(uint16_t)); - -do { -uint64_t image_clusters; -old_ref_clusters = ref_clusters; - -/* Number of clusters used for the refcount table */ -reftable_clusters = NUM_CLUSTERS(ref_clusters * sizeof(uint64_t)); - -/* Number of clusters that the whole image will have */ -image_clusters = NUM_CLUSTERS(offset) + ref_clusters -+ reftable_clusters; - -/* Number of refcount blocks needed for the image */ -ref_clusters = NUM_CLUSTERS(image_clusters * sizeof(uint16_t)); - -} while (ref_clusters != old_ref_clusters); - -s-refcount_table = qemu_mallocz(reftable_clusters * s-cluster_size); - -s-refcount_table_offset = offset; -header.refcount_table_offset = cpu_to_be64(offset); -header.refcount_table_clusters = 
cpu_to_be32(reftable_clusters); -offset += (reftable_clusters * s-cluster_size); -s-refcount_block_offset = offset; - -for (i=0; i ref_clusters; i++) { -s-refcount_table[i] = cpu_to_be64(offset); -offset += s-cluster_size; -} - -s-refcount_block = qemu_mallocz(ref_clusters * s-cluster_size); - -
[Qemu-devel] [PATCH 04/11] ide: set WCACHE supported in IDENTIFY data
From: Christoph Hellwig h...@lst.de

ATA does not only have the WCACHE enabled bit in identify word 85, but also
a WCACHE supported bit in word 82. While the Linux kernel is fine with the
latter at least hdparm also needs the former before correctly displaying
the cache settings. There's also a non-zero chance other operating systems
are more picky in their volatile write cache detection.

Signed-off-by: Christoph Hellwig h...@lst.de
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/core.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 06b6e14..5ccb09c 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -146,8 +146,8 @@ static void ide_identify(IDEState *s)
     put_le16(p + 68, 120);
     put_le16(p + 80, 0xf0); /* ata3 -> ata6 supported */
     put_le16(p + 81, 0x16); /* conforms to ata5 */
-    /* 14=NOP supported, 0=SMART supported */
-    put_le16(p + 82, (1 << 14) | 1);
+    /* 14=NOP supported, 5=WCACHE supported, 0=SMART supported */
+    put_le16(p + 82, (1 << 14) | (1 << 5) | 1);
     /* 13=flush_cache_ext,12=flush_cache,10=lba48 */
     put_le16(p + 83, (1 << 14) | (1 << 13) | (1 <<12) | (1 << 10));
     /* 14=set to 1, 1=SMART self test, 0=SMART error logging */
-- 
1.7.2.3
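To make the bit arithmetic in this patch concrete, here is a small standalone sketch of how IDENTIFY word 82 is composed and how a guest-side tool might test the WCACHE supported bit. It is not QEMU code: the demo_ names and the byte-buffer layout helper are assumptions for illustration, and only the three bit positions (14, 5, 0) come from the patch.

```c
#include <stdint.h>

/* Bit positions in IDENTIFY word 82, as named in the patch comment. */
enum {
    DEMO_W82_SMART  = 1 << 0,   /* SMART supported */
    DEMO_W82_WCACHE = 1 << 5,   /* volatile write cache supported */
    DEMO_W82_NOP    = 1 << 14   /* NOP command supported */
};

/* Value the patched code stores in word 82. */
uint16_t demo_identify_word82(void)
{
    return DEMO_W82_NOP | DEMO_W82_WCACHE | DEMO_W82_SMART;
}

/* Store a 16-bit word little-endian into a 512-byte IDENTIFY buffer. */
void demo_put_word(uint8_t *identify, int word, uint16_t value)
{
    identify[word * 2]     = (uint8_t)(value & 0xff);
    identify[word * 2 + 1] = (uint8_t)(value >> 8);
}

/* What a picky guest might check before trusting the enabled bit in word 85. */
int demo_wcache_supported(const uint8_t *identify)
{
    uint16_t w82 = (uint16_t)(identify[82 * 2] | (identify[82 * 2 + 1] << 8));
    return (w82 & DEMO_W82_WCACHE) != 0;
}
```

The combined value is 0x4021; before the patch it would have been 0x4001, which is why a tool that tests bit 5 reported no write cache.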
[Qemu-devel] [PATCH 01/11] qcow2: Support exact L1 table growth
From: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

The L1 table grow operation includes a size calculation that bumps up
the new L1 table size in order to anticipate the size needs of vmstate
data. This helps reduce the number of times that the L1 table has to be
grown when vmstate data is appended.

This size overhead is not necessary during image creation,
bdrv_truncate(), or snapshot goto operations. In fact, existing
qemu-iotests that exercise table growth are no longer able to trigger it
because image creation preallocates an L1 table that is too large after
changes to qcow_create2().

This patch keeps the size calculation but also adds exact growth for
callers that do not want to inflate the L1 table size unnecessarily.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-cluster.c  |   25 -
 block/qcow2-snapshot.c |    2 +-
 block/qcow2.c          |    2 +-
 block/qcow2.h          |    2 +-
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index fb4224a..4f7dc59 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -28,7 +28,7 @@
 #include "block_int.h"
 #include "block/qcow2.h"
 
-int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
+int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
 {
     BDRVQcowState *s = bs->opaque;
     int new_l1_size, new_l1_size2, ret, i;
@@ -36,15 +36,22 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
     int64_t new_l1_table_offset;
     uint8_t data[12];
 
-    new_l1_size = s->l1_size;
-    if (min_size <= new_l1_size)
+    if (min_size <= s->l1_size)
         return 0;
-    if (new_l1_size == 0) {
-        new_l1_size = 1;
-    }
-    while (min_size > new_l1_size) {
-        new_l1_size = (new_l1_size * 3 + 1) / 2;
+
+    if (exact_size) {
+        new_l1_size = min_size;
+    } else {
+        /* Bump size up to reduce the number of times we have to grow */
+        new_l1_size = s->l1_size;
+        if (new_l1_size == 0) {
+            new_l1_size = 1;
+        }
+        while (min_size > new_l1_size) {
+            new_l1_size = (new_l1_size * 3 + 1) / 2;
+        }
     }
+
 #ifdef DEBUG_ALLOC2
     printf("grow l1_table from %d to %d\n", s->l1_size, new_l1_size);
 #endif
@@ -550,7 +557,7 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
 
     l1_index = offset >> (s->l2_bits + s->cluster_bits);
     if (l1_index >= s->l1_size) {
-        ret = qcow2_grow_l1_table(bs, l1_index + 1);
+        ret = qcow2_grow_l1_table(bs, l1_index + 1, false);
         if (ret < 0) {
             return ret;
         }
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index bbfcaaa..8dd5df0 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -327,7 +327,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1) < 0)
         goto fail;
 
-    if (qcow2_grow_l1_table(bs, sn->l1_size) < 0)
+    if (qcow2_grow_l1_table(bs, sn->l1_size, true) < 0)
         goto fail;
 
     s->l1_size = sn->l1_size;
diff --git a/block/qcow2.c b/block/qcow2.c
index ee3481b..345d2e4 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1154,7 +1154,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
     }
 
     new_l1_size = size_to_l1(s, offset);
-    ret = qcow2_grow_l1_table(bs, new_l1_size);
+    ret = qcow2_grow_l1_table(bs, new_l1_size, true);
     if (ret < 0) {
         return ret;
     }
diff --git a/block/qcow2.h b/block/qcow2.h
index 356a34a..add710b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -188,7 +188,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res);
 
 /* qcow2-cluster.c functions */
-int qcow2_grow_l1_table(BlockDriverState *bs, int min_size);
+int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
 void qcow2_l2_cache_reset(BlockDriverState *bs);
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
 void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
-- 
1.7.2.3
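The difference between the two growth modes is easiest to see with numbers. The sketch below isolates just the size calculation from the patch in a pure function. The name demo_new_l1_size is made up, and unlike qcow2_grow_l1_table() (which returns 0 early when no growth is needed) it returns the resulting table size so the two modes can be compared directly.

```c
#include <stdbool.h>

/* Size the L1 table would have after a grow request.
 *   cur_size  current number of L1 entries
 *   min_size  number of entries the caller needs
 *   exact     true: grow to exactly min_size
 *             false: overshoot by repeated ~1.5x steps, as the patch does
 *                    to leave headroom for appended vmstate data */
int demo_new_l1_size(int cur_size, int min_size, bool exact)
{
    int new_size;

    if (min_size <= cur_size) {
        return cur_size;            /* already large enough */
    }
    if (exact) {
        return min_size;
    }

    new_size = cur_size ? cur_size : 1;
    while (min_size > new_size) {
        new_size = (new_size * 3 + 1) / 2;
    }
    return new_size;
}
```

Starting from an empty table, the bumped sequence is 1, 2, 3, 5, 8, 12, 18, 27, 41, 62, 93, 140, so a request for 100 entries allocates 140 in the default mode and exactly 100 with exact set.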
[Qemu-devel] [PATCH 05/11] Copy snapshots out of QCOW2 disk
From: edison edi...@cloud.com

In order to back up snapshots created from a QCOW2 image, we want to copy
snapshots out of the QCOW2 disk to separate storage.

The following patch adds a new option in qemu-img:

  qemu-img convert -f qcow2 -O qcow2 -s snapshot_name src_img bck_img

Right now, it only supports copying the full snapshot; delta snapshot
support is on the way.

Changes from V1: all the comments from Kevin are addressed:
  Add read-only checking
  Fix coding style
  Change the name from bdrv_snapshot_load to bdrv_snapshot_load_tmp

Signed-off-by: Disheng Su edi...@cloud.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c                |   16
 block.h                |    2 ++
 block/qcow2-snapshot.c |   31 +++
 block/qcow2.c          |    1 +
 block/qcow2.h          |    1 +
 block_int.h            |    2 ++
 qemu-img-cmds.hx       |    4 ++--
 qemu-img.c             |   19 ++-
 qemu-img.texi          |    4 ++--
 9 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index a19374d..985d0b7 100644
--- a/block.c
+++ b/block.c
@@ -1899,6 +1899,22 @@ int bdrv_snapshot_list(BlockDriverState *bs,
     return -ENOTSUP;
 }
 
+int bdrv_snapshot_load_tmp(BlockDriverState *bs,
+        const char *snapshot_name)
+{
+    BlockDriver *drv = bs->drv;
+    if (!drv) {
+        return -ENOMEDIUM;
+    }
+    if (!bs->read_only) {
+        return -EINVAL;
+    }
+    if (drv->bdrv_snapshot_load_tmp) {
+        return drv->bdrv_snapshot_load_tmp(bs, snapshot_name);
+    }
+    return -ENOTSUP;
+}
+
 #define NB_SUFFIXES 4
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size)
diff --git a/block.h b/block.h
index 5f64380..a4facf2 100644
--- a/block.h
+++ b/block.h
@@ -211,6 +211,8 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
 int bdrv_snapshot_list(BlockDriverState *bs,
                        QEMUSnapshotInfo **psn_info);
+int bdrv_snapshot_load_tmp(BlockDriverState *bs,
+                           const char *snapshot_name);
 char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size);
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 8dd5df0..aacf357 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -418,3 +418,34 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
     return s->nb_snapshots;
 }
 
+int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name)
+{
+    int i, snapshot_index, l1_size2;
+    BDRVQcowState *s = bs->opaque;
+    QCowSnapshot *sn;
+
+    snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_name);
+    if (snapshot_index < 0) {
+        return -ENOENT;
+    }
+
+    sn = &s->snapshots[snapshot_index];
+    s->l1_size = sn->l1_size;
+    l1_size2 = s->l1_size * sizeof(uint64_t);
+    if (s->l1_table != NULL) {
+        qemu_free(s->l1_table);
+    }
+
+    s->l1_table_offset = sn->l1_table_offset;
+    s->l1_table = qemu_mallocz(align_offset(l1_size2, 512));
+
+    if (bdrv_pread(bs->file, sn->l1_table_offset,
+                   s->l1_table, l1_size2) != l1_size2) {
+        return -1;
+    }
+
+    for(i = 0;i < s->l1_size; i++) {
+        be64_to_cpus(&s->l1_table[i]);
+    }
+    return 0;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 45ed073..b816d87 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1286,6 +1286,7 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_snapshot_goto = qcow2_snapshot_goto,
     .bdrv_snapshot_delete = qcow2_snapshot_delete,
     .bdrv_snapshot_list = qcow2_snapshot_list,
+    .bdrv_snapshot_load_tmp = qcow2_snapshot_load_tmp,
     .bdrv_get_info = qcow_get_info,
 
     .bdrv_save_vmstate = qcow_save_vmstate,
diff --git a/block/qcow2.h b/block/qcow2.h
index add710b..2d22e5e 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -211,6 +211,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info);
 int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id);
 int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
 int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab);
+int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
 
 void qcow2_free_snapshots(BlockDriverState *bs);
 int qcow2_read_snapshots(BlockDriverState *bs);
diff --git a/block_int.h b/block_int.h
index e8e7156..87e60b8 100644
--- a/block_int.h
+++ b/block_int.h
@@ -93,6 +93,8 @@ struct BlockDriver {
     int (*bdrv_snapshot_delete)(BlockDriverState *bs, const char *snapshot_id);
     int (*bdrv_snapshot_list)(BlockDriverState *bs,
                               QEMUSnapshotInfo **psn_info);
+    int (*bdrv_snapshot_load_tmp)(BlockDriverState *bs,
+                                  const char *snapshot_name);
     int (*bdrv_get_info)(BlockDriverState *bs,
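The generic wrapper added to block.c follows a common pattern: validate the state, then dispatch through an optional driver callback. The sketch below reproduces that pattern with stand-in types so it can be compiled on its own. Everything prefixed demo_ is hypothetical and much simpler than the real BlockDriverState and BlockDriver; only the order of checks and the error codes are taken from the patch. The ENOMEDIUM fallback is an assumption added for platforms whose errno.h lacks that constant.

```c
#include <errno.h>
#include <stddef.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM ENODEV   /* assumption: fallback where errno.h lacks it */
#endif

struct demo_bs;

/* Stand-in for BlockDriver: the callback is optional and may be NULL. */
struct demo_driver {
    int (*snapshot_load_tmp)(struct demo_bs *bs, const char *name);
};

/* Stand-in for BlockDriverState. */
struct demo_bs {
    const struct demo_driver *drv;
    int read_only;
    const char *loaded;    /* name of the snapshot currently mapped in */
};

/* Same order of checks as bdrv_snapshot_load_tmp() in the patch. */
int demo_snapshot_load_tmp(struct demo_bs *bs, const char *name)
{
    if (!bs->drv) {
        return -ENOMEDIUM;          /* no image opened */
    }
    if (!bs->read_only) {
        return -EINVAL;             /* swapping the L1 table is only safe read-only */
    }
    if (bs->drv->snapshot_load_tmp) {
        return bs->drv->snapshot_load_tmp(bs, name);
    }
    return -ENOTSUP;                /* format has no such operation */
}

/* A driver that implements the callback, and one that does not. */
static int demo_qcow2_load_tmp(struct demo_bs *bs, const char *name)
{
    bs->loaded = name;
    return 0;
}

const struct demo_driver demo_qcow2 = { demo_qcow2_load_tmp };
const struct demo_driver demo_raw = { NULL };
```

The read-only check matters because the qcow2 implementation replaces the in-memory L1 table without touching the image header, so a writable image would end up writing through a table that does not match what is on disk.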
Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)
On 10/22/2010 08:17 AM, Luiz Capitulino wrote: You and Gerd asked about the purpose of this command, turns out that it's only useful for developing new USB devices for QEMU, so I thought it would be better to restrict it, so that people don't start using this the wrong way or worse, we can't drop/break it because some tool is now using it. Need HMP commands have no support associated with them. Given QMP, there's no reason to script HMP commands. But a long standing policy in QEMU has been to avoid conditional code because it results in dead code. I think this policy has actually worked well for us historically. Regards, Anthony Liguori Anyway, maybe a good doc will do, this was just a small suggestion. Objections, Anthony? Not with better docs. Regards, Anthony Liguori This is just adding a HMP command. Is that the right approach or was that an unintentional consequence of rebasing post-HMP/QMP split? I don't think this should be available under QMP, it's more a debugging command for USB developers. Regards, Anthony Liguori cheers, Gerd
[Qemu-devel] [PATCH 06/11] qemu-io: New command map
The new map command in qemu-io lists all allocated/unallocated areas in an
image file.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-io.c |   39 +++
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index b4e5cc8..ff353eb 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1443,6 +1443,44 @@ static const cmdinfo_t alloc_cmd = {
 };
 
 static int
+map_f(int argc, char **argv)
+{
+    int64_t offset;
+    int64_t nb_sectors;
+    char s1[64];
+    int num, num_checked;
+    int ret;
+    const char *retstr;
+
+    offset = 0;
+    nb_sectors = bs->total_sectors;
+
+    do {
+        num_checked = MIN(nb_sectors, INT_MAX);
+        ret = bdrv_is_allocated(bs, offset, num_checked, &num);
+        retstr = ret ? "    allocated" : "not allocated";
+        cvtstr(offset << 9ULL, s1, sizeof(s1));
+        printf("[% 24" PRId64 "] % 8d/% 8d sectors %s at offset %s (%d)\n",
+               offset << 9ULL, num, num_checked, retstr, s1, ret);
+
+        offset += num;
+        nb_sectors -= num;
+    } while(offset < bs->total_sectors);
+
+    return 0;
+}
+
+static const cmdinfo_t map_cmd = {
+    .name       = "map",
+    .argmin     = 0,
+    .argmax     = 0,
+    .cfunc      = map_f,
+    .args       = "",
+    .oneline    = "prints the allocated areas of a file",
+};
+
+
+static int
 close_f(int argc, char **argv)
 {
     bdrv_close(bs);
@@ -1680,6 +1718,7 @@ int main(int argc, char **argv)
     add_command(&length_cmd);
     add_command(&info_cmd);
     add_command(&alloc_cmd);
+    add_command(&map_cmd);
     add_args_command(init_args_command);
     add_check_command(init_check_command);
 
-- 
1.7.2.3
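The loop in map_f() walks the image one run at a time: each bdrv_is_allocated() call reports whether the sector at the current position is allocated and how many following sectors share that state. The standalone sketch below mimics that walk against a made-up allocation layout. demo_is_allocated() is a hypothetical stand-in for the real block-layer call, and the 512-sector layout is invented for the example. The no-progress guard is an addition of this sketch and is not in the patch.

```c
#include <inttypes.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Invented layout: [0,128) allocated, [128,384) not, [384,512) allocated.
 * Run i spans [demo_bounds[i], demo_bounds[i+1]); even runs are allocated. */
static const int64_t demo_bounds[] = { 0, 128, 384, 512 };

/* Hypothetical stand-in for bdrv_is_allocated(): returns the allocation
 * state at 'sector' and stores in *pnum how many sectors, up to
 * nb_sectors, share that state. */
int demo_is_allocated(int64_t sector, int nb_sectors, int *pnum)
{
    size_t i;

    for (i = 0; i + 1 < sizeof(demo_bounds) / sizeof(demo_bounds[0]); i++) {
        if (sector >= demo_bounds[i] && sector < demo_bounds[i + 1]) {
            int64_t run = demo_bounds[i + 1] - sector;
            *pnum = run < nb_sectors ? (int)run : nb_sectors;
            return (i % 2) == 0;
        }
    }
    *pnum = 0;
    return 0;
}

/* Prints one line per run and returns the number of runs found. */
int demo_map(int64_t total_sectors, FILE *out)
{
    int64_t sector = 0;
    int64_t remaining = total_sectors;
    int runs = 0;

    while (sector < total_sectors) {
        int num = 0;
        int chunk = remaining > INT_MAX ? INT_MAX : (int)remaining;
        int ret = demo_is_allocated(sector, chunk, &num);

        if (num <= 0) {
            break;      /* guard against a backend that reports no progress */
        }
        fprintf(out, "[%" PRId64 "] %d/%d sectors %s (%d)\n",
                sector * 512, num, chunk,
                ret ? "allocated" : "not allocated", ret);

        sector += num;
        remaining -= num;
        runs++;
    }
    return runs;
}
```

Calling demo_map(512, stdout) prints three lines, one per run, with the byte offset of each run in brackets.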
[Qemu-devel] [PATCH 10/11] ide: Handle flush failure
Instead of always assuming success for bdrv_aio_flush, actually do something
with the error. This respects the werror option and accordingly ignores the
error, reports it to the guest or stops the VM and retries after cont.

Ignoring the error is trivial, obviously. For stopping the VM and retrying
later old code can be reused, but we need to introduce a new status for
retrying a flush.

For reporting to the guest, fortunately the same action is required as for a
failed read/write (status = DRDY | ERR, error = ABRT), so this code can be
reused as well.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/core.c     |   10 +-
 hw/ide/internal.h |    3 ++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 6d8606e..bc3e916 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -65,6 +65,7 @@ static void ide_dma_start(IDEState *s, BlockDriverCompletionFunc *dma_cb);
 static void ide_dma_restart(IDEState *s, int is_read);
 static void ide_atapi_cmd_read_dma_cb(void *opaque, int ret);
 static int ide_handle_rw_error(IDEState *s, int error, int op);
+static void ide_flush_cache(IDEState *s);
 
 static void padstr(char *str, const char *src, int len)
 {
@@ -688,6 +689,8 @@ static void ide_dma_restart_bh(void *opaque)
         } else {
             ide_sector_write(bmdma_active_if(bm));
         }
+    } else if (bm->status & BM_STATUS_RETRY_FLUSH) {
+        ide_flush_cache(bmdma_active_if(bm));
     }
 }
 
@@ -795,7 +798,12 @@ static void ide_flush_cb(void *opaque, int ret)
 {
     IDEState *s = opaque;
 
-    /* XXX: how do we signal I/O errors here? */
+    if (ret < 0) {
+        /* XXX: What sector number to set here? */
+        if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
+            return;
+        }
+    }
 
     s->status = READY_STAT | SEEK_STAT;
     ide_set_irq(s->bus);
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 4165543..d652e06 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -472,7 +472,8 @@ struct IDEDeviceInfo {
 #define BM_STATUS_INT 0x04
 #define BM_STATUS_DMA_RETRY 0x08
 #define BM_STATUS_PIO_RETRY 0x10
-#define BM_STATUS_RETRY_READ 0x20
+#define BM_STATUS_RETRY_READ 0x20
+#define BM_STATUS_RETRY_FLUSH 0x40
 
 #define BM_CMD_START 0x01
 #define BM_CMD_READ 0x08
-- 
1.7.2.3
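The three outcomes described in the commit message (ignore, report to the guest, stop and retry after cont) can be summarised as a small decision function. This is a minimal sketch under assumptions: the enum names are hypothetical, and QEMU's real werror handling has more policies than the three shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy names; QEMU's own enum and option parsing differ. */
typedef enum {
    ON_ERROR_IGNORE,
    ON_ERROR_REPORT,
    ON_ERROR_STOP
} ErrorPolicy;

/* Hypothetical result codes for the flush completion path. */
typedef enum {
    FLUSH_COMPLETE_OK,     /* finish the command as if it had succeeded */
    FLUSH_COMPLETE_ERROR,  /* finish with an error status for the guest */
    FLUSH_RETRY_LATER      /* VM is stopped; reissue the flush on "cont" */
} FlushOutcome;

/* Maps a flush return value (negative errno on failure) to an outcome. */
static FlushOutcome handle_flush_result(ErrorPolicy policy, int ret)
{
    if (ret >= 0) {
        return FLUSH_COMPLETE_OK;
    }

    switch (policy) {
    case ON_ERROR_IGNORE:
        return FLUSH_COMPLETE_OK;
    case ON_ERROR_REPORT:
        return FLUSH_COMPLETE_ERROR;
    case ON_ERROR_STOP:
        return FLUSH_RETRY_LATER;
    }
    return FLUSH_COMPLETE_ERROR;  /* unknown policy: be conservative */
}
```

In the patch, the retry case is what BM_STATUS_RETRY_FLUSH records, so that the restart handler knows to call ide_flush_cache() again.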
[Qemu-devel] [PATCH 09/11] ide: Factor ide_flush_cache out
The next patch reuses this code, so put it in its own function.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/core.c |   14 ++
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 5ccb09c..6d8606e 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -801,6 +801,15 @@ static void ide_flush_cb(void *opaque, int ret)
     ide_set_irq(s->bus);
 }
 
+static void ide_flush_cache(IDEState *s)
+{
+    if (s->bs) {
+        bdrv_aio_flush(s->bs, ide_flush_cb, s);
+    } else {
+        ide_flush_cb(s, 0);
+    }
+}
+
 static inline void cpu_to_ube16(uint8_t *buf, int val)
 {
     buf[0] = val >> 8;
@@ -2031,10 +2040,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val)
         break;
     case WIN_FLUSH_CACHE:
     case WIN_FLUSH_CACHE_EXT:
-        if (s->bs)
-            bdrv_aio_flush(s->bs, ide_flush_cb, s);
-        else
-            ide_flush_cb(s, 0);
+        ide_flush_cache(s);
         break;
     case WIN_STANDBY:
     case WIN_STANDBY2:
-- 
1.7.2.3
[Qemu-devel] [PATCH 07/11] block: Use GCC_FMT_ATTR and fix a format error
From: Stefan Weil w...@mail.berlios.de

Adding the gcc format attribute detects a format bug which is fixed here.

v2: Don't use type cast. BDRV_SECTOR_SIZE is unsigned long long, so %lld
should be the correct format specifier.

Cc: Blue Swirl blauwir...@gmail.com
Cc: Kevin Wolf kw...@redhat.com
Signed-off-by: Stefan Weil w...@mail.berlios.de
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/blkverify.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blkverify.c b/block/blkverify.c
index 8083464..b2a12fe 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -53,7 +53,8 @@ static AIOPool blkverify_aio_pool = {
     .cancel = blkverify_aio_cancel,
 };
 
-static void blkverify_err(BlkverifyAIOCB *acb, const char *fmt, ...)
+static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
+                                             const char *fmt, ...)
 {
     va_list ap;
 
@@ -299,7 +300,7 @@ static void blkverify_verify_readv(BlkverifyAIOCB *acb)
 {
     ssize_t offset = blkverify_iovec_compare(acb->qiov, &acb->raw_qiov);
     if (offset != -1) {
-        blkverify_err(acb, "contents mismatch in sector %ld",
+        blkverify_err(acb, "contents mismatch in sector %lld",
                       acb->sector_num + (offset / BDRV_SECTOR_SIZE));
     }
 }
-- 
1.7.2.3
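As background on why the attribute found the bug: GCC's format(printf, N, M) attribute tells the compiler that argument N is a printf-style format string and that the variadic arguments start at position M, so mismatches such as %ld with a long long argument are reported at compile time. A minimal standalone sketch follows; FMT_ATTR and format_msg() are names made up for this example and are not QEMU's.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical local macro, similar in spirit to a project-wide wrapper. */
#if defined(__GNUC__)
#define FMT_ATTR(n, m) __attribute__((format(printf, n, m)))
#else
#define FMT_ATTR(n, m)
#endif

/*
 * Formats into buf. Argument 3 is the format string and the variadic
 * arguments start at position 4, hence FMT_ATTR(3, 4).
 */
static int FMT_ATTR(3, 4) format_msg(char *buf, size_t size,
                                     const char *fmt, ...)
{
    va_list ap;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return len;
}
```

With the attribute in place, a call such as format_msg(buf, sizeof(buf), "sector %ld", 42LL) draws a -Wformat warning when warnings are enabled, which is the class of bug the patch fixes in blkverify_err().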
Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
On 10/22/2010 08:35 AM, Kevin Wolf wrote: Am 22.10.2010 14:58, schrieb Anthony Liguori: On 10/22/2010 03:29 AM, Kevin Wolf wrote: I agree. Of course, as Laurent said a while ago, there is no specification for NBD, so it's hard to say what the intended semantics is. However, I did have a look at the nbdserver code and it looks as if it implements something similar to writethrough (namely fsync after each write) only if configured this way on the server side. qemu-nbd defaults to writethrough, but can be configured to use cache=none. So with either server qemu as a client can't tell whether the data is safe on disk or not. In my book this is a strong argument for refusing to open nbd connections with anything but cache=unsafe. On a physical system, if you don't have a battery backed disk and you enable the WC on your disk, then even with cache=writethrough we're unsafe. I don't think that's right. O_SYNC should guarantee that the volatile disk cache is flushed. If your filesystem does the right thing which an awful lot of them don't today. Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe. Yeah, if you do something equivalent to cache=unsafe on a lower layer, then qemu can't do much about it. Maybe you can apply the same argument to NBD, even though it's unsafe by default. QEMU can't guarantee safety. The underlying storage needs to be configured correctly. As long as we're not introducing caching within QEMU, I don't think we should assume we're unsafe. Do we have any place where we can add docs on a per-block format basis? It would be good to at least mention for each block device how the backing storage needed to be configured for safety. docs/block-protocols.txt? Maybe docs/block/name.txt? Would be a good home for the qed spec too. Regards, Anthony Liguori Kevin
[Qemu-devel] [PATCH 08/11] qemu-img: Fix qemu-img convert -obacking_file
The old -B option caused a backing file to be used for the converted image and
to avoid copying clusters from the old backing file. When replaced with
-obacking_file, qemu-img convert does assign the backing file to the new image,
but it doesn't realize that it should avoid copying clusters from the backing
file.

This patch checks the -o options for a backing_file and applies the same logic
as for -B in this case.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-img.c |    7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index d4a3b4e..2864cb8 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -645,6 +645,7 @@ static int img_convert(int argc, char **argv)
     const uint8_t *buf1;
     BlockDriverInfo bdi;
     QEMUOptionParameter *param = NULL, *create_options = NULL;
+    QEMUOptionParameter *out_baseimg_param;
     char *options = NULL;
     const char *snapshot_name = NULL;
 
@@ -769,6 +770,12 @@ static int img_convert(int argc, char **argv)
         goto out;
     }
 
+    /* Get backing file name if -o backing_file was used */
+    out_baseimg_param = get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
+    if (out_baseimg_param) {
+        out_baseimg = out_baseimg_param->value.s;
+    }
+
     /* Check if compression is supported */
     if (flags & BLOCK_FLAG_COMPRESS) {
         QEMUOptionParameter *encryption =
-- 
1.7.2.3
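The idea of the fix, treating a backing_file given via -o the same way as one given via -B, can be shown with a toy option list. Everything below is hypothetical scaffolding for illustration (Option, find_option() and effective_base_image() are not qemu-img functions), and unlike the patch the sketch keeps the -B value when the option is present but has no value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; a list ends with an entry whose name is NULL. */
typedef struct {
    const char *name;
    const char *value;
} Option;

/* Returns the entry called `name`, or NULL if the list has no such entry. */
static const Option *find_option(const Option *list, const char *name)
{
    for (; list->name != NULL; list++) {
        if (strcmp(list->name, name) == 0) {
            return list;
        }
    }
    return NULL;
}

/*
 * Picks the base image whose clusters should not be copied: the value of
 * "backing_file" from the -o list if one was given, else the -B argument.
 */
static const char *effective_base_image(const char *dash_b,
                                        const Option *opts)
{
    const Option *o = find_option(opts, "backing_file");

    if (o != NULL && o->value != NULL) {
        return o->value;
    }
    return dash_b;
}
```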
Re: [Qemu-devel] [PATCH 1/2] usb-ccid: add CCID bus
Alon Levy al...@redhat.com writes: A CCID device is a smart card reader. It is a USB device, defined at [1]. This patch introduces the usb-ccid device that is a ccid bus. Next patches will introduce two card types to use it, a passthru card and an emulated card. [1] http://www.usb.org/developers/devclass_docs/DWG_Smart-Card_CCID_Rev110. Signed-off-by: Alon Levy al...@redhat.com --- Makefile.objs |1 + configure | 12 + hw/ccid.h | 34 ++ hw/usb-ccid.c | 1349 + 4 files changed, 1396 insertions(+), 0 deletions(-) create mode 100644 hw/ccid.h create mode 100644 hw/usb-ccid.c [...] diff --git a/hw/usb-ccid.c b/hw/usb-ccid.c new file mode 100644 index 000..a7b4c3f --- /dev/null +++ b/hw/usb-ccid.c @@ -0,0 +1,1349 @@ [...] +struct CCIDBus { +BusState qbus; +USBCCIDState *ccid; Where does ccid point to? +}; [...] +static int ccid_initfn(USBDevice *dev) +{ +USBCCIDState *s = DO_UPCAST(USBCCIDState, dev, dev); + +s-bus = ccid_bus_new(dev-qdev); +s-card = NULL; +s-cardinfo = NULL; +s-bus-ccid = s; Looks like it points back to the device providing the bus. Why not use bus-qbus-parent ? +s-migration_state = MIGRATION_NONE; +dev-auto_attach = s-auto_attach; +debug = s-debug; Wait a sec! Each CCID device has its own property debug (defined below), but they all copy to the same static debug on initialization. In other words, the device initialized last wins. Ugh. +s-migration_target_ip = 0; +s-migration_target_port = 0; +s-dev.speed = USB_SPEED_FULL; +s-notify_slot_change = false; +s-powered = true; +s-pending_answers_num = 0; +s-last_answer_error = 0; +s-bulk_in_pending_start = 0; +s-bulk_in_pending_end = 0; +s-current_bulk_in = NULL; +ccid_reset_error_status(s); +s-bulk_out_pos = 0; +ccid_reset_parameters(s); +ccid_reset(s); +return 0; +} [...] 
+static struct USBDeviceInfo ccid_info = { +.product_desc = QEMU USB CCID, +.qdev.name = CCID_DEV_NAME, +.qdev.size = sizeof(USBCCIDState), +.qdev.vmsd = ccid_vmstate, +.init = ccid_initfn, +.handle_packet = usb_generic_handle_packet, +.handle_reset = ccid_handle_reset, +.handle_control = ccid_handle_control, +.handle_data= ccid_handle_data, +.handle_destroy = ccid_handle_destroy, +.usbdevice_name = ccid, +.qdev.props = (Property[]) { +DEFINE_PROP_UINT8(auto_attach, USBCCIDState, auto_attach, 0), +DEFINE_PROP_UINT8(debug, USBCCIDState, debug, 0), +DEFINE_PROP_END_OF_LIST(), +}, +}; + + +static void ccid_register_devices(void) +{ +usb_qdev_register(ccid_info); +} +device_init(ccid_register_devices)
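To make the review comment about the debug property concrete, here is a reduced sketch of the "last initialized device wins" effect. The struct and function names are invented for the example and do not come from hw/usb-ccid.c; only the pattern (a per-device property copied into one file-scope static) mirrors what the review points out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device with a per-device "debug" property. */
typedef struct {
    uint8_t debug_prop;
} FakeDevice;

/* Pattern under review: one file-scope variable shared by all devices. */
static int shared_debug;

static void fake_init_shared(const FakeDevice *dev)
{
    /* Every init overwrites the value set by the previous device. */
    shared_debug = dev->debug_prop;
}

/* Alternative: consult the device's own field wherever the level is needed. */
static int fake_debug_level(const FakeDevice *dev)
{
    return dev->debug_prop;
}
```

With two devices configured as debug=3 and debug=0 and initialized in that order, shared_debug ends up 0 for both, while fake_debug_level() still returns 3 for the first device.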
Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
On 10/22/2010 08:57 AM, Kevin Wolf wrote: Am 22.10.2010 15:45, schrieb Anthony Liguori: On a physical system, if you don't have a battery backed disk and you enable the WC on your disk, then even with cache=writethrough we're unsafe. I don't think that's right. O_SYNC should guarantee that the volatile disk cache is flushed. If your filesystem does the right thing which an awful lot of them don't today. The list of really relevant filesystems is rather short, though. Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe. Yeah, if you do something equivalent to cache=unsafe on a lower layer, then qemu can't do much about it. Maybe you can apply the same argument to NBD, even though it's unsafe by default. QEMU can't guarantee safety. The underlying storage needs to be configured correctly. As long as we're not introducing caching within QEMU, I don't think we should assume we're unsafe. Do we have any place where we can add docs on a per-block format basis? It would be good to at least mention for each block device how the backing storage needed to be configured for safety. docs/block-protocols.txt? Maybe docs/block/name.txt? Would be a good home for the qed spec too. I think spec and documentation for users should be kept separate. I thought that's the reason why docs/specs/ exists. And if you exclude specs, I'm not sure if there's a lot left to say for each format. Having ten files under docs/block/ which consist of two lines each would be ridiculous. If contrary to my expectations we actually do have content for it, docs/block/name.txt works for me as well. Okay, sounds reasonable. Regards, Anthony Liguori Kevin
[Qemu-devel] [PATCH 1/4] Trivial fix for QMP/qmp-events.txt
From: Hidetoshi Seto seto.hideto...@jp.fujitsu.com

Fix example of STOP event that was just copy-and-pasted.

Signed-off-by: Hidetoshi Seto seto.hideto...@jp.fujitsu.com
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 QMP/qmp-events.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
index 01ec85f..aa20210 100644
--- a/QMP/qmp-events.txt
+++ b/QMP/qmp-events.txt
@@ -89,7 +89,7 @@ Data: None.
 
 Example:
 
-{ "event": "SHUTDOWN",
+{ "event": "STOP",
     "timestamp": { "seconds": 1267041730, "microseconds": 281295 } }
 
 VNC_CONNECTED
-- 
1.7.3.1.127.g1bb28
Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/
On Fri, 22 Oct 2010 15:38:34 +0200 Paolo Bonzini pbonz...@redhat.com wrote: On 10/22/2010 03:26 PM, Luiz Capitulino wrote: On Thu, 21 Oct 2010 10:18:34 +0200 Paolo Bonzinipbonz...@redhat.com wrote: This is a small start from the autoconfy configure series. Actually, this part is almost completely new. It cleans up the tests Makefile so that the i386-linux-user testsuite's Makefile targets are more easily extensible and can work wherever a compiler for i386 is installed. In the future this could be extended to ARM and MIPS targets. This is unrelated to this series, but something that would be very good to have would be to move all check- programs to tests/ (or check/) and make them part of 'make test' or introduce a new 'make check'. Right, I thought about that too. The main hurdle is that: 1) right now almost every test in make test is failing; Yeah, just saw that. 2) the testsuite run by make test is very noisy, does not create logs, etc. If we could use autotest... :) We could also have check-i386, check-qmp, etc. and check would simply call all of them. That would be perfect, I think. I mean, if we could stick only to check to do all our unit-testing, then it would be easier to setup autotest around it. But we'll have to port current tests to it, of course.
Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/
On Fri, 22 Oct 2010 15:53:39 +0200 Paolo Bonzini pbonz...@redhat.com wrote: On 10/22/2010 03:44 PM, Luiz Capitulino wrote: We could also have check-i386, check-qmp, etc. and check would simply call all of them. That would be perfect, I think. I mean, if we could stick only to check to do all our unit-testing Well, the current make test is more integration testing than unit testing, so I'm not sure check can work for that. But autotest can, which is why I mentioned it. Ah, got it.
[Qemu-devel] [PATCH 2/4] Silence compiler warning in json test case
From: Jan Kiszka jan.kis...@web.de

This avoids

    error: zero-length gnu_printf format string

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 check-qjson.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/check-qjson.c b/check-qjson.c
index 0b60e45..64fcdcb 100644
--- a/check-qjson.c
+++ b/check-qjson.c
@@ -639,7 +639,9 @@ END_TEST
 
 START_TEST(empty_input)
 {
-    QObject *obj = qobject_from_json("");
+    const char *empty = "";
+
+    QObject *obj = qobject_from_json(empty);
     fail_unless(obj == NULL);
 }
 END_TEST
-- 
1.7.3.1.127.g1bb28
[Qemu-devel] [PATCH 3/4] Fix test suite build with tracing enabled
From: Jan Kiszka jan.kis...@siemens.com qemu_malloc instrumentations require linking against the trace objects. Signed-off-by: Jan Kiszka jan.kis...@siemens.com Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com Signed-off-by: Luiz Capitulino lcapitul...@redhat.com --- Makefile | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Makefile b/Makefile index 252c817..106a401 100644 --- a/Makefile +++ b/Makefile @@ -140,12 +140,12 @@ qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx check-qint.o check-qstring.o check-qdict.o check-qlist.o check-qfloat.o check-qjson.o: $(GENERATED_HEADERS) -check-qint: check-qint.o qint.o qemu-malloc.o -check-qstring: check-qstring.o qstring.o qemu-malloc.o -check-qdict: check-qdict.o qdict.o qfloat.o qint.o qstring.o qbool.o qemu-malloc.o qlist.o -check-qlist: check-qlist.o qlist.o qint.o qemu-malloc.o -check-qfloat: check-qfloat.o qfloat.o qemu-malloc.o -check-qjson: check-qjson.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o qjson.o json-streamer.o json-lexer.o json-parser.o qemu-malloc.o +check-qint: check-qint.o qint.o qemu-malloc.o $(trace-obj-y) +check-qstring: check-qstring.o qstring.o qemu-malloc.o $(trace-obj-y) +check-qdict: check-qdict.o qdict.o qfloat.o qint.o qstring.o qbool.o qemu-malloc.o qlist.o $(trace-obj-y) +check-qlist: check-qlist.o qlist.o qint.o qemu-malloc.o $(trace-obj-y) +check-qfloat: check-qfloat.o qfloat.o qemu-malloc.o $(trace-obj-y) +check-qjson: check-qjson.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o qjson.o json-streamer.o json-lexer.o json-parser.o qemu-malloc.o $(trace-obj-y) clean: # avoid old build problems by removing potentially incorrect old files -- 1.7.3.1.127.g1bb28
Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)
On Fri, 22 Oct 2010 08:43:39 -0500 Anthony Liguori anth...@codemonkey.ws wrote: On 10/22/2010 08:17 AM, Luiz Capitulino wrote: You and Gerd asked about the purpose of this command, turns out that it's only useful for developing new USB devices for QEMU, so I thought it would be better to restrict it, so that people don't start using this the wrong way or worse, we can't drop/break it because some tool is now using it. Need HMP commands have no support associated with them. Given QMP, there's no reason to script HMP commands. But a long standing policy in QEMU has been to avoid conditional code because it results in dead code. I think this policy has actually worked well for us historically. Fine with me.
[Qemu-devel] [PATCH 11/11] virtio-blk: Respect werror option for flushes
The werror option now affects not only write requests, but also flush requests.
Previously, it was not possible to stop a VM on a failed flush.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/virtio-blk.c |    8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index a1df26d..dbe2070 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -106,7 +106,13 @@ static void virtio_blk_flush_complete(void *opaque, int ret)
 {
     VirtIOBlockReq *req = opaque;
 
-    virtio_blk_req_complete(req, ret ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK);
+    if (ret) {
+        if (virtio_blk_handle_rw_error(req, -ret, 0)) {
+            return;
+        }
+    }
+
+    virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
 }
 
 static VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
-- 
1.7.2.3
[Qemu-devel] [PULL 0/4]: Monitor queue
Anthony, The following patches have been sent to the list and look good to me. I've also tested them. The changes (since d03703c81a202cea156811e5dbc8e88627c19986) are available in the following repository: git://repo.or.cz/qemu/qmp-unstable.git for-anthony Hidetoshi Seto (1): Trivial fix for QMP/qmp-events.txt Jan Kiszka (2): Silence compiler warning in json test case Fix test suite build with tracing enabled Kusanagi Kouichi (1): monitor: Ignore . and .. when completing file name. Makefile | 12 ++-- QMP/qmp-events.txt |2 +- check-qjson.c |4 +++- monitor.c |5 + 4 files changed, 15 insertions(+), 8 deletions(-)
Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/
On 10/22/2010 03:44 PM, Luiz Capitulino wrote: We could also have check-i386, check-qmp, etc. and check would simply call all of them. That would be perfect, I think. I mean, if we could stick only to check to do all our unit-testing Well, the current make test is more integration testing than unit testing, so I'm not sure check can work for that. But autotest can, which is why I mentioned it. Paolo
[Qemu-devel] [PATCH 4/4] monitor: Ignore . and .. when completing file name.
From: Kusanagi Kouichi sl...@ac.auone-net.jp

Signed-off-by: Kusanagi Kouichi sl...@ac.auone-net.jp
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 monitor.c |    5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/monitor.c b/monitor.c
index 260cc02..61607c5 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3976,6 +3976,11 @@ static void file_completion(const char *input)
         d = readdir(ffs);
         if (!d)
             break;
+
+        if (strcmp(d->d_name, ".") == 0 || strcmp(d->d_name, "..") == 0) {
+            continue;
+        }
+
         if (strstart(d->d_name, file_prefix, NULL)) {
             memcpy(file, input, input_path_len);
             if (input_path_len < sizeof(file))
-- 
1.7.3.1.127.g1bb28
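The filter this patch adds to the completion loop can be tried out on plain strings, without a directory handle. The helper names below are made up for the example; only the strcmp() test on "." and ".." comes from the patch, and strncmp() stands in for the prefix match done by the monitor code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True for the two directory entries that every directory contains. */
static bool is_dot_entry(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/*
 * True if `name` should be offered as a completion for `prefix`.
 * Hidden files such as ".bashrc" are still offered; only "." and ".."
 * are dropped.
 */
static bool is_completion_candidate(const char *name, const char *prefix)
{
    if (is_dot_entry(name)) {
        return false;
    }
    return strncmp(name, prefix, strlen(prefix)) == 0;
}
```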
Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
Am 22.10.2010 15:45, schrieb Anthony Liguori: On a physical system, if you don't have a battery backed disk and you enable the WC on your disk, then even with cache=writethrough we're unsafe. I don't think that's right. O_SYNC should guarantee that the volatile disk cache is flushed. If your filesystem does the right thing which an awful lot of them don't today. The list of really relevant filesystems is rather short, though. Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe. Yeah, if you do something equivalent to cache=unsafe on a lower layer, then qemu can't do much about it. Maybe you can apply the same argument to NBD, even though it's unsafe by default. QEMU can't guarantee safety. The underlying storage needs to be configured correctly. As long as we're not introducing caching within QEMU, I don't think we should assume we're unsafe. Do we have any place where we can add docs on a per-block format basis? It would be good to at least mention for each block device how the backing storage needed to be configured for safety. docs/block-protocols.txt? Maybe docs/block/name.txt? Would be a good home for the qed spec too. I think spec and documentation for users should be kept separate. I thought that's the reason why docs/specs/ exists. And if you exclude specs, I'm not sure if there's a lot left to say for each format. Having ten files under docs/block/ which consist of two lines each would be ridiculous. If contrary to my expectations we actually do have content for it, docs/block/name.txt works for me as well. Kevin
Re: [Qemu-devel] Re: [PATCH v6 07/12] pcie/hotplug: introduce pushing attention button command
On Fri, Oct 22, 2010 at 01:35:47PM +0200, Markus Armbruster wrote: Michael S. Tsirkin m...@redhat.com writes: On Wed, Oct 20, 2010 at 05:18:56PM +0900, Isaku Yamahata wrote: glue pcie_push_attention_button command. Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp So as a high level command, I think we need to think about how to tie this into pci_add/pci_del. Right? [...] Do we have consensus how our set of commands for hot plug should look like? We talked about it, but did we reach consensus? If yes, did we write it down somewhere? I think for simple things yes: - command to send hotplug notification to the guest - command to immediately add/remove the device - event to notify about guest ack - way to poll status: did guest ack last command? Existing ones will keep function: - send notification and when acked remove device - add device and send notification These are useful for human monitor but maybe not for management. -- MST
Re: [Qemu-devel] Re: [PATCH v6 07/12] pcie/hotplug: introduce pushing attention button command
On 10/22/2010 09:38 AM, Michael S. Tsirkin wrote: On Fri, Oct 22, 2010 at 01:35:47PM +0200, Markus Armbruster wrote: Michael S. Tsirkinm...@redhat.com writes: On Wed, Oct 20, 2010 at 05:18:56PM +0900, Isaku Yamahata wrote: glue pcie_push_attention_button command. Signed-off-by: Isaku Yamahatayamah...@valinux.co.jp So as a high level command, I think we need to think about how to tie this into pci_add/pci_del. Right? [...] Do we have consensus how our set of commands for hot plug should look like? We talked about it, but did we reach consensus? If yes, did we write it down somewhere? I think for simple things yes: - command to send hotplug notification to the guest - command to immediately add/remove the device - event to notify about guest ack Is it a guest ack? I thought it was actually an eject that can be initiated without the notification being sent to the guest. If so, polling doesn't really make much sense. Regards, Anthony Liguori - way to poll status: did guest ack last command? Existing ones will keep function: - send notification and when acked remove device - add device and send notification These are useful for human monitor but maybe not for management.
[Qemu-devel] (no subject)
QEMU Enhanced Disk format is a disk image format that forgoes features found in qcow2 in favor of better levels of performance and data integrity. Due to its simpler on-disk layout, it is possible to safely perform metadata updates more efficiently. Installations, suspend-to-disk, and other allocation-heavy I/O workloads will see increased performance due to fewer I/Os and syncs. Workloads that do not cause new clusters to be allocated will perform similar to raw images due to in-memory metadata caching. The format supports sparse disk images. It does not rely on the host filesystem holes feature, making it a good choice for sparse disk images that need to be transferred over channels where holes are not supported. Backing files are supported so only deltas against a base image can be stored. The base image may be smaller than the image file. The file format is extensible so that additional features can be added later with graceful compatibility handling. A specification for the file format is included in this patchset. Internal snapshots are not supported. This eliminates the need for additional metadata to track copy-on-write clusters. Compression and encryption are not supported. They add complexity and can be implemented at other layers in the stack (i.e. inside the guest or on the host). Encryption has been identified as a potential future extension and the file format allows for this. 
Signed-off-by: Anthony Liguori aligu...@us.ibm.com Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- This code is also available from git: http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/qed-v3 I have preserved distinct commits against v2 for easier reviewing here: http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/qed-v3-presquash v3: * Flush before L2 update when a backing file is used * Use QED_F_BACKING_FORMAT_NO_PROBE instead of backing_fmt header field * Allow non-cluster sized images * Implement autoclear feature bits * Implement backing image smaller size - reads from backing image should zero beyond EOF * Preserve errno in qed_find_cluster_cb() - don't dumb down to QED_CLUSTER_ERROR * Use ffs() instead of get_bits_from_size() * Remove l2_cache argument to qed_unref_l2_cache_entry * Eliminate L2TableAllocFunc function pointer * Split qed_aio_write in-place and allocating code path to make code clearer * Document how L2 cache is used * Document qed_find_cluster() * Update QED specification * Fix COPYING.LIB LGPL license file references * Add copyright header to qed-check.c * Avoid the bytes_to_str()/cvtstr()/sztostr() dependency until Jes' strtosz() goes in v2: * Add QED format specification to documentation * Use __builtin_ctzl() for get_bits_from_size() * Fine-grained table locking to allow concurrent allocating write requests * Fix qemu_free() instead of qemu_vfree() in qed_unref_l2_cache_entry() * Comment clean-ups
[Qemu-devel] [PATCH v3 5/5] qed: Consistency check support
This patch adds support for the qemu-img check command. It also introduces a dirty bit in the qed header to mark modified images as needing a check. This bit is cleared when the image file is closed cleanly. If an image file is opened and it has the dirty bit set, a consistency check will run and try to fix corrupted table offsets. These corruptions may occur if there is power loss while an allocating write is performed. Once the image is fixed it opens as normal again. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- Makefile.objs |2 +- block/qed-check.c | 210 + block/qed.c | 125 +++- block/qed.h |9 ++ 4 files changed, 342 insertions(+), 4 deletions(-) create mode 100644 block/qed-check.c diff --git a/Makefile.objs b/Makefile.objs index 8ae17d8..62cab02 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -15,7 +15,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o -block-nested-y += qed-lock.o +block-nested-y += qed-lock.o qed-check.o block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o block-nested-$(CONFIG_WIN32) += raw-win32.o block-nested-$(CONFIG_POSIX) += raw-posix.o diff --git a/block/qed-check.c b/block/qed-check.c new file mode 100644 index 000..4600932 --- /dev/null +++ b/block/qed-check.c @@ -0,0 +1,210 @@ +/* + * QEMU Enhanced Disk Format Consistency Check + * + * Copyright IBM, Corp. 2010 + * + * Authors: + * Stefan Hajnoczi stefa...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. 
+ * + */ + +#include qed.h + +typedef struct { +BDRVQEDState *s; +BdrvCheckResult *result; +bool fix; /* whether to fix invalid offsets */ + +size_t nclusters; +uint32_t *used_clusters;/* referenced cluster bitmap */ + +QEDRequest request; +} QEDCheck; + +static bool qed_test_bit(uint32_t *bitmap, uint64_t n) { +return !!(bitmap[n / 32] (1 (n % 32))); +} + +static void qed_set_bit(uint32_t *bitmap, uint64_t n) { +bitmap[n / 32] |= 1 (n % 32); +} + +/** + * Set bitmap bits for clusters + * + * @check: Check structure + * @offset: Starting offset in bytes + * @n: Number of clusters + */ +static bool qed_set_used_clusters(QEDCheck *check, uint64_t offset, + unsigned int n) +{ +uint64_t cluster = qed_bytes_to_clusters(check-s, offset); +unsigned int corruptions = 0; + +while (n-- != 0) { +/* Clusters should only be referenced once */ +if (qed_test_bit(check-used_clusters, cluster)) { +corruptions++; +} + +qed_set_bit(check-used_clusters, cluster); +cluster++; +} + +check-result-corruptions += corruptions; +return corruptions == 0; +} + +/** + * Check an L2 table + * + * @ret:Number of invalid cluster offsets + */ +static unsigned int qed_check_l2_table(QEDCheck *check, QEDTable *table) +{ +BDRVQEDState *s = check-s; +unsigned int i, num_invalid = 0; + +for (i = 0; i s-table_nelems; i++) { +uint64_t offset = table-offsets[i]; + +if (!offset) { +continue; +} + +/* Detect invalid cluster offset */ +if (!qed_check_cluster_offset(s, offset)) { +if (check-fix) { +table-offsets[i] = 0; +} else { +check-result-corruptions++; +} + +num_invalid++; +continue; +} + +qed_set_used_clusters(check, offset, 1); +} + +return num_invalid; +} + +/** + * Descend tables and check each cluster is referenced once only + */ +static int qed_check_l1_table(QEDCheck *check, QEDTable *table) +{ +BDRVQEDState *s = check-s; +unsigned int i, num_invalid_l1 = 0; +int ret, last_error = 0; + +/* Mark L1 table clusters used */ +qed_set_used_clusters(check, s-header.l1_table_offset, + 
s-header.table_size); + +for (i = 0; i s-table_nelems; i++) { +unsigned int num_invalid_l2; +uint64_t offset = table-offsets[i]; + +if (!offset) { +continue; +} + +/* Detect invalid L2 offset */ +if (!qed_check_table_offset(s, offset)) { +/* Clear invalid offset */ +if (check-fix) { +table-offsets[i] = 0; +} else { +check-result-corruptions++; +} + +num_invalid_l1++; +continue; +} + +if (!qed_set_used_clusters(check, offset, s-header.table_size)) { +
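The "each cluster is referenced once only" check in qed-check.c rests on a plain bitmap with one bit per cluster. Below is a standalone sketch of that bookkeeping; the function names are chosen for this example and are not the ones in the patch, and the sketch shifts an unsigned 32-bit one so that bit 31 is well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if bit n of the bitmap is set. */
static bool bitmap_test(const uint32_t *bitmap, uint64_t n)
{
    return ((bitmap[n / 32] >> (n % 32)) & 1u) != 0;
}

/* Sets bit n of the bitmap. */
static void bitmap_set(uint32_t *bitmap, uint64_t n)
{
    bitmap[n / 32] |= UINT32_C(1) << (n % 32);
}

/*
 * Marks `count` clusters starting at `first` as referenced and returns how
 * many of them were already marked, i.e. how many double references a
 * consistency check would report.
 */
static unsigned int mark_clusters(uint32_t *bitmap, uint64_t first,
                                  unsigned int count)
{
    unsigned int already = 0;

    while (count-- != 0) {
        if (bitmap_test(bitmap, first)) {
            already++;
        }
        bitmap_set(bitmap, first);
        first++;
    }
    return already;
}
```

The caller has to size the bitmap for the number of clusters in the image, rounded up to a multiple of 32 bits.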
[Qemu-devel] [PATCH v3 1/5] docs: Add QED image format specification
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- docs/specs/qed_spec.txt | 128 +++ 1 files changed, 128 insertions(+), 0 deletions(-) create mode 100644 docs/specs/qed_spec.txt diff --git a/docs/specs/qed_spec.txt b/docs/specs/qed_spec.txt new file mode 100644 index 000..e4425c8 --- /dev/null +++ b/docs/specs/qed_spec.txt @@ -0,0 +1,128 @@ +=Specification= + +The file format looks like this: + + +--+--+--+-+ + | cluster0 | cluster1 | cluster2 | ... | + +--+--+--+-+ + +The first cluster begins with the '''header'''. The header contains information about where regular clusters start; this allows the header to be extensible and store extra information about the image file. A regular cluster may be a '''data cluster''', an '''L2''', or an '''L1 table'''. L1 and L2 tables are composed of one or more contiguous clusters. + +Normally the file size will be a multiple of the cluster size. If the file size is not a multiple, extra information after the last cluster may not be preserved if data is written. Legitimate extra information should use space between the header and the first regular cluster. + +All fields are little-endian. + +==Header== + Header { + uint32_t magic; /* QED\0 */ + + uint32_t cluster_size;/* in bytes */ + uint32_t table_size; /* for L1 and L2 tables, in clusters */ + uint32_t header_size; /* in clusters */ + + uint64_t features;/* format feature bits */ + uint64_t compat_features; /* compat feature bits */ + uint64_t l1_table_offset; /* in bytes */ + uint64_t image_size; /* total logical image size, in bytes */ + + /* if (features QED_F_BACKING_FILE) */ + uint32_t backing_filename_offset; /* in bytes from start of header */ + uint32_t backing_filename_size; /* in bytes */ + } + +Field descriptions: +* ''cluster_size'' must be a power of 2 in range [2^12, 2^26]. +* ''table_size'' must be a power of 2 in range [1, 16]. 
+* ''header_size'' is the number of clusters used by the header and any additional information stored before regular clusters. +* ''features'', ''compat_features'', and ''autoclear_features'' are file format extension bitmaps. They work as follows: +** An image with unknown ''features'' bits enabled must not be opened. File format changes that are not backwards-compatible must use ''features'' bits. +** An image with unknown ''compat_features'' bits enabled can be opened safely. The unknown features are simply ignored and represent backwards-compatible changes to the file format. +** An image with unknown ''autoclear_features'' bits enable can be opened safely after clearing the unknown bits. This allows for backwards-compatible changes to the file format which degrade gracefully and can be re-enabled again by a new program later. +* ''l1_table_offset'' is the offset of the first byte of the L1 table in the image file and must be a multiple of ''cluster_size''. +* ''image_size'' is the block device size seen by the guest and must be a multiple of 512 bytes. +* ''backing_filename'' is a string in (byte offset, byte size) form. It is not NUL-terminated and has no alignment constraints. + +Feature bits: +* QED_F_BACKING_FILE = 0x01. The image uses a backing file. The backing filename string is given in the ''backing_filename_{offset,size}'' fields and may be an absolute path or relative to the image file. +* QED_F_NEED_CHECK = 0x02. The image needs a consistency check before use. +* QED_F_BACKING_FORMAT_NO_PROBE = 0x04. The backing file is a raw disk image and no file format autodetection should be attempted. This should be used to ensure that raw backing images are never detected as an image format if they happen to contain magic constants. + +There are currently no defined ''compat_features'' or ''autoclear_features'' bits. + +Fields predicated on a feature bit are only used when that feature is set. 
The fields always take up header space, regardless of whether or not the feature bit is set. + +==Tables== + +Tables provide the translation from logical offsets in the block device to cluster offsets in the file. + + #define TABLE_NOFFSETS (table_size * cluster_size / sizeof(uint64_t)) + + Table { + uint64_t offsets[TABLE_NOFFSETS]; + } + +The tables are organized as follows: + ++--+ +| L1 table | ++--+ + ,--' | '--. + +--+ |+--+ + | L2 table | ... | L2 table | + +--++--+ + ,--' | '--. + +--+ |+--+ + | Data | ... | Data | + +--++--+ + +A table is made up of one or more contiguous clusters. The table_size header field determines table size for an image
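As a rough illustration of the `TABLE_NOFFSETS` definition in the specification text above, the sketch below computes how many offsets fit in one table and how much guest data a two-level L1/L2 lookup can then address. The function names and the parameter values used in the usage note are assumptions chosen for the example; they are not taken from the patch.

```c
#include <stdint.h>

/* Number of 64-bit offsets in one table (TABLE_NOFFSETS in the spec text). */
static uint64_t table_noffsets(uint32_t table_size, uint32_t cluster_size)
{
    return ((uint64_t)table_size * cluster_size) / sizeof(uint64_t);
}

/*
 * Bytes of guest data reachable through one L1 table whose entries each
 * point at an L2 table. Note: this sketch does not guard against 64-bit
 * overflow for the largest table_size/cluster_size combinations.
 */
static uint64_t max_image_size(uint32_t table_size, uint32_t cluster_size)
{
    uint64_t entries = table_noffsets(table_size, cluster_size);
    uint64_t l2_coverage = entries * cluster_size; /* bytes mapped per L2 table */

    return l2_coverage * entries;
}
```

With an example of 64 KiB clusters and 4-cluster tables, each table holds 32768 offsets, one L2 table maps 2 GiB, and the L1 table can reach 64 TiB.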
[Qemu-devel] [PATCH v3 4/5] qed: Read/write support
This patch implements the read/write state machine. Operations are fully asynchronous and multiple operations may be active at any time. Allocating writes lock tables to ensure metadata updates do not interfere with each other. If two allocating writes need to update the same L2 table they will run sequentially. If two allocating writes need to update different L2 tables they will run in parallel. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- Makefile.objs|1 + block/qed-lock.c | 124 ++ block/qed.c | 667 +- block/qed.h | 43 trace-events | 10 + 5 files changed, 843 insertions(+), 2 deletions(-) create mode 100644 block/qed-lock.c diff --git a/Makefile.objs b/Makefile.objs index bcab5bc..8ae17d8 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -15,6 +15,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o +block-nested-y += qed-lock.o block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o block-nested-$(CONFIG_WIN32) += raw-win32.o block-nested-$(CONFIG_POSIX) += raw-posix.o diff --git a/block/qed-lock.c b/block/qed-lock.c new file mode 100644 index 000..1b7a763 --- /dev/null +++ b/block/qed-lock.c @@ -0,0 +1,124 @@ +/* + * QEMU Enhanced Disk Format Lock + * + * Copyright IBM, Corp. 2010 + * + * Authors: + * Stefan Hajnoczi stefa...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. + * + */ + +/* + * Table locking works as follows: + * + * Reads and non-allocating writes do not acquire locks because they do not + * modify tables and only see committed L2 cache entries. + * + * An allocating write request that needs to update an existing L2 table + * acquires a lock on the table. 
This serializes requests that touch the same + * L2 table. + * + * An allocating write request that needs to create a new L2 table and update + * the L1 table acquires a lock on the L1 table. This serializes requests that + * create new L2 tables. + * + * When a request is unable to acquire a lock, it is put to sleep and must + * return. When the lock it attempted to acquire becomes available, a wakeup + * function is invoked to activate it again. + * + * A request must retry its cluster lookup after waking up because the tables + * have changed. For example, an allocating write may no longer need to + * allocate if the previous request already allocated the cluster. + */ + +#include qed.h + +struct QEDLockEntry { +uint64_t key; +QSIMPLEQ_HEAD(, QEDAIOCB) reqs; +QTAILQ_ENTRY(QEDLockEntry) next; +}; + +/** + * Initialize a lock + * + * @lock: Lock + * @wakeup_fn: Callback to reactivate a sleeping request + */ +void qed_lock_init(QEDLock *lock, BlockDriverCompletionFunc *wakeup_fn) +{ +QTAILQ_INIT(lock-entries); +lock-wakeup_fn = wakeup_fn; +} + +/** + * Acquire a lock on a given key + * + * @lock: Lock + * @key:Key to lock on + * @acb:Request + * @ret:true if lock was acquired, false if request needs to sleep + * + * If the request currently has another lock held, that lock will be released. 
+ */ +bool qed_lock(QEDLock *lock, uint64_t key, QEDAIOCB *acb) +{ +QEDLockEntry *entry = acb-lock_entry; + +if (entry) { +/* Lock already held */ +if (entry-key == key) { +return true; +} + +/* Release old lock */ +qed_unlock(lock, acb); +} + +/* Find held lock */ +QTAILQ_FOREACH(entry, lock-entries, next) { +if (entry-key == key) { +QSIMPLEQ_INSERT_TAIL(entry-reqs, acb, next); +acb-lock_entry = entry; +return false; +} +} + +/* Allocate new lock entry */ +entry = qemu_malloc(sizeof(*entry)); +entry-key = key; +QSIMPLEQ_INIT(entry-reqs); +QSIMPLEQ_INSERT_TAIL(entry-reqs, acb, next); +QTAILQ_INSERT_TAIL(lock-entries, entry, next); +acb-lock_entry = entry; +return true; +} + +/** + * Release a held lock + */ +void qed_unlock(QEDLock *lock, QEDAIOCB *acb) +{ +QEDLockEntry *entry = acb-lock_entry; + +if (!entry) { +return; +} + +acb-lock_entry = NULL; +QSIMPLEQ_REMOVE_HEAD(entry-reqs, next); + +/* Wake up next lock holder */ +acb = QSIMPLEQ_FIRST(entry-reqs); +if (acb) { +lock-wakeup_fn(acb, 0); +return; +} + +/* Free lock entry */ +QTAILQ_REMOVE(lock-entries, entry, next); +qemu_free(entry); +} diff --git a/block/qed.c b/block/qed.c index 6795242..5bd503b 100644 --- a/block/qed.c +++ b/block/qed.c @@ -12,8 +12,26 @@ * */ +#include trace.h
[Qemu-devel] [PATCH v3 2/5] qed: Add QEMU Enhanced Disk image format
This patch introduces the qed on-disk layout and implements image creation. Later patches add read/write and other functionality. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- Makefile.objs |1 + block/qed.c | 546 + block/qed.h | 148 3 files changed, 695 insertions(+), 0 deletions(-) create mode 100644 block/qed.c create mode 100644 block/qed.h diff --git a/Makefile.objs b/Makefile.objs index 6e6f60a..2f036c8 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -14,6 +14,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o +block-nested-y += qed.o block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o block-nested-$(CONFIG_WIN32) += raw-win32.o block-nested-$(CONFIG_POSIX) += raw-posix.o diff --git a/block/qed.c b/block/qed.c new file mode 100644 index 000..3f2d61b --- /dev/null +++ b/block/qed.c @@ -0,0 +1,546 @@ +/* + * QEMU Enhanced Disk Format + * + * Copyright IBM, Corp. 2010 + * + * Authors: + * Stefan Hajnoczi stefa...@linux.vnet.ibm.com + * Anthony Liguori aligu...@us.ibm.com + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. 
+ * + */ + +#include qed.h + +static int bdrv_qed_probe(const uint8_t *buf, int buf_size, + const char *filename) +{ +const QEDHeader *header = (const void *)buf; + +if (buf_size sizeof(*header)) { +return 0; +} +if (le32_to_cpu(header-magic) != QED_MAGIC) { +return 0; +} +return 100; +} + +/** + * Check whether an image format is raw + * + * @fmt:Backing file format, may be NULL + */ +static bool qed_fmt_is_raw(const char *fmt) +{ +return fmt strcmp(fmt, raw) == 0; +} + +static void qed_header_le_to_cpu(const QEDHeader *le, QEDHeader *cpu) +{ +cpu-magic = le32_to_cpu(le-magic); +cpu-cluster_size = le32_to_cpu(le-cluster_size); +cpu-table_size = le32_to_cpu(le-table_size); +cpu-header_size = le32_to_cpu(le-header_size); +cpu-features = le64_to_cpu(le-features); +cpu-compat_features = le64_to_cpu(le-compat_features); +cpu-autoclear_features = le64_to_cpu(le-autoclear_features); +cpu-l1_table_offset = le64_to_cpu(le-l1_table_offset); +cpu-image_size = le64_to_cpu(le-image_size); +cpu-backing_filename_offset = le32_to_cpu(le-backing_filename_offset); +cpu-backing_filename_size = le32_to_cpu(le-backing_filename_size); +} + +static void qed_header_cpu_to_le(const QEDHeader *cpu, QEDHeader *le) +{ +le-magic = cpu_to_le32(cpu-magic); +le-cluster_size = cpu_to_le32(cpu-cluster_size); +le-table_size = cpu_to_le32(cpu-table_size); +le-header_size = cpu_to_le32(cpu-header_size); +le-features = cpu_to_le64(cpu-features); +le-compat_features = cpu_to_le64(cpu-compat_features); +le-autoclear_features = cpu_to_le64(cpu-autoclear_features); +le-l1_table_offset = cpu_to_le64(cpu-l1_table_offset); +le-image_size = cpu_to_le64(cpu-image_size); +le-backing_filename_offset = cpu_to_le32(cpu-backing_filename_offset); +le-backing_filename_size = cpu_to_le32(cpu-backing_filename_size); +} + +static int qed_write_header_sync(BDRVQEDState *s) +{ +QEDHeader le; +int ret; + +qed_header_cpu_to_le(s-header, le); +ret = bdrv_pwrite(s-bs-file, 0, le, sizeof(le)); +if (ret != sizeof(le)) { +return 
ret; +} +return 0; +} + +static uint64_t qed_max_image_size(uint32_t cluster_size, uint32_t table_size) +{ +uint64_t table_entries; +uint64_t l2_size; + +table_entries = (table_size * cluster_size) / sizeof(uint64_t); +l2_size = table_entries * cluster_size; + +return l2_size * table_entries; +} + +static bool qed_is_cluster_size_valid(uint32_t cluster_size) +{ +if (cluster_size QED_MIN_CLUSTER_SIZE || +cluster_size QED_MAX_CLUSTER_SIZE) { +return false; +} +if (cluster_size (cluster_size - 1)) { +return false; /* not power of 2 */ +} +return true; +} + +static bool qed_is_table_size_valid(uint32_t table_size) +{ +if (table_size QED_MIN_TABLE_SIZE || +table_size QED_MAX_TABLE_SIZE) { +return false; +} +if (table_size (table_size - 1)) { +return false; /* not power of 2 */ +} +return true; +} + +static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size, +uint32_t table_size) +{ +if (image_size == 0) { +/* Supporting zero size images makes life harder because even the L1 + * table is not needed. Make life simple and forbid zero size images. + */ +return false; +} +if (image_size % BDRV_SECTOR_SIZE != 0) { +return false; /*
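The comparison and bitwise operators in the validation helpers above were lost in the archive. The following is a minimal sketch of the same kind of check, assuming the limits given in the specification patch (cluster size a power of 2 in [2^12, 2^26], table size a power of 2 in [1, 16]). The macro and function names here are illustrative and are not the patch's own identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits as stated in the specification's field descriptions. */
#define MIN_CLUSTER_SIZE (UINT32_C(1) << 12)
#define MAX_CLUSTER_SIZE (UINT32_C(1) << 26)
#define MIN_TABLE_SIZE   UINT32_C(1)
#define MAX_TABLE_SIZE   UINT32_C(16)

/* True if value lies in [min, max] and has exactly one bit set. */
static bool in_range_power_of_2(uint32_t value, uint32_t min, uint32_t max)
{
    if (value < min || value > max) {
        return false;
    }
    return (value & (value - 1)) == 0;
}

static bool cluster_size_valid(uint32_t cluster_size)
{
    return in_range_power_of_2(cluster_size, MIN_CLUSTER_SIZE,
                               MAX_CLUSTER_SIZE);
}

static bool table_size_valid(uint32_t table_size)
{
    return in_range_power_of_2(table_size, MIN_TABLE_SIZE, MAX_TABLE_SIZE);
}
```

The `value & (value - 1)` test clears the lowest set bit, so it yields zero only when a single bit is set; the range check runs first so that zero is rejected.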
[Qemu-devel] [PATCH v3 3/5] qed: Table, L2 cache, and cluster functions
This patch adds code to look up data cluster offsets in the image via the L1/L2 tables. The L2 tables are writethrough cached in memory for performance (each read/write requires a lookup so it is essential to cache the tables). With cluster lookup code in place it is possible to implement bdrv_is_allocated() to query the number of contiguous allocated/unallocated clusters. Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com --- Makefile.objs|2 +- block/qed-cluster.c | 154 block/qed-gencb.c| 32 + block/qed-l2-cache.c | 173 +++ block/qed-table.c| 317 ++ block/qed.c | 54 +- block/qed.h | 118 +++ trace-events | 11 ++ 8 files changed, 859 insertions(+), 2 deletions(-) create mode 100644 block/qed-cluster.c create mode 100644 block/qed-gencb.c create mode 100644 block/qed-l2-cache.c create mode 100644 block/qed-table.c diff --git a/Makefile.objs b/Makefile.objs index 2f036c8..bcab5bc 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -14,7 +14,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o -block-nested-y += qed.o +block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o block-nested-$(CONFIG_WIN32) += raw-win32.o block-nested-$(CONFIG_POSIX) += raw-posix.o diff --git a/block/qed-cluster.c b/block/qed-cluster.c new file mode 100644 index 000..0ec864b --- /dev/null +++ b/block/qed-cluster.c @@ -0,0 +1,154 @@ +/* + * QEMU Enhanced Disk Format Cluster functions + * + * Copyright IBM, Corp. 2010 + * + * Authors: + * Stefan Hajnoczi stefa...@linux.vnet.ibm.com + * Anthony Liguori aligu...@us.ibm.com + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. 
+ * + */ + +#include qed.h + +/** + * Count the number of contiguous data clusters + * + * @s: QED state + * @table: L2 table + * @index: First cluster index + * @n: Maximum number of clusters + * @offset: Set to first cluster offset + * + * This function scans tables for contiguous allocated or free clusters. + */ +static unsigned int qed_count_contiguous_clusters(BDRVQEDState *s, + QEDTable *table, + unsigned int index, + unsigned int n, + uint64_t *offset) +{ +unsigned int end = MIN(index + n, s-table_nelems); +uint64_t last = table-offsets[index]; +unsigned int i; + +*offset = last; + +for (i = index + 1; i end; i++) { +if (last == 0) { +/* Counting free clusters */ +if (table-offsets[i] != 0) { +break; +} +} else { +/* Counting allocated clusters */ +if (table-offsets[i] != last + s-header.cluster_size) { +break; +} +last = table-offsets[i]; +} +} +return i - index; +} + +typedef struct { +BDRVQEDState *s; +uint64_t pos; +size_t len; + +QEDRequest *request; + +/* User callback */ +QEDFindClusterFunc *cb; +void *opaque; +} QEDFindClusterCB; + +static void qed_find_cluster_cb(void *opaque, int ret) +{ +QEDFindClusterCB *find_cluster_cb = opaque; +BDRVQEDState *s = find_cluster_cb-s; +QEDRequest *request = find_cluster_cb-request; +uint64_t offset = 0; +size_t len = 0; +unsigned int index; +unsigned int n; + +if (ret) { +goto out; +} + +index = qed_l2_index(s, find_cluster_cb-pos); +n = qed_bytes_to_clusters(s, + qed_offset_into_cluster(s, find_cluster_cb-pos) + + find_cluster_cb-len); +n = qed_count_contiguous_clusters(s, request-l2_table-table, + index, n, offset); + +ret = offset ? 
QED_CLUSTER_FOUND : QED_CLUSTER_L2; +len = MIN(find_cluster_cb-len, n * s-header.cluster_size - + qed_offset_into_cluster(s, find_cluster_cb-pos)); + +if (offset !qed_check_cluster_offset(s, offset)) { +ret = -EINVAL; +} + +out: +find_cluster_cb-cb(find_cluster_cb-opaque, ret, offset, len); +qemu_free(find_cluster_cb); +} + +/** + * Find the offset of a data cluster + * + * @s: QED state + * @request:L2 cache entry + * @pos:Byte position in device + * @len:Number of bytes + * @cb: Completion function + * @opaque: User data for completion function + * + * This
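The contiguous-cluster scan in this patch can be tried outside QEMU. The sketch below restates the loop over a plain array of offsets; the function name and signature are assumptions made for this standalone example, and it expects `index` to be inside the table and `n` to be at least 1, as the original appears to.

```c
#include <stdint.h>

/*
 * Count how many entries starting at 'index' form one run: either all
 * unallocated (offset 0) or allocated at consecutive cluster offsets.
 * At most 'n' entries are examined. '*first_offset' receives the offset
 * of the first entry in the run.
 */
static unsigned int count_contiguous(const uint64_t *offsets,
                                     unsigned int nelems,
                                     unsigned int index, unsigned int n,
                                     uint64_t cluster_size,
                                     uint64_t *first_offset)
{
    unsigned int end = index + n < nelems ? index + n : nelems;
    uint64_t last = offsets[index];
    unsigned int i;

    *first_offset = last;

    for (i = index + 1; i < end; i++) {
        if (last == 0) {
            /* Counting free clusters */
            if (offsets[i] != 0) {
                break;
            }
        } else {
            /* Counting allocated clusters */
            if (offsets[i] != last + cluster_size) {
                break;
            }
            last = offsets[i];
        }
    }
    return i - index;
}
```

For a table `{0, 0, 0x10000, 0x20000, 0x40000}` with 64 KiB clusters, a scan from index 0 finds a run of two free clusters, and a scan from index 2 finds two allocated clusters starting at offset 0x10000.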
Re: [Qemu-devel] [PATCH v3 0/5] qed: Add QEMU Enhanced Disk format
Subject line fixed. Stefan
[Qemu-devel] Re: [PATCH] sparc32: convert debug printf statements to tracepoints
On Thu, Oct 21, 2010 at 08:13:51PM +, Blue Swirl wrote: Replace debug printf statements with tracepoints. Signed-off-by: Blue Swirl blauwir...@gmail.com --- There were problems with tracetool if the tracepoint string contained parentheses or brackets, so I removed those. hw/cs4231.c| 20 ++--- hw/eccmemctl.c | 46 +- hw/lance.c | 14 ++- hw/slavio_intctl.c | 38 ++ hw/slavio_misc.c | 51 ++--- hw/slavio_timer.c | 49 --- hw/sparc32_dma.c | 42 ++-- hw/sun4m.c | 18 ++-- hw/sun4m_iommu.c | 33 --- trace-events | 108 10 files changed, 213 insertions(+), 206 deletions(-) Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com Out of interest, why is %8.8x used instead of %08x? Is there a difference in meaning or just style? Stefan
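On the `%8.8x` versus `%08x` question, a small check may help. In standard C, a precision on an `x` conversion sets the minimum number of digits, while the `0` flag pads with zeros up to the field width, so for an `unsigned int` both forms should yield eight hex digits. The helper below is a hypothetical test function and is not part of the patch.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if "%8.8x" and "%08x" format v identically, 0 otherwise. */
static int same_hex_output(unsigned int v)
{
    char a[32];
    char b[32];

    snprintf(a, sizeof(a), "%8.8x", v); /* width 8, at least 8 digits */
    snprintf(b, sizeof(b), "%08x", v);  /* width 8, zero-padded */
    return strcmp(a, b) == 0;
}
```

Under that reading the choice between the two looks like a matter of style for these tracepoints.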
[Qemu-devel] Re: [RFC][PATCH 4/5] trace-event
On Thu, Oct 21, 2010 at 03:10:18PM +0530, Prerna Saxena wrote: trace-event : QMP interface to change state of a trace-event. (Analogous to hmp command : trace-event ) Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com --- qmp-commands.hx | 32 1 files changed, 32 insertions(+), 0 deletions(-) diff --git a/qmp-commands.hx b/qmp-commands.hx index 7e95f4e..f2008e8 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -761,6 +761,38 @@ Example: Note: This command must be issued before issuing any other command. +EQMP + +{ +.name = trace-event, +.args_type = name:s,option:b, +.params = name on|off, +.help = changes state of a specific trace event, +.user_print = monitor_user_noop, +.mhandler.cmd_new = do_change_trace_event_state_qmp, +}, + +SQMP +trace-event +--- + +Change state of a trace-event. The name is a little odd because it has no verb. How about set-trace-event or enable-trace-event? Stefan
Re: [Qemu-devel] [PATCH 2/2] v3 Fix Block Hotplug race with drive_unplug()
On Thu, 21 Oct 2010 21:55:53 -0500 Ryan Harper ry...@us.ibm.com wrote: Block hot unplug is racy since the guest is required to acknowledge the ACPI unplug event; this may not happen synchronously with the device removal command. This series aims to close a gap whereby mgmt applications that assume the block resource has been removed without confirming that the guest has acknowledged the removal may re-assign the underlying device to a second guest, leading to data leakage. This series introduces a new monitor command to decouple asynchronous device removal from restricting guest access to a block device. We do this by creating a new monitor command drive_unplug which maps to a bdrv_unplug() command which does a qemu_aio_flush(), bdrv_flush() and bdrv_close(). Once complete, subsequent IO is rejected from the device and the guest will get IO errors but continue to function. A subsequent device removal command can be issued to remove the device, to which the guest may or may not respond, but as long as the unplugged bit is set, no IO will be submitted. 
Changes since v2: - Added qmp command for drive_unplug Changes since v1: - Added qemu_aio_flush() before bdrv_flush() to wait on pending io Signed-off-by: Ryan Harper ry...@us.ibm.com --- block.c |7 +++ block.h |1 + blockdev.c | 26 ++ blockdev.h |1 + hmp-commands.hx | 15 +++ qmp-commands.hx | 26 ++ 6 files changed, 76 insertions(+), 0 deletions(-) diff --git a/block.c b/block.c index a19374d..be47655 100644 --- a/block.c +++ b/block.c @@ -1328,6 +1328,13 @@ void bdrv_set_removable(BlockDriverState *bs, int removable) } } +void bdrv_unplug(BlockDriverState *bs) +{ +qemu_aio_flush(); +bdrv_flush(bs); +bdrv_close(bs); +} + int bdrv_is_removable(BlockDriverState *bs) { return bs-removable; diff --git a/block.h b/block.h index 5f64380..732f63e 100644 --- a/block.h +++ b/block.h @@ -171,6 +171,7 @@ void bdrv_set_on_error(BlockDriverState *bs, BlockErrorAction on_read_error, BlockErrorAction on_write_error); BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read); void bdrv_set_removable(BlockDriverState *bs, int removable); +void bdrv_unplug(BlockDriverState *bs); int bdrv_is_removable(BlockDriverState *bs); int bdrv_is_read_only(BlockDriverState *bs); int bdrv_is_sg(BlockDriverState *bs); diff --git a/blockdev.c b/blockdev.c index 5fc3b9b..68eb329 100644 --- a/blockdev.c +++ b/blockdev.c @@ -610,3 +610,29 @@ int do_change_block(Monitor *mon, const char *device, } return monitor_read_bdrv_key_start(mon, bs, NULL, NULL); } + +int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data) +{ +DriveInfo *dinfo; +BlockDriverState *bs; +const char *id; + +if (!qdict_haskey(qdict, id)) { +qerror_report(QERR_MISSING_PARAMETER, id); +return -1; +} You can drop this check, id is a required argument, the upper layer will perform the check for you. 
+ +id = qdict_get_str(qdict, id); +dinfo = drive_get_by_id(id); +if (!dinfo) { +qerror_report(QERR_DEVICE_NOT_FOUND, id); +return -1; +} + +/* mark block device unplugged */ +bs = dinfo-bdrv; +bdrv_unplug(bs); + +return 0; +} + diff --git a/blockdev.h b/blockdev.h index 19c6915..ecb9ac8 100644 --- a/blockdev.h +++ b/blockdev.h @@ -52,5 +52,6 @@ int do_eject(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_change_block(Monitor *mon, const char *device, const char *filename, const char *fmt); +int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data); #endif diff --git a/hmp-commands.hx b/hmp-commands.hx index 81999aa..7a32a2e 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -68,6 +68,21 @@ Eject a removable medium (use -f to force it). ETEXI { +.name = drive_unplug, +.args_type = id:s, +.params = device, +.help = unplug block device, +.user_print = monitor_user_noop, +.mhandler.cmd_new = do_drive_unplug, +}, + +STEXI +...@item unplug @var{device} +...@findex unplug +Unplug block device. +ETEXI + +{ .name = change, .args_type = device:B,target:F,arg:s?, .params = device filename [format], diff --git a/qmp-commands.hx b/qmp-commands.hx index 793cf1c..e8f3d4a 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -338,6 +338,32 @@ Example: EQMP { +.name = drive_unplug, +.args_type = id:s, +.params = device, +.help = unplug block device, +.user_print =
Re: [Qemu-devel] [PATCH 0/2] v3 Decouple block device removal from device removal
On Thu, 21 Oct 2010 21:55:51 -0500 Ryan Harper ry...@us.ibm.com wrote: This patch series decouples the detachment of a block device from the removal of the backing pci-device. Removal of a hotplugged pci device requires the guest to respond before qemu tears down the block device. In some cases, the guest may not respond, leaving the guest with continued access to the block device. The new monitor command, drive_unplug, will revoke a guest's access to the block device independently of the removal of the pci device. The first patch adds a new drive find method, the second patch implements the monitor command and block layer changes. Reviewed the monitor part, I think we're waiting for Kevin's and/or Markus's ACK to get this merged? Changes since v2: - Added QMP command for drive_unplug() Changes since v1: - CodingStyle fixes - Added qemu_aio_flush() to bdrv_unplug() Signed-off-by: Ryan Harper ry...@us.ibm.com
Re: [Qemu-devel] [PATCH 1/2] v2 Add drive_get_by_id
On Thu, 21 Oct 2010 21:55:52 -0500 Ryan Harper ry...@us.ibm.com wrote:

Add a function to find a drive by id string.

Changes since v1:
- Coding Style fix

Signed-off-by: Ryan Harper ry...@us.ibm.com
---
 blockdev.c |   13 +++++++++++++
 blockdev.h |    1 +
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index ff7602b..5fc3b9b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -75,6 +75,19 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit)
     return NULL;
 }
 
+DriveInfo *drive_get_by_id(const char *id)
+{
+    DriveInfo *dinfo;
+
+    QTAILQ_FOREACH(dinfo, &drives, next) {
+        if (strcmp(id, dinfo->id)) {
+            continue;
+        }

Very minor, but the following allows you to drop a statement:

    if (!strcmp()) {
        return dinfo;
    }

+        return dinfo;
+    }
+    return NULL;
+}
+
 int drive_get_max_bus(BlockInterfaceType type)
 {
     int max_bus;
diff --git a/blockdev.h b/blockdev.h
index 653affc..19c6915 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -38,6 +38,7 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit);
 int drive_get_max_bus(BlockInterfaceType type);
 void drive_uninit(DriveInfo *dinfo);
 DriveInfo *drive_get_by_blockdev(BlockDriverState *bs);
+DriveInfo *drive_get_by_id(const char *id);
 QemuOpts *drive_add(const char *file, const char *fmt, ...) GCC_FMT_ATTR(2, 3);
 DriveInfo *drive_init(QemuOpts *arg, int default_to_scsi,
                       int *fatal_error);
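The review suggestion (return on a match instead of `continue` on a mismatch) can be sketched without QEMU's queue macros. The `struct drive` type and the function name below are hypothetical stand-ins for `DriveInfo` and `drive_get_by_id()`, and a plain singly linked list replaces `QTAILQ`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DriveInfo; only the fields used here. */
struct drive {
    const char *id;
    struct drive *next;
};

/* Return the first drive whose id matches, or NULL if none does. */
static struct drive *drive_find_by_id(struct drive *head, const char *id)
{
    struct drive *d;

    for (d = head; d != NULL; d = d->next) {
        if (!strcmp(id, d->id)) {
            return d; /* match: no separate 'continue' branch needed */
        }
    }
    return NULL;
}
```

Both forms behave the same; the inverted test only removes one statement from the loop body.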
Re: [Qemu-devel] [PATCH 0/2] v3 Decouple block device removal from device removal
On 22.10.2010 17:48, Luiz Capitulino wrote: On Thu, 21 Oct 2010 21:55:51 -0500 Ryan Harper ry...@us.ibm.com wrote: This patch series decouples the detachment of a block device from the removal of the backing pci-device. Removal of a hotplugged pci device requires the guest to respond before qemu tears down the block device. In some cases, the guest may not respond, leaving the guest with continued access to the block device. The new monitor command, drive_unplug, will revoke a guest's access to the block device independently of the removal of the pci device. The first patch adds a new drive find method, the second patch implements the monitor command and block layer changes. Reviewed the monitor part, I think we're waiting for Kevin's and/or Markus's ACK to get this merged? Yes, I'm waiting for Markus' comments before applying this. Maybe we shouldn't add this to QMP yet, drive_add doesn't exist there either because we want to do it right with blockdev_add/del. On the other hand Markus should know much better than me what the right thing would look like - if he says that this interface will work with whatever we're going to get with blockdev_*, I'm fine with merging it. Kevin
[Qemu-devel] [PATCH] usb-linux: allow multiple devices with matching IDs
Right now, if we pass through multiple USB devices with matching vendor and
product IDs, only the first one is passed to the guest, because the code
thinks the second device is already attached. The only way to get those
devices working is to specify bus.addr, which is inconvenient if devices are
frequently replugged on the host, because the address changes after a replug.

Fix this by checking bus.addr before assuming the device is already
attached. This way

    -usbdevice host:1234:1234 -usbdevice host:1234:1234

will pass through 2 devices correctly.

Signed-off-by: Grazvydas Ignotas nota...@gmail.com
---
 usb-linux.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/usb-linux.c b/usb-linux.c
index c3c38ec..b5f1396 100644
--- a/usb-linux.c
+++ b/usb-linux.c
@@ -1464,9 +1464,13 @@ static int usb_host_auto_scan(void *opaque, int bus_num, int addr,
         }
         /* We got a match */
 
-        /* Already attached ? */
         if (s->fd != -1) {
-            return 0;
+            /* Already attached? */
+            if (s->bus_num == bus_num && s->addr == addr)
+                return 0;
+
+            /* Not attached but needs another hostdev */
+            continue;
         }
         DPRINTF("husb: auto open: bus_num %d addr %d\n", bus_num, addr);
 
-- 
1.6.3.3
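The effect of the changed check can be modelled in isolation. The sketch below is a simplified, hypothetical model: `struct host_slot` and `pick_slot()` are invented for illustration and are not the `usb-linux.c` data structures. Each slot represents one `-usbdevice host:VID:PID` entry, and the function decides which slot a newly scanned host device should use.

```c
#include <stddef.h>

/* Hypothetical, simplified model of one "-usbdevice host:VID:PID" entry. */
struct host_slot {
    int vendor_id;
    int product_id;
    int fd;       /* -1 while no host device is attached */
    int bus_num;  /* meaningful only when fd != -1 */
    int addr;     /* meaningful only when fd != -1 */
};

/*
 * Pick the slot that a scanned host device should be attached to.
 * Returns the slot index, or -1 if this exact device is already attached
 * or no matching free slot is left.
 */
static int pick_slot(const struct host_slot *slots, size_t n,
                     int vendor_id, int product_id, int bus_num, int addr)
{
    size_t i;

    for (i = 0; i < n; i++) {
        const struct host_slot *s = &slots[i];

        if (s->vendor_id != vendor_id || s->product_id != product_id) {
            continue; /* filter does not match this device */
        }
        if (s->fd != -1) {
            if (s->bus_num == bus_num && s->addr == addr) {
                return -1; /* this exact device is already attached */
            }
            continue; /* slot is busy with another device; try the next */
        }
        return (int)i;
    }
    return -1;
}
```

Before the patch, the equivalent of the inner comparison was missing, so a second device with the same IDs was treated as already attached as soon as the first slot was in use.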
Re: [Qemu-devel] [PATCH 0/2] v3 Decouple block device removal from device removal
* Kevin Wolf kw...@redhat.com [2010-10-22 11:12]: On 22.10.2010 17:48, Luiz Capitulino wrote: On Thu, 21 Oct 2010 21:55:51 -0500 Ryan Harper ry...@us.ibm.com wrote: This patch series decouples the detachment of a block device from the removal of the backing pci-device. Removal of a hotplugged pci device requires the guest to respond before qemu tears down the block device. In some cases, the guest may not respond, leaving the guest with continued access to the block device. The new monitor command, drive_unplug, will revoke a guest's access to the block device independently of the removal of the pci device. The first patch adds a new drive find method, the second patch implements the monitor command and block layer changes. Reviewed the monitor part, I think we're waiting for Kevin's and/or Markus's ACK to get this merged? Yes, I'm waiting for Markus' comments before applying this. Maybe we shouldn't add this to QMP yet, drive_add doesn't exist there either because we want to do it right with blockdev_add/del. On the other hand Markus should know much better than me what the right thing would look like - if he says that this interface will work with whatever we're going to get with blockdev_*, I'm fine with merging it. I can go either way. Anthony asked me to re-spin with a QMP version; but the libvirt patch works either via TextMonitor or JSONMonitor. -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx ry...@us.ibm.com
Re: Fwd: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
On Fri, 22 Oct 2010, Kevin Wolf wrote:

[ Adding qemu-devel to CC again ]

On 21.10.2010 20:59, Sage Weil wrote:

On Thu, 21 Oct 2010, Christian Brunner wrote:

Hi, is there a flush operation in librados? - I guess the only way to handle this, would be waiting until all aio requests are finished?

That's not the semantics of bdrv_flush, you don't need to wait for running requests. You just need to make sure that all completed requests are safe on disk so that they would persist even in case of a crash/power failure.

Okay, in that case we're fine. librados doesn't declare a write committed until it is safely on disk on multiple backend nodes. There is a mechanism to get an ack sooner, but the qemu storage driver does not use it.

There is no flush currently. But librados does no caching, so in this case at least silently upgrading to cache=writethrough should work.

You're making sure that the data can't be cached in the server's page cache or volatile disk cache either, e.g. by using O_SYNC for the image file? If so, upgrading would be safe.

Right.

If that's a problem, we can implement a flush. Just let us know.

Presumably providing a writeback mode with explicit flushes could improve performance. Upgrading to writethrough is not a correctness problem, though, so it's your decision if you want to implement it.

So is a bdrv_flush generated when e.g. the guest filesystem issues a barrier, or would otherwise normally ask a SATA disk to flush its cache?

sage

Kevin

-- Forwarded message --
From: Kevin Wolf kw...@redhat.com
Date: 2010/10/21
Subject: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
To: Christian Brunner c...@muc.de, Laurent Vivier laur...@vivier.eu, MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
Cc: Qemu-devel@nongnu.org

Hi all,

I'm currently looking into adding a return value to qemu's bdrv_flush function and I noticed that your block drivers (nbd, rbd and sheepdog) don't implement bdrv_flush at all. bdrv_flush is going to return -ENOTSUP for any block driver not implementing this, effectively breaking these three drivers for anything but cache=unsafe.

Is there a specific reason why your drivers don't implement this? I think I remember that one of the drivers always provides cache=writethrough semantics. It would be okay to silently upgrade to cache=writethrough, so in this case I'd just need to add an empty bdrv_flush implementation.

Otherwise, we really cannot allow any option except cache=unsafe because that's the semantics provided by the driver.

In any case, I think it would be a good idea to implement a real bdrv_flush function to allow the write-back cache modes cache=off and cache=writeback in order to improve performance over writethrough.

Is this possible with your protocols, or can the protocol be changed to consider this? Any hints on how to proceed?

Kevin

--
To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 2/4] Silence compiler warning in json test case
Luiz Capitulino lcapitul...@redhat.com writes:

> From: Jan Kiszka jan.kis...@web.de
>
> This avoids
>
>     error: zero-length gnu_printf format string
>
> Signed-off-by: Jan Kiszka jan.kis...@siemens.com
> Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
> ---
>  check-qjson.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/check-qjson.c b/check-qjson.c
> index 0b60e45..64fcdcb 100644
> --- a/check-qjson.c
> +++ b/check-qjson.c
> @@ -639,7 +639,9 @@ END_TEST
>
>  START_TEST(empty_input)
>  {
> -    QObject *obj = qobject_from_json("");
> +    const char *empty = "";
> +
> +    QObject *obj = qobject_from_json(empty);
>      fail_unless(obj == NULL);
>  }
>  END_TEST

The warning is silly. Printing nothing is unlikely to happen unintentionally, and is perfectly well-defined and portable. Why make the code ugly to avoid a useless warning, when we can disable the warning?
Re: [Qemu-devel] KVM call minutes for Oct 19
On 10/22/2010 12:29 PM, Chris Wright wrote:
> * Anthony Liguori (anth...@codemonkey.ws) wrote:
>> The first step is just identifying what interfaces we need in a guest agent. So far, I think we can get away with a very small number of interfaces (mainly read/write files, execute command).
>
> Could you elaborate here? I can't imagine you mean:
>
>   vm_system(target_vm, "shutdown -r now")
>
> But from the other post, it does seem you want the complexity on the host side, not the guest side, of the agent. Seems vm_reboot(target_vm) as the RPC makes more sense, with the guest side implementing that in whatever guest-specific way is appropriate.

What I really want is a vm_system API that a guest agent MUST implement and then APIs like vm_reboot that a guest agent MAY implement.

In my mind, the guest agent lives in the distros even though it's built from the QEMU source tree. We don't install it ourselves. That means we might have a new funky fresh Fedora 21 version of QEMU but be running an old Fedora 14 guest with a really back-level guest agent.

Having very low-level APIs with logic that primarily lives in QEMU gives us the ability to support new features in QEMU with older guests.

Regards,
Anthony Liguori
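The MUST/MAY split proposed above could look roughly like the sketch below. Everything in it is hypothetical: SketchAgent, the AGENT_CAP_REBOOT capability bit, and the way the "agent" just records the request are stand-ins to show the fallback logic, not a proposed wire protocol.

```c
#include <stdio.h>
#include <string.h>

/* All names below are hypothetical; they only illustrate the MUST/MAY split. */
enum { AGENT_CAP_REBOOT = 1 };

typedef struct SketchAgent {
    unsigned caps;          /* optional RPCs advertised by the guest agent */
    char last_command[64];  /* records what the agent was last asked to do */
} SketchAgent;

/* MUST: every agent, however old, can run a command line. */
static int vm_system(SketchAgent *agent, const char *cmd)
{
    snprintf(agent->last_command, sizeof(agent->last_command), "%s", cmd);
    return 0;
}

/* MAY: use the dedicated RPC when the agent advertises it, otherwise let
 * the host-side logic fall back to the low-level primitive. */
static int vm_reboot(SketchAgent *agent)
{
    if (agent->caps & AGENT_CAP_REBOOT) {
        snprintf(agent->last_command, sizeof(agent->last_command),
                 "<native reboot>");
        return 0;
    }
    return vm_system(agent, "shutdown -r now");
}
```

With this shape, a back-level agent that only knows vm_system still supports a reboot request, at the cost of the host having to know the guest-specific command.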
Re: [Qemu-devel] [PATCH 2/4] Silence compiler warning in json test case
On Fri, 22 Oct 2010 19:15:07 +0200 Markus Armbruster arm...@redhat.com wrote:

> Luiz Capitulino lcapitul...@redhat.com writes:
>
>> From: Jan Kiszka jan.kis...@web.de
>>
>> This avoids
>>
>>     error: zero-length gnu_printf format string
>>
>> Signed-off-by: Jan Kiszka jan.kis...@siemens.com
>> Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
>> ---
>>  check-qjson.c |    4 +++-
>>  1 files changed, 3 insertions(+), 1 deletions(-)
>>
>> diff --git a/check-qjson.c b/check-qjson.c
>> index 0b60e45..64fcdcb 100644
>> --- a/check-qjson.c
>> +++ b/check-qjson.c
>> @@ -639,7 +639,9 @@ END_TEST
>>
>>  START_TEST(empty_input)
>>  {
>> -    QObject *obj = qobject_from_json("");
>> +    const char *empty = "";
>> +
>> +    QObject *obj = qobject_from_json(empty);
>>      fail_unless(obj == NULL);
>>  }
>>  END_TEST
>
> The warning is silly. Printing nothing is unlikely to happen unintentionally, and is perfectly well-defined and portable. Why make the code ugly to avoid a useless warning, when we can disable the warning?

You mean, disable it only for this specific case or QEMU-wide? If it's the former, please submit a patch. Otherwise, this has been discussed already and the conclusion was that the warning is useful:

http://www.mail-archive.com/qemu-devel@nongnu.org/msg44072.html

Honestly speaking, no matter what the conclusion is, what cannot happen is having code that doesn't compile in the tree. Either we apply this patch or we revert the patch that broke the build.
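For readers unfamiliar with the warning being argued about, here is a minimal sketch of the workaround in the patch. It assumes a GCC-compatible compiler; formatted_length is a hypothetical helper, not a QEMU function, and is declared with a printf format attribute so the compiler checks its format argument the way it would for qobject_from_json().

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style helper; the attribute asks GCC-compatible
 * compilers to check the format argument of every call. */
#if defined(__GNUC__)
__attribute__((format(printf, 1, 2)))
#endif
static int formatted_length(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(NULL, 0, fmt, ap);  /* characters that would be written */
    va_end(ap);
    return n;
}

static int empty_via_variable(void)
{
    /* Passing the literal "" directly is what the zero-length format
     * diagnostic complains about; going through a variable, as the patch
     * does, keeps the literal out of the call expression. */
    const char *empty = "";

    return formatted_length(empty);
}
```

The behaviour is identical either way (nothing is formatted), which is Markus's point: the change only exists to satisfy the compiler.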
Re: [Qemu-devel] KVM call minutes for Oct 19
* Anthony Liguori (anth...@codemonkey.ws) wrote:
> The first step is just identifying what interfaces we need in a guest agent. So far, I think we can get away with a very small number of interfaces (mainly read/write files, execute command).

Could you elaborate here? I can't imagine you mean:

  vm_system(target_vm, "shutdown -r now")

But from the other post, it does seem you want the complexity on the host side, not the guest side, of the agent. Seems vm_reboot(target_vm) as the RPC makes more sense, with the guest side implementing that in whatever guest-specific way is appropriate.

thanks,
-chris
[Qemu-devel] Re: [PATCH] sparc32: convert debug printf statements to tracepoints
On Fri, Oct 22, 2010 at 3:22 PM, Stefan Hajnoczi stefa...@linux.vnet.ibm.com wrote:
> On Thu, Oct 21, 2010 at 08:13:51PM +, Blue Swirl wrote:
>> Replace debug printf statements with tracepoints.
>>
>> Signed-off-by: Blue Swirl blauwir...@gmail.com
>> ---
>> There were problems with tracetool if the tracepoint string contained parentheses or brackets, so I removed those.
>>
>>  hw/cs4231.c | 20 ++---
>>  hw/eccmemctl.c | 46 +-
>>  hw/lance.c | 14 ++-
>>  hw/slavio_intctl.c | 38 ++
>>  hw/slavio_misc.c | 51 ++---
>>  hw/slavio_timer.c | 49 ---
>>  hw/sparc32_dma.c | 42 ++--
>>  hw/sun4m.c | 18 ++--
>>  hw/sun4m_iommu.c | 33 ---
>>  trace-events | 108
>>  10 files changed, 213 insertions(+), 206 deletions(-)
>
> Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

Thanks.

> Out of interest, why is %8.8x used instead of %08x? Is there a difference in meaning or just style?

Both seem to produce the same results. %08x is a bit shorter, so that should be preferred.
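The claim that the two conversions agree is easy to check. In %8.8x the precision asks for at least eight digits (zero-filled) in a field of width eight; in %08x the 0 flag zero-pads to width eight. A small sketch, with same_hex_output as a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if "%8.8x" and "%08x" format v identically, 0 otherwise. */
static int same_hex_output(unsigned int v)
{
    char with_precision[32];
    char with_zero_flag[32];

    snprintf(with_precision, sizeof(with_precision), "%8.8x", v);
    snprintf(with_zero_flag, sizeof(with_zero_flag), "%08x", v);
    return strcmp(with_precision, with_zero_flag) == 0;
}
```

For unsigned hexadecimal output the two forms give the same text, so the choice between them is a matter of style.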
Re: [Qemu-devel] [PATCH 3/3] Add helper functions to enable virtio-9p make use of the threadlets
On 10/21/2010 5:10 AM, Arun R Bharadwaj wrote: From: Gautham R Shenoy e...@in.ibm.com infrastructure for offloading blocking tasks such as making posix calls on to the helper threads and handle the post_posix_operations() from the context of the iothread. This frees the vcpu thread to process any other guest operations while the processing of v9fs_io is in progress. Signed-off-by: Gautham R Shenoy e...@in.ibm.com Signed-off-by: Sripathi Kodi sripat...@in.ibm.com Signed-off-by: Arun R Bharadwaj a...@linux.vnet.ibm.com --- hw/virtio-9p.c | 168 posix-aio-compat.c | 48 ++- qemu-threadlets.c | 21 +++ qemu-threadlets.h |1 vl.c |3 + 5 files changed, 211 insertions(+), 30 deletions(-) diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c index a871685..f9a7b7d 100644 --- a/hw/virtio-9p.c +++ b/hw/virtio-9p.c @@ -18,6 +18,7 @@ #include fsdev/qemu-fsdev.h #include virtio-9p-debug.h #include virtio-9p-xattr.h +#include qemu-threadlets.h int debug_9p_pdu; @@ -33,6 +34,149 @@ enum { Oappend = 0x80, }; +struct v9fs_post_op { +QTAILQ_ENTRY(v9fs_post_op) node; +void (*func)(void *arg); +void *arg; +}; + +static struct { +int rfd; +int wfd; +QemuMutex lock; +QTAILQ_HEAD(, v9fs_post_op) post_op_list; +} v9fs_async_struct; + +static void die2(int err, const char *what) +{ +fprintf(stderr, %s failed: %s\n, what, strerror(err)); +abort(); +} + +static void die(const char *what) +{ +die2(errno, what); +} + +#define ASYNC_MAX_PROCESS 5 + +/** + * v9fs_process_post_ops: Process any pending v9fs_post_posix_operation + * @arg: Not used. + * + * This function serves as a callback to the iothread to be called into whenever + * the v9fs_async_struct.wfd is written into. This thread goes through the list + * of v9fs_post_posix_operations() and executes them. In the process, it might + * queue more job on the asynchronous thread pool. 
+ */ +static void v9fs_process_post_ops(void *arg) +{ +int count = 0; +struct v9fs_post_op *post_op; +int ret; +char byte; + +qemu_mutex_lock(v9fs_async_struct.lock); +do { +ret = read(v9fs_async_struct.rfd, byte, sizeof(byte)); +} while (ret = 0 errno != EAGAIN); + +for (count = 0; count ASYNC_MAX_PROCESS; count++) { +if (QTAILQ_EMPTY((v9fs_async_struct.post_op_list))) { +break; +} +post_op = QTAILQ_FIRST((v9fs_async_struct.post_op_list)); +QTAILQ_REMOVE((v9fs_async_struct.post_op_list), post_op, node); + +qemu_mutex_unlock(v9fs_async_struct.lock); +post_op-func(post_op-arg); +qemu_free(post_op); +qemu_mutex_lock(v9fs_async_struct.lock); +} +qemu_mutex_unlock(v9fs_async_struct.lock); +} + +/** + * v9fs_async_signal: Inform the io-thread of completion of async job. + * + * This function is used to inform the iothread that a particular + * async-operation pertaining to v9fs has been completed and that the io thread + * can handle the v9fs_post_posix_operation. + * + * This is based on the aio_signal_handler + */ +static inline void v9fs_async_signal(void) +{ +char byte = 0; +ssize_t ret; +int tries = 0; + +qemu_mutex_lock(v9fs_async_struct.lock); +do { +assert(tries != 100); + ret = write(v9fs_async_struct.wfd, byte, sizeof(byte)); +tries++; +} while (ret 0 errno == EAGAIN); +qemu_mutex_unlock(v9fs_async_struct.lock); + +if (ret 0 errno != EAGAIN) { +die(write() in v9fs); +} + +if (kill(getpid(), SIGUSR2)) { +die(kill failed); +} +} + +/** + * v9fs_async_helper_done: Marks the completion of the v9fs_async job + * @func: v9fs_post_posix_func() for post-processing invoked in the context of + *the io-thread + * @arg: Argument to func. + * + * This function is called from the context of one of the asynchronous threads + * in the thread pool. This is called when the asynchronous thread has finished + * executing a v9fs_posix_operation. It's purpose is to initiate the process of + * informing the io-thread that the v9fs_posix_operation has completed. 
+ */ +static void v9fs_async_helper_done(void (*func)(void *arg), void *arg) +{ +struct v9fs_post_op *post_op; + +post_op = qemu_mallocz(sizeof(*post_op)); +post_op-func = func; +post_op-arg = arg; + +qemu_mutex_lock(v9fs_async_struct.lock); +QTAILQ_INSERT_TAIL((v9fs_async_struct.post_op_list), post_op, node); +qemu_mutex_unlock(v9fs_async_struct.lock); + +v9fs_async_signal(); +} + +/** + * v9fs_do_async_posix: Offload v9fs_posix_operation onto async thread. + * @vs: V9fsOPState variable for the OP operation. + * @posix_fn: The posix function which has to be offloaded onto async thread. + *
Re: [Qemu-devel] KVM call minutes for Oct 19
* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 10/22/2010 12:29 PM, Chris Wright wrote:
>> * Anthony Liguori (anth...@codemonkey.ws) wrote:
>>> The first step is just identifying what interfaces we need in a guest agent. So far, I think we can get away with a very small number of interfaces (mainly read/write files, execute command).
>>
>> Could you elaborate here? I can't imagine you mean:
>>
>>   vm_system(target_vm, "shutdown -r now")
>>
>> But from the other post, it does seem you want the complexity on the host side, not the guest side, of the agent. Seems vm_reboot(target_vm) as the RPC makes more sense, with the guest side implementing that in whatever guest-specific way is appropriate.
>
> What I really want is a vm_system API that a guest agent MUST implement and then APIs like vm_reboot that a guest agent MAY implement.
>
> In my mind, the guest agent lives in the distros even though it's built from the QEMU source tree. We don't install it ourselves. That means we might have a new funky fresh Fedora 21 version of QEMU but be running an old Fedora 14 guest with a really back-level guest agent.
>
> Having very low-level APIs with logic that primarily lives in QEMU gives us the ability to support new features in QEMU with older guests.

I'm not sure about that. That same new shiny Fedora 21 QEMU has no idea what the right OS-specific command to run in the guest is. Granted, reboot or shutdown -r now are not likely to change for Linux guests, but do we assume cygwin for Windows guests? It really seems to make more sense to have a stable ABI and negotiate the version.

Also, from the point of view of a cloud, where a VM agent is awfully close to the provider having a backdoor into the VM... a freeform vm_system() doesn't seem like it'd be real popular.

thanks,
-chris
[Qemu-devel] Re: [PATCH 2/2] Replace remaining gcc format attributes by macro GCC_FMT_ATTR (format checking)
Thanks, applied both. On Fri, Oct 22, 2010 at 10:33 AM, Stefan Weil w...@mail.berlios.de wrote: Am 13.10.2010 20:54, schrieb Stefan Weil: Replace the remaining format attribute printf by macro GCC_FMT_ATTR which uses gnu_printf (if supported). v2 * Removal of dyngen specific code is now done in a separate patch. * Handle attribute in new ui/spice-display.c, too. Cc: Blue Swirlblauwir...@gmail.com Signed-off-by: Stefan Weilw...@mail.berlios.de --- cpu-all.h | 2 +- ui/spice-display.c | 3 +-- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/cpu-all.h b/cpu-all.h index 67a3266..11edddc 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -773,7 +773,7 @@ void cpu_dump_statistics (CPUState *env, FILE *f, int flags); void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...) - __attribute__ ((__format__ (__printf__, 2, 3))); + GCC_FMT_ATTR(2, 3); extern CPUState *first_cpu; extern CPUState *cpu_single_env; diff --git a/ui/spice-display.c b/ui/spice-display.c index 6702dfd..7b4f5c1 100644 --- a/ui/spice-display.c +++ b/ui/spice-display.c @@ -29,8 +29,7 @@ static int debug = 0; -static void __attribute__((format(printf,2,3))) -dprint(int level, const char *fmt, ...) +static void GCC_FMT_ATTR(2, 3) dprint(int level, const char *fmt, ...) { va_list args; Can these two patches be applied to qemu master, or is there still something missing? They are needed for additional format checks. Regards, Stefan
[Qemu-devel] [PATCH] Use '_raw' memory access primitives
Using a pointer on the host should not go through lduw.

Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu
---
 linux-user/signal.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 77683f7..097da9d 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -982,8 +982,8 @@ restore_sigcontext(CPUX86State *env, struct target_sigcontext *sc, int *peax)
     env->regs[R_ECX] = tswapl(sc->ecx);
     env->eip = tswapl(sc->eip);
 
-    cpu_x86_load_seg(env, R_CS, lduw(&sc->cs) | 3);
-    cpu_x86_load_seg(env, R_SS, lduw(&sc->ss) | 3);
+    cpu_x86_load_seg(env, R_CS, lduw_raw(&sc->cs) | 3);
+    cpu_x86_load_seg(env, R_SS, lduw_raw(&sc->ss) | 3);
 
     tmpflags = tswapl(sc->eflags);
     env->eflags = (env->eflags & ~0x40DD5) | (tmpflags & 0x40DD5);
-- 
1.7.1

-- 
And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer.
-- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
[Qemu-devel] [PATCH] Allow generation of helpers with 5 arguments
Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu --- def-helper.h | 26 ++ 1 files changed, 26 insertions(+), 0 deletions(-) diff --git a/def-helper.h b/def-helper.h index 8a822c7..7abaf76 100644 --- a/def-helper.h +++ b/def-helper.h @@ -118,6 +118,8 @@ DEF_HELPER_FLAGS_3(name, 0, ret, t1, t2, t3) #define DEF_HELPER_4(name, ret, t1, t2, t3, t4) \ DEF_HELPER_FLAGS_4(name, 0, ret, t1, t2, t3, t4) +#define DEF_HELPER_5(name, ret, t1, t2, t3, t4, t5) \ +DEF_HELPER_FLAGS_5(name, 0, ret, t1, t2, t3, t4, t5) #endif /* DEF_HELPER_H */ @@ -139,6 +141,10 @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3)); #define DEF_HELPER_FLAGS_4(name, flags, ret, t1, t2, t3, t4) \ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \ dh_ctype(t4)); +#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \ +dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \ +dh_ctype(t4), dh_ctype(t5)); + #undef GEN_HELPER #define GEN_HELPER -1 @@ -203,6 +209,22 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1 tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 4, args); \ } +#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \ +static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1), \ +dh_arg_decl(t2, 2), dh_arg_decl(t3, 3), dh_arg_decl(t4, 4), \ + dh_arg_decl(t5, 5)) \ +{ \ + TCGArg args[5]; \ + int sizemask; \ + sizemask = dh_is_64bit(ret); \ + dh_arg(t1, 1); \ + dh_arg(t2, 2); \ + dh_arg(t3, 3); \ + dh_arg(t4, 4); \ + dh_arg(t5, 5); \ + tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 5, args); \ +} + #undef GEN_HELPER #define GEN_HELPER -1 @@ -224,6 +246,9 @@ DEF_HELPER_FLAGS_0(name, flags, ret) #define DEF_HELPER_FLAGS_4(name, flags, ret, t1, t2, t3, t4) \ DEF_HELPER_FLAGS_0(name, flags, ret) +#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \ +DEF_HELPER_FLAGS_0(name, flags, ret) + #undef GEN_HELPER #define 
GEN_HELPER -1 @@ -235,6 +260,7 @@ DEF_HELPER_FLAGS_0(name, flags, ret) #undef DEF_HELPER_FLAGS_2 #undef DEF_HELPER_FLAGS_3 #undef DEF_HELPER_FLAGS_4 +#undef DEF_HELPER_FLAGS_5 #undef GEN_HELPER #endif -- 1.7.1 -- And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer. -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
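The pattern the patch extends (a DEF_HELPER_N macro that forwards to DEF_HELPER_FLAGS_N with flags set to 0) can be illustrated with a heavily simplified sketch. The macros below are stand-ins written for this example: the real def-helper.h maps type tokens through dh_ctype() and is included several times with different GEN_HELPER values, none of which is modelled here, and helper_sum5 is a made-up helper.

```c
/* Simplified stand-ins for the def-helper.h machinery (prototype pass only). */
#define HELPER(name) helper_##name

#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \
    ret HELPER(name)(t1, t2, t3, t4, t5);

/* The unflagged form just forwards with flags == 0, as in the patch. */
#define DEF_HELPER_5(name, ret, t1, t2, t3, t4, t5) \
    DEF_HELPER_FLAGS_5(name, 0, ret, t1, t2, t3, t4, t5)

/* Expands to the prototype: int helper_sum5(int, int, int, int, int); */
DEF_HELPER_5(sum5, int, int, int, int, int, int)

/* Hypothetical five-argument helper matching that prototype. */
int helper_sum5(int a, int b, int c, int d, int e)
{
    return a + b + c + d + e;
}
```

In the real header the same DEF_HELPER_FLAGS_5 name is redefined per pass, which is why the patch adds it in three places and then undefines it at the end.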
[Qemu-devel] [PATCH] [arm m68k] Move helpers.h to helper.h
This provides a consistent naming scheme across all targets. Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu --- target-arm/helper.c|2 +- target-arm/helper.h| 450 target-arm/helpers.h | 450 target-arm/iwmmxt_helper.c |2 +- target-arm/neon_helper.c |2 +- target-arm/op_helper.c |2 +- target-arm/translate.c |6 +- target-m68k/helper.c |2 +- target-m68k/helper.h | 54 ++ target-m68k/helpers.h | 54 -- target-m68k/op_helper.c|2 +- target-m68k/translate.c|6 +- 12 files changed, 516 insertions(+), 516 deletions(-) create mode 100644 target-arm/helper.h delete mode 100644 target-arm/helpers.h create mode 100644 target-m68k/helper.h delete mode 100644 target-m68k/helpers.h diff --git a/target-arm/helper.c b/target-arm/helper.c index 2dd64d9..08afe24 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -5,7 +5,7 @@ #include cpu.h #include exec-all.h #include gdbstub.h -#include helpers.h +#include helper.h #include qemu-common.h #include host-utils.h #if !defined(CONFIG_USER_ONLY) diff --git a/target-arm/helper.h b/target-arm/helper.h new file mode 100644 index 000..0d1bc47 --- /dev/null +++ b/target-arm/helper.h @@ -0,0 +1,450 @@ +#include def-helper.h + +DEF_HELPER_1(clz, i32, i32) +DEF_HELPER_1(sxtb16, i32, i32) +DEF_HELPER_1(uxtb16, i32, i32) + +DEF_HELPER_2(add_setq, i32, i32, i32) +DEF_HELPER_2(add_saturate, i32, i32, i32) +DEF_HELPER_2(sub_saturate, i32, i32, i32) +DEF_HELPER_2(add_usaturate, i32, i32, i32) +DEF_HELPER_2(sub_usaturate, i32, i32, i32) +DEF_HELPER_1(double_saturate, i32, s32) +DEF_HELPER_2(sdiv, s32, s32, s32) +DEF_HELPER_2(udiv, i32, i32, i32) +DEF_HELPER_1(rbit, i32, i32) +DEF_HELPER_1(abs, i32, i32) + +#define PAS_OP(pfx) \ +DEF_HELPER_3(pfx ## add8, i32, i32, i32, ptr) \ +DEF_HELPER_3(pfx ## sub8, i32, i32, i32, ptr) \ +DEF_HELPER_3(pfx ## sub16, i32, i32, i32, ptr) \ +DEF_HELPER_3(pfx ## add16, i32, i32, i32, ptr) \ +DEF_HELPER_3(pfx ## addsubx, i32, i32, i32, ptr) \ +DEF_HELPER_3(pfx ## subaddx, i32, i32, i32, ptr) + +PAS_OP(s) +PAS_OP(u) 
+#undef PAS_OP + +#define PAS_OP(pfx) \ +DEF_HELPER_2(pfx ## add8, i32, i32, i32) \ +DEF_HELPER_2(pfx ## sub8, i32, i32, i32) \ +DEF_HELPER_2(pfx ## sub16, i32, i32, i32) \ +DEF_HELPER_2(pfx ## add16, i32, i32, i32) \ +DEF_HELPER_2(pfx ## addsubx, i32, i32, i32) \ +DEF_HELPER_2(pfx ## subaddx, i32, i32, i32) +PAS_OP(q) +PAS_OP(sh) +PAS_OP(uq) +PAS_OP(uh) +#undef PAS_OP + +DEF_HELPER_2(ssat, i32, i32, i32) +DEF_HELPER_2(usat, i32, i32, i32) +DEF_HELPER_2(ssat16, i32, i32, i32) +DEF_HELPER_2(usat16, i32, i32, i32) + +DEF_HELPER_2(usad8, i32, i32, i32) + +DEF_HELPER_1(logicq_cc, i32, i64) + +DEF_HELPER_3(sel_flags, i32, i32, i32, i32) +DEF_HELPER_1(exception, void, i32) +DEF_HELPER_0(wfi, void) + +DEF_HELPER_2(cpsr_write, void, i32, i32) +DEF_HELPER_0(cpsr_read, i32) + +DEF_HELPER_3(v7m_msr, void, env, i32, i32) +DEF_HELPER_2(v7m_mrs, i32, env, i32) + +DEF_HELPER_3(set_cp15, void, env, i32, i32) +DEF_HELPER_2(get_cp15, i32, env, i32) + +DEF_HELPER_3(set_cp, void, env, i32, i32) +DEF_HELPER_2(get_cp, i32, env, i32) + +DEF_HELPER_2(get_r13_banked, i32, env, i32) +DEF_HELPER_3(set_r13_banked, void, env, i32, i32) + +DEF_HELPER_1(get_user_reg, i32, i32) +DEF_HELPER_2(set_user_reg, void, i32, i32) + +DEF_HELPER_1(vfp_get_fpscr, i32, env) +DEF_HELPER_2(vfp_set_fpscr, void, env, i32) + +DEF_HELPER_3(vfp_adds, f32, f32, f32, env) +DEF_HELPER_3(vfp_addd, f64, f64, f64, env) +DEF_HELPER_3(vfp_subs, f32, f32, f32, env) +DEF_HELPER_3(vfp_subd, f64, f64, f64, env) +DEF_HELPER_3(vfp_muls, f32, f32, f32, env) +DEF_HELPER_3(vfp_muld, f64, f64, f64, env) +DEF_HELPER_3(vfp_divs, f32, f32, f32, env) +DEF_HELPER_3(vfp_divd, f64, f64, f64, env) +DEF_HELPER_1(vfp_negs, f32, f32) +DEF_HELPER_1(vfp_negd, f64, f64) +DEF_HELPER_1(vfp_abss, f32, f32) +DEF_HELPER_1(vfp_absd, f64, f64) +DEF_HELPER_2(vfp_sqrts, f32, f32, env) +DEF_HELPER_2(vfp_sqrtd, f64, f64, env) +DEF_HELPER_3(vfp_cmps, void, f32, f32, env) +DEF_HELPER_3(vfp_cmpd, void, f64, f64, env) +DEF_HELPER_3(vfp_cmpes, void, f32, f32, 
env) +DEF_HELPER_3(vfp_cmped, void, f64, f64, env) + +DEF_HELPER_2(vfp_fcvtds, f64, f32, env) +DEF_HELPER_2(vfp_fcvtsd, f32, f64, env) + +DEF_HELPER_2(vfp_uitos, f32, f32, env) +DEF_HELPER_2(vfp_uitod, f64, f32, env) +DEF_HELPER_2(vfp_sitos, f32, f32, env) +DEF_HELPER_2(vfp_sitod, f64, f32, env) + +DEF_HELPER_2(vfp_touis, f32, f32, env) +DEF_HELPER_2(vfp_touid, f32, f64, env) +DEF_HELPER_2(vfp_touizs, f32, f32, env) +DEF_HELPER_2(vfp_touizd, f32, f64, env) +DEF_HELPER_2(vfp_tosis, f32, f32, env) +DEF_HELPER_2(vfp_tosid, f32, f64, env) +DEF_HELPER_2(vfp_tosizs, f32, f32, env) +DEF_HELPER_2(vfp_tosizd, f32, f64, env) + +DEF_HELPER_3(vfp_toshs, f32, f32, i32, env) +DEF_HELPER_3(vfp_tosls, f32, f32, i32, env)
[Qemu-devel] [PATCH] Refactor flush of per-CPU virtual TB cache
Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu
---
 exec-all.h |    2 ++
 exec.c     |    9 +++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 3a53fe6..2ae09c5 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -182,6 +182,8 @@ static inline unsigned int tb_phys_hash_func(tb_page_addr_t pc)
 
 TranslationBlock *tb_alloc(target_ulong pc);
 void tb_free(TranslationBlock *tb);
+/** Flush the per-CPU virtual translation block cache. */
+void tb_flush_jmp_cache(CPUState *env);
 void tb_flush(CPUState *env);
 void tb_link_page(TranslationBlock *tb,
                   tb_page_addr_t phys_pc, tb_page_addr_t phys_page2);
diff --git a/exec.c b/exec.c
index 1fbe91c..516960a 100644
--- a/exec.c
+++ b/exec.c
@@ -688,6 +688,11 @@ static void page_flush_tb(void)
     }
 }
 
+void tb_flush_jmp_cache (CPUState * env)
+{
+    memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
+}
+
 /* flush all the translation blocks */
 /* XXX: tb_flush is currently not thread safe */
 void tb_flush(CPUState *env1)
@@ -705,7 +710,7 @@ void tb_flush(CPUState *env1)
     nb_tbs = 0;
 
     for(env = first_cpu; env != NULL; env = env->next_cpu) {
-        memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
+        tb_flush_jmp_cache(env);
     }
 
     memset (tb_phys_hash, 0, CODE_GEN_PHYS_HASH_SIZE * sizeof (void *));
@@ -1941,7 +1946,7 @@ void tlb_flush(CPUState *env, int flush_global)
         }
     }
 
-    memset (env->tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
+    tb_flush_jmp_cache(env);
 
     env->tlb_flush_addr = -1;
     env->tlb_flush_mask = 0;
-- 
1.7.1

-- 
And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer.
-- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
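The refactoring simply gives the repeated memset a name. A stand-alone sketch of the same idea, using a hypothetical SketchCPU structure and cache size in place of CPUState and TB_JMP_CACHE_SIZE:

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_JMP_CACHE_SIZE 8   /* hypothetical; stands in for TB_JMP_CACHE_SIZE */

/* Hypothetical stand-in for the part of CPUState that holds the cache. */
typedef struct SketchCPU {
    void *tb_jmp_cache[SKETCH_JMP_CACHE_SIZE];
} SketchCPU;

/* One named helper instead of an open-coded memset at every call site. */
static void sketch_flush_jmp_cache(SketchCPU *cpu)
{
    memset(cpu->tb_jmp_cache, 0, SKETCH_JMP_CACHE_SIZE * sizeof(void *));
}

/* Returns 1 if every cache slot is empty, 0 otherwise. */
static int sketch_jmp_cache_is_empty(const SketchCPU *cpu)
{
    size_t i;

    for (i = 0; i < SKETCH_JMP_CACHE_SIZE; i++) {
        if (cpu->tb_jmp_cache[i] != NULL) {
            return 0;
        }
    }
    return 1;
}
```

Callers such as the full translation-block flush and the TLB flush then share one definition of what "flush the jump cache" means.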
[Qemu-devel] [RFC][PATCH 02/15] virtproxy: qemu-vp, standalone daemon skeleton
Daemon to be run in guest, or on host in standalone mode. (re-)implements some qemu utility functions used by core virtproxy.c code via wrapper functions. For built-in virtproxy code we will define these wrapper functions in terms of qemu's built-in implementations. Main logic will come in a later patch. Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- qemu-vp.c | 151 + 1 files changed, 151 insertions(+), 0 deletions(-) create mode 100644 qemu-vp.c diff --git a/qemu-vp.c b/qemu-vp.c new file mode 100644 index 000..5075cdc --- /dev/null +++ b/qemu-vp.c @@ -0,0 +1,151 @@ +/* + * virt-proxy - host/guest communication daemon + * + * Copyright IBM Corp. 2010 + * + * Authors: + * Michael Roth mdr...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include virtproxy.h + +/* mirror qemu I/O-related code for standalone daemon */ +typedef struct IOHandlerRecord { +int fd; +IOCanReadHandler *fd_read_poll; +IOHandler *fd_read; +IOHandler *fd_write; +int deleted; +void *opaque; +/* temporary data */ +struct pollfd *ufd; +QLIST_ENTRY(IOHandlerRecord) next; +} IOHandlerRecord; + +static QLIST_HEAD(, IOHandlerRecord) io_handlers = +QLIST_HEAD_INITIALIZER(io_handlers); + +int vp_set_fd_handler2(int fd, + IOCanReadHandler *fd_read_poll, + IOHandler *fd_read, + IOHandler *fd_write, + void *opaque) +{ +IOHandlerRecord *ioh; + +if (!fd_read !fd_write) { +QLIST_FOREACH(ioh, io_handlers, next) { +if (ioh-fd == fd) { +ioh-deleted = 1; +break; +} +} +} else { +QLIST_FOREACH(ioh, io_handlers, next) { +if (ioh-fd == fd) +goto found; +} +ioh = qemu_mallocz(sizeof(IOHandlerRecord)); +QLIST_INSERT_HEAD(io_handlers, ioh, next); +found: +ioh-fd = fd; +ioh-fd_read_poll = fd_read_poll; +ioh-fd_read = fd_read; +ioh-fd_write = fd_write; +ioh-opaque = opaque; +ioh-deleted = 0; +} +return 0; +} + +int vp_set_fd_handler(int fd, +IOHandler *fd_read, +IOHandler *fd_write, +void 
*opaque) +{ +return vp_set_fd_handler2(fd, NULL, fd_read, fd_write, opaque); +} + +int vp_send_all(int fd, const void *buf, int len1) +{ +int ret, len; + +len = len1; +while (len 0) { +ret = write(fd, buf, len); +if (ret 0) { +if (errno != EINTR errno != EAGAIN) { +warn(write() failed); +return -1; +} +} else if (ret == 0) { +break; +} else { +buf += ret; +len -= ret; +} +} +return len1 - len; +} + +static void main_loop_wait(int nonblocking) +{ +IOHandlerRecord *ioh; +fd_set rfds, wfds, xfds; +int ret, nfds; +struct timeval tv; +int timeout = 1000; + +if (nonblocking) { +timeout = 0; +} + +/* poll any events */ +nfds = -1; +FD_ZERO(rfds); +FD_ZERO(wfds); +FD_ZERO(xfds); +QLIST_FOREACH(ioh, io_handlers, next) { +if (ioh-deleted) +continue; +if (ioh-fd_read +(!ioh-fd_read_poll || + ioh-fd_read_poll(ioh-opaque) != 0)) { +FD_SET(ioh-fd, rfds); +if (ioh-fd nfds) +nfds = ioh-fd; +} +if (ioh-fd_write) { +FD_SET(ioh-fd, wfds); +if (ioh-fd nfds) +nfds = ioh-fd; +} +} + +tv.tv_sec = timeout / 1000; +tv.tv_usec = (timeout % 1000) * 1000; + +ret = select(nfds + 1, rfds, wfds, xfds, tv); + +if (ret 0) { +IOHandlerRecord *pioh; + +QLIST_FOREACH_SAFE(ioh, io_handlers, next, pioh) { +if (ioh-deleted) { +QLIST_REMOVE(ioh, next); +qemu_free(ioh); +continue; +} +if (ioh-fd_read FD_ISSET(ioh-fd, rfds)) { +ioh-fd_read(ioh-opaque); +} +if (ioh-fd_write FD_ISSET(ioh-fd, wfds)) { +ioh-fd_write(ioh-opaque); +} +} +} +} -- 1.7.0.4
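The write loop in vp_send_all() above follows a common "send everything" pattern: keep calling write() until the buffer is drained, retry on EINTR/EAGAIN, and give up on any other error. A self-contained sketch of that pattern follows; send_all_sketch is a hypothetical name, and unlike the patch it advances a char pointer rather than doing arithmetic on a void pointer.

```c
#include <errno.h>
#include <unistd.h>

/* Write len bytes from buf to fd. Returns the number of bytes written,
 * or -1 on an error other than EINTR/EAGAIN. */
static int send_all_sketch(int fd, const void *buf, int len)
{
    const char *p = buf;
    int remaining = len;

    while (remaining > 0) {
        ssize_t ret = write(fd, p, (size_t)remaining);

        if (ret < 0) {
            if (errno != EINTR && errno != EAGAIN) {
                return -1;
            }
            /* transient condition: retry the same write */
        } else if (ret == 0) {
            break;
        } else {
            p += ret;
            remaining -= (int)ret;
        }
    }
    return len - remaining;
}
```

Note that on a non-blocking descriptor this loop spins while the peer is slow, which is the concern raised in the TODO comment of the later data-packet patch.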
[Qemu-devel] [RFC][PATCH 09/15] virtproxy: add handler for data packets
Process VPPackets coming in from channel and send them to the appropriate server/client connections.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   42 ++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 6c3611b..57ab2b0 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -235,6 +235,48 @@ static void vp_channel_accept(void *opaque)
     vp_set_fd_handler(drv->listen_fd, NULL, NULL, NULL);
 }
 
+/* handle data packets
+ *
+ * process VPPackets containing data and send them to the corresponding
+ * FDs
+ */
+static int vp_handle_data_packet(void *drv, const VPPacket *pkt)
+{
+    int fd, ret;
+
+    TRACE("called with drv: %p", drv);
+
+    if (pkt->type == VP_PKT_CLIENT) {
+        TRACE("recieved client packet, client fd: %d, server fd: %d",
+              pkt->payload.proxied.client_fd, pkt->payload.proxied.server_fd);
+        fd = pkt->payload.proxied.server_fd;
+    } else if (pkt->type == VP_PKT_SERVER) {
+        TRACE("recieved server packet, client fd: %d, server fd: %d",
+              pkt->payload.proxied.client_fd, pkt->payload.proxied.server_fd);
+        fd = pkt->payload.proxied.client_fd;
+    } else {
+        TRACE("unknown packet type");
+        return -1;
+    }
+
+    /* TODO: proxied in non-blocking mode can causes us to spin here
+     * for slow servers/clients. need to use write()'s and maintain
+     * a per-conn write queue that we clear out before sending any
+     * more data to the fd
+     */
+    ret = vp_send_all(fd, (void *)pkt->payload.proxied.data,
+            pkt->payload.proxied.bytes);
+    if (ret == -1) {
+        LOG("error sending data over channel");
+        return -1;
+    } else if (ret != pkt->payload.proxied.bytes) {
+        TRACE("buffer full?");
+        return -1;
+    }
+
+    return 0;
+}
+
 /* read handler for communication channel
  *
  * de-multiplexes data coming in over the channel. for control messages
-- 
1.7.0.4
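The routing rule in vp_handle_data_packet() is that a packet originating from a client is delivered to the server-side fd, and a packet originating from a server is delivered to the client-side fd. A minimal sketch of just that decision, using hypothetical SketchPacket types rather than the patch's VPPacket layout:

```c
/* Hypothetical packet kinds mirroring VP_PKT_CONTROL/CLIENT/SERVER. */
enum sketch_pkt_type {
    SKETCH_PKT_CONTROL,
    SKETCH_PKT_CLIENT,
    SKETCH_PKT_SERVER
};

/* Hypothetical, flattened stand-in for the proxied part of a VPPacket. */
typedef struct SketchPacket {
    enum sketch_pkt_type type;
    int client_fd;
    int server_fd;
} SketchPacket;

/* Returns the local fd that should receive the payload, or -1 if the
 * packet is not a data packet. */
static int sketch_route_fd(const SketchPacket *pkt)
{
    switch (pkt->type) {
    case SKETCH_PKT_CLIENT:
        return pkt->server_fd;   /* data from a client goes to the server */
    case SKETCH_PKT_SERVER:
        return pkt->client_fd;   /* data from a server goes to the client */
    default:
        return -1;
    }
}
```

Control packets are deliberately rejected here, since in the patch series they are handled by the channel read handler rather than forwarded.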
[Qemu-devel] [RFC][PATCH 08/15] virtproxy: interfaces to set/remove/handle VPOForwards
Functions to add listener FDs (oforwards) which set up proxied connections to associated service, and the corresponding handler function to process to new connections to said FDs and initialize new client connections to the associated remote server over the channel Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- virtproxy.c | 103 +++ virtproxy.h |1 + 2 files changed, 104 insertions(+), 0 deletions(-) diff --git a/virtproxy.c b/virtproxy.c index cc0ac9a..6c3611b 100644 --- a/virtproxy.c +++ b/virtproxy.c @@ -12,6 +12,7 @@ */ #include virtproxy.h +#include qemu_socket.h #define DEBUG_VP @@ -314,6 +315,70 @@ static void vp_channel_read(void *opaque) } } +/* handler to accept() and init new client connections */ +static void vp_oforward_accept(void *opaque) +{ +VPOForward *f = opaque; +VPDriver *drv = f-drv; + +struct sockaddr_in saddr; +struct sockaddr *addr; +socklen_t len; +int fd, ret; +VPConn *conn = NULL; +VPPacket pkt; +VPControlMsg msg; + +TRACE(called with opaque: %p, drv: %p, f, drv); + +for(;;) { +len = sizeof(saddr); +addr = (struct sockaddr *)saddr; +fd = qemu_accept(f-listen_fd, addr, len); + +if (fd 0 errno != EINTR) { +TRACE(accept() failed); +return; +} else if (fd = 0) { +TRACE(accepted connection); +break; +} +} + +if (drv-channel_fd == -1) { +TRACE(communication channel not open, closing connection); +closesocket(fd); +return; +} + +/* send init packet over channel */ +memset(msg, 0, sizeof(VPControlMsg)); +msg.type = VP_CONTROL_CONNECT_INIT; +msg.args.connect_init.client_fd = fd; +pstrcpy(msg.args.connect_init.service_id, VP_SERVICE_ID_LEN, f-service_id); + +memset(pkt, 0, sizeof(VPPacket)); +pkt.type = VP_PKT_CONTROL; +pkt.payload.msg = msg; +pkt.magic = VP_MAGIC; + +ret = vp_send_all(drv-channel_fd, pkt, sizeof(VPPacket)); +if (ret == -1) { +LOG(vp_send_all() failed); +return; +} + +/* create new VPConn for client */ +conn = qemu_mallocz(sizeof(VPConn)); +conn-drv = drv; +conn-client_fd = fd; +conn-type = VP_CONN_CLIENT; +conn-state = 
VP_STATE_NEW; +QLIST_INSERT_HEAD(drv-conns, conn, next); + +socket_set_nonblock(fd); +} + /* create/init VPDriver object */ VPDriver *vp_new(int fd, bool listen) { @@ -336,3 +401,41 @@ VPDriver *vp_new(int fd, bool listen) return drv; } + +/* set/modify/remove a service_id - net/unix listening socket mapping + * + * service_id is a user-defined id for the service. this is what the + * client end will tag it's connections with so that the remote end can + * route it to the proper socket on the remote end. + * + * fd is a listen()'ing socket we want virtproxy to listen for new + * connections of this service type on. set fd to -1 to remove the + * existing listening socket for this service_id + */ +int vp_set_oforward(VPDriver *drv, int fd, const char *service_id) +{ +VPOForward *f = get_oforward(drv, service_id); + +if (fd == -1) { +if (f != NULL) { +vp_set_fd_handler(f-listen_fd, NULL, NULL, NULL); +QLIST_REMOVE(f, next); +qemu_free(f); +} +return 0; +} + +if (f == NULL) { +f = qemu_mallocz(sizeof(VPOForward)); +f-drv = drv; +strncpy(f-service_id, service_id, VP_SERVICE_ID_LEN); +QLIST_INSERT_HEAD(drv-oforwards, f, next); +} else { +closesocket(f-listen_fd); +} + +f-listen_fd = fd; +vp_set_fd_handler(f-listen_fd, vp_oforward_accept, NULL, f); + +return 0; +} diff --git a/virtproxy.h b/virtproxy.h index 3df1691..39d5d40 100644 --- a/virtproxy.h +++ b/virtproxy.h @@ -33,5 +33,6 @@ int vp_set_fd_handler(int fd, /* virtproxy interface */ VPDriver *vp_new(int fd, bool listen); +int vp_set_oforward(VPDriver *drv, int fd, const char *service_id); #endif /* VIRTPROXY_H */ -- 1.7.0.4
[Qemu-devel] [RFC][PATCH 06/15] virtproxy: add read handler for communication channel
Handle data coming in over the channel as VPPackets: Process control
messages and forward data from remote client/server connections to the
appropriate server/client FD on our end.

Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
---
 virtproxy.c |   83 +++
 1 files changed, 83 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 20532c2..c9c3022 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -33,6 +33,7 @@
 #define VP_SERVICE_ID_LEN 32    /* max length of service id string */
 #define VP_PKT_DATA_LEN 1024    /* max proxied bytes per VPPacket */
 #define VP_CONN_DATA_LEN 1024   /* max bytes conns can send at a time */
+#define VP_CHAN_DATA_LEN 4096   /* max bytes channel can send at a time */
 #define VP_MAGIC 0x1F374059
 
 /* listening fd, one for each service we're forwarding to remote end */
@@ -150,6 +151,8 @@ static QemuOptsList vp_socket_opts = {
     },
 };
 
+static void vp_channel_read(void *opaque);
+
 /* get VPConn by fd, client denotes whether to look for client or server */
 static VPConn *get_conn(const VPDriver *drv, int fd, bool client)
 {
@@ -230,3 +233,83 @@ static void vp_channel_accept(void *opaque)
     /* dont accept anymore connections until channel_fd is closed */
     vp_set_fd_handler(drv->listen_fd, NULL, NULL, NULL);
 }
+
+/* read handler for communication channel
+ *
+ * de-multiplexes data coming in over the channel. for control messages
+ * we process them here, for data destined for a service or client we
+ * send it to the appropriate FD.
+ */
+static void vp_channel_read(void *opaque)
+{
+    VPDriver *drv = opaque;
+    VPPacket pkt;
+    int count, ret, buf_offset;
+    char buf[VP_CHAN_DATA_LEN];
+    char *pkt_ptr, *buf_ptr;
+
+    TRACE("called with opaque: %p", drv);
+
+    count = read(drv->channel_fd, buf, sizeof(buf));
+
+    if (count == -1) {
+        LOG("read() failed: %s", strerror(errno));
+        return;
+    } else if (count == 0) {
+        /* TODO: channel closed, this probably shouldn't happen for guest-side
+         * serial/virtio-serial connections, but need to confirm and consider
+         * what should happen in this case. as it stands this virtproxy instance
+         * is basically defunct at this point, same goes for client instances
+         * of virtproxy where the remote end has hung-up.
+         */
+        LOG("channel connection closed");
+        vp_set_fd_handler(drv->channel_fd, NULL, NULL, drv);
+        drv->channel_fd = -1;
+        if (drv->listen_fd) {
+            vp_set_fd_handler(drv->listen_fd, vp_channel_accept, NULL, drv);
+        }
+        /* TODO: should close/remove/delete all existing VPConns here */
+    }
+
+    if (drv->buflen + count >= sizeof(VPPacket)) {
+        TRACE("initial packet, drv->buflen: %d", drv->buflen);
+        pkt_ptr = (char *)&pkt;
+        memcpy(pkt_ptr, drv->buf, drv->buflen);
+        pkt_ptr += drv->buflen;
+        memcpy(pkt_ptr, buf, sizeof(VPPacket) - drv->buflen);
+        /* handle first packet */
+        ret = vp_handle_packet(drv, &pkt);
+        if (ret != 0) {
+            LOG("error handling packet");
+        }
+        /* handle the rest of the buffer */
+        buf_offset = sizeof(VPPacket) - drv->buflen;
+        drv->buflen = 0;
+        buf_ptr = buf + buf_offset;
+        count -= buf_offset;
+        while (count > 0) {
+            if (count >= sizeof(VPPacket)) {
+                /* handle full packet */
+                TRACE("additional packet, drv->buflen: %d", drv->buflen);
+                memcpy((void *)&pkt, buf_ptr, sizeof(VPPacket));
+                ret = vp_handle_packet(drv, &pkt);
+                if (ret != 0) {
+                    LOG("error handling packet");
+                }
+                count -= sizeof(VPPacket);
+                buf_ptr += sizeof(VPPacket);
+            } else {
+                /* buffer the remainder */
+                TRACE("buffering packet");
+                memcpy(drv->buf, buf_ptr, count);
+                drv->buflen = count;
+                break;
+            }
+        }
+    } else {
+        /* haven't got a full VPPacket yet, buffer for later */
+        buf_ptr = drv->buf + drv->buflen;
+        memcpy(buf_ptr, buf, count);
+        drv->buflen += count;
+    }
+}
--
1.7.0.4
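The buffering done by the read handler above can be hard to follow in diff form, so here is a standalone sketch of the same general idea: reassembling fixed-size packets from reads that may end in the middle of a packet. This is an illustration under assumptions, not the series' code. The names sketch_reasm, sketch_feed, sketch_seen and sketch_record_cb are hypothetical, the packet size is shrunk to 8 bytes, and, unlike vp_channel_read(), every byte is staged through the partial-packet buffer rather than copied straight out of the read buffer when a full packet is available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PKT_SIZE 8   /* hypothetical fixed packet size, stands in for sizeof(VPPacket) */

typedef void (*sketch_pkt_cb)(const unsigned char *pkt, void *opaque);

/* partial-packet state carried between reads, like drv->buf / drv->buflen */
struct sketch_reasm {
    unsigned char buf[SKETCH_PKT_SIZE];
    size_t buflen;          /* always < SKETCH_PKT_SIZE between calls */
};

/* example consumer: counts packets and remembers the first byte of the last one */
struct sketch_seen {
    int count;
    unsigned char last_first_byte;
};

void sketch_record_cb(const unsigned char *pkt, void *opaque)
{
    struct sketch_seen *seen = opaque;
    seen->count++;
    seen->last_first_byte = pkt[0];
}

/* feed len bytes from one read; call cb once per completed packet.
 * returns the number of packets delivered by this call
 */
size_t sketch_feed(struct sketch_reasm *r, const unsigned char *data,
                   size_t len, sketch_pkt_cb cb, void *opaque)
{
    size_t delivered = 0;

    assert(r->buflen < SKETCH_PKT_SIZE);

    while (len > 0) {
        size_t want = SKETCH_PKT_SIZE - r->buflen;
        size_t take = len < want ? len : want;

        memcpy(r->buf + r->buflen, data, take);
        r->buflen += take;
        data += take;
        len -= take;

        if (r->buflen == SKETCH_PKT_SIZE) {
            cb(r->buf, opaque);
            r->buflen = 0;
            delivered++;
        }
    }
    /* anything left in r->buf waits for the next read */
    return delivered;
}
```

Feeding 3 bytes and then 17 more delivers two packets on the second call and leaves 4 bytes buffered for the next read. That is the state the patch keeps in drv->buf and drv->buflen.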
[Qemu-devel] [RFC][PATCH 11/15] virtproxy: add vp_handle_packet()
Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
---
 virtproxy.c |   23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 4f56aba..5ec4e77 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -431,6 +431,29 @@ static int vp_handle_data_packet(void *drv, const VPPacket *pkt)
     return 0;
 }
 
+static inline int vp_handle_packet(VPDriver *drv, const VPPacket *pkt)
+{
+    int ret;
+
+    TRACE("called with drv: %p", drv);
+
+    if (pkt->magic != VP_MAGIC) {
+        LOG("invalid packet magic field");
+        return -1;
+    }
+
+    if (pkt->type == VP_PKT_CONTROL) {
+        ret = vp_handle_control_packet(drv, pkt);
+    } else if (pkt->type == VP_PKT_CLIENT || pkt->type == VP_PKT_SERVER) {
+        ret = vp_handle_data_packet(drv, pkt);
+    } else {
+        LOG("invalid packet type");
+        return -1;
+    }
+
+    return ret;
+}
+
 /* read handler for communication channel
  *
  * de-multiplexes data coming in over the channel. for control messages
--
1.7.0.4
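The validate-then-dispatch pattern used by vp_handle_packet() in the patch above can be shown in isolation. The following is a minimal sketch, assuming a stripped-down packet that carries only a type and a magic field. The sketch_* names and the return codes are invented for illustration. Only the magic value and the three packet types come from the series.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAGIC 0x1F374059   /* same value as VP_MAGIC in the series */

/* mirrors VP_PKT_CONTROL / VP_PKT_CLIENT / VP_PKT_SERVER */
enum sketch_pkt_type {
    SKETCH_PKT_CONTROL = 1,
    SKETCH_PKT_CLIENT,
    SKETCH_PKT_SERVER
};

/* hypothetical return codes so a caller can see which path was taken */
enum sketch_result {
    SKETCH_BAD = -1,
    SKETCH_HANDLED_CONTROL = 1,
    SKETCH_HANDLED_DATA = 2
};

/* hypothetical, heavily reduced stand-in for VPPacket */
struct sketch_pkt {
    int type;
    int magic;
};

/* reject packets with a bad magic first, then route on the type field */
int sketch_dispatch(const struct sketch_pkt *pkt)
{
    assert(pkt != NULL);

    if (pkt->magic != SKETCH_MAGIC) {
        return SKETCH_BAD;
    }

    switch (pkt->type) {
    case SKETCH_PKT_CONTROL:
        return SKETCH_HANDLED_CONTROL;
    case SKETCH_PKT_CLIENT:
    case SKETCH_PKT_SERVER:
        return SKETCH_HANDLED_DATA;
    default:
        return SKETCH_BAD;
    }
}
```

Checking the magic before looking at any other field means a desynchronized stream is rejected early instead of being interpreted as a control or data packet.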
[Qemu-devel] [RFC][PATCH 03/15] virtproxy: add debug functions for virtproxy core
Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
---
 virtproxy.c |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index f30b859..2f8996c 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -13,6 +13,23 @@
 
 #include "virtproxy.h"
 
+#define DEBUG_VP
+
+#ifdef DEBUG_VP
+#define TRACE(msg, ...) do { \
+    fprintf(stderr, "%s:%s():L%d: " msg "\n", \
+            __FILE__, __FUNCTION__, __LINE__, ## __VA_ARGS__); \
+} while(0)
+#else
+#define TRACE(msg, ...) \
+    do { } while (0)
+#endif
+
+#define LOG(msg, ...) do { \
+    fprintf(stderr, "%s:%s(): " msg "\n", \
+            __FILE__, __FUNCTION__, ## __VA_ARGS__); \
+} while(0)
+
 #define VP_SERVICE_ID_LEN 32    /* max length of service id string */
 #define VP_PKT_DATA_LEN 1024    /* max proxied bytes per VPPacket */
 #define VP_CONN_DATA_LEN 1024   /* max bytes conns can send at a time */
--
1.7.0.4
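For readers unfamiliar with the macro style used by TRACE() and LOG() above, here is a small sketch of the same technique: string-literal concatenation of the caller's format with a fixed prefix, plus the GNU `## __VA_ARGS__` extension so the macro also works with no extra arguments. The macro SKETCH_FMT and the two demo functions are hypothetical. To keep the result checkable, the sketch formats into a caller-supplied buffer with snprintf() instead of printing to stderr.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* format "function(): message" into buf.
 * msg must be a string literal so it can be concatenated with the prefix.
 * "## __VA_ARGS__" is a GNU extension (GCC and Clang accept it) that drops
 * the trailing comma when no variadic arguments are passed
 */
#define SKETCH_FMT(buf, size, msg, ...) \
    snprintf((buf), (size), "%s(): " msg, __func__, ## __VA_ARGS__)

/* hypothetical caller that passes one format argument */
int sketch_demo(char *out, size_t size, int fd)
{
    assert(out != NULL && size > 0);
    return SKETCH_FMT(out, size, "accepted fd %d", fd);
}

/* hypothetical caller that passes no format arguments */
int sketch_demo_noargs(char *out, size_t size)
{
    assert(out != NULL && size > 0);
    return SKETCH_FMT(out, size, "channel closed");
}
```

The do { ... } while(0) wrapper in the patch serves a different purpose: it makes the multi-statement macro behave like a single statement after an unbraced if.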
[Qemu-devel] [RFC][PATCH 07/15] virtproxy: add vp_new() VPDriver constructor
Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
---
 virtproxy.c |   23 +++
 virtproxy.h |    3 +++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index c9c3022..cc0ac9a 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -313,3 +313,26 @@ static void vp_channel_read(void *opaque)
         drv->buflen += count;
     }
 }
+
+/* create/init VPDriver object */
+VPDriver *vp_new(int fd, bool listen)
+{
+    VPDriver *drv = NULL;
+
+    drv = qemu_mallocz(sizeof(VPDriver));
+    drv->listen_fd = -1;
+    drv->channel_fd = -1;
+    QLIST_INIT(&drv->oforwards);
+    QLIST_INIT(&drv->conns);
+
+    if (listen) {
+        /* provided FD is to be listened on for channel connection */
+        drv->listen_fd = fd;
+        vp_set_fd_handler(drv->listen_fd, vp_channel_accept, NULL, drv);
+    } else {
+        drv->channel_fd = fd;
+        vp_set_fd_handler(drv->channel_fd, vp_channel_read, NULL, drv);
+    }
+
+    return drv;
+}
diff --git a/virtproxy.h b/virtproxy.h
index 0203421..3df1691 100644
--- a/virtproxy.h
+++ b/virtproxy.h
@@ -31,4 +31,7 @@ int vp_set_fd_handler(int fd,
                       IOHandler *fd_write,
                       void *opaque);
 
+/* virtproxy interface */
+VPDriver *vp_new(int fd, bool listen);
+
 #endif /* VIRTPROXY_H */
--
1.7.0.4
[Qemu-devel] [RFC][PATCH 10/15] virtproxy: add handler for control packet
Process control packets coming in over the channel. This entails setting up/tearing down connections to local services initiated from the other end of the channel. Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- virtproxy.c | 154 +++ 1 files changed, 154 insertions(+), 0 deletions(-) diff --git a/virtproxy.c b/virtproxy.c index 57ab2b0..4f56aba 100644 --- a/virtproxy.c +++ b/virtproxy.c @@ -235,6 +235,160 @@ static void vp_channel_accept(void *opaque) vp_set_fd_handler(drv-listen_fd, NULL, NULL, NULL); } +/* handle control packets + * + * process VPPackets containing control messages + */ +static int vp_handle_control_packet(VPDriver *drv, const VPPacket *pkt) +{ +const VPControlMsg *msg = pkt-payload.msg; +int ret; + +TRACE(called with drv: %p, drv); + +switch (msg-type) { +case VP_CONTROL_CONNECT_INIT: { +int client_fd = msg-args.connect_init.client_fd; +int server_fd; +char service_id[VP_SERVICE_ID_LEN]; +VPPacket resp_pkt; +VPConn *new_conn; +VPIForward *iforward; + +pstrcpy(service_id, VP_SERVICE_ID_LEN, + msg-args.connect_init.service_id); +TRACE(setting up connection for service id %s, service_id); + +/* create server connection on behalf of remote end */ +iforward = get_iforward(drv, service_id); +if (iforward == NULL) { +LOG(no forwarder configured for service id); +return -1; +} + +qemu_opts_print(iforward-socket_opts, NULL); +if (qemu_opt_get(iforward-socket_opts, host) != NULL) { +server_fd = inet_connect_opts(iforward-socket_opts); +} else if (qemu_opt_get(iforward-socket_opts, path) != NULL) { +server_fd = unix_connect_opts(iforward-socket_opts); +} else { +LOG(unable to find listening socket host/addr info); +return -1; +} + +if (server_fd == -1) { +LOG(failed to create connection to service with id %s, +service_id); +} +TRACE(server_fd: %d, server_fd); + +new_conn = qemu_mallocz(sizeof(VPConn)); +if (!new_conn) { +LOG(memory allocation failed); +return -1; +} + +/* send a connect_ack back over the channel */ +/* TODO: all fields should be 
explicitly set so we shouldn't + * need to memset. this might hurt if we beef up VPPacket size + */ +memset(resp_pkt, 0, sizeof(resp_pkt)); +resp_pkt.type = VP_PKT_CONTROL; +resp_pkt.payload.msg.type = VP_CONTROL_CONNECT_ACK; +resp_pkt.payload.msg.args.connect_ack.server_fd = server_fd; +resp_pkt.payload.msg.args.connect_ack.client_fd = client_fd; +resp_pkt.magic = VP_MAGIC; + +/* TODO: can this potentially block or cause a deadlock with + * the remote end? need to look into potentially buffering these + * if it looks like the remote end is waiting for us to read data + * off the channel. + */ +if (drv-channel_fd == -1) { +TRACE(channel no longer connected, ignoring packet); +return -1; +} + +ret = vp_send_all(drv-channel_fd, (void *)resp_pkt, sizeof(resp_pkt)); +if (ret == -1) { +LOG(error sending data over channel); +return -1; +} +if (ret != sizeof(resp_pkt)) { +TRACE(buffer full? %d bytes remaining, ret); +return -1; +} + +/* add new VPConn to list and set a read handler for it */ +new_conn-drv = drv; +new_conn-client_fd = client_fd; +new_conn-server_fd = server_fd; +new_conn-type = VP_CONN_SERVER; +new_conn-state = VP_STATE_CONNECTED; +QLIST_INSERT_HEAD(drv-conns, new_conn, next); +vp_set_fd_handler(server_fd, vp_conn_read, NULL, new_conn); + +break; +} +case VP_CONTROL_CONNECT_ACK: { +int client_fd = msg-args.connect_ack.client_fd; +int server_fd = msg-args.connect_ack.server_fd; +VPConn *conn; + +TRACE(recieved ack from remote end for client fd %d, client_fd); + +if (server_fd = 0) { +LOG(remote end sent invalid server fd); +return -1; +} + +conn = get_conn(drv, client_fd, true); + +if (conn == NULL) { +LOG(failed to find connection with client_fd %d, client_fd); +return -1; +} + +conn-server_fd = server_fd; +conn-state = VP_STATE_CONNECTED; +vp_set_fd_handler(client_fd, vp_conn_read, NULL, conn); + +break; +} +case VP_CONTROL_CLOSE: { +int fd; +VPConn *conn; + +TRACE(closing connection on behalf of remote end); + +if (msg-args.close.client_fd = 0) { +fd = 
msg-args.close.client_fd; +TRACE(recieved
[Qemu-devel] [RFC][PATCH 00/15] virtproxy: host/guest communication layer
OVERVIEW:

Virtproxy proxies and multiplexes socket streams over a data channel between a
host and a guest (currently network connections, emulated serial, or
virtio-serial channels are supported). This allows services such as guest data
collection agents, host/guest file transfer, and event generation/handling to
be implemented and deployed as basic socket-based daemons, independently of the
actual data channel.

This code is intended to provide a channel-independent abstraction layer for
communicating with a QEMU-specific guest agent (in particular, the virtagent
RPC guest agent which will follow this in a separate patchset), but may have
general utility beyond this (for instance: ssh/sftp/other guest agents/etc.
over isa/virtio serial), and so is submitted here as a separate patchset.

Currently this communication involves 2 daemons (common code): 1 in the guest,
and 1 in the host. Each end multiplexes/demultiplexes/proxies connections from
the other end. In the future we hope to integrate the host component directly
into qemu as a chardev.

BUILD/USAGE INFO:

  make qemu-vp
  ./qemu-vp -h

EXAMPLE USAGE:

 - Proxy http and ssh connections from a host to a guest over a virtio-serial
   connection:

    # start guest with virtio-serial. for example (RHEL6s13):
    qemu \
    -device virtio-serial \
    -chardev socket,path=/tmp/test0-virtioconsole.sock,server,nowait,id=test0 \
    -device virtconsole,chardev=test0,name=test0 \
    -chardev socket,path=/tmp/test1-virtio-serial.sock,server,nowait,id=test1 \
    -device virtserialport,chardev=test1,name=test1 \
    -chardev socket,path=/tmp/test2-virtio-serial.sock,server,nowait,id=test2 \
    -device virtserialport,chardev=test2,name=test2 \
    ...

    # in the host:
    ./qemu-vp -c unix-connect:/tmp/test2-virtio-serial.sock:- -o http:127.0.0.1:9080 \
    -o ssh:127.0.0.1:9022

    # in the guest:
    ./qemu-vp -c virtserial-open:/dev/virtio-ports/test2:- -i http:127.0.0.1:80 \
    -i ssh:127.0.0.1:22

    # from host, access guest http server
    wget http://localhost:9080
    # from host, access guest ssh server
    ssh localhost -p 9022

 - Proxy http and ssh connections from a host to a guest over a network
   connection:

    # start guest with network connectivity to host
    # in the guest:
    ./qemu-vp -c tcp-listen:guest_ip:9000 -i http:127.0.0.1:80 \
    -i ssh:127.0.0.1:22

    # in the host:
    ./qemu-vp -c tcp-connect:guest_ip:9000 -o http:127.0.0.1:9080 \
    -o ssh:127.0.0.1:9022
    ...

By specifying -i and -o options in the host and guest, respectively, the
channel can also be used to establish connections from a guest to a host.

KNOWN ISSUES:

 - Deadlocking the guest: In tests over isa-serial ports I've hit cases where
   the chardev (socket) on the host side seems to fill up its buffer, likely
   due to qemu rate-limiting data in accordance with the port's baud rate
   (which may explain why I hadn't seen this with network-based or
   virtio-serial data channels). When qemu-vp reads data from client
   connections it puts it into a VPPacket and tries to send the packet in its
   entirety back over the channel. In this particular case that write() blocks
   (or vp_send_all() spins if we set O_NONBLOCK on the client FD). In the
   meantime qemu fills up the other end of the socket buffer and ends up
   spinning in qemu-char:send_all(), basically causing a deadlock between qemu
   and qemu-vp, and causing the guest to freeze.

   Currently I'm planning on replacing vp_send_all() with a function that
   simply buffers write()'s, which would allow the use of non-blocking
   write()'s out to the channel/chardev socket while still retaining
   wholeness/fifo-ordering of the VPPackets.

 - Sync issues with virtio-serial: This may or may not be related to the issue
   above, but I noticed some cases where proxied ssh sessions from the guest to
   the host would lag by a few bytes. For instance typing "top" would result in
   "to" being displayed, and the "p" wouldn't show up till I hit another key.
   This could be related to how I'm handling the buffering, but I haven't been
   able to reproduce it using a network-based channel.

TODO:

 - Rework vp_send_all() to use buffering to avoid the above-mentioned deadlock
   scenario
 - Integrate qemu-vp directly into qemu by adding a virtproxy chardev device.
   For example:

    ./qemu-vp -c unix-connect:/tmp/vp1-virtio-serial.sock:- -o ssh:127.0.0.1:9022

   in the host, would be analogous to:

    qemu \
    -device virtio-serial \
    -chardev virtproxy,oforward=ssh:127.0.0.1:9022,id=vp1 \
    -device virtserialport,chardev=vp1,name=vp1

 - Better channel negotiation to gracefully handle guest
   reboots/disconnects/etc
 - Add monitor commands to add/remove virtproxy channels/oforwards/iforwards on
   the fly

 .gitignore  |    1 +
 Makefile    |    4 +-
 configure   |    1 +
 qemu-vp.c   |  618 +
 virtproxy.c |  799
[Qemu-devel] [RFC][PATCH 01/15] virtproxy: base data structures and constants
Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- virtproxy.c | 134 +++ virtproxy.h | 34 +++ 2 files changed, 168 insertions(+), 0 deletions(-) create mode 100644 virtproxy.c create mode 100644 virtproxy.h diff --git a/virtproxy.c b/virtproxy.c new file mode 100644 index 000..f30b859 --- /dev/null +++ b/virtproxy.c @@ -0,0 +1,134 @@ +/* + * virt-proxy - host/guest communication layer + * + * Copyright IBM Corp. 2010 + * + * Authors: + * Michael Roth mdr...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include virtproxy.h + +#define VP_SERVICE_ID_LEN 32/* max length of service id string */ +#define VP_PKT_DATA_LEN 1024/* max proxied bytes per VPPacket */ +#define VP_CONN_DATA_LEN 1024 /* max bytes conns can send at a time */ +#define VP_MAGIC 0x1F374059 + +/* listening fd, one for each service we're forwarding to remote end */ +typedef struct VPOForward { +VPDriver *drv; +int listen_fd; +char service_id[VP_SERVICE_ID_LEN]; +QLIST_ENTRY(VPOForward) next; +} VPOForward; + +/* service_id-path/port mapping of each service forwarded from remote end */ +typedef struct VPIForward { +VPDriver *drv; +char service_id[VP_SERVICE_ID_LEN]; +QemuOpts *socket_opts; +QLIST_ENTRY(VPIForward) next; +} VPIForward; + +/* proxied client/server connected states */ +typedef struct VPConn { +VPDriver *drv; +int client_fd; +int server_fd; +enum { +VP_CONN_CLIENT = 1, +VP_CONN_SERVER, +} type; +enum { +VP_STATE_NEW = 1, /* accept()'d and registered fd */ +VP_STATE_INIT, /* sent init pkt to remote end, waiting for ack */ +VP_STATE_CONNECTED, /* client and server connected */ +} state; +QLIST_ENTRY(VPConn) next; +} VPConn; + +typedef struct VPControlMsg { +enum { +VP_CONTROL_CONNECT_INIT = 1, +VP_CONTROL_CONNECT_ACK, +VP_CONTROL_CLOSE, +} type; +union { +/* tell remote end connect to server and map client_fd to it */ +struct { +int client_fd; +char 
service_id[VP_SERVICE_ID_LEN]; +} connect_init; +/* tell remote end we've created the connection to the server, + * and give them the corresponding fd to use so we don't have + * to do a reverse lookup everytime + */ +struct { +int client_fd; +int server_fd; +} connect_ack; +/* tell remote end to close fd in question, presumably because + * connection was closed on our end + */ +struct { +int client_fd; +int server_fd; +} close; +} args; +} VPControlMsg; + +typedef struct VPPacket { +enum { +VP_PKT_CONTROL = 1, +VP_PKT_CLIENT, +VP_PKT_SERVER, +} type; +union { +VPControlMsg msg; +struct { +int client_fd; +int server_fd; +int bytes; +char data[VP_PKT_DATA_LEN]; +} proxied; +} payload; +int magic; +} __attribute__((__packed__)) VPPacket; + +struct VPDriver { +int channel_fd; +int listen_fd; +char buf[sizeof(VPPacket)]; +int buflen; +QLIST_HEAD(, VPOForward) oforwards; +QLIST_HEAD(, VPIForward) iforwards; +QLIST_HEAD(, VPConn) conns; +}; + +static QemuOptsList vp_socket_opts = { +.name = vp_socket_opts, +.head = QTAILQ_HEAD_INITIALIZER(vp_socket_opts.head), +.desc = { +{ +.name = path, +.type = QEMU_OPT_STRING, +},{ +.name = host, +.type = QEMU_OPT_STRING, +},{ +.name = port, +.type = QEMU_OPT_STRING, +},{ +.name = ipv4, +.type = QEMU_OPT_BOOL, +},{ +.name = ipv6, +.type = QEMU_OPT_BOOL, +}, +{ /* end if list */ } +}, +}; diff --git a/virtproxy.h b/virtproxy.h new file mode 100644 index 000..0203421 --- /dev/null +++ b/virtproxy.h @@ -0,0 +1,34 @@ +/* + * virt-proxy - host/guest communication layer + * + * Copyright IBM Corp. 2010 + * + * Authors: + * Michael Roth mdr...@linux.vnet.ibm.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#ifndef VIRTPROXY_H +#define VIRTPROXY_H + +#include qemu-common.h +#include qemu-queue.h + +typedef struct VPDriver VPDriver; + +/* wrappers for s/vp/qemu/ functions we need */ +int vp_send_all(int fd, const void *buf, int len1); +int vp_set_fd_handler2(int fd, + IOCanReadHandler *fd_read_poll, + IOHandler *fd_read, + IOHandler *fd_write, + void *opaque); +int
[Qemu-devel] [PATCH 00/17] [RFC] static instrumentation
Here's a set of patches with the current state of static instrumentation. Hope that this organization will help understanding the point of it. Patch 9 describes the taken approach for instrumenting during code generation, which is the main point of this. Code can also be reached at: https://projects.gso.ac.upc.edu/projects/qemu-instrument https://code.gso.ac.upc.edu/git/qemu-instrument Lluís Vilanova (17): backdoor: Declare host-side backdoor helpers backdoor: [all] Include backdoor helper declarations backdoor: Declare guest-side interface macros backdoor: [i386] Decode backdoor instructions backdoor: [i386] Declare guest-side interface macros backdoor: Add a simple example instrument: Handle config-time activation instrument: Add initial instrumentation example instrument: Dynamic per-CPU state of static instrumentation points instrument: Code-generation macros instrument: [all] Include instrumentation helper declarations instrument: Add FETCH point instrument: [i386] Call FETCH point instrument: Add VMEM point instrument: [all] Call VMEM point instrument: Add PLVL point instrument: [i386] Call PLVL point .gitignore |2 + Makefile.target| 35 +- backdoor/examples/print/README | 13 ++ backdoor/examples/print/guest/Makefile |7 + backdoor/examples/print/guest/test.c | 33 + backdoor/examples/print/host/Makefile | 13 ++ backdoor/examples/print/host/printcb.c | 36 + backdoor/guest.h | 60 backdoor/helper.h | 21 +++ configure | 19 +++ cpu-all.h | 74 ++ cpu-defs.h | 24 +++ cpu-exec.c |8 +- cpus.c |8 + exec-all.h |7 +- exec.c | 50 +--- instrument/control.c | 74 ++ instrument/control.h | 44 ++ instrument/examples/dynprint/README| 16 ++ instrument/examples/dynprint/guest/Makefile|7 + instrument/examples/dynprint/guest/test.c | 62 instrument/examples/dynprint/host/Makefile | 14 ++ instrument/examples/dynprint/host/backdoor.c | 61 instrument/examples/dynprint/host/helpers.c| 86 .../dynprint/host/instrument-host-helpers.h| 22 +++ .../examples/dynprint/host/instrument-host.h | 81 
+++ instrument/gen-vmem-wrappers.h | 88 instrument/generate.h | 128 + instrument/host-stub.h | 81 +++ instrument/state.h | 61 instrument/types.h | 44 ++ linux-user/main.c | 12 ++ qemu-common.h |4 + softmmu_header.h | 15 ++ target-alpha/helper.h |8 + target-alpha/translate.c |4 + target-arm/helper.h|8 + target-arm/translate.c |4 + target-cris/helper.h |8 + target-cris/translate.c|4 + target-i386/cpu.h | 21 ++-- target-i386/helper.h |8 + target-i386/translate.c| 147 target-m68k/helper.h |8 + target-m68k/translate.c|4 + target-microblaze/helper.h |8 + target-microblaze/translate.c | 10 ++ target-mips/helper.h |8 + target-mips/translate.c|4 + target-ppc/helper.h|8 + target-ppc/translate.c |4 + target-s390x/translate.c |4 + target-sh4/helper.h|8 + target-sh4/translate.c |4 + target-sparc/helper.h |8 + target-sparc/translate.c |4 + 56 files changed, 1543 insertions(+), 61 deletions(-) create mode 100644 backdoor/examples/print/README create mode 100644 backdoor/examples/print/guest/Makefile create mode 100644 backdoor/examples/print/guest/test.c create mode
[Qemu-devel] [RFC][PATCH 12/15] virtproxy: interfaces to set/remove VPIForwards
Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
---
 virtproxy.c |   59 +++
 virtproxy.h |    2 ++
 2 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 5ec4e77..86a8e5b 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -658,3 +658,62 @@ int vp_set_oforward(VPDriver *drv, int fd, const char *service_id)
 
     return 0;
 }
+
+/* add/modify a service_id -> net/unix socket mapping
+ *
+ * service_id is a user-defined id for the service. this is what the
+ * remote end will use to proxy connections to a specific service on
+ * our end.
+ *
+ * if port is NULL, addr is the path to the unix socket the service is
+ * running on. otherwise, addr is the address of the net socket the
+ * service is running on.
+ *
+ * if port AND addr are NULL, find and remove the current iforward
+ * for this service_id if it exists.
+ *
+ * ipv6 is a bool denoting whether or not to use ipv6
+ */
+int vp_set_iforward(VPDriver *drv, const char *service_id, const char *addr,
+                    const char *port, bool ipv6)
+{
+    VPIForward *f = get_iforward(drv, service_id);
+
+    if (addr == NULL && port == NULL) {
+        if (f != NULL) {
+            qemu_opts_del(f->socket_opts);
+            QLIST_REMOVE(f, next);
+            qemu_free(f);
+        }
+        return 0;
+    }
+
+    if (f == NULL) {
+        f = qemu_mallocz(sizeof(VPIForward));
+        f->drv = drv;
+        strncpy(f->service_id, service_id, VP_SERVICE_ID_LEN);
+        QLIST_INSERT_HEAD(&drv->iforwards, f, next);
+    } else {
+        qemu_opts_del(f->socket_opts);
+    }
+
+    /* stick socket-related options in a QemuOpts so we can
+     * utilize qemu socket utility functions directly
+     */
+    f->socket_opts = qemu_opts_create(&vp_socket_opts, NULL, 0);
+    if (port == NULL) {
+        /* no port given, assume unix path */
+        qemu_opt_set(f->socket_opts, "path", addr);
+    } else {
+        qemu_opt_set(f->socket_opts, "host", addr);
+        qemu_opt_set(f->socket_opts, "port", port);
+    }
+
+    if (ipv6) {
+        qemu_opt_set(f->socket_opts, "ipv6", "on");
+    } else {
+        qemu_opt_set(f->socket_opts, "ipv4", "on");
+    }
+
+    return 0;
+}
diff --git a/virtproxy.h b/virtproxy.h
index 39d5d40..d2522b3 100644
--- a/virtproxy.h
+++ b/virtproxy.h
@@ -34,5 +34,7 @@ int vp_set_fd_handler(int fd,
 /* virtproxy interface */
 VPDriver *vp_new(int fd, bool listen);
 int vp_set_oforward(VPDriver *drv, int fd, const char *service_id);
+int vp_set_iforward(VPDriver *drv, const char *service_id, const char *addr,
+                    const char *port, bool ipv6);
 
 #endif /* VIRTPROXY_H */
--
1.7.0.4
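The way vp_set_iforward() above interprets its addr/port pair (as its code does: no port means a unix socket path, both NULL means remove) can be summarized as a small decision function. This is a sketch under assumptions, not part of the series. The sketch_* names are hypothetical, and the SKETCH_INVALID case for a port without an address is an addition of this example. The patch itself does not check for that combination.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_kind {
    SKETCH_REMOVE,    /* addr and port both NULL: drop the mapping */
    SKETCH_UNIX,      /* port NULL: addr is a unix socket path */
    SKETCH_INET,      /* both set: addr is a host, port is a port string */
    SKETCH_INVALID    /* port without addr: not handled by the patch */
};

/* hypothetical result record, roughly what ends up in the QemuOpts */
struct sketch_target {
    enum sketch_kind kind;
    const char *path;
    const char *host;
    const char *port;
    int use_ipv6;
};

/* classify an (addr, port, ipv6) triple and fill in *out */
enum sketch_kind sketch_classify(struct sketch_target *out, const char *addr,
                                 const char *port, int ipv6)
{
    assert(out != NULL);

    out->path = NULL;
    out->host = NULL;
    out->port = NULL;
    out->use_ipv6 = 0;

    if (addr == NULL && port == NULL) {
        out->kind = SKETCH_REMOVE;
    } else if (addr == NULL) {
        out->kind = SKETCH_INVALID;
    } else if (port == NULL) {
        out->kind = SKETCH_UNIX;
        out->path = addr;
    } else {
        out->kind = SKETCH_INET;
        out->host = addr;
        out->port = port;
        out->use_ipv6 = (ipv6 != 0);
    }
    return out->kind;
}
```

For instance, ("127.0.0.1", "80") classifies as an inet target, while ("/tmp/virtagent-guest.sock", NULL) classifies as a unix socket path.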
[Qemu-devel] [RFC][PATCH 02/10] virtagent: base definitions for host/guest RPC daemon
Basic skeleton code for RPC daemon loop. This is shared by both the
guest-side RPC server as well as the host-side one (the advertised RPCs
for each are defined by guest/host-specific arrays).

Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
---
 virtagent-daemon.c |  118
 virtagent-daemon.h |   20 +
 2 files changed, 138 insertions(+), 0 deletions(-)
 create mode 100644 virtagent-daemon.c
 create mode 100644 virtagent-daemon.h

diff --git a/virtagent-daemon.c b/virtagent-daemon.c
new file mode 100644
index 000..998b025
--- /dev/null
+++ b/virtagent-daemon.c
@@ -0,0 +1,118 @@
+/*
+ * virt-agent - host/guest RPC daemon functions
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Adam Litke        <agli...@linux.vnet.ibm.com>
+ *  Michael Roth      <mdr...@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <xmlrpc-c/base.h>
+#include <xmlrpc-c/server.h>
+#include <xmlrpc-c/server_abyss.h>
+#include "qemu_socket.h"
+#include "virtagent-daemon.h"
+#include "virtagent-common.h"
+
+static int va_accept(int listen_fd) {
+    struct sockaddr_in saddr;
+    struct sockaddr *addr;
+    socklen_t len;
+    int fd;
+
+    while (1) {
+        len = sizeof(saddr);
+        addr = (struct sockaddr *)&saddr;
+        fd = qemu_accept(listen_fd, addr, &len);
+        if (fd < 0 && errno != EINTR) {
+            LOG("accept() failed");
+            break;
+        } else if (fd >= 0) {
+            TRACE("accepted connection");
+            break;
+        }
+    }
+    return fd;
+}
+
+typedef struct RPCFunction {
+    xmlrpc_value *(*func)(xmlrpc_env *env, xmlrpc_value *param, void *unused);
+    const char *func_name;
+} RPCFunction;
+
+static RPCFunction guest_functions[] = {
+    { NULL, NULL }
+};
+static RPCFunction host_functions[] = {
+    { NULL, NULL }
+};
+
+static void va_register_functions(xmlrpc_env *env, xmlrpc_registry *registry,
+                                  RPCFunction *list)
+{
+    int i;
+    for (i = 0; list[i].func != NULL; ++i) {
+        TRACE("adding func: %s", list[i].func_name);
+        xmlrpc_registry_add_method(env, registry, NULL, list[i].func_name,
+                                   list[i].func, NULL);
+    }
+}
+
+int va_server_loop(int listen_fd, bool is_host)
+{
+    xmlrpc_registry *registryP;
+    xmlrpc_env env;
+    int ret, fd;
+    char *rpc_request;
+    int rpc_request_len;
+    xmlrpc_mem_block *rpc_response;
+    RPCFunction *func_list = is_host ? host_functions : guest_functions;
+
+    xmlrpc_env_init(&env);
+    registryP = xmlrpc_registry_new(&env);
+    va_register_functions(&env, registryP, func_list);
+
+    while (1) {
+        TRACE("waiting for connection from RPC client");
+        fd = va_accept(listen_fd);
+        if (fd < 0) {
+            TRACE("connection error: %s", strerror(errno));
+            continue;
+        }
+        TRACE("RPC client connected, fetching RPC...");
+        ret = va_get_rpc_request(fd, &rpc_request, &rpc_request_len);
+        if (ret != 0 || rpc_request == NULL) {
+            LOG("error retrieving rpc request");
+            goto out;
+        }
+        TRACE("handling RPC request");
+        rpc_response = xmlrpc_registry_process_call(&env, registryP, NULL,
+                                                    rpc_request,
+                                                    rpc_request_len);
+        if (rpc_response == NULL) {
+            LOG("error handling rpc request");
+            goto out;
+        }
+
+        qemu_free(rpc_request);
+        TRACE("sending RPC response");
+        ret = va_send_rpc_response(fd, rpc_response);
+        if (ret != 0) {
+            LOG("error sending rpc response");
+            goto out;
+        }
+        TRACE("RPC completed");
+        XMLRPC_MEMBLOCK_FREE(char, rpc_response);
+out:
+        closesocket(fd);
+    }
+
+    return 0;
+}
diff --git a/virtagent-daemon.h b/virtagent-daemon.h
new file mode 100644
index 000..bb197d0
--- /dev/null
+++ b/virtagent-daemon.h
@@ -0,0 +1,20 @@
+/*
+ * virt-agent - host/guest RPC daemon functions
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Michael Roth      <mdr...@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+#define GUEST_AGENT_SERVICE_ID "virtagent"
+#define GUEST_AGENT_PATH "/tmp/virtagent-guest.sock"
+#define HOST_AGENT_SERVICE_ID "virtagent-host"
+#define HOST_AGENT_PATH "/tmp/virtagent-host.sock"
+#define VA_GETFILE_MAX 1 << 30
+#define VA_FILEBUF_LEN 16384
+
+int va_server_loop(int listen_fd, bool is_host);
--
1.7.0.4
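The guest/host-specific arrays mentioned in the description above are sentinel-terminated tables: va_register_functions() walks them until it hits an entry whose func is NULL. A minimal standalone sketch of that table pattern follows, without the xmlrpc-c dependency. Everything here is hypothetical: the sketch_* names, the "ping" and "double" entries, and the simplified int-to-int function signature. The patch itself registers no RPCs yet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* simplified stand-in for the xmlrpc method signature used by RPCFunction */
typedef int (*sketch_rpc_fn)(int arg);

struct sketch_rpc {
    sketch_rpc_fn func;
    const char *name;
};

/* two made-up RPC implementations */
int sketch_ping(int arg) { return arg; }
int sketch_double(int arg) { return 2 * arg; }

/* the { NULL, NULL } entry terminates each table */
const struct sketch_rpc sketch_guest_rpcs[] = {
    { sketch_ping, "ping" },
    { sketch_double, "double" },
    { NULL, NULL }
};
const struct sketch_rpc sketch_host_rpcs[] = {
    { NULL, NULL }
};

/* number of entries before the sentinel */
size_t sketch_count(const struct sketch_rpc *list)
{
    size_t n = 0;
    assert(list != NULL);
    while (list[n].func != NULL) {
        n++;
    }
    return n;
}

/* find a function by name, or return NULL if it is not advertised */
sketch_rpc_fn sketch_lookup(const struct sketch_rpc *list, const char *name)
{
    size_t i;
    assert(list != NULL && name != NULL);
    for (i = 0; list[i].func != NULL; i++) {
        if (strcmp(list[i].name, name) == 0) {
            return list[i].func;
        }
    }
    return NULL;
}
```

Selecting between the two tables with a flag, as va_server_loop() does with is_host, lets the same loop serve either side while advertising a different set of RPCs.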
[Qemu-devel] [RFC][PATCH 14/15] virtproxy: Makefile/configure changes to build qemu-vp
Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- .gitignore |1 + Makefile |4 +++- configure |1 + 3 files changed, 5 insertions(+), 1 deletions(-) diff --git a/.gitignore b/.gitignore index a43e4d1..da307d2 100644 --- a/.gitignore +++ b/.gitignore @@ -31,6 +31,7 @@ qemu-img-cmds.texi qemu-img-cmds.h qemu-io qemu-monitor.texi +qemu-vp QMP/qmp-commands.txt .gdbinit *.a diff --git a/Makefile b/Makefile index 252c817..53b58d2 100644 --- a/Makefile +++ b/Makefile @@ -127,7 +127,7 @@ version-obj-$(CONFIG_WIN32) += version.o ## qemu-img.o: qemu-img-cmds.h -qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o: $(GENERATED_HEADERS) +qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o virtproxy.o: $(GENERATED_HEADERS) qemu-img$(EXESUF): qemu-img.o qemu-tool.o qemu-error.o $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) @@ -135,6 +135,8 @@ qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o qemu-error.o $(trace-obj-y) $(block-ob qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) +qemu-vp$(EXESUF): qemu-vp.o virtproxy.o qemu-tool.o qemu-error.o qemu-sockets.c $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y) + qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(call quiet-command,sh $(SRC_PATH)/hxtool -h $ $@, GEN $@) diff --git a/configure b/configure index a079a49..27f92e0 100755 --- a/configure +++ b/configure @@ -2232,6 +2232,7 @@ if test $softmmu = yes ; then tools=qemu-img\$(EXESUF) qemu-io\$(EXESUF) $tools if [ $linux = yes -o $bsd = yes -o $solaris = yes ] ; then tools=qemu-nbd\$(EXESUF) $tools + tools=qemu-vp\$(EXESUF) $tools if [ $check_utests = yes ]; then tools=check-qint check-qstring check-qdict check-qlist $tools tools=check-qfloat check-qjson $tools -- 1.7.0.4
[Qemu-devel] [RFC][PATCH 01/10] virtagent: add common rpc transport defs
Common code for sending/receiving RPCs via http over virtproxy channel.
Eventually these will all be switched to asynchronous handlers to avoid
deadlocks between qemu and the guest. For now we can usually get away with
just doing asynchronous reads for http/RPC responses if we don't send large
RPC requests to the guest (since these will likely get buffered rather than
block or cause spinning on writes).

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtagent-common.c |  381
 virtagent-common.h |   63 +
 2 files changed, 444 insertions(+), 0 deletions(-)
 create mode 100644 virtagent-common.c
 create mode 100644 virtagent-common.h

diff --git a/virtagent-common.c b/virtagent-common.c
new file mode 100644
index 000..51016a6
--- /dev/null
+++ b/virtagent-common.c
@@ -0,0 +1,381 @@
+/*
+ * virt-agent - common host/guest RPC functions
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Adam Litke        agli...@linux.vnet.ibm.com
+ *  Michael Roth      mdr...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+#include <string.h>
+#include <ctype.h>
+#include <xmlrpc-c/base.h>
+#include <xmlrpc-c/client.h>
+#include <xmlrpc-c/server.h>
+#include "virtagent-common.h"
+
+static int write_all(int fd, const void *buf, int len1)
+{
+    int ret, len;
+
+    len = len1;
+    while (len > 0) {
+        ret = write(fd, buf, len);
+        if (ret < 0) {
+            if (errno != EINTR && errno != EAGAIN) {
+                LOG("write() failed");
+                return -1;
+            }
+        } else if (ret == 0) {
+            break;
+        } else {
+            buf += ret;
+            len -= ret;
+        }
+    }
+    return len1 - len;
+}
+
+static int read_all(int fd, void *buf, int len)
+{
+    int ret, remaining;
+
+    remaining = len;
+    while (remaining > 0) {
+        ret = read(fd, buf, remaining);
+        if (ret < 0) {
+            if (errno != EINTR && errno != EAGAIN) {
+                LOG("read failed");
+                return -1;
+            }
+        } else if (ret == 0) {
+            break;
+        } else {
+            buf += ret;
+            remaining -= ret;
+        }
+    }
+    return len - remaining;
+}
+
+static int read_line(int fd, void *buf, int len)
+{
+    int ret, remaining;
+    remaining = len;
+    while (remaining > 0) {
+        ret = read(fd, buf, 1);
+        if (ret < 0) {
+            if (errno != EINTR && errno != EAGAIN) {
+                LOG("read failed");
+                return -1;
+            }
+        } else if (ret == 0) {
+            break;
+        } else {
+            remaining--;
+            buf++;
+            if (*((char *)buf-1) == '\n')
+                break;
+        }
+    }
+    memset(buf, 0, remaining);
+    return len - remaining;
+}
+
+typedef struct va_http {
+    char *preamble;
+    int content_length;
+    int content_read;
+    char *content;
+} va_http;
+
+typedef struct HttpReadState {
+    int fd;
+    char hdr_buf[4096];
+    int hdr_pos;
+    va_http http;
+    enum {
+        IN_RESP_HEADER = 0,
+        IN_RESP_BODY,
+    } state;
+    RPCRequest *rpc_data;
+} HttpReadState;
+
+static int end_of_header(char *buf, int end_pos) {
+    return !strncmp(buf+(end_pos-2), "\n\r\n", 3);
+}
+
+static void parse_hdr(va_http *http, char *buf, int len) {
+    int i, line_pos=0;
+    char line_buf[4096];
+
+    for (i=0; i<len; ++i) {
+        if (buf[i] != '\n') {
+            /* read line */
+            line_buf[line_pos++] = buf[i];
+        } else {
+            /* process line */
+            if (strncasecmp(line_buf, "content-length: ", 16) == 0) {
+                http->content_length = atoi(&line_buf[16]);
+                return;
+            }
+            line_pos = 0;
+        }
+    }
+}
+
+static void http_read_handler(void *opaque) {
+    HttpReadState *s = opaque;
+    int ret;
+
+    TRACE("called with opaque: %p", opaque);
+    if (s->state == IN_RESP_HEADER) {
+        while((ret = read(s->fd, s->hdr_buf + s->hdr_pos, 1)) > 0) {
+            //TRACE("buf: %c", s->hdr_buf[0]);
+            s->hdr_pos += ret;
+            if (end_of_header(s->hdr_buf, s->hdr_pos - 1)) {
+                s->state = IN_RESP_BODY;
+                break;
+            }
+        }
+        if (ret == -1) {
+            if (errno == EAGAIN || errno == EWOULDBLOCK) {
+                return;
+            } else {
+                LOG("error reading connection: %s", strerror(errno));
+                goto out_bad;
+            }
+        } else if (ret == 0) {
+            LOG("connection closed unexpectedly");
+            goto out_bad;
+        } else {
+            s->http.content_length = -1;
+            parse_hdr(&s->http, s->hdr_buf, s->hdr_pos);
+            if (s->http.content_length == -1) {
+                LOG("malformed http header");
+
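A note on write_all() as posted: on EAGAIN it simply loops and calls write() again, so on a non-blocking fd it can spin until the peer drains the buffer, which is the "spinning on writes" the commit message mentions. A minimal sketch of one way around that, assuming POSIX poll() is acceptable here. The name write_full is hypothetical and not part of the patch, and this is not what the series does:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): write all len bytes to fd.
 * Returns the number of bytes written, or -1 on a hard error. */
static ssize_t write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;    /* char pointer, so arithmetic is well defined */
    size_t left = len;

    assert(buf != NULL || len == 0);

    while (left > 0) {
        ssize_t ret = write(fd, p, left);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;                   /* interrupted: just retry */
            }
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                /* non-blocking fd is full: sleep until it is writable */
                struct pollfd pfd = { .fd = fd, .events = POLLOUT };
                if (poll(&pfd, 1, -1) < 0 && errno != EINTR) {
                    return -1;
                }
                continue;
            }
            return -1;                      /* any other error is fatal */
        }
        if (ret == 0) {
            break;                          /* nothing accepted: give up */
        }
        p += ret;
        left -= (size_t)ret;
    }
    return (ssize_t)(len - left);
}
```

Blocking in poll() still stalls the caller, so it only removes the busy loop; the asynchronous handlers mentioned in the commit message would be the real fix for the deadlock case.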
[Qemu-devel] [RFC][PATCH 05/15] virtproxy: add accept handler for communication channel
This accept()'s connections to the socket we told virt-proxy to listen for
the channel connection on and sets the appropriate read handler for the
resulting FD.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   37 +
 1 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index fa17722..20532c2 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -166,6 +166,8 @@ static VPConn *get_conn(const VPDriver *drv, int fd, bool client)
     return NULL;
 }
 
+static void vp_channel_accept(void *opaque);
+
 /* get VPOForward by service_id */
 static VPOForward *get_oforward(const VPDriver *drv, const char *service_id)
 {
@@ -193,3 +195,38 @@ static VPIForward *get_iforward(const VPDriver *drv, const char *service_id)
 
     return NULL;
 }
+
+/* accept handler for communication channel
+ *
+ * accept()s connection to communication channel (for sockets), and sets
+ * up the read handler for resulting FD.
+ */
+static void vp_channel_accept(void *opaque)
+{
+    VPDriver *drv = opaque;
+    struct sockaddr_in saddr;
+    struct sockaddr *addr;
+    socklen_t len;
+    int fd;
+
+    TRACE("called with opaque: %p", drv);
+
+    for(;;) {
+        len = sizeof(saddr);
+        addr = (struct sockaddr *)&saddr;
+        fd = qemu_accept(drv->listen_fd, addr, &len);
+
+        if (fd < 0 && errno != EINTR) {
+            TRACE("accept() failed");
+            return;
+        } else if (fd >= 0) {
+            TRACE("accepted connection");
+            break;
+        }
+    }
+
+    drv->channel_fd = fd;
+    vp_set_fd_handler(drv->channel_fd, vp_channel_read, NULL, drv);
+    /* dont accept anymore connections until channel_fd is closed */
+    vp_set_fd_handler(drv->listen_fd, NULL, NULL, NULL);
+}
-- 
1.7.0.4
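For readers skimming the loop in vp_channel_accept(): the pattern is "retry accept() only when it was interrupted by a signal, bail out on any other error". A standalone sketch of just that pattern, using plain POSIX accept() rather than qemu_accept(). The name accept_retry is hypothetical and does not appear in the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the patch): accept one connection on
 * listen_fd, retrying only when the call was interrupted by a signal.
 * Returns the new fd, or -1 with errno set on any other failure. */
static int accept_retry(int listen_fd)
{
    for (;;) {
        /* peer address is not needed here, so pass NULL for both */
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0) {
            return fd;          /* got a connection */
        }
        if (errno != EINTR) {
            return -1;          /* real error: give up */
        }
        /* EINTR: loop and call accept() again */
    }
}
```

The handler in the patch then installs the read handler on the returned fd and clears the handler on listen_fd, so only one channel connection is served at a time.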
[Qemu-devel] [RFC][PATCH 08/10] virtagent: add agent_viewdmesg command
Add commands to view guest dmesg output. Currently it is a 16K buffer.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 hmp-commands.hx |   16 +
 qmp-commands.hx |   35 +++
 virtagent.c     |  100 +++
 virtagent.h     |    3 ++
 4 files changed, 154 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index e9a7f4a..0e7a6c9 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1227,6 +1227,22 @@ STEXI
 Echo the file identified by @var{filepath} on the guest filesystem
 ETEXI
 
+    {
+        .name       = "agent_viewdmesg",
+        .args_type  = "",
+        .params     = "",
+        .help       = "View guest dmesg output",
+        .user_print = do_agent_viewdmesg_print,
+        .mhandler.cmd_async = do_agent_viewdmesg,
+        .flags      = MONITOR_CMD_ASYNC,
+    },
+
+STEXI
+@item agent_viewdmesg
+@findex agent_viewdmesg
+View guest dmesg output
+ETEXI
+
 STEXI
 @end table
 ETEXI
diff --git a/qmp-commands.hx b/qmp-commands.hx
index efa2137..dc319b7 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -771,6 +771,41 @@ Example:
 EQMP
 
     {
+        .name       = "agent_viewdmesg",
+        .args_type  = "",
+        .params     = "",
+        .help       = "View guest dmesg output",
+        .user_print = do_agent_viewdmesg_print,
+        .mhandler.cmd_async = do_agent_viewdmesg,
+        .flags      = MONITOR_CMD_ASYNC,
+    },
+
+STEXI
+@item agent_viewdmesg
+@findex agent_viewdmesg
+View guest dmesg output
+ETEXI
+SQMP
+agent_viewdmesg
+
+
+View guest dmesg output
+
+Arguments:
+
+(none)
+
+Example:
+
+-> { "execute": "agent_viewdmesg" }
+<- { "return": {
+       "contents": "[353487.942215] usb 1-4: USB disconnect, address 9\n..."
+     }
+   }
+
+EQMP
+
+    {
         .name       = "qmp_capabilities",
         .args_type  = "",
         .params     = "",
diff --git a/virtagent.c b/virtagent.c
index 8059d00..cae9a85 100644
--- a/virtagent.c
+++ b/virtagent.c
@@ -196,3 +196,103 @@ int do_agent_viewfile(Monitor *mon, const QDict *mon_params,
 
     return 0;
 }
+
+void do_agent_viewdmesg_print(Monitor *mon, const QObject *data)
+{
+    QDict *qdict;
+    const char *contents = NULL;
+    int i;
+
+    qdict = qobject_to_qdict(data);
+    if (!qdict_haskey(qdict, "contents")) {
+        return;
+    }
+
+    contents = qdict_get_str(qdict, "contents");
+    if (contents != NULL) {
+         /* monitor_printf truncates so do it in chunks. also, file_contents
+          * may not be null-termed at proper location so explicitly calc
+          * last chunk sizes */
+        for (i = 0; i < strlen(contents); i += 1024) {
+            monitor_printf(mon, "%.1024s", contents + i);
+        }
+    }
+    monitor_printf(mon, "\n");
+}
+
+static void do_agent_viewdmesg_cb(void *opaque)
+{
+    RPCRequest *rpc_data = opaque;
+    xmlrpc_value *resp = NULL;
+    char *dmesg = NULL;
+    int ret;
+    xmlrpc_env env;
+    QDict *qdict = qdict_new();
+
+    if (rpc_data->resp_xml == NULL) {
+        monitor_printf(rpc_data->mon, "error handling RPC request");
+        goto out_no_resp;
+    }
+
+    xmlrpc_env_init(&env);
+    resp = xmlrpc_parse_response(&env, rpc_data->resp_xml,
+                                 rpc_data->resp_xml_len);
+    if (rpc_has_error(&env)) {
+        ret = -1;
+        goto out_no_resp;
+    }
+
+    xmlrpc_parse_value(&env, resp, "s", &dmesg);
+    if (rpc_has_error(&env)) {
+        ret = -1;
+        goto out;
+    }
+
+    if (dmesg != NULL) {
+        qdict_put(qdict, "contents", qstring_from_str(dmesg));
+    }
+
+out:
+    qemu_free(rpc_data->resp_xml);
+    xmlrpc_DECREF(resp);
+out_no_resp:
+    rpc_data->mon_cb(rpc_data->mon_data, QOBJECT(qdict));
+    qemu_free(rpc_data);
+}
+
+/*
+ * do_agent_viewdmesg(): View guest dmesg output
+ */
+int do_agent_viewdmesg(Monitor *mon, const QDict *mon_params,
+                       MonitorCompletion cb, void *opaque)
+{
+    xmlrpc_env env;
+    xmlrpc_value *params;
+    RPCRequest *rpc_data;
+    int ret;
+
+    xmlrpc_env_init(&env);
+
+    params = xmlrpc_build_value(&env, "(n)");
+    if (rpc_has_error(&env)) {
+        return -1;
+    }
+
+    rpc_data = qemu_mallocz(sizeof(RPCRequest));
+    rpc_data->cb = do_agent_viewdmesg_cb;
+    rpc_data->mon = mon;
+    rpc_data->mon_cb = cb;
+    rpc_data->mon_data = opaque;
+
+    ret = rpc_execute(&env, "getdmesg", params, rpc_data);
+    if (ret == -EREMOTE) {
+        monitor_printf(mon, "RPC Failed (%i): %s\n", env.fault_code,
+                       env.fault_string);
+        return -1;
+    } else if (ret == -1) {
+        monitor_printf(mon, "RPC communication error\n");
+        return -1;
+    }
+
+    return 0;
+}
diff --git a/virtagent.h b/virtagent.h
index d027eac..82a6924 100644
--- a/virtagent.h
+++ b/virtagent.h
@@ -25,5 +25,8 @@ void do_agent_viewfile_print(Monitor *mon, const QObject *qobject);
 int do_agent_viewfile(Monitor *mon, const QDict *mon_params,
                       MonitorCompletion cb, void
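On the chunked output in do_agent_viewdmesg_print(): the loop re-evaluates strlen(contents) on every iteration and relies on a fixed "%.1024s" precision. A small standalone sketch of the same idea, using fprintf() as a stand-in for monitor_printf() and computing the length once. The name print_chunked and its parameters are hypothetical, purely for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the monitor_printf() loop in the patch:
 * print text to out in pieces of at most chunk bytes.
 * Returns the number of pieces written. */
static size_t print_chunked(FILE *out, const char *text, size_t chunk)
{
    size_t len, off, pieces = 0;

    assert(out != NULL && text != NULL);
    if (chunk == 0) {
        return 0;               /* avoid an endless loop on a zero step */
    }

    len = strlen(text);         /* computed once, not per iteration */
    for (off = 0; off < len; off += chunk) {
        size_t n = len - off < chunk ? len - off : chunk;
        /* "%.*s" prints at most n bytes starting at text + off */
        fprintf(out, "%.*s", (int)n, text + off);
        pieces++;
    }
    return pieces;
}
```

With a 16K dmesg buffer and 1024-byte chunks, as in the patch, this would make at most 16 calls.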