date:20101022

Re: [Qemu-devel] [PATCH 1/3] Introduce threadlets

2010-10-22 Thread Balbir Singh

* Arun R B a...@linux.vnet.ibm.com [2010-10-21 17:40:45]:

 From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 
 This patch creates a generic asynchronous-task-offloading infrastructure named
 threadlets. The core idea has been borrowed from the threading framework that
 is being used by paio.
 
 The reason for creating this generic infrastructure is so that other 
 subsystems,
 such as virtio-9p could make use of it for offloading tasks that could block.
 
 The patch creates a global queue on-to which subsystems can queue their tasks 
 to
 be executed asynchronously.
 
 The patch also provides API's that allow a subsystem to create a private queue
 with an associated pool of threads.
 
 [...@in.ibm.com: Facelift of the code, Documentation, cancel_threadlet
 and other helpers]
 
 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 Signed-off-by: Gautham R Shenoy gautham.she...@gmail.com
 Signed-off-by: Sripathi Kodi sripat...@in.ibm.com
 Signed-off-by: Arun R Bharadwaj a...@linux.vnet.ibm.com


Acked-by: Balbir Singh bal...@linux.vnet.ibm.com
 
-- 
Three Cheers,
Balbir

[Qemu-devel] [Bug 663713] Re: Mouse frozen under an emulated ubuntu

2010-10-22 Thread FredBezies

Well, looks like the patch was added on 20 october.

http://git.qemu.org/qemu.git/commit/?id=cd496926155afcb3b6323e70dd720dc118b3a255

With a little luck, it would be in Qemu 0.13.1.

Thanks for the URL and the patch.

Have a good day.

-- 
Mouse frozen under an emulated ubuntu
https://bugs.launchpad.net/bugs/663713
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.

Status in QEMU: Incomplete

Bug description:
Qemu 0.13.0

Command line used :

qemu-system-x86_64 --enable-kvm -localtime -soundhw all -k fr -m 1500 -net user 
-net nic,model=rtl8139 -hda disk.img -cdrom ubuntu-10.10-desktop-amd64.iso 
-boot d

When I try to move mouse cursor in qemu, pointer is frozen. Nothing is moving. 
Was working perfectly with Qemu 0.12.5.

Re: [Qemu-devel] [PATCH v9 0/4] Introduce strtosz and make use of it

2010-10-22 Thread Jes Sorensen

On 10/22/10 09:59, Markus Armbruster wrote:
 jes.soren...@redhat.com writes:
 From: Jes Sorensen jes.soren...@redhat.com
 This patch introduces cutils.c: strtosz() and gets rid of the
 multiple custom hacks for parsing byte sizes. In addition it adds
 supports for specifying human style sizes such as 1.5G. Last it
 eliminates the horrible abuse of a float to store the byte size for
 migrate_set_speed in the monitor.

 Note, this is tested on Linux and build tested for win32 using
 mingw32.

 v9: I worked through a couple of revisions directly with Markus and I
 think I got it right finally. 
 
 I'd prefer to have strtosz() to match strtol()  friends and not
 restrict suffixes.  But this code does what it claims to do, as far as I
 can see, so:
 
 ACK series

Thanks!

I thought about this a fair bit and I believe doing the full test in the
function is the most valuable. It's a personal preference obviously.
I think it's a win to do it here since it simplifies the caller code.

Would be great to get this applied so I can get it off my plate :)

Cheers,
Jes

Re: [Qemu-devel] [PATCH 2/2] v2 Fix Block Hotplug race with drive_unplug()

2010-10-22 Thread Daniel P. Berrange

On Thu, Oct 21, 2010 at 04:37:46PM -0500, Ryan Harper wrote:
 * Daniel P. Berrange berra...@redhat.com [2010-10-21 08:29]:
  On Tue, Oct 19, 2010 at 09:32:29AM -0500, Ryan Harper wrote:
   Block hot unplug is racy since the guest is required to acknowlege the 
   ACPI
   unplug event; this may not happen synchronously with the device removal 
   command
   
   This series aims to close a gap where by mgmt applications that assume the
   block resource has been removed without confirming that the guest has
   acknowledged the removal may re-assign the underlying device to a second 
   guest
   leading to data leakage.
   
   This series introduces a new montor command to decouple asynchornous 
   device
   removal from restricting guest access to a block device.  We do this by 
   creating
   a new monitor command drive_unplug which maps to a bdrv_unplug() command 
   which
   does a qemu_aio_flush; bdrv_flush() and bdrv_close().  Once complete, 
   subsequent
   IO is rejected from the device and the guest will get IO errors but 
   continue to
   function.
   
   A subsequent device removal command can be issued to remove the device, 
   to which
   the guest may or maynot respond, but as long as the unplugged bit is set, 
   no IO
   will be sumbitted.
  
  The name 'drive_unplug' suggests to me that the drive object is
  not being deleted/free()d ? Is that correct understanding, and if
  so, what is responsible for finally free()ing the drive backend ?
 
 It's technically the BlockDriverState Driver that we're closing.  To
 fully release the remaining resources, a device_del is required (which
 of course requires guest participation with the current
 interface).
 
 Once QEMU issues the removal request, the guest responds and the piix4
 acpi handler for pciej_write writes invokes qdev_free() on the target
 device.  qdev_free() on the pci device will make it's way to the qdev
 exit handler registered for virtio-blk devices, virtio_blk_exit_pci().
 virtio_blk_exit_pci() marks the drive structure for deletion.  When qdev
 calls the properties handler, it invokes free_drive() on the disk and
 that calls blockdev_auto_del() which will do a bdrv_delete() which nukes
 the remaining objects (the acutal BlockDriverState).
 
 I think I got the whole path in there.

Ok, thanks, that makes sense to me.

Sounds like we do still need a separate drive_del in the future to 
handle the different case of, drive_add, followed by a device_add
attempt which fails.

Regards,
Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Re: [Qemu-devel] [PATCH v9 0/4] Introduce strtosz and make use of it

2010-10-22 Thread Markus Armbruster

jes.soren...@redhat.com writes:

 From: Jes Sorensen jes.soren...@redhat.com

 This patch introduces cutils.c: strtosz() and gets rid of the
 multiple custom hacks for parsing byte sizes. In addition it adds
 supports for specifying human style sizes such as 1.5G. Last it
 eliminates the horrible abuse of a float to store the byte size for
 migrate_set_speed in the monitor.

 Note, this is tested on Linux and build tested for win32 using
 mingw32.

 v9: I worked through a couple of revisions directly with Markus and I
 think I got it right finally. 

I'd prefer to have strtosz() to match strtol()  friends and not
restrict suffixes.  But this code does what it claims to do, as far as I
can see, so:

ACK series

[Qemu-devel] Re: VM with two interfaces

2010-10-22 Thread Stefan Hajnoczi

On Thu, Oct 21, 2010 at 11:49 PM, Nirmal Guhan vavat...@gmail.com wrote:
 On Thu, Oct 21, 2010 at 9:23 AM, Nirmal Guhan vavat...@gmail.com wrote:
 Hi,

 Am trying to create a VM using qemu-kvm with two interfaces(fedora12
 is the host and vm) and running into an issue. Given below is the
 command :

 qemu-kvm -net nic,macaddr=$macaddress,model=pcnet -net
 tap,script=/etc/qemu-ifup -net nic,model=pcnet -net
 tap,script=/etc/qemu-ifup -m 1024 -hda ./vdisk.img -kernel
 ./bzImage-1019 -append ip=x.y.z.w:a.b.c.d:p.q.r.s:a.b.c.d
 ip=x.y.z.u:a.b.c.d:p.q.r.s:a.b.c.d root=/dev/nfs rw
 nfsroot=x.y.z.v:/blahblahblah

 On boot, both eth0 and eth1 come up but the vm tries to send dhcp and
 rarp requests instead of using the command line IP addresses. DHCP
 would fail in my case.

 With just one interface, dhcp is not attempted and nfs mount of root works 
 fine.

 Any clue on what could be wrong here?

 Thanks,
 Nirmal

 Can someone help please? Hard pressed on time... sorry

Try the #qemu or #kvm IRC channels on chat.freenode.net.  Often people
will respond and debug interactively with you there.

Your problem does not seem QEMU/KVM related.  You'll need to debug the
guest's boot process and network configuration just like a physical
machine.  In fact, I bet with an identical setup on a physical machine
you'd see the same problem.

Double-check the kernel parameters documentation.
See if you have a network configuration in the initramfs or the root
filesystem that would cause it to DHCP and how to pull the
pre-configured settings from the kernel.

Stefan

Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Kevin Wolf

Am 21.10.2010 21:32, schrieb Laurent Vivier:
 Le jeudi 21 octobre 2010 à 10:07 -0500, Anthony Liguori a écrit :
 On 10/21/2010 09:07 AM, Kevin Wolf wrote:
 Hi all,

 I'm currently looking into adding a return value to qemu's bdrv_flush
 function and I noticed that your block drivers (nbd, rbd and sheepdog)
 don't implement bdrv_flush at all. bdrv_flush is going to return
 -ENOTSUP for any block driver not implementing this, effectively
 breaking these three drivers for anything but cache=unsafe.

 Is there a specific reason why your drivers don't implement this?

 NBD doesn't have a notion of flush.  Only read/write and the block-nbd 
 implementation doesn't do write-caching so flush would be a nop.

 I'm not sure what the right semantics would be for QEMU.  My guess is a 
 nop flush.
 
 I agree.

Of course, as Laurent said a while ago, there is no specification for
NBD, so it's hard to say what the intended semantics is.

However, I did have a look at the nbdserver code and it looks as if it
implements something similar to writethrough (namely fsync after each
write) only if configured this way on the server side. qemu-nbd defaults
to writethrough, but can be configured to use cache=none. So with either
server qemu as a client can't tell whether the data is safe on disk or not.

In my book this is a strong argument for refusing to open nbd
connections with anything but cache=unsafe.

Kevin

Re: Fwd: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Kevin Wolf

[ Adding qemu-devel to CC again ]

Am 21.10.2010 20:59, schrieb Sage Weil:
 On Thu, 21 Oct 2010, Christian Brunner wrote:
 Hi,

 is there a flush operation in librados? - I guess the only way to
 handle this, would be waiting until all aio requests are finished?

That's not the semantics of bdrv_flush, you don't need to wait for
running requests. You just need to make sure that all completed requests
are safe on disk so that they would persist even in case of a
crash/power failure.

 There is no flush currently.  But librados does no caching, so in this 
 case at least silenting upgrading to cache=writethrough should work.

You're making sure that the data can't be cached in the server's page
cache or volatile disk cache either, e.g. by using O_SYNC for the image
file? If so, upgrading would be safe.

 If that's a problem, we can implement a flush.  Just let us know.

Presumably providing a writeback mode with explicit flushes could
improve performance. Upgrading to writethrough is not a correctness
problem, though, so it's your decision if you want to implement it.

Kevin

 -- Forwarded message --
 From: Kevin Wolf kw...@redhat.com
 Date: 2010/10/21
 Subject: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
 To: Christian Brunner c...@muc.de, Laurent Vivier
 laur...@vivier.eu, MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
 Cc: Qemu-devel@nongnu.org


 Hi all,

 I'm currently looking into adding a return value to qemu's bdrv_flush
 function and I noticed that your block drivers (nbd, rbd and sheepdog)
 don't implement bdrv_flush at all. bdrv_flush is going to return
 -ENOTSUP for any block driver not implementing this, effectively
 breaking these three drivers for anything but cache=unsafe.

 Is there a specific reason why your drivers don't implement this? I
 think I remember that one of the drivers always provides
 cache=writethough semantics. It would be okay to silently upgrade to
 cache=writethrough, so in this case I'd just need to add an empty
 bdrv_flush implementation.

 Otherwise, we really cannot allow any option except cache=unsafe because
 that's the semantics provided by the driver.

 In any case, I think it would be a good idea to implement a real
 bdrv_flush function to allow the write-back cache modes cache=off and
 cache=writeback in order to improve performance over writethrough.

 Is this possible with your protocols, or can the protocol be changed to
 consider this? Any hints on how to proceed?

 Kevin
 --
 To unsubscribe from this list: send the line unsubscribe ceph-devel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Kevin Wolf

Am 22.10.2010 07:43, schrieb MORITA Kazutaka:
 At Thu, 21 Oct 2010 16:07:28 +0200,
 Kevin Wolf wrote:

 Hi all,

 I'm currently looking into adding a return value to qemu's bdrv_flush
 function and I noticed that your block drivers (nbd, rbd and sheepdog)
 don't implement bdrv_flush at all. bdrv_flush is going to return
 -ENOTSUP for any block driver not implementing this, effectively
 breaking these three drivers for anything but cache=unsafe.

 Is there a specific reason why your drivers don't implement this? I
 think I remember that one of the drivers always provides
 cache=writethough semantics. It would be okay to silently upgrade to
 cache=writethrough, so in this case I'd just need to add an empty
 bdrv_flush implementation.

 Otherwise, we really cannot allow any option except cache=unsafe because
 that's the semantics provided by the driver.

 In any case, I think it would be a good idea to implement a real
 bdrv_flush function to allow the write-back cache modes cache=off and
 cache=writeback in order to improve performance over writethrough.

 Is this possible with your protocols, or can the protocol be changed to
 consider this? Any hints on how to proceed?

 
 It is a bit difficult to implement an effective bdrv_flush in the
 sheepdog block driver.  Sheepdog virtual disks are splited and
 distributed to all cluster servers, so the block driver needs to send
 flush requests to all of them.  I'm not sure this could improve
 performance more than writethrough semantics.

It could probably be optimized so that you only send flush requests to
servers that have actually received write requests since the last flush.

But yes, that's probably a valid point. I guess there's only one way to
find out how it performs: Trying it out.

 So I think it is better to support only writethrough semantics
 currently (I'll modify sheepdog server codes to open stored objects
 with O_SYNC or O_DIRECT) and leave write-back semantics as a future
 work.

I agree, that makes sense.

Note that O_DIRECT does not provide write-through semantics. It bypasses
the page cache, but it doesn't flush other caches like a volatile disk
write cache. If you want to use it, you still need explicit flushes or
O_DIRECT | O_SYNC.

Kevin

Re: [Qemu-devel] [PATCH v9 0/4] Introduce strtosz and make use of it

2010-10-22 Thread Markus Armbruster

Jes Sorensen jes.soren...@redhat.com writes:

 On 10/22/10 09:59, Markus Armbruster wrote:
 jes.soren...@redhat.com writes:
 From: Jes Sorensen jes.soren...@redhat.com
 This patch introduces cutils.c: strtosz() and gets rid of the
 multiple custom hacks for parsing byte sizes. In addition it adds
 supports for specifying human style sizes such as 1.5G. Last it
 eliminates the horrible abuse of a float to store the byte size for
 migrate_set_speed in the monitor.

 Note, this is tested on Linux and build tested for win32 using
 mingw32.

 v9: I worked through a couple of revisions directly with Markus and I
 think I got it right finally. 
 
 I'd prefer to have strtosz() to match strtol()  friends and not
 restrict suffixes.  But this code does what it claims to do, as far as I
 can see, so:
 
 ACK series

 Thanks!

 I thought about this a fair bit and I believe doing the full test in the
 function is the most valuable. It's a personal preference obviously.
 I think it's a win to do it here since it simplifies the caller code.

But strtosz() doesn't know what suffixes the caller wants accepted.  It
accepting suffixes starting with whitespace, '\0' and ',' is just a
guess.  Three cases:

1. The guess is exactly right.  strtosz()'s checking saves the caller
the check.

2. It's too lose, caller wants to accept fewer suffixes, e.g. just '\0'.
strtosz()'s checking doesn't save the caller anything.

3. It's too tight, caller wants to accept more suffixes.  Caller needs
to make a copy without the suffix.  In the general case, this requires
duplicating strtosz()'s parsing logic.  It's easier not to use it then.

The uses of strtosz() added by this series all fall into case 2.  The
callers lack the check, which means we accept trailing garbage.  Fine
with me, because it's no worse than before.

 Would be great to get this applied so I can get it off my plate :)

Agreed.

[Qemu-devel] Re: [PATCH v6 08/12] pcie/aer: helper functions for pcie aer capability

2010-10-22 Thread Michael S. Tsirkin

On Fri, Oct 22, 2010 at 08:53:15AM +0900, Isaku Yamahata wrote:
 On Thu, Oct 21, 2010 at 10:07:01AM +0200, Michael S. Tsirkin wrote:
  On Thu, Oct 21, 2010 at 02:15:24PM +0900, Isaku Yamahata wrote:
   Thank you for detailed review.
   
   On Wed, Oct 20, 2010 at 11:56:16AM +0200, Michael S. Tsirkin wrote:
 +static uint32_t aer_log_del(PCIEAERLog *aer_log)
 +{
 +uint32_t i = aer_log-consumer;
 +aer_log-consumer = aer_log_next(aer_log-consumer, 
 aer_log-log_max);
 +return i;
 +}


Please just use 'int' or 'unsigned' instead of uint32_t if you simply
want to say 'a number'.  Using specific length makes it impossible to
say where you *really* want a value.
   
   PCIEAERLog is saved/loaded. So explicit sized number is chosen.
  
  I didn't notice. But consumer/producer are not guest visible, are they?
  If yes I think it's a mistake to save/load them, tying us to
  a specific implementation: just calculate and save the # of
  valid entries in the log.
 
 ACIEAERLog and PCIEAERErr themselves are the implementation specific
 internal states which is not visible to guest.
 So there is no point of arguing that consumer/producer are 
 specific to implementation.

Well the errors can be read out by the guest, so they are
potentially guest visible. Thus I think the only thing
we want to migrate is the list of errors logged.

  Also I put the comment here but it is a general one: pls go over the
  code, and just take a look at what types you use all over and think
  whether size really matters. In most places it does not, it just must be
  big enough, so use int or unsigned there.  It will be much harder for
  others to do so later.
 
 Will do.
 -- 
 yamahata

Re: [Qemu-devel] v6: [PATCH 0/3]: Threadlets: A generic task offloading framework

2010-10-22 Thread Amit Shah

On (Wed) Oct 20 2010 [08:18:51], Anthony Liguori wrote:
 On 10/20/2010 07:05 AM, Stefan Hajnoczi wrote:
 On Wed, Oct 20, 2010 at 12:57 PM, Amit Shahamit.s...@redhat.com  wrote:
 On (Tue) Oct 19 2010 [23:12:20], Arun R Bharadwaj wrote:
 Hi,
 
 This is the v6 of the patch-series to have a generic asynchronous task
 offloading framework (called threadlets) within qemu.
 
 Request to consider pulling this series as discussed during the
 Qemu-devel call.
 I tried this out with virtio-serial (patch below).  Have a couple of
 things to note:
 
 - Guests get a SIGUSR2 on startup sometimes.  This doesn't happen with
   qemu.git, so looks like it's introduced by this patchset.
 
 - After running some tests, I get an abort.  I still have to look at
   what's causing it, but doesn't look like it's related to virtio-serial
   code.
 
 Program received signal SIGABRT, Aborted.
 0x003dc76329a5 in raise () from /lib64/libc.so.6
 Missing separate debuginfos, use: debuginfo-install
 SDL-1.2.14-8.fc13.x86_64 glibc-2.12.1-2.x86_64
 libX11-1.3.1-3.fc13.x86_64 libXau-1.0.5-1.fc12.x86_64
 libpng-1.2.44-1.fc13.x86_64 libxcb-1.5-1.fc13.x86_64
 ncurses-libs-5.7-7.20100130.fc13.x86_64 zlib-1.2.3-23.fc12.x86_64
 (gdb) bt
 #0  0x003dc76329a5 in raise () from /lib64/libc.so.6
 #1  0x003dc7634185 in abort () from /lib64/libc.so.6
 #2  0x004bf829 in qemu_get_ram_ptr (addr=value optimized out)
 at /home/amit/src/qemu/exec.c:2936
 #3  0x004bf9a7 in lduw_phys (addr=value optimized out) at
 /home/amit/src/qemu/exec.c:3836
 #4  0x00557c90 in vring_avail_idx (vq=0x17b9320, idx=1333) at
 /home/amit/src/qemu/hw/virtio.c:133
 #5  virtqueue_num_heads (vq=0x17b9320, idx=1333) at
 /home/amit/src/qemu/hw/virtio.c:252
 #6  0x00557e5e in virtqueue_avail_bytes (vq=0x17b9320,
 in_bytes=4096, out_bytes=0) at /home/amit/src/qemu/hw/virtio.c:311
 
 - I'm using a threadlet to queue up several work items which are to be
   processed in a fifo order.  There's no cancel function for a threadlet
   that either processes all work and then quits the thread or just
   cancels all pending work and quits.
 
 Amit
 
 
 diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
 index 74ba5ec..caaafbe 100644
 --- a/hw/virtio-serial-bus.c
 +++ b/hw/virtio-serial-bus.c
 @@ -51,6 +51,14 @@ struct VirtIOSerial {
  struct virtio_console_config config;
   };
 
 +typedef struct VirtIOSerialWork {
 +ThreadletWork work;
 +VirtIOSerialPort *port;
 +VirtQueue *vq;
 +VirtIODevice *vdev;
 +int discard;
 +} VirtIOSerialWork;
 +
   static VirtIOSerialPort *find_port_by_id(VirtIOSerial *vser, uint32_t id)
   {
  VirtIOSerialPort *port;
 @@ -113,10 +121,20 @@ static size_t write_to_port(VirtIOSerialPort *port,
  return offset;
   }
 
 -static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
 - VirtIODevice *vdev, bool discard)
 +static void async_flush_queued_data(ThreadletWork *work)
   {
 +VirtIOSerialPort *port;
 +VirtIOSerialWork *vs_work;
 +VirtQueue *vq;
 +VirtIODevice *vdev;
  VirtQueueElement elem;
 +int discard;
 +
 +vs_work = DO_UPCAST(VirtIOSerialWork, work, work);
 +port = vs_work-port;
 +vq = vs_work-vq;
 +vdev = vs_work-vdev;
 +discard = vs_work-discard;
 
  assert(port || discard);
  assert(virtio_queue_ready(vq));
 You cannot access guest memory using QEMU RAM functions (or use the
 virtqueue_pop() function which uses them) from a thread without taking
 the QEMU global mutex.
 
 The abort stack trace is a result of accessing guest RAM from two
 threads simultaneously.
 
 In general it is not safe to use QEMU functions from a thread unless
 they are explicitly written to work outside the QEMU global mutex.
 Most functions assume the global mutex, which serializes I/O thread
 and vcpu changes to global state, is held.
 
 Yes, threadlets are only meant to be used to make synchronous system
 calls asynchronous.  They are not meant to add parallelism to QEMU
 (yet).

Yes -- I realised that.  (But I don't see why the virtio rings get
modified elsewhere other than the only function I push/pop from above.)

Anyway, just one question as I've still not read the code: does a
running work item in a threadlet block migration?  Do the remaining work
items in a threadlet get migrated fine?

Amit

[Qemu-devel] Re: [PATCH 2/2] Replace remaining gcc format attributes by macro GCC_FMT_ATTR (format checking)

2010-10-22 Thread Stefan Weil


Am 13.10.2010 20:54, schrieb Stefan Weil:

Replace the remaining format attribute printf by macro
GCC_FMT_ATTR which uses gnu_printf (if supported).

v2
* Removal of dyngen specific code is now done in a separate patch.
* Handle attribute in new ui/spice-display.c, too.

Cc: Blue Swirlblauwir...@gmail.com
Signed-off-by: Stefan Weilw...@mail.berlios.de
---
  cpu-all.h  |2 +-
  ui/spice-display.c |3 +--
  2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 67a3266..11edddc 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -773,7 +773,7 @@ void cpu_dump_statistics (CPUState *env, FILE *f,
int flags);

  void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
-__attribute__ ((__format__ (__printf__, 2, 3)));
+GCC_FMT_ATTR(2, 3);
  extern CPUState *first_cpu;
  extern CPUState *cpu_single_env;

diff --git a/ui/spice-display.c b/ui/spice-display.c
index 6702dfd..7b4f5c1 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -29,8 +29,7 @@

  static int debug = 0;

-static void __attribute__((format(printf,2,3)))
-dprint(int level, const char *fmt, ...)
+static void GCC_FMT_ATTR(2, 3) dprint(int level, const char *fmt, ...)
  {
  va_list args;

   



Can these two patches be applied to qemu master, or is there still 
something missing?

They are needed for additional format checks.

Regards,
Stefan

Re: [Qemu-devel] [PATCH 2/2] v2 Fix Block Hotplug race with drive_unplug()

2010-10-22 Thread Kevin Wolf

Am 21.10.2010 23:37, schrieb Ryan Harper:
 * Daniel P. Berrange berra...@redhat.com [2010-10-21 08:29]:
 On Tue, Oct 19, 2010 at 09:32:29AM -0500, Ryan Harper wrote:
 Block hot unplug is racy since the guest is required to acknowlege the ACPI
 unplug event; this may not happen synchronously with the device removal 
 command

 This series aims to close a gap where by mgmt applications that assume the
 block resource has been removed without confirming that the guest has
 acknowledged the removal may re-assign the underlying device to a second 
 guest
 leading to data leakage.

 This series introduces a new montor command to decouple asynchornous device
 removal from restricting guest access to a block device.  We do this by 
 creating
 a new monitor command drive_unplug which maps to a bdrv_unplug() command 
 which
 does a qemu_aio_flush; bdrv_flush() and bdrv_close().  Once complete, 
 subsequent
 IO is rejected from the device and the guest will get IO errors but 
 continue to
 function.

 A subsequent device removal command can be issued to remove the device, to 
 which
 the guest may or maynot respond, but as long as the unplugged bit is set, 
 no IO
 will be sumbitted.

 The name 'drive_unplug' suggests to me that the drive object is
 not being deleted/free()d ? Is that correct understanding, and if
 so, what is responsible for finally free()ing the drive backend ?
 
 It's technically the BlockDriverState Driver that we're closing.  To
 fully release the remaining resources, a device_del is required (which
 of course requires guest participation with the current
 interface).

So is this basically what blockdev_del is supposed to become one day?

Copying Markus to have a look at this. I'm sure he has some thoughts on
it as he was planning to implement blockdev_add/del.

Kevin

 
 Once QEMU issues the removal request, the guest responds and the piix4
 acpi handler for pciej_write writes invokes qdev_free() on the target
 device.  qdev_free() on the pci device will make it's way to the qdev
 exit handler registered for virtio-blk devices, virtio_blk_exit_pci().
 virtio_blk_exit_pci() marks the drive structure for deletion.  When qdev
 calls the properties handler, it invokes free_drive() on the disk and
 that calls blockdev_auto_del() which will do a bdrv_delete() which nukes
 the remaining objects (the acutal BlockDriverState).
 
 I think I got the whole path in there.

Re: [Qemu-devel] Re: [PATCH v6 08/12] pcie/aer: helper functions for pcie aer capability

2010-10-22 Thread Markus Armbruster

Michael S. Tsirkin m...@redhat.com writes:

 On Wed, Oct 20, 2010 at 05:18:57PM +0900, Isaku Yamahata wrote:
 This patch implements helper functions for pcie aer capability
 which will be used later.
 
 Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp

 Some style comments and a couple of minor bugs.
[...]
 +
 +static int aer_log_add_err(PCIEAERLog *aer_log, const PCIEAERErr *err)
 +{
 +uint32_t i;
 +if (aer_log_full(aer_log)) {
 +return -1;
 +}
 +i = aer_log_add(aer_log);
 +memcpy(aer_log-log[i], err, sizeof(*err));

 sizeof is not a function, you don't need ()

Matter of taste.  Yes, it's an operator, but it's as function-like as
operators get.

For what it's worth, grep says roughly 98% of our sizeof uses are
followed by '('.  I doubt the parenthesis are required for grouping that
often.

[...]

Re: [Qemu-devel] [PATCH]: VNC: set listener socket to non-blocking mode

2010-10-22 Thread Gianni Tedesco

On Thu, 2010-10-21 at 17:44 +0100, Anthony Liguori wrote:
 On 10/21/2010 11:18 AM, Gianni Tedesco wrote:
  This prevents qemu from hanging waiting for a client to connect. I have
  reproduced this when doing a loadvm but it may be a more general problem
  in that poll/accept may race if a client aborts the connection with a
  RST before the accept has completed. In either case the fix seems
  harmless.
 
  Signed-off-by: Gianni Tedescogianni.tede...@citrix.com
 
 
 I'd feel a little better with explicit error handling in the accept() 
 path and a comment explaining why this was necessary.
 
 But otherwise, nice catch!

Thanks!

Not sure what you mean about the explicit error handling in the accept()
path though? Accept can fail for a few reasons, the most worrying is
EMFILE which easily leads to livelock. Poll reports listener as ready to
accept but accept fails because we maxed out the fd table so go back to
poll and the listener is still ready to accept etc... It's not always
clear how to handle specific accept() errors.

I think only calling vnc_connect() when we have an fd is the best we can
hope for right now.

Unless I misunderstood?

-8-8-8-

diff --git a/ui/vnc.c b/ui/vnc.c
index 864342e..65dc55c 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2736,5 +2736,9 @@ int vnc_display_open(DisplayState *ds, const char 
*display)
 vs-display = dpy;
 }
 }
+/* necessary to prevent accept() hanging indefinitely if a clients
+ * connection-reset wins the race between poll() and accept()
+ */
+socket_set_nonblock(vs-lsock);
 return qemu_set_fd_handler2(vs-lsock, NULL, vnc_listen_read, NULL, vs);
 }

Re: [Qemu-devel] Re: [PATCH v6 07/12] pcie/hotplug: introduce pushing attention button command

2010-10-22 Thread Markus Armbruster

Michael S. Tsirkin m...@redhat.com writes:

 On Wed, Oct 20, 2010 at 05:18:56PM +0900, Isaku Yamahata wrote:
 glue pcie_push_attention_button command.
 
 Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp

 So as a high level command, I think we need to
 think about how to tie this into pci_add/pci_del.
 Right?
[...]

Do we have consensus how our set of commands for hot plug should look
like?  We talked about it, but did we reach consensus?  If yes, did we
write it down somewhere?

[Qemu-devel] Re: [RESEND PATCH] monitor: properly handle illegal fd/vhostfd from command line

2010-10-22 Thread Luiz Capitulino

On Thu, 21 Oct 2010 19:29:02 +0800
Jason Wang jasow...@redhat.com wrote:

 When hanlding fd/vhostfd form command line through net_handle_fd_param(),
 we need to check mon and return value of strtol() other than we could
 get segmentation fault or invalid fd when user type an illegal fd/vhostfd.
 
 This patch is based on the suggestions from
 Luiz Capitulino lcapitul...@redhat.com.
 
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
  net.c |   12 +---
  1 files changed, 9 insertions(+), 3 deletions(-)
 
 diff --git a/net.c b/net.c
 index ed74c7f..ab9c3bb 100644
 --- a/net.c
 +++ b/net.c
 @@ -774,8 +774,8 @@ int qemu_find_nic_model(NICInfo *nd, const char * const 
 *models,
  
  int net_handle_fd_param(Monitor *mon, const char *param)
  {
 -if (!qemu_isdigit(param[0])) {
 -int fd;
 +int fd;

Better to add a space here.

 +if (!qemu_isdigit(param[0])  mon) {
  
  fd = monitor_get_fd(mon, param);
  if (fd == -1) {
 @@ -785,7 +785,13 @@ int net_handle_fd_param(Monitor *mon, const char *param)
  
  return fd;
  } else {
 -return strtol(param, NULL, 0);
 +char *endptr = NULL;
 +
 +fd = strtol(param, endptr, 10);
 +if (*endptr || (fd == 0  param == endptr)) {
 +return -1;
 +}
 +return fd;
  }

You can put 'return fd' here and drop the two above, also this is not
a monitor patch, it's a networking one.

Otherwise looks ok to me.

  }

Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)

2010-10-22 Thread Luiz Capitulino

On Thu, 21 Oct 2010 15:27:23 +0200
Alon Levy al...@redhat.com wrote:

 On Thu, Oct 21, 2010 at 08:13:19AM -0500, Anthony Liguori wrote:
  On 10/21/2010 08:03 AM, Gerd Hoffmann wrote:
  On 10/21/10 08:36, Alon Levy wrote:
  v2-v3 changes:
* add configure parameter
* fix docs
  
  v2 message:
  This patchset uses id like device_del for attaching/detaching usb
  devices. The first two patches ready the way:
1. makes qdev_find_recursive non static and in qdev.h
2. adds a usb_device_by_id which goes over the usb buses calling
 qdev_find_recursive
3. adds the commands that use usb_device_by_id
  
  Alon Levy (3):
 qdev: make qdev_find_recursive public
 usb: add public usb_device_by_id
 monitor: add usb_attach and usb_detach (v2)
  
  
  Acked-by: Gerd Hoffmann kra...@redhat.com
  
  Okay, I am still confused about the use-case for this and I don't
  see any further explanation in the commit messages.  I've seen
  debugging but can you be a bit more specific about which cases
  it's needed for?
 
 To elaborate a little more, when using a certificates based card
 there is no hardware event (i.e. removing/inserting the physical card)
 that causes a usb_detach/attach to the card (both in passthru and
 emulated), but otoh certificates is good for testing since it decouples
 it from NSS/tcp. So I needed some way to emulate an insert/remove, and
 I saw usb_del, which was pretty close, and voila. This is not the same
 as card remove/reinsert, but it is exactly what will happen to the
 guest when spicec connects/disconnects, since I detach devices on
 disconnect and attach on connect.

Looks reasonable to me, specially because this will be protected by
#ifdef DEBUG. I don't see a big deal in merging this.

Objections, Anthony?

  This is just adding a HMP command.  Is that the right approach or
  was that an unintentional consequence of rebasing post-HMP/QMP
  split?

I don't think this should be available under QMP, it's more a debugging
command for USB developers.

  
  Regards,
  
  Anthony Liguori
  
  
  cheers,
Gerd

Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)

2010-10-22 Thread Luiz Capitulino

On Fri, 22 Oct 2010 07:55:02 -0500
Anthony Liguori anth...@codemonkey.ws wrote:

 On 10/22/2010 07:48 AM, Luiz Capitulino wrote:
  On Thu, 21 Oct 2010 15:27:23 +0200
  Alon Levyal...@redhat.com  wrote:
 
 
  On Thu, Oct 21, 2010 at 08:13:19AM -0500, Anthony Liguori wrote:
   
  On 10/21/2010 08:03 AM, Gerd Hoffmann wrote:
 
  On 10/21/10 08:36, Alon Levy wrote:
   
  v2-v3 changes:
* add configure parameter
* fix docs
 
  v2 message:
  This patchset uses id like device_del for attaching/detaching usb
  devices. The first two patches ready the way:
1. makes qdev_find_recursive non static and in qdev.h
2. adds a usb_device_by_id which goes over the usb buses calling
 qdev_find_recursive
3. adds the commands that use usb_device_by_id
 
  Alon Levy (3):
 qdev: make qdev_find_recursive public
 usb: add public usb_device_by_id
 monitor: add usb_attach and usb_detach (v2)
 
 
  Acked-by: Gerd Hoffmannkra...@redhat.com
   
  Okay, I am still confused about the use-case for this and I don't
  see any further explanation in the commit messages.  I've seen
  debugging but can you be a bit more specific about which cases
  it's needed for?
 
  To elaborate a little more, when using a certificates based card
  there is no hardware event (i.e. removing/inserting the physical card)
  that causes a usb_detach/attach to the card (both in passthru and
  emulated), but otoh certificates is good for testing since it decouples
  it from NSS/tcp. So I needed some way to emulate an insert/remove, and
  I saw usb_del, which was pretty close, and voila. This is not the same
  as card remove/reinsert, but it is exactly what will happen to the
  guest when spicec connects/disconnects, since I detach devices on
  disconnect and attach on connect.
   
  Looks reasonable to me, specially because this will be protected by
  #ifdef DEBUG. I don't see a big deal in merging this.
 
 
 I'd just like to see better documentation.  A command isn't useful for 
 debugging if noone knows how to use it.
 
 Guarding with an #ifdef isn't necessary.  It should be unconditionally 
 enabled otherwise it will bit rot.

You and Gerd asked about the purpose of this command, turns out that it's
only useful for developing new USB devices for QEMU, so I thought it would
be better to restrict it, so that people don't start using this the
wrong way or worse, we can't drop/break it because some tool is now using it.

Anyway, maybe a good doc will do, this was just a small suggestion.

 
  Objections, Anthony?
 
 
 Not with better docs.
 
 Regards,
 
 Anthony Liguori
 
  This is just adding a HMP command.  Is that the right approach or
  was that an unintentional consequence of rebasing post-HMP/QMP
  split?
 
  I don't think this should be available under QMP, it's more a debugging
  command for USB developers.
 
 
  Regards,
 
  Anthony Liguori
 
 
  cheers,
Gerd

Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)

2010-10-22 Thread Anthony Liguori


On 10/22/2010 07:48 AM, Luiz Capitulino wrote:

On Thu, 21 Oct 2010 15:27:23 +0200
Alon Levyal...@redhat.com  wrote:

   

On Thu, Oct 21, 2010 at 08:13:19AM -0500, Anthony Liguori wrote:
 

On 10/21/2010 08:03 AM, Gerd Hoffmann wrote:
   

On 10/21/10 08:36, Alon Levy wrote:
 

v2-v3 changes:
  * add configure parameter
  * fix docs

v2 message:
This patchset uses id like device_del for attaching/detaching usb
devices. The first two patches ready the way:
  1. makes qdev_find_recursive non static and in qdev.h
  2. adds a usb_device_by_id which goes over the usb buses calling
   qdev_find_recursive
  3. adds the commands that use usb_device_by_id

Alon Levy (3):
   qdev: make qdev_find_recursive public
   usb: add public usb_device_by_id
   monitor: add usb_attach and usb_detach (v2)

   

Acked-by: Gerd Hoffmannkra...@redhat.com
 

Okay, I am still confused about the use-case for this and I don't
see any further explanation in the commit messages.  I've seen
debugging but can you be a bit more specific about which cases
it's needed for?
   

To elaborate a little more, when using a certificates based card
there is no hardware event (i.e. removing/inserting the physical card)
that causes a usb_detach/attach to the card (both in passthru and
emulated), but otoh certificates is good for testing since it decouples
it from NSS/tcp. So I needed some way to emulate an insert/remove, and
I saw usb_del, which was pretty close, and voila. This is not the same
as card remove/reinsert, but it is exactly what will happen to the
guest when spicec connects/disconnects, since I detach devices on
disconnect and attach on connect.
 

Looks reasonable to me, specially because this will be protected by
#ifdef DEBUG. I don't see a big deal in merging this.
   


I'd just like to see better documentation.  A command isn't useful for 
debugging if noone knows how to use it.


Guarding with an #ifdef isn't necessary.  It should be unconditionally 
enabled otherwise it will bit rot.



Objections, Anthony?
   


Not with better docs.

Regards,

Anthony Liguori


This is just adding a HMP command.  Is that the right approach or
was that an unintentional consequence of rebasing post-HMP/QMP
split?
   

I don't think this should be available under QMP, it's more a debugging
command for USB developers.

   

Regards,

Anthony Liguori

   

cheers,
  Gerd

Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Anthony Liguori


On 10/22/2010 03:29 AM, Kevin Wolf wrote:

I agree.
 

Of course, as Laurent said a while ago, there is no specification for
NBD, so it's hard to say what the intended semantics is.

However, I did have a look at the nbdserver code and it looks as if it
implements something similar to writethrough (namely fsync after each
write) only if configured this way on the server side. qemu-nbd defaults
to writethrough, but can be configured to use cache=none. So with either
server qemu as a client can't tell whether the data is safe on disk or not.

In my book this is a strong argument for refusing to open nbd
connections with anything but cache=unsafe.
   


On a physical system, if you don't have a battery backed disk and you 
enable the WC on your disk, then even with cache=writethrough we're unsafe.


Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe.

QEMU can't guarantee safety.  The underlying storage needs to be 
configured correctly.  As long as we're not introducing caching within 
QEMU, I don't think we should assume we're unsafe.


Do we have any place where we can add docs on a per-block format basis?  
It would be good to at least mention for each block device how the 
backing storage needed to be configured for safety.


Regards,

Anthony Liguori


Kevin

Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/

2010-10-22 Thread Luiz Capitulino

On Thu, 21 Oct 2010 10:18:34 +0200
Paolo Bonzini pbonz...@redhat.com wrote:

 This is a small start from the autoconfy configure series.  Actually,
 this part is almost completely new.  It cleans up the tests Makefile
 so that the i386-linux-user testsuite's Makefile targets are more
 easily extensible and can work wherever a compiler for i386 is
 installed.  In the future this could be extended to ARM and
 MIPS targets.

This is unrelated to this series, but something that would be very good to
have would be to move all check- programs to tests/ (or check/) and make them
part of 'make test' or introduce a new 'make check'.


 
 Paolo Bonzini (6):
   unbreak make from tests directory
   unbreak make from vpath-built tests directory
   disable test_enter on i386, it is broken
   make runcom compile on recent distributions
   fix test_path
   rewrite i386 tests Makefile
 
  configure |5 ++
  rules.mak |9 
  tests/Makefile|  123 ++--
  tests/runcom.c|   11 ++---
  tests/test-i386.c |5 ++
  tests/test_path.c |   13 +-
  6 files changed, 114 insertions(+), 52 deletions(-)

Re: [Qemu-devel] [PATCH 0/7] [PULL] qemu-kvm.git uq/master queue

2010-10-22 Thread Anthony Liguori


On 10/21/2010 01:25 PM, Marcelo Tosatti wrote:

The following changes since commit 633aa0acfe2c4d3e56acfe28c912796bf54de6d3:

   Fix pci hotplug to generate level triggered interrupt. (2010-10-20 17:23:28 
-0500)
   


Pulled.  Thanks.

Regards,

Anthony Liguori


are available in the git repository at:
   git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

Hidetoshi Seto (3):
   x86, mce: ignore SRAO only when MCG_SER_P is available
   x86, mce: broadcast mce depending on the cpu version
   Fix build on !KVM_CAP_MCE

Marcelo Tosatti (4):
   kvm: add save/restore of MSR_VM_HSAVE_PA
   kvm: factor out kvm_has_msr_star
   kvm: writeback SMP TSCs on migration only
   kvm: save/restore x86-64 MSRs on x86-64 kernels

  target-i386/kvm.c |  132 +++-
  1 files changed, 99 insertions(+), 33 deletions(-)

Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Kevin Wolf

Am 22.10.2010 14:58, schrieb Anthony Liguori:
 On 10/22/2010 03:29 AM, Kevin Wolf wrote:
 I agree.
  
 Of course, as Laurent said a while ago, there is no specification for
 NBD, so it's hard to say what the intended semantics is.

 However, I did have a look at the nbdserver code and it looks as if it
 implements something similar to writethrough (namely fsync after each
 write) only if configured this way on the server side. qemu-nbd defaults
 to writethrough, but can be configured to use cache=none. So with either
 server qemu as a client can't tell whether the data is safe on disk or not.

 In my book this is a strong argument for refusing to open nbd
 connections with anything but cache=unsafe.

 
 On a physical system, if you don't have a battery backed disk and you 
 enable the WC on your disk, then even with cache=writethrough we're unsafe.

I don't think that's right. O_SYNC should guarantee that the volatile
disk cache is flushed.

 Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe.

Yeah, if you do something equivalent to cache=unsafe on a lower layer,
then qemu can't do much about it. Maybe you can apply the same argument
to NBD, even though it's unsafe by default.

 QEMU can't guarantee safety.  The underlying storage needs to be 
 configured correctly.  As long as we're not introducing caching within 
 QEMU, I don't think we should assume we're unsafe.
 
 Do we have any place where we can add docs on a per-block format basis?  
 It would be good to at least mention for each block device how the 
 backing storage needed to be configured for safety.

docs/block-protocols.txt?

Kevin

Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/

2010-10-22 Thread Paolo Bonzini


On 10/22/2010 03:26 PM, Luiz Capitulino wrote:

On Thu, 21 Oct 2010 10:18:34 +0200
Paolo Bonzinipbonz...@redhat.com  wrote:


This is a small start from the autoconfy configure series.  Actually,
this part is almost completely new.  It cleans up the tests Makefile
so that the i386-linux-user testsuite's Makefile targets are more
easily extensible and can work wherever a compiler for i386 is
installed.  In the future this could be extended to ARM and
MIPS targets.


This is unrelated to this series, but something that would be very good to
have would be to move all check- programs to tests/ (or check/) and make them
part of 'make test' or introduce a new 'make check'.


Right, I thought about that too.  The main hurdle is that: 1) right now 
almost every test in make test is failing; 2) the testsuite run by 
make test is very noisy, does not create logs, etc.  If we could use 
autotest... :)


We could also have check-i386, check-qmp, etc. and check would simply 
call all of them.


Paolo

[Qemu-devel] [PULL 00/11] Block patches

2010-10-22 Thread Kevin Wolf

The following changes since commit d03703c81a202cea156811e5dbc8e88627c19986:

  curses: Fix contro...@[\]^_} and ESC (2010-10-21 18:31:28 +0200)

are available in the git repository at:
  git://repo.or.cz/qemu/kevin.git for-anthony

Christoph Hellwig (1):
  ide: set WCACHE supported in IDENTIFY data

Kevin Wolf (7):
  qcow2: Simplify image creation
  qcow2: Remove old image creation function
  qemu-io: New command map
  qemu-img: Fix qemu-img convert -obacking_file
  ide: Factor ide_flush_cache out
  ide: Handle flush failure
  virtio-blk: Respect werror option for flushes

Stefan Hajnoczi (1):
  qcow2: Support exact L1 table growth

Stefan Weil (1):
  block: Use GCC_FMT_ATTR and fix a format error

edison (1):
  Copy snapshots out of QCOW2 disk

 block.c|   16 +++
 block.h|2 +
 block/blkverify.c  |5 +-
 block/qcow2-cluster.c  |   25 +++--
 block/qcow2-snapshot.c |   33 ++-
 block/qcow2.c  |  278 
 block/qcow2.h  |3 +-
 block_int.h|2 +
 hw/ide/core.c  |   28 --
 hw/ide/internal.h  |3 +-
 hw/virtio-blk.c|8 ++-
 qemu-img-cmds.hx   |4 +-
 qemu-img.c |   26 +-
 qemu-img.texi  |4 +-
 qemu-io.c  |   39 +++
 15 files changed, 264 insertions(+), 212 deletions(-)

[Qemu-devel] [PATCH 02/11] qcow2: Simplify image creation

2010-10-22 Thread Kevin Wolf

Instead of doing lots of magic for setting up initial refcount blocks and stuff
create a minimal (inconsistent) image, open it and initialize the rest with
regular qcow2 functions.

This is a complete rewrite of the image creation function. The old
implementating is #ifdef'd out and will be removed by the next patch (removing
it here would have made the diff unreadable because diff tries to find
similarities when it's really a rewrite)

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |  133 -
 1 files changed, 132 insertions(+), 1 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 345d2e4..3b2e38c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -27,6 +27,7 @@
 #include zlib.h
 #include aes.h
 #include block/qcow2.h
+#include qemu-error.h
 
 /*
   Differences with QCOW:
@@ -794,6 +795,7 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
 return qcow2_update_ext_header(bs, backing_file, backing_fmt);
 }
 
+#if 0
 static int get_bits_from_size(size_t size)
 {
 int res = 0;
@@ -814,6 +816,7 @@ static int get_bits_from_size(size_t size)
 
 return res;
 }
+#endif
 
 
 static int preallocate(BlockDriverState *bs)
@@ -869,6 +872,7 @@ static int preallocate(BlockDriverState *bs)
 return 0;
 }
 
+#if 0
 static int qcow_create2(const char *filename, int64_t total_size,
 const char *backing_file, const char *backing_format,
 int flags, size_t cluster_size, int prealloc)
@@ -1066,6 +1070,133 @@ exit:
 
 return ret;
 }
+#else
+static int qcow_create2(const char *filename, int64_t total_size,
+const char *backing_file, const char *backing_format,
+int flags, size_t cluster_size, int prealloc,
+QEMUOptionParameter *options)
+{
+/* Calulate cluster_bits */
+int cluster_bits;
+cluster_bits = ffs(cluster_size) - 1;
+if (cluster_bits  MIN_CLUSTER_BITS || cluster_bits  MAX_CLUSTER_BITS ||
+(1  cluster_bits) != cluster_size)
+{
+error_report(
+Cluster size must be a power of two between %d and %dk\n,
+1  MIN_CLUSTER_BITS, 1  (MAX_CLUSTER_BITS - 10));
+return -EINVAL;
+}
+
+/*
+ * Open the image file and write a minimal qcow2 header.
+ *
+ * We keep things simple and start with a zero-sized image. We also
+ * do without refcount blocks or a L1 table for now. We'll fix the
+ * inconsistency later.
+ *
+ * We do need a refcount table because growing the refcount table means
+ * allocating two new refcount blocks - the seconds of which would be at
+ * 2 GB for 64k clusters, and we don't want to have a 2 GB initial file
+ * size for any qcow2 image.
+ */
+BlockDriverState* bs;
+QCowHeader header;
+uint8_t* refcount_table;
+int ret;
+
+ret = bdrv_create_file(filename, options);
+if (ret  0) {
+return ret;
+}
+
+ret = bdrv_file_open(bs, filename, BDRV_O_RDWR);
+if (ret  0) {
+return ret;
+}
+
+/* Write the header */
+memset(header, 0, sizeof(header));
+header.magic = cpu_to_be32(QCOW_MAGIC);
+header.version = cpu_to_be32(QCOW_VERSION);
+header.cluster_bits = cpu_to_be32(cluster_bits);
+header.size = cpu_to_be64(0);
+header.l1_table_offset = cpu_to_be64(0);
+header.l1_size = cpu_to_be32(0);
+header.refcount_table_offset = cpu_to_be64(cluster_size);
+header.refcount_table_clusters = cpu_to_be32(1);
+
+if (flags  BLOCK_FLAG_ENCRYPT) {
+header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES);
+} else {
+header.crypt_method = cpu_to_be32(QCOW_CRYPT_NONE);
+}
+
+ret = bdrv_pwrite(bs, 0, header, sizeof(header));
+if (ret  0) {
+goto out;
+}
+
+/* Write an empty refcount table */
+refcount_table = qemu_mallocz(cluster_size);
+ret = bdrv_pwrite(bs, cluster_size, refcount_table, cluster_size);
+qemu_free(refcount_table);
+
+if (ret  0) {
+goto out;
+}
+
+bdrv_close(bs);
+
+/*
+ * And now open the image and make it consistent first (i.e. increase the
+ * refcount of the cluster that is occupied by the header and the refcount
+ * table)
+ */
+BlockDriver* drv = bdrv_find_format(qcow2);
+assert(drv != NULL);
+ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv);
+if (ret  0) {
+goto out;
+}
+
+ret = qcow2_alloc_clusters(bs, 2 * cluster_size);
+if (ret  0) {
+goto out;
+
+} else if (ret != 0) {
+error_report(Huh, first cluster in empty image is already in use?);
+abort();
+}
+
+/* Okay, now that we have a valid image, let's give it the right size */
+ret = bdrv_truncate(bs, total_size * BDRV_SECTOR_SIZE);
+if (ret  0) {
+goto out;
+}
+
+/* Want a backing file? There you go.*/
+if (backing_file) {
+

[Qemu-devel] [PATCH 03/11] qcow2: Remove old image creation function

2010-10-22 Thread Kevin Wolf

They have been #ifdef'd out by the previous patch.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2.c |  224 -
 1 files changed, 0 insertions(+), 224 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3b2e38c..45ed073 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -795,30 +795,6 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
 return qcow2_update_ext_header(bs, backing_file, backing_fmt);
 }
 
-#if 0
-static int get_bits_from_size(size_t size)
-{
-int res = 0;
-
-if (size == 0) {
-return -1;
-}
-
-while (size != 1) {
-/* Not a power of two */
-if (size  1) {
-return -1;
-}
-
-size = 1;
-res++;
-}
-
-return res;
-}
-#endif
-
-
 static int preallocate(BlockDriverState *bs)
 {
 uint64_t nb_sectors;
@@ -872,205 +848,6 @@ static int preallocate(BlockDriverState *bs)
 return 0;
 }
 
-#if 0
-static int qcow_create2(const char *filename, int64_t total_size,
-const char *backing_file, const char *backing_format,
-int flags, size_t cluster_size, int prealloc)
-{
-
-int fd, header_size, backing_filename_len, l1_size, i, shift, l2_bits;
-int ref_clusters, reftable_clusters, backing_format_len = 0;
-int rounded_ext_bf_len = 0;
-QCowHeader header;
-uint64_t tmp, offset;
-uint64_t old_ref_clusters;
-QCowCreateState s1, *s = s1;
-QCowExtension ext_bf = {0, 0};
-int ret;
-
-memset(s, 0, sizeof(*s));
-
-fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
-if (fd  0)
-return -errno;
-memset(header, 0, sizeof(header));
-header.magic = cpu_to_be32(QCOW_MAGIC);
-header.version = cpu_to_be32(QCOW_VERSION);
-header.size = cpu_to_be64(total_size * 512);
-header_size = sizeof(header);
-backing_filename_len = 0;
-if (backing_file) {
-if (backing_format) {
-ext_bf.magic = QCOW_EXT_MAGIC_BACKING_FORMAT;
-backing_format_len = strlen(backing_format);
-ext_bf.len = backing_format_len;
-rounded_ext_bf_len = (sizeof(ext_bf) + ext_bf.len + 7)  ~7;
-header_size += rounded_ext_bf_len;
-}
-header.backing_file_offset = cpu_to_be64(header_size);
-backing_filename_len = strlen(backing_file);
-header.backing_file_size = cpu_to_be32(backing_filename_len);
-header_size += backing_filename_len;
-}
-
-/* Cluster size */
-s-cluster_bits = get_bits_from_size(cluster_size);
-if (s-cluster_bits  MIN_CLUSTER_BITS ||
-s-cluster_bits  MAX_CLUSTER_BITS)
-{
-fprintf(stderr, Cluster size must be a power of two between 
-%d and %dk\n,
-1  MIN_CLUSTER_BITS,
-1  (MAX_CLUSTER_BITS - 10));
-return -EINVAL;
-}
-s-cluster_size = 1  s-cluster_bits;
-
-header.cluster_bits = cpu_to_be32(s-cluster_bits);
-header_size = (header_size + 7)  ~7;
-if (flags  BLOCK_FLAG_ENCRYPT) {
-header.crypt_method = cpu_to_be32(QCOW_CRYPT_AES);
-} else {
-header.crypt_method = cpu_to_be32(QCOW_CRYPT_NONE);
-}
-l2_bits = s-cluster_bits - 3;
-shift = s-cluster_bits + l2_bits;
-l1_size = (((total_size * 512) + (1LL  shift) - 1)  shift);
-offset = align_offset(header_size, s-cluster_size);
-s-l1_table_offset = offset;
-header.l1_table_offset = cpu_to_be64(s-l1_table_offset);
-header.l1_size = cpu_to_be32(l1_size);
-offset += align_offset(l1_size * sizeof(uint64_t), s-cluster_size);
-
-/* count how many refcount blocks needed */
-
-#define NUM_CLUSTERS(bytes) \
-(((bytes) + (s-cluster_size) - 1) / (s-cluster_size))
-
-ref_clusters = NUM_CLUSTERS(NUM_CLUSTERS(offset) * sizeof(uint16_t));
-
-do {
-uint64_t image_clusters;
-old_ref_clusters = ref_clusters;
-
-/* Number of clusters used for the refcount table */
-reftable_clusters = NUM_CLUSTERS(ref_clusters * sizeof(uint64_t));
-
-/* Number of clusters that the whole image will have */
-image_clusters = NUM_CLUSTERS(offset) + ref_clusters
-+ reftable_clusters;
-
-/* Number of refcount blocks needed for the image */
-ref_clusters = NUM_CLUSTERS(image_clusters * sizeof(uint16_t));
-
-} while (ref_clusters != old_ref_clusters);
-
-s-refcount_table = qemu_mallocz(reftable_clusters * s-cluster_size);
-
-s-refcount_table_offset = offset;
-header.refcount_table_offset = cpu_to_be64(offset);
-header.refcount_table_clusters = cpu_to_be32(reftable_clusters);
-offset += (reftable_clusters * s-cluster_size);
-s-refcount_block_offset = offset;
-
-for (i=0; i  ref_clusters; i++) {
-s-refcount_table[i] = cpu_to_be64(offset);
-offset += s-cluster_size;
-}
-
-s-refcount_block = qemu_mallocz(ref_clusters * s-cluster_size);
-
-

[Qemu-devel] [PATCH 04/11] ide: set WCACHE supported in IDENTIFY data

2010-10-22 Thread Kevin Wolf

From: Christoph Hellwig h...@lst.de

ATA does not only have the WCACHE enabled bit in identify word 85, but also
a WCACHE supported bit in word 82.  While the Linux kernel is fine with the
latter at least hdparm also needs the former before correctly displaying
the cache settings.  There's also a non-zero chance other operating systems
are more picky in their volatile write cache detection.

Signed-off-by: Christoph Hellwig h...@lst.de
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/core.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 06b6e14..5ccb09c 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -146,8 +146,8 @@ static void ide_identify(IDEState *s)
 put_le16(p + 68, 120);
 put_le16(p + 80, 0xf0); /* ata3 - ata6 supported */
 put_le16(p + 81, 0x16); /* conforms to ata5 */
-/* 14=NOP supported, 0=SMART supported */
-put_le16(p + 82, (1  14) | 1);
+/* 14=NOP supported, 5=WCACHE supported, 0=SMART supported */
+put_le16(p + 82, (1  14) | (1  5) | 1);
 /* 13=flush_cache_ext,12=flush_cache,10=lba48 */
 put_le16(p + 83, (1  14) | (1  13) | (1 12) | (1  10));
 /* 14=set to 1, 1=SMART self test, 0=SMART error logging */
-- 
1.7.2.3

[Qemu-devel] [PATCH 01/11] qcow2: Support exact L1 table growth

2010-10-22 Thread Kevin Wolf

From: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

The L1 table grow operation includes a size calculation that bumps up
the new L1 table size in order to anticipate the size needs of vmstate
data.  This helps reduce the number of times that the L1 table has to be
grown when vmstate data is appended.

This size overhead is not necessary during image creation,
bdrv_truncate(), or snapshot goto operations.  In fact, existing
qemu-iotests that exercise table growth are no longer able to trigger it
because image creation preallocates an L1 table that is too large after
changes to qcow_create2().

This patch keeps the size calculation but also adds exact growth for
callers that do not want to inflate the L1 table size unnecessarily.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-cluster.c  |   25 -
 block/qcow2-snapshot.c |2 +-
 block/qcow2.c  |2 +-
 block/qcow2.h  |2 +-
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index fb4224a..4f7dc59 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -28,7 +28,7 @@
 #include block_int.h
 #include block/qcow2.h
 
-int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
+int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
 {
 BDRVQcowState *s = bs-opaque;
 int new_l1_size, new_l1_size2, ret, i;
@@ -36,15 +36,22 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size)
 int64_t new_l1_table_offset;
 uint8_t data[12];
 
-new_l1_size = s-l1_size;
-if (min_size = new_l1_size)
+if (min_size = s-l1_size)
 return 0;
-if (new_l1_size == 0) {
-new_l1_size = 1;
-}
-while (min_size  new_l1_size) {
-new_l1_size = (new_l1_size * 3 + 1) / 2;
+
+if (exact_size) {
+new_l1_size = min_size;
+} else {
+/* Bump size up to reduce the number of times we have to grow */
+new_l1_size = s-l1_size;
+if (new_l1_size == 0) {
+new_l1_size = 1;
+}
+while (min_size  new_l1_size) {
+new_l1_size = (new_l1_size * 3 + 1) / 2;
+}
 }
+
 #ifdef DEBUG_ALLOC2
 printf(grow l1_table from %d to %d\n, s-l1_size, new_l1_size);
 #endif
@@ -550,7 +557,7 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t 
offset,
 
 l1_index = offset  (s-l2_bits + s-cluster_bits);
 if (l1_index = s-l1_size) {
-ret = qcow2_grow_l1_table(bs, l1_index + 1);
+ret = qcow2_grow_l1_table(bs, l1_index + 1, false);
 if (ret  0) {
 return ret;
 }
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index bbfcaaa..8dd5df0 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -327,7 +327,7 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char 
*snapshot_id)
 if (qcow2_update_snapshot_refcount(bs, s-l1_table_offset, s-l1_size, -1) 
 0)
 goto fail;
 
-if (qcow2_grow_l1_table(bs, sn-l1_size)  0)
+if (qcow2_grow_l1_table(bs, sn-l1_size, true)  0)
 goto fail;
 
 s-l1_size = sn-l1_size;
diff --git a/block/qcow2.c b/block/qcow2.c
index ee3481b..345d2e4 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1154,7 +1154,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t 
offset)
 }
 
 new_l1_size = size_to_l1(s, offset);
-ret = qcow2_grow_l1_table(bs, new_l1_size);
+ret = qcow2_grow_l1_table(bs, new_l1_size, true);
 if (ret  0) {
 return ret;
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index 356a34a..add710b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -188,7 +188,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res);
 
 /* qcow2-cluster.c functions */
-int qcow2_grow_l1_table(BlockDriverState *bs, int min_size);
+int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
 void qcow2_l2_cache_reset(BlockDriverState *bs);
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
 void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,
-- 
1.7.2.3

[Qemu-devel] [PATCH 05/11] Copy snapshots out of QCOW2 disk

2010-10-22 Thread Kevin Wolf

From: edison edi...@cloud.com

In order to backup snapshots, created from QCOW2 iamge, we want to copy 
snapshots out of QCOW2 disk to a seperate storage.
The following patch adds a new option in qemu-img: qemu-img convert -f qcow2 
-O qcow2 -s snapshot_name src_img bck_img.
Right now, it only supports to copy the full snapshot, delta snapshot is on the 
way.

Changes from V1: all the comments from Kevin are addressed:
Add read-only checking
Fix coding style
Change the name from bdrv_snapshot_load to bdrv_snapshot_load_tmp

Signed-off-by: Disheng Su edi...@cloud.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c|   16 
 block.h|2 ++
 block/qcow2-snapshot.c |   31 +++
 block/qcow2.c  |1 +
 block/qcow2.h  |1 +
 block_int.h|2 ++
 qemu-img-cmds.hx   |4 ++--
 qemu-img.c |   19 ++-
 qemu-img.texi  |4 ++--
 9 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index a19374d..985d0b7 100644
--- a/block.c
+++ b/block.c
@@ -1899,6 +1899,22 @@ int bdrv_snapshot_list(BlockDriverState *bs,
 return -ENOTSUP;
 }
 
+int bdrv_snapshot_load_tmp(BlockDriverState *bs,
+const char *snapshot_name)
+{
+BlockDriver *drv = bs-drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (!bs-read_only) {
+return -EINVAL;
+}
+if (drv-bdrv_snapshot_load_tmp) {
+return drv-bdrv_snapshot_load_tmp(bs, snapshot_name);
+}
+return -ENOTSUP;
+}
+
 #define NB_SUFFIXES 4
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size)
diff --git a/block.h b/block.h
index 5f64380..a4facf2 100644
--- a/block.h
+++ b/block.h
@@ -211,6 +211,8 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 int bdrv_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
 int bdrv_snapshot_list(BlockDriverState *bs,
QEMUSnapshotInfo **psn_info);
+int bdrv_snapshot_load_tmp(BlockDriverState *bs,
+   const char *snapshot_name);
 char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn);
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size);
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 8dd5df0..aacf357 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -418,3 +418,34 @@ int qcow2_snapshot_list(BlockDriverState *bs, 
QEMUSnapshotInfo **psn_tab)
 return s-nb_snapshots;
 }
 
+int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name)
+{
+int i, snapshot_index, l1_size2;
+BDRVQcowState *s = bs-opaque;
+QCowSnapshot *sn;
+
+snapshot_index = find_snapshot_by_id_or_name(bs, snapshot_name);
+if (snapshot_index  0) {
+return -ENOENT;
+}
+
+sn = s-snapshots[snapshot_index];
+s-l1_size = sn-l1_size;
+l1_size2 = s-l1_size * sizeof(uint64_t);
+if (s-l1_table != NULL) {
+qemu_free(s-l1_table);
+}
+
+s-l1_table_offset = sn-l1_table_offset;
+s-l1_table = qemu_mallocz(align_offset(l1_size2, 512));
+
+if (bdrv_pread(bs-file, sn-l1_table_offset,
+   s-l1_table, l1_size2) != l1_size2) {
+return -1;
+}
+
+for(i = 0;i  s-l1_size; i++) {
+be64_to_cpus(s-l1_table[i]);
+}
+return 0;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index 45ed073..b816d87 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1286,6 +1286,7 @@ static BlockDriver bdrv_qcow2 = {
 .bdrv_snapshot_goto = qcow2_snapshot_goto,
 .bdrv_snapshot_delete   = qcow2_snapshot_delete,
 .bdrv_snapshot_list = qcow2_snapshot_list,
+.bdrv_snapshot_load_tmp = qcow2_snapshot_load_tmp,
 .bdrv_get_info = qcow_get_info,
 
 .bdrv_save_vmstate= qcow_save_vmstate,
diff --git a/block/qcow2.h b/block/qcow2.h
index add710b..2d22e5e 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -211,6 +211,7 @@ int qcow2_snapshot_create(BlockDriverState *bs, 
QEMUSnapshotInfo *sn_info);
 int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id);
 int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id);
 int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab);
+int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name);
 
 void qcow2_free_snapshots(BlockDriverState *bs);
 int qcow2_read_snapshots(BlockDriverState *bs);
diff --git a/block_int.h b/block_int.h
index e8e7156..87e60b8 100644
--- a/block_int.h
+++ b/block_int.h
@@ -93,6 +93,8 @@ struct BlockDriver {
 int (*bdrv_snapshot_delete)(BlockDriverState *bs, const char *snapshot_id);
 int (*bdrv_snapshot_list)(BlockDriverState *bs,
   QEMUSnapshotInfo **psn_info);
+int (*bdrv_snapshot_load_tmp)(BlockDriverState *bs,
+  const char *snapshot_name);
 int (*bdrv_get_info)(BlockDriverState *bs,

Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)

2010-10-22 Thread Anthony Liguori


On 10/22/2010 08:17 AM, Luiz Capitulino wrote:

You and Gerd asked about the purpose of this command, turns out that it's
only useful for developing new USB devices for QEMU, so I thought it would
be better to restrict it, so that people don't start using this the
wrong way or worse, we can't drop/break it because some tool is now using it.
   


Need HMP commands have no support associated with them.  Given QMP, 
there's no reason to script HMP commands.


But a long standing policy in QEMU has been to avoid conditional code 
because it results in dead code.  I think this policy has actually 
worked well for us historically.


Regards,

Anthony Liguori


Anyway, maybe a good doc will do, this was just a small suggestion.

   
 

Objections, Anthony?

   

Not with better docs.

Regards,

Anthony Liguori

 

This is just adding a HMP command.  Is that the right approach or
was that an unintentional consequence of rebasing post-HMP/QMP
split?

   

I don't think this should be available under QMP, it's more a debugging
command for USB developers.


   

Regards,

Anthony Liguori


   

cheers,
   Gerd

[Qemu-devel] [PATCH 06/11] qemu-io: New command map

2010-10-22 Thread Kevin Wolf

The new map command in qemu-io lists all allocated/unallocated areas in an
image file.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-io.c |   39 +++
 1 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index b4e5cc8..ff353eb 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -1443,6 +1443,44 @@ static const cmdinfo_t alloc_cmd = {
 };
 
 static int
+map_f(int argc, char **argv)
+{
+   int64_t offset;
+   int64_t nb_sectors;
+   char s1[64];
+   int num, num_checked;
+   int ret;
+   const char *retstr;
+
+   offset = 0;
+   nb_sectors = bs-total_sectors;
+
+   do {
+   num_checked = MIN(nb_sectors, INT_MAX);
+   ret = bdrv_is_allocated(bs, offset, num_checked, num);
+   retstr = ret ? allocated : not allocated;
+   cvtstr(offset  9ULL, s1, sizeof(s1));
+   printf([% 24 PRId64 ] % 8d/% 8d sectors %s at offset %s 
(%d)\n,
+   offset  9ULL, num, num_checked, retstr, s1, 
ret);
+
+   offset += num;
+   nb_sectors -= num;
+   } while(offset  bs-total_sectors);
+
+   return 0;
+}
+
+static const cmdinfo_t map_cmd = {
+   .name   = map,
+   .argmin = 0,
+   .argmax = 0,
+   .cfunc  = map_f,
+   .args   = ,
+   .oneline= prints the allocated areas of a file,
+};
+
+
+static int
 close_f(int argc, char **argv)
 {
bdrv_close(bs);
@@ -1680,6 +1718,7 @@ int main(int argc, char **argv)
add_command(length_cmd);
add_command(info_cmd);
add_command(alloc_cmd);
+   add_command(map_cmd);
 
add_args_command(init_args_command);
add_check_command(init_check_command);
-- 
1.7.2.3

[Qemu-devel] [PATCH 10/11] ide: Handle flush failure

2010-10-22 Thread Kevin Wolf

Instead of always assuming success for bdrv_aio_flush, actually do something
with the error. This respects the werror option and accordingly ignores the
error, reports it to the guest or stops the VM and retries after cont.

Ignoring the error is trivial, obviously. For stopping the VM and retrying
later old code can be reused, but we need to introduce a new status for retry
a flush. For reporting to the guest, fortunately the same action is required
as for a failed read/write (status = DRDY | ERR, error = ABRT), so this code
can be reused as well.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/core.c |   10 +-
 hw/ide/internal.h |3 ++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 6d8606e..bc3e916 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -65,6 +65,7 @@ static void ide_dma_start(IDEState *s, 
BlockDriverCompletionFunc *dma_cb);
 static void ide_dma_restart(IDEState *s, int is_read);
 static void ide_atapi_cmd_read_dma_cb(void *opaque, int ret);
 static int ide_handle_rw_error(IDEState *s, int error, int op);
+static void ide_flush_cache(IDEState *s);
 
 static void padstr(char *str, const char *src, int len)
 {
@@ -688,6 +689,8 @@ static void ide_dma_restart_bh(void *opaque)
 } else {
 ide_sector_write(bmdma_active_if(bm));
 }
+} else if (bm-status  BM_STATUS_RETRY_FLUSH) {
+ide_flush_cache(bmdma_active_if(bm));
 }
 }
 
@@ -795,7 +798,12 @@ static void ide_flush_cb(void *opaque, int ret)
 {
 IDEState *s = opaque;
 
-/* XXX: how do we signal I/O errors here? */
+if (ret  0) {
+/* XXX: What sector number to set here? */
+if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
+return;
+}
+}
 
 s-status = READY_STAT | SEEK_STAT;
 ide_set_irq(s-bus);
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index 4165543..d652e06 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -472,7 +472,8 @@ struct IDEDeviceInfo {
 #define BM_STATUS_INT0x04
 #define BM_STATUS_DMA_RETRY  0x08
 #define BM_STATUS_PIO_RETRY  0x10
-#define BM_STATUS_RETRY_READ 0x20
+#define BM_STATUS_RETRY_READ  0x20
+#define BM_STATUS_RETRY_FLUSH 0x40
 
 #define BM_CMD_START 0x01
 #define BM_CMD_READ  0x08
-- 
1.7.2.3

[Qemu-devel] [PATCH 09/11] ide: Factor ide_flush_cache out

2010-10-22 Thread Kevin Wolf

The next patch reuses this code, so put it in its own function.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/ide/core.c |   14 ++
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 5ccb09c..6d8606e 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -801,6 +801,15 @@ static void ide_flush_cb(void *opaque, int ret)
 ide_set_irq(s-bus);
 }
 
+static void ide_flush_cache(IDEState *s)
+{
+if (s-bs) {
+bdrv_aio_flush(s-bs, ide_flush_cb, s);
+} else {
+ide_flush_cb(s, 0);
+}
+}
+
 static inline void cpu_to_ube16(uint8_t *buf, int val)
 {
 buf[0] = val  8;
@@ -2031,10 +2040,7 @@ void ide_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 break;
 case WIN_FLUSH_CACHE:
 case WIN_FLUSH_CACHE_EXT:
-if (s-bs)
-bdrv_aio_flush(s-bs, ide_flush_cb, s);
-else
-ide_flush_cb(s, 0);
+ide_flush_cache(s);
 break;
 case WIN_STANDBY:
 case WIN_STANDBY2:
-- 
1.7.2.3

[Qemu-devel] [PATCH 07/11] block: Use GCC_FMT_ATTR and fix a format error

2010-10-22 Thread Kevin Wolf

From: Stefan Weil w...@mail.berlios.de

Adding the gcc format attribute detects a format bug
which is fixed here.

v2:
Don't use type cast. BDRV_SECTOR_SIZE is unsigned long long,
so %lld should be the correct format specifier.

Cc: Blue Swirl blauwir...@gmail.com
Cc: Kevin Wolf kw...@redhat.com
Signed-off-by: Stefan Weil w...@mail.berlios.de
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/blkverify.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blkverify.c b/block/blkverify.c
index 8083464..b2a12fe 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -53,7 +53,8 @@ static AIOPool blkverify_aio_pool = {
 .cancel = blkverify_aio_cancel,
 };
 
-static void blkverify_err(BlkverifyAIOCB *acb, const char *fmt, ...)
+static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
+ const char *fmt, ...)
 {
 va_list ap;
 
@@ -299,7 +300,7 @@ static void blkverify_verify_readv(BlkverifyAIOCB *acb)
 {
 ssize_t offset = blkverify_iovec_compare(acb-qiov, acb-raw_qiov);
 if (offset != -1) {
-blkverify_err(acb, contents mismatch in sector %ld,
+blkverify_err(acb, contents mismatch in sector %lld,
   acb-sector_num + (offset / BDRV_SECTOR_SIZE));
 }
 }
-- 
1.7.2.3

Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Anthony Liguori


On 10/22/2010 08:35 AM, Kevin Wolf wrote:

Am 22.10.2010 14:58, schrieb Anthony Liguori:
   

On 10/22/2010 03:29 AM, Kevin Wolf wrote:
 

I agree.

 

Of course, as Laurent said a while ago, there is no specification for
NBD, so it's hard to say what the intended semantics is.

However, I did have a look at the nbdserver code and it looks as if it
implements something similar to writethrough (namely fsync after each
write) only if configured this way on the server side. qemu-nbd defaults
to writethrough, but can be configured to use cache=none. So with either
server qemu as a client can't tell whether the data is safe on disk or not.

In my book this is a strong argument for refusing to open nbd
connections with anything but cache=unsafe.

   

On a physical system, if you don't have a battery backed disk and you
enable the WC on your disk, then even with cache=writethrough we're unsafe.
 

I don't think that's right. O_SYNC should guarantee that the volatile
disk cache is flushed.
   


If your filesystem does the right thing which an awful lot of them don't 
today.



Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe.
 

Yeah, if you do something equivalent to cache=unsafe on a lower layer,
then qemu can't do much about it. Maybe you can apply the same argument
to NBD, even though it's unsafe by default.

   

QEMU can't guarantee safety.  The underlying storage needs to be
configured correctly.  As long as we're not introducing caching within
QEMU, I don't think we should assume we're unsafe.

Do we have any place where we can add docs on a per-block format basis?
It would be good to at least mention for each block device how the
backing storage needed to be configured for safety.
 

docs/block-protocols.txt?
   


Maybe docs/block/name.txt?  Would be a good home for the qed spec too.

Regards,

Anthony Liguori


Kevin

[Qemu-devel] [PATCH 08/11] qemu-img: Fix qemu-img convert -obacking_file

2010-10-22 Thread Kevin Wolf

The old -B option caused a backing file to be used for the converted image and
to avoid copying clusters from the old backing file. When replaced with
-obacking_file, qemu-img convert does assign the backing file to the new image,
but it doesn't realize that it should avoid copying clusters from the backing
file.

This patch checks the -o options for a backing_file and applies the same logic
as for -B in this case.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-img.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index d4a3b4e..2864cb8 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -645,6 +645,7 @@ static int img_convert(int argc, char **argv)
 const uint8_t *buf1;
 BlockDriverInfo bdi;
 QEMUOptionParameter *param = NULL, *create_options = NULL;
+QEMUOptionParameter *out_baseimg_param;
 char *options = NULL;
 const char *snapshot_name = NULL;
 
@@ -769,6 +770,12 @@ static int img_convert(int argc, char **argv)
 goto out;
 }
 
+/* Get backing file name if -o backing_file was used */
+out_baseimg_param = get_option_parameter(param, BLOCK_OPT_BACKING_FILE);
+if (out_baseimg_param) {
+out_baseimg = out_baseimg_param-value.s;
+}
+
 /* Check if compression is supported */
 if (flags  BLOCK_FLAG_COMPRESS) {
 QEMUOptionParameter *encryption =
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH 1/2] usb-ccid: add CCID bus

2010-10-22 Thread Markus Armbruster

Alon Levy al...@redhat.com writes:

 A CCID device is a smart card reader. It is a USB device, defined at [1].
 This patch introduces the usb-ccid device that is a ccid bus. Next patches 
 will
 introduce two card types to use it, a passthru card and an emulated card.

  [1] http://www.usb.org/developers/devclass_docs/DWG_Smart-Card_CCID_Rev110.

 Signed-off-by: Alon Levy al...@redhat.com
 ---
  Makefile.objs |1 +
  configure |   12 +
  hw/ccid.h |   34 ++
  hw/usb-ccid.c | 1349 
 +
  4 files changed, 1396 insertions(+), 0 deletions(-)
  create mode 100644 hw/ccid.h
  create mode 100644 hw/usb-ccid.c
[...]
 diff --git a/hw/usb-ccid.c b/hw/usb-ccid.c
 new file mode 100644
 index 000..a7b4c3f
 --- /dev/null
 +++ b/hw/usb-ccid.c
 @@ -0,0 +1,1349 @@
[...]
 +struct CCIDBus {
 +BusState qbus;
 +USBCCIDState *ccid;

Where does ccid point to?

 +};
[...]
 +static int ccid_initfn(USBDevice *dev)
 +{
 +USBCCIDState *s = DO_UPCAST(USBCCIDState, dev, dev);
 +
 +s-bus = ccid_bus_new(dev-qdev);
 +s-card = NULL;
 +s-cardinfo = NULL;
 +s-bus-ccid = s;

Looks like it points back to the device providing the bus.  Why not use
bus-qbus-parent ?

 +s-migration_state = MIGRATION_NONE;
 +dev-auto_attach = s-auto_attach;
 +debug = s-debug;

Wait a sec!  Each CCID device has its own property debug (defined
below), but they all copy to the same static debug on initialization.
In other words, the device initialized last wins.  Ugh.

 +s-migration_target_ip = 0;
 +s-migration_target_port = 0;
 +s-dev.speed = USB_SPEED_FULL;
 +s-notify_slot_change = false;
 +s-powered = true;
 +s-pending_answers_num = 0;
 +s-last_answer_error = 0;
 +s-bulk_in_pending_start = 0;
 +s-bulk_in_pending_end = 0;
 +s-current_bulk_in = NULL;
 +ccid_reset_error_status(s);
 +s-bulk_out_pos = 0;
 +ccid_reset_parameters(s);
 +ccid_reset(s);
 +return 0;
 +}
[...]
 +static struct USBDeviceInfo ccid_info = {
 +.product_desc   = QEMU USB CCID,
 +.qdev.name  = CCID_DEV_NAME,
 +.qdev.size  = sizeof(USBCCIDState),
 +.qdev.vmsd  = ccid_vmstate,
 +.init   = ccid_initfn,
 +.handle_packet  = usb_generic_handle_packet,
 +.handle_reset   = ccid_handle_reset,
 +.handle_control = ccid_handle_control,
 +.handle_data= ccid_handle_data,
 +.handle_destroy = ccid_handle_destroy,
 +.usbdevice_name = ccid,
 +.qdev.props = (Property[]) {
 +DEFINE_PROP_UINT8(auto_attach, USBCCIDState, auto_attach, 0),
 +DEFINE_PROP_UINT8(debug, USBCCIDState, debug, 0),
 +DEFINE_PROP_END_OF_LIST(),
 +},
 +};
 +
 +
 +static void ccid_register_devices(void)
 +{
 +usb_qdev_register(ccid_info);
 +}
 +device_init(ccid_register_devices)

Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Anthony Liguori


On 10/22/2010 08:57 AM, Kevin Wolf wrote:

Am 22.10.2010 15:45, schrieb Anthony Liguori:
   

On a physical system, if you don't have a battery backed disk and you
enable the WC on your disk, then even with cache=writethrough we're unsafe.

 

I don't think that's right. O_SYNC should guarantee that the volatile
disk cache is flushed.

   

If your filesystem does the right thing which an awful lot of them don't
today.
 

The list of really relevant filesystems is rather short, though.

   

Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe.

 

Yeah, if you do something equivalent to cache=unsafe on a lower layer,
then qemu can't do much about it. Maybe you can apply the same argument
to NBD, even though it's unsafe by default.


   

QEMU can't guarantee safety.  The underlying storage needs to be
configured correctly.  As long as we're not introducing caching within
QEMU, I don't think we should assume we're unsafe.

Do we have any place where we can add docs on a per-block format basis?
It would be good to at least mention for each block device how the
backing storage needed to be configured for safety.

 

docs/block-protocols.txt?

   

Maybe docs/block/name.txt?  Would be a good home for the qed spec too.
 

I think spec and documentation for users should be kept separate. I
thought that's the reason why docs/specs/ exists.

And if you exclude specs, I'm not sure if there's a lot left to say for
each format. Having ten files under docs/block/ which consist of two
lines each would be ridiculous. If contrary to my expectations we
actually do have content for it, docs/block/name.txt works for me as well.
   


Okay, sounds reasonable.

Regards,

Anthony Liguori


Kevin

[Qemu-devel] [PATCH 1/4] Trivial fix for QMP/qmp-events.txt

2010-10-22 Thread Luiz Capitulino

From: Hidetoshi Seto seto.hideto...@jp.fujitsu.com

Fix example of STOP event that was just copy-and-pasted.

Signed-off-by: Hidetoshi Seto seto.hideto...@jp.fujitsu.com
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 QMP/qmp-events.txt |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
index 01ec85f..aa20210 100644
--- a/QMP/qmp-events.txt
+++ b/QMP/qmp-events.txt
@@ -89,7 +89,7 @@ Data: None.
 
 Example:
 
-{ event: SHUTDOWN,
+{ event: STOP,
 timestamp: { seconds: 1267041730, microseconds: 281295 } }
 
 VNC_CONNECTED
-- 
1.7.3.1.127.g1bb28

Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/

2010-10-22 Thread Luiz Capitulino

On Fri, 22 Oct 2010 15:38:34 +0200
Paolo Bonzini pbonz...@redhat.com wrote:

 On 10/22/2010 03:26 PM, Luiz Capitulino wrote:
  On Thu, 21 Oct 2010 10:18:34 +0200
  Paolo Bonzinipbonz...@redhat.com  wrote:
 
  This is a small start from the autoconfy configure series.  Actually,
  this part is almost completely new.  It cleans up the tests Makefile
  so that the i386-linux-user testsuite's Makefile targets are more
  easily extensible and can work wherever a compiler for i386 is
  installed.  In the future this could be extended to ARM and
  MIPS targets.
 
  This is unrelated to this series, but something that would be very good to
  have would be to move all check- programs to tests/ (or check/) and make 
  them
  part of 'make test' or introduce a new 'make check'.
 
 Right, I thought about that too.  The main hurdle is that: 1) right now 
 almost every test in make test is failing;

Yeah, just saw that.

 2) the testsuite run by 
 make test is very noisy, does not create logs, etc.  If we could use 
 autotest... :)
 
 We could also have check-i386, check-qmp, etc. and check would simply 
 call all of them.

That would be perfect, I think. I mean, if we could stick only to check
to do all our unit-testing, then it would be easier to setup autotest
around it.

But we'll have to port current tests to it, of course.

Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/

2010-10-22 Thread Luiz Capitulino

On Fri, 22 Oct 2010 15:53:39 +0200
Paolo Bonzini pbonz...@redhat.com wrote:

 On 10/22/2010 03:44 PM, Luiz Capitulino wrote:
We could also have check-i386, check-qmp, etc. and check would simply
call all of them.
 
  That would be perfect, I think. I mean, if we could stick only to check
  to do all our unit-testing
 
 Well, the current make test is more integration testing than unit 
 testing, so I'm not sure check can work for that.  But autotest can, 
 which is why I mentioned it.

Ah, got it.

[Qemu-devel] [PATCH 2/4] Silence compiler warning in json test case

2010-10-22 Thread Luiz Capitulino

From: Jan Kiszka jan.kis...@web.de

This avoids

error: zero-length gnu_printf format string

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 check-qjson.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/check-qjson.c b/check-qjson.c
index 0b60e45..64fcdcb 100644
--- a/check-qjson.c
+++ b/check-qjson.c
@@ -639,7 +639,9 @@ END_TEST
 
 START_TEST(empty_input)
 {
-QObject *obj = qobject_from_json();
+const char *empty = ;
+
+QObject *obj = qobject_from_json(empty);
 fail_unless(obj == NULL);
 }
 END_TEST
-- 
1.7.3.1.127.g1bb28

[Qemu-devel] [PATCH 3/4] Fix test suite build with tracing enabled

2010-10-22 Thread Luiz Capitulino

From: Jan Kiszka jan.kis...@siemens.com

qemu_malloc instrumentations require linking against the trace objects.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 Makefile |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index 252c817..106a401 100644
--- a/Makefile
+++ b/Makefile
@@ -140,12 +140,12 @@ qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
 
 check-qint.o check-qstring.o check-qdict.o check-qlist.o check-qfloat.o 
check-qjson.o: $(GENERATED_HEADERS)
 
-check-qint: check-qint.o qint.o qemu-malloc.o
-check-qstring: check-qstring.o qstring.o qemu-malloc.o
-check-qdict: check-qdict.o qdict.o qfloat.o qint.o qstring.o qbool.o 
qemu-malloc.o qlist.o
-check-qlist: check-qlist.o qlist.o qint.o qemu-malloc.o
-check-qfloat: check-qfloat.o qfloat.o qemu-malloc.o
-check-qjson: check-qjson.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o 
qjson.o json-streamer.o json-lexer.o json-parser.o qemu-malloc.o
+check-qint: check-qint.o qint.o qemu-malloc.o $(trace-obj-y)
+check-qstring: check-qstring.o qstring.o qemu-malloc.o $(trace-obj-y)
+check-qdict: check-qdict.o qdict.o qfloat.o qint.o qstring.o qbool.o 
qemu-malloc.o qlist.o $(trace-obj-y)
+check-qlist: check-qlist.o qlist.o qint.o qemu-malloc.o $(trace-obj-y)
+check-qfloat: check-qfloat.o qfloat.o qemu-malloc.o $(trace-obj-y)
+check-qjson: check-qjson.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o 
qjson.o json-streamer.o json-lexer.o json-parser.o qemu-malloc.o $(trace-obj-y)
 
 clean:
 # avoid old build problems by removing potentially incorrect old files
-- 
1.7.3.1.127.g1bb28

Re: [Qemu-devel] [PATCH 0/3] add usb_detach and usb_attach (v3)

2010-10-22 Thread Luiz Capitulino

On Fri, 22 Oct 2010 08:43:39 -0500
Anthony Liguori anth...@codemonkey.ws wrote:

 On 10/22/2010 08:17 AM, Luiz Capitulino wrote:
  You and Gerd asked about the purpose of this command, turns out that it's
  only useful for developing new USB devices for QEMU, so I thought it would
  be better to restrict it, so that people don't start using this the
  wrong way or worse, we can't drop/break it because some tool is now using 
  it.
 
 
 Need HMP commands have no support associated with them.  Given QMP, 
 there's no reason to script HMP commands.
 
 But a long standing policy in QEMU has been to avoid conditional code 
 because it results in dead code.  I think this policy has actually 
 worked well for us historically.

Fine with me.

[Qemu-devel] [PATCH 11/11] virtio-blk: Respect werror option for flushes

2010-10-22 Thread Kevin Wolf

The werror option now affects not only write requests, but also flush requests.
Previously, it was not possible to stop a VM on a failed flush.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/virtio-blk.c |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index a1df26d..dbe2070 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -106,7 +106,13 @@ static void virtio_blk_flush_complete(void *opaque, int 
ret)
 {
 VirtIOBlockReq *req = opaque;
 
-virtio_blk_req_complete(req, ret ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK);
+if (ret) {
+if (virtio_blk_handle_rw_error(req, -ret, 0)) {
+return;
+}
+}
+
+virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
 }
 
 static VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
-- 
1.7.2.3

[Qemu-devel] [PULL 0/4]: Monitor queue

2010-10-22 Thread Luiz Capitulino

Anthony,

The following patches have been sent to the list and look good to me. I've
also tested them.

The changes (since d03703c81a202cea156811e5dbc8e88627c19986) are available
in the following repository:

git://repo.or.cz/qemu/qmp-unstable.git for-anthony

Hidetoshi Seto (1):
  Trivial fix for QMP/qmp-events.txt

Jan Kiszka (2):
  Silence compiler warning in json test case
  Fix test suite build with tracing enabled

Kusanagi Kouichi (1):
  monitor: Ignore . and .. when completing file name.

 Makefile   |   12 ++--
 QMP/qmp-events.txt |2 +-
 check-qjson.c  |4 +++-
 monitor.c  |5 +
 4 files changed, 15 insertions(+), 8 deletions(-)

Re: [Qemu-devel] [PATCH 0/6] First part of autoconfy series: cleanup tests/

2010-10-22 Thread Paolo Bonzini


On 10/22/2010 03:44 PM, Luiz Capitulino wrote:

  We could also have check-i386, check-qmp, etc. and check would simply
  call all of them.

That would be perfect, I think. I mean, if we could stick only to check
to do all our unit-testing


Well, the current make test is more integration testing than unit 
testing, so I'm not sure check can work for that.  But autotest can, 
which is why I mentioned it.


Paolo

[Qemu-devel] [PATCH 4/4] monitor: Ignore . and .. when completing file name.

2010-10-22 Thread Luiz Capitulino

From: Kusanagi Kouichi sl...@ac.auone-net.jp

Signed-off-by: Kusanagi Kouichi sl...@ac.auone-net.jp
Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
---
 monitor.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/monitor.c b/monitor.c
index 260cc02..61607c5 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3976,6 +3976,11 @@ static void file_completion(const char *input)
 d = readdir(ffs);
 if (!d)
 break;
+
+if (strcmp(d-d_name, .) == 0 || strcmp(d-d_name, ..) == 0) {
+continue;
+}
+
 if (strstart(d-d_name, file_prefix, NULL)) {
 memcpy(file, input, input_path_len);
 if (input_path_len  sizeof(file))
-- 
1.7.3.1.127.g1bb28

Re: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Kevin Wolf

Am 22.10.2010 15:45, schrieb Anthony Liguori:
 On a physical system, if you don't have a battery backed disk and you
 enable the WC on your disk, then even with cache=writethrough we're unsafe.
  
 I don't think that's right. O_SYNC should guarantee that the volatile
 disk cache is flushed.

 
 If your filesystem does the right thing which an awful lot of them don't 
 today.

The list of really relevant filesystems is rather short, though.

 Likewise, if you mount your filesystem with barrier=0, QEMU is unsafe.
  
 Yeah, if you do something equivalent to cache=unsafe on a lower layer,
 then qemu can't do much about it. Maybe you can apply the same argument
 to NBD, even though it's unsafe by default.


 QEMU can't guarantee safety.  The underlying storage needs to be
 configured correctly.  As long as we're not introducing caching within
 QEMU, I don't think we should assume we're unsafe.

 Do we have any place where we can add docs on a per-block format basis?
 It would be good to at least mention for each block device how the
 backing storage needed to be configured for safety.
  
 docs/block-protocols.txt?

 
 Maybe docs/block/name.txt?  Would be a good home for the qed spec too.

I think spec and documentation for users should be kept separate. I
thought that's the reason why docs/specs/ exists.

And if you exclude specs, I'm not sure if there's a lot left to say for
each format. Having ten files under docs/block/ which consist of two
lines each would be ridiculous. If contrary to my expectations we
actually do have content for it, docs/block/name.txt works for me as well.

Kevin

Re: [Qemu-devel] Re: [PATCH v6 07/12] pcie/hotplug: introduce pushing attention button command

2010-10-22 Thread Michael S. Tsirkin

On Fri, Oct 22, 2010 at 01:35:47PM +0200, Markus Armbruster wrote:
 Michael S. Tsirkin m...@redhat.com writes:
 
  On Wed, Oct 20, 2010 at 05:18:56PM +0900, Isaku Yamahata wrote:
  glue pcie_push_attention_button command.
  
  Signed-off-by: Isaku Yamahata yamah...@valinux.co.jp
 
  So as a high level command, I think we need to
  think about how to tie this into pci_add/pci_del.
  Right?
 [...]
 
 Do we have consensus how our set of commands for hot plug should look
 like?  We talked about it, but did we reach consensus?  If yes, did we
 write it down somewhere?

I think for simple things yes:
- command to send hotplug notification to the guest
- command to immediately add/remove the device
- event to notify about guest ack
- way to poll status: did guest ack last command?

Existing ones will keep function:
- send notification and when acked remove device
- add device and send notification
These are useful for human monitor but maybe not
for management.


-- 
MST

Re: [Qemu-devel] Re: [PATCH v6 07/12] pcie/hotplug: introduce pushing attention button command

2010-10-22 Thread Anthony Liguori


On 10/22/2010 09:38 AM, Michael S. Tsirkin wrote:

On Fri, Oct 22, 2010 at 01:35:47PM +0200, Markus Armbruster wrote:
   

Michael S. Tsirkinm...@redhat.com  writes:

 

On Wed, Oct 20, 2010 at 05:18:56PM +0900, Isaku Yamahata wrote:
   

glue pcie_push_attention_button command.

Signed-off-by: Isaku Yamahatayamah...@valinux.co.jp
 

So as a high level command, I think we need to
think about how to tie this into pci_add/pci_del.
Right?
   

[...]

Do we have consensus how our set of commands for hot plug should look
like?  We talked about it, but did we reach consensus?  If yes, did we
write it down somewhere?
 

I think for simple things yes:
- command to send hotplug notification to the guest
- command to immediately add/remove the device
- event to notify about guest ack
   


Is it a guest ack?  I thought it was actually an eject that can be 
initiated without the notification being sent to the guest.


If so, polling doesn't really make much sense.

Regards,

Anthony Liguori


- way to poll status: did guest ack last command?

Existing ones will keep function:
- send notification and when acked remove device
- add device and send notification
These are useful for human monitor but maybe not
for management.

[Qemu-devel] (no subject)

2010-10-22 Thread Stefan Hajnoczi

QEMU Enhanced Disk format is a disk image format that forgoes features
found in qcow2 in favor of better levels of performance and data
integrity.  Due to its simpler on-disk layout, it is possible to safely
perform metadata updates more efficiently.

Installations, suspend-to-disk, and other allocation-heavy I/O workloads
will see increased performance due to fewer I/Os and syncs.  Workloads
that do not cause new clusters to be allocated will perform similar to
raw images due to in-memory metadata caching.

The format supports sparse disk images.  It does not rely on the host
filesystem holes feature, making it a good choice for sparse disk images
that need to be transferred over channels where holes are not supported.

Backing files are supported so only deltas against a base image can be
stored.  The base image may be smaller than the image file.

The file format is extensible so that additional features can be added
later with graceful compatibility handling.  A specification for the file
format is included in this patchset.

Internal snapshots are not supported.  This eliminates the need for
additional metadata to track copy-on-write clusters.

Compression and encryption are not supported.  They add complexity and can be
implemented at other layers in the stack (i.e. inside the guest or on the
host).  Encryption has been identified as a potential future extension and the
file format allows for this.

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
This code is also available from git:

http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/qed-v3

I have preserved distinct commits against v2 for easier reviewing here:

http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/qed-v3-presquash

v3:
 * Flush before L2 update when a backing file is used
 * Use QED_F_BACKING_FORMAT_NO_PROBE instead of backing_fmt header field
 * Allow non-cluster sized images
 * Implement autoclear feature bits
 * Implement backing image smaller size - reads from backing image should zero 
beyond EOF
 * Preserve errno in qed_find_cluster_cb() - don't dumb down to 
QED_CLUSTER_ERROR
 * Use ffs() instead of get_bits_from_size()
 * Remove l2_cache argument to qed_unref_l2_cache_entry
 * Eliminate L2TableAllocFunc function pointer
 * Split qed_aio_write in-place and allocating code path to make code clearer
 * Document how L2 cache is used
 * Document qed_find_cluster()
 * Update QED specification
 * Fix COPYING.LIB LGPL license file references
 * Add copyright header to qed-check.c
 * Avoid the bytes_to_str()/cvtstr()/sztostr() dependency until Jes' strtosz() 
goes in

v2:
 * Add QED format specification to documentation
 * Use __builtin_ctzl() for get_bits_from_size()
 * Fine-grained table locking to allow concurrent allocating write requests
 * Fix qemu_free() instead of qemu_vfree() in qed_unref_l2_cache_entry()
 * Comment clean-ups

[Qemu-devel] [PATCH v3 5/5] qed: Consistency check support

2010-10-22 Thread Stefan Hajnoczi

This patch adds support for the qemu-img check command.  It also
introduces a dirty bit in the qed header to mark modified images as
needing a check.  This bit is cleared when the image file is closed
cleanly.

If an image file is opened and it has the dirty bit set, a consistency
check will run and try to fix corrupted table offsets.  These
corruptions may occur if there is power loss while an allocating write
is performed.  Once the image is fixed it opens as normal again.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 Makefile.objs |2 +-
 block/qed-check.c |  210 +
 block/qed.c   |  125 +++-
 block/qed.h   |9 ++
 4 files changed, 342 insertions(+), 4 deletions(-)
 create mode 100644 block/qed-check.c

diff --git a/Makefile.objs b/Makefile.objs
index 8ae17d8..62cab02 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -15,7 +15,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
 block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
-block-nested-y += qed-lock.o
+block-nested-y += qed-lock.o qed-check.o
 block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
diff --git a/block/qed-check.c b/block/qed-check.c
new file mode 100644
index 000..4600932
--- /dev/null
+++ b/block/qed-check.c
@@ -0,0 +1,210 @@
+/*
+ * QEMU Enhanced Disk Format Consistency Check
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Stefan Hajnoczi   stefa...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include qed.h
+
+typedef struct {
+BDRVQEDState *s;
+BdrvCheckResult *result;
+bool fix;   /* whether to fix invalid offsets */
+
+size_t nclusters;
+uint32_t *used_clusters;/* referenced cluster bitmap */
+
+QEDRequest request;
+} QEDCheck;
+
+static bool qed_test_bit(uint32_t *bitmap, uint64_t n) {
+return !!(bitmap[n / 32]  (1  (n % 32)));
+}
+
+static void qed_set_bit(uint32_t *bitmap, uint64_t n) {
+bitmap[n / 32] |= 1  (n % 32);
+}
+
+/**
+ * Set bitmap bits for clusters
+ *
+ * @check:  Check structure
+ * @offset: Starting offset in bytes
+ * @n:  Number of clusters
+ */
+static bool qed_set_used_clusters(QEDCheck *check, uint64_t offset,
+  unsigned int n)
+{
+uint64_t cluster = qed_bytes_to_clusters(check-s, offset);
+unsigned int corruptions = 0;
+
+while (n-- != 0) {
+/* Clusters should only be referenced once */
+if (qed_test_bit(check-used_clusters, cluster)) {
+corruptions++;
+}
+
+qed_set_bit(check-used_clusters, cluster);
+cluster++;
+}
+
+check-result-corruptions += corruptions;
+return corruptions == 0;
+}
+
+/**
+ * Check an L2 table
+ *
+ * @ret:Number of invalid cluster offsets
+ */
+static unsigned int qed_check_l2_table(QEDCheck *check, QEDTable *table)
+{
+BDRVQEDState *s = check-s;
+unsigned int i, num_invalid = 0;
+
+for (i = 0; i  s-table_nelems; i++) {
+uint64_t offset = table-offsets[i];
+
+if (!offset) {
+continue;
+}
+
+/* Detect invalid cluster offset */
+if (!qed_check_cluster_offset(s, offset)) {
+if (check-fix) {
+table-offsets[i] = 0;
+} else {
+check-result-corruptions++;
+}
+
+num_invalid++;
+continue;
+}
+
+qed_set_used_clusters(check, offset, 1);
+}
+
+return num_invalid;
+}
+
+/**
+ * Descend tables and check each cluster is referenced once only
+ */
+static int qed_check_l1_table(QEDCheck *check, QEDTable *table)
+{
+BDRVQEDState *s = check-s;
+unsigned int i, num_invalid_l1 = 0;
+int ret, last_error = 0;
+
+/* Mark L1 table clusters used */
+qed_set_used_clusters(check, s-header.l1_table_offset,
+  s-header.table_size);
+
+for (i = 0; i  s-table_nelems; i++) {
+unsigned int num_invalid_l2;
+uint64_t offset = table-offsets[i];
+
+if (!offset) {
+continue;
+}
+
+/* Detect invalid L2 offset */
+if (!qed_check_table_offset(s, offset)) {
+/* Clear invalid offset */
+if (check-fix) {
+table-offsets[i] = 0;
+} else {
+check-result-corruptions++;
+}
+
+num_invalid_l1++;
+continue;
+}
+
+if (!qed_set_used_clusters(check, offset, s-header.table_size)) {
+

[Qemu-devel] [PATCH v3 1/5] docs: Add QED image format specification

2010-10-22 Thread Stefan Hajnoczi

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 docs/specs/qed_spec.txt |  128 +++
 1 files changed, 128 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/qed_spec.txt

diff --git a/docs/specs/qed_spec.txt b/docs/specs/qed_spec.txt
new file mode 100644
index 000..e4425c8
--- /dev/null
+++ b/docs/specs/qed_spec.txt
@@ -0,0 +1,128 @@
+=Specification=
+
+The file format looks like this:
+
+ +--+--+--+-+
+ | cluster0 | cluster1 | cluster2 | ... |
+ +--+--+--+-+
+
+The first cluster begins with the '''header'''.  The header contains 
information about where regular clusters start; this allows the header to be 
extensible and store extra information about the image file.  A regular cluster 
may be a '''data cluster''', an '''L2''', or an '''L1 table'''.  L1 and L2 
tables are composed of one or more contiguous clusters.
+
+Normally the file size will be a multiple of the cluster size.  If the file 
size is not a multiple, extra information after the last cluster may not be 
preserved if data is written.  Legitimate extra information should use space 
between the header and the first regular cluster.
+
+All fields are little-endian.
+
+==Header==
+ Header {
+ uint32_t magic;   /* QED\0 */
+ 
+ uint32_t cluster_size;/* in bytes */
+ uint32_t table_size;  /* for L1 and L2 tables, in clusters */
+ uint32_t header_size; /* in clusters */
+ 
+ uint64_t features;/* format feature bits */
+ uint64_t compat_features; /* compat feature bits */
+ uint64_t l1_table_offset; /* in bytes */
+ uint64_t image_size;  /* total logical image size, in bytes */
+ 
+ /* if (features  QED_F_BACKING_FILE) */
+ uint32_t backing_filename_offset; /* in bytes from start of header */
+ uint32_t backing_filename_size;   /* in bytes */
+ }
+
+Field descriptions:
+* ''cluster_size'' must be a power of 2 in range [2^12, 2^26].
+* ''table_size'' must be a power of 2 in range [1, 16].
+* ''header_size'' is the number of clusters used by the header and any 
additional information stored before regular clusters.
+* ''features'', ''compat_features'', and ''autoclear_features'' are file 
format extension bitmaps.  They work as follows:
+** An image with unknown ''features'' bits enabled must not be opened.  File 
format changes that are not backwards-compatible must use ''features'' bits.
+** An image with unknown ''compat_features'' bits enabled can be opened 
safely.  The unknown features are simply ignored and represent 
backwards-compatible changes to the file format.
+** An image with unknown ''autoclear_features'' bits enable can be opened 
safely after clearing the unknown bits.  This allows for backwards-compatible 
changes to the file format which degrade gracefully and can be re-enabled again 
by a new program later.
+* ''l1_table_offset'' is the offset of the first byte of the L1 table in the 
image file and must be a multiple of ''cluster_size''.
+* ''image_size'' is the block device size seen by the guest and must be a 
multiple of 512 bytes.
+* ''backing_filename'' is a string in (byte offset, byte size) form.  It is 
not NUL-terminated and has no alignment constraints.
+
+Feature bits:
+* QED_F_BACKING_FILE = 0x01.  The image uses a backing file.  The backing 
filename string is given in the ''backing_filename_{offset,size}'' fields and 
may be an absolute path or relative to the image file.
+* QED_F_NEED_CHECK = 0x02.  The image needs a consistency check before use.
+* QED_F_BACKING_FORMAT_NO_PROBE = 0x04.  The backing file is a raw disk image 
and no file format autodetection should be attempted.  This should be used to 
ensure that raw backing images are never detected as an image format if they 
happen to contain magic constants.
+
+There are currently no defined ''compat_features'' or ''autoclear_features'' 
bits.
+
+Fields predicated on a feature bit are only used when that feature is set.  
The fields always take up header space, regardless of whether or not the 
feature bit is set.
+
+==Tables==
+
+Tables provide the translation from logical offsets in the block device to 
cluster offsets in the file.
+
+ #define TABLE_NOFFSETS (table_size * cluster_size / sizeof(uint64_t))
+  
+ Table {
+ uint64_t offsets[TABLE_NOFFSETS];
+ }
+
+The tables are organized as follows:
+
++--+
+| L1 table |
++--+
+   ,--'  |  '--.
+  +--+   |+--+
+  | L2 table |  ...   | L2 table |
+  +--++--+
+  ,--'  |  '--.
+ +--+   |+--+
+ |   Data   |  ...   |   Data   |
+ +--++--+
+
+A table is made up of one or more contiguous clusters.  The table_size header 
field determines table size for an image

[Qemu-devel] [PATCH v3 4/5] qed: Read/write support

2010-10-22 Thread Stefan Hajnoczi

This patch implements the read/write state machine.  Operations are
fully asynchronous and multiple operations may be active at any time.

Allocating writes lock tables to ensure metadata updates do not
interfere with each other.  If two allocating writes need to update the
same L2 table they will run sequentially.  If two allocating writes need
to update different L2 tables they will run in parallel.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 Makefile.objs|1 +
 block/qed-lock.c |  124 ++
 block/qed.c  |  667 +-
 block/qed.h  |   43 
 trace-events |   10 +
 5 files changed, 843 insertions(+), 2 deletions(-)
 create mode 100644 block/qed-lock.c

diff --git a/Makefile.objs b/Makefile.objs
index bcab5bc..8ae17d8 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -15,6 +15,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
 block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
+block-nested-y += qed-lock.o
 block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
diff --git a/block/qed-lock.c b/block/qed-lock.c
new file mode 100644
index 000..1b7a763
--- /dev/null
+++ b/block/qed-lock.c
@@ -0,0 +1,124 @@
+/*
+ * QEMU Enhanced Disk Format Lock
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Stefan Hajnoczi stefa...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+/*
+ * Table locking works as follows:
+ *
+ * Reads and non-allocating writes do not acquire locks because they do not
+ * modify tables and only see committed L2 cache entries.
+ *
+ * An allocating write request that needs to update an existing L2 table
+ * acquires a lock on the table.  This serializes requests that touch the same
+ * L2 table.
+ *
+ * An allocating write request that needs to create a new L2 table and update
+ * the L1 table acquires a lock on the L1 table.  This serializes requests that
+ * create new L2 tables.
+ *
+ * When a request is unable to acquire a lock, it is put to sleep and must
+ * return.  When the lock it attempted to acquire becomes available, a wakeup
+ * function is invoked to activate it again.
+ *
+ * A request must retry its cluster lookup after waking up because the tables
+ * have changed.  For example, an allocating write may no longer need to
+ * allocate if the previous request already allocated the cluster.
+ */
+
+#include qed.h
+
+struct QEDLockEntry {
+uint64_t key;
+QSIMPLEQ_HEAD(, QEDAIOCB) reqs;
+QTAILQ_ENTRY(QEDLockEntry) next;
+};
+
+/**
+ * Initialize a lock
+ *
+ * @lock:   Lock
+ * @wakeup_fn:  Callback to reactivate a sleeping request
+ */
+void qed_lock_init(QEDLock *lock, BlockDriverCompletionFunc *wakeup_fn)
+{
+QTAILQ_INIT(lock-entries);
+lock-wakeup_fn = wakeup_fn;
+}
+
+/**
+ * Acquire a lock on a given key
+ *
+ * @lock:   Lock
+ * @key:Key to lock on
+ * @acb:Request
+ * @ret:true if lock was acquired, false if request needs to sleep
+ *
+ * If the request currently has another lock held, that lock will be released.
+ */
+bool qed_lock(QEDLock *lock, uint64_t key, QEDAIOCB *acb)
+{
+QEDLockEntry *entry = acb-lock_entry;
+
+if (entry) {
+/* Lock already held */
+if (entry-key == key) {
+return true;
+}
+
+/* Release old lock */
+qed_unlock(lock, acb);
+}
+
+/* Find held lock */
+QTAILQ_FOREACH(entry, lock-entries, next) {
+if (entry-key == key) {
+QSIMPLEQ_INSERT_TAIL(entry-reqs, acb, next);
+acb-lock_entry = entry;
+return false;
+}
+}
+
+/* Allocate new lock entry */
+entry = qemu_malloc(sizeof(*entry));
+entry-key = key;
+QSIMPLEQ_INIT(entry-reqs);
+QSIMPLEQ_INSERT_TAIL(entry-reqs, acb, next);
+QTAILQ_INSERT_TAIL(lock-entries, entry, next);
+acb-lock_entry = entry;
+return true;
+}
+
+/**
+ * Release a held lock
+ */
+void qed_unlock(QEDLock *lock, QEDAIOCB *acb)
+{
+QEDLockEntry *entry = acb-lock_entry;
+
+if (!entry) {
+return;
+}
+
+acb-lock_entry = NULL;
+QSIMPLEQ_REMOVE_HEAD(entry-reqs, next);
+
+/* Wake up next lock holder */
+acb = QSIMPLEQ_FIRST(entry-reqs);
+if (acb) {
+lock-wakeup_fn(acb, 0);
+return;
+}
+
+/* Free lock entry */
+QTAILQ_REMOVE(lock-entries, entry, next);
+qemu_free(entry);
+}
diff --git a/block/qed.c b/block/qed.c
index 6795242..5bd503b 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -12,8 +12,26 @@
  *
  */
 
+#include trace.h

[Qemu-devel] [PATCH v3 2/5] qed: Add QEMU Enhanced Disk image format

2010-10-22 Thread Stefan Hajnoczi

This patch introduces the qed on-disk layout and implements image
creation.  Later patches add read/write and other functionality.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 Makefile.objs |1 +
 block/qed.c   |  546 +
 block/qed.h   |  148 
 3 files changed, 695 insertions(+), 0 deletions(-)
 create mode 100644 block/qed.c
 create mode 100644 block/qed.h

diff --git a/Makefile.objs b/Makefile.objs
index 6e6f60a..2f036c8 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -14,6 +14,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
+block-nested-y += qed.o
 block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
diff --git a/block/qed.c b/block/qed.c
new file mode 100644
index 000..3f2d61b
--- /dev/null
+++ b/block/qed.c
@@ -0,0 +1,546 @@
+/*
+ * QEMU Enhanced Disk Format
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Stefan Hajnoczi   stefa...@linux.vnet.ibm.com
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include qed.h
+
+static int bdrv_qed_probe(const uint8_t *buf, int buf_size,
+  const char *filename)
+{
+const QEDHeader *header = (const void *)buf;
+
+if (buf_size  sizeof(*header)) {
+return 0;
+}
+if (le32_to_cpu(header-magic) != QED_MAGIC) {
+return 0;
+}
+return 100;
+}
+
+/**
+ * Check whether an image format is raw
+ *
+ * @fmt:Backing file format, may be NULL
+ */
+static bool qed_fmt_is_raw(const char *fmt)
+{
+return fmt  strcmp(fmt, raw) == 0;
+}
+
+static void qed_header_le_to_cpu(const QEDHeader *le, QEDHeader *cpu)
+{
+cpu-magic = le32_to_cpu(le-magic);
+cpu-cluster_size = le32_to_cpu(le-cluster_size);
+cpu-table_size = le32_to_cpu(le-table_size);
+cpu-header_size = le32_to_cpu(le-header_size);
+cpu-features = le64_to_cpu(le-features);
+cpu-compat_features = le64_to_cpu(le-compat_features);
+cpu-autoclear_features = le64_to_cpu(le-autoclear_features);
+cpu-l1_table_offset = le64_to_cpu(le-l1_table_offset);
+cpu-image_size = le64_to_cpu(le-image_size);
+cpu-backing_filename_offset = le32_to_cpu(le-backing_filename_offset);
+cpu-backing_filename_size = le32_to_cpu(le-backing_filename_size);
+}
+
+static void qed_header_cpu_to_le(const QEDHeader *cpu, QEDHeader *le)
+{
+le-magic = cpu_to_le32(cpu-magic);
+le-cluster_size = cpu_to_le32(cpu-cluster_size);
+le-table_size = cpu_to_le32(cpu-table_size);
+le-header_size = cpu_to_le32(cpu-header_size);
+le-features = cpu_to_le64(cpu-features);
+le-compat_features = cpu_to_le64(cpu-compat_features);
+le-autoclear_features = cpu_to_le64(cpu-autoclear_features);
+le-l1_table_offset = cpu_to_le64(cpu-l1_table_offset);
+le-image_size = cpu_to_le64(cpu-image_size);
+le-backing_filename_offset = cpu_to_le32(cpu-backing_filename_offset);
+le-backing_filename_size = cpu_to_le32(cpu-backing_filename_size);
+}
+
+static int qed_write_header_sync(BDRVQEDState *s)
+{
+QEDHeader le;
+int ret;
+
+qed_header_cpu_to_le(s-header, le);
+ret = bdrv_pwrite(s-bs-file, 0, le, sizeof(le));
+if (ret != sizeof(le)) {
+return ret;
+}
+return 0;
+}
+
+static uint64_t qed_max_image_size(uint32_t cluster_size, uint32_t table_size)
+{
+uint64_t table_entries;
+uint64_t l2_size;
+
+table_entries = (table_size * cluster_size) / sizeof(uint64_t);
+l2_size = table_entries * cluster_size;
+
+return l2_size * table_entries;
+}
+
+static bool qed_is_cluster_size_valid(uint32_t cluster_size)
+{
+if (cluster_size  QED_MIN_CLUSTER_SIZE ||
+cluster_size  QED_MAX_CLUSTER_SIZE) {
+return false;
+}
+if (cluster_size  (cluster_size - 1)) {
+return false; /* not power of 2 */
+}
+return true;
+}
+
+static bool qed_is_table_size_valid(uint32_t table_size)
+{
+if (table_size  QED_MIN_TABLE_SIZE ||
+table_size  QED_MAX_TABLE_SIZE) {
+return false;
+}
+if (table_size  (table_size - 1)) {
+return false; /* not power of 2 */
+}
+return true;
+}
+
+static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size,
+uint32_t table_size)
+{
+if (image_size == 0) {
+/* Supporting zero size images makes life harder because even the L1
+ * table is not needed.  Make life simple and forbid zero size images.
+ */
+return false;
+}
+if (image_size % BDRV_SECTOR_SIZE != 0) {
+return false; /*

[Qemu-devel] [PATCH v3 3/5] qed: Table, L2 cache, and cluster functions

2010-10-22 Thread Stefan Hajnoczi

This patch adds code to look up data cluster offsets in the image via
the L1/L2 tables.  The L2 tables are writethrough cached in memory for
performance (each read/write requires a lookup so it is essential to
cache the tables).

With cluster lookup code in place it is possible to implement
bdrv_is_allocated() to query the number of contiguous
allocated/unallocated clusters.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 Makefile.objs|2 +-
 block/qed-cluster.c  |  154 
 block/qed-gencb.c|   32 +
 block/qed-l2-cache.c |  173 +++
 block/qed-table.c|  317 ++
 block/qed.c  |   54 +-
 block/qed.h  |  118 +++
 trace-events |   11 ++
 8 files changed, 859 insertions(+), 2 deletions(-)
 create mode 100644 block/qed-cluster.c
 create mode 100644 block/qed-gencb.c
 create mode 100644 block/qed-l2-cache.c
 create mode 100644 block/qed-table.c

diff --git a/Makefile.objs b/Makefile.objs
index 2f036c8..bcab5bc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -14,7 +14,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
 block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
-block-nested-y += qed.o
+block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
diff --git a/block/qed-cluster.c b/block/qed-cluster.c
new file mode 100644
index 000..0ec864b
--- /dev/null
+++ b/block/qed-cluster.c
@@ -0,0 +1,154 @@
+/*
+ * QEMU Enhanced Disk Format Cluster functions
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Authors:
+ *  Stefan Hajnoczi   stefa...@linux.vnet.ibm.com
+ *  Anthony Liguori   aligu...@us.ibm.com
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include qed.h
+
+/**
+ * Count the number of contiguous data clusters
+ *
+ * @s:  QED state
+ * @table:  L2 table
+ * @index:  First cluster index
+ * @n:  Maximum number of clusters
+ * @offset: Set to first cluster offset
+ *
+ * This function scans tables for contiguous allocated or free clusters.
+ */
+static unsigned int qed_count_contiguous_clusters(BDRVQEDState *s,
+  QEDTable *table,
+  unsigned int index,
+  unsigned int n,
+  uint64_t *offset)
+{
+unsigned int end = MIN(index + n, s-table_nelems);
+uint64_t last = table-offsets[index];
+unsigned int i;
+
+*offset = last;
+
+for (i = index + 1; i  end; i++) {
+if (last == 0) {
+/* Counting free clusters */
+if (table-offsets[i] != 0) {
+break;
+}
+} else {
+/* Counting allocated clusters */
+if (table-offsets[i] != last + s-header.cluster_size) {
+break;
+}
+last = table-offsets[i];
+}
+}
+return i - index;
+}
+
+typedef struct {
+BDRVQEDState *s;
+uint64_t pos;
+size_t len;
+
+QEDRequest *request;
+
+/* User callback */
+QEDFindClusterFunc *cb;
+void *opaque;
+} QEDFindClusterCB;
+
+static void qed_find_cluster_cb(void *opaque, int ret)
+{
+QEDFindClusterCB *find_cluster_cb = opaque;
+BDRVQEDState *s = find_cluster_cb-s;
+QEDRequest *request = find_cluster_cb-request;
+uint64_t offset = 0;
+size_t len = 0;
+unsigned int index;
+unsigned int n;
+
+if (ret) {
+goto out;
+}
+
+index = qed_l2_index(s, find_cluster_cb-pos);
+n = qed_bytes_to_clusters(s,
+  qed_offset_into_cluster(s, find_cluster_cb-pos) 
+
+  find_cluster_cb-len);
+n = qed_count_contiguous_clusters(s, request-l2_table-table,
+  index, n, offset);
+
+ret = offset ? QED_CLUSTER_FOUND : QED_CLUSTER_L2;
+len = MIN(find_cluster_cb-len, n * s-header.cluster_size -
+  qed_offset_into_cluster(s, find_cluster_cb-pos));
+
+if (offset  !qed_check_cluster_offset(s, offset)) {
+ret = -EINVAL;
+}
+
+out:
+find_cluster_cb-cb(find_cluster_cb-opaque, ret, offset, len);
+qemu_free(find_cluster_cb);
+}
+
+/**
+ * Find the offset of a data cluster
+ *
+ * @s:  QED state
+ * @request:L2 cache entry
+ * @pos:Byte position in device
+ * @len:Number of bytes
+ * @cb: Completion function
+ * @opaque: User data for completion function
+ *
+ * This

Re: [Qemu-devel] [PATCH v3 0/5] qed: Add QEMU Enhanced Disk format

2010-10-22 Thread Stefan Hajnoczi

Subject line fixed.

Stefan

[Qemu-devel] Re: [PATCH] sparc32: convert debug printf statements to tracepoints

2010-10-22 Thread Stefan Hajnoczi

On Thu, Oct 21, 2010 at 08:13:51PM +, Blue Swirl wrote:
 Replace debug printf statements with tracepoints.
 
 Signed-off-by: Blue Swirl blauwir...@gmail.com
 ---
 There were problems with tracetool if the tracepoint string contained
 parentheses or brackets, so I removed those.
 
  hw/cs4231.c|   20 ++---
  hw/eccmemctl.c |   46 +-
  hw/lance.c |   14 ++-
  hw/slavio_intctl.c |   38 ++
  hw/slavio_misc.c   |   51 ++---
  hw/slavio_timer.c  |   49 ---
  hw/sparc32_dma.c   |   42 ++--
  hw/sun4m.c |   18 ++--
  hw/sun4m_iommu.c   |   33 ---
  trace-events   |  108 
 
  10 files changed, 213 insertions(+), 206 deletions(-)

Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

Out of interest, why is %8.8x used instead of %08x?  Is there a
difference in meaning or just style?

Stefan

[Qemu-devel] Re: [RFC][PATCH 4/5] trace-event

2010-10-22 Thread Stefan Hajnoczi

On Thu, Oct 21, 2010 at 03:10:18PM +0530, Prerna Saxena wrote:
 trace-event : QMP interface to change state of a trace-event.
 (Analogous to hmp command : trace-event )
 
 Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
 ---
  qmp-commands.hx |   32 
  1 files changed, 32 insertions(+), 0 deletions(-)
 
 diff --git a/qmp-commands.hx b/qmp-commands.hx
 index 7e95f4e..f2008e8 100644
 --- a/qmp-commands.hx
 +++ b/qmp-commands.hx
 @@ -761,6 +761,38 @@ Example:
 
  Note: This command must be issued before issuing any other command.
 
 +EQMP
 +
 +{
 +.name   = trace-event,
 +.args_type  = name:s,option:b,
 +.params = name on|off,
 +.help   = changes state of a specific trace event,
 +.user_print = monitor_user_noop,
 +.mhandler.cmd_new = do_change_trace_event_state_qmp,
 +},
 +
 +SQMP
 +trace-event
 +---
 +
 +Change state of a trace-event.

The name is a little odd because it has no verb.  How about
set-trace-event or enable-trace-event?

Stefan

Re: [Qemu-devel] [PATCH 2/2] v3 Fix Block Hotplug race with drive_unplug()

2010-10-22 Thread Luiz Capitulino

On Thu, 21 Oct 2010 21:55:53 -0500
Ryan Harper ry...@us.ibm.com wrote:

 Block hot unplug is racy since the guest is required to acknowlege the ACPI
 unplug event; this may not happen synchronously with the device removal 
 command
 
 This series aims to close a gap where by mgmt applications that assume the
 block resource has been removed without confirming that the guest has
 acknowledged the removal may re-assign the underlying device to a second guest
 leading to data leakage.
 
 This series introduces a new montor command to decouple asynchornous device
 removal from restricting guest access to a block device.  We do this by 
 creating
 a new monitor command drive_unplug which maps to a bdrv_unplug() command which
 does a qemu_aio_flush; bdrv_flush() and bdrv_close().  Once complete, 
 subsequent
 IO is rejected from the device and the guest will get IO errors but continue 
 to
 function.
 
 A subsequent device removal command can be issued to remove the device, to 
 which
 the guest may or maynot respond, but as long as the unplugged bit is set, no 
 IO
 will be sumbitted.
 
 Changes since v2:
 - Added qmp command for drive_unplug
 
 Changes since v1:
 - Added qemu_aio_flush() before bdrv_flush() to wait on pending io
 
 Signed-off-by: Ryan Harper ry...@us.ibm.com
 ---
  block.c |7 +++
  block.h |1 +
  blockdev.c  |   26 ++
  blockdev.h  |1 +
  hmp-commands.hx |   15 +++
  qmp-commands.hx |   26 ++
  6 files changed, 76 insertions(+), 0 deletions(-)
 
 diff --git a/block.c b/block.c
 index a19374d..be47655 100644
 --- a/block.c
 +++ b/block.c
 @@ -1328,6 +1328,13 @@ void bdrv_set_removable(BlockDriverState *bs, int 
 removable)
  }
  }
  
 +void bdrv_unplug(BlockDriverState *bs)
 +{
 +qemu_aio_flush();
 +bdrv_flush(bs);
 +bdrv_close(bs);
 +}
 +
  int bdrv_is_removable(BlockDriverState *bs)
  {
  return bs-removable;
 diff --git a/block.h b/block.h
 index 5f64380..732f63e 100644
 --- a/block.h
 +++ b/block.h
 @@ -171,6 +171,7 @@ void bdrv_set_on_error(BlockDriverState *bs, 
 BlockErrorAction on_read_error,
 BlockErrorAction on_write_error);
  BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read);
  void bdrv_set_removable(BlockDriverState *bs, int removable);
 +void bdrv_unplug(BlockDriverState *bs);
  int bdrv_is_removable(BlockDriverState *bs);
  int bdrv_is_read_only(BlockDriverState *bs);
  int bdrv_is_sg(BlockDriverState *bs);
 diff --git a/blockdev.c b/blockdev.c
 index 5fc3b9b..68eb329 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -610,3 +610,29 @@ int do_change_block(Monitor *mon, const char *device,
  }
  return monitor_read_bdrv_key_start(mon, bs, NULL, NULL);
  }
 +
 +int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data)
 +{
 +DriveInfo *dinfo;
 +BlockDriverState *bs;
 +const char *id;
 +
 +if (!qdict_haskey(qdict, id)) {
 +qerror_report(QERR_MISSING_PARAMETER, id);
 +return -1;
 +}

You can drop this check, id is a required argument, the upper layer will
perform the check for you.

 +
 +id = qdict_get_str(qdict, id);
 +dinfo = drive_get_by_id(id);
 +if (!dinfo) {
 +qerror_report(QERR_DEVICE_NOT_FOUND, id);
 +return -1;
 +}
 +
 +/* mark block device unplugged */
 +bs = dinfo-bdrv;
 +bdrv_unplug(bs);
 +
 +return 0;
 +}
 + 
 diff --git a/blockdev.h b/blockdev.h
 index 19c6915..ecb9ac8 100644
 --- a/blockdev.h
 +++ b/blockdev.h
 @@ -52,5 +52,6 @@ int do_eject(Monitor *mon, const QDict *qdict, QObject 
 **ret_data);
  int do_block_set_passwd(Monitor *mon, const QDict *qdict, QObject 
 **ret_data);
  int do_change_block(Monitor *mon, const char *device,
  const char *filename, const char *fmt);
 +int do_drive_unplug(Monitor *mon, const QDict *qdict, QObject **ret_data);
  
  #endif
 diff --git a/hmp-commands.hx b/hmp-commands.hx
 index 81999aa..7a32a2e 100644
 --- a/hmp-commands.hx
 +++ b/hmp-commands.hx
 @@ -68,6 +68,21 @@ Eject a removable medium (use -f to force it).
  ETEXI
  
  {
 +.name   = drive_unplug,
 +.args_type  = id:s,
 +.params = device,
 +.help   = unplug block device,
 +.user_print = monitor_user_noop,
 +.mhandler.cmd_new = do_drive_unplug,
 +},
 +
 +STEXI
 +...@item unplug @var{device}
 +...@findex unplug
 +Unplug block device.
 +ETEXI
 +
 +{
  .name   = change,
  .args_type  = device:B,target:F,arg:s?,
  .params = device filename [format],
 diff --git a/qmp-commands.hx b/qmp-commands.hx
 index 793cf1c..e8f3d4a 100644
 --- a/qmp-commands.hx
 +++ b/qmp-commands.hx
 @@ -338,6 +338,32 @@ Example:
  EQMP
  
  {
 +.name   = drive_unplug,
 +.args_type  = id:s,
 +.params = device,
 +.help   = unplug block device,
 +.user_print =

Re: [Qemu-devel] [PATCH 0/2] v3 Decouple block device removal from device removal

2010-10-22 Thread Luiz Capitulino

On Thu, 21 Oct 2010 21:55:51 -0500
Ryan Harper ry...@us.ibm.com wrote:

 This patch series decouples the detachment of a block device from the removal
 of the backing pci-device.  Removal of a hotplugged pci device requires the
 guest to respond before qemu tears down the block device. In some cases, the
 guest may not respond leaving the guest with continued access to the block
 device.  
 
 The new monitor command, drive_unplug, will revoke a guests access to the
 block device independently of the removal of the pci device.
 
 The first patch adds a new drive find method, the second patch implements the
 monitor command and block layer changes.

Reviewed the monitor part, I think we're waiting for Kevin's and/or
Markus's ACK to get this merged?

 
 Changes since v2:
 - Added QMP command for drive_unplug()
 
 Changes since v1:
 - CodingStyle fixes
 - Added qemu_aio_flush() to bdrv_unplug()
 
 Signed-off-by: Ryan Harper ry...@us.ibm.com

Re: [Qemu-devel] [PATCH 1/2] v2 Add drive_get_by_id

2010-10-22 Thread Luiz Capitulino

On Thu, 21 Oct 2010 21:55:52 -0500
Ryan Harper ry...@us.ibm.com wrote:

 Add a function to find a drive by id string.
 
 Changes since v1:
 -Coding Style fix
 
 Signed-off-by: Ryan Harper ry...@us.ibm.com
 ---
  blockdev.c |   13 +
  blockdev.h |1 +
  2 files changed, 14 insertions(+), 0 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index ff7602b..5fc3b9b 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -75,6 +75,19 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int 
 unit)
  return NULL;
  }
  
 +DriveInfo *drive_get_by_id(const char *id)
 +{
 +DriveInfo *dinfo;
 +
 +QTAILQ_FOREACH(dinfo, drives, next) {
 +if (strcmp(id, dinfo-id)) {
 +continue;
 +}

Very minor, but the following allows you to drop a statement:

if (!strcmp()) {
  return dinfo;
}

 +return dinfo;
 +}
 +return NULL;
 +}
 +
  int drive_get_max_bus(BlockInterfaceType type)
  {
  int max_bus;
 diff --git a/blockdev.h b/blockdev.h
 index 653affc..19c6915 100644
 --- a/blockdev.h
 +++ b/blockdev.h
 @@ -38,6 +38,7 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int 
 unit);
  int drive_get_max_bus(BlockInterfaceType type);
  void drive_uninit(DriveInfo *dinfo);
  DriveInfo *drive_get_by_blockdev(BlockDriverState *bs);
 +DriveInfo *drive_get_by_id(const char *id);
  
  QemuOpts *drive_add(const char *file, const char *fmt, ...) GCC_FMT_ATTR(2, 
 3);
  DriveInfo *drive_init(QemuOpts *arg, int default_to_scsi, int *fatal_error);

Re: [Qemu-devel] [PATCH 0/2] v3 Decouple block device removal from device removal

2010-10-22 Thread Kevin Wolf

Am 22.10.2010 17:48, schrieb Luiz Capitulino:
 On Thu, 21 Oct 2010 21:55:51 -0500
 Ryan Harper ry...@us.ibm.com wrote:
 
 This patch series decouples the detachment of a block device from the removal
 of the backing pci-device.  Removal of a hotplugged pci device requires the
 guest to respond before qemu tears down the block device. In some cases, the
 guest may not respond leaving the guest with continued access to the block
 device.  

 The new monitor command, drive_unplug, will revoke a guests access to the
 block device independently of the removal of the pci device.

 The first patch adds a new drive find method, the second patch implements the
 monitor command and block layer changes.
 
 Reviewed the monitor part, I think we're waiting for Kevin's and/or
 Markus's ACK to get this merged?

Yes, I'm waiting for Markus' comments before applying this.

Maybe we shouldn't add this to QMP yet, drive_add doesn't exist there
either because we want to do it right with blockdev_add/del. On the
other hand Markus should know much better than me what the right thing
would look like - if he says that this interface will work with whatever
we're going to get with blockdev_*, I'm fine with merging it.

Kevin

[Qemu-devel] [PATCH] usb-linux: allow multiple devices with matching IDs

2010-10-22 Thread Grazvydas Ignotas

Right now if we pass through multiple USB devices with matching vendor
and product IDs, only first one is passed to guest, as the code thinks
second device is already attached. The only way to get those devices
working is to specify bus.addr which is inconvenient if devices are
frequently replugged on host, because the address changes after replug.

Fix this by checking bus.addr before assuming the device is already
attached. This way -usbdevice host:1234:1234 -usbdevice host:1234:1234
will pass through 2 devices correctly.

Signed-off-by: Grazvydas Ignotas nota...@gmail.com
---
 usb-linux.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/usb-linux.c b/usb-linux.c
index c3c38ec..b5f1396 100644
--- a/usb-linux.c
+++ b/usb-linux.c
@@ -1464,9 +1464,13 @@ static int usb_host_auto_scan(void *opaque, int bus_num, 
int addr,
 }
 /* We got a match */
 
-/* Already attached ? */
 if (s-fd != -1) {
-return 0;
+/* Already attached? */
+if (s-bus_num == bus_num  s-addr == addr)
+return 0;
+
+/* Not attached but needs another hostdev */
+continue;
 }
 DPRINTF(husb: auto open: bus_num %d addr %d\n, bus_num, addr);
 
-- 
1.6.3.3

Re: [Qemu-devel] [PATCH 0/2] v3 Decouple block device removal from device removal

2010-10-22 Thread Ryan Harper

* Kevin Wolf kw...@redhat.com [2010-10-22 11:12]:
 Am 22.10.2010 17:48, schrieb Luiz Capitulino:
  On Thu, 21 Oct 2010 21:55:51 -0500
  Ryan Harper ry...@us.ibm.com wrote:
  
  This patch series decouples the detachment of a block device from the 
  removal
  of the backing pci-device.  Removal of a hotplugged pci device requires the
  guest to respond before qemu tears down the block device. In some cases, 
  the
  guest may not respond leaving the guest with continued access to the block
  device.  
 
  The new monitor command, drive_unplug, will revoke a guests access to the
  block device independently of the removal of the pci device.
 
  The first patch adds a new drive find method, the second patch implements 
  the
  monitor command and block layer changes.
  
  Reviewed the monitor part, I think we're waiting for Kevin's and/or
  Markus's ACK to get this merged?
 
 Yes, I'm waiting for Markus' comments before applying this.
 
 Maybe we shouldn't add this to QMP yet, drive_add doesn't exist there
 either because we want to do it right with blockdev_add/del. On the
 other hand Markus should know much better than me what the right thing
 would look like - if he says that this interface will work with whatever
 we're going to get with blockdev_*, I'm fine with merging it.

I can go either way.  Anthony asked me to re-spin with a QMP version;
but the libvirt patch works either via TextMonitor or JSONMonitor.


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com

Re: Fwd: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog

2010-10-22 Thread Sage Weil

On Fri, 22 Oct 2010, Kevin Wolf wrote:
 [ Adding qemu-devel to CC again ]
 
 Am 21.10.2010 20:59, schrieb Sage Weil:
  On Thu, 21 Oct 2010, Christian Brunner wrote:
  Hi,
 
  is there a flush operation in librados? - I guess the only way to
  handle this, would be waiting until all aio requests are finished?
 
 That's not the semantics of bdrv_flush, you don't need to wait for
 running requests. You just need to make sure that all completed requests
 are safe on disk so that they would persist even in case of a
 crash/power failure.

Okay, in that case we're fine.  librados doesn't declare a write committed 
until it is safely on disk on multiple backend nodes.  There is a 
mechanism to get an ack sooner, but the qemu storage driver does not use 
it.  

  There is no flush currently.  But librados does no caching, so in this 
  case at least silenting upgrading to cache=writethrough should work.
 
 You're making sure that the data can't be cached in the server's page
 cache or volatile disk cache either, e.g. by using O_SYNC for the image
 file? If so, upgrading would be safe.

Right.

  If that's a problem, we can implement a flush.  Just let us know.
 
 Presumably providing a writeback mode with explicit flushes could
 improve performance. Upgrading to writethrough is not a correctness
 problem, though, so it's your decision if you want to implement it.

So is a bdrv_flush generated when e.g. the guest filesystem issues a 
barrier, or would otherwise normally ask a SATA disk to flush it's cache?

sage



 Kevin
 
  -- Forwarded message --
  From: Kevin Wolf kw...@redhat.com
  Date: 2010/10/21
  Subject: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and 
  sheepdog
  To: Christian Brunner c...@muc.de, Laurent Vivier
  laur...@vivier.eu, MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
  Cc: Qemu-devel@nongnu.org
 
 
  Hi all,
 
  I'm currently looking into adding a return value to qemu's bdrv_flush
  function and I noticed that your block drivers (nbd, rbd and sheepdog)
  don't implement bdrv_flush at all. bdrv_flush is going to return
  -ENOTSUP for any block driver not implementing this, effectively
  breaking these three drivers for anything but cache=unsafe.
 
  Is there a specific reason why your drivers don't implement this? I
  think I remember that one of the drivers always provides
  cache=writethough semantics. It would be okay to silently upgrade to
  cache=writethrough, so in this case I'd just need to add an empty
  bdrv_flush implementation.
 
  Otherwise, we really cannot allow any option except cache=unsafe because
  that's the semantics provided by the driver.
 
  In any case, I think it would be a good idea to implement a real
  bdrv_flush function to allow the write-back cache modes cache=off and
  cache=writeback in order to improve performance over writethrough.
 
  Is this possible with your protocols, or can the protocol be changed to
  consider this? Any hints on how to proceed?
 
  Kevin
  --
  To unsubscribe from this list: send the line unsubscribe ceph-devel in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 --
 To unsubscribe from this list: send the line unsubscribe ceph-devel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] [PATCH 2/4] Silence compiler warning in json test case

2010-10-22 Thread Markus Armbruster

Luiz Capitulino lcapitul...@redhat.com writes:

 From: Jan Kiszka jan.kis...@web.de

 This avoids

 error: zero-length gnu_printf format string

 Signed-off-by: Jan Kiszka jan.kis...@siemens.com
 Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
 ---
  check-qjson.c |4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)

 diff --git a/check-qjson.c b/check-qjson.c
 index 0b60e45..64fcdcb 100644
 --- a/check-qjson.c
 +++ b/check-qjson.c
 @@ -639,7 +639,9 @@ END_TEST
  
  START_TEST(empty_input)
  {
 -QObject *obj = qobject_from_json();
 +const char *empty = ;
 +
 +QObject *obj = qobject_from_json(empty);
  fail_unless(obj == NULL);
  }
  END_TEST

The warning is silly.  Printing nothing is unlikely to happen
unintentionally, and is perfectly well-defined and portable.

Why make the code ugly to avoid a useless warning, when we can disable
the warning?

Re: [Qemu-devel] KVM call minutes for Oct 19

2010-10-22 Thread Anthony Liguori


On 10/22/2010 12:29 PM, Chris Wright wrote:

* Anthony Liguori (anth...@codemonkey.ws) wrote:
   

The first step is just identifying what interfaces we need in a
guest agent.  So far, I think we can get away with a very small
number of interfaces (mainly read/write files, execute command).
 

Could you elaborate here?  I can't imagine you mean:

vm_system(target_vm, shutdown -r now)

But from other post, it does seem you want complexity in the host side
not guest side of agent.

Seems vm_reboot(target_vm) as the RPC makes more sense with the guest
side implementing that in whatever guest-specific appropriate way.
   


What I really want is a vm_system API that a guest agent MUST implement 
and then APIs like vm_reboot that a guest agent MAY implement.


In my mind, the guest agent lives in the distros even though it's built 
from QEMU source tree.  We don't install it ourselves.


That means we might have a new funky fresh version of Fedora 21 version 
of QEMU but running an old Fedora 14 guest with a really back-level 
guest agent.


Having very low level APIs with logic that primarily lives in QEMU gives 
us the ability to support new features in QEMU with older guests.


Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 2/4] Silence compiler warning in json test case

2010-10-22 Thread Luiz Capitulino

On Fri, 22 Oct 2010 19:15:07 +0200
Markus Armbruster arm...@redhat.com wrote:

 Luiz Capitulino lcapitul...@redhat.com writes:
 
  From: Jan Kiszka jan.kis...@web.de
 
  This avoids
 
  error: zero-length gnu_printf format string
 
  Signed-off-by: Jan Kiszka jan.kis...@siemens.com
  Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
  ---
   check-qjson.c |4 +++-
   1 files changed, 3 insertions(+), 1 deletions(-)
 
  diff --git a/check-qjson.c b/check-qjson.c
  index 0b60e45..64fcdcb 100644
  --- a/check-qjson.c
  +++ b/check-qjson.c
  @@ -639,7 +639,9 @@ END_TEST
   
   START_TEST(empty_input)
   {
  -QObject *obj = qobject_from_json();
  +const char *empty = ;
  +
  +QObject *obj = qobject_from_json(empty);
   fail_unless(obj == NULL);
   }
   END_TEST
 
 The warning is silly.  Printing nothing is unlikely to happen
 unintentionally, and is perfectly well-defined and portable.
 
 Why make the code ugly to avoid a useless warning, when we can disable
 the warning?

You mean, disable it only for this specific case or QEMU wide?

If it's the former, please, submit a patch. Otherwise, this has been
discussed already and the conclusion was that the warning is
useful:

 http://www.mail-archive.com/qemu-devel@nongnu.org/msg44072.html

Honestly speaking, no matter what the conclusion is, what can not
happen is having code that doesn't compile in the tree. Either: we apply
this patch or revert the patch that broke the build.

Re: [Qemu-devel] KVM call minutes for Oct 19

2010-10-22 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
 The first step is just identifying what interfaces we need in a
 guest agent.  So far, I think we can get away with a very small
 number of interfaces (mainly read/write files, execute command).

Could you elaborate here?  I can't imagine you mean:

vm_system(target_vm, shutdown -r now)

But from other post, it does seem you want complexity in the host side
not guest side of agent.

Seems vm_reboot(target_vm) as the RPC makes more sense with the guest
side implementing that in whatever guest-specific appropriate way.

thanks,
-chris

[Qemu-devel] Re: [PATCH] sparc32: convert debug printf statements to tracepoints

2010-10-22 Thread Blue Swirl

On Fri, Oct 22, 2010 at 3:22 PM, Stefan Hajnoczi
stefa...@linux.vnet.ibm.com wrote:
 On Thu, Oct 21, 2010 at 08:13:51PM +, Blue Swirl wrote:
 Replace debug printf statements with tracepoints.

 Signed-off-by: Blue Swirl blauwir...@gmail.com
 ---
 There were problems with tracetool if the tracepoint string contained
 parentheses or brackets, so I removed those.

  hw/cs4231.c        |   20 ++---
  hw/eccmemctl.c     |   46 +-
  hw/lance.c         |   14 ++-
  hw/slavio_intctl.c |   38 ++
  hw/slavio_misc.c   |   51 ++---
  hw/slavio_timer.c  |   49 ---
  hw/sparc32_dma.c   |   42 ++--
  hw/sun4m.c         |   18 ++--
  hw/sun4m_iommu.c   |   33 ---
  trace-events       |  108 
 
  10 files changed, 213 insertions(+), 206 deletions(-)

 Acked-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com

Thanks.

 Out of interest, why is %8.8x used instead of %08x?  Is there a
 difference in meaning or just style?

Both seems to produce the same results. %08x is a bit shorter, so that
should be preferred.

Re: [Qemu-devel] [PATCH 3/3] Add helper functions to enable virtio-9p make use of the threadlets

2010-10-22 Thread Venkateswararao Jujjuri (JV)

On 10/21/2010 5:10 AM, Arun R Bharadwaj wrote:
 From: Gautham R Shenoy e...@in.ibm.com
 
 infrastructure for offloading blocking tasks such as making posix calls on
 to the helper threads and handle the post_posix_operations() from the
 context of the iothread. This frees the vcpu thread to process any other guest
 operations while the processing of v9fs_io is in progress.
 
 Signed-off-by: Gautham R Shenoy e...@in.ibm.com
 Signed-off-by: Sripathi Kodi sripat...@in.ibm.com
 Signed-off-by: Arun R Bharadwaj a...@linux.vnet.ibm.com
 ---
  hw/virtio-9p.c |  168 
 
  posix-aio-compat.c |   48 ++-
  qemu-threadlets.c  |   21 +++
  qemu-threadlets.h  |1 
  vl.c   |3 +
  5 files changed, 211 insertions(+), 30 deletions(-)
 
 diff --git a/hw/virtio-9p.c b/hw/virtio-9p.c
 index a871685..f9a7b7d 100644
 --- a/hw/virtio-9p.c
 +++ b/hw/virtio-9p.c
 @@ -18,6 +18,7 @@
  #include fsdev/qemu-fsdev.h
  #include virtio-9p-debug.h
  #include virtio-9p-xattr.h
 +#include qemu-threadlets.h
 
  int debug_9p_pdu;
 
 @@ -33,6 +34,149 @@ enum {
  Oappend = 0x80,
  };
 
 +struct v9fs_post_op {
 +QTAILQ_ENTRY(v9fs_post_op) node;
 +void (*func)(void *arg);
 +void *arg;
 +};
 +
 +static struct {
 +int rfd;
 +int wfd;
 +QemuMutex lock;
 +QTAILQ_HEAD(, v9fs_post_op) post_op_list;
 +} v9fs_async_struct;
 +
 +static void die2(int err, const char *what)
 +{
 +fprintf(stderr, %s failed: %s\n, what, strerror(err));
 +abort();
 +}
 +
 +static void die(const char *what)
 +{
 +die2(errno, what);
 +}
 +
 +#define ASYNC_MAX_PROCESS   5
 +
 +/**
 + * v9fs_process_post_ops: Process any pending v9fs_post_posix_operation
 + * @arg: Not used.
 + *
 + * This function serves as a callback to the iothread to be called into 
 whenever
 + * the v9fs_async_struct.wfd is written into. This thread goes through the 
 list
 + * of v9fs_post_posix_operations() and executes them. In the process, it 
 might
 + * queue more job on the asynchronous thread pool.
 + */
 +static void v9fs_process_post_ops(void *arg)
 +{
 +int count = 0;
 +struct v9fs_post_op *post_op;
 +int ret;
 +char byte;
 +
 +qemu_mutex_lock(v9fs_async_struct.lock);
 +do {
 +ret = read(v9fs_async_struct.rfd, byte, sizeof(byte));
 +} while (ret = 0  errno != EAGAIN);
 +
 +for (count = 0; count  ASYNC_MAX_PROCESS; count++) {
 +if (QTAILQ_EMPTY((v9fs_async_struct.post_op_list))) {
 +break;
 +}
 +post_op = QTAILQ_FIRST((v9fs_async_struct.post_op_list));
 +QTAILQ_REMOVE((v9fs_async_struct.post_op_list), post_op, node);
 +
 +qemu_mutex_unlock(v9fs_async_struct.lock);
 +post_op-func(post_op-arg);
 +qemu_free(post_op);
 +qemu_mutex_lock(v9fs_async_struct.lock);
 +}
 +qemu_mutex_unlock(v9fs_async_struct.lock);
 +}
 +
 +/**
 + * v9fs_async_signal: Inform the io-thread of completion of async job.
 + *
 + * This function is used to inform the iothread that a particular
 + * async-operation pertaining to v9fs has been completed and that the io 
 thread
 + * can handle the v9fs_post_posix_operation.
 + *
 + * This is based on the aio_signal_handler
 + */
 +static inline void v9fs_async_signal(void)
 +{
 +char byte = 0;
 +ssize_t ret;
 +int tries = 0;
 +
 +qemu_mutex_lock(v9fs_async_struct.lock);
 +do {
 +assert(tries != 100);
 +   ret = write(v9fs_async_struct.wfd, byte, sizeof(byte));
 +tries++;
 +} while (ret  0  errno == EAGAIN);
 +qemu_mutex_unlock(v9fs_async_struct.lock);
 +
 +if (ret  0  errno != EAGAIN) {
 +die(write() in v9fs);
 +}
 +
 +if (kill(getpid(), SIGUSR2)) {
 +die(kill failed);
 +}
 +}
 +
 +/**
 + * v9fs_async_helper_done: Marks the completion of the v9fs_async job
 + * @func: v9fs_post_posix_func() for post-processing invoked in the context 
 of
 + *the io-thread
 + * @arg: Argument to func.
 + *
 + * This function is called from the context of one of the asynchronous 
 threads
 + * in the thread pool. This is called when the asynchronous thread has 
 finished
 + * executing a v9fs_posix_operation. It's purpose is to initiate the process 
 of
 + * informing the io-thread that the v9fs_posix_operation has completed.
 + */
 +static void v9fs_async_helper_done(void (*func)(void *arg), void *arg)
 +{
 +struct v9fs_post_op *post_op;
 +
 +post_op = qemu_mallocz(sizeof(*post_op));
 +post_op-func = func;
 +post_op-arg = arg;
 +
 +qemu_mutex_lock(v9fs_async_struct.lock);
 +QTAILQ_INSERT_TAIL((v9fs_async_struct.post_op_list), post_op, node);
 +qemu_mutex_unlock(v9fs_async_struct.lock);
 +
 +v9fs_async_signal();
 +}
 +
 +/**
 + * v9fs_do_async_posix: Offload v9fs_posix_operation onto async thread.
 + * @vs: V9fsOPState variable for the OP operation.
 + * @posix_fn: The posix function which has to be offloaded onto async thread.
 + *

[Qemu-devel] Chandra invites you to MySpace

2010-10-22 Thread Myspace








 
 
	



  
Hello,

 
 I'm a member of the MySpace community and thought you'd like to join too. 

Join MySpace and you will be automatically connected to my profile on the site.  MySpace is a social networking site that allows users to connect with friends, plan their social life, explore new music, and much much more!See you on MySpace!Chandra

		
   
   
   
   
   
   
   
   Click here to join! 
   
   
   



   




  
If you want to block any emails from MySpace members in the future, click here: 
   
http://www.myspace.com/index.cfm?fuseaction=block=7e896fd1-4dd5-423c-b631-28c619e22164
	
Or send a single blank email with the subject line BLOCK to: 
priv...@myspace.com
You can also block future emails or direct any other inquiries by regular postal mail to:
MySpace, Inc.
  8391 Beverly Blvd. #349 
  Los Angeles, CA 90048
  USA 
 (C)2003-2009 MySpace.com. All Rights Reserved

Re: [Qemu-devel] KVM call minutes for Oct 19

2010-10-22 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
 On 10/22/2010 12:29 PM, Chris Wright wrote:
 * Anthony Liguori (anth...@codemonkey.ws) wrote:
 The first step is just identifying what interfaces we need in a
 guest agent.  So far, I think we can get away with a very small
 number of interfaces (mainly read/write files, execute command).
 Could you elaborate here?  I can't imagine you mean:
 
 vm_system(target_vm, shutdown -r now)
 
 But from other post, it does seem you want complexity in the host side
 not guest side of agent.
 
 Seems vm_reboot(target_vm) as the RPC makes more sense with the guest
 side implementing that in whatever guest-specific appropriate way.
 
 What I really want is a vm_system API that a guest agent MUST
 implement and then APIs like vm_reboot that a guest agent MAY
 implement.
 
 In my mind, the guest agent lives in the distros even though it's
 built from QEMU source tree.  We don't install it ourselves.
 
 That means we might have a new funky fresh version of Fedora 21
 version of QEMU but running an old Fedora 14 guest with a really
 back-level guest agent.
 
 Having very low level APIs with logic that primarily lives in QEMU
 gives us the ability to support new features in QEMU with older
 guests.

I'm not sure about that.  That same new shiny Fedora 21 QEMU has no idea
what the right OS specific command to run in guest is.  Granted, it's
not likely that reboot or shutdown -r now are likely to change for
Linux guests, do we assume cygwin for Windows guests?  Really seems to
make more sense to have a stable ABI and negotiate version.

Also, from the point of view of a cloud where a VM agent is awfully
close to provider having backdoor into VM...a freeform vm_system()
doesn't seem like it'd be real popular.

thanks,
-chris

[Qemu-devel] Re: [PATCH 2/2] Replace remaining gcc format attributes by macro GCC_FMT_ATTR (format checking)

2010-10-22 Thread Blue Swirl

Thanks, applied both.

On Fri, Oct 22, 2010 at 10:33 AM, Stefan Weil w...@mail.berlios.de wrote:
 Am 13.10.2010 20:54, schrieb Stefan Weil:

 Replace the remaining format attribute printf by macro
 GCC_FMT_ATTR which uses gnu_printf (if supported).

 v2
 * Removal of dyngen specific code is now done in a separate patch.
 * Handle attribute in new ui/spice-display.c, too.

 Cc: Blue Swirlblauwir...@gmail.com
 Signed-off-by: Stefan Weilw...@mail.berlios.de
 ---
  cpu-all.h          |    2 +-
  ui/spice-display.c |    3 +--
  2 files changed, 2 insertions(+), 3 deletions(-)

 diff --git a/cpu-all.h b/cpu-all.h
 index 67a3266..11edddc 100644
 --- a/cpu-all.h
 +++ b/cpu-all.h
 @@ -773,7 +773,7 @@ void cpu_dump_statistics (CPUState *env, FILE *f,
                            int flags);

  void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
 -    __attribute__ ((__format__ (__printf__, 2, 3)));
 +    GCC_FMT_ATTR(2, 3);
  extern CPUState *first_cpu;
  extern CPUState *cpu_single_env;

 diff --git a/ui/spice-display.c b/ui/spice-display.c
 index 6702dfd..7b4f5c1 100644
 --- a/ui/spice-display.c
 +++ b/ui/spice-display.c
 @@ -29,8 +29,7 @@

  static int debug = 0;

 -static void __attribute__((format(printf,2,3)))
 -dprint(int level, const char *fmt, ...)
 +static void GCC_FMT_ATTR(2, 3) dprint(int level, const char *fmt, ...)
  {
      va_list args;




 Can these two patches be applied to qemu master, or is there still something
 missing?
 They are needed for additional format checks.

 Regards,
 Stefan

[Qemu-devel] [PATCH] Use '_raw' memory access primitives

2010-10-22 Thread Lluís

Using a pointer on the host should not go through lduw.

Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu
---
 linux-user/signal.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 77683f7..097da9d 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -982,8 +982,8 @@ restore_sigcontext(CPUX86State *env, struct 
target_sigcontext *sc, int *peax)
 env-regs[R_ECX] = tswapl(sc-ecx);
 env-eip = tswapl(sc-eip);
 
-cpu_x86_load_seg(env, R_CS, lduw(sc-cs) | 3);
-cpu_x86_load_seg(env, R_SS, lduw(sc-ss) | 3);
+cpu_x86_load_seg(env, R_CS, lduw_raw(sc-cs) | 3);
+cpu_x86_load_seg(env, R_SS, lduw_raw(sc-ss) | 3);
 
 tmpflags = tswapl(sc-eflags);
 env-eflags = (env-eflags  ~0x40DD5) | (tmpflags  0x40DD5);
-- 
1.7.1

-- 
 And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer.
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

[Qemu-devel] [PATCH] Allow generation of helpers with 5 arguments

2010-10-22 Thread Lluís


Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu
---
 def-helper.h |   26 ++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/def-helper.h b/def-helper.h
index 8a822c7..7abaf76 100644
--- a/def-helper.h
+++ b/def-helper.h
@@ -118,6 +118,8 @@
 DEF_HELPER_FLAGS_3(name, 0, ret, t1, t2, t3)
 #define DEF_HELPER_4(name, ret, t1, t2, t3, t4) \
 DEF_HELPER_FLAGS_4(name, 0, ret, t1, t2, t3, t4)
+#define DEF_HELPER_5(name, ret, t1, t2, t3, t4, t5) \
+DEF_HELPER_FLAGS_5(name, 0, ret, t1, t2, t3, t4, t5)
 
 #endif /* DEF_HELPER_H */
 
@@ -139,6 +141,10 @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), 
dh_ctype(t3));
 #define DEF_HELPER_FLAGS_4(name, flags, ret, t1, t2, t3, t4) \
 dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
dh_ctype(t4));
+#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
+dh_ctype(t4), dh_ctype(t5));
+
 
 #undef GEN_HELPER
 #define GEN_HELPER -1
@@ -203,6 +209,22 @@ static inline void glue(gen_helper_, 
name)(dh_retvar_decl(ret) dh_arg_decl(t1, 1
   tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 4, args); \
 }
 
+#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \
+static inline void glue(gen_helper_, name)(dh_retvar_decl(ret) dh_arg_decl(t1, 
1), \
+dh_arg_decl(t2, 2), dh_arg_decl(t3, 3), dh_arg_decl(t4, 4), \
+   dh_arg_decl(t5, 5))  \
+{ \
+  TCGArg args[5]; \
+  int sizemask; \
+  sizemask = dh_is_64bit(ret); \
+  dh_arg(t1, 1); \
+  dh_arg(t2, 2); \
+  dh_arg(t3, 3); \
+  dh_arg(t4, 4); \
+  dh_arg(t5, 5); \
+  tcg_gen_helperN(HELPER(name), flags, sizemask, dh_retvar(ret), 5, args); \
+}
+
 #undef GEN_HELPER
 #define GEN_HELPER -1
 
@@ -224,6 +246,9 @@ DEF_HELPER_FLAGS_0(name, flags, ret)
 #define DEF_HELPER_FLAGS_4(name, flags, ret, t1, t2, t3, t4) \
 DEF_HELPER_FLAGS_0(name, flags, ret)
 
+#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \
+DEF_HELPER_FLAGS_0(name, flags, ret)
+
 #undef GEN_HELPER
 #define GEN_HELPER -1
 
@@ -235,6 +260,7 @@ DEF_HELPER_FLAGS_0(name, flags, ret)
 #undef DEF_HELPER_FLAGS_2
 #undef DEF_HELPER_FLAGS_3
 #undef DEF_HELPER_FLAGS_4
+#undef DEF_HELPER_FLAGS_5
 #undef GEN_HELPER
 
 #endif
-- 
1.7.1

-- 
 And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer.
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

[Qemu-devel] [PATCH] [arm m68k] Move helpers.h to helper.h

2010-10-22 Thread Lluís

This provides a consistent naming scheme across all targets.

Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu
---
 target-arm/helper.c|2 +-
 target-arm/helper.h|  450 
 target-arm/helpers.h   |  450 
 target-arm/iwmmxt_helper.c |2 +-
 target-arm/neon_helper.c   |2 +-
 target-arm/op_helper.c |2 +-
 target-arm/translate.c |6 +-
 target-m68k/helper.c   |2 +-
 target-m68k/helper.h   |   54 ++
 target-m68k/helpers.h  |   54 --
 target-m68k/op_helper.c|2 +-
 target-m68k/translate.c|6 +-
 12 files changed, 516 insertions(+), 516 deletions(-)
 create mode 100644 target-arm/helper.h
 delete mode 100644 target-arm/helpers.h
 create mode 100644 target-m68k/helper.h
 delete mode 100644 target-m68k/helpers.h

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 2dd64d9..08afe24 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5,7 +5,7 @@
 #include cpu.h
 #include exec-all.h
 #include gdbstub.h
-#include helpers.h
+#include helper.h
 #include qemu-common.h
 #include host-utils.h
 #if !defined(CONFIG_USER_ONLY)
diff --git a/target-arm/helper.h b/target-arm/helper.h
new file mode 100644
index 000..0d1bc47
--- /dev/null
+++ b/target-arm/helper.h
@@ -0,0 +1,450 @@
+#include def-helper.h
+
+DEF_HELPER_1(clz, i32, i32)
+DEF_HELPER_1(sxtb16, i32, i32)
+DEF_HELPER_1(uxtb16, i32, i32)
+
+DEF_HELPER_2(add_setq, i32, i32, i32)
+DEF_HELPER_2(add_saturate, i32, i32, i32)
+DEF_HELPER_2(sub_saturate, i32, i32, i32)
+DEF_HELPER_2(add_usaturate, i32, i32, i32)
+DEF_HELPER_2(sub_usaturate, i32, i32, i32)
+DEF_HELPER_1(double_saturate, i32, s32)
+DEF_HELPER_2(sdiv, s32, s32, s32)
+DEF_HELPER_2(udiv, i32, i32, i32)
+DEF_HELPER_1(rbit, i32, i32)
+DEF_HELPER_1(abs, i32, i32)
+
+#define PAS_OP(pfx)  \
+DEF_HELPER_3(pfx ## add8, i32, i32, i32, ptr) \
+DEF_HELPER_3(pfx ## sub8, i32, i32, i32, ptr) \
+DEF_HELPER_3(pfx ## sub16, i32, i32, i32, ptr) \
+DEF_HELPER_3(pfx ## add16, i32, i32, i32, ptr) \
+DEF_HELPER_3(pfx ## addsubx, i32, i32, i32, ptr) \
+DEF_HELPER_3(pfx ## subaddx, i32, i32, i32, ptr)
+
+PAS_OP(s)
+PAS_OP(u)
+#undef PAS_OP
+
+#define PAS_OP(pfx)  \
+DEF_HELPER_2(pfx ## add8, i32, i32, i32) \
+DEF_HELPER_2(pfx ## sub8, i32, i32, i32) \
+DEF_HELPER_2(pfx ## sub16, i32, i32, i32) \
+DEF_HELPER_2(pfx ## add16, i32, i32, i32) \
+DEF_HELPER_2(pfx ## addsubx, i32, i32, i32) \
+DEF_HELPER_2(pfx ## subaddx, i32, i32, i32)
+PAS_OP(q)
+PAS_OP(sh)
+PAS_OP(uq)
+PAS_OP(uh)
+#undef PAS_OP
+
+DEF_HELPER_2(ssat, i32, i32, i32)
+DEF_HELPER_2(usat, i32, i32, i32)
+DEF_HELPER_2(ssat16, i32, i32, i32)
+DEF_HELPER_2(usat16, i32, i32, i32)
+
+DEF_HELPER_2(usad8, i32, i32, i32)
+
+DEF_HELPER_1(logicq_cc, i32, i64)
+
+DEF_HELPER_3(sel_flags, i32, i32, i32, i32)
+DEF_HELPER_1(exception, void, i32)
+DEF_HELPER_0(wfi, void)
+
+DEF_HELPER_2(cpsr_write, void, i32, i32)
+DEF_HELPER_0(cpsr_read, i32)
+
+DEF_HELPER_3(v7m_msr, void, env, i32, i32)
+DEF_HELPER_2(v7m_mrs, i32, env, i32)
+
+DEF_HELPER_3(set_cp15, void, env, i32, i32)
+DEF_HELPER_2(get_cp15, i32, env, i32)
+
+DEF_HELPER_3(set_cp, void, env, i32, i32)
+DEF_HELPER_2(get_cp, i32, env, i32)
+
+DEF_HELPER_2(get_r13_banked, i32, env, i32)
+DEF_HELPER_3(set_r13_banked, void, env, i32, i32)
+
+DEF_HELPER_1(get_user_reg, i32, i32)
+DEF_HELPER_2(set_user_reg, void, i32, i32)
+
+DEF_HELPER_1(vfp_get_fpscr, i32, env)
+DEF_HELPER_2(vfp_set_fpscr, void, env, i32)
+
+DEF_HELPER_3(vfp_adds, f32, f32, f32, env)
+DEF_HELPER_3(vfp_addd, f64, f64, f64, env)
+DEF_HELPER_3(vfp_subs, f32, f32, f32, env)
+DEF_HELPER_3(vfp_subd, f64, f64, f64, env)
+DEF_HELPER_3(vfp_muls, f32, f32, f32, env)
+DEF_HELPER_3(vfp_muld, f64, f64, f64, env)
+DEF_HELPER_3(vfp_divs, f32, f32, f32, env)
+DEF_HELPER_3(vfp_divd, f64, f64, f64, env)
+DEF_HELPER_1(vfp_negs, f32, f32)
+DEF_HELPER_1(vfp_negd, f64, f64)
+DEF_HELPER_1(vfp_abss, f32, f32)
+DEF_HELPER_1(vfp_absd, f64, f64)
+DEF_HELPER_2(vfp_sqrts, f32, f32, env)
+DEF_HELPER_2(vfp_sqrtd, f64, f64, env)
+DEF_HELPER_3(vfp_cmps, void, f32, f32, env)
+DEF_HELPER_3(vfp_cmpd, void, f64, f64, env)
+DEF_HELPER_3(vfp_cmpes, void, f32, f32, env)
+DEF_HELPER_3(vfp_cmped, void, f64, f64, env)
+
+DEF_HELPER_2(vfp_fcvtds, f64, f32, env)
+DEF_HELPER_2(vfp_fcvtsd, f32, f64, env)
+
+DEF_HELPER_2(vfp_uitos, f32, f32, env)
+DEF_HELPER_2(vfp_uitod, f64, f32, env)
+DEF_HELPER_2(vfp_sitos, f32, f32, env)
+DEF_HELPER_2(vfp_sitod, f64, f32, env)
+
+DEF_HELPER_2(vfp_touis, f32, f32, env)
+DEF_HELPER_2(vfp_touid, f32, f64, env)
+DEF_HELPER_2(vfp_touizs, f32, f32, env)
+DEF_HELPER_2(vfp_touizd, f32, f64, env)
+DEF_HELPER_2(vfp_tosis, f32, f32, env)
+DEF_HELPER_2(vfp_tosid, f32, f64, env)
+DEF_HELPER_2(vfp_tosizs, f32, f32, env)
+DEF_HELPER_2(vfp_tosizd, f32, f64, env)
+
+DEF_HELPER_3(vfp_toshs, f32, f32, i32, env)
+DEF_HELPER_3(vfp_tosls, f32, f32, i32, env)

[Qemu-devel] [PATCH] Refactor flush of per-CPU virtual TB cache

2010-10-22 Thread Lluís

Signed-off-by: Lluís Vilanova vilan...@ac.upc.edu
---
 exec-all.h |2 ++
 exec.c |9 +++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 3a53fe6..2ae09c5 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -182,6 +182,8 @@ static inline unsigned int tb_phys_hash_func(tb_page_addr_t 
pc)
 
 TranslationBlock *tb_alloc(target_ulong pc);
 void tb_free(TranslationBlock *tb);
+/** Flush the per-CPU virtual translation block cache. */
+void tb_flush_jmp_cache(CPUState *env);
 void tb_flush(CPUState *env);
 void tb_link_page(TranslationBlock *tb,
   tb_page_addr_t phys_pc, tb_page_addr_t phys_page2);
diff --git a/exec.c b/exec.c
index 1fbe91c..516960a 100644
--- a/exec.c
+++ b/exec.c
@@ -688,6 +688,11 @@ static void page_flush_tb(void)
 }
 }
 
+void tb_flush_jmp_cache (CPUState * env)
+{
+memset (env-tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
+}
+
 /* flush all the translation blocks */
 /* XXX: tb_flush is currently not thread safe */
 void tb_flush(CPUState *env1)
@@ -705,7 +710,7 @@ void tb_flush(CPUState *env1)
 nb_tbs = 0;
 
 for(env = first_cpu; env != NULL; env = env-next_cpu) {
-memset (env-tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
+tb_flush_jmp_cache(env);
 }
 
 memset (tb_phys_hash, 0, CODE_GEN_PHYS_HASH_SIZE * sizeof (void *));
@@ -1941,7 +1946,7 @@ void tlb_flush(CPUState *env, int flush_global)
 }
 }
 
-memset (env-tb_jmp_cache, 0, TB_JMP_CACHE_SIZE * sizeof (void *));
+tb_flush_jmp_cache(env);
 
 env-tlb_flush_addr = -1;
 env-tlb_flush_mask = 0;
-- 
1.7.1

-- 
 And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer.
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

[Qemu-devel] [RFC][PATCH 02/15] virtproxy: qemu-vp, standalone daemon skeleton

2010-10-22 Thread Michael Roth

Daemon to be run in guest, or on host in standalone mode.
(re-)implements some qemu utility functions used by core virtproxy.c
code via wrapper functions. For built-in virtproxy code we will define
these wrapper functions in terms of qemu's built-in implementations.

Main logic will come in a later patch.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 qemu-vp.c |  151 +
 1 files changed, 151 insertions(+), 0 deletions(-)
 create mode 100644 qemu-vp.c

diff --git a/qemu-vp.c b/qemu-vp.c
new file mode 100644
index 000..5075cdc
--- /dev/null
+++ b/qemu-vp.c
@@ -0,0 +1,151 @@
+/*
+ * virt-proxy - host/guest communication daemon
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Michael Roth  mdr...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include virtproxy.h
+
+/* mirror qemu I/O-related code for standalone daemon */
+typedef struct IOHandlerRecord {
+int fd;
+IOCanReadHandler *fd_read_poll;
+IOHandler *fd_read;
+IOHandler *fd_write;
+int deleted;
+void *opaque;
+/* temporary data */
+struct pollfd *ufd;
+QLIST_ENTRY(IOHandlerRecord) next;
+} IOHandlerRecord;
+
+static QLIST_HEAD(, IOHandlerRecord) io_handlers =
+QLIST_HEAD_INITIALIZER(io_handlers);
+
+int vp_set_fd_handler2(int fd,
+ IOCanReadHandler *fd_read_poll,
+ IOHandler *fd_read,
+ IOHandler *fd_write,
+ void *opaque)
+{
+IOHandlerRecord *ioh;
+
+if (!fd_read  !fd_write) {
+QLIST_FOREACH(ioh, io_handlers, next) {
+if (ioh-fd == fd) {
+ioh-deleted = 1;
+break;
+}
+}
+} else {
+QLIST_FOREACH(ioh, io_handlers, next) {
+if (ioh-fd == fd)
+goto found;
+}
+ioh = qemu_mallocz(sizeof(IOHandlerRecord));
+QLIST_INSERT_HEAD(io_handlers, ioh, next);
+found:
+ioh-fd = fd;
+ioh-fd_read_poll = fd_read_poll;
+ioh-fd_read = fd_read;
+ioh-fd_write = fd_write;
+ioh-opaque = opaque;
+ioh-deleted = 0;
+}
+return 0;
+}
+
+int vp_set_fd_handler(int fd,
+IOHandler *fd_read,
+IOHandler *fd_write,
+void *opaque)
+{
+return vp_set_fd_handler2(fd, NULL, fd_read, fd_write, opaque);
+}
+
+int vp_send_all(int fd, const void *buf, int len1)
+{
+int ret, len;
+
+len = len1;
+while (len  0) {
+ret = write(fd, buf, len);
+if (ret  0) {
+if (errno != EINTR  errno != EAGAIN) {
+warn(write() failed);
+return -1;
+}
+} else if (ret == 0) {
+break;
+} else {
+buf += ret;
+len -= ret;
+}
+}
+return len1 - len;
+}
+
+static void main_loop_wait(int nonblocking)
+{
+IOHandlerRecord *ioh;
+fd_set rfds, wfds, xfds;
+int ret, nfds;
+struct timeval tv;
+int timeout = 1000;
+
+if (nonblocking) {
+timeout = 0;
+}
+
+/* poll any events */
+nfds = -1;
+FD_ZERO(rfds);
+FD_ZERO(wfds);
+FD_ZERO(xfds);
+QLIST_FOREACH(ioh, io_handlers, next) {
+if (ioh-deleted)
+continue;
+if (ioh-fd_read 
+(!ioh-fd_read_poll ||
+ ioh-fd_read_poll(ioh-opaque) != 0)) {
+FD_SET(ioh-fd, rfds);
+if (ioh-fd  nfds)
+nfds = ioh-fd;
+}
+if (ioh-fd_write) {
+FD_SET(ioh-fd, wfds);
+if (ioh-fd  nfds)
+nfds = ioh-fd;
+}
+}
+
+tv.tv_sec = timeout / 1000;
+tv.tv_usec = (timeout % 1000) * 1000;
+
+ret = select(nfds + 1, rfds, wfds, xfds, tv);
+
+if (ret  0) {
+IOHandlerRecord *pioh;
+
+QLIST_FOREACH_SAFE(ioh, io_handlers, next, pioh) {
+if (ioh-deleted) {
+QLIST_REMOVE(ioh, next);
+qemu_free(ioh);
+continue;
+}
+if (ioh-fd_read  FD_ISSET(ioh-fd, rfds)) {
+ioh-fd_read(ioh-opaque);
+}
+if (ioh-fd_write  FD_ISSET(ioh-fd, wfds)) {
+ioh-fd_write(ioh-opaque);
+}
+}
+}
+}
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 09/15] virtproxy: add handler for data packets

2010-10-22 Thread Michael Roth

Process VPPackets coming in from channel and send them to the
appropriate server/client connections.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   42 ++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 6c3611b..57ab2b0 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -235,6 +235,48 @@ static void vp_channel_accept(void *opaque)
 vp_set_fd_handler(drv-listen_fd, NULL, NULL, NULL);
 }
 
+/* handle data packets
+ *
+ * process VPPackets containing data and send them to the corresponding
+ * FDs
+ */
+static int vp_handle_data_packet(void *drv, const VPPacket *pkt)
+{
+int fd, ret;
+
+TRACE(called with drv: %p, drv);
+
+if (pkt-type == VP_PKT_CLIENT) {
+TRACE(recieved client packet, client fd: %d, server fd: %d,
+  pkt-payload.proxied.client_fd, pkt-payload.proxied.server_fd);
+fd = pkt-payload.proxied.server_fd;
+} else if (pkt-type == VP_PKT_SERVER) {
+TRACE(recieved server packet, client fd: %d, server fd: %d,
+  pkt-payload.proxied.client_fd, pkt-payload.proxied.server_fd);
+fd = pkt-payload.proxied.client_fd;
+} else {
+TRACE(unknown packet type);
+return -1;
+}
+
+/* TODO: proxied in non-blocking mode can causes us to spin here
+ * for slow servers/clients. need to use write()'s and maintain
+ * a per-conn write queue that we clear out before sending any
+ * more data to the fd
+ */
+ret = vp_send_all(fd, (void *)pkt-payload.proxied.data,
+pkt-payload.proxied.bytes);
+if (ret == -1) {
+LOG(error sending data over channel);
+return -1;
+} else if (ret != pkt-payload.proxied.bytes) {
+TRACE(buffer full?);
+return -1;
+}
+
+return 0;
+}
+
 /* read handler for communication channel
  *
  * de-multiplexes data coming in over the channel. for control messages
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 08/15] virtproxy: interfaces to set/remove/handle VPOForwards

2010-10-22 Thread Michael Roth

Functions to add listener FDs (oforwards) which set up proxied connections
to associated service, and the corresponding handler function to process
to new connections to said FDs and initialize new client connections to
the associated remote server over the channel

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |  103 +++
 virtproxy.h |1 +
 2 files changed, 104 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index cc0ac9a..6c3611b 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -12,6 +12,7 @@
  */
 
 #include virtproxy.h
+#include qemu_socket.h
 
 #define DEBUG_VP
 
@@ -314,6 +315,70 @@ static void vp_channel_read(void *opaque)
 }
 }
 
+/* handler to accept() and init new client connections */
+static void vp_oforward_accept(void *opaque)
+{
+VPOForward *f = opaque;
+VPDriver *drv = f-drv;
+
+struct sockaddr_in saddr;
+struct sockaddr *addr;
+socklen_t len;
+int fd, ret;
+VPConn *conn = NULL;
+VPPacket pkt;
+VPControlMsg msg;
+
+TRACE(called with opaque: %p, drv: %p, f, drv);
+
+for(;;) {
+len = sizeof(saddr);
+addr = (struct sockaddr *)saddr;
+fd = qemu_accept(f-listen_fd, addr, len);
+
+if (fd  0  errno != EINTR) {
+TRACE(accept() failed);
+return;
+} else if (fd = 0) {
+TRACE(accepted connection);
+break;
+}
+}
+
+if (drv-channel_fd == -1) {
+TRACE(communication channel not open, closing connection);
+closesocket(fd);
+return;
+}
+
+/* send init packet over channel */
+memset(msg, 0, sizeof(VPControlMsg));
+msg.type = VP_CONTROL_CONNECT_INIT;
+msg.args.connect_init.client_fd = fd;
+pstrcpy(msg.args.connect_init.service_id, VP_SERVICE_ID_LEN, 
f-service_id);
+
+memset(pkt, 0, sizeof(VPPacket));
+pkt.type = VP_PKT_CONTROL;
+pkt.payload.msg = msg;
+pkt.magic = VP_MAGIC;
+
+ret = vp_send_all(drv-channel_fd, pkt, sizeof(VPPacket));
+if (ret == -1) {
+LOG(vp_send_all() failed);
+return;
+}
+
+/* create new VPConn for client */
+conn = qemu_mallocz(sizeof(VPConn));
+conn-drv = drv;
+conn-client_fd = fd;
+conn-type = VP_CONN_CLIENT;
+conn-state = VP_STATE_NEW;
+QLIST_INSERT_HEAD(drv-conns, conn, next);
+
+socket_set_nonblock(fd);
+}
+
 /* create/init VPDriver object */
 VPDriver *vp_new(int fd, bool listen)
 {
@@ -336,3 +401,41 @@ VPDriver *vp_new(int fd, bool listen)
 
 return drv;
 }
+
+/* set/modify/remove a service_id - net/unix listening socket mapping
+ *
+ * service_id is a user-defined id for the service. this is what the
+ * client end will tag it's connections with so that the remote end can
+ * route it to the proper socket on the remote end.
+ *
+ * fd is a listen()'ing socket we want virtproxy to listen for new
+ * connections of this service type on. set fd to -1 to remove the 
+ * existing listening socket for this service_id
+ */
+int vp_set_oforward(VPDriver *drv, int fd, const char *service_id)
+{
+VPOForward *f = get_oforward(drv, service_id);
+
+if (fd == -1) {
+if (f != NULL) {
+vp_set_fd_handler(f-listen_fd, NULL, NULL, NULL);
+QLIST_REMOVE(f, next);
+qemu_free(f);
+}
+return 0;
+}
+
+if (f == NULL) {
+f = qemu_mallocz(sizeof(VPOForward));
+f-drv = drv;
+strncpy(f-service_id, service_id, VP_SERVICE_ID_LEN);
+QLIST_INSERT_HEAD(drv-oforwards, f, next);
+} else {
+closesocket(f-listen_fd);
+}
+
+f-listen_fd = fd;
+vp_set_fd_handler(f-listen_fd, vp_oforward_accept, NULL, f);
+
+return 0;
+}
diff --git a/virtproxy.h b/virtproxy.h
index 3df1691..39d5d40 100644
--- a/virtproxy.h
+++ b/virtproxy.h
@@ -33,5 +33,6 @@ int vp_set_fd_handler(int fd,
 
 /* virtproxy interface */
 VPDriver *vp_new(int fd, bool listen);
+int vp_set_oforward(VPDriver *drv, int fd, const char *service_id);
 
 #endif /* VIRTPROXY_H */
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 06/15] virtproxy: add read handler for communication channel

2010-10-22 Thread Michael Roth

Handle data coming in over the channel as VPPackets: Process control
messages and forward data from remote client/server connections to the
appropriate server/client FD on our end.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   83 +++
 1 files changed, 83 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 20532c2..c9c3022 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -33,6 +33,7 @@
 #define VP_SERVICE_ID_LEN 32/* max length of service id string */
 #define VP_PKT_DATA_LEN 1024/* max proxied bytes per VPPacket */
 #define VP_CONN_DATA_LEN 1024   /* max bytes conns can send at a time */
+#define VP_CHAN_DATA_LEN 4096   /* max bytes channel can send at a time */
 #define VP_MAGIC 0x1F374059
 
 /* listening fd, one for each service we're forwarding to remote end */
@@ -150,6 +151,8 @@ static QemuOptsList vp_socket_opts = {
 },
 };
 
+static void vp_channel_read(void *opaque);
+
 /* get VPConn by fd, client denotes whether to look for client or server */
 static VPConn *get_conn(const VPDriver *drv, int fd, bool client)
 {
@@ -230,3 +233,83 @@ static void vp_channel_accept(void *opaque)
 /* dont accept anymore connections until channel_fd is closed */
 vp_set_fd_handler(drv-listen_fd, NULL, NULL, NULL);
 }
+
+/* read handler for communication channel
+ *
+ * de-multiplexes data coming in over the channel. for control messages
+ * we process them here, for data destined for a service or client we
+ * send it to the appropriate FD.
+ */
+static void vp_channel_read(void *opaque)
+{
+VPDriver *drv = opaque;
+VPPacket pkt;
+int count, ret, buf_offset;
+char buf[VP_CHAN_DATA_LEN];
+char *pkt_ptr, *buf_ptr;
+
+TRACE(called with opaque: %p, drv);
+
+count = read(drv-channel_fd, buf, sizeof(buf));
+
+if (count == -1) {
+LOG(read() failed: %s, strerror(errno));
+return;
+} else if (count == 0) {
+/* TODO: channel closed, this probably shouldn't happen for guest-side
+ * serial/virtio-serial connections, but need to confirm and consider
+ * what should happen in this case. as it stands this virtproxy 
instance
+ * is basically defunct at this point, same goes for client instances
+ * of virtproxy where the remote end has hung-up.
+ */
+LOG(channel connection closed);
+vp_set_fd_handler(drv-channel_fd, NULL, NULL, drv);
+drv-channel_fd = -1;
+if (drv-listen_fd) {
+vp_set_fd_handler(drv-listen_fd, vp_channel_accept, NULL, drv);
+}
+/* TODO: should close/remove/delete all existing VPConns here */
+}
+
+if (drv-buflen + count = sizeof(VPPacket)) {
+TRACE(initial packet, drv-buflen: %d, drv-buflen);
+pkt_ptr = (char *)pkt;
+memcpy(pkt_ptr, drv-buf, drv-buflen);
+pkt_ptr += drv-buflen;
+memcpy(pkt_ptr, buf, sizeof(VPPacket) - drv-buflen);
+/* handle first packet */
+ret = vp_handle_packet(drv, pkt);
+if (ret != 0) {
+LOG(error handling packet);
+}
+/* handle the rest of the buffer */
+buf_offset = sizeof(VPPacket) - drv-buflen;
+drv-buflen = 0;
+buf_ptr = buf + buf_offset;
+count -= buf_offset;
+while (count  0) {
+if (count = sizeof(VPPacket)) {
+/* handle full packet */
+TRACE(additional packet, drv-buflen: %d, drv-buflen);
+memcpy((void *)pkt, buf_ptr, sizeof(VPPacket));
+ret = vp_handle_packet(drv, pkt);
+if (ret != 0) {
+LOG(error handling packet);
+}
+count -= sizeof(VPPacket);
+buf_ptr += sizeof(VPPacket);
+} else {
+/* buffer the remainder */
+TRACE(buffering packet);
+memcpy(drv-buf, buf_ptr, count);
+drv-buflen = count;
+break;
+}
+}
+} else {
+/* haven't got a full VPPacket yet, buffer for later */
+buf_ptr = drv-buf + drv-buflen;
+memcpy(buf_ptr, buf, count);
+drv-buflen += count;
+}
+}
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 11/15] virtproxy: add vp_handle_packet()

2010-10-22 Thread Michael Roth


Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 4f56aba..5ec4e77 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -431,6 +431,29 @@ static int vp_handle_data_packet(void *drv, const VPPacket 
*pkt)
 return 0;
 }
 
+static inline int vp_handle_packet(VPDriver *drv, const VPPacket *pkt)
+{
+int ret;
+
+TRACE(called with drv: %p, drv);
+
+if (pkt-magic != VP_MAGIC) {
+LOG(invalid packet magic field);
+return -1;
+}
+
+if (pkt-type == VP_PKT_CONTROL) {
+ret = vp_handle_control_packet(drv, pkt);
+} else if (pkt-type == VP_PKT_CLIENT || pkt-type == VP_PKT_SERVER) {
+ret = vp_handle_data_packet(drv, pkt);
+} else {
+LOG(invalid packet type);
+return -1;
+}
+
+return ret;
+}
+
 /* read handler for communication channel
  *
  * de-multiplexes data coming in over the channel. for control messages
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 03/15] virtproxy: add debug functions for virtproxy core

2010-10-22 Thread Michael Roth


Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index f30b859..2f8996c 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -13,6 +13,23 @@
 
 #include virtproxy.h
 
+#define DEBUG_VP
+
+#ifdef DEBUG_VP
+#define TRACE(msg, ...) do { \
+fprintf(stderr, %s:%s():L%d:  msg \n, \
+__FILE__, __FUNCTION__, __LINE__, ## __VA_ARGS__); \
+} while(0)
+#else
+#define TRACE(msg, ...) \
+do { } while (0)
+#endif
+
+#define LOG(msg, ...) do { \
+fprintf(stderr, %s:%s():  msg \n, \
+__FILE__, __FUNCTION__, ## __VA_ARGS__); \
+} while(0)
+
 #define VP_SERVICE_ID_LEN 32/* max length of service id string */
 #define VP_PKT_DATA_LEN 1024/* max proxied bytes per VPPacket */
 #define VP_CONN_DATA_LEN 1024   /* max bytes conns can send at a time */
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 07/15] virtproxy: add vp_new() VPDriver constructor

2010-10-22 Thread Michael Roth


Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   23 +++
 virtproxy.h |3 +++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index c9c3022..cc0ac9a 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -313,3 +313,26 @@ static void vp_channel_read(void *opaque)
 drv-buflen += count;
 }
 }
+
+/* create/init VPDriver object */
+VPDriver *vp_new(int fd, bool listen)
+{
+VPDriver *drv = NULL;
+
+drv = qemu_mallocz(sizeof(VPDriver));
+drv-listen_fd = -1;
+drv-channel_fd = -1;
+QLIST_INIT(drv-oforwards);
+QLIST_INIT(drv-conns);
+
+if (listen) {
+/* provided FD is to be listened on for channel connection */
+drv-listen_fd = fd;
+vp_set_fd_handler(drv-listen_fd, vp_channel_accept, NULL, drv);
+} else {
+drv-channel_fd = fd;
+vp_set_fd_handler(drv-channel_fd, vp_channel_read, NULL, drv);
+}
+
+return drv;
+}
diff --git a/virtproxy.h b/virtproxy.h
index 0203421..3df1691 100644
--- a/virtproxy.h
+++ b/virtproxy.h
@@ -31,4 +31,7 @@ int vp_set_fd_handler(int fd,
 IOHandler *fd_write,
 void *opaque);
 
+/* virtproxy interface */
+VPDriver *vp_new(int fd, bool listen);
+
 #endif /* VIRTPROXY_H */
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 10/15] virtproxy: add handler for control packet

2010-10-22 Thread Michael Roth

Process control packets coming in over the channel. This entails setting
up/tearing down connections to local services initiated from the other
end of the channel.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |  154 +++
 1 files changed, 154 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 57ab2b0..4f56aba 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -235,6 +235,160 @@ static void vp_channel_accept(void *opaque)
 vp_set_fd_handler(drv-listen_fd, NULL, NULL, NULL);
 }
 
+/* handle control packets
+ *
+ * process VPPackets containing control messages
+ */
+static int vp_handle_control_packet(VPDriver *drv, const VPPacket *pkt)
+{
+const VPControlMsg *msg = pkt-payload.msg;
+int ret;
+
+TRACE(called with drv: %p, drv);
+
+switch (msg-type) {
+case VP_CONTROL_CONNECT_INIT: {
+int client_fd = msg-args.connect_init.client_fd;
+int server_fd;
+char service_id[VP_SERVICE_ID_LEN];
+VPPacket resp_pkt;
+VPConn *new_conn;
+VPIForward *iforward;
+
+pstrcpy(service_id, VP_SERVICE_ID_LEN,
+ msg-args.connect_init.service_id);
+TRACE(setting up connection for service id %s, service_id);
+
+/* create server connection on behalf of remote end */
+iforward = get_iforward(drv, service_id);
+if (iforward == NULL) {
+LOG(no forwarder configured for service id);
+return -1;
+}
+
+qemu_opts_print(iforward-socket_opts, NULL);
+if (qemu_opt_get(iforward-socket_opts, host) != NULL) {
+server_fd = inet_connect_opts(iforward-socket_opts);
+} else if (qemu_opt_get(iforward-socket_opts, path) != NULL) {
+server_fd = unix_connect_opts(iforward-socket_opts);
+} else {
+LOG(unable to find listening socket host/addr info);
+return -1;
+}
+
+if (server_fd == -1) {
+LOG(failed to create connection to service with id %s,
+service_id);
+}
+TRACE(server_fd: %d, server_fd);
+
+new_conn = qemu_mallocz(sizeof(VPConn));
+if (!new_conn) {
+LOG(memory allocation failed);
+return -1;
+}
+
+/* send a connect_ack back over the channel */
+/* TODO: all fields should be explicitly set so we shouldn't
+ * need to memset. this might hurt if we beef up VPPacket size
+ */
+memset(resp_pkt, 0, sizeof(resp_pkt));
+resp_pkt.type = VP_PKT_CONTROL;
+resp_pkt.payload.msg.type = VP_CONTROL_CONNECT_ACK;
+resp_pkt.payload.msg.args.connect_ack.server_fd = server_fd;
+resp_pkt.payload.msg.args.connect_ack.client_fd = client_fd;
+resp_pkt.magic = VP_MAGIC;
+
+/* TODO: can this potentially block or cause a deadlock with
+ * the remote end? need to look into potentially buffering these
+ * if it looks like the remote end is waiting for us to read data
+ * off the channel.
+ */
+if (drv-channel_fd == -1) {
+TRACE(channel no longer connected, ignoring packet);
+return -1;
+}
+
+ret = vp_send_all(drv-channel_fd, (void *)resp_pkt, 
sizeof(resp_pkt));
+if (ret == -1) {
+LOG(error sending data over channel);
+return -1;
+}
+if (ret != sizeof(resp_pkt)) {
+TRACE(buffer full? %d bytes remaining, ret);
+return -1;
+}
+
+/* add new VPConn to list and set a read handler for it */
+new_conn-drv = drv;
+new_conn-client_fd = client_fd;
+new_conn-server_fd = server_fd;
+new_conn-type = VP_CONN_SERVER;
+new_conn-state = VP_STATE_CONNECTED;
+QLIST_INSERT_HEAD(drv-conns, new_conn, next);
+vp_set_fd_handler(server_fd, vp_conn_read, NULL, new_conn);
+
+break;
+}
+case VP_CONTROL_CONNECT_ACK: {
+int client_fd = msg-args.connect_ack.client_fd;
+int server_fd = msg-args.connect_ack.server_fd;
+VPConn *conn;
+
+TRACE(recieved ack from remote end for client fd %d, client_fd);
+
+if (server_fd = 0) {
+LOG(remote end sent invalid server fd);
+return -1;
+}
+
+conn = get_conn(drv, client_fd, true);
+
+if (conn == NULL) {
+LOG(failed to find connection with client_fd %d, client_fd);
+return -1;
+}
+
+conn-server_fd = server_fd;
+conn-state = VP_STATE_CONNECTED;
+vp_set_fd_handler(client_fd, vp_conn_read, NULL, conn);
+
+break;
+}
+case VP_CONTROL_CLOSE: {
+int fd;
+VPConn *conn;
+
+TRACE(closing connection on behalf of remote end);
+
+if (msg-args.close.client_fd = 0) {
+fd = msg-args.close.client_fd;
+TRACE(recieved

[Qemu-devel] [RFC][PATCH 00/15] virtproxy: host/guest communication layer

2010-10-22 Thread Michael Roth

OVERVIEW:

Virtproxy proxies and multiplexes socket streams over a data channel between a 
host and a guest (currently network connections, emulated serial, or 
virtio-serial channels are supported). This allows for services such as guest 
data collection agents, host/guest file transfer, and event generation/handling 
to be implemented/deployed as basic socket-based daemons, independently of the 
actual data channel.

This code is intended to provide a channel-independent abstraction layer for 
communicating with a QEMU-specific guest agent (in particular, the virtagent 
RPC guest agent which will follow this in a seperate patchset), but may have 
general utility beyond this (for instance: ssh/sftp/other guest agents/etc over 
isa/virtio serial), and so is submitted here as a seperate patchset.

Currently this communication involves 2 daemons (common code): 1 in the guest, 
and 1 in the host. Each end multiplexes/demultiplexes/proxies connections from 
the other end. In the future we hope to integrate the host component directly 
into qemu as a chardev.

BUILD/USAGE INFO:
  make qemu-vp
  ./qemu-vp -h

EXAMPLE USAGE:

 - Proxy http and ssh connections from a host to a guest over a virtio-serial 
connection:
# start guest with virtio-serial. for example (RHEL6s13):
qemu \
-device virtio-serial \
-chardev socket,path=/tmp/test0-virtioconsole.sock,server,nowait,id=test0 \
-device virtconsole,chardev=test0,name=test0 \
-chardev socket,path=/tmp/test1-virtio-serial.sock,server,nowait,id=test1 \
-device virtserialport,chardev=test1,name=test1 \
-chardev socket,path=/tmp/test2-virtio-serial.sock,server,nowait,id=test2 \
-device virtserialport,chardev=test2,name=test2 \
...
# in the host:
./qemu-vp -c unix-connect:/tmp/test2-virtio-serial.sock:- -o 
http:127.0.0.1:9080 \
  -o ssh:127.0.0.1:9022
# in the guest:
./qemu-vp -c virtserial-open:/dev/virtio-ports/test2:- -i http:127.0.0.1:80 
\
  -i ssh:127.0.0.1:22

# from host, access guest http server
wget http://locahost:9080
# from host, access guest ssh server
ssh localhost -p 9022

 - Proxy http and ssh connections from a host to a guest over a network 
connection:
# start guest with network connectivity to host
# in the guest:
./qemu-vp -c tcp-listen:guest_ip:9000 -i http:127.0.0.1:80 \
  -i ssh:127.0.0.1:22
# in the host:
./qemu-vp -c tcp-connect:guest_ip:9000 -o http:127.0.0.1:9080 \
  -o ssh:127.0.0.1:9022
...

By specifying -i and -o options in the host and guest, respectively, the 
channel can also be used to establish connections from a guest to a host.

KNOWN ISSUES:

 - Deadlocking the guest: In tests over isa-serial ports I've hit cases where 
the chardev (socket) on the host-side seem to fill up the buffer, likely due to 
qemu rate-limiting data in accordance with the port's baud rate (which may 
explain why i hadn't seen this with network-based or virtio-serial data 
channels. When qemu-vp reads data from client connections it puts it into a 
VPPacket and tries to send the packet in it's entirety back over the channel. 
In this particular case that write() blocks (or vp_send_all() spins if we set 
O_NONBLOCK on the client FD). In the meantime qemu fills up the other end of 
the socket buffer and ends up spinning in qemu-char:send_all(), basically 
causing a deadlock between qemu and qemu-vp, and causing the guest to freeze.

   Currently I'm planning on replacing vp_send_all() with a function that 
simply buffers write()'s, which would allow the use of non-blocking write()'s 
out to the channel/chardev socket while still retaining wholeness/fifo-ordering 
of the VPPackets.

 - Sync issues with virtio-serial: This may or may not be related to the issue 
above, but I noticed some cases where proxied ssh sessions from the guest to 
the host would lag by a few bytes. For instance typing top would result in 
to being displayed, and the p wouldn't show up till I hit another key. This 
could be related to how I'm handling the buffering, but I haven't been able to 
reproduce using a network-based channel.

TODO:

 - Rework vp_send_all() to use buffering to avoid above-mentioned deadlock 
scenario 
 - Integrate qemu-vp directly into qemu by adding a virtproxy chardev device. 
For example:
 ./qemu-vp -c unix-connect:/tmp/vp1-virtio-serial.sock:- -o 
ssh:127.0.0.1:9022
   in the host, would be analogous to:
 qemu \
 -device virtio-serial \
 -chardev virtproxy,oforward=ssh:127.0.0.1:9022,id=vp1 \
 -device virtserialport,chardev=vp1,name=vp1
 - Better channel negotiation to gracefully handle guest reboots/disconnects/etc
 - Add monitor commands to add/remove virtproxy channels/oforwards/iforwards on 
the fly

 .gitignore  |1 +
 Makefile|4 +-
 configure   |1 +
 qemu-vp.c   |  618 +
 virtproxy.c |  799

[Qemu-devel] [RFC][PATCH 01/15] virtproxy: base data structures and constants

2010-10-22 Thread Michael Roth


Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |  134 +++
 virtproxy.h |   34 +++
 2 files changed, 168 insertions(+), 0 deletions(-)
 create mode 100644 virtproxy.c
 create mode 100644 virtproxy.h

diff --git a/virtproxy.c b/virtproxy.c
new file mode 100644
index 000..f30b859
--- /dev/null
+++ b/virtproxy.c
@@ -0,0 +1,134 @@
+/*
+ * virt-proxy - host/guest communication layer
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Michael Roth  mdr...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include virtproxy.h
+
+#define VP_SERVICE_ID_LEN 32/* max length of service id string */
+#define VP_PKT_DATA_LEN 1024/* max proxied bytes per VPPacket */
+#define VP_CONN_DATA_LEN 1024   /* max bytes conns can send at a time */
+#define VP_MAGIC 0x1F374059
+
+/* listening fd, one for each service we're forwarding to remote end */
+typedef struct VPOForward {
+VPDriver *drv;
+int listen_fd;
+char service_id[VP_SERVICE_ID_LEN];
+QLIST_ENTRY(VPOForward) next;
+} VPOForward;
+
+/* service_id-path/port mapping of each service forwarded from remote end */
+typedef struct VPIForward {
+VPDriver *drv;
+char service_id[VP_SERVICE_ID_LEN];
+QemuOpts *socket_opts;
+QLIST_ENTRY(VPIForward) next;
+} VPIForward;
+
+/* proxied client/server connected states */
+typedef struct VPConn {
+VPDriver *drv;
+int client_fd;
+int server_fd;
+enum {
+VP_CONN_CLIENT = 1,
+VP_CONN_SERVER,
+} type;
+enum {
+VP_STATE_NEW = 1,   /* accept()'d and registered fd */
+VP_STATE_INIT,  /* sent init pkt to remote end, waiting for ack */
+VP_STATE_CONNECTED, /* client and server connected */
+} state;
+QLIST_ENTRY(VPConn) next;
+} VPConn;
+
+typedef struct VPControlMsg {
+enum {
+VP_CONTROL_CONNECT_INIT = 1,
+VP_CONTROL_CONNECT_ACK,
+VP_CONTROL_CLOSE,
+} type;
+union {
+/* tell remote end connect to server and map client_fd to it */
+struct {
+int client_fd;
+char service_id[VP_SERVICE_ID_LEN];
+} connect_init;
+/* tell remote end we've created the connection to the server,
+ * and give them the corresponding fd to use so we don't have
+ * to do a reverse lookup everytime
+ */
+struct {
+int client_fd;
+int server_fd;
+} connect_ack;
+/* tell remote end to close fd in question, presumably because
+ * connection was closed on our end
+ */
+struct {
+int client_fd;
+int server_fd;
+} close;
+} args;
+} VPControlMsg;
+
+typedef struct VPPacket {
+enum {
+VP_PKT_CONTROL = 1,
+VP_PKT_CLIENT,
+VP_PKT_SERVER,
+} type;
+union {
+VPControlMsg msg;
+struct {
+int client_fd;
+int server_fd;
+int bytes;
+char data[VP_PKT_DATA_LEN];
+} proxied;
+} payload;
+int magic;
+} __attribute__((__packed__)) VPPacket;
+
+struct VPDriver {
+int channel_fd;
+int listen_fd;
+char buf[sizeof(VPPacket)];
+int buflen;
+QLIST_HEAD(, VPOForward) oforwards;
+QLIST_HEAD(, VPIForward) iforwards;
+QLIST_HEAD(, VPConn) conns;
+};
+
+static QemuOptsList vp_socket_opts = {
+.name = vp_socket_opts,
+.head = QTAILQ_HEAD_INITIALIZER(vp_socket_opts.head),
+.desc = {
+{
+.name = path,
+.type = QEMU_OPT_STRING,
+},{
+.name = host,
+.type = QEMU_OPT_STRING,
+},{
+.name = port,
+.type = QEMU_OPT_STRING,
+},{
+.name = ipv4,
+.type = QEMU_OPT_BOOL,
+},{
+.name = ipv6,
+.type = QEMU_OPT_BOOL,
+},
+{ /* end if list */ }
+},
+};
diff --git a/virtproxy.h b/virtproxy.h
new file mode 100644
index 000..0203421
--- /dev/null
+++ b/virtproxy.h
@@ -0,0 +1,34 @@
+/*
+ * virt-proxy - host/guest communication layer
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Michael Roth  mdr...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef VIRTPROXY_H
+#define VIRTPROXY_H
+
+#include qemu-common.h
+#include qemu-queue.h
+
+typedef struct VPDriver VPDriver;
+
+/* wrappers for s/vp/qemu/ functions we need */
+int vp_send_all(int fd, const void *buf, int len1);
+int vp_set_fd_handler2(int fd,
+ IOCanReadHandler *fd_read_poll,
+ IOHandler *fd_read,
+ IOHandler *fd_write,
+ void *opaque);
+int

[Qemu-devel] [PATCH 00/17] [RFC] static instrumentation

2010-10-22 Thread Lluís

Here's a set of patches with the current state of static
instrumentation.

Hope that this organization will help understanding the point of
it. Patch 9 describes the taken approach for instrumenting during code
generation, which is the main point of this.

Code can also be reached at:
 https://projects.gso.ac.upc.edu/projects/qemu-instrument
 https://code.gso.ac.upc.edu/git/qemu-instrument

Lluís Vilanova (17):
  backdoor: Declare host-side backdoor helpers
  backdoor: [all] Include backdoor helper declarations
  backdoor: Declare guest-side interface macros
  backdoor: [i386] Decode backdoor instructions
  backdoor: [i386] Declare guest-side interface macros
  backdoor: Add a simple example
  instrument: Handle config-time activation
  instrument: Add initial instrumentation example
  instrument: Dynamic per-CPU state of static instrumentation points
  instrument: Code-generation macros
  instrument: [all] Include instrumentation helper declarations
  instrument: Add FETCH point
  instrument: [i386] Call FETCH point
  instrument: Add VMEM point
  instrument: [all] Call VMEM point
  instrument: Add PLVL point
  instrument: [i386] Call PLVL point

 .gitignore |2 +
 Makefile.target|   35 +-
 backdoor/examples/print/README |   13 ++
 backdoor/examples/print/guest/Makefile |7 +
 backdoor/examples/print/guest/test.c   |   33 +
 backdoor/examples/print/host/Makefile  |   13 ++
 backdoor/examples/print/host/printcb.c |   36 +
 backdoor/guest.h   |   60 
 backdoor/helper.h  |   21 +++
 configure  |   19 +++
 cpu-all.h  |   74 ++
 cpu-defs.h |   24 +++
 cpu-exec.c |8 +-
 cpus.c |8 +
 exec-all.h |7 +-
 exec.c |   50 +---
 instrument/control.c   |   74 ++
 instrument/control.h   |   44 ++
 instrument/examples/dynprint/README|   16 ++
 instrument/examples/dynprint/guest/Makefile|7 +
 instrument/examples/dynprint/guest/test.c  |   62 
 instrument/examples/dynprint/host/Makefile |   14 ++
 instrument/examples/dynprint/host/backdoor.c   |   61 
 instrument/examples/dynprint/host/helpers.c|   86 
 .../dynprint/host/instrument-host-helpers.h|   22 +++
 .../examples/dynprint/host/instrument-host.h   |   81 +++
 instrument/gen-vmem-wrappers.h |   88 
 instrument/generate.h  |  128 +
 instrument/host-stub.h |   81 +++
 instrument/state.h |   61 
 instrument/types.h |   44 ++
 linux-user/main.c  |   12 ++
 qemu-common.h  |4 +
 softmmu_header.h   |   15 ++
 target-alpha/helper.h  |8 +
 target-alpha/translate.c   |4 +
 target-arm/helper.h|8 +
 target-arm/translate.c |4 +
 target-cris/helper.h   |8 +
 target-cris/translate.c|4 +
 target-i386/cpu.h  |   21 ++--
 target-i386/helper.h   |8 +
 target-i386/translate.c|  147 
 target-m68k/helper.h   |8 +
 target-m68k/translate.c|4 +
 target-microblaze/helper.h |8 +
 target-microblaze/translate.c  |   10 ++
 target-mips/helper.h   |8 +
 target-mips/translate.c|4 +
 target-ppc/helper.h|8 +
 target-ppc/translate.c |4 +
 target-s390x/translate.c   |4 +
 target-sh4/helper.h|8 +
 target-sh4/translate.c |4 +
 target-sparc/helper.h  |8 +
 target-sparc/translate.c   |4 +
 56 files changed, 1543 insertions(+), 61 deletions(-)
 create mode 100644 backdoor/examples/print/README
 create mode 100644 backdoor/examples/print/guest/Makefile
 create mode 100644 backdoor/examples/print/guest/test.c
 create mode

[Qemu-devel] [RFC][PATCH 12/15] virtproxy: interfaces to set/remove VPIForwards

2010-10-22 Thread Michael Roth


Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   59 +++
 virtproxy.h |2 ++
 2 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index 5ec4e77..86a8e5b 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -658,3 +658,62 @@ int vp_set_oforward(VPDriver *drv, int fd, const char 
*service_id)
 
 return 0;
 }
+
+/* add/modify a service_id - net/unix socket mapping
+ *
+ * service_id is a user-defined id for the service. this is what the
+ * remote end will use to proxy connections to a specific service on
+ * our end.
+ *
+ * if port is NULL, addr is the address of the net socket the
+ * service is running on. otherwise, addr is the path to the unix socket
+ * the service is running on.
+ *
+ * if port AND addr are NULL, find and remove the current iforward
+ * for this service_id if it exists.
+ *
+ * ipv6 is a bool denoting whether or not to use ipv6
+ */
+int vp_set_iforward(VPDriver *drv, const char *service_id, const char *addr,
+const char *port, bool ipv6)
+{
+VPIForward *f = get_iforward(drv, service_id);
+
+if (addr == NULL  port == NULL) {
+if (f != NULL) {
+qemu_opts_del(f-socket_opts);
+QLIST_REMOVE(f, next);
+qemu_free(f);
+}
+return 0;
+}
+
+if (f == NULL) {
+f = qemu_mallocz(sizeof(VPIForward));
+f-drv = drv;
+strncpy(f-service_id, service_id, VP_SERVICE_ID_LEN);
+QLIST_INSERT_HEAD(drv-iforwards, f, next);
+} else {
+qemu_opts_del(f-socket_opts);
+}
+
+/* stick socket-related options in a QemuOpts so we can
+ * utilize qemu socket utility functions directly
+ */
+f-socket_opts = qemu_opts_create(vp_socket_opts, NULL, 0);
+if (port == NULL) {
+/* no port given, assume unix path */
+qemu_opt_set(f-socket_opts, path, addr);
+} else {
+qemu_opt_set(f-socket_opts, host, addr);
+qemu_opt_set(f-socket_opts, port, port);
+}
+
+if (ipv6) {
+qemu_opt_set(f-socket_opts, ipv6, on);
+} else {
+qemu_opt_set(f-socket_opts, ipv4, on);
+}
+
+return 0;
+}
diff --git a/virtproxy.h b/virtproxy.h
index 39d5d40..d2522b3 100644
--- a/virtproxy.h
+++ b/virtproxy.h
@@ -34,5 +34,7 @@ int vp_set_fd_handler(int fd,
 /* virtproxy interface */
 VPDriver *vp_new(int fd, bool listen);
 int vp_set_oforward(VPDriver *drv, int fd, const char *service_id);
+int vp_set_iforward(VPDriver *drv, const char *service_id, const char *addr,
+const char *port, bool ipv6);
 
 #endif /* VIRTPROXY_H */
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 02/10] virtagent: base definitions for host/guest RPC daemon

2010-10-22 Thread Michael Roth

Basic skeleton code for RPC daemon loop. This is shared by both the
guest-side RPC server as well as the host-side one (the advertised RPCs
for each by guest/host-specific arrays).

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtagent-daemon.c |  118 
 virtagent-daemon.h |   20 +
 2 files changed, 138 insertions(+), 0 deletions(-)
 create mode 100644 virtagent-daemon.c
 create mode 100644 virtagent-daemon.h

diff --git a/virtagent-daemon.c b/virtagent-daemon.c
new file mode 100644
index 000..998b025
--- /dev/null
+++ b/virtagent-daemon.c
@@ -0,0 +1,118 @@
+/*
+ * virt-agent - host/guest RPC daemon functions
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Adam Litkeagli...@linux.vnet.ibm.com
+ *  Michael Roth  mdr...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+#include stdlib.h
+#include stdio.h
+#include unistd.h
+#include xmlrpc-c/base.h
+#include xmlrpc-c/server.h
+#include xmlrpc-c/server_abyss.h
+#include qemu_socket.h
+#include virtagent-daemon.h
+#include virtagent-common.h
+
+static int va_accept(int listen_fd) {
+struct sockaddr_in saddr;
+struct sockaddr *addr;
+socklen_t len;
+int fd;
+
+while (1) {
+len = sizeof(saddr);
+addr = (struct sockaddr *)saddr;
+fd = qemu_accept(listen_fd, addr, len);
+if (fd  0  errno != EINTR) {
+LOG(accept() failed);
+break;
+} else if (fd = 0) {
+TRACE(accepted connection);
+break;
+}
+}
+return fd;
+}
+
+typedef struct RPCFunction {
+xmlrpc_value *(*func)(xmlrpc_env *env, xmlrpc_value *param, void *unused);
+const char *func_name;
+} RPCFunction;
+
+static RPCFunction guest_functions[] = {
+{ NULL, NULL }
+};
+static RPCFunction host_functions[] = {
+{ NULL, NULL }
+};
+
+static void va_register_functions(xmlrpc_env *env, xmlrpc_registry *registry,
+  RPCFunction *list)
+{
+int i;
+for (i = 0; list[i].func != NULL; ++i) {
+TRACE(adding func: %s, list[i].func_name);
+xmlrpc_registry_add_method(env, registry, NULL, list[i].func_name,
+   list[i].func, NULL);
+}
+}
+
+int va_server_loop(int listen_fd, bool is_host)
+{
+xmlrpc_registry *registryP;
+xmlrpc_env env;
+int ret, fd;
+char *rpc_request;
+int rpc_request_len;
+xmlrpc_mem_block *rpc_response;
+RPCFunction *func_list = is_host ? host_functions : guest_functions;
+
+xmlrpc_env_init(env);
+registryP = xmlrpc_registry_new(env);
+va_register_functions(env, registryP, func_list);
+
+while (1) {
+TRACE(waiting for connection from RPC client);
+fd = va_accept(listen_fd);
+if (fd  0) {
+TRACE(connection error: %s, strerror(errno));
+continue;
+}
+TRACE(RPC client connected, fetching RPC...);
+ret = va_get_rpc_request(fd, rpc_request, rpc_request_len);
+if (ret != 0 || rpc_request == NULL) {
+LOG(error retrieving rpc request);
+goto out;
+}
+TRACE(handling RPC request);
+rpc_response = xmlrpc_registry_process_call(env, registryP, NULL,
+rpc_request,
+rpc_request_len);
+if (rpc_response == NULL) {
+LOG(error handling rpc request);
+goto out;
+}
+
+qemu_free(rpc_request);
+TRACE(sending RPC response);
+ret = va_send_rpc_response(fd, rpc_response);
+if (ret != 0) {
+LOG(error sending rpc response);
+goto out;
+}
+TRACE(RPC completed);
+XMLRPC_MEMBLOCK_FREE(char, rpc_response);
+out:
+closesocket(fd);
+}
+
+return 0;
+}
diff --git a/virtagent-daemon.h b/virtagent-daemon.h
new file mode 100644
index 000..bb197d0
--- /dev/null
+++ b/virtagent-daemon.h
@@ -0,0 +1,20 @@
+/*
+ * virt-agent - host/guest RPC daemon functions
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Michael Roth  mdr...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+#define GUEST_AGENT_SERVICE_ID virtagent
+#define GUEST_AGENT_PATH /tmp/virtagent-guest.sock
+#define HOST_AGENT_SERVICE_ID virtagent-host
+#define HOST_AGENT_PATH /tmp/virtagent-host.sock
+#define VA_GETFILE_MAX 1  30
+#define VA_FILEBUF_LEN 16384
+
+int va_server_loop(int listen_fd, bool is_host);
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 14/15] virtproxy: Makefile/configure changes to build qemu-vp

2010-10-22 Thread Michael Roth


Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 .gitignore |1 +
 Makefile   |4 +++-
 configure  |1 +
 3 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/.gitignore b/.gitignore
index a43e4d1..da307d2 100644
--- a/.gitignore
+++ b/.gitignore
@@ -31,6 +31,7 @@ qemu-img-cmds.texi
 qemu-img-cmds.h
 qemu-io
 qemu-monitor.texi
+qemu-vp
 QMP/qmp-commands.txt
 .gdbinit
 *.a
diff --git a/Makefile b/Makefile
index 252c817..53b58d2 100644
--- a/Makefile
+++ b/Makefile
@@ -127,7 +127,7 @@ version-obj-$(CONFIG_WIN32) += version.o
 ##
 
 qemu-img.o: qemu-img-cmds.h
-qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o: $(GENERATED_HEADERS)
+qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o virtproxy.o: $(GENERATED_HEADERS)
 
 qemu-img$(EXESUF): qemu-img.o qemu-tool.o qemu-error.o $(trace-obj-y) 
$(block-obj-y) $(qobject-obj-y) $(version-obj-y)
 
@@ -135,6 +135,8 @@ qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o qemu-error.o 
$(trace-obj-y) $(block-ob
 
 qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o $(trace-obj-y) 
$(block-obj-y) $(qobject-obj-y) $(version-obj-y)
 
+qemu-vp$(EXESUF): qemu-vp.o virtproxy.o qemu-tool.o qemu-error.o 
qemu-sockets.c $(trace-obj-y) $(block-obj-y) $(qobject-obj-y) $(version-obj-y)
+
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/hxtool -h  $  $@,  GEN   $@)
 
diff --git a/configure b/configure
index a079a49..27f92e0 100755
--- a/configure
+++ b/configure
@@ -2232,6 +2232,7 @@ if test $softmmu = yes ; then
   tools=qemu-img\$(EXESUF) qemu-io\$(EXESUF) $tools
   if [ $linux = yes -o $bsd = yes -o $solaris = yes ] ; then
   tools=qemu-nbd\$(EXESUF) $tools
+  tools=qemu-vp\$(EXESUF) $tools
 if [ $check_utests = yes ]; then
   tools=check-qint check-qstring check-qdict check-qlist $tools
   tools=check-qfloat check-qjson $tools
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 01/10] virtagent: add common rpc transport defs

2010-10-22 Thread Michael Roth

Common code for sending/recieving RPCs via http over virtproxy channel.
Eventually these will all be switched to asynchronous handlers to avoid
deadlocks between qemu and the guest. For now we can usually get away with
just doing asynchronous reads for http/RPC responses if we don't send
large RPC requests to the guest (since these will likely get buffered
rather than block or cause spinning on writes).

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtagent-common.c |  381 
 virtagent-common.h |   63 +
 2 files changed, 444 insertions(+), 0 deletions(-)
 create mode 100644 virtagent-common.c
 create mode 100644 virtagent-common.h

diff --git a/virtagent-common.c b/virtagent-common.c
new file mode 100644
index 000..51016a6
--- /dev/null
+++ b/virtagent-common.c
@@ -0,0 +1,381 @@
+/*
+ * virt-agent - common host/guest RPC functions
+ *
+ * Copyright IBM Corp. 2010
+ *
+ * Authors:
+ *  Adam Litkeagli...@linux.vnet.ibm.com
+ *  Michael Roth  mdr...@linux.vnet.ibm.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include stdlib.h
+#include unistd.h
+#include errno.h
+#include string.h
+#include ctype.h
+#include xmlrpc-c/base.h
+#include xmlrpc-c/client.h
+#include xmlrpc-c/server.h
+#include virtagent-common.h
+
+static int write_all(int fd, const void *buf, int len1)
+{
+int ret, len;
+
+len = len1;
+while (len  0) {
+ret = write(fd, buf, len);
+if (ret  0) {
+if (errno != EINTR  errno != EAGAIN) {
+LOG(write() failed);
+return -1;
+}
+} else if (ret == 0) {
+break;
+} else {
+buf += ret;
+len -= ret;
+}
+}
+return len1 - len;
+}
+
+static int read_all(int fd, void *buf, int len)
+{
+int ret, remaining;
+
+remaining = len;
+while (remaining  0) {
+ret = read(fd, buf, remaining);
+if (ret  0) {
+if (errno != EINTR  errno != EAGAIN) {
+LOG(read failed);
+return -1;
+}
+} else if (ret == 0) {
+break;
+} else {
+buf += ret;
+remaining -= ret;
+}
+}
+return len - remaining;
+}
+
+static int read_line(int fd, void *buf, int len)
+{
+int ret, remaining;
+remaining = len;
+while (remaining  0) {
+ret = read(fd, buf, 1);
+if (ret  0) {
+if (errno != EINTR  errno != EAGAIN) {
+LOG(read failed);
+return -1;
+}
+} else if (ret == 0) {
+break;
+} else {
+remaining--;
+buf++;
+if (*((char *)buf-1) == '\n')
+break;
+}
+}
+memset(buf, 0, remaining);
+return len - remaining;
+}
+
+typedef struct va_http {
+char *preamble;
+int content_length;
+int content_read;
+char *content;
+} va_http;
+
+typedef struct HttpReadState {
+int fd;
+char hdr_buf[4096];
+int hdr_pos;
+va_http http;
+enum {
+IN_RESP_HEADER = 0,
+IN_RESP_BODY,
+} state;
+RPCRequest *rpc_data;
+} HttpReadState;
+
+static int end_of_header(char *buf, int end_pos) {
+return !strncmp(buf+(end_pos-2), \n\r\n, 3);
+}
+
+static void parse_hdr(va_http *http, char *buf, int len) {
+int i, line_pos=0;
+char line_buf[4096];
+
+for (i=0; ilen; ++i) {
+if (buf[i] != '\n') {
+/* read line */
+line_buf[line_pos++] = buf[i];
+} else {
+/* process line */
+if (strncasecmp(line_buf, content-length: , 16) == 0) {
+http-content_length = atoi(line_buf[16]);
+return;
+}
+line_pos = 0;
+}
+}
+}
+
+static void http_read_handler(void *opaque) {
+HttpReadState *s = opaque;
+int ret;
+
+TRACE(called with opaque: %p, opaque);
+if (s-state == IN_RESP_HEADER) {
+while((ret = read(s-fd, s-hdr_buf + s-hdr_pos, 1))  0) {
+//TRACE(buf: %c, s-hdr_buf[0]);
+s-hdr_pos += ret;
+if (end_of_header(s-hdr_buf, s-hdr_pos - 1)) {
+s-state = IN_RESP_BODY;
+break;
+}
+}
+if (ret == -1) {
+if (errno == EAGAIN || errno == EWOULDBLOCK) {
+return;
+} else {
+LOG(error reading connection: %s, strerror(errno));
+goto out_bad;
+}
+} else if (ret == 0) {
+LOG(connection closed unexpectedly);
+goto out_bad;
+} else {
+s-http.content_length = -1;
+parse_hdr(s-http, s-hdr_buf, s-hdr_pos);
+if (s-http.content_length == -1) {
+LOG(malformed http header);
+

[Qemu-devel] [RFC][PATCH 05/15] virtproxy: add accept handler for communication channel

2010-10-22 Thread Michael Roth

This accept()'s connections to the socket we told virt-proxy to listen
for the channel connection on and sets the appropriate read handler for
the resulting FD.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 virtproxy.c |   37 +
 1 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/virtproxy.c b/virtproxy.c
index fa17722..20532c2 100644
--- a/virtproxy.c
+++ b/virtproxy.c
@@ -166,6 +166,8 @@ static VPConn *get_conn(const VPDriver *drv, int fd, bool 
client)
 return NULL;
 }
 
+static void vp_channel_accept(void *opaque);
+
 /* get VPOForward by service_id */
 static VPOForward *get_oforward(const VPDriver *drv, const char *service_id)
 {
@@ -193,3 +195,38 @@ static VPIForward *get_iforward(const VPDriver *drv, const 
char *service_id)
 
 return NULL;
 }
+
+/* accept handler for communication channel
+ *
+ * accept()s connection to communication channel (for sockets), and sets
+ * up the read handler for resulting FD.
+ */
+static void vp_channel_accept(void *opaque)
+{
+VPDriver *drv = opaque;
+struct sockaddr_in saddr;
+struct sockaddr *addr;
+socklen_t len;
+int fd;
+
+TRACE(called with opaque: %p, drv);
+
+for(;;) {
+len = sizeof(saddr);
+addr = (struct sockaddr *)saddr;
+fd = qemu_accept(drv-listen_fd, addr, len);
+
+if (fd  0  errno != EINTR) {
+TRACE(accept() failed);
+return;
+} else if (fd = 0) {
+TRACE(accepted connection);
+break;
+}
+}
+
+drv-channel_fd = fd;
+vp_set_fd_handler(drv-channel_fd, vp_channel_read, NULL, drv);
+/* dont accept anymore connections until channel_fd is closed */
+vp_set_fd_handler(drv-listen_fd, NULL, NULL, NULL);
+}
-- 
1.7.0.4

[Qemu-devel] [RFC][PATCH 08/10] virtagent: add agent_viewdmesg command

2010-10-22 Thread Michael Roth

Add commands to view guest dmesg output. Currently it is a 16K buffer.

Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com
---
 hmp-commands.hx |   16 +
 qmp-commands.hx |   35 +++
 virtagent.c |  100 +++
 virtagent.h |3 ++
 4 files changed, 154 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index e9a7f4a..0e7a6c9 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1227,6 +1227,22 @@ STEXI
 Echo the file identified by @var{filepath} on the guest filesystem
 ETEXI
 
+{
+.name   = agent_viewdmesg,
+.args_type  = ,
+.params = ,
+.help   = View guest dmesg output,
+.user_print = do_agent_viewdmesg_print,
+.mhandler.cmd_async = do_agent_viewdmesg,
+.flags  = MONITOR_CMD_ASYNC,
+},
+
+STEXI
+...@item agent_viewdmesg
+...@findex agent_viewdmesg
+View guest dmesg output
+ETEXI
+
 STEXI
 @end table
 ETEXI
diff --git a/qmp-commands.hx b/qmp-commands.hx
index efa2137..dc319b7 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -771,6 +771,41 @@ Example:
 EQMP
 
 {
+.name   = agent_viewdmesg,
+.args_type  = ,
+.params = ,
+.help   = View guest dmesg output,
+.user_print = do_agent_viewdmesg_print,
+.mhandler.cmd_async = do_agent_viewdmesg,
+.flags  = MONITOR_CMD_ASYNC,
+},
+
+STEXI
+...@item agent_viewdmesg
+...@findex agent_viewdmesg
+View guest dmesg output
+ETEXI
+SQMP
+agent_viewdmesg
+
+
+View guest dmesg output
+
+Arguments:
+
+(none)
+
+Example:
+
+- { execute: agent_viewdmesg }
+- { return: {
+   contents: [353487.942215] usb 1-4: USB disconnect, address 9\n...
+ }
+   }
+
+EQMP
+
+{
 .name   = qmp_capabilities,
 .args_type  = ,
 .params = ,
diff --git a/virtagent.c b/virtagent.c
index 8059d00..cae9a85 100644
--- a/virtagent.c
+++ b/virtagent.c
@@ -196,3 +196,103 @@ int do_agent_viewfile(Monitor *mon, const QDict 
*mon_params,
 
 return 0;
 }
+
+void do_agent_viewdmesg_print(Monitor *mon, const QObject *data)
+{
+QDict *qdict;
+const char *contents = NULL;
+int i;
+
+qdict = qobject_to_qdict(data);
+if (!qdict_haskey(qdict, contents)) {
+return;
+}
+
+contents = qdict_get_str(qdict, contents);
+if (contents != NULL) {
+ /* monitor_printf truncates so do it in chunks. also, file_contents
+  * may not be null-termed at proper location so explicitly calc
+  * last chunk sizes */
+for (i = 0; i  strlen(contents); i += 1024) {
+monitor_printf(mon, %.1024s, contents + i);
+}
+}
+monitor_printf(mon, \n);
+}
+
+static void do_agent_viewdmesg_cb(void *opaque)
+{
+RPCRequest *rpc_data = opaque;
+xmlrpc_value *resp = NULL;
+char *dmesg = NULL;
+int ret;
+xmlrpc_env env;
+QDict *qdict = qdict_new();
+
+if (rpc_data-resp_xml == NULL) {
+monitor_printf(rpc_data-mon, error handling RPC request);
+goto out_no_resp;
+}
+
+xmlrpc_env_init(env);
+resp = xmlrpc_parse_response(env, rpc_data-resp_xml,
+ rpc_data-resp_xml_len);
+if (rpc_has_error(env)) {
+ret = -1;
+goto out_no_resp;
+}
+
+xmlrpc_parse_value(env, resp, s, dmesg);
+if (rpc_has_error(env)) {
+ret = -1;
+goto out;
+}
+
+if (dmesg != NULL) {
+qdict_put(qdict, contents, qstring_from_str(dmesg));
+}
+
+out:
+qemu_free(rpc_data-resp_xml);
+xmlrpc_DECREF(resp);
+out_no_resp:
+rpc_data-mon_cb(rpc_data-mon_data, QOBJECT(qdict));
+qemu_free(rpc_data);
+}
+
+/*
+ * do_agent_viewdmesg(): View guest dmesg output
+ */
+int do_agent_viewdmesg(Monitor *mon, const QDict *mon_params,
+  MonitorCompletion cb, void *opaque)
+{
+xmlrpc_env env;
+xmlrpc_value *params;
+RPCRequest *rpc_data;
+int ret;
+
+xmlrpc_env_init(env);
+
+params = xmlrpc_build_value(env, (n));
+if (rpc_has_error(env)) {
+return -1;
+}
+
+rpc_data = qemu_mallocz(sizeof(RPCRequest));
+rpc_data-cb = do_agent_viewdmesg_cb;
+rpc_data-mon = mon;
+rpc_data-mon_cb = cb;
+rpc_data-mon_data = opaque;
+
+ret = rpc_execute(env, getdmesg, params, rpc_data);
+if (ret == -EREMOTE) {
+monitor_printf(mon, RPC Failed (%i): %s\n, env.fault_code,
+   env.fault_string);
+return -1;
+} else if (ret == -1) {
+monitor_printf(mon, RPC communication error\n);
+return -1;
+}
+
+return 0;
+}
diff --git a/virtagent.h b/virtagent.h
index d027eac..82a6924 100644
--- a/virtagent.h
+++ b/virtagent.h
@@ -25,5 +25,8 @@
 void do_agent_viewfile_print(Monitor *mon, const QObject *qobject);
 int do_agent_viewfile(Monitor *mon, const QDict *mon_params,
   MonitorCompletion cb, void

1 2 >

1 - 100 of 146 matches

Mail list logo