Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Ian Campbell
On Tue, 2011-07-19 at 03:53 +0100, Wei Liu wrote:
> On Mon, 2011-07-18 at 11:03 +0100, Ian Campbell wrote:
> > On Mon, 2011-07-18 at 10:51 +0100, Wei Liu wrote:
> > > Bug resend.
> > > 
> > > This bug was reported about one month ago. QEMU fails to start with
> > > Xen unstable. I found that it has not been fix with latest Xen
> > > unstable. BIOS is Seabios (with Xen patch).
> > 
> > Please use current mainline seabios.git -- it does not require any
> > additional patches.
> > 
> > http://wiki.xensource.com/xenwiki/QEMUUpstream also includes an updated
> > SeaBIOS .config which you might try.
> > 
> > Ian.
> > 
> 
> Thanks Ian. This bug is fixed. But I spot new bug with Stefano's
> xen-next QEMU.
> 
> --mainline seabios with xen-next--
> (XEN) HVM7: HVM Loader
> (XEN) HVM7: Detected Xen v4.2-unstable
> (XEN) HVM7: Xenbus rings @0xfeffc000, event channel 2
> (XEN) HVM7: System requested SeaBIOS
> (XEN) HVM7: CPU speed is 2993 MHz
> (XEN) irq.c:264: Dom7 PCI link 0 changed 0 -> 5
> (XEN) HVM7: PCI-ISA link 0 routed to IRQ5
> (XEN) irq.c:264: Dom7 PCI link 1 changed 0 -> 10
> (XEN) HVM7: PCI-ISA link 1 routed to IRQ10
> (XEN) irq.c:264: Dom7 PCI link 2 changed 0 -> 11
> (XEN) HVM7: PCI-ISA link 2 routed to IRQ11
> (XEN) irq.c:264: Dom7 PCI link 3 changed 0 -> 5
> (XEN) HVM7: PCI-ISA link 3 routed to IRQ5
> (XEN) HVM7: *** HVMLoader assertion '(devfn != PCI_ISA_DEVFN) ||
> ((vendor_id == 0x8086) &&
> (XEN) HVM7:  (device_id == 0x7000))' failed at pci.c:78
> (XEN) HVM7: *** HVMLoader crashed.

Anthony posted a patch for this to qemu-devel a few weeks back, I think
it was "hw/piix_pci.c: Fix PIIX3-xen to initialize ids" (did I see a
pull request for it recently? If so then it might be in the main tree by
now...)
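
For reference, the condition that fails in the log can be restated as a small predicate. This is only a sketch: the expression is taken from the assertion text in the log above, but the value of PCI_ISA_DEVFN (0x08, i.e. device 1, function 0) and the helper name isa_bridge_ids_ok() are my assumptions and are not copied from hvmloader.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value (device 1, function 0); not taken from the log. */
#define PCI_ISA_DEVFN 0x08

/* Hypothetical helper restating the asserted condition: any device is
 * acceptable unless it sits at PCI_ISA_DEVFN, in which case it must
 * identify itself as vendor 0x8086, device 0x7000. */
bool isa_bridge_ids_ok(uint8_t devfn, uint16_t vendor_id, uint16_t device_id)
{
    return (devfn != PCI_ISA_DEVFN) ||
           ((vendor_id == 0x8086) && (device_id == 0x7000));
}
```

Under these assumptions, a device at the ISA bridge slot that reports any other ids makes the predicate false, which would be consistent with a patch that initializes the ids making the assertion pass.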

> If I use Anthony's old QEMU tree, qemu-dm-15, HVM boots up. But there
> are issues with irq binding.

I don't know about this one I'm afraid.

Ian.

> --mainline seabios with qemu-dm-15--
> (XEN) irq.c:344: Dom6 callback via changed to Direct Vector 0xf3
> (XEN) irq.c:1979: dom6: pirq 16 or emuirq 8 already mapped
> (XEN) irq.c:1979: dom6: pirq 16 or emuirq 12 already mapped
> (XEN) irq.c:1979: dom6: pirq 16 or emuirq 1 already mapped
> (XEN) irq.c:1979: dom6: pirq 16 or emuirq 6 already mapped
> (XEN) irq.c:1979: dom6: pirq 16 or emuirq 4 already mapped
> (XEN) irq.c:1979: dom6: pirq 16 or emuirq 7 already mapped
> (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
> (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
> (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
> (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
> 
> 
> Wei.
> 
> 





Re: [Qemu-devel] [PATCH] Fix duplicate device reset

2011-07-19 Thread Isaku Yamahata
On Tue, Jul 19, 2011 at 07:56:41AM +0200, Stefan Weil wrote:
> Am 19.07.2011 04:39, schrieb Isaku Yamahata:
>> Thank you for addressing this. Similar patches were proposed and
>> weren't merged unfortunately.
>>
>> The reason why the qdev_register_reset() in vl.c is to keep the reset order.
>> The reset for main_system_bus shouldn't registered by qbus_create_inplace().
>> But the check, bus != main_system_bus, doesn't work as intended because
>> main_system_bus is NULL in early qdev creation.
>> So there are possible ways for the fix.
>>
>> - Don't care the reset order
>>   your patch +
>>   remove "if (bus != main_system_bus)" in qbus_create_inplace()
>>
>> - keep the reset order
>>   - instantiate main_system_bus early.
>>     So the check, bus != main_system_bus in qbus_create_inplace(), will
>>     work.
>>   or
>>   - fix the check, bus != main_system_bus in qbus_create_inplace(), somehow
>>
>> thanks,
>
> Hi,
>
> my patch does not remove sysbus_get_default(),
> so the reset order is kept because main_system_bus
> is instantiated by this call.

Yes, your patch doesn't change the order from the existing code.
But I think the existing order is not the intended one.
During machine creation, someone may call sysbus_get_default() early,
so the reset for main_system_bus may not be the last one registered.

Changeset 80376c3f tried to keep the reset order, but failed.
That's the issue.
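
To illustrate the failure mode, here is a toy model. It is not QEMU code: toy_bus, toy_main_bus, toy_bus_create() and toy_create_main_bus() are made-up names. It only shows that a comparison against a global pointer that is still NULL cannot recognise the main bus while the main bus itself is being created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a bus; not a QEMU type. */
struct toy_bus {
    bool reset_registered;   /* did creation register a reset handler? */
};

/* NULL until the main bus has been created. */
static struct toy_bus *toy_main_bus;

/* Same shape as the check under discussion: register a reset handler
 * for every bus except the main one. */
void toy_bus_create(struct toy_bus *bus)
{
    assert(bus != NULL);
    bus->reset_registered = (bus != toy_main_bus);
}

/* The global is assigned only after creation, so inside
 * toy_bus_create() the comparison is made against NULL. */
struct toy_bus *toy_create_main_bus(struct toy_bus *storage)
{
    toy_bus_create(storage);
    toy_main_bus = storage;
    return storage;
}
```

In this model the main bus ends up with a reset handler registered at creation time, which is exactly what the check was supposed to prevent.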
-- 
yamahata



[Qemu-devel] [PATCH] avoid core reading with bdrv_read (qemu-io)

2011-07-19 Thread Frediano Ziglio
This patch applies to Kevin's coroutine-block branch and avoids a core
dump. It fixes the "qcow: Use coroutines" patch. Test case:

$ ./qemu-img create -f qcow aaa.img 1G
Formatting 'aaa.img', fmt=qcow size=1073741824 encryption=off
$ ./qemu-io aaa.img
qemu-io> read 1024 1024
Segmentation fault

Signed-off-by: Frediano Ziglio 
---
 block/qcow.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 6f7973c..1386e92 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -573,7 +573,8 @@ static int qcow_aio_read_cb(void *opaque)

     if (acb->nb_sectors == 0) {
         /* request completed */
-        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
+        if (acb->orig_buf)
+            qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         return 0;
     }

@@ -648,6 +649,7 @@ static int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,

     if (acb->qiov->niov > 1) {
         qemu_vfree(acb->orig_buf);
+        acb->orig_buf = NULL;
     }
     qemu_aio_release(acb);

@@ -729,6 +731,7 @@ static int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,

     if (acb->qiov->niov > 1) {
         qemu_vfree(acb->orig_buf);
+        acb->orig_buf = NULL;
     }
     qemu_aio_release(acb);

-- 
1.7.1



Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen
On 07/18/11 16:08, Stefan Hajnoczi wrote:
> On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensen  wrote:
>> I have been updating the live snapshot wiki for qemu to try and cover
>> the commands we will want for async snapshot handling too.
>>
>> http://wiki.qemu.org/Features/Snapshots
> 
> Regarding fd passing, do we even support SELinux today with backing files?

Not sure I understand what you mean. The current code should be happy to
take an existing file or a raw device for the snapshot.

Jes



Re: [Qemu-devel] [PATCH 0/4] scsi fixes

2011-07-19 Thread Kevin Wolf
Am 19.07.2011 08:31, schrieb Hannes Reinecke:
> On 07/12/2011 03:37 PM, Kevin Wolf wrote:
>> Am 11.07.2011 15:02, schrieb Hannes Reinecke:
>>> Hi all,
>>>
>>> these are some fixes I found during debugging my megasas HBA emulation.
>>> This time I've sent them as a separate patchset for inclusion.
>>> All of them have been acked, so please apply.
>>>
>>> Hannes Reinecke (4):
>>>iov: Update parameter usage in iov_(to|from)_buf()
>>>scsi: Add 'hba_private' to SCSIRequest
>>>scsi-disk: Fixup debugging statement
>>>scsi-disk: Mask out serial number EVPD
>>
>> Thanks, applied all to the block branch.
>>
> Any chance to have them pulled into the main tree?
> My megasas emulation relies on them, and it feels a bit
> stupid to send a patch relying on some fixes not in mainline.
> At the same time it's really stupid to resend the entire
> patchset again ...

I'm hoping to send a pull request today, now that the VMDK patches look
good finally.

Anyway, I don't think that not having the patches in master yet should
stop you from going forward with the next patches. They will go through
the block tree anyway, so basing them on that tree is fine (and you
wouldn't be the first one to do that). Just state in PATCH 0/n for the
reviewers that it depends on the other patches.

Kevin



Re: [Qemu-devel] [PATCH] avoid core reading with bdrv_read (qemu-io)

2011-07-19 Thread Kevin Wolf
Am 19.07.2011 09:33, schrieb Frediano Ziglio:
> This patch apply to kevin coroutine-block branch and avoid code. It
> fix "qcow: Use coroutines" patch. Test case:
> 
> $ ./qemu-img create -f qcow aaa.img 1G
> Formatting 'aaa.img', fmt=qcow size=1073741824 encryption=off
> $ ./qemu-io aaa.img
> qemu-io> read 1024 1024
> Segmentation fault
> 
> Signed-off-by: Frediano Ziglio 

Thanks for the report. I'll update the patch, but in a slightly
different way that matches the old code better:

diff --git a/block/qcow.c b/block/qcow.c
index 6f7973c..6447c2a 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -573,7 +573,6 @@ static int qcow_aio_read_cb(void *opaque)

     if (acb->nb_sectors == 0) {
         /* request completed */
-        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         return 0;
     }

@@ -647,6 +646,7 @@ static int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
     qemu_co_mutex_unlock(&s->lock);

     if (acb->qiov->niov > 1) {
+        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         qemu_vfree(acb->orig_buf);
     }
     qemu_aio_release(acb);



[Qemu-devel] [PULL] virtio-serial: Fixes, trace points

2011-07-19 Thread Amit Shah
Hi Anthony,

Please pull for trace points for virtio-serial/console code and a fix
for a host process closing chardev connection causing an abort().

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

  Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 19:43:00 +)

are available in the git repository at:
  git://git.kernel.org/pub/scm/qemu/amit/virtio-serial.git for-anthony

Amit Shah (4):
  virtio-serial-bus: Add trace events
  virtio-console: Add some trace events
  virtio-serial-bus: Fix trailing \n in error_report string
  virtio-console: Prevent abort()s in case of host chardev close

 hw/virtio-console.c|   25 +++--
 hw/virtio-serial-bus.c |9 -
 trace-events   |   11 +++
 3 files changed, 42 insertions(+), 3 deletions(-)


Amit



[Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Frediano Ziglio
Hi,
  I'm familiarizing myself with the block I/O layer and decided to test
the coroutine branch because I find it easier to use than the normal
callbacks. In the normal code there are a lot of lines that only
save/restore state and declare callbacks, and it is not easy to
follow the normal flow. In the end I would like to create a new
image format to get rid of some performance problems I encounter using
writethrough and snapshots. I have some questions regarding block I/O and
also coroutines.

1- Threading model. I don't understand it. I can see that the aio pool
routines do not contain locking code, so I think the aio layer is mainly
executed in a single thread. I saw that some locking was introduced with
coroutines, so I think coroutines are now called from different threads
and need locks (the current implementation serializes all device
operations).

2- Memory considerations on coroutines. Besides the fact that coroutines
allow more readable code, I wonder if somebody has considered memory. A
separate stack has to be allocated for every coroutine. For instance, the
ucontext and win32 implementations use 4 MB. Assuming 128 concurrent AIO
requests, this requires about 512 MB of RAM (mostly only committed but not
used, and coroutines are reused).
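
The estimate is a plain multiplication, sketched below. The 4 MB stack size and the 128 concurrent requests are the figures from the paragraph above; the function name coroutine_stack_reservation() is made up for illustration. The result is a bound on reserved address space, not on memory actually touched.

```c
#include <assert.h>
#include <stdint.h>

/* Address space reserved for coroutine stacks, in bytes: one stack of
 * stack_bytes for each of the given number of coroutines. */
uint64_t coroutine_stack_reservation(uint64_t stack_bytes, uint64_t coroutines)
{
    /* Guard against overflow of the 64-bit product. */
    assert(stack_bytes == 0 || coroutines <= UINT64_MAX / stack_bytes);
    return stack_bytes * coroutines;
}
```

With 4 MiB stacks and 128 coroutines this gives 512 MiB.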

About snapshots and block I/O, I think that using "external snapshots"
would make some things easier. By "external snapshot" I mean
creating a new image with the current image file as its backing file and
using this new image for future operations. This would allow, for instance:
- supporting snapshots with every format (even raw)
- making snapshot backups using external programs (even from different
hosts using a clustered file system, and without many locking issues as
the original image is now read-only)
- converting images live (just snapshot, qemu-img convert, remove snapshot)

Regards
  Frediano



Re: [Qemu-devel] [PATCH 0/4] scsi fixes

2011-07-19 Thread Hannes Reinecke

On 07/19/2011 09:39 AM, Kevin Wolf wrote:
> Am 19.07.2011 08:31, schrieb Hannes Reinecke:
>> On 07/12/2011 03:37 PM, Kevin Wolf wrote:
>>> Am 11.07.2011 15:02, schrieb Hannes Reinecke:
>>>> Hi all,
>>>>
>>>> these are some fixes I found during debugging my megasas HBA emulation.
>>>> This time I've sent them as a separate patchset for inclusion.
>>>> All of them have been acked, so please apply.
>>>>
>>>> Hannes Reinecke (4):
>>>>    iov: Update parameter usage in iov_(to|from)_buf()
>>>>    scsi: Add 'hba_private' to SCSIRequest
>>>>    scsi-disk: Fixup debugging statement
>>>>    scsi-disk: Mask out serial number EVPD
>>>
>>> Thanks, applied all to the block branch.
>>>
>> Any chance to have them pulled into the main tree?
>> My megasas emulation relies on them, and it feels a bit
>> stupid to send a patch relying on some fixes not in mainline.
>> At the same time it's really stupid to resend the entire
>> patchset again ...
>
> I'm hoping to send a pull request today, now that the VMDK patches look
> good finally.
>
> Anyway, I don't think that not having the patches in master yet should
> stop you from going forward with the next patches. They will go through
> the block tree anyway, so basing them on that tree is fine (and you
> wouldn't be the first one to do that). Just state in PATCH 0/n for the
> reviewers that it depends on the other patches.
>

Well, the remaining patch is 'just' the megasas emulation itself.
And I want to make reviewing that as easy as possible, so that it's
not held up again by complaints about missing patches.


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



Re: [Qemu-devel] [PATCH] avoid core reading with bdrv_read (qemu-io)

2011-07-19 Thread Frediano Ziglio
2011/7/19 Kevin Wolf :
> Am 19.07.2011 09:33, schrieb Frediano Ziglio:
>> This patch apply to kevin coroutine-block branch and avoid code. It
>> fix "qcow: Use coroutines" patch. Test case:
>>
>> $ ./qemu-img create -f qcow aaa.img 1G
>> Formatting 'aaa.img', fmt=qcow size=1073741824 encryption=off
>> $ ./qemu-io aaa.img
>> qemu-io> read 1024 1024
>> Segmentation fault
>>
>> Signed-off-by: Frediano Ziglio 
>
> Thanks for the report. I'll update the patch, but in a slightly
> different way that matches the old code better:
>
> diff --git a/block/qcow.c b/block/qcow.c
> index 6f7973c..6447c2a 100644
> --- a/block/qcow.c
> +++ b/block/qcow.c
> @@ -573,7 +573,6 @@ static int qcow_aio_read_cb(void *opaque)
>
>     if (acb->nb_sectors == 0) {
>         /* request completed */
> -        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
>         return 0;
>     }
>
> @@ -647,6 +646,7 @@ static int qcow_co_readv(BlockDriverState *bs,
> int64_t sector_num,
>     qemu_co_mutex_unlock(&s->lock);
>
>     if (acb->qiov->niov > 1) {
> +        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
>         qemu_vfree(acb->orig_buf);
>     }
>     qemu_aio_release(acb);
>

Yes, my patch also removed some dangling pointers, which I don't like
but which are not a problem with the current code.
In case of ret < 0 (error), your code could copy data that is probably
not initialized. I don't know if the data is used in case of failure, but
if the memory is shared with the guest (I don't know, perhaps when using
virtio), this could lead to security issues. Also, some memory debuggers
like valgrind might not like that copy.
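
The concern about copying on failure can be sketched as follows. This is a standalone illustration rather than a patch against block/qcow.c: finish_read() and its parameters are hypothetical, and memcpy()/free() stand in for qemu_iovec_from_buffer()/qemu_vfree().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical completion step for a read that went through a
 * temporary bounce buffer. The data is copied to the caller's buffer
 * only when the read succeeded (ret >= 0). The bounce buffer is freed
 * in both cases and the pointer is cleared, so no dangling pointer is
 * left behind. */
int finish_read(int ret, void *dest, void **bounce, size_t len)
{
    assert(dest != NULL && bounce != NULL);
    if (*bounce != NULL) {
        if (ret >= 0) {
            memcpy(dest, *bounce, len);
        }
        free(*bounce);
        *bounce = NULL;
    }
    return ret;
}
```

On error the destination keeps whatever it contained before, so nothing uninitialized is exposed to the caller.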

Frediano



Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Wei Liu
On Tue, 2011-07-19 at 08:14 +0100, Ian Campbell wrote:
> On Tue, 2011-07-19 at 03:53 +0100, Wei Liu wrote:
> > On Mon, 2011-07-18 at 11:03 +0100, Ian Campbell wrote:
> > > On Mon, 2011-07-18 at 10:51 +0100, Wei Liu wrote:
> > > > Bug resend.
> > > > 
> > > > This bug was reported about one month ago. QEMU fails to start with
> > > > Xen unstable. I found that it has not been fix with latest Xen
> > > > unstable. BIOS is Seabios (with Xen patch).
> > > 
> > > Please use current mainline seabios.git -- it does not require any
> > > additional patches.
> > > 
> > > http://wiki.xensource.com/xenwiki/QEMUUpstream also includes an updated
> > > SeaBIOS .config which you might try.
> > > 
> > > Ian.
> > > 
> > 
> > Thanks Ian. This bug is fixed. But I spot new bug with Stefano's
> > xen-next QEMU.
> > 
> > --mainline seabios with xen-next--
> > (XEN) HVM7: HVM Loader
> > (XEN) HVM7: Detected Xen v4.2-unstable
> > (XEN) HVM7: Xenbus rings @0xfeffc000, event channel 2
> > (XEN) HVM7: System requested SeaBIOS
> > (XEN) HVM7: CPU speed is 2993 MHz
> > (XEN) irq.c:264: Dom7 PCI link 0 changed 0 -> 5
> > (XEN) HVM7: PCI-ISA link 0 routed to IRQ5
> > (XEN) irq.c:264: Dom7 PCI link 1 changed 0 -> 10
> > (XEN) HVM7: PCI-ISA link 1 routed to IRQ10
> > (XEN) irq.c:264: Dom7 PCI link 2 changed 0 -> 11
> > (XEN) HVM7: PCI-ISA link 2 routed to IRQ11
> > (XEN) irq.c:264: Dom7 PCI link 3 changed 0 -> 5
> > (XEN) HVM7: PCI-ISA link 3 routed to IRQ5
> > (XEN) HVM7: *** HVMLoader assertion '(devfn != PCI_ISA_DEVFN) ||
> > ((vendor_id == 0x8086) &&
> > (XEN) HVM7:  (device_id == 0x7000))' failed at pci.c:78
> > (XEN) HVM7: *** HVMLoader crashed.
> 
> Anthony posted a patch for this to qemu-devel a few weeks back, I think
> it was "hw/piix_pci.c: Fix PIIX3-xen to initialize ids" (did I see a
> pull request for it recently? If so then it might be in the main tree by
> now...)
> 

Good, this is it.

But this patch has not been pulled into the tree yet.

Wei.





Re: [Qemu-devel] [PATCH 0/4] scsi fixes

2011-07-19 Thread Alexander Graf

On 19.07.2011, at 10:10, Hannes Reinecke wrote:

> On 07/19/2011 09:39 AM, Kevin Wolf wrote:
>> Am 19.07.2011 08:31, schrieb Hannes Reinecke:
>>> On 07/12/2011 03:37 PM, Kevin Wolf wrote:
>>>> Am 11.07.2011 15:02, schrieb Hannes Reinecke:
>>>>> Hi all,
>>>>>
>>>>> these are some fixes I found during debugging my megasas HBA emulation.
>>>>> This time I've sent them as a separate patchset for inclusion.
>>>>> All of them have been acked, so please apply.
>>>>>
>>>>> Hannes Reinecke (4):
>>>>>    iov: Update parameter usage in iov_(to|from)_buf()
>>>>>    scsi: Add 'hba_private' to SCSIRequest
>>>>>    scsi-disk: Fixup debugging statement
>>>>>    scsi-disk: Mask out serial number EVPD
>>>>
>>>> Thanks, applied all to the block branch.
>>>>
>>> Any chance to have them pulled into the main tree?
>>> My megasas emulation relies on them, and it feels a bit
>>> stupid to send a patch relying on some fixes not in mainline.
>>> At the same time it's really stupid to resend the entire
>>> patchset again ...
>> 
>> I'm hoping to send a pull request today, now that the VMDK patches look
>> good finally.
>> 
>> Anyway, I don't think that not having the patches in master yet should
>> stop you from going forward with the next patches. They will go through
>> the block tree anyway, so basing them on that tree is fine (and you
>> wouldn't be the first one to do that). Just state in PATCH 0/n for the
>> reviewers that it depends on the other patches.
>> 
> Well, the remaining patch is 'just' the megasas emulation itself.
> And I want to make reviewing that as easy as possible, so that it's not again 
> being held off by complains about missing patches.

Yes, no worries. Just state in the patch description that it's based on the 
block branch and actually do base it on it - the endianness specific ld./st. 
patches are upstream, so everything you need should be in that branch.


Alex




Re: [Qemu-devel] [PATCH] USB: add usb network redirection support

2011-07-19 Thread Hans de Goede

Hi,

On 07/18/2011 04:33 PM, Gerd Hoffmann wrote:
> On 07/18/11 09:13, Hans de Goede wrote:
>> This patch adds support for a usb-redir device, which takes a chardev
>> as a communication channel to an actual usbdevice using the usbredir protocol.
>>
>> Compiling the usb-redir device requires usbredir-0.3 to be installed for
>> the usbredir protocol parser, usbredir-0.3 also contains a server for
>> redirecting usb traffic from an actual usb device. You can get the 0.3
>> release of usbredir here:
>> http://people.fedoraproject.org/~jwrdegoede/usbredir-0.3.tar.bz2
>> (getting a more formal site for it is a WIP)
>
> Looks good overall. scripts/checkpatch.pl has a bunch of codestyle complains
> which need to be fixed.


Sorry, I should have thought of running checkpatch myself before submitting;
a new version is coming up.

Regards,

Hans



[Qemu-devel] [PATCH] USB: add usb network redirection support

2011-07-19 Thread Hans de Goede
This patch adds support for a usb-redir device, which takes a chardev
as a communication channel to an actual usbdevice using the usbredir protocol.

Compiling the usb-redir device requires usbredir-0.3 to be installed for
the usbredir protocol parser, usbredir-0.3 also contains a server for
redirecting usb traffic from an actual usb device. You can get the 0.3
release of usbredir here:
http://people.fedoraproject.org/~jwrdegoede/usbredir-0.3.tar.bz2
(getting a more formal site for it is a WIP)

Example usage:
1) Start usbredirserver for a usb device:
sudo usbredirserver 045e:0772
2) Start qemu with usb2 support + a chardev talking to usbredirserver +
   a usb-redir device using this chardev:
qemu ... \
  -readconfig docs/ich9-ehci-uhci.cfg \
  -chardev socket,id=usbredirchardev,host=localhost,port=4000 \
  -device usb-redir,chardev=usbredirchardev,id=usbredirdev

Signed-off-by: Hans de Goede 
---
 Makefile.objs |1 +
 configure |   28 ++
 usb-redir.c   | 1218 +
 3 files changed, 1247 insertions(+), 0 deletions(-)
 create mode 100644 usb-redir.c

diff --git a/Makefile.objs b/Makefile.objs
index cea15e4..ad69fbc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -205,6 +205,7 @@ hw-obj-$(CONFIG_HPET) += hpet.o
 hw-obj-$(CONFIG_APPLESMC) += applesmc.o
 hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
 hw-obj-$(CONFIG_SMARTCARD_NSS) += ccid-card-emulated.o
+hw-obj-$(CONFIG_USB_REDIR) += usb-redir.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/configure b/configure
index 88159ac..843bbd8 100755
--- a/configure
+++ b/configure
@@ -177,6 +177,7 @@ spice=""
 rbd=""
 smartcard=""
 smartcard_nss=""
+usb_redir=""
 opengl=""
 
 # parse CC options first
@@ -743,6 +744,10 @@ for opt do
   ;;
   --enable-smartcard-nss) smartcard_nss="yes"
   ;;
+  --disable-usb-redir) usb_redir="no"
+  ;;
+  --enable-usb-redir) usb_redir="yes"
+  ;;
   *) echo "ERROR: unknown option $opt"; show_help="yes"
   ;;
   esac
@@ -1018,6 +1023,8 @@ echo "  --disable-smartcard  disable smartcard support"
 echo "  --enable-smartcard   enable smartcard support"
 echo "  --disable-smartcard-nss  disable smartcard nss support"
 echo "  --enable-smartcard-nss   enable smartcard nss support"
+echo "  --disable-usb-redir  disable usb network redirection support"
+echo "  --enable-usb-redir   enable usb network redirection support"
 echo ""
 echo "NOTE: The object files are built at the place where configure is launched"
 exit 1
@@ -2371,6 +2378,22 @@ if test "$smartcard" = "no" ; then
 smartcard_nss="no"
 fi
 
+# check for usbredirparser for usb network redirection support
+if test "$usb_redir" != "no" ; then
+if $pkg_config libusbredirparser >/dev/null 2>&1 ; then
+usb_redir="yes"
+usb_redir_cflags=$($pkg_config --cflags libusbredirparser 2>/dev/null)
+usb_redir_libs=$($pkg_config --libs libusbredirparser 2>/dev/null)
+QEMU_CFLAGS="$QEMU_CFLAGS $usb_redir_cflags"
+LIBS="$LIBS $usb_redir_libs"
+else
+if test "$usb_redir" = "yes"; then
+feature_not_found "usb-redir"
+fi
+usb_redir="no"
+fi
+fi
+
 ##
 
 ##
@@ -2617,6 +2640,7 @@ echo "spice support $spice"
 echo "rbd support   $rbd"
 echo "xfsctl support$xfs"
 echo "nss used  $smartcard_nss"
+echo "usb net redir $usb_redir"
 echo "OpenGL support$opengl"
 
 if test $sdl_too_old = "yes"; then
@@ -2910,6 +2934,10 @@ if test "$smartcard_nss" = "yes" ; then
   echo "CONFIG_SMARTCARD_NSS=y" >> $config_host_mak
 fi
 
+if test "$usb_redir" = "yes" ; then
+  echo "CONFIG_USB_REDIR=y" >> $config_host_mak
+fi
+
 if test "$opengl" = "yes" ; then
   echo "CONFIG_OPENGL=y" >> $config_host_mak
 fi
diff --git a/usb-redir.c b/usb-redir.c
new file mode 100644
index 000..e212993
--- /dev/null
+++ b/usb-redir.c
@@ -0,0 +1,1218 @@
+/*
+ * USB redirector usb-guest
+ *
+ * Copyright (c) 2011 Red Hat, Inc.
+ *
+ * Red Hat Authors:
+ * Hans de Goede 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIAB

[Qemu-devel] External COW format for raw images

2011-07-19 Thread Robert Wang
As you know, the raw image format is very popular, but it does
NOT support copy-on-write. A raw image file can NOT be used as a copy
destination, so image streaming/Live Block Copy will NOT work with it.

To fix this, we need to add a new block driver, raw-cow, to QEMU. Once it
is finished, we could use qemu-img like this:
qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img
my_vm.raw-cow

1) ubuntu.img is the backing file, my_vm.img is a raw file,
my_vm.raw-cow stores a COW bitmap related to my_vm.img.

2) Once every bit in the COW bitmap is marked dirty, all data can be
read from my_vm.img, and ubuntu.img and my_vm.raw-cow can be ignored
from then on.

To implement this, I think I can follow these steps:
1) Add a new member to the BlockDriverState struct:
char raw_file[1024];
This member will track the raw_file parameter given for the raw-cow file
on the command line.

2)  * Create a new file block/raw-cow.c. It will be much like a
mixture of block/cow.c and block/raw.c.

So I will change some functions in cow.c and raw.c to non-static, so that
raw-cow.c can reuse them. When a read operation occurs, determine whether
the dirty flag in the raw-cow image is set. If it is, read directly from
the raw file. After a write operation, set the related dirty flag in the
raw-cow image. Other functions might also need to be modified.

* Of course, the format_name member of the BlockDriver struct will be "raw-cow".
In order to keep the relationship with the raw file (like my_vm.img),
the raw_cow_header struct should be
struct raw_cow_header {
    uint32_t magic;
    uint32_t version;
    char backing_file[1024];
    char raw_file[1024];    /* added */
    int32_t mtime;
    uint64_t size;
    uint32_t sectorsize;
};
* The raw_cow_create_options struct should be cow_create_options plus one
additional member:
{
    .name = BLOCK_OPT_RAW_FILE,
    .type = OPT_STRING,
    .help = "Raw file name"
},

3) Add a bdrv_get_raw_filename call to the img_info function of qemu-img.c.
In bdrv_get_raw_filename, if the format of the image file is "raw-cow",
print the related raw file.
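
The read path described in step 2 could look roughly like the following. It is only a sketch of the idea, not a proposed raw-cow.c: the bitmap layout (one bit per sector, least significant bit first) and the names sector_is_dirty(), mark_sector_dirty() and pick_read_source() are my assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum read_source { READ_FROM_BACKING, READ_FROM_RAW };

/* Has this sector already been written to the raw file? */
bool sector_is_dirty(const uint8_t *bitmap, uint64_t sector)
{
    assert(bitmap != NULL);
    return (bitmap[sector / 8] >> (sector % 8)) & 1;
}

/* To be called after a write to the raw file has completed. */
void mark_sector_dirty(uint8_t *bitmap, uint64_t sector)
{
    assert(bitmap != NULL);
    bitmap[sector / 8] |= (uint8_t)(1u << (sector % 8));
}

/* Dirty sectors are read from the raw file, all others from the
 * backing file. */
enum read_source pick_read_source(const uint8_t *bitmap, uint64_t sector)
{
    return sector_is_dirty(bitmap, sector) ? READ_FROM_RAW
                                           : READ_FROM_BACKING;
}
```

The caller is responsible for sizing the bitmap so that every valid sector number has a bit.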

Do you think my approach is right?
Thank you.




[Qemu-devel] Using checkpatch.pl to check coding style added to wiki

2011-07-19 Thread Stefan Hajnoczi
I just updated SubmitAPatch to mention running scripts/checkpatch.pl
before submitting patches:
http://wiki.qemu.org/Contribute/SubmitAPatch

"Follow the coding style and run scripts/checkpatch.pl before submitting"

Checkpatch.pl makes it easy to follow the coding style and eliminates
the "please add a curly brace and resend the patch" hassle :).

Stefan



Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Stefan Hajnoczi
2011/7/19 Robert Wang :
> 2)      * Create a new file block/raw-cow.c. It will be much more like the
> mixture of block/cow.c and block/raw.c.
>
> So I will change some functions in cow.c and raw.c to none-static, then
> raw-cow.c can re-use them. When read operation occurs, determine whether
> dirty flag in raw-cow image is set. If true, read directly from the raw
> file. After write operation, set related dirty flag in raw-cow image.
> And other functions might also be modified.

The block/cow.c driver is inefficient because it does I/O for each
bitmap set/test operation.  I think doing this more efficiently means
basically rewriting the bitmap code to keep a writethrough bitmap in
memory.
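
A minimal sketch of what a writethrough bitmap kept in memory could look like. Nothing here comes from block/cow.c: struct wt_bitmap, its fixed size, and the disk[] array with a write counter (standing in for real I/O) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WT_BITMAP_BYTES 64

struct wt_bitmap {
    uint8_t cache[WT_BITMAP_BYTES];  /* in-memory copy, always current */
    uint8_t disk[WT_BITMAP_BYTES];   /* stand-in for the on-disk bitmap */
    unsigned disk_writes;            /* number of simulated write I/Os */
};

/* Test a bit using memory only; no I/O is needed. */
bool wt_bitmap_test(const struct wt_bitmap *b, uint64_t bit)
{
    assert(b != NULL && bit < WT_BITMAP_BYTES * 8);
    return (b->cache[bit / 8] >> (bit % 8)) & 1;
}

/* Set a bit. If it is already set, nothing is written. Otherwise the
 * cache is updated and the changed byte is written through at once. */
void wt_bitmap_set(struct wt_bitmap *b, uint64_t bit)
{
    if (wt_bitmap_test(b, bit)) {
        return;
    }
    b->cache[bit / 8] |= (uint8_t)(1u << (bit % 8));
    b->disk[bit / 8] = b->cache[bit / 8];
    b->disk_writes++;
}
```

Tests never touch the disk, and repeated sets of the same bit cost one write in total.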

Regarding the file header, the mtime field is not really useful - there is
no interface to read it and no feature makes use of mtime.  The
sector_size field could also be dropped.  The true sector size comes
from the underlying storage that contains this image file.  Especially
in the cache=none (O_DIRECT) case we need to honor the underlying
sector size and I'm not sure the sector_size field helps.

Stefan



Re: [Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Kevin Wolf
Am 19.07.2011 10:06, schrieb Frediano Ziglio:
>   I'm exercise myself in block I/O layer and I decided to test
> coroutine branch cause I find it easier to use instead of normal
> callback. Looking at normal code there are a lot of rows in source to
> save/restore state and declare callbacks and is not that easier to
> understand the normal flow. 

Yes. This is one of the reasons why we're trying to switch to
coroutines. QED is a prototype for a fully asynchronous callback-based
image format, and sometimes it's really hard to follow its code paths.
That the real functionality gets lost in the noise of transferring state
doesn't really help with readability either.

> At the end I would like to create a new
> image format to get rid of some performance problem I encounter using
> writethrough and snapshots. I have some questions regard block I/O and
> also coroutines

No. A new image format is the wrong answer, whatever the question may
be. :-)

If writethrough doesn't perform well with the existing format drivers,
fix the existing format drivers. You need very good reasons to convince
me that qcow2 can't do what your new format could do.

The solution for slow writethrough mode in qcow2 is probably to make
requests parallel, even if they touch metadata. This is a change that
becomes possible relatively easily once we have switched to coroutines.

What exactly is the problem with snapshots? Saving/loading internal
snapshots is too slow, or general performance with an image that has
snapshots? I think Luiz reported the first one a while ago, and it
should be easy enough to fix (use Qcow2Cache in writeback mode during
the refcount update).

> 1- threading model. I don't understand it. I can see that aio pool
> routines does not contain locking code so I think aio layer is mainly
> executed in a single thread. I saw introduction of some locking using
> coroutines so I think coroutines are now called from different threads
> and needs lock (current implementation serialize all device
> operations)

You can view coroutines as threads with cooperative scheduling. That is,
unlike threads a coroutine is never interrupted by a scheduler, but it
can only call qemu_coroutine_yield(), which transfers control to a
different coroutine. Compared to threads this simplifies locking a bit
because you exactly know at which point other code may run.

But of course, even though you know where it happens, you have other
code running in the middle of your function,  so there can be a need to
lock things, which is why there are things like a CoMutex.

They are still all running in the same thread.
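
To make the cooperative scheduling concrete, here is a small standalone demo built on POSIX ucontext (one of the backends mentioned in this thread). It is not QEMU's coroutine API: demo_yield(), demo_coroutine() and demo_run() are made-up names, and the 64 KiB stack is an arbitrary choice for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024)

static ucontext_t caller_ctx, co_ctx;
static char co_stack[DEMO_STACK_SIZE];
static char trace[16];
static size_t trace_len;

/* Record one step so the interleaving can be inspected afterwards. */
static void trace_add(char c)
{
    assert(trace_len + 1 < sizeof(trace));
    trace[trace_len++] = c;
    trace[trace_len] = '\0';
}

/* Hand control back to whoever entered the coroutine. */
static void demo_yield(void)
{
    swapcontext(&co_ctx, &caller_ctx);
}

/* Coroutine body: runs without interruption between yields. */
static void demo_coroutine(void)
{
    trace_add('a');
    demo_yield();
    trace_add('b');
    demo_yield();
    trace_add('c');
    /* Returning resumes uc_link, i.e. the caller. */
}

/* Enter the coroutine three times, doing caller work in between. */
const char *demo_run(void)
{
    trace_len = 0;
    trace[0] = '\0';
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &caller_ctx;
    makecontext(&co_ctx, demo_coroutine, 0);

    for (int i = 0; i < 3; i++) {
        swapcontext(&caller_ctx, &co_ctx);
        trace_add((char)('1' + i));
    }
    return trace;
}
```

demo_run() returns "a1b2c3": control changes hands only at the explicit swapcontext() calls, and everything happens in a single thread.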

> 2- memory considerations on coroutines. Beside coroutines allow more
> readable code I wonder if somebody considered memory. For every
> coroutines a different stack has to be allocated. For instance
> ucontext and win32 implementation use 4mb. Assuming 128 concurrent AIO
> this require about 512mb of ram (mostly only committed but not used
> and coroutines are reused).

128 concurrent requests is a lot. And even then, it's only virtual
memory. I doubt that we're actually using much more than we do in the
old code with the AIOCBs (which will disappear and become local
variables when we complete the conversion).

> About snapshot and block i/o I think that using "external snapshot"
> would help making some stuff easier. By "external snapshot" I mean
> creating a new image with backing file as current image file and using
> this new image for future operations. This would allow for instance
> - support snapshot with every format (even raw)
> - making snapshot backup using external programs (even from different
> hosts using clustered file system and without many locking issues as
> original image is now read-only)
> - convert images live (just snapshot, qemu-img convert, remove snapshot)

These are things that are being actively worked on. snapshot_blkdev is a
monitor command that already exists and does exactly what you describe.
For the rest, live block copy and image streaming are the keywords that
you should be looking for. We've had quite some discussions on these in
the past few weeks. You may also be interested in this wiki page:
http://wiki.qemu.org/Features/LiveBlockMigration

Kevin



[Qemu-devel] [PULL 00/21] Block patches

2011-07-19 Thread Kevin Wolf
The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

  Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 19:43:00 
+)

are available in the git repository at:
  git://repo.or.cz/qemu/kevin.git for-anthony

Devin Nakamura (2):
  qemu-io: Fix formatting
  qemu-io: Fix if scoping bug

Fam Zheng (12):
  VMDK: introduce VmdkExtent
  VMDK: bugfix, align offset to cluster in get_whole_cluster
  VMDK: probe for monolithicFlat images
  VMDK: separate vmdk_open by format version
  VMDK: add field BDRVVmdkState.desc_offset
  VMDK: flush multiple extents
  VMDK: move 'static' cid_update flag to bs field
  VMDK: change get_cluster_offset return type
  VMDK: open/read/write for monolithicFlat image
  VMDK: create different subformats
  VMDK: fix coding style
  block: add bdrv_get_allocated_file_size() operation

Hannes Reinecke (4):
  iov: Update parameter usage in iov_(to|from)_buf()
  scsi: Add 'hba_private' to SCSIRequest
  scsi-disk: Fixup debugging statement
  scsi-disk: Mask out serial number EVPD

Luiz Capitulino (2):
  qemu-options.hx: Document missing -drive options
  qemu-config: Document -drive options

MORITA Kazutaka (1):
  sheepdog: add full data preallocation support

 block.c|   19 +
 block.h|1 +
 block/raw-posix.c  |   21 +
 block/raw-win32.c  |   29 +
 block/sheepdog.c   |   71 ++-
 block/vmdk.c   | 1297 
 block_int.h|2 +
 hw/esp.c   |2 +-
 hw/lsi53c895a.c|   22 +-
 hw/scsi-bus.c  |9 +-
 hw/scsi-disk.c |   21 +-
 hw/scsi-generic.c  |5 +-
 hw/scsi.h  |   10 +-
 hw/spapr_vscsi.c   |   29 +-
 hw/usb-msd.c   |9 +-
 hw/virtio-net.c|2 +-
 hw/virtio-serial-bus.c |2 +-
 iov.c  |   49 +-
 iov.h  |   10 +-
 qemu-config.c  |6 +
 qemu-img.c |   31 +-
 qemu-io.c  | 2653 
 qemu-options.hx|8 +
 23 files changed, 2462 insertions(+), 1846 deletions(-)



[Qemu-devel] [PATCH 12/21] VMDK: probe for monolithicFlat images

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

Probe with the same behavior as VMware.
Recognize an image as a monolithicFlat descriptor file when the file is
text and the first effective line (not a comment line starting with '#' or
a blank line) is either 'version=1' or 'version=2'. No spaces or upper case
characters are accepted.
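
The rule above can be sketched as a standalone function. This is a
simplified restatement of the check in the diff below, not the driver code
itself: probe_descriptor() and starts_with() are hypothetical names, and the
return values 100 and 0 follow the convention used by vmdk_probe().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the `avail` bytes at `p` begin with the string `line`. */
static int starts_with(const char *p, size_t avail, const char *line)
{
    size_t n = strlen(line);
    return avail >= n && strncmp(p, line, n) == 0;
}

/* Hypothetical helper: 100 if buf looks like a descriptor file, else 0. */
int probe_descriptor(const char *buf, size_t len)
{
    const char *p = buf;
    const char *end = buf + len;

    assert(buf != NULL);
    while (p < end) {
        if (*p == '#') {
            /* Comment line: skip up to and including the newline. */
            while (p < end && *p != '\n') {
                p++;
            }
            if (p < end) {
                p++;
            }
            continue;
        }
        if (*p == ' ') {
            /* Only a line made of spaces may precede the version line. */
            while (p < end && *p == ' ') {
                p++;
            }
            if (p < end && *p == '\r') {
                p++;            /* tolerate Windows line endings */
            }
            if (p == end || *p != '\n') {
                return 0;
            }
            p++;
            continue;
        }
        /* First effective line: must be exactly version=1 or version=2. */
        if (starts_with(p, (size_t)(end - p), "version=1\n") ||
            starts_with(p, (size_t)(end - p), "version=2\n") ||
            starts_with(p, (size_t)(end - p), "version=1\r\n") ||
            starts_with(p, (size_t)(end - p), "version=2\r\n")) {
            return 100;
        }
        return 0;
    }
    return 0;
}
```

Under this sketch a buffer such as "# Disk DescriptorFile\nversion=1\n" is
accepted, while "Version=1\n" and "version = 1\n" are rejected.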

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/vmdk.c |   45 +++--
 1 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 03a4619..f8a815c 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -103,10 +103,51 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, 
const char *filename)
 return 0;
 magic = be32_to_cpu(*(uint32_t *)buf);
 if (magic == VMDK3_MAGIC ||
-magic == VMDK4_MAGIC)
+magic == VMDK4_MAGIC) {
 return 100;
-else
+} else {
+const char *p = (const char *)buf;
+const char *end = p + buf_size;
+while (p < end) {
+if (*p == '#') {
+/* skip comment line */
+while (p < end && *p != '\n') {
+p++;
+}
+p++;
+continue;
+}
+if (*p == ' ') {
+while (p < end && *p == ' ') {
+p++;
+}
+/* skip '\r' if windows line endings used. */
+if (p < end && *p == '\r') {
+p++;
+}
+/* only accept blank lines before 'version=' line */
+if (p == end || *p != '\n') {
+return 0;
+}
+p++;
+continue;
+}
+if (end - p >= strlen("version=X\n")) {
+if (strncmp("version=1\n", p, strlen("version=1\n")) == 0 ||
+strncmp("version=2\n", p, strlen("version=2\n")) == 0) {
+return 100;
+}
+}
+if (end - p >= strlen("version=X\r\n")) {
+if (strncmp("version=1\r\n", p, strlen("version=1\r\n")) == 0 
||
+strncmp("version=2\r\n", p, strlen("version=2\r\n")) == 0) 
{
+return 100;
+}
+}
+return 0;
+}
 return 0;
+}
 }
 
 #define CHECK_CID 1
-- 
1.7.6




[Qemu-devel] [PATCH 03/21] qemu-io: Fix if scoping bug

2011-07-19 Thread Kevin Wolf
From: Devin Nakamura 

Fix a bug caused by lack of braces in if statement

Lack of braces means that if(count & 0x1ff) is never reached

Signed-off-by: Devin Nakamura 
Signed-off-by: Kevin Wolf 
---
 qemu-io.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index e3c825f..a553d0c 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -449,7 +449,7 @@ static int read_f(int argc, char **argv)
 return 0;
 }
 
-if (!pflag)
+if (!pflag) {
 if (offset & 0x1ff) {
 printf("offset %" PRId64 " is not sector aligned\n",
offset);
@@ -460,6 +460,7 @@ static int read_f(int argc, char **argv)
count);
 return 0;
 }
+}
 
 buf = qemu_io_alloc(count, 0xab);
 
-- 
1.7.6




[Qemu-devel] [PATCH 01/21] sheepdog: add full data preallocation support

2011-07-19 Thread Kevin Wolf
From: MORITA Kazutaka 

This introduces a qemu-img create option for sheepdog which allows the
data to be fully preallocated (note that sheepdog always preallocates
metadata).

The option is disabled by default; enable it as follows:

qemu-img create sheepdog:test -o preallocation=full 1G

Signed-off-by: MORITA Kazutaka 
Signed-off-by: FUJITA Tomonori 
Signed-off-by: Kevin Wolf 
---
 block/sheepdog.c |   71 +++--
 1 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 80d106c..77a4de5 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1286,6 +1286,49 @@ static int do_sd_create(char *filename, int64_t vdi_size,
 return 0;
 }
 
+static int sd_prealloc(const char *filename)
+{
+BlockDriverState *bs = NULL;
+uint32_t idx, max_idx;
+int64_t vdi_size;
+void *buf = qemu_mallocz(SD_DATA_OBJ_SIZE);
+int ret;
+
+ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR);
+if (ret < 0) {
+goto out;
+}
+
+vdi_size = bdrv_getlength(bs);
+if (vdi_size < 0) {
+ret = vdi_size;
+goto out;
+}
+max_idx = DIV_ROUND_UP(vdi_size, SD_DATA_OBJ_SIZE);
+
+for (idx = 0; idx < max_idx; idx++) {
+/*
+ * The created image can be a cloned image, so we need to read
+ * a data from the source image.
+ */
+ret = bdrv_pread(bs, idx * SD_DATA_OBJ_SIZE, buf, SD_DATA_OBJ_SIZE);
+if (ret < 0) {
+goto out;
+}
+ret = bdrv_pwrite(bs, idx * SD_DATA_OBJ_SIZE, buf, SD_DATA_OBJ_SIZE);
+if (ret < 0) {
+goto out;
+}
+}
+out:
+if (bs) {
+bdrv_delete(bs);
+}
+qemu_free(buf);
+
+return ret;
+}
+
 static int sd_create(const char *filename, QEMUOptionParameter *options)
 {
 int ret;
@@ -1295,13 +1338,15 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options)
 BDRVSheepdogState s;
 char vdi[SD_MAX_VDI_LEN], tag[SD_MAX_VDI_TAG_LEN];
 uint32_t snapid;
+int prealloc = 0;
+const char *vdiname;
 
-strstart(filename, "sheepdog:", (const char **)&filename);
+strstart(filename, "sheepdog:", &vdiname);
 
 memset(&s, 0, sizeof(s));
 memset(vdi, 0, sizeof(vdi));
 memset(tag, 0, sizeof(tag));
-if (parse_vdiname(&s, filename, vdi, &snapid, tag) < 0) {
+if (parse_vdiname(&s, vdiname, vdi, &snapid, tag) < 0) {
 error_report("invalid filename");
 return -EINVAL;
 }
@@ -1311,6 +1356,16 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options)
 vdi_size = options->value.n;
 } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
 backing_file = options->value.s;
+} else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
+if (!options->value.s || !strcmp(options->value.s, "off")) {
+prealloc = 0;
+} else if (!strcmp(options->value.s, "full")) {
+prealloc = 1;
+} else {
+error_report("Invalid preallocation mode: '%s'",
+ options->value.s);
+return -EINVAL;
+}
 }
 options++;
 }
@@ -1348,7 +1403,12 @@ static int sd_create(const char *filename, 
QEMUOptionParameter *options)
 bdrv_delete(bs);
 }
 
-return do_sd_create((char *)vdi, vdi_size, base_vid, &vid, 0, s.addr, 
s.port);
+ret = do_sd_create(vdi, vdi_size, base_vid, &vid, 0, s.addr, s.port);
+if (!prealloc || ret) {
+return ret;
+}
+
+return sd_prealloc(filename);
 }
 
 static void sd_close(BlockDriverState *bs)
@@ -1984,6 +2044,11 @@ static QEMUOptionParameter sd_create_options[] = {
 .type = OPT_STRING,
 .help = "File name of a base image"
 },
+{
+.name = BLOCK_OPT_PREALLOC,
+.type = OPT_STRING,
+.help = "Preallocation mode (allowed values: off, full)"
+},
 { NULL }
 };
 
-- 
1.7.6




[Qemu-devel] [PATCH 10/21] VMDK: introduce VmdkExtent

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

Introduce a VmdkExtent array in BDRVVmdkState so that it can hold multiple
image extents, in preparation for multiple file image support.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/vmdk.c |  348 +-
 1 files changed, 246 insertions(+), 102 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 922b23d..3b78583 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -60,7 +60,11 @@ typedef struct {
 
 #define L2_CACHE_SIZE 16
 
-typedef struct BDRVVmdkState {
+typedef struct VmdkExtent {
+BlockDriverState *file;
+bool flat;
+int64_t sectors;
+int64_t end_sector;
 int64_t l1_table_offset;
 int64_t l1_backup_table_offset;
 uint32_t *l1_table;
@@ -74,7 +78,13 @@ typedef struct BDRVVmdkState {
 uint32_t l2_cache_counts[L2_CACHE_SIZE];
 
 unsigned int cluster_sectors;
+} VmdkExtent;
+
+typedef struct BDRVVmdkState {
 uint32_t parent_cid;
+int num_extents;
+/* Extent array with num_extents entries, ascend ordered by address */
+VmdkExtent *extents;
 } BDRVVmdkState;
 
 typedef struct VmdkMetaData {
@@ -105,6 +115,19 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, 
const char *filename)
 #define DESC_SIZE 20*SECTOR_SIZE   // 20 sectors of 512 bytes each
 #define HEADER_SIZE 512// first sector of 512 bytes
 
+static void vmdk_free_extents(BlockDriverState *bs)
+{
+int i;
+BDRVVmdkState *s = bs->opaque;
+
+for (i = 0; i < s->num_extents; i++) {
+qemu_free(s->extents[i].l1_table);
+qemu_free(s->extents[i].l2_cache);
+qemu_free(s->extents[i].l1_backup_table);
+}
+qemu_free(s->extents);
+}
+
 static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 {
 char desc[DESC_SIZE];
@@ -358,11 +381,50 @@ static int vmdk_parent_open(BlockDriverState *bs)
 return 0;
 }
 
+/* Create and append extent to the extent array. Return the added VmdkExtent
+ * address. return NULL if allocation failed. */
+static VmdkExtent *vmdk_add_extent(BlockDriverState *bs,
+   BlockDriverState *file, bool flat, int64_t sectors,
+   int64_t l1_offset, int64_t l1_backup_offset,
+   uint32_t l1_size,
+   int l2_size, unsigned int cluster_sectors)
+{
+VmdkExtent *extent;
+BDRVVmdkState *s = bs->opaque;
+
+s->extents = qemu_realloc(s->extents,
+  (s->num_extents + 1) * sizeof(VmdkExtent));
+extent = &s->extents[s->num_extents];
+s->num_extents++;
+
+memset(extent, 0, sizeof(VmdkExtent));
+extent->file = file;
+extent->flat = flat;
+extent->sectors = sectors;
+extent->l1_table_offset = l1_offset;
+extent->l1_backup_table_offset = l1_backup_offset;
+extent->l1_size = l1_size;
+extent->l1_entry_sectors = l2_size * cluster_sectors;
+extent->l2_size = l2_size;
+extent->cluster_sectors = cluster_sectors;
+
+if (s->num_extents > 1) {
+extent->end_sector = (*(extent - 1)).end_sector + extent->sectors;
+} else {
+extent->end_sector = extent->sectors;
+}
+bs->total_sectors = extent->end_sector;
+return extent;
+}
+
+
 static int vmdk_open(BlockDriverState *bs, int flags)
 {
 BDRVVmdkState *s = bs->opaque;
 uint32_t magic;
-int l1_size, i;
+int i;
+uint32_t l1_size, l1_entry_sectors;
+VmdkExtent *extent = NULL;
 
 if (bdrv_pread(bs->file, 0, &magic, sizeof(magic)) != sizeof(magic))
 goto fail;
@@ -370,32 +432,34 @@ static int vmdk_open(BlockDriverState *bs, int flags)
 magic = be32_to_cpu(magic);
 if (magic == VMDK3_MAGIC) {
 VMDK3Header header;
-
-if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header)) != 
sizeof(header))
+if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
+!= sizeof(header)) {
 goto fail;
-s->cluster_sectors = le32_to_cpu(header.granularity);
-s->l2_size = 1 << 9;
-s->l1_size = 1 << 6;
-bs->total_sectors = le32_to_cpu(header.disk_sectors);
-s->l1_table_offset = le32_to_cpu(header.l1dir_offset) << 9;
-s->l1_backup_table_offset = 0;
-s->l1_entry_sectors = s->l2_size * s->cluster_sectors;
+}
+extent = vmdk_add_extent(bs, bs->file, false,
+  le32_to_cpu(header.disk_sectors),
+  le32_to_cpu(header.l1dir_offset) << 9, 0,
+  1 << 6, 1 << 9, le32_to_cpu(header.granularity));
 } else if (magic == VMDK4_MAGIC) {
 VMDK4Header header;
-
-if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header)) != 
sizeof(header))
+if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
+!= sizeof(header)) {
 goto fail;
-bs->total_sectors = le64_to_cpu(header.ca

[Qemu-devel] [PATCH 08/21] qemu-options.hx: Document missing -drive options

2011-07-19 Thread Kevin Wolf
From: Luiz Capitulino 

They are 'werror', 'rerror' and 'readonly'.

Signed-off-by: Luiz Capitulino 
Signed-off-by: Kevin Wolf 
---
 qemu-options.hx |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index e6d7adc..64114dd 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -160,6 +160,14 @@ an untrusted format header.
 This option specifies the serial number to assign to the device.
 @item addr=@var{addr}
 Specify the controller's PCI address (if=virtio only).
+@item werror=@var{action},rerror=@var{action}
+Specify which @var{action} to take on write and read errors. Valid actions are:
+"ignore" (ignore the error and try to continue), "stop" (pause QEMU),
+"report" (report the error to the guest), "enospc" (pause QEMU only if the
+host disk is full; report the error to the guest otherwise).
+The default setting is @option{werror=enospc} and @option{rerror=report}.
+@item readonly
+Open drive @option{file} as read-only. Guest write attempts will fail.
 @end table
 
 By default, writethrough caching is used for all block device.  This means that
-- 
1.7.6




[Qemu-devel] [PATCH 04/21] iov: Update parameter usage in iov_(to|from)_buf()

2011-07-19 Thread Kevin Wolf
From: Hannes Reinecke 

iov_to_buf() has an 'offset' parameter, iov_from_buf() does not.
This patch adds the missing parameter to iov_from_buf().
It also renames the 'offset' parameter to 'iov_off' to
emphasize that it is the offset into the iovec and not into the buffer.
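
As an illustration of what the new parameter means, the following standalone
sketch copies a flat buffer into a scatter list starting iov_off bytes into
the list. It follows the loop in the diff below, but copy_into_iov() is a
name chosen for this example and the pointer arithmetic is done on char
pointers to stay within standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>    /* struct iovec */

/* Copy up to `size` bytes from buf into iov[], starting iov_off bytes into
 * the scatter list as a whole. Returns the number of bytes copied. */
size_t copy_into_iov(struct iovec *iov, unsigned int iov_cnt,
                     const void *buf, size_t iov_off, size_t size)
{
    size_t iovec_off = 0;   /* start of iov[i] within the whole list */
    size_t buf_off = 0;     /* bytes consumed from buf so far */
    unsigned int i;

    assert(iov != NULL && buf != NULL);
    for (i = 0; i < iov_cnt && size; i++) {
        size_t seg_end = iovec_off + iov[i].iov_len;

        if (iov_off < seg_end) {
            /* Part of this element lies at or after iov_off: fill it. */
            size_t len = seg_end - iov_off;
            if (len > size) {
                len = size;
            }
            memcpy((char *)iov[i].iov_base + (iov_off - iovec_off),
                   (const char *)buf + buf_off, len);
            buf_off += len;
            iov_off += len;
            size -= len;
        }
        iovec_off = seg_end;
    }
    return buf_off;
}
```

With two 4-byte elements and iov_off = 2, a 6-byte buffer fills the last two
bytes of the first element and all of the second. The callers touched by
this patch pass 0, which keeps the previous behaviour.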

Signed-off-by: Hannes Reinecke 
Acked-by: Alexander Graf 
Signed-off-by: Kevin Wolf 
---
 hw/virtio-net.c|2 +-
 hw/virtio-serial-bus.c |2 +-
 iov.c  |   49 ++-
 iov.h  |   10 
 4 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 6997e02..a32cc01 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -657,7 +657,7 @@ static ssize_t virtio_net_receive(VLANClientState *nc, 
const uint8_t *buf, size_
 
 /* copy in packet.  ugh */
 len = iov_from_buf(sg, elem.in_num,
-   buf + offset, size - offset);
+   buf + offset, 0, size - offset);
 total += len;
 offset += len;
 /* If buffers can't be merged, at this point we
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 7f6db7b..53c58d0 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -103,7 +103,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
 }
 
 len = iov_from_buf(elem.in_sg, elem.in_num,
-   buf + offset, size - offset);
+   buf + offset, 0, size - offset);
 offset += len;
 
 virtqueue_push(vq, &elem, len);
diff --git a/iov.c b/iov.c
index 588cd04..1e02791 100644
--- a/iov.c
+++ b/iov.c
@@ -14,56 +14,61 @@
 
 #include "iov.h"
 
-size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-const void *buf, size_t size)
+size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
+const void *buf, size_t iov_off, size_t size)
 {
-size_t offset;
+size_t iovec_off, buf_off;
 unsigned int i;
 
-offset = 0;
-for (i = 0; offset < size && i < iovcnt; i++) {
-size_t len;
+iovec_off = 0;
+buf_off = 0;
+for (i = 0; i < iov_cnt && size; i++) {
+if (iov_off < (iovec_off + iov[i].iov_len)) {
+size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off, size);
 
-len = MIN(iov[i].iov_len, size - offset);
+memcpy(iov[i].iov_base + (iov_off - iovec_off), buf + buf_off, 
len);
 
-memcpy(iov[i].iov_base, buf + offset, len);
-offset += len;
+buf_off += len;
+iov_off += len;
+size -= len;
+}
+iovec_off += iov[i].iov_len;
 }
-return offset;
+return buf_off;
 }
 
-size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
-  void *buf, size_t offset, size_t size)
+size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt,
+  void *buf, size_t iov_off, size_t size)
 {
 uint8_t *ptr;
-size_t iov_off, buf_off;
+size_t iovec_off, buf_off;
 unsigned int i;
 
 ptr = buf;
-iov_off = 0;
+iovec_off = 0;
 buf_off = 0;
-for (i = 0; i < iovcnt && size; i++) {
-if (offset < (iov_off + iov[i].iov_len)) {
-size_t len = MIN((iov_off + iov[i].iov_len) - offset , size);
+for (i = 0; i < iov_cnt && size; i++) {
+if (iov_off < (iovec_off + iov[i].iov_len)) {
+size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size);
 
-memcpy(ptr + buf_off, iov[i].iov_base + (offset - iov_off), len);
+memcpy(ptr + buf_off, iov[i].iov_base + (iov_off - iovec_off), 
len);
 
 buf_off += len;
-offset += len;
+iov_off += len;
 size -= len;
 }
-iov_off += iov[i].iov_len;
+iovec_off += iov[i].iov_len;
 }
 return buf_off;
 }
 
-size_t iov_size(const struct iovec *iov, const unsigned int iovcnt)
+size_t iov_size(const struct iovec *iov, const unsigned int iov_cnt)
 {
 size_t len;
 unsigned int i;
 
 len = 0;
-for (i = 0; i < iovcnt; i++) {
+for (i = 0; i < iov_cnt; i++) {
 len += iov[i].iov_len;
 }
 return len;
diff --git a/iov.h b/iov.h
index 60a8547..110f67a 100644
--- a/iov.h
+++ b/iov.h
@@ -12,8 +12,8 @@
 
 #include "qemu-common.h"
 
-size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-const void *buf, size_t size);
-size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
-  void *buf, size_t offset, size_t size);
-size_t iov_size(const struct iovec *iov, const unsigned int iovcnt);
+size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
+const void *buf, size_t iov_off, size_t size);
+size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt,
+  void *buf, size_t iov_off, size_t size);
+size_t iov_size(const struct iovec *iov, con

[Qemu-devel] [PATCH 11/21] VMDK: bugfix, align offset to cluster in get_whole_cluster

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

In get_whole_cluster, the offset is not aligned to the cluster boundary when
reading from backing_hd. When the first write to the child is not at a
cluster boundary, data from the wrong address in the parent is copied to
the child.
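
The alignment step is plain modular arithmetic. A small sketch, assuming
512-byte sectors as in the driver (floor_to_cluster() is a hypothetical
helper, not a function in block/vmdk.c):

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Round a byte offset down to the start of the cluster that contains it. */
int64_t floor_to_cluster(int64_t offset, unsigned int cluster_sectors)
{
    int64_t cluster_bytes = (int64_t)cluster_sectors * SECTOR_SIZE;

    assert(offset >= 0 && cluster_bytes > 0);
    return offset - offset % cluster_bytes;
}
```

For a 128-sector (64KB) grain, a write at byte offset 70000 must read the
parent starting at 65536; without the floor, the read would start at 70000
and the copied grain would be shifted.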

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/vmdk.c |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 3b78583..03a4619 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -514,21 +514,23 @@ static int get_whole_cluster(BlockDriverState *bs,
 /* 128 sectors * 512 bytes each = grain size 64KB */
 uint8_t  whole_grain[extent->cluster_sectors * 512];
 
-// we will be here if it's first write on non-exist grain(cluster).
-// try to read from parent image, if exist
+/* we will be here if it's first write on non-exist grain(cluster).
+ * try to read from parent image, if exist */
 if (bs->backing_hd) {
 int ret;
 
 if (!vmdk_is_cid_valid(bs))
 return -1;
 
+/* floor offset to cluster */
+offset -= offset % (extent->cluster_sectors * 512);
 ret = bdrv_read(bs->backing_hd, offset >> 9, whole_grain,
 extent->cluster_sectors);
 if (ret < 0) {
 return -1;
 }
 
-//Write grain only into the active image
+/* Write grain only into the active image */
 ret = bdrv_write(extent->file, cluster_offset, whole_grain,
 extent->cluster_sectors);
 if (ret < 0) {
-- 
1.7.6




[Qemu-devel] [PATCH 09/21] qemu-config: Document -drive options

2011-07-19 Thread Kevin Wolf
From: Luiz Capitulino 

Signed-off-by: Luiz Capitulino 
Signed-off-by: Kevin Wolf 
---
 qemu-config.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index c63741c..93d20c6 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -23,6 +23,7 @@ static QemuOptsList qemu_drive_opts = {
 },{
 .name = "index",
 .type = QEMU_OPT_NUMBER,
+.help = "index number",
 },{
 .name = "cyls",
 .type = QEMU_OPT_NUMBER,
@@ -46,6 +47,7 @@ static QemuOptsList qemu_drive_opts = {
 },{
 .name = "snapshot",
 .type = QEMU_OPT_BOOL,
+.help = "enable/disable snapshot mode",
 },{
 .name = "file",
 .type = QEMU_OPT_STRING,
@@ -65,12 +67,15 @@ static QemuOptsList qemu_drive_opts = {
 },{
 .name = "serial",
 .type = QEMU_OPT_STRING,
+.help = "disk serial number",
 },{
 .name = "rerror",
 .type = QEMU_OPT_STRING,
+.help = "read error action",
 },{
 .name = "werror",
 .type = QEMU_OPT_STRING,
+.help = "write error action",
 },{
 .name = "addr",
 .type = QEMU_OPT_STRING,
@@ -78,6 +83,7 @@ static QemuOptsList qemu_drive_opts = {
 },{
 .name = "readonly",
 .type = QEMU_OPT_BOOL,
+.help = "open drive file as read-only",
 },
 { /* end of list */ }
 },
-- 
1.7.6




[Qemu-devel] [PATCH 06/21] scsi-disk: Fixup debugging statement

2011-07-19 Thread Kevin Wolf
From: Hannes Reinecke 

A debugging statement wasn't converted to the new interface.

Signed-off-by: Hannes Reinecke 
Acked-by: Paolo Bonzini 
Signed-off-by: Kevin Wolf 
---
 hw/scsi-disk.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index c2a99fe..5804662 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1007,7 +1007,7 @@ static int32_t scsi_send_command(SCSIRequest *req, 
uint8_t *buf)
 
 command = buf[0];
 outbuf = (uint8_t *)r->iov.iov_base;
-DPRINTF("Command: lun=%d tag=0x%x data=0x%02x", lun, tag, buf[0]);
+DPRINTF("Command: lun=%d tag=0x%x data=0x%02x", req->lun, req->tag, 
buf[0]);
 
 if (scsi_req_parse(&r->req, buf) != 0) {
 BADF("Unsupported command length, command %x\n", command);
-- 
1.7.6




Re: [Qemu-devel] [Xen-devel] Re: Upstream QEMU and Xen unstable not working

2011-07-19 Thread Ian Campbell
On Mon, 2011-07-18 at 17:17 +0100, Stefano Stabellini wrote:
> On Mon, 18 Jul 2011, Wei Liu wrote:
> > Stefano and Anthony, you once said that you were going to setup a
> > public QEMU repository for Xen, how is it going now?
> 
> We are getting there, but there are still too many xen patches floating
> around qemu-devel at the moment to announce a new qemu xen tree.

Isn't the presence of all those patches floating around qemu-devel and
the need for people to trawl around collecting so as to have a working
build exactly the problem such a tree would be intended to solve? i.e. a
one stop place to pick up pending patches before they hit the main tree.

> However you can try the following branch for now:
> 
> git://xenbits.xen.org/people/sstabellini/qemu-dm.git xen-next

Ian.




[Qemu-devel] [PATCH 13/21] VMDK: separate vmdk_open by format version

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

Separate vmdk_open by subformats to:
* vmdk_open_vmdk3
* vmdk_open_vmdk4

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/vmdk.c |  178 -
 1 files changed, 112 insertions(+), 66 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f8a815c..6d7b497 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -458,67 +458,20 @@ static VmdkExtent *vmdk_add_extent(BlockDriverState *bs,
 return extent;
 }
 
-
-static int vmdk_open(BlockDriverState *bs, int flags)
+static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
 {
-BDRVVmdkState *s = bs->opaque;
-uint32_t magic;
-int i;
-uint32_t l1_size, l1_entry_sectors;
-VmdkExtent *extent = NULL;
-
-if (bdrv_pread(bs->file, 0, &magic, sizeof(magic)) != sizeof(magic))
-goto fail;
-
-magic = be32_to_cpu(magic);
-if (magic == VMDK3_MAGIC) {
-VMDK3Header header;
-if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
-!= sizeof(header)) {
-goto fail;
-}
-extent = vmdk_add_extent(bs, bs->file, false,
-  le32_to_cpu(header.disk_sectors),
-  le32_to_cpu(header.l1dir_offset) << 9, 0,
-  1 << 6, 1 << 9, le32_to_cpu(header.granularity));
-} else if (magic == VMDK4_MAGIC) {
-VMDK4Header header;
-if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
-!= sizeof(header)) {
-goto fail;
-}
-l1_entry_sectors = le32_to_cpu(header.num_gtes_per_gte)
-* le64_to_cpu(header.granularity);
-l1_size = (le64_to_cpu(header.capacity) + l1_entry_sectors - 1)
-/ l1_entry_sectors;
-extent = vmdk_add_extent(bs, bs->file, false,
-  le64_to_cpu(header.capacity),
-  le64_to_cpu(header.gd_offset) << 9,
-  le64_to_cpu(header.rgd_offset) << 9,
-  l1_size,
-  le32_to_cpu(header.num_gtes_per_gte),
-  le64_to_cpu(header.granularity));
-if (extent->l1_entry_sectors <= 0) {
-goto fail;
-}
-// try to open parent images, if exist
-if (vmdk_parent_open(bs) != 0)
-goto fail;
-// write the CID once after the image creation
-s->parent_cid = vmdk_read_cid(bs,1);
-} else {
-goto fail;
-}
+int ret;
+int l1_size, i;
 
 /* read the L1 table */
 l1_size = extent->l1_size * sizeof(uint32_t);
 extent->l1_table = qemu_malloc(l1_size);
-if (bdrv_pread(bs->file,
-extent->l1_table_offset,
-extent->l1_table,
-l1_size)
-!= l1_size) {
-goto fail;
+ret = bdrv_pread(extent->file,
+extent->l1_table_offset,
+extent->l1_table,
+l1_size);
+if (ret < 0) {
+goto fail_l1;
 }
 for (i = 0; i < extent->l1_size; i++) {
 le32_to_cpus(&extent->l1_table[i]);
@@ -526,12 +479,12 @@ static int vmdk_open(BlockDriverState *bs, int flags)
 
 if (extent->l1_backup_table_offset) {
 extent->l1_backup_table = qemu_malloc(l1_size);
-if (bdrv_pread(bs->file,
-extent->l1_backup_table_offset,
-extent->l1_backup_table,
-l1_size)
-!= l1_size) {
-goto fail;
+ret = bdrv_pread(extent->file,
+extent->l1_backup_table_offset,
+extent->l1_backup_table,
+l1_size);
+if (ret < 0) {
+goto fail_l1b;
 }
 for (i = 0; i < extent->l1_size; i++) {
 le32_to_cpus(&extent->l1_backup_table[i]);
@@ -541,9 +494,102 @@ static int vmdk_open(BlockDriverState *bs, int flags)
 extent->l2_cache =
 qemu_malloc(extent->l2_size * L2_CACHE_SIZE * sizeof(uint32_t));
 return 0;
+ fail_l1b:
+qemu_free(extent->l1_backup_table);
+ fail_l1:
+qemu_free(extent->l1_table);
+return ret;
+}
+
+static int vmdk_open_vmdk3(BlockDriverState *bs, int flags)
+{
+int ret;
+uint32_t magic;
+VMDK3Header header;
+VmdkExtent *extent;
+
+ret = bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header));
+if (ret < 0) {
+goto fail;
+}
+extent = vmdk_add_extent(bs,
+ bs->file, false,
+ le32_to_cpu(header.disk_sectors),
+ le32_to_cpu(header.l1dir_offset) << 9,
+ 0, 1 << 6, 1 << 9,
+ le32_to_cpu(header.granularity));
+ret = vmdk_init_tables(bs, extent);
+if (ret) {
+/* vmdk_init_tables cleans up

[Qemu-devel] [PATCH 15/21] VMDK: flush multiple extents

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

Flush all the files that are referenced by the image.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/vmdk.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 529ae90..f6d2986 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1072,7 +1072,17 @@ static void vmdk_close(BlockDriverState *bs)
 
 static int vmdk_flush(BlockDriverState *bs)
 {
-return bdrv_flush(bs->file);
+int i, ret, err;
+BDRVVmdkState *s = bs->opaque;
+
+ret = bdrv_flush(bs->file);
+for (i = 0; i < s->num_extents; i++) {
+err = bdrv_flush(s->extents[i].file);
+if (err < 0) {
+ret = err;
+}
+}
+return ret;
 }
 
 
-- 
1.7.6




[Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Kevin Wolf
From: Hannes Reinecke 

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves a lookup for drivers that handle several commands
in parallel.
'tag' is still kept for tracing purposes.

Signed-off-by: Hannes Reinecke 
Acked-by: Paolo Bonzini 
Signed-off-by: Kevin Wolf 
---
 hw/esp.c  |2 +-
 hw/lsi53c895a.c   |   22 --
 hw/scsi-bus.c |9 ++---
 hw/scsi-disk.c|4 ++--
 hw/scsi-generic.c |5 +++--
 hw/scsi.h |   10 +++---
 hw/spapr_vscsi.c  |   29 +
 hw/usb-msd.c  |9 +
 8 files changed, 37 insertions(+), 53 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index aa50800..9ddd637 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t 
busid)
 
 DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
 lun = busid & 7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
 datalen = scsi_req_enqueue(s->current_req, buf);
 s->ti_size = datalen;
 if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..69eec1d 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t 
tag)
 static void lsi_request_cancelled(SCSIRequest *req)
 {
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
-lsi_request *p;
+lsi_request *p = req->hba_private;
 
 if (s->current && req == s->current->req) {
 scsi_req_unref(req);
@@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
 return;
 }
 
-p = lsi_find_by_tag(s, req->tag);
 if (p) {
 QTAILQ_REMOVE(&s->queue, p, next);
 scsi_req_unref(req);
@@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)
 
 /* Record that data is available for a queued command.  Returns zero if
the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
 {
-lsi_request *p;
-
-p = lsi_find_by_tag(s, tag);
-if (!p) {
-BADF("IO with unknown tag %d\n", tag);
-return 1;
-}
+lsi_request *p = req->hba_private;
 
 if (p->pending) {
-BADF("Multiple IO pending for tag %d\n", tag);
+BADF("Multiple IO pending for request %p\n", p);
 }
 p->pending = len;
 /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t 
len)
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 int out;
 
-if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
 (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
-if (lsi_queue_tag(s, req->tag, len)) {
+if (lsi_queue_req(s, req, len)) {
 return;
 }
 }
@@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
 assert(s->current == NULL);
 s->current = qemu_mallocz(sizeof(lsi_request));
 s->current->tag = s->select_tag;
-s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
+   s->current);
 
 n = scsi_req_enqueue(s->current->req, buf);
 if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..8b1a412 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
 return res;
 }
 
-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t 
lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
+uint32_t lun, void *hba_private)
 {
 SCSIRequest *req;
 
@@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, 
uint32_t tag, uint32_t l
 req->dev = d;
 req->tag = tag;
 req->lun = lun;
+req->hba_private = hba_private;
 req->status = -1;
 trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
 return req;
 }
 
-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
+  void *hba_private)
 {
-return d->info->alloc_req(d, tag, lun);
+return d->info->alloc_req(d, tag, lun, hba_private);
 }
 
 uint8_t *scsi_req_get_buf(SCSIRequest *req)
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a8c7372..c2a99fe 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -81,13 +81,13 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, int type

[Qemu-devel] [PATCH 16/21] VMDK: move 'static' cid_update flag to bs field

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

cid_update is the flag that triggers a CID update on the first write after
the image is opened. It should be tracked per image open rather than per
program lifetime, so change it from a static variable in vmdk_write to a
field in BDRVVmdkState.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
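Not part of the patch: a minimal standalone sketch of why the flag has to
live in per-image state. ToyImage, toy_write_static and toy_write_per_image
are hypothetical names invented for this note, and a counter stands in for
the call to vmdk_write_cid().

```c
#include <stdbool.h>

/* Hypothetical stand-in for BDRVVmdkState; not the QEMU structure. */
typedef struct ToyImage {
    bool cid_updated;   /* per-open flag, as introduced by the patch */
    int cid_writes;     /* how often the CID would have been rewritten */
} ToyImage;

/* Old behaviour: one flag shared by every image in the process. */
static void toy_write_static(ToyImage *img)
{
    static int cid_update = 0;

    if (!cid_update) {
        img->cid_writes++;
        cid_update++;
    }
}

/* New behaviour: each image tracks its own first write. */
static void toy_write_per_image(ToyImage *img)
{
    if (!img->cid_updated) {
        img->cid_writes++;
        img->cid_updated = true;
    }
}
```

With the static flag, only the first image written to in the process gets a
new CID. With the per-image field, every opened image gets exactly one.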
 block/vmdk.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f6d2986..8dc58a8 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -82,6 +82,7 @@ typedef struct VmdkExtent {
 
 typedef struct BDRVVmdkState {
 int desc_offset;
+bool cid_updated;
 uint32_t parent_cid;
 int num_extents;
 /* Extent array with num_extents entries, ascend ordered by address */
@@ -853,7 +854,6 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
 int n;
 int64_t index_in_cluster;
 uint64_t cluster_offset;
-static int cid_update = 0;
 VmdkMetaData m_data;
 
 if (sector_num > bs->total_sectors) {
@@ -900,9 +900,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
 buf += n * 512;
 
 // update CID on the first write every time the virtual disk is opened
-if (!cid_update) {
+if (!s->cid_updated) {
 vmdk_write_cid(bs, time(NULL));
-cid_update++;
+s->cid_updated = true;
 }
 }
 return 0;
-- 
1.7.6




[Qemu-devel] [PATCH 18/21] VMDK: open/read/write for monolithicFlat image

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

Parse the VMDK descriptor file and open monolithicFlat images.
Read/write the flat extent.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
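Not part of the patch: a standalone sketch of how a single extent line from
the descriptor can be parsed, using the same sscanf format as
vmdk_parse_extents() below. ToyExtentLine and toy_parse_extent_line are
hypothetical names. Unlike the patch, this sketch accepts SPARSE lines
syntactically instead of returning -ENOTSUP, and it reports every
non-extent line as an error rather than skipping it.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
typedef struct ToyExtentLine {
    char access[11];
    char type[11];
    char fname[512];
    int64_t sectors;
    int64_t flat_offset;   /* -1 when the line carries no offset */
} ToyExtentLine;

/*
 * Parse one descriptor line such as
 *   RW 4192256 FLAT "disk-flat.vmdk" 0
 * Returns 0 on success, -1 if the line is not a usable extent line.
 */
static int toy_parse_extent_line(const char *line, ToyExtentLine *out)
{
    size_t len;
    int n;

    out->flat_offset = -1;
    n = sscanf(line, "%10s %" SCNd64 " %10s %511s %" SCNd64,
               out->access, &out->sectors, out->type, out->fname,
               &out->flat_offset);
    if (n < 4 || strcmp(out->access, "RW") != 0 || out->sectors <= 0) {
        return -1;
    }
    if (strcmp(out->type, "FLAT") == 0) {
        /* FLAT extents must give a non-negative start offset. */
        if (n != 5 || out->flat_offset < 0) {
            return -1;
        }
    } else if (strcmp(out->type, "SPARSE") != 0 || n != 4) {
        return -1;
    }

    /* Strip the surrounding double quotes from the file name. */
    len = strlen(out->fname);
    if (len < 3 || out->fname[0] != '"' || out->fname[len - 1] != '"') {
        return -1;
    }
    memmove(out->fname, out->fname + 1, len - 2);
    out->fname[len - 2] = '\0';
    return 0;
}
```

As in the patch, %511s stops at whitespace, so file names containing spaces
are not handled by this approach.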
 block/vmdk.c |  171 +-
 1 files changed, 158 insertions(+), 13 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f637d98..e1fb962 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -65,6 +65,7 @@ typedef struct VmdkExtent {
 bool flat;
 int64_t sectors;
 int64_t end_sector;
+int64_t flat_start_offset;
 int64_t l1_table_offset;
 int64_t l1_backup_table_offset;
 uint32_t *l1_table;
@@ -407,9 +408,10 @@ fail:
 static int vmdk_parent_open(BlockDriverState *bs)
 {
 char *p_name;
-char desc[DESC_SIZE];
+char desc[DESC_SIZE + 1];
 BDRVVmdkState *s = bs->opaque;
 
+desc[DESC_SIZE] = '\0';
 if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
 return -1;
 }
@@ -584,6 +586,144 @@ static int vmdk_open_vmdk4(BlockDriverState *bs, int flags)
 return ret;
 }
 
+/* find an option value out of descriptor file */
+static int vmdk_parse_description(const char *desc, const char *opt_name,
+char *buf, int buf_size)
+{
+char *opt_pos, *opt_end;
+const char *end = desc + strlen(desc);
+
+opt_pos = strstr(desc, opt_name);
+if (!opt_pos) {
+return -1;
+}
+/* Skip "=\"" following opt_name */
+opt_pos += strlen(opt_name) + 2;
+if (opt_pos >= end) {
+return -1;
+}
+opt_end = opt_pos;
+while (opt_end < end && *opt_end != '"') {
+opt_end++;
+}
+if (opt_end == end || buf_size < opt_end - opt_pos + 1) {
+return -1;
+}
+pstrcpy(buf, opt_end - opt_pos + 1, opt_pos);
+return 0;
+}
+
+static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
+const char *desc_file_path)
+{
+int ret;
+char access[11];
+char type[11];
+char fname[512];
+const char *p = desc;
+int64_t sectors = 0;
+int64_t flat_offset;
+
+while (*p) {
+/* parse extent line:
+ * RW [size in sectors] FLAT "file-name.vmdk" OFFSET
+ * or
+ * RW [size in sectors] SPARSE "file-name.vmdk"
+ */
+flat_offset = -1;
+ret = sscanf(p, "%10s %" SCNd64 " %10s %511s %" SCNd64,
+access, &sectors, type, fname, &flat_offset);
+if (ret < 4 || strcmp(access, "RW")) {
+goto next_line;
+} else if (!strcmp(type, "FLAT")) {
+if (ret != 5 || flat_offset < 0) {
+return -EINVAL;
+}
+} else if (ret != 4) {
+return -EINVAL;
+}
+
+/* trim the quotation marks around */
+if (fname[0] == '"') {
+memmove(fname, fname + 1, strlen(fname));
+if (strlen(fname) <= 1 || fname[strlen(fname) - 1] != '"') {
+return -EINVAL;
+}
+fname[strlen(fname) - 1] = '\0';
+}
+if (sectors <= 0 ||
+(strcmp(type, "FLAT") && strcmp(type, "SPARSE")) ||
+(strcmp(access, "RW"))) {
+goto next_line;
+}
+
+/* save to extents array */
+if (!strcmp(type, "FLAT")) {
+/* FLAT extent */
+char extent_path[PATH_MAX];
+BlockDriverState *extent_file;
+VmdkExtent *extent;
+
+path_combine(extent_path, sizeof(extent_path),
+desc_file_path, fname);
+ret = bdrv_file_open(&extent_file, extent_path, bs->open_flags);
+if (ret) {
+return ret;
+}
+extent = vmdk_add_extent(bs, extent_file, true, sectors,
+0, 0, 0, 0, sectors);
+extent->flat_start_offset = flat_offset;
+} else {
+/* SPARSE extent, not supported for now */
+fprintf(stderr,
+"VMDK: Not supported extent type \"%s\""".\n", type);
+return -ENOTSUP;
+}
+next_line:
+/* move to next line */
+while (*p && *p != '\n') {
+p++;
+}
+p++;
+}
+return 0;
+}
+
+static int vmdk_open_desc_file(BlockDriverState *bs, int flags)
+{
+int ret;
+char buf[2048];
+char ct[128];
+BDRVVmdkState *s = bs->opaque;
+
+ret = bdrv_pread(bs->file, 0, buf, sizeof(buf));
+if (ret < 0) {
+return ret;
+}
+buf[2047] = '\0';
+if (vmdk_parse_description(buf, "createType", ct, sizeof(ct))) {
+return -EINVAL;
+}
+if (strcmp(ct, "monolithicFlat")) {
+fprintf(stderr,
+"VMDK: Not supported image type \"%s\""".\n", ct);
+return -ENOTSUP;
+}
+s->desc_offset = 0;
+ret = vmdk_parse_extents(buf, bs, bs->file->filename);
+if (ret) {
+return ret;
+}
+
+/* try to open parent images, if exist */
+if (vmdk_parent_open(bs)) {
+ 

[Qemu-devel] [PATCH] qcow2: Use Qcow2Cache in writeback mode during loadvm/savevm

2011-07-19 Thread Kevin Wolf
In snapshotting there is no guest involved, so we can safely use a writeback
mode and do the flushes in the right place (i.e. at the very end). This
improves the time that creating/restoring an internal snapshot takes with an
image in writethrough mode.

Signed-off-by: Kevin Wolf 
---
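Not part of the patch: a toy model of the save/switch/restore pattern used
here. ToyCache and the toy_* functions are hypothetical and only count
flushes. toy_cache_set_writethrough() has the same shape as
qcow2_cache_set_writethrough() in the diff below: flush when switching to
writethrough, and return the previous mode.

```c
#include <stdbool.h>

/* Hypothetical toy cache; not the Qcow2Cache structure. */
typedef struct ToyCache {
    bool writethrough;
    int dirty;      /* entries modified but not yet written out */
    int flushes;    /* number of flush operations performed */
} ToyCache;

static void toy_cache_flush(ToyCache *c)
{
    if (c->dirty) {
        c->dirty = 0;
        c->flushes++;
    }
}

/* Mark one entry dirty; writethrough mode flushes immediately. */
static void toy_cache_update(ToyCache *c)
{
    c->dirty++;
    if (c->writethrough) {
        toy_cache_flush(c);
    }
}

/* Switch mode and return the previous one so the caller can restore it. */
static bool toy_cache_set_writethrough(ToyCache *c, bool enable)
{
    bool old = c->writethrough;

    if (!old && enable) {
        toy_cache_flush(c);   /* leaving writeback: push out pending entries */
    }
    c->writethrough = enable;
    return old;
}

/* Apply 'n' updates in writeback mode, then restore the old mode. */
static void toy_bulk_update(ToyCache *c, int n)
{
    bool old = toy_cache_set_writethrough(c, false);
    int i;

    for (i = 0; i < n; i++) {
        toy_cache_update(c);
    }
    toy_cache_set_writethrough(c, old);
}
```

In this model, 100 updates in writethrough mode cost 100 flushes, while the
same updates wrapped in toy_bulk_update() cost a single flush at the end.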
 block/qcow2-cache.c|   12 
 block/qcow2-refcount.c |   38 +++---
 block/qcow2.h  |2 ++
 3 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 3824739..8408847 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -312,3 +312,15 @@ found:
 c->entries[i].dirty = true;
 }
 
+bool qcow2_cache_set_writethrough(BlockDriverState *bs, Qcow2Cache *c,
+bool enable)
+{
+bool old = c->writethrough;
+
+if (!old && enable) {
+qcow2_cache_flush(bs, c);
+}
+
+c->writethrough = enable;
+return old;
+}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ac95b88..14b2f67 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -705,8 +705,15 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 BDRVQcowState *s = bs->opaque;
 uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2, l1_allocated;
 int64_t old_offset, old_l2_offset;
-int i, j, l1_modified, nb_csectors, refcount;
+int i, j, l1_modified = 0, nb_csectors, refcount;
 int ret;
+bool old_l2_writethrough, old_refcount_writethrough;
+
+/* Switch caches to writeback mode during update */
+old_l2_writethrough =
+qcow2_cache_set_writethrough(bs, s->l2_table_cache, false);
+old_refcount_writethrough =
+qcow2_cache_set_writethrough(bs, s->refcount_block_cache, false);
 
 l2_table = NULL;
 l1_table = NULL;
@@ -720,7 +727,11 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 l1_allocated = 1;
 if (bdrv_pread(bs->file, l1_table_offset,
l1_table, l1_size2) != l1_size2)
+{
+ret = -EIO;
 goto fail;
+}
+
 for(i = 0;i < l1_size; i++)
 be64_to_cpus(&l1_table[i]);
 } else {
@@ -729,7 +740,6 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 l1_allocated = 0;
 }
 
-l1_modified = 0;
 for(i = 0; i < l1_size; i++) {
 l2_offset = l1_table[i];
 if (l2_offset) {
@@ -773,6 +783,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 }
 
 if (refcount < 0) {
+ret = -EIO;
 goto fail;
 }
 }
@@ -803,6 +814,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
 }
 if (refcount < 0) {
+ret = -EIO;
 goto fail;
 } else if (refcount == 1) {
 l2_offset |= QCOW_OFLAG_COPIED;
@@ -813,6 +825,18 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 }
 }
 }
+
+ret = 0;
+fail:
+if (l2_table) {
+qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+}
+
+/* Enable writethrough cache mode again */
+qcow2_cache_set_writethrough(bs, s->l2_table_cache, old_l2_writethrough);
+qcow2_cache_set_writethrough(bs, s->refcount_block_cache,
+old_refcount_writethrough);
+
 if (l1_modified) {
 for(i = 0; i < l1_size; i++)
 cpu_to_be64s(&l1_table[i]);
@@ -824,15 +848,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 }
 if (l1_allocated)
 qemu_free(l1_table);
-return 0;
- fail:
-if (l2_table) {
-qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
-}
-
-if (l1_allocated)
-qemu_free(l1_table);
-return -EIO;
+return ret;
 }
 
 
diff --git a/block/qcow2.h b/block/qcow2.h
index e1ae3e8..6a0a21b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -228,6 +228,8 @@ int qcow2_read_snapshots(BlockDriverState *bs);
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables,
 bool writethrough);
 int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
+bool qcow2_cache_set_writethrough(BlockDriverState *bs, Qcow2Cache *c,
+bool enable);
 
 void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
-- 
1.7.6




[Qemu-devel] [PATCH 14/21] VMDK: add field BDRVVmdkState.desc_offset

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

There are several occurrences of the magic number 0x200 as the descriptor
offset within a monolithic sparse image file. This does not hold for images
with a separate descriptor file, so a field is added to BDRVVmdkState to
hold the correct value.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/vmdk.c |   27 ++-
 1 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 6d7b497..529ae90 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -81,6 +81,7 @@ typedef struct VmdkExtent {
 } VmdkExtent;
 
 typedef struct BDRVVmdkState {
+int desc_offset;
 uint32_t parent_cid;
 int num_extents;
 /* Extent array with num_extents entries, ascend ordered by address */
@@ -175,10 +176,11 @@ static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 uint32_t cid;
 const char *p_name, *cid_str;
 size_t cid_str_size;
+BDRVVmdkState *s = bs->opaque;
 
-/* the descriptor offset = 0x200 */
-if (bdrv_pread(bs->file, 0x200, desc, DESC_SIZE) != DESC_SIZE)
+if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
 return 0;
+}
 
 if (parent) {
 cid_str = "parentCID";
@@ -200,10 +202,12 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
 {
 char desc[DESC_SIZE], tmp_desc[DESC_SIZE];
 char *p_name, *tmp_str;
+BDRVVmdkState *s = bs->opaque;
 
-/* the descriptor offset = 0x200 */
-if (bdrv_pread(bs->file, 0x200, desc, DESC_SIZE) != DESC_SIZE)
-return -1;
+memset(desc, 0, sizeof(desc));
+if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
+return -EIO;
+}
 
 tmp_str = strstr(desc,"parentCID");
 pstrcpy(tmp_desc, sizeof(tmp_desc), tmp_str);
@@ -213,8 +217,9 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
 pstrcat(desc, sizeof(desc), tmp_desc);
 }
 
-if (bdrv_pwrite_sync(bs->file, 0x200, desc, DESC_SIZE) < 0)
-return -1;
+if (bdrv_pwrite_sync(bs->file, s->desc_offset, desc, DESC_SIZE) < 0) {
+return -EIO;
+}
 return 0;
 }
 
@@ -402,10 +407,11 @@ static int vmdk_parent_open(BlockDriverState *bs)
 {
 char *p_name;
 char desc[DESC_SIZE];
+BDRVVmdkState *s = bs->opaque;
 
-/* the descriptor offset = 0x200 */
-if (bdrv_pread(bs->file, 0x200, desc, DESC_SIZE) != DESC_SIZE)
+if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
 return -1;
+}
 
 if ((p_name = strstr(desc,"parentFileNameHint")) != NULL) {
 char *end_name;
@@ -506,8 +512,10 @@ static int vmdk_open_vmdk3(BlockDriverState *bs, int flags)
 int ret;
 uint32_t magic;
 VMDK3Header header;
+BDRVVmdkState *s = bs->opaque;
 VmdkExtent *extent;
 
+s->desc_offset = 0x200;
 ret = bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header));
 if (ret < 0) {
 goto fail;
@@ -539,6 +547,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs, int flags)
 BDRVVmdkState *s = bs->opaque;
 VmdkExtent *extent;
 
+s->desc_offset = 0x200;
 ret = bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header));
 if (ret < 0) {
 goto fail;
-- 
1.7.6




[Qemu-devel] [PATCH 07/21] scsi-disk: Mask out serial number EVPD

2011-07-19 Thread Kevin Wolf
From: Hannes Reinecke 

If the serial number is not set we should mask it out in the
list of supported VPD pages and mark it as not supported.

Signed-off-by: Hannes Reinecke 
Acked-by: Paolo Bonzini 
Signed-off-by: Kevin Wolf 
---
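Not part of the patch: a standalone sketch of the "supported VPD pages"
list after this change. toy_supported_vpd_pages is a hypothetical helper;
the page codes are the ones used in the diff below.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fill 'out' with the supported VPD page codes and return how many were
 * written. 'serial' may be NULL when no serial number is configured.
 * Hypothetical helper for illustration; 'out' must hold at least 4 bytes.
 */
static size_t toy_supported_vpd_pages(const char *serial, int is_hd,
                                      uint8_t *out)
{
    size_t n = 0;

    out[n++] = 0x00;        /* list of supported pages (this page) */
    if (serial) {
        out[n++] = 0x80;    /* unit serial number, only if one is set */
    }
    out[n++] = 0x83;        /* device identification */
    if (is_hd) {
        out[n++] = 0xb0;    /* block limits */
    }
    return n;
}
```

A disk without a serial number then advertises 0x00, 0x83 and 0xb0, and a
request for page 0x80 is rejected instead of dereferencing a NULL serial.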
 hw/scsi-disk.c |   15 ---
 1 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 5804662..05d14ab 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -398,7 +398,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
 "buffer size %zd\n", req->cmd.xfer);
 pages = buflen++;
 outbuf[buflen++] = 0x00; // list of supported pages (this page)
-outbuf[buflen++] = 0x80; // unit serial number
+if (s->serial)
+outbuf[buflen++] = 0x80; // unit serial number
 outbuf[buflen++] = 0x83; // device identification
 if (s->drive_kind == SCSI_HD) {
 outbuf[buflen++] = 0xb0; // block limits
@@ -409,8 +410,14 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
 }
 case 0x80: /* Device serial number, optional */
 {
-int l = strlen(s->serial);
+int l;
 
+if (!s->serial) {
+DPRINTF("Inquiry (EVPD[Serial number] not supported\n");
+return -1;
+}
+
+l = strlen(s->serial);
 if (l > req->cmd.xfer)
 l = req->cmd.xfer;
 if (l > 20)
@@ -1203,7 +1210,9 @@ static int scsi_initfn(SCSIDevice *dev, SCSIDriveKind kind)
 if (!s->serial) {
 /* try to fall back to value set with legacy -drive serial=... */
 dinfo = drive_get_by_blockdev(s->bs);
-s->serial = qemu_strdup(*dinfo->serial ? dinfo->serial : "0");
+if (*dinfo->serial) {
+s->serial = qemu_strdup(dinfo->serial);
+}
 }
 
 if (!s->version) {
-- 
1.7.6




[Qemu-devel] [PATCH 20/21] VMDK: fix coding style

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

Conform coding style in vmdk.c to pass scripts/checkpatch.pl checks.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/vmdk.c |   76 +++---
 1 files changed, 46 insertions(+), 30 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index b53c5f5..de08d0c 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -102,8 +102,9 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
 uint32_t magic;
 
-if (buf_size < 4)
+if (buf_size < 4) {
 return 0;
+}
 magic = be32_to_cpu(*(uint32_t *)buf);
 if (magic == VMDK3_MAGIC ||
 magic == VMDK4_MAGIC) {
@@ -193,9 +194,10 @@ static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 cid_str_size = sizeof("CID");
 }
 
-if ((p_name = strstr(desc,cid_str)) != NULL) {
+p_name = strstr(desc, cid_str);
+if (p_name != NULL) {
 p_name += cid_str_size;
-sscanf(p_name,"%x",&cid);
+sscanf(p_name, "%x", &cid);
 }
 
 return cid;
@@ -212,9 +214,10 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
 return -EIO;
 }
 
-tmp_str = strstr(desc,"parentCID");
+tmp_str = strstr(desc, "parentCID");
 pstrcpy(tmp_desc, sizeof(tmp_desc), tmp_str);
-if ((p_name = strstr(desc,"CID")) != NULL) {
+p_name = strstr(desc, "CID");
+if (p_name != NULL) {
 p_name += sizeof("CID");
 snprintf(p_name, sizeof(desc) - (p_name - desc), "%x\n", cid);
 pstrcat(desc, sizeof(desc), tmp_desc);
@@ -234,13 +237,14 @@ static int vmdk_is_cid_valid(BlockDriverState *bs)
 uint32_t cur_pcid;
 
 if (p_bs) {
-cur_pcid = vmdk_read_cid(p_bs,0);
-if (s->parent_cid != cur_pcid)
-// CID not valid
+cur_pcid = vmdk_read_cid(p_bs, 0);
+if (s->parent_cid != cur_pcid) {
+/* CID not valid */
 return 0;
+}
 }
 #endif
-// CID valid
+/* CID valid */
 return 1;
 }
 
@@ -255,14 +259,18 @@ static int vmdk_parent_open(BlockDriverState *bs)
 return -1;
 }
 
-if ((p_name = strstr(desc,"parentFileNameHint")) != NULL) {
+p_name = strstr(desc, "parentFileNameHint");
+if (p_name != NULL) {
 char *end_name;
 
 p_name += sizeof("parentFileNameHint") + 1;
-if ((end_name = strchr(p_name,'\"')) == NULL)
+end_name = strchr(p_name, '\"');
+if (end_name == NULL) {
 return -1;
-if ((end_name - p_name) > sizeof (bs->backing_file) - 1)
+}
+if ((end_name - p_name) > sizeof(bs->backing_file) - 1) {
 return -1;
+}
 
 pstrcpy(bs->backing_file, end_name - p_name + 1, p_name);
 }
@@ -595,8 +603,9 @@ static int get_whole_cluster(BlockDriverState *bs,
 if (bs->backing_hd) {
 int ret;
 
-if (!vmdk_is_cid_valid(bs))
+if (!vmdk_is_cid_valid(bs)) {
 return -1;
+}
 
 /* floor offset to cluster */
 offset -= offset % (extent->cluster_sectors * 512);
@@ -655,8 +664,9 @@ static int get_cluster_offset(BlockDriverState *bs,
 int min_index, i, j;
 uint32_t min_count, *l2_table, tmp = 0;
 
-if (m_data)
+if (m_data) {
 m_data->valid = 0;
+}
 if (extent->flat) {
 *cluster_offset = extent->flat_start_offset;
 return 0;
@@ -712,7 +722,7 @@ static int get_cluster_offset(BlockDriverState *bs,
 return -1;
 }
 
-// Avoid the L2 tables update for the images that have snapshots.
+/* Avoid the L2 tables update for the images that have snapshots. */
 *cluster_offset = bdrv_getlength(extent->file);
 bdrv_truncate(
 extent->file,
@@ -729,8 +739,9 @@ static int get_cluster_offset(BlockDriverState *bs,
  * or inappropriate VM shutdown.
  */
 if (get_whole_cluster(
-bs, extent, *cluster_offset, offset, allocate) == -1)
+bs, extent, *cluster_offset, offset, allocate) == -1) {
 return -1;
+}
 
 if (m_data) {
 m_data->offset = tmp;
@@ -780,8 +791,9 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num,
 
 index_in_cluster = sector_num % extent->cluster_sectors;
 n = extent->cluster_sectors - index_in_cluster;
-if (n > nb_sectors)
+if (n > nb_sectors) {
 n = nb_sectors;
+}
 *pnum = n;
 return ret;
 }
@@ -805,16 +817,19 @@ static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
 sector_num << 9, 0, &cluster_offset);
 index_in_cluster = sector_num % extent->cluster_sectors;
 n = extent->cluster_sectors - index_in_cluster;
-if (n > nb_sectors)
+if (n > nb_sectors) {
 n = nb_sectors;
+}
 if (ret) {
 /* if not allocated, try to read from 

Re: [Qemu-devel] [Xen-devel] Re: Upstream QEMU and Xen unstable not working

2011-07-19 Thread Stefano Stabellini
On Tue, 19 Jul 2011, Ian Campbell wrote:
> On Mon, 2011-07-18 at 17:17 +0100, Stefano Stabellini wrote:
> > On Mon, 18 Jul 2011, Wei Liu wrote:
> > > Stefano and Anthony, you once said that you were going to setup a
> > > public QEMU repository for Xen, how is it going now?
> > 
> > We are getting there, but there are still too many xen patches floating
> > around qemu-devel at the moment to announce a new qemu xen tree.
> 
> Isn't the presence of all those patches floating around qemu-devel and
> the need for people to trawl around collecting so as to have a working
> build exactly the problem such a tree would be intended to solve? i.e. a
> one stop place to pick up pending patches before they hit the main tree.
> 

Yes, however the "base" hasn't been stable enough so far, even with all
the patches collected together.
With Anthony's latest patch series we are almost there.



[Qemu-devel] [PATCH 21/21] block: add bdrv_get_allocated_file_size() operation

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

qemu-img.c wants to report the allocated file size of an image. Previously
it measured a single bs->file with 'stat' or the Windows API. As VMDK
introduces multiple-file support, the operation becomes format specific as
well as platform specific.

The functions are moved to block/raw-{posix,win32}.c, and qemu-img.c calls
bdrv_get_allocated_file_size to measure the bs. VMDK code is also added to
count its own extents.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
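Not part of the patch: a standalone sketch of the POSIX side of this
operation, mirroring raw_get_allocated_file_size() in the diff below.
toy_allocated_size is a hypothetical name, and the code assumes st_blocks
is counted in 512-byte units, as it is on Linux; POSIX itself leaves the
unit unspecified.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/stat.h>

/*
 * Return the space actually allocated for the open file 'fd' in bytes,
 * or a negative errno value on failure. Assumes st_blocks is counted in
 * 512-byte units.
 */
static int64_t toy_allocated_size(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -errno;
    }
    return (int64_t)st.st_blocks * 512;
}
```

For a sparse file this value can be much smaller than st_size, which is the
difference between bdrv_get_allocated_file_size() and bdrv_getlength().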
 block.c   |   19 +++
 block.h   |1 +
 block/raw-posix.c |   21 +
 block/raw-win32.c |   29 +
 block/vmdk.c  |   24 
 block_int.h   |1 +
 qemu-img.c|   31 +--
 7 files changed, 96 insertions(+), 30 deletions(-)

diff --git a/block.c b/block.c
index 24a25d5..9549b9e 100644
--- a/block.c
+++ b/block.c
@@ -1147,6 +1147,25 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 }
 
 /**
+ * Length of a allocated file in bytes. Sparse files are counted by actual
+ * allocated space. Return < 0 if error or unknown.
+ */
+int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
+{
+BlockDriver *drv = bs->drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (drv->bdrv_get_allocated_file_size) {
+return drv->bdrv_get_allocated_file_size(bs);
+}
+if (bs->file) {
+return bdrv_get_allocated_file_size(bs->file);
+}
+return -ENOTSUP;
+}
+
+/**
  * Length of a file in bytes. Return < 0 if error or unknown.
  */
 int64_t bdrv_getlength(BlockDriverState *bs)
diff --git a/block.h b/block.h
index 859d1d9..59cc410 100644
--- a/block.h
+++ b/block.h
@@ -89,6 +89,7 @@ int bdrv_write_sync(BlockDriverState *bs, int64_t sector_num,
 const uint8_t *buf, int nb_sectors);
 int bdrv_truncate(BlockDriverState *bs, int64_t offset);
 int64_t bdrv_getlength(BlockDriverState *bs);
+int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
 void bdrv_guess_geometry(BlockDriverState *bs, int *pcyls, int *pheads, int *psecs);
 int bdrv_commit(BlockDriverState *bs);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 34b64aa..cd89c83 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -793,6 +793,17 @@ static int64_t raw_getlength(BlockDriverState *bs)
 }
 #endif
 
+static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
+{
+struct stat st;
+BDRVRawState *s = bs->opaque;
+
+if (fstat(s->fd, &st) < 0) {
+return -errno;
+}
+return (int64_t)st.st_blocks * 512;
+}
+
 static int raw_create(const char *filename, QEMUOptionParameter *options)
 {
 int fd;
@@ -888,6 +899,8 @@ static BlockDriver bdrv_file = {
 
 .bdrv_truncate = raw_truncate,
 .bdrv_getlength = raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 .create_options = raw_create_options,
 };
@@ -1156,6 +1169,8 @@ static BlockDriver bdrv_host_device = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength= raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* generic scsi device */
 #ifdef __linux__
@@ -1277,6 +1292,8 @@ static BlockDriver bdrv_host_floppy = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength= raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* removable device support */
 .bdrv_is_inserted   = floppy_is_inserted,
@@ -1380,6 +1397,8 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength = raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* removable device support */
 .bdrv_is_inserted   = cdrom_is_inserted,
@@ -1503,6 +1522,8 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength = raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* removable device support */
 .bdrv_is_inserted   = cdrom_is_inserted,
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 56bd719..91067e7 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -213,6 +213,31 @@ static int64_t raw_getlength(BlockDriverState *bs)
 return l.QuadPart;
 }
 
+static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
+{
+typedef DWORD (WINAPI * get_compressed_t)(const char *filename,
+  DWORD * high);
+get_compressed_t get_compressed;
+struct _stati64 st;
+const char *filename = bs->filename;
+/* WinNT sup

[Qemu-devel] [PATCH 17/21] VMDK: change get_cluster_offset return type

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

get_cluster_offset used to return an offset, with 0 denoting 'not
allocated'. This no longer works for flat extents, because a flat extent
file is treated as a single huge cluster whose offset is 0 and whose length
is the whole file length.
So the function now returns an int: 0 means success, and any other value
means the offset is invalid.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
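Not part of the patch: a standalone sketch of the new calling convention,
where the status is the return value and the offset travels through an
out-parameter. ToyExtent, TOY_CLUSTER_SIZE and toy_get_cluster_offset are
hypothetical, and the one-level table is far simpler than the real L1/L2
lookup.

```c
#include <stdint.h>

#define TOY_CLUSTER_SIZE 65536u   /* arbitrary size for this sketch */

/* Hypothetical extent: either flat, or a tiny one-level cluster table. */
typedef struct ToyExtent {
    int flat;
    uint64_t table[4];   /* cluster start offsets; 0 means unallocated */
} ToyExtent;

/*
 * Look up the cluster containing byte 'offset'. Returns 0 and stores the
 * cluster start in *cluster_offset on success, -1 if nothing is allocated.
 */
static int toy_get_cluster_offset(const ToyExtent *e, uint64_t offset,
                                  uint64_t *cluster_offset)
{
    uint64_t index = offset / TOY_CLUSTER_SIZE;

    if (e->flat) {
        *cluster_offset = 0;   /* the whole file is one cluster at 0 */
        return 0;
    }
    if (index >= 4 || e->table[index] == 0) {
        return -1;
    }
    *cluster_offset = e->table[index];
    return 0;
}
```

The point of the split is that a valid offset of 0 (the flat case) can no
longer be confused with "not allocated".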
 block/vmdk.c |   79 ++---
 1 files changed, 42 insertions(+), 37 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 8dc58a8..f637d98 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -665,26 +665,31 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data)
 return 0;
 }
 
-static uint64_t get_cluster_offset(BlockDriverState *bs,
+static int get_cluster_offset(BlockDriverState *bs,
 VmdkExtent *extent,
 VmdkMetaData *m_data,
-uint64_t offset, int allocate)
+uint64_t offset,
+int allocate,
+uint64_t *cluster_offset)
 {
 unsigned int l1_index, l2_offset, l2_index;
 int min_index, i, j;
 uint32_t min_count, *l2_table, tmp = 0;
-uint64_t cluster_offset;
 
 if (m_data)
 m_data->valid = 0;
+if (extent->flat) {
+*cluster_offset = 0;
+return 0;
+}
 
 l1_index = (offset >> 9) / extent->l1_entry_sectors;
 if (l1_index >= extent->l1_size) {
-return 0;
+return -1;
 }
 l2_offset = extent->l1_table[l1_index];
 if (!l2_offset) {
-return 0;
+return -1;
 }
 for (i = 0; i < L2_CACHE_SIZE; i++) {
 if (l2_offset == extent->l2_cache_offsets[i]) {
@@ -714,28 +719,29 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
 l2_table,
 extent->l2_size * sizeof(uint32_t)
 ) != extent->l2_size * sizeof(uint32_t)) {
-return 0;
+return -1;
 }
 
 extent->l2_cache_offsets[min_index] = l2_offset;
 extent->l2_cache_counts[min_index] = 1;
  found:
 l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
-cluster_offset = le32_to_cpu(l2_table[l2_index]);
+*cluster_offset = le32_to_cpu(l2_table[l2_index]);
 
-if (!cluster_offset) {
-if (!allocate)
-return 0;
+if (!*cluster_offset) {
+if (!allocate) {
+return -1;
+}
 
 // Avoid the L2 tables update for the images that have snapshots.
-cluster_offset = bdrv_getlength(extent->file);
+*cluster_offset = bdrv_getlength(extent->file);
 bdrv_truncate(
 extent->file,
-cluster_offset + (extent->cluster_sectors << 9)
+*cluster_offset + (extent->cluster_sectors << 9)
 );
 
-cluster_offset >>= 9;
-tmp = cpu_to_le32(cluster_offset);
+*cluster_offset >>= 9;
+tmp = cpu_to_le32(*cluster_offset);
 l2_table[l2_index] = tmp;
 
 /* First of all we write grain itself, to avoid race condition
@@ -744,8 +750,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
  * or inappropriate VM shutdown.
  */
 if (get_whole_cluster(
-bs, extent, cluster_offset, offset, allocate) == -1)
-return 0;
+bs, extent, *cluster_offset, offset, allocate) == -1)
+return -1;
 
 if (m_data) {
 m_data->offset = tmp;
@@ -755,8 +761,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
 m_data->valid = 1;
 }
 }
-cluster_offset <<= 9;
-return cluster_offset;
+*cluster_offset <<= 9;
+return 0;
 }
 
 static VmdkExtent *find_extent(BDRVVmdkState *s,
@@ -780,7 +786,6 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num,
  int nb_sectors, int *pnum)
 {
 BDRVVmdkState *s = bs->opaque;
-
 int64_t index_in_cluster, n, ret;
 uint64_t offset;
 VmdkExtent *extent;
@@ -789,15 +794,13 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num,
 if (!extent) {
 return 0;
 }
-if (extent->flat) {
-n = extent->end_sector - sector_num;
-ret = 1;
-} else {
-offset = get_cluster_offset(bs, extent, NULL, sector_num * 512, 0);
-index_in_cluster = sector_num % extent->cluster_sectors;
-n = extent->cluster_sectors - index_in_cluster;
-ret = offset ? 1 : 0;
-}
+ret = get_cluster_offset(bs, extent, NULL,
+sector_num * 512, 0, &offset);
+/* get_cluster_offset returning 0 means success */
+ret = !ret;
+
+index_in_cluster = sector_num % extent->cluster_sectors;
+n = extent->cluster_sectors - index_in_cluster;
 i

[Qemu-devel] [PATCH 19/21] VMDK: create different subformats

2011-07-19 Thread Kevin Wolf
From: Fam Zheng 

Add a create option 'format' with the following values:
monolithicSparse
monolithicFlat
twoGbMaxExtentSparse
twoGbMaxExtentFlat
Each value creates an image of the corresponding subformat. The default is
monolithicSparse.

Signed-off-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
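Not part of the patch: a standalone sketch of how the four option values
could be mapped to layout properties. ToySubformat and toy_parse_subformat
are hypothetical names. The reading of "twoGbMaxExtent" as "split into
extents of at most 2 GB" follows the VMware naming and is an assumption of
this note, not something shown in the diff excerpt below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a subformat; not a QEMU type. */
typedef struct ToySubformat {
    bool flat;    /* data stored as a flat extent instead of sparse grains */
    bool split;   /* image split into extents of at most 2 GB (assumed) */
} ToySubformat;

/*
 * Translate the 'format' option into layout flags. A NULL option selects
 * the default, monolithicSparse. Returns 0 on success, -1 for an unknown
 * name.
 */
static int toy_parse_subformat(const char *fmt, ToySubformat *out)
{
    if (fmt == NULL) {
        fmt = "monolithicSparse";
    }
    if (strcmp(fmt, "monolithicSparse") == 0) {
        out->flat = false;
        out->split = false;
    } else if (strcmp(fmt, "monolithicFlat") == 0) {
        out->flat = true;
        out->split = false;
    } else if (strcmp(fmt, "twoGbMaxExtentSparse") == 0) {
        out->flat = false;
        out->split = true;
    } else if (strcmp(fmt, "twoGbMaxExtentFlat") == 0) {
        out->flat = true;
        out->split = true;
    } else {
        return -1;
    }
    return 0;
}
```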
 block/vmdk.c |  503 +++--
 block_int.h  |1 +
 2 files changed, 275 insertions(+), 229 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index e1fb962..b53c5f5 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -156,8 +156,9 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
 #define CHECK_CID 1
 
 #define SECTOR_SIZE 512
-#define DESC_SIZE 20*SECTOR_SIZE   // 20 sectors of 512 bytes each
-#define HEADER_SIZE 512// first sector of 512 bytes
+#define DESC_SIZE (20 * SECTOR_SIZE)/* 20 sectors of 512 bytes each */
+#define BUF_SIZE 4096
+#define HEADER_SIZE 512 /* first sector of 512 bytes */
 
 static void vmdk_free_extents(BlockDriverState *bs)
 {
@@ -243,168 +244,6 @@ static int vmdk_is_cid_valid(BlockDriverState *bs)
 return 1;
 }
 
-static int vmdk_snapshot_create(const char *filename, const char *backing_file)
-{
-int snp_fd, p_fd;
-int ret;
-uint32_t p_cid;
-char *p_name, *gd_buf, *rgd_buf;
-const char *real_filename, *temp_str;
-VMDK4Header header;
-uint32_t gde_entries, gd_size;
-int64_t gd_offset, rgd_offset, capacity, gt_size;
-char p_desc[DESC_SIZE], s_desc[DESC_SIZE], hdr[HEADER_SIZE];
-static const char desc_template[] =
-"# Disk DescriptorFile\n"
-"version=1\n"
-"CID=%x\n"
-"parentCID=%x\n"
-"createType=\"monolithicSparse\"\n"
-"parentFileNameHint=\"%s\"\n"
-"\n"
-"# Extent description\n"
-"RW %u SPARSE \"%s\"\n"
-"\n"
-"# The Disk Data Base \n"
-"#DDB\n"
-"\n";
-
-snp_fd = open(filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, 0644);
-if (snp_fd < 0)
-return -errno;
-p_fd = open(backing_file, O_RDONLY | O_BINARY | O_LARGEFILE);
-if (p_fd < 0) {
-close(snp_fd);
-return -errno;
-}
-
-/* read the header */
-if (lseek(p_fd, 0x0, SEEK_SET) == -1) {
-ret = -errno;
-goto fail;
-}
-if (read(p_fd, hdr, HEADER_SIZE) != HEADER_SIZE) {
-ret = -errno;
-goto fail;
-}
-
-/* write the header */
-if (lseek(snp_fd, 0x0, SEEK_SET) == -1) {
-ret = -errno;
-goto fail;
-}
-if (write(snp_fd, hdr, HEADER_SIZE) == -1) {
-ret = -errno;
-goto fail;
-}
-
-memset(&header, 0, sizeof(header));
-memcpy(&header,&hdr[4], sizeof(header)); // skip the VMDK4_MAGIC
-
-if (ftruncate(snp_fd, header.grain_offset << 9)) {
-ret = -errno;
-goto fail;
-}
-/* the descriptor offset = 0x200 */
-if (lseek(p_fd, 0x200, SEEK_SET) == -1) {
-ret = -errno;
-goto fail;
-}
-if (read(p_fd, p_desc, DESC_SIZE) != DESC_SIZE) {
-ret = -errno;
-goto fail;
-}
-
-if ((p_name = strstr(p_desc,"CID")) != NULL) {
-p_name += sizeof("CID");
-sscanf(p_name,"%x",&p_cid);
-}
-
-real_filename = filename;
-if ((temp_str = strrchr(real_filename, '\\')) != NULL)
-real_filename = temp_str + 1;
-if ((temp_str = strrchr(real_filename, '/')) != NULL)
-real_filename = temp_str + 1;
-if ((temp_str = strrchr(real_filename, ':')) != NULL)
-real_filename = temp_str + 1;
-
-snprintf(s_desc, sizeof(s_desc), desc_template, p_cid, p_cid, backing_file,
- (uint32_t)header.capacity, real_filename);
-
-/* write the descriptor */
-if (lseek(snp_fd, 0x200, SEEK_SET) == -1) {
-ret = -errno;
-goto fail;
-}
-if (write(snp_fd, s_desc, strlen(s_desc)) == -1) {
-ret = -errno;
-goto fail;
-}
-
-gd_offset = header.gd_offset * SECTOR_SIZE; // offset of GD table
-rgd_offset = header.rgd_offset * SECTOR_SIZE;   // offset of RGD table
-capacity = header.capacity * SECTOR_SIZE;   // Extent size
-/*
- * Each GDE span 32M disk, means:
- * 512 GTE per GT, each GTE points to grain
- */
-gt_size = (int64_t)header.num_gtes_per_gte * header.granularity * 
SECTOR_SIZE;
-if (!gt_size) {
-ret = -EINVAL;
-goto fail;
-}
-gde_entries = (uint32_t)(capacity / gt_size);  // number of gde/rgde
-gd_size = gde_entries * sizeof(uint32_t);
-
-/* write RGD */
-rgd_buf = qemu_malloc(gd_size);
-if (lseek(p_fd, rgd_offset, SEEK_SET) == -1) {
-ret = -errno;
-goto fail_rgd;
-}
-if (read(p_fd, rgd_buf, gd_size) != gd_size) {
-ret = -errno;
-goto fail_rgd;
-}
-if (lseek(snp_fd, rgd_offset, SEEK_SET) == -1) {
-ret = -errno;
-goto fail_rgd;
-}
-if (write(snp_fd, rgd_buf, gd_size) 

Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Stefano Stabellini
On Tue, 19 Jul 2011, Wei Liu wrote:
> Good, this is it.
> 
> But this patch is not yet pulled in the tree.
 
I pushed a few commits that I had in my local tree; they should be in
xen-next now.



Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Kevin Wolf
On 19.07.2011 11:25, Robert Wang wrote:
> As you know, raw images are very popular, but the raw image format does
> NOT support Copy-On-Write, so a raw image file can NOT be used as a copy
> destination, and image streaming/Live Block Copy will NOT work.
> 
> To fix this, we need to add a new block driver raw-cow to QEMU. If
> finished, we can use qemu-img like this:
> qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img
> my_vm.raw-cow

Just one comment to start with: this is not only useful for raw (although
that is certainly the most important case), but also for every other image
format for which qemu doesn't support backing files.

This means that we should look for a better name than raw-cow. I know,
it was me who introduced this, but only for lack of a better name.

> 1) ubuntu.img is the backing file, my_vm.img is a raw file,
> my_vm.raw-cow stores a COW bitmap related to my_vm.img.
> 
> 2) If every bit in the COW bitmap is set to dirty, then we can get all
> information from my_vm.img and can ignore ubuntu.img and my_vm.raw-cow
> from then on.
> 
> To implement this, I think I can follow these steps:
> 1) Add a new member to BlockDriverState struct:
> char raw_file[1024];
> This member will track raw_file parameter related to raw-cow file from
> command line.

Can't this be private to the COW driver? It certainly will have a
BDRVRawCowState or something like that.

> 2)* Create a new file block/raw-cow.c. It will be much more like the
> mixture of block/cow.c and block/raw.c.
> 
> So I will change some functions in cow.c and raw.c to non-static, then
> raw-cow.c can re-use them.

I think it's better to keep drivers cleanly separated. If we really need
to share code, we should provide some sort of a library that is used by
both, but I doubt that it's required in this case.

What the driver should probably do is open the raw file internally
and keep a BlockDriverState of the raw file in its private structure.
For all accesses to the raw file, use the "official" interfaces of the
raw driver, like bdrv_aio_readv/writev.
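
A minimal sketch of that arrangement, under stated assumptions: FakeBDS is a
hypothetical in-memory stand-in for a BlockDriverState, and the raw_cow_*
helpers and the layout of BDRVRawCowState are made up for illustration. A
real driver would go through the block layer (bdrv_aio_readv/writev) where
this sketch uses memcpy. It only shows the private state holding the raw
file and a per-sector dirty bitmap deciding where a read is served from.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical stand-in for a BlockDriverState: a flat in-memory image. */
typedef struct FakeBDS {
    uint8_t *data;
    uint64_t nb_sectors;
} FakeBDS;

/* Hypothetical private state of the COW driver (not a real QEMU struct). */
typedef struct BDRVRawCowState {
    FakeBDS *raw;      /* raw file, opened and kept by the driver itself */
    FakeBDS *backing;  /* backing file, only ever read */
    uint8_t *bitmap;   /* one bit per sector; set = valid in the raw file */
} BDRVRawCowState;

static FakeBDS *fake_bds_new(uint64_t nb_sectors, uint8_t fill)
{
    FakeBDS *bs = malloc(sizeof(*bs));
    if (!bs) {
        return NULL;
    }
    bs->data = malloc(nb_sectors * SECTOR_SIZE);
    if (!bs->data) {
        free(bs);
        return NULL;
    }
    memset(bs->data, fill, nb_sectors * SECTOR_SIZE);
    bs->nb_sectors = nb_sectors;
    return bs;
}

static BDRVRawCowState *raw_cow_new(FakeBDS *raw, FakeBDS *backing)
{
    BDRVRawCowState *s = malloc(sizeof(*s));
    if (!s) {
        return NULL;
    }
    s->raw = raw;
    s->backing = backing;
    s->bitmap = calloc((raw->nb_sectors + 7) / 8, 1); /* all clean at first */
    if (!s->bitmap) {
        free(s);
        return NULL;
    }
    return s;
}

static int raw_cow_is_dirty(const BDRVRawCowState *s, uint64_t sector)
{
    return (s->bitmap[sector / 8] >> (sector % 8)) & 1;
}

/* Read one sector: from the raw file if dirty, else from the backing file. */
static int raw_cow_read_sector(BDRVRawCowState *s, uint64_t sector,
                               uint8_t *buf)
{
    const FakeBDS *src;

    if (sector >= s->raw->nb_sectors) {
        return -1;
    }
    src = raw_cow_is_dirty(s, sector) ? s->raw : s->backing;
    if (sector >= src->nb_sectors) {
        memset(buf, 0, SECTOR_SIZE); /* backing file shorter than the image */
        return 0;
    }
    memcpy(buf, src->data + sector * SECTOR_SIZE, SECTOR_SIZE);
    return 0;
}

/* Write one sector to the raw file, then mark it dirty in the bitmap. */
static int raw_cow_write_sector(BDRVRawCowState *s, uint64_t sector,
                                const uint8_t *buf)
{
    if (sector >= s->raw->nb_sectors) {
        return -1;
    }
    memcpy(s->raw->data + sector * SECTOR_SIZE, buf, SECTOR_SIZE);
    s->bitmap[sector / 8] |= (uint8_t)(1u << (sector % 8));
    return 0;
}
```

Nothing outside the driver needs to know about the raw file in this layout,
which is the point of keeping it out of the generic BlockDriverState.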

>  When read operation occurs, determine whether
> dirty flag in raw-cow image is set. If true, read directly from the raw
> file. After write operation, set related dirty flag in raw-cow image.
> And other functions might also be modified.
> 
>   * Of course, the format_name member of the BlockDriver struct will be
> "raw-cow". And in order to keep the relationship with the raw file (like
> my_vm.img), the raw_cow_header struct should be
> struct raw_cow_header {
> uint32_t magic;
> uint32_t version;
> char backing_file[1024];
> char raw_file[1024];/* added*/
> int32_t mtime;
> uint64_t size;
> uint32_t sectorsize;

I don't think any of mtime, size and sectorsize are necessary. They will
just be taken from the raw file (if needed at all).

> };
>   * Struct raw_cow_create_options should be cow_create_options plus one
> additional member:
> {
> .name = BLOCK_OPT_RAW_FILE,
> .type = OPT_STRING,
> .help = "Raw file name"
> },
> 
> 3) Add bdrv_get_raw_filename in img_info function of qemu-img.c. In
> bdrv_get_raw_filename, if the format of the image file is "raw-cow",
> print the related raw file.

Hm... Won't be implemented by any other driver, but I guess it makes
some sense to provide this information.

Kevin



Re: [Qemu-devel] [PATCH 3/3] qemu-x86: Set tsc_khz in kvm when supported

2011-07-19 Thread Marcelo Tosatti
On Thu, Jul 07, 2011 at 04:13:13PM +0200, Joerg Roedel wrote:
> Make use of the KVM_TSC_CONTROL feature if available.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  target-i386/kvm.c |   18 +-
>  1 files changed, 17 insertions(+), 1 deletions(-)
> 
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 10fb2c4..923d2d5 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -354,6 +354,7 @@ int kvm_arch_init_vcpu(CPUState *env)
>  uint32_t unused;
>  struct kvm_cpuid_entry2 *c;
>  uint32_t signature[3];
> +int r;
>  
>  env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
>  
> @@ -499,7 +500,22 @@ int kvm_arch_init_vcpu(CPUState *env)
>  
>  qemu_add_vm_change_state_handler(cpu_update_state, env);
>  
> -return kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
> +r = kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
> +if (r)
> + return r;
> +
> +#ifdef KVM_CAP_TSC_CONTROL
> +r = kvm_check_extension(env->kvm_state, KVM_CAP_TSC_CONTROL);
> +if (r && env->tsc_khz) {
> +r = kvm_vcpu_ioctl(env, KVM_SET_TSC_KHZ, env->tsc_khz);
> +if (r < 0) {
> +fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
> +return r;
> +}
> +}
> +#endif

And this should be moved to kvm_arch_put_registers, to be done when
level == KVM_PUT_FULL_STATE.
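
A rough model of that placement, not the real code: MockCPU, the PUT_*
levels and the mock_* functions below are invented stand-ins (the real
pieces would be CPUState, the KVM_PUT_* levels and kvm_vcpu_ioctl() with
KVM_SET_TSC_KHZ), and the numeric values of the levels are illustrative.
The sketch only shows the frequency being pushed when the full state is
written, instead of once at vcpu init.

```c
#include <stdbool.h>

/* Illustrative levels; names follow the KVM_PUT_* levels mentioned above. */
enum put_level {
    PUT_RUNTIME_STATE = 1,
    PUT_RESET_STATE   = 2,
    PUT_FULL_STATE    = 3,
};

typedef struct MockCPU {
    int tsc_khz;           /* 0 means "not configured by the user" */
    bool has_tsc_control;  /* stands in for the KVM_CAP_TSC_CONTROL check */
    int set_tsc_calls;     /* how often the mock ioctl ran */
    int host_tsc_khz;      /* value last handed to the mock ioctl */
} MockCPU;

/* Stand-in for kvm_vcpu_ioctl(env, KVM_SET_TSC_KHZ, khz). */
static int mock_set_tsc_khz(MockCPU *cpu, int khz)
{
    cpu->set_tsc_calls++;
    cpu->host_tsc_khz = khz;
    return 0;
}

/* Stand-in for kvm_arch_put_registers(). */
static int mock_put_registers(MockCPU *cpu, int level)
{
    /* ... runtime register state would be written here ... */

    if (level == PUT_FULL_STATE && cpu->has_tsc_control && cpu->tsc_khz) {
        int r = mock_set_tsc_khz(cpu, cpu->tsc_khz);
        if (r < 0) {
            return r;
        }
    }
    return 0;
}
```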




Re: [Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Stefan Hajnoczi
On Tue, Jul 19, 2011 at 11:10 AM, Kevin Wolf  wrote:
> On 19.07.2011 10:06, Frediano Ziglio wrote:
>> 2- memory considerations on coroutines. Besides coroutines allowing more
>> readable code, I wonder if somebody has considered memory. For every
>> coroutine a different stack has to be allocated. For instance, the
>> ucontext and win32 implementations use 4 MB. Assuming 128 concurrent AIO
>> requests, this requires about 512 MB of RAM (mostly only committed but not
>> used, and coroutines are reused).
>
> 128 concurrent requests is a lot. And even then, it's only virtual
> memory. I doubt that we're actually using much more than we do in the
> old code with the AIOCBs (which will disappear and become local
> variables when we complete the conversion).

From what I understand "committed" on Windows means that physical
pages have been allocated and pagefile space has been set aside:
http://msdn.microsoft.com/en-us/library/ms810627.aspx

On Linux memory is overcommitted and will not require swap space or
any actual pages.  This behavior can be configured differently IIRC
but the default is to be lazy about claiming memory resources so that
even 4 MB thread/coroutine stacks are not an issue.
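
To make the Linux side concrete, here is a small sketch, assuming Linux and
mmap() with MAP_NORESERVE. It is not how the coroutine code allocates its
stacks; stack_reserve() and reserve_and_touch() are names invented for this
example. Each 4 MB region is only address space until a page in it is
written, so touching one byte per stack should fault in roughly one page
per stack rather than the whole region.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for MAP_ANONYMOUS and MAP_NORESERVE */
#endif
#include <stddef.h>
#include <sys/mman.h>

#define STACK_SIZE ((size_t)4 * 1024 * 1024)
#define MAX_STACKS 128

/* Reserve address space for one stack without claiming swap up front. */
static void *stack_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Reserve up to n stacks at the same time, write one byte at the top of
 * each (stacks grow downwards), then release them all.  Returns how many
 * stacks could be reserved.
 */
static int reserve_and_touch(int n)
{
    void *stacks[MAX_STACKS];
    int ok;
    int i;

    if (n > MAX_STACKS) {
        n = MAX_STACKS;
    }
    for (ok = 0; ok < n; ok++) {
        stacks[ok] = stack_reserve(STACK_SIZE);
        if (!stacks[ok]) {
            break;
        }
        ((volatile unsigned char *)stacks[ok])[STACK_SIZE - 1] = 1;
    }
    for (i = 0; i < ok; i++) {
        munmap(stacks[i], STACK_SIZE);
    }
    return ok;
}
```

Calling reserve_and_touch(128) corresponds to the 128 concurrent requests
discussed above: 512 MB of address space is reserved, but only the touched
pages need backing memory.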

The question is how we can get the same effect on Windows, and whether the
current Fibers implementation does not already work that way.

Stefan



Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Marcelo Tosatti
On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
> To let the user configure the desired tsc frequency for the
> guest if running in KVM.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  target-i386/cpu.h   |1 +
>  target-i386/cpuid.c |   13 +
>  2 files changed, 14 insertions(+), 0 deletions(-)
> 
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index cdf68ff..399e124 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -743,6 +743,7 @@ typedef struct CPUX86State {
>  uint32_t cpuid_kvm_features;
>  uint32_t cpuid_svm_features;
>  bool tsc_valid;
> +int tsc_khz;

This should be saved/restored in migration data (missing VMSTATE entry).

>  /* in order to simplify APIC support, we leave this pointer to the
> user */
> diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
> index e1ae3af..89e9623 100644
> --- a/target-i386/cpuid.c
> +++ b/target-i386/cpuid.c
> @@ -224,6 +224,7 @@ typedef struct x86_def_t {
>  int family;
>  int model;
>  int stepping;
> +int tsc_khz;
>  uint32_t features, ext_features, ext2_features, ext3_features;
>  uint32_t kvm_features, svm_features;
>  uint32_t xlevel;
> @@ -704,6 +705,17 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, 
> const char *cpu_model)
>  } else if (!strcmp(featurestr, "model_id")) {
>  pstrcpy(x86_cpu_def->model_id, sizeof(x86_cpu_def->model_id),
>  val);
> +} else if (!strcmp(featurestr, "tsc_freq")) {
> +int64_t tsc_freq;
> +char *err;
> +
> +tsc_freq = strtosz_suffix_unit(val, &err,
> +   STRTOSZ_DEFSUFFIX_B, 1000);
> +if (!*val || *err) {
> +fprintf(stderr, "bad numerical value %s\n", val);
> +goto error;
> +}
> +x86_cpu_def->tsc_khz = tsc_freq / 1000;
>  } else {
>  fprintf(stderr, "unrecognized feature %s\n", featurestr);
>  goto error;
> @@ -872,6 +884,7 @@ int cpu_x86_register (CPUX86State *env, const char 
> *cpu_model)
>  env->cpuid_svm_features = def->svm_features;
>  env->cpuid_ext4_features = def->ext4_features;
>  env->cpuid_xlevel2 = def->xlevel2;
> +env->tsc_khz = def->tsc_khz;
>  if (!kvm_enabled()) {
>  env->cpuid_features &= TCG_FEATURES;
>  env->cpuid_ext_features &= TCG_EXT_FEATURES;
> -- 
> 1.7.4.1
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Wei Liu
On Tue, 2011-07-19 at 12:09 +0100, Stefano Stabellini wrote:
> On Tue, 19 Jul 2011, Wei Liu wrote:
> > Good, this is it.
> > 
> > But this patch is not yet pulled in the tree.
>  
> I pushed few commits that I had in my local tree, they should be in
> xen-next now.

That is commit 2aa8f492c85604b91b263350560042d632fcdeb2 in your xen-next
tree:

Author: Anthony PERARD 
Date:   Wed Jul 6 18:58:14 2011 +0100

hw/piix_pci.c: Fix PIIX3-xen to initialize ids

Signed-off-by: Anthony PERARD 

The content doesn't match the description.

It patches hw/ide/piix.c, which doesn't solve the problem. It should be
hw/piix_pci.c .

It seems that you picked up the wrong patch. The right one is here:

http://marc.info/?l=qemu-devel&m=130876651402847&w=2

Wei.




Re: [Qemu-devel] [PATCH 0/2] netdev fixes

2011-07-19 Thread Michael S. Tsirkin
On Thu, Jun 16, 2011 at 06:45:35PM +0200, Markus Armbruster wrote:
> Markus Armbruster (2):
>   Fix automatically assigned network names for netdev
>   Fix netdev name lookup in -device, device_add, netdev_del
> 
>  net.c |   19 +++
>  1 files changed, 15 insertions(+), 4 deletions(-)

Thanks, applied.
I think going forward we should bring more order into the way we assign
IDs.

> -- 
> 1.7.2.3
> 



[Qemu-devel] [PATCH] do not call monitor_resume() from migrate_fd_put_buffer() error path

2011-07-19 Thread Michael Tokarev
If we do, monitor_resume() ends up being called twice (the second call
comes from migrate_fd_cleanup() anyway) and the monitor suspend count
becomes negative.

Cc'ing people from the `git blame' list for the lines in question: the
change fixes the problem, but I'm not sure what the original intention
of this code was in this place.  Unfortunately no one replied to my two
attempts to raise this issue.

Signed-Off-By: Michael Tokarev 
---
 migration.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/migration.c b/migration.c
index af3a1f2..115588c 100644
--- a/migration.c
+++ b/migration.c
@@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void 
*data, size_t size)
 if (ret == -EAGAIN) {
 qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s);
 } else if (ret < 0) {
-if (s->mon) {
-monitor_resume(s->mon);
-}
 s->state = MIG_STATE_ERROR;
 notifier_list_notify(&migration_state_notifiers);
 }
-- 
1.7.2.5
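
The effect on the counter can be shown with a toy model. This is a
simplified sketch, not the monitor code: the Monitor struct here has only
the counter, the two helpers just increment and decrement it, and
failed_migration_suspend_cnt() is a name invented for the example. It
assumes the monitor was suspended once when the migration was started.

```c
/* Toy monitor: only the suspend counter is modelled. */
typedef struct Monitor {
    int suspend_cnt;
} Monitor;

static void monitor_suspend(Monitor *mon)
{
    mon->suspend_cnt++;
}

static void monitor_resume(Monitor *mon)
{
    mon->suspend_cnt--;
}

/*
 * Walk through a migration that fails in the put_buffer path and return
 * the final suspend count.  resume_in_error_path selects the behaviour
 * before (1) or after (0) the patch.
 */
static int failed_migration_suspend_cnt(int resume_in_error_path)
{
    Monitor mon = { 0 };

    monitor_suspend(&mon);        /* migration started from the monitor */

    if (resume_in_error_path) {
        monitor_resume(&mon);     /* the lines removed by the patch */
    }

    monitor_resume(&mon);         /* done later by the cleanup path */

    return mon.suspend_cnt;
}
```

With the extra resume in the error path the count ends at -1; without it
the suspend and the cleanup resume balance out at 0.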




Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Stefano Stabellini
On Tue, 19 Jul 2011, Wei Liu wrote:
> On Tue, 2011-07-19 at 12:09 +0100, Stefano Stabellini wrote:
> > On Tue, 19 Jul 2011, Wei Liu wrote:
> > > Good, this is it.
> > > 
> > > But this patch is not yet pulled in the tree.
> >  
> > I pushed few commits that I had in my local tree, they should be in
> > xen-next now.
> 
> The commit 2aa8f492c85604b91b263350560042d632fcdeb2 in your xen-next
> tree.
> 
> Author: Anthony PERARD 
> Date:   Wed Jul 6 18:58:14 2011 +0100
> 
> hw/piix_pci.c: Fix PIIX3-xen to initialize ids
> 
> Signed-off-by: Anthony PERARD 
> 
> The content mismatch with description.
> 
> It patches hw/ide/piix.c, which doesn't solve the problem. It should be
> hw/piix_pci.c .
> 
> Seems that you got the wrong patch. The right one is here.
> 
> http://marc.info/?l=qemu-devel&m=130876651402847&w=2
> 

Yeah, I realized it right after sending the email.

Unfortunately there is also a regression in Xen, similar to the one
fixed by 23550, that causes PV on HVM guests to hang during boot at the
moment.
The offending commit is CS 23573:

"replace d->nr_pirqs sized arrays with radix tree"



[Qemu-devel] [PATCHv2] target-arm: support for ARM1176JZF-s cores

2011-07-19 Thread Jamie Iles
Add support for v6K ARM1176JZF-S.  This core includes the VA<->PA
translation capability and security extensions.

v2: Model the version with the VFP

Cc: Peter Maydell 
Cc: Paul Brook 
Cc: Aurelien Jarno 
Signed-off-by: Jamie Iles 
---
 target-arm/cpu.h|1 +
 target-arm/helper.c |   23 +++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 01f5b57..8708f9e 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -414,6 +414,7 @@ void cpu_arm_set_cp_io(CPUARMState *env, int cpnum,
 #define ARM_CPUID_PXA270_C5   0x69054117
 #define ARM_CPUID_ARM1136 0x4117b363
 #define ARM_CPUID_ARM1136_R2  0x4107b362
+#define ARM_CPUID_ARM1176 0x410fb767
 #define ARM_CPUID_ARM11MPCORE 0x410fb022
 #define ARM_CPUID_CORTEXA8 0x410fc080
 #define ARM_CPUID_CORTEXA9 0x410fc090
diff --git a/target-arm/helper.c b/target-arm/helper.c
index eda881b..c5ba5a6 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -36,6 +36,12 @@ static uint32_t arm1136_cp15_c0_c1[8] =
 static uint32_t arm1136_cp15_c0_c2[8] =
 { 0x00140011, 0x12002111, 0x1123, 0x01102131, 0x141, 0, 0, 0 };
 
+static uint32_t arm1176_cp15_c0_c1[8] =
+{ 0x111, 0x11, 0x33, 0x01130003, 0x01130003, 0x10030302, 0x01222100, 0 };
+
+static uint32_t arm1176_cp15_c0_c2[8] =
+{ 0x0140011, 0x12002111, 0x11231121, 0x01102131, 0x01141, 0, 0, 0 };
+
 static uint32_t cpu_arm_find_by_name(const char *name);
 
 static inline void set_feature(CPUARMState *env, int feature)
@@ -86,6 +92,21 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c0_cachetype = 0x1dd20d2;
 env->cp15.c1_sys = 0x00050078;
 break;
+case ARM_CPUID_ARM1176:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
+set_feature(env, ARM_FEATURE_V6);
+set_feature(env, ARM_FEATURE_V6K);
+set_feature(env, ARM_FEATURE_VFP);
+set_feature(env, ARM_FEATURE_AUXCR);
+env->vfp.xregs[ARM_VFP_FPSID] = 0x410120b5;
+env->vfp.xregs[ARM_VFP_MVFR0] = 0x;
+env->vfp.xregs[ARM_VFP_MVFR1] = 0x;
+memcpy(env->cp15.c0_c1, arm1176_cp15_c0_c1, 8 * sizeof(uint32_t));
+memcpy(env->cp15.c0_c2, arm1176_cp15_c0_c2, 8 * sizeof(uint32_t));
+env->cp15.c0_cachetype = 0x1dd20d2;
+env->cp15.c1_sys = 0x00050078;
+break;
 case ARM_CPUID_ARM11MPCORE:
 set_feature(env, ARM_FEATURE_V4T);
 set_feature(env, ARM_FEATURE_V5);
@@ -377,6 +398,7 @@ static const struct arm_cpu_t arm_cpu_names[] = {
 { ARM_CPUID_ARM1026, "arm1026"},
 { ARM_CPUID_ARM1136, "arm1136"},
 { ARM_CPUID_ARM1136_R2, "arm1136-r2"},
+{ ARM_CPUID_ARM1176, "arm1176"},
 { ARM_CPUID_ARM11MPCORE, "arm11mpcore"},
 { ARM_CPUID_CORTEXM3, "cortex-m3"},
 { ARM_CPUID_CORTEXA8, "cortex-a8"},
@@ -1770,6 +1792,7 @@ uint32_t HELPER(get_cp15)(CPUState *env, uint32_t insn)
 return 1;
 case ARM_CPUID_ARM1136:
 case ARM_CPUID_ARM1136_R2:
+case ARM_CPUID_ARM1176:
 return 7;
 case ARM_CPUID_ARM11MPCORE:
 return 1;
-- 
1.7.4.1




Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity

On 07/19/2011 02:46 PM, Marcelo Tosatti wrote:
> On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
> >  To let the user configure the desired tsc frequency for the
> >  guest if running in KVM.
> >
> >  Signed-off-by: Joerg Roedel
> >  ---
> >   target-i386/cpu.h   |1 +
> >   target-i386/cpuid.c |   13 +
> >   2 files changed, 14 insertions(+), 0 deletions(-)
> >
> >  diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> >  index cdf68ff..399e124 100644
> >  --- a/target-i386/cpu.h
> >  +++ b/target-i386/cpu.h
> >  @@ -743,6 +743,7 @@ typedef struct CPUX86State {
> >   uint32_t cpuid_kvm_features;
> >   uint32_t cpuid_svm_features;
> >   bool tsc_valid;
> >  +int tsc_khz;
>
> This should be saved/restored in migration data (missing VMSTATE entry).

Why?  It's static data.  Traditionally we only migrate runtime data.

(although we've been talking about starting a naked qemu and pushing all
of the configuration from the source).


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Anthony Liguori

On 07/19/2011 05:15 AM, Kevin Wolf wrote:

From: Hannes Reinecke

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.
'tag' is still being kept for tracing purposes.

Signed-off-by: Hannes Reinecke
Acked-by: Paolo Bonzini
Signed-off-by: Kevin Wolf
---
  hw/esp.c  |2 +-
  hw/lsi53c895a.c   |   22 --
  hw/scsi-bus.c |9 ++---
  hw/scsi-disk.c|4 ++--
  hw/scsi-generic.c |5 +++--
  hw/scsi.h |   10 +++---
  hw/spapr_vscsi.c  |   29 +
  hw/usb-msd.c  |9 +
  8 files changed, 37 insertions(+), 53 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index aa50800..9ddd637 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t 
busid)

  DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
  lun = busid & 7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
  datalen = scsi_req_enqueue(s->current_req, buf);
  s->ti_size = datalen;
  if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..69eec1d 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t 
tag)
  static void lsi_request_cancelled(SCSIRequest *req)
  {
  LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
-lsi_request *p;
+lsi_request *p = req->hba_private;

  if (s->current && req == s->current->req) {
  scsi_req_unref(req);
@@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
  return;
  }

-p = lsi_find_by_tag(s, req->tag);
  if (p) {
  QTAILQ_REMOVE(&s->queue, p, next);
  scsi_req_unref(req);
@@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)

  /* Record that data is available for a queued command.  Returns zero if
 the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
  {
-lsi_request *p;
-
-p = lsi_find_by_tag(s, tag);
-if (!p) {
-BADF("IO with unknown tag %d\n", tag);
-return 1;
-}
+lsi_request *p = req->hba_private;

  if (p->pending) {
-BADF("Multiple IO pending for tag %d\n", tag);
+BADF("Multiple IO pending for request %p\n", p);
  }
  p->pending = len;
  /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t 
len)
  LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
  int out;

-if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
  (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
-if (lsi_queue_tag(s, req->tag, len)) {
+if (lsi_queue_req(s, req, len)) {
  return;
  }
  }
@@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
  assert(s->current == NULL);
  s->current = qemu_mallocz(sizeof(lsi_request));
  s->current->tag = s->select_tag;
-s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
+   s->current);

  n = scsi_req_enqueue(s->current->req, buf);
  if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..8b1a412 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
  return res;
  }

-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t 
lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
+uint32_t lun, void *hba_private)
  {
  SCSIRequest *req;

@@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, 
uint32_t tag, uint32_t l
  req->dev = d;
  req->tag = tag;
  req->lun = lun;
+req->hba_private = hba_private;
  req->status = -1;
  trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
  return req;
  }

-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
+  void *hba_private)
  {
-return d->info->alloc_req(d, tag, lun);
+return d->info->alloc_req(d, tag, lun, hba_private);
  }

  uint8_t *scsi_req_get_buf(SCSIRequest *req)
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a8c7372..c2a99fe 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ 

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Marcelo Tosatti
On Tue, Jul 19, 2011 at 03:20:37PM +0300, Avi Kivity wrote:
> On 07/19/2011 02:46 PM, Marcelo Tosatti wrote:
> >On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
> >>  To let the user configure the desired tsc frequency for the
> >>  guest if running in KVM.
> >>
> >>  Signed-off-by: Joerg Roedel
> >>  ---
> >>   target-i386/cpu.h   |1 +
> >>   target-i386/cpuid.c |   13 +
> >>   2 files changed, 14 insertions(+), 0 deletions(-)
> >>
> >>  diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> >>  index cdf68ff..399e124 100644
> >>  --- a/target-i386/cpu.h
> >>  +++ b/target-i386/cpu.h
> >>  @@ -743,6 +743,7 @@ typedef struct CPUX86State {
> >>   uint32_t cpuid_kvm_features;
> >>   uint32_t cpuid_svm_features;
> >>   bool tsc_valid;
> >>  +int tsc_khz;
> >
> >This should be saved/restore in migration data (missing VMSTATE entry).
> 
> Why?  It's static data.  Traditionally we only migrate runtime data.
> 
> (although we've been talking about starting a naked qemu and pushing
> all of the configuration from the source).

Right.




[Qemu-devel] [PATCH] Add missing documentation for qemu-img -p

2011-07-19 Thread Jes . Sorensen
From: Jes Sorensen 

Signed-off-by: Jes Sorensen 
---
 qemu-img-cmds.hx |4 ++--
 qemu-img.texi|6 --
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 2b70618..1299e83 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -30,7 +30,7 @@ ETEXI
 DEF("convert", img_convert,
 "convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s 
snapshot_name] filename [filename2 [...]] output_filename")
 STEXI
-@item convert [-c] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s 
@var{snapshot_name}] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
+@item convert [-c] [-p] [-f @var{fmt}] [-O @var{output_fmt}] [-o 
@var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
 ETEXI
 
 DEF("info", img_info,
@@ -48,7 +48,7 @@ ETEXI
 DEF("rebase", img_rebase,
 "rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F backing_fmt] 
filename")
 STEXI
-@item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] 
@var{filename}
+@item rebase [-f @var{fmt}] [-p] [-u] -b @var{backing_file} [-F 
@var{backing_fmt}] @var{filename}
 ETEXI
 
 DEF("resize", img_resize,
diff --git a/qemu-img.texi b/qemu-img.texi
index 526474c..495a1b6 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -38,6 +38,8 @@ by the used format or see the format descriptions below for 
details.
 indicates that target image must be compressed (qcow format only)
 @item -h
 with or without a command shows help and lists the supported formats
+@item -p
+display progress bar (convert and rebase commands only)
 @end table
 
 Parameters to snapshot subcommand:
@@ -84,7 +86,7 @@ it doesn't need to be specified separately in this case.
 
 Commit the changes recorded in @var{filename} in its base image.
 
-@item convert [-c] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s 
@var{snapshot_name}] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
+@item convert [-c] [-p] [-f @var{fmt}] [-O @var{output_fmt}] [-o 
@var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
 
 Convert the disk image @var{filename} or a snapshot @var{snapshot_name} to 
disk image @var{output_filename}
 using format @var{output_fmt}. It can be optionally compressed (@code{-c}
@@ -114,7 +116,7 @@ they are displayed too.
 
 List, apply, create or delete snapshots in image @var{filename}.
 
-@item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] 
@var{filename}
+@item rebase [-f @var{fmt}] [-p] [-u] -b @var{backing_file} [-F 
@var{backing_fmt}] @var{filename}
 
 Changes the backing file of an image. Only the formats @code{qcow2} and
 @code{qed} support changing the backing file.
-- 
1.7.4.4




Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Kevin Wolf
On 19.07.2011 14:43, Anthony Liguori wrote:
> On 07/19/2011 05:15 AM, Kevin Wolf wrote:
>> From: Hannes Reinecke
>>
>> 'tag' is just an abstraction to identify the command
>> from the driver. So we should make that explicit by
>> replacing 'tag' with a driver-defined pointer 'hba_private'.
>> This saves the lookup for driver handling several commands
>> in parallel.
>> 'tag' is still being kept for tracing purposes.
>>
>> Signed-off-by: Hannes Reinecke
>> Acked-by: Paolo Bonzini
>> Signed-off-by: Kevin Wolf
>> ---
>>   hw/esp.c  |2 +-
>>   hw/lsi53c895a.c   |   22 --
>>   hw/scsi-bus.c |9 ++---
>>   hw/scsi-disk.c|4 ++--
>>   hw/scsi-generic.c |5 +++--
>>   hw/scsi.h |   10 +++---
>>   hw/spapr_vscsi.c  |   29 +
>>   hw/usb-msd.c  |9 +
>>   8 files changed, 37 insertions(+), 53 deletions(-)
>>
>> diff --git a/hw/esp.c b/hw/esp.c
>> index aa50800..9ddd637 100644
>> --- a/hw/esp.c
>> +++ b/hw/esp.c
>> @@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, 
>> uint8_t busid)
>>
>>   DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
>>   lun = busid & 7;
>> -s->current_req = scsi_req_new(s->current_dev, 0, lun);
>> +s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
>>   datalen = scsi_req_enqueue(s->current_req, buf);
>>   s->ti_size = datalen;
>>   if (datalen != 0) {
>> diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
>> index 940b43a..69eec1d 100644
>> --- a/hw/lsi53c895a.c
>> +++ b/hw/lsi53c895a.c
>> @@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, 
>> uint32_t tag)
>>   static void lsi_request_cancelled(SCSIRequest *req)
>>   {
>>   LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
>> -lsi_request *p;
>> +lsi_request *p = req->hba_private;
>>
>>   if (s->current && req == s->current->req) {
>>   scsi_req_unref(req);
>> @@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
>>   return;
>>   }
>>
>> -p = lsi_find_by_tag(s, req->tag);
>>   if (p) {
>>   QTAILQ_REMOVE(&s->queue, p, next);
>>   scsi_req_unref(req);
>> @@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)
>>
>>   /* Record that data is available for a queued command.  Returns zero if
>>  the device was reselected, nonzero if the IO is deferred.  */
>> -static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
>> +static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
>>   {
>> -lsi_request *p;
>> -
>> -p = lsi_find_by_tag(s, tag);
>> -if (!p) {
>> -BADF("IO with unknown tag %d\n", tag);
>> -return 1;
>> -}
>> +lsi_request *p = req->hba_private;
>>
>>   if (p->pending) {
>> -BADF("Multiple IO pending for tag %d\n", tag);
>> +BADF("Multiple IO pending for request %p\n", p);
>>   }
>>   p->pending = len;
>>   /* Reselect if waiting for it, or if reselection triggers an IRQ
>> @@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t 
>> len)
>>   LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
>>   int out;
>>
>> -if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
>> +if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
>>   (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
>> -if (lsi_queue_tag(s, req->tag, len)) {
>> +if (lsi_queue_req(s, req, len)) {
>>   return;
>>   }
>>   }
>> @@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
>>   assert(s->current == NULL);
>>   s->current = qemu_mallocz(sizeof(lsi_request));
>>   s->current->tag = s->select_tag;
>> -s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
>> +s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
>> +   s->current);
>>
>>   n = scsi_req_enqueue(s->current->req, buf);
>>   if (n) {
>> diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
>> index ad6a730..8b1a412 100644
>> --- a/hw/scsi-bus.c
>> +++ b/hw/scsi-bus.c
>> @@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
>>   return res;
>>   }
>>
>> -SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, 
>> uint32_t lun)
>> +SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
>> +uint32_t lun, void *hba_private)
>>   {
>>   SCSIRequest *req;
>>
>> @@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice 
>> *d, uint32_t tag, uint32_t l
>>   req->dev = d;
>>   req->tag = tag;
>>   req->lun = lun;
>> +req->hba_private = hba_private;
>>   req->status = -1;
>>   trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
>>   return req;
>>   }
>>
>> -SCSIRequest *scsi_req_new(SCSIDevice 

Re: [Qemu-devel] [PULL] v2: pending linux-user patches

2011-07-19 Thread Anthony Liguori

On 07/18/2011 02:37 AM, Riku Voipio wrote:
> The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:
>
>    Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 19:43:00 +)

Pulled.  Thanks.

Regards,

Anthony Liguori

> are available in the git repository at:
>    git://git.linaro.org/people/rikuvoipio/qemu.git linux-user-for-upstream
>
> Cédric VINCENT (4):
>    arm-semi: Provide access to CLI arguments passed through the "-append" option
>    linux-user: Add support for KD...LED ioctls
>    linux-user: Add support for more VT ioctls
>    linux-user: Add support for even more FB ioctls
>
> Peter Maydell (4):
>    linux-user: Add syscall numbers from kernel 2.6.39.2
>    linux-user: Implement prlimit64 syscall
>    linux-user/syscall.c: Enforce pselect6 sigset size restrictions
>    linux-user/signal.c: Rename s390 target_ucontext fields to fix ia64
>
> Riku Voipio (2):
>    linux-user: correct syscall 123 on sh4
>    linux-user: make MIPS and ARM eabi use same argument reordering
>
> Wesley W. Terpstra (5):
>    mips: sigaltstack args
>    mips: missing syscall returns wrong errno
>    mips: null pointer deref should segfault
>    mips: rlimit incorrectly converts values
>    mips: rlimit codes are not the same
>
>   arm-semi.c |  113 ---
>   linux-user/alpha/syscall_nr.h  |   23 +-
>   linux-user/arm/syscall_nr.h|   13 +++
>   linux-user/cris/syscall_nr.h   |2 +
>   linux-user/i386/syscall_nr.h   |   12 +++
>   linux-user/ioctls.h|   13 +++
>   linux-user/m68k/syscall_nr.h   |   16 
>   linux-user/main.c  |   33 +++-
>   linux-user/microblaze/syscall_nr.h |   14 +++-
>   linux-user/mips/syscall_nr.h   |   13 +++
>   linux-user/mips64/syscall_nr.h |   13 +++
>   linux-user/mipsn32/syscall_nr.h|   14 +++
>   linux-user/ppc/syscall_nr.h|   30 +++
>   linux-user/s390x/syscall_nr.h  |   13 +++-
>   linux-user/sh4/syscall_nr.h|   34 -
>   linux-user/signal.c|   30 
>   linux-user/sparc/syscall_nr.h  |   12 +++
>   linux-user/sparc64/syscall_nr.h|   12 +++
>   linux-user/syscall.c   |  153 +---
>   linux-user/syscall_defs.h  |   51 
>   linux-user/syscall_types.h |   20 +
>   linux-user/x86_64/syscall_nr.h |   12 +++
>   22 files changed, 549 insertions(+), 97 deletions(-)








Re: [Qemu-devel] [PULL] virtio-serial: Fixes, trace points

2011-07-19 Thread Anthony Liguori

On 07/19/2011 03:00 AM, Amit Shah wrote:

Hi Anthony,

Please pull for trace points for virtio-serial/console code and a fix
for a host process closing chardev connection causing an abort().

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

   Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 
19:43:00 +)

are available in the git repository at:
   git://git.kernel.org/pub/scm/qemu/amit/virtio-serial.git for-anthony


Pulled.  Thanks.

Regards,

Anthony Liguori



Amit Shah (4):
   virtio-serial-bus: Add trace events
   virtio-console: Add some trace events
   virtio-serial-bus: Fix trailing \n in error_report string
   virtio-console: Prevent abort()s in case of host chardev close

  hw/virtio-console.c|   25 +++--
  hw/virtio-serial-bus.c |9 -
  trace-events   |   11 +++
  3 files changed, 42 insertions(+), 3 deletions(-)


Amit







Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Anthony Liguori

On 07/17/2011 06:13 AM, Avi Kivity wrote:

New in this version:
   MemoryRegionOps gained .old_mmio and .old_portio members, which allow
   reusing old-style callbacks with the new API.  All uses were converted,
   except for eepro100.c, which uses the same MemoryRegionOps for both
   portio and mmio.  Some intermediate patches do introduce dispatching
   callbacks, but they are removed later.

Caveats:
- some devices still grab a global memory region instead of inheriting
   it from their bus.  Seen in the code as #include "exec-memory.h"


Could you write up a quick document on how to use this new api for docs/?

There's bits I don't like about the interface but I think it's a huge 
improvement over what we have now so I'm inclined to commit once it 
includes documentation.


Regards,

Anthony Liguori



Avi Kivity (58):
   Hierarchical memory region API
   memory: implement dirty tracking
   memory: merge adjacent segments of a single memory region
   Internal interfaces for memory API
   memory: abstract address space operations
   memory: rename MemoryRegion::has_ram_addr to ::terminates
   memory: late initialization of ram_addr
   memory:  I/O address space support
   memory: add backward compatibility for old portio registration
   memory: add backward compatibility for old mmio registration
   memory: add ioeventfd support
   exec.c: initialize memory map
   ioport: register ranges by byte aligned addresses always
   pc: grab system_memory
   pc: convert pc_memory_init() to memory API
   pc: move global memory map out of pc_init1() and into its callers
   pci: pass address space to pci bus when created
   pci: add MemoryRegion based BAR management API
   sysbus: add MemoryRegion based memory management API
   usb-ohci: convert to MemoryRegion
   pci: add API to get a BAR's mapped address
   vmsvga: don't remember pci BAR address in callback any more
   vga: convert vga and its derivatives to the memory API
   cirrus: simplify mmio BAR access functions
   cirrus: simplify bitblt BAR access functions
   cirrus: simplify vga window mmio access functions
   vga: simplify vga window mmio access functions
   cirrus: simplify linear framebuffer access functions
   Integrate I/O memory regions into qemu
   exec.c: fix initialization of system I/O memory region
   pci: pass I/O address space to new PCI bus
   pci: allow I/O BARs to be registered with pci_register_bar_region()
   rtl8139: convert to memory API
   ac97: convert to memory API
   e1000: convert to memory API
   eepro100: convert to memory API
   es1370: convert to memory API
   ide: convert to memory API
   ivshmem: convert to memory API
   virtio-pci: convert to memory API
   ahci: convert to memory API
   intel-hda: convert to memory API
   lsi53c895a: convert to memory API
   ppc: convert to memory API
   ne2000: convert to memory API
   pcnet: convert to memory API
   i6300esb: convert to memory API
   isa-mmio: concert to memory API
   sun4u: convert to memory API
   ehci: convert to memory API
   uhci: convert to memory API
   xen-platform: convert to memory API
   msix: convert to memory API
   pci: remove pci_register_bar_simple()
   pci: convert pci rom to memory API
   pci: remove pci_register_bar()
   pci: fold BAR mapping function into its caller
   pci: rename pci_register_bar_region() to pci_register_bar()

  Makefile.target|1 +
  exec-memory.h  |   28 ++
  exec.c |   29 ++
  hw/ac97.c  |   88 +++--
  hw/apb_pci.c   |3 +
  hw/bonito.c|5 +-
  hw/cirrus_vga.c|  460 +++---
  hw/cuda.c  |6 +-
  hw/e1000.c |  113 +++
  hw/eepro100.c  |  181 ++---
  hw/es1370.c|   43 ++-
  hw/escc.c  |   42 +-
  hw/escc.h  |2 +-
  hw/grackle_pci.c   |9 +-
  hw/gt64xxx.c   |6 +-
  hw/heathrow_pic.c  |   29 +-
  hw/ide.h   |2 +-
  hw/ide/ahci.c  |   31 +-
  hw/ide/ahci.h  |2 +-
  hw/ide/cmd646.c|  204 +++
  hw/ide/ich.c   |3 +-
  hw/ide/macio.c |   36 +-
  hw/ide/pci.c   |   25 +-
  hw/ide/pci.h   |   19 +-
  hw/ide/piix.c  |   63 +++-
  hw/ide/via.c   |   64 +++-
  hw/intel-hda.c |   35 +-
  hw/isa.h   |2 +
  hw/isa_mmio.c  |   30 +-
  hw/ivshmem.c   |  158 +++-
  hw/lance.c |   31 +-
  hw/lsi53c895a.c|  257 +++--
  hw/mac_dbdma.c |   32 +-
  hw/mac_dbdma.h |4 +-
  hw/mac_nvram.c |   39 +--
  hw/macio.c |   73 ++--
  hw/msix.c  |   64 +--
  hw/msix.h  |6 +-
  hw/ne2000-isa.c|   14 +-
  hw/ne2000.c|   77 +++--
  hw/ne2000.h|8 +-
  hw/openpic.c   |   81 ++--
  hw/openpic.h   |2 +-
  hw/pc.c|   62 ++-
  hw/pc.h|   11 +-
  hw/pc_piix.c   |   24 +-
  hw/pci.c   |  104 +++---
  hw/pci.h   |   30 +-
  hw/pci_host.h  |1 +
  hw/pci_internals.h |2 +
  hw/pcnet-pci.c |   74 

Re: [Qemu-devel] [PULL] pci, virtio, vhost, xen

2011-07-19 Thread Anthony Liguori

On 07/17/2011 11:29 AM, Michael S. Tsirkin wrote:

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

   Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 
19:43:00 +)

are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony


Pulled.  Thanks.

Regards,

Anthony Liguori



Anthony PERARD (1):
   hw/piix_pci.c: Fix PIIX3-xen to initialize ids

Michael S. Tsirkin (4):
   vhost: fix double free on device stop
   pci_ids: tweak names to match linux/pci_ids.h
   xen: move to new pci initializers
   virtio: fix indirect descriptor buffer overflow

  hw/pci_ids.h  |3 ++-
  hw/piix_pci.c |3 +++
  hw/vhost.c|1 +
  hw/virtio.c   |8 
  hw/xen_platform.c |   15 +++
  5 files changed, 21 insertions(+), 9 deletions(-)







Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Benjamin Herrenschmidt
On Tue, 2011-07-19 at 15:06 +0200, Kevin Wolf wrote:
> Am 19.07.2011 14:43, schrieb Anthony Liguori:
> > On 07/19/2011 05:15 AM, Kevin Wolf wrote:
> >> From: Hannes Reinecke
> >>
> >> 'tag' is just an abstraction to identify the command
> >> from the driver. So we should make that explicit by
> >> replacing 'tag' with a driver-defined pointer 'hba_private'.
> >> This saves the lookup for driver handling several commands
> >> in parallel.
> >> 'tag' is still being kept for tracing purposes.
> >>
> >> Signed-off-by: Hannes Reinecke
> >> Acked-by: Paolo Bonzini
> >> Signed-off-by: Kevin Wolf
> >> ---
> >>   hw/esp.c  |2 +-
> >>   hw/lsi53c895a.c   |   22 --
> >>   hw/scsi-bus.c |9 ++---
> >>   hw/scsi-disk.c|4 ++--
> >>   hw/scsi-generic.c |5 +++--
> >>   hw/scsi.h |   10 +++---
> >>   hw/spapr_vscsi.c  |   29 +
> >>   hw/usb-msd.c  |9 +
> >>   8 files changed, 37 insertions(+), 53 deletions(-)
> >>
> >> diff --git a/hw/esp.c b/hw/esp.c
> >> index aa50800..9ddd637 100644
> >> --- a/hw/esp.c
> >> +++ b/hw/esp.c
> >> @@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, 
> >> uint8_t busid)
> >>
> >>   DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
> >>   lun = busid&  7;
> >> -s->current_req = scsi_req_new(s->current_dev, 0, lun);
> >> +s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
> >>   datalen = scsi_req_enqueue(s->current_req, buf);
> >>   s->ti_size = datalen;
> >>   if (datalen != 0) {
> >> diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
> >> index 940b43a..69eec1d 100644
> >> --- a/hw/lsi53c895a.c
> >> +++ b/hw/lsi53c895a.c
> >> @@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, 
> >> uint32_t tag)
> >>   static void lsi_request_cancelled(SCSIRequest *req)
> >>   {
> >>   LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
> >> -lsi_request *p;
> >> +lsi_request *p = req->hba_private;
> >>
> >>   if (s->current&&  req == s->current->req) {
> >>   scsi_req_unref(req);
> >> @@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
> >>   return;
> >>   }
> >>
> >> -p = lsi_find_by_tag(s, req->tag);
> >>   if (p) {
> >>   QTAILQ_REMOVE(&s->queue, p, next);
> >>   scsi_req_unref(req);
> >> @@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)
> >>
> >>   /* Record that data is available for a queued command.  Returns zero if
> >>  the device was reselected, nonzero if the IO is deferred.  */
> >> -static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
> >> +static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
> >>   {
> >> -lsi_request *p;
> >> -
> >> -p = lsi_find_by_tag(s, tag);
> >> -if (!p) {
> >> -BADF("IO with unknown tag %d\n", tag);
> >> -return 1;
> >> -}
> >> +lsi_request *p = req->hba_private;
> >>
> >>   if (p->pending) {
> >> -BADF("Multiple IO pending for tag %d\n", tag);
> >> +BADF("Multiple IO pending for request %p\n", p);
> >>   }
> >>   p->pending = len;
> >>   /* Reselect if waiting for it, or if reselection triggers an IRQ
> >> @@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, 
> >> uint32_t len)
> >>   LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
> >>   int out;
> >>
> >> -if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
> >> +if (s->waiting == 1 || !s->current || req->hba_private != s->current 
> >> ||
> >>   (lsi_irq_on_rsl(s)&&  !(s->scntl1&  LSI_SCNTL1_CON))) {
> >> -if (lsi_queue_tag(s, req->tag, len)) {
> >> +if (lsi_queue_req(s, req, len)) {
> >>   return;
> >>   }
> >>   }
> >> @@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
> >>   assert(s->current == NULL);
> >>   s->current = qemu_mallocz(sizeof(lsi_request));
> >>   s->current->tag = s->select_tag;
> >> -s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
> >> +s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
> >> +   s->current);
> >>
> >>   n = scsi_req_enqueue(s->current->req, buf);
> >>   if (n) {
> >> diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
> >> index ad6a730..8b1a412 100644
> >> --- a/hw/scsi-bus.c
> >> +++ b/hw/scsi-bus.c
> >> @@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
> >>   return res;
> >>   }
> >>
> >> -SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, 
> >> uint32_t lun)
> >> +SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
> >> +uint32_t lun, void *hba_private)
> >>   {
> >>   SCSIRequest *req;
> >>
> >> @@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice 
> >> *

Re: [Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Anthony Liguori

On 07/19/2011 05:10 AM, Kevin Wolf wrote:

Am 19.07.2011 10:06, schrieb Frediano Ziglio:
They are still all running in the same thread.


2- memory considerations on coroutines. Besides coroutines allowing more
readable code, I wonder if somebody considered memory. For every
coroutine a different stack has to be allocated. For instance, the
ucontext and win32 implementations use 4 MB. Assuming 128 concurrent AIO
requests, this requires about 512 MB of RAM (mostly only committed but not
used, and coroutines are reused).


128 concurrent requests is a lot. And even then, it's only virtual
memory. I doubt that we're actually using much more than we do in the
old code with the AIOCBs (which will disappear and become local
variables when we complete the conversion).


A 4 MB stack is probably overkill anyway.  It's easiest to just start 
with a large stack and then once all of the functionality is worked out, 
optimize to a smaller stack.


The same problem exists with using threads FWIW since the default thread 
stack is usually quite large.


Regards,

Anthony Liguori
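
The arithmetic in this exchange can be sketched as follows. This is only an illustration of the reserved-versus-touched distinction made above, assuming a demand-paged host where untouched stack pages are never backed by RAM; the helper names, the page size and the per-coroutine usage figure are made up for the example and do not come from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Stack size quoted in the thread for the ucontext and win32 backends. */
#define COROUTINE_STACK_SIZE (4u * 1024u * 1024u)

/* Virtual address space reserved for n coroutine stacks, in bytes. */
static uint64_t coroutine_stack_reservation(unsigned n, uint64_t stack_size)
{
    return (uint64_t)n * stack_size;
}

/*
 * Memory actually touched if each coroutine only ever uses `used` bytes of
 * its stack, rounded up to whole pages. Assumption: untouched pages stay
 * purely virtual on the host.
 */
static uint64_t coroutine_stack_resident(unsigned n, uint64_t used,
                                         uint64_t page_size)
{
    assert(page_size != 0);
    uint64_t pages = (used + page_size - 1) / page_size;
    return (uint64_t)n * pages * page_size;
}
```

With 128 requests and 4 MB stacks the reservation is the 512 MB mentioned above, while the touched memory depends only on how deep each coroutine's stack actually gets.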



Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Hannes Reinecke

On 07/19/2011 03:06 PM, Kevin Wolf wrote:

Am 19.07.2011 14:43, schrieb Anthony Liguori:

On 07/19/2011 05:15 AM, Kevin Wolf wrote:

From: Hannes Reinecke

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.
'tag' is still being kept for tracing purposes.

Signed-off-by: Hannes Reinecke
Acked-by: Paolo Bonzini
Signed-off-by: Kevin Wolf
---
   hw/esp.c  |2 +-
   hw/lsi53c895a.c   |   22 --
   hw/scsi-bus.c |9 ++---
   hw/scsi-disk.c|4 ++--
   hw/scsi-generic.c |5 +++--
   hw/scsi.h |   10 +++---
   hw/spapr_vscsi.c  |   29 +
   hw/usb-msd.c  |9 +
   8 files changed, 37 insertions(+), 53 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index aa50800..9ddd637 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t 
busid)

   DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
   lun = busid&   7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
   datalen = scsi_req_enqueue(s->current_req, buf);
   s->ti_size = datalen;
   if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..69eec1d 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t 
tag)
   static void lsi_request_cancelled(SCSIRequest *req)
   {
   LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
-lsi_request *p;
+lsi_request *p = req->hba_private;

   if (s->current&&   req == s->current->req) {
   scsi_req_unref(req);
@@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
   return;
   }

-p = lsi_find_by_tag(s, req->tag);
   if (p) {
   QTAILQ_REMOVE(&s->queue, p, next);
   scsi_req_unref(req);
@@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)

   /* Record that data is available for a queued command.  Returns zero if
  the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
   {
-lsi_request *p;
-
-p = lsi_find_by_tag(s, tag);
-if (!p) {
-BADF("IO with unknown tag %d\n", tag);
-return 1;
-}
+lsi_request *p = req->hba_private;

   if (p->pending) {
-BADF("Multiple IO pending for tag %d\n", tag);
+BADF("Multiple IO pending for request %p\n", p);
   }
   p->pending = len;
   /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t 
len)
   LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
   int out;

-if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
   (lsi_irq_on_rsl(s)&&   !(s->scntl1&   LSI_SCNTL1_CON))) {
-if (lsi_queue_tag(s, req->tag, len)) {
+if (lsi_queue_req(s, req, len)) {
   return;
   }
   }
@@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
   assert(s->current == NULL);
   s->current = qemu_mallocz(sizeof(lsi_request));
   s->current->tag = s->select_tag;
-s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
+   s->current);

   n = scsi_req_enqueue(s->current->req, buf);
   if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..8b1a412 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
   return res;
   }

-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t 
lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
+uint32_t lun, void *hba_private)
   {
   SCSIRequest *req;

@@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, 
uint32_t tag, uint32_t l
   req->dev = d;
   req->tag = tag;
   req->lun = lun;
+req->hba_private = hba_private;
   req->status = -1;
   trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
   return req;
   }

-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
+  void *hba_private)
   {
-return d->info->alloc_req(d, tag, lun);
+return d->info->alloc_req(d, tag, lun, hba_private);
   }

   uint8_t *scsi_
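
The idea behind the commit message (store a driver-owned pointer in the request instead of searching by tag) can be sketched in isolation. The types and function names below are hypothetical stand-ins and are not QEMU's SCSIRequest or lsi_request; the sketch only contrasts the two lookup styles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generic SCSI-layer request. */
typedef struct Request {
    uint32_t tag;        /* kept for tracing only */
    void *hba_private;   /* opaque pointer owned by the HBA driver */
} Request;

/* Hypothetical per-command state kept by an HBA driver. */
typedef struct HbaCommand {
    uint32_t tag;
    uint32_t pending;
    Request req;
} HbaCommand;

#define MAX_CMDS 8

typedef struct Hba {
    HbaCommand cmds[MAX_CMDS];
    size_t ncmds;
} Hba;

/* Old style: linear search by tag, which can fail and costs O(n). */
static HbaCommand *hba_find_by_tag(Hba *s, uint32_t tag)
{
    for (size_t i = 0; i < s->ncmds; i++) {
        if (s->cmds[i].tag == tag) {
            return &s->cmds[i];
        }
    }
    return NULL;
}

/* New style: the driver gets back the pointer it stored at submit time. */
static HbaCommand *hba_from_request(Request *req)
{
    return req->hba_private;
}

/* Allocate a command slot and link the request back to it. */
static HbaCommand *hba_submit(Hba *s, uint32_t tag)
{
    assert(s->ncmds < MAX_CMDS);
    HbaCommand *c = &s->cmds[s->ncmds++];
    c->tag = tag;
    c->pending = 0;
    c->req.tag = tag;
    c->req.hba_private = c;
    return c;
}
```

A driver that has no per-command state can simply pass NULL as the private pointer, which is what the esp.c hunk in the patch does.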

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Stefan Hajnoczi
On Tue, Jul 19, 2011 at 8:24 AM, Jes Sorensen  wrote:
> On 07/18/11 16:08, Stefan Hajnoczi wrote:
>> On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensen  
>> wrote:
>>> I have been updating the live snapshot wiki for qemu to try and cover
>>> the commands we will want for async snapshot handling too.
>>>
>>> http://wiki.qemu.org/Features/Snapshots
>>
>> Regarding fd passing, do we even support SELinux today with backing files?
>
> Not sure I understand what you mean. The current code should be happy to
> take an existing file or a raw device for the snapshot.

Sorry, I was off on a tangent.

I think today QEMU does not support opening image files with a backing
file purely using file descriptors.  We currently require the ability
to open files.

Stefan



Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Benjamin Herrenschmidt
On Tue, 2011-07-19 at 07:43 -0500, Anthony Liguori wrote:
> 
> This breaks the build:
> 
> make[1]: Nothing to be done for `all'.
>   CC    ppc64-softmmu/spapr_vscsi.o
> /home/anthony/git/qemu/hw/spapr_vscsi.c: In function 
> ‘vscsi_command_complete’:
> /home/anthony/git/qemu/hw/spapr_vscsi.c:535:34: error: ‘s’ undeclared 
> (first use in this function)
> /home/anthony/git/qemu/hw/spapr_vscsi.c:535:34: note: each undeclared 
> identifier is reported only once for each function it appears in
> 
> This file is only built when libfdt is installed, which is probably why
> you didn't catch it.
> 
> Ben/David, is there a way we can still build most of this stuff without
> libfdt?  libfdt is still not commonly packaged by some distros.

That would be hard ... the DT stuff is pretty deeply involved. Might be
easier to try to fix the distro :-)

Which ones ?

Cheers,
Ben.




Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen
On 07/19/11 15:23, Stefan Hajnoczi wrote:
> On Tue, Jul 19, 2011 at 8:24 AM, Jes Sorensen  wrote:
>> On 07/18/11 16:08, Stefan Hajnoczi wrote:
>>> On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensen  
>>> wrote:
>>>> I have been updating the live snapshot wiki for qemu to try and cover
>>>> the commands we will want for async snapshot handling too.
>>>>
>>>> http://wiki.qemu.org/Features/Snapshots
>>>
>>> Regarding fd passing, do we even support SELinux today with backing files?
>>
>> Not sure I understand what you mean. The current code should be happy to
>> take an existing file or a raw device for the snapshot.
> 
> Sorry, I was off on a tangent.
> 
> I think today QEMU does not support opening image files with a backing
> file purely using file descriptors.  We currently require the ability
> to open files.

I see what you mean - I don't actually know how that would work, since
the backing file specified in the front image will be a file name.

Eric, what happens if libvirt in an selinux environment tells QEMU to
launch using an image file that is backed by backing file(s)?

Cheers,
Jes



Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Anthony Liguori
On 07/19/2011 04:25 AM, Robert Wang wrote:
> As you know, the raw image format is very popular, but it does
> NOT support Copy-On-Write: a raw image file can NOT be used as a copy
> destination, so image streaming/Live Block Copy will NOT work.
> 
> To fix this, we need to add a new block driver raw-cow to QEMU. If
> finished, we can use qemu-img like this:
> qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img
> my_vm.raw-cow
> 
> 1) ubuntu.img is the backing file, my_vm.img is a raw file,
> my_vm.raw-cow stores a COW bitmap related to my_vm.img.
> 
> 2) If every bit in the COW bitmap is set to dirty, then we can get all
> information from my_vm.img and can ignore ubuntu.img and my_vm.raw-cow
> from then on.
> 
> To implement this, I think I can follow these steps:
> 1) Add a new member to BlockDriverState struct:
> char raw_file[1024];
> This member will track raw_file parameter related to raw-cow file from
> command line.
> 
> 2)* Create a new file block/raw-cow.c. It will be much like a
> mixture of block/cow.c and block/raw.c.
> 
> So I will change some functions in cow.c and raw.c to non-static, so that
> raw-cow.c can re-use them. When a read operation occurs, determine whether
> the dirty flag in the raw-cow image is set. If true, read directly from the
> raw file. After a write operation, set the related dirty flag in the raw-cow
> image. Other functions might also need to be modified.
> 
>   * Of course, the format_name member of the BlockDriver struct will be
> "raw-cow". And in order to keep the relationship with the raw file (like
> my_vm.img), the raw_cow_header struct should be
> struct raw_cow_header {
> uint32_t magic;
> uint32_t version;
> char backing_file[1024];
> char raw_file[1024];/* added*/
> int32_t mtime;
> uint64_t size;
> uint32_t sectorsize;
> };

I'd suggest that doing an image format is the wrong approach here.  Why
not just have an image format where you can pass it the location of a
bitmap?  That lets you compose arbitrarily complex backing file chains
and avoids the introduction of a new bitmap.

The bitmap format is also useful for implementing things like dirty
tracking.

Regards,

Anthony Liguori

>   * Struct raw_cow_create_options should be cow_create_options plus one
> additional member:
> {
> .name = BLOCK_OPT_RAW_FILE,
> .type = OPT_STRING,
> .help = "Raw file name"
> },
> 
> 3) Add bdrv_get_raw_filename in img_info function of qemu-img.c. In
> bdrv_get_raw_filename, if the format of the image file is "raw-cow",
> print the related raw file.
> 
> Do you think my approach is right?
> Thank you.
> 
> 
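
The read/write rule described in the quoted proposal (consult a per-sector bitmap, read from the raw file when the bit is set, otherwise from the backing file) can be sketched on in-memory buffers. This is a minimal illustration only: the struct, the function names and the fixed 16-sector geometry are assumptions for the example and have nothing to do with QEMU's block layer API or any on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NUM_SECTORS 16

/* Hypothetical in-memory model: a backing image, a raw image, a bitmap. */
typedef struct CowImage {
    const uint8_t *backing;                  /* read-only base image */
    uint8_t *raw;                            /* raw file being filled in */
    uint8_t bitmap[(NUM_SECTORS + 7) / 8];   /* 1 bit per sector: 1 = in raw */
} CowImage;

static int cow_is_dirty(const CowImage *img, unsigned sector)
{
    return (img->bitmap[sector / 8] >> (sector % 8)) & 1;
}

static void cow_set_dirty(CowImage *img, unsigned sector)
{
    img->bitmap[sector / 8] |= (uint8_t)(1u << (sector % 8));
}

/* Read from the raw file if the sector was written, else from the backing file. */
static void cow_read_sector(const CowImage *img, unsigned sector, uint8_t *buf)
{
    assert(sector < NUM_SECTORS);
    const uint8_t *src = cow_is_dirty(img, sector) ? img->raw : img->backing;
    memcpy(buf, src + (size_t)sector * SECTOR_SIZE, SECTOR_SIZE);
}

/* Writes always go to the raw file and mark the sector dirty afterwards. */
static void cow_write_sector(CowImage *img, unsigned sector, const uint8_t *buf)
{
    assert(sector < NUM_SECTORS);
    memcpy(img->raw + (size_t)sector * SECTOR_SIZE, buf, SECTOR_SIZE);
    cow_set_dirty(img, sector);
}

/* Once every bit is set, the backing file and the bitmap are no longer needed. */
static int cow_fully_copied(const CowImage *img)
{
    for (unsigned s = 0; s < NUM_SECTORS; s++) {
        if (!cow_is_dirty(img, s)) {
            return 0;
        }
    }
    return 1;
}
```

Keeping the bitmap separate from the image data, as suggested in the reply, means the same structure could also back dirty tracking; the sketch does not attempt that.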




Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Avi Kivity

On 07/19/2011 04:09 PM, Anthony Liguori wrote:

On 07/17/2011 06:13 AM, Avi Kivity wrote:

New in this version:
   MemoryRegionOps gained .old_mmio and .old_portio members, which allow
   reusing old-style callbacks with the new API.  All uses were converted,
   except for eepro100.c, which uses the same MemoryRegionOps for both
   portio and mmio.  Some intermediate patches do introduce dispatching
   callbacks, but they are removed later.

Caveats:
- some devices still grab a global memory region instead of inheriting
   it from their bus.  Seen in the code as #include "exec-memory.h"


Could you write up a quick document on how to use this new api for docs/?


Sure.  It's pretty simple.



There's bits I don't like about the interface 


Which bits are these?

but I think it's a huge improvement over what we have now so I'm 
inclined to commit once it includes documentation.




My problem is that to start leveraging it, everything must flow through 
it.  There are still several hundred call sites that are unconverted.


One option is to invert the relationship between ram_addr_t and 
MemoryRegion - implement the former in terms of the latter.  That only 
works for uses which don't invoke IO_MEM_UNASSIGNED or address arithmetic.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Joerg Roedel
On Tue, Jul 19, 2011 at 03:20:37PM +0300, Avi Kivity wrote:
> On 07/19/2011 02:46 PM, Marcelo Tosatti wrote:
>> On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
>> >  To let the user configure the desired tsc frequency for the
>> >  guest if running in KVM.
>> >
>> >  Signed-off-by: Joerg Roedel
>> >  ---
>> >   target-i386/cpu.h   |1 +
>> >   target-i386/cpuid.c |   13 +
>> >   2 files changed, 14 insertions(+), 0 deletions(-)
>> >
>> >  diff --git a/target-i386/cpu.h b/target-i386/cpu.h
>> >  index cdf68ff..399e124 100644
>> >  --- a/target-i386/cpu.h
>> >  +++ b/target-i386/cpu.h
>> >  @@ -743,6 +743,7 @@ typedef struct CPUX86State {
>> >   uint32_t cpuid_kvm_features;
>> >   uint32_t cpuid_svm_features;
>> >   bool tsc_valid;
>> >  +int tsc_khz;
>>
>> This should be saved/restore in migration data (missing VMSTATE entry).
>
> Why?  It's static data.  Traditionally we only migrate runtime data.
>
> (although we've been talking about starting a naked qemu and pushing all  
> of the configuration from the source).

Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
hosts and migrate it over so that the destination host can set the
tsc-freq if it supports tsc-scaling.

Joerg




Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Kevin Wolf
Am 19.07.2011 15:26, schrieb Hannes Reinecke:
> On 07/19/2011 03:06 PM, Kevin Wolf wrote:
>> Am 19.07.2011 14:43, schrieb Anthony Liguori:
>>> On 07/19/2011 05:15 AM, Kevin Wolf wrote:
 From: Hannes Reinecke

 'tag' is just an abstraction to identify the command
 from the driver. So we should make that explicit by
 replacing 'tag' with a driver-defined pointer 'hba_private'.
 This saves the lookup for driver handling several commands
 in parallel.
 'tag' is still being kept for tracing purposes.

 Signed-off-by: Hannes Reinecke
 Acked-by: Paolo Bonzini
 Signed-off-by: Kevin Wolf
 ---
hw/esp.c  |2 +-
hw/lsi53c895a.c   |   22 --
hw/scsi-bus.c |9 ++---
hw/scsi-disk.c|4 ++--
hw/scsi-generic.c |5 +++--
hw/scsi.h |   10 +++---
hw/spapr_vscsi.c  |   29 +
hw/usb-msd.c  |9 +
8 files changed, 37 insertions(+), 53 deletions(-)

 diff --git a/hw/esp.c b/hw/esp.c
 index aa50800..9ddd637 100644
 --- a/hw/esp.c
 +++ b/hw/esp.c
 @@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, 
 uint8_t busid)

DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
lun = busid&   7;
 -s->current_req = scsi_req_new(s->current_dev, 0, lun);
 +s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
datalen = scsi_req_enqueue(s->current_req, buf);
s->ti_size = datalen;
if (datalen != 0) {
 diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
 index 940b43a..69eec1d 100644
 --- a/hw/lsi53c895a.c
 +++ b/hw/lsi53c895a.c
 @@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, 
 uint32_t tag)
static void lsi_request_cancelled(SCSIRequest *req)
{
LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 -lsi_request *p;
 +lsi_request *p = req->hba_private;

if (s->current&&   req == s->current->req) {
scsi_req_unref(req);
 @@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
return;
}

 -p = lsi_find_by_tag(s, req->tag);
if (p) {
QTAILQ_REMOVE(&s->queue, p, next);
scsi_req_unref(req);
 @@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)

/* Record that data is available for a queued command.  Returns zero if
   the device was reselected, nonzero if the IO is deferred.  */
 -static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
 +static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
{
 -lsi_request *p;
 -
 -p = lsi_find_by_tag(s, tag);
 -if (!p) {
 -BADF("IO with unknown tag %d\n", tag);
 -return 1;
 -}
 +lsi_request *p = req->hba_private;

if (p->pending) {
 -BADF("Multiple IO pending for tag %d\n", tag);
 +BADF("Multiple IO pending for request %p\n", p);
}
p->pending = len;
/* Reselect if waiting for it, or if reselection triggers an IRQ
 @@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, 
 uint32_t len)
LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
int out;

 -if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
 +if (s->waiting == 1 || !s->current || req->hba_private != s->current 
 ||
(lsi_irq_on_rsl(s)&&   !(s->scntl1&   LSI_SCNTL1_CON))) {
 -if (lsi_queue_tag(s, req->tag, len)) {
 +if (lsi_queue_req(s, req, len)) {
return;
}
}
 @@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
assert(s->current == NULL);
s->current = qemu_mallocz(sizeof(lsi_request));
s->current->tag = s->select_tag;
 -s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
 +s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
 +   s->current);

n = scsi_req_enqueue(s->current->req, buf);
if (n) {
 diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
 index ad6a730..8b1a412 100644
 --- a/hw/scsi-bus.c
 +++ b/hw/scsi-bus.c
 @@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
return res;
}

 -SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, 
 uint32_t lun)
 +SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
 +uint32_t lun, void *hba_private)
{
SCSIRequest *req;
>>>

Re: [Qemu-devel] [PATCH V3] e1000: Handle IO Port.

2011-07-19 Thread Juan Quintela
Anthony PERARD  wrote:
> This patch introduces the two IOPorts on e1000, IOADDR and IODATA. The
> IOADDR is used to specify which register we want to access when we read
> or write on IODATA.
>
> This patch fixes some weird behavior that I see when I use e1000 with
> QEMU/Xen, the guest memory can be corrupted by this NIC because it will
> write on memory that it doesn't own anymore after a reset. It's because
> the kernel Linux use the IOPort to reset the network card instead of the
> MMIO.
>
> Signed-off-by: Anthony PERARD 

This "used" to work, so the question is:
- does ioport_addr normally have a value of 0, and then migration works?
- is it very rare that we are in the middle of an io cycle?

To be able to use a subsection, we have to have a way to decide that the
old default value is going to go.  My understanding is that testing for
->ioport_addr == 0 should be the test for a subsection, but the code
never looks to put ioport_addr back to zero.

Am I missing anything obvious?  Or is there any easy way to know if we
are in the middle of a couple of io operations?  From my reading of
e1000_ioport_read/writel() it looks like it should be used as:

write(base+IOADDR)
write(base+IODATA)

but, should this always be paired, and can we reset ioport_addr after
the second?  Then just setting ioport_addr to zero after the second
would make the subsection work in the normal case.

Any other clue about _when_ we should send ioport_addr?

Thanks, Juan.
> @@ -202,8 +201,12 @@ rxbufsize(uint32_t v)
>  static void
>  set_ctrl(E1000State *s, int index, uint32_t val)
>  {
> -/* RST is self clearing */
> -s->mac_reg[CTRL] = val & ~E1000_CTRL_RST;
> +DBGOUT(IO, "set ctrl = %08x\n", val);
> +if (val & E1000_CTRL_RST) {
> +e1000_reset(s);
> +return;
> +}
> +s->mac_reg[CTRL] = val;
>  }


This looks to me like a separate fix that can go in a separate patch.

> +/* Writes that are less than 32 bits are ignored on IOADDR.
> + * For the Flash access, a write can be less than 32 bits for
> + * IODATA register, but is not handled.
> + */

The code to implement it would be almost the same length as this comment O:-)

> +
> +register_ioport_read(addr, size, 1, e1000_ioport_readl, d);
> +
> +register_ioport_read(addr, size, 2, e1000_ioport_readl, d);
> +
> +register_ioport_write(addr, size, 4, e1000_ioport_writel, d);
> +register_ioport_read(addr, size, 4, e1000_ioport_readl, d);

This is curiosity on my part.  Are we returning 32-bit reads for 1-, 2- and
4-byte reads, or is there code at some other level that drops the bits
that we are not interested in?  My understanding of ioport.c is that
this is not done there (there is more to it, but I don't claim to fully
understand it, or whether it matters at all).

Later, Juan.



Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Eric Blake

On 07/19/2011 07:27 AM, Jes Sorensen wrote:
> On 07/19/11 15:23, Stefan Hajnoczi wrote:
>> On Tue, Jul 19, 2011 at 8:24 AM, Jes Sorensen wrote:
>>> On 07/18/11 16:08, Stefan Hajnoczi wrote:
>>>> On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensen wrote:
>>>>> I have been updating the live snapshot wiki for qemu to try and cover
>>>>> the commands we will want for async snapshot handling too.
>>>>>
>>>>> http://wiki.qemu.org/Features/Snapshots
>>>>
>>>> Regarding fd passing, do we even support SELinux today with backing files?
>>>
>>> Not sure I understand what you mean. The current code should be happy to
>>> take an existing file or a raw device for the snapshot.
>>
>> Sorry, I was off on a tangent.
>>
>> I think today QEMU does not support opening image files with a backing
>> file purely using file descriptors.  We currently require the ability
>> to open files.
>
> I see what you mean - I don't actually know how that would work, since
> the backing file specified in the front image will be a file name.
>
> Eric, what happens if libvirt in an selinux environment tells QEMU to
> launch using an image file that is backed by backing file(s)?


Before starting qemu, libvirt first parses all the image files, to see 
if any of them have backing images.  For every qcow2 or qed image with a 
backing file, libvirt sets the SELinux context of both the qcow2 image 
and its backing file so that qemu will be able to successfully open() 
them.  But if any of those files reside on NFS, then it is not possible 
to label individual files, so it requires setting the SELinux bool 
virt_use_nfs, which thus gives qemu the power to open() arbitrary files 
on NFS, and you've lost security.
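
For illustration only, the kind of header inspection described above could
look like the following sketch.  It is not libvirt's code;
qcow2_backing_name() is a hypothetical helper.  It relies on the qcow2
header layout (big-endian fields: magic "QFI\xfb" at offset 0,
backing_file_offset at offset 8, backing_file_size at offset 16) and
assumes the start of the image has already been read into a buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read big-endian integers from a byte buffer. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint64_t be64(const unsigned char *p)
{
    return ((uint64_t)be32(p) << 32) | be32(p + 4);
}

/*
 * Hypothetical helper: extract the backing file name from the first 'len'
 * bytes of an image.  Returns 1 and fills 'out' if a backing file is
 * recorded, 0 if there is none, -1 if the buffer is not qcow2 or the
 * name does not fit.
 */
static int qcow2_backing_name(const unsigned char *buf, size_t len,
                              char *out, size_t out_len)
{
    uint64_t off;
    uint32_t size;

    if (len < 20 || memcmp(buf, "QFI\xfb", 4) != 0) {
        return -1;
    }
    off = be64(buf + 8);     /* backing_file_offset */
    size = be32(buf + 16);   /* backing_file_size */
    if (off == 0) {
        return 0;            /* no backing file */
    }
    if (size >= out_len || off > len || size > len - off) {
        return -1;
    }
    memcpy(out, buf + off, size);   /* the name is not NUL-terminated on disk */
    out[size] = '\0';
    return 1;
}
```

Every backing file found this way may itself have a backing file, so the
check would have to be repeated along the whole chain before labelling.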


It would be nice if libvirt had a way to pass fds for every disk and 
backing file up front; then, SELinux can work around the lack of NFS 
per-file labelling by blocking open() in qemu.  In fact, this has 
already been proposed:


http://lists.gnu.org/archive/html/qemu-devel/2011-06/msg02072.html
http://lists.gnu.org/archive/html/qemu-devel/2011-06/msg01992.html

That thread mentioned both a command-line syntax for passing in fds for 
backing files, as well as an extension to the getfd monitor command to 
allow association of a runtime fd with a filename.
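
The underlying mechanism for handing an already-open descriptor to another
process is SCM_RIGHTS ancillary data on a Unix domain socket.  A minimal
sketch follows; send_fd() and recv_fd() are illustrative names, not
functions from qemu or libvirt, and how the received fd would then be
associated with a file name is left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough for one fd, aligned for struct cmsghdr. */
union fd_cmsg {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one open descriptor over a Unix domain socket.  Returns 0 on success. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';    /* one byte of ordinary data carries the fd along */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor.  Returns the new fd, or -1 on failure. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets its own descriptor referring to the same open file, so it
never needs permission to open() the path itself.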


--
Eric Blake   ebl...@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org



Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity

On 07/19/2011 04:30 PM, Joerg Roedel wrote:
>>
>>  (although we've been talking about starting a naked qemu and pushing all
>>  of the configuration from the source).
>
> Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
> plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
> hosts and migrate it over so that the destination host can set the
> tsc-freq if it supports tsc-scaling.


This can be done by a management tool if desired.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen
On 07/19/11 15:58, Eric Blake wrote:
> On 07/19/2011 07:27 AM, Jes Sorensen wrote:
>> Eric, what happens if libvirt in an selinux environment tells QEMU to
>> launch using an image file that is backed by backing file(s)?
> 
> Before starting qemu, libvirt first parses all the image files, to see
> if any of them have backing images.  For every qcow2 or qed image with a
> backing file, libvirt sets the SELinux context of both the qcow2 image
> and its backing file so that qemu will be able to successfully open()
> them.  But if any of those files reside on NFS, then it is not possible
> to label individual files, so it requires setting the SELinux bool
> virt_use_nfs, which thus gives qemu the power to open() arbitrary files
> on NFS, and you've lost security.

Urgh, libvirt parsing image files is really unfortunate, it really
doesn't give me warm fuzzy feelings :( libvirt really should not know
about internals of image formats.

> It would be nice if libvirt had a way to pass fds for every disk and
> backing file up front; then, SELinux can work around the lack of NFS
> per-file labelling by blocking open() in qemu.  In fact, this has
> already been proposed:

A cleaner solution seems to be to have libvirt provide a callback allowing
QEMU to call out and have libvirt open a file descriptor instead. This
way libvirt can validate it and open it for QEMU and pass it back.

If we cannot do something like this, I would prefer that backing
files on NFS simply not be supported when running in an selinux
setup.

Cheers,
Jes



Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity

On 07/19/2011 04:54 PM, Avi Kivity wrote:
> On 07/19/2011 04:30 PM, Joerg Roedel wrote:
>>>
>>>  (although we've been talking about starting a naked qemu and pushing all
>>>  of the configuration from the source).
>>
>> Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
>> plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
>> hosts and migrate it over so that the destination host can set the
>> tsc-freq if it supports tsc-scaling.
>
> This can be done by a management tool if desired.



Although, if we do this unconditionally (that is, also for tsc-scale 
hosts) then we get stable tsc even without supplying a tsc frequency 
argument... need to think about this.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Avi Kivity

On 07/19/2011 04:56 PM, Michael S. Tsirkin wrote:
> On Sun, Jul 17, 2011 at 02:13:27PM +0300, Avi Kivity wrote:
>> New in this version:
>>   MemoryRegionOps gained .old_mmio and .old_portio members, which allow
>>   reusing old-style callbacks with the new API.  All uses were converted,
>>   except for eepro100.c, which uses the same MemoryRegionOps for both
>>   portio and mmio.  Some intermediate patches do introduce dispatching
>>   callbacks, but they are removed later.
>>
>> Caveats:
>> - some devices still grab a global memory region instead of inheriting
>>   it from their bus.  Seen in the code as #include "exec-memory.h"
>
> Looks good to me.
>
> It looks like with this, users of vga_dirty_log_stop
> like qxl_write_config can go away because the region can
> stay registered with dirty logging enabled?


Yes.  You set the property once on the framebuffer, and it and all 
aliases are tracked whenever they or a subregion are exposed.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Michael S. Tsirkin
On Sun, Jul 17, 2011 at 02:13:27PM +0300, Avi Kivity wrote:
> New in this version:
>   MemoryRegionOps gained .old_mmio and .old_portio members, which allow
>   reusing old-style callbacks with the new API.  All uses were converted,
>   except for eepro100.c, which uses the same MemoryRegionOps for both
>   portio and mmio.  Some intermediate patches do introduce dispatching
>   callbacks, but they are removed later.
> 
> Caveats:
> - some devices still grab a global memory region instead of inheriting
>   it from their bus.  Seen in the code as #include "exec-memory.h"

Looks good to me.

It looks like with this, users of vga_dirty_log_stop
like qxl_write_config can go away because the region can
stay registered with dirty logging enabled?

-- 
MST



Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen
On 07/19/11 16:24, Eric Blake wrote:
> [adding the libvir-list]
> On 07/19/2011 08:09 AM, Jes Sorensen wrote:
>> Urgh, libvirt parsing image files is really unfortunate, it really
>> doesn't give me warm fuzzy feelings :( libvirt really should not know
>> about internals of image formats.
> 
> But even if you add new features to qemu to avoid needing this in the
> future, it doesn't change the past - libvirt will always have to know
> how to parse image files understood by older qemu, and so as long as
> libvirt already knows how to do that parsing, we might as well take
> advantage of it.

What has been done here in the past is plain wrong. Continuing to do it
isn't the right thing to do here.

> Besides, I feel that having a well-documented file format, so that
> independent applications can both parse the same file with the same
> semantics by obeying the file format specification, is a good design goal.

We all know that documentation is rarely up to date, new features may not
get added and libvirt will never be able to keep up. The driver for a
file format belongs in QEMU and nowhere else.


>>> It would be nice if libvirt had a way to pass fds for every disk and
>>> backing file up front; then, SELinux can work around the lack of NFS
>>> per-file labelling by blocking open() in qemu.  In fact, this has
>>> already been proposed:
>>
>> A cleaner solution seems to have libvirt provide a call-back allowing
>> QEMU to call out and have libvirt open a file descriptor instead. This
>> way libvirt can validate it and open it for QEMU and pass it back.
> 
> Yes, that could probably be made to work with libvirt.

I am a little frustrated this approach wasn't taken up front instead of
the evil hack of having libvirt attempt to parse image files.

>> If we cannot do something like this, I would prefer to have backing
>> files on NFS should simply not be supported when running in an selinux
>> setup.
> 
> As nice as that sentiment is, it will never fly, because it would be a
> regression in current behavior.  The whole reason that the virt_use_nfs
> SELinux bool exists is that some people are willing to make the partial
> security tradeoff.  Besides, the use of sVirt via SELinux is more than
> just open() protection - while the current virt_use_nfs bool makes NFS
> less secure than otherwise possible, it still gives some nice guarantees
> to the rest of the qemu process such as passthrough accesses to local
> pci devices.

Well, leaving things at the status quo is not making it worse; it just
leaves an evil in place.

> Just because it is currently not as secure to mix NFS shared storage
> with backing files doesn't stop some people from wanting to do it [in
> fact, that's my current development setup - I use qcow2 images on NFS
> shared storage, keep SELinux enabled, and enable the virt_use_nfs bool].
>  This discussion is about adding enhancements that make SELinux even
> more powerful when using NFS shared storage, by adding fd passing
> (whether libvirt parses in advance, or whether qemu raises an event and
> requires feedback from libvirt), and not about crippling the existing
> capability to use the virt_use_nfs selinux bool.

I do not believe we should try and add extra interfaces to support
something which is inherently broken. This really boils down to whether
we should support fd passing for snapshots in the first place. If it is
to support the broken setup of libvirt parsing image files, then I am
totally against it. If we work on a proper solution that involves this
in some way, then we can discuss it.

Cheers,
Jes




[Qemu-devel] Updated 0.15 release schedule

2011-07-19 Thread Anthony Liguori
Here's my proposal for an updated 0.15 schedule.  Please note that
stable-0.15 will fork off this Friday.


| 2011-02-01
| Begin of 0.15 development phase
|-
| 2011-05-16
| Soft feature freeze.  Major features should have initial code committed by this date.
|-
| 2011-06-15; Now 2011-07-22
| Fork off stable-0.15, development of 0.16 begins. Tag qemu-0.15.0-rc0
|-
| 2011-06-24; Now 2011-07-29
| Tag qemu-0.15.0-rc1
|-
| 2011-06-28; Now 2011-08-02
| Tag qemu-0.15.0-rc2
|-
| 2011-07-01; Now 2011-08-05
| Tag qemu-0.15.0

Regards,

Anthony Liguori



Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Joerg Roedel
On Tue, Jul 19, 2011 at 04:55:53PM +0300, Avi Kivity wrote:
> On 07/19/2011 04:54 PM, Avi Kivity wrote:
>> On 07/19/2011 04:30 PM, Joerg Roedel wrote:

>>> Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
>>> plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
>>> hosts and migrate it over so that the destination host can set the
>>> tsc-freq if it supports tsc-scaling.
>>
>> This can be done by a management tool if desired.
>>
>
> Although, if we do this unconditionally (that is, also for tsc-scale  
> hosts) then we get stable tsc even without supplying a tsc frequency  
> argument... need to think about this.

It has the advantage that it "just works", without the need to extend
management tools and the like. And it makes migration more transparent
to the guests.

Joerg




Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Eric Blake

[adding the libvir-list]

On 07/19/2011 08:09 AM, Jes Sorensen wrote:
> On 07/19/11 15:58, Eric Blake wrote:
>> On 07/19/2011 07:27 AM, Jes Sorensen wrote:
>>> Eric, what happens if libvirt in an selinux environment tells QEMU to
>>> launch using an image file that is backed by backing file(s)?
>>
>> Before starting qemu, libvirt first parses all the image files, to see
>> if any of them have backing images.  For every qcow2 or qed image with a
>> backing file, libvirt sets the SELinux context of both the qcow2 image
>> and its backing file so that qemu will be able to successfully open()
>> them.  But if any of those files reside on NFS, then it is not possible
>> to label individual files, so it requires setting the SELinux bool
>> virt_use_nfs, which thus gives qemu the power to open() arbitrary files
>> on NFS, and you've lost security.
>
> Urgh, libvirt parsing image files is really unfortunate, it really
> doesn't give me warm fuzzy feelings :( libvirt really should not know
> about internals of image formats.

But even if you add new features to qemu to avoid needing this in the
future, it doesn't change the past - libvirt will always have to know
how to parse image files understood by older qemu, and so as long as
libvirt already knows how to do that parsing, we might as well take
advantage of it.

Besides, I feel that having a well-documented file format, so that
independent applications can both parse the same file with the same
semantics by obeying the file format specification, is a good design goal.

>> It would be nice if libvirt had a way to pass fds for every disk and
>> backing file up front; then, SELinux can work around the lack of NFS
>> per-file labelling by blocking open() in qemu.  In fact, this has
>> already been proposed:
>
> A cleaner solution seems to have libvirt provide a call-back allowing
> QEMU to call out and have libvirt open a file descriptor instead. This
> way libvirt can validate it and open it for QEMU and pass it back.

Yes, that could probably be made to work with libvirt.

> If we cannot do something like this, I would prefer to have backing
> files on NFS should simply not be supported when running in an selinux
> setup.

As nice as that sentiment is, it will never fly, because it would be a
regression in current behavior.  The whole reason that the virt_use_nfs
SELinux bool exists is that some people are willing to make the partial
security tradeoff.  Besides, the use of sVirt via SELinux is more than
just open() protection - while the current virt_use_nfs bool makes NFS
less secure than otherwise possible, it still gives some nice guarantees
to the rest of the qemu process such as passthrough accesses to local
pci devices.

Just because it is currently not as secure to mix NFS shared storage
with backing files doesn't stop some people from wanting to do it [in
fact, that's my current development setup - I use qcow2 images on NFS
shared storage, keep SELinux enabled, and enable the virt_use_nfs bool].
This discussion is about adding enhancements that make SELinux even
more powerful when using NFS shared storage, by adding fd passing
(whether libvirt parses in advance, or whether qemu raises an event and
requires feedback from libvirt), and not about crippling the existing
capability to use the virt_use_nfs selinux bool.


--
Eric Blake   ebl...@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org



Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Frediano Ziglio
2011/7/19 Robert Wang :
> As you known, raw image is very popular,but the raw image format does
> NOT support Copy-On-Write,a raw image file can NOT be used as a copy
> destination, then image streaming/Live Block Copy will NOT work.
>
> To fix this, we need to add a new block driver raw-cow to QEMU. If
> finished, we can use qemu-img like this:
> qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img
> my_vm.raw-cow
>
> 1) ubuntu.img is the backing file, my_vm.img is a raw file,
> my_vm.raw-cow stores a COW bitmap related to my_vm.img.
>
> 2) If the entire COW bitmap is set to dirty flag then we can get all
> information from my_vm.img and can ignore ubuntu.img and my_vm.raw-cow
> from now.
>
> To implement this, I think I can follow these steps:
> 1) Add a new member to BlockDriverState struct:
> char raw_file[1024];
> This member will track raw_file parameter related to raw-cow file from
> command line.
>
> 2)      * Create a new file block/raw-cow.c. It will be much more like the
> mixture of block/cow.c and block/raw.c.
>
> So I will change some functions in cow.c and raw.c to none-static, then
> raw-cow.c can re-use them. When read operation occurs, determine whether
> dirty flag in raw-cow image is set. If true, read directly from the raw
> file. After write operation, set related dirty flag in raw-cow image.
> And other functions might also be modified.
>
>        * Of course, format_name member of BlockDriver struct will be 
> "raw-cow".
> And in order to keep relationship with raw file( like my_vm.img) ,
> raw_cow_header struct should be
> struct raw_cow_header {
> uint32_t magic;
> uint32_t version;
> char backing_file[1024];
> char raw_file[1024];/* added*/
> int32_t mtime;
> uint64_t size;
> uint32_t sectorsize;
> };
>        * Struct raw_cow_create_options should be one member plus based on
> cow_create_options:
> {
> .name = BLOCK_OPT_RAW_FILE,
> .type = OPT_STRING,
> .help = "Raw file name"
> },
>
> 3) Add bdrv_get_raw_filename in img_info function of qemu-img.c. In
> bdrv_get_raw_filename, if the format of the image file is "raw-cow",
> print the related raw file.
>
> Do you think my approach is right?
> Thank you.
>

I don't understand whether you mean just a way to track changed
clusters/sectors, or a new way to implement snapshotting in which a write
just saves the original data to the cow file, like this:


normal backfile

is allocated on image
  write to image
else
  allocate on image
  copy from backing to image
  write to image (patch before previous write)

cow-raw (inverse backfile)

is allocated on image
  write to backing
else
  allocate on image
  copy from backing to image
  write to backing


that is


is not allocated on image
  allocate on image
  copy from backing to image
is normal backing
  write to image
else
  write to backing


Frediano



Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Anthony Liguori

On 07/19/2011 08:27 AM, Avi Kivity wrote:
> On 07/19/2011 04:09 PM, Anthony Liguori wrote:
>> On 07/17/2011 06:13 AM, Avi Kivity wrote:
>>> New in this version:
>>> MemoryRegionOps gained .old_mmio and .old_portio members, which allow
>>> reusing old-style callbacks with the new API. All uses were converted,
>>> except for eepro100.c, which uses the same MemoryRegionOps for both
>>> portio and mmio. Some intermediate patches do introduce dispatching
>>> callbacks, but they are removed later.
>>>
>>> Caveats:
>>> - some devices still grab a global memory region instead of inheriting
>>> it from their bus. Seen in the code as #include "exec-memory.h"
>>
>> Could you write up a quick document on how to use this new api for docs/?
>
> Sure. It's pretty simple.


Thanks.



>> There's bits I don't like about the interface
>
> Which bits are these?


Nothing I haven't already commented on.  I think there's too much in the 
generic level.  I don't think coalesced I/O belongs here.  It's a 
concept that doesn't fit.  I think a side-band API would be nicer.


Endianness also seems out of place.  There are many layers that can 
affect final endianness.  It depends on how devices handle endianness 
and also whether the bus modifies endianness.


There are numerous devices that have a register that allows endianness 
to be toggled for the device.  That makes the actual endianness of the 
device dynamic which doesn't fit the memory region API very well IMHO.





>> but I think it's a huge improvement over what we have now so I'm
>> inclined to commit once it includes documentation.
>
> My problem is that to start leveraging it, everything must flow through
> it. There are still several hundred call sites that are unconverted.


Really several hundred?  That surprises me.

Regards,

Anthony Liguori



> One option is to invert the relationship between ram_addr_t and
> MemoryRegion - implement the former in terms of the latter. That only
> works for uses which don't invoke IO_MEM_UNASSIGNED or address arithmetic.






Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Stefan Hajnoczi
On Tue, Jul 19, 2011 at 3:30 PM, Jes Sorensen  wrote:
> On 07/19/11 16:24, Eric Blake wrote:
>> [adding the libvir-list]
>> On 07/19/2011 08:09 AM, Jes Sorensen wrote:
>>> Urgh, libvirt parsing image files is really unfortunate, it really
>>> doesn't give me warm fuzzy feelings :( libvirt really should not know
>>> about internals of image formats.
>>
>> But even if you add new features to qemu to avoid needing this in the
>> future, it doesn't change the past - libvirt will always have to know
>> how to parse image files understood by older qemu, and so as long as
>> libvirt already knows how to do that parsing, we might as well take
>> advantage of it.
>
> What has been done here in the past is plain wrong. Continuing to do it
> isn't the right thing to do here.
>
>> Besides, I feel that having a well-documented file format, so that
>> independent applications can both parse the same file with the same
>> semantics by obeying the file format specification, is a good design goal.
>
> We all know that documentation is rarely uptodate, new features may not
> get added and libvirt will never be able to keep up. The driver for a
> file format belongs in QEMU and nowhere else.

It should be a goal to avoid dependencies in multiple layers of the
stack because it becomes hard to add new features - they require
coordinated changes in multiple layers.  Having both QEMU and libvirt
know the internals of image files is such a multi-dependency.  If I
want to add a new format or change an existing format I have to touch
both layers.

For fd-passing perhaps we have an opportunity to use a callback
mechanism (QEMU request: filename -> libvirt response: fd) and do all
the image format parsing in QEMU.
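
A rough sketch of what such a callback could look like, in C.  This is
only an illustration under stated assumptions: OpenFileFn, AllowList,
allowlist_open() and open_backing_file() are hypothetical names, and in
practice the request and response would cross a process boundary (with
the fd passed over a socket) rather than being a direct function call.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical callback: turn a file name into an open fd, or -1 to refuse. */
typedef int (*OpenFileFn)(const char *filename, void *opaque);

/* Management side: the set of files this guest is allowed to use. */
typedef struct {
    const char **allowed;
    size_t n_allowed;
} AllowList;

static int allowlist_open(const char *filename, void *opaque)
{
    const AllowList *list = opaque;
    size_t i;

    for (i = 0; i < list->n_allowed; i++) {
        if (strcmp(filename, list->allowed[i]) == 0) {
            return open(filename, O_RDONLY);
        }
    }
    return -1;   /* not on the list: refuse */
}

/*
 * Emulator side: the image format driver found a backing file name in the
 * image header.  It never calls open() itself; it asks the callback.
 */
static int open_backing_file(const char *name_from_header,
                             OpenFileFn cb, void *opaque)
{
    return cb(name_from_header, opaque);
}
```

With this split, only one layer needs to understand the image format; the
other layer only decides whether a given file name is acceptable.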

Stefan



[Qemu-devel] [RFC] QEMU Object Model

2011-07-19 Thread Anthony Liguori

Hi,

I've started an effort to introduce a consistent object model to QEMU. 
Today, every subsystem implements an ad-hoc object model.  These object 
models all have the same basic properties but do things in arbitrarily 
different ways:


1) Factory interface for object creation
 - Objects usually have names
 - Construction properties for objects

2) Object properties
 - Some have converged around QemuOpts
 - Some support properties only at construction time
 - Most objects don't support introspection of properties

3) Inheritance and Polymorphism
 - Most use a vtable to implement inheritance and polymorphism
 - Only works effectively for one level of inheritance
 - Inconsistency around semantics of overloaded functions
   - Sometimes the base object invokes the overloaded function and 
implements additional behavior


netdev, block, chardev, fsdev, qdev, and displaystate are all examples 
of ad-hoc object models.  They all have their own implementations of the 
above and their own command line/monitor syntaxes.


QOM is a unifying object model inspired by the GObject/GType system.

Here is a short description of the features it supports or is intended to
support:


1) All objects derive from a common base object (Plug).  Plugs support
properties that can be set/get with a Visitor.  This means QMP can be
natively used to read/write object properties.


2) Properties have a type and flags associated with them.  Properties 
can be read-only, read-write, and locked.


3) Locked properties are read-only after realize.

4) Two special types of properties, plug<> and socket<> allow for a 
type-safe way to create a directed graph of objects at run time.  This 
provides a consistent mechanism to create a tree of devices and to 
associate backends with devices.


5) Single inheritance is supported with full polymorphism.  Interfaces 
are also supported which allows a restricted (Java-style) form of MI.


6) All types are registered through the same interface.  Type modules 
can be used to implement anything from new devices, buses, block/net 
backends, or even entirely new types of backends.  In the future, I 
would like to support demand loading of modules to allow a small core of 
QEMU to be loaded which then loads only the bits of code required to run 
a guest.


It has a few key differences from GObject:

1) GObject properties are implemented as GValues.  GValues are variants 
that are assumed to be immutable.  A key requirement of QOM is that we 
can use the Visitor framework for interacting with properties.  This 
allows for a richer expression of properties (specifically, complex 
device state to be serialized as a property).


2) GObject properties are installed in the class.  In order to support 
things like multifunction devices, we really need to install properties 
with the object so that we can have arrays of properties that are sized 
from another property.


3) GTypes/GObjects are always heap allocated as discrete objects. 
GObjects are also reference counted.  In order to support object 
composition, it's necessary to be able to initialize an object from an 
existing piece of memory.
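
To give a feel for single inheritance, locked properties, and objects
initialized in existing memory, here is a minimal sketch in plain C.  It
is not the code in the tree referenced in this mail: apart from the name
Plug, every identifier (PlugClass, SerialPort, SuperIO, and the functions)
is hypothetical, and properties are reduced to a plain setter instead of
going through a Visitor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Plug Plug;

typedef struct {
    const char *name;
    void (*realize)(Plug *plug);   /* virtual method, may be NULL */
} PlugClass;

struct Plug {
    const PlugClass *klass;
    bool realized;
};

/* Initialize an object in caller-provided memory; no heap allocation. */
static void plug_init(Plug *plug, const PlugClass *klass)
{
    plug->klass = klass;
    plug->realized = false;
}

static void plug_realize(Plug *plug)
{
    if (plug->klass->realize) {
        plug->klass->realize(plug);   /* polymorphic dispatch */
    }
    plug->realized = true;
}

/* Derived type: the base object is the first member. */
typedef struct {
    Plug parent;
    int irq;   /* treated as a "locked" property */
} SerialPort;

static const PlugClass serial_class = { "serial", NULL };

/* A locked property can only be changed before realize. */
static bool serial_set_irq(SerialPort *s, int irq)
{
    if (s->parent.realized) {
        return false;
    }
    s->irq = irq;
    return true;
}

/* Composite device: children are embedded by value, not allocated. */
typedef struct {
    Plug parent;
    SerialPort ports[2];
} SuperIO;

static void superio_realize(Plug *plug)
{
    SuperIO *sio = (SuperIO *)plug;   /* valid: Plug is the first member */
    plug_realize(&sio->ports[0].parent);
    plug_realize(&sio->ports[1].parent);
}

static const PlugClass superio_class = { "superio", superio_realize };

static void superio_init(SuperIO *sio)
{
    plug_init(&sio->parent, &superio_class);
    plug_init(&sio->ports[0].parent, &serial_class);
    plug_init(&sio->ports[1].parent, &serial_class);
}
```

Because the children live inside the parent's storage, a heap-only,
reference-counted scheme would not fit this kind of composition.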


I'll follow up later in the week with some documentation on how the type 
system works in more detail.  A tree is available below that has the 
current implementation:


http://repo.or.cz/w/qemu/aliguori.git/tree/qdev2

I'll be documenting the type system at:

http://wiki.qemu.org/Features/QOM

Regards,

Anthony Liguori



Re: [Qemu-devel] [PULL 00/12] Xen patch queue 2011-07-05

2011-07-19 Thread Anthony Liguori

On 07/05/2011 11:51 AM, Alexander Graf wrote:
> Hi Anthony,
>
> This is my current patch queue for Xen stuff that accumulated over
> the past few weeks.
>
> Please pull.


Pulled.  Thanks.

Regards,

Anthony Liguori



> Alex
>
> The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:
>   Vasily Khoruzhick (1):
>         pxa2xx_lcd: add proper rotation support
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/agraf.git xen-next
>
> Alexander Graf (2):
>       checkpatch: don't error out on },{ lines
>       xen_console: fall back to qemu serial device
>
> Jan Kiszka (3):
>       xen: Clean up build system
>       xen: Clean up map cache API naming
>       xen: Fold CONFIG_XEN_MAPCACHE into CONFIG_XEN
>
> Stefano Stabellini (7):
>       xen: enable console and disk backend in HVM mode
>       xen_console: fix memory leak
>       xen: add vkbd support for PV on HVM guests
>       xen_disk: cope with missing xenstore "params" node
>       qemu_ram_ptr_length: take ram_addr_t as arguments
>       xen_disk: treat "aio" as "raw"
>       xen_console: support the new extended xenstore protocol
>
>  Makefile.objs         |    4 +-
>  Makefile.target       |   14 +
>  configure             |    2 +-
>  cpu-common.h          |    2 +-
>  exec.c                |   55 +---
>  hw/xen.h              |   10 +--
>  hw/xen_common.h       |   12
>  hw/xen_console.c      |   25 -
>  hw/xen_disk.c         |   37 -
>  hw/xenfb.c            |   19 -
>  scripts/checkpatch.pl |    4 ++-
>  trace-events          |    6 ++--
>  xen-all.c             |   73 +++-
>  xen-mapcache-stub.c   |   36
>  xen-mapcache.c        |   41 +++
>  xen-mapcache.h        |   14 -
>  16 files changed, 217 insertions(+), 137 deletions(-)
>  delete mode 100644 xen-mapcache-stub.c








Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity

On 07/19/2011 05:14 PM, Joerg Roedel wrote:
> On Tue, Jul 19, 2011 at 04:55:53PM +0300, Avi Kivity wrote:
>>  On 07/19/2011 04:54 PM, Avi Kivity wrote:
>>>  On 07/19/2011 04:30 PM, Joerg Roedel wrote:
>
>>>>  Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
>>>>  plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
>>>>  hosts and migrate it over so that the destination host can set the
>>>>  tsc-freq if it supports tsc-scaling.
>>>
>>>  This can be done by a management tool if desired.
>>>
>>
>>  Although, if we do this unconditionally (that is, also for tsc-scale
>>  hosts) then we get stable tsc even without supplying a tsc frequency
>>  argument... need to think about this.
>
> It has the advantage that it "just works", without the need to extend
> management tools and the like. And it makes migration more transparent
> to the guests.



Yes, exactly.  The flip side is that automagic stuff is sometimes 
unexpected and leads to breakage.  I'm not sure what the right thing is.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PULL] pci, vhost

2011-07-19 Thread Anthony Liguori

On 07/04/2011 12:01 PM, Michael S. Tsirkin wrote:
> The following changes since commit 1dfdcaa83f9ce34aded8bc0669e81753d94f1b7d:
>
>   user: Fix -d debug logging for usermode emulation (2011-06-28 20:57:09 +0200)
>
> are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony


Pulled.  Thanks.

Regards,

Anthony Liguori



> Anthony PERARD (1):
>       hw/piix_pci.c: Fix PIIX3-xen to initialize ids
>
> Michael S. Tsirkin (3):
>       vhost: fix double free on device stop
>       pci_ids: tweak names to match linux/pci_ids.h
>       xen: move to new pci initializers
>
>  hw/pci_ids.h      |    3 ++-
>  hw/piix_pci.c     |    3 +++
>  hw/vhost.c        |    1 +
>  hw/xen_platform.c |   15 +++
>  4 files changed, 13 insertions(+), 9 deletions(-)







Re: [Qemu-devel] [PATCH 0/3][uq/master] Basic TSC-Scaling support v2

2011-07-19 Thread Marcelo Tosatti
On Thu, Jul 07, 2011 at 04:13:10PM +0200, Joerg Roedel wrote:
> Hi Avi, Marcelo,
> 
> here is v2 of the patches to support setting the guests tsc-frequency
> from the qemu command line. This version addresses the comment from Avi
> on the first version. To reflect that units can be given to the
> frequency, the parameter was renamed from tsc_khz to tsc_freq.
> 
> Thanks,
> 
>   Joerg

Applied to uq/master, thanks.




Re: [Qemu-devel] [PULL] virtio-serial: trace events, trivial fix

2011-07-19 Thread Anthony Liguori

On 07/07/2011 08:13 AM, Amit Shah wrote:
> Hello,
>
> This series adds some trace events to virtio-serial-bus.c and
> virtio-console.c.  There's also one trivial patch to remove a trailing
> \n from an error_report() string.
>
> Note: some mirrors may not yet have received the update.


Pulled.  Thanks.

Regards,

Anthony Liguori




The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:

   pxa2xx_lcd: add proper rotation support (2011-07-04 22:12:21 +0200)

are available in the git repository at:
   git://git.kernel.org/pub/scm/virt/qemu/amit/virtio-serial.git for-anthony


Amit Shah (3):
   virtio-serial-bus: Add trace events
   virtio-console: Add some trace events
   virtio-serial-bus: Fix trailing \n in error_report string

  hw/virtio-console.c    |    9 -
  hw/virtio-serial-bus.c |    9 -
  trace-events           |   11 +++
  3 files changed, 27 insertions(+), 2 deletions(-)






Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Anthony Liguori

On 07/19/2011 09:30 AM, Jes Sorensen wrote:

On 07/19/11 16:24, Eric Blake wrote:

[adding the libvir-list]
On 07/19/2011 08:09 AM, Jes Sorensen wrote:

Urgh, libvirt parsing image files is really unfortunate, it really
doesn't give me warm fuzzy feelings :( libvirt really should not know
about internals of image formats.


But even if you add new features to qemu to avoid needing this in the
future, it doesn't change the past - libvirt will always have to know
how to parse image files understood by older qemu, and so as long as
libvirt already knows how to do that parsing, we might as well take
advantage of it.


What has been done here in the past is plain wrong. Continuing to do it
isn't the right thing to do here.


Besides, I feel that having a well-documented file format, so that
independent applications can both parse the same file with the same
semantics by obeying the file format specification, is a good design goal.


We all know that documentation is rarely up to date, new features may not
get added and libvirt will never be able to keep up. The driver for a
file format belongs in QEMU and nowhere else.



It would be nice if libvirt had a way to pass fds for every disk and
backing file up front; then, SELinux can work around the lack of NFS
per-file labelling by blocking open() in qemu.  In fact, this has
already been proposed:


A cleaner solution seems to have libvirt provide a call-back allowing
QEMU to call out and have libvirt open a file descriptor instead. This
way libvirt can validate it and open it for QEMU and pass it back.
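
For reference, the fd hand-off half of such a callback can be done with the
standard SCM_RIGHTS ancillary message on an AF_UNIX socket.  A minimal sketch
in C follows.  send_fd() and recv_fd() are hypothetical helper names, not
existing QEMU or libvirt functions, and the request side (QEMU sending the
filename) is left out:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Buffer large enough (and suitably aligned) for one SCM_RIGHTS message
 * carrying a single file descriptor. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send one open file descriptor over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 0;  /* one dummy data byte travels with the descriptor */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;  /* no descriptor attached to this message */
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In this model the management side would validate the path, open the file and
call send_fd(); the emulator side would call recv_fd() where it would
otherwise call open(), so open() itself could stay blocked by policy.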


Yes, that could probably be made to work with libvirt.


I am a little frustrated this approach wasn't taken up front instead of
the evil hack of having libvirt attempt to parse image files.


If we cannot do something like this, I would prefer that backing
files on NFS simply not be supported when running in an SELinux
setup.


As nice as that sentiment is, it will never fly, because it would be a
regression in current behavior.  The whole reason that the virt_use_nfs
SELinux bool exists is that some people are willing to make the partial
security tradeoff.  Besides, the use of sVirt via SELinux is more than
just open() protection - while the current virt_use_nfs bool makes NFS
less secure than otherwise possible, it still gives some nice guarantees
to the rest of the qemu process such as passthrough accesses to local
pci devices.


Well leaving things at status quo is not making it worse, it just leaves
an evil in place.


The NFS and SELinux combination is a fundamental problem with SELinux and NFS themselves.  We can 
piss and moan as much as we want about it but it's reality.  SELinux 
fundamentally requires extended attributes.  By the time NFS adds 
extended attribute support, we'll all be flying around in hover cars.


As terrible as NFS is, people use it all of the time.

It would be nice if libvirt had the ability to make better use of DAC to 
support isolation.  The fact that MAC is the only way you can do 
isolation between guests is pretty unfortunate.  If I could assign 
specific UIDs to a guest and use that to enforce isolation, it would go 
a long way toward solving this problem.


Regards,

Anthony Liguori



Re: [Qemu-devel] [PULL] usb patch queue

2011-07-19 Thread Anthony Liguori

On 07/08/2011 04:50 AM, Gerd Hoffmann wrote:

   Hi,

Here is the current usb patch queue.  Most noteworthy is the usb
companion controller support added.  There are also a bunch of bug
fixes, some from Hans which he found while doing the companion
controller work and some have been found in patch review.


Pulled.  Thanks.

Regards,

Anthony Liguori



please pull,
   Gerd

The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:

   pxa2xx_lcd: add proper rotation support (2011-07-04 22:12:21 +0200)

are available in the git repository at:
   git://git.kraxel.org/qemu usb.19

Gerd Hoffmann (8):
   pci: add ich9 usb controller ids
   uhci: add ich9 controllers
   ehci: fix port count.
   ehci: add ich9 controller.
   usb: update documentation
   usb: fixup bluetooth descriptors
   usb-hub: remove unused descriptor arrays
   usb-ohci: raise interrupt on attach

Hans de Goede (13):
   usb: Add a usb_fill_port helper function
   usb: Move (initial) call of usb_port_location to usb_fill_port
   usb: Add a register_companion USB bus op.
   usb: Make port wakeup and complete ops take a USBPort instead of a Device
   usb: Replace device_destroy bus op with a child_detach port op
   usb-ehci: drop unused num-ports state member
   usb-ehci: Connect Status bit is read only, don't allow changing it by the guest
   usb-ehci: cleanup port reset handling
   usb: assert on calling usb_attach(port, NULL) on a port without a dev
   usb-ehci: Fix handling of PED and PEDC port status bits
   usb-ehci: Add support for registering companion controllers
   usb-uhci: Add support for being a companion controller
   usb-ohci: Add support for being a companion controller

Jes Sorensen (1):
   usb_register_port(): do not set port->opaque and port->index twice

Peter Maydell (1):
   hw/usb-musb.c: Don't misuse usb_packet_complete()

  docs/ich9-ehci-uhci.cfg |   37 +++
  docs/usb2.txt           |   33 +-
  hw/milkymist-softusb.c  |    9 ++-
  hw/pci_ids.h            |    8 ++
  hw/usb-bt.c             |   24 ++--
  hw/usb-bus.c            |   46 +++-
  hw/usb-ehci.c           |  270 ++-
  hw/usb-hub.c            |   90 +++-
  hw/usb-musb.c           |   24 +++--
  hw/usb-ohci.c           |   89 +++-
  hw/usb-uhci.c           |   95 +
  hw/usb.c                |   13 +--
  hw/usb.h                |   20 +++-
  13 files changed, 523 insertions(+), 235 deletions(-)
  create mode 100644 docs/ich9-ehci-uhci.cfg







Re: [Qemu-devel] [PULL] spice patch queue

2011-07-19 Thread Anthony Liguori

On 07/04/2011 10:14 AM, Gerd Hoffmann wrote:

   Hi,

Here is the spice patch queue with a bunch of small fixes and
improvements collected over time.  No major changes.

please pull,
   Gerd


Pulled.  Thanks.

Regards,

Anthony Liguori



The following changes since commit 75ef849696830fc2ddeff8bb90eea5887ff50df6:

   esp: correctly fill bus id with requested lun (2011-07-02 18:50:19 +)

are available in the git repository at:
   git://anongit.freedesktop.org/spice/qemu spice.v38

Alon Levy (5):
   qxl: set mm_time in vga update
   qxl: interface_get_command: fix reported mode
   qxl-logger: add timestamp to command log
   qxl: add dev id to guest prints
   qxl: allow QXL_IO_LOG also in vga

Gerd Hoffmann (3):
   qxl: device id fixup
   spice: catch spice server initialization failures.
   qxl: put QXL_IO_UPDATE_IRQ into vgamode whitelist

Yonit Halperin (1):
   qxl: make sure primary surface is saved on migration

  hw/qxl-logger.c    |    4 +++-
  hw/qxl.c           |   50 ++
  ui/spice-core.c    |    5 -
  ui/spice-display.c |    5 +
  4 files changed, 46 insertions(+), 18 deletions(-)







Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Avi Kivity

On 07/19/2011 05:50 PM, Anthony Liguori wrote:




There's bits I don't like about the interface


Which bits are these?


Nothing I haven't already commented on.  I think there's too much in 
the generic level.  I don't think coalesced I/O belongs here.  It's a 
concept that doesn't fit.  I think a side-band API would be nicer.


Well, it's impossible to do it in a side band.  The moment at which a 
range with coalesced mmio becomes exposed is completely orthogonal to 
programming the BAR register - it can happen, for example, when another 
BAR is removed or the bridge window is programmed.  A coalesced mmio 
region can also be partially clipped.




Endianness also seems out of place.  There are many layers that can 
affect final endianness.  It depends on how devices handle endianness 
and also whether the bus modifies endianness.


That is handled naturally by the API.  Currently only leaves specify 
endianness, but in the future containers (=buses) would as well.




There are numerous devices that have a register that allows endianness 
to be toggled for the device.  That makes the actual endianness of the 
device dynamic which doesn't fit the memory region API very well IMHO.


static const MemoryRegionOps mydevice_ops = {
...
   .endianess_callback = mydevice_endianess,
...
};

Or

 memory_region_set_endianess(...);
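
To make the case concrete, here is a minimal standalone sketch of a device
whose control register flips the byte order of a data register.  All names
(MyDevice, mydevice_read, mydevice_set_big_endian) are made up for
illustration and are not part of the proposed memory API.  The device does
the swap itself here, which is the sort of per-device work the two options
above would presumably move into the memory core:

```c
#include <stdint.h>

/* Hypothetical device state: one 32-bit data register plus a control
 * bit that selects how the register is presented to the guest. */
typedef struct MyDevice {
    uint32_t data;       /* register value in the device's default order */
    int big_endian;      /* nonzero once the guest has toggled the bit */
} MyDevice;

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Guest write to the (hypothetical) endianness control register. */
void mydevice_set_big_endian(MyDevice *d, int enable)
{
    d->big_endian = (enable != 0);
}

/* Guest read of the data register: the byte order of the result depends
 * on the current setting, so it cannot be fixed at registration time. */
uint32_t mydevice_read(const MyDevice *d)
{
    return d->big_endian ? swap32(d->data) : d->data;
}
```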






but I think it's a huge improvement over what we have now so I'm
inclined to commit once it includes documentation.



My problem is that to start leveraging it, everything must flow through
it. There are still several hundred call sites that are unconverted.


Really several hundred?  That surprises me.



$ git grep -w cpu_register_physical_memory | wc -l
222

$ git grep -w cpu_register_io_memory | wc -l
233

$ git grep -w qemu_ram_alloc | wc -l
113

$ git grep  memory_region_init | wc -l
134

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 3/3] qemu-x86: Set tsc_khz in kvm when supported

2011-07-19 Thread Jan Kiszka
On 2011-07-19 13:48, Marcelo Tosatti wrote:
> On Thu, Jul 07, 2011 at 04:13:13PM +0200, Joerg Roedel wrote:
>> Make use of the KVM_TSC_CONTROL feature if available.
>>
>> Signed-off-by: Joerg Roedel 
>> ---
>>  target-i386/kvm.c |   18 +-
>>  1 files changed, 17 insertions(+), 1 deletions(-)
>>
>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>> index 10fb2c4..923d2d5 100644
>> --- a/target-i386/kvm.c
>> +++ b/target-i386/kvm.c
>> @@ -354,6 +354,7 @@ int kvm_arch_init_vcpu(CPUState *env)
>>  uint32_t unused;
>>  struct kvm_cpuid_entry2 *c;
>>  uint32_t signature[3];
>> +int r;
>>  
>>  env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
>>  
>> @@ -499,7 +500,22 @@ int kvm_arch_init_vcpu(CPUState *env)
>>  
>>  qemu_add_vm_change_state_handler(cpu_update_state, env);
>>  
>> -return kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
>> +r = kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
>> +if (r)
>> +return r;
>> +
>> +#ifdef KVM_CAP_TSC_CONTROL
>> +r = kvm_check_extension(env->kvm_state, KVM_CAP_TSC_CONTROL);
>> +if (r && env->tsc_khz) {
>> +r = kvm_vcpu_ioctl(env, KVM_SET_TSC_KHZ, env->tsc_khz);
>> +if (r < 0) {
>> +fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
>> +return r;
>> +}
>> +}
>> +#endif
> 

#ifdef times are over, please clean up before pushing upstream.
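
A minimal sketch of what the #ifdef-free control flow might look like,
assuming the headers QEMU builds against always define KVM_CAP_TSC_CONTROL
and KVM_SET_TSC_KHZ, so that only the run-time capability check remains.
VcpuStub and the stub_* functions are hypothetical stand-ins for CPUState,
kvm_check_extension() and kvm_vcpu_ioctl() so that the fragment compiles
outside the QEMU tree:

```c
#include <stdio.h>

/* Hypothetical stand-in for the vcpu state used by the patch. */
typedef struct VcpuStub {
    int has_tsc_control;  /* result the capability check will report */
    int tsc_khz;          /* requested guest TSC frequency, 0 = not set */
    int set_tsc_result;   /* result the ioctl stand-in will return */
    int applied_khz;      /* records the value handed to the ioctl */
} VcpuStub;

/* Stand-in for kvm_check_extension(..., KVM_CAP_TSC_CONTROL). */
static int stub_check_tsc_control(const VcpuStub *v)
{
    return v->has_tsc_control;
}

/* Stand-in for kvm_vcpu_ioctl(env, KVM_SET_TSC_KHZ, khz). */
static int stub_set_tsc_khz(VcpuStub *v, int khz)
{
    if (v->set_tsc_result == 0) {
        v->applied_khz = khz;
    }
    return v->set_tsc_result;
}

/* Same logic as the quoted hunk, with the decision made at run time only:
 * do nothing unless a frequency was requested and the host supports it. */
int init_vcpu_tsc(VcpuStub *v)
{
    int r;

    if (!v->tsc_khz || !stub_check_tsc_control(v)) {
        return 0;
    }
    r = stub_set_tsc_khz(v, v->tsc_khz);
    if (r < 0) {
        fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
        return r;
    }
    return 0;
}
```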

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Avi Kivity

On 07/19/2011 07:05 PM, Avi Kivity wrote:

On 07/19/2011 05:50 PM, Anthony Liguori wrote:




There's bits I don't like about the interface


Which bits are these?


Nothing I haven't already commented on.  I think there's too much in 
the generic level.  I don't think coalesced I/O belongs here.  It's a 
concept that doesn't fit.  I think a side-band API would be nicer.


Well, it's impossible to do it in a side band.  The moment at which a 
range with coalesced mmio becomes exposed is completely orthogonal to 
programming the BAR register - it can happen, for example, when another 
BAR is removed or the bridge window is programmed.  A coalesced mmio 
region can also be partially clipped.


Of course, it's not really impossible, just clumsy.

You could have

   struct MemoryRegionOps {
       void (*on_mapping_added)(t_p_a_t offset, unsigned nr,
                                const AddrRange ranges[]);
       void (*on_mapping_removed)(t_p_a_t offset, unsigned nr,
                                  const AddrRange ranges[]);

   };

the device callbacks would then compare the added or removed ranges with 
the coalesced mmio ranges needed by the device, and call the kvm 
callbacks as needed.  But that's not something a device model writer 
should do, IMO (same goes for ioeventfd - they are now part of the API 
as well).
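
To show the bookkeeping each device would end up carrying, here is a rough
standalone sketch of the comparison step.  The AddrRange layout, the
coalesce_fn callback type and mydevice_mapping_added() are all assumptions
made for this example; `notify` stands in for whatever would register the
visible part with kvm:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: a half-open range [start, start + size).
 * The sketch assumes start + size does not wrap around. */
typedef struct AddrRange {
    uint64_t start;
    uint64_t size;
} AddrRange;

/* Called once per visible piece of the coalesced window. */
typedef void (*coalesce_fn)(AddrRange visible, void *opaque);

/* Intersect two ranges.  Returns 1 and fills *out if they overlap,
 * returns 0 and leaves *out untouched otherwise. */
int addrrange_intersect(AddrRange a, AddrRange b, AddrRange *out)
{
    uint64_t a_end = a.start + a.size;
    uint64_t b_end = b.start + b.size;
    uint64_t lo = a.start > b.start ? a.start : b.start;
    uint64_t hi = a_end < b_end ? a_end : b_end;

    if (lo >= hi) {
        return 0;
    }
    out->start = lo;
    out->size = hi - lo;
    return 1;
}

/* Device-side handler: for every newly mapped range, work out which part
 * of the device's coalesced window became visible (possibly clipped) and
 * report it.  Returns the number of overlapping ranges found. */
unsigned mydevice_mapping_added(AddrRange coalesced, unsigned nr,
                                const AddrRange ranges[],
                                coalesce_fn notify, void *opaque)
{
    unsigned i, hits = 0;

    for (i = 0; i < nr; i++) {
        AddrRange part;

        if (addrrange_intersect(coalesced, ranges[i], &part)) {
            if (notify != NULL) {
                notify(part, opaque);
            }
            hits++;
        }
    }
    return hits;
}
```

A matching handler would be needed for removed ranges, and again for
ioeventfd, in every device that uses them.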


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH][v7] megasas: LSI Megaraid SAS emulation

2011-07-19 Thread Christoph Hellwig
I've mentioned this a few times before, but just to make sure it's not
lost:

This is a really bad idea for adding to qemu.  It's not a controller that
actually speaks a plain SCSI protocol to disks, but a RAID controller that
has its own command set for data plane operation, and minimal support for
a few SCSI CDBs around it.  It also supports passthrough channels for
CDROMs and other peripherals, but using those with disks is not recommended
by LSI, and is thus explicitly disabled in most drivers and most certainly
not tested.  So it fits the qemu model pretty badly, especially when using
scsi-generic underneath, and I don't think it helps us in practice.




Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Daniel P. Berrange
On Tue, Jul 19, 2011 at 04:14:27PM +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 19, 2011 at 3:30 PM, Jes Sorensen  wrote:
> > On 07/19/11 16:24, Eric Blake wrote:
> >> [adding the libvir-list]
> >> On 07/19/2011 08:09 AM, Jes Sorensen wrote:
> >>> Urgh, libvirt parsing image files is really unfortunate, it really
> >>> doesn't give me warm fuzzy feelings :( libvirt really should not know
> >>> about internals of image formats.
> >>
> >> But even if you add new features to qemu to avoid needing this in the
> >> future, it doesn't change the past - libvirt will always have to know
> >> how to parse image files understood by older qemu, and so as long as
> >> libvirt already knows how to do that parsing, we might as well take
> >> advantage of it.
> >
> > What has been done here in the past is plain wrong. Continuing to do it
> > isn't the right thing to do here.
> >
> >> Besides, I feel that having a well-documented file format, so that
> >> independent applications can both parse the same file with the same
> >> semantics by obeying the file format specification, is a good design goal.
> >
> > We all know that documentation is rarely up to date, new features may not
> > get added and libvirt will never be able to keep up. The driver for a
> > file format belongs in QEMU and nowhere else.
> 
> It should be a goal to avoid dependencies in multiple layers of the
> stack because it becomes hard to add new features - they require
> coordinated changes in multiple layers.  Having both QEMU and libvirt
> know the internals of image files is such a multi-dependency.  If I
> want to add a new format or change an existing format I have to touch
> both layers.
> 
> For fd-passing perhaps we have an opportunity to use a callback
> mechanism (QEMU request: filename -> libvirt response: fd) and do all
> the image format parsing in QEMU.

The reason why libvirt does the parsing of file headers to determine
backing files is to maintain the trust boundary. Everything run from
the exec() of QEMU onwards is considered untrusted code. So having
QEMU parsing the file headers & passing back open() requests to libvirt
is breaking the trust boundary.

NB, I'm not happy about libvirt having to have knowledge of file format
headers, but we needed something more efficient & reliable than invoking
qemu-img info & parsing the output. Ideally QEMU (or something else)
would provide a library libblockformat.so with stable APIs for at least
reading metadata about image formats. If it had APIs for image creation,
etc too that would be a bonus, but we're more or less ok spawning qemu-img
for those cases currently.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Daniel P. Berrange
On Tue, Jul 19, 2011 at 04:30:19PM +0200, Jes Sorensen wrote:
> On 07/19/11 16:24, Eric Blake wrote:
> > [adding the libvir-list]
> > On 07/19/2011 08:09 AM, Jes Sorensen wrote:
> >> Urgh, libvirt parsing image files is really unfortunate, it really
> >> doesn't give me warm fuzzy feelings :( libvirt really should not know
> >> about internals of image formats.
> > 
> > But even if you add new features to qemu to avoid needing this in the
> > future, it doesn't change the past - libvirt will always have to know
> > how to parse image files understood by older qemu, and so as long as
> > libvirt already knows how to do that parsing, we might as well take
> > advantage of it.
> 
> What has been done here in the past is plain wrong. Continuing to do it
> isn't the right thing to do here.
> 
> > Besides, I feel that having a well-documented file format, so that
> > independent applications can both parse the same file with the same
> > semantics by obeying the file format specification, is a good design goal.
> 
> We all know that documentation is rarely up to date, new features may not
> get added and libvirt will never be able to keep up. The driver for a
> file format belongs in QEMU and nowhere else.

This would be possible if QEMU were to provide a libblockformat.so library
which allowed apps to extract metadata from file formats using a stable
API.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|


