date:20171122

Re: [Qemu-devel] [PATCH 2/2] pc-bios/s390-ccw: zero out bss section

2017-11-22 Thread Richard Henderson

On 11/22/2017 03:26 PM, Christian Borntraeger wrote:
> The QEMU ELF loader does not zero the bss segment.
> This resulted in several bugs, e.g. see
> 
> commit 5d739a4787a5 (s390-ccw.img: Fix sporadic errors with ccw boot image - 
> initialize css)
> commit 6a40fa2669d3 (s390-ccw.img: Initialize next_idx)
> commit 8775d91a0f42 (pc-bios/s390-ccw: Fix problem with invalid virtio-scsi 
> LUN when rebooting)
> 
> Lets fix this once and forever by letting the BIOS zero the bss itself.
> 
> Suggested-by: Alexander Graf 
> Signed-off-by: Christian Borntraeger 
> ---
>  pc-bios/s390-ccw/start.S | 30 +++---
>  1 file changed, 27 insertions(+), 3 deletions(-)

Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH 00/17] iotests: Fix iotests for weird formats/options

2017-11-22 Thread Fam Zheng

On Thu, 11/23 03:08, Max Reitz wrote:
> (I hate to write Python tests because the boilerplate seems so large and
>  the debugging is so hard.  But there is test 194 which shows that it is
>  possible to write simple bash-like tests as well--and that is how I
>  should probably write tests from now on.)

If the boilerplate means copy&paste from another test it is not a big problem;
if you really mean it's more cumbersome than bash to write test code, I think we
can try new ways to exploit Python's expressiveness. 2c93c5cb43 and f4844ac0ad
are good examples.

So what do you think about debugging? IMO the biggest problem is the less
verbose reference output, and lack of a separate logging channel. We can and
should improve that, even for existing scripts.

And yes, the 194 style is actually better than "iotests.QMPTestCase" ones.

Fam

[Qemu-devel] [PATCH v2] rcu: reduce more than 7MB heap memory by malloc_trim()

2017-11-22 Thread Yang Zhong

Since there are some issues in memory alloc/free machenism
in glibc for little chunk memory, if Qemu frequently
alloc/free little chunk memory, the glibc doesn't alloc
little chunk memory from free list of glibc and still
allocate from OS, which make the heap size bigger and bigger.

This patch introduce malloc_trim(), which will free heap memory.

Below are test results from smaps file.
(1)without patch
55f0783e1000-55f07992a000 rw-p  00:00 0  [heap]
Size:  21796 kB
Rss:   14260 kB
Pss:   14260 kB

(2)with patch
55cc5fadf000-55cc61008000 rw-p  00:00 0  [heap]
Size:  21668 kB
Rss:6940 kB
Pss:6940 kB

Signed-off-by: Yang Zhong 
---
 configure  | 4 
 util/rcu.c | 6 ++
 2 files changed, 10 insertions(+)

diff --git a/configure b/configure
index 0e856bb..5b463d4 100755
--- a/configure
+++ b/configure
@@ -6012,6 +6012,10 @@ if test "$opengl" = "yes" ; then
   fi
 fi
 
+if test "$tcmalloc" = "yes" || test "$jemalloc" = "yes" ; then
+  echo "CONFIG_NONGLIBMALLOC=y" >> $config_host_mak
+fi
+
 if test "$avx2_opt" = "yes" ; then
   echo "CONFIG_AVX2_OPT=y" >> $config_host_mak
 fi
diff --git a/util/rcu.c b/util/rcu.c
index ca5a63e..f3e96a8 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -32,6 +32,9 @@
 #include "qemu/atomic.h"
 #include "qemu/thread.h"
 #include "qemu/main-loop.h"
+#if defined(CONFIG_LINUX)
+#include 
+#endif
 
 /*
  * Global grace period counter.  Bit 0 is always one in rcu_gp_ctr.
@@ -272,6 +275,9 @@ static void *call_rcu_thread(void *opaque)
 node->func(node);
 }
 qemu_mutex_unlock_iothread();
+#if defined(CONFIG_LINUX) && !defined(CONFIG_NONGLIBMALLOC)
+malloc_trim(4 * 1024 * 1024);
+#endif
 }
 abort();
 }
-- 
1.9.1

Re: [Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc2 is now available

2017-11-22 Thread Fam Zheng

On Wed, 11/22 21:43, Jeff Cody wrote:
> On Thu, Nov 23, 2017 at 08:47:47AM +0800, Fam Zheng wrote:
> > On Wed, 11/22 04:55, Jeff Cody wrote:
> > > On Wed, Nov 22, 2017 at 09:09:02AM +0100, Christian Borntraeger wrote:
> > > > 
> > > > 
> > > > On 11/22/2017 04:23 AM, Michael Roth wrote:
> > > > > Quoting Christian Borntraeger (2017-11-21 15:38:32)
> > > > >> forgot to cc qemu-devel
> > > > >>
> > > > >> On 11/21/2017 10:37 PM, Christian Borntraeger wrote:
> > > > >>> a quick heads up . Rc2 now triggers
> > > > >>> +qemu-img: block/block-backend.c:2088: blk_root_drained_end: 
> > > > >>> Assertion `blk->quiesce_counter' failed.
> > > > >>> for several qemu iotests. 
> > > > >>>
> > > > >>> I have not looked into any details.
> > > > > 
> > > > > It looks to be due to:
> > > > > 
> > > > > 4afeffc8572f40d8844b946a30c00b10da4442b1
> > > > > blockjob: do not allow coroutine double entry or 
> > > > > entry-after-completion
> > > > 
> > > > Yes, I can confirm that reverting this patch gets rid of this 
> > > > assertion, but
> > > > I see things like
> > > > 
> > > > --- /home/cborntra/REPOS/qemu/tests/qemu-iotests/020.out
> > > > 2017-11-21 20:19:34.785519323 +0100
> > > > +++ /home/cborntra/REPOS/qemu/build/tests/qemu-iotests/020.out.bad  
> > > > 2017-11-22 09:04:50.127612500 +0100
> > > > @@ -537,7 +537,8 @@
> > > >  wrote 65536/65536 bytes at offset 4295098368
> > > >  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > > >  No errors were found on the image.
> > > > -Image committed.
> > > > +qemu_aio_coroutine_enter: Co-routine was already scheduled in 
> > > > 'co_aio_sleep_ns'
> > > > +./common.rc: line 61: 88002 Aborted (core dumped) ( 
> > > > exec "$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" )
> > > > 
> > > 
> > > That is from the subsequent patches in the series - you will want to 
> > > revert
> > > the whole series to test, as the introduced aborts catch the illegal
> > > entries that the reverted patch sidestepped.
> > > 
> > > The series patches are:
> > > 
> > > 4afeffc
> > > 6133b39
> > > a233969
> > > d975301
> > > 
> > > Of course, these new aborts prevent improper behavior, so we may want to
> > > figure out why this is getting hit.
> > > 
> > > Unfortunately, I am traveling at the moment (waiting to board my flight), 
> > > so
> > > will have limited connectivity.
> > 
> > I'll take a look at this today and the bottom line is we revert the series 
> > until
> > a proper fix is found.
> > 
> 
> My hunch is the series is a proper fix, but uncovered other latent bugs that
> were relying on dangerous behavior.

Yes, I don't know. Either a different fix for your bug, or fixes for the latent
bugs.

Fam

Re: [Qemu-devel] [PATCH v7 0/5] fw_cfg: add DMA operations & etc/vmcoreinfo support

2017-11-22 Thread Michael S. Tsirkin

On Mon, Nov 20, 2017 at 10:55:14AM +0100, Marc-André Lureau wrote:
> Hi,
> 
> This series adds DMA operations support to the qemu fw_cfg kernel
> module and populates "etc/vmcoreinfo" with vmcoreinfo location
> details.
> 
> Note: the support for this entry handling has been merged for upcoming
> qemu release (2.11).

Does not help. Still get crashes from ltp.

> v7:
> - add a patch to fix driver remove()
> - remove DMA operatiom timeout (qemu finishes sync today)
> - synchronize the DMA transfer before reading from CPU
> - removed kmalloc() use static allocation instead
> - drop some r-b tags
> 
> v6:
> - change acpi_acquire_global_lock() error to return EINVAL
>   (instead of EBUSY)
> - replace 0 as pointer argument for NULL
> - add Gabriel r-b/a-b tags
> 
> v5:
> - resent to CC kdump people on the paddr_vmcoreinfo_note() export patch
> 
> v4:
> - export paddr_vmcoreinfo_note() to fix fw_cfg.ko build
> - fix build with !CONFIG_CRASH_CORE
> - replace the unbounded yield() loop with a usleep_range() loop and a
>   200ms timeout
> - do not write vmcoreinfo entry when running the kdump kernel (D. Hatayama)
> - drop the experimental sysfs write support patch from this series
> 
> v3: (thanks kbuild)
> - add "fw_cfg: fix the command line module name" patch
> - fix build of "fw_cfg: add DMA register" with CONFIG_FW_CFG_SYSFS_CMDLINE=y
> - fix 'Wshift-count-overflow'
> 
> v2:
> - use platform device for dma mapping
> - add etc/vmcoreinfo patch
> - some code cleanups
> 
> Marc-André Lureau (5):
>   fw_cfg: fix driver remove
>   fw_cfg: add DMA register
>   fw_cfg: do DMA read operation
>   crash: export paddr_vmcoreinfo_note()
>   fw_cfg: write vmcoreinfo details
> 
>  drivers/firmware/qemu_fw_cfg.c | 276 
> -
>  kernel/crash_core.c|   1 +
>  2 files changed, 247 insertions(+), 30 deletions(-)
> 
> -- 
> 2.15.0.277.ga3d2ad2c43

Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion

2017-11-22 Thread Xiao Guangrong




On 11/22/2017 02:19 AM, Rik van Riel wrote:


We can go with the "best" interface for what
could be a relatively slow flush (fsync on a
file on ssd/disk on the host), which requires
that the flushing task wait on completion
asynchronously.


I'd like to clarify the interface of "wait on completion
asynchronously" and KVM async page fault a bit more.

Current design of async-page-fault only works on RAM rather
than MMIO, i.e, if the page fault caused by accessing the
device memory of a emulated device, it needs to go to
userspace (QEMU) which emulates the operation in vCPU's
thread.

As i mentioned before the memory region used for vNVDIMM
flush interface should be MMIO and consider its support
on other hypervisors, so we do better push this async
mechanism into the flush interface design itself rather
than depends on kvm async-page-fault.

[Qemu-devel] [PATCH for 2.11] virtio-net: don't touch virtqueue if vm is stopped

2017-11-22 Thread Jason Wang

Guest state should not be touched if VM is stopped, unfortunately we
didn't check running state and tried to drain tx queue unconditionally
in virtio_net_set_status(). A crash was then noticed as a migration
destination when user type quit after virtqueue state is loaded but
before region cache is initialized. In this case,
virtio_net_drop_tx_queue_data() tries to access the uninitialized
region cache.

Fix this by only dropping tx queue data when vm is running.

Fixes: 283e2c2adcb80 ("net: virtio-net discards TX data after link down")
Cc: Yuri Benditovich 
Cc: Paolo Bonzini 
Cc: Stefan Hajnoczi 
Cc: Michael S. Tsirkin 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Jason Wang 
---
 hw/net/virtio-net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 150fd07..38674b0 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -288,7 +288,8 @@ static void virtio_net_set_status(struct VirtIODevice 
*vdev, uint8_t status)
 qemu_bh_cancel(q->tx_bh);
 }
 if ((n->status & VIRTIO_NET_S_LINK_UP) == 0 &&
-(queue_status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+(queue_status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+vdev->vm_running) {
 /* if tx is waiting we are likely have some packets in tx queue
  * and disabled notification */
 q->tx_waiting = 0;
-- 
2.7.4

Re: [Qemu-devel] [Qemu-block] [PATCH] block: Fix qemu crash when using scsi-block

2017-11-22 Thread Deepa Srinivasan

I agree that passing in QEMUIOVector to blk_aio_ioctl() as a holder of the 
void* buffer used in blk_aio_ioctl_entry() is unnecessary. But, as Kevin noted, 
read and write were using the QEMUIOVector in BlkRwCo.

To avoid changes to the callers of blk_aio_ioctl(), I’ll change blk_aio_prwv() 
to take a void pointer instead of QEMUIOVector* and use a union to hold the 
buffer in BlkRwCo.

> On Nov 22, 2017, at 11:24 AM, Paolo Bonzini  wrote:
> 
> On 22/11/2017 19:06, Kevin Wolf wrote:
>> Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben:
>>> On 22/11/2017 16:33, Deepa Srinivasan wrote:
 Starting qemu with the following arguments causes qemu to segfault:
 ... -device lsi,id=lsi0 -drive 
 file=iscsi:<...>,format=raw,if=none,node-name=
 iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1

 This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
 blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. 
 More
 details about the bug follow.

 blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
 coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().

 When blk_aio_ioctl() is executed from within a coroutine context (e.g.
 iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
 the current coroutine's wakeup queue. blk_aio_ioctl() then returns.

 When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:

BlkRwCo *rwco = &acb->rwco;

rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
 rwco->qiov->iov[0].iov_base);  <--- qiov is
 invalid 
 here
 ...

 In the case when blk_aio_ioctl() is called from a non-coroutine context,
 blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
 qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
 execution is complete, control returns to blk_aio_ioctl_entry() after the 
 call
 to blk_co_ioctl(). There is no invalid reference after this point, but the
 function is still holding on to invalid pointers.

 The fix is to allocate memory for the QEMUIOVector and struct iovec as 
 part of
 the request struct which the IO buffer is part of. The memory for this 
 struct is
 guaranteed to be valid till the AIO is completed.

 Signed-off-by: Deepa Srinivasan 
 Signed-off-by: Konrad Rzeszutek Wilk 
 Reviewed-by: Mark Kanda 
 ---
 block/block-backend.c  | 13 ++---
 hw/block/virtio-blk.c  |  9 -
 hw/scsi/scsi-disk.c| 10 +-
 hw/scsi/scsi-generic.c |  9 -
 include/sysemu/block-backend.h |  2 +-
 5 files changed, 28 insertions(+), 15 deletions(-)

 diff --git a/block/block-backend.c b/block/block-backend.c
 index baef8e7..c275827 100644
 --- a/block/block-backend.c
 +++ b/block/block-backend.c
 @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque)
 blk_aio_complete(acb);
 }

 -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void 
 *buf,
 +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, 
 QEMUIOVector *qiov,
   BlockCompletionFunc *cb, void *opaque)
>>> 
>>> I think this is not the best way to fix the bug, because it adds extra
>>> unnecessary code in the callers.
>>> 
>>> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
>>> same for blk_aio_prwv's "qiov" argument?
>>> 
>>> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
>>> use rwco->buf.
>> 
>> But the same struct is used for read and write requests that do use an
>> actual QEMUIOVector and not just a linear buffer.
> 
> Then let's call it "void *opaque", or make it a union (but I think
> that's overkill).
> 
> The QEMUIOVector pointer is opaque as far as blk_aio_prwv is concerned,
> and it is only created by blk_aio_ioctl for blk_aio_ioctl_entry to
> extract buf:
> 
>rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
> rwco->qiov->iov[0].iov_base);
> 
> Exposing the fake QEMUIOVector to the callers of blk_aio_ioctl is much
> uglier than using a void* for what is effectively a multi-type pointer.
> 
> Paolo

Re: [Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc2 is now available

2017-11-22 Thread Jeff Cody

On Thu, Nov 23, 2017 at 08:47:47AM +0800, Fam Zheng wrote:
> On Wed, 11/22 04:55, Jeff Cody wrote:
> > On Wed, Nov 22, 2017 at 09:09:02AM +0100, Christian Borntraeger wrote:
> > > 
> > > 
> > > On 11/22/2017 04:23 AM, Michael Roth wrote:
> > > > Quoting Christian Borntraeger (2017-11-21 15:38:32)
> > > >> forgot to cc qemu-devel
> > > >>
> > > >> On 11/21/2017 10:37 PM, Christian Borntraeger wrote:
> > > >>> a quick heads up . Rc2 now triggers
> > > >>> +qemu-img: block/block-backend.c:2088: blk_root_drained_end: 
> > > >>> Assertion `blk->quiesce_counter' failed.
> > > >>> for several qemu iotests. 
> > > >>>
> > > >>> I have not looked into any details.
> > > > 
> > > > It looks to be due to:
> > > > 
> > > > 4afeffc8572f40d8844b946a30c00b10da4442b1
> > > > blockjob: do not allow coroutine double entry or entry-after-completion
> > > 
> > > Yes, I can confirm that reverting this patch gets rid of this assertion, 
> > > but
> > > I see things like
> > > 
> > > --- /home/cborntra/REPOS/qemu/tests/qemu-iotests/020.out  2017-11-21 
> > > 20:19:34.785519323 +0100
> > > +++ /home/cborntra/REPOS/qemu/build/tests/qemu-iotests/020.out.bad
> > > 2017-11-22 09:04:50.127612500 +0100
> > > @@ -537,7 +537,8 @@
> > >  wrote 65536/65536 bytes at offset 4295098368
> > >  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > >  No errors were found on the image.
> > > -Image committed.
> > > +qemu_aio_coroutine_enter: Co-routine was already scheduled in 
> > > 'co_aio_sleep_ns'
> > > +./common.rc: line 61: 88002 Aborted (core dumped) ( exec 
> > > "$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" )
> > > 
> > 
> > That is from the subsequent patches in the series - you will want to revert
> > the whole series to test, as the introduced aborts catch the illegal
> > entries that the reverted patch sidestepped.
> > 
> > The series patches are:
> > 
> > 4afeffc
> > 6133b39
> > a233969
> > d975301
> > 
> > Of course, these new aborts prevent improper behavior, so we may want to
> > figure out why this is getting hit.
> > 
> > Unfortunately, I am traveling at the moment (waiting to board my flight), so
> > will have limited connectivity.
> 
> I'll take a look at this today and the bottom line is we revert the series 
> until
> a proper fix is found.
> 

My hunch is the series is a proper fix, but uncovered other latent bugs that
were relying on dangerous behavior.

[Qemu-devel] [PATCH 16/17] iotests: Make 191 work with qcow2 options

2017-11-22 Thread Max Reitz

In order for 191 to work with an explicit refcount_bits or compat=0.10,
we should strip format-specific information from the output--and we can
do so by using _filter_img_info.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/191 |   5 +-
 tests/qemu-iotests/191.out | 313 +
 2 files changed, 91 insertions(+), 227 deletions(-)

diff --git a/tests/qemu-iotests/191 b/tests/qemu-iotests/191
index ad785e10b1..dfad6555e4 100755
--- a/tests/qemu-iotests/191
+++ b/tests/qemu-iotests/191
@@ -45,7 +45,6 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.qemu
 
 _supported_fmt qcow2
-_unsupported_imgopts compat=0.10
 _supported_proto file
 _supported_os Linux
 
@@ -92,7 +91,7 @@ echo === Check that both top and top2 point to base now ===
 echo
 
 _send_qemu_cmd $h "{ 'execute': 'query-named-block-nodes' }" "^}" |
-_filter_generated_node_ids | _filter_actual_image_size
+_filter_generated_node_ids | _filter_actual_image_size | _filter_img_info
 
 _send_qemu_cmd $h "{ 'execute': 'quit' }" "^}"
 wait=1 _cleanup_qemu
@@ -140,7 +139,7 @@ echo === Check that both top and top2 point to base now ===
 echo
 
 _send_qemu_cmd $h "{ 'execute': 'query-named-block-nodes' }" "^}" |
-_filter_generated_node_ids | _filter_actual_image_size
+_filter_generated_node_ids | _filter_actual_image_size | _filter_img_info
 
 _send_qemu_cmd $h "{ 'execute': 'quit' }" "^}"
 wait=1 _cleanup_qemu
diff --git a/tests/qemu-iotests/191.out b/tests/qemu-iotests/191.out
index 73c0ed454c..190c5f049a 100644
--- a/tests/qemu-iotests/191.out
+++ b/tests/qemu-iotests/191.out
@@ -44,49 +44,31 @@ wrote 65536/65536 bytes at offset 1048576
 "image": {
 "backing-image": {
 "virtual-size": 67108864,
-"filename": "TEST_DIR/t.qcow2.base",
+"filename": "TEST_DIR/t.IMGFMT.base",
 "cluster-size": 65536,
-"format": "qcow2",
+"format": "IMGFMT",
 "actual-size": SIZE,
-"format-specific": {
-"type": "qcow2",
-"data": {
-"compat": "1.1",
-"lazy-refcounts": false,
-"refcount-bits": 16,
-"corrupt": false
-}
-},
 "dirty-flag": false
 },
-"backing-filename-format": "qcow2",
+"backing-filename-format": "IMGFMT",
 "virtual-size": 67108864,
-"filename": "TEST_DIR/t.qcow2.ovl2",
+"filename": "TEST_DIR/t.IMGFMT.ovl2",
 "cluster-size": 65536,
-"format": "qcow2",
+"format": "IMGFMT",
 "actual-size": SIZE,
-"format-specific": {
-"type": "qcow2",
-"data": {
-"compat": "1.1",
-"lazy-refcounts": false,
-"refcount-bits": 16,
-"corrupt": false
-}
-},
-"full-backing-filename": "TEST_DIR/t.qcow2.base",
-"backing-filename": "TEST_DIR/t.qcow2.base",
+"full-backing-filename": "TEST_DIR/t.IMGFMT.base",
+"backing-filename": "TEST_DIR/t.IMGFMT.base",
 "dirty-flag": false
 },
 "iops_wr": 0,
 "ro": false,
 "node-name": "top2",
 "backing_file_depth": 1,
-"drv": "qcow2",
+"drv": "IMGFMT",
 "iops": 0,
 "bps_wr": 0,
 "write_threshold": 0,
-"backing_file": "TEST_DIR/t.qcow2.base",
+"backing_file": "TEST_DIR/t.IMGFMT.base",
 "encrypted": false,
 "bps": 0,
 "bps_rd": 0,
@@ -95,7 +77,7 @@ wrote 65536/65536 bytes at offset 1048576
 "direct": false,
 "writeback": true
 },
-"file": "TEST_DIR/t.qcow2.ovl2",
+"file": "TEST_DIR/t.IMGFMT.ovl2",
 "encryption_key_missing": false
 },
 {
@@ -103,7 +85,7 @@ wrote 65536/65536 bytes at offset 1048576
 "detect_zeroes": "off",
 "image": {
 "virtual-size": 197120,
-"filename": "TEST_DIR/t.qcow2.ovl2",
+"filename": "TEST_DIR/t.IMGFMT.ovl2",
 "format": "file",
 "actual-size": SIZE,
 "dirty-flag": false
@@ -124,7 +106,7 @@ wrote 65536/65536 bytes at offset 1048576
 "direct": false,
 "writeback": true
 },
-"file": "TEST_DIR/t.qcow2.ovl2",
+"file": "TEST_DIR/t.IMGFMT.ovl2",
 "encryption_key_missing": false
 },

Re: [Qemu-devel] [PATCH 00/17] iotests: Fix iotests for weird formats/options

2017-11-22 Thread Max Reitz

On 2017-11-23 03:08, Max Reitz wrote:
> This series fixes the qemu-iotests for qcow, vmdk, qcow2 v2 and qcow2 v3
> with refcount_bits=1.

(By the way, excuse me for calling vmdk a weird format.  That was
supposed to be a joke.

(I know explaining a joke makes it unfunny, but I now feel a bit bad for
it.)

Max)



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH 15/17] iotests: Make 184 image-less

2017-11-22 Thread Max Reitz

184 does not need an image, so don't use one.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/184 | 25 --
 tests/qemu-iotests/184.out | 63 +++---
 2 files changed, 14 insertions(+), 74 deletions(-)

diff --git a/tests/qemu-iotests/184 b/tests/qemu-iotests/184
index ee96c99af3..2b68284d58 100755
--- a/tests/qemu-iotests/184
+++ b/tests/qemu-iotests/184
@@ -27,18 +27,12 @@ echo "QA output created by $seq"
 here=`pwd`
 status=1   # failure is the default!
 
-_cleanup()
-{
-_cleanup_test_img
-}
-trap "_cleanup; exit \$status" 0 1 2 3 15
+trap "exit \$status" 0 1 2 3 15
 
 # get standard environment, filters and checks
 . ./common.rc
 . ./common.filter
 
-_supported_fmt qcow2
-_supported_proto file
 _supported_os Linux
 
 function do_run_qemu()
@@ -55,7 +49,6 @@ function run_qemu()
   | _filter_actual_image_size
 }
 
-_make_test_img 64M
 test_throttle=$($QEMU_IMG --help|grep throttle)
 [ "$test_throttle" = "" ] && _supported_fmt throttle
 
@@ -66,12 +59,8 @@ run_qemu <

[Qemu-devel] [PATCH 14/17] iotests: Make 089 compatible with compat=0.10

2017-11-22 Thread Max Reitz

The only thing that is missing is a _filter_img_info after the
"$QEMU_IO -c info" invocations.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/089 |  4 ++--
 tests/qemu-iotests/089.out | 10 --
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/tests/qemu-iotests/089 b/tests/qemu-iotests/089
index 9bfe2307b3..0b059aba90 100755
--- a/tests/qemu-iotests/089
+++ b/tests/qemu-iotests/089
@@ -119,11 +119,11 @@ echo
 
 # Both options given directly and those given in the filename should be used
 $QEMU_IO -c "open -o driver=qcow2 json:{\"file.filename\":\"$TEST_IMG\"}" \
- -c "info" 2>&1 | _filter_testdir | _filter_imgfmt
+ -c "info" 2>&1 | _filter_img_info
 
 # Options given directly should be prioritized over those given in the filename
 $QEMU_IO -c "open -o driver=qcow2 
json:{\"driver\":\"raw\",\"file.filename\":\"$TEST_IMG\"}" \
- -c "info" 2>&1 | _filter_testdir | _filter_imgfmt
+ -c "info" 2>&1 | _filter_img_info
 
 
 # success, all done
diff --git a/tests/qemu-iotests/089.out b/tests/qemu-iotests/089.out
index 18f5fdda7a..0bf5a13ec1 100644
--- a/tests/qemu-iotests/089.out
+++ b/tests/qemu-iotests/089.out
@@ -38,17 +38,7 @@ cluster_size: 65536
 format name: IMGFMT
 cluster size: 64 KiB
 vm state offset: 512 MiB
-Format specific information:
-compat: 1.1
-lazy refcounts: false
-refcount bits: 16
-corrupt: false
 format name: IMGFMT
 cluster size: 64 KiB
 vm state offset: 512 MiB
-Format specific information:
-compat: 1.1
-lazy refcounts: false
-refcount bits: 16
-corrupt: false
 *** done
-- 
2.13.6

[Qemu-devel] [PATCH 12/17] iotests: Fix 059's reference output

2017-11-22 Thread Max Reitz

As of commit 9877860e7bd1e26ee70ab9bb5ebc34c92bf23bf5, vmdk fails
differently when opening the sample image.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/059.out | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index f6dce7947c..1ac5d56233 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -2358,5 +2358,5 @@ Offset  Length  Mapped to   File
 0x14000 0x1 0x5 TEST_DIR/t-s003.vmdk
 
 === Testing afl image with a very large capacity ===
-qemu-img: Can't get image size 'TEST_DIR/afl9.IMGFMT': File too large
+qemu-img: Could not open 'TEST_DIR/afl9.IMGFMT': Could not open 
'TEST_DIR/afl9.IMGFMT': Invalid argument
 *** done
-- 
2.13.6

[Qemu-devel] [PATCH 13/17] iotests: Fix 067 for compat=0.10

2017-11-22 Thread Max Reitz

067 works very well with compat=0.10 once you remove format-specific
information from the QMP output.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/067 |  3 +-
 tests/qemu-iotests/067.out | 97 +-
 2 files changed, 28 insertions(+), 72 deletions(-)

diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067
index 9d561ef786..fe259f6165 100755
--- a/tests/qemu-iotests/067
+++ b/tests/qemu-iotests/067
@@ -57,7 +57,8 @@ function run_qemu()
 {
 do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qmp | _filter_qemu \
   | _filter_actual_image_size \
-  | _filter_generated_node_ids | _filter_qmp_events
+  | _filter_generated_node_ids | _filter_qmp_events \
+  | _filter_img_info
 }
 
 size=128M
diff --git a/tests/qemu-iotests/067.out b/tests/qemu-iotests/067.out
index 58e83c4505..2e71cff3ce 100644
--- a/tests/qemu-iotests/067.out
+++ b/tests/qemu-iotests/067.out
@@ -3,7 +3,7 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 
 === -drive/-device and device_del ===
 
-Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device 
virtio-blk,drive=disk,id=virtio0
+Testing: -drive file=TEST_DIR/t.IMGFMT,format=IMGFMT,if=none,id=disk -device 
virtio-blk,drive=disk,id=virtio0
 {
 QMP_VERSION
 }
@@ -23,26 +23,17 @@ Testing: -drive 
file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virti
 "detect_zeroes": "off",
 "image": {
 "virtual-size": 134217728,
-"filename": "TEST_DIR/t.qcow2",
+"filename": "TEST_DIR/t.IMGFMT",
 "cluster-size": 65536,
-"format": "qcow2",
+"format": "IMGFMT",
 "actual-size": SIZE,
-"format-specific": {
-"type": "qcow2",
-"data": {
-"compat": "1.1",
-"lazy-refcounts": false,
-"refcount-bits": 16,
-"corrupt": false
-}
-},
 "dirty-flag": false
 },
 "iops_wr": 0,
 "ro": false,
 "node-name": "NODE_NAME",
 "backing_file_depth": 0,
-"drv": "qcow2",
+"drv": "IMGFMT",
 "iops": 0,
 "bps_wr": 0,
 "write_threshold": 0,
@@ -54,7 +45,7 @@ Testing: -drive 
file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virti
 "direct": false,
 "writeback": true
 },
-"file": "TEST_DIR/t.qcow2",
+"file": "TEST_DIR/t.IMGFMT",
 "encryption_key_missing": false
 },
 "qdev": "/machine/peripheral/virtio0/virtio-backend",
@@ -81,7 +72,7 @@ Testing: -drive 
file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virti
 
 === -drive/device_add and device_del ===
 
-Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk
+Testing: -drive file=TEST_DIR/t.IMGFMT,format=IMGFMT,if=none,id=disk
 {
 QMP_VERSION
 }
@@ -100,26 +91,17 @@ Testing: -drive 
file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk
 "detect_zeroes": "off",
 "image": {
 "virtual-size": 134217728,
-"filename": "TEST_DIR/t.qcow2",
+"filename": "TEST_DIR/t.IMGFMT",
 "cluster-size": 65536,
-"format": "qcow2",
+"format": "IMGFMT",
 "actual-size": SIZE,
-"format-specific": {
-"type": "qcow2",
-"data": {
-"compat": "1.1",
-"lazy-refcounts": false,
-"refcount-bits": 16,
-"corrupt": false
-}
-},
 "dirty-flag": false
 },
 "iops_wr": 0,
 "ro": false,
 "node-name": "NODE_NAME",
 "backing_file_depth": 0,
-"drv": "qcow2",
+"drv": "IMGFMT",
 "iops": 0,
 "bps_wr": 0,
 "write_threshold": 0,
@@ -131,7 +113,7 @@ Testing: -drive 
file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk
 "direct": false,
 "writeback": true
 },
-"file": "TEST_DIR/t.qcow2",
+"file": "TEST_DIR/t.IMGFMT",
 "encryption_key_missing": false
 },
 "type": "unknown"
@@ -183,26 +165,17 @@ Testing:
 "detect_zeroes": "off",

[Qemu-devel] [PATCH 08/17] iotests: Skip 103 for refcount_bits=1

2017-11-22 Thread Max Reitz

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/103 | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/103 b/tests/qemu-iotests/103
index ecbd8ebd71..d0cfab8844 100755
--- a/tests/qemu-iotests/103
+++ b/tests/qemu-iotests/103
@@ -40,6 +40,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 _supported_fmt qcow2
 _supported_proto file nfs
 _supported_os Linux
+# Internal snapshots are (currently) impossible with refcount_bits=1
+_unsupported_imgopts 'refcount_bits=1[^0-9]'
 
 IMG_SIZE=64K
 
-- 
2.13.6

[Qemu-devel] [PATCH 17/17] iotests: Filter compat-dependent info in 198

2017-11-22 Thread Max Reitz

There is a bit of image-specific information which depends on the qcow2
compat level.  Filter it so that 198 works with compat=0.10 (and any
refcount_bits value).

Note that we cannot simply drop the --format-specific switch because we
do need the "encrypt" information.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/198 | 6 --
 tests/qemu-iotests/198.out | 8 
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/198 b/tests/qemu-iotests/198
index a84a058396..54eaaf5153 100755
--- a/tests/qemu-iotests/198
+++ b/tests/qemu-iotests/198
@@ -92,12 +92,14 @@ $QEMU_IO --object $SECRET0 --object $SECRET1 -c "read -P 
0x9 0 $size" --image-op
 echo
 echo "== checking image base =="
 $QEMU_IMG info --image-opts $IMGSPECBASE | _filter_img_info --format-specific \
-| sed -e "/^disk size:/ D"
+| sed -e "/^disk size:/ D" -e '/refcount bits:/ D' -e '/compat:/ D' \
+  -e '/lazy refcounts:/ D' -e '/corrupt:/ D'
 
 echo
 echo "== checking image layer =="
 $QEMU_IMG info --image-opts $IMGSPECLAYER | _filter_img_info --format-specific 
\
-| sed -e "/^disk size:/ D"
+| sed -e "/^disk size:/ D" -e '/refcount bits:/ D' -e '/compat:/ D' \
+  -e '/lazy refcounts:/ D' -e '/corrupt:/ D'
 
 
 # success, all done
diff --git a/tests/qemu-iotests/198.out b/tests/qemu-iotests/198.out
index 2db565e16b..adb805cce9 100644
--- a/tests/qemu-iotests/198.out
+++ b/tests/qemu-iotests/198.out
@@ -36,9 +36,6 @@ image: json:{"encrypt.key-secret": "sec0", "driver": 
"IMGFMT", "file": {"driver"
 file format: IMGFMT
 virtual size: 16M (16777216 bytes)
 Format specific information:
-compat: 1.1
-lazy refcounts: false
-refcount bits: 16
 encrypt:
 ivgen alg: plain64
 hash alg: sha256
@@ -75,7 +72,6 @@ Format specific information:
 key offset: 1810432
 payload offset: 2068480
 master key iters: 1024
-corrupt: false
 
 == checking image layer ==
 image: json:{"encrypt.key-secret": "sec1", "driver": "IMGFMT", "file": 
{"driver": "file", "filename": "TEST_DIR/t.IMGFMT"}}
@@ -83,9 +79,6 @@ file format: IMGFMT
 virtual size: 16M (16777216 bytes)
 backing file: TEST_DIR/t.IMGFMT.base
 Format specific information:
-compat: 1.1
-lazy refcounts: false
-refcount bits: 16
 encrypt:
 ivgen alg: plain64
 hash alg: sha256
@@ -122,5 +115,4 @@ Format specific information:
 key offset: 1810432
 payload offset: 2068480
 master key iters: 1024
-corrupt: false
 *** done
-- 
2.13.6

[Qemu-devel] [PATCH 06/17] iotests: Drop format-specific in _filter_img_info

2017-11-22 Thread Max Reitz

_filter_img_info should remove format-specific information, too.  We
already have such a filter in _img_info, and it is very useful for
query-block-named-block-nodes (etc.), too.

However, in 198 we need that information (but we still want the rest of
the filter), so make that filtering optional.  Note that "the rest of
the filter" includes filtering of the test directory, so we can drop the
_filter_testdir from 198 at the same time.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/198   |  6 --
 tests/qemu-iotests/common.filter | 29 -
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/198 b/tests/qemu-iotests/198
index 34ef666381..a84a058396 100755
--- a/tests/qemu-iotests/198
+++ b/tests/qemu-iotests/198
@@ -91,11 +91,13 @@ $QEMU_IO --object $SECRET0 --object $SECRET1 -c "read -P 
0x9 0 $size" --image-op
 
 echo
 echo "== checking image base =="
-$QEMU_IMG info --image-opts $IMGSPECBASE | _filter_img_info | _filter_testdir 
| sed -e "/^disk size:/ D"
+$QEMU_IMG info --image-opts $IMGSPECBASE | _filter_img_info --format-specific \
+| sed -e "/^disk size:/ D"
 
 echo
 echo "== checking image layer =="
-$QEMU_IMG info --image-opts $IMGSPECLAYER | _filter_img_info | _filter_testdir 
| sed -e "/^disk size:/ D"
+$QEMU_IMG info --image-opts $IMGSPECLAYER | _filter_img_info --format-specific 
\
+| sed -e "/^disk size:/ D"
 
 
 # success, all done
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index d9237799e9..0c0e53fae7 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -139,6 +139,15 @@ _filter_img_create()
 
 _filter_img_info()
 {
+if [[ "$1" == "--format-specific" ]]; then
+local format_specific=1
+shift
+else
+local format_specific=0
+fi
+
+discard=0
+regex_json_spec_start='^ *"format-specific": \{'
 sed -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \
 -e "s#$TEST_DIR#TEST_DIR#g" \
 -e "s#$IMGFMT#IMGFMT#g" \
@@ -159,7 +168,25 @@ _filter_img_info()
 -e "/block_state_zero: \\(on\\|off\\)/d" \
 -e "/log_size: [0-9]\\+/d" \
 -e "s/iters: [0-9]\\+/iters: 1024/" \
--e "s/uuid: [-a-f0-9]\\+/uuid: ----/"
+-e "s/uuid: [-a-f0-9]\\+/uuid: ----/" 
| \
+while IFS='' read -r line; do
+if [[ $format_specific == 1 ]]; then
+discard=0
+elif [[ $line == "Format specific information:" ]]; then
+discard=1
+elif [[ $line =~ $regex_json_spec_start ]]; then
+discard=2
+regex_json_spec_end="^${line%%[^ ]*}\\},? *$"
+fi
+if [[ $discard == 0 ]]; then
+echo "$line"
+elif [[ $discard == 1 && ! $line ]]; then
+echo
+discard=0
+elif [[ $discard == 2 && $line =~ $regex_json_spec_end ]]; then
+discard=0
+fi
+done
 }
 
 # filter out offsets and file names from qemu-img map; good for both
-- 
2.13.6

[Qemu-devel] [PATCH 11/17] iotests: Fix 051 for compat=0.10

2017-11-22 Thread Max Reitz

051 has both compat=1.1 and compat=0.10 tests (once it uses
lazy_refcounts, once it tests that setting them does not work).
For the compat=0.10 tests, it already explicitly creates a suitable
image.  So let's just ignore the user-specified compat level for the
lazy_refcounts test and explicitly create a compat=1.1 image there, too.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/051| 2 ++
 tests/qemu-iotests/051.out| 1 +
 tests/qemu-iotests/051.pc.out | 1 +
 3 files changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index dba8816c9f..0c3be16489 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -131,6 +131,8 @@ echo
 echo === Enable and disable lazy refcounting on the command line, plus some 
invalid values ===
 echo
 
+_make_test_img -o compat=1.1 "$size"
+
 run_qemu -drive file="$TEST_IMG",format=qcow2,lazy-refcounts=on
 run_qemu -drive file="$TEST_IMG",format=qcow2,lazy-refcounts=off
 run_qemu -drive file="$TEST_IMG",format=qcow2,lazy-refcounts=
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index e3c6eaba57..dd9846d1ce 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -77,6 +77,7 @@ QEMU_PROG: -drive 
file=TEST_DIR/t.qcow2,file.backing.driver=qcow2,file.backing.f
 
 === Enable and disable lazy refcounting on the command line, plus some invalid 
values ===
 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=on
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) quit
diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index f2c5622cee..830c11880a 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -77,6 +77,7 @@ QEMU_PROG: -drive 
file=TEST_DIR/t.qcow2,file.backing.driver=qcow2,file.backing.f
 
 === Enable and disable lazy refcounting on the command line, plus some invalid 
values ===
 
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=on
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) quit
-- 
2.13.6

[Qemu-devel] [PATCH 05/17] iotests: Fix _img_info for backslashes

2017-11-22 Thread Max Reitz

read without -r eats backslashes.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/common.rc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index dbae7d74ba..6fa0495e10 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -338,7 +338,7 @@ _img_info()
 -e "s#$IMGFMT#IMGFMT#g" \
 -e "/^disk size:/ D" \
 -e "/actual-size/ D" | \
-while IFS='' read line; do
+while IFS='' read -r line; do
 if [[ $format_specific == 1 ]]; then
 discard=0
 elif [[ $line == "Format specific information:" ]]; then
-- 
2.13.6

[Qemu-devel] [PATCH 04/17] block/vmdk: Add blkdebug events

2017-11-22 Thread Max Reitz

This is certainly not complete, but it includes at least write_aio and
read_aio.

Signed-off-by: Max Reitz 
---
 block/vmdk.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/block/vmdk.c b/block/vmdk.c
index 1ae47b1c2e..d71cec4f31 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1075,6 +1075,8 @@ static int get_whole_cluster(BlockDriverState *bs,
 /* Read backing data before skip range */
 if (skip_start_bytes > 0) {
 if (bs->backing) {
+/* qcow2 emits this on bs->file instead of bs->backing */
+BLKDBG_EVENT(extent->file, BLKDBG_COW_READ);
 ret = bdrv_pread(bs->backing, offset, whole_grain,
  skip_start_bytes);
 if (ret < 0) {
@@ -1082,6 +1084,7 @@ static int get_whole_cluster(BlockDriverState *bs,
 goto exit;
 }
 }
+BLKDBG_EVENT(extent->file, BLKDBG_COW_WRITE);
 ret = bdrv_pwrite(extent->file, cluster_offset, whole_grain,
   skip_start_bytes);
 if (ret < 0) {
@@ -1092,6 +1095,8 @@ static int get_whole_cluster(BlockDriverState *bs,
 /* Read backing data after skip range */
 if (skip_end_bytes < cluster_bytes) {
 if (bs->backing) {
+/* qcow2 emits this on bs->file instead of bs->backing */
+BLKDBG_EVENT(extent->file, BLKDBG_COW_READ);
 ret = bdrv_pread(bs->backing, offset + skip_end_bytes,
  whole_grain + skip_end_bytes,
  cluster_bytes - skip_end_bytes);
@@ -1100,6 +1105,7 @@ static int get_whole_cluster(BlockDriverState *bs,
 goto exit;
 }
 }
+BLKDBG_EVENT(extent->file, BLKDBG_COW_WRITE);
 ret = bdrv_pwrite(extent->file, cluster_offset + skip_end_bytes,
   whole_grain + skip_end_bytes,
   cluster_bytes - skip_end_bytes);
@@ -1120,6 +1126,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData 
*m_data,
 {
 offset = cpu_to_le32(offset);
 /* update L2 table */
+BLKDBG_EVENT(extent->file, BLKDBG_L2_UPDATE);
 if (bdrv_pwrite_sync(extent->file,
 ((int64_t)m_data->l2_offset * 512)
 + (m_data->l2_index * sizeof(offset)),
@@ -1218,6 +1225,7 @@ static int get_cluster_offset(BlockDriverState *bs,
 }
 }
 l2_table = extent->l2_cache + (min_index * extent->l2_size);
+BLKDBG_EVENT(extent->file, BLKDBG_L2_LOAD);
 if (bdrv_pread(extent->file,
 (int64_t)l2_offset * 512,
 l2_table,
@@ -1393,9 +1401,13 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t 
cluster_offset,
 .iov_len= n_bytes,
 };
 qemu_iovec_init_external(&local_qiov, &iov, 1);
+
+BLKDBG_EVENT(extent->file, BLKDBG_WRITE_COMPRESSED);
 } else {
 qemu_iovec_init(&local_qiov, qiov->niov);
 qemu_iovec_concat(&local_qiov, qiov, qiov_offset, n_bytes);
+
+BLKDBG_EVENT(extent->file, BLKDBG_WRITE_AIO);
 }
 
 write_offset = cluster_offset + offset_in_cluster;
@@ -1437,6 +1449,7 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t 
cluster_offset,
 
 
 if (!extent->compressed) {
+BLKDBG_EVENT(extent->file, BLKDBG_READ_AIO);
 ret = bdrv_co_preadv(extent->file,
  cluster_offset + offset_in_cluster, bytes,
  qiov, 0);
@@ -1450,6 +1463,7 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t 
cluster_offset,
 buf_bytes = cluster_bytes * 2;
 cluster_buf = g_malloc(buf_bytes);
 uncomp_buf = g_malloc(cluster_bytes);
+BLKDBG_EVENT(extent->file, BLKDBG_READ_COMPRESSED);
 ret = bdrv_pread(extent->file,
 cluster_offset,
 cluster_buf, buf_bytes);
@@ -1527,6 +1541,8 @@ vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, 
uint64_t bytes,
 qemu_iovec_reset(&local_qiov);
 qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
 
+/* qcow2 emits this on bs->file instead of bs->backing */
+BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
 ret = bdrv_co_preadv(bs->backing, offset, n_bytes,
  &local_qiov, 0);
 if (ret < 0) {
-- 
2.13.6

[Qemu-devel] [PATCH 10/17] iotests: Fix 020 for vmdk

2017-11-22 Thread Max Reitz

vmdk cannot work with anything but vmdk backing files, so make the
backing file be the same format as the overlay.

Reported-by: John Snow 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/020 | 9 ++---
 tests/qemu-iotests/020.out | 6 --
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index d22ab81414..eac5080f83 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -111,10 +111,12 @@ echo
 echo 'Testing failing commit'
 echo
 
+TEST_IMG="$TEST_IMG.base" _make_test_img 1M
+
 # Create an image with a null backing file to which committing will fail (with
 # ENOSPC so we can distinguish the result from some generic EIO which may be
 # generated anywhere in the block layer)
-_make_test_img -b "json:{'driver': 'raw',
+_make_test_img -b "json:{'driver': '$IMGFMT',
  'file': {
  'driver': 'blkdebug',
  'inject-error': [{
@@ -123,14 +125,15 @@ _make_test_img -b "json:{'driver': 'raw',
  'once': true
  }],
  'image': {
- 'driver': 'null-co'
+ 'driver': 'file',
+ 'filename': '$TEST_IMG.base'
  }}}"
 
 # Just write anything so committing will not be a no-op
 $QEMU_IO -c 'writev 0 64k' "$TEST_IMG" | _filter_qemu_io
 
 $QEMU_IMG commit "$TEST_IMG"
-_cleanup_test_img
+_cleanup
 
 # success, all done
 echo "*** done"
diff --git a/tests/qemu-iotests/020.out b/tests/qemu-iotests/020.out
index 165b70aa49..4b722b2dd0 100644
--- a/tests/qemu-iotests/020.out
+++ b/tests/qemu-iotests/020.out
@@ -1078,7 +1078,8 @@ No errors were found on the image.
 
 Testing failing commit
 
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
backing_file=json:{'driver': 'raw',,
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 
backing_file=json:{'driver': 'IMGFMT',,
  'file': {
  'driver': 'blkdebug',,
  'inject-error': [{
@@ -1087,7 +1088,8 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT 
size=1073741824 backing_file=json:{'d
  'once': true
  }],,
  'image': {
- 'driver': 'null-co'
+ 'driver': 'file',,
+ 'filename': 'TEST_DIR/t.IMGFMT.base'
  }}}
 wrote 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-- 
2.13.6

[Qemu-devel] [PATCH 09/17] iotests: Disable some tests for compat=0.10

2017-11-22 Thread Max Reitz

Tests 080, 130, 137, and 176 simply do not work with compat=0.10 for the
reasons stated there.

177 is a bit more interesting:  Originally, it was actually very much
intended to work with compat=0.10 (it even had a special case for that).
However, it now prints the test image's map twice, and short of just not
doing that, there is no solution I can imagine that is both simple and
would leave compat=0.10 support intact.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/080 |  5 +++--
 tests/qemu-iotests/130 |  2 ++
 tests/qemu-iotests/137 |  2 ++
 tests/qemu-iotests/176 |  2 ++
 tests/qemu-iotests/177 | 13 +++--
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/qemu-iotests/080 b/tests/qemu-iotests/080
index 55044c700b..1c2bd85742 100755
--- a/tests/qemu-iotests/080
+++ b/tests/qemu-iotests/080
@@ -41,8 +41,9 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 _supported_fmt qcow2
 _supported_proto file
 _supported_os Linux
-# Internal snapshots are (currently) impossible with refcount_bits=1
-_unsupported_imgopts 'refcount_bits=1[^0-9]'
+# - Internal snapshots are (currently) impossible with refcount_bits=1
+# - This is generally a test for compat=1.1 images
+_unsupported_imgopts 'refcount_bits=1[^0-9]' 'compat=0.10'
 
 header_size=104
 
diff --git a/tests/qemu-iotests/130 b/tests/qemu-iotests/130
index e7e43de6d6..2c4b94da1b 100755
--- a/tests/qemu-iotests/130
+++ b/tests/qemu-iotests/130
@@ -45,6 +45,8 @@ _supported_fmt qcow2
 _supported_proto generic
 _unsupported_proto vxhs
 _supported_os Linux
+# We are going to use lazy-refcounts
+_unsupported_imgopts 'compat=0.10'
 
 qemu_comm_method="monitor"
 
diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
index eb91e517d7..5a01250005 100755
--- a/tests/qemu-iotests/137
+++ b/tests/qemu-iotests/137
@@ -41,6 +41,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 _supported_fmt qcow2
 _supported_proto file
 _supported_os Linux
+# We are going to use lazy-refcounts
+_unsupported_imgopts 'compat=0.10'
 
 
 _make_test_img 64M
diff --git a/tests/qemu-iotests/176 b/tests/qemu-iotests/176
index b8dc17c592..d38b3aeb91 100755
--- a/tests/qemu-iotests/176
+++ b/tests/qemu-iotests/176
@@ -48,6 +48,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 _supported_fmt qcow2
 _supported_proto file
 _supported_os Linux
+# Persistent dirty bitmaps require compat=1.1
+_unsupported_imgopts 'compat=0.10'
 
 function run_qemu()
 {
diff --git a/tests/qemu-iotests/177 b/tests/qemu-iotests/177
index 28990977f1..86cf25f855 100755
--- a/tests/qemu-iotests/177
+++ b/tests/qemu-iotests/177
@@ -39,6 +39,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto file
+# This test assumes that discard leaves zero clusters
+_unsupported_imgopts 'compat=0.10'
 
 CLUSTER_SIZE=1M
 size=128M
@@ -93,15 +95,6 @@ echo "== verify image content =="
 
 function verify_io()
 {
-if ($QEMU_IMG info -f "$IMGFMT" "$TEST_IMG" |
-   grep "compat: 0.10" > /dev/null); then
-# For v2 images, discarded clusters are read from the backing file
-discarded=11
-else
-# Discarded clusters are zeroed for v3 or later
-discarded=0
-fi
-
 echo read -P 22 0 1000
 echo read -P 33 1000 128k
 echo read -P 22 132072 7871512
@@ -109,7 +102,7 @@ function verify_io()
 echo read -P 22 10096640 23457792
 echo read -P 0 32M 32M
 echo read -P 22 64M 13M
-echo read -P $discarded 77M 29M
+echo read -P 0 77M 29M
 echo read -P 22 106M 4M
 echo read -P 11 110M 18M
 }
-- 
2.13.6

[Qemu-devel] [PATCH 03/17] block/qcow: Add blkdebug events

2017-11-22 Thread Max Reitz

This is not necessarily complete, but it should include the most
important places.

Signed-off-by: Max Reitz 
---
 block/qcow.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/block/qcow.c b/block/qcow.c
index 9569deeaf0..d552a6eba8 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -379,6 +379,7 @@ static int get_cluster_offset(BlockDriverState *bs,
 /* update the L1 entry */
 s->l1_table[l1_index] = l2_offset;
 tmp = cpu_to_be64(l2_offset);
+BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE);
 ret = bdrv_pwrite_sync(bs->file,
s->l1_table_offset + l1_index * sizeof(tmp),
&tmp, sizeof(tmp));
@@ -409,6 +410,7 @@ static int get_cluster_offset(BlockDriverState *bs,
 }
 }
 l2_table = s->l2_cache + (min_index << s->l2_bits);
+BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
 if (new_l2_table) {
 memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
 ret = bdrv_pwrite_sync(bs->file, l2_offset, l2_table,
@@ -432,6 +434,7 @@ static int get_cluster_offset(BlockDriverState *bs,
 ((cluster_offset & QCOW_OFLAG_COMPRESSED) && allocate == 1)) {
 if (!allocate)
 return 0;
+BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC);
 /* allocate a new cluster */
 if ((cluster_offset & QCOW_OFLAG_COMPRESSED) &&
 (n_end - n_start) < s->cluster_sectors) {
@@ -447,6 +450,7 @@ static int get_cluster_offset(BlockDriverState *bs,
 }
 cluster_offset = QEMU_ALIGN_UP(cluster_offset, s->cluster_size);
 /* write the cluster content */
+BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
 ret = bdrv_pwrite(bs->file, cluster_offset, s->cluster_cache,
   s->cluster_size);
 if (ret < 0) {
@@ -486,6 +490,7 @@ static int get_cluster_offset(BlockDriverState *bs,
   NULL) < 0) {
 return -EIO;
 }
+BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
 ret = bdrv_pwrite(bs->file,
   cluster_offset + i * 512,
   s->cluster_data, 512);
@@ -503,6 +508,11 @@ static int get_cluster_offset(BlockDriverState *bs,
 /* update L2 table */
 tmp = cpu_to_be64(cluster_offset);
 l2_table[l2_index] = tmp;
+if (allocate == 2) {
+BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED);
+} else {
+BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
+}
 ret = bdrv_pwrite_sync(bs->file, l2_offset + l2_index * sizeof(tmp),
&tmp, sizeof(tmp));
 if (ret < 0) {
@@ -579,6 +589,7 @@ static int decompress_cluster(BlockDriverState *bs, 
uint64_t cluster_offset)
 if (s->cluster_cache_offset != coffset) {
 csize = cluster_offset >> (63 - s->cluster_bits);
 csize &= (s->cluster_size - 1);
+BLKDBG_EVENT(bs->file, BLKDBG_READ_COMPRESSED);
 ret = bdrv_pread(bs->file, coffset, s->cluster_data, csize);
 if (ret != csize)
 return -1;
@@ -635,6 +646,8 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 hd_iov.iov_len = n * 512;
 qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
 qemu_co_mutex_unlock(&s->lock);
+/* qcow2 emits this on bs->file instead of bs->backing */
+BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
 ret = bdrv_co_readv(bs->backing, sector_num, n, &hd_qiov);
 qemu_co_mutex_lock(&s->lock);
 if (ret < 0) {
@@ -661,6 +674,7 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, 
int64_t sector_num,
 hd_iov.iov_len = n * 512;
 qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
 qemu_co_mutex_unlock(&s->lock);
+BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
 ret = bdrv_co_readv(bs->file,
 (cluster_offset >> 9) + index_in_cluster,
 n, &hd_qiov);
@@ -754,6 +768,7 @@ static coroutine_fn int qcow_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 hd_iov.iov_len = n * 512;
 qemu_iovec_init_external(&hd_qiov, &hd_iov, 1);
 qemu_co_mutex_unlock(&s->lock);
+BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
 ret = bdrv_co_writev(bs->file,
  (cluster_offset >> 9) + index_in_cluster,
  n, &hd_qiov);
@@ -1048,6 +1063,7 @@ qcow_co_pwritev_compressed(BlockDriverState *bs, uint64_t 
offset,
 .iov_len= out_len,
 };
 qemu_iovec_init_external(&hd_qiov, &iov, 1);
+BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
 ret =

[Qemu-devel] [PATCH 07/17] iotests: Forbid 020 for non-file protocols

2017-11-22 Thread Max Reitz

This test does funny things like TEST_IMG="TEST_IMG.base" _make_test_img
that usually only work with the file protocol.  More specifically, they
do not work with the most interesting non-file protocols, so we might as
well skip this for anything but file.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/020 | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index cefe3a830e..d22ab81414 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -42,18 +42,12 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 # Any format supporting backing files
 _supported_fmt qcow qcow2 vmdk qed
-_supported_proto generic
-_unsupported_proto vxhs
+_supported_proto file
 _supported_os Linux
 _unsupported_imgopts "subformat=monolithicFlat" \
  "subformat=twoGbMaxExtentFlat" \
  "subformat=twoGbMaxExtentSparse"
 
-# NFS does not support bdrv_reopen_prepare thus qemu-img commit fails.
-if [ "$IMGPROTO" = "nfs" ]; then
-_notrun "image protocol $IMGPROTO does not support bdrv_commit"
-fi
-
 TEST_OFFSETS="0 4294967296"
 
 TEST_IMG_SAVE="$TEST_IMG"
-- 
2.13.6

[Qemu-devel] [PATCH 02/17] qcow2: No persistent dirty bitmaps for compat=0.10

2017-11-22 Thread Max Reitz

Persistent dirty bitmaps require a properly functioning
autoclear_features field, or we cannot track when an unsupporting
program might overwrite them.  Therefore, we cannot support them for
compat=0.10 images.

Signed-off-by: Max Reitz 
---
 block/qcow2-bitmap.c | 10 ++
 block/qcow2.c| 14 +++---
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index f45e46cfbd..efa10c6663 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1449,6 +1449,16 @@ bool qcow2_can_store_new_dirty_bitmap(BlockDriverState 
*bs,
 bool found;
 Qcow2BitmapList *bm_list;
 
+if (s->qcow_version < 3) {
+/* Without autoclear_features, we would always have to assume
+ * that a program without persistent dirty bitmap support has
+ * accessed this qcow2 file when opening it, and would thus
+ * have to drop all dirty bitmaps (defeating their purpose).
+ */
+error_setg(errp, "Cannot store dirty bitmaps in qcow2 v2 files");
+goto fail;
+}
+
 if (check_constraints_on_bitmap(bs, name, granularity, errp) != 0) {
 goto fail;
 }
diff --git a/block/qcow2.c b/block/qcow2.c
index 1914a940e5..cabf93e43c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -302,9 +302,17 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 }
 
 if (!(s->autoclear_features & QCOW2_AUTOCLEAR_BITMAPS)) {
-warn_report("a program lacking bitmap support "
-"modified this file, so all bitmaps are now "
-"considered inconsistent");
+if (s->qcow_version < 3) {
+/* Let's be a bit more specific */
+warn_report("This qcow2 v2 image contains bitmaps, but "
+"they may have been modified by a program "
+"without persistent bitmap support; so now "
+"they must all be considered inconsistent");
+} else {
+warn_report("a program lacking bitmap support "
+"modified this file, so all bitmaps are now "
+"considered inconsistent");
+}
 error_printf("Some clusters may be leaked, "
  "run 'qemu-img check -r' on the image "
  "file to fix.");
-- 
2.13.6

[Qemu-devel] [PATCH 00/17] iotests: Fix iotests for weird formats/options

2017-11-22 Thread Max Reitz

This series fixes the qemu-iotests for qcow, vmdk, qcow2 v2 and qcow2 v3
with refcount_bits=1.

Patches 1 and 2 contain real fixes (not urgent, though, so no need to
hurry for 2.11--we can take them into 2.11 if we want to, but there is
no absolute need for them).

Patches 3 and 4 add blkdebug events to qcow and vmdk so iotests 020 can
work with them even after patch 10.

Patches 5 and 6 are general "fixes" for the iotests infrastructure.

Patches 7, 8, and 9 add some missing skips under certain circumstances
to tests that need them.

The rest of this series (patches 10 through 17) actually fix tests so
they work for the formats and options mentioned above.

(Fun fact: qcow v1 wasn't broken before this series.  But it would be
 broken by patch 10 if I didn't include patch 3.  That is why I
 mentioned it above.)


Personal note:  I should really stop writing bash tests, at least as
soon as there is QMP involved.  While working on this series I got
sidetracked a bit and actually wrote some iotests.py functions that may
come in handy next time I write a test.

(I hate to write Python tests because the boilerplate seems so large and
 the debugging is so hard.  But there is test 194 which shows that it is
 possible to write simple bash-like tests as well--and that is how I
 should probably write tests from now on.)


Max Reitz (17):
  block/vmdk: Fix , instead of ; at end of line
  qcow2: No persistent dirty bitmaps for compat=0.10
  block/qcow: Add blkdebug events
  block/vmdk: Add blkdebug events
  iotests: Fix _img_info for backslashes
  iotests: Drop format-specific in _filter_img_info
  iotests: Forbid 020 for non-file protocols
  iotests: Skip 103 for refcount_bits=1
  iotests: Disable some tests for compat=0.10
  iotests: Fix 020 for vmdk
  iotests: Fix 051 for compat=0.10
  iotests: Fix 059's reference output
  iotests: Fix 067 for compat=0.10
  iotests: Make 089 compatible with compat=0.10
  iotests: Make 184 image-less
  iotests: Make 191 work with qcow2 options
  iotests: Filter compat-dependent info in 198

 block/qcow.c |  16 ++
 block/qcow2-bitmap.c |  10 ++
 block/qcow2.c|  14 +-
 block/vmdk.c |  18 ++-
 tests/qemu-iotests/020   |  17 +--
 tests/qemu-iotests/020.out   |   6 +-
 tests/qemu-iotests/051   |   2 +
 tests/qemu-iotests/051.out   |   1 +
 tests/qemu-iotests/051.pc.out|   1 +
 tests/qemu-iotests/059.out   |   2 +-
 tests/qemu-iotests/067   |   3 +-
 tests/qemu-iotests/067.out   |  97 
 tests/qemu-iotests/080   |   5 +-
 tests/qemu-iotests/089   |   4 +-
 tests/qemu-iotests/089.out   |  10 --
 tests/qemu-iotests/103   |   2 +
 tests/qemu-iotests/130   |   2 +
 tests/qemu-iotests/137   |   2 +
 tests/qemu-iotests/176   |   2 +
 tests/qemu-iotests/177   |  13 +-
 tests/qemu-iotests/184   |  25 +---
 tests/qemu-iotests/184.out   |  63 ++--
 tests/qemu-iotests/191   |   5 +-
 tests/qemu-iotests/191.out   | 313 +++
 tests/qemu-iotests/198   |   8 +-
 tests/qemu-iotests/198.out   |   8 -
 tests/qemu-iotests/common.filter |  29 +++-
 tests/qemu-iotests/common.rc |   2 +-
 28 files changed, 254 insertions(+), 426 deletions(-)

-- 
2.13.6

[Qemu-devel] [PATCH 01/17] block/vmdk: Fix , instead of ; at end of line

2017-11-22 Thread Max Reitz

Signed-off-by: Max Reitz 
---
 block/vmdk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index c665bcc977..1ae47b1c2e 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1398,7 +1398,7 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t 
cluster_offset,
 qemu_iovec_concat(&local_qiov, qiov, qiov_offset, n_bytes);
 }
 
-write_offset = cluster_offset + offset_in_cluster,
+write_offset = cluster_offset + offset_in_cluster;
 ret = bdrv_co_pwritev(extent->file, write_offset, n_bytes,
   &local_qiov, 0);
 
-- 
2.13.6

Re: [Qemu-devel] [PATCH] Add a blog post about HAXM acceleration on Windows

2017-11-22 Thread Yu Ning




On 11/23/2017 1:03, Thomas Huth wrote:

On 22.11.2017 15:52, Yu Ning wrote:

On 11/22/2017 20:25, Thomas Huth wrote:

[...]

Would it be OK to remove the qemu-debian-wheezy-gui-with-haxm.png from
the blog post?

Yes, I'm fine with removing it.  Sorry I haven't installed Jekyll and
didn't test the rendering.

Would you confirm whether that's the only change I need to make in v2?

No need to respin, since this was just a one-line change, I was able to
do it on my own (I still removed the screenshot, even if it seemed to be
working with Paolo's patch to the CSS, since the screenshot looked just
a bit too big for the blog - I hope that's OK for you).


Yes, absolutely :)


So the blog post is now online:

  https://www.qemu.org/2017/11/22/haxm-usage-windows/


Looks nice, thanks!

Re: [Qemu-devel] [PATCH qemu] vfio/spapr: Allow fallback to SPAPR TCE IOMMU v1

2017-11-22 Thread Alexey Kardashevskiy

On 23/11/17 00:39, David Gibson wrote:
> On Wed, Nov 22, 2017 at 04:15:52PM +1100, Alexey Kardashevskiy wrote:
>> The vfio_iommu_spapr_tce driver always advertises v1 and v2 IOMMU support,
>> however PR KVM (a special version of KVM designed to work in
>> a paravirtualized system; these days used for nested virtualizaion) only
>> supports the "pseries" platform which does not support v2. Since there is
>> no way to choose the IOMMU version in QEMU, it fails to start.
>>
>> This adds a fallback to the v1 IOMMU if v2 cannot be used.
>>
>> Signed-off-by: Alexey Kardashevskiy 
> 
> The fallback itself isn't a bad idea, but your commit message contains
> several inaccurracies.  KVM PR is not particularly designed to work in
> a paravirtualized system, and it doesn't only support the pseries
> platform (as guest *or* host).  It's actually a lot more general than
> KVM HV - just slow, not that well tested and missing a number of
> features that no-one's bothered to port to it.

Well, true. I kinda tried to give an example of how this may be useful and
exaggerated a bit, plus my ignorance :) I'll repost if Alex does not have
objections otherwise.



> 
>> ---
>>  hw/vfio/common.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index 7b2924c..cd81cc9 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1040,6 +1040,11 @@ static int vfio_connect_container(VFIOGroup *group, 
>> AddressSpace *as,
>>  v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU;
>>  ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
>>  if (ret) {
>> +container->iommu_type = VFIO_SPAPR_TCE_IOMMU;
>> +v2 = false;
>> +ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
>> +}
>> +if (ret) {
>>  error_setg_errno(errp, errno, "failed to set iommu for 
>> container");
>>  ret = -errno;
>>  goto free_container_exit;
> 


-- 
Alexey



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc2 is now available

2017-11-22 Thread Fam Zheng

On Wed, 11/22 04:55, Jeff Cody wrote:
> On Wed, Nov 22, 2017 at 09:09:02AM +0100, Christian Borntraeger wrote:
> > 
> > 
> > On 11/22/2017 04:23 AM, Michael Roth wrote:
> > > Quoting Christian Borntraeger (2017-11-21 15:38:32)
> > >> forgot to cc qemu-devel
> > >>
> > >> On 11/21/2017 10:37 PM, Christian Borntraeger wrote:
> > >>> a quick heads up . Rc2 now triggers
> > >>> +qemu-img: block/block-backend.c:2088: blk_root_drained_end: Assertion 
> > >>> `blk->quiesce_counter' failed.
> > >>> for several qemu iotests. 
> > >>>
> > >>> I have not looked into any details.
> > > 
> > > It looks to be due to:
> > > 
> > > 4afeffc8572f40d8844b946a30c00b10da4442b1
> > > blockjob: do not allow coroutine double entry or entry-after-completion
> > 
> > Yes, I can confirm that reverting this patch gets rid of this assertion, but
> > I see things like
> > 
> > --- /home/cborntra/REPOS/qemu/tests/qemu-iotests/020.out2017-11-21 
> > 20:19:34.785519323 +0100
> > +++ /home/cborntra/REPOS/qemu/build/tests/qemu-iotests/020.out.bad  
> > 2017-11-22 09:04:50.127612500 +0100
> > @@ -537,7 +537,8 @@
> >  wrote 65536/65536 bytes at offset 4295098368
> >  64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >  No errors were found on the image.
> > -Image committed.
> > +qemu_aio_coroutine_enter: Co-routine was already scheduled in 
> > 'co_aio_sleep_ns'
> > +./common.rc: line 61: 88002 Aborted (core dumped) ( exec 
> > "$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" )
> > 
> 
> That is from the subsequent patches in the series - you will want to revert
> the whole series to test, as the introduced aborts catch the illegal
> entries that the reverted patch sidestepped.
> 
> The series patches are:
> 
> 4afeffc
> 6133b39
> a233969
> d975301
> 
> Of course, these new aborts prevent improper behavior, so we may want to
> figure out why this is getting hit.
> 
> Unfortunately, I am traveling at the moment (waiting to board my flight), so
> will have limited connectivity.

I'll take a look at this today and the bottom line is we revert the series until
a proper fix is found.

Fa

Re: [Qemu-devel] [PATCH v7 11/13] xilinx_spips: Don't set TX FIFO UNDERFLOW at cmd done

2017-11-22 Thread Alistair Francis

On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias
 wrote:
> Don't set TX FIFO UNDERFLOW interrupt after done transmiting the commands.

after transmitting the commands

> Also update interrupts after reading out the interrupt status.
>
> Signed-off-by: Francisco Iglesias 

Acked-by: Alistair Francis 

Alistair


> ---
>  hw/ssi/xilinx_spips.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
> index 7f0f317..159a89d 100644
> --- a/hw/ssi/xilinx_spips.c
> +++ b/hw/ssi/xilinx_spips.c
> @@ -329,9 +329,6 @@ static void xilinx_spips_flush_txfifo(XilinxSPIPS *s)
>  uint8_t addr_length;
>
>  if (fifo8_is_empty(&s->tx_fifo)) {
> -if (!(s->regs[R_LQSPI_CFG] & LQSPI_CFG_LQ_MODE)) {
> -s->regs[R_INTR_STATUS] |= IXR_TX_FIFO_UNDERFLOW;
> -}
>  xilinx_spips_update_ixr(s);
>  return;
>  } else if (s->snoop_state == SNOOP_STRIPING) {
> @@ -530,6 +527,7 @@ static uint64_t xilinx_spips_read(void *opaque, hwaddr 
> addr,
>  ret = s->regs[addr] & IXR_ALL;
>  s->regs[addr] = 0;
>  DB_PRINT_L(0, "addr=" TARGET_FMT_plx " = %x\n", addr * 4, ret);
> +xilinx_spips_update_ixr(s);
>  return ret;
>  case R_INTR_MASK:
>  mask = IXR_ALL;
> --
> 2.9.3
>
>

Re: [Qemu-devel] [PATCH v7 10/13] xilinx_spips: Add support for 4 byte addresses in the LQSPI

2017-11-22 Thread Alistair Francis

On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias
 wrote:
> Add support for 4 byte addresses in the LQSPI and correct LQSPI_CFG_SEP_BUS.
>
> Signed-off-by: Francisco Iglesias 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/ssi/xilinx_spips.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
> index 3a98799..7f0f317 100644
> --- a/hw/ssi/xilinx_spips.c
> +++ b/hw/ssi/xilinx_spips.c
> @@ -92,8 +92,9 @@
>  #define R_LQSPI_CFG_RESET   0x03A002EB
>  #define LQSPI_CFG_LQ_MODE   (1U << 31)
>  #define LQSPI_CFG_TWO_MEM   (1 << 30)
> -#define LQSPI_CFG_SEP_BUS   (1 << 30)
> +#define LQSPI_CFG_SEP_BUS   (1 << 29)
>  #define LQSPI_CFG_U_PAGE(1 << 28)
> +#define LQSPI_CFG_ADDR4 (1 << 27)
>  #define LQSPI_CFG_MODE_EN   (1 << 25)
>  #define LQSPI_CFG_MODE_WIDTH8
>  #define LQSPI_CFG_MODE_SHIFT16
> @@ -702,6 +703,9 @@ static void lqspi_load_cache(void *opaque, hwaddr addr)
>  fifo8_push(&s->tx_fifo, s->regs[R_LQSPI_CFG] & LQSPI_CFG_INST_CODE);
>  /* read address */
>  DB_PRINT_L(0, "pushing read address %06x\n", flash_addr);
> +if (s->regs[R_LQSPI_CFG] & LQSPI_CFG_ADDR4) {
> +fifo8_push(&s->tx_fifo, (uint8_t)(flash_addr >> 24));
> +}
>  fifo8_push(&s->tx_fifo, (uint8_t)(flash_addr >> 16));
>  fifo8_push(&s->tx_fifo, (uint8_t)(flash_addr >> 8));
>  fifo8_push(&s->tx_fifo, (uint8_t)flash_addr);
> --
> 2.9.3
>
>

Re: [Qemu-devel] [PATCH v7 06/13] xilinx_spips: Update striping to be big-endian bit order

2017-11-22 Thread Alistair Francis

On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias
 wrote:
> Update striping functionality to be big-endian bit order (as according to
> the Zynq-7000 Technical Reference Manual). Output thereafter the even bits
> into the flash memory connected to the lower QSPI bus and the odd bits into
> the flash memory connected to the upper QSPI bus.
>
> Signed-off-by: Francisco Iglesias 
> ---
>  hw/ssi/xilinx_spips.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
> index 559fa79..7accf5d 100644
> --- a/hw/ssi/xilinx_spips.c
> +++ b/hw/ssi/xilinx_spips.c
> @@ -208,14 +208,14 @@ static void xilinx_spips_reset(DeviceState *d)
>  xilinx_spips_update_cs_lines(s);
>  }
>
> -/* N way (num) in place bit striper. Lay out row wise bits (LSB to MSB)
> +/* N way (num) in place bit striper. Lay out row wise bits (MSB to LSB)
>   * column wise (from element 0 to N-1). num is the length of x, and dir
>   * reverses the direction of the transform. Best illustrated by example:
>   * Each digit in the below array is a single bit (num == 3):
>   *
> - * {{ 76543210, }  - stripe (dir == false) -> {{ FCheb630, }
> - *  { hgfedcba, }  { GDAfc741, }
> - *  { HGFEDCBA, }} < upstripe (dir == true) -  { HEBgda52, }}
> + * {{ 76543210, }  - stripe (dir == false) -> {{ 741gdaFC, }
> + *  { hgfedcba, }  { 630fcHEB, }
> + *  { HGFEDCBA, }} < upstripe (dir == true) -  { 52hebGDA, }}
>   */
>
>  static inline void stripe8(uint8_t *x, int num, bool dir)
> @@ -223,15 +223,15 @@ static inline void stripe8(uint8_t *x, int num, bool 
> dir)
>  uint8_t r[num];
>  memset(r, 0, sizeof(uint8_t) * num);
>  int idx[2] = {0, 0};
> -int bit[2] = {0, 0};
> +int bit[2] = {0, 7};
>  int d = dir;
>
>  for (idx[0] = 0; idx[0] < num; ++idx[0]) {
> -for (bit[0] = 0; bit[0] < 8; ++bit[0]) {
> -r[idx[d]] |= x[idx[!d]] & 1 << bit[!d] ? 1 << bit[d] : 0;
> +for (bit[0] = 7; bit[0] != -1; bit[0] += -1) {

I think this is easier to read:
for (bit[0] = 7; bit[0] >= 0; bit[0]--)

> +r[idx[!d]] |= x[idx[d]] & 1 << bit[d] ? 1 << bit[!d] : 0;
>  idx[1] = (idx[1] + 1) % num;
>  if (!idx[1]) {
> -bit[1]++;
> +bit[1] += -1;

bit[1]--

Otherwise:

Acked-by: Alistair Francis 

Alistair

>  }
>  }
>  }
> @@ -266,8 +266,9 @@ static void xilinx_spips_flush_txfifo(XilinxSPIPS *s)
>  }
>
>  for (i = 0; i < num_effective_busses(s); ++i) {
> +int bus = num_effective_busses(s) - 1 - i;
>  DB_PRINT_L(debug_level, "tx = %02x\n", tx_rx[i]);
> -tx_rx[i] = ssi_transfer(s->spi[i], (uint32_t)tx_rx[i]);
> +tx_rx[i] = ssi_transfer(s->spi[bus], (uint32_t)tx_rx[i]);
>  DB_PRINT_L(debug_level, "rx = %02x\n", tx_rx[i]);
>  }
>
> --
> 2.9.3
>
>

Re: [Qemu-devel] [PATCH v7 05/13] xilinx_spips: Move FlashCMD, XilinxQSPIPS and XilinxSPIPSClass

2017-11-22 Thread Alistair Francis

On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias
 wrote:
> Move the FlashCMD enum, XilinxQSPIPS and XilinxSPIPSClass structures to the
> header for consistency (struct XilinxSPIPS is found there). Also move out
> a define and remove two dubbel included headers (while touching the code).

s/dubbel/double/g

> Finally, add 4 byte address commands to the FlashCMD enum.

This probably could be a separate patch, but I'm not fussed either way.

>
> Signed-off-by: Francisco Iglesias 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/ssi/xilinx_spips.c | 35 ---
>  include/hw/ssi/xilinx_spips.h | 34 ++
>  2 files changed, 34 insertions(+), 35 deletions(-)
>
> diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c
> index ef56d35..559fa79 100644
> --- a/hw/ssi/xilinx_spips.c
> +++ b/hw/ssi/xilinx_spips.c
> @@ -27,8 +27,6 @@
>  #include "sysemu/sysemu.h"
>  #include "hw/ptimer.h"
>  #include "qemu/log.h"
> -#include "qemu/fifo8.h"
> -#include "hw/ssi/ssi.h"
>  #include "qemu/bitops.h"
>  #include "hw/ssi/xilinx_spips.h"
>  #include "qapi/error.h"
> @@ -116,44 +114,11 @@
>
>  /* 16MB per linear region */
>  #define LQSPI_ADDRESS_BITS 24
> -/* Bite off 4k chunks at a time */
> -#define LQSPI_CACHE_SIZE 1024
>
>  #define SNOOP_CHECKING 0xFF
>  #define SNOOP_NONE 0xFE
>  #define SNOOP_STRIPING 0
>
> -typedef enum {
> -READ = 0x3,
> -FAST_READ = 0xb,
> -DOR = 0x3b,
> -QOR = 0x6b,
> -DIOR = 0xbb,
> -QIOR = 0xeb,
> -
> -PP = 0x2,
> -DPP = 0xa2,
> -QPP = 0x32,
> -} FlashCMD;
> -
> -typedef struct {
> -XilinxSPIPS parent_obj;
> -
> -uint8_t lqspi_buf[LQSPI_CACHE_SIZE];
> -hwaddr lqspi_cached_addr;
> -Error *migration_blocker;
> -bool mmio_execution_enabled;
> -} XilinxQSPIPS;
> -
> -typedef struct XilinxSPIPSClass {
> -SysBusDeviceClass parent_class;
> -
> -const MemoryRegionOps *reg_ops;
> -
> -uint32_t rx_fifo_size;
> -uint32_t tx_fifo_size;
> -} XilinxSPIPSClass;
> -
>  static inline int num_effective_busses(XilinxSPIPS *s)
>  {
>  return (s->regs[R_LQSPI_CFG] & LQSPI_CFG_SEP_BUS &&
> diff --git a/include/hw/ssi/xilinx_spips.h b/include/hw/ssi/xilinx_spips.h
> index 06aa096..7f9e2fc 100644
> --- a/include/hw/ssi/xilinx_spips.h
> +++ b/include/hw/ssi/xilinx_spips.h
> @@ -32,6 +32,22 @@ typedef struct XilinxSPIPS XilinxSPIPS;
>
>  #define XLNX_SPIPS_R_MAX(0x100 / 4)
>
> +/* Bite off 4k chunks at a time */
> +#define LQSPI_CACHE_SIZE 1024
> +
> +typedef enum {
> +READ = 0x3, READ_4 = 0x13,
> +FAST_READ = 0xb,FAST_READ_4 = 0x0c,
> +DOR = 0x3b, DOR_4 = 0x3c,
> +QOR = 0x6b, QOR_4 = 0x6c,
> +DIOR = 0xbb,DIOR_4 = 0xbc,
> +QIOR = 0xeb,QIOR_4 = 0xec,
> +
> +PP = 0x2,   PP_4 = 0x12,
> +DPP = 0xa2,
> +QPP = 0x32, QPP_4 = 0x34,
> +} FlashCMD;
> +
>  struct XilinxSPIPS {
>  SysBusDevice parent_obj;
>
> @@ -56,6 +72,24 @@ struct XilinxSPIPS {
>  uint32_t regs[XLNX_SPIPS_R_MAX];
>  };
>
> +typedef struct {
> +XilinxSPIPS parent_obj;
> +
> +uint8_t lqspi_buf[LQSPI_CACHE_SIZE];
> +hwaddr lqspi_cached_addr;
> +Error *migration_blocker;
> +bool mmio_execution_enabled;
> +} XilinxQSPIPS;
> +
> +typedef struct XilinxSPIPSClass {
> +SysBusDeviceClass parent_class;
> +
> +const MemoryRegionOps *reg_ops;
> +
> +uint32_t rx_fifo_size;
> +uint32_t tx_fifo_size;
> +} XilinxSPIPSClass;
> +
>  #define TYPE_XILINX_SPIPS "xlnx.ps7-spi"
>  #define TYPE_XILINX_QSPIPS "xlnx.ps7-qspi"
>
> --
> 2.9.3
>
>

Re: [Qemu-devel] [PATCH v7 00/13] Add support for the ZynqMP Generic QSPI

2017-11-22 Thread Alistair Francis

On Tue, Nov 21, 2017 at 10:39 AM, Peter Maydell
 wrote:
> On 3 November 2017 at 00:00, Francisco Iglesias
>  wrote:
>> Hi,
>>
>> This patch series is an attempt to add support for the ZynqMP QSPI 
>> (consisting
>> of the Generic QSPI and the legacy QSPI) to the xlnx-zcu102 board and connect
>> Numonyx n25q512a11 flashes to the QSPI. Also some functionality is added to
>> m25p80.
>>
>> The series starts by adding support in m25p80 for continous read out of 
>> status
>> registers, SST flash READ ID commands, bank address register accesses, bulk
>> erase (0x60) and two Numonyx flashes (n25q512a11 and n25q512a13). Thereafter 
>> it
>> updates the striping behaviour to be bit big endiann in the Xilinx QSPI model
>> and adds support for RX discard, zero pumping according transfer register 
>> and 4
>> byte LQSPI addresses. Finally it adds support for the ZynqMP Generic QSPI and
>> adds the ZynqMP QSPI to the xlnx-zcu102 board.
>>
>> Best regards,
>> Francisco Iglesias
>
> Hi; just a note to say that I'm assuming the Xilinx folk are going
> to review the xilinx_spips patches in this set...

Yep, we will!

Alistair

>
> thanks
> -- PMM
>

Re: [Qemu-devel] [PATCH 3/5] nbd/server: add helper nbd_opt_invalid

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/server.c | 74 
> +---
>  1 file changed, 46 insertions(+), 28 deletions(-)
> 

> @@ -418,8 +447,8 @@ static int nbd_negotiate_handle_info(NBDClient *client, 
> uint16_t myflags,
>  be16_to_cpus(&requests);
>  trace_nbd_negotiate_handle_info_requests(requests);
>  if (requests != client->optlen / sizeof(request)) {
> -msg = "incorrect number of  requests for overall length";
> -goto invalid;
> +return nbd_opt_invalid(
> +client, errp, "incorrect number of requests for overall length");

Nice that you are fixing the typo of the double space in the error message.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH qemu] vfio/spapr: Allow fallback to SPAPR TCE IOMMU v1

2017-11-22 Thread David Gibson

On Wed, Nov 22, 2017 at 04:15:52PM +1100, Alexey Kardashevskiy wrote:
> The vfio_iommu_spapr_tce driver always advertises v1 and v2 IOMMU support,
> however PR KVM (a special version of KVM designed to work in
> a paravirtualized system; these days used for nested virtualizaion) only
> supports the "pseries" platform which does not support v2. Since there is
> no way to choose the IOMMU version in QEMU, it fails to start.
> 
> This adds a fallback to the v1 IOMMU if v2 cannot be used.
> 
> Signed-off-by: Alexey Kardashevskiy 

The fallback itself isn't a bad idea, but your commit message contains
several inaccurracies.  KVM PR is not particularly designed to work in
a paravirtualized system, and it doesn't only support the pseries
platform (as guest *or* host).  It's actually a lot more general than
KVM HV - just slow, not that well tested and missing a number of
features that no-one's bothered to port to it.

> ---
>  hw/vfio/common.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 7b2924c..cd81cc9 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1040,6 +1040,11 @@ static int vfio_connect_container(VFIOGroup *group, 
> AddressSpace *as,
>  v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU;
>  ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
>  if (ret) {
> +container->iommu_type = VFIO_SPAPR_TCE_IOMMU;
> +v2 = false;
> +ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
> +}
> +if (ret) {
>  error_setg_errno(errp, errno, "failed to set iommu for 
> container");
>  ret = -errno;
>  goto free_container_exit;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] tcg: Fix complilation with TCG

2017-11-22 Thread Emilio G. Cota

On Wed, Nov 22, 2017 at 09:32:04 -0300, Philippe Mathieu-Daudé wrote:
> On 11/22/2017 05:41 AM, Juan Quintela wrote:
> > This commit started use tb_unlock() and tlb_set_dirty() on non TCG
> > code.  Add the function as stubs.
> > 
> > commit 27266271977c5a30f2f7d493e042be1897827bdd
> > Author: Peter Maydell 
> > Date:   Mon Nov 20 18:08:27 2017 +
> > 
> > exec.c: Factor out before/after actions for notdirty memory writes
> > 
> > 
> 
> Maybe pasting the sha-1 is enough "Since 272662719 those stubs are missing".

Paolo's "git whatis" alias comes to mind:
  https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg00972.html

E.

[Qemu-devel] [PATCH v3 7/7] s390x/pci: search for subregion inside the BARs

2017-11-22 Thread Pierre Morel

When dispatching memory access to PCI BAR region, we must
look for possible subregions, used by the PCI device to map
different memory areas inside the same PCI BAR.

Since the data offset we received is calculated starting at the
region start address we need to adjust the offset for the subregion.

The data offset inside the subregion is calculated by substracting
the subregion's starting address from the data offset in the region.

The access to the MSIX region is now handled in a generic way,
we do not need the specific trap_msix() function anymore.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-inst.c | 44 +---
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 8d35f8f..fae3ef7 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -345,12 +345,31 @@ static int zpci_endian_swap(uint64_t *ptr, uint8_t len)
 return 0;
 }
 
+static MemoryRegion *s390_get_subregion(MemoryRegion *mr, uint64_t offset,
+uint8_t len)
+{
+MemoryRegion *other;
+uint64_t subregion_size;
+
+QTAILQ_FOREACH(other, &mr->subregions, subregions_link) {
+subregion_size = int128_get64(other->size);
+if ((offset >= other->addr) &&
+(offset + len) <= (other->addr + subregion_size)) {
+mr = other;
+break;
+}
+}
+return mr;
+}
+
 static MemTxResult zpci_read_bar(S390PCIBusDevice *pbdev, uint8_t pcias,
  uint64_t offset, uint64_t *data, uint8_t len)
 {
 MemoryRegion *mr;
 
 mr = pbdev->pdev->io_regions[pcias].memory;
+mr = s390_get_subregion(mr, offset, len);
+offset -= mr->addr;
 return memory_region_dispatch_read(mr, offset, data, len,
MEMTXATTRS_UNSPECIFIED);
 }
@@ -442,30 +461,14 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 return 0;
 }
 
-static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias)
-{
-if (pbdev->msix.available && pbdev->msix.table_bar == pcias &&
-offset >= pbdev->msix.table_offset &&
-offset < (pbdev->msix.table_offset +
-  pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) {
-return 1;
-} else {
-return 0;
-}
-}
-
 static MemTxResult zpci_write_bar(S390PCIBusDevice *pbdev, uint8_t pcias,
   uint64_t offset, uint64_t data, uint8_t len)
 {
 MemoryRegion *mr;
 
-if (trap_msix(pbdev, offset, pcias)) {
-offset = offset - pbdev->msix.table_offset;
-mr = &pbdev->pdev->msix_table_mmio;
-} else {
-mr = pbdev->pdev->io_regions[pcias].memory;
-}
-
+mr = pbdev->pdev->io_regions[pcias].memory;
+mr = s390_get_subregion(mr, offset, len);
+offset -= mr->addr;
 return memory_region_dispatch_write(mr, offset, data, len,
 MEMTXATTRS_UNSPECIFIED);
 }
@@ -726,6 +729,9 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r3, uint64_t gaddr,
 }
 
 mr = pbdev->pdev->io_regions[pcias].memory;
+mr = s390_get_subregion(mr, offset, len);
+offset -= mr->addr;
+
 if (!memory_region_access_valid(mr, offset, len, true)) {
 program_interrupt(env, PGM_OPERAND, 6);
 return 0;
-- 
2.7.4

[Qemu-devel] [PATCH v3 6/7] s390x/pci: move the memory region write from pcistg

2017-11-22 Thread Pierre Morel

Let's move the memory region write from pcistg into a dedicated
function.
This allows us to prepare a later patch searching for subregions
inside of the memory region.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-inst.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 69ff7b8..8d35f8f 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -454,12 +454,27 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t 
offset, uint8_t pcias)
 }
 }
 
+static MemTxResult zpci_write_bar(S390PCIBusDevice *pbdev, uint8_t pcias,
+  uint64_t offset, uint64_t data, uint8_t len)
+{
+MemoryRegion *mr;
+
+if (trap_msix(pbdev, offset, pcias)) {
+offset = offset - pbdev->msix.table_offset;
+mr = &pbdev->pdev->msix_table_mmio;
+} else {
+mr = pbdev->pdev->io_regions[pcias].memory;
+}
+
+return memory_region_dispatch_write(mr, offset, data, len,
+MEMTXATTRS_UNSPECIFIED);
+}
+
 int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
 {
 CPUS390XState *env = &cpu->env;
 uint64_t offset, data;
 S390PCIBusDevice *pbdev;
-MemoryRegion *mr;
 MemTxResult result;
 uint8_t len;
 uint32_t fh;
@@ -516,15 +531,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 return 0;
 }
 
-if (trap_msix(pbdev, offset, pcias)) {
-offset = offset - pbdev->msix.table_offset;
-mr = &pbdev->pdev->msix_table_mmio;
-} else {
-mr = pbdev->pdev->io_regions[pcias].memory;
-}
-
-result = memory_region_dispatch_write(mr, offset, data, len,
- MEMTXATTRS_UNSPECIFIED);
+result = zpci_write_bar(pbdev, pcias, offset, data, len);
 if (result != MEMTX_OK) {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
-- 
2.7.4

[Qemu-devel] [PATCH v3 2/7] s390x/pci: rework PCI STORE

2017-11-22 Thread Pierre Morel

Enhance the fault detection, correction of the fault reporting.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-inst.c | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 3e1f1a0..930c197 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -470,6 +470,12 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 pcias = (env->regs[r2] >> 16) & 0xf;
 len = env->regs[r2] & 0xf;
 offset = env->regs[r2 + 1];
+data = env->regs[r1];
+
+if (!(fh & FH_MASK_ENABLE)) {
+setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
+return 0;
+}
 
 pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh);
 if (!pbdev) {
@@ -479,12 +485,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 }
 
 switch (pbdev->state) {
-case ZPCI_FS_RESERVED:
-case ZPCI_FS_STANDBY:
-case ZPCI_FS_DISABLED:
 case ZPCI_FS_PERMANENT_ERROR:
-setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
-return 0;
 case ZPCI_FS_ERROR:
 setcc(cpu, ZPCI_PCI_LS_ERR);
 s390_set_status_code(env, r2, ZPCI_PCI_ST_BLOCKED);
@@ -493,9 +494,13 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 break;
 }
 
-data = env->regs[r1];
-if (pcias < 6) {
-if ((8 - (offset & 0x7)) < len) {
+switch (pcias) {
+/* A ZPCI PCI card may use any BAR from BAR 0 to BAR 5 */
+case 0 ... 5:
+/* Check length:
+ * A length of 0 is invalid and length should not cross a double word
+ */
+if (!len || (len > (8 - (offset & 0x7 {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
@@ -513,21 +518,21 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
-} else if (pcias == 15) {
-if ((4 - (offset & 0x3)) < len) {
-program_interrupt(env, PGM_OPERAND, 4);
-return 0;
-}
-
-if (zpci_endian_swap(&data, len)) {
+break;
+case 15:
+/* ZPCI uses the pseudo BAR number 15 as configuration space */
+/* possible access lengths are 1,2,4 and must not cross a word */
+if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
-
+/* len = 1,2,4 so we do not need to test */
+zpci_endian_swap(&data, len);
 pci_host_config_write_common(pbdev->pdev, offset,
  pci_config_size(pbdev->pdev),
  data, len);
-} else {
+break;
+default:
 DPRINTF("pcistg invalid space\n");
 setcc(cpu, ZPCI_PCI_LS_ERR);
 s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS);
-- 
2.7.4

[Qemu-devel] [PATCH v3 1/7] s390x/pci: factor out endianess conversion

2017-11-22 Thread Pierre Morel

There are two places where the same endianness conversion
is done.
Let's factor this out into a static function.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-inst.c | 59 +++-
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 8e088f3..3e1f1a0 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -314,6 +314,36 @@ out:
 return 0;
 }
 
+/**
+ * Swap data contained in s390x big endian registers to little endian
+ * PCI bars.
+ *
+ * @ptr: a pointer to a uint64_t data field
+ * @len: the length of the valid data, must be 1,2,4 or 8
+ */
+static int zpci_endian_swap(uint64_t *ptr, uint8_t len)
+{
+uint64_t data = *ptr;
+
+switch (len) {
+case 1:
+break;
+case 2:
+data = bswap16(data);
+break;
+case 4:
+data = bswap32(data);
+break;
+case 8:
+data = bswap64(data);
+break;
+default:
+return -EINVAL;
+}
+*ptr = data;
+return 0;
+}
+
 int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
 {
 CPUS390XState *env = &cpu->env;
@@ -385,19 +415,7 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 data =  pci_host_config_read_common(
pbdev->pdev, offset, pci_config_size(pbdev->pdev), len);
 
-switch (len) {
-case 1:
-break;
-case 2:
-data = bswap16(data);
-break;
-case 4:
-data = bswap32(data);
-break;
-case 8:
-data = bswap64(data);
-break;
-default:
+if (zpci_endian_swap(&data, len)) {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
@@ -500,19 +518,8 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
-switch (len) {
-case 1:
-break;
-case 2:
-data = bswap16(data);
-break;
-case 4:
-data = bswap32(data);
-break;
-case 8:
-data = bswap64(data);
-break;
-default:
+
+if (zpci_endian_swap(&data, len)) {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
-- 
2.7.4

[Qemu-devel] [PATCH v3 5/7] s390x/pci: move the memory region read from pcilg

2017-11-22 Thread Pierre Morel

Let's move the memory region read from pcilg into a dedicated function.
This allows us to prepare a later patch.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-inst.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index fe6e042..69ff7b8 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -345,13 +345,22 @@ static int zpci_endian_swap(uint64_t *ptr, uint8_t len)
 return 0;
 }
 
+static MemTxResult zpci_read_bar(S390PCIBusDevice *pbdev, uint8_t pcias,
+ uint64_t offset, uint64_t *data, uint8_t len)
+{
+MemoryRegion *mr;
+
+mr = pbdev->pdev->io_regions[pcias].memory;
+return memory_region_dispatch_read(mr, offset, data, len,
+   MEMTXATTRS_UNSPECIFIED);
+}
+
 int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
 {
 CPUS390XState *env = &cpu->env;
 S390PCIBusDevice *pbdev;
 uint64_t offset;
 uint64_t data;
-MemoryRegion *mr;
 MemTxResult result;
 uint8_t len;
 uint32_t fh;
@@ -402,9 +411,7 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
-mr = pbdev->pdev->io_regions[pcias].memory;
-result = memory_region_dispatch_read(mr, offset, &data, len,
- MEMTXATTRS_UNSPECIFIED);
+result = zpci_read_bar(pbdev, pcias, offset, &data, len);
 if (result != MEMTX_OK) {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
-- 
2.7.4

[Qemu-devel] [PATCH v3 0/7] s390x/pci: Improve zPCI to cover more cases

2017-11-22 Thread Pierre Morel

This patch fixes the following BUG:
Even a guest is able to detect virtio_pci device, the init function
the Linux virtio_pci driver will hang because zPCI does not support
the subregions used by virtio_pci.

It follows that right now the PCI support is very limited
(e.g. pass through of a host vfio device)
To enable features like virtio-pci several modifications needs to be
done.

As already stated above, Virtio-PCI uses subregions, which may eventually
be discontinuous inside bars instead of a single flat region often used
by real devices.
The address offset being formerly calculated from the BAR base address
must be adapted to the subregions instead of to the single region.

This patch provides the new calculation for the three kind of BAR
access, zPCI STORE, zPCI LOAD and zPCI STORE BLOCK done by zPCI.

We use the opportunity to
 - enhance the fault detection for zPCI STORE and LOAD,
 - enhance the fault detection and to provide the maximum STORE BLOCK
   block size, maxstbl, for zPCI STORE BLOCK
 - factor out part of the code used to calculate the offset and
   access the BARs,
 - factor out the code for endianess conversion.


Pierre Morel (7):
  s390x/pci: factor out endianess conversion
  s390x/pci: rework PCI STORE
  s390x/pci: rework PCI LOAD
  s390x/pci: rework PCI STORE BLOCK
  s390x/pci: move the memory region read from pcilg
  s390x/pci: move the memory region write from pcistg
  s390x/pci: search for subregion inside the BARs

 hw/s390x/s390-pci-bus.h  |   1 +
 hw/s390x/s390-pci-inst.c | 248 ---
 hw/s390x/s390-pci-inst.h |   2 +-
 3 files changed, 151 insertions(+), 100 deletions(-)

-- 
2.7.4

Changelog:

from v2
- rewording of comments, coding style.

from v1
- suppress fallthrough to PCI_ROM_SLOT to handle it in the default case.
- reword most of the patch commit messages
- add comments to the endianness conversion
- reword somme comments inside the patches

[Qemu-devel] [PATCH v3 4/7] s390x/pci: rework PCI STORE BLOCK

2017-11-22 Thread Pierre Morel

Enhance the fault detection.

Fixup the precedence to check the destination path existance
before checking for the source accessibility.

Add the maxstbl entry to both the Query PCI Function Group
response and the PCIBusDevice structure.

Initialize the maxstbl to 128 per default until we get
the actual data from the hardware.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-bus.h  |  1 +
 hw/s390x/s390-pci-inst.c | 63 ++--
 hw/s390x/s390-pci-inst.h |  2 +-
 3 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h
index 560bd82..2993f0d 100644
--- a/hw/s390x/s390-pci-bus.h
+++ b/hw/s390x/s390-pci-bus.h
@@ -284,6 +284,7 @@ struct S390PCIBusDevice {
 uint64_t fmb_addr;
 uint8_t isc;
 uint16_t noi;
+uint16_t maxstbl;
 uint8_t sum;
 S390MsixInfo msix;
 AdapterRoutes routes;
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 2295412..fe6e042 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -294,6 +294,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2)
 stq_p(&resgrp->msia, ZPCI_MSI_ADDR);
 stw_p(&resgrp->mui, 0);
 stw_p(&resgrp->i, 128);
+stw_p(&resgrp->maxstbl, 128);
 resgrp->version = 0;
 
 stw_p(&resgrp->hdr.rsp, CLP_RC_OK);
@@ -645,6 +646,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r3, uint64_t gaddr,
 S390PCIBusDevice *pbdev;
 MemoryRegion *mr;
 MemTxResult result;
+uint64_t offset;
 int i;
 uint32_t fh;
 uint8_t pcias;
@@ -659,22 +661,10 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r3, uint64_t gaddr,
 fh = env->regs[r1] >> 32;
 pcias = (env->regs[r1] >> 16) & 0xf;
 len = env->regs[r1] & 0xff;
+offset = env->regs[r3];
 
-if (pcias > 5) {
-DPRINTF("pcistb invalid space\n");
-setcc(cpu, ZPCI_PCI_LS_ERR);
-s390_set_status_code(env, r1, ZPCI_PCI_ST_INVAL_AS);
-return 0;
-}
-
-switch (len) {
-case 16:
-case 32:
-case 64:
-case 128:
-break;
-default:
-program_interrupt(env, PGM_SPECIFICATION, 6);
+if (!(fh & FH_MASK_ENABLE)) {
+setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
 return 0;
 }
 
@@ -686,12 +676,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r3, uint64_t gaddr,
 }
 
 switch (pbdev->state) {
-case ZPCI_FS_RESERVED:
-case ZPCI_FS_STANDBY:
-case ZPCI_FS_DISABLED:
 case ZPCI_FS_PERMANENT_ERROR:
-setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
-return 0;
 case ZPCI_FS_ERROR:
 setcc(cpu, ZPCI_PCI_LS_ERR);
 s390_set_status_code(env, r1, ZPCI_PCI_ST_BLOCKED);
@@ -700,8 +685,34 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r3, uint64_t gaddr,
 break;
 }
 
+if (pcias > 5) {
+DPRINTF("pcistb invalid space\n");
+setcc(cpu, ZPCI_PCI_LS_ERR);
+s390_set_status_code(env, r1, ZPCI_PCI_ST_INVAL_AS);
+return 0;
+}
+
+/* Verify the address, offset and length */
+/* offset must be a multiple of 8 */
+if (offset % 8) {
+goto addressing_error;
+}
+/* Length must be greater than 8, a multiple of 8 */
+/* and not greater than maxstbl */
+if ((len <= 8) || (len % 8) || (len > pbdev->maxstbl)) {
+goto addressing_error;
+}
+/* Do not cross a 4K-byte boundary */
+if (((offset & 0xfff) + len) > 0x1000) {
+goto addressing_error;
+}
+/* Guest address must be double word aligned */
+if (gaddr & 0x07UL) {
+goto addressing_error;
+}
+
 mr = pbdev->pdev->io_regions[pcias].memory;
-if (!memory_region_access_valid(mr, env->regs[r3], len, true)) {
+if (!memory_region_access_valid(mr, offset, len, true)) {
 program_interrupt(env, PGM_OPERAND, 6);
 return 0;
 }
@@ -711,9 +722,9 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r3, uint64_t gaddr,
 }
 
 for (i = 0; i < len / 8; i++) {
-result = memory_region_dispatch_write(mr, env->regs[r3] + i * 8,
- ldq_p(buffer + i * 8), 8,
- MEMTXATTRS_UNSPECIFIED);
+result = memory_region_dispatch_write(mr, offset + i * 8,
+  ldq_p(buffer + i * 8), 8,
+  MEMTXATTRS_UNSPECIFIED);
 if (result != MEMTX_OK) {
 program_interrupt(env, PGM_OPERAND, 6);
 return 0;
@@ -722,6 +733,10 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r3, uint64_t gaddr,
 
 setcc(cpu, ZPCI_PCI_LS_OK);
 return 0;
+
+addressing_error:
+program_interrupt(env, PGM_SPECIFICATION, 6);
+return 0;
 }
 
 static int reg_irqs(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib)
diff --git a/hw/s390x/s390-pci-inst.h b/

[Qemu-devel] [PATCH v3 3/7] s390x/pci: rework PCI LOAD

2017-11-22 Thread Pierre Morel

Enhance the fault detection, correction of the fault reporting.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
 hw/s390x/s390-pci-inst.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 930c197..2295412 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -373,6 +373,11 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 len = env->regs[r2] & 0xf;
 offset = env->regs[r2 + 1];
 
+if (!(fh & FH_MASK_ENABLE)) {
+setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
+return 0;
+}
+
 pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh);
 if (!pbdev) {
 DPRINTF("pcilg no pci dev\n");
@@ -381,12 +386,7 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
 }
 
 switch (pbdev->state) {
-case ZPCI_FS_RESERVED:
-case ZPCI_FS_STANDBY:
-case ZPCI_FS_DISABLED:
 case ZPCI_FS_PERMANENT_ERROR:
-setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
-return 0;
 case ZPCI_FS_ERROR:
 setcc(cpu, ZPCI_PCI_LS_ERR);
 s390_set_status_code(env, r2, ZPCI_PCI_ST_BLOCKED);
@@ -395,8 +395,9 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
 break;
 }
 
-if (pcias < 6) {
-if ((8 - (offset & 0x7)) < len) {
+switch (pcias) {
+case 0 ... 5:
+if (!len || (len > (8 - (offset & 0x7 {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
@@ -407,8 +408,9 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
-} else if (pcias == 15) {
-if ((4 - (offset & 0x3)) < len) {
+break;
+case 15:
+if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
@@ -419,8 +421,9 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2)
 program_interrupt(env, PGM_OPERAND, 4);
 return 0;
 }
-} else {
-DPRINTF("invalid space\n");
+break;
+default:
+DPRINTF("pcilg invalid space\n");
 setcc(cpu, ZPCI_PCI_LS_ERR);
 s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS);
 return 0;
-- 
2.7.4

Re: [Qemu-devel] [PATCH 5/5] nbd/server: structurize option reply sending

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/server.c | 40 +---
>  1 file changed, 13 insertions(+), 27 deletions(-)
> 

Nice diffstat; shows that this one probably makes sense (even if it will
need rebasing on top of the planned churn for the rest of the series).

> diff --git a/nbd/server.c b/nbd/server.c
> index 79b937d88f..975fe8efe9 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -152,43 +152,29 @@ static inline int nbd_opt_drop(NBDClient *client, 
> size_t size, Error **errp)
>  return nbd_drop(client->ioc, size, errp);
>  }
>  
> +static inline void set_be_option_rep(NBDOptionReply *rep, uint32_t option,
> + uint32_t type, uint32_t length)

'static inline' is probably pointless; the compiler already knows what
static functions are ideal to inline, without us having to name it (I
realize you've already done this in other patches, so I probably ought
to post a cleanup patch that removes 'inline').

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 3/5] nbd/server: add helper nbd_opt_invalid

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/server.c | 74 
> +---
>  1 file changed, 46 insertions(+), 28 deletions(-)
> 

> +/* nbd_opt_invalid
> + * Drop reminded option data and reply with NBD_REP_ERR_INVALID

s/reminded/the remainder of/

In fact, if we generalize this just a bit more, and let the caller
choose whether to use NBD_REP_ERR_INVALID or NBD_REP_ERR_TLS_REQD or
NBD_REP_ERR_UNSUP, then we can merge this functionality directly into
nbd_opt_drop() instead of adding another function.  I guess I'll have to
merge that into the counter-proposal I have in mind (but at this point,
the counter-proposal won't be posted any sooner than 2.11-rc3, in part
because I'm almost on my Thanksgiving vacation).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 4/5] nbd: rename nbd_option and nbd_opt_reply

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Rename nbd_optino and nbd_opt_reply to NBDOption and NBDOptionReply

s/optino/option/

> to correspond to Qemu coding style and other structures here.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/nbd.h |  8 
>  nbd/client.c| 12 ++--
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 2/7] s390x/pci: rework PCI STORE

2017-11-22 Thread Pierre Morel


On 22/11/2017 14:25, Cornelia Huck wrote:

On Tue, 21 Nov 2017 19:03:17 +0100
Pierre Morel  wrote:


On 21/11/2017 15:25, Cornelia Huck wrote:

On Tue, 21 Nov 2017 11:38:45 +0100
Cornelia Huck  wrote:
   

On Thu, 16 Nov 2017 18:51:50 +0100
Pierre Morel  wrote:
   

@@ -493,9 +494,13 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
   break;
   }
   
-data = env->regs[r1];

-if (pcias < 6) {
-if ((8 - (offset & 0x7)) < len) {
+/* Test that the pcias is valid and do the appropriates operations */
+switch (pcias) {
+case 0 ... 5:


Make this
case 0...5:
?


...only that it confuses the compiler when using numbers. We can either


I did not see this as I replied to the previous email, sorry.


keep the slightly ugly version, or introduce #defines.
ZPCI_PCIAS_IOREGION_{MIN,MAX} (and maybe ZPCI_PCIAS_CONFIG for 15)?



I agree something is missing.
But I am not sure that a #define brings clarity.
I would prefer to add a comment.
/* A ZPCI PCI card may use any BAR from BAR 0 to BAR 5 */

?


It's more a speaking value vs. magic numbers thing. A comment does not
hurt either, though :)

(And we get rid of the spacing as well.)


OK, thanks.







+ * A length of 0 is invalid and length should not cross a double word
+ */
+if (!len || (len > (8 - (offset & 0x7 {
   program_interrupt(env, PGM_OPERAND, 4);
   return 0;
   }
@@ -513,21 +518,21 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t 
r2)
   program_interrupt(env, PGM_OPERAND, 4);
   return 0;
   }
-} else if (pcias == 15) {
-if ((4 - (offset & 0x3)) < len) {
-program_interrupt(env, PGM_OPERAND, 4);
-return 0;
-}
-
-if (zpci_endian_swap(&data, len)) {
+break;
+case 15:
+/* ZPCI uses the pseudo BAR number 15 as configuration space */
+/* possible access lengths are 1,2,4 and must not cross a word */
+if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
   program_interrupt(env, PGM_OPERAND, 4);
   return 0;
   }
-
+/* len = 1,2,4 so we do not need to test */
+zpci_endian_swap(&data, len);
   pci_host_config_write_common(pbdev->pdev, offset,
pci_config_size(pbdev->pdev),
data, len);
-} else {
+break;
+default:
   DPRINTF("pcistg invalid space\n");
   setcc(cpu, ZPCI_PCI_LS_ERR);
   s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS);


Other than the nits, looks good.
   








--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

Re: [Qemu-devel] [PATCH v3 29/30] i.MX: Add i.MX7 SOC implementation.

2017-11-22 Thread Andrey Smirnov

On Wed, Nov 22, 2017 at 7:34 AM, Igor Mammedov  wrote:
> On Mon,  6 Nov 2017 07:48:12 -0800
> Andrey Smirnov  wrote:
>
>> The following interfaces are partially or fully emulated:
>>
>> * up to 2 Cortex A9 cores (SMP works with PSCI)
>> * A7 MPCORE (identical to A15 MPCORE)
>> * 4 GPTs modules
>> * 7 GPIO controllers
>> * 2 IOMUXC controllers
>> * 1 CCM module
>> * 1 SVNS module
>> * 1 SRC module
>> * 1 GPCv2 controller
>> * 4 eCSPI controllers
>> * 4 I2C controllers
>> * 7 i.MX UART controllers
>> * 2 FlexCAN controllers
>> * 2 Ethernet controllers (FEC)
>> * 3 SD controllers (USDHC)
>> * 4 WDT modules
>> * 1 SDMA module
>> * 1 GPR module
>> * 2 USBMISC modules
>> * 2 ADC modules
>> * 1 PCIe controller
>>
>> Tested to boot and work with upstream Linux (4.13+) guest.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
> ...
>> +
>> +static void fsl_imx7_init(Object *obj)
>> +{
>> +BusState *sysbus = sysbus_get_default();
>> +FslIMX7State *s = FSL_IMX7(obj);
>> +char name[NAME_SIZE];
>> +int i;
>> +
>> +if (smp_cpus > FSL_IMX7_NUM_CPUS) {
>> +error_report("%s: Only %d CPUs are supported (%d requested)",
>> + TYPE_FSL_IMX7, FSL_IMX7_NUM_CPUS, smp_cpus);
>> +exit(1);
>> +}
>> +
>> +for (i = 0; i < smp_cpus; i++) {
>> +object_initialize(&s->cpu[i], sizeof(s->cpu[i]),
>> +  "cortex-a7-" TYPE_ARM_CPU);
> pls reuse ARM_CPU_TYPE_NAME() macro here

Will do in v4.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v3 19/30] i.MX: Add code to emulate SDMA IP block

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 10:20 AM, Peter Maydell
 wrote:
> On 6 November 2017 at 15:48, Andrey Smirnov  wrote:
>> Add minimal code needed to allow upstream Linux guest to boot.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
>>  hw/dma/Makefile.objs  |  1 +
>>  hw/dma/imx_sdma.c | 99 
>> +++
>>  include/hw/dma/imx_sdma.h | 22 +++
>>  3 files changed, 122 insertions(+)
>>  create mode 100644 hw/dma/imx_sdma.c
>>  create mode 100644 include/hw/dma/imx_sdma.h
>>
>
> Does Linux really insist on reads-as-written behaviour?
> (ie can you get away with just using
> create_unimplemented_device() ?)
>

Not sure. I'll give it a try for v4.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v3 14/30] i.MX: Add code to emulate i.MX2 watchdog IP block

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 10:10 AM, Peter Maydell
 wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> Add enough code to emulate i.MX2 watchdog IP block so it would be
>> possible to reboot the machine running Linux Guest.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
>>  hw/misc/Makefile.objs  |  1 +
>>  hw/misc/imx2_wdt.c | 88 
>> ++
>>  include/hw/misc/imx2_wdt.h | 34 ++
>>  3 files changed, 123 insertions(+)
>>  create mode 100644 hw/misc/imx2_wdt.c
>>  create mode 100644 include/hw/misc/imx2_wdt.h
>>
>> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
>> index ac1be05a03..c393a93456 100644
>> --- a/hw/misc/Makefile.objs
>> +++ b/hw/misc/Makefile.objs
>> @@ -35,6 +35,7 @@ obj-$(CONFIG_IMX) += imx25_ccm.o
>>  obj-$(CONFIG_IMX) += imx6_ccm.o
>>  obj-$(CONFIG_IMX) += imx6_src.o
>>  obj-$(CONFIG_IMX) += imx7_ccm.o
>> +obj-$(CONFIG_IMX) += imx2_wdt.o
>>  obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o
>>  obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o
>>  obj-$(CONFIG_MAINSTONE) += mst_fpga.o
>> diff --git a/hw/misc/imx2_wdt.c b/hw/misc/imx2_wdt.c
>> new file mode 100644
>> index 00..3a1c33aa51
>> --- /dev/null
>> +++ b/hw/misc/imx2_wdt.c
>> @@ -0,0 +1,88 @@
>> +/*
>> + * Copyright (c) 2017, Impinj, Inc.
>> + *
>> + * i.MX2 Watchdog IP block
>> + *
>> + * Author: Andrey Smirnov 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "sysemu/watchdog.h"
>> +
>> +#include "hw/misc/imx2_wdt.h"
>> +
>> +#define IMX2_WDT_WCR_WDABIT(5)  /* -> External Reset WDOG_B */
>> +#define IMX2_WDT_WCR_SRSBIT(4)  /* -> Software Reset Signal */
>> +
>> +static uint64_t imx2_wdt_read(void *opaque, hwaddr addr,
>> +  unsigned int size)
>> +{
>> +return 0;
>> +}
>> +
>> +static void imx2_wdt_write(void *opaque, hwaddr addr,
>> +   uint64_t value, unsigned int size)
>> +{
>> +if (addr == IMX2_WDT_WCR &&
>> +(value & (IMX2_WDT_WCR_WDA | IMX2_WDT_WCR_SRS))) {
>> +watchdog_perform_action();
>> +}
>> +}
>> +
>> +static const MemoryRegionOps imx2_wdt_ops = {
>> +.read  = imx2_wdt_read,
>> +.write = imx2_wdt_write,
>> +.endianness = DEVICE_NATIVE_ENDIAN,
>> +.impl = {
>> +/*
>> + * Our device would not work correctly if the guest was doing
>> + * unaligned access. This might not be a limitation on the
>> + * real device but in practice there is no reason for a guest
>> + * to access this device unaligned.
>> + */
>> +.min_access_size = 4,
>> +.max_access_size = 4,
>> +.unaligned = false,
>> +},
>> +};
>> +
>> +static void imx2_wdt_realize(DeviceState *dev, Error **errp)
>> +{
>> +IMX2WdtState *s = IMX2_WDT(dev);
>> +
>> +memory_region_init_io(&s->mmio, OBJECT(dev),
>> +  &imx2_wdt_ops, s,
>> +  TYPE_IMX2_WDT".mmio",
>> +  IMX2_WDT_REG_NUM * sizeof(uint16_t));
>> +sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->mmio);
>> +}
>> +
>> +static void imx2_wdt_class_init(ObjectClass *klass, void *data)
>> +{
>> +DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +dc->realize = imx2_wdt_realize;
>> +set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>> +}
>> +
>> +static const TypeInfo imx2_wdt_info = {
>> +.name  = TYPE_IMX2_WDT,
>> +.parent= TYPE_SYS_BUS_DEVICE,
>> +.instance_size = sizeof(IMX2WdtState),
>> +.class_init= imx2_wdt_class_init,
>> +};
>> +
>> +static WatchdogTimerModel model = {
>> +.wdt_name = "imx2-watchdog",
>> +.wdt_description = "i.MX2 Watchdog",
>> +};
>> +
>> +static void imx2_wdt_register_type(void)
>> +{
>> +watchdog_add_model(&model);
>> +type_register_static(&imx2_wdt_info);
>> +}
>> +type_init(imx2_wdt_register_type)
>> diff --git a/include/hw/misc/imx2_wdt.h b/include/hw/misc/imx2_wdt.h
>> new file mode 100644
>> index 00..e67ac6939d
>> --- /dev/null
>> +++ b/include/hw/misc/imx2_wdt.h
>> @@ -0,0 +1,34 @@
>> +/*
>> + * Copyright (c) 2017, Impinj, Inc.
>> + *
>> + * i.MX2 Watchdog IP block
>> + *
>> + * Author: Andrey Smirnov 
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef IMX2_WDT_H
>> +#define IMX2_WDT_H
>> +
>> +#include "qemu/bitops.h"
>> +#include "hw/sysbus.h"
>
> The bitops.h include should be in the .c file, not here.
>

Will fix in v4.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v3 13/30] i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 10:08 AM, Peter Maydell
 wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> Add minimal code needed to allow upstream Linux guest to boot.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
>>  hw/misc/Makefile.objs  |   1 +
>>  hw/misc/imx7_ccm.c | 233 
>> +
>>  include/hw/misc/imx7_ccm.h | 130 +
>>  3 files changed, 364 insertions(+)
>>  create mode 100644 hw/misc/imx7_ccm.c
>>  create mode 100644 include/hw/misc/imx7_ccm.h
>>
>> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
>> index 29fb922cef..ac1be05a03 100644
>> --- a/hw/misc/Makefile.objs
>> +++ b/hw/misc/Makefile.objs
>> @@ -34,6 +34,7 @@ obj-$(CONFIG_IMX) += imx31_ccm.o
>>  obj-$(CONFIG_IMX) += imx25_ccm.o
>>  obj-$(CONFIG_IMX) += imx6_ccm.o
>>  obj-$(CONFIG_IMX) += imx6_src.o
>> +obj-$(CONFIG_IMX) += imx7_ccm.o
>>  obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o
>>  obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o
>>  obj-$(CONFIG_MAINSTONE) += mst_fpga.o
>> diff --git a/hw/misc/imx7_ccm.c b/hw/misc/imx7_ccm.c
>> new file mode 100644
>> index 00..2876164cfa
>> --- /dev/null
>> +++ b/hw/misc/imx7_ccm.c
>> @@ -0,0 +1,233 @@
>> +/*
>> + * Copyright (c) 2017, Impinj, Inc.
>> + *
>> + * i.MX7 CCM, PMU and ANALOG IP blocks emulation code
>
> Should these really all be in one single device rather
> than one device per IP block ?
>

They all share the same register write semantics,
imx7_set_clr_tog_write, so I'd like to keep them in the same file. But
other than that, they can be split into all sorts of configurations.
As far as memory map in i.MX7 RM is concerned "CCM" and "Analog" are
distinct register files where the rest of them ("digiprog", "pmu") are
"sub-blocks" of those two (or at least that's my interpretation). I'll
change v4 to have that split.

If you want me to convert every block _and_ sub-block into a
standalone device, let me know.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR

2017-11-22 Thread Radim Krčmář

2017-11-22 18:28-0200, Eduardo Habkost:
> On Wed, Nov 22, 2017 at 07:09:08PM +0100, Radim Krčmář wrote:
> > QEMU saves only 8 bits of APIC LDR, which means that it does not support
> > x2APIC.  The correct way of fixing this would be to save and restore the
> > full 32 bit register, but because x2APIC LDR is a function of x2APIC ID,
> > we can also compute it and keep the migration format untouched.
> > 
> > KVM always expected the LDR format to follow the xAPIC/x2APIC standard,
> > but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed
> > xAPIC ID before switching to x2APIC, which means that QEMU has to use
> > the kvm_x2apic_api feature to derive the x2APIC ID.
> > 
> > This bug has also been addressed on the KVM side with patch 5849d75a5c9b
> > ("KVM: lapic: Fixup LDR on load in x2apic").
> 
> Is this sufficient to fix the bug on hosts that lack KVM commit
> 5849d75a5c9b, or we need both the KVM and QEMU patches?

Good point, either one is enough to fix the bug.
This patch makes x2APIC LDR work on old KVMs.

Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR

2017-11-22 Thread Radim Krčmář

2017-11-22 20:26+0100, Paolo Bonzini:
> On 22/11/2017 19:09, Radim Krčmář wrote:
> > QEMU saves only 8 bits of APIC LDR, which means that it does not support
> > x2APIC.  The correct way of fixing this would be to save and restore the
> > full 32 bit register, but because x2APIC LDR is a function of x2APIC ID,
> > we can also compute it and keep the migration format untouched.
> > 
> > KVM always expected the LDR format to follow the xAPIC/x2APIC standard,
> > but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed
> > xAPIC ID before switching to x2APIC, which means that QEMU has to use
> > the kvm_x2apic_api feature to derive the x2APIC ID.
> > 
> > This bug has also been addressed on the KVM side with patch 5849d75a5c9b
> > ("KVM: lapic: Fixup LDR on load in x2apic").
> 
> > +if (s->apicbase & MSR_IA32_APICBASE_EXTD) {
> > +kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s));
> 
> Is this correct if the kernel doesn't support the new-style x2APIC API?

Should be: QEMU will use the APIC_ID register in that case, which
contains the x2APIC ID that KVM used to compute the LDR from.

(old-style APIC_ID just cannot store more than 8 bits and isn't tied to
 vcpu_id.)

> In the end, it seems simpler to just fix it in the kernel.

We already have the workaround in KVM, so dropping this one doesn't make
that much of a difference.  I perceive it as solely QEMU bug, though. :)

Re: [Qemu-devel] [PATCH v3 12/30] sdhci: Implement write method of ACMD12ERRSTS register

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 10:04 AM, Peter Maydell
 wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
>>  hw/sd/sdhci.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
>> index f561cc44e3..53e5e011a7 100644
>> --- a/hw/sd/sdhci.c
>> +++ b/hw/sd/sdhci.c
>> @@ -1139,6 +1139,9 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, 
>> unsigned size)
>>  s->admasysaddr = (s->admasysaddr & (0xULL |
>>  ((uint64_t)mask << 32))) | ((uint64_t)value << 32);
>>  break;
>> +case SDHC_ACMD12ERRSTS:
>> +MASKED_WRITE(s->acmd12errsts, mask, value);
>> +break;
>>  case SDHC_FEAER:
>>  s->acmd12errsts |= value;
>>  s->errintsts |= (value >> 16) & s->errintstsen;
>> --
>> 2.13.6
>
> Is this part of the stock SDHCI spec that we just forgot to implement?

Yes it is. I don't know if missing that code critical for anything,
but since the rest of the plumbing is there I thought that we may as
well implement it.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v3 11/30] sdhci: Add i.MX specific subtype of SDHCI

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 10:02 AM, Peter Maydell
 wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> IP block found on several generations of i.MX family does not use
>> vanilla SDHCI implementation and it comes with a number of quirks.
>>
>> Introduce i.MX SDHCI subtype of SDHCI block to add code necessary to
>> support unmodified Linux guest driver.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>
> Hi. Mostly this looks ok; some comments below.
>
>> ---
>>  hw/sd/sdhci-internal.h |  15 ++
>>  hw/sd/sdhci.c  | 127 
>> -
>>  include/hw/sd/sdhci.h  |   8 
>>  3 files changed, 148 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h
>> index 161177cf39..2a1b4b06ee 100644
>> --- a/hw/sd/sdhci-internal.h
>> +++ b/hw/sd/sdhci-internal.h
>> @@ -91,6 +91,8 @@
>>  #define SDHC_CTRL_ADMA2_32 0x10
>>  #define SDHC_CTRL_ADMA2_64 0x18
>>  #define SDHC_DMA_TYPE(x)   ((x) & SDHC_CTRL_DMA_CHECK_MASK)
>> +#define SDHC_CTRL_4BITBUS  0x02
>> +#define SDHC_CTRL_8BITBUS  0x20
>>
>>  /* R/W Power Control Register 0x0 */
>>  #define SDHC_PWRCON0x29
>> @@ -229,4 +231,17 @@ enum {
>>
>>  extern const VMStateDescription sdhci_vmstate;
>>
>> +
>> +#define ESDHC_MIX_CTRL  0x48
>> +#define ESDHC_VENDOR_SPEC   0xc0
>> +#define ESDHC_DLL_CTRL  0x60
>> +
>> +#define ESDHC_TUNING_CTRL   0xcc
>> +#define ESDHC_TUNE_CTRL_STATUS  0x68
>> +#define ESDHC_WTMK_LVL  0x44
>> +
>> +#define ESDHC_CTRL_4BITBUS  (0x1 << 1)
>> +#define ESDHC_CTRL_8BITBUS  (0x2 << 1)
>> +
>> +
>>  #endif
>> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
>> index 6d6a791ee9..f561cc44e3 100644
>> --- a/hw/sd/sdhci.c
>> +++ b/hw/sd/sdhci.c
>> @@ -265,7 +265,8 @@ static void sdhci_send_command(SDHCIState *s)
>>  }
>>  }
>>
>> -if ((s->norintstsen & SDHC_NISEN_TRSCMP) &&
>> +if (!(s->quirks & SDHCI_QUIRK_NO_BUSY_IRQ) &&
>> +(s->norintstsen & SDHC_NISEN_TRSCMP) &&
>>  (s->cmdreg & SDHC_CMD_RESPONSE) == SDHC_CMD_RSP_WITH_BUSY) {
>>  s->norintsts |= SDHC_NIS_TRSCMP;
>>  }
>> @@ -1191,6 +1192,8 @@ static void sdhci_initfn(SDHCIState *s)
>>
>>  s->insert_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, 
>> sdhci_raise_insertion_irq, s);
>>  s->transfer_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, 
>> sdhci_data_transfer, s);
>> +
>> +s->io_ops = &sdhci_mmio_ops;
>>  }
>>
>>  static void sdhci_uninitfn(SDHCIState *s)
>> @@ -1347,7 +1350,7 @@ static void sdhci_sysbus_realize(DeviceState *dev, 
>> Error ** errp)
>>  s->buf_maxsz = sdhci_get_fifolen(s);
>>  s->fifo_buffer = g_malloc0(s->buf_maxsz);
>>  sysbus_init_irq(sbd, &s->irq);
>> -memory_region_init_io(&s->iomem, OBJECT(s), &sdhci_mmio_ops, s, "sdhci",
>> +memory_region_init_io(&s->iomem, OBJECT(s), s->io_ops, s, "sdhci",
>>  SDHC_REGISTERS_MAP_SIZE);
>>  sysbus_init_mmio(sbd, &s->iomem);
>>  }
>> @@ -1386,11 +1389,131 @@ static const TypeInfo sdhci_bus_info = {
>>  .class_init = sdhci_bus_class_init,
>>  };
>>
>> +static uint64_t usdhc_read(void *opaque, hwaddr offset, unsigned size)
>> +{
>> +SDHCIState *s = SYSBUS_SDHCI(opaque);
>> +uint32_t ret;
>> +uint16_t hostctl;
>> +
>> +switch (offset) {
>> +default:
>> +return sdhci_read(opaque, offset, size);
>> +
>> +case SDHC_HOSTCTL:
>> +hostctl = SDHC_DMA_TYPE(s->hostctl) << 5;
>> +
>> +if (s->hostctl & SDHC_CTRL_8BITBUS) {
>> +hostctl |= ESDHC_CTRL_8BITBUS;
>> +}
>> +
>> +if (s->hostctl & SDHC_CTRL_4BITBUS) {
>> +hostctl |= ESDHC_CTRL_4BITBUS;
>> +}
>> +
>> +ret = hostctl | (s->blkgap << 16) |
>> +(s->wakcon << 24);
>> +
>> +break;
>> +
>> +case ESDHC_DLL_CTRL:
>> +case ESDHC_TUNE_CTRL_STATUS:
>> +case 0x6c:
>> +case ESDHC_TUNING_CTRL:
>> +case ESDHC_VENDOR_SPEC:
>> +case ESDHC_MIX_CTRL:
>> +case ESDHC_WTMK_LVL:
>> +ret = 0;
>> +break;
>> +}
>> +
>> +return ret;
>> +}
>> +
>> +static void
>> +usdhc_write(void *opaque, hwaddr offset, uint64_t val, unsigned size)
>> +{
>> +SDHCIState *s = SYSBUS_SDHCI(opaque);
>> +uint8_t hostctl = 0;
>> +uint32_t value = (uint32_t)val;
>> +
>> +switch (offset) {
>> +case ESDHC_DLL_CTRL:
>> +case ESDHC_TUNE_CTRL_STATUS:
>> +case 0x6c:
>> +case ESDHC_TUNING_CTRL:
>> +case ESDHC_WTMK_LVL:
>> +case ESDHC_VENDOR_SPEC:
>> +break;
>> +
>> +case SDHC_HOSTCTL:
>> +if (value & ESDHC_CTRL_8BITBUS) {
>> +hostctl |= SDHC_CTRL_8BITBUS;
>> +}
>> +
>> +

Re: [Qemu-devel] [PATCH v2 4/7] s390x/pci: rework PCI STORE BLOCK

2017-11-22 Thread Pierre Morel


On 22/11/2017 14:28, Cornelia Huck wrote:

On Wed, 22 Nov 2017 13:15:21 +0800
Yi Min Zhao  wrote:


在 2017/11/22 上午2:07, Pierre Morel 写道:

On 21/11/2017 11:42, Cornelia Huck wrote:

On Thu, 16 Nov 2017 18:51:52 +0100
Pierre Morel  wrote:
  

Enhance the fault detection.

Fixup the precedence to check the destination path existance
before checking for the source accessibility.

Add the maxstbl entry to both the Query PCI Function Group
response and the PCIBusDevice structure.

Initialize the maxstbl to 128 per default until we get
the actual data from the hardware.

Signed-off-by: Pierre Morel 
Reviewed-by: Yi Min Zhao 
---
   hw/s390x/s390-pci-bus.h  |  1 +
   hw/s390x/s390-pci-inst.c | 62
+---
   hw/s390x/s390-pci-inst.h |  2 +-
   3 files changed, 40 insertions(+), 25 deletions(-)
  

@@ -700,8 +685,33 @@ int pcistb_service_call(S390CPU *cpu, uint8_t
r1, uint8_t r3, uint64_t gaddr,
   break;
   }
   +    if (pcias > 5) {
+    DPRINTF("pcistb invalid space\n");
+    setcc(cpu, ZPCI_PCI_LS_ERR);
+    s390_set_status_code(env, r1, ZPCI_PCI_ST_INVAL_AS);
+    return 0;
+    }
+
+    /* Verify the address, offset and length */
+    /* offset must be a multiple of 8 */
+    if (offset % 8) {
+    goto addressing_error;
+    }
+    /* Length must be greater than 8, a multiple of 8, not greater
maxstbl */


"not greater than maxstlb"


Better I know but greater that 80 characters, this is why I preferred
broken English.
What do I do ? break the line or English ?

less than?


It's less or equal, no?

In any case, I prefer a multi-line comment over awkward language.



OK, thanks

  
  

+    if ((len <= 8) || (len % 8) || (len > pbdev->maxstbl)) {
+    goto addressing_error;
+    }
+    /* Do not cross a 4K-byte boundary */
+    if (((offset & 0xfff) + len) > 0x1000) {
+    goto addressing_error;
+    }
+    /* Guest address must be double word aligned */
+    if (gaddr & 0x07UL) {
+    goto addressing_error;
+    }
+
   mr = pbdev->pdev->io_regions[pcias].memory;
-    if (!memory_region_access_valid(mr, env->regs[r3], len, true)) {
+    if (!memory_region_access_valid(mr, offset, len, true)) {
   program_interrupt(env, PGM_OPERAND, 6);
   return 0;
   }


Looks good.
  


  







--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany

Re: [Qemu-devel] [PATCH v3 10/30] imx_fec: Reserve full 4K page for the register file

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 9:48 AM, Peter Maydell  wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> Some i.MX SoCs (e.g. i.MX7) have FEC registers going as far as offset
>> 0x614, so to avoid getting aborts when accessing those on QEMU, extend
>> the register file to cover 4KB of address space instead of just 1K.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
>>  hw/net/imx_fec.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
>> index 48d012cad6..e236bc933c 100644
>> --- a/hw/net/imx_fec.c
>> +++ b/hw/net/imx_fec.c
>> @@ -1252,7 +1252,7 @@ static void imx_eth_realize(DeviceState *dev, Error 
>> **errp)
>>  SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
>>
>>  memory_region_init_io(&s->iomem, OBJECT(dev), &imx_eth_ops, s,
>> -  TYPE_IMX_FEC, 0x400);
>> +  TYPE_IMX_FEC, 0x1000);
>>  sysbus_init_mmio(sbd, &s->iomem);
>>  sysbus_init_irq(sbd, &s->irq[0]);
>>  sysbus_init_irq(sbd, &s->irq[1]);
>> --
>
> I notice that we have an unused #define FSL_IMX25_FEC_SIZE 0x4000 in
> fsl-imx25.h, and the Linux device trees for the imx25 define the size
> of the FEC register block as 0x4000. Should this be 0x4000 ?
>

I think the size reserved for that register file differs between
differen SoC. E.g. it's 16K on i.MX25, as you pointed out, but 64K on
i.MX7. It's all the same to me as long as it's greater than 0x1000.
I'll change the code to use FSL_IMX25_FEC_SIZE since it gets rid of a
magic number.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR

2017-11-22 Thread Eduardo Habkost

On Wed, Nov 22, 2017 at 07:09:08PM +0100, Radim Krčmář wrote:
> QEMU saves only 8 bits of APIC LDR, which means that it does not support
> x2APIC.  The correct way of fixing this would be to save and restore the
> full 32 bit register, but because x2APIC LDR is a function of x2APIC ID,
> we can also compute it and keep the migration format untouched.
> 
> KVM always expected the LDR format to follow the xAPIC/x2APIC standard,
> but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed
> xAPIC ID before switching to x2APIC, which means that QEMU has to use
> the kvm_x2apic_api feature to derive the x2APIC ID.
> 
> This bug has also been addressed on the KVM side with patch 5849d75a5c9b
> ("KVM: lapic: Fixup LDR on load in x2apic").

Is this sufficient to fix the bug on hosts that lack KVM commit
5849d75a5c9b, or we need both the KVM and QEMU patches?

> 
> Reported-by: Dr. David Alan Gilbert 
> Reported-by: Yiqian Wei 
> Signed-off-by: Radim Krčmář 
> ---
>  I haven't tested that it actually fixes the bug,
>  https://bugzilla.redhat.com/show_bug.cgi?id=1502591.
>  
>  hw/i386/kvm/apic.c | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
> index 1df6d26816f9..89df165a04bf 100644
> --- a/hw/i386/kvm/apic.c
> +++ b/hw/i386/kvm/apic.c
> @@ -30,6 +30,13 @@ static inline uint32_t kvm_apic_get_reg(struct 
> kvm_lapic_state *kapic,
>  return *((uint32_t *)(kapic->regs + (reg_id << 4)));
>  }
>  
> +static inline uint32_t kvm_apic_calc_x2apic_ldr(APICCommonState *s)
> +{
> +uint32_t id = kvm_has_x2apic_api() ? s->initial_apic_id : s->id;
> +
> +return ((id >> 4) << 16) | (1 << (id & 0xf));
> +}
> +
>  static void kvm_put_apic_state(APICCommonState *s, struct kvm_lapic_state 
> *kapic)
>  {
>  int i;
> @@ -41,7 +48,11 @@ static void kvm_put_apic_state(APICCommonState *s, struct 
> kvm_lapic_state *kapic
>  kvm_apic_set_reg(kapic, 0x2, s->id << 24);
>  }
>  kvm_apic_set_reg(kapic, 0x8, s->tpr);
> -kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
> +if (s->apicbase & MSR_IA32_APICBASE_EXTD) {
> +kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s));
> +} else {
> +kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
> +}
>  kvm_apic_set_reg(kapic, 0xe, s->dest_mode << 28 | 0x0fff);
>  kvm_apic_set_reg(kapic, 0xf, s->spurious_vec);
>  for (i = 0; i < 8; i++) {
> @@ -71,7 +82,11 @@ void kvm_get_apic_state(DeviceState *dev, struct 
> kvm_lapic_state *kapic)
>  }
>  s->tpr = kvm_apic_get_reg(kapic, 0x8);
>  s->arb_id = kvm_apic_get_reg(kapic, 0x9);
> -s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24;
> +if (s->apicbase & MSR_IA32_APICBASE_EXTD) {
> +assert(kvm_apic_get_reg(kapic, 0xd) == kvm_apic_calc_x2apic_ldr(s));

I assume this assert() won't trigger if the host just lacks the
kernel patch, will it?

What if we're going to migrate to a QEMU version that doesn't
have this patch applied?  Do we want to send the same log_dest
value as old QEMU versions, just in case?  (Those 8 bits QEMU
currently sets at LDR[31:24] seem completely useless, but maybe
it won't hurt to keep them?)


> +} else {
> +s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24;
> +}
>  s->dest_mode = kvm_apic_get_reg(kapic, 0xe) >> 28;
>  s->spurious_vec = kvm_apic_get_reg(kapic, 0xf);
>  for (i = 0; i < 8; i++) {
> -- 
> 2.14.2
> 

-- 
Eduardo

Re: [Qemu-devel] [PATCH v3 07/30] imx_fec: Add support for multiple Tx DMA rings

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 9:44 AM, Peter Maydell  wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> More recent version of the IP block support more than one Tx DMA ring,
>> so add the code implementing that feature.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>
>>  static const VMStateDescription vmstate_imx_eth = {
>>  .name = TYPE_IMX_FEC,
>> -.version_id = 2,
>> -.minimum_version_id = 2,
>> +.version_id = 3,
>> +.minimum_version_id = 3,
>>  .fields = (VMStateField[]) {
>>  VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX),
>>  VMSTATE_UINT32(rx_descriptor, IMXFECState),
>> -VMSTATE_UINT32(tx_descriptor, IMXFECState),
>> -
>> +VMSTATE_UINT32_ARRAY(tx_descriptor, IMXFECState, ENET_TX_RING_NUM),
>> +VMSTATE_UINT32(tx_ring_num, IMXFECState),
>>  VMSTATE_UINT32(phy_status, IMXFECState),
>>  VMSTATE_UINT32(phy_control, IMXFECState),
>>  VMSTATE_UINT32(phy_advertise, IMXFECState),
>
> tx_ring_num is constant for any particular instantiation of the device,
> so you don't need to put it in the vmstate.
>
> It's pretty trivial to make this vmstate compatible with the old
> ones for the existing single-tx-descriptor devices, so we might as well:
>
> /* Versions of this device with more than one TX descriptor
>  * save the 2nd and 3rd descriptors in a subsection, to maintain
>  * migration compatibility with previous versions of the device
>  * that only supported a single descriptor.
>  */
> static bool txdescs_needed(void *opaque) {
> IMXFECState *s = opaque;
>
> return s->tx_ring_num > 1;
> }
>
> static const VMStateDescription vmstate_imx_eth_txdescs = {
> .name = "imx.fec/txdescs",
> .version_id = 1,
> .minimum_version_id = 1,
> .needed = txdescs_needed,
> .fields = (VMStateField[]) {
>  VMSTATE_UINT32(tx_descriptor[1], IMXFECState),
>  VMSTATE_UINT32(tx_descriptor[2], IMXFECState),
>  VMSTATE_END_OF_LIST()
> }
> };
>
> and then add this to the vmx_state_eth at the end:
> .subsections = (const VMStateDescription*[]) {
>  &vmstate_imx_eth_txdescs,
>  NULL
> }
>
> Then you don't need to bump version_id/minimum_version_id on the
> vmstate_imx_eth struct.
>

Cool, sounds good. Will add that to the patch in v4.

>> @@ -791,6 +821,7 @@ static void imx_eth_write(void *opaque, hwaddr offset, 
>> uint64_t value,
>> unsigned size)
>>  {
>>  IMXFECState *s = IMX_FEC(opaque);
>> +const bool single_tx_ring = s->tx_ring_num != 3;
>
> This looks odd -- I would have expected "single_tx_ring =
> s->tx_ring_num == 1" ...

AFAIK the HW that's out there will have either 3 or 1 Tx ring, so
that's why I wrote it this way. I'll change the logic to the way your
suggest to avoid surprising the reader.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v3 03/30] imx_fec: Change queue flushing heuristics

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 9:27 AM, Peter Maydell  wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> In current implementation, packet queue flushing logic seem to suffer
>> from a deadlock like scenario if a packet is received by the interface
>> before before Rx ring is initialized by Guest's driver. Consider the
>> following sequence of events:
>>
>> 1. A QEMU instance is started against a TAP device on Linux
>>host, running Linux guest, e. g., something to the effect
>>of:
>>
>>qemu-system-arm \
>>   -net nic,model=imx.fec,netdev=lan0 \
>>   netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
>>   ... rest of the arguments ...
>>
>> 2. Once QEMU starts, but before guest reaches the point where
>>FEC deriver is done initializing the HW, Guest, via TAP
>>interface, receives a number of multicast MDNS packets from
>>Host (not necessarily true for every OS, but it happens at
>>least on Fedora 25)
>>
>> 3. Recieving a packet in such a state results in
>>imx_eth_can_receive() returning '0', which in turn causes
>>tap_send() to disable corresponding event (tap.c:203)
>>
>> 4. Once Guest's driver reaches the point where it is ready to
>>recieve packets it prepares Rx ring descriptors and writes
>>ENET_RDAR_RDAR to ENET_RDAR register to indicate to HW that
>>more descriptors are ready. And at this points emulation
>>layer does this:
>>
>>  s->regs[index] = ENET_RDAR_RDAR;
>>  imx_eth_enable_rx(s);
>>
>>which, combined with:
>>
>>   if (!s->regs[ENET_RDAR]) {
>>  qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>   }
>>
>>results in Rx queue never being flushed and corresponding
>>I/O event beign disabled.
>>
>> To prevent the problem, change the code to always flush packet queue
>> when ENET_RDAR transitions 0 -> ENET_RDAR_RDAR.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
>> index 62ad473b05..4bc8f03ec2 100644
>> --- a/include/hw/net/imx_fec.h
>> +++ b/include/hw/net/imx_fec.h
>> @@ -252,6 +252,7 @@ typedef struct IMXFECState {
>>  uint32_t phy_int_mask;
>>
>>  bool is_fec;
>> +bool needs_flush;
>>  } IMXFECState;
>
> This field isn't needed any more in this version of the patch, I think?
>

Yeah, my bad, forgot to remove this part. Will do in v4.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v3 04/30] imx_fec: Use ENET_FTRL to determine truncation length

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 9:31 AM, Peter Maydell  wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> Frame truncation length, TRUNC_FL, is determined by the contents of
>> ENET_FTRL register, so convert the code to use it instead of a
>> hardcoded constant.
>>
>> To avoid the case where TRUNC_FL is greater that ENET_MAX_FRAME_SIZE,
>> increase the value of the latter to its theoretical maximum of 16K.
>>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
>>  hw/net/imx_fec.c | 4 ++--
>>  include/hw/net/imx_fec.h | 3 ++-
>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
>> index eb034ffd0c..dda0816fb3 100644
>> --- a/hw/net/imx_fec.c
>> +++ b/hw/net/imx_fec.c
>> @@ -1052,8 +1052,8 @@ static ssize_t imx_enet_receive(NetClientState *nc, 
>> const uint8_t *buf,
>>  crc_ptr = (uint8_t *) &crc;
>>
>>  /* Huge frames are truncted.  */
>> -if (size > ENET_MAX_FRAME_SIZE) {
>> -size = ENET_MAX_FRAME_SIZE;
>> +if (size > s->regs[ENET_FTRL]) {
>> +size = s->regs[ENET_FTRL];
>>  flags |= ENET_BD_TR | ENET_BD_LG;
>>  }
>>
>> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
>> index 4bc8f03ec2..0fcc4f0c71 100644
>> --- a/include/hw/net/imx_fec.h
>> +++ b/include/hw/net/imx_fec.h
>> @@ -86,7 +86,6 @@
>>  #define ENET_TCCR3 393
>>  #define ENET_MAX   400
>>
>> -#define ENET_MAX_FRAME_SIZE2032
>>
>>  /* EIR and EIMR */
>>  #define ENET_INT_HB(1 << 31)
>> @@ -155,6 +154,8 @@
>>  #define ENET_RCR_NLC   (1 << 30)
>>  #define ENET_RCR_GRS   (1 << 31)
>>
>> +#define ENET_MAX_FRAME_SIZE(1 << ENET_RCR_MAX_FL_LENGTH)
>
> This means we now have functions with 16K local array
> variables on the stack, which seems like a bad idea.
>

Can't say I see a big difference between having a 2K vs 16K buffer on
the stack, but regardless, I am not quite clear if you are not too hot
about this patch and want me to drop it (I am fine with it) or do you
want me to modify it to have the emulation layer allocate said 16K
buffer on the heap instead of a stack?

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH v3 00/30] Initial i.MX7 support

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 10:34 AM, Peter Maydell
 wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov  wrote:
>> Hi everyone,
>> - Added proper USB emulation code, so now it should be possible to
>>   emulated guest's USB bus
>
> The patchset is huge as it is, if you add more stuff to it
> it makes it even more likely to sink to the bottom of my
> to-review queue...
>

USB peripheral emulation had to be a part of a patch-set, either in
dummy or a full featured form, in order to be able to boot vanilla
Linux kernel because you insisted that I don't use
"ignore_memory_transaction_failures". The reason why the dummy
emulation version of it it was not a part of v2 was because I did my
test with a bad kernel config where USB was disabled, didn't realize
USB was essential and did not write code to support it. Now, once I
realized it, I wrote a dummy version, and then later, while waiting
for v2 to be reviewed, worked on proper USB emulation the code. Said
code turned out to be comparatively trivial to the first dummy
implementation, so instead of going through the exercise of submitting
dummy first and then proper version later I squashed both and the
result in v3.

>> Peter, I didn't hear anything from you about the code of
>> mcimx7d_add_psci_node(), as discussed here:
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg486874.html
>>
>> so I kept the original code intact. As I mentioned before, my goal was
>> to be able to boot into vanilla Linux kerenel and have working SMP
>> without needing to use a PSCI implementing bootloader. If that is
>> something that new board code shouldn't do, please let me know.
>
> Broadly, board code should work the same way the real hardware
> does, unless there's a clear reason why not.

Yes, this all makes sense. As far as I understand convenience being
able to boot Linux directly in QEMU has long been the "clear reason
why not". Now that certain SoC specific versions of Linux are not as
self-sufficient and can't support SMP without external help, emulating
PSCI and doing appropriate DTB fixups for that is just an adaptaion of
the old convenience mechanism to new times and circumstances, IMHO.

> "virt" is special because it writes its own dtb entirely.
>
> Maybe PSCI does need to be a different special case, since we're
> emulating part of a bootloader here.

OK, I'll ignore the "maybe" part and proceed as if we are in agreement
on PSCI for v4.

> But if so I think that code belongs more in hw/arm/boot.c, so that we 
> automatically fix up the
> dtb to say "we have psci" if we are (a) booting a kernel directly
> and (b) the CPU has the psci-conduit property set to enable QEMU's
> PSCI implementation.
>

OK, sure, makes sense. I'll change the patch to use shared
infrastructure for that.

> (Also the code in virt.c for adding a PSCI node is considerably
> fuller-featured than yours is.)
>

Okay... My code is targeting both fixed PSCI conduit (smc) and PSCI
implementation (0.2/1.0), implementing support for anything but that
in my board specific code would've been, IMHO, silly. OK, I'll
interpret that comment not as a slight, but as a request to use
virt.c's implementation for shared infrastructure.

>> Thanks,
>> Andrey Smirnov
>>
>> [v2] https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg05516.html
>> [v1] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg04770.html
>>
>>
>> Andrey Smirnov (30):
>>   imx_fec: Do not link to netdev
>>   imx_fec: Refactor imx_eth_enable_rx()
>>   imx_fec: Change queue flushing heuristics
>>   imx_fec: Use ENET_FTRL to determine truncation length
>>   imx_fec: Use MIN instead of explicit ternary operator
>>   imx_fec: Emulate SHIFT16 in ENETx_RACC
>>   imx_fec: Add support for multiple Tx DMA rings
>>   imx_fec: Use correct length for packet size
>>   imx_fec: Fix a typo in imx_enet_receive()
>>   imx_fec: Reserve full 4K page for the register file
>>   sdhci: Add i.MX specific subtype of SDHCI
>>   sdhci: Implement write method of ACMD12ERRSTS register
>
> Everything above here is pretty nearly ready to go in;
> if you send that as a patchseries then it should be easy
> to review and queue ready for 2.12 (which will open up
> for new commits in mid-december).
>

I'm not quite sure where you stand on "imx_fec: Use ENET_FTRL to
determine truncation length", but sure, sounds good, I'll prepare the
rest of them as a separate patch set and submit it, as soon as I get a
chance.

>>   i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks
>>   i.MX: Add code to emulate i.MX2 watchdog IP block
>>   i.MX: Add code to emulate i.MX7 SNVS IP-block
>>   i.MX: Add code to emulate GPCv2 IP block
>>   i.MX: Add code to emulate i.MX7 IOMUXC IP block
>>   i.MX: Add i.MX7 GPT variant
>>   i.MX: Add code to emulate SDMA IP block
>>   i.MX: Add code to emulate FlexCAN IP block
>>   i.MX: Add implementation of i.MX7 GPR IP block
>>   pci: Add support for Designware IP block
>>   i.MX: Add code to emulate i.MX7 USBMISC IP block
>

Re: [Qemu-devel] [PATCH 2/5] nbd/server: add nbd_opt_{read, drop} to track client->optlen

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/server.c | 34 ++
>  1 file changed, 22 insertions(+), 12 deletions(-)
> 

Hmm, revisiting my idea about moving more of the error checking into the
helper function.  As of this patch, we only have two clients of
nbd_opt_read:

> @@ -299,7 +312,7 @@ static int nbd_negotiate_handle_export_name(NBDClient 
> *client,
>  error_setg(errp, "Bad length received");
>  return -EINVAL;
>  }
> -if (nbd_read(client->ioc, name, client->optlen, errp) < 0) {
> +if (nbd_opt_read(client, name, client->optlen, errp) < 0) {
>  error_prepend(errp, "read failed: ");
>  return -EINVAL;

1. NBD_OPT_EXPORT_NAME, where the length check is based on the maximum
size name we're willing to accept (and NOT on comparison to the header
size, as we request the entire client->optlen).  This call cannot send
an error reply (so if the length is bogus, we can just drop the
connection by returning -EINVAL).  Furthermore, once we handle this
option, option processing is done; we do not reach the assert that
client->optlen == 0.

>  }
> @@ -383,40 +396,36 @@ static int nbd_negotiate_handle_info(NBDClient *client, 
> uint16_t myflags,
>  msg = "overall request too short";
>  goto invalid;
>  }
> -if (nbd_read(client->ioc, &namelen, sizeof(namelen), errp) < 0) {
> +if (nbd_opt_read(client, &namelen, sizeof(namelen), errp) < 0) {
>  return -EIO;
>  }

2. Multiple calls within NBD_OPT_INFO/NBD_OPT_GO.  Here, the length
check is based on our read being a subset of client->optlen, and our
desired response on a failed check is whatever is at the invalid: label,
namely:

 invalid:
if (nbd_opt_drop(client, client->optlen, errp) < 0) {
return -EIO;
}
return nbd_negotiate_send_rep_err(client, NBD_REP_ERR_INVALID,
  errp, "%s", msg);

We want to drop all remaining data, reply to the client, and return 0 to
keep the connection alive (or -EIO if the read or write failed).

You are planning on adding more calls withing NBD_OPT_LIST_META_CONTEXT,
which will have semantics more like 2 (if we detect an inconsistent
size, we want to consume the rest of the input and return an EINVAL
reply to the client, but keep the connection alive unless there is an
I/O error in the meantime).

I think that means we need a tri-state return from nbd_opt_read(): < 0
on I/O failure, 0 if the EINVAL message has been sent, and 1 if the read
was successful; I also think it means that the handler for
NBD_OPT_EXPORT_NAME does not really fit the bill for using the same
handler.  Hopefully this explanation will give you more insight into the
counter-proposal patch I'm about to post.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support

2017-11-22 Thread Dr. David Alan Gilbert

* Stefan Priebe - Profihost AG (s.pri...@profihost.ag) wrote:
> Hello,
> 
> Am 22.11.2017 um 20:41 schrieb Dr. David Alan Gilbert:
> > * Paolo Bonzini (pbonz...@redhat.com) wrote:
> >> On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote:
> >>> HI Paolo,
> >>>
> >>> could this patchset be related?
> >>
> >> Uh oh, yes it should.  Jason, any ways to fix it?  I suppose we need to
> >> disable UFO in the newest machine types, but do we also have to do
> >> (software) UFO in vhost-net and QEMU for migration compatibility?
> > 
> > Was there a solution to this?
> 
> it will be this one:
> https://patchwork.ozlabs.org/patch/840094/

Thanks; I've added a link to:
https://wiki.qemu.org/Features/Migration/Troubleshooting#virtio-net:_saved_image_requires_TUN_F_UFO_support

Dave

> 
> Stefan
> 
> > Dave
> > 
> >> Thanks,
> >>
> >> Paolo
> >>
> >>> Greets,
> >>> Stefan
> >>>
> >>> Am 06.11.2017 um 10:52 schrieb Stefan Priebe - Profihost AG:
>  Hi Paolo,
> 
>  Am 06.11.2017 um 10:49 schrieb Paolo Bonzini:
> > On 06/11/2017 10:48, Stefan Priebe - Profihost AG wrote:
> >> Hi Paolo,
> >>
> >> Am 06.11.2017 um 10:40 schrieb Paolo Bonzini:
> >>> On 06/11/2017 10:38, Stefan Priebe - Profihost AG wrote:
>  Hello,
> 
>  i've upgraded some servers from kernel 4.4 to 4.12 - both running 
>  Qemu
>  2.9.1.
> 
>  If i migrate a VM from a host running kernel 4.4 to a host running 
>  4.12
>  i get:
> 
>  kvm: virtio-net: saved image requires TUN_F_UFO support
>  kvm: Failed to load virtio-net-device:tmp
>  kvm: Failed to load virtio-net:virtio
>  kvm: error while loading state for instance 0x0 of device
>  ':00:12.0/virtio-net'
>  kvm: load of migration failed: Invalid argument
> 
> 
>  while migrating from 4.12 to 4.4 works fine.
> 
>  Can anybody help? Is this expected?
> >>>
> >>> Can you check why peer_has_ufo failed (in hw/net/virtio-net.c)?
> >>
> >> May be - how can i archieve this? Patching the code is not a problem if
> >> you can give me a hint.
> >>
> >>> Also, did this ioctl fail when the tap device was set up on the 4.12 
> >>> destination?
> >>> int tap_probe_has_ufo(int fd)
> >>> {
> >>> unsigned offload;
> >>>
> >>> offload = TUN_F_CSUM | TUN_F_UFO;
> >>>
> >>> if (ioctl(fd, TUNSETOFFLOAD, offload) < 0)
> >>> return 0;
> >>>
> >>> return 1;
> >>> }
> >>
> >> Should there be any kernel output or how can i detect / check it?
> >
> > For both, the simplest answer is probably just using printf.
> 
>  arg i missed an important part. The kernel is an opensuse SLE15 one.
> 
>  I've seen it contains the following patchset:
>  https://www.spinics.net/lists/netdev/msg443821.html
> 
>  Greets,
>  Stefan
> 
> >>
> >>
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support

2017-11-22 Thread Stefan Priebe - Profihost AG

Hello,

Am 22.11.2017 um 20:41 schrieb Dr. David Alan Gilbert:
> * Paolo Bonzini (pbonz...@redhat.com) wrote:
>> On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote:
>>> HI Paolo,
>>>
>>> could this patchset be related?
>>
>> Uh oh, yes it should.  Jason, any ways to fix it?  I suppose we need to
>> disable UFO in the newest machine types, but do we also have to do
>> (software) UFO in vhost-net and QEMU for migration compatibility?
> 
> Was there a solution to this?

it will be this one:
https://patchwork.ozlabs.org/patch/840094/


Stefan

> Dave
> 
>> Thanks,
>>
>> Paolo
>>
>>> Greets,
>>> Stefan
>>>
>>> Am 06.11.2017 um 10:52 schrieb Stefan Priebe - Profihost AG:
 Hi Paolo,

 Am 06.11.2017 um 10:49 schrieb Paolo Bonzini:
> On 06/11/2017 10:48, Stefan Priebe - Profihost AG wrote:
>> Hi Paolo,
>>
>> Am 06.11.2017 um 10:40 schrieb Paolo Bonzini:
>>> On 06/11/2017 10:38, Stefan Priebe - Profihost AG wrote:
 Hello,

 i've upgraded some servers from kernel 4.4 to 4.12 - both running Qemu
 2.9.1.

 If i migrate a VM from a host running kernel 4.4 to a host running 4.12
 i get:

 kvm: virtio-net: saved image requires TUN_F_UFO support
 kvm: Failed to load virtio-net-device:tmp
 kvm: Failed to load virtio-net:virtio
 kvm: error while loading state for instance 0x0 of device
 ':00:12.0/virtio-net'
 kvm: load of migration failed: Invalid argument


 while migrating from 4.12 to 4.4 works fine.

 Can anybody help? Is this expected?
>>>
>>> Can you check why peer_has_ufo failed (in hw/net/virtio-net.c)?
>>
>> May be - how can i archieve this? Patching the code is not a problem if
>> you can give me a hint.
>>
>>> Also, did this ioctl fail when the tap device was set up on the 4.12 
>>> destination?
>>> int tap_probe_has_ufo(int fd)
>>> {
>>> unsigned offload;
>>>
>>> offload = TUN_F_CSUM | TUN_F_UFO;
>>>
>>> if (ioctl(fd, TUNSETOFFLOAD, offload) < 0)
>>> return 0;
>>>
>>> return 1;
>>> }
>>
>> Should there be any kernel output or how can i detect / check it?
>
> For both, the simplest answer is probably just using printf.

 arg i missed an important part. The kernel is an opensuse SLE15 one.

 I've seen it contains the following patchset:
 https://www.spinics.net/lists/netdev/msg443821.html

 Greets,
 Stefan

>>
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>

Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support

2017-11-22 Thread Dr. David Alan Gilbert

* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote:
> > HI Paolo,
> > 
> > could this patchset be related?
> 
> Uh oh, yes it should.  Jason, any ways to fix it?  I suppose we need to
> disable UFO in the newest machine types, but do we also have to do
> (software) UFO in vhost-net and QEMU for migration compatibility?

Was there a solution to this?

Dave

> Thanks,
> 
> Paolo
> 
> > Greets,
> > Stefan
> > 
> > Am 06.11.2017 um 10:52 schrieb Stefan Priebe - Profihost AG:
> >> Hi Paolo,
> >>
> >> Am 06.11.2017 um 10:49 schrieb Paolo Bonzini:
> >>> On 06/11/2017 10:48, Stefan Priebe - Profihost AG wrote:
>  Hi Paolo,
> 
>  Am 06.11.2017 um 10:40 schrieb Paolo Bonzini:
> > On 06/11/2017 10:38, Stefan Priebe - Profihost AG wrote:
> >> Hello,
> >>
> >> i've upgraded some servers from kernel 4.4 to 4.12 - both running Qemu
> >> 2.9.1.
> >>
> >> If i migrate a VM from a host running kernel 4.4 to a host running 4.12
> >> i get:
> >>
> >> kvm: virtio-net: saved image requires TUN_F_UFO support
> >> kvm: Failed to load virtio-net-device:tmp
> >> kvm: Failed to load virtio-net:virtio
> >> kvm: error while loading state for instance 0x0 of device
> >> ':00:12.0/virtio-net'
> >> kvm: load of migration failed: Invalid argument
> >>
> >>
> >> while migrating from 4.12 to 4.4 works fine.
> >>
> >> Can anybody help? Is this expected?
> >
> > Can you check why peer_has_ufo failed (in hw/net/virtio-net.c)?
> 
>  May be - how can i archieve this? Patching the code is not a problem if
>  you can give me a hint.
> 
> > Also, did this ioctl fail when the tap device was set up on the 4.12 
> > destination?
> > int tap_probe_has_ufo(int fd)
> > {
> > unsigned offload;
> >
> > offload = TUN_F_CSUM | TUN_F_UFO;
> >
> > if (ioctl(fd, TUNSETOFFLOAD, offload) < 0)
> > return 0;
> >
> > return 1;
> > }
> 
>  Should there be any kernel output or how can i detect / check it?
> >>>
> >>> For both, the simplest answer is probably just using printf.
> >>
> >> arg i missed an important part. The kernel is an opensuse SLE15 one.
> >>
> >> I've seen it contains the following patchset:
> >> https://www.spinics.net/lists/netdev/msg443821.html
> >>
> >> Greets,
> >> Stefan
> >>
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR

2017-11-22 Thread Paolo Bonzini

On 22/11/2017 19:09, Radim Krčmář wrote:
> QEMU saves only 8 bits of APIC LDR, which means that it does not support
> x2APIC.  The correct way of fixing this would be to save and restore the
> full 32 bit register, but because x2APIC LDR is a function of x2APIC ID,
> we can also compute it and keep the migration format untouched.
> 
> KVM always expected the LDR format to follow the xAPIC/x2APIC standard,
> but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed
> xAPIC ID before switching to x2APIC, which means that QEMU has to use
> the kvm_x2apic_api feature to derive the x2APIC ID.
> 
> This bug has also been addressed on the KVM side with patch 5849d75a5c9b
> ("KVM: lapic: Fixup LDR on load in x2apic").

> +if (s->apicbase & MSR_IA32_APICBASE_EXTD) {
> +kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s));

Is this correct if the kernel doesn't support the new-style x2APIC API?

In the end, it seems simpler to just fix it in the kernel.

Paolo

> +} else {
> +kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
> +}

Re: [Qemu-devel] [Qemu-block] [PATCH] block: Fix qemu crash when using scsi-block

2017-11-22 Thread Paolo Bonzini

On 22/11/2017 19:06, Kevin Wolf wrote:
> Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben:
>> On 22/11/2017 16:33, Deepa Srinivasan wrote:
>>> Starting qemu with the following arguments causes qemu to segfault:
>>> ... -device lsi,id=lsi0 -drive 
>>> file=iscsi:<...>,format=raw,if=none,node-name=
>>> iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
>>>
>>> This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
>>> blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. 
>>> More
>>> details about the bug follow.
>>>
>>> blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
>>> coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
>>>
>>> When blk_aio_ioctl() is executed from within a coroutine context (e.g.
>>> iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
>>> the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
>>>
>>> When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
>>> 
>>> BlkRwCo *rwco = &acb->rwco;
>>>
>>> rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
>>>  rwco->qiov->iov[0].iov_base);  <--- qiov is
>>>  invalid 
>>> here
>>> ...
>>>
>>> In the case when blk_aio_ioctl() is called from a non-coroutine context,
>>> blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
>>> qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
>>> execution is complete, control returns to blk_aio_ioctl_entry() after the 
>>> call
>>> to blk_co_ioctl(). There is no invalid reference after this point, but the
>>> function is still holding on to invalid pointers.
>>>
>>> The fix is to allocate memory for the QEMUIOVector and struct iovec as part 
>>> of
>>> the request struct which the IO buffer is part of. The memory for this 
>>> struct is
>>> guaranteed to be valid till the AIO is completed.
>>>
>>> Signed-off-by: Deepa Srinivasan 
>>> Signed-off-by: Konrad Rzeszutek Wilk 
>>> Reviewed-by: Mark Kanda 
>>> ---
>>>  block/block-backend.c  | 13 ++---
>>>  hw/block/virtio-blk.c  |  9 -
>>>  hw/scsi/scsi-disk.c| 10 +-
>>>  hw/scsi/scsi-generic.c |  9 -
>>>  include/sysemu/block-backend.h |  2 +-
>>>  5 files changed, 28 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/block/block-backend.c b/block/block-backend.c
>>> index baef8e7..c275827 100644
>>> --- a/block/block-backend.c
>>> +++ b/block/block-backend.c
>>> @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque)
>>>  blk_aio_complete(acb);
>>>  }
>>>  
>>> -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void 
>>> *buf,
>>> +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, 
>>> QEMUIOVector *qiov,
>>>BlockCompletionFunc *cb, void *opaque)
>>
>> I think this is not the best way to fix the bug, because it adds extra
>> unnecessary code in the callers.
>>
>> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
>> same for blk_aio_prwv's "qiov" argument?
>>
>> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
>> use rwco->buf.
> 
> But the same struct is used for read and write requests that do use an
> actual QEMUIOVector and not just a linear buffer.

Then let's call it "void *opaque", or make it a union (but I think
that's overkill).

The QEMUIOVector pointer is opaque as far as blk_aio_prwv is concerned,
and it is only created by blk_aio_ioctl for blk_aio_ioctl_entry to
extract buf:

rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
 rwco->qiov->iov[0].iov_base);

Exposing the fake QEMUIOVector to the callers of blk_aio_ioctl is much
uglier than using a void* for what is effectively a multi-type pointer.

Paolo

Re: [Qemu-devel] [PATCH v3 30/30] Implement support for i.MX7 Sabre board

2017-11-22 Thread Andrey Smirnov

On Tue, Nov 21, 2017 at 10:22 AM, Peter Maydell
 wrote:
> On 6 November 2017 at 15:48, Andrey Smirnov  wrote:
>> Implement code needed to set up emulation of MCIMX7SABRE board from
>> NXP. For more info about the HW see:
>>
>> https://www.nxp.com/support/developer-resources/hardware-development-tools/sabre-development-system/sabre-board-for-smart-devices-based-on-the-i.mx-7dual-applications-processors:MCIMX7SABRE
>
> You could put this URL in a comment in the code as well.
>
>> Cc: Peter Maydell 
>> Cc: Jason Wang 
>> Cc: Philippe Mathieu-Daudé 
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-...@nongnu.org
>> Cc: yurov...@gmail.com
>> Signed-off-by: Andrey Smirnov 
>> ---
>>  hw/arm/Makefile.objs   |   2 +-
>>  hw/arm/mcimx7d-sabre.c | 101 
>> +
>>  2 files changed, 102 insertions(+), 1 deletion(-)
>>  create mode 100644 hw/arm/mcimx7d-sabre.c
>>
>> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
>> index f379ddc74b..eb6f6c5997 100644
>> --- a/hw/arm/Makefile.objs
>> +++ b/hw/arm/Makefile.objs
>> @@ -19,5 +19,5 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
>>  obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
>>  obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o
>>  obj-$(CONFIG_MPS2) += mps2.o
>> -obj-$(CONFIG_FSL_IMX7) += fsl-imx7.o
>> +obj-$(CONFIG_FSL_IMX7) += fsl-imx7.o mcimx7d-sabre.o
>>
>> diff --git a/hw/arm/mcimx7d-sabre.c b/hw/arm/mcimx7d-sabre.c
>> new file mode 100644
>> index 00..7ca8e668e8
>> --- /dev/null
>> +++ b/hw/arm/mcimx7d-sabre.c
>> @@ -0,0 +1,101 @@
>> +/*
>> + * Copyright (c) 2017, Impinj, Inc.
>> + *
>> + * MCIMX7D_SABRE Board System emulation.
>> + *
>> + * Author: Andrey Smirnov 
>> + *
>> + * This code is licensed under the GPL, version 2 or later.
>> + * See the file `COPYING' in the top level directory.
>> + *
>> + * It (partially) emulates a mcimx7d_sabre board, with a Freescale
>> + * i.MX7 SoC
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "qemu-common.h"
>> +#include "hw/arm/fsl-imx7.h"
>> +#include "hw/boards.h"
>> +#include "sysemu/sysemu.h"
>> +#include "sysemu/device_tree.h"
>> +#include "qemu/error-report.h"
>> +#include "sysemu/qtest.h"
>> +#include "net/net.h"
>> +
>> +typedef struct {
>> +FslIMX7State soc;
>> +MemoryRegion ram;
>> +} MCIMX7Sabre;
>> +
>> +static void mcimx7d_add_psci_node(const struct arm_boot_info *boot_info,
>> +  void *fdt)
>> +{
>> +const char comp[] = "arm,psci-0.2\0arm,psci";
>> +
>> +qemu_fdt_add_subnode(fdt, "/psci");
>> +qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
>> +qemu_fdt_setprop_string(fdt, "/psci", "method", "smc");
>> +}
>
> I'm still unconvinced by this (none of the other i.mx boards we have
> have anything like it).

None of the other boards are both SMP capable and support SMP in
upstream Linux only through PSCI (as it the case for i.MX7), so
comparing against the precedent is not very helpful.

> How does the real hardware boot SMP ?
>

Real hardware executes a bootloader which is expected to implement
PSCI and do the DTB fixup as I implemented above.

Thanks,
Andrey Smirnov

Re: [Qemu-devel] [PATCH 2/5] nbd/server: add nbd_opt_{read, drop} to track client->optlen

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/server.c | 34 ++
>  1 file changed, 22 insertions(+), 12 deletions(-)
> @@ -299,7 +312,7 @@ static int nbd_negotiate_handle_export_name(NBDClient 
> *client,
>  error_setg(errp, "Bad length received");
>  return -EINVAL;
>  }
> -if (nbd_read(client->ioc, name, client->optlen, errp) < 0) {
> +if (nbd_opt_read(client, name, client->optlen, errp) < 0) {
>  error_prepend(errp, "read failed: ");
>  return -EINVAL;
>  }

More context:

  name[client->optlen] = '\0';

Oops - that's broken, because client->optlen is now 0.  Which means your
code was only tested with empty-string default exports.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH for-2.12 v3 1/3] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged

2017-11-22 Thread Cédric Le Goater

On 11/22/2017 03:33 AM, David Gibson wrote:
> On Mon, Nov 20, 2017 at 11:03:45AM +0100, Cédric Le Goater wrote:
>> When a CPU is stopped with the 'stop-self' RTAS call, its state
>> 'halted' is switched to 1 and, in this case, the MSR is not taken into
>> account anymore in the cpu_has_work() routine. Only the pending
>> hardware interrupts are checked with their LPCR:PECE* enablement bit.
>>
>> If the DECR timer fires after 'stop-self' is called and before the CPU
>> 'stop' state is reached, the nearly-dead CPU will have some work to do
>> and the guest will crash. This case happens very frequently with the
>> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
>> occasionally fired but after 'stop' state, so no work is to be done
>> and the guest survives.
>>
>> I suspect there is a race between the QEMU mainloop triggering the
>> timers and the TCG CPU thread but I could not quite identify the root
>> cause. To be safe, let's disable in the LPCR all the exceptions which
>> can cause an exit while the CPU is in power-saving mode and reenable
>> them when the CPU is started.
>>
>> For this purpose, we introduce a little helper routine to calculate
>> the PECE bits for a processor variant. We could also use the mask
>> value LPCR_PECE_L_MASK for the P8 and P9 processors. bit 47 and 48 are
>> reserved on P7 but it is still compatible.
>>
>> Signed-off-by: Cédric Le Goater 
> 
> I'm not thrilled about addressing this without 100% knowing what's
> going on, 

me either :/ I have spent hours, days, on QEMU logs trying to catch 
a possible race in the state machine of the CPUs and didn't find it.
I need a better understanding of the internals. 

> but this seems like a sensible change in any case, so I'm ok
> with applying something like this.
>
> A detail however..
> 
> [snip]
>>  #if !defined(CONFIG_USER_ONLY)
>> +
>> +target_ulong cpu_ppc_papr_pece_bits(CPUPPCState *env)
>> +{
>> +switch (env->mmu_model) {
>> +case POWERPC_MMU_3_00:
>> +return LPCR_PDEE | LPCR_HDEE | LPCR_EEE | LPCR_DEE | LPCR_OEE;
>> +default:
>> +/* P7 and P8 has slightly different PECE bits, mostly because P8 
>> adds
>> + * bit 47 and 48 which are reserved on P7. Here we set them all, 
>> which
>> + * will work as expected for both implementations
>> + */
>> +return LPCR_P8_PECE0 | LPCR_P8_PECE1 | LPCR_P8_PECE2 | 
>> LPCR_P8_PECE3 |
>> +LPCR_P8_PECE4;
>> +}
>> +}
> 
> ..since we're working in this area, might as well clean up this
> inappropriate use of mmu_model.  Two options which I'd be ok with:
> 
> 1) Add a pece_bits field to the PowerPCCPUClass, correctly initialized
> for the various processors.
> 
> 2) A similar helper but using ppc_check_compat() to check the arch
> level, instead of using env->mmu_model.
> 

OK. 

Thanks,

C.

Re: [Qemu-devel] qemu iotest 020 failing for vmdk after 2b7731938d9

2017-11-22 Thread Max Reitz

On 2017-11-21 23:16, John Snow wrote:
> Commit 2b7731938d9 adds a blkdebug driver test for failing commits, but
> the vmdk driver doesn't appear to appreciate this format:
> 
> +_qemu_img_wrapper create -f vmdk -b "json:{'driver': 'raw',
> + 'file': {
> + 'driver': 'blkdebug',
> + 'inject-error': [{
> + 'event': 'write_aio',
> + 'errno': 28,
> + 'once': true
> + }],
> + 'image': {
> + 'driver': 'null-co'
> + }}}"
> "/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/t.vmdk"
> +qemu-img: TEST_DIR/t.IMGFMT: Could not create image: Invalid argument
> 
> 
> ...so;
> 
> (A) VMDK should be dropped from 020, or
> (B) This sub-test should be rewritten, or
> (C) This sub-test should be split out into a new unit where VMDK can be
> dropped.
> 
> I don't like (A) very much because I like testing our weird formats when
> possible, I don't like (B) very much because I don't really like
> wrangling QMP commands inside of the bash unit tests.
> 
> (C) Could work; though it's odd to have it away from its kin in 020.

The issue is not blkdebug, the issue is that the backing file is not in
vmdk format (because it's a null-co node at heart).

I can make it a VMDK, but that'll be fun either when specifying the
backing file file.image node because this test is supposed to work with
all protocols but NFS; or I'll have to resort to the gold old
blkdebug:conf:file syntax...  But why not, it works.

Now I'll just have to figure this out:

-qemu-img: Block job failed: No space left on device
+Image committed.

:-/

Max



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH 3/3] hyperv: make SynIC version msr constant

2017-11-22 Thread Roman Kagan

The value of HV_X64_MSR_SVERSION is initialized once at vcpu init, and
is reset to zero on vcpu reset, which is wrong.

It is supposed to be a constant, so drop the field from X86CPU, set the
msr with the constant value, and don't bother getting it.

Signed-off-by: Roman Kagan 
---
 target/i386/cpu.h | 1 -
 target/i386/kvm.c | 9 ++---
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index ea9db80de5..b264419678 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1101,7 +1101,6 @@ typedef struct CPUX86State {
 uint64_t msr_hv_crash_params[HV_CRASH_PARAMS];
 uint64_t msr_hv_runtime;
 uint64_t msr_hv_synic_control;
-uint64_t msr_hv_synic_version;
 uint64_t msr_hv_synic_evt_page;
 uint64_t msr_hv_synic_msg_page;
 uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index ea6e6e5f30..0479fa4e4a 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -669,7 +669,6 @@ static int hyperv_handle_properties(CPUState *cs)
 }
 
 env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE;
-env->msr_hv_synic_version = HV_SYNIC_VERSION;
 }
 if (cpu->hyperv_stimer) {
 if (!has_msr_hv_stimer) {
@@ -1715,10 +1714,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
 if (cpu->hyperv_synic) {
 int j;
 
+kvm_msr_entry_add(cpu, HV_X64_MSR_SVERSION, HV_SYNIC_VERSION);
+
 kvm_msr_entry_add(cpu, HV_X64_MSR_SCONTROL,
   env->msr_hv_synic_control);
-kvm_msr_entry_add(cpu, HV_X64_MSR_SVERSION,
-  env->msr_hv_synic_version);
 kvm_msr_entry_add(cpu, HV_X64_MSR_SIEFP,
   env->msr_hv_synic_evt_page);
 kvm_msr_entry_add(cpu, HV_X64_MSR_SIMP,
@@ -2082,7 +2081,6 @@ static int kvm_get_msrs(X86CPU *cpu)
 uint32_t msr;
 
 kvm_msr_entry_add(cpu, HV_X64_MSR_SCONTROL, 0);
-kvm_msr_entry_add(cpu, HV_X64_MSR_SVERSION, 0);
 kvm_msr_entry_add(cpu, HV_X64_MSR_SIEFP, 0);
 kvm_msr_entry_add(cpu, HV_X64_MSR_SIMP, 0);
 for (msr = HV_X64_MSR_SINT0; msr <= HV_X64_MSR_SINT15; msr++) {
@@ -2286,9 +2284,6 @@ static int kvm_get_msrs(X86CPU *cpu)
 case HV_X64_MSR_SCONTROL:
 env->msr_hv_synic_control = msrs[i].data;
 break;
-case HV_X64_MSR_SVERSION:
-env->msr_hv_synic_version = msrs[i].data;
-break;
 case HV_X64_MSR_SIEFP:
 env->msr_hv_synic_evt_page = msrs[i].data;
 break;
-- 
2.14.3

[Qemu-devel] [PATCH 1/3] hyperv: set partition-wide MSRs only on first vcpu

2017-11-22 Thread Roman Kagan

From: Evgeny Yakovlev 

Hyper-V has a notion of partition-wide MSRs.  Those MSRs are read and
written as usual on each VCPU, however the hypervisor maintains a single
global value for all VCPUs.  Thus writing such an MSR from any single
VCPU affects the global value that is read by all other VCPUs.

This leads to an issue during VCPU hotplug: the zero-initialzied values
of those MSRs get synced into KVM and override the global values as has
already been set by the guest.

This change makes the partition-wide MSRs only be synchronized on the
first vcpu.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Roman Kagan 
---
 target/i386/cpu.h |  5 -
 target/i386/kvm.c | 23 +++
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index b086b1528b..ea9db80de5 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1091,10 +1091,13 @@ typedef struct CPUX86State {
 uint64_t async_pf_en_msr;
 uint64_t pv_eoi_en_msr;
 
+/* Partition-wide HV MSRs, will be updated only on the first vcpu */
 uint64_t msr_hv_hypercall;
 uint64_t msr_hv_guest_os_id;
-uint64_t msr_hv_vapic;
 uint64_t msr_hv_tsc;
+
+/* Per-VCPU HV MSRs */
+uint64_t msr_hv_vapic;
 uint64_t msr_hv_crash_params[HV_CRASH_PARAMS];
 uint64_t msr_hv_runtime;
 uint64_t msr_hv_synic_control;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b1e32e95d3..563967241b 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1678,19 +1678,26 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
 kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,
   env->msr_global_ctrl);
 }
-if (has_msr_hv_hypercall) {
-kvm_msr_entry_add(cpu, HV_X64_MSR_GUEST_OS_ID,
-  env->msr_hv_guest_os_id);
-kvm_msr_entry_add(cpu, HV_X64_MSR_HYPERCALL,
-  env->msr_hv_hypercall);
+/*
+ * Hyper-V partition-wide MSRs: to avoid clearing them on cpu hot-add,
+ * only sync them to KVM on the first cpu
+ */
+if (current_cpu == first_cpu) {
+if (has_msr_hv_hypercall) {
+kvm_msr_entry_add(cpu, HV_X64_MSR_GUEST_OS_ID,
+  env->msr_hv_guest_os_id);
+kvm_msr_entry_add(cpu, HV_X64_MSR_HYPERCALL,
+  env->msr_hv_hypercall);
+}
+if (cpu->hyperv_time) {
+kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC,
+  env->msr_hv_tsc);
+}
 }
 if (cpu->hyperv_vapic) {
 kvm_msr_entry_add(cpu, HV_X64_MSR_APIC_ASSIST_PAGE,
   env->msr_hv_vapic);
 }
-if (cpu->hyperv_time) {
-kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC, env->msr_hv_tsc);
-}
 if (has_msr_hv_crash) {
 int j;
 
-- 
2.14.3

[Qemu-devel] [PATCH 2/3] hyperv: ensure SINTx msrs are reset properly

2017-11-22 Thread Roman Kagan

Initially SINTx msrs should be in "masked" state.  To ensure that
happens on *every* reset, move setting their values to
kvm_arch_vcpu_reset.

Signed-off-by: Roman Kagan 
---
 target/i386/kvm.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 563967241b..ea6e6e5f30 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -662,8 +662,6 @@ static int hyperv_handle_properties(CPUState *cs)
 env->features[FEAT_HYPERV_EAX] |= HV_VP_RUNTIME_AVAILABLE;
 }
 if (cpu->hyperv_synic) {
-int sint;
-
 if (!has_msr_hv_synic ||
 kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_SYNIC, 0)) {
 fprintf(stderr, "Hyper-V SynIC is not supported by kernel\n");
@@ -672,9 +670,6 @@ static int hyperv_handle_properties(CPUState *cs)
 
 env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE;
 env->msr_hv_synic_version = HV_SYNIC_VERSION;
-for (sint = 0; sint < ARRAY_SIZE(env->msr_hv_synic_sint); sint++) {
-env->msr_hv_synic_sint[sint] = HV_SINT_MASKED;
-}
 }
 if (cpu->hyperv_stimer) {
 if (!has_msr_hv_stimer) {
@@ -1053,6 +1048,13 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
 } else {
 env->mp_state = KVM_MP_STATE_RUNNABLE;
 }
+
+if (cpu->hyperv_synic) {
+int i;
+for (i = 0; i < ARRAY_SIZE(env->msr_hv_synic_sint); i++) {
+env->msr_hv_synic_sint[i] = HV_SINT_MASKED;
+}
+}
 }
 
 void kvm_arch_do_init_vcpu(X86CPU *cpu)
-- 
2.14.3

[Qemu-devel] [PATCH 0/3] hyperv: hv msr initialization fixes

2017-11-22 Thread Roman Kagan

These patches fix problems with hyperv msr initialization.

Evgeny Yakovlev (1):
  hyperv: set partition-wide MSRs only on first vcpu

Roman Kagan (2):
  hyperv: ensure SINTx msrs are reset properly
  hyperv: make SynIC version msr constant

 target/i386/cpu.h |  6 --
 target/i386/kvm.c | 44 
 2 files changed, 28 insertions(+), 22 deletions(-)

-- 
2.14.3

Re: [Qemu-devel] [RFC PATCH v2 1/3] s390x/ccs: add ccw-testdev emulated device

2017-11-22 Thread Cornelia Huck

On Wed,  8 Nov 2017 17:54:20 +0100
Halil Pasic  wrote:

Subject: s/ccs/css/

> Add a fake device meant for testing the correctness of our css emulation.
> 
> What we currently have is writing a Fibonacci sequence of uint32_t to the
> device via ccw write. The write is going to fail if it ain't a Fibonacci
> and indicate a device exception in scsw together with the proper residual
> count. With this we can do basic verification of data integrity.
> 
> Of course lot's of invalid inputs (besides basic data processing) can be
> tested with that as well.
> 
> We also have a no-op mode where the device just tells all-good! This
> could be useful for fuzzing.
> 
> Usage:
> 1) fire up a qemu with something like -device ccw-testdev,devno=fe.0.0001
>on the command line
> 2) exercise the device from the guest
> 
> Signed-off-by: Halil Pasic 
> ---
> 
> Introduction
> 
> 
> While discussing v1 we (more or less) decided a test device for ccw is a
> good idea. This is an RFC because there are some unresolved technical
> questions I would like to discuss.
> 
> Usage
> -
> 
> Build like this:  make CONFIG_CCW_TESTDEV=y
> 
> Changelog
> -
> 
> v1 -> v2:
> - Renamed and moved to hw/misc/
> - Changed cu_type to 0xfffe.
> - Changed chpid_type to 0x25 (questionable).
> - Extensibility: addedd both in-band (ccw) and out-of-band set mode
>   mechanism. Currently we have two modes: fib and no-op. The purpose
>   of the mode mechanism is to facilitate different behaviors. One can
>   both specify and lock a mode on the command line.
> - Added read for fib mode.
> 
> Things I've figured out and things to figure out
> ---
> 
> The zVM folks say they don't have a reserved cu_type reserved for test
> (or hypervisor use in general). So I've gone with cu_type  0xfffe because
> according to Christian  only numeric digit hex values are used, and Linux
> uses x0 as extremal value.

ack

> 
> The zVM folks say the don't have a chpid_type reserved for test (or
> hypervisor use in general. So I took 0x25 for now, which belongs to FC
> and should be safe in my opinion.

That's fine, I think.

> 
> AFAIR there was some discussion on using a dedicated diag function code.
> There are multiple such that could be re-purposed for out-of-band
> signaling reserved by zVM. For now I've decided to go with a new subcode
> for diag 0x500 as it appears to me that it requires less changes.

It means that we don't have to synchronize with anyone else, yes.

> 
> I've implemented both in-band and out-of-band signaling for influencing
> what the testdev does from the guest. We should probably get rid of one.
> I've just implemented both so we can discuss pros and cons with some
> code.
> 
> I've taken subcode 254, the last one supported at the moment. I'm not
> really happy with the way I 'took' it. Maybe all taken subcodes
> could go into hw/s390x/s390-virtio-hcall.h either via include or
> directly.

For virtio-ccw, we're going the way via the standard headers (as for
other virtio things). The last used subcode for the old virtio
transport is already in this file (in s390-next).

We currently have Documentation/virtual/kvm/s390-diag.txt in the
kernel. Not sure whether that would be a good place to document the
testdev subcode.

> 
> I'm not really happy with the side effects of moving it to hw/misc, which
> ain't s390x specific. I've pretty much bounced off the build system, so
> I would really appreciate some help here. Currently you have to say you
> want it when you do make or it won't get compiled into your qemu. IMHO
> this device only makes sense for testing and should not be rutinely
> shipped in production builds. That is why I did not touch
> default-configs/s390x-softmmu.mak.

The pci and isa testdevs seem to be included on all platforms that
support isa/pci (except for the pci testdev on s390x). They do have
their own config symbols, so they can be manually disabled should it be
desired. This seems like a good way to go for us as well.

> 
> I think, I have the most problematic places marked with a  TODO
> comment in the code.

I'll save looking at the code for another day.

[Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR

2017-11-22 Thread Radim Krčmář

QEMU saves only 8 bits of APIC LDR, which means that it does not support
x2APIC.  The correct way of fixing this would be to save and restore the
full 32 bit register, but because x2APIC LDR is a function of x2APIC ID,
we can also compute it and keep the migration format untouched.

KVM always expected the LDR format to follow the xAPIC/x2APIC standard,
but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed
xAPIC ID before switching to x2APIC, which means that QEMU has to use
the kvm_x2apic_api feature to derive the x2APIC ID.

This bug has also been addressed on the KVM side with patch 5849d75a5c9b
("KVM: lapic: Fixup LDR on load in x2apic").

Reported-by: Dr. David Alan Gilbert 
Reported-by: Yiqian Wei 
Signed-off-by: Radim Krčmář 
---
 I haven't tested that it actually fixes the bug,
 https://bugzilla.redhat.com/show_bug.cgi?id=1502591.
 
 hw/i386/kvm/apic.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
index 1df6d26816f9..89df165a04bf 100644
--- a/hw/i386/kvm/apic.c
+++ b/hw/i386/kvm/apic.c
@@ -30,6 +30,13 @@ static inline uint32_t kvm_apic_get_reg(struct 
kvm_lapic_state *kapic,
 return *((uint32_t *)(kapic->regs + (reg_id << 4)));
 }
 
+static inline uint32_t kvm_apic_calc_x2apic_ldr(APICCommonState *s)
+{
+uint32_t id = kvm_has_x2apic_api() ? s->initial_apic_id : s->id;
+
+return ((id >> 4) << 16) | (1 << (id & 0xf));
+}
+
 static void kvm_put_apic_state(APICCommonState *s, struct kvm_lapic_state 
*kapic)
 {
 int i;
@@ -41,7 +48,11 @@ static void kvm_put_apic_state(APICCommonState *s, struct 
kvm_lapic_state *kapic
 kvm_apic_set_reg(kapic, 0x2, s->id << 24);
 }
 kvm_apic_set_reg(kapic, 0x8, s->tpr);
-kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
+if (s->apicbase & MSR_IA32_APICBASE_EXTD) {
+kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s));
+} else {
+kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
+}
 kvm_apic_set_reg(kapic, 0xe, s->dest_mode << 28 | 0x0fff);
 kvm_apic_set_reg(kapic, 0xf, s->spurious_vec);
 for (i = 0; i < 8; i++) {
@@ -71,7 +82,11 @@ void kvm_get_apic_state(DeviceState *dev, struct 
kvm_lapic_state *kapic)
 }
 s->tpr = kvm_apic_get_reg(kapic, 0x8);
 s->arb_id = kvm_apic_get_reg(kapic, 0x9);
-s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24;
+if (s->apicbase & MSR_IA32_APICBASE_EXTD) {
+assert(kvm_apic_get_reg(kapic, 0xd) == kvm_apic_calc_x2apic_ldr(s));
+} else {
+s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24;
+}
 s->dest_mode = kvm_apic_get_reg(kapic, 0xe) >> 28;
 s->spurious_vec = kvm_apic_get_reg(kapic, 0xf);
 for (i = 0; i < 8; i++) {
-- 
2.14.2

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block

2017-11-22 Thread Kevin Wolf

Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben:
> On 22/11/2017 16:33, Deepa Srinivasan wrote:
> > Starting qemu with the following arguments causes qemu to segfault:
> > ... -device lsi,id=lsi0 -drive 
> > file=iscsi:<...>,format=raw,if=none,node-name=
> > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
> > 
> > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
> > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. 
> > More
> > details about the bug follow.
> > 
> > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
> > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
> > 
> > When blk_aio_ioctl() is executed from within a coroutine context (e.g.
> > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
> > the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
> > 
> > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
> > 
> > BlkRwCo *rwco = &acb->rwco;
> > 
> > rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
> >  rwco->qiov->iov[0].iov_base);  <--- qiov is
> >  invalid 
> > here
> > ...
> > 
> > In the case when blk_aio_ioctl() is called from a non-coroutine context,
> > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
> > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
> > execution is complete, control returns to blk_aio_ioctl_entry() after the 
> > call
> > to blk_co_ioctl(). There is no invalid reference after this point, but the
> > function is still holding on to invalid pointers.
> > 
> > The fix is to allocate memory for the QEMUIOVector and struct iovec as part 
> > of
> > the request struct which the IO buffer is part of. The memory for this 
> > struct is
> > guaranteed to be valid till the AIO is completed.
> > 
> > Signed-off-by: Deepa Srinivasan 
> > Signed-off-by: Konrad Rzeszutek Wilk 
> > Reviewed-by: Mark Kanda 
> > ---
> >  block/block-backend.c  | 13 ++---
> >  hw/block/virtio-blk.c  |  9 -
> >  hw/scsi/scsi-disk.c| 10 +-
> >  hw/scsi/scsi-generic.c |  9 -
> >  include/sysemu/block-backend.h |  2 +-
> >  5 files changed, 28 insertions(+), 15 deletions(-)
> > 
> > diff --git a/block/block-backend.c b/block/block-backend.c
> > index baef8e7..c275827 100644
> > --- a/block/block-backend.c
> > +++ b/block/block-backend.c
> > @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque)
> >  blk_aio_complete(acb);
> >  }
> >  
> > -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void 
> > *buf,
> > +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, 
> > QEMUIOVector *qiov,
> >BlockCompletionFunc *cb, void *opaque)
> 
> I think this is not the best way to fix the bug, because it adds extra
> unnecessary code in the callers.
> 
> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
> same for blk_aio_prwv's "qiov" argument?
> 
> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
> use rwco->buf.

But the same struct is used for read and write requests that do use an
actual QEMUIOVector and not just a linear buffer.

Kevin

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block

2017-11-22 Thread Kevin Wolf

Am 22.11.2017 um 18:06 hat Stefan Hajnoczi geschrieben:
> On Wed, Nov 22, 2017 at 07:33:28AM -0800, Deepa Srinivasan wrote:
> > Starting qemu with the following arguments causes qemu to segfault:
> > ... -device lsi,id=lsi0 -drive 
> > file=iscsi:<...>,format=raw,if=none,node-name=
> > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
> > 
> > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
> > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. 
> > More
> > details about the bug follow.
> > 
> > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
> > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
> > 
> > When blk_aio_ioctl() is executed from within a coroutine context (e.g.
> > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
> > the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
> > 
> > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
> > 
> > BlkRwCo *rwco = &acb->rwco;
> > 
> > rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
> >  rwco->qiov->iov[0].iov_base);  <--- qiov is
> >  invalid 
> > here
> > ...
> > 
> > In the case when blk_aio_ioctl() is called from a non-coroutine context,
> > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
> > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
> > execution is complete, control returns to blk_aio_ioctl_entry() after the 
> > call
> > to blk_co_ioctl(). There is no invalid reference after this point, but the
> > function is still holding on to invalid pointers.
> > 
> > The fix is to allocate memory for the QEMUIOVector and struct iovec as part 
> > of
> > the request struct which the IO buffer is part of. The memory for this 
> > struct is
> > guaranteed to be valid till the AIO is completed.
> 
> Thanks for the patch!
> 
> AIO APIs currently don't require the caller to match qiov's lifetime to
> the I/O request lifetime.  This patch changes that for blk_aio_ioctl()
> only.  If we want to do this consistently then all aio callers need to
> be audited and fixed.
> 
> The alternative is to make the API copy qiov when necessary.  That is
> less efficient but avoids modifying all callers.
> 
> Either way, the lifetime of qiov must be consistent across all aio APIs,
> not just blk_aio_ioctl().

Don't all blk_aio_*() APIs that take a qiov pointer require that it
remains valid until the request completes? I don't think they are copied
anywhere for blk_aio_preadv/pwritev() before being passed to the block
driver.

So this does look consistent with the existing functions to me.

Kevin


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 01/12] e500: add board config options

2017-11-22 Thread Michael Davidsaver

On 11/21/2017 09:28 PM, David Gibson wrote:
> On Sun, Nov 19, 2017 at 09:24:09PM -0600, Michael Davidsaver wrote:
>> allow board code to skip common NIC and guest image setup
>> and configure decrementor frequency.
>> Existing boards unchanged.
>>
>> Signed-off-by: Michael Davidsaver 
> 
> So, it's spelled "decrementer".
> 
> Other than that, the patch looks correct.  However having a big common
> function for overall init with a pile of ad-hoc configuration
> parameters is usually not a great way to go.  I think what we want
> instead is to eliminate ppce500_init(), instead doing the setup logic
> separately in each of the e500 machines.   The large common slabs of
> code can be helpers in e500.c, but the overall logic - including most
> of the things controlled by the current params - would be under the
> individual machine's control.

I agree that ppce500_init() is unwieldy, and am willing to spend some
time on cleanup.  I'm just not sure what to do.

I can see moving initialization of at least the serial, i2c, gpio, and
possibly MPIC and PCI host bridge into the e500-ccsr class.

I'm not sure what to do with all the device tree construction code in
e500.c, which has a bunch of hard coded register offsets.  A comment
here summarizes the problem nicely.

> /* TODO: parameterize */
> #define MPC8544_CCSRBAR_SIZE   0x0010ULL
> #define MPC8544_MPIC_REGS_OFFSET   0x4ULL

It seems desirable to avoid having these offsets appear in two different
files, which could allow them to get out of sync.

I had the idea of using an Interface to split up device tree
construction, and was encouraged to find PnvXScomInterfaceClass which
seems be a way of combining device tree construction in a device class.
Is this the way to go?

Some guidance would be appreciated.

Michael


>> ---
>>  hw/ppc/e500.c  | 8 ++--
>>  hw/ppc/e500.h  | 3 +++
>>  hw/ppc/e500plat.c  | 1 +
>>  hw/ppc/mpc8544ds.c | 1 +
>>  4 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
>> index 5cf0dabef3..9e7e1b29c4 100644
>> --- a/hw/ppc/e500.c
>> +++ b/hw/ppc/e500.c
>> @@ -826,7 +826,7 @@ void ppce500_init(MachineState *machine, PPCE500Params 
>> *params)
>>  env->mpic_iack = params->ccsrbar_base +
>>   MPC8544_MPIC_REGS_OFFSET + 0xa0;
>>  
>> -ppc_booke_timers_init(cpu, 4, PPC_TIMER_E500);
>> +ppc_booke_timers_init(cpu, params->decrementor_freq, 
>> PPC_TIMER_E500);
>>  
>>  /* Register reset handler */
>>  if (!i) {
>> @@ -899,7 +899,7 @@ void ppce500_init(MachineState *machine, PPCE500Params 
>> *params)
>>  if (!pci_bus)
>>  printf("couldn't create PCI controller!\n");
>>  
>> -if (pci_bus) {
>> +if (pci_bus && !params->tsec_nic) {
>>  /* Register network interfaces. */
>>  for (i = 0; i < nb_nics; i++) {
>>  pci_nic_init_nofail(&nd_table[i], pci_bus, "virtio", NULL);
>> @@ -948,6 +948,10 @@ void ppce500_init(MachineState *machine, PPCE500Params 
>> *params)
>>  sysbus_mmio_get_region(s, 0));
>>  }
>>  
>> +if (params->skip_load) {
>> +return;
>> +}
>> +
>>  /* Load kernel. */
>>  if (machine->kernel_filename) {
>>  kernel_base = cur_base;
>> diff --git a/hw/ppc/e500.h b/hw/ppc/e500.h
>> index 70ba1d8f4f..40f72f2de2 100644
>> --- a/hw/ppc/e500.h
>> +++ b/hw/ppc/e500.h
>> @@ -22,6 +22,9 @@ typedef struct PPCE500Params {
>>  hwaddr pci_mmio_base;
>>  hwaddr pci_mmio_bus_base;
>>  hwaddr spin_base;
>> +uint32_t decrementor_freq; /* in Hz */
>> +bool skip_load;
>> +bool tsec_nic;
>>  } PPCE500Params;
>>  
>>  void ppce500_init(MachineState *machine, PPCE500Params *params);
>> diff --git a/hw/ppc/e500plat.c b/hw/ppc/e500plat.c
>> index e59e80fb9e..3d07987bd1 100644
>> --- a/hw/ppc/e500plat.c
>> +++ b/hw/ppc/e500plat.c
>> @@ -47,6 +47,7 @@ static void e500plat_init(MachineState *machine)
>>  .pci_mmio_base = 0xCULL,
>>  .pci_mmio_bus_base = 0xE000ULL,
>>  .spin_base = 0xFEF00ULL,
>> +.decrementor_freq = 4,
>>  };
>>  
>>  /* Older KVM versions don't support EPR which breaks guests when we 
>> announce
>> diff --git a/hw/ppc/mpc8544ds.c b/hw/ppc/mpc8544ds.c
>> index 1717953ec7..6d9931c475 100644
>> --- a/hw/ppc/mpc8544ds.c
>> +++ b/hw/ppc/mpc8544ds.c
>> @@ -40,6 +40,7 @@ static void mpc8544ds_init(MachineState *machine)
>>  .pci_mmio_bus_base = 0xC000ULL,
>>  .pci_pio_base = 0xE100ULL,
>>  .spin_base = 0xEF00ULL,
>> +.decrementor_freq = 4,
>>  };
>>  
>>  if (machine->ram_size > 0xc000) {
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [RFC PATCH 1/1] s390x: deprecate s390-squash-mcss machine prop

2017-11-22 Thread Cornelia Huck

On Wed, 22 Nov 2017 15:18:32 +0100
Halil Pasic  wrote:

> With the cssids unrestricted (commit  "s390x/css: unrestrict

I think you can simply reference the title only. I plan to merge both
in one go.

> cssids") the s390-squash-mcss machine property should not be used.
> Actually libvirt never supported this, so the expectation is that
> removing it should be pretty painless.  But let's play nice and deprecate
> it first.
> 
> Signed-off-by: Halil Pasic 
> ---
> 
> I'm not sure about the warnig I generate if 's390-squash-mcss'
> is set to 'on':
> * There is prior art where we don't warn_report but info_report
> (commit 83926ad527) 

Tbh, I don't really care about one or the other.

> * Warning iff specified on the command line regardless of value
> would probably be cleaner. Yet I'm not sure where would I need
> to hook in to do that.

Yeah, I don't see a good way to do that either. OTOH, I don't really
think people set this to false explicitly.

> 
> I would like to reference a commit for "s390x/css: unrestrict cssids"
> which is currently in RFC status on the list.

As said, I'll just merge them in one go. We can reference the commit in
the changelog, though.

There also should be an entry in
https://wiki.qemu.org/Features/LegacyRemoval, and we should update the
advice in
https://wiki.qemu.org/index.php/Features/Channel_I/O_Passthrough.

> ---
>  hw/s390x/s390-virtio-ccw.c | 5 -
>  qemu-doc.texi  | 6 ++
>  qemu-options.hx| 6 +-
>  3 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 6a57f94197..c71e1d7e55 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -554,6 +554,9 @@ static inline void machine_set_squash_mcss(Object *obj, 
> bool value,
>  S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
>  
>  ms->s390_squash_mcss = value;
> +if (ms->s390_squash_mcss) {
> +warn_report("the machine property 's390-squash-mcss' is deprecated");

s/the/The/

You should also add an explanation ("obsoleted by lifting the cssid
restrictions").

> +}
>  }
>  
>  static inline void s390_machine_initfn(Object *obj)
> @@ -583,7 +586,7 @@ static inline void s390_machine_initfn(Object *obj)
>  object_property_add_bool(obj, "s390-squash-mcss",
>   machine_get_squash_mcss,
>   machine_set_squash_mcss, NULL);
> -object_property_set_description(obj, "s390-squash-mcss",
> +object_property_set_description(obj, "s390-squash-mcss", "(deprecated) "
>  "enable/disable squashing subchannels into the default css",
>  NULL);
>  object_property_set_bool(obj, false, "s390-squash-mcss", NULL);
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index d383ac44d4..90aa92d8b9 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -2500,6 +2500,12 @@ enabled via the ``-machine usb=on'' argument.
>  
>  The ``-nodefconfig`` argument is a synonym for ``-no-user-config``.
>  
> +@subsection -machine virtio-ccw,s390-squash-mcss=on|off (since 2.12.0)
> +
> +Since version 2.12.0 the cssid can be chosen freely. Instead of squashing
> +mcss for guest that don't support multiple channel subsystems we recommend
> +putting all devices into the default channel subsystem (that is 0xfe).

"The ``s390-squash-mcss=on`` property has been obsoleted by allowing the
cssid to be chosen freely. Instead of squashing subchannels into the
default channel subsystem image for guests that do not support multiple
channel subsystems, all devices can be put into the default channel
subsystem image."

> +
>  @section qemu-img command line arguments
>  
>  @subsection convert -s (since 2.0.0)
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 3728e9b4dd..53b13ec203 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -43,7 +43,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>  "suppress-vmdesc=on|off disables self-describing 
> migration (default=off)\n"
>  "nvdimm=on|off controls NVDIMM support (default=off)\n"
>  "enforce-config-section=on|off enforce configuration 
> section migration (default=off)\n"
> -"s390-squash-mcss=on|off controls support for squashing 
> into default css (default=off)\n",
> +"s390-squash-mcss=on|off (deprecated) controls support 
> for squashing into default css (default=off)\n",
>  QEMU_ARCH_ALL)
>  STEXI
>  @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
> @@ -98,6 +98,10 @@ Enables or disables NVDIMM support. The default is off.
>  @item s390-squash-mcss=on|off
>  Enables or disables squashing subchannels into the default css.
>  The default is off.
> +NOTE: This property is deprecated and may be removed in future releases.

s/may/will/

Let them complain :)

> +Since version 2.12.0 the cssid can be chosen freely. Instead of squashing
> +mcss for guest that don't

Re: [Qemu-devel] [PATCH 2/5] nbd/server: add nbd_opt_{read, drop} to track client->optlen

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

The subject line says what and sort of implies why, but the commit
message body is rather sparse.

> ---
>  nbd/server.c | 34 ++
>  1 file changed, 22 insertions(+), 12 deletions(-)

Not a net win in lines of code, so a bit more justification may be helpful.

> 
> diff --git a/nbd/server.c b/nbd/server.c
> index bccc0274e7..c9445a89e9 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -139,6 +139,19 @@ static void nbd_client_receive_next_request(NBDClient 
> *client);
>  
>  */
>  
> +static inline int nbd_opt_read(NBDClient *client, void *buffer, size_t size,
> +   Error **errp)
> +{
> +client->optlen -= size;
> +return qio_channel_read_all(client->ioc, buffer, size, errp) < 0 ? -EIO 
> : 0;

Should this code check that client->optlen >= size, and issue the
appropriate error message if not? That may centralize some of the error
checking repeated elsewhere.

> +}
> +
> +static inline int nbd_opt_drop(NBDClient *client, size_t size, Error **errp)
> +{
> +client->optlen -= size;
> +return nbd_drop(client->ioc, size, errp);

Same question on size validation.

> +}
> +
>  /* Send a reply header, including length, but no payload.
>   * Return -errno on error, 0 on success. */
>  static int nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type,
> @@ -299,7 +312,7 @@ static int nbd_negotiate_handle_export_name(NBDClient 
> *client,
>  error_setg(errp, "Bad length received");
>  return -EINVAL;
>  }
> -if (nbd_read(client->ioc, name, client->optlen, errp) < 0) {
> +if (nbd_opt_read(client, name, client->optlen, errp) < 0) {
>  error_prepend(errp, "read failed: ");
>  return -EINVAL;

Here's an example caller that had a previous size validation.

>  }
> @@ -383,40 +396,36 @@ static int nbd_negotiate_handle_info(NBDClient *client, 
> uint16_t myflags,
>  msg = "overall request too short";
>  goto invalid;
>  }
> -if (nbd_read(client->ioc, &namelen, sizeof(namelen), errp) < 0) {
> +if (nbd_opt_read(client, &namelen, sizeof(namelen), errp) < 0) {
>  return -EIO;

And again, a size validation right before the call.

>  }
>  be32_to_cpus(&namelen);
> -client->optlen -= sizeof(namelen);

Okay, part of the refactoring means you don't have to remember to track
remaining size separately from reading; that's a nice feature.  I
suspect it is also possible to assert that client->optlen is zero before
reading the next opt (I'm reviewing the patch in order, we'll see if I
come back here). [1]

>  if (namelen > client->optlen - sizeof(requests) ||
>  (client->optlen - namelen) % 2)
>  {
>  msg = "name length is incorrect";
>  goto invalid;
>  }
> -if (nbd_read(client->ioc, name, namelen, errp) < 0) {
> +if (nbd_opt_read(client, name, namelen, errp) < 0) {
>  return -EIO;
>  }

Another size check prior to the call (actually checking multiple
conditions)...

>  name[namelen] = '\0';
> -client->optlen -= namelen;
>  trace_nbd_negotiate_handle_export_name_request(name);
>  
> -if (nbd_read(client->ioc, &requests, sizeof(requests), errp) < 0) {
> +if (nbd_opt_read(client, &requests, sizeof(requests), errp) < 0) {
>  return -EIO;

...and no direct size check before this call, because the earlier size
checks already covered it.  Arguably, the in-place size check error
messages may be slightly more precise, but any message at all about
unexpected sizing is appropriate (given that only broken clients should
be sending incorrect sizes) - so I'm still leaning towards a stronger
refactoring that puts the size check in the helper function.

> @@ -521,7 +530,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, 
> uint16_t myflags,
>  return rc;
>  
>   invalid:
> -if (nbd_drop(client->ioc, client->optlen, errp) < 0) {
> +if (nbd_opt_drop(client, client->optlen, errp) < 0) {
>  return -EIO;
>  }
>  return nbd_negotiate_send_rep_err(client, NBD_REP_ERR_INVALID,
> @@ -715,7 +724,7 @@ static int nbd_negotiate_options(NBDClient *client, 
> uint16_t myflags,
>  return -EINVAL;
>  
>  default:
> -if (nbd_drop(client->ioc, length, errp) < 0) {
> +if (nbd_opt_drop(client, length, errp) < 0) {

Isn't length == client->optlen here? If so, can we omit the length
parameter to nbd_opt_drop(), and instead have it defined as dropping to
the end of the current option?

> @@ -821,6 +830,7 @@ static int nbd_negotiate_options(NBDClient *client, 
> uint16_t myflags,
>  if (ret < 0) {
>  return ret;
>  }
> +assert(client->optlen == 0);

[1] Ah, you did add the assertion.

I'm going to propose a variant on this patch, to cover the points I made
above about ease-of-use (a

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block

2017-11-22 Thread Stefan Hajnoczi

On Wed, Nov 22, 2017 at 07:33:28AM -0800, Deepa Srinivasan wrote:
> Starting qemu with the following arguments causes qemu to segfault:
> ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
> iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
> 
> This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
> blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
> details about the bug follow.
> 
> blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
> coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
> 
> When blk_aio_ioctl() is executed from within a coroutine context (e.g.
> iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
> the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
> 
> When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
> 
> BlkRwCo *rwco = &acb->rwco;
> 
> rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
>  rwco->qiov->iov[0].iov_base);  <--- qiov is
>  invalid here
> ...
> 
> In the case when blk_aio_ioctl() is called from a non-coroutine context,
> blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
> qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
> execution is complete, control returns to blk_aio_ioctl_entry() after the call
> to blk_co_ioctl(). There is no invalid reference after this point, but the
> function is still holding on to invalid pointers.
> 
> The fix is to allocate memory for the QEMUIOVector and struct iovec as part of
> the request struct which the IO buffer is part of. The memory for this struct 
> is
> guaranteed to be valid till the AIO is completed.

Thanks for the patch!

AIO APIs currently don't require the caller to match qiov's lifetime to
the I/O request lifetime.  This patch changes that for blk_aio_ioctl()
only.  If we want to do this consistently then all aio callers need to
be audited and fixed.

The alternative is to make the API copy qiov when necessary.  That is
less efficient but avoids modifying all callers.

Either way, the lifetime of qiov must be consistent across all aio APIs,
not just blk_aio_ioctl().


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] Add a blog post about HAXM acceleration on Windows

2017-11-22 Thread Thomas Huth

On 22.11.2017 15:52, Yu Ning wrote:
> 
> On 11/22/2017 20:25, Thomas Huth wrote:
[...]
>> Would it be OK to remove the qemu-debian-wheezy-gui-with-haxm.png from
>> the blog post?
> 
> Yes, I'm fine with removing it.  Sorry I haven't installed Jekyll and
> didn't test the rendering.
> 
> Would you confirm whether that's the only change I need to make in v2?

No need to respin, since this was just a one-line change, I was able to
do it on my own (I still removed the screenshot, even if it seemed to be
working with Paolo's patch to the CSS, since the screenshot looked just
a bit too big for the blog - I hope that's OK for you).

So the blog post is now online:

 https://www.qemu.org/2017/11/22/haxm-usage-windows/

Thanks again,
 Thomas

[Qemu-devel] [Bug 1673976] Re: linux-user clone() can't handle glibc posix_spawn() (causes locale-gen to assert)

2017-11-22 Thread Peter Maydell

Thanks for tracking down the glibc change; I will try to set up a chroot
with a more recent glibc to see whether we can do something that fixes
that posix_spawn implementation...

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1673976

Title:
  linux-user clone() can't handle glibc posix_spawn() (causes locale-gen
  to assert)

Status in QEMU:
  New

Bug description:
  I'm running a command (locale-gen) inside of an armv7h chroot mounted
  on my x86_64 desktop by putting qemu-arm-static into /usr/bin/ of the
  chroot file system and I get a core dump.

  locale-gen
  Generating locales...
    en_US.UTF-8...localedef: ../sysdeps/unix/sysv/linux/spawni.c:360: 
__spawnix: Assertion `ec >= 0' failed.
  qemu: uncaught target signal 6 (Aborted) - core dumped
  /usr/bin/locale-gen: line 41:34 Aborted (core dumped) 
localedef -i $input -c -f $charset -A /usr/share/locale/locale.alias $locale

  I've done this same thing successfully for years, but this breakage
  has appeared some time in the last 3 or so months. Possibly with the
  update to qemu version 2.8.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1673976/+subscriptions

Re: [Qemu-devel] [qemu-web PATCH] css: avoid over-large images

2017-11-22 Thread Cornelia Huck

On Wed, 22 Nov 2017 17:48:28 +0100
Christian Borntraeger  wrote:

> On 11/22/2017 05:37 PM, Paolo Bonzini wrote:
> > Make sure that images are scaled to fit inside their container.  
> 
> Thanks God. I was thinking "what is wrong with the channel subsystem in 
> QEMU?" when
> reading your subject.

Me too. Hurray for overloaded TLAs ;)

>  
> > Tested-by: Thomas Huth 
> > Reviewed-by: Thomas Huth 
> > Signed-off-by: Paolo Bonzini 
> > ---
> >  assets/css/style.css | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/assets/css/style.css b/assets/css/style.css
> > index 2d4fe0c..b828887 100644
> > --- a/assets/css/style.css
> > +++ b/assets/css/style.css
> > @@ -200,6 +200,10 @@
> > 
> > /* Images */
> > 
> > +   img {
> > +   max-width: 100%;
> > +   }
> > +
> > .image
> > {
> > display: inline-block;
> >   
> 
>

Re: [Qemu-devel] [qemu-web PATCH] css: avoid over-large images

2017-11-22 Thread Christian Borntraeger

On 11/22/2017 05:37 PM, Paolo Bonzini wrote:
> Make sure that images are scaled to fit inside their container.

Thanks God. I was thinking "what is wrong with the channel subsystem in QEMU?" 
when
reading your subject.
 
> Tested-by: Thomas Huth 
> Reviewed-by: Thomas Huth 
> Signed-off-by: Paolo Bonzini 
> ---
>  assets/css/style.css | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/assets/css/style.css b/assets/css/style.css
> index 2d4fe0c..b828887 100644
> --- a/assets/css/style.css
> +++ b/assets/css/style.css
> @@ -200,6 +200,10 @@
> 
>   /* Images */
> 
> + img {
> + max-width: 100%;
> + }
> +
>   .image
>   {
>   display: inline-block;
>

Re: [Qemu-devel] [PATCH 08/12] e500: add mpc8540 i2c controller to ccsr

2017-11-22 Thread Michael Davidsaver

On 11/21/2017 10:08 PM, David Gibson wrote:
> On Sun, Nov 19, 2017 at 09:24:16PM -0600, Michael Davidsaver wrote:
>> Signed-off-by: Michael Davidsaver 
> 
> You're adding what seems to be a fairly specific device to the general
> e500 init - this again suggests that it should be split, putting
> creation of devices under control of individual machines.

I'll address the ppce500_init() part of this question separately.

In addition to the MPC8540, I find that documentation for the MPC8544
(modeled) and P2020 (un-modeled) show the same i2c controller registers.
 So I think it's reasonable for the generic ppce500 machine to have it
as well.

For what it's worth, the Linux driver for this unit
(drivers/i2c/busses/i2c-mpc.c) lists compatibility with a number of
other freescale SoCs with only some differences in clock selection (not
modeled).  The description for the module is:

> MODULE_DESCRIPTION("I2C-Bus adapter for MPC107 bridge and "
>"MPC824x/83xx/85xx/86xx/512x/52xx processors");



>> ---
>>  hw/ppc/e500.c | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
>> index 6f77844303..bef7d313d4 100644
>> --- a/hw/ppc/e500.c
>> +++ b/hw/ppc/e500.c
>> @@ -861,6 +861,14 @@ void ppce500_init(MachineState *machine, PPCE500Params 
>> *params)
>>  qdev_init_nofail(dev);
>>  ccsr_addr_space = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0);
>>  
>> +dev = qdev_create(NULL, "mpc8540-i2c");
>> +object_property_add_child(qdev_get_machine(), "i2c[*]",
>> +  OBJECT(dev), NULL);
>> +qdev_init_nofail(dev);
>> +s = SYS_BUS_DEVICE(dev);
>> +memory_region_add_subregion(ccsr_addr_space, 0x3000,
>> +sysbus_mmio_get_region(s, 0));
>> +
>>  mpicdev = ppce500_init_mpic(machine, params, ccsr_addr_space, irqs);
>>  
>>  /* Serial */
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/5] nbd/server: refactor negotiation functions parameters

2017-11-22 Thread Eric Blake

On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> Instead of passing currently negotiating option and its length to
> many of negotiation functions let's just store them on NBDClient
> struct to be state-variables of negotiation phase.
> 
> This unifies semantics of negotiation functions and this allows to
> track changes of remaining option length in future patches.

"allows to $verb" is not idiomatic English; the options are "allows
$subject to $verb" or "allows ${verb}ing"

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/server.c | 164 
> +--
>  1 file changed, 82 insertions(+), 82 deletions(-)

Not really a net win in lines of code; but I'll have to see how the
other patches in the series are made easier by this.  It relies on the
fact that the NBD protocol states that negotiation phase is synchronous
(at one point, it was argued that negotiation, like transmission, could
allow out-of-order replies from the server, but we got that fixed in the
spec to match existing practice, and this patch cements the fact that we
are processing exactly one option at a time).

Mechanically, the conversion looks sane, so:
Reviewed-by: Eric Blake 

We'll see how I feel at the end of the series.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] Add a blog post about HAXM acceleration on Windows

2017-11-22 Thread Paolo Bonzini

On 22/11/2017 16:37, Thomas Huth wrote:
> On 22.11.2017 15:42, Paolo Bonzini wrote:
>> On 22/11/2017 13:25, Thomas Huth wrote:
>>> diff --git a/assets/css/skel-noscript.css b/assets/css/skel-noscript.css
>>> index e0a141e..f574b9f 100644
>>> --- a/assets/css/skel-noscript.css
>>> +++ b/assets/css/skel-noscript.css
>>> @@ -27,6 +27,7 @@
>>> -o-box-sizing: border-box;
>>> -ms-box-sizing: border-box;
>>> box-sizing: border-box;
>>> +   max-width:100%;
>>> }
>>>
>>> /* Rows */
>>>
>>> Paolo, does that look OK for you?
>>
>> That is black magic and I'm a bit afraid to touch it.
>>
>> Does this also work for you?
>>
>> diff --git a/assets/css/style.css b/assets/css/style.css
>> index 2d4fe0c..d482fe9 100644
>> --- a/assets/css/style.css
>> +++ b/assets/css/style.css
>> @@ -200,6 +200,10 @@
>>
>>  /* Images */
>>
>> +img {
>> +max-width: 100%;
>> +}
>> +
> 
> Yes, that seems to work, too.

Okay, I've pushed it.  Can you apply Yu Ning's patch?

Thanks,

Paolo

[Qemu-devel] [qemu-web PATCH] css: avoid over-large images

2017-11-22 Thread Paolo Bonzini

Make sure that images are scaled to fit inside their container.

Tested-by: Thomas Huth 
Reviewed-by: Thomas Huth 
Signed-off-by: Paolo Bonzini 
---
 assets/css/style.css | 4 
 1 file changed, 4 insertions(+)

diff --git a/assets/css/style.css b/assets/css/style.css
index 2d4fe0c..b828887 100644
--- a/assets/css/style.css
+++ b/assets/css/style.css
@@ -200,6 +200,10 @@
 
/* Images */
 
+   img {
+   max-width: 100%;
+   }
+
.image
{
display: inline-block;
-- 
2.14.3

Re: [Qemu-devel] [PATCH 2/2] pc-bios/s390-ccw: zero out bss section

2017-11-22 Thread Thomas Huth

On 22.11.2017 15:26, Christian Borntraeger wrote:
> The QEMU ELF loader does not zero the bss segment.
> This resulted in several bugs, e.g. see
> 
> commit 5d739a4787a5 (s390-ccw.img: Fix sporadic errors with ccw boot image - 
> initialize css)
> commit 6a40fa2669d3 (s390-ccw.img: Initialize next_idx)
> commit 8775d91a0f42 (pc-bios/s390-ccw: Fix problem with invalid virtio-scsi 
> LUN when rebooting)
> 
> Lets fix this once and forever by letting the BIOS zero the bss itself.
> 
> Suggested-by: Alexander Graf 
> Signed-off-by: Christian Borntraeger 
> ---
>  pc-bios/s390-ccw/start.S | 30 +++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/pc-bios/s390-ccw/start.S b/pc-bios/s390-ccw/start.S
> index 43f9bd2..eb8d024 100644
> --- a/pc-bios/s390-ccw/start.S
> +++ b/pc-bios/s390-ccw/start.S
> @@ -3,7 +3,7 @@
>   * into the pc-bios directory of qemu.
>   *
>   * Copyright (c) 2013 Alexander Graf 
> - * Copyright 2013 IBM Corp.
> + * Copyright IBM Corp. 2013, 2017
>   *
>   * This work is licensed under the terms of the GNU GPL, version 2 or (at
>   * your option) any later version. See the COPYING file in the top-level
> @@ -13,8 +13,32 @@
>  .globl _start
>  _start:
>  
> -larl %r15, stack + 0x8000/* Set up stack */
> -jmain/* And call C */
> + larl   %r15, stack + 0x8000 /* Set up stack */
> +
> + /* clear bss */
> + larl %r2, __bss_start
> + larl %r3, _end
> + slgr %r3, %r2   /* get sizeof bss */
> + ltgr%r3,%r3 /* bss emtpy? */
> + jz  done
> + aghi%r3,-1
> + srlg%r4,%r3,8   /* how many 256 byte chunks? */
> + ltgr%r4,%r4
> + lgr %r1,%r2
> + jz  remainder
> +loop:
> + xc  0(256,%r1),0(%r1)
> + la  %r1,256(%r1)
> + brctg   %r4,loop
> +remainder:
> + larl%r2,memsetxc
> + ex  %r3,0(%r2)

Wow, with EXECUTE :-)

> +done:
> + j  main /* And call C */
> +
> +memsetxc:
> + xc  0(1,%r1),0(%r1)
> +
>  
>  /*
>   * void disabled_wait(void)
> 

Looks good to me:

Reviewed-by: Thomas Huth

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block

2017-11-22 Thread Paolo Bonzini

On 22/11/2017 16:33, Deepa Srinivasan wrote:
> Starting qemu with the following arguments causes qemu to segfault:
> ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
> iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
> 
> This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
> blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
> details about the bug follow.
> 
> blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
> coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
> 
> When blk_aio_ioctl() is executed from within a coroutine context (e.g.
> iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
> the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
> 
> When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
> 
> BlkRwCo *rwco = &acb->rwco;
> 
> rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
>  rwco->qiov->iov[0].iov_base);  <--- qiov is
>  invalid here
> ...
> 
> In the case when blk_aio_ioctl() is called from a non-coroutine context,
> blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
> qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
> execution is complete, control returns to blk_aio_ioctl_entry() after the call
> to blk_co_ioctl(). There is no invalid reference after this point, but the
> function is still holding on to invalid pointers.
> 
> The fix is to allocate memory for the QEMUIOVector and struct iovec as part of
> the request struct which the IO buffer is part of. The memory for this struct 
> is
> guaranteed to be valid till the AIO is completed.
> 
> Signed-off-by: Deepa Srinivasan 
> Signed-off-by: Konrad Rzeszutek Wilk 
> Reviewed-by: Mark Kanda 
> ---
>  block/block-backend.c  | 13 ++---
>  hw/block/virtio-blk.c  |  9 -
>  hw/scsi/scsi-disk.c| 10 +-
>  hw/scsi/scsi-generic.c |  9 -
>  include/sysemu/block-backend.h |  2 +-
>  5 files changed, 28 insertions(+), 15 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index baef8e7..c275827 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque)
>  blk_aio_complete(acb);
>  }
>  
> -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void 
> *buf,
> +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, 
> QEMUIOVector *qiov,
>BlockCompletionFunc *cb, void *opaque)

I think this is not the best way to fix the bug, because it adds extra
unnecessary code in the callers.

Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
same for blk_aio_prwv's "qiov" argument?

Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
use rwco->buf.

Thanks,

Paolo

>  {
> -QEMUIOVector qiov;
> -struct iovec iov;
> -
> -iov = (struct iovec) {
> -.iov_base = buf,
> -.iov_len = 0,
> -};
> -qemu_iovec_init_external(&qiov, &iov, 1);
> -
> -return blk_aio_prwv(blk, req, 0, &qiov, blk_aio_ioctl_entry, 0, cb, 
> opaque);
> +return blk_aio_prwv(blk, req, 0, qiov, blk_aio_ioctl_entry, 0, cb, 
> opaque);
>  }
>  
>  int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 05d1440..ed9f774 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -151,6 +151,8 @@ out:
>  typedef struct {
>  VirtIOBlockReq *req;
>  struct sg_io_hdr hdr;
> +QEMUIOVector qiov;
> +struct iovec iov;
>  } VirtIOBlockIoctlReq;
>  
>  static void virtio_blk_ioctl_complete(void *opaque, int status)
> @@ -298,7 +300,12 @@ static int virtio_blk_handle_scsi_req(VirtIOBlockReq 
> *req)
>  ioctl_req->hdr.sbp = elem->in_sg[elem->in_num - 3].iov_base;
>  ioctl_req->hdr.mx_sb_len = elem->in_sg[elem->in_num - 3].iov_len;
>  
> -acb = blk_aio_ioctl(blk->blk, SG_IO, &ioctl_req->hdr,
> +ioctl_req->iov.iov_base = &ioctl_req->hdr;
> +ioctl_req->iov.iov_len = 0;
> +
> +qemu_iovec_init_external(&ioctl_req->qiov, &ioctl_req->iov, 1);
> +
> +acb = blk_aio_ioctl(blk->blk, SG_IO, &ioctl_req->qiov,
>  virtio_blk_ioctl_complete, ioctl_req);
>  if (!acb) {
>  g_free(ioctl_req);
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 1243117..7cbe18d 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -2636,6 +2636,9 @@ typedef struct SCSIBlockReq {
>  SCSIDiskReq req;
>  sg_io_hdr_t io_header;
>  
> +QEMUIOVector qiov;
> +struct iovec iov;
> +
>  /* Selected bytes of the original CDB, copied into our own CDB.  */
>  uint8_t cmd, cdb1, group_number;
>  
> @@ -2722,7 +272

Re: [Qemu-devel] [RFC PATCH 1/1] s390x/css: unresrict cssids

2017-11-22 Thread Cornelia Huck

On Wed, 22 Nov 2017 15:45:56 +0100
Boris Fiuczynski  wrote:

> On 11/22/2017 01:13 PM, Cornelia Huck wrote:
> >> +object_class_property_add_bool(klass, "cssid-unrestricted",
> >> +   prop_get_true, NULL, NULL);  
> > This looks really, really strange. This is a property that is always
> > true if it exists.
> >
>  Won't argue about that. The libvirt guys are actually not interested
>  int he value at all: only that there is such a property.  
> >>> So what about a machine property? Wouldn't that help as well?
> >>>  
> >> Technically it is doable. The property would be still a weird
> >> one, but from QEMU perspective in a less weird place. I was also
> >> arguing that from OO perspective this kind of a don't care about
> >> it's value property is weird, but AFAIK not having the info if
> >> we can do something with a device at the device is weird from
> >> Libvirt perspective. I'm really uncomforble with speaking for
> >> Libvirt/Boris. Hope he will make his point tomorrow.
> >>  
> >>> Or a css object?
> >>>  
> >> My first internal-only version used this approach, but
> >> I was asked to do it like this.  
> > If we formulate this rather as "we now expose the default css", we (a)
> > have something that really makes the most sense at a channel subsystem
> > or machine level, and (b) can be detected by libvirt as an indicator of
> > "no restriction for virtual vs. non-virtual".  
> 
> I think that there are good reasons for using a device property as well 
> as for using a machine property. Technically both options are possible 
> (even for libvirt :) without too much rope...). At first my personal 
> choice was to express the changed behavior/capability on the device 
> level since that is what a user defines on the command line and also 
> where he gets restricted to use a css matching his device type.
>  From the libvirt perspective we would have supported vfio-ccw devices 
> only if the vfio-ccw would be allowed to be defined with unrestricted 
> cssids.

Thanks, I can now understand where that property came from.

> As I wrote before I also understand the reasoning to express the channel 
> subsystem wide changed behavior as a machine option. In that case 
> libvirt would need to check that both capabilities (vfio-ccw and machine 
> option cssid-unrestricted) are set and not just 
> vfio-cww.cssid-unrestricted. All doable. A qemu command line user would 
> obviously need to know the correlation of the machine option and the ccw 
> devices but that certainly is also nothing new.
> When talking to Christian and Halil I started to favor the idea of a new 
> css object especially when thinking about the future in which the user 
> might want to specify the default css. It for sure would also be able to 
> be set with the use of a machine option but a css object would at that 
> point be much a nicer and more capable approach. What do you think?

I think exposing the css object and making the default_css property
available is the cleanest solution (especially if we want more css
properties in the future). The only drawback I see is that it needs
more coding (and probably care to keep it backwards compatible.)

Halil, do you think you can come up with something?

Re: [Qemu-devel] [PATCH] iotests: fix 075 and 078

2017-11-22 Thread Kevin Wolf

Am 22.11.2017 um 01:16 hat John Snow geschrieben:
> Both of these tests are for formats which now stipulate that they are
> read-only. Adjust the tests to match.
> 
> Signed-off-by: John Snow 

Oh, right, seems I forgot about the tests...

Thanks, applied to the block branch.

Kevin

1 2 >

1 - 100 of 189 matches

Mail list logo