Re: [Qemu-devel] [PATCH 2/2] pc-bios/s390-ccw: zero out bss section
On 11/22/2017 03:26 PM, Christian Borntraeger wrote: > The QEMU ELF loader does not zero the bss segment. > This resulted in several bugs, e.g. see > > commit 5d739a4787a5 (s390-ccw.img: Fix sporadic errors with ccw boot image - > initialize css) > commit 6a40fa2669d3 (s390-ccw.img: Initialize next_idx) > commit 8775d91a0f42 (pc-bios/s390-ccw: Fix problem with invalid virtio-scsi > LUN when rebooting) > > Lets fix this once and forever by letting the BIOS zero the bss itself. > > Suggested-by: Alexander Graf > Signed-off-by: Christian Borntraeger > --- > pc-bios/s390-ccw/start.S | 30 +++--- > 1 file changed, 27 insertions(+), 3 deletions(-) Reviewed-by: Richard Henderson r~
Re: [Qemu-devel] [PATCH 00/17] iotests: Fix iotests for weird formats/options
On Thu, 11/23 03:08, Max Reitz wrote: > (I hate to write Python tests because the boilerplate seems so large and > the debugging is so hard. But there is test 194 which shows that it is > possible to write simple bash-like tests as well--and that is how I > should probably write tests from now on.) If the boilerplate means copy&paste from another test it is not a big problem; if you really mean it's more cumbersome than bash to write test code, I think we can try new ways to exploit Python's expressiveness. 2c93c5cb43 and f4844ac0ad are good examples. So what do you think about debugging? IMO the biggest problem is the less verbose reference output, and lack of a separate logging channel. We can and should improve that, even for existing scripts. And yes, the 194 style is actually better than "iotests.QMPTestCase" ones. Fam
[Qemu-devel] [PATCH v2] rcu: reduce more than 7MB heap memory by malloc_trim()
Since there are some issues in memory alloc/free machenism in glibc for little chunk memory, if Qemu frequently alloc/free little chunk memory, the glibc doesn't alloc little chunk memory from free list of glibc and still allocate from OS, which make the heap size bigger and bigger. This patch introduce malloc_trim(), which will free heap memory. Below are test results from smaps file. (1)without patch 55f0783e1000-55f07992a000 rw-p 00:00 0 [heap] Size: 21796 kB Rss: 14260 kB Pss: 14260 kB (2)with patch 55cc5fadf000-55cc61008000 rw-p 00:00 0 [heap] Size: 21668 kB Rss:6940 kB Pss:6940 kB Signed-off-by: Yang Zhong --- configure | 4 util/rcu.c | 6 ++ 2 files changed, 10 insertions(+) diff --git a/configure b/configure index 0e856bb..5b463d4 100755 --- a/configure +++ b/configure @@ -6012,6 +6012,10 @@ if test "$opengl" = "yes" ; then fi fi +if test "$tcmalloc" = "yes" || test "$jemalloc" = "yes" ; then + echo "CONFIG_NONGLIBMALLOC=y" >> $config_host_mak +fi + if test "$avx2_opt" = "yes" ; then echo "CONFIG_AVX2_OPT=y" >> $config_host_mak fi diff --git a/util/rcu.c b/util/rcu.c index ca5a63e..f3e96a8 100644 --- a/util/rcu.c +++ b/util/rcu.c @@ -32,6 +32,9 @@ #include "qemu/atomic.h" #include "qemu/thread.h" #include "qemu/main-loop.h" +#if defined(CONFIG_LINUX) +#include +#endif /* * Global grace period counter. Bit 0 is always one in rcu_gp_ctr. @@ -272,6 +275,9 @@ static void *call_rcu_thread(void *opaque) node->func(node); } qemu_mutex_unlock_iothread(); +#if defined(CONFIG_LINUX) && !defined(CONFIG_NONGLIBMALLOC) +malloc_trim(4 * 1024 * 1024); +#endif } abort(); } -- 1.9.1
Re: [Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc2 is now available
On Wed, 11/22 21:43, Jeff Cody wrote: > On Thu, Nov 23, 2017 at 08:47:47AM +0800, Fam Zheng wrote: > > On Wed, 11/22 04:55, Jeff Cody wrote: > > > On Wed, Nov 22, 2017 at 09:09:02AM +0100, Christian Borntraeger wrote: > > > > > > > > > > > > On 11/22/2017 04:23 AM, Michael Roth wrote: > > > > > Quoting Christian Borntraeger (2017-11-21 15:38:32) > > > > >> forgot to cc qemu-devel > > > > >> > > > > >> On 11/21/2017 10:37 PM, Christian Borntraeger wrote: > > > > >>> a quick heads up . Rc2 now triggers > > > > >>> +qemu-img: block/block-backend.c:2088: blk_root_drained_end: > > > > >>> Assertion `blk->quiesce_counter' failed. > > > > >>> for several qemu iotests. > > > > >>> > > > > >>> I have not looked into any details. > > > > > > > > > > It looks to be due to: > > > > > > > > > > 4afeffc8572f40d8844b946a30c00b10da4442b1 > > > > > blockjob: do not allow coroutine double entry or > > > > > entry-after-completion > > > > > > > > Yes, I can confirm that reverting this patch gets rid of this > > > > assertion, but > > > > I see things like > > > > > > > > --- /home/cborntra/REPOS/qemu/tests/qemu-iotests/020.out > > > > 2017-11-21 20:19:34.785519323 +0100 > > > > +++ /home/cborntra/REPOS/qemu/build/tests/qemu-iotests/020.out.bad > > > > 2017-11-22 09:04:50.127612500 +0100 > > > > @@ -537,7 +537,8 @@ > > > > wrote 65536/65536 bytes at offset 4295098368 > > > > 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > > > > No errors were found on the image. > > > > -Image committed. > > > > +qemu_aio_coroutine_enter: Co-routine was already scheduled in > > > > 'co_aio_sleep_ns' > > > > +./common.rc: line 61: 88002 Aborted (core dumped) ( > > > > exec "$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" ) > > > > > > > > > > That is from the subsequent patches in the series - you will want to > > > revert > > > the whole series to test, as the introduced aborts catch the illegal > > > entries that the reverted patch sidestepped. > > > > > > The series patches are: > > > > > > 4afeffc > > > 6133b39 > > > a233969 > > > d975301 > > > > > > Of course, these new aborts prevent improper behavior, so we may want to > > > figure out why this is getting hit. > > > > > > Unfortunately, I am traveling at the moment (waiting to board my flight), > > > so > > > will have limited connectivity. > > > > I'll take a look at this today and the bottom line is we revert the series > > until > > a proper fix is found. > > > > My hunch is the series is a proper fix, but uncovered other latent bugs that > were relying on dangerous behavior. Yes, I don't know. Either a different fix for your bug, or fixes for the latent bugs. Fam
Re: [Qemu-devel] [PATCH v7 0/5] fw_cfg: add DMA operations & etc/vmcoreinfo support
On Mon, Nov 20, 2017 at 10:55:14AM +0100, Marc-André Lureau wrote: > Hi, > > This series adds DMA operations support to the qemu fw_cfg kernel > module and populates "etc/vmcoreinfo" with vmcoreinfo location > details. > > Note: the support for this entry handling has been merged for upcoming > qemu release (2.11). Does not help. Still get crashes from ltp. > v7: > - add a patch to fix driver remove() > - remove DMA operatiom timeout (qemu finishes sync today) > - synchronize the DMA transfer before reading from CPU > - removed kmalloc() use static allocation instead > - drop some r-b tags > > v6: > - change acpi_acquire_global_lock() error to return EINVAL > (instead of EBUSY) > - replace 0 as pointer argument for NULL > - add Gabriel r-b/a-b tags > > v5: > - resent to CC kdump people on the paddr_vmcoreinfo_note() export patch > > v4: > - export paddr_vmcoreinfo_note() to fix fw_cfg.ko build > - fix build with !CONFIG_CRASH_CORE > - replace the unbounded yield() loop with a usleep_range() loop and a > 200ms timeout > - do not write vmcoreinfo entry when running the kdump kernel (D. Hatayama) > - drop the experimental sysfs write support patch from this series > > v3: (thanks kbuild) > - add "fw_cfg: fix the command line module name" patch > - fix build of "fw_cfg: add DMA register" with CONFIG_FW_CFG_SYSFS_CMDLINE=y > - fix 'Wshift-count-overflow' > > v2: > - use platform device for dma mapping > - add etc/vmcoreinfo patch > - some code cleanups > > Marc-André Lureau (5): > fw_cfg: fix driver remove > fw_cfg: add DMA register > fw_cfg: do DMA read operation > crash: export paddr_vmcoreinfo_note() > fw_cfg: write vmcoreinfo details > > drivers/firmware/qemu_fw_cfg.c | 276 > - > kernel/crash_core.c| 1 + > 2 files changed, 247 insertions(+), 30 deletions(-) > > -- > 2.15.0.277.ga3d2ad2c43
Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion
On 11/22/2017 02:19 AM, Rik van Riel wrote: We can go with the "best" interface for what could be a relatively slow flush (fsync on a file on ssd/disk on the host), which requires that the flushing task wait on completion asynchronously. I'd like to clarify the interface of "wait on completion asynchronously" and KVM async page fault a bit more. Current design of async-page-fault only works on RAM rather than MMIO, i.e, if the page fault caused by accessing the device memory of a emulated device, it needs to go to userspace (QEMU) which emulates the operation in vCPU's thread. As i mentioned before the memory region used for vNVDIMM flush interface should be MMIO and consider its support on other hypervisors, so we do better push this async mechanism into the flush interface design itself rather than depends on kvm async-page-fault.
[Qemu-devel] [PATCH for 2.11] virtio-net: don't touch virtqueue if vm is stopped
Guest state should not be touched if VM is stopped, unfortunately we didn't check running state and tried to drain tx queue unconditionally in virtio_net_set_status(). A crash was then noticed as a migration destination when user type quit after virtqueue state is loaded but before region cache is initialized. In this case, virtio_net_drop_tx_queue_data() tries to access the uninitialized region cache. Fix this by only dropping tx queue data when vm is running. Fixes: 283e2c2adcb80 ("net: virtio-net discards TX data after link down") Cc: Yuri Benditovich Cc: Paolo Bonzini Cc: Stefan Hajnoczi Cc: Michael S. Tsirkin Cc: qemu-sta...@nongnu.org Signed-off-by: Jason Wang --- hw/net/virtio-net.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 150fd07..38674b0 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -288,7 +288,8 @@ static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status) qemu_bh_cancel(q->tx_bh); } if ((n->status & VIRTIO_NET_S_LINK_UP) == 0 && -(queue_status & VIRTIO_CONFIG_S_DRIVER_OK)) { +(queue_status & VIRTIO_CONFIG_S_DRIVER_OK) && +vdev->vm_running) { /* if tx is waiting we are likely have some packets in tx queue * and disabled notification */ q->tx_waiting = 0; -- 2.7.4
Re: [Qemu-devel] [Qemu-block] [PATCH] block: Fix qemu crash when using scsi-block
I agree that passing in QEMUIOVector to blk_aio_ioctl() as a holder of the void* buffer used in blk_aio_ioctl_entry() is unnecessary. But, as Kevin noted, read and write were using the QEMUIOVector in BlkRwCo. To avoid changes to the callers of blk_aio_ioctl(), I’ll change blk_aio_prwv() to take a void pointer instead of QEMUIOVector* and use a union to hold the buffer in BlkRwCo. > On Nov 22, 2017, at 11:24 AM, Paolo Bonzini wrote: > > On 22/11/2017 19:06, Kevin Wolf wrote: >> Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben: >>> On 22/11/2017 16:33, Deepa Srinivasan wrote: Starting qemu with the following arguments causes qemu to segfault: ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name= iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1 This patch fixes blk_aio_ioctl() so it does not pass stack addresses to blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More details about the bug follow. blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter(). When blk_aio_ioctl() is executed from within a coroutine context (e.g. iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to the current coroutine's wakeup queue. blk_aio_ioctl() then returns. When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: BlkRwCo *rwco = &acb->rwco; rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->qiov->iov[0].iov_base); <--- qiov is invalid here ... In the case when blk_aio_ioctl() is called from a non-coroutine context, blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine execution is complete, control returns to blk_aio_ioctl_entry() after the call to blk_co_ioctl(). There is no invalid reference after this point, but the function is still holding on to invalid pointers. The fix is to allocate memory for the QEMUIOVector and struct iovec as part of the request struct which the IO buffer is part of. The memory for this struct is guaranteed to be valid till the AIO is completed. Signed-off-by: Deepa Srinivasan Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: Mark Kanda --- block/block-backend.c | 13 ++--- hw/block/virtio-blk.c | 9 - hw/scsi/scsi-disk.c| 10 +- hw/scsi/scsi-generic.c | 9 - include/sysemu/block-backend.h | 2 +- 5 files changed, 28 insertions(+), 15 deletions(-) diff --git a/block/block-backend.c b/block/block-backend.c index baef8e7..c275827 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque) blk_aio_complete(acb); } -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf, +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov, BlockCompletionFunc *cb, void *opaque) >>> >>> I think this is not the best way to fix the bug, because it adds extra >>> unnecessary code in the callers. >>> >>> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the >>> same for blk_aio_prwv's "qiov" argument? >>> >>> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just >>> use rwco->buf. >> >> But the same struct is used for read and write requests that do use an >> actual QEMUIOVector and not just a linear buffer. > > Then let's call it "void *opaque", or make it a union (but I think > that's overkill). > > The QEMUIOVector pointer is opaque as far as blk_aio_prwv is concerned, > and it is only created by blk_aio_ioctl for blk_aio_ioctl_entry to > extract buf: > >rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, > rwco->qiov->iov[0].iov_base); > > Exposing the fake QEMUIOVector to the callers of blk_aio_ioctl is much > uglier than using a void* for what is effectively a multi-type pointer. > > Paolo
Re: [Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc2 is now available
On Thu, Nov 23, 2017 at 08:47:47AM +0800, Fam Zheng wrote: > On Wed, 11/22 04:55, Jeff Cody wrote: > > On Wed, Nov 22, 2017 at 09:09:02AM +0100, Christian Borntraeger wrote: > > > > > > > > > On 11/22/2017 04:23 AM, Michael Roth wrote: > > > > Quoting Christian Borntraeger (2017-11-21 15:38:32) > > > >> forgot to cc qemu-devel > > > >> > > > >> On 11/21/2017 10:37 PM, Christian Borntraeger wrote: > > > >>> a quick heads up . Rc2 now triggers > > > >>> +qemu-img: block/block-backend.c:2088: blk_root_drained_end: > > > >>> Assertion `blk->quiesce_counter' failed. > > > >>> for several qemu iotests. > > > >>> > > > >>> I have not looked into any details. > > > > > > > > It looks to be due to: > > > > > > > > 4afeffc8572f40d8844b946a30c00b10da4442b1 > > > > blockjob: do not allow coroutine double entry or entry-after-completion > > > > > > Yes, I can confirm that reverting this patch gets rid of this assertion, > > > but > > > I see things like > > > > > > --- /home/cborntra/REPOS/qemu/tests/qemu-iotests/020.out 2017-11-21 > > > 20:19:34.785519323 +0100 > > > +++ /home/cborntra/REPOS/qemu/build/tests/qemu-iotests/020.out.bad > > > 2017-11-22 09:04:50.127612500 +0100 > > > @@ -537,7 +537,8 @@ > > > wrote 65536/65536 bytes at offset 4295098368 > > > 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > > > No errors were found on the image. > > > -Image committed. > > > +qemu_aio_coroutine_enter: Co-routine was already scheduled in > > > 'co_aio_sleep_ns' > > > +./common.rc: line 61: 88002 Aborted (core dumped) ( exec > > > "$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" ) > > > > > > > That is from the subsequent patches in the series - you will want to revert > > the whole series to test, as the introduced aborts catch the illegal > > entries that the reverted patch sidestepped. > > > > The series patches are: > > > > 4afeffc > > 6133b39 > > a233969 > > d975301 > > > > Of course, these new aborts prevent improper behavior, so we may want to > > figure out why this is getting hit. > > > > Unfortunately, I am traveling at the moment (waiting to board my flight), so > > will have limited connectivity. > > I'll take a look at this today and the bottom line is we revert the series > until > a proper fix is found. > My hunch is the series is a proper fix, but uncovered other latent bugs that were relying on dangerous behavior.
[Qemu-devel] [PATCH 16/17] iotests: Make 191 work with qcow2 options
In order for 191 to work with an explicit refcount_bits or compat=0.10, we should strip format-specific information from the output--and we can do so by using _filter_img_info. Signed-off-by: Max Reitz --- tests/qemu-iotests/191 | 5 +- tests/qemu-iotests/191.out | 313 + 2 files changed, 91 insertions(+), 227 deletions(-) diff --git a/tests/qemu-iotests/191 b/tests/qemu-iotests/191 index ad785e10b1..dfad6555e4 100755 --- a/tests/qemu-iotests/191 +++ b/tests/qemu-iotests/191 @@ -45,7 +45,6 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 . ./common.qemu _supported_fmt qcow2 -_unsupported_imgopts compat=0.10 _supported_proto file _supported_os Linux @@ -92,7 +91,7 @@ echo === Check that both top and top2 point to base now === echo _send_qemu_cmd $h "{ 'execute': 'query-named-block-nodes' }" "^}" | -_filter_generated_node_ids | _filter_actual_image_size +_filter_generated_node_ids | _filter_actual_image_size | _filter_img_info _send_qemu_cmd $h "{ 'execute': 'quit' }" "^}" wait=1 _cleanup_qemu @@ -140,7 +139,7 @@ echo === Check that both top and top2 point to base now === echo _send_qemu_cmd $h "{ 'execute': 'query-named-block-nodes' }" "^}" | -_filter_generated_node_ids | _filter_actual_image_size +_filter_generated_node_ids | _filter_actual_image_size | _filter_img_info _send_qemu_cmd $h "{ 'execute': 'quit' }" "^}" wait=1 _cleanup_qemu diff --git a/tests/qemu-iotests/191.out b/tests/qemu-iotests/191.out index 73c0ed454c..190c5f049a 100644 --- a/tests/qemu-iotests/191.out +++ b/tests/qemu-iotests/191.out @@ -44,49 +44,31 @@ wrote 65536/65536 bytes at offset 1048576 "image": { "backing-image": { "virtual-size": 67108864, -"filename": "TEST_DIR/t.qcow2.base", +"filename": "TEST_DIR/t.IMGFMT.base", "cluster-size": 65536, -"format": "qcow2", +"format": "IMGFMT", "actual-size": SIZE, -"format-specific": { -"type": "qcow2", -"data": { -"compat": "1.1", -"lazy-refcounts": false, -"refcount-bits": 16, -"corrupt": false -} -}, "dirty-flag": false }, -"backing-filename-format": "qcow2", +"backing-filename-format": "IMGFMT", "virtual-size": 67108864, -"filename": "TEST_DIR/t.qcow2.ovl2", +"filename": "TEST_DIR/t.IMGFMT.ovl2", "cluster-size": 65536, -"format": "qcow2", +"format": "IMGFMT", "actual-size": SIZE, -"format-specific": { -"type": "qcow2", -"data": { -"compat": "1.1", -"lazy-refcounts": false, -"refcount-bits": 16, -"corrupt": false -} -}, -"full-backing-filename": "TEST_DIR/t.qcow2.base", -"backing-filename": "TEST_DIR/t.qcow2.base", +"full-backing-filename": "TEST_DIR/t.IMGFMT.base", +"backing-filename": "TEST_DIR/t.IMGFMT.base", "dirty-flag": false }, "iops_wr": 0, "ro": false, "node-name": "top2", "backing_file_depth": 1, -"drv": "qcow2", +"drv": "IMGFMT", "iops": 0, "bps_wr": 0, "write_threshold": 0, -"backing_file": "TEST_DIR/t.qcow2.base", +"backing_file": "TEST_DIR/t.IMGFMT.base", "encrypted": false, "bps": 0, "bps_rd": 0, @@ -95,7 +77,7 @@ wrote 65536/65536 bytes at offset 1048576 "direct": false, "writeback": true }, -"file": "TEST_DIR/t.qcow2.ovl2", +"file": "TEST_DIR/t.IMGFMT.ovl2", "encryption_key_missing": false }, { @@ -103,7 +85,7 @@ wrote 65536/65536 bytes at offset 1048576 "detect_zeroes": "off", "image": { "virtual-size": 197120, -"filename": "TEST_DIR/t.qcow2.ovl2", +"filename": "TEST_DIR/t.IMGFMT.ovl2", "format": "file", "actual-size": SIZE, "dirty-flag": false @@ -124,7 +106,7 @@ wrote 65536/65536 bytes at offset 1048576 "direct": false, "writeback": true }, -"file": "TEST_DIR/t.qcow2.ovl2", +"file": "TEST_DIR/t.IMGFMT.ovl2", "encryption_key_missing": false },
Re: [Qemu-devel] [PATCH 00/17] iotests: Fix iotests for weird formats/options
On 2017-11-23 03:08, Max Reitz wrote: > This series fixes the qemu-iotests for qcow, vmdk, qcow2 v2 and qcow2 v3 > with refcount_bits=1. (By the way, excuse me for calling vmdk a weird format. That was supposed to be a joke. (I know explaining a joke makes it unfunny, but I now feel a bit bad for it.) Max) signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH 15/17] iotests: Make 184 image-less
184 does not need an image, so don't use one. Signed-off-by: Max Reitz --- tests/qemu-iotests/184 | 25 -- tests/qemu-iotests/184.out | 63 +++--- 2 files changed, 14 insertions(+), 74 deletions(-) diff --git a/tests/qemu-iotests/184 b/tests/qemu-iotests/184 index ee96c99af3..2b68284d58 100755 --- a/tests/qemu-iotests/184 +++ b/tests/qemu-iotests/184 @@ -27,18 +27,12 @@ echo "QA output created by $seq" here=`pwd` status=1 # failure is the default! -_cleanup() -{ -_cleanup_test_img -} -trap "_cleanup; exit \$status" 0 1 2 3 15 +trap "exit \$status" 0 1 2 3 15 # get standard environment, filters and checks . ./common.rc . ./common.filter -_supported_fmt qcow2 -_supported_proto file _supported_os Linux function do_run_qemu() @@ -55,7 +49,6 @@ function run_qemu() | _filter_actual_image_size } -_make_test_img 64M test_throttle=$($QEMU_IMG --help|grep throttle) [ "$test_throttle" = "" ] && _supported_fmt throttle @@ -66,12 +59,8 @@ run_qemu <
[Qemu-devel] [PATCH 14/17] iotests: Make 089 compatible with compat=0.10
The only thing that is missing is a _filter_img_info after the "$QEMU_IO -c info" invocations. Signed-off-by: Max Reitz --- tests/qemu-iotests/089 | 4 ++-- tests/qemu-iotests/089.out | 10 -- 2 files changed, 2 insertions(+), 12 deletions(-) diff --git a/tests/qemu-iotests/089 b/tests/qemu-iotests/089 index 9bfe2307b3..0b059aba90 100755 --- a/tests/qemu-iotests/089 +++ b/tests/qemu-iotests/089 @@ -119,11 +119,11 @@ echo # Both options given directly and those given in the filename should be used $QEMU_IO -c "open -o driver=qcow2 json:{\"file.filename\":\"$TEST_IMG\"}" \ - -c "info" 2>&1 | _filter_testdir | _filter_imgfmt + -c "info" 2>&1 | _filter_img_info # Options given directly should be prioritized over those given in the filename $QEMU_IO -c "open -o driver=qcow2 json:{\"driver\":\"raw\",\"file.filename\":\"$TEST_IMG\"}" \ - -c "info" 2>&1 | _filter_testdir | _filter_imgfmt + -c "info" 2>&1 | _filter_img_info # success, all done diff --git a/tests/qemu-iotests/089.out b/tests/qemu-iotests/089.out index 18f5fdda7a..0bf5a13ec1 100644 --- a/tests/qemu-iotests/089.out +++ b/tests/qemu-iotests/089.out @@ -38,17 +38,7 @@ cluster_size: 65536 format name: IMGFMT cluster size: 64 KiB vm state offset: 512 MiB -Format specific information: -compat: 1.1 -lazy refcounts: false -refcount bits: 16 -corrupt: false format name: IMGFMT cluster size: 64 KiB vm state offset: 512 MiB -Format specific information: -compat: 1.1 -lazy refcounts: false -refcount bits: 16 -corrupt: false *** done -- 2.13.6
[Qemu-devel] [PATCH 12/17] iotests: Fix 059's reference output
As of commit 9877860e7bd1e26ee70ab9bb5ebc34c92bf23bf5, vmdk fails differently when opening the sample image. Signed-off-by: Max Reitz --- tests/qemu-iotests/059.out | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out index f6dce7947c..1ac5d56233 100644 --- a/tests/qemu-iotests/059.out +++ b/tests/qemu-iotests/059.out @@ -2358,5 +2358,5 @@ Offset Length Mapped to File 0x14000 0x1 0x5 TEST_DIR/t-s003.vmdk === Testing afl image with a very large capacity === -qemu-img: Can't get image size 'TEST_DIR/afl9.IMGFMT': File too large +qemu-img: Could not open 'TEST_DIR/afl9.IMGFMT': Could not open 'TEST_DIR/afl9.IMGFMT': Invalid argument *** done -- 2.13.6
[Qemu-devel] [PATCH 13/17] iotests: Fix 067 for compat=0.10
067 works very well with compat=0.10 once you remove format-specific information from the QMP output. Signed-off-by: Max Reitz --- tests/qemu-iotests/067 | 3 +- tests/qemu-iotests/067.out | 97 +- 2 files changed, 28 insertions(+), 72 deletions(-) diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067 index 9d561ef786..fe259f6165 100755 --- a/tests/qemu-iotests/067 +++ b/tests/qemu-iotests/067 @@ -57,7 +57,8 @@ function run_qemu() { do_run_qemu "$@" 2>&1 | _filter_testdir | _filter_qmp | _filter_qemu \ | _filter_actual_image_size \ - | _filter_generated_node_ids | _filter_qmp_events + | _filter_generated_node_ids | _filter_qmp_events \ + | _filter_img_info } size=128M diff --git a/tests/qemu-iotests/067.out b/tests/qemu-iotests/067.out index 58e83c4505..2e71cff3ce 100644 --- a/tests/qemu-iotests/067.out +++ b/tests/qemu-iotests/067.out @@ -3,7 +3,7 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 === -drive/-device and device_del === -Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virtio-blk,drive=disk,id=virtio0 +Testing: -drive file=TEST_DIR/t.IMGFMT,format=IMGFMT,if=none,id=disk -device virtio-blk,drive=disk,id=virtio0 { QMP_VERSION } @@ -23,26 +23,17 @@ Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virti "detect_zeroes": "off", "image": { "virtual-size": 134217728, -"filename": "TEST_DIR/t.qcow2", +"filename": "TEST_DIR/t.IMGFMT", "cluster-size": 65536, -"format": "qcow2", +"format": "IMGFMT", "actual-size": SIZE, -"format-specific": { -"type": "qcow2", -"data": { -"compat": "1.1", -"lazy-refcounts": false, -"refcount-bits": 16, -"corrupt": false -} -}, "dirty-flag": false }, "iops_wr": 0, "ro": false, "node-name": "NODE_NAME", "backing_file_depth": 0, -"drv": "qcow2", +"drv": "IMGFMT", "iops": 0, "bps_wr": 0, "write_threshold": 0, @@ -54,7 +45,7 @@ Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virti "direct": false, "writeback": true }, -"file": "TEST_DIR/t.qcow2", +"file": "TEST_DIR/t.IMGFMT", "encryption_key_missing": false }, "qdev": "/machine/peripheral/virtio0/virtio-backend", @@ -81,7 +72,7 @@ Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virti === -drive/device_add and device_del === -Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk +Testing: -drive file=TEST_DIR/t.IMGFMT,format=IMGFMT,if=none,id=disk { QMP_VERSION } @@ -100,26 +91,17 @@ Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk "detect_zeroes": "off", "image": { "virtual-size": 134217728, -"filename": "TEST_DIR/t.qcow2", +"filename": "TEST_DIR/t.IMGFMT", "cluster-size": 65536, -"format": "qcow2", +"format": "IMGFMT", "actual-size": SIZE, -"format-specific": { -"type": "qcow2", -"data": { -"compat": "1.1", -"lazy-refcounts": false, -"refcount-bits": 16, -"corrupt": false -} -}, "dirty-flag": false }, "iops_wr": 0, "ro": false, "node-name": "NODE_NAME", "backing_file_depth": 0, -"drv": "qcow2", +"drv": "IMGFMT", "iops": 0, "bps_wr": 0, "write_threshold": 0, @@ -131,7 +113,7 @@ Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk "direct": false, "writeback": true }, -"file": "TEST_DIR/t.qcow2", +"file": "TEST_DIR/t.IMGFMT", "encryption_key_missing": false }, "type": "unknown" @@ -183,26 +165,17 @@ Testing: "detect_zeroes": "off",
[Qemu-devel] [PATCH 08/17] iotests: Skip 103 for refcount_bits=1
Signed-off-by: Max Reitz --- tests/qemu-iotests/103 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/qemu-iotests/103 b/tests/qemu-iotests/103 index ecbd8ebd71..d0cfab8844 100755 --- a/tests/qemu-iotests/103 +++ b/tests/qemu-iotests/103 @@ -40,6 +40,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file nfs _supported_os Linux +# Internal snapshots are (currently) impossible with refcount_bits=1 +_unsupported_imgopts 'refcount_bits=1[^0-9]' IMG_SIZE=64K -- 2.13.6
[Qemu-devel] [PATCH 17/17] iotests: Filter compat-dependent info in 198
There is a bit of image-specific information which depends on the qcow2 compat level. Filter it so that 198 works with compat=0.10 (and any refcount_bits value). Note that we cannot simply drop the --format-specific switch because we do need the "encrypt" information. Signed-off-by: Max Reitz --- tests/qemu-iotests/198 | 6 -- tests/qemu-iotests/198.out | 8 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/tests/qemu-iotests/198 b/tests/qemu-iotests/198 index a84a058396..54eaaf5153 100755 --- a/tests/qemu-iotests/198 +++ b/tests/qemu-iotests/198 @@ -92,12 +92,14 @@ $QEMU_IO --object $SECRET0 --object $SECRET1 -c "read -P 0x9 0 $size" --image-op echo echo "== checking image base ==" $QEMU_IMG info --image-opts $IMGSPECBASE | _filter_img_info --format-specific \ -| sed -e "/^disk size:/ D" +| sed -e "/^disk size:/ D" -e '/refcount bits:/ D' -e '/compat:/ D' \ + -e '/lazy refcounts:/ D' -e '/corrupt:/ D' echo echo "== checking image layer ==" $QEMU_IMG info --image-opts $IMGSPECLAYER | _filter_img_info --format-specific \ -| sed -e "/^disk size:/ D" +| sed -e "/^disk size:/ D" -e '/refcount bits:/ D' -e '/compat:/ D' \ + -e '/lazy refcounts:/ D' -e '/corrupt:/ D' # success, all done diff --git a/tests/qemu-iotests/198.out b/tests/qemu-iotests/198.out index 2db565e16b..adb805cce9 100644 --- a/tests/qemu-iotests/198.out +++ b/tests/qemu-iotests/198.out @@ -36,9 +36,6 @@ image: json:{"encrypt.key-secret": "sec0", "driver": "IMGFMT", "file": {"driver" file format: IMGFMT virtual size: 16M (16777216 bytes) Format specific information: -compat: 1.1 -lazy refcounts: false -refcount bits: 16 encrypt: ivgen alg: plain64 hash alg: sha256 @@ -75,7 +72,6 @@ Format specific information: key offset: 1810432 payload offset: 2068480 master key iters: 1024 -corrupt: false == checking image layer == image: json:{"encrypt.key-secret": "sec1", "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/t.IMGFMT"}} @@ -83,9 +79,6 @@ file format: IMGFMT virtual size: 16M (16777216 bytes) backing file: TEST_DIR/t.IMGFMT.base Format specific information: -compat: 1.1 -lazy refcounts: false -refcount bits: 16 encrypt: ivgen alg: plain64 hash alg: sha256 @@ -122,5 +115,4 @@ Format specific information: key offset: 1810432 payload offset: 2068480 master key iters: 1024 -corrupt: false *** done -- 2.13.6
[Qemu-devel] [PATCH 06/17] iotests: Drop format-specific in _filter_img_info
_filter_img_info should remove format-specific information, too. We already have such a filter in _img_info, and it is very useful for query-block-named-block-nodes (etc.), too. However, in 198 we need that information (but we still want the rest of the filter), so make that filtering optional. Note that "the rest of the filter" includes filtering of the test directory, so we can drop the _filter_testdir from 198 at the same time. Signed-off-by: Max Reitz --- tests/qemu-iotests/198 | 6 -- tests/qemu-iotests/common.filter | 29 - 2 files changed, 32 insertions(+), 3 deletions(-) diff --git a/tests/qemu-iotests/198 b/tests/qemu-iotests/198 index 34ef666381..a84a058396 100755 --- a/tests/qemu-iotests/198 +++ b/tests/qemu-iotests/198 @@ -91,11 +91,13 @@ $QEMU_IO --object $SECRET0 --object $SECRET1 -c "read -P 0x9 0 $size" --image-op echo echo "== checking image base ==" -$QEMU_IMG info --image-opts $IMGSPECBASE | _filter_img_info | _filter_testdir | sed -e "/^disk size:/ D" +$QEMU_IMG info --image-opts $IMGSPECBASE | _filter_img_info --format-specific \ +| sed -e "/^disk size:/ D" echo echo "== checking image layer ==" -$QEMU_IMG info --image-opts $IMGSPECLAYER | _filter_img_info | _filter_testdir | sed -e "/^disk size:/ D" +$QEMU_IMG info --image-opts $IMGSPECLAYER | _filter_img_info --format-specific \ +| sed -e "/^disk size:/ D" # success, all done diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index d9237799e9..0c0e53fae7 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -139,6 +139,15 @@ _filter_img_create() _filter_img_info() { +if [[ "$1" == "--format-specific" ]]; then +local format_specific=1 +shift +else +local format_specific=0 +fi + +discard=0 +regex_json_spec_start='^ *"format-specific": \{' sed -e "s#$IMGPROTO:$TEST_DIR#TEST_DIR#g" \ -e "s#$TEST_DIR#TEST_DIR#g" \ -e "s#$IMGFMT#IMGFMT#g" \ @@ -159,7 +168,25 @@ _filter_img_info() -e "/block_state_zero: \\(on\\|off\\)/d" \ -e "/log_size: [0-9]\\+/d" \ -e "s/iters: [0-9]\\+/iters: 1024/" \ --e "s/uuid: [-a-f0-9]\\+/uuid: ----/" +-e "s/uuid: [-a-f0-9]\\+/uuid: ----/" | \ +while IFS='' read -r line; do +if [[ $format_specific == 1 ]]; then +discard=0 +elif [[ $line == "Format specific information:" ]]; then +discard=1 +elif [[ $line =~ $regex_json_spec_start ]]; then +discard=2 +regex_json_spec_end="^${line%%[^ ]*}\\},? *$" +fi +if [[ $discard == 0 ]]; then +echo "$line" +elif [[ $discard == 1 && ! $line ]]; then +echo +discard=0 +elif [[ $discard == 2 && $line =~ $regex_json_spec_end ]]; then +discard=0 +fi +done } # filter out offsets and file names from qemu-img map; good for both -- 2.13.6
[Qemu-devel] [PATCH 11/17] iotests: Fix 051 for compat=0.10
051 has both compat=1.1 and compat=0.10 tests (once it uses lazy_refcounts, once it tests that setting them does not work). For the compat=0.10 tests, it already explicitly creates a suitable image. So let's just ignore the user-specified compat level for the lazy_refcounts test and explicitly create a compat=1.1 image there, too. Signed-off-by: Max Reitz --- tests/qemu-iotests/051| 2 ++ tests/qemu-iotests/051.out| 1 + tests/qemu-iotests/051.pc.out | 1 + 3 files changed, 4 insertions(+) diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051 index dba8816c9f..0c3be16489 100755 --- a/tests/qemu-iotests/051 +++ b/tests/qemu-iotests/051 @@ -131,6 +131,8 @@ echo echo === Enable and disable lazy refcounting on the command line, plus some invalid values === echo +_make_test_img -o compat=1.1 "$size" + run_qemu -drive file="$TEST_IMG",format=qcow2,lazy-refcounts=on run_qemu -drive file="$TEST_IMG",format=qcow2,lazy-refcounts=off run_qemu -drive file="$TEST_IMG",format=qcow2,lazy-refcounts= diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out index e3c6eaba57..dd9846d1ce 100644 --- a/tests/qemu-iotests/051.out +++ b/tests/qemu-iotests/051.out @@ -77,6 +77,7 @@ QEMU_PROG: -drive file=TEST_DIR/t.qcow2,file.backing.driver=qcow2,file.backing.f === Enable and disable lazy refcounting on the command line, plus some invalid values === +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=on QEMU X.Y.Z monitor - type 'help' for more information (qemu) quit diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out index f2c5622cee..830c11880a 100644 --- a/tests/qemu-iotests/051.pc.out +++ b/tests/qemu-iotests/051.pc.out @@ -77,6 +77,7 @@ QEMU_PROG: -drive file=TEST_DIR/t.qcow2,file.backing.driver=qcow2,file.backing.f === Enable and disable lazy refcounting on the command line, plus some invalid values === +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=on QEMU X.Y.Z monitor - type 'help' for more information (qemu) quit -- 2.13.6
[Qemu-devel] [PATCH 05/17] iotests: Fix _img_info for backslashes
read without -r eats backslashes. Signed-off-by: Max Reitz --- tests/qemu-iotests/common.rc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index dbae7d74ba..6fa0495e10 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -338,7 +338,7 @@ _img_info() -e "s#$IMGFMT#IMGFMT#g" \ -e "/^disk size:/ D" \ -e "/actual-size/ D" | \ -while IFS='' read line; do +while IFS='' read -r line; do if [[ $format_specific == 1 ]]; then discard=0 elif [[ $line == "Format specific information:" ]]; then -- 2.13.6
[Qemu-devel] [PATCH 04/17] block/vmdk: Add blkdebug events
This is certainly not complete, but it includes at least write_aio and read_aio. Signed-off-by: Max Reitz --- block/vmdk.c | 16 1 file changed, 16 insertions(+) diff --git a/block/vmdk.c b/block/vmdk.c index 1ae47b1c2e..d71cec4f31 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -1075,6 +1075,8 @@ static int get_whole_cluster(BlockDriverState *bs, /* Read backing data before skip range */ if (skip_start_bytes > 0) { if (bs->backing) { +/* qcow2 emits this on bs->file instead of bs->backing */ +BLKDBG_EVENT(extent->file, BLKDBG_COW_READ); ret = bdrv_pread(bs->backing, offset, whole_grain, skip_start_bytes); if (ret < 0) { @@ -1082,6 +1084,7 @@ static int get_whole_cluster(BlockDriverState *bs, goto exit; } } +BLKDBG_EVENT(extent->file, BLKDBG_COW_WRITE); ret = bdrv_pwrite(extent->file, cluster_offset, whole_grain, skip_start_bytes); if (ret < 0) { @@ -1092,6 +1095,8 @@ static int get_whole_cluster(BlockDriverState *bs, /* Read backing data after skip range */ if (skip_end_bytes < cluster_bytes) { if (bs->backing) { +/* qcow2 emits this on bs->file instead of bs->backing */ +BLKDBG_EVENT(extent->file, BLKDBG_COW_READ); ret = bdrv_pread(bs->backing, offset + skip_end_bytes, whole_grain + skip_end_bytes, cluster_bytes - skip_end_bytes); @@ -1100,6 +1105,7 @@ static int get_whole_cluster(BlockDriverState *bs, goto exit; } } +BLKDBG_EVENT(extent->file, BLKDBG_COW_WRITE); ret = bdrv_pwrite(extent->file, cluster_offset + skip_end_bytes, whole_grain + skip_end_bytes, cluster_bytes - skip_end_bytes); @@ -1120,6 +1126,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data, { offset = cpu_to_le32(offset); /* update L2 table */ +BLKDBG_EVENT(extent->file, BLKDBG_L2_UPDATE); if (bdrv_pwrite_sync(extent->file, ((int64_t)m_data->l2_offset * 512) + (m_data->l2_index * sizeof(offset)), @@ -1218,6 +1225,7 @@ static int get_cluster_offset(BlockDriverState *bs, } } l2_table = extent->l2_cache + (min_index * extent->l2_size); +BLKDBG_EVENT(extent->file, BLKDBG_L2_LOAD); if (bdrv_pread(extent->file, (int64_t)l2_offset * 512, l2_table, @@ -1393,9 +1401,13 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset, .iov_len= n_bytes, }; qemu_iovec_init_external(&local_qiov, &iov, 1); + +BLKDBG_EVENT(extent->file, BLKDBG_WRITE_COMPRESSED); } else { qemu_iovec_init(&local_qiov, qiov->niov); qemu_iovec_concat(&local_qiov, qiov, qiov_offset, n_bytes); + +BLKDBG_EVENT(extent->file, BLKDBG_WRITE_AIO); } write_offset = cluster_offset + offset_in_cluster; @@ -1437,6 +1449,7 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset, if (!extent->compressed) { +BLKDBG_EVENT(extent->file, BLKDBG_READ_AIO); ret = bdrv_co_preadv(extent->file, cluster_offset + offset_in_cluster, bytes, qiov, 0); @@ -1450,6 +1463,7 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset, buf_bytes = cluster_bytes * 2; cluster_buf = g_malloc(buf_bytes); uncomp_buf = g_malloc(cluster_bytes); +BLKDBG_EVENT(extent->file, BLKDBG_READ_COMPRESSED); ret = bdrv_pread(extent->file, cluster_offset, cluster_buf, buf_bytes); @@ -1527,6 +1541,8 @@ vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, qemu_iovec_reset(&local_qiov); qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes); +/* qcow2 emits this on bs->file instead of bs->backing */ +BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO); ret = bdrv_co_preadv(bs->backing, offset, n_bytes, &local_qiov, 0); if (ret < 0) { -- 2.13.6
[Qemu-devel] [PATCH 10/17] iotests: Fix 020 for vmdk
vmdk cannot work with anything but vmdk backing files, so make the backing file be the same format as the overlay. Reported-by: John Snow Signed-off-by: Max Reitz --- tests/qemu-iotests/020 | 9 ++--- tests/qemu-iotests/020.out | 6 -- 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020 index d22ab81414..eac5080f83 100755 --- a/tests/qemu-iotests/020 +++ b/tests/qemu-iotests/020 @@ -111,10 +111,12 @@ echo echo 'Testing failing commit' echo +TEST_IMG="$TEST_IMG.base" _make_test_img 1M + # Create an image with a null backing file to which committing will fail (with # ENOSPC so we can distinguish the result from some generic EIO which may be # generated anywhere in the block layer) -_make_test_img -b "json:{'driver': 'raw', +_make_test_img -b "json:{'driver': '$IMGFMT', 'file': { 'driver': 'blkdebug', 'inject-error': [{ @@ -123,14 +125,15 @@ _make_test_img -b "json:{'driver': 'raw', 'once': true }], 'image': { - 'driver': 'null-co' + 'driver': 'file', + 'filename': '$TEST_IMG.base' }}}" # Just write anything so committing will not be a no-op $QEMU_IO -c 'writev 0 64k' "$TEST_IMG" | _filter_qemu_io $QEMU_IMG commit "$TEST_IMG" -_cleanup_test_img +_cleanup # success, all done echo "*** done" diff --git a/tests/qemu-iotests/020.out b/tests/qemu-iotests/020.out index 165b70aa49..4b722b2dd0 100644 --- a/tests/qemu-iotests/020.out +++ b/tests/qemu-iotests/020.out @@ -1078,7 +1078,8 @@ No errors were found on the image. Testing failing commit -Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 backing_file=json:{'driver': 'raw',, +Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576 +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=json:{'driver': 'IMGFMT',, 'file': { 'driver': 'blkdebug',, 'inject-error': [{ @@ -1087,7 +1088,8 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 backing_file=json:{'d 'once': true }],, 'image': { - 'driver': 'null-co' + 'driver': 'file',, + 'filename': 'TEST_DIR/t.IMGFMT.base' }}} wrote 65536/65536 bytes at offset 0 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) -- 2.13.6
[Qemu-devel] [PATCH 09/17] iotests: Disable some tests for compat=0.10
Tests 080, 130, 137, and 176 simply do not work with compat=0.10 for the reasons stated there. 177 is a bit more interesting: Originally, it was actually very much intended to work with compat=0.10 (it even had a special case for that). However, it now prints the test image's map twice, and short of just not doing that, there is no solution I can imagine that is both simple and would leave compat=0.10 support intact. Signed-off-by: Max Reitz --- tests/qemu-iotests/080 | 5 +++-- tests/qemu-iotests/130 | 2 ++ tests/qemu-iotests/137 | 2 ++ tests/qemu-iotests/176 | 2 ++ tests/qemu-iotests/177 | 13 +++-- 5 files changed, 12 insertions(+), 12 deletions(-) diff --git a/tests/qemu-iotests/080 b/tests/qemu-iotests/080 index 55044c700b..1c2bd85742 100755 --- a/tests/qemu-iotests/080 +++ b/tests/qemu-iotests/080 @@ -41,8 +41,9 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file _supported_os Linux -# Internal snapshots are (currently) impossible with refcount_bits=1 -_unsupported_imgopts 'refcount_bits=1[^0-9]' +# - Internal snapshots are (currently) impossible with refcount_bits=1 +# - This is generally a test for compat=1.1 images +_unsupported_imgopts 'refcount_bits=1[^0-9]' 'compat=0.10' header_size=104 diff --git a/tests/qemu-iotests/130 b/tests/qemu-iotests/130 index e7e43de6d6..2c4b94da1b 100755 --- a/tests/qemu-iotests/130 +++ b/tests/qemu-iotests/130 @@ -45,6 +45,8 @@ _supported_fmt qcow2 _supported_proto generic _unsupported_proto vxhs _supported_os Linux +# We are going to use lazy-refcounts +_unsupported_imgopts 'compat=0.10' qemu_comm_method="monitor" diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137 index eb91e517d7..5a01250005 100755 --- a/tests/qemu-iotests/137 +++ b/tests/qemu-iotests/137 @@ -41,6 +41,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file _supported_os Linux +# We are going to use lazy-refcounts +_unsupported_imgopts 'compat=0.10' _make_test_img 64M diff --git a/tests/qemu-iotests/176 b/tests/qemu-iotests/176 index b8dc17c592..d38b3aeb91 100755 --- a/tests/qemu-iotests/176 +++ b/tests/qemu-iotests/176 @@ -48,6 +48,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file _supported_os Linux +# Persistent dirty bitmaps require compat=1.1 +_unsupported_imgopts 'compat=0.10' function run_qemu() { diff --git a/tests/qemu-iotests/177 b/tests/qemu-iotests/177 index 28990977f1..86cf25f855 100755 --- a/tests/qemu-iotests/177 +++ b/tests/qemu-iotests/177 @@ -39,6 +39,8 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 _supported_fmt qcow2 _supported_proto file +# This test assumes that discard leaves zero clusters +_unsupported_imgopts 'compat=0.10' CLUSTER_SIZE=1M size=128M @@ -93,15 +95,6 @@ echo "== verify image content ==" function verify_io() { -if ($QEMU_IMG info -f "$IMGFMT" "$TEST_IMG" | - grep "compat: 0.10" > /dev/null); then -# For v2 images, discarded clusters are read from the backing file -discarded=11 -else -# Discarded clusters are zeroed for v3 or later -discarded=0 -fi - echo read -P 22 0 1000 echo read -P 33 1000 128k echo read -P 22 132072 7871512 @@ -109,7 +102,7 @@ function verify_io() echo read -P 22 10096640 23457792 echo read -P 0 32M 32M echo read -P 22 64M 13M -echo read -P $discarded 77M 29M +echo read -P 0 77M 29M echo read -P 22 106M 4M echo read -P 11 110M 18M } -- 2.13.6
[Qemu-devel] [PATCH 03/17] block/qcow: Add blkdebug events
This is not necessarily complete, but it should include the most important places. Signed-off-by: Max Reitz --- block/qcow.c | 16 1 file changed, 16 insertions(+) diff --git a/block/qcow.c b/block/qcow.c index 9569deeaf0..d552a6eba8 100644 --- a/block/qcow.c +++ b/block/qcow.c @@ -379,6 +379,7 @@ static int get_cluster_offset(BlockDriverState *bs, /* update the L1 entry */ s->l1_table[l1_index] = l2_offset; tmp = cpu_to_be64(l2_offset); +BLKDBG_EVENT(bs->file, BLKDBG_L1_UPDATE); ret = bdrv_pwrite_sync(bs->file, s->l1_table_offset + l1_index * sizeof(tmp), &tmp, sizeof(tmp)); @@ -409,6 +410,7 @@ static int get_cluster_offset(BlockDriverState *bs, } } l2_table = s->l2_cache + (min_index << s->l2_bits); +BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD); if (new_l2_table) { memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); ret = bdrv_pwrite_sync(bs->file, l2_offset, l2_table, @@ -432,6 +434,7 @@ static int get_cluster_offset(BlockDriverState *bs, ((cluster_offset & QCOW_OFLAG_COMPRESSED) && allocate == 1)) { if (!allocate) return 0; +BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC); /* allocate a new cluster */ if ((cluster_offset & QCOW_OFLAG_COMPRESSED) && (n_end - n_start) < s->cluster_sectors) { @@ -447,6 +450,7 @@ static int get_cluster_offset(BlockDriverState *bs, } cluster_offset = QEMU_ALIGN_UP(cluster_offset, s->cluster_size); /* write the cluster content */ +BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO); ret = bdrv_pwrite(bs->file, cluster_offset, s->cluster_cache, s->cluster_size); if (ret < 0) { @@ -486,6 +490,7 @@ static int get_cluster_offset(BlockDriverState *bs, NULL) < 0) { return -EIO; } +BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO); ret = bdrv_pwrite(bs->file, cluster_offset + i * 512, s->cluster_data, 512); @@ -503,6 +508,11 @@ static int get_cluster_offset(BlockDriverState *bs, /* update L2 table */ tmp = cpu_to_be64(cluster_offset); l2_table[l2_index] = tmp; +if (allocate == 2) { +BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED); +} else { +BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE); +} ret = bdrv_pwrite_sync(bs->file, l2_offset + l2_index * sizeof(tmp), &tmp, sizeof(tmp)); if (ret < 0) { @@ -579,6 +589,7 @@ static int decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset) if (s->cluster_cache_offset != coffset) { csize = cluster_offset >> (63 - s->cluster_bits); csize &= (s->cluster_size - 1); +BLKDBG_EVENT(bs->file, BLKDBG_READ_COMPRESSED); ret = bdrv_pread(bs->file, coffset, s->cluster_data, csize); if (ret != csize) return -1; @@ -635,6 +646,8 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num, hd_iov.iov_len = n * 512; qemu_iovec_init_external(&hd_qiov, &hd_iov, 1); qemu_co_mutex_unlock(&s->lock); +/* qcow2 emits this on bs->file instead of bs->backing */ +BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO); ret = bdrv_co_readv(bs->backing, sector_num, n, &hd_qiov); qemu_co_mutex_lock(&s->lock); if (ret < 0) { @@ -661,6 +674,7 @@ static coroutine_fn int qcow_co_readv(BlockDriverState *bs, int64_t sector_num, hd_iov.iov_len = n * 512; qemu_iovec_init_external(&hd_qiov, &hd_iov, 1); qemu_co_mutex_unlock(&s->lock); +BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO); ret = bdrv_co_readv(bs->file, (cluster_offset >> 9) + index_in_cluster, n, &hd_qiov); @@ -754,6 +768,7 @@ static coroutine_fn int qcow_co_writev(BlockDriverState *bs, int64_t sector_num, hd_iov.iov_len = n * 512; qemu_iovec_init_external(&hd_qiov, &hd_iov, 1); qemu_co_mutex_unlock(&s->lock); +BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO); ret = bdrv_co_writev(bs->file, (cluster_offset >> 9) + index_in_cluster, n, &hd_qiov); @@ -1048,6 +1063,7 @@ qcow_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset, .iov_len= out_len, }; qemu_iovec_init_external(&hd_qiov, &iov, 1); +BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED); ret =
[Qemu-devel] [PATCH 07/17] iotests: Forbid 020 for non-file protocols
This test does funny things like TEST_IMG="TEST_IMG.base" _make_test_img that usually only work with the file protocol. More specifically, they do not work with the most interesting non-file protocols, so we might as well skip this for anything but file. Signed-off-by: Max Reitz --- tests/qemu-iotests/020 | 8 +--- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020 index cefe3a830e..d22ab81414 100755 --- a/tests/qemu-iotests/020 +++ b/tests/qemu-iotests/020 @@ -42,18 +42,12 @@ trap "_cleanup; exit \$status" 0 1 2 3 15 # Any format supporting backing files _supported_fmt qcow qcow2 vmdk qed -_supported_proto generic -_unsupported_proto vxhs +_supported_proto file _supported_os Linux _unsupported_imgopts "subformat=monolithicFlat" \ "subformat=twoGbMaxExtentFlat" \ "subformat=twoGbMaxExtentSparse" -# NFS does not support bdrv_reopen_prepare thus qemu-img commit fails. -if [ "$IMGPROTO" = "nfs" ]; then -_notrun "image protocol $IMGPROTO does not support bdrv_commit" -fi - TEST_OFFSETS="0 4294967296" TEST_IMG_SAVE="$TEST_IMG" -- 2.13.6
[Qemu-devel] [PATCH 02/17] qcow2: No persistent dirty bitmaps for compat=0.10
Persistent dirty bitmaps require a properly functioning autoclear_features field, or we cannot track when an unsupporting program might overwrite them. Therefore, we cannot support them for compat=0.10 images. Signed-off-by: Max Reitz --- block/qcow2-bitmap.c | 10 ++ block/qcow2.c| 14 +++--- 2 files changed, 21 insertions(+), 3 deletions(-) diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c index f45e46cfbd..efa10c6663 100644 --- a/block/qcow2-bitmap.c +++ b/block/qcow2-bitmap.c @@ -1449,6 +1449,16 @@ bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs, bool found; Qcow2BitmapList *bm_list; +if (s->qcow_version < 3) { +/* Without autoclear_features, we would always have to assume + * that a program without persistent dirty bitmap support has + * accessed this qcow2 file when opening it, and would thus + * have to drop all dirty bitmaps (defeating their purpose). + */ +error_setg(errp, "Cannot store dirty bitmaps in qcow2 v2 files"); +goto fail; +} + if (check_constraints_on_bitmap(bs, name, granularity, errp) != 0) { goto fail; } diff --git a/block/qcow2.c b/block/qcow2.c index 1914a940e5..cabf93e43c 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -302,9 +302,17 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset, } if (!(s->autoclear_features & QCOW2_AUTOCLEAR_BITMAPS)) { -warn_report("a program lacking bitmap support " -"modified this file, so all bitmaps are now " -"considered inconsistent"); +if (s->qcow_version < 3) { +/* Let's be a bit more specific */ +warn_report("This qcow2 v2 image contains bitmaps, but " +"they may have been modified by a program " +"without persistent bitmap support; so now " +"they must all be considered inconsistent"); +} else { +warn_report("a program lacking bitmap support " +"modified this file, so all bitmaps are now " +"considered inconsistent"); +} error_printf("Some clusters may be leaked, " "run 'qemu-img check -r' on the image " "file to fix."); -- 2.13.6
[Qemu-devel] [PATCH 00/17] iotests: Fix iotests for weird formats/options
This series fixes the qemu-iotests for qcow, vmdk, qcow2 v2 and qcow2 v3 with refcount_bits=1. Patches 1 and 2 contain real fixes (not urgent, though, so no need to hurry for 2.11--we can take them into 2.11 if we want to, but there is no absolute need for them). Patches 3 and 4 add blkdebug events to qcow and vmdk so iotests 020 can work with them even after patch 10. Patches 5 and 6 are general "fixes" for the iotests infrastructure. Patches 7, 8, and 9 add some missing skips under certain circumstances to tests that need them. The rest of this series (patches 10 through 17) actually fix tests so they work for the formats and options mentioned above. (Fun fact: qcow v1 wasn't broken before this series. But it would be broken by patch 10 if I didn't include patch 3. That is why I mentioned it above.) Personal note: I should really stop writing bash tests, at least as soon as there is QMP involved. While working on this series I got sidetracked a bit and actually wrote some iotests.py functions that may come in handy next time I write a test. (I hate to write Python tests because the boilerplate seems so large and the debugging is so hard. But there is test 194 which shows that it is possible to write simple bash-like tests as well--and that is how I should probably write tests from now on.) Max Reitz (17): block/vmdk: Fix , instead of ; at end of line qcow2: No persistent dirty bitmaps for compat=0.10 block/qcow: Add blkdebug events block/vmdk: Add blkdebug events iotests: Fix _img_info for backslashes iotests: Drop format-specific in _filter_img_info iotests: Forbid 020 for non-file protocols iotests: Skip 103 for refcount_bits=1 iotests: Disable some tests for compat=0.10 iotests: Fix 020 for vmdk iotests: Fix 051 for compat=0.10 iotests: Fix 059's reference output iotests: Fix 067 for compat=0.10 iotests: Make 089 compatible with compat=0.10 iotests: Make 184 image-less iotests: Make 191 work with qcow2 options iotests: Filter compat-dependent info in 198 block/qcow.c | 16 ++ block/qcow2-bitmap.c | 10 ++ block/qcow2.c| 14 +- block/vmdk.c | 18 ++- tests/qemu-iotests/020 | 17 +-- tests/qemu-iotests/020.out | 6 +- tests/qemu-iotests/051 | 2 + tests/qemu-iotests/051.out | 1 + tests/qemu-iotests/051.pc.out| 1 + tests/qemu-iotests/059.out | 2 +- tests/qemu-iotests/067 | 3 +- tests/qemu-iotests/067.out | 97 tests/qemu-iotests/080 | 5 +- tests/qemu-iotests/089 | 4 +- tests/qemu-iotests/089.out | 10 -- tests/qemu-iotests/103 | 2 + tests/qemu-iotests/130 | 2 + tests/qemu-iotests/137 | 2 + tests/qemu-iotests/176 | 2 + tests/qemu-iotests/177 | 13 +- tests/qemu-iotests/184 | 25 +--- tests/qemu-iotests/184.out | 63 ++-- tests/qemu-iotests/191 | 5 +- tests/qemu-iotests/191.out | 313 +++ tests/qemu-iotests/198 | 8 +- tests/qemu-iotests/198.out | 8 - tests/qemu-iotests/common.filter | 29 +++- tests/qemu-iotests/common.rc | 2 +- 28 files changed, 254 insertions(+), 426 deletions(-) -- 2.13.6
[Qemu-devel] [PATCH 01/17] block/vmdk: Fix , instead of ; at end of line
Signed-off-by: Max Reitz --- block/vmdk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/vmdk.c b/block/vmdk.c index c665bcc977..1ae47b1c2e 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -1398,7 +1398,7 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset, qemu_iovec_concat(&local_qiov, qiov, qiov_offset, n_bytes); } -write_offset = cluster_offset + offset_in_cluster, +write_offset = cluster_offset + offset_in_cluster; ret = bdrv_co_pwritev(extent->file, write_offset, n_bytes, &local_qiov, 0); -- 2.13.6
Re: [Qemu-devel] [PATCH] Add a blog post about HAXM acceleration on Windows
On 11/23/2017 1:03, Thomas Huth wrote: On 22.11.2017 15:52, Yu Ning wrote: On 11/22/2017 20:25, Thomas Huth wrote: [...] Would it be OK to remove the qemu-debian-wheezy-gui-with-haxm.png from the blog post? Yes, I'm fine with removing it. Sorry I haven't installed Jekyll and didn't test the rendering. Would you confirm whether that's the only change I need to make in v2? No need to respin, since this was just a one-line change, I was able to do it on my own (I still removed the screenshot, even if it seemed to be working with Paolo's patch to the CSS, since the screenshot looked just a bit too big for the blog - I hope that's OK for you). Yes, absolutely :) So the blog post is now online: https://www.qemu.org/2017/11/22/haxm-usage-windows/ Looks nice, thanks!
Re: [Qemu-devel] [PATCH qemu] vfio/spapr: Allow fallback to SPAPR TCE IOMMU v1
On 23/11/17 00:39, David Gibson wrote: > On Wed, Nov 22, 2017 at 04:15:52PM +1100, Alexey Kardashevskiy wrote: >> The vfio_iommu_spapr_tce driver always advertises v1 and v2 IOMMU support, >> however PR KVM (a special version of KVM designed to work in >> a paravirtualized system; these days used for nested virtualizaion) only >> supports the "pseries" platform which does not support v2. Since there is >> no way to choose the IOMMU version in QEMU, it fails to start. >> >> This adds a fallback to the v1 IOMMU if v2 cannot be used. >> >> Signed-off-by: Alexey Kardashevskiy > > The fallback itself isn't a bad idea, but your commit message contains > several inaccurracies. KVM PR is not particularly designed to work in > a paravirtualized system, and it doesn't only support the pseries > platform (as guest *or* host). It's actually a lot more general than > KVM HV - just slow, not that well tested and missing a number of > features that no-one's bothered to port to it. Well, true. I kinda tried to give an example of how this may be useful and exaggerated a bit, plus my ignorance :) I'll repost if Alex does not have objections otherwise. > >> --- >> hw/vfio/common.c | 5 + >> 1 file changed, 5 insertions(+) >> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c >> index 7b2924c..cd81cc9 100644 >> --- a/hw/vfio/common.c >> +++ b/hw/vfio/common.c >> @@ -1040,6 +1040,11 @@ static int vfio_connect_container(VFIOGroup *group, >> AddressSpace *as, >> v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU; >> ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type); >> if (ret) { >> +container->iommu_type = VFIO_SPAPR_TCE_IOMMU; >> +v2 = false; >> +ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type); >> +} >> +if (ret) { >> error_setg_errno(errp, errno, "failed to set iommu for >> container"); >> ret = -errno; >> goto free_container_exit; > -- Alexey signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc2 is now available
On Wed, 11/22 04:55, Jeff Cody wrote: > On Wed, Nov 22, 2017 at 09:09:02AM +0100, Christian Borntraeger wrote: > > > > > > On 11/22/2017 04:23 AM, Michael Roth wrote: > > > Quoting Christian Borntraeger (2017-11-21 15:38:32) > > >> forgot to cc qemu-devel > > >> > > >> On 11/21/2017 10:37 PM, Christian Borntraeger wrote: > > >>> a quick heads up . Rc2 now triggers > > >>> +qemu-img: block/block-backend.c:2088: blk_root_drained_end: Assertion > > >>> `blk->quiesce_counter' failed. > > >>> for several qemu iotests. > > >>> > > >>> I have not looked into any details. > > > > > > It looks to be due to: > > > > > > 4afeffc8572f40d8844b946a30c00b10da4442b1 > > > blockjob: do not allow coroutine double entry or entry-after-completion > > > > Yes, I can confirm that reverting this patch gets rid of this assertion, but > > I see things like > > > > --- /home/cborntra/REPOS/qemu/tests/qemu-iotests/020.out2017-11-21 > > 20:19:34.785519323 +0100 > > +++ /home/cborntra/REPOS/qemu/build/tests/qemu-iotests/020.out.bad > > 2017-11-22 09:04:50.127612500 +0100 > > @@ -537,7 +537,8 @@ > > wrote 65536/65536 bytes at offset 4295098368 > > 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > > No errors were found on the image. > > -Image committed. > > +qemu_aio_coroutine_enter: Co-routine was already scheduled in > > 'co_aio_sleep_ns' > > +./common.rc: line 61: 88002 Aborted (core dumped) ( exec > > "$QEMU_IMG_PROG" $QEMU_IMG_OPTIONS "$@" ) > > > > That is from the subsequent patches in the series - you will want to revert > the whole series to test, as the introduced aborts catch the illegal > entries that the reverted patch sidestepped. > > The series patches are: > > 4afeffc > 6133b39 > a233969 > d975301 > > Of course, these new aborts prevent improper behavior, so we may want to > figure out why this is getting hit. > > Unfortunately, I am traveling at the moment (waiting to board my flight), so > will have limited connectivity. I'll take a look at this today and the bottom line is we revert the series until a proper fix is found. Fa
Re: [Qemu-devel] [PATCH v7 11/13] xilinx_spips: Don't set TX FIFO UNDERFLOW at cmd done
On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias wrote: > Don't set TX FIFO UNDERFLOW interrupt after done transmiting the commands. after transmitting the commands > Also update interrupts after reading out the interrupt status. > > Signed-off-by: Francisco Iglesias Acked-by: Alistair Francis Alistair > --- > hw/ssi/xilinx_spips.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c > index 7f0f317..159a89d 100644 > --- a/hw/ssi/xilinx_spips.c > +++ b/hw/ssi/xilinx_spips.c > @@ -329,9 +329,6 @@ static void xilinx_spips_flush_txfifo(XilinxSPIPS *s) > uint8_t addr_length; > > if (fifo8_is_empty(&s->tx_fifo)) { > -if (!(s->regs[R_LQSPI_CFG] & LQSPI_CFG_LQ_MODE)) { > -s->regs[R_INTR_STATUS] |= IXR_TX_FIFO_UNDERFLOW; > -} > xilinx_spips_update_ixr(s); > return; > } else if (s->snoop_state == SNOOP_STRIPING) { > @@ -530,6 +527,7 @@ static uint64_t xilinx_spips_read(void *opaque, hwaddr > addr, > ret = s->regs[addr] & IXR_ALL; > s->regs[addr] = 0; > DB_PRINT_L(0, "addr=" TARGET_FMT_plx " = %x\n", addr * 4, ret); > +xilinx_spips_update_ixr(s); > return ret; > case R_INTR_MASK: > mask = IXR_ALL; > -- > 2.9.3 > >
Re: [Qemu-devel] [PATCH v7 10/13] xilinx_spips: Add support for 4 byte addresses in the LQSPI
On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias wrote: > Add support for 4 byte addresses in the LQSPI and correct LQSPI_CFG_SEP_BUS. > > Signed-off-by: Francisco Iglesias Reviewed-by: Alistair Francis Alistair > --- > hw/ssi/xilinx_spips.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c > index 3a98799..7f0f317 100644 > --- a/hw/ssi/xilinx_spips.c > +++ b/hw/ssi/xilinx_spips.c > @@ -92,8 +92,9 @@ > #define R_LQSPI_CFG_RESET 0x03A002EB > #define LQSPI_CFG_LQ_MODE (1U << 31) > #define LQSPI_CFG_TWO_MEM (1 << 30) > -#define LQSPI_CFG_SEP_BUS (1 << 30) > +#define LQSPI_CFG_SEP_BUS (1 << 29) > #define LQSPI_CFG_U_PAGE(1 << 28) > +#define LQSPI_CFG_ADDR4 (1 << 27) > #define LQSPI_CFG_MODE_EN (1 << 25) > #define LQSPI_CFG_MODE_WIDTH8 > #define LQSPI_CFG_MODE_SHIFT16 > @@ -702,6 +703,9 @@ static void lqspi_load_cache(void *opaque, hwaddr addr) > fifo8_push(&s->tx_fifo, s->regs[R_LQSPI_CFG] & LQSPI_CFG_INST_CODE); > /* read address */ > DB_PRINT_L(0, "pushing read address %06x\n", flash_addr); > +if (s->regs[R_LQSPI_CFG] & LQSPI_CFG_ADDR4) { > +fifo8_push(&s->tx_fifo, (uint8_t)(flash_addr >> 24)); > +} > fifo8_push(&s->tx_fifo, (uint8_t)(flash_addr >> 16)); > fifo8_push(&s->tx_fifo, (uint8_t)(flash_addr >> 8)); > fifo8_push(&s->tx_fifo, (uint8_t)flash_addr); > -- > 2.9.3 > >
Re: [Qemu-devel] [PATCH v7 06/13] xilinx_spips: Update striping to be big-endian bit order
On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias wrote: > Update striping functionality to be big-endian bit order (as according to > the Zynq-7000 Technical Reference Manual). Output thereafter the even bits > into the flash memory connected to the lower QSPI bus and the odd bits into > the flash memory connected to the upper QSPI bus. > > Signed-off-by: Francisco Iglesias > --- > hw/ssi/xilinx_spips.c | 19 ++- > 1 file changed, 10 insertions(+), 9 deletions(-) > > diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c > index 559fa79..7accf5d 100644 > --- a/hw/ssi/xilinx_spips.c > +++ b/hw/ssi/xilinx_spips.c > @@ -208,14 +208,14 @@ static void xilinx_spips_reset(DeviceState *d) > xilinx_spips_update_cs_lines(s); > } > > -/* N way (num) in place bit striper. Lay out row wise bits (LSB to MSB) > +/* N way (num) in place bit striper. Lay out row wise bits (MSB to LSB) > * column wise (from element 0 to N-1). num is the length of x, and dir > * reverses the direction of the transform. Best illustrated by example: > * Each digit in the below array is a single bit (num == 3): > * > - * {{ 76543210, } - stripe (dir == false) -> {{ FCheb630, } > - * { hgfedcba, } { GDAfc741, } > - * { HGFEDCBA, }} < upstripe (dir == true) - { HEBgda52, }} > + * {{ 76543210, } - stripe (dir == false) -> {{ 741gdaFC, } > + * { hgfedcba, } { 630fcHEB, } > + * { HGFEDCBA, }} < upstripe (dir == true) - { 52hebGDA, }} > */ > > static inline void stripe8(uint8_t *x, int num, bool dir) > @@ -223,15 +223,15 @@ static inline void stripe8(uint8_t *x, int num, bool > dir) > uint8_t r[num]; > memset(r, 0, sizeof(uint8_t) * num); > int idx[2] = {0, 0}; > -int bit[2] = {0, 0}; > +int bit[2] = {0, 7}; > int d = dir; > > for (idx[0] = 0; idx[0] < num; ++idx[0]) { > -for (bit[0] = 0; bit[0] < 8; ++bit[0]) { > -r[idx[d]] |= x[idx[!d]] & 1 << bit[!d] ? 1 << bit[d] : 0; > +for (bit[0] = 7; bit[0] != -1; bit[0] += -1) { I think this is easier to read: for (bit[0] = 7; bit[0] >= 0; bit[0]--) > +r[idx[!d]] |= x[idx[d]] & 1 << bit[d] ? 1 << bit[!d] : 0; > idx[1] = (idx[1] + 1) % num; > if (!idx[1]) { > -bit[1]++; > +bit[1] += -1; bit[1]-- Otherwise: Acked-by: Alistair Francis Alistair > } > } > } > @@ -266,8 +266,9 @@ static void xilinx_spips_flush_txfifo(XilinxSPIPS *s) > } > > for (i = 0; i < num_effective_busses(s); ++i) { > +int bus = num_effective_busses(s) - 1 - i; > DB_PRINT_L(debug_level, "tx = %02x\n", tx_rx[i]); > -tx_rx[i] = ssi_transfer(s->spi[i], (uint32_t)tx_rx[i]); > +tx_rx[i] = ssi_transfer(s->spi[bus], (uint32_t)tx_rx[i]); > DB_PRINT_L(debug_level, "rx = %02x\n", tx_rx[i]); > } > > -- > 2.9.3 > >
Re: [Qemu-devel] [PATCH v7 05/13] xilinx_spips: Move FlashCMD, XilinxQSPIPS and XilinxSPIPSClass
On Thu, Nov 2, 2017 at 5:01 PM, Francisco Iglesias wrote: > Move the FlashCMD enum, XilinxQSPIPS and XilinxSPIPSClass structures to the > header for consistency (struct XilinxSPIPS is found there). Also move out > a define and remove two dubbel included headers (while touching the code). s/dubbel/double/g > Finally, add 4 byte address commands to the FlashCMD enum. This probably could be a separate patch, but I'm not fussed either way. > > Signed-off-by: Francisco Iglesias Reviewed-by: Alistair Francis Alistair > --- > hw/ssi/xilinx_spips.c | 35 --- > include/hw/ssi/xilinx_spips.h | 34 ++ > 2 files changed, 34 insertions(+), 35 deletions(-) > > diff --git a/hw/ssi/xilinx_spips.c b/hw/ssi/xilinx_spips.c > index ef56d35..559fa79 100644 > --- a/hw/ssi/xilinx_spips.c > +++ b/hw/ssi/xilinx_spips.c > @@ -27,8 +27,6 @@ > #include "sysemu/sysemu.h" > #include "hw/ptimer.h" > #include "qemu/log.h" > -#include "qemu/fifo8.h" > -#include "hw/ssi/ssi.h" > #include "qemu/bitops.h" > #include "hw/ssi/xilinx_spips.h" > #include "qapi/error.h" > @@ -116,44 +114,11 @@ > > /* 16MB per linear region */ > #define LQSPI_ADDRESS_BITS 24 > -/* Bite off 4k chunks at a time */ > -#define LQSPI_CACHE_SIZE 1024 > > #define SNOOP_CHECKING 0xFF > #define SNOOP_NONE 0xFE > #define SNOOP_STRIPING 0 > > -typedef enum { > -READ = 0x3, > -FAST_READ = 0xb, > -DOR = 0x3b, > -QOR = 0x6b, > -DIOR = 0xbb, > -QIOR = 0xeb, > - > -PP = 0x2, > -DPP = 0xa2, > -QPP = 0x32, > -} FlashCMD; > - > -typedef struct { > -XilinxSPIPS parent_obj; > - > -uint8_t lqspi_buf[LQSPI_CACHE_SIZE]; > -hwaddr lqspi_cached_addr; > -Error *migration_blocker; > -bool mmio_execution_enabled; > -} XilinxQSPIPS; > - > -typedef struct XilinxSPIPSClass { > -SysBusDeviceClass parent_class; > - > -const MemoryRegionOps *reg_ops; > - > -uint32_t rx_fifo_size; > -uint32_t tx_fifo_size; > -} XilinxSPIPSClass; > - > static inline int num_effective_busses(XilinxSPIPS *s) > { > return (s->regs[R_LQSPI_CFG] & LQSPI_CFG_SEP_BUS && > diff --git a/include/hw/ssi/xilinx_spips.h b/include/hw/ssi/xilinx_spips.h > index 06aa096..7f9e2fc 100644 > --- a/include/hw/ssi/xilinx_spips.h > +++ b/include/hw/ssi/xilinx_spips.h > @@ -32,6 +32,22 @@ typedef struct XilinxSPIPS XilinxSPIPS; > > #define XLNX_SPIPS_R_MAX(0x100 / 4) > > +/* Bite off 4k chunks at a time */ > +#define LQSPI_CACHE_SIZE 1024 > + > +typedef enum { > +READ = 0x3, READ_4 = 0x13, > +FAST_READ = 0xb,FAST_READ_4 = 0x0c, > +DOR = 0x3b, DOR_4 = 0x3c, > +QOR = 0x6b, QOR_4 = 0x6c, > +DIOR = 0xbb,DIOR_4 = 0xbc, > +QIOR = 0xeb,QIOR_4 = 0xec, > + > +PP = 0x2, PP_4 = 0x12, > +DPP = 0xa2, > +QPP = 0x32, QPP_4 = 0x34, > +} FlashCMD; > + > struct XilinxSPIPS { > SysBusDevice parent_obj; > > @@ -56,6 +72,24 @@ struct XilinxSPIPS { > uint32_t regs[XLNX_SPIPS_R_MAX]; > }; > > +typedef struct { > +XilinxSPIPS parent_obj; > + > +uint8_t lqspi_buf[LQSPI_CACHE_SIZE]; > +hwaddr lqspi_cached_addr; > +Error *migration_blocker; > +bool mmio_execution_enabled; > +} XilinxQSPIPS; > + > +typedef struct XilinxSPIPSClass { > +SysBusDeviceClass parent_class; > + > +const MemoryRegionOps *reg_ops; > + > +uint32_t rx_fifo_size; > +uint32_t tx_fifo_size; > +} XilinxSPIPSClass; > + > #define TYPE_XILINX_SPIPS "xlnx.ps7-spi" > #define TYPE_XILINX_QSPIPS "xlnx.ps7-qspi" > > -- > 2.9.3 > >
Re: [Qemu-devel] [PATCH v7 00/13] Add support for the ZynqMP Generic QSPI
On Tue, Nov 21, 2017 at 10:39 AM, Peter Maydell wrote: > On 3 November 2017 at 00:00, Francisco Iglesias > wrote: >> Hi, >> >> This patch series is an attempt to add support for the ZynqMP QSPI >> (consisting >> of the Generic QSPI and the legacy QSPI) to the xlnx-zcu102 board and connect >> Numonyx n25q512a11 flashes to the QSPI. Also some functionality is added to >> m25p80. >> >> The series starts by adding support in m25p80 for continous read out of >> status >> registers, SST flash READ ID commands, bank address register accesses, bulk >> erase (0x60) and two Numonyx flashes (n25q512a11 and n25q512a13). Thereafter >> it >> updates the striping behaviour to be bit big endiann in the Xilinx QSPI model >> and adds support for RX discard, zero pumping according transfer register >> and 4 >> byte LQSPI addresses. Finally it adds support for the ZynqMP Generic QSPI and >> adds the ZynqMP QSPI to the xlnx-zcu102 board. >> >> Best regards, >> Francisco Iglesias > > Hi; just a note to say that I'm assuming the Xilinx folk are going > to review the xilinx_spips patches in this set... Yep, we will! Alistair > > thanks > -- PMM >
Re: [Qemu-devel] [PATCH 3/5] nbd/server: add helper nbd_opt_invalid
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/server.c | 74 > +--- > 1 file changed, 46 insertions(+), 28 deletions(-) > > @@ -418,8 +447,8 @@ static int nbd_negotiate_handle_info(NBDClient *client, > uint16_t myflags, > be16_to_cpus(&requests); > trace_nbd_negotiate_handle_info_requests(requests); > if (requests != client->optlen / sizeof(request)) { > -msg = "incorrect number of requests for overall length"; > -goto invalid; > +return nbd_opt_invalid( > +client, errp, "incorrect number of requests for overall length"); Nice that you are fixing the typo of the double space in the error message. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH qemu] vfio/spapr: Allow fallback to SPAPR TCE IOMMU v1
On Wed, Nov 22, 2017 at 04:15:52PM +1100, Alexey Kardashevskiy wrote: > The vfio_iommu_spapr_tce driver always advertises v1 and v2 IOMMU support, > however PR KVM (a special version of KVM designed to work in > a paravirtualized system; these days used for nested virtualizaion) only > supports the "pseries" platform which does not support v2. Since there is > no way to choose the IOMMU version in QEMU, it fails to start. > > This adds a fallback to the v1 IOMMU if v2 cannot be used. > > Signed-off-by: Alexey Kardashevskiy The fallback itself isn't a bad idea, but your commit message contains several inaccurracies. KVM PR is not particularly designed to work in a paravirtualized system, and it doesn't only support the pseries platform (as guest *or* host). It's actually a lot more general than KVM HV - just slow, not that well tested and missing a number of features that no-one's bothered to port to it. > --- > hw/vfio/common.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > index 7b2924c..cd81cc9 100644 > --- a/hw/vfio/common.c > +++ b/hw/vfio/common.c > @@ -1040,6 +1040,11 @@ static int vfio_connect_container(VFIOGroup *group, > AddressSpace *as, > v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU; > ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type); > if (ret) { > +container->iommu_type = VFIO_SPAPR_TCE_IOMMU; > +v2 = false; > +ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type); > +} > +if (ret) { > error_setg_errno(errp, errno, "failed to set iommu for > container"); > ret = -errno; > goto free_container_exit; -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH] tcg: Fix complilation with TCG
On Wed, Nov 22, 2017 at 09:32:04 -0300, Philippe Mathieu-Daudé wrote: > On 11/22/2017 05:41 AM, Juan Quintela wrote: > > This commit started use tb_unlock() and tlb_set_dirty() on non TCG > > code. Add the function as stubs. > > > > commit 27266271977c5a30f2f7d493e042be1897827bdd > > Author: Peter Maydell > > Date: Mon Nov 20 18:08:27 2017 + > > > > exec.c: Factor out before/after actions for notdirty memory writes > > > > > > Maybe pasting the sha-1 is enough "Since 272662719 those stubs are missing". Paolo's "git whatis" alias comes to mind: https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg00972.html E.
[Qemu-devel] [PATCH v3 7/7] s390x/pci: search for subregion inside the BARs
When dispatching memory access to PCI BAR region, we must look for possible subregions, used by the PCI device to map different memory areas inside the same PCI BAR. Since the data offset we received is calculated starting at the region start address we need to adjust the offset for the subregion. The data offset inside the subregion is calculated by substracting the subregion's starting address from the data offset in the region. The access to the MSIX region is now handled in a generic way, we do not need the specific trap_msix() function anymore. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-inst.c | 44 +--- 1 file changed, 25 insertions(+), 19 deletions(-) diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index 8d35f8f..fae3ef7 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -345,12 +345,31 @@ static int zpci_endian_swap(uint64_t *ptr, uint8_t len) return 0; } +static MemoryRegion *s390_get_subregion(MemoryRegion *mr, uint64_t offset, +uint8_t len) +{ +MemoryRegion *other; +uint64_t subregion_size; + +QTAILQ_FOREACH(other, &mr->subregions, subregions_link) { +subregion_size = int128_get64(other->size); +if ((offset >= other->addr) && +(offset + len) <= (other->addr + subregion_size)) { +mr = other; +break; +} +} +return mr; +} + static MemTxResult zpci_read_bar(S390PCIBusDevice *pbdev, uint8_t pcias, uint64_t offset, uint64_t *data, uint8_t len) { MemoryRegion *mr; mr = pbdev->pdev->io_regions[pcias].memory; +mr = s390_get_subregion(mr, offset, len); +offset -= mr->addr; return memory_region_dispatch_read(mr, offset, data, len, MEMTXATTRS_UNSPECIFIED); } @@ -442,30 +461,14 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) return 0; } -static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) -{ -if (pbdev->msix.available && pbdev->msix.table_bar == pcias && -offset >= pbdev->msix.table_offset && -offset < (pbdev->msix.table_offset + - pbdev->msix.entries * PCI_MSIX_ENTRY_SIZE)) { -return 1; -} else { -return 0; -} -} - static MemTxResult zpci_write_bar(S390PCIBusDevice *pbdev, uint8_t pcias, uint64_t offset, uint64_t data, uint8_t len) { MemoryRegion *mr; -if (trap_msix(pbdev, offset, pcias)) { -offset = offset - pbdev->msix.table_offset; -mr = &pbdev->pdev->msix_table_mmio; -} else { -mr = pbdev->pdev->io_regions[pcias].memory; -} - +mr = pbdev->pdev->io_regions[pcias].memory; +mr = s390_get_subregion(mr, offset, len); +offset -= mr->addr; return memory_region_dispatch_write(mr, offset, data, len, MEMTXATTRS_UNSPECIFIED); } @@ -726,6 +729,9 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, } mr = pbdev->pdev->io_regions[pcias].memory; +mr = s390_get_subregion(mr, offset, len); +offset -= mr->addr; + if (!memory_region_access_valid(mr, offset, len, true)) { program_interrupt(env, PGM_OPERAND, 6); return 0; -- 2.7.4
[Qemu-devel] [PATCH v3 6/7] s390x/pci: move the memory region write from pcistg
Let's move the memory region write from pcistg into a dedicated function. This allows us to prepare a later patch searching for subregions inside of the memory region. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-inst.c | 27 +-- 1 file changed, 17 insertions(+), 10 deletions(-) diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index 69ff7b8..8d35f8f 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -454,12 +454,27 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t offset, uint8_t pcias) } } +static MemTxResult zpci_write_bar(S390PCIBusDevice *pbdev, uint8_t pcias, + uint64_t offset, uint64_t data, uint8_t len) +{ +MemoryRegion *mr; + +if (trap_msix(pbdev, offset, pcias)) { +offset = offset - pbdev->msix.table_offset; +mr = &pbdev->pdev->msix_table_mmio; +} else { +mr = pbdev->pdev->io_regions[pcias].memory; +} + +return memory_region_dispatch_write(mr, offset, data, len, +MEMTXATTRS_UNSPECIFIED); +} + int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) { CPUS390XState *env = &cpu->env; uint64_t offset, data; S390PCIBusDevice *pbdev; -MemoryRegion *mr; MemTxResult result; uint8_t len; uint32_t fh; @@ -516,15 +531,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) return 0; } -if (trap_msix(pbdev, offset, pcias)) { -offset = offset - pbdev->msix.table_offset; -mr = &pbdev->pdev->msix_table_mmio; -} else { -mr = pbdev->pdev->io_regions[pcias].memory; -} - -result = memory_region_dispatch_write(mr, offset, data, len, - MEMTXATTRS_UNSPECIFIED); +result = zpci_write_bar(pbdev, pcias, offset, data, len); if (result != MEMTX_OK) { program_interrupt(env, PGM_OPERAND, 4); return 0; -- 2.7.4
[Qemu-devel] [PATCH v3 2/7] s390x/pci: rework PCI STORE
Enhance the fault detection, correction of the fault reporting. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-inst.c | 39 ++- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index 3e1f1a0..930c197 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -470,6 +470,12 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) pcias = (env->regs[r2] >> 16) & 0xf; len = env->regs[r2] & 0xf; offset = env->regs[r2 + 1]; +data = env->regs[r1]; + +if (!(fh & FH_MASK_ENABLE)) { +setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); +return 0; +} pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh); if (!pbdev) { @@ -479,12 +485,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) } switch (pbdev->state) { -case ZPCI_FS_RESERVED: -case ZPCI_FS_STANDBY: -case ZPCI_FS_DISABLED: case ZPCI_FS_PERMANENT_ERROR: -setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); -return 0; case ZPCI_FS_ERROR: setcc(cpu, ZPCI_PCI_LS_ERR); s390_set_status_code(env, r2, ZPCI_PCI_ST_BLOCKED); @@ -493,9 +494,13 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) break; } -data = env->regs[r1]; -if (pcias < 6) { -if ((8 - (offset & 0x7)) < len) { +switch (pcias) { +/* A ZPCI PCI card may use any BAR from BAR 0 to BAR 5 */ +case 0 ... 5: +/* Check length: + * A length of 0 is invalid and length should not cross a double word + */ +if (!len || (len > (8 - (offset & 0x7 { program_interrupt(env, PGM_OPERAND, 4); return 0; } @@ -513,21 +518,21 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) program_interrupt(env, PGM_OPERAND, 4); return 0; } -} else if (pcias == 15) { -if ((4 - (offset & 0x3)) < len) { -program_interrupt(env, PGM_OPERAND, 4); -return 0; -} - -if (zpci_endian_swap(&data, len)) { +break; +case 15: +/* ZPCI uses the pseudo BAR number 15 as configuration space */ +/* possible access lengths are 1,2,4 and must not cross a word */ +if (!len || (len > (4 - (offset & 0x3))) || len == 3) { program_interrupt(env, PGM_OPERAND, 4); return 0; } - +/* len = 1,2,4 so we do not need to test */ +zpci_endian_swap(&data, len); pci_host_config_write_common(pbdev->pdev, offset, pci_config_size(pbdev->pdev), data, len); -} else { +break; +default: DPRINTF("pcistg invalid space\n"); setcc(cpu, ZPCI_PCI_LS_ERR); s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS); -- 2.7.4
[Qemu-devel] [PATCH v3 1/7] s390x/pci: factor out endianess conversion
There are two places where the same endianness conversion is done. Let's factor this out into a static function. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-inst.c | 59 +++- 1 file changed, 33 insertions(+), 26 deletions(-) diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index 8e088f3..3e1f1a0 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -314,6 +314,36 @@ out: return 0; } +/** + * Swap data contained in s390x big endian registers to little endian + * PCI bars. + * + * @ptr: a pointer to a uint64_t data field + * @len: the length of the valid data, must be 1,2,4 or 8 + */ +static int zpci_endian_swap(uint64_t *ptr, uint8_t len) +{ +uint64_t data = *ptr; + +switch (len) { +case 1: +break; +case 2: +data = bswap16(data); +break; +case 4: +data = bswap32(data); +break; +case 8: +data = bswap64(data); +break; +default: +return -EINVAL; +} +*ptr = data; +return 0; +} + int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) { CPUS390XState *env = &cpu->env; @@ -385,19 +415,7 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) data = pci_host_config_read_common( pbdev->pdev, offset, pci_config_size(pbdev->pdev), len); -switch (len) { -case 1: -break; -case 2: -data = bswap16(data); -break; -case 4: -data = bswap32(data); -break; -case 8: -data = bswap64(data); -break; -default: +if (zpci_endian_swap(&data, len)) { program_interrupt(env, PGM_OPERAND, 4); return 0; } @@ -500,19 +518,8 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) program_interrupt(env, PGM_OPERAND, 4); return 0; } -switch (len) { -case 1: -break; -case 2: -data = bswap16(data); -break; -case 4: -data = bswap32(data); -break; -case 8: -data = bswap64(data); -break; -default: + +if (zpci_endian_swap(&data, len)) { program_interrupt(env, PGM_OPERAND, 4); return 0; } -- 2.7.4
[Qemu-devel] [PATCH v3 5/7] s390x/pci: move the memory region read from pcilg
Let's move the memory region read from pcilg into a dedicated function. This allows us to prepare a later patch. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-inst.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index fe6e042..69ff7b8 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -345,13 +345,22 @@ static int zpci_endian_swap(uint64_t *ptr, uint8_t len) return 0; } +static MemTxResult zpci_read_bar(S390PCIBusDevice *pbdev, uint8_t pcias, + uint64_t offset, uint64_t *data, uint8_t len) +{ +MemoryRegion *mr; + +mr = pbdev->pdev->io_regions[pcias].memory; +return memory_region_dispatch_read(mr, offset, data, len, + MEMTXATTRS_UNSPECIFIED); +} + int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) { CPUS390XState *env = &cpu->env; S390PCIBusDevice *pbdev; uint64_t offset; uint64_t data; -MemoryRegion *mr; MemTxResult result; uint8_t len; uint32_t fh; @@ -402,9 +411,7 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) program_interrupt(env, PGM_OPERAND, 4); return 0; } -mr = pbdev->pdev->io_regions[pcias].memory; -result = memory_region_dispatch_read(mr, offset, &data, len, - MEMTXATTRS_UNSPECIFIED); +result = zpci_read_bar(pbdev, pcias, offset, &data, len); if (result != MEMTX_OK) { program_interrupt(env, PGM_OPERAND, 4); return 0; -- 2.7.4
[Qemu-devel] [PATCH v3 0/7] s390x/pci: Improve zPCI to cover more cases
This patch fixes the following BUG: Even a guest is able to detect virtio_pci device, the init function the Linux virtio_pci driver will hang because zPCI does not support the subregions used by virtio_pci. It follows that right now the PCI support is very limited (e.g. pass through of a host vfio device) To enable features like virtio-pci several modifications needs to be done. As already stated above, Virtio-PCI uses subregions, which may eventually be discontinuous inside bars instead of a single flat region often used by real devices. The address offset being formerly calculated from the BAR base address must be adapted to the subregions instead of to the single region. This patch provides the new calculation for the three kind of BAR access, zPCI STORE, zPCI LOAD and zPCI STORE BLOCK done by zPCI. We use the opportunity to - enhance the fault detection for zPCI STORE and LOAD, - enhance the fault detection and to provide the maximum STORE BLOCK block size, maxstbl, for zPCI STORE BLOCK - factor out part of the code used to calculate the offset and access the BARs, - factor out the code for endianess conversion. Pierre Morel (7): s390x/pci: factor out endianess conversion s390x/pci: rework PCI STORE s390x/pci: rework PCI LOAD s390x/pci: rework PCI STORE BLOCK s390x/pci: move the memory region read from pcilg s390x/pci: move the memory region write from pcistg s390x/pci: search for subregion inside the BARs hw/s390x/s390-pci-bus.h | 1 + hw/s390x/s390-pci-inst.c | 248 --- hw/s390x/s390-pci-inst.h | 2 +- 3 files changed, 151 insertions(+), 100 deletions(-) -- 2.7.4 Changelog: from v2 - rewording of comments, coding style. from v1 - suppress fallthrough to PCI_ROM_SLOT to handle it in the default case. - reword most of the patch commit messages - add comments to the endianness conversion - reword somme comments inside the patches
[Qemu-devel] [PATCH v3 4/7] s390x/pci: rework PCI STORE BLOCK
Enhance the fault detection. Fixup the precedence to check the destination path existance before checking for the source accessibility. Add the maxstbl entry to both the Query PCI Function Group response and the PCIBusDevice structure. Initialize the maxstbl to 128 per default until we get the actual data from the hardware. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-bus.h | 1 + hw/s390x/s390-pci-inst.c | 63 ++-- hw/s390x/s390-pci-inst.h | 2 +- 3 files changed, 41 insertions(+), 25 deletions(-) diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h index 560bd82..2993f0d 100644 --- a/hw/s390x/s390-pci-bus.h +++ b/hw/s390x/s390-pci-bus.h @@ -284,6 +284,7 @@ struct S390PCIBusDevice { uint64_t fmb_addr; uint8_t isc; uint16_t noi; +uint16_t maxstbl; uint8_t sum; S390MsixInfo msix; AdapterRoutes routes; diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index 2295412..fe6e042 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -294,6 +294,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2) stq_p(&resgrp->msia, ZPCI_MSI_ADDR); stw_p(&resgrp->mui, 0); stw_p(&resgrp->i, 128); +stw_p(&resgrp->maxstbl, 128); resgrp->version = 0; stw_p(&resgrp->hdr.rsp, CLP_RC_OK); @@ -645,6 +646,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, S390PCIBusDevice *pbdev; MemoryRegion *mr; MemTxResult result; +uint64_t offset; int i; uint32_t fh; uint8_t pcias; @@ -659,22 +661,10 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, fh = env->regs[r1] >> 32; pcias = (env->regs[r1] >> 16) & 0xf; len = env->regs[r1] & 0xff; +offset = env->regs[r3]; -if (pcias > 5) { -DPRINTF("pcistb invalid space\n"); -setcc(cpu, ZPCI_PCI_LS_ERR); -s390_set_status_code(env, r1, ZPCI_PCI_ST_INVAL_AS); -return 0; -} - -switch (len) { -case 16: -case 32: -case 64: -case 128: -break; -default: -program_interrupt(env, PGM_SPECIFICATION, 6); +if (!(fh & FH_MASK_ENABLE)) { +setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); return 0; } @@ -686,12 +676,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, } switch (pbdev->state) { -case ZPCI_FS_RESERVED: -case ZPCI_FS_STANDBY: -case ZPCI_FS_DISABLED: case ZPCI_FS_PERMANENT_ERROR: -setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); -return 0; case ZPCI_FS_ERROR: setcc(cpu, ZPCI_PCI_LS_ERR); s390_set_status_code(env, r1, ZPCI_PCI_ST_BLOCKED); @@ -700,8 +685,34 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, break; } +if (pcias > 5) { +DPRINTF("pcistb invalid space\n"); +setcc(cpu, ZPCI_PCI_LS_ERR); +s390_set_status_code(env, r1, ZPCI_PCI_ST_INVAL_AS); +return 0; +} + +/* Verify the address, offset and length */ +/* offset must be a multiple of 8 */ +if (offset % 8) { +goto addressing_error; +} +/* Length must be greater than 8, a multiple of 8 */ +/* and not greater than maxstbl */ +if ((len <= 8) || (len % 8) || (len > pbdev->maxstbl)) { +goto addressing_error; +} +/* Do not cross a 4K-byte boundary */ +if (((offset & 0xfff) + len) > 0x1000) { +goto addressing_error; +} +/* Guest address must be double word aligned */ +if (gaddr & 0x07UL) { +goto addressing_error; +} + mr = pbdev->pdev->io_regions[pcias].memory; -if (!memory_region_access_valid(mr, env->regs[r3], len, true)) { +if (!memory_region_access_valid(mr, offset, len, true)) { program_interrupt(env, PGM_OPERAND, 6); return 0; } @@ -711,9 +722,9 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, } for (i = 0; i < len / 8; i++) { -result = memory_region_dispatch_write(mr, env->regs[r3] + i * 8, - ldq_p(buffer + i * 8), 8, - MEMTXATTRS_UNSPECIFIED); +result = memory_region_dispatch_write(mr, offset + i * 8, + ldq_p(buffer + i * 8), 8, + MEMTXATTRS_UNSPECIFIED); if (result != MEMTX_OK) { program_interrupt(env, PGM_OPERAND, 6); return 0; @@ -722,6 +733,10 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, setcc(cpu, ZPCI_PCI_LS_OK); return 0; + +addressing_error: +program_interrupt(env, PGM_SPECIFICATION, 6); +return 0; } static int reg_irqs(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib) diff --git a/hw/s390x/s390-pci-inst.h b/
[Qemu-devel] [PATCH v3 3/7] s390x/pci: rework PCI LOAD
Enhance the fault detection, correction of the fault reporting. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-inst.c | 25 ++--- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c index 930c197..2295412 100644 --- a/hw/s390x/s390-pci-inst.c +++ b/hw/s390x/s390-pci-inst.c @@ -373,6 +373,11 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) len = env->regs[r2] & 0xf; offset = env->regs[r2 + 1]; +if (!(fh & FH_MASK_ENABLE)) { +setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); +return 0; +} + pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh); if (!pbdev) { DPRINTF("pcilg no pci dev\n"); @@ -381,12 +386,7 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) } switch (pbdev->state) { -case ZPCI_FS_RESERVED: -case ZPCI_FS_STANDBY: -case ZPCI_FS_DISABLED: case ZPCI_FS_PERMANENT_ERROR: -setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); -return 0; case ZPCI_FS_ERROR: setcc(cpu, ZPCI_PCI_LS_ERR); s390_set_status_code(env, r2, ZPCI_PCI_ST_BLOCKED); @@ -395,8 +395,9 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) break; } -if (pcias < 6) { -if ((8 - (offset & 0x7)) < len) { +switch (pcias) { +case 0 ... 5: +if (!len || (len > (8 - (offset & 0x7 { program_interrupt(env, PGM_OPERAND, 4); return 0; } @@ -407,8 +408,9 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) program_interrupt(env, PGM_OPERAND, 4); return 0; } -} else if (pcias == 15) { -if ((4 - (offset & 0x3)) < len) { +break; +case 15: +if (!len || (len > (4 - (offset & 0x3))) || len == 3) { program_interrupt(env, PGM_OPERAND, 4); return 0; } @@ -419,8 +421,9 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) program_interrupt(env, PGM_OPERAND, 4); return 0; } -} else { -DPRINTF("invalid space\n"); +break; +default: +DPRINTF("pcilg invalid space\n"); setcc(cpu, ZPCI_PCI_LS_ERR); s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS); return 0; -- 2.7.4
Re: [Qemu-devel] [PATCH 5/5] nbd/server: structurize option reply sending
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/server.c | 40 +--- > 1 file changed, 13 insertions(+), 27 deletions(-) > Nice diffstat; shows that this one probably makes sense (even if it will need rebasing on top of the planned churn for the rest of the series). > diff --git a/nbd/server.c b/nbd/server.c > index 79b937d88f..975fe8efe9 100644 > --- a/nbd/server.c > +++ b/nbd/server.c > @@ -152,43 +152,29 @@ static inline int nbd_opt_drop(NBDClient *client, > size_t size, Error **errp) > return nbd_drop(client->ioc, size, errp); > } > > +static inline void set_be_option_rep(NBDOptionReply *rep, uint32_t option, > + uint32_t type, uint32_t length) 'static inline' is probably pointless; the compiler already knows what static functions are ideal to inline, without us having to name it (I realize you've already done this in other patches, so I probably ought to post a cleanup patch that removes 'inline'). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 3/5] nbd/server: add helper nbd_opt_invalid
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/server.c | 74 > +--- > 1 file changed, 46 insertions(+), 28 deletions(-) > > +/* nbd_opt_invalid > + * Drop reminded option data and reply with NBD_REP_ERR_INVALID s/reminded/the remainder of/ In fact, if we generalize this just a bit more, and let the caller choose whether to use NBD_REP_ERR_INVALID or NBD_REP_ERR_TLS_REQD or NBD_REP_ERR_UNSUP, then we can merge this functionality directly into nbd_opt_drop() instead of adding another function. I guess I'll have to merge that into the counter-proposal I have in mind (but at this point, the counter-proposal won't be posted any sooner than 2.11-rc3, in part because I'm almost on my Thanksgiving vacation). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 4/5] nbd: rename nbd_option and nbd_opt_reply
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Rename nbd_optino and nbd_opt_reply to NBDOption and NBDOptionReply s/optino/option/ > to correspond to Qemu coding style and other structures here. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/nbd.h | 8 > nbd/client.c| 12 ++-- > 2 files changed, 10 insertions(+), 10 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 2/7] s390x/pci: rework PCI STORE
On 22/11/2017 14:25, Cornelia Huck wrote: On Tue, 21 Nov 2017 19:03:17 +0100 Pierre Morel wrote: On 21/11/2017 15:25, Cornelia Huck wrote: On Tue, 21 Nov 2017 11:38:45 +0100 Cornelia Huck wrote: On Thu, 16 Nov 2017 18:51:50 +0100 Pierre Morel wrote: @@ -493,9 +494,13 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) break; } -data = env->regs[r1]; -if (pcias < 6) { -if ((8 - (offset & 0x7)) < len) { +/* Test that the pcias is valid and do the appropriates operations */ +switch (pcias) { +case 0 ... 5: Make this case 0...5: ? ...only that it confuses the compiler when using numbers. We can either I did not see this as I replied to the previous email, sorry. keep the slightly ugly version, or introduce #defines. ZPCI_PCIAS_IOREGION_{MIN,MAX} (and maybe ZPCI_PCIAS_CONFIG for 15)? I agree something is missing. But I am not sure that a #define brings clarity. I would prefer to add a comment. /* A ZPCI PCI card may use any BAR from BAR 0 to BAR 5 */ ? It's more a speaking value vs. magic numbers thing. A comment does not hurt either, though :) (And we get rid of the spacing as well.) OK, thanks. + * A length of 0 is invalid and length should not cross a double word + */ +if (!len || (len > (8 - (offset & 0x7 { program_interrupt(env, PGM_OPERAND, 4); return 0; } @@ -513,21 +518,21 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) program_interrupt(env, PGM_OPERAND, 4); return 0; } -} else if (pcias == 15) { -if ((4 - (offset & 0x3)) < len) { -program_interrupt(env, PGM_OPERAND, 4); -return 0; -} - -if (zpci_endian_swap(&data, len)) { +break; +case 15: +/* ZPCI uses the pseudo BAR number 15 as configuration space */ +/* possible access lengths are 1,2,4 and must not cross a word */ +if (!len || (len > (4 - (offset & 0x3))) || len == 3) { program_interrupt(env, PGM_OPERAND, 4); return 0; } - +/* len = 1,2,4 so we do not need to test */ +zpci_endian_swap(&data, len); pci_host_config_write_common(pbdev->pdev, offset, pci_config_size(pbdev->pdev), data, len); -} else { +break; +default: DPRINTF("pcistg invalid space\n"); setcc(cpu, ZPCI_PCI_LS_ERR); s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS); Other than the nits, looks good. -- Pierre Morel Linux/KVM/QEMU in Böblingen - Germany
Re: [Qemu-devel] [PATCH v3 29/30] i.MX: Add i.MX7 SOC implementation.
On Wed, Nov 22, 2017 at 7:34 AM, Igor Mammedov wrote: > On Mon, 6 Nov 2017 07:48:12 -0800 > Andrey Smirnov wrote: > >> The following interfaces are partially or fully emulated: >> >> * up to 2 Cortex A9 cores (SMP works with PSCI) >> * A7 MPCORE (identical to A15 MPCORE) >> * 4 GPTs modules >> * 7 GPIO controllers >> * 2 IOMUXC controllers >> * 1 CCM module >> * 1 SVNS module >> * 1 SRC module >> * 1 GPCv2 controller >> * 4 eCSPI controllers >> * 4 I2C controllers >> * 7 i.MX UART controllers >> * 2 FlexCAN controllers >> * 2 Ethernet controllers (FEC) >> * 3 SD controllers (USDHC) >> * 4 WDT modules >> * 1 SDMA module >> * 1 GPR module >> * 2 USBMISC modules >> * 2 ADC modules >> * 1 PCIe controller >> >> Tested to boot and work with upstream Linux (4.13+) guest. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- > ... >> + >> +static void fsl_imx7_init(Object *obj) >> +{ >> +BusState *sysbus = sysbus_get_default(); >> +FslIMX7State *s = FSL_IMX7(obj); >> +char name[NAME_SIZE]; >> +int i; >> + >> +if (smp_cpus > FSL_IMX7_NUM_CPUS) { >> +error_report("%s: Only %d CPUs are supported (%d requested)", >> + TYPE_FSL_IMX7, FSL_IMX7_NUM_CPUS, smp_cpus); >> +exit(1); >> +} >> + >> +for (i = 0; i < smp_cpus; i++) { >> +object_initialize(&s->cpu[i], sizeof(s->cpu[i]), >> + "cortex-a7-" TYPE_ARM_CPU); > pls reuse ARM_CPU_TYPE_NAME() macro here Will do in v4. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH v3 19/30] i.MX: Add code to emulate SDMA IP block
On Tue, Nov 21, 2017 at 10:20 AM, Peter Maydell wrote: > On 6 November 2017 at 15:48, Andrey Smirnov wrote: >> Add minimal code needed to allow upstream Linux guest to boot. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- >> hw/dma/Makefile.objs | 1 + >> hw/dma/imx_sdma.c | 99 >> +++ >> include/hw/dma/imx_sdma.h | 22 +++ >> 3 files changed, 122 insertions(+) >> create mode 100644 hw/dma/imx_sdma.c >> create mode 100644 include/hw/dma/imx_sdma.h >> > > Does Linux really insist on reads-as-written behaviour? > (ie can you get away with just using > create_unimplemented_device() ?) > Not sure. I'll give it a try for v4. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH v3 14/30] i.MX: Add code to emulate i.MX2 watchdog IP block
On Tue, Nov 21, 2017 at 10:10 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> Add enough code to emulate i.MX2 watchdog IP block so it would be >> possible to reboot the machine running Linux Guest. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- >> hw/misc/Makefile.objs | 1 + >> hw/misc/imx2_wdt.c | 88 >> ++ >> include/hw/misc/imx2_wdt.h | 34 ++ >> 3 files changed, 123 insertions(+) >> create mode 100644 hw/misc/imx2_wdt.c >> create mode 100644 include/hw/misc/imx2_wdt.h >> >> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs >> index ac1be05a03..c393a93456 100644 >> --- a/hw/misc/Makefile.objs >> +++ b/hw/misc/Makefile.objs >> @@ -35,6 +35,7 @@ obj-$(CONFIG_IMX) += imx25_ccm.o >> obj-$(CONFIG_IMX) += imx6_ccm.o >> obj-$(CONFIG_IMX) += imx6_src.o >> obj-$(CONFIG_IMX) += imx7_ccm.o >> +obj-$(CONFIG_IMX) += imx2_wdt.o >> obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o >> obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o >> obj-$(CONFIG_MAINSTONE) += mst_fpga.o >> diff --git a/hw/misc/imx2_wdt.c b/hw/misc/imx2_wdt.c >> new file mode 100644 >> index 00..3a1c33aa51 >> --- /dev/null >> +++ b/hw/misc/imx2_wdt.c >> @@ -0,0 +1,88 @@ >> +/* >> + * Copyright (c) 2017, Impinj, Inc. >> + * >> + * i.MX2 Watchdog IP block >> + * >> + * Author: Andrey Smirnov >> + * >> + * This work is licensed under the terms of the GNU GPL, version 2 or later. >> + * See the COPYING file in the top-level directory. >> + */ >> + >> +#include "qemu/osdep.h" >> +#include "sysemu/watchdog.h" >> + >> +#include "hw/misc/imx2_wdt.h" >> + >> +#define IMX2_WDT_WCR_WDABIT(5) /* -> External Reset WDOG_B */ >> +#define IMX2_WDT_WCR_SRSBIT(4) /* -> Software Reset Signal */ >> + >> +static uint64_t imx2_wdt_read(void *opaque, hwaddr addr, >> + unsigned int size) >> +{ >> +return 0; >> +} >> + >> +static void imx2_wdt_write(void *opaque, hwaddr addr, >> + uint64_t value, unsigned int size) >> +{ >> +if (addr == IMX2_WDT_WCR && >> +(value & (IMX2_WDT_WCR_WDA | IMX2_WDT_WCR_SRS))) { >> +watchdog_perform_action(); >> +} >> +} >> + >> +static const MemoryRegionOps imx2_wdt_ops = { >> +.read = imx2_wdt_read, >> +.write = imx2_wdt_write, >> +.endianness = DEVICE_NATIVE_ENDIAN, >> +.impl = { >> +/* >> + * Our device would not work correctly if the guest was doing >> + * unaligned access. This might not be a limitation on the >> + * real device but in practice there is no reason for a guest >> + * to access this device unaligned. >> + */ >> +.min_access_size = 4, >> +.max_access_size = 4, >> +.unaligned = false, >> +}, >> +}; >> + >> +static void imx2_wdt_realize(DeviceState *dev, Error **errp) >> +{ >> +IMX2WdtState *s = IMX2_WDT(dev); >> + >> +memory_region_init_io(&s->mmio, OBJECT(dev), >> + &imx2_wdt_ops, s, >> + TYPE_IMX2_WDT".mmio", >> + IMX2_WDT_REG_NUM * sizeof(uint16_t)); >> +sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->mmio); >> +} >> + >> +static void imx2_wdt_class_init(ObjectClass *klass, void *data) >> +{ >> +DeviceClass *dc = DEVICE_CLASS(klass); >> + >> +dc->realize = imx2_wdt_realize; >> +set_bit(DEVICE_CATEGORY_MISC, dc->categories); >> +} >> + >> +static const TypeInfo imx2_wdt_info = { >> +.name = TYPE_IMX2_WDT, >> +.parent= TYPE_SYS_BUS_DEVICE, >> +.instance_size = sizeof(IMX2WdtState), >> +.class_init= imx2_wdt_class_init, >> +}; >> + >> +static WatchdogTimerModel model = { >> +.wdt_name = "imx2-watchdog", >> +.wdt_description = "i.MX2 Watchdog", >> +}; >> + >> +static void imx2_wdt_register_type(void) >> +{ >> +watchdog_add_model(&model); >> +type_register_static(&imx2_wdt_info); >> +} >> +type_init(imx2_wdt_register_type) >> diff --git a/include/hw/misc/imx2_wdt.h b/include/hw/misc/imx2_wdt.h >> new file mode 100644 >> index 00..e67ac6939d >> --- /dev/null >> +++ b/include/hw/misc/imx2_wdt.h >> @@ -0,0 +1,34 @@ >> +/* >> + * Copyright (c) 2017, Impinj, Inc. >> + * >> + * i.MX2 Watchdog IP block >> + * >> + * Author: Andrey Smirnov >> + * >> + * This work is licensed under the terms of the GNU GPL, version 2 or later. >> + * See the COPYING file in the top-level directory. >> + */ >> + >> +#ifndef IMX2_WDT_H >> +#define IMX2_WDT_H >> + >> +#include "qemu/bitops.h" >> +#include "hw/sysbus.h" > > The bitops.h include should be in the .c file, not here. > Will fix in v4. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH v3 13/30] i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks
On Tue, Nov 21, 2017 at 10:08 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> Add minimal code needed to allow upstream Linux guest to boot. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- >> hw/misc/Makefile.objs | 1 + >> hw/misc/imx7_ccm.c | 233 >> + >> include/hw/misc/imx7_ccm.h | 130 + >> 3 files changed, 364 insertions(+) >> create mode 100644 hw/misc/imx7_ccm.c >> create mode 100644 include/hw/misc/imx7_ccm.h >> >> diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs >> index 29fb922cef..ac1be05a03 100644 >> --- a/hw/misc/Makefile.objs >> +++ b/hw/misc/Makefile.objs >> @@ -34,6 +34,7 @@ obj-$(CONFIG_IMX) += imx31_ccm.o >> obj-$(CONFIG_IMX) += imx25_ccm.o >> obj-$(CONFIG_IMX) += imx6_ccm.o >> obj-$(CONFIG_IMX) += imx6_src.o >> +obj-$(CONFIG_IMX) += imx7_ccm.o >> obj-$(CONFIG_MILKYMIST) += milkymist-hpdmc.o >> obj-$(CONFIG_MILKYMIST) += milkymist-pfpu.o >> obj-$(CONFIG_MAINSTONE) += mst_fpga.o >> diff --git a/hw/misc/imx7_ccm.c b/hw/misc/imx7_ccm.c >> new file mode 100644 >> index 00..2876164cfa >> --- /dev/null >> +++ b/hw/misc/imx7_ccm.c >> @@ -0,0 +1,233 @@ >> +/* >> + * Copyright (c) 2017, Impinj, Inc. >> + * >> + * i.MX7 CCM, PMU and ANALOG IP blocks emulation code > > Should these really all be in one single device rather > than one device per IP block ? > They all share the same register write semantics, imx7_set_clr_tog_write, so I'd like to keep them in the same file. But other than that, they can be split into all sorts of configurations. As far as memory map in i.MX7 RM is concerned "CCM" and "Analog" are distinct register files where the rest of them ("digiprog", "pmu") are "sub-blocks" of those two (or at least that's my interpretation). I'll change v4 to have that split. If you want me to convert every block _and_ sub-block into a standalone device, let me know. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR
2017-11-22 18:28-0200, Eduardo Habkost: > On Wed, Nov 22, 2017 at 07:09:08PM +0100, Radim Krčmář wrote: > > QEMU saves only 8 bits of APIC LDR, which means that it does not support > > x2APIC. The correct way of fixing this would be to save and restore the > > full 32 bit register, but because x2APIC LDR is a function of x2APIC ID, > > we can also compute it and keep the migration format untouched. > > > > KVM always expected the LDR format to follow the xAPIC/x2APIC standard, > > but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed > > xAPIC ID before switching to x2APIC, which means that QEMU has to use > > the kvm_x2apic_api feature to derive the x2APIC ID. > > > > This bug has also been addressed on the KVM side with patch 5849d75a5c9b > > ("KVM: lapic: Fixup LDR on load in x2apic"). > > Is this sufficient to fix the bug on hosts that lack KVM commit > 5849d75a5c9b, or we need both the KVM and QEMU patches? Good point, either one is enough to fix the bug. This patch makes x2APIC LDR work on old KVMs.
Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR
2017-11-22 20:26+0100, Paolo Bonzini: > On 22/11/2017 19:09, Radim Krčmář wrote: > > QEMU saves only 8 bits of APIC LDR, which means that it does not support > > x2APIC. The correct way of fixing this would be to save and restore the > > full 32 bit register, but because x2APIC LDR is a function of x2APIC ID, > > we can also compute it and keep the migration format untouched. > > > > KVM always expected the LDR format to follow the xAPIC/x2APIC standard, > > but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed > > xAPIC ID before switching to x2APIC, which means that QEMU has to use > > the kvm_x2apic_api feature to derive the x2APIC ID. > > > > This bug has also been addressed on the KVM side with patch 5849d75a5c9b > > ("KVM: lapic: Fixup LDR on load in x2apic"). > > > +if (s->apicbase & MSR_IA32_APICBASE_EXTD) { > > +kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s)); > > Is this correct if the kernel doesn't support the new-style x2APIC API? Should be: QEMU will use the APIC_ID register in that case, which contains the x2APIC ID that KVM used to compute the LDR from. (old-style APIC_ID just cannot store more than 8 bits and isn't tied to vcpu_id.) > In the end, it seems simpler to just fix it in the kernel. We already have the workaround in KVM, so dropping this one doesn't make that much of a difference. I perceive it as solely QEMU bug, though. :)
Re: [Qemu-devel] [PATCH v3 12/30] sdhci: Implement write method of ACMD12ERRSTS register
On Tue, Nov 21, 2017 at 10:04 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- >> hw/sd/sdhci.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c >> index f561cc44e3..53e5e011a7 100644 >> --- a/hw/sd/sdhci.c >> +++ b/hw/sd/sdhci.c >> @@ -1139,6 +1139,9 @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, >> unsigned size) >> s->admasysaddr = (s->admasysaddr & (0xULL | >> ((uint64_t)mask << 32))) | ((uint64_t)value << 32); >> break; >> +case SDHC_ACMD12ERRSTS: >> +MASKED_WRITE(s->acmd12errsts, mask, value); >> +break; >> case SDHC_FEAER: >> s->acmd12errsts |= value; >> s->errintsts |= (value >> 16) & s->errintstsen; >> -- >> 2.13.6 > > Is this part of the stock SDHCI spec that we just forgot to implement? Yes it is. I don't know if missing that code critical for anything, but since the rest of the plumbing is there I thought that we may as well implement it. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH v3 11/30] sdhci: Add i.MX specific subtype of SDHCI
On Tue, Nov 21, 2017 at 10:02 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> IP block found on several generations of i.MX family does not use >> vanilla SDHCI implementation and it comes with a number of quirks. >> >> Introduce i.MX SDHCI subtype of SDHCI block to add code necessary to >> support unmodified Linux guest driver. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov > > Hi. Mostly this looks ok; some comments below. > >> --- >> hw/sd/sdhci-internal.h | 15 ++ >> hw/sd/sdhci.c | 127 >> - >> include/hw/sd/sdhci.h | 8 >> 3 files changed, 148 insertions(+), 2 deletions(-) >> >> diff --git a/hw/sd/sdhci-internal.h b/hw/sd/sdhci-internal.h >> index 161177cf39..2a1b4b06ee 100644 >> --- a/hw/sd/sdhci-internal.h >> +++ b/hw/sd/sdhci-internal.h >> @@ -91,6 +91,8 @@ >> #define SDHC_CTRL_ADMA2_32 0x10 >> #define SDHC_CTRL_ADMA2_64 0x18 >> #define SDHC_DMA_TYPE(x) ((x) & SDHC_CTRL_DMA_CHECK_MASK) >> +#define SDHC_CTRL_4BITBUS 0x02 >> +#define SDHC_CTRL_8BITBUS 0x20 >> >> /* R/W Power Control Register 0x0 */ >> #define SDHC_PWRCON0x29 >> @@ -229,4 +231,17 @@ enum { >> >> extern const VMStateDescription sdhci_vmstate; >> >> + >> +#define ESDHC_MIX_CTRL 0x48 >> +#define ESDHC_VENDOR_SPEC 0xc0 >> +#define ESDHC_DLL_CTRL 0x60 >> + >> +#define ESDHC_TUNING_CTRL 0xcc >> +#define ESDHC_TUNE_CTRL_STATUS 0x68 >> +#define ESDHC_WTMK_LVL 0x44 >> + >> +#define ESDHC_CTRL_4BITBUS (0x1 << 1) >> +#define ESDHC_CTRL_8BITBUS (0x2 << 1) >> + >> + >> #endif >> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c >> index 6d6a791ee9..f561cc44e3 100644 >> --- a/hw/sd/sdhci.c >> +++ b/hw/sd/sdhci.c >> @@ -265,7 +265,8 @@ static void sdhci_send_command(SDHCIState *s) >> } >> } >> >> -if ((s->norintstsen & SDHC_NISEN_TRSCMP) && >> +if (!(s->quirks & SDHCI_QUIRK_NO_BUSY_IRQ) && >> +(s->norintstsen & SDHC_NISEN_TRSCMP) && >> (s->cmdreg & SDHC_CMD_RESPONSE) == SDHC_CMD_RSP_WITH_BUSY) { >> s->norintsts |= SDHC_NIS_TRSCMP; >> } >> @@ -1191,6 +1192,8 @@ static void sdhci_initfn(SDHCIState *s) >> >> s->insert_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, >> sdhci_raise_insertion_irq, s); >> s->transfer_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, >> sdhci_data_transfer, s); >> + >> +s->io_ops = &sdhci_mmio_ops; >> } >> >> static void sdhci_uninitfn(SDHCIState *s) >> @@ -1347,7 +1350,7 @@ static void sdhci_sysbus_realize(DeviceState *dev, >> Error ** errp) >> s->buf_maxsz = sdhci_get_fifolen(s); >> s->fifo_buffer = g_malloc0(s->buf_maxsz); >> sysbus_init_irq(sbd, &s->irq); >> -memory_region_init_io(&s->iomem, OBJECT(s), &sdhci_mmio_ops, s, "sdhci", >> +memory_region_init_io(&s->iomem, OBJECT(s), s->io_ops, s, "sdhci", >> SDHC_REGISTERS_MAP_SIZE); >> sysbus_init_mmio(sbd, &s->iomem); >> } >> @@ -1386,11 +1389,131 @@ static const TypeInfo sdhci_bus_info = { >> .class_init = sdhci_bus_class_init, >> }; >> >> +static uint64_t usdhc_read(void *opaque, hwaddr offset, unsigned size) >> +{ >> +SDHCIState *s = SYSBUS_SDHCI(opaque); >> +uint32_t ret; >> +uint16_t hostctl; >> + >> +switch (offset) { >> +default: >> +return sdhci_read(opaque, offset, size); >> + >> +case SDHC_HOSTCTL: >> +hostctl = SDHC_DMA_TYPE(s->hostctl) << 5; >> + >> +if (s->hostctl & SDHC_CTRL_8BITBUS) { >> +hostctl |= ESDHC_CTRL_8BITBUS; >> +} >> + >> +if (s->hostctl & SDHC_CTRL_4BITBUS) { >> +hostctl |= ESDHC_CTRL_4BITBUS; >> +} >> + >> +ret = hostctl | (s->blkgap << 16) | >> +(s->wakcon << 24); >> + >> +break; >> + >> +case ESDHC_DLL_CTRL: >> +case ESDHC_TUNE_CTRL_STATUS: >> +case 0x6c: >> +case ESDHC_TUNING_CTRL: >> +case ESDHC_VENDOR_SPEC: >> +case ESDHC_MIX_CTRL: >> +case ESDHC_WTMK_LVL: >> +ret = 0; >> +break; >> +} >> + >> +return ret; >> +} >> + >> +static void >> +usdhc_write(void *opaque, hwaddr offset, uint64_t val, unsigned size) >> +{ >> +SDHCIState *s = SYSBUS_SDHCI(opaque); >> +uint8_t hostctl = 0; >> +uint32_t value = (uint32_t)val; >> + >> +switch (offset) { >> +case ESDHC_DLL_CTRL: >> +case ESDHC_TUNE_CTRL_STATUS: >> +case 0x6c: >> +case ESDHC_TUNING_CTRL: >> +case ESDHC_WTMK_LVL: >> +case ESDHC_VENDOR_SPEC: >> +break; >> + >> +case SDHC_HOSTCTL: >> +if (value & ESDHC_CTRL_8BITBUS) { >> +hostctl |= SDHC_CTRL_8BITBUS; >> +} >> + >> +
Re: [Qemu-devel] [PATCH v2 4/7] s390x/pci: rework PCI STORE BLOCK
On 22/11/2017 14:28, Cornelia Huck wrote: On Wed, 22 Nov 2017 13:15:21 +0800 Yi Min Zhao wrote: 在 2017/11/22 上午2:07, Pierre Morel 写道: On 21/11/2017 11:42, Cornelia Huck wrote: On Thu, 16 Nov 2017 18:51:52 +0100 Pierre Morel wrote: Enhance the fault detection. Fixup the precedence to check the destination path existance before checking for the source accessibility. Add the maxstbl entry to both the Query PCI Function Group response and the PCIBusDevice structure. Initialize the maxstbl to 128 per default until we get the actual data from the hardware. Signed-off-by: Pierre Morel Reviewed-by: Yi Min Zhao --- hw/s390x/s390-pci-bus.h | 1 + hw/s390x/s390-pci-inst.c | 62 +--- hw/s390x/s390-pci-inst.h | 2 +- 3 files changed, 40 insertions(+), 25 deletions(-) @@ -700,8 +685,33 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr, break; } + if (pcias > 5) { + DPRINTF("pcistb invalid space\n"); + setcc(cpu, ZPCI_PCI_LS_ERR); + s390_set_status_code(env, r1, ZPCI_PCI_ST_INVAL_AS); + return 0; + } + + /* Verify the address, offset and length */ + /* offset must be a multiple of 8 */ + if (offset % 8) { + goto addressing_error; + } + /* Length must be greater than 8, a multiple of 8, not greater maxstbl */ "not greater than maxstlb" Better I know but greater that 80 characters, this is why I preferred broken English. What do I do ? break the line or English ? less than? It's less or equal, no? In any case, I prefer a multi-line comment over awkward language. OK, thanks + if ((len <= 8) || (len % 8) || (len > pbdev->maxstbl)) { + goto addressing_error; + } + /* Do not cross a 4K-byte boundary */ + if (((offset & 0xfff) + len) > 0x1000) { + goto addressing_error; + } + /* Guest address must be double word aligned */ + if (gaddr & 0x07UL) { + goto addressing_error; + } + mr = pbdev->pdev->io_regions[pcias].memory; - if (!memory_region_access_valid(mr, env->regs[r3], len, true)) { + if (!memory_region_access_valid(mr, offset, len, true)) { program_interrupt(env, PGM_OPERAND, 6); return 0; } Looks good. -- Pierre Morel Linux/KVM/QEMU in Böblingen - Germany
Re: [Qemu-devel] [PATCH v3 10/30] imx_fec: Reserve full 4K page for the register file
On Tue, Nov 21, 2017 at 9:48 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> Some i.MX SoCs (e.g. i.MX7) have FEC registers going as far as offset >> 0x614, so to avoid getting aborts when accessing those on QEMU, extend >> the register file to cover 4KB of address space instead of just 1K. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- >> hw/net/imx_fec.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c >> index 48d012cad6..e236bc933c 100644 >> --- a/hw/net/imx_fec.c >> +++ b/hw/net/imx_fec.c >> @@ -1252,7 +1252,7 @@ static void imx_eth_realize(DeviceState *dev, Error >> **errp) >> SysBusDevice *sbd = SYS_BUS_DEVICE(dev); >> >> memory_region_init_io(&s->iomem, OBJECT(dev), &imx_eth_ops, s, >> - TYPE_IMX_FEC, 0x400); >> + TYPE_IMX_FEC, 0x1000); >> sysbus_init_mmio(sbd, &s->iomem); >> sysbus_init_irq(sbd, &s->irq[0]); >> sysbus_init_irq(sbd, &s->irq[1]); >> -- > > I notice that we have an unused #define FSL_IMX25_FEC_SIZE 0x4000 in > fsl-imx25.h, and the Linux device trees for the imx25 define the size > of the FEC register block as 0x4000. Should this be 0x4000 ? > I think the size reserved for that register file differs between differen SoC. E.g. it's 16K on i.MX25, as you pointed out, but 64K on i.MX7. It's all the same to me as long as it's greater than 0x1000. I'll change the code to use FSL_IMX25_FEC_SIZE since it gets rid of a magic number. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR
On Wed, Nov 22, 2017 at 07:09:08PM +0100, Radim Krčmář wrote: > QEMU saves only 8 bits of APIC LDR, which means that it does not support > x2APIC. The correct way of fixing this would be to save and restore the > full 32 bit register, but because x2APIC LDR is a function of x2APIC ID, > we can also compute it and keep the migration format untouched. > > KVM always expected the LDR format to follow the xAPIC/x2APIC standard, > but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed > xAPIC ID before switching to x2APIC, which means that QEMU has to use > the kvm_x2apic_api feature to derive the x2APIC ID. > > This bug has also been addressed on the KVM side with patch 5849d75a5c9b > ("KVM: lapic: Fixup LDR on load in x2apic"). Is this sufficient to fix the bug on hosts that lack KVM commit 5849d75a5c9b, or we need both the KVM and QEMU patches? > > Reported-by: Dr. David Alan Gilbert > Reported-by: Yiqian Wei > Signed-off-by: Radim Krčmář > --- > I haven't tested that it actually fixes the bug, > https://bugzilla.redhat.com/show_bug.cgi?id=1502591. > > hw/i386/kvm/apic.c | 19 +-- > 1 file changed, 17 insertions(+), 2 deletions(-) > > diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c > index 1df6d26816f9..89df165a04bf 100644 > --- a/hw/i386/kvm/apic.c > +++ b/hw/i386/kvm/apic.c > @@ -30,6 +30,13 @@ static inline uint32_t kvm_apic_get_reg(struct > kvm_lapic_state *kapic, > return *((uint32_t *)(kapic->regs + (reg_id << 4))); > } > > +static inline uint32_t kvm_apic_calc_x2apic_ldr(APICCommonState *s) > +{ > +uint32_t id = kvm_has_x2apic_api() ? s->initial_apic_id : s->id; > + > +return ((id >> 4) << 16) | (1 << (id & 0xf)); > +} > + > static void kvm_put_apic_state(APICCommonState *s, struct kvm_lapic_state > *kapic) > { > int i; > @@ -41,7 +48,11 @@ static void kvm_put_apic_state(APICCommonState *s, struct > kvm_lapic_state *kapic > kvm_apic_set_reg(kapic, 0x2, s->id << 24); > } > kvm_apic_set_reg(kapic, 0x8, s->tpr); > -kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24); > +if (s->apicbase & MSR_IA32_APICBASE_EXTD) { > +kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s)); > +} else { > +kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24); > +} > kvm_apic_set_reg(kapic, 0xe, s->dest_mode << 28 | 0x0fff); > kvm_apic_set_reg(kapic, 0xf, s->spurious_vec); > for (i = 0; i < 8; i++) { > @@ -71,7 +82,11 @@ void kvm_get_apic_state(DeviceState *dev, struct > kvm_lapic_state *kapic) > } > s->tpr = kvm_apic_get_reg(kapic, 0x8); > s->arb_id = kvm_apic_get_reg(kapic, 0x9); > -s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24; > +if (s->apicbase & MSR_IA32_APICBASE_EXTD) { > +assert(kvm_apic_get_reg(kapic, 0xd) == kvm_apic_calc_x2apic_ldr(s)); I assume this assert() won't trigger if the host just lacks the kernel patch, will it? What if we're going to migrate to a QEMU version that doesn't have this patch applied? Do we want to send the same log_dest value as old QEMU versions, just in case? (Those 8 bits QEMU currently sets at LDR[31:24] seem completely useless, but maybe it won't hurt to keep them?) > +} else { > +s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24; > +} > s->dest_mode = kvm_apic_get_reg(kapic, 0xe) >> 28; > s->spurious_vec = kvm_apic_get_reg(kapic, 0xf); > for (i = 0; i < 8; i++) { > -- > 2.14.2 > -- Eduardo
Re: [Qemu-devel] [PATCH v3 07/30] imx_fec: Add support for multiple Tx DMA rings
On Tue, Nov 21, 2017 at 9:44 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> More recent version of the IP block support more than one Tx DMA ring, >> so add the code implementing that feature. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov > >> static const VMStateDescription vmstate_imx_eth = { >> .name = TYPE_IMX_FEC, >> -.version_id = 2, >> -.minimum_version_id = 2, >> +.version_id = 3, >> +.minimum_version_id = 3, >> .fields = (VMStateField[]) { >> VMSTATE_UINT32_ARRAY(regs, IMXFECState, ENET_MAX), >> VMSTATE_UINT32(rx_descriptor, IMXFECState), >> -VMSTATE_UINT32(tx_descriptor, IMXFECState), >> - >> +VMSTATE_UINT32_ARRAY(tx_descriptor, IMXFECState, ENET_TX_RING_NUM), >> +VMSTATE_UINT32(tx_ring_num, IMXFECState), >> VMSTATE_UINT32(phy_status, IMXFECState), >> VMSTATE_UINT32(phy_control, IMXFECState), >> VMSTATE_UINT32(phy_advertise, IMXFECState), > > tx_ring_num is constant for any particular instantiation of the device, > so you don't need to put it in the vmstate. > > It's pretty trivial to make this vmstate compatible with the old > ones for the existing single-tx-descriptor devices, so we might as well: > > /* Versions of this device with more than one TX descriptor > * save the 2nd and 3rd descriptors in a subsection, to maintain > * migration compatibility with previous versions of the device > * that only supported a single descriptor. > */ > static bool txdescs_needed(void *opaque) { > IMXFECState *s = opaque; > > return s->tx_ring_num > 1; > } > > static const VMStateDescription vmstate_imx_eth_txdescs = { > .name = "imx.fec/txdescs", > .version_id = 1, > .minimum_version_id = 1, > .needed = txdescs_needed, > .fields = (VMStateField[]) { > VMSTATE_UINT32(tx_descriptor[1], IMXFECState), > VMSTATE_UINT32(tx_descriptor[2], IMXFECState), > VMSTATE_END_OF_LIST() > } > }; > > and then add this to the vmx_state_eth at the end: > .subsections = (const VMStateDescription*[]) { > &vmstate_imx_eth_txdescs, > NULL > } > > Then you don't need to bump version_id/minimum_version_id on the > vmstate_imx_eth struct. > Cool, sounds good. Will add that to the patch in v4. >> @@ -791,6 +821,7 @@ static void imx_eth_write(void *opaque, hwaddr offset, >> uint64_t value, >> unsigned size) >> { >> IMXFECState *s = IMX_FEC(opaque); >> +const bool single_tx_ring = s->tx_ring_num != 3; > > This looks odd -- I would have expected "single_tx_ring = > s->tx_ring_num == 1" ... AFAIK the HW that's out there will have either 3 or 1 Tx ring, so that's why I wrote it this way. I'll change the logic to the way your suggest to avoid surprising the reader. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH v3 03/30] imx_fec: Change queue flushing heuristics
On Tue, Nov 21, 2017 at 9:27 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> In current implementation, packet queue flushing logic seem to suffer >> from a deadlock like scenario if a packet is received by the interface >> before before Rx ring is initialized by Guest's driver. Consider the >> following sequence of events: >> >> 1. A QEMU instance is started against a TAP device on Linux >>host, running Linux guest, e. g., something to the effect >>of: >> >>qemu-system-arm \ >> -net nic,model=imx.fec,netdev=lan0 \ >> netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \ >> ... rest of the arguments ... >> >> 2. Once QEMU starts, but before guest reaches the point where >>FEC deriver is done initializing the HW, Guest, via TAP >>interface, receives a number of multicast MDNS packets from >>Host (not necessarily true for every OS, but it happens at >>least on Fedora 25) >> >> 3. Recieving a packet in such a state results in >>imx_eth_can_receive() returning '0', which in turn causes >>tap_send() to disable corresponding event (tap.c:203) >> >> 4. Once Guest's driver reaches the point where it is ready to >>recieve packets it prepares Rx ring descriptors and writes >>ENET_RDAR_RDAR to ENET_RDAR register to indicate to HW that >>more descriptors are ready. And at this points emulation >>layer does this: >> >> s->regs[index] = ENET_RDAR_RDAR; >> imx_eth_enable_rx(s); >> >>which, combined with: >> >> if (!s->regs[ENET_RDAR]) { >> qemu_flush_queued_packets(qemu_get_queue(s->nic)); >> } >> >>results in Rx queue never being flushed and corresponding >>I/O event beign disabled. >> >> To prevent the problem, change the code to always flush packet queue >> when ENET_RDAR transitions 0 -> ENET_RDAR_RDAR. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h >> index 62ad473b05..4bc8f03ec2 100644 >> --- a/include/hw/net/imx_fec.h >> +++ b/include/hw/net/imx_fec.h >> @@ -252,6 +252,7 @@ typedef struct IMXFECState { >> uint32_t phy_int_mask; >> >> bool is_fec; >> +bool needs_flush; >> } IMXFECState; > > This field isn't needed any more in this version of the patch, I think? > Yeah, my bad, forgot to remove this part. Will do in v4. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH v3 04/30] imx_fec: Use ENET_FTRL to determine truncation length
On Tue, Nov 21, 2017 at 9:31 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> Frame truncation length, TRUNC_FL, is determined by the contents of >> ENET_FTRL register, so convert the code to use it instead of a >> hardcoded constant. >> >> To avoid the case where TRUNC_FL is greater that ENET_MAX_FRAME_SIZE, >> increase the value of the latter to its theoretical maximum of 16K. >> >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- >> hw/net/imx_fec.c | 4 ++-- >> include/hw/net/imx_fec.h | 3 ++- >> 2 files changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c >> index eb034ffd0c..dda0816fb3 100644 >> --- a/hw/net/imx_fec.c >> +++ b/hw/net/imx_fec.c >> @@ -1052,8 +1052,8 @@ static ssize_t imx_enet_receive(NetClientState *nc, >> const uint8_t *buf, >> crc_ptr = (uint8_t *) &crc; >> >> /* Huge frames are truncted. */ >> -if (size > ENET_MAX_FRAME_SIZE) { >> -size = ENET_MAX_FRAME_SIZE; >> +if (size > s->regs[ENET_FTRL]) { >> +size = s->regs[ENET_FTRL]; >> flags |= ENET_BD_TR | ENET_BD_LG; >> } >> >> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h >> index 4bc8f03ec2..0fcc4f0c71 100644 >> --- a/include/hw/net/imx_fec.h >> +++ b/include/hw/net/imx_fec.h >> @@ -86,7 +86,6 @@ >> #define ENET_TCCR3 393 >> #define ENET_MAX 400 >> >> -#define ENET_MAX_FRAME_SIZE2032 >> >> /* EIR and EIMR */ >> #define ENET_INT_HB(1 << 31) >> @@ -155,6 +154,8 @@ >> #define ENET_RCR_NLC (1 << 30) >> #define ENET_RCR_GRS (1 << 31) >> >> +#define ENET_MAX_FRAME_SIZE(1 << ENET_RCR_MAX_FL_LENGTH) > > This means we now have functions with 16K local array > variables on the stack, which seems like a bad idea. > Can't say I see a big difference between having a 2K vs 16K buffer on the stack, but regardless, I am not quite clear if you are not too hot about this patch and want me to drop it (I am fine with it) or do you want me to modify it to have the emulation layer allocate said 16K buffer on the heap instead of a stack? Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH v3 00/30] Initial i.MX7 support
On Tue, Nov 21, 2017 at 10:34 AM, Peter Maydell wrote: > On 6 November 2017 at 15:47, Andrey Smirnov wrote: >> Hi everyone, >> - Added proper USB emulation code, so now it should be possible to >> emulated guest's USB bus > > The patchset is huge as it is, if you add more stuff to it > it makes it even more likely to sink to the bottom of my > to-review queue... > USB peripheral emulation had to be a part of a patch-set, either in dummy or a full featured form, in order to be able to boot vanilla Linux kernel because you insisted that I don't use "ignore_memory_transaction_failures". The reason why the dummy emulation version of it it was not a part of v2 was because I did my test with a bad kernel config where USB was disabled, didn't realize USB was essential and did not write code to support it. Now, once I realized it, I wrote a dummy version, and then later, while waiting for v2 to be reviewed, worked on proper USB emulation the code. Said code turned out to be comparatively trivial to the first dummy implementation, so instead of going through the exercise of submitting dummy first and then proper version later I squashed both and the result in v3. >> Peter, I didn't hear anything from you about the code of >> mcimx7d_add_psci_node(), as discussed here: >> >> https://www.mail-archive.com/qemu-devel@nongnu.org/msg486874.html >> >> so I kept the original code intact. As I mentioned before, my goal was >> to be able to boot into vanilla Linux kerenel and have working SMP >> without needing to use a PSCI implementing bootloader. If that is >> something that new board code shouldn't do, please let me know. > > Broadly, board code should work the same way the real hardware > does, unless there's a clear reason why not. Yes, this all makes sense. As far as I understand convenience being able to boot Linux directly in QEMU has long been the "clear reason why not". Now that certain SoC specific versions of Linux are not as self-sufficient and can't support SMP without external help, emulating PSCI and doing appropriate DTB fixups for that is just an adaptaion of the old convenience mechanism to new times and circumstances, IMHO. > "virt" is special because it writes its own dtb entirely. > > Maybe PSCI does need to be a different special case, since we're > emulating part of a bootloader here. OK, I'll ignore the "maybe" part and proceed as if we are in agreement on PSCI for v4. > But if so I think that code belongs more in hw/arm/boot.c, so that we > automatically fix up the > dtb to say "we have psci" if we are (a) booting a kernel directly > and (b) the CPU has the psci-conduit property set to enable QEMU's > PSCI implementation. > OK, sure, makes sense. I'll change the patch to use shared infrastructure for that. > (Also the code in virt.c for adding a PSCI node is considerably > fuller-featured than yours is.) > Okay... My code is targeting both fixed PSCI conduit (smc) and PSCI implementation (0.2/1.0), implementing support for anything but that in my board specific code would've been, IMHO, silly. OK, I'll interpret that comment not as a slight, but as a request to use virt.c's implementation for shared infrastructure. >> Thanks, >> Andrey Smirnov >> >> [v2] https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg05516.html >> [v1] https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg04770.html >> >> >> Andrey Smirnov (30): >> imx_fec: Do not link to netdev >> imx_fec: Refactor imx_eth_enable_rx() >> imx_fec: Change queue flushing heuristics >> imx_fec: Use ENET_FTRL to determine truncation length >> imx_fec: Use MIN instead of explicit ternary operator >> imx_fec: Emulate SHIFT16 in ENETx_RACC >> imx_fec: Add support for multiple Tx DMA rings >> imx_fec: Use correct length for packet size >> imx_fec: Fix a typo in imx_enet_receive() >> imx_fec: Reserve full 4K page for the register file >> sdhci: Add i.MX specific subtype of SDHCI >> sdhci: Implement write method of ACMD12ERRSTS register > > Everything above here is pretty nearly ready to go in; > if you send that as a patchseries then it should be easy > to review and queue ready for 2.12 (which will open up > for new commits in mid-december). > I'm not quite sure where you stand on "imx_fec: Use ENET_FTRL to determine truncation length", but sure, sounds good, I'll prepare the rest of them as a separate patch set and submit it, as soon as I get a chance. >> i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks >> i.MX: Add code to emulate i.MX2 watchdog IP block >> i.MX: Add code to emulate i.MX7 SNVS IP-block >> i.MX: Add code to emulate GPCv2 IP block >> i.MX: Add code to emulate i.MX7 IOMUXC IP block >> i.MX: Add i.MX7 GPT variant >> i.MX: Add code to emulate SDMA IP block >> i.MX: Add code to emulate FlexCAN IP block >> i.MX: Add implementation of i.MX7 GPR IP block >> pci: Add support for Designware IP block >> i.MX: Add code to emulate i.MX7 USBMISC IP block >
Re: [Qemu-devel] [PATCH 2/5] nbd/server: add nbd_opt_{read, drop} to track client->optlen
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/server.c | 34 ++ > 1 file changed, 22 insertions(+), 12 deletions(-) > Hmm, revisiting my idea about moving more of the error checking into the helper function. As of this patch, we only have two clients of nbd_opt_read: > @@ -299,7 +312,7 @@ static int nbd_negotiate_handle_export_name(NBDClient > *client, > error_setg(errp, "Bad length received"); > return -EINVAL; > } > -if (nbd_read(client->ioc, name, client->optlen, errp) < 0) { > +if (nbd_opt_read(client, name, client->optlen, errp) < 0) { > error_prepend(errp, "read failed: "); > return -EINVAL; 1. NBD_OPT_EXPORT_NAME, where the length check is based on the maximum size name we're willing to accept (and NOT on comparison to the header size, as we request the entire client->optlen). This call cannot send an error reply (so if the length is bogus, we can just drop the connection by returning -EINVAL). Furthermore, once we handle this option, option processing is done; we do not reach the assert that client->optlen == 0. > } > @@ -383,40 +396,36 @@ static int nbd_negotiate_handle_info(NBDClient *client, > uint16_t myflags, > msg = "overall request too short"; > goto invalid; > } > -if (nbd_read(client->ioc, &namelen, sizeof(namelen), errp) < 0) { > +if (nbd_opt_read(client, &namelen, sizeof(namelen), errp) < 0) { > return -EIO; > } 2. Multiple calls within NBD_OPT_INFO/NBD_OPT_GO. Here, the length check is based on our read being a subset of client->optlen, and our desired response on a failed check is whatever is at the invalid: label, namely: invalid: if (nbd_opt_drop(client, client->optlen, errp) < 0) { return -EIO; } return nbd_negotiate_send_rep_err(client, NBD_REP_ERR_INVALID, errp, "%s", msg); We want to drop all remaining data, reply to the client, and return 0 to keep the connection alive (or -EIO if the read or write failed). You are planning on adding more calls withing NBD_OPT_LIST_META_CONTEXT, which will have semantics more like 2 (if we detect an inconsistent size, we want to consume the rest of the input and return an EINVAL reply to the client, but keep the connection alive unless there is an I/O error in the meantime). I think that means we need a tri-state return from nbd_opt_read(): < 0 on I/O failure, 0 if the EINVAL message has been sent, and 1 if the read was successful; I also think it means that the handler for NBD_OPT_EXPORT_NAME does not really fit the bill for using the same handler. Hopefully this explanation will give you more insight into the counter-proposal patch I'm about to post. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support
* Stefan Priebe - Profihost AG (s.pri...@profihost.ag) wrote: > Hello, > > Am 22.11.2017 um 20:41 schrieb Dr. David Alan Gilbert: > > * Paolo Bonzini (pbonz...@redhat.com) wrote: > >> On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote: > >>> HI Paolo, > >>> > >>> could this patchset be related? > >> > >> Uh oh, yes it should. Jason, any ways to fix it? I suppose we need to > >> disable UFO in the newest machine types, but do we also have to do > >> (software) UFO in vhost-net and QEMU for migration compatibility? > > > > Was there a solution to this? > > it will be this one: > https://patchwork.ozlabs.org/patch/840094/ Thanks; I've added a link to: https://wiki.qemu.org/Features/Migration/Troubleshooting#virtio-net:_saved_image_requires_TUN_F_UFO_support Dave > > Stefan > > > Dave > > > >> Thanks, > >> > >> Paolo > >> > >>> Greets, > >>> Stefan > >>> > >>> Am 06.11.2017 um 10:52 schrieb Stefan Priebe - Profihost AG: > Hi Paolo, > > Am 06.11.2017 um 10:49 schrieb Paolo Bonzini: > > On 06/11/2017 10:48, Stefan Priebe - Profihost AG wrote: > >> Hi Paolo, > >> > >> Am 06.11.2017 um 10:40 schrieb Paolo Bonzini: > >>> On 06/11/2017 10:38, Stefan Priebe - Profihost AG wrote: > Hello, > > i've upgraded some servers from kernel 4.4 to 4.12 - both running > Qemu > 2.9.1. > > If i migrate a VM from a host running kernel 4.4 to a host running > 4.12 > i get: > > kvm: virtio-net: saved image requires TUN_F_UFO support > kvm: Failed to load virtio-net-device:tmp > kvm: Failed to load virtio-net:virtio > kvm: error while loading state for instance 0x0 of device > ':00:12.0/virtio-net' > kvm: load of migration failed: Invalid argument > > > while migrating from 4.12 to 4.4 works fine. > > Can anybody help? Is this expected? > >>> > >>> Can you check why peer_has_ufo failed (in hw/net/virtio-net.c)? > >> > >> May be - how can i archieve this? Patching the code is not a problem if > >> you can give me a hint. > >> > >>> Also, did this ioctl fail when the tap device was set up on the 4.12 > >>> destination? > >>> int tap_probe_has_ufo(int fd) > >>> { > >>> unsigned offload; > >>> > >>> offload = TUN_F_CSUM | TUN_F_UFO; > >>> > >>> if (ioctl(fd, TUNSETOFFLOAD, offload) < 0) > >>> return 0; > >>> > >>> return 1; > >>> } > >> > >> Should there be any kernel output or how can i detect / check it? > > > > For both, the simplest answer is probably just using printf. > > arg i missed an important part. The kernel is an opensuse SLE15 one. > > I've seen it contains the following patchset: > https://www.spinics.net/lists/netdev/msg443821.html > > Greets, > Stefan > > >> > >> > > -- > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support
Hello, Am 22.11.2017 um 20:41 schrieb Dr. David Alan Gilbert: > * Paolo Bonzini (pbonz...@redhat.com) wrote: >> On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote: >>> HI Paolo, >>> >>> could this patchset be related? >> >> Uh oh, yes it should. Jason, any ways to fix it? I suppose we need to >> disable UFO in the newest machine types, but do we also have to do >> (software) UFO in vhost-net and QEMU for migration compatibility? > > Was there a solution to this? it will be this one: https://patchwork.ozlabs.org/patch/840094/ Stefan > Dave > >> Thanks, >> >> Paolo >> >>> Greets, >>> Stefan >>> >>> Am 06.11.2017 um 10:52 schrieb Stefan Priebe - Profihost AG: Hi Paolo, Am 06.11.2017 um 10:49 schrieb Paolo Bonzini: > On 06/11/2017 10:48, Stefan Priebe - Profihost AG wrote: >> Hi Paolo, >> >> Am 06.11.2017 um 10:40 schrieb Paolo Bonzini: >>> On 06/11/2017 10:38, Stefan Priebe - Profihost AG wrote: Hello, i've upgraded some servers from kernel 4.4 to 4.12 - both running Qemu 2.9.1. If i migrate a VM from a host running kernel 4.4 to a host running 4.12 i get: kvm: virtio-net: saved image requires TUN_F_UFO support kvm: Failed to load virtio-net-device:tmp kvm: Failed to load virtio-net:virtio kvm: error while loading state for instance 0x0 of device ':00:12.0/virtio-net' kvm: load of migration failed: Invalid argument while migrating from 4.12 to 4.4 works fine. Can anybody help? Is this expected? >>> >>> Can you check why peer_has_ufo failed (in hw/net/virtio-net.c)? >> >> May be - how can i archieve this? Patching the code is not a problem if >> you can give me a hint. >> >>> Also, did this ioctl fail when the tap device was set up on the 4.12 >>> destination? >>> int tap_probe_has_ufo(int fd) >>> { >>> unsigned offload; >>> >>> offload = TUN_F_CSUM | TUN_F_UFO; >>> >>> if (ioctl(fd, TUNSETOFFLOAD, offload) < 0) >>> return 0; >>> >>> return 1; >>> } >> >> Should there be any kernel output or how can i detect / check it? > > For both, the simplest answer is probably just using printf. arg i missed an important part. The kernel is an opensuse SLE15 one. I've seen it contains the following patchset: https://www.spinics.net/lists/netdev/msg443821.html Greets, Stefan >> >> > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK >
Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support
* Paolo Bonzini (pbonz...@redhat.com) wrote: > On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote: > > HI Paolo, > > > > could this patchset be related? > > Uh oh, yes it should. Jason, any ways to fix it? I suppose we need to > disable UFO in the newest machine types, but do we also have to do > (software) UFO in vhost-net and QEMU for migration compatibility? Was there a solution to this? Dave > Thanks, > > Paolo > > > Greets, > > Stefan > > > > Am 06.11.2017 um 10:52 schrieb Stefan Priebe - Profihost AG: > >> Hi Paolo, > >> > >> Am 06.11.2017 um 10:49 schrieb Paolo Bonzini: > >>> On 06/11/2017 10:48, Stefan Priebe - Profihost AG wrote: > Hi Paolo, > > Am 06.11.2017 um 10:40 schrieb Paolo Bonzini: > > On 06/11/2017 10:38, Stefan Priebe - Profihost AG wrote: > >> Hello, > >> > >> i've upgraded some servers from kernel 4.4 to 4.12 - both running Qemu > >> 2.9.1. > >> > >> If i migrate a VM from a host running kernel 4.4 to a host running 4.12 > >> i get: > >> > >> kvm: virtio-net: saved image requires TUN_F_UFO support > >> kvm: Failed to load virtio-net-device:tmp > >> kvm: Failed to load virtio-net:virtio > >> kvm: error while loading state for instance 0x0 of device > >> ':00:12.0/virtio-net' > >> kvm: load of migration failed: Invalid argument > >> > >> > >> while migrating from 4.12 to 4.4 works fine. > >> > >> Can anybody help? Is this expected? > > > > Can you check why peer_has_ufo failed (in hw/net/virtio-net.c)? > > May be - how can i archieve this? Patching the code is not a problem if > you can give me a hint. > > > Also, did this ioctl fail when the tap device was set up on the 4.12 > > destination? > > int tap_probe_has_ufo(int fd) > > { > > unsigned offload; > > > > offload = TUN_F_CSUM | TUN_F_UFO; > > > > if (ioctl(fd, TUNSETOFFLOAD, offload) < 0) > > return 0; > > > > return 1; > > } > > Should there be any kernel output or how can i detect / check it? > >>> > >>> For both, the simplest answer is probably just using printf. > >> > >> arg i missed an important part. The kernel is an opensuse SLE15 one. > >> > >> I've seen it contains the following patchset: > >> https://www.spinics.net/lists/netdev/msg443821.html > >> > >> Greets, > >> Stefan > >> > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR
On 22/11/2017 19:09, Radim Krčmář wrote: > QEMU saves only 8 bits of APIC LDR, which means that it does not support > x2APIC. The correct way of fixing this would be to save and restore the > full 32 bit register, but because x2APIC LDR is a function of x2APIC ID, > we can also compute it and keep the migration format untouched. > > KVM always expected the LDR format to follow the xAPIC/x2APIC standard, > but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed > xAPIC ID before switching to x2APIC, which means that QEMU has to use > the kvm_x2apic_api feature to derive the x2APIC ID. > > This bug has also been addressed on the KVM side with patch 5849d75a5c9b > ("KVM: lapic: Fixup LDR on load in x2apic"). > +if (s->apicbase & MSR_IA32_APICBASE_EXTD) { > +kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s)); Is this correct if the kernel doesn't support the new-style x2APIC API? In the end, it seems simpler to just fix it in the kernel. Paolo > +} else { > +kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24); > +}
Re: [Qemu-devel] [Qemu-block] [PATCH] block: Fix qemu crash when using scsi-block
On 22/11/2017 19:06, Kevin Wolf wrote: > Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben: >> On 22/11/2017 16:33, Deepa Srinivasan wrote: >>> Starting qemu with the following arguments causes qemu to segfault: >>> ... -device lsi,id=lsi0 -drive >>> file=iscsi:<...>,format=raw,if=none,node-name= >>> iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1 >>> >>> This patch fixes blk_aio_ioctl() so it does not pass stack addresses to >>> blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. >>> More >>> details about the bug follow. >>> >>> blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the >>> coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter(). >>> >>> When blk_aio_ioctl() is executed from within a coroutine context (e.g. >>> iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to >>> the current coroutine's wakeup queue. blk_aio_ioctl() then returns. >>> >>> When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: >>> >>> BlkRwCo *rwco = &acb->rwco; >>> >>> rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, >>> rwco->qiov->iov[0].iov_base); <--- qiov is >>> invalid >>> here >>> ... >>> >>> In the case when blk_aio_ioctl() is called from a non-coroutine context, >>> blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls >>> qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine >>> execution is complete, control returns to blk_aio_ioctl_entry() after the >>> call >>> to blk_co_ioctl(). There is no invalid reference after this point, but the >>> function is still holding on to invalid pointers. >>> >>> The fix is to allocate memory for the QEMUIOVector and struct iovec as part >>> of >>> the request struct which the IO buffer is part of. The memory for this >>> struct is >>> guaranteed to be valid till the AIO is completed. >>> >>> Signed-off-by: Deepa Srinivasan >>> Signed-off-by: Konrad Rzeszutek Wilk >>> Reviewed-by: Mark Kanda >>> --- >>> block/block-backend.c | 13 ++--- >>> hw/block/virtio-blk.c | 9 - >>> hw/scsi/scsi-disk.c| 10 +- >>> hw/scsi/scsi-generic.c | 9 - >>> include/sysemu/block-backend.h | 2 +- >>> 5 files changed, 28 insertions(+), 15 deletions(-) >>> >>> diff --git a/block/block-backend.c b/block/block-backend.c >>> index baef8e7..c275827 100644 >>> --- a/block/block-backend.c >>> +++ b/block/block-backend.c >>> @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque) >>> blk_aio_complete(acb); >>> } >>> >>> -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void >>> *buf, >>> +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, >>> QEMUIOVector *qiov, >>>BlockCompletionFunc *cb, void *opaque) >> >> I think this is not the best way to fix the bug, because it adds extra >> unnecessary code in the callers. >> >> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the >> same for blk_aio_prwv's "qiov" argument? >> >> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just >> use rwco->buf. > > But the same struct is used for read and write requests that do use an > actual QEMUIOVector and not just a linear buffer. Then let's call it "void *opaque", or make it a union (but I think that's overkill). The QEMUIOVector pointer is opaque as far as blk_aio_prwv is concerned, and it is only created by blk_aio_ioctl for blk_aio_ioctl_entry to extract buf: rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->qiov->iov[0].iov_base); Exposing the fake QEMUIOVector to the callers of blk_aio_ioctl is much uglier than using a void* for what is effectively a multi-type pointer. Paolo
Re: [Qemu-devel] [PATCH v3 30/30] Implement support for i.MX7 Sabre board
On Tue, Nov 21, 2017 at 10:22 AM, Peter Maydell wrote: > On 6 November 2017 at 15:48, Andrey Smirnov wrote: >> Implement code needed to set up emulation of MCIMX7SABRE board from >> NXP. For more info about the HW see: >> >> https://www.nxp.com/support/developer-resources/hardware-development-tools/sabre-development-system/sabre-board-for-smart-devices-based-on-the-i.mx-7dual-applications-processors:MCIMX7SABRE > > You could put this URL in a comment in the code as well. > >> Cc: Peter Maydell >> Cc: Jason Wang >> Cc: Philippe Mathieu-Daudé >> Cc: qemu-devel@nongnu.org >> Cc: qemu-...@nongnu.org >> Cc: yurov...@gmail.com >> Signed-off-by: Andrey Smirnov >> --- >> hw/arm/Makefile.objs | 2 +- >> hw/arm/mcimx7d-sabre.c | 101 >> + >> 2 files changed, 102 insertions(+), 1 deletion(-) >> create mode 100644 hw/arm/mcimx7d-sabre.c >> >> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs >> index f379ddc74b..eb6f6c5997 100644 >> --- a/hw/arm/Makefile.objs >> +++ b/hw/arm/Makefile.objs >> @@ -19,5 +19,5 @@ obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o >> obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o >> obj-$(CONFIG_ASPEED_SOC) += aspeed_soc.o aspeed.o >> obj-$(CONFIG_MPS2) += mps2.o >> -obj-$(CONFIG_FSL_IMX7) += fsl-imx7.o >> +obj-$(CONFIG_FSL_IMX7) += fsl-imx7.o mcimx7d-sabre.o >> >> diff --git a/hw/arm/mcimx7d-sabre.c b/hw/arm/mcimx7d-sabre.c >> new file mode 100644 >> index 00..7ca8e668e8 >> --- /dev/null >> +++ b/hw/arm/mcimx7d-sabre.c >> @@ -0,0 +1,101 @@ >> +/* >> + * Copyright (c) 2017, Impinj, Inc. >> + * >> + * MCIMX7D_SABRE Board System emulation. >> + * >> + * Author: Andrey Smirnov >> + * >> + * This code is licensed under the GPL, version 2 or later. >> + * See the file `COPYING' in the top level directory. >> + * >> + * It (partially) emulates a mcimx7d_sabre board, with a Freescale >> + * i.MX7 SoC >> + */ >> + >> +#include "qemu/osdep.h" >> +#include "qapi/error.h" >> +#include "qemu-common.h" >> +#include "hw/arm/fsl-imx7.h" >> +#include "hw/boards.h" >> +#include "sysemu/sysemu.h" >> +#include "sysemu/device_tree.h" >> +#include "qemu/error-report.h" >> +#include "sysemu/qtest.h" >> +#include "net/net.h" >> + >> +typedef struct { >> +FslIMX7State soc; >> +MemoryRegion ram; >> +} MCIMX7Sabre; >> + >> +static void mcimx7d_add_psci_node(const struct arm_boot_info *boot_info, >> + void *fdt) >> +{ >> +const char comp[] = "arm,psci-0.2\0arm,psci"; >> + >> +qemu_fdt_add_subnode(fdt, "/psci"); >> +qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp)); >> +qemu_fdt_setprop_string(fdt, "/psci", "method", "smc"); >> +} > > I'm still unconvinced by this (none of the other i.mx boards we have > have anything like it). None of the other boards are both SMP capable and support SMP in upstream Linux only through PSCI (as it the case for i.MX7), so comparing against the precedent is not very helpful. > How does the real hardware boot SMP ? > Real hardware executes a bootloader which is expected to implement PSCI and do the DTB fixup as I implemented above. Thanks, Andrey Smirnov
Re: [Qemu-devel] [PATCH 2/5] nbd/server: add nbd_opt_{read, drop} to track client->optlen
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/server.c | 34 ++ > 1 file changed, 22 insertions(+), 12 deletions(-) > @@ -299,7 +312,7 @@ static int nbd_negotiate_handle_export_name(NBDClient > *client, > error_setg(errp, "Bad length received"); > return -EINVAL; > } > -if (nbd_read(client->ioc, name, client->optlen, errp) < 0) { > +if (nbd_opt_read(client, name, client->optlen, errp) < 0) { > error_prepend(errp, "read failed: "); > return -EINVAL; > } More context: name[client->optlen] = '\0'; Oops - that's broken, because client->optlen is now 0. Which means your code was only tested with empty-string default exports. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH for-2.12 v3 1/3] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
On 11/22/2017 03:33 AM, David Gibson wrote: > On Mon, Nov 20, 2017 at 11:03:45AM +0100, Cédric Le Goater wrote: >> When a CPU is stopped with the 'stop-self' RTAS call, its state >> 'halted' is switched to 1 and, in this case, the MSR is not taken into >> account anymore in the cpu_has_work() routine. Only the pending >> hardware interrupts are checked with their LPCR:PECE* enablement bit. >> >> If the DECR timer fires after 'stop-self' is called and before the CPU >> 'stop' state is reached, the nearly-dead CPU will have some work to do >> and the guest will crash. This case happens very frequently with the >> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is >> occasionally fired but after 'stop' state, so no work is to be done >> and the guest survives. >> >> I suspect there is a race between the QEMU mainloop triggering the >> timers and the TCG CPU thread but I could not quite identify the root >> cause. To be safe, let's disable in the LPCR all the exceptions which >> can cause an exit while the CPU is in power-saving mode and reenable >> them when the CPU is started. >> >> For this purpose, we introduce a little helper routine to calculate >> the PECE bits for a processor variant. We could also use the mask >> value LPCR_PECE_L_MASK for the P8 and P9 processors. bit 47 and 48 are >> reserved on P7 but it is still compatible. >> >> Signed-off-by: Cédric Le Goater > > I'm not thrilled about addressing this without 100% knowing what's > going on, me either :/ I have spent hours, days, on QEMU logs trying to catch a possible race in the state machine of the CPUs and didn't find it. I need a better understanding of the internals. > but this seems like a sensible change in any case, so I'm ok > with applying something like this. > > A detail however.. > > [snip] >> #if !defined(CONFIG_USER_ONLY) >> + >> +target_ulong cpu_ppc_papr_pece_bits(CPUPPCState *env) >> +{ >> +switch (env->mmu_model) { >> +case POWERPC_MMU_3_00: >> +return LPCR_PDEE | LPCR_HDEE | LPCR_EEE | LPCR_DEE | LPCR_OEE; >> +default: >> +/* P7 and P8 has slightly different PECE bits, mostly because P8 >> adds >> + * bit 47 and 48 which are reserved on P7. Here we set them all, >> which >> + * will work as expected for both implementations >> + */ >> +return LPCR_P8_PECE0 | LPCR_P8_PECE1 | LPCR_P8_PECE2 | >> LPCR_P8_PECE3 | >> +LPCR_P8_PECE4; >> +} >> +} > > ..since we're working in this area, might as well clean up this > inappropriate use of mmu_model. Two options which I'd be ok with: > > 1) Add a pece_bits field to the PowerPCCPUClass, correctly initialized > for the various processors. > > 2) A similar helper but using ppc_check_compat() to check the arch > level, instead of using env->mmu_model. > OK. Thanks, C.
Re: [Qemu-devel] qemu iotest 020 failing for vmdk after 2b7731938d9
On 2017-11-21 23:16, John Snow wrote: > Commit 2b7731938d9 adds a blkdebug driver test for failing commits, but > the vmdk driver doesn't appear to appreciate this format: > > +_qemu_img_wrapper create -f vmdk -b "json:{'driver': 'raw', > + 'file': { > + 'driver': 'blkdebug', > + 'inject-error': [{ > + 'event': 'write_aio', > + 'errno': 28, > + 'once': true > + }], > + 'image': { > + 'driver': 'null-co' > + }}}" > "/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/t.vmdk" > +qemu-img: TEST_DIR/t.IMGFMT: Could not create image: Invalid argument > > > ...so; > > (A) VMDK should be dropped from 020, or > (B) This sub-test should be rewritten, or > (C) This sub-test should be split out into a new unit where VMDK can be > dropped. > > I don't like (A) very much because I like testing our weird formats when > possible, I don't like (B) very much because I don't really like > wrangling QMP commands inside of the bash unit tests. > > (C) Could work; though it's odd to have it away from its kin in 020. The issue is not blkdebug, the issue is that the backing file is not in vmdk format (because it's a null-co node at heart). I can make it a VMDK, but that'll be fun either when specifying the backing file file.image node because this test is supposed to work with all protocols but NFS; or I'll have to resort to the gold old blkdebug:conf:file syntax... But why not, it works. Now I'll just have to figure this out: -qemu-img: Block job failed: No space left on device +Image committed. :-/ Max signature.asc Description: OpenPGP digital signature
[Qemu-devel] [PATCH 3/3] hyperv: make SynIC version msr constant
The value of HV_X64_MSR_SVERSION is initialized once at vcpu init, and is reset to zero on vcpu reset, which is wrong. It is supposed to be a constant, so drop the field from X86CPU, set the msr with the constant value, and don't bother getting it. Signed-off-by: Roman Kagan --- target/i386/cpu.h | 1 - target/i386/kvm.c | 9 ++--- 2 files changed, 2 insertions(+), 8 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index ea9db80de5..b264419678 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1101,7 +1101,6 @@ typedef struct CPUX86State { uint64_t msr_hv_crash_params[HV_CRASH_PARAMS]; uint64_t msr_hv_runtime; uint64_t msr_hv_synic_control; -uint64_t msr_hv_synic_version; uint64_t msr_hv_synic_evt_page; uint64_t msr_hv_synic_msg_page; uint64_t msr_hv_synic_sint[HV_SINT_COUNT]; diff --git a/target/i386/kvm.c b/target/i386/kvm.c index ea6e6e5f30..0479fa4e4a 100644 --- a/target/i386/kvm.c +++ b/target/i386/kvm.c @@ -669,7 +669,6 @@ static int hyperv_handle_properties(CPUState *cs) } env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE; -env->msr_hv_synic_version = HV_SYNIC_VERSION; } if (cpu->hyperv_stimer) { if (!has_msr_hv_stimer) { @@ -1715,10 +1714,10 @@ static int kvm_put_msrs(X86CPU *cpu, int level) if (cpu->hyperv_synic) { int j; +kvm_msr_entry_add(cpu, HV_X64_MSR_SVERSION, HV_SYNIC_VERSION); + kvm_msr_entry_add(cpu, HV_X64_MSR_SCONTROL, env->msr_hv_synic_control); -kvm_msr_entry_add(cpu, HV_X64_MSR_SVERSION, - env->msr_hv_synic_version); kvm_msr_entry_add(cpu, HV_X64_MSR_SIEFP, env->msr_hv_synic_evt_page); kvm_msr_entry_add(cpu, HV_X64_MSR_SIMP, @@ -2082,7 +2081,6 @@ static int kvm_get_msrs(X86CPU *cpu) uint32_t msr; kvm_msr_entry_add(cpu, HV_X64_MSR_SCONTROL, 0); -kvm_msr_entry_add(cpu, HV_X64_MSR_SVERSION, 0); kvm_msr_entry_add(cpu, HV_X64_MSR_SIEFP, 0); kvm_msr_entry_add(cpu, HV_X64_MSR_SIMP, 0); for (msr = HV_X64_MSR_SINT0; msr <= HV_X64_MSR_SINT15; msr++) { @@ -2286,9 +2284,6 @@ static int kvm_get_msrs(X86CPU *cpu) case HV_X64_MSR_SCONTROL: env->msr_hv_synic_control = msrs[i].data; break; -case HV_X64_MSR_SVERSION: -env->msr_hv_synic_version = msrs[i].data; -break; case HV_X64_MSR_SIEFP: env->msr_hv_synic_evt_page = msrs[i].data; break; -- 2.14.3
[Qemu-devel] [PATCH 1/3] hyperv: set partition-wide MSRs only on first vcpu
From: Evgeny Yakovlev Hyper-V has a notion of partition-wide MSRs. Those MSRs are read and written as usual on each VCPU, however the hypervisor maintains a single global value for all VCPUs. Thus writing such an MSR from any single VCPU affects the global value that is read by all other VCPUs. This leads to an issue during VCPU hotplug: the zero-initialzied values of those MSRs get synced into KVM and override the global values as has already been set by the guest. This change makes the partition-wide MSRs only be synchronized on the first vcpu. Signed-off-by: Evgeny Yakovlev Signed-off-by: Roman Kagan --- target/i386/cpu.h | 5 - target/i386/kvm.c | 23 +++ 2 files changed, 19 insertions(+), 9 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index b086b1528b..ea9db80de5 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -1091,10 +1091,13 @@ typedef struct CPUX86State { uint64_t async_pf_en_msr; uint64_t pv_eoi_en_msr; +/* Partition-wide HV MSRs, will be updated only on the first vcpu */ uint64_t msr_hv_hypercall; uint64_t msr_hv_guest_os_id; -uint64_t msr_hv_vapic; uint64_t msr_hv_tsc; + +/* Per-VCPU HV MSRs */ +uint64_t msr_hv_vapic; uint64_t msr_hv_crash_params[HV_CRASH_PARAMS]; uint64_t msr_hv_runtime; uint64_t msr_hv_synic_control; diff --git a/target/i386/kvm.c b/target/i386/kvm.c index b1e32e95d3..563967241b 100644 --- a/target/i386/kvm.c +++ b/target/i386/kvm.c @@ -1678,19 +1678,26 @@ static int kvm_put_msrs(X86CPU *cpu, int level) kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, env->msr_global_ctrl); } -if (has_msr_hv_hypercall) { -kvm_msr_entry_add(cpu, HV_X64_MSR_GUEST_OS_ID, - env->msr_hv_guest_os_id); -kvm_msr_entry_add(cpu, HV_X64_MSR_HYPERCALL, - env->msr_hv_hypercall); +/* + * Hyper-V partition-wide MSRs: to avoid clearing them on cpu hot-add, + * only sync them to KVM on the first cpu + */ +if (current_cpu == first_cpu) { +if (has_msr_hv_hypercall) { +kvm_msr_entry_add(cpu, HV_X64_MSR_GUEST_OS_ID, + env->msr_hv_guest_os_id); +kvm_msr_entry_add(cpu, HV_X64_MSR_HYPERCALL, + env->msr_hv_hypercall); +} +if (cpu->hyperv_time) { +kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC, + env->msr_hv_tsc); +} } if (cpu->hyperv_vapic) { kvm_msr_entry_add(cpu, HV_X64_MSR_APIC_ASSIST_PAGE, env->msr_hv_vapic); } -if (cpu->hyperv_time) { -kvm_msr_entry_add(cpu, HV_X64_MSR_REFERENCE_TSC, env->msr_hv_tsc); -} if (has_msr_hv_crash) { int j; -- 2.14.3
[Qemu-devel] [PATCH 2/3] hyperv: ensure SINTx msrs are reset properly
Initially SINTx msrs should be in "masked" state. To ensure that happens on *every* reset, move setting their values to kvm_arch_vcpu_reset. Signed-off-by: Roman Kagan --- target/i386/kvm.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/target/i386/kvm.c b/target/i386/kvm.c index 563967241b..ea6e6e5f30 100644 --- a/target/i386/kvm.c +++ b/target/i386/kvm.c @@ -662,8 +662,6 @@ static int hyperv_handle_properties(CPUState *cs) env->features[FEAT_HYPERV_EAX] |= HV_VP_RUNTIME_AVAILABLE; } if (cpu->hyperv_synic) { -int sint; - if (!has_msr_hv_synic || kvm_vcpu_enable_cap(cs, KVM_CAP_HYPERV_SYNIC, 0)) { fprintf(stderr, "Hyper-V SynIC is not supported by kernel\n"); @@ -672,9 +670,6 @@ static int hyperv_handle_properties(CPUState *cs) env->features[FEAT_HYPERV_EAX] |= HV_SYNIC_AVAILABLE; env->msr_hv_synic_version = HV_SYNIC_VERSION; -for (sint = 0; sint < ARRAY_SIZE(env->msr_hv_synic_sint); sint++) { -env->msr_hv_synic_sint[sint] = HV_SINT_MASKED; -} } if (cpu->hyperv_stimer) { if (!has_msr_hv_stimer) { @@ -1053,6 +1048,13 @@ void kvm_arch_reset_vcpu(X86CPU *cpu) } else { env->mp_state = KVM_MP_STATE_RUNNABLE; } + +if (cpu->hyperv_synic) { +int i; +for (i = 0; i < ARRAY_SIZE(env->msr_hv_synic_sint); i++) { +env->msr_hv_synic_sint[i] = HV_SINT_MASKED; +} +} } void kvm_arch_do_init_vcpu(X86CPU *cpu) -- 2.14.3
[Qemu-devel] [PATCH 0/3] hyperv: hv msr initialization fixes
These patches fix problems with hyperv msr initialization. Evgeny Yakovlev (1): hyperv: set partition-wide MSRs only on first vcpu Roman Kagan (2): hyperv: ensure SINTx msrs are reset properly hyperv: make SynIC version msr constant target/i386/cpu.h | 6 -- target/i386/kvm.c | 44 2 files changed, 28 insertions(+), 22 deletions(-) -- 2.14.3
Re: [Qemu-devel] [RFC PATCH v2 1/3] s390x/ccs: add ccw-testdev emulated device
On Wed, 8 Nov 2017 17:54:20 +0100 Halil Pasic wrote: Subject: s/ccs/css/ > Add a fake device meant for testing the correctness of our css emulation. > > What we currently have is writing a Fibonacci sequence of uint32_t to the > device via ccw write. The write is going to fail if it ain't a Fibonacci > and indicate a device exception in scsw together with the proper residual > count. With this we can do basic verification of data integrity. > > Of course lot's of invalid inputs (besides basic data processing) can be > tested with that as well. > > We also have a no-op mode where the device just tells all-good! This > could be useful for fuzzing. > > Usage: > 1) fire up a qemu with something like -device ccw-testdev,devno=fe.0.0001 >on the command line > 2) exercise the device from the guest > > Signed-off-by: Halil Pasic > --- > > Introduction > > > While discussing v1 we (more or less) decided a test device for ccw is a > good idea. This is an RFC because there are some unresolved technical > questions I would like to discuss. > > Usage > - > > Build like this: make CONFIG_CCW_TESTDEV=y > > Changelog > - > > v1 -> v2: > - Renamed and moved to hw/misc/ > - Changed cu_type to 0xfffe. > - Changed chpid_type to 0x25 (questionable). > - Extensibility: addedd both in-band (ccw) and out-of-band set mode > mechanism. Currently we have two modes: fib and no-op. The purpose > of the mode mechanism is to facilitate different behaviors. One can > both specify and lock a mode on the command line. > - Added read for fib mode. > > Things I've figured out and things to figure out > --- > > The zVM folks say they don't have a reserved cu_type reserved for test > (or hypervisor use in general). So I've gone with cu_type 0xfffe because > according to Christian only numeric digit hex values are used, and Linux > uses x0 as extremal value. ack > > The zVM folks say the don't have a chpid_type reserved for test (or > hypervisor use in general. So I took 0x25 for now, which belongs to FC > and should be safe in my opinion. That's fine, I think. > > AFAIR there was some discussion on using a dedicated diag function code. > There are multiple such that could be re-purposed for out-of-band > signaling reserved by zVM. For now I've decided to go with a new subcode > for diag 0x500 as it appears to me that it requires less changes. It means that we don't have to synchronize with anyone else, yes. > > I've implemented both in-band and out-of-band signaling for influencing > what the testdev does from the guest. We should probably get rid of one. > I've just implemented both so we can discuss pros and cons with some > code. > > I've taken subcode 254, the last one supported at the moment. I'm not > really happy with the way I 'took' it. Maybe all taken subcodes > could go into hw/s390x/s390-virtio-hcall.h either via include or > directly. For virtio-ccw, we're going the way via the standard headers (as for other virtio things). The last used subcode for the old virtio transport is already in this file (in s390-next). We currently have Documentation/virtual/kvm/s390-diag.txt in the kernel. Not sure whether that would be a good place to document the testdev subcode. > > I'm not really happy with the side effects of moving it to hw/misc, which > ain't s390x specific. I've pretty much bounced off the build system, so > I would really appreciate some help here. Currently you have to say you > want it when you do make or it won't get compiled into your qemu. IMHO > this device only makes sense for testing and should not be rutinely > shipped in production builds. That is why I did not touch > default-configs/s390x-softmmu.mak. The pci and isa testdevs seem to be included on all platforms that support isa/pci (except for the pci testdev on s390x). They do have their own config symbols, so they can be manually disabled should it be desired. This seems like a good way to go for us as well. > > I think, I have the most problematic places marked with a TODO > comment in the code. I'll save looking at the code for another day.
[Qemu-devel] [PATCH] kvm: apic: save and restore x2APIC LDR
QEMU saves only 8 bits of APIC LDR, which means that it does not support x2APIC. The correct way of fixing this would be to save and restore the full 32 bit register, but because x2APIC LDR is a function of x2APIC ID, we can also compute it and keep the migration format untouched. KVM always expected the LDR format to follow the xAPIC/x2APIC standard, but pre 4.1 KVMs used non-standard x2APIC ID in case the OS changed xAPIC ID before switching to x2APIC, which means that QEMU has to use the kvm_x2apic_api feature to derive the x2APIC ID. This bug has also been addressed on the KVM side with patch 5849d75a5c9b ("KVM: lapic: Fixup LDR on load in x2apic"). Reported-by: Dr. David Alan Gilbert Reported-by: Yiqian Wei Signed-off-by: Radim Krčmář --- I haven't tested that it actually fixes the bug, https://bugzilla.redhat.com/show_bug.cgi?id=1502591. hw/i386/kvm/apic.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c index 1df6d26816f9..89df165a04bf 100644 --- a/hw/i386/kvm/apic.c +++ b/hw/i386/kvm/apic.c @@ -30,6 +30,13 @@ static inline uint32_t kvm_apic_get_reg(struct kvm_lapic_state *kapic, return *((uint32_t *)(kapic->regs + (reg_id << 4))); } +static inline uint32_t kvm_apic_calc_x2apic_ldr(APICCommonState *s) +{ +uint32_t id = kvm_has_x2apic_api() ? s->initial_apic_id : s->id; + +return ((id >> 4) << 16) | (1 << (id & 0xf)); +} + static void kvm_put_apic_state(APICCommonState *s, struct kvm_lapic_state *kapic) { int i; @@ -41,7 +48,11 @@ static void kvm_put_apic_state(APICCommonState *s, struct kvm_lapic_state *kapic kvm_apic_set_reg(kapic, 0x2, s->id << 24); } kvm_apic_set_reg(kapic, 0x8, s->tpr); -kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24); +if (s->apicbase & MSR_IA32_APICBASE_EXTD) { +kvm_apic_set_reg(kapic, 0xd, kvm_apic_calc_x2apic_ldr(s)); +} else { +kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24); +} kvm_apic_set_reg(kapic, 0xe, s->dest_mode << 28 | 0x0fff); kvm_apic_set_reg(kapic, 0xf, s->spurious_vec); for (i = 0; i < 8; i++) { @@ -71,7 +82,11 @@ void kvm_get_apic_state(DeviceState *dev, struct kvm_lapic_state *kapic) } s->tpr = kvm_apic_get_reg(kapic, 0x8); s->arb_id = kvm_apic_get_reg(kapic, 0x9); -s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24; +if (s->apicbase & MSR_IA32_APICBASE_EXTD) { +assert(kvm_apic_get_reg(kapic, 0xd) == kvm_apic_calc_x2apic_ldr(s)); +} else { +s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24; +} s->dest_mode = kvm_apic_get_reg(kapic, 0xe) >> 28; s->spurious_vec = kvm_apic_get_reg(kapic, 0xf); for (i = 0; i < 8; i++) { -- 2.14.2
Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben: > On 22/11/2017 16:33, Deepa Srinivasan wrote: > > Starting qemu with the following arguments causes qemu to segfault: > > ... -device lsi,id=lsi0 -drive > > file=iscsi:<...>,format=raw,if=none,node-name= > > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1 > > > > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to > > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. > > More > > details about the bug follow. > > > > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the > > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter(). > > > > When blk_aio_ioctl() is executed from within a coroutine context (e.g. > > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to > > the current coroutine's wakeup queue. blk_aio_ioctl() then returns. > > > > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: > > > > BlkRwCo *rwco = &acb->rwco; > > > > rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, > > rwco->qiov->iov[0].iov_base); <--- qiov is > > invalid > > here > > ... > > > > In the case when blk_aio_ioctl() is called from a non-coroutine context, > > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls > > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine > > execution is complete, control returns to blk_aio_ioctl_entry() after the > > call > > to blk_co_ioctl(). There is no invalid reference after this point, but the > > function is still holding on to invalid pointers. > > > > The fix is to allocate memory for the QEMUIOVector and struct iovec as part > > of > > the request struct which the IO buffer is part of. The memory for this > > struct is > > guaranteed to be valid till the AIO is completed. > > > > Signed-off-by: Deepa Srinivasan > > Signed-off-by: Konrad Rzeszutek Wilk > > Reviewed-by: Mark Kanda > > --- > > block/block-backend.c | 13 ++--- > > hw/block/virtio-blk.c | 9 - > > hw/scsi/scsi-disk.c| 10 +- > > hw/scsi/scsi-generic.c | 9 - > > include/sysemu/block-backend.h | 2 +- > > 5 files changed, 28 insertions(+), 15 deletions(-) > > > > diff --git a/block/block-backend.c b/block/block-backend.c > > index baef8e7..c275827 100644 > > --- a/block/block-backend.c > > +++ b/block/block-backend.c > > @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque) > > blk_aio_complete(acb); > > } > > > > -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void > > *buf, > > +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, > > QEMUIOVector *qiov, > >BlockCompletionFunc *cb, void *opaque) > > I think this is not the best way to fix the bug, because it adds extra > unnecessary code in the callers. > > Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the > same for blk_aio_prwv's "qiov" argument? > > Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just > use rwco->buf. But the same struct is used for read and write requests that do use an actual QEMUIOVector and not just a linear buffer. Kevin
Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Am 22.11.2017 um 18:06 hat Stefan Hajnoczi geschrieben: > On Wed, Nov 22, 2017 at 07:33:28AM -0800, Deepa Srinivasan wrote: > > Starting qemu with the following arguments causes qemu to segfault: > > ... -device lsi,id=lsi0 -drive > > file=iscsi:<...>,format=raw,if=none,node-name= > > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1 > > > > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to > > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. > > More > > details about the bug follow. > > > > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the > > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter(). > > > > When blk_aio_ioctl() is executed from within a coroutine context (e.g. > > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to > > the current coroutine's wakeup queue. blk_aio_ioctl() then returns. > > > > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: > > > > BlkRwCo *rwco = &acb->rwco; > > > > rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, > > rwco->qiov->iov[0].iov_base); <--- qiov is > > invalid > > here > > ... > > > > In the case when blk_aio_ioctl() is called from a non-coroutine context, > > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls > > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine > > execution is complete, control returns to blk_aio_ioctl_entry() after the > > call > > to blk_co_ioctl(). There is no invalid reference after this point, but the > > function is still holding on to invalid pointers. > > > > The fix is to allocate memory for the QEMUIOVector and struct iovec as part > > of > > the request struct which the IO buffer is part of. The memory for this > > struct is > > guaranteed to be valid till the AIO is completed. > > Thanks for the patch! > > AIO APIs currently don't require the caller to match qiov's lifetime to > the I/O request lifetime. This patch changes that for blk_aio_ioctl() > only. If we want to do this consistently then all aio callers need to > be audited and fixed. > > The alternative is to make the API copy qiov when necessary. That is > less efficient but avoids modifying all callers. > > Either way, the lifetime of qiov must be consistent across all aio APIs, > not just blk_aio_ioctl(). Don't all blk_aio_*() APIs that take a qiov pointer require that it remains valid until the request completes? I don't think they are copied anywhere for blk_aio_preadv/pwritev() before being passed to the block driver. So this does look consistent with the existing functions to me. Kevin signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH 01/12] e500: add board config options
On 11/21/2017 09:28 PM, David Gibson wrote: > On Sun, Nov 19, 2017 at 09:24:09PM -0600, Michael Davidsaver wrote: >> allow board code to skip common NIC and guest image setup >> and configure decrementor frequency. >> Existing boards unchanged. >> >> Signed-off-by: Michael Davidsaver > > So, it's spelled "decrementer". > > Other than that, the patch looks correct. However having a big common > function for overall init with a pile of ad-hoc configuration > parameters is usually not a great way to go. I think what we want > instead is to eliminate ppce500_init(), instead doing the setup logic > separately in each of the e500 machines. The large common slabs of > code can be helpers in e500.c, but the overall logic - including most > of the things controlled by the current params - would be under the > individual machine's control. I agree that ppce500_init() is unwieldy, and am willing to spend some time on cleanup. I'm just not sure what to do. I can see moving initialization of at least the serial, i2c, gpio, and possibly MPIC and PCI host bridge into the e500-ccsr class. I'm not sure what to do with all the device tree construction code in e500.c, which has a bunch of hard coded register offsets. A comment here summarizes the problem nicely. > /* TODO: parameterize */ > #define MPC8544_CCSRBAR_SIZE 0x0010ULL > #define MPC8544_MPIC_REGS_OFFSET 0x4ULL It seems desirable to avoid having these offsets appear in two different files, which could allow them to get out of sync. I had the idea of using an Interface to split up device tree construction, and was encouraged to find PnvXScomInterfaceClass which seems be a way of combining device tree construction in a device class. Is this the way to go? Some guidance would be appreciated. Michael >> --- >> hw/ppc/e500.c | 8 ++-- >> hw/ppc/e500.h | 3 +++ >> hw/ppc/e500plat.c | 1 + >> hw/ppc/mpc8544ds.c | 1 + >> 4 files changed, 11 insertions(+), 2 deletions(-) >> >> diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c >> index 5cf0dabef3..9e7e1b29c4 100644 >> --- a/hw/ppc/e500.c >> +++ b/hw/ppc/e500.c >> @@ -826,7 +826,7 @@ void ppce500_init(MachineState *machine, PPCE500Params >> *params) >> env->mpic_iack = params->ccsrbar_base + >> MPC8544_MPIC_REGS_OFFSET + 0xa0; >> >> -ppc_booke_timers_init(cpu, 4, PPC_TIMER_E500); >> +ppc_booke_timers_init(cpu, params->decrementor_freq, >> PPC_TIMER_E500); >> >> /* Register reset handler */ >> if (!i) { >> @@ -899,7 +899,7 @@ void ppce500_init(MachineState *machine, PPCE500Params >> *params) >> if (!pci_bus) >> printf("couldn't create PCI controller!\n"); >> >> -if (pci_bus) { >> +if (pci_bus && !params->tsec_nic) { >> /* Register network interfaces. */ >> for (i = 0; i < nb_nics; i++) { >> pci_nic_init_nofail(&nd_table[i], pci_bus, "virtio", NULL); >> @@ -948,6 +948,10 @@ void ppce500_init(MachineState *machine, PPCE500Params >> *params) >> sysbus_mmio_get_region(s, 0)); >> } >> >> +if (params->skip_load) { >> +return; >> +} >> + >> /* Load kernel. */ >> if (machine->kernel_filename) { >> kernel_base = cur_base; >> diff --git a/hw/ppc/e500.h b/hw/ppc/e500.h >> index 70ba1d8f4f..40f72f2de2 100644 >> --- a/hw/ppc/e500.h >> +++ b/hw/ppc/e500.h >> @@ -22,6 +22,9 @@ typedef struct PPCE500Params { >> hwaddr pci_mmio_base; >> hwaddr pci_mmio_bus_base; >> hwaddr spin_base; >> +uint32_t decrementor_freq; /* in Hz */ >> +bool skip_load; >> +bool tsec_nic; >> } PPCE500Params; >> >> void ppce500_init(MachineState *machine, PPCE500Params *params); >> diff --git a/hw/ppc/e500plat.c b/hw/ppc/e500plat.c >> index e59e80fb9e..3d07987bd1 100644 >> --- a/hw/ppc/e500plat.c >> +++ b/hw/ppc/e500plat.c >> @@ -47,6 +47,7 @@ static void e500plat_init(MachineState *machine) >> .pci_mmio_base = 0xCULL, >> .pci_mmio_bus_base = 0xE000ULL, >> .spin_base = 0xFEF00ULL, >> +.decrementor_freq = 4, >> }; >> >> /* Older KVM versions don't support EPR which breaks guests when we >> announce >> diff --git a/hw/ppc/mpc8544ds.c b/hw/ppc/mpc8544ds.c >> index 1717953ec7..6d9931c475 100644 >> --- a/hw/ppc/mpc8544ds.c >> +++ b/hw/ppc/mpc8544ds.c >> @@ -40,6 +40,7 @@ static void mpc8544ds_init(MachineState *machine) >> .pci_mmio_bus_base = 0xC000ULL, >> .pci_pio_base = 0xE100ULL, >> .spin_base = 0xEF00ULL, >> +.decrementor_freq = 4, >> }; >> >> if (machine->ram_size > 0xc000) { > signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [RFC PATCH 1/1] s390x: deprecate s390-squash-mcss machine prop
On Wed, 22 Nov 2017 15:18:32 +0100 Halil Pasic wrote: > With the cssids unrestricted (commit "s390x/css: unrestrict I think you can simply reference the title only. I plan to merge both in one go. > cssids") the s390-squash-mcss machine property should not be used. > Actually libvirt never supported this, so the expectation is that > removing it should be pretty painless. But let's play nice and deprecate > it first. > > Signed-off-by: Halil Pasic > --- > > I'm not sure about the warnig I generate if 's390-squash-mcss' > is set to 'on': > * There is prior art where we don't warn_report but info_report > (commit 83926ad527) Tbh, I don't really care about one or the other. > * Warning iff specified on the command line regardless of value > would probably be cleaner. Yet I'm not sure where would I need > to hook in to do that. Yeah, I don't see a good way to do that either. OTOH, I don't really think people set this to false explicitly. > > I would like to reference a commit for "s390x/css: unrestrict cssids" > which is currently in RFC status on the list. As said, I'll just merge them in one go. We can reference the commit in the changelog, though. There also should be an entry in https://wiki.qemu.org/Features/LegacyRemoval, and we should update the advice in https://wiki.qemu.org/index.php/Features/Channel_I/O_Passthrough. > --- > hw/s390x/s390-virtio-ccw.c | 5 - > qemu-doc.texi | 6 ++ > qemu-options.hx| 6 +- > 3 files changed, 15 insertions(+), 2 deletions(-) > > diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c > index 6a57f94197..c71e1d7e55 100644 > --- a/hw/s390x/s390-virtio-ccw.c > +++ b/hw/s390x/s390-virtio-ccw.c > @@ -554,6 +554,9 @@ static inline void machine_set_squash_mcss(Object *obj, > bool value, > S390CcwMachineState *ms = S390_CCW_MACHINE(obj); > > ms->s390_squash_mcss = value; > +if (ms->s390_squash_mcss) { > +warn_report("the machine property 's390-squash-mcss' is deprecated"); s/the/The/ You should also add an explanation ("obsoleted by lifting the cssid restrictions"). > +} > } > > static inline void s390_machine_initfn(Object *obj) > @@ -583,7 +586,7 @@ static inline void s390_machine_initfn(Object *obj) > object_property_add_bool(obj, "s390-squash-mcss", > machine_get_squash_mcss, > machine_set_squash_mcss, NULL); > -object_property_set_description(obj, "s390-squash-mcss", > +object_property_set_description(obj, "s390-squash-mcss", "(deprecated) " > "enable/disable squashing subchannels into the default css", > NULL); > object_property_set_bool(obj, false, "s390-squash-mcss", NULL); > diff --git a/qemu-doc.texi b/qemu-doc.texi > index d383ac44d4..90aa92d8b9 100644 > --- a/qemu-doc.texi > +++ b/qemu-doc.texi > @@ -2500,6 +2500,12 @@ enabled via the ``-machine usb=on'' argument. > > The ``-nodefconfig`` argument is a synonym for ``-no-user-config``. > > +@subsection -machine virtio-ccw,s390-squash-mcss=on|off (since 2.12.0) > + > +Since version 2.12.0 the cssid can be chosen freely. Instead of squashing > +mcss for guest that don't support multiple channel subsystems we recommend > +putting all devices into the default channel subsystem (that is 0xfe). "The ``s390-squash-mcss=on`` property has been obsoleted by allowing the cssid to be chosen freely. Instead of squashing subchannels into the default channel subsystem image for guests that do not support multiple channel subsystems, all devices can be put into the default channel subsystem image." > + > @section qemu-img command line arguments > > @subsection convert -s (since 2.0.0) > diff --git a/qemu-options.hx b/qemu-options.hx > index 3728e9b4dd..53b13ec203 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -43,7 +43,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \ > "suppress-vmdesc=on|off disables self-describing > migration (default=off)\n" > "nvdimm=on|off controls NVDIMM support (default=off)\n" > "enforce-config-section=on|off enforce configuration > section migration (default=off)\n" > -"s390-squash-mcss=on|off controls support for squashing > into default css (default=off)\n", > +"s390-squash-mcss=on|off (deprecated) controls support > for squashing into default css (default=off)\n", > QEMU_ARCH_ALL) > STEXI > @item -machine [type=]@var{name}[,prop=@var{value}[,...]] > @@ -98,6 +98,10 @@ Enables or disables NVDIMM support. The default is off. > @item s390-squash-mcss=on|off > Enables or disables squashing subchannels into the default css. > The default is off. > +NOTE: This property is deprecated and may be removed in future releases. s/may/will/ Let them complain :) > +Since version 2.12.0 the cssid can be chosen freely. Instead of squashing > +mcss for guest that don't
Re: [Qemu-devel] [PATCH 2/5] nbd/server: add nbd_opt_{read, drop} to track client->optlen
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy The subject line says what and sort of implies why, but the commit message body is rather sparse. > --- > nbd/server.c | 34 ++ > 1 file changed, 22 insertions(+), 12 deletions(-) Not a net win in lines of code, so a bit more justification may be helpful. > > diff --git a/nbd/server.c b/nbd/server.c > index bccc0274e7..c9445a89e9 100644 > --- a/nbd/server.c > +++ b/nbd/server.c > @@ -139,6 +139,19 @@ static void nbd_client_receive_next_request(NBDClient > *client); > > */ > > +static inline int nbd_opt_read(NBDClient *client, void *buffer, size_t size, > + Error **errp) > +{ > +client->optlen -= size; > +return qio_channel_read_all(client->ioc, buffer, size, errp) < 0 ? -EIO > : 0; Should this code check that client->optlen >= size, and issue the appropriate error message if not? That may centralize some of the error checking repeated elsewhere. > +} > + > +static inline int nbd_opt_drop(NBDClient *client, size_t size, Error **errp) > +{ > +client->optlen -= size; > +return nbd_drop(client->ioc, size, errp); Same question on size validation. > +} > + > /* Send a reply header, including length, but no payload. > * Return -errno on error, 0 on success. */ > static int nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type, > @@ -299,7 +312,7 @@ static int nbd_negotiate_handle_export_name(NBDClient > *client, > error_setg(errp, "Bad length received"); > return -EINVAL; > } > -if (nbd_read(client->ioc, name, client->optlen, errp) < 0) { > +if (nbd_opt_read(client, name, client->optlen, errp) < 0) { > error_prepend(errp, "read failed: "); > return -EINVAL; Here's an example caller that had a previous size validation. > } > @@ -383,40 +396,36 @@ static int nbd_negotiate_handle_info(NBDClient *client, > uint16_t myflags, > msg = "overall request too short"; > goto invalid; > } > -if (nbd_read(client->ioc, &namelen, sizeof(namelen), errp) < 0) { > +if (nbd_opt_read(client, &namelen, sizeof(namelen), errp) < 0) { > return -EIO; And again, a size validation right before the call. > } > be32_to_cpus(&namelen); > -client->optlen -= sizeof(namelen); Okay, part of the refactoring means you don't have to remember to track remaining size separately from reading; that's a nice feature. I suspect it is also possible to assert that client->optlen is zero before reading the next opt (I'm reviewing the patch in order, we'll see if I come back here). [1] > if (namelen > client->optlen - sizeof(requests) || > (client->optlen - namelen) % 2) > { > msg = "name length is incorrect"; > goto invalid; > } > -if (nbd_read(client->ioc, name, namelen, errp) < 0) { > +if (nbd_opt_read(client, name, namelen, errp) < 0) { > return -EIO; > } Another size check prior to the call (actually checking multiple conditions)... > name[namelen] = '\0'; > -client->optlen -= namelen; > trace_nbd_negotiate_handle_export_name_request(name); > > -if (nbd_read(client->ioc, &requests, sizeof(requests), errp) < 0) { > +if (nbd_opt_read(client, &requests, sizeof(requests), errp) < 0) { > return -EIO; ...and no direct size check before this call, because the earlier size checks already covered it. Arguably, the in-place size check error messages may be slightly more precise, but any message at all about unexpected sizing is appropriate (given that only broken clients should be sending incorrect sizes) - so I'm still leaning towards a stronger refactoring that puts the size check in the helper function. > @@ -521,7 +530,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, > uint16_t myflags, > return rc; > > invalid: > -if (nbd_drop(client->ioc, client->optlen, errp) < 0) { > +if (nbd_opt_drop(client, client->optlen, errp) < 0) { > return -EIO; > } > return nbd_negotiate_send_rep_err(client, NBD_REP_ERR_INVALID, > @@ -715,7 +724,7 @@ static int nbd_negotiate_options(NBDClient *client, > uint16_t myflags, > return -EINVAL; > > default: > -if (nbd_drop(client->ioc, length, errp) < 0) { > +if (nbd_opt_drop(client, length, errp) < 0) { Isn't length == client->optlen here? If so, can we omit the length parameter to nbd_opt_drop(), and instead have it defined as dropping to the end of the current option? > @@ -821,6 +830,7 @@ static int nbd_negotiate_options(NBDClient *client, > uint16_t myflags, > if (ret < 0) { > return ret; > } > +assert(client->optlen == 0); [1] Ah, you did add the assertion. I'm going to propose a variant on this patch, to cover the points I made above about ease-of-use (a
Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
On Wed, Nov 22, 2017 at 07:33:28AM -0800, Deepa Srinivasan wrote: > Starting qemu with the following arguments causes qemu to segfault: > ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name= > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1 > > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More > details about the bug follow. > > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter(). > > When blk_aio_ioctl() is executed from within a coroutine context (e.g. > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to > the current coroutine's wakeup queue. blk_aio_ioctl() then returns. > > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: > > BlkRwCo *rwco = &acb->rwco; > > rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, > rwco->qiov->iov[0].iov_base); <--- qiov is > invalid here > ... > > In the case when blk_aio_ioctl() is called from a non-coroutine context, > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine > execution is complete, control returns to blk_aio_ioctl_entry() after the call > to blk_co_ioctl(). There is no invalid reference after this point, but the > function is still holding on to invalid pointers. > > The fix is to allocate memory for the QEMUIOVector and struct iovec as part of > the request struct which the IO buffer is part of. The memory for this struct > is > guaranteed to be valid till the AIO is completed. Thanks for the patch! AIO APIs currently don't require the caller to match qiov's lifetime to the I/O request lifetime. This patch changes that for blk_aio_ioctl() only. If we want to do this consistently then all aio callers need to be audited and fixed. The alternative is to make the API copy qiov when necessary. That is less efficient but avoids modifying all callers. Either way, the lifetime of qiov must be consistent across all aio APIs, not just blk_aio_ioctl(). signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH] Add a blog post about HAXM acceleration on Windows
On 22.11.2017 15:52, Yu Ning wrote: > > On 11/22/2017 20:25, Thomas Huth wrote: [...] >> Would it be OK to remove the qemu-debian-wheezy-gui-with-haxm.png from >> the blog post? > > Yes, I'm fine with removing it. Sorry I haven't installed Jekyll and > didn't test the rendering. > > Would you confirm whether that's the only change I need to make in v2? No need to respin, since this was just a one-line change, I was able to do it on my own (I still removed the screenshot, even if it seemed to be working with Paolo's patch to the CSS, since the screenshot looked just a bit too big for the blog - I hope that's OK for you). So the blog post is now online: https://www.qemu.org/2017/11/22/haxm-usage-windows/ Thanks again, Thomas
[Qemu-devel] [Bug 1673976] Re: linux-user clone() can't handle glibc posix_spawn() (causes locale-gen to assert)
Thanks for tracking down the glibc change; I will try to set up a chroot with a more recent glibc to see whether we can do something that fixes that posix_spawn implementation... -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1673976 Title: linux-user clone() can't handle glibc posix_spawn() (causes locale-gen to assert) Status in QEMU: New Bug description: I'm running a command (locale-gen) inside of an armv7h chroot mounted on my x86_64 desktop by putting qemu-arm-static into /usr/bin/ of the chroot file system and I get a core dump. locale-gen Generating locales... en_US.UTF-8...localedef: ../sysdeps/unix/sysv/linux/spawni.c:360: __spawnix: Assertion `ec >= 0' failed. qemu: uncaught target signal 6 (Aborted) - core dumped /usr/bin/locale-gen: line 41:34 Aborted (core dumped) localedef -i $input -c -f $charset -A /usr/share/locale/locale.alias $locale I've done this same thing successfully for years, but this breakage has appeared some time in the last 3 or so months. Possibly with the update to qemu version 2.8. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1673976/+subscriptions
Re: [Qemu-devel] [qemu-web PATCH] css: avoid over-large images
On Wed, 22 Nov 2017 17:48:28 +0100 Christian Borntraeger wrote: > On 11/22/2017 05:37 PM, Paolo Bonzini wrote: > > Make sure that images are scaled to fit inside their container. > > Thanks God. I was thinking "what is wrong with the channel subsystem in > QEMU?" when > reading your subject. Me too. Hurray for overloaded TLAs ;) > > > Tested-by: Thomas Huth > > Reviewed-by: Thomas Huth > > Signed-off-by: Paolo Bonzini > > --- > > assets/css/style.css | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/assets/css/style.css b/assets/css/style.css > > index 2d4fe0c..b828887 100644 > > --- a/assets/css/style.css > > +++ b/assets/css/style.css > > @@ -200,6 +200,10 @@ > > > > /* Images */ > > > > + img { > > + max-width: 100%; > > + } > > + > > .image > > { > > display: inline-block; > > > >
Re: [Qemu-devel] [qemu-web PATCH] css: avoid over-large images
On 11/22/2017 05:37 PM, Paolo Bonzini wrote: > Make sure that images are scaled to fit inside their container. Thanks God. I was thinking "what is wrong with the channel subsystem in QEMU?" when reading your subject. > Tested-by: Thomas Huth > Reviewed-by: Thomas Huth > Signed-off-by: Paolo Bonzini > --- > assets/css/style.css | 4 > 1 file changed, 4 insertions(+) > > diff --git a/assets/css/style.css b/assets/css/style.css > index 2d4fe0c..b828887 100644 > --- a/assets/css/style.css > +++ b/assets/css/style.css > @@ -200,6 +200,10 @@ > > /* Images */ > > + img { > + max-width: 100%; > + } > + > .image > { > display: inline-block; >
Re: [Qemu-devel] [PATCH 08/12] e500: add mpc8540 i2c controller to ccsr
On 11/21/2017 10:08 PM, David Gibson wrote: > On Sun, Nov 19, 2017 at 09:24:16PM -0600, Michael Davidsaver wrote: >> Signed-off-by: Michael Davidsaver > > You're adding what seems to be a fairly specific device to the general > e500 init - this again suggests that it should be split, putting > creation of devices under control of individual machines. I'll address the ppce500_init() part of this question separately. In addition to the MPC8540, I find that documentation for the MPC8544 (modeled) and P2020 (un-modeled) show the same i2c controller registers. So I think it's reasonable for the generic ppce500 machine to have it as well. For what it's worth, the Linux driver for this unit (drivers/i2c/busses/i2c-mpc.c) lists compatibility with a number of other freescale SoCs with only some differences in clock selection (not modeled). The description for the module is: > MODULE_DESCRIPTION("I2C-Bus adapter for MPC107 bridge and " >"MPC824x/83xx/85xx/86xx/512x/52xx processors"); >> --- >> hw/ppc/e500.c | 8 >> 1 file changed, 8 insertions(+) >> >> diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c >> index 6f77844303..bef7d313d4 100644 >> --- a/hw/ppc/e500.c >> +++ b/hw/ppc/e500.c >> @@ -861,6 +861,14 @@ void ppce500_init(MachineState *machine, PPCE500Params >> *params) >> qdev_init_nofail(dev); >> ccsr_addr_space = sysbus_mmio_get_region(SYS_BUS_DEVICE(dev), 0); >> >> +dev = qdev_create(NULL, "mpc8540-i2c"); >> +object_property_add_child(qdev_get_machine(), "i2c[*]", >> + OBJECT(dev), NULL); >> +qdev_init_nofail(dev); >> +s = SYS_BUS_DEVICE(dev); >> +memory_region_add_subregion(ccsr_addr_space, 0x3000, >> +sysbus_mmio_get_region(s, 0)); >> + >> mpicdev = ppce500_init_mpic(machine, params, ccsr_addr_space, irqs); >> >> /* Serial */ > signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 1/5] nbd/server: refactor negotiation functions parameters
On 11/22/2017 04:19 AM, Vladimir Sementsov-Ogievskiy wrote: > Instead of passing currently negotiating option and its length to > many of negotiation functions let's just store them on NBDClient > struct to be state-variables of negotiation phase. > > This unifies semantics of negotiation functions and this allows to > track changes of remaining option length in future patches. "allows to $verb" is not idiomatic English; the options are "allows $subject to $verb" or "allows ${verb}ing" > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/server.c | 164 > +-- > 1 file changed, 82 insertions(+), 82 deletions(-) Not really a net win in lines of code; but I'll have to see how the other patches in the series are made easier by this. It relies on the fact that the NBD protocol states that negotiation phase is synchronous (at one point, it was argued that negotiation, like transmission, could allow out-of-order replies from the server, but we got that fixed in the spec to match existing practice, and this patch cements the fact that we are processing exactly one option at a time). Mechanically, the conversion looks sane, so: Reviewed-by: Eric Blake We'll see how I feel at the end of the series. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] Add a blog post about HAXM acceleration on Windows
On 22/11/2017 16:37, Thomas Huth wrote: > On 22.11.2017 15:42, Paolo Bonzini wrote: >> On 22/11/2017 13:25, Thomas Huth wrote: >>> diff --git a/assets/css/skel-noscript.css b/assets/css/skel-noscript.css >>> index e0a141e..f574b9f 100644 >>> --- a/assets/css/skel-noscript.css >>> +++ b/assets/css/skel-noscript.css >>> @@ -27,6 +27,7 @@ >>> -o-box-sizing: border-box; >>> -ms-box-sizing: border-box; >>> box-sizing: border-box; >>> + max-width:100%; >>> } >>> >>> /* Rows */ >>> >>> Paolo, does that look OK for you? >> >> That is black magic and I'm a bit afraid to touch it. >> >> Does this also work for you? >> >> diff --git a/assets/css/style.css b/assets/css/style.css >> index 2d4fe0c..d482fe9 100644 >> --- a/assets/css/style.css >> +++ b/assets/css/style.css >> @@ -200,6 +200,10 @@ >> >> /* Images */ >> >> +img { >> +max-width: 100%; >> +} >> + > > Yes, that seems to work, too. Okay, I've pushed it. Can you apply Yu Ning's patch? Thanks, Paolo
[Qemu-devel] [qemu-web PATCH] css: avoid over-large images
Make sure that images are scaled to fit inside their container. Tested-by: Thomas Huth Reviewed-by: Thomas Huth Signed-off-by: Paolo Bonzini --- assets/css/style.css | 4 1 file changed, 4 insertions(+) diff --git a/assets/css/style.css b/assets/css/style.css index 2d4fe0c..b828887 100644 --- a/assets/css/style.css +++ b/assets/css/style.css @@ -200,6 +200,10 @@ /* Images */ + img { + max-width: 100%; + } + .image { display: inline-block; -- 2.14.3
Re: [Qemu-devel] [PATCH 2/2] pc-bios/s390-ccw: zero out bss section
On 22.11.2017 15:26, Christian Borntraeger wrote: > The QEMU ELF loader does not zero the bss segment. > This resulted in several bugs, e.g. see > > commit 5d739a4787a5 (s390-ccw.img: Fix sporadic errors with ccw boot image - > initialize css) > commit 6a40fa2669d3 (s390-ccw.img: Initialize next_idx) > commit 8775d91a0f42 (pc-bios/s390-ccw: Fix problem with invalid virtio-scsi > LUN when rebooting) > > Lets fix this once and forever by letting the BIOS zero the bss itself. > > Suggested-by: Alexander Graf > Signed-off-by: Christian Borntraeger > --- > pc-bios/s390-ccw/start.S | 30 +++--- > 1 file changed, 27 insertions(+), 3 deletions(-) > > diff --git a/pc-bios/s390-ccw/start.S b/pc-bios/s390-ccw/start.S > index 43f9bd2..eb8d024 100644 > --- a/pc-bios/s390-ccw/start.S > +++ b/pc-bios/s390-ccw/start.S > @@ -3,7 +3,7 @@ > * into the pc-bios directory of qemu. > * > * Copyright (c) 2013 Alexander Graf > - * Copyright 2013 IBM Corp. > + * Copyright IBM Corp. 2013, 2017 > * > * This work is licensed under the terms of the GNU GPL, version 2 or (at > * your option) any later version. See the COPYING file in the top-level > @@ -13,8 +13,32 @@ > .globl _start > _start: > > -larl %r15, stack + 0x8000/* Set up stack */ > -jmain/* And call C */ > + larl %r15, stack + 0x8000 /* Set up stack */ > + > + /* clear bss */ > + larl %r2, __bss_start > + larl %r3, _end > + slgr %r3, %r2 /* get sizeof bss */ > + ltgr%r3,%r3 /* bss emtpy? */ > + jz done > + aghi%r3,-1 > + srlg%r4,%r3,8 /* how many 256 byte chunks? */ > + ltgr%r4,%r4 > + lgr %r1,%r2 > + jz remainder > +loop: > + xc 0(256,%r1),0(%r1) > + la %r1,256(%r1) > + brctg %r4,loop > +remainder: > + larl%r2,memsetxc > + ex %r3,0(%r2) Wow, with EXECUTE :-) > +done: > + j main /* And call C */ > + > +memsetxc: > + xc 0(1,%r1),0(%r1) > + > > /* > * void disabled_wait(void) > Looks good to me: Reviewed-by: Thomas Huth
Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
On 22/11/2017 16:33, Deepa Srinivasan wrote: > Starting qemu with the following arguments causes qemu to segfault: > ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name= > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1 > > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More > details about the bug follow. > > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter(). > > When blk_aio_ioctl() is executed from within a coroutine context (e.g. > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to > the current coroutine's wakeup queue. blk_aio_ioctl() then returns. > > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer: > > BlkRwCo *rwco = &acb->rwco; > > rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, > rwco->qiov->iov[0].iov_base); <--- qiov is > invalid here > ... > > In the case when blk_aio_ioctl() is called from a non-coroutine context, > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine > execution is complete, control returns to blk_aio_ioctl_entry() after the call > to blk_co_ioctl(). There is no invalid reference after this point, but the > function is still holding on to invalid pointers. > > The fix is to allocate memory for the QEMUIOVector and struct iovec as part of > the request struct which the IO buffer is part of. The memory for this struct > is > guaranteed to be valid till the AIO is completed. > > Signed-off-by: Deepa Srinivasan > Signed-off-by: Konrad Rzeszutek Wilk > Reviewed-by: Mark Kanda > --- > block/block-backend.c | 13 ++--- > hw/block/virtio-blk.c | 9 - > hw/scsi/scsi-disk.c| 10 +- > hw/scsi/scsi-generic.c | 9 - > include/sysemu/block-backend.h | 2 +- > 5 files changed, 28 insertions(+), 15 deletions(-) > > diff --git a/block/block-backend.c b/block/block-backend.c > index baef8e7..c275827 100644 > --- a/block/block-backend.c > +++ b/block/block-backend.c > @@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque) > blk_aio_complete(acb); > } > > -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void > *buf, > +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, > QEMUIOVector *qiov, >BlockCompletionFunc *cb, void *opaque) I think this is not the best way to fix the bug, because it adds extra unnecessary code in the callers. Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the same for blk_aio_prwv's "qiov" argument? Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just use rwco->buf. Thanks, Paolo > { > -QEMUIOVector qiov; > -struct iovec iov; > - > -iov = (struct iovec) { > -.iov_base = buf, > -.iov_len = 0, > -}; > -qemu_iovec_init_external(&qiov, &iov, 1); > - > -return blk_aio_prwv(blk, req, 0, &qiov, blk_aio_ioctl_entry, 0, cb, > opaque); > +return blk_aio_prwv(blk, req, 0, qiov, blk_aio_ioctl_entry, 0, cb, > opaque); > } > > int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes) > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c > index 05d1440..ed9f774 100644 > --- a/hw/block/virtio-blk.c > +++ b/hw/block/virtio-blk.c > @@ -151,6 +151,8 @@ out: > typedef struct { > VirtIOBlockReq *req; > struct sg_io_hdr hdr; > +QEMUIOVector qiov; > +struct iovec iov; > } VirtIOBlockIoctlReq; > > static void virtio_blk_ioctl_complete(void *opaque, int status) > @@ -298,7 +300,12 @@ static int virtio_blk_handle_scsi_req(VirtIOBlockReq > *req) > ioctl_req->hdr.sbp = elem->in_sg[elem->in_num - 3].iov_base; > ioctl_req->hdr.mx_sb_len = elem->in_sg[elem->in_num - 3].iov_len; > > -acb = blk_aio_ioctl(blk->blk, SG_IO, &ioctl_req->hdr, > +ioctl_req->iov.iov_base = &ioctl_req->hdr; > +ioctl_req->iov.iov_len = 0; > + > +qemu_iovec_init_external(&ioctl_req->qiov, &ioctl_req->iov, 1); > + > +acb = blk_aio_ioctl(blk->blk, SG_IO, &ioctl_req->qiov, > virtio_blk_ioctl_complete, ioctl_req); > if (!acb) { > g_free(ioctl_req); > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c > index 1243117..7cbe18d 100644 > --- a/hw/scsi/scsi-disk.c > +++ b/hw/scsi/scsi-disk.c > @@ -2636,6 +2636,9 @@ typedef struct SCSIBlockReq { > SCSIDiskReq req; > sg_io_hdr_t io_header; > > +QEMUIOVector qiov; > +struct iovec iov; > + > /* Selected bytes of the original CDB, copied into our own CDB. */ > uint8_t cmd, cdb1, group_number; > > @@ -2722,7 +272
Re: [Qemu-devel] [RFC PATCH 1/1] s390x/css: unresrict cssids
On Wed, 22 Nov 2017 15:45:56 +0100 Boris Fiuczynski wrote: > On 11/22/2017 01:13 PM, Cornelia Huck wrote: > >> +object_class_property_add_bool(klass, "cssid-unrestricted", > >> + prop_get_true, NULL, NULL); > > This looks really, really strange. This is a property that is always > > true if it exists. > > > Won't argue about that. The libvirt guys are actually not interested > int he value at all: only that there is such a property. > >>> So what about a machine property? Wouldn't that help as well? > >>> > >> Technically it is doable. The property would be still a weird > >> one, but from QEMU perspective in a less weird place. I was also > >> arguing that from OO perspective this kind of a don't care about > >> it's value property is weird, but AFAIK not having the info if > >> we can do something with a device at the device is weird from > >> Libvirt perspective. I'm really uncomforble with speaking for > >> Libvirt/Boris. Hope he will make his point tomorrow. > >> > >>> Or a css object? > >>> > >> My first internal-only version used this approach, but > >> I was asked to do it like this. > > If we formulate this rather as "we now expose the default css", we (a) > > have something that really makes the most sense at a channel subsystem > > or machine level, and (b) can be detected by libvirt as an indicator of > > "no restriction for virtual vs. non-virtual". > > I think that there are good reasons for using a device property as well > as for using a machine property. Technically both options are possible > (even for libvirt :) without too much rope...). At first my personal > choice was to express the changed behavior/capability on the device > level since that is what a user defines on the command line and also > where he gets restricted to use a css matching his device type. > From the libvirt perspective we would have supported vfio-ccw devices > only if the vfio-ccw would be allowed to be defined with unrestricted > cssids. Thanks, I can now understand where that property came from. > As I wrote before I also understand the reasoning to express the channel > subsystem wide changed behavior as a machine option. In that case > libvirt would need to check that both capabilities (vfio-ccw and machine > option cssid-unrestricted) are set and not just > vfio-cww.cssid-unrestricted. All doable. A qemu command line user would > obviously need to know the correlation of the machine option and the ccw > devices but that certainly is also nothing new. > When talking to Christian and Halil I started to favor the idea of a new > css object especially when thinking about the future in which the user > might want to specify the default css. It for sure would also be able to > be set with the use of a machine option but a css object would at that > point be much a nicer and more capable approach. What do you think? I think exposing the css object and making the default_css property available is the cleanest solution (especially if we want more css properties in the future). The only drawback I see is that it needs more coding (and probably care to keep it backwards compatible.) Halil, do you think you can come up with something?
Re: [Qemu-devel] [PATCH] iotests: fix 075 and 078
Am 22.11.2017 um 01:16 hat John Snow geschrieben: > Both of these tests are for formats which now stipulate that they are > read-only. Adjust the tests to match. > > Signed-off-by: John Snow Oh, right, seems I forgot about the tests... Thanks, applied to the block branch. Kevin