[Qemu-devel] [Bug 1843852] [NEW] QEMU does not express a dependency on perl-Test-Harness

2019-09-12 Thread John Snow
Public bug reported:

This is a minor thing; in Fedora you can install most of the developer
dependencies by issuing something like `dnf builddep qemu-kvm` and this
takes care of just about everything such that you can run ./configure
and make.

For "make check" though, configure doesn't catch that you'll need perl-
Test-Harness; so it fails halfway through the check routine, and you'll
see this:

```
Can't locate TAP/Parser.pm in @INC (you may need to install the TAP::Parser 
module) (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 
/usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 
/usr/share/perl5) at ./scripts/tap-driver.pl line 30.
BEGIN failed--compilation aborted at ./scripts/tap-driver.pl line 30.
make: *** [/home/jhuston/src/qemu/tests/Makefile.include:905: check-unit] Error 
2
```

I'm not sure how we should express this dependency; it shouldn't be a
requirement for building, but it IS a dependency for testing. We
probably ought not let users skip the qapi tests just because they don't
have the perl requirement met.

(And, separately, the Fedora package should list this as a builddep, but
that's not an issue for here.)

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843852

Title:
  QEMU does not express a dependency on perl-Test-Harness

Status in QEMU:
  New

Bug description:
  This is a minor thing; in Fedora you can install most of the developer
  dependencies by issuing something like `dnf builddep qemu-kvm` and
  this takes care of just about everything such that you can run
  ./configure and make.

  For "make check" though, configure doesn't catch that you'll need
  perl-Test-Harness; so it fails halfway through the check routine, and
  you'll see this:

  ```
  Can't locate TAP/Parser.pm in @INC (you may need to install the TAP::Parser 
module) (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 
/usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 
/usr/share/perl5) at ./scripts/tap-driver.pl line 30.
  BEGIN failed--compilation aborted at ./scripts/tap-driver.pl line 30.
  make: *** [/home/jhuston/src/qemu/tests/Makefile.include:905: check-unit] 
Error 2
  ```

  I'm not sure how we should express this dependency; it shouldn't be a
  requirement for building, but it IS a dependency for testing. We
  probably ought not let users skip the qapi tests just because they
  don't have the perl requirement met.

  (And, separately, the Fedora package should list this as a builddep,
  but that's not an issue for here.)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843852/+subscriptions



[Qemu-devel] [PATCH 2/2] roms/Makefile.edk2: don't pull in submodules when building from tarball

2019-09-12 Thread Michael Roth
Currently the `make efi` target pulls submodules nested under the
roms/edk2 submodule as dependencies. However, when we attempt to build
from a tarball this fails since we are no longer in a git tree.

A preceding patch will pre-populate these submodules in the tarball,
so assume this build dependency is only needed when building from a
git tree.

Reported-by: Bruce Rogers 
Cc: Laszlo Ersek 
Cc: Bruce Rogers 
Cc: qemu-sta...@nongnu.org # v4.1.0
Signed-off-by: Michael Roth 
---
 roms/Makefile.edk2 | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/roms/Makefile.edk2 b/roms/Makefile.edk2
index c2f2ff59d5..33a074d3a4 100644
--- a/roms/Makefile.edk2
+++ b/roms/Makefile.edk2
@@ -46,8 +46,13 @@ all: $(foreach 
flashdev,$(flashdevs),../pc-bios/edk2-$(flashdev).fd.bz2) \
 # files.
 .INTERMEDIATE: $(foreach flashdev,$(flashdevs),../pc-bios/edk2-$(flashdev).fd)
 
+# Fetch edk2 submodule's submodules. If it is not in a git tree, assume
+# we're building from a tarball and that they've already been fetched by
+# make-release/tarball scripts.
 submodules:
-   cd edk2 && git submodule update --init --force
+   if test -d edk2/.git; then \
+   cd edk2 && git submodule update --init --force; \
+   fi
 
 # See notes on the ".NOTPARALLEL" target and the "+" indicator in
 # "tests/uefi-test-tools/Makefile".
-- 
2.17.1




[Qemu-devel] [PATCH 0/2] Fix tarball builds of UEFI/EDK2 firmware

2019-09-12 Thread Michael Roth
Bruce noticed that we cannot build `make efi` target from the v4.1.0
tarball. This is due to a failure on the part of the make-release script
to pull in submodules nested under other submodules, as well as
Makefile.edk2's assumptions about being in a git tree.

Suggestions for more robust solutions are definitely welcome, but for
now this series takes what seems to be the most direct approach.

 roms/Makefile.edk2   | 7 ++-
 scripts/make-release | 8 
 2 files changed, 14 insertions(+), 1 deletion(-)





[Qemu-devel] [PATCH 1/2] make-release: pull in edk2 submodules so we can build it from tarballs

2019-09-12 Thread Michael Roth
The `make efi` target added by 536d2173 is built from the roms/edk2
submodule, which in turn relies on additional submodules nested under
roms/edk2.

The make-release script currently only pulls in top-level submodules,
so these nested submodules are missing in the resulting tarball.

We could try to address this situation more generally by recursively
pulling in all submodules, but this doesn't necessarily ensure the
end-result will build properly (this case also required other changes).

Additionally, due to the nature of submodules, we may not always have
control over how these sorts of things are dealt with, so for now we
continue to handle it on a case-by-case in the make-release script.

Reported-by: Bruce Rogers 
Cc: Laszlo Ersek 
Cc: Bruce Rogers 
Cc: qemu-sta...@nongnu.org # v4.1.0
Signed-off-by: Michael Roth 
---
 scripts/make-release | 8 
 1 file changed, 8 insertions(+)

diff --git a/scripts/make-release b/scripts/make-release
index b4af9c9e52..a2a8cda33c 100755
--- a/scripts/make-release
+++ b/scripts/make-release
@@ -20,6 +20,14 @@ git checkout "v${version}"
 git submodule update --init
 (cd roms/seabios && git describe --tags --long --dirty > .version)
 (cd roms/skiboot && ./make_version.sh > .version)
+# Fetch edk2 submodule's submodules, since it won't have access to them via
+# the tarball later.
+#
+# A more uniform way to handle this sort of situation would be nice, but we
+# don't necessarily have much control over how a submodule handles its
+# submodule dependencies, so we continue to handle these on a case-by-case
+# basis for now.
+(cd roms/edk2 && git submodule update --init)
 popd
 tar --exclude=.git -cjf ${destination}.tar.bz2 ${destination}
 rm -rf ${destination}
-- 
2.17.1




Re: [Qemu-devel] [PATCH] docker: add sanitizers back to clang build

2019-09-12 Thread John Snow



On 9/11/19 9:52 PM, no-re...@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20190912014442.5757-1-js...@redhat.com/
> 
> 
> 
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Subject: [Qemu-devel] [PATCH] docker: add sanitizers back to clang build
> Message-id: 20190912014442.5757-1-js...@redhat.com
> Type: series
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> git rev-parse base > /dev/null || exit 0
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> git config --local diff.algorithm histogram
> ./scripts/checkpatch.pl --mailback base..
> === TEST SCRIPT END ===
> 
> From https://github.com/patchew-project/qemu
>  * [new tag] patchew/20190912014442.5757-1-js...@redhat.com -> 
> patchew/20190912014442.5757-1-js...@redhat.com
> Switched to a new branch 'test'
> 96d44b9 docker: add sanitizers back to clang build
> 
> === OUTPUT BEGIN ===
> ERROR: Missing Signed-off-by: line(s)

GDI.

I keep adding this to my configuration files, but it keeps "falling
off", somehow.

I have some patches in the works for stgit where I'm going to work
through some test cases for setting profile variables and try to fix this.

In the meantime:

Signed-off-by: John Snow 



Re: [Qemu-devel] [PATCH 2/6] exec.c: remove an unnecessary assert on PHYS_MAP_NODE_NIL in phys_map_node_alloc()

2019-09-12 Thread Wei Yang
On Thu, Sep 12, 2019 at 02:42:26PM +0200, Paolo Bonzini wrote:
>On 12/09/19 04:51, Wei Yang wrote:
>> On Fri, Aug 23, 2019 at 09:07:50AM +0800, Wei Yang wrote:
>>> On Thu, Aug 22, 2019 at 12:24:32PM +0200, Paolo Bonzini wrote:
 On 21/03/19 09:25, Wei Yang wrote:
> PHYS_MAP_NODE_NIL is assigned to PhysPageEntry.ptr in case this is not a
> leaf entry, while map->nodes_nb range in [0, nodes_nb_alloc).
>
> Seems we are asserting on two different things, just remove it.

 The assertion checks that this "if" is not entered incorrectly:

if (lp->skip && lp->ptr == PHYS_MAP_NODE_NIL) {
lp->ptr = phys_map_node_alloc(map, level == 0);
}

>>>
>>> Hmm... I may not get your point.
>>>
>>> phys_map_node_alloc() will get an available PhysPageEntry and return its
>>> index, which will be assigned to its parent's ptr.
>>>
>>> The "if" checks on the parent's ptr, while the assertion asserts the index 
>>> for
>>> the new child. I may miss something?
>>>
>> 
>> Hi, Paolo,
>> 
>> Do I miss something?
>
>Sorry, I was on vacation.  phys_page_set_level can be called multiple
>times, with the same lp.  The assertion checks that only the first call
>will reach phys_map_node_alloc.
>

Ah, I am just back from vacation too. The mountain makes me painful. :-) 

So I guess you are talking about the situation of wrap around.
PHYS_MAP_NODE_NIL is an indicator that this lp is not used yet. And we are not
expecting any valid lp has its ptr with value equal or bigger than
PHYS_MAP_NODE_NIL.

If this is the case, I am thinking this won't happen in current
implementation. Because if my understanding is correct, the total number of
PhysPageEntry would be less than PHYS_MAP_NODE_NIL.

First let's look at the PHYS_MAP_NODE_NIL's value

PHYS_MAP_NODE_NIL = 2^26 = 6710 8864.

So we could represent 6710 8864 PhysPageEntry at most.

Then let's look how many PhysPageEntry would be. The PhysPageEntry structure
forms a tree with 9 nodes on each level with P_L2_LEVELS levels. This means
the tree could have 9^P_L2_LEVELS nodes at most. 

Since

#define ADDR_SPACE_BITS 64
#define P_L2_BITS 9
#define P_L2_LEVELS (((ADDR_SPACE_BITS - TARGET_PAGE_BITS - 1) / P_L2_BITS) + 1)

And TARGET_PAGE_BITS >= 12, so P_L2_LEVELS is smaller than 

7 = (64 - 12 - 1) / 9 + 1.

This leads to 

9^P_L2_LEVELS <= 9^7 = 478 2969

It shows PHYS_MAP_NODE_NIL may represents more node the tree could hold.

The assert here is not harmful, while maybe we can have a better way to handle
it?

>Paolo

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH v8 01/13] vfio: KABI for migration interface

2019-09-12 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, September 12, 2019 10:41 PM
> 
> On Tue, 3 Sep 2019 06:57:27 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Saturday, August 31, 2019 12:33 AM
> > >
> > > On Fri, 30 Aug 2019 08:06:32 +
> > > "Tian, Kevin"  wrote:
> > >
> > > > > From: Tian, Kevin
> > > > > Sent: Friday, August 30, 2019 3:26 PM
> > > > >
> > > > [...]
> > > > > > How does QEMU handle the fact that IOVAs are potentially
> dynamic
> > > while
> > > > > > performing the live portion of a migration?  For example, each
> time a
> > > > > > guest driver calls dma_map_page() or dma_unmap_page(), a
> > > > > > MemoryRegionSection pops in or out of the AddressSpace for the
> device
> > > > > > (I'm assuming a vIOMMU where the device AddressSpace is not
> > > > > > system_memory).  I don't see any QEMU code that intercepts that
> > > change
> > > > > > in the AddressSpace such that the IOVA dirty pfns could be
> recorded and
> > > > > > translated to GFNs.  The vendor driver can't track these beyond
> getting
> > > > > > an unmap notification since it only knows the IOVA pfns, which
> can be
> > > > > > re-used with different GFN backing.  Once the DMA mapping is
> torn
> > > down,
> > > > > > it seems those dirty pfns are lost in the ether.  If this works in
> QEMU,
> > > > > > please help me find the code that handles it.
> > > > >
> > > > > I'm curious about this part too. Interestingly, I didn't find any
> log_sync
> > > > > callback registered by emulated devices in Qemu. Looks dirty pages
> > > > > by emulated DMAs are recorded in some implicit way. But KVM
> always
> > > > > reports dirty page in GFN instead of IOVA, regardless of the
> presence of
> > > > > vIOMMU. If Qemu also tracks dirty pages in GFN for emulated DMAs
> > > > >  (translation can be done when DMA happens), then we don't need
> > > > > worry about transient mapping from IOVA to GFN. Along this way
> we
> > > > > also want GFN-based dirty bitmap being reported through VFIO,
> > > > > similar to what KVM does. For vendor drivers, it needs to translate
> > > > > from IOVA to HVA to GFN when tracking DMA activities on VFIO
> > > > > devices. IOVA->HVA is provided by VFIO. for HVA->GFN, it can be
> > > > > provided by KVM but I'm not sure whether it's exposed now.
> > > > >
> > > >
> > > > HVA->GFN can be done through hva_to_gfn_memslot in kvm_host.h.
> > >
> > > I thought it was bad enough that we have vendor drivers that depend
> on
> > > KVM, but designing a vfio interface that only supports a KVM interface
> > > is more undesirable.  I also note without comment that
> gfn_to_memslot()
> > > is a GPL symbol.  Thanks,
> >
> > yes it is bad, but sometimes inevitable. If you recall our discussions
> > back to 3yrs (when discussing the 1st mdev framework), there were
> similar
> > hypervisor dependencies in GVT-g, e.g. querying gpa->hpa when
> > creating some shadow structures. gpa->hpa is definitely hypervisor
> > specific knowledge, which is easy in KVM (gpa->hva->hpa), but needs
> > hypercall in Xen. but VFIO already makes assumption based on KVM-
> > only flavor when implementing vfio_{un}pin_page_external.
> 
> Where's the KVM assumption there?  The MAP_DMA ioctl takes an IOVA
> and
> HVA.  When an mdev vendor driver calls vfio_pin_pages(), we GUP the HVA
> to get an HPA and provide an array of HPA pfns back to the caller.  The
> other vGPU mdev vendor manages to make use of this without KVM... the
> KVM interface used by GVT-g is GPL-only.

To be clear it's the assumption on the host-based hypervisors e.g. KVM.
GUP is a perfect example, which doesn't work for Xen since DomU's
memory doesn't belong to Dom0. VFIO in Dom0 has to find the HPA
through Xen specific hypercalls.

> 
> > So GVT-g
> > has to maintain an internal abstraction layer to support both Xen and
> > KVM. Maybe someday we will re-consider introducing some hypervisor
> > abstraction layer in VFIO, if this issue starts to hurt other devices and
> > Xen guys are willing to support VFIO.
> 
> Once upon a time, we had a KVM specific device assignment interface,
> ie. legacy KVM devie assignment.  We developed VFIO specifically to get
> KVM out of the business of being a (bad) device driver.  We do have
> some awareness and interaction between VFIO and KVM in the vfio-kvm
> pseudo device, but we still try to keep those interfaces generic.  In
> some cases we're not very successful at that, see vfio_group_set_kvm(),
> but that's largely just a mechanism to associate a cookie with a group
> to be consumed by the mdev vendor driver such that it can work with kvm
> external to vfio.  I don't intend to add further hypervisor awareness
> to vfio.
> 
> > Back to this IOVA issue, I discussed with Yan and we found another
> > hypervisor-agnostic alternative, by learning from vhost. vhost is very
> > similar to VFIO - DMA also happens in the kernel, while it already
> > supports 

[Qemu-devel] [PATCH v3 1/3] block/qcow2: refactoring of threaded encryption code

2019-09-12 Thread Maxim Levitsky
This commit tries to clarify few function arguments,
and add comments describing the encrypt/decrypt interface

Signed-off-by: Maxim Levitsky 
---
 block/qcow2-cluster.c |  8 +++---
 block/qcow2-threads.c | 63 ++-
 2 files changed, 54 insertions(+), 17 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f09cc992af..b95e64c237 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -463,8 +463,8 @@ static int coroutine_fn 
do_perform_cow_read(BlockDriverState *bs,
 }
 
 static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
-uint64_t src_cluster_offset,
-uint64_t cluster_offset,
+uint64_t guest_cluster_offset,
+uint64_t host_cluster_offset,
 unsigned offset_in_cluster,
 uint8_t *buffer,
 unsigned bytes)
@@ -474,8 +474,8 @@ static bool coroutine_fn 
do_perform_cow_encrypt(BlockDriverState *bs,
 assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
 assert((bytes & ~BDRV_SECTOR_MASK) == 0);
 assert(s->crypto);
-if (qcow2_co_encrypt(bs, cluster_offset,
- src_cluster_offset + offset_in_cluster,
+if (qcow2_co_encrypt(bs, host_cluster_offset,
+ guest_cluster_offset + offset_in_cluster,
  buffer, bytes) < 0) {
 return false;
 }
diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 3b1e63fe41..6da1838e95 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c
@@ -234,15 +234,19 @@ static int qcow2_encdec_pool_func(void *opaque)
 }
 
 static int coroutine_fn
-qcow2_co_encdec(BlockDriverState *bs, uint64_t file_cluster_offset,
-  uint64_t offset, void *buf, size_t len, Qcow2EncDecFunc func)
+qcow2_co_encdec(BlockDriverState *bs, uint64_t host_cluster_offset,
+uint64_t guest_offset, void *buf, size_t len,
+Qcow2EncDecFunc func)
 {
 BDRVQcow2State *s = bs->opaque;
+
+uint64_t offset = s->crypt_physical_offset ?
+host_cluster_offset + offset_into_cluster(s, guest_offset) :
+guest_offset;
+
 Qcow2EncDecData arg = {
 .block = s->crypto,
-.offset = s->crypt_physical_offset ?
-  file_cluster_offset + offset_into_cluster(s, offset) :
-  offset,
+.offset = offset,
 .buf = buf,
 .len = len,
 .func = func,
@@ -251,18 +255,51 @@ qcow2_co_encdec(BlockDriverState *bs, uint64_t 
file_cluster_offset,
 return qcow2_co_process(bs, qcow2_encdec_pool_func, );
 }
 
+
+/*
+ * qcow2_co_encrypt()
+ *
+ * Encrypts one or more contiguous aligned sectors
+ *
+ * @host_cluster_offset - underlying storage offset of the first cluster
+ * in which the encrypted data will be written
+ * Used as an initialization vector for encryption
+ *
+ * @guest_offset - guest (virtual) offset of the first sector of the
+ * data to be encrypted
+ * Used as an initialization vector for older, qcow2 native encryption
+ *
+ * @buf - buffer with the data to encrypt, that after encryption
+ *will be written to the underlying storage device at
+ *@host_cluster_offset
+ *
+ * @len - length of the buffer (in sector size multiplies)
+ *
+ * Note that the group of the sectors, don't have to be aligned
+ * on cluster boundary and can also cross a cluster boundary.
+ *
+ */
 int coroutine_fn
-qcow2_co_encrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
- uint64_t offset, void *buf, size_t len)
+qcow2_co_encrypt(BlockDriverState *bs, uint64_t host_cluster_offset,
+ uint64_t guest_offset, void *buf, size_t len)
 {
-return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
- qcrypto_block_encrypt);
+return qcow2_co_encdec(bs, host_cluster_offset, guest_offset, buf, len,
+   qcrypto_block_encrypt);
 }
 
+
+/*
+ * qcow2_co_decrypt()
+ *
+ * Decrypts one or more contiguous aligned sectors
+ * Similar to qcow2_co_encrypt
+ *
+ */
+
 int coroutine_fn
-qcow2_co_decrypt(BlockDriverState *bs, uint64_t file_cluster_offset,
- uint64_t offset, void *buf, size_t len)
+qcow2_co_decrypt(BlockDriverState *bs, uint64_t host_cluster_offset,
+ uint64_t guest_offset, void *buf, size_t len)
 {
-return qcow2_co_encdec(bs, file_cluster_offset, offset, buf, len,
- qcrypto_block_decrypt);
+return qcow2_co_encdec(bs, host_cluster_offset, guest_offset, buf, len,
+   qcrypto_block_decrypt);
 }
-- 
2.17.2




Re: [Qemu-devel] [PATCH 1/2] LUKS: better error message when creating too large files

2019-09-12 Thread Maxim Levitsky
On Mon, 2019-07-22 at 10:05 +0100, Daniel P. Berrangé wrote:
> On Sun, Jul 21, 2019 at 09:15:07PM +0300, Maxim Levitsky wrote:
> > Currently if you attampt to create too large file with luks you
> > get the following error message:
> > 
> > Formatting 'test.luks', fmt=luks size=17592186044416 key-secret=sec0
> > qemu-img: test.luks: Could not resize file: File too large
> > 
> > While for raw format the error message is
> > qemu-img: test.img: The image size is too large for file format 'raw'
> > 
> > 
> > The reason for this is that qemu-img checks for errono of the failure,
> > and presents the later error when it is -EFBIG
> > 
> > However crypto generic code 'swallows' the errno and replaces it
> > with -EIO.
> > 
> > As an attempt to make it better, we can make luks driver,
> > detect -EFBIG and in this case present a better error message,
> > which is what this patch does
> > 
> > The new error message is:
> > 
> > qemu-img: error creating test.luks: The requested file size is too large
> > 
> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1534898
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/crypto.c | 25 +
> >  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> Reviewed-by: Daniel P. Berrangé 
> 
> Regards,
> Daniel
Hi!

Do you think that we should merge this or, shall I close the bugreport
as WONTFIX?

Best regards,
Maxim Levitsky





[Qemu-devel] [PATCH v3 3/3] qemu-iotests: Add test for bz #1745922

2019-09-12 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 tests/qemu-iotests/263 | 91 ++
 tests/qemu-iotests/263.out | 40 +
 tests/qemu-iotests/group   |  1 +
 3 files changed, 132 insertions(+)
 create mode 100755 tests/qemu-iotests/263
 create mode 100644 tests/qemu-iotests/263.out

diff --git a/tests/qemu-iotests/263 b/tests/qemu-iotests/263
new file mode 100755
index 00..afbd668cda
--- /dev/null
+++ b/tests/qemu-iotests/263
@@ -0,0 +1,91 @@
+#!/usr/bin/env bash
+#
+# Test encrypted write that crosses cluster boundary of two unallocated 
clusters
+# Based on 188
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mlevi...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto generic
+_supported_os Linux
+
+
+size=1M
+
+SECRET="secret,id=sec0,data=astrochicken"
+QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
+
+
+_run_test()
+{
+   echo "== reading the whole image =="
+   $QEMU_IO --object $SECRET -c "read -P 0 0 $size" --image-opts $1 | 
_filter_qemu_io | _filter_testdir
+
+   echo
+   echo "== write two 512 byte sectors on a cluster boundary =="
+   $QEMU_IO --object $SECRET -c "write -P 0xAA 0xFE00 0x400" --image-opts 
$1 | _filter_qemu_io | _filter_testdir
+
+   echo
+   echo "== verify that the rest of the image is not changed =="
+   $QEMU_IO --object $SECRET -c "read -P 0x00 0x0 0xFE00" --image-opts 
$1 | _filter_qemu_io | _filter_testdir
+   $QEMU_IO --object $SECRET -c "read -P 0xAA 0x0FE00 0x400" --image-opts 
$1 | _filter_qemu_io | _filter_testdir
+   $QEMU_IO --object $SECRET -c "read -P 0x00 0x10200 0xEFE00" 
--image-opts $1 | _filter_qemu_io | _filter_testdir
+
+}
+
+
+echo
+echo "testing LUKS qcow2 encryption"
+echo
+
+_make_test_img --object $SECRET -o 
"encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10,cluster_size=64K"
 $size
+_run_test "driver=$IMGFMT,encrypt.key-secret=sec0,file.filename=$TEST_IMG"
+_cleanup_test_img
+
+echo
+echo "testing legacy AES qcow2 encryption"
+echo
+
+
+_make_test_img --object $SECRET -o 
"encrypt.format=aes,encrypt.key-secret=sec0,cluster_size=64K" $size
+_run_test "driver=$IMGFMT,encrypt.key-secret=sec0,file.filename=$TEST_IMG"
+_cleanup_test_img
+
+
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/263.out b/tests/qemu-iotests/263.out
new file mode 100644
index 00..0c982c55cb
--- /dev/null
+++ b/tests/qemu-iotests/263.out
@@ -0,0 +1,40 @@
+QA output created by 263
+
+testing LUKS qcow2 encryption
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 encrypt.format=luks 
encrypt.key-secret=sec0 encrypt.iter-time=10
+== reading the whole image ==
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== write two 512 byte sectors on a cluster boundary ==
+wrote 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== verify that the rest of the image is not changed ==
+read 65024/65024 bytes at offset 0
+63.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 982528/982528 bytes at offset 66048
+959.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+testing legacy AES qcow2 encryption
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 encrypt.format=aes 
encrypt.key-secret=sec0
+== reading the whole image ==
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== write two 512 byte sectors on a cluster boundary ==
+wrote 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== verify that the rest of the image is not changed ==
+read 65024/65024 bytes at offset 0
+63.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 982528/982528 bytes at offset 66048
+959.500 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX 

[Qemu-devel] [PATCH v2 10/11] iotests: filter few more luks specific create options

2019-09-12 Thread Maxim Levitsky
Those options are test input anyway, and this allows more tests
to be able to have same output on both qcow2 luks encrypted images
and raw luks images

Signed-off-by: Maxim Levitsky 
---
 tests/qemu-iotests/087.out   | 6 +++---
 tests/qemu-iotests/134.out   | 2 +-
 tests/qemu-iotests/158.out   | 4 ++--
 tests/qemu-iotests/188.out   | 2 +-
 tests/qemu-iotests/189.out   | 4 ++--
 tests/qemu-iotests/198.out   | 4 ++--
 tests/qemu-iotests/common.filter | 6 --
 7 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/087.out b/tests/qemu-iotests/087.out
index 2d92ea847b..b61ba638af 100644
--- a/tests/qemu-iotests/087.out
+++ b/tests/qemu-iotests/087.out
@@ -34,7 +34,7 @@ QMP_VERSION
 
 === Encrypted image QCow ===
 
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encryption=on 
encrypt.key-secret=sec0
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encryption=on
 Testing:
 QMP_VERSION
 {"return": {}}
@@ -46,7 +46,7 @@ QMP_VERSION
 
 === Encrypted image LUKS ===
 
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encrypt.format=luks 
encrypt.key-secret=sec0
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 Testing:
 QMP_VERSION
 {"return": {}}
@@ -58,7 +58,7 @@ QMP_VERSION
 
 === Missing driver ===
 
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encryption=on 
encrypt.key-secret=sec0
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encryption=on
 Testing: -S
 QMP_VERSION
 {"return": {}}
diff --git a/tests/qemu-iotests/134.out b/tests/qemu-iotests/134.out
index 09d46f6b17..4abc5b5f7d 100644
--- a/tests/qemu-iotests/134.out
+++ b/tests/qemu-iotests/134.out
@@ -1,5 +1,5 @@
 QA output created by 134
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encryption=on 
encrypt.key-secret=sec0
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 encryption=on
 
 == reading whole image ==
 read 134217728/134217728 bytes at offset 0
diff --git a/tests/qemu-iotests/158.out b/tests/qemu-iotests/158.out
index 6def216e55..f28a17626b 100644
--- a/tests/qemu-iotests/158.out
+++ b/tests/qemu-iotests/158.out
@@ -1,6 +1,6 @@
 QA output created by 158
 == create base ==
-Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=134217728 encryption=on 
encrypt.key-secret=sec0
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=134217728 encryption=on
 
 == writing whole image ==
 wrote 134217728/134217728 bytes at offset 0
@@ -10,7 +10,7 @@ wrote 134217728/134217728 bytes at offset 0
 read 134217728/134217728 bytes at offset 0
 128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 == create overlay ==
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/t.IMGFMT.base encryption=on encrypt.key-secret=sec0
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/t.IMGFMT.base encryption=on
 
 == writing part of a cluster ==
 wrote 1024/1024 bytes at offset 0
diff --git a/tests/qemu-iotests/188.out b/tests/qemu-iotests/188.out
index c568ef3701..5426861b18 100644
--- a/tests/qemu-iotests/188.out
+++ b/tests/qemu-iotests/188.out
@@ -1,5 +1,5 @@
 QA output created by 188
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=16777216 encrypt.format=luks 
encrypt.key-secret=sec0 encrypt.iter-time=10
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=16777216
 
 == reading whole image ==
 read 16777216/16777216 bytes at offset 0
diff --git a/tests/qemu-iotests/189.out b/tests/qemu-iotests/189.out
index a0b7c9c24c..bc213cbe14 100644
--- a/tests/qemu-iotests/189.out
+++ b/tests/qemu-iotests/189.out
@@ -1,6 +1,6 @@
 QA output created by 189
 == create base ==
-Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=16777216 
encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=16777216
 
 == writing whole image ==
 wrote 16777216/16777216 bytes at offset 0
@@ -10,7 +10,7 @@ wrote 16777216/16777216 bytes at offset 0
 read 16777216/16777216 bytes at offset 0
 16 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 == create overlay ==
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=16777216 
backing_file=TEST_DIR/t.IMGFMT.base encrypt.format=luks encrypt.key-secret=sec1 
encrypt.iter-time=10
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=16777216 
backing_file=TEST_DIR/t.IMGFMT.base
 
 == writing part of a cluster ==
 wrote 1024/1024 bytes at offset 0
diff --git a/tests/qemu-iotests/198.out b/tests/qemu-iotests/198.out
index e86b175e39..2eff420795 100644
--- a/tests/qemu-iotests/198.out
+++ b/tests/qemu-iotests/198.out
@@ -1,12 +1,12 @@
 QA output created by 198
 == create base ==
-Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=16777216 
encrypt.format=luks encrypt.key-secret=sec0 encrypt.iter-time=10
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=16777216
 
 == writing whole image base ==
 wrote 16777216/16777216 bytes at offset 0
 16 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 

[Qemu-devel] [PATCH v3 2/3] block/qcow2: fix the corruption when rebasing luks encrypted files

2019-09-12 Thread Maxim Levitsky
This fixes subtle corruption introduced by luks threaded encryption
in commit 8ac0f15f335

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1745922

The corruption happens when we do a write that
   * writes to two or more unallocated clusters at once
   * doesn't fully cover the first sector
   * doesn't fully cover the last sector

In this case, when allocating the new clusters we COW both areas
prior to the write and after the write, and we encrypt them.

The above mentioned commit accidentally made it so we encrypt the
second COW area using the physical cluster offset of the first area.

Fix this by:
 * Remove the offset_in_cluster parameter of do_perform_cow_encrypt,
   since it is misleading. That offset can be larger than cluster size
   currently.

   Instead just add the start and the end COW area offsets to both host
   and guest offsets that do_perform_cow_encrypt receives.

*  in do_perform_cow_encrypt, remove the cluster offset from the host_offset,
   and thus pass correctly to the qcow2_co_encrypt, the host cluster offset
   and full guest offset

In the bugreport that was triggered by rebasing a luks image to new,
zero filled base, which lot of such writes, and causes some files
with zero areas to contain garbage there instead.
But as described above it can happen elsewhere as well


Signed-off-by: Maxim Levitsky 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/qcow2-cluster.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index b95e64c237..7203d4cb85 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -463,20 +463,21 @@ static int coroutine_fn 
do_perform_cow_read(BlockDriverState *bs,
 }
 
 static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
-uint64_t guest_cluster_offset,
-uint64_t host_cluster_offset,
-unsigned offset_in_cluster,
+uint64_t guest_offset,
+uint64_t host_offset,
 uint8_t *buffer,
 unsigned bytes)
 {
 if (bytes && bs->encrypted) {
 BDRVQcow2State *s = bs->opaque;
-assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
-assert((bytes & ~BDRV_SECTOR_MASK) == 0);
+
+assert(QEMU_IS_ALIGNED(guest_offset, BDRV_SECTOR_SIZE));
+assert(QEMU_IS_ALIGNED(host_offset, BDRV_SECTOR_SIZE));
+assert(QEMU_IS_ALIGNED(bytes, BDRV_SECTOR_SIZE));
 assert(s->crypto);
-if (qcow2_co_encrypt(bs, host_cluster_offset,
- guest_cluster_offset + offset_in_cluster,
- buffer, bytes) < 0) {
+
+if (qcow2_co_encrypt(bs, start_of_cluster(s, host_offset),
+ guest_offset, buffer, bytes) < 0) {
 return false;
 }
 }
@@ -890,11 +891,15 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta 
*m)
 
 /* Encrypt the data if necessary before writing it */
 if (bs->encrypted) {
-if (!do_perform_cow_encrypt(bs, m->offset, m->alloc_offset,
-start->offset, start_buffer,
+if (!do_perform_cow_encrypt(bs,
+m->offset + start->offset,
+m->alloc_offset + start->offset,
+start_buffer,
 start->nb_bytes) ||
-!do_perform_cow_encrypt(bs, m->offset, m->alloc_offset,
-end->offset, end_buffer, end->nb_bytes)) {
+!do_perform_cow_encrypt(bs,
+m->offset + end->offset,
+m->alloc_offset + end->offset,
+end_buffer, end->nb_bytes)) {
 ret = -EIO;
 goto fail;
 }
-- 
2.17.2




[Qemu-devel] [PATCH v2 09/11] block/qcow2: implement blockdev-amend

2019-09-12 Thread Maxim Levitsky
Currently only for changing crypto parameters

Signed-off-by: Maxim Levitsky 
---
 block/qcow2.c| 71 
 qapi/block-core.json |  6 ++--
 2 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 26f83aeb44..c8847ec6e2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3079,6 +3079,18 @@ qcow2_co_create(BlockdevCreateOptions *create_options, 
Error **errp)
 assert(create_options->driver == BLOCKDEV_DRIVER_QCOW2);
 qcow2_opts = _options->u.qcow2;
 
+if (!qcow2_opts->has_size) {
+error_setg(errp, "Size is manadatory for image creation");
+return -EINVAL;
+
+}
+
+if (!qcow2_opts->has_file) {
+error_setg(errp, "'file' is manadatory for image creation");
+return -EINVAL;
+
+}
+
 bs = bdrv_open_blockdev_ref(qcow2_opts->file, errp);
 if (bs == NULL) {
 return -EIO;
@@ -5111,6 +5123,64 @@ static int qcow2_amend_options(BlockDriverState *bs, 
QemuOpts *opts,
 return 0;
 }
 
+
+static int coroutine_fn qcow2_co_amend(BlockDriverState *bs,
+   BlockdevCreateOptions *opts,
+   bool force,
+   Error **errp)
+{
+BlockdevCreateOptionsQcow2 *qopts = >u.qcow2;
+BDRVQcow2State *s = bs->opaque;
+int ret;
+
+/*
+ * This is ugly as hell, in later versions of this patch
+ * something has to be done about this
+ */
+if (qopts->has_file || qopts->has_size || qopts->has_data_file ||
+qopts->has_data_file_raw || qopts->has_version ||
+qopts->has_backing_file || qopts->has_backing_fmt ||
+qopts->has_cluster_size || qopts->has_preallocation ||
+qopts->has_lazy_refcounts || qopts->has_refcount_bits) {
+
+error_setg(errp,
+"Only LUKS encryption options can be amended for qcow2 with 
blockdev-amend");
+return -EOPNOTSUPP;
+
+}
+
+if (qopts->has_encrypt) {
+if (!s->crypto) {
+error_setg(errp, "QCOW2 image is not encrypted, can't amend");
+return -EOPNOTSUPP;
+}
+
+if (qopts->encrypt->format != Q_CRYPTO_BLOCK_FORMAT_LUKS) {
+error_setg(errp,
+   "Amend can't be used to change the qcow2 encryption 
format");
+return -EOPNOTSUPP;
+}
+
+if (s->crypt_method_header != QCOW_CRYPT_LUKS) {
+error_setg(errp,
+   "Only LUKS encryption options can be amended for qcow2 
with blockdev-amend");
+return -EOPNOTSUPP;
+}
+
+ret = qcrypto_block_amend_options(s->crypto,
+  qcow2_crypto_hdr_read_func,
+  qcow2_crypto_hdr_write_func,
+  bs,
+  qopts->encrypt,
+  force,
+  errp);
+if (ret) {
+return ret;
+}
+}
+return 0;
+}
+
 /*
  * If offset or size are negative, respectively, they will not be included in
  * the BLOCK_IMAGE_CORRUPTED event emitted.
@@ -5303,6 +5373,7 @@ BlockDriver bdrv_qcow2 = {
 .mutable_opts= mutable_opts,
 .bdrv_co_check   = qcow2_co_check,
 .bdrv_amend_options  = qcow2_amend_options,
+.bdrv_co_amend   = qcow2_co_amend,
 
 .bdrv_detach_aio_context  = qcow2_detach_aio_context,
 .bdrv_attach_aio_context  = qcow2_attach_aio_context,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4a6db98938..0eb4e45168 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4294,6 +4294,7 @@
 # Driver specific image creation options for qcow2.
 #
 # @file Node to create the image format on
+#   Mandatory for create
 # @data-fileNode to use as an external data file in which all guest
 #   data is stored so that only metadata remains in the qcow2
 #   file (since: 4.0)
@@ -4301,6 +4302,7 @@
 #   standalone (read-only) raw image without looking at qcow2
 #   metadata (default: false; since: 4.0)
 # @size Size of the virtual disk in bytes
+#   Mandatory for create
 # @version  Compatibility level (default: v3)
 # @backing-file File name of the backing file if a backing file
 #   should be used
@@ -4315,10 +4317,10 @@
 # Since: 2.12
 ##
 { 'struct': 'BlockdevCreateOptionsQcow2',
-  'data': { 'file': 'BlockdevRef',
+  'data': { '*file':'BlockdevRef',
 '*data-file':   'BlockdevRef',
 '*data-file-raw':   'bool',
-'size': 'size',
+'*size':'size',
 '*version': 'BlockdevQcow2Version',
 '*backing-file':'str',
   

[Qemu-devel] [PATCH v2 02/11] qcrypto-luks: extend the create options for upcoming encryption key management

2019-09-12 Thread Maxim Levitsky
Now you can specify which slot to put the encryption key to
Plus add 'active' option which will let  user erase the key secret
instead of adding it.
Check that active=true it when creating.

Signed-off-by: Maxim Levitsky 
---
 block/crypto.c |  2 ++
 block/crypto.h | 16 +++
 block/qcow2.c  |  2 ++
 crypto/block-luks.c| 26 +++---
 qapi/crypto.json   | 19 ++
 tests/qemu-iotests/082.out | 54 ++
 6 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index 6e822c6e50..a6a3e1f1d8 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -144,6 +144,8 @@ static QemuOptsList block_crypto_create_opts_luks = {
 BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG(""),
 BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG(""),
 BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME(""),
+BLOCK_CRYPTO_OPT_DEF_LUKS_SLOT(""),
+BLOCK_CRYPTO_OPT_DEF_LUKS_ACTIVE(""),
 { /* end of list */ }
 },
 };
diff --git a/block/crypto.h b/block/crypto.h
index b935695e79..05cc43d9bc 100644
--- a/block/crypto.h
+++ b/block/crypto.h
@@ -35,12 +35,14 @@
 "ID of the secret that provides the AES encryption key")
 
 #define BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET "key-secret"
+#define BLOCK_CRYPTO_OPT_LUKS_SLOT "slot"
 #define BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG "cipher-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE "cipher-mode"
 #define BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG "ivgen-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
 #define BLOCK_CRYPTO_OPT_LUKS_ITER_TIME "iter-time"
+#define BLOCK_CRYPTO_OPT_LUKS_ACTIVE "active"
 
 #define BLOCK_CRYPTO_OPT_DEF_LUKS_KEY_SECRET(prefix)\
 BLOCK_CRYPTO_OPT_DEF_KEY_SECRET(prefix, \
@@ -88,6 +90,20 @@
 .help = "Time to spend in PBKDF in milliseconds", \
 }
 
+#define BLOCK_CRYPTO_OPT_DEF_LUKS_SLOT(prefix)   \
+{ \
+.name = prefix BLOCK_CRYPTO_OPT_LUKS_SLOT,   \
+.type = QEMU_OPT_NUMBER,  \
+.help = "Controls the slot where the secret is added/erased", \
+}
+
+#define BLOCK_CRYPTO_OPT_DEF_LUKS_ACTIVE(prefix)   \
+{ \
+.name = prefix BLOCK_CRYPTO_OPT_LUKS_ACTIVE,   \
+.type = QEMU_OPT_BOOL,  \
+.help = "Controls if the added secret is added or erased", \
+}
+
 QCryptoBlockCreateOptions *
 block_crypto_create_opts_init(QDict *opts, Error **errp);
 
diff --git a/block/qcow2.c b/block/qcow2.c
index 0882ff6e92..5bdb8b18f4 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5166,6 +5166,8 @@ static QemuOptsList qcow2_create_opts = {
 BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG("encrypt."),
 BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG("encrypt."),
 BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME("encrypt."),
+BLOCK_CRYPTO_OPT_DEF_LUKS_SLOT("encrypt."),
+BLOCK_CRYPTO_OPT_DEF_LUKS_ACTIVE("encrypt."),
 {
 .name = BLOCK_OPT_CLUSTER_SIZE,
 .type = QEMU_OPT_SIZE,
diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 6c53bdc428..fed80e6646 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1211,6 +1211,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 const char *hash_alg;
 g_autofree char *cipher_mode_spec = NULL;
 uint64_t iters;
+unsigned int slot_idx = 0;
 
 memcpy(_opts, >u.luks, sizeof(luks_opts));
 if (!luks_opts.has_iter_time) {
@@ -1244,12 +1245,30 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 luks->ivgen_hash_alg = luks_opts.ivgen_hash_alg;
 luks->hash_alg = luks_opts.hash_alg;
 
+if (luks_opts.has_active && !luks_opts.active) {
+error_setg(errp,
+   "For image creation, the added secret must be active!");
+goto error;
+
+}
+
+if (luks_opts.has_slot) {
+if (luks_opts.slot >= QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS ||
+luks_opts.slot < 0) {
+error_setg(errp,
+   "Invalid slot %" PRId64 " is specified",
+   luks_opts.slot);
+goto error;
+}
+slot_idx = (unsigned int)luks_opts.slot;
+}
+
 
 /* Note we're allowing ivgen_hash_alg to be set even for
  * non-essiv iv generators that don't need a hash. It will
  * be silently ignored, for compatibility with dm-crypt */
 
-if (!options->u.luks.key_secret) {
+if (!luks_opts.has_key_secret) {
 error_setg(errp, "Parameter '%skey-secret' is required for cipher",
optprefix ? optprefix : "");
 goto error;
@@ -1455,11 +1474,10 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 goto error;
 }
 
-
-   

[Qemu-devel] [PATCH v3 0/3] Fix qcow2+luks corruption introduced by commit 8ac0f15f335

2019-09-12 Thread Maxim Levitsky
Commit 8ac0f15f335 accidently broke the COW of non changed areas
of newly allocated clusters, when the write spans multiple clusters,
and needs COW both prior and after the write.
This results in 'after' COW area being encrypted with wrong
sector address, which render it corrupted.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1745922

CC: qemu-stable 

V2: grammar, spelling and code style fixes.
V3: more fixes after the review.

Best regards,
Maxim Levitsky

Maxim Levitsky (3):
  block/qcow2: refactoring of threaded encryption code
  block/qcow2: fix the corruption when rebasing luks encrypted files
  qemu-iotests: Add test for bz #1745922

 block/qcow2-cluster.c  | 29 +++-
 block/qcow2-threads.c  | 63 --
 tests/qemu-iotests/263 | 91 ++
 tests/qemu-iotests/263.out | 40 +
 tests/qemu-iotests/group   |  1 +
 5 files changed, 199 insertions(+), 25 deletions(-)
 create mode 100755 tests/qemu-iotests/263
 create mode 100644 tests/qemu-iotests/263.out

-- 
2.17.2




[Qemu-devel] [PATCH v2 06/11] qcow2: implement crypto amend options

2019-09-12 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 block/qcow2.c | 77 +--
 1 file changed, 62 insertions(+), 15 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 0618a63793..26f83aeb44 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -172,6 +172,25 @@ static ssize_t qcow2_crypto_hdr_write_func(QCryptoBlock 
*block, size_t offset,
 return ret;
 }
 
+static QCryptoBlockCreateOptions*
+qcow2_extract_crypto_create_opts(QemuOpts *opts, const char *fmt, Error **errp)
+{
+QDict *cryptoopts_qdict;
+QCryptoBlockCreateOptions *cryptoopts;
+QDict *opts_qdict;
+
+/* Extract "encrypt." options into a qdict */
+opts_qdict = qemu_opts_to_qdict(opts, NULL);
+qdict_extract_subqdict(opts_qdict, _qdict, "encrypt.");
+qobject_unref(opts_qdict);
+
+/* Build QCryptoBlockCreateOptions object from qdict */
+qdict_put_str(cryptoopts_qdict, "format", "luks");
+cryptoopts = block_crypto_create_opts_init(cryptoopts_qdict, errp);
+qobject_unref(cryptoopts_qdict);
+return cryptoopts;
+}
+
 
 /* 
  * read qcow2 extension and fill bs
@@ -4365,20 +4384,10 @@ static ssize_t 
qcow2_measure_crypto_hdr_write_func(QCryptoBlock *block,
 static bool qcow2_measure_luks_headerlen(QemuOpts *opts, size_t *len,
  Error **errp)
 {
-QDict *opts_qdict;
-QDict *cryptoopts_qdict;
 QCryptoBlockCreateOptions *cryptoopts;
 QCryptoBlock *crypto;
 
-/* Extract "encrypt." options into a qdict */
-opts_qdict = qemu_opts_to_qdict(opts, NULL);
-qdict_extract_subqdict(opts_qdict, _qdict, "encrypt.");
-qobject_unref(opts_qdict);
-
-/* Build QCryptoBlockCreateOptions object from qdict */
-qdict_put_str(cryptoopts_qdict, "format", "luks");
-cryptoopts = block_crypto_create_opts_init(cryptoopts_qdict, errp);
-qobject_unref(cryptoopts_qdict);
+cryptoopts = qcow2_extract_crypto_create_opts(opts, "luks", errp);
 if (!cryptoopts) {
 return false;
 }
@@ -4755,6 +4764,7 @@ typedef enum Qcow2AmendOperation {
  * invocation from an operation change */
 QCOW2_NO_OPERATION = 0,
 
+QCOW2_UPDATING_ENCRYPTION,
 QCOW2_CHANGING_REFCOUNT_ORDER,
 QCOW2_DOWNGRADING,
 } Qcow2AmendOperation;
@@ -4839,6 +4849,7 @@ static int qcow2_amend_options(BlockDriverState *bs, 
QemuOpts *opts,
 int ret;
 QemuOptDesc *desc = opts->list->desc;
 Qcow2AmendHelperCBInfo helper_cb_info;
+bool encryption_update = false;
 
 while (desc && desc->name) {
 if (!qemu_opt_find(opts, desc->name)) {
@@ -4887,9 +4898,22 @@ static int qcow2_amend_options(BlockDriverState *bs, 
QemuOpts *opts,
 return -ENOTSUP;
 }
 } else if (g_str_has_prefix(desc->name, "encrypt.")) {
-error_setg(errp,
-   "Changing the encryption parameters is not supported");
-return -ENOTSUP;
+
+if (!s->crypto) {
+error_setg(errp,
+   "Can't amend encryption options - encryption not 
present");
+return -EINVAL;
+
+}
+
+if (s->crypt_method_header != QCOW_CRYPT_LUKS) {
+error_setg(errp,
+   "Only LUKS encryption options can be amended");
+return -ENOTSUP;
+}
+
+encryption_update = true;
+
 } else if (!strcmp(desc->name, BLOCK_OPT_CLUSTER_SIZE)) {
 cluster_size = qemu_opt_get_size(opts, BLOCK_OPT_CLUSTER_SIZE,
  cluster_size);
@@ -4939,7 +4963,8 @@ static int qcow2_amend_options(BlockDriverState *bs, 
QemuOpts *opts,
 .original_status_cb = status_cb,
 .original_cb_opaque = cb_opaque,
 .total_operations = (new_version < old_version)
-  + (s->refcount_bits != refcount_bits)
+  + (s->refcount_bits != refcount_bits) +
+  (encryption_update == true)
 };
 
 /* Upgrade first (some features may require compat=1.1) */
@@ -4953,6 +4978,28 @@ static int qcow2_amend_options(BlockDriverState *bs, 
QemuOpts *opts,
 }
 }
 
+if (encryption_update) {
+QCryptoBlockCreateOptions *cryptoopts;
+
+cryptoopts = qcow2_extract_crypto_create_opts(opts, "luks", errp);
+if (!cryptoopts) {
+return -EINVAL;
+}
+
+helper_cb_info.current_operation = QCOW2_UPDATING_ENCRYPTION;
+
+ret = qcrypto_block_amend_options(s->crypto,
+  qcow2_crypto_hdr_read_func,
+  qcow2_crypto_hdr_write_func,
+  bs,
+  cryptoopts,
+  force,
+  errp);
+if (ret < 0) {
+return ret;
+}
+}
+
 if 

[Qemu-devel] [PATCH v2 11/11] iotests : add tests for encryption key management

2019-09-12 Thread Maxim Levitsky
Note that currently I add tests 300-302, which are
placeholders to ease the rebase. In final version
of these patches I will update these.

Signed-off-by: Maxim Levitsky 
---
 tests/qemu-iotests/300 | 202 +
 tests/qemu-iotests/300.out |  98 +++
 tests/qemu-iotests/301 |  90 +
 tests/qemu-iotests/301.out |  30 +
 tests/qemu-iotests/302 | 252 +
 tests/qemu-iotests/302.out |  18 +++
 tests/qemu-iotests/303 | 228 +
 tests/qemu-iotests/303.out |  28 +
 tests/qemu-iotests/group   |   9 ++
 9 files changed, 955 insertions(+)
 create mode 100755 tests/qemu-iotests/300
 create mode 100644 tests/qemu-iotests/300.out
 create mode 100755 tests/qemu-iotests/301
 create mode 100644 tests/qemu-iotests/301.out
 create mode 100644 tests/qemu-iotests/302
 create mode 100644 tests/qemu-iotests/302.out
 create mode 100644 tests/qemu-iotests/303
 create mode 100644 tests/qemu-iotests/303.out

diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300
new file mode 100755
index 00..5b65ef95de
--- /dev/null
+++ b/tests/qemu-iotests/300
@@ -0,0 +1,202 @@
+#!/usr/bin/env bash
+#
+# Test encryption key management with luks
+# Based on 134
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mlevi...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2 luks
+_supported_proto file #TODO
+
+QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
+
+if [ "$IMGFMT" = "qcow2" ] ; then
+   PR="encrypt."
+   EXTRA_IMG_ARGS="-o encrypt.format=luks"
+fi
+
+
+# secrets: you are supposed to see the password as ***, see :-)
+S0="--object secret,id=sec0,data=hunter0"
+S1="--object secret,id=sec1,data=hunter1"
+S2="--object secret,id=sec2,data=hunter2"
+S3="--object secret,id=sec3,data=hunter3"
+S4="--object secret,id=sec4,data=hunter4"
+SECRETS="$S0 $S1 $S2 $S3 $S4"
+
+# image with given secret
+IMGS0="--image-opts 
driver=$IMGFMT,file.filename=$TEST_IMG,${PR}key-secret=sec0"
+IMGS1="--image-opts 
driver=$IMGFMT,file.filename=$TEST_IMG,${PR}key-secret=sec1"
+IMGS2="--image-opts 
driver=$IMGFMT,file.filename=$TEST_IMG,${PR}key-secret=sec2"
+IMGS3="--image-opts 
driver=$IMGFMT,file.filename=$TEST_IMG,${PR}key-secret=sec3"
+IMGS4="--image-opts 
driver=$IMGFMT,file.filename=$TEST_IMG,${PR}key-secret=sec4"
+
+
+echo "== creating a test image =="
+_make_test_img $S4 $EXTRA_IMG_ARGS -o 
${PR}key-secret=sec4,${PR}iter-time=10,${PR}slot=4 32M
+
+echo
+echo "== test that key 4 opens the image =="
+$QEMU_IO $S4 -c "read 0 4096" $IMGS4 | _filter_qemu_io | _filter_testdir
+
+echo
+echo "== adding a password to slot 0 =="
+$QEMU_IMG amend $SECRETS $IMGS4 -o ${PR}key-secret=sec0,${PR}iter-time=10
+echo "== adding a password to slot 1 =="
+$QEMU_IMG amend $SECRETS $IMGS0 -o ${PR}key-secret=sec1,${PR}iter-time=10
+echo "== adding a password to slot 3 =="
+$QEMU_IMG amend $SECRETS $IMGS1 -o 
${PR}key-secret=sec3,${PR}iter-time=10,${PR}slot=3
+echo "== adding a password to slot 2 =="
+$QEMU_IMG amend $SECRETS $IMGS3 -o ${PR}key-secret=sec2,${PR}iter-time=10
+
+
+echo "== erase slot 4 =="
+$QEMU_IMG amend $SECRETS $IMGS1 -o ${PR}active=off,${PR}slot=4 | 
_filter_img_create
+
+
+echo
+echo "== all secrets should work =="
+for IMG in "$IMGS0" "$IMGS1" "$IMGS2" "$IMGS3"; do
+   $QEMU_IO $SECRETS -c "read 0 4096" $IMG | _filter_qemu_io | 
_filter_testdir
+done
+
+echo
+echo "== erase slot 0 and try it =="
+$QEMU_IMG amend $SECRETS $IMGS1 -o ${PR}active=off,${PR}key-secret=sec0 | 
_filter_img_create
+$QEMU_IO $SECRETS -c "read 0 4096" $IMGS0 | _filter_qemu_io | _filter_testdir
+
+echo
+echo "== erase slot 2 and try it =="
+$QEMU_IMG amend $SECRETS $IMGS1 -o ${PR}active=off,${PR}slot=2| 
_filter_img_create
+$QEMU_IO $SECRETS -c "read 0 4096" $IMGS2 | _filter_qemu_io | _filter_testdir
+
+
+# at this point slots 1 and 3 should be active
+
+echo
+echo "== filling  4 slots with secret 2 =="
+for i in $(seq 0 3) ; do
+   $QEMU_IMG amend $SECRETS $IMGS3 -o 
${PR}key-secret=sec2,${PR}iter-time=10
+done
+
+echo
+echo "== adding 

[Qemu-devel] [PATCH v2 07/11] block: add x-blockdev-amend qmp command

2019-09-12 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 block/Makefile.objs   |   2 +-
 block/amend.c | 116 ++
 include/block/block_int.h |  23 ++--
 qapi/block-core.json  |  26 +
 qapi/job.json |   4 +-
 5 files changed, 163 insertions(+), 8 deletions(-)
 create mode 100644 block/amend.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 35f3bca4d9..10d0308792 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -18,7 +18,7 @@ block-obj-y += block-backend.o snapshot.o qapi.o
 block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
 block-obj-$(CONFIG_POSIX) += file-posix.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
-block-obj-y += null.o mirror.o commit.o io.o create.o
+block-obj-y += null.o mirror.o commit.o io.o create.o amend.o
 block-obj-y += throttle-groups.o
 block-obj-$(CONFIG_LINUX) += nvme.o
 
diff --git a/block/amend.c b/block/amend.c
new file mode 100644
index 00..9bd28e08e7
--- /dev/null
+++ b/block/amend.c
@@ -0,0 +1,116 @@
+/*
+ * Block layer code related to image options amend
+ *
+ * Copyright (c) 2018 Kevin Wolf 
+ * Copyright (c) 2019 Maxim Levitsky 
+ *
+ * Heavily based on create.c
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block_int.h"
+#include "qemu/job.h"
+#include "qemu/main-loop.h"
+#include "qapi/qapi-commands-block-core.h"
+#include "qapi/qapi-visit-block-core.h"
+#include "qapi/clone-visitor.h"
+#include "qapi/error.h"
+
+typedef struct BlockdevAmendJob {
+Job common;
+BlockdevCreateOptions *opts;
+BlockDriverState *bs;
+bool force;
+} BlockdevAmendJob;
+
+static int coroutine_fn blockdev_amend_run(Job *job, Error **errp)
+{
+BlockdevAmendJob *s = container_of(job, BlockdevAmendJob, common);
+int ret;
+
+job_progress_set_remaining(>common, 1);
+ret = s->bs->drv->bdrv_co_amend(s->bs, s->opts, s->force, errp);
+job_progress_update(>common, 1);
+
+qapi_free_BlockdevCreateOptions(s->opts);
+
+return ret;
+}
+
+static const JobDriver blockdev_amend_job_driver = {
+.instance_size = sizeof(BlockdevAmendJob),
+.job_type  = JOB_TYPE_AMEND,
+.run   = blockdev_amend_run,
+};
+
+void qmp_x_blockdev_amend(const char *job_id,
+const char *node_name,
+BlockdevCreateOptions *options,
+bool has_force,
+bool force,
+Error **errp)
+{
+BlockdevAmendJob *s;
+const char *fmt = BlockdevDriver_str(options->driver);
+BlockDriver *drv = bdrv_find_format(fmt);
+BlockDriverState *bs = bdrv_find_node(node_name);
+
+/*
+ * If the driver is in the schema, we know that it exists. But it may not
+ * be whitelisted.
+ */
+assert(drv);
+if (bdrv_uses_whitelist() && !bdrv_is_whitelisted(drv, false)) {
+error_setg(errp, "Driver is not whitelisted");
+return;
+}
+
+if (bs->drv != drv) {
+error_setg(errp,
+   "x-blockdev-amend doesn't support changing the block 
driver");
+return;
+
+}
+
+/* Error out if the driver doesn't support .bdrv_co_amend */
+if (!drv->bdrv_co_amend) {
+error_setg(errp, "Driver does not support x-blockdev-amend");
+return;
+}
+
+/*
+ * Create the block job
+ * TODO Running in the main context. Block drivers need to error out or add
+ * locking when they use a BDS in a different AioContext.
+ */
+s = job_create(job_id, _amend_job_driver, NULL,
+   qemu_get_aio_context(), JOB_DEFAULT | JOB_MANUAL_DISMISS,
+   NULL, NULL, errp);
+if (!s) {
+return;
+}
+
+s->bs = bs,
+s->opts = QAPI_CLONE(BlockdevCreateOptions, options),
+s->force = has_force ? force : false;
+
+job_start(>common);
+}
diff --git a/include/block/block_int.h 

[Qemu-devel] [PATCH v2 05/11] block/crypto: implement the encryption key management

2019-09-12 Thread Maxim Levitsky
This implements the encryption key management
using the generic code in qcrypto layer
(currently only for qemu-img amend)

This code adds another 'write_func' because the initialization
write_func works directly on the underlying file,
because during the creation, there is no open instance
of the luks driver, but during regular use, we have it,
and should use it instead.


This commit also adds a 'hack/workaround' I and Kevin Wolf (thanks)
made to make the driver still support write sharing,
but be safe against concurrent  metadata update (the keys)
Eventually write sharing for luks driver will be deprecated
and removed together with this hack.

The hack is that we ask (as a format driver) for
BLK_PERM_CONSISTENT_READ always
(technically always unless opened with BDRV_O_NO_IO)

and then when we want to update the keys, we
unshare that permission. So if someone else
has the image open, even readonly, this will fail.

Also thanks to Daniel Berrange for the variant of
that hack that involves asking for read,
rather that write permission

Signed-off-by: Maxim Levitsky 
---
 block/crypto.c | 118 +++--
 1 file changed, 115 insertions(+), 3 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index a6a3e1f1d8..f42fa057e6 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -36,6 +36,7 @@ typedef struct BlockCrypto BlockCrypto;
 
 struct BlockCrypto {
 QCryptoBlock *block;
+bool updating_keys;
 };
 
 
@@ -70,6 +71,24 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
 return ret;
 }
 
+static ssize_t block_crypto_write_func(QCryptoBlock *block,
+   size_t offset,
+   const uint8_t *buf,
+   size_t buflen,
+   void *opaque,
+   Error **errp)
+{
+BlockDriverState *bs = opaque;
+ssize_t ret;
+
+ret = bdrv_pwrite(bs->file, offset, buf, buflen);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Could not write encryption header");
+return ret;
+}
+return ret;
+}
+
 
 struct BlockCryptoCreateData {
 BlockBackend *blk;
@@ -647,6 +666,100 @@ block_crypto_get_specific_info_luks(BlockDriverState *bs, 
Error **errp)
 return spec_info;
 }
 
+
+static int
+block_crypto_amend_options(BlockDriverState *bs,
+   QemuOpts *opts,
+   BlockDriverAmendStatusCB *status_cb,
+   void *cb_opaque,
+   bool force,
+   Error **errp)
+{
+BlockCrypto *crypto = bs->opaque;
+QDict *cryptoopts = NULL;
+QCryptoBlockCreateOptions *amend_options = NULL;
+int ret;
+
+assert(crypto);
+assert(crypto->block);
+
+crypto->updating_keys = true;
+
+ret = bdrv_child_refresh_perms(bs, bs->file, errp);
+if (ret < 0) {
+goto cleanup;
+}
+
+cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
+ _crypto_create_opts_luks,
+ true);
+
+qdict_put_str(cryptoopts, "format", "luks");
+amend_options = block_crypto_create_opts_init(cryptoopts, errp);
+if (!amend_options) {
+ret = -EINVAL;
+goto cleanup;
+}
+
+ret = qcrypto_block_amend_options(crypto->block,
+  block_crypto_read_func,
+  block_crypto_write_func,
+  bs,
+  amend_options,
+  force,
+  errp);
+cleanup:
+crypto->updating_keys = false;
+bdrv_child_refresh_perms(bs, bs->file, errp);
+qapi_free_QCryptoBlockCreateOptions(amend_options);
+qobject_unref(cryptoopts);
+return ret;
+}
+
+
+static void
+block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
+ const BdrvChildRole *role,
+ BlockReopenQueue *reopen_queue,
+ uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+{
+
+BlockCrypto *crypto = bs->opaque;
+
+/*
+ * Ask for consistent read permission so that if
+ * someone else tries to open this image with this permission
+ * neither will be able to edit encryption keys
+ */
+if (!(bs->open_flags & BDRV_O_NO_IO)) {
+perm |= BLK_PERM_CONSISTENT_READ;
+}
+
+/*
+ * This driver doesn't modify LUKS metadata except
+ * when updating the encryption slots.
+ * Thus unlike a proper format driver we don't ask for
+ * shared write permission. However we need it
+ * when we area updating keys, to ensure that only we
+ * had opened the device r/w
+ *
+ * Encryption update will set the crypto->updating_keys
+ * during 

[Qemu-devel] [PATCH v2 01/11] qcrypto: add suport for amend options

2019-09-12 Thread Maxim Levitsky
This adds the qcrypto_amend_options and corresponding
crypto driver callbacks for the  for encrypted
key managedment

Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 crypto/block.c | 31 +++
 crypto/blockpriv.h |  8 
 include/crypto/block.h | 22 ++
 3 files changed, 61 insertions(+)

diff --git a/crypto/block.c b/crypto/block.c
index 325752871c..14b684de7f 100644
--- a/crypto/block.c
+++ b/crypto/block.c
@@ -115,6 +115,37 @@ QCryptoBlock 
*qcrypto_block_create(QCryptoBlockCreateOptions *options,
 }
 
 
+int qcrypto_block_amend_options(QCryptoBlock *block,
+QCryptoBlockReadFunc readfunc,
+QCryptoBlockWriteFunc writefunc,
+void *opaque,
+QCryptoBlockCreateOptions *options,
+bool force,
+Error **errp)
+{
+if (options->format != block->format) {
+error_setg(errp,
+   "Its not possible to change encryption format with amend 
interface");
+return -1;
+}
+
+if (!block->driver->amend) {
+error_setg(errp,
+   "Crypto format %s doesn't support format options amendment",
+   QCryptoBlockFormat_str(block->format));
+return -1;
+}
+
+return block->driver->amend(block,
+readfunc,
+writefunc,
+opaque,
+options,
+force,
+errp);
+}
+
+
 QCryptoBlockInfo *qcrypto_block_get_info(QCryptoBlock *block,
  Error **errp)
 {
diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 71c59cb542..c18a4e0b43 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -62,6 +62,14 @@ struct QCryptoBlockDriver {
   void *opaque,
   Error **errp);
 
+int (*amend)(QCryptoBlock *block,
+ QCryptoBlockReadFunc readfunc,
+ QCryptoBlockWriteFunc writefunc,
+ void *opaque,
+ QCryptoBlockCreateOptions *options,
+ bool force,
+ Error **errp);
+
 int (*get_info)(QCryptoBlock *block,
 QCryptoBlockInfo *info,
 Error **errp);
diff --git a/include/crypto/block.h b/include/crypto/block.h
index d49d2c2da9..777fd51ebe 100644
--- a/include/crypto/block.h
+++ b/include/crypto/block.h
@@ -144,6 +144,28 @@ QCryptoBlock 
*qcrypto_block_create(QCryptoBlockCreateOptions *options,
void *opaque,
Error **errp);
 
+/**
+ * qcrypto_block_amend_options:
+ * @block: the block encryption object
+ *
+ * @readfunc: callback for reading data from the volume header
+ * @writefunc: callback for writing data to the volume header
+ * @opaque: data to pass to @readfunc and @writefunc
+ * @options: the new/amended encryption options
+ * @force: hint for the driver to allow unsafe operation
+ * @errp: error pointer
+ *
+ * Changes the crypto options of the encryption format
+ *
+ */
+int qcrypto_block_amend_options(QCryptoBlock *block,
+QCryptoBlockReadFunc readfunc,
+QCryptoBlockWriteFunc writefunc,
+void *opaque,
+QCryptoBlockCreateOptions *options,
+bool force,
+Error **errp);
+
 
 /**
  * qcrypto_block_get_info:
-- 
2.17.2




[Qemu-devel] [PATCH v2 08/11] block/crypto: implement blockdev-amend

2019-09-12 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 block/crypto.c   | 85 ++--
 qapi/block-core.json |  7 ++--
 2 files changed, 71 insertions(+), 21 deletions(-)

diff --git a/block/crypto.c b/block/crypto.c
index f42fa057e6..5905f7f520 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -534,6 +534,17 @@ block_crypto_co_create_luks(BlockdevCreateOptions 
*create_options, Error **errp)
 assert(create_options->driver == BLOCKDEV_DRIVER_LUKS);
 luks_opts = _options->u.luks;
 
+if (!luks_opts->has_size) {
+error_setg(errp, "'size' is manadatory for image creation");
+return -EINVAL;
+}
+
+if (!luks_opts->has_file) {
+error_setg(errp, "'file' is manadatory for image creation");
+return -EINVAL;
+}
+
+
 bs = bdrv_open_blockdev_ref(luks_opts->file, errp);
 if (bs == NULL) {
 return -EIO;
@@ -667,6 +678,39 @@ block_crypto_get_specific_info_luks(BlockDriverState *bs, 
Error **errp)
 }
 
 
+static int
+block_crypto_amend_options_generic(BlockDriverState *bs,
+   QCryptoBlockCreateOptions *amend_options,
+   bool force,
+   Error **errp)
+{
+BlockCrypto *crypto = bs->opaque;
+int ret = -1;
+
+assert(crypto);
+assert(crypto->block);
+
+/* apply for exclusive write permissions to the underlying file*/
+crypto->updating_keys = true;
+ret = bdrv_child_refresh_perms(bs, bs->file, errp);
+if (ret) {
+goto cleanup;
+}
+
+ret = qcrypto_block_amend_options(crypto->block,
+  block_crypto_read_func,
+  block_crypto_write_func,
+  bs,
+  amend_options,
+  force,
+  errp);
+cleanup:
+/* release exclusive write permissions to the underlying file*/
+crypto->updating_keys = false;
+bdrv_child_refresh_perms(bs, bs->file, errp);
+return ret;
+}
+
 static int
 block_crypto_amend_options(BlockDriverState *bs,
QemuOpts *opts,
@@ -678,44 +722,45 @@ block_crypto_amend_options(BlockDriverState *bs,
 BlockCrypto *crypto = bs->opaque;
 QDict *cryptoopts = NULL;
 QCryptoBlockCreateOptions *amend_options = NULL;
-int ret;
+int ret = -EINVAL;
 
 assert(crypto);
 assert(crypto->block);
 
-crypto->updating_keys = true;
-
-ret = bdrv_child_refresh_perms(bs, bs->file, errp);
-if (ret < 0) {
-goto cleanup;
-}
-
 cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
  _crypto_create_opts_luks,
  true);
 
 qdict_put_str(cryptoopts, "format", "luks");
 amend_options = block_crypto_create_opts_init(cryptoopts, errp);
+
 if (!amend_options) {
-ret = -EINVAL;
 goto cleanup;
 }
 
-ret = qcrypto_block_amend_options(crypto->block,
-  block_crypto_read_func,
-  block_crypto_write_func,
-  bs,
-  amend_options,
-  force,
-  errp);
+ret = block_crypto_amend_options_generic(bs, amend_options, force, errp);
 cleanup:
-crypto->updating_keys = false;
-bdrv_child_refresh_perms(bs, bs->file, errp);
 qapi_free_QCryptoBlockCreateOptions(amend_options);
 qobject_unref(cryptoopts);
 return ret;
 }
 
+static int
+coroutine_fn block_crypto_co_amend(BlockDriverState *bs,
+   BlockdevCreateOptions *opts,
+   bool force,
+   Error **errp)
+{
+QCryptoBlockCreateOptions amend_opts;
+
+amend_opts = (QCryptoBlockCreateOptions) {
+.format = Q_CRYPTO_BLOCK_FORMAT_LUKS,
+.u.luks = *qapi_BlockdevCreateOptionsLUKS_base(>u.luks),
+};
+
+return block_crypto_amend_options_generic(bs, _opts, force, errp);
+}
+
 
 static void
 block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
@@ -750,7 +795,8 @@ block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
  */
 
 if (crypto->updating_keys) {
-/*need exclusive write access for header update  */
+assert(!(bs->open_flags & BDRV_O_NO_IO));
+/*need exclusive read and write access for header update  */
 perm |= BLK_PERM_WRITE;
 shared &= ~(BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE);
 }
@@ -786,6 +832,7 @@ static BlockDriver bdrv_crypto_luks = {
 .bdrv_get_info  = block_crypto_get_info_luks,
 .bdrv_get_specific_info = block_crypto_get_specific_info_luks,
 .bdrv_amend_options = block_crypto_amend_options,

[Qemu-devel] [PATCH v2 03/11] qcrypto-luks: implement the encryption key management

2019-09-12 Thread Maxim Levitsky
Signed-off-by: Maxim Levitsky 
---
 crypto/block-luks.c | 356 +++-
 1 file changed, 354 insertions(+), 2 deletions(-)

diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index fed80e6646..26ce50b111 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -70,6 +70,9 @@ typedef struct QCryptoBlockLUKSKeySlot 
QCryptoBlockLUKSKeySlot;
 
 #define QCRYPTO_BLOCK_LUKS_SECTOR_SIZE 512LL
 
+#define QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME_MS 2000
+#define QCRYPTO_BLOCK_LUKS_ERASE_ITERATIONS 40
+
 static const char qcrypto_block_luks_magic[QCRYPTO_BLOCK_LUKS_MAGIC_LEN] = {
 'L', 'U', 'K', 'S', 0xBA, 0xBE
 };
@@ -219,6 +222,9 @@ struct QCryptoBlockLUKS {
 
 /* Hash algorithm used in pbkdf2 function */
 QCryptoHashAlgorithm hash_alg;
+
+/* Name of the secret that was used to open the image */
+char *secret;
 };
 
 
@@ -1070,6 +1076,170 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
 }
 
 
+
+/*
+ * Returns true if a slot i is marked as active
+ * (contains encrypted copy of the master key)
+ */
+static bool
+qcrypto_block_luks_slot_active(const QCryptoBlockLUKS *luks,
+   unsigned int slot_idx)
+{
+uint32_t val = luks->header.key_slots[slot_idx].active;
+return val ==  QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED;
+}
+
+/*
+ * Returns the number of slots that are marked as active
+ * (contains encrypted copy of the master key)
+ */
+static unsigned int
+qcrypto_block_luks_count_active_slots(const QCryptoBlockLUKS *luks)
+{
+size_t i = 0;
+unsigned int ret = 0;
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+if (qcrypto_block_luks_slot_active(luks, i)) {
+ret++;
+}
+}
+return ret;
+}
+
+
+/*
+ * Finds first key slot which is not active
+ * Returns the key slot index, or -1 if doesn't exist
+ */
+static int
+qcrypto_block_luks_find_free_keyslot(const QCryptoBlockLUKS *luks)
+{
+size_t i;
+
+for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
+if (!qcrypto_block_luks_slot_active(luks, i)) {
+return i;
+}
+}
+return -1;
+
+}
+
+/*
+ * Erases an keyslot given its index
+ * Returns:
+ *0 if the keyslot was erased successfully
+ *   -1 if a error occurred while erasing the keyslot
+ *
+ */
+static int
+qcrypto_block_luks_erase_key(QCryptoBlock *block,
+ unsigned int slot_idx,
+ QCryptoBlockWriteFunc writefunc,
+ void *opaque,
+ Error **errp)
+{
+QCryptoBlockLUKS *luks = block->opaque;
+QCryptoBlockLUKSKeySlot *slot = >header.key_slots[slot_idx];
+g_autofree uint8_t *garbagesplitkey = NULL;
+size_t splitkeylen = luks->header.master_key_len * slot->stripes;
+size_t i;
+
+assert(slot_idx < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);
+assert(splitkeylen > 0);
+
+garbagesplitkey = g_new0(uint8_t, splitkeylen);
+
+/* Reset the key slot header */
+memset(slot->salt, 0, QCRYPTO_BLOCK_LUKS_SALT_LEN);
+slot->iterations = 0;
+slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
+
+qcrypto_block_luks_store_header(block,  writefunc, opaque, errp);
+
+/*
+ * Now try to erase the key material, even if the header
+ * update failed
+ */
+
+for (i = 0 ; i < QCRYPTO_BLOCK_LUKS_ERASE_ITERATIONS ; i++) {
+if (qcrypto_random_bytes(garbagesplitkey, splitkeylen, errp) < 0) {
+/*
+ * If we failed to get the random data, still write
+ * at least zeros to the key slot at least once
+ */
+
+if (i > 0) {
+return -1;
+}
+}
+
+if (writefunc(block,
+  slot->key_offset_sector * QCRYPTO_BLOCK_LUKS_SECTOR_SIZE,
+  garbagesplitkey,
+  splitkeylen,
+  opaque,
+  errp) != splitkeylen) {
+return -1;
+}
+}
+return 0;
+}
+
+
+/*
+ * Erase all the keys that match the given password
+ * Will stop when only one keyslot is remaining
+ * Returns number of slots that were erased or -1 on failure
+ */
+static int
+qcrypto_block_luks_erase_matching_keys(QCryptoBlock *block,
+   const char *password,
+   QCryptoBlockReadFunc readfunc,
+   QCryptoBlockWriteFunc writefunc,
+   void *opaque,
+   bool force,
+   Error **errp)
+{
+QCryptoBlockLUKS *luks = block->opaque;
+size_t i;
+int rv;
+g_autofree uint8_t *masterkey = NULL;
+unsigned int erased_cnt = 0;
+unsigned int active_slot_cnt = qcrypto_block_luks_count_active_slots(luks);
+
+masterkey = g_new0(uint8_t, luks->header.master_key_len);
+

[Qemu-devel] [PATCH v2 00/11] RFC crypto/luks: encryption key managment using amend interface

2019-09-12 Thread Maxim Levitsky
This patch series is continuation of my work to add encryption
key managment to luks/qcow2 with luks.

This is second version of this patch set.
The changes are mostly addressing the review feedback,
plus I tested (and fixed sadly) the somewhat ugly code
that allows to still write share a raw luks device,
while preveting the key managment from happening in this case,
as it is unsafe.
I added a new iotest dedicated to that as well.

Best regards,
Maxim Levitsky

Maxim Levitsky (11):
  qcrypto: add suport for amend options
  qcrypto-luks: extend the create options for upcoming encryption key
management
  qcrypto-luks: implement the encryption key management
  block: amend: add 'force' option
  block/crypto: implement the encryption key management
  qcow2: implement crypto amend options
  block: add x-blockdev-amend qmp command
  block/crypto: implement blockdev-amend
  block/qcow2: implement blockdev-amend
  iotests: filter few more luks specific create options
  iotests : add tests for encryption key management

 block.c  |   4 +-
 block/Makefile.objs  |   2 +-
 block/amend.c| 116 ++
 block/crypto.c   | 167 +-
 block/crypto.h   |  16 ++
 block/qcow2.c| 151 ++--
 crypto/block-luks.c  | 382 ++-
 crypto/block.c   |  31 +++
 crypto/blockpriv.h   |   8 +
 include/block/block.h|   1 +
 include/block/block_int.h|  22 +-
 include/crypto/block.h   |  22 ++
 qapi/block-core.json |  39 +++-
 qapi/crypto.json |  19 ++
 qapi/job.json|   4 +-
 qemu-img-cmds.hx |   4 +-
 qemu-img.c   |   8 +-
 qemu-img.texi|   6 +-
 tests/qemu-iotests/082.out   |  54 +
 tests/qemu-iotests/087.out   |   6 +-
 tests/qemu-iotests/134.out   |   2 +-
 tests/qemu-iotests/158.out   |   4 +-
 tests/qemu-iotests/188.out   |   2 +-
 tests/qemu-iotests/189.out   |   4 +-
 tests/qemu-iotests/198.out   |   4 +-
 tests/qemu-iotests/300   | 202 
 tests/qemu-iotests/300.out   |  98 
 tests/qemu-iotests/301   |  90 
 tests/qemu-iotests/301.out   |  30 +++
 tests/qemu-iotests/302   | 252 
 tests/qemu-iotests/302.out   |  18 ++
 tests/qemu-iotests/303   | 228 ++
 tests/qemu-iotests/303.out   |  28 +++
 tests/qemu-iotests/common.filter |   6 +-
 tests/qemu-iotests/group |   9 +
 35 files changed, 1986 insertions(+), 53 deletions(-)
 create mode 100644 block/amend.c
 create mode 100755 tests/qemu-iotests/300
 create mode 100644 tests/qemu-iotests/300.out
 create mode 100755 tests/qemu-iotests/301
 create mode 100644 tests/qemu-iotests/301.out
 create mode 100644 tests/qemu-iotests/302
 create mode 100644 tests/qemu-iotests/302.out
 create mode 100644 tests/qemu-iotests/303
 create mode 100644 tests/qemu-iotests/303.out

-- 
2.17.2




[Qemu-devel] [PATCH v2 04/11] block: amend: add 'force' option

2019-09-12 Thread Maxim Levitsky
'force' optinion will be used for some unsafe option amend operations.

This includes things like erasing last keyslot in luks (which pretty much 
guarantees
destroying the data, unless the master key is backed up by extrnal means,
but that _might_ be desired result)


Signed-off-by: Maxim Levitsky 
Reviewed-by: Daniel P. Berrangé 
---
 block.c   | 4 +++-
 block/qcow2.c | 1 +
 include/block/block.h | 1 +
 include/block/block_int.h | 1 +
 qemu-img-cmds.hx  | 4 ++--
 qemu-img.c| 8 +++-
 qemu-img.texi | 6 +-
 7 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 5944124845..414303f76d 100644
--- a/block.c
+++ b/block.c
@@ -6141,6 +6141,7 @@ void bdrv_remove_aio_context_notifier(BlockDriverState 
*bs,
 
 int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts,
BlockDriverAmendStatusCB *status_cb, void *cb_opaque,
+   bool force,
Error **errp)
 {
 if (!bs->drv) {
@@ -6152,7 +6153,8 @@ int bdrv_amend_options(BlockDriverState *bs, QemuOpts 
*opts,
bs->drv->format_name);
 return -ENOTSUP;
 }
-return bs->drv->bdrv_amend_options(bs, opts, status_cb, cb_opaque, errp);
+return bs->drv->bdrv_amend_options(bs, opts, status_cb,
+   cb_opaque, force, errp);
 }
 
 /* This function will be called by the bdrv_recurse_is_first_non_filter method
diff --git a/block/qcow2.c b/block/qcow2.c
index 5bdb8b18f4..0618a63793 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4822,6 +4822,7 @@ static void qcow2_amend_helper_cb(BlockDriverState *bs,
 static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
BlockDriverAmendStatusCB *status_cb,
void *cb_opaque,
+   bool force,
Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
diff --git a/include/block/block.h b/include/block/block.h
index 124ad40809..6bc89c7667 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -400,6 +400,7 @@ typedef void BlockDriverAmendStatusCB(BlockDriverState *bs, 
int64_t offset,
   int64_t total_work_size, void *opaque);
 int bdrv_amend_options(BlockDriverState *bs_new, QemuOpts *opts,
BlockDriverAmendStatusCB *status_cb, void *cb_opaque,
+   bool force,
Error **errp);
 
 /* external snapshots */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 0422acdf1c..5ea30f9d58 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -411,6 +411,7 @@ struct BlockDriver {
 int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts,
   BlockDriverAmendStatusCB *status_cb,
   void *cb_opaque,
+  bool force,
   Error **errp);
 
 void (*bdrv_debug_event)(BlockDriverState *bs, BlkdebugEvent event);
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 1c93e6d185..323ea10ad0 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -14,9 +14,9 @@ STEXI
 ETEXI
 
 DEF("amend", img_amend,
-"amend [--object objectdef] [--image-opts] [-p] [-q] [-f fmt] [-t cache] 
-o options filename")
+"amend [--object objectdef] [--image-opts] [-p] [-q] [-f fmt] [-t cache] 
[--force] -o options filename")
 STEXI
-@item amend [--object @var{objectdef}] [--image-opts] [-p] [-q] [-f @var{fmt}] 
[-t @var{cache}] -o @var{options} @var{filename}
+@item amend [--object @var{objectdef}] [--image-opts] [-p] [-q] [-f @var{fmt}] 
[-t @var{cache}] [--force] -o @var{options} @var{filename}
 ETEXI
 
 DEF("bench", img_bench,
diff --git a/qemu-img.c b/qemu-img.c
index 4ee436fc94..30300870ff 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -70,6 +70,7 @@ enum {
 OPTION_PREALLOCATION = 265,
 OPTION_SHRINK = 266,
 OPTION_SALVAGE = 267,
+OPTION_FORCE = 268,
 };
 
 typedef enum OutputFormat {
@@ -3915,6 +3916,7 @@ static int img_amend(int argc, char **argv)
 BlockBackend *blk = NULL;
 BlockDriverState *bs = NULL;
 bool image_opts = false;
+bool force = false;
 
 cache = BDRV_DEFAULT_CACHE;
 for (;;) {
@@ -3922,6 +3924,7 @@ static int img_amend(int argc, char **argv)
 {"help", no_argument, 0, 'h'},
 {"object", required_argument, 0, OPTION_OBJECT},
 {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
+{"force", no_argument, 0, OPTION_FORCE},
 {0, 0, 0, 0}
 };
 c = getopt_long(argc, argv, ":ho:f:t:pq",
@@ -3977,6 +3980,9 @@ static int img_amend(int argc, char **argv)
 case OPTION_IMAGE_OPTS:
 image_opts = true;
 break;
+case OPTION_FORCE:
+force = true;
+break;
 }
  

Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap

2019-09-12 Thread Paolo Bonzini
Il gio 12 set 2019, 19:43 Richard Henderson 
ha scritto:

> >>> Fortunately, in order to fix it, no change is required to the
> >>> vCPU thread.  However, the reader thread must delay the read after
> >>> the vCPU thread has finished the write.  This can be approximated
> >>> conservatively by run_on_cpu, which waits for the end of the current
> >>> translation block.
>
> If we are going to delay any read of the dirty flags until vCPU has
> completed
> any active TranslationBlock, then we can simplify the TCG operation so
> that we
> do not (ab)use the mmio path, and can promote this into the tlb slow path
> as we
> have recently done with watchpoints.


Uh, that's true and I agree it would be great. Die notdirty_mem_ops die...

Paolo


Re: [Qemu-devel] [PATCH v2] docs/nvdimm: add example on persistent backend setup

2019-09-12 Thread Wei Yang
On Thu, Sep 12, 2019 at 02:16:00PM +0200, Stefan Hajnoczi wrote:
>On Thu, Aug 01, 2019 at 08:40:53AM +0800, Wei Yang wrote:
>> Persistent backend setup requires some knowledge about nvdimm and ndctl
>> tool. Some users report they may struggle to gather these knowledge and
>> have difficulty to setup it properly.
>> 
>> Here we provide two examples for persistent backend and gives the link
>> to ndctl. By doing so, user could try it directly and do more
>> investigation on persistent backend setup with ndctl.
>> 
>> Signed-off-by: Wei Yang 
>> Reviewed-by: Pankaj Gupta 
>> 
>> ---
>> v2: rephrase the doc based on Stefan Hajnoczi's suggestion
>> ---
>>  docs/nvdimm.txt | 31 +++
>>  1 file changed, 31 insertions(+)
>
>Sorry, I was expecting someone else to pick this patch up.  But since
>there have been no takers...
>
>Thanks, applied to my block-next tree:
>https://github.com/stefanha/qemu/commits/block-next
>

Thanks :-)

>Stefan



-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH] migration: fix one typo in comment of function migration_total_bytes()

2019-09-12 Thread Wei Yang
On Thu, Sep 12, 2019 at 09:44:41AM +0200, Juan Quintela wrote:
>Wei Yang  wrote:
>> Signed-off-by: Wei Yang 
>
>Reviewed-by: Juan Quintela 
>
>for(i = 0; i < 0; i++)
>  printf("Beginning is with double n, not double g");

:-)

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH] memory: inline and optimize devend_memop

2019-09-12 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20190912141820.30702-1-pbonz...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

libudev   no
default devices   yes

warning: Python 2 support is deprecated
warning: Python 3 will be required for building future versions of QEMU
cross containers  no

NOTE: guest cross-compilers enabled: cc
---
In file included from /tmp/qemu-test/src/include/exec/cpu-all.h:23:0,
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/disas.c:7:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/tcg/tcg.h:28,
 from /tmp/qemu-test/src/tcg/tcg-common.c:26:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
In file included from /tmp/qemu-test/src/include/exec/cpu-all.h:23:0,
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/exec.c:25:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
In file included from /tmp/qemu-test/src/include/exec/cpu-all.h:23:0,
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/arch_init.c:25:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
In file included from /tmp/qemu-test/src/include/exec/cpu-all.h:23:0,
 from /tmp/qemu-test/src/target/arm/cpu.h:3141,
 from /tmp/qemu-test/src/exec.c:25:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
In file included from /tmp/qemu-test/src/include/exec/cpu-all.h:23:0,
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/tcg/tcg-op-vec.c:21:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/tcg/tcg.h:28,
 from /tmp/qemu-test/src/tcg/tcg-op-gvec.c:21:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
 from /tmp/qemu-test/src/target/arm/cpu.h:3141,
 from /tmp/qemu-test/src/tcg/tcg.h:28,
 from /tmp/qemu-test/src/tcg/tcg-common.c:26:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
 from /tmp/qemu-test/src/tcg/tcg.h:28,
 from /tmp/qemu-test/src/tcg/tcg-op.h:28,
 from /tmp/qemu-test/src/tcg/optimize.c:27:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
In file included from /tmp/qemu-test/src/include/exec/cpu-all.h:23:0,
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/tcg/tcg-op.c:26:
/tmp/qemu-test/src/include/exec/memory.h:2206:7: error: no previous prototype 
for 'devend_memop' [-Werror=missing-prototypes]
 MemOp devend_memop(enum device_endian end)
   ^
cc1: all warnings being treated as errors
---
In file included from /tmp/qemu-test/src/include/exec/cpu-all.h:23:0,
 from /tmp/qemu-test/src/target/i386/cpu.h:1792,
 from /tmp/qemu-test/src/tcg/tcg.c:43:

[Qemu-devel] [Bug 1841592] Re: ppc: softfloat float implementation issues

2019-09-12 Thread Alex Bennée
Testing on current master shows the behavior is correct. I guess rth's
patch fixed this case.

** Changed in: qemu
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1841592

Title:
  ppc: softfloat float implementation issues

Status in QEMU:
  Fix Committed

Bug description:
  Per bug #1841491, Richard Henderson (rth) said:
  > The float test failure is part of a larger problem for target/powerpc
  > in which all float routines are implemented incorrectly. They are all
  > implemented as double operations with rounding to float as a second
  > step. Which not only produces incorrect exceptions, as in this case,
  > but incorrect numerical results from the double rounding.
  >
  > This should probably be split to a separate bug...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1841592/+subscriptions



Re: [Qemu-devel] [PATCH 2/2] ati: use vga_read_byte in ati_cursor_define

2019-09-12 Thread BALATON Zoltan

On Thu, 12 Sep 2019, Gerd Hoffmann wrote:

This makes sure reads are confined to vga video memory.

Reported-by: xu hang 
Signed-off-by: Gerd Hoffmann 
---
hw/display/ati.c | 11 ++-
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/display/ati.c b/hw/display/ati.c
index 8f940eee221a..6d77c40b8287 100644
--- a/hw/display/ati.c
+++ b/hw/display/ati.c
@@ -19,6 +19,7 @@
#include "qemu/osdep.h"
#include "ati_int.h"
#include "ati_regs.h"
+#include "vga-access.h"
#include "hw/qdev-properties.h"
#include "vga_regs.h"
#include "qemu/log.h"
@@ -135,19 +136,19 @@ static void ati_vga_switch_mode(ATIVGAState *s)
static void ati_cursor_define(ATIVGAState *s)
{
uint8_t data[1024];
-uint8_t *src;
+unsigned srcoff;
int i, j, idx = 0;

if ((s->regs.cur_offset & BIT(31)) || s->cursor_guest_mode) {
return; /* Do not update cursor if locked or rendered by guest */
}
/* FIXME handle cur_hv_offs correctly */
-src = s->vga.vram_ptr + s->regs.cur_offset -
-  (s->regs.cur_hv_offs >> 16) - (s->regs.cur_hv_offs & 0x) * 16;
+srcoff = s->regs.cur_offset -
+(s->regs.cur_hv_offs >> 16) - (s->regs.cur_hv_offs & 0x) * 16;


Do we need similar fix in ati_cursor_draw_line() as well which also 
accesses cursor data when guest_hwcursor property is true?


Regards,
BALATON Zoltan


for (i = 0; i < 64; i++) {
for (j = 0; j < 8; j++, idx++) {
-data[idx] = src[i * 16 + j];
-data[512 + idx] = src[i * 16 + j + 8];
+data[idx] = vga_read_byte(>vga, srcoff + i * 16 + j);
+data[512 + idx] = vga_read_byte(>vga, srcoff + i * 16 + j + 8);
}
}
if (!s->cursor) {





Re: [Qemu-devel] [RFC] docs: vhost-user: add in-band kick/call messages

2019-09-12 Thread Johannes Berg
On Thu, 2019-09-12 at 14:22 +0200, Stefan Hajnoczi wrote:
> 
> The vhost-user spec is unclear and inconsistent.  Patches are welcome.

:)

> A footnote describing the old terminology would be necessary so that
> existing documentation, code, etc can still be decyphered when the spec
> changes the terminology :).

Yeah.

But we (you) would have to decide first what the terminology actually
*should* be :-)

I'd have said something like "host" and "device", but then "host" can
get confusing in the context of qemu, is it the host that the VM is
running on? It's the VM that's hosting the device, but that's a bit confusing 
in this context.

Client/server might work, but it's not very obvious either (**), and
quite a bit of the text actually gets it wrong right now, so it'd be
very confusing to swap that and have a footnote or similar indicate that
it was described the other way around previously ...

Calling it master/slave as the text does now is a bit confusing to me
because it has DMA (of sorts) and that's called "bus mastering" in most
other contexts, but perhaps that's what would work best nonetheless,
though I'm not really a fan of that terminology.

Perhaps master/device would be nicer, getting the term "device" in there
somewhere would seem appropriate, even if it's not really actually a
device, but from the view inside the "master" VM it is a device...


(**) Btw, is it really true that there's a way to have the device make
the connection to a listening unix domain socket, as implied by the
spec? I cannot find evidence for such a mode in any code...

johannes




[Qemu-devel] [PATCH v2 2/3] cputlb: Replace switches in load/store_helper with callback

2019-09-12 Thread Richard Henderson
Add a function parameter to perform the actual load/store to ram.
With optimization, this results in identical code.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 159 +++--
 1 file changed, 83 insertions(+), 76 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index b87764..b4a63d3928 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1280,11 +1280,38 @@ static void *atomic_mmu_lookup(CPUArchState *env, 
target_ulong addr,
 
 typedef uint64_t FullLoadHelper(CPUArchState *env, target_ulong addr,
 TCGMemOpIdx oi, uintptr_t retaddr);
+typedef uint64_t LoadHelper(const void *);
+
+/* Wrap the unaligned load helpers to that they have a common signature.  */
+static inline uint64_t wrap_ldub(const void *haddr)
+{
+return ldub_p(haddr);
+}
+
+static inline uint64_t wrap_lduw_be(const void *haddr)
+{
+return lduw_be_p(haddr);
+}
+
+static inline uint64_t wrap_lduw_le(const void *haddr)
+{
+return lduw_le_p(haddr);
+}
+
+static inline uint64_t wrap_ldul_be(const void *haddr)
+{
+return (uint32_t)ldl_be_p(haddr);
+}
+
+static inline uint64_t wrap_ldul_le(const void *haddr)
+{
+return (uint32_t)ldl_le_p(haddr);
+}
 
 static inline uint64_t QEMU_ALWAYS_INLINE
 load_helper(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 uintptr_t retaddr, MemOp op, bool code_read,
-FullLoadHelper *full_load)
+FullLoadHelper *full_load, LoadHelper *direct)
 {
 uintptr_t mmu_idx = get_mmuidx(oi);
 uintptr_t index = tlb_index(env, mmu_idx, addr);
@@ -1373,33 +1400,7 @@ load_helper(CPUArchState *env, target_ulong addr, 
TCGMemOpIdx oi,
 
  do_aligned_access:
 haddr = (void *)((uintptr_t)addr + entry->addend);
-switch (op) {
-case MO_UB:
-res = ldub_p(haddr);
-break;
-case MO_BEUW:
-res = lduw_be_p(haddr);
-break;
-case MO_LEUW:
-res = lduw_le_p(haddr);
-break;
-case MO_BEUL:
-res = (uint32_t)ldl_be_p(haddr);
-break;
-case MO_LEUL:
-res = (uint32_t)ldl_le_p(haddr);
-break;
-case MO_BEQ:
-res = ldq_be_p(haddr);
-break;
-case MO_LEQ:
-res = ldq_le_p(haddr);
-break;
-default:
-g_assert_not_reached();
-}
-
-return res;
+return direct(haddr);
 }
 
 /*
@@ -1415,7 +1416,8 @@ load_helper(CPUArchState *env, target_ulong addr, 
TCGMemOpIdx oi,
 static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr,
   TCGMemOpIdx oi, uintptr_t retaddr)
 {
-return load_helper(env, addr, oi, retaddr, MO_UB, false, full_ldub_mmu);
+return load_helper(env, addr, oi, retaddr, MO_UB, false,
+   full_ldub_mmu, wrap_ldub);
 }
 
 tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
@@ -1428,7 +1430,7 @@ static uint64_t full_le_lduw_mmu(CPUArchState *env, 
target_ulong addr,
  TCGMemOpIdx oi, uintptr_t retaddr)
 {
 return load_helper(env, addr, oi, retaddr, MO_LEUW, false,
-   full_le_lduw_mmu);
+   full_le_lduw_mmu, wrap_lduw_le);
 }
 
 tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr,
@@ -1441,7 +1443,7 @@ static uint64_t full_be_lduw_mmu(CPUArchState *env, 
target_ulong addr,
  TCGMemOpIdx oi, uintptr_t retaddr)
 {
 return load_helper(env, addr, oi, retaddr, MO_BEUW, false,
-   full_be_lduw_mmu);
+   full_be_lduw_mmu, wrap_lduw_be);
 }
 
 tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr,
@@ -1454,7 +1456,7 @@ static uint64_t full_le_ldul_mmu(CPUArchState *env, 
target_ulong addr,
  TCGMemOpIdx oi, uintptr_t retaddr)
 {
 return load_helper(env, addr, oi, retaddr, MO_LEUL, false,
-   full_le_ldul_mmu);
+   full_le_ldul_mmu, wrap_ldul_le);
 }
 
 tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr,
@@ -1467,7 +1469,7 @@ static uint64_t full_be_ldul_mmu(CPUArchState *env, 
target_ulong addr,
  TCGMemOpIdx oi, uintptr_t retaddr)
 {
 return load_helper(env, addr, oi, retaddr, MO_BEUL, false,
-   full_be_ldul_mmu);
+   full_be_ldul_mmu, wrap_ldul_be);
 }
 
 tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr,
@@ -1480,14 +1482,14 @@ uint64_t helper_le_ldq_mmu(CPUArchState *env, 
target_ulong addr,
TCGMemOpIdx oi, uintptr_t retaddr)
 {
 return load_helper(env, addr, oi, retaddr, MO_LEQ, false,
-   helper_le_ldq_mmu);
+   helper_le_ldq_mmu, ldq_le_p);
 }
 
 uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
TCGMemOpIdx oi, uintptr_t retaddr)

[Qemu-devel] [PATCH v2 3/3] cputlb: Introduce TLB_BSWAP

2019-09-12 Thread Richard Henderson
Handle bswap on ram directly in load/store_helper.  This fixes a
bug with the previous implementation in that one cannot use the
I/O path for RAM.

Fixes: a26fc6f5152b47f1
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-all.h |   2 +
 accel/tcg/cputlb.c | 118 -
 2 files changed, 59 insertions(+), 61 deletions(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index d2d443c4f9..3928edab9a 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -331,6 +331,8 @@ CPUArchState *cpu_copy(CPUArchState *env);
 #define TLB_MMIO(1 << (TARGET_PAGE_BITS - 3))
 /* Set if TLB entry contains a watchpoint.  */
 #define TLB_WATCHPOINT  (1 << (TARGET_PAGE_BITS - 4))
+/* Set if TLB entry requires byte swap.  */
+#define TLB_BSWAP   (1 << (TARGET_PAGE_BITS - 5))
 
 /* Use this mask to check interception with an alignment mask
  * in a TCG backend.
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index b4a63d3928..354a75927a 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -737,8 +737,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 address |= TLB_INVALID_MASK;
 }
 if (attrs.byte_swap) {
-/* Force the access through the I/O slow path.  */
-address |= TLB_MMIO;
+address |= TLB_BSWAP;
 }
 if (!memory_region_is_ram(section->mr) &&
 !memory_region_is_romd(section->mr)) {
@@ -901,10 +900,6 @@ static uint64_t io_readx(CPUArchState *env, CPUIOTLBEntry 
*iotlbentry,
 bool locked = false;
 MemTxResult r;
 
-if (iotlbentry->attrs.byte_swap) {
-op ^= MO_BSWAP;
-}
-
 section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
 mr = section->mr;
 mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
@@ -947,10 +942,6 @@ static void io_writex(CPUArchState *env, CPUIOTLBEntry 
*iotlbentry,
 bool locked = false;
 MemTxResult r;
 
-if (iotlbentry->attrs.byte_swap) {
-op ^= MO_BSWAP;
-}
-
 section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
 mr = section->mr;
 mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
@@ -1311,7 +1302,8 @@ static inline uint64_t wrap_ldul_le(const void *haddr)
 static inline uint64_t QEMU_ALWAYS_INLINE
 load_helper(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 uintptr_t retaddr, MemOp op, bool code_read,
-FullLoadHelper *full_load, LoadHelper *direct)
+FullLoadHelper *full_load, LoadHelper *direct,
+LoadHelper *direct_swap)
 {
 uintptr_t mmu_idx = get_mmuidx(oi);
 uintptr_t index = tlb_index(env, mmu_idx, addr);
@@ -1361,26 +1353,27 @@ load_helper(CPUArchState *env, target_ulong addr, 
TCGMemOpIdx oi,
 /* On watchpoint hit, this will longjmp out.  */
 cpu_check_watchpoint(env_cpu(env), addr, size,
  iotlbentry->attrs, BP_MEM_READ, retaddr);
-
-/* The backing page may or may not require I/O.  */
-tlb_addr &= ~TLB_WATCHPOINT;
-if ((tlb_addr & ~TARGET_PAGE_MASK) == 0) {
-goto do_aligned_access;
-}
 }
 
 /* Handle I/O access.  */
-return io_readx(env, iotlbentry, mmu_idx, addr,
-retaddr, access_type, op);
-}
+if (likely(tlb_addr & TLB_MMIO)) {
+return io_readx(env, iotlbentry, mmu_idx, addr,
+retaddr, access_type,
+op ^ (tlb_addr & TLB_BSWAP ? MO_BSWAP : 0));
+}
 
-/* Handle slow unaligned access (it spans two pages or IO).  */
-if (size > 1
-&& unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
->= TARGET_PAGE_SIZE)) {
+if (unlikely(tlb_addr & TLB_BSWAP)) {
+haddr = (void *)((uintptr_t)addr + entry->addend);
+return direct_swap(haddr);
+}
+} else if (size > 1
+   && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
+   >= TARGET_PAGE_SIZE)) {
+/* Handle slow unaligned access (it spans two pages or IO).  */
 target_ulong addr1, addr2;
 uint64_t r1, r2;
 unsigned shift;
+
 do_unaligned_access:
 addr1 = addr & ~((target_ulong)size - 1);
 addr2 = addr1 + size;
@@ -1398,7 +1391,6 @@ load_helper(CPUArchState *env, target_ulong addr, 
TCGMemOpIdx oi,
 return res & MAKE_64BIT_MASK(0, size * 8);
 }
 
- do_aligned_access:
 haddr = (void *)((uintptr_t)addr + entry->addend);
 return direct(haddr);
 }
@@ -1417,7 +1409,7 @@ static uint64_t full_ldub_mmu(CPUArchState *env, 
target_ulong addr,
   TCGMemOpIdx oi, uintptr_t retaddr)
 {
 return load_helper(env, addr, oi, retaddr, MO_UB, false,
-   full_ldub_mmu, wrap_ldub);
+   full_ldub_mmu, wrap_ldub, wrap_ldub);
 }
 
 

[Qemu-devel] [PATCH v2 1/3] cputlb: Disable __always_inline__ without optimization

2019-09-12 Thread Richard Henderson
This forced inlining can result in missing symbols,
which makes a debugging build harder to follow.

Reported-by: Peter Maydell 
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 include/qemu/compiler.h | 11 +++
 accel/tcg/cputlb.c  |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 09fc44cca4..d6d400c523 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -170,6 +170,17 @@
 # define QEMU_NONSTRING
 #endif
 
+/*
+ * Forced inlining may be desired to encourage constant propagation
+ * of function parameters.  However, it can also make debugging harder,
+ * so disable it for a non-optimizing build.
+ */
+#if defined(__OPTIMIZE__) && __has_attribute(always_inline)
+#define QEMU_ALWAYS_INLINE  __attribute__((always_inline))
+#else
+#define QEMU_ALWAYS_INLINE
+#endif
+
 /* Implement C11 _Generic via GCC builtins.  Example:
  *
  *QEMU_GENERIC(x, (float, sinf), (long double, sinl), sin) (x)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index abae79650c..b87764 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1281,7 +1281,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, 
target_ulong addr,
 typedef uint64_t FullLoadHelper(CPUArchState *env, target_ulong addr,
 TCGMemOpIdx oi, uintptr_t retaddr);
 
-static inline uint64_t __attribute__((always_inline))
+static inline uint64_t QEMU_ALWAYS_INLINE
 load_helper(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi,
 uintptr_t retaddr, MemOp op, bool code_read,
 FullLoadHelper *full_load)
@@ -1530,7 +1530,7 @@ tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, 
target_ulong addr,
  * Store Helpers
  */
 
-static inline void __attribute__((always_inline))
+static inline void QEMU_ALWAYS_INLINE
 store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
  TCGMemOpIdx oi, uintptr_t retaddr, MemOp op)
 {
-- 
2.17.1




[Qemu-devel] [PATCH v2 0/3] cputlb: Adjust tlb bswap implementation

2019-09-12 Thread Richard Henderson
Changes from v1:
  * Move QEMU_ALWAYS_INLINE to qemu/compiler.h.
  * Rename some inline wrapper functions.
  * Don't break TLB_NOTDIRTY in patch 3.

Blurb from v1:

The version that Tony came up with, and I reviewed, doesn't actually
work when applied to RAM.  It only worked for i/o memory.  This was
the root cause for

https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg00036.html

I tried a couple of different approaches in load/store_helper, but
this is the one that didn't affect the normal case -- a simple tlb
miss against (non-swapped) ram.


r~


Richard Henderson (3):
  cputlb: Disable __always_inline__ without optimization
  cputlb: Replace switches in load/store_helper with callback
  cputlb: Introduce TLB_BSWAP

 include/exec/cpu-all.h  |   2 +
 include/qemu/compiler.h |  11 ++
 accel/tcg/cputlb.c  | 235 
 3 files changed, 132 insertions(+), 116 deletions(-)

-- 
2.17.1




Re: [Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context

2019-09-12 Thread Michael S. Tsirkin
On Thu, Sep 12, 2019 at 08:19:25PM +0200, Sergio Lopez wrote:
> Another AioContext-related issue, and this is a tricky one.
> 
> Executing a QMP block_resize request for a virtio-blk device running
> on an iothread may cause a deadlock involving the following mutexes:
> 
>  - main thead
>   * Has acquired: qemu_mutex_global.
>   * Is trying the acquire: iothread AioContext lock via
> AIO_WAIT_WHILE (after aio_poll).
> 
>  - iothread
>   * Has acquired: AioContext lock.
>   * Is trying to acquire: qemu_mutex_global (via
> virtio_notify_config->prepare_mmio_access).

Hmm is this really the only case iothread takes qemu mutex?
If any such access can deadlock, don't we need a generic
solution? Maybe main thread can drop qemu mutex
before taking io thread AioContext lock?

> With this change, virtio_blk_resize checks if it's being called from a
> coroutine context running on a non-main thread, and if that's the
> case, creates a new coroutine and schedules it to be run on the main
> thread.
> 
> This works, but means the actual operation is done
> asynchronously, perhaps opening a window in which a "device_del"
> operation may fit and remove the VirtIODevice before
> virtio_notify_config() is executed.
> 
> I *think* it shouldn't be possible, as BHs will be processed before
> any new QMP/monitor command, but I'm open to a different approach.
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
> Signed-off-by: Sergio Lopez 
> ---
>  hw/block/virtio-blk.c | 25 -
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 18851601cb..c763d071f6 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -16,6 +16,7 @@
>  #include "qemu/iov.h"
>  #include "qemu/module.h"
>  #include "qemu/error-report.h"
> +#include "qemu/main-loop.h"
>  #include "trace.h"
>  #include "hw/block/block.h"
>  #include "hw/qdev-properties.h"
> @@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
> QEMUFile *f,
>  return 0;
>  }
>  
> +static void coroutine_fn virtio_resize_co_entry(void *opaque)
> +{
> +VirtIODevice *vdev = opaque;
> +
> +assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> +virtio_notify_config(vdev);
> +aio_wait_kick();
> +}
> +
>  static void virtio_blk_resize(void *opaque)
>  {
>  VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
> +Coroutine *co;
>  
> -virtio_notify_config(vdev);
> +if (qemu_in_coroutine() &&
> +qemu_get_current_aio_context() != qemu_get_aio_context()) {
> +/*
> + * virtio_notify_config() needs to acquire the global mutex,
> + * so calling it from a coroutine running on a non-main context
> + * may cause a deadlock. Instead, create a new coroutine and
> + * schedule it to be run on the main thread.
> + */
> +co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
> +aio_co_schedule(qemu_get_aio_context(), co);
> +} else {
> +virtio_notify_config(vdev);
> +}
>  }
>  
>  static const BlockDevOps virtio_block_ops = {
> -- 
> 2.21.0



[Qemu-devel] [Bug 1841592] Re: ppc: softfloat float implementation issues

2019-09-12 Thread Richard Henderson
It should be a fused multiply add; you may need to use -ffast-math or
something to get the compiler to generate the proper instruction.

However, one can see from target/ppc/translate/fp-impl.inc.c:

/* fmadd - fmadds */
GEN_FLOAT_ACB(madd, 0x1D, 1, PPC_FLOAT);

through to _GEN_FLOAT_ACB:

gen_helper_f##op(t3, cpu_env, t0, t1, t2); \
if (isfloat) { \
gen_helper_frsp(t3, cpu_env, t3);  \
}  \

That right there is a double-precision fma followed by a round
to single precision.  This pattern is replicated for all single
precision operations, and is of course wrong.

I believe that correct results may be obtained by having
single-precision helpers that first convert the double-precision
input into a single-precision input using helper_tosingle(),
perform the required operation, then convert the result back to
double-precision using helper_todouble().

The manual says:

# For single-precision arithmetic instructions, all input values
# must be representable in single format; if they are not, the
# result placed into the target FPR, and the setting of
# status bits in the FPSCR and in the Condition Register
# (if Rc=1), are undefined.

The tosingle/todouble conversions are exact and bit-preserving.
They are used by load-single and store-single that convert a
single-precision in-memory value to the double-precision register
value.  Therefore the input given to float32_add using this
conversion would be exactly the same as if we had given the
value unmollested from a memory input.

I don't know what real ppc hw does -- whether it takes all of
the double-precision input bits and rounds to 23-bits, like the
old 80387 hardware does, or truncates the input as I propose.
But for architectural results we don't have to care, because
of the UNDEFINED escape clause.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1841592

Title:
  ppc: softfloat float implementation issues

Status in QEMU:
  New

Bug description:
  Per bug #1841491, Richard Henderson (rth) said:
  > The float test failure is part of a larger problem for target/powerpc
  > in which all float routines are implemented incorrectly. They are all
  > implemented as double operations with rounding to float as a second
  > step. Which not only produces incorrect exceptions, as in this case,
  > but incorrect numerical results from the double rounding.
  >
  > This should probably be split to a separate bug...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1841592/+subscriptions



Re: [Qemu-devel] [PATCH 09/10] block/qcow2: implement blockdev-amend

2019-09-12 Thread Maxim Levitsky
On Fri, 2019-09-06 at 15:12 +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 30, 2019 at 11:56:07PM +0300, Maxim Levitsky wrote:
> > Currently only for changing crypto parameters
> > 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/qcow2.c| 71 
> >  qapi/block-core.json |  4 +--
> >  2 files changed, 73 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index 8dff4c6b5f..327d2afd9f 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -3082,6 +3082,18 @@ qcow2_co_create(BlockdevCreateOptions 
> > *create_options, Error **errp)
> >  assert(create_options->driver == BLOCKDEV_DRIVER_QCOW2);
> >  qcow2_opts = _options->u.qcow2;
> >  
> > +if (!qcow2_opts->has_size) {
> > +error_setg(errp, "Size is manadatory for image creation");
> > +return -EINVAL;
> > +
> > +}
> > +
> > +if (!qcow2_opts->has_file) {
> > +error_setg(errp, "'file' is manadatory for image creation");
> > +return -EINVAL;
> > +
> > +}
> > +
> >  bs = bdrv_open_blockdev_ref(qcow2_opts->file, errp);
> >  if (bs == NULL) {
> >  return -EIO;
> > @@ -5112,6 +5124,64 @@ static int qcow2_amend_options(BlockDriverState *bs, 
> > QemuOpts *opts,
> >  return 0;
> >  }
> >  
> > +
> > +static int coroutine_fn qcow2_co_amend(BlockDriverState *bs,
> > +   BlockdevCreateOptions *opts,
> > +   bool force,
> > +   Error **errp)
> > +{
> > +BlockdevCreateOptionsQcow2 *qopts = >u.qcow2;
> > +BDRVQcow2State *s = bs->opaque;
> > +int ret;
> > +
> > +/*
> > + * This is ugly as hell, in later versions of this patch
> > + * something has to be done about this
> > + */
> > +if (qopts->has_file || qopts->has_size || qopts->has_data_file ||
> > +qopts->has_data_file_raw || qopts->has_version ||
> > +qopts->has_backing_file || qopts->has_backing_fmt ||
> > +qopts->has_cluster_size || qopts->has_preallocation ||
> > +qopts->has_lazy_refcounts || qopts->has_refcount_bits) {
> > +
> > +error_setg(errp,
> > +"Only LUKS encryption options can be amended for qcow2 
> > with blockdev-amend");
> > +return -EOPNOTSUPP;
> > +
> > +}
> > +
> > +if (qopts->has_encrypt) {
> > +if (!s->crypto) {
> > +error_setg(errp, "QCOW2 image is not encrypted, can't amend");
> > +return -EOPNOTSUPP;
> > +}
> > +
> > +if (qopts->encrypt->format != Q_CRYPTO_BLOCK_FORMAT_LUKS) {
> > +error_setg(errp,
> > +   "Amend can't be used to change the qcow2 encryption 
> > format");
> > +return -EOPNOTSUPP;
> > +}
> > +
> > +if (s->crypt_method_header != QCOW_CRYPT_LUKS) {
> > +error_setg(errp,
> > +   "Only LUKS encryption options can be amended for 
> > qcow2 with blockdev-amend");
> > +return -EOPNOTSUPP;
> > +}
> > +
> > +ret = qcrypto_block_amend_options(s->crypto,
> > +  qcow2_crypto_hdr_read_func,
> > +  qcow2_crypto_hdr_write_func,
> > +  bs,
> > +  qopts->encrypt,
> > +  force,
> > +  errp);
> > +if (ret) {
> > +return ret;
> > +}
> > +}
> > +return 0;
> > +}
> > +
> >  /*
> >   * If offset or size are negative, respectively, they will not be included 
> > in
> >   * the BLOCK_IMAGE_CORRUPTED event emitted.
> > @@ -5304,6 +5374,7 @@ BlockDriver bdrv_qcow2 = {
> >  .mutable_opts= mutable_opts,
> >  .bdrv_co_check   = qcow2_co_check,
> >  .bdrv_amend_options  = qcow2_amend_options,
> > +.bdrv_co_amend   = qcow2_co_amend,
> >  
> >  .bdrv_detach_aio_context  = qcow2_detach_aio_context,
> >  .bdrv_attach_aio_context  = qcow2_attach_aio_context,
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 02375fb59a..ba41744427 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -4312,10 +4312,10 @@
> >  # Since: 2.12
> >  ##
> >  { 'struct': 'BlockdevCreateOptionsQcow2',
> > -  'data': { 'file': 'BlockdevRef',
> > +  'data': { '*file':'BlockdevRef',
> >  '*data-file':   'BlockdevRef',
> >  '*data-file-raw':   'bool',
> > -'size': 'size',
> > +'*size':'size',
> >  '*version': 'BlockdevQcow2Version',
> >  '*backing-file':'str',
> >  '*backing-fmt': 'BlockdevDriver',
> 
> Docs comment to say they  are mandatory for creation.
Done
> 
> 
> Regards,
> Daniel

Best regards,

Re: [Qemu-devel] [PATCH 08/10] block/crypto: implement blockdev-amend

2019-09-12 Thread Maxim Levitsky
On Fri, 2019-09-06 at 15:10 +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 30, 2019 at 11:56:06PM +0300, Maxim Levitsky wrote:
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  block/crypto.c   | 86 +---
> >  qapi/block-core.json |  4 +--
> >  2 files changed, 68 insertions(+), 22 deletions(-)
> 
> Reviewed-by: Daniel P. Berrangé 
> 
> 
> >  static int
> >  block_crypto_amend_options(BlockDriverState *bs,
> > QemuOpts *opts,
> > @@ -678,44 +722,45 @@ block_crypto_amend_options(BlockDriverState *bs,
> >  BlockCrypto *crypto = bs->opaque;
> >  QDict *cryptoopts = NULL;
> >  QCryptoBlockCreateOptions *amend_options = NULL;
> > -int ret;
> > +int ret= -EINVAL;
> 
> nitpick - space before '='
Done. This is one of the few errors that checkpatch.pl does catch,
but apparently I forgot to run it on this patch.
> 
> >  
> >  assert(crypto);
> >  assert(crypto->block);
> >  
> > -crypto->updating_keys = true;
> > -
> > -ret = bdrv_child_refresh_perms(bs, bs->file, errp);
> > -if (ret) {
> > -goto cleanup;
> > -}
> > -
> >  cryptoopts = qemu_opts_to_qdict_filtered(opts, NULL,
> >   
> > _crypto_create_opts_luks,
> >   true);
> >  
> >  qdict_put_str(cryptoopts, "format", "luks");
> >  amend_options = block_crypto_create_opts_init(cryptoopts, errp);
> > +
> >  if (!amend_options) {
> > -ret = -EINVAL;
> > -goto cleanup;
> > +goto out;
> >  }
> >  
> > -ret = qcrypto_block_amend_options(crypto->block,
> > -  block_crypto_read_func,
> > -  block_crypto_write_func,
> > -  bs,
> > -  amend_options,
> > -  force,
> > -  errp);
> > -cleanup:
> > -crypto->updating_keys = false;
> > -bdrv_child_refresh_perms(bs, bs->file, errp);
> > +ret = block_crypto_amend_options_generic(bs, amend_options, force, 
> > errp);
> > +out:
> 
> No need to rename the "cleanup" label to "out"
All right.
> 
> >  qapi_free_QCryptoBlockCreateOptions(amend_options);
> >  qobject_unref(cryptoopts);
> >  return ret;
> >  }
> >  
> > +static int
> > +coroutine_fn block_crypto_co_amend(BlockDriverState *bs,
> > +   BlockdevCreateOptions *opts,
> > +   bool force,
> > +   Error **errp)
> > +{
> > +QCryptoBlockCreateOptions amend_opts;
> > +
> > +amend_opts = (QCryptoBlockCreateOptions) {
> > +.format = Q_CRYPTO_BLOCK_FORMAT_LUKS,
> > +.u.luks = *qapi_BlockdevCreateOptionsLUKS_base(>u.luks),
> > +};
> > +
> > +return block_crypto_amend_options_generic(bs, _opts, force, 
> > errp);
> > +}
> > +
> >  
> >  static void
> >  block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
> > @@ -774,6 +819,7 @@ static BlockDriver bdrv_crypto_luks = {
> >  .bdrv_get_info  = block_crypto_get_info_luks,
> >  .bdrv_get_specific_info = block_crypto_get_specific_info_luks,
> >  .bdrv_amend_options = block_crypto_amend_options,
> > +.bdrv_co_amend  = block_crypto_co_amend,
> >  
> >  .strong_runtime_opts = block_crypto_strong_runtime_opts,
> >  };
> > diff --git a/qapi/block-core.json b/qapi/block-core.json
> > index 7900914506..02375fb59a 100644
> > --- a/qapi/block-core.json
> > +++ b/qapi/block-core.json
> > @@ -4220,8 +4220,8 @@
> >  ##
> >  { 'struct': 'BlockdevCreateOptionsLUKS',
> >'base': 'QCryptoBlockCreateOptionsLUKS',
> > -  'data': { 'file': 'BlockdevRef',
> > -'size': 'size',
> > +  'data': { '*file': 'BlockdevRef',
> > +'*size': 'size',
> 
> Docs comment to explain they are mandatory for create 
Done
> 
> >  '*preallocation':   'PreallocMode' } }
> >  
> >  ##
> > -- 
> > 2.17.2
> > 
> 
> Regards,
> Daniel

Best regards,
Maxim Levitsky




Re: [Qemu-devel] [PATCH 06/10] qcow2: implement crypto amend options

2019-09-12 Thread Maxim Levitsky
On Fri, 2019-09-06 at 15:06 +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 30, 2019 at 11:56:04PM +0300, Maxim Levitsky wrote:
> > ---
> >  block/qcow2.c | 79 ---
> >  1 file changed, 63 insertions(+), 16 deletions(-)
> > 
> > @@ -4888,9 +4899,22 @@ static int qcow2_amend_options(BlockDriverState *bs, 
> > QemuOpts *opts,
> >  return -ENOTSUP;
> >  }
> >  } else if (g_str_has_prefix(desc->name, "encrypt.")) {
> > -error_setg(errp,
> > -   "Changing the encryption parameters is not 
> > supported");
> > -return -ENOTSUP;
> > +
> > +if (!s->crypto) {
> > +error_setg(errp,
> > +   "Can't amend encryption options - encryption 
> > not supported");
> > +return -ENOTSUP;
> > +
> > +}
> 
> Perhaps  ' - encryption not present', and -EINVAL
Agree. Fixed.

> 
> > +
> > +if (s->crypt_method_header != QCOW_CRYPT_LUKS) {
> > +error_setg(errp,
> > +   "Only LUKS encryption options can be amended");
> > +return -ENOTSUP;
> > +}
> > +
> > +encryption_update = true;
> > +
> >  } else if (!strcmp(desc->name, BLOCK_OPT_CLUSTER_SIZE)) {
> >  cluster_size = qemu_opt_get_size(opts, BLOCK_OPT_CLUSTER_SIZE,
> >   cluster_size);
> > @@ -4927,7 +4951,7 @@ static int qcow2_amend_options(BlockDriverState *bs, 
> > QemuOpts *opts,
> >   "images");
> >  return -EINVAL;
> >  }
> > -} else {
> > +} else  {
> 
> Accidental change
Fixed.
> 
> >  /* if this point is reached, this probably means a new option 
> > was
> >   * added without having it covered here */
> >  abort();
> > @@ -4940,7 +4964,8 @@ static int qcow2_amend_options(BlockDriverState *bs, 
> > QemuOpts *opts,
> >  .original_status_cb = status_cb,
> >  .original_cb_opaque = cb_opaque,
> >  .total_operations = (new_version < old_version)
> > -  + (s->refcount_bits != refcount_bits)
> > +  + (s->refcount_bits != refcount_bits) +
> > +  (encryption_update == true)
> >  };
> >  
> >  /* Upgrade first (some features may require compat=1.1) */
> > @@ -4954,6 +4979,28 @@ static int qcow2_amend_options(BlockDriverState *bs, 
> > QemuOpts *opts,
> >  }
> >  }
> >  
> > +if (encryption_update) {
> > +
> 
> Redundant blank line
Fixed.
> 
> > +QCryptoBlockCreateOptions *cryptoopts;
> > +
> > +cryptoopts = qcow2_extract_crypto_create_opts(opts, "luks", errp);
> > +if (!cryptoopts)
> > +return -EINVAL;
> > +
> > +helper_cb_info.current_operation = QCOW2_UPDATING_ENCRYPTION;
> > +
> > +ret = qcrypto_block_amend_options(s->crypto,
> > +  qcow2_crypto_hdr_read_func,
> > +  qcow2_crypto_hdr_write_func,
> > +  bs,
> > +  cryptoopts,
> > +  force,
> > +  errp);
> > +if (ret) {
> 
> Check  ret < 0
Fixed.
> 
> > +return ret;
> > +}
> > +}
> > +
> 
> Regards,
> Daniel

Best regards,
Maxim Levitsky





Re: [Qemu-devel] [PATCH 3/3] cputlb: Introduce TLB_BSWAP

2019-09-12 Thread Richard Henderson
On 9/11/19 10:56 AM, Tony Nguyen wrote:
>> @@ -1372,26 +1364,27 @@ load_helper(CPUArchState *env, target_ulong addr, 
>> TCGMemOpIdx oi,
>>  /* On watchpoint hit, this will longjmp out.  */
>>  cpu_check_watchpoint(env_cpu(env), addr, size,
>>   iotlbentry->attrs, BP_MEM_READ, retaddr);
>> -
>> -/* The backing page may or may not require I/O.  */
>> -tlb_addr &= ~TLB_WATCHPOINT;
>> -if ((tlb_addr & ~TARGET_PAGE_MASK) == 0) {
>> -goto do_aligned_access;
>> -}
>>  }
>>  
>>  /* Handle I/O access.  */
>> -return io_readx(env, iotlbentry, mmu_idx, addr,
>> -retaddr, access_type, op);
>> -}
>> +if (likely(tlb_addr & TLB_MMIO)) {
>> +return io_readx(env, iotlbentry, mmu_idx, addr,
>> +retaddr, access_type,
>> +op ^ (tlb_addr & TLB_BSWAP ? MO_BSWAP : 0));
>> +}
> 
> Previously, the end of if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) branch
> called and returned the result of io_readx.

Correct.  However, rather thank clearing TLB_WATCHPOINT and TLB_BSWAP, it
seemed easier to test for those bits that *do* require that we call io_readx.

As we've seen from the bug leading to this patch set, it's invalid to call
io_readx on anything that doesn't have TLB_MMIO set -- we'll either crash due
to the missing read accessor or reach the point at which we issue a bus error
for an i/o operation without a device.

BTW, there's a bug in this same location for store_helper in that I need to
also test for TLB_NOTDIRTY, which also goes through io_writex for the moment.
That bug is trivially shown during the make check migration tests.  Due to the
late hour I failed to run those before posting this patch set.  Will be fixed
in v2.


r~



[Qemu-devel] [Bug 1841592] Re: ppc: softfloat float implementation issues

2019-09-12 Thread Alex Bennée
I'm confused by this testcase as it's not a fused multiply-add but as
you say two combined operations.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1841592

Title:
  ppc: softfloat float implementation issues

Status in QEMU:
  New

Bug description:
  Per bug #1841491, Richard Henderson (rth) said:
  > The float test failure is part of a larger problem for target/powerpc
  > in which all float routines are implemented incorrectly. They are all
  > implemented as double operations with rounding to float as a second
  > step. Which not only produces incorrect exceptions, as in this case,
  > but incorrect numerical results from the double rounding.
  >
  > This should probably be split to a separate bug...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1841592/+subscriptions



[Qemu-devel] [PATCH] build: Don't ignore qapi-visit-core.c

2019-09-12 Thread Eric Blake
This file is version-controlled, and not generated from a .json file.

Fixes: bf582c3461b
Reported-by: Thomas Huth 
Signed-off-by: Eric Blake 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index e9bbc006d39e..7de868d1eab4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -41,6 +41,7 @@
 /qapi/qapi-types-*.[ch]
 /qapi/qapi-types.[ch]
 /qapi/qapi-visit-*.[ch]
+!/qapi/qapi-visit-core.c
 /qapi/qapi-visit.[ch]
 /qapi/qapi-doc.texi
 /qemu-doc.html
-- 
2.21.0




[Qemu-devel] [Bug 1843651] Re: m68k fpu bug

2019-09-12 Thread Alex Bennée
** Tags added: fpu m68k

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843651

Title:
  m68k fpu bug

Status in QEMU:
  New

Bug description:
  On gcc123 cfarm machine,
  I was testing m68k executables generated by Free Pascal Compiler.

  muller@gcc123:~/pas/check$ cat inf.pp
  function get_double(x : double):double;
begin
  get_double:=x;
end;

  
  var
y : double;
py : pbyte;
i : byte;
  begin
y:=1.0/0.0;
py:=@y;
  {$ifdef ENDIAN_LITTLE}
write('little endian y=');
for i:=7 downto 0 do
  {$else not ENDIAN_LITTLE}
write('big endian y=');
for i:=0 to 7 do
  {$endif}
  write(hexstr(py[i],2));
writeln;
y:=get_double(y)+1;
  {$ifdef ENDIAN_LITTLE}
write('little endian y=');
for i:=7 downto 0 do
  {$else not ENDIAN_LITTLE}
write('big endian y=');
for i:=0 to 7 do
  {$endif}
  write(hexstr(py[i],2));
writeln;
  end.
  muller@gcc123:~/pas/check$ ppc68k inf
  Free Pascal Compiler version 3.3.1-r20:42973M [2019/09/11] for m68k
  Copyright (c) 1993-2019 by Florian Klaempfl and others
  Target OS: Linux for m68k
  Compiling inf.pp
  Assembling program
  Linking inf
  33 lines compiled, 0.1 sec
  muller@gcc123:~/pas/check$ ./inf
  big endian y=7FF0
  big endian y=7FFF
  muller@gcc123:~/pas/check$ qemu-m68k ./inf
  big endian y=7FF0
  big endian y=7FFF
  muller@gcc123:~/pas/check$ ~/sys-root/bin/qemu-m68k ./inf
  qemu-m68kqemu-m68k-fixed
  muller@gcc123:~/pas/check$ ~/sys-root/bin/qemu-m68k-fixed ./inf
  big endian y=7FF0
  big endian y=7FF0

  ~/sys-root/bin/qemu-m68k  is 4.1.0 release,
  ~/sys-root/bin/qemu-m68k-fixed is the same source with a unique change:

  gnu/qemu/qemu-4.1.0/fpu/softfloat-specialize.h:214:#if defined(TARGET_M68K)
  gnu/qemu/qemu-4.1.0/fpu/softfloat-specialize.h-215-#define 
floatx80_infinity_low  LIT64(0x)
  gnu/qemu/qemu-4.1.0/fpu/softfloat-specialize.h-216-#else
  gnu/qemu/qemu-4.1.0/fpu/softfloat-specialize.h-217-#define 
floatx80_infinity_low  LIT64(0x8000)
  gnu/qemu/qemu-4.1.0/fpu/softfloat-specialize.h-218-#endif

  the M68K branch value is set to the same value as the other branch.

  The problem of the M68K specific floatx86_infinity_low values
  is that is enters in conflict with
  muller@gcc123:~/pas/check$ grep -nA6 invalid_enc  
/home/muller/gnu/qemu/qemu-4.1.0/include/fpu/softfloat.h
  752:static inline bool floatx80_invalid_encoding(floatx80 a)
  753-{
  754-return (a.low & (1ULL << 63)) == 0 && (a.high & 0x7FFF) != 0;
  755-}

  And thus the m68k variant of floatx80 representing +Infinity is
  considered as an invalid encoding, and thus converted into a NaN 
7FFF

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843651/+subscriptions



[Qemu-devel] [Bug 1843795] Re: 'mtfsf' instruction can clear FI incorrectly

2019-09-12 Thread Alex Bennée
** Tags added: fpu ppc testcase

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843795

Title:
  'mtfsf' instruction can clear FI incorrectly

Status in QEMU:
  New

Bug description:
  Using mtfsf instruction can clear the FPSCR FI bit incorrectly.  This code 
snippet exhibits the issue:
  --
fpscr.ll = 0x1fff;
__builtin_mtfsf (0b, fpscr.d);
fpscr.d = __builtin_mffs ();
  --

  On POWER9 hardware:
  mffs: FPSCR = 0x77ff

  On qemu (git master; "-cpu POWER9"):
  --
  $ ./mtfsf
  mffs: FPSCR = 0x7ffd
  --

  Two differences:
  bit 52: "reserved", so maybe a "don't care" case
  bit 46: "FI"

  $ git log -1 master
  commit 89ea03a7dc83ca36b670ba7f787802791fcb04b1
  Merge: 019217c 2531164
  Author: Peter Maydell 
  Date:   Mon Sep 9 09:48:34 2019 +0100

  I tracked the clear is coming from do_float_check_status, likely the
  one in gen_mtfsf, but then I get lost figuring out what _should_ be
  happening. :-/

  Test attached.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843795/+subscriptions



Re: [Qemu-devel] hw/vfio/pci: duplicated invocation of vfio_msix_vector_release() in vfio_msix_disable()

2019-09-12 Thread Alex Williamson
On Tue, 10 Sep 2019 19:01:22 +0800
Guoheyi  wrote:

> Hi folks,
> 
> Recently we found there are 2 invocations of vfio_msix_vector_release() 
> in vfio_msix_disable(). The 1st one is in msix_unset_vector_notifiers(), 
> for we set device's msix_vector_release_notifier to 
> vfio_msix_vector_release() in vfio_msix_enable(), while the 2nd is the 
> explicit one in vfio_msix_disable(). Both invocations switch VFIO to use 
> the non-bypass eventfd.
> 
> Is there any special reason for doing this? Or can we remove one of them?

The comment certainly suggests there is, and git blame trivially finds:

commit 3e40ba0faf0822fa78336fe6cd9d677ea9b14f1b
Author: Alex Williamson 
Date:   Fri Dec 6 11:16:40 2013 -0700

vfio-pci: Release all MSI-X vectors when disabled

We were relying on msix_unset_vector_notifiers() to release all the
vectors when we disable MSI-X, but this only happens when MSI-X is
still enabled on the device.  Perform further cleanup by releasing
any remaining vectors listed as in-use after this call.  This caused
a leak of IRQ routes on hotplug depending on how the guest OS prepared
the device for removal.

Signed-off-by: Alex Williamson 
Cc: qemu-sta...@nongnu.org

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index f367537737d2..9aecaa82bc34 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -905,8 +905,20 @@ static void vfio_disable_msi_common(VFIODevice *vdev)
 
 static void vfio_disable_msix(VFIODevice *vdev)
 {
+int i;
+
 msix_unset_vector_notifiers(>pdev);
 
+/*
+ * MSI-X will only release vectors if MSI-X is still enabled on the
+ * device, check through the rest and release it ourselves if necessary.
+ */
+for (i = 0; i < vdev->nr_vectors; i++) {
+if (vdev->msi_vectors[i].use) {
+vfio_msix_vector_release(>pdev, i);
+}
+}
+
 if (vdev->nr_vectors) {
 vfio_disable_irqindex(vdev, VFIO_PCI_MSIX_IRQ_INDEX);
 }



Re: [Qemu-devel] [PATCH v8 01/13] vfio: KABI for migration interface

2019-09-12 Thread Alex Williamson
On Tue, 3 Sep 2019 06:57:27 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Saturday, August 31, 2019 12:33 AM
> > 
> > On Fri, 30 Aug 2019 08:06:32 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Tian, Kevin
> > > > Sent: Friday, August 30, 2019 3:26 PM
> > > >  
> > > [...]  
> > > > > How does QEMU handle the fact that IOVAs are potentially dynamic  
> > while  
> > > > > performing the live portion of a migration?  For example, each time a
> > > > > guest driver calls dma_map_page() or dma_unmap_page(), a
> > > > > MemoryRegionSection pops in or out of the AddressSpace for the device
> > > > > (I'm assuming a vIOMMU where the device AddressSpace is not
> > > > > system_memory).  I don't see any QEMU code that intercepts that  
> > change  
> > > > > in the AddressSpace such that the IOVA dirty pfns could be recorded 
> > > > > and
> > > > > translated to GFNs.  The vendor driver can't track these beyond 
> > > > > getting
> > > > > an unmap notification since it only knows the IOVA pfns, which can be
> > > > > re-used with different GFN backing.  Once the DMA mapping is torn  
> > down,  
> > > > > it seems those dirty pfns are lost in the ether.  If this works in 
> > > > > QEMU,
> > > > > please help me find the code that handles it.  
> > > >
> > > > I'm curious about this part too. Interestingly, I didn't find any 
> > > > log_sync
> > > > callback registered by emulated devices in Qemu. Looks dirty pages
> > > > by emulated DMAs are recorded in some implicit way. But KVM always
> > > > reports dirty page in GFN instead of IOVA, regardless of the presence of
> > > > vIOMMU. If Qemu also tracks dirty pages in GFN for emulated DMAs
> > > >  (translation can be done when DMA happens), then we don't need
> > > > worry about transient mapping from IOVA to GFN. Along this way we
> > > > also want GFN-based dirty bitmap being reported through VFIO,
> > > > similar to what KVM does. For vendor drivers, it needs to translate
> > > > from IOVA to HVA to GFN when tracking DMA activities on VFIO
> > > > devices. IOVA->HVA is provided by VFIO. for HVA->GFN, it can be
> > > > provided by KVM but I'm not sure whether it's exposed now.
> > > >  
> > >
> > > HVA->GFN can be done through hva_to_gfn_memslot in kvm_host.h.  
> > 
> > I thought it was bad enough that we have vendor drivers that depend on
> > KVM, but designing a vfio interface that only supports a KVM interface
> > is more undesirable.  I also note without comment that gfn_to_memslot()
> > is a GPL symbol.  Thanks,  
> 
> yes it is bad, but sometimes inevitable. If you recall our discussions
> back to 3yrs (when discussing the 1st mdev framework), there were similar
> hypervisor dependencies in GVT-g, e.g. querying gpa->hpa when
> creating some shadow structures. gpa->hpa is definitely hypervisor
> specific knowledge, which is easy in KVM (gpa->hva->hpa), but needs
> hypercall in Xen. but VFIO already makes assumption based on KVM-
> only flavor when implementing vfio_{un}pin_page_external.

Where's the KVM assumption there?  The MAP_DMA ioctl takes an IOVA and
HVA.  When an mdev vendor driver calls vfio_pin_pages(), we GUP the HVA
to get an HPA and provide an array of HPA pfns back to the caller.  The
other vGPU mdev vendor manages to make use of this without KVM... the
KVM interface used by GVT-g is GPL-only.

> So GVT-g
> has to maintain an internal abstraction layer to support both Xen and
> KVM. Maybe someday we will re-consider introducing some hypervisor
> abstraction layer in VFIO, if this issue starts to hurt other devices and
> Xen guys are willing to support VFIO.

Once upon a time, we had a KVM specific device assignment interface,
ie. legacy KVM devie assignment.  We developed VFIO specifically to get
KVM out of the business of being a (bad) device driver.  We do have
some awareness and interaction between VFIO and KVM in the vfio-kvm
pseudo device, but we still try to keep those interfaces generic.  In
some cases we're not very successful at that, see vfio_group_set_kvm(),
but that's largely just a mechanism to associate a cookie with a group
to be consumed by the mdev vendor driver such that it can work with kvm
external to vfio.  I don't intend to add further hypervisor awareness
to vfio.

> Back to this IOVA issue, I discussed with Yan and we found another 
> hypervisor-agnostic alternative, by learning from vhost. vhost is very
> similar to VFIO - DMA also happens in the kernel, while it already 
> supports vIOMMU.
> 
> Generally speaking, there are three paths of dirty page collection
> in Qemu so far (as previously noted, Qemu always tracks the dirty
> bitmap in GFN):

GFNs or simply PFNs within an AddressSpace?
 
> 1) Qemu-tracked memory writes (e.g. emulated DMAs). Dirty bitmaps 
> are updated directly when the guest memory is being updated. For 
> example, PCI writes are completed through pci_dma_write, which 
> goes through vIOMMU to translate IOVA into GPA and then 

[Qemu-devel] [PATCH] target/arm: Fix sign-extension for SMLAL*

2019-09-12 Thread Richard Henderson
The 32-bit product should be sign-extended, not zero-extended.

Fixes: ea96b374641b
Reported-by: Laurent Desnogues 
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 34bb280e3d..fd2f0e3048 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8045,7 +8045,9 @@ static bool op_smlaxxx(DisasContext *s, arg_ *a,
 case 2:
 tl = load_reg(s, a->ra);
 th = load_reg(s, a->rd);
-t1 = tcg_const_i32(0);
+/* Sign-extend the 32-bit product to 64 bits.  */
+t1 = tcg_temp_new_i32();
+tcg_gen_sari_i32(t1, t0, 31);
 tcg_gen_add2_i32(tl, th, tl, th, t0, t1);
 tcg_temp_free_i32(t0);
 tcg_temp_free_i32(t1);
-- 
2.17.1




[Qemu-devel] [RFC PATCH] virtio-blk: schedule virtio_notify_config to run on main context

2019-09-12 Thread Sergio Lopez
Another AioContext-related issue, and this is a tricky one.

Executing a QMP block_resize request for a virtio-blk device running
on an iothread may cause a deadlock involving the following mutexes:

 - main thead
  * Has acquired: qemu_mutex_global.
  * Is trying the acquire: iothread AioContext lock via
AIO_WAIT_WHILE (after aio_poll).

 - iothread
  * Has acquired: AioContext lock.
  * Is trying to acquire: qemu_mutex_global (via
virtio_notify_config->prepare_mmio_access).

With this change, virtio_blk_resize checks if it's being called from a
coroutine context running on a non-main thread, and if that's the
case, creates a new coroutine and schedules it to be run on the main
thread.

This works, but means the actual operation is done
asynchronously, perhaps opening a window in which a "device_del"
operation may fit and remove the VirtIODevice before
virtio_notify_config() is executed.

I *think* it shouldn't be possible, as BHs will be processed before
any new QMP/monitor command, but I'm open to a different approach.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1744955
Signed-off-by: Sergio Lopez 
---
 hw/block/virtio-blk.c | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 18851601cb..c763d071f6 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -16,6 +16,7 @@
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include "qemu/error-report.h"
+#include "qemu/main-loop.h"
 #include "trace.h"
 #include "hw/block/block.h"
 #include "hw/qdev-properties.h"
@@ -1086,11 +1087,33 @@ static int virtio_blk_load_device(VirtIODevice *vdev, 
QEMUFile *f,
 return 0;
 }
 
+static void coroutine_fn virtio_resize_co_entry(void *opaque)
+{
+VirtIODevice *vdev = opaque;
+
+assert(qemu_get_current_aio_context() == qemu_get_aio_context());
+virtio_notify_config(vdev);
+aio_wait_kick();
+}
+
 static void virtio_blk_resize(void *opaque)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+Coroutine *co;
 
-virtio_notify_config(vdev);
+if (qemu_in_coroutine() &&
+qemu_get_current_aio_context() != qemu_get_aio_context()) {
+/*
+ * virtio_notify_config() needs to acquire the global mutex,
+ * so calling it from a coroutine running on a non-main context
+ * may cause a deadlock. Instead, create a new coroutine and
+ * schedule it to be run on the main thread.
+ */
+co = qemu_coroutine_create(virtio_resize_co_entry, vdev);
+aio_co_schedule(qemu_get_aio_context(), co);
+} else {
+virtio_notify_config(vdev);
+}
 }
 
 static const BlockDevOps virtio_block_ops = {
-- 
2.21.0




Re: [Qemu-devel] [PATCH v2] memory: inline and optimize devend_memop

2019-09-12 Thread Richard Henderson
On 9/12/19 10:18 AM, Paolo Bonzini wrote:
> -MemOp devend_memop(enum device_endian end);
> +static inline MemOp devend_memop(enum device_endian end)

Ah ha, yes of course, the static inline.
Reviewed-by: Richard Henderson 


r~





Re: [Qemu-devel] [PATCH] memory: inline and optimize devend_memop

2019-09-12 Thread Richard Henderson
On 9/12/19 10:18 AM, Paolo Bonzini wrote:
> devend_memop can rely on the fact that the result is always either
> 0 or MO_BSWAP, corresponding respectively to host endianness and
> the opposite.  Native (target) endianness in turn can be either
> the host endianness, in which case MO_BSWAP is only returned for
> host-opposite endianness, or the opposite, in which case 0 is only
> returned for host endianness.
> 
> With this in mind, devend_memop can be compiled as a setcond+shift
> for every target.  Do this and, while at it, move it to
> include/exec/memory.h since !NEED_CPU_H files do not (and should not)
> need it.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  include/exec/memory.h | 19 ++-
>  memory.c  | 18 --
>  2 files changed, 18 insertions(+), 19 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH v2 1/5] rcu: Add automatically released rcu_read_lock variant

2019-09-12 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 11/09/19 21:06, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> > RCU_READ_LOCK_AUTO takes the rcu_read_lock and then uses glib's
> > g_auto infrastructure (and thus whatever the compiler's hooks are) to
> > release it on all exits of the block.
> > 
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  include/qemu/rcu.h | 18 ++
> >  1 file changed, 18 insertions(+)
> > 
> > diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
> > index 22876d1428..8768a7b60a 100644
> > --- a/include/qemu/rcu.h
> > +++ b/include/qemu/rcu.h
> > @@ -154,6 +154,24 @@ extern void call_rcu1(struct rcu_head *head, RCUCBFunc 
> > *func);
> >}),\
> >(RCUCBFunc *)g_free);
> >  
> > +typedef void RCUReadAuto;
> > +static inline RCUReadAuto *rcu_read_auto_lock(void)
> > +{
> > +rcu_read_lock();
> > +/* Anything non-NULL causes the cleanup function to be called */
> > +return (void *)0x1;
> 
> Doesn't this cause a warning (should be "(void *)(uintptr_t)1" instead)?

Apparently not, but I'll change it anyway.

> > +}
> > +
> > +static inline void rcu_read_auto_unlock(RCUReadAuto *r)
> > +{
> > +rcu_read_unlock();
> > +}
> > +
> > +G_DEFINE_AUTOPTR_CLEANUP_FUNC(RCUReadAuto, rcu_read_auto_unlock)
> > +
> > +#define RCU_READ_LOCK_AUTO \
> > +g_autoptr(RCUReadAuto) _rcu_read_auto = rcu_read_auto_lock()
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > 
> 
> Is it possible to make this a scope, like
> 
>   WITH_RCU_READ_LOCK() {
>   }
> 
> ?  Perhaps something like
> 
> #define WITH_RCU_READ_LOCK()  \
>   WITH_RCU_READ_LOCK_(_rcu_read_auto##__COUNTER__)
> 
> #define WITH_RCU_READ_LOCK_(var) \
>   for (g_autoptr(RCUReadAuto) var = rcu_read_auto_lock();
>(var); rcu_read_auto_unlock(), (var) = NULL)
> 
> where the dummy variable doubles as an execute-once guard and the gauto
> cleanup is still used in case of a "break".  I'm not sure if the above
> raises a warning because of the variable declaration inside the for
> loop, though.

Yes, that works - I'm not seeing any warnings.

Do you think it's best to use the block version for all cases
or to use the non-block version by taste?
The block version is quite nice, but that turns most of the patches
into 'indent everything'.

Dave

> Thanks,
> 
> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and accesses to dirty bitmap

2019-09-12 Thread Richard Henderson
On 9/12/19 2:54 AM, Pavel Dovgalyuk wrote:
> Ping.
> 
> 
> Pavel Dovgalyuk
> 
>> -Original Message-
>> From: dovgaluk [mailto:dovga...@ispras.ru]
>> Sent: Monday, August 26, 2019 3:19 PM
>> To: Paolo Bonzini; pavel.dovga...@ispras.ru
>> Cc: qemu-devel@nongnu.org; Qemu-devel
>> Subject: Re: [Qemu-devel] [PULL 15/36] memory: fix race between TCG and 
>> accesses to dirty
>> bitmap
>>
>> This patch breaks the execution recording.
>> While vCPU tries to lock replay mutex in main while loop,
>> vga causes dirty memory sync and do_run_on_cpu call.
>> This call waits for vCPU to process the work queue.
>>
>> Pavel Dovgalyuk
>>
>> Paolo Bonzini писал 2019-08-20 09:59:
>>> There is a race between TCG and accesses to the dirty log:
>>>
>>>   vCPU thread  reader thread
>>>   ---  ---
>>>   TLB check -> slow path
>>> notdirty_mem_write
>>>   write to RAM
>>>   set dirty flag
>>>clear dirty flag
>>>   TLB check -> fast path
>>>read memory
>>> write to RAM
>>>
>>> Fortunately, in order to fix it, no change is required to the
>>> vCPU thread.  However, the reader thread must delay the read after
>>> the vCPU thread has finished the write.  This can be approximated
>>> conservatively by run_on_cpu, which waits for the end of the current
>>> translation block.

If we are going to delay any read of the dirty flags until vCPU has completed
any active TranslationBlock, then we can simplify the TCG operation so that we
do not (ab)use the mmio path, and can promote this into the tlb slow path as we
have recently done with watchpoints.  C.f.

commit 50b107c5d617eaf93301cef20221312e7a986701
Author: Richard Henderson 
Date:   Sat Aug 24 09:51:09 2019 -0700

cputlb: Handle watchpoints via TLB_WATCHPOINT

That would greatly simplify things from my perspective, for vector and
block-type operations such as we have recently been discussing for S390.  It
would mean that the *only* time we go through TLB_MMIO is for true mmio.

Have I understood your proposal here properly?


r~



[Qemu-devel] [Bug 1841592] Re: ppc: softfloat float implementation issues

2019-09-12 Thread Alex Bennée
** Tags added: ppc64 testcase

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1841592

Title:
  ppc: softfloat float implementation issues

Status in QEMU:
  New

Bug description:
  Per bug #1841491, Richard Henderson (rth) said:
  > The float test failure is part of a larger problem for target/powerpc
  > in which all float routines are implemented incorrectly. They are all
  > implemented as double operations with rounding to float as a second
  > step. Which not only produces incorrect exceptions, as in this case,
  > but incorrect numerical results from the double rounding.
  >
  > This should probably be split to a separate bug...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1841592/+subscriptions



Re: [Qemu-devel] [PATCH v2 17/17] RISC-V: add vector extension premutation instructions

2019-09-12 Thread Richard Henderson
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +/* vfmv.f.s rd, vs2 # rd = vs2[0] (rs1=0)  */
> +void VECTOR_HELPER(vfmv_f_s)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
> +uint32_t rd)
...
> +/* vmv.s.x vd, rs1 # vd[0] = rs1 */
> +void VECTOR_HELPER(vmv_s_x)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
> +uint32_t rd)
...
> +/* vfmv.s.f vd, rs1 #  vd[0] = rs1 (vs2 = 0)  */
> +void VECTOR_HELPER(vfmv_s_f)(CPURISCVState *env, uint32_t rs1,
> +uint32_t rs2, uint32_t rd)

I'll note that, with the vector parameters known to the translator, as I have
advocated, these operations are trivially expanded inline as one or two tcg
operations.


r~



Re: [Qemu-devel] [PULL v2 00/15] Linux user for 4.2 patches

2019-09-12 Thread Peter Maydell
On Wed, 11 Sep 2019 at 07:53, Laurent Vivier  wrote:
>
> The following changes since commit 89ea03a7dc83ca36b670ba7f787802791fcb04b1:
>
>   Merge remote-tracking branch 
> 'remotes/huth-gitlab/tags/m68k-pull-2019-09-07' into staging (2019-09-09 
> 09:48:34 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/vivier/qemu.git tags/linux-user-for-4.2-pull-request
>
> for you to fetch changes up to 5eea942900536b76ad13bef35e1d8f276566ae9e:
>
>   linux-user: Add support for FDRESET, FDRAWCMD, FDTWADDLE, and FDEJECT 
> ioctls (2019-09-11 08:47:06 +0200)
>
> 
> Add several floppy drive ioctl,
> xtensa call0 ABI support, arm MAX_RESERVED_VA for M-profile,
> aarch64 AT_HWCAP2, qOffsets' query for ELF, memfd_create,
> and some code cleanup
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM



Re: [Qemu-devel] [PATCH] target/m68k/fpu_helper.c: rename the access arguments

2019-09-12 Thread Laurent Vivier
Le 12/09/2019 à 16:02, KONRAD Frederic a écrit :
> The "access" arguments clash with a macro under Windows with MinGW:
>   CC  m68k-softmmu/target/m68k/fpu_helper.o
>   target/m68k/fpu_helper.c: In function 'fmovem_predec':
>   target/m68k/fpu_helper.c:405:56: error: macro "access" passed 4 arguments,
>but takes just 2
>size = access(env, addr, >fregs[i], ra);
> 
> So this renames them access_fn.
> 
> Tested with:
>  ./configure --target-list=m68k-softmmu
>  make -j8
> 
> Signed-off-by: KONRAD Frederic 
> ---
>  target/m68k/fpu_helper.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
> index 9b039c8..4137542 100644
> --- a/target/m68k/fpu_helper.c
> +++ b/target/m68k/fpu_helper.c
> @@ -396,14 +396,14 @@ typedef int (*float_access)(CPUM68KState *env, uint32_t 
> addr, FPReg *fp,
>  uintptr_t ra);
>  
>  static uint32_t fmovem_predec(CPUM68KState *env, uint32_t addr, uint32_t 
> mask,
> -   float_access access)
> +  float_access access_fn)
>  {
>  uintptr_t ra = GETPC();
>  int i, size;
>  
>  for (i = 7; i >= 0; i--, mask <<= 1) {
>  if (mask & 0x80) {
> -size = access(env, addr, >fregs[i], ra);
> +size = access_fn(env, addr, >fregs[i], ra);
>  if ((mask & 0xff) != 0x80) {
>  addr -= size;
>  }
> @@ -414,14 +414,14 @@ static uint32_t fmovem_predec(CPUM68KState *env, 
> uint32_t addr, uint32_t mask,
>  }
>  
>  static uint32_t fmovem_postinc(CPUM68KState *env, uint32_t addr, uint32_t 
> mask,
> -   float_access access)
> +   float_access access_fn)
>  {
>  uintptr_t ra = GETPC();
>  int i, size;
>  
>  for (i = 0; i < 8; i++, mask <<= 1) {
>  if (mask & 0x80) {
> -size = access(env, addr, >fregs[i], ra);
> +size = access_fn(env, addr, >fregs[i], ra);
>  addr += size;
>  }
>  }
> 

Reviewed-by: Laurent Vivier 




Re: [Qemu-devel] [PATCH v2 16/17] RISC-V: add vector extension mask instructions

2019-09-12 Thread Richard Henderson
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +for (i = 0; i < vlmax; i++) {
> +if (i < env->vfp.vstart) {
> +continue;
> +} else if (i < vl) {
> +tmp = ~vector_mask_reg(env, rs1, width, lmul, i) &
> +vector_mask_reg(env, rs2, width, lmul, i);
> +vector_mask_result(env, rd, width, lmul, i, tmp);
> +} else {
> +vector_mask_result(env, rd, width, lmul, i, 0);
> +}
> +}

These can be processed in uint64_t units, with a mask based on width:

   8: 0x
  16: 0x
  32: 0x
  64: 0x0101010101010101

  dest = ~in1 & in2 & mask;

with an additional final mask to handle vl not being a multiple of 64.

Again, I urge you not to bother with impossible vstart -- instructions like
this cannot be interrupted, and the spec allows you to not handle values of
vstart that cannot be produced by the implementation.


r~



Re: [Qemu-devel] [PATCH v6 4/8] linux-user/syscall: Introduce target_sockaddr_nl

2019-09-12 Thread Laurent Vivier
Le 11/09/2019 à 21:34, Philippe Mathieu-Daudé a écrit :
> On Mon, Sep 9, 2019 at 4:23 PM Laurent Vivier  wrote:
>> Le 08/09/2019 à 08:15, Philippe Mathieu-Daudé a écrit :
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> Tested-By: Guido Günther 
>>> ---
>>>  linux-user/syscall.c  | 6 --
>>>  linux-user/syscall_defs.h | 7 +++
>>>  2 files changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>>> index 8b41a03901..5128312db5 100644
>>> --- a/linux-user/syscall.c
>>> +++ b/linux-user/syscall.c
>>> @@ -1494,8 +1494,10 @@ static inline abi_long 
>>> host_to_target_sockaddr(abi_ulong target_addr,
>>>  sizeof(target_saddr->sa_family)) {
>>>  target_saddr->sa_family = tswap16(addr->sa_family);
>>>  }
>>> -if (addr->sa_family == AF_NETLINK && len >= sizeof(struct 
>>> sockaddr_nl)) {
>>> -struct sockaddr_nl *target_nl = (struct sockaddr_nl *)target_saddr;
>>> +if (addr->sa_family == AF_NETLINK &&
>>> +len >= sizeof(struct target_sockaddr_nl)) {
>>> +struct target_sockaddr_nl *target_nl =
>>> +   (struct target_sockaddr_nl *)target_saddr;
>>>  target_nl->nl_pid = tswap32(target_nl->nl_pid);
>>>  target_nl->nl_groups = tswap32(target_nl->nl_groups);
>>>  } else if (addr->sa_family == AF_PACKET) {
>>> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
>>> index 0662270300..fcedd0d0fb 100644
>>> --- a/linux-user/syscall_defs.h
>>> +++ b/linux-user/syscall_defs.h
>>> @@ -153,6 +153,13 @@ struct target_sockaddr_un {
>>>  uint8_t sun_path[108];
>>>  };
>>>
>>> +struct target_sockaddr_nl {
>>> +uint16_t nl_family; /* AF_NETLINK */
>>> +uint16_t __pad;
>>> +uint32_t nl_pid;
>>> +uint32_t nl_groups;
>>> +};
>>
>> You should use abi_ushort and abi_uint to keep alignments good (for
>> instance m68k aligns on 16bit whereas others on 32bit).
> 
> Are you sure? The other target_sockaddr don't use that...
> Is this because netlink is a host-only structure? (while other are
> serialized over some interface).

I think other target_sockaddr are wrong too. No one takes really care of
alignment, and most of the time structures are by construction well aligned.

Thanks,
Laurent




[Qemu-devel] [Bug 1843795] [NEW] 'mtfsf' instruction can clear FI incorrectly

2019-09-12 Thread Paul Clarke
Public bug reported:

Using mtfsf instruction can clear the FPSCR FI bit incorrectly.  This code 
snippet exhibits the issue:
--
  fpscr.ll = 0x1fff;
  __builtin_mtfsf (0b, fpscr.d);
  fpscr.d = __builtin_mffs ();
--

On POWER9 hardware:
mffs: FPSCR = 0x77ff

On qemu (git master; "-cpu POWER9"):
--
$ ./mtfsf
mffs: FPSCR = 0x7ffd
--

Two differences:
bit 52: "reserved", so maybe a "don't care" case
bit 46: "FI"

$ git log -1 master
commit 89ea03a7dc83ca36b670ba7f787802791fcb04b1
Merge: 019217c 2531164
Author: Peter Maydell 
Date:   Mon Sep 9 09:48:34 2019 +0100

I tracked the clear is coming from do_float_check_status, likely the one
in gen_mtfsf, but then I get lost figuring out what _should_ be
happening. :-/

Test attached.

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "test case - mtfsf clears FI"
   https://bugs.launchpad.net/bugs/1843795/+attachment/5288363/+files/mtfsf.c

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843795

Title:
  'mtfsf' instruction can clear FI incorrectly

Status in QEMU:
  New

Bug description:
  Using mtfsf instruction can clear the FPSCR FI bit incorrectly.  This code 
snippet exhibits the issue:
  --
fpscr.ll = 0x1fff;
__builtin_mtfsf (0b, fpscr.d);
fpscr.d = __builtin_mffs ();
  --

  On POWER9 hardware:
  mffs: FPSCR = 0x77ff

  On qemu (git master; "-cpu POWER9"):
  --
  $ ./mtfsf
  mffs: FPSCR = 0x7ffd
  --

  Two differences:
  bit 52: "reserved", so maybe a "don't care" case
  bit 46: "FI"

  $ git log -1 master
  commit 89ea03a7dc83ca36b670ba7f787802791fcb04b1
  Merge: 019217c 2531164
  Author: Peter Maydell 
  Date:   Mon Sep 9 09:48:34 2019 +0100

  I tracked the clear is coming from do_float_check_status, likely the
  one in gen_mtfsf, but then I get lost figuring out what _should_ be
  happening. :-/

  Test attached.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843795/+subscriptions



Re: [Qemu-devel] [PATCH v2 15/17] RISC-V: add vector extension reduction instructions

2019-09-12 Thread Richard Henderson
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +/* vredsum.vs vd, vs2, vs1, vm # vd[0] = sum(vs1[0] , vs2[*]) */
> +void VECTOR_HELPER(vredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> +uint32_t rs2, uint32_t rd)
> +{
>  
> +int width, lmul, vl, vlmax;
> +int i, j, src2;
> +uint64_t sum = 0;
> +
> +lmul = vector_get_lmul(env);
> +vector_lmul_check_reg(env, lmul, rs2, false);
> +
> +if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
> +riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> +return;
> +}
> +if (env->vfp.vstart != 0) {
> +riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> +return;
> +}
> +
> +vl = env->vfp.vl;
> +if (vl == 0) {
> +return;
> +}
> +
> +width = vector_get_width(env);
> +vlmax = vector_get_vlmax(env);
> +
> +for (i = 0; i < VLEN / 64; i++) {
> +env->vfp.vreg[rd].u64[i] = 0;
> +}
> +

There is no requirement that I see for vd != vs1 && vd != vs2.  Thus clearing
vd before the operation may clobber the inputs.

> +if (i < vl) {
> +switch (width) {
> +case 8:
> +if (vector_elem_mask(env, vm, width, lmul, i)) {
> +sum += env->vfp.vreg[src2].u8[j];
> +}
> +if (i == 0) {
> +sum += env->vfp.vreg[rs1].u8[0];
> +}

Hoist the rs1 case outside the loop.


r~



Re: [Qemu-devel] [PATCH v2 09/17] RISC-V: add vector extension integer instructions part2, bit/shift

2019-09-12 Thread Richard Henderson
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +void VECTOR_HELPER(vand_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> +uint32_t rs2, uint32_t rd)
> +{
> +int i, j, vl;
> +uint32_t lmul, width, src1, src2, dest, vlmax;
> +
> +vl = env->vfp.vl;
> +lmul  = vector_get_lmul(env);
> +width   = vector_get_width(env);
> +vlmax = vector_get_vlmax(env);
> +
> +if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
> +riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> +return;
> +}
> +vector_lmul_check_reg(env, lmul, rs1, false);
> +vector_lmul_check_reg(env, lmul, rs2, false);
> +vector_lmul_check_reg(env, lmul, rd, false);
> +
> +for (i = 0; i < vlmax; i++) {
> +src1 = rs1 + (i / (VLEN / width));
> +src2 = rs2 + (i / (VLEN / width));
> +dest = rd + (i / (VLEN / width));
> +j = i % (VLEN / width);
> +if (i < env->vfp.vstart) {
> +continue;
> +} else if (i < vl) {
> +switch (width) {
> +case 8:
> +if (vector_elem_mask(env, vm, width, lmul, i)) {
> +env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
> +& env->vfp.vreg[src2].u8[j];
> +}
> +break;

Note that a non-predicated logical operation need not consider the width.  All
of the widths perform the same operation, and therefore having the host operate
on u64 is fastest.  This is another good reason to notice vm=1 within the
translator and use separate helper functions for masked vs non-masked.

> +void VECTOR_HELPER(vand_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> +uint32_t rs2, uint32_t rd)
...
> +void VECTOR_HELPER(vand_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> +uint32_t rs2, uint32_t rd)

As with the previous set of arithmetic instructions, these should be a single
helper that is passed a 64-bit scalar.

Note that scalars smaller than 64-bit can be replicated with dup_const().  At
which point the logical operation is easily performed in 64-bit units instead
of any smaller unit.

Note that predication can be handled via logical masking.  For ARM SVE, we have
a set of functions that map the active bits of a predicate mask to byte masks.
 See e.g.

static inline uint64_t expand_pred_b(uint8_t byte)
static inline uint64_t expand_pred_h(uint8_t byte)
static inline uint64_t expand_pred_s(uint8_t byte)

so that the predicated logical and operation looks like

mask = expand_pred_n(env->vfp.vreg[0].u8[i]);
result = in1 & in2;
dest = (result & mask) | (dest & ~mask);


r~



[Qemu-devel] [PULL 4/4] target/mips: gdbstub: Revert commit 8e0b373

2019-09-12 Thread Aleksandar Markovic
From: Libo Zhou 

Multiple reports from users were received regarding failures of
packet 'g' communication with gdb for some MIPS configurations.
It was found out (by bisecting) that the problematic commit is
8e0b373. Revert that commit until a better solution is developed.

Suggested-by: Aleksandar Markovic 
Signed-off-by: Libo Zhou 
Signed-off-by: Aleksandar Markovic 
Reviewed-by: Aleksandar Markovic 
Message-Id: <1568207966-25202-1-git-send-email-aleksandar.marko...@rt-rk.com>
---
 target/mips/gdbstub.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/mips/gdbstub.c b/target/mips/gdbstub.c
index ebcc98b..bbb2544 100644
--- a/target/mips/gdbstub.c
+++ b/target/mips/gdbstub.c
@@ -38,7 +38,7 @@ int mips_cpu_gdb_read_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 return gdb_get_regl(mem_buf, (int32_t)env->active_fpu.fcr0);
 default:
 if (env->CP0_Status & (1 << CP0St_FR)) {
-return gdb_get_reg64(mem_buf,
+return gdb_get_regl(mem_buf,
 env->active_fpu.fpr[n - 38].d);
 } else {
 return gdb_get_regl(mem_buf,
@@ -99,7 +99,6 @@ int mips_cpu_gdb_write_register(CPUState *cs, uint8_t 
*mem_buf, int n)
 break;
 default:
 if (env->CP0_Status & (1 << CP0St_FR)) {
-uint64_t tmp = ldq_p(mem_buf);
 env->active_fpu.fpr[n - 38].d = tmp;
 } else {
 env->active_fpu.fpr[n - 38].w[FP_ENDIAN_IDX] = tmp;
-- 
2.7.4




[Qemu-devel] [PULL 0/4] MIPS queue for September 12th, 2019

2019-09-12 Thread Aleksandar Markovic
From: Aleksandar Markovic 

The following changes since commit 6d2fdde42c3344099262431df6a3f429c509291d:

  Merge remote-tracking branch 
'remotes/stsquad/tags/pull-testing-next-100919-2' into staging (2019-09-10 
14:52:09 +0100)

are available in the git repository at:

  https://github.com/AMarkovic/qemu tags/mips-queue-sep-12-2019

for you to fetch changes up to d1cc1533509012916dceeb7f23accda8a9fee85c:

  target/mips: gdbstub: Revert commit 8e0b373 (2019-09-12 18:25:34 +0200)



MIPS queue for September 12th, 2019

Highlights:

  - switch to using do_transaction_failed() hook for MIPS' Jazz boards
  - revert a commit that caused problems with gdb's packet 'g' for MIPS



Libo Zhou (1):
  target/mips: gdbstub: Revert commit 8e0b373

Peter Maydell (3):
  hw/mips/mips_jazz: Override do_transaction_failed hook
  target/mips: Switch to do_transaction_failed() hook
  hw/mips/mips_jazz: Remove no-longer-necessary override of
do_unassigned_access

 hw/mips/mips_jazz.c | 47 +--
 target/mips/cpu.c   |  2 +-
 target/mips/gdbstub.c   |  3 +--
 target/mips/internal.h  |  8 +---
 target/mips/op_helper.c | 24 
 5 files changed, 48 insertions(+), 36 deletions(-)

-- 
2.7.4




[Qemu-devel] [PULL 3/4] hw/mips/mips_jazz: Remove no-longer-necessary override of do_unassigned_access

2019-09-12 Thread Aleksandar Markovic
From: Peter Maydell 

Now that the MIPS CPU implementation uses the new
do_transaction_failed hook, we can remove the old code that handled
the do_unassigned_access hook.

Signed-off-by: Peter Maydell 
Signed-off-by: Aleksandar Markovic 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Hervé Poussineau 
Message-Id: <20190802160458.25681-4-peter.mayd...@linaro.org>
---
 hw/mips/mips_jazz.c | 27 ---
 1 file changed, 4 insertions(+), 23 deletions(-)

diff --git a/hw/mips/mips_jazz.c b/hw/mips/mips_jazz.c
index 1a8e847..c967b97 100644
--- a/hw/mips/mips_jazz.c
+++ b/hw/mips/mips_jazz.c
@@ -111,18 +111,6 @@ static const MemoryRegionOps dma_dummy_ops = {
 #define MAGNUM_BIOS_SIZE_MAX 0x7e000
 #define MAGNUM_BIOS_SIZE (BIOS_SIZE < MAGNUM_BIOS_SIZE_MAX ? BIOS_SIZE : 
MAGNUM_BIOS_SIZE_MAX)
 
-static CPUUnassignedAccess real_do_unassigned_access;
-static void mips_jazz_do_unassigned_access(CPUState *cpu, hwaddr addr,
-   bool is_write, bool is_exec,
-   int opaque, unsigned size)
-{
-if (!is_exec) {
-/* ignore invalid access (ie do not raise exception) */
-return;
-}
-(*real_do_unassigned_access)(cpu, addr, is_write, is_exec, opaque, size);
-}
-
 static void (*real_do_transaction_failed)(CPUState *cpu, hwaddr physaddr,
   vaddr addr, unsigned size,
   MMUAccessType access_type,
@@ -184,9 +172,8 @@ static void mips_jazz_init(MachineState *machine,
  * However, we can't simply add a global memory region to catch
  * everything, as this would make all accesses including instruction
  * accesses be ignored and not raise exceptions.
- * So instead we hijack either the do_unassigned_access method or
- * the do_transaction_failed method on the CPU, and do not raise exceptions
- * for data access.
+ * So instead we hijack the do_transaction_failed method on the CPU, and
+ * do not raise exceptions for data access.
  *
  * NOTE: this behaviour of raising exceptions for bad instruction
  * fetches but not bad data accesses was added in commit 54e755588cf1e9
@@ -197,14 +184,8 @@ static void mips_jazz_init(MachineState *machine,
  * memory region that catches all memory accesses, as we do on Malta.
  */
 cc = CPU_GET_CLASS(cpu);
-if (cc->do_unassigned_access) {
-real_do_unassigned_access = cc->do_unassigned_access;
-cc->do_unassigned_access = mips_jazz_do_unassigned_access;
-}
-if (cc->do_transaction_failed) {
-real_do_transaction_failed = cc->do_transaction_failed;
-cc->do_transaction_failed = mips_jazz_do_transaction_failed;
-}
+real_do_transaction_failed = cc->do_transaction_failed;
+cc->do_transaction_failed = mips_jazz_do_transaction_failed;
 
 /* allocate RAM */
 memory_region_allocate_system_memory(ram, NULL, "mips_jazz.ram",
-- 
2.7.4




[Qemu-devel] [PULL 1/4] hw/mips/mips_jazz: Override do_transaction_failed hook

2019-09-12 Thread Aleksandar Markovic
From: Peter Maydell 

The MIPS Jazz ('magnum' and 'pica61') boards have some code which
overrides the CPU's do_unassigned_access hook, so they can intercept
it and not raise exceptions on data accesses to invalid addresses,
only for instruction fetches.

We want to switch MIPS over to using the do_transaction_failed
hook instead, so add an intercept for that as well, and make
the board code install whichever hook the CPU is actually using.
Once we've changed the CPU implementation we can remove the
redundant code for the old hook.

Note: I am suspicious that the behaviour as implemented here may not
be what the hardware really does.  It was added in commit
54e755588cf1e90f0b14 to restore the behaviour that was broken by
commit c658b94f6e8c206c59d.  But prior to commit c658b94f6e8c206c59d
every MIPS board generated exceptions for instruction access to
invalid addresses but not for data accesses; and other boards,
notably Malta, were fixed by making all invalid accesses behave as
reads-as-zero (see the call to empty_slot_init() in
mips_malta_init()).  Hardware that raises exceptions for instruction
access and not data access seems to me to be an unlikely design, and
it's possible that the right way to emulate this is to make the Jazz
boards do what we did with Malta (or some variation of that).
Nonetheless, since I don't have access to real hardware to test
against I have taken the approach of "make QEMU continue to behave
the same way it did before this commit".  I have updated the comment
to correct the parts that are no longer accurate and note that
the hardware might behave differently.

The test case for the need for the hook-hijacking is in
https://bugs.launchpad.net/qemu/+bug/1245924 That BIOS will boot OK
either with this overriding of both hooks, or with a simple "global
memory region to ignore bad accesses of all types", so it doesn't
provide evidence either way, unfortunately.

Signed-off-by: Peter Maydell 
Signed-off-by: Aleksandar Markovic 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Hervé Poussineau 
Message-Id: <20190802160458.25681-2-peter.mayd...@linaro.org>
---
 hw/mips/mips_jazz.c | 54 +
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/hw/mips/mips_jazz.c b/hw/mips/mips_jazz.c
index 388c15c..1a8e847 100644
--- a/hw/mips/mips_jazz.c
+++ b/hw/mips/mips_jazz.c
@@ -123,6 +123,28 @@ static void mips_jazz_do_unassigned_access(CPUState *cpu, 
hwaddr addr,
 (*real_do_unassigned_access)(cpu, addr, is_write, is_exec, opaque, size);
 }
 
+static void (*real_do_transaction_failed)(CPUState *cpu, hwaddr physaddr,
+  vaddr addr, unsigned size,
+  MMUAccessType access_type,
+  int mmu_idx, MemTxAttrs attrs,
+  MemTxResult response,
+  uintptr_t retaddr);
+
+static void mips_jazz_do_transaction_failed(CPUState *cs, hwaddr physaddr,
+vaddr addr, unsigned size,
+MMUAccessType access_type,
+int mmu_idx, MemTxAttrs attrs,
+MemTxResult response,
+uintptr_t retaddr)
+{
+if (access_type != MMU_INST_FETCH) {
+/* ignore invalid access (ie do not raise exception) */
+return;
+}
+(*real_do_transaction_failed)(cs, physaddr, addr, size, access_type,
+  mmu_idx, attrs, response, retaddr);
+}
+
 static void mips_jazz_init(MachineState *machine,
enum jazz_model_e jazz_model)
 {
@@ -157,16 +179,32 @@ static void mips_jazz_init(MachineState *machine,
 env = >env;
 qemu_register_reset(main_cpu_reset, cpu);
 
-/* Chipset returns 0 in invalid reads and do not raise data exceptions.
+/*
+ * Chipset returns 0 in invalid reads and do not raise data exceptions.
  * However, we can't simply add a global memory region to catch
- * everything, as memory core directly call unassigned_mem_read/write
- * on some invalid accesses, which call do_unassigned_access on the
- * CPU, which raise an exception.
- * Handle that case by hijacking the do_unassigned_access method on
- * the CPU, and do not raise exceptions for data access. */
+ * everything, as this would make all accesses including instruction
+ * accesses be ignored and not raise exceptions.
+ * So instead we hijack either the do_unassigned_access method or
+ * the do_transaction_failed method on the CPU, and do not raise exceptions
+ * for data access.
+ *
+ * NOTE: this behaviour of raising exceptions for bad instruction
+ * fetches but not bad data accesses was added in commit 54e755588cf1e9
+ * to restore behaviour broken by 

[Qemu-devel] [PULL 2/4] target/mips: Switch to do_transaction_failed() hook

2019-09-12 Thread Aleksandar Markovic
From: Peter Maydell 

Switch the MIPS target from the old unassigned_access hook to the new
do_transaction_failed hook.

Unlike the old hook, do_transaction_failed is only ever called from
the TCG memory access paths, so there is no need for the "ignore this
if we're using KVM" hack that we were previously using to work around
the way unassigned_access was called for all kinds of memory accesses
to unassigned physical addresses.

The MIPS target does not ever do direct memory reads by physical
address (via either ldl_phys etc or address_space_ldl etc), so the
only memory accesses this affects are the 'normal' guest loads and
stores, which will be handled by the new hook; their behaviour is
unchanged.

Signed-off-by: Peter Maydell 
Signed-off-by: Aleksandar Markovic 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Hervé Poussineau 
Message-Id: <20190802160458.25681-3-peter.mayd...@linaro.org>
---
 target/mips/cpu.c   |  2 +-
 target/mips/internal.h  |  8 +---
 target/mips/op_helper.c | 24 
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/target/mips/cpu.c b/target/mips/cpu.c
index 3ffa342..bbcf7ca 100644
--- a/target/mips/cpu.c
+++ b/target/mips/cpu.c
@@ -202,7 +202,7 @@ static void mips_cpu_class_init(ObjectClass *c, void *data)
 cc->gdb_read_register = mips_cpu_gdb_read_register;
 cc->gdb_write_register = mips_cpu_gdb_write_register;
 #ifndef CONFIG_USER_ONLY
-cc->do_unassigned_access = mips_cpu_unassigned_access;
+cc->do_transaction_failed = mips_cpu_do_transaction_failed;
 cc->do_unaligned_access = mips_cpu_do_unaligned_access;
 cc->get_phys_page_debug = mips_cpu_get_phys_page_debug;
 cc->vmsd = _mips_cpu;
diff --git a/target/mips/internal.h b/target/mips/internal.h
index ae29b57..685e8d6 100644
--- a/target/mips/internal.h
+++ b/target/mips/internal.h
@@ -139,9 +139,11 @@ void r4k_helper_tlbinv(CPUMIPSState *env);
 void r4k_helper_tlbinvf(CPUMIPSState *env);
 void r4k_invalidate_tlb(CPUMIPSState *env, int idx, int use_extra);
 
-void mips_cpu_unassigned_access(CPUState *cpu, hwaddr addr,
-bool is_write, bool is_exec, int unused,
-unsigned size);
+void mips_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
+vaddr addr, unsigned size,
+MMUAccessType access_type,
+int mmu_idx, MemTxAttrs attrs,
+MemTxResult response, uintptr_t retaddr);
 hwaddr cpu_mips_translate_address(CPUMIPSState *env, target_ulong address,
   int rw);
 #endif
diff --git a/target/mips/op_helper.c b/target/mips/op_helper.c
index 01b9e78..4de6465 100644
--- a/target/mips/op_helper.c
+++ b/target/mips/op_helper.c
@@ -2668,27 +2668,19 @@ void mips_cpu_do_unaligned_access(CPUState *cs, vaddr 
addr,
 do_raise_exception_err(env, excp, error_code, retaddr);
 }
 
-void mips_cpu_unassigned_access(CPUState *cs, hwaddr addr,
-bool is_write, bool is_exec, int unused,
-unsigned size)
+void mips_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
+vaddr addr, unsigned size,
+MMUAccessType access_type,
+int mmu_idx, MemTxAttrs attrs,
+MemTxResult response, uintptr_t retaddr)
 {
 MIPSCPU *cpu = MIPS_CPU(cs);
 CPUMIPSState *env = >env;
 
-/*
- * Raising an exception with KVM enabled will crash because it won't be 
from
- * the main execution loop so the longjmp won't have a matching setjmp.
- * Until we can trigger a bus error exception through KVM lets just ignore
- * the access.
- */
-if (kvm_enabled()) {
-return;
-}
-
-if (is_exec) {
-raise_exception(env, EXCP_IBE);
+if (access_type == MMU_INST_FETCH) {
+do_raise_exception(env, EXCP_IBE, retaddr);
 } else {
-raise_exception(env, EXCP_DBE);
+do_raise_exception(env, EXCP_DBE, retaddr);
 }
 }
 #endif /* !CONFIG_USER_ONLY */
-- 
2.7.4




Re: [Qemu-devel] [PATCH v3 3/6] vmstate: replace DeviceState with VMStateIf

2019-09-12 Thread Halil Pasic
On Thu, 12 Sep 2019 16:25:11 +0400
Marc-André Lureau  wrote:

> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> index d4807f..16b9bbf04d 100644
> --- a/hw/s390x/s390-skeys.c
> +++ b/hw/s390x/s390-skeys.c
> @@ -392,7 +392,7 @@ static inline void 
> s390_skeys_set_migration_enabled(Object *obj, bool value,
>  register_savevm_live(NULL, TYPE_S390_SKEYS, 0, 1,
>   _s390_storage_keys, ss);
>  } else {
> -unregister_savevm(DEVICE(ss), TYPE_S390_SKEYS, ss);
> +unregister_savevm(VMSTATE_IF(ss), TYPE_S390_SKEYS, ss);
>  }
>  }

Acked-by: Halil Pasic 

BTW what does the 'f' in VMStateIf stand for? (I've had a look
at 2/6 but didn't figure out the answer).

Regards,
Halil




[Qemu-devel] [PATCH] blockdev: avoid acquiring AioContext lock twice at do_drive_backup()

2019-09-12 Thread Sergio Lopez
do_drive_backup() acquires the AioContext lock of the corresponding
BlockDriverState. This is not a problem when it's called from
qmp_drive_backup(), but drive_backup_prepare() also acquires the lock
before calling it.

This change adds a BlockDriverState argument to do_drive_backup(),
which is used to signal that the context lock is already acquired and
to save a couple of redundant calls.

Signed-off-by: Sergio Lopez 
---
 blockdev.c | 54 ++
 1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index fbef6845c8..0cc6c69ceb 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1762,8 +1762,10 @@ typedef struct DriveBackupState {
 BlockJob *job;
 } DriveBackupState;
 
-static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
-Error **errp);
+static BlockJob *do_drive_backup(DriveBackup *backup,
+ BlockDriverState *backup_bs,
+ JobTxn *txn,
+ Error **errp);
 
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
@@ -1781,6 +1783,11 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 return;
 }
 
+if (!bs->drv) {
+error_setg(errp, "Device has no medium");
+return;
+}
+
 aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(aio_context);
 
@@ -1789,7 +1796,9 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 
 state->bs = bs;
 
-state->job = do_drive_backup(backup, common->block_job_txn, _err);
+state->job = do_drive_backup(backup, bs,
+ common->block_job_txn,
+ _err);
 if (local_err) {
 error_propagate(errp, local_err);
 goto out;
@@ -3607,7 +3616,9 @@ static BlockJob *do_backup_common(BackupCommon *backup,
 return job;
 }
 
-static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
+static BlockJob *do_drive_backup(DriveBackup *backup,
+ BlockDriverState *backup_bs,
+ JobTxn *txn,
  Error **errp)
 {
 BlockDriverState *bs;
@@ -3625,18 +3636,27 @@ static BlockJob *do_drive_backup(DriveBackup *backup, 
JobTxn *txn,
 backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
 }
 
-bs = bdrv_lookup_bs(backup->device, backup->device, errp);
-if (!bs) {
-return NULL;
-}
+if (backup_bs) {
+bs = backup_bs;
+/*
+ * If the caller passes us a BDS, we assume it has already
+ * acquired the context lock.
+ */
+aio_context = bdrv_get_aio_context(bs);
+} else {
+bs = bdrv_lookup_bs(backup->device, backup->device, errp);
+if (!bs) {
+return NULL;
+}
 
-if (!bs->drv) {
-error_setg(errp, "Device has no medium");
-return NULL;
-}
+if (!bs->drv) {
+error_setg(errp, "Device has no medium");
+return NULL;
+}
 
-aio_context = bdrv_get_aio_context(bs);
-aio_context_acquire(aio_context);
+aio_context = bdrv_get_aio_context(bs);
+aio_context_acquire(aio_context);
+}
 
 if (!backup->has_format) {
 backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
@@ -3713,7 +3733,9 @@ static BlockJob *do_drive_backup(DriveBackup *backup, 
JobTxn *txn,
 unref:
 bdrv_unref(target_bs);
 out:
-aio_context_release(aio_context);
+if (!backup_bs) {
+aio_context_release(aio_context);
+}
 return job;
 }
 
@@ -3721,7 +3743,7 @@ void qmp_drive_backup(DriveBackup *arg, Error **errp)
 {
 
 BlockJob *job;
-job = do_drive_backup(arg, NULL, errp);
+job = do_drive_backup(arg, NULL, NULL, errp);
 if (job) {
 job_start(>job);
 }
-- 
2.21.0




Re: [Qemu-devel] [PATCH] util/ioc.c: try to reassure Coverity about qemu_iovec_init_extended

2019-09-12 Thread Stefan Hajnoczi
On Tue, Sep 10, 2019 at 12:03:10PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Make it more obvious, that filling qiov corresponds to qiov allocation,
> which in turn corresponds to total_niov calculation, based on mid_niov
> (not mid_len). Still add an assertion to show that there should be no
> difference.
> 
> Reported-by: Coverity (CID 1405302)
> Suggested-by: Peter Maydell 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  util/iov.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] target/arm: fix CBAR register for AArch64 CPUs

2019-09-12 Thread Peter Maydell
On Thu, 12 Sep 2019 at 12:01, Luc Michel  wrote:
>
> For AArch64 CPUs with a CBAR register, we have two views for it:
>   - in AArch64 state, the CBAR_EL1 register (S3_1_C15_C3_0), returns the
> full 64 bits CBAR value
>   - in AArch32 state, the CBAR register (cp15, opc1=1, CRn=15, CRm=3, opc2=0)
> returns a 32 bits view such that:
>   CBAR = CBAR_EL1[31:18] 0..0 CBAR_EL1[43:32]
>
> This commit fixes the current implementation where:
>   - CBAR_EL1 was returning the 32 bits view instead of the full 64 bits
> value,
>   - CBAR was returning a truncated 32 bits version of the full 64 bits
> one, instead of the 32 bits view
>   - CBAR was declared as cp15, opc1=4, CRn=15, CRm=0, opc2=0, which is
> the CBAR register found in the ARMv7 Cortex-Ax CPUs, but not in
> ARMv8 CPUs.
>
> Signed-off-by: Luc Michel 
> ---
>  target/arm/helper.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 507026c915..755aa18a2d 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -6740,12 +6740,12 @@ void register_cp_regs_for_features(ARMCPU *cpu)
>  ARMCPRegInfo cbar_reginfo[] = {
>  { .name = "CBAR",
>.type = ARM_CP_CONST,
> -  .cp = 15, .crn = 15, .crm = 0, .opc1 = 4, .opc2 = 0,
> -  .access = PL1_R, .resetvalue = cpu->reset_cbar },
> +  .cp = 15, .crn = 15, .crm = 3, .opc1 = 1, .opc2 = 0,
> +  .access = PL1_R, .resetvalue = cbar32 },

This will break the Cortex-A9  which use the 15/0/4/0 encoding
and the un-rearranged value for this register.

I think we need to check through the TRMs to confirm which CPUs use
which format for the CBAR, and have a different feature bit for the
newer format/sysreg encoding, so we can provide the right sysregs for
the right cores.

thanks
-- PMM



Re: [Qemu-devel] [RFC v3 PATCH 08/45] multi-process: add functions to synchronize proxy and remote endpoints

2019-09-12 Thread Stefan Hajnoczi
On Tue, Sep 03, 2019 at 04:37:34PM -0400, Jagannathan Raman wrote:
> In some cases, for example MMIO read, QEMU has to wait for the remote to
> complete a command before proceeding. An eventfd based mechanism is
> added to synchronize QEMU & remote process.
> 
> Signed-off-by: John G Johnson 
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: Elena Ufimtseva 
> ---
>  v1 -> v2:
>- Added timeout to synchronization functions
> 
>  include/io/proxy-link.h |  8 
>  io/proxy-link.c | 42 ++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/include/io/proxy-link.h b/include/io/proxy-link.h
> index ee78cdd..b76c574 100644
> --- a/include/io/proxy-link.h
> +++ b/include/io/proxy-link.h
> @@ -28,7 +28,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
>  
>  #include "qemu/osdep.h"
>  #include "qom/object.h"
> @@ -133,11 +135,17 @@ struct ProxyLinkState {
>  proxy_link_callback callback;
>  };
>  
> +#define GET_REMOTE_WAIT eventfd(0, 0)
> +#define PUT_REMOTE_WAIT(wait) close(wait)

Can you use functions instead of macros?  eventfd() is Linux-specific so
this code is not portable.  QEMU has an EventNotifier abstraction but
I'm not sure if it can be used since this patch doesn't include any code
that calls GET_REMOTE_WAIT/PUT_REMOTE_WAIT and there are no comments.  I
don't know what the expected semantics are.

> +#define PROXY_LINK_WAIT_DONE 1
> +
>  ProxyLinkState *proxy_link_create(void);
>  void proxy_link_finalize(ProxyLinkState *s);
>  
>  void proxy_proc_send(ProxyLinkState *s, ProcMsg *msg, ProcChannel *chan);
>  int proxy_proc_recv(ProxyLinkState *s, ProcMsg *msg, ProcChannel *chan);
> +uint64_t wait_for_remote(int efd);
> +void notify_proxy(int fd, uint64_t val);
>  
>  void proxy_link_init_channel(ProxyLinkState *s, ProcChannel **chan, int fd);
>  void proxy_link_destroy_channel(ProcChannel *chan);
> diff --git a/io/proxy-link.c b/io/proxy-link.c
> index 5eb9718..381a38e 100644
> --- a/io/proxy-link.c
> +++ b/io/proxy-link.c
> @@ -31,6 +31,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include "qemu/module.h"
>  #include "io/proxy-link.h"
> @@ -216,6 +218,46 @@ int proxy_proc_recv(ProxyLinkState *s, ProcMsg *msg, 
> ProcChannel *chan)
>  return rc;
>  }
>  
> +uint64_t wait_for_remote(int efd)

Hard to tell if this makes sense without any context.  I notice that
EFD_CLOEXEC and EFD_NONBLOCK were not used.  It's likely that
EFD_CLOEXEC should be used.  Since the eventfd is used with poll(2)
EFD_NONBLOCK should probably also be used so it's certain that read()
will not block (which could exceed the timeout).


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc

2019-09-12 Thread Richard Henderson
On 9/12/19 11:27 AM, Richard Henderson wrote:
>> +void VECTOR_HELPER(vadc_vxm)(CPURISCVState *env, uint32_t rs1,
>> +uint32_t rs2, uint32_t rd)
>> +{
> 
> Watch the spacing between functions.
> Pass gpr rs1 by value.
> 
>> +void VECTOR_HELPER(vadc_vim)(CPURISCVState *env, uint32_t rs1,
>> +uint32_t rs2, uint32_t rd)
>> +{
> ...
>> +env->vfp.vreg[dest].u8[j] = sign_extend(rs1, 5)
> 
> Pass the immediate as a sign-extended immediate to begin with, not as an
> unsigned 5-bit field.

Oh, and of course *_vxm and *_vim should be identical, because in both cases
there is a single scalar parameter.  In the first case the scalar is passed by
value from the gpr; in the second case the scalar is the sign-extended constant.


r~



Re: [Qemu-devel] [RFC v3 PATCH 07/45] multi-process: define proxy-link object

2019-09-12 Thread Stefan Hajnoczi
On Tue, Sep 03, 2019 at 04:37:33PM -0400, Jagannathan Raman wrote:
> diff --git a/include/io/proxy-link.h b/include/io/proxy-link.h
> new file mode 100644
> index 000..ee78cdd
> --- /dev/null
> +++ b/include/io/proxy-link.h
> @@ -0,0 +1,147 @@

Regarding naming: so far I've seen "remote", "mpqemu", and "proxy".
These concepts aren't well-defined and I'm not sure if they refer to the
same thing or are different.  ProxyLinkState should be named
MPQemuLinkState?

It's also unclear to me how tightly coupled this "proxy link" is to PCI
(since it has PCI Configuration Space commands and how it would be
cleanly extended to support other busses).

> +/*
> + * Communication channel between QEMU and remote device process
> + *
> + * Copyright 2019, Oracle and/or its affiliates.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#ifndef PROXY_LINK_H
> +#define PROXY_LINK_H
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "qemu/osdep.h"
> +#include "qom/object.h"
> +#include "qemu/thread.h"
> +
> +typedef struct ProxyLinkState ProxyLinkState;
> +
> +#define TYPE_PROXY_LINK "proxy-link"
> +#define PROXY_LINK(obj) \
> +OBJECT_CHECK(ProxyLinkState, (obj), TYPE_PROXY_LINK)
> +
> +#define REMOTE_MAX_FDS 8
> +
> +#define PROC_HDR_SIZE offsetof(ProcMsg, data1.u64)
> +
> +/*
> + * proc_cmd_t enum type to specify the command to be executed on the remote
> + * device
> + *
> + * Following commands are supported:
> + * CONF_READPCI config. space read
> + * CONF_WRITE   PCI config. space write
> + *
> + */
> +typedef enum {
> +INIT = 0,
> +CONF_READ,
> +CONF_WRITE,
> +MAX,
> +} proc_cmd_t;
> +
> +/*
> + * ProcMsg Format of the message sent to the remote device from QEMU
> + *
> + * cmd The remote command
> + * bytestream  Indicates if the data to be shared is structured (data1)
> + * or unstructured (data2)
> + * sizeSize of the data to be shared
> + * data1   Structured data
> + * fds File descriptors to be shared with remote device
> + * data2   Unstructured data
> + *
> + */
> +typedef struct {
> +proc_cmd_t cmd;
> +int bytestream;

Please use bool.

> +size_t size;
> +
> +union {
> +uint64_t u64;
> +} data1;
> +
> +int fds[REMOTE_MAX_FDS];
> +int num_fds;
> +
> +uint8_t *data2;
> +} ProcMsg;
> +
> +struct conf_data_msg {
> +uint32_t addr;
> +uint32_t val;
> +int l;

Please pick a descriptive name for this field.

> +};
> +
> +/*
> + * ProcChannel defines the channel that make up the communication link
> + * between QEMU and remote process
> + *
> + * gsrc   GSource object to be used by loop
> + * gpfd   GPollFD object containing the socket & events to monitor
> + * sock   Socket to send/receive communication, same as the one in gpfd
> + * lock   Mutex to synchronize access to the channel
> + */

Please see the doc comment style in include/qom/object.h for examples of
how to format doc comments.

> +
> +typedef struct ProcChannel {
> +GSource gsrc;
> +GPollFD gpfd;
> +int sock;
> +QemuMutex lock;
> +} ProcChannel;
> +
> +typedef void (*proxy_link_callback)(GIOCondition cond, ProcChannel *chan);
> +
> +/*
> + * ProxyLinkState Instance info. of the communication
> + * link between QEMU and remote process. The Link could
> + * be made up of multiple channels.
> + *
> + * ctxGMainContext to be used for communication
> + * loop   Main loop that would be used to poll for incoming data
> + * comCommunication channel to transport control messages
> + *
> + */
> +struct ProxyLinkState {
> +Object obj;
> +
> +GMainContext *ctx;
> +GMainLoop *loop;
> +
> +ProcChannel *com;
> +
> +proxy_link_callback callback;
> +};
> +
> +ProxyLinkState *proxy_link_create(void);
> +void proxy_link_finalize(ProxyLinkState *s);
> +
> +void 

Re: [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc

2019-09-12 Thread Richard Henderson
On 9/11/19 2:25 AM, liuzhiwei wrote:
>  #define VECTOR_HELPER(name) HELPER(glue(vector_, name))
> +#define SIGNBIT8(1 << 7)
> +#define SIGNBIT16   (1 << 15)
> +#define SIGNBIT32   (1 << 31)
> +#define SIGNBIT64   ((uint64_t)1 << 63)

Perhaps make up your mind if you want signed or unsigned values?  Perhaps just
use or redefine INT_MIN instead?

> +static int64_t extend_gpr(target_ulong reg)
> +{
> +return sign_extend(reg, sizeof(target_ulong) * 8);
> +}

Note wrt usage:
+extend_rs1 = (uint64_t)extend_gpr(env->gpr[rs1]);

This is equivalent to "extend_rs1 = (target_long)env->gpr[rs1]".

I don't see how this helper function is helping, really.
Also, pass gprs by value, not by index.

> +static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
> +int index)
> +{
> +int mlen = width / lmul;
> +int idx = (index * mlen) / 8;
> +int pos = (index * mlen) % 8;
> +
> +return (env->vfp.vreg[0].u8[idx] >> pos) & 0x1;
> +}

Any reason not to re-use vector_elem_mask?

> +static inline uint64_t u64xu64_lh(uint64_t a, uint64_t b)
> +{
> +uint64_t hi_64, carry;
> +
> +/* first get the whole product in {hi_64, lo_64} */
> +uint64_t a_hi = a >> 32;
> +uint64_t a_lo = (uint32_t)a;
> +uint64_t b_hi = b >> 32;
> +uint64_t b_lo = (uint32_t)b;
> +
> +/*
> + * a * b = (a_hi << 32 + a_lo) * (b_hi << 32 + b_lo)
> + *   = (a_hi * b_hi) << 64 + (a_hi * b_lo) << 32 +
> + * (a_lo * b_hi) << 32 + a_lo * b_lo
> + *   = {hi_64, lo_64}
> + * hi_64 = ((a_hi * b_lo) << 32 + (a_lo * b_hi) << 32 + (a_lo * b_lo)) 
> >> 64
> + *   = (a_hi * b_lo) >> 32 + (a_lo * b_hi) >> 32 + carry
> + * carry = ((uint64_t)(uint32_t)(a_hi * b_lo) +
> + *   (uint64_t)(uint32_t)(a_lo * b_hi) + (a_lo * b_lo) >> 32) >> 
> 32
> + */
> +
> +carry =  ((uint64_t)(uint32_t)(a_hi * b_lo) +
> +  (uint64_t)(uint32_t)(a_lo * b_hi) +
> +  ((a_lo * b_lo) >> 32)) >> 32;
> +
> +hi_64 = a_hi * b_hi +
> +((a_hi * b_lo) >> 32) + ((a_lo * b_hi) >> 32) +
> +carry;
> +
> +return hi_64;
> +}

Use mulu64().

> +static inline int64_t s64xu64_lh(int64_t a, uint64_t b)
> +{
> +uint64_t abs_a = a;
> +uint64_t lo_64, hi_64;
> +
> +if (a < 0) {
> +abs_a =  ~a + 1;

 abs_a = -a

> +static inline int64_t s64xs64_lh(int64_t a, int64_t b)
> +{
> +uint64_t abs_a = a, abs_b = b;
> +uint64_t lo_64, hi_64;
> +
> +if (a < 0) {
> +abs_a =  ~a + 1;
> +}
> +if (b < 0) {
> +abs_b = ~b + 1;
> +}
> +lo_64 = abs_a * abs_b;
> +hi_64 = u64xu64_lh(abs_a, abs_b);
> +
> +if ((a ^ b) & SIGNBIT64) {
> +lo_64 = ~lo_64;
> +hi_64 = ~hi_64;
> +if (lo_64 == UINT64_MAX) {
> +lo_64 = 0;
> +hi_64 += 1;
> +} else {
> +lo_64 += 1;
> +}
> +}
> +return hi_64;
> +}

Use muls64().


> +void VECTOR_HELPER(vadc_vvm)(CPURISCVState *env, uint32_t rs1,
> +uint32_t rs2, uint32_t rd)
> +{
> +int i, j, vl;
> +uint32_t lmul, width, src1, src2, dest, vlmax, carry;
> +
> +vl= env->vfp.vl;
> +lmul  = vector_get_lmul(env);
> +width   = vector_get_width(env);
> +vlmax = vector_get_vlmax(env);
> +
> +if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) {
> +riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> +return;
> +}
> +vector_lmul_check_reg(env, lmul, rs1, false);
> +vector_lmul_check_reg(env, lmul, rs2, false);
> +vector_lmul_check_reg(env, lmul, rd, false);
> +
> +for (i = 0; i < vlmax; i++) {
> +src1 = rs1 + (i / (VLEN / width));
> +src2 = rs2 + (i / (VLEN / width));
> +dest = rd + (i / (VLEN / width));
> +j = i % (VLEN / width);
> +if (i < env->vfp.vstart) {
> +continue;

Again, hoist.

> +} else if (i < vl) {

I would think this too could be moved into the loop condition.

> +switch (width) {
> +case 8:
> +carry = vector_get_carry(env, width, lmul, i);
> +env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
> ++ env->vfp.vreg[src2].u8[j] + carry;
> +break;
> +case 16:
> +carry = vector_get_carry(env, width, lmul, i);
> +env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]
> ++ env->vfp.vreg[src2].u16[j] + carry;
> +break;
> +case 32:
> +carry = vector_get_carry(env, width, lmul, i);
> +env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]
> ++ env->vfp.vreg[src2].u32[j] + carry;
> +break;
> +case 64:
> +carry = vector_get_carry(env, width, lmul, i);
> +env->vfp.vreg[dest].u64[j] = 

[Qemu-devel] [PATCH 3/4] block/mirror: support unaligned write in active mirror

2019-09-12 Thread Vladimir Sementsov-Ogievskiy
Prior 9adc1cb49af8d do_sync_target_write had a bug: it reset aligned-up
region in the dirty bitmap, which means that we may not copy some bytes
and assume them copied, which actually leads to producing corrupted
target.

So 9adc1cb49af8d forced dirty bitmap granularity to be
request_alignment for mirror-top filter, so we are not working with
unaligned requests. However forcing large alignment obviously decreases
performance of unaligned requests.

This commit provides another solution for the problem: if unaligned
padding is already dirty, we can safely ignore it, as
1. It's dirty, it will be copied by mirror_iteration anyway
2. It's dirty, so skipping it now we don't increase dirtiness of the
   bitmap and therefore don't damage "synchronicity" of the
   write-blocking mirror.

If unaligned padding is not dirty, we just write it, no reason to touch
dirty bitmap if we succeed (on failure we'll set the whole region
ofcourse, but we loss "synchronicity" on failure anyway).

Note: we need to disable dirty_bitmap, otherwise we will not be able to
see in do_sync_target_write bitmap state before current operation. We
may of course check dirty bitmap before the operation in
bdrv_mirror_top_do_write and remember it, but we don't need active
dirty bitmap for write-blocking mirror anyway.

New code-path is unused until the following commit reverts
9adc1cb49af8d.

Suggested-by: Denis V. Lunev 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/mirror.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/block/mirror.c b/block/mirror.c
index d176bf5920..d192f6a96b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1204,6 +1204,39 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod 
method,
  QEMUIOVector *qiov, int flags)
 {
 int ret;
+size_t qiov_offset = 0;
+
+if (!QEMU_IS_ALIGNED(offset, job->granularity) &&
+bdrv_dirty_bitmap_get(job->dirty_bitmap, offset)) {
+/*
+ * Dirty unaligned padding
+ * 1. It's already dirty, no damage to "actively_synced" if we just
+ *skip unaligned part.
+ * 2. If we copy it, we can't reset corresponding bit in
+ *dirty_bitmap as there may be some "dirty" bytes still not
+ *copied.
+ * So, just ignore it.
+ */
+qiov_offset = QEMU_ALIGN_UP(offset, job->granularity) - offset;
+if (bytes <= qiov_offset) {
+/* nothing to do after shrink */
+return;
+}
+offset += qiov_offset;
+bytes -= qiov_offset;
+}
+
+if (!QEMU_IS_ALIGNED(offset + bytes, job->granularity) &&
+bdrv_dirty_bitmap_get(job->dirty_bitmap, offset + bytes - 1))
+{
+uint64_t tail = (offset + bytes) % job->granularity;
+
+if (bytes <= tail) {
+/* nothing to do after shrink */
+return;
+}
+bytes -= tail;
+}
 
 bdrv_reset_dirty_bitmap(job->dirty_bitmap, offset, bytes);
 
@@ -1211,7 +1244,8 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod 
method,
 
 switch (method) {
 case MIRROR_METHOD_COPY:
-ret = blk_co_pwritev(job->target, offset, bytes, qiov, flags);
+ret = blk_co_pwritev_part(job->target, offset, bytes,
+  qiov, qiov_offset, flags);
 break;
 
 case MIRROR_METHOD_ZERO:
@@ -1640,6 +1674,9 @@ static BlockJob *mirror_start_job(
 if (!s->dirty_bitmap) {
 goto fail;
 }
+if (s->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING) {
+bdrv_disable_dirty_bitmap(s->dirty_bitmap);
+}
 
 ret = block_job_add_bdrv(>common, "source", bs, 0,
  BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE |
-- 
2.21.0




Re: [Qemu-devel] [PULL v2 00/46] testing updates (fixes, upgrades, caching)

2019-09-12 Thread Peter Maydell
On Tue, 10 Sep 2019 at 14:24, Alex Bennée  wrote:
>
> The following changes since commit 89ea03a7dc83ca36b670ba7f787802791fcb04b1:
>
>   Merge remote-tracking branch 
> 'remotes/huth-gitlab/tags/m68k-pull-2019-09-07' into staging (2019-09-09 
> 09:48:34 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/stsquad/qemu.git tags/pull-testing-next-100919-2
>
> for you to fetch changes up to dda60da384ddbe4fc75182dd23db7e9aa4a88f46:
>
>   tests/tcg: fix typo when calling clean-tcg (2019-09-10 14:14:32 +0100)
>
> 
> Testing fixes:
>
>   - podman cleanups
>   - docker.py python3 fixes (encode)
>   - DEF_TARGET_LIST applied to cross build images
>   - move a bunch to Buster based images
>   - enable Travis caching
>   - more common objs for faster builds
>   - stable URLs for acceptance tests
>   - additional travis dependencies
>   - work around ppc64abi32 linux-test breakage [v2]
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM



[Qemu-devel] [PATCH 0/4] active-mirror: support unaligned guest operations

2019-09-12 Thread Vladimir Sementsov-Ogievskiy
Commit 9adc1cb49af8d fixed a bug about unaligned (to dirty bitmap
granularity) guest writes (and discards) by simply requesting
corresponding alignment on mirror-top filter. However forcing large
alignment obviously decreases performance of unaligned requests.

So it's time for a new solution which is in 03. And 04 reverts
9adc1cb49af8d.

Vladimir Sementsov-Ogievskiy (4):
  block/mirror: simplify do_sync_target_write
  block/block-backend: add blk_co_pwritev_part
  block/mirror: support unaligned write in active mirror
  Revert "mirror: Only mirror granularity-aligned chunks"

 include/sysemu/block-backend.h |   4 +
 block/block-backend.c  |  17 +++-
 block/mirror.c | 153 +
 3 files changed, 78 insertions(+), 96 deletions(-)

-- 
2.21.0




Re: [Qemu-devel] [PATCH v2 3/3] xen: perform XenDevice clean-up in XenBus watch handler

2019-09-12 Thread Paul Durrant
> -Original Message-
> From: Anthony PERARD 
> Sent: 12 September 2019 16:04
> To: Paul Durrant 
> Cc: qemu-devel@nongnu.org; xen-de...@lists.xenproject.org; Stefano Stabellini 
> 
> Subject: Re: [PATCH v2 3/3] xen: perform XenDevice clean-up in XenBus watch 
> handler
> 
> On Thu, Sep 12, 2019 at 01:16:46PM +0100, Paul Durrant wrote:
> > Cleaning up offine XenDevice objects directly in
>   ^ offline
> 
> > xen_device_backend_changed() is dangerous as xen_device_unrealize() will
> > modify the watch list that is being walked. Even the QLIST_FOREACH_SAFE()
> > used in notifier_list_notify() is insufficient as *two* notifiers (for
> > the frontend and backend watches) are removed, thus potentially rendering
> > the 'next' pointer unsafe.
> >
> > The solution is to use the XenBus backend_watch handler to do the clean-up
> > instead, as it is invoked whilst walking a separate watch list.
> >
> > This patch therefore adds a new 'offline_devices' list to XenBus, to which
> > offline devices are added by xen_device_backend_changed(). The XenBus
> > backend_watch registration is also changed to not only invoke
> > xen_bus_enumerate() but also a new xen_bus_cleanup() function, which will
> > walk 'offline_devices' and perform the necessary actions.
> > For safety a an extra 'online' check is also added to
>  ^ one 'a' too many?
> 
> > xen_bus_type_enumerate() to make sure that no attempt is made to create a
> > new XenDevice object for a backend that is offline.
> >
> > NOTE: This patch also include some cosmetic changes:
> >   - substitute the local variable name 'backend_state'
> > in xen_bus_type_enumerate() with 'state', since there
> > is no ambiguity with any other state in that context.
> >   - change xen_device_state_is_active() to
> > xen_device_frontend_is_active() (and pass a XenDevice directly)
> > since the state tests contained therein only apply to a frontend.
> >   - use 'state' rather then 'xendev->backend_state' in
> > xen_device_backend_changed() to shorten the code.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> >
> > v2:
> >  - Make sure we don't try to add a XenDevice to 'offline_devices' more than
> >once
> > ---
> >
> >  /*
> >   * If a backend is still 'online' then we should leave it alone but,
> > - * if a backend is not 'online', then the device should be destroyed
> > - * once the state is Closed.
> > + * if a backend is not 'online', then the device is a candidate
> > + * for destruction. Hence add it to the 'offline' list to be cleaned
> > + * by xen_bus_cleanup().
> >   */
> > -if (!xendev->backend_online &&
> > -(xendev->backend_state == XenbusStateClosed ||
> > - xendev->backend_state == XenbusStateInitialising ||
> > - xendev->backend_state == XenbusStateInitWait ||
> > - xendev->backend_state == XenbusStateUnknown)) {
> > -Error *local_err = NULL;
> > +if (!online &&
> > +(state == XenbusStateClosed ||  state == XenbusStateInitialising ||
> > + state == XenbusStateInitWait || state == XenbusStateUnknown) &&
> > +!QLIST_NEXT(xendev, list)) {
> 
> Could you add a note about this QLIST_NEXT? I don't think it's going to
> be obvious enough why we check that there are no next item. I might only
> understand it just because of your reply to the v1 of the patch.
> (Well the changelog of the patch also point out what it's for.)
> 

Sure, it's worth a comment.

> > +XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
> >
> > -if (!xen_backend_try_device_destroy(xendev, _err)) {
> > -object_unparent(OBJECT(xendev));
> > -}
> > +QLIST_INSERT_HEAD(>offline_devices, xendev, list);
> >
> > -if (local_err) {
> > -error_report_err(local_err);
> > -}
> > +/*
> > + * Re-write the state to cause a XenBus backend_watch notification,
> > + * resulting in a call to xen_bus_cleanup().
> > + */
> > +xen_device_backend_printf(xendev, "state", "%u", state);
> 
> It kind of feels wrong to rely on xenstore to notify QEMU's xenbus
> driver that a device needs cleanup. But that does the job.
> 

I had originally considered setting up an event notifier but that seemed like 
more fds than were strictly necessary ;-)

> With a note about the use of QLIST_NEXT and the few typo fixed, the
> patch will be good to go.
> 

Cool. I'll tidy up and send a v3.

  Paul

> Thanks,
> 
> --
> Anthony PERARD



[Qemu-devel] [PATCH 2/4] block/block-backend: add blk_co_pwritev_part

2019-09-12 Thread Vladimir Sementsov-Ogievskiy
Add blk write function with qiov_offset parameter. It's needed for the
following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/sysemu/block-backend.h |  4 
 block/block-backend.c  | 17 +
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 368d53af77..73f2cef7fe 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -121,6 +121,10 @@ void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps 
*ops, void *opaque);
 int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags);
+int coroutine_fn blk_co_pwritev_part(BlockBackend *blk, int64_t offset,
+ unsigned int bytes,
+ QEMUIOVector *qiov, size_t qiov_offset,
+ BdrvRequestFlags flags);
 int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags);
diff --git a/block/block-backend.c b/block/block-backend.c
index 1c605d5444..b3d00478af 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1176,9 +1176,10 @@ int coroutine_fn blk_co_preadv(BlockBackend *blk, 
int64_t offset,
 return ret;
 }
 
-int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
-unsigned int bytes, QEMUIOVector *qiov,
-BdrvRequestFlags flags)
+int coroutine_fn blk_co_pwritev_part(BlockBackend *blk, int64_t offset,
+ unsigned int bytes,
+ QEMUIOVector *qiov, size_t qiov_offset,
+ BdrvRequestFlags flags)
 {
 int ret;
 BlockDriverState *bs;
@@ -1205,11 +1206,19 @@ int coroutine_fn blk_co_pwritev(BlockBackend *blk, 
int64_t offset,
 flags |= BDRV_REQ_FUA;
 }
 
-ret = bdrv_co_pwritev(blk->root, offset, bytes, qiov, flags);
+ret = bdrv_co_pwritev_part(blk->root, offset, bytes, qiov, qiov_offset,
+   flags);
 bdrv_dec_in_flight(bs);
 return ret;
 }
 
+int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
+unsigned int bytes, QEMUIOVector *qiov,
+BdrvRequestFlags flags)
+{
+return blk_co_pwritev_part(blk, offset, bytes, qiov, 0, flags);
+}
+
 typedef struct BlkRwCo {
 BlockBackend *blk;
 int64_t offset;
-- 
2.21.0




[Qemu-devel] [PATCH 4/4] Revert "mirror: Only mirror granularity-aligned chunks"

2019-09-12 Thread Vladimir Sementsov-Ogievskiy
This reverts commit 9adc1cb49af8d4e54f57980b1eed5c0a4b2dafa6.
"mirror: Only mirror granularity-aligned chunks"

Since previous commit unaligned chunks are supported by
do_sync_target_write.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/mirror.c | 29 -
 1 file changed, 29 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index d192f6a96b..4626b91541 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1482,15 +1482,6 @@ static void bdrv_mirror_top_child_perm(BlockDriverState 
*bs, BdrvChild *c,
 *nshared = BLK_PERM_ALL;
 }
 
-static void bdrv_mirror_top_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-MirrorBDSOpaque *s = bs->opaque;
-
-if (s && s->job && s->job->copy_mode == MIRROR_COPY_MODE_WRITE_BLOCKING) {
-bs->bl.request_alignment = s->job->granularity;
-}
-}
-
 /* Dummy node that provides consistent read to its users without requiring it
  * from its backing file and that allows writes on the backing file chain. */
 static BlockDriver bdrv_mirror_top = {
@@ -1503,7 +1494,6 @@ static BlockDriver bdrv_mirror_top = {
 .bdrv_co_block_status   = bdrv_co_block_status_from_backing,
 .bdrv_refresh_filename  = bdrv_mirror_top_refresh_filename,
 .bdrv_child_perm= bdrv_mirror_top_child_perm,
-.bdrv_refresh_limits= bdrv_mirror_top_refresh_limits,
 };
 
 static BlockJob *mirror_start_job(
@@ -1651,25 +1641,6 @@ static BlockJob *mirror_start_job(
 s->should_complete = true;
 }
 
-/*
- * Must be called before we start tracking writes, but after
- *
- * ((MirrorBlockJob *)
- * ((MirrorBDSOpaque *)
- * mirror_top_bs->opaque
- * )->job
- * )->copy_mode
- *
- * has the correct value.
- * (We start tracking writes as of the following
- * bdrv_create_dirty_bitmap() call.)
- */
-bdrv_refresh_limits(mirror_top_bs, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-goto fail;
-}
-
 s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
 if (!s->dirty_bitmap) {
 goto fail;
-- 
2.21.0




[Qemu-devel] [PATCH 1/4] block/mirror: simplify do_sync_target_write

2019-09-12 Thread Vladimir Sementsov-Ogievskiy
do_sync_target_write is called from bdrv_mirror_top_do_write after
write/discard operation, all inside active_write/active_write_settle
protecting us from mirror iteration. So the whole area is dirty for
sure, no reason to examine dirty bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/mirror.c | 95 +++---
 1 file changed, 28 insertions(+), 67 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 853e2c7510..d176bf5920 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1203,84 +1203,45 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod 
method,
  uint64_t offset, uint64_t bytes,
  QEMUIOVector *qiov, int flags)
 {
-QEMUIOVector target_qiov;
-uint64_t dirty_offset = offset;
-uint64_t dirty_bytes;
-
-if (qiov) {
-qemu_iovec_init(_qiov, qiov->niov);
-}
-
-while (true) {
-bool valid_area;
-int ret;
-
-bdrv_dirty_bitmap_lock(job->dirty_bitmap);
-dirty_bytes = MIN(offset + bytes - dirty_offset, INT_MAX);
-valid_area = bdrv_dirty_bitmap_next_dirty_area(job->dirty_bitmap,
-   _offset,
-   _bytes);
-if (!valid_area) {
-bdrv_dirty_bitmap_unlock(job->dirty_bitmap);
-break;
-}
+int ret;
 
-bdrv_reset_dirty_bitmap_locked(job->dirty_bitmap,
-   dirty_offset, dirty_bytes);
-bdrv_dirty_bitmap_unlock(job->dirty_bitmap);
+bdrv_reset_dirty_bitmap(job->dirty_bitmap, offset, bytes);
 
-job_progress_increase_remaining(>common.job, dirty_bytes);
+job_progress_increase_remaining(>common.job, bytes);
 
-assert(dirty_offset - offset <= SIZE_MAX);
-if (qiov) {
-qemu_iovec_reset(_qiov);
-qemu_iovec_concat(_qiov, qiov,
-  dirty_offset - offset, dirty_bytes);
-}
-
-switch (method) {
-case MIRROR_METHOD_COPY:
-ret = blk_co_pwritev(job->target, dirty_offset, dirty_bytes,
- qiov ? _qiov : NULL, flags);
-break;
+switch (method) {
+case MIRROR_METHOD_COPY:
+ret = blk_co_pwritev(job->target, offset, bytes, qiov, flags);
+break;
 
-case MIRROR_METHOD_ZERO:
-assert(!qiov);
-ret = blk_co_pwrite_zeroes(job->target, dirty_offset, dirty_bytes,
-   flags);
-break;
+case MIRROR_METHOD_ZERO:
+assert(!qiov);
+ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags);
+break;
 
-case MIRROR_METHOD_DISCARD:
-assert(!qiov);
-ret = blk_co_pdiscard(job->target, dirty_offset, dirty_bytes);
-break;
+case MIRROR_METHOD_DISCARD:
+assert(!qiov);
+ret = blk_co_pdiscard(job->target, offset, bytes);
+break;
 
-default:
-abort();
-}
+default:
+abort();
+}
 
-if (ret >= 0) {
-job_progress_update(>common.job, dirty_bytes);
-} else {
-BlockErrorAction action;
+if (ret >= 0) {
+job_progress_update(>common.job, bytes);
+} else {
+BlockErrorAction action;
 
-bdrv_set_dirty_bitmap(job->dirty_bitmap, dirty_offset, 
dirty_bytes);
-job->actively_synced = false;
+bdrv_set_dirty_bitmap(job->dirty_bitmap, offset, bytes);
+job->actively_synced = false;
 
-action = mirror_error_action(job, false, -ret);
-if (action == BLOCK_ERROR_ACTION_REPORT) {
-if (!job->ret) {
-job->ret = ret;
-}
-break;
+action = mirror_error_action(job, false, -ret);
+if (action == BLOCK_ERROR_ACTION_REPORT) {
+if (!job->ret) {
+job->ret = ret;
 }
 }
-
-dirty_offset += dirty_bytes;
-}
-
-if (qiov) {
-qemu_iovec_destroy(_qiov);
 }
 }
 
-- 
2.21.0




Re: [Qemu-devel] [PATCH qemu] loader: Trace loaded images

2019-09-12 Thread Philippe Mathieu-Daudé
On 6/20/19 10:53 AM, Philippe Mathieu-Daudé wrote:
> On 6/20/19 7:50 AM, Alexey Kardashevskiy wrote:
>> On 17/06/2019 14:56, Philippe Mathieu-Daudé wrote:
>>> On 6/17/19 3:25 AM, Alexey Kardashevskiy wrote:
 On 14/06/2019 19:33, Stefan Hajnoczi wrote:
> On Fri, Jun 14, 2019 at 10:13:04AM +1000, Alexey Kardashevskiy wrote:
>>
>>
>> On 13/06/2019 23:08, Philippe Mathieu-Daudé wrote:
>>> Hi Alexey,
>>>
>>> On 6/13/19 7:09 AM, Alexey Kardashevskiy wrote:
 This adds a trace point which prints every loaded image. This includes
 bios/firmware/kernel/initradmdisk/pcirom.

 Signed-off-by: Alexey Kardashevskiy 
 ---

 The example for a pseries guest:

 loader_write_rom slof.bin: @0x0 size=0xe22e0 ROM=0
 loader_write_rom phdr #0: /home/aik/t/vml4120le: @0x40 
 size=0x13df000 ROM=0
 loader_write_rom /home/aik/t/le.cpio: @0x1ad size=0x9463a00 ROM=0
>>>
>>> I find the "ROM=0" part confuse, maybe you can change to "ROM:false".
>>
>> How? I mean I can do that in the code as rom->isrom?"true":"false" and
>> make trace point accept "%s" but it is quite ugly and others seem to
>> just use %d for bool.
>
> Yes, %d is the convention for bool.  Perhaps you can name it "is_rom"
> instead of "ROM".  That way the name communicates that this is a boolean
> value.

 It is quite obvious though that it is boolean even as "ROM" (what else
 can that be realistically?) and there does not seem to be a convention
 about xxx:N vs is_xxx:N. And personally I find longer lines worse for
 limited width screens (I run multiple qemus in tiled tmux). Whose tree
 is this going to? Let's ask that person :)
>>>
>>> Personally I find 'is_rom' clearer. I read 2 addresses, then my first
>>> reaction was to parse it as another address. But it is also true we now
>>> enforce traced hex values with '0x' prefix, so your 'ROM' is unlikely an
>>> address. Tiled tmux is an acceptable argument. Anyway you already got my
>>> R-b.
>>>
>>> Tree: the PPC tree is likely to get it merged quicker than the MISC tree.
>>
>>
>> There is nothing specific about PPC though so I guess it is the MISC
>> tree, who does maintain that?
> 
> Paolo, Cc'ing him.

Stefan sometime takes tracing patches, but this patch might also go via
Trivial@ (Cc'ed).



Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState

2019-09-12 Thread Richard Henderson
On 9/12/19 10:53 AM, Chih-Min Chao wrote:
> 
> 
> On Thu, Sep 12, 2019 at 6:39 AM Richard Henderson 
>  > wrote:
> 
> On 9/11/19 10:51 AM, Chih-Min Chao wrote:
> > Could  the VLEN be configurable in cpu initialization but not fixed in
> > compilation phase ?
> > Take the integer element as example  and the difference should be the
> > stride of vfp.vreg[x] isn't continuous
> 
> Do you really want an unbounded amount of vector register storage?
> 
> 
>  Hi Richard,
> 
> VLEN is implementation-defined parameter and the only limitation on spec is
> that it must be power of 2.
> What I prefer is the value could be adjustable in runtime.

Ok, fine, I suppose.  I'll let a risc-v maintainer opine on whether there
should be some sanity check on the bounds of VLEN.  If you really do have an
unbounded vlen, you'll need to consider carefully how you want to manage 
migration.

> >     uint8_t *mem = malloc(size)
> >     for (int idx = 0; idx < 32; ++idx) {
> >         vfp.vreg[idx].u64 = (void *)[idx * elem];
> >         vfp.vreg[idx].u32 = (void *)[idx * elem];
> >         vfp.vreg[idx].u16 = (void *)[idx * elem];
> >    }
> 
> This isn't adjusting the stride of the elements.  And in any case this 
> would
> have to be re-adjusted for every vsetvl.
> 
>  Not sure about the relation with vsetvl. Could you provide an example ?

Well, I think it's merely a matter of there's no point having so many different
pointers into the block of memory that provides the backing storage.  I've
asserted elsewhere in the thread that we shouldn't have an array of 32
"registers" anyway.


r~



Re: [Qemu-devel] [PATCH v2 3/3] xen: perform XenDevice clean-up in XenBus watch handler

2019-09-12 Thread Anthony PERARD
On Thu, Sep 12, 2019 at 01:16:46PM +0100, Paul Durrant wrote:
> Cleaning up offine XenDevice objects directly in
  ^ offline

> xen_device_backend_changed() is dangerous as xen_device_unrealize() will
> modify the watch list that is being walked. Even the QLIST_FOREACH_SAFE()
> used in notifier_list_notify() is insufficient as *two* notifiers (for
> the frontend and backend watches) are removed, thus potentially rendering
> the 'next' pointer unsafe.
> 
> The solution is to use the XenBus backend_watch handler to do the clean-up
> instead, as it is invoked whilst walking a separate watch list.
> 
> This patch therefore adds a new 'offline_devices' list to XenBus, to which
> offline devices are added by xen_device_backend_changed(). The XenBus
> backend_watch registration is also changed to not only invoke
> xen_bus_enumerate() but also a new xen_bus_cleanup() function, which will
> walk 'offline_devices' and perform the necessary actions.
> For safety a an extra 'online' check is also added to
 ^ one 'a' too many?

> xen_bus_type_enumerate() to make sure that no attempt is made to create a
> new XenDevice object for a backend that is offline.
> 
> NOTE: This patch also include some cosmetic changes:
>   - substitute the local variable name 'backend_state'
> in xen_bus_type_enumerate() with 'state', since there
> is no ambiguity with any other state in that context.
>   - change xen_device_state_is_active() to
> xen_device_frontend_is_active() (and pass a XenDevice directly)
> since the state tests contained therein only apply to a frontend.
>   - use 'state' rather then 'xendev->backend_state' in
> xen_device_backend_changed() to shorten the code.
> 
> Signed-off-by: Paul Durrant 
> ---
> 
> v2:
>  - Make sure we don't try to add a XenDevice to 'offline_devices' more than
>once
> ---
>  
>  /*
>   * If a backend is still 'online' then we should leave it alone but,
> - * if a backend is not 'online', then the device should be destroyed
> - * once the state is Closed.
> + * if a backend is not 'online', then the device is a candidate
> + * for destruction. Hence add it to the 'offline' list to be cleaned
> + * by xen_bus_cleanup().
>   */
> -if (!xendev->backend_online &&
> -(xendev->backend_state == XenbusStateClosed ||
> - xendev->backend_state == XenbusStateInitialising ||
> - xendev->backend_state == XenbusStateInitWait ||
> - xendev->backend_state == XenbusStateUnknown)) {
> -Error *local_err = NULL;
> +if (!online &&
> +(state == XenbusStateClosed ||  state == XenbusStateInitialising ||
> + state == XenbusStateInitWait || state == XenbusStateUnknown) &&
> +!QLIST_NEXT(xendev, list)) {

Could you add a note about this QLIST_NEXT? I don't think it's going to
be obvious enough why we check that there are no next item. I might only
understand it just because of your reply to the v1 of the patch.
(Well the changelog of the patch also point out what it's for.)

> +XenBus *xenbus = XEN_BUS(qdev_get_parent_bus(DEVICE(xendev)));
>  
> -if (!xen_backend_try_device_destroy(xendev, _err)) {
> -object_unparent(OBJECT(xendev));
> -}
> +QLIST_INSERT_HEAD(>offline_devices, xendev, list);
>  
> -if (local_err) {
> -error_report_err(local_err);
> -}
> +/*
> + * Re-write the state to cause a XenBus backend_watch notification,
> + * resulting in a call to xen_bus_cleanup().
> + */
> +xen_device_backend_printf(xendev, "state", "%u", state);

It kind of feels wrong to rely on xenstore to notify QEMU's xenbus
driver that a device needs cleanup. But that does the job.

With a note about the use of QLIST_NEXT and the few typo fixed, the
patch will be good to go.

Thanks,

-- 
Anthony PERARD



Re: [Qemu-devel] [PATCH v2 07/17] RISC-V: add vector extension atomic instructions

2019-09-12 Thread Richard Henderson
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +case 64:
> +if (vector_elem_mask(env, vm, width, lmul, i)) {
> +int64_t tmp;
> +idx= (target_long)env->vfp.vreg[src2].s64[j];
> +addr   = idx + env->gpr[rs1];
> +
> +#ifdef CONFIG_SOFTMMU
> +tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr,
> +env->vfp.vreg[src3].s64[j],
> +make_memop_idx(memop & ~MO_SIGN, mem_idx));
> +#else
> +tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr,
> +env->vfp.vreg[src3].s64[j]);
> +#endif
> +if (wd) {
> +env->vfp.vreg[src3].s64[j] = tmp;
> +}
> +env->vfp.vstart++;
> +}
> +break;

This will not link if !defined(CONFIG_ATOMIC64).

That's pretty rare these days, admittedly.  I think you'd need to compile for
ppc32 or mips32 (or riscv32!) host to see this.  You can force this condition
for i686 host with --extra-cflags='-march=i486', just to see if you've got it
right.

There should be two different versions of this helper: one that performs actual
atomic operations, as above, and a second that performs the same operation with
non-atomic operations.

The version of the helper that you call should be based on the translation time
setting of "tb_cflags(s->base.tb) & CF_PARALLEL":  If PARALLEL is set, call the
atomic helper otherwise the non-atomic helper.

If you arrive at a situation in which the host cannot handle any atomic
operation, then you must raise the EXCP_ATOMIC exception.  This will halt all
other cpus and run one instruction on this cpu while holding the exclusive lock.

If you cannot detect this condition any earlier than here at runtime, use
cpu_loop_exit_atomic(), but you must do so before altering any cpu state.
However, as per my comments for normal loads, you should be able to detect this
condition at translation time and call gen_helper_exit_atomic().


r~



Re: [Qemu-devel] [Qemu-discuss] Cross-posted : Odd QXL/KVM performance issue with a Windows 7 Guest

2019-09-12 Thread Brad Campbell

On 9/9/19 11:22 pm, Dr. David Alan Gilbert wrote:


Oh, hmm.
Sorry I don't know too much where to look then; you have any of:
   a) Windows
   b) guest graphics drivers
   c) spice server in qemu
  
and probalby some more.


So I think it's going to be a case of profiling on the two different
systems and see if you can spot anything in particular that stands out.



G'day Dave,

Thanks for the advice.

I ran qemu through Valgrind/callgrind and nothing excessive popped up. 
It looks like the issue might be in Windows, so I'm now trying to figure 
out how to profile inside the guest.


Suggestions as to how to profile the qxl driver while it's underway 
gratefully accepted. I'm working my way through the profiling tools in 
the Windows SDK & DDK but haven't made much headway.


Regards,
Brad



Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState

2019-09-12 Thread Chih-Min Chao
On Thu, Sep 12, 2019 at 6:39 AM Richard Henderson <
richard.hender...@linaro.org> wrote:

> On 9/11/19 10:51 AM, Chih-Min Chao wrote:
> > Could  the VLEN be configurable in cpu initialization but not fixed in
> > compilation phase ?
> > Take the integer element as example  and the difference should be the
> > stride of vfp.vreg[x] isn't continuous
>
> Do you really want an unbounded amount of vector register storage?


 Hi Richard,

VLEN is implementation-defined parameter and the only limitation on spec is
that it must be power of 2.
What I prefer is the value could be adjustable in runtime.

>


> > uint8_t *mem = malloc(size)
> > for (int idx = 0; idx < 32; ++idx) {
> > vfp.vreg[idx].u64 = (void *)[idx * elem];
> > vfp.vreg[idx].u32 = (void *)[idx * elem];
> > vfp.vreg[idx].u16 = (void *)[idx * elem];
> >}
>
> This isn't adjusting the stride of the elements.  And in any case this
> would
> have to be re-adjusted for every vsetvl.
>
>  Not sure about the relation with vsetvl. Could you provide an example ?

Chih-Min

>
> r~
>


Re: [Qemu-devel] [PATCH] target/m68k/fpu_helper.c: rename the access arguments

2019-09-12 Thread KONRAD Frederic




Le 9/12/19 à 4:32 PM, Philippe Mathieu-Daudé a écrit :

On 9/12/19 4:02 PM, KONRAD Frederic wrote:

The "access" arguments clash with a macro under Windows with MinGW:
   CC  m68k-softmmu/target/m68k/fpu_helper.o
   target/m68k/fpu_helper.c: In function 'fmovem_predec':
   target/m68k/fpu_helper.c:405:56: error: macro "access" passed 4 arguments,
but takes just 2
size = access(env, addr, >fregs[i], ra);

So this renames them access_fn.


access() is not your friend... this reminds me of

commit 05e015f73c3b5c50c237d3d8e555e25cfa543a5c
Author: KONRAD Frederic 
Date:   Thu Sep 21 12:04:20 2017 +0200

 memory: avoid a name clash with access macro

 This avoids a name clash with the access macro on windows 64:


True, I didn't catch this one at the time because we didn't build m68k.



 make
 CHK version_gen.h
   CC  aarch64-softmmu/memory.o
 /home/konrad/qemu/memory.c: In function 'access_with_adjusted_size':
 /home/konrad/qemu/memory.c:591:73: error: macro "access" passed 7
arguments, \
  but takes just 2
  (size - access_size - i) * 8, access_mask, attrs);
  ^


Tested with:
  ./configure --target-list=m68k-softmmu
  make -j8

Signed-off-by: KONRAD Frederic 


Reviewed-by: Philippe Mathieu-Daudé 


Thanks!

Cheers,
Fred




Re: [Qemu-devel] [PATCH v2 06/17] RISC-V: add vector extension fault-only-first implementation

2019-09-12 Thread Richard Henderson
On 9/11/19 2:25 AM, liuzhiwei wrote:
> diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
> index 12aa3c0..d673fa5 100644
> --- a/linux-user/riscv/cpu_loop.c
> +++ b/linux-user/riscv/cpu_loop.c
> @@ -41,6 +41,13 @@ void cpu_loop(CPURISCVState *env)
>  sigcode = 0;
>  sigaddr = 0;
>  
> +if (env->foflag) {
> +if (env->vfp.vl != 0) {
> +env->foflag = false;
> +env->pc += 4;
> +continue;
> +}
> +}
>  switch (trapnr) {
>  case EXCP_INTERRUPT:
>  /* just indicate that signals should be handled asap */
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index e32b612..405caf6 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -521,6 +521,13 @@ void riscv_cpu_do_interrupt(CPUState *cs)
>  [PRV_H] = RISCV_EXCP_H_ECALL,
>  [PRV_M] = RISCV_EXCP_M_ECALL
>  };
> +if (env->foflag) {
> +if (env->vfp.vl != 0) {
> +env->foflag = false;
> +env->pc += 4;
> +return;
> +}
> +}

I renew my objection to this FOFLAG mechanism.  I believe, but have no proof,
that this will race between different types of interrupts.  Once again I
present the ARM SVE first-fault helpers as proof that there is another way.

Otherwise, all of the same comments from the normal loads apply.


r~



Re: [Qemu-devel] [RFC v3 PATCH 05/45] multi-process: Add config option for multi-process QEMU

2019-09-12 Thread Stefan Hajnoczi
On Tue, Sep 03, 2019 at 04:37:31PM -0400, Jagannathan Raman wrote:
> @@ -1543,6 +1544,10 @@ for opt do
>;;
>--disable-libpmem) libpmem=no
>;;
> +  --enable-mpqemu) mpqemu=yes
> +  ;;
> +  --disable-mpqemu) mpqemu=no

A previous patch used "remote" instead of "mpqemu", which is confusing.

"mpqemu" seems reasonable.  "remote" is too generic.  Can you use
"mpqemu" everywhere?

> +  ;;
>*)
>echo "ERROR: unknown option $opt"
>echo "Try '$0 --help' for more information"
> @@ -1842,6 +1847,7 @@ disabled with --disable-FEATURE, default is enabled if 
> available:
>capstonecapstone disassembler support
>debug-mutex mutex debugging support
>libpmem libpmem support
> +  mpqemu  multi-process QEMU support
>  
>  NOTE: The object files are built at the place where configure is launched
>  EOF
> @@ -6481,6 +6487,7 @@ echo "docker$docker"
>  echo "libpmem support   $libpmem"
>  echo "libudev   $libudev"
>  echo "default devices   $default_devices"
> +echo "multiprocess QEMU $mpqemu"

multi-process (see above) or multiprocess? :-)


signature.asc
Description: PGP signature


[Qemu-devel] [PATCH] spapr: Report kvm_irqchip_in_kernel() in 'info pic'

2019-09-12 Thread Greg Kurz
Unless the machine was started with kernel-irqchip=on, we cannot easily
tell if we're actually using an in-kernel or an emulated irqchip. This
information is important enough that it is worth printing it in 'info
pic'.

Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr.c |4 
 1 file changed, 4 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 761f8214c312..348c007ffbd3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -81,6 +81,8 @@
 #include "hw/mem/memory-device.h"
 #include "hw/ppc/spapr_tpm_proxy.h"
 
+#include "monitor/monitor.h"
+
 #include 
 
 /* SLOF memory layout:
@@ -4360,6 +4362,8 @@ static void spapr_pic_print_info(InterruptStatsProvider 
*obj,
 SpaprMachineState *spapr = SPAPR_MACHINE(obj);
 
 spapr->irq->print_info(spapr, mon);
+monitor_printf(mon, "irqchip: %s\n",
+   kvm_irqchip_in_kernel() ? "in-kernel" : "emulated");
 }
 
 int spapr_get_vcpu_id(PowerPCCPU *cpu)




[Qemu-devel] [PATCH] memory: inline and optimize devend_memop

2019-09-12 Thread Paolo Bonzini
devend_memop can rely on the fact that the result is always either
0 or MO_BSWAP, corresponding respectively to host endianness and
the opposite.  Native (target) endianness in turn can be either
the host endianness, in which case MO_BSWAP is only returned for
host-opposite endianness, or the opposite, in which case 0 is only
returned for host endianness.

With this in mind, devend_memop can be compiled as a setcond+shift
for every target.  Do this and, while at it, move it to
include/exec/memory.h since !NEED_CPU_H files do not (and should not)
need it.

Signed-off-by: Paolo Bonzini 
---
 include/exec/memory.h | 19 ++-
 memory.c  | 18 --
 2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2dd810259d..d898cfb5db 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2201,8 +2201,25 @@ address_space_write_cached(MemoryRegionCache *cache, 
hwaddr addr,
 }
 }
 
+#ifdef NEED_CPU_H
 /* enum device_endian to MemOp.  */
-MemOp devend_memop(enum device_endian end);
+MemOp devend_memop(enum device_endian end)
+{
+QEMU_BUILD_BUG_ON(DEVICE_HOST_ENDIAN != DEVICE_LITTLE_ENDIAN &&
+  DEVICE_HOST_ENDIAN != DEVICE_BIG_ENDIAN);
+
+#ifdef BSWAP_NEEDED
+/* Swap if non-host endianness or native (target) endianness */
+return (end == DEVICE_HOST_ENDIAN) ? 0 : MO_BSWAP;
+#else
+const int non_host_endianness =
+DEVICE_LITTLE_ENDIAN ^ DEVICE_BIG_ENDIAN ^ DEVICE_HOST_ENDIAN;
+
+/* In this case, native (target) endianness needs no swap.  */
+return (end == non_host_endianness) ? MO_BSWAP : 0;
+#endif
+}
+#endif
 
 #endif
 
diff --git a/memory.c b/memory.c
index 61a254c3f9..b9dd6b94ca 100644
--- a/memory.c
+++ b/memory.c
@@ -3267,21 +3267,3 @@ static void memory_register_types(void)
 }
 
 type_init(memory_register_types)
-
-MemOp devend_memop(enum device_endian end)
-{
-static MemOp conv[] = {
-[DEVICE_LITTLE_ENDIAN] = MO_LE,
-[DEVICE_BIG_ENDIAN] = MO_BE,
-[DEVICE_NATIVE_ENDIAN] = MO_TE,
-[DEVICE_HOST_ENDIAN] = 0,
-};
-switch (end) {
-case DEVICE_LITTLE_ENDIAN:
-case DEVICE_BIG_ENDIAN:
-case DEVICE_NATIVE_ENDIAN:
-return conv[end];
-default:
-g_assert_not_reached();
-}
-}
-- 
2.21.0




Re: [Qemu-devel] [PATCH] target/m68k/fpu_helper.c: rename the access arguments

2019-09-12 Thread Philippe Mathieu-Daudé
On 9/12/19 4:02 PM, KONRAD Frederic wrote:
> The "access" arguments clash with a macro under Windows with MinGW:
>   CC  m68k-softmmu/target/m68k/fpu_helper.o
>   target/m68k/fpu_helper.c: In function 'fmovem_predec':
>   target/m68k/fpu_helper.c:405:56: error: macro "access" passed 4 arguments,
>but takes just 2
>size = access(env, addr, >fregs[i], ra);
> 
> So this renames them access_fn.

access() is not your friend... this reminds me of

commit 05e015f73c3b5c50c237d3d8e555e25cfa543a5c
Author: KONRAD Frederic 
Date:   Thu Sep 21 12:04:20 2017 +0200

memory: avoid a name clash with access macro

This avoids a name clash with the access macro on windows 64:

make
CHK version_gen.h
  CC  aarch64-softmmu/memory.o
/home/konrad/qemu/memory.c: In function 'access_with_adjusted_size':
/home/konrad/qemu/memory.c:591:73: error: macro "access" passed 7
arguments, \
 but takes just 2
 (size - access_size - i) * 8, access_mask, attrs);
 ^
> 
> Tested with:
>  ./configure --target-list=m68k-softmmu
>  make -j8
> 
> Signed-off-by: KONRAD Frederic 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  target/m68k/fpu_helper.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
> index 9b039c8..4137542 100644
> --- a/target/m68k/fpu_helper.c
> +++ b/target/m68k/fpu_helper.c
> @@ -396,14 +396,14 @@ typedef int (*float_access)(CPUM68KState *env, uint32_t 
> addr, FPReg *fp,
>  uintptr_t ra);
>  
>  static uint32_t fmovem_predec(CPUM68KState *env, uint32_t addr, uint32_t 
> mask,
> -   float_access access)
> +  float_access access_fn)
>  {
>  uintptr_t ra = GETPC();
>  int i, size;
>  
>  for (i = 7; i >= 0; i--, mask <<= 1) {
>  if (mask & 0x80) {
> -size = access(env, addr, >fregs[i], ra);
> +size = access_fn(env, addr, >fregs[i], ra);
>  if ((mask & 0xff) != 0x80) {
>  addr -= size;
>  }
> @@ -414,14 +414,14 @@ static uint32_t fmovem_predec(CPUM68KState *env, 
> uint32_t addr, uint32_t mask,
>  }
>  
>  static uint32_t fmovem_postinc(CPUM68KState *env, uint32_t addr, uint32_t 
> mask,
> -   float_access access)
> +   float_access access_fn)
>  {
>  uintptr_t ra = GETPC();
>  int i, size;
>  
>  for (i = 0; i < 8; i++, mask <<= 1) {
>  if (mask & 0x80) {
> -size = access(env, addr, >fregs[i], ra);
> +size = access_fn(env, addr, >fregs[i], ra);
>  addr += size;
>  }
>  }
> 



Re: [Qemu-devel] [PATCH] spapr: Report kvm_irqchip_in_kernel() in 'info pic'

2019-09-12 Thread Cédric Le Goater
On 12/09/2019 16:30, Greg Kurz wrote:
> Unless the machine was started with kernel-irqchip=on, we cannot easily
> tell if we're actually using an in-kernel or an emulated irqchip. This
> information is important enough that it is worth printing it in 'info
> pic'.

Nice ! 
 
> Signed-off-by: Greg Kurz 

Reviewed-by: Cédric Le Goater 

Thanks,

C.

> ---
>  hw/ppc/spapr.c |4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 761f8214c312..348c007ffbd3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -81,6 +81,8 @@
>  #include "hw/mem/memory-device.h"
>  #include "hw/ppc/spapr_tpm_proxy.h"
>  
> +#include "monitor/monitor.h"
> +
>  #include 
>  
>  /* SLOF memory layout:
> @@ -4360,6 +4362,8 @@ static void spapr_pic_print_info(InterruptStatsProvider 
> *obj,
>  SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>  
>  spapr->irq->print_info(spapr, mon);
> +monitor_printf(mon, "irqchip: %s\n",
> +   kvm_irqchip_in_kernel() ? "in-kernel" : "emulated");
>  }
>  
>  int spapr_get_vcpu_id(PowerPCCPU *cpu)
> 




[Qemu-devel] [PATCH] kvm: Fix typo in header of kvm_device_access()

2019-09-12 Thread Greg Kurz
Signed-off-by: Greg Kurz 
---
 include/sysemu/kvm.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 909bcd77cf82..fd674772ab31 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -308,7 +308,7 @@ int kvm_vm_check_attr(KVMState *s, uint32_t group, uint64_t 
attr);
 int kvm_device_check_attr(int fd, uint32_t group, uint64_t attr);
 
 /**
- * kvm_device_access - set or get value of a specific vm attribute
+ * kvm_device_access - set or get value of a specific device attribute
  * @fd: The device file descriptor
  * @group: the group
  * @attr: the attribute of that group to set or get




Re: [Qemu-devel] [PATCH v3 6/6] Add dbus-vmstate object

2019-09-12 Thread Eric Blake
On 9/12/19 7:25 AM, Marc-André Lureau wrote:
> When instanciated, this object will connect to the given D-Bus

instantiated

> bus. During migration, it will take the data from org.qemu.VMState1
> instances.
> 
> See documentation for further details.
> 
> Signed-off-by: Marc-André Lureau 
> ---

> +++ b/docs/interop/dbus-vmstate.rst
> @@ -0,0 +1,68 @@
> +=
> +D-Bus VMState
> +=
> +
> +Introduction
> +
> +
> +The QEMU dbus-vmstate object aim is to migrate helpers data running on
> +a QEMU D-Bus bus. (refer to the :doc:`dbus` document for
> +recommendation on D-Bus usage)

object's aim is to migrate helpers' data

> +
> +Upon migration, QEMU will go through the queue of
> +``org.qemu.VMState1`` D-Bus name owners and query their ``Id``. It
> +must be unique among the helpers.
> +
> +It will then save arbitrary data of each Id to be transferred in the
> +migration stream and restored/loaded at the corresponding destination
> +helper.
> +
> +The data amount to be transferred is limited to 1Mb. The state must be
> +saved quickly (a few seconds maximum). (D-Bus imposes a time limit on
> +reply anyway, and migration would fail if data isn't given quickly
> +enough)

s/enough)/enough.)/

> +
> +dbus-vmstate object can be configured with the expected list of
> +helpers by setting its ``id-list`` property, with a coma-separated

comma

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PULL 11/12] migration/qemu-file: fix potential buf waste for extra buf_index adjustment

2019-09-12 Thread Dr. David Alan Gilbert (git)
From: Wei Yang 

In add_to_iovec(), qemu_fflush() will be called if iovec is full. If
this happens, buf_index is reset. Currently, this is not checked and
buf_index would always been adjust with buf size.

This is not harmful, but will waste some space in file buffer.

This patch make add_to_iovec() return 1 when it has flushed the file.
Then the caller could check the return value to see whether it is
necessary to adjust the buf_index any more.

Signed-off-by: Wei Yang 
Reviewed-by: Dr. David Alan Gilbert 

Message-Id: <20190911132839.23336-3-richard.weiy...@gmail.com>
Signed-off-by: Dr. David Alan Gilbert 
---
 migration/qemu-file.c | 43 ++-
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 66627088d4..26fb25ddc1 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -381,8 +381,16 @@ int qemu_fclose(QEMUFile *f)
 return ret;
 }
 
-static void add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
- bool may_free)
+/*
+ * Add buf to iovec. Do flush if iovec is full.
+ *
+ * Return values:
+ * 1 iovec is full and flushed
+ * 0 iovec is not flushed
+ *
+ */
+static int add_to_iovec(QEMUFile *f, const uint8_t *buf, size_t size,
+bool may_free)
 {
 /* check for adjacent buffer and coalesce them */
 if (f->iovcnt > 0 && buf == f->iov[f->iovcnt - 1].iov_base +
@@ -400,6 +408,19 @@ static void add_to_iovec(QEMUFile *f, const uint8_t *buf, 
size_t size,
 
 if (f->iovcnt >= MAX_IOV_SIZE) {
 qemu_fflush(f);
+return 1;
+}
+
+return 0;
+}
+
+static void add_buf_to_iovec(QEMUFile *f, size_t len)
+{
+if (!add_to_iovec(f, f->buf + f->buf_index, len, false)) {
+f->buf_index += len;
+if (f->buf_index == IO_BUF_SIZE) {
+qemu_fflush(f);
+}
 }
 }
 
@@ -429,11 +450,7 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, 
size_t size)
 }
 memcpy(f->buf + f->buf_index, buf, l);
 f->bytes_xfer += l;
-add_to_iovec(f, f->buf + f->buf_index, l, false);
-f->buf_index += l;
-if (f->buf_index == IO_BUF_SIZE) {
-qemu_fflush(f);
-}
+add_buf_to_iovec(f, l);
 if (qemu_file_get_error(f)) {
 break;
 }
@@ -450,11 +467,7 @@ void qemu_put_byte(QEMUFile *f, int v)
 
 f->buf[f->buf_index] = v;
 f->bytes_xfer++;
-add_to_iovec(f, f->buf + f->buf_index, 1, false);
-f->buf_index++;
-if (f->buf_index == IO_BUF_SIZE) {
-qemu_fflush(f);
-}
+add_buf_to_iovec(f, 1);
 }
 
 void qemu_file_skip(QEMUFile *f, int size)
@@ -760,11 +773,7 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream 
*stream,
 }
 
 qemu_put_be32(f, blen);
-add_to_iovec(f, f->buf + f->buf_index, blen, false);
-f->buf_index += blen;
-if (f->buf_index == IO_BUF_SIZE) {
-qemu_fflush(f);
-}
+add_buf_to_iovec(f, blen);
 return blen + sizeof(int32_t);
 }
 
-- 
2.21.0




[Qemu-devel] [PULL 08/12] tests/migration: Add a test for validate-uuid capability

2019-09-12 Thread Dr. David Alan Gilbert (git)
From: Yury Kotov 

Signed-off-by: Yury Kotov 
Message-Id: <20190903162246.18524-4-yury-ko...@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Dr. David Alan Gilbert 
---
 tests/migration-test.c | 140 -
 1 file changed, 110 insertions(+), 30 deletions(-)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index a9f81cc185..258aa064d4 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -512,7 +512,8 @@ static void migrate_postcopy_start(QTestState *from, 
QTestState *to)
 
 static int test_migrate_start(QTestState **from, QTestState **to,
const char *uri, bool hide_stderr,
-   bool use_shmem)
+   bool use_shmem, const char *opts_src,
+   const char *opts_dst)
 {
 gchar *cmd_src, *cmd_dst;
 char *bootpath = NULL;
@@ -521,6 +522,9 @@ static int test_migrate_start(QTestState **from, QTestState 
**to,
 const char *arch = qtest_get_arch();
 const char *accel = "kvm:tcg";
 
+opts_src = opts_src ? opts_src : "";
+opts_dst = opts_dst ? opts_dst : "";
+
 if (use_shmem) {
 if (!g_file_test("/dev/shm", G_FILE_TEST_IS_DIR)) {
 g_test_skip("/dev/shm is not supported");
@@ -539,16 +543,16 @@ static int test_migrate_start(QTestState **from, 
QTestState **to,
 cmd_src = g_strdup_printf("-machine accel=%s -m 150M"
   " -name source,debug-threads=on"
   " -serial file:%s/src_serial"
-  " -drive file=%s,format=raw %s",
+  " -drive file=%s,format=raw %s %s",
   accel, tmpfs, bootpath,
-  extra_opts ? extra_opts : "");
+  extra_opts ? extra_opts : "", opts_src);
 cmd_dst = g_strdup_printf("-machine accel=%s -m 150M"
   " -name target,debug-threads=on"
   " -serial file:%s/dest_serial"
   " -drive file=%s,format=raw"
-  " -incoming %s %s",
+  " -incoming %s %s %s",
   accel, tmpfs, bootpath, uri,
-  extra_opts ? extra_opts : "");
+  extra_opts ? extra_opts : "", opts_dst);
 start_address = X86_TEST_MEM_START;
 end_address = X86_TEST_MEM_END;
 } else if (g_str_equal(arch, "s390x")) {
@@ -556,15 +560,15 @@ static int test_migrate_start(QTestState **from, 
QTestState **to,
 extra_opts = use_shmem ? get_shmem_opts("128M", shmem_path) : NULL;
 cmd_src = g_strdup_printf("-machine accel=%s -m 128M"
   " -name source,debug-threads=on"
-  " -serial file:%s/src_serial -bios %s %s",
+  " -serial file:%s/src_serial -bios %s %s %s",
   accel, tmpfs, bootpath,
-  extra_opts ? extra_opts : "");
+  extra_opts ? extra_opts : "", opts_src);
 cmd_dst = g_strdup_printf("-machine accel=%s -m 128M"
   " -name target,debug-threads=on"
   " -serial file:%s/dest_serial -bios %s"
-  " -incoming %s %s",
+  " -incoming %s %s %s",
   accel, tmpfs, bootpath, uri,
-  extra_opts ? extra_opts : "");
+  extra_opts ? extra_opts : "", opts_dst);
 start_address = S390_TEST_MEM_START;
 end_address = S390_TEST_MEM_END;
 } else if (strcmp(arch, "ppc64") == 0) {
@@ -575,14 +579,15 @@ static int test_migrate_start(QTestState **from, 
QTestState **to,
   " -prom-env 'use-nvramrc?=true' -prom-env "
   "'nvramrc=hex .\" _\" begin %x %x "
   "do i c@ 1 + i c! 1000 +loop .\" B\" 0 "
-  "until' %s",  accel, tmpfs, end_address,
-  start_address, extra_opts ? extra_opts : "");
+  "until' %s %s",  accel, tmpfs, end_address,
+  start_address, extra_opts ? extra_opts : "",
+  opts_src);
 cmd_dst = g_strdup_printf("-machine accel=%s -m 256M"
   " -name target,debug-threads=on"
   " -serial file:%s/dest_serial"
-  " -incoming %s %s",
+  " -incoming %s %s %s",
   

Re: [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions

2019-09-12 Thread Richard Henderson
> +static bool  vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul,
> +uint32_t reg, bool widen)
> +{
> +int legal = widen ? (lmul * 2) : lmul;
> +
> +if ((lmul != 1 && lmul != 2 && lmul != 4 && lmul != 8) ||
> +(lmul == 8 && widen)) {
> +helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
> +return false;
> +}
> +
> +if (reg % legal != 0) {
> +helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
> +return false;
> +}
> +return true;
> +}

These exceptions will not do the right thing.

You cannot call helper_raise_exception from another helper, or from something
called from another helper, as here.  You need to use riscv_raise_exception, as
you do elsewhere in this patch, with a GETPC() value passed down from the
outermost helper.

Ideally you would check these conditions at translate time.
I've mentioned how to do this in reply to your v1.


> +void VECTOR_HELPER(vlbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,> +  
>   uint32_t rs1, uint32_t rd)

You should pass the rs1 register by value, not by index.

> +{> +int i, j, k, vl, vlmax, lmul, width, dest, read;> +> +vl =
env->vfp.vl;> +> +lmul   = vector_get_lmul(env);> +width =
vector_get_width(env);> +vlmax = vector_get_vlmax(env);> +> +if
(vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {> +
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());> +
return;> +}> +if (lmul * (nf + 1) > 32) {> +
riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());> +
return;> +}
Again, these exceptions should ideally be identified at translate time.

I also think that you should have at least two different helpers: one that
checks the vector mask and one that doesn't.  If you check the above conditions
at translate time then you'll also want to split the helpers based on element
width.

You could also meaningfully split nf == 0 vs nf != 0.  You will, in any case,
need to check at translate time whether the Zvlsseg extension is enabled before
allowing nf != 0.


> +
> +vector_lmul_check_reg(env, lmul, rd, false);
> +
> +for (i = 0; i < vlmax; i++) {
> +dest = rd + (i / (VLEN / width));
> +j = i % (VLEN / width);

This division is exactly why I suggested making vreg[] one contiguous array of
elements instead of a two-dimensional array.  I think the distinction of 32
VLEN-sized registers should be reserved for cpu dumps and gdbstub.


> +k = nf;
> +if (i < env->vfp.vstart) {
> +continue;

Surely you should hoist this check outside the loop.

> +} else if (i < vl) {
> +switch (width) {
> +case 8:
> +if (vector_elem_mask(env, vm, width, lmul, i)) {
> +while (k >= 0) {
> +read = i * (nf + 1)  + k;
> +env->vfp.vreg[dest + k * lmul].u8[j] =
> +cpu_ldub_data(env, env->gpr[rs1] + read);

You must not modify vreg[x] before you've recognized all possible exceptions,
e.g. validating that a subsequent access will not trigger a page fault.
Otherwise you will have a partially modified register value when the exception
handler is entered.

Without a stride, and without a predicate mask, this can be done with at most
two calls to probe_access (one per page).  This is the simplification that
makes splitting the helper into two very helpful.

With a stride or with a predicate mask requires either
(1) temporary storage for the loads, and copy back to env at the end, or
(2) use probe_access for each load, and then perform the actual loads directly
into env.

FWIW, ARM SVE uses (1), as probe_access is very new.


> +k--;
> +}
> +env->vfp.vstart++;
> +}
> +break;
> +case 16:
> +if (vector_elem_mask(env, vm, width, lmul, i)) {
> +while (k >= 0) {
> +read = i * (nf + 1)  + k;
> +env->vfp.vreg[dest + k * lmul].u16[j] =
> +cpu_ldub_data(env, env->gpr[rs1] + read);

I don't see anything in these assignments to vreg[x].uN[y] that take the
endianness of the host into account.

You need to think about how the architecture defines the overlap of elements --
particularly across vlset -- and make adjustments.

I can imagine, if you have explicit tests for this, your tests are passing
because the architecture defines a little-endian based indexing of the register
file, and you have only run tests on a little-endian host, like x86_64.

For ARM, we define the representation as a little-endian indexed array of
host-endian uint64_t.  This means that a big-endian host needs to adjust the
address of any element smaller than 64-bit.  E.g.

#ifdef HOST_WORDS_BIGENDIAN
#define H1(x)   ((x) ^ 7)
#define H2(x)   ((x) ^ 3)
#define H4(x)   ((x) ^ 1)
#else

  1   2   3   4   >